Analysis of MapReduce scheduling and its improvements in cloud environment

MapReduce has become a prominent parallel processing model for analysing large-scale data. MapReduce applications are increasingly deployed in the cloud, where they share physical resources with other applications. In this scenario, efficient scheduling of MapReduce applications is of utmost importance. Beyond job performance, MapReduce scheduling in cloud environments must also account for parameters such as energy efficiency and meeting SLA goals. In this work, we classify MapReduce scheduling into cluster-based scheduling and objective-based scheduling. We then summarize and analyse the different classes of schedulers, highlighting the strengths and limitations of each scheduling approach. Adaptive scheduling techniques provide dynamic resource management and meet performance goals. Energy-efficient scheduling techniques aim to cut data centre costs through a variety of approaches. Finally, we discuss current challenges and future work.