BeTL: MapReduce Checkpoint Tactics Beneath the Task Level

Big data analysis has gained significant popularity in recent years. The MapReduce framework introduced by Google makes it easier to write applications that process vast amounts of data. MapReduce targets large commodity clusters where failures are the norm rather than the exception. However, Hadoop, the most popular implementation of MapReduce, performs poorly under failures. Hadoop implements its fault tolerance strategy at the task level; as a result, a task failure requires re-executing the whole task regardless of how much input has already been processed. In this paper, we present BeTL, which introduces slight changes to the execution flow of MapReduce to enable finer-grained fault tolerance. Map tasks can create checkpoints so that a retried task does not have to start from scratch, saving considerable time. Speculative execution can also benefit from these checkpoints. The new execution flow involves fewer I/O operations and performs better than Hadoop even when no failures occur. In our experiments, BeTL outperforms Hadoop by 6.6% on average under no failures and by 4.6% to 51.0% under different failure densities.
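
The core idea can be illustrated with a minimal, standalone sketch (not BeTL's actual implementation): a map task periodically records how far into its input split it has progressed, and a retry of the same task resumes from that position instead of from the beginning. The class name CheckpointedMapTask, the checkpoint file naming, and the record-count checkpoint interval below are illustrative assumptions, not part of the system described in the paper.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

/**
 * Standalone sketch of sub-task checkpointing for a map task.
 * On (re)start, the task reads the last committed input offset from a
 * checkpoint file and skips the portion of the split it already processed.
 */
public class CheckpointedMapTask {

    private static final int CHECKPOINT_EVERY_N_RECORDS = 10_000; // hypothetical interval

    public static void main(String[] args) throws IOException {
        Path input = Paths.get(args[0]);             // the task's input split, as a local file
        Path checkpoint = Paths.get(args[0] + ".ckpt");

        long resumeOffset = readCheckpoint(checkpoint);

        try (RandomAccessFile raf = new RandomAccessFile(input.toFile(), "r")) {
            raf.seek(resumeOffset);                  // skip input already processed before the failure
            long processed = 0;
            String line;
            while ((line = raf.readLine()) != null) {
                map(line);                           // user-defined map logic (placeholder)
                processed++;
                if (processed % CHECKPOINT_EVERY_N_RECORDS == 0) {
                    // Record how far we have safely gotten; a retried task resumes from here.
                    writeCheckpoint(checkpoint, raf.getFilePointer());
                }
            }
            writeCheckpoint(checkpoint, raf.getFilePointer()); // final checkpoint at end of split
        }
    }

    private static long readCheckpoint(Path checkpoint) throws IOException {
        if (!Files.exists(checkpoint)) {
            return 0L;                               // first attempt: start at the beginning of the split
        }
        return Long.parseLong(Files.readString(checkpoint, StandardCharsets.UTF_8).trim());
    }

    private static void writeCheckpoint(Path checkpoint, long offset) throws IOException {
        // In a real system this write would need to be atomic (e.g. write to a temp file, then rename).
        Files.writeString(checkpoint, Long.toString(offset), StandardCharsets.UTF_8);
    }

    private static void map(String record) {
        // Placeholder for the user's map function applied to one input record.
    }
}
```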