Platform and Co-Runner Affinities for Many-Task Applications in Distributed Computing Platforms

Emerging applications from a wide range of scientific domains often require a very large number of loosely coupled tasks to be processed efficiently. To support such applications effectively, all the available resources from different types of computing platforms, such as supercomputers, grids, and clouds, need to be utilized. However, exploiting heterogeneous resources from these platforms for multiple loosely coupled many-task applications is challenging, since the performance of an application can vary significantly depending on which platform is used to run it and which applications co-run with it on the same node. In this paper, we analyze the platform and co-runner affinities of many-task applications in distributed computing platforms. We perform a comprehensive experimental study using four different platforms and five many-task applications. We then present a two-level scheduling algorithm, which distributes the resources of different platforms to each application based on platform affinity in the first level, and maps tasks of the applications to computing nodes based on co-runner affinity for each platform in the second level. Finally, we evaluate the performance of our scheduling algorithm using a trace-based simulator. Our simulation results demonstrate that our scheduling algorithm can improve performance by up to 30.0% compared to a baseline scheduling algorithm.
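The two-level structure described above can be illustrated with a minimal Python sketch. It is not the algorithm evaluated in the paper: the abstract does not say how affinities are quantified or how the levels make their decisions, so the sketch assumes that platform affinity is a positive score per (application, platform) pair, that nodes are split in proportion to those scores, and that co-running applications are paired greedily by a pairwise score. The names `allocate_nodes`, `pair_co_runners`, and all example values are hypothetical.

```python
from itertools import combinations


def allocate_nodes(platform_nodes, platform_affinity):
    """Level 1 (assumed policy): split each platform's nodes among the
    applications in proportion to their platform-affinity scores."""
    allocation = {}
    for platform, nodes in platform_nodes.items():
        scores = {app: aff[platform] for app, aff in platform_affinity.items()}
        total = sum(scores.values())  # assumed to be positive
        shares = {app: int(nodes * s / total) for app, s in scores.items()}
        # Rounding down may leave a few nodes unassigned; hand them to the
        # applications with the highest affinity for this platform.
        leftover = nodes - sum(shares.values())
        for app in sorted(scores, key=scores.get, reverse=True)[:leftover]:
            shares[app] += 1
        allocation[platform] = shares
    return allocation


def pair_co_runners(apps, co_runner_affinity):
    """Level 2 (assumed policy): greedily choose pairs of applications to
    share a node, taking the best co-runner affinity first."""
    ranked = sorted(
        combinations(sorted(apps), 2),
        key=lambda pair: co_runner_affinity[frozenset(pair)],
        reverse=True,
    )
    paired, pairs = set(), []
    for a, b in ranked:
        if a not in paired and b not in paired:
            pairs.append((a, b))
            paired.update((a, b))
    return pairs


# Hypothetical inputs: two platforms, made-up affinity scores.
platform_nodes = {"supercomputer": 10, "cloud": 6}
platform_affinity = {
    "A": {"supercomputer": 3.0, "cloud": 1.0},
    "B": {"supercomputer": 1.0, "cloud": 2.0},
}
allocation = allocate_nodes(platform_nodes, platform_affinity)

co_runner_affinity = {
    frozenset(("A", "B")): 0.9, frozenset(("A", "C")): 0.5,
    frozenset(("A", "D")): 0.2, frozenset(("B", "C")): 0.3,
    frozenset(("B", "D")): 0.6, frozenset(("C", "D")): 0.8,
}
pairs = pair_co_runners(["A", "B", "C", "D"], co_runner_affinity)
```

With these made-up scores, application A receives most of the supercomputer nodes and B most of the cloud nodes, while the second level co-locates A with B and C with D. A greedy pairing is used only for brevity; it does not guarantee the best overall matching.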