Title: Optimizing MapReduce Framework through Joint Scheduling of Overlapping Phases
MapReduce consists of three phases: map, shuffle, and reduce. Since the map phase is CPU-intensive and the shuffle phase is I/O-intensive, these two phases can be executed in parallel. This paper studies the joint scheduling of overlapping map and shuffle phases to minimize the average job makespan. The challenge comes from the dependency between the map and shuffle phases: the shuffle phase may have to wait for the data emitted by the map phase before it can transfer them. A new concept, the strong pair, is introduced. Two jobs form a strong pair if the shuffle and map workloads of one job equal the map and shuffle workloads of the other job, respectively. We prove that, if the entire set of jobs can be decomposed into strong pairs, then the optimal schedule is to execute the jobs of each strong pair together, pair by pair. Following this intuition, several offline and online scheduling policies are proposed. They first group jobs according to their workloads and then execute the jobs within each group in a pairwise manner. Experiments driven by real data validate the efficiency and effectiveness of the proposed policies.
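The strong-pair definition above can be illustrated with a minimal sketch. It assumes each job is summarized by a (map workload, shuffle workload) tuple with exactly comparable values; the function name find_strong_pairs and the greedy matching are hypothetical illustrations, not the scheduling policies proposed in the paper.

```python
from collections import defaultdict


def find_strong_pairs(jobs):
    """Greedily match jobs into strong pairs.

    jobs: list of (map_workload, shuffle_workload) tuples (assumed representation).
    Returns (pairs, leftover), where pairs is a list of index pairs (i, j)
    such that job i's map/shuffle workloads equal job j's shuffle/map
    workloads, and leftover lists the indices of jobs without a partner.
    """
    waiting = defaultdict(list)  # (map, shuffle) -> indices still unpaired
    pairs = []
    for i, (m, s) in enumerate(jobs):
        # A partner for (m, s) must have workloads (s, m).
        partners = waiting[(s, m)]
        if partners:
            pairs.append((partners.pop(), i))
        else:
            waiting[(m, s)].append(i)
    leftover = sorted(i for idxs in waiting.values() for i in idxs)
    return pairs, leftover


# Hypothetical workloads: jobs 0 and 1 mirror each other, as do jobs 2 and 4.
jobs = [(3, 1), (1, 3), (2, 2), (4, 1), (2, 2)]
pairs, leftover = find_strong_pairs(jobs)
print(pairs)     # [(0, 1), (2, 4)]
print(leftover)  # [3]
```

When leftover is empty, the job set decomposes entirely into strong pairs, which is the case covered by the optimality result stated above. Real workloads rarely match exactly, which is presumably why the proposed policies group jobs by workload before pairing them.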
Jie Wu is the Associate Vice Provost for International Affairs at Temple University. He also serves as the Director of the Center for Networked Computing and Laura H. Carnell Professor in the Department of Computer and Information Sciences. Prior to joining Temple University, he was a program director at the National Science Foundation and a distinguished professor at Florida Atlantic University. His current research interests include mobile computing and wireless networks, routing protocols, cloud and green computing, network trust and security, and social network applications. Dr. Wu regularly publishes in scholarly journals, conference proceedings, and books. He serves on several editorial boards, including IEEE Transactions on Services Computing and the Journal of Parallel and Distributed Computing. Dr. Wu was general co-chair/chair for IEEE MASS 2006, IEEE IPDPS 2008, IEEE ICDCS 2013, and ACM MobiHoc 2014, as well as program co-chair for IEEE INFOCOM 2011 and CCF CNCC 2013. He was an IEEE Computer Society Distinguished Visitor, an ACM Distinguished Speaker, and chair of the IEEE Technical Committee on Distributed Processing (TCDP). Dr. Wu is a CCF Distinguished Speaker and a Fellow of the IEEE. He is the recipient of the 2011 China Computer Federation (CCF) Overseas Outstanding Achievement Award.