Abstract
The MapReduce framework has become the de facto standard for big data processing. One of its most attractive features is that it automatically parallelizes a job into multiple tasks and transparently handles task execution on a large cluster of commodity machines. The increasing heterogeneity of distributed environments, however, may result in a few straggling tasks that prolong job completion. Speculative execution was proposed to mitigate stragglers by launching backup copies of slow tasks. However, the existing speculative execution mechanism does not work efficiently, because many speculative tasks are still slower than their original tasks. In this paper, we explore an approach to increase the efficiency of speculative execution and thereby further improve MapReduce performance. We propose the Partial Speculative Execution (PSE) strategy, in which a speculative task starts from the checkpoint of its original task rather than from the beginning of the input. By leveraging the checkpoint of the original task, PSE eliminates the costs of re-reading, re-copying, and re-computing the data that has already been processed. We implement PSE in Hadoop and evaluate its performance in terms of job completion time and the efficiency of speculative execution under several kinds of classical workloads. Experimental results show that, in heterogeneous environments with stragglers, PSE completes jobs 56 % faster than execution with no speculation and 12 % faster than execution with LATE, an improved speculative execution algorithm. In addition, PSE improves the efficiency of speculative execution by 24 % on average compared to LATE.
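The idea of starting a speculative task from a checkpoint can be illustrated with a minimal Python sketch. It is not the PSE implementation and does not use any Hadoop API: the `Checkpoint` class, the `run_task` function, and the word-count workload are all hypothetical names chosen for illustration. The sketch assumes a checkpoint holds the input offset reached by the original task together with its partial output.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Checkpoint:
    """Hypothetical checkpoint: input offset reached plus partial output."""
    offset: int = 0
    partial_counts: Dict[str, int] = field(default_factory=dict)


def run_task(records: List[str],
             checkpoint: Optional[Checkpoint] = None,
             stop_after: Optional[int] = None) -> Tuple[Checkpoint, int]:
    """Count words in `records`, resuming from `checkpoint` if one is given.

    `stop_after` simulates a straggler that stalls at that record index.
    Returns the resulting checkpoint and the number of records read.
    """
    if checkpoint is None:
        checkpoint = Checkpoint()
    counts = dict(checkpoint.partial_counts)  # reuse already computed output
    start = checkpoint.offset                 # skip already processed input
    end = len(records) if stop_after is None else min(len(records), stop_after)
    end = max(start, end)
    for i in range(start, end):
        for word in records[i].split():
            counts[word] = counts.get(word, 0) + 1
    return Checkpoint(offset=end, partial_counts=counts), end - start


records = ["a b", "b c", "a a", "c"]

# The original task processes three records, then straggles.
straggler_cp, _ = run_task(records, stop_after=3)

# Conventional speculation restarts from the beginning of the input.
full_result, full_read = run_task(records)

# Partial speculation resumes from the straggler's checkpoint.
partial_result, partial_read = run_task(records, checkpoint=straggler_cp)

print(full_read, partial_read)
print(full_result.partial_counts == partial_result.partial_counts)
```

In this toy setting both speculative runs produce the same word counts, but the run that resumes from the checkpoint reads only the records the straggler had not yet processed. A real system would also have to transfer the checkpoint between machines and keep it consistent with the task's output, which this sketch ignores.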
Cite this article
Wang, Y., Lu, W., Lou, R. et al. Improving MapReduce Performance with Partial Speculative Execution. J Grid Computing 13, 587–604 (2015). https://doi.org/10.1007/s10723-015-9350-y
DOI: https://doi.org/10.1007/s10723-015-9350-y