Abstract
As a leading big data processing framework, Spark supports in-memory computing to improve computational efficiency. However, Spark does not handle heterogeneous clusters well, that is, clusters whose nodes differ in structure. Specifically, its native resource allocation strategy counts processor cores as if they were homogeneous, and so cannot adapt to a heterogeneous cluster environment. To address this problem, we propose and implement a zone-based resource allocation strategy for heterogeneous Spark clusters (ZbRAS). We compare ZbRAS with Spark's native resource allocation strategy, and the results show that ZbRAS can significantly improve the execution speed of Spark jobs in a heterogeneous cluster.
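The mismatch described above can be illustrated with a small worked example. The sketch below is not the ZbRAS algorithm, whose details are not given in this abstract; it only contrasts a split proportional to core count with a split proportional to an assumed per-node capacity. The `Node` type, the `speed` factor, and the node and workload values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    cores: int
    speed: float  # hypothetical relative per-core speed (1.0 = baseline)


def core_count_shares(nodes):
    """Split work in proportion to core count, treating all cores as equal."""
    total = sum(n.cores for n in nodes)
    return [n.cores / total for n in nodes]


def capacity_shares(nodes):
    """Split work in proportion to cores * speed (an assumed capacity model)."""
    total = sum(n.cores * n.speed for n in nodes)
    return [n.cores * n.speed / total for n in nodes]


def makespan(nodes, shares, total_work):
    """Time until the slowest node finishes its share of the work."""
    return max(
        total_work * share / (n.cores * n.speed)
        for n, share in zip(nodes, shares)
    )


# Two hypothetical nodes with equal core counts but different core speeds.
nodes = [Node("fast", cores=4, speed=2.0), Node("slow", cores=4, speed=1.0)]
work = 120.0  # arbitrary units of work

t_cores = makespan(nodes, core_count_shares(nodes), work)
t_cap = makespan(nodes, capacity_shares(nodes), work)
print(f"core-count split: {t_cores:.1f}, capacity-aware split: {t_cap:.1f}")
```

Under these assumed numbers, the core-count split gives both nodes half of the work, so the job waits on the slower node, while the capacity-aware split lets both nodes finish at about the same time.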
Acknowledgements
This work is jointly supported by the National Natural Science Foundation of China (No. 61601082, No. 61471100, No. 61701503, No. 61750110527).
Copyright information
© 2020 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Qin, Y., Tang, Y., Zhu, X., Yan, C., Wu, C., Lin, D. (2020). Zone-Based Resource Allocation Strategy for Heterogeneous Spark Clusters. In: Liang, Q., Wang, W., Mu, J., Liu, X., Na, Z., Chen, B. (eds) Artificial Intelligence in China. Lecture Notes in Electrical Engineering, vol 572. Springer, Singapore. https://doi.org/10.1007/978-981-15-0187-6_13
DOI: https://doi.org/10.1007/978-981-15-0187-6_13
Publisher Name: Springer, Singapore
Print ISBN: 978-981-15-0186-9
Online ISBN: 978-981-15-0187-6
eBook Packages: Engineering, Engineering (R0)