Abstract
Graph partitioning is an important preprocessing step for distributed processing of large-scale graph data. By balancing workloads and reducing communication costs among nodes, graph partitioning methods improve the efficiency of homogeneous clusters for processing power-law graphs. However, a real cluster usually consists of heterogeneous nodes with different computing and communication capabilities. Given the same workload, such nodes take different amounts of time, and the slowest node becomes the bottleneck. Therefore, a Heterogeneous environment Aware Edge Partitioning method (HAEP) is proposed, which balances graph processing time by skewing the workload so that it adapts to the unbalanced performance among nodes. First, a k-time balanced graph partitioning problem is defined to balance the expected time cost of graph processing in heterogeneous environments. Then, a neighborhood heuristic expansion is performed according to node performance, minimizing the communication time among nodes and assigning an appropriate workload to each node. Further, a distributed version of HAEP, DHAEP, is proposed to improve the efficiency of graph partitioning. The performance evaluation shows that HAEP and DHAEP can improve graph processing efficiency by up to 41% compared to state-of-the-art partitioning methods, and that the graph partitioning time of DHAEP is 15% of that of HAEP.
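The idea of balancing time rather than edge counts can be illustrated with a minimal sketch. It is not the HAEP algorithm: it ignores communication cost and the neighborhood heuristic entirely, and it assumes each node's performance is summarized by a single hypothetical throughput figure (edges processed per unit time). The function names `time_balanced_quotas` and `expected_times` are made up for this illustration.

```python
from typing import List


def time_balanced_quotas(num_edges: int, speeds: List[float]) -> List[int]:
    """Split num_edges into per-node quotas proportional to node speed.

    Assumption (not from the paper): speeds[i] is the number of edges
    node i processes per unit time, so its expected time is quota / speed.
    """
    if not speeds or any(s <= 0 for s in speeds):
        raise ValueError("speeds must be a non-empty list of positive numbers")
    total = sum(speeds)
    # Proportional share, rounded down so the quotas never exceed num_edges.
    quotas = [int(num_edges * s / total) for s in speeds]
    # Hand out leftover edges one at a time to the node that would
    # still finish earliest after receiving one more edge.
    for _ in range(num_edges - sum(quotas)):
        i = min(range(len(speeds)), key=lambda j: (quotas[j] + 1) / speeds[j])
        quotas[i] += 1
    return quotas


def expected_times(quotas: List[int], speeds: List[float]) -> List[float]:
    """Expected processing time of each node under the same assumption."""
    return [q / s for q, s in zip(quotas, speeds)]


if __name__ == "__main__":
    speeds = [1.0, 2.0, 2.0]  # one slow node, two fast nodes (hypothetical)
    quotas = time_balanced_quotas(1000, speeds)
    print(quotas)                          # [200, 400, 400]
    print(expected_times(quotas, speeds))  # [200.0, 200.0, 200.0]
```

With an equal split of roughly 333 edges per node, the slow node in this example would need about 333 time units while the fast nodes sit idle after about 167; the skewed quotas let all three finish at the same time.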
Acknowledgement
This work was supported by the National Natural Science Foundation of China (62072089) and the Fundamental Research Funds for the Central Universities of China (N2116016, N2104001, N2019007).
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Zhang, X., Xin, J., Chen, J., Wang, B., Wang, Z. (2023). HAEP: Heterogeneous Environment Aware Edge Partitioning for Power-Law Graphs. In: Wang, X., et al. Database Systems for Advanced Applications. DASFAA 2023. Lecture Notes in Computer Science, vol 13945. Springer, Cham. https://doi.org/10.1007/978-3-031-30675-4_23
DOI: https://doi.org/10.1007/978-3-031-30675-4_23
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-30674-7
Online ISBN: 978-3-031-30675-4
eBook Packages: Computer Science, Computer Science (R0)