Skip to main content
Log in

L-PowerGraph: a lightweight distributed graph-parallel communication mechanism

  • Published:
The Journal of Supercomputing Aims and scope Submit manuscript

Abstract

In order to process complex and large-scale graph data, numerous distributed graph-parallel computing platforms have been proposed. PowerGraph is an excellent representative of them. It has exhibited better performance, such as faster graph-processing rate and higher scalability, than others. However, like in other distributed graph computing systems, unnecessary and excessive communications among computing nodes in PowerGraph not only aggravate the network I/O workload of the underlying computing hardware systems but may also cause a decrease in runtime performance. In this paper, we propose and implement a mechanism called L-PowerGraph, which reduces the communication overhead in PowerGraph. First, L-PowerGraph identifies and eliminates the avoidable communications in PowerGraph. Second, in order to further reduce the required communications L-PowerGraph proposes an edge direction-aware master appointment strategy, in which L-PowerGraph appoints the replica with both incoming and outgoing edges as master. Third, L-PowerGraph proposes an edge direction-aware graph partition strategy, which optimally isolates the outgoing edges from the incoming edges of a vertex during the graph partition process. We have conducted extensive experiments using real-world datasets, and our results verified the effectiveness of the proposed mechanism. For example, compared with PowerGraph under Random partition scenario L-PowerGraph can not only reduce up to 30.5% of the communication overhead but also cut up to 20.3% of the runtime for PageRank algorithm while processing Live-journal dataset. The performance improvement achieved by L-PowerGraph over our precursor work, LightGraph, which only reduces the synchronizing communication overhead, is also verified by our experimental results.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Fig. 1
Fig. 2
Fig. 3
Fig. 4
Fig. 5
Fig. 6
Fig. 7
Fig. 8
Fig. 9
Fig. 10
Fig. 11
Fig. 12
Fig. 13
Fig. 14
Fig. 15
Fig. 16
Fig. 17
Fig. 18
Fig. 19
Fig. 20
Fig. 21
Fig. 22
Fig. 23
Fig. 24

Similar content being viewed by others

References

  1. Kyrola A, Blelloch G, Guestrin C (2012) Graphchi: large-scale graph computation on just a PC. In: Proceedings of the 10th USENIX Conference on Operating Systems Design and Implementation, OSDI’12. USENIX Association, Berkeley, CA, USA, pp 31–46. URL http://dl.acm.org/citation.cfm?id=2387880.2387884

  2. Low Y, Gonzalez J, Kyrola A, Bickson D, Guestrin C, Hellerstein JM (2010) Graphlab: a new parallel framework for machine learning. In: Proceedings of the Conference on Uncertainty in Artificial Intelligence (UAI), Catalina Island, California

  3. Han W-S, Lee S, Park K, Lee J-H, Kim M-S, Kim J, Yu H (2013) Turbograph: a fast parallel graph engine handling billion-scale graphs in a single PC. In: Proceedings of the 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD ’13. ACM, New York, NY, USA, pp 77–85. https://doi.org/10.1145/2487575.2487581

  4. Shun J, Blelloch GE (2013) Ligra: a lightweight graph processing framework for shared memory. In: Proceedings of the 18th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPoPP ’13. ACM, New York, NY, USA, pp 135–146. https://doi.org/10.1145/2442516.2442530

  5. Pearce R, Gokhale M, Amato NM (2010) Multithreaded asynchronous graph traversal for in-memory and semi-external memory. In: Proceedings of the 2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis, SC ’10. IEEE Computer Society, Washington, DC, USA, pp 1–11. https://doi.org/10.1109/SC.2010.34

  6. Prabhakaran V, Wu M, Weng X, McSherry F, Zhou L, Haridasan M (2010) Managing large graphs on multi-cores with graph awareness. In: Proceedings of the 2012 USENIX Conference on Annual Technical Conference, USENIX ATC’12. USENIX Association, Berkeley, CA, USA, p 4. http://dl.acm.org/citation.cfm?id=2342821.2342825

  7. Roy A, Mihailovic I, Zwaenepoel W (2013) X-stream: edge-centric graph processing using streaming partitions. In: Proceedings of the Twenty-Fourth ACM Symposium on Operating Systems Principles. ACM, pp 472–488

  8. Shun J, Blelloch GE (2013) Ligra: a lightweight graph processing framework for shared memory. In: ACM SIGPLAN Notices, vol 48. ACM, pp 135–146

  9. Gregor D, Lumsdaine A (2005) The parallel BGL: a generic library for distributed graph computations. In: In Parallel Object-Oriented Scientific Computing (POOSC)

  10. Malewicz G, Austern MH, Bik AJ, Dehnert JC, Horn I, Leiser N, Czajkowski G (2010) Pregel: a system for large-scale graph processing. In: Proceedings of the 2010 ACM SIGMOD International Conference on Management of Data, SIGMOD ’10. ACM, New York, NY, USA, pp 135–146. https://doi.org/10.1145/1807167.1807184

  11. Salihoglu S, Widom J (2013) GPS: a graph processing system. In: Proceedings of the 25th International Conference on Scientific and Statistical Database Management, SSDBM. ACM, New York, NY, USA, pp 22:1–22:12. https://doi.org/10.1145/2484838.2484843

  12. Cheng R, Hong J, Kyrola A, Miao Y, Weng X, Wu M, Yang F, Zhou L, Zhao F, Chen E (2012) Kineograph: taking the pulse of a fast-changing and connected world. In: Proceedings of the 7th ACM European Conference on Computer Systems, EuroSys ’12. ACM, New York, NY, USA, pp 85–98. https://doi.org/10.1145/2168836.2168846

  13. Apache incubator giraph http://incubator.apache.org/giraph/. Accessed June 2011

  14. Stutz P, Bernstein A, Cohen W (2010) Signal/collect: graph algorithms for the (semantic) web. In: Proceedings of the 9th International Semantic Web Conference on the Semantic Web—Volume Part I, ISWC’10. Springer, Berlin, pp 764–780. http://dl.acm.org/citation.cfm?id=1940281.1940330. Accessed 7 Nov 2010

  15. Low Y, Bickson D, Gonzalez J, Guestrin C, Kyrola A, Hellerstein JM (2012) Distributed GraphLab: a framework for machine learning and data mining in the cloud. Proc VLDB Endow 5(8):716–727. http://dl.acm.org/citation.cfm?id=2212351.2212354. Accessed 1 Apr 2012

  16. Gonzalez JE, Low Y, Gu H, Bickson D, Guestrin C (2012) Powergraph: distributed graph-parallel computation on natural graphs. In: Proceedings of the 10th USENIX Conference on Operating Systems Design and Implementation, OSDI’12. USENIX Association, Berkeley, CA, USA, pp 17–30. http://dl.acm.org/citation.cfm?id=2387880.2387883. Accessed 8 Oct 2012

  17. Chen R, Shi J, Chen Y, Chen H (2015) Powerlyra: differentiated graph computation and partitioning on skewed graphs. In: Proceedings of the Tenth European Conference on Computer Systems. ACM, p 1

  18. Zhao Y, Yoshigoe K, Bian J, Xie M, Xue Z, Feng Y (2016) A distributed graph-parallel computing system with lightweight communication overhead. IEEE Trans Big Data 2(3):204–218

    Article  Google Scholar 

  19. Nai L, Xia Y, Tanase IG, Kim H, Lin C-Y (2015) GraphBIG: understanding graph computing in the context of industrial solutions. In: 2015 SC-International Conference for High Performance Computing, Networking, Storage and Analysis. IEEE, pp 1–12

  20. Sundaram N, Satish N, Patwary MMA, Dulloor SR, Anderson MJ, Vadlamudi SG, Das D, Dubey P (2015) Graphmat: high performance graph analytics made productive. Proc VLDB Endow 8(11):1214–1225

    Article  Google Scholar 

  21. Zhao Y, Yoshigoe K, Xie M, Zhou S, Seker R, Bian J (2014) Lightgraph: lighten communication in distributed graph-parallel processing. In: 2014 IEEE International Congress on Big Data (BigData Congress). IEEE, pp 717–724

  22. Page L, Brin S, Motwani R, Winograd T (1999) The PageRank citation ranking: bringing order to the web. Technical Report, Stanford InfoLab

    Google Scholar 

  23. Hung BQ, Otsubo M, Hijikata Y, Nishida S (2010) Hits algorithm improvement using semantic text portion. Web Intell Agent Syst 8(2):149–164. http://dblp.uni-trier.de/db/journals/wias/wias8.html_HungOHN10. Accessed 1 Jan 2010

  24. Lempel R, Moran S (2005) Rank-stability and rank-similarity of link-based web ranking algorithms in authority-connected graphs. Inf Retr 8(2):245–264

    Article  Google Scholar 

  25. Gonzalez JE, Xin RS, Dave A, Crankshaw D, Franklin MJ, Stoica I (2014) Graphx: graph processing in a distributed dataflow framework. In: Proceedings of OSDI, pp 599–613

  26. Mayer C, Tariq MA, Mayer R, Rothermel K (2018) GrapH: Traffic-Aware Graph Processing. In: IEEE Transactions on Parallel and Distributed Systems. https://doi.org/10.1109/TPDS.2018.2794989

  27. Snap (2006) http://snap.stanford.edu/data/soc-LiveJournal1.html. Accessed 06 Aug 2016

  28. Kwak H, Lee C, Park H, Moon S (2010) What is Twitter, a social network or a news media? In: Proceedings of the 19th International Conference on World Wide Web. ACM, pp 591–600

  29. Gjoka M, Kurant M, Butts CT, Markopoulou A (2010) Walking in Facebook: a case study of unbiased sampling of OSNs. In: INFOCOM, 2010 Proceedings IEEE. IEEE, pp 1–9

  30. Hong S, Depner S, Manhardt T, Van Der Lugt J, Verstraaten M, Chafi H (2015) PGX. D: a fast distributed graph processing engine. In: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis. ACM, p 58

  31. McSherry FD, Isaacs R, Isard MA, Murray DG (2012) Differential dataflow. US Patent App. 13/468,726

  32. Murray DG, McSherry F, Isaacs R, Isard M, Barham P, Abadi M (2013) Naiad: a timely dataflow system. In: Proceedings of the Twenty-Fourth ACM Symposium on Operating Systems Principles. ACM, pp 439–455

  33. Zaharia M, Das T, Li H, Hunter T, Shenker S, Stoica I (2013) Discretized streams: fault-tolerant streaming computation at scale. In: Proceedings of the Twenty-Fourth ACM Symposium on Operating Systems Principles. ACM, pp 423–438

  34. Zhao Y, Yoshigoe K, Xie M, Zhou S, Seker R, Bian J (2014) Evaluation and analysis of distributed graph-parallel processing frameworks. J Cyber Secur 3:289–316

    Article  Google Scholar 

  35. Guo Y, Biczak M, Varbanescu AL, Iosup A, Martella C, Willke TL (2014) How well do graph-processing platforms perform? An empirical performance evaluation and analysis. In: Parallel and Distributed Processing Symposium, 2014 IEEE 28th International. IEEE, pp 395–404

  36. Elser B, Montresor A (2013) An evaluation study of bigdata frameworks for graph processing. In: 2013 IEEE International Conference on Big Data. IEEE, pp 60–67

  37. Zhou C, Gao J, Sun B, Yu JX (2014) MOCgraph: scalable distributed graph processing using message online computing. Proc VLDB Endow 8(4):377–388

    Article  Google Scholar 

  38. Nguyen D, Lenharth A, Pingali K (2013) A lightweight infrastructure for graph analytics. In: Proceedings of the Twenty-Fourth ACM Symposium on Operating Systems Principles. ACM, pp 456–471

  39. Kulkarni M, Pingali K, Walter B, Ramanarayanan G, Bala K, Chew LP (2007) Optimistic parallelism requires abstractions. In: ACM SIGPLAN Notices, vol 42. ACM, pp 211–222

  40. Chen R, Ding X, Wang P, Chen H, Zang B, Guan H (2014) Computation and communication efficient graph processing with distributed immutable view. In: Proceedings of the 23rd International Symposium on High-Performance Parallel and Distributed Computing. ACM, pp 215–226

  41. The apache hama project. http://hama.apache.org/. Accessed 30 Nov 2010

  42. Shi X, Luo X, Liang J, Zhao P, Di S, He B, Jin H (2018) Frog: asynchronous graph processing on GPU with hybrid coloring model. IEEE Trans Knowl Data Eng 30(1):29–42

    Article  Google Scholar 

  43. Xiao W, Xue J, Miao Y, Li Z, Chen C, Wu M, Li W, Zhou L (2017) Tux2: distributed graph computation for machine learning. In: NSDI, pp 669–682

  44. Yan D, Huang Y, Liu M, Chen H, Cheng J, Wu H, Zhang C (2018) GraphD: distributed vertex-centric graph processing beyond the memory limit. IEEE Trans Parallel Distrib Syst 29(1):99–114

    Article  Google Scholar 

  45. Pearce R, Gokhale M, Amato NM (2014) Faster parallel traversal of scale free graphs at extreme scale with vertex delegates. In: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis. IEEE Press, pp 549–559

Download references

Acknowledgements

This work was supported in part by the National Science Foundation under Grant CRI CNS-0855248, and Grant MRI CNS-0619069.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Yue Zhao.

Rights and permissions

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Zhao, Y., Yoshigoe, K., Xie, M. et al. L-PowerGraph: a lightweight distributed graph-parallel communication mechanism. J Supercomput 76, 1850–1879 (2020). https://doi.org/10.1007/s11227-018-2359-9

Download citation

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s11227-018-2359-9

Keywords

Navigation