Encyclopedia of Big Data Technologies

Living Edition
Editors: Sherif Sakr, Albert Zomaya

Parallel Graph Processing

Living reference work entry
DOI: https://doi.org/10.1007/978-3-319-63962-8_272-1

Definition

The term parallel graph processing refers to the use of multiple cores to process a graph in order to (1) speed up processing and (2) scale to larger graphs. The environment can be (1) a stand-alone machine running multiple threads or (2) a distributed cluster of machines (i.e., a shared-nothing architecture).
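As a minimal sketch of the stand-alone, multi-threaded setting named in the definition, the Python fragment below runs a level-synchronous breadth-first search in which each level's frontier is split across worker threads. The function names, the adjacency-dictionary graph layout, and the choice of BFS are assumptions made for illustration; they are not taken from the entry. In CPython the global interpreter lock limits the actual speedup, so the code shows the structure of the computation rather than its performance.

```python
from concurrent.futures import ThreadPoolExecutor


def expand(graph, chunk, dist):
    """Return the not-yet-visited neighbours of the vertices in one frontier chunk."""
    found = set()
    for v in chunk:
        for w in graph.get(v, ()):
            if w not in dist:
                found.add(w)
    return found


def parallel_bfs(graph, source, num_workers=4):
    """Level-synchronous BFS; every level's frontier is split across worker threads."""
    dist = {source: 0}
    frontier = [source]
    level = 0
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        while frontier:
            level += 1
            # Round-robin split of the frontier, one chunk per worker.
            chunks = [frontier[i::num_workers] for i in range(num_workers)]
            # Workers only read `dist`; unpacking the results waits for all of
            # them, so the writes below happen after every worker has finished.
            results = pool.map(lambda c: expand(graph, c, dist), chunks)
            next_frontier = set().union(*results)
            for w in next_frontier:
                dist[w] = level
            frontier = list(next_frontier)
    return dist
```

Calling `parallel_bfs(graph, 0)` on a graph stored as `{vertex: [neighbours]}` returns the hop distance of every vertex reachable from vertex 0. In a distributed, shared-nothing setting the chunks would live on different machines and the merge step would become network communication.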

Overview

Modern big graph processing systems place emphasis on two aspects:
  1. user-friendliness: the programming interface should be designed based on an intuitive computation model, to allow algorithm developers to focus on the graph analytics logic without touching low-level execution details (e.g., network communication);
  2. efficiency: the underlying execution engine should guarantee high-throughput execution and automatically support fault tolerance and horizontal/vertical scaling.

Since comprehensive surveys of this area already exist (Yan et al. 2017a,d), this chapter aims at a succinct and more up-to-date review of the key programming...
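The user-friendliness aspect listed above, an interface that hides execution details from the algorithm developer, can be sketched as follows. The preview text does not name a specific computation model, so this sketch assumes a vertex-centric model with synchronous supersteps in the style of Pregel (Malewicz et al. 2010). The engine `run_supersteps` and the user function `min_label` are hypothetical names, the engine is sequential for brevity, and the graph is assumed to be undirected with every neighbour also present as a key.

```python
def run_supersteps(graph, init_value, compute, max_supersteps=100):
    """Hypothetical engine: synchronous supersteps with messages sent along edges."""
    values = {v: init_value(v) for v in graph}
    inbox = {v: [] for v in graph}
    active = set(graph)  # every vertex runs in the first superstep
    for step in range(max_supersteps):
        if not active:
            break
        outbox = {v: [] for v in graph}
        for v in active:
            new_value, message = compute(step, values[v], inbox[v])
            values[v] = new_value
            if message is not None:
                for w in graph[v]:
                    outbox[w].append(message)
        # Messages sent in this superstep are delivered in the next one;
        # only vertices that received a message stay active.
        inbox = outbox
        active = {v for v, msgs in inbox.items() if msgs}
    return values


def min_label(step, value, messages):
    """User logic: adopt the smallest label seen and announce it only when it changes."""
    best = min(messages, default=value)
    if step == 0 or best < value:
        new_value = min(best, value)
        return new_value, new_value
    return value, None
```

With `run_supersteps(graph, lambda v: v, min_label)`, every vertex ends up labelled with the smallest vertex identifier in its connected component. The user-supplied function contains only the per-vertex logic; scheduling and message delivery, which a real system would distribute and make fault tolerant, stay inside the engine.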


References

  1. Carbone P, Katsifodimos A, Ewen S, Markl V, Haridi S, Tzoumas K (2015) Apache flink™: stream and batch processing in a single engine. IEEE Data Eng Bull 38(4):28–38
  2. Ching A, Edunov S, Kabiljo M, Logothetis D, Muthukrishnan S (2015) One trillion edges: graph processing at facebook-scale. PVLDB 8(12):1804–1815
  3. Fan W, Xu J, Wu Y, Yu W, Jiang J, Zheng Z, Zhang B, Cao Y, Tian C (2017) Parallelizing sequential graph computations. In: SIGMOD, pp 495–510
  4. Gonzalez JE, Low Y, Gu H, Bickson D, Guestrin C (2012) Powergraph: distributed graph-parallel computation on natural graphs. In: OSDI, pp 17–30
  5. Gonzalez JE, Xin RS, Dave A, Crankshaw D, Franklin MJ, Stoica I (2014) Graphx: graph processing in a distributed dataflow framework. In: OSDI, pp 599–613
  6. Karypis G, Kumar V (1998) A fast and high quality multilevel scheme for partitioning irregular graphs. SIAM J Sci Comput 20(1):359–392
  7. Kyrola A, Blelloch GE, Guestrin C (2012) GraphChi: large-scale graph computation on just a PC. In: OSDI, pp 31–46
  8. Liu H, Huang HH (2017) Graphene: fine-grained IO management for graph computing. In: FAST, pp 285–300
  9. Lu Y, Cheng J, Yan D, Wu H (2014) Large-scale distributed graph computing systems: an experimental evaluation. PVLDB 8(3):281–292
  10. Malewicz G, Austern MH, Bik AJC, Dehnert JC, Horn I, Leiser N, Czajkowski G (2010) Pregel: a system for large-scale graph processing. In: SIGMOD, pp 135–146
  11. Quamar A, Deshpande A, Lin JJ (2016) Nscale: neighborhood-centric large-scale graph analytics in the cloud. VLDB J 25(2):125–150
  12. Quick L, Wilkinson P, Hardcastle D (2012) Using pregel-like large scale graph processing frameworks for social network analysis. In: International conference on advances in social networks analysis and mining, ASONAM 2012, Istanbul, pp 457–463
  13. Roy A, Mihailovic I, Zwaenepoel W (2013) X-stream: edge-centric graph processing using streaming partitions. In: SOSP, pp 472–488
  14. Salihoglu S, Widom J (2013) GPS: a graph processing system. In: SSDBM, pp 22:1–22:12
  15. Tian Y, Balmin A, Corsten SA, Tatikonda S, McPherson J (2013) From “think like a vertex” to “think like a graph”. PVLDB 7(3):193–204
  16. Yan D, Cheng J, Lu Y, Ng W (2014a) Blogel: a block-centric framework for distributed computation on real-world graphs. PVLDB 7(14):1981–1992
  17. Yan D, Cheng J, Xing K, Lu Y, Ng W, Bu Y (2014b) Pregel algorithms for graph connectivity problems with performance guarantees. PVLDB 7(14):1821–1832
  18. Yan D, Cheng J, Lu Y, Ng W (2015) Effective techniques for message reduction and load balancing in distributed graph computation. In: WWW, pp 1307–1317
  19. Yan D, Bu Y, Tian Y, Deshpande A (2017a) Big graph analytics platforms. Found Trends Databases 7(1–2):1–195. https://doi.org/10.1561/1900000056
  20. Yan D, Chen H, Cheng J, Özsu MT, Zhang Q, Lui JCS (2017b) G-thinker: big graph mining made easier and faster. CoRR abs/1709.03110
  21. Yan D, Huang Y, Liu M, Chen H, Cheng J, Wu H, Zhang C (2017c) GraphD: distributed vertex-centric graph processing beyond the memory limit. IEEE Trans Parallel Distrib Syst 29(1):99–114
  22. Yan D, Tian Y, Cheng J (2017d) Systems for big graph analytics. Springer briefs in computer science. Springer, Cham. https://doi.org/10.1007/978-3-319-58217-7
  23. Yan D, Chen H, Cheng J, Cai Z, Shao B (2018) Scalable de novo genome assembly using pregel. In: ICDE
  24. Zhang Y, Gao Q, Gao L, Wang C (2014) Maiter: an asynchronous graph processing framework for delta-based accumulative iterative computation. IEEE Trans Parallel Distrib Syst 25(8):2091–2100
  25. Zhou C, Gao J, Sun B, Yu JX (2014) Mocgraph: scalable distributed graph processing using message online computing. PVLDB 8(4):377–388

Copyright information

© Springer International Publishing AG 2018

Authors and Affiliations

  1. Department of Computer Science, The University of Alabama at Birmingham, Birmingham, USA
  2. University of Massachusetts Lowell, Lowell, USA

Section editors and affiliations

  • Hannes Voigt (1)
  • George Fletcher (2)
  1. Dresden Database Systems Group, Technische Universität Dresden, Dresden, Germany
  2. Department of Mathematics and Computer Science, Eindhoven University of Technology