Skip to main content

Advertisement

Log in

Efficient graph computation on hybrid CPU and GPU systems

  • Published:
The Journal of Supercomputing Aims and scope Submit manuscript

Abstract

Graphs are used to model many real objects such as social networks and web graphs. Many real applications in various fields require efficient and effective management of large-scale, graph-structured data. Although distributed graph engines such as GBase and Pregel handle billion-scale graphs, users need to be skilled at managing and tuning a distributed system in a cluster, which is a non-trivial job for ordinary users. Furthermore, these distributed systems need many machines in a cluster in order to provide reasonable performance. Several recent works proposed non-distributed graph processing platforms as complements to distributed platforms. In fact, efficient non-distributed platforms require less hardware resource and can achieve better energy efficiency than distributed ones. GraphChi is a representative non-distributed platform that is disk-based and can process billions of edges on CPUs in a single PC. However, the design drawbacks of GraphChi on I/O and computation model have limited its parallelism and performance. In this paper, we propose a general, disk-based graph engine called gGraph to process billion-scale graphs efficiently by utilizing both CPUs and GPUs in a single PC. GGraph exploits full parallelism and full overlap of computation and I/O processing as much as possible. Experiment results show that gGraph outperforms GraphChi and PowerGraph. In addition, gGraph achieves the best energy efficiency among all evaluated platforms.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Fig. 1
Fig. 2
Fig. 3
Fig. 4
Fig. 5
Fig. 6
Fig. 7
Fig. 8
Fig. 9
Fig. 10

Similar content being viewed by others

References

  1. Addario-Berry L, Kennedy WS, King AD, Li Z, Reed B (2010) Finding a maximum-weight induced k-partite subgraph of an i-triangulated graph. Discret Appl Math 158(7):765–770

    Article  MATH  MathSciNet  Google Scholar 

  2. Arabnia HR (1990) A parallel algorithm for the arbitrary rotation of digitized images using process-and-data-decomposition approach. J Parallel Distrib Comput 10(2):188–192

    Article  Google Scholar 

  3. Arabnia HR, Oliver MA (1989) A transputer network for fast operations on digitised images. In: Arnold D, Ruiter Bd (eds.) Computer graphics forum, vol. 8. Wiley Online Library, pp 3–11

  4. Avery C (2011) Giraph: large-scale graph processing infrastruction on Hadoop. In: Proceedings of Hadoop Summit

  5. Boldi P, Santini M, Vigna S (2008) A large time-aware web graph. In: ACM SIGIR forum, vol 42. ACM, pp 33–38

  6. Chakrabarti D, Zhan Y, Faloutsos C (2004) R-MAT: a recursive model for graph mining. In: SIAM international conference on data mining (SDM), vol 4. SIAM, pp 442–446

  7. Chan A, Dehne F, Taylor R (2005) Cgmgraph/cgmlib: implementing and testing cgm graph algorithms on PC clusters and shared memory machines. Int J High Perform Comput Appl 19(1):81–97

    Article  Google Scholar 

  8. Che S, Beckmann B, Reinhardt S, Skadron K (2013) Pannotia: understanding irregular GPGPU graph applications. In: IEEE international symposium on workload characterization (IISWC), pp 185–195

  9. Fackbook (2015) Facebook statistics. http://www.statisticbrain.com/facebook-statistics/

  10. Fang J, Varbanescu AL, Sips H (2011) A comprehensive performance comparison of CUDA and OpenCL. In: 2011 international conference on parallel processing (ICPP). IEEE, pp 216–225

  11. Ferrer R, Planas J, Bellens P, Duran A, Gonzalez M, Martorell X, Badia RM, Ayguade E, Labarta J (2011) Optimizing the exploitation of multicore processors and GPUs with OpenMP and OpenCL. In: Cooper K, Mellor-Crummey J, Sarkar V (eds.) Languages and compilers for parallel computing. Springer, Berlin, pp 215–229

  12. Gharaibeh A, Costa LB, Santos-Neto E, Ripeanu M (2013) On graphs, GPUs, and blind dating: a workload to processor matchmaking quest. In: 2013 IEEE 27th international symposium on parallel and distributed processing (IPDPS). IEEE, pp 851–862

  13. Gharaibeh A, Santos-Neto E, Costa LB, Ripeanu M (2013) Efficient large-scale graph processing on hybrid CPU and GPU systems. arXiv:1312.3018

  14. Gonzalez JE, Low Y, Gu H, Bickson D, Guestrin C (2012) Powergraph: distributed graph-parallel computation on natural graphs. In: Proceedings of the 10th USENIX symposium on operating systems design and implementation (OSDI), pp 17–30

  15. Graph 500 Steering Committee and others: Graph 500 benchmark. http://www.graph500.org/

  16. Gregor D, Lumsdaine A (2005) The parallel bgl: a generic library for distributed graph computations. In: Parallel Object-Oriented Scientific Computing (POOSC)

  17. Harish P, Vineet V, Narayanan P (2009) Large graph algorithms for massively multithreaded architectures. Centre for Visual Information Technology, I. Institute of Information Technology, Hyderabad, India, Tech. Rep. IIIT/TR/2009/74

  18. ILK Research Group, Tilburg University (2015) The size of the world wide web (the internet). http://www.worldwidewebsize.com

  19. Isard M, Budiu M, Yu Y, Birrell A, Fetterly D (2007) Dryad: distributed data-parallel programs from sequential building blocks. ACM SIGOPS Oper Syst Rev 41(3):59–72

    Article  Google Scholar 

  20. Kang U, Tong H, Sun J, Lin CY, Faloutsos C (2012) Gbase: an efficient analysis platform for large graphs. VLDB J 21(5):637–650

    Article  Google Scholar 

  21. Kang U, Tsourakakis CE, Faloutsos C (2009) Pegasus: a peta-scale graph mining system implementation and observations. In: Ninth IEEE international conference on data mining (ICDM). IEEE, pp 229–238

  22. Karypis G, Kumar V (1999) Parallel multilevel series k-way partitioning scheme for irregular graphs. SIAM Rev 41(2):278–300

    Article  MATH  MathSciNet  Google Scholar 

  23. Kwak H, Lee C, Park H, Moon S (2010) What is Twitter, a social network or a news media? In: Proceedings of the 19th international conference on World wide web. ACM, pp 591–600

  24. Kyrola A, Blelloch G, Guestrin C (2012) Graphchi: large-scale graph computation on just a PC. In: Proceedings of the 10th USENIX symposium on operating systems design and implementation (OSDI), vol 8, pp 31–46

  25. Lee J, Han WS, Kasperovics R, Lee JH (2012) An in-depth comparison of subgraph isomorphism algorithms in graph databases. In: Proceedings of the 39th international conference on very large data bases. VLDB Endowment, pp 133–144

  26. Low Y, Bickson D, Gonzalez J, Guestrin C, Kyrola A, Hellerstein JM (2012) Distributed graphlab: a framework for machine learning and data mining in the cloud. In: Proceedings of the very large data bases (VLDB) endowment, vol 5, issue 8, pp 716–727

  27. Malewicz G, Austern MH, Bik AJ, Dehnert JC, Horn I, Leiser N, Czajkowski G (2010) Pregel: a system for large-scale graph processing. In: Proceedings of the 2010 ACM international conference on management of data (SIGMOD). ACM, pp 135–146

  28. Maserrat H, Pei J (2010) Neighbor query friendly compression of social networks. In: Proceedings of the 16th ACM international conference on Knowledge discovery and data mining (SIGKDD). ACM, pp 533–542

  29. Merrill D, Garland M, Grimshaw A (2012) Scalable GPU graph traversal. In: ACM SIGPLAN notices, vol 47. ACM, pp 117–128

  30. Neo Technology, inc. (2015) Learning neo4j. http://neo4j.com/book-learning-neo4j/

  31. Page L, Brin S, Motwani R, Winograd T (1999) The pagerank citation ranking: bringing order to the web

  32. Pearce R, Gokhale M, Amato NM (2010) Multithreaded asynchronous graph traversal for in-memory and semi-external memory. In: Proceedings of the 2010 ACM/IEEE international conference for high performance computing, networking, storage and analysis. IEEE Computer Society, pp 1–11

  33. Salihoglu S, Widom J (2013) Gps: a graph processing system. In: Proceedings of the 25th international conference on scientific and statistical database management. ACM, p 22

  34. Shao B, Wang H, Li Y (2013) Trinity: a distributed graph engine on a memory cloud. In: Proceedings of the 2013 international conference on Management of data. ACM, pp 505–516

  35. Shen J, Fang J, Sips H, Varbanescu AL (2012) Performance gaps between OpenMP and OpenCL for multi-core CPUs. In: 2012 41st international conference on parallel processing workshops (ICPPW). IEEE, pp 116–125

  36. Shiloach Y, Vishkin U (1982) An O(log(n)) parallel connectivity algorithm. J Algorithms 3(1):57–67

    Article  MATH  MathSciNet  Google Scholar 

  37. Shun J, Blelloch GE (2013) Ligra: a lightweight graph processing framework for shared memory. In: Proceedings of the 18th ACM symposium on principles and practice of parallel programming (SIGPLAN). ACM, pp 135–146

  38. Valiant LG (1990) A bridging model for parallel computation. Commun ACM 33(8):103–111

    Article  Google Scholar 

  39. Vineet V, Harish P, Patidar S, Narayanan P (2009) Fast minimum spanning tree for large graphs on the GPU. In: Proceedings of the conference on high performance graphics 2009. ACM, pp 167–171

  40. Yang J, Leskovec J (2012) Defining and evaluating network communities based on ground-truth. In: Proceedings of the ACM SIGKDD workshop on mining data semantics. ACM, p 3

Download references

Acknowledgments

The authors would like to thank anonymous reviewers for their fruitful feedback and comments that have helped them improve the quality of this work. This work is partly supported by the National Natural Science Foundation of China (Grant No. 61202026, No. 61332001 and No. 61373155) and Program of China National 1000 Young Talent Plan.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Tao Zhang.

Rights and permissions

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Zhang, T., Zhang, J., Shu, W. et al. Efficient graph computation on hybrid CPU and GPU systems. J Supercomput 71, 1563–1586 (2015). https://doi.org/10.1007/s11227-015-1378-z

Download citation

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s11227-015-1378-z

Keywords

Navigation