EvoGraph: On-the-Fly Efficient Mining of Evolving Graphs on GPU

Sengupta, Dipanjan; Song, Shuaiwen Leon

doi:10.1007/978-3-319-58667-0_6

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 10266)

Included in the following conference series:

International Conference on High Performance Computing

Abstract

With the prevalence of the World Wide Web and social networks, there has been growing interest in high-performance analytics for constantly evolving dynamic graphs. Modern GPUs provide a massive amount of parallelism for efficient graph processing, but challenges remain because GPUs lack support for the near-real-time streaming nature of dynamic graphs. Specifically, given the current high volume and velocity of graph data combined with the complexity of user queries, traditional processing methods that first store the updates and then repeatedly run static graph analytics on a sequence of versions or snapshots are undesirable and computationally infeasible on GPUs. We present EvoGraph, a highly efficient and scalable GPU-based dynamic graph analytics framework that incrementally processes graphs on the fly using fixed-size batches of updates. The runtime realizes this vision with a user-friendly programming model, a vertex property-based optimization that chooses between static and incremental execution, and efficient utilization of all hardware resources, including the GPU's computational and data movement engines, through GPU streams. Extensive experimental evaluations on a wide variety of graph inputs and algorithms demonstrate that EvoGraph achieves up to 429 million updates/sec and over 232x speedup compared to competing frameworks such as STINGER.
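The batch-oriented idea in the abstract can be illustrated with a small CPU-only sketch. It is not EvoGraph's implementation or API: the algorithm (connected components under edge insertions), the function names, and the `static_threshold` rule (fall back to full recomputation when a batch touches more than a given fraction of vertices) are all assumptions chosen for illustration, standing in for the paper's vertex property-based choice between static and incremental execution.

```python
from itertools import islice

def batches(stream, batch_size):
    """Yield lists of at most batch_size edge updates from an iterable stream."""
    it = iter(stream)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch

def static_components(num_vertices, edges):
    """'Static' path: recompute min-id component labels from scratch."""
    labels = list(range(num_vertices))
    changed = True
    while changed:
        changed = False
        for u, v in edges:
            low = min(labels[u], labels[v])
            if labels[u] != low or labels[v] != low:
                labels[u] = labels[v] = low
                changed = True
    return labels

def incremental_update(labels, adj, batch):
    """'Incremental' path: insert edges, then repair labels from the frontier only."""
    frontier = set()
    for u, v in batch:
        adj[u].append(v)
        adj[v].append(u)
        if labels[u] != labels[v]:      # inconsistent endpoints become active
            frontier.update((u, v))
    while frontier:
        next_frontier = set()
        for u in frontier:
            for v in adj[u]:
                if labels[u] < labels[v]:
                    labels[v] = labels[u]
                    next_frontier.add(v)
                elif labels[v] < labels[u]:
                    labels[u] = labels[v]
                    next_frontier.add(u)
        frontier = next_frontier
    return labels

def process_stream(num_vertices, stream, batch_size, static_threshold=0.5):
    """Process insert-only edge updates batch by batch.

    static_threshold is a hypothetical heuristic, not taken from the paper.
    """
    labels = list(range(num_vertices))
    adj = [[] for _ in range(num_vertices)]
    edges, modes = [], []
    for batch in batches(stream, batch_size):
        touched = {x for edge in batch for x in edge}
        edges.extend(batch)
        if len(touched) / num_vertices > static_threshold:
            for u, v in batch:
                adj[u].append(v)
                adj[v].append(u)
            labels = static_components(num_vertices, edges)
            modes.append("static")
        else:
            incremental_update(labels, adj, batch)
            modes.append("incremental")
    return labels, modes

stream = [(0, 1), (2, 3), (4, 5), (1, 2), (5, 6)]
labels, modes = process_stream(8, stream, batch_size=3)
# labels == [0, 0, 0, 0, 4, 4, 4, 7]; modes == ["static", "incremental"]
```

The first batch touches six of eight vertices and is recomputed from scratch, while the second touches only four and is repaired from its frontier, so both paths reach the same labels a full recomputation would give.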

Notes

1. In this paper, we use the NVIDIA CUDA terminology to describe the GPU architecture. However, our work is independent of the terminology itself.
2. Computational frontier describes the number of inconsistent/active vertices in a given iteration.

Author information

Authors and Affiliations

Georgia Tech, Atlanta, USA
Dipanjan Sengupta
Pacific Northwest National Lab, Richland, USA
Shuaiwen Leon Song

Corresponding author

Correspondence to Shuaiwen Leon Song.

Editor information

Editors and Affiliations

Deutsches Klimarechenzentrum (DKRZ), Hamburg, Germany
Julian M. Kunkel
Tokyo Institute of Technology, Tokyo, Japan
Rio Yokota
Argonne National Laboratory, Argonne, IL, USA
Pavan Balaji
KAUST, Thuwal, Saudi Arabia
David Keyes

About this paper

Cite this paper

Sengupta, D., Song, S.L. (2017). EvoGraph: On-the-Fly Efficient Mining of Evolving Graphs on GPU. In: Kunkel, J.M., Yokota, R., Balaji, P., Keyes, D. (eds) High Performance Computing. ISC High Performance 2017. Lecture Notes in Computer Science, vol 10266. Springer, Cham. https://doi.org/10.1007/978-3-319-58667-0_6

DOI: https://doi.org/10.1007/978-3-319-58667-0_6
Published: 12 May 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-58666-3
Online ISBN: 978-3-319-58667-0
eBook Packages: Computer Science, Computer Science (R0)
