SGraph: A Distributed Streaming System for Processing Big Graphs

  • Cheng Chen
  • Hejun Wu (corresponding author)
  • Dyce Jing Zhao
  • Da Yan
  • James Cheng
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9784)


Big graph processing is widely used in computational domains ranging from language modeling to social networks. Graph-parallel systems have been proposed to process such big graphs on clusters with up to hundreds of nodes. However, the size of a big graph often exceeds the main memory available in a small cluster, and as a consequence task failures happen frequently. To address this problem, we propose SGraph, a distributed streaming graph processing system built on top of Spark. SGraph introduces a streaming data model that avoids loading the entire graph, which may exceed the available RAM, into memory at once. In addition, SGraph leverages an edge-centric scatter-gather computing model in which graph algorithms can be implemented conveniently. Experiments demonstrate that SGraph can process graphs with up to 1.5 billion edges on small clusters of several low-cost commodity PCs, whereas existing systems may require tens or even hundreds of high-end machines. Furthermore, SGraph is up to 2.3 times faster than existing systems.
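
To make the edge-centric scatter-gather idea concrete, the following is a minimal single-machine sketch in plain Python. It is not SGraph's API or implementation: the names `edge_chunks` and `connected_components`, the chunk size, and the choice of connected components as the example algorithm are all assumptions made for illustration. The sketch streams the edge list in small chunks on every iteration instead of holding all edges in memory, and keeps only the per-vertex state resident.

```python
from typing import Callable, Dict, Iterable, Iterator, List, Tuple

Edge = Tuple[int, int]


def edge_chunks(edges: Iterable[Edge], chunk_size: int) -> Iterator[List[Edge]]:
    """Yield edges in fixed-size chunks so only one chunk is held at a time."""
    chunk: List[Edge] = []
    for edge in edges:
        chunk.append(edge)
        if len(chunk) == chunk_size:
            yield chunk
            chunk = []
    if chunk:
        yield chunk


def connected_components(
    edge_source: Callable[[], Iterable[Edge]],
    num_vertices: int,
    chunk_size: int = 2,
) -> List[int]:
    """Label each vertex with the smallest vertex id in its component.

    edge_source is a hypothetical callable that returns a fresh edge
    stream on every call, standing in for re-reading edges from storage.
    """
    labels = list(range(num_vertices))  # vertex state stays in memory
    changed = True
    while changed:
        changed = False
        # Scatter: stream over edges and emit the smallest label seen
        # for each endpoint (edges are treated as undirected).
        updates: Dict[int, int] = {}
        for chunk in edge_chunks(edge_source(), chunk_size):
            for src, dst in chunk:
                for a, b in ((src, dst), (dst, src)):
                    if labels[a] < updates.get(b, labels[b]):
                        updates[b] = labels[a]
        # Gather: apply the collected updates to the vertex state.
        for vertex, label in updates.items():
            if label < labels[vertex]:
                labels[vertex] = label
                changed = True
    return labels


edges = [(0, 1), (1, 2), (3, 4)]
result = connected_components(lambda: iter(edges), num_vertices=5)
print(result)
```

In this sketch the computation is expressed per edge rather than per vertex, so the edge list can be consumed sequentially in any order. A distributed system would additionally partition the vertex state and the update stream across machines, which the sketch does not attempt.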


Keywords: Distributed computing, Graph processing, Streaming



We appreciate the reviewers' comments and the efforts of open-source contributors. This paper is supported by the National Natural Science Foundation of China-Guangdong Government Joint Funding (2nd) for Super Computer Application Research and the Hong Kong GRF 2150851.



Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  • Cheng Chen (1, 2)
  • Hejun Wu (1, 2)
  • Dyce Jing Zhao (3)
  • Da Yan (4)
  • James Cheng (4)

  1. Guangdong Province Key Laboratory of Big Data Analysis and Processing, Sun Yat-Sen University, Guangzhou, China
  2. SYSU-CMU Shunde International Joint Research Institute (JRI), Foshan, China
  3. BNU-HKBU United International College, Zhuhai, Hong Kong
  4. Department of Computer Science and Engineering, The Chinese University of Hong Kong, Shatin, Hong Kong
