On Characterizing the Performance of Distributed Graph Computation Platforms

  • Ahmed Barnawi
  • Omar Batarfi
  • Seyed-Mehdi-Reza Behteshi
  • Radwa Elshawi
  • Ayman Fayoumi
  • Reza Nouri
  • Sherif Sakr
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8904)


Graphs are widely used to model complex data in application domains such as social networks, protein networks, transportation networks, bibliographical networks, knowledge bases and many more. Graphs with millions or even billions of nodes and edges have become common. Designing scalable systems for processing and analyzing large-scale graphs has therefore become one of the most timely problems facing the big data research community. In practice, distributed processing of large-scale graphs is challenging because of their size, their inherently irregular structure, and the iterative nature of graph processing algorithms. In recent years, several distributed graph processing systems have been presented to tackle this challenge, most notably Pregel and GraphLab. Both systems use a vertex-centric computation model, in which the user writes a program that is executed locally for each vertex in parallel. In this paper, we analyze the performance characteristics of distributed graph processing systems and provide an experimental comparison of the performance of two popular systems in this area.
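To make the vertex-centric model mentioned above more concrete, the following is a minimal single-machine sketch, assuming a synchronous, superstep-based execution in the style of the bulk synchronous parallel model. It is not the API of Pregel or GraphLab, and it is not taken from the paper: the names `run_supersteps` and `propagate_max` and the message-passing conventions are hypothetical and chosen only for illustration. The user-supplied function sees one vertex at a time (its value and its incoming messages), which is what would allow a distributed system to run it for many vertices in parallel.

```python
from collections import defaultdict


def run_supersteps(graph, values, compute, max_supersteps=30):
    """Run a synchronous vertex-centric computation (illustrative sketch only).

    graph:   dict mapping vertex id -> list of out-neighbour ids
    values:  dict mapping vertex id -> initial vertex value
    compute: function (vertex, value, messages, superstep)
             -> (new_value, message_or_None)
    """
    values = dict(values)
    inbox = {v: [] for v in graph}
    for superstep in range(max_supersteps):
        outbox = defaultdict(list)
        sent = False
        # Each vertex only reads its own value and inbox, so a real system
        # could execute the body of this loop in parallel across workers.
        for v in graph:
            new_value, message = compute(v, values[v], inbox[v], superstep)
            values[v] = new_value
            if message is not None:
                for neighbour in graph[v]:
                    outbox[neighbour].append(message)
                    sent = True
        if not sent:
            break  # no messages in flight: the computation has converged
        # Messages sent in this superstep are delivered in the next one.
        inbox = {v: outbox.get(v, []) for v in graph}
    return values


def propagate_max(vertex, value, messages, superstep):
    """Vertex program: every vertex converges to the largest value it can hear of."""
    best = max([value] + messages)
    if superstep == 0 or best > value:
        return best, best  # announce the (new) value to out-neighbours
    return value, None     # nothing changed: stay silent


if __name__ == "__main__":
    graph = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
    initial = {"a": 3, "b": 6, "c": 1}
    print(run_supersteps(graph, initial, propagate_max))
```

In this sketch the vertex program never touches global state; all communication goes through messages delivered at superstep boundaries. Asynchronous execution, graph partitioning, and fault tolerance, which real systems must handle, are deliberately left out.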





This work was supported by King Abdulaziz City for Science and Technology (KACST) project 11-INF1990-03.



Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Ahmed Barnawi (1)
  • Omar Batarfi (1)
  • Seyed-Mehdi-Reza Behteshi (2)
  • Radwa Elshawi (3)
  • Ayman Fayoumi (1)
  • Reza Nouri (2)
  • Sherif Sakr (2, 4)

  1. King Abdulaziz University, Jeddah, Saudi Arabia
  2. University of New South Wales, Sydney, Australia
  3. Princess Nourah Bint Abdulrahman University, Riyadh, Saudi Arabia
  4. King Saud Bin Abdulaziz University for Health Sciences, Riyadh, Saudi Arabia
