M-Flash: Fast Billion-Scale Graph Computation Using a Bimodal Block Processing Model

  • Hugo Gualdron
  • Robson Cordeiro
  • Jose Rodrigues Jr.
  • Duen Horng (Polo) Chau
  • Minsuk Kahng
  • U. Kang
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9852)


Recent graph computation approaches have demonstrated that a single PC can process billion-scale graphs efficiently. While these approaches achieve scalability by optimizing I/O operations, they do not fully exploit the capabilities of modern hard drives and processors. To surpass their performance, we introduce Bimodal Block Processing (BBP), a strategy that speeds up graph computation by reducing the I/O cost even further. With this strategy, we make the following contributions: (1) M-Flash, the fastest graph computation framework to date; (2) a flexible and simple programming model for implementing popular and essential graph algorithms, including the first single-machine billion-scale eigensolver; and (3) extensive experiments on real graphs with up to 6.6 billion edges, demonstrating M-Flash’s consistent and significant speedup. The software related to this paper is available at


Graph algorithms, Graph processing, Graph mining, Complex networks



This work received support from the Brazilian agencies CNPq (grant 444985/2014-0), Fapesp (grants 2016/02557-0, 2014/21483-2), and Capes; from the USA agencies NSF (grants IIS-1563816, TWC-1526254, IIS-1217559) and GRFP (grant DGE-1148903); and from the Korean (MSIP) agency IITP (grant R0190-15-2012).



Copyright information

© Springer International Publishing AG 2016

Authors and Affiliations

  1. University of Sao Paulo, Sao Carlos, Brazil
  2. Georgia Institute of Technology, Atlanta, USA
  3. Seoul National University, Seoul, Republic of Korea
