Multimedia Tools and Applications, Volume 78, Issue 1, pp 457–475

GMR: graph-compatible MapReduce programming model

  • Weidong Zhang
  • Boxin He
  • Yifeng Chen
  • Qifei Zhang


The MapReduce programming model is widely used to parallelize data processing on large clusters of commodity computers. However, because of its monotonous data representation, it cannot express graph-parallel algorithms naturally or execute them efficiently. Pregel and PowerGraph address these challenges, but they require users to become familiar with another set of programming patterns and platforms, and they leave legacy MapReduce code incompatible and unusable. In this paper, we propose Graph-compatible MapReduce (GMR) as an extension of Google’s Standard MapReduce (SMR). With GMR, graph-parallel algorithms can be expressed naturally without compromising efficiency or simplicity, while the conventional MapReduce programming pattern is preserved. Users also gain the convenience of “Think like a vertex”. Based on an experimental study, we analyze the ratio of redundant computation, transmission, and data caching introduced in naive iterative MapReduce platforms (e.g., HaLoop, Twister). Furthermore, we discuss the differences between GMR and graph-targeted frameworks. The evaluation results show that GMR outperforms GraphX on a series of real-world graph-parallel algorithms.
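To make the redundancy mentioned in the abstract concrete, the following is a minimal sketch, assuming PageRank as the graph-parallel algorithm and plain Python dictionaries in place of a real MapReduce runtime. It is not GMR's API or the authors' implementation; the names `map_phase`, `reduce_phase`, `pagerank_iteration`, and `count_messages` are hypothetical. It shows why a plain key-value MapReduce iteration has to re-send the unchanged adjacency lists through the shuffle on every round.

```python
from collections import defaultdict


def map_phase(vertex, record):
    """Emit rank contributions plus the unchanged adjacency list."""
    rank, neighbors = record
    # The graph structure is loop-invariant, yet it must be re-emitted so the
    # reducer can rebuild the record for the next iteration.
    yield vertex, ("graph", neighbors)
    for neighbor in neighbors:
        yield neighbor, ("rank", rank / len(neighbors))


def reduce_phase(vertex, values, damping, num_vertices):
    """Combine incoming contributions and re-attach the adjacency list."""
    neighbors = []
    total = 0.0
    for kind, payload in values:
        if kind == "graph":
            neighbors = payload
        else:
            total += payload
    rank = (1.0 - damping) / num_vertices + damping * total
    return vertex, (rank, neighbors)


def pagerank_iteration(graph, damping=0.85):
    """Run one map-shuffle-reduce round over {vertex: (rank, neighbors)}."""
    shuffled = defaultdict(list)
    for vertex, record in graph.items():
        for key, value in map_phase(vertex, record):
            shuffled[key].append(value)
    n = len(graph)
    return dict(
        reduce_phase(vertex, values, damping, n)
        for vertex, values in shuffled.items()
    )


def count_messages(graph):
    """Return (structural messages, rank messages) sent in one iteration."""
    structural = len(graph)
    rank_messages = sum(len(neighbors) for _, neighbors in graph.values())
    return structural, rank_messages


# Hypothetical three-vertex graph in which every vertex has an outgoing edge.
example = {
    "a": (1 / 3, ["b", "c"]),
    "b": (1 / 3, ["c"]),
    "c": (1 / 3, ["a"]),
}
after_one_round = pagerank_iteration(example)
```

In this sketch the structural messages carry no new information from one round to the next, which is the kind of overhead that a vertex-centric formulation can avoid by keeping the graph resident between iterations.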


Distributed systems; Parallel architectures; Graph theory; Systems programs and utilities; Performance analysis and design aids; Concurrent programming; Modes of computation; Performance of systems



Our thanks to the Institute of Process Engineering, Chinese Academy of Sciences, for their help. This research was supported by the Zhejiang Engineering Research Center of Intelligent Medicine (2016E10011) and by the project on research and application of key technologies for rapid individualized sculpture manufacture and carving stone materials appraisal.


Copyright information

© Springer Science+Business Media, LLC 2017
Corrected publication October/2017

Authors and Affiliations

  1. Peking University, Beijing, China
  2. Zhejiang University, Hangzhou, China
