MapReduce for Big Data Analysis: Benefits, Limitations and Extensions

  • Yang Song
  • Hongzhi Wang
  • Jianzhong Li
  • Hong Gao
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 623)


Big data has become a hot topic. MapReduce is a popular programming paradigm for big data analysis and offers many benefits. Although it is widely applied in industry, MapReduce still has limitations in some applications, and a number of extensions have been proposed to address them. In this brief communication, we discuss the benefits and limitations of the MapReduce programming paradigm, as well as the extensions that take MapReduce beyond these limitations.


Big data · MapReduce · Parallel computation · Analysis



This paper was partially supported by the National Sci-Tech Support Plan (2015BAH10F01), NSFC grants U1509216, 61472099, and 61133002, and the Scientific Research Foundation for the Returned Overseas Chinese Scholars of Heilongjiang Province (LC2016026).



Copyright information

© Springer Science+Business Media Singapore 2016

Authors and Affiliations

  • Yang Song (1)
  • Hongzhi Wang (1, corresponding author)
  • Jianzhong Li (1)
  • Hong Gao (1)
  1. Harbin Institute of Technology, Harbin, China
