The Journal of Supercomputing, Volume 72, Issue 8, pp 3136–3155

Formal performance evaluation of the Map/Reduce framework within cloud computing

  • M. Carmen Ruiz
  • Diego Cazorla
  • Diego Pérez
  • Javier Conejero


The recent appearance, evolution and massive expansion of social media-based technologies, in conjunction with what is currently known as the Internet of Things, has led to data being produced at a vertiginous rate. One of the main contributions to addressing this problem has been the Hadoop framework (which implements the Map/Reduce paradigm), especially when used in conjunction with Cloud computing environments. In this paper, a comprehensive and rigorous study of the Map/Reduce framework using formal methods is presented. Specifically, the Timed Process Algebra BTC is used, and the resulting formal model is evaluated with a real Hadoop-based application that processes social media data. Moreover, the formal model is validated by carrying out several experiments on a real private Cloud environment. Finally, the formal model outcomes are harnessed to determine the best performance–cost agreement in a real scenario. Results show that the proposed model makes it possible to determine in advance both the performance of a Hadoop-based application within Cloud environments and the best performance–cost agreement.


Big data, Performance analysis, Process algebra, Map/Reduce, Hadoop



Copyright information

© Springer Science+Business Media New York 2015

Authors and Affiliations

  • M. Carmen Ruiz (1), corresponding author
  • Diego Cazorla (1)
  • Diego Pérez (2)
  • Javier Conejero (2)

  1. Universidad de Castilla-La Mancha, Albacete, Spain
  2. Instituto de Investigación en Informática de Albacete, Albacete, Spain
