The Journal of Supercomputing, Volume 73, Issue 5, pp 2258–2283

Performance modeling of big data applications in the cloud centers

  • Chao Shen
  • Weiqin Tong
  • Jenq-Neng Hwang
  • Qiang Gao


Cloud computing has evolved into an efficient paradigm for processing big data applications. Evaluating the performance of a cloud center is a necessary prerequisite for guaranteeing quality of service. However, effectively analyzing the performance of a cloud service is a challenging task because of the complexity of cloud resources and the diversity of big data applications. In this paper, we leverage queuing theory and probability and statistics to propose a performance evaluation model for a cloud center under big data application arrivals. In this model, tasks (i.e., big data applications) arrive according to a Poisson process, each task is divided into many parallel subtasks, and the number of subtasks per task follows a general distribution. The model allows the calculation of important performance indicators such as the mean number of subtasks in the system, the probability that a task obtains immediate service, the task waiting time, and the blocking probability. The model can also be used to predict the time cost of running an application. Finally, we use simulations and benchmarks running the WordCount and TeraSort applications on a Hadoop platform to demonstrate the utility of the model.
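The kind of system the abstract describes can be illustrated with a small simulation. The sketch below is not the authors' analytical model: it is a minimal event-driven simulation under assumptions the abstract does not state. Those assumptions are exponential subtask service times, `m` identical servers, a waiting buffer of `r` subtasks, rejection of a whole task when its subtasks do not all fit, and "immediate service" meaning that every subtask of a task finds an idle server on arrival. The function name `simulate_batch_queue` and all parameter names are hypothetical.

```python
import random


def simulate_batch_queue(lam, mu, m, r, batch_sizes, n_tasks=100_000, seed=0):
    """Simulate task (batch) arrivals at a cloud center; illustrative assumptions only.

    lam         -- Poisson arrival rate of tasks
    mu          -- service rate of one subtask (exponential service is ASSUMED)
    m           -- number of servers
    r           -- waiting-buffer capacity, counted in subtasks (ASSUMED)
    batch_sizes -- list of possible subtask counts, sampled uniformly (ASSUMED)
    """
    rng = random.Random(seed)
    n = 0          # subtasks currently in the system (in service + waiting)
    clock = 0.0    # simulated time
    area = 0.0     # time integral of n, used for the time-average
    arrived = blocked = immediate = 0

    while arrived < n_tasks:
        busy = min(n, m)                  # servers currently working
        rate = lam + busy * mu            # total rate of the next event
        dt = rng.expovariate(rate)        # time until the next event
        area += n * dt
        clock += dt

        if rng.random() < lam / rate:     # next event is a task arrival
            arrived += 1
            k = rng.choice(batch_sizes)   # number of subtasks in this task
            if n + k > m + r:
                blocked += 1              # whole task rejected (ASSUMED policy)
            else:
                if n + k <= m:
                    immediate += 1        # every subtask starts at once
                n += k
        else:                             # next event is a subtask completion
            n -= 1

    return {
        "blocking_probability": blocked / arrived,
        "immediate_service_probability": immediate / arrived,
        "mean_subtasks_in_system": area / clock,
    }
```

Calling, for example, `simulate_batch_queue(lam=1.0, mu=0.5, m=20, r=10, batch_sizes=[1, 2, 4, 8])` returns estimates of three of the indicators named in the abstract. Such estimates are only a sanity check against an analytical model, since the service-time distribution and admission policy here are placeholders.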


Cloud computing; Big data; Performance modeling; Embedded Markov chain; Response time



This work is supported by the Innovation Action Plan of the Science and Technology Commission of Shanghai Municipality (15DZ1100305).



Copyright information

© Springer Science+Business Media New York 2017

Authors and Affiliations

  • Chao Shen (1)
  • Weiqin Tong (1)
  • Jenq-Neng Hwang (2)
  • Qiang Gao (1)

  1. School of Computer Engineering and Science, Shanghai University, Shanghai, China
  2. Department of Electrical Engineering, University of Washington, Seattle, USA
