IaaS Cloud Benchmarking: Approaches, Challenges, and Experience

  • Alexandru Iosup
  • Radu Prodan
  • Dick Epema


Infrastructure-as-a-Service (IaaS) cloud computing is an emerging commercial infrastructure paradigm under which clients (users) can lease resources when and for as long as needed, under a cost model that reflects the actual usage of resources by the client. For IaaS clouds to become mainstream technology, and for current cost models to become more client-friendly, benchmarking and comparing the non-functional system properties of various IaaS clouds is important, especially for cloud users. In this article we focus on the IaaS cloud-specific elements of benchmarking, from a user's perspective. We propose a generic approach for IaaS cloud benchmarking, discuss numerous challenges in developing this approach, and summarize our experience towards benchmarking IaaS clouds. We argue for an experimental approach that requires, among other things, new techniques for experiment compression, new benchmarking methods that go beyond black-box and isolated-user testing, new benchmark designs that are domain-specific, and new metrics for elasticity and variability.
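To make the notion of a variability metric concrete, the following sketch computes the coefficient of variation (sample standard deviation divided by mean) over repeated benchmark measurements, one common way to quantify how stable a cloud service's performance is. The provider names and sample values are hypothetical, for illustration only; they are not data from the article.

```python
import statistics

def coefficient_of_variation(samples):
    """Relative dispersion of repeated benchmark measurements:
    sample standard deviation divided by the mean. A higher value
    indicates a less stable (more variable) service."""
    return statistics.stdev(samples) / statistics.mean(samples)

# Hypothetical response-time samples (seconds) from repeated runs
# of the same workload on two IaaS offerings.
provider_a = [1.02, 0.98, 1.05, 1.01, 0.99]  # stable performance
provider_b = [0.80, 1.40, 0.95, 1.60, 0.75]  # highly variable

cv_a = coefficient_of_variation(provider_a)
cv_b = coefficient_of_variation(provider_b)
```

Comparing `cv_a` and `cv_b` ranks the two offerings by performance stability, independently of their absolute speed; the article argues that such variability-aware metrics should complement traditional mean-performance results.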


Keywords: System Under Test · Cloud User · Performance Isolation · IaaS Cloud · Trace Archive



This work was partially supported by the STW/NWO Veni grant @larGe (11881), EU projects PEDCA and EYE, Austrian Science Fund (FWF) project TRP 237-N23, and the ENIAC Joint Undertaking (project eRAMP).



Copyright information

© Springer Science+Business Media New York 2014

Authors and Affiliations

  1. Delft University of Technology, Delft, The Netherlands
  2. Parallel and Distributed Systems, University of Innsbruck, Innsbruck, Austria
