The Journal of Supercomputing, Volume 69, Issue 1, pp 139–160

Characterizing and modeling cloud applications/jobs on a Google data center

  • Sheng Di
  • Derrick Kondo
  • Franck Cappello
Abstract

In this paper, we characterize and model Google applications and jobs based on a one-month trace from a large-scale Google data center. We make four contributions: (1) we compute statistics about task events and resource utilization for Google applications, across various resource types and execution types; (2) we classify applications using a K-means clustering algorithm with an optimized number of sets, based on task events and resource usage; (3) we study the correlation between Google application properties and running features (e.g., job priority and scheduling class); and (4) we build a model that can simulate Google jobs/tasks and dynamic events in accordance with the Google trace. Experiments show that the tasks simulated with our model exhibit features closely matching those in the Google trace. For more than 95 % of tasks, the simulation error is below 20 %, confirming the high accuracy of our simulation model.
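
As an illustration of contribution (2), the following is a minimal sketch of K-means clustering with a simple rule for picking the number of sets. It is not the authors' implementation: the abstract does not say how the number of sets is optimized, so the selection rule here (smallest k whose within-cluster sum of squares falls below a fraction `tol` of the k = 1 value), the farthest-first initialization, the helper names, and the toy two-dimensional feature vectors are all assumptions made for the example.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def mean(cluster):
    """Coordinate-wise mean of a non-empty list of tuples."""
    n = len(cluster)
    return tuple(sum(coords) / n for coords in zip(*cluster))

def farthest_first(points, k):
    """Deterministic initialization (an assumption, not from the paper):
    start at the first point, then repeatedly add the point farthest
    from all centers chosen so far."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centers))
        )
    return centers

def kmeans(points, k, iters=100):
    """Lloyd's algorithm; returns (centers, within-cluster sum of squares)."""
    centers = farthest_first(points, k)
    for _ in range(iters):
        # Assignment step: attach each point to its nearest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            j = min(range(k), key=lambda c: sq_dist(p, centers[c]))
            clusters[j].append(p)
        # Update step: move each center to its cluster mean
        # (an empty cluster keeps its previous center).
        new_centers = [
            mean(c) if c else centers[i] for i, c in enumerate(clusters)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    wcss = sum(min(sq_dist(p, c) for c in centers) for p in points)
    return centers, wcss

def choose_k(points, max_k, tol=0.05):
    """Hypothetical selection rule: smallest k whose WCSS is at most
    tol times the WCSS obtained with a single cluster."""
    base = kmeans(points, 1)[1]
    for k in range(1, max_k + 1):
        if kmeans(points, k)[1] <= tol * base:
            return k
    return max_k

# Toy feature vectors (made-up values, e.g. two usage metrics per application).
points = [
    (0, 0), (0, 1), (1, 0), (1, 1),
    (10, 10), (10, 11), (11, 10), (11, 11),
    (20, 0), (20, 1), (21, 0), (21, 1),
]
best_k = choose_k(points, max_k=6)
centers, wcss = kmeans(points, best_k)
```

On real trace data the features would first need to be normalized to comparable scales, and a different criterion for the number of sets may well be more appropriate.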

Keywords

Google data center · Cloud task · Characterization and analysis · Large-scale system trace

Notes

Acknowledgments

We thank Google Inc., in particular Charles Reiss and John Wilkes, for making their invaluable trace data available. This work is supported by the ANR project Clouds@home (ANR-09-JCJC-0056-01), and in part by the Advanced Scientific Computing Research Program, Office of Science, U.S. Department of Energy, under Contract DE-AC02-06CH11357, and by the INRIA-Illinois Joint Laboratory for Petascale Computing. This paper has been created by UChicago Argonne, LLC, Operator of Argonne National Laboratory (“Argonne”). Argonne, a U.S. Department of Energy Office of Science laboratory, is operated under Contract No. DE-AC02-06CH11357. The U.S. Government retains for itself, and others acting on its behalf, a paid-up nonexclusive, irrevocable worldwide license in said article to reproduce, prepare derivative works, distribute copies to the public, and perform publicly and display publicly, by or on behalf of the Government.

Copyright information

© Argonne National Laboratory; DE-AC02-06CH11357  2014

Authors and Affiliations

  1. INRIA, Paris, France
  2. Argonne National Laboratory, Lemont, USA