Abdul-Rahman OA, Aida K (2014) Towards understanding the usage behavior of Google cloud users: the mice and elephants phenomenon. In: 2014 IEEE 6th international conference on cloud computing technology and science (CloudCom). IEEE, pp 272–277. doi:10.1109/CloudCom.2014.75
Abraham L, Allen J, Barykin O, Borkar V, Chopra B, Gerea C, Merl D, Metzler J, Reiss D, Subramanian S, Wiener JL, Zed O (2013) Scuba: diving into data at Facebook. Proc VLDB Endow 6(11):1057–1067. doi:10.14778/2536222.2536231
Armbrust M, Xin RS, Lian C, Huai Y, Liu D, Bradley JK, Meng X, Kaftan T, Franklin MJ, Ghodsi A, Zaharia M (2015) Spark SQL: relational data processing in Spark. In: Proceedings of the 2015 ACM SIGMOD international conference on management of data (SIGMOD’15). ACM, New York, pp 1383–1394. doi:10.1145/2723372.2742797
Breitgand D, Dubitzky Z, Epstein A, Feder O, Glikson A, Shapira I, Toffetti G (2014) An adaptive utilization accelerator for virtualized environments. In: 2014 IEEE international conference on cloud engineering (IC2E). IEEE, pp 165–174. doi:10.1109/IC2E.2014.63
Caglar F, Gokhale A (2014) iOverbook: intelligent resource-overbooking to support soft real-time applications in the cloud. In: Proceedings of the 2014 IEEE international conference on cloud computing (CLOUD’14). IEEE Computer Society, Washington, DC, pp 538–545. doi:10.1109/CLOUD.2014.78
Calheiros RN, Ranjan R, Beloglazov A, De Rose CAF, Buyya R (2011) CloudSim: a toolkit for modeling and simulation of cloud computing environments and evaluation of resource provisioning algorithms. Softw Pract Exp 41(1):23–50. doi:10.1002/spe.995
Chen Y, Alspaugh S, Katz RH (2012) Design insights for MapReduce from diverse production workloads. Tech. Rep. UCB/EECS-2012-17, EECS Department, University of California, Berkeley. http://www.eecs.berkeley.edu/Pubs/TechRpts/2012/EECS-2012-17.html. Accessed Dec 2015
Chen Y, Ganapathi A, Griffith R, Katz RH (2011) The case for evaluating MapReduce performance using workload suites. In: 2011 IEEE 19th annual international symposium on modelling, analysis, and simulation of computer and telecommunication systems, pp 390–399. doi:10.1109/MASCOTS.2011.12
Dean J, Ghemawat S (2010) MapReduce: a flexible data processing tool. Commun ACM 53(1):72–77. doi:10.1145/1629175.1629198
Di S, Kondo D, Cirne W (2012) Characterization and comparison of Google cloud load versus grids. In: 2012 IEEE international conference on cluster computing (CLUSTER), Beijing, pp 230–238. doi:10.1109/CLUSTER.2012.35
Di S, Kondo D, Cirne W (2012) Host load prediction in a Google compute cloud with a Bayesian model. In: Proceedings of the international conference on high performance computing, networking, storage and analysis. IEEE Computer Society Press, USA, pp 1–11. doi:10.1109/SC.2012.68
Di S, Robert Y, Vivien F, Kondo D, Wang CL, Cappello F (2013) Optimization of cloud task processing with checkpoint-restart mechanism. In: 2013 international conference for high performance computing, networking, storage and analysis (SC). IEEE, pp 1–12
Gamma E, Helm R, Johnson R, Vlissides J (1994) Design patterns: elements of reusable object-oriented software. Addison-Wesley Professional, Boston
Gibbons JD, Chakraborti S (2010) Nonparametric statistical inference. Chapman and Hall/CRC, London
Guan Q, Fu S (2013) Adaptive anomaly identification by exploring metric subspace in cloud computing infrastructures. In: Proceedings of the 2013 IEEE 32nd international symposium on reliable distributed systems (SRDS’13). IEEE Computer Society, Washington, DC, pp 205–214. doi:10.1109/SRDS.2013.29
Gupta SKS, Banerjee A, Abbasi Z, Varsamopoulos G, Jonas M, Ferguson J, Gilbert RR, Mukherjee T (2014) GDCSim: a simulator for green data center design and analysis. ACM Trans Model Comput Simul 24(1):3:1–3:27. doi:10.1145/2553083
Iglesias JO, Murphy L, De Cauwer M, Mehta D, O’Sullivan B (2014) A methodology for online consolidation of tasks through more accurate resource estimations. In: Proceedings of the 2014 IEEE/ACM 7th international conference on utility and cloud computing (UCC’14). IEEE Computer Society, Washington, DC, pp 89–98. doi:10.1109/UCC.2014.17
Isard M (2007) Autopilot: automatic data center management. SIGOPS Oper Syst Rev 41(2):60–67. doi:10.1145/1243418.1243426
Javadi B, Kondo D, Iosup A, Epema D (2013) The failure trace archive: enabling the comparison of failure measurements and models of distributed systems. J Parallel Distrib Comput 73(8):1208–1223. doi:10.1016/j.jpdc.2013.04.002
Kavulya S, Tan J, Gandhi R, Narasimhan P (2010) An analysis of traces from a production MapReduce cluster. In: Proceedings of the 2010 10th IEEE/ACM international conference on cluster, cloud and grid computing (CCGRID’10). IEEE Computer Society, Washington, DC, pp 94–103. doi:10.1109/CCGRID.2010.112
Kornacker M, Behm A, Bittorf V, Bobrovytsky T, Ching C, Choi A, Erickson J, Grund M, Hecht D, Jacobs M, Joshi I, Kuff L, Kumar D, Leblang A, Li N, Pandis I, Robinson H, Rorke D, Rus S, Russell J, Tsirogiannis D, Wanderman-Milne S, Yoder M (2015) Impala: a modern, open-source SQL engine for Hadoop. In: CIDR 2015, seventh biennial conference on innovative data systems research, Asilomar
Liu Z, Cho S (2012) Characterizing machines and workloads on a Google cluster. In: 2012 41st international conference on parallel processing workshops (ICPPW), pp 397–403. doi:10.1109/ICPPW.2012.57
Mishra AK, Hellerstein JL, Cirne W, Das CR (2010) Towards characterizing cloud backend workloads: insights from Google compute clusters. SIGMETRICS Perform Eval Rev 37(4):34–41. doi:10.1145/1773394.1773400
Olston C, Reed B, Srivastava U, Kumar R, Tomkins A (2008) Pig Latin: a not-so-foreign language for data processing. In: Proceedings of the 2008 ACM SIGMOD international conference on management of data (SIGMOD’08). ACM, New York, pp 1099–1110. doi:10.1145/1376616.1376726
R Development Core Team (2008) R: a language and environment for statistical computing. R Foundation for Statistical Computing, Vienna. http://www.R-project.org. Accessed Dec 2015
Reiss C, Tumanov A, Ganger GR, Katz RH, Kozuch MA (2012) Heterogeneity and dynamicity of clouds at scale: Google trace analysis. In: Proceedings of the third ACM symposium on cloud computing (SoCC’12). ACM, New York, pp 7:1–7:13. doi:10.1145/2391229.2391236
Reiss C, Wilkes J, Hellerstein JL (2011) Google cluster-usage traces: format + schema. Technical report, Google Inc., Mountain View. http://code.google.com/p/googleclusterdata/wiki/TraceVersion2. Accessed 20 March 2012
Salfner F, Lenk M, Malek M (2010) A survey of online failure prediction methods. ACM Comput Surv 42(3):10:1–10:42. doi:10.1145/1670679.1670680
Schwarzkopf M, Konwinski A, Abd-El-Malek M, Wilkes J (2013) Omega: flexible, scalable schedulers for large compute clusters. In: Proceedings of the 8th ACM European conference on computer systems (EuroSys’13). ACM, New York, pp 351–364. doi:10.1145/2465351.2465386
Sharma B, Chudnovsky V, Hellerstein JL, Rifaat R, Das CR (2011) Modeling and synthesizing task placement constraints in Google compute clusters. In: Proceedings of the 2nd ACM symposium on cloud computing (SOCC’11). ACM, New York, pp 3:1–3:14. doi:10.1145/2038916.2038919
Shvachko K, Kuang H, Radia S, Chansler R (2010) The Hadoop distributed file system. In: Proceedings of the 2010 IEEE 26th symposium on mass storage systems and technologies (MSST’10). IEEE Computer Society, USA, pp 1–10. doi:10.1109/MSST.2010.5496972
Sîrbu A, Babaoglu O (2015) Towards data-driven autonomics in data centers. In: IEEE international conference on cloud and autonomic computing (ICCAC). IEEE
Thusoo A, Sarma JS, Jain N, Shao Z, Chakka P, Zhang N, Anthony S, Liu H, Murthy R (2010) Hive—a petabyte scale data warehouse using Hadoop. In: Proceedings of the 26th international conference on data engineering (ICDE), Long Beach, pp 996–1005. doi:10.1109/ICDE.2010.5447738
Thusoo A, Shao Z, Anthony S, Borthakur D, Jain N, Sen Sarma J, Murthy R, Liu H (2010) Data warehousing and analytics infrastructure at Facebook. In: Proceedings of the 2010 ACM SIGMOD international conference on management of data (SIGMOD’10). ACM, New York, pp 1013–1020. doi:10.1145/1807167.1807278
Varga A et al (2001) The OMNeT++ discrete event simulation system. In: Proceedings of the European simulation multiconference (ESM’01), Prague
Verma A, Pedrosa L, Korupolu M, Oppenheimer D, Tune E, Wilkes J (2015) Large-scale cluster management at Google with Borg. In: Proceedings of the tenth European conference on computer systems (EuroSys’15). ACM, New York, pp 18:1–18:17. doi:10.1145/2741948.2741964
Wang G, Butt AR, Monti H, Gupta K (2011) Towards synthesizing realistic workload traces for studying the Hadoop ecosystem. In: Proceedings of the 2011 IEEE 19th annual international symposium on modelling, analysis, and simulation of computer and telecommunication systems (MASCOTS’11). IEEE Computer Society, Washington, DC, pp 400–408. doi:10.1109/MASCOTS.2011.59
Wang G, Butt AR, Pandey P, Gupta K (2009) A simulation approach to evaluating design decisions in MapReduce setups. In: IEEE international symposium on modeling, analysis and simulation of computer and telecommunication systems (MASCOTS’09), pp 1–11. doi:10.1109/MASCOT.2009.5366973
Wilkes J (2011) More Google cluster data. Google research blog. http://googleresearch.blogspot.com/2011/11/more-google-cluster-data.html. Accessed Dec 2015
Wolski R, Brevik J (2014) Using parametric models to represent private cloud workloads. IEEE Trans Serv Comput 7(4):714–725. doi:10.1109/TSC.2013.48
Zhang Q, Hellerstein JL, Boutaba R (2011) Characterizing task usage shapes in Google’s compute clusters. In: Proceedings of the 5th international workshop on large scale distributed systems and middleware
Zhang Q, Zhani MF, Boutaba R, Hellerstein JL (2014) Dynamic heterogeneity-aware resource provisioning in the cloud. IEEE Trans Cloud Comput 2(1):14–28. doi:10.1109/TCC.2014.2306427
Zhang X, Tune E, Hagmann R, Jnagal R, Gokhale V, Wilkes J (2013) CPI2: CPU performance isolation for shared compute clusters. In: Proceedings of the 8th ACM European conference on computer systems (EuroSys’13). ACM, New York, pp 379–391. doi:10.1145/2465351.2465388
Zhao W, Peng Y, Xie F, Dai Z (2012) Modeling and simulation of cloud computing: a review. In: 2012 IEEE Asia Pacific cloud computing congress (APCloudCC), pp 20–24. doi:10.1109/APCloudCC.2012.6486505