Journal of Grid Computing, Volume 10, Issue 3, pp 447–473

Energy-Efficient Thermal-Aware Autonomic Management of Virtualized HPC Cloud Infrastructure

  • Ivan Rodero
  • Hariharasudhan Viswanathan
  • Eun Kyung Lee
  • Marc Gamell
  • Dario Pompili
  • Manish Parashar


Virtualized datacenters and clouds are increasingly being considered for traditional High-Performance Computing (HPC) workloads that have typically targeted Grids and conventional HPC platforms. However, maximizing the energy efficiency and utilization of datacenter resources and minimizing undesired thermal behavior, while ensuring application performance and other Quality of Service (QoS) guarantees for HPC applications, requires careful consideration of important and extremely challenging tradeoffs. Virtual Machine (VM) migration is one of the most common techniques used to alleviate thermal anomalies (i.e., hotspots) in cloud datacenter servers, because it reduces the load on, and hence the utilization of, the affected server. This article studies in detail the benefits, relative to VM migration, of using other techniques for thermal management, such as voltage scaling and pinning, which are traditionally used to reduce energy consumption. As no single technique is the most efficient at meeting temperature/performance optimization goals in all situations, an autonomic approach is proposed that performs energy-efficient thermal management while ensuring the QoS delivered to the users. To address the problem of VM allocation that arises during VM migrations, an innovative application-centric, energy-aware strategy for VM allocation is proposed. The strategy ensures high resource utilization and energy efficiency through VM consolidation while satisfying application QoS by exploiting knowledge obtained through application profiling along multiple dimensions (CPU, memory, and network bandwidth utilization). To support our arguments, we present the results of an experimental evaluation on real hardware using HPC workloads under different scenarios.
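
To make the idea of profile-driven consolidation concrete, the following is a minimal sketch, not the allocation strategy evaluated in the article. It assumes each VM has a profile giving its demand as a fraction of one server's capacity along the three dimensions named above, and it swaps in a generic first-fit-decreasing heuristic for placement. The names `VM`, `Server`, `consolidate`, and the `cap` parameter are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field

# Profiling dimensions named in the abstract.
DIMENSIONS = ("cpu", "mem", "net")


@dataclass
class VM:
    name: str
    demand: dict  # assumed profile: fraction of one server's capacity per dimension


@dataclass
class Server:
    name: str
    vms: list = field(default_factory=list)

    def load(self, dim):
        """Total demand already placed on this server along one dimension."""
        return sum(vm.demand[dim] for vm in self.vms)

    def fits(self, vm, cap=1.0):
        """True if adding the VM keeps every dimension at or below the cap."""
        return all(self.load(d) + vm.demand[d] <= cap for d in DIMENSIONS)


def consolidate(vms, servers, cap=1.0):
    """First-fit decreasing: place the most demanding VMs first, on the
    first server that can host them in all dimensions (hypothetical heuristic)."""
    placement = {}
    ordered = sorted(vms, key=lambda v: max(v.demand.values()), reverse=True)
    for vm in ordered:
        for server in servers:
            if server.fits(vm, cap):
                server.vms.append(vm)
                placement[vm.name] = server.name
                break
        else:
            raise RuntimeError(f"no server can host {vm.name}")
    return placement


vms = [
    VM("a", {"cpu": 0.7, "mem": 0.2, "net": 0.1}),  # CPU-bound profile
    VM("b", {"cpu": 0.2, "mem": 0.7, "net": 0.1}),  # memory-bound profile
    VM("c", {"cpu": 0.6, "mem": 0.1, "net": 0.1}),  # CPU-bound profile
]
servers = [Server("s1"), Server("s2")]
placement = consolidate(vms, servers)
```

In this example the CPU-bound VM `a` and the memory-bound VM `b` share server `s1` because their profiles are complementary, while the second CPU-bound VM `c` is placed on `s2`. Lowering `cap` below 1.0 would leave headroom on each server, one simple way to trade consolidation density against QoS.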


Keywords: Cloud infrastructure, Virtualization, Thermal management, Energy-efficiency





Copyright information

© Springer Science+Business Media B.V. 2012

Authors and Affiliations

  • Ivan Rodero (1)
  • Hariharasudhan Viswanathan (1)
  • Eun Kyung Lee (1)
  • Marc Gamell (1)
  • Dario Pompili (1)
  • Manish Parashar (1)

  1. NSF Cloud and Autonomic Computing Center, Rutgers Discovery Informatics Institute, Rutgers University, Piscataway, USA
