Cluster Computing, Volume 14, Issue 3, pp 259–274

Loosely coupled coordinated management in virtualized data centers

  • Sanjay Kumar
  • Vanish Talwar
  • Vibhore Kumar
  • Parthasarathy Ranganathan
  • Karsten Schwan


Management is an important challenge for future enterprises. Previous work has addressed platform management (e.g., power and thermal management) separately from virtualization management (e.g., virtual machine (VM) provisioning and application performance). Coordinating the actions taken by these management layers benefits performance, stability, and efficiency. Such coordination must work well with existing multi-vendor solutions and must also be extensible, so that it can support future management solutions that may operate on different sensors and actuators. In response to these requirements, this paper proposes vManage, a solution that loosely couples platform and virtualization management and facilitates coordination between them in data centers. The solution comprises registry and proxy mechanisms, which provide unified monitoring and actuation across the platform and virtualization domains, and coordinators, which execute policies for better VM placement and runtime management, including a formal approach that guards system stability against inefficient management actions. The solution is instantiated in a Xen environment through a platform-aware virtualization manager at a cluster management node and a virtualization-aware platform manager on each server. Experimental evaluations using enterprise benchmarks show that, compared to traditional solutions, vManage achieves additional power savings (10% lower power) with significantly improved service-level guarantees (71% fewer violations) and stability (54% fewer VM migrations), at low overhead.


Keywords: Management, Algorithms, Performance, Standardization, vManage, SLA, Power management





Copyright information

© Springer Science+Business Media, LLC 2010

Authors and Affiliations

  • Sanjay Kumar (1)
  • Vanish Talwar (2)
  • Vibhore Kumar (3)
  • Parthasarathy Ranganathan (2)
  • Karsten Schwan (4)

  1. Intel Labs, Hillsboro, USA
  2. HP Labs, Palo Alto, USA
  3. IBM Research, Hawthorne, USA
  4. Georgia Tech, Atlanta, USA
