Data Mining for Modeling Chiller Systems in Data Centers

  • Debprakash Patnaik
  • Manish Marwah
  • Ratnesh K. Sharma
  • Naren Ramakrishnan
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6065)


We present a data mining approach to modeling the cooling infrastructure in data centers, particularly the chiller ensemble. These infrastructures are poorly understood because “first principles” models of chiller systems are lacking. At the same time, they abound in data thanks to instrumentation by modern sensor networks. We present a multi-level framework to transduce sensor streams into an actionable dynamic Bayesian network model of the system. This network is then used to explain observed system transitions and to aid in diagnostics and prediction. We showcase experimental results from an HP data center in Bangalore, India.


Data Center, Bayesian Network, Centrifugal Compressor, Dynamic Bayesian Network, Utilization Variable
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Debprakash Patnaik (1)
  • Manish Marwah (2)
  • Ratnesh K. Sharma (2)
  • Naren Ramakrishnan (1)
  1. Virginia Tech, Blacksburg, USA
  2. HP Labs, Palo Alto, USA
