Supporting Data Center Management through Clustering of System Data Streams

  • Conference paper
Advanced Infocomm Technology (ICAIT 2012)

Part of the book series: Lecture Notes in Computer Science (LNCCN, volume 7593)

Abstract

Aggregating large data sets about hardware and software resources into clusters underlies several management and control operations and strategies. The high variability and noise that characterize data collected by monitoring system resources prevent the use of existing solutions, which suffer from low accuracy and poor robustness.

We present a new algorithm that extends clustering to data center management: it can find groups of related objects even when their correlation is hidden by high variability.

Our experimental evaluation, performed on both synthetic and real data, shows the accuracy and robustness of the proposed solution and its ability to cluster servers with correlated functionality.
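
The abstract does not describe the algorithm itself, so the following is only an illustrative sketch of the general idea of grouping monitored streams by correlation once high-frequency variability has been damped. It is not the authors' method. Every choice in it is an assumption made for illustration: the trailing moving-average filter, the Pearson coefficient, the 0.8 threshold, the union-find grouping, and all function and parameter names.

```python
import math
from typing import Dict, List, Sequence


def moving_average(series: Sequence[float], window: int) -> List[float]:
    """Trailing moving average; early samples average over what is available."""
    out: List[float] = []
    running = 0.0
    for i, x in enumerate(series):
        running += x
        if i >= window:
            running -= series[i - window]
        out.append(running / min(i + 1, window))
    return out


def pearson(a: Sequence[float], b: Sequence[float]) -> float:
    """Pearson correlation coefficient; returns 0.0 for a constant series."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def cluster_by_correlation(
    streams: Dict[str, Sequence[float]],
    window: int = 4,          # assumed smoothing window, in samples
    threshold: float = 0.8,   # assumed minimum correlation to link two streams
) -> List[List[str]]:
    """Group stream names whose smoothed series are strongly correlated.

    Hypothetical helper: streams linked directly or transitively by a
    correlation >= threshold end up in the same group (union-find).
    """
    names = sorted(streams)
    smoothed = {n: moving_average(streams[n], window) for n in names}
    parent = {n: n for n in names}

    def find(n: str) -> str:
        # Follow parent links to the group representative, compressing the path.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if pearson(smoothed[a], smoothed[b]) >= threshold:
                parent[find(a)] = find(b)

    groups: Dict[str, List[str]] = {}
    for n in names:
        groups.setdefault(find(n), []).append(n)
    return sorted(groups.values())
```

Calling `cluster_by_correlation` on a dictionary that maps server names to equally long lists of samples (for example CPU utilization) returns a list of name groups. Smoothing before correlating is the design choice that matters here: two streams that share a slow trend can show a weak raw correlation when each carries its own fast fluctuations, and a stronger one after filtering.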


Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Tosi, S., Casolari, S., Colajanni, M. (2013). Supporting Data Center Management through Clustering of System Data Streams. In: Guyot, V. (eds) Advanced Infocomm Technology. ICAIT 2012. Lecture Notes in Computer Science, vol 7593. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-38227-7_8

  • DOI: https://doi.org/10.1007/978-3-642-38227-7_8

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-38226-0

  • Online ISBN: 978-3-642-38227-7

  • eBook Packages: Computer Science, Computer Science (R0)
