Clustering of High Dimensional Data Streams

  • Conference paper
Artificial Intelligence: Theories and Applications (SETN 2012)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 7297)

Abstract

Clustering of data streams has become a task of great interest in recent years, as such data formats are becoming increasingly ubiquitous. In many cases, these data are also high dimensional and, as a result, more complex to cluster. There is therefore a growing need for algorithms that can be applied to streaming data and, at the same time, cope with high dimensionality. To this end, we design a streaming clustering approach by extending a recently proposed high dimensional clustering algorithm.

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Tasoulis, S.K., Tasoulis, D.K., Plagianakos, V.P. (2012). Clustering of High Dimensional Data Streams. In: Maglogiannis, I., Plagianakos, V., Vlahavas, I. (eds) Artificial Intelligence: Theories and Applications. SETN 2012. Lecture Notes in Computer Science, vol 7297. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-30448-4_28

  • DOI: https://doi.org/10.1007/978-3-642-30448-4_28

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-30447-7

  • Online ISBN: 978-3-642-30448-4

  • eBook Packages: Computer Science, Computer Science (R0)
