Clustering of High Dimensional Data Streams

Tasoulis, Sotiris K.; Tasoulis, Dimitris K.; Plagianakos, Vassilis P.

doi:10.1007/978-3-642-30448-4_28

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 7297)

Included in the following conference series:

Hellenic Conference on Artificial Intelligence

Abstract

Clustering of data streams has become a task of great interest in recent years, as such data formats are becoming increasingly ubiquitous. In many cases, these data are also high dimensional and, as a result, more complex to cluster. There is therefore a growing need for algorithms that can be applied to streaming data and at the same time cope with high dimensionality. To this end, we design a streaming clustering approach by extending a recently proposed high dimensional clustering algorithm.

Author information

Authors and Affiliations

Department of Computer Science and Biomedical Informatics, University of Central Greece, Papassiopoulou 2–4, Lamia, 35100, Greece
Sotiris K. Tasoulis & Vassilis P. Plagianakos
Winton Capital Management, 1–5 St Mary Abbot’s Place, SW8 6LS, United Kingdom
Dimitris K. Tasoulis

Editor information

Editors and Affiliations

Department of Computer Science and Biomedical Informatics, University of Central Greece, 2-4 Papassiopoulou Street, 35100, Lamia, Greece
Ilias Maglogiannis
Department of Computer Science and Biomedical Informatics, University of Central Greece, 2-4 Papassiopoulou Street, 35100, Lamia, Greece
Vassilis Plagianakos
Department of Informatics, Aristotle University of Thessaloniki, 54124, Thessaloniki, Greece
Ioannis Vlahavas

About this paper

Cite this paper

Tasoulis, S.K., Tasoulis, D.K., Plagianakos, V.P. (2012). Clustering of High Dimensional Data Streams. In: Maglogiannis, I., Plagianakos, V., Vlahavas, I. (eds) Artificial Intelligence: Theories and Applications. SETN 2012. Lecture Notes in Computer Science, vol 7297. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-30448-4_28

DOI: https://doi.org/10.1007/978-3-642-30448-4_28
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-30447-7
Online ISBN: 978-3-642-30448-4
eBook Packages: Computer Science, Computer Science (R0)
