Abstract
Researchers who apply cluster analysis to large data sets (more than 100,000 objects) often find that the computational complexity of the algorithms makes the analysis impossible to complete in an acceptable time. A common solution is to use less computationally complex algorithms (such as k-means), which in many cases give much worse results than, for example, algorithms based on eigenvalue decomposition. In this article, a new algorithm from the spectral clustering family is proposed and compared with other approaches.
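The abstract does not spell out the proposed algorithm, so the following is only a minimal sketch of one generic way to combine the two families it contrasts, and it should not be read as the chapter's method. It assumes a two-step scheme: k-means first compresses the data to a small set of representatives, spectral clustering in the style of Ng, Jordan and Weiss (2002) is then run on those representatives only, and every object inherits the label of its nearest representative. The function names (`sqdist`, `kmeans`, `spectral`, `two_step`), the parameters `n_reps` and `sigma`, their default values, and the two-ring test data are all illustrative assumptions.

```python
import numpy as np

def sqdist(X, C):
    """Pairwise squared Euclidean distances, shape (len(X), len(C))."""
    d2 = (X ** 2).sum(1)[:, None] - 2.0 * X @ C.T + (C ** 2).sum(1)[None, :]
    return np.maximum(d2, 0.0)  # clip tiny negatives caused by rounding

def kmeans(X, k, n_iter=30):
    """Lloyd's k-means with deterministic farthest-first initialisation."""
    idx = [0]
    dmin = sqdist(X, X[:1])[:, 0]
    for _ in range(k - 1):
        idx.append(int(dmin.argmax()))  # point farthest from all chosen centers
        dmin = np.minimum(dmin, sqdist(X, X[idx[-1]:idx[-1] + 1])[:, 0])
    centers = X[idx].copy()
    for _ in range(n_iter):
        labels = sqdist(X, centers).argmin(axis=1)
        for j in range(k):
            members = X[labels == j]
            if len(members):  # an emptied cluster keeps its previous center
                centers[j] = members.mean(axis=0)
    return centers, sqdist(X, centers).argmin(axis=1)

def spectral(P, k, sigma):
    """Normalised spectral clustering of the rows of P (small input only)."""
    A = np.exp(-sqdist(P, P) / (2.0 * sigma ** 2))  # Gaussian affinity
    np.fill_diagonal(A, 0.0)
    d = 1.0 / np.sqrt(A.sum(axis=1))
    L = d[:, None] * A * d[None, :]                 # D^(-1/2) A D^(-1/2)
    _, vecs = np.linalg.eigh(L)                     # eigenvalues ascending
    Y = vecs[:, -k:]                                # k leading eigenvectors
    Y = Y / np.linalg.norm(Y, axis=1, keepdims=True)
    return kmeans(Y, k)[1]

def two_step(X, k, n_reps=100, sigma=0.5):
    """Step 1: k-means representatives. Step 2: spectral clustering on them."""
    reps, nearest = kmeans(X, n_reps)
    return spectral(reps, k, sigma)[nearest]

# Illustrative data: two concentric rings, a shape plain k-means cannot split.
rng = np.random.default_rng(0)
n = 4000
truth = rng.integers(0, 2, size=n)
radius = np.where(truth == 0, 1.0, 5.0) + rng.normal(0.0, 0.05, size=n)
theta = rng.uniform(0.0, 2.0 * np.pi, size=n)
X = np.column_stack([radius * np.cos(theta), radius * np.sin(theta)])

labels = two_step(X, k=2)
# Cluster ids are arbitrary, so score against both labelings of the rings.
agreement = max(np.mean(labels == truth), np.mean(labels != truth))
print(f"agreement with the true rings: {agreement:.3f}")
```

In this sketch the eigendecomposition is performed on an `n_reps` by `n_reps` matrix rather than an `n` by `n` one, which is where the saving for large data sets would come from. The kernel width `sigma` and the number of representatives would need tuning for real data.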
Copyright information
© 2021 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Dudek, A. (2021). Evaluation of Two-Step Spectral Clustering Algorithm for Large Untypical Data Sets. In: Jajuga, K., Najman, K., Walesiak, M. (eds) Data Analysis and Classification. SKAD 2020. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Cham. https://doi.org/10.1007/978-3-030-75190-6_1
DOI: https://doi.org/10.1007/978-3-030-75190-6_1
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-75189-0
Online ISBN: 978-3-030-75190-6
eBook Packages: Mathematics and Statistics (R0)