Abstract
The goal of this chapter is to survey the main concepts and techniques from the vast collection of clustering models, from hierarchical to partitioning ones. Within the partitioning class, we discuss at length the clustering approaches that can be viewed as particular neural networks, with attention to the key concepts involved in applications to the social sciences. Clustering is one of the main ingredients of the new copula-based aggregation algorithm discussed in Part II.
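To ground the partitioning class mentioned above, the sketch below implements Lloyd's k-means iteration in plain Python. The toy data, function name, and parameters are ours for illustration and are not the chapter's notation; this is a minimal sketch, not the text's algorithm.

```python
import random

def kmeans(points, k, iters=100, seed=0):
    """Lloyd's algorithm: alternate assignment and centroid-update steps."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initialise centroids at k distinct points
    clusters = [[] for _ in range(k)]
    for _ in range(iters):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            j = min(range(k),
                    key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            clusters[j].append(p)
        # Update step: move each centroid to the mean of its cluster.
        new_centroids = []
        for j, cl in enumerate(clusters):
            if cl:
                new_centroids.append(tuple(sum(xs) / len(cl) for xs in zip(*cl)))
            else:
                new_centroids.append(centroids[j])  # leave empty clusters in place
        if new_centroids == centroids:
            break  # converged: assignments no longer change
        centroids = new_centroids
    return centroids, clusters

# Two well-separated groups in the plane.
data = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1),
        (5.0, 5.0), (5.1, 5.2), (5.2, 4.9)]
cents, parts = kmeans(data, k=2)
```

On well-separated data such as this, the iteration recovers the two groups of three points each; in general k-means is sensitive to initialisation and only finds a local optimum of the within-cluster sum of squares.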
Notes
1. Divisive methods do the opposite of the agglomerative ones.
2. This kind of model is very familiar to physicists, who represent the vigor of the shaking as a temperature: at higher temperatures there is more molecular activity. Annealing refers to a way of tempering certain metal alloys by heating and then gradually cooling them.
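The annealing idea in the note above can be sketched for clustering as a search over cluster labels: random single-point reassignments are always accepted when they lower the within-cluster sum of squared errors, and accepted with probability e^(-Δ/T) otherwise, while the temperature T is gradually cooled. The schedule, step count, and data below are illustrative assumptions, not the chapter's.

```python
import math
import random

def anneal_labels(points, k, t0=1.0, cooling=0.95, steps=2000, seed=0):
    """Search cluster labels by simulated annealing on within-cluster SSE."""
    rng = random.Random(seed)
    labels = [rng.randrange(k) for _ in points]

    def sse(lab):
        # Within-cluster sum of squared distances to each cluster's mean.
        total = 0.0
        for j in range(k):
            members = [p for p, l in zip(points, lab) if l == j]
            if not members:
                continue
            mean = tuple(sum(xs) / len(members) for xs in zip(*members))
            total += sum(sum((a - b) ** 2 for a, b in zip(p, mean))
                         for p in members)
        return total

    cost, temp = sse(labels), t0
    for _ in range(steps):
        i, new = rng.randrange(len(points)), rng.randrange(k)
        if new == labels[i]:
            continue
        cand = labels[:]
        cand[i] = new
        delta = sse(cand) - cost
        # Accept improvements always; accept uphill moves with prob e^(-delta/T).
        if delta < 0 or rng.random() < math.exp(-delta / temp):
            labels, cost = cand, cost + delta
        temp *= cooling  # gradual cooling, as in metallurgical annealing
    return labels, cost

data = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1),
        (5.0, 5.0), (5.1, 5.2), (5.2, 4.9)]
labels, cost = anneal_labels(data, k=2)
```

At high temperature the chain moves almost freely across partitions; as the temperature falls it behaves increasingly greedily, which is what lets it escape poor local optima early on and settle later.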
Copyright information
© 2021 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this chapter
Bernardi, E., Romagnoli, S. (2021). Clustering. In: Counting Statistics for Dependent Random Events. Springer, Cham. https://doi.org/10.1007/978-3-030-64250-1_1
Print ISBN: 978-3-030-64249-5
Online ISBN: 978-3-030-64250-1
eBook Packages: Mathematics and Statistics (R0)