Abstract

The goal of this chapter is to survey the main concepts and techniques from the vast collection of clustering models, from hierarchical to partitioning ones. Within the partitioning class, we discuss at length the clustering approaches that can be viewed as particular neural networks, with attention to the key concepts involved in applications to the social sciences. Clustering is one of the main ingredients of the new copula-based aggregation algorithm discussed in Part II.
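A partitioning method of the kind surveyed in this chapter can be illustrated with the classical k-means algorithm. The sketch below is minimal and deliberately uses a deterministic initialisation so that it is reproducible; the function and variable names are illustrative, not taken from the chapter:

```python
import numpy as np

def kmeans(X, k, n_iter=50):
    """Minimal k-means (Lloyd's algorithm) sketch.

    X: (n, d) array of observations; k: number of clusters.
    Returns (centroids, labels)."""
    # Deterministic initialisation: the first k observations.
    # (Real implementations use random or k-means++ initialisation.)
    centroids = X[:k].copy()
    for _ in range(n_iter):
        # Assignment step: attach each point to its nearest centroid.
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Update step: move each centroid to the mean of its cluster.
        new_centroids = np.array([X[labels == j].mean(axis=0) for j in range(k)])
        if np.allclose(new_centroids, centroids):
            break  # converged: the partition no longer changes
        centroids = new_centroids
    return centroids, labels

# Two well-separated groups; the first two rows seed one centroid each.
X = np.array([[0.0, 0.0], [10.0, 10.0],
              [0.2, 0.1], [9.8, 10.1],
              [0.1, 0.3], [10.2, 9.9]])
centroids, labels = kmeans(X, k=2)
```

In practice one would run the algorithm from several random initialisations and keep the partition with the smallest within-cluster sum of squares, since k-means only finds a local optimum.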


Notes

  1. Divisive methods proceed in the opposite direction to agglomerative ones: they start from a single cluster containing every observation and recursively split it.
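The agglomerative side of this contrast can be sketched as follows: start from singleton clusters and repeatedly merge the closest pair until the desired number of clusters remains. This is a minimal single-linkage sketch with illustrative names, not the chapter's notation:

```python
def single_linkage(points, n_clusters):
    """Minimal agglomerative clustering sketch with single linkage.

    Starts from singleton clusters and repeatedly merges the pair of
    clusters whose closest members are nearest, until n_clusters remain.
    Divisive methods run this process in reverse, splitting downward
    from one cluster that contains everything."""
    clusters = [[p] for p in points]

    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

    def linkage(c1, c2):
        # Single linkage: distance between the two closest members.
        return min(dist(a, b) for a in c1 for b in c2)

    while len(clusters) > n_clusters:
        # Find the pair of clusters with minimal linkage and merge it.
        i, j = min(
            ((i, j) for i in range(len(clusters))
                    for j in range(i + 1, len(clusters))),
            key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

groups = single_linkage([(0, 0), (0, 1), (5, 5), (5, 6)], n_clusters=2)
```

Other linkage rules (complete, average, Ward) differ only in how the inter-cluster distance is defined; the merge loop is the same.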

  2. This kind of model is very familiar to physicists, who represent the vigor of the shaking as a temperature, since at higher temperatures there is more molecular activity. Annealing refers to a way of tempering certain alloys of metal by heating and then gradually cooling them.
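The annealing analogy can be sketched as a generic minimiser with the Metropolis acceptance rule: worsening moves are accepted with a probability that shrinks as the temperature falls. Everything below (function names, the toy objective) is an illustrative assumption, not taken from the text:

```python
import math
import random

def anneal(energy, neighbour, x0, t0=1.0, cooling=0.95, steps=500, seed=0):
    """Minimal simulated annealing sketch.

    The temperature controls the vigor of the shaking: at high
    temperature, moves that worsen the energy are often accepted;
    as the system cools, the search settles into a minimum."""
    rng = random.Random(seed)
    x, e, t = x0, energy(x0), t0
    best, best_e = x, e
    for _ in range(steps):
        cand = neighbour(x, rng)
        ce = energy(cand)
        # Always accept improvements; accept worsenings with
        # probability exp(-delta / t) (the Metropolis criterion).
        if ce <= e or rng.random() < math.exp((e - ce) / t):
            x, e = cand, ce
            if e < best_e:
                best, best_e = x, e
        t *= cooling  # gradual cooling, as in metallurgical annealing
    return best, best_e

# Toy objective with two basins (an assumed example, not from the text).
f = lambda x: (x * x - 4) ** 2 + x
step = lambda x, rng: x + rng.uniform(-0.5, 0.5)
x_best, e_best = anneal(f, step, x0=-3.0)
```

The cooling schedule is the delicate part: cool too fast and the search freezes in a poor local minimum, cool too slowly and it wastes time shaking.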



Copyright information

© 2021 The Author(s), under exclusive license to Springer Nature Switzerland AG


Cite this chapter

Bernardi, E., Romagnoli, S. (2021). Clustering. In: Counting Statistics for Dependent Random Events. Springer, Cham. https://doi.org/10.1007/978-3-030-64250-1_1
