Abstract

The goal of this chapter is to survey the main concepts and techniques from the vast collection of clustering models, from hierarchical to partitioning ones. Within the partitioning class, we discuss at length the clustering approaches that can be viewed as particular neural networks, with attention to the key concepts involved in applications to the social sciences. Clustering is one of the main ingredients of the new copula-based aggregation algorithm discussed in Part II.
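A partitioning method of the kind surveyed in this chapter can be illustrated with the classical k-means algorithm. The sketch below is minimal and deliberately uses a deterministic initialisation so that it is reproducible; the function and variable names are illustrative, not taken from the chapter:

```python
import numpy as np

def kmeans(X, k, n_iter=50):
    """Minimal k-means (Lloyd's algorithm) sketch.

    X: (n, d) array of observations; k: number of clusters.
    Returns (centroids, labels)."""
    # Deterministic initialisation: the first k observations.
    # (Real implementations use random or k-means++ initialisation.)
    centroids = X[:k].copy()
    for _ in range(n_iter):
        # Assignment step: attach each point to its nearest centroid.
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Update step: move each centroid to the mean of its cluster.
        new_centroids = np.array([X[labels == j].mean(axis=0) for j in range(k)])
        if np.allclose(new_centroids, centroids):
            break  # converged: the partition no longer changes
        centroids = new_centroids
    return centroids, labels

# Two well-separated groups; the first two rows seed one centroid each.
X = np.array([[0.0, 0.0], [10.0, 10.0],
              [0.2, 0.1], [9.8, 10.1],
              [0.1, 0.3], [10.2, 9.9]])
centroids, labels = kmeans(X, k=2)
```

In practice one would run the algorithm from several random initialisations and keep the partition with the smallest within-cluster sum of squares, since k-means only finds a local optimum.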


Notes

  1. Divisive methods proceed in the opposite direction to agglomerative ones: they start from a single cluster containing every observation and recursively split it.
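The agglomerative side of this contrast can be sketched as follows: start from singleton clusters and repeatedly merge the closest pair until the desired number of clusters remains. This is a minimal single-linkage sketch with illustrative names, not the chapter's notation:

```python
def single_linkage(points, n_clusters):
    """Minimal agglomerative clustering sketch with single linkage.

    Starts from singleton clusters and repeatedly merges the pair of
    clusters whose closest members are nearest, until n_clusters remain.
    Divisive methods run this process in reverse, splitting downward
    from one cluster that contains everything."""
    clusters = [[p] for p in points]

    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

    def linkage(c1, c2):
        # Single linkage: distance between the two closest members.
        return min(dist(a, b) for a in c1 for b in c2)

    while len(clusters) > n_clusters:
        # Find the pair of clusters with minimal linkage and merge it.
        i, j = min(
            ((i, j) for i in range(len(clusters))
                    for j in range(i + 1, len(clusters))),
            key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

groups = single_linkage([(0, 0), (0, 1), (5, 5), (5, 6)], n_clusters=2)
```

Other linkage rules (complete, average, Ward) differ only in how the inter-cluster distance is defined; the merge loop is the same.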

  2. This kind of model is very familiar to physicists, who represent the vigor of the shaking as a temperature, since at higher temperatures there is more molecular activity. Annealing refers to a way of tempering certain alloys of metal by heating and then gradually cooling them.
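The annealing analogy can be sketched as a generic minimiser with the Metropolis acceptance rule: worsening moves are accepted with a probability that shrinks as the temperature falls. Everything below (function names, the toy objective) is an illustrative assumption, not taken from the text:

```python
import math
import random

def anneal(energy, neighbour, x0, t0=1.0, cooling=0.95, steps=500, seed=0):
    """Minimal simulated annealing sketch.

    The temperature controls the vigor of the shaking: at high
    temperature, moves that worsen the energy are often accepted;
    as the system cools, the search settles into a minimum."""
    rng = random.Random(seed)
    x, e, t = x0, energy(x0), t0
    best, best_e = x, e
    for _ in range(steps):
        cand = neighbour(x, rng)
        ce = energy(cand)
        # Always accept improvements; accept worsenings with
        # probability exp(-delta / t) (the Metropolis criterion).
        if ce <= e or rng.random() < math.exp((e - ce) / t):
            x, e = cand, ce
            if e < best_e:
                best, best_e = x, e
        t *= cooling  # gradual cooling, as in metallurgical annealing
    return best, best_e

# Toy objective with two basins (an assumed example, not from the text).
f = lambda x: (x * x - 4) ** 2 + x
step = lambda x, rng: x + rng.uniform(-0.5, 0.5)
x_best, e_best = anneal(f, step, x0=-3.0)
```

The cooling schedule is the delicate part: cool too fast and the search freezes in a poor local minimum, cool too slowly and it wastes time shaking.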



Copyright information

© 2021 The Author(s), under exclusive license to Springer Nature Switzerland AG


Cite this chapter

Bernardi, E., Romagnoli, S. (2021). Clustering. In: Counting Statistics for Dependent Random Events. Springer, Cham. https://doi.org/10.1007/978-3-030-64250-1_1
