Abstract
Clustering groups similar observations into smaller groups within a larger population. The resulting groups should be homogeneous: each member of a cluster should have more in common with members of the same cluster than with members of other clusters. Cluster analysis is an exploratory data analysis tool that sorts objects into groups so as to maximize the degree of association between objects in the same cluster and minimize it between objects in different clusters. This chapter explains several clustering techniques, using a simple example to illustrate each method. Finally, clustering is applied to a subset of the automobile insurance data set.
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this chapter
Cite this chapter
McCarthy, R.V., McCarthy, M.M., Ceccucci, W. (2022). Finding Associations in Data Through Cluster Analysis. In: Applying Predictive Analytics. Springer, Cham. https://doi.org/10.1007/978-3-030-83070-0_8
DOI: https://doi.org/10.1007/978-3-030-83070-0_8
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-83069-4
Online ISBN: 978-3-030-83070-0
eBook Packages: Engineering, Engineering (R0)