Abstract
Unsupervised learning, the task of clustering observations in such a way that observations within a cluster are more similar to one another than to those assigned to other clusters, is one of the central tasks of data science. Its exploratory and descriptive nature makes it one of the most underused and underappreciated methods. In the present chapter we describe its core function with applied examples, explore different approaches, and discuss meaningful applications of the approach for the practicing researcher.
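As a minimal sketch of the clustering idea stated in the abstract, the Python below groups six made-up two-dimensional observations with a Lloyd-style k-means loop and then compares the mean distance between pairs in the same cluster with the mean distance between pairs in different clusters. The data, the choice of k-means, the starting centroids, and the helper names `kmeans` and `mean_pairwise_distance` are all illustrative assumptions; none of them are taken from the chapter.

```python
import math


def kmeans(points, centroids, n_iter=20):
    """Lloyd-style k-means: alternate assignment and centroid updates."""
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for j, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                mean = tuple(sum(axis) / len(members) for axis in zip(*members))
                new_centroids.append(mean)
            else:
                new_centroids.append(old)  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # converged: centroids no longer move
        centroids = new_centroids
    return labels, centroids


def mean_pairwise_distance(points, labels, same_cluster):
    """Mean distance over pairs that share (or do not share) a cluster."""
    dists = [
        math.dist(points[i], points[j])
        for i in range(len(points))
        for j in range(i + 1, len(points))
        if (labels[i] == labels[j]) == same_cluster
    ]
    return sum(dists) / len(dists)


# Hypothetical toy data: two visibly separated groups of three points.
points = [(1.0, 1.1), (0.9, 1.0), (1.2, 0.8),
          (5.0, 5.2), (5.1, 4.9), (4.8, 5.0)]

# Assumed initialisation: one starting centroid taken from each group.
labels, centroids = kmeans(points, [points[0], points[3]])

within = mean_pairwise_distance(points, labels, same_cluster=True)
between = mean_pairwise_distance(points, labels, same_cluster=False)
print(labels, within < between)
```

On real data the number of clusters and the initialisation are not known in advance, so a result like this would normally be checked across several starting points and values of k.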
Ethics declarations
Dr. Serra-Burriel reports receiving grant funding from the European Commission H2020 program and European Commission EiT Health program.
Dr. Ames reports receiving royalties from Stryker, Biomet Zimmer Spine, DePuy Synthes, NuVasive, Next Orthosurgical, K2M, and Medicrea; being a consultant to DePuy Synthes, Medtronic, Medicrea, and K2M; receiving research support from Titan Spine, DePuy Synthes, and ISSG; being on the editorial board of Operative Neurosurgery; receiving grant funding from SRS; being on the executive committee of ISSG; and being a director of Global Spine Analytics.
None in relation to the present work.
1 Electronic Supplementary Material
Supplementary Content 12.1 (R, 10 kb)
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Serra-Burriel, M., Ames, C. (2022). Machine Learning-Based Clustering Analysis: Foundational Concepts, Methods, and Applications. In: Staartjes, V.E., Regli, L., Serra, C. (eds) Machine Learning in Clinical Neuroscience. Acta Neurochirurgica Supplement, vol 134. Springer, Cham. https://doi.org/10.1007/978-3-030-85292-4_12
DOI: https://doi.org/10.1007/978-3-030-85292-4_12
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-85291-7
Online ISBN: 978-3-030-85292-4
eBook Packages: Medicine (R0)