Unsupervised machine learning methods and emerging applications in healthcare

  • REVIEW
  • Knee Surgery, Sports Traumatology, Arthroscopy

Abstract

Unsupervised machine learning methods are important analytical tools that facilitate the analysis and interpretation of high-dimensional data. They identify latent patterns and hidden structures in such data and can help simplify complex datasets. This article provides an overview of key unsupervised machine learning techniques, including K-means clustering, hierarchical clustering, principal component analysis, and factor analysis. With a deeper understanding of these analytical tools, researchers can incorporate unsupervised machine learning methods into health sciences research to identify novel risk factors, improve prevention strategies, and facilitate delivery of personalized therapies and targeted patient care.

Level of evidence: I
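
As an illustration of the first technique named in the abstract, K-means clustering can be sketched in a few lines of standard-library Python. This is a minimal sketch and not the authors' implementation: the function name `kmeans`, the parameters `n_iter` and `seed`, and the six two-dimensional observations are all hypothetical, chosen only to show the alternation between assigning observations to the nearest centroid and recomputing each centroid as the mean of its members.

```python
import math
import random


def kmeans(points, k, n_iter=100, seed=0):
    """Basic K-means (Lloyd's algorithm) on a list of numeric tuples."""
    rng = random.Random(seed)
    # Initialise centroids from k distinct observations (one common choice).
    centroids = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: each observation joins its nearest centroid.
        labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                mean = tuple(sum(c) / len(members) for c in zip(*members))
                new_centroids.append(mean)
            else:
                # An empty cluster keeps its previous centroid.
                new_centroids.append(centroids[j])
        if new_centroids == centroids:
            break  # centroids stopped moving, so assignments are stable
        centroids = new_centroids
    return labels, centroids


# Hypothetical data: two well-separated groups of three observations each.
points = [(1.0, 1.1), (0.9, 1.0), (1.2, 0.8),
          (5.0, 5.2), (5.1, 4.9), (4.8, 5.0)]
labels, centroids = kmeans(points, k=2)
print(labels)
print(centroids)
```

On data this cleanly separated, the first three observations should share one label and the last three the other. On real clinical data the result can depend on the initial centroids and on the choice of `k`, so several initialisations are usually compared.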


Funding

There is no funding source.

Author information

Corresponding author

Correspondence to Ayoosh Pareek.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

About this article

Cite this article

Eckhardt, C.M., Madjarova, S.J., Williams, R.J. et al. Unsupervised machine learning methods and emerging applications in healthcare. Knee Surg Sports Traumatol Arthrosc 31, 376–381 (2023). https://doi.org/10.1007/s00167-022-07233-7

  • DOI: https://doi.org/10.1007/s00167-022-07233-7
