Clinical fracture risk evaluated by hierarchical agglomerative clustering
Clustering analysis can identify subgroups of patients based on similarities of traits. From data on 10,775 subjects, we document nine patient clusters with different fracture risks. Differences emerged after age 60, and treatment compliance differed by hip and lumbar spine bone mineral density profiles.
The purpose of this study was to establish and quantify patient clusters of high, average and low fracture risk using an unsupervised machine learning algorithm.
Regional and national Danish patient data on dual-energy X-ray absorptiometry (DXA) scans, medication reimbursement, primary healthcare sector use and comorbidity of female subjects were combined. Standardized variable means, Euclidean distances and Ward’s D2 method of hierarchical agglomerative clustering (HAC) were used to form the clustering object. The number of clusters (k) was selected so that the smallest cluster contained fewer than 250 subjects. Clusters were identified as high, average or low fracture risk based on bone mineral density (BMD) characteristics. Cluster-based descriptive statistics and relative Z-scores for variable means were computed.
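The clustering steps described above can be illustrated with a minimal sketch. The study itself was carried out in R; the Python/SciPy code below is not the authors' code. It assumes synthetic stand-in data, hypothetical helper names (`standardize`, `smallest_k_with_small_cluster`, `cluster_mean_z`), and one possible reading of the cluster-size rule (increase k until the smallest cluster falls below the size threshold). To our understanding, SciPy's `ward` linkage on Euclidean distances corresponds to Ward's D2 criterion.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster


def standardize(X):
    """Column-wise z-scores: zero mean and unit sample standard deviation."""
    return (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)


def smallest_k_with_small_cluster(Z, size_threshold, k_max=20):
    """Return the first k whose smallest cluster has fewer members than
    size_threshold, together with the cluster labels at that k.

    Assumption: this is one reading of the stopping rule in the text.
    """
    for k in range(2, k_max + 1):
        labels = fcluster(Z, t=k, criterion="maxclust")
        # Labels start at 1, so drop the empty bin for label 0.
        if np.bincount(labels)[1:].min() < size_threshold:
            return k, labels
    return k_max, labels


def cluster_mean_z(Xz, labels):
    """Mean of each standardized variable within each cluster."""
    return {int(c): Xz[labels == c].mean(axis=0) for c in np.unique(labels)}


# Synthetic stand-in for the patient variables (not study data).
rng = np.random.default_rng(0)
centers = np.array([[-2.0, -2.0, 0.0], [0.0, 0.0, 0.0], [2.5, 2.5, 1.0]])
sizes = [300, 500, 100]
X = np.vstack([rng.normal(c, 0.5, size=(n, 3)) for c, n in zip(centers, sizes)])

Xz = standardize(X)                                   # standardized variables
Z = linkage(Xz, method="ward", metric="euclidean")    # agglomerative tree
k, labels = smallest_k_with_small_cluster(Z, size_threshold=250)
profiles = cluster_mean_z(Xz, labels)                 # per-cluster mean z-scores

for c, z in profiles.items():
    print(c, int((labels == c).sum()), np.round(z, 2))
```

Because the variables are standardized before clustering, each cluster's mean vector can be read directly as a Z-score relative to the whole sample, which is how clusters with low or high BMD would stand out in such a profile.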
Ten thousand seven hundred seventy-five women were included in this study. Nine (k = 9) clusters were identified. Four clusters (n = 2886) were identified based on low to very low BMD with differences in comorbidity, anthropometrics and future bisphosphonate compliance. Two clusters of younger subjects (n = 1058, mean ages 30 and 51 years) were identified as low fracture risk with high to very high BMD. A mean age of 60 years was the earliest that allowed for separation of high-risk clusters. DXA scan results could identify high-risk subjects with different antiresorptive treatment compliance levels based on similarities and differences in lumbar spine and hip region BMD.
Unsupervised HAC presents a novel technology to improve patient characterization in bone disease beyond traditional T-score-based diagnosis. Technological and validation limitations need to be overcome to improve its use in internal medicine. Current DXA scan indication guidelines could be further improved by clustering algorithms.
Keywords: Clustering, Densitometry, Machine learning, Osteoporosis, Risk factors
The Obel Family Foundation and the Department of Clinical Medicine at Aalborg University are acknowledged for providing grants that enable the PhD fellowship of Dr. Christian Kruse. Statistics Denmark is acknowledged for providing data. The community around the R statistical software is acknowledged for the programming packages and guidance that enable studies such as this.
Compliance with ethical standards
Conflicts of Interest
CK has received travel grants from Eli Lilly and Otsuka Pharmaceutical, and is a speaker for Novartis and Otsuka Pharmaceutical. PE is an advisory board member for Amgen, MSD and Eli Lilly, is on the speakers bureau for Amgen and Eli Lilly, and holds stock in Novo Nordisk A/S. PV has received unrestricted grants from MSD and Servier, and travel grants from Amgen, Eli Lilly, Novartis, Sanofi-Aventis, and Servier.