Machine Learning Principles Can Improve Hip Fracture Prediction

Kruse, Christian; Eiken, Pia; Vestergaard, Peter

doi:10.1007/s00223-017-0238-7

Original Research
Published: 14 February 2017

Volume 100, pages 348–360 (2017)

Calcified Tissue International

Abstract

The aim was to apply machine learning principles to predict hip fractures and estimate predictor importance in dual-energy X-ray absorptiometry (DXA)-scanned men and women. DXA data from two Danish regions between 1996 and 2006 were combined with national Danish patient data to comprise 4722 women and 717 men with 5 years of follow-up time (original cohort n = 6606 men and women). Twenty-four statistical models were built on 75% of data points through 5-fold, 5-repeat cross-validation (k = 5), and then validated on the remaining 25% of data points to calculate area under the curve (AUC) and calibrate probability estimates. The best models were retrained with restricted predictor subsets to estimate the best subsets. For women, bootstrap aggregated flexible discriminant analysis (“bagFDA”) performed best, with a test AUC of 0.92 [0.89; 0.94] and well-calibrated probabilities following Naïve Bayes adjustments. A “bagFDA” model limited to 11 predictors (among them bone mineral densities (BMD), biochemical glucose measurements, and general practitioner and dentist use) achieved a test AUC of 0.91 [0.88; 0.93]. For men, eXtreme Gradient Boosting (“xgbTree”) performed best, with a test AUC of 0.89 [0.82; 0.95] but poor calibration at higher probabilities. A ten-predictor subset (BMD, biochemical cholesterol and liver function tests, penicillin use and osteoarthritis diagnoses) achieved a test AUC of 0.86 [0.78; 0.94] using an “xgbTree” model. Ensemble machine learning models can improve hip fracture prediction beyond logistic regression. Compiling data from international cohorts with longer follow-up and performing similar machine learning procedures have the potential to further improve discrimination and calibration.
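The validation scheme summarised in the abstract (a 75%/25% split, 5-fold cross-validation repeated 5 times on the training part, and AUC on held-out data) can be illustrated with a minimal sketch. This is not the authors' code: the function names, the toy labels, the seeds and the choice to stratify only the initial split are assumptions made for illustration, and the AUC is computed with the standard pairwise-ranking definition.

```python
import random


def stratified_split(labels, test_fraction=0.25, seed=0):
    """Split row indices into train/test, keeping the class ratio in each part."""
    rng = random.Random(seed)
    train, test = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        n_test = round(len(idx) * test_fraction)
        test.extend(idx[:n_test])
        train.extend(idx[n_test:])
    return sorted(train), sorted(test)


def repeated_kfold(indices, k=5, repeats=5, seed=0):
    """Yield (fit, holdout) index lists for k-fold CV repeated `repeats` times.

    Folds are drawn by plain shuffling here (not stratified), an assumption
    made to keep the sketch short.
    """
    rng = random.Random(seed)
    for _ in range(repeats):
        shuffled = list(indices)
        rng.shuffle(shuffled)
        folds = [shuffled[f::k] for f in range(k)]
        for f in range(k):
            holdout = folds[f]
            fit = [i for g in range(k) if g != f for i in folds[g]]
            yield fit, holdout


def auc(scores, labels):
    """AUC as the probability that a random case outscores a random non-case.

    Ties between a case and a non-case count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 100 subjects, 20 with a hip fracture (label 1).
labels = [1] * 20 + [0] * 80
train_idx, test_idx = stratified_split(labels)
splits = list(repeated_kfold(train_idx))  # 5 folds x 5 repeats
```

In such a setup, each candidate model would be fitted on every `fit` list and scored on the matching `holdout` list to choose among models, and `auc` would be applied once to predictions for `test_idx`, which takes no part in model building.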



Acknowledgements

We acknowledge Statistics Denmark for providing data and a server platform for data analysis. The Obel Family Foundation of Aalborg, Denmark, and the Department of Clinical Medicine at Aalborg University, Denmark, are acknowledged for funding the PhD fellowship of Dr. Christian Kruse. Grant Numbers are not applicable in Denmark.

Author Contributions

Christian Kruse designed the study, performed data management, modelling, model validation, statistical analysis and graphical presentation, and prepared the first draft of the manuscript. He is guarantor. Pia Eiken and Peter Vestergaard revised the manuscript draft and gave final approval. All authors revised the paper critically for intellectual content and approved the final version. All authors agree to be accountable for the work and to ensure that any questions relating to the accuracy and integrity of the paper are investigated and properly resolved.

Author information

Authors and Affiliations

Department of Endocrinology, Aalborg University Hospital, Moelleparkvej 4, 9000, Aalborg, Denmark
Christian Kruse & Peter Vestergaard
Department of Clinical Medicine, Aalborg University, Sdr. Skovvej 15, 9000, Aalborg, Denmark
Christian Kruse & Peter Vestergaard
Department of Cardiology, Nephrology and Endocrinology, Nordsjaellands Hospital, Dyrehavevej 29, 3400, Hilleroed, Denmark
Pia Eiken
Faculty of Health and Medical Sciences, University of Copenhagen, Blegdamsvej 3B, 2200, Copenhagen N, Denmark
Pia Eiken
Department of Endocrinology, Aalborg University Hospital, Hobrovej 19, 9100, Aalborg, Denmark
Christian Kruse

Corresponding author

Correspondence to Christian Kruse.

Ethics declarations

Conflict of interest

CK has received travel grants from Eli Lilly and Otsuka Pharmaceutical and is a speaker for Novartis and Otsuka Pharmaceutical. PE is an advisory board member for Amgen, MSD and Eli Lilly, is on the speakers bureau for Amgen and Eli Lilly, and holds stock in Novo Nordisk A/S. PV has received unrestricted grants from MSD and Servier, and travel grants from Amgen, Eli Lilly, Novartis, Sanofi-Aventis and Servier.

Human and Animal Rights and Informed Consent

The procedures followed were in accordance with the ethical standards of the responsible committee on human experimentation (institutional and national) and with the Helsinki Declaration of 1975, as revised in 2008.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (DOCX 17 KB)

Supplementary material 2 (EPS 10 KB): Calibration plot of binned probability intervals versus actual observed percentages. A line close to the diagonal line indicates good calibration. Female subjects, bootstrap aggregated flexible discriminant analysis with Naïve Bayes calibration.

Supplementary material 3 (EPS 795 KB): Calibration plot of binned probability intervals versus actual observed percentages. A line close to the diagonal line indicates good calibration. Male subjects, eXtreme Gradient Boosting with Naïve Bayes calibration.
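The idea behind these calibration plots (comparing binned predicted probabilities with observed event percentages) can be sketched as follows. This is an illustrative example only: the function name, the use of ten equal-width bins and the toy predictions are assumptions, not details taken from the article.

```python
def calibration_table(probs, outcomes, n_bins=10):
    """Group predictions into equal-width probability bins.

    Returns one row per non-empty bin:
    (bin_low, bin_high, mean_predicted, observed_rate, count).
    """
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, outcomes):
        b = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls in the last bin
        bins[b].append((p, y))
    rows = []
    for b, members in enumerate(bins):
        if not members:
            continue  # skip bins that received no predictions
        mean_pred = sum(p for p, _ in members) / len(members)
        observed = sum(y for _, y in members) / len(members)
        rows.append((b / n_bins, (b + 1) / n_bins, mean_pred, observed, len(members)))
    return rows


# Hypothetical predictions: four low-risk non-cases and two high-risk cases.
probs = [0.05, 0.05, 0.05, 0.05, 0.95, 0.95]
outcomes = [0, 0, 0, 0, 1, 1]
rows = calibration_table(probs, outcomes)
```

Plotting `mean_predicted` against `observed_rate` for each row gives the calibration curve; points near the diagonal correspond to the "good calibration" the captions describe.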

About this article

Cite this article

Kruse, C., Eiken, P. & Vestergaard, P. Machine Learning Principles Can Improve Hip Fracture Prediction. Calcif Tissue Int 100, 348–360 (2017). https://doi.org/10.1007/s00223-017-0238-7

Received: 06 September 2016
Accepted: 05 December 2016
Published: 14 February 2017
Issue Date: April 2017
DOI: https://doi.org/10.1007/s00223-017-0238-7
