Abstract
This study aimed to develop fracture prediction models using machine learning approaches and genomic data, and to identify the best-performing modeling approach. Genomic data from the Osteoporotic Fractures in Men (MrOS) cohort study (n = 5130) were analyzed. After comprehensive genotype imputation, a genetic risk score (GRS) was calculated for each participant from 1103 associated single nucleotide polymorphisms (SNPs). Data were normalized and split into a training set (80%) and a validation set (20%) for analysis. Random forest, gradient boosting, neural network, and logistic regression models were each developed to predict major osteoporotic fractures, with GRS, bone density, and other risk factors as predictors. In model training, the synthetic minority oversampling technique was used to account for the low fracture rate, and tenfold cross-validation was employed for hyperparameter optimization. In testing, the area under the curve (AUC) and accuracy were used to assess model performance. The McNemar test was employed to examine the difference in accuracy between models. Gradient boosting performed best, with an AUC of 0.71 and an accuracy of 0.88, and the GRS ranked as the seventh most important variable in that model. Random forest and neural network also performed significantly better than logistic regression. These findings suggest that fracture prediction in older men can be improved by incorporating genetic profiling and using the gradient boosting approach. The results should not be extrapolated to women or younger individuals.
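Two steps named in the abstract, the per-participant GRS and the McNemar comparison of model accuracy, can be illustrated with a small sketch. The abstract does not state the GRS formula or which McNemar variant was used, so the code below makes two assumptions: the GRS is a weighted sum of risk-allele dosages, and the test is the two-sided exact (binomial) form. The function names `genetic_risk_score` and `mcnemar_exact` are hypothetical and are not taken from the study.

```python
from math import comb


def genetic_risk_score(dosages, weights):
    """Weighted sum of risk-allele dosages for one participant.

    ASSUMPTION: dosages are 0-2 per SNP (imputed values may be fractional)
    and weights are per-SNP effect sizes; the abstract gives no formula.
    """
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    return sum(d * w for d, w in zip(dosages, weights))


def mcnemar_exact(y_true, pred_a, pred_b):
    """Two-sided exact McNemar test on paired predictions of two models.

    Returns (b, c, p_value), where b counts cases only model A gets right
    and c counts cases only model B gets right.
    ASSUMPTION: exact binomial variant; the study may have used another form.
    """
    b = c = 0
    for truth, a, m in zip(y_true, pred_a, pred_b):
        a_ok, m_ok = (a == truth), (m == truth)
        if a_ok and not m_ok:
            b += 1
        elif m_ok and not a_ok:
            c += 1
    n = b + c
    if n == 0:
        # No discordant pairs: the two models are indistinguishable.
        return b, c, 1.0
    k = min(b, c)
    # Binomial(n, 0.5) lower tail, doubled for a two-sided test, capped at 1.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return b, c, min(1.0, 2 * tail)


if __name__ == "__main__":
    # Toy values for illustration only; not data from the study.
    print(genetic_risk_score([0, 1, 2], [0.1, 0.2, 0.3]))
    y = [0] * 10
    model_a = [0] * 10            # correct on every case
    model_b = [1] * 5 + [0] * 5   # wrong on the first five cases
    print(mcnemar_exact(y, model_a, model_b))
```

Only the discordant pairs (cases where exactly one model is correct) enter the test, which is why McNemar suits comparing two classifiers evaluated on the same participants.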
Abbreviations
- MrOS: Osteoporotic fractures in men study
- ML: Machine learning
- BMD: Bone mineral density
- FRAX: The fracture risk assessment tool
- GRS: Genetic risk score
- QUS: Quantitative ultrasound
- ROC: Receiver-operating curve
- AUC: Area under curve
- LR: Logistic regression
- RF: Random forest
- GB: Gradient boosting
- NN: Neural network
- MOF: Major osteoporotic fracture
- SNPs: Single nucleotide polymorphisms
- FNBMD: Femoral neck BMD
- TSBMD: Total spine BMD
- THBMD: Total hip BMD
- SOS: Speed of sound
- BUA: Broadband ultrasonic attenuation
- QUI: Quantitative ultrasonic index
Acknowledgements
The data/analyses presented in the current publication are based on the use of study data downloaded from the dbGaP web site, under phs000373.v1.p1 (https://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/study.cgi?study_id=phs000373.v1.p1). The research and analysis described in the present study were supported by a COBRE grant from the National Institute of General Medical Sciences (GR08954), the Genome Acquisition to Analytics (GAA) Research Core of the Personalized Medicine Center of Biomedical Research Excellence at the Nevada Institute of Personalized Medicine, and the National Supercomputing Institute at the University of Nevada Las Vegas. The funding sponsors were not involved in the analysis design, genotype imputation, data analysis, or interpretation of the analysis results, or in the preparation, review, or approval of this manuscript.
Ethics declarations
Conflict of interest
Qing Wu, Fatma Nasoz, Jongyun Jung, Bibek Bhattarai and Mira V Han declare that they have no conflict of interest.
Human and Animal Rights and Informed Consent
This study analyzed de-identified, secondary data only, and was exempted by the Institutional Review Board at the University of Nevada, Las Vegas.
Cite this article
Wu, Q., Nasoz, F., Jung, J. et al. Machine Learning Approaches for Fracture Risk Assessment: A Comparative Analysis of Genomic and Phenotypic Data in 5130 Older Men. Calcif Tissue Int 107, 353–361 (2020). https://doi.org/10.1007/s00223-020-00734-y
DOI: https://doi.org/10.1007/s00223-020-00734-y
Keywords
- Machine learning
- Fracture
- Osteoporosis
- Genomics
- Comparison