The New Possibilities from “Big Data” to Overlooked Associations Between Diabetes, Biochemical Parameters, Glucose Control, and Osteoporosis

Abstract

Purpose of Review

To review current practices and technologies within the scope of “Big Data” that can use large volumes of data to further our understanding of diabetes mellitus and osteoporosis. “Big Data” techniques involving supervised machine learning, unsupervised machine learning, and deep learning image analysis are presented with examples from the current literature.

Recent Findings

Supervised machine learning can allow us to better predict diabetes-induced osteoporosis and to understand the relative importance of predictors in diabetes-affected bone tissue. Unsupervised machine learning can allow us to understand patterns in data linking diabetic pathophysiology and altered bone metabolism. Image analysis using deep learning can make us less dependent on surrogate predictors by using large volumes of images to classify diabetes-induced osteoporosis and to predict future outcomes directly from the images.
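As a rough illustration of what "relative predictor importance" can mean in a supervised model, the sketch below fits a small logistic regression and then measures how much test accuracy drops when each predictor is shuffled (permutation importance). Everything in it is an assumption made for illustration: the data are synthetic and not patient data, the two predictors are hypothetical, and both the model and the importance measure were chosen here for simplicity rather than taken from the review.

```python
import math
import random


def make_synthetic_cohort(n, rng):
    """Generate purely synthetic rows (NOT patient data).

    Column 0 is a hypothetical predictor that drives the outcome;
    column 1 is a hypothetical predictor unrelated to the outcome.
    """
    X, y = [], []
    for _ in range(n):
        informative = rng.gauss(0.0, 1.0)
        unrelated = rng.gauss(0.0, 1.0)
        p_outcome = 1.0 / (1.0 + math.exp(-3.0 * informative))
        X.append([informative, unrelated])
        y.append(1 if rng.random() < p_outcome else 0)
    return X, y


def linear_score(w, b, row):
    return sum(wj * xj for wj, xj in zip(w, row)) + b


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(X, y, lr=0.1, epochs=200):
    """Plain batch gradient descent on the logistic loss."""
    w, b, n = [0.0] * len(X[0]), 0.0, len(X)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for row, label in zip(X, y):
            err = sigmoid(linear_score(w, b, row)) - label
            for j, xj in enumerate(row):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def accuracy(w, b, X, y):
    hits = 0
    for row, label in zip(X, y):
        pred = 1 if linear_score(w, b, row) >= 0 else 0
        if pred == label:
            hits += 1
    return hits / len(y)


def permutation_importance(w, b, X, y, rng):
    """Accuracy drop after shuffling each column in turn."""
    baseline = accuracy(w, b, X, y)
    drops = []
    for j in range(len(X[0])):
        column = [row[j] for row in X]
        rng.shuffle(column)
        X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
        drops.append(baseline - accuracy(w, b, X_perm, y))
    return drops


rng = random.Random(0)  # fixed seed so the sketch is repeatable
X_train, y_train = make_synthetic_cohort(400, rng)
X_test, y_test = make_synthetic_cohort(400, rng)
w, b = fit_logistic(X_train, y_train)
importances = permutation_importance(w, b, X_test, y_test, rng)
print("accuracy drop per predictor:", importances)
```

Shuffling the predictor that drives the synthetic outcome should cost far more accuracy than shuffling the unrelated one, which is the sense in which one predictor is "more important" than another. In real clinical data, correlated predictors can blur this picture, so such rankings are better read as descriptive than causal.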

Summary

“Big Data” techniques open new possibilities for understanding diabetes-induced osteoporosis and for assessing how well we can currently classify, understand, and predict this condition.

Author information

Corresponding author

Correspondence to Christian Kruse.

Ethics declarations

Conflict of Interest

Christian Kruse declares no conflict of interest.

Human and Animal Rights and Informed Consent

This article does not contain any studies with human or animal subjects performed by any of the authors.

Additional information

This article is part of the Topical Collection on Bone and Diabetes

About this article

Cite this article

Kruse, C. The New Possibilities from “Big Data” to Overlooked Associations Between Diabetes, Biochemical Parameters, Glucose Control, and Osteoporosis. Curr Osteoporos Rep 16, 320–324 (2018). https://doi.org/10.1007/s11914-018-0445-9

  • DOI: https://doi.org/10.1007/s11914-018-0445-9

Keywords

  • Diabetes
  • Osteoporosis
  • Fractures
  • Glucose
  • Big data
  • Machine learning