Abstract
Diagnostic procedures, therapeutic recommendations, and medical risk stratifications are based on dedicated, strictly controlled clinical trials. However, a plethora of real-world medical data also exists, although its greater volume comes at the expense of completeness, uniformity, and control. Here, a case-by-case comparison shows that our model for diabetes-related chronic kidney disease, which is based on real-world data, has greater predictive power than published algorithms, which were derived from clinical study data.
Data availability
Restrictions apply to the general availability of the data because of patient agreements and the nature of patient data. Data were used under license for the study presented in this manuscript. The IBM Explorys database is run by IBM, which makes the data available for secondary use (for example, scientific research) on a commercial basis. The INPC database is owned by the participating health institutions of the INPC. Access to the INPC data can be provided for research purposes through the Regenstrief Institute Data Core.
Acknowledgements
The authors thank O. Quarder, C. Ringemann, P. Stephan (Roche Diabetes Care GmbH, Germany), and H. Mikulski (Roche Diabetes Care Spain, S.L.) for their continuing contributions to this work. We are grateful to T. Beck, S. Chittajallu, and S. Weinert (Roche Diabetes Care, Inc., USA) for their consultancy in the early phase of the investigation. The support from U. Günzel as well as H. Rincker and team (Roche Diabetes Care Deutschland, Germany) is highly appreciated. We are indebted to R. Daikeler, K. Kusterer, S. Waibel, and S. Zink (Germany) for their medical advice concerning our initial results. The research described in this manuscript was funded by Roche Diabetes Care GmbH and supplemented with in-kind contributions from Eli Lilly and Company (S.M.), Indiana Biosciences Research Institute (D.R.), and Regenstrief Institute, Inc. (T.S.).
Author information
Contributions
S.R., A.A., A.B., and F.F.F. generated and validated the Roche/IBM algorithm. T.H. and H.K. performed independent validation and further analysis. S.M., D.R., T.S., and teams enabled data withdrawal and assessment. B.S., L.B., and R.H. provided consultation for the overall research project, which was led by W.P.
Ethics declarations
Competing interests
The authors declare the following potential conflicts of interest: T.H., B.S., W.P., S.R., and A.B. are inventors of a patent application related to the work described in this manuscript. T.H., H.K., B.S., R.H., and W.P. are employees of Roche Diabetes Care GmbH. S.R., A.A., A.B., L.B., and F.F.F. are employees of IBM Switzerland Ltd. S.M. is an employee of Eli Lilly and Company. Independent of his employment at Roche, W.P. is affiliated with Heidelberg University and is a member of the Faculty of Physics and Astronomy. T.S. is affiliated with Indiana University School of Medicine.
Additional information
Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary information
Supplementary Text and Figures
Supplementary Figures 1–3 and Supplementary Tables 1–7
About this article
Cite this article
Ravizza, S., Huschto, T., Adamov, A. et al. Predicting the early risk of chronic kidney disease in patients with diabetes using real-world data. Nat Med 25, 57–59 (2019). https://doi.org/10.1038/s41591-018-0239-8
- Springer Nature America, Inc.