Early diagnosis of Type 2 Diabetes Mellitus (T2DM) is crucial to enable timely therapeutic interventions and lifestyle modifications. As the time available for clinical office visits shortens and medical imaging data become more widely available, patient image data could be used to opportunistically identify patients for additional T2DM diagnostic workup by physicians. We investigated whether image-derived phenotypic data could be leveraged in tabular learning classifier models to predict T2DM risk in an automated fashion, flagging high-risk patients without the need for additional blood laboratory measurements. In contrast to traditional binary classifiers, we leverage neural networks and decision tree models to represent patient data as ‘SynthA1c’ latent variables that mimic empirical blood hemoglobin A1c laboratory measurements and achieve sensitivities as high as 87.6%. To evaluate how SynthA1c models may generalize to other patient populations, we introduce a novel generalizable metric that uses vanilla data augmentation techniques to predict model performance on out-of-domain input covariates. We show that image-derived phenotypes and physical examination data together can accurately predict diabetes risk as a means of opportunistic risk stratification enabled by artificial intelligence and medical imaging. Our code is available at https://github.com/allisonjchae/DMT2RiskAssessment.
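The two ideas in the abstract, flagging patients from a predicted A1c-like value and probing how predictions respond to augmented inputs, can be illustrated with a minimal sketch. This is not the authors' implementation (see the linked repository for that). The 6.5% cutoff is an assumption taken from common clinical practice rather than from the abstract, and `toy_predict`, `flag_high_risk`, `sensitivity`, and `augmentation_sensitivity` are hypothetical names. The last function uses Gaussian noise as a stand-in augmentation and is only a rough proxy for the kind of metric the abstract describes.

```python
import numpy as np

# Assumption: a commonly used HbA1c diagnostic cutoff (percent); not stated in the abstract.
A1C_THRESHOLD = 6.5


def flag_high_risk(synth_a1c, threshold=A1C_THRESHOLD):
    """Return a boolean mask of patients whose predicted A1c meets the cutoff."""
    return np.asarray(synth_a1c, dtype=float) >= threshold


def sensitivity(flags, has_t2dm):
    """Fraction of true T2DM patients that were flagged (true positive rate)."""
    flags = np.asarray(flags, dtype=bool)
    has_t2dm = np.asarray(has_t2dm, dtype=bool)
    positives = has_t2dm.sum()
    if positives == 0:
        return float("nan")
    return float((flags & has_t2dm).sum() / positives)


def augmentation_sensitivity(predict, features, noise_scale=0.05, n_draws=20, seed=0):
    """Mean absolute change in predictions under small Gaussian input perturbations.

    Lower values mean the predictor is locally smoother around the given inputs.
    """
    rng = np.random.default_rng(seed)
    features = np.asarray(features, dtype=float)
    baseline = predict(features)
    # Scale the noise per column so each feature is perturbed relative to its spread.
    scale = noise_scale * features.std(axis=0, keepdims=True)
    shifts = []
    for _ in range(n_draws):
        noisy = features + rng.normal(size=features.shape) * scale
        shifts.append(np.abs(predict(noisy) - baseline))
    return float(np.mean(shifts))


def toy_predict(x):
    """Hypothetical stand-in for a trained SynthA1c regressor over two features."""
    return 5.0 + np.asarray(x, dtype=float) @ np.array([0.02, 0.01])


if __name__ == "__main__":
    # Made-up feature rows for illustration only; not patient data.
    features = np.array([[25.0, 40.0], [38.0, 62.0], [45.0, 70.0]])
    predicted = toy_predict(features)
    flags = flag_high_risk(predicted)
    print(predicted, flags)
    print(sensitivity(flags, [False, True, True]))
    print(augmentation_sensitivity(toy_predict, features))
```

Predicting a continuous A1c-like value and thresholding it afterwards, rather than training a binary classifier directly, keeps the output on a scale clinicians already interpret.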
- Disease Prediction
- Representation Learning
M. S. Yao and A. Chae—Equal contribution as co-first authors.
W. R. Witschey and H. Sagreiya—Equal contribution as co-senior authors.
MSY is supported by NIH T32 EB009384. AC is supported by the AΩA Carolyn L. Kuckein Student Research Fellowship and the University of Pennsylvania Diagnostic Radiology Research Fellowship. WRW is supported by NIH R01 HL137984. MTM received funding from the Sarnoff Cardiovascular Research Foundation. HS received funding from the RSNA Scholar Grant.
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Yao, M.S. et al. (2023). SynthA1c: Towards Clinically Interpretable Patient Representations for Diabetes Risk Stratification. In: Rekik, I., Adeli, E., Park, S.H., Cintas, C., Zamzmi, G. (eds) Predictive Intelligence in Medicine. PRIME 2023. Lecture Notes in Computer Science, vol 14277. Springer, Cham. https://doi.org/10.1007/978-3-031-46005-0_5
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-46004-3
Online ISBN: 978-3-031-46005-0