Analysis of Health Screening Records Using Interpretations of Predictive Models
Health screening is conducted in many countries to track general health conditions and to find asymptomatic patients. In recent years, large-scale analyses of health screening records have been used to predict patients' future health conditions. While such predictions are important, medical researchers are also interested in identifying the factors that could cause patients' conditions to deteriorate in the future. For this purpose, we propose to use interpretations of trained predictive models. Specifically, we trained machine learning models to predict future diabetes stages, then applied permutation importance, SHapley Additive exPlanations (SHAP), and a sensitivity analysis to extract the features that contribute to aggravation. Among the trained models, XGBoost performed best in terms of the Matthews correlation coefficient. Permutation importance and SHAP showed that the model bases its predictions not only on attributes conventionally known to be related to diabetes but also on attributes not commonly used in its diagnosis. The sensitivity analysis showed that the changes in the predictions were mostly consistent with our intuition about how daily behavior affects the aggravation of type 2 diabetes.
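The interpretation steps named in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the data are synthetic, the `predict` function is a hypothetical fixed rule standing in for a trained model such as XGBoost, the task is reduced to binary labels rather than diabetes stages, and SHAP is omitted. The sketch computes the Matthews correlation coefficient, estimates permutation importance as the drop in that score when one feature column is shuffled, and runs a simple sensitivity analysis by shifting one feature and measuring the change in the share of positive predictions.

```python
import math
import random


def mcc(y_true, y_pred):
    """Matthews correlation coefficient for binary labels (0 or 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def predict(rows):
    """Hypothetical stand-in for a trained model: a fixed risk rule.

    It uses features 0 and 1 and ignores feature 2 entirely.
    """
    return [1 if 0.8 * r[0] + 0.2 * r[1] > 0.5 else 0 for r in rows]


def permutation_importance(rows, labels, n_features, repeats=10, seed=0):
    """Mean drop in MCC when one feature column is shuffled."""
    rng = random.Random(seed)
    baseline = mcc(labels, predict(rows))
    importances = []
    for j in range(n_features):
        drops = []
        for _ in range(repeats):
            column = [r[j] for r in rows]
            rng.shuffle(column)
            shuffled = [r[:j] + [v] + r[j + 1:] for r, v in zip(rows, column)]
            drops.append(baseline - mcc(labels, predict(shuffled)))
        importances.append(sum(drops) / repeats)
    return importances


def sensitivity(rows, j, delta):
    """Change in the share of positive predictions after shifting feature j."""
    before = sum(predict(rows)) / len(rows)
    shifted = [r[:j] + [r[j] + delta] + r[j + 1:] for r in rows]
    after = sum(predict(shifted)) / len(rows)
    return after - before


# Synthetic records (assumption): three features in [0, 1]; labels follow
# the same rule as the toy model, with 5% of them flipped as noise.
data_rng = random.Random(42)
rows = [[data_rng.random() for _ in range(3)] for _ in range(500)]
labels = []
for r in rows:
    y = 1 if 0.8 * r[0] + 0.2 * r[1] > 0.5 else 0
    if data_rng.random() < 0.05:
        y = 1 - y
    labels.append(y)

importances = permutation_importance(rows, labels, n_features=3)
shift_effect = sensitivity(rows, 0, 0.1)
```

In this toy setup the feature that the rule ignores receives an importance of exactly zero, because shuffling it cannot change any prediction. With a real trained model, the same procedure would be applied to held-out records and the model's own prediction function.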
This work was supported by JST COI Grant Number JPMJCE1301 and JSPS KAKENHI Grant Numbers JP16K00228 and JP16H02904.