Abstract
Diabetes triggers many health problems, including microvascular diseases, macrovascular abnormalities, and neuropathic diseases. From an economic perspective, diabetes is one of the costliest diseases; moreover, a high percentage of adults with diabetes live in low- and middle-income countries, which adds to the economic burden on these countries. Diagnosing the risk of diabetes helps combat this silent killer. In this paper, we propose an inclusive machine learning (ML) based predictive model for diagnosing the risk of having diabetes using a recent dataset of signs and symptoms, known as Diabetes Risk Prediction (DRP2020). We apply more than twenty ML techniques to DRP2020 and evaluate all ML-based models using several performance metrics, including accuracy, precision, recall, the harmonic mean of precision and recall, prediction speed, and alarm errors. We provide extensive simulation results and compare the performance of the various ML-based models. Accordingly, the model based on shallow neural networks (SNN) was selected as the optimum model for early stage diabetes risk prediction, scoring 99.23% for prediction accuracy and 99.38% for the harmonic mean of precision and recall. The obtained results demonstrate the advantage of our model over other state-of-the-art models. The proposed system can therefore be deployed as a clinical tool to assist in providing early stage diabetes risk prediction.
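The evaluation metrics named in the abstract can be illustrated with a minimal sketch for a binary classifier, computed from confusion-matrix counts. The function name `binary_metrics` and the example counts are hypothetical and are not taken from the paper. Reading "alarm errors" as the false alarm rate and the miss rate is an assumption, since the abstract does not define the term.

```python
def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute common binary-classification metrics from confusion-matrix counts.

    tp/fp/tn/fn are true positives, false positives, true negatives and
    false negatives. Ratios with a zero denominator are reported as 0.0.
    """
    def ratio(num: float, den: float) -> float:
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)   # share of predicted positives that are correct
    recall = ratio(tp, tp + fn)      # share of actual positives that are found
    return {
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "precision": precision,
        "recall": recall,
        # Harmonic mean of precision and recall (the F1 score).
        "f1": ratio(2 * precision * recall, precision + recall),
        # Assumed reading of "alarm errors": false alarms and missed cases.
        "false_alarm_rate": ratio(fp, fp + tn),
        "miss_rate": ratio(fn, fn + tp),
    }


if __name__ == "__main__":
    # Made-up counts for illustration only; these are NOT the paper's results.
    for name, value in binary_metrics(tp=60, fp=1, tn=38, fn=1).items():
        print(f"{name}: {value:.4f}")
```

Because the harmonic mean is dominated by the smaller of its two inputs, a model scores well on it only when precision and recall are both high, which makes it a useful complement to accuracy on imbalanced data.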
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Al-Haija, Q.A., Smadi, M., Al-Bataineh, O.M. (2022). Early Stage Diabetes Risk Prediction via Machine Learning. In: Abraham, A., et al. Proceedings of the 13th International Conference on Soft Computing and Pattern Recognition (SoCPaR 2021). SoCPaR 2021. Lecture Notes in Networks and Systems, vol 417. Springer, Cham. https://doi.org/10.1007/978-3-030-96302-6_42
DOI: https://doi.org/10.1007/978-3-030-96302-6_42
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-96301-9
Online ISBN: 978-3-030-96302-6
eBook Packages: Intelligent Technologies and Robotics (R0)