Abstract
Objectives
Diabetes has become a leading cause of mortality in both developed and developing countries and affects a growing number of people worldwide. As its prevalence continues to rise, researchers have worked to develop accurate diabetes prediction models. This study applies a diverse set of machine learning algorithms to detect diabetes at an early stage, particularly in females. The goal is to give physicians tools for identifying the disease early, enabling timely intervention and improving patient outcomes.
Methods
In this study, several state-of-the-art machine learning techniques were employed: a random forest classifier tuned with GridSearchCV, together with XGBoost, NGBoost, Bagging, LightGBM, and AdaBoost classifiers. These models were chosen as the base layer of the proposed stacked ensemble model because of their high accuracy. The dataset was preprocessed before being fed into the models to improve their performance.
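To make the stacking idea concrete, the following is a minimal sketch rather than the authors' implementation. It assumes scikit-learn only, so the XGBoost, NGBoost, and LightGBM base learners are left out. The synthetic data, the parameter grid, and the logistic-regression meta-learner are all illustrative assumptions that the abstract does not specify.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    AdaBoostClassifier,
    BaggingClassifier,
    RandomForestClassifier,
    StackingClassifier,
)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split

# Assumption: synthetic stand-in data with 768 rows and 8 features,
# the same shape as the Pima Indians Diabetes dataset. Not the real data.
X, y = make_classification(
    n_samples=768, n_features=8, n_informative=5, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Random forest tuned with GridSearchCV; this small grid is illustrative only.
tuned_forest = GridSearchCV(
    RandomForestClassifier(n_estimators=50, random_state=0),
    param_grid={"max_depth": [3, None]},
    cv=3,
)

# Base layer: a subset of the classifiers named in the Methods paragraph.
base_learners = [
    ("rf", tuned_forest),
    ("bagging", BaggingClassifier(random_state=0)),
    ("adaboost", AdaBoostClassifier(random_state=0)),
]

# Meta-learner (assumption): logistic regression trained on the
# cross-validated predictions of the base learners.
stack = StackingClassifier(
    estimators=base_learners,
    final_estimator=LogisticRegression(),
    cv=5,
)
stack.fit(X_train, y_train)

test_accuracy = stack.score(X_test, y_test)
print(f"held-out accuracy on synthetic data: {test_accuracy:.3f}")
```

The accuracy printed here comes from synthetic data and says nothing about the study's reported result. The sketch only shows how the base-layer predictions feed a second-level model.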
Results
The proposed model achieved an accuracy of 92.91%, which is competitive with existing approaches. In addition, Shapley additive explanations (SHAP) were used to interpret the machine learning models.
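SHAP attributions are built on Shapley values, which average a feature's marginal contribution over all subsets of the other features. The sketch below computes exact Shapley values for a hypothetical three-feature toy scoring function. The function `toy_risk`, its weights, and the feature names are illustrative assumptions, not the study's model or the SHAP library's API.

```python
from itertools import combinations
from math import factorial


def shapley_values(value_fn, features):
    """Exact Shapley value of each feature for a set-valued function."""
    n = len(features)
    phi = {}
    for feature in features:
        others = [g for g in features if g != feature]
        total = 0.0
        # Enumerate every subset of the remaining features.
        for k in range(n):
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                # Marginal contribution of `feature` when added to subset s.
                total += weight * (value_fn(s | {feature}) - value_fn(s))
        phi[feature] = total
    return phi


def toy_risk(present):
    """Hypothetical risk score given the set of features that are 'known'."""
    score = 0.0
    if "glucose" in present:
        score += 0.4
    if "bmi" in present:
        score += 0.2
    if "glucose" in present and "bmi" in present:
        score += 0.1  # interaction term, split equally between the two
    if "age" in present:
        score += 0.05
    return score


features = ["glucose", "bmi", "age"]
phi = shapley_values(toy_risk, features)
for name in features:
    print(f"{name}: {phi[name]:.3f}")
```

The attributions sum to the difference between the score with all features present and the score with none, which is the additivity property that makes them useful for explaining individual predictions. Exact enumeration grows exponentially with the number of features, so practical tools rely on faster model-specific or approximate methods.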
Conclusion
We anticipate that these findings will be beneficial to healthcare providers, stakeholders, students, and researchers involved in diabetes prediction research and development.
Data Availability
The data used to support the findings of the study are available at https://www.kaggle.com/datasets/uciml/pima-indians-diabetes-database.
Acknowledgements
We would like to thank the Bangladesh University of Business and Technology and the Queensland University of Technology for providing the necessary facilities.
Funding
This work received no external funding.
Author information
Contributions
Conceptualization and methodology, K.O. and M.H.R.; software, K.O., M.H.R., and M.M.I.; validation, K.O., M.H.R., and M.R.I.; formal analysis, M.W.; investigation, K.O. and M.R.I.; resources, M.H.R. and M.M.I.; writing (original draft preparation), K.O. and M.H.R.; writing (review and editing), M.W. and A.H.W.; visualization, K.O. and A.H.W.; supervision, M.W. and A.H.W. All authors have read and agreed to the published version of the manuscript.
Ethics declarations
Conflicts of Interest
We have no conflicts of interest.
Financial Disclosure
No financial interests related to the material of this manuscript have been declared.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendix: Supplementary data
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Oliullah, K., Rasel, M.H., Islam, M.M. et al. A stacked ensemble machine learning approach for the prediction of diabetes. J Diabetes Metab Disord (2023). https://doi.org/10.1007/s40200-023-01321-2
DOI: https://doi.org/10.1007/s40200-023-01321-2