Part of the book series: Studies in Computational Intelligence ((SCI,volume 1023))

Abstract

Diabetes is a disease that impairs the body's ability to regulate blood glucose, commonly referred to as blood sugar. At the end of 2019, a new public health problem (COVID-19) emerged, and it has been particularly harmful to people with diabetes. We therefore use data mining algorithms to predict diabetes, with the aim of preventing deaths and improving quality of life. In this paper, four algorithms are used to analyze the Diabetes dataset from DAT260x Lab01: Logistic Regression, Decision Tree Classifier, XGBoost and SVC. The models are compared to determine which algorithm is most effective. The paper first gives a brief overview of the dataset and of the work carried out on the subject. Next, the dataset and its features are discussed. The paper then explains the four algorithms and the virtual environments used to identify the variables that have the largest impact on the raw data. The findings are obtained by evaluating the confusion matrix of each selected algorithm. Finally, the paper presents the full observations and the conclusions drawn from the results.

Acknowledgements

This work is partly supported by VC Research (VCR 0000156).

Author information

Corresponding author

Correspondence to Victor Chang.

Appendix

The program is written in Python in a Jupyter notebook. The necessary resources are referenced throughout the code as required by each algorithm.

Logistic Regression:

(Figures a to c are not reproduced in this text.)
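
As a rough illustration of the technique named in this heading, the following is a minimal sketch of logistic regression trained by batch gradient descent. It is not the authors' notebook code. The function names `fit_logistic` and `predict`, the learning rate, the epoch count, and the two toy features are all assumptions made for this example, and the data are made up rather than taken from DAT260x Lab01. The sketch uses only the Python standard library so that it runs without extra packages.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log loss.

    lr and epochs are illustrative defaults, not values from the chapter.
    """
    n_samples = len(X)
    n_features = len(X[0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for xi, yi in zip(X, y):
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b)
            err = p - yi  # derivative of the log loss w.r.t. the linear score
            for j in range(n_features):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * g / n_samples for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n_samples
    return w, b


def predict(X, w, b, threshold=0.5):
    """Return 0/1 labels by thresholding the predicted probability."""
    return [
        1 if sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) >= threshold else 0
        for xi in X
    ]


# Made-up, already standardized toy features (hypothetical: glucose, BMI).
X_toy = [
    [-1.2, -0.8], [-0.9, -1.1], [-0.5, -0.3], [-0.7, 0.2],
    [0.6, 0.4], [0.9, 1.0], [1.3, 0.7], [1.1, -0.1],
]
y_toy = [0, 0, 0, 0, 1, 1, 1, 1]  # 1 = diabetic, 0 = not diabetic

w, b = fit_logistic(X_toy, y_toy)
preds = predict(X_toy, w, b)
print("weights:", w, "bias:", b)
print("predictions:", preds)
```

In practice a library implementation would normally be preferred over a hand-written loop; the sketch only shows what the fitting step does.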

Applying Logistic

(Figures d to m are not reproduced in this text.)
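
The evaluation step described in the abstract, comparing classifiers through their confusion matrices, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code. The labels, the predictions, and the model names `model_a` and `model_b` are made up, and the helper names `confusion_matrix` and `summarize` are hypothetical. The matrix layout `[[TN, FP], [FN, TP]]` is a choice made for this example.

```python
def confusion_matrix(y_true, y_pred):
    """Return [[TN, FP], [FN, TP]] for binary labels (1 = positive)."""
    tn = fp = fn = tp = 0
    for t, p in zip(y_true, y_pred):
        if t == 1 and p == 1:
            tp += 1
        elif t == 0 and p == 0:
            tn += 1
        elif t == 0 and p == 1:
            fp += 1
        else:
            fn += 1
    return [[tn, fp], [fn, tp]]


def summarize(cm):
    """Derive accuracy, precision and recall from a 2x2 confusion matrix."""
    (tn, fp), (fn, tp) = cm
    total = tn + fp + fn + tp
    return {
        "accuracy": (tp + tn) / total,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Made-up ground truth and predictions from two hypothetical models.
y_true = [1, 0, 1, 1, 0, 0, 1, 0, 1, 0]
model_preds = {
    "model_a": [1, 0, 1, 0, 0, 0, 1, 1, 1, 0],
    "model_b": [1, 0, 0, 0, 0, 1, 1, 1, 1, 0],
}

results = {
    name: summarize(confusion_matrix(y_true, preds))
    for name, preds in model_preds.items()
}
best = max(results, key=lambda name: results[name]["accuracy"])

for name, metrics in results.items():
    print(name, metrics)
print("highest accuracy:", best)
```

Accuracy alone can hide differences between models on a medical task, where a missed diabetic case (a false negative) may matter more than a false alarm. Reporting recall and precision next to accuracy makes that trade-off visible.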

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this chapter

Cite this chapter

Chang, V., Javvaji, S., Xu, Q.A., Hall, K., Guan, S. (2022). Diabetes Analysis with a Dataset Using Machine Learning. In: Chang, V., Kaur, H., Fong, S.J. (eds) Artificial Intelligence and Machine Learning Methods in COVID-19 and Related Health Diseases. Studies in Computational Intelligence, vol 1023. Springer, Cham. https://doi.org/10.1007/978-3-031-04597-4_8
