Likelihood Prediction of Diabetes at Early Stage Using Data Mining Techniques

Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 992)


Diabetes is one of the fastest-growing chronic, life-threatening diseases and had already affected 422 million people worldwide according to a 2018 World Health Organization (WHO) report. Because the disease has a relatively long asymptomatic phase, early detection is essential for a clinically meaningful outcome, yet around 50% of all people with diabetes remain undiagnosed. Early diagnosis is possible only through proper assessment of both common and less common signs and symptoms, which can appear at different phases from disease initiation up to diagnosis. Data mining classification techniques are widely accepted by researchers for building risk prediction models of the disease. Predicting the likelihood of diabetes requires a dataset containing data from newly diabetic or would-be diabetic patients. In this work, we use such a dataset of 520 instances, collected through direct questionnaires from patients of Sylhet Diabetes Hospital in Sylhet, Bangladesh. We analyzed the dataset with the Naive Bayes, Logistic Regression, and Random Forest algorithms; under both tenfold cross-validation and percentage split evaluation, Random Forest achieved the best accuracy on this dataset. Finally, we propose a commonly accessible, user-friendly tool that lets end users check their risk of diabetes by assessing their symptoms and that offers useful tips for controlling the risk factors.
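The two evaluation schemes named in the abstract, tenfold cross-validation and percentage split, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the data are synthetic binary "symptom" answers (only the instance count of 520 is taken from the abstract), the 80/20 split ratio is a guess because the abstract does not state one, and the classifier is a small hand-written Bernoulli Naive Bayes rather than the software the authors used. Logistic Regression and Random Forest are omitted for brevity.

```python
import math
import random


def train_nb(X, y):
    """Fit a Bernoulli Naive Bayes model with Laplace smoothing."""
    model = {}
    n_features = len(X[0])
    for label in (0, 1):
        rows = [x for x, t in zip(X, y) if t == label]
        prior = (len(rows) + 1) / (len(X) + 2)
        # P(feature j = 1 | label), smoothed so no probability is 0 or 1
        probs = [(sum(r[j] for r in rows) + 1) / (len(rows) + 2)
                 for j in range(n_features)]
        model[label] = (prior, probs)
    return model


def predict_nb(model, x):
    """Return the label with the highest log-posterior score."""
    scores = {}
    for label, (prior, probs) in model.items():
        s = math.log(prior)
        for xj, p in zip(x, probs):
            s += math.log(p if xj else 1.0 - p)
        scores[label] = s
    return max(scores, key=scores.get)


def accuracy(model, X, y):
    hits = sum(predict_nb(model, x) == t for x, t in zip(X, y))
    return hits / len(y)


def percentage_split(X, y, train_fraction=0.8, seed=0):
    """Shuffle once, train on the first part, test on the remainder."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    cut = int(len(idx) * train_fraction)
    train, test = idx[:cut], idx[cut:]
    model = train_nb([X[i] for i in train], [y[i] for i in train])
    return accuracy(model, [X[i] for i in test], [y[i] for i in test])


def k_fold_cv(X, y, k=10, seed=0):
    """Hold out each of k folds in turn and average the fold accuracies."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    scores = []
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        model = train_nb([X[i] for i in train], [y[i] for i in train])
        scores.append(accuracy(model, [X[i] for i in fold],
                               [y[i] for i in fold]))
    return sum(scores) / k


def make_synthetic(n=520, n_features=6, seed=42):
    """Hypothetical yes/no symptom answers; NOT the Sylhet dataset."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        p = 0.7 if label else 0.3  # symptoms are likelier in positive cases
        X.append([1 if rng.random() < p else 0 for _ in range(n_features)])
        y.append(label)
    return X, y


X, y = make_synthetic()
cv_acc = k_fold_cv(X, y, k=10)
split_acc = percentage_split(X, y, train_fraction=0.8)
print(f"10-fold CV accuracy: {cv_acc:.3f}")
print(f"80/20 split accuracy: {split_acc:.3f}")
```

Cross-validation uses every instance for testing exactly once, so its estimate is usually less sensitive to one lucky or unlucky partition than a single percentage split; reporting both, as the abstract does, gives a rough check on that sensitivity.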


Diabetes risk; Symptom; Early stage; Data mining; KDD; Dataset; Evaluation model; Supervised learning algorithms; Unsupervised learning algorithms; Mining tools


Ethical Approval

All procedures performed in studies involving human participants were in accordance with the ethical standards of the institution at which the studies were conducted, and ethical approval was obtained from Sylhet Diabetic Hospital, Sylhet, Bangladesh. Ref: S.D.A/88

Informed Consent

Informed consent was obtained from all individual participants included in the study.



Copyright information

© Springer Nature Singapore Pte Ltd. 2020

Authors and Affiliations

  1. Queen Mary University of London, London, United Kingdom
  2. Metropolitan University Sylhet, Sylhet, Bangladesh
  3. Metropolitan University Sylhet, Sylhet, Bangladesh
  4. Metropolitan University Sylhet, Sylhet, Bangladesh
