Cloud based framework for diagnosis of diabetes mellitus using K-means clustering

  • P. Mohamed Shakeel
  • S. Baskar
  • V. R. Sarma Dhulipala
  • Mustafa Musa Jaber
Part of the following topical collections:
  1. Special Issue on Emerging Applications of Internet of Medical Things in Personalised Healthcare System


Diabetes mellitus is a serious health problem that has affected populations all over the world for many decades. It is a group of chronic metabolic disorders characterized by high blood sugar, with contributing factors that include unhealthy diet, lack of physical activity, and heredity. The main types of diabetes mellitus are type 1, type 2, and gestational diabetes. Type 1 typically appears during childhood, whereas type 2 can develop at any age but mostly affects people older than 40. Gestational diabetes occurs in pregnant women. According to a WHO statistical report, 79% of deaths due to diabetes occurred in people under the age of 60. Handling the volume, velocity, variety, veracity, and value of such data requires a scalable environment. Cloud computing is a computing model well suited to accommodating huge volumes of dynamic data. To overcome these data handling problems, this work uses the Hadoop framework together with a clustering technique. It also predicts the occurrence of diabetes under various circumstances, and it compares the efficiency of two different clustering techniques suitable for this environment. The predicted results are used to identify which age groups and genders are most affected by diabetes. Attributes such as hypertension and nature of work are also taken into consideration in the analysis.
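As a rough illustration of the clustering step named in the title, the following is a minimal single-machine sketch of K-means written as an assignment ("map") step followed by an averaging ("reduce") step, which is the shape such a computation usually takes on Hadoop. It is not the authors' implementation: the function names, the two features (age and fasting glucose), and the six records are all made up for the example, and no Hadoop API is used.

```python
import math
from collections import defaultdict


def nearest(point, centroids):
    """Map step: return the index of the centroid closest to `point`."""
    return min(range(len(centroids)), key=lambda i: math.dist(point, centroids[i]))


def kmeans(points, centroids, max_iter=100):
    """Lloyd's K-means as repeated map (assign) and reduce (average) steps."""
    centroids = [tuple(c) for c in centroids]
    for _ in range(max_iter):
        # Map: group every point under the index of its nearest centroid.
        groups = defaultdict(list)
        for p in points:
            groups[nearest(p, centroids)].append(p)
        # Reduce: average each group's points to obtain the new centroid.
        updated = []
        for i, c in enumerate(centroids):
            members = groups.get(i)
            if members:
                updated.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                updated.append(c)  # an empty cluster keeps its old centroid
        if updated == centroids:  # no centroid moved, so the loop has converged
            break
        centroids = updated
    labels = [nearest(p, centroids) for p in points]
    return centroids, labels


# Hypothetical, made-up records: (age in years, fasting glucose in mg/dL).
records = [(25, 85), (30, 90), (28, 88), (55, 150), (60, 160), (58, 155)]
centroids, labels = kmeans(records, centroids=[records[0], records[3]])
```

In practice the features would normally be scaled before clustering, since Euclidean distance otherwise lets the attribute with the largest numeric range dominate. The initial centroids here are picked by hand only to keep the example deterministic.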


Keywords: Diabetes mellitus; Clustering techniques; Hadoop; Cloud computing; Dynamic data




Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  • P. Mohamed Shakeel (1)
  • S. Baskar (2)
  • V. R. Sarma Dhulipala (3)
  • Mustafa Musa Jaber (4)
  1. Faculty of Information and Communication Technology, Universiti Teknikal Malaysia Melaka, Durian Tunggal, Malaysia
  2. Department of ECE, Karpagam Academy of Higher Education, Coimbatore, India
  3. Department of Physics, Anna University, Tiruchirappalli, India
  4. Dijlah University College, Baghdad, Iraq
