Cluster Computing, Volume 22, Supplement 1, pp 1–9

Diabetes prediction in healthcare systems using machine learning algorithms on Hadoop cluster

  • N. Yuvaraj (corresponding author)
  • K. R. SriPreethaa


Health care systems are designed to meet the needs of a growing global population. People around the globe are affected by many kinds of deadly diseases. Among the most common, diabetes is a major cause of blindness, kidney failure, heart attacks, and other complications. Health care monitoring systems for different diseases and symptoms are available all around the world. Rapid developments in Information and Communication Technologies have brought remarkable improvements to health care systems. Various machine learning algorithms have been proposed that automate the working model of health care systems and enhance the accuracy of disease prediction. A Hadoop cluster based distributed computing framework supports efficient processing and storage of extremely large datasets in a cloud environment. This work proposes a novel implementation of machine learning algorithms on Hadoop based clusters for diabetes prediction. The Pima Indians Diabetes Database from the National Institute of Diabetes and Digestive and Kidney Diseases is used to evaluate the algorithms. The results show that machine learning algorithms can produce highly accurate predictive healthcare systems for diabetes.


Healthcare systems; Machine learning algorithms; Hadoop clusters; Predictive analysis; Cluster computing



Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2017

Authors and Affiliations

  1. Department of Computer Science and Engineering, KPR Institute of Engineering and Technology, Coimbatore, India
