Abstract
Machine learning is a rapidly growing field that enables learning from data and analyzing it with little human intervention. It is widely used in healthcare to analyze and detect serious and complex conditions. Diabetes is one such condition, and it heavily affects the entire body. In this paper, machine learning algorithms, namely logistic regression, naïve Bayes, support vector machine, decision tree, k-nearest neighbors, neural network, and random decision forest, are applied both with and without feature extraction. Comparing the accuracy of each algorithm with and without feature extraction yields a comparative study of these predictive models. The result is one list of algorithms that work better with feature extraction and another of algorithms that work better without it. These results can be used for better prediction and diagnosis of diabetes.
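The comparison outlined in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes scikit-learn, uses PCA as a stand-in for the feature extraction step (the abstract does not name the technique), and runs on synthetic data shaped like a small tabular medical dataset rather than on the real diabetes data. The helper names `make_models` and `compare`, the hyperparameters, and the choice of 5-fold cross-validation are all hypothetical.

```python
from sklearn.base import clone
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (assumption): 768 rows, 8 numeric features, binary label.
X, y = make_classification(
    n_samples=768, n_features=8, n_informative=5, n_redundant=2, random_state=0
)


def make_models():
    """Return the seven classifier families named in the abstract (default-ish settings)."""
    return {
        "logistic regression": LogisticRegression(max_iter=1000),
        "naive Bayes": GaussianNB(),
        "support vector machine": SVC(),
        "decision tree": DecisionTreeClassifier(random_state=0),
        "k-nearest neighbors": KNeighborsClassifier(),
        "neural network": MLPClassifier(max_iter=1000, random_state=0),
        "random forest": RandomForestClassifier(random_state=0),
    }


def compare(X, y, n_components=4, cv=5):
    """Cross-validated accuracy of each model without and with PCA (assumed extractor)."""
    results = {}
    for name, model in make_models().items():
        # Baseline: scale the raw features, then classify.
        plain = make_pipeline(StandardScaler(), clone(model))
        # Variant: scale, project onto n_components principal components, then classify.
        reduced = make_pipeline(
            StandardScaler(), PCA(n_components=n_components), clone(model)
        )
        acc_plain = cross_val_score(plain, X, y, cv=cv, scoring="accuracy").mean()
        acc_reduced = cross_val_score(reduced, X, y, cv=cv, scoring="accuracy").mean()
        results[name] = (acc_plain, acc_reduced)
    return results


if __name__ == "__main__":
    for name, (acc_plain, acc_reduced) in compare(X, y).items():
        better = "with" if acc_reduced > acc_plain else "without"
        print(f"{name:24s} plain={acc_plain:.3f} pca={acc_reduced:.3f} -> better {better}")
```

Placing the scaler and PCA inside each pipeline means they are refit on the training folds only, so the two accuracy figures per model are not inflated by information from the held-out fold. The numbers produced on synthetic data say nothing about the paper's actual findings.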
Copyright information
© 2019 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Karun, S., Raj, A., Attigeri, G. (2019). Comparative Analysis of Prediction Algorithms for Diabetes. In: Bhatia, S., Tiwari, S., Mishra, K., Trivedi, M. (eds) Advances in Computer Communication and Computational Sciences. Advances in Intelligent Systems and Computing, vol 759. Springer, Singapore. https://doi.org/10.1007/978-981-13-0341-8_16
DOI: https://doi.org/10.1007/978-981-13-0341-8_16
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-0340-1
Online ISBN: 978-981-13-0341-8
eBook Packages: Intelligent Technologies and Robotics (R0)