Abstract
Diabetes offers ample opportunity for building classifiers, as a wealth of patient data is available in the public domain. The disease affects a large population and therefore costs a great deal of money. Over the years it spreads to other organs in the body, which makes its impact lethal. Physicians are therefore interested in early and accurate detection of diabetes. This paper presents an efficient binary classifier for the detection of diabetes using data preprocessing and a Support Vector Machine (SVM). In this study, an attribute evaluator with best-first search is used to reduce the number of features, and the dimension of the input feature vector is reduced from eight to three. The dataset used is the Pima diabetic dataset from the UCI repository. A substantial increase in accuracy is noted when data preprocessing is used.
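The feature-reduction step can be illustrated with a minimal sketch. The abstract does not say which attribute evaluator was used, so the code below assumes a correlation-based merit score in the style of CFS, and it swaps best-first search for a simpler greedy forward search without backtracking. The data is synthetic (eight Gaussian features, three of which determine the label), not the Pima dataset, and every function name here is hypothetical.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def cfs_merit(subset, columns, labels):
    """CFS-style merit: rewards feature-class correlation, penalises redundancy.

    Assumption: absolute Pearson correlation stands in for the evaluator's
    correlation measure; the paper's actual evaluator is not specified.
    """
    k = len(subset)
    if k == 0:
        return 0.0
    r_cf = sum(abs(pearson(columns[j], labels)) for j in subset) / k
    if k == 1:
        return r_cf
    pairs = [(a, b) for i, a in enumerate(subset) for b in subset[i + 1:]]
    r_ff = sum(abs(pearson(columns[a], columns[b])) for a, b in pairs) / len(pairs)
    return k * r_cf / math.sqrt(k + k * (k - 1) * r_ff)


def greedy_forward_select(columns, labels, max_features=3):
    """Add the feature that most improves the merit until no gain or the cap is hit.

    This is a simplification: true best-first search also keeps a queue of
    unexpanded subsets and can backtrack to them.
    """
    selected, best_merit = [], 0.0
    remaining = list(range(len(columns)))
    while remaining and len(selected) < max_features:
        merit, feature = max(
            (cfs_merit(selected + [j], columns, labels), j) for j in remaining
        )
        if merit <= best_merit:
            break
        selected.append(feature)
        remaining.remove(feature)
        best_merit = merit
    return sorted(selected), best_merit


def make_synthetic_data(n_rows=500, n_features=8, informative=(1, 5, 7), seed=0):
    """Synthetic stand-in for a patient table: only `informative` columns drive the label."""
    rng = random.Random(seed)
    columns = [[rng.gauss(0.0, 1.0) for _ in range(n_rows)] for _ in range(n_features)]
    labels = [
        1 if sum(columns[j][i] for j in informative) > 0 else 0
        for i in range(n_rows)
    ]
    return columns, labels


if __name__ == "__main__":
    columns, labels = make_synthetic_data()
    selected, merit = greedy_forward_select(columns, labels, max_features=3)
    print("selected feature indices:", selected)
    print("subset merit:", round(merit, 3))
```

In a full pipeline the selected column indices would be used to project the dataset before training the SVM. The classifier itself is left out here to keep the sketch dependency-free.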
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Pradhan, M., Bamnote, G.R. (2015). Efficient Binary Classifier for Prediction of Diabetes Using Data Preprocessing and Support Vector Machine. In: Satapathy, S., Biswal, B., Udgata, S., Mandal, J. (eds) Proceedings of the 3rd International Conference on Frontiers of Intelligent Computing: Theory and Applications (FICTA) 2014. Advances in Intelligent Systems and Computing, vol 327. Springer, Cham. https://doi.org/10.1007/978-3-319-11933-5_15
DOI: https://doi.org/10.1007/978-3-319-11933-5_15
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-11932-8
Online ISBN: 978-3-319-11933-5
eBook Packages: Engineering, Engineering (R0)