Diagnosis of Chronic Kidney Disease Based on Support Vector Machine by Feature Selection Methods

  • Huseyin Polat
  • Homay Danaei Mehr
  • Aydin Cetin
Part of the following topical collections:
  1. Systems-Level Quality Improvement

Abstract

Because Chronic Kidney Disease progresses slowly, early detection and effective treatment are the only way to reduce the mortality rate. Machine learning techniques are gaining significance in medical diagnosis because they can classify cases with high accuracy. The accuracy of a classification algorithm depends on choosing an appropriate feature selection algorithm to reduce the dimension of the dataset. In this study, the Support Vector Machine classification algorithm was used to diagnose Chronic Kidney Disease. Two essential types of feature selection methods, namely the wrapper and filter approaches, were chosen to reduce the dimension of the Chronic Kidney Disease dataset. In the wrapper approach, the classifier subset evaluator with the greedy stepwise search engine and the wrapper subset evaluator with the Best First search engine were used. In the filter approach, the correlation feature selection subset evaluator with the greedy stepwise search engine and the filtered subset evaluator with the Best First search engine were used. The results showed that the Support Vector Machine classifier combined with the filtered subset evaluator and the Best First search engine achieved a higher accuracy rate (98.5%) in the diagnosis of Chronic Kidney Disease than the other selected methods.
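
The difference between the wrapper and filter approaches can be illustrated with a minimal sketch. It is not the study's implementation: the function names, the toy data, and the use of Pearson correlation are assumptions made for illustration, and a nearest-centroid classifier stands in for the Support Vector Machine so that the example needs only the Python standard library. A greedy stepwise (forward) search adds one feature at a time and stops when the score no longer improves; the wrapper evaluator scores a subset by the accuracy of a classifier trained on it, while the filter evaluator scores it with a correlation-based merit that involves no classifier.

```python
from math import sqrt
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either input is constant."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (v - my) for x, v in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((v - my) ** 2 for v in ys)
    return cov / sqrt(vx * vy) if vx > 0 and vy > 0 else 0.0


def column(X, j):
    return [row[j] for row in X]


def cfs_merit(X, y, subset):
    """Filter score: correlation-based merit, no classifier involved.
    Rewards correlation with the class, penalises redundancy among features."""
    k = len(subset)
    r_cf = mean(abs(pearson(column(X, j), y)) for j in subset)
    pairs = [(a, b) for i, a in enumerate(subset) for b in subset[i + 1:]]
    r_ff = mean(abs(pearson(column(X, a), column(X, b))) for a, b in pairs) if pairs else 0.0
    return k * r_cf / sqrt(k + k * (k - 1) * r_ff)


def wrapper_accuracy(X, y, subset):
    """Wrapper score: leave-one-out accuracy of a nearest-centroid classifier
    (an assumed stand-in for the SVM) restricted to the candidate subset."""
    hits = 0
    for i in range(len(X)):
        centroids = {}
        for label in set(y):
            rows = [X[n] for n in range(len(X)) if n != i and y[n] == label]
            centroids[label] = [mean(r[j] for r in rows) for j in subset]
        point = [X[i][j] for j in subset]
        predicted = min(
            centroids,
            key=lambda c: sum((p - q) ** 2 for p, q in zip(point, centroids[c])),
        )
        hits += predicted == y[i]
    return hits / len(X)


def greedy_stepwise(X, y, evaluate, min_gain=1e-9):
    """Forward search: add the best-scoring feature until the gain vanishes."""
    selected, best = [], float("-inf")
    remaining = list(range(len(X[0])))
    while remaining:
        scores = {j: evaluate(X, y, selected + [j]) for j in remaining}
        feature = max(scores, key=scores.get)
        if scores[feature] - best <= min_gain:
            break
        selected.append(feature)
        remaining.remove(feature)
        best = scores[feature]
    return selected, best


# Hypothetical toy data: feature 0 is informative, feature 1 is noise,
# feature 2 duplicates feature 0 (redundant). Label 1 = CKD, 0 = NotCKD.
X = [
    [1.0, 5.0, 2.0], [1.2, 3.0, 2.4], [0.9, 4.0, 1.8],
    [3.0, 4.0, 6.0], [3.2, 5.0, 6.4], [2.9, 3.0, 5.8],
]
y = [0, 0, 0, 1, 1, 1]

wrapper_pick = greedy_stepwise(X, y, wrapper_accuracy)
filter_pick = greedy_stepwise(X, y, cfs_merit)
print("wrapper:", wrapper_pick)
print("filter: ", filter_pick)
```

On this toy data both evaluators keep a single informative feature and discard the noise feature; the redundant copy adds nothing to either score, so the search stops after one step. The wrapper evaluator retrains the classifier for every candidate subset, which is why wrapper methods are typically more expensive than filter methods on larger datasets.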

Keywords

Feature selection; Support vector machine; Chronic kidney disease; Machine learning

Abbreviations

CKD: Chronic Kidney Disease
UCI: University of California Irvine
SVM: Support Vector Machine
GA: Genetic Algorithm
SymmetricUncertAttributesetEval: Symmetrical uncertainty attribute set evaluator
SVEGA: Shapley Value Embedded Genetic Algorithm
KNN: K-nearest Neighbor
GainRatioAttributeEval: Gain ratio attribute evaluator
PrincipalComponentsAttributeEval: Principal components attribute evaluator
SIMCA: Soft Independent Modeling of Class Analogy
AUC: Area Under the ROC Curve
TCMSP: Traditional Chinese Medicine Syndrome Prediction method
OSAF: Oscillating Search Algorithm Feature Selection
NotCKD: Without Chronic Kidney Disease
ClassifierSubsetEval: Classifier subset evaluator
WrapperSubsetEval: Wrapper subset evaluator
FilterSubsetEval: Filtered subset evaluator
CfsSubsetEval: Correlation feature selection subset evaluator
TP: True Positive
TN: True Negative
FP: False Positive
FN: False Negative
ROC: Receiver Operating Characteristic

Copyright information

© Springer Science+Business Media New York 2017

Authors and Affiliations

  1. Department of Computer Engineering, Faculty of Technology, Gazi University, Ankara, Turkey