Fisher score and Matthews correlation coefficient-based feature subset selection for heart disease diagnosis using support vector machines
The heart is one of the essential organs of the human body, and its failure is a major contributor to human mortality. Coronary heart disease may be asymptomatic, but it can be anticipated from medical tests and the daily routine of the subject. Diagnosing coronary heart disease requires specialized medical experts with extensive experience. All over the world, and particularly in developing countries, there is a shortage of such experts, which makes diagnosis more difficult. In this paper, we present a clinical heart disease diagnostic system by proposing a feature subset selection methodology with the objective of achieving improved performance. The proposed methodology presents three algorithms for selecting candidate feature subsets: (1) a mean Fisher score-based feature selection algorithm, (2) a forward feature selection algorithm and (3) a reverse feature selection algorithm. A feature subset selection algorithm is then presented to select the most decisive subset from the candidate feature subsets. Features are added to the feature subsets on the basis of their individual Fisher scores, while the selection of a feature subset depends on its Matthews correlation coefficient score and its dimension. The selected feature subset, with reduced dimension, is fed to an RBF kernel-based SVM, which performs binary classification: (1) heart disease patient and (2) normal control subject. The proposed methodology is validated through accuracy, specificity and sensitivity using four UCI datasets, i.e., Cleveland, Switzerland, Hungarian and SPECTF. The statistical results achieved using the proposed technique are compared with those of existing techniques and indicate better performance. The method achieves an accuracy of 81.19, 84.52, 92.68 and 82.7% for Cleveland, Hungarian, Switzerland and SPECTF, respectively.
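The two scores named in the abstract can be illustrated with a minimal sketch. It assumes the commonly used Fisher score (between-class scatter divided by within-class scatter, per feature) and the standard Matthews correlation coefficient for binary labels; the abstract does not give the paper's exact formulas. The function names `fisher_score`, `rank_features`, `mcc` and `select_subset` are hypothetical, and the tie-break toward the smaller subset is an assumed reading of "score and dimension", not the authors' stated rule.

```python
import math


def fisher_score(values, labels):
    """Fisher score of one feature (assumed standard definition):
    sum_k n_k (mu_k - mu)^2 / sum_k n_k var_k, over classes k."""
    overall_mean = sum(values) / len(values)
    between = 0.0
    within = 0.0
    for cls in set(labels):
        cls_vals = [v for v, y in zip(values, labels) if y == cls]
        n = len(cls_vals)
        mean = sum(cls_vals) / n
        var = sum((v - mean) ** 2 for v in cls_vals) / n
        between += n * (mean - overall_mean) ** 2
        within += n * var
    # A feature with zero within-class spread separates the classes perfectly.
    return between / within if within > 0 else float("inf")


def rank_features(rows, labels):
    """Return feature indices sorted by descending Fisher score."""
    n_features = len(rows[0])
    scores = [fisher_score([r[j] for r in rows], labels) for j in range(n_features)]
    return sorted(range(n_features), key=lambda j: scores[j], reverse=True)


def mcc(y_true, y_pred):
    """Matthews correlation coefficient for binary labels (1 = patient, 0 = control)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0 when any marginal count is zero.
    return (tp * tn - fp * fn) / denom if denom else 0.0


def select_subset(candidates):
    """Pick the (subset, mcc_score) pair with the highest MCC.
    Assumed tie-break: prefer the subset with fewer features."""
    return max(candidates, key=lambda c: (c[1], -len(c[0])))[0]


# Toy data, not taken from the UCI datasets: feature 0 separates the classes well.
rows = [[1, 1], [2, 5], [5, 2], [6, 6]]
labels = [0, 0, 1, 1]
ranking = rank_features(rows, labels)
```

In a full pipeline, each candidate subset would be scored by training a classifier (the paper uses an RBF kernel SVM) on those features and computing `mcc` on held-out predictions; that step is omitted here to keep the sketch dependency-free.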
Keywords: Heart disease, Feature selection, Fisher score, SVM, RBF
This research did not receive any specific grant from funding agencies in the public, commercial or not-for-profit sectors.
Compliance with ethical standards
Conflicts of interest
There have been no involvements that might raise the question of bias in the work reported or in the conclusions, implications or opinions stated. The authors declare that they have no conflict of interest.
This article does not contain any studies with human participants or animals performed by any of the authors.