Part of the book series: Lecture Notes in Networks and Systems (LNNS, volume 121)

Abstract

This paper aims to promote a better understanding and wider use of machine learning in the medical sector. It compares the performance of six classification models applied to the University of California Irvine (UCI) Cleveland Heart Disease records to predict coronary artery disease (CAD). First, all 13 independent features provided in the dataset were used to build the models. Comparing model accuracy showed that K-nearest neighbors (KNN), support vector machine (SVM), and Naive Bayes performed as expected and better than the other models. Feature selection was then applied to improve prediction accuracy: the backward elimination method and a filter method based on the Pearson correlation coefficient were used to choose the major predictive features. Comparing the accuracy of models built on all features with that of models built on the selected features showed that feature selection significantly enhanced the performance of Naive Bayes and random forest, while the other models did not perform as expected. After feature selection, Naive Bayes achieved an accuracy of 88.16% on the test set.
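The abstract names a Pearson-correlation filter followed by Naive Bayes but gives no implementation details. The following is only a minimal sketch of that general idea, under several assumptions that do not come from the paper: a Gaussian variant of Naive Bayes, a hypothetical correlation threshold of 0.3, a simple hold-out split, and synthetic stand-in data rather than the Cleveland records. The helper names `pearson`, `select_features`, and `GaussianNB` are illustrative.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0


def select_features(X, y, threshold):
    """Filter method: keep column indices whose |r| with the target exceeds threshold."""
    return [j for j in range(len(X[0]))
            if abs(pearson([row[j] for row in X], y)) > threshold]


class GaussianNB:
    """Minimal Gaussian Naive Bayes: one normal distribution per class and feature."""

    def fit(self, X, y):
        self.stats = {}
        for c in set(y):
            rows = [r for r, label in zip(X, y) if label == c]
            cols = list(zip(*rows))
            means = [sum(col) / len(col) for col in cols]
            # A small constant keeps every variance strictly positive.
            varis = [sum((v - m) ** 2 for v in col) / len(col) + 1e-9
                     for col, m in zip(cols, means)]
            self.stats[c] = (math.log(len(rows) / len(X)), means, varis)
        return self

    def predict(self, X):
        def log_post(row, prior, means, varis):
            # Log prior plus the sum of per-feature Gaussian log-likelihoods.
            return prior + sum(
                -0.5 * math.log(2 * math.pi * v) - (x - m) ** 2 / (2 * v)
                for x, m, v in zip(row, means, varis))
        return [max(self.stats, key=lambda c: log_post(row, *self.stats[c]))
                for row in X]


# Synthetic stand-in data (NOT the Cleveland records): feature 0 tracks the
# label, feature 1 is pure noise.
rng = random.Random(0)
y = [i % 2 for i in range(200)]
X = [[label * 2.0 + rng.gauss(0, 0.5), rng.gauss(0, 1)] for label in y]

# Assumed threshold of 0.3; the paper's actual cutoff is not stated in the abstract.
kept = select_features(X, y, threshold=0.3)
X_sel = [[row[j] for j in kept] for row in X]

# Hold out the last 50 rows as a test set.
model = GaussianNB().fit(X_sel[:150], y[:150])
pred = model.predict(X_sel[150:])
accuracy = sum(p == t for p, t in zip(pred, y[150:])) / 50
print(kept, accuracy)
```

On real data, the threshold and the train/test split would need to be chosen and validated with care; the values here only illustrate the mechanics of filtering features before fitting the classifier.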

Corresponding author

Correspondence to Akansh Gupta.

Copyright information

© 2020 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Gupta, A., Kumar, L., Jain, R., Nagrath, P. (2020). Heart Disease Prediction Using Classification (Naive Bayes). In: Singh, P., Pawłowski, W., Tanwar, S., Kumar, N., Rodrigues, J., Obaidat, M. (eds) Proceedings of First International Conference on Computing, Communications, and Cyber-Security (IC4S 2019). Lecture Notes in Networks and Systems, vol 121. Springer, Singapore. https://doi.org/10.1007/978-981-15-3369-3_42
