Abstract
This study aimed to predict heart disease using machine learning models combined with feature selection techniques, using the Cleveland Clinic Heart Disease dataset. Feature selection methods, including Recursive Feature Elimination (RFE) and the Least Absolute Shrinkage and Selection Operator (LASSO), were applied to identify relevant features. Several machine learning models, including Random Forest, Logistic Regression, and Support Vector Machines, were then trained and evaluated on the heart disease classification task. Random Forest combined with RFE performed best, with an accuracy of 91%, precision of 90%, recall of 90%, and F-score of 90%. The same combination also showed the strongest discriminatory ability, with an AUC score of 92%. These findings have implications for improving heart disease diagnosis and for developing predictive models for early detection.
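The best-performing combination described above, RFE followed by a Random Forest classifier, can be sketched with scikit-learn as follows. This is only an illustrative sketch, not the authors' implementation: the function name `evaluate_rfe_random_forest`, the number of retained features, the 80/20 split, the tree count, and the synthetic stand-in data are all assumptions, since the abstract does not state these details. The reported metrics (accuracy, precision, recall, F-score, AUC) are computed on the held-out split.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE
from sklearn.metrics import (accuracy_score, f1_score, precision_score,
                             recall_score, roc_auc_score)
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline


def evaluate_rfe_random_forest(X, y, n_features=8, seed=0):
    """Fit RFE + Random Forest and return test metrics and the feature mask.

    n_features, the split ratio, and the tree count are hypothetical
    choices made for illustration, not values taken from the paper.
    """
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, stratify=y, random_state=seed
    )
    model = Pipeline([
        # RFE repeatedly fits the estimator and drops the least important
        # feature until n_features remain.
        ("select", RFE(RandomForestClassifier(n_estimators=50, random_state=seed),
                       n_features_to_select=n_features)),
        # The final classifier is trained only on the selected features.
        ("classify", RandomForestClassifier(n_estimators=50, random_state=seed)),
    ])
    model.fit(X_train, y_train)

    y_pred = model.predict(X_test)
    y_score = model.predict_proba(X_test)[:, 1]  # probability of class 1
    metrics = {
        "accuracy": accuracy_score(y_test, y_pred),
        "precision": precision_score(y_test, y_pred),
        "recall": recall_score(y_test, y_pred),
        "f1": f1_score(y_test, y_pred),
        "auc": roc_auc_score(y_test, y_score),
    }
    return metrics, model.named_steps["select"].support_


# Synthetic binary-classification data as a placeholder for the real dataset,
# which would be loaded from the CSV file instead.
X_demo, y_demo = make_classification(
    n_samples=200, n_features=13, n_informative=6, random_state=0
)
demo_metrics, demo_support = evaluate_rfe_random_forest(X_demo, y_demo)
print(demo_metrics)
print("selected feature mask:", demo_support)
```

Placing the selector inside a `Pipeline` means RFE is fitted on the training split only, so the held-out metrics are not influenced by feature selection performed on test data.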
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Maulana, A., Faisyal, F.R., Tarmizi, F.K., Abidin, T.F., Riza, H. (2023). Optimizing Heart Disease Classification: Exploring the Impact of Feature Selection and Performance of Machine Learning Algorithms. In: Anutariya, C., Bonsangue, M.M. (eds) Data Science and Artificial Intelligence. DSAI 2023. Communications in Computer and Information Science, vol 1942. Springer, Singapore. https://doi.org/10.1007/978-981-99-7969-1_20
DOI: https://doi.org/10.1007/978-981-99-7969-1_20
Published:
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-7968-4
Online ISBN: 978-981-99-7969-1
eBook Packages: Computer Science, Computer Science (R0)