Prediction of Heart Disease Using Random Forest and Feature Subset Selection

  • M. A. Jabbar
  • B. L. Deekshatulu
  • Priti Chandra
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 424)


Heart disease is a leading cause of death worldwide and the number one killer in both urban and rural areas. Predicting the outcome of the disease is a challenging task. Data mining can be used to automatically infer diagnostic rules and help specialists make the diagnosis process more reliable. Researchers have applied several data mining techniques to help health care professionals predict heart disease. Random forest is a highly accurate ensemble learning algorithm that is suitable for medical applications. The chi square feature selection measure evaluates the relationship between variables and determines whether they are correlated. In this paper, we propose a classification model that uses random forest and chi square to predict heart disease. We evaluate our approach on heart disease data sets. The experimental results demonstrate that our approach improves classification accuracy compared to other classification approaches, and that the presented model can help health care professionals predict heart disease.
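The chi square feature selection step mentioned above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`chi_square_score`, `select_top_k`), the feature names, and the eight toy records are all hypothetical and are not taken from the paper's data sets. The sketch computes the Pearson chi square statistic between each categorical feature and the class label, then keeps the highest-scoring features; in a pipeline like the one described, the retained features would then be passed to a random forest classifier.

```python
from collections import Counter


def chi_square_score(feature_values, labels):
    """Pearson chi square statistic for one categorical feature vs. the class."""
    n = len(labels)
    feature_counts = Counter(feature_values)
    label_counts = Counter(labels)
    observed = Counter(zip(feature_values, labels))
    score = 0.0
    for f, f_count in feature_counts.items():
        for c, c_count in label_counts.items():
            # Expected cell count if feature and class were independent.
            expected = f_count * c_count / n
            score += (observed[(f, c)] - expected) ** 2 / expected
    return score


def select_top_k(rows, labels, feature_names, k):
    """Rank features by chi square score and return the k best with all scores."""
    scores = {
        name: chi_square_score([row[i] for row in rows], labels)
        for i, name in enumerate(feature_names)
    }
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:k], scores


# Toy, made-up records (assumption: binary features, binary diagnosis label).
feature_names = ["chest_pain", "high_bp", "smoker"]
rows = [
    (1, 1, 0), (1, 1, 1), (1, 0, 0), (1, 0, 1),
    (0, 1, 0), (0, 1, 1), (0, 0, 0), (0, 0, 1),
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

selected, scores = select_top_k(rows, labels, feature_names, k=1)
print(selected, scores)
```

In this toy data the hypothetical `chest_pain` feature matches the label exactly, so it receives the highest score, while the other two features are independent of the label and score zero. A higher score indicates stronger dependence between a feature and the class, which is the property the selection step relies on.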


Heart disease, Random forest, Data mining, Feature selection, Chi square



Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  • M. A. Jabbar (1, corresponding author)
  • B. L. Deekshatulu (2)
  • Priti Chandra (3)
  1. Muffakham Jah College of Engineering and Technology, Hyderabad, India
  2. IDRBT, RBI, Hyderabad, India
  3. ASL, DRDO, Hyderabad, India
