Empirical Study on Different Feature Selection and Classification Algorithms for Prediction of Hepatitis Disease

Technical Advancements of Machine Learning in Healthcare

Part of the book series: Studies in Computational Intelligence (SCI, volume 936)

Abstract

Hepatitis is one of the most commonly diagnosed diseases in the world. The medical industry holds an enormous amount of data, which makes it difficult to draw important conclusions, and data mining techniques are now used to address this problem. In this study, we applied several classifiers, namely KNN, Logistic Regression, Naive Bayes, Decision Tree, Support Vector Machine (SVM), and Random Forest, to the Hepatitis dataset obtained from the UCI Machine Learning Repository. Two feature selection techniques, the chi-square test and the Boruta algorithm, are used to improve the performance of the classifiers. Finally, we classify the patients as live or dead and analyze which classifier performed best based on various performance measures. We conclude that Naive Bayes with chi-square attribute selection performed better in terms of F1 score. Overall, Logistic Regression, Support Vector Machine, Kernel SVM, and KNN performed equally well, each with an accuracy of 90.32%.
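The abstract names two ingredients that a short sketch can make concrete: chi-square scoring of attributes and the accuracy and F1 measures used to compare classifiers. The Python below is an illustration only, not the chapter's code. The function names, the toy column names (`informative`, `uninformative`), and all data values are assumptions invented for this example and are not taken from the UCI Hepatitis dataset.

```python
from collections import Counter


def chi_square_score(feature, labels):
    """Pearson chi-square statistic of one categorical feature against the class label."""
    n = len(labels)
    observed = Counter(zip(feature, labels))   # counts per (feature value, class) cell
    feature_totals = Counter(feature)
    label_totals = Counter(labels)
    score = 0.0
    for f_val, f_count in feature_totals.items():
        for l_val, l_count in label_totals.items():
            expected = f_count * l_count / n   # cell count expected under independence
            score += (observed[(f_val, l_val)] - expected) ** 2 / expected
    return score


def select_top_k(columns, labels, k):
    """Rank the features by chi-square score and return the k highest-scoring names."""
    scores = {name: chi_square_score(values, labels) for name, values in columns.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:k], scores


def accuracy_and_f1(y_true, y_pred, positive):
    """Accuracy over all samples and F1 score for the class named by `positive`."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    return accuracy, f1


# Toy data (hypothetical): one column tracks the label exactly, the other ignores it.
toy_labels = ["live", "live", "live", "live", "die", "die", "die", "die"]
toy_columns = {
    "informative": ["no", "no", "no", "no", "yes", "yes", "yes", "yes"],
    "uninformative": ["a", "b", "a", "b", "a", "b", "a", "b"],
}
selected, toy_scores = select_top_k(toy_columns, toy_labels, k=1)
print(selected, toy_scores)

# Hypothetical predictions from some classifier, scored against the true classes.
toy_accuracy, toy_f1 = accuracy_and_f1(
    ["die", "die", "live", "live", "live"],
    ["die", "live", "live", "live", "die"],
    positive="die",
)
print(toy_accuracy, toy_f1)
```

Accuracy and F1 can disagree when one class is rare, which is one reason a study might report a best classifier by F1 alongside a group of classifiers tied on accuracy.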

Author information

Corresponding author

Correspondence to Niranjan Panda.

Copyright information

© 2021 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this chapter

Cite this chapter

Panda, N., Satapathy, S.K., Mishra, S., Mallick, P.K. (2021). Empirical Study on Different Feature Selection and Classification Algorithms for Prediction of Hepatitis Disease. In: Tripathy, H.K., Mishra, S., Mallick, P.K., Panda, A.R. (eds) Technical Advancements of Machine Learning in Healthcare. Studies in Computational Intelligence, vol 936. Springer, Singapore. https://doi.org/10.1007/978-981-33-4698-7_4
