An Empirical Comparison of Classification Machine Learning Models Using Medical Datasets

  • Conference paper
Proceedings of the 4th International Conference on Data Science, Machine Learning and Applications (ICDSMLA 2022)

Part of the book series: Lecture Notes in Electrical Engineering (LNEE, volume 1038)

Abstract

Classification is a supervised learning task in which a model learns to assign class labels to previously unseen samples. Medical data is an important source for understanding and improving health outcomes, and classification algorithms are often used to analyze it. Learning models give significant insight into the situational needs of patients. Many studies have been carried out on different datasets, yet it remains challenging to determine which model is suitable. The proposed work compares the performance of six classification models, logistic regression (LR), decision tree (DT), support vector machine (SVM), naive Bayes (NB), k-nearest neighbors (KNN), and random forest (RF), on various datasets. The SVM classifier yields an accuracy of 0.59 on the Diabetic dataset, as it relies on the opinion of an individual model, while the RF classifier performed best, with an accuracy of 0.9974 on the Breast Cancer Wisconsin dataset, since it is an ensemble approach that takes the majority opinion. These findings highlight the need for careful consideration of the choice of classification model when analyzing medical data and provide valuable insights for researchers and practitioners working with these data.
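The contrast the abstract draws between an individual model's opinion and an ensemble's majority opinion can be illustrated with a minimal sketch. The code below is not the authors' pipeline and uses no medical data: the labels and the three per-model prediction lists are invented toy values, and the helper names `accuracy` and `majority_vote` are hypothetical. It only shows how accuracy is computed and how a majority vote can correct errors that individual models make on different samples.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy on an empty sample")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def majority_vote(predictions):
    """Combine per-model predictions by taking the most common label per sample.

    `predictions` is a list of equally long label lists, one per model.
    Ties are resolved in favour of the label seen first among the votes.
    """
    combined = []
    for votes in zip(*predictions):
        combined.append(Counter(votes).most_common(1)[0][0])
    return combined


# Toy labels and predictions (assumed for illustration, not results from the paper).
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
model_a = [1, 0, 0, 1, 0, 1, 1, 0]  # wrong on samples 2 and 6
model_b = [1, 1, 1, 1, 0, 0, 0, 0]  # wrong on samples 1 and 5
model_c = [0, 0, 1, 1, 1, 1, 0, 0]  # wrong on samples 0 and 4

ensemble = majority_vote([model_a, model_b, model_c])

for name, pred in [("model_a", model_a), ("model_b", model_b),
                   ("model_c", model_c), ("ensemble", ensemble)]:
    print(f"{name}: accuracy = {accuracy(y_true, pred):.2f}")
```

In this toy setting each model is correct on 6 of 8 samples, but because no two models err on the same sample, the vote recovers every label. Real classifiers tend to make correlated errors, so the gain from voting is usually smaller than in this constructed case.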

Author information

Corresponding author

Correspondence to Mohd Dilshad Ansari.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Saketha Rama, B.V., Suryanarayana, G., Ansari, M.D., Begum, R. (2023). An Empirical Comparison of Classification Machine Learning Models Using Medical Datasets. In: Kumar, A., Gunjan, V.K., Hu, YC., Senatore, S. (eds) Proceedings of the 4th International Conference on Data Science, Machine Learning and Applications. ICDSMLA 2022. Lecture Notes in Electrical Engineering, vol 1038. Springer, Singapore. https://doi.org/10.1007/978-981-99-2058-7_29

  • DOI: https://doi.org/10.1007/978-981-99-2058-7_29

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-99-2057-0

  • Online ISBN: 978-981-99-2058-7

  • eBook Packages: Computer Science, Computer Science (R0)
