Influence of Bias and Variance in Selection of Machine Learning Classifiers for Biomedical Applications

Chakraborty, Parnasree; Rafiammal, S. Syed; Tharini, C.; Jamal, D. Najumnissa

doi:10.1007/978-981-19-3311-0_39

Part of the book series: Algorithms for Intelligent Systems (AIS)

Abstract

Machine learning classifiers play a vital role in biomedical signal analysis and disease diagnosis. The selection of a proper machine learning model for disease detection is based on the characteristics of the data. Bias and variance are the main error components that affect the performance of a machine learning model, and both are routinely considered in the error analysis of any model. Unbiasedness is often regarded as a desirable property of a classifier selection criterion, but here we show that a low variance is at least as significant, because a non-negligible variance introduces the potential for over-fitting in classifier selection as well as in model training. In machine learning (ML), the performance degradation caused by over-fitting the classifier selection criterion is a common problem, yet it has received little attention in the machine learning literature. This paper aims to address the problems caused by over-fitting in machine learning. The effects of over-fitting are often comparable in magnitude to the differences in performance between learning algorithms and hence cannot be ignored in experimental evaluation. Common performance metrics depend on the bias/variance of the selection criterion and are therefore susceptible to over-fitting, which makes them unreliable in practice. We discuss various methods to avoid over-fitting in the selection of classifiers, and we also discuss the subsequent bias/variance selection in the evaluation of performance parameters. While this study focuses on ML classifier selection based on statistical parameters, the findings are quite general and apply to any practical model selection problem involving ML classifiers in biomedical signal and data applications. The novelty of the suggested method lies in highlighting the effect of bias and variance on the choice of ML classifiers, especially for biomedical signal and data classification. To the best of our knowledge, very little work has been carried out on ML classifier selection based on bias and variance, and hence our suggested method ensures better performance in abnormality detection using biomedical signals and data.
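The over-fitting of a selection criterion described in the abstract can be illustrated with a small simulation. The sketch below is not the chapter's method: it is a minimal illustration under stated assumptions, and every name in it (selection_bias_demo, n_candidates, n_val, n_test) is hypothetical. Labels are fair coin flips and every candidate "classifier" guesses at random, so no candidate has real skill. Picking the candidate with the highest accuracy on a small validation set still produces a score well above chance, purely because of the variance of the validation estimate, while the same candidate falls back to about chance level on fresh data.

```python
import random


def accuracy(predictions, labels):
    """Fraction of positions where the prediction equals the label."""
    hits = sum(p == y for p, y in zip(predictions, labels))
    return hits / len(labels)


def selection_bias_demo(n_candidates=200, n_val=50, n_test=5000, seed=0):
    """Select the best of several skill-free candidates on a small validation set.

    Returns (validation accuracy of the selected candidate,
             accuracy of that same candidate on held-out data).
    """
    rng = random.Random(seed)
    n_total = n_val + n_test

    # Assumption: labels are fair coin flips, so the true accuracy of
    # every candidate below is exactly 0.5.
    labels = [rng.getrandbits(1) for _ in range(n_total)]
    val_labels, test_labels = labels[:n_val], labels[n_val:]

    best_val_acc, best_preds = -1.0, None
    for _ in range(n_candidates):
        # Hypothetical candidate: its predictions are pure random guesses.
        preds = [rng.getrandbits(1) for _ in range(n_total)]
        val_acc = accuracy(preds[:n_val], val_labels)
        if val_acc > best_val_acc:
            best_val_acc, best_preds = val_acc, preds

    # Score the selected candidate on data that played no part in selection.
    test_acc = accuracy(best_preds[n_val:], test_labels)
    return best_val_acc, test_acc


if __name__ == "__main__":
    val_acc, test_acc = selection_bias_demo()
    print(f"validation accuracy of selected candidate: {val_acc:.3f}")
    print(f"accuracy of same candidate on fresh data:  {test_acc:.3f}")
```

The gap between the two printed numbers is the optimistic bias introduced by selection. Increasing n_val reduces the variance of the validation estimate and shrinks the gap, which is one way to see why a low-variance selection criterion matters as much as an unbiased one.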

Author information

Authors and Affiliations

B. S Abdul Rahman Crescent Institute of Science and Technology, Chennai, India
Parnasree Chakraborty, S. Syed Rafiammal, C. Tharini & D. Najumnissa Jamal

Corresponding author

Correspondence to Parnasree Chakraborty.

Editor information

Editors and Affiliations

Kongunadu College of Engineering and Technology, Tholurpatti, India
R. Asokan
School of Information Technology and Telecommunications Engineering, University of Granada, Granada, Spain
Diego P. Ruiz
Deakin University, Waurn Ponds, VIC, Australia
Zubair A. Baig
Information Systems and Operations Management (ISOM), University of Florida, Gainesville, FL, USA
Selwyn Piramuthu

Ethics declarations

Conflict of Interest

The authors declare no conflict of interest.

About this paper

Cite this paper

Chakraborty, P., Rafiammal, S.S., Tharini, C., Jamal, D.N. (2022). Influence of Bias and Variance in Selection of Machine Learning Classifiers for Biomedical Applications. In: Asokan, R., Ruiz, D.P., Baig, Z.A., Piramuthu, S. (eds) Smart Data Intelligence. Algorithms for Intelligent Systems. Springer, Singapore. https://doi.org/10.1007/978-981-19-3311-0_39

DOI: https://doi.org/10.1007/978-981-19-3311-0_39
Published: 18 August 2022
Publisher Name: Springer, Singapore
Print ISBN: 978-981-19-3310-3
Online ISBN: 978-981-19-3311-0
eBook Packages: Intelligent Technologies and Robotics (R0)
