Machine Learning for Healthcare: Introduction

  • Shiwani Gupta
  • R. R. Sedamkar
Part of the Learning and Analytics in Intelligent Systems book series (LAIS, volume 13)


Machine Learning (ML) is an evolving area of research with many opportunities to explore. “It is the defining technology of this decade, though its impact on healthcare has been meagre,” says James Collin at MIT. Many of the ML industry’s young start-ups are devoting significant portions of their efforts to healthcare. Google has developed a machine learning algorithm to help identify cancerous tumours on mammograms. Stanford is using a Deep Learning algorithm to identify skin cancer. The US healthcare system generates approximately one trillion GB of data annually. Academic researchers have proposed differing numbers of features, and clinical researchers differing risk factors, for the identification of chronic diseases. More data means more knowledge for the machine to learn from, but a large number of features requires a large number of samples to achieve high accuracy. Hence, it would be better if machines could extract the medically high-risk factors themselves. Accuracy improves when the data is pre-processed through Exploratory Data Analysis and Feature Engineering. Multiclass classification would make it possible to assess different risk levels of a disease for a patient. In healthcare, correctly identifying the percentage of sick people (sensitivity) takes priority over correctly identifying the percentage of healthy people (specificity), so research should aim to increase the sensitivity of algorithms. This chapter presents an introduction to one of the most challenging and emerging applications of ML, namely healthcare. Patients will always need a caring and compassionate relationship with the people who deliver care. Machine Learning will not eliminate this, but will become a tool that clinicians use to improve ongoing care.
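
The distinction between sensitivity and specificity drawn in the abstract can be illustrated with a minimal Python sketch. The function name `sensitivity_specificity`, the label encoding (1 = sick, 0 = healthy) and the toy labels are assumptions made for illustration only; they are not taken from the chapter.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels.

    Assumed encoding (hypothetical): 1 = sick, 0 = healthy.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # sick, flagged sick
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # sick, missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # healthy, cleared
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # healthy, flagged sick

    # Sensitivity: share of sick patients correctly identified.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    # Specificity: share of healthy patients correctly identified.
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Toy data (invented for illustration): 4 sick and 6 healthy patients.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]

sens, spec = sensitivity_specificity(y_true, y_pred)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

In this toy case one of the four sick patients is missed, which is the kind of error the abstract argues should be reduced first, even at some cost in specificity.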


Keywords: Healthcare; Machine Learning; Feature Selection; Parameter Optimization; Diagnosis; Preprocessing



Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  1. Computer Engineering, Thakur College of Engineering and Technology, Mumbai, India
