Healthcare and anomaly detection: using machine learning to predict anomalies in heart rate data


The application of machine learning algorithms to healthcare data can enhance patient care while also reducing the cognitive load on healthcare workers. These algorithms can be used to detect anomalous physiological readings, potentially leading to expedited emergency response or new knowledge about the development of a health condition. However, while much research has assessed the performance of anomaly detection algorithms on well-known public datasets, there is less conceptual comparison of unsupervised and supervised performance on physiological data. Moreover, although heart rate data are both ubiquitous and noninvasive to collect, little research has addressed anomaly detection specifically for this type of data. Because heart rate data are indicative of both potential health complications and an individual's physical activity, they are a rich and largely overlooked source of information. To this end, we employed five machine learning algorithms, two unsupervised and three supervised, and evaluated their ability to detect anomalies in heart rate data. The algorithms were then tested on real heart rate data. Findings supported the effectiveness of the local outlier factor and random forests algorithms for heart rate anomaly detection, as both models generalized well from training on simulated heart rate data to real-world heart rate data. Furthermore, the results suggest that simulated data can help configure algorithms to a useful degree of performance when real labeled data are not available, and that this type of learning might be especially helpful in the initial deployment of a system without prior data.
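To make the local outlier factor idea concrete, the following is a minimal sketch of how the score can be computed on one-dimensional heart rate values. It is not the authors' pipeline: the Gaussian model of resting heart rate (mean 70 bpm, standard deviation 5 bpm), the two injected readings, the neighbourhood size `k=10`, and the function name `local_outlier_factors` are all assumptions made for illustration. The sketch also scores a single simulated sample rather than training on simulated data and testing on real data as the study did.

```python
import random
import statistics


def local_outlier_factors(values, k=10):
    """Return the local outlier factor (LOF) of each value in a 1-D list."""
    n = len(values)
    # Pairwise absolute distances; O(n^2) memory, fine for a small sample.
    dist = [[abs(a - b) for b in values] for a in values]

    k_distance = []  # distance from each point to its k-th nearest neighbour
    neighbours = []  # indices of all points within that k-distance
    for i in range(n):
        others = sorted((dist[i][j], j) for j in range(n) if j != i)
        kd = others[k - 1][0]
        k_distance.append(kd)
        neighbours.append([j for d, j in others if d <= kd])

    # Local reachability density: inverse of the mean reachability distance,
    # where reach-dist(p, o) = max(k-distance(o), d(p, o)).
    lrd = []
    for i in range(n):
        reach = [max(k_distance[j], dist[i][j]) for j in neighbours[i]]
        # The small floor guards against division by zero on duplicate values.
        lrd.append(1.0 / max(statistics.fmean(reach), 1e-12))

    # LOF: average density of the neighbours relative to the point's own density.
    return [
        statistics.fmean([lrd[j] for j in neighbours[i]]) / lrd[i]
        for i in range(n)
    ]


# Hypothetical simulated data: resting heart rate plus two implausible readings.
rng = random.Random(0)
normal = [rng.gauss(70.0, 5.0) for _ in range(200)]
injected = [30.0, 180.0]
heart_rate = normal + injected

scores = local_outlier_factors(heart_rate, k=10)
top_two = sorted(range(len(scores)), key=scores.__getitem__, reverse=True)[:2]
print(sorted(heart_rate[i] for i in top_two))
```

Scores near 1 indicate a reading whose local density matches that of its neighbours, while scores well above 1 indicate a reading that sits in a much sparser region. In practice a flagging threshold would have to be chosen for the deployment at hand; none is implied by the abstract.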






The authors would like to thank both New Mexico State University and Electronic Caregiver for supporting this research. Additionally, the authors would like to thank Hannah Rich, Andrew Washburn, Morgan Beasley, and Taylor Bunker for their comments on the manuscript.

Author information




In addition to the specified contributions below, all authors met the ICMJE author criteria. The great majority of effort, including all analyses and drafting, was accomplished by Edin Šabić. The design of the work and revisions were primarily led by David Keeley. Additionally, interpretation of the data and revisions were also performed by Bailey Henderson. Lastly, substantial feedback during early drafting as well as subject matter knowledge concerning nursing and patient management was provided by Sara Nannemann.

Corresponding author

Correspondence to Edin Šabić.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Šabić, E., Keeley, D., Henderson, B. et al. Healthcare and anomaly detection: using machine learning to predict anomalies in heart rate data. AI & Soc 36, 149–158 (2021).



Keywords

  • Anomaly detection
  • Healthcare
  • Heart rate
  • Machine learning