Applying machine learning algorithms to healthcare data can enhance patient care while reducing the cognitive load on healthcare workers. These algorithms can detect anomalous physiological readings, potentially leading to faster emergency response or new knowledge about how a health condition develops. However, while much research has assessed the performance of anomaly detection algorithms on well-known public datasets, there has been less conceptual comparison of unsupervised and supervised performance on physiological data. Moreover, although heart rate data are both ubiquitous and noninvasive to collect, little research has specifically addressed anomaly detection in this type of data. Because heart rate data are indicative of both potential health complications and an individual’s physical activity, they are a rich and largely overlooked source of information. To this end, we employed five machine learning algorithms, two unsupervised and three supervised, and evaluated their ability to detect anomalies in heart rate data. The algorithms were trained on simulated heart rate data and then evaluated on real heart rate data. Findings supported the effectiveness of the local outlier factor and random forest algorithms for heart rate anomaly detection, as both models generalized well from training on simulated heart rate data to real-world heart rate data. Furthermore, the results suggest that simulated data can help configure algorithms to a reasonable level of performance when real labeled data are not available, and that this type of learning might be especially helpful in the initial deployment of a system without prior data.
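As a rough illustration of the unsupervised side of this setup, the following minimal sketch computes local outlier factor (LOF) scores on simulated heart rate readings. It is not the authors' implementation: the simulation (resting rate drawn from a normal distribution around 70 bpm, with two injected readings at 30 and 180 bpm), the one-dimensional feature, the neighbourhood size `k=10`, and the helper names `k_nearest`, `lof_scores`, and `simulate_heart_rate` are all assumptions made for the example. The LOF variant is also simplified to use exactly `k` neighbours and ignore distance ties.

```python
import random


def k_nearest(values, i, k):
    """Return (distance, index) pairs for the k points closest to values[i]."""
    dists = sorted(
        (abs(values[i] - v), j) for j, v in enumerate(values) if j != i
    )
    return dists[:k]


def lof_scores(values, k=10):
    """Simplified local outlier factor for 1-D data (exactly k neighbours)."""
    n = len(values)
    if n <= k:
        raise ValueError("need more than k data points")
    neighbours = [k_nearest(values, i, k) for i in range(n)]
    # k-distance: distance from each point to its k-th nearest neighbour.
    k_dist = [nb[-1][0] for nb in neighbours]

    # Local reachability density: inverse of the mean reachability distance.
    lrd = []
    for i in range(n):
        reach = [max(k_dist[j], d) for d, j in neighbours[i]]
        mean_reach = sum(reach) / k
        lrd.append(1.0 / max(mean_reach, 1e-12))  # guard against duplicates

    # LOF: mean ratio of the neighbours' density to the point's own density.
    # Values near 1 suggest an inlier; values well above 1 suggest an outlier.
    return [
        sum(lrd[j] for _, j in neighbours[i]) / (k * lrd[i]) for i in range(n)
    ]


def simulate_heart_rate(n_normal=200, seed=0):
    """Hypothetical simulation: resting readings plus two injected anomalies."""
    rng = random.Random(seed)
    readings = [rng.gauss(70.0, 5.0) for _ in range(n_normal)]
    anomalies = [30.0, 180.0]  # assumed bradycardia- and tachycardia-like values
    return readings + anomalies, {n_normal, n_normal + 1}


readings, anomaly_idx = simulate_heart_rate()
scores = lof_scores(readings, k=10)

# Rank readings from most to least anomalous and inspect the top two.
ranked = sorted(range(len(scores)), key=scores.__getitem__, reverse=True)
top_two = set(ranked[:2])
print("Injected anomalies ranked highest:", top_two == anomaly_idx)
```

In practice, a library implementation such as scikit-learn's `LocalOutlierFactor` would likely be preferable to this hand-written version; the sketch uses only the standard library so that each step of the score is visible.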
The authors would like to thank both New Mexico State University and Electronic Caregiver for supporting this research. Additionally, the authors would like to thank Hannah Rich, Andrew Washburn, Morgan Beasley, and Taylor Bunker for their comments on the manuscript.
About this article
Cite this article
Šabić, E., Keeley, D., Henderson, B. et al. Healthcare and anomaly detection: using machine learning to predict anomalies in heart rate data. AI & Soc 36, 149–158 (2021). https://doi.org/10.1007/s00146-020-00985-1
- Anomaly detection
- Heart rate
- Machine learning