Abstract
The application of machine learning algorithms to healthcare data can enhance patient care while also reducing the cognitive load on healthcare workers. These algorithms can be used to detect anomalous physiological readings, potentially leading to expedited emergency response or new knowledge about the development of a health condition. However, while much research has assessed the performance of anomaly detection algorithms on well-known public datasets, there has been less comparison of unsupervised and supervised approaches on physiological data. Moreover, while heart rate data are both ubiquitous and noninvasive, little research has specifically addressed anomaly detection in this type of data. Considering that heart rate data are indicative of both potential health complications and an individual’s physical activity, they are a rich and largely overlooked source of information. To this end, we evaluated five machine learning algorithms, two unsupervised and three supervised, on their ability to detect anomalies in heart rate data. The algorithms were trained on simulated heart rate data and then evaluated on real heart rate data. Findings supported the effectiveness of the local outlier factor and random forest algorithms in the task of heart rate anomaly detection, as each model generalized well from its training on simulated heart rate data to real-world heart rate data. Furthermore, the results suggest that simulated data can help configure algorithms to a reasonable level of performance when real labeled data are not available, and that this type of learning might be especially helpful in the initial deployment of a system without prior data.
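To make the simulated-to-real workflow concrete, the following is a minimal sketch of a local outlier factor detector (after Breunig et al. 2000) fitted on simulated heart rates and then used to score new readings. It is an illustration only, not the authors' implementation: the class name SimpleLOF, the helper k_nearest, the Gaussian simulation (mean 72 bpm, standard deviation 8 bpm), the neighbourhood size k = 10, and the two example readings are all assumptions chosen for the example.

```python
import random


def k_nearest(x, data, k, skip_index=None):
    """Return the k smallest (distance, index) pairs between x and data."""
    dists = [(abs(x - v), i) for i, v in enumerate(data) if i != skip_index]
    dists.sort()
    return dists[:k]


class SimpleLOF:
    """Hypothetical 1-D local outlier factor scorer (illustrative only)."""

    def __init__(self, k=10):
        self.k = k

    def fit(self, train):
        self.train = list(train)
        # Neighbours of each training point, excluding the point itself.
        self.neigh = [
            k_nearest(v, self.train, self.k, skip_index=i)
            for i, v in enumerate(self.train)
        ]
        # k-distance: distance to the k-th nearest neighbour.
        self.kdist = [n[-1][0] for n in self.neigh]
        # Local reachability density of each training point.
        self.lrd = [self._lrd(n) for n in self.neigh]
        return self

    def _lrd(self, neighbours):
        # reach-dist(p, o) = max(k-distance(o), d(p, o))
        reach = [max(self.kdist[j], d) for d, j in neighbours]
        mean_reach = sum(reach) / len(reach)
        # Guard against division by zero when many readings are identical.
        return 1.0 / max(mean_reach, 1e-12)

    def score(self, x):
        """LOF of a new reading relative to the fitted training data."""
        neighbours = k_nearest(x, self.train, self.k)
        lrd_x = self._lrd(neighbours)
        mean_lrd = sum(self.lrd[j] for _, j in neighbours) / len(neighbours)
        return mean_lrd / lrd_x


# Assumed simulation of resting heart rates in beats per minute.
rng = random.Random(0)
simulated_bpm = [rng.gauss(72.0, 8.0) for _ in range(300)]

lof = SimpleLOF(k=10).fit(simulated_bpm)

# Two made-up readings standing in for real data: one typical, one extreme.
readings = [75.0, 180.0]
scores = {bpm: lof.score(bpm) for bpm in readings}
for bpm, s in scores.items():
    print(f"{bpm:.0f} bpm -> LOF {s:.2f}")
```

A score close to 1 means the reading lies in a region about as dense as its neighbours, while a markedly larger score means it lies in a much sparser region and is a candidate anomaly. The cut-off that separates the two is a tuning choice that this sketch deliberately leaves open.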
Acknowledgement
The authors would like to thank both New Mexico State University and Electronic Caregiver for supporting this research. Additionally, the authors would like to thank Hannah Rich, Andrew Washburn, Morgan Beasley, and Taylor Bunker for their comments on the manuscript.
Author information
Contributions
In addition to the specified contributions below, all authors met the ICMJE author criteria. The great majority of effort, including all analyses and drafting, was accomplished by Edin Šabić. The design of the work and revisions were primarily led by David Keeley. Additionally, interpretation of the data and revisions were also performed by Bailey Henderson. Lastly, substantial feedback during early drafting as well as subject matter knowledge concerning nursing and patient management was provided by Sara Nannemann.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Šabić, E., Keeley, D., Henderson, B. et al. Healthcare and anomaly detection: using machine learning to predict anomalies in heart rate data. AI & Soc 36, 149–158 (2021). https://doi.org/10.1007/s00146-020-00985-1
DOI: https://doi.org/10.1007/s00146-020-00985-1