Real alerts and artifact classification in archived multi-signal vital sign monitoring data: implications for mining big data


Large hospital information system databases can be mined for knowledge discovery and decision support, but artifact in stored noninvasive vital sign (VS) high-frequency data streams limits their use. We used machine-learning (ML) algorithms trained on expert-labeled VS data streams to automatically classify VS alerts as real or artifact, thereby "cleaning" such data for future modeling. A total of 634 admissions to a step-down unit had recorded continuous noninvasive VS monitoring data [heart rate (HR), respiratory rate (RR), and peripheral arterial oxygen saturation (SpO2) at 1/20 Hz, plus noninvasive oscillometric blood pressure (BP)]. Periods in which VS data crossed stability thresholds defined VS event epochs. Data were divided into Block 1, the ML training/cross-validation set, and Block 2, the test set. Expert clinicians annotated Block 1 events as perceived real or artifact. After feature extraction, ML algorithms were trained to create and validate models that automatically classify events as real or artifact. The models were then tested on Block 2. Block 1 yielded 812 VS events, of which 214 (26%) were judged by experts to be artifact (RR 43%, SpO2 40%, BP 15%, HR 2%). ML algorithms applied to the Block 1 training/cross-validation set (tenfold cross-validation) gave area under the curve (AUC) scores of 0.97 for RR, 0.91 for BP, and 0.76 for SpO2. Performance on the Block 2 test data was AUC 0.94 for RR, 0.84 for BP, and 0.72 for SpO2. ML-defined algorithms applied to archived multi-signal continuous VS monitoring data allowed accurate automated classification of VS alerts as real or artifact, and could support data mining for future model building.
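As an illustration only, the two mechanical steps the abstract names (defining event epochs by threshold crossings and scoring a classifier by AUC) can be sketched in a few lines of Python. This is not the authors' implementation: the function names, the respiratory-rate thresholds of 10 and 40 breaths/min, the toy data, and the choice of "real" as the positive class are all assumptions made for the example, and the feature extraction and ML models themselves are not shown.

```python
from typing import List, Sequence, Tuple

SAMPLE_PERIOD_S = 20  # 1/20 Hz sampling, as stated in the abstract


def find_event_epochs(values: Sequence[float], low: float, high: float,
                      min_samples: int = 1) -> List[Tuple[int, int]]:
    """Return (start, end) index pairs, end exclusive, for each run of
    consecutive samples lying outside the closed range [low, high]."""
    epochs: List[Tuple[int, int]] = []
    start = None
    for i, v in enumerate(values):
        outside = v < low or v > high
        if outside and start is None:
            start = i                      # a new excursion begins
        elif not outside and start is not None:
            if i - start >= min_samples:
                epochs.append((start, i))  # excursion ended at sample i
            start = None
    if start is not None and len(values) - start >= min_samples:
        epochs.append((start, len(values)))  # excursion runs to end of data
    return epochs


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC computed as the probability that a randomly chosen positive
    (label 1) scores higher than a randomly chosen negative (label 0),
    counting ties as one half."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy respiratory-rate trace; thresholds are hypothetical, not from the paper.
rr = [16, 18, 42, 44, 41, 17, 16, 4, 15]
for start, end in find_event_epochs(rr, low=10, high=40):
    print(f"epoch samples {start}..{end - 1}, "
          f"duration {(end - start) * SAMPLE_PERIOD_S} s")

# Toy labels (1 = real alert, 0 = artifact) and made-up classifier scores.
print(auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]))
```

The pairwise AUC definition used here is equivalent to the area under the ROC curve; it is quadratic in the number of events, which is acceptable for a dataset on the order of the 812 events reported for Block 1.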





This study was funded with grant support from the United States National Institutes of Health, National Institute of Nursing Research RO1 NR13912 and the National Science Foundation NSF 1320347. The funding bodies approved the study design as submitted in the grant application proposal, but had no role in the data collection, analyses, or interpretation, manuscript preparation, or decision to submit the manuscript for publication.

Author information



Corresponding author

Correspondence to Marilyn Hravnak.

Ethics declarations

The study was approved by, and has current active approval from, the University of Pittsburgh Institutional Review Board (PRO12070002). All procedures performed in this study involving human participants were in accordance with the ethical standards of the institutional and/or national research committee and with the 1964 Helsinki declaration and its later amendments or comparable ethical standards. For this type of study formal consent is not required, and the study was approved with a consent waiver.

Conflict of interest

The authors have no commercial conflicts of interest to report.



About this article


Cite this article

Hravnak, M., Chen, L., Dubrawski, A. et al. Real alerts and artifact classification in archived multi-signal vital sign monitoring data: implications for mining big data. J Clin Monit Comput 30, 875–888 (2016).

Keywords


  • Physiologic monitoring
  • Machine learning
  • Archived data
  • Big-data
  • Vital signs
  • Artifact
  • Cardiorespiratory instability