Studies report that the false alarm rate (FAR) of intensive care unit (ICU) vital signs monitors ranges from 0.72 to 0.99. We applied machine learning (ML) to ICU multi-sensor information to imitate a medical specialist in diagnosing patient condition. We hypothesized that applying this data-driven approach to medical monitors would help reduce the FAR even when data from sensors are missing. An expert-based rules algorithm identified and tagged seven clinical alarm scenarios in our dataset. We trained a random forest (RF) ML model on the tagged data, from which parameters (e.g., heart rate or blood pressure) were deliberately removed, and compared it in detecting ICU signals with the full expert-based rules (FER), our ground truth, and with the partial expert-based rules (PER), which were missing the same parameters. When all alarm scenarios were examined together, RF and FER were almost identical. However, in the absence of one to three parameters, RF maintained its values of the Youden index (0.94–0.97) and positive predictive value (PPV) (0.98–0.99), whereas the values for PER dropped (0.54–0.8 and 0.76–0.88, respectively). While the FAR for PER with missing parameters was 0.17–0.39, it was only 0.01–0.02 for RF. When scenarios were examined separately, RF showed clear superiority in almost all combinations of scenarios and numbers of missing parameters. When sensor data are missing, specialist performance worsens with the number of missing parameters, whereas the RF model attains high accuracy and low FAR because it can fuse information from the available sensors, compensating for the missing parameters.
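For readers unfamiliar with the measures quoted above, the following minimal Python sketch shows how they are commonly computed from confusion-matrix counts. The Youden index and PPV follow their standard definitions. The abstract does not define the FAR, so the sketch gives two common candidates (the share of raised alarms that are false, and the share of non-alarm cases that trigger an alarm) without claiming which one the study used. The class name, function names, and counts are illustrative assumptions, not data from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    """Counts of true/false positives and negatives for a binary alarm detector."""
    tp: int
    fp: int
    tn: int
    fn: int


def sensitivity(c: Confusion) -> float:
    return c.tp / (c.tp + c.fn)


def specificity(c: Confusion) -> float:
    return c.tn / (c.tn + c.fp)


def youden_index(c: Confusion) -> float:
    # Standard definition: sensitivity + specificity - 1.
    return sensitivity(c) + specificity(c) - 1.0


def ppv(c: Confusion) -> float:
    # Positive predictive value: share of raised alarms that are true.
    return c.tp / (c.tp + c.fp)


def false_discovery_rate(c: Confusion) -> float:
    # Candidate FAR definition 1 (assumption): share of raised alarms that are false.
    return c.fp / (c.tp + c.fp)


def false_positive_rate(c: Confusion) -> float:
    # Candidate FAR definition 2 (assumption): share of non-alarm cases that alarm.
    return c.fp / (c.fp + c.tn)


# Made-up counts, for illustration only.
example = Confusion(tp=90, fp=10, tn=80, fn=20)
```

Under the first candidate definition the FAR is simply 1 − PPV, so a detector with a high PPV necessarily has a low FAR in that sense.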
In this paper, “missingness” of data is considered in two similar contexts. The first describes parameter values that were missing from our TAMC ICU database; as mentioned above, we used only data without missing values. The second describes parameters that have values in the dataset but that we deliberately deleted in some of the experiments, imitating such a missingness situation in the ICU, to check the ability of the RF to classify an alarm without relying on those parameters.
When the RF is trained without a specific parameter, it is forced to find the best mapping from the remaining parameters onto the alarm annotation/tag (“alarm” vs. “no alarm”, or each of the seven clinical scenarios we identified). That is, the remaining parameters provide a classification rule that dispenses with the missing parameter; informally, we regard this behavior as the existing parameters “compensating” for the missing one.
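The idea of “compensation” can be illustrated with a small, self-contained Python sketch. It is not the model used in the study: to avoid external dependencies it uses bagged decision stumps as a simplified stand-in for a random forest, and it is trained on purely synthetic records in which heart rate (HR) is made to correlate with the alarm label by construction. The column names ARTBPM, CVP and HR, the value ranges, and all function names are assumptions made for this illustration only.

```python
import random

FEATURES = ["ARTBPM", "CVP", "HR"]  # hypothetical column names for this sketch


def make_records(n, seed):
    """Synthetic records: alarm cases have low ARTBPM and high HR by construction."""
    rng = random.Random(seed)
    rows, labels = [], []
    for _ in range(n):
        alarm = rng.random() < 0.3
        if alarm:
            rows.append({"ARTBPM": rng.uniform(30, 49),
                         "CVP": rng.uniform(-5, 5),
                         "HR": rng.uniform(110, 150)})
        else:
            rows.append({"ARTBPM": rng.uniform(55, 100),
                         "CVP": rng.uniform(-5, 15),
                         "HR": rng.uniform(60, 105)})
        labels.append(int(alarm))
    return rows, labels


def fit_stump(rows, labels, features):
    """Return (feature, threshold, label_if_below) with the fewest training errors."""
    best = None
    for f in features:
        values = sorted({r[f] for r in rows})
        # Candidate thresholds are midpoints between consecutive observed values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            for below in (0, 1):
                errors = sum((below if r[f] < t else 1 - below) != y
                             for r, y in zip(rows, labels))
                if best is None or errors < best[0]:
                    best = (errors, f, t, below)
    return best[1:]


def fit_bagged_stumps(rows, labels, features, n_trees=9, seed=0):
    """Train each stump on a bootstrap resample of the training records."""
    rng = random.Random(seed)
    n = len(rows)
    stumps = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        stumps.append(fit_stump([rows[i] for i in idx],
                                [labels[i] for i in idx],
                                features))
    return stumps


def predict(stumps, row):
    """Majority vote over the stumps; returns 1 for alarm, 0 for no alarm."""
    votes = sum(below if row[f] < t else 1 - below for f, t, below in stumps)
    return int(2 * votes > len(stumps))


def accuracy(stumps, rows, labels):
    return sum(predict(stumps, r) == y for r, y in zip(rows, labels)) / len(rows)


train_rows, train_y = make_records(120, seed=1)
test_rows, test_y = make_records(120, seed=2)

# ARTBPM is deliberately withheld, so the model can only use CVP and HR.
remaining = [f for f in FEATURES if f != "ARTBPM"]
model = fit_bagged_stumps(train_rows, train_y, remaining)
acc_without_artbpm = accuracy(model, test_rows, test_y)
```

Because the synthetic HR values separate the two classes, the ensemble trained without ARTBPM still classifies the held-out records well. This mirrors the informal notion of compensation described above, under the strong assumption that a remaining parameter carries the withheld information.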
Conflict of interest
The authors declare that they have no conflict of interest.
Appendix A: Full results for the first test
Unlike the RF model, whose output can be thresholded at different values, the FER/PER classification is a single binary output produced by the rules (e.g., ARTBPM < 50 mm Hg, CVP < 5 mm Hg, and CVP > −10 mm Hg deterministically indicate hypovolemia). It is therefore impossible to calculate the area under the curve (AUC) for PER, and the corresponding cells below are left empty. Green and orange cells indicate better and worse results, respectively, for each of the missing parameters in each clinical alarm scenario.
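The difference between a binary rule and a thresholdable score can be sketched in Python as follows. The hypovolemia thresholds are those quoted above. How the partial rules treat an unavailable parameter is not stated in this appendix, so the sketch assumes that the corresponding condition is simply skipped. The records, the score values, and the function names are made up for illustration.

```python
def hypovolemia_rule(record, available=("ARTBPM", "CVP")):
    """Binary rule with the thresholds quoted in the appendix.

    Assumption: conditions on parameters not listed in `available` are skipped.
    """
    checks = []
    if "ARTBPM" in available:
        checks.append(record["ARTBPM"] < 50)
    if "CVP" in available:
        checks.append(-10 < record["CVP"] < 5)
    if not checks:
        raise ValueError("at least one parameter must be available")
    return all(checks)


def roc_points(scores, labels, thresholds):
    """Return one (false positive rate, true positive rate) pair per threshold."""
    pos = sum(labels)
    neg = len(labels) - pos
    points = []
    for t in thresholds:
        preds = [s >= t for s in scores]
        tp = sum(p and y for p, y in zip(preds, labels))
        fp = sum(p and not y for p, y in zip(preds, labels))
        points.append((fp / neg, tp / pos))
    return points


# Made-up records, for illustration only.
records = [
    {"ARTBPM": 45, "CVP": 2},
    {"ARTBPM": 70, "CVP": 3},
    {"ARTBPM": 48, "CVP": 8},
    {"ARTBPM": 80, "CVP": 12},
]

full = [hypovolemia_rule(r) for r in records]                          # full rule
partial = [hypovolemia_rule(r, available=("CVP",)) for r in records]   # ARTBPM missing

# A binary rule yields a single operating point, so there is no curve to integrate.
rule_points = roc_points([float(p) for p in partial], full, thresholds=[0.5])

# A graded score (hypothetical values) yields one point per threshold.
score_points = roc_points([0.9, 0.4, 0.2, 0.1], full, thresholds=[0.3, 0.5])
```

In this toy example the partial rule raises an alarm for the second record, which the full rule rejects. Its output collapses to one (false positive rate, true positive rate) pair, whereas the graded score traces a different point for each threshold.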
One missing parameter
Two missing parameters
Three missing parameters
Cite this article
Hever, G., Cohen, L., O’Connor, M.F. et al. Machine learning applied to multi-sensor information to reduce false alarm rate in the ICU. J Clin Monit Comput 34, 339–352 (2020). https://doi.org/10.1007/s10877-019-00307-x
- False alarms
- Intensive care unit
- Machine learning
- Missing data
- Random forest