Machine learning applied to multi-sensor information to reduce false alarm rate in the ICU


Studies report that the false alarm rate (FAR) of intensive care unit (ICU) vital-sign monitors ranges from 0.72 to 0.99. We applied machine learning (ML) to ICU multi-sensor information to imitate a medical specialist diagnosing a patient's condition. We hypothesized that applying this data-driven approach to medical monitors would help reduce the FAR even when data from some sensors are missing. An expert-based rules algorithm identified and tagged seven clinical alarm scenarios in our dataset. We trained a random forest (RF) ML model on the tagged data after deliberately removing parameters (e.g., heart rate or blood pressure), and compared its detection of ICU signals with the full expert-based rules (FER), our ground truth, and with partial expert-based rules (PER), which lack the same parameters. When all alarm scenarios were examined together, RF and FER were almost identical. However, with one to three parameters missing, RF maintained its Youden index (0.94–0.97) and positive predictive value (PPV) (0.98–0.99), whereas PER degraded (0.54–0.80 and 0.76–0.88, respectively). The FAR for PER with missing parameters was 0.17–0.39, but only 0.01–0.02 for RF. When scenarios were examined separately, RF showed clear superiority in almost all combinations of scenarios and numbers of missing parameters. When sensor data are missing, specialist performance worsens with the number of missing parameters, whereas the RF model attains high accuracy and a low FAR because it fuses information from the available sensors, compensating for the missing parameters.
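The measures quoted in the abstract can all be derived from a single confusion matrix. The following is a minimal sketch, not the authors' code: the class and function names are hypothetical, the counts are made up, and because the abstract does not state which definition of FAR is used, two common readings are computed side by side.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Alarm-level counts; field names are illustrative, not from the paper."""
    tp: int  # alarm raised, condition present
    fp: int  # alarm raised, condition absent
    tn: int  # no alarm, condition absent
    fn: int  # no alarm, condition present


def alarm_metrics(c: ConfusionCounts) -> dict:
    """Return standard diagnostic measures for one confusion matrix."""
    if min(c.tp + c.fn, c.tn + c.fp, c.tp + c.fp) == 0:
        raise ValueError("each denominator needs at least one case")
    sensitivity = c.tp / (c.tp + c.fn)
    specificity = c.tn / (c.tn + c.fp)
    ppv = c.tp / (c.tp + c.fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "youden_index": sensitivity + specificity - 1.0,
        "ppv": ppv,
        # Two common readings of "false alarm rate"; which one the
        # study reports is an assumption either way.
        "false_alarms_among_alarms": c.fp / (c.tp + c.fp),     # 1 - PPV
        "false_alarms_among_negatives": c.fp / (c.fp + c.tn),  # 1 - specificity
    }


# Made-up counts, for illustration only.
metrics = alarm_metrics(ConfusionCounts(tp=80, fp=10, tn=90, fn=20))
```

With these illustrative counts the sensitivity is 0.8 and the specificity is 0.9, so the Youden index is about 0.7. The two false-alarm readings differ (10/90 versus 10/100), which is why the definition matters when comparing reported values.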




  1.

    In this paper, “missingness” of data is used in two related senses. The first describes parameter values that were missing from our TAMC ICU database; as mentioned above, we used only data without missing values. The second describes parameters that have values in the dataset but that we deliberately deleted in some experiments, imitating missingness in the ICU, to check whether the RF can classify an alarm without relying on those parameters.

  2.

    When the RF is trained without a specific parameter, it is forced to find the best mapping from the remaining parameters onto the alarm annotation/tag (“alarm” vs. “no alarm”, or each of the seven clinical scenarios we identified). That is, the remaining parameters provide a classification rule that dispenses with the missing parameter; informally, we regard this behavior as “compensation” by the existing parameters for the missing one.
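The deliberate-deletion setup described in these notes can be sketched as follows. This is an illustration under stated assumptions rather than the study's pipeline: the helper names are hypothetical, the parameter list is invented apart from ARTBPM and CVP (which appear in Appendix A), and the records are made up. The training step itself is omitted; each reduced dataset would be passed to whatever classifier is being evaluated.

```python
from itertools import combinations

# Hypothetical parameter names; only ARTBPM and CVP appear in the source.
PARAMETERS = ("HR", "ARTBPM", "CVP", "SPO2")


def missing_parameter_sets(parameters, max_missing=3):
    """Yield every combination of 1..max_missing parameters to delete."""
    for k in range(1, max_missing + 1):
        yield from combinations(parameters, k)


def drop_parameters(records, missing):
    """Return copies of the records without the deliberately deleted parameters."""
    removed = set(missing)
    return [{name: value for name, value in rec.items() if name not in removed}
            for rec in records]


records = [
    {"HR": 88, "ARTBPM": 45, "CVP": 2, "SPO2": 97},
    {"HR": 72, "ARTBPM": 80, "CVP": 8, "SPO2": 99},
]  # made-up values, for illustration only

# One reduced dataset per combination of deleted parameters.
experiments = {missing: drop_parameters(records, missing)
               for missing in missing_parameter_sets(PARAMETERS)}
```

Because `drop_parameters` builds new dictionaries, the complete records stay available for the full-rules comparison while each experiment sees only the remaining parameters.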



Author information



Corresponding author

Correspondence to Yuval Bitan.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.




Appendix A: Full results for the first test

Unlike the RF model, whose output can be thresholded at different values, the FER/PER classification result is a single binary output determined by the model rules (e.g., ARTBPM < 50 mm Hg, CVP < 5 mm Hg, and CVP > −10 mm Hg deterministically indicate Hypovolemia). It is therefore impossible to calculate the area under the curve (AUC) for PER, and the corresponding cells below are left empty. Green and orange cells indicate better and worse results, respectively, for each of the missing parameters in each clinical alarm scenario.
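The contrast between a deterministic rule and a thresholdable score can be illustrated with a short sketch. The rule below encodes only the Hypovolemia thresholds quoted above; the function names, the sample values, and the scores are hypothetical, and the AUC is computed with a generic rank-based formula rather than whatever tooling the study used.

```python
def hypovolemia_rule(artbpm: float, cvp: float) -> bool:
    """Full rule quoted in the text: ARTBPM < 50, and -10 < CVP < 5 (mm Hg)."""
    return artbpm < 50 and -10 < cvp < 5


def auc(scores, labels):
    """Rank-based AUC: probability that a positive outscores a negative."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    # Ties between a positive and a negative count as half a win.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# The rule gives one fixed decision per sample ...
rule_outputs = [hypovolemia_rule(45, 2), hypovolemia_rule(60, 2),
                hypovolemia_rule(45, 7), hypovolemia_rule(45, -12)]

# ... whereas a classifier score can be cut at any threshold (made-up scores).
labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.6, 0.2]
score_auc = auc(scores, labels)
```

The rule yields a single sensitivity/specificity pair, so there is no curve to integrate, while the scored output can be ranked and summarized by an AUC.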

One missing parameter


Two missing parameters


Three missing parameters



Cite this article

Hever, G., Cohen, L., O’Connor, M.F. et al. Machine learning applied to multi-sensor information to reduce false alarm rate in the ICU. J Clin Monit Comput 34, 339–352 (2020).



  • False alarms
  • Intensive care unit
  • Machine learning
  • Missing data
  • Random forest