
AI-Assisted Annotator Using Reinforcement Learning


Machine learning in healthcare is often hindered by data that are both noisy and lacking reliable ground-truth labels. Moreover, cleaning and annotating these data is costly because, unlike in other domains, medical data annotation requires the work of skilled medical professionals. In this work, we introduce the use of reinforcement learning to mimic the decision-making process of annotators for medical events, allowing annotation and labeling to be automated. Our reinforcement learning agent learns to annotate health monitor alarm data from annotations made by an expert. We demonstrate the efficacy of our implementation on ICU critical alarm data sets and evaluate our algorithm against standard supervised machine learning and deep learning methods. Compared with SVM and LSTM methods, our method achieves the high sensitivity that is critical for alarm data, generalizes better across mixed downsampling, and preserves comparable model performance.
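The abstract does not specify the agent's architecture, state representation, or reward. As a rough illustration only, annotation can be framed as a one-step reinforcement learning problem: the agent sees an alarm's features, chooses a label, and is rewarded for agreeing with the expert. The sketch below assumes a logistic policy trained with REINFORCE, a hypothetical asymmetric reward that penalizes missed true alarms more heavily than false positives (to favor sensitivity), and synthetic one-feature data. All function names, the penalty value, and the data generator are assumptions for illustration, not the authors' implementation.

```python
import math
import random


def reward(action, expert_label, miss_penalty=5.0):
    """Hypothetical reward: +1 for agreeing with the expert; a missed true
    alarm costs more than a false positive (the asymmetry is an assumption)."""
    if action == expert_label:
        return 1.0
    return -miss_penalty if expert_label == 1 else -1.0


def prob_true_alarm(weights, features):
    """Logistic policy: probability of choosing action 1 (true alarm)."""
    z = sum(w * x for w, x in zip(weights, features))
    z = max(-30.0, min(30.0, z))  # clamp so exp() cannot overflow
    return 1.0 / (1.0 + math.exp(-z))


def train(samples, epochs=50, lr=0.05, seed=0):
    """One-step REINFORCE over (features, expert_label) pairs."""
    rng = random.Random(seed)
    weights = [0.0] * len(samples[0][0])
    for _ in range(epochs):
        for features, label in samples:
            p = prob_true_alarm(weights, features)
            action = 1 if rng.random() < p else 0  # sample from the policy
            r = reward(action, label)
            # For a logistic policy, d/dw log pi(action | x) = (action - p) * x.
            for i, x in enumerate(features):
                weights[i] += lr * r * (action - p) * x
    return weights


def annotate(weights, features):
    """Greedy annotation: label 1 if the policy favors 'true alarm'."""
    return 1 if prob_true_alarm(weights, features) >= 0.5 else 0


def sensitivity(predictions, labels):
    """True positive rate: TP / (TP + FN)."""
    tp = sum(1 for p, y in zip(predictions, labels) if p == 1 and y == 1)
    fn = sum(1 for p, y in zip(predictions, labels) if p == 0 and y == 1)
    return tp / (tp + fn) if (tp + fn) else 0.0


def make_synthetic_alarms(n=200, seed=1):
    """Synthetic stand-in for expert-annotated alarms: [bias, score] features."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        label = rng.randint(0, 1)
        score = rng.gauss(1.0 if label else -1.0, 0.5)
        samples.append(([1.0, score], label))
    return samples


if __name__ == "__main__":
    data = make_synthetic_alarms()
    w = train(data)
    preds = [annotate(w, x) for x, _ in data]
    print("sensitivity:", sensitivity(preds, [y for _, y in data]))
```

Raising `miss_penalty` shifts the learned decision boundary toward labeling more alarms as true, trading specificity for sensitivity. Real alarm data would need waveform-derived features and a held-out evaluation set rather than this synthetic generator.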





Author information



Corresponding author

Correspondence to V. Ratna Saripalli.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.


About this article


Cite this article

Saripalli, V.R., Pati, D., Potter, M. et al. AI-Assisted Annotator Using Reinforcement Learning. SN COMPUT. SCI. 1, 327 (2020).



  • False alarms
  • Reinforcement learning
  • Annotation