A Neural Attention Model for Categorizing Patient Safety Events

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10193)


Patient Safety Event reports are narratives describing potential adverse events to patients and are important for identifying and preventing medical errors. We present a neural network architecture for identifying the type of safety event, which is the first step in understanding these narratives. Our proposed model uses soft neural attention to improve the effectiveness of encoding long sequences. Empirical results on two large-scale, real-world datasets of patient safety reports demonstrate the effectiveness of our method, with significant improvements over existing methods.
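The core idea the abstract describes, soft attention pooling over an encoded sequence, can be sketched in a few lines. The following is a generic illustration (in NumPy, not the authors' implementation): each encoder hidden state is scored against a learned attention vector, the scores are softmaxed into weights, and the weighted sum yields a fixed-size document vector for classification. The shapes and the name `soft_attention_pool` are illustrative assumptions.

```python
import numpy as np

def soft_attention_pool(hidden_states, attn_vector):
    """Collapse encoder hidden states H of shape (T, d) into one
    document vector: score each timestep against attn_vector,
    softmax the scores into weights, return the weighted sum."""
    scores = hidden_states @ attn_vector            # (T,) one score per timestep
    scores = scores - scores.max()                  # shift for numerical stability
    weights = np.exp(scores) / np.exp(scores).sum() # softmax: weights sum to 1
    doc_vector = weights @ hidden_states            # (d,) attention-weighted average
    return doc_vector, weights

# Toy example: a 4-timestep sequence with 3-dimensional hidden states.
rng = np.random.default_rng(0)
H = rng.standard_normal((4, 3))   # stand-in for RNN encoder outputs
w = rng.standard_normal(3)        # stand-in for a learned attention vector
doc_vec, alpha = soft_attention_pool(H, w)
```

Because the weights form a distribution over timesteps, long sequences are summarized without the information bottleneck of using only the final hidden state; timesteps the model scores highly contribute more to the document vector.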


Keywords: Deep learning · Text categorization · Medical text



We thank the three anonymous reviewers for their helpful comments. This project was funded under grant number R01 HS023701-02 from the Agency for Healthcare Research and Quality (AHRQ), U.S. Department of Health and Human Services. The opinions expressed in this document are those of the authors and do not necessarily reflect the official position of AHRQ or the U.S. Department of Health and Human Services.



Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  1. Georgetown University, Washington DC, USA
  2. National Center for Human Factors in Healthcare, MedStar Health, Washington DC, USA
