Deep Text Prior: Weakly Supervised Learning for Assertion Classification

Artificial Neural Networks and Machine Learning – ICANN 2019: Workshop and Special Sessions (ICANN 2019)


The success of neural networks is typically attributed to their ability to closely mimic relationships between features and labels observed in the training dataset. This, however, is only part of the answer: in addition to being fit to data, neural networks have been shown to be useful priors on the conditional distribution of labels given features and can be used as such even in the absence of trustworthy training labels. This property can be harnessed to train high-quality models on low-quality training data in tasks for which large, high-quality ground-truth datasets do not exist. One such task is assertion classification in biomedical texts: discriminating between positive, negative and speculative statements about certain pathologies a patient may have. We present an assertion classification methodology based on recurrent neural networks, an attention mechanism and two flavours of transfer learning (language modelling and heuristic annotation) that achieves state-of-the-art results on MIMIC-CXR radiology reports.

  1. It’s C, B, C and probably C.

  2. Uzuner et al. [1] refer to this type as alter-assertion.

  3. Pun intended.


  1. Uzuner, Ö., Zhang, X., Sibanda, T.: Machine learning and rule-based approaches to assertion classification. J. Am. Med. Inform. Assoc. 16(1), 109–115 (2009)

  2. Goff, D.J., Loehfelm, T.W.: Automated radiology report summarization using an open-source natural language processing pipeline. J. Digit. Imaging 31(2), 185–192 (2018)

  3. Bodenreider, O.: The unified medical language system (UMLS): integrating biomedical terminology. Nucleic Acids Res. 32(suppl-1), D267–D270 (2004)

  4. Chute, C.G., et al.: Mayo clinical text analysis and knowledge extraction system (cTAKES): architecture, component evaluation and applications. J. Am. Med. Inform. Assoc. 17(5), 507–513 (2010)

  5. Soldaini, L., Goharian, N.: QuickUMLS: a fast, unsupervised approach for medical concept extraction. In: MedIR Workshop, SIGIR (2016)

  6. Aronson, A.R.: Effective mapping of biomedical text to the UMLS Metathesaurus: the MetaMap program. In: Proceedings of the AMIA Symposium, p. 17. American Medical Informatics Association (2001)

  7. Uzuner, Ö., South, B.R., Shen, S., DuVall, S.L.: 2010 i2b2/VA challenge on concepts, assertions, and relations in clinical text. J. Am. Med. Inform. Assoc. 18(5), 552–556 (2011)

  8. Miranda, E., Aryuni, M., Irwansyah, E.: A survey of medical image classification techniques. In: 2016 International Conference on Information Management and Technology (ICIMTech), pp. 56–61, November 2016

  9. Lai, M.: Deep learning for medical image segmentation. arXiv preprint arXiv:1505.02000 (2015)

  10. Litjens, G., et al.: A survey on deep learning in medical image analysis. Med. Image Anal. 42, 60–88 (2017)

  11. Johnson, A.E., et al.: MIMIC-CXR: a large publicly available database of labeled chest radiographs. arXiv preprint arXiv:1901.07042 (2019)

  12. Irvin, J., et al.: CheXpert: a large chest radiograph dataset with uncertainty labels and expert comparison. arXiv preprint arXiv:1901.07031 (2019)

  13. Rubin, J., Sanghavi, D., Zhao, C., Lee, K., Qadir, A., Xu-Wilson, M.: Large scale automated reading of frontal and lateral chest x-rays using dual convolutional neural networks. arXiv preprint arXiv:1804.07839 (2018)

  14. Chapman, W.W., Bridewell, W., Hanbury, P., Cooper, G.F., Buchanan, B.G.: A simple algorithm for identifying negated findings and diseases in discharge summaries. J. Biomed. Inform. 34(5), 301–310 (2001)

  15. Mehrabi, S., et al.: DEEPEN: a negation detection system for clinical text incorporating dependency relation into NegEx. J. Biomed. Inform. 54, 213–219 (2015)

  16. Enger, M., Velldal, E., Øvrelid, L.: An open-source tool for negation detection: a maximum-margin approach. In: Proceedings of the Workshop Computational Semantics Beyond Events and Roles, pp. 64–69 (2017)

  17. Peng, Y., Wang, X., Lu, L., Bagheri, M., Summers, R.M., Lu, Z.: NegBio: a high-performance tool for negation and uncertainty detection in radiology reports. CoRR abs/1712.05898 (2017)

  18. Shelmanov, A., Smirnov, I., Vishneva, E.: Information extraction from clinical texts in Russian. In: Computational Linguistics and Intellectual Technologies: Papers from the Annual International Conference “Dialogue”, vol. 14, pp. 537–549 (2015)

  19. Afzal, Z., Pons, E., Kang, N., Sturkenboom, M.C., Schuemie, M.J., Kors, J.A.: ContextD: an algorithm to identify contextual properties of medical terms in a Dutch clinical corpus. BMC Bioinform. 15(1), 373 (2014)

  20. Sleator, D.D., Temperley, D.: Parsing English with a link grammar. arXiv preprint cmp-lg/9508004 (1995)

  21. McCray, A.T., Srinivasan, S., Browne, A.C.: Lexical methods for managing variation in biomedical terminologies. In: Proceedings of the Annual Symposium on Computer Application in Medical Care, p. 235. American Medical Informatics Association (1994)

  22. Chen, D., Manning, C.: A fast and accurate dependency parser using neural networks. In: Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), pp. 740–750 (2014)

  23. Wu, S., et al.: Negation’s not solved: generalizability versus optimizability in clinical natural language processing. PLoS One 9(11), e112774 (2014)

  24. Apostolova, E., Tomuro, N., Demner-Fushman, D.: Automatic extraction of lexico-syntactic patterns for detection of negation and speculation scopes. In: Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies: Short Papers-Volume 2, pp. 283–287. Association for Computational Linguistics (2011)

  25. Zou, B., Zhou, G., Zhu, Q.: Tree kernel-based negation and speculation scope detection with structured syntactic parse features. In: Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing, pp. 968–976 (2013)

  26. Torralba, A., Efros, A.A.: Unbiased look at dataset bias (2011)

  27. de Bruijn, B., Cherry, C., Kiritchenko, S., Martin, J., Zhu, X.: NRC at i2b2: one challenge, three practical tasks, nine statistical systems, hundreds of clinical records, millions of useful features

  28. Clark, C., et al.: Determining assertion status for medical problems in clinical records

  29. Demner-Fushman, D., Apostolova, E., Islamaj Dogan, R., et al.: NLM’s system description for the fourth i2b2/VA challenge. In: Proceedings of the 2010 i2b2/VA Workshop on Challenges in Natural Language Processing for Clinical Data, Boston, MA, USA: i2b2 (2010)

  30. Cortes, C., Vapnik, V.: Support-vector networks. Mach. Learn. 20(3), 273–297 (1995)

  31. Zhou, Z.H.: A brief introduction to weakly supervised learning. Natl. Sci. Rev. 5(1), 44–53 (2017)

  32. Chapelle, O., Schölkopf, B., Zien, A.: Semi-Supervised Learning. Adaptive Computation and Machine Learning Series. MIT Press, Cambridge (2010)

  33. Bengio, Y., Courville, A., Vincent, P.: Representation learning: a review and new perspectives. IEEE Trans. Pattern Anal. Mach. Intell. 35(8), 1798–1828 (2013)

  34. Settles, B.: Active learning. Synth. Lect. Artif. Intell. Mach. Learn. 6(1), 1–114 (2012)

  35. Hanneke, S., et al.: Theory of disagreement-based active learning. Found. Trends® Mach. Learn. 7(2–3), 131–309 (2014)

  36. Zhang, C., Chaudhuri, K.: Beyond disagreement-based agnostic active learning. In: Advances in Neural Information Processing Systems, pp. 442–450 (2014)

  37. Jiang, L., Zhou, Z., Leung, T., Li, L.J., Fei-Fei, L.: MentorNet: learning data-driven curriculum for very deep neural networks on corrupted labels. arXiv preprint arXiv:1712.05055 (2017)

  38. Kumar, M.P., Packer, B., Koller, D.: Self-paced learning for latent variable models. In: Advances in Neural Information Processing Systems, pp. 1189–1197 (2010)

  39. Jiang, L., Meng, D., Zhao, Q., Shan, S., Hauptmann, A.G.: Self-paced curriculum learning. In: Twenty-Ninth AAAI Conference on Artificial Intelligence (2015)

  40. Bojanowski, P., Grave, E., Joulin, A., Mikolov, T.: Enriching word vectors with subword information. CoRR abs/1607.04606 (2016)

  41. Peters, M.E., et al.: Deep contextualized word representations. arXiv preprint arXiv:1802.05365 (2018)

  42. Chelba, C., et al.: One billion word benchmark for measuring progress in statistical language modeling. arXiv preprint arXiv:1312.3005 (2013)

  43. Vaswani, A., et al.: Attention is all you need. CoRR abs/1706.03762 (2017)

  44. Hochreiter, S., Schmidhuber, J.: Long short-term memory. Neural Comput. 9(8), 1735–1780 (1997)

  45. Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization. CoRR abs/1412.6980 (2014)

  46. Glorot, X., Bengio, Y.: Understanding the difficulty of training deep feedforward neural networks. In: Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics, pp. 249–256 (2010)

  47. Varma, S., Simon, R.: Bias in error estimation when using cross-validation for model selection. BMC Bioinform. 7(1), 91 (2006)

  48. Sigurd, B., Eeg-Olofsson, M., Van De Weijer, J.: Word length, sentence length and frequency - Zipf revisited. Studia Linguistica 58(1), 37–52 (2004)

  49. Ulyanov, D., Vedaldi, A., Lempitsky, V.: Deep image prior. In: Proceedings of the IEEE International Conference on Computer Vision and Pattern Recognition (2018)


The authors would like to acknowledge Artem Shelmanov and Ilya Sochenkov for sharing their expertise in natural language processing, mentorship and support.

Author information


Corresponding author

Correspondence to Vadim Liventsev.

A Appendix: F1 Scores

Throughout the paper, we use accuracy as the metric for our results. For completeness' sake, micro-averaged F1 scores are reported here (Tables 7, 8 and 9).

Table 7. F1 scores on MIMIC-CXR-FREQ
Table 8. F1 scores on MIMIC-CXR-LONG
Table 9. F1 scores on I2B2 Challenge
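
For readers unfamiliar with the metric, the following is a minimal Python sketch of micro-averaged F1 next to accuracy. It is an illustration only and not the evaluation code behind Tables 7, 8 and 9: the function names, the optional `labels` argument and the toy label sequences are all assumptions. Micro-averaging pools true positives, false positives and false negatives over the chosen classes before computing a single F1 value.

```python
def micro_f1(y_true, y_pred, labels=None):
    """Micro-averaged F1: pool TP/FP/FN over classes, then compute one F1.

    `labels` (hypothetical parameter) restricts which classes are pooled;
    by default every class seen in y_true or y_pred is included.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if labels is None:
        labels = set(y_true) | set(y_pred)
    tp = fp = fn = 0
    for label in labels:
        for t, p in zip(y_true, y_pred):
            if p == label and t == label:
                tp += 1          # predicted this class, and it was correct
            elif p == label:
                fp += 1          # predicted this class, but truth differs
            elif t == label:
                fn += 1          # truth is this class, but it was missed
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def accuracy(y_true, y_pred):
    """Fraction of examples whose predicted label matches the true label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true) if y_true else 0.0


# Toy assertion labels, invented for illustration only.
y_true = ["positive", "negative", "speculative", "negative", "positive"]
y_pred = ["positive", "negative", "negative", "negative", "speculative"]

print(micro_f1(y_true, y_pred))
print(accuracy(y_true, y_pred))
print(micro_f1(y_true, y_pred, labels={"negative", "speculative"}))
```

When every example carries exactly one label and all classes are pooled, each misclassification counts once as a false positive and once as a false negative, so micro-averaged F1 coincides with accuracy. The two diverge when only a subset of classes is pooled, as in the last call. Which classes are pooled in the tables above is not stated in this appendix.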

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Liventsev, V., Fedulova, I., Dylov, D. (2019). Deep Text Prior: Weakly Supervised Learning for Assertion Classification. In: Tetko, I., Kůrková, V., Karpov, P., Theis, F. (eds) Artificial Neural Networks and Machine Learning – ICANN 2019: Workshop and Special Sessions. ICANN 2019. Lecture Notes in Computer Science, vol 11731. Springer, Cham.

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-30492-8

  • Online ISBN: 978-3-030-30493-5

  • eBook Packages: Computer Science, Computer Science (R0)
