
Joint Modeling of Chest Radiographs and Radiology Reports for Pulmonary Edema Assessment

  • Conference paper
  • First Online:
Medical Image Computing and Computer Assisted Intervention – MICCAI 2020 (MICCAI 2020)

Abstract

We propose and demonstrate a novel machine learning algorithm that assesses pulmonary edema severity from chest radiographs. While large publicly available datasets of chest radiographs and free-text radiology reports exist, only limited numerical edema severity labels can be extracted from the reports. This label scarcity poses a significant challenge for training image classification models. To take advantage of the rich information present in radiology reports, we develop a neural network model that is trained on both images and free text, yet assesses pulmonary edema severity from chest radiographs alone at inference time. Our experimental results suggest that joint image-text representation learning improves pulmonary edema assessment compared to a supervised model trained on images only. We also show that the text can be used to explain the image classifications made by the joint model. To the best of our knowledge, our approach is the first to leverage free-text radiology reports to improve image model performance in this application. Our code is available at: https://github.com/RayRuizhiLiao/joint_chestxray.
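A joint image-text representation of the kind the abstract describes is typically learned by encouraging matched image-report pairs to score higher than mismatched pairs. As an illustration only (the specific loss, margin, and encoders here are assumptions, not the paper's exact formulation; see the linked repository for the actual implementation), a minimal NumPy sketch of a bidirectional margin ranking loss over a batch of paired embeddings:

```python
import numpy as np

def ranking_loss(img_emb, txt_emb, margin=0.2):
    """Bidirectional hinge ranking loss; row i of each array is one matched pair."""
    # Cosine similarity matrix between L2-normalized embeddings.
    img = img_emb / np.linalg.norm(img_emb, axis=1, keepdims=True)
    txt = txt_emb / np.linalg.norm(txt_emb, axis=1, keepdims=True)
    sim = img @ txt.T                  # sim[i, j] = similarity(image_i, report_j)
    pos = np.diag(sim)                 # scores of the matched pairs
    # Penalize any mismatched pair that scores within `margin` of its match.
    cost_img = np.maximum(0.0, margin + sim - pos[:, None])  # image -> wrong report
    cost_txt = np.maximum(0.0, margin + sim - pos[None, :])  # report -> wrong image
    np.fill_diagonal(cost_img, 0.0)    # matched pairs incur no cost
    np.fill_diagonal(cost_txt, 0.0)
    return cost_img.sum() + cost_txt.sum()
```

When the matched embeddings coincide and mismatched ones are orthogonal, the loss is zero; shuffling the pairing makes it positive, which is the gradient signal that pulls each radiograph toward its own report in the shared space.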

G. Chauhan and R. Liao—Co-first authors.


Notes

  1. The numbers of images of the four severity levels are 2883, 1511, 1709, and 640, respectively.
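The counts in the note above show a skewed distribution across the four severity levels (the highest level has fewer than a quarter of the examples of the lowest). Purely as a sketch of one common remedy (the paper does not state here whether it reweights classes), inverse-frequency class weights for such a distribution can be computed as:

```python
# Class counts from the note: severity levels 0-3 (hypothetical weighting scheme).
counts = [2883, 1511, 1709, 640]
total = sum(counts)
# Inverse-frequency weights, normalized so a perfectly balanced dataset
# would give every class a weight of 1.
weights = [total / (len(counts) * c) for c in counts]
```

The rarest class (severity 3) receives the largest weight, so a weighted loss would penalize its misclassification most heavily.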


Acknowledgments

This work was supported in part by NIH NIBIB NAC P41EB015902, Wistron Corporation, Takeda, MIT Lincoln Lab, and Philips. We also thank Dr. Daniel Moyer for helping generate Fig. 1.

Author information


Corresponding author

Correspondence to Ruizhi Liao.


Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 433 KB)


Copyright information

© 2020 Springer Nature Switzerland AG

About this paper


Cite this paper

Chauhan, G. et al. (2020). Joint Modeling of Chest Radiographs and Radiology Reports for Pulmonary Edema Assessment. In: Martel, A.L., et al. Medical Image Computing and Computer Assisted Intervention – MICCAI 2020. MICCAI 2020. Lecture Notes in Computer Science, vol 12262. Springer, Cham. https://doi.org/10.1007/978-3-030-59713-9_51


  • DOI: https://doi.org/10.1007/978-3-030-59713-9_51

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-59712-2

  • Online ISBN: 978-3-030-59713-9

  • eBook Packages: Computer Science, Computer Science (R0)
