Clinically Correct Report Generation from Chest X-Rays Using Templates

  • Conference paper
Machine Learning in Medical Imaging (MLMI 2021)

Abstract

We address the task of automatically generating a medical report from chest X-rays. Many authors have proposed deep learning models for this task, but they focus mainly on improving NLP metrics such as BLEU and CIDEr, which are not suitable for measuring the clinical correctness of reports. In this work, we propose CNN-TRG, a Template-based Report Generation model that detects a set of abnormalities and verbalizes them via fixed sentences. The model is much simpler than other state-of-the-art NLG methods and achieves better results in medical correctness metrics.
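The verbalization step described above can be illustrated with a minimal sketch. It assumes a classifier has already produced one probability per abnormality; the label names, template sentences, threshold, and the `generate_report` helper are all hypothetical and are not taken from the paper.

```python
# Hypothetical templates: the label names and sentences below are
# illustrative assumptions, not the ones used by CNN-TRG.
TEMPLATES = {
    "Cardiomegaly": ("The heart is enlarged.", "Heart size is normal."),
    "Pleural Effusion": ("There is pleural effusion.", "No pleural effusion."),
    "Pneumothorax": ("There is pneumothorax.", "No pneumothorax."),
}


def generate_report(probabilities, threshold=0.5):
    """Map per-abnormality probabilities to a report built from fixed sentences."""
    sentences = []
    for label, (present, absent) in TEMPLATES.items():
        # A label missing from the input is treated as absent (sketch assumption).
        is_present = probabilities.get(label, 0.0) >= threshold
        sentences.append(present if is_present else absent)
    return " ".join(sentences)


# Example classifier output (made-up values).
report = generate_report({"Cardiomegaly": 0.91, "Pleural Effusion": 0.12})
print(report)
```

Because every sentence is fixed in advance, the clinical content of the report depends only on the classifier's decisions, which is what makes this kind of model easy to evaluate with label-based metrics.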

We benchmark our model on the IU X-ray and MIMIC-CXR datasets against naive baselines as well as deep learning-based models, using the CheXpert labeler and MIRQI to evaluate clinical correctness and NLP metrics as a secondary evaluation. We also provide further evidence that traditional NLP metrics are not suitable for this task by showing their lack of robustness in multiple cases. We show that slightly altering a template-based model can increase NLP metrics considerably while maintaining high clinical performance. Our work contributes a simple but effective approach for chest X-ray report generation and supports a model evaluation focused primarily on clinical correctness metrics and secondarily on NLP metrics.
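A toy sketch can make the gap between word-overlap metrics and clinical correctness concrete. It is not the paper's evaluation: `unigram_precision` is a simplified stand-in for BLEU-1 (clipped unigram precision, no brevity penalty), the abnormality labels are assigned by hand rather than extracted with the CheXpert labeler, and the sentences are invented for illustration.

```python
from collections import Counter


def unigram_precision(candidate, reference):
    """Clipped unigram precision: a simplified stand-in for BLEU-1."""
    cand_tokens = candidate.lower().split()
    ref_counts = Counter(reference.lower().split())
    overlap = sum(
        min(count, ref_counts[token])
        for token, count in Counter(cand_tokens).items()
    )
    return overlap / len(cand_tokens) if cand_tokens else 0.0


def label_f1(predicted, truth):
    """F1 over binary abnormality labels (1 = present, 0 = absent)."""
    tp = sum(1 for k in truth if predicted.get(k) == 1 and truth[k] == 1)
    fp = sum(1 for k in truth if predicted.get(k) == 1 and truth[k] == 0)
    fn = sum(1 for k in truth if predicted.get(k) != 1 and truth[k] == 1)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Invented example sentences; labels are hand-assigned for this sketch.
reference = "there is a large pleural effusion on the right"
truth = {"Pleural Effusion": 1}

wrong = "there is no pleural effusion on the right"  # clinically wrong
wrong_labels = {"Pleural Effusion": 0}

paraphrase = "right sided pleural fluid collection is present"  # clinically right
paraphrase_labels = {"Pleural Effusion": 1}

print(unigram_precision(wrong, reference), label_f1(wrong_labels, truth))
print(unigram_precision(paraphrase, reference), label_f1(paraphrase_labels, truth))
```

In this toy case the clinically wrong sentence shares most of its words with the reference and scores higher on word overlap, while the label-based F1 ranks the two candidates in the clinically meaningful order.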

Notes

  1. https://pytorch.org/
  2. https://pdpino.github.io/clinically-correct
  3. https://openi.nlm.nih.gov/faq
  4. https://physionet.org/content/mimic-cxr-jpg/2.0.0/
  5. https://github.com/salaniz/pycocoevalcap
  6. https://github.com/stanfordmlgroup/chexpert-labeler
  7. https://github.com/xiaosongwang/MIRQI
  8. https://radreport.org/

Acknowledgments

This work was partially funded by ANID, Millennium Science Initiative Program, Code ICN17_002 and by ANID, Fondecyt grant 1191791.

Author information

Corresponding author

Correspondence to Pablo Pino.

1 Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 218 KB)

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Pino, P., Parra, D., Besa, C., Lagos, C. (2021). Clinically Correct Report Generation from Chest X-Rays Using Templates. In: Lian, C., Cao, X., Rekik, I., Xu, X., Yan, P. (eds) Machine Learning in Medical Imaging. MLMI 2021. Lecture Notes in Computer Science, vol 12966. Springer, Cham. https://doi.org/10.1007/978-3-030-87589-3_67

  • DOI: https://doi.org/10.1007/978-3-030-87589-3_67

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-87588-6

  • Online ISBN: 978-3-030-87589-3

  • eBook Packages: Computer Science (R0)
