Abstract
Many healthcare applications would benefit significantly from processing and analyzing multi-modal data. In this paper, we propose a novel multi-task, multi-modal, and multi-attention framework to learn and align information from multiple medical sources. Based on experiments on a public medical dataset, we show that combining features from images (e.g., x-rays) and texts (e.g., clinical reports), and sharing information among different tasks (e.g., x-ray classification, autoencoding, and diagnosis generation) and across domains, boosts the performance of diagnosis generation (86.0% in terms of BLEU@4).
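For readers unfamiliar with the metric, the following is a minimal sketch of how a BLEU@4 score can be computed for one generated diagnosis against one reference report. It is an illustration only: the function names `ngrams` and `bleu4` and the example sentences are hypothetical, and the evaluation protocol behind the reported 86.0% (tokenization, corpus-level aggregation, smoothing) is not stated in the abstract, so this sketch should not be read as the authors' evaluation code.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count all n-grams (as tuples) in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu4(candidate, reference):
    """Sentence-level BLEU@4 against a single reference, without smoothing.

    Assumption: both arguments are already tokenized lists of strings.
    """
    if not candidate:
        return 0.0
    log_precisions = []
    for n in range(1, 5):
        cand_counts = ngrams(candidate, n)
        ref_counts = ngrams(reference, n)
        # Clipped counts: a candidate n-gram is credited at most as many
        # times as it occurs in the reference.
        overlap = sum(min(count, ref_counts[gram])
                      for gram, count in cand_counts.items())
        total = sum(cand_counts.values())
        if total == 0 or overlap == 0:
            # Without smoothing, any zero precision makes the score zero.
            return 0.0
        log_precisions.append(math.log(overlap / total))
    # Brevity penalty: penalize candidates that are not longer than the reference.
    c, r = len(candidate), len(reference)
    brevity_penalty = 1.0 if c > r else math.exp(1 - r / c)
    # Geometric mean of the four modified precisions, uniform weights.
    return brevity_penalty * math.exp(sum(log_precisions) / 4)


# Hypothetical example sentences, not taken from the dataset.
reference = "the heart is normal in size and contour".split()
candidate = "the heart is normal in size".split()
score = bleu4(candidate, reference)
print(f"BLEU@4 = {score:.4f}")
```

In this example every n-gram of the candidate appears in the reference, so the score is determined entirely by the brevity penalty for the shorter output. Published results are usually corpus-level scores, which pool n-gram counts over all test reports before taking the ratio rather than averaging per-sentence scores.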
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Tian, J., Zhong, C., Shi, Z., Xu, F. (2019). Towards Automatic Diagnosis from Multi-modal Medical Data. In: Suzuki, K., et al. Interpretability of Machine Intelligence in Medical Image Computing and Multimodal Learning for Clinical Decision Support. ML-CDS 2019, IMIMIC 2019. Lecture Notes in Computer Science, vol 11797. Springer, Cham. https://doi.org/10.1007/978-3-030-33850-3_8
DOI: https://doi.org/10.1007/978-3-030-33850-3_8
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-33849-7
Online ISBN: 978-3-030-33850-3
eBook Packages: Computer Science, Computer Science (R0)