Abstract
Clinical decision-making in oncology draws on multi-modal information, such as morphological information from histopathology and molecular profiles from genomics. Most existing multi-modal learning models achieve better performance than single-modal models. However, these multi-modal models focus only on the interactive information between modalities and ignore the internal relationship between multiple tasks. Both the survival analysis task and the tumor grading task can provide reliable information for pathologists in the diagnosis and prognosis of cancer. In this work, we present a Multi-modal and Multi-task Fusion (\(\mathrm {M^{2}F}\)) model that exploits the potential connections between modalities and between tasks. The co-attention module in the multi-modal transformer extractor mines the intrinsic information between modalities more effectively than earlier fusion methods. Training the tumor grading branch and the survival analysis branch jointly, instead of separately, makes full use of the complementary information between tasks and improves the performance of the model. We validate our \(\mathrm {M^{2}F}\) model on glioma datasets from The Cancer Genome Atlas (TCGA). Experimental results show that our \(\mathrm {M^{2}F}\) model is superior to existing multi-modal models, demonstrating its effectiveness.
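The two ideas in the abstract — co-attention between modality embeddings and a jointly weighted multi-task objective — can be sketched in a few lines. The code below is a minimal illustration, not the authors' implementation: the attention form (genomic embeddings querying histology patch embeddings), the Cox-style survival loss, and the weighting parameter `alpha` are all assumptions made for exposition.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def co_attention(genomic, histology, d_k):
    """One co-attention step: genomic tokens (n_g, d) attend over
    histology patch embeddings (n_h, d); returns fused (n_g, d)."""
    scores = genomic @ histology.T / np.sqrt(d_k)   # (n_g, n_h) similarity
    attn = softmax(scores, axis=-1)                 # rows sum to 1
    return attn @ histology                         # histology info routed to genomic tokens

def joint_loss(hazard, event, time, grade_logits, grade_label, alpha=0.5):
    """Weighted sum of a negative Cox partial log-likelihood (survival
    branch) and cross-entropy (grading branch). alpha is illustrative."""
    order = np.argsort(-time)                       # descending survival time
    h, e = hazard[order], event[order]
    log_risk = np.log(np.cumsum(np.exp(h)))         # log of risk-set sums
    cox = -np.sum((h - log_risk) * e) / max(e.sum(), 1)
    p = softmax(grade_logits, axis=-1)
    ce = -np.mean(np.log(p[np.arange(len(grade_label)), grade_label] + 1e-12))
    return alpha * cox + (1 - alpha) * ce
```

Joint training then simply backpropagates `joint_loss` through both branch heads and the shared fusion trunk, so gradients from grading regularize the survival representation and vice versa.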
Acknowledgement
This work was supported in part by the Natural Science Foundation of Ningbo City, China, under Grant 2021J052, in part by the National Natural Science Foundation of China under Grant 62171377, and in part by the Key Research and Development Program of Shaanxi Province under Grant 2022GY-084.
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Lu, Z., Lu, M., Xia, Y. (2022). \(\mathrm {M^{2}F}\): A Multi-modal and Multi-task Fusion Network for Glioma Diagnosis and Prognosis. In: Li, X., Lv, J., Huo, Y., Dong, B., Leahy, R.M., Li, Q. (eds) Multiscale Multimodal Medical Imaging. MMMI 2022. Lecture Notes in Computer Science, vol 13594. Springer, Cham. https://doi.org/10.1007/978-3-031-18814-5_1
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-18813-8
Online ISBN: 978-3-031-18814-5