
TMSS: An End-to-End Transformer-Based Multimodal Network for Segmentation and Survival Prediction

  • Conference paper
Medical Image Computing and Computer Assisted Intervention – MICCAI 2022 (MICCAI 2022)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13437)


When oncologists estimate cancer patient survival, they rely on multimodal data. Although some multimodal deep learning methods have been proposed in the literature, the majority rely on two or more independent networks that share knowledge only at a later stage in the overall model. Oncologists, by contrast, do not analyze the data this way; they fuse information from multiple sources, such as medical images and patient history, in their minds. This work proposes a deep learning method that mimics oncologists' analytical behavior when quantifying cancer and estimating patient survival. We propose TMSS, an end-to-end Transformer-based Multimodal network for Segmentation and Survival prediction that leverages the ability of transformers to handle different modalities. The model was trained and validated for segmentation and prognosis tasks on the training dataset from the HEad & NeCK TumOR segmentation and the outcome prediction in PET/CT images challenge (HECKTOR). We show that the proposed prognostic model significantly outperforms state-of-the-art methods with a concordance index of \(\textbf{0.763} \pm \textbf{0.14}\) while achieving a Dice score of \(\textbf{0.772} \pm \textbf{0.030}\), comparable to that of a standalone segmentation model. The code is publicly available at
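The two metrics reported in the abstract can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function names `concordance_index` and `dice_score` are hypothetical, the concordance index shown is the common Harrell-style pairwise definition (assumed here, since the abstract does not specify the variant), and the Dice score is computed on flattened binary masks.

```python
from typing import Sequence


def concordance_index(times: Sequence[float],
                      events: Sequence[int],
                      risks: Sequence[float]) -> float:
    """Harrell-style concordance index (assumed variant, for illustration).

    times  : observed follow-up time per patient
    events : 1 if the event was observed, 0 if the patient was censored
    risks  : predicted risk score (higher means shorter expected survival)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when patient i has an observed event
            # strictly before patient j's follow-up time.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0   # correctly ordered pair
                elif risks[i] == risks[j]:
                    concordant += 0.5   # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


def dice_score(pred: Sequence[int], target: Sequence[int]) -> float:
    """Dice overlap between two flattened binary masks."""
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    # Two empty masks are treated as a perfect match by convention.
    return 1.0 if total == 0 else 2.0 * intersection / total


if __name__ == "__main__":
    # Toy data, invented purely for illustration.
    c = concordance_index([5, 10, 15, 20], [1, 1, 0, 1], [0.9, 0.4, 0.6, 0.1])
    d = dice_score([1, 1, 0, 0], [1, 0, 1, 0])
    print(c, d)
```

A concordance index of 0.5 corresponds to random ordering of patients by risk and 1.0 to perfect ordering; a Dice score of 1.0 means the predicted and reference masks coincide exactly.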




Author information


Corresponding author

Correspondence to Numan Saeed.


Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Saeed, N., Sobirov, I., Al Majzoub, R., Yaqub, M. (2022). TMSS: An End-to-End Transformer-Based Multimodal Network for Segmentation and Survival Prediction. In: Wang, L., Dou, Q., Fletcher, P.T., Speidel, S., Li, S. (eds) Medical Image Computing and Computer Assisted Intervention – MICCAI 2022. MICCAI 2022. Lecture Notes in Computer Science, vol 13437. Springer, Cham.

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-16448-4

  • Online ISBN: 978-3-031-16449-1

  • eBook Packages: Computer Science, Computer Science (R0)
