Multi-modal Multi-instance Learning Using Weakly Correlated Histopathological Images and Tabular Clinical Information

  • Conference paper
Medical Image Computing and Computer Assisted Intervention – MICCAI 2021 (MICCAI 2021)

Abstract

The fusion of heterogeneous medical data is essential in precision medicine to assist medical experts in treatment decision-making. However, there is often little explicit correlation between data from different modalities, such as histopathological images and tabular clinical data. In addition, attention-based multi-instance learning (MIL) often lacks sufficient supervision to assign appropriate attention weights to informative image patches, and thus struggles to generate a good global representation of the whole image. In this paper, we propose a novel multi-modal multi-instance joint learning method, which fuses different modalities and magnification scales into a cross-modal representation that captures potentially complementary information and recalibrates the features in each modality. Furthermore, we leverage the information from tabular clinical data to optimize the MIL bag representation in the imaging modality. The proposed method is evaluated on a challenging medical task, lymph node metastasis (LNM) prediction in breast cancer, and achieves state-of-the-art performance with an AUC of 0.8844, compared with an AUC of 0.7111 using histopathological images alone and an AUC of 0.8312 using tabular clinical data alone. An open-source implementation of our approach can be found at https://github.com/yfzon/Multi-modal-Multi-instance-Learning.
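
To make the terms in the abstract concrete, the following is a minimal sketch of generic attention-based MIL pooling followed by a naive fusion with tabular features. It is not the authors' implementation (see the repository linked above for that). The function names `attention_mil_pool` and `fuse_with_tabular`, the parameters `V` and `w`, the feature sizes, and the random inputs are all assumptions made for illustration, and plain concatenation stands in for the cross-modal fusion and recalibration the paper proposes.

```python
import numpy as np

def attention_mil_pool(patch_feats, V, w):
    """Generic attention-based MIL pooling (illustrative, not the paper's code).

    patch_feats: (n_patches, d) patch embeddings forming one bag (one image).
    V: (d, h) and w: (h,) are hypothetical learnable attention parameters.
    Returns the bag representation (d,) and the attention weights (n_patches,).
    """
    # One unnormalised attention score per patch.
    scores = np.tanh(patch_feats @ V) @ w
    # Softmax over patches, shifted by the max for numerical stability.
    scores = scores - scores.max()
    attn = np.exp(scores) / np.exp(scores).sum()
    # Bag representation: attention-weighted sum of the patch embeddings.
    bag = attn @ patch_feats
    return bag, attn

def fuse_with_tabular(bag, tabular):
    """Placeholder fusion: concatenate image and tabular feature vectors."""
    return np.concatenate([bag, tabular])

# Hypothetical sizes: 6 patches with 8-dim features, 3 clinical variables.
rng = np.random.default_rng(0)
patch_feats = rng.normal(size=(6, 8))
V = rng.normal(size=(8, 4))
w = rng.normal(size=4)
tabular = rng.normal(size=3)

bag, attn = attention_mil_pool(patch_feats, V, w)
fused = fuse_with_tabular(bag, tabular)
```

In this formulation the attention weights are trained only through the bag-level label, which is the weak supervision the abstract refers to. A classifier head on `fused` would produce the LNM prediction.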

H. Li and F. Yang contributed equally to this work.

Acknowledgements

This work was partially funded by the National Key R&D Program of China (2018YFC2000702).

Author information

Corresponding author

Correspondence to Liansheng Wang.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Li, H. et al. (2021). Multi-modal Multi-instance Learning Using Weakly Correlated Histopathological Images and Tabular Clinical Information. In: de Bruijne, M., et al. Medical Image Computing and Computer Assisted Intervention – MICCAI 2021. MICCAI 2021. Lecture Notes in Computer Science, vol 12908. Springer, Cham. https://doi.org/10.1007/978-3-030-87237-3_51

  • DOI: https://doi.org/10.1007/978-3-030-87237-3_51

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-87236-6

  • Online ISBN: 978-3-030-87237-3

  • eBook Packages: Computer Science, Computer Science (R0)
