INSIDE: Steering Spatial Attention with Non-imaging Information in CNNs

Jacenków, Grzegorz; O’Neil, Alison Q.; Mohr, Brian; Tsaftaris, Sotirios A.

doi:10.1007/978-3-030-59719-1_38


Part of the book series: Lecture Notes in Computer Science (LNIP, volume 12264)

Included in the following conference series:

International Conference on Medical Image Computing and Computer-Assisted Intervention


Abstract

We consider the problem of integrating non-imaging information into segmentation networks to improve performance. Conditioning layers such as FiLM provide the means to selectively amplify or suppress the contribution of different feature maps in a linear fashion. However, spatial dependency is difficult to learn within a convolutional paradigm. In this paper, we propose a mechanism to allow for spatial localisation conditioned on non-imaging information, using a feature-wise attention mechanism comprising a differentiable parametrised function (e.g. Gaussian), prior to applying the feature-wise modulation. We name our method INstance modulation with SpatIal DEpendency (INSIDE). The conditioning information might comprise any factors that relate to spatial or spatio-temporal information such as lesion location, size, and cardiac cycle phase. Our method can be trained end-to-end and does not require additional supervision. We evaluate the method on two datasets: a new CLEVR-Seg dataset where we segment objects based on location, and the ACDC dataset conditioned on cardiac phase and slice location within the volume. Code and the CLEVR-Seg dataset are available at https://github.com/jacenkow/inside.
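
The mechanism described in the abstract can be illustrated with a minimal NumPy sketch: a per-channel Gaussian attention mask is applied first, followed by a FiLM-style scale and shift. This is not the authors' implementation (see the repository linked above for that). The function names `gaussian_attention` and `inside_layer`, the single linear map from the conditioning vector to the parameters, and the tanh and softplus squashing are all assumptions made for illustration.

```python
import numpy as np


def gaussian_attention(height, width, mu, sigma):
    """Build one separable Gaussian mask per channel on a [-1, 1] grid.

    mu, sigma: arrays of shape (C, 2) holding (y, x) centres and widths.
    Returns an array of shape (C, height, width) with values in (0, 1].
    """
    ys = np.linspace(-1.0, 1.0, height)
    xs = np.linspace(-1.0, 1.0, width)
    gy = np.exp(-0.5 * ((ys[None, :] - mu[:, 0:1]) / sigma[:, 0:1]) ** 2)  # (C, H)
    gx = np.exp(-0.5 * ((xs[None, :] - mu[:, 1:2]) / sigma[:, 1:2]) ** 2)  # (C, W)
    return gy[:, :, None] * gx[:, None, :]


def inside_layer(features, cond, weights, bias):
    """Spatial attention followed by feature-wise modulation (a sketch).

    features: (C, H, W) feature maps
    cond:     (D,) non-imaging conditioning vector
    weights:  (D, 6 * C) and bias: (6 * C,) -- a hypothetical linear
              predictor; the real parameter network may differ.
    """
    C, H, W = features.shape
    params = cond @ weights + bias
    # Per channel: 1 scale, 1 shift, 2 centre coordinates, 2 widths.
    gamma, beta, mu_raw, sigma_raw = np.split(params, [C, 2 * C, 4 * C])
    mu = np.tanh(mu_raw).reshape(C, 2)                         # keep centres inside the grid
    sigma = np.log1p(np.exp(sigma_raw)).reshape(C, 2) + 1e-3   # softplus keeps widths positive
    attention = gaussian_attention(H, W, mu, sigma)
    attended = features * attention                            # attention is applied first
    modulated = gamma[:, None, None] * attended + beta[:, None, None]
    return modulated, attention
```

Because every step is differentiable with respect to the predicted centres and widths, a layer of this shape can in principle be trained end-to-end from the segmentation loss alone, which matches the abstract's claim that no additional supervision is needed.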


Notes

1. Diagnostic Dataset for Compositional Language and Elementary Visual Reasoning.
2. Automated Cardiac Diagnosis Challenge (ACDC), MICCAI Challenge 2017.

References

Bai, W., et al.: Automated cardiovascular magnetic resonance image analysis with fully convolutional networks. J. Cardiovasc. Magn. Reson. 20(1), 65 (2018)
Bernard, O., et al.: Deep learning techniques for automatic MRI cardiac multi-structures segmentation and diagnosis: is the problem solved? IEEE Trans. Med. Imaging 37(11), 2514–2525 (2018)
Brock, A., Donahue, J., Simonyan, K.: Large scale GAN training for high fidelity natural image synthesis. arXiv preprint arXiv:1809.11096 (2018)
Chartsias, A., et al.: Disentangled representation learning in cardiac image analysis. Med. Image Anal. 58, 101535 (2019)
Dice, L.R.: Measures of the amount of ecologic association between species. Ecology 26(3), 297–302 (1945)
Dumoulin, V., Shlens, J., Kudlur, M.: A learned representation for artistic style. arXiv preprint arXiv:1610.07629 (2016)
Huang, X., Belongie, S.: Arbitrary style transfer in real-time with adaptive instance normalization. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 1501–1510 (2017)
Ioffe, S., Szegedy, C.: Batch normalization: accelerating deep network training by reducing internal covariate shift. arXiv preprint arXiv:1502.03167 (2015)
Jacenków, G., Chartsias, A., Mohr, B., Tsaftaris, S.A.: Conditioning convolutional segmentation architectures with non-imaging data. In: International Conference on Medical Imaging with Deep Learning–Extended Abstract Track (2019)
Johnson, J., Hariharan, B., van der Maaten, L., Fei-Fei, L., Lawrence Zitnick, C., Girshick, R.: CLEVR: a diagnostic dataset for compositional language and elementary visual reasoning. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2901–2910 (2017)
Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
Kosiorek, A., Bewley, A., Posner, I.: Hierarchical attentive recurrent tracking. In: Advances in Neural Information Processing Systems, pp. 3053–3061 (2017)
Lin, T.Y., Goyal, P., Girshick, R., He, K., Dollár, P.: Focal loss for dense object detection. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 2980–2988 (2017)
Nibali, A., He, Z., Morgan, S., Prendergast, L.: Numerical coordinate regression with convolutional neural networks. arXiv preprint arXiv:1801.07372 (2018)
Park, T., Liu, M.Y., Wang, T.C., Zhu, J.Y.: Semantic image synthesis with spatially-adaptive normalization. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2337–2346 (2019)
Perez, E., Strub, F., De Vries, H., Dumoulin, V., Courville, A.: FiLM: visual reasoning with a general conditioning layer. In: Thirty-Second AAAI Conference on Artificial Intelligence (2018)
Ronneberger, O., Fischer, P., Brox, T.: U-Net: convolutional networks for biomedical image segmentation. In: Navab, N., Hornegger, J., Wells, W.M., Frangi, A.F. (eds.) MICCAI 2015. LNCS, vol. 9351, pp. 234–241. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-24574-4_28
Rupprecht, C., Laina, I., Navab, N., Hager, G.D., Tombari, F.: Guide me: interacting with deep networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 8551–8561 (2018)
Sato, S., et al.: Conjugate eye deviation in acute intracerebral hemorrhage: stroke acute management with urgent risk-factor assessment and improvement-ich (samurai-ich) study. Stroke 43(11), 2898–2903 (2012)
Sofiiuk, K., Barinova, O., Konushin, A.: AdaptIS: adaptive instance selection network. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 7355–7363 (2019)
Arslan, S., Ktena, S.I., Glocker, B., Rueckert, D.: Graph saliency maps through spectral convolutional networks: application to sex classification with brain connectivity. In: Stoyanov, D., et al. (eds.) GRAIL/Beyond MIC -2018. LNCS, vol. 11044, pp. 3–13. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-00689-1_1
Ulyanov, D., Vedaldi, A., Lempitsky, V.: Instance normalization: the missing ingredient for fast stylization. arXiv preprint arXiv:1607.08022 (2016)
Zakharov, E., Shysheya, A., Burkov, E., Lempitsky, V.: Few-shot adversarial learning of realistic neural talking head models. arXiv preprint arXiv:1905.08233 (2019)


Acknowledgment

This work was supported by the Engineering and Physical Sciences Research Council [grant number EP/R513209/1]; and Canon Medical Research Europe Ltd. S.A. Tsaftaris acknowledges the support of the Royal Academy of Engineering and the Research Chairs and Senior Research Fellowships scheme.

Author information

Authors and Affiliations

The University of Edinburgh, Edinburgh, UK
Grzegorz Jacenków, Alison Q. O’Neil & Sotirios A. Tsaftaris
Canon Medical Research Europe, Edinburgh, UK
Alison Q. O’Neil, Brian Mohr & Sotirios A. Tsaftaris
The Alan Turing Institute, London, UK
Sotirios A. Tsaftaris


Corresponding author

Correspondence to Grzegorz Jacenków.

Editor information

Editors and Affiliations

University of Toronto, Toronto, ON, Canada
Anne L. Martel
The University of British Columbia, Vancouver, BC, Canada
Purang Abolmaesumi
University College London, London, UK
Danail Stoyanov
École Centrale de Nantes, Nantes, France
Diana Mateus
EURECOM, Biot, France
Maria A. Zuluaga
Chinese Academy of Sciences, Beijing, China
S. Kevin Zhou
Sorbonne University, Paris, France
Daniel Racoceanu
The Hebrew University of Jerusalem, Jerusalem, Israel
Leo Joskowicz

1 Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 613 KB)


About this paper

Cite this paper

Jacenków, G., O’Neil, A.Q., Mohr, B., Tsaftaris, S.A. (2020). INSIDE: Steering Spatial Attention with Non-imaging Information in CNNs. In: Martel, A.L., et al. Medical Image Computing and Computer Assisted Intervention – MICCAI 2020. MICCAI 2020. Lecture Notes in Computer Science, vol 12264. Springer, Cham. https://doi.org/10.1007/978-3-030-59719-1_38


DOI: https://doi.org/10.1007/978-3-030-59719-1_38
Published: 29 September 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-59718-4
Online ISBN: 978-3-030-59719-1
eBook Packages: Computer Science, Computer Science (R0)
