Investigating Transformer Encoding Techniques to Improve Data-Driven Volume-to-Surface Liver Registration for Image-Guided Navigation

Young, Michael; Yang, Zixin; Simon, Richard; Linte, Cristian A.

doi:10.1007/978-3-031-44992-5_9

Michael Young ORCID: orcid.org/0000-0002-3738-578X¹⁴,
Zixin Yang¹⁴,
Richard Simon¹⁵ &
Cristian A. Linte ORCID: orcid.org/0000-0001-7602-7937¹⁴,¹⁵

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14314)

Included in the following conference series:

MICCAI Workshop on Data Engineering in Medical Imaging


Abstract

Due to limited direct organ visualization, minimally invasive interventions rely extensively on medical imaging and image guidance to ensure accurate surgical instrument navigation and target tissue manipulation. In the context of laparoscopic liver interventions, intra-operative video imaging provides only a limited field-of-view of the liver surface, with no information about any internal liver lesions identified during diagnosis using pre-procedural imaging. Hence, enhancing intra-procedural visualization and navigation requires a sufficiently accurate registration of the pre-procedural diagnostic images and anatomical models, which feature the target tissues to be accessed or manipulated during surgery, into the intra-operative setting. Prior work has demonstrated the feasibility of neural network-based solutions for nonrigid volume-to-surface liver registration. However, view occlusion, lack of meaningful feature landmarks, and liver deformation between the pre- and intra-operative settings all contribute to the difficulty of this registration task. In this work, we leverage state-of-the-art deep learning frameworks to implement and test various network architecture modifications toward improving the accuracy and robustness of volume-to-surface liver registration. Specifically, we focus on adapting a transformer-based segmentation network to better predict the optimal displacement field for nonrigid registration. Our results suggest that one particular transformer-based network architecture, UTNet, led to significant improvements over baseline performance, yielding a mean displacement error on the order of 4 mm across a variety of datasets.
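
As a rough illustration of the metric named in the abstract, the sketch below computes a mean displacement error between a predicted and a reference displacement field. It assumes each field is a list of 3D vectors in millimetres sampled at the same points. The function name `mean_displacement_error` and the sample values are hypothetical and not taken from the chapter, whose actual evaluation protocol may differ.

```python
import math


def mean_displacement_error(predicted, reference):
    """Mean Euclidean distance between paired 3D displacement vectors.

    Both arguments are sequences of (x, y, z) tuples in the same units
    (assumed here to be millimetres) and in the same point order.
    """
    if len(predicted) != len(reference):
        raise ValueError("fields must contain the same number of vectors")
    if not predicted:
        raise ValueError("fields must not be empty")
    total = 0.0
    for p, r in zip(predicted, reference):
        # Per-point error: length of the difference vector.
        total += math.dist(p, r)
    return total / len(predicted)


# Hypothetical displacement vectors (mm), for illustration only.
predicted = [(1.0, 0.0, 0.0), (0.0, 2.0, 0.0)]
reference = [(1.0, 0.0, 3.0), (0.0, 2.0, 4.0)]
error_mm = mean_displacement_error(predicted, reference)
print(error_mm)
```

Here the per-point errors are 3 mm and 4 mm, so the mean is 3.5 mm. Averaging per-point Euclidean distances is one common way to summarise registration accuracy; it is shown only to make the reported quantity concrete.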

Acknowledgements

Research reported in this publication was supported by the National Institute of General Medical Sciences Award No. R35GM128877 of the National Institutes of Health, the Office of Advanced Cyberinfrastructure Award No. 1808530 of the National Science Foundation, and the Division of Chemical, Bioengineering, Environmental, and Transport Systems Award No. 2245152 of the National Science Foundation.

Author information

Authors and Affiliations

Chester F. Carlson Center for Imaging Science, Rochester Institute of Technology, Rochester, NY, 14623, USA
Michael Young, Zixin Yang & Cristian A. Linte
Department of Biomedical Engineering, Rochester Institute of Technology, Rochester, NY, 14623, USA
Richard Simon & Cristian A. Linte

Corresponding author

Correspondence to Michael Young.

Editor information

Editors and Affiliations

University of Aberdeen, Aberdeen, UK
Binod Bhattarai
University of Leeds, Leeds, UK
Sharib Ali
Stanford University, Stanford, CA, USA
Anita Rau
University of Liverpool, Liverpool, UK
Anh Nguyen
University of Oxford, Oxford, UK
Ana Namburete
University College London, London, UK
Razvan Caramalau
University College London, London, UK
Danail Stoyanov

About this paper

Cite this paper

Young, M., Yang, Z., Simon, R., Linte, C.A. (2023). Investigating Transformer Encoding Techniques to Improve Data-Driven Volume-to-Surface Liver Registration for Image-Guided Navigation. In: Bhattarai, B., et al. Data Engineering in Medical Imaging. DEMI 2023. Lecture Notes in Computer Science, vol 14314. Springer, Cham. https://doi.org/10.1007/978-3-031-44992-5_9

DOI: https://doi.org/10.1007/978-3-031-44992-5_9
Published: 01 October 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-44991-8
Online ISBN: 978-3-031-44992-5
eBook Packages: Computer Science, Computer Science (R0)
