Abstract
Tracking points in robot-assisted surgery helps enable models for augmented reality and image-guidance applications, where both speed and accuracy are critical. Current dense convolutional neural networks can be costly, especially when we only wish to track user-defined regions. Faster methods use keypoints and their movement to estimate flow in an image. In this paper we introduce a recurrent implicit neural graph (RING), which estimates flow efficiently. RING interpolates the flow at any selected query points with an implicit neural representation (also known as a coordinate-based representation) that takes as input the surrounding points and the history of the tracked (query) points. RING can track an arbitrary number of image points. We demonstrate that RING estimates point motion better than methods that do not use a state. We evaluate RING both photometrically and against ground-truth depth data. Finally, we demonstrate RING's real-time effectiveness in timing experiments.
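The abstract's central idea, interpolating flow at arbitrary query points from the flow of nearby tracked points via a coordinate-based network, can be illustrated with a minimal sketch. This is not the paper's RING architecture (which is recurrent and graph-structured); it is a toy NumPy example of the general pattern, with all names (`fourier_features`, `FlowMLP`, `interpolate_flow`) hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)

def fourier_features(xy, n_freq=4):
    # Encode 2D coordinates with sin/cos at several frequencies, a common
    # positional encoding for implicit (coordinate-based) networks.
    freqs = 2.0 ** np.arange(n_freq)           # (n_freq,)
    angles = xy[..., None] * freqs * np.pi     # (..., 2, n_freq)
    return np.concatenate(
        [np.sin(angles), np.cos(angles)], axis=-1
    ).reshape(*xy.shape[:-1], -1)              # (..., 2 * 2 * n_freq)

class FlowMLP:
    """Toy coordinate-based net: (encoded query, neighbour summary) -> 2D flow."""
    def __init__(self, in_dim, hidden=32):
        self.w1 = rng.normal(0, 0.1, (in_dim, hidden))
        self.b1 = np.zeros(hidden)
        self.w2 = rng.normal(0, 0.1, (hidden, 2))
        self.b2 = np.zeros(2)

    def __call__(self, x):
        h = np.tanh(x @ self.w1 + self.b1)
        return h @ self.w2 + self.b2

def interpolate_flow(query_xy, neighbour_xy, neighbour_flow, mlp):
    """Predict flow at query points from nearby tracked points.

    Each query gets a distance-weighted summary of its neighbours'
    flows, which the coordinate-based MLP then refines with a residual.
    """
    d = np.linalg.norm(query_xy[:, None] - neighbour_xy[None], axis=-1)  # (Q, N)
    w = np.exp(-d)                               # soft inverse-distance weights
    w /= w.sum(axis=1, keepdims=True)
    summary = w @ neighbour_flow                 # (Q, 2) blended neighbour flow
    x = np.concatenate([fourier_features(query_xy), summary], axis=-1)
    return summary + mlp(x)                      # MLP predicts a residual correction

# Toy data: 5 query points, 8 tracked neighbours with known flow.
query = rng.uniform(0.0, 1.0, (5, 2))
neigh = rng.uniform(0.0, 1.0, (8, 2))
flow = rng.normal(0.0, 1.0, (8, 2))
mlp = FlowMLP(in_dim=2 * 2 * 4 + 2)              # Fourier dims + 2D flow summary
out = interpolate_flow(query, neigh, flow, mlp)  # (5, 2) flow at query points
```

Because the network conditions on continuous coordinates rather than a fixed grid, any number of query points can be evaluated, which mirrors the paper's claim that RING tracks an arbitrary number of image points. The recurrent state over tracked-point history is omitted here for brevity.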
This work was supported by Intuitive Surgical.
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Schmidt, A., Mohareri, O., DiMaio, S., Salcudean, S.E. (2022). Recurrent Implicit Neural Graph for Deformable Tracking in Endoscopic Videos. In: Wang, L., Dou, Q., Fletcher, P.T., Speidel, S., Li, S. (eds) Medical Image Computing and Computer Assisted Intervention – MICCAI 2022. MICCAI 2022. Lecture Notes in Computer Science, vol 13434. Springer, Cham. https://doi.org/10.1007/978-3-031-16440-8_46
Print ISBN: 978-3-031-16439-2
Online ISBN: 978-3-031-16440-8