Abstract
Purpose
Vision-based robot tool segmentation plays a fundamental role in surgical robot perception and downstream tasks. CaRTS, based on a complementary causal model, has shown promising performance in unseen counterfactual surgical environments in the presence of smoke, blood, etc. However, because of limited observability, CaRTS requires over 30 optimization iterations to converge on a single image.
Method
To address this limitation, we take temporal relations into consideration and propose a temporal causal model for robot tool segmentation on video sequences. We design an architecture named Temporally Constrained CaRTS (TC-CaRTS). TC-CaRTS complements CaRTS with three novel modules: a temporal optimization pipeline, a kinematics correction network, and spatial-temporal regularization.
Results
Experimental results show that TC-CaRTS requires fewer iterations than CaRTS to achieve the same or better performance on different domains. All three modules are shown to be effective.
Conclusion
We propose TC-CaRTS, which takes advantage of temporal constraints as additional observability. We show that TC-CaRTS outperforms prior work in the robot tool segmentation task with improved convergence speed on test datasets from different domains.
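The intuition that temporal constraints can reduce the number of optimization iterations can be illustrated with a toy example. The sketch below is not the TC-CaRTS pipeline: it assumes a hypothetical scalar parameter (for instance a kinematics offset) whose per-frame optimum drifts slowly, a quadratic loss, and plain gradient descent. All function names, the learning rate, and the tolerance are illustrative assumptions. It compares restarting the optimization from scratch on every frame with reusing the previous frame's solution as the starting point.

```python
def iterations_to_converge(start, target, lr=0.5, tol=1e-3, max_iters=1000):
    """Gradient descent on f(x) = 0.5 * (x - target)**2.

    Returns the final estimate and the number of update steps taken.
    """
    x = start
    for i in range(max_iters):
        if abs(x - target) < tol:
            return x, i
        x -= lr * (x - target)  # gradient of the quadratic loss is (x - target)
    return x, max_iters


def run_sequence(targets, warm_start):
    """Optimize every frame in order.

    With warm_start=True, each frame starts from the previous frame's solution;
    otherwise each frame starts from the same fixed initial guess (0.0).
    """
    x_prev = 0.0
    counts = []
    for target in targets:
        start = x_prev if warm_start else 0.0
        x_prev, n_iters = iterations_to_converge(start, target)
        counts.append(n_iters)
    return counts


# Hypothetical per-frame optimum that drifts slowly over a 20-frame sequence.
targets = [1.0 + 0.01 * k for k in range(20)]
cold = run_sequence(targets, warm_start=False)
warm = run_sequence(targets, warm_start=True)
print("cold-start iterations per frame:", cold)
print("warm-start iterations per frame:", warm)
```

In this toy setting the first frame costs the same either way, while later frames need fewer steps when they start from the previous solution, because consecutive optima are close together. The actual method combines a temporal optimization pipeline with a kinematics correction network and spatial-temporal regularization, none of which this sketch models.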
Acknowledgements
This research is supported by a collaborative research agreement with the MultiScale Medical Robotics Center at The Chinese University of Hong Kong.
Funding
Hao Ding is funded by the MultiScale Medical Robotics Center at The Chinese University of Hong Kong. This article does not contain any studies with human participants or animals performed by any of the authors. This article does not contain patient data.
About this article
Cite this article
Ding, H., Wu, J.Y., Li, Z. et al. Rethinking causality-driven robot tool segmentation with temporal constraints. Int J CARS 18, 1009–1016 (2023). https://doi.org/10.1007/s11548-023-02872-8
DOI: https://doi.org/10.1007/s11548-023-02872-8