Domain adaptive Sim-to-Real segmentation of oropharyngeal organs

Wang, Guankun; Ren, Tian-Ao; Lai, Jiewen; Bai, Long; Ren, Hongliang

doi:10.1007/s11517-023-02877-0

Original Article
Published: 18 July 2023

Medical & Biological Engineering & Computing, Volume 61, pages 2745–2755 (2023)

Guankun Wang¹,
Tian-Ao Ren²,³,
Jiewen Lai¹,
Long Bai¹ &
Hongliang Ren¹ (ORCID: orcid.org/0000-0002-6488-1551)

320 Accesses
2 Citations

Abstract

Video-assisted transoral tracheal intubation (TI) requires an endoscope to help the physician insert a tracheal tube into the glottis rather than the esophagus. The growing trend toward robot-assisted TI requires a medical robot to distinguish anatomical features as an experienced physician would, a capability that can be imitated with supervised deep-learning techniques. However, real datasets of oropharyngeal organs are often inaccessible because open-source data are limited and patient privacy must be protected. In this work, we propose a domain adaptive Sim-to-Real framework called IoU-Ranking Blend-ArtFlow (IRB-AF) for image segmentation of oropharyngeal organs. The framework combines an image blending strategy called IoU-Ranking Blend (IRB) with the style-transfer method ArtFlow. IRB alleviates the poor segmentation performance caused by significant domain differences between datasets, while ArtFlow further reduces the discrepancies between datasets. To deal with the limited availability of actual endoscopic images, a virtual oropharynx image dataset generated with the SOFA framework is used as the learning subject for semantic segmentation. We integrated IRB-AF with state-of-the-art domain adaptive segmentation models. The results demonstrate that our approach further improves segmentation accuracy and training stability.
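The abstract names IoU ranking and image blending but does not describe how IRB combines them. The sketch below is therefore not the authors' method; it only illustrates two generic building blocks the name suggests: intersection over union (IoU) between binary masks, and a linear blend of two images. The function names `iou`, `rank_by_iou`, and `blend`, the lowest-first ranking order, and the `alpha` weight are all assumptions made for illustration.

```python
import numpy as np


def iou(pred, target):
    """Intersection over union of two binary masks of the same shape."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    union = np.logical_or(pred, target).sum()
    if union == 0:
        return 1.0  # both masks empty: treat as perfect agreement
    return float(np.logical_and(pred, target).sum() / union)


def rank_by_iou(preds, targets):
    """Return sample indices sorted from lowest to highest IoU, plus the scores.

    Hypothetical ranking step: the ordering criterion is an assumption.
    """
    scores = [iou(p, t) for p, t in zip(preds, targets)]
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    return order, scores


def blend(image_a, image_b, alpha=0.5):
    """Linear blend alpha*a + (1-alpha)*b of two same-shape uint8 images."""
    a = np.asarray(image_a, dtype=np.float32)
    b = np.asarray(image_b, dtype=np.float32)
    mixed = alpha * a + (1.0 - alpha) * b
    return np.clip(np.rint(mixed), 0, 255).astype(np.uint8)
```

Under these assumptions, one could rank segmentation predictions by IoU and blend the corresponding simulated images with real-style images; how IRB actually selects and mixes images is specified in the full article, not here.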

Notes

Endoscopic Images generated from SOFA-based oropharynx model with style transfer from phantom (EISOST) - https://github.com/gkw0010/EISOST-Sim2Real-Dataset-Release.

Acknowledgements

This work was supported in part by the Hong Kong Research Grants Council (RGC) Collaborative Research Fund (CRF-C4026-21GF).

Author information

Authors and Affiliations

Department of Electronic Engineering, The Chinese University of Hong Kong, Shatin, N.T., 999077, Hong Kong, China
Guankun Wang, Jiewen Lai, Long Bai & Hongliang Ren
College of Mechanical and Electrical Engineering, Beijing University of Chemical Technology, 15 Beisanhuan East Rd., Chaoyang, 100029, Beijing, China
Tian-Ao Ren
Shenzhen Research Institute, The Chinese University of Hong Kong, 2 Yuexing Rd., Nanshan, Shenzhen, 518057, Guangdong, China
Tian-Ao Ren

Contributions

G.W., J.L., and H.R. conceived the concepts. G.W., T.R., J.L., and L.B. advised on the design and implementation of the experiments. G.W., T.R., and J.L. conducted experiments and analyzed the data. G.W., T.R., J.L., and L.B. wrote the manuscript. All authors read, edited, and discussed the manuscript and agree with the claims made in this work. H.R. coordinated and supervised the research.

Corresponding author

Correspondence to Hongliang Ren.

Ethics declarations

Ethics approval

Ethical approval was not sought for the present study because this article does not contain any studies with human or animal subjects.

Conflict of interest

The authors declare no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Wang, G., Ren, TA., Lai, J. et al. Domain adaptive Sim-to-Real segmentation of oropharyngeal organs. Med Biol Eng Comput 61, 2745–2755 (2023). https://doi.org/10.1007/s11517-023-02877-0

Received: 05 March 2023
Accepted: 25 June 2023
Published: 18 July 2023
Issue Date: October 2023
DOI: https://doi.org/10.1007/s11517-023-02877-0
