Domain adaptive Sim-to-Real segmentation of oropharyngeal organs

  • Original Article
Medical & Biological Engineering & Computing

Abstract

Video-assisted transoral tracheal intubation (TI) requires an endoscope that helps the physician insert a tracheal tube into the glottis rather than the esophagus. The growing trend toward robot-assisted TI requires a medical robot to distinguish anatomical features as an experienced physician does, a capability that can be imitated with supervised deep-learning techniques. However, real datasets of oropharyngeal organs are often inaccessible because of limited open-source data and patient privacy. In this work, we propose a domain adaptive Sim-to-Real framework called IoU-Ranking Blend-ArtFlow (IRB-AF) for image segmentation of oropharyngeal organs. The framework combines an image blending strategy called IoU-Ranking Blend (IRB) with the style-transfer method ArtFlow. IRB alleviates the poor segmentation performance caused by significant domain differences between datasets, while ArtFlow further reduces the discrepancies between datasets. To deal with the limited availability of real endoscopic images, a virtual oropharynx image dataset generated with the SOFA framework is used as the learning subject for semantic segmentation. We integrated IRB-AF with state-of-the-art domain adaptive segmentation models. The results demonstrate that our approach further improves segmentation accuracy and training stability.
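
The name IoU-Ranking Blend suggests two ingredients, an intersection-over-union (IoU) score used for ranking and an image blending step, but the abstract does not specify how either is computed. The following Python sketch is only an illustration of those two generic building blocks. The function names, the use of mean per-class IoU as the ranking score, and the pixel-wise alpha blend are all assumptions made for this example and should not be read as the authors' published procedure.

```python
import numpy as np


def class_iou(pred, target, num_classes):
    """Per-class intersection over union for two integer label maps."""
    ious = []
    for c in range(num_classes):
        p = pred == c
        t = target == c
        union = np.logical_or(p, t).sum()
        if union == 0:
            ious.append(np.nan)  # class absent from both maps
        else:
            ious.append(np.logical_and(p, t).sum() / union)
    return np.array(ious)


def mean_iou(pred, target, num_classes):
    """Mean IoU over the classes that appear in either map."""
    return float(np.nanmean(class_iou(pred, target, num_classes)))


def rank_by_iou(preds, targets, num_classes):
    """Return sample indices sorted from lowest to highest mean IoU.

    Assumption for illustration: mean IoU is the ranking score.
    """
    scores = [mean_iou(p, t, num_classes) for p, t in zip(preds, targets)]
    order = [int(i) for i in np.argsort(scores)]
    return order, scores


def alpha_blend(img_a, img_b, alpha=0.5):
    """Pixel-wise convex combination of two uint8 images of equal shape.

    Assumption for illustration: a plain alpha blend stands in for
    whatever blending rule the paper actually uses.
    """
    mixed = alpha * img_a.astype(np.float32) + (1.0 - alpha) * img_b.astype(np.float32)
    return np.clip(np.rint(mixed), 0, 255).astype(np.uint8)
```

In such a setup, `rank_by_iou` would identify the samples a segmentation model currently handles worst, and `alpha_blend` would mix a simulated frame with a real-looking one at a chosen ratio. How the ranking actually drives the blending in IRB is described in the full article, not in the abstract.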

Notes

  1. Endoscopic Images generated from SOFA-based oropharynx model with style transfer from phantom (EISOST) - https://github.com/gkw0010/EISOST-Sim2Real-Dataset-Release.


Acknowledgements

This work was supported in part by the Hong Kong Research Grants Council (RGC) Collaborative Research Fund (CRF-C4026-21GF).

Author information

Contributions

G.W., J.L., and H.R. conceived the concepts. G.W., T.R., J.L., and L.B. advised on the design and implementation of the experiments. G.W., T.R., and J.L. conducted experiments and analyzed the data. G.W., T.R., J.L., and L.B. wrote the manuscript. All authors read, edited, and discussed the manuscript and agree with the claims made in this work. H.R. coordinated and supervised the research.

Corresponding author

Correspondence to Hongliang Ren.

Ethics declarations

Ethics approval

Ethical approval was not sought for the present study because this article does not contain any studies with human or animal subjects.

Conflict of interest

The authors declare no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Wang, G., Ren, TA., Lai, J. et al. Domain adaptive Sim-to-Real segmentation of oropharyngeal organs. Med Biol Eng Comput 61, 2745–2755 (2023). https://doi.org/10.1007/s11517-023-02877-0

  • DOI: https://doi.org/10.1007/s11517-023-02877-0
