Surgical Scene Segmentation Using Semantic Image Synthesis with a Virtual Surgery Environment

Yoon, Jihun; Hong, SeulGi; Hong, Seungbum; Lee, Jiwon; Shin, Soyeon; Park, Bokyung; Sung, Nakjun; Yu, Hayeong; Kim, Sungjae; Park, SungHyun; Hyung, Woo Jin; Choi, Min-Kook

doi:10.1007/978-3-031-16449-1_53


Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13437)

Included in the following conference series:

International Conference on Medical Image Computing and Computer-Assisted Intervention


Abstract

Previous image synthesis research for surgical vision had limited applicability to real-world settings: it relied on simple simulators that included only a few organs and surgical tools, and it used outdated segmentation models to evaluate image quality. Furthermore, none of that research released a complete dataset to the public to enable open research. We therefore release a new dataset to encourage further study and provide novel methods, with extensive experiments, for surgical scene segmentation using semantic image synthesis with a more complex virtual surgery environment. First, we created three cross-validation sets of real image data, considering demographic and clinical information, from 40 cases of real surgical videos of gastrectomy with the da Vinci Surgical System (dVSS). Second, we created a virtual surgery environment in the Unity engine with five organs from real patient CT data and 22 da Vinci surgical instruments from actual measurements. Third, we converted this environment photo-realistically with representative semantic image synthesis models, SEAN and SPADE. Lastly, we evaluated it with various state-of-the-art instance and semantic segmentation models. Synthetic training data substantially improved the performance of our segmentation models. More methods, statistics, and visualizations are available at https://sisvse.github.io/.
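
The abstract does not describe how the three cross-validation sets were built, so the following is only a minimal sketch of one way to split 40 surgical cases into three folds at the case level while spreading a stratum label across folds. The function names, the case identifiers, and the two-level stratum are hypothetical and stand in for whatever demographic and clinical grouping the authors actually used.

```python
def make_case_folds(cases, n_folds=3):
    """Assign each surgical case to exactly one fold.

    cases: list of (case_id, stratum) pairs. The stratum is a hypothetical
    label summarizing demographic/clinical attributes of the case.
    Sorting by stratum and dealing cases out round-robin keeps fold sizes
    balanced and spreads each stratum across the folds.
    """
    ordered = sorted(cases, key=lambda c: (c[1], c[0]))
    folds = [[] for _ in range(n_folds)]
    for i, (case_id, _stratum) in enumerate(ordered):
        folds[i % n_folds].append(case_id)
    return folds


def cross_validation_splits(folds):
    """Yield (train_case_ids, val_case_ids), holding out one fold at a time."""
    for k, held_out in enumerate(folds):
        train = [c for j, fold in enumerate(folds) if j != k for c in fold]
        yield train, list(held_out)


# Hypothetical data: 40 cases with a made-up two-level stratum ("A" or "B").
cases = [(f"case_{i:02d}", "A" if i % 2 == 0 else "B") for i in range(40)]
folds = make_case_folds(cases)
splits = list(cross_validation_splits(folds))
```

Splitting by case rather than by frame keeps all frames from one surgical video in the same fold, which avoids near-duplicate frames leaking between training and validation data.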

J. Yoon and S. Hong are co-first authors.


Notes

1. https://sisvse.github.io/


Acknowledgement

This work was supported by the Korea Medical Device Development Fund grant funded by the Korea government (the Ministry of Science and ICT, the Ministry of Trade, Industry and Energy, the Ministry of Health & Welfare, the Ministry of Food and Drug Safety) (Project Number: 202012A02-02).

Author information

Authors and Affiliations

Hutom, Seoul, Republic of Korea
Jihun Yoon, SeulGi Hong, Seungbum Hong, Jiwon Lee, Soyeon Shin, Bokyung Park, Nakjun Sung, Hayeong Yu, Sungjae Kim, Woo Jin Hyung & Min-Kook Choi
Department of Surgery, Yonsei University College of Medicine, Seoul, Republic of Korea
SungHyun Park & Woo Jin Hyung


Corresponding author

Correspondence to Min-Kook Choi.

Editor information

Editors and Affiliations

Rochester Institute of Technology, Rochester, NY, USA
Linwei Wang
Chinese University of Hong Kong, Hong Kong, Hong Kong
Qi Dou
University of Virginia, Charlottesville, VA, USA
P. Thomas Fletcher
National Center for Tumor Diseases (NCT/UCC), Dresden, Germany
Stefanie Speidel
Case Western Reserve University, Cleveland, OH, USA
Shuo Li

1 Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 290 KB)


About this paper

Cite this paper

Yoon, J. et al. (2022). Surgical Scene Segmentation Using Semantic Image Synthesis with a Virtual Surgery Environment. In: Wang, L., Dou, Q., Fletcher, P.T., Speidel, S., Li, S. (eds) Medical Image Computing and Computer Assisted Intervention – MICCAI 2022. MICCAI 2022. Lecture Notes in Computer Science, vol 13437. Springer, Cham. https://doi.org/10.1007/978-3-031-16449-1_53


DOI: https://doi.org/10.1007/978-3-031-16449-1_53
Published: 17 September 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-16448-4
Online ISBN: 978-3-031-16449-1
eBook Packages: Computer Science, Computer Science (R0)
