Surgical Scene Segmentation Using Semantic Image Synthesis with a Virtual Surgery Environment

  • Conference paper
Medical Image Computing and Computer Assisted Intervention – MICCAI 2022 (MICCAI 2022)

Abstract

Previous image synthesis research for surgical vision produced limited results for real-world applications: it relied on simple simulators that included only a few organs and surgical tools, and it used outdated segmentation models to evaluate image quality. Furthermore, none of that research released a complete dataset to the public to enable open research. We therefore release a new dataset to encourage further study, and we provide novel methods with extensive experiments for surgical scene segmentation using semantic image synthesis with a more complex virtual surgery environment. First, we created three cross-validation sets of real image data, taking demographic and clinical information into account, from 40 cases of real surgical videos of gastrectomy performed with the da Vinci Surgical System (dVSS). Second, we created a virtual surgery environment in the Unity engine with five organs modeled from real patient CT data and 22 da Vinci surgical instruments modeled from actual measurements. Third, we converted this environment into photo-realistic images with two representative semantic image synthesis models, SEAN and SPADE. Lastly, we evaluated the synthesized data with various state-of-the-art instance and semantic segmentation models. Synthetic training data substantially improved our segmentation models. More methods, statistics, and visualizations are available at https://sisvse.github.io/.
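
The abstract does not spell out how the three cross-validation sets were built, so the following is only a minimal sketch of one plausible approach, not the authors' procedure. It assumes that whole cases (rather than individual frames) are assigned to folds, and that each case carries a single stratum label derived from its demographic and clinical information. The function name `make_case_folds`, the case ids, and the stratum labels are all hypothetical.

```python
import random
from collections import defaultdict


def make_case_folds(cases, n_folds=3, seed=0):
    """Assign whole cases to folds, spreading each stratum evenly across folds.

    `cases` maps a case id to a stratum label (hypothetical, e.g. a label
    combining sex and a clinical category). Splitting by case keeps all
    frames of one surgical video in the same fold.
    """
    rng = random.Random(seed)

    # Group case ids by stratum; sorting keeps the result reproducible.
    by_stratum = defaultdict(list)
    for case_id, stratum in sorted(cases.items()):
        by_stratum[stratum].append(case_id)

    folds = [[] for _ in range(n_folds)]
    offset = 0  # continue the round-robin across strata to balance fold sizes
    for stratum in sorted(by_stratum):
        members = by_stratum[stratum]
        rng.shuffle(members)
        for i, case_id in enumerate(members):
            folds[(offset + i) % n_folds].append(case_id)
        offset += len(members)
    return folds


# Illustrative input only: 40 made-up cases with a made-up two-level stratum.
cases = {f"case_{i:02d}": ("A" if i % 2 else "B") for i in range(40)}
folds = make_case_folds(cases)

for k, val_cases in enumerate(folds):
    # Each fold serves once as the validation set; the rest is training data.
    train_cases = [c for j, fold in enumerate(folds) if j != k for c in fold]
    print(f"set {k}: {len(train_cases)} train cases, {len(val_cases)} val cases")
```

Assigning cases rather than frames to folds avoids near-duplicate frames from one video appearing in both the training and validation data, which would otherwise inflate segmentation scores.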

J. Yoon and S. Hong—Co-first authors.

Notes

  1. https://sisvse.github.io/

Acknowledgement

This work was supported by the Korea Medical Device Development Fund grant funded by the Korea government (the Ministry of Science and ICT, the Ministry of Trade, Industry and Energy, the Ministry of Health & Welfare, the Ministry of Food and Drug Safety) (Project Number: 202012A02-02).

Author information

Corresponding author

Correspondence to Min-Kook Choi.

1 Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 290 KB)

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Yoon, J. et al. (2022). Surgical Scene Segmentation Using Semantic Image Synthesis with a Virtual Surgery Environment. In: Wang, L., Dou, Q., Fletcher, P.T., Speidel, S., Li, S. (eds) Medical Image Computing and Computer Assisted Intervention – MICCAI 2022. MICCAI 2022. Lecture Notes in Computer Science, vol 13437. Springer, Cham. https://doi.org/10.1007/978-3-031-16449-1_53

  • DOI: https://doi.org/10.1007/978-3-031-16449-1_53

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-16448-4

  • Online ISBN: 978-3-031-16449-1

  • eBook Packages: Computer Science, Computer Science (R0)
