
Unsupervised Surgical Instrument Segmentation via Anchor Generation and Semantic Diffusion

Part of the Lecture Notes in Computer Science book series (LNIP, volume 12263)

Abstract

Surgical instrument segmentation is a key component in developing context-aware operating rooms. Existing work on this task relies heavily on supervision from large amounts of labeled data, which involves laborious and expensive human effort. In contrast, this paper develops a more affordable unsupervised approach. To train our model, we first generate anchors as pseudo labels for instruments and background tissues, respectively, by fusing coarse handcrafted cues. A semantic diffusion loss is then proposed to resolve the ambiguity in the generated anchors via the feature correlation between adjacent video frames. In experiments on the binary instrument segmentation task of the 2017 MICCAI EndoVis Robotic Instrument Segmentation Challenge dataset, the proposed method achieves 0.71 IoU and 0.81 Dice score without using a single manual annotation, demonstrating the potential of unsupervised learning for surgical tool segmentation.
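To make the training objective more concrete, the sketch below shows one plausible way to realize a semantic diffusion loss in PyTorch. It is a minimal illustration under our own assumptions: the cosine-similarity correlation, the softmax temperature, and all function and variable names (semantic_diffusion_loss, feat_t, pred_tp1, ...) are hypothetical rather than the authors' published formulation.

    import torch
    import torch.nn.functional as F

    def semantic_diffusion_loss(feat_t, feat_tp1, pseudo_t, pred_tp1, tau=0.1):
        # Hypothetical sketch (not the paper's exact loss).
        # feat_t, feat_tp1: (B, C, H, W) backbone features of adjacent frames.
        # pseudo_t:         (B, 1, H, W) anchor pseudo labels in [0, 1] for frame t.
        # pred_tp1:         (B, 1, H, W) sigmoid prediction for frame t+1.
        f_t = F.normalize(feat_t.flatten(2), dim=1)      # (B, C, HW)
        f_tp1 = F.normalize(feat_tp1.flatten(2), dim=1)  # (B, C, HW)
        # Pairwise cosine correlation between pixels of the two frames.
        corr = torch.bmm(f_tp1.transpose(1, 2), f_t)     # (B, HW, HW)
        attn = F.softmax(corr / tau, dim=-1)             # row-normalized weights
        # Diffuse frame-t pseudo labels onto frame t+1 through the correlation.
        labels_t = pseudo_t.flatten(2).transpose(1, 2)   # (B, HW, 1)
        diffused = torch.bmm(attn, labels_t)             # (B, HW, 1), stays in [0, 1]
        pred = pred_tp1.flatten(2).transpose(1, 2).clamp(1e-6, 1 - 1e-6)
        # Supervise the next-frame prediction with the diffused labels.
        return F.binary_cross_entropy(pred, diffused.detach())

In the actual method, the anchors for instruments and background would presumably also gate which pixels contribute to the loss; that bookkeeping is omitted here for brevity.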

Keywords

  • Surgical instrument segmentation
  • Unsupervised learning
  • Semantic diffusion

D. Liu and Y. Wei contributed equally to this work.


Notes

  1. By unsupervised, we mean no manual annotation of surgical instruments is used.

  2. [0, 1] means values are between 0 and 1, both inclusive.
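For completeness, the reported IoU and Dice scores for binary segmentation can be computed as follows. This is the standard formulation, assuming soft predictions in [0, 1] (Note 2) thresholded at 0.5; the challenge's exact evaluation protocol may differ in details such as per-frame averaging.

    import numpy as np

    def iou_and_dice(pred, gt, threshold=0.5):
        # Standard binary IoU and Dice; assumed to mirror, not exactly
        # reproduce, the challenge's evaluation protocol.
        p = np.asarray(pred) >= threshold   # soft scores in [0, 1] -> binary mask
        g = np.asarray(gt).astype(bool)
        inter = np.logical_and(p, g).sum()
        union = np.logical_or(p, g).sum()
        iou = inter / union if union else 1.0   # empty vs. empty counts as perfect
        denom = p.sum() + g.sum()
        dice = 2.0 * inter / denom if denom else 1.0
        return float(iou), float(dice)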


Acknowledgments

This work was partially supported by MOST-2018AAA0102004 and the Natural Science Foundation of China under contracts 61572042, 61527804, and 61625201. We also acknowledge the Clinical Medicine Plus X-Young Scholars Project and the High-Performance Computing Platform of Peking University for providing computational resources. We thank Boshuo Wang for making the video demo.

Author information


Corresponding author

Correspondence to Tingting Jiang.


Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (mp4 62476 KB)


Copyright information

© 2020 Springer Nature Switzerland AG

About this paper


Cite this paper

Liu, D., et al. (2020). Unsupervised Surgical Instrument Segmentation via Anchor Generation and Semantic Diffusion. In: Medical Image Computing and Computer Assisted Intervention – MICCAI 2020. Lecture Notes in Computer Science, vol 12263. Springer, Cham. https://doi.org/10.1007/978-3-030-59716-0_63


  • DOI: https://doi.org/10.1007/978-3-030-59716-0_63

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-59715-3

  • Online ISBN: 978-3-030-59716-0

  • eBook Packages: Computer Science, Computer Science (R0)