Skip to main content

Evaluating the Generalization Performance of Instrument Classification in Cataract Surgery Videos

  • Conference paper
  • First Online:
MultiMedia Modeling (MMM 2020)

Part of the book series: Lecture Notes in Computer Science ((LNISA,volume 11962))

Included in the following conference series:

Abstract

In the field of ophthalmic surgery, many clinicians nowadays record their microscopic procedures with a video camera and use the recorded footage for later purpose, such as forensics, teaching, or training. However, in order to efficiently use the video material after surgery, the video content needs to be analyzed automatically. Important semantic content to be analyzed and indexed in these short videos are operation instruments, since they provide an indication of the corresponding operation phase and surgical action. Related work has already shown that it is possible to accurately detect instruments in cataract surgery videos. However, their underlying dataset (from the CATARACTS challenge) has very good visual quality, which is not reflecting the typical quality of videos acquired in general hospitals. In this paper, we therefore analyze the generalization performance of deep learning models for instrument recognition in terms of dataset change. More precisely, we trained such models as ResNet-50, Inception v3 and NASNet Mobile using a dataset of high visual quality (CATARACTS) and test it on another dataset with low visual quality (Cataract-101), and vice versa. Our results show that the generalizability is surprisingly low in general, but slightly worse for the model trained on the high-quality dataset.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Subscribe and save

Springer+ Basic
$34.99 /Month
  • Get 10 units per month
  • Download Article/Chapter or eBook
  • 1 Unit = 1 Article or 1 Chapter
  • Cancel anytime
Subscribe now

Buy Now

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 89.00
Price excludes VAT (USA)
  • Available as EPUB and PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 119.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Similar content being viewed by others

References

  1. Primus, M.J., et al.: Frame-based classification of operation phases in cataract surgery videos. In: Schoeffmann, K., et al. (eds.) MMM 2018. LNCS, vol. 10704, pp. 241–253. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-73603-7_20

    Chapter  Google Scholar 

  2. Quellec, G., Lamard, M., Cochener, B., Cazuguel, G.: Real-time task recognition in cataract surgery videos using adaptive spatiotemporal polynomials. IEEE Trans. Med. Imaging 34(4), 877–887 (2014)

    Article  Google Scholar 

  3. Hajj, H.A., Lamard, M., Conze, P.-H., Cochener, B., Quellec, G.: Monitoring tool usage in surgery videos using boosted convolutional and recurrent neural networks. Med. Image Anal. 47, 203–218 (2018)

    Article  Google Scholar 

  4. Hajj, H.A., et al.: CATARACTS: challenge on automatic tool annotation for cataract surgery. Med. Image Anal. 52, 24–41 (2019)

    Article  Google Scholar 

  5. Schoeffmann, K., Taschwer, M., Sarny, S., Münzer, B., Primus, M.J., Putzgruber, D.: Cataract-101: video dataset of 101 cataract surgeries. In: Proceedings of the 9th ACM Multimedia Systems Conference, pp. 421–425. ACM (2018)

    Google Scholar 

  6. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778 (2016)

    Google Scholar 

  7. Szegedy, C., Vanhoucke, V., Ioffe, S., Shlens, J., Wojna, Z.: Rethinking the inception architecture for computer vision. CoRR, vol. abs/1512.00567 (2015)

    Google Scholar 

  8. Zoph, B., Vasudevan, V., Shlens, J., Le, Q.V.: Learning transferable architectures for scalable image recognition. CoRR, vol. abs/1707.07012 (2017)

    Google Scholar 

  9. Charrière, K., Quellec, G., Lamard, M., Coatrieux, G., Cochener, B., Cazuguel, G.: Automated surgical step recognition in normalized cataract surgery videos. In: 2014 36th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, pp. 4647–4650, August 2014

    Google Scholar 

  10. Quellec, G., Lamard, M., Cochener, B., Cazuguel, G.: Real-time segmentation and recognition of surgical tasks in cataract surgery videos. IEEE Trans. Med. Imaging 33(12), 2352–2360 (2014)

    Article  Google Scholar 

  11. Charriere, K., et al.: Real-time multilevel sequencing of cataract surgery videos. In: 2016 14th International Workshop on Content-Based Multimedia Indexing (CBMI), pp. 1–6, June 2016

    Google Scholar 

  12. Charrière, K., et al.: Real-time analysis of cataract surgery videos using statistical models. Multimed. Tools Appl. 76(21), 22473–22491 (2017)

    Article  Google Scholar 

  13. Al Hajj, H., Lamard, M., Cochener, B., Quellec, G.: Smart data augmentation for surgical tool detection on the surgical tray. In: 2017 39th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), pp. 4407–4410, July 2017

    Google Scholar 

  14. Al Hajj, H., Lamard, M., Charrière, K., Cochener, B., Quellec, G.: Surgical tool detection in cataract surgery videos through multi-image fusion inside a convolutional neural network. In: 2017 39th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), pp. 2002–2005, July 2017

    Google Scholar 

  15. Zisimopoulos, O., et al.: DeepPhase: surgical phase recognition in CATARACTS videos. In: Frangi, A.F., Schnabel, J.A., Davatzikos, C., Alberola-López, C., Fichtinger, G. (eds.) MICCAI 2018. LNCS, vol. 11073, pp. 265–272. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-00937-3_31

    Chapter  Google Scholar 

  16. Halevy, A., Norvig, P., Pereira, F.: The unreasonable effectiveness of data. IEEE Intell. Syst. 24(2), 8–12 (2009)

    Article  Google Scholar 

  17. Vardazaryan, A., Mutter, D., Marescaux, J., Padoy, N.: Weakly-supervised learning for tool localization in laparoscopic videos. In: Stoyanov, D., et al. (eds.) LABELS/CVII/STENT-2018. LNCS, vol. 11043, pp. 169–179. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-01364-6_19

    Chapter  Google Scholar 

Download references

Acknowledgments

This work was funded by the FWF Austrian Science Fund under grant P 31486-N31.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Natalia Sokolova .

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Check for updates. Verify currency and authenticity via CrossMark

Cite this paper

Sokolova, N., Schoeffmann, K., Taschwer, M., Putzgruber-Adamitsch, D., El-Shabrawi, Y. (2020). Evaluating the Generalization Performance of Instrument Classification in Cataract Surgery Videos. In: Ro, Y., et al. MultiMedia Modeling. MMM 2020. Lecture Notes in Computer Science(), vol 11962. Springer, Cham. https://doi.org/10.1007/978-3-030-37734-2_51

Download citation

  • DOI: https://doi.org/10.1007/978-3-030-37734-2_51

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-37733-5

  • Online ISBN: 978-3-030-37734-2

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics