
Frame-Based Classification of Operation Phases in Cataract Surgery Videos

  • Manfred Jürgen Primus (corresponding author)
  • Doris Putzgruber-Adamitsch
  • Mario Taschwer
  • Bernd Münzer
  • Yosuf El-Shabrawi
  • Laszlo Böszörmenyi
  • Klaus Schoeffmann
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10704)

Abstract

Cataract surgeries are frequently performed to correct an opacification of the human eye lens, which typically develops with age. These surgeries are conducted with the help of a microscope and are usually recorded on video for later inspection and educational purposes. However, post-hoc visual analysis of video recordings is cumbersome and time-consuming for surgeons if there is no navigation support, such as bookmarks to specific operation phases. To pave the way for automatic detection of operation phases in cataract surgery videos, we investigate the effectiveness of a deep convolutional neural network (CNN) at automatically assigning video frames to operation phases, which can be regarded as a single-label multi-class classification problem. In the absence of public datasets of cataract surgery videos, we provide a dataset of 21 videos of standardized cataract surgeries and use it to train and evaluate our CNN classifier. Experimental results show a mean F1 score of about 68% for frame-based operation phase classification, which improves to about 75% when temporal information of video frames is incorporated into the CNN architecture.
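The mean F1 score mentioned above is commonly computed as the macro average of per-class F1 over all operation phases. A minimal pure-Python sketch of that metric for per-frame phase predictions (function name and toy labels are illustrative, not taken from the paper):

```python
def macro_f1(y_true, y_pred, num_classes):
    """Mean of per-class F1 scores over all phase labels (macro average)."""
    f1_scores = []
    for c in range(num_classes):
        # per-class counts over all frames
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        f1_scores.append(f1)
    return sum(f1_scores) / num_classes

# Toy example: ground-truth and predicted phase labels for six frames, three phases.
truth = [0, 0, 1, 1, 2, 2]
pred = [0, 1, 1, 1, 2, 0]
print(round(macro_f1(truth, pred, 3), 3))
```

Macro averaging weights all phases equally, which matters here because operation phases in a surgery video differ greatly in duration.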

Keywords

Medical multimedia · Deep learning · Video analysis · Surgical workflow analysis

Notes

Acknowledgement

This work was supported by Universität Klagenfurt and Lakeside Labs GmbH, Klagenfurt, Austria, with funding from the European Regional Development Fund and the Carinthian Economic Promotion Fund (KWF) under grant KWF-20214 U. 3520/26336/38165.


Copyright information

© Springer International Publishing AG 2018

Authors and Affiliations

  • Manfred Jürgen Primus (1, corresponding author)
  • Doris Putzgruber-Adamitsch (2)
  • Mario Taschwer (1)
  • Bernd Münzer (1)
  • Yosuf El-Shabrawi (2)
  • Laszlo Böszörmenyi (1)
  • Klaus Schoeffmann (1)

  1. Alpen-Adria Universität Klagenfurt, Klagenfurt, Austria
  2. Klinikum Klagenfurt am Wörthersee, Klagenfurt, Austria
