DeepPhase: Surgical Phase Recognition in CATARACTS Videos

  • Odysseas Zisimopoulos
  • Evangello Flouty
  • Imanol Luengo
  • Petros Giataganas
  • Jean Nehme
  • Andre Chow
  • Danail Stoyanov
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11073)


Automated surgical workflow analysis and understanding can help surgeons standardize procedures and can enhance post-surgical assessment and indexing as well as interventional monitoring. Video-based computer-assisted interventional (CAI) systems can estimate workflow by recognizing surgical instruments and linking them to an ontology of procedural phases. In this work, we adopt a deep learning paradigm to detect surgical instruments in cataract surgery videos; the detections in turn feed a recurrent network for surgical phase inference that encodes the temporal aspects of phase steps within the phase classification. Our models achieve results comparable to the state of the art for surgical tool detection and phase recognition, with accuracies of 99% and 78%, respectively.
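
The two-stage data flow described in the abstract (per-frame instrument detections feeding a recurrent phase classifier) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the recurrent unit is a GRU-style cell, the weights are random and untrained, and the class name `TinyGRU`, the three tools, five hidden units, and four phases are hypothetical. The sketch shows only how a sequence of tool-presence vectors could be turned into per-frame phase probabilities.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(matrix, vector):
    # Plain matrix-vector product on nested lists.
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def softmax(scores):
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


class TinyGRU:
    """Untrained GRU-style cell plus a linear phase classifier (illustrative only)."""

    def __init__(self, n_tools, n_hidden, n_phases, seed=0):
        rng = random.Random(seed)
        n_in = n_tools + n_hidden
        self.n_hidden = n_hidden
        self.w_update = rand_matrix(n_hidden, n_in, rng)
        self.w_reset = rand_matrix(n_hidden, n_in, rng)
        self.w_cand = rand_matrix(n_hidden, n_in, rng)
        self.w_out = rand_matrix(n_phases, n_hidden, rng)

    def step(self, tools, h):
        # Gates see the current tool-presence vector concatenated with the state.
        xh = tools + h
        z = [sigmoid(v) for v in matvec(self.w_update, xh)]
        r = [sigmoid(v) for v in matvec(self.w_reset, xh)]
        # The candidate state uses the reset-gated previous state.
        xrh = tools + [ri * hi for ri, hi in zip(r, h)]
        cand = [math.tanh(v) for v in matvec(self.w_cand, xrh)]
        return [(1 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, cand)]

    def predict(self, frames):
        # Carry the hidden state across frames so earlier tools inform later phases.
        h = [0.0] * self.n_hidden
        outputs = []
        for tools in frames:
            h = self.step(tools, h)
            outputs.append(softmax(matvec(self.w_out, h)))
        return outputs


# Hypothetical detector output: one tool-presence vector per video frame.
frames = [[1.0, 0.0, 0.0], [1.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
model = TinyGRU(n_tools=3, n_hidden=5, n_phases=4)
phase_probs = model.predict(frames)
predicted = [max(range(4), key=p.__getitem__) for p in phase_probs]
print(predicted)
```

Because the weights are untrained, the printed phase indices carry no meaning. In a trained system the tool-presence vectors would come from the instrument detector and the recurrent weights would be learned from annotated videos.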


Surgical vision, Instrument detection, Surgical workflow, Deep learning, Surgical data science



Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  • Odysseas Zisimopoulos (1, corresponding author)
  • Evangello Flouty (1)
  • Imanol Luengo (1)
  • Petros Giataganas (1)
  • Jean Nehme (1)
  • Andre Chow (1)
  • Danail Stoyanov (1, 2)
  1. Digital Surgery, Kinosis, Ltd., London, UK
  2. University College London, London, UK
