Accurate and interpretable evaluation of surgical skills from kinematic data using fully convolutional neural networks

  • Hassan Ismail Fawaz
  • Germain Forestier
  • Jonathan Weber
  • Lhassane Idoumghar
  • Pierre-Alain Muller
Original Article



Manual feedback from senior surgeons observing less experienced trainees is expensive, time-consuming and prone to subjectivity. With the number of surgical procedures increasing annually, there is an unprecedented need for accurate, objective and automatic evaluation of trainees’ surgical skills in order to improve surgical practice.


In this paper, we designed a convolutional neural network (CNN) to classify surgical skills by extracting latent patterns in the trainees’ motions performed during robotic surgery. The method is validated on the JIGSAWS dataset for two surgical skills evaluation tasks: classification and regression.
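As a rough illustration of the idea (not the authors' exact architecture, whose layer sizes are not given here), a fully convolutional network for multivariate kinematic time series can be sketched as stacked 1D convolutions followed by global average pooling and a linear classifier over the three JIGSAWS skill levels. The sketch below implements a single forward pass in plain NumPy with arbitrary, untrained weights:

```python
import numpy as np

rng = np.random.default_rng(0)

def conv1d(x, w, b):
    """Valid 1D convolution with ReLU: x (T, C_in), w (K, C_in, C_out), b (C_out)."""
    T, _ = x.shape
    K, _, C_out = w.shape
    out = np.empty((T - K + 1, C_out))
    for t in range(T - K + 1):
        out[t] = np.tensordot(x[t:t + K], w, axes=([0, 1], [0, 1])) + b
    return np.maximum(out, 0.0)

# Toy kinematic sequence: 200 time steps, 76 channels
# (JIGSAWS records 76 kinematic variables per frame).
x = rng.standard_normal((200, 76))

# Two convolutional blocks with arbitrarily chosen filter counts.
w1, b1 = rng.standard_normal((8, 76, 32)) * 0.05, np.zeros(32)
w2, b2 = rng.standard_normal((5, 32, 64)) * 0.05, np.zeros(64)
h = conv1d(conv1d(x, w1, b1), w2, b2)

# Global average pooling over time, then a softmax over the three
# JIGSAWS skill levels (novice / intermediate / expert).
gap = h.mean(axis=0)                                  # (64,)
W, c = rng.standard_normal((64, 3)) * 0.05, np.zeros(3)
logits = gap @ W + c
probs = np.exp(logits - logits.max())
probs /= probs.sum()
print(probs)   # class probabilities for this toy trial
```

Global average pooling is what makes the network independent of trial length, so surgeries of different durations can be classified without resampling.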


Our results show that deep neural networks constitute robust machine learning models able to reach new competitive state-of-the-art performance on the JIGSAWS dataset. While leveraging the efficiency of CNNs, we minimized their black-box effect using the class activation map technique.


This characteristic allowed our method to automatically pinpoint which parts of the surgery most influenced the skill evaluation, thus explaining a surgical skill classification and providing surgeons with a novel personalized feedback technique. We believe this type of interpretable machine learning model could be integrated into “Operation Room 2.0” and support novice surgeons in improving their skills to eventually become experts.
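The class activation map weights the last convolutional feature maps by the classifier weights of the class of interest, yielding a per-time-step relevance score. A minimal NumPy sketch, with hypothetical shapes and random values standing in for a trained network's activations:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical last-layer feature maps for one surgical trial:
# T' time steps, C channels (values arbitrary for illustration).
feature_maps = rng.standard_normal((189, 64))   # (T', C)
class_weights = rng.standard_normal((64, 3))    # final linear layer (C, n_classes)

def class_activation_map(feats, weights, cls):
    """CAM for class `cls`: channel-weighted sum per time step, scaled to [0, 1]."""
    cam = feats @ weights[:, cls]               # (T',)
    cam -= cam.min()
    return cam / cam.max() if cam.max() > 0 else cam

cam = class_activation_map(feature_maps, class_weights, cls=2)
# Time steps with scores near 1 are those that most influenced the
# chosen class; they can be mapped back onto the corresponding
# portion of the surgical trial for targeted feedback.
print(cam.shape)
```

Because each time step in the feature maps corresponds to a window of the original kinematic sequence, high-scoring regions of the map localize the motions that drove the skill prediction.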


Kinematic data · Surgical education · Deep learning · Time-series classification · Interpretable machine learning



The authors would like to thank the creators of JIGSAWS, as well as NVIDIA Corporation for the GPU grant and the Mésocentre of Strasbourg for providing access to the cluster. The authors would also like to thank the MICCAI 2018 anonymous reviewers for their fruitful comments that helped us improve the quality of this manuscript.

Compliance with ethical standards

Conflict of interest

The authors declare that they have no conflict of interest.

Ethical approval

All procedures performed in studies involving human participants were in accordance with the ethical standards of the institutional and/or national research committee and with the 1964 Helsinki Declaration and its later amendments or comparable ethical standards.

Informed consent

Informed consent was obtained from all individual participants included in the study.



Copyright information

© CARS 2019

Authors and Affiliations

  1. IRIMAS, Université de Haute-Alsace, Mulhouse, France
