Machine learning methods for automated technical skills assessment with instructional feedback in ultrasound-guided interventions



Currently, there is a worldwide shift toward competency-based medical education. This necessitates the use of automated skills assessment methods during self-guided interventions training. Assessment methods that are transparent and configurable allow their results to be translated into instructional feedback. The purpose of this work is to develop and validate transparent and configurable skills assessment methods for ultrasound-guided interventions.


We implemented a method based upon decision trees and a method based upon fuzzy inference systems for technical skills assessment. Subsequently, we validated these methods for their ability to predict scores of operators on a 25-point global rating scale in ultrasound-guided needle insertions and their ability to provide useful feedback for training.
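To make the two approaches concrete, the following is a minimal sketch and not the authors' implementation. It assumes two hypothetical performance metrics (`path_length_mm`, `time_s`), a hand-built tree with made-up thresholds and leaf scores, and a simple fuzzy scheme with two rules per metric combined by a firing-strength weighted average. The `good` and `poor` score values are also assumptions. In both cases the feedback falls directly out of the model's own structure, which is what makes such methods transparent.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Node:
    """One node of a hand-built regression tree (illustrative, not learned)."""
    metric: Optional[str] = None      # metric tested here; None marks a leaf
    threshold: float = 0.0
    low: Optional["Node"] = None      # branch taken when value <= threshold
    high: Optional["Node"] = None     # branch taken when value > threshold
    score: float = 0.0                # predicted score stored at a leaf


def tree_assess(root: Node, metrics: Dict[str, float]) -> Tuple[float, List[str]]:
    """Walk the tree; the tests along the path double as feedback."""
    node, feedback = root, []
    while node.metric is not None:
        value = metrics[node.metric]
        if value <= node.threshold:
            feedback.append(f"{node.metric} ({value}) is within {node.threshold}")
            node = node.low
        else:
            feedback.append(f"{node.metric} ({value}) exceeds {node.threshold}")
            node = node.high
    return node.score, feedback


def membership_low(x: float, lo: float, hi: float) -> float:
    """Degree to which x is 'low': 1 at or below lo, 0 at or above hi."""
    if x <= lo:
        return 1.0
    if x >= hi:
        return 0.0
    return (hi - x) / (hi - lo)


def fuzzy_assess(metrics, ranges, good=25.0, poor=5.0):
    """Two rules per metric: low -> good score, high -> poor score.
    Rule outputs are combined by a firing-strength weighted average."""
    weighted_sum, total_weight, worst = 0.0, 0.0, (None, -1.0)
    for name, value in metrics.items():
        mu_low = membership_low(value, *ranges[name])
        mu_high = 1.0 - mu_low
        weighted_sum += mu_low * good + mu_high * poor
        total_weight += mu_low + mu_high
        if mu_high > worst[1]:        # remember the metric that most lowers the score
            worst = (name, mu_high)
    return weighted_sum / total_weight, f"focus on reducing {worst[0]}"


# Hypothetical tree, membership ranges, and trial; all values are made up.
tree = Node("path_length_mm", 350.0,
            low=Node("time_s", 45.0, low=Node(score=22.0), high=Node(score=16.0)),
            high=Node(score=9.0))
ranges = {"path_length_mm": (200.0, 400.0), "time_s": (20.0, 60.0)}
trial = {"path_length_mm": 300.0, "time_s": 50.0}

tree_score, tree_feedback = tree_assess(tree, trial)
fuzzy_score, fuzzy_feedback = fuzzy_assess(trial, ranges)
print(tree_score, tree_feedback)
print(fuzzy_score, fuzzy_feedback)
```

In practice the tree structure and membership ranges would be learned from expert-rated data or set by an instructor. The latter is where configurability comes in.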


Decision tree and fuzzy rule-based assessment performed comparably to state-of-the-art assessment methods. On the 25-point scale, decision tree assessment produced median errors of 1.7 for in-plane insertions and 1.5 for out-of-plane insertions; fuzzy rule-based assessment produced median errors of 1.8 and 3.0, respectively. In addition, both methods provided feedback that was useful for trainee learning. Decision tree assessment produced feedback with a median usefulness of 7 out of 7; fuzzy rule-based assessment produced feedback with a median usefulness of 6 out of 7.
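As a brief illustration of the error measure, the sketch below assumes that "error" means the absolute difference between a predicted score and the expert-assigned score for each trial, summarized by the median. The score lists are hypothetical values chosen for the example, not data from the study.

```python
from statistics import median
from typing import Sequence


def median_absolute_error(predicted: Sequence[float], expert: Sequence[float]) -> float:
    """Median of per-trial absolute differences between predicted and expert scores."""
    if len(predicted) != len(expert):
        raise ValueError("predicted and expert must have the same length")
    return median([abs(p - e) for p, e in zip(predicted, expert)])


# Hypothetical scores on a 25-point scale (illustrative only).
predicted_scores = [18.0, 12.5, 22.0, 9.0, 15.5]
expert_scores = [20.0, 11.0, 22.0, 12.0, 14.0]

error = median_absolute_error(predicted_scores, expert_scores)
print(error)
```

The median is less sensitive than the mean to a few badly mispredicted trials, which matters when the number of rated trials is small.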


Transparent and configurable assessment methods are comparable to the state of the art and, in addition, can provide useful feedback. This demonstrates their value in self-guided interventions training curricula.





Matthew S. Holden is supported by the Link Foundation Fellowship in Modeling, Simulation, and Training. Gabor Fichtinger is supported by a Canada Research Chair in Computer-Integrated Surgery.

Author information



Corresponding author

Correspondence to Matthew S. Holden.

Ethics declarations

Conflict of interest

All authors declare that they have no conflict of interest.

Ethical approval

All procedures in this study involving human participants were performed in accordance with the ethical standards of the institution and were approved by the research ethics board at Queen’s University. This study does not contain any procedures involving animals.

Informed consent

All participation was voluntary, and written informed consent was obtained from all participants.


About this article


Cite this article

Holden, M.S., Xia, S., Lia, H. et al. Machine learning methods for automated technical skills assessment with instructional feedback in ultrasound-guided interventions. Int J CARS 14, 1993–2003 (2019).



  • Ultrasound-guided needle insertion
  • Simulation-based training
  • Medical education
  • Objective skill assessment