Surgical Endoscopy, Volume 31, Issue 10, pp 3883–3889

Video assessment of laparoscopic skills by novices and experts: implications for surgical education

  • Celine Yeung
  • Brian Carrillo
  • Victor Pope
  • Shahob Hosseinpour
  • J. Ted Gerstle
  • Georges Azzie



Previous investigators have shown that novices are able to assess surgical skills as reliably as expert surgeons. The purpose of this study was to determine how novices and experts arrive at these graded scores when assessing laparoscopic skills and the potential implications this may have for surgical education.


Four novices and four general laparoscopic surgeons evaluated 59 videos of a suturing task using a 5-point scale. Average novice and expert evaluator scores for each video and the average number of times that scores were changed were compared. Intraclass correlation coefficients were used to determine inter-rater and test–retest reliability. Evaluators were asked to define the number of videos they needed to watch before they could confidently grade and to describe how they were able to distinguish between different levels of expertise.
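As an illustration of the reliability statistic named above, the following is a minimal sketch of one common intraclass correlation coefficient, ICC(2,1) (two-way random effects, absolute agreement, single rater). The abstract does not state which ICC form or software the authors used, so the choice of ICC(2,1), the function name `icc2_1`, and the toy scores are all assumptions made for illustration only.

```python
from typing import Sequence


def icc2_1(scores: Sequence[Sequence[float]]) -> float:
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    scores[i][j] is the score that rater j assigned to video i.
    Assumption: this ICC form is illustrative, not the study's stated method.
    """
    n = len(scores)       # number of videos (rows)
    k = len(scores[0])    # number of raters (columns)
    if n < 2 or k < 2 or any(len(row) != k for row in scores):
        raise ValueError("need a complete table with >= 2 videos and >= 2 raters")

    grand_mean = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Two-way ANOVA sums of squares: videos, raters, and residual error.
    ss_total = sum((x - grand_mean) ** 2 for row in scores for x in row)
    ss_rows = k * sum((m - grand_mean) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand_mean) ** 2 for m in col_means)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    denom = ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    if denom == 0:
        raise ValueError("ICC is undefined when all scores are identical")
    return (ms_rows - ms_error) / denom


# Hypothetical 5-point scores: 4 videos (rows) by 3 raters (columns).
toy_scores = [[1, 2, 1], [3, 3, 4], [5, 4, 5], [2, 2, 3]]
print(round(icc2_1(toy_scores), 3))
```

Values near 1 indicate that raters assign nearly the same score to each video. Because this form measures absolute agreement, a rater who is consistently one point harsher than the others lowers the coefficient.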


There were no significant differences in mean scores assigned by the two evaluator groups. Novices changed their scores more frequently than experts, but this difference did not reach statistical significance. There was excellent inter-rater reliability between the two groups (ICC = 0.91, CI 0.85–0.95) and good test–retest reliability (ICC > 0.83). On average, novices and experts reported that they needed to watch 13.8 ± 2.4 and 8.5 ± 2.5 videos, respectively, before they could confidently grade. Both groups also identified similar qualitative indicators (e.g., instrument control).


Evaluators with varying levels of expertise can reliably grade performance of an intracorporeal suturing task. While novices were less confident in their grading, both groups were able to assign comparable scores and identify similar elements of a suturing skill as being important in terms of assessment.


Video assessment · Suturing skill · Laparoscopic · Novice evaluators



This project was supported by the Comprehensive Research Experience for Medical Students (CREMS) program and the Department of Surgery at the University of Toronto. The authors would also like to acknowledge Dr. Paul Wales for providing us with his expertise in statistics and Dr. James Rutka for his continuous support of this project.


This study was funded by the University of Toronto Comprehensive Research Experience for Medical Students and by the University of Toronto Department of Surgery.


Celine Yeung, Dr. Brian Carrillo, Victor Pope, Shahob Hosseinpour, Dr. J. Ted Gerstle, and Dr. Georges Azzie have no conflicts of interest or financial ties to disclose.



Copyright information

© Springer Science+Business Media New York 2017

Authors and Affiliations

  • Celine Yeung (1)
  • Brian Carrillo (2, 4)
  • Victor Pope (3)
  • Shahob Hosseinpour (1)
  • J. Ted Gerstle (2, 4)
  • Georges Azzie (2, 4; corresponding author)

  1. Faculty of Medicine, University of Toronto, Toronto, Canada
  2. Division of General and Thoracic Surgery, The Hospital for Sick Children, Toronto, Canada
  3. Division of Otolaryngology, The Hospital for Sick Children, Toronto, Canada
  4. Centre for Image-Guided Innovation and Therapeutic Intervention, Toronto, Canada
