Video assessment of laparoscopic skills by novices and experts: implications for surgical education
Previous investigators have shown that novices are able to assess surgical skills as reliably as expert surgeons. The purpose of this study was to determine how novices and experts arrive at these graded scores when assessing laparoscopic skills, and the implications this may have for surgical education.
Four novices and four general laparoscopic surgeons evaluated 59 videos of a suturing task using a 5-point scale. Average novice and expert evaluator scores for each video and the average number of times that scores were changed were compared. Intraclass correlation coefficients (ICCs) were used to determine inter-rater and test–retest reliability. Evaluators were asked to estimate how many videos they needed to watch before they could grade confidently, and to describe how they distinguished between different levels of expertise.
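As an aside for readers reproducing this type of analysis: the abstract does not state which ICC model was used, but inter-rater agreement across a fixed panel of raters is often computed with a two-way random-effects, single-rater ICC(2,1) in the Shrout and Fleiss taxonomy. The sketch below is an illustration of that computation under this assumption, not the authors' actual analysis; the function name and example data are hypothetical.

```python
import numpy as np

def icc2_1(ratings):
    """Two-way random-effects, single-rater ICC(2,1) (Shrout & Fleiss).

    ratings: n_subjects x n_raters array of scores (e.g., one row per
    video, one column per evaluator).
    """
    ratings = np.asarray(ratings, dtype=float)
    n, k = ratings.shape
    grand = ratings.mean()
    row_means = ratings.mean(axis=1)   # per-subject (per-video) means
    col_means = ratings.mean(axis=0)   # per-rater means

    # Partition total variability into subject, rater, and residual parts
    ss_rows = k * ((row_means - grand) ** 2).sum()
    ss_cols = n * ((col_means - grand) ** 2).sum()
    ss_total = ((ratings - grand) ** 2).sum()
    ss_err = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)             # between-subject mean square
    msc = ss_cols / (k - 1)             # between-rater mean square
    mse = ss_err / ((n - 1) * (k - 1))  # residual mean square

    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)

# Perfect agreement across raters yields an ICC of 1
print(icc2_1([[1, 1], [2, 2], [3, 3]]))  # -> 1.0
```

A systematic offset between raters (one rater consistently scoring higher) lowers ICC(2,1), because the two-way model treats rater bias as disagreement; this is one reason the specific ICC form matters when interpreting reported reliability.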
There were no significant differences in mean scores assigned by the two evaluator groups. Novices changed their scores more frequently compared to experts, but this did not reach statistical significance. There was excellent inter-rater reliability between the two groups (ICC = 0.91, CI 0.85–0.95) and good test–retest reliability (ICC > 0.83). On average, novices and experts reported that they needed to watch 13.8 ± 2.4 and 8.5 ± 2.5 videos, respectively, before they could confidently grade. Both groups also identified similar qualitative indicators (e.g., instrument control).
Evaluators with varying levels of expertise can reliably grade performance of an intracorporeal suturing task. While novices were less confident in their grading, both groups were able to assign comparable scores and identify similar elements of a suturing skill as being important in terms of assessment.
Keywords: Video assessment · Suturing skill · Laparoscopic · Novice evaluators
This project was supported by the Comprehensive Research Experience for Medical Students (CREMS) program and the Department of Surgery at the University of Toronto. The authors would also like to acknowledge Dr. Paul Wales for providing us with his expertise in statistics and Dr. James Rutka for his continuous support of this project.
This study was funded by the University of Toronto Comprehensive Research Experience for Medical Students and by the University of Toronto Department of Surgery.
Celine Yeung, Dr. Brian Carrillo, Victor Pope, Shahob Hosseinpour, Dr. J. Ted Gerstle, and Dr. Georges Azzie have no conflicts of interest or financial ties to disclose.