Pairwise Comparison-Based Objective Score for Automated Skill Assessment of Segments in a Surgical Task

  • Anand Malpani
  • S. Swaroop Vedula
  • Chi Chiung Grace Chen
  • Gregory D. Hager
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8498)


Current methods for manual evaluation of surgical skill yield a global score for the entire task. The global score does not inform surgical trainees about where in the task they need to improve. We developed and evaluated a framework to automatically generate an objective score for assessing skill in maneuvers (circumscribed segments) within a surgical task. We used an existing video and kinematic data set (with manual annotation for maneuvers) of a suturing and knot-tying task performed by 18 surgeons on a bench-top model using a da Vinci® Surgical System (Intuitive Surgical, Inc., CA). We collected crowd annotations of preferences: for each presented pair of maneuvers, annotators indicated which maneuver appeared to have been performed with greater skill and how confident they were in that annotation. We trained a classifier to automatically predict preferences using quantitative metrics of time and motion. We generated an objective percentile score for skill assessment by comparing each maneuver sample to all remaining samples in the data set. Accuracy of the classifier for assigning a preference to pairs of maneuvers was at least 80.06% against a single individual (with a larger training data set) and at least 68.0% against each of the seven individuals (with a smaller training data set). Our reliability analyses indicate that automated preference annotations by the classifier are consistent with those by the seven individuals. Trial-level scores computed from maneuver-level scores generated using our framework were moderately correlated with global rating scores assigned by an experienced surgeon (Spearman correlation = 0.47; P-value < 0.0001).
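The percentile scoring step described in the abstract can be illustrated with a minimal sketch. It assumes that a sample's score is the percentage of the remaining samples over which it is preferred. The abstract does not specify the classifier or the exact metrics, so `toy_prefers`, the two-element feature tuples (completion time, path length), and the function name `percentile_scores` are hypothetical stand-ins, not the authors' implementation.

```python
from typing import Callable, List, Sequence


def percentile_scores(
    samples: Sequence[Sequence[float]],
    prefers: Callable[[Sequence[float], Sequence[float]], bool],
) -> List[float]:
    """Score each sample by the percentage of the remaining samples
    over which it is preferred, according to `prefers`."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples to compare")
    scores = []
    for i, a in enumerate(samples):
        # Compare sample i against every other sample in the data set.
        wins = sum(
            1 for j, b in enumerate(samples) if j != i and prefers(a, b)
        )
        scores.append(100.0 * wins / (n - 1))
    return scores


def toy_prefers(a: Sequence[float], b: Sequence[float]) -> bool:
    # Hypothetical stand-in for the trained preference classifier:
    # prefer the shorter completion time, then the shorter path length.
    return (a[0], a[1]) < (b[0], b[1])


# Hypothetical maneuver features: (completion time, path length).
maneuvers = [(12.0, 30.0), (20.0, 55.0), (15.0, 42.0), (9.0, 28.0)]
scores = percentile_scores(maneuvers, toy_prefers)
```

In this toy data the fourth maneuver is preferred over all three others and the second over none, so they sit at the two ends of the scale. In the framework itself, a classifier trained on the crowd preference annotations would take the place of `toy_prefers`.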





Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Anand Malpani (1)
  • S. Swaroop Vedula (1)
  • Chi Chiung Grace Chen (2)
  • Gregory D. Hager (1)
  1. Dept. of Computer Science, Johns Hopkins University, USA
  2. Dept. of Gynecology and Obstetrics, Johns Hopkins University School of Medicine, USA
