Automatic Alignment of Surgical Videos Using Kinematic Data

  • Hassan Ismail Fawaz
  • Germain Forestier
  • Jonathan Weber
  • François Petitjean
  • Lhassane Idoumghar
  • Pierre-Alain Muller
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11526)


Over the past one hundred years, the classic teaching methodology of “see one, do one, teach one” has governed surgical education systems worldwide. With the advent of Operation Room 2.0, recording video, kinematic and many other types of data during surgery has become an easy task, allowing artificial intelligence systems to be deployed and used in surgical and medical practice. Recently, surgical videos have been shown to provide a structure for peer coaching, enabling novice trainees to learn from experienced surgeons by replaying those videos. However, the high inter-operator variability in surgical gesture duration and execution makes it very difficult to learn by comparing novice and expert surgical videos. In this paper, we propose a novel technique to align multiple videos based on the alignment of their corresponding kinematic multivariate time series data. By leveraging the Dynamic Time Warping measure, our algorithm synchronizes a set of videos so that they show the same gesture being performed at different speeds. We believe that the proposed approach is a valuable addition to the existing learning tools for surgery.
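The core idea of aligning two videos through their kinematic data can be illustrated with a minimal sketch. It is not the authors' implementation: it handles only a pair of recordings rather than a set, it assumes one kinematic sample per video frame, and the names `dtw_path` and `synchronized_frames` are hypothetical helpers introduced here for illustration. The sketch computes a Dynamic Time Warping path between two multivariate series using the squared Euclidean distance across all dimensions, then reads the path as a playback schedule for the two videos.

```python
import numpy as np


def dtw_path(a, b):
    """Dynamic Time Warping between two multivariate series.

    a has shape (n, d) and b has shape (m, d); each row is one time step.
    Returns the accumulated cost and the warping path as (i, j) index pairs.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    n, m = len(a), len(b)

    # Squared Euclidean distance between every pair of time steps.
    dist = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=2)

    # acc[i, j] is the best cost of aligning a[:i] with b[:j].
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            acc[i, j] = dist[i - 1, j - 1] + min(
                acc[i - 1, j - 1], acc[i - 1, j], acc[i, j - 1]
            )

    # Backtrack from the end of both series to recover the path.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        steps = (acc[i - 1, j - 1], acc[i - 1, j], acc[i, j - 1])
        k = int(np.argmin(steps))
        if k == 0:
            i, j = i - 1, j - 1
        elif k == 1:
            i -= 1
        else:
            j -= 1
    path.reverse()
    return acc[n, m], path


def synchronized_frames(path):
    """Turn a warping path into two frame schedules of equal length.

    Assumption: kinematic sample k corresponds to video frame k.
    A repeated index means that video holds its frame while the other advances.
    """
    frames_a = [i for i, _ in path]
    frames_b = [j for _, j in path]
    return frames_a, frames_b


# Toy example: the second series performs the same motion with a slower start.
fast = [[0.0], [1.0], [2.0]]
slow = [[0.0], [0.0], [1.0], [2.0]]
cost, path = dtw_path(fast, slow)
frames_fast, frames_slow = synchronized_frames(path)
```

In this toy case the first video would hold its first frame for one extra step so that both recordings reach the same point of the motion together. Aligning more than two videos requires a multiple-alignment strategy, which this pairwise sketch does not cover.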


Keywords: Dynamic Time Warping, Multivariate time series, Video synchronization, Surgical education



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Hassan Ismail Fawaz (1, corresponding author)
  • Germain Forestier (1, 2)
  • Jonathan Weber (1)
  • François Petitjean (2)
  • Lhassane Idoumghar (1)
  • Pierre-Alain Muller (1)

  1. IRIMAS, University of Haute-Alsace, Mulhouse, France
  2. Faculty of Information Technology, Monash University, Melbourne, Australia
