
Keyframe extraction in endoscopic video


In medical endoscopy, more and more surgeons archive recorded video streams in long-term storage. One reason for this development, which is required by law in some countries, is to have evidence in case of lawsuits from patients. Another, more practical reason is to allow later inspection of previous procedures and to use parts of such videos for research and training. However, because of the large amount of video data recorded in a hospital on a daily basis, good preview images are essential for quickly filtering out undesired content and for browsing such a video archive more easily. Unfortunately, common shot detection and keyframe extraction methods cannot be used for this video data, because these videos contain unedited and highly similar content, especially in terms of color and texture, and no shot boundaries at all. We propose a new keyframe extraction approach for this special video domain and show that our method is significantly better than a previously proposed approach.








This work was supported by Lakeside Labs GmbH, Klagenfurt, Austria, and by funding from the European Regional Development Fund and the Carinthian Economic Promotion Fund (KWF) under grant KWF-20214 22573 33955, and partially by the Hungarian National Development Agency under grant HUMAN_MB08-1-2011-0010.

Author information

Corresponding author

Correspondence to Klaus Schoeffmann.


About this article


Cite this article

Schoeffmann, K., Del Fabro, M., Szkaliczki, T. et al. Keyframe extraction in endoscopic video. Multimed Tools Appl 74, 11187–11206 (2015).



  • Keyframe extraction
  • Video segmentation
  • Endoscopy
  • Medical imaging