Abstract
This work proposes the use of feature tracks for detecting high-level features (concepts) in video. Extending previous work on local interest point detection and description in images, feature tracks are defined as sets of local interest points that are found in different frames of a video shot and exhibit spatio-temporal and visual continuity, thus defining a trajectory in the 2D+Time space. These tracks jointly capture the spatial attributes of 2D local regions and their corresponding long-term motion. Extracting feature tracks, then selecting and representing an appropriate subset of them, allows the generation of a Bag-of-Spatiotemporal-Words model for the shot, which helps capture the dynamics of video content. Experimental evaluation of the proposed approach on two challenging datasets (TRECVID 2007, TRECVID 2010) shows that the selection, representation and use of such feature tracks enhance the results of traditional keyframe-based concept detection techniques.
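The idea of a feature track and of a Bag-of-Spatiotemporal-Words histogram can be illustrated with a minimal sketch. It is not the chapter's algorithm: the greedy frame-to-frame linking, the thresholds `max_shift` and `max_desc_dist`, the track vector (mean descriptor plus mean per-frame displacement), the minimum track length `min_len` and the tiny hand-written `codebook` are all assumptions made for illustration. Interest points are taken to be precomputed `(x, y, descriptor)` tuples, one list per frame.

```python
import math

def link_tracks(frames, max_shift=5.0, max_desc_dist=0.5):
    """Greedily link interest points across consecutive frames.

    A point extends a track when it lies within max_shift pixels of the
    track's last point (spatio-temporal continuity) and within
    max_desc_dist of its descriptor (visual continuity). Both thresholds
    are illustrative assumptions.
    """
    finished, open_tracks = [], []
    for t, points in enumerate(frames):
        used, still_open = set(), []
        for track in open_tracks:
            _, (px, py, pdesc) = track[-1]
            best, best_dist = None, None
            for i, (x, y, desc) in enumerate(points):
                if i in used:
                    continue
                shift = math.hypot(x - px, y - py)
                desc_dist = math.dist(desc, pdesc)
                if shift <= max_shift and desc_dist <= max_desc_dist:
                    if best is None or desc_dist < best_dist:
                        best, best_dist = i, desc_dist
            if best is None:
                finished.append(track)          # track ends here
            else:
                used.add(best)
                track.append((t, points[best]))
                still_open.append(track)
        # Every unmatched point starts a new track.
        for i, point in enumerate(points):
            if i not in used:
                still_open.append([(t, point)])
        open_tracks = still_open
    return finished + open_tracks

def track_vector(track):
    """Assumed track representation: mean descriptor + mean displacement."""
    descs = [point[2] for _, point in track]
    dim = len(descs[0])
    mean_desc = [sum(d[k] for d in descs) / len(descs) for k in range(dim)]
    (t0, (x0, y0, _)), (t1, (x1, y1, _)) = track[0], track[-1]
    steps = max(t1 - t0, 1)
    return mean_desc + [(x1 - x0) / steps, (y1 - y0) / steps]

def bag_of_words(tracks, codebook, min_len=2):
    """Histogram of nearest codewords over tracks of at least min_len points."""
    hist = [0] * len(codebook)
    for track in tracks:
        if len(track) < min_len:
            continue
        vec = track_vector(track)
        nearest = min(range(len(codebook)),
                      key=lambda k: math.dist(vec, codebook[k]))
        hist[nearest] += 1
    return hist

# Toy shot: one point drifting right, one static point (2-D descriptors).
frames = [
    [(0.0, 0.0, (1.0, 0.0)), (50.0, 50.0, (0.0, 1.0))],
    [(2.0, 0.0, (1.0, 0.0)), (50.0, 50.0, (0.0, 1.0))],
    [(4.0, 0.0, (1.0, 0.0)), (50.0, 50.0, (0.0, 1.0))],
]
tracks = link_tracks(frames)
# Hypothetical two-word codebook: "moving" and "static" spatiotemporal words.
codebook = [[1.0, 0.0, 2.0, 0.0], [0.0, 1.0, 0.0, 0.0]]
hist = bag_of_words(tracks, codebook)
print(len(tracks), hist)
```

In a real pipeline the descriptors would come from a local feature extractor such as SIFT or SURF and the codebook from clustering many track vectors; the shot-level histogram would then be passed to a concept classifier.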
Acknowledgments
This work was supported by the European Commission under contract FP7-248984 GLOCAL.
Copyright information
© 2013 Springer Science+Business Media New York
About this chapter
Cite this chapter
Mezaris, V., Dimou, A., Kompatsiaris, I. (2013). Local Invariant Feature Tracks for High-Level Video Feature Extraction. In: Adami, N., Cavallaro, A., Leonardi, R., Migliorati, P. (eds) Analysis, Retrieval and Delivery of Multimedia Content. Lecture Notes in Electrical Engineering, vol 158. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-3831-1_10
DOI: https://doi.org/10.1007/978-1-4614-3831-1_10
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4614-3830-4
Online ISBN: 978-1-4614-3831-1
eBook Packages: Engineering, Engineering (R0)