High-Level Feature Detection from Video in TRECVid: A 5-Year Retrospective of Achievements
Successful and effective content-based access to digital video requires fast, accurate and scalable methods to determine the video content automatically. Contemporary approaches rely on text taken from speech within the video, on matching one video frame against others using low-level characteristics such as colour, texture or shape, or on determining and matching objects appearing within the video. Possibly the most important technique, however, is one that determines the presence or absence of a high-level, or semantic, feature within a video clip or shot. By using dozens, hundreds or even thousands of such semantic features we can support many kinds of content-based video navigation. Critically, however, this depends on being able to determine whether or not each feature is present in a video clip. The last 5 years have seen much progress in the development of techniques to determine the presence of semantic features within video. This progress can be tracked in the annual TRECVid benchmarking activity, where dozens of research groups measure the effectiveness of their techniques on common data using an open, metrics-based approach. In this chapter we summarize the work done on the TRECVid high-level feature task, showing the progress made year-on-year. This provides a fairly comprehensive statement of where the state of the art stands on this important task, not just for one research group or one approach, but across the spectrum. We then use this past and ongoing work as a basis for highlighting the trends that are emerging in this area, and the questions that remain to be addressed before we can achieve large-scale, fast and reliable high-level feature detection on video.
- Book Title: Multimedia Content Analysis
- Book Subtitle: Theory and Applications
- Pages: pp 1-24
- Series Title: Signals and Communication Technology
- Publisher: Springer US
- Copyright Holder: Springer Science+Business Media, LLC