CIVR 2006: Image and Video Retrieval pp 380-390

Recognizing Objects and Scenes in News Videos

  • Muhammet Baştan
  • Pınar Duygulu
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4071)

Abstract

We propose a new approach to recognizing objects and scenes in news videos, motivated by the availability of large video collections. This approach treats the recognition problem as the translation of visual elements to words. The correspondences between visual elements and words are learned using methods adapted from statistical machine translation and are used to predict words for particular image regions (region naming), for entire images (auto-annotation), or to associate the automatically generated speech transcript text with the correct video frames (video alignment). Experimental results are presented on the TRECVID 2004 data set, which consists of about 150 hours of news videos associated with manual annotations and speech transcript text. The results show that retrieval performance can be improved by associating visual and textual elements. In addition, an extensive analysis of features is provided and a method to combine features is proposed.
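The idea of learning correspondences between visual elements and words can be illustrated with a small sketch. The abstract does not say which translation model is used, so the code below assumes an IBM Model 1 style expectation-maximization procedure, a common choice in statistical machine translation. The token names (`b_sky`, `b_grass`, and so on), the toy data, and the function names `train_translation_table` and `name_region` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def train_translation_table(pairs, iterations=10):
    """Estimate p(word | visual token) with IBM Model 1 style EM.

    pairs: list of (visual_tokens, words), one entry per annotated image.
    Returns a dict-like table t where t[(word, token)] = p(word | token).
    This is an illustrative assumption, not the paper's confirmed method.
    """
    vocabulary = {w for _, words in pairs for w in words}
    # Start from a uniform distribution over words for every token.
    t = defaultdict(lambda: 1.0 / len(vocabulary))
    for _ in range(iterations):
        count = defaultdict(float)   # expected (word, token) co-occurrences
        total = defaultdict(float)   # expected occurrences per token
        for tokens, words in pairs:
            for w in words:
                # E-step: share each word among the tokens of the same image.
                norm = sum(t[(w, v)] for v in tokens)
                for v in tokens:
                    frac = t[(w, v)] / norm
                    count[(w, v)] += frac
                    total[v] += frac
        # M-step: renormalize so that the words for each token sum to one.
        t = defaultdict(float, {(w, v): c / total[v] for (w, v), c in count.items()})
    return t


def name_region(t, token, vocabulary):
    """Region naming: pick the most probable word for one visual token."""
    return max(sorted(vocabulary), key=lambda w: t[(w, token)])


# Toy data: each image is a bag of visual tokens plus its annotation words.
pairs = [
    (["b_sky", "b_grass"], ["sky", "grass"]),
    (["b_sky", "b_water"], ["sky", "water"]),
    (["b_grass", "b_tiger"], ["grass", "tiger"]),
]
vocabulary = {w for _, words in pairs for w in words}
table = train_translation_table(pairs)
for token in ["b_sky", "b_grass", "b_water", "b_tiger"]:
    print(token, "->", name_region(table, token, vocabulary))
```

Under the same assumptions, auto-annotation of a whole image could be sketched by summing the word probabilities over all of its tokens and keeping the highest-scoring words.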


Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Muhammet Baştan (1)
  • Pınar Duygulu (1)
  1. Department of Computer Engineering, Bilkent University, Ankara, Turkey
