
Automatic Annotation and Retrieval for Videos

  • Fangshi Wang
  • De Xu
  • Wei Lu
  • Hongli Xu
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4319)

Abstract

Retrieving videos by keywords requires semantic knowledge of the videos. However, manual video annotation is costly and time consuming. Most works reported in the literature focus on annotating a video shot with either a single semantic concept or a fixed number of words. In this paper, we propose a new approach to automatically annotate a video shot with a variable number of semantic concepts and to retrieve videos based on text queries. First, a simple but efficient method is presented to automatically extract a Semantic Candidate Set (SCS) for a video shot based on visual features. Second, a semantic network with n nodes is built by an Improved Dependency Analysis Based Method (IDABM), which reduces the time complexity of orienting the edges from O(n⁴) to O(n²). Third, the Final Annotation Set (FAS) is obtained from the SCS by Bayesian inference. Finally, a new way is proposed to rank the retrieved key frames according to the probabilities obtained during Bayesian inference. Experiments show that our method is effective in automatically annotating video shots and retrieving videos by keywords.
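As a rough illustration of the third and fourth steps only, the sketch below scores candidate concepts by their posterior probability given an observed concept, keeps those above a cutoff as the FAS, and ranks them by posterior. Everything here is a hypothetical stand-in: the network structure, conditional probability tables, concept names, and the THRESHOLD value are invented toy choices, and brute-force enumeration substitutes for the paper's actual inference procedure over the IDABM-learned network.

```python
# Illustrative sketch only: the paper's SCS extraction, IDABM structure
# learning, and inference procedure are not reproduced here. The network,
# CPTs, concept names, and threshold below are hypothetical toy values.
from itertools import product

# Toy semantic network over three concepts: "outdoor" -> "sky", "outdoor" -> "water"
PRIORS = {"outdoor": 0.4}
CPT = {  # P(child = present | parent value)
    "sky":   {True: 0.7, False: 0.1},
    "water": {True: 0.5, False: 0.05},
}

def joint(assign):
    """Joint probability of one full assignment over the toy network."""
    p = PRIORS["outdoor"] if assign["outdoor"] else 1 - PRIORS["outdoor"]
    for child in ("sky", "water"):
        pc = CPT[child][assign["outdoor"]]
        p *= pc if assign[child] else 1 - pc
    return p

def posterior(query, evidence):
    """P(query = present | evidence) by brute-force enumeration."""
    num = den = 0.0
    free = [v for v in ("outdoor", "sky", "water") if v not in evidence]
    for values in product([False, True], repeat=len(free)):
        assign = dict(evidence, **dict(zip(free, values)))
        p = joint(assign)
        den += p
        if assign[query]:
            num += p
    return num / den

# Suppose visual features put {"sky", "water"} in a shot's Semantic
# Candidate Set (SCS), and "outdoor" was observed with high confidence.
evidence = {"outdoor": True}
scs = ["sky", "water"]
THRESHOLD = 0.4  # hypothetical cutoff for the Final Annotation Set (FAS)

scored = {c: posterior(c, evidence) for c in scs}
fas = sorted((c for c, p in scored.items() if p >= THRESHOLD),
             key=lambda c: -scored[c])
print("FAS:", fas, "posteriors:", scored)
```

With these toy numbers, observing "outdoor" yields posteriors of 0.7 for "sky" and 0.5 for "water", so both pass the cutoff and "sky" ranks first; in the paper, posteriors of this kind also order the retrieved key frames.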

Keywords

Bayesian Network · Bayesian Inference · Semantic Network · Semantic Concept · Chordal Graph

Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Fangshi Wang (1, 2)
  • De Xu (1)
  • Wei Lu (2)
  • Hongli Xu (1)

  1. School of Computer & Information Technology, Beijing Jiaotong University, Beijing, China
  2. School of Software, Beijing Jiaotong University, Beijing, China
