Learned Lexicon-Driven Interactive Video Retrieval

  • Cees Snoek
  • Marcel Worring
  • Dennis Koelma
  • Arnold Smeulders
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4071)


In this paper, we combine automatic learning of a large lexicon of semantic concepts with traditional video retrieval methods into a novel approach to narrowing the semantic gap. The core of the proposed solution is the automatic detection of an unprecedented lexicon of 101 concepts. From there, we explore the combination of query-by-concept, query-by-example, query-by-keyword, and user interaction in the MediaMill semantic video search engine. We evaluate the search engine against the 2005 NIST TRECVID video retrieval benchmark, using an international broadcast news archive of 85 hours. Top-ranking results show that the lexicon-driven search engine is highly effective for interactive video retrieval.
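TRECVID search results are conventionally scored with (mean) average precision over ranked lists of shots. As a minimal sketch of that measure, and not the benchmark's official scoring tool or the authors' code, the following assumes a ranked list of shot identifiers without duplicates and a set of relevant shots per topic; the function names and the shot identifiers are hypothetical.

```python
from typing import Iterable, Sequence, Set, Tuple


def average_precision(ranked_shots: Sequence[str], relevant: Set[str]) -> float:
    """Non-interpolated average precision of one ranked result list.

    Assumes ranked_shots contains no duplicate identifiers.
    Relevant shots that are never retrieved contribute zero.
    """
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, shot in enumerate(ranked_shots, start=1):
        if shot in relevant:
            hits += 1
            # Precision at the rank where a relevant shot appears.
            precision_sum += hits / rank
    return precision_sum / len(relevant)


def mean_average_precision(
    topics: Iterable[Tuple[Sequence[str], Set[str]]]
) -> float:
    """Mean of the per-topic average precision values."""
    scores = [average_precision(ranked, rel) for ranked, rel in topics]
    return sum(scores) / len(scores) if scores else 0.0


if __name__ == "__main__":
    # Hypothetical shot identifiers, for illustration only.
    ranked = ["shot_1", "shot_2", "shot_3", "shot_4"]
    relevant = {"shot_1", "shot_3"}
    print(average_precision(ranked, relevant))
```

In this toy case the relevant shots appear at ranks 1 and 3, so the score is the mean of 1/1 and 2/3.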


Keywords: Search Engine, Average Precision, Semantic Concept, Video Retrieval, Query Interface
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Cees Snoek (1)
  • Marcel Worring (1)
  • Dennis Koelma (1)
  • Arnold Smeulders (1)
  1. Intelligent Systems Lab Amsterdam, University of Amsterdam, Amsterdam, The Netherlands
