Multi-level Fusion for Semantic Video Content Indexing and Retrieval
- Cite this paper as:
- Benmokhtar R., Huet B. (2008) Multi-level Fusion for Semantic Video Content Indexing and Retrieval. In: Boujemaa N., Detyniecki M., Nürnberger A. (eds) Adaptive Multimedia Retrieval: Retrieval, User, and Semantics. AMR 2007. Lecture Notes in Computer Science, vol 4918. Springer, Berlin, Heidelberg
In this paper, we present the results of our work on an automatic semantic video content indexing and retrieval system based on fusing various low-level visual descriptors. Global MPEG-7 features extracted from video shots are described via the Image Vector Space Model (IVSM) signature in order to obtain a compact description of the content. Both static and dynamic feature fusion are introduced to build effective signatures. Support Vector Machines (SVMs), one classifier per feature, are employed to detect the semantic content of the video. The classifier outputs are then fused using a neural network based on evidence theory (NNET) in order to provide a decision on the content of each shot. Experiments are conducted within the framework of the TRECVid feature extraction task.
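To illustrate the two-stage architecture described above, the following is a minimal sketch of per-feature SVM classification followed by score-level fusion. All feature names, dimensions, and data here are synthetic placeholders, and a logistic-regression combiner stands in for the paper's NNET evidence-theory fusion, which is not reproduced here.

```python
# Sketch (not the paper's implementation): one SVM per low-level
# feature descriptor, with per-classifier scores fused by a simple
# learned combiner standing in for the NNET fusion stage.
import numpy as np
from sklearn.svm import SVC
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n_shots = 200
labels = rng.integers(0, 2, n_shots)  # concept present / absent per shot

# Synthetic stand-ins for compact per-shot signatures derived from
# MPEG-7 descriptors (e.g. color, texture, motion), with differing
# dimensionalities per feature.
features = {
    "color":   rng.normal(labels[:, None], 1.0, (n_shots, 16)),
    "texture": rng.normal(labels[:, None], 1.5, (n_shots, 8)),
    "motion":  rng.normal(labels[:, None], 2.0, (n_shots, 4)),
}

# Stage 1: one SVM classifier per feature.
svms = {name: SVC(probability=True, random_state=0).fit(X, labels)
        for name, X in features.items()}

# Stage 2: stack each classifier's concept-presence score and fuse.
scores = np.column_stack(
    [svms[name].predict_proba(features[name])[:, 1] for name in features])
fusion = LogisticRegression().fit(scores, labels)

accuracy = fusion.score(scores, labels)
print(f"fused training accuracy: {accuracy:.2f}")
```

The key design point is that fusion operates on classifier outputs (decision-level fusion), so each feature keeps its own classifier tuned to its descriptor space, and the combiner only has to learn how much to trust each one.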