Application of Bayesian Inference to Automatic Semantic Annotation of Videos
Summary. Automatically extracting semantic annotations for a video shot is an important task, because this high-level semantic information can improve video retrieval performance. In this paper, we propose a novel approach that automatically annotates a new video shot with a non-fixed number of concepts. The process is carried out in three steps. First, the semantic importance degree (SID) is introduced, and a simple method is proposed to extract the semantic candidate set (SCS) by considering the SID of the concepts that co-occur in the same shot. Second, a semantic network is constructed using an improved K2 algorithm. Finally, the final annotation set is chosen by Bayesian inference. Experimental results show that our method significantly improves the performance of automatically annotating a new video shot compared with classical classifiers such as Naïve Bayes and K-Nearest Neighbor.
Keywords: Bayesian Network, Bayesian Inference, Average Precision, Semantic Network, Semantic Concept
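The summary does not describe the authors' improved K2 algorithm, so the following is only a minimal sketch of the standard K2 procedure (the Cooper-Herskovits scoring metric with greedy parent selection) that such a semantic network could be built on. The function names `k2_log_score` and `k2_search`, the `max_parents` limit, and the encoding of each shot as a tuple of 0/1 concept indicators are all assumptions made for illustration, not details taken from the paper.

```python
from math import lgamma


def k2_log_score(data, child, parents, arity=2):
    """Log of the standard K2 metric for `child` given `parents`.

    data: list of tuples of ints in range(arity), one tuple per shot
    (hypothetical encoding: 1 if the concept is present, else 0).
    """
    counts = {}  # parent configuration -> counts of each child value
    for row in data:
        key = tuple(row[p] for p in parents)
        counts.setdefault(key, [0] * arity)[row[child]] += 1
    score = 0.0
    for child_counts in counts.values():
        n_ij = sum(child_counts)
        # log[(r - 1)! / (N_ij + r - 1)!] + sum_k log(N_ijk!)
        score += lgamma(arity) - lgamma(n_ij + arity)
        score += sum(lgamma(n + 1) for n in child_counts)
    return score


def k2_search(data, order, max_parents=2, arity=2):
    """Greedy K2 search: each variable may take parents only from
    variables that precede it in `order`."""
    structure = {}
    for pos, child in enumerate(order):
        parents = []
        best = k2_log_score(data, child, parents, arity)
        candidates = list(order[:pos])
        while candidates and len(parents) < max_parents:
            # Try adding each remaining candidate and keep the best one.
            top_score, top = max(
                (k2_log_score(data, child, parents + [c], arity), c)
                for c in candidates
            )
            if top_score <= best:
                break  # no candidate improves the score
            best = top_score
            parents.append(top)
            candidates.remove(top)
        structure[child] = parents
    return structure


if __name__ == "__main__":
    # Toy data: concept 1 always co-occurs with concept 0; concept 2 is unrelated.
    shots = [(0, 0, 0), (0, 0, 1), (1, 1, 0), (1, 1, 1)] * 5
    print(k2_search(shots, order=[0, 1, 2]))
```

On this toy data the search links concept 1 to concept 0 and leaves concept 2 without parents. The result depends on the variable ordering supplied in `order`, which the standard K2 procedure requires as input.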