Abstract
We present a novel multi-modal evidence fusion method for high-level feature (HLF) detection in videos. Uni-modal features, such as color histograms and transcript text, tend to capture different aspects of HLFs and hence exhibit both complementariness and redundancy in modeling the content of such HLFs. We argue that these inter-relations are key to effective multi-modal fusion. Here, we formulate the fusion as a multi-criteria group decision making task, in which the uni-modal detectors are coordinated, based on their inter-relations, to reach a consensus final detection decision. Specifically, we mine the complementariness and redundancy inter-relations of the uni-modal detectors using the Ordered Weighted Average (OWA) operator. The ‘or-ness’ measure in OWA models the inter-relation of the uni-modal detectors as a combination of pure complementariness and pure redundancy. The resulting OWA weights can then yield a consensus fusion by optimally leveraging the decisions of the uni-modal detectors. Experiments on the TRECVID 07 dataset show that the proposed OWA aggregation operator can significantly outperform other fusion methods, achieving a state-of-the-art MAP of 0.132.
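To make the OWA operator and its or-ness measure concrete, the following minimal sketch uses the standard definitions from Yager (1988): the scores are sorted in descending order before weighting, and or-ness is (1/(n-1)) Σ_j (n-j) w_j. The detector scores and the function names `owa` and `orness` are illustrative assumptions, and the weights are fixed by hand; the sketch does not reproduce how the paper mines its weights.

```python
from typing import Sequence


def owa(scores: Sequence[float], weights: Sequence[float]) -> float:
    """Ordered Weighted Average: weights apply to rank positions, not to sources."""
    if len(scores) != len(weights):
        raise ValueError("scores and weights must have the same length")
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to 1")
    # Sort descending so that weights[0] multiplies the largest score.
    ordered = sorted(scores, reverse=True)
    return sum(w * s for w, s in zip(weights, ordered))


def orness(weights: Sequence[float]) -> float:
    """Or-ness of an OWA weight vector: 1 for max ('or'), 0 for min ('and')."""
    n = len(weights)
    if n < 2:
        raise ValueError("or-ness needs at least two weights")
    return sum((n - j) * w for j, w in enumerate(weights, start=1)) / (n - 1)


# Hypothetical confidence scores from three uni-modal detectors for one shot.
detector_scores = [0.9, 0.2, 0.6]

max_like = [1.0, 0.0, 0.0]       # or-ness 1: any single confident detector suffices
min_like = [0.0, 0.0, 1.0]       # or-ness 0: all detectors must agree
mean_like = [1 / 3, 1 / 3, 1 / 3]  # or-ness 0.5: plain average

for name, w in [("max", max_like), ("min", min_like), ("mean", mean_like)]:
    print(name, round(orness(w), 3), round(owa(detector_scores, w), 3))
```

Weight vectors between these extremes interpolate between "or"-like and "and"-like fusion, which is the degree of freedom the or-ness measure summarizes.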
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Li, M., Zheng, YT., Lin, SX., Zhang, YD., Chua, TS. (2009). Multimedia Evidence Fusion for Video Concept Detection via OWA Operator. In: Huet, B., Smeaton, A., Mayer-Patel, K., Avrithis, Y. (eds) Advances in Multimedia Modeling. MMM 2009. Lecture Notes in Computer Science, vol 5371. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-92892-8_21
DOI: https://doi.org/10.1007/978-3-540-92892-8_21
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-92891-1
Online ISBN: 978-3-540-92892-8
eBook Packages: Computer Science (R0)