A Selective Weighted Late Fusion for Visual Concept Recognition

Conference paper

pp 426–435

Computer Vision – ECCV 2012. Workshops and Demonstrations (ECCV 2012)

Ningning Liu, Emmanuel Dellandrea, Chao Zhu, Charles-Edmond Bichot & Liming Chen

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 7585)

Included in the following conference series:

European Conference on Computer Vision


Abstract

In this paper we propose a novel multimodal approach that automatically predicts the visual concepts of images through an effective fusion of visual and textual features. It relies on a Selective Weighted Late Fusion (SWLF) scheme which, by optimizing the overall Mean interpolated Average Precision (MiAP), learns to automatically select and weight the best experts for each visual concept to be recognized. Experiments were conducted on the MIR Flickr image collection within the ImageCLEF 2011 Photo Annotation challenge. The results demonstrate the effectiveness of SWLF: it achieved a MiAP of 43.69% for the detection of the 99 visual concepts, ranking 2nd out of the 79 submitted runs, while our new variant of SWLF reaches a MiAP of 43.93%.
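
The select-and-weight idea for a single concept can be illustrated with a minimal sketch. The abstract does not spell out the selection procedure or the weighting rule, so everything below is an assumption made for illustration: experts are ranked by their interpolated average precision (iAP) on a validation set, the top N are fused with weights proportional to their iAP, and N is chosen to maximize the iAP of the fused score. The function names (`interpolated_ap`, `swlf_fit`, `swlf_predict`) and the toy scores are hypothetical, and the iAP used here is one common definition that may differ from the ImageCLEF evaluation.

```python
import numpy as np


def interpolated_ap(labels, scores):
    """Interpolated average precision for one concept (one common definition)."""
    order = np.argsort(-scores, kind="stable")  # rank images by decreasing score
    rel = labels[order].astype(float)
    if rel.sum() == 0:
        return 0.0
    precision = np.cumsum(rel) / np.arange(1, len(rel) + 1)
    # Interpolation: precision at rank k becomes the best precision at any rank >= k.
    interp = np.maximum.accumulate(precision[::-1])[::-1]
    return float((interp * rel).sum() / rel.sum())


def swlf_fit(val_scores, val_labels):
    """Select and weight experts for one concept (assumed procedure, not the paper's).

    val_scores: array (n_experts, n_images) of expert outputs on validation images.
    val_labels: array (n_images,) of 0/1 ground truth.
    Returns the indices of the selected experts and their fusion weights.
    """
    aps = np.array([interpolated_ap(val_labels, s) for s in val_scores])
    ranked = np.argsort(-aps, kind="stable")  # best expert first
    best_ap, best_idx, best_w = -1.0, None, None
    for n in range(1, len(ranked) + 1):
        idx = ranked[:n]
        total = aps[idx].sum()
        # Assumption: weights proportional to validation iAP (uniform if all are zero).
        w = aps[idx] / total if total > 0 else np.full(n, 1.0 / n)
        fused_ap = interpolated_ap(val_labels, w @ val_scores[idx])
        if fused_ap > best_ap:  # keep the smallest N that reaches the best iAP
            best_ap, best_idx, best_w = fused_ap, idx, w
    return best_idx, best_w


def swlf_predict(scores, idx, w):
    """Late fusion: weighted sum of the selected experts' scores."""
    return w @ scores[idx]


# Toy validation data (made up): three experts scoring six images for one concept.
val_labels = np.array([1, 1, 0, 0, 1, 0])
val_scores = np.array([
    [0.9, 0.8, 0.2, 0.1, 0.7, 0.3],  # expert 0: ranks all positives first
    [0.5, 0.1, 0.6, 0.2, 0.4, 0.3],  # expert 1: noisy
    [0.1, 0.2, 0.8, 0.9, 0.3, 0.7],  # expert 2: inverted ranking
])
idx, w = swlf_fit(val_scores, val_labels)
fused = swlf_predict(val_scores, idx, w)
print("selected experts:", idx, "weights:", w)
```

In a full system this fit would be repeated independently for each concept, and the MiAP would be the mean of the per-concept iAP values; the greedy top-N search is only one possible way to realize the selection step.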



Author information

Authors and Affiliations

CNRS, Ecole Centrale de Lyon, LIRIS, UMR5205, Université de Lyon, F-69622, France
Ningning Liu, Emmanuel Dellandrea, Chao Zhu, Charles-Edmond Bichot & Liming Chen


Editor information

Editors and Affiliations

Dipartimento di Ingegneria Elettrica, Gestionale e Meccanica (DIEGM), Università degli Studi di Udine, Via delle Scienze, 208, 33100, Udine, Italy
Andrea Fusiello
IIT Istituto Italiano di Tecnologia, Via Morego 30, 16163, Genoa, Italy
Vittorio Murino
Dipartimento di Ingegneria dell’Informazione, Università degli Studi di Modena e Reggio Emilia, Strada Vignolege, 905, 41125, Modena, Italy
Rita Cucchiara


Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Liu, N., Dellandrea, E., Zhu, C., Bichot, CE., Chen, L. (2012). A Selective Weighted Late Fusion for Visual Concept Recognition. In: Fusiello, A., Murino, V., Cucchiara, R. (eds) Computer Vision – ECCV 2012. Workshops and Demonstrations. ECCV 2012. Lecture Notes in Computer Science, vol 7585. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-33885-4_43


DOI: https://doi.org/10.1007/978-3-642-33885-4_43
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-33884-7
Online ISBN: 978-3-642-33885-4
eBook Packages: Computer Science, Computer Science (R0)
