A Selective Weighted Late Fusion for Visual Concept Recognition

  • Ningning Liu
  • Emmanuel Dellandrea
  • Chao Zhu
  • Charles-Edmond Bichot
  • Liming Chen
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7585)

Abstract

We propose in this paper a novel multimodal approach to automatically predict the visual concepts of images through an effective fusion of visual and textual features. It relies on a Selective Weighted Late Fusion (SWLF) scheme which, by optimizing the overall Mean interpolated Average Precision (MiAP), learns to select and weight the best experts for each visual concept to be recognized. Experiments were conducted on the MIR Flickr image collection within the ImageCLEF 2011 Photo Annotation challenge. The results demonstrate the effectiveness of SWLF: it achieved a MiAP of 43.69% for the detection of the 99 visual concepts, ranking 2nd out of the 79 submitted runs, while our new variant of SWLF reaches a MiAP of 43.93%.
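To illustrate the idea described above, the following is a minimal Python sketch of a selective weighted late fusion for a single concept. It assumes each "expert" (one classifier trained on one visual or textual feature) outputs a per-image score on a validation set; experts are ranked by their individual average precision and then each next-best expert is kept, with an AP-proportional weight, only if it improves the fused average precision. The function names, the AP-based weighting, and the greedy keep/skip rule are illustrative assumptions, not the authors' exact procedure.

```python
import numpy as np


def average_precision(scores, labels):
    """Non-interpolated AP: mean precision at the rank of each positive image."""
    order = np.argsort(-scores)
    labels = labels[order]
    hits = np.cumsum(labels)
    ranks = np.arange(1, len(labels) + 1)
    precisions = hits / ranks
    return float(precisions[labels == 1].mean()) if labels.sum() > 0 else 0.0


def selective_weighted_late_fusion(expert_scores, labels, max_experts=5):
    """Hypothetical SWLF-style fusion for one concept.

    expert_scores: dict mapping an expert name to its (n_images,) validation scores.
    labels:        (n_images,) binary ground truth for the concept.
    Experts are ranked by individual AP; each is kept with an AP weight only
    if it improves the fused AP on the validation set.
    """
    aps = {name: average_precision(s, labels) for name, s in expert_scores.items()}
    ranked = sorted(aps, key=aps.get, reverse=True)

    selected, fused, best_ap = [], np.zeros(len(labels)), -1.0
    for name in ranked[:max_experts]:
        candidate = fused + aps[name] * expert_scores[name]  # AP-weighted sum of scores
        ap = average_precision(candidate, labels)
        if ap > best_ap:                                     # keep the expert only if fusion improves
            selected, fused, best_ap = selected + [name], candidate, ap
    return selected, best_ap


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    y = rng.integers(0, 2, size=200)
    # Four synthetic experts of varying quality (noisy copies of the labels).
    experts = {f"feat_{i}": np.clip(y + rng.normal(0, 0.3 + 0.2 * i, 200), 0, 1)
               for i in range(4)}
    print(selective_weighted_late_fusion(experts, y))
```

In the full approach, such a per-concept selection would be repeated for every concept, so that each of the 99 concepts ends up with its own subset of weighted visual and textual experts.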

Keywords

Visual concept recognition · Multimodality · Feature fusion

Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Ningning Liu
  • Emmanuel Dellandrea
  • Chao Zhu
  • Charles-Edmond Bichot
  • Liming Chen

  Université de Lyon, CNRS, Ecole Centrale de Lyon, LIRIS, UMR 5205, France
