Visual Concept Detection and Annotation via Multiple Kernel Learning of Multiple Models

  • Yu Zhang
  • Stephane Bres
  • Liming Chen
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8157)

Abstract

This paper presents a multi-model framework for the Visual Concept Detection and Annotation (VCDA) task based on Multiple Kernel Learning (MKL). Discriminative visual features are extracted to build visual kernels, while the tags associated with images are used to build textual kernels. Finally, in order to benefit from both the visual and the textual models, their fusion is efficiently embedded in the MKL framework. Traditionally, the term-frequency model is used to capture this useful textual information. However, its shortcoming lies in the fact that performance depends heavily on dictionary construction and that valuable semantic information cannot be captured. To solve this problem, we propose a textual feature construction approach based on WordNet distance. The advantages of this approach are three-fold: (1) it is robust, because the feature construction does not depend on dictionary construction; (2) it captures the semantic information of tags, which is hardly described by the term-frequency model; (3) it efficiently fuses the visual and textual models. Experimental results on ImageCLEF 2011 show that our approach effectively improves recognition accuracy.
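The WordNet-distance idea above can be sketched as follows: describe an image's tag set by its distance to each target concept through a hypernym hierarchy, with no fixed dictionary required. This is an illustrative sketch only — the paper uses the real WordNet lexical database, whereas the miniature hypernym tree, the `path_distance` helper, and the toy tags below are all made up for demonstration.

```python
# Toy hypernym tree standing in for WordNet (purely illustrative;
# the paper's method queries the actual WordNet database).
HYPERNYM = {
    "dog": "canine", "canine": "mammal", "cat": "feline",
    "feline": "mammal", "mammal": "animal", "car": "vehicle",
    "vehicle": "artifact", "animal": "entity", "artifact": "entity",
}

def path_to_root(word):
    """Follow hypernym links up to the root concept."""
    path = [word]
    while path[-1] in HYPERNYM:
        path.append(HYPERNYM[path[-1]])
    return path

def path_distance(a, b):
    """Edge-count distance through the lowest common hypernym."""
    pa, pb = path_to_root(a), path_to_root(b)
    for i, node in enumerate(pa):
        if node in pb:
            return i + pb.index(node)
    return len(pa) + len(pb)  # no shared ancestor

def tag_feature(tags, concepts):
    """One feature per target concept: the minimum semantic
    distance from any of the image's tags to that concept."""
    return [min(path_distance(t, c) for t in tags) for c in concepts]

print(tag_feature(["dog", "car"], ["cat", "vehicle"]))  # → [4, 1]
```

Note how "dog" ends up close to "cat" (shared hypernym "mammal") even though the strings share nothing — exactly the semantic relatedness a term-frequency representation misses.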
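The fusion step can likewise be sketched as the standard MKL parameterization: a convex combination of precomputed base kernels, one per modality. The kernel choices, weights, and toy data below are assumptions for illustration, not the paper's actual configuration.

```python
import numpy as np

def combine_kernels(kernels, weights):
    """Fuse precomputed kernel matrices as a convex combination,
    the standard form used in Multiple Kernel Learning (MKL)."""
    weights = np.asarray(weights, dtype=float)
    if np.any(weights < 0):
        raise ValueError("MKL weights must be non-negative")
    weights = weights / weights.sum()  # project onto the simplex
    return sum(w * K for w, K in zip(weights, kernels))

def rbf_kernel(X, gamma=0.1):
    """Gaussian RBF kernel over the rows of X."""
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * sq)

# Toy example: a "visual" RBF kernel and a "textual" linear kernel
# over 4 hypothetical images (features are random, for illustration).
rng = np.random.default_rng(0)
visual = rng.standard_normal((4, 8))
textual = rng.standard_normal((4, 5))

K_visual = rbf_kernel(visual)
K_textual = textual @ textual.T

# In practice an MKL solver learns the weights; they are fixed here.
K = combine_kernels([K_visual, K_textual], [0.6, 0.4])
```

A non-negative combination of positive semi-definite kernels is itself a valid kernel, which is what lets the two modalities be fused inside a single SVM-style learner.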



Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Yu Zhang (1)
  • Stephane Bres (2)
  • Liming Chen (1)
  1. LIRIS, UMR 5205, Universite de Lyon, CNRS, Ecole Centrale de Lyon, France
  2. LIRIS, INSA de Lyon, Villeurbanne Cedex, France
