Cross-Domain Concept Detection with Dictionary Coherence by Leveraging Web Images

  • Yongqing Sun
  • Kyoko Sudo
  • Yukinobu Taniguchi
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8936)


We propose a novel scheme for video concept learning that leverages social media, combining the selection of web training data and transfer subspace learning within a unified framework. Because mismatched data distributions cause cross-domain incoherence, selecting sufficient positive training samples from scattered and diffuse social media resources is a challenging problem in training effective concept detectors. In this paper, given a concept, we select coherent positive samples from web images for further concept learning based on their degree of image coherence. Then, by exploiting both the selected dataset and video keyframes, we train a robust concept classifier using a transfer subspace learning method. Experimental results demonstrate that the proposed approach achieves consistent overall improvement despite cross-domain incoherence.
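
The abstract does not say how the degree of image coherence is computed. The following is a minimal sketch only. It assumes each image is represented by a feature vector and takes coherence to be the largest absolute cosine similarity between a web image and any atom of a dictionary built from video keyframes. That measure is borrowed from the usual notion of dictionary coherence and is an assumption, not the authors' stated method. The function names `coherence_scores` and `select_coherent` and the toy data are hypothetical.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit Euclidean length."""
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    return x / np.maximum(norms, eps)


def coherence_scores(web_feats, keyframe_atoms):
    """Assumed coherence measure: largest |cosine similarity| between each
    web image feature and any atom of the keyframe dictionary."""
    w = l2_normalize(web_feats)
    d = l2_normalize(keyframe_atoms)
    return np.abs(w @ d.T).max(axis=1)


def select_coherent(web_feats, keyframe_atoms, k):
    """Return indices of the k highest-scoring web images, best first."""
    scores = coherence_scores(web_feats, keyframe_atoms)
    order = np.argsort(-scores, kind="stable")
    return order[:k], scores


# Toy data (hypothetical): two dictionary atoms, three candidate web images.
atoms = np.array([[1.0, 0.0], [0.0, 1.0]])
web = np.array([[2.0, 0.1], [1.0, 1.0], [0.0, 3.0]])
selected, scores = select_coherent(web, atoms, k=2)
```

Here the two web images that point almost along a dictionary atom are kept, and the one lying between the atoms is dropped. A transfer subspace learning step would then train on the selected images together with the keyframes. That step is not sketched here because the abstract gives no detail about it.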


Keywords: Visual concept detection, Web image mining, Sparse representation, Dictionary learning, Transfer learning





Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Yongqing Sun (1)
  • Kyoko Sudo (1)
  • Yukinobu Taniguchi (1)
  1. NTT Media Intelligence Laboratories, Yokosuka-shi, Japan
