Exploiting Privileged Information from Web Data for Image Categorization

  • Wen Li
  • Li Niu
  • Dong Xu
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8693)


Relevant and irrelevant web images collected by tag-based image retrieval have been used as loosely labeled training data for learning SVM classifiers for image categorization, using only visual features. In this work, we propose a new image categorization method that incorporates the textual features extracted from the surrounding textual descriptions (tags, captions, categories, etc.) as privileged information while simultaneously coping with noise in the loose labels of the training web images. When the training and test samples come from different datasets, our method can be further extended to reduce the data distribution mismatch by adding a regularizer based on the Maximum Mean Discrepancy (MMD) criterion. Comprehensive experiments on three benchmark datasets demonstrate the effectiveness of our methods for image categorization and image retrieval by exploiting privileged information from web data.
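
To make the MMD criterion named in the abstract concrete, here is a minimal sketch of the (biased) empirical squared MMD between two sample sets. It assumes a linear kernel, so the quantity reduces to the squared distance between the two sample means. The function name `linear_mmd_squared` and the toy arrays `source` and `target` are hypothetical; the abstract does not specify the kernel or the exact form of the regularizer used in the paper.

```python
import numpy as np


def linear_mmd_squared(source, target):
    """Biased empirical squared MMD under a linear kernel (an assumption).

    With a linear kernel the feature map is the identity, so the squared
    MMD is the squared Euclidean distance between the two sample means.
    """
    source = np.asarray(source, dtype=float)
    target = np.asarray(target, dtype=float)
    diff = source.mean(axis=0) - target.mean(axis=0)
    return float(diff @ diff)


# Hypothetical toy features standing in for training (web) and test samples.
source = np.array([[0.0, 0.0], [2.0, 0.0]])  # sample mean is [1, 0]
target = np.array([[1.0, 1.0], [1.0, 3.0]])  # sample mean is [1, 2]

mmd2 = linear_mmd_squared(source, target)
print(mmd2)
```

A value near zero suggests the two sample means coincide in the chosen feature space; a regularizer built on this quantity would penalize solutions under which the training and test distributions remain far apart.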


learning using privileged information, multi-instance learning, domain adaptation





Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Wen Li (1)
  • Li Niu (1)
  • Dong Xu (1)
  1. School of Computer Engineering, Nanyang Technological University, Singapore
