Weakly Supervised Learning of Objects, Attributes and Their Associations

  • Zhiyuan Shi
  • Yongxin Yang
  • Timothy M. Hospedales
  • Tao Xiang
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8690)


When humans describe images they tend to use combinations of nouns and adjectives, corresponding to objects and their associated attributes respectively. To generate such a description automatically, one needs to model objects, attributes and their associations. Conventional methods require strong annotation of object and attribute locations, making them less scalable. In this paper, we model object-attribute associations from weakly labelled images, such as those widely available on media sharing sites (e.g. Flickr), where only image-level labels (either objects or attributes) are given, without their locations and associations. This is achieved by introducing a novel weakly supervised non-parametric Bayesian model. Once learned, given a new image, our model can describe the image, including objects, attributes and their associations, as well as their locations and segmentation. Extensive experiments on benchmark datasets demonstrate that our weakly supervised model performs on par with strongly supervised models on tasks such as image description and retrieval based on object-attribute associations.


Keywords: Weakly supervised learning, object-attribute associations





Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Zhiyuan Shi (1)
  • Yongxin Yang (1)
  • Timothy M. Hospedales (1)
  • Tao Xiang (1)
  1. Queen Mary, University of London, London, UK
