Attributes for Image Retrieval

Visual Attributes

Abstract

Image retrieval is a computer vision application that people encounter in their everyday lives. To enable accurate retrieval results, a human user needs to be able to communicate in a rich and noiseless way with the retrieval system. We propose semantic visual attributes as a communication channel for search because they are commonly used by humans to describe the world around them. We first propose a new feedback interaction where users can directly comment on how individual properties of retrieved content should be adjusted to more closely match the desired visual content. We then show how to ensure this interaction is as informative as possible, by having the vision system ask those questions that will most increase its certainty over what content is relevant. To ensure that attribute-based statements from the user are not misinterpreted by the system, we model the unique ways in which users employ attribute terms, and develop personalized attribute models. We discover clusters among users in terms of how they use a given attribute term, and consequently discover the distinct “shades of meaning” of these attributes. Our work is a significant step in the direction of bridging the semantic gap between high-level user intent and low-level visual features. We discuss extensions to further increase the utility of attributes for practical search applications.
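
As a rough illustration of the feedback interaction described in the abstract, the following minimal sketch ranks images by how many relative attribute statements they satisfy (for example, "more shiny than this shoe"). It assumes that per-image attribute strengths have already been predicted by some model. The image identifiers, attribute names, strength values, and helper functions are hypothetical and are not taken from the chapter, and the scoring rule is one simple reading of the idea rather than the authors' exact formulation.

```python
from typing import Dict, List, Tuple

# Hypothetical precomputed attribute strengths per image
# (a higher value means the image shows more of that attribute).
Strengths = Dict[str, Dict[str, float]]
# One feedback statement: (attribute, reference image id, "more" or "less").
Feedback = Tuple[str, str, str]


def satisfies(strengths: Strengths, image: str, fb: Feedback) -> bool:
    """Return True if `image` is consistent with one feedback statement."""
    attribute, reference, direction = fb
    value = strengths[image][attribute]
    ref_value = strengths[reference][attribute]
    return value > ref_value if direction == "more" else value < ref_value


def rank_by_feedback(strengths: Strengths, feedback: List[Feedback]) -> List[str]:
    """Order images by the number of feedback statements they satisfy."""
    def score(image: str) -> int:
        return sum(satisfies(strengths, image, fb) for fb in feedback)
    # sorted() is stable, so tied images keep their original order.
    return sorted(strengths, key=score, reverse=True)


# Illustrative data only (assumed values).
strengths = {
    "shoe_a": {"shiny": 0.2, "formal": 0.9},
    "shoe_b": {"shiny": 0.8, "formal": 0.4},
    "shoe_c": {"shiny": 0.5, "formal": 0.6},
}
# "Shinier than shoe_a" and "more formal than shoe_b".
feedback = [("shiny", "shoe_a", "more"), ("formal", "shoe_b", "more")]
ranking = rank_by_feedback(strengths, feedback)
print(ranking)  # ['shoe_c', 'shoe_a', 'shoe_b']
```

Here only shoe_c satisfies both statements, so it ranks first; each additional statement narrows the set of images that remain fully consistent with the user's feedback.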

Notes

  1. To derive an attribute vocabulary, one could use [43], which automatically generates splits in visual space and learns from human annotations whether these splits can be described with an attribute; [46], which shows pairs of images to users on Amazon's Mechanical Turk platform and aggregates terms that describe what one image has and the other does not; or [1, 41], which mine text to discover attributes for which reliable computer models can be learned.

  2. The annotations are available at http://vision.cs.utexas.edu/whittlesearch/.

  3. In Sect. 5.3, we extend this approach to also allow “equally” responses.

  4. As another point of comparison against existing methods, a multi-attribute query baseline that ranks images by how many binary attributes they share with the target image achieves NDCG scores that are 40% weaker on average than our method when using 40 feedback constraints.

  5. The exhaustive baseline was too expensive to run on all 14K Shoes images. On a 1000-image subset, it performs similarly to how it performs on the other datasets.

  6. Below we use the terms “school” and “shade” interchangeably.

  7. Note that non-semantic attributes [49, 56, 69] are not readily suited to applications that require human-machine communication, as they do not have human-interpretable names.
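
To make the comparison in note 4 more concrete, the sketch below shows one way a shared-binary-attribute baseline and an NDCG score could be computed. It is an assumption-laden illustration, not the chapter's evaluation code: "shared" is taken to mean attributes present in both images, NDCG uses the common log2 discount, and all function names and data are hypothetical.

```python
import math
from typing import Dict, List, Set


def shared_attribute_ranking(target: Set[str], database: Dict[str, Set[str]]) -> List[str]:
    """Rank images by how many binary attributes they have in common with the target.

    Assumption: an attribute counts as shared only if it is present in both images.
    """
    return sorted(database, key=lambda img: len(database[img] & target), reverse=True)


def ndcg_at_k(ranked_relevance: List[float], k: int) -> float:
    """NDCG@k with the common log2 discount (one of several formulations in use)."""
    def dcg(rels: List[float]) -> float:
        # Position i (0-based) is discounted by log2(i + 2), so rank 1 has discount 1.
        return sum(r / math.log2(i + 2) for i, r in enumerate(rels[:k]))

    ideal = dcg(sorted(ranked_relevance, reverse=True))
    return dcg(ranked_relevance) / ideal if ideal > 0 else 0.0


# Illustrative data only (assumed attribute sets).
target = {"pointy", "shiny", "high-heeled"}
database = {
    "img_a": {"pointy", "shiny"},
    "img_b": {"flat"},
    "img_c": {"pointy", "shiny", "high-heeled"},
}
baseline_ranking = shared_attribute_ranking(target, database)
print(baseline_ranking)  # ['img_c', 'img_a', 'img_b']
```

A ranking that already places the most relevant images first scores 1.0 under `ndcg_at_k`, and any other ordering of the same relevance values scores lower, which is the sense in which one method's NDCG can be "weaker" than another's.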

References

  1. Berg, T.L., Berg, A.C., Shih, J.: Automatic attribute discovery and characterization from noisy web data. In: European Conference on Computer Vision (ECCV) (2010)

  2. Branson, S., Wah, C., Schroff, F., Babenko, B., Welinder, P., Perona, P., Belongie, S.: Visual recognition with humans in the loop. In: European Conference on Computer Vision (ECCV) (2010)

  3. Chen, L., Zhang, Q., Li, B.: Predicting multiple attributes via relative multi-task learning. In: Conference on Computer Vision and Pattern Recognition (CVPR) (2014)

  4. Chen, Q., Huang, J., Feris, R., Brown, L.M., Dong, J., Yan, S.: Deep domain adaptation for describing people based on fine-grained clothing attributes. In: Conference on Computer Vision and Pattern Recognition (CVPR) (2015)

  5. Cox, I., Miller, M., Minka, T., Papathomas, T., Yianilos, P.: The Bayesian image retrieval system, PicHunter: theory, implementation, and psychophysical experiments. IEEE Trans. Image Process. 9(1), 20–37 (2000)

  6. Curran, W., Moore, T., Kulesza, T., Wong, W.K., Todorovic, S., Stumpf, S., White, R., Burnett, M.: Towards recognizing “cool”: can end users help computer vision recognize subjective attributes or objects in images? In: Intelligent User Interfaces (IUI) (2012)

  7. Douze, M., Ramisa, A., Schmid, C.: Combining attributes and Fisher vectors for efficient image retrieval. In: Conference on Computer Vision and Pattern Recognition (CVPR) (2011)

  8. Duan, K., Parikh, D., Crandall, D., Grauman, K.: Discovering localized attributes for fine-grained recognition. In: Conference on Computer Vision and Pattern Recognition (CVPR) (2012)

  9. Eitz, M., Hildebrand, K., Boubekeur, T., Alexa, M.: Sketch-based image retrieval: benchmark and bag-of-features descriptors. IEEE Trans. Vis. Comput. Graph. 17(11), 1624–1636 (2011)

  10. Endres, I., Farhadi, A., Hoiem, D., Forsyth, D.A.: The benefits and challenges of collecting richer object annotations. In: Conference on Computer Vision and Pattern Recognition (CVPR) Workshops (2010)

  11. Escorcia, V., Niebles, J.C., Ghanem, B.: On the relationship between visual attributes and convolutional networks. In: Conference on Computer Vision and Pattern Recognition (CVPR) (2015)

  12. Everett, C.: Linguistic relativity: evidence across languages and cognitive domains. In: Mouton De Gruyter (2013)

  13. Farhadi, A., Endres, I., Hoiem, D., Forsyth, D.A.: Describing objects by their attributes. In: Conference on Computer Vision and Pattern Recognition (CVPR) (2009)

  14. Ferecatu, M., Geman, D.: Interactive search for image categories by mental matching. In: International Conference on Computer Vision (ICCV) (2007)

  15. Ferrari, V., Zisserman, A.: Learning visual attributes. In: Conference on Neural Information Processing Systems (NIPS) (2007)

  16. Fogarty, J., Tan, D.S., Kapoor, A., Winder, S.: CueFlik: interactive concept learning in image search. In: Conference on Human Factors in Computing Systems (CHI) (2008)

  17. Geng, B., Yang, L., Xu, C., Hua, X.S.: Ranking model adaptation for domain-specific search. IEEE Trans. Knowl. Data Eng. 24(4), 745–758 (2012)

  18. Heim, E., Berger, M., Seversky, L., Hauskrecht, M.: Active perceptual similarity modeling with auxiliary information. In: arXiv preprint arXiv:1511.02254 (2015)

  19. Hofmann, T.: Probabilistic latent semantic analysis. In: Uncertainty in Artificial Intelligence (UAI) (1999)

  20. Joachims, T.: Optimizing search engines using clickthrough data. In: International Conference on Knowledge Discovery and Data Mining (KDD) (2002)

  21. Kekalainen, J., Jarvelin, K.: Cumulated gain-based evaluation of IR techniques. ACM Trans. Inf. Syst. 20(4), 422–446 (2002)

  22. Kovashka, A., Grauman, K.: Attribute adaptation for personalized image search. In: International Conference on Computer Vision (ICCV) (2013)

  23. Kovashka, A., Grauman, K.: Attribute pivots for guiding relevance feedback in image search. In: International Conference on Computer Vision (ICCV) (2013)

  24. Kovashka, A., Grauman, K.: Discovering attribute shades of meaning with the crowd. Int. J. Comput. Vis. 114, 56–73 (2015)

  25. Kovashka, A., Vijayanarasimhan, S., Grauman, K.: Actively selecting annotations among objects and attributes. In: International Conference on Computer Vision (ICCV) (2011)

  26. Kovashka, A., Parikh, D., Grauman, K.: WhittleSearch: image search with relative attribute feedback. In: Conference on Computer Vision and Pattern Recognition (CVPR) (2012)

  27. Kovashka, A., Parikh, D., Grauman, K.: WhittleSearch: interactive image search with relative attribute feedback. Int. J. Comput. Vis. 115, 185–210 (2015)

  28. Kumar, N., Belhumeur, P.N., Nayar, S.K.: FaceTracer: a search engine for large collections of images with faces. In: European Conference on Computer Vision (ECCV) (2008)

  29. Kumar, N., Berg, A.C., Belhumeur, P.N., Nayar, S.K.: Attribute and simile classifiers for face verification. In: International Conference on Computer Vision (ICCV) (2009)

  30. Kumar, N., Berg, A.C., Belhumeur, P.N., Nayar, S.K.: Describable visual attributes for face verification and image search. IEEE Trans. Pattern Anal. Mach. Intell. 33(10), 1962–1977 (2011)

  31. Kurita, T., Kato, T.: Learning of personal visual impression for image database systems. In: International Conference on Document Analysis and Recognition (ICDAR) (1993)

  32. Lampert, C., Nickisch, H., Harmeling, S.: Learning to detect unseen object classes by between-class attribute transfer. In: Conference on Computer Vision and Pattern Recognition (CVPR) (2009)

  33. Li, B., Chang, E., Li, C.S.: Learning image query concepts via intelligent sampling. In: International Conference on Multimedia and Expo (ICME) (2001)

  34. Li, S., Shan, S., Chen, X.: Relative forest for attribute prediction. In: Asian Conference on Computer Vision (ACCV) (2013)

  35. Liu, S., Kovashka, A.: Adapting attributes using features similar across domains. In: Winter Conference on Applications of Computer Vision (WACV) (2016)

  36. Loeff, N., Alm, C.O., Forsyth, D.A.: Discriminating image senses by clustering with multimodal features. In: Association for Computational Linguistics (ACL) (2006)

  37. Ma, W.Y., Manjunath, B.S.: Netra: a toolbox for navigating large image databases. Multimedia Syst. 7(3), 184–198 (1999)

  38. Mahajan, D., Sellamanickam, S., Nair, V.: A joint learning framework for attribute models and object descriptions. In: International Conference on Computer Vision (ICCV) (2011)

  39. Mensink, T., Verbeek, J., Csurka, G.: Learning structured prediction models for interactive image labeling. In: Conference on Computer Vision and Pattern Recognition (CVPR) (2011)

  40. Oliva, A., Torralba, A.: Modeling the shape of the scene: a holistic representation of the spatial envelope. Int. J. Comput. Vis. 42, 145–175 (2001)

  41. Ordonez, V., Jagadeesh, V., Di, W., Bhardwaj, A., Piramuthu, R.: Furniture-geek: understanding fine-grained furniture attributes from freely associated text and tags. In: Winter Conference on Applications of Computer Vision (WACV) (2014)

  42. Ozeki, M., Okatani, T.: Understanding convolutional neural networks in terms of category-level attributes. In: Asian Conference on Computer Vision (ACCV) (2014)

  43. Parikh, D., Grauman, K.: Interactively building a discriminative vocabulary of nameable attributes. In: Conference on Computer Vision and Pattern Recognition (CVPR) (2011)

  44. Parikh, D., Grauman, K.: Relative attributes. In: International Conference on Computer Vision (ICCV) (2011)

  45. Parikh, D., Grauman, K.: Implied feedback: learning nuances of user behavior in image search. In: International Conference on Computer Vision (ICCV) (2013)

  46. Patterson, G., Hays, J.: SUN attribute database: discovering, annotating, and recognizing scene attributes. In: Conference on Computer Vision and Pattern Recognition (CVPR) (2012)

  47. Platt, J.C.: Probabilistic outputs for support vector machines and comparisons to regularized likelihood methods. In: Advances in Large Margin Classifiers (1999)

  48. Rasiwasia, N., Moreno, P.J., Vasconcelos, N.: Bridging the gap: query by semantic example. IEEE Trans. Multimedia 9(5), 923–938 (2007)

  49. Rastegari, M., Farhadi, A., Forsyth, D.A.: Attribute discovery via predictable discriminative binary codes. In: European Conference on Computer Vision (ECCV) (2012)

  50. Rastegari, M., Parikh, D., Diba, A., Farhadi, A.: Multi-attribute queries: to merge or not to merge? In: Conference on Computer Vision and Pattern Recognition (CVPR) (2013)

  51. Roy, N., McCallum, A.: Toward optimal active learning through sampling estimation of error reduction. In: International Conference on Machine Learning (ICML) (2001)

  52. Rui, Y., Huang, T.S., Ortega, M., Mehrotra, S.: Relevance feedback: a power tool for interactive content-based image retrieval. IEEE Trans. Circ. Syst. Video Technol. (1998)

  53. Salakhutdinov, R., Mnih, A.: Bayesian probabilistic matrix factorization using Markov chain Monte Carlo. In: International Conference on Machine Learning (ICML) (2008)

  54. Sandeep, R.N., Verma, Y., Jawahar, C.: Relative parts: distinctive parts for learning relative attributes. In: Conference on Computer Vision and Pattern Recognition (CVPR) (2014)

  55. Scheirer, W., Kumar, N., Belhumeur, P.N., Boult, T.E.: Multi-attribute spaces: calibration for attribute fusion and similarity search. In: Conference on Computer Vision and Pattern Recognition (CVPR) (2012)

  56. Schwartz, G., Nishino, K.: Automatically discovering local visual material attributes. In: Conference on Computer Vision and Pattern Recognition (CVPR) (2015)

  57. Shankar, S., Garg, V.K., Cipolla, R.: Deep-carving: discovering visual attributes by carving deep neural nets. In: Conference on Computer Vision and Pattern Recognition (CVPR) (2015)

  58. Shao, J., Kang, K., Loy, C.C., Wang, X.: Deeply learned attributes for crowded scene understanding. In: Conference on Computer Vision and Pattern Recognition (CVPR) (2015)

  59. Sharmanska, V., Quadrianto, N., Lampert, C.: Augmented attribute representations. In: European Conference on Computer Vision (ECCV) (2012)

  60. Siddiquie, B., Feris, R., Davis, L.: Image ranking and retrieval based on multi-attribute queries. In: Conference on Computer Vision and Pattern Recognition (CVPR) (2011)

  61. Tamuz, O., Liu, C., Belongie, S., Shamir, O., Kalai, A.T.: Adaptively learning the crowd kernel. In: International Conference on Machine Learning (ICML) (2011)

  62. Tieu, K., Viola, P.: Boosting image retrieval. In: Conference on Computer Vision and Pattern Recognition (CVPR) (2000)

  63. Tong, S., Chang, E.: Support vector machine active learning for image retrieval. In: ACM Multimedia (2001)

  64. Tunkelang, D.: Faceted search. In: Synthesis Lectures on Information Concepts, Retrieval, and Services (2009)

  65. Vondrick, C., Khosla, A., Malisiewicz, T., Torralba, A.: HOGgles: visualizing object detection features. In: International Conference on Computer Vision (ICCV) (2013)

  66. Wang, X., Ji, Q.: A unified probabilistic approach modeling relationships between attributes and objects. In: International Conference on Computer Vision (ICCV) (2013)

  67. Xiao, F., Lee, Y.J.: Discovering the spatial extent of relative attributes. In: International Conference on Computer Vision (ICCV) (2015)

  68. Yang, J., Yan, R., Hauptmann, A.G.: Adapting SVM classifiers to data with shifted distributions. In: IEEE International Conference on Data Mining (ICDM) Workshops (2007)

  69. Yu, F., Cao, L., Feris, R., Smith, J., Chang, S.F.: Designing category-level attributes for discriminative visual recognition. In: Conference on Computer Vision and Pattern Recognition (CVPR) (2013)

  70. Zhou, X.S., Huang, T.S.: Relevance feedback in image retrieval: a comprehensive review. In: Multimedia Systems (2003)

Acknowledgements

This research was supported by ONR YIP grant N00014-12-1-0754 and ONR ATL grant N00014-11-1-0105. We would like to thank Devi Parikh for her collaboration on WhittleSearch and feedback on our other work, as well as Ray Mooney for his suggestions for future work.

Author information

Corresponding author

Correspondence to Adriana Kovashka.

Copyright information

© 2017 Springer International Publishing AG

About this chapter

Cite this chapter

Kovashka, A., Grauman, K. (2017). Attributes for Image Retrieval. In: Feris, R., Lampert, C., Parikh, D. (eds) Visual Attributes. Advances in Computer Vision and Pattern Recognition. Springer, Cham. https://doi.org/10.1007/978-3-319-50077-5_5

  • DOI: https://doi.org/10.1007/978-3-319-50077-5_5

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-50075-1

  • Online ISBN: 978-3-319-50077-5

  • eBook Packages: Computer Science, Computer Science (R0)
