Abstract
Image retrieval is a computer vision application that people encounter in their everyday lives. To enable accurate retrieval results, a human user needs to be able to communicate in a rich and noiseless way with the retrieval system. We propose semantic visual attributes as a communication channel for search because they are commonly used by humans to describe the world around them. We first propose a new feedback interaction where users can directly comment on how individual properties of retrieved content should be adjusted to more closely match the desired visual content. We then show how to ensure this interaction is as informative as possible, by having the vision system ask those questions that will most increase its certainty over what content is relevant. To ensure that attribute-based statements from the user are not misinterpreted by the system, we model the unique ways in which users employ attribute terms, and develop personalized attribute models. We discover clusters among users in terms of how they use a given attribute term, and consequently discover the distinct “shades of meaning” of these attributes. Our work is a significant step in the direction of bridging the semantic gap between high-level user intent and low-level visual features. We discuss extensions to further increase the utility of attributes for practical search applications.
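The feedback interaction described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that every database image has a predicted real-valued strength for each attribute, that each feedback statement says the target shows "more" or "less" of an attribute than a reference image, and that images are ranked by how many statements they satisfy. The image names, attribute names, strength values, and the helpers `satisfies` and `rank_by_feedback` are all hypothetical and are not taken from the chapter.

```python
from typing import Dict, List, Tuple

# Hypothetical predicted attribute strengths (illustrative values only).
Strengths = Dict[str, Dict[str, float]]

# One feedback statement: (attribute, reference image, direction), where
# direction "more" means the target shows more of the attribute than the
# reference image, and "less" means it shows less.
Feedback = Tuple[str, str, str]


def satisfies(image: str, fb: Feedback, strengths: Strengths) -> bool:
    """Return True if `image` is consistent with one feedback statement."""
    attribute, reference, direction = fb
    candidate = strengths[image][attribute]
    anchor = strengths[reference][attribute]
    if direction == "more":
        return candidate > anchor
    if direction == "less":
        return candidate < anchor
    raise ValueError(f"unknown direction: {direction!r}")


def rank_by_feedback(strengths: Strengths, feedback: List[Feedback]) -> List[str]:
    """Sort images by the number of satisfied statements (ties broken by name)."""
    scores = {
        image: sum(satisfies(image, fb, strengths) for fb in feedback)
        for image in strengths
    }
    return sorted(scores, key=lambda image: (-scores[image], image))


strengths: Strengths = {
    "shoe_a": {"shiny": 0.9, "formal": 0.2},
    "shoe_b": {"shiny": 0.4, "formal": 0.7},
    "shoe_c": {"shiny": 0.1, "formal": 0.9},
}

# "Less shiny than shoe_b, and more formal than shoe_b."
feedback: List[Feedback] = [("shiny", "shoe_b", "less"), ("formal", "shoe_b", "more")]
ranking = rank_by_feedback(strengths, feedback)
print(ranking)
```

In this toy example only `shoe_c` satisfies both statements, so it is ranked first. Counting satisfied statements is one simple scoring choice; a probabilistic relevance model would be another.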
Notes
1. To derive an attribute vocabulary, one could use [43], which automatically generates splits in visual space and learns from human annotations whether these splits can be described with an attribute; [46], which shows pairs of images to users on Amazon’s Mechanical Turk platform and aggregates terms that describe what one image has and the other lacks; or [1, 41], which mine text to discover attributes for which reliable computer models can be learned.
2. The annotations are available at http://vision.cs.utexas.edu/whittlesearch/.
3. In Sect. 5.3, we extend this approach to also allow “equally” responses.
4. As another point of comparison against existing methods, a multi-attribute query baseline that ranks images by how many binary attributes they share with the target image achieves NDCG scores that are 40% weaker on average than our method when using 40 feedback constraints.
5. The exhaustive baseline was too expensive to run on all 14K Shoes images. On a 1000-image subset, it performs similarly to how it does on the other datasets.
6. Below we use the terms “school” and “shade” interchangeably.
7.
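The multi-attribute query baseline mentioned in note 4 can be sketched as follows. This is a minimal illustration, not the evaluated implementation: it assumes each image is described by a vector of 0/1 attribute labels, and it interprets "share" as agreeing on either the presence or the absence of an attribute. The image names, label vectors, and the helpers `shared_attribute_count` and `rank_by_shared_attributes` are hypothetical.

```python
from typing import Dict, List


def shared_attribute_count(a: List[int], b: List[int]) -> int:
    """Count positions where two binary attribute vectors agree.

    Assumption: agreement on absence (0, 0) counts as shared, just like
    agreement on presence (1, 1).
    """
    if len(a) != len(b):
        raise ValueError("attribute vectors must have the same length")
    return sum(1 for x, y in zip(a, b) if x == y)


def rank_by_shared_attributes(
    database: Dict[str, List[int]], target: List[int]
) -> List[str]:
    """Rank images by descending number of shared attributes (ties by name)."""
    counts = {
        name: shared_attribute_count(labels, target)
        for name, labels in database.items()
    }
    return sorted(counts, key=lambda name: (-counts[name], name))


# Hypothetical binary attribute labels for three database images.
database = {
    "img1": [1, 0, 1, 1],
    "img2": [1, 1, 0, 0],
    "img3": [1, 0, 1, 0],
}
target = [1, 0, 1, 1]

baseline_ranking = rank_by_shared_attributes(database, target)
print(baseline_ranking)
```

Because this baseline uses only binary presence labels, it cannot express that a target should have more or less of an attribute than a given image.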
References
Berg, T.L., Berg, A.C., Shih, J.: Automatic attribute discovery and characterization from noisy web data. In: European Conference on Computer Vision (ECCV) (2010)
Branson, S., Wah, C., Schroff, F., Babenko, B., Welinder, P., Perona, P., Belongie, S.: Visual recognition with humans in the loop. In: European Conference on Computer Vision (ECCV) (2010)
Chen, L., Zhang, Q., Li, B.: Predicting multiple attributes via relative multi-task learning. In: Conference on Computer Vision and Pattern Recognition (CVPR) (2014)
Chen, Q., Huang, J., Feris, R., Brown, L.M., Dong, J., Yan, S.: Deep domain adaptation for describing people based on fine-grained clothing attributes. In: Conference on Computer Vision and Pattern Recognition (CVPR) (2015)
Cox, I., Miller, M., Minka, T., Papathomas, T., Yianilos, P.: The Bayesian image retrieval system, PicHunter: theory, implementation, and psychophysical experiments. IEEE Trans. Image Process. 9(1), 20–37 (2000)
Curran, W., Moore, T., Kulesza, T., Wong, W.K., Todorovic, S., Stumpf, S., White, R., Burnett, M.: Towards recognizing “cool”: can end users help computer vision recognize subjective attributes or objects in images? In: Intelligent User Interfaces (IUI) (2012)
Douze, M., Ramisa, A., Schmid, C.: Combining attributes and fisher vectors for efficient image retrieval. In: Conference on Computer Vision and Pattern Recognition (CVPR) (2011)
Duan, K., Parikh, D., Crandall, D., Grauman, K.: Discovering localized attributes for fine-grained recognition. In: Conference on Computer Vision and Pattern Recognition (CVPR) (2012)
Eitz, M., Hildebrand, K., Boubekeur, T., Alexa, M.: Sketch-based image retrieval: benchmark and bag-of-features descriptors. IEEE Trans. Vis. Comput. Graph. 17(11), 1624–1636 (2011)
Endres, I., Farhadi, A., Hoiem, D., Forsyth, D.A.: The benefits and challenges of collecting richer object annotations. In: Conference on Computer Vision and Pattern Recognition (CVPR) Workshops (2010)
Escorcia, V., Niebles, J.C., Ghanem, B.: On the relationship between visual attributes and convolutional networks. In: Conference on Computer Vision and Pattern Recognition (CVPR) (2015)
Everett, C.: Linguistic relativity: evidence across languages and cognitive domains. In: Mouton De Gruyter (2013)
Farhadi, A., Endres, I., Hoiem, D., Forsyth, D.A.: Describing objects by their attributes. In: Conference on Computer Vision and Pattern Recognition (CVPR) (2009)
Ferecatu, M., Geman, D.: Interactive search for image categories by mental matching. In: International Conference on Computer Vision (ICCV) (2007)
Ferrari, V., Zisserman, A.: Learning visual attributes. In: Conference on Neural Information Processing Systems (NIPS) (2007)
Fogarty, J., Tan, D.S., Kapoor, A., Winder, S.: Cueflik: interactive concept learning in image search. In: Conference on Human Factors in Computing Systems (CHI) (2008)
Geng, B., Yang, L., Xu, C., Hua, X.S.: Ranking model adaptation for domain-specific search. IEEE Trans. Knowl. Data Eng. 24(4), 745–758 (2012)
Heim, E., Berger, M., Seversky, L., Hauskrecht, M.: Active perceptual similarity modeling with auxiliary information. In: arXiv preprint arXiv:1511.02254 (2015)
Hofmann, T.: Probabilistic latent semantic analysis. In: Uncertainty in Artificial Intelligence (UAI) (1999)
Joachims, T.: Optimizing search engines using clickthrough data. In: International Conference on Knowledge Discovery and Data Mining (KDD) (2002)
Kekalainen, J., Jarvelin, K.: Cumulated gain-based evaluation of IR techniques. ACM Trans. Inf. Syst. 20(4), 422–446 (2002)
Kovashka, A., Grauman, K.: Attribute adaptation for personalized image search. In: International Conference on Computer Vision (ICCV) (2013)
Kovashka, A., Grauman, K.: Attribute pivots for guiding relevance feedback in image search. In: International Conference on Computer Vision (ICCV) (2013)
Kovashka, A., Grauman, K.: Discovering attribute shades of meaning with the crowd. Int. J. Comput. Vis. 114, 56–73 (2015)
Kovashka, A., Vijayanarasimhan, S., Grauman, K.: Actively selecting annotations among objects and attributes. In: International Conference on Computer Vision (ICCV) (2011)
Kovashka, A., Parikh, D., Grauman, K.: WhittleSearch: image search with relative attribute feedback. In: Conference on Computer Vision and Pattern Recognition (CVPR) (2012)
Kovashka, A., Parikh, D., Grauman, K.: WhittleSearch: interactive image search with relative attribute feedback. Int. J. Comput. Vis. 115, 185–210 (2015)
Kumar, N., Belhumeur, P.N., Nayar, S.K.: FaceTracer: a search engine for large collections of images with faces. In: European Conference on Computer Vision (ECCV) (2008)
Kumar, N., Berg, A.C., Belhumeur, P.N., Nayar, S.K.: Attribute and simile classifiers for face verification. In: International Conference on Computer Vision (ICCV) (2009)
Kumar, N., Berg, A.C., Belhumeur, P.N., Nayar, S.K.: Describable visual attributes for face verification and image search. IEEE Trans. Pattern Anal. Mach. Intell. 33(10), 1962–1977 (2011)
Kurita, T., Kato, T.: Learning of personal visual impression for image database systems. In: International Conference on Document Analysis and Recognition (ICDAR) (1993)
Lampert, C., Nickisch, H., Harmeling, S.: Learning to detect unseen object classes by between-class attribute transfer. In: Conference on Computer Vision and Pattern Recognition (CVPR) (2009)
Li, B., Chang, E., Li, C.S.: Learning image query concepts via intelligent sampling. In: International Conference on Multimedia and Expo (ICME) (2001)
Li, S., Shan, S., Chen, X.: Relative forest for attribute prediction. In: Asian Conference on Computer Vision (ACCV) (2013)
Liu, S., Kovashka, A.: Adapting attributes using features similar across domains. In: Winter Conference on Applications of Computer Vision (WACV) (2016)
Loeff, N., Alm, C.O., Forsyth, D.A.: Discriminating image senses by clustering with multimodal features. In: Association for Computational Linguistics (ACL) (2006)
Ma, W.Y., Manjunath, B.S.: Netra: a toolbox for navigating large image databases. Multimedia Syst. 7(3), 184–198 (1999)
Mahajan, D., Sellamanickam, S., Nair, V.: A joint learning framework for attribute models and object descriptions. In: International Conference on Computer Vision (ICCV) (2011)
Mensink, T., Verbeek, J., Csurka, G.: Learning structured prediction models for interactive image labeling. In: Conference on Computer Vision and Pattern Recognition (CVPR) (2011)
Oliva, A., Torralba, A.: Modeling the shape of the scene: a holistic representation of the spatial envelope. Int. J. Comput. Vis. 42, 145–175 (2001)
Ordonez, V., Jagadeesh, V., Di, W., Bhardwaj, A., Piramuthu, R.: Furniture-geek: understanding fine-grained furniture attributes from freely associated text and tags. In: Winter Conference on Applications of Computer Vision (WACV) (2014)
Ozeki, M., Okatani, T.: Understanding convolutional neural networks in terms of category-level attributes. In: Asian Conference on Computer Vision (ACCV) (2014)
Parikh, D., Grauman, K.: Interactively building a discriminative vocabulary of nameable attributes. In: Conference on Computer Vision and Pattern Recognition (CVPR) (2011)
Parikh, D., Grauman, K.: Relative attributes. In: International Conference on Computer Vision (ICCV) (2011)
Parikh, D., Grauman, K.: Implied feedback: learning nuances of user behavior in image search. In: International Conference on Computer Vision (ICCV) (2013)
Patterson, G., Hays, J.: Sun attribute database: discovering, annotating, and recognizing scene attributes. In: Conference on Computer Vision and Pattern Recognition (CVPR) (2012)
Platt, J.C.: Probabilistic outputs for support vector machines and comparisons to regularized likelihood methods. In: Advances in Large Margin Classifiers (1999)
Rasiwasia, N., Moreno, P.J., Vasconcelos, N.: Bridging the gap: query by semantic example. IEEE Trans. Multimedia 9(5), 923–938 (2007)
Rastegari, M., Farhadi, A., Forsyth, D.A.: Attribute discovery via predictable discriminative binary codes. In: European Conference on Computer Vision (ECCV) (2012)
Rastegari, M., Parikh, D., Diba, A., Farhadi, A.: Multi-attribute queries: to merge or not to merge? In: Conference on Computer Vision and Pattern Recognition (CVPR) (2013)
Roy, N., McCallum, A.: Toward optimal active learning through sampling estimation of error reduction. In: International Conference on Machine Learning (ICML) (2001)
Rui, Y., Huang, T.S., Ortega, M., Mehrotra, S.: Relevance feedback: a power tool for interactive content-based image retrieval. IEEE Trans. Circ. Syst. Video Technol. (1998)
Salakhutdinov, R., Mnih, A.: Bayesian probabilistic matrix factorization using Markov chain Monte Carlo. In: International Conference on Machine Learning (ICML) (2008)
Sandeep, R.N., Verma, Y., Jawahar, C.: Relative parts: distinctive parts for learning relative attributes. In: Conference on Computer Vision and Pattern Recognition (CVPR) (2014)
Scheirer, W., Kumar, N., Belhumeur, P.N., Boult, T.E.: Multi-attribute spaces: calibration for attribute fusion and similarity search. In: Conference on Computer Vision and Pattern Recognition (CVPR) (2012)
Schwartz, G., Nishino, K.: Automatically discovering local visual material attributes. In: Conference on Computer Vision and Pattern Recognition (CVPR) (2015)
Shankar, S., Garg, V.K., Cipolla, R.: Deep-carving: discovering visual attributes by carving deep neural nets. In: Conference on Computer Vision and Pattern Recognition (CVPR) (2015)
Shao, J., Kang, K., Loy, C.C., Wang, X.: Deeply learned attributes for crowded scene understanding. In: Conference on Computer Vision and Pattern Recognition (CVPR) (2015)
Sharmanska, V., Quadrianto, N., Lampert, C.: Augmented attribute representations. In: European Conference on Computer Vision (ECCV) (2012)
Siddiquie, B., Feris, R., Davis, L.: Image ranking and retrieval based on multi-attribute queries. In: Conference on Computer Vision and Pattern Recognition (CVPR) (2011)
Tamuz, O., Liu, C., Belongie, S., Shamir, O., Kalai, A.T.: Adaptively learning the crowd kernel. In: International Conference on Machine Learning (ICML) (2011)
Tieu, K., Viola, P.: Boosting image retrieval. In: Conference on Computer Vision and Pattern Recognition (CVPR) (2000)
Tong, S., Chang, E.: Support vector machine active learning for image retrieval. In: ACM Multimedia (2001)
Tunkelang, D.: Faceted search. In: Synthesis Lectures on Information Concepts, Retrieval, and Services (2009)
Vondrick, C., Khosla, A., Malisiewicz, T., Torralba, A.: Hoggles: visualizing object detection features. In: International Conference on Computer Vision (ICCV) (2013)
Wang, X., Ji, Q.: A unified probabilistic approach modeling relationships between attributes and objects. In: International Conference on Computer Vision (ICCV) (2013)
Xiao, F., Lee, Y.J.: Discovering the spatial extent of relative attributes. In: International Conference on Computer Vision (ICCV) (2015)
Yang, J., Yan, R., Hauptmann, A.G.: Adapting SVM classifiers to data with shifted distributions. In: IEEE International Conference on Data Mining (ICDM) Workshops (2007)
Yu, F., Cao, L., Feris, R., Smith, J., Chang, S.F.: Designing category-level attributes for discriminative visual recognition. In: Conference on Computer Vision and Pattern Recognition (CVPR) (2013)
Zhou, X.S., Huang, T.S.: Relevance feedback in image retrieval: a comprehensive review. In: Multimedia Systems (2003)
Acknowledgements
This research was supported by ONR YIP grant N00014-12-1-0754 and ONR ATL grant N00014-11-1-0105. We would like to thank Devi Parikh for her collaboration on WhittleSearch and feedback on our other work, as well as Ray Mooney for his suggestions for future work.
Copyright information
© 2017 Springer International Publishing AG
About this chapter
Cite this chapter
Kovashka, A., Grauman, K. (2017). Attributes for Image Retrieval. In: Feris, R., Lampert, C., Parikh, D. (eds) Visual Attributes. Advances in Computer Vision and Pattern Recognition. Springer, Cham. https://doi.org/10.1007/978-3-319-50077-5_5
DOI: https://doi.org/10.1007/978-3-319-50077-5_5
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-50075-1
Online ISBN: 978-3-319-50077-5
eBook Packages: Computer Science (R0)