WhittleSearch: Interactive Image Search with Relative Attribute Feedback

Published in the International Journal of Computer Vision.

Abstract

We propose a novel mode of feedback for image search, where a user describes which properties of exemplar images should be adjusted in order to more closely match their mental model of the image sought. For example, perusing image results for a query “black shoes”, the user might state, “Show me shoe images like these, but sportier.” Offline, our approach first learns a set of ranking functions, each of which predicts the relative strength of a nameable attribute in an image (e.g., sportiness). At query time, the system presents the user with a set of exemplar images, and the user relates them to their target image with comparative statements. Using a series of such constraints in the multi-dimensional attribute space, our method iteratively updates its relevance function and re-ranks the database of images. To determine which exemplar images receive feedback from the user, we present two variants of the approach: one where the feedback is user-initiated and another where the feedback is actively system-initiated. In either case, our approach allows a user to efficiently “whittle away” irrelevant portions of the visual feature space, using semantic language to precisely communicate their preferences to the system. We demonstrate our technique for refining image search for people, products, and scenes, and we show that it outperforms traditional binary relevance feedback in terms of search speed and accuracy. In addition, the ordinal nature of relative attributes helps make our active approach efficient: both computationally for the machine when selecting the reference images, and for the user by requiring less user interaction than conventional passive and active methods.
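The offline step (learning a ranking function per attribute from ordered image pairs) can be pictured with a small sketch. The snippet below trains one linear ranker via subgradient descent on a rank-SVM-style hinge loss; it is a simplified stand-in for the rankers of Parikh and Grauman (2011b), and the feature values, pair labels, and hyperparameters are illustrative assumptions, not the paper's actual setup.

```python
import numpy as np

def train_ranker(X, ordered_pairs, lr=0.1, reg=0.01, epochs=200):
    """Learn a linear ranking function w so that w @ x_i > w @ x_j
    for every ordered pair (i, j), using subgradient descent on a
    rank-SVM-style hinge loss with L2 regularization."""
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        for i, j in ordered_pairs:
            diff = X[i] - X[j]
            if w @ diff < 1:      # margin violated: move toward diff
                w += lr * (diff - reg * w)
            else:                 # margin satisfied: only shrink w
                w -= lr * reg * w
    return w

# Toy image features and comparative supervision (illustrative values):
# image 2 is sportier than image 1, which is sportier than image 0.
X = np.array([[0.1, 0.5],
              [0.4, 0.1],
              [0.9, 0.3]])
pairs = [(2, 1), (1, 0), (2, 0)]

w = train_ranker(X, pairs)
ranks = X @ w   # predicted relative attribute strengths, higher = more
```

Applying the learned function to any database image then yields its predicted strength of the attribute, which is what the search stage compares against the user's feedback.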


Notes

  1. Note that one can also use the equality constraints in \(E_m\) for training these ranking functions, as in Parikh and Grauman (2011b). In our approach, we use these constraints to compute parameters for scoring relevance, in Sect. 3.2.

  2. We do, however, assume that all users would agree on the true attribute strength in a given image. See Kovashka and Grauman (2013a) for an approach to model the user-specific perception of an attribute.

    Fig. 4 Sketch of WhittleSearch relevance computation. This toy example illustrates the intersection of relative constraints with \(M=2\) attributes. The images are plotted on the axes for both attributes. The space of images satisfying each constraint is marked in a different color, and the region satisfying all constraints is marked with a black dashed line; in this case it contains only one image (outlined in black). Best viewed in color

  3. The exhaustive baseline was too expensive to run on all 14K Shoes images. On a 1,000-image subset, it performs similarly to how it does on the other datasets.
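The relevance computation illustrated in Fig. 4 can be sketched as follows. Each comparative statement carves out a half-space in attribute space; an image's relevance is scored by how many constraints it satisfies, and the dashed region of the figure corresponds to images satisfying all of them. The attribute strengths and constraints below are hypothetical values, not the paper's data.

```python
import numpy as np

# Hypothetical attribute strengths for four database images in an
# M = 2 attribute space (columns: "sporty", "shiny").
A = np.array([[0.9, 0.2],
              [0.4, 0.8],
              [0.7, 0.7],
              [0.1, 0.3]])

# Feedback constraints (ref_image, attribute, sign): sign = +1 means
# the target has MORE of the attribute than the reference, -1 LESS.
constraints = [(3, 0, +1),   # "sportier than image 3"
               (1, 1, -1)]   # "less shiny than image 1"

# One boolean mask per constraint, marking the half-space it allows.
masks = [s * (A[:, a] - A[r, a]) > 0 for r, a, s in constraints]

# Relevance = number of satisfied constraints; images satisfying ALL
# constraints form the dashed intersection region of Fig. 4.
scores = np.sum(masks, axis=0)
in_region = np.flatnonzero(np.logical_and.reduce(masks))
```

Sorting the database by `scores` (descending) gives the re-ranked results after this round of feedback; accumulating constraints across rounds progressively whittles the candidate set.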

References

  • Berg, T., Berg, A. & Shih, J. (2010). Automatic attribute discovery and characterization from noisy web data. In: Proceedings of the European Conference on Computer Vision (ECCV).

  • Biswas, A. & Parikh, D. (2013). Simultaneous active learning of classifiers and attributes via relative feedback. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

  • Branson, S., Wah, C., Schroff, F., Babenko, B., Welinder, P., Perona, P. & Belongie, S. (2010). Visual recognition with humans in the loop. In: Proceedings of the European Conference on Computer Vision (ECCV).

  • Cox, I., Miller, M., Minka, T., Papathomas, T., & Yianilos, P. (2000). The Bayesian image retrieval system, PicHunter: Theory, implementation and psychophysical experiments. IEEE Transactions on Image Processing, 9(1), 20–37.

  • Douze, M., Ramisa, A., & Schmid, C. (2011). Combining attributes and Fisher vectors for efficient image retrieval. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

  • Farhadi, A., Endres, I., Hoiem, D., & Forsyth, D. (2009). Describing objects by their attributes. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

  • Ferecatu, M., & Geman, D. (2007). Interactive search for image categories by mental matching. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV).

  • Flickner, M., Sawhney, H., Niblack, W., Ashley, J., Huang, Q., Dom, B., et al. (1995). Query by image and video content: The QBIC system. IEEE Computer, 28(9), 23–32.

  • Geman, D. & Jedynak, B. (1998). Model-based classification trees. IEEE Transactions on Information Theory, 47(3), 1075–1082.

  • Iqbal, Q. & Aggarwal, J. K. (2002) CIRES: A system for content-based retrieval in digital image libraries. In: Proceedings of the International Conference on Control, Automation, Robotics and Vision.

  • Jayaraman, D., Sha, F. & Grauman, K. (2014). Decorrelating semantic visual attributes by resisting the urge to share. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

  • Joachims, T. (2002). Optimizing search engines using clickthrough data. In: Proceedings of the ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD).

  • Joachims, T. (2006). Training linear SVMs in linear time. In: Proceedings of the ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD).

  • Kekäläinen, J., & Järvelin, K. (2002). Cumulated gain-based evaluation of IR techniques. ACM Transactions on Information Systems, 20(4), 422–446.

  • Kovashka, A. & Grauman, K. (2013a). Attribute adaptation for personalized image search. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV).

  • Kovashka, A. & Grauman, K. (2013b). Attribute pivots for guiding relevance feedback in image search. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV).

  • Kovashka, A., Vijayanarasimhan, S. & Grauman, K. (2011). Actively selecting annotations among objects and attributes. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV).

  • Kovashka, A., Parikh, D., & Grauman, K. (2012). WhittleSearch: Image search with relative attribute feedback. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

  • Kulkarni, P., Sharma, G., Zepeda, J. & Chevallier, L. (2014). Transfer learning via attributes for improved on-the-fly classification. In: Proceedings of the Winter Conference on Applications of Computer Vision (WACV).

  • Kumar, N., Belhumeur, P. & Nayar, S. (2008). Facetracer: A search engine for large collections of images with faces. In: Proceedings of the European Conference on Computer Vision (ECCV).

  • Kumar, N., Berg, A. C., Belhumeur, P. N. & Nayar, S. K. (2009). Attribute and simile classifiers for face verification. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV).

  • Kurita, T., & Kato, T. (1993). Learning of personal visual impression for image database systems. In: Proceedings of the International Conference on Document Analysis and Recognition (ICDAR).

  • Lampert, C., Nickisch, H. & Harmeling, S. (2009). Learning to detect unseen object classes by between-class attribute transfer. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

  • Li, B., Chang, E. & Li, C. S. (2001). Learning image query concepts via intelligent sampling. In: Proceedings of the International Conference on Multimedia and Expo (ICME).

  • Ma, W. & Manjunath, B. (1997). NeTra: A toolbox for navigating large image databases. In: Proceedings of the International Conference on Image Processing (ICIP).

  • MacArthur, S. D., Brodley, C. E. & Shyu, C. R. (2000). Relevance feedback decision trees in content-based image retrieval. In: Proceedings of the IEEE Workshop on Content-Based Access of Image and Video Libraries.

  • Maji, S. (2012). Discovering a lexicon of parts and attributes. In: Proceedings of the European Conference on Computer Vision (ECCV) Workshop on Parts and Attributes.

  • Mensink, T., Verbeek, J. & Csurka, G. (2011). Learning structured prediction models for interactive image labeling. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

  • Naphade, M., Smith, J., Tesic, J., Chang, S. F., Hsu, W., Kennedy, L., et al. (2006). Large-scale concept ontology for multimedia. IEEE MultiMedia, 13(3), 86–91.

  • Oliva, A., & Torralba, A. (2001). Modeling the shape of the scene: A holistic representation of the spatial envelope. International Journal of Computer Vision (IJCV), 42(3), 145–175.

  • Parikh, D., & Grauman, K. (2011a). Interactively building a discriminative vocabulary of nameable attributes. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

  • Parikh, D., & Grauman, K. (2011b). Relative attributes. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV).

  • Parikh, D., & Grauman, K. (2013). Implied feedback: Learning nuances of user behavior in image search. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV).

  • Parkash, A., & Parikh, D. (2012). Attributes for classifier feedback. In: Proceedings of the European Conference on Computer Vision (ECCV).

  • Patterson, G., Xu, C., Su, H., & Hays, J. (2014). The SUN attribute database: Beyond Categories for deeper scene understanding. International Journal of Computer Vision (IJCV), 108(1–2), 59–81.

  • Platt, J. C. (1999). Probabilistic outputs for support vector machines and comparisons to regularized likelihood methods. Advances in Large Margin Classifiers, 10(3), 61–74.

  • Rasiwasia, N., Moreno, P., & Vasconcelos, N. (2007). Bridging the gap: Query by semantic example. IEEE Transactions on Multimedia, 9(5), 923–938.

  • Rastegari, M., Parikh, D., Diba, A. & Farhadi, A. (2013). Multi-attribute queries: To merge or not to merge? In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

  • Rui, Y., Huang, T., Ortega, M., & Mehrotra, S. (1998). Relevance feedback: A power tool for interactive content-based image retrieval. IEEE Transactions on Circuits and Video Technology, 8(5), 644–655.

  • Saleh, B., Farhadi, A. & Elgammal, A. (2013). Object-centric anomaly detection by attribute-based reasoning. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

  • Scheirer, W., Kumar, N., Belhumeur, P. & Boult, T. (2012). Multi-attribute spaces: Calibration for attribute fusion and similarity search. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

  • Siddiquie, B., Feris, R. & Davis, L. (2011). Image ranking and retrieval based on multi-attribute queries. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

  • Smith, J., Naphade, M. & Natsev, A. (2003). Multimedia semantic indexing using model vectors. In: Proceedings of the International Conference on Multimedia and Expo (ICME).

  • Sznitman, R., & Jedynak, B. (2010). Active testing for face detection and localization. IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 32(10), 1914–1920.

  • Tieu, K. & Viola, P. (2000). Boosting image retrieval. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

  • Tong, S. & Chang, E. (2001). Support vector machine active learning for image retrieval. In: Proceedings of the ACM International Conference on Multimedia.

  • Vijayanarasimhan, S. & Kapoor, A. (2010). Visual recognition and detection under bounded computational resources. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

  • Wah, C. & Belongie, S. (2013). Attribute-based detection of unfamiliar classes with humans in the loop. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

  • Wah, C., Van Horn, G., Branson, S., Maji, S., Perona, P. & Belongie, S. (2014). Similarity comparisons for interactive fine-grained categorization. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

  • Wang, X., Liu, K. & Tang, X. (2011). Query-specific visual semantic spaces for web image re-ranking. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

  • Wang, Y. & Mori, G. (2010). A discriminative latent model of object classes and attributes. In: Proceedings of the European Conference on Computer Vision (ECCV).

  • Zavesky, E. & Chang, S. F. (2008). Cu-Zero: Embracing the Frontier of interactive visual search for informed users. In: Proceedings of the ACM International Conference on Multimedia Information Retrieval.

  • Zhang, C., & Chen, T. (2002). An active learning framework for content based information retrieval. IEEE Transactions on Multimedia, 4(2), 260–268.

  • Zhou, X., & Huang, T. (2003). Relevance feedback in image retrieval: A comprehensive review. ACM Multimedia Systems, 8(6), 536–544.


Acknowledgments

We thank the anonymous reviewers for their helpful feedback and suggestions. This research was supported by ONR YIP Award N00014-12-1-0754 (K.G. and A.K.) and a Google Faculty Research Award (D.P.).

Corresponding author

Correspondence to Adriana Kovashka.

Additional information

Communicated by M. Hebert.

The work was done while Adriana Kovashka was at The University of Texas at Austin.

Appendix

See Table 4.

Table 4 Ordering of classes for the attributes in the Shoes dataset. A score of 10 denotes that the class has the attribute the most, and 1 that it has it the least
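Class-level scores like those in Table 4 supply exactly the ordered pairs the attribute rankers are trained on: any two images from differently scored classes yield one comparative constraint. A minimal sketch, where the class names, scores, and image labels are illustrative assumptions rather than the dataset's actual values:

```python
from itertools import combinations

# Illustrative class-level scores for one attribute ("sporty"):
# 10 = class has the attribute most, 1 = least (as in Table 4).
sporty = {"athletic-shoes": 10, "sneakers": 8, "clogs": 3, "high-heels": 1}

# Database images with their class labels (hypothetical).
images = [("img0", "clogs"), ("img1", "athletic-shoes"),
          ("img2", "high-heels"), ("img3", "sneakers")]

# Emit ordered pairs (i, j) meaning "image i shows more of the
# attribute than image j", skipping same-score (equality) pairs.
pairs = [(i, j) if sporty[ci] > sporty[cj] else (j, i)
         for (i, ci), (j, cj) in combinations(images, 2)
         if sporty[ci] != sporty[cj]]
```

Pairs from same-score classes would instead become equality constraints, which Note 1 explains are used for relevance-scoring parameters rather than ranker training.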

About this article

Cite this article

Kovashka, A., Parikh, D. & Grauman, K. WhittleSearch: Interactive Image Search with Relative Attribute Feedback. Int J Comput Vis 115, 185–210 (2015). https://doi.org/10.1007/s11263-015-0814-0
