WhittleSearch: Interactive Image Search with Relative Attribute Feedback

Published in the International Journal of Computer Vision.

Abstract

We propose a novel mode of feedback for image search, where a user describes which properties of exemplar images should be adjusted in order to more closely match their mental model of the image sought. For example, perusing image results for a query “black shoes”, the user might state, “Show me shoe images like these, but sportier.” Offline, our approach first learns a set of ranking functions, each of which predicts the relative strength of a nameable attribute in an image (e.g., sportiness). At query time, the system presents the user with a set of exemplar images, and the user relates them to their target image with comparative statements. Using a series of such constraints in the multi-dimensional attribute space, our method iteratively updates its relevance function and re-ranks the database of images. To determine which exemplar images receive feedback from the user, we present two variants of the approach: one where the feedback is user-initiated and another where the feedback is actively system-initiated. In either case, our approach allows a user to efficiently “whittle away” irrelevant portions of the visual feature space, using semantic language to precisely communicate their preferences to the system. We demonstrate our technique for refining image search for people, products, and scenes, and we show that it outperforms traditional binary relevance feedback in terms of search speed and accuracy. In addition, the ordinal nature of relative attributes helps make our active approach efficient: both computationally for the machine when selecting the reference images, and for the user by requiring less user interaction than conventional passive and active methods.
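The offline step (learning a ranking function per attribute from ordered image pairs) can be pictured with a small sketch. The snippet below trains one linear ranker via subgradient descent on a rank-SVM-style hinge loss; it is a simplified stand-in for the rankers of Parikh and Grauman (2011b), and the feature values, pair labels, and hyperparameters are illustrative assumptions, not the paper's actual setup.

```python
import numpy as np

def train_ranker(X, ordered_pairs, lr=0.1, reg=0.01, epochs=200):
    """Learn a linear ranking function w so that w @ x_i > w @ x_j
    for every ordered pair (i, j), using subgradient descent on a
    rank-SVM-style hinge loss with L2 regularization."""
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        for i, j in ordered_pairs:
            diff = X[i] - X[j]
            if w @ diff < 1:      # margin violated: move toward diff
                w += lr * (diff - reg * w)
            else:                 # margin satisfied: only shrink w
                w -= lr * reg * w
    return w

# Toy image features and comparative supervision (illustrative values):
# image 2 is sportier than image 1, which is sportier than image 0.
X = np.array([[0.1, 0.5],
              [0.4, 0.1],
              [0.9, 0.3]])
pairs = [(2, 1), (1, 0), (2, 0)]

w = train_ranker(X, pairs)
ranks = X @ w   # predicted relative attribute strengths, higher = more
```

Applying the learned function to any database image then yields its predicted strength of the attribute, which is what the search stage compares against the user's feedback.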


Notes

  1. Note that one can also use the equality constraints in \(E_m\) for training these ranking functions, as in Parikh and Grauman (2011b). In our approach, we use these constraints to compute parameters for scoring relevance, in Sect. 3.2.

  2. We do, however, assume that all users would agree on the true attribute strength in a given image. See Kovashka and Grauman (2013a) for an approach to model the user-specific perception of an attribute.

    Fig. 4 Sketch of WhittleSearch relevance computation. This toy example illustrates the intersection of relative constraints with \(M=2\) attributes. The images are plotted on the axes for both attributes. The space of images satisfying each constraint is marked in a different color, and the region satisfying all constraints is marked with a black dashed line; in this case it contains only one image (outlined in black). Best viewed in color

  3. The exhaustive baseline was too expensive to run on all 14K Shoes images. On a 1,000-image subset, it performs similarly to how it does on the other datasets.
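The relevance computation illustrated in Fig. 4 can be sketched as follows. Each comparative statement carves out a half-space in attribute space; an image's relevance is scored by how many constraints it satisfies, and the dashed region of the figure corresponds to images satisfying all of them. The attribute strengths and constraints below are hypothetical values, not the paper's data.

```python
import numpy as np

# Hypothetical attribute strengths for four database images in an
# M = 2 attribute space (columns: "sporty", "shiny").
A = np.array([[0.9, 0.2],
              [0.4, 0.8],
              [0.7, 0.7],
              [0.1, 0.3]])

# Feedback constraints (ref_image, attribute, sign): sign = +1 means
# the target has MORE of the attribute than the reference, -1 LESS.
constraints = [(3, 0, +1),   # "sportier than image 3"
               (1, 1, -1)]   # "less shiny than image 1"

# One boolean mask per constraint, marking the half-space it allows.
masks = [s * (A[:, a] - A[r, a]) > 0 for r, a, s in constraints]

# Relevance = number of satisfied constraints; images satisfying ALL
# constraints form the dashed intersection region of Fig. 4.
scores = np.sum(masks, axis=0)
in_region = np.flatnonzero(np.logical_and.reduce(masks))
```

Sorting the database by `scores` (descending) gives the re-ranked results after this round of feedback; accumulating constraints across rounds progressively whittles the candidate set.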

References

  • Berg, T., Berg, A. & Shih, J. (2010). Automatic attribute discovery and characterization from noisy web data. In: Proceedings of the European Conference on Computer Vision (ECCV).

  • Biswas, A. & Parikh, D. (2013). Simultaneous active learning of classifiers and attributes via relative feedback. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

  • Branson, S., Wah, C., Schroff, F., Babenko, B., Welinder, P., Perona, P. & Belongie, S. (2010). Visual recognition with humans in the loop. In: Proceedings of the European Conference on Computer Vision (ECCV).

  • Cox, I., Miller, M., Minka, T., Papathomas, T., & Yianilos, P. (2000). The Bayesian image retrieval system, PicHunter: Theory, implementation and psychophysical experiments. IEEE Transactions on Image Processing, 9(1), 20–37.

  • Douze, M., Ramisa, A., & Schmid, C. (2011). Combining attributes and Fisher vectors for efficient image retrieval. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

  • Farhadi, A., Endres, I., Hoiem, D., & Forsyth, D. (2009). Describing objects by their attributes. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

  • Ferecatu, M., & Geman, D. (2007). Interactive search for image categories by mental matching. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV).

  • Flickner, M., Sawhney, H., Niblack, W., Ashley, J., Huang, Q., Dom, B., et al. (1995). Query by image and video content: The QBIC system. IEEE Computer, 28(9), 23–32.

  • Geman, D. & Jedynak, B. (1998). Model-based classification trees. IEEE Transactions on Information Theory, 47(3), 1075–1082.

  • Iqbal, Q. & Aggarwal, J. K. (2002) CIRES: A system for content-based retrieval in digital image libraries. In: Proceedings of the International Conference on Control, Automation, Robotics and Vision.

  • Jayaraman, D., Sha, F. & Grauman, K. (2014). Decorrelating semantic visual attributes by resisting the urge to share. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

  • Joachims, T. (2002). Optimizing search engines using clickthrough data. In: Proceedings of the ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD).

  • Joachims, T. (2006). Training linear SVMs in linear time. In: Proceedings of the ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD).

  • Kekäläinen, J., & Järvelin, K. (2002). Cumulated gain-based evaluation of IR techniques. ACM Transactions on Information Systems, 20(4), 422–446.

  • Kovashka, A. & Grauman, K. (2013a). Attribute adaptation for personalized image search. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV).

  • Kovashka, A. & Grauman, K. (2013b). Attribute pivots for guiding relevance feedback in image search. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV).

  • Kovashka, A., Vijayanarasimhan, S. & Grauman, K. (2011). Actively selecting annotations among objects and attributes. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV).

  • Kovashka, A., Parikh, D., & Grauman, K. (2012). WhittleSearch: Image search with relative attribute feedback. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

  • Kulkarni, P., Sharma, G., Zepeda, J. & Chevallier, L. (2014). Transfer learning via attributes for improved on-the-fly classification. In: Proceedings of the Winter Conference on Applications of Computer Vision (WACV).

  • Kumar, N., Belhumeur, P. & Nayar, S. (2008). Facetracer: A search engine for large collections of images with faces. In: Proceedings of the European Conference on Computer Vision (ECCV).

  • Kumar, N., Berg, A. C., Belhumeur, P. N. & Nayar, S. K. (2009). Attribute and simile classifiers for face verification. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV).

  • Kurita, T., & Kato, T. (1993). Learning of personal visual impression for image database systems. In: Proceedings of the International Conference on Document Analysis and Recognition (ICDAR).

  • Lampert, C., Nickisch, H. & Harmeling, S. (2009). Learning to detect unseen object classes by between-class attribute transfer. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

  • Li, B., Chang, E. & Li, C. S. (2001). Learning image query concepts via intelligent sampling. In: Proceedings of the International Conference on Multimedia and Expo (ICME).

  • Ma, W. & Manjunath, B. (1997). NeTra: A toolbox for navigating large image databases. In: Proceedings of the International Conference on Image Processing (ICIP).

  • MacArthur, S. D., Brodley, C. E. & Shyu, C. R. (2000). Relevance feedback decision trees in content-based image retrieval. In: Proceedings of the IEEE Workshop on Content-Based Access of Image and Video Libraries.

  • Maji, S. (2012). Discovering a lexicon of parts and attributes. In: Proceedings of the European Conference on Computer Vision (ECCV) Workshop on Parts and Attributes.

  • Mensink, T., Verbeek, J. & Csurka, G. (2011). Learning structured prediction models for interactive image labeling. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

  • Naphade, M., Smith, J., Tesic, J., Chang, S. F., Hsu, W., Kennedy, L., et al. (2006). Large-scale concept ontology for multimedia. IEEE MultiMedia, 13(3), 86–91.

  • Oliva, A., & Torralba, A. (2001). Modeling the shape of the scene: A holistic representation of the spatial envelope. International Journal of Computer Vision (IJCV), 42(3), 145–175.

  • Parikh, D., & Grauman, K. (2011a). Interactively building a discriminative vocabulary of nameable attributes. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

  • Parikh, D., & Grauman, K. (2011b). Relative attributes. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV).

  • Parikh, D., & Grauman, K. (2013). Implied feedback: Learning nuances of user behavior in image search. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV).

  • Parkash, A., & Parikh, D. (2012). Attributes for classifier feedback. In: Proceedings of the European Conference on Computer Vision (ECCV).

  • Patterson, G., Xu, C., Su, H., & Hays, J. (2014). The SUN attribute database: Beyond Categories for deeper scene understanding. International Journal of Computer Vision (IJCV), 108(1–2), 59–81.

  • Platt, J. C. (1999). Probabilistic outputs for support vector machines and comparisons to regularized likelihood methods. Advances in Large Margin Classifiers, 10(3), 61–74.

  • Rasiwasia, N., Moreno, P., & Vasconcelos, N. (2007). Bridging the gap: Query by semantic example. IEEE Transactions on Multimedia, 9(5), 923–938.

  • Rastegari, M., Parikh, D., Diba, A. & Farhadi, A. (2013). Multi-attribute queries: To merge or not to merge? In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

  • Rui, Y., Huang, T., Ortega, M., & Mehrotra, S. (1998). Relevance feedback: A power tool for interactive content-based image retrieval. IEEE Transactions on Circuits and Video Technology, 8(5), 644–655.

  • Saleh, B., Farhadi, A. & Elgammal, A. (2013). Object-centric anomaly detection by attribute-based reasoning. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

  • Scheirer, W., Kumar, N., Belhumeur, P. & Boult, T. (2012). Multi-attribute spaces: Calibration for attribute fusion and similarity search. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

  • Siddiquie, B., Feris, R. & Davis, L. (2011). Image ranking and retrieval based on multi-attribute queries. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

  • Smith, J., Naphade, M. & Natsev, A. (2003). Multimedia semantic indexing using model vectors. In: Proceedings of the International Conference on Multimedia and Expo (ICME).

  • Sznitman, R., & Jedynak, B. (2010). Active testing for face detection and localization. IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 32(10), 1914–1920.

  • Tieu, K. & Viola, P. (2000). Boosting image retrieval. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

  • Tong, S. & Chang, E. (2001). Support vector machine active learning for image retrieval. In: Proceedings of the ACM International Conference on Multimedia.

  • Vijayanarasimhan, S. & Kapoor, A. (2010). Visual recognition and detection under bounded computational resources. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

  • Wah, C. & Belongie, S. (2013). Attribute-based detection of unfamiliar classes with humans in the loop. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

  • Wah, C., Van Horn, G., Branson, S., Maji, S., Perona, P. & Belongie, S. (2014). Similarity comparisons for interactive fine-grained categorization. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

  • Wang, X., Liu, K. & Tang, X. (2011). Query-specific visual semantic spaces for web image re-ranking. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

  • Wang, Y. & Mori, G. (2010). A discriminative latent model of object classes and attributes. In: Proceedings of the European Conference on Computer Vision (ECCV).

  • Zavesky, E. & Chang, S. F. (2008). Cu-Zero: Embracing the Frontier of interactive visual search for informed users. In: Proceedings of the ACM International Conference on Multimedia Information Retrieval.

  • Zhang, C., & Chen, T. (2002). An active learning framework for content based information retrieval. IEEE Transactions on Multimedia, 4(2), 260–268.

  • Zhou, X., & Huang, T. (2003). Relevance feedback in image retrieval: A comprehensive review. ACM Multimedia Systems, 8(6), 536–544.


Acknowledgments

We thank the anonymous reviewers for their helpful feedback and suggestions. This research was supported by ONR YIP Award N00014-12-1-0754 (K.G. and A.K.) and a Google Faculty Research Award (D.P.).

Corresponding author

Correspondence to Adriana Kovashka.

Additional information

Communicated by M. Hebert.

The work was done while Adriana Kovashka was at The University of Texas at Austin.

Appendix

See Table 4.

Table 4 Ordering of classes for the attributes in the Shoes dataset. A score of 10 denotes that the class has the attribute the most, and 1 that it has it the least
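Class-level scores like those in Table 4 supply exactly the ordered pairs the attribute rankers are trained on: any two images from differently scored classes yield one comparative constraint. A minimal sketch, where the class names, scores, and image labels are illustrative assumptions rather than the dataset's actual values:

```python
from itertools import combinations

# Illustrative class-level scores for one attribute ("sporty"):
# 10 = class has the attribute most, 1 = least (as in Table 4).
sporty = {"athletic-shoes": 10, "sneakers": 8, "clogs": 3, "high-heels": 1}

# Database images with their class labels (hypothetical).
images = [("img0", "clogs"), ("img1", "athletic-shoes"),
          ("img2", "high-heels"), ("img3", "sneakers")]

# Emit ordered pairs (i, j) meaning "image i shows more of the
# attribute than image j", skipping same-score (equality) pairs.
pairs = [(i, j) if sporty[ci] > sporty[cj] else (j, i)
         for (i, ci), (j, cj) in combinations(images, 2)
         if sporty[ci] != sporty[cj]]
```

Pairs from same-score classes would instead become equality constraints, which Note 1 explains are used for relevance-scoring parameters rather than ranker training.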

About this article

Cite this article

Kovashka, A., Parikh, D. & Grauman, K. WhittleSearch: Interactive Image Search with Relative Attribute Feedback. Int J Comput Vis 115, 185–210 (2015). https://doi.org/10.1007/s11263-015-0814-0
