Attributes for Image Retrieval

Chapter

Abstract

Image retrieval is a computer vision application that people encounter in their everyday lives. To enable accurate retrieval results, a human user needs to be able to communicate in a rich and noiseless way with the retrieval system. We propose semantic visual attributes as a communication channel for search because they are commonly used by humans to describe the world around them. We first propose a new feedback interaction where users can directly comment on how individual properties of retrieved content should be adjusted to more closely match the desired visual content. We then show how to ensure this interaction is as informative as possible, by having the vision system ask those questions that will most increase its certainty over what content is relevant. To ensure that attribute-based statements from the user are not misinterpreted by the system, we model the unique ways in which users employ attribute terms, and develop personalized attribute models. We discover clusters among users in terms of how they use a given attribute term, and consequently discover the distinct “shades of meaning” of these attributes. Our work is a significant step in the direction of bridging the semantic gap between high-level user intent and low-level visual features. We discuss extensions to further increase the utility of attributes for practical search applications.

References

  1. 1.
    Berg, T.L., Berg, A.C., Shih, J.: Automatic attribute discovery and characterization from noisy web data. In: European Conference on Computer Vision (ECCV) (2010)Google Scholar
  2. 2.
    Branson, S., Wah, C., Schroff, F., Babenko, B., Welinder, P., Perona, P., Belongie, S.: Visual recognition with humans in the loop. In: European Conference on Computer Vision (ECCV) (2010)Google Scholar
  3. 3.
    Chen, L., Zhang, Q., Li, B.: Predicting multiple attributes via relative multi-task learning. In: Conference on Computer Vision and Pattern Recognition (CVPR) (2014)Google Scholar
  4. 4.
    Chen, Q., Huang, J., Feris, R., Brown, L.M., Dong, J., Yan, S.: Deep domain adaptation for describing people based on fine-grained clothing attributes. In: Conference on Computer Vision and Pattern Recognition (CVPR) (2015)Google Scholar
  5. 5.
    Cox, I., Miller, M., Minka, T., Papathomas, T., Yianilos, P.: The bayesian image retrieval system, PicHunter: theory, implementation, and psychophysical experiments. IEEE Trans. Image Process. 9(1), 20–37 (2000)CrossRefGoogle Scholar
  6. 6.
    Curran, W., Moore, T., Kulesza, T., Wong, W.K., Todorovic, S., Stumpf, S., White, R., Burnett, M.: Towards recognizing “cool”: can end users help computer vision recognize subjective attributes or objects in images? In: Intelligent User Interfaces (IUI) (2012)Google Scholar
  7. 7.
    Douze, M., Ramisa, A., Schmid, C.: Combining attributes and fisher vectors for efficient image retrieval. In: Conference on Computer Vision and Pattern Recognition (CVPR) (2011)Google Scholar
  8. 8.
    Duan, K., Parikh, D., Crandall, D., Grauman, K.: Discovering localized attributes for fine-grained recognition. In: Conference on Computer Vision and Pattern Recognition (CVPR) (2012)Google Scholar
  9. 9.
    Eitz, M., Hildebrand, K., Boubekeur, T., Alexa, M.: Sketch-based image retrieval: benchmark and bag-of-features descriptors. IEEE Trans. Vis. Comput. Graph. 17(11), 1624–1636 (2011)CrossRefGoogle Scholar
  10. 10.
    Endres, I., Farhadi, A., Hoiem, D., Forsyth, D.A.: The benefits and challenges of collecting richer object annotations. In: Conference on Computer Vision and Pattern Recognition (CVPR) Workshops (2010)Google Scholar
  11. 11.
    Escorcia, V., Niebles, J.C., Ghanem, B.: On the relationship between visual attributes and convolutional networks. In: Conference on Computer Vision and Pattern Recognition (CVPR) (2015)Google Scholar
  12. 12.
    Everett, C.: Linguistic relativity: evidence across languages and cognitive domains. In: Mouton De Gruyter (2013)Google Scholar
  13. 13.
    Farhadi, A., Endres, I., Hoiem, D., Forsyth, D.A.: Describing objects by their attributes. In: Conference on Computer Vision and Pattern Recognition (CVPR) (2009)Google Scholar
  14. 14.
    Ferecatu, M., Geman, D.: Interactive search for image categories by mental matching. In: International Conference on Computer Vision (ICCV) (2007)Google Scholar
  15. 15.
    Ferrari, V., Zisserman, A.: Learning visual attributes. In: Conference on Neural Information Processing Systems (NIPS) (2007)Google Scholar
  16. 16.
    Fogarty, J., Tan, D.S., Kapoor, A., Winder, S.: Cueflik: interactive concept learning in image search. In: Conference on Human Factors in Computing Systems (CHI) (2008)Google Scholar
  17. 17.
    Geng, B., Yang, L., Xu, C., Hua, X.S.: Ranking model adaptation for domain-specific search. IEEE Trans. Knowle. Data Eng. 24(4), 745–758 (2012)CrossRefGoogle Scholar
  18. 18.
    Heim, E., Berger, M., Seversky, L., Hauskrecht, M.: Active perceptual similarity modeling with auxiliary information. In: arXiv preprint arXiv:1511.02254 (2015)
  19. 19.
    Hofmann, T.: Probabilistic latent semantic analysis. In: Uncertainty in Artificial Intelligence (UAI) (1999)Google Scholar
  20. 20.
    Joachims, T.: Optimizing search engines using click through data. In: International Conference on Knowledge Discovery and Data Mining (KDD) (2002)Google Scholar
  21. 21.
    Kekalainen, J., Jarvelin, K.: Cumulated gain-based evaluation of IR techniques. ACM Trans. Inf. Syst. 20(4), 422–446 (2002)CrossRefGoogle Scholar
  22. 22.
    Kovashka, A., Grauman, K.: Attribute adaptation for personalized image search. In: International Conference on Computer Vision (ICCV) (2013)Google Scholar
  23. 23.
    Kovashka, A., Grauman, K.: Attribute pivots for guiding relevance feedback in image search. In: International Conference on Computer Vision (ICCV) (2013)Google Scholar
  24. 24.
    Kovashka, A., Grauman, K.: Discovering attribute shades of meaning with the crowd. Int. J. Comput. Vis. 114, 56–73 (2015)CrossRefGoogle Scholar
  25. 25.
    Kovashka, A., Vijayanarasimhan, S., Grauman, K.: Actively selecting annotations among objects and attributes. In: International Conference on Computer Vision (ICCV) (2011)Google Scholar
  26. 26.
    Kovashka, A., Parikh, D., Grauman, K.: WhittleSearch: image search with relative attribute feedback. In: Conference on Computer Vision and Pattern Recognition (CVPR) (2012)Google Scholar
  27. 27.
    Kovashka, A., Parikh, D., Grauman, K.: WhittleSearch: interactive image search with relative attribute feedback. Int. J. Comput. Vis. 115, 185–210 (2015)MathSciNetCrossRefGoogle Scholar
  28. 28.
    Kumar, N., Belhumeur, P.N., Nayar, S.K.: FaceTracer: a search engine for large collections of images with faces. In: European Conference on Computer Vision (ECCV) (2008)Google Scholar
  29. 29.
    Kumar, N., Berg, A.C., Belhumeur, P.N., Nayar, S.K.: Attribute and simile classifiers for face verification. In: International Conference on Computer Vision (ICCV) (2009)Google Scholar
  30. 30.
    Kumar, N., Berg, A.C., Belhumeur, P.N., Nayar, S.K.: Describable visual attributes for face verification and image search. IEEE Trans. Pattern Anal. Mach. Intell. 33(10), 1962–1977 (2011)CrossRefGoogle Scholar
  31. 31.
    Kurita, T., Kato, T.: Learning of personal visual impression for image database systems. In: International Conference on Document Analysis and Recognition (ICDAR) (1993)Google Scholar
  32. 32.
    Lampert, C., Nickisch, H., Harmeling, S.: Learning to detect unseen object classes by between-class attribute transfer. In: Conference on Computer Vision and Pattern Recognition (CVPR) (2009)Google Scholar
  33. 33.
    Li, B., Chang, E., Li, C.S.: Learning image query concepts via intelligent sampling. In: International Conference on Multimedia and Expo (ICME) (2001)Google Scholar
  34. 34.
    Li, S., Shan, S., Chen, X.: Relative forest for attribute prediction. In: Asian Conference on Computer Vision (ACCV) (2013)Google Scholar
  35. 35.
    Liu, S., Kovashka, A.: Adapting attributes using features similar across domains. In: Winter Conference on Applications of Computer Vision (WACV) (2016)Google Scholar
  36. 36.
    Loeff, N., Alm, C.O., Forsyth, D.A.: Discriminating image senses by clustering with multimodal features. In: Association for Computational Linguistics (ACL) (2006)Google Scholar
  37. 37.
    Ma, W.Y., Manjunath, B.S.: Netra: a toolbox for navigating large image databases. Multimedia Syst. 7(3), 184–198 (1999)CrossRefGoogle Scholar
  38. 38.
    Mahajan, D., Sellamanickam, S., Nair, V.: A joint learning framework for attribute models and object descriptions. In: International Conference on Computer Vision (ICCV) (2011)Google Scholar
  39. 39.
    Mensink, T., Verbeek, J., Csurka, G.: Learning structured prediction models for interactive image labeling. In: Conference on Computer Vision and Pattern Recognition (CVPR) (2011)Google Scholar
  40. 40.
    Oliva, A., Torralba, A.: Modeling the shape of the scene: a holistic representation of the spatial envelope. Int. J. Comput. Vis. 42, 145–175 (2001)CrossRefMATHGoogle Scholar
  41. 41.
    Ordonez, V., Jagadeesh, V., Di, W., Bhardwaj, A., Piramuthu, R.: Furniture-geek: understanding fine-grained furniture attributes from freely associated text and tags. In: Winter Conference on Applications of Computer Vision (WACV) (2014)Google Scholar
  42. 42.
    Ozeki, M., Okatani, T.: Understanding convolutional neural networks in terms of category-level attributes. In: Asian Conference on Computer Vision (ACCV) (2014)Google Scholar
  43. 43.
    Parikh, D., Grauman, K.: Interactively building a discriminative vocabulary of nameable attributes. In: Conference on Computer Vision and Pattern Recognition (CVPR) (2011)Google Scholar
  44. 44.
    Parikh, D., Grauman, K.: Relative attributes. In: International Conference on Computer Vision (ICCV) (2011)Google Scholar
  45. 45.
    Parikh, D., Grauman, K.: Implied feedback: learning nuances of user behavior in image search. In: International Conference on Computer Vision (ICCV) (2013)Google Scholar
  46. 46.
    Patterson, G., Hays, J.: Sun attribute database: discovering, annotating, and recognizing scene attributes. In: Conference on Computer Vision and Pattern Recognition (CVPR) (2012)Google Scholar
  47. 47.
    Platt, J.C.: Probabilistic output for support vector machines and comparisons to regularized likelihood methods. In: Advances in Large Margin Classifiers (1999)Google Scholar
  48. 48.
    Rasiwasia, N., Moreno, P.J., Vasconcelos, N.: Bridging the gap: query by semantic example. IEEE Trans. Multimedia 9(5), 923–938 (2007)CrossRefGoogle Scholar
  49. 49.
    Rastegari, M., Farhadi, A., Forsyth, D.A.: Attribute discovery via predictable discriminative binary codes. In: European Conference on Computer Vision (ECCV) (2012)Google Scholar
  50. 50.
    Rastegari, M., Parikh, D., Diba, A., Farhadi, A.: Multi-attribute queries: to merge or not to merge? In: Conference on Computer Vision and Pattern Recognition (CVPR) (2013)Google Scholar
  51. 51.
    Roy, N., McCallum, A.: Toward optimal active learning through sampling estimation of error reduction. In: International Conference on Machine Learning (ICML) (2011)Google Scholar
  52. 52.
    Rui, Y., Huang, T.S., Ortega, M., Mehrotra, S.: Relevance feedback: a power tool for interactive content-based image retrieval. IEEE Trans. Circ. Syst. Video Technol. (1998)Google Scholar
  53. 53.
    Salakhutdinov, R., Mnih, A.: Bayesian probabilistic matrix factorization using Markov chain Monte Carlo. In: International Conference on Machine Learning (ICML) (2008)Google Scholar
  54. 54.
    Sandeep, R.N., Verma, Y., Jawahar, C.: Relative parts: distinctive parts for learning relative attributes. In: Conference on Computer Vision and Pattern Recognition (CVPR) (2014)Google Scholar
  55. 55.
    Scheirer, W., Kumar, N., Belhumeur, P.N., Boult, T.E.: Multi-attribute spaces: calibration for attribute fusion and similarity search. In: Conference on Computer Vision and Pattern Recognition (CVPR) (2012)Google Scholar
  56. 56.
    Schwartz, G., Nishino, K.: Automatically discovering local visual material attributes. In: Conference on Computer Vision and Pattern Recognition (CVPR) (2015)Google Scholar
  57. 57.
    Shankar, S., Garg, V.K., Cipolla, R.: Deep-carving: discovering visual attributes by carving deep neural nets. In: Conference on Computer Vision and Pattern Recognition (CVPR) (2015)Google Scholar
  58. 58.
    Shao, J., Kang, K., Loy, C.C., Wang, X.: Deeply learned attributes for crowded scene understanding. In: Conference on Computer Vision and Pattern Recognition (CVPR) (2015)Google Scholar
  59. 59.
    Sharmanska, V., Quadrianto, N., Lampert, C.: Augmented attribute representations. In: European Conference on Computer Vision (ECCV) (2012)Google Scholar
  60. 60.
    Siddiquie, B., Feris, R., Davis, L.: Image ranking and retrieval based on multi-attribute queries. In: Conference on Computer Vision and Pattern Recognition (CVPR) (2011)Google Scholar
  61. 61.
    Tamuz, O., Liu, C., Belongie, S., Shamir, O., Kalai, A.T.: Adaptively learning the crowd kernel. In: International Conference on Machine Learning (ICML) (2011)Google Scholar
  62. 62.
    Tieu, K., Viola, P.: Boosting image retrieval. In: Conference on Computer Vision and Pattern Recognition (CVPR) (2000)Google Scholar
  63. 63.
    Tong, S., Chang, E.: Support vector machine active learning for image retrieval. In: ACM Multimedia (2001)Google Scholar
  64. 64.
    Tunkelang, D.: Faceted search. In: Synthesis Lectures on Information Concepts, Retrieval, and Services (2009)Google Scholar
  65. 65.
    Vondrick, C., Khosla, A., Malisiewicz, T., Torralba, A.: Hoggles: visualizing object detection features. In: International Conference on Computer Vision (ICCV) (2013)Google Scholar
  66. 66.
    Wang, X., Ji, Q.: A unified probabilistic approach modeling relationships between attributes and objects. In: International Conference on Computer Vision (ICCV) (2013)Google Scholar
  67. 67.
    Xiao, F., Lee, Y.J.: Discovering the spatial extent of relative attributes. In: International Conference on Computer Vision (ICCV) (2015)Google Scholar
  68. 68.
    Yang, J., Yan, R., Hauptmann, A.G.: Adapting SVM classifiers to data with shifted distributions. In: IEEE International Conference on Data Mining (ICDM) Workshops (2007)Google Scholar
  69. 69.
    Yu, F., Cao, L., Feris, R., Smith, J., Chang, S.F.: Designing category-level attributes for discriminative visual recognition. In: Conference on Computer Vision and Pattern Recognition (CVPR) (2013)Google Scholar
  70. 70.
    Zhou, X.S., Huang, T.S.: Relevance feedback in image retrieval: a comprehensive review. In: Multimedia Systems (2003)Google Scholar

Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  1. 1.University of PittsburghPittsburghUSA
  2. 2.The University of Texas at AustinAustinUSA

Personalised recommendations