What Visual Attributes Characterize an Object Class?

  • Conference paper
Computer Vision – ACCV 2014 (ACCV 2014)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 9003)

Abstract

Visual attribute-based learning has had a large impact on many computer vision problems in recent years. Despite its usefulness, most existing work focuses only on predicting either the presence or the strength of pre-defined attributes. In this paper, we discuss how to automatically learn the visual attributes that characterize an object class. Starting from images of an object class collected from the Web, we first mine visual prototypes of attributes (i.e., a clean intermediate representation for learning attributes) by clustering multi-scale salient areas of the noisy Web images with Gaussian mixtures. Second, we propose a joint optimization model that performs attribute learning together with feature selection. Because sparse approximation is adopted for feature selection during the joint optimization, each learned attribute tends to present a more representative visual property, e.g., a stripe pattern (when texture features are selected) or a yellow color (when color features are selected). Finally, to quantify the confidence of the attributes and suppress noisy attributes learned from the Web, we propose a ranking-based method to refine the learned attributes. Our approach ensures that the learned visual attributes are visually recognizable and representative, in contrast to manually constructed attributes [1], which include properties that are difficult to visualize, e.g., “smelly” and “smart.” We evaluated our approach on two benchmark datasets and compared it with state-of-the-art approaches in two aspects: the quality of the learned visual attributes and their effectiveness in object categorization.
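To make the feature selection step more concrete, here is a minimal sketch of how a row-sparse weight matrix can indicate which feature type an attribute relies on. The abstract does not say which sparsity penalty the joint model uses; the reference list includes ℓ2,1-norm feature selection work [23, 24], so the ℓ2,1 norm is assumed here. The function names `l21_norm` and `rank_features`, the feature-group names, and the weight values are all hypothetical and chosen only for illustration.

```python
import math


def l21_norm(W):
    """Return the l2,1 norm of W: the sum of the Euclidean norms of its rows."""
    return sum(math.sqrt(sum(w * w for w in row)) for row in W)


def rank_features(W, names):
    """Pair each feature name with the Euclidean norm of its weight row,
    sorted from the largest norm to the smallest."""
    scores = [math.sqrt(sum(w * w for w in row)) for row in W]
    order = sorted(range(len(W)), key=lambda i: scores[i], reverse=True)
    return [(names[i], scores[i]) for i in order]


# Hypothetical weights (not from the paper): one row per feature group,
# one column per learned attribute. An all-zero row means the feature
# group was discarded by the sparse penalty.
W = [
    [5.0, 12.0],  # texture
    [0.0, 0.0],   # shape
    [3.0, 4.0],   # color
]
names = ["texture", "shape", "color"]

print(l21_norm(W))
print(rank_features(W, names))
```

Under this assumption, penalizing the ℓ2,1 norm pushes whole rows toward zero, so the surviving rows identify the feature types that carry the attribute, for example texture for a stripe pattern.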

This work was conducted while Jianlong Fu was a research intern at Microsoft Research.

Notes

  1. Visualness [10] is a quantitative measure of how likely it is that a concept can be visualized with example images.

References

  1. Osherson, D.N., Stern, J., Wilkie, O., Stob, M., Smith, E.E.: Default probability. Cogn. Sci. 15, 251–269 (1991)

  2. Yu, F.X., Cao, L.L., Feris, R.S., Smith, J.R., Chang, S.F.: Designing category-level attributes for discriminative visual recognition. In: CVPR (2013)

  3. Kumar, N., Berg, A.C., Belhumeur, P.N., Nayar, S.K.: Attribute and simile classifiers for face verification. In: ICCV (2009)

  4. Siddiquie, B., Feris, R.S., Davis, L.S.: Image ranking and retrieval based on multi-attribute queries. In: CVPR, pp. 801–808 (2011)

  5. Yu, F.X., Ji, R., Tsai, M.H., Ye, G., Chang, S.F.: Weak attributes for large-scale image retrieval. In: CVPR, pp. 2949–2956 (2012)

  6. Wang, Y., Mori, G.: A discriminative latent model of object classes and attributes. In: Daniilidis, K., Maragos, P., Paragios, N. (eds.) ECCV 2010, Part V. LNCS, vol. 6315, pp. 155–168. Springer, Heidelberg (2010)

  7. Branson, S., Wah, C., Schroff, F., Babenko, B., Welinder, P., Perona, P., Belongie, S.: Visual recognition with humans in the loop. In: Daniilidis, K., Maragos, P., Paragios, N. (eds.) ECCV 2010, Part IV. LNCS, vol. 6314, pp. 438–451. Springer, Heidelberg (2010)

  8. Wang, G., Forsyth, D.A.: Joint learning of visual attributes, object classes and visual saliency. In: ICCV, pp. 537–544 (2009)

  9. Parikh, D., Grauman, K.: Relative attributes. In: ICCV, pp. 503–510 (2011)

  10. Xu, Z., Wang, X.J., Chen, C.W.: Mining visualness. In: ICME, pp. 1–6 (2013)

  11. Wang, X.J., Zhang, L., Ma, W.Y.: Duplicate-search-based image annotation using web-scale data. Proc. IEEE 100, 2705–2721 (2012)

  12. Zoran, D., Weiss, Y.: Natural images, Gaussian mixtures and dead leaves. In: NIPS, pp. 1745–1753 (2012)

  13. Lampert, C.H., Nickisch, H., Harmeling, S.: Learning to detect unseen object classes by between-class attribute transfer. In: CVPR, pp. 951–958 (2009)

  14. Farhadi, A., Endres, I., Hoiem, D., Forsyth, D.: Describing objects by their attributes. In: CVPR (2009)

  15. Berg, T.L., Berg, A.C., Shih, J.: Automatic attribute discovery and characterization from noisy web data. In: Daniilidis, K., Maragos, P., Paragios, N. (eds.) ECCV 2010, Part I. LNCS, vol. 6311, pp. 663–676. Springer, Heidelberg (2010)

  16. Li, L.-J., Su, H., Xing, E.P., Fei-Fei, L.: Object bank: a high-level image representation for scene classification and semantic feature sparsification. In: NIPS (2010)

  17. Torresani, L., Szummer, M., Fitzgibbon, A.: Efficient object category recognition using classemes. In: Daniilidis, K., Maragos, P., Paragios, N. (eds.) ECCV 2010, Part I. LNCS, vol. 6311, pp. 776–789. Springer, Heidelberg (2010)

  18. Ferrari, V., Zisserman, A.: Learning visual attributes. In: NIPS (2007)

  19. Yang, Y., Shah, M.: Complex events detection using data-driven concepts. In: Fitzgibbon, A., Lazebnik, S., Perona, P., Sato, Y., Schmid, C. (eds.) ECCV 2012, Part III. LNCS, vol. 7574, pp. 722–735. Springer, Heidelberg (2012)

  20. Liu, J., Kuipers, B., Savarese, S.: Recognizing human actions by attributes. In: CVPR, pp. 3337–3344 (2011)

  21. Harel, J., Koch, C., Perona, P.: Graph-based visual saliency. In: NIPS, pp. 545–552 (2006)

  22. Dempster, A.P., Laird, N.M., Rubin, D.B.: Maximum likelihood from incomplete data via the EM algorithm. J. Roy. Stat. Soc. Ser. B 39, 1–38 (1977)

  23. Yang, Y., Shen, H.T., Ma, Z., Huang, Z., Zhou, X.: \(\ell_{2,1}\)-norm regularized discriminative feature selection for unsupervised learning. In: IJCAI, pp. 1589–1594 (2011)

  24. Nie, F., Huang, H., Cai, X., Ding, C.H.Q.: Efficient and robust feature selection via joint \(\ell_{2,1}\)-norms minimization. In: NIPS, pp. 1813–1821 (2010)

  25. Luxburg, U.: A tutorial on spectral clustering. Stat. Comput. 17, 395–416 (2007)

  26. Golub, G.H., van der Vorst, H.A.: Eigenvalue computation in the 20th century. J. Comput. Appl. Math. 123, 35–65 (2000)

  27. Lazebnik, S., Schmid, C., Ponce, J.: A discriminative framework for texture and object recognition using local image features. In: Ponce, J., Hebert, M., Schmid, C., Zisserman, A. (eds.) Toward Category-Level Object Recognition. LNCS, vol. 4170, pp. 423–442. Springer, Heidelberg (2006)

  28. Bosch, A., Zisserman, A., Munoz, X.: Representing shape with a spatial pyramid kernel. In: CIVR, pp. 401–408 (2007)

  29. Shechtman, E., Irani, M.: Matching local self-similarities across images and videos. In: CVPR (2007)

  30. Blei, D.M., Ng, A.Y., Jordan, M.I.: Latent Dirichlet allocation. J. Mach. Learn. Res. 3, 993–1022 (2003)

  31. Everingham, M., Van Gool, L., Williams, C.K.I., Winn, J., Zisserman, A.: The PASCAL Visual Object Classes Challenge 2007 (VOC2007) Results (2007). http://www.pascal-network.org/challenges/VOC/voc2007/workshop/index.html

Acknowledgement

This work was supported by the 863 Program (2014AA015104) and the National Natural Science Foundation of China (61273034 and 61332016).

Author information

Corresponding author

Correspondence to Jianlong Fu.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Fu, J., Wang, J., Wang, XJ., Rui, Y., Lu, H. (2015). What Visual Attributes Characterize an Object Class? In: Cremers, D., Reid, I., Saito, H., Yang, MH. (eds) Computer Vision – ACCV 2014. ACCV 2014. Lecture Notes in Computer Science, vol 9003. Springer, Cham. https://doi.org/10.1007/978-3-319-16865-4_16

  • DOI: https://doi.org/10.1007/978-3-319-16865-4_16

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-16864-7

  • Online ISBN: 978-3-319-16865-4

  • eBook Packages: Computer Science, Computer Science (R0)
