What Visual Attributes Characterize an Object Class?

Fu, Jianlong; Wang, Jinqiao; Wang, Xin-Jing; Rui, Yong; Lu, Hanqing

doi:10.1007/978-3-319-16865-4_16

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 9003)

Included in the following conference series:

Asian Conference on Computer Vision

Abstract

Visual attribute-based learning has had a large impact on many computer vision problems in recent years. Despite its usefulness, most existing work focuses only on predicting either the presence or the strength of pre-defined attributes. In this paper, we discuss how to automatically learn visual attributes that characterize an object class. Starting from images of an object class collected from the Web, we first mine visual prototypes of attributes (i.e., a clean intermediate representation for learning attributes) by clustering multi-scale salient areas in noisy Web images with Gaussian mixtures. Second, we propose a joint optimization model that performs attribute learning together with feature selection. Because sparse approximation is adopted for feature selection during the joint optimization, each learned attribute tends to present a more representative visual property, e.g., a stripe pattern (when texture features are selected) or a yellow color (when color features are selected). Finally, to quantify the confidence of attributes and suppress the noisy attributes learned from the Web, we propose a ranking-based method to refine the learned attributes. Our approach ensures that the learned visual attributes are visually recognizable and representative, in contrast to manually constructed attributes [1], which include properties that are difficult to visualize, e.g., “smelly” and “smart.” We evaluated our approach on two benchmark datasets and compared it with state-of-the-art approaches in two respects: the quality of the learned visual attributes and their effectiveness in object categorization.
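As a rough illustration of how sparse feature selection can tie an attribute to one kind of feature, the sketch below ranks feature dimensions by the row norms of a weight matrix. It assumes the sparsity takes the form of an l2,1-norm penalty, as in the feature selection work listed in the references. The abstract does not state the exact regularizer, so this is an assumption and not the authors' formulation. The matrix `W`, its values, and the function names are hypothetical.

```python
import math

def row_norms(W):
    """Euclidean norm of each row of W (one row per feature dimension)."""
    return [math.sqrt(sum(w * w for w in row)) for row in W]

def l21_norm(W):
    """l2,1-norm: the sum of the Euclidean norms of the rows of W."""
    return sum(row_norms(W))

def select_features(W, k):
    """Indices of the k rows with the largest norm, strongest first."""
    norms = row_norms(W)
    order = sorted(range(len(W)), key=lambda i: norms[i], reverse=True)
    return order[:k]

# Hypothetical weights: 4 feature dimensions (rows) x 2 attributes (columns).
# Penalizing the l2,1-norm pushes whole rows toward zero, so a feature
# dimension is either kept for all attributes or dropped for all of them.
W = [
    [3.0, 4.0],  # row norm 5
    [0.0, 0.0],  # row norm 0 (feature dropped)
    [0.0, 1.0],  # row norm 1
    [6.0, 8.0],  # row norm 10
]

print(l21_norm(W))            # 16.0
print(select_features(W, 2))  # [3, 0]
```

If the surviving rows all belong to, say, texture descriptors, the resulting attribute would read as a texture property such as a stripe pattern.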

Jianlong Fu: This work was conducted while Jianlong Fu was a research intern at Microsoft Research.

Notes

1. Visualness [10] is a quantitative measure of how likely it is that a concept can be visualized with example images.

References

1. Osherson, D.N., Stern, J., Wilkie, O., Stob, M., Smith, E.E.: Default probability. Cogn. Sci. 15, 251–269 (1991)
2. Yu, F.X., Cao, L.L., Feris, R.S., Smith, J.R., Chang, S.F.: Designing category-level attributes for discriminative visual recognition. In: CVPR (2013)
3. Kumar, N., Berg, A.C., Belhumeur, P.N., Nayar, S.K.: Attribute and simile classifiers for face verification. In: ICCV (2009)
4. Siddiquie, B., Feris, R.S., Davis, L.S.: Image ranking and retrieval based on multi-attribute queries. In: CVPR, pp. 801–808 (2011)
5. Yu, F.X., Ji, R., Tsai, M.H., Ye, G., Chang, S.F.: Weak attributes for large-scale image retrieval. In: CVPR, pp. 2949–2956 (2012)
6. Wang, Y., Mori, G.: A discriminative latent model of object classes and attributes. In: Daniilidis, K., Maragos, P., Paragios, N. (eds.) ECCV 2010, Part V. LNCS, vol. 6315, pp. 155–168. Springer, Heidelberg (2010)
7. Branson, S., Wah, C., Schroff, F., Babenko, B., Welinder, P., Perona, P., Belongie, S.: Visual recognition with humans in the loop. In: Daniilidis, K., Maragos, P., Paragios, N. (eds.) ECCV 2010, Part IV. LNCS, vol. 6314, pp. 438–451. Springer, Heidelberg (2010)
8. Wang, G., Forsyth, D.A.: Joint learning of visual attributes, object classes and visual saliency. In: ICCV, pp. 537–544 (2009)
9. Parikh, D., Grauman, K.: Relative attributes. In: ICCV, pp. 503–510 (2011)
10. Xu, Z., Wang, X.J., Chen, C.W.: Mining visualness. In: ICME, pp. 1–6 (2013)
11. Wang, X.J., Zhang, L., Ma, W.Y.: Duplicate-search-based image annotation using web-scale data. Proc. IEEE 100, 2705–2721 (2012)
12. Zoran, D., Weiss, Y.: Natural images, Gaussian mixtures and dead leaves. In: NIPS, pp. 1745–1753 (2012)
13. Lampert, C.H., Nickisch, H., Harmeling, S.: Learning to detect unseen object classes by between-class attribute transfer. In: CVPR, pp. 951–958 (2009)
14. Farhadi, A., Endres, I., Hoiem, D., Forsyth, D.: Describing objects by their attributes. In: CVPR (2009)
15. Berg, T.L., Berg, A.C., Shih, J.: Automatic attribute discovery and characterization from noisy web data. In: Daniilidis, K., Maragos, P., Paragios, N. (eds.) ECCV 2010, Part I. LNCS, vol. 6311, pp. 663–676. Springer, Heidelberg (2010)
16. Li, L.-J., Su, H., Xing, E.P., Fei-Fei, L.: Object bank: a high-level image representation for scene classification and semantic feature sparsification. In: NIPS (2010)
17. Torresani, L., Szummer, M., Fitzgibbon, A.: Efficient object category recognition using classemes. In: Daniilidis, K., Maragos, P., Paragios, N. (eds.) ECCV 2010, Part I. LNCS, vol. 6311, pp. 776–789. Springer, Heidelberg (2010)
18. Ferrari, V., Zisserman, A.: Learning visual attributes. In: NIPS (2007)
19. Yang, Y., Shah, M.: Complex events detection using data-driven concepts. In: Fitzgibbon, A., Lazebnik, S., Perona, P., Sato, Y., Schmid, C. (eds.) ECCV 2012, Part III. LNCS, vol. 7574, pp. 722–735. Springer, Heidelberg (2012)
20. Liu, J., Kuipers, B., Savarese, S.: Recognizing human actions by attributes. In: CVPR, pp. 3337–3344 (2011)
21. Harel, J., Koch, C., Perona, P.: Graph-based visual saliency. In: NIPS, pp. 545–552 (2006)
22. Dempster, A.P., Laird, N.M., Rubin, D.B.: Maximum likelihood from incomplete data via the EM algorithm. J. Roy. Stat. Soc. Ser. B 39, 1–38 (1977)
23. Yang, Y., Shen, H.T., Ma, Z., Huang, Z., Zhou, X.: \(\ell_{2,1}\)-norm regularized discriminative feature selection for unsupervised learning. In: IJCAI, pp. 1589–1594 (2011)
24. Nie, F., Huang, H., Cai, X., Ding, C.H.Q.: Efficient and robust feature selection via joint \(\ell_{2,1}\)-norms minimization. In: NIPS, pp. 1813–1821 (2010)
25. Luxburg, U.: A tutorial on spectral clustering. Stat. Comput. 17, 395–416 (2007)
26. Golub, G.H., van der Vorst, H.A.: Eigenvalue computation in the 20th century. J. Comput. Appl. Math. 123, 35–65 (2000)
27. Lazebnik, S., Schmid, C., Ponce, J.: A discriminative framework for texture and object recognition using local image features. In: Ponce, J., Hebert, M., Schmid, C., Zisserman, A. (eds.) Toward Category-Level Object Recognition. LNCS, vol. 4170, pp. 423–442. Springer, Heidelberg (2006)
28. Bosch, A., Zisserman, A., Munoz, X.: Representing shape with a spatial pyramid kernel. In: CIVR, pp. 401–408 (2007)
29. Shechtman, E., Irani, M.: Matching local self-similarities across images and videos. In: CVPR (2007)
30. Blei, D.M., Ng, A.Y., Jordan, M.I.: Latent Dirichlet allocation. J. Mach. Learn. Res. 3, 993–1022 (2003)
31. Everingham, M., Van Gool, L., Williams, C.K.I., Winn, J., Zisserman, A.: The PASCAL Visual Object Classes Challenge 2007 (VOC2007) Results (2007). http://www.pascal-network.org/challenges/VOC/voc2007/workshop/index.html

Acknowledgement

This work was supported by the 863 Program (2014AA015104) and the National Natural Science Foundation of China (61273034 and 61332016).

Author information

Authors and Affiliations

National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, No. 95, Zhongguancun East Road, Beijing, 100190, China
Jianlong Fu, Jinqiao Wang & Hanqing Lu
Microsoft Research, No. 5, Dan Ling Street, Haidian District, Beijing, 10080, China
Xin-Jing Wang & Yong Rui

Corresponding author

Correspondence to Jianlong Fu.

Editor information

Editors and Affiliations

Technische Universität München, Garching, Bayern, Germany
Daniel Cremers
University of Adelaide, Adelaide, South Australia, Australia
Ian Reid
Keio University, Yokohama, Kanagawa, Japan
Hideo Saito
University of California at Merced, Merced, California, USA
Ming-Hsuan Yang

About this paper

Cite this paper

Fu, J., Wang, J., Wang, XJ., Rui, Y., Lu, H. (2015). What Visual Attributes Characterize an Object Class? In: Cremers, D., Reid, I., Saito, H., Yang, MH. (eds) Computer Vision – ACCV 2014. ACCV 2014. Lecture Notes in Computer Science, vol 9003. Springer, Cham. https://doi.org/10.1007/978-3-319-16865-4_16

DOI: https://doi.org/10.1007/978-3-319-16865-4_16
Published: 16 April 2015
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-16864-7
Online ISBN: 978-3-319-16865-4
eBook Packages: Computer Science, Computer Science (R0)
