What Visual Attributes Characterize an Object Class?

  • Jianlong Fu (email author)
  • Jinqiao Wang
  • Xin-Jing Wang
  • Yong Rui
  • Hanqing Lu
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9003)


Visual attribute-based learning has had a large impact on many computer vision problems in recent years. Despite its usefulness, most existing work focuses only on predicting either the presence or the strength of pre-defined attributes. In this paper, we discuss how to automatically learn the visual attributes that characterize an object class. Starting from images of an object class collected from the Web, we first mine visual prototypes of attributes (i.e., a clean intermediate representation for learning attributes) by clustering multi-scale salient areas of the noisy Web images with Gaussian mixtures. Second, we propose a joint optimization model that performs attribute learning together with feature selection. Because sparse approximation is adopted for feature selection during the joint optimization, each learned attribute tends to present a more representative visual property, e.g., a stripe pattern (when texture features are selected) or a yellow color (when color features are selected). Finally, to quantify the confidence of attributes and suppress the noisy attributes learned from the Web, we propose a ranking-based method to refine the learned attributes. Our approach ensures that the learned visual attributes are visually recognizable and representative, in contrast to manually constructed attributes [1], which include properties that are difficult to visualize, e.g., “smelly” and “smart.” We evaluate our approach on two benchmark datasets and compare it with state-of-the-art approaches in two respects: the quality of the learned visual attributes and their effectiveness in object categorization.
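The feature-selection step can be illustrated with a small sketch. The abstract does not state which sparse penalty is used; the code below assumes a row-sparse \(\ell_{2,1}\)-norm penalty of the kind described in [23, 24], under which whole feature rows of a weight matrix shrink toward zero. The matrix `W`, the feature names, and the helper functions are hypothetical and serve only to show how row norms could be read as feature importance; this is not the authors' implementation.

```python
import math


def row_norms(W):
    """Return the Euclidean (l2) norm of each row of W (one row per feature)."""
    return [math.sqrt(sum(w * w for w in row)) for row in W]


def l21_norm(W):
    """l2,1 norm of W: the sum of the row-wise l2 norms."""
    return sum(row_norms(W))


def select_features(W, names, k):
    """Keep the k features whose weight rows have the largest l2 norm."""
    norms = row_norms(W)
    order = sorted(range(len(W)), key=lambda i: norms[i], reverse=True)
    return [names[i] for i in order[:k]]


# Hypothetical weights: rows are feature types, columns are two learned attributes.
W = [
    [3.0, 4.0],  # texture: large row norm, strongly used
    [0.0, 0.0],  # shape: row driven to zero by the sparsity penalty
    [0.6, 0.8],  # color: small but nonzero row norm
]
names = ["texture", "shape", "color"]

print(l21_norm(W))
print(select_features(W, names, 2))
```

In this toy case the texture row dominates, so an attribute learned from these weights would be read as a texture property such as a stripe pattern, which matches the intuition given in the abstract.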


Keywords (added by machine, not by the authors): Gaussian Mixture Model; Object Class; Object Categorization; Rank Score; Visual Attribute



This work was supported by the 863 Program (2014AA015104) and the National Natural Science Foundation of China (61273034 and 61332016).


  1. Osherson, D.N., Stern, J., Wilkie, O., Stob, M., Smith, E.E.: Default probability. Cogn. Sci. 15, 251–269 (1991)
  2. Yu, F.X., Cao, L.L., Feris, R.S., Smith, J.R., Chang, S.F.: Designing category-level attributes for discriminative visual recognition. In: CVPR (2013)
  3. Kumar, N., Berg, A.C., Belhumeur, P.N., Nayar, S.K.: Attribute and simile classifiers for face verification. In: ICCV (2009)
  4. Siddiquie, B., Feris, R.S., Davis, L.S.: Image ranking and retrieval based on multi-attribute queries. In: CVPR, pp. 801–808 (2011)
  5. Yu, F.X., Ji, R., Tsai, M.H., Ye, G., Chang, S.F.: Weak attributes for large-scale image retrieval. In: CVPR, pp. 2949–2956 (2012)
  6. Wang, Y., Mori, G.: A discriminative latent model of object classes and attributes. In: Daniilidis, K., Maragos, P., Paragios, N. (eds.) ECCV 2010, Part V. LNCS, vol. 6315, pp. 155–168. Springer, Heidelberg (2010)
  7. Branson, S., Wah, C., Schroff, F., Babenko, B., Welinder, P., Perona, P., Belongie, S.: Visual recognition with humans in the loop. In: Daniilidis, K., Maragos, P., Paragios, N. (eds.) ECCV 2010, Part IV. LNCS, vol. 6314, pp. 438–451. Springer, Heidelberg (2010)
  8. Wang, G., Forsyth, D.A.: Joint learning of visual attributes, object classes and visual saliency. In: ICCV, pp. 537–544 (2009)
  9. Parikh, D., Grauman, K.: Relative attributes. In: ICCV, pp. 503–510 (2011)
  10. Xu, Z., Wang, X.J., Chen, C.W.: Mining visualness. In: ICME, pp. 1–6 (2013)
  11. Wang, X.J., Zhang, L., Ma, W.Y.: Duplicate-search-based image annotation using web-scale data. Proc. IEEE 100, 2705–2721 (2012)
  12. Zoran, D., Weiss, Y.: Natural images, Gaussian mixtures and dead leaves. In: NIPS, pp. 1745–1753 (2012)
  13. Lampert, C.H., Nickisch, H., Harmeling, S.: Learning to detect unseen object classes by between-class attribute transfer. In: CVPR, pp. 951–958 (2009)
  14. Farhadi, A., Endres, I., Hoiem, D., Forsyth, D.: Describing objects by their attributes. In: CVPR (2009)
  15. Berg, T.L., Berg, A.C., Shih, J.: Automatic attribute discovery and characterization from noisy web data. In: Daniilidis, K., Maragos, P., Paragios, N. (eds.) ECCV 2010, Part I. LNCS, vol. 6311, pp. 663–676. Springer, Heidelberg (2010)
  16. Li, L.-J., Su, H., Xing, E.P., Fei-Fei, L.: Object bank: a high-level image representation for scene classification and semantic feature sparsification. In: NIPS (2010)
  17. Torresani, L., Szummer, M., Fitzgibbon, A.: Efficient object category recognition using classemes. In: Daniilidis, K., Maragos, P., Paragios, N. (eds.) ECCV 2010, Part I. LNCS, vol. 6311, pp. 776–789. Springer, Heidelberg (2010)
  18. Ferrari, V., Zisserman, A.: Learning visual attributes. In: NIPS (2007)
  19. Yang, Y., Shah, M.: Complex events detection using data-driven concepts. In: Fitzgibbon, A., Lazebnik, S., Perona, P., Sato, Y., Schmid, C. (eds.) ECCV 2012, Part III. LNCS, vol. 7574, pp. 722–735. Springer, Heidelberg (2012)
  20. Liu, J., Kuipers, B., Savarese, S.: Recognizing human actions by attributes. In: CVPR, pp. 3337–3344 (2011)
  21. Harel, J., Koch, C., Perona, P.: Graph-based visual saliency. In: NIPS, pp. 545–552 (2006)
  22. Dempster, A.P., Laird, N.M., Rubin, D.B.: Maximum likelihood from incomplete data via the EM algorithm. J. Roy. Stat. Soc. Ser. B 39, 1–38 (1977)
  23. Yang, Y., Shen, H.T., Ma, Z., Huang, Z., Zhou, X.: \(\ell_{2,1}\)-norm regularized discriminative feature selection for unsupervised learning. In: IJCAI, pp. 1589–1594 (2011)
  24. Nie, F., Huang, H., Cai, X., Ding, C.H.Q.: Efficient and robust feature selection via joint \(\ell_{2,1}\)-norms minimization. In: NIPS, pp. 1813–1821 (2010)
  25. Luxburg, U.: A tutorial on spectral clustering. Stat. Comput. 17, 395–416 (2007)
  26. Golub, G.H., van der Vorst, H.A.: Eigenvalue computation in the 20th century. J. Comput. Appl. Math. 123, 35–65 (2000)
  27. Lazebnik, S., Schmid, C., Ponce, J.: A discriminative framework for texture and object recognition using local image features. In: Ponce, J., Hebert, M., Schmid, C., Zisserman, A. (eds.) Toward Category-Level Object Recognition. LNCS, vol. 4170, pp. 423–442. Springer, Heidelberg (2006)
  28. Bosch, A., Zisserman, A., Munoz, X.: Representing shape with a spatial pyramid kernel. In: CIVR, pp. 401–408 (2007)
  29. Shechtman, E., Irani, M.: Matching local self-similarities across images and videos. In: CVPR (2007)
  30. Blei, D.M., Ng, A.Y., Jordan, M.I.: Latent Dirichlet allocation. J. Mach. Learn. Res. 3, 993–1022 (2003)
  31. Everingham, M., Van Gool, L., Williams, C.K.I., Winn, J., Zisserman, A.: The PASCAL Visual Object Classes Challenge 2007 (VOC2007) Results (2007).

Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Jianlong Fu (1), email author
  • Jinqiao Wang (1)
  • Xin-Jing Wang (2)
  • Yong Rui (2)
  • Hanqing Lu (1)

  1. National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing, China
  2. Microsoft Research, Beijing, China
