
Joint Learning of Semantic and Latent Attributes

  • Peixi Peng
  • Yonghong Tian
  • Tao Xiang
  • Yaowei Wang
  • Tiejun Huang
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9908)

Abstract

As mid-level semantic properties shared across object categories, attributes have been studied extensively. Recent approaches have attempted joint modelling of multiple attributes together with class labels so as to exploit their correlations for better attribute prediction and object recognition. However, they often ignore the fact that there exist shared properties other than nameable/semantic attributes, which we call latent attributes. These latent attributes can be further divided into discriminative and non-discriminative parts, depending on whether they contribute to an object recognition task. We argue that learning the latent attributes jointly with user-defined semantic attributes not only leads to a better representation for object recognition but also helps with semantic attribute prediction. A novel dictionary learning model is proposed which decomposes the dictionary space into three parts corresponding to semantic, latent discriminative and latent background attributes respectively. An efficient algorithm is then formulated to solve the resultant optimization problem. Extensive experiments show that the proposed attribute learning method produces state-of-the-art results on both attribute prediction and attribute-based person re-identification.
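
To make the idea of a partitioned dictionary concrete, the following is a minimal sketch and not the model proposed in the paper, whose objective is not given in this abstract. It assumes a plain ridge-regularized reconstruction objective in which only the semantic block of codes is pulled toward the annotated attribute labels. The function name `fit_partitioned_dictionary`, the weights `alpha`, `lam` and `mu`, and the block sizes are all hypothetical. The class-label term that would separate the latent discriminative block from the latent background block is omitted, so in this sketch those two blocks differ only by position.

```python
import numpy as np


def fit_partitioned_dictionary(X, A, k_disc, k_bg, alpha=1.0, lam=0.1,
                               mu=0.1, n_iter=15, seed=0):
    """Alternating least squares for an illustrative (assumed) objective:

        ||X - D Y||^2 + alpha * ||Y_sem - A||^2 + lam * ||Y||^2 + mu * ||D||^2

    X: (d, n) features, A: (k_sem, n) semantic attribute labels.
    The code matrix Y is split row-wise into semantic, latent
    discriminative and latent background blocks.
    """
    rng = np.random.default_rng(seed)
    d, n = X.shape
    k_sem = A.shape[0]
    k = k_sem + k_disc + k_bg

    D = rng.standard_normal((d, k))
    # Target for the codes: attribute labels on semantic rows, zero elsewhere.
    T = np.zeros((k, n))
    T[:k_sem] = A
    # Per-row ridge weight: semantic rows also carry the alpha term.
    w = np.full(k, lam)
    w[:k_sem] += alpha

    history = []
    for _ in range(n_iter):
        # Closed-form code update with D fixed.
        Y = np.linalg.solve(D.T @ D + np.diag(w), D.T @ X + alpha * T)
        # Closed-form dictionary update with Y fixed.
        D = np.linalg.solve(Y @ Y.T + mu * np.eye(k), Y @ X.T).T
        loss = (np.sum((X - D @ Y) ** 2)
                + alpha * np.sum((Y[:k_sem] - A) ** 2)
                + lam * np.sum(Y ** 2)
                + mu * np.sum(D ** 2))
        history.append(loss)

    parts = {
        "semantic": slice(0, k_sem),
        "latent_discriminative": slice(k_sem, k_sem + k_disc),
        "latent_background": slice(k_sem + k_disc, k),
    }
    return D, Y, parts, history


# Toy data: 50 samples, 20-dim features, 5 binary semantic attributes.
rng = np.random.default_rng(1)
X = rng.standard_normal((20, 50))
A = rng.integers(0, 2, size=(5, 50)).astype(float)
D, Y, parts, history = fit_partitioned_dictionary(X, A, k_disc=4, k_bg=3)
semantic_codes = Y[parts["semantic"]]  # rows aligned with the named attributes
```

Because each update exactly minimizes the assumed objective over one block of variables, the recorded loss does not increase from one iteration to the next. A model closer to the abstract's description would add a supervised term on the latent discriminative block.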

Keywords

Attribute learning, Latent attributes, Person re-identification, Zero-shot learning, Dictionary learning

Notes

Acknowledgements

This work is partially supported by grants from the National Basic Research Program of China under grant 2015CB351806, the National Natural Science Foundation of China under contract No. 61390515, No. 61425025 and No. 61471042, Beijing Municipal Commission of Science and Technology under contract No. Z151100000915070 and the National Key Technology and Development Program of China under contract No. 2014BAK10B02. These authors are also supported by Microsoft Research Asia Collaborative Research Program 2016, project ID FY16-RES-THEME-034.

Copyright information

© Springer International Publishing AG 2016

Authors and Affiliations

  1. National Engineering Laboratory for Video Technology, Peking University, Beijing, China
  2. School of Electronic Engineering and Computer Science, Queen Mary University of London, London, UK
  3. Department of Electronic Engineering, Beijing Institute of Technology, Beijing, China
  4. Cooperative Medianet Innovation Center, Beijing, China
