
Deep Learning Face Attributes for Detection and Alignment

  • Chen Change Loy
  • Ping Luo
  • Chen Huang
Chapter
Part of the Advances in Computer Vision and Pattern Recognition book series (ACVPR)

Abstract

Describable face attributes are labels that can be given to a face image to describe its characteristics. Examples of face attributes include gender, age, ethnicity, face shape, and nose size. Predicting face attributes in the wild is challenging due to complex face variations. This chapter aims to provide an in-depth presentation of recent progress and the current state-of-the-art approaches to solving some of the fundamental challenges in face attribute recognition, particularly from the angle of deep learning. We highlight effective techniques for training deep convolutional networks to predict face attributes in the wild and for addressing the problem of imbalanced attribute distributions. In addition, we discuss how face attributes can in turn serve as rich context to facilitate accurate face detection and face alignment. The chapter ends by posing an open question for the face attribute recognition challenge arising from emerging and future applications.

Keywords

Face Image, Convolutional Neural Network, Minority Class, Class Imbalance, Face Attribute
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  1. Department of Information Engineering, The Chinese University of Hong Kong, Shatin, Hong Kong
  2. Robotics Institute, Carnegie Mellon University, Pittsburgh, United States
