Human-Centric Visual Analysis: Tasks and Progress

  • Liang Lin (email author)
  • Dongyu Zhang
  • Ping Luo
  • Wangmeng Zuo
Chapter

Abstract

Research on human-centric visual analysis has made considerable progress in recent years. In this chapter, we briefly review the main tasks of human-centric visual analysis, including face detection, facial landmark localization, pedestrian detection, human segmentation, and clothes parsing.

Copyright information

© Springer Nature Singapore Pte Ltd. 2020

Authors and Affiliations

  • Liang Lin (1), email author
  • Dongyu Zhang (1)
  • Ping Luo (2)
  • Wangmeng Zuo (3)

  1. School of Data and Computer Science, Sun Yat-sen University, Guangzhou, China
  2. School of Information Engineering, The Chinese University of Hong Kong, Hong Kong, Hong Kong
  3. School of Computer Science, Harbin Institute of Technology, Harbin, China
