How frontal is a face? Quantitative estimation of face pose based on CNN and geometric projection

Gao, Fei; Li, Shuai; Lu, Shufang

doi:10.1007/s00521-020-05167-0

Original Article
Published: 05 July 2020

Volume 33, pages 3035–3051, (2021)

Neural Computing and Applications

Abstract

Face pose estimation is widely used in human–computer interaction applications; however, it remains challenging because of variations in illumination, background, face orientation, and appearance visibility. This paper proposes a novel coarse-to-fine method for quantitative face pose estimation based on convolutional neural networks (CNN) and geometric projection. In the coarse stage, a CNN classifies the input image into a specific category and yields a corresponding weight. In the fine stage, 3D face landmarks are projected onto the three coordinate planes (x–y, x–z, and y–z) of a 3D coordinate system to estimate the offset angles of the face in the roll, yaw, and pitch directions. The final face pose score is obtained by combining the results of the two stages. Experiments on standard datasets show that the proposed method achieves better results than several competitive algorithms, demonstrating its effectiveness.
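
The fine stage can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the formulas, so the choice of landmarks, the pairing of each plane with an angle (x–y with roll, x–z with yaw, y–z with pitch), the y-up coordinate convention, and the score combination in `frontal_score` are all assumptions made here for illustration.

```python
import math


def offset_angles(left_eye, right_eye, chin, forehead):
    """Return (roll, yaw, pitch) in degrees from 3D landmarks (x, y, z).

    Assumed convention: x points right, y points up, z points toward
    the camera, so a frontal upright face gives (0, 0, 0).
    """
    # Inter-ocular vector: lies along +x for a frontal, upright face.
    ex, ey, ez = (r - l for l, r in zip(left_eye, right_eye))
    # Vertical face axis (chin to forehead): lies along +y when frontal.
    _, vy, vz = (f - c for c, f in zip(chin, forehead))

    roll = math.degrees(math.atan2(ey, ex))   # projection onto x-y plane
    yaw = math.degrees(math.atan2(ez, ex))    # projection onto x-z plane
    pitch = math.degrees(math.atan2(vz, vy))  # projection onto y-z plane
    return roll, yaw, pitch


def frontal_score(coarse_weight, roll, yaw, pitch, max_angle=90.0):
    """Hypothetical combination of the two stages (not from the paper).

    The coarse weight scales a fine term that is 1 for a perfectly
    frontal face and falls linearly with the mean absolute offset.
    """
    fine = 1.0 - (abs(roll) + abs(yaw) + abs(pitch)) / (3.0 * max_angle)
    return coarse_weight * max(0.0, fine)


if __name__ == "__main__":
    # Face turned 30 degrees about the vertical axis (assumed landmarks).
    c, s = math.cos(math.radians(30)), math.sin(math.radians(30))
    angles = offset_angles((-c, 0.0, -s), (c, 0.0, s),
                           (0.0, -1.0, 0.0), (0.0, 1.0, 0.0))
    print(angles, frontal_score(0.9, *angles))
```

Each angle is read from a two-component projection of a landmark vector, which is why `atan2` on the two in-plane components suffices; a real system would need to handle image coordinates with y pointing down and noisy landmark estimates.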

Acknowledgements

This work is supported by the National Natural Science Foundation of China under Grant No. 61976193, the Zhejiang Provincial Science and Technology Planning Key Project of China under Grant No. 2018C01064, and the Zhejiang Provincial Natural Science Foundation of China under Grant No. LY19F020027.

Author information

Authors and Affiliations

Laboratory of Graphics and Image Processing, College of Computer Science and Technology, Zhejiang University of Technology, Hangzhou, 310023, Zhejiang Province, China
Fei Gao, Shuai Li & Shufang Lu

Corresponding author

Correspondence to Fei Gao.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

About this article

Cite this article

Gao, F., Li, S. & Lu, S. How frontal is a face? Quantitative estimation of face pose based on CNN and geometric projection. Neural Comput & Applic 33, 3035–3051 (2021). https://doi.org/10.1007/s00521-020-05167-0

Received: 18 January 2020
Accepted: 24 June 2020
Published: 05 July 2020
Issue Date: April 2021
DOI: https://doi.org/10.1007/s00521-020-05167-0
