How frontal is a face? Quantitative estimation of face pose based on CNN and geometric projection

  • Original Article
Neural Computing and Applications

Abstract

Face pose estimation is widely used in human–computer interaction applications; however, it remains challenging because of variations in illumination, background, face orientation, and appearance visibility. This paper proposes a novel coarse-to-fine method for quantitative face pose estimation based on convolutional neural networks (CNN) and geometric projection. In the coarse classification stage, a CNN classifies the input image into a specific category and yields a corresponding weight. In the fine estimation stage, 3D face landmarks are projected onto the three planes (xy, xz, and yz) of the 3D coordinate system to obtain the offset angles of the face in the roll, yaw, and pitch directions. The final face pose score is obtained by combining the results of the two stages. Experiments on standard datasets show that the proposed method achieves better results than some competitive algorithms, demonstrating its effectiveness.
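The fine-estimation idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, because the preview does not give their landmark set, sign conventions, or scoring formula. The four landmarks, the function names `pose_angles` and `frontal_score`, the `max_angle` parameter, and the rule for combining the coarse weight with the angles are all hypothetical assumptions made for illustration.

```python
import numpy as np


def pose_angles(left_eye, right_eye, brow_mid, chin):
    """Return (roll, yaw, pitch) in degrees from four 3D landmarks.

    Assumption: x points right, y points down, z points in depth,
    so a frontal face has its eye line along x and its
    brow-to-chin line along y.
    """
    eye = np.asarray(right_eye, dtype=float) - np.asarray(left_eye, dtype=float)
    vert = np.asarray(chin, dtype=float) - np.asarray(brow_mid, dtype=float)

    # Projection onto the xy plane: tilt of the eye line from the x axis.
    roll = np.degrees(np.arctan2(eye[1], eye[0]))
    # Projection onto the xz plane: depth slope of the eye line.
    yaw = np.degrees(np.arctan2(eye[2], eye[0]))
    # Projection onto the yz plane: depth slope of the vertical face line.
    pitch = np.degrees(np.arctan2(vert[2], vert[1]))
    return roll, yaw, pitch


def frontal_score(coarse_weight, angles, max_angle=90.0):
    """Combine a coarse-stage weight with the fine-stage angles.

    Hypothetical rule: scale the weight by how small the mean
    absolute offset angle is relative to max_angle.
    """
    mean_offset = np.mean(np.abs(angles))
    fine = np.clip(1.0 - mean_offset / max_angle, 0.0, 1.0)
    return float(coarse_weight * fine)


if __name__ == "__main__":
    # A face turned 30 degrees in yaw under the conventions above.
    t = np.radians(30.0)
    angles = pose_angles(
        left_eye=(0.0, 0.0, 0.0),
        right_eye=(np.cos(t), 0.0, np.sin(t)),
        brow_mid=(0.5, 0.0, 0.0),
        chin=(0.5, 1.0, 0.0),
    )
    print(angles)
    print(frontal_score(0.8, angles))
```

Under these assumptions a perfectly frontal face yields zero offsets on all three planes, so the score reduces to the coarse-stage weight alone.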

Acknowledgements

This work was supported by the National Natural Science Foundation of China under Grant No. 61976193, the Zhejiang Provincial Science and Technology Planning Key Project of China under Grant No. 2018C01064, and the Zhejiang Provincial Natural Science Foundation of China under Grant No. LY19F020027.

Author information

Corresponding author

Correspondence to Fei Gao.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Gao, F., Li, S. & Lu, S. How frontal is a face? Quantitative estimation of face pose based on CNN and geometric projection. Neural Comput & Applic 33, 3035–3051 (2021). https://doi.org/10.1007/s00521-020-05167-0

  • DOI: https://doi.org/10.1007/s00521-020-05167-0
