
Multiclass classification based on a deep convolutional network for head pose estimation

Published in Frontiers of Information Technology & Electronic Engineering

Abstract

Head pose estimation is an important and challenging task in computer vision. In this paper we propose a novel method to estimate head pose from 2D face images based on a deep convolutional neural network (DCNN). We design a simple and effective method to roughly crop the face from the input image while preserving the individual's relative facial-feature ratio, so that the cropping can be applied across a wide range of poses. Two convolutional neural networks are then trained as head pose classifiers and compared with each other. The simpler one has six layers; it performs well on seven yaw poses but is somewhat unsatisfactory when two pitch poses are mixed in. The other has eight layers and a larger input layer, and performs better with more poses and more training samples. Before training, two data-preparation strategies, shifting and zooming, are applied to generate the training samples. Finally, the feature-extraction filters are optimized jointly with the weights of the classification component during training, so as to minimize the classification error. Our method has been evaluated on the CAS-PEAL-R1, CMU PIE, and CUBIC FacePix databases, and it outperforms state-of-the-art methods for head pose estimation.
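
As a rough illustration of the pipeline outlined in the abstract (crop the face, augment the training samples by shifting and zooming, then train a small convolutional classifier end to end so that the feature-extraction filters and the classifier weights are learned jointly), a minimal PyTorch sketch follows. It is not the authors' architecture: the 32x32 grayscale input size, the layer sizes, the nine pose classes, and the augmentation parameters are illustrative assumptions.

# Minimal sketch of a multiclass CNN head-pose classifier (illustrative only;
# input size, layer sizes, and class count are assumptions, not the paper's).
import torch
import torch.nn as nn
from torchvision import transforms

class HeadPoseCNN(nn.Module):
    def __init__(self, num_classes=9):                   # e.g., 7 yaw + 2 pitch poses
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=5), nn.ReLU(),   # 32x32 -> 28x28
            nn.MaxPool2d(2),                              # 28x28 -> 14x14
            nn.Conv2d(16, 32, kernel_size=5), nn.ReLU(),  # 14x14 -> 10x10
            nn.MaxPool2d(2),                              # 10x10 -> 5x5
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(32 * 5 * 5, 120), nn.ReLU(),
            nn.Linear(120, num_classes),                  # one score per pose class
        )

    def forward(self, x):
        return self.classifier(self.features(x))

# Shift and zoom augmentation applied to the cropped face images before training.
augment = transforms.Compose([
    transforms.RandomAffine(degrees=0, translate=(0.1, 0.1), scale=(0.9, 1.1)),
    transforms.ToTensor(),
])

# Filters and classifier weights are optimized jointly against the classification error.
model = HeadPoseCNN()
loss_fn = nn.CrossEntropyLoss()
logits = model(torch.randn(4, 1, 32, 32))                # -> shape (4, 9)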



Author information

Corresponding author

Correspondence to Meng-long Yang.

Additional information

ORCID: Ying CAI, http://orcid.org/0000-0002-5096-6175


About this article

Cite this article

Cai, Y., Yang, M.-l. & Li, J. Multiclass classification based on a deep convolutional network for head pose estimation. Frontiers Inf Technol Electronic Eng 16, 930–939 (2015). https://doi.org/10.1631/FITEE.1500125

