ECCV 2014: Computer Vision – ECCV 2014 pp 328-344
Probabilistic Temporal Head Pose Estimation Using a Hierarchical Graphical Model
Abstract
We present a hierarchical graphical model that probabilistically estimates head pose angles from real-world videos by leveraging temporal pose information across video frames. The proposed model employs a number of complementary facial features and performs fusion at the feature level, the probabilistic classifier level, and the temporal level. Extensive experiments analyze pose estimation performance for different combinations of features, different levels of the proposed hierarchical model, and different face databases. Experiments show that the proposed head pose model improves on the current state of the art for the unconstrained McGillFaces [10] and the constrained CMU Multi-PIE [14] databases, increasing pose classification accuracy over the current top-performing method by 19.38% and 19.89%, respectively.
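As a rough illustration of what classifier-level and temporal-level fusion can look like, the sketch below combines per-feature pose posteriors by a normalized product and then smooths the fused per-frame estimates with a standard hidden Markov model forward pass. This is not the model proposed in the paper, whose details are not given in the abstract. The function names, the product rule, the three pose bins, and the transition matrix are all assumptions chosen for illustration, and the per-frame posteriors are treated as likelihood proxies under a uniform class prior.

```python
import numpy as np

def fuse_classifiers(posteriors):
    """Classifier-level fusion (assumed rule): normalized product of
    per-feature posteriors over the same set of pose bins."""
    fused = np.prod(np.asarray(posteriors, dtype=float), axis=0)
    return fused / fused.sum()

def temporal_filter(frame_posteriors, transition, prior):
    """Temporal-level fusion (assumed rule): HMM forward filtering.

    transition[i, j] is P(pose_t = j | pose_{t-1} = i).
    Returns one filtered belief vector per frame.
    """
    transition = np.asarray(transition, dtype=float)
    belief = np.asarray(prior, dtype=float)
    filtered = []
    for obs in frame_posteriors:
        predicted = transition.T @ belief          # propagate belief one frame
        belief = predicted * np.asarray(obs, dtype=float)  # weight by evidence
        belief = belief / belief.sum()             # renormalize
        filtered.append(belief)
    return np.array(filtered)

# Hypothetical example: three pose bins (left, frontal, right).
transition = np.array([[0.8, 0.2, 0.0],
                       [0.1, 0.8, 0.1],
                       [0.0, 0.2, 0.8]])
prior = np.full(3, 1.0 / 3.0)
frames = np.array([[0.7, 0.2, 0.1],
                   [0.3, 0.3, 0.4],   # noisy frame: raw argmax is bin 2
                   [0.7, 0.2, 0.1]])
filtered = temporal_filter(frames, transition, prior)
labels = filtered.argmax(axis=1)
```

In this toy sequence the second frame on its own would be assigned to the third bin, but the filtered belief keeps it in the first bin because the transition matrix penalizes abrupt pose changes between adjacent frames.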
Keywords
Face hierarchical probabilistic video graphical temporal head pose
References
- 1. Berg, A.C., Berg, T.L., Malik, J.: Shape matching and object recognition using low distortion correspondences. In: Proc. International Conference on Computer Vision and Pattern Recognition, CVPR (2005)
- 2. Aghajanian, J., Prince, S.: Face pose estimation in uncontrolled environments. In: Cavallaro, A., Prince, S., Alexander, D.C. (eds.) BMVC, pp. 1–11. British Machine Vision Association (2009)
- 3. Balasubramanian, V.N., Ye, J., Panchanathan, S.: Biased manifold embedding: A framework for person-independent head pose estimation. In: CVPR. IEEE Computer Society (2007)
- 4. BenAbdelkader, C.: Robust head pose estimation using supervised manifold learning. In: Daniilidis, K., Maragos, P., Paragios, N. (eds.) ECCV 2010, Part VI. LNCS, vol. 6316, pp. 518–531. Springer, Heidelberg (2010)
- 5. Berg, A.C., Malik, J.: Geometric blur for template matching. In: Proceedings of the 2001 IEEE Computer Society Conf. on Computer Vision and Pattern Recognition, CVPR 2001, vol. 1, pp. 607–614 (2001)
- 6. Beymer, D.: Face recognition under varying pose. In: Proceedings of the 1994 IEEE Computer Society Conference on Computer Vision and Pattern Recognition CVPR 1994, pp. 756–761 (June 1994)
- 7. Blanz, V., Grother, P., Phillips, P., Vetter, T.: Face recognition based on frontal views generated from non-frontal images. In: IEEE Computer Society Conference on Computer Vision and Pattern Recognition, CVPR 2005, vol. 2, pp. 454–461 (June 2005)
- 8. Breiman, L.: Random forests. Machine Learning 45(1), 5–32 (2001)
- 9. Burghouts, G., Geusebroek, J.: Performance evaluation of local color invariants. Computer Vision and Image Understanding (CVIU) 113, 48–62 (2009)
- 10. Demirkus, M., Clark, J.J., Arbel, T.: Robust semi-automatic head pose labeling for real-world face video sequences. Multimedia Tools and Applications, 1–29 (2013)
- 11. Demirkus, M., Oreshkin, B.N., Clark, J.J., Arbel, T.: Spatial and probabilistic codebook template based head pose estimation from unconstrained environments. In: Macq, B., Schelkens, P. (eds.) ICIP, pp. 573–576. IEEE (2011)
- 12. Demirkus, M., Precup, D., Clark, J.J., Arbel, T.: Soft biometric trait classification from real-world face videos conditioned on head pose estimation. In: CVPR Workshops, pp. 130–137. IEEE (2012)
- 13. Demirkus, M., Precup, D., Clark, J.J., Arbel, T.: Multi-layer temporal graphical model for head pose estimation in real-world videos. In: ICIP. IEEE (2014)
- 14. Gross, R., Matthews, I., Cohn, J., Kanade, T., Baker, S.: Multi-pie. Image Vision Comput. 28(5), 807–813 (2010)
- 15. Hansen, N., Müller, S.D., Koumoutsakos, P.: Reducing the time complexity of the derandomized evolution strategy with covariance matrix adaptation (cma-es). Evol. Comput. 11(1), 1–18 (2003)
- 16. Hassner, T.: Viewing real-world faces in 3D. In: The IEEE International Conference on Computer Vision, ICCV (December 2013)
- 17. Hu, C., Xiao, J., Matthews, I., Baker, S., Cohn, J.F., Kanade, T.: Fitting a single active appearance model simultaneously to multiple images. In: Hoppe, A., Barman, S., Ellis, T. (eds.) BMVC, pp. 1–10. BMVA Press (2004)
- 18. Hu, N., Huang, W., Ranganath, S.: Head pose estimation by non-linear embedding and mapping. In: ICIP (2), pp. 342–345. IEEE (2005)
- 19. Hua, G., Yang, M., Learned-Miller, E., Ma, Y., Turk, M., Kriegman, D., Huang, T.: Special section on real-world face recognition. IEEE Trans. Pattern Anal. Mach. Intell. 33(10), 1921–1924 (2011)
- 20. Huang, D., Storer, M., De la Torre, F., Bischof, H.: Supervised local subspace learning for continuous head pose estimation. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 2921–2928 (2011)
- 21. Kim, J., Grauman, K.: Boundary preserving dense local regions. In: Proc. International Conference on Computer Vision and Pattern Recognition, CVPR (2011)
- 22. Kumar, N., Berg, A., Belhumeur, P., Nayar, S.: Describable visual attributes for face verification and image search. IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI) 33(10), 1962–1977 (2011)
- 23. Li, H., Hua, G., Lin, Z., Brandt, J., Yang, J.: Probabilistic elastic matching for pose variant face verification. In: CVPR, pp. 3499–3506 (2013)
- 24. Lowe, D.G.: Distinctive image features from scale-invariant keypoints. International Journal of Computer Vision (IJCV) 60(2), 91–110 (2004)
- 25. Morency, L., Rahimi, A., Checka, N., Darrell, T.: Fast stereo-based head tracking for interactive environments. In: Proceedings of the Fifth IEEE International Conference on Automatic Face and Gesture Recognition, pp. 390–395 (May 2002)
- 26. Murphy-Chutorian, E., Trivedi, M.M.: Head pose estimation in computer vision: A survey. IEEE Trans. Pattern Anal. Mach. Intell. 31(4), 607–626 (2009)
- 27. Oka, K., Sato, Y., Nakanishi, Y., Koike, H.: Head pose estimation system based on particle filtering with adaptive diffusion control. In: MVA, pp. 586–589 (2005)
- 28. Orozco, J., Gong, S., Xiang, T.: Head pose classification in crowded scenes. In: BMVC. British Machine Vision Association (2009)
- 29. Pearl, J.: Probabilistic reasoning in intelligent systems: networks of plausible inference. Morgan Kaufmann (1988)
- 30. Raytchev, B., Yoda, I., Sakaue, K.: Head pose estimation by nonlinear manifold learning. In: Proceedings of the 17th International Conference on Pattern Recognition, ICPR 2004, vol. 4, pp. 462–466 (August 2004)
- 31. Van de Sande, K.E.A., Gevers, T.: Evaluating color descriptors for object and scene recognition. IEEE PAMI 32(9), 1582–1596 (2010)
- 32. Sherrah, J., Gong, S.: Fusion of perceptual cues for robust tracking of head pose and position. Pattern Recognition 34(8), 1565–1572 (2001)
- 33. Toews, M., Arbel, T.: Detection, localization, and sex classification of faces from arbitrary viewpoints and under occlusion. IEEE Transactions on Pattern Analysis and Machine Intelligence 31(9), 1567–1581 (2009)
- 34. Torki, M., Elgammal, A.: Regression from local features for viewpoint and pose estimation. In: 2011 IEEE International Conference on Computer Vision (ICCV), pp. 2603–2610 (November 2011)
- 35. Tosato, D., Farenzena, M., Spera, M., Murino, V., Cristani, M.: Multi-class classification on riemannian manifolds for video surveillance. In: Daniilidis, K., Maragos, P., Paragios, N. (eds.) ECCV 2010, Part II. LNCS, vol. 6312, pp. 378–391. Springer, Heidelberg (2010)
- 36. Tuytelaars, T., Mikolajczyk, K.: Local invariant feature detectors: a survey. Foundations and Trends in Computer Graphics and Vision 3(3), pp. 177–280 (2008)
- 37. Tuytelaars, T., Mikolajczyk, K.: A survey on local invariant features. Tutorial at ECCV (2006)
- 38. Wolf, L., Hassner, T., Maoz, I.: Face recognition in unconstrained videos with matched background similarity. In: Proc. IEEE Conf. Comput. Vision Pattern Recognition (2011)
- 39. Wu, J., Trivedi, M.M.: A two-stage head pose estimation framework and evaluation. Pattern Recogn. 41(3), 1138–1158 (2008)
- 40. Xiong, X., De la Torre, F.: Supervised descent method and its application to face alignment. In: Proc. International Conference on Computer Vision and Pattern Recognition, CVPR (2013)
- 41. Yi, D., Lei, Z., Li, S.Z.: Towards pose robust face recognition. In: CVPR, pp. 3539–3545 (2013)
- 42. Zhao, G., Chen, L., Song, J., Chen, G.: Large head movement tracking using sift-based registration. In: Lienhart, R., Prasad, A.R., Hanjalic, A., Choi, S., Bailey, B.P., Sebe, N. (eds.) ACM Multimedia, pp. 807–810. ACM (2007)
- 43. Zhu, X., Ramanan, D.: Face detection, pose estimation, and landmark localization in the wild. In: 2012 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 2879–2886 (June 2012)