Exploiting Depth and Intensity Information for Head Pose Estimation with Random Forests and Tensor Models

  • Sertan Kaymak
  • Ioannis Patras
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7729)


Accurate, real-time head pose estimation is required for several applications. Methods based on 2D images might not provide accurate and robust head pose measurements because of large head pose variations and illumination changes. Robust and accurate head pose estimation can be achieved by integrating intensity and depth information. In this paper we introduce a head pose estimation system that employs random forests and tensor regression algorithms. The former allow the modeling of large head pose variations using large sets of training data, while the latter allow the estimation of more accurate head pose parameters. Combining the two methods results in more robust and accurate predictions for large head pose variations. We also study the fusion of different sources of information (intensity and depth images) to determine how their combination affects the performance of a head pose estimation system. The proposed framework is evaluated on the Biwi Kinect Head Pose dataset, where it is shown to outperform typical random forests.


Keywords (added by machine, not by the authors): Random Forest, Leaf Node, Depth Data, Feature Channel, Large Head





Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Sertan Kaymak (1)
  • Ioannis Patras (1)
  1. Queen Mary University of London, UK
