Deep Head Pose Estimation from Depth Data for In-Car Automotive Applications

  • Conference paper
Understanding Human Activities Through 3D Sensors (UHA3DS 2016)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 10188)

Abstract

Recently, deep learning approaches have achieved promising results in various fields of computer vision. In this paper, we tackle the problem of head pose estimation with a Convolutional Neural Network (CNN). Unlike other proposals in the literature, the described system works directly on raw depth data alone. Moreover, head pose estimation is solved as a regression problem and does not rely on visual facial features such as facial landmarks. We tested our system on a well-known public dataset, Biwi Kinect Head Pose, showing that our approach achieves state-of-the-art results and meets real-time performance requirements.
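
To make the regression framing concrete, the following is a minimal sketch of how a continuous pose prediction could be scored. It assumes the network outputs three angles (yaw, pitch, roll) in degrees, which the abstract does not state explicitly; the function names `l2_pose_loss` and `mean_absolute_error` are illustrative and are not taken from the paper.

```python
def l2_pose_loss(predicted, target):
    """Squared Euclidean distance between two (yaw, pitch, roll) triples.

    Assumption: a pose is three continuous angles in degrees, so the
    network is trained to regress values rather than to pick a class.
    """
    if len(predicted) != 3 or len(target) != 3:
        raise ValueError("expected three angles: yaw, pitch, roll")
    return sum((p - t) ** 2 for p, t in zip(predicted, target))


def mean_absolute_error(predictions, targets):
    """Per-angle mean absolute error over a batch of pose triples."""
    if not predictions or len(predictions) != len(targets):
        raise ValueError("need two non-empty batches of equal length")
    n = len(predictions)
    return tuple(
        sum(abs(p[i] - t[i]) for p, t in zip(predictions, targets)) / n
        for i in range(3)
    )


if __name__ == "__main__":
    # Hypothetical predictions and ground-truth poses, in degrees.
    preds = [(10.0, 0.0, -5.0), (0.0, 2.0, 0.0)]
    truth = [(7.0, 4.0, -5.0), (0.0, 0.0, 4.0)]
    print(l2_pose_loss(preds[0], truth[0]))
    print(mean_absolute_error(preds, truth))
```

Because the targets are continuous, an error of a few degrees is penalised in proportion to its size, which a classification over discretised pose bins would not capture.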

Notes

  1. The tool is written in Java and it is completely free and open source. It takes as input the JSON file produced by the Keras framework and generates image outputs in common formats such as png, jpeg or gif. We invite the readers to test and use this software, hoping it can help in deep learning studies and presentations. The code can be downloaded at the following link: http://imagelab.ing.unimore.it/deepvisualizer.

Author information

Corresponding author

Correspondence to Guido Borghi.

Copyright information

© 2018 Springer International Publishing AG, part of Springer Nature

About this paper

Cite this paper

Venturelli, M., Borghi, G., Vezzani, R., Cucchiara, R. (2018). Deep Head Pose Estimation from Depth Data for In-Car Automotive Applications. In: Wannous, H., Pala, P., Daoudi, M., Flórez-Revuelta, F. (eds) Understanding Human Activities Through 3D Sensors. UHA3DS 2016. Lecture Notes in Computer Science, vol 10188. Springer, Cham. https://doi.org/10.1007/978-3-319-91863-1_6

  • DOI: https://doi.org/10.1007/978-3-319-91863-1_6

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-91862-4

  • Online ISBN: 978-3-319-91863-1

  • eBook Packages: Computer Science, Computer Science (R0)
