Real-Time Human Body Pose Estimation for In-Car Depth Images

  • Helena R. Torres
  • Bruno Oliveira
  • Jaime Fonseca
  • Sandro Queirós
  • João Borges
  • Nélson Rodrigues
  • Victor Coelho
  • Johannes Pallauf
  • José Brito
  • José Mendes
Conference paper
Part of the IFIP Advances in Information and Communication Technology book series (IFIPAICT, volume 553)


Abstract

Over the next years, the number of autonomous vehicles is expected to increase. This new paradigm will change the role of the driver inside the car and, for safety purposes, makes continuous monitoring of the driver and passengers essential. Such monitoring can be achieved by detecting the human body pose inside the car to understand the occupant's activity. In this paper, a method is proposed to accurately detect the human body pose in depth images acquired inside a car with a time-of-flight camera. The method is a deep-learning strategy in which the convolutional neural network comprises three branches: the first estimates confidence maps for each joint position, the second associates different body parts, and the third detects the presence of each joint in the image. The proposed framework was trained and tested on 8820 and 1650 depth images, respectively. The method proved accurate, achieving an average distance error between detected joints and the ground truth of 7.6 pixels, and an average accuracy, precision, and recall of 95.6%, 96.0%, and 97.8%, respectively. Overall, these results demonstrate the robustness of the method and its potential for in-car body pose monitoring.
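The three-branch design described in the abstract can be sketched as follows. This is a minimal, framework-free illustration of the output structure only: the feature-map size, number of joints J, number of limb connections L, and the use of 1x1-convolution heads are assumptions for illustration, not details taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dimensions (hypothetical; not specified in the abstract).
C, H, W = 32, 56, 56   # shared feature map: channels x height x width
J = 15                 # assumed number of body joints
L = 14                 # assumed number of limb (part-to-part) connections

# Shared features, as produced by some backbone network.
features = rng.standard_normal((C, H, W))

def head_1x1(feats, out_channels, rng):
    """A 1x1 convolution head, written as a per-pixel matrix multiply."""
    c, h, w = feats.shape
    weights = rng.standard_normal((out_channels, c)) * 0.01
    return (weights @ feats.reshape(c, h * w)).reshape(out_channels, h, w)

# Branch 1: one confidence (heat)map per joint -> shape (J, H, W).
confidence_maps = head_1x1(features, J, rng)

# Branch 2: a 2-channel vector field per limb, associating
# body parts into a skeleton -> shape (2*L, H, W).
part_fields = head_1x1(features, 2 * L, rng)

# Branch 3: per-joint presence probability, via global average
# pooling and a sigmoid -> shape (J,).
pooled = features.mean(axis=(1, 2))
w3 = rng.standard_normal((J, C)) * 0.01
presence = 1.0 / (1.0 + np.exp(-(w3 @ pooled)))

print(confidence_maps.shape, part_fields.shape, presence.shape)
```

The third branch is what distinguishes this architecture from a plain two-branch heatmap/association network: joints occluded or outside the frame can be suppressed by their presence score instead of producing spurious heatmap peaks.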


Keywords: Autonomous driving · Deep learning · Depth images · Pose estimation



This work is supported by: European Structural and Investment Funds in the FEDER component, through the Operational Competitiveness and Internationalization Programme (COMPETE 2020) [Project nº 002797; Funding Reference: POCI-01-0247-FEDER-002797].



Copyright information

© IFIP International Federation for Information Processing 2019

Authors and Affiliations

  • Helena R. Torres (1) (corresponding author)
  • Bruno Oliveira (1)
  • Jaime Fonseca (1)
  • Sandro Queirós (1)
  • João Borges (1)
  • Nélson Rodrigues (1)
  • Victor Coelho (3)
  • Johannes Pallauf (2)
  • José Brito (4)
  • José Mendes (1)
  1. Algoritmi Center, University of Minho, Guimarães, Portugal
  2. Bosch, Abstatt, Germany
  3. Bosch, Braga, Portugal
  4. 2Ai, Polytechnic Institute of Cávado and Ave, Barcelos, Portugal