Efficient Pose-Based Action Recognition

  • Abdalrahman Eweiwi
  • Muhammed S. Cheema
  • Christian Bauckhage
  • Juergen Gall
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9007)


Action recognition from 3D pose data has gained increasing attention since the data is readily available for depth or RGB-D videos. The most successful approaches so far rely on expensive feature selection or mining during training. In this work, we introduce an algorithm that is very efficient for both training and testing. The main idea is that rich structured data like 3D pose does not require sophisticated feature modeling or learning. Instead, we reduce pose data over time to histograms of relative location, velocity, and their correlations, and use partial least squares to learn a compact and discriminative representation from them. Despite its efficiency, our approach achieves state-of-the-art accuracy on four different benchmarks. We further investigate the differences between 2D and 3D pose data for action recognition.
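The pipeline outlined in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the bin count, value range, choice of root joint, and the toy data generator `make_sequence` are all assumptions made for illustration, the correlation histograms are omitted, and only the first partial least squares direction for a two-class problem is computed.

```python
import math

def histogram(values, bins=4, lo=-1.0, hi=1.0):
    """Normalized histogram over [lo, hi]; out-of-range values are clamped."""
    counts = [0.0] * bins
    width = (hi - lo) / bins
    for v in values:
        idx = min(max(int((v - lo) / width), 0), bins - 1)
        counts[idx] += 1.0
    total = sum(counts) or 1.0  # avoid division by zero for empty input
    return [c / total for c in counts]

def pose_descriptor(seq, root=0, bins=4):
    """seq: list of frames, each a list of (x, y, z) joints.
    Concatenates, per non-root joint and axis, a histogram of root-relative
    location and a histogram of frame-to-frame velocity (assumed definitions)."""
    feats = []
    for j in range(len(seq[0])):
        if j == root:
            continue
        for axis in range(3):
            rel = [frame[j][axis] - frame[root][axis] for frame in seq]
            vel = [rel[t + 1] - rel[t] for t in range(len(rel) - 1)]
            feats += histogram(rel, bins) + histogram(vel, bins)
    return feats

def pls_direction(X, y):
    """First PLS weight vector for one response: w is proportional to Xc^T yc,
    where Xc and yc are the mean-centered features and labels."""
    n, d = len(X), len(X[0])
    mean = [sum(row[k] for row in X) / n for k in range(d)]
    y_mean = sum(y) / n
    w = [sum((X[i][k] - mean[k]) * (y[i] - y_mean) for i in range(n))
         for k in range(d)]
    norm = math.sqrt(sum(v * v for v in w)) or 1.0
    return mean, [v / norm for v in w]

def project(x, mean, w):
    """Score of one descriptor along the learned direction."""
    return sum((xk - mk) * wk for xk, mk, wk in zip(x, mean, w))

def make_sequence(waving, phase, frames=20):
    """Hypothetical toy data: joint 0 is a fixed root, joint 1 is a hand."""
    seq = []
    for t in range(frames):
        x = 0.5 * math.sin(0.5 * t + phase) if waving else 0.1
        seq.append([(0.0, 0.0, 0.0), (x, 0.5, 0.0)])
    return seq

labels = [1.0, 1.0, 1.0, -1.0, -1.0, -1.0]  # +1 = waving, -1 = still
sequences = [make_sequence(lab > 0, 0.3 * i) for i, lab in enumerate(labels)]
X = [pose_descriptor(s) for s in sequences]
mean, w = pls_direction(X, labels)
scores = [project(x, mean, w) for x in X]
```

In this toy setting, the projected scores of the waving sequences lie on average above those of the still sequences. A fuller treatment would extract several PLS components and feed them to a classifier; which classifier the paper uses is not stated in the abstract.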


Partial Least Squares; Video Clip; Action Recognition; Dynamic Time Warping; Human Action Recognition
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.



This work was carried out within the project "Automatic Activity Recognition in Large Image Databases", which is funded by the German Research Foundation (DFG). The authors would also like to acknowledge the financial support provided by the DFG Emmy Noether program (GA 1927/1-1).



Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Abdalrahman Eweiwi (1)
  • Muhammed S. Cheema (1)
  • Christian Bauckhage (1, 3)
  • Juergen Gall (2)

  1. Bonn-Aachen International Center for IT, University of Bonn, Bonn, Germany
  2. Computer Vision Group, University of Bonn, Bonn, Germany
  3. Multimedia Pattern Recognition Group, Fraunhofer IAIS, Sankt Augustin, Germany
