Online human moves recognition through discriminative key poses and speed-aware action graphs

  • Original Paper
  • Published in: Machine Vision and Applications

Abstract

Recognizing user-defined moves serves a large number of applications, including sports monitoring, virtual reality, and natural user interfaces (NUI). However, many efficient human move recognition methods are still limited to specific situations, such as straightforward NUI gestures or everyday human actions. In particular, most methods depend on a prior segmentation of recordings both to train and to recognize moves. This segmentation step is generally performed manually, or based on heuristics such as neutral poses or short pauses, limiting the range of applications. Moreover, speed is generally not considered a criterion to distinguish moves. We present an approach composed of a simplified move training phase that requires minimal user intervention, together with a novel method that robustly recognizes moves online from unsegmented data, without requiring transitional pauses or neutral poses, while additionally taking human move speed into account. Trained gestures are automatically segmented in real time by a curvature-based method that detects small pauses during a training session. A set of the most discriminative key poses between different moves is also extracted in real time, optimizing the number of key poses. Altogether, this semi-supervised learning approach only requires continuous move performances from the user, separated by small pauses. Key pose transitions and move execution speeds are used as input to a novel human move recognition algorithm that recognizes unsegmented moves online, achieving high robustness and very low latency in our experiments, while also effectively distinguishing moves that differ only in speed.
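As a rough illustration of the segmentation idea, the sketch below is hypothetical: the function name, thresholds, and the plain finite-difference speed estimate are our own assumptions, whereas the paper itself relies on a curvature-based detector over the pose trajectory. It cuts an unsegmented pose stream wherever the trajectory nearly stops for a few consecutive frames:

```python
import numpy as np

def segment_by_pauses(poses, fps=30.0, speed_thresh=0.05, min_pause=5):
    """Cut an unsegmented pose stream at small pauses.

    poses: (n_frames, n_dims) array of skeleton features per frame.
    Returns (start, end) frame index pairs, one per candidate move.
    All thresholds are illustrative, not values from the paper.
    """
    velocity = np.gradient(poses, 1.0 / fps, axis=0)  # finite-difference velocity
    speed = np.linalg.norm(velocity, axis=1)          # per-frame trajectory speed

    segments, start, pause_len = [], 0, 0
    for i, is_slow in enumerate(speed < speed_thresh):
        if is_slow:
            pause_len += 1
            end = i - pause_len + 1                   # first frame of the pause
            if pause_len == min_pause and end > start:
                segments.append((start, end))         # close the move before the pause
        else:
            if pause_len >= min_pause:
                start = i                             # a new move begins after the pause
            pause_len = 0
    if pause_len < min_pause and len(poses) > start:
        segments.append((start, len(poses)))          # trailing move, if any
    return segments
```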
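Similarly, a minimal sketch of how key-pose transitions and execution speed can drive online recognition, assuming each trained move is summarized as an ordered list of key poses with expected transition times. This toy matcher only stands in for the paper's speed-aware action graph; it is not the authors' algorithm:

```python
import numpy as np

class SpeedAwareMatcher:
    """Toy online matcher over key-pose sequences with expected transition times.

    moves: {name: [(key_pose_vector, expected_dt_seconds), ...]}, where the
    dt of the first key pose is ignored. A move fires when its key poses are
    hit in order and each transition time stays within a relative tolerance
    of the trained duration. Parameters are illustrative assumptions.
    """

    def __init__(self, moves, pose_eps=0.3, tol=0.5):
        self.moves = moves
        self.pose_eps = pose_eps                   # max distance to "reach" a key pose
        self.tol = tol                             # relative tolerance on transition time
        self.state = {m: (0, 0.0) for m in moves}  # (next key-pose index, last hit time)

    def feed(self, pose, t):
        """Process one frame (pose vector, timestamp); return a move name or None."""
        for name, seq in self.moves.items():
            idx, last_t = self.state[name]
            key, dt_exp = seq[idx]
            if np.linalg.norm(pose - key) >= self.pose_eps:
                continue                           # not at this move's next key pose
            if idx > 0 and abs((t - last_t) - dt_exp) > self.tol * dt_exp:
                self.state[name] = (0, 0.0)        # right pose, wrong speed: reset
                continue
            if idx + 1 == len(seq):                # whole sequence matched in order
                self.state[name] = (0, 0.0)
                return name
            self.state[name] = (idx + 1, t)        # advance to the next key pose
        return None
```

Because a transition that is too fast or too slow resets the match, two moves that trace the same key poses at different speeds are kept apart, which is the behavior the abstract highlights.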

Acknowledgements

The authors would like to thank CNPq, FAPEAL, FAPERJ and École Polytechnique for partially financing this research.

Author information

Corresponding author

Correspondence to Thales Vieira.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (MOV 23113 KB)

About this article

Cite this article

Vieira, T., Faugeroux, R., Martínez, D. et al. Online human moves recognition through discriminative key poses and speed-aware action graphs. Machine Vision and Applications 28, 185–200 (2017). https://doi.org/10.1007/s00138-016-0818-y
