Enhanced Sequence Matching for Action Recognition from 3D Skeletal Data

  • Conference paper
Computer Vision -- ACCV 2014 (ACCV 2014)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 9007)

Abstract

Human action recognition using 3D skeletal data has become a popular topic with the emergence of cost-effective depth sensors such as Microsoft Kinect. However, noisy joint positions and speed variation between actors make action recognition from 3D joint positions difficult. To address these problems, this paper proposes a novel framework, called Enhanced Sequence Matching (ESM), to align and compare action sequences. Inspired by the DNA sequence alignment methods used in bioinformatics, we design a new scoring function that measures the similarity between two noisy action sequences. We construct each action sequence from a set of elementary Moving Poses (eMP) built with affinity propagation. Because affinity propagation determines the number of clusters itself, the eMP set is built automatically, without fixing the number of eMPs in advance. The proposed framework outperforms the state of the art on the UTKinect action dataset and the MSRC-12 gesture dataset, and achieves performance comparable to the state of the art on the MSR Action 3D dataset. Moreover, experimental results show that our method is intuitive and robust to noise and temporal variation.
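To make the alignment idea concrete, the following is a minimal sketch of classic global sequence alignment (Needleman-Wunsch dynamic programming), applied to sequences of integer pose labels. It is not the ESM scoring function, which the abstract does not specify. The names `global_alignment_score` and `classify`, the `match`, `mismatch`, and `gap` values, the integer labels standing in for eMP indices, and the nearest-template classification rule are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def global_alignment_score(
    a: Sequence[int],
    b: Sequence[int],
    match: float = 1.0,      # assumed reward for identical labels
    mismatch: float = -1.0,  # assumed penalty for different labels
    gap: float = -0.5,       # assumed penalty for skipping one label
) -> float:
    """Needleman-Wunsch global alignment score of two label sequences."""
    rows, cols = len(a) + 1, len(b) + 1
    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0.0] * cols for _ in range(rows)]
    for i in range(1, rows):
        score[i][0] = i * gap  # a[:i] aligned against nothing
    for j in range(1, cols):
        score[0][j] = j * gap  # b[:j] aligned against nothing
    for i in range(1, rows):
        for j in range(1, cols):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + pair,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,       # skip a[i-1]
                score[i][j - 1] + gap,       # skip b[j-1]
            )
    return score[-1][-1]


def classify(query: Sequence[int], templates: List[Tuple[str, Sequence[int]]]) -> str:
    """Return the label of the template that aligns best with the query."""
    best = max(templates, key=lambda item: global_alignment_score(query, item[1]))
    return best[0]


if __name__ == "__main__":
    # Hypothetical label sequences; the repeated 1 mimics a slower performer.
    templates = [("wave", [1, 2, 3]), ("kick", [4, 5, 6])]
    print(classify([1, 1, 2, 3], templates))
```

In this sketch a slower performance shows up as repeated labels, which the alignment absorbs at the cost of a small gap penalty instead of a mismatch. This is one simple way an alignment score can tolerate temporal variation.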

Acknowledgement

This work was partly supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MEST) (No. 2011-00166669) and Samsung Electronics Co., Ltd.

Author information

Corresponding author

Correspondence to Ki-Sang Hong.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Jung, HJ., Hong, KS. (2015). Enhanced Sequence Matching for Action Recognition from 3D Skeletal Data. In: Cremers, D., Reid, I., Saito, H., Yang, MH. (eds) Computer Vision -- ACCV 2014. ACCV 2014. Lecture Notes in Computer Science, vol 9007. Springer, Cham. https://doi.org/10.1007/978-3-319-16814-2_15

  • DOI: https://doi.org/10.1007/978-3-319-16814-2_15

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-16813-5

  • Online ISBN: 978-3-319-16814-2

  • eBook Packages: Computer Science, Computer Science (R0)
