Abstract
One of the key challenges in human action recognition from video sequences is modeling an action sufficiently. In this paper we therefore propose a novel motion-based representation, Motion Context (MC), which is insensitive to the scale and direction of an action and builds on established image representation techniques. An MC captures the distribution of motion words (MWs) over relative locations in a local region of the motion image (MI) around a reference point, summarizing the local motion information in a rich 3D MC descriptor. Any human action can then be represented as a 3D descriptor by summing all the MC descriptors of that action. For recognition, we propose four configurations: MW+pLSA, MW+SVM, MC+w³-pLSA (a new direct graphical model extending pLSA), and MC+SVM. We evaluate our approach on two human action video datasets, from KTH and the Weizmann Institute of Science (WIS), with promising results. On the KTH dataset, the proposed MC representation achieves the highest performance when combined with the proposed w³-pLSA; on the WIS dataset, the best performance of the proposed MC is comparable to the state of the art.
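The descriptor described above — a histogram of motion-word counts over relative locations around a reference point, summed over reference points to describe a whole action — can be sketched as follows. This is a minimal illustration under assumed parameters (the bin counts, radius cutoff, and function names are hypothetical, not the paper's actual configuration):

```python
import numpy as np

def motion_context_descriptor(points, words, ref, n_radial=3, n_angular=8,
                              vocab_size=20, max_radius=50.0):
    """Sketch of a Motion-Context-style descriptor: a 3D histogram of
    motion-word counts binned by (radial bin, angular bin, word id)
    relative to a reference point. Bin counts and max_radius are
    illustrative assumptions, not the paper's parameters."""
    desc = np.zeros((n_radial, n_angular, vocab_size))
    for (x, y), w in zip(points, words):
        dx, dy = x - ref[0], y - ref[1]
        r = np.hypot(dx, dy)
        if r == 0 or r > max_radius:
            continue  # skip the reference point itself and far-away words
        r_bin = min(int(n_radial * r / max_radius), n_radial - 1)
        theta = np.arctan2(dy, dx) % (2 * np.pi)   # angle in [0, 2*pi)
        a_bin = min(int(n_angular * theta / (2 * np.pi)), n_angular - 1)
        desc[r_bin, a_bin, w] += 1
    return desc

def action_descriptor(points, words, refs, **kw):
    # An action is summarized by summing the MC descriptors computed
    # at each reference point, as the abstract describes.
    return sum(motion_context_descriptor(points, words, r, **kw) for r in refs)
```

The resulting 3D descriptor can then be fed to a classifier (e.g. an SVM) or used as the observation in a topic model, matching the MC+SVM and MC+w³-pLSA configurations named above.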
© 2008 Springer-Verlag Berlin Heidelberg
Cite this paper
Zhang, Z., Hu, Y., Chan, S., Chia, LT. (2008). Motion Context: A New Representation for Human Action Recognition. In: Forsyth, D., Torr, P., Zisserman, A. (eds) Computer Vision – ECCV 2008. ECCV 2008. Lecture Notes in Computer Science, vol 5305. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-88693-8_60
Print ISBN: 978-3-540-88692-1
Online ISBN: 978-3-540-88693-8