View Invariance for Human Action Recognition

International Journal of Computer Vision

Abstract

This paper presents an approach to viewpoint-invariant human action recognition, an area that has so far received scant attention relative to the overall body of work on human action recognition. It has been established previously that no invariants exist for 3D to 2D projection. However, a wealth of techniques in 2D invariance can be used to advantage in 3D to 2D projection. We exploit these techniques and model actions in terms of view-invariant canonical body poses and trajectories in 2D invariance space, leading to a simple and effective way to represent and recognize human actions from a general viewpoint. We first evaluate the approach theoretically and show why a straightforward application of the 2D invariance idea will not work. We then describe strategies designed to overcome the inherent problems of the straightforward approach and outline the recognition algorithm. Finally, we present results on 2D projections of publicly available human motion capture data as well as on manually segmented real image sequences. In addition to being robust to viewpoint change, the approach handles different people, minor variations in a given action, and differences in the speed of action (and hence frame rate), while retaining sufficient distinction among actions.
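
As an illustration of the kind of 2D invariance the abstract refers to, the sketch below checks numerically that the cross-ratio of four collinear image points is unchanged by a plane projective transformation (homography). It is not the paper's algorithm: the sample points, the matrix `H`, and the helper names `apply_homography` and `cross_ratio` are assumptions chosen for this example only.

```python
import math

def apply_homography(h, p):
    """Map a 2D point p = (x, y) through a 3x3 homography h (nested lists)."""
    x, y = p
    xh = h[0][0] * x + h[0][1] * y + h[0][2]
    yh = h[1][0] * x + h[1][1] * y + h[1][2]
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return (xh / w, yh / w)

def dist(p, q):
    """Euclidean distance between two 2D points."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def cross_ratio(a, b, c, d):
    """Cross-ratio (|AC| * |BD|) / (|BC| * |AD|) of four collinear points.

    Unsigned distances are used, so the value is preserved only when the
    transformation keeps the points in the same order along the line.
    """
    return (dist(a, c) * dist(b, d)) / (dist(b, c) * dist(a, d))

# Four collinear points on the line y = x (assumed sample data).
points = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0), (4.0, 4.0)]

# An arbitrary homography, standing in for a change of viewpoint of a plane.
# Its third row gives a positive scale factor w at every sample point, so the
# point order along the line is preserved.
H = [
    [1.0, 0.2, 3.0],
    [0.1, 1.5, -2.0],
    [0.01, 0.02, 1.0],
]

projected = [apply_homography(H, p) for p in points]

before = cross_ratio(*points)
after = cross_ratio(*projected)
print("cross-ratio before:", before)
print("cross-ratio after: ", after)
```

The two printed values should agree up to floating-point error even though the individual point coordinates and distances change. Quantities with this property can serve as viewpoint-independent descriptors in 2D, which is why such techniques are attractive here.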

Additional information

This work was done when the author was a graduate student in the Department of Computer Science and was partially supported by the NSF Grant ECS-02-5475. The author is currently with Siemens Corporate Research, Princeton, NJ.

Dr. Chellappa is with the Department of Electrical and Computer Engineering.

About this article

Cite this article

Parameswaran, V., Chellappa, R. View Invariance for Human Action Recognition. Int J Comput Vision 66, 83–101 (2006). https://doi.org/10.1007/s11263-005-3671-4

  • DOI: https://doi.org/10.1007/s11263-005-3671-4
