Combining Models of Pose and Dynamics for Human Motion Recognition

  • Roman Filipovych
  • Eraldo Ribeiro
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4842)

Abstract

We present a novel method for human motion recognition. A video sequence is represented by a sparse set of spatial and spatial-temporal features obtained by extracting static and dynamic interest points. Our model learns a set of poses along with the dynamics of the sequence. The pose models and the model of motion dynamics are represented as constellations of static and dynamic parts, respectively. On top of this layer of individual models we build a higher-level model that can be described as a “constellation of constellation models”. This model encodes the spatial-temporal relationships between the dynamics of the motion and the appearance of individual poses. We test the model on a publicly available action dataset and demonstrate that the method performs well on the classification task. Additional experiments show how classification performance can be improved by increasing the number of pose models in the framework.
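
The abstract does not give the exact form of the constellation models, so the following is only an illustrative sketch under stated assumptions. It scores a star-shaped constellation in which each part's (x, y, t) offset from a reference part follows an independent Gaussian. The function names, the part names, the diagonal covariance, and all numeric values are hypothetical and are not taken from the paper.

```python
import math


def gaussian_log_pdf(x, mean, var):
    """Log density of a 1-D Gaussian with the given mean and variance."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def star_log_likelihood(parts, root, mean_offsets, variances):
    """Score a star-shaped constellation of parts (illustrative assumption).

    parts:        dict name -> (x, y, t) location of a detected part
    root:         name of the reference part all offsets are measured from
    mean_offsets: dict name -> expected (dx, dy, dt) relative to the root
    variances:    dict name -> per-axis variances (diagonal covariance assumed)
    """
    root_pos = parts[root]
    total = 0.0
    for name, pos in parts.items():
        if name == root:
            continue  # the root only anchors the other parts
        for axis in range(3):
            offset = pos[axis] - root_pos[axis]
            total += gaussian_log_pdf(
                offset, mean_offsets[name][axis], variances[name][axis]
            )
    return total


# Hypothetical model with one non-root part, and two candidate configurations.
mean_offsets = {"hand": (10.0, -5.0, 2.0)}
variances = {"hand": (4.0, 4.0, 1.0)}
good = {"torso": (50.0, 50.0, 0.0), "hand": (60.0, 45.0, 2.0)}
bad = {"torso": (50.0, 50.0, 0.0), "hand": (80.0, 70.0, 9.0)}

score_good = star_log_likelihood(good, "torso", mean_offsets, variances)
score_bad = star_log_likelihood(bad, "torso", mean_offsets, variances)
print(score_good > score_bad)
```

Under the same assumptions, a higher-level "constellation of constellations" could reuse `star_log_likelihood`, treating each sub-model's reference location as a part of the upper-level model and adding the resulting score to the scores of the individual sub-models.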

Copyright information

© Springer-Verlag Berlin Heidelberg 2007

Authors and Affiliations

  • Roman Filipovych (1)
  • Eraldo Ribeiro (1)
  1. Computer Vision and Bio-Inspired Computing Laboratory, Department of Computer Sciences, Florida Institute of Technology, Melbourne, FL 32901, USA
