Group Action Recognition Using Space-Time Interest Points

  • Qingdi Wei
  • Xiaoqin Zhang
  • Yu Kong
  • Weiming Hu
  • Haibin Ling
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5876)

Abstract

Group action recognition is a challenging task in computer vision because multiple simultaneous motion patterns make the scene highly complex. This paper analyzes group actions in video clips that contain several activities. We combine the probability summation framework with space-time (ST) interest points for this task. First, ST interest points are extracted from the video clips to form the feature space. The features are then clustered with k-means to build a compact representation, which is used for group action classification. The approach is applied to a four-class classification task covering badminton, tennis, basketball, and soccer videos. The experimental results demonstrate the advantages of the proposed approach.
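
The clustering step described in the abstract can be illustrated with a minimal sketch. It assumes each ST interest point has already been turned into a fixed-length descriptor; the synthetic descriptors, the helper names (`kmeans`, `histogram`, `make_clip`), and the choice of k = 2 are all hypothetical and chosen for illustration. The sketch only covers k-means clustering and the resulting per-clip histogram; it does not implement the interest point detector or the probability summation framework, whose details are not given here.

```python
import random

def squared_distance(a, b):
    """Squared Euclidean distance between two descriptors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(point, centers):
    """Index of the cluster center closest to the given descriptor."""
    return min(range(len(centers)),
               key=lambda i: squared_distance(point, centers[i]))

def kmeans(points, k, iterations=20, seed=0):
    """Plain Lloyd's k-means; returns k cluster centers (the codebook)."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest(p, centers)].append(p)
        for i, members in enumerate(clusters):
            if members:  # keep the previous center if a cluster is empty
                centers[i] = [sum(dim) / len(members) for dim in zip(*members)]
    return centers

def histogram(descriptors, centers):
    """Normalized count of descriptors per cluster: one compact clip signature."""
    counts = [0] * len(centers)
    for d in descriptors:
        counts[nearest(d, centers)] += 1
    total = sum(counts) or 1
    return [c / total for c in counts]

def make_clip(rng, mean, n=30, dim=4):
    """Synthetic stand-in for the descriptors of one clip (assumption)."""
    return [[rng.gauss(mean, 0.5) for _ in range(dim)] for _ in range(n)]

rng = random.Random(1)
clips = {"clip_a": make_clip(rng, 0.0), "clip_b": make_clip(rng, 5.0)}

# Build one codebook from all descriptors, then describe each clip with it.
all_descriptors = [d for clip in clips.values() for d in clip]
codebook = kmeans(all_descriptors, k=2)
signatures = {name: histogram(clip, codebook) for name, clip in clips.items()}

for name, signature in signatures.items():
    print(name, signature)
```

Each clip ends up as a vector whose length equals the number of clusters, regardless of how many interest points the clip contains. A classifier of any kind could then operate on these fixed-length vectors.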

Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Qingdi Wei (1)
  • Xiaoqin Zhang (1)
  • Yu Kong (2)
  • Weiming Hu (1)
  • Haibin Ling (3)
  1. National Laboratory of Pattern Recognition, Institute of Automation, CAS, Beijing, P.R. China
  2. Beijing Laboratory of Intelligent Information Technology, School of Computer Science, Beijing Institute of Technology, Beijing, P.R. China
  3. Center for Information Science and Technology, Computer and Information Science Department, Temple University, Philadelphia, USA
