Action Classification in Soccer Videos with Long Short-Term Memory Recurrent Neural Networks

  • Moez Baccouche
  • Franck Mamalet
  • Christian Wolf
  • Christophe Garcia
  • Atilla Baskurt
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6353)

Abstract

In this paper, we propose a novel approach for action classification in soccer videos using a recurrent neural network scheme. At each timestep of a video action, we extract a set of features that describe both the visual content, by means of a bag-of-words (BoW) approach, and the dominant motion, with a keypoint-based approach. A Long Short-Term Memory-based Recurrent Neural Network is then trained to classify each video sequence by considering the temporal evolution of the features across timesteps. Experimental results on the MICC-Soccer-Actions-4 database show that the proposed approach outperforms classification methods of related works (with a classification rate of 77 %), and that the combination of the two features (BoW and dominant motion) leads to a classification rate of 92 %.
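The classification scheme summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a standard LSTM cell with forget gates, untrained random weights, four action classes (inferred from the database name), a hypothetical 20-bin BoW histogram plus a single hypothetical dominant-motion value per timestep, and a decision rule that averages the per-timestep softmax outputs. All names (`LSTMClassifier`, `classify`, the feature sizes) are illustrative only.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(W, v):
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def init_matrix(rows, cols, rng, scale=0.1):
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


class LSTMClassifier:
    """Standard LSTM cell followed by a softmax layer (assumed architecture)."""

    def __init__(self, n_in, n_hidden, n_classes, seed=0):
        rng = random.Random(seed)
        self.n_hidden = n_hidden
        # One matrix per gate, applied to the vector [x_t, h_{t-1}, 1].
        self.gates = {
            name: init_matrix(n_hidden, n_in + n_hidden + 1, rng)
            for name in ("input", "forget", "output", "candidate")
        }
        self.W_out = init_matrix(n_classes, n_hidden + 1, rng)

    def step(self, x, h, c):
        z = list(x) + h + [1.0]
        i = [sigmoid(a) for a in matvec(self.gates["input"], z)]
        f = [sigmoid(a) for a in matvec(self.gates["forget"], z)]
        o = [sigmoid(a) for a in matvec(self.gates["output"], z)]
        g = [math.tanh(a) for a in matvec(self.gates["candidate"], z)]
        # The cell state carries information across timesteps.
        c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
        h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
        return h, c

    def classify(self, sequence):
        h = [0.0] * self.n_hidden
        c = [0.0] * self.n_hidden
        totals = [0.0] * len(self.W_out)
        for x in sequence:
            h, c = self.step(x, h, c)
            probs = softmax(matvec(self.W_out, h + [1.0]))
            totals = [t + p for t, p in zip(totals, probs)]
        # Assumed decision rule: average the per-timestep class probabilities.
        avg = [t / len(sequence) for t in totals]
        return max(range(len(avg)), key=avg.__getitem__), avg


# Synthetic stand-in for one video sequence: 30 timesteps, each a
# normalized 20-bin BoW histogram plus one dominant-motion value.
rng = random.Random(1)
sequence = []
for _ in range(30):
    raw = [rng.random() for _ in range(20)]
    bow = [r / sum(raw) for r in raw]
    motion = [rng.uniform(-1.0, 1.0)]
    sequence.append(bow + motion)

model = LSTMClassifier(n_in=21, n_hidden=8, n_classes=4)
label, avg_probs = model.classify(sequence)
```

With random weights the predicted `label` is meaningless. The sketch only shows how a sequence of concatenated per-timestep features flows through the recurrent cell to a single sequence-level decision.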

Keywords

Video Sequence; Visual Word; Recurrent Neural Network; Convolutional Neural Network; Visual Content
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Moez Baccouche (1, 2)
  • Franck Mamalet (1)
  • Christian Wolf (2)
  • Christophe Garcia (1)
  • Atilla Baskurt (2)

  1. Orange Labs, Cesson-Sévigné, France
  2. LIRIS, UMR 5205 CNRS, INSA-Lyon, France
