Modeling Video Activity with Dynamic Phrases and Its Application to Action Recognition in Tennis Videos

  • Jonathan Vainstein
  • José F. Manera
  • Pablo Negri
  • Claudio Delrieux
  • Ana Maguitman
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8827)


We present a novel approach to action recognition in tennis shot sequences. The underlying model treats the per-frame motion as a word (drawn from an alphabet of possible motions) and the sequence of frames as a phrase whose meaning is determined by its words and their order. This feature extraction mechanism allows a semantic treatment of the classification stage using Conditional Random Fields. The system was applied to the RGB videos of the THETIS dataset, achieving an accuracy of over 86% in recognizing 12 different tennis shots among several takes produced by 55 different amateur and professional players.
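
The word/phrase idea can be illustrated with a minimal sketch. The abstract does not specify how the alphabet of motions is built, so everything here is an assumption made for illustration only: the hypothetical `motion_to_word` quantizes a dominant per-frame motion vector into one of eight direction words plus a "still" word, and the hypothetical `frames_to_phrase` keeps the words in frame order so that a sequence model such as a Conditional Random Field could consume them.

```python
import math


def motion_to_word(dx, dy, n_directions=8, still_threshold=0.5):
    """Map one per-frame motion vector to a discrete word.

    Assumption: the alphabet is n_directions angular sectors plus "still".
    The paper's actual alphabet may be defined differently.
    """
    if math.hypot(dx, dy) < still_threshold:
        return "still"
    sector_width = 2 * math.pi / n_directions
    angle = math.atan2(dy, dx) % (2 * math.pi)
    # Shift by half a sector so that sector 0 is centred on the +x axis.
    sector = int((angle + sector_width / 2) // sector_width) % n_directions
    return f"dir{sector}"


def frames_to_phrase(motions):
    """Turn an ordered list of per-frame motion vectors into a phrase."""
    return [motion_to_word(dx, dy) for dx, dy in motions]


# Four frames: right, up, almost no motion, left.
phrase = frames_to_phrase([(1.0, 0.0), (0.0, 1.0), (0.1, 0.1), (-1.0, 0.0)])
print(phrase)
```

Because the phrase is an ordered list, two shots made of the same motions in a different order yield different phrases, which is the property the model relies on.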


Action recognition, conditional random fields, support vector machines, optical flow, motion description



Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Jonathan Vainstein (1, 2)
  • José F. Manera (1, 2)
  • Pablo Negri (3)
  • Claudio Delrieux (1)
  • Ana Maguitman (2)
  1. Laboratorio de Ciencias de las Imágenes (IIIE - CONICET), Departamento de Ingeniería Eléctrica y de Computadoras (DIEC), Universidad Nacional del Sur (UNS), Bahía Blanca, Argentina
  2. Grupo de Investigación en Administración de Conocimiento y Recuperación de Información - LIDIA, Departamento de Ciencias e Ingeniería de la Computación (DCIC), Universidad Nacional del Sur (UNS), Bahía Blanca, Argentina
  3. Instituto de Tecnología, UADE-CONICET, Buenos Aires, Argentina
