Exploring Alternative Spatial and Temporal Dense Representations for Action Recognition

  • Pau Agustí
  • V. Javier Traver
  • Manuel J. Marin-Jimenez
  • Filiberto Pla
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6855)


The automatic analysis of video sequences in which individuals perform actions is currently receiving much attention in the computer vision community. Among the visual features chosen to tackle action recognition, local histograms computed within a region of interest have proven very effective. In this work we study, for the first time, whether spatiograms, which are histograms enriched with per-bin spatial information, can be an effective alternative for action characterization. In addition, the temporal information of these histograms is usually collapsed by simply averaging them, which essentially ignores the dynamics of the action. In contrast, this paper explores a temporally holistic representation in the form of recurrence matrices, which capture pairwise relationships between spatiograms on a frame-by-frame basis. Experimental results show that recurrence matrices are powerful for action classification, whereas spatiograms, in their current usage, are not.
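To make the two representations concrete, the following is a minimal sketch and not the authors' implementation. It assumes grayscale frames given as nested lists, keeps only the per-bin spatial mean (a full second-order spatiogram would also store a per-bin spatial covariance), and compares frames with a plain Euclidean distance; the abstract does not specify the similarity measure actually used. The names `spatiogram`, `frame_distance`, and `recurrence_matrix` are hypothetical.

```python
import math


def spatiogram(image, n_bins, max_value=256):
    """Return, per intensity bin, (normalized count, mean row, mean column).

    Assumption: `image` is a list of rows of intensities in [0, max_value).
    The normalized counts alone form an ordinary histogram; the two means
    are the added per-bin spatial information.
    """
    counts = [0] * n_bins
    sum_r = [0.0] * n_bins
    sum_c = [0.0] * n_bins
    for r, row in enumerate(image):
        for c, value in enumerate(row):
            b = min(int(value * n_bins / max_value), n_bins - 1)
            counts[b] += 1
            sum_r[b] += r
            sum_c[b] += c
    total = sum(counts)
    descriptor = []
    for b in range(n_bins):
        if counts[b]:
            descriptor.append(
                (counts[b] / total, sum_r[b] / counts[b], sum_c[b] / counts[b])
            )
        else:
            descriptor.append((0.0, 0.0, 0.0))  # empty bin
    return descriptor


def frame_distance(s1, s2):
    """Euclidean distance over flattened descriptors (an illustrative choice)."""
    return math.sqrt(
        sum((a - b) ** 2 for t1, t2 in zip(s1, s2) for a, b in zip(t1, t2))
    )


def recurrence_matrix(descriptors):
    """Pairwise distances between the descriptors of all frames."""
    n = len(descriptors)
    return [
        [frame_distance(descriptors[i], descriptors[j]) for j in range(n)]
        for i in range(n)
    ]


# Three toy 2x2 frames: frames 0 and 2 are identical, frame 1 has the same
# intensity histogram but a different spatial layout.
frames = [
    [[0, 0], [255, 255]],
    [[0, 255], [0, 255]],
    [[0, 0], [255, 255]],
]
descs = [spatiogram(f, n_bins=2) for f in frames]
R = recurrence_matrix(descs)
```

In this toy sequence the histogram parts of frames 0 and 1 coincide, so averaging histograms could not tell them apart, while the spatial means differ and yield a nonzero entry `R[0][1]`. The zero at `R[0][2]` marks the recurrence of the same pose, which is the kind of temporal structure that averaging discards.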


Human action recognition, recurrence matrices, spatiograms, spatio-temporal representations





Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Pau Agustí (1)
  • V. Javier Traver (1)
  • Manuel J. Marin-Jimenez (2)
  • Filiberto Pla (1)
  1. iNIT and DLSI, Universidad Jaume I, Castellón, Spain
  2. Universidad de Córdoba, Spain
