Learning Features for Human Action Recognition Using Multilayer Architectures

  • Manuel Jesús Marín-Jiménez
  • Nicolás Pérez de la Blanca
  • María Ángeles Mendoza
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6669)


This paper evaluates two multilayer architectures on the human action recognition (HAR) task. By combining low-level features with multilayer learning architectures, we infer discriminative semantic features that substantially improve classification performance. This approach removes the difficult step of selecting good mid-level feature descriptors by replacing the feature selection and extraction process with a learning stage. Unlike standard approaches, it models the probability distribution of the data with a multilayer graphical model. Experiments on the KTH and Weizmann video sequence databases are carried out to evaluate the proposal. The results show that the learned features offer classification performance comparable to the state of the art on these databases.
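The abstract does not describe the architectures in detail, but the keywords listed for this paper name restricted Boltzmann machines (RBMs) and deep belief networks. As a rough illustration of what "replacing feature extraction with a learning stage" can look like, the following is a minimal sketch of greedy layer-wise training of stacked binary RBMs with one-step contrastive divergence (CD-1). It is not the authors' implementation: the class and function names (`RBM`, `train_stack`), the layer sizes, the learning rate, and the toy binary input vectors are all assumptions made for illustration, and the toy vectors merely stand in for the low-level video features used in the paper.

```python
import math
import random


def sigmoid(x):
    """Logistic function, written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


class RBM:
    """Binary RBM trained with CD-1 (hypothetical helper, illustration only)."""

    def __init__(self, n_visible, n_hidden, seed=0):
        self.rng = random.Random(seed)
        self.n_visible = n_visible
        self.n_hidden = n_hidden
        # Small random weights, zero biases.
        self.W = [[self.rng.gauss(0.0, 0.01) for _ in range(n_hidden)]
                  for _ in range(n_visible)]
        self.b_vis = [0.0] * n_visible
        self.b_hid = [0.0] * n_hidden

    def hidden_probs(self, v):
        """P(h_j = 1 | v) for every hidden unit."""
        return [sigmoid(self.b_hid[j] +
                        sum(v[i] * self.W[i][j] for i in range(self.n_visible)))
                for j in range(self.n_hidden)]

    def visible_probs(self, h):
        """P(v_i = 1 | h) for every visible unit."""
        return [sigmoid(self.b_vis[i] +
                        sum(h[j] * self.W[i][j] for j in range(self.n_hidden)))
                for i in range(self.n_visible)]

    def sample(self, probs):
        """Draw a binary state for each unit from its activation probability."""
        return [1.0 if self.rng.random() < p else 0.0 for p in probs]

    def cd1_update(self, v0, lr=0.1):
        """One CD-1 step on a single example; returns squared reconstruction error."""
        h0_prob = self.hidden_probs(v0)
        h0 = self.sample(h0_prob)
        v1_prob = self.visible_probs(h0)      # mean-field reconstruction
        h1_prob = self.hidden_probs(v1_prob)
        for i in range(self.n_visible):
            for j in range(self.n_hidden):
                # Data-driven statistics minus reconstruction-driven statistics.
                self.W[i][j] += lr * (v0[i] * h0_prob[j] - v1_prob[i] * h1_prob[j])
            self.b_vis[i] += lr * (v0[i] - v1_prob[i])
        for j in range(self.n_hidden):
            self.b_hid[j] += lr * (h0_prob[j] - h1_prob[j])
        return sum((a - b) ** 2 for a, b in zip(v0, v1_prob))


def train_stack(data, layer_sizes, epochs=50, lr=0.1):
    """Greedy layer-wise training: each RBM is fit to the previous layer's activations."""
    layers, inputs = [], data
    for k, n_hidden in enumerate(layer_sizes):
        rbm = RBM(len(inputs[0]), n_hidden, seed=k)
        for _ in range(epochs):
            for v in inputs:
                rbm.cd1_update(v, lr)
        layers.append(rbm)
        inputs = [rbm.hidden_probs(v) for v in inputs]
    return layers, inputs


# Toy stand-in for low-level descriptors: two repeated binary patterns.
toy = [[1, 1, 1, 0, 0, 0], [0, 0, 0, 1, 1, 1]] * 4
layers, features = train_stack(toy, [4, 2])
```

In a sketch like this, the top-layer activations in `features` would play the role of the learned descriptors and could be passed to any standard classifier; which classifier and which input features the paper actually uses is not stated in the abstract.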


Keywords

  • Hide Layer
  • Human Action Recognition
  • Restricted Boltzmann Machine
  • Deep Belief Network
  • Multilayer Architecture

These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Manuel Jesús Marín-Jiménez (1)
  • Nicolás Pérez de la Blanca (2)
  • María Ángeles Mendoza (2)

  1. University of Córdoba, Córdoba, Spain
  2. University of Granada, Granada, Spain
