sNN-LDS: Spatio-temporal Non-negative Sparse Coding for Human Action Recognition

  • Thomas Guthier
  • Adrian Šošić
  • Volker Willert
  • Julian Eggert
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8681)

Abstract

Current state-of-the-art approaches for visual human action recognition focus on complex local spatio-temporal descriptors, while the spatio-temporal relations between the descriptors are discarded. These bag-of-features (BOF) based methods have limited descriptive power, because class-specific mid- and large-scale spatio-temporal information, such as body pose sequences, cannot be represented. To overcome this restriction, we propose sparse non-negative linear dynamical systems (sNN-LDS) as a dynamic, parts-based, spatio-temporal representation of local descriptors. We provide novel learning rules based on sparse non-negative matrix factorization (sNMF) to simultaneously learn both the parts and their transitions. On the challenging UCF-Sports dataset, our sNN-LDS combined with simple local features is competitive with state-of-the-art BOF-SVM methods.
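
For readers unfamiliar with the sNMF building block named in the abstract, the following is a minimal sketch of generic sparse non-negative matrix factorization with multiplicative updates. It is not the sNN-LDS learning rule of the paper, which additionally learns transitions between parts. The function name `sparse_nmf`, the penalty weight `sparsity`, the unit-norm rescaling of the basis vectors, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np


def sparse_nmf(V, k, sparsity=0.1, n_iter=200, seed=0, eps=1e-9):
    """Factor V (features x samples) as W @ H with W, H >= 0 and an L1 penalty on H.

    Illustrative sketch only: a plain multiplicative-update scheme, not the
    learning rules proposed in the paper.
    """
    rng = np.random.default_rng(seed)
    n_features, n_samples = V.shape
    W = rng.random((n_features, k)) + eps
    H = rng.random((k, n_samples)) + eps
    for _ in range(n_iter):
        # Activation update: the penalty weight sits in the denominator and shrinks H.
        H *= (W.T @ V) / (W.T @ W @ H + sparsity + eps)
        # Basis update: standard multiplicative rule for the squared error.
        W *= (V @ H.T) / (W @ (H @ H.T) + eps)
        # Rescale each basis vector to (near) unit L2 norm; W @ H is unchanged.
        norms = np.linalg.norm(W, axis=0) + eps
        W /= norms
        H *= norms[:, None]
    return W, H


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    V = rng.random((20, 4)) @ rng.random((4, 50))  # synthetic non-negative data
    W, H = sparse_nmf(V, k=4)
    print("relative error:", np.linalg.norm(V - W @ H) / np.linalg.norm(V))
```

In this sketch the columns of `W` play the role of parts and the rows of `H` their sparse activations. Rescaling the basis vectors keeps the penalty on `H` from being bypassed by simply inflating `W`; other normalization schemes are possible.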

Keywords

Action Recognition, Local Descriptor, Sparse Code, Positive Matrix Factorization, Human Action Recognition
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Thomas Guthier (1)
  • Adrian Šošić (2)
  • Volker Willert (1)
  • Julian Eggert (3)

  1. Control Theory and Robotics, TU Darmstadt, Darmstadt, Germany
  2. Signal Processing Group, TU Darmstadt, Darmstadt, Germany
  3. Honda Research Institute Europe, Offenbach, Germany
