
Instant Action Recognition

  • Thomas Mauthner
  • Peter M. Roth
  • Horst Bischof
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5575)

Abstract

In this paper, we present an efficient system for action recognition from very short sequences. Action recognition typically analyzes the appearance and/or motion information of an action over a large number of frames. This is a limitation if very fast actions (e.g., in sports analysis) have to be analyzed. To overcome this limitation, we propose a method that uses a single-frame representation for actions based on appearance and motion information. In particular, we estimate Histograms of Oriented Gradients (HOGs) for the current frame as well as for the corresponding dense flow field. The resulting descriptors are efficiently represented by the coefficients of a Non-negative Matrix Factorization (NMF). Actions are classified using a one-vs-all Support Vector Machine. Since the flow can be estimated from two frames, only two consecutive frames are required for the action analysis in the evaluation stage. Both the optical flow and the HOGs can be computed very efficiently. In the experiments, we compare the proposed approach to state-of-the-art methods and show that it yields competitive results. In addition, we demonstrate action recognition for real-world beach-volleyball sequences.
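The descriptor and factorization steps described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the cell grid, bin count, normalization, the concatenation of the appearance and motion histograms, and the function names `orientation_histogram`, `describe`, and `nmf` are all assumptions made for illustration. The dense flow field is taken as a given input (random values stand in for it here), the NMF uses the standard multiplicative updates for the squared-error objective, and the final one-vs-all SVM is omitted.

```python
import numpy as np

def orientation_histogram(dx, dy, cells=(4, 4), bins=8, signed=False):
    """Per-cell orientation histogram of a 2-D vector field (dx, dy).

    Every vector votes into one orientation bin, weighted by its magnitude.
    Simplified: no overlapping blocks, one global L2 normalization.
    """
    mag = np.hypot(dx, dy)
    period = 2 * np.pi if signed else np.pi
    ang = np.mod(np.arctan2(dy, dx), period)
    idx = np.minimum((ang / period * bins).astype(int), bins - 1)
    h, w = mag.shape
    rows = np.linspace(0, h, cells[0] + 1).astype(int)
    cols = np.linspace(0, w, cells[1] + 1).astype(int)
    desc = []
    for r0, r1 in zip(rows[:-1], rows[1:]):
        for c0, c1 in zip(cols[:-1], cols[1:]):
            desc.append(np.bincount(idx[r0:r1, c0:c1].ravel(),
                                    weights=mag[r0:r1, c0:c1].ravel(),
                                    minlength=bins))
    desc = np.concatenate(desc)
    return desc / (np.linalg.norm(desc) + 1e-9)

def describe(frame, flow_u, flow_v):
    """Appearance histogram (image gradients) plus motion histogram (flow).

    Concatenating the two is an assumption of this sketch.
    """
    gy, gx = np.gradient(frame.astype(float))
    appearance = orientation_histogram(gx, gy, signed=False)
    motion = orientation_histogram(flow_u, flow_v, signed=True)
    return np.concatenate([appearance, motion])

def nmf(V, rank, iters=200, seed=0):
    """Factor non-negative V (features x samples) as W @ H.

    Uses multiplicative updates for the squared reconstruction error.
    """
    rng = np.random.default_rng(seed)
    W = rng.random((V.shape[0], rank)) + 1e-3
    H = rng.random((rank, V.shape[1])) + 1e-3
    eps = 1e-9  # guards against division by zero
    for _ in range(iters):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Synthetic stand-ins for ten (frame, flow) pairs; real input would be
# a video frame and the flow estimated between it and its neighbor.
rng = np.random.default_rng(1)
V = np.stack([describe(rng.random((32, 32)),
                       rng.standard_normal((32, 32)),
                       rng.standard_normal((32, 32)))
              for _ in range(10)], axis=1)
W, H = nmf(V, rank=3)
print(V.shape, W.shape, H.shape)
```

Each column of `H` holds the low-dimensional coefficients of one frame; in a full pipeline these coefficient vectors would be the features passed to the classifier.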

Keywords

Recognition Rate; Action Recognition; Motion Information; IEEE Conf; Human Action Recognition
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.


Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Thomas Mauthner (1)
  • Peter M. Roth (1)
  • Horst Bischof (1)
  1. Institute for Computer Graphics and Vision, Graz University of Technology, Graz, Austria
