Action Recognition in Video by Sparse Representation on Covariance Manifolds of Silhouette Tunnels

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 6388)

Abstract

A novel framework for action recognition in video using empirical covariance matrices of bags of low-dimensional feature vectors is developed. The feature vectors are extracted from segments of silhouette tunnels of moving objects and coarsely capture their shapes. The matrix logarithm is used to map the segment covariance matrices, which live in a nonlinear Riemannian manifold, to the vector space of symmetric matrices. A recently developed sparse linear representation framework for dictionary-based classification is then applied to the log-covariance matrices. The log-covariance matrix of a query segment is approximated by a sparse linear combination of the log-covariance matrices of training segments and the sparse coefficients are used to determine the action label of the query segment. This approach is tested on the Weizmann and the UT-Tower human action datasets. The new approach attains a segment-level classification rate of 96.74% for the Weizmann dataset and 96.15% for the UT-Tower dataset. Additionally, the proposed method is computationally and memory efficient and easy to implement.
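The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy 3-dimensional features, the `eps` regularizer, the ISTA solver for the l1-penalized fit, and the smallest-class-residual decision rule are all assumptions chosen for illustration (the abstract only states that the sparse coefficients determine the label). All function names are hypothetical.

```python
import numpy as np

def log_covariance(features, eps=1e-6):
    """Matrix logarithm of the (regularized) empirical covariance.
    features: (n_samples, d) bag of feature vectors from one segment."""
    d = features.shape[1]
    cov = np.cov(features, rowvar=False) + eps * np.eye(d)  # eps is an assumed regularizer
    w, v = np.linalg.eigh(cov)            # symmetric eigendecomposition
    return (v * np.log(w)) @ v.T          # V diag(log w) V^T

def vectorize(sym):
    """Stack the upper triangle; off-diagonals are scaled by sqrt(2)
    so the vector's Euclidean norm equals the matrix's Frobenius norm."""
    rows, cols = np.triu_indices(sym.shape[0])
    scale = np.where(rows == cols, 1.0, np.sqrt(2.0))
    return sym[rows, cols] * scale

def ista_lasso(D, y, lam=0.01, n_iter=500):
    """Approximately minimize 0.5*||y - D a||^2 + lam*||a||_1 with ISTA
    (an assumed stand-in for whatever l1 solver is actually used)."""
    step = 1.0 / np.linalg.norm(D, 2) ** 2   # 1 / Lipschitz constant of the gradient
    a = np.zeros(D.shape[1])
    for _ in range(n_iter):
        z = a - step * (D.T @ (D @ a - y))                        # gradient step
        a = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)  # soft threshold
    return a

def classify(D, labels, y, lam=0.01):
    """Assign the class whose training columns best reconstruct y
    using only that class's sparse coefficients (assumed decision rule)."""
    a = ista_lasso(D, y, lam)
    classes = np.unique(labels)
    residuals = [np.linalg.norm(y - D[:, labels == c] @ a[labels == c])
                 for c in classes]
    return classes[int(np.argmin(residuals))]

# Toy data: two "actions" whose features differ only in per-axis spread.
rng = np.random.default_rng(0)
scales = np.array([[1.0, 2.0, 3.0], [3.0, 2.0, 1.0]])

def toy_segment(label, n=500):
    return rng.normal(size=(n, 3)) * scales[label]

labels = np.array([0, 0, 0, 0, 1, 1, 1, 1])
D = np.column_stack([vectorize(log_covariance(toy_segment(c))) for c in labels])
query = vectorize(log_covariance(toy_segment(1)))
predicted = classify(D, labels, query)
```

In this sketch each training segment contributes one dictionary column, so the dictionary grows with the number of training segments while each column stays small (d(d+1)/2 entries for d-dimensional features).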

Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Guo, K., Ishwar, P., Konrad, J. (2010). Action Recognition in Video by Sparse Representation on Covariance Manifolds of Silhouette Tunnels. In: Ünay, D., Çataltepe, Z., Aksoy, S. (eds) Recognizing Patterns in Signals, Speech, Images and Videos. ICPR 2010. Lecture Notes in Computer Science, vol 6388. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-17711-8_30

  • DOI: https://doi.org/10.1007/978-3-642-17711-8_30

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-17710-1

  • Online ISBN: 978-3-642-17711-8

  • eBook Packages: Computer Science, Computer Science (R0)
