Abstract
Human action recognition from videos has drawn tremendous interest in recent years. In this work, we first show that, when the first two views are already matched and the fundamental matrix between them is known, the trifocal tensor is confined to a twelve-dimensional subspace of its original space; we refer to this constrained tensor as the subtensor. We then use the subtensor to perform action recognition across three views. We find that treating the two template views separately, i.e., ignoring the correspondences already known between the first two views, discards a great deal of useful information. Experiments and datasets are designed to demonstrate the effectiveness and improved performance of the proposed approach.
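For context, the standard trifocal tensor (whose constrained form the paper studies) can be built directly from three camera matrices in canonical form. The sketch below is not the paper's subtensor construction; it only illustrates the classical slice formula T_i = a_i b_4^T − a_4 b_i^T for cameras P1 = [I | 0], P2 = [A | a4], P3 = [B | b4], from which the tensor's 27 entries arise. The function name `trifocal_tensor` is our own.

```python
import numpy as np

def trifocal_tensor(P2, P3):
    """Trifocal tensor for cameras P1=[I|0], P2, P3 (3x4 each).

    Slice i is T_i = a_i b4^T - a4 b_i^T, where a_i, b_i are the
    i-th columns of P2 and P3 (classical multi-view geometry result).
    """
    T = np.zeros((3, 3, 3))
    for i in range(3):
        T[i] = np.outer(P2[:, i], P3[:, 3]) - np.outer(P2[:, 3], P3[:, i])
    return T

# Sanity check via the point-line-line incidence relation:
# for matched points x1, x2, x3 and any lines l2 through x2,
# l3 through x3, we have  l2^T (sum_i x1[i] * T_i) l3 = 0.
rng = np.random.default_rng(0)
P2 = rng.standard_normal((3, 4))
P3 = rng.standard_normal((3, 4))
T = trifocal_tensor(P2, P3)

X = rng.standard_normal(4)            # random 3D point (homogeneous)
x1, x2, x3 = X[:3], P2 @ X, P3 @ X    # projections in the three views
l2 = np.cross(x2, rng.standard_normal(3))  # a line through x2
l3 = np.cross(x3, rng.standard_normal(3))  # a line through x3

M = sum(x1[i] * T[i] for i in range(3))
residual = l2 @ M @ l3
print(abs(residual) < 1e-9)
```

The tensor has 27 entries but only 18 degrees of freedom; the abstract's claim is that fixing the fundamental matrix between the first two views cuts this down further, to a twelve-dimensional subspace.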
© 2012 Springer-Verlag Berlin Heidelberg
Liu, Q., Cao, X. (2012). Action Recognition Using Subtensor Constraint. In: Fitzgibbon, A., Lazebnik, S., Perona, P., Sato, Y., Schmid, C. (eds) Computer Vision – ECCV 2012. ECCV 2012. Lecture Notes in Computer Science, vol 7574. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-33712-3_55
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-33711-6
Online ISBN: 978-3-642-33712-3