Motion Based Foreground Detection and Poselet Motion Features for Action Recognition

Kraft, Erwin; Brox, Thomas

doi:10.1007/978-3-319-16814-2_23

Erwin Kraft¹⁷ &
Thomas Brox¹⁸

Part of the book series: Lecture Notes in Computer Science ((LNIP,volume 9007))

Included in the following conference series:

Asian Conference on Computer Vision

1638 Accesses

Abstract

For action recognition, the actor(s) and the tools they use as well as their motion are of central importance. In this paper, we propose separating foreground items of an action from the background on the basis of motion cues. As a consequence, separate descriptors can be defined for the foreground regions, while combined foreground-background descriptors still capture the context of an action. Also a low-dimensional global camera motion descriptor can be computed. Poselet activations in the foreground area indicate the actor and its pose. We propose tracking these poselets to obtain detailed motion features of the actor. Experiments on the Hollywood2 dataset show that foreground-background separation and the poselet motion features lead to consistently favorable results, both relative to the baseline and in comparison to the current state-of-the-art.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Log in via an institution

Chapter: USD 29.95; Price excludes VAT (USA)

eBook: USD 39.99; Price excludes VAT (USA)

Softcover Book: USD 54.99; Price excludes VAT (USA)

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

References

Marzałek, M., Laptev, I., Schmid, C.: Actions in context. In: Computer Vision and Pattern Recognition (CVPR), pp. 2929–2936 (2009)
Google Scholar
Laptev, I., Lindeberg, T.: Space-time interest points. In: International Conference on Computer Vision (ICCV), pp. 432–439 (2003)
Google Scholar
Wang, H., Kläser, A., Schmid, C., Liu, C.-L.: Action recognition by dense trajectories. In: Computer Vision and Pattern Recognition (CVPR), pp. 3169–3176 (2011)
Google Scholar
Jiang, Y.-G., Dai, Q., Xue, X., Liu, W., Ngo, C.-W.: Trajectory-based modeling of human actions with motion reference points. In: Fitzgibbon, A., Lazebnik, S., Perona, P., Sato, Y., Schmid, C. (eds.) ECCV 2012, Part V. LNCS, vol. 7576, pp. 425–438. Springer, Heidelberg (2012)
Chapter Google Scholar
Jain, M., Jégou, H. Bouthemy, P.: Better exploiting motion for better action recognition. In: Computer Vision and Pattern Recognition (CVPR), pp. 2555–2562 (2013)
Google Scholar
Wang, H., Schmid, C.: Action recognition with improved trajectories. In: International Conference on Computer Vision (ICCV), pp. 3551–3558 (2013)
Google Scholar
Ullah, M.M., Parizi, S.N., Laptev, I.: Improving bag-of-features action recognition with non-local cues. In: British Machine Vision Conference (BMVC), pp. 1–11 (2010)
Google Scholar
Prest, A., Schmid, C., Ferrari, V.: Weakly supervised learning of interactions between humans and objects. In: Pattern Analysis and Machine Intelligence (PAMI), pp. 601–614 (2012)
Google Scholar
Pascal Visual Object Challenge. http://pascallin.ecs.soton.ac.uk/challenges/VOC/
Dalal, N., Triggs, B.: Histograms of oriented gradients for human detection. In: Computer Vision and Pattern Recognition (CVPR), pp. 886–893 (2005)
Google Scholar
Felzenszwalb, P., Girshick, R., McAllester, D., Ramanan, D.: Object detection with discriminatively trained part based models. IEEE Trans. Pattern Anal. Mach. Intell. 32, 1627–1645 (2010)
Article Google Scholar
Bourdev, L., Maji, S., Brox, T., Malik, J.: Detecting people using mutually consistent poselet activations. In: Daniilidis, K., Maragos, P., Paragios, N. (eds.) ECCV 2010, Part VI. LNCS, vol. 6316, pp. 168–181. Springer, Heidelberg (2010)
Chapter Google Scholar
Brox, T., Malik, J.: Object segmentation by long term analysis of point trajectories. In: Daniilidis, K., Maragos, P., Paragios, N. (eds.) ECCV 2010, Part V. LNCS, vol. 6315, pp. 282–295. Springer, Heidelberg (2010)
Chapter Google Scholar
Vig, E., Dorr, M., Cox, D.: Space-variant descriptor sampling for action recognition based on saliency and eye movements. In: Fitzgibbon, A., Lazebnik, S., Perona, P., Sato, Y., Schmid, C. (eds.) ECCV 2012, Part VII. LNCS, vol. 7578, pp. 84–97. Springer, Heidelberg (2012)
Chapter Google Scholar
Yang, W., Wang, Y., Mori, G.: Recognizing human actions from still images with latent poses. In: Computer Vision and Pattern Recognition (CVPR), pp. 2030–2037 (2010)
Google Scholar
Maji, S., Bourdev, L., Malik, J.: Action recognition from a distributed representation of pose and appearance. In: Computer Vision and Pattern Recognition (CVPR), pp. 3177–3184 (2011)
Google Scholar
Sundaram, N., Brox, T., Keutzer, K.: Dense point trajectories by GPU-accelerated large displacement optical flow. In: Daniilidis, K., Maragos, P., Paragios, N. (eds.) ECCV 2010, Part I. LNCS, vol. 6311, pp. 438–451. Springer, Heidelberg (2010)
Chapter Google Scholar
Tomasi, C., Kanade, T.: Shape and motion from image streams under ortography: a factorization method. Int. J. Comput. Vis. 9, 137–154 (1992)
Article Google Scholar
Sheikh, Y., Javed, O., Kanade, T.: Background subtraction for freely moving cameras. In: International Conference on Computer Vision (ICCV), pp. 1219–1225 (2009)
Google Scholar
Kolmogorov, V., Zabih, R.: What energy functions can be minimized via graph cuts? IEEE Trans. Pattern Anal. Mach. Intell. 26, 147–159 (2004)
Article Google Scholar
Jhuang, H., Gall, J., Zuffi, S., Schmid, C., Black, M.J.: Towards understanding action recognition. In: International Conference on Computer Vision (ICCV), pp. 3192–3199 (2013)
Google Scholar
Zuffi, S., Black, M.J.: From pictorial structures to deformable structures. In: Computer Vision and Pattern Recognition (CVPR), pp. 3546–3553 (2012)
Google Scholar
Arandjelović, R., Zisserman, A.: Three things everyone should know to improve object retrieval. In: Computer Vision and Pattern Recognition (CVPR), pp. 2911–2918 (2012)
Google Scholar
Dalal, N., Triggs, B., Schmid, C.: Human detection using oriented histograms of flow and appearance. In: Leonardis, A., Bischof, H., Pinz, A. (eds.) ECCV 2006. LNCS, vol. 3952, pp. 428–441. Springer, Heidelberg (2006)
Chapter Google Scholar
Jégou, H., Perronnin, F., Douze, M., Sánchez, J., Pérez, P., Schmid, C.: Aggregating local image descriptors into compact codes. IEEE Trans. Pattern Anal. Mach. Intell. 34, 1704–1716 (2012)
Article Google Scholar
Gilbert, A., Illingworth, J., Bowden, R.: Action recognition using mined hierarchical compound features. IEEE Trans. Pattern Anal. Mach. Intell. 33, 883–897 (2011)
Article Google Scholar
Oneaţă, D., Verbeek, J., Schmid, C.: Action and event recognition with fisher vectors on a compact feature set. In: International Conference on Computer Vision (ICCV), pp. 1817–1824 (2013)
Google Scholar
Niebles, J.C., Chen, C.-W., Fei-Fei, L.: Modeling temporal structure of decomposable motion segments for activity classification. In: Daniilidis, K., Maragos, P., Paragios, N. (eds.) ECCV 2010, Part II. LNCS, vol. 6312, pp. 392–405. Springer, Heidelberg (2010)
Chapter Google Scholar
Hassner, T.: A critical review of action recognition benchmarks. In: Computer Vision and Pattern Recognition Workshops (CVPRW), pp. 245–250 (2013)
Google Scholar

Download references

Author information

Authors and Affiliations

Fraunhofer ITWM, Fraunhofer-Platz 1, Kaiserslautern, Germany
Erwin Kraft
University of Freiburg, Georges-Köhler-Allee 52, Freiburg, Germany
Thomas Brox

Authors

Erwin Kraft
View author publications
You can also search for this author in PubMed Google Scholar
Thomas Brox
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Erwin Kraft .

Editor information

Editors and Affiliations

Technische Universität München, Garching, Germany
Daniel Cremers
University of Adelaide, Adelaide, South Australia, Australia
Ian Reid
Keio University, Yokohama, Kanagawa, Japan
Hideo Saito
University of California at Merced, Merced, California, USA
Ming-Hsuan Yang

Rights and permissions

Reprints and permissions

Copyright information

About this paper

Cite this paper

Kraft, E., Brox, T. (2015). Motion Based Foreground Detection and Poselet Motion Features for Action Recognition. In: Cremers, D., Reid, I., Saito, H., Yang, MH. (eds) Computer Vision -- ACCV 2014. ACCV 2014. Lecture Notes in Computer Science(), vol 9007. Springer, Cham. https://doi.org/10.1007/978-3-319-16814-2_23

Download citation

DOI: https://doi.org/10.1007/978-3-319-16814-2_23
Published: 17 April 2015
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-16813-5
Online ISBN: 978-3-319-16814-2
eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics