Abstract
For action recognition, the actor(s), the tools they use, and their motion are of central importance. In this paper, we propose separating the foreground items of an action from the background on the basis of motion cues. As a consequence, separate descriptors can be defined for the foreground regions, while combined foreground-background descriptors still capture the context of an action. In addition, a low-dimensional global camera motion descriptor can be computed. Poselet activations in the foreground area indicate the actor and the actor's pose. We propose tracking these poselets to obtain detailed motion features of the actor. Experiments on the Hollywood2 dataset show that foreground-background separation and the poselet motion features lead to consistently favorable results, both relative to the baseline and in comparison to the current state of the art.
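To make the idea of motion-based foreground separation concrete, the following is a minimal sketch and not the method of the paper. It assumes a dense flow field is already available and that camera motion is a pure translation, estimated here as the median flow vector; pixels whose flow deviates from that estimate are marked as foreground. The function name `split_foreground`, the `threshold` parameter, and the synthetic flow field are all hypothetical choices made for illustration.

```python
import numpy as np

def split_foreground(flow, threshold=1.0):
    """Toy motion-based foreground/background split.

    flow: array of shape (H, W, 2) holding per-pixel displacement (dx, dy).
    Assumption (not from the paper): camera motion is a pure translation,
    estimated as the per-component median of the flow field.
    """
    # The median over all pixels acts as a 2-D global camera motion estimate.
    camera_motion = np.median(flow.reshape(-1, 2), axis=0)
    # Residual motion magnitude after compensating for the camera estimate.
    residual = np.linalg.norm(flow - camera_motion, axis=2)
    # Pixels that move differently from the camera are treated as foreground.
    foreground = residual > threshold
    return foreground, camera_motion

# Synthetic example: the background pans right by 2 px per frame,
# while a small 2x2 block moves in a different direction.
flow = np.zeros((6, 8, 2))
flow[..., 0] = 2.0
flow[2:4, 3:5] = [-1.0, 3.0]

mask, cam = split_foreground(flow)
```

In this sketch `mask` could be used to pool descriptors separately over foreground and background, and `cam` plays the role of a very crude global camera motion descriptor. A real pipeline would need a richer camera model and a spatially regularized segmentation.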
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Kraft, E., Brox, T. (2015). Motion Based Foreground Detection and Poselet Motion Features for Action Recognition. In: Cremers, D., Reid, I., Saito, H., Yang, M.-H. (eds) Computer Vision -- ACCV 2014. Lecture Notes in Computer Science, vol 9007. Springer, Cham. https://doi.org/10.1007/978-3-319-16814-2_23
DOI: https://doi.org/10.1007/978-3-319-16814-2_23
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-16813-5
Online ISBN: 978-3-319-16814-2
eBook Packages: Computer Science, Computer Science (R0)