Camera Motion and Surrounding Scene Appearance as Context for Action Recognition

Heilbron, Fabian Caba; Thabet, Ali; Niebles, Juan Carlos; Ghanem, Bernard

doi:10.1007/978-3-319-16817-3_38

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 9006)

Included in the following conference series:

Asian Conference on Computer Vision

Abstract

This paper describes a framework for recognizing human actions in videos by incorporating a new set of visual cues that represent the context of the action. We develop a weak foreground-background segmentation approach to robustly extract not only foreground features focused on the actors, but also global camera motion and contextual scene information. Using dense point trajectories, our approach separates the foreground motion from the background and describes it, represents the appearance of the extracted static background, and encodes the global camera motion, which, interestingly, is shown to be discriminative for certain action classes. Our experiments on four challenging benchmarks (HMDB51, Hollywood2, Olympic Sports, and UCF50) show that our contextual features enable a significant performance improvement over state-of-the-art algorithms.
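
To make the idea of a weak foreground-background split concrete, the following is a minimal sketch, not the authors' implementation. The abstract does not state which motion model or thresholds are used, so everything here is an assumption for illustration: camera motion between two frames is approximated by a 2D affine map, the map is fitted by trimmed least squares (refitting on the half of the points with the smallest residuals, which assumes the background is the majority), and a point is called foreground when its residual exceeds a hypothetical `thresh` in pixels. The function name `split_foreground_background` is likewise invented for this example.

```python
import numpy as np


def split_foreground_background(src, dst, n_iters=5, thresh=1.0):
    """Label tracked points as foreground or background (illustrative sketch).

    src, dst : (N, 2) arrays with point positions in two consecutive frames.
    n_iters  : number of trimmed refits (assumed value).
    thresh   : residual in pixels above which a point is foreground (assumed).
    Returns (is_foreground, params), where params is the (3, 2) affine map
    such that [x, y, 1] @ params approximates the camera-induced position.
    """
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    X = np.hstack([src, np.ones((len(src), 1))])  # homogeneous coordinates
    keep = np.ones(len(src), dtype=bool)
    for _ in range(n_iters):
        # Fit the affine camera model on the currently trusted points.
        params, *_ = np.linalg.lstsq(X[keep], dst[keep], rcond=None)
        residual = np.linalg.norm(X @ params - dst, axis=1)
        # Trust only the better-fitting half in the next round.
        keep = residual <= np.median(residual)
    params, *_ = np.linalg.lstsq(X[keep], dst[keep], rcond=None)
    residual = np.linalg.norm(X @ params - dst, axis=1)
    return residual > thresh, params


# Synthetic usage: a 10 x 10 grid of points, camera translation of (3, -1),
# and every fifth point carrying an extra "actor" motion of (8, 8).
grid = np.array([[10.0 * j, 10.0 * i] for i in range(10) for j in range(10)])
actor = np.arange(len(grid)) % 5 == 0
moved = grid + np.array([3.0, -1.0])
moved[actor] += np.array([8.0, 8.0])
is_fg, camera = split_foreground_background(grid, moved)
```

In a pipeline of the kind the abstract outlines, the foreground points would feed actor-centred motion descriptors, the background points would indicate where to describe static scene appearance, and the fitted parameters would serve as a compact camera-motion cue. How the paper actually computes each of these is not recoverable from the abstract.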

Acknowledgment

Research reported in this publication was supported by competitive research funding from King Abdullah University of Science and Technology (KAUST). F.C.H. was also supported by a COLCIENCIAS Young Scientist and Innovator Fellowship. J.C.N. is supported by a Microsoft Research Faculty Fellowship.

Author information

Authors and Affiliations

King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia
Fabian Caba Heilbron, Ali Thabet & Bernard Ghanem
Universidad del Norte, Barranquilla, Colombia
Fabian Caba Heilbron & Juan Carlos Niebles

Corresponding author

Correspondence to Bernard Ghanem.

Editor information

Editors and Affiliations

Technische Universität München, Garching, Bayern, Germany
Daniel Cremers
University of Adelaide, Adelaide, South Australia, Australia
Ian Reid
Keio University, Yokohama, Kanagawa, Japan
Hideo Saito
University of California at Merced, Merced, California, USA
Ming-Hsuan Yang

About this paper

Cite this paper

Heilbron, F.C., Thabet, A., Niebles, J.C., Ghanem, B. (2015). Camera Motion and Surrounding Scene Appearance as Context for Action Recognition. In: Cremers, D., Reid, I., Saito, H., Yang, M.-H. (eds) Computer Vision – ACCV 2014. ACCV 2014. Lecture Notes in Computer Science, vol 9006. Springer, Cham. https://doi.org/10.1007/978-3-319-16817-3_38

DOI: https://doi.org/10.1007/978-3-319-16817-3_38
Published: 17 April 2015
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-16816-6
Online ISBN: 978-3-319-16817-3
eBook Packages: Computer Science, Computer Science (R0)