Human Action Recognition Based on Oriented Motion Salient Regions

  • Baoxin Wu
  • Shuang Yang
  • Chunfeng Yuan
  • Weiming Hu
  • Fangshi Wang
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9008)


Motion is the most informative cue for human action recognition. Regions with high motion saliency indicate where actions occur and contain the visual information most relevant to those actions. In this paper, we propose a novel approach to human action recognition based on oriented motion salient regions (OMSRs). First, we apply a bank of 3D Gabor filters and an opponent inhibition operator to detect the OMSRs of a video, each of which corresponds to a specific motion direction. Then, we propose a new low-level feature, the oriented motion salient descriptor (OMSD), which describes the detected OMSRs through the statistics of the texture in those regions. Next, we use the OMSDs to explore the oriented characteristics of the action classes and generate a set of class-specific oriented attributes (CSOAs) for each class. These CSOAs provide a compact and discriminative mid-level representation of human actions. Finally, an SVM classifier is used for action classification, and a new compatibility function is devised to measure how well a given action matches the CSOAs of a given class. We test the proposed approach on four public datasets, and the experimental results validate its effectiveness.
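The first step of the abstract (a bank of 3D Gabor filters followed by opponent inhibition) can be illustrated with a minimal NumPy sketch. The abstract does not give the filter parameters or the exact form of the inhibition operator, so everything below is an assumption made for illustration: the isotropic Gaussian envelope, the quadrature-pair motion energy, the half-wave-rectified subtraction of the opposite direction's energy, and all names (`gabor_3d`, `motion_energy`, `oriented_saliency`, `alpha`) are hypothetical and not taken from the paper.

```python
import numpy as np

def gabor_3d(size, sigma, freq, theta, speed, phase):
    """Spatiotemporal Gabor kernel tuned to motion along direction theta.

    Assumed form: isotropic Gaussian envelope times a cosine carrier that
    drifts at `speed` pixels per frame. Axes are (time, row, column).
    """
    r = np.arange(size) - size // 2
    t, y, x = np.meshgrid(r, r, r, indexing="ij")
    envelope = np.exp(-(x**2 + y**2 + t**2) / (2.0 * sigma**2))
    drift = x * np.cos(theta) + y * np.sin(theta) - speed * t
    return envelope * np.cos(2.0 * np.pi * freq * drift + phase)

def motion_energy(video, theta, size=9, sigma=2.0, freq=0.25, speed=1.0):
    """Sum of squared responses of an even/odd (quadrature) filter pair."""
    axes = (0, 1, 2)
    spectrum = np.fft.fftn(video, axes=axes)
    energy = np.zeros(video.shape)
    for phase in (0.0, np.pi / 2.0):
        kernel = gabor_3d(size, sigma, freq, theta, speed, phase)
        # Circular convolution via the FFT; the kernel is zero-padded
        # to the video size, which is enough for this illustration.
        k_spec = np.fft.fftn(kernel, s=video.shape, axes=axes)
        response = np.fft.ifftn(spectrum * k_spec, axes=axes).real
        energy += response**2
    return energy

def oriented_saliency(video, n_directions=4, alpha=1.0):
    """One saliency map per direction (n_directions must be even).

    Assumed opponent inhibition: subtract the energy of the opposite
    direction, scaled by alpha, and keep only the positive part.
    """
    thetas = [2.0 * np.pi * k / n_directions for k in range(n_directions)]
    energies = [motion_energy(video, th) for th in thetas]
    half = n_directions // 2
    return [
        np.maximum(energies[k] - alpha * energies[(k + half) % n_directions], 0.0)
        for k in range(n_directions)
    ]

# Toy input: a grating drifting toward increasing column index at 1 px/frame.
t, y, x = np.meshgrid(np.arange(16), np.arange(16), np.arange(16), indexing="ij")
video = np.cos(2.0 * np.pi * 0.25 * (x - t))
maps = oriented_saliency(video, n_directions=4)
```

On this toy grating, the map for direction 0 (motion toward increasing column index) carries most of the saliency, while the map for the opposite direction is suppressed by the rectified subtraction. Thresholding each map would give one set of candidate regions per motion direction, which is the role the OMSRs play in the abstract.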


Keywords (machine-generated, not supplied by the authors): Video Sequence, Action Class, Gabor Filter, Human Action Recognition, Compatibility Function



This work is partly supported by the 973 Basic Research Program of China (Grant No. 2014CB349303), the National 863 High-Tech R&D Program of China (Grant No. 2012AA012504), the Natural Science Foundation of Beijing (Grant No. 4121003), the Guangdong Natural Science Foundation (Grant No. S2012020011081), and the NSFC (Grant Nos. 61100099 and 61303086).



Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Baoxin Wu (1)
  • Shuang Yang (1)
  • Chunfeng Yuan (1)
  • Weiming Hu (1, corresponding author)
  • Fangshi Wang (2)

  1. NLPR, Institute of Automation, Chinese Academy of Sciences, Beijing, China
  2. School of Software, Beijing Jiaotong University, Beijing, China
