Atomic Action Features: A New Feature for Action Recognition

  • Qiang Zhou
  • Gang Wang
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7583)


We introduce atomic action based features and demonstrate that they consistently improve performance on human activity recognition. The features are built using auxiliary atomic action data collected in our lab. We train a kernelized SVM classifier for each atomic action class. Then, given a local spatio-temporal cuboid of a test video, we represent it using the responses of our atomic action classifiers. This new atomic action feature is discriminative and carries semantic meaning. We perform extensive experiments on four benchmark action recognition datasets. The results show that atomic action features either outperform the corresponding low-level features or, when combined with them, significantly boost recognition performance.
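The pipeline described in the abstract (one kernel classifier per atomic action class, then a cuboid represented by the vector of classifier responses) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: to stay dependency-free it uses a kernel perceptron with an RBF kernel as a stand-in for the kernelized SVM, the 2-D "descriptors" are toy values rather than real low-level cuboid descriptors, and all names (`train_atomic_classifiers`, `atomic_action_feature`, the class labels, `gamma`) are hypothetical.

```python
import math

def rbf_kernel(x, y, gamma=5.0):
    """Gaussian (RBF) kernel between two descriptor vectors (gamma is an assumed value)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))

def decision_value(x, X, y, alpha):
    """Kernel expansion sum_j alpha_j * y_j * k(x_j, x): the classifier response."""
    return sum(a * yj * rbf_kernel(xj, x) for a, yj, xj in zip(alpha, y, X))

def train_kernel_perceptron(X, y, epochs=20):
    """Binary kernel perceptron (stand-in for a kernelized SVM); y holds +1/-1."""
    alpha = [0.0] * len(X)
    for _ in range(epochs):
        for i, xi in enumerate(X):
            # Update the dual coefficient whenever example i is misclassified.
            if y[i] * decision_value(xi, X, y, alpha) <= 0:
                alpha[i] += 1.0
    return alpha

def train_atomic_classifiers(X, labels):
    """Train one one-vs-rest classifier per atomic action class."""
    classes = sorted(set(labels))
    models = {}
    for c in classes:
        y = [1 if label == c else -1 for label in labels]
        models[c] = (y, train_kernel_perceptron(X, y))
    return classes, models

def atomic_action_feature(x, X, classes, models):
    """Represent one cuboid descriptor by the responses of all atomic classifiers."""
    return [decision_value(x, X, *models[c]) for c in classes]

# Toy auxiliary data: two descriptors per hypothetical atomic action class.
X_train = [[0.0, 0.1], [0.1, 0.0], [1.0, 0.9], [0.9, 1.0], [0.0, 1.0], [0.1, 0.9]]
labels = ["wave", "wave", "kick", "kick", "bend", "bend"]

classes, models = train_atomic_classifiers(X_train, labels)

# Descriptor of a cuboid from a test video (toy value close to the "wave" examples).
feature = atomic_action_feature([0.05, 0.05], X_train, classes, models)
print(dict(zip(classes, feature)))
```

Each dimension of `feature` corresponds to a named atomic action, which is what gives the representation its semantic interpretation; the resulting vectors can then be used in place of, or alongside, the low-level descriptors in a downstream recognition model.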



Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Qiang Zhou (1)
  • Gang Wang (1, 2)
  1. Advanced Digital Sciences Center, Singapore
  2. Nanyang Technological University, Singapore
