
Projection transform on spatio-temporal context for action recognition

Multimedia Tools and Applications

Abstract

This paper addresses human action recognition, a task important to applications such as video surveillance and video retrieval. Most existing approaches to human action analysis based on local interest points lose information about the spatio-temporal distribution of features and neglect the relationship between features and each defined action. In this paper, by analyzing the distribution of features and their interactions over the spatio-temporal domain, we propose a novel projection transform that takes both factors into account. In our approach, a video sequence of a human action is modeled by three types of features derived from spatio-temporal interest points: the global projection transform feature, the relative position distribution feature, and the bag-of-visual-words feature. A new context-K-nearest-neighbor classifier then fuses them into discriminative feature sets for action matching. In most cases, our experiments indicate that the proposed method outperforms previously published results on the Weizmann and KTH datasets.
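The pipeline outlined in the abstract can be illustrated with a minimal sketch. The abstract does not define the projection transform, the relative position distribution, or the context-K-nearest-neighbor classifier, so the code below is an assumption-laden stand-in rather than the authors' method: it builds a standard bag-of-visual-words histogram, treats the other two features as opaque vectors, and fuses all three with a plain k-nearest-neighbor vote over a weighted sum of per-feature distances. All names (`bow_histogram`, `fused_distance`, `knn_classify`), the toy codebook, and the toy feature values are hypothetical.

```python
import math
from collections import Counter

def bow_histogram(descriptors, codebook):
    """Assign each descriptor to its nearest codeword and return an
    L1-normalized histogram of codeword counts."""
    counts = [0] * len(codebook)
    for d in descriptors:
        nearest = min(range(len(codebook)), key=lambda k: math.dist(d, codebook[k]))
        counts[nearest] += 1
    total = sum(counts) or 1  # avoid division by zero for empty videos
    return [c / total for c in counts]

def fused_distance(a, b, weights):
    """Weighted sum of Euclidean distances, one term per feature type.
    This late-fusion rule is an assumption, not the paper's classifier."""
    return sum(w * math.dist(a[name], b[name]) for name, w in weights.items())

def knn_classify(query, training, weights, k=3):
    """Plain k-NN majority vote using the fused distance."""
    ranked = sorted(training, key=lambda item: fused_distance(query, item[0], weights))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

# Toy data: a 2-word codebook and four 2-D local descriptors (hypothetical).
codebook = [(0.0, 0.0), (1.0, 1.0)]
descriptors = [(0.1, 0.0), (0.9, 1.0), (1.0, 0.9), (0.0, 0.2)]
query_bow = bow_histogram(descriptors, codebook)

# Each training video carries three feature vectors; the values are made up.
training = [
    ({"projection": [1.0, 0.0], "position": [0.2, 0.8], "bow": [0.5, 0.5]}, "walk"),
    ({"projection": [0.9, 0.1], "position": [0.3, 0.7], "bow": [0.6, 0.4]}, "walk"),
    ({"projection": [0.0, 1.0], "position": [0.8, 0.2], "bow": [0.1, 0.9]}, "wave"),
    ({"projection": [0.1, 0.9], "position": [0.7, 0.3], "bow": [0.2, 0.8]}, "wave"),
]
weights = {"projection": 1.0, "position": 1.0, "bow": 1.0}
query = {"projection": [0.95, 0.05], "position": [0.25, 0.75], "bow": query_bow}
predicted = knn_classify(query, training, weights, k=3)
```

Equal weights are used here only for simplicity; in practice the relative weight of each feature type would need to be tuned or learned, and the paper's classifier may combine the features quite differently.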



Acknowledgments

This work is supported by the NSFC 61273274, 973 Program 2011CB302203, National Key Technology R&D Program of China 2012BAH01F03, NSFB4123104, Z131110001913143, Tsinghua-Tencent Joint Lab for IIT and Beijing Jiaotong University Research Foundation Program KKJB14029536.

Author information

Corresponding author

Correspondence to Zhenjiang Miao.

About this article

Cite this article

Xu, W., Miao, Z. & Zhang, Q. Projection transform on spatio-temporal context for action recognition. Multimed Tools Appl 74, 7711–7728 (2015). https://doi.org/10.1007/s11042-014-2007-1

  • DOI: https://doi.org/10.1007/s11042-014-2007-1
