
Complex human activities recognition using interval temporal syntactic model

  • Mechanical Engineering, Control Science and Information Engineering
  • Published in: Journal of Central South University

Abstract

A novel method based on an interval temporal syntactic model is proposed to recognize human activities in video streams. The method consists of two parts: feature extraction and activity recognition. A trajectory shape descriptor, speeded-up robust features (SURF) and histograms of optical flow (HOF) are used to represent human activities, providing more comprehensive information on shape, structure and motion. In the recognition stage, a probabilistic latent semantic analysis (PLSA) model first recognizes simple activities. Then, an interval temporal syntactic model, which combines a syntactic model with interval algebra to explicitly model the temporal dependencies between activities, recognizes complex activities that involve temporal relationships. Experimental results on public datasets show the effectiveness of the proposed method for complex activity recognition in comparison with other state-of-the-art methods.
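The temporal dependencies mentioned in the abstract are modeled with Allen's interval algebra, which classifies the relationship between two time intervals into one of thirteen relations (before, meets, overlaps, etc.). As a minimal illustrative sketch (not the authors' implementation; interval representation and relation names are assumptions), the relation between two detected sub-activity intervals could be computed as follows:

```python
# Illustrative sketch of Allen's thirteen interval relations, used by the
# interval temporal syntactic model to encode temporal dependencies between
# recognized sub-activities. An interval is a (start, end) pair with start < end.

def allen_relation(a, b):
    """Return the Allen relation of interval a with respect to interval b."""
    a_s, a_e = a
    b_s, b_e = b
    if a_e < b_s:
        return "before"
    if a_e == b_s:
        return "meets"
    if a_s == b_s and a_e == b_e:
        return "equals"
    if a_s == b_s:
        return "starts" if a_e < b_e else "started-by"
    if a_e == b_e:
        return "finishes" if a_s > b_s else "finished-by"
    if b_s < a_s and a_e < b_e:
        return "during"
    if a_s < b_s and b_e < a_e:
        return "contains"
    if a_s < b_s < a_e < b_e:
        return "overlaps"
    # inverse relations
    if b_e < a_s:
        return "after"
    if b_e == a_s:
        return "met-by"
    if b_s < a_s < b_e < a_e:
        return "overlapped-by"
    raise ValueError("invalid intervals")
```

For example, a "walk" interval (0, 4) followed immediately by a "sit" interval (4, 9) yields "meets", while two overlapping sub-activities such as (2, 6) and (4, 9) yield "overlaps"; a grammar over such relation symbols is what lets the syntactic model parse complex activities.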



Author information

Correspondence to Li-min Xia  (夏利民).

Additional information

Foundation item: Project(50808025) supported by the National Natural Science Foundation of China; Project(20090162110057) supported by the Doctoral Fund of Ministry of Education, China


About this article

Cite this article

Xia, L.-m., Han, F. & Wang, J. Complex human activities recognition using interval temporal syntactic model. J. Cent. South Univ. 23, 2578–2586 (2016). https://doi.org/10.1007/s11771-016-3319-2

