
Human interaction recognition based on sparse representation of feature covariance matrices



Abstract

A new method for human interaction recognition based on the sparse representation of feature covariance matrices was presented. Firstly, the dense trajectories (DT) extracted from the video were clustered into different groups to eliminate irrelevant trajectories, which greatly reduced the influence of noise on feature extraction. Then, the trajectory tunnels were characterized by feature covariance matrices, yielding discriminative descriptors and effectively remedying the insufficient description of second-order feature statistics in earlier descriptors. After that, an over-complete dictionary was learned from the descriptors, and all descriptors were encoded using sparse coding (SC). Classification was achieved with multiple instance learning (MIL), which is better suited to complex environments. The proposed method was tested and evaluated on the WEB-Interaction and UT-Interaction datasets, and the experimental results demonstrated its effectiveness.
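A minimal sketch of the descriptor-and-coding stage described above, in Python: per-point features of one trajectory group are summarized by a covariance matrix, mapped into a vector through the matrix logarithm (a common way to handle symmetric positive-definite matrices in Euclidean coding schemes, assumed here rather than taken from the paper), and then sparse-coded against an over-complete dictionary learned with scikit-learn's DictionaryLearning. The feature dimensionality, group sizes, and dictionary size are illustrative placeholders, not the paper's settings.

```python
# Illustrative sketch (not the authors' code): covariance descriptor for one
# trajectory group, followed by sparse coding against a learned dictionary.
import numpy as np
from sklearn.decomposition import DictionaryLearning

def covariance_descriptor(features):
    """features: (n_points, d) array of per-point trajectory features.
    Returns the upper-triangular part of log(C), which maps the SPD
    covariance matrix into a Euclidean space suitable for sparse coding."""
    C = np.cov(features, rowvar=False) + 1e-6 * np.eye(features.shape[1])
    w, V = np.linalg.eigh(C)               # C is symmetric positive definite
    log_C = V @ np.diag(np.log(w)) @ V.T   # matrix logarithm via eigendecomposition
    iu = np.triu_indices(log_C.shape[0])
    return log_C[iu]

# Toy data: 50 trajectory groups, each with 200 points of 10-D features.
rng = np.random.default_rng(0)
groups = [rng.normal(size=(200, 10)) for _ in range(50)]
X = np.stack([covariance_descriptor(g) for g in groups])

# Learn an over-complete dictionary and obtain a sparse code for each descriptor.
dico = DictionaryLearning(n_components=2 * X.shape[1],
                          transform_algorithm="omp",
                          transform_n_nonzero_coefs=5,
                          random_state=0)
codes = dico.fit_transform(X)
print(codes.shape)   # (50, 2 * descriptor_dim)
```

The log-Euclidean vectorization above is only one of several ways to bring covariance matrices into a vector space; the paper's exact feature set and dictionary-learning settings may differ.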

Abstract (translated from the Chinese)

Human action recognition is an important research direction in computer vision and pattern recognition, with broad application prospects in surveillance systems, human-computer interaction and artificial intelligence. This paper proposes an interaction recognition method based on the sparse representation of feature covariance matrices. First, the dense trajectories extracted from a video are clustered into different trajectory groups to eliminate irrelevant trajectories and reduce the influence of noise on feature extraction. The trajectory tunnels are then described by covariance matrices, which yields highly discriminative tunnel descriptors of lower dimensionality and effectively remedies the insufficient description of second-order feature statistics in earlier descriptors; the descriptors are subsequently encoded by sparse representation. Finally, multiple instance learning is used for classification. Experiments on the UT-Interaction and WEB-Interaction datasets demonstrate the effectiveness of the proposed method.
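The final classification step can be pictured with a toy multiple-instance setup (illustrative only, not the MIL formulation used in the paper): each video is treated as a bag of sparse-coded trajectory-group descriptors, bags are max-pooled into a single vector, and a linear SVM separates the classes. The synthetic bag generator, the max-pooling choice, and the LinearSVC classifier are assumptions made for this sketch.

```python
# Toy multiple-instance setup (illustrative, not the paper's exact MIL model):
# each video is a bag of sparse codes, one per trajectory group; bags are
# max-pooled into a single vector and classified with a linear SVM.
import numpy as np
from sklearn.svm import LinearSVC

rng = np.random.default_rng(1)

def make_bag(label, n_instances=8, code_dim=110):
    """Synthetic bag: sparse codes of the trajectory groups of one video."""
    codes = np.abs(rng.normal(size=(n_instances, code_dim)))
    codes *= rng.random((n_instances, code_dim)) < 0.1   # ~10% non-zero entries
    if label == 1:
        codes[:, :5] += 1.0   # positive bags respond strongly on a few atoms
    return codes

labels = np.array([0] * 30 + [1] * 30)
bags = [make_bag(y) for y in labels]

# Bag-level representation: max pooling over the instances of each bag.
X = np.stack([bag.max(axis=0) for bag in bags])

# Shuffle, then train on 40 bags and test on the remaining 20.
idx = rng.permutation(len(bags))
X, labels = X[idx], labels[idx]
clf = LinearSVC(C=1.0).fit(X[:40], labels[:40])
print("held-out accuracy:", clf.score(X[40:], labels[40:]))
```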



Author information

Corresponding author

Correspondence to Li-min Xia  (夏利民).

Additional information

Foundation item: Project(51678075) supported by the National Natural Science Foundation of China; Project(2017GK2271) supported by the Science and Technology Project of Hunan Province, China

About this article

Cite this article

Wang, J., Zhou, S.-c. & Xia, L.-m. Human interaction recognition based on sparse representation of feature covariance matrices. J. Cent. South Univ. 25, 304–314 (2018). https://doi.org/10.1007/s11771-018-3738-3

