Human interaction recognition based on sparse representation of feature covariance matrices

Wang, Jun; Zhou, Si-chao; Xia, Li-min

doi:10.1007/s11771-018-3738-3

Chinese title (translated): Interaction recognition method based on sparse representation of covariance matrices

Published: 12 February 2018

Journal of Central South University, Volume 25, pages 304–314 (2018)

Jun Wang (王军)¹,
Si-chao Zhou (周思超)¹ &
Li-min Xia (夏利民)¹

Abstract

A new method for human interaction recognition based on sparse representation of feature covariance matrices was presented. Firstly, the dense trajectories (DT) extracted from the video were clustered into groups to eliminate irrelevant trajectories, which greatly reduced the influence of noise on feature extraction. Then, the trajectory tunnels were characterized by feature covariance matrices. This yielded discriminative descriptors and addressed the insufficient description of second-order feature statistics in earlier descriptors. After that, an over-complete dictionary was learned from the descriptors, and all the descriptors were encoded using sparse coding (SC). Classification was achieved using multiple instance learning (MIL), which is better suited to complex environments. The proposed method was evaluated on the WEB-Interaction dataset and the UT-Interaction dataset, and the experimental results demonstrated its effectiveness.
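To make the covariance descriptor step concrete, the following is a minimal sketch of how a set of per-point feature vectors sampled inside one trajectory tunnel could be summarized by a single covariance matrix. The abstract does not specify which features are used or how the matrix is vectorized, so the function name `covariance_descriptor`, the regularization constant `eps`, and the log-Euclidean vectorization are all assumptions made for illustration, not the authors' implementation.

```python
import numpy as np

def covariance_descriptor(features, eps=1e-6):
    """Summarize (n_points, d) feature vectors by a vectorized covariance matrix.

    Hypothetical helper: assumes d >= 2 and n_points >= 2.
    """
    features = np.asarray(features, dtype=float)
    d = features.shape[1]
    # Sample covariance of the feature vectors (rows are observations).
    cov = np.cov(features, rowvar=False)
    # Small ridge term so the matrix is strictly positive definite.
    cov = cov + eps * np.eye(d)
    # Matrix logarithm through the eigendecomposition (assumed log-Euclidean map).
    eigvals, eigvecs = np.linalg.eigh(cov)
    eigvals = np.maximum(eigvals, eps)
    log_cov = (eigvecs * np.log(eigvals)) @ eigvecs.T
    # Keep the upper triangle only; scale off-diagonal entries by sqrt(2)
    # so the vector norm equals the Frobenius norm of log_cov.
    rows, cols = np.triu_indices(d)
    weights = np.where(rows == cols, 1.0, np.sqrt(2.0))
    return log_cov[rows, cols] * weights
```

For d-dimensional features the resulting vector has d(d+1)/2 entries regardless of how many points the tunnel contains, which is one way a covariance-based descriptor can stay compact.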

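Abstract (translated from Chinese)

Human action recognition is an important research direction in computer vision and pattern recognition, with broad application prospects in surveillance systems, human-computer interaction, and artificial intelligence. This paper proposes an interaction recognition method based on sparse representation of covariance matrices. First, the dense trajectories extracted from the video are clustered into trajectory groups to eliminate irrelevant trajectories and reduce the influence of noise on feature extraction. Then, the trajectory tunnels are described with covariance matrices, yielding highly discriminative trajectory tunnel descriptors; these descriptors have lower dimensionality and effectively address the insufficient description of second-order feature statistics in earlier descriptors. Sparse representation is then used to sparse-code the feature descriptors. Finally, multiple instance learning is used for action classification. Experiments on the UT-Interaction and WEB-Interaction datasets demonstrate the effectiveness of the proposed method.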
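The sparse coding step can be illustrated with a short sketch. It encodes one descriptor `y` against a fixed over-complete dictionary `D` by solving the standard L1-regularized least-squares problem with ISTA (iterative shrinkage-thresholding). The solver choice, the penalty weight `lam`, and the function names are assumptions for illustration; the abstract does not state which solver or dictionary learning algorithm the authors used, and dictionary learning itself is not shown here.

```python
import numpy as np

def soft_threshold(x, t):
    """Element-wise soft-thresholding, the proximal operator of t * ||x||_1."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def sparse_code_ista(D, y, lam=0.1, n_iter=200):
    """Approximately minimize 0.5 * ||y - D a||^2 + lam * ||a||_1 over a.

    D: (d, k) dictionary with k > d (over-complete); y: (d,) descriptor.
    """
    # Lipschitz constant of the gradient: squared largest singular value of D.
    lipschitz = np.linalg.norm(D, 2) ** 2
    a = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ a - y)  # gradient of the quadratic data term
        a = soft_threshold(a - grad / lipschitz, lam / lipschitz)
    return a
```

A call such as `sparse_code_ista(D, y, lam=0.05)` returns a length-k coefficient vector; larger values of `lam` generally drive more coefficients to exactly zero.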



Author information

Authors and Affiliations

School of Information Science and Engineering, Central South University, Changsha, 410075, China
Jun Wang (王军), Si-chao Zhou (周思超) & Li-min Xia (夏利民)

Corresponding author

Correspondence to Li-min Xia (夏利民).

Additional information

Foundation item: Project (51678075) supported by the National Natural Science Foundation of China; Project (2017GK2271) supported by the Science and Technology Project of Hunan Province, China.

About this article

Cite this article

Wang, J., Zhou, Sc. & Xia, Lm. Human interaction recognition based on sparse representation of feature covariance matrices. J. Cent. South Univ. 25, 304–314 (2018). https://doi.org/10.1007/s11771-018-3738-3

Received: 14 June 2016
Accepted: 05 December 2017
Published: 12 February 2018
Issue Date: February 2018
DOI: https://doi.org/10.1007/s11771-018-3738-3
