Learning rich features for gait recognition by integrating skeletons and silhouettes

Peng, Yunjie; Ma, Kang; Zhang, Yang; He, Zhiqiang

doi:10.1007/s11042-023-15483-x


Published: 07 June 2023

Multimedia Tools and Applications, Volume 83, pages 7273–7294 (2024)

Yunjie Peng^1,2, Kang Ma^2, Yang Zhang^3 & Zhiqiang He^1,3 (ORCID: orcid.org/0000-0003-3103-1902)


Abstract

Gait recognition identifies an individual from the gait patterns captured in a walking sequence. Most existing gait recognition methods learn features from either silhouettes or skeletons for their robustness to clothing, carrying, and other exterior factors. The combination of the two data modalities, however, has not been fully exploited. Previous multimodal gait recognition methods mainly employ the skeleton to assist local feature extraction, ignoring the intrinsic discriminative power of the skeleton data. To fill this gap and make full use of the two complementary data modalities, this paper proposes a simple yet effective Bimodal Fusion (BiFusion) network, which mines discriminative gait patterns in skeletons and integrates them with silhouette representations to learn rich features for better identification. In particular, the inherent hierarchical semantics of body joints in a skeleton are leveraged to design a novel Multi-Scale Gait Graph (MSGG) network for extracting skeleton features. Extensive experiments on CASIA-B and OUMVLP demonstrate both the superiority of the proposed MSGG network in modeling skeletons and the effectiveness of bimodal fusion for gait recognition. Under the most challenging condition of cross-clothing gait recognition on CASIA-B, our method achieves a rank-1 accuracy of 94.0%, which outperforms previous state-of-the-art methods by a large margin. The code is released at https://github.com/YunjiePeng/BimodalFusion.
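
For readers unfamiliar with the rank-1 accuracy reported above, the following is a minimal sketch of how the metric is commonly computed from probe and gallery embeddings. The function name `rank1_accuracy`, the use of Euclidean distance, and the toy vectors are illustrative assumptions and are not taken from the paper or its released code; the CASIA-B and OUMVLP evaluation protocols involve further details that are omitted here.

```python
import math

def rank1_accuracy(probe_feats, probe_ids, gallery_feats, gallery_ids):
    """Fraction of probes whose nearest gallery sample shares their identity."""
    hits = 0
    for feat, pid in zip(probe_feats, probe_ids):
        # Index of the gallery embedding closest to this probe (Euclidean distance).
        nearest = min(range(len(gallery_feats)),
                      key=lambda j: math.dist(feat, gallery_feats[j]))
        hits += gallery_ids[nearest] == pid
    return hits / len(probe_feats)

# Toy 2-D embeddings, made up for illustration only.
gallery_feats = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
gallery_ids = ["A", "B", "C"]
probe_feats = [(0.1, 0.0), (0.9, 1.2), (2.0, 2.0)]
probe_ids = ["A", "B", "C"]

accuracy = rank1_accuracy(probe_feats, probe_ids, gallery_feats, gallery_ids)
```

In this toy case the third probe lies closer to the gallery sample of identity B than to its own identity C, so two of the three probes are matched correctly.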


Data Availability

The data that support the findings of this study are available on request from the Institute of Automation, Chinese Academy of Sciences (CASIA) (http://www.cbsr.ia.ac.cn/english/Gait%20Databases.asp) and the Institute of Scientific and Industrial Research (ISIR), Osaka University (OU) (http://www.am.sanken.osaka-u.ac.jp/BiometricDB/GaitMVLP.html).

Notes

The keypoints matrix is a 3D matrix that organizes the skeleton sequence data into regular grid formats. Each keypoint in a skeleton contains three initial features, i.e., the x, y coordinates of the keypoint in the frame and the confidence of the prediction.
Positive: right elbow, right knee, left elbow, left knee, right wrist, right ankle, left wrist, and left ankle. Negative: right shoulder, right hip, left shoulder, left hip. Positive and Negative nodes of the limbs spatial-temporal graph and the bodyparts spatial-temporal graph are similarly defined.
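
As an illustration of the keypoints matrix described in the first note, the sketch below stacks per-frame pose estimates into a three-dimensional grid indexed as [frame][joint][feature], where the three features are x, y, and prediction confidence. The axis order, the joint ordering, the helper name `build_keypoints_matrix`, and the placeholder values are assumptions made for this example; only the twelve limb joints named in the second note are included, whereas a pose estimator's output typically contains more keypoints.

```python
# Joint names taken from the note above; their ordering here is an assumption.
JOINTS = ["right shoulder", "right elbow", "right wrist",
          "left shoulder", "left elbow", "left wrist",
          "right hip", "right knee", "right ankle",
          "left hip", "left knee", "left ankle"]

def build_keypoints_matrix(frames):
    """Stack per-frame pose estimates into a [frame][joint][feature] grid.

    `frames` is a list of dicts mapping joint name -> (x, y, confidence).
    """
    return [[list(frame[name]) for name in JOINTS] for frame in frames]

# Two made-up frames; every joint in a frame shares the same placeholder values.
frames = [
    {name: (10.0 + t, 20.0, 0.9) for name in JOINTS}
    for t in range(2)
]
matrix = build_keypoints_matrix(frames)

# (number of frames, number of joints, features per joint)
shape = (len(matrix), len(matrix[0]), len(matrix[0][0]))
```

Here `shape` is `(2, 12, 3)`: two frames, twelve joints, and three features per joint.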


Funding

No funding was received to assist with the preparation of this manuscript.

Author information

Authors and Affiliations

School of Computer Science and Technology, Beihang University, Xueyuan Road, Beijing, 100191, China
Yunjie Peng & Zhiqiang He
Watrix Technology Limited Co. Ltd., XueYuan Road, Beijing, 100083, China
Yunjie Peng & Kang Ma
Lenovo Co. Ltd., Xibeiwang Road, Beijing, 100085, China
Yang Zhang & Zhiqiang He


Corresponding author

Correspondence to Zhiqiang He.

Ethics declarations

Conflict of Interests

The authors have no relevant financial or non-financial interests to disclose.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article

Cite this article

Peng, Y., Ma, K., Zhang, Y. et al. Learning rich features for gait recognition by integrating skeletons and silhouettes. Multimed Tools Appl 83, 7273–7294 (2024). https://doi.org/10.1007/s11042-023-15483-x


Received: 05 May 2022
Revised: 02 August 2022
Accepted: 18 April 2023
Published: 07 June 2023
Issue Date: January 2024
DOI: https://doi.org/10.1007/s11042-023-15483-x
