Video person re-identification using key frame screening with index and feature reorganization based on inter-frame relation

Lu, Zeng; Zhang, Ganghan; Huang, Guoheng; Yu, Zhiwen; Pun, Chi-Man; Zhang, Weiwen; Chen, Junan; Ling, Wing-Kuen

doi:10.1007/s13042-022-01560-4

Original Article
Published: 19 April 2022

International Journal of Machine Learning and Cybernetics, Volume 13, pages 2745–2761 (2022)

Zeng Lu¹,
Ganghan Zhang¹,
Guoheng Huang ORCID: orcid.org/0000-0002-3640-3229¹,
Zhiwen Yu²,
Chi-Man Pun³,
Weiwen Zhang¹,
Junan Chen¹ &
…
Wing-Kuen Ling⁴

380 Accesses
4 Citations
1 Altmetric

Abstract

Many video person re-identification networks do not screen the input video frame sequence, so the frames used to train the neural network are highly similar to one another and the temporal information in the video cannot be modeled effectively. To address this, we propose a video person re-identification scheme based on inter-frame reorganization, which consists of two modules. First, Key Frame Screening with Index (KFSI) screens out similar frames, so that a frame sequence with richer information is extracted when the training dataset is loaded. Second, Feature Reorganization Based on Inter-Frame Relation (FRBIFR) reorganizes the features of the key frame sequence by calculating the correlation between frames; the reorganized features are more robust because some distractions (such as occlusion) are eliminated. Experimental results show that our method outperforms state-of-the-art methods on four mainstream datasets: MARS, iLIDS-VID, PRID-2011 and DukeMTMC-VideoReID.
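
The two ideas in the abstract can be illustrated with a minimal sketch. It is not the authors' KFSI or FRBIFR implementation, whose details are not given here. It assumes that each frame is already represented by a feature vector, that "similar" means cosine similarity above a hypothetical threshold, and that "reorganization" means re-expressing each frame feature as a softmax-weighted mixture of all frame features. The function names and the threshold value are illustrative only.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine similarity of two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def screen_key_frames(frames: Sequence[Vector], threshold: float = 0.95) -> List[int]:
    """Greedy screening (assumed rule, not the paper's): keep a frame only
    if it is less similar than `threshold` to every frame kept so far.
    Returns the indices of the kept frames in their original order."""
    kept: List[int] = []
    for i, frame in enumerate(frames):
        if all(cosine_similarity(frame, frames[j]) < threshold for j in kept):
            kept.append(i)
    return kept


def reorganize_features(features: Sequence[Vector]) -> List[List[float]]:
    """Re-express each frame feature as a weighted mixture of all frame
    features, with weights given by a softmax over pairwise cosine
    similarities (an assumed stand-in for an inter-frame relation)."""
    reorganized: List[List[float]] = []
    for f in features:
        scores = [cosine_similarity(f, g) for g in features]
        top = max(scores)  # subtract the maximum for numerical stability
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        mixed = [
            sum(w * g[d] for w, g in zip(weights, features))
            for d in range(len(f))
        ]
        reorganized.append(mixed)
    return reorganized


if __name__ == "__main__":
    # Toy 2-D "frame features": frames 0/1 and frames 2/3 are near-duplicates.
    toy = [[1.0, 0.0], [1.0, 0.01], [0.0, 1.0], [0.02, 1.0]]
    key_idx = screen_key_frames(toy)
    print(key_idx)
    print(reorganize_features([toy[i] for i in key_idx]))
```

In this sketch a frame that disagrees with most of the others (for example, an occluded one) receives small weights in the mixtures of the remaining frames, which is one simple way a correlation-based reorganization could dampen such distractions.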

Acknowledgements

This work was supported in part by the Key-Area Research and Development Program of Guangdong Province under Grants 2018B010109007 and 2019B010153002, the Guangzhou R&D Programme in the Key Area of Science and Technology Projects under Grant 202007040006, the Program of Marine Economy Development (Six Marine Industries) Special Foundation of Department of Natural Resources of Guangdong Province under Grant GDNRC [2020]056, and the Guangdong Provincial Key Laboratory of Cyber-Physical System under Grant 2020B1212060069.

Author information

Authors and Affiliations

School of Computer Science and Technology, Guangdong University of Technology, Guangzhou, 510006, China
Zeng Lu, Ganghan Zhang, Guoheng Huang, Weiwen Zhang & Junan Chen
School of Computer Science and Engineering, South China University of Technology, Guangzhou, 510006, China
Zhiwen Yu
Department of Computer and Information Science, University of Macau, Macau SAR, 999078, China
Chi-Man Pun
School of Information Engineering, Guangdong University of Technology, Guangzhou, 510006, China
Wing-Kuen Ling

Corresponding author

Correspondence to Guoheng Huang.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Lu, Z., Zhang, G., Huang, G. et al. Video person re-identification using key frame screening with index and feature reorganization based on inter-frame relation. Int. J. Mach. Learn. & Cyber. 13, 2745–2761 (2022). https://doi.org/10.1007/s13042-022-01560-4

Received: 26 July 2021
Accepted: 31 March 2022
Published: 19 April 2022
Issue Date: September 2022
DOI: https://doi.org/10.1007/s13042-022-01560-4
