Abstract
Many existing video person re-identification networks do not screen the input video frame sequence, which results in high similarity among the video frames used to train the neural network. As a result, the temporal information in the video cannot be modeled effectively. To address this, we propose a video person re-identification scheme based on inter-frame reorganization, which consists of two modules. First, Key Frame Screening with Index (KFSI) screens out similar frames, so that a frame sequence with richer information is extracted when the training dataset is loaded. Second, Feature Reorganization Based on Inter-Frame Relation (FRBIFR) reorganizes the features of the key frame sequence by calculating the correlation between frames; the reorganized features are more robust because some distractions (such as occlusion) are eliminated. The experimental results show that our method outperforms the state-of-the-art methods on four mainstream datasets: MARS, ILIDS-VID, PRID-2011 and DukeMTMC-VideoReID.
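The two ideas in the abstract can be illustrated with a small sketch. This is not the KFSI or FRBIFR implementation from the paper, whose details the abstract does not give. It assumes each frame is already represented by a feature vector, uses cosine similarity as the similarity measure, picks key frames greedily, and reorganizes features with softmax-weighted averaging. The function names `screen_key_frames` and `reorganize_features` are hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def screen_key_frames(frames, num_keys):
    """Greedily pick mutually dissimilar frames (assumed stand-in for KFSI).

    Returns the indices of the chosen frames in temporal order.
    """
    selected = [0]  # assumption: the first frame is always kept
    while len(selected) < min(num_keys, len(frames)):
        candidates = [i for i in range(len(frames)) if i not in selected]
        # Choose the candidate whose closest selected frame is least similar.
        best = min(
            candidates,
            key=lambda i: max(cosine(frames[i], frames[j]) for j in selected),
        )
        selected.append(best)
    return sorted(selected)  # sorted indices preserve temporal order

def reorganize_features(features):
    """Rebuild each frame feature as a correlation-weighted mix of all frames
    (assumed stand-in for FRBIFR)."""
    reorganized = []
    for f in features:
        sims = [cosine(f, g) for g in features]
        peak = max(sims)
        exps = [math.exp(s - peak) for s in sims]  # numerically stable softmax
        total = sum(exps)
        weights = [e / total for e in exps]
        reorganized.append(
            [sum(w * g[d] for w, g in zip(weights, features)) for d in range(len(f))]
        )
    return reorganized

# Toy per-frame features: frame 1 is a near-duplicate of frame 0.
frames = [[1.0, 0.0], [0.99, 0.05], [0.0, 1.0], [0.7, 0.7]]
key_idx = screen_key_frames(frames, 3)  # [0, 2, 3]: the near-duplicate is dropped
key_feats = reorganize_features([frames[i] for i in key_idx])
```

In this toy setup, screening removes the redundant frame before any features are mixed, and each reorganized feature leans toward the frames it correlates with most, so a frame that disagrees with the rest of the sequence (for example, an occluded one) contributes less.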
Acknowledgements
This work was supported in part by the Key-Area Research and Development Program of Guangdong Province under Grants 2018B010109007 and 2019B010153002, the Guangzhou R&D Programme in the Key Area of Science and Technology Projects 202007040006, the Program of Marine Economy Development (Six Marine Industries) Special Foundation of Department of Natural Resources of Guangdong Province under Grant GDNRC [2020]056, and the Guangdong Provincial Key Laboratory of Cyber-Physical System under Grant 2020B1212060069.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Lu, Z., Zhang, G., Huang, G. et al. Video person re-identification using key frame screening with index and feature reorganization based on inter-frame relation. Int. J. Mach. Learn. & Cyber. 13, 2745–2761 (2022). https://doi.org/10.1007/s13042-022-01560-4
DOI: https://doi.org/10.1007/s13042-022-01560-4