Video person re-identification using key frame screening with index and feature reorganization based on inter-frame relation

  • Original Article
International Journal of Machine Learning and Cybernetics

Abstract

Many video person re-identification networks do not screen the input video frame sequence, which results in high similarity among the video frames used to train the neural network. As a result, the temporal information in the video cannot be modeled effectively. To address this, we propose a video person re-identification scheme based on inter-frame reorganization, which consists of two modules. First, Key Frame Screening with Index (KFSI) screens out similar frames, so that a frame sequence with richer information is extracted when the training dataset is loaded. Second, Feature Reorganization Based on Inter-Frame Relation (FRBIFR) reorganizes the features of the key frame sequence by calculating the correlation between frames; the reorganized features are more robust because some distractions (such as occlusion) are eliminated. The experimental results show that our method outperforms state-of-the-art methods on four mainstream datasets: MARS, iLIDS-VID, PRID-2011 and DukeMTMC-VideoReID.
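The two ideas in the abstract can be illustrated with a small, self-contained sketch. It is not the authors' KFSI or FRBIFR: the abstract does not state the similarity measure, the screening rule, or the reorganization operator, so the cosine similarity, the 0.95 threshold, the softmax weighting, and the function names `screen_key_frames` and `reorganize_features` are all assumptions made only for illustration.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def screen_key_frames(frames: Sequence[Vector], threshold: float = 0.95) -> List[int]:
    """Return the indices of frames kept after dropping near-duplicates.

    Assumed rule: a frame is kept only if its similarity to the most
    recently kept frame is below `threshold`.
    """
    kept: List[int] = []
    for idx, frame in enumerate(frames):
        if not kept or cosine_similarity(frames[kept[-1]], frame) < threshold:
            kept.append(idx)
    return kept


def reorganize_features(features: Sequence[Vector]) -> List[List[float]]:
    """Rebuild each frame feature as a weighted mix of all frame features.

    Assumed operator: weights are a softmax over pairwise cosine
    similarities, so a frame that disagrees with the rest of the sequence
    (for example an occluded one) is pulled toward the consensus.
    """
    reorganized: List[List[float]] = []
    for f_i in features:
        scores = [cosine_similarity(f_i, f_j) for f_j in features]
        peak = max(scores)  # subtract the maximum for numerical stability
        exps = [math.exp(s - peak) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        mixed = [
            sum(w * f_j[d] for w, f_j in zip(weights, features))
            for d in range(len(f_i))
        ]
        reorganized.append(mixed)
    return reorganized


if __name__ == "__main__":
    # Toy per-frame descriptors; a real pipeline would use CNN features.
    toy_frames = [[1.0, 0.0], [1.0, 0.01], [0.0, 1.0], [0.01, 1.0], [1.0, 1.0]]
    key_indices = screen_key_frames(toy_frames)
    key_features = [toy_frames[i] for i in key_indices]
    print(key_indices)
    print(reorganize_features(key_features))
```

Returning indices rather than frames mirrors the "with Index" part of the KFSI name only loosely; how the paper actually uses the index is not described in the abstract.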

Acknowledgements

This work was supported in part by the Key-Area Research and Development Program of Guangdong Province under Grants 2018B010109007 and 2019B010153002, the Guangzhou R&D Programme in the Key Area of Science and Technology Projects under Grant 202007040006, the Program of Marine Economy Development (Six Marine Industries) Special Foundation of Department of Natural Resources of Guangdong Province under Grant GDNRC [2020]056, and the Guangdong Provincial Key Laboratory of Cyber-Physical System under Grant 2020B1212060069.

Author information

Corresponding author

Correspondence to Guoheng Huang.

About this article

Cite this article

Lu, Z., Zhang, G., Huang, G. et al. Video person re-identification using key frame screening with index and feature reorganization based on inter-frame relation. Int. J. Mach. Learn. & Cyber. 13, 2745–2761 (2022). https://doi.org/10.1007/s13042-022-01560-4

  • DOI: https://doi.org/10.1007/s13042-022-01560-4
