Abstract
In this paper, we investigate person re-identification with two goals: learning distinguishing pedestrian features and reducing model complexity. Existing methods usually improve pedestrian features by designing better network structures and loss functions, but they give little consideration to model size and ignore the impact of model efficiency on re-identification accuracy. We propose PA-Net, an end-to-end joint learning framework that combines an attention model with a dynamic filter pruning algorithm. First, for each feature node, we learn attention from a compact representation, obtained by stacking the node's pairwise relations with all feature nodes into a vector; the learned attention guides dynamic filter pruning during training. Second, in each training epoch, filters with a small ℓ2-norm are pruned with higher priority than those with a larger ℓ2-norm, which temporarily removes their contribution to the model output. Pruned filters can still be updated in subsequent epochs, until some filters no longer have any effect on the model and are pruned permanently. Third, the weighted regularized triplet (WRT) loss and the center loss constrain the original features, while the softmax loss constrains the batch-normalized (BN) features to obtain the final score. Comprehensive experiments on the Market-1501, DukeMTMC-reID and MSMT17 datasets show superior performance compared with state-of-the-art methods.
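The pruning step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the pruning ratio, how the attention signal enters the ranking, or the criterion for permanent removal, so the code below only shows the general idea of zeroing low ℓ2-norm filters while leaving them updatable. The function names `l2_norm` and `soft_prune`, the `prune_ratio` parameter, and the toy weights are all hypothetical and are not taken from the paper.

```python
import math


def l2_norm(filter_weights):
    """Euclidean norm of one filter given as a flat list of weights."""
    return math.sqrt(sum(w * w for w in filter_weights))


def soft_prune(filters, prune_ratio):
    """Zero the filters with the smallest l2-norm.

    Returns (new_filters, pruned_indices). Zeroed filters stay in the
    layer, so a later update can still change them; nothing is removed
    permanently here. `prune_ratio` is an assumed hyperparameter.
    """
    num_pruned = int(len(filters) * prune_ratio)
    # Rank filter indices by ascending norm; the smallest come first.
    order = sorted(range(len(filters)), key=lambda i: l2_norm(filters[i]))
    pruned = sorted(order[:num_pruned])
    pruned_set = set(pruned)
    new_filters = [
        [0.0] * len(f) if i in pruned_set else list(f)
        for i, f in enumerate(filters)
    ]
    return new_filters, pruned


# Toy layer with four filters of three weights each (made-up values).
layer = [
    [0.5, -0.5, 0.1],
    [0.01, 0.02, -0.01],
    [1.0, 2.0, -1.0],
    [0.1, 0.0, 0.1],
]
pruned_layer, pruned_idx = soft_prune(layer, prune_ratio=0.5)

# Simulated update in the next epoch: a pruned filter regains weight,
# so it is no longer among the lowest-norm filters when re-ranked.
pruned_layer[3] = [w + 0.9 for w in pruned_layer[3]]
_, next_idx = soft_prune(pruned_layer, prune_ratio=0.5)
```

In this toy run, filters 1 and 3 are zeroed in the first pass; after filter 3 is updated, the second pass selects filters 0 and 1 instead, which mirrors the statement that a pruned filter can recover in a later epoch. A real implementation would operate on convolutional weight tensors inside the training loop rather than on Python lists.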
Cite this article
Cheng, R., Wang, L., Wei, M. et al. Joint learning dynamic pruning and attention for person re-identification. Multimed Tools Appl 81, 39409–39429 (2022). https://doi.org/10.1007/s11042-022-12195-6
DOI: https://doi.org/10.1007/s11042-022-12195-6