Visible-infrared person re-identification employing style-supervision and content-supervision

Tang, Qingwei; Yan, Pu; Sun, Wei

doi:10.1007/s00371-023-02929-4

Original article
Published: 17 June 2023

The Visual Computer, volume 40, pages 2443–2456 (2024)

Abstract

Cross-modal visible-infrared person re-identification (VI-ReID) aims to retrieve images of the same pedestrian captured by visible (VIS) and infrared (IR) cameras, and it is a challenging task in intelligent security systems. Because visible and infrared images are formed by different imaging principles, they exhibit large cross-modal and intra-class differences. The cross-modal differences can be regarded as a special kind of image style difference, while several of the intra-class differences can be regarded as differences in how content is expressed in visible and infrared images. Some state-of-the-art methods improve VI-ReID performance by adding feature enhancement or feature generation modules; however, these modules also introduce additional parameters and increase the training cost. In this paper, to mitigate the differences in image style and content between VIS and IR images, we design two objective functions for the VI-ReID task: a style loss and a content loss. By optimizing these objective functions, our model maps VIS and IR features into the same feature space and effectively reduces the differences between modalities without additional auxiliary modules. In extensive experiments, our model achieves competitive performance on two challenging datasets. Notably, under the visible-to-infrared setting on the RegDB dataset, our model achieves state-of-the-art (SOTA) results of Rank-1/mAP/mINP = 96.13%/91.35%/83.67%.
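
The Rank-1, mAP and mINP figures quoted in the abstract are standard retrieval metrics. The following is a minimal sketch of how they are commonly computed, not the authors' evaluation code. It assumes that each query has already been reduced to a ranked gallery list of match/non-match flags and that every query has at least one correct match; the function names and the toy data are illustrative only.

```python
def rank1(matches):
    """Return 1.0 if the top-ranked gallery image is a correct match."""
    return 1.0 if matches and matches[0] else 0.0


def average_precision(matches):
    """Average the precision measured at each correct-match position."""
    hits = 0
    precisions = []
    for position, is_match in enumerate(matches, start=1):
        if is_match:
            hits += 1
            precisions.append(hits / position)
    return sum(precisions) / len(precisions) if precisions else 0.0


def inverse_negative_penalty(matches):
    """Number of correct matches divided by the rank of the hardest one."""
    positions = [p for p, m in enumerate(matches, start=1) if m]
    return len(positions) / positions[-1] if positions else 0.0


def evaluate(all_matches):
    """Average the per-query scores over all queries."""
    n = len(all_matches)
    return {
        "rank1": sum(rank1(m) for m in all_matches) / n,
        "mAP": sum(average_precision(m) for m in all_matches) / n,
        "mINP": sum(inverse_negative_penalty(m) for m in all_matches) / n,
    }


# Toy example (hypothetical data): two queries, four gallery images each,
# listed in ranked order; True marks a gallery image of the same identity.
toy_rankings = [
    [True, False, True, False],
    [False, True, False, False],
]
scores = evaluate(toy_rankings)
print(scores)
```

Rank-1 only looks at the first retrieved image, mAP rewards placing all correct matches early, and mINP penalizes the query by how far down the list its hardest correct match appears, so the three numbers need not move together.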

Data availability

The data are available in a public (institutional, general or subject-specific) repository that does not issue datasets with DOIs (non-mandated deposition).

Funding

This work was supported by the Key Research Project of Natural Science in Anhui Province under Grant 2022AH050249 and by the Anhui Provincial Graduate Education Quality Project Academic Innovation Project under Grant 2022xscx116.

Author information

Authors and Affiliations

Anhui International Joint Research Center for Intelligent Perception and High-Dimensional Modeling of Ancient Buildings, Anhui Jianzhu University, No. 292, Ziyun Road, Economic and Technological Development Zone, Hefei, 230601, Anhui, China
Qingwei Tang & Pu Yan
School of Electrical Engineering and Automation, Hefei University of Technology, No.193 Tunxi Road, Baohe District, Hefei, 230009, Anhui, China
Qingwei Tang & Wei Sun

Corresponding author

Correspondence to Pu Yan.

Ethics declarations

Conflict of interest

We declare that we have no conflicts of interest in this work, and that we have no commercial or associative interest that represents a conflict of interest in connection with the work submitted.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Tang, Q., Yan, P. & Sun, W. Visible-infrared person re-identification employing style-supervision and content-supervision. Vis Comput 40, 2443–2456 (2024). https://doi.org/10.1007/s00371-023-02929-4

Accepted: 26 May 2023
Published: 17 June 2023
Issue Date: April 2024
DOI: https://doi.org/10.1007/s00371-023-02929-4
