
Staged encoder training for cross-camera person re-identification

  • Original Paper
  • Published in Signal, Image and Video Processing

Abstract

As a cross-camera retrieval problem, person re-identification (ReID) suffers from image style variations caused by differences in camera parameters, lighting, and other factors, which seriously degrade recognition accuracy. To address this problem, this paper proposes a two-stage contrastive learning method that gradually reduces the impact of camera variations. In the first stage, we train an encoder for each camera using only images from that camera, so that each encoder recognizes images from its own camera well while remaining unaffected by camera variations. In the second stage, we encode the same image with all trained encoders and combine the outputs into a new code that is robust to camera variations. We also use the Cross-Camera Encouragement distance (Lin et al., in: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020), which complements the advantages of the combined encoding and further mitigates the impact of camera variations. Our method achieves high accuracy on several commonly used person ReID datasets; e.g., on Market-1501 it reaches 90.8% rank-1 accuracy and 85.2% mAP, outperforming recent unsupervised works by more than 12 percentage points in mAP. Code is available at https://github.com/yjwyuanwu/SET.
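The two stages described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the per-camera encoders (deep networks trained contrastively in the paper) are replaced by fixed random linear projections, and the Cross-Camera Encouragement margin is a simplified stand-in for the distance of Lin et al.; only the data flow (one encoder per camera, then concatenation of all codes for the same image) follows the abstract.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stage-one stand-ins: the paper trains one deep encoder per camera with
# contrastive learning; fixed random linear projections are used here
# purely to illustrate the data flow (hypothetical, not the real model).
num_cameras, img_dim, code_dim = 3, 128, 32
encoders = [rng.standard_normal((img_dim, code_dim)) for _ in range(num_cameras)]

def encode(image, W):
    """L2-normalized code from one camera-specific encoder."""
    z = image @ W
    return z / np.linalg.norm(z)

def combined_code(image):
    """Stage two: encode the same image with every per-camera encoder
    and concatenate the results into one camera-robust code."""
    return np.concatenate([encode(image, W) for W in encoders])

def cce_distance(xi, xj, cam_i, cam_j, lam=0.1):
    """Simplified Cross-Camera Encouragement distance (after Lin et al.
    2020): shrink the distance of cross-camera pairs by a constant
    margin so clustering favors grouping identities across cameras."""
    d = np.linalg.norm(xi - xj)
    return d - lam if cam_i != cam_j else d

image = rng.standard_normal(img_dim)
code = combined_code(image)
print(code.shape)  # (96,) = num_cameras * code_dim
```

Because every image is passed through all encoders, the combined code carries each camera-specific view of the image, which is what makes it less sensitive to the style of the camera that actually captured it.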


Data availability

The datasets are available at https://virutalbuy-public.oss-cn-hangzhou.aliyuncs.com/share/data.zip. Code is available at https://github.com/yjwyuanwu/SET.

References

  1. Chen, Y.C., Zhu, X., Zheng, W.S., Lai, J.H.: Person re-identification by camera correlation aware feature augmentation. IEEE Trans. Pattern Anal. Mach. Intell. 40(2), 392–408 (2017)


  2. Deng, J., Dong, W., Socher, R., Li, L.J., Li, K., Fei-Fei, L.: Imagenet: a large-scale hierarchical image database. In: 2009 IEEE Conference on Computer Vision and Pattern Recognition, pp. 248–255. IEEE (2009)

  3. Deng, W., Zheng, L., Ye, Q., Kang, G., Yang, Y., Jiao, J.: Image-image domain adaptation with preserved self-similarity and domain-dissimilarity for person re-identification. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 994–1003 (2018)

  4. Dou, Q., Coelho de Castro, D., Kamnitsas, K., Glocker, B.: Domain generalization via model-agnostic learning of semantic features. In: Advances in Neural Information Processing Systems 32 (2019)

  5. Ge, Y., Chen, D., Li, H.: Mutual mean-teaching: Pseudo label refinery for unsupervised domain adaptation on person re-identification. arXiv preprint arXiv:2001.01526 (2020)

  6. Ge, Y., Zhu, F., Chen, D., Zhao, R., et al.: Self-paced contrastive learning with hybrid memory for domain adaptive object re-id. Adv. Neural. Inf. Process. Syst. 33, 11309–11321 (2020)


  7. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778 (2016)

  8. Ioffe, S., Szegedy, C.: Batch normalization: accelerating deep network training by reducing internal covariate shift. In: International Conference on Machine Learning, pp. 448–456. PMLR (2015)

  9. Javed, O., Shafique, K., Rasheed, Z., Shah, M.: Modeling inter-camera space-time and appearance relationships for tracking across non-overlapping views. Comput. Vis. Image Underst. 109(2), 146–162 (2008)


  10. Li, Y.J., Lin, C.S., Lin, Y.B., Wang, Y.C.F.: Cross-dataset person re-identification via unsupervised pose disentanglement and adaptation. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 7919–7929 (2019)

  11. Lin, Y., Dong, X., Zheng, L., Yan, Y., Yang, Y.: A bottom-up clustering approach to unsupervised person re-identification. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 33, pp. 8738–8745 (2019)

  12. Lin, Y., Xie, L., Wu, Y., Yan, C., Tian, Q.: Unsupervised person re-identification via softened similarity learning. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 3390–3399 (2020)

  13. Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M., Prettenhofer, P., Weiss, R., Dubourg, V., et al.: Scikit-learn: machine learning in Python. J. Mach. Learn. Res. 12, 2825–2830 (2011)


  14. Porikli, F.: Inter-camera color calibration by correlation model function. In: Proceedings 2003 International Conference on Image Processing (cat. No. 03CH37429), vol. 2, pp. II–133. IEEE (2003)

  15. Ristani, E., Solera, F., Zou, R., Cucchiara, R., Tomasi, C.: Performance measures and a data set for multi-target, multi-camera tracking. In: European Conference on Computer Vision, pp. 17–35. Springer (2016)

  16. Sun, X., Zheng, L.: Dissecting person re-identification from the viewpoint of viewpoint. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 608–617 (2019)

  17. Wang, D., Zhang, S.: Unsupervised person re-identification via multi-label classification. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 10981–10990 (2020)

  18. Wei, L., Zhang, S., Gao, W., Tian, Q.: Person transfer gan to bridge domain gap for person re-identification. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 79–88 (2018)

  19. Xuan, S., Zhang, S.: Intra-inter camera similarity for unsupervised person re-identification. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 11926–11935 (2021)

  20. Ye, M., Shen, J., Lin, G., Xiang, T., Shao, L., Hoi, S.C.: Deep learning for person re-identification: a survey and outlook. IEEE Trans. Pattern Anal. Mach. Intell. 44(6), 2872–2893 (2021)


  21. Zeng, K., Ning, M., Wang, Y., Guo, Y.: Hierarchical clustering with hard-batch triplet loss for person re-identification. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13657–13665 (2020)

  22. Zhang, Y., Xiang, T., Hospedales, T.M., Lu, H.: Deep mutual learning. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4320–4328 (2018)

  23. Zhao, F., Liao, S., Xie, G.S., Zhao, J., Zhang, K., Shao, L.: Unsupervised domain adaptation with noise resistible mutual-training for person re-identification. In: Computer Vision–ECCV 2020: 16th European Conference, Glasgow, Proceedings, Part XI 16, pp. 526–544. Springer (2020)

  24. Zheng, L., Shen, L., Tian, L., Wang, S., Wang, J., Tian, Q.: Scalable person re-identification: a benchmark. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 1116–1124 (2015)

  25. Zheng, Z., Yang, X., Yu, Z., Zheng, L., Yang, Y., Kautz, J.: Joint discriminative and generative learning for person re-identification. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 2138–2147 (2019)

  26. Zhong, Z., Zheng, L., Li, S., Yang, Y.: Generalizing a person retrieval model hetero-and homogeneously. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 172–188 (2018)

  27. Zhong, Z., Zheng, L., Zheng, Z., Li, S., Yang, Y.: Camera style adaptation for person re-identification. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 5157–5166 (2018)

  28. Zou, Y., Yang, X., Yu, Z., Kumar, B.V., Kautz, J.: Joint disentangling and adaptation for cross-domain person re-identification. In: Computer Vision–ECCV 2020: 16th European Conference, Proceedings, Part II 16, pp. 87–104. Springer (2020)


Funding

This work was supported by the Guangxi Natural Science Foundation (No. 2020GXNSFAA297186), the Jiangsu Province Agricultural Science and Technology Innovation and Promotion Special Project (No. NJ2021-21), the Guilin Key Research and Development Program (No. 20210206-1), the Guangxi Key Laboratory of Precision Navigation Technology and Application (No. DH202227), and the Guangxi Key Laboratory of Image and Graphic Intelligent Processing (No. GIIP2301). There are no financial conflicts of interest to disclose.

Author information


Contributions

ZX contributed to conceptualization, methodology, software, resources, writing—review and editing. JY contributed to methodology, software, validation and writing—original draft. YL contributed to supervision and writing—original draft. LZ contributed to Data curation and Investigation. JL contributed to formal analysis and Data curation. All authors reviewed the manuscript.

Corresponding author

Correspondence to Zhi Xu.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Xu, Z., Yang, J., Liu, Y. et al. Staged encoder training for cross-camera person re-identification. SIViP 18, 2323–2331 (2024). https://doi.org/10.1007/s11760-023-02909-0

