Dual adaptive alignment and partitioning network for visible and infrared cross-modality person re-identification

Applied Intelligence

Abstract

Visible and infrared person re-identification (VI-ReID) is the task of matching images of the same person captured by visible-light and infrared cameras; it is particularly challenging in night-time surveillance applications. Existing cross-modality studies have focused mainly on learning global, shareable feature representations of pedestrians to handle cross-modality discrepancies. However, global features alone cannot handle unaligned image pairs effectively, particularly when appearance or posture is misaligned because of inaccurate pedestrian detection boxes. To mitigate these problems, we propose an end-to-end dual alignment and partitioning network that simultaneously learns global and local modality-invariant features of pedestrians. First, two adaptive spatial transform modules align the visible and infrared input images. Next, each aligned image is divided horizontally, and features are extracted from each local block. These local features are then fused with the global features. To reduce the discrepancy between the heterogeneous modalities and learn a common feature representation, we map the features of both modalities into the same embedding space. Finally, we combine an identity loss with a weighted regularized TriHard loss to improve recognition accuracy. Extensive experiments on two cross-modality datasets, RegDB and SYSU-MM01, demonstrate the superiority of the proposed method over existing state-of-the-art methods.
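
The horizontal partitioning and global-local fusion steps described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state the pooling operator, the number of stripes, or how features are fused, so average pooling, two stripes, and plain concatenation are assumptions made here for illustration only. The names `global_feature`, `stripe_features` and `fuse` are hypothetical.

```python
from typing import List

# A feature map indexed as [channel][row][column]. In practice this would be
# the output of a CNN backbone applied to an aligned image (assumption).
FeatureMap = List[List[List[float]]]


def mean(values: List[float]) -> float:
    return sum(values) / len(values)


def global_feature(fmap: FeatureMap) -> List[float]:
    """Average-pool every channel over the whole map (one value per channel)."""
    return [mean([v for row in channel for v in row]) for channel in fmap]


def stripe_features(fmap: FeatureMap, num_stripes: int) -> List[List[float]]:
    """Split the map into equal horizontal stripes and average-pool each one."""
    height = len(fmap[0])
    if height % num_stripes != 0:
        raise ValueError("height must be divisible by num_stripes")
    rows_per_stripe = height // num_stripes
    stripes = []
    for s in range(num_stripes):
        top, bottom = s * rows_per_stripe, (s + 1) * rows_per_stripe
        stripes.append(
            [mean([v for row in channel[top:bottom] for v in row]) for channel in fmap]
        )
    return stripes


def fuse(fmap: FeatureMap, num_stripes: int) -> List[float]:
    """Concatenate the global descriptor with all stripe descriptors.

    Concatenation is an assumed fusion rule, chosen for simplicity.
    """
    fused = list(global_feature(fmap))
    for stripe in stripe_features(fmap, num_stripes):
        fused.extend(stripe)
    return fused


# Toy map: 2 channels, 4 rows, 2 columns.
toy = [
    [[1.0, 1.0], [1.0, 1.0], [3.0, 3.0], [3.0, 3.0]],
    [[0.0, 2.0], [0.0, 2.0], [4.0, 4.0], [4.0, 4.0]],
]
descriptor = fuse(toy, num_stripes=2)
print(descriptor)  # [2.0, 2.5, 1.0, 1.0, 3.0, 4.0]
```

The fused descriptor has `channels * (1 + num_stripes)` entries: the first block summarizes the whole pedestrian, and each later block summarizes one horizontal body region. Local blocks of this kind are only comparable across images if the inputs are roughly aligned, which is the motivation the abstract gives for the alignment step.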

Acknowledgments

This work was supported by the National Natural Science Foundation of China (No. 61871278), the Fund of Sichuan University-Tomorrow Advancing Life (TAL), the Chengdu Science and Technology Project (No. 2016-XT00-00015-GX), the Sichuan Science and Technology Program (No. 2018HH0143), and by the Fundamental Research Funds for the Central Universities (No. 2021SCU12061).

Author information

Corresponding author

Correspondence to Qizhi Teng.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Liu, Q., Teng, Q., Chen, H. et al. Dual adaptive alignment and partitioning network for visible and infrared cross-modality person re-identification. Appl Intell 52, 547–563 (2022). https://doi.org/10.1007/s10489-021-02390-7

  • DOI: https://doi.org/10.1007/s10489-021-02390-7
