
Thermal pedestrian detection based on different resolution visual image

  • Original Paper
Signal, Image and Video Processing

Abstract

Thermal pedestrian detection is a core problem in computer vision. Knowledge from the corresponding visible images is commonly used to improve performance in the thermal domain. However, existing methods assume that visible and thermal images have the same resolution, an assumption that often does not hold in practice: because thermal imaging equipment is expensive, thermal images typically have a lower resolution than visible images. To address this issue, we propose a new method named Disentanglement Then Restoration (DTR). The key idea is to disentangle features into content features and modal features, and to restore the complete content features of thermal images by learning how content features change with resolution. Specifically, we first train an object detector such as YOLO to initialize our model. Then, we train a feature disentanglement network that separates the backbone features into content features and modal features. Finally, the feature disentanglement network is frozen, and the complete content features of low-resolution thermal images are restored by enforcing content feature consistency between the visible image and the upsampled thermal image. Experimental results on public datasets demonstrate the effectiveness of our method. Code is available at https://github.com/HaMeow-lst1/DTR.
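The restoration step described above can be illustrated with a small, dependency-free sketch. This is not the authors' implementation (see the repository linked above for that). It only shows the general idea of upsampling a low-resolution thermal input and penalizing the difference between its content features and those of the visible image. The function names, the nearest-neighbour upsampling, the channel split used as a stand-in for the frozen disentanglement network, and the mean-squared-error loss are all assumptions made for illustration.

```python
def upsample_nearest(image, factor):
    """Nearest-neighbour upsampling of a 2-D list by an integer factor (assumed scheme)."""
    upsampled = []
    for row in image:
        # Repeat every pixel `factor` times horizontally.
        wide_row = [value for value in row for _ in range(factor)]
        # Repeat the widened row `factor` times vertically.
        upsampled.extend(list(wide_row) for _ in range(factor))
    return upsampled


def split_features(features, content_dims):
    """Hypothetical stand-in for a frozen disentanglement network.

    The first `content_dims` entries are treated as content features,
    the remaining entries as modal features.
    """
    return features[:content_dims], features[content_dims:]


def content_consistency_loss(content_visible, content_thermal):
    """Mean squared difference between two equal-length content feature vectors."""
    if len(content_visible) != len(content_thermal):
        raise ValueError("content features must have the same length")
    squared = [(v - t) ** 2 for v, t in zip(content_visible, content_thermal)]
    return sum(squared) / len(squared)


# Toy example: a 2x2 thermal image upsampled to 4x4.
low_res_thermal = [[1, 2], [3, 4]]
print(upsample_nearest(low_res_thermal, 2))
# [[1, 1, 2, 2], [1, 1, 2, 2], [3, 3, 4, 4], [3, 3, 4, 4]]

# Toy backbone features (made-up numbers) for the visible and upsampled thermal image.
visible_content, _ = split_features([0.5, 1.0, 0.9], content_dims=2)
thermal_content, _ = split_features([0.5, 0.0, 0.1], content_dims=2)
print(content_consistency_loss(visible_content, thermal_content))  # 0.5
```

In a real detector, the features would come from the backbone of the initialized model, and only the modal features would be expected to differ between the two inputs once the loss has been minimized.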

Data Availability

The KAIST dataset analyzed during the current study is available at https://soonminhwang.github.io/rgbt-ped-detection/. The LLVIP dataset analyzed during the current study is available at https://bupt-ai-cz.github.io/LLVIP/.

Acknowledgements

This work was supported in part by the National Natural Science Foundation of China (62276048) and the Sichuan Science and Technology Program (2020YFG0476).

Author information

Contributions

SL proposed the method and designed the experiments. SL, JC, and LT carried out the experiments. SL, MY, and TL wrote the main manuscript text. All authors reviewed the manuscript.

Corresponding author

Correspondence to Ting Li.

Ethics declarations

Conflict of interest

There are no conflicts of interest.

Cite this article

Li, S., Cui, J., Ye, M. et al. Thermal pedestrian detection based on different resolution visual image. SIViP 17, 4347–4355 (2023). https://doi.org/10.1007/s11760-023-02667-z

  • DOI: https://doi.org/10.1007/s11760-023-02667-z
