Thermal pedestrian detection based on different resolution visual image

Li, Songtao; Cui, Jinzhong; Ye, Mao; Li, Ting; Tian, Liang

doi:10.1007/s11760-023-02667-z

Original Paper
Published: 26 July 2023

Volume 17, pages 4347–4355, (2023)

Signal, Image and Video Processing

Songtao Li¹, Jinzhong Cui¹, Mao Ye¹, Ting Li¹ & Liang Tian¹

Abstract

Thermal pedestrian detection is a core problem in computer vision. Knowledge from the corresponding visible images is commonly used to improve performance in the thermal domain. However, existing methods assume that visible and thermal images have the same resolution. In practice this assumption rarely holds: because thermal imaging equipment is expensive, thermal images typically have lower resolution than visible images. To address this issue, we propose a new method named Disentanglement Then Restoration (DTR). The key idea is to disentangle features into content features and modal features, and to restore the complete content features of thermal images by learning how content features change with resolution. Specifically, we first train an object detector such as YOLO to initialize our model. Next, we train a feature disentanglement network that separates the backbone features into content features and modal features. Finally, the feature disentanglement network is frozen, and the complete content features of low-resolution thermal images are restored by enforcing content-feature consistency between the visible image and the upsampled thermal image. Experimental results on public datasets show that our method performs very well. Code is available at https://github.com/HaMeow-lst1/DTR.
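
The restoration step described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation (see the repository linked above for that): the linear `disentangle` heads, the nearest-neighbour upsampling, and the mean-squared-error form of the consistency loss are all assumptions chosen for illustration, and every function name here is hypothetical.

```python
import numpy as np

def upsample_nearest(img, factor):
    """Nearest-neighbour upsampling of an (H, W) array by an integer factor."""
    return np.repeat(np.repeat(img, factor, axis=0), factor, axis=1)

def disentangle(features, w_content, w_modal):
    """Hypothetical stand-in for a disentanglement network: two linear heads
    that map features to (content, modal) parts."""
    return features @ w_content, features @ w_modal

def content_consistency_loss(content_visible, content_thermal):
    """Assumed loss: mean squared difference between the two content features."""
    return float(np.mean((content_visible - content_thermal) ** 2))

rng = np.random.default_rng(0)

# Toy data: an 8x8 "visible" map and a 2x lower-resolution "thermal" map.
visible = rng.random((8, 8))
thermal_low = visible[::2, ::2]
thermal_up = upsample_nearest(thermal_low, 2)  # back to 8x8

# Disentangler weights are fixed here, mirroring the frozen network.
w_content = rng.standard_normal((8, 4))
w_modal = rng.standard_normal((8, 4))

content_vis, _ = disentangle(visible, w_content, w_modal)
content_thr, _ = disentangle(thermal_up, w_content, w_modal)

# In training, this quantity would be minimised with respect to the
# thermal branch so that its content features match the visible ones.
loss = content_consistency_loss(content_vis, content_thr)
print(f"content consistency loss: {loss:.4f}")
```

In a real detector the two inputs would be backbone feature maps rather than raw arrays, and the loss would be backpropagated through the thermal branch while the disentanglement network stays frozen.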

Data Availability

The KAIST dataset analyzed during the current study is available at https://soonminhwang.github.io/rgbt-ped-detection/. The LLVIP dataset analyzed during the current study is available at https://bupt-ai-cz.github.io/LLVIP/.

Acknowledgements

This work was supported in part by the National Natural Science Foundation of China (62276048) and the Sichuan Science and Technology Program (2020YFG0476).

Author information

Authors and Affiliations

School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu, 611731, China
Songtao Li, Jinzhong Cui, Mao Ye, Ting Li & Liang Tian

Contributions

SL proposed the method and designed the experiments. SL, JC, and LT carried out the experiments. SL, MY, and TL wrote the main manuscript text. All authors reviewed the manuscript.

Corresponding author

Correspondence to Ting Li.

Ethics declarations

Conflict of interest

There are no conflicts of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Li, S., Cui, J., Ye, M. et al. Thermal pedestrian detection based on different resolution visual image. SIViP 17, 4347–4355 (2023). https://doi.org/10.1007/s11760-023-02667-z

Received: 10 March 2023
Revised: 22 May 2023
Accepted: 10 June 2023
Published: 26 July 2023
Issue Date: November 2023
DOI: https://doi.org/10.1007/s11760-023-02667-z
