
RCSLFNet: a novel real-time pedestrian detection network based on re-parameterized convolution and channel-spatial location fusion attention for low-resolution infrared image

  • Research
  • Published in Journal of Real-Time Image Processing

Abstract

This study introduces a novel real-time infrared pedestrian detection algorithm. The proposed approach leverages re-parameterized convolution and channel-spatial location fusion attention to tackle the difficulties posed by low resolution, partial occlusion, and environmental interference in infrared pedestrian images, factors that have historically hindered accurate pedestrian detection with traditional algorithms. First, to address the weak feature representation of infrared pedestrian targets caused by low resolution and partial occlusion, a new attention module that integrates channel and spatial attention is devised and introduced into CSPDarkNet53 to form a new backbone, CSLF-DarkNet53. The designed attention module strengthens the feature expression of pedestrian targets and makes them more prominent against complex backgrounds. Second, to improve detection efficiency and accelerate convergence, a multi-branch decoupled detector head is designed to perform the classification and localization of infrared pedestrians separately. Finally, to improve real-time performance without sacrificing precision, we introduce re-parameterized convolution (RepConv), which uses a parameter identity transformation to decouple the training process from the detection process. During training, a multi-branch structure with convolution kernels of different scales is used to enhance the fitting ability of small convolution kernels. Compared with nine classical detection algorithms, the experimental results show that the proposed RCSLFNet not only accurately detects partially occluded infrared pedestrians in complex environments but also achieves better real-time performance on the KAIST dataset: the mAP@0.5 reaches 86%, 2.9% higher than the baseline, with a detection time of 0.0081 s.
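The re-parameterization idea described above can be illustrated with a minimal, single-channel NumPy sketch. This is not the paper's implementation (channel dimensions, additional branch scales, and batch-normalization folding are simplified away); it only shows the core identity transformation: a multi-branch sum of a 3x3 and a 1x1 convolution used during training can be folded into one equivalent 3x3 convolution for inference, because convolution is linear in the kernel.

```python
import numpy as np

def conv2d_same(x, k):
    """Naive 'same'-padded 2D cross-correlation for a square, odd-sized kernel."""
    p = k.shape[0] // 2
    xp = np.pad(x, p)
    H, W = x.shape
    out = np.empty((H, W))
    for i in range(H):
        for j in range(W):
            out[i, j] = np.sum(xp[i:i + k.shape[0], j:j + k.shape[1]] * k)
    return out

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 8))    # toy single-channel feature map
k3 = rng.standard_normal((3, 3))   # 3x3 training-time branch
k1 = rng.standard_normal((1, 1))   # 1x1 training-time branch

# Training-time output: the multi-branch structure sums both branches.
y_train = conv2d_same(x, k3) + conv2d_same(x, k1)

# Re-parameterization: embed the 1x1 kernel at the centre of the 3x3 kernel,
# yielding a single fused kernel for the detection (inference) process.
k_fused = k3.copy()
k_fused[1, 1] += k1[0, 0]
y_infer = conv2d_same(x, k_fused)

print(np.allclose(y_train, y_infer))  # True
```

The fused network therefore pays the cost of only one convolution per layer at inference time while retaining the richer fitting ability of the multi-branch structure during training.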


Data availability statement

Not applicable.

Abbreviations

BN: Batch normalization (2D) layer
CBS: Convolution, batch normalization, SiLU
CNN: Convolutional neural network
DL: Deep learning
FN: False negative
FP: False positive
FPN: Feature pyramid network
FPS: Frames per second
IoU: Intersection over union
ML: Machine learning
R-CNN: Region-based CNN
RNN: Recurrent neural network
SSD: Single-shot detector
TN: True negative
TP: True positive
YOLO: You only look once


Funding

This research was funded by the National Natural Science Foundation of China (51804250); China Postdoctoral Science Foundation (2020M683522); and Natural Science Basic Research Program of Shaanxi (2024JC-YBMS-490).

Author information

Authors and Affiliations

Authors

Contributions

Innovation, SH; methodology, QZL; experiments, QZL; data collation and analysis, XM and HJL; data verification, QZL and QYW; investigation, TH; resources, QZL; data curation, QZL; writing, original draft, SH and QZL; writing, review and editing, SH and QZL; project administration, SH; funding acquisition, SH.

Corresponding author

Correspondence to Xu Ma.

Ethics declarations

Conflicts of interest

The authors have no conflicts of interest to disclose.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Hao, S., Liu, Z., Ma, X. et al. RCSLFNet: a novel real-time pedestrian detection network based on re-parameterized convolution and channel-spatial location fusion attention for low-resolution infrared image. J Real-Time Image Proc 21, 89 (2024). https://doi.org/10.1007/s11554-024-01469-x
