Abstract
The perception system of an autonomous vehicle relies mainly on object detection algorithms to locate and analyze obstacles. Although object detection has advanced rapidly, balancing real-time inference with high detection accuracy remains challenging in practical driving scenarios. To address this problem, this paper takes YOLOv8n as the baseline model and proposes an object detection network named SES-YOLOv8n. First, the SPPF module is replaced with the SPPCSPC module to further enhance the model's ability to fuse feature maps at different scales. Second, the efficient multi-scale attention (EMA) module is introduced into the C2f module of the backbone network, improving the perception of critical regions and the efficiency of feature extraction. Finally, SPD-Conv modules replace part of the strided convolutions in the backbone, so that downsampling retains feature information more effectively and improves the network's accuracy and learning ability. Experimental results on the KITTI and BDD100K datasets show that the improved model reaches a mean average precision of 92.7% and 41.9%, respectively, 3.4 and 5.0 percentage points higher than the baseline model. The model achieves real-time image processing in general scenes while maintaining high detection accuracy.
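The SPD-Conv building block mentioned above replaces a strided convolution with a space-to-depth rearrangement followed by a non-strided convolution, so that downsampling discards no pixel information. The sketch below is an illustration only, written in plain Python over nested lists rather than the tensor operations a real implementation would use, and it shows only the space-to-depth step (the follow-up convolution is omitted): with a scale of 2, a C×H×W feature map becomes a 4C×(H/2)×(W/2) map.

```python
def space_to_depth(x, scale=2):
    """Rearrange a feature map so spatial detail moves into channels.

    x: nested lists with shape [C][H][W]; H and W must be divisible by scale.
    Returns nested lists with shape [C*scale*scale][H//scale][W//scale].
    """
    c = len(x)
    h, w = len(x[0]), len(x[0][0])
    assert h % scale == 0 and w % scale == 0, "H and W must divide by scale"
    out = []
    for dj in range(scale):          # row offset inside each scale x scale patch
        for di in range(scale):      # column offset inside each patch
            for ch in range(c):
                # Take every scale-th pixel starting at offset (dj, di):
                # one quarter of the map per offset when scale == 2.
                out.append([[x[ch][j][i] for i in range(di, w, scale)]
                            for j in range(dj, h, scale)])
    return out


# A 1-channel 2x2 map becomes four 1x1 channels; no value is dropped.
demo = space_to_depth([[[1, 2], [3, 4]]], scale=2)
```

Because every input value survives the rearrangement, a subsequent stride-1 convolution can learn which details to keep, instead of a stride-2 convolution discarding them up front.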
Data availability
The image dataset used in this research is available online.
Funding
This work was supported by the Natural Science Foundation of Hebei Province (F2021402011), project "Research on Key Technologies of Intelligent Equipment for Mine Powered by Pure Clean Energy".
Author information
Contributions
All authors contributed their insights to the research concept and, after review and discussion, unanimously approved the final manuscript.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Sun, Y., Zhang, Y., Wang, H. et al. SES-YOLOv8n: automatic driving object detection algorithm based on improved YOLOv8. SIViP (2024). https://doi.org/10.1007/s11760-024-03003-9