Application Research of YOLOv3 Incorporating Self-attention Mechanism

Zhang, Jiaxin; Jia, Yinshan; Yu, Hongfei

doi:10.1007/978-981-19-3927-3_45

Part of the book series: Lecture Notes in Electrical Engineering (LNEE, volume 920)

Included in the following conference series:

International Conference on Computing, Control and Industrial Engineering

Abstract

Improving the accuracy of target recognition is a constant focus of machine learning research. The multi-scale detection of YOLOv3 increases speed while preserving accuracy, and the self-attention mechanism weighs the attention of each pixel feature to strengthen information extraction. Targets in a detection task often differ greatly in size, which makes multi-size targets hard to detect effectively. Motivated by this problem and by the recent publication of the self-attention network COT.NET (the Contextual Transformer network, CoTNet), this paper proposes combining Darknet-53, the backbone network of YOLOv3, with COT.NET by adding a self-attention mechanism to the residual structure of YOLOv3. In verification on the VOC image set, the improved YOLOv3 achieves an mAP 1.34% higher than the original YOLOv3. The experimental results show that integrating the self-attention mechanism into YOLOv3 can improve the accuracy of image recognition.
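The idea of adding self-attention to a residual structure can be illustrated with a small sketch. The code below is not the authors' network and is not the COT.NET block: it assumes plain dot-product self-attention over pixel feature vectors, and the function names and the projection matrices `w_q`, `w_k`, `w_v` are hypothetical. It only shows how an attention branch can sit behind a skip connection, so that each output pixel is its input plus an attention-weighted mix of all pixels.

```python
import math


def matvec(matrix, vec):
    """Multiply a (rows x cols) matrix by a length-cols vector."""
    return [sum(m * v for m, v in zip(row, vec)) for row in matrix]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens, w_q, w_k, w_v):
    """Plain dot-product self-attention (an assumption, not COT.NET).

    tokens: one feature vector of length C per pixel position.
    w_q, w_k, w_v: C x C projection matrices (hypothetical parameters).
    """
    queries = [matvec(w_q, t) for t in tokens]
    keys = [matvec(w_k, t) for t in tokens]
    values = [matvec(w_v, t) for t in tokens]
    scale = math.sqrt(len(tokens[0]))
    out = []
    for q in queries:
        # Attention weight of every pixel with respect to this query pixel.
        weights = softmax(
            [sum(a * b for a, b in zip(q, k)) / scale for k in keys]
        )
        # Weighted sum of the value vectors, channel by channel.
        out.append(
            [sum(w * v[c] for w, v in zip(weights, values))
             for c in range(len(values[0]))]
        )
    return out


def residual_attention_block(tokens, w_q, w_k, w_v):
    """Residual unit: the input plus the attention branch (skip connection)."""
    attended = self_attention(tokens, w_q, w_k, w_v)
    return [[x + a for x, a in zip(t, att)]
            for t, att in zip(tokens, attended)]
```

In a real detector the attention branch would operate on convolutional feature maps and would be trained jointly with the rest of the backbone. The abstract does not state where in the residual unit the authors place it, so this sketch simply replaces the whole branch.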

Author information

Authors and Affiliations

School of Computer and Communication Engineering, Liaoning Petrochemical University, Fushun, 113001, Liaoning, China
Jiaxin Zhang, Yinshan Jia & Hongfei Yu

Corresponding author

Correspondence to Yinshan Jia.

Editor information

Editors and Affiliations

Universidad de Guanajuato, Guanajuato, Mexico
Yuriy S. Shmaliy
Ain Shams University, Cairo, Egypt
Abdelhalim Abdelnaby Zekry

About this paper

Cite this paper

Zhang, J., Jia, Y., Yu, H. (2022). Application Research of YOLOv3 Incorporating Self-attention Mechanism. In: Shmaliy, Y.S., Abdelnaby Zekry, A. (eds) 6th International Technical Conference on Advances in Computing, Control and Industrial Engineering (CCIE 2021). CCIE 2021. Lecture Notes in Electrical Engineering, vol 920. Springer, Singapore. https://doi.org/10.1007/978-981-19-3927-3_45

DOI: https://doi.org/10.1007/978-981-19-3927-3_45
Published: 06 July 2022
Publisher Name: Springer, Singapore
Print ISBN: 978-981-19-3926-6
Online ISBN: 978-981-19-3927-3
eBook Packages: Engineering, Engineering (R0)
