Abstract
The key challenges for a roadside sensing system (RSS) are achieving accurate and real-time sharing of over-the-horizon perception information. This study proposes a novel and efficient framework for multi-object detection from the roadside perspective. First, a MobileNet-based backbone, whose parameters are obtained through network architecture search (NAS), is developed to increase forward-inference speed; compared with other backbones, it offers superior accuracy and speed. Second, an optimization method based on the coordinate attention mechanism is developed to strengthen the network's long-range dependence on spatial information. Third, the conventional convolution operation in the attention mechanism is replaced with depthwise over-parameterized convolution (DO-Conv) to improve feature extraction in the high-dimensional feature space. Finally, a lightweight single-stage multi-object detection model for the roadside perspective, DCM3-YOLOv4, is developed. Test results show that DCM3-YOLOv4 achieves a mean average precision (mAP) of 0.930 on the RS-UA dataset with a model size of 31.12 million parameters. Its inference time is 96.13 ms, faster than the baseline models on the same platform. The proposed methods can be applied in a wide range of scenarios where the accuracy and speed requirements of an RSS must be met.
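The coordinate attention mechanism referred to above factorizes global pooling into two direction-aware 1-D pools, one along the height axis and one along the width axis, so the resulting attention maps retain positional information in each direction. The following is a minimal NumPy sketch of that pooling-and-reweighting idea only; the learned 1×1 convolutions and channel reduction of the full coordinate attention block (Hou et al., 2021) are deliberately omitted, and the function name is illustrative, not from the paper's code.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def coordinate_attention(x):
    """Simplified coordinate attention over a feature map x of shape (C, H, W).

    The global pool is split into two 1-D pools: one along the width axis
    (keeping height-wise position) and one along the height axis (keeping
    width-wise position). The input is then reweighted by both maps.
    """
    a_h = sigmoid(x.mean(axis=2, keepdims=True))  # (C, H, 1): height-aware weights
    a_w = sigmoid(x.mean(axis=1, keepdims=True))  # (C, 1, W): width-aware weights
    return x * a_h * a_w                          # broadcast to (C, H, W)
```

Because each attention map varies along only one spatial axis, long-range dependencies along that axis can be captured at the cost of two 1-D pools rather than a dense spatial attention map.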
Abbreviations
- CA: Coordinate attention
- CNN: Convolutional neural network
- CPU: Central processing unit
- DO-Conv: Depthwise over-parameterized convolution
- DSSD: Deconvolutional single shot detector
- FPNs: Feature pyramid networks
- FPS: Frames per second
- GPU: Graphics processing unit
- HOG: Histogram of oriented gradients
- IDE: Integrated drive electronics
- LiDAR: Light detection and ranging
- mAP: Mean average precision
- PANet: Path aggregation network
- PCA: Principal component analysis
- PHOG: Pyramid histogram of oriented gradients
- RPN: Region proposal network
- RSS: Roadside sensing system
- SE: Squeeze-and-excitation
- SIFT: Scale-invariant feature transform
- SPP: Spatial pyramid pooling
- SSD: Single shot multibox detector
- SSN: Spatial shortcut network
- SURF: Speeded-up robust features
- SVM: Support vector machine
- YOLOv4: You only look once v4
Acknowledgements
This work was supported in part by the National Natural Science Foundation of China (52072333, 52202503), the Science and Technology Project of Hebei Education Department (BJK2023026), and the Hebei Natural Science Foundation (F2022203054).
Ethics declarations
Conflict of interest
On behalf of all the authors, the corresponding author states that there is no conflict of interest.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Guo, B., Wang, H., Jin, L. et al. DCM3-YOLOv4: A Real-Time Multi-Object Detection Framework. Automot. Innov. 7, 283–299 (2024). https://doi.org/10.1007/s42154-023-00258-9