Abstract
Multi-target detection based on corner pooling offers a distinctive anchor-free framework that has been widely applied in intelligent transportation systems. To detect small, distant vehicles effectively, we propose an improved detection network termed corner pooling with attention mechanism (CPAM). A newly devised network, Hourglass with Coordinate Attention (Hourglass-CA), is proposed as an alternative to the Hourglass-104 backbone; it incorporates a multi-level attention mechanism to make feature extraction more efficient. Additionally, a novel multi-level attention loss (MLA loss) is presented, which dynamically adjusts the offsets during feature extraction. Experimental results show that CPAM achieves lightweight detection, reducing the parameter count from 201M to 117M while raising the frame rate from 4.2 to 16.1 FPS. Moreover, the AP reaches 51.6%, surpassing several existing detectors.
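For readers unfamiliar with the corner pooling operation that the abstract builds on, the following is a minimal sketch of top-left corner pooling as it is commonly described for CornerNet-style detectors. It is not the CPAM implementation: the function names, the use of plain nested lists, and the toy input are all assumptions made for illustration, and the attention and MLA loss components are not modeled.

```python
# Illustrative sketch only: top-left corner pooling on small 2D feature maps.
# All names here are hypothetical and not taken from the paper.

def top_pool(fmap):
    """For each cell, take the max over that cell and every cell below it."""
    out = [row[:] for row in fmap]
    # Sweep from the second-to-last row upward, carrying the running max.
    for i in range(len(out) - 2, -1, -1):
        for j in range(len(out[i])):
            out[i][j] = max(out[i][j], out[i + 1][j])
    return out

def left_pool(fmap):
    """For each cell, take the max over that cell and every cell to its right."""
    out = [row[:] for row in fmap]
    for row in out:
        # Sweep from the second-to-last column leftward.
        for j in range(len(row) - 2, -1, -1):
            row[j] = max(row[j], row[j + 1])
    return out

def top_left_corner_pool(f_t, f_l):
    """Sum the top-pooled and left-pooled maps element-wise."""
    t = top_pool(f_t)
    l = left_pool(f_l)
    return [[a + b for a, b in zip(rt, rl)] for rt, rl in zip(t, l)]

# Toy 3x3 feature map, used for both inputs to keep the example short.
feature = [
    [0, 1, 0],
    [2, 0, 0],
    [0, 0, 3],
]
pooled = top_left_corner_pool(feature, feature)
print(pooled)
```

Each output cell combines the strongest response found below it with the strongest response found to its right, which is why a top-left corner location can respond even when the object's boundary evidence lies elsewhere in the same row and column.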
Data Availability
The datasets generated during and/or analyzed during the current study are available from the corresponding author on reasonable request.
Acknowledgements
This work is supported by the National Natural Science Foundation of China (Grant Nos. 52171292, 51939001) and the Outstanding Young Talent Program of Dalian (Grant No. 2022RJ05).
Author information
Contributions
Conceptualization: Li-Ying Hao; Methodology: Jia-Rui Yang; Writing - original draft preparation: Jian Zhang; Writing - review and editing: Yunze Zhang.
Ethics declarations
Conflict of Interest
The authors declare that they have no conflict of interest.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Hao, LY., Yang, JR., Zhang, Y. et al. Multi-target vehicle detection based on corner pooling with attention mechanism. Appl Intell 53, 29128–29139 (2023). https://doi.org/10.1007/s10489-023-05084-4
DOI: https://doi.org/10.1007/s10489-023-05084-4