VCANet: Vanishing-Point-Guided Context-Aware Network for Small Road Object Detection

Chen, Guang; Chen, Kai; Zhang, Lijun; Zhang, Liming; Knoll, Alois

doi:10.1007/s42154-021-00157-x

Original Article
Published: 13 September 2021

Automotive Innovation, Volume 4, pages 400–412 (2021)

Guang Chen¹,² (ORCID: orcid.org/0000-0002-7416-592X),
Kai Chen¹,
Lijun Zhang¹,
Liming Zhang³ &
Alois Knoll²

Abstract

Deep learning has greatly advanced generic object detection for autonomous driving, yet detecting small road hazards at long distance remains challenging owing to the lack of large-scale small-object datasets and dedicated methods. This work addresses the challenge in two ways. First, a self-collected long-distance road object dataset (TJ-LDRO) is introduced; it consists of 109,337 images and is the largest dataset so far for small road object detection research. Second, a vanishing-point-guided context-aware network (VCANet) is proposed, which uses a vanishing point prediction block and a context-aware center detection block to obtain semantic information. A multi-scale feature fusion pipeline and an upsampling block are introduced in VCANet to enhance region of interest (ROI) features. Experimental results on the TJ-LDRO dataset show that the proposed method outperforms representative generic object detection methods. This work fills a critical capability gap in small road hazard detection for high-speed autonomous vehicles.

Notes

https://github.com/ispc-lab/VCANet

Abbreviations

ROI: Region of interest
TJ-LDRO: Tongji long-distance road object
VCANet: Vanishing-point-guided context-aware network
VPT: Vanishing point

Acknowledgements

This research received funding from the National Natural Science Foundation of China (No. 61906138), the National Key Research and Development Program of China (No. 2016YFB0100901), the Shanghai AI Innovative Development Project 2018, and the Shanghai Rising Star Program (No. 21QC1400900). We thank Mingyuan Chen for supporting the collection of small objects during the development of the TJ-LDRO dataset.

Author information

Authors and Affiliations

Tongji University, Shanghai, China
Guang Chen, Kai Chen & Lijun Zhang
Technical University of Munich, Munich, Germany
Guang Chen & Alois Knoll
Geely Research Institute, Hangzhou, China
Liming Zhang

Corresponding authors

Correspondence to Guang Chen or Lijun Zhang.

Ethics declarations

Conflict of interest

On behalf of all authors, the corresponding author states that there is no conflict of interest.

About this article

Cite this article

Chen, G., Chen, K., Zhang, L. et al. VCANet: Vanishing-Point-Guided Context-Aware Network for Small Road Object Detection. Automot. Innov. 4, 400–412 (2021). https://doi.org/10.1007/s42154-021-00157-x

Received: 22 November 2020
Accepted: 04 June 2021
Published: 13 September 2021
Issue Date: November 2021
DOI: https://doi.org/10.1007/s42154-021-00157-x
