Abstract
Deep learning has made great progress in generic object detection for autonomous driving, yet detecting small road hazards at long distance remains challenging owing to the lack of large-scale small-object datasets and dedicated methods. This work addresses the challenge from two aspects. First, a self-collected long-distance road object dataset (TJ-LDRO) is introduced, which consists of 109,337 images and is the largest dataset to date for small road object detection research. Second, a vanishing-point-guided context-aware network (VCANet) is proposed, which uses a vanishing point prediction block and a context-aware center detection block to obtain semantic information. A multi-scale feature fusion pipeline and an upsampling block are introduced in VCANet to enhance region of interest (ROI) features. Experimental results on the TJ-LDRO dataset show that the proposed method achieves better performance than representative generic object detection methods. This work fills a critical capability gap in small road hazard detection for high-speed autonomous vehicles.
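To make the idea of multi-scale feature fusion with upsampling concrete, the following is a minimal sketch in plain Python. It is not the VCANet implementation: the abstract does not specify how the fusion pipeline or the upsampling block are built, so the nearest-neighbor upsampling, the element-wise addition, and the names `upsample_nearest` and `fuse` are all assumptions chosen for illustration. The sketch only shows the generic pattern of enlarging a coarse feature map and merging it with a finer one.

```python
from typing import List

FeatureMap = List[List[float]]


def upsample_nearest(fmap: FeatureMap, factor: int) -> FeatureMap:
    """Enlarge a 2D map by repeating each value factor x factor times.

    Nearest-neighbor upsampling is an assumption made for this sketch;
    the source does not state which upsampling operator VCANet uses.
    """
    out: FeatureMap = []
    for row in fmap:
        # Repeat every value horizontally.
        wide_row = [value for value in row for _ in range(factor)]
        # Repeat the widened row vertically.
        for _ in range(factor):
            out.append(list(wide_row))
    return out


def fuse(fine: FeatureMap, coarse: FeatureMap) -> FeatureMap:
    """Upsample the coarse map to the fine map's size and add element-wise.

    Element-wise addition is a hypothetical fusion rule used only here.
    """
    if not fine or not coarse:
        raise ValueError("feature maps must be non-empty")
    factor = len(fine) // len(coarse)
    same_height = factor >= 1 and len(coarse) * factor == len(fine)
    same_width = factor >= 1 and len(coarse[0]) * factor == len(fine[0])
    if not (same_height and same_width):
        raise ValueError("fine size must be an integer multiple of coarse size")
    upsampled = upsample_nearest(coarse, factor)
    return [
        [f + c for f, c in zip(fine_row, up_row)]
        for fine_row, up_row in zip(fine, upsampled)
    ]


if __name__ == "__main__":
    # A 4x4 high-resolution map with one strong response (a small object)
    # and a 2x2 low-resolution map carrying coarser context.
    fine_map = [
        [0.0, 0.0, 0.0, 0.0],
        [0.0, 1.0, 0.0, 0.0],
        [0.0, 0.0, 0.0, 0.0],
        [0.0, 0.0, 0.0, 0.0],
    ]
    coarse_map = [
        [0.5, 0.0],
        [0.0, 0.0],
    ]
    for fused_row in fuse(fine_map, coarse_map):
        print(fused_row)
```

In this toy example the single strong response in the fine map is reinforced by the coarse context that covers the same image region, which is the general motivation for combining scales when objects occupy only a few pixels.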
Abbreviations
- ROI: Region of interest
- TJ-LDRO: Tongji long-distance road object
- VCANet: Vanishing-point-guided context-aware network
- VPT: Vanishing point
Acknowledgements
This research has received funding from the National Natural Science Foundation of China (No. 61906138), the National Key Research and Development Program of China (No. 2016YFB0100901), the Shanghai AI Innovative Development Project 2018, and the Shanghai Rising Star Program (No. 21QC1400900). We would like to thank Mingyuan Chen for supporting the collection of small objects during the development of the TJ-LDRO dataset.
Ethics declarations
Conflict of interest
On behalf of all authors, the corresponding author states that there is no conflict of interest.
About this article
Cite this article
Chen, G., Chen, K., Zhang, L. et al. VCANet: Vanishing-Point-Guided Context-Aware Network for Small Road Object Detection. Automot. Innov. 4, 400–412 (2021). https://doi.org/10.1007/s42154-021-00157-x
DOI: https://doi.org/10.1007/s42154-021-00157-x