Abstract
Traffic sign detection is a key component of autonomous driving. Advanced autonomous vehicles equipped with high-quality sensors capture high-definition images for further analysis. Detecting traffic signs, moving vehicles, and lanes is important for localization and decision making. Traffic signs, especially those far from the camera, are small, and so pose a challenge to traditional object detection methods. In this work, to reduce computational cost and improve detection performance, we split the large input image into small blocks and then recognize traffic signs within the blocks using a separate detection module. This paper thus proposes a three-stage traffic sign detector, which connects a BlockNet with an RPN–RCNN detection network. BlockNet, composed of a set of CNN layers, performs block-level foreground detection, making inferences in less than 1 ms. The RPN–RCNN two-stage detector then identifies traffic sign objects in each block; it is trained on a derived dataset named TT100KPatch. Experiments show that our framework achieves both state-of-the-art accuracy and recall; its fastest detection speed is 102 fps.
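The abstract describes splitting a large panorama into small blocks before running the per-block detector. The paper's exact tiling code is not given here; the sketch below illustrates one plausible way to tile an image into overlapping square blocks, where the block size and overlap are illustrative assumptions rather than the authors' actual parameters.

```python
import numpy as np

def split_into_blocks(image, block_size=512, overlap=64):
    """Split a large image into overlapping square blocks.

    Returns a list of (block, (y, x)) pairs, where (y, x) is the
    top-left corner of each block in the original image, so that
    per-block detections can be mapped back to panorama coordinates.
    """
    h, w = image.shape[:2]
    stride = block_size - overlap
    blocks = []
    for y in range(0, max(h - overlap, 1), stride):
        for x in range(0, max(w - overlap, 1), stride):
            # Clamp so the last row/column of blocks stays inside the image.
            y0 = min(y, max(h - block_size, 0))
            x0 = min(x, max(w - block_size, 0))
            block = image[y0:y0 + block_size, x0:x0 + block_size]
            blocks.append((block, (y0, x0)))
    return blocks

# Example: tiling a 2048x2048 panorama into 512x512 blocks with 64 px overlap.
panorama = np.zeros((2048, 2048, 3), dtype=np.uint8)
blocks = split_into_blocks(panorama)
```

The overlap ensures that a sign straddling a block boundary appears whole in at least one block; duplicate detections in overlapping regions would then be merged, e.g., by non-maximum suppression.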
Acknowledgements
We thank all anonymous reviewers for their valuable comments and suggestions. This paper was supported by the National Natural Science Foundation of China (No. 61832016) and Science and Technology Project of Zhejiang Province (No. 2018C01080).
Author information
Yizhi Song is a Ph.D. student in the Department of Computer Science at Purdue University. He received his B.E. degree from the College of Computer Science and Technology at Zhejiang University. His research interests are in the fields of computer vision, computer graphics, and information visualization.
Ruochen Fan is a master's candidate in the Department of Computer Science and Technology, Tsinghua University. He received his bachelor's degree from Beijing University of Posts and Telecommunications in 2016. His research interest is computer vision.
Sharon Huang received her B.E. degree in computer science from Tsinghua University in 1999, and her M.S. and Ph.D. degrees in computer science from Rutgers University in 2001 and 2006, respectively. She is currently an associate professor in the College of Information Sciences and Technology and a co-hire with Huck Institutes of the Life Sciences at Penn State University, USA. Her research interests are in the areas of biomedical image analysis, computer vision, machine learning, and computer graphics, focusing on object recognition, segmentation, registration, matching, real-time tracking, skeletonization, and deformable (non-rigid) model based methods.
Zhe Zhu is a postdoctoral associate at Duke University, working with Dr. Maciej A. Mazurowski. He received his Ph.D. degree from the Department of Computer Science and Technology, Tsinghua University, under the supervision of Prof. Shi-Min Hu, and his B.Sc. degree from the School of Computing, Wuhan University, under the supervision of Profs. Aiguo Yao and Zhiyong Yuan. In 2016, he worked as a research intern with Drs. Brian Price and Scott Cohen at Adobe Research, San Jose. His research interests are in computer graphics, computer vision, and medical imaging.
Ruofeng Tong is a professor in the Department of Computer Science, Zhejiang University, China. He received his B.S. degree from Fudan University, China, in 1991, and Ph.D. degree from Zhejiang University in 1996. His research interests include image and video processing, computer graphics, and computer animation.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made.
The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.
To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
Cite this article
Song, Y., Fan, R., Huang, S. et al. A three-stage real-time detector for traffic signs in large panoramas. Comp. Visual Media 5, 403–416 (2019). https://doi.org/10.1007/s41095-019-0152-1