Accurate On-Road Vehicle Detection with Deep Fully Convolutional Networks

Jie, Zequn; Lu, Wen Feng; Tay, Eng Hock Francis

doi:10.1007/978-3-319-41920-6_50

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9729)

Included in the following conference series:

International Conference on Machine Learning and Data Mining in Pattern Recognition

Abstract

Vision-based on-road vehicle detection is one of the key problems for autonomous vehicles. Conventional vision-based on-road vehicle detection methods rely mainly on hand-crafted features such as SIFT and HOG. These features normally require expensive human labor and expert knowledge to design, and they suffer from poor generalization and slow running speed. They are therefore difficult to apply in realistic applications, which demand accurate and fast detection under all kinds of unpredictable, complex environmental conditions. This paper presents a framework that uses fully convolutional networks (FCN) to produce bounding boxes with a high confidence of containing a vehicle, followed by SVM-based bounding-box location refinement to further improve localization accuracy. Experiments on the PASCAL VOC 2007 and LISA-Q benchmarks show that using the high-level semantic vehicle confidence obtained by the FCN yields higher precision and recall. Additionally, the FCN enables whole-image inference, which makes the proposed method much faster than detectors based on object proposals or hand-crafted features.
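
The abstract does not say how bounding boxes are derived from the FCN output, so the following is only an illustrative sketch, not the authors' method. It assumes the network yields a dense per-pixel vehicle-confidence map, and it swaps in a simple stand-in technique: threshold the map and take the bounding box of each 4-connected region. The function name `boxes_from_confidence`, the `threshold` parameter, and the toy map are all hypothetical.

```python
from collections import deque


def boxes_from_confidence(conf_map, threshold=0.5):
    """Return (x_min, y_min, x_max, y_max, mean_conf) for each 4-connected
    region whose per-pixel confidence is >= threshold.

    Hypothetical helper: a stand-in for whatever box-generation step a
    real detector would use on top of a dense confidence map.
    """
    height = len(conf_map)
    width = len(conf_map[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for y0 in range(height):
        for x0 in range(width):
            if seen[y0][x0] or conf_map[y0][x0] < threshold:
                continue
            # Breadth-first flood fill over one connected region.
            queue = deque([(y0, x0)])
            seen[y0][x0] = True
            x_min = x_max = x0
            y_min = y_max = y0
            total, count = 0.0, 0
            while queue:
                y, x = queue.popleft()
                total += conf_map[y][x]
                count += 1
                x_min, x_max = min(x_min, x), max(x_max, x)
                y_min, y_max = min(y_min, y), max(y_max, y)
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and not seen[ny][nx]
                            and conf_map[ny][nx] >= threshold):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((x_min, y_min, x_max, y_max, total / count))
    return boxes


# Toy 4x5 confidence map (made-up values) with two high-confidence regions.
toy_map = [
    [0.1, 0.1, 0.1, 0.1, 0.1],
    [0.1, 0.9, 0.8, 0.1, 0.1],
    [0.1, 0.7, 0.9, 0.1, 0.6],
    [0.1, 0.1, 0.1, 0.1, 0.7],
]
candidates = boxes_from_confidence(toy_map, threshold=0.5)
```

Because the confidence map is computed for the whole image in one pass, candidate boxes come from a single scan of that map rather than from scoring many separate windows, which is the kind of saving the abstract attributes to whole-image inference. A refinement stage, such as the SVM step the abstract mentions, would then operate on these candidates.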

Author information

Authors and Affiliations

Department of Mechanical Engineering, National University of Singapore, Singapore, Singapore
Zequn Jie, Wen Feng Lu & Eng Hock Francis Tay

Corresponding author

Correspondence to Zequn Jie.

Editor information

Editors and Affiliations

IBaI, Inst of Comp Vision and applied Comp Sci, Leipzig, Sachsen, Germany
Petra Perner

About this paper

Cite this paper

Jie, Z., Lu, W.F., Tay, E.H.F. (2016). Accurate On-Road Vehicle Detection with Deep Fully Convolutional Networks. In: Perner, P. (ed.) Machine Learning and Data Mining in Pattern Recognition. MLDM 2016. Lecture Notes in Computer Science, vol 9729. Springer, Cham. https://doi.org/10.1007/978-3-319-41920-6_50

DOI: https://doi.org/10.1007/978-3-319-41920-6_50
Published: 28 June 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-41919-0
Online ISBN: 978-3-319-41920-6
eBook Packages: Computer Science, Computer Science (R0)