Abstract
Vision-based on-road vehicle detection is one of the key problems for autonomous vehicles. Conventional vision-based on-road vehicle detection methods mainly rely on hand-crafted features, such as SIFT and HOG. These hand-crafted features normally require expensive human labor and expert knowledge. Also, they suffer from poor generalization and slow running speed. Therefore, they are difficult to be applied in realistic application which demands accurate and fast detection in all kinds of unpredictable complex environmental conditions. This paper presents a framework utilizing fully convolutional networks (FCN) to produce bounding boxes with high confidence to contain a vehicle, and bounding box location refinement with SVM to further improve localization accuracy. Experiments on the PASCAL VOC 2007 and LISA-Q benchmarks show that using high-level semantic vehicle confidence obtained by FCN, higher precision and recall are achieved. Additionally, FCN enables whole image inference, which makes the proposed method much faster than the object proposal or hand-crafted feature based detectors.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Preview
Unable to display preview. Download preview PDF.
References
Lowe, D.G.: Distinctive image features from scale-invariant keypoints. International Journal of Computer Vision 60, 91–110 (2004)
Long, J., Shelhamer, E., Darrell, T.: Fully convolutional networks for semantic segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3431–3440 (2015)
Dalal, N., Triggs, B.: Histograms of oriented gradients for human detection. In: IEEE Computer Society Conference on Computer Vision and Pattern Recognition, CVPR 2005, vol. 1, pp. 886–893. IEEE (2005)
Matthews, N., An, P., Charnley, D., Harris, C.: Vehicle detection and recognition in greyscale imagery. Control Engineering Practice 4, 473–479 (1996)
Bertozzi, M., Broggi, A., Castelluccio, S.: A real-time oriented system for vehicle detection. Journal of Systems Architecture 43, 317–325 (1997)
Caraffi, C., VojÃÅ™, T., Trefnỳ, J., Å ochman, J., Matas, J.: A system for real-time detection and tracking of vehicles from a single car-mounted camera. In: 2012 15th International IEEE Conference on Intelligent Transportation Systems (ITSC), pp. 975–982. IEEE (2012)
Jazayeri, A., Cai, H., Zheng, J.Y., Tuceryan, M.: Vehicle detection and tracking in car video based on motion model. IEEE Transactions on Intelligent Transportation Systems 12, 583–595 (2011)
Sivaraman, S., Trivedi, M.M.: A general active-learning framework for on-road vehicle recognition and tracking. IEEE Transactions on Intelligent Transportation Systems 11, 267–276 (2010)
Sivaraman, S., Trivedi, M.M.: Real-time vehicle detection using parts at intersections. In: 2012 15th International IEEE Conference on Intelligent Transportation Systems (ITSC), pp. 1519–1524. IEEE (2012)
Sun, Z., Bebis, G., Miller, R.: Monocular precrash vehicle detection: features and classifiers. IEEE Transactions on Image Processing 15, 2019–2034 (2006)
Choi, J.: Realtime on-Road Vehicle Detection with Optical Flows and Haar-Like Feature Detectors (2012)
Krizhevsky, A., Sutskever, I., Hinton, G.E.: Imagenet classification with deep convolutional neural networks. In: NIPS, pp. 1097–1105 (2012)
Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint (2014). arXiv:1409.1556
Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., Anguelov, D., Erhan, D., Vanhoucke, V., Rabinovich, A.: Going deeper with convolutions. arXiv preprint (2014). arXiv:1409.4842
Deng, J., Dong, W., Socher, R., Li, L.J., Li, K., Fei-Fei, L.: Imagenet: a large-scale hierarchical image database. In: IEEE CVPR, pp. 248–255 (2009)
Krizhevsky, A., Hinton, G.: Learning multiple layers of features from tiny images. Computer Science Department, University of Toronto, Tech. Rep. (2009)
Girshick, R., Donahue, J., Darrell, T., Malik, J.: Rich feature hierarchies for accurate object detection and semantic segmentation. arXiv preprint (2013). arXiv:1311.2524
Oquab, M., Bottou, L., Laptev, I., Sivic, J., et al.: Learning and transferring mid-level image representations using convolutional neural networks. arXiv preprint (2013)
Sermanet, P., Eigen, D., Zhang, X., Mathieu, M., Fergus, R., LeCun, Y.: Overfeat: Integrated recognition, localization and detection using convolutional networks. arXiv preprint (2013). arXiv:1312.6229
Gong, Y., Wang, L., Guo, R., Lazebnik, S.: Multi-scale orderless pooling of deep convolutional activation features. arXiv preprint (2014). arXiv:1403.1840
Pinheiro, P.H., Collobert, R.: Recurrent convolutional neural networks for scene parsing. arXiv preprint (2013). arXiv:1306.2795
Eigen, D., Krishnan, D., Fergus, R.: Restoring an image taken through a window covered with dirt or rain. In: 2013 IEEE International Conference on Computer Vision (ICCV), pp. 633–640. IEEE (2013)
Alexe, B., Deselaers, T., Ferrari, V.: Measuring the objectness of image windows. IEEE Transactions on Pattern Analysis and Machine Intelligence 34, 2189–2202 (2012)
Forsyth, D.A., Malik, J., Fleck, M.M., Greenspan, H., Leung, T., Belongie, S., Carson, C., Bregler, C.: Finding Pictures of Objects in Large Collections of Images. Springer (1996)
Heitz, G., Koller, D.: Learning spatial context: using stuff to find things. In: Forsyth, D., Torr, P., Zisserman, A. (eds.) ECCV 2008, Part I. LNCS, vol. 5302, pp. 30–43. Springer, Heidelberg (2008)
Cheng, M.M., Zhang, Z., Lin, W.Y., Torr, P.: Bing: binarized normed gradients for objectness estimation at 300fps. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 3286–3293. IEEE (2014)
Zitnick, C.L., Dollár, P.: Edge boxes: locating object proposals from edges. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) ECCV 2014, Part V. LNCS, vol. 8693, pp. 391–405. Springer, Heidelberg (2014)
Everingham, M., Van Gool, L., Williams, C.K., Winn, J., Zisserman, A.: The pascal visual object classes (voc) challenge. International Journal of Computer Vision 88, 303–338 (2010)
McCall, J.C., Achler, O., Trivedi, M.M.: Design of an instrumented vehicle test bed for developing a human centered driver support system. In: 2004 IEEE Intelligent Vehicles Symposium, pp. 483–488. IEEE (2004)
Trivedi, M.M., Gandhi, T., McCall, J.: Looking-in and looking-out of a vehicle: Computer-vision-based enhanced vehicle safety. IEEE Transactions on Intelligent Transportation Systems 8, 108–120 (2007)
Russakovsky, O., Deng, J., Su, H., Krause, J., Satheesh, S., Ma, S., Huang, Z., Karpathy, A., Khosla, A., Bernstein, M., et al.: Imagenet large scale visual recognition challenge. International Journal of Computer Vision, 1–42 (2014)
Lin, T.Y., Maire, M., Belongie, S., Bourdev, L., Girshick, R., Hays, J., Perona, P., Ramanan, D., Zitnick, C.L., Dollr, P.: Microsoft coco: Common objects in context. arXiv preprint (2015). arXiv:1506.06204
Felzenszwalb, P.F., Girshick, R.B., McAllester, D., Ramanan, D.: Discriminatively Trained Deformable Part Models, Release 4 (2010)
Jia, Y., Shelhamer, E., Donahue, J., Karayev, S., Long, J., Girshick, R., Guadarrama, S., Darrell, T.: Caffe: Convolutional architecture for fast feature embedding. arXiv preprint (2014). arXiv:1408.5093
Uijlings, J.R., van de Sande, K.E., Gevers, T., Smeulders, A.W.: Selective search for object recognition. International Journal of Computer Vision 104, 154–171 (2013)
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Jie, Z., Lu, W.F., Tay, E.H.F. (2016). Accurate On-Road Vehicle Detection with Deep Fully Convolutional Networks. In: Perner, P. (eds) Machine Learning and Data Mining in Pattern Recognition. MLDM 2016. Lecture Notes in Computer Science(), vol 9729. Springer, Cham. https://doi.org/10.1007/978-3-319-41920-6_50
Download citation
DOI: https://doi.org/10.1007/978-3-319-41920-6_50
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-41919-0
Online ISBN: 978-3-319-41920-6
eBook Packages: Computer ScienceComputer Science (R0)