Vehicle detection from high-resolution aerial images using spatial pyramid pooling-based deep convolutional neural networks

Multimedia Tools and Applications

Abstract

In recent years, vehicle detection in aerial images captured by unmanned aerial vehicles (UAVs) has become a research focus in image processing as UAV-borne remote sensing platforms have rapidly become widespread. This study proposes a detection algorithm using a deep convolutional neural network (DCNN) based on multi-scale spatial pyramid pooling (SPP). The multi-scale SPP model pools feature maps of different sizes into feature vectors of a fixed length. This avoids the deformation caused by stretching or cropping input images of different sizes, and thus improves detection performance. In addition, an image pre-processing algorithm based on the maximum normed gradient (NG) with multiple thresholds is proposed, which restores the edges of objects disturbed by environmental clutter. The proposed candidate object extraction algorithm, based on the maximum binarized NG, also requires less computation because it generates fewer candidate windows. Experimental results indicate that the multi-scale SPP-based DCNN adapts better to input images of different sizes and learns the multi-scale characteristics of objects, further improving detection performance.
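The fixed-length property of SPP described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function name `spatial_pyramid_pool`, the pyramid levels `(1, 2, 4)`, and the use of max pooling are assumptions chosen for illustration. The point is only that feature maps of different spatial sizes yield vectors of the same length.

```python
import numpy as np


def spatial_pyramid_pool(feature_map, levels=(1, 2, 4)):
    """Max-pool a (C, H, W) feature map into a fixed-length vector.

    `levels` is an assumed pyramid configuration: each level n splits
    the map into an n x n grid and keeps one maximum per bin and channel.
    """
    channels, height, width = feature_map.shape
    pooled = []
    for n in levels:
        # Bin edges that cover the whole map regardless of its size.
        row_edges = np.linspace(0, height, n + 1)
        col_edges = np.linspace(0, width, n + 1)
        for i in range(n):
            r0 = int(np.floor(row_edges[i]))
            r1 = max(int(np.ceil(row_edges[i + 1])), r0 + 1)
            for j in range(n):
                c0 = int(np.floor(col_edges[j]))
                c1 = max(int(np.ceil(col_edges[j + 1])), c0 + 1)
                # One value per channel for this bin.
                pooled.append(feature_map[:, r0:r1, c0:c1].max(axis=(1, 2)))
    # Length is C * sum(n * n for n in levels), independent of H and W.
    return np.concatenate(pooled)


# Two feature maps with different spatial sizes but the same channel count.
small = np.random.rand(8, 13, 17)
large = np.random.rand(8, 30, 22)
vec_small = spatial_pyramid_pool(small)
vec_large = spatial_pyramid_pool(large)
```

With 8 channels and levels `(1, 2, 4)`, both vectors have 8 × (1 + 4 + 16) = 168 entries, so a fully connected layer of fixed input size can follow without stretching or cropping the input.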

Author information

Corresponding author

Correspondence to Shilei Sun.

Cite this article

Qu, T., Zhang, Q. & Sun, S. Vehicle detection from high-resolution aerial images using spatial pyramid pooling-based deep convolutional neural networks. Multimed Tools Appl 76, 21651–21663 (2017). https://doi.org/10.1007/s11042-016-4043-5

  • DOI: https://doi.org/10.1007/s11042-016-4043-5
