Abstract
Vehicle detection in aerial images captured by unmanned aerial vehicles (UAVs) has become a research focus in image processing as UAV-borne remote sensing platforms have rapidly become popular. This study proposes a detection algorithm that uses a deep convolutional neural network (DCNN) based on multi-scale spatial pyramid pooling (SPP). The multi-scale SPP layers pool feature maps of different sizes into fixed-length feature vectors. This avoids the deformation caused by stretching or cropping input images of different sizes and thus improves detection performance. In addition, an image pre-processing algorithm based on the maximum normed gradient (NG) with multiple thresholds is proposed, which restores object edges disturbed by environmental clutter. The proposed candidate object extraction algorithm, based on the maximum binarized NG, generates fewer candidate windows and therefore requires less computation. Experimental results indicate that the multi-scale SPP-based DCNN adapts better to input images of different sizes and learns the multi-scale characteristics of objects, further improving detection performance.
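To illustrate how spatial pyramid pooling can turn feature maps of different sizes into vectors of one fixed length, here is a minimal sketch in plain Python. It is not the authors' implementation. The function name `spp_max_pool`, the pyramid levels (1, 2, 4), the use of max pooling, and the single-channel input are all assumptions made for illustration.

```python
def spp_max_pool(feature_map, levels=(1, 2, 4)):
    """Max-pool one 2-D feature map (a list of rows) over an n x n grid
    for every pyramid level n, and concatenate the results.

    The output length is sum(n * n for n in levels), whatever the input size.
    The levels (1, 2, 4) are an assumed configuration, not taken from the paper.
    """
    h, w = len(feature_map), len(feature_map[0])
    pooled = []
    for n in levels:
        for i in range(n):
            # Row range of bin i: floor(i*h/n) up to ceil((i+1)*h/n), never empty.
            r0 = (i * h) // n
            r1 = ((i + 1) * h + n - 1) // n
            for j in range(n):
                # Column range of bin j, computed the same way.
                c0 = (j * w) // n
                c1 = ((j + 1) * w + n - 1) // n
                pooled.append(max(feature_map[r][c]
                                  for r in range(r0, r1)
                                  for c in range(c0, c1)))
    return pooled


# Two feature maps of different sizes (values are arbitrary test data).
small = [[r * 5 + c for c in range(5)] for r in range(4)]     # 4 x 5
large = [[r * 13 + c for c in range(13)] for r in range(9)]   # 9 x 13

vec_small = spp_max_pool(small)
vec_large = spp_max_pool(large)
print(len(vec_small), len(vec_large))  # 21 21
```

Both inputs yield 21 values (1 + 4 + 16 bins), so a fully connected layer placed after the pooling step would see the same input length regardless of image size. In a real network the same pooling would be applied to each channel of the last convolutional layer.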
Cite this article
Qu, T., Zhang, Q. & Sun, S. Vehicle detection from high-resolution aerial images using spatial pyramid pooling-based deep convolutional neural networks. Multimed Tools Appl 76, 21651–21663 (2017). https://doi.org/10.1007/s11042-016-4043-5
DOI: https://doi.org/10.1007/s11042-016-4043-5