Learning to Segment Objects of Various Sizes in VHR Aerial Images

  • Hao Chen
  • Tianyang Shi
  • Zhenghuan Xia
  • Dunge Liu
  • Xi Wu
  • Zhenwei Shi
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 875)


The goal of semantic segmentation is to assign a semantic category to each pixel in an image. For aerial images, dense labeling results are very important because they can be applied to land use and land change detection. However, it is difficult to label small and large objects correctly at the same time within a single framework. Convolutional neural networks (CNNs) can learn rich features and have achieved state-of-the-art results in image labeling. First, we construct a novel CNN architecture, the Pyramid Atrous Skip Deconvolution Network (PASDNet), which combines features of different levels and scales to learn both small and large objects. Second, we employ a weighted loss function to overcome the class imbalance problem, which improves the overall performance. Our proposed framework outperforms other state-of-the-art methods on a public benchmark.
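The abstract does not say how the weighted loss is defined, so the following is only a minimal sketch of one common choice: a cross-entropy loss in which each pixel's term is scaled by a per-class weight inversely proportional to that class's frequency. The inverse-frequency scheme, the normalization, and the function names `inverse_frequency_weights` and `weighted_cross_entropy` are assumptions made for illustration, not the authors' formulation.

```python
import math
from collections import Counter


def inverse_frequency_weights(labels, num_classes):
    """Per-class weights proportional to 1 / class frequency (assumed scheme)."""
    counts = Counter(labels)
    total = len(labels)
    # Classes absent from the labels get weight 0 so they never contribute.
    raw = [total / counts[c] if counts[c] else 0.0 for c in range(num_classes)]
    # Normalize so that the weights of the classes present average to 1.
    present = [w for w in raw if w > 0]
    scale = len(present) / sum(present)
    return [w * scale for w in raw]


def weighted_cross_entropy(probs, labels, weights):
    """Weighted negative log-likelihood, averaged over the number of pixels.

    probs:   per-pixel lists of class probabilities (each sums to 1)
    labels:  per-pixel ground-truth class indices
    weights: per-class weights
    """
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for p, y in zip(probs, labels):
        total += -weights[y] * math.log(p[y] + eps)
    return total / len(labels)


# Toy example: 8 "pixels"; class 1 (standing in for a small object) is rare.
labels = [0, 0, 0, 0, 0, 0, 1, 1]
weights = inverse_frequency_weights(labels, num_classes=2)
probs = [[0.9, 0.1]] * 6 + [[0.6, 0.4]] * 2
loss = weighted_cross_entropy(probs, labels, weights)
unweighted = weighted_cross_entropy(probs, labels, [1.0, 1.0])
```

In this toy case the rare class receives the larger weight, so the poorly predicted rare pixels contribute more to `loss` than they do to `unweighted`. That is the intended effect of weighting against class imbalance.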


Convolutional neural networks (CNNs) · Semantic segmentation · Very high resolution aerial images



The work was supported by the National Key R&D Program of China under Grant 2017YFC1405600, the National Natural Science Foundation of China under Grant 61671037, and the Open Research Fund of State Key Laboratory of Space-Ground Integrated Information Technology under Grant No. 2016_SGIIT_KFJJ_YG_03.



Copyright information

© Springer Nature Singapore Pte Ltd. 2018

Authors and Affiliations

  1. Image Processing Center, School of Astronautics, Beihang University, Beijing, China
