Deep convolutional encoder–decoder networks based on ensemble learning for semantic segmentation of high-resolution aerial imagery

  • Regular Paper
  • CCF Transactions on High Performance Computing

Abstract

Because of the complexity of object information and optical conditions in high-resolution aerial imagery, fine-grained semantic segmentation is difficult to achieve. Although various deep neural network structures have been proposed to improve segmentation accuracy, there is still room for improvement by making full use of multiscale features and by integrating single weak classifiers into a strong classifier. In this paper, we use a reduced SegNet network to realize end-to-end classification of high-resolution aerial images. In addition, to exploit multiscale information, we present R-SegUnet, which combines the feature information of each convolution block in the reduced SegNet encoding network with the feature information of the corresponding convolution block in the decoding network. Furthermore, because the surface features in high-resolution aerial images are very complex, we investigate 6to2_Net, which converts the six-class model into six binary classification models to improve the recognition of small objects. Finally, we ensemble these three models to obtain the segmentation results. Experimental results on the ISPRS Potsdam benchmark dataset show that our algorithm achieves state-of-the-art performance. We also analyze the inference performance of our models on a variety of parallel computing devices.
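The two data-level ideas in the abstract, splitting a six-class problem into six binary problems and combining several models into one prediction, can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the abstract does not state how the three models are fused, so averaging per-pixel class probabilities is an assumption here, and the names `to_binary_masks` and `ensemble_argmax` are hypothetical.

```python
from typing import List

ProbMap = List[List[List[float]]]  # [row][col][class] per-pixel class probabilities
LabelMap = List[List[int]]         # [row][col] integer class ids


def to_binary_masks(labels: LabelMap, num_classes: int = 6) -> List[LabelMap]:
    """Split one multi-class label map into num_classes one-vs-rest masks.

    Mask c holds 1 where the pixel belongs to class c and 0 elsewhere,
    which is the kind of target a per-class binary model would train on.
    """
    return [
        [[1 if pixel == c else 0 for pixel in row] for row in labels]
        for c in range(num_classes)
    ]


def ensemble_argmax(prob_maps: List[ProbMap]) -> LabelMap:
    """Average per-pixel class probabilities over models, then pick the top class.

    Assumption: simple unweighted averaging; the source does not specify
    the fusion rule used in the paper.
    """
    n_models = len(prob_maps)
    rows, cols = len(prob_maps[0]), len(prob_maps[0][0])
    n_classes = len(prob_maps[0][0][0])
    result: LabelMap = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            # Mean probability of each class at this pixel across all models.
            mean = [
                sum(m[r][c][k] for m in prob_maps) / n_models
                for k in range(n_classes)
            ]
            out_row.append(max(range(n_classes), key=mean.__getitem__))
        result.append(out_row)
    return result
```

In this sketch, a model that is confidently wrong on a pixel can be outvoted by two models that moderately agree with each other, which is the usual motivation for combining weak classifiers into a stronger one.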

Data availability

The data that support the findings of this study are available upon request from the authors.

Notes

  1. https://www.isprs.org/education/benchmarks/UrbanSemLab/semantic-labeling.aspx.

Funding

This work was supported in part by the Key Research and Development Program of Shaanxi under Grant 2022ZDLGY01-09, and in part by the GHfund A under Grant 202107014474 and Grant 202202036165.

Author information

Corresponding author

Correspondence to Huming Zhu.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Zhu, H., Liu, C., Li, Q. et al. Deep convolutional encoder–decoder networks based on ensemble learning for semantic segmentation of high-resolution aerial imagery. CCF Trans. HPC (2024). https://doi.org/10.1007/s42514-024-00184-0

  • DOI: https://doi.org/10.1007/s42514-024-00184-0