Feature Aggregation Network for Building Extraction from High-Resolution Remote Sensing Images

  • Conference paper
PRICAI 2023: Trends in Artificial Intelligence (PRICAI 2023)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 14327)

Abstract

Rapid advances in high-resolution satellite remote sensing data acquisition, particularly of data with sub-meter precision, have opened up the potential for detailed extraction of surface architectural features. However, because surface distributions are diverse and complex, current methods frequently focus exclusively on local information about surface features. This often results in significant intra-class variability, both in boundary recognition and between buildings. Fine-grained extraction of surface features from high-resolution satellite imagery has therefore emerged as a critical challenge in remote sensing image processing. In this work, we propose the Feature Aggregation Network (FANet), which extracts both global and local features to enable refined extraction of landmark buildings from high-resolution satellite remote sensing imagery. The Pyramid Vision Transformer captures the global features, which are subsequently refined by the Feature Aggregation Module and merged into a cohesive representation by the Difference Elimination Module. In addition, to ensure a comprehensive feature map, we incorporate the Receptive Field Block and the Dual Attention Module, which expand the receptive field and intensify attention across the spatial and channel dimensions. Extensive experiments on multiple datasets validate the capability of FANet in extracting features from high-resolution satellite images. We consider this a significant advance in remote sensing image processing. We will release our code soon.
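
To make the idea of attention across the channel and spatial dimensions concrete, the following is a minimal, parameter-free sketch in plain Python. It is not the FANet Dual Attention Module: the function names (`channel_attention`, `spatial_attention`, `dual_attention`), the use of mean pooling followed by a sigmoid gate, and the order in which the two gates are applied are all assumptions made for illustration. Attention modules in practice typically use learned convolutions or small MLPs rather than fixed pooling.

```python
import math


def sigmoid(x):
    """Logistic function mapping any real value into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def channel_attention(fmap):
    """Scale each channel by a gate derived from its global average.

    fmap is a nested list indexed as [channel][row][col].
    """
    out = []
    for channel in fmap:
        values = [v for row in channel for v in row]
        gate = sigmoid(sum(values) / len(values))
        out.append([[v * gate for v in row] for row in channel])
    return out


def spatial_attention(fmap):
    """Scale each pixel by a gate derived from its mean across channels."""
    n_ch = len(fmap)
    height, width = len(fmap[0]), len(fmap[0][0])
    gates = [
        [sigmoid(sum(fmap[c][i][j] for c in range(n_ch)) / n_ch)
         for j in range(width)]
        for i in range(height)
    ]
    return [
        [[fmap[c][i][j] * gates[i][j] for j in range(width)]
         for i in range(height)]
        for c in range(n_ch)
    ]


def dual_attention(fmap):
    """Apply the channel gate first, then the spatial gate (assumed order)."""
    return spatial_attention(channel_attention(fmap))


# Toy feature map: 2 channels, each 2x2. The second channel is all zeros.
features = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[0.0, 0.0], [0.0, 0.0]],
]
refined = dual_attention(features)
```

In this toy example the output keeps the input shape, every gate lies strictly between 0 and 1, and positions with stronger responses across channels are attenuated less than weaker ones.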

X. Zhou and X. Wei contributed equally.

Author information

Corresponding authors

Correspondence to Xuan Zhou or Xuefeng Wei.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Zhou, X., Wei, X. (2024). Feature Aggregation Network for Building Extraction from High-Resolution Remote Sensing Images. In: Liu, F., Sadanandan, A.A., Pham, D.N., Mursanto, P., Lukose, D. (eds) PRICAI 2023: Trends in Artificial Intelligence. PRICAI 2023. Lecture Notes in Computer Science, vol 14327. Springer, Singapore. https://doi.org/10.1007/978-981-99-7025-4_9

  • DOI: https://doi.org/10.1007/978-981-99-7025-4_9

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-99-7024-7

  • Online ISBN: 978-981-99-7025-4

  • eBook Packages: Computer Science, Computer Science (R0)
