Feature Aggregation Network for Building Extraction from High-Resolution Remote Sensing Images

Zhou, Xuan; Wei, Xuefeng

doi:10.1007/978-981-99-7025-4_9

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 14327)

Included in the following conference series:

Pacific Rim International Conference on Artificial Intelligence

Abstract

Rapid advances in high-resolution satellite remote sensing, particularly in imagery with sub-meter precision, have opened up the potential for detailed extraction of architectural surface features. However, because surface distributions are diverse and complex, current methods frequently attend only to the local information of surface features. This often results in large intra-class variability, both between buildings and in the recognition of their boundaries. Fine-grained extraction of surface features from high-resolution satellite imagery has therefore emerged as a critical challenge in remote sensing image processing. In this work, we propose the Feature Aggregation Network (FANet), which extracts both global and local features to enable refined extraction of landmark buildings from high-resolution satellite remote sensing imagery. The Pyramid Vision Transformer captures the global features, which are subsequently refined by the Feature Aggregation Module and merged into a cohesive representation by the Difference Elimination Module. In addition, to ensure a comprehensive feature map, we incorporate the Receptive Field Block and the Dual Attention Module, which expand the receptive field and intensify attention across the spatial and channel dimensions. Extensive experiments on multiple datasets validate FANet's capability to extract features from high-resolution satellite images, which we regard as a significant advance in remote sensing image processing. We will release our code soon.

X. Zhou and X. Wei—Equal contribution
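
To make the phrase "attention across the spatial and channel dimensions" concrete, the following is a minimal sketch of gating a feature map first per channel and then per pixel. It is not the authors' Dual Attention Module, whose internals the abstract does not specify: the function names (`channel_gate`, `spatial_gate`, `dual_gate`), the use of plain averaging followed by a sigmoid, and the order of the two gates are all assumptions chosen for illustration, and a real module would use learned weights.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def channel_gate(feat):
    """Scale each channel by a sigmoid of its global average.

    feat is a list of C channels, each an H x W list of lists.
    Hypothetical stand-in for learned channel attention.
    """
    gated = []
    for channel in feat:
        count = sum(len(row) for row in channel)
        mean = sum(sum(row) for row in channel) / count
        g = sigmoid(mean)
        gated.append([[v * g for v in row] for row in channel])
    return gated


def spatial_gate(feat):
    """Scale each pixel by a sigmoid of its mean across channels.

    Hypothetical stand-in for learned spatial attention.
    """
    c = len(feat)
    h, w = len(feat[0]), len(feat[0][0])
    gate = [
        [sigmoid(sum(feat[k][i][j] for k in range(c)) / c) for j in range(w)]
        for i in range(h)
    ]
    return [
        [[feat[k][i][j] * gate[i][j] for j in range(w)] for i in range(h)]
        for k in range(c)
    ]


def dual_gate(feat):
    """Apply the channel gate, then the spatial gate (assumed order)."""
    return spatial_gate(channel_gate(feat))


if __name__ == "__main__":
    # Two channels of a 2 x 2 feature map; the second channel is stronger.
    features = [
        [[0.0, 0.5], [0.5, 0.0]],
        [[2.0, 2.0], [2.0, 2.0]],
    ]
    for channel in dual_gate(features):
        print(channel)
```

Because both gates lie strictly between 0 and 1, this sketch can only attenuate activations; it preserves the shape of the feature map and leaves zero activations at zero.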

Author information

Authors and Affiliations

Institut Polytechnique de Paris, Rte de Saclay, 91120, Palaiseau, France
Xuan Zhou & Xuefeng Wei

Authors

Xuan Zhou
Xuefeng Wei

Corresponding authors

Correspondence to Xuan Zhou or Xuefeng Wei.

Editor information

Editors and Affiliations

Tsinghua University, Beijing, China
Fenrong Liu
SEEK Limited, Cremorne, NSW, Australia
Arun Anand Sadanandan
MIMOS Berhad, Kuala Lumpur, Malaysia
Duc Nghia Pham
Universitas Indonesia, Depok, Indonesia
Petrus Mursanto
Tabcorp Holdings Limited, Melbourne, VIC, Australia
Dickson Lukose

About this paper

Cite this paper

Zhou, X., Wei, X. (2024). Feature Aggregation Network for Building Extraction from High-Resolution Remote Sensing Images. In: Liu, F., Sadanandan, A.A., Pham, D.N., Mursanto, P., Lukose, D. (eds) PRICAI 2023: Trends in Artificial Intelligence. PRICAI 2023. Lecture Notes in Computer Science, vol 14327. Springer, Singapore. https://doi.org/10.1007/978-981-99-7025-4_9

DOI: https://doi.org/10.1007/978-981-99-7025-4_9
Published: 10 November 2023
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-7024-7
Online ISBN: 978-981-99-7025-4
eBook Packages: Computer Science, Computer Science (R0)
