Abstract
Rapid advances in high-resolution satellite remote sensing, particularly data acquisition at sub-meter precision, have opened up the potential for detailed extraction of surface architectural features. However, because surface distributions are diverse and complex, current methods tend to focus exclusively on local information about surface features. This often results in significant intra-class variability, both in boundary recognition and between buildings. Fine-grained extraction of surface features from high-resolution satellite imagery has therefore emerged as a critical challenge in remote sensing image processing. In this work, we propose the Feature Aggregation Network (FANet), which extracts both global and local features and thereby enables refined extraction of landmark buildings from high-resolution satellite remote sensing imagery. The Pyramid Vision Transformer captures the global features, which are subsequently refined by the Feature Aggregation Module and merged into a cohesive representation by the Difference Elimination Module. In addition, to ensure a comprehensive feature map, we incorporate the Receptive Field Block and the Dual Attention Module, which expand the receptive field and strengthen attention across the spatial and channel dimensions. Extensive experiments on multiple datasets validate FANet's capability in extracting features from high-resolution satellite images. We regard this as a significant advance in remote sensing image processing. We will release our code soon.
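To make "attention across spatial and channel dimensions" concrete, here is a minimal, dependency-free sketch of the general idea. It is not FANet's Dual Attention Module: the gates below are parameter-free, whereas practical modules learn them with convolutional or fully connected layers, and the function names (`channel_attention`, `spatial_attention`, `dual_attention`) are hypothetical. A feature map is assumed to be a list of C channels, each an H x W list of lists.

```python
import math


def sigmoid(x):
    """Logistic function, used here as a soft gate in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def channel_attention(feat):
    """Scale each channel by a gate computed from its global average."""
    out = []
    for channel in feat:
        values = [v for row in channel for v in row]
        gate = sigmoid(sum(values) / len(values))  # global average pooling
        out.append([[v * gate for v in row] for row in channel])
    return out


def spatial_attention(feat):
    """Scale each pixel by a gate computed from cross-channel mean and max."""
    height, width = len(feat[0]), len(feat[0][0])
    gates = [[0.0] * width for _ in range(height)]
    for i in range(height):
        for j in range(width):
            column = [channel[i][j] for channel in feat]
            gates[i][j] = sigmoid(sum(column) / len(column) + max(column))
    return [
        [[channel[i][j] * gates[i][j] for j in range(width)]
         for i in range(height)]
        for channel in feat
    ]


def dual_attention(feat):
    """Apply channel gating first, then spatial gating (an assumed order)."""
    return spatial_attention(channel_attention(feat))


# Toy feature map: 2 channels, 2 x 2 pixels each.
toy = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[0.0, -1.0], [0.5, 2.0]],
]
reweighted = dual_attention(toy)
```

Because every gate lies strictly between 0 and 1, the output keeps the input's shape and never exceeds it in magnitude; the module only reweights which channels and locations are emphasized.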
X. Zhou and X. Wei: Equal contribution
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Zhou, X., Wei, X. (2024). Feature Aggregation Network for Building Extraction from High-Resolution Remote Sensing Images. In: Liu, F., Sadanandan, A.A., Pham, D.N., Mursanto, P., Lukose, D. (eds) PRICAI 2023: Trends in Artificial Intelligence. PRICAI 2023. Lecture Notes in Computer Science, vol 14327. Springer, Singapore. https://doi.org/10.1007/978-981-99-7025-4_9
DOI: https://doi.org/10.1007/978-981-99-7025-4_9
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-7024-7
Online ISBN: 978-981-99-7025-4
eBook Packages: Computer Science, Computer Science (R0)