Advancing spatial mapping for satellite image road segmentation with multi-head attention

Ben Salah, Khawla; Othmani, Mohamed; Fourati, Jihen; Kherallah, Monji

doi:10.1007/s00371-024-03431-1

Research
Published: 14 May 2024

The Visual Computer

Khawla Ben Salah¹, Mohamed Othmani², Jihen Fourati⁴ & Monji Kherallah³

Abstract

Remote sensing imaging is a field of considerable interest, particularly where road areas are concerned. Road segmentation has become crucial in several areas, such as transportation network optimization, urban planning, and image analysis. In this study, we propose an upgraded mixed-scale UNet network (MAP-UNet) with a multi-head attention mechanism to identify and delineate road networks within aerial images. The modified MAP-UNet aims to enhance the efficiency of road segmentation through the integration of multi-scale features and attention mechanisms. We compare it against the most recent methods. On the DeepGlobe dataset, the proposed approach achieves a recall of 76.18%, a precision of 80.30%, and an IoU of 63.00%.
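The three figures reported in the abstract (recall, precision, IoU) are standard pixel-wise measures for binary segmentation. The sketch below shows how they are commonly computed from a predicted mask and a ground-truth mask. It is a minimal illustration, not the authors' evaluation code: the function name `road_metrics` and the toy masks are hypothetical, and the abstract does not say how the paper aggregates scores across images.

```python
from typing import Dict, Sequence


def road_metrics(pred: Sequence[Sequence[int]],
                 truth: Sequence[Sequence[int]]) -> Dict[str, float]:
    """Pixel-wise precision, recall, and IoU for the road (positive) class."""
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1          # road predicted, road in ground truth
            elif p and not t:
                fp += 1          # road predicted, background in ground truth
            elif t and not p:
                fn += 1          # road missed by the prediction
    # Guard against empty denominators (e.g. no road pixels at all).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    iou = tp / (tp + fp + fn) if (tp + fp + fn) else 0.0
    return {"precision": precision, "recall": recall, "iou": iou}


# Toy 3x4 masks (hypothetical data): 1 = road pixel, 0 = background.
truth = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
]
pred = [
    [0, 1, 1, 1],
    [0, 1, 0, 0],
    [0, 1, 1, 0],
]
metrics = road_metrics(pred, truth)
print(metrics)
```

In this toy case there are 5 true positives, 1 false positive, and 1 false negative, so precision and recall are both 5/6 and IoU is 5/7. IoU penalizes false positives and false negatives together, which is why it is lower than either precision or recall.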

Data availability

The datasets analyzed during the current study are available from the Kaggle website (https://www.kaggle.com/datasets/balraj98/deepglobe-road-extraction-dataset).

Author information

Mohamed Othmani, Jihen Fourati, and Monji Kherallah contributed equally to this work.

Authors and Affiliations

Computer Sciences, ATES: Advanced Technologies on Environment and Smart City, National Engineering School, Sfax, Tunisia
Khawla Ben Salah
Computer Sciences, ATES: Advanced Technologies on Environment and Smart City, University of Gafsa, Gafsa, Tunisia
Mohamed Othmani
Physics, ATES: Advanced Technologies on Environment and Smart City, University of Sfax, Sfax, Tunisia
Monji Kherallah
Computer Sciences, National Engineering School, Sfax, Tunisia
Jihen Fourati

Contributions

Authors 1–3 conceived the presented idea; authors 1–2 developed the theory and performed the computations; author 3 verified the analytical methods; authors 2–4 encouraged the investigation of other state-of-the-art findings and supervised the findings of this work. All authors discussed the results and contributed to the final manuscript.

Corresponding author

Correspondence to Khawla Ben Salah.

Ethics declarations

Conflict of interest

All authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Ben Salah, K., Othmani, M., Fourati, J. et al. Advancing spatial mapping for satellite image road segmentation with multi-head attention. Vis Comput (2024). https://doi.org/10.1007/s00371-024-03431-1

Accepted: 22 April 2024
Published: 14 May 2024
DOI: https://doi.org/10.1007/s00371-024-03431-1
