Advancing spatial mapping for satellite image road segmentation with multi-head attention

  • Research
  • The Visual Computer

Abstract

Remote sensing imagery is an active research field, particularly for road extraction. Road segmentation has become crucial in several areas, such as transportation network optimization, urban planning, and image analysis. In this study, we propose an upgraded mixed-scale UNet network (MAP-UNet) with a multi-head attention mechanism to identify and delineate road networks within aerial images. The modified MAP-UNet aims to enhance the efficiency of road segmentation by integrating multi-scale features and attention mechanisms. We compared it against the most recent methods. Our proposed approach achieves a recall of 76.18%, a precision of 80.30%, and an IoU of 63.00% on the DeepGlobe dataset.
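The three reported metrics can be illustrated with a minimal sketch that uses the standard pixel-wise definitions for a binary road mask. The function names `confusion_counts` and `road_metrics` are hypothetical, and the abstract does not state the evaluation protocol (binarization threshold, per-image versus dataset-level averaging), so this is not necessarily how the reported figures were computed.

```python
from typing import Dict, Sequence, Tuple

Mask = Sequence[Sequence[int]]  # 2D binary mask: 1 = road pixel, 0 = background


def confusion_counts(pred: Mask, truth: Mask) -> Tuple[int, int, int]:
    """Count true positives, false positives and false negatives pixel by pixel."""
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1          # road predicted, road in ground truth
            elif p and not t:
                fp += 1          # road predicted, background in ground truth
            elif t and not p:
                fn += 1          # road missed by the prediction
    return tp, fp, fn


def road_metrics(pred: Mask, truth: Mask) -> Dict[str, float]:
    """Standard pixel-wise precision, recall and IoU for the road class.

    Assumption: an empty denominator yields 0.0; other conventions exist.
    """
    tp, fp, fn = confusion_counts(pred, truth)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    iou = tp / (tp + fp + fn) if (tp + fp + fn) else 0.0
    return {"precision": precision, "recall": recall, "iou": iou}


if __name__ == "__main__":
    # Toy 2x3 masks, invented for illustration only.
    predicted = [[1, 1, 0],
                 [0, 1, 0]]
    ground_truth = [[1, 0, 0],
                    [0, 1, 1]]
    print(road_metrics(predicted, ground_truth))
```

On the toy masks there are two true positives, one false positive, and one false negative, so precision and recall are both 2/3 and IoU is 2/4.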

Data availability

The datasets analyzed during the current study are available from the Kaggle website (https://www.kaggle.com/datasets/balraj98/deepglobe-road-extraction-dataset).

Author information

Contributions

Authors 1–3 conceived the presented idea; authors 1–2 developed the theory and performed the computations; author 3 verified the analytical methods; authors 2–4 encouraged the investigation of other state-of-the-art findings and supervised the findings of this work. All authors discussed the results and contributed to the final manuscript.

Corresponding author

Correspondence to Khawla Ben Salah.

Ethics declarations

Conflict of interest

All authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Ben Salah, K., Othmani, M., Fourati, J. et al. Advancing spatial mapping for satellite image road segmentation with multi-head attention. Vis Comput (2024). https://doi.org/10.1007/s00371-024-03431-1

  • DOI: https://doi.org/10.1007/s00371-024-03431-1
