
Improving the performance of learned descriptors in the matching of high spatial resolution aerial images by proposing a large-scale dataset of vertical images

  • Original Paper
  • Published:
Arabian Journal of Geosciences

Abstract

Image matching is a branch of computer vision that underpins most photogrammetry and remote sensing data processing. Recent results in this field have demonstrated the effectiveness of trainable descriptors. However, the existing datasets used to train such descriptors are collected from front-view photos, and because of the features extractable from these images, learned descriptors face challenges when matching vertical imagery. Moreover, no training dataset derived from top-view images has yet been developed for learned descriptors. To overcome these limitations, a training dataset based on remote sensing images is presented. The training patches in this collection are extracted from 61,223 images covering 130 different scenes, captured from several kinds of platforms, including fixed-wing and multi-rotor unmanned aerial vehicles (UAVs). To evaluate the dataset, the mean average precision (mAP) criterion, the ratio of descriptor distances, and point cloud density were considered. The effectiveness of the developed dataset was demonstrated by a 15% improvement in matching mAP on horizontally rotated UAV images compared to HardNet. The point cloud produced by the proposed method is also of higher quality, with an average of 5426 points and 2575 accurate points on the evaluation sets. By training the mentioned descriptors on the proposed dataset, the stability of vertical image matching can be improved. The variety of landscapes, the different imaging systems and environmental conditions, and the appropriate architecture increase the robustness of very-high-resolution top-view image matching and make the output point cloud denser. Available at: https://github.com/farhadinima75/UAVPatches.
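The "ratio of descriptor distances" criterion mentioned in the abstract is commonly implemented as Lowe's ratio test: a match between two local descriptors is accepted only if the nearest neighbour is substantially closer than the second-nearest. A minimal sketch in Python/NumPy, assuming L2-normalised descriptor arrays (the function name and the 0.8 threshold are illustrative, not taken from the paper):

```python
import numpy as np

def match_descriptors(desc_a, desc_b, ratio=0.8):
    """Nearest-neighbour matching with Lowe's ratio test.

    desc_a: (N, D) array of query descriptors.
    desc_b: (M, D) array of candidate descriptors (M >= 2).
    Returns a list of index pairs (i, j) that pass the ratio test.
    """
    # Pairwise Euclidean distances between the two descriptor sets.
    dists = np.linalg.norm(desc_a[:, None, :] - desc_b[None, :, :], axis=2)
    matches = []
    for i, row in enumerate(dists):
        order = np.argsort(row)
        nearest, second = row[order[0]], row[order[1]]
        # Accept only unambiguous matches: nearest neighbour must be
        # clearly closer than the second-nearest.
        if nearest < ratio * second:
            matches.append((i, int(order[0])))
    return matches
```

A descriptor network trained on top-view patches would feed this step directly: the more discriminative the descriptors, the more matches survive the ratio test, which is what drives the denser point clouds reported in the abstract.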


Notes

  1. https://github.com/yyangynu/SUIRD/tree/master/SUIRD_v2.2

References

  • Aicardi I, Nex F, Gerke M, Lingua AM (2016) An image-based approach for the co-registration of multi-temporal UAV image datasets. Remote Sensing 8:779


  • Balntas V, Lenc K, Vedaldi A, Tuytelaars T, Matas J, Mikolajczyk K (2019) HPatches: a benchmark and evaluation of handcrafted and learned local descriptors. IEEE Trans Pattern Anal Mach Intell 42:2825–2841. https://doi.org/10.1109/TPAMI.2019.2915233


  • Barroso-Laguna A, Riba E, Ponsa D, Mikolajczyk K (2019) Key.Net: keypoint detection by handcrafted and learned CNN filters. In: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV). https://doi.ieeecomputersociety.org/10.1109/ICCV.2019.00593

  • Chouari W (2021) Wetland land cover change detection using multitemporal Landsat data: a case study of the Al-Asfar wetland, Kingdom of Saudi Arabia. Arab J Geosci 14:1–14. https://doi.org/10.1007/s12517-021-06815-y


  • DeTone D, Malisiewicz T, Rabinovich A (2018) SuperPoint: self-supervised interest point detection and description. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW). https://doi.org/10.1109/CVPRW.2018.00060

  • Dusmanu M, Rocco I, Pajdla T, Pollefeys M, Sivic J, Torii A, Sattler T (2019) D2-Net: a trainable CNN for joint description and detection of local features. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). https://doi.org/10.1109/CVPR.2019.00828

  • García-Moreno LM, Díaz-Paz JP, Loaiza-Correa H, Restrepo-Girón AD (2020) Dataset of thermal and visible aerial images for multi-modal and multi-spectral image registration and fusion. Data Brief 29:105326


  • Harris C, Stephens M (1988) A combined corner and edge detector. Alvey Vision Conference 15(50)

  • Jiang S, Jiang W, Li L, Wang L, Huang W (2020) Reliable and efficient UAV image matching via geometric constraints structured by Delaunay triangulation. Remote Sensing 12:3390


  • Jin Y, Mishkin D, Mishchuk A, Matas J, Fua P, Yi KM, Trulls E (2021) Image matching across wide baselines: from paper to practice. Int J Comput Vision 129:517–547. https://doi.org/10.1007/s11263-020-01385-0


  • Joseph A, Rex ES, Christopher S, Jose J (2021) Content-based image retrieval using hybrid k-means moth flame optimization algorithm. Arab J Geosci 14:1–14. https://doi.org/10.1007/s12517-021-06990-y


  • Lowe DG (2004) Distinctive image features from scale-invariant keypoints. Int J Comput Vision 60:91–110


  • Ma J, Jiang X, Fan A, Jiang J, Yan J (2021) Image matching from handcrafted to deep features: a survey. Int J Comput Vision 129:23–79. https://doi.org/10.1007/s11263-020-01359-2


  • Mishchuk A, Mishkin D, Radenovic F, Matas J (2017) Working hard to know your neighbor's margins: local descriptor learning loss. arXiv preprint arXiv:1705.10872

  • Mur-Artal R, Tardós JD (2017) ORB-SLAM2: an open-source SLAM system for monocular, stereo, and RGB-D cameras. IEEE Trans Rob 33:1255–1262


  • Pultar M, Mishkin D, Matas J (2019) Leveraging outdoor webcams for local descriptor learning. In: Proceedings of the 24th Computer Vision Winter Workshop (CVWW 2019). https://doi.org/10.48550/arXiv.1901.09780

  • Revaud J, Weinzaepfel P, De Souza C, Pion N, Csurka G, Cabon Y, Humenberger M (2019) R2D2: repeatable and reliable detector and descriptor. arXiv preprint arXiv:1906.06195. https://doi.org/10.48550/arXiv.1906.06195

  • Shahbazi M, Ménard P, Sohn G, Théau J (2019) Unmanned aerial image dataset: ready for 3D reconstruction. Data Brief 25:103962


  • Tian Y, Barroso Laguna A, Ng T, Balntas V, Mikolajczyk K (2020) HyNet: learning local descriptor with hybrid similarity measure and triplet loss. Adv Neural Inf Process Syst 33:7401–7412


  • Tian Y, Fan B, Wu F (2017) L2-Net: deep learning of discriminative patch descriptor in Euclidean space. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp 661–669

  • Tian Y, Yu X, Fan B, Wu F, Heijnen H, Balntas V (2019) SOSNet: second-order similarity regularization for local descriptor learning. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

  • Wang S, Quan D, Liang X, Ning M, Guo Y, Jiao L (2018) A deep learning framework for remote sensing image registration. ISPRS J Photogramm Remote Sens 145:148–164


  • Winder SA, Brown M (2007) Learning local image descriptors. In: 2007 IEEE Conference on Computer Vision and Pattern Recognition. IEEE, pp 1–8

  • Ye S, Yan F, Zhang Q, Shen D (2022) Comparing the accuracies of sUAV-SFM and UAV-LiDAR point clouds for topographic measurements. Arab J Geosci 15:1–18. https://doi.org/10.1007/s12517-022-09683-2



Author information


Corresponding author

Correspondence to Abbas Kiani.

Ethics declarations

Conflict of interest

The authors declare no conflict of interest.

Additional information

Responsible Editor: Biswajeet Pradhan

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Farhadi, N., Ebadi, H. & Kiani, A. Improving the performance of learned descriptors in the matching of high spatial resolution aerial images by proposing a large-scale dataset of vertical images. Arab J Geosci 16, 656 (2023). https://doi.org/10.1007/s12517-023-11747-w

