Visual media geo-localization methods aim to determine where an image or video was taken. They involve complex matching between the query media content (2D/3D features) and the reference data (overhead and geo-tagged ground-level imagery). This area of research spans many disciplines, including pre-processing and processing of earth-scale overhead and ground-level reference imagery (e.g. satellite images, elevation maps, LIDAR, OpenStreetMap data, geo-tagged street-view images) to extract relevant information (e.g. land types, roads, buildings), efficient indexing of the extracted geo-information, alignment and fusion of heterogeneous reference imagery, satellite imagery segmentation, low-level and high-level feature extraction, ray tracing and synthesized imagery generation, wide imagery registration, multi-view geometry and 3D reconstruction, 2D/3D matching, geometric verification, machine learning, optimization, probabilistic inference, supervised and automatic image/video calibration, fast search, and human-machine interaction.
Advances in this research field have been given great impetus by three developments: the increasing availability of computing resources and cloud-based processing of large-scale reference data to generate geo-localization products (e.g. 3D world models from multi-view satellite images, enhanced land-cover maps, mountain peaks, landmarks), the continuing improvement in the resolution of overhead sensors, and the availability of planet-scale geo-tagged image collections.
This special issue received 12 initial submissions. Four submissions were rejected after abstract review, eight went through the full IJCV reviewing cycle, and four were accepted for this special issue. Manuscripts co-authored by a guest editor were handled by a non-conflicted guest editor or another IJCV editor to avoid possible conflicts of interest.
The first paper is on “Image Based Geo-Localization in the Alps” (doi:10.1007/s11263-015-0830-0) by Olivier Saurer, Georges Baatz, Kevin Köser, Lubor Ladicky and Marc Pollefeys. It tackles the problem of very large scale visual localization in mountainous terrain, using digital elevation models to extract representations for fast visual database lookup. The authors propose an automated skyline-based matching approach that efficiently exploits visual information (contours) and geometric constraints (consistent orientation) at the same time. They report a recognition rate of up to 88 % over a search area of 40,000 sq. km, using 1000 landscape query images.
The second paper is on “Geo-localization using Volumetric Representations of Overhead Imagery” (doi:10.1007/s11263-015-0850-9) by Ozge C. Ozcanli, Yi Dong and Joseph L. Mundy. It addresses the problem of determining the location of a ground-level image using geo-referenced overhead imagery. The authors propose (1) a 3D volumetric representation that fuses different modalities of overhead imagery into a 3D reference world whose attributes (i.e., orientation of the world surfaces, types of land cover, depth order of fronto-parallel surfaces) are matched to the attributes of surfaces manually marked on the query image, and (2) a highly parallelizable matching scheme. Their 3D geo-localization framework performed better than two 2D approaches (region reduction and landmark existence matchers) for 75 % of the test query images.
The third paper is on “Image search with selective match kernels: aggregation across single and multiple images” (doi:10.1007/s11263-015-0810-4) by Giorgos Tolias, Yannis Avrithis and Herve Jegou. It addresses the problem of large-scale photo-to-photo search using a family of metrics over local descriptor spaces. The authors propose a match kernel that takes the best of existing image representations and matching techniques by combining an aggregation procedure with a selective match kernel, which makes the proposed method both precise and scalable. The reported results show that the proposed photo-to-photo search outperforms state-of-the-art methods on a large-scale landmark recognition benchmark.
The fourth and last paper of this special issue is on “Automatic Geo-location Correction of Satellite Imagery” (doi:10.1007/s11263-015-0852-7) by Ozge C. Ozcanli, Yi Dong, Joseph L. Mundy, Helen Webb, Riad Hammoud, and Victor Tom. It addresses the geo-tagging inaccuracy of satellite imagery, which becomes a problem when cross-platform multi-view satellite images are used jointly at large scale for 3D world reconstruction. The authors propose an automatic geo-location correction framework that corrects images from multiple satellites simultaneously. As a result of the proposed correction process, all the images are effectively registered to the same absolute geodetic coordinate frame. They demonstrate the usability and quality of the correction framework through a 3D surface reconstruction application and through alignment of OpenStreetMap (OSM) data with the corrected satellite imagery, which opens the door to collecting extensive amounts of satellite data with OSM labels for training classifiers and for map generation processes.
These four papers cover a diverse range of reference data (photos, satellite imagery, LIDAR and Digital Elevation Maps) and novel methodologies for large-scale image geo-localization. They should appeal to data providers who want to understand the limitations of the collection sensors, to users of media geo-localization systems who stand to benefit from automated techniques that reduce the labor of geo-locating images manually, to experts in the field who are addressing the remaining challenges, and to those who wish to have a snapshot of the current breadth of image geo-localization research. This will continue to be a fertile area for growth in both research analysis and experimentation in many application fields in the years ahead.
Hammoud, R.I., Sivic, J., Davis, L.S. et al. Guest Editorial: Large Scale Visual Media Geo-Localization. Int J Comput Vis 116, 211–212 (2016). https://doi.org/10.1007/s11263-015-0870-5