GAMa: Cross-View Video Geo-Localization

  • Conference paper
Computer Vision – ECCV 2022 (ECCV 2022)

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 13697))

Abstract

Existing work in cross-view geo-localization is based on images, where a ground panorama is matched to an aerial image. In this work, we focus on ground videos instead of images, which provide additional contextual cues that are important for this task. There are no existing datasets for this problem; we therefore propose the GAMa dataset, a large-scale dataset with ground videos and corresponding aerial images. We also propose a novel approach to solve this problem. At the clip level, a short video clip is matched with its corresponding aerial image, and the result is later used to obtain video-level geo-localization of a long video. Moreover, we propose a hierarchical approach to further improve the clip-level geo-localization. On this challenging dataset, with unaligned images and limited field of view, our proposed method achieves a Top-1 recall rate of 19.4% and 45.1% @1.0 mile. Code and dataset are available at https://github.com/svyas23/GAMa.
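To make the Top-1 recall metric concrete, the following is a minimal sketch of how clip-to-aerial retrieval could be scored. It is not the authors' implementation. The function names `cosine` and `top1_recall`, the use of cosine similarity, the toy embeddings, and the convention that clip i matches aerial image i are all assumptions made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def top1_recall(clip_embs, aerial_embs):
    """Fraction of clips whose most similar aerial embedding is the true match.

    Assumes clip i corresponds to aerial image i (a hypothetical convention).
    """
    hits = 0
    for i, clip in enumerate(clip_embs):
        sims = [cosine(clip, aerial) for aerial in aerial_embs]
        best = max(range(len(sims)), key=sims.__getitem__)
        hits += best == i
    return hits / len(clip_embs)


# Toy, hand-made embeddings (not from the GAMa dataset).
clips = [[1.0, 0.1], [0.1, 1.0], [0.7, 0.6]]
aerials = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
recall = top1_recall(clips, aerials)
print(f"Top-1 recall: {recall:.3f}")
```

In this toy gallery the first two clips retrieve their own aerial image and the third does not, so two of three queries count as hits. A real evaluation would use learned embeddings and a much larger aerial gallery.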

Corresponding author

Correspondence to Shruti Vyas.

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Vyas, S., Chen, C., Shah, M. (2022). GAMa: Cross-View Video Geo-Localization. In: Avidan, S., Brostow, G., Cissé, M., Farinella, G.M., Hassner, T. (eds) Computer Vision – ECCV 2022. ECCV 2022. Lecture Notes in Computer Science, vol 13697. Springer, Cham. https://doi.org/10.1007/978-3-031-19836-6_25

  • DOI: https://doi.org/10.1007/978-3-031-19836-6_25

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-19835-9

  • Online ISBN: 978-3-031-19836-6

  • eBook Packages: Computer Science, Computer Science (R0)
