Towards Fast Image-Based Localization on a City-Scale

  • Torsten Sattler
  • Bastian Leibe
  • Leif Kobbelt
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7474)

Abstract

Recent developments in Structure-from-Motion approaches allow the reconstruction of large parts of urban scenes. The resulting models can in turn be used for accurate image-based localization via pose estimation from 2D-to-3D correspondences. In this paper, we analyze a recently proposed localization method that achieves state-of-the-art localization performance using a visual vocabulary quantization for efficient 2D-to-3D correspondence search. We show that the method achieves similar localization performance when using only a subset of the original models. Although using a subset can incur additional computational cost depending on the dataset, the reduced model requires significantly less memory, allowing the method to handle even larger datasets. We study how the size of the subset and the quantization affect both the search for matches and the time needed by RANSAC for pose estimation.
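
As a rough illustration of why the quality of the matches influences the time spent in RANSAC, the standard bound on the number of RANSAC samples can be evaluated for different inlier ratios. This is a minimal sketch and not the authors' implementation; the function name, the default sample size of 3 (a minimal calibrated pose solver) and the 0.99 confidence are assumptions chosen for illustration.

```python
import math


def ransac_iterations(inlier_ratio: float,
                      sample_size: int = 3,      # assumed minimal sample size
                      confidence: float = 0.99   # assumed success probability
                      ) -> int:
    """Samples needed so that, with probability `confidence`, at least one
    random sample of `sample_size` correspondences contains only inliers."""
    if not 0.0 < inlier_ratio <= 1.0:
        raise ValueError("inlier_ratio must be in (0, 1]")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be in (0, 1)")
    # Probability that one random sample consists of inliers only.
    p_all_inliers = inlier_ratio ** sample_size
    if p_all_inliers >= 1.0:
        return 1
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p_all_inliers))


if __name__ == "__main__":
    for ratio in (0.8, 0.5, 0.2):
        print(ratio, ransac_iterations(ratio))
```

The count grows quickly as the inlier ratio drops, so any change to the model or the quantization that lowers the share of correct 2D-to-3D matches can raise the pose estimation time even when the correspondence search itself gets cheaper.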

Keywords

image localization, image retrieval, camera pose estimation

Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Torsten Sattler (1)
  • Bastian Leibe (2)
  • Leif Kobbelt (1)

  1. RWTH Aachen University, Aachen, Germany
  2. UMIC Research Centre, RWTH Aachen University, Aachen, Germany
