Abstract
Recent developments in Structure-from-Motion approaches allow the reconstruction of large parts of urban scenes. The resulting models can in turn be used for accurate image-based localization via pose estimation from 2D-to-3D correspondences. In this paper, we analyze a recently proposed localization method that achieves state-of-the-art localization performance by using visual vocabulary quantization for efficient 2D-to-3D correspondence search. We show that the method achieves similar localization performance when using only a subset of the original models. Although this reduction can come at additional computational cost, depending on the dataset, the reduced model requires significantly less memory, allowing the method to handle even larger datasets. We study how the size of the subset and the quantization affect both the search for matches and the time needed by RANSAC for pose estimation.
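To make the idea of vocabulary-based 2D-to-3D correspondence search concrete, the following is a minimal sketch and not the implementation analyzed in the paper. It assumes that descriptors are plain tuples of floats, that the visual vocabulary is a short list of cluster centres, and that ambiguous matches are rejected with a nearest-neighbour ratio test at a threshold of 0.7. The function names, the toy data, and the threshold are all hypothetical choices made for illustration.

```python
from collections import defaultdict
from math import dist


def nearest_word(descriptor, vocabulary):
    """Return the index of the closest visual word (cluster centre)."""
    return min(range(len(vocabulary)), key=lambda i: dist(descriptor, vocabulary[i]))


def build_index(point_descriptors, vocabulary):
    """Map each visual word to the (point_id, descriptor) pairs assigned to it."""
    index = defaultdict(list)
    for point_id, descriptor in point_descriptors:
        index[nearest_word(descriptor, vocabulary)].append((point_id, descriptor))
    return index


def match_2d_to_3d(query_descriptors, index, vocabulary, ratio=0.7):
    """Return (query_idx, point_id) pairs that pass the ratio test.

    Each query descriptor is compared only against the 3D points that
    share its visual word, which is what keeps the search cheap.
    """
    matches = []
    for q_idx, query in enumerate(query_descriptors):
        candidates = index.get(nearest_word(query, vocabulary), [])
        # Keep the smallest distance per 3D point, so that several
        # descriptors of the same point do not compete with each other.
        best_per_point = {}
        for point_id, descriptor in candidates:
            d = dist(query, descriptor)
            if d < best_per_point.get(point_id, float("inf")):
                best_per_point[point_id] = d
        ranked = sorted(best_per_point.items(), key=lambda item: item[1])
        if len(ranked) < 2:
            continue  # the ratio test needs two distinct candidate points
        (best_id, best), (_, second) = ranked[0], ranked[1]
        if best < ratio * second:
            matches.append((q_idx, best_id))
    return matches


# Toy 2D "descriptors" (assumed data, for illustration only).
vocabulary = [(0.0, 0.0), (10.0, 10.0)]
points = [(0, (0.1, 0.0)), (1, (3.0, 3.0)), (2, (10.0, 9.9)), (3, (8.0, 8.0))]
queries = [(0.0, 0.1), (9.9, 10.0), (9.0, 9.0)]

index = build_index(points, vocabulary)
matches = match_2d_to_3d(queries, index, vocabulary)
print(matches)  # [(0, 0), (1, 2)]
```

The third query lies almost equally close to points 2 and 3, so the ratio test discards it. In a full pipeline, the surviving 2D-to-3D matches would be passed to a RANSAC-based pose solver. Building the index from fewer 3D points shrinks the memory footprint but can change how many candidates fall under each visual word, which is the kind of trade-off the abstract describes.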
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Sattler, T., Leibe, B., Kobbelt, L. (2012). Towards Fast Image-Based Localization on a City-Scale. In: Dellaert, F., Frahm, JM., Pollefeys, M., Leal-Taixé, L., Rosenhahn, B. (eds) Outdoor and Large-Scale Real-World Scene Analysis. Lecture Notes in Computer Science, vol 7474. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-34091-8_9
DOI: https://doi.org/10.1007/978-3-642-34091-8_9
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-34090-1
Online ISBN: 978-3-642-34091-8
eBook Packages: Computer Science (R0)