Leveraging 3D City Models for Rotation Invariant Place-of-Interest Recognition

Baatz, Georges; Köser, Kevin; Chen, David; Grzeszczuk, Radek; Pollefeys, Marc

doi:10.1007/s11263-011-0458-7

Published: 27 May 2011

International Journal of Computer Vision, Volume 96, pages 315–334 (2012)

Georges Baatz¹, Kevin Köser¹, David Chen², Radek Grzeszczuk³ & Marc Pollefeys¹

Abstract

Given a cell phone image of a building, we address the problem of place-of-interest recognition in urban scenarios. We go beyond earlier approaches by exploiting 3D building information, which is now often available (e.g. from extruded floor plans), and massive street-level image data for database creation. Exploiting vanishing points in query images fully removes 3D rotation from the recognition problem, which simplifies the required feature invariance to a purely homothetic problem; we show that this enables more discriminative power in feature descriptors than classical SIFT. We rerank visual-word-based document queries using a fast stratified homothetic verification that in most cases boosts the correct document to the top positions if it was in the short list. Since we exploit 3D building information, the approach finally outputs the camera pose in real-world coordinates, ready for augmenting the cell phone image with virtual 3D information. The whole system is demonstrated to outperform traditional approaches in city-scale experiments for different sources of street-level image data and a challenging set of cell phone images.
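
To make the homothetic verification idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes each tentative match pairs a query keypoint with a database keypoint, both given as (x, y, feature scale) after rotation has been removed, and that the two views are related by a homothety q = s * d + t (one scale, one 2D translation). The function names `homothety_inliers` and `rerank`, the bin width, and the pixel tolerance are all hypothetical choices for illustration. The stratification here is simply: vote for the scale first, then search for the translation with the scale fixed.

```python
import math
from collections import Counter


def homothety_inliers(matches, scale_bin=0.2, tol=5.0):
    """Fit q = s * d + t to matches [((xq, yq, sq), (xd, yd, sd)), ...].

    Returns (s, (tx, ty), inlier_indices). Illustrative sketch only.
    """
    if not matches:
        return 1.0, (0.0, 0.0), []

    # Stage 1: vote for the scale ratio in log space using feature scales.
    log_ratios = [math.log(q[2] / d[2]) for q, d in matches]
    bins = [round(r / scale_bin) for r in log_ratios]
    best_bin = Counter(bins).most_common(1)[0][0]
    # Refine s as the geometric mean of the ratios in the winning bin.
    winners = [r for r, b in zip(log_ratios, bins) if b == best_bin]
    s = math.exp(sum(winners) / len(winners))

    # Stage 2: with s fixed, every match predicts one translation.
    # Keep the prediction that the most other matches agree with.
    preds = [(q[0] - s * d[0], q[1] - s * d[1]) for q, d in matches]
    best_t, best_inliers = (0.0, 0.0), []
    for tx, ty in preds:
        inliers = [i for i, (px, py) in enumerate(preds)
                   if abs(px - tx) <= tol and abs(py - ty) <= tol]
        if len(inliers) > len(best_inliers):
            best_t, best_inliers = (tx, ty), inliers
    return s, best_t, best_inliers


def rerank(shortlist):
    """Order document ids by their number of homothety inliers.

    shortlist maps a document id to its list of tentative matches.
    """
    score = {doc: len(homothety_inliers(m)[2]) for doc, m in shortlist.items()}
    return sorted(shortlist, key=lambda doc: score[doc], reverse=True)
```

Calling `rerank` on a short list returned by a visual word query would move documents with many geometrically consistent matches to the front. Because a homothety has only three degrees of freedom, the search above stays cheap compared with fitting a full homography.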

Author information

Authors and Affiliations

Department of Computer Science, ETH Zurich, Zurich, Switzerland
Georges Baatz, Kevin Köser & Marc Pollefeys
Department of Electrical Engineering, Stanford University, Stanford, CA, USA
David Chen
Nokia Research at Palo Alto, Palo Alto, CA, USA
Radek Grzeszczuk

Corresponding author

Correspondence to Georges Baatz.

About this article

Cite this article

Baatz, G., Köser, K., Chen, D. et al. Leveraging 3D City Models for Rotation Invariant Place-of-Interest Recognition. Int J Comput Vis 96, 315–334 (2012). https://doi.org/10.1007/s11263-011-0458-7

Received: 15 October 2010
Accepted: 05 May 2011
Published: 27 May 2011
Issue Date: February 2012
DOI: https://doi.org/10.1007/s11263-011-0458-7

Keywords

Location recognition
