Leveraging 3D City Models for Rotation Invariant Place-of-Interest Recognition

International Journal of Computer Vision

Abstract

Given a cell phone image of a building, we address the problem of place-of-interest recognition in urban scenarios. We go beyond earlier approaches by exploiting 3D building information, which is now often available (e.g. from extruded floor plans), together with massive street-level image data for database creation. Exploiting vanishing points in query images fully removes 3D rotation from the recognition problem and thus reduces the required feature invariance to a purely homothetic problem. We show that this allows feature descriptors with more discriminative power than classical SIFT. We rerank visual-word-based document queries using a fast stratified homothetic verification that, in most cases, moves the correct document to the top positions if it was in the short list. Because we exploit 3D building information, the approach finally outputs the camera pose in real-world coordinates, ready for augmenting the cell phone image with virtual 3D information. The whole system is demonstrated to outperform traditional approaches in city-scale experiments on different sources of street-level image data and a challenging set of cell phone images.
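
To make the idea of a stratified homothetic verification more concrete, the following is a minimal sketch and not the authors' implementation. It assumes that, once rotation has been removed, two matched images are related by q = s * p + t (uniform scale plus translation), and it interprets "stratified" as voting on the scale first and on the translation second. The function name `homothetic_verify`, the bin sizes, and the match format are hypothetical choices made only for illustration.

```python
import math
from collections import Counter


def homothetic_verify(matches, scale_bin=0.2, trans_bin=10.0):
    """Count matches consistent with q = s * p + t (no rotation).

    matches: list of ((px, py, p_scale), (qx, qy, q_scale)) keypoint pairs,
    with strictly positive keypoint scales (an assumed input format).
    Returns (inlier_count, (s, tx, ty)), or (0, None) for empty input.
    """
    if not matches:
        return 0, None

    def log_ratio(m):
        (_, _, ps), (_, _, qs) = m
        return math.log(qs / ps)

    def scale_key(m):
        return round(log_ratio(m) / scale_bin)

    # Stage 1: vote on the quantized log scale ratio.
    scale_votes = Counter(scale_key(m) for m in matches)
    best_scale_bin, _ = scale_votes.most_common(1)[0]
    survivors = [m for m in matches if scale_key(m) == best_scale_bin]

    # Refine the scale as the geometric mean over the winning bin.
    s = math.exp(sum(log_ratio(m) for m in survivors) / len(survivors))

    def trans_key(m):
        (px, py, _), (qx, qy, _) = m
        return (round((qx - s * px) / trans_bin),
                round((qy - s * py) / trans_bin))

    # Stage 2: vote on the quantized translation among the survivors.
    trans_votes = Counter(trans_key(m) for m in survivors)
    (bx, by), inliers = trans_votes.most_common(1)[0]

    # Report the center of the winning translation bin.
    return inliers, (s, bx * trans_bin, by * trans_bin)


# Synthetic example: five matches following s = 2, t = (30, -20),
# plus three outliers (two with a wrong scale, one with a wrong translation).
points = [(10, 10), (20, 35), (40, 5), (60, 80), (90, 45)]
good = [((x, y, 1.0), (2 * x + 30, 2 * y - 20, 2.0)) for x, y in points]
bad = [
    ((5, 5, 1.0), (400, 10, 1.0)),
    ((50, 60, 1.0), (7, 300, 3.0)),
    ((10, 10, 1.0), (200, 200, 2.0)),
]
count, model = homothetic_verify(good + bad)
```

In a reranking setting, a score such as `count` could be used to reorder the short list returned by the visual word query. Because each stage is a simple histogram vote rather than a full geometric fit, a check of this kind is cheap to run on every short-list candidate.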



Author information

Corresponding author

Correspondence to Georges Baatz.

About this article

Cite this article

Baatz, G., Köser, K., Chen, D. et al. Leveraging 3D City Models for Rotation Invariant Place-of-Interest Recognition. Int J Comput Vis 96, 315–334 (2012). https://doi.org/10.1007/s11263-011-0458-7

  • DOI: https://doi.org/10.1007/s11263-011-0458-7
