Abstract
The topic of this paper is large-scale mapping and localization from images. We first describe recent progress in obtaining large-scale 3D visual maps from images only. Our approach consists of a multi-stage processing pipeline that can process a recorded video stream in real time on standard PC hardware by leveraging the computational power of the graphics processor. The output of this pipeline is a detailed textured 3D model of the recorded area. The approach is demonstrated on video data recorded in Chapel Hill containing more than a million frames. While GPS and inertial sensor data were used for these results, we further explore the possibility of extracting the information necessary for consistent 3D mapping over larger areas from images only. In particular, we discuss our recent work on estimating the absolute scale of motion from images and on finding intersections where the camera path crosses itself, so that loops can be closed effectively in the mapping process. For this purpose we introduce viewpoint-invariant patches (VIPs), a new 3D feature that we extract from 3D models computed locally from the video sequence. These 3D features have important advantages over traditional 2D SIFT features, such as much stronger viewpoint invariance, a relative pose hypothesis from a single match, and a hierarchical matching scheme that is robust to repetitive structures. We also briefly discuss additional work related to absolute scale estimation and multi-camera calibration.
Keywords
- Global Positioning System
- Graphics Processing Unit
- Motion Estimation
- Visual Word
- Absolute Scale
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Pollefeys, M. et al. (2011). Towards Large-Scale Visual Mapping and Localization. In: Pradalier, C., Siegwart, R., Hirzinger, G. (eds) Robotics Research. Springer Tracts in Advanced Robotics, vol 70. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-19457-3_32
DOI: https://doi.org/10.1007/978-3-642-19457-3_32
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-19456-6
Online ISBN: 978-3-642-19457-3
eBook Packages: Engineering, Engineering (R0)