Video Registration to SfM Models

Kroeger, Till; Van Gool, Luc

doi:10.1007/978-3-319-10602-1_1

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 8693)

Included in the following conference series:

European Conference on Computer Vision

Abstract

Registering image data to Structure from Motion (SfM) point clouds is widely used to find the precise camera location and orientation with respect to a world model. For videos, one constraint has so far been left unexploited: temporal smoothness. Without temporal smoothness, the magnitude of the pose error in each frame of a video will often dominate the magnitude of the frame-to-frame pose change. This hinders the application of methods that require stable pose estimates (e.g. tracking, augmented reality). We incorporate temporal constraints into the image-based registration setting and solve the problem by pose regularization with model fitting and smoothing methods. This leads to accurate, gap-free and smooth poses for all frames. We evaluate different methods on challenging synthetic and real street-view SfM data for varying scenarios of motion speed, outlier contamination, pose estimation failures and 2D-3D correspondence noise. For all test cases, a 2- to 60-fold reduction in root mean squared (RMS) positional error is observed, depending on pose estimation difficulty. Different methods perform best in different scenarios. We give guidance on which methods should be preferred depending on circumstances and requirements.
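The idea of regularizing per-frame poses over time can be illustrated with a minimal sketch. It is not the method evaluated in the paper: it assumes a synthetic trajectory, handles camera positions only (no orientation), fills frames with failed pose estimates by linear interpolation, and uses a Gaussian temporal filter as a stand-in for the smoothing step. All function names and parameter values are hypothetical.

```python
import numpy as np

def fill_gaps(positions):
    """Linearly interpolate frames whose pose estimate failed (rows of NaN)."""
    filled = positions.copy()
    frames = np.arange(len(positions))
    valid = ~np.isnan(positions).any(axis=1)
    for axis in range(positions.shape[1]):
        filled[:, axis] = np.interp(frames, frames[valid], positions[valid, axis])
    return filled

def gaussian_smooth(positions, sigma=3.0):
    """Smooth each coordinate over time with a normalized Gaussian kernel."""
    radius = int(3 * sigma)
    offsets = np.arange(-radius, radius + 1)
    kernel = np.exp(-0.5 * (offsets / sigma) ** 2)
    kernel /= kernel.sum()
    # Edge-pad so the output keeps one position per input frame.
    padded = np.pad(positions, ((radius, radius), (0, 0)), mode="edge")
    columns = [np.convolve(padded[:, a], kernel, mode="valid")
               for a in range(positions.shape[1])]
    return np.stack(columns, axis=1)

def rms_position_error(estimate, truth):
    """Root mean squared Euclidean distance between two trajectories."""
    return float(np.sqrt(np.mean(np.sum((estimate - truth) ** 2, axis=1))))

# Synthetic ground-truth camera path (assumed for illustration only).
rng = np.random.default_rng(0)
n_frames = 200
t = np.linspace(0.0, 1.0, n_frames)
truth = np.stack([10.0 * t, np.sin(2.0 * np.pi * t), np.zeros(n_frames)], axis=1)

# Independent per-frame estimates: noisy, with a short run of failures.
noisy = truth + rng.normal(scale=0.2, size=truth.shape)
noisy[50:55] = np.nan

filled = fill_gaps(noisy)
smoothed = gaussian_smooth(filled)

rms_raw = rms_position_error(filled, truth)
rms_smooth = rms_position_error(smoothed, truth)
print(f"RMS before smoothing: {rms_raw:.3f}, after: {rms_smooth:.3f}")
```

In this toy setting the frame-to-frame motion is small compared with the per-frame noise, so temporal filtering lowers the RMS positional error. The size of the gain depends on the assumed noise level and kernel width and says nothing about the results reported in the paper.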

Author information

Authors and Affiliations

Computer Vision Laboratory, ETH Zurich, Switzerland
Till Kroeger & Luc Van Gool
ESAT - PSI / IBBT, K.U. Leuven, Belgium
Luc Van Gool

Editor information

Editors and Affiliations

Department of Computer Science, University of Toronto, 6 King’s College Road, M5H 3S5, Toronto, ON, Canada
David Fleet
Faculty of Electrical Engineering, Department of Cybernetics, Czech Technical University in Prague, Technicka 2, 166 27, Prague 6, Czech Republic
Tomas Pajdla
Max-Planck-Institut für Informatik, Campus E1 4, 66123, Saarbrücken, Germany
Bernt Schiele
ESAT - PSI, iMinds, KU Leuven, Kasteelpark Arenberg 10, Bus 2441, 3001, Leuven, Belgium
Tinne Tuytelaars

About this paper

Cite this paper

Kroeger, T., Van Gool, L. (2014). Video Registration to SfM Models. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds) Computer Vision – ECCV 2014. ECCV 2014. Lecture Notes in Computer Science, vol 8693. Springer, Cham. https://doi.org/10.1007/978-3-319-10602-1_1

DOI: https://doi.org/10.1007/978-3-319-10602-1_1
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-10601-4
Online ISBN: 978-3-319-10602-1
eBook Packages: Computer Science (R0)