Abstract
Geometrical registration of a query image with respect to a 3D model, or pose estimation, is the cornerstone of many computer vision applications. It is often based on the matching of local photometric descriptors that are invariant to limited viewpoint changes. However, when the query image has been acquired from a camera position not covered by the model images, pose estimation is often inaccurate and sometimes even fails, precisely because of the limited invariance of the descriptors. In this paper, we propose to add descriptors to the model, obtained from synthesized views associated with virtual cameras that complete the coverage of the scene by the real cameras. We propose an efficient strategy to localize the virtual cameras in the scene and to generate valuable descriptors from the synthetic views. We also discuss a guided sampling strategy for registration in this context. Experiments show that the accuracy of pose estimation is dramatically improved when large viewpoint changes make the matching of classic descriptors a challenging task.
Appendix A: Registering cameras from two SfM reconstructions
Here we have two SfM reconstructions: one built from the full image sequence and one built from a subsequence. Let \({{{\mathscr {R}}}}_1\) and \({{{\mathscr {R}}}}_2\) be the world coordinate frames attached to these SfM reconstructions.
Given a camera matrix \(P_1=[R_1\,T_1]\) expressed in \({{{\mathscr {R}}}}_1\), our goal is to compute the corresponding camera matrix \(P_2=[R_2\,T_2]\) expressed in \({{{\mathscr {R}}}}_2\).
The similarity (rigid + scale) transformation (s, R, T) between \({{{\mathscr {R}}}}_2\) and \({{{\mathscr {R}}}}_1\), such that \(X_{wc}^1 = sRX_{wc}^2 + T\), can easily be recovered by Procrustes analysis from the set of corresponding camera centers. Since we consider the same camera, the viewing coordinates are identical in the two coordinate frames. As the link between the viewing and the world coordinates is given by \(X_{vc}^1= R^{1}X_{wc}^1 + T^1\), we can deduce:

$$X_{vc}^1 = R^{1}\left(sRX_{wc}^2 + T\right) + T^1 = sR^{1}R\,X_{wc}^2 + R^{1}T + T^1.$$

Since scaling the viewing coordinates by the positive factor \(1/s\) leaves the projection unchanged, the expression of the camera matrix \(P_1\) in \({\mathscr {R}}_2\) is

$$P_2 = \left[R^{1}R \;\; \frac{R^{1}T + T^1}{s}\right].$$
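As an illustration, the two steps above can be sketched in Python/NumPy. This is a minimal sketch, not the paper's implementation: the function names are hypothetical, and we assume the Procrustes fit is computed with Umeyama's closed-form SVD solution on the corresponding camera centers.

```python
import numpy as np

def similarity_from_centers(src, dst):
    """Closed-form Procrustes/Umeyama fit of (s, R, T) such that
    dst_i ~ s * R @ src_i + T, from corresponding 3D points
    (here: the camera centers shared by the two reconstructions)."""
    mu_s, mu_d = src.mean(axis=0), dst.mean(axis=0)
    xs, xd = src - mu_s, dst - mu_d
    n = len(src)
    cov = xd.T @ xs / n                      # 3x3 cross-covariance
    U, D, Vt = np.linalg.svd(cov)
    S = np.eye(3)
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:
        S[2, 2] = -1.0                       # guard against reflections
    R = U @ S @ Vt
    var_src = (xs ** 2).sum() / n
    s = np.trace(np.diag(D) @ S) / var_src
    T = mu_d - s * R @ mu_s
    return s, R, T

def transfer_camera(R1, T1, s, R, T):
    """Express a camera [R1 | T1] given in frame 1 in frame 2,
    following the appendix: R2 = R1 R, T2 = (R1 T + T1) / s."""
    return R1 @ R, (R1 @ T + T1) / s
```

With noise-free correspondences the fit is exact, and a scene point projects to the same pixel through \([R_1\,T_1]\) (applied to its frame-1 coordinates) and through the transferred camera (applied to its frame-2 coordinates), the two camera-space points differing only by the global scale \(s\).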
Cite this article
Rolin, P., Berger, MO. & Sur, F. View synthesis for pose computation. Machine Vision and Applications 30, 1209–1227 (2019). https://doi.org/10.1007/s00138-019-01045-5