3D Trajectory Reconstruction under Perspective Projection

Park, Hyun Soo; Shiratori, Takaaki; Matthews, Iain; Sheikh, Yaser

doi:10.1007/s11263-015-0804-2

Published: 18 February 2015

Volume 115, pages 115–135 (2015)

International Journal of Computer Vision


Abstract

We present an algorithm to reconstruct the 3D trajectory of a moving point from its correspondence in a collection of temporally non-coincidental 2D perspective images, given the time of capture that produced each image and the relative camera poses at each time instant. Triangulation-based solutions do not apply, as multiple views of the point may not exist at each time instant. We represent a 3D trajectory using a linear combination of compact trajectory basis vectors, such as the discrete cosine transform basis, that have been shown to approximate object independence. We note that such basis vectors are also coordinate independent, which allows us to directly use camera poses estimated from stationary areas in the scene (in contrast to nonrigid structure from motion techniques where cameras are simultaneously estimated). This reduces the reconstruction optimization to a linear least squares problem, allowing us to robustly handle missing data that often occur due to motion blur, texture deformation, and self occlusion. We present an algorithm to determine the number of trajectory basis vectors, individually for each trajectory via a cross validation scheme and refine the solution by minimizing the geometric error. The relationship between point and camera motion can cause degeneracies to occur. We geometrically analyze the problem by studying the relationship of the camera motion, point motion, and trajectory basis vectors. We define the reconstructability of a 3D trajectory under projection, and show that the estimate approaches the ground truth when reconstructability approaches infinity. This analysis enables us to precisely characterize cases when accurate reconstruction is achievable. We present qualitative results for the reconstruction of several real-world scenes from a series of 2D projections where high reconstructability can be guaranteed, and report quantitative results on motion capture sequences.
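The linear least squares formulation described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes known 3×4 camera matrices per frame, an orthonormal DCT-II basis with a fixed number of basis vectors `K`, and the standard algebraic (DLT-style) projection constraint. The function names `dct_basis` and `reconstruct_trajectory` are hypothetical, and the cross validation for choosing `K` and the geometric error refinement are omitted.

```python
import numpy as np

def dct_basis(F, K):
    """First K orthonormal DCT-II basis vectors as an F x K matrix."""
    f = np.arange(F)[:, None]
    k = np.arange(K)[None, :]
    theta = np.sqrt(2.0 / F) * np.cos(np.pi * (2 * f + 1) * k / (2 * F))
    theta[:, 0] = np.sqrt(1.0 / F)  # DC component (constant over time)
    return theta

def reconstruct_trajectory(P, uv, K):
    """Least squares trajectory estimate.

    P  : (F, 3, 4) camera matrices, one per time instant (assumed known).
    uv : (F, 2) image observations; a row containing NaN marks missing data.
    Returns an (F, 3) array of reconstructed 3D points.
    """
    F = P.shape[0]
    theta = dct_basis(F, K)
    rows, rhs = [], []
    for i in range(F):
        if np.any(np.isnan(uv[i])):
            continue  # missing observation: simply contributes no equations
        u, v = uv[i]
        # Two linear constraints per view: (u*P3 - P1).[X;1] = 0, (v*P3 - P2).[X;1] = 0
        A = np.stack([u * P[i, 2] - P[i, 0], v * P[i, 2] - P[i, 1]])
        Q, q = A[:, :3], -A[:, 3]
        # X_i = B^T theta_i, so Q X_i = (theta_i^T kron Q) vec(B^T)
        rows.append(np.kron(theta[i][None, :], Q))
        rhs.append(q)
    beta, *_ = np.linalg.lstsq(np.vstack(rows), np.concatenate(rhs), rcond=None)
    return theta @ beta.reshape(K, 3)

# Synthetic check: a trajectory lying exactly in the span of K basis vectors,
# seen by one randomly moving camera per time instant (identity intrinsics).
rng = np.random.default_rng(0)
F, K = 60, 6
X_true = dct_basis(F, K) @ rng.normal(size=(K, 3))
P = np.zeros((F, 3, 4))
for i in range(F):
    a = rng.uniform(-0.2, 0.2)  # small rotation about the y axis
    R = np.array([[np.cos(a), 0.0, np.sin(a)],
                  [0.0, 1.0, 0.0],
                  [-np.sin(a), 0.0, np.cos(a)]])
    C = np.array([0.0, 0.0, -10.0]) + rng.uniform(-2.0, 2.0, size=3)
    P[i] = np.hstack([R, (-R @ C)[:, None]])
Xh = np.hstack([X_true, np.ones((F, 1))])
proj = np.einsum("fij,fj->fi", P, Xh)
uv = proj[:, :2] / proj[:, 2:3]
uv[::7] = np.nan  # simulate missing observations
X_hat = reconstruct_trajectory(P, uv, K)
err = np.max(np.abs(X_hat - X_true))
print(f"max abs reconstruction error: {err:.2e}")
```

In this noiseless setting with random camera motion the system is well conditioned, so the estimate should match the ground truth up to numerical precision. With real data, the choice of `K` and the relationship between camera and point motion would matter, as the abstract discusses.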


Notes

Related observations have been made in Shashua and Avidan (2000) and Hartley and Vidal (2008).
For the purposes of this discussion, it should be noted that any global rigid motion of the object is equivalent to relative camera motion.
Related empirical observations have been made by Ozden et al. (2004) and Akhter et al. (2008).
We estimate camera poses automatically via structure from motion. See Sect. 5.2 for a description of the camera pose estimation algorithm.
Ambiguity analyses have been investigated by Xiao et al. (2006), Vidal and Abretske (2006), Hartley and Vidal (2008), and Akhter et al. (2009). However, these analyses consider the ambiguity with the use of a shape basis representation, which utilizes the correlation across multiple points. In this section, we consider the case of the reconstruction of a single 3D point trajectory.
$\otimes $ is the Kronecker product and $\mathbf {D}$ is a diagonal matrix which consists of $\{a_1,\ldots ,a_F\}$, the scalar for each point along the trajectory.
The method by Avidan and Shashua (2000) can only reconstruct a linear or conic trajectory.
To solve the second part of the optimization, they have to additionally solve $\binom{d+2}{d}$ linear equations.

References

Akhter, I., Sheikh, Y., & Khan, S. (2009). In defense of orthonormality constraints for nonrigid structure from motion. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.
Akhter, I., Sheikh, Y., Khan, S., & Kanade, T. (2008). Nonrigid structure from motion in trajectory space. In Advances in Neural Information Processing Systems.
Akhter, I., Sheikh, Y., Khan, S., & Kanade, T. (2011). Trajectory space: A dual representation for nonrigid structure from motion. IEEE Transactions on Pattern Analysis and Machine Intelligence, 33(7), 1442–1456.
Avidan, S., & Shashua, A. (2000). Trajectory triangulation: 3D reconstruction of moving points from a monocular image sequence. IEEE Transactions on Pattern Analysis and Machine Intelligence, 22, 348–357.
Bartoli, A., Gay-Bellile, V., Castellani, U., Peyras, J., Olsen, S. I., & Sayd, P. (2008). Coarse-to-fine low-rank structure-from-motion. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.
Blanz, V., & Vetter, T. (1999). A morphable model for the synthesis of 3D faces. In ACM Transactions on Graphics (SIGGRAPH).
Brand, M. (2001). Morphable 3D models from video. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.
Brand, M. (2005). A direct method for 3D factorization of nonrigid motion observed in 2D. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.
Bregler, C., Hertzmann, A., & Biermann, H. (1999). Recovering non-rigid 3D shape from image streams. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.
Dai, Y., Li, H., & He, M. (2012). A simple prior-free method for non-rigid structure-from-motion factorization. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.
Del Bue, A. (2008). A factorization approach to structure from motion with shape priors. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.
Del Bue, A., Lladó, X., & Agapito, L. (2006). Non-rigid metric shape and motion recovery from uncalibrated images using priors. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.
Faugeras, O., Luong, Q.-T., & Papadopoulou, T. (2001). The geometry of multiple images: The laws that govern the formation of images of a scene and some of their applications. Cambridge: MIT Press.
Fayad, J., Agapito, L., & Del Bue, A. (2010). Piecewise quadratic reconstruction of non-rigid surface from monocular sequences. In Proceedings of the European Conference on Computer Vision.
Fischler, M. A., & Bolles, R. C. (1981). Random sample consensus: A paradigm for model fitting with applications to image analysis and automated cartography. Communications of the ACM, 24(6), 381–395.
Gotardo, P. F. U., & Martinez, A. M. (2011). Computing smooth time-trajectories for camera and deformable shape in structure from motion with occlusion. IEEE Transactions on Pattern Analysis and Machine Intelligence, 33(10), 2051–2065.
Hamidi, M., & Pearl, J. (1976). Comparison of the cosine and Fourier transforms of Markov-I signal. IEEE Transactions on Acoustics, Speech, and Signal Processing, 24, 428–429.
Hartley, R., & Zisserman, A. (2004). Multiple view geometry in computer vision (2nd ed.). Cambridge: Cambridge University Press.
Hartley, R. (1997). In defense of the eight-point algorithm. IEEE Transactions on Pattern Analysis and Machine Intelligence, 19, 580–593.
Hartley, R., & Vidal, R. (2008). Perspective nonrigid shape and motion recovery. In Proceedings of the European Conference on Computer Vision.
Kaminski, J. Y., & Teicher, M. (2004). A general framework for trajectory triangulation. Journal of Mathematical Imaging and Vision, 21(1), 27–41.
Lladó, X., Del Bue, A., & Agapito, L. (2010). Non-rigid metric reconstruction from perspective cameras. Image and Vision Computing, 28(9), 1339–1353.
Longuet-Higgins, H. C. (1981). A computer algorithm for reconstructing a scene from two projections. Nature, 293, 133–135.
Lourakis, M. I. A., & Argyros, A. A. (2009). SBA: A software package for generic sparse bundle adjustment. ACM Transactions on Mathematical Software, 36(1), 1–30.
Lowe, D. G. (2004). Distinctive image features from scale-invariant keypoints. International Journal of Computer Vision, 60(2), 91–110.
Ma, Y., Soatto, S., Kosecka, J., & Sastry, S. S. (2003). An invitation to 3-D vision: From images to geometric models. New York: Springer.
Moreno-Noguer, F., Lepetit, V., & Fua, P. (2007). EPnP: Efficient perspective-n-point camera pose estimation. In Proceedings of the International Conference on Computer Vision.
Olsen, S., & Bartoli, A. (2007). Using priors for improving generalization in non-rigid structure-from-motion. In Proceedings of British Machine Vision Conference.
Östlund, J., Varol, A., Ngo, D. T., & Fua, P. (2012). Laplacian meshes for monocular 3D shape recovery. In Proceedings of the European Conference on Computer Vision.
Ozden, K. E., Cornelis, K., Eychen, L. V., & Gool, L. V. (2004). Reconstructing 3D trajectories of independently moving objects using generic constraints. Computer Vision and Image Understanding, 93, 1453–1471.
Paladini, M., Del Bue, A., Stosic, M., Dodig, M., Xavier, J., & Agapito, L. (2009). Factorization for non-rigid and articulated structure using metric projections. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.
Park, H. S., Shiratori, T., Matthews, I., & Sheikh, Y. (2010). 3D reconstruction of a moving point from a series of 2D projections. In Proceedings of the European Conference on Computer Vision.
Salzmann, M., Pilet, J., Ilic, S., & Fua, P. (2007). Surface deformation models for nonrigid 3D shape recovery. IEEE Transactions on Pattern Analysis and Machine Intelligence, 29(7), 1481–1487.
Shashua, A., & Wolf, L. (2000). Homography tensors: On algebraic entities that represent three views of static or moving planar points. In Proceedings of the European Conference on Computer Vision.
Sidenbladh, H., Black, M. J., & Fleet, D. J. (2000). Stochastic tracking of 3D human figures using 2D image motion. In Proceedings of the European Conference on Computer Vision.
Snavely, N., Seitz, S. M., & Szeliski, R. (2006). Photo tourism: Exploring photo collections in 3D. ACM Transactions on Graphics (SIGGRAPH).
Taylor, J., Jepson, A. D., & Kutulakos, K. N. (2010). Non-rigid structure from locally-rigid motion. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.
Tomasi, C., & Kanade, T. (1992). Shape and motion from image streams under orthography: A factorization method. International Journal of Computer Vision, 9(2), 137–154.
Torresani, L., Yang, D., Alexander, G., & Bregler, C. (2001). Tracking and modeling non-rigid objects with rank constraints. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.
Torresani, L., & Bregler, C. (2002). Space-time tracking. In Proceedings of the European Conference on Computer Vision.
Torresani, L., Hertzmann, A., & Bregler, C. (2008). Nonrigid structure-from-motion: Estimating shape and motion with hierarchical priors. IEEE Transactions on Pattern Analysis and Machine Intelligence, 30, 878–892.
Torresani, L., Hertzmann, A., & Bregler, C. (2003). Learning non-rigid 3D shape from 2D motion. In Advances in Neural Information Processing Systems.
Valmadre, J., & Lucey, S. (2012). General trajectory prior for non-rigid reconstruction. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.
Vidal, R., & Abretske, D. (2006). Nonrigid shape and motion from multiple perspective views. In Proceedings of the European Conference on Computer Vision.
Vidal, R., & Hartley, R. (2004). Motion segmentation with missing data by PowerFactorization and generalized PCA. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.
Wexler, Y., & Shashua, A. (2000). On the synthesis of dynamic scenes from reference views. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.
Wolf, L., & Shashua, A. (2002). On projection matrices $\mathcal{P}^{k} \rightarrow \mathcal{P}^{2}$, $k = 3, \ldots, 6$, and their applications in computer vision. International Journal of Computer Vision, 48(1), 53–67.
Xiao, J., & Kanade, T. (2004). Non-rigid shape and motion recovery: Degenerate deformations. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.
Xiao, J., Chai, J., & Kanade, T. (2006). A closed-form solution to non-rigid shape and motion recovery. International Journal of Computer Vision, 67(2), 233–246.
Yan, J., & Pollefeys, M. (2005). A factorization-based approach to articulated motion recovery. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.
Zhu, S., Zhang, L., & Smith, B. M. (2010). Model evolution: An incremental approach to non-rigid structure from motion. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.

Acknowledgments

This work was supported by NSF Grant IIS-0916272.

Author information

Authors and Affiliations

Carnegie Mellon University, Pittsburgh, PA, USA
Hyun Soo Park & Yaser Sheikh
Microsoft Research Asia, Beijing, China
Takaaki Shiratori
Disney Research Pittsburgh, Pittsburgh, USA
Iain Matthews

Corresponding author

Correspondence to Hyun Soo Park.

Additional information

Communicated by Jun Sato.

Appendix

To prove Result 1, we need to show that the transformed trajectory basis, $\mathbf {S}(\varvec{\Theta })$, spans the same space as the original trajectory basis vectors, where $\mathbf {S}(\cdot )$ is a similarity transformation, i.e., $\mathrm{col}(\mathbf {S}(\varvec{\Theta }))=\mathrm{col}(\varvec{\Theta })$, where $\mathrm{col}(\varvec{\Theta })$ is the column space of $\varvec{\Theta }$.

Proof

(i) Scale: $\mathrm{col}(s\varvec{\Theta }) = \mathrm{col}(\varvec{\Theta })$, where $s$ is a nonzero scalar.

(ii) Translation: a translation adds a constant offset to every frame of the trajectory, and a constant offset is spanned by the DC component of $\varvec{\Theta }_\mathrm{DCT}$.

(iii) Rotation: without loss of generality, the trajectory basis can be rearranged as $\bar{\varvec{\Theta }} = \mathrm{blkdiag}\{\varvec{\theta },\varvec{\theta },\varvec{\theta }\}$, where $\varvec{\theta } \in \mathbb {R}^{F\times K}$ is the DCT trajectory basis for each coordinate of the trajectory. The rotated trajectory basis, $(\mathbf {R}\otimes \mathbf {I}_F)\bar{\varvec{\Theta }}$, spans the same space as the original trajectory basis vectors $\bar{\varvec{\Theta }}$ because

$$\begin{aligned}&\mathrm{col}\left( (\mathbf {R}\otimes \mathbf {I}_F)\bar{\varvec{\Theta }}\right) \\&\qquad =\mathrm{col}\left( \left[ \begin{array}{ccc}{R}_{11}\mathbf {I}_F &{} {R}_{12}\mathbf {I}_F &{} {R}_{13}\mathbf {I}_F \\ {R}_{21}\mathbf {I}_F &{} {R}_{22}\mathbf {I}_F &{} {R}_{23}\mathbf {I}_F \\ {R}_{31}\mathbf {I}_F &{} {R}_{32}\mathbf {I}_F &{} {R}_{33}\mathbf {I}_F\end{array}\right] \left[ \begin{array}{ccc}\varvec{\theta }&{}&{}\\ {} &{}\varvec{\theta }&{}\\ &{}&{}\varvec{\theta }\end{array}\right] \right) \\&\qquad =\mathrm{col}\left( \left[ \begin{array}{ccc}{R}_{11} \varvec{\theta } &{} {R}_{12}\varvec{\theta } &{} {R}_{13}\varvec{\theta } \\ {R}_{21}\varvec{\theta } &{} {R}_{22}\varvec{\theta } &{} {R}_{23}\varvec{\theta } \\ {R}_{31}\varvec{\theta } &{} {R}_{32}\varvec{\theta } &{} {R}_{33}\varvec{\theta }\end{array}\right] \right) \\&\qquad =\mathrm{col}\left( \left[ \begin{array}{ccc}\varvec{\theta }&{}&{}\\ {} &{}\varvec{\theta }&{}\\ &{}&{}\varvec{\theta }\end{array}\right] \left[ \begin{array}{ccc}{R}_{11}\mathbf {I}_K &{} {R}_{12}\mathbf {I}_K &{} {R}_{13}\mathbf {I}_K \\ {R}_{21}\mathbf {I}_K &{} {R}_{22}\mathbf {I}_K &{} {R}_{23}\mathbf {I}_K\\ {R}_{31}\mathbf {I}_K&{} {R}_{32}\mathbf {I}_K&{} {R}_{33}\mathbf {I}_K\end{array}\right] \right) \\&\qquad =\mathrm{col}\left( \bar{\varvec{\Theta }}(\mathbf {R}\otimes \mathbf {I}_K)\right) \\&\qquad =\mathrm{col}(\bar{\varvec{\Theta }}) \end{aligned}$$

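where $\otimes $ is the Kronecker product, $\mathbf {R}$ is a $3\times 3$ rotation matrix, and $\mathbf {I}_K$ is a $K \times K$ identity matrix. The last equality holds because $\mathbf {R}\otimes \mathbf {I}_K$ is invertible, so right-multiplying by it does not change the column space. $\square $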
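The rotation and translation steps of the proof can be checked numerically. The following is a small sketch, not part of the original article, assuming an orthonormal DCT-II basis and the coordinate-major block-diagonal arrangement $\bar{\varvec{\Theta }}$ used above; the helper `dct_basis` and all variable names are hypothetical.

```python
import numpy as np

def dct_basis(F, K):
    """First K orthonormal DCT-II basis vectors as an F x K matrix."""
    f = np.arange(F)[:, None]
    k = np.arange(K)[None, :]
    theta = np.sqrt(2.0 / F) * np.cos(np.pi * (2 * f + 1) * k / (2 * F))
    theta[:, 0] = np.sqrt(1.0 / F)  # DC component
    return theta

rng = np.random.default_rng(1)
F, K = 20, 5
theta = dct_basis(F, K)
Theta_bar = np.kron(np.eye(3), theta)  # blkdiag{theta, theta, theta}, size 3F x 3K

# Random proper rotation obtained from a QR factorization.
R, _ = np.linalg.qr(rng.normal(size=(3, 3)))
if np.linalg.det(R) < 0:
    R[:, 0] *= -1.0  # flip one column so that det(R) = +1

# Rotation: (R kron I_F) Theta_bar should equal Theta_bar (R kron I_K).
rotated = np.kron(R, np.eye(F)) @ Theta_bar
commuted = Theta_bar @ np.kron(R, np.eye(K))
identity_holds = np.allclose(rotated, commuted)

# Same column space: stacking both bases side by side must not raise the rank.
rank_joint = np.linalg.matrix_rank(np.hstack([Theta_bar, rotated]))

# Translation: a constant offset per coordinate lies in col(Theta_bar)
# because the DC basis vector is constant over time.
t = rng.normal(size=3)
offset = np.kron(t, np.ones(F))  # t_x repeated F times, then t_y, then t_z
residual = offset - Theta_bar @ (Theta_bar.T @ offset)  # orthogonal projection residual
translation_residual = np.linalg.norm(residual)

print(identity_holds, rank_joint, translation_residual)
```

The projection `Theta_bar @ Theta_bar.T` is an orthogonal projector only because the DCT columns are orthonormal; with a non-orthonormal basis, a least squares solve would be needed instead.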

About this article

Cite this article

Park, H.S., Shiratori, T., Matthews, I. et al. 3D Trajectory Reconstruction under Perspective Projection. Int J Comput Vis 115, 115–135 (2015). https://doi.org/10.1007/s11263-015-0804-2

Received: 18 October 2012
Accepted: 24 January 2015
Published: 18 February 2015
Issue Date: November 2015
DOI: https://doi.org/10.1007/s11263-015-0804-2
