3D Trajectory Reconstruction under Perspective Projection

International Journal of Computer Vision

Abstract

We present an algorithm to reconstruct the 3D trajectory of a moving point from its correspondence in a collection of temporally non-coincidental 2D perspective images, given the time of capture that produced each image and the relative camera poses at each time instant. Triangulation-based solutions do not apply, as multiple views of the point may not exist at each time instant. We represent a 3D trajectory using a linear combination of compact trajectory basis vectors, such as the discrete cosine transform basis, that have been shown to approximate object independence. We note that such basis vectors are also coordinate independent, which allows us to directly use camera poses estimated from stationary areas in the scene (in contrast to nonrigid structure from motion techniques where cameras are simultaneously estimated). This reduces the reconstruction optimization to a linear least squares problem, allowing us to robustly handle missing data that often occur due to motion blur, texture deformation, and self occlusion. We present an algorithm to determine the number of trajectory basis vectors, individually for each trajectory via a cross validation scheme and refine the solution by minimizing the geometric error. The relationship between point and camera motion can cause degeneracies to occur. We geometrically analyze the problem by studying the relationship of the camera motion, point motion, and trajectory basis vectors. We define the reconstructability of a 3D trajectory under projection, and show that the estimate approaches the ground truth when reconstructability approaches infinity. This analysis enables us to precisely characterize cases when accurate reconstruction is achievable. We present qualitative results for the reconstruction of several real-world scenes from a series of 2D projections where high reconstructability can be guaranteed, and report quantitative results on motion capture sequences.
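The linear least squares formulation described in the abstract can be illustrated with a minimal sketch. It assumes a known 3x4 camera matrix per time instant, at most one 2D observation per instant (NaN rows mark missing data), and a fixed number of DCT basis vectors `K`. The function names, the synthetic cameras, and the choice of `K` are illustrative assumptions, not the authors' implementation; the cross validation step for selecting `K` and the geometric error refinement are omitted.

```python
import numpy as np

def dct_basis(F, K):
    """Return an F x K matrix whose columns are the first K orthonormal DCT-II vectors."""
    t = np.arange(F)[:, None]
    k = np.arange(K)[None, :]
    theta = np.sqrt(2.0 / F) * np.cos(np.pi * (2 * t + 1) * k / (2 * F))
    theta[:, 0] /= np.sqrt(2.0)  # DC column becomes the constant 1/sqrt(F)
    return theta

def skew(v):
    """Cross-product matrix [v]_x, so that skew(v) @ w == np.cross(v, w)."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def reconstruct_trajectory(Ps, xs, K):
    """Estimate an F x 3 trajectory from one 2D observation per time instant.

    Ps: list of F camera matrices (3x4); xs: F x 2 image points, NaN if missing.
    Each observation gives [x]_x (P[:, :3] X_i + P[:, 3]) = 0, which is linear
    in the DCT coefficients once X_i is written in the trajectory basis.
    """
    F = len(Ps)
    theta = dct_basis(F, K)
    rows, rhs = [], []
    for i in range(F):
        if np.any(np.isnan(xs[i])):
            continue  # missing observations simply contribute no equations
        S = skew(np.array([xs[i, 0], xs[i, 1], 1.0]))[:2]  # two independent rows
        Theta_i = np.kron(np.eye(3), theta[i:i + 1])       # 3 x 3K, maps coefficients to X_i
        rows.append(S @ Ps[i][:, :3] @ Theta_i)
        rhs.append(-S @ Ps[i][:, 3])
    beta, *_ = np.linalg.lstsq(np.vstack(rows), np.concatenate(rhs), rcond=None)
    return theta @ beta.reshape(3, K).T

# Synthetic check: a trajectory that lies exactly in the span of the basis,
# observed by a randomly moving camera (hypothetical setup for illustration).
rng = np.random.default_rng(0)
F, K = 40, 5
X_true = dct_basis(F, K) @ rng.normal(size=(K, 3))
Ps, xs = [], np.empty((F, 2))
for i in range(F):
    C = np.array([2 * rng.normal(), 2 * rng.normal(), -10.0 + rng.normal()])
    P = np.hstack([np.eye(3), -C[:, None]])  # identity rotation, camera center C
    h = P @ np.append(X_true[i], 1.0)
    Ps.append(P)
    xs[i] = h[:2] / h[2]
xs[5:10] = np.nan  # simulate missing data
X_hat = reconstruct_trajectory(Ps, xs, K)
err = np.max(np.abs(X_hat - X_true))
```

In this noise-free setting the random camera motion is not spanned by the trajectory basis, so the system is well conditioned and `err` should be near machine precision. A camera whose center follows a path that the basis can represent would make the system ill conditioned, which is the degeneracy the abstract refers to.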

Notes

  1. Related observations have been made in Avidan and Shashua (2000) and Hartley and Vidal (2008).

  2. For the purposes of this discussion, it should be noted that any global rigid motion of the object is equivalent to relative camera motion.

  3. Related empirical observations have been made by Ozden et al. (2004) and Akhter et al. (2008).

  4. We estimate camera poses automatically via structure from motion. See Sect. 5.2 for a description of the camera pose estimation algorithm.

  5. Ambiguity analyses have been investigated by Xiao et al. (2006), Vidal and Abretske (2006), Hartley and Vidal (2008), and Akhter et al. (2009). However, these analyses consider the ambiguity with the use of a shape basis representation, which utilizes the correlation across multiple points. In this section, we consider the case of the reconstruction of a single 3D point trajectory.

  6. \(\otimes \) is the Kronecker product and \(\mathbf {D}\) is a diagonal matrix which consists of \(\{a_1,\ldots ,a_F\}\), the scalar for each point along the trajectory.

  7. The method by Avidan and Shashua (2000) can only reconstruct a linear or conic trajectory.

  8. To solve the second part of the optimization, they have to additionally solve \({ \left( \begin{array}{c}d+2\\ d\end{array}\right) }\) linear equations.

References

  • Akhter, I., Sheikh, Y., & Khan, S. (2009). In defense of orthonormality constraints for nonrigid structure from motion. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.

  • Akhter, I., Sheikh, Y., Khan, S., & Kanade, T. (2008). Nonrigid structure from motion in trajectory space. In Advances in Neural Information Processing Systems.

  • Akhter, I., Sheikh, Y., Khan, S., & Kanade, T. (2011). Trajectory space: A dual representation for nonrigid structure from motion. IEEE Transactions on Pattern Analysis and Machine Intelligence, 33(7), 1442–1456.

  • Avidan, S., & Shashua, A. (2000). Trajectory triangulation: 3D reconstruction of moving points from a monocular image sequence. IEEE Transactions on Pattern Analysis and Machine Intelligence, 22, 348–357.

  • Bartoli, A., Gay-Bellile, V., Castellani, U., Peyras, J., Olsen, S. I., & Sayd, P. (2008). Coarse-to-fine low-rank structure-from-motion. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.

  • Blanz, V., & Vetter, T. (1999). A morphable model for the synthesis of 3D faces. In ACM Transactions on Graphics (SIGGRAPH).

  • Brand, M. (2001). Morphable 3D models from video. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.

  • Brand, M. (2005). A direct method for 3D factorization of nonrigid motion observed in 2D. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.

  • Bregler, C., Hertzmann, A., & Biermann, H. (1999). Recovering non-rigid 3D shape from image streams. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.

  • Dai, Y., Li, H., & He, M. (2012). A simple prior-free method for non-rigid structure-from-motion factorization. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.

  • Del Bue, A. (2008). A factorization approach to structure from motion with shape priors. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.

  • Del Bue, A., Lladó, X., & Agapito, L. (2006). Non-rigid metric shape and motion recovery from uncalibrated images using priors. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.

  • Faugeras, O., Luong, Q.-T., & Papadopoulou, T. (2001). The geometry of multiple images: The laws that govern the formation of images of a scene and some of their applications. Cambridge: MIT Press.

  • Fayad, J., Agapito, L., & Del Bue, A. (2010). Piecewise quadratic reconstruction of non-rigid surface from monocular sequences. In Proceedings of the European Conference on Computer Vision.

  • Fischler, M. A., & Bolles, R. C. (1981). Random sample consensus: A paradigm for model fitting with applications to image analysis and automated cartography. Communications of the ACM, 24(6), 381–395.

  • Gotardo, P. F. U., & Martinez, A. M. (2011). Computing smooth time-trajectories for camera and deformable shape in structure from motion with occlusion. IEEE Transactions on Pattern Analysis and Machine Intelligence, 33(10), 2051–2065.

  • Hamidi, M., & Pearl, J. (1976). Comparison of the cosine and Fourier transforms of Markov-I signal. IEEE Transactions on Acoustics, Speech, and Signal Processing, 24, 428–429.

  • Hartley, R., & Zisserman, A. (2004). Multiple view geometry in computer vision (2nd ed.). Cambridge: Cambridge University Press.

  • Hartley, R. (1997). In defense of the eight-point algorithm. IEEE Transactions on Pattern Analysis and Machine Intelligence, 19, 580–593.

  • Hartley, R., & Vidal, R. (2008). Perspective nonrigid shape and motion recovery. In Proceedings of the European Conference on Computer Vision.

  • Kaminski, J. Y., & Teicher, M. (2004). A general framework for trajectory triangulation. Journal of Mathematical Imaging and Vision, 21(1), 27–41.

  • Lladó, X., Del Bue, A., & Agapito, L. (2010). Non-rigid metric reconstruction from perspective cameras. Image and Vision Computing, 28(9), 1339–1353.

  • Longuet-Higgins, H. C. (1981). A computer algorithm for reconstructing a scene from two projections. Nature, 293, 133–135.

  • Lourakis, M. I. A., & Argyros, A. A. (2009). SBA: A software package for generic sparse bundle adjustment. ACM Transactions on Mathematical Software, 36(1), 1–30.

  • Lowe, D. G. (2004). Distinctive image features from scale-invariant keypoints. International Journal of Computer Vision, 60(2), 91–110.

  • Ma, Y., Soatto, S., Kosecka, J., & Sastry, S. S. (2003). An invitation to 3-D vision: From images to geometric models. New York: Springer.

  • Moreno-Noguer, F., Lepetit, V., & Fua, P. (2007). EPnP: Efficient perspective-n-point camera pose estimation. In Proceedings of the International Conference on Computer Vision.

  • Olsen, S., & Bartoli, A. (2007). Using priors for improving generalization in non-rigid structure-from-motion. In Proceedings of British Machine Vision Conference.

  • Östlund, J., Varol, A., Ngo, D. T., & Fua, P. (2012). Laplacian meshes for monocular 3D shape recovery. In Proceedings of the European Conference on Computer Vision.

  • Ozden, K. E., Cornelis, K., Eychen, L. V., & Gool, L. V. (2004). Reconstructing 3D trajectories of independently moving objects using generic constraints. Computer Vision and Image Understanding, 93, 1453–1471.

  • Paladini, M., Del Bue, A., Stosic, M., Dodig, M., Xavier, J., & Agapito, L. (2009). Factorization for non-rigid and articulated structure using metric projections. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.

  • Park, H. S., Shiratori, T., Matthews, I., & Sheikh, Y. (2010). 3D reconstruction of a moving point from a series of 2D projections. In Proceedings of the European Conference on Computer Vision.

  • Salzmann, M., Pilet, J., Ilic, S., & Fua, P. (2007). Surface deformation models for nonrigid 3D shape recovery. IEEE Transactions on Pattern Analysis and Machine Intelligence, 29(7), 1481–1487.

  • Shashua, A., & Wolf, L. (2000). Homography tensors: On algebraic entities that represent three views of static or moving planar points. In Proceedings of the European Conference on Computer Vision.

  • Sidenbladh, H., Black, M. J., & Fleet, D. J. (2000). Stochastic tracking of 3D human figures using 2D image motion. In Proceedings of the European Conference on Computer Vision.

  • Snavely, N., Seitz, S. M., & Szeliski, R. (2006). Photo tourism: Exploring photo collections in 3D. ACM Transactions on Graphics (SIGGRAPH).

  • Taylor, J., Jepson, A. D., & Kutulakos, K. N. (2010). Non-rigid structure from locally-rigid motion. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.

  • Tomasi, C., & Kanade, T. (1992). Shape and motion from image streams under orthography: A factorization method. International Journal of Computer Vision, 9(2), 137–154.

  • Torresani, L., Yang, D., Alexander, G., & Bregler, C. (2001). Tracking and modeling non-rigid objects with rank constraints. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.

  • Torresani, L., & Bregler, C. (2002). Space-time tracking. In Proceedings of the European Conference on Computer Vision.

  • Torresani, L., Hertzmann, A., & Bregler, C. (2008). Nonrigid structure-from-motion: Estimating shape and motion with hierarchical priors. IEEE Transactions on Pattern Analysis and Machine Intelligence, 30, 878–892.

  • Torresani, L., Hertzmann, A., & Bregler, C. (2003). Learning non-rigid 3D shape from 2D motion. In Advances in Neural Information Processing Systems.

  • Valmadre, J., & Lucey, S. (2012). General trajectory prior for non-rigid reconstruction. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.

  • Vidal, R., & Abretske, D. (2006). Nonrigid shape and motion from multiple perspective views. In Proceedings of the European Conference on Computer Vision.

  • Vidal, R., & Hartley, R. (2004). Motion segmentation with missing data by PowerFactorization and generalized PCA. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.

  • Wexler, Y., & Shashua, A. (2000). On the synthesis of dynamic scenes from reference views. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.

  • Wolf, L., & Shashua, A. (2002). On projection matrices \(\mathcal {P}^{k} \rightarrow \mathcal {P}^{2}, k = 3, \ldots , 6\), and their applications in computer vision. International Journal of Computer Vision, 48(1), 53–67.

  • Xiao, J., & Kanade, T. (2004). Non-rigid shape and motion recovery: Degenerate deformations. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.

  • Xiao, J., Chai, J., & Kanade, T. (2006). A closed-form solution to non-rigid shape and motion recovery. International Journal of Computer Vision, 67(2), 233–246.

  • Yan, J., & Pollefeys, M. (2005). A factorization-based approach to articulated motion recovery. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.

  • Zhu, S., Zhang, L., & Smith, B. M. (2010). Model evolution: An incremental approach to non-rigid structure from motion. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.

Acknowledgments

This work was supported by NSF Grant IIS-0916272.

Author information

Corresponding author

Correspondence to Hyun Soo Park.

Additional information

Communicated by Jun Sato.

Appendix

To prove Result 1, we need to show that the transformed trajectory basis, \(\mathbf {S}(\varvec{\Theta })\), spans the same space as the original trajectory basis vectors, where \(\mathbf {S}(\cdot )\) is a similarity transformation, i.e., \(\mathrm{col}(\mathbf {S}(\varvec{\Theta }))=\mathrm{col}(\varvec{\Theta })\), where \(\mathrm{col}(\varvec{\Theta })\) is the column space of \(\varvec{\Theta }\).

Proof

(i) Scale: \(\mathrm{col}(s\varvec{\Theta }) = \mathrm{col}(\varvec{\Theta })\), where \(s\) is a nonzero scalar.

(ii) Translation: a translation adds a constant offset to each coordinate of the trajectory, and this constant is spanned by the DC component of \(\varvec{\Theta }_\mathrm{DCT}\).

(iii) Rotation: without loss of generality, the trajectory basis can be rearranged as \(\bar{\varvec{\Theta }} = \mathrm{blkdiag}\{\varvec{\theta },\varvec{\theta },\varvec{\theta }\}\), where \(\varvec{\theta } \in \mathbb {R}^{F\times K}\) is the DCT trajectory basis for each trajectory. The rotated trajectory basis, \((\mathbf {R}\otimes \mathbf {I}_F)\bar{\varvec{\Theta }}\), spans the same space as the original trajectory basis vectors \(\bar{\varvec{\Theta }}\) because

$$\begin{aligned}&\mathrm{col}\left( (\mathbf {R}\otimes \mathbf {I}_F)\bar{\varvec{\Theta }}\right) \\&\qquad =\mathrm{col}\left( \left[ \begin{array}{ccc}{R}_{11}\mathbf {I}_F &{} {R}_{12}\mathbf {I}_F &{} {R}_{13}\mathbf {I}_F \\ {R}_{21}\mathbf {I}_F &{} {R}_{22}\mathbf {I}_F &{} {R}_{23}\mathbf {I}_F \\ {R}_{31}\mathbf {I}_F &{} {R}_{32}\mathbf {I}_F &{} {R}_{33}\mathbf {I}_F\end{array}\right] \left[ \begin{array}{ccc}\varvec{\theta }&{}&{}\\ {} &{}\varvec{\theta }&{}\\ &{}&{}\varvec{\theta }\end{array}\right] \right) \\&\qquad =\mathrm{col}\left( \left[ \begin{array}{ccc}{R}_{11} \varvec{\theta } &{} {R}_{12}\varvec{\theta } &{} {R}_{13}\varvec{\theta } \\ {R}_{21}\varvec{\theta } &{} {R}_{22}\varvec{\theta } &{} {R}_{23}\varvec{\theta } \\ {R}_{31}\varvec{\theta } &{} {R}_{32}\varvec{\theta } &{} {R}_{33}\varvec{\theta }\end{array}\right] \right) \\&\qquad =\mathrm{col}\left( \left[ \begin{array}{ccc}\varvec{\theta }&{}&{}\\ {} &{}\varvec{\theta }&{}\\ &{}&{}\varvec{\theta }\end{array}\right] \left[ \begin{array}{ccc}{R}_{11}\mathbf {I}_K &{} {R}_{12}\mathbf {I}_K &{} {R}_{13}\mathbf {I}_K \\ {R}_{21}\mathbf {I}_K &{} {R}_{22}\mathbf {I}_K &{} {R}_{23}\mathbf {I}_K\\ {R}_{31}\mathbf {I}_K&{} {R}_{32}\mathbf {I}_K&{} {R}_{33}\mathbf {I}_K\end{array}\right] \right) \\&\qquad =\mathrm{col}\left( \bar{\varvec{\Theta }}(\mathbf {R}\otimes \mathbf {I}_K)\right) \\&\qquad =\mathrm{col}(\bar{\varvec{\Theta }}) \end{aligned}$$

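where \(\otimes \) is the Kronecker product, \(\mathbf {R}\) is a \(3\times 3\) rotation matrix, and \(\mathbf {I}_F\) and \(\mathbf {I}_K\) are the \(F \times F\) and \(K \times K\) identity matrices. The last equality holds because \(\mathbf {R}\otimes \mathbf {I}_K\) is invertible, so right-multiplying by it does not change the column space. \(\square \)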
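The rotation and translation steps of the proof can be checked numerically. The sketch below is an illustration only, assuming an orthonormal DCT-II basis and arbitrary small sizes `F` and `K`; the variable names are hypothetical and not taken from the paper.

```python
import numpy as np

def dct_basis(F, K):
    """Return an F x K matrix whose columns are the first K orthonormal DCT-II vectors."""
    t = np.arange(F)[:, None]
    k = np.arange(K)[None, :]
    theta = np.sqrt(2.0 / F) * np.cos(np.pi * (2 * t + 1) * k / (2 * F))
    theta[:, 0] /= np.sqrt(2.0)  # DC column becomes the constant 1/sqrt(F)
    return theta

F, K = 12, 4
theta = dct_basis(F, K)
Theta_bar = np.kron(np.eye(3), theta)  # blkdiag{theta, theta, theta}, size 3F x 3K

# Random rotation: orthogonalize a random matrix, then force det = +1.
rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
R = Q * np.sign(np.linalg.det(Q))

# (iii) Rotation: (R kron I_F) Theta_bar equals Theta_bar (R kron I_K).
lhs = np.kron(R, np.eye(F)) @ Theta_bar
rhs = Theta_bar @ np.kron(R, np.eye(K))
commutes = np.allclose(lhs, rhs)

# Equal column spaces: stacking the rotated basis next to the original
# must not increase the rank beyond 3K.
rank_joint = np.linalg.matrix_rank(np.hstack([Theta_bar, lhs]))

# (ii) Translation: a constant offset per coordinate lies in col(Theta_bar),
# because the DC column of theta is constant.
offset = np.kron(rng.normal(size=3), np.ones(F))
coeffs = Theta_bar.T @ offset  # valid projection since columns are orthonormal
translation_residual = np.linalg.norm(offset - Theta_bar @ coeffs)
```

Here `commutes` should be true, `rank_joint` should equal `3 * K`, and `translation_residual` should be zero up to rounding, which matches steps (ii) and (iii) for this particular basis and rotation.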

Cite this article

Park, H.S., Shiratori, T., Matthews, I. et al. 3D Trajectory Reconstruction under Perspective Projection. Int J Comput Vis 115, 115–135 (2015). https://doi.org/10.1007/s11263-015-0804-2

  • DOI: https://doi.org/10.1007/s11263-015-0804-2
