Direct-from-Video: Unsupervised NRSfM

  • Karel Lebeda
  • Simon Hadfield
  • Richard Bowden
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9915)


In this work we describe a novel approach to online dense non-rigid structure from motion. The problem is reformulated, incorporating ideas from visual object tracking, to provide a more general and unified technique, with feedback between the reconstruction and point-tracking algorithms. The resulting algorithm overcomes the limitations of many conventional techniques, such as the need for a reference image/template or precomputed trajectories. The technique can also be applied in traditionally challenging scenarios, such as modelling objects with strong self-occlusions or from an extreme range of viewpoints. The proposed algorithm needs no offline pre-learning and does not assume that the modelled object stays rigid at the beginning of the video sequence. Our experiments show that in traditional scenarios, the proposed method can achieve better accuracy than the current state of the art while using less supervision. Additionally, we perform reconstructions in challenging new scenarios where state-of-the-art approaches break down and where our method improves performance by up to an order of magnitude.


Non-rigid SfM, Structure from motion, Visual tracking, Template-free, Gaussian process



This work was supported by the EPSRC project EP/I011811/1: “Learning to Recognise Dynamic Visual Content from Broadcast Footage” and the SNSF Sinergia project “Scalable Multimodal Sign Language Technology for Sign Language Learning and Assessment” (SMILE) grant agreement number CRSII2 160811.

Supplementary material

Supplementary material 1 (mp4 9993 KB)

Supplementary material 2 (mp4 1029 KB)

Supplementary material 3 (mp4 1695 KB)

Supplementary material 4 (mp4 9746 KB)



Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  1. Centre for Vision, Speech and Signal Processing, University of Surrey, Guildford, UK