NRST: Non-rigid Surface Tracking from Monocular Video

  • Marc Habermann
  • Weipeng Xu
  • Helge Rhodin
  • Michael Zollhöfer
  • Gerard Pons-Moll
  • Christian Theobalt
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11269)


We propose an efficient method for non-rigid surface tracking from monocular RGB videos. Given a video and a template mesh, our algorithm sequentially registers the template non-rigidly to each frame. We formulate the per-frame registration as an optimization problem that includes a novel texture term tailored to tracking objects with uniform texture but fine-scale structure, such as the regular micro-structural patterns of fabric. The texture term exploits the orientation information in these micro-structures, e.g., the yarn patterns of fabrics. This enables us to accurately track uniformly colored materials with high-frequency micro-structures, for which traditional photometric terms are usually less effective. The results demonstrate the effectiveness of our method on both general textured non-rigid objects and monochromatic fabrics.
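
The abstract does not spell out how the orientation information is computed, so the following is only a minimal sketch of one standard way to extract a per-pixel dominant orientation from a grayscale image, using a smoothed structure tensor. It is not the paper's texture term; the function name `dominant_orientation` and the `window` parameter are hypothetical choices made for illustration.

```python
import numpy as np


def dominant_orientation(image, window=5):
    """Per-pixel dominant gradient orientation in [0, pi).

    Illustrative structure-tensor estimate (an assumption, not the
    paper's formulation). `window` is an odd box-filter width.
    """
    img = np.asarray(image, dtype=np.float64)
    # Image derivatives along rows (y) and columns (x).
    gy, gx = np.gradient(img)

    def box_filter(a):
        # Separable box filter with edge padding; output keeps the input shape.
        pad = window // 2
        kernel = np.ones(window) / window
        padded = np.pad(a, pad, mode="edge")
        rows = np.apply_along_axis(
            lambda r: np.convolve(r, kernel, mode="valid"), 1, padded)
        return np.apply_along_axis(
            lambda c: np.convolve(c, kernel, mode="valid"), 0, rows)

    # Locally averaged structure tensor entries.
    jxx = box_filter(gx * gx)
    jyy = box_filter(gy * gy)
    jxy = box_filter(gx * gy)

    # Angle of the dominant eigenvector, folded into [0, pi) because an
    # orientation has no sign (theta and theta + pi are equivalent).
    theta = 0.5 * np.arctan2(2.0 * jxy, jxx - jyy)
    return np.mod(theta, np.pi)


# Synthetic "yarn-like" pattern: vertical stripes of uniform color variation.
x = np.arange(32)
stripes = np.tile(np.sin(0.5 * x), (32, 1))
theta_vertical = dominant_orientation(stripes)
theta_horizontal = dominant_orientation(stripes.T)
```

On the synthetic stripes the intensity varies only along one axis, so the estimated orientation is constant across the image even though there is almost no distinctive color information to match. Any residual built on such angles would need to compare them modulo pi.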



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. Max Planck Institute for Informatics, Saarbrücken, Germany
  2. EPFL, Lausanne, Switzerland
  3. Stanford University, Stanford, USA
