International Journal of Computer Vision, Volume 116, Issue 2, pp 115–135

A Bayesian Approach to Multi-view 4D Modeling

  • Chun-Hao Huang
  • Cedric Cagniart
  • Edmond Boyer
  • Slobodan Ilic


This paper considers the problem of automatically recovering temporally consistent animated 3D models of arbitrary shapes in multi-camera setups. An approach is presented that takes as input a sequence of frame-wise reconstructed surfaces and iteratively deforms a reference surface such that it fits the input observations. This approach addresses several issues in this field, including large frame-to-frame deformations, noise, missing data, outliers and shapes composed of multiple components with arbitrary geometries. The problem is cast as a geometric registration with two major features. First, surface deformations are modeled using mesh decomposition into elements called patches. This strategy ensures robustness by enabling flexible regularization priors through inter-patch rigidity constraints. Second, registration is formulated as a Bayesian estimation that alternates between probabilistic data-model association and deformation parameter estimation. This accounts for uncertainties in the acquisition process and allows for noise, outliers and missing geometries in the observed meshes. In the case of marker-less 3D human motion capture, this framework can be specialized further with additional articulated motion constraints. Extensive experiments on various 4D datasets show that complex scenes with multiple objects of arbitrary nature can be processed in a robust way. They also demonstrate that the framework can capture human motion and provides visually convincing as well as quantitatively reliable human poses.
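The alternation between probabilistic association and parameter estimation can be illustrated with a deliberately reduced sketch. It is not the paper's method: it assumes a single rigid patch that undergoes pure translation, isotropic Gaussian noise, and a constant-likelihood outlier class, whereas the paper uses many patches coupled by rigidity priors. The names `em_translation`, `outlier_lik` and `min_sigma2` are hypothetical and chosen only for this illustration.

```python
import math

def em_translation(model, observed, sigma2=1.0, outlier_lik=1e-4,
                   min_sigma2=1e-6, iters=20):
    """EM for a translation-only fit of `model` to `observed` (illustrative).

    Returns (t, sigma2, inlier_prob), where inlier_prob[i] is the posterior
    probability, from the last E-step, that observation i came from the model.
    """
    dim = len(model[0])
    t = [0.0] * dim
    inlier_prob = [0.0] * len(observed)
    for _ in range(iters):
        # Gaussian normalizer times a uniform prior over the model points.
        norm = (2.0 * math.pi * sigma2) ** (-dim / 2.0) / len(model)
        w_sum = 0.0          # total soft-assignment weight
        r_sum = [0.0] * dim  # weighted sum of offsets y - x
        r2_sum = 0.0         # weighted sum of squared offset norms
        # E-step: soft data-model association with an outlier class.
        for i, y in enumerate(observed):
            liks = []
            for x in model:
                d2 = sum((y[k] - x[k] - t[k]) ** 2 for k in range(dim))
                liks.append(norm * math.exp(-d2 / (2.0 * sigma2)))
            z = sum(liks) + outlier_lik
            inlier_prob[i] = sum(liks) / z
            for x, lik in zip(model, liks):
                w = lik / z
                w_sum += w
                for k in range(dim):
                    r_sum[k] += w * (y[k] - x[k])
                r2_sum += w * sum((y[k] - x[k]) ** 2 for k in range(dim))
        if w_sum == 0.0:
            break  # every observation was explained as an outlier
        # M-step: closed-form updates of the translation and noise variance.
        t = [r / w_sum for r in r_sum]
        t2 = sum(v * v for v in t)
        sigma2 = max((r2_sum - w_sum * t2) / (dim * w_sum), min_sigma2)
    return t, sigma2, inlier_prob

# Toy data (assumed): model shifted by (0.5, -0.3) with small noise, plus one outlier.
model = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0), (5.0, 3.0)]
observed = [(0.52, -0.31), (10.47, -0.28), (0.51, 9.71), (10.5, 9.68),
            (5.5, 2.7), (40.0, 40.0)]
t, sigma2, inlier_prob = em_translation(model, observed)
print(t, sigma2, inlier_prob)
```

In this toy setting the far-away observation receives almost no weight in the M-step, which is the mechanism by which a soft association with an outlier class tolerates spurious geometry. Replacing the single translation with per-patch rigid transforms and a rigidity prior would move the sketch closer to the formulation summarized in the abstract.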


Keywords: Multi-view, Deformable surface tracking, Mesh registration, Expectation–Maximization, 3D human motion tracking, Bayesian network



This work was partially funded by Deutsche Telekom Laboratories and partly conducted in their laboratory.



Copyright information

© Springer Science+Business Media New York 2015

Authors and Affiliations

  • Chun-Hao Huang (1, corresponding author)
  • Cedric Cagniart (1)
  • Edmond Boyer (2)
  • Slobodan Ilic (1)
  1. Technische Universität München, Munich, Germany
  2. LJK - INRIA Grenoble Rhône-Alpes, Grenoble, France
