A nonparametric regression model for virtual humans generation

Chou, Yun-Feng; Shih, Zen-Chung

doi:10.1007/s11042-009-0412-7

Published: 30 October 2009

Volume 47, pages 163–187, (2010)

Multimedia Tools and Applications

Abstract

In this paper, we propose a novel nonparametric regression model that generates virtual humans from still images for applications in next generation (NG) environments. The model automatically synthesizes deformed character shapes using kernel regression with elliptic radial basis functions (ERBFs) and locally weighted regression (LOESS). Kernel regression with ERBFs represents the deformed character shapes and creates lively animated talking faces. To preserve patterns within the shapes, LOESS fits the details with local control. The results show that our method simulates plausible movements for character animation, including body movement simulation, novel view synthesis, and expressive facial animation synchronized with input speech. The proposed model is therefore especially suitable for intelligent multimedia applications in virtual human generation.

Acknowledgements

This work is partially supported by the National Science Council, Republic of China, under grant NSC 98-2221-E-009-123-MY3. We would like to thank Prof. Sang-Soo Yeo and the reviewers for their helpful suggestions.

Author information

Authors and Affiliations

Department of Computer Science, National Chiao Tung University, Hsinchu City, Taiwan
Yun-Feng Chou & Zen-Chung Shih

Corresponding author

Correspondence to Zen-Chung Shih.

Appendix 1 Hyper radial basis functions (HRBFs)

An HRBF kernel is computed from the Mahalanobis distance. In matrix form it is defined as follows:

$$ k\left( \vec u, \vec v \right) = \exp \left( - \left( \vec u - \vec v \right)^T \Sigma \left( \vec u - \vec v \right) \right), \quad \text{for } \Sigma = \mathrm{diag}\left( \sigma_1^{-2}, \ldots, \sigma_N^{-2} \right) \text{ and } \sigma_1, \ldots, \sigma_N \in \Re^{+}, $$

(19)

where $ \sigma_1^2, \ldots, \sigma_N^2 $ should be read as the (diagonal) covariance of a multidimensional Gaussian rather than as a single variance. An HRBF differs from a standard RBF insofar as each axis of the input space $ \chi \subseteq \ell_2^N $ (the space of square summable sequences of length N) has its own smoothing parameter, i.e., a separate scale on which differences along that axis are viewed. It is worth mentioning that RBF kernels map the input space onto the surface of an infinite dimensional hypersphere, because $ k(\vec u, \vec u) = 1 $ for every $ \vec u $. Note that N = 2 in the arbitrary directional ERBF kernel corresponds to analyzing the data distribution along the major axis and the minor axis of an ellipse. Eq. (1) is constructed along the orientation of the arbitrary directional ERBF (its major and minor axes).
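
The kernel of Eq. (19) can be illustrated with a short sketch, assuming NumPy. The function names `hrbf_kernel` and `erbf_kernel_2d` and the rotation angle `theta` are hypothetical and do not come from the paper. The rotated two-dimensional variant is only one plausible reading of an "arbitrary directional" ERBF: it evaluates the axis-aligned kernel in a frame turned so that the first axis lies along the major axis of the ellipse.

```python
import numpy as np


def hrbf_kernel(u, v, sigma):
    """Eq. (19): exp(-(u - v)^T Sigma (u - v)) with Sigma = diag(sigma_i^-2)."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    if np.any(sigma <= 0):
        raise ValueError("every sigma_i must be positive")
    # Dividing by sigma applies a separate smoothing scale to each axis.
    d = (u - v) / sigma
    return float(np.exp(-np.dot(d, d)))


def erbf_kernel_2d(u, v, sigma_major, sigma_minor, theta):
    """Assumed rotated variant for N = 2 (illustrative, not the paper's code).

    theta is the angle of the major axis measured from the x-axis.
    """
    c, s = np.cos(theta), np.sin(theta)
    # Rotation by -theta: expresses the difference in the ellipse's own frame.
    to_ellipse_frame = np.array([[c, s], [-s, c]])
    d = to_ellipse_frame @ (np.asarray(u, dtype=float) - np.asarray(v, dtype=float))
    return hrbf_kernel(d, np.zeros(2), [sigma_major, sigma_minor])


if __name__ == "__main__":
    # A point compared with itself always gives 1.
    print(hrbf_kernel([1.0, 2.0], [1.0, 2.0], [1.0, 3.0]))
    # The same offset decays more slowly along the axis with the larger sigma.
    print(hrbf_kernel([1.0, 0.0], [0.0, 0.0], [2.0, 1.0]))
    print(hrbf_kernel([0.0, 1.0], [0.0, 0.0], [2.0, 1.0]))
```

When all `sigma` entries are equal, the kernel reduces to the standard isotropic RBF, which is the sense in which the HRBF generalizes it.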

About this article

Cite this article

Chou, YF., Shih, ZC. A nonparametric regression model for virtual humans generation. Multimed Tools Appl 47, 163–187 (2010). https://doi.org/10.1007/s11042-009-0412-7

Issue Date: March 2010
DOI: https://doi.org/10.1007/s11042-009-0412-7
