Abstract
The metaverse is widely regarded as the next generation of the Internet, and virtual humans play an important role in it. Simultaneously representing the motions and emotions of virtual humans has attracted growing attention in academia and industry, since vivid, continuous simulation of virtual humans significantly improves the user experience. Unlike existing work, which focuses on either facial expressions or body motions alone, this paper presents SimuMan, a novel real-time virtual human prototyping system that expresses the motions and emotions of virtual humans simultaneously. SimuMan not only lets users generate personalized virtual humans in the metaverse, but also lets them naturally and simultaneously present six facial expressions and ten limb motions, and continuously generate varied facial expressions by adjusting parameters. We evaluate SimuMan both objectively and subjectively to demonstrate its fidelity, naturalness, and real-time performance. The experimental results show that SimuMan offers low latency, good interactivity, easy operation, strong robustness, and broad applicability.
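The parameter-driven generation of continuous facial expressions mentioned above is commonly realized by blending per-expression weight vectors (e.g. blendshape coefficients). The following is a minimal illustrative sketch, not the paper's implementation; the expression names and weight values are hypothetical placeholders.

```python
# Illustrative sketch (assumption, not from the paper): continuous facial
# expressions produced by linearly interpolating blendshape weight vectors.
import numpy as np

# Hypothetical basis: one weight vector per discrete expression
# (a real system would have one weight per blendshape channel).
EXPRESSIONS = {
    "neutral":   np.array([0.0, 0.0, 0.0]),
    "happy":     np.array([1.0, 0.2, 0.0]),
    "surprised": np.array([0.0, 0.1, 1.0]),
}

def blend_expression(src: str, dst: str, t: float) -> np.ndarray:
    """Interpolate blendshape weights between two expressions, t in [0, 1]."""
    t = min(max(t, 0.0), 1.0)  # clamp the blending parameter
    return (1.0 - t) * EXPRESSIONS[src] + t * EXPRESSIONS[dst]

# Varying t continuously sweeps the face from one expression to another.
weights = blend_expression("neutral", "happy", 0.5)
```

Driving `t` from the UI or from a timer yields the smooth expression transitions the abstract describes; per-frame, the resulting weights would be written to the face mesh's blendshape channels.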
Acknowledgement
This work is supported by NSFC project (Grant No. 62072150).
Copyright information
© 2022 Springer Nature Switzerland AG
About this paper
Cite this paper
Zhang, M., Wang, Y., Zhou, J., Pan, Z. (2022). SimuMan: A Simultaneous Real-Time Method for Representing Motions and Emotions of Virtual Human in Metaverse. In: Tekinerdogan, B., Wang, Y., Zhang, LJ. (eds) Internet of Things – ICIOT 2021. ICIOT 2021. Lecture Notes in Computer Science(), vol 12993. Springer, Cham. https://doi.org/10.1007/978-3-030-96068-1_6
DOI: https://doi.org/10.1007/978-3-030-96068-1_6
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-96067-4
Online ISBN: 978-3-030-96068-1
eBook Packages: Computer Science, Computer Science (R0)