Abstract
The metaverse is widely regarded as the next generation of the Internet, and virtual humans play an important role in it. Simultaneously representing the motions and emotions of virtual humans has attracted growing attention in academia and industry, since vivid, continuous simulation of virtual humans significantly improves the user experience. Unlike existing work, which focuses on either facial expressions or body motions alone, this paper presents SimuMan, a novel real-time virtual human prototyping system that expresses the motions and emotions of virtual humans simultaneously. SimuMan not only lets users generate personalized virtual humans in the metaverse, but also lets them naturally and simultaneously present six facial expressions and ten limb motions, and continuously generate varied facial expressions by adjusting parameters. We evaluate SimuMan both objectively and subjectively to demonstrate its fidelity, naturalness, and real-time performance. The experimental results show that SimuMan offers low latency, good interactivity, easy operation, strong robustness, and broad applicability.
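The parameter-driven generation of continuous facial expressions mentioned above is commonly realized by blending per-expression weight vectors (e.g. blendshape coefficients). The following is a minimal illustrative sketch, not the paper's implementation; the expression names and weight values are hypothetical placeholders.

```python
# Illustrative sketch (assumption, not from the paper): continuous facial
# expressions produced by linearly interpolating blendshape weight vectors.
import numpy as np

# Hypothetical basis: one weight vector per discrete expression
# (a real system would have one weight per blendshape channel).
EXPRESSIONS = {
    "neutral":   np.array([0.0, 0.0, 0.0]),
    "happy":     np.array([1.0, 0.2, 0.0]),
    "surprised": np.array([0.0, 0.1, 1.0]),
}

def blend_expression(src: str, dst: str, t: float) -> np.ndarray:
    """Interpolate blendshape weights between two expressions, t in [0, 1]."""
    t = min(max(t, 0.0), 1.0)  # clamp the blending parameter
    return (1.0 - t) * EXPRESSIONS[src] + t * EXPRESSIONS[dst]

# Varying t continuously sweeps the face from one expression to another.
weights = blend_expression("neutral", "happy", 0.5)
```

Driving `t` from the UI or from a timer yields the smooth expression transitions the abstract describes; per-frame, the resulting weights would be written to the face mesh's blendshape channels.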
Acknowledgement
This work is supported by NSFC project (Grant No. 62072150).
Copyright information
© 2022 Springer Nature Switzerland AG
About this paper
Cite this paper
Zhang, M., Wang, Y., Zhou, J., Pan, Z. (2022). SimuMan: A Simultaneous Real-Time Method for Representing Motions and Emotions of Virtual Human in Metaverse. In: Tekinerdogan, B., Wang, Y., Zhang, LJ. (eds) Internet of Things – ICIOT 2021. ICIOT 2021. Lecture Notes in Computer Science(), vol 12993. Springer, Cham. https://doi.org/10.1007/978-3-030-96068-1_6
DOI: https://doi.org/10.1007/978-3-030-96068-1_6
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-96067-4
Online ISBN: 978-3-030-96068-1
eBook Packages: Computer Science, Computer Science (R0)