SimuMan: A Simultaneous Real-Time Method for Representing Motions and Emotions of Virtual Human in Metaverse

  • Conference paper
  • In: Internet of Things – ICIOT 2021 (ICIOT 2021)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 12993)

Abstract

The metaverse is the next generation of the gaming Internet, and virtual humans play an important role in it. Simultaneously representing the motions and emotions of virtual humans is attracting growing attention in both academia and industry, because vivid, continuous simulation of virtual humans significantly improves the user experience. Unlike existing work, which focuses on either facial expressions or body motions alone, this paper presents a novel real-time virtual human prototyping system that expresses the motions and emotions of virtual humans simultaneously (SimuMan for short). SimuMan not only lets users generate personalized virtual humans in the metaverse, but also lets them naturally and simultaneously present six facial expressions and ten limb motions, and continuously generate varied facial expressions by setting parameters. We evaluate SimuMan both objectively and subjectively to demonstrate its fidelity, naturalness, and real-time performance. The experimental results show that SimuMan achieves low latency, good interactivity, easy operation, strong robustness, and broad applicability.
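The abstract's claim that users can "continuously generate varied facial expressions by setting parameters" describes a parameter-driven facial animation scheme. The paper's actual parameterization is not given on this page, so the following is only a minimal sketch, assuming a linear blendshape model: each of the six expressions (taken here to be the Ekman set, which the abstract does not enumerate) is stored as a per-vertex offset from a neutral mesh, and a six-element weight vector is the user-set parameter that blends them continuously. All names and array shapes are hypothetical.

    import numpy as np

    # Hypothetical blendshape setup: six basic expressions, each stored as a
    # per-vertex offset from the neutral face mesh. Blending is linear, so
    # any weight vector in [0, 1]^6 yields a valid intermediate expression.
    EXPRESSIONS = ["anger", "disgust", "fear", "happiness", "sadness", "surprise"]

    def blend_face(neutral, offsets, weights):
        """Return the blended mesh: neutral + sum_i weights[i] * offsets[i].

        neutral: (V, 3) vertex positions of the neutral face
        offsets: (6, V, 3) per-expression vertex offsets
        weights: (6,) user-set expression parameters
        """
        weights = np.clip(weights, 0.0, 1.0)  # keep parameters in range
        return neutral + np.tensordot(weights, offsets, axes=1)

    # Example: a face that is 70% happy with a hint of surprise.
    V = 5000                                    # vertex count (illustrative)
    rng = np.random.default_rng(0)
    neutral = rng.standard_normal((V, 3))       # stand-in for a real face mesh
    offsets = 0.1 * rng.standard_normal((6, V, 3))
    w = np.zeros(6)
    w[EXPRESSIONS.index("happiness")] = 0.7
    w[EXPRESSIONS.index("surprise")] = 0.2
    print(blend_face(neutral, offsets, w).shape)   # (5000, 3)

Sweeping the weight vector over time (for example, linearly interpolating between two weight settings each frame) is what would make expression changes continuous rather than jumps between fixed poses.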



Acknowledgement

This work is supported by an NSFC project (Grant No. 62072150).

Author information

Corresponding author

Correspondence to Zhigeng Pan.


Copyright information

© 2022 Springer Nature Switzerland AG

About this paper

Cite this paper

Zhang, M., Wang, Y., Zhou, J., Pan, Z. (2022). SimuMan: A Simultaneous Real-Time Method for Representing Motions and Emotions of Virtual Human in Metaverse. In: Tekinerdogan, B., Wang, Y., Zhang, L.-J. (eds) Internet of Things – ICIOT 2021. ICIOT 2021. Lecture Notes in Computer Science, vol. 12993. Springer, Cham. https://doi.org/10.1007/978-3-030-96068-1_6

  • DOI: https://doi.org/10.1007/978-3-030-96068-1_6

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-96067-4

  • Online ISBN: 978-3-030-96068-1

  • eBook Packages: Computer Science, Computer Science (R0)
