
WeAnimate: Motion-coherent animation generation from video data

Published in Multimedia Tools and Applications (2022)

Abstract

Compared with the rapid growth of user participatory culture and the resulting User-Generated Content (UGC), User-Generated Animation (UGA) has seen only limited development, largely because of the lack of easy-to-use tools. In this paper, we develop a machine-learning-based tool called WeAnimate that can generate an animation clip from only a character picture and a source video. Users specify the character's movements through the video. In the tool, a classifier model is trained to identify the motion in every video frame, from which the motion sequence is obtained. A strategy combining skeletal animation with a neural network is then presented to produce multiple auxiliary images from a single original image of the new character. Finally, these images are spliced into a new animation according to the time order of the motion sequence of the video frames. We evaluate the capability, effects, and performance of this animation generation tool in practical applications. The evaluation demonstrates the usability and effectiveness of WeAnimate.
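To make the described pipeline concrete, the sketch below outlines its three stages (motion classification, pose-driven image generation, and time-ordered splicing) in Python. The class and function names (MotionClassifier, PoseDrivenGenerator, weanimate) are hypothetical placeholders introduced only for illustration; they do not reflect the authors' actual implementation or API.

from typing import List


class MotionClassifier:
    """Stand-in for the trained classifier that labels the motion in each video frame."""

    def predict(self, frame) -> str:
        raise NotImplementedError  # would return a motion label, e.g. "raise_arm"


class PoseDrivenGenerator:
    """Stand-in for the skeletal-animation-plus-neural-network image generator."""

    def render(self, character_image, motion_label: str):
        raise NotImplementedError  # would return an auxiliary image of the character


def weanimate(video_frames: List, character_image,
              classifier: MotionClassifier,
              generator: PoseDrivenGenerator) -> List:
    """Generate an animation clip for a new character from a source video.

    1. Classify the motion in every source frame to obtain the motion sequence.
    2. Render one auxiliary image of the new character per identified motion.
    3. Splice the rendered images in the time order of the source frames.
    """
    motion_sequence = [classifier.predict(frame) for frame in video_frames]
    auxiliary_images = [generator.render(character_image, motion)
                        for motion in motion_sequence]
    return auxiliary_images  # played back in frame order, these form the new clip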






Author information

Corresponding author: Xiaohong Chen.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Yin, H., Liu, J., Chen, X. et al. WeAnimate: Motion-coherent animation generation from video data. Multimed Tools Appl 81, 20685–20703 (2022). https://doi.org/10.1007/s11042-022-12359-4

