Professional dance is characterized by high impulsiveness, elegance, and aesthetic beauty. Reaching a professional level requires years of long and exhausting practice, good physical condition, musicality, and a good understanding of choreography. Capturing dance motions and transferring them to digital avatars is common in the film and entertainment industries. However, access to high-quality dance data remains very limited, mainly because of the many practical difficulties in capturing the movements of dancers, which make large-scale data acquisition prohibitive. In this paper, we present a model that enhances the professionalism of amateur dance movements, improving movement quality in both the spatial and temporal domains. Our model consists of a dance-to-music alignment stage, which learns the optimal temporal alignment path between dance and music, and a dance-enhancement stage, which injects features of professionalism in both the spatial and temporal domains. To learn a homogeneous distribution and a credible mapping between the heterogeneous professional and amateur datasets, we generate amateur data from professional dances taken from the AIST++ dataset. We demonstrate the effectiveness of our method by comparing it with two baseline motion transfer methods through thorough qualitative visual comparisons, quantitative metrics, and a perceptual study. We also analyze the temporal and spatial modules to examine the mechanisms and necessity of the key components of our framework.
This research was supported by the National Natural Science Foundation of China (Grant No. 62072284), the Natural Science Foundation of Shandong Province (Grant No. ZR2021MF102), a Special Project of Shandong Province for Software Engineering (Grant No. 11480004042015), and internal funds from the University of Cyprus. The authors would like to thank Anastasios Yiannakidis (University of Cyprus) for capturing the amateur dances, and the volunteers for participating in the perceptual studies. The authors would also like to thank the anonymous reviewers and editors for their fruitful comments and suggestions.
The authors have no competing interests to declare relevant to the content of this article.
Qiu Zhou is a postgraduate in the School of Computer Science and Technology at Shandong University. She received her B.Sc. degree from Shandong University in 2019. Her main interests are motion analysis and synthesis.
Manyi Li is an associate researcher in the School of Software at Shandong University. She received her B.Sc. and Ph.D. degrees from Shandong University in 2013 and 2018, respectively, and was a postdoctoral fellow in the GrUVi Lab, Simon Fraser University, during 2019–2021. Her main interests are 3D content creation and understanding.
Qiong Zeng is an associate researcher in the School of Computer Science and Technology at Shandong University. She received her B.Sc. and Ph.D. degrees from Nanchang University and Shandong University in 2010 and 2015, respectively. Her main interests are motion analysis and visualization.
Andreas Aristidou is an assistant professor in the Department of Computer Science, University of Cyprus. He has been a Cambridge European Trust Fellow at the University of Cambridge, where he obtained his Ph.D. degree. He received his B.Sc. degree from the National and Kapodistrian University of Athens and has an M.Sc. degree with honors from King’s College London. His main research interests are in the areas of computer graphics and character animation.
Xiaojing Zhang is an undergraduate student in Taishan College of Shandong University. She entered the university in 2019. Her main interests are focused on computer graphics and visualization.
Lin Chen is an associate professor in the Qingdao Institute of Humanities and Social Sciences, Shandong University. She received her doctorate degree from the Freie Universität Berlin. Her research interests include the aesthetic ideas of Baumgarten and their far-reaching influence, theatre and dance research, and cultural studies.
Changhe Tu is a professor in the School of Computer Science and Technology, Shandong University. He received his B.Sc., M.Eng., and Ph.D. degrees from Shandong University in 1990, 1993, and 2003, respectively. His research interests are in the areas of computer graphics and robotics.
Supplementary material, approximately 51.5 MB.
Cite this article
Zhou, Q., Li, M., Zeng, Q. et al. Let’s all dance: Enhancing amateur dance motions. Comp. Visual Media 9, 531–550 (2023). https://doi.org/10.1007/s41095-022-0292-6