
A multi-modal dance corpus for research into interaction between humans in virtual environments

  • Original Paper
  • Published in: Journal on Multimodal User Interfaces

Abstract

We present a new, freely available, multimodal corpus for research into real-time realistic interaction between humans in online virtual environments, among other areas. The corpus targets an online dance class application in which students, with avatars driven by whatever 3D capture technology is locally available to them, learn choreographies under teacher guidance in an online virtual dance studio. To support this scenario, the corpus consists of student/teacher dance choreographies captured concurrently at two different sites using a variety of media modalities, including synchronised audio rigs, multiple cameras, wearable inertial measurement devices and depth sensors. Each of the several dancers performs a set of fixed choreographies, which are graded according to specific evaluation criteria, and ground-truth dance choreography annotations are provided. For unsynchronised sensor modalities, the corpus additionally includes distinctive events to enable data stream synchronisation. The total duration of the recorded content is 1 h and 40 min per sensor, amounting to 55 h of recordings across all sensors. Although the corpus is tailored specifically to the online dance class scenario, the data is free to download and use for any research and development purpose.
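Since several modalities are captured without hardware synchronisation, alignment relies on the distinctive events included in the corpus. As a minimal illustrative sketch (not the authors' pipeline, and with hypothetical file names rather than actual corpus paths), one could estimate the offset between two such streams by cross-correlating their amplitude envelopes around a shared event:

```python
# A minimal sketch, not the authors' tooling: estimate the time offset
# between two unsynchronised recordings by cross-correlating their
# amplitude envelopes around a shared, distinctive event (e.g. a clap).
# The file names below are hypothetical placeholders, not corpus paths.
import numpy as np
from scipy.io import wavfile
from scipy.signal import correlate

def envelope(x, win=441):
    """Rectified, smoothed amplitude envelope of a mono signal."""
    x = x[:, 0] if x.ndim > 1 else x          # keep one channel if stereo
    return np.convolve(np.abs(x.astype(np.float64)),
                       np.ones(win) / win, mode="same")

# Two views of the same performance from unsynchronised capture rigs.
rate_a, a = wavfile.read("rig_a_audio.wav")    # hypothetical file name
rate_b, b = wavfile.read("rig_b_audio.wav")    # hypothetical file name
assert rate_a == rate_b, "resample one stream first if the rates differ"

env_a, env_b = envelope(a), envelope(b)

# The peak of the full cross-correlation gives the lag (in samples) at
# which the envelopes line up; positive lag means the shared event
# occurs later in stream A than in stream B.
xcorr = correlate(env_a, env_b, mode="full")
lag = int(np.argmax(xcorr)) - (len(env_b) - 1)
print(f"offset of stream A relative to stream B: {lag / rate_a:+.3f} s")
```

The same idea extends across modalities, for instance by correlating the audio spike of a clap with the corresponding acceleration spike recorded by a wearable inertial sensor.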


Notes

  1. More ratings by other experienced Salsa dancers will be provided in the near future.


Acknowledgments

The authors and 3DLife would like to acknowledge the support of Huawei in the creation of this dataset. In addition, warmest thanks go to all the contributors to these capture sessions, especially the dancers (Anne-Sophie K., Anne-Sophie M., Bertrand, Gabi, Gael, Habib, Helene, Jacky, Jean-Marc, Laetitia, Martine, Ming-Li, Remi, Roland and Thomas) and the tech guys (Alfred, Dave, Dominique, Fabrice, Gael, Georgios, Gilbert, Marc, Mounira, Noel, Phil, Radek, Robin, Slim, Qianni, Sophie-Charlotte, Thomas, Xinyu and Yves). This research was partially supported by the European Commission under contract FP7-247688 3DLife.

Author information

Corresponding author

Correspondence to Slim Essid.

About this article

Cite this article

Essid, S., Lin, X., Gowing, M. et al. A multi-modal dance corpus for research into interaction between humans in virtual environments. J Multimodal User Interfaces 7, 157–170 (2013). https://doi.org/10.1007/s12193-012-0109-5

