
Gesture-Timbre Space: Multidimensional Feature Mapping Using Machine Learning and Concatenative Synthesis

  • Conference paper
Perception, Representations, Image, Sound, Music (CMMR 2019)

Abstract

This chapter explores three systems for mapping embodied gesture, acquired with electromyography and motion sensing, to sound synthesis. A pilot study using granular synthesis is presented, followed by studies employing corpus-based concatenative synthesis, where small sound units are organized by derived timbral features. We use interactive machine learning in a mapping-by-demonstration paradigm to create regression models that map high-dimensional gestural data to timbral data without dimensionality reduction, in three distinct workflows. First, static regression directly associates individual sound units with static poses (anchor points). Second, whole regression uses a sound tracing method that leverages intuitive associations between time-varying sound and embodied movement. Third, we extend interactive machine learning with artificial agents and reinforcement learning in an assisted interactive machine learning workflow. We discuss the benefits of organizing the sound corpus using self-organizing maps to address corpus sparseness, and the potential of regression-based mapping at different points in a musical workflow: gesture design, sound design, and mapping design. These systems support expressive performance by creating gesture-timbre spaces that maximize sonic diversity while maintaining coherence, enabling reliable reproduction of target sounds as well as improvisatory exploration of a sonic corpus. They have been made available to the research community and have been used by the authors in concert performance.
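To make the static regression workflow concrete, the sketch below pairs static poses (anchor points) with the timbral descriptors of chosen sound units, trains a regression model over the full high-dimensional gesture feature vector, and selects the nearest corpus unit for playback at performance time. This is a minimal Python sketch under stated assumptions, not the authors' implementation (which is built in Max/MSP with tools such as MuBu); all feature dimensions, layer sizes, and names are illustrative.

```python
# Illustrative sketch of the "static regression" workflow: pose/sound-unit
# anchor points train a regressor from gesture features to timbral descriptors,
# and the nearest corpus unit is selected for concatenative playback.
# Feature layouts and dimensions are assumptions for the example.

import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.neighbors import NearestNeighbors

N_GESTURE_DIMS = 12   # e.g. 8 EMG amplitude channels + 4 IMU orientation values (assumed)
N_TIMBRE_DIMS = 5     # e.g. loudness, pitch, spectral centroid, periodicity, duration (assumed)

# Training data: gesture frames recorded at static poses, paired with the
# timbral descriptors of the sound units chosen as anchor points.
poses = np.random.rand(40, N_GESTURE_DIMS)
targets = np.random.rand(40, N_TIMBRE_DIMS)

# The regressor maps the full gesture feature vector to the timbre space
# without dimensionality reduction, in a mapping-by-demonstration style.
model = MLPRegressor(hidden_layer_sizes=(32,), max_iter=2000, random_state=0)
model.fit(poses, targets)

# Corpus: one descriptor row per sound unit, indexed for nearest-unit lookup.
corpus_descriptors = np.random.rand(500, N_TIMBRE_DIMS)
unit_index = NearestNeighbors(n_neighbors=1).fit(corpus_descriptors)

def gesture_to_unit(gesture_frame: np.ndarray) -> int:
    """Map one incoming gesture frame to the index of the corpus unit to play."""
    timbral_target = model.predict(gesture_frame.reshape(1, -1))
    _, idx = unit_index.kneighbors(timbral_target)
    return int(idx[0, 0])

# At performance time, each new sensor frame selects a unit for playback.
print(gesture_to_unit(np.random.rand(N_GESTURE_DIMS)))
```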


Notes

  1. https://www.cs.waikato.ac.nz/ml/weka/
  2. https://developerblog.myo.com/
  3. https://bitalino.com/
  4. https://cycling74.com/products/max
  5. https://github.com/JulesFrancoise/myo-for-max
  6. GIMLeT – Gestural Interaction Machine Learning Toolkit: https://github.com/federicoVisi/GIMLeT
  7. https://github.com/Ircam-RnD/coexplorer
  8. https://hexler.net/products/touchosc
  9. https://forumnet.ircam.fr/product/mubu-en/
  10. https://cmmr2019.prism.cnrs.fr/programArtistic.html
  11. http://newmusic.org/media/vnm-festival-2019-resonances-atau-tanaka/
  12. https://www.eegsynth.org/?p=2682



Acknowledgments

The research leading to these results has received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (Grant agreement No. 789825).

Author information

Corresponding author

Correspondence to Michael Zbyszyński.


Copyright information

© 2021 Springer Nature Switzerland AG

About this paper


Cite this paper

Zbyszyński, M., Di Donato, B., Visi, F.G., Tanaka, A. (2021). Gesture-Timbre Space: Multidimensional Feature Mapping Using Machine Learning and Concatenative Synthesis. In: Kronland-Martinet, R., Ystad, S., Aramaki, M. (eds.) Perception, Representations, Image, Sound, Music. CMMR 2019. Lecture Notes in Computer Science, vol. 12631. Springer, Cham. https://doi.org/10.1007/978-3-030-70210-6_39


  • DOI: https://doi.org/10.1007/978-3-030-70210-6_39

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-70209-0

  • Online ISBN: 978-3-030-70210-6

