Abstract
In this paper we describe a system that recognizes gestures to control musical processes. We applied a Time Delay Neural Network (TDNN) to match gestures represented as variations of luminance information in video streams. This approach achieved recognition rates of about 90% for 3 different types of hand gestures, and it is presented here as a prototype for a gesture recognition system that is tolerant of ambient conditions and environments. The neural network can be trained to recognize gestures that are difficult to describe in terms of postures or sign language, so the system can adapt to the unique gestures of a performer or to video sequences of arbitrary moving objects. We discuss the outcome of extending the system to successfully learn a set of 17 hand gestures. The application was implemented in jMax to achieve real-time performance and easy integration into a musical environment. We describe the design and training procedure of the network using the Stuttgart Neural Network Simulator (SNNS). The system is intended to integrate into an environment that enables expressive control of musical parameters (KANSEI).
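The pipeline the abstract describes, luminance-variation features extracted from a video stream and fed to a time-delay network, can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: the synthetic video, feature definition, layer sizes, and delay length are all hypothetical, and the weights are random and untrained, so it only shows the data flow (frame differences → sliding time windows → weight-shared hidden layer → class scores).

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic "video": 40 frames of 8x8 grayscale; a bright column sweeps left to right.
frames = np.zeros((40, 8, 8))
for t in range(40):
    frames[t, :, (t * 8) // 40] = 1.0

# Luminance-variation features: mean absolute inter-frame difference per column,
# giving an 8-dimensional motion-energy vector for each frame transition.
diffs = np.abs(np.diff(frames, axis=0))      # shape (39, 8, 8)
features = diffs.mean(axis=1)                # shape (39, 8)

# Time-delay layer: each hidden unit sees a sliding window of `delay` feature
# frames, with weights shared across time (a 1-D convolution over the sequence).
delay, n_hidden = 5, 4
W = rng.standard_normal((n_hidden, delay * features.shape[1])) * 0.1

windows = np.stack([features[t:t + delay].ravel()
                    for t in range(len(features) - delay + 1)])  # (35, 40)
hidden = np.tanh(windows @ W.T)              # (35, n_hidden)

# Integrate hidden activations over time and score the gesture classes
# (3 classes here, matching the recognition experiment in the abstract).
n_classes = 3
V = rng.standard_normal((n_classes, n_hidden)) * 0.1
scores = hidden.mean(axis=0) @ V.T
print("class scores:", scores)
```

In a trained system, `W` and `V` would be learned (the paper uses SNNS for this), and the class with the highest integrated score would be reported as the recognized gesture.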
References
Berthold, M.R.: A Time Delay Radial Basis Function Network for Phoneme Recognition. In: Proceedings of the IEEE International Conference on Neural Networks, Orlando, FL, vol. 7, pp. 4470–4473. IEEE Computer Society Press, Los Alamitos (1994)
Bowden, R.: Learning Nonlinear Models of Shape and Motion. PhD Thesis, Brunel University (1999)
Camurri, A., Trocca, R., Volpe, G.: Interactive Systems Design: A KANSEI-based Approach. In: Proc. NIME 2002, Dublin (May 2002)
de Cecco, M., Dechelle, F.: jMax/FTS Documentation (1999), http://www.ircam.fr
Cutler, R., Turk, M.: View-based Interpretation of Real-time Optical Flow for Gesture Recognition. In: Proceedings of the Third IEEE International Conference on Automatic Face and Gesture Recognition (1998)
Marolt, M.: A Comparison of Feed-forward Neural Network Architectures for Piano Music Transcription. In: Proceedings of the ICMC 1999, ICMA (1999)
MEGA, Multisensory Expressive Gesture Applications, V Framework Programme IST Project No.1999-20410 (2002), http://www.megaproject.org/
Modler, P., Zannos, I.: Emotional Aspects of Gesture Recognition by Neural Networks, using dedicated Input Devices. In: Camurri, A. (ed.) Proc. of KANSEI The Technology of Emotion, AIMI International Workshop, Università di Genova, Genova (1997)
Modler, P.: A General Purpose Open Source Artificial Neural Network Simulator for jMax. IRCAM-Forum, Paris (November 2002)
Myatt, A.: Strategies for Interaction in Construction 3. Organised Sound 7(3), 157–169. CUP, Cambridge, UK (2002)
Palindrome, http://www.palindrome.de/
The RIMM Project, Real-time Interactive Multiple Media Content Generation Using High Performance Computing and Multi-Parametric Human-Computer Interfaces, European Commission 5th Framework programme Information, Societies, Technology (2002), http://www.york.ac.uk/res/rimm/
Rokeby, D.: SoftVNS Motion Tracking System, http://www.interlog.com/~drokeby/softVNS.html
SNNS, Stuttgart Neural Network Simulator, User Manual 4.1, University of Stuttgart, Stuttgart (1995)
Vassilakis, H., Howell, J.A., Buxton, H.I.: Comparison of Feedforward (TDRBF) and Generative (TDRGBN) Network for Gesture Based Control. In: Proceedings of the Int. Gesture Workshop (2001)
Waibel, A., Hanazawa, T., Hinton, G., Shikano, K., Lang, K.J.: Phoneme recognition using time-delay neural networks. IEEE Transactions On Acoustics, Speech, and Signal Processing 37(3), 328–339 (1989)
Wanderley, M.M., Battier, M.: Trends in Gestural Control of Music, CD-ROM, IRCAM, Paris (2000)
Zell, A.: Simulation Neuronaler Netze. Addison-Wesley, Bonn (1994)
Copyright information
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Modler, P., Myatt, T. (2004). A Video System for Recognizing Gestures by Artificial Neural Networks for Expressive Musical Control. In: Camurri, A., Volpe, G. (eds) Gesture-Based Communication in Human-Computer Interaction. GW 2003. Lecture Notes in Computer Science(), vol 2915. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-24598-8_50
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-21072-6
Online ISBN: 978-3-540-24598-8