Abstract
In this paper we describe a system that recognizes gestures to control musical processes. We applied a Time Delay Neural Network (TDNN) to match gestures represented as variations of luminance information in video streams. This approach achieved recognition rates of about 90% for 3 different types of hand gestures, and it is presented here as a prototype for a gesture recognition system that is tolerant of ambient conditions and environments. The neural network can be trained to recognize gestures that are difficult to describe in terms of postures or sign language, so the system can adapt to the unique gestures of a performer or to video sequences of arbitrary moving objects. We discuss the outcome of extending the system to successfully learn a set of 17 hand gestures. The application was implemented in jMax to achieve real-time performance and easy integration into a musical environment. We describe the design and training procedure of the network using the Stuttgart Neural Network Simulator (SNNS). The system is intended to integrate into an environment that enables expressive control of musical parameters (KANSEI).
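The pipeline the abstract describes, luminance-variation features extracted from a video stream and fed to a time-delay network, can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: the synthetic video, feature definition, layer sizes, and delay length are all hypothetical, and the weights are random and untrained, so it only shows the data flow (frame differences → sliding time windows → weight-shared hidden layer → class scores).

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic "video": 40 frames of 8x8 grayscale; a bright column sweeps left to right.
frames = np.zeros((40, 8, 8))
for t in range(40):
    frames[t, :, (t * 8) // 40] = 1.0

# Luminance-variation features: mean absolute inter-frame difference per column,
# giving an 8-dimensional motion-energy vector for each frame transition.
diffs = np.abs(np.diff(frames, axis=0))      # shape (39, 8, 8)
features = diffs.mean(axis=1)                # shape (39, 8)

# Time-delay layer: each hidden unit sees a sliding window of `delay` feature
# frames, with weights shared across time (a 1-D convolution over the sequence).
delay, n_hidden = 5, 4
W = rng.standard_normal((n_hidden, delay * features.shape[1])) * 0.1

windows = np.stack([features[t:t + delay].ravel()
                    for t in range(len(features) - delay + 1)])  # (35, 40)
hidden = np.tanh(windows @ W.T)              # (35, n_hidden)

# Integrate hidden activations over time and score the gesture classes
# (3 classes here, matching the recognition experiment in the abstract).
n_classes = 3
V = rng.standard_normal((n_classes, n_hidden)) * 0.1
scores = hidden.mean(axis=0) @ V.T
print("class scores:", scores)
```

In a trained system, `W` and `V` would be learned (the paper uses SNNS for this), and the class with the highest integrated score would be reported as the recognized gesture.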
References
Berthold, M.R.: A Time Delay Radial Basis Function Network for Phoneme Recognition. In: Proceedings of the IEEE International Conference on Neural Networks, Orlando, FL, vol. 7, pp. 4470–4473. IEEE Computer Society Press, Los Alamitos (1994)
Bowden, R.: Learning Nonlinear Models of Shape and Motion. PhD Thesis, Brunel University (1999)
Camurri, A., Trocca, R., Volpe, G.: Interactive Systems Design: A KANSEI-based Approach. In: Proc. NIME 2002, Dublin (May 2002)
de Cecco, M., Dechelle, F.: jMax/FTS Documentation (1999), http://www.ircam.fr
Cutler, R., Turk, M.: View-based Interpretation of Real-time Optical Flow for Gesture Recognition. In: Proceedings of the Third IEEE International Conference on Automatic Face and Gesture Recognition (1998)
Marolt, M.: A Comparison of Feed-forward Neural Network Architectures for Piano Music Transcription. In: Proceedings of the ICMC 1999, ICMA (1999)
MEGA, Multisensory Expressive Gesture Applications, V Framework Programme IST Project No.1999-20410 (2002), http://www.megaproject.org/
Modler, P., Zannos, I.: Emotional Aspects of Gesture Recognition by Neural Networks, using dedicated Input Devices. In: Camurri, A. (ed.) Proc. of KANSEI The Technology of Emotion, AIMI International Workshop, Università di Genova, Genova (1997)
Modler, P.: A General Purpose Open Source Artificial Neural Network Simulator for jMax. IRCAM-Forum, Paris (November 2002)
Myatt, A.: Strategies for Interaction in Construction 3. Organised Sound 7(3), 157–169. CUP, Cambridge, UK (2002)
Palindrome, http://www.palindrome.de/
The RIMM Project, Real-time Interactive Multiple Media Content Generation Using High Performance Computing and Multi-Parametric Human-Computer Interfaces, European Commission 5th Framework programme Information, Societies, Technology (2002), http://www.york.ac.uk/res/rimm/
Rokeby, D.: SoftVNS Motion Tracking System, http://www.interlog.com/~drokeby/softVNS.html
SNNS, Stuttgart Neural Network Simulator, User Manual 4.1, University of Stuttgart, Stuttgart (1995)
Vassilakis, H., Howell, J.A., Buxton, H.I.: Comparison of Feedforward (TDRBF) and Generative (TDRGBN) Network for Gesture Based Control. In: Proceedings of the Int. Gesture Workshop (2001)
Waibel, A., Hanazawa, T., Hinton, G., Shikano, K., Lang, K.J.: Phoneme recognition using time-delay neural networks. IEEE Transactions On Acoustics, Speech, and Signal Processing 37(3), 328–339 (1989)
Wanderley, M.M., Battier, M.: Trends in Gestural Control of Music, CD-ROM, IRCAM, Paris (2000)
Zell, A.: Simulation Neuronaler Netze. Addison-Wesley, Bonn (1994)
Copyright information
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Modler, P., Myatt, T. (2004). A Video System for Recognizing Gestures by Artificial Neural Networks for Expressive Musical Control. In: Camurri, A., Volpe, G. (eds) Gesture-Based Communication in Human-Computer Interaction. GW 2003. Lecture Notes in Computer Science(), vol 2915. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-24598-8_50
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-21072-6
Online ISBN: 978-3-540-24598-8