
A Video System for Recognizing Gestures by Artificial Neural Networks for Expressive Musical Control

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 2915)

Abstract

In this paper we describe a system that recognizes gestures to control musical processes. We applied a Time Delay Neural Network to match gestures represented as variations of luminance information in video streams. This yielded recognition rates of about 90% for three different types of hand gestures, and the system is presented here as a prototype for a gesture recognition system that is tolerant of ambient conditions and environments. The neural network can be trained to recognize gestures that are difficult to describe as postures or sign language, so it can adapt to the unique gestures of a performer or to video sequences of arbitrary moving objects. We discuss the outcome of extending the system to successfully learn a set of 17 hand gestures. The application was implemented in jMax to achieve real-time operation and easy integration into a musical environment. We describe the design and the learning procedure using the Stuttgart Neural Network Simulator. The system is intended to form part of an environment that enables expressive control of musical parameters (KANSEI).
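The pipeline the abstract outlines can be sketched in a few lines: compute a per-frame luminance-variation feature from the video stream, then slide a shared-weight time-delay window over that feature sequence and classify the gesture from the pooled responses. This is a minimal illustrative sketch, not the authors' implementation; all names, layer sizes, and the pooling step are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def luminance_variation(frames):
    """Per-frame feature: mean absolute luminance change between
    consecutive frames (illustrative stand-in for the paper's input)."""
    frames = np.asarray(frames, dtype=float)
    return np.abs(np.diff(frames, axis=0)).mean(axis=(1, 2))

def tdnn_forward(seq, w_time, w_out):
    """Minimal TDNN: one time-delay (shared-window) layer,
    mean pooling over time, then a softmax over gesture classes."""
    delay = w_time.shape[0]                       # width of the time window
    hidden = np.array([np.tanh(seq[t:t + delay] @ w_time)
                       for t in range(len(seq) - delay + 1)])
    pooled = hidden.mean(axis=0)                  # average response over time
    logits = pooled @ w_out
    z = np.exp(logits - logits.max())             # numerically stable softmax
    return z / z.sum()

# Toy video: 12 frames of 8x8 "luminance" values.
video = rng.random((12, 8, 8))
features = luminance_variation(video)             # shape (11,)

# Three gesture classes, as in the recognition experiment reported above;
# delay and hidden size are arbitrary here.
delay, n_hidden, n_classes = 4, 5, 3
w_time = rng.normal(size=(delay, n_hidden))
w_out = rng.normal(size=(n_hidden, n_classes))

probs = tdnn_forward(features, w_time, w_out)     # class probabilities
```

Because the window weights are shared across time steps, the classifier responds to the gesture's temporal pattern of luminance change rather than to its absolute position in the clip, which is the property that makes TDNNs attractive for gesture and phoneme recognition.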





Copyright information

© 2004 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Modler, P., Myatt, T. (2004). A Video System for Recognizing Gestures by Artificial Neural Networks for Expressive Musical Control. In: Camurri, A., Volpe, G. (eds.) Gesture-Based Communication in Human-Computer Interaction. GW 2003. Lecture Notes in Computer Science, vol. 2915. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-24598-8_50


  • DOI: https://doi.org/10.1007/978-3-540-24598-8_50

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-21072-6

  • Online ISBN: 978-3-540-24598-8
