Conducting Audio Files via Computer Vision

  • Conference paper
Gesture-Based Communication in Human-Computer Interaction (GW 2003)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 2915)

Abstract

This paper presents a system for controlling the playback of audio files by means of the standard classical conducting technique. Computer vision techniques are developed to track a conductor’s baton, and the gesture is subsequently analysed. Audio parameters are extracted from the sound-file and are further processed for audio beat tracking. The sound-file playback speed is adjusted in order to bring the audio beat points into alignment with the gesture beat points. The complete system provides all the parts necessary to simulate an orchestra reacting to a conductor’s baton.
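
As a rough illustration of the alignment step described in the abstract, the sketch below picks a playback rate so that the next audio beat is reached at the moment the next gesture beat is expected. It is not the authors' algorithm: the linear prediction of the next gesture beat, the clamping limits, and every name used (predict_next_gesture_beat, playback_rate, min_rate, max_rate) are assumptions made only for this example.

```python
def predict_next_gesture_beat(gesture_beats):
    """Extrapolate the next gesture beat time (seconds) from the last two beats.

    Assumption for this sketch: the conductor keeps the most recent
    inter-beat interval for one more beat.
    """
    if len(gesture_beats) < 2:
        raise ValueError("need at least two gesture beats to extrapolate")
    last_interval = gesture_beats[-1] - gesture_beats[-2]
    return gesture_beats[-1] + last_interval


def playback_rate(audio_pos, next_audio_beat, now, next_gesture_beat,
                  min_rate=0.5, max_rate=2.0):
    """Return a playback rate (1.0 = original speed).

    audio_pos and next_audio_beat are positions in the sound file (seconds);
    now and next_gesture_beat are wall-clock times (seconds).
    The rate is chosen so that the audio arrives at next_audio_beat at
    wall-clock time next_gesture_beat, clamped to [min_rate, max_rate].
    """
    audio_remaining = next_audio_beat - audio_pos
    if audio_remaining <= 0:
        raise ValueError("next_audio_beat must lie ahead of audio_pos")
    time_remaining = next_gesture_beat - now
    if time_remaining <= 0:
        # The gesture beat is already due: catch up as fast as allowed.
        return max_rate
    rate = audio_remaining / time_remaining
    return max(min_rate, min(max_rate, rate))


if __name__ == "__main__":
    beats = [0.0, 0.5]                         # hypothetical gesture beat times
    target = predict_next_gesture_beat(beats)  # expected next beat at 1.0 s
    print(playback_rate(audio_pos=10.0, next_audio_beat=10.75,
                        now=0.5, next_gesture_beat=target))
```

Clamping the rate keeps a single noisy gesture detection from producing an extreme speed change. A real system would probably also smooth the rate over time.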

Copyright information

© 2004 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Murphy, D., Andersen, T.H., Jensen, K. (2004). Conducting Audio Files via Computer Vision. In: Camurri, A., Volpe, G. (eds) Gesture-Based Communication in Human-Computer Interaction. GW 2003. Lecture Notes in Computer Science, vol 2915. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-24598-8_49

  • DOI: https://doi.org/10.1007/978-3-540-24598-8_49

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-21072-6

  • Online ISBN: 978-3-540-24598-8

  • eBook Packages: Springer Book Archive
