Interactive Music with Active Audio CDs

  • Sylvain Marchand
  • Boris Mansencal
  • Laurent Girin
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6684)

Abstract

With a standard compact disc (CD) audio player, the user can only listen passively to the recorded track: interaction is limited to changing the global volume or the track. Imagine now that the listener can become a musician, playing with the sound sources present in the stereo mix and changing their respective volumes and locations in space. For example, a given instrument or voice can be muted, amplified, or, more generally, moved within the acoustic space. This amounts to a kind of generalized karaoke, useful for disc jockeys and also for music pedagogy (when practicing an instrument). Our system makes this possible, with active CDs that remain fully backward compatible while enabling interactive music. The key idea is that “the music is in the sound”: the structure of the mix is embedded in the sound signal itself, using audio watermarking techniques, and the player exploits this embedded information to separate the sources (patent pending), which are then fed to a spatializer.
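To make the interaction concrete, here is a minimal sketch of the final remixing step only, assuming the sources have already been separated into mono signals. It is not the authors' spatializer: constant-power amplitude panning stands in for the spatialization, and the names `pan_gains` and `remix`, as well as the sample values, are hypothetical.

```python
import math


def pan_gains(position):
    """Constant-power stereo gains for a position in [-1 (left), +1 (right)].

    Assumption: simple amplitude panning, used here as a stand-in for a
    real spatializer.
    """
    angle = (position + 1.0) * math.pi / 4.0  # map [-1, 1] to [0, pi/2]
    return math.cos(angle), math.sin(angle)


def remix(sources, volumes, positions):
    """Mix mono sources (lists of samples) into a (left, right) pair of lists."""
    n = len(sources[0])
    left = [0.0] * n
    right = [0.0] * n
    for samples, volume, position in zip(sources, volumes, positions):
        g_left, g_right = pan_gains(position)
        for i, x in enumerate(samples):
            left[i] += volume * g_left * x
            right[i] += volume * g_right * x
    return left, right


# Hypothetical separated sources (a few samples each).
voice = [0.5, -0.5, 0.25]
guitar = [0.1, 0.2, 0.3]

# Mute the voice (karaoke-style) and move the guitar fully to the left.
left, right = remix([voice, guitar], volumes=[0.0, 1.0], positions=[0.0, -1.0])
```

With these settings the left channel carries only the guitar and the right channel is silent, which corresponds to the "muted" and "moved" cases described in the abstract.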

Keywords

interactive music, compact disc, audio watermarking, source separation, sound spatialization



Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Sylvain Marchand (1)
  • Boris Mansencal (1)
  • Laurent Girin (2)

  1. LaBRI – CNRS, University of Bordeaux, France
  2. GIPSA-lab – CNRS, Grenoble Institute of Technology, France
