Multichannel Audio Coding for Multimedia Services in Intelligent Environments

Part of the book series: Studies in Computational Intelligence (SCI, volume 120)

Summary

Audio is an integral component of multimedia services in intelligent environments. Using multiple channels for audio capture and rendering makes it possible to recreate arbitrary acoustic environments and immerse the listener in the acoustic scene. On the other hand, multichannel audio carries a large amount of data, which is demanding to transmit, especially for real-time applications. For this reason, a variety of compression methods have been developed for multichannel audio content. In this chapter, we first describe the currently popular methods for multichannel audio compression. Low-bitrate encoding methods for multichannel audio have recently started to attract interest, mostly aimed at extending MP3 audio coding to multichannel recordings, and these methods are also examined here. Synthesizing a truly immersive intelligent audio environment requires interactivity between the user(s) and the audio environment. Towards this goal, we present recently proposed multichannel-audio-specific models, namely the source/filter and the sinusoidal models, which allow for flexible manipulation and high-quality low-bitrate encoding, tailored for applications such as remote mixing and distributed immersive performances.

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Mouchtaris, A., Tsakalides, P. (2008). Multichannel Audio Coding for Multimedia Services in Intelligent Environments. In: Tsihrintzis, G.A., Jain, L.C. (eds) Multimedia Services in Intelligent Environments. Studies in Computational Intelligence, vol 120. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-78502-6_5

  • DOI: https://doi.org/10.1007/978-3-540-78502-6_5

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-78491-3

  • Online ISBN: 978-3-540-78502-6

  • eBook Packages: Engineering, Engineering (R0)
