Summary
Audio is an integral component of multimedia services in intelligent environments. Capturing and rendering audio over multiple channels makes it possible to recreate arbitrary acoustic environments and immerse the listener in the acoustic scene. However, multichannel audio carries a large amount of information, which makes it demanding to transmit, especially in real-time applications. For this reason, a variety of compression methods have been developed for multichannel audio content. In this chapter, we first describe the currently popular methods for multichannel audio compression. Low-bitrate encoding methods for multichannel audio have also recently begun to attract interest, mostly with the aim of extending MP3 audio coding to multichannel recordings, and these methods are examined here as well. Synthesizing a truly immersive intelligent audio environment requires interactivity between the user(s) and the audio environment. Towards this goal, we present recently proposed models specific to multichannel audio, namely the source/filter and the sinusoidal models, which allow flexible manipulation and high-quality low-bitrate encoding and are tailored to applications such as remote mixing and distributed immersive performances.
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Mouchtaris, A., Tsakalides, P. (2008). Multichannel Audio Coding for Multimedia Services in Intelligent Environments. In: Tsihrintzis, G.A., Jain, L.C. (eds) Multimedia Services in Intelligent Environments. Studies in Computational Intelligence, vol 120. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-78502-6_5
DOI: https://doi.org/10.1007/978-3-540-78502-6_5
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-78491-3
Online ISBN: 978-3-540-78502-6
eBook Packages: Engineering, Engineering (R0)