Audio Coding Standards, (Proprietary) Audio Compression Algorithms, and Broadcasting/Speech/Data Communication Codecs: Overview of Adopted Filter Banks

Britanak, Vladimir; Rao, K. R.

doi:10.1007/978-3-319-61080-1_2

Vladimir Britanak &
K. R. Rao

Chapter
First Online: 04 August 2017

Abstract

Audio coding (audio compression) algorithms produce a compact digital representation of high-quality audio signals for efficient transmission and storage. The central objective is to represent the signal with a minimum number of bits while achieving transparent reproduction. Alongside speech coding schemes based on linear prediction, which are tailored to efficient speech compression, perceptual transform-based audio coding schemes have gained greater attention, particularly for applications in consumer electronics. Typically, a transform-based audio coding scheme uses a near-perfect reconstruction quadrature mirror filter (QMF) bank and/or a perfect reconstruction cosine-modulated filter bank to obtain a block-wise representation of the audio signal in the frequency domain. This chapter briefly reviews the perceptual transform-based audio coding schemes developed to date, including the family of ISO/IEC MPEG audio coding standards, proprietary audio compression algorithms, broadcasting/speech/data communication codecs, and open-source, royalty-free audio/speech codecs. The discussion concentrates on the adopted near-perfect reconstruction QMF banks and perfect reconstruction cosine-modulated filter banks, the processing methods, and the specified transform block sizes.
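As a rough illustration of what "perfect reconstruction" means for a cosine-modulated filter bank, the sketch below uses the MDCT with a sine window and 50% overlap-add (time domain aliasing cancellation, in the sense of the Princen and Bradley references listed below). It is a minimal sketch under stated assumptions, not the filter bank of any particular codec: the function names `sine_window`, `mdct`, `imdct`, and `analysis_synthesis`, the block size, and the direct O(N²) summation are all chosen here purely for illustration.

```python
import math


def sine_window(n_half):
    """Sine window of length 2N; it satisfies w[n]^2 + w[n + N]^2 = 1."""
    size = 2 * n_half
    return [math.sin(math.pi * (n + 0.5) / size) for n in range(size)]


def mdct(frame, window):
    """Forward MDCT: 2N windowed time samples -> N coefficients."""
    n_half = len(frame) // 2
    shift = (n_half + 1) / 2.0
    return [
        sum(window[n] * frame[n]
            * math.cos(math.pi / n_half * (n + shift) * (k + 0.5))
            for n in range(2 * n_half))
        for k in range(n_half)
    ]


def imdct(coeffs, window):
    """Inverse MDCT: N coefficients -> 2N windowed, time-aliased samples."""
    n_half = len(coeffs)
    shift = (n_half + 1) / 2.0
    scale = 2.0 / n_half  # normalization for windowed analysis and synthesis
    return [
        window[n] * scale
        * sum(coeffs[k] * math.cos(math.pi / n_half * (n + shift) * (k + 0.5))
              for k in range(n_half))
        for n in range(2 * n_half)
    ]


def analysis_synthesis(signal, n_half):
    """Run MDCT analysis, then IMDCT synthesis with 50% overlap-add."""
    if len(signal) % n_half != 0:
        raise ValueError("signal length must be a multiple of n_half")
    window = sine_window(n_half)
    # Zero-pad one half-block on each side so every sample lies in two frames.
    padded = [0.0] * n_half + list(signal) + [0.0] * n_half
    output = [0.0] * len(padded)
    all_coeffs = []
    for start in range(0, len(padded) - 2 * n_half + 1, n_half):
        coeffs = mdct(padded[start:start + 2 * n_half], window)
        all_coeffs.append(coeffs)
        # Overlap-add cancels the time-domain aliasing of adjacent frames.
        for n, value in enumerate(imdct(coeffs, window)):
            output[start + n] += value
    return output[n_half:n_half + len(signal)], all_coeffs


if __name__ == "__main__":
    test_signal = [math.sin(0.3 * n) for n in range(32)]
    rebuilt, _ = analysis_synthesis(test_signal, 8)
    # Largest reconstruction error; expected to be at rounding-error level.
    print(max(abs(a - b) for a, b in zip(test_signal, rebuilt)))
```

Each frame of 2N input samples yields only N coefficients, so the representation stays critically sampled apart from the padding at the edges, and the signal is still recovered once neighbouring frames are overlap-added. A practical codec would replace the direct sums with a fast algorithm and quantize the coefficients between analysis and synthesis.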

References

M. Bosi, R.E. Goldberg, Introduction to Digital Audio Coding and Standards, Part II: Audio Coding Standards (Springer Science+Business Media, New York, 2003), pp. 265–430
V.K. Madisetti (ed.), The Digital Signal Processing Handbook: Video, Speech, and Audio Signal Processing and Associated Standards, 2nd edn. (CRC, Boca Raton, FL, 2010)
H.S. Malvar, Extended lapped transforms: properties, applications, and fast algorithms. IEEE Trans. Signal Process. 40(11), 2703–2714 (1992)
H.S. Malvar, Signal Processing with Lapped Transforms (Artech House, Norwood, MA, 1992)
H. Malvar, A modulated complex lapped transform and its applications to audio processing, in Proceedings of the IEEE ICASSP’99, Phoenix, AZ, May 1999, pp. 1421–1424
T. Painter, A. Spanias, Perceptual coding of digital audio. Proc. IEEE 88(4), 451–513 (2000)
J.P. Princen, A.B. Bradley, Analysis/synthesis filter bank design based on time domain aliasing cancellation. IEEE Trans. Acoust. Speech Signal Process. ASSP-34(5), 1153–1161 (1986)
J.P. Princen, A.W. Johnson, A.B. Bradley, Sub-band/transform coding using filter bank designs based on time domain aliasing cancellation, in Proceedings of IEEE ICASSP’87, Dallas, TX, April 1987, pp. 2161–2164
K.R. Rao, J.J. Hwang, MPEG-1 audiovisual coder for digital storage media (Chapter 10), in Techniques and Standards for Image, Video, Audio Coding (Prentice-Hall, Upper Saddle River, NJ, 1996), pp. 242–265
M. Schnell et al., Low delay filter banks for enhanced low delay audio coding, in Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, New Paltz, NY, October 2007, pp. 235–238
A. Spanias, T. Painter, V. Atti, Audio coding standards and algorithms (Chapter 10), in Audio Signal Processing and Coding (Wiley-Interscience, Hoboken, NJ, 2007), pp. 263–342

MPEG-1/2 Audio Coding Standards

Information Technology – Coding of Moving Pictures and Associated Audio for Digital Storage Media at up to about 1.5 Mbit/s. Part 3: Audio, ISO/IEC JTC1/SC29/WG11 MPEG, International Standard 11172-3 (MPEG-1) (1992)
Information Technology – Generic Coding of Moving Pictures and Associated Audio, Part 3: Audio, ISO/IEC JTC1/SC29/WG11 MPEG, International Standard 13818-3 (MPEG-2) (1994)

MPEG-2/4 AAC Audio Coding Standards

M. Bosi et al., ISO/IEC MPEG-2 advanced audio coding, in 101st AES Convention, Los Angeles, CA, November 1996. Preprint #4382. Also published in J. Audio Eng. Soc. 45(10), 789–813 (1997)
Information Technology – Generic Coding of Moving Pictures and Associated Audio Information, Subpart 7: Advanced Audio Coding (AAC), ISO/IEC JTC1/SC29/WG11 MPEG, International Standard 13818-7 (MPEG-2 AAC) (1997)
Information Technology – Coding of Audio-Visual Objects, Part 3: Audio, ISO/IEC JTC1/SC29/WG11 MPEG, International Standard 14496-3 (MPEG-4 Audio) (1999)

MPEG-4 AAC-LD Audio Coding Standard

E. Allamanche, R. Geiger, J. Herre, T. Sporer, MPEG-4 low delay audio coding based on the AAC codec, in 106th AES Convention, Munich, May 1999. Preprint #4929
M. Lutzky, G. Schuller, M. Gayer, U. Krämer, S. Wabnik, A guideline to codec delay, in 116th AES Convention, Berlin, May 2004. Preprint #6062
M. Lutzky, M. Schnell, M. Schmidt, R. Geiger, Structural analysis of low latency audio coding schemes, in 119th AES Convention, New York, NY, October 2005. Preprint #6601

MPEG-4 HE-AAC Audio Coding Standard

A.C. den Brinker et al., An overview of the coding standard MPEG-4 audio Amendments 1 and 2: HE-AAC, SSC and HE-AAC v2. EURASIP J. Audio Speech Music Process. Article ID 468971, 21 (2009)
J. Herre, M. Dietz, MPEG-4 High-Efficiency AAC coding. IEEE Signal Process. Mag. 25(3), 137–142 (2008)
Information Technology – Coding of Audio-Visual Objects – Part 3: Audio, Subpart 4: General Audio Coding (GA)-AAC, TwinVQ, BSAC. ISO/IEC 14496-3:2005(E) (2005)
M. Wolters, K. Kjörling, D. Homm, H. Purnhagen, A closer look into MPEG-4 High Efficiency AAC, in 115th AES Convention, New York, NY, October 2003. Preprint #5871

MPEG-4 AAC-ELD Audio Coding Standard

Information Technology – Coding of Audio-Visual Objects – Part 3: Audio, Amendment 9: Enhanced Low Delay AAC. ISO/IEC 14496-3:2005/FDAM 9:2007(E), N9499, Shenzhen, October 2007
M. Lutzky, M.L. Valero, M. Schnell, J. Hilpert, AAC-ELD v2 – The new state of the art in high quality communication audio coding, in 131st AES Convention, New York, NY, October 2011. Preprint #8516
M. Schnell et al., Enhanced MPEG-4 low delay AAC – Low bitrate high quality communication, in 122nd AES Convention, Vienna, May 2007. Preprint #6998
M. Schnell et al., MPEG-4 enhanced low delay AAC – A new standard for high quality communication, in 125th AES Convention, San Francisco, CA, October 2008. Preprint #7503

MPEG-4 SLS and HD-AAC/SLS Scalable Lossless Audio Coding Standards

R. Geiger, G. Schuller, J. Herre, R. Sperschneider, T. Sporer, Scalable perceptual and lossless audio coding based on MPEG-4 AAC, in 115th AES Convention, New York, NY, October 2003. Preprint #5868
R. Geiger, R. Yu, J. Herre, S. Rahardja, S.-W. Kim, X. Lin, M. Schmidt, ISO/IEC MPEG-4 high-definition scalable advanced audio coding. J. Audio Eng. Soc. 55(1/2), 27–43 (2007)
ISO/IEC 14496-3:2005/Amd.3:2006, Coding of Audio-Visual Objects – Part 3: Audio, Amendment 3: Scalable Lossless Coding (SLS). International Standards Organization, Geneva (2006)
R. Yu, R. Geiger, S. Rahardja, J. Herre, X. Lin, H. Huang, MPEG-4 scalable to lossless audio coding, in 117th AES Convention, San Francisco, CA, October 2004. Preprint #6183
R. Yu, S. Rahardja, X. Lin, C.C. Ko, A fine granular scalable to lossless audio coding. IEEE Trans. Audio Speech Lang. Process. 14(4), 1352–1363 (2006)

MPEG-D USAC: Unified Speech and Audio Coding

B. Edler, S. Disch, S. Bayer, G. Guillaume, R. Geiger, A time-warped MDCT approach to speech transform coding, in 126th AES Convention, Munich, May 2009. Preprint #7710
C.R. Helmrich et al., Efficient transform coding of two-channel audio signals by means of complex-valued stereo prediction, in Proceedings of the IEEE ICASSP’2011, Prague, May 2011, pp. 497–500
A. Heuberger, G. Elst, R. Hanke (eds.), MPEG unified speech and audio coding – Bridging the gap, in Microelectronic Systems: Circuits, Systems and Applications (Springer, Berlin, 2011), pp. 343–353
ISO/IEC 23003-3:2012, MPEG audio technologies, Part 3: Unified Speech and Audio Coding, Geneva, January 2012
K. Kikuiri, N. Naka, MPEG Unified speech and audio coding enabling efficient coding of both speech and music. NTT DOCOMO Tech. J. 13(3), 17–22 (2011)
M. Neuendorf et al., A novel scheme for low bit rate Unified Speech and Audio Coding – MPEG RM0, in 126th AES Convention, Munich, May 2009. Preprint #7713
M. Neuendorf et al., Unified speech and audio coding scheme for high quality at low bitrates, in Proceedings of the IEEE ICASSP’2009, Taipei, April 2009, pp. 1–4
M. Neuendorf et al., The ISO/MPEG Unified Speech and Audio Coding standard – Consistent high quality for all content types and at all bit rates, in 132nd AES Convention, Budapest, April 2012. Preprint #8654. Also published in J. Audio Eng. Soc. 61(12), 956–977 (2013)
S. Quackenbush, MPEG unified speech and audio coding. IEEE MultiMedia 20(2), 72–78 (2013)

Proprietary Audio Compression Algorithms

M. Bosi, G.A. Davidson, High-quality, low-rate audio transform coding for transmission and multimedia applications, in 93rd AES Convention, San Francisco, CA, December 1992. Preprint #3365
G.A. Davidson, L.D. Fielder, M. Antill, Low-complexity transform coder for satellite link applications, in 89th AES Convention, New York, NY, September 1990. Preprint #2966
G.A. Davidson, M.A. Isnardi, L.D. Fielder, M.S. Goldman, C.C. Todd, ATSC video and audio coding. Proc. IEEE 94(1), 60–76 (2006)
Digital Audio Compression (AC-3) ATSC Standard, Document A/52/10 of Advanced Television Systems Committee (ATSC), Audio Specialist Group T3/S7, Washington, DC, December 1995
Digital Audio Compression Standard (AC-3, E-AC-3), Revision B, Document A/52B of Advanced Television Systems Committee (ATSC), Washington DC, December 2012
L.D. Fielder, G.A. Davidson, AC-2: a family of low complexity transform-based music coders, in Proceedings of the 10th International AES Conference: Images of Audio, London, September 1991, pp. 55–70
L.D. Fielder, D.P. Robinson, AC-2 and AC-3: the technology and its applications, in 5th Australian Regional Convention, Sydney, April 1995. Preprint #4022
L.D. Fielder et al., Introduction to Dolby Digital Plus, an enhancement to the Dolby digital coding system, in 117th AES Convention, San Francisco, CA, October 2004. Preprint #6196
J.D. Johnston, A.J. Ferreira, Sum-difference stereo transform coding, in Proceedings of the IEEE ICASSP’92, vol. II, San Francisco, CA, March 1992, pp. 569–572
J. Johnston et al., AT&T perceptual audio coder (PAC), in Collected Papers on Digital Audio Bit-Rate Reduction, ed. by N. Gilchrist, C. Grewin (Audio Engineering Society, New York, 1996), pp. 73–81
D. Sinha, J.D. Johnston, Audio compression at low bit rates using a signal adaptive switched filterbank, in Proceedings of the IEEE ICASSP’96, Atlanta, GA, May 1996, pp. 1053–1056
K. Tsutsui et al., ATRAC: adaptive transform acoustics coding for MiniDisc, in 93rd AES Convention, San Francisco, CA, October 1992. Preprint #3456
T. Yoshida, The rewritable MiniDisc system. Proc. IEEE 82(10), 1492–1500 (1994)

Broadcasting/Speech/Data Communication Codecs

3GPP2 C.S0014-C v1.0, Enhanced variable rate codec, speech service Options 3, 68 and 70 for wide-band spread spectrum digital systems (2007)
M. Bellanger, D. Matera, M. Tanda, A filter bank multicarrier scheme running at symbol rate for future wireless systems, in Proceedings of the IEEE Wireless Telecommunications Symposium (WTS’2015), New York, NY, April 2015, pp. 1–5
M. Bellanger, D. Matera, M. Tanda, Lapped-OFDM as an alternative to CP-OFDM for 5G asynchronous access and cognitive radio, in Proceedings of the IEEE 81st Vehicular Technology Conference (VTC Spring), Glasgow, May 2015, pp. 1–5
Digital Radio Mondiale (DRM): System Specification, ETSI ES 201 980 v3.1.1 (2009–08), ETSI Standard, August 2009 (available on web site http://www.drm.org)
W. Hoeg, T. Lauterbach (eds.), Audio services and applications (Chapter 3), in Digital Audio Broadcasting: Principles and Applications of DAB, DAB+ and DMB, 3rd edn. (Wiley, Chichester, 2009), pp. 93–165
ITU-T Recommendation G.722.1 Annex C, Low-complexity coding at 24 and 32 kbit/s for hands-free operation in systems with low frame loss. Annex C: 14 kHz Mode at 24, 32, and 48 kbit/s, May 2005
ITU-T SG16 Q9 – Contribution 199: extended high-level description of the Q9 EV-VBR baseline codec (2007)
L. Laaksonen et al., Super wide-band extension of G.718 and G.729.1 speech codecs, in Proceedings of 11th Annual Conference of the International Speech Communication Association, INTERSPEECH 2010, Makuhari, September 2010
J. Mäkinen et al., AMR-WB+: a new audio coding standard for 3rd generation mobile audio services, in Proceedings of the IEEE ICASSP’2005, vol. II, Philadelphia, PA, March 2005, pp. 1109–1112
S. Ragot et al., ITU-T G.729.1: an 8–32 kbit/s scalable coder interoperable with G.729 for wideband telephony and voice over IP, in Proceedings of the IEEE ICASSP’2007, Honolulu, HI, April 2007, pp. 529–532
R. Salami et al., Extended AMR-WB for high-quality audio on mobile devices. IEEE Commun. Mag. 44(5), 90–97 (2006)
Sirius Satellite Radio, Available on web site: http://www.siriusradio.com
T. Vaillancourt et al., ITU-T EV-VBR: a robust 8–32 kbit/s scalable coder for error prone telecommunication channels, in Proceedings of the 16th European Signal Processing Conference, Lausanne, August 2008
M. Xie, D. Lindbergh, P. Chu, From ITU-T G.722.1 to ITU-T G.722.1 Annex C: a new low-complexity 14 kHz bandwidth audio coding standard, in Proceedings of the IEEE ICASSP’2006, vol. 5, Toulouse, May 2006, pp. 173–176. Also published in J. Multimedia 2(2), 65–76 (2007)
M. Xie, P. Chu, A. Taleb, M. Briand, ITU-T G.719: a new low-complexity full-band (20 kHz) audio coding standard for high quality conversational applications, in Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA’2009), New Paltz, NY, October 2009, pp. 265–268
XM Satellite Radio, Available on web site: http://www.xmradio.com

Open-Source and Royalty-Free Audio/Speech Codecs

OPUS interactive audio/speech codec, 2016. Available on web sites: www.vorbis.com or www.opus-codec.org
The CELT ultra-low delay audio codec, February 2011. Available on web sites: www.vorbis.com or www.celt-codec.org
J.-M. Valin, T.B. Terriberry, G. Maxwell, A full-bandwidth audio codec with low complexity and very low delay, in Proceedings of the 17th European Signal Processing Conference (EUSIPCO’2009), Glasgow, August 2009, pp. 1254–1258
J.-M. Valin, K. Vos, T.B. Terriberry, Definition of the OPUS audio codec, Internet Engineering Task Force (IETF). RFC 6716 Standard Specification, September 2012. Available on web site: www.vorbis.com
J.-M. Valin, T.B. Terriberry, C. Montgomery, G. Maxwell, A high-quality speech and audio codec with less than 10 ms delay. IEEE Trans. Audio Speech Lang. Process. 18(1), 58–67 (2010)
J.-M. Valin, G. Maxwell, T.B. Terriberry, C. Montgomery, K. Vos, High-quality, low-delay music coding in the Opus codec, in 135th AES Convention, New York, NY, October 2013. Preprint #8942
Vorbis I specification, Xiph.Org Foundation (2015). Available on web site: www.vorbis.com
K. Wright, Notes on Ogg Vorbis and the MDCT, Draft document available on web site: www.free-comp-shop.com/vorbis.html (2003), 7 pp.

Author information

Authors and Affiliations

Institute of Informatics, Slovak Academy of Sciences, Bratislava, Slovakia
Vladimir Britanak
The University of Texas at Arlington, Arlington, TX, USA
K. R. Rao

About this chapter

Cite this chapter

Britanak, V., Rao, K.R. (2018). Audio Coding Standards, (Proprietary) Audio Compression Algorithms, and Broadcasting/Speech/Data Communication Codecs: Overview of Adopted Filter Banks. In: Cosine-/Sine-Modulated Filter Banks. Springer, Cham. https://doi.org/10.1007/978-3-319-61080-1_2

DOI: https://doi.org/10.1007/978-3-319-61080-1_2
Published: 04 August 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-61078-8
Online ISBN: 978-3-319-61080-1
eBook Packages: Engineering, Engineering (R0)
