Audio Coding Standards, (Proprietary) Audio Compression Algorithms, and Broadcasting/Speech/Data Communication Codecs: Overview of Adopted Filter Banks

Abstract

In general, audio coding (audio compression) algorithms are used to obtain a compact digital representation of high-quality audio signals for efficient transmission and storage. The central objective in audio coding is to represent the signal with a minimum number of bits while still achieving transparent reproduction, that is, output that is perceptually indistinguishable from the original. Besides speech coding schemes based on linear prediction, which are tailored specifically for efficient speech compression, perceptual transform-based audio coding schemes have gained greater attention, particularly for applications in consumer electronics. Typically, a transform-based audio coding scheme uses a near-perfect reconstruction quadrature mirror filter (QMF) bank and/or a perfect reconstruction cosine-modulated filter bank to obtain a block-wise representation of the audio signal in the frequency domain. This chapter briefly reviews the perceptual transform-based audio coding schemes developed to date, including the family of ISO/IEC MPEG audio coding standards, proprietary audio compression algorithms, broadcasting/speech/data communication codecs, and open-source, royalty-free audio/speech codecs. The discussion concentrates on the adopted near-perfect reconstruction QMF and perfect reconstruction cosine-modulated filter banks, the processing methods, and the specified transform block sizes.
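To make the idea of a block-wise frequency-domain representation with perfect reconstruction concrete, the following is a minimal sketch of one widely used cosine-modulated filter bank, the MDCT with a sine window and 50% overlap (time domain aliasing cancellation). It is an illustration under stated assumptions, not the filter bank of any particular standard: the function names (`sine_window`, `mdct`, `imdct`, `analysis_synthesis`), the block size `N`, and the direct O(N^2) evaluation are all hypothetical choices made for readability.

```python
import math


def sine_window(N):
    """Sine window of length 2N; it satisfies w[n]^2 + w[n+N]^2 = 1."""
    return [math.sin(math.pi * (n + 0.5) / (2 * N)) for n in range(2 * N)]


def mdct(block, N):
    """Map 2N windowed time samples to N coefficients (direct evaluation)."""
    return [
        sum(
            block[n] * math.cos(math.pi / N * (n + 0.5 + N / 2) * (k + 0.5))
            for n in range(2 * N)
        )
        for k in range(N)
    ]


def imdct(coeffs, N):
    """Map N coefficients back to 2N time-aliased samples.

    The 2/N scaling matches a window pair with w[n]^2 + w[n+N]^2 = 1.
    """
    return [
        (2.0 / N)
        * sum(
            coeffs[k] * math.cos(math.pi / N * (n + 0.5 + N / 2) * (k + 0.5))
            for k in range(N)
        )
        for n in range(2 * N)
    ]


def analysis_synthesis(signal, N):
    """Run windowed MDCT analysis and overlap-add synthesis with hop size N."""
    w = sine_window(N)
    # Pad with N zeros in front and at least N zeros behind, so that every
    # original sample is covered by exactly two overlapping blocks.
    tail = (-len(signal)) % N + N
    padded = [0.0] * N + list(signal) + [0.0] * tail
    out = [0.0] * len(padded)
    for start in range(0, len(padded) - 2 * N + 1, N):
        block = [padded[start + n] * w[n] for n in range(2 * N)]
        coeffs = mdct(block, N)   # N coefficients per N new input samples
        y = imdct(coeffs, N)      # contains time-domain aliasing
        for n in range(2 * N):
            out[start + n] += y[n] * w[n]  # overlap-add cancels the aliasing
    return out[N:N + len(signal)]


if __name__ == "__main__":
    test_signal = [math.sin(0.3 * i) + 0.5 * math.cos(1.1 * i) for i in range(32)]
    rebuilt = analysis_synthesis(test_signal, 8)
    print("max reconstruction error:",
          max(abs(a - b) for a, b in zip(test_signal, rebuilt)))
```

Each block of 2N samples yields only N coefficients, so the representation is critically sampled; the aliasing this introduces within one block is cancelled when adjacent blocks are overlapped and added. Practical codecs replace the direct sums with fast algorithms and quantize the coefficients, both of which are omitted here.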

Copyright information

© 2018 Springer International Publishing AG

About this chapter

Cite this chapter

Britanak, V., Rao, K.R. (2018). Audio Coding Standards, (Proprietary) Audio Compression Algorithms, and Broadcasting/Speech/Data Communication Codecs: Overview of Adopted Filter Banks. In: Cosine-/Sine-Modulated Filter Banks. Springer, Cham. https://doi.org/10.1007/978-3-319-61080-1_2

  • DOI: https://doi.org/10.1007/978-3-319-61080-1_2

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-61078-8

  • Online ISBN: 978-3-319-61080-1

  • eBook Packages: Engineering, Engineering (R0)
