
MPEG Audio Compression Basics

The MPEG Representation of Digital Media

Abstract

MPEG Audio was the first international standard for high-quality audio coding, and it opened the door to a variety of applications in digital music. In this chapter we review the basic ideas and features behind the general-purpose perceptual audio coders specified in the MPEG-1 and MPEG-2 audio standards, which include the MP3 and AAC formats. The widely successful MP3 and AAC coders are among the most remarkable achievements of the MPEG committee: they strongly influenced the technology and largely enabled different ways of consuming digital media.


Notes

  1. Although this notation may suggest that we are blending the input signals in the time domain, this approach is usually carried out in the frequency domain over a frequency range in which the power of the spectral lines is similar in the two channels. This is because there is not much correlation between stereo channels in the time domain. In general, stereo redundancies can be more easily exploited for systems with high frequency resolution.
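As a rough illustration of the frequency-domain approach described in this note, the sketch below applies a sum/difference transform to one band of spectral lines only when the two channels carry similar power in that band, and otherwise leaves the band as separate left/right lines. It assumes the notation in question is a sum/difference (mid/side) pair; the function names, the 3 dB threshold, and the power-ratio test are hypothetical choices for illustration and are not taken from the MPEG standards.

```python
import math


def band_power(lines):
    """Sum of squared spectral lines in one band."""
    return sum(x * x for x in lines)


def ms_encode_band(left, right, max_power_ratio_db=3.0):
    """Encode one band as mid/side if L and R have similar power.

    Returns (used_ms, ch0, ch1). The threshold is an illustrative
    assumption; real encoders typically decide per band using
    perceptual or bit-cost criteria.
    """
    p_l = band_power(left)
    p_r = band_power(right)
    if p_l > 0.0 and p_r > 0.0:
        ratio_db = abs(10.0 * math.log10(p_l / p_r))
    else:
        ratio_db = math.inf  # one channel is silent: keep L/R
    if ratio_db <= max_power_ratio_db:
        mid = [(l + r) / 2.0 for l, r in zip(left, right)]
        side = [(l - r) / 2.0 for l, r in zip(left, right)]
        return True, mid, side
    return False, list(left), list(right)


def ms_decode_band(used_ms, ch0, ch1):
    """Invert ms_encode_band for one band."""
    if used_ms:
        left = [m + s for m, s in zip(ch0, ch1)]
        right = [m - s for m, s in zip(ch0, ch1)]
        return left, right
    return list(ch0), list(ch1)


# Two nearly identical channels: the side signal becomes small.
left = [1.0, -2.0, 0.5]
right = [0.9, -2.1, 0.4]
used_ms, ch0, ch1 = ms_encode_band(left, right)
rec_left, rec_right = ms_decode_band(used_ms, ch0, ch1)
```

When the channels are similar, most of the energy moves into the mid signal and the side signal needs few bits; when their powers differ strongly, the band is passed through unchanged, which is why the decision is made per frequency band rather than on the whole time-domain signal.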

References

  1. Schroeder M R, Atal B S, Hall J L (1979), Optimizing Digital Speech Coders by Exploiting Masking Properties of the Human Ear. J Acoust Soc Am, 66:1647–1652


  2. ISO/IEC 11172-3 (1993), Information Technology, Coding of moving pictures and associated audio for digital storage media at up to about 1.5 Mbit/s, Part 3: Audio


  3. ITU-R BS.1115 (1994), Low Bitrate Audio Coding. ITU, Geneva


  4. ISO/IEC 13818-3 (1994–1997), Information Technology - Generic Coding of Moving Pictures and Associated Audio, Part 3: Audio


  5. Nussbaumer H J (1981), Pseudo-QMF Filter Bank. IBM Tech Disclosure Bull, 24:3081–3087


  6. Rothweiler J H (1983), Polyphase Quadrature Filters - A new Subband Coding Technique. International Conference IEEE ASSP, Boston, 1280–1283


  7. Princen J P, Johnson A, Bradley A B (1987), Subband/Transform Coding Using Filter Bank Designs Based on Time Domain Aliasing Cancellation. Proc of the ICASSP, 2161–2164


  8. Soulodre G A, Grusec T, Lavoie M, Thibault L (1998), Subjective Evaluation of State-of-the-Art Two-Channel Audio Codecs. J Audio Eng Soc, 46:164–177


  9. ISO/IEC JTC 1/SC 29/WG 11 N7950 (2006), Performance of MPEG Surround Technology


  10. Johnston J D, Ferreira A J (1992), Sum-Difference Stereo Transform Coding. Proc ICASSP, 569–571


  11. vd Waal R G, Veldhuis R N J (1991), Subband Coding of Stereophonic Digital Audio Signals. Proc ICASSP, 3601–3604


  12. Davis M (1993), The AC-3 Multichannel Coder. 95th AES Convention, New York, preprint 3774


  13. Thiede T, Treurniet W, Bitto R, Schmidmer C, Sporer T, Beerends J, Colomes C, Keyhl M, Stoll G, Brandenburg K, Feiten B (2000), PEAQ-The ITU Standard for Objective Measurement of Perceived Audio Quality. J Audio Eng Soc, 48:3–29


  14. Bosi M, Goldberg R E (2003), Introduction to Digital Audio Coding and Standards. Springer, New York


  15. Malvar H S (1990), Lapped transforms for efficient transform/sub-band coding. IEEE Transactions on Acoustics Speech and Signal Processing, 38:969–978


  16. Fielder L, Bosi M, Davidson G, Davis M, Todd C, Vernon S (1996), AC-2 and AC-3: Low-Complexity Transform-Based Audio Coding. Collected Papers on Digital Audio Bit-Rate Reduction, Neil Gilchrist and Christer Grewin (ed), AES 54–72


  17. Edler B (1989), Coding of Audio Signals with Overlapping Transform and Adaptive Window Shape (in German), Frequenz, 43:252–256


  18. Bosi M, Brandenburg K, Quackenbush S, Fielder L, Akagiri K, Fuchs H, Dietz M, Herre J, Davidson G, Oikawa Y (1997), ISO/IEC MPEG-2 Advanced Audio Coding. J Audio Eng Soc, 45:789–812


  19. Edler B (1992), Aliasing reduction in sub-bands of cascaded filter banks with decimation. Electronics Letters, 28:1104–1105


  20. Zwicker E, Fastl H (1990), Psychoacoustics: Facts and Models. Springer-Verlag, Berlin


  21. Hellman R (1972), Asymmetry of Masking Between Noise and Tone. Percept Psychophys, 11:241–246


  22. Fletcher H (1940), Auditory Patterns. Rev Mod Phys, 12:47–55


  23. EBU (1988), Tech 3253 - Sound Quality Assessment Material (SQAM). Tech Rep, European Broadcasting Union


  24. Brandenburg K, Johnston J D (1990), Second Generation Perceptual Audio Coding: The Hybrid Coder. 88th AES Convention, Montreux


  25. Johnston J D (1988), Estimation of Perceptual Entropy Using Noise Masking Criteria. Proc ICASSP, 2524–2527


  26. Blauert J (1983), Spatial Hearing. MIT Press, Cambridge


  27. Dehéry Y F, Stoll G, Kerkhof L vd (1991), MUSICAM Source Coding for Digital Sound. Symp Rec Broadcast Sessions, 612–617


  28. Brandenburg K, Herre J, Johnston J D, Mahieux Y, Schroeder E F (1991), ASPEC-Adaptive Spectral Perceptual Entropy Coding of High Quality Music Signals. 90th AES Convention, 3011


  29. Ryden T, Grewin C, Bergman S (1991), The SR report on the MPEG audio subjective listening tests in Stockholm April/May 1991, ISO/IEC JTC1/SC29/WG 11 MPEG 91/010


  30. ITU-R BS.775-1 (1992–1994), Multichannel Stereophonic Sound System with and without Accompanying Picture


  31. ten Kate W, Boers P, Maekivirta A, Kuusama J, Christensen K E, Soerensen E (1992), Matrixing of Bit-Rate Reduced Signals. Proc ICASSP, 2:205–208


  32. Stoll G (1996), ISO-MPEG-2 Audio: A Generic Standard for the Coding of Two-Channel and Multichannel Sound. Gilchrist, Grewin (ed), Collected Papers on Digital Audio Bit-Rate Reduction, 43–53, AES 1996


  33. ISO/IEC 13818-7 (1997), Information Technology - Generic Coding of Moving Pictures and Associated Audio, Part 7: Advanced Audio Coding


  34. ISO/IEC JTC 1/SC 29/WG 11 N1420 (1996), Overview of the Report on the Formal Subjective Listening Tests of MPEG-2 AAC Multichannel Audio Coding


  35. ISO/IEC 14496-3 (1999–2001), Information Technology - Coding of Audio Visual Objects, Part 3: Audio


  36. Herre J, Johnston J D (1996), Enhancing the Performance of Perceptual Audio Coders by Using Temporal Noise Shaping (TNS). 101st AES Convention, 4384


  37. Sinha D, Johnston J D, Dorward S, Quackenbush S R (1998), The Perceptual Audio Coder (PAC). The Digital Signal Processing Handbook, Madisetti, Williams (ed), CRC Press, 42.1–42.18


  38. Herre J, Schulz D (1998), Extending the MPEG-4 AAC Codec by Perceptual Noise Substitution. 104th AES Convention, 4720


  39. Dietz M, Liljeryd L, Kjoerling K, Kunz O (2002), Spectral Band Replication, a novel approach in audio coding. 112th AES Convention, 5553



Author information

Corresponding author

Correspondence to Marina Bosi.


Copyright information

© 2012 Springer Science+Business Media, LLC

About this chapter

Cite this chapter

Bosi, M. (2012). MPEG Audio Compression Basics. In: Chiariglione, L. (ed.) The MPEG Representation of Digital Media. Springer, New York, NY. https://doi.org/10.1007/978-1-4419-6184-6_6


  • DOI: https://doi.org/10.1007/978-1-4419-6184-6_6


  • Publisher Name: Springer, New York, NY

  • Print ISBN: 978-1-4419-6183-9

  • Online ISBN: 978-1-4419-6184-6

  • eBook Packages: Engineering, Engineering (R0)
