Frequency Domain Linear Prediction for QMF Sub-bands and Applications to Audio Coding

  • Petr Motlicek
  • Sriram Ganapathy
  • Hynek Hermansky
  • Harinath Garudadri
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4892)

Abstract

This paper proposes an analysis technique for wide-band audio applications based on the predictability of the temporal evolution of Quadrature Mirror Filter (QMF) sub-band signals. The input audio signal is first decomposed into 64 sub-band signals using QMF decomposition. The temporal envelopes in critically sampled QMF sub-bands are approximated using frequency domain linear prediction applied over relatively long time segments (e.g. 1000 ms). Line Spectral Frequency parameters related to autoregressive models are computed and quantized in each frequency sub-band. The sub-band residuals are quantized in the frequency domain using a combination of split Vector Quantization (VQ) (for magnitudes) and uniform scalar quantization (for phases). In the decoder, the sub-band signal is reconstructed using the quantized residual and the corresponding quantized envelope. Finally, application of inverse QMF reconstructs the audio signal. Even with simple quantization techniques and without any sophisticated modules, the proposed audio coder provides encouraging results in objective quality tests. Also, the proposed coder is easily scalable across a wide range of bit-rates.
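The envelope-estimation step mentioned in the abstract can be illustrated with a minimal sketch of frequency domain linear prediction on a single real-valued sequence. This is not the authors' implementation: the function names, the model order of 20, the explicit cosine-matrix DCT, and the tiny diagonal loading term are all assumptions made for illustration. The idea shown is that linear prediction applied to the DCT of a signal yields an all-pole model whose "spectrum", read along the time axis, approximates the signal's squared temporal envelope.

```python
import numpy as np

def dct_ii(x):
    """Unnormalized DCT-II via an explicit cosine matrix (adequate for short frames)."""
    n = len(x)
    k = np.arange(n)[:, None]   # DCT coefficient index
    m = np.arange(n)[None, :]   # time sample index
    return np.cos(np.pi * (m + 0.5) * k / n) @ x

def levinson_durbin(r, order):
    """Levinson-Durbin recursion; returns (a, err) with a[0] == 1."""
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k = -acc / err                       # reflection coefficient
        a[1:i] = a[1:i] + k * a[i - 1:0:-1]  # update lower-order coefficients
        a[i] = k
        err *= 1.0 - k * k                   # prediction error power
    return a, err

def fdlp_envelope(x, order=20):
    """Approximate the squared temporal envelope of x (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    c = dct_ii(x)
    # Autocorrelation of the DCT sequence for lags 0..order.
    r = np.array([np.dot(c[:n - lag], c[lag:]) for lag in range(order + 1)])
    r[0] *= 1.0 + 1e-9  # assumed numerical safeguard, not part of the paper
    a, err = levinson_durbin(r, order)
    # Time sample m corresponds to the angle pi * (m + 0.5) / n of the AR model.
    theta = np.pi * (np.arange(n) + 0.5) / n
    A = np.exp(-1j * np.outer(theta, np.arange(order + 1))) @ a
    return err / np.abs(A) ** 2
```

In the coder described above, a routine of this kind would be applied to each QMF sub-band signal over a long segment, and the resulting autoregressive coefficients would then be converted to Line Spectral Frequencies for quantization. That conversion and the residual coding are not shown here.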

Index Terms

Audio coding, Frequency Domain Linear Prediction (FDLP), Perceptual Evaluation of Audio Quality (PEAQ)

Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Petr Motlicek (1)
  • Sriram Ganapathy (1, 2)
  • Hynek Hermansky (1, 2)
  • Harinath Garudadri (3)

  1. IDIAP Research Institute, Martigny, Switzerland
  2. École Polytechnique Fédérale de Lausanne (EPFL), Switzerland
  3. Qualcomm Inc., San Diego, USA
