Challenges in Speech Coding Research

Gibson, Jerry D.

doi:10.1007/978-1-4939-1456-2_2


Abstract

Speech and audio coding underlie many of the products and services that we have come to rely on and enjoy today. In this chapter, we discuss speech and audio coding, including a concise background summary, key coding methods, and the latest standards, with an eye toward current limitations and possible future research directions.



Author information

Authors and Affiliations

Department of Electrical & Computer Engineering, University of California, Santa Barbara, CA, 93106-6065, USA
Jerry D. Gibson


Corresponding author

Correspondence to Jerry D. Gibson.

Editor information

Editors and Affiliations

Dept. of Electrical Engineering, Santa Clara University, Santa Clara, California, USA
Tokunbo Ogunfunmi
School of EE&C Engineering, The University of Western Australia, Crawley, West Australia, Australia
Roberto Togneri
Qualcomm Inc., Santa Clara, California, USA
Madihally (Sim) Narasimha


About this chapter

Cite this chapter

Gibson, J.D. (2015). Challenges in Speech Coding Research. In: Ogunfunmi, T., Togneri, R., Narasimha, M. (eds) Speech and Audio Processing for Coding, Enhancement and Recognition. Springer, New York, NY. https://doi.org/10.1007/978-1-4939-1456-2_2


DOI: https://doi.org/10.1007/978-1-4939-1456-2_2
Published: 18 September 2014
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4939-1455-5
Online ISBN: 978-1-4939-1456-2
eBook Packages: Engineering, Engineering (R0)
