Auditory-like filterbank: An optimal speech processor for efficient human speech communication

Sadhana

Abstract

The transmitter and the receiver in a communication system have to be designed optimally with respect to one another to ensure reliable and efficient communication. Following this principle, we derive an optimal filterbank for processing the speech signal in the listener’s auditory system (the receiver), such that the filterbank output carries maximum information about the message of the talker (the transmitter), leading to efficient communication between talker and listener. We consider speech data of 45 talkers from three different languages and design optimal filterbanks separately for each of them. We find that the computationally derived optimal filterbanks are similar to the empirically established auditory (cochlear) filterbank in the human ear. We also find that the output of the empirically established auditory filterbank provides more than 90% of the maximum information about the talker’s message, where the maximum is the information provided by the output of the optimal filterbank. Our experimental findings suggest that the auditory filterbank in the human ear functions as a near-optimal speech processor for achieving efficient speech communication between humans.
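The comparison described in the abstract, how much information one filterbank's output carries about the message relative to another, can be illustrated with a minimal sketch. The estimator used in the paper is not shown in this preview, so the code below assumes a simple plug-in (count-based) estimate of mutual information on discrete toy data. The function name `mutual_information` and the variables `message`, `output_a`, and `output_b` are hypothetical and are not taken from the article; real filterbank outputs are continuous and would need quantization or a different estimator.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Plug-in estimate of I(X;Y) in bits from paired discrete samples.

    Assumption: this count-based estimator is a stand-in for illustration,
    not the estimator used in the article.
    """
    n = len(xs)
    if n == 0 or n != len(ys):
        raise ValueError("xs and ys must be non-empty and of equal length")
    joint = Counter(zip(xs, ys))  # counts of each (x, y) pair
    px = Counter(xs)              # marginal counts of x
    py = Counter(ys)              # marginal counts of y
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x,y) * log2(p(x,y) / (p(x) p(y))), written in terms of counts
        mi += (c / n) * math.log2(c * n / (px[x] * py[y]))
    return mi


# Hypothetical toy data: four equally likely "messages".
message = [0, 0, 1, 1, 2, 2, 3, 3]
# Quantized output of a processor that separates all four messages.
output_a = [0, 0, 1, 1, 2, 2, 3, 3]
# Quantized output of a processor that merges the messages into two groups.
output_b = [0, 0, 0, 0, 1, 1, 1, 1]

i_a = mutual_information(output_a, message)
i_b = mutual_information(output_b, message)
ratio = i_b / i_a  # fraction of the larger information retained by output_b

print(f"I(A; message) = {i_a:.3f} bits")
print(f"I(B; message) = {i_b:.3f} bits")
print(f"ratio = {ratio:.2f}")
```

In this toy case the second processor retains half of the information of the first. The abstract reports an analogous ratio above 90% for the empirically established auditory filterbank relative to the optimal one.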

Author information

Corresponding author

Correspondence to PRASANTA KUMAR GHOSH.

About this article

Cite this article

GHOSH, P.K., GOLDSTEIN, L.M. & NARAYANAN, S.S. Auditory-like filterbank: An optimal speech processor for efficient human speech communication. Sadhana 36, 699–712 (2011). https://doi.org/10.1007/s12046-011-0042-4

  • DOI: https://doi.org/10.1007/s12046-011-0042-4