Abstract
The transmitter and the receiver in a communication system must be designed optimally with respect to one another to ensure reliable and efficient communication. Following this principle, we derive an optimal filterbank for processing the speech signal in the listener’s auditory system (the receiver), so that the filterbank output carries maximum information about the talker’s (the transmitter’s) message, leading to efficient communication between talker and listener. We use speech data from 45 talkers of three different languages to design an optimal filterbank separately for each of them. We find that the computationally derived optimal filterbanks are similar to the empirically established auditory (cochlear) filterbank in the human ear. We also find that the output of the empirically established auditory filterbank provides more than 90% of the maximum information about the talker’s message, that is, of the information provided by the output of the optimal filterbank. These experimental findings suggest that the auditory filterbank in the human ear functions as a near-optimal speech processor for efficient speech communication between humans.
Cite this article
GHOSH, P.K., GOLDSTEIN, L.M. & NARAYANAN, S.S. Auditory-like filterbank: An optimal speech processor for efficient human speech communication. Sadhana 36, 699–712 (2011). https://doi.org/10.1007/s12046-011-0042-4