Abstract
Time-frequency transforms play an important role in signal processing. Many speech processing algorithms need to convert the time-domain speech signal to the frequency domain. The Fourier transform (FT) and the fast Fourier transform (FFT) have been used for decades, but they are not robust to background noise. As shown in this chapter, the FFT introduces computational noise and pitch harmonics into the spectrum. In a different approach, the traveling wave in the cochlea has been modeled as a Gammatone function, and a bank of such functions has been used as a forward transform to decompose the input signal into different frequency bands. However, there is no proven inverse transform for the Gammatone filter bank, and the filter bandwidths are fixed and cannot be adjusted for different kinds of applications. To address these issues, the author presented a robust, invertible, auditory-based time-frequency transform, named the auditory-based transform or auditory transform (AT), in [23, 22]. This chapter provides a detailed introduction to the AT.
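The Gammatone filter-bank decomposition mentioned above can be sketched as follows. This is a minimal illustration of the conventional Gammatone forward analysis only, not the author's auditory transform. The function names (`erb`, `gammatone_ir`, `gammatone_analysis`), the filter order of 4, the bandwidth factor 1.019, and the ERB formula are assumptions chosen for illustration and are not taken from the chapter.

```python
import numpy as np

def erb(fc):
    """Equivalent rectangular bandwidth in Hz (assumed Glasberg-Moore style approximation)."""
    return 24.7 * (4.37 * fc / 1000.0 + 1.0)

def gammatone_ir(fc, fs, order=4, duration=0.05):
    """Sampled Gammatone impulse response: t^(n-1) * exp(-2*pi*b*t) * cos(2*pi*fc*t)."""
    t = np.arange(int(duration * fs)) / fs
    b = 1.019 * erb(fc)  # fixed bandwidth tied to the center frequency (assumption)
    g = t ** (order - 1) * np.exp(-2 * np.pi * b * t) * np.cos(2 * np.pi * fc * t)
    return g / np.sqrt(np.sum(g ** 2))  # normalize to unit energy

def gammatone_analysis(x, fs, centers):
    """Forward decomposition: one band-limited output per center frequency."""
    bands = []
    for fc in centers:
        y = np.convolve(x, gammatone_ir(fc, fs), mode="full")[: len(x)]  # causal filtering
        bands.append(y)
    return np.stack(bands)

if __name__ == "__main__":
    fs = 16000
    t = np.arange(int(0.2 * fs)) / fs
    x = np.sin(2 * np.pi * 1000.0 * t)        # 1 kHz test tone
    centers = [250.0, 1000.0, 4000.0]         # hypothetical center frequencies in Hz
    bands = gammatone_analysis(x, fs, centers)
    energy = np.sum(bands ** 2, axis=1)
    print("band with most energy:", centers[int(np.argmax(energy))], "Hz")
```

In this sketch the bandwidth of each filter is fully determined by its center frequency, which is the fixed-bandwidth limitation the abstract refers to, and the code provides no way to reconstruct `x` from `bands`.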
References
Allen J.: “Cochlear modeling”. IEEE ASSP Magazine, pp. 3–29, Jan. 1985
Barbour, D. L., Wang, X.: “Contrast tuning in auditory cortex”. Science 299, 1073–1075 (2003)
Bruce, I., Sachs, M., Young, E.: “An auditory-periphery model of the effects of acoustic trauma on auditory nerve responses”. J. Acoust. Soc. Am. 113, 369–388 (2003)
Choueiter, G. F., Glass, J. R.: “An implementation of rational wavelets and filter design for phonetic classification”. IEEE Trans. on Audio, Speech, and Language Processing 15, 939–948 (2007)
Daubechies I., Maes S. (1996) “A nonlinear squeezing of the continuous wavelet transform based on auditory nerve models”. In: A. Aldroubi, M. Unser (eds.) Wavelets in Medicine and Biology (CRC Press), pp. 527–546
Davis, S. B., Mermelstein, P.: “Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences”. IEEE Trans. on Acoustics, Speech, and Signal Processing ASSP-28, 357–366 (1980)
Evans E. F. (1977) “Frequency selectivity at high signal levels of single units in cochlear nerve and cochlear nucleus”. In: E. F. Evans, J. P. Wilson (eds.) Psychophysics and Physiology of Hearing. London, UK: Academic Press, pp. 195–192
Flanagan, J. L.: Speech analysis synthesis and perception. Springer-Verlag, New York (1972)
Fletcher H.: Speech and hearing in communication. Acoustical Society of America, 1995
Furui, S.: “Cepstral analysis techniques for automatic speaker verification”. IEEE Trans. Acoust., Speech, Signal Processing 27, 254–277 (1981)
Gelfand, S. A.: Hearing, an introduction to psychological and physiological acoustics. 3rd edition. Marcel Dekker, New York (1998)
Ghitza, O.: “Auditory models and human performance in tasks related to speech coding and speech recognition”. IEEE Trans. on Speech and Audio Processing 2, 115–132 (1994)
Goldstein, J. L.: “Modeling rapid waveform compression on the basilar membrane as a multiple-bandpass-nonlinear filtering”. Hearing Res. 49, 39–60 (1990)
Hermansky, H., Morgan, N.: “Rasta processing of speech”. IEEE Trans. Speech and Audio Proc. 2, 578–589 (1994)
Hohmann, V.: “Frequency analysis and synthesis using a Gammatone filterbank”. Acta Acustica united with Acustica 88, 433–442 (2002)
Johannesma, P. I. M.: “The pre-response stimulus ensemble of neurons in the cochlear nucleus”. The proceeding of the symposium on hearing Theory IPO, 58–69 (1972)
Johnson, R. A., Wichern, D. W.: Applied Multivariate Statistical Analysis. 3rd edn. Prentice Hall, New Jersey (1988)
Kates, J. M.: “Accurate tuning curves in cochlea model”. IEEE Trans. on Speech and Audio Processing 1, 453–462 (1993)
Kates, J. M.: “A time-domain digital cochlea model”. IEEE Trans. on Signal Processing 39, 2573–2592 (1991)
Khanna S. M., Leonard D. G. B.: “Basilar membrane tuning in the cat cochlea”. Science 215:305–306, Jan 1982
Kiang, N. Y.-S.: Discharge patterns of single fibers in the cat’s auditory nerve. 3rd edn. MIT, MA (1965)
Li Q.: “An auditory-based transform for audio signal processing”. in Proceedings of IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (New Paltz, NY), Oct. 2009
Li Q.: “Solution for pervasive speaker recognition”. SBIR Phase I Proposal, Submitted to NSF IT.F4, Li Creative Technologies, Inc., NJ, June (2003)
Li Q., Huang Y.: “An auditory-based feature extraction algorithm for robust speaker identification under mismatched conditions” IEEE Trans. on Audio, Speech and Language Processing, Sept. 2011
Li Q., Huang Y.: “Robust speaker identification using an auditory-based feature”. in ICASSP 2010 (2010)
Li Q., Soong F. K., Siohan O.: “An auditory system-based feature for robust speech recognition”. in Proc. 7th European Conf. on Speech Communication and Technology (Denmark), pp. 619–622, Sept. (2001)
Li Q., Soong F. K., Siohan O.: “A high-performance auditory feature for robust speech recognition”. in Proceedings of 6th Int’l Conf. on Spoken Language Processing (Beijing), pp. III 51–54, Oct. 2000
Li, Q., Zheng, J., Tsai, A., Zhou, Q.: “Robust endpoint detection and energy normalization for real-time speech and speaker recognition”. IEEE Trans. on Speech and Audio Processing 10, 146–157 (2002)
Lin, J., Ki, W.-H., Edwards, T., Shamma, S.: “Analog VLSI implementations of auditory wavelet transforms using switched-capacitor circuits”. IEEE Trans. on Circuits and Systems I: Fundamental Theory and Applications 41, 572–583 (1994)
Liu, W., Andreou, A. G., Goldstein, M. H., Jr.: “Voiced-speech representation by an analog silicon model of the auditory periphery”. IEEE Trans. on Neural Networks 3, 477–487 (1992)
Lyon, R. F., Mead, C.: “An analog electronic cochlea”. IEEE Trans. on Acoustics, Speech, and Signal processing 36, 1119–1134 (1988)
Mak, B., Tam, Y.-C., Li, Q.: “Discriminative auditory features for robust speech recognition”. IEEE Trans. on Speech and Audio Processing 12, 27–36 (2004)
Misiti, M., Misiti, Y., Oppenheim, G., Poggi, J.-M.: Wavelet Toolbox User’s Guide. 3rd edn. MathWorks, MA (2006)
Møller, A. R.: “Frequency selectivity of single auditory-nerve fibers in response to broadband noise stimuli”. J. Acoust. Soc. Am. 62, 135–142 (1977)
Moore, B., Peters, R. W., Glasberg, B. R.: “Auditory filter shapes at low center frequencies”. J. Acoust. Soc. Am 88, 132–148 (1990)
Moore, B. C. J., Glasberg, B. R.: “Suggested formula for calculating auditory-filter bandwidth and excitation patterns”. J. Acoust. Soc. Am. 74, 750–753 (1983)
Moore, B. C.: An introduction to the psychology of hearing. 3rd edn. Academic Press, NY (1997)
Nedzelnitsky, V.: “Sound pressures in the basal turn of the cat cochlea”. J. Acoust. Soc. Am. 68, 1676–1680 (1980)
Patterson, R. D.: “Auditory filter shapes derived with noise stimuli”. J. Acoust. Soc. Am. 59, 640–654 (1976)
Pickles, J. O.: An introduction to the physiology of hearing. 2nd edn. Academic Press, New York (1988)
Rao, R., Bopardikar, A.: Wavelet Transforms. 2nd edn. Addison-Wesley, MA (1998)
Sellami, L., Newcomb, R. W.: “A digital scattering model of the cochlea”. IEEE Trans. on Circuits and Systems I: Fundamental Theory and Applications 44, 174–180 (1997)
Sellick, P. M., Patuzzi, R., Johnstone, B. M.: “Measurement of basilar membrane motion in the guinea pig using the Mössbauer technique”. J. Acoust. Soc. Am. 72, 131–141 (1982)
Shaw E. A. G. The external ear, in Handbook of Sensory Physiology. New York: Springer-Verlag, 1974. W. D. Keidel and W. D. Neff eds
Teich M. C., Heneghan C., Khanna S. M. “Analysis of cellular vibrations in the living cochlea using the continuous wavelet transform and the short-time Fourier transform”. in Time frequency and wavelets in biomedical signal processing, pp. 243–269, 1998. Edited by M. Akay.
Torrence, C., Compo, G. P.: “A practical guide to wavelet analysis”. Bulletin of the American Meteorological Society 79, 61–78 (1998)
Volkmer, M.: “Theoretical analysis of a time-frequency-PCNN auditory cortex model”. International J. of Neural Systems 15, 339–347 (2005)
von Békésy, G.: Experiments in hearing. 2nd edn. McGraw-Hill, New York (1998)
Wang D., Brown G. J. Fundamentals of computational auditory scene analysis in Computational Auditory Scene Analysis Edited by D. Wang and G. J. Brown. NJ: IEEE Press, 2006.
Wang, K., Shamma, S. A.: “Spectral shape analysis in the central auditory system”. IEEE Trans. on Speech and Audio Processing 3, 382–395 (1995)
Weintraub M. A theory and computational model of auditory monaural sound separation. PhD thesis, Stanford University, CA, August 1985
Wilson, J. P., Johnstone, J.: “Basilar membrane and middle-ear vibration in guinea pig measured by capacitive probe”. J. Acoust. Soc. Am. 57, 705–723 (1975)
Wilson J. P., Johnstone J. “Capacitive probe measures of basilar membrane vibrations in”. Hearing Theory, 1972
Yost, W.: Fundamentals of Hearing: An Introduction. 3rd edn. Academic Press, New York (1994)
Zhou B. “Auditory filter shapes at high frequencies”. J. Acoust. Soc. Am 98:1935–1942
Zilany, M., Bruce, I.: “Modeling auditory-nerve response for high sound pressure levels in the normal and impaired auditory periphery”. J. Acoust. Soc. Am 120, 1447–1466 (2006)
Zweig, G., Lipes, R., Pierce, J. R.: “The cochlear compromise”. J. Acoust. Soc. Am. 59, 975–982 (1976)
Zwicker, E., Terhardt, E.: “Analytical expressions for critical-band rate and critical bandwidth as a function of frequency”. J. Acoust. Soc. Am. 68, 1523–1525 (1980)
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Li, Q. (2012). Auditory-Based Time Frequency Transform. In: Speaker Authentication. Signals and Communication Technology. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-23731-7_7
DOI: https://doi.org/10.1007/978-3-642-23731-7_7
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-23730-0
Online ISBN: 978-3-642-23731-7
eBook Packages: Engineering (R0)