Auditory-Based Feature Extraction and Robust Speaker Identification
In the previous chapter, we introduced a robust auditory transform (AT). In this chapter, we present an auditory-based feature extraction algorithm based on the AT and apply it to robust speaker identification. The performance of acoustic models trained on clean speech usually drops significantly when they are tested on noisy speech. The presented features, however, have shown strong robustness in this kind of situation. We present a typical text-independent speaker identification system in the experiment section. Under all three mismatched testing conditions (white noise, car noise, and babble noise), the auditory features consistently perform better than the baseline mel frequency cepstral coefficient (MFCC) features. The auditory features are also compared with perceptual linear predictive (PLP) and RASTA-PLP features. The auditory features consistently perform much better than PLP. Under white noise, the auditory features are much better than RASTA-PLP; under car and babble noise, the performance of the two is similar.
Keywords: Fast Fourier Transform; Hair Cell; Discrete Cosine Transform; Basilar Membrane; Speaker Recognition