Adaptive bands filter bank optimized by genetic algorithm for robust speech recognition system
Abstract
Perceptual auditory filter banks such as the Bark-scale filter bank are widely used for front-end processing in speech recognition systems. However, how to design optimized filter banks that yield higher recognition accuracy remains an open problem. An adaptive bands filter bank (ABFB) is presented for the spectral analysis stage of feature extraction. The design allows flexible bandwidths and center frequencies for the frequency responses of the filters and uses a genetic algorithm (GA) to optimize these design parameters. The optimization places the front-end filter bank and the back-end recognition network together in the performance evaluation loop. Deploying the ABFB with the zero-crossing peak amplitude (ZCPA) feature as the front end of a radial basis function (RBF) system significantly improves robustness compared with the Bark-scale filter bank. In the ABFB, several sub-bands are still concentrated toward lower frequencies, but their exact locations are determined by recognition performance rather than by perceptual criteria. To ease optimization, only symmetrical bands are considered here, and these still provide satisfactory results.
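The optimization loop described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the GA operators, the number of bands, or the frequency range. The constants `F_MAX` and `N_BANDS`, the one-point crossover, the Gaussian mutation, and the elitist selection are therefore all assumptions made for illustration. The `fitness` argument stands in for the recognition performance of the ZCPA and RBF pipeline, which the caller would have to supply.

```python
import random

F_MAX = 4000.0  # assumed upper frequency limit in Hz (not given in the abstract)
N_BANDS = 16    # assumed number of sub-bands (not given in the abstract)


def clip_band(center, width):
    """Keep a symmetric band (center +/- width/2) inside [0, F_MAX]."""
    center = min(max(center, 50.0), F_MAX - 50.0)
    width = min(max(width, 20.0), 2.0 * min(center, F_MAX - center))
    return center, width


def random_bank(rng):
    """One chromosome: N_BANDS (center, bandwidth) pairs sorted by center."""
    bank = [clip_band(rng.uniform(0.0, F_MAX), rng.uniform(50.0, 500.0))
            for _ in range(N_BANDS)]
    return sorted(bank)


def crossover(a, b, rng):
    """One-point crossover on the list of bands (an assumed operator)."""
    cut = rng.randrange(1, N_BANDS)
    return sorted(a[:cut] + b[cut:])


def mutate(bank, rng, rate=0.1):
    """Perturb some centers and bandwidths with Gaussian noise (assumed operator)."""
    out = []
    for center, width in bank:
        if rng.random() < rate:
            center += rng.gauss(0.0, 100.0)
            width += rng.gauss(0.0, 30.0)
        out.append(clip_band(center, width))
    return sorted(out)


def optimize(fitness, generations=30, pop_size=20, seed=0):
    """Evolve filter banks; `fitness` maps a bank to a score (higher is better).

    In the setting of the abstract, `fitness` would run the front end and the
    back-end recognizer and return the recognition performance.
    """
    rng = random.Random(seed)
    pop = [random_bank(rng) for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(pop, key=fitness, reverse=True)
        elite = ranked[: pop_size // 2]  # the best half survives unchanged
        children = [
            mutate(crossover(rng.choice(elite), rng.choice(elite), rng), rng)
            for _ in range(pop_size - len(elite))
        ]
        pop = elite + children
    return max(pop, key=fitness)
```

With a real recognizer, each fitness evaluation would involve extracting features through the candidate filter bank and scoring the recognition network, so the population size and generation count would likely be limited by that cost.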
Key words
perceptual filter banks; Bark scale; speaker-independent speech recognition systems; zero-crossing peak amplitude; genetic algorithm