Abstract
This paper presents results for an Automatic Speech Recognition (ASR) task on a corpus of spoken-digit files with a known level of added noise. The experiments used speech files contaminated with Gaussian noise and were run with HTK (Hidden Markov Model Toolkit), the software from Cambridge University. The noise level added to the speech signals ranged from 15 to 40 dB in steps of 5 dB. To reduce the noise level we applied adaptive filtering based on the Least Mean Squares (LMS) algorithm. With LMS filtering the error rate was lower than without it. This improvement was obtained because the ASR system was trained with 50% of the contaminated and original signals. The results presented in this paper serve to analyze ASR performance in a noisy environment and to demonstrate that, if the noise level is controlled and the target application is known, a better response can be obtained in ASR tasks. These results are of interest because speech signals found in real conditions (for example, recorded in a work environment) could be treated with this technique to decrease the error rate. We report recognition rates of 99%, 97.5%, 96%, 90.5%, 81% and 78.5% at noise levels of 15, 20, 25, 30, 35 and 40 dB, respectively, on the corpus described above. In total, the experiments covered 2600 sentences of speech signal, counting both noisy and filtered sentences.
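The abstract names LMS adaptive filtering but gives no filter configuration, so the following is only a minimal sketch of a textbook LMS noise canceller, not the authors' setup. It assumes a separate noise reference is available, uses a sine wave as a stand-in for a speech signal, and does not involve HTK. The function names (`lms_noise_canceller`, `run_demo`), the tap count, the step size `mu`, and the noise path coefficients are all hypothetical choices made for illustration.

```python
import math
import random


def lms_noise_canceller(reference, primary, num_taps=4, mu=0.01):
    """Textbook LMS noise canceller (illustrative, not the paper's code).

    reference: noise samples correlated with the noise in `primary`
    primary:   signal of interest plus noise
    Returns (cleaned, weights), where `cleaned` is the error signal e[k].
    """
    weights = [0.0] * num_taps
    buf = [0.0] * num_taps  # most recent reference samples, newest first
    cleaned = []
    for x, d in zip(reference, primary):
        buf = [x] + buf[:-1]
        y = sum(w * b for w, b in zip(weights, buf))  # noise estimate
        e = d - y                                     # cleaned sample
        # LMS update: move the weights along the instantaneous gradient.
        weights = [w + mu * e * b for w, b in zip(weights, buf)]
        cleaned.append(e)
    return cleaned, weights


def run_demo(n=4000, seed=0):
    """Contaminate a toy signal with Gaussian noise, then filter it."""
    rng = random.Random(seed)
    clean = [math.sin(2 * math.pi * 0.01 * k) for k in range(n)]
    noise = [rng.gauss(0.0, 1.0) for _ in range(n)]
    # Assumed noise path: the noise reaches the microphone slightly filtered.
    path = [0.8 * noise[k] + (0.3 * noise[k - 1] if k > 0 else 0.0)
            for k in range(n)]
    noisy = [s + v for s, v in zip(clean, path)]
    cleaned, _ = lms_noise_canceller(noise, noisy)

    # Compare mean squared errors on the second half, after convergence.
    half = n // 2
    mse_noisy = sum((a - b) ** 2
                    for a, b in zip(noisy[half:], clean[half:])) / half
    mse_filtered = sum((a - b) ** 2
                       for a, b in zip(cleaned[half:], clean[half:])) / half
    return mse_noisy, mse_filtered


if __name__ == "__main__":
    before, after = run_demo()
    print(f"MSE before filtering: {before:.4f}")
    print(f"MSE after filtering:  {after:.4f}")
```

In this toy setting the filtered signal ends up much closer to the clean one than the noisy input does. How that translates into recognition rate depends on the recognizer and its training data, which the sketch does not model.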
Copyright information
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Rodríguez, J.L.O., Guerra, S.S., Fernández, L.P.S. (2007). Using Adaptive Filter to Increase Automatic Speech Recognition Rate in a Digit Corpus. In: Rueda, L., Mery, D., Kittler, J. (eds) Progress in Pattern Recognition, Image Analysis and Applications. CIARP 2007. Lecture Notes in Computer Science, vol 4756. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-76725-1_9
DOI: https://doi.org/10.1007/978-3-540-76725-1_9
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-76724-4
Online ISBN: 978-3-540-76725-1
eBook Packages: Computer Science (R0)