Using Adaptive Filter to Increase Automatic Speech Recognition Rate in a Digit Corpus

  • José Luis Oropeza Rodríguez
  • Sergio Suárez Guerra
  • Luis Pastor Sánchez Fernández
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4756)

Abstract

This paper presents results obtained in the Automatic Speech Recognition (ASR) task for a corpus of spoken-digit speech files with a fixed level of added noise. The experiments used speech files contaminated with Gaussian noise, and were carried out with the HTK (Hidden Markov Model Toolkit) software from Cambridge University. The noise level added to the speech signals ranged from 15 to 40 dB in steps of 5 dB. We used adaptive filtering, based on the Least Mean Squares (LMS) algorithm, to reduce the noise level. With LMS filtering, the error rate was lower than without it. This improvement was obtained because we trained the ASR system with 50% of the contaminated and original signals. The results presented in this paper analyze ASR performance in a noisy environment and demonstrate that, if the noise level is controlled and the target application is known, a better response can be obtained in ASR tasks. These results are of interest because speech signals found in real settings (for example, recorded in a work environment) could be treated with this technique to decrease the error rate. We report recognition rates of 99%, 97.5%, 96%, 90.5%, 81% and 78.5% at noise levels of 15, 20, 25, 30, 35 and 40 dB, respectively, on the corpus described above. In total, the experiments used 2600 sentences of speech signal, counting both noisy and filtered sentences.
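The abstract names the LMS algorithm but does not describe the filter configuration. As a rough illustration only, the sketch below assumes a noise-cancelling arrangement in which a reference noise signal is available alongside the noisy speech. The function names, the 8-tap filter length, the step size, the sine-wave stand-in for speech, and the short FIR noise path are all hypothetical choices for the example, not values taken from the paper.

```python
import math
import random


def lms_noise_canceller(primary, reference, num_taps=8, step_size=0.01):
    """LMS adaptive noise canceller (illustrative sketch).

    primary   : noisy signal, assumed to be clean signal plus noise
    reference : noise-only signal correlated with the noise in `primary`
    Returns the error signal e[n] = primary[n] - y[n], the cleaned estimate.
    """
    weights = [0.0] * num_taps
    window = [0.0] * num_taps  # most recent reference samples, newest first
    cleaned = []
    for d, x in zip(primary, reference):
        window = [x] + window[:-1]
        # Filter output: current estimate of the noise present in `primary`.
        y = sum(w * xi for w, xi in zip(weights, window))
        # Error: what remains after subtracting the noise estimate.
        e = d - y
        # LMS weight update: w <- w + 2 * mu * e * x
        weights = [w + 2.0 * step_size * e * xi
                   for w, xi in zip(weights, window)]
        cleaned.append(e)
    return cleaned


def mean_square_error(a, b):
    """Average squared difference between two equal-length sequences."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b)) / len(a)


def demo(num_samples=4000, seed=0):
    """Compare the error against the clean signal before and after filtering."""
    rng = random.Random(seed)
    # Stand-in for speech: a slow sine wave (assumption for the example).
    clean = [math.sin(2.0 * math.pi * 0.01 * n) for n in range(num_samples)]
    # Gaussian reference noise.
    reference = [rng.gauss(0.0, 1.0) for _ in range(num_samples)]
    # Hypothetical FIR path from the noise source to the primary input.
    path = [0.6, 0.3, -0.1]
    noise = [
        sum(path[k] * reference[n - k] for k in range(len(path)) if n - k >= 0)
        for n in range(num_samples)
    ]
    noisy = [s + v for s, v in zip(clean, noise)]
    cleaned = lms_noise_canceller(noisy, reference)
    # Measure on the second half only, after the weights have adapted.
    half = num_samples // 2
    mse_before = mean_square_error(noisy[half:], clean[half:])
    mse_after = mean_square_error(cleaned[half:], clean[half:])
    return mse_before, mse_after


if __name__ == "__main__":
    before, after = demo()
    print("MSE before filtering:", before)
    print("MSE after filtering: ", after)
```

The step size trades convergence speed against residual error: a larger value adapts faster but leaves more fluctuation in the weights, so it would need tuning for real speech data.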

Keywords

Automatic Speech Recognition, Adaptive Filters, Continuous Density Hidden Markov Models, Gaussian Mixtures and noisy speech signals

Copyright information

© Springer-Verlag Berlin Heidelberg 2007

Authors and Affiliations

  • José Luis Oropeza Rodríguez (1)
  • Sergio Suárez Guerra (1)
  • Luis Pastor Sánchez Fernández (1)
  1. Center for Computing Research, National Polytechnic Institute, Juan de Dios Batiz esq Miguel Othon de Mendizabal s/n, P.O. 07038, Mexico