Part of the book series: Advances in Intelligent Systems and Computing ((AISC,volume 199))

Abstract

The main objective of this paper is to design a noise-resilient, speaker-independent speech recognition system for isolated words. Mel-frequency cepstral coefficients (MFCCs) are used for feature extraction. The noise robustness of MFCCs under mismatched training and testing conditions is improved by applying a wavelet-based denoising algorithm, and an invariant-integration method is applied to make the MFCCs robust to variation in vocal tract length (VTL). The resulting features are called enhanced MFCC invariant-integration features (EMFCCIIFs). A classifier called the feature-finding neural network (FFNN) is used to recognize the isolated words. Results are compared with those obtained using traditional MFCC features. Experiments show that, under mismatched conditions, the EMFCCIIFs maintain a high recognition rate at low signal-to-noise ratios (SNRs), and that their performance is also more effective at high SNRs.
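
The invariant-integration idea mentioned above can be illustrated with a minimal sketch. It assumes the commonly described form in which a monomial of spectral values is averaged over a small window of frequency shifts, so that a spectral pattern displaced by a change in VTL yields nearly the same feature value. The abstract does not give the exact formulation used in the paper, so the function name, its parameters, and the single-frame simplification (no temporal context, no wavelet denoising, no cepstral step) are all assumptions made for illustration only.

```python
def invariant_integration_feature(frame, centers, exponents, window):
    """Average a monomial of spectral values over frequency shifts.

    Hypothetical helper, not the paper's implementation.
    frame     -- non-negative spectral magnitudes of one time frame
    centers   -- channel indices k_i that enter the monomial
    exponents -- integer exponents l_i, one per center
    window    -- shifts from -window to +window are averaged
    """
    order = sum(exponents)  # monomial order, used to normalize the product
    n = len(frame)
    total = 0.0
    count = 0
    for shift in range(-window, window + 1):
        idx = [k + shift for k in centers]
        if min(idx) < 0 or max(idx) >= n:
            continue  # skip shifts that fall off the edge of the spectrum
        prod = 1.0
        for i, l in zip(idx, exponents):
            prod *= frame[i] ** l
        total += prod ** (1.0 / order)
        count += 1
    if count == 0:
        raise ValueError("no valid shift for these centers and window")
    return total / count


# Toy spectrum with one peak, and the same spectrum moved up by one channel
# (a crude stand-in for the effect of a different vocal tract length).
frame = [0.0, 0.0, 1.0, 4.0, 1.0, 0.0, 0.0, 0.0]
shifted = [0.0, 0.0, 0.0, 1.0, 4.0, 1.0, 0.0, 0.0]

raw_a, raw_b = frame[3], shifted[3]  # raw channel value changes with the shift
iif_a = invariant_integration_feature(frame, [3], [1], 2)
iif_b = invariant_integration_feature(shifted, [3], [1], 2)
print(raw_a, raw_b)
print(iif_a, iif_b)
```

In this toy case the raw value at channel 3 changes when the peak moves, while the integrated feature stays the same because the peak remains inside the averaging window. Real systems would apply this to a filterbank or time-frequency representation with many monomials; the choice of centers, exponents, and window size here is arbitrary.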


Corresponding author

Correspondence to S. D. Umarani.

Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Umarani, S.D., Wahidabanu, R.S.D., Raviram, P. (2013). Isolated Word Recognition Using Enhanced MFCC and IIFs. In: Satapathy, S., Udgata, S., Biswal, B. (eds) Proceedings of the International Conference on Frontiers of Intelligent Computing: Theory and Applications (FICTA). Advances in Intelligent Systems and Computing, vol 199. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-35314-7_32

  • DOI: https://doi.org/10.1007/978-3-642-35314-7_32

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-35313-0

  • Online ISBN: 978-3-642-35314-7

  • eBook Packages: Engineering, Engineering (R0)
