Isolated Word Recognition Using Enhanced MFCC and IIFs

Umarani, S. D.; Wahidabanu, R. S. D.; Raviram, P.

doi:10.1007/978-3-642-35314-7_32

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 199)

Abstract

The main objective of this paper is to design a noise-resilient, speaker-independent speech recognition system for isolated word recognition. Mel-frequency cepstral coefficients (MFCCs) are used for feature extraction. The noise robustness of MFCCs under mismatched training and testing conditions is enhanced by applying a wavelet-based denoising algorithm, and an invariant-integration method is applied to make the MFCCs robust to variation in vocal tract length (VTL). The resulting features are called enhanced MFCC invariant-integration features (EMFCCIIFs). A classifier called the feature-finding neural network (FFNN) is used to recognize the isolated words. Results are compared with those obtained using traditional MFCC features. Experiments show that, under mismatched conditions, the EMFCCIIFs maintain a high recognition rate at low signal-to-noise ratios (SNRs) and also perform more effectively at high SNRs.
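The invariant-integration step mentioned in the abstract can be illustrated with a minimal sketch. It assumes the general form of an invariant-integration feature: a monomial of subband values averaged over a small window of shifts along the frequency-channel axis, so that a small translation of the spectrum (a rough stand-in for a VTL change) leaves the value nearly unchanged. The function name, parameters, and toy data are hypothetical, and this is not the authors' exact configuration.

```python
from math import prod


def invariant_integration_feature(frame, centers, exponents, half_window):
    """Average a monomial of channel values over shifts -W..+W.

    frame       : list of non-negative subband values for one time frame
    centers     : channel indices k_i used by the monomial (hypothetical choice)
    exponents   : integer exponents l_i, one per center
    half_window : W, the number of channels to shift in each direction
    """
    if len(centers) != len(exponents):
        raise ValueError("centers and exponents must have the same length")
    if min(centers) - half_window < 0 or max(centers) + half_window >= len(frame):
        raise ValueError("integration window leaves the channel range")

    total = 0.0
    for shift in range(-half_window, half_window + 1):
        # Monomial evaluated on the spectrum translated by `shift` channels.
        total += prod(frame[k + shift] ** l for k, l in zip(centers, exponents))
    return total / (2 * half_window + 1)


# Toy frames: the same spectral peak, moved by one channel.
frame_a = [0.0, 0.0, 0.0, 4.0, 0.0, 0.0, 0.0, 0.0]
frame_b = [0.0, 0.0, 0.0, 0.0, 4.0, 0.0, 0.0, 0.0]

feat_a = invariant_integration_feature(frame_a, [3], [1], 2)
feat_b = invariant_integration_feature(frame_b, [3], [1], 2)
```

In this toy case the peak stays inside the integration window after the shift, so `feat_a` and `feat_b` are equal. With real spectra the invariance is only approximate, and the choice of centers, exponents, and window width is a design decision the abstract does not specify.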

Author information

Authors and Affiliations

Government College of Engineering, Salem, 636011, India
S. D. Umarani & R. S. D. Wahidabanu
Department of CSE, Mahendra Engineering College, Tiruchengode, 637503, India
P. Raviram

Corresponding author

Correspondence to S. D. Umarani.

Editor information

Editors and Affiliations

Dept of Computer Science Engineering, Anil Neerukonda Institute of Technology and Sciences, Vishakapatnam, India
Suresh Chandra Satapathy
AI Lab, University of Hyderabad, Hyderabad, India
Siba K. Udgata
Bhubaneswar Engineering College, Bhubaneswar, India
Bhabendra Narayan Biswal

About this paper

Cite this paper

Umarani, S.D., Wahidabanu, R.S.D., Raviram, P. (2013). Isolated Word Recognition Using Enhanced MFCC and IIFs. In: Satapathy, S., Udgata, S., Biswal, B. (eds) Proceedings of the International Conference on Frontiers of Intelligent Computing: Theory and Applications (FICTA). Advances in Intelligent Systems and Computing, vol 199. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-35314-7_32

DOI: https://doi.org/10.1007/978-3-642-35314-7_32
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-35313-0
Online ISBN: 978-3-642-35314-7
eBook Packages: Engineering, Engineering (R0)
