A comparative study of different features for isolated spoken word recognition using HMM with reference to Assamese language

Bharali, Sruti Sruba; Kalita, Sanjib Kr.

doi:10.1007/s10772-015-9311-7

Published: 19 October 2015

International Journal of Speech Technology
Volume 18, pages 673–684 (2015)

Sruti Sruba Bharali¹ & Sanjib Kr. Kalita¹

Abstract

This paper describes the implementation of a speaker-independent, isolated-word recognizer for the Assamese language. Four kinds of acoustic features are compared: linear predictive coding (LPC) coefficients, LPC cepstral coefficients (LPCEPSTRA), linear mel-filter bank channel outputs, and mel frequency cepstral coefficients (MFCC). The recognition models are hidden Markov models (HMMs) built with the hidden Markov model toolkit (HTK). The recognizer is trained on 10 Assamese words representing the digits from 0 (shounya) to 9 (no), using fifteen speakers. Several models were created for each word, varying in the number of input feature values and the number of hidden states. With clean data, the system reached a maximum accuracy of 80 % using 39 MFCC features and a 7-state HMM with 5 hidden states; with noisy data, it reached a maximum accuracy of 95 % using 26 LPCEPSTRA features and a 7-state HMM with 5 hidden states.
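To make the "7-state HMM with 5 hidden states" wording concrete, the following is a minimal Python sketch of an HTK-style left-to-right topology, in which the first and last states are non-emitting entry and exit states. The function names, the initial self-loop probability of 0.6, and the assumption that the 39 and 26 feature counts arise from 13 static coefficients plus delta (and acceleration) terms are illustrative choices, not details given in the abstract.

```python
def left_to_right_transitions(num_states=7, self_loop=0.6):
    """Return an initial left-to-right transition matrix in the HTK style.

    State 0 (entry) and state num_states - 1 (exit) are non-emitting, so a
    7-state model has 5 emitting states. self_loop is a hypothetical
    starting value that training would re-estimate.
    """
    if num_states < 3:
        raise ValueError("need an entry state, an exit state, and one emitting state")
    if not 0.0 < self_loop < 1.0:
        raise ValueError("self_loop must lie strictly between 0 and 1")

    a = [[0.0] * num_states for _ in range(num_states)]
    a[0][1] = 1.0  # the entry state always moves to the first emitting state
    for i in range(1, num_states - 1):
        a[i][i] = self_loop            # stay in the current emitting state
        a[i][i + 1] = 1.0 - self_loop  # advance to the next state
    # The exit row is left as all zeros, as in an HTK prototype definition.
    return a


def feature_dim(num_static=13, derivative_order=2):
    """Vector size for static coefficients plus derivative streams.

    derivative_order=1 appends deltas; 2 appends deltas and accelerations.
    The default of 13 static coefficients is an assumption.
    """
    if derivative_order not in (0, 1, 2):
        raise ValueError("derivative_order must be 0, 1, or 2")
    return num_static * (1 + derivative_order)


if __name__ == "__main__":
    trans = left_to_right_transitions()
    emitting = [i for i, row in enumerate(trans) if row[i] > 0.0]
    print("emitting states:", emitting)
    print("39-feature case:", feature_dim(13, 2))
    print("26-feature case:", feature_dim(13, 1))
```

Under these assumptions, each emitting state can only repeat or advance, which matches the usual choice for short isolated words such as digits; one such model would be trained per word.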

Author information

Authors and Affiliations

Gauhati University, Guwahati, Assam, India
Sruti Sruba Bharali & Sanjib Kr. Kalita

Corresponding author

Correspondence to Sruti Sruba Bharali.

About this article

Cite this article

Bharali, S.S., Kalita, S.K. A comparative study of different features for isolated spoken word recognition using HMM with reference to Assamese language. Int J Speech Technol 18, 673–684 (2015). https://doi.org/10.1007/s10772-015-9311-7

Received: 06 December 2014
Accepted: 13 October 2015
Published: 19 October 2015
Issue Date: December 2015
DOI: https://doi.org/10.1007/s10772-015-9311-7
