Abstract
Language identification is the task of identifying the language of a given spoken utterance. The main task in building a language identifier is to design an efficient algorithm that lets a machine correctly identify the language of a given audio sample. We propose a hybrid approach to language identification that combines Vector Quantization (VQ) and Gaussian Mixture Models (GMM). We also briefly review prior work on speaker identification using hybrid VQ-GMM approaches. We carried out experiments on identifying four Indian languages: Assamese, Bengali, Hindi and Indian English. The experiments used our own recorded standard language database, collected from 50 speakers. Speech features were extracted as Mel-frequency cepstral coefficients (MFCCs). The results show that, with the hybrid approach, accuracy is highest at the highest mixture order tested, and that accuracy increases uniformly with mixture order for all four languages. We also conclude that the hybrid approach gives better results than the baseline GMM system.
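The abstract does not spell out how VQ and GMM are combined, so the following is only a minimal sketch of one plausible combination, not the authors' implementation. It assumes the VQ codebook is used to initialise the GMM means, after which EM refines the model, and it classifies an utterance by the language model with the highest average log-likelihood. All function names are hypothetical, plain k-means stands in for VQ codebook design, diagonal covariances are assumed, and MFCC extraction is left out (any frames-by-coefficients array can be passed in).

```python
import numpy as np

def train_codebook(features, k, iters=20, seed=0):
    """Plain k-means, used here as a stand-in for VQ codebook design."""
    rng = np.random.default_rng(seed)
    codebook = features[rng.choice(len(features), size=k, replace=False)]
    for _ in range(iters):
        # Squared distance from every frame to every codeword, shape (N, k).
        dist = ((features[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            members = features[labels == j]
            if len(members):  # keep the old codeword if a cell is empty
                codebook[j] = members.mean(axis=0)
    return codebook

def log_gauss(features, means, variances):
    """Diagonal-covariance Gaussian log-density per frame and component, shape (N, k)."""
    diff = features[:, None, :] - means[None, :, :]
    log_det = np.log(2 * np.pi * variances).sum(axis=1)[None, :]
    return -0.5 * (log_det + (diff ** 2 / variances[None, :, :]).sum(axis=2))

def train_gmm(features, codebook, iters=20, var_floor=1e-3):
    """EM for a diagonal GMM whose means start at the VQ codewords (assumed hybrid step)."""
    k = codebook.shape[0]
    means = codebook.copy()
    variances = np.tile(features.var(axis=0) + var_floor, (k, 1))
    weights = np.full(k, 1.0 / k)
    for _ in range(iters):
        # E-step: responsibilities computed in the log domain for stability.
        log_p = np.log(weights)[None, :] + log_gauss(features, means, variances)
        log_norm = np.logaddexp.reduce(log_p, axis=1, keepdims=True)
        resp = np.exp(log_p - log_norm)
        # M-step: re-estimate weights, means and floored variances.
        nk = resp.sum(axis=0) + 1e-10
        weights = nk / nk.sum()
        means = (resp.T @ features) / nk[:, None]
        second_moment = (resp.T @ features ** 2) / nk[:, None]
        variances = np.maximum(second_moment - means ** 2, var_floor)
    return weights, means, variances

def avg_log_likelihood(features, model):
    """Mean per-frame log-likelihood of an utterance under one language model."""
    weights, means, variances = model
    log_p = np.log(weights)[None, :] + log_gauss(features, means, variances)
    return np.logaddexp.reduce(log_p, axis=1).mean()

def identify(features, models):
    """Return the language label whose model scores the utterance highest."""
    return max(models, key=lambda lang: avg_log_likelihood(features, models[lang]))
```

In this sketch the mixture order mentioned in the abstract corresponds to `k`, the number of codewords and hence of Gaussian components; one model would be trained per language, for example `models = {lang: train_gmm(x, train_codebook(x, k=16)) for lang, x in training_features.items()}`, where `training_features` is a hypothetical dictionary of per-language feature arrays.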
Cite this article
Roy, P., Das, P.K. A hybrid VQ-GMM approach for identifying Indian languages. Int J Speech Technol 16, 33–39 (2013). https://doi.org/10.1007/s10772-012-9152-6
DOI: https://doi.org/10.1007/s10772-012-9152-6