MFCC-GMM based accent recognition system for Telugu speech signals

Mannepalli, Kasiprasad; Sastry, Panyam Narahari; Suman, Maloji

doi:10.1007/s10772-015-9328-y

Published: 30 November 2015

International Journal of Speech Technology, volume 19, pages 87–93 (2016)

Kasiprasad Mannepalli¹,
Panyam Narahari Sastry² &
Maloji Suman¹


Abstract

Speech processing is an important research area that includes speaker recognition, speech synthesis, speech coding, and speech noise reduction. Many languages have distinct speaking styles, called accents or dialects. Identifying the accent before speech recognition can improve the performance of speech recognition systems, and accent recognition becomes crucial when a language has many accents. Telugu is an Indian language widely spoken in the southern part of India. It has several accents, the main ones being Coastal Andhra, Telangana, and Rayalaseema. In the present work, speech samples are collected from native speakers of the different Telugu accents for both training and testing. Mel frequency cepstral coefficient (MFCC) features are extracted from each training and test sample. A Gaussian mixture model (GMM) is then used to classify the speech by accent. The overall efficiency of the proposed system in recognizing the region a speaker belongs to, based on accent, is 91%.
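The classification step described in the abstract can be illustrated with a minimal sketch. It assumes one diagonal-covariance GMM per accent and assigns a test utterance to the accent whose model gives the highest total log-likelihood over the utterance's feature frames. The abstract does not state the feature dimension, the number of mixture components, or the covariance type, so the two-dimensional "features", the parameter values, and the names `ACCENT_MODELS` and `classify_accent` are all hypothetical. In a real system the frames would be MFCC vectors and the GMM parameters would be estimated from training speech, typically by expectation-maximization.

```python
import math

def log_gaussian_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )

def gmm_log_likelihood(frames, gmm):
    """Total log-likelihood of a sequence of frames under one GMM.

    gmm is a list of (weight, mean, var) components.
    """
    total = 0.0
    for x in frames:
        # Log-sum-exp over components for numerical stability.
        logs = [math.log(w) + log_gaussian_diag(x, m, v) for w, m, v in gmm]
        peak = max(logs)
        total += peak + math.log(sum(math.exp(l - peak) for l in logs))
    return total

def classify_accent(frames, accent_models):
    """Return the best-scoring accent label and all per-accent scores."""
    scores = {
        name: gmm_log_likelihood(frames, gmm)
        for name, gmm in accent_models.items()
    }
    return max(scores, key=scores.get), scores

# Hypothetical hand-picked parameters (not from the paper): two components
# per accent, 2-D features standing in for MFCC vectors.
ACCENT_MODELS = {
    "coastal_andhra": [(0.5, [0.0, 0.0], [1.0, 1.0]), (0.5, [1.0, 1.0], [1.0, 1.0])],
    "telangana": [(0.5, [5.0, 5.0], [1.0, 1.0]), (0.5, [6.0, 4.0], [1.0, 1.0])],
    "rayalaseema": [(0.5, [-5.0, 5.0], [1.0, 1.0]), (0.5, [-4.0, 6.0], [1.0, 1.0])],
}

# Hypothetical test utterance: three frames near the "telangana" components.
test_frames = [[5.2, 4.8], [5.9, 4.1], [5.5, 4.6]]
label, scores = classify_accent(test_frames, ACCENT_MODELS)
print(label)
```

Summing per-frame log-likelihoods treats the frames as independent, which is the usual simplification in GMM-based classifiers. It ignores temporal ordering, so any accent cue carried by timing would have to be captured in the features themselves.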

Author information

Authors and Affiliations

K L University, KLEF, Guntur (Dist), Vijayawada, Andhra Pradesh, India
Kasiprasad Mannepalli & Maloji Suman
CBIT, Hyderabad, Telangana, India
Panyam Narahari Sastry

Corresponding author

Correspondence to Kasiprasad Mannepalli.

About this article

Cite this article

Mannepalli, K., Sastry, P.N. & Suman, M. MFCC-GMM based accent recognition system for Telugu speech signals. Int J Speech Technol 19, 87–93 (2016). https://doi.org/10.1007/s10772-015-9328-y

Received: 15 September 2015
Accepted: 23 November 2015
Published: 30 November 2015
Issue Date: March 2016
DOI: https://doi.org/10.1007/s10772-015-9328-y
