An Automated Classification System Based on Regional Accent

Guntur, Radha Krishna; Ramakrishnan, Krishnan; Vinay Kumar, Mittal

doi:10.1007/s00034-021-01948-7

Published: 12 January 2022

Volume 41, pages 3487–3507, (2022)
Circuits, Systems, and Signal Processing

Radha Krishna Guntur ORCID: orcid.org/0000-0003-1153-2240¹,
Krishnan Ramakrishnan² &
Mittal Vinay Kumar³

Abstract

Identifying a speaker's native language from a segment of second-language speech, where it is manifested as a distinct pattern of articulatory or prosodic behavior, is a challenging task. This paper proposes a method for classifying speakers by their regional English accent. A database of English speech, spoken by native speakers of three closely related Dravidian languages, was collected from a non-overlapping set of speakers, along with native-language speech data. Native speech samples from speakers of three regional languages of India, namely Kannada, Tamil, and Telugu, are used for the training set. The testing set contains non-native English utterances from compatriots belonging to the same three groups. The native language is identified automatically from spectral features of the non-native speech, which are classified using Gaussian Mixture Models (GMM), the GMM-Universal Background Model (GMM-UBM), and i-vectors. The GMM classifier gave an identification accuracy of \(87.9\%\), which rose to \(90.9\%\) with the GMM-UBM method. The i-vector-based approach gave the best accuracy, \(93.9\%\), with an equal error rate (EER) of \(6.1\%\). These results are encouraging, given that current state-of-the-art accuracies are around \(85\%\). The identification rate of nativity from English speech is highest, at \(95.2\%\), for native speakers of Kannada, compared with native speakers of Tamil or Telugu.
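
The decision rule behind the GMM classifier mentioned in the abstract can be illustrated with a minimal sketch: each native-language group gets its own mixture model, and a test utterance is assigned to the group whose model gives the highest average frame log-likelihood. This is not the authors' implementation. The function names, the 2-D toy features, and all mixture parameters below are made up for illustration; a real system would train the models on spectral features such as MFCCs.

```python
import math


def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature frame x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def gmm_frame_loglik(x, gmm):
    """Log-likelihood of one frame under a GMM given as (weight, mean, var) tuples."""
    terms = [math.log(w) + log_gauss_diag(x, m, v) for w, m, v in gmm]
    peak = max(terms)  # log-sum-exp for numerical stability
    return peak + math.log(sum(math.exp(t - peak) for t in terms))


def classify(frames, models):
    """Return the label with the highest average frame log-likelihood, plus all scores."""
    scores = {
        label: sum(gmm_frame_loglik(x, gmm) for x in frames) / len(frames)
        for label, gmm in models.items()
    }
    return max(scores, key=scores.get), scores


# Hypothetical two-component models in a toy 2-D feature space (assumed values).
MODELS = {
    "Kannada": [(0.5, [0.0, 0.0], [1.0, 1.0]), (0.5, [1.0, 1.0], [1.0, 1.0])],
    "Tamil": [(0.6, [4.0, 0.0], [1.0, 1.0]), (0.4, [5.0, 1.0], [1.0, 1.0])],
    "Telugu": [(0.5, [0.0, 4.0], [1.0, 1.0]), (0.5, [1.0, 5.0], [1.0, 1.0])],
}

# A toy "utterance" of three feature frames.
utterance = [[4.2, 0.3], [4.8, 0.9], [3.9, -0.2]]
label, scores = classify(utterance, MODELS)
print(label, scores)
```

Averaging over frames keeps scores comparable across utterances of different lengths. The GMM-UBM and i-vector systems named in the abstract build on the same mixture-model likelihoods but add a shared background model and a low-dimensional utterance representation, which this sketch does not attempt.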

Data Availability

The data that support the findings of this study are used strictly for research purposes and can be made available on reasonable request for academic or research use.

Author information

Authors and Affiliations

Department of ECE, VNRVJIET, Hyderabad, India
Radha Krishna Guntur
Centre for CYS, Amritha University, Coimbatore, India
Krishnan Ramakrishnan
GITAM School of Technology, Hyderabad, India
Mittal Vinay Kumar

About this article

Cite this article

Guntur, R.K., Ramakrishnan, K. & Vinay Kumar, M. An Automated Classification System Based on Regional Accent. Circuits Syst Signal Process 41, 3487–3507 (2022). https://doi.org/10.1007/s00034-021-01948-7

Received: 19 February 2020
Revised: 16 December 2021
Accepted: 17 December 2021
Published: 12 January 2022
Issue Date: June 2022
DOI: https://doi.org/10.1007/s00034-021-01948-7
