An Automated Classification System Based on Regional Accent

Circuits, Systems, and Signal Processing

Abstract

Identifying a speaker's native language from a segment of second-language speech, where it manifests as a distinct pattern of articulatory or prosodic behavior, is a challenging task. This paper proposes a method for classifying speakers by their regional English accent. A database of English speech spoken by native speakers of three closely related Dravidian languages was collected from a non-overlapping set of speakers, along with native-language speech data. Native speech samples from speakers of three regional languages of India, namely Kannada, Tamil, and Telugu, are used as the training set. The testing set contains non-native English utterances from compatriots of these three groups. The native language is identified automatically from spectral features of the non-native speech, which are classified using Gaussian Mixture Models (GMM), GMM-Universal Background Model (GMM-UBM), and i-vector classifiers. The GMM classifier achieved an identification accuracy of \(87.9\%\), which rose to \(90.9\%\) with the GMM-UBM method. The i-vector-based approach gave the best accuracy, \(93.9\%\), with an equal error rate (EER) of \(6.1\%\). These results are encouraging given that current state-of-the-art accuracies are around \(85\%\). The nativity identification rate for English speech is higher, at \(95.2\%\), for native speakers of Kannada than for native speakers of Tamil or Telugu.
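The GMM-based decision described in the abstract can be illustrated with a minimal sketch. It assumes one diagonal-covariance GMM per native language and assigns an utterance to the language whose model gives the highest average per-frame log-likelihood. The function names, the two-dimensional toy features, and all model parameters below are hypothetical and chosen only for illustration; the abstract does not specify the features, the mixture sizes, or the training procedure, and training is not shown here.

```python
import math

def log_gaussian_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )

def gmm_frame_loglik(x, gmm):
    """Log-likelihood of one frame under a GMM given as (weight, mean, var) tuples."""
    terms = [math.log(w) + log_gaussian_diag(x, m, v) for w, m, v in gmm]
    peak = max(terms)
    # log-sum-exp over components for numerical stability
    return peak + math.log(sum(math.exp(t - peak) for t in terms))

def utterance_score(frames, gmm):
    """Average per-frame log-likelihood of an utterance under one GMM."""
    return sum(gmm_frame_loglik(x, gmm) for x in frames) / len(frames)

def classify(frames, models):
    """Return the label whose GMM gives the highest average log-likelihood."""
    return max(models, key=lambda label: utterance_score(frames, models[label]))

# Toy, hand-picked 2-D models (hypothetical values, not taken from the paper).
models = {
    "Kannada": [(0.5, [0.0, 0.0], [1.0, 1.0]), (0.5, [1.0, 1.0], [1.0, 1.0])],
    "Tamil": [(0.5, [5.0, 5.0], [1.0, 1.0]), (0.5, [6.0, 6.0], [1.0, 1.0])],
    "Telugu": [(0.5, [-5.0, 5.0], [1.0, 1.0]), (0.5, [-6.0, 6.0], [1.0, 1.0])],
}

# A short "utterance" of three feature frames lying near the toy Tamil model.
frames = [[5.2, 5.1], [5.9, 6.3], [4.8, 5.5]]
print(classify(frames, models))
```

In a GMM-UBM variant, the per-language models would typically be adapted from a shared background model and scored as a log-likelihood ratio against it; the argmax decision over languages stays the same.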

Data Availability

The data that support the findings of this study are used strictly for research purposes and can be made available on reasonable request for academic or research use.

Cite this article

Guntur, R.K., Ramakrishnan, K. & Vinay Kumar, M. An Automated Classification System Based on Regional Accent. Circuits Syst Signal Process 41, 3487–3507 (2022). https://doi.org/10.1007/s00034-021-01948-7

  • DOI: https://doi.org/10.1007/s00034-021-01948-7
