Abstract
Accent identification and classification pose a major challenge for speech recognition systems, because the same words are recognized differently when pronounced by speakers of different races. Similarly, native speakers of the same dialect often find it difficult to understand each other perfectly, especially if one or more of them has a thick accent. This paper therefore investigates the most accent-sensitive words of the three major Nigerian indigenous languages and uses machine learning (ML) to address accent classification (AC) for the three languages. A speech-based algorithm was designed and implemented in Python. Speech data were acquired from 300 speakers, and mel-frequency cepstral coefficients (MFCCs) were used to extract distinctive features that distinguish speakers of the three native languages. The acquired speech data were used to train a model combining a one-dimensional convolutional neural network (1D CNN) with a long short-term memory (LSTM) network (1D CNN LSTM). Experimental results show a classification accuracy of 94.9%.
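The MFCC feature extraction step mentioned in the abstract can be illustrated with a minimal, standard-library-only Python sketch for a single audio frame. The abstract does not state the frame length, sample rate, number of mel filters, or number of coefficients, so the values used here (256 samples, 8000 Hz, 20 filters, 13 coefficients) are assumptions chosen for illustration, and the function names are hypothetical. This is not the authors' implementation; in practice a dedicated audio library and a fast Fourier transform would likely be used.

```python
import cmath
import math


def hz_to_mel(hz):
    """Convert a frequency in Hz to the mel scale."""
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def power_spectrum(frame):
    """Hamming-window a frame and return its one-sided power spectrum (naive DFT)."""
    n = len(frame)
    windowed = [x * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
                for i, x in enumerate(frame)]
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(windowed))
        spectrum.append(abs(acc) ** 2 / n)
    return spectrum


def mel_filterbank(num_filters, n_fft, sample_rate):
    """Build triangular filters whose centres are evenly spaced on the mel scale."""
    top = hz_to_mel(sample_rate / 2.0)
    mel_points = [top * i / (num_filters + 1) for i in range(num_filters + 2)]
    bins = [int(math.floor((n_fft + 1) * mel_to_hz(m) / sample_rate))
            for m in mel_points]
    filters = []
    for j in range(1, num_filters + 1):
        left, centre, right = bins[j - 1], bins[j], bins[j + 1]
        row = [0.0] * (n_fft // 2 + 1)
        for k in range(left, centre):       # rising slope
            row[k] = (k - left) / (centre - left)
        for k in range(centre, right):      # falling slope
            row[k] = (right - k) / (right - centre)
        filters.append(row)
    return filters


def mfcc_frame(frame, sample_rate, num_filters=20, num_coeffs=13):
    """Return MFCCs for one frame: power spectrum, mel filtering, log, DCT-II."""
    power = power_spectrum(frame)
    log_energies = []
    for row in mel_filterbank(num_filters, len(frame), sample_rate):
        energy = sum(w * p for w, p in zip(row, power))
        log_energies.append(math.log(max(energy, 1e-10)))  # floor avoids log(0)
    # DCT-II decorrelates the log filterbank energies.
    return [sum(e * math.cos(math.pi * c * (m + 0.5) / num_filters)
                for m, e in enumerate(log_energies))
            for c in range(num_coeffs)]


# Assumed example input: one 256-sample frame of a 440 Hz tone at 8000 Hz.
sample_rate = 8000
frame = [math.sin(2 * math.pi * 440 * t / sample_rate) for t in range(256)]
coeffs = mfcc_frame(frame, sample_rate)
```

Applying `mfcc_frame` to successive overlapping frames of an utterance would give a sequence of coefficient vectors, which is the kind of input a sequence model such as a 1D CNN followed by an LSTM consumes.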
Copyright information
© 2020 Springer Nature Singapore Pte Ltd.
About this chapter
Cite this chapter
Salau, A.O., Olowoyo, T.D., Akinola, S.O. (2020). Accent Classification of the Three Major Nigerian Indigenous Languages Using 1D CNN LSTM Network Model. In: Jain, S., Sood, M., Paul, S. (eds) Advances in Computational Intelligence Techniques. Algorithms for Intelligent Systems. Springer, Singapore. https://doi.org/10.1007/978-981-15-2620-6_1
DOI: https://doi.org/10.1007/978-981-15-2620-6_1
Publisher Name: Springer, Singapore
Print ISBN: 978-981-15-2619-0
Online ISBN: 978-981-15-2620-6
eBook Packages: Intelligent Technologies and Robotics (R0)