Accent Classification of the Three Major Nigerian Indigenous Languages Using 1D CNN LSTM Network Model

Salau, Ayodeji Olalekan; Olowoyo, Tilewa David; Akinola, Solomon Oluwole

doi:10.1007/978-981-15-2620-6_1

Ayodeji Olalekan Salau (ORCID: orcid.org/0000-0002-6264-9783),
Tilewa David Olowoyo &
Solomon Oluwole Akinola

Part of the book series: Algorithms for Intelligent Systems (AIS)


Abstract

Accent identification and classification pose a major challenge for speech recognition systems, because the same words pronounced by speakers of different races are recognized differently. Similarly, in most cases it is difficult for native speakers of the same dialect to understand each other perfectly, especially if one or more of the speakers has a thick accent. This paper therefore investigates the most accent-sensitive words of the three major Nigerian indigenous languages and uses machine learning (ML) to address accent classification (AC) for the three languages. A speech-based algorithm was designed and implemented in Python. Speech data were acquired from 300 speakers, and mel-frequency cepstral coefficients (MFCCs) were employed to extract distinctive features that distinguish speakers of the three native languages. The acquired speech data were used to train a model combining a one-dimensional convolutional neural network (1D CNN) with a long short-term memory (LSTM) network (1D CNN LSTM). Experimental results show a classification accuracy of 94.9%.
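
The MFCC feature extraction step mentioned in the abstract can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation: the abstract does not state the frame length, hop size, number of mel filters, or number of coefficients, so the values below (256, 128, 20, 13) and all function names are assumptions chosen for illustration. The sketch shows the usual chain of windowing, power spectrum, triangular mel filterbank, logarithm, and DCT-II; the 1D CNN LSTM classifier that would consume these features is not shown.

```python
import math


def hz_to_mel(f):
    """Convert frequency in Hz to the mel scale (common 2595*log10 formula)."""
    return 2595.0 * math.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def power_spectrum(frame):
    """Naive DFT power spectrum of a Hamming-windowed frame (bins 0..N/2)."""
    n = len(frame)
    windowed = [x * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
                for i, x in enumerate(frame)]
    spec = []
    for k in range(n // 2 + 1):
        re = sum(x * math.cos(2 * math.pi * k * i / n) for i, x in enumerate(windowed))
        im = -sum(x * math.sin(2 * math.pi * k * i / n) for i, x in enumerate(windowed))
        spec.append((re * re + im * im) / n)
    return spec


def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters whose centres are evenly spaced on the mel scale."""
    top = hz_to_mel(sample_rate / 2.0)
    mel_points = [top * i / (n_filters + 1) for i in range(n_filters + 2)]
    bins = [int(math.floor((n_fft + 1) * mel_to_hz(m) / sample_rate))
            for m in mel_points]
    bank = []
    for j in range(1, n_filters + 1):
        left, centre, right = bins[j - 1], bins[j], bins[j + 1]
        row = [0.0] * (n_fft // 2 + 1)
        for k in range(left, centre):       # rising edge
            row[k] = (k - left) / (centre - left)
        for k in range(centre, right):      # falling edge, peak of 1.0 at centre
            row[k] = (right - k) / (right - centre)
        bank.append(row)
    return bank


def dct2(values, n_coeffs):
    """Unnormalised DCT-II, keeping the first n_coeffs coefficients."""
    n = len(values)
    return [sum(v * math.cos(math.pi * c * (i + 0.5) / n)
                for i, v in enumerate(values))
            for c in range(n_coeffs)]


def mfcc(signal, sample_rate, frame_len=256, hop=128, n_filters=20, n_coeffs=13):
    """Return one list of n_coeffs MFCCs per frame (all parameters are assumed)."""
    bank = mel_filterbank(n_filters, frame_len, sample_rate)
    features = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        spec = power_spectrum(signal[start:start + frame_len])
        energies = [sum(w * p for w, p in zip(row, spec)) for row in bank]
        log_e = [math.log(e + 1e-10) for e in energies]  # small offset avoids log(0)
        features.append(dct2(log_e, n_coeffs))
    return features


# Synthetic 440 Hz tone standing in for a recorded utterance.
sr = 8000
tone = [math.sin(2 * math.pi * 440 * t / sr) for t in range(1024)]
feats = mfcc(tone, sr)
print(len(feats), len(feats[0]))  # 7 13
```

The result is a sequence of per-frame coefficient vectors, which is the kind of time-ordered input a 1D convolution followed by an LSTM would process. The naive DFT keeps the sketch dependency-free; an FFT routine from a numerical library would normally replace it for real recordings.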



Author information

Authors and Affiliations

Department of Electrical/Electronics and Computer Engineering, Afe Babalola University, Ado-Ekiti, Nigeria
Ayodeji Olalekan Salau, Tilewa David Olowoyo & Solomon Oluwole Akinola


Editor information

Editors and Affiliations

Department of Electronics and Communication Engineering, Jaypee University of Information Technology, Solan, Himachal Pradesh, India
Shruti Jain
National Institute of Technical Teachers Training and Research, Chandigarh, India
Meenakshi Sood
Department of Biomedical Engineering, North-Eastern Hill University, Shillong, Meghalaya, India
Sudip Paul


About this chapter

Cite this chapter

Salau, A.O., Olowoyo, T.D., Akinola, S.O. (2020). Accent Classification of the Three Major Nigerian Indigenous Languages Using 1D CNN LSTM Network Model. In: Jain, S., Sood, M., Paul, S. (eds) Advances in Computational Intelligence Techniques. Algorithms for Intelligent Systems. Springer, Singapore. https://doi.org/10.1007/978-981-15-2620-6_1


DOI: https://doi.org/10.1007/978-981-15-2620-6_1
Published: 21 February 2020
Publisher Name: Springer, Singapore
Print ISBN: 978-981-15-2619-0
Online ISBN: 978-981-15-2620-6
eBook Packages: Intelligent Technologies and Robotics (R0)
