Amazigh speech recognition based on the Kaldi ASR toolkit

Barkani, Fatima; Hamidi, Mohamed; Laaidi, Naouar; Zealouk, Ouissam; Satori, Hassan; Satori, Khalid

doi:10.1007/s41870-023-01354-z

Original Research
Published: 22 June 2023

Volume 15, pages 3533–3540 (2023)
International Journal of Information Technology

Fatima Barkani¹,
Mohamed Hamidi² (ORCID: orcid.org/0000-0003-2487-5517),
Naouar Laaidi¹,
Ouissam Zealouk¹,
Hassan Satori¹ &
Khalid Satori¹

Abstract

In this work, we present a new approach to integrating Amazigh, a less-resourced language, into an isolated-word speech recognition system built on the open-source Kaldi platform. The system recognizes the first ten Amazigh digits and ten of the most-used everyday Amazigh isolated words, which exhibit typical syllabic structure and are considered a good representative sample of the Amazigh language. The system was implemented using Hidden Markov Models (HMMs) with different numbers of Gaussian distributions. In addition, we evaluated the system's performance while varying the feature extraction method in order to determine which method gives the best performance. The best result, 93.96%, was obtained with the Mel Frequency Cepstral Coefficients (MFCCs) technique.
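To make the idea of isolated-word recognition with HMMs and Gaussian mixtures concrete, the following is a minimal sketch in plain Python. It is not the authors' Kaldi recipe: the left-to-right topology, the two-component mixtures, the one-dimensional toy features, and the labels `word_a` and `word_b` are all assumptions chosen for illustration, and Kaldi itself decodes with trained models over real feature vectors such as MFCCs. The sketch scores a feature sequence against one HMM per word with the forward algorithm and returns the best-scoring word.

```python
import math

NEG_INF = float("-inf")

def safe_log(p):
    """Natural log that maps probability 0 to -inf instead of raising."""
    return math.log(p) if p > 0 else NEG_INF

def log_sum_exp(values):
    """Numerically stable log(sum(exp(v))) for a list of log values."""
    m = max(values)
    if m == NEG_INF:
        return NEG_INF
    return m + math.log(sum(math.exp(v - m) for v in values))

def log_gaussian(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(-0.5 * (math.log(2 * math.pi * v) + (xi - mu) ** 2 / v)
               for xi, mu, v in zip(x, mean, var))

def log_gmm(x, mixture):
    """Log density of a Gaussian mixture given as (weight, mean, var) tuples."""
    return log_sum_exp([safe_log(w) + log_gaussian(x, mu, var)
                        for w, mu, var in mixture])

def forward_log_likelihood(frames, hmm):
    """Forward algorithm in the log domain: log P(frames | hmm)."""
    n = len(hmm["emit"])
    alpha = [hmm["start"][s] + log_gmm(frames[0], hmm["emit"][s])
             for s in range(n)]
    for x in frames[1:]:
        alpha = [log_sum_exp([alpha[p] + hmm["trans"][p][s] for p in range(n)])
                 + log_gmm(x, hmm["emit"][s])
                 for s in range(n)]
    return log_sum_exp(alpha)

def make_model(state_means):
    """Hypothetical left-to-right HMM with a 2-component GMM per state."""
    n = len(state_means)
    start = [safe_log(1.0 if s == 0 else 0.0) for s in range(n)]
    trans = [[NEG_INF] * n for _ in range(n)]
    for s in range(n):
        if s + 1 < n:
            trans[s][s] = safe_log(0.5)      # stay in the current state
            trans[s][s + 1] = safe_log(0.5)  # advance to the next state
        else:
            trans[s][s] = safe_log(1.0)      # final state loops on itself
    emit = [[(0.5, [m - 0.5], [1.0]), (0.5, [m + 0.5], [1.0])]
            for m in state_means]
    return {"start": start, "trans": trans, "emit": emit}

def recognize(frames, models):
    """Return the word whose HMM assigns the highest likelihood to frames."""
    return max(models, key=lambda w: forward_log_likelihood(frames, models[w]))

# Toy models and a toy utterance (illustrative values, not real data).
models = {"word_a": make_model([0.0, 5.0]), "word_b": make_model([5.0, 0.0])}
frames = [[0.1], [-0.2], [4.8], [5.3]]
print(recognize(frames, models))  # word_a
```

Changing the number of tuples in each state's mixture corresponds to varying the number of Gaussians per state, and swapping the toy `frames` for a different front end corresponds to varying the feature extraction method.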

Data availability

The dataset generated during the current study is not publicly available because it is laboratory-specific data.

Funding

The authors declare they have no financial interests.

Author information

Authors and Affiliations

Laboratory of Computer Science, Signals, Automation and Cognitivism, Faculty of Sciences Dhar Mahraz, University Sidi Mohamed Ben Abdellah, Fez, Morocco
Fatima Barkani, Naouar Laaidi, Ouissam Zealouk, Hassan Satori & Khalid Satori
Team of modeling and scientific computing, Pluridisciplinary Faculty of Nador, Mohammed First University, Oujda, Morocco
Mohamed Hamidi

Corresponding author

Correspondence to Mohamed Hamidi.

Ethics declarations

Conflict of interest

All authors declare that there is no conflict of interest.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Barkani, F., Hamidi, M., Laaidi, N. et al. Amazigh speech recognition based on the Kaldi ASR toolkit. Int. j. inf. tecnol. 15, 3533–3540 (2023). https://doi.org/10.1007/s41870-023-01354-z

Received: 11 January 2023
Accepted: 11 June 2023
Published: 22 June 2023
Issue Date: October 2023
DOI: https://doi.org/10.1007/s41870-023-01354-z
