Amazigh speech recognition based on the Kaldi ASR toolkit

  • Original Research
International Journal of Information Technology

Abstract

In this work, we present a new approach to integrating Amazigh, a less-resourced language, into an isolated-word speech recognition system built on the open-source Kaldi platform. The system recognizes the first ten Amazigh digits and ten most-used daily Amazigh isolated words, which exhibit typical syllabic structure and are considered a good representative sample of the Amazigh language. It was implemented using Hidden Markov Models (HMMs) with different numbers of Gaussian distributions. In addition, we evaluated the system's performance while varying the feature extraction method in order to determine which method gives the best performance. The best result, 93.96%, was obtained with the Mel Frequency Cepstral Coefficients (MFCCs) technique.
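
For readers unfamiliar with MFCCs, the following is a minimal sketch of the textbook pipeline for a single frame (Hamming window, power spectrum, triangular mel filterbank, logarithm, DCT-II). It is not the authors' configuration: the function names, the 8 kHz sample rate, the 256-sample frame, the 20 filters and the 13 coefficients are all assumptions chosen for illustration, and a toolkit's own MFCC implementation typically adds further steps such as pre-emphasis and liftering.

```python
import cmath
import math


def hz_to_mel(hz):
    """Convert frequency in Hz to the mel scale (common 2595*log10 form)."""
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def power_spectrum(frame):
    """Hamming-window one frame and return |DFT|^2 / N for bins 0..N/2."""
    n = len(frame)
    windowed = [s * (0.54 - 0.46 * math.cos(2.0 * math.pi * i / (n - 1)))
                for i, s in enumerate(frame)]
    spectrum = []
    for k in range(n // 2 + 1):
        # Naive DFT: slow, but dependency-free and fine for one short frame.
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(windowed))
        spectrum.append(abs(acc) ** 2 / n)
    return spectrum


def mel_filterbank(num_filters, n_fft, sample_rate):
    """Triangular filters whose centres are evenly spaced on the mel scale."""
    top = hz_to_mel(sample_rate / 2.0)
    mel_points = [top * i / (num_filters + 1) for i in range(num_filters + 2)]
    bins = [int(math.floor((n_fft + 1) * mel_to_hz(m) / sample_rate))
            for m in mel_points]
    bank = []
    for j in range(1, num_filters + 1):
        left, centre, right = bins[j - 1], bins[j], bins[j + 1]
        row = [0.0] * (n_fft // 2 + 1)
        for k in range(left, centre):      # rising edge
            row[k] = (k - left) / (centre - left)
        for k in range(centre, right):     # falling edge, 1.0 at the centre
            row[k] = (right - k) / (right - centre)
        bank.append(row)
    return bank


def mfcc_frame(frame, sample_rate, num_filters=20, num_ceps=13):
    """Return num_ceps cepstral coefficients for one frame of samples."""
    spec = power_spectrum(frame)
    bank = mel_filterbank(num_filters, len(frame), sample_rate)
    # Floor the filter energies so the logarithm is always defined.
    log_energies = [math.log(max(sum(w * p for w, p in zip(row, spec)), 1e-10))
                    for row in bank]
    # Unnormalised DCT-II decorrelates the log filterbank energies.
    return [sum(e * math.cos(math.pi * c * (m + 0.5) / num_filters)
                for m, e in enumerate(log_energies))
            for c in range(num_ceps)]


# Hypothetical input: one 256-sample frame of a 1 kHz tone sampled at 8 kHz.
SAMPLE_RATE = 8000
frame = [math.sin(2.0 * math.pi * 1000.0 * i / SAMPLE_RATE) for i in range(256)]
coeffs = mfcc_frame(frame, SAMPLE_RATE)
print(len(coeffs), "coefficients for this frame")
```

In a full recognizer, such vectors would be computed for every overlapping frame of an utterance and passed to the HMM acoustic models as observations.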

Data availability

The dataset generated during the current study is not publicly available because it is laboratory-specific data.

Funding

The authors declare they have no financial interests.

Author information

Corresponding author

Correspondence to Mohamed Hamidi.

Ethics declarations

Conflict of interest

All authors declare that there is no conflict of interest.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Barkani, F., Hamidi, M., Laaidi, N. et al. Amazigh speech recognition based on the Kaldi ASR toolkit. Int. j. inf. tecnol. 15, 3533–3540 (2023). https://doi.org/10.1007/s41870-023-01354-z

  • DOI: https://doi.org/10.1007/s41870-023-01354-z
