Towards a Speech Recognizer for Multiple Languages Using Arabic Acoustic Model: Application to Amazigh Language

Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 782)

Abstract

The construction of acoustic models for a language, as used in automatic speech recognition (ASR) systems, is a mature technology that can be applied without great difficulty when large speech and text corpora are available. However, such resources do not exist for many languages, known as "less-resourced languages". An alternative solution is to exploit the phonetic structures shared between languages to build an acoustic model for the target language.

In this paper, we report an experiment in this direction: we used an acoustic model of the Arabic language to create one for the Amazigh language. The originality of our work lies in addressing Amazigh, which has become an official language of Morocco but still lacks sufficient resources for automatic speech recognition. In addition, the two languages share several phonemes and other characteristics. The resulting system achieves a word recognition rate of about 73%. The potential and effectiveness of the proposed approach are demonstrated through experiments and comparison with other approaches.
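The core idea of reusing a source-language acoustic model is to express each target-language word in the phone set of the source model, typically via a pronunciation dictionary. The sketch below illustrates this with a tiny Amazigh-to-Arabic grapheme-to-phone mapping; the phone symbols and the mapping itself are hypothetical simplifications for illustration, not the mapping used in the paper.

```python
# Sketch: build a target-language (Amazigh) pronunciation dictionary
# expressed in the phone set of a source-language (Arabic) acoustic model.
# The grapheme-to-phone table below is a hypothetical simplification.

AMAZIGH_TO_ARABIC_PHONES = {
    "a": "AE", "z": "Z", "u": "UW", "l": "L",
    "m": "M", "i": "IY", "gh": "GH", "t": "T",
}

def to_phones(word: str) -> list[str]:
    """Greedy longest-match grapheme segmentation, then phone lookup."""
    phones, i = [], 0
    # Try longer graphemes (e.g. "gh") before single letters.
    graphemes = sorted(AMAZIGH_TO_ARABIC_PHONES, key=len, reverse=True)
    while i < len(word):
        for g in graphemes:
            if word.startswith(g, i):
                phones.append(AMAZIGH_TO_ARABIC_PHONES[g])
                i += len(g)
                break
        else:
            raise ValueError(f"no phone mapping for {word[i]!r}")
    return phones

def dict_entry(word: str) -> str:
    """One line in CMU Sphinx pronunciation-dictionary format: WORD PH PH ..."""
    return f"{word} {' '.join(to_phones(word))}"
```

For example, `dict_entry("azul")` yields `azul AE Z UW L`; a file of such lines can then serve as the recognizer's dictionary while the Arabic acoustic model supplies the phone models.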

Keywords

Automatic speech recognition · Acoustic model · Arabic · Amazigh · CMU Sphinx


Copyright information

© Springer International Publishing AG 2018

Authors and Affiliations

  1. OFPPT, ISTA Meknès, Meknès, Morocco
  2. Faculté des Sciences Dhar El Mehraz, Fès, Morocco
