International Journal of Speech Technology, Volume 21, Issue 4, pp 1071–1090

Arabic discourse analysis based on acoustic, prosodic and phonetic modeling: elocution evaluation, speech classification and pathological speech correction

  • Mohsen Maraoui
  • Naim Terbeh
  • Mounir Zrigui


This work presents a complete study of pathological speech processing. It focuses mainly on speech correction and on assisting learners of Arabic vocabulary. The study proceeds in five phases. The first evaluates the produced speech by assigning each speaker a pronunciation level according to their forced alignment score. The second classifies the produced Arabic speech as healthy or pathological using two different models: a prosodic model based on elocution speed, and a phonetic model that compares a reference Probabilistic-Phonetic Model with a speaker model. Third, for each speech sequence classified as pathological, we localize the problematic phonemes that degrade pronunciation. We also distinguish two factors that can distort the produced acoustic signal: degraded speech may result from a pathological problem, or it may be produced by non-native Arabic speakers. For this purpose, we rely on forced alignment scores. Fourth, we develop a new algorithm to correct pathological pronunciation, for which we opt for two different solutions: lexical and phonetic. The last phase is the design of an application that helps learners of Arabic vocabulary improve their pronunciation. The results achieved are encouraging: the evaluation and classification of produced acoustic signals are satisfactory, and learners of Arabic vocabulary showed good improvement using the developed application. Many applications, such as voice signal processing systems and e-learning platforms, can benefit from our proposal.
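The classification and localization phases described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify how the Probabilistic-Phonetic Model is built or compared, so this sketch assumes a simple unigram phoneme distribution, a total variation distance, and hypothetical thresholds (`max_distance`, `speed_range`). All function names are invented for illustration.

```python
from collections import Counter


def phoneme_distribution(phonemes):
    """Relative frequency of each phoneme in a transcribed sequence."""
    counts = Counter(phonemes)
    total = sum(counts.values())
    return {p: c / total for p, c in counts.items()}


def model_distance(reference, speaker):
    """Total variation distance between two distributions (0 = identical, 1 = disjoint)."""
    keys = set(reference) | set(speaker)
    return 0.5 * sum(abs(reference.get(k, 0.0) - speaker.get(k, 0.0)) for k in keys)


def elocution_speed(n_phonemes, duration_s):
    """Prosodic feature: phonemes produced per second."""
    return n_phonemes / duration_s


def classify(reference, speaker_phonemes, duration_s,
             max_distance=0.2, speed_range=(8.0, 16.0)):
    """Label speech 'healthy' only if both the phonetic and prosodic checks pass.

    max_distance and speed_range are hypothetical values, not taken from the paper.
    """
    speaker = phoneme_distribution(speaker_phonemes)
    distance = model_distance(reference, speaker)
    speed = elocution_speed(len(speaker_phonemes), duration_s)
    healthy = distance <= max_distance and speed_range[0] <= speed <= speed_range[1]
    return "healthy" if healthy else "pathological"


def problematic_phonemes(reference, speaker_phonemes, top_n=3):
    """Phonemes whose frequency deviates most from the reference model."""
    speaker = phoneme_distribution(speaker_phonemes)
    keys = set(reference) | set(speaker)
    deviation = {k: abs(reference.get(k, 0.0) - speaker.get(k, 0.0)) for k in keys}
    # Sort by largest deviation first; break ties alphabetically for determinism.
    return sorted(deviation, key=lambda k: (-deviation[k], k))[:top_n]


# Toy usage: the speaker systematically replaces phoneme "b" with "a".
reference_model = phoneme_distribution(list("abcabcabcabc"))
label = classify(reference_model, list("aacaacaacaac"), duration_s=1.0)
suspects = problematic_phonemes(reference_model, list("aacaacaacaac"), top_n=2)
```

In this toy case the substituted phoneme and the one that replaces it are the two with the largest deviation, which is the kind of signal a localization step could pass on to a lexical or phonetic correction stage.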


Healthy speech; Pathological speech; Probabilistic-phonetic model; Forced alignment score; Elocution speed; Pronunciation evaluation; Speech correction; Voice pathologies; Non-native speakers; Arabic vocabulary learning



The editors and reviewers of the International Journal of Speech Technology are thanked for their criticism, remarks and comments, which improved the quality of this paper. My supervisors, Mr. Mounir Zrigui and Mr. Mohsen Maraoui, are also thanked for their valuable support in completing this work.



Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  1. LaTICE Laboratory, Monastir, Tunisia
  2. Algebra, Number Theory and Nonlinear Analysis Laboratory, Monastir, Tunisia
