Towards a High-Quality Lemma-Based Text to Speech System for the Arabic Language

  • Oumaima Zine
  • Abdelouafi Meziane
  • Mohamed Boudchiche
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 782)

Abstract

Recent estimates put the number of native Arabic speakers at around 250 million, making Arabic the fifth most spoken language by number of speakers. It has therefore attracted the interest of researchers in speech technologies, in particular speech recognition and speech synthesis. Many researchers are still working on Arabic Text To Speech (TTS) with the aim of delivering intelligible, close-to-natural systems. Nevertheless, most of the available free and semi-free Arabic TTS systems still fall short of the naturalness of a human voice, and generating smooth speech remains a challenge. The primary aim of this work is to improve the quality of the speech produced by the sub-segment-based approach proposed in our previous work. To this end, a lemma-based approach to concatenative TTS synthesis is adopted and presented in this paper. In this context, a study of Arabic lemma frequency was conducted to identify the most frequent lemmas in written and spoken Classical Arabic and Modern Standard Arabic (MSA). The study analyzes roughly 65 million fully vocalized words obtained from the Tashkila corpus, Nemlar, and Al Jazeera, which together cover both modern and classical Arabic. As a result, an Arabic lemmatized frequency list was generated. The 1,000 most frequent lemmas were found to provide approximately 79% coverage of Arabic words and were therefore used as the basic acoustic units of our Text To Speech system. Finally, we show that this approach improves the intelligibility and naturalness of a Text To Speech system, achieving an overall rating of 4.5 out of 5.
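
The coverage figure quoted in the abstract can be read as the share of corpus tokens whose lemma belongs to the N most frequent lemmas. The following is a minimal sketch of that computation, not the authors' pipeline: the function name `top_n_coverage` and the transliterated toy lemmas are hypothetical, and the input is assumed to be a stream of already lemmatized tokens.

```python
from collections import Counter
from typing import Iterable


def top_n_coverage(lemmas: Iterable[str], n: int) -> float:
    """Return the fraction of tokens whose lemma is among the n most frequent lemmas."""
    counts = Counter(lemmas)
    total = sum(counts.values())
    if total == 0:
        return 0.0  # empty corpus: nothing to cover
    covered = sum(freq for _, freq in counts.most_common(n))
    return covered / total


# Toy lemmatized token stream (hypothetical data, not taken from the paper's corpora).
toy_lemmas = ["kitab", "qala", "qala", "fi", "fi", "fi", "min", "kitab", "qala", "fi"]

# Share of tokens covered by the two most frequent lemmas ("fi" and "qala").
print(top_n_coverage(toy_lemmas, 2))
```

On a real lemmatized corpus, calling the same function with `n=1000` would give the kind of coverage ratio the abstract reports, provided the lemmatization step is done beforehand.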

Keywords

Text to speech; Arabic language; Lemma frequency; Speech corpus; Speech synthesis; Unit selection; Concatenative synthesis; Sub-segments

Notes

Acknowledgment

The authors gratefully thank the Masmoo3 Team for providing the Arabic audio files used to build our speech corpus.

Copyright information

© Springer International Publishing AG 2018

Authors and Affiliations

  • Oumaima Zine (1)
  • Abdelouafi Meziane (1)
  • Mohamed Boudchiche (1)
  1. Department of Mathematics and Computer Science, Mohammed First University, Oujda, Morocco
