Arabic Text-to-Speech Service with Syrian Dialect

  • Conference paper
Intelligent Decision Technologies (KESIDT 2023)

Part of the book series: Smart Innovation, Systems and Technologies ((SIST,volume 352))

Abstract

This research aims to develop an Arabic text-to-speech (TTS) service for the Syrian dialect, a variety of Arabic spoken in Syria and some neighboring countries. The service is intended to be easily accessible to people who have disabilities or difficulty reading Arabic, such as people with visual impairments or learning disabilities. To achieve this goal, we employ two state-of-the-art machine learning (ML) approaches, Tacotron 2 and Transformers, which have achieved impressive results in various natural language processing tasks, including TTS. We compared the two approaches and evaluated the resulting TTS service using subjective measures. Our results show that both approaches can produce high-quality speech in the Syrian dialect, but the Transformer approach has the advantage of being more efficient and more flexible in handling different languages and accents.
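
The abstract does not say which subjective measures were used. As an illustration only, the sketch below assumes a Mean Opinion Score (MOS) listening test, a common subjective measure for TTS in which listeners rate each utterance from 1 (bad) to 5 (excellent). The function name, the system labels, and every rating are hypothetical and are not taken from the paper.

```python
import math
from statistics import mean, stdev


def mean_opinion_score(ratings):
    """Return (MOS, 95% confidence half-width) for 1-5 listener ratings."""
    if len(ratings) < 2:
        raise ValueError("need at least two ratings")
    if any(r < 1 or r > 5 for r in ratings):
        raise ValueError("ratings must be on the 1-5 scale")
    mos = mean(ratings)
    # Normal-approximation interval; 1.96 is the 95% z-value.
    half_width = 1.96 * stdev(ratings) / math.sqrt(len(ratings))
    return mos, half_width


# Hypothetical ratings, invented for illustration (not results from the paper).
ratings_by_system = {
    "system_a": [4, 4, 5, 3, 4, 4],
    "system_b": [4, 5, 4, 4, 3, 5],
}

for name, ratings in ratings_by_system.items():
    mos, ci = mean_opinion_score(ratings)
    print(f"{name}: MOS = {mos:.2f} +/- {ci:.2f}")
```

With only a few ratings per system the interval is wide, so two systems would typically be called different only when their intervals do not overlap.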

Author information

Corresponding author

Correspondence to Ali Mohammad.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

Cite this paper

Saleh, H., Mohammad, A., Jafar, K., Solieman, M., Ahmad, B., Hasan, S. (2023). Arabic Text-to-Speech Service with Syrian Dialect. In: Czarnowski, I., Howlett, R., Jain, L.C. (eds) Intelligent Decision Technologies. KESIDT 2023. Smart Innovation, Systems and Technologies, vol 352. Springer, Singapore. https://doi.org/10.1007/978-981-99-2969-6_10
