Automatic Robust Rule-Based Phonetization of Standard Arabic

  • Conference paper
Text, Speech, and Dialogue (TSD 2015)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9302)

Abstract

Phonetization is the process of encoding language sounds using phonetic symbols. It is used in many natural language processing tasks, such as speech processing, speech synthesis, and computer-aided pronunciation assessment. A common phonetization approach is the use of letter-to-sound rules developed by linguists for the transcription from orthography to sound. In this paper, we address the problem of rule-based phonetization of standard Arabic. The contributions of the paper can be summarized as follows: 1) We discuss the transcription rules of standard Arabic that have been used in the literature at the phonemic and phonetic levels. 2) We suggest important improvements to these rules and test the resulting rule set on large datasets. 3) We present a reliable automatic phonetic transcription of standard Arabic on five levels: phoneme, allophone, syllable, word, and sentence. An encoding that covers all sounds of standard Arabic is proposed, and several pronunciation dictionaries were automatically generated. These dictionaries were manually verified, yielding an accuracy of 100% on standard Arabic texts that do not contain dates, numbers, acronyms, abbreviations, or special symbols. They are available for research purposes along with the software package that performs the automatic transcription.
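
To make the idea of letter-to-sound rules concrete, the following is a minimal sketch of rule-based phonetization for fully diacritized Arabic words. It is not the rule set or the encoding proposed in the paper: the function name `phonetize`, the small consonant table, the ad hoc Latin phone symbols, and the two rules shown (gemination by shadda, and lengthening of a short vowel by a following bare alif, waw, or ya) are all assumptions chosen for illustration.

```python
# Toy letter-to-sound rules for fully diacritized Arabic.
# Illustrative assumption only: not the paper's rules or phone encoding.

FATHA, DAMMA, KASRA = "\u064E", "\u064F", "\u0650"
SHADDA, SUKUN = "\u0651", "\u0652"
SHORT_VOWELS = {FATHA: "a", DAMMA: "u", KASRA: "i"}
MARKS = set(SHORT_VOWELS) | {SHADDA, SUKUN}

# A deliberately tiny letter inventory (hypothetical subset).
CONSONANTS = {
    "\u0628": "b",  # ba
    "\u062A": "t",  # ta
    "\u062F": "d",  # dal
    "\u0631": "r",  # ra
    "\u0633": "s",  # sin
    "\u0643": "k",  # kaf
    "\u0644": "l",  # lam
    "\u0645": "m",  # mim
    "\u0646": "n",  # nun
    "\u0648": "w",  # waw used as a glide
    "\u064A": "y",  # ya used as a glide
}

# Letters that lengthen a preceding matching short vowel when unmarked.
LENGTHENERS = {"\u0627": "a", "\u0648": "u", "\u064A": "i"}


def phonetize(word):
    """Return a list of phone symbols for one diacritized word."""
    phones = []
    i = 0
    while i < len(word):
        letter = word[i]
        i += 1
        # Collect every diacritic attached to this letter, in any order.
        marks = []
        while i < len(word) and word[i] in MARKS:
            marks.append(word[i])
            i += 1
        # Rule 1: bare alif/waw/ya after the matching short vowel
        # turns that vowel into a long vowel.
        if (letter in LENGTHENERS and not marks and phones
                and phones[-1] == LENGTHENERS[letter]):
            phones[-1] += LENGTHENERS[letter]
            continue
        if letter not in CONSONANTS:
            raise ValueError(f"no rule for character U+{ord(letter):04X}")
        symbol = CONSONANTS[letter]
        phones.append(symbol)
        # Rule 2: shadda geminates (doubles) the consonant.
        if SHADDA in marks:
            phones.append(symbol)
        # Rule 3: a short-vowel mark adds the vowel after the consonant.
        for mark in marks:
            if mark in SHORT_VOWELS:
                phones.append(SHORT_VOWELS[mark])
    return phones


# kaf+kasra, ta+fatha, alif, ba ("kitaab")
print(phonetize("\u0643\u0650\u062A\u064E\u0627\u0628"))
```

The diacritics of each letter are gathered before any rule fires, so the sketch gives the same result whether shadda is stored before or after the vowel mark. A real system needs many more rules (for example for hamza, the definite article, and tanwin), which this sketch deliberately omits.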

Author information

Corresponding author

Correspondence to Fadi Sindran.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Sindran, F., Mualla, F., Bobzin, K., Nöth, E. (2015). Automatic Robust Rule-Based Phonetization of Standard Arabic. In: Král, P., Matoušek, V. (eds) Text, Speech, and Dialogue. TSD 2015. Lecture Notes in Computer Science, vol 9302. Springer, Cham. https://doi.org/10.1007/978-3-319-24033-6_50

  • DOI: https://doi.org/10.1007/978-3-319-24033-6_50

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-24032-9

  • Online ISBN: 978-3-319-24033-6

  • eBook Packages: Computer Science, Computer Science (R0)
