Collecting Data for Automatic Speech Recognition Systems in Dialectal Arabic Using Games with a Purpose

  • Conference paper
Multimodal Analyses enabling Artificial Agents in Human-Machine Interaction (MA3HMI 2014)

Abstract

Building Automatic Speech Recognition (ASR) systems for spoken languages usually suffers from a shortage of available transcriptions. ASR systems require large speech corpora that pair speech recordings with their corresponding transcriptions in order to train acoustic models. In this paper, we target Egyptian dialectal Arabic. Like other spoken languages, it is used mainly for speaking rather than writing. Transcriptions are usually collected manually by experts, which has proved to be a time-consuming and expensive process. We introduce Games With a Purpose as a cheap and fast approach to gathering transcriptions for Egyptian dialectal Arabic. Furthermore, Arabic orthographic transcriptions lack diacritization, which leads to ambiguity. Transcriptions written in Arabic Chat Alphabet, on the other hand, are widely used and capture the pronunciation effects conveyed by diacritics. In this work, we present a game (pronounced "makhamekho") that aims at collecting transcriptions in Arabic orthography as well as in Arabic Chat Alphabet. It also gathers mappings of words from Arabic orthography to Arabic Chat Alphabet.

Author information

Corresponding author

Correspondence to Injy Hamed.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

El-Sakhawy, D., Abdennadher, S., Hamed, I. (2015). Collecting Data for Automatic Speech Recognition Systems in Dialectal Arabic Using Games with a Purpose. In: Böck, R., Bonin, F., Campbell, N., Poppe, R. (eds) Multimodal Analyses enabling Artificial Agents in Human-Machine Interaction. MA3HMI 2014. Lecture Notes in Computer Science, vol 8757. Springer, Cham. https://doi.org/10.1007/978-3-319-15557-9_10

  • DOI: https://doi.org/10.1007/978-3-319-15557-9_10

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-15556-2

  • Online ISBN: 978-3-319-15557-9

  • eBook Packages: Computer Science, Computer Science (R0)
