Speaker-Dependent Bottleneck Features for Egyptian Arabic Speech Recognition

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9811)

Abstract

This paper presents several ways to improve a speech recognition system for the Egyptian dialect of Arabic. The research is based on the CALLHOME Egyptian Arabic corpus. We demonstrate the contribution of speaker-dependent bottleneck features trained on other languages, and we verify that a small Modern Standard Arabic (MSA) corpus can be used to derive phonetic transcriptions. The resulting systems compare favorably with previously published results.

Keywords

Arabic language, Keyword search, Low resources

Notes

Acknowledgements

This research was partially financially supported by the Government of the Russian Federation, Grant 074-U01.

Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  1. ITMO University, Saint-Petersburg, Russia
  2. Speech Technology Center Ltd, Saint-Petersburg, Russia
