Phoneme-Based Recognizer to Assist Reading the Holy Quran
This paper presents a new phase of our ongoing effort to build a high-performance, speaker-independent recognizer for Quran recitation. An in-house developed and annotated sound database of about eight hours is used for this purpose. Since this database is segmented and annotated at both the allophone and phoneme levels, we are developing two separate baseline recognizers, one for allophones and one for phonemes. We employed the same approach for both recognizers so that their results can be compared directly. The Cambridge HTK tools are used to develop these recognizers. In this paper we present the development of the phoneme-based recognizer and assess its suitability for our ultimate goal of building a high-performance, speaker-independent recognizer to assist in reading and memorizing the Holy Quran; the details of the allophonic recognizer are being published separately. Each Quranic phoneme is modeled by an acoustic Hidden Markov Model (HMM) with three emitting states. The output distribution of each emitting state is a continuous density modeled as a mixture of 16 Gaussians. The phoneme recognizer achieves an average recognition rate of 92%, which is very promising, compared with 88% for the allophonic recognizer.
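The per-phoneme model described above (three emitting states, each with a 16-component Gaussian mixture) can be illustrated with a minimal Python sketch. This is not the HTK configuration used in the paper. The strict left-to-right topology with non-emitting entry and exit states follows the common HTK convention, but the abstract does not confirm that it was used here. The feature dimension of 39, the self-loop probability of 0.6, the diagonal covariances, and all function names are assumptions made purely for illustration.

```python
import math

NUM_EMITTING = 3   # stated in the abstract
NUM_MIXTURES = 16  # stated in the abstract
FEATURE_DIM = 39   # assumption: not given in the abstract


def left_to_right_transitions(num_emitting=NUM_EMITTING, self_loop=0.6):
    """Build a transition matrix with non-emitting entry and exit states.

    Each emitting state either stays put (self_loop) or moves to the next
    state. The self_loop value is an illustrative initial guess.
    """
    n = num_emitting + 2
    trans = [[0.0] * n for _ in range(n)]
    trans[0][1] = 1.0  # the entry state always moves to the first emitting state
    for i in range(1, n - 1):
        trans[i][i] = self_loop
        trans[i][i + 1] = 1.0 - self_loop
    return trans  # the exit row stays all zeros


def log_gaussian_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )


def log_gmm(x, weights, means, variances):
    """Log density of a Gaussian mixture, computed with log-sum-exp."""
    logs = [
        math.log(w) + log_gaussian_diag(x, m, v)
        for w, m, v in zip(weights, means, variances)
    ]
    peak = max(logs)
    return peak + math.log(sum(math.exp(value - peak) for value in logs))


# Illustrative parameters only; real values would come from training.
trans = left_to_right_transitions()
weights = [1.0 / NUM_MIXTURES] * NUM_MIXTURES
means = [[0.1 * k] * FEATURE_DIM for k in range(NUM_MIXTURES)]
variances = [[1.0] * FEATURE_DIM for _ in range(NUM_MIXTURES)]
frame = [0.0] * FEATURE_DIM
score = log_gmm(frame, weights, means, variances)
```

In a full recognizer, one such model per phoneme would be trained on the annotated recordings, and a decoder would combine the per-frame emission scores with the transition probabilities to find the most likely phoneme sequence.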
Keywords: Automatic Speech Recognition, Hidden Markov Models, Speech Corpus, Sound Corpus, Phonemes, Allophones, Phonetic Transcription, Speech Segmentation, Speech Annotation, Holy Quran Recitation, Quran Learning, Quran Sound Pronunciations, Tajweed Rules