Mono-font Cursive Arabic Text Recognition Using Speech Recognition System
This paper presents a system for recognising cursive Arabic typewritten text. The system is built using the Hidden Markov Model Toolkit (HTK), a portable toolkit for building speech recognition systems. The proposed system decomposes the page into its text lines and then extracts a set of simple statistical features from small overlapping windows that run along each text line. The resulting sequence of feature vectors is fed to the global model for training and recognition. The system's performance is assessed on a data corpus of Arabic text comprising more than 100 A4-size sheets typewritten in the Tahoma font.
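The front end described above (line decomposition followed by overlapping-window feature extraction) can be illustrated with a minimal sketch. The abstract does not specify which statistical features are used, so the two shown here (ink density and normalised vertical centre of gravity) are assumptions chosen only for illustration, as are the function names `split_lines` and `window_features`, the window width and step, and the right-to-left scanning direction. Passing the vectors on to HTK is not shown.

```python
from typing import List

Image = List[List[int]]  # binary image: 1 = ink, 0 = background


def split_lines(page: Image) -> List[Image]:
    """Split a page into text lines at rows that contain no ink."""
    lines, current = [], []
    for row in page:
        if any(row):
            current.append(row)
        elif current:
            lines.append(current)
            current = []
    if current:
        lines.append(current)
    return lines


def window_features(line: Image, width: int = 4, step: int = 2) -> List[List[float]]:
    """Slide an overlapping window along a text line and return one
    feature vector per window.

    Assumption: the window moves right to left, following Arabic reading
    order. Features (hypothetical): ink density and the vertical centre
    of gravity of the ink, normalised to [0, 1].
    """
    height, n_cols = len(line), len(line[0])
    vectors = []
    start = n_cols - width
    while start >= 0:
        # Ink count per row inside the current window.
        ink_rows = [sum(row[start:start + width]) for row in line]
        total = sum(ink_rows)
        density = total / (height * width)
        if total:
            centre = sum(i * c for i, c in enumerate(ink_rows)) / total
            centre /= max(height - 1, 1)
        else:
            centre = 0.5  # empty window: use the mid-height by convention
        vectors.append([density, centre])
        start -= step  # step < width gives overlapping windows
    return vectors


if __name__ == "__main__":
    page = [
        [0, 0, 0, 0, 0, 0],
        [1, 1, 0, 0, 0, 0],
        [1, 1, 0, 0, 1, 1],
        [0, 0, 0, 0, 0, 0],
        [0, 0, 1, 1, 0, 0],
    ]
    for text_line in split_lines(page):
        print(window_features(text_line, width=4, step=2))
```

Each text line yields an ordered sequence of short vectors, which is the kind of observation sequence a hidden Markov model consumes. Choosing a step smaller than the window width is what makes consecutive windows overlap.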