Abstract
This paper presents a system for recognising cursive Arabic typewritten text. The system is built using the Hidden Markov Model Toolkit (HTK), a portable toolkit for building speech recognition systems. The proposed system decomposes the page into its text lines and then extracts a set of simple statistical features from small overlapping windows that slide along each text line. The resulting feature vector sequence is fed to the global model for training and recognition. A data corpus comprising more than 100 A4-size sheets of Arabic text typewritten in Tahoma font is used to assess the performance of the proposed system.
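The sliding-window step described in the abstract can be illustrated with a minimal sketch. The abstract does not say which statistical features are used, how wide the windows are, or how much they overlap, so everything here is an assumption made for illustration: the function name `window_features`, the default `width` and `overlap` values, the two features (ink density and normalised vertical centre of gravity), and the right-to-left scan direction chosen to follow Arabic reading order. The text line is assumed to be already binarised, with 1 marking ink.

```python
from typing import List, Sequence


def window_features(
    line: Sequence[Sequence[int]], width: int = 4, overlap: int = 2
) -> List[List[float]]:
    """Return one [density, centre_of_gravity] vector per window.

    `line` is a binary text-line image given as rows of 0/1 pixels.
    The two features are illustrative assumptions, not the paper's set.
    """
    if not 0 <= overlap < width:
        raise ValueError("overlap must satisfy 0 <= overlap < width")
    height = len(line)
    n_cols = len(line[0])
    step = width - overlap  # how far each window is shifted
    features: List[List[float]] = []

    # Assumption: scan from the right edge, following Arabic reading order.
    right = n_cols
    while right - width >= 0:
        left = right - width
        ink = 0            # number of ink pixels in the window
        weighted_rows = 0  # sum of row indices over ink pixels
        for r in range(height):
            row_ink = sum(line[r][left:right])
            ink += row_ink
            weighted_rows += r * row_ink

        density = ink / (width * height)
        # Vertical centre of gravity scaled to [0, 1]; 0.5 if there is no ink
        # or the line is a single row high.
        if ink and height > 1:
            cog = weighted_rows / ink / (height - 1)
        else:
            cog = 0.5
        features.append([density, cog])
        right -= step
    return features


if __name__ == "__main__":
    toy_line = [
        [0, 0, 0, 0, 1, 1],
        [0, 0, 1, 1, 1, 1],
    ]
    for vector in window_features(toy_line, width=4, overlap=2):
        print(vector)
```

Each text line then yields an ordered sequence of short feature vectors, which plays the same role for the HMM as a sequence of acoustic frames does in speech recognition. Any columns left over at the far edge that do not fill a whole window are ignored in this sketch.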
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Khorsheed, M.S. (2006). Mono-font Cursive Arabic Text Recognition Using Speech Recognition System. In: Yeung, DY., Kwok, J.T., Fred, A., Roli, F., de Ridder, D. (eds) Structural, Syntactic, and Statistical Pattern Recognition. SSPR /SPR 2006. Lecture Notes in Computer Science, vol 4109. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11815921_83
DOI: https://doi.org/10.1007/11815921_83
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-37236-3
Online ISBN: 978-3-540-37241-7
eBook Packages: Computer Science, Computer Science (R0)