Abstract
In optical character recognition (OCR), Arabic poses additional challenges because of its cursive nature. We propose a system for recognizing a document containing Arabic text, using a pipeline of three neural networks. The first model predicts the font size of an Arabic word; the word is then normalized to an 18pt font size, the size used to train the next two models. The second model segments a word into characters. Word segmentation in Arabic, as in many similar cursive scripts, presents a challenge to OCR systems. This paper presents a multichannel neural network to solve the offline segmentation of machine-printed Arabic documents. The segmented characters are then fed as input to a convolutional neural network for Arabic character recognition. The font size prediction model produced a test accuracy of 99.1%. The accuracy of the segmentation model using one font is 98.9%, while the four-font model showed 95.5% accuracy. The whole pipeline showed an accuracy of 94.38% on the Arabic Transparent font at 18pt from the APTI data set.
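The three-stage flow described in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `OcrPipeline`, `scale_factor`, and `resize_nearest` are hypothetical, the three models are represented as plain callables rather than trained networks, images are assumed to be lists of pixel rows, and nearest-neighbour resampling stands in for whatever resizing the paper actually uses.

```python
from dataclasses import dataclass
from typing import Callable, List

TARGET_PT = 18.0  # font size that words are normalized to, per the abstract

# Assumed representation: a grayscale image as a list of pixel rows.
Image = List[List[float]]


def scale_factor(predicted_pt: float, target_pt: float = TARGET_PT) -> float:
    """Resize factor that maps a word rendered at predicted_pt to target_pt."""
    if predicted_pt <= 0:
        raise ValueError("predicted font size must be positive")
    return target_pt / predicted_pt


def resize_nearest(img: Image, factor: float) -> Image:
    """Nearest-neighbour resize (a stand-in; the paper's resampling may differ)."""
    h, w = len(img), len(img[0])
    new_h = max(1, round(h * factor))
    new_w = max(1, round(w * factor))
    return [
        [img[min(h - 1, int(r / factor))][min(w - 1, int(c / factor))]
         for c in range(new_w)]
        for r in range(new_h)
    ]


@dataclass
class OcrPipeline:
    # Hypothetical interfaces for the three networks named in the abstract.
    predict_font_size: Callable[[Image], float]   # stage 1: font size in points
    segment: Callable[[Image], List[Image]]       # stage 2: word -> character images
    recognize: Callable[[Image], str]             # stage 3: character image -> label

    def read_word(self, word_img: Image) -> str:
        pt = self.predict_font_size(word_img)
        normalized = resize_nearest(word_img, scale_factor(pt))
        chars = self.segment(normalized)
        # Assumes segment() already returns characters in reading order.
        return "".join(self.recognize(c) for c in chars)
```

For example, a word predicted at 9pt gets a scale factor of 18 / 9 = 2 before it reaches the segmentation and recognition stages, so both downstream models only ever see input at the size they were trained on.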
Notes
Keras was used for the experiments: https://github.com/fchollet/keras/.
The data were obtained from https://sites.google.com/site/motazsite/.
Cite this article
Radwan, M.A., Khalil, M.I. & Abbas, H.M. Neural Networks Pipeline for Offline Machine Printed Arabic OCR. Neural Process Lett 48, 769–787 (2018). https://doi.org/10.1007/s11063-017-9727-y
DOI: https://doi.org/10.1007/s11063-017-9727-y