Off-Line Arabic Character Recognition – A Review
Off-line recognition requires transferring the text under consideration into an image file. This is the only available way to bring printed materials into electronic media. However, the transfer causes the system to lose the temporal information of the text. Other complexities that an off-line recognition system has to deal with are low document resolution and poor binarisation, which can impair readability when essential features of the characters are deleted or obscured. Recognising Arabic script presents two additional challenges: the orthography is cursive and letter shape is context sensitive. Certain character combinations form new ligature shapes, which are often font-dependent. Some ligatures involve vertical stacking of characters. Since not all letters connect, locating word boundaries becomes a non-trivial problem, as spacing may separate not only words but also certain characters within a word. Various techniques have been implemented to achieve high recognition rates, each tackling different aspects of the recognition system. This review is organised into five major sections, covering a general overview, Arabic writing characteristics, Arabic text recognition systems, Arabic OCR software and conclusions.
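The word boundary problem described above can be illustrated with a vertical projection profile, one of the techniques listed in the key words. The following is a minimal sketch, not a method taken from any system covered in this review: it assumes an already binarised text line given as rows of 0/1 pixels, and it uses a fixed, hypothetical `word_gap_threshold` to tell gaps between words from gaps inside a word. The function names are likewise illustrative.

```python
def vertical_projection(image):
    """Count ink pixels (value 1) in each column of a binary image."""
    if not image:
        return []
    width = len(image[0])
    return [sum(row[x] for row in image) for x in range(width)]


def find_gaps(profile):
    """Return (start, end) column ranges, end exclusive, where no ink occurs."""
    gaps = []
    start = None
    for x, count in enumerate(profile):
        if count == 0 and start is None:
            start = x                      # a blank run begins
        elif count != 0 and start is not None:
            gaps.append((start, x))        # the blank run ends
            start = None
    if start is not None:
        gaps.append((start, len(profile)))
    return gaps


def classify_gaps(gaps, width, word_gap_threshold):
    """Label interior gaps by width; gaps touching the margins are skipped.

    word_gap_threshold is an assumed, fixed pixel width. It is a
    hypothetical parameter for illustration only.
    """
    labelled = []
    for start, end in gaps:
        if start == 0 or end == width:
            continue
        kind = "word" if (end - start) >= word_gap_threshold else "intra-word"
        labelled.append((start, end, kind))
    return labelled


# Toy text line: three ink blobs separated by a narrow and a wide gap.
line = [
    [1, 1, 0, 1, 1, 0, 0, 0, 1, 1],
    [1, 1, 0, 1, 0, 0, 0, 0, 1, 1],
]
profile = vertical_projection(line)
labelled_gaps = classify_gaps(find_gaps(profile), len(profile), word_gap_threshold=3)
```

In this toy line the one-column gap is labelled as an intra-word break and the three-column gap as a word boundary. A single fixed threshold is a simplification; since gap widths vary with font and size, a practical system would probably need to estimate the threshold from the gap widths observed in each line.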
Key words: Arabic OCR, Feature extraction, Fourier Transform, Hidden Markov Models, Horizontal projection, Hough Transform, Neural Networks, Off-line recognition, Preprocessing segmentation, Vertical projection