Abstract
Optical character recognition of cursive scripts presents a number of challenging problems in both the segmentation and recognition processes, which has attracted many researchers in the field of machine learning. This paper presents a novel approach based on a combination of a multilayer perceptron (MLP) and a support vector machine (SVM) to design a trainable OCR system for Persian/Arabic cursive documents. Implementation results on a comprehensive database show a high degree of accuracy, which meets the requirements of commercial use.
Keywords
- Support Vector Machine
- Natural Language Processing
- Fuzzy Inference System
- Support Vector Machine Classifier
- Multi Layer Perceptrons
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Pirsiavash, H., Mehran, R., Razzazi, F. (2005). A Robust Free Size OCR for Omni-Font Persian/Arabic Printed Document Using Combined MLP/SVM. In: Sanfeliu, A., Cortés, M.L. (eds) Progress in Pattern Recognition, Image Analysis and Applications. CIARP 2005. Lecture Notes in Computer Science, vol 3773. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11578079_63
DOI: https://doi.org/10.1007/11578079_63
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-29850-2
Online ISBN: 978-3-540-32242-9
eBook Packages: Computer Science, Computer Science (R0)