Abstract
Libraries hold large collections of printed historical Arabic documents that cannot be made available online because they lack a searchable index. Word spotting has previously been suggested as a way to build indexes for such collections by matching word images. In this paper we present a word spotting method for printed historical Arabic documents. We first segment words using the run-length smoothing algorithm. We then describe the features selected to represent the word images. Elastic Dynamic Time Warping is used to match the features of two word images. The method was tested on the printed historical Arabic document database of the Moroccan National Library.
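The two algorithmic steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: `rlsa_horizontal` is a plain horizontal run-length smoothing pass, `dtw_distance` is textbook Dynamic Time Warping rather than the elastic variant used in the paper, and the function names, the `threshold` parameter, and the toy feature sequences are all assumptions made for illustration.

```python
import math


def rlsa_horizontal(rows, threshold):
    """Horizontal run-length smoothing on a binary image (1 = ink).

    Background gaps of at most `threshold` pixels that lie between two
    ink pixels on the same row are filled, so nearby characters merge
    into one connected word blob. `threshold` is a hypothetical parameter.
    """
    out = [list(r) for r in rows]
    for row in out:
        ink = [x for x, v in enumerate(row) if v]
        for left, right in zip(ink, ink[1:]):
            # gap length is right - left - 1; skip adjacent pixels
            if 1 < right - left <= threshold + 1:
                for x in range(left + 1, right):
                    row[x] = 1
    return out


def dtw_distance(a, b):
    """Classic DTW cost between two sequences of feature vectors.

    Each element of `a` and `b` is a tuple of floats of the same
    dimension (for example, one tuple per image column). This is plain
    DTW, assumed here as a stand-in for the paper's elastic variant.
    """
    n, m = len(a), len(b)
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])  # Euclidean local cost
            cost[i][j] = d + min(
                cost[i - 1][j],      # step in a only
                cost[i][j - 1],      # step in b only
                cost[i - 1][j - 1],  # step in both
            )
    return cost[n][m]


# Toy usage: a 2-pixel gap is bridged, a 4-pixel gap is kept.
smoothed = rlsa_horizontal([[1, 0, 0, 1, 0, 0, 0, 0, 1]], threshold=2)

# Toy usage: the second sequence repeats one column, so DTW still aligns it
# with the first at zero cost.
same_word = dtw_distance([(1.0,), (2.0,), (3.0,)],
                         [(1.0,), (2.0,), (2.0,), (3.0,)])
```

In a word spotting setting, the smoothed image would be split into connected components to obtain word candidates, and the DTW cost would be used to rank candidate word images against a query image; how the cost is normalized and thresholded is left open here.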
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Zirari, F., Ennaji, A., Mammass, D., Nicolas, S. (2014). Spot Words in Printed Historical Arabic Documents. In: Elmoataz, A., Lezoray, O., Nouboud, F., Mammass, D. (eds) Image and Signal Processing. ICISP 2014. Lecture Notes in Computer Science, vol 8509. Springer, Cham. https://doi.org/10.1007/978-3-319-07998-1_33
DOI: https://doi.org/10.1007/978-3-319-07998-1_33
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-07997-4
Online ISBN: 978-3-319-07998-1
eBook Packages: Computer Science, Computer Science (R0)