Information Retrieval System for Handwritten Documents
The design and performance of a content-based information retrieval system for handwritten documents are described. Indexing and retrieval are based on writer characteristics, textual content, and document metadata such as writer profile. Documents are indexed using global image features, e.g., stroke width, slant, and word gaps, as well as local features that describe the shapes of characters and words. Image indexing is done automatically using page analysis, page segmentation, line separation, word segmentation, and recognition of characters and words. Several types of queries are permitted: (i) an entire document image; (ii) a region of interest (ROI) of a document; (iii) a word image; and (iv) text. Retrieval is based on a probabilistic model of information retrieval. The system has been implemented using Microsoft Visual C++ and a relational database system. This paper reports on the performance of the system for retrieving documents based on the same and on different content.
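To make the textual-query case concrete, the following is a minimal sketch of ranking documents with a probabilistic retrieval model. It assumes Okapi BM25 weighting, which is one well-known instance of such a model; the abstract does not say which formula the system uses. The function name `bm25_rank`, the parameters `k1` and `b`, and the representation of each document as a list of recognized words are illustrative assumptions, not part of the described system.

```python
import math
from collections import Counter


def bm25_rank(query_terms, docs, k1=1.2, b=0.75):
    """Rank documents against a textual query using BM25 (assumed weighting).

    docs is a list of documents, each a list of recognized words
    (a hypothetical stand-in for the output of word recognition).
    Returns (doc_index, score) pairs, highest score first.
    """
    if not docs:
        return []
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents containing each term.
    df = Counter(term for d in docs for term in set(d))

    scores = []
    for doc_id, doc in enumerate(docs):
        tf = Counter(doc)
        score = 0.0
        for term in query_terms:
            if tf[term] == 0:
                continue  # term absent from this document
            # Smoothed inverse document frequency (always positive).
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term-frequency saturation with document-length normalization.
            norm = tf[term] + k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append((doc_id, score))
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Toy collection: each inner list stands in for one transcribed document.
docs = [
    ["the", "quick", "fox"],
    ["handwriting", "slant", "stroke"],
    ["stroke", "width", "stroke"],
]
ranked = bm25_rank(["stroke"], docs)
```

In this toy collection the third document mentions the query word twice and so ranks above the second, while the first document does not contain it and scores zero. Image-based queries (a whole document, an ROI, or a word image) would need feature-based similarity measures instead, which this sketch does not cover.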