Information Retrieval System for Handwritten Documents

  • Sargur Srihari
  • Anantharaman Ganesh
  • Catalin Tomai
  • Yong-Chul Shin
  • Chen Huang
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3163)

Abstract

The design and performance of a content-based information retrieval system for handwritten documents are described. Indexing and retrieval are based on writer characteristics, textual content, and document metadata such as the writer profile. Documents are indexed using global image features (e.g., stroke width, slant, and word gaps) as well as local features that describe the shapes of characters and words. Image indexing is done automatically through page analysis, page segmentation, line separation, word segmentation, and recognition of characters and words. Four types of queries are permitted: (i) an entire document image; (ii) a region of interest (ROI) of a document; (iii) a word image; and (iv) text. Retrieval is based on a probabilistic model of information retrieval. The system has been implemented using Microsoft Visual C++ and a relational database system. This paper reports on the performance of the system in retrieving documents with the same content and with different content.
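The abstract does not state which formulation of the probabilistic model the system uses. As a minimal sketch only, assuming the Robertson/Sparck Jones term weight with no relevance information and a textual query matched against recognized words, ranking could look like the following. The function names, the `index` structure, and the sample documents are hypothetical and are not taken from the paper.

```python
import math


def rsj_weight(num_docs, doc_freq):
    """Robertson/Sparck Jones term weight when no relevance data is available.

    Assumed formulation: log((N - n + 0.5) / (n + 0.5)), where N is the
    collection size and n is the number of documents containing the term.
    The weight turns negative for terms found in more than half the documents.
    """
    return math.log((num_docs - doc_freq + 0.5) / (doc_freq + 0.5))


def rank(query_terms, index):
    """Rank documents by the summed weights of the query terms they contain.

    index: dict mapping a document id to the set of words recognized in it
    (a hypothetical stand-in for the system's relational database).
    """
    terms = set(query_terms)
    num_docs = len(index)
    # Document frequency of each query term across the collection.
    doc_freq = {t: sum(1 for words in index.values() if t in words) for t in terms}
    scores = {
        doc_id: sum(rsj_weight(num_docs, doc_freq[t]) for t in terms if t in words)
        for doc_id, words in index.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical collection of recognized words per document.
index = {
    "d1": {"referred", "physician"},
    "d2": {"referred"},
    "d3": {"letter"},
    "d4": {"physician", "letter"},
    "d5": {"date"},
}
ranked = rank(["referred", "physician"], index)
```

Here `d1` ranks first because it contains both query terms. Image-based queries (a document, an ROI, or a word image) would need feature-vector comparison instead of word matching, which this sketch does not attempt.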

Keywords

Digital Library; Document Image; Information Retrieval System; Document Retrieval; Word Image

Copyright information

© Springer-Verlag Berlin Heidelberg 2004

Authors and Affiliations

  • Sargur Srihari (1)
  • Anantharaman Ganesh (1)
  • Catalin Tomai (1)
  • Yong-Chul Shin (1)
  • Chen Huang (1)
  1. Center of Excellence for Document Analysis and Recognition (CEDAR), University at Buffalo, State University of New York, Buffalo, USA
