Data extraction from form images

  • F. Cesarini
  • M. Gori
  • S. Marinai
  • G. Soda
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 978)

Abstract

In this paper, we describe a system capable of extracting textual information from images of structured documents. In particular, the model and the algorithms we describe are used to process forms in which the information fields cannot be located by their position on the page alone, but can be identified after locating the corresponding instruction fields. The proposed model is based on attributed relational graphs and performs form registration and location of information fields using algorithms based on the hypothesize-and-verify paradigm. The location of instruction fields is carried out in a holistic way, using connectionist models.
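
The hypothesize-and-verify idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes that instruction fields have already been detected and labelled, that registration is a pure translation, and that the relational attributes of the graph reduce to absolute node positions. The names `Node`, `hypothesize_and_verify`, `locate_field`, and the tolerance `tol` are hypothetical and introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """A located instruction field (hypothetical, simplified graph node)."""
    label: str  # class of the instruction field, e.g. "name"
    x: float
    y: float


def hypothesize_and_verify(model, detected, tol=5.0):
    """Return (dx, dy, support) for the best-supported translation.

    Each same-label (model, detected) pair yields one hypothesis; the
    hypothesis is verified by counting the model nodes that have a
    same-label detection within `tol` of their predicted position.
    """
    best = None
    for m in model:
        for d in detected:
            if m.label != d.label:
                continue
            dx, dy = d.x - m.x, d.y - m.y  # hypothesis from one pairing
            support = sum(
                any(
                    o.label == n.label
                    and abs(o.x - (n.x + dx)) <= tol
                    and abs(o.y - (n.y + dy)) <= tol
                    for o in detected
                )
                for n in model
            )
            if best is None or support > best[2]:
                best = (dx, dy, support)
    return best


def locate_field(model_box, dx, dy):
    """Shift a model information-field box (x, y, w, h) by the translation."""
    x, y, w, h = model_box
    return (x + dx, y + dy, w, h)


# Assumed toy data: three instruction fields in the form model, and four
# detections on the input image (the last one is a spurious match).
model = [Node("name", 10, 10), Node("date", 10, 50), Node("total", 100, 90)]
detected = [
    Node("name", 13, 8),
    Node("date", 13, 48),
    Node("total", 103, 88),
    Node("date", 200, 200),
]

best = hypothesize_and_verify(model, detected)
name_value_box = locate_field((40, 10, 50, 12), best[0], best[1])
```

In this toy setting, the spurious "date" detection produces a hypothesis supported by only one model node, so the translation supported by all three instruction fields is kept and then applied to the box of an information field stored in the model.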

Keywords

Attributed Relational Graphs, Document Registration, Form Processing, Layout Description

Copyright information

© Springer-Verlag Berlin Heidelberg 1995

Authors and Affiliations

  • F. Cesarini (1)
  • M. Gori (1)
  • S. Marinai (1)
  • G. Soda (1)
  1. Dipartimento di Sistemi e Informatica, Università di Firenze, Firenze, Italia
