A Statistical Method for an Automatic Detection of Form Types

  • Saddok Kebairi
  • Bruno Taconet
  • Abderrazak Zahour
  • Said Ramdane
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1655)

Abstract

We present a statistical method for classifying forms whose physical structure may vary from one writer to another. Automatic segmentation extracts the physical structure of each form, which is described by its set of main rectangular blocks. During the learning phase, blocks are matched within each class; the number of occurrences of each block is counted, and statistical block attributes are computed. During the identification phase, block instability is handled by introducing a block penalty coefficient that modifies the classical expression of the Mahalanobis distance. The penalty coefficient of a block depends on that block's occurrence probability. Experimental results on several form types are given.
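
The identification step described above can be illustrated with a minimal sketch. The abstract does not state the modified distance, so everything specific here is an assumption made for illustration: the diagonal-covariance form of the Mahalanobis term, the use of the occurrence probability `p` as the weight, the fixed `miss_cost` charged for an absent block, and all function and variable names. It is not the authors' formula.

```python
def penalized_distance(form_blocks, model, miss_cost=9.0):
    """Distance between one segmented form and one learned class model.

    form_blocks: dict, model block id -> attribute vector of the matched block
    model:       dict, block id -> (mean vector, variance vector, occurrence probability)
    """
    total = 0.0
    for block_id, (mean, var, p) in model.items():
        x = form_blocks.get(block_id)
        if x is None:
            # Assumed penalty: a block that occurred often during learning
            # costs more when it is missing than a rarely seen block.
            total += p * miss_cost
        else:
            # Diagonal-covariance squared Mahalanobis term for this block,
            # weighted by the block's occurrence probability (assumption).
            term = sum((xi - mi) ** 2 / vi for xi, mi, vi in zip(x, mean, var))
            total += p * term
    return total


def classify(form_blocks, models):
    """Return the name of the class model with the smallest penalized distance."""
    return min(models, key=lambda name: penalized_distance(form_blocks, models[name]))


# Hypothetical class model: a stable header block and an unstable note block.
model_a = {
    "header": ([10.0, 20.0], [4.0, 4.0], 1.0),
    "note": ([50.0, 80.0], [9.0, 9.0], 0.5),
}
# Hypothetical second class whose header sits elsewhere on the page.
model_b = {
    "header": ([100.0, 200.0], [4.0, 4.0], 1.0),
}

form = {"header": [12.0, 20.0]}  # the note block was not found on this form
print(penalized_distance(form, model_a))
print(classify(form, {"A": model_a, "B": model_b}))
```

With this weighting, the missing `note` block adds only half of `miss_cost` because it appeared in half of the learning samples, so an unstable block does not by itself push a form out of its class.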

Keywords

Physical Structure; Mahalanobis Distance; Automatic Detection; Rectangular Block; Text Block


Copyright information

© Springer-Verlag Berlin Heidelberg 1999

Authors and Affiliations

  • Saddok Kebairi (1)
  • Bruno Taconet (1)
  • Abderrazak Zahour (1)
  • Said Ramdane (1)
  1. Laboratoire d’Informatique du Havre, Université du Havre, Le Havre, France
