Distance Based Strategy for Supervised Document Image Classification

  • Fabien Carmagnac
  • Pierre Héroux
  • Éric Trupin
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3138)


This paper deals with supervised document image classification. An original distance-based strategy performs feature selection automatically. Computing the distance between the image to be classified and a class representative (a point of view) makes it possible to estimate a membership function for all classes. Choosing the best point of view amounts to selecting features. An algorithm exploits this idea to iteratively filter the list of candidate classes. The training phase computes the distances between every pair of classes. Each iteration of the classification algorithm computes the distance d between the image to be classified and the chosen representative. Classes whose distance to this point of view differs from d are removed from the list of candidate classes. The strategy is implemented as a module of A2IA FieldReader to identify the class of the processed document. Experimental results are presented and compared with those of a k-NN classifier.
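The iterative filtering described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the Euclidean distance on toy feature vectors, the tolerance `tol` used to decide whether two distances "differ", and the spread-based heuristic for choosing the next point of view are all assumptions made here for illustration, as are the function names.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def train(representatives, dist=euclidean):
    """Training: precompute the distance between every pair of class representatives."""
    return {
        p: {c: dist(rp, rc) for c, rc in representatives.items()}
        for p, rp in representatives.items()
    }


def classify(sample, representatives, table, tol, dist=euclidean):
    """Iteratively filter the candidate classes; returns the set of surviving classes.

    `tol` is a hypothetical tolerance: a class is kept when its stored distance
    to the point of view is within `tol` of the measured distance d.
    """
    candidates = set(representatives)
    unused = set(representatives)  # points of view not tried yet
    while len(candidates) > 1 and unused:
        # Assumed heuristic: prefer the point of view whose stored distances
        # to the remaining candidates are the most spread out.
        pov = max(
            sorted(unused),
            key=lambda p: max(table[p][c] for c in candidates)
            - min(table[p][c] for c in candidates),
        )
        unused.discard(pov)
        d = dist(sample, representatives[pov])
        # Drop the classes whose distance to this point of view differs from d.
        candidates = {c for c in candidates if abs(table[pov][c] - d) <= tol}
    return candidates


if __name__ == "__main__":
    reps = {"A": (0.0, 0.0), "B": (10.0, 0.0), "C": (0.0, 10.0)}  # toy data
    table = train(reps)
    print(classify((9.5, 0.2), reps, table, tol=1.0))
```

An empty result can be read as a rejection (no class is consistent with the measured distances), and a result with several classes means that the available points of view could not separate them at the chosen tolerance.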


Keywords: Membership Function, Feature Selection, Feature Space, Document Image, Good Point



Copyright information

© Springer-Verlag Berlin Heidelberg 2004

Authors and Affiliations

  • Fabien Carmagnac (1, 2)
  • Pierre Héroux (2)
  • Éric Trupin (2)
  1. A2IA SA, Paris cedex, France
  2. Laboratoire PSI, CNRS FRE 2645, Université de Rouen, Mont-Saint-Aignan cedex, France
