Logical structure analysis by typographic characteristics extraction

  • Duffy Laurence
  • Lebourgeois Frank
  • Emptoz Hubert
Poster Session D: Biomedical Applications, Detection, Control & Surveillance, Inspection, Optical Character Recognition
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1311)

Abstract

We propose to study the logical structure of a document through its typographic characteristics. To this end, we present a new method for grouping letters and words into homogeneous font families that does not require explicit recognition of the font. This analysis allows us to extract part of the logical information carried by the typographic features of words.
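The idea of grouping words into font families without naming the font can be illustrated with a minimal sketch. This is not the authors' method, which the abstract does not detail. It assumes each word has already been reduced to a small feature vector. The features (x-height, stroke width, slant), the greedy grouping rule, the function name `group_by_typography`, and the threshold are all hypothetical choices made for illustration.

```python
from math import dist


def group_by_typography(features, threshold):
    """Greedily group feature vectors into families.

    Each vector joins the first family whose centroid lies within
    `threshold` (Euclidean distance); otherwise it starts a new family.
    No font label is ever assigned: families are only "looks alike" groups.
    """
    families = []  # each family: {"centroid": [...], "members": [indices]}
    for index, vector in enumerate(features):
        for family in families:
            if dist(vector, family["centroid"]) <= threshold:
                family["members"].append(index)
                count = len(family["members"])
                # Incremental update of the running mean of the family.
                family["centroid"] = [
                    c + (x - c) / count
                    for c, x in zip(family["centroid"], vector)
                ]
                break
        else:
            families.append({"centroid": list(vector), "members": [index]})
    return [family["members"] for family in families]


# Hypothetical per-word measurements:
# (x-height in pixels, stroke width in pixels, slant in degrees)
words = [
    (20.0, 3.0, 0.0),   # regular body text
    (20.5, 3.1, 0.5),   # regular body text
    (20.0, 5.0, 0.0),   # bold word
    (32.0, 5.5, 0.0),   # heading
    (20.2, 3.0, 12.0),  # italic word
    (19.8, 2.9, 0.2),   # regular body text
]

families = group_by_typography(words, threshold=1.5)
print(families)
```

In this toy input the three body-text words fall into one family, while the bold, heading, and italic words each form their own. Such families could then be mapped to logical roles (for example, the largest family as running text and a rare large-size family as titles), but that mapping is an assumption here, not a result stated by the paper.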

Keywords

Error Case, Logical Structure, Document Image, Pattern Redundancy, Character Word
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 1997

Authors and Affiliations

  • Duffy Laurence (1)
  • Lebourgeois Frank (1)
  • Emptoz Hubert (1)

  1. Laboratoire de Reconnaissances de Formes et Vision, Villeurbanne Cedex
