Multi-oriented English Text Line Identification

  • U. Pal
  • S. Sinha
  • B. B. Chaudhuri
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2749)

Abstract

Many artistic documents contain text lines with different inclinations (orientations) on a single page. To enhance the capability of a document analysis system, text lines in multiple orientations must be extracted. In this paper, we propose a robust technique to detect English text lines of arbitrary orientation in a single document page. The approach is bottom-up: connected components are first labelled and then clustered into word groups. Text lines of arbitrary orientation are identified from the estimated orientation of these word groups. In an experiment on 3700 text lines, the proposed method achieved an accuracy of 98.3%.
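The bottom-up pipeline outlined in the abstract (label connected components, cluster them into groups, estimate each group's orientation) can be illustrated with a minimal sketch. The abstract does not state the clustering criterion or the orientation estimator, so the centroid-distance threshold `max_gap` and the principal-axis angle used here are assumptions made for illustration, and all function names are hypothetical rather than taken from the paper.

```python
import math
from collections import deque


def label_components(img):
    """Return the 8-connected components of a binary grid (1 = ink) as pixel lists."""
    h, w = len(img), len(img[0])
    seen = [[False] * w for _ in range(h)]
    comps = []
    for r in range(h):
        for c in range(w):
            if img[r][c] and not seen[r][c]:
                seen[r][c] = True
                queue, pixels = deque([(r, c)]), []
                while queue:  # breadth-first flood fill from the seed pixel
                    y, x = queue.popleft()
                    pixels.append((y, x))
                    for dy in (-1, 0, 1):
                        for dx in (-1, 0, 1):
                            ny, nx = y + dy, x + dx
                            if (0 <= ny < h and 0 <= nx < w
                                    and img[ny][nx] and not seen[ny][nx]):
                                seen[ny][nx] = True
                                queue.append((ny, nx))
                comps.append(pixels)
    return comps


def centroid(pixels):
    """Centroid of a component as (x, y)."""
    n = len(pixels)
    return (sum(x for _, x in pixels) / n, sum(y for y, _ in pixels) / n)


def group_components(centroids, max_gap):
    """Assumed grouping rule: merge components whose centroids lie within max_gap."""
    parent = list(range(len(centroids)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(len(centroids)):
        for j in range(i + 1, len(centroids)):
            if math.dist(centroids[i], centroids[j]) <= max_gap:
                parent[find(i)] = find(j)
    groups = {}
    for i in range(len(centroids)):
        groups.setdefault(find(i), []).append(i)
    return list(groups.values())


def orientation_degrees(points):
    """Principal-axis angle of (x, y) points in degrees, in image coordinates (y down)."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points)
    syy = sum((y - my) ** 2 for _, y in points)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    return math.degrees(0.5 * math.atan2(2 * sxy, sxx - syy))
```

A fixed distance threshold is the simplest possible grouping rule; a real system would more plausibly scale it with component size, which this sketch leaves out.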

Keywords

Core Area; Reference Line; Candidate Region; Document Image; Text Line
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2003

Authors and Affiliations

  • U. Pal
    • 1
  • S. Sinha
    • 1
  • B. B. Chaudhuri
    • 1
  1. Computer Vision and Pattern Recognition Unit, Indian Statistical Institute, Kolkata, India
