Simple Layout Segmentation of Gray-Scale Document Images

  • A. Suvichakorn
  • S. Watcharabusaracum
  • W. Sinthupinyo
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2423)

Abstract

A simple yet effective layout segmentation method for document images is proposed in this paper. First, n × n blocks are roughly labeled as background, line, text, image, graphics, or mixed class. Blocks in the mixed class are split into four sub-blocks, and the process repeats until no mixed-class block remains. By exploiting the Savitzky-Golay derivative filter in the classification, the computation of features is kept to a minimum. Next, the boundaries of each object are refined. The experimental results are satisfactory for use as a pre-processing step prior to OCR.
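The split-until-unmixed step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the three-label toy classifier, the threshold `thresh`, the `min_size` stopping rule, and all function names are assumptions introduced here for illustration. The only fixed ingredient is the standard 5-point quadratic Savitzky-Golay first-derivative weights, (-2, -1, 0, 1, 2) / 10.

```python
# 5-point quadratic Savitzky-Golay first-derivative weights: (-2, -1, 0, 1, 2) / 10
SG_DERIV_5 = (-0.2, -0.1, 0.0, 0.1, 0.2)


def sg_derivative(row):
    """Horizontal first derivative of one pixel row (same length as the input)."""
    padded = [row[0]] * 2 + list(row) + [row[-1]] * 2  # replicate the edge pixels
    return [sum(w * padded[i + k] for k, w in enumerate(SG_DERIV_5))
            for i in range(len(row))]


def activity_map(image, thresh=5.0):
    """Mark pixels whose derivative magnitude exceeds a hypothetical threshold."""
    return [[abs(d) > thresh for d in sg_derivative(row)] for row in image]


def classify(active, x, y, size):
    """Toy stand-in for the paper's classifier, reduced to three labels."""
    hits = sum(active[r][c]
               for r in range(y, y + size)
               for c in range(x, x + size))
    if hits == 0:
        return "background"
    if hits == size * size:
        return "content"
    return "mixed"


def segment(active, x, y, size, min_size=2):
    """Split mixed blocks into 4 sub-blocks until none is mixed.

    min_size is an assumption added here to guarantee termination.
    """
    label = classify(active, x, y, size)
    if label != "mixed" or size <= min_size:
        return [(x, y, size, label)]
    half = size // 2
    blocks = []
    for dy in (0, half):
        for dx in (0, half):
            blocks += segment(active, x + dx, y + dy, half, min_size)
    return blocks
```

For a gray-scale image stored as a list of pixel rows, `segment(activity_map(image), 0, 0, n)` returns a list of `(x, y, size, label)` tuples, one per block in which the toy classifier no longer reports a mixture. A real classifier would distinguish the line, text, image, and graphics classes named in the abstract, which this sketch does not attempt.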

References

  1. A. Savitzky and M.J.E. Golay, Smoothing and differentiation of data by simplified least squares procedures. Analytical Chemistry 36 (1964) 1627–1639
  2. I. Keslassy, M. Kalman, D. Wang, and B. Girod, Classification of Compound Images Based on Transform Coefficient Likelihood. Proc. ICIP 2001 (2001)
  3. In-Kwon Kim, Dong-Wook Jung, and Rae-Hong Park, Document image binarization based on topographic analysis using a water flow model. Pattern Recognition 35(1) (2002) 265–277

Copyright information

© Springer-Verlag Berlin Heidelberg 2002

Authors and Affiliations

  • A. Suvichakorn (1)
  • S. Watcharabusaracum (1)
  • W. Sinthupinyo (1)
  1. National Electronics and Computer Technology Center, Klong Luang, Thailand
