Document Layout and Reading Sequence Analysis by Extended Split Detection Method

  • Noboru Nakajima
  • Keiji Yamada
  • Jun Tsukumo
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1655)

Abstract

This paper describes an Extended Split Detection Method that can hierarchically segment a machine-printed page image with a complex layout into smaller layout elements. The method performs piecewise-linear segmentation using many kinds of separator elements, such as field separators, lines, edges of figures, and edges of white background areas. It also represents the analyzed hierarchical layout as a tree data structure whose nodes are all traversed according to simple rules to generate the reading sequence. We demonstrate that the new method increases the correct character line segmentation rate by 15.5%, to 95.5%, and achieves a correct reading sequence generation rate of 88.1%.
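The idea of generating a reading sequence by traversing a layout tree can be illustrated with a minimal sketch. The abstract does not state the traversal rules, so the rules used here are assumptions for illustration only: children of a "rows" split are read top to bottom, children of a "columns" split are read left to right, and the tree is walked depth-first. The names `LayoutNode`, `split`, and `reading_sequence` are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class LayoutNode:
    """One page region. Leaves carry a label; inner nodes carry a split kind."""
    x: int                               # left edge of the region
    y: int                               # top edge of the region
    label: Optional[str] = None          # set on leaf regions only
    split: Optional[str] = None          # "rows" or "columns" on inner nodes
    children: List["LayoutNode"] = field(default_factory=list)


def reading_sequence(node: LayoutNode) -> List[str]:
    """Depth-first traversal under the assumed ordering rules."""
    if not node.children:
        return [node.label] if node.label is not None else []
    if node.split == "rows":
        # Assumed rule: regions stacked vertically are read top to bottom.
        ordered = sorted(node.children, key=lambda c: c.y)
    else:
        # Assumed rule: side-by-side regions are read left to right.
        ordered = sorted(node.children, key=lambda c: c.x)
    sequence: List[str] = []
    for child in ordered:
        sequence.extend(reading_sequence(child))
    return sequence


# A title above a two-column body; children are listed out of order on purpose.
left_column = LayoutNode(x=0, y=100, split="rows", children=[
    LayoutNode(x=0, y=400, label="B"),
    LayoutNode(x=0, y=100, label="A"),
])
right_column = LayoutNode(x=300, y=100, label="C")
body = LayoutNode(x=0, y=100, split="columns", children=[right_column, left_column])
page = LayoutNode(x=0, y=0, split="rows", children=[
    body,
    LayoutNode(x=0, y=0, label="title"),
])

print(reading_sequence(page))  # ['title', 'A', 'B', 'C']
```

A page set in vertical writing would likely need the column rule reversed (right to left), which is one reason the ordering rule is kept separate from the tree itself in this sketch.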

Copyright information

© Springer-Verlag Berlin Heidelberg 1999

Authors and Affiliations

  • Noboru Nakajima (1)
  • Keiji Yamada (1)
  • Jun Tsukumo (1)
  1. C&C Media Research Laboratories, NEC, Miyamae-ku, Kawasaki, Japan
