Performance Comparison of Six Algorithms for Page Segmentation

  • Faisal Shafait
  • Daniel Keysers
  • Thomas M. Breuel
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3872)

Abstract

This paper presents a quantitative comparison of six algorithms for page segmentation: X-Y cut, smearing, whitespace analysis, constrained text-line finding, Docstrum, and Voronoi-diagram-based segmentation. The evaluation uses a subset of the UW-III collection, a dataset commonly used for this purpose, with a separate training set for parameter optimization. We compare the results obtained with default parameters and with optimized parameters. In the course of the evaluation, we analyze the strengths and weaknesses of each algorithm and show that no single algorithm outperforms all the others. However, we observe that the three best-performing algorithms are those based on constrained text-line finding, Docstrum, and the Voronoi diagram.

Keywords

Segmentation Algorithm, Document Image, Optical Character Recognition, Text Region, Text Block

Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Faisal Shafait (1)
  • Daniel Keysers (1)
  • Thomas M. Breuel (1)
  1. Image Understanding and Pattern Recognition (IUPR) research group, German Research Center for Artificial Intelligence (DFKI), and Technical University of Kaiserslautern, Kaiserslautern, Germany
