A statistical approach to the generation of a database for evaluating OCR software

  • Original Research Paper
  • International Journal on Document Analysis and Recognition

Abstract.

In this paper we consider a statistical approach to augmenting a limited database of groundtruth documents for use in evaluating optical character recognition (OCR) software. A modified moving-blocks bootstrap procedure is used to construct surrogate documents for this purpose. These surrogates prove to serve effectively in place of groundtruth and, in some regards, are indistinguishable from it. The proposed method is validated through a rigorous statistical procedure.
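For readers unfamiliar with the technique named above, the following is a minimal sketch of the standard (unmodified) moving-blocks bootstrap. It is not the authors' modified procedure, whose details are not given in this abstract. The function name `moving_blocks_bootstrap`, the `block_len` parameter, and the representation of a document as a plain sequence of values are all assumptions made purely for illustration.

```python
import random


def moving_blocks_bootstrap(series, block_len, rng=None):
    """Build one surrogate sequence from overlapping blocks of `series`.

    Hypothetical helper: this is the textbook moving-blocks bootstrap,
    not the modified variant described in the paper.
    """
    rng = rng or random.Random()
    n = len(series)
    if not 1 <= block_len <= n:
        raise ValueError("block_len must be between 1 and len(series)")

    # Every contiguous window of length block_len is a candidate block.
    n_starts = n - block_len + 1

    surrogate = []
    while len(surrogate) < n:
        # Draw a block start uniformly at random, with replacement.
        start = rng.randrange(n_starts)
        surrogate.extend(series[start:start + block_len])

    # Trim the final block so the surrogate matches the original length.
    return surrogate[:n]


# Illustrative data only: a stand-in sequence, not values from the paper.
original = list(range(20))
surrogate = moving_blocks_bootstrap(original, block_len=4, rng=random.Random(0))
```

Resampling whole blocks rather than single observations preserves short-range dependence within each block, which is the usual reason for choosing this scheme over an ordinary bootstrap when neighbouring observations are correlated.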

Additional information

Received: March 30, 2000 / Revised: September 14, 2001

About this article

Cite this article

Brundick, F., Brodeen, A. & Taylor, M. A statistical approach to the generation of a database for evaluating OCR software. IJDAR 4, 170–176 (2002). https://doi.org/10.1007/s100320200067

  • DOI: https://doi.org/10.1007/s100320200067
