Abstract.
We describe a spelling correction system designed specifically for OCR-generated text. It selects candidate words using information gathered from multiple knowledge sources. The system is based on static and dynamic device mappings, approximate string matching, and n-gram analysis. It is statistically based and Bayesian, and it incorporates a learning feature that collects confusion information at the collection and document levels. We also present an evaluation of the system.
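To make the idea of Bayesian candidate selection concrete, the following is a minimal sketch, not the OCRSpell implementation. It assumes a toy lexicon with made-up word priors (`LEXICON`), a hypothetical table of device mappings (`CONFUSIONS`) giving the probability that an intended substring was read as another, and an assumed probability `P_CORRECT` that a word was read without error. Python's `difflib.get_close_matches` stands in for the approximate string matching step; the paper's own matching and n-gram components are not reproduced here.

```python
from difflib import get_close_matches

# Hypothetical lexicon: word -> prior probability (illustrative values only).
LEXICON = {"modern": 0.6, "modem": 0.3, "morning": 0.1}

# Hypothetical device mappings: (intended, ocr_output) -> P(ocr_output | intended).
CONFUSIONS = {("m", "rn"): 0.05, ("rn", "m"): 0.2}

# Assumed probability that a word is read without any error.
P_CORRECT = 0.9


def channel_prob(observed, candidate):
    """P(observed | candidate), allowing at most one device mapping."""
    if observed == candidate:
        return P_CORRECT
    best = 0.0
    for (intended, output), p in CONFUSIONS.items():
        start = candidate.find(intended)
        while start != -1:
            # Apply the mapping at this position and compare with the OCR token.
            mapped = candidate[:start] + output + candidate[start + len(intended):]
            if mapped == observed:
                best = max(best, p)
            start = candidate.find(intended, start + 1)
    return best


def rank_candidates(observed, n=5, cutoff=0.6):
    """Rank lexicon words by prior * channel probability (Bayes' rule, unnormalized)."""
    candidates = get_close_matches(observed, list(LEXICON), n=n, cutoff=cutoff)
    scored = [(c, LEXICON[c] * channel_prob(observed, c)) for c in candidates]
    scored = [(c, s) for c, s in scored if s > 0]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    print(rank_candidates("rnodern"))
    print(rank_candidates("modem"))
```

In this sketch the OCR token "rnodern" is matched to "modern" through the assumed "m" to "rn" mapping. A learning feature like the one the abstract mentions could update such a table as confusions are observed, though how OCRSpell does this is not specified here.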
Received August 16, 2000 / Revised October 6, 2000
Cite this article
Taghva, K., Stofsky, E. OCRSpell: an interactive spelling correction system for OCR errors in text. IJDAR 3, 125–137 (2001). https://doi.org/10.1007/PL00013558
DOI: https://doi.org/10.1007/PL00013558