OCRSpell: an interactive spelling correction system for OCR errors in text

  • Kazem Taghva
  • Eric Stofsky
Original papers

Abstract.

In this paper, we describe a spelling correction system designed specifically for OCR-generated text that selects candidate words through the use of information gathered from multiple knowledge sources. This system for text correction is based on static and dynamic device mappings, approximate string matching, and n-gram analysis. Our statistically based, Bayesian system incorporates a learning feature that collects confusion information at the collection and document levels. An evaluation of the new system is presented as well.

Key words: OCR – Spell checkers – Information retrieval – Error correction – Scanning
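
The abstract names the ingredients of the system (device mappings, a learning feature that collects confusion information, and Bayesian candidate selection) without showing how they fit together. The following is a minimal sketch of that general idea, not the OCRSpell implementation. The class ConfusionModel, the function rank, the sample mappings, and the tiny lexicon are all hypothetical names and data chosen for illustration. The channel probability is a crude estimate that ignores correctly recognized characters, and approximate string matching and n-gram analysis are left out.

```python
from collections import Counter


class ConfusionModel:
    """Toy store of OCR confusions (hypothetical, for illustration only)."""

    def __init__(self, static_mappings):
        # counts[(ocr_string, intended_string)] = times this confusion was seen.
        # The initial counts play the role of "static" mappings.
        self.counts = Counter(static_mappings)

    def learn(self, ocr, intended):
        # "Dynamic" mapping: record a confusion observed during correction.
        self.counts[(ocr, intended)] += 1

    def prob(self, ocr, intended):
        # Crude estimate of P(ocr | intended): share of this confusion among
        # all recorded confusions of the same intended string.
        total = sum(n for (_, t), n in self.counts.items() if t == intended)
        return self.counts[(ocr, intended)] / total if total else 0.0

    def candidates(self, token):
        # Apply each known mapping once, at every position where it matches.
        out = {}
        for ocr, intended in list(self.counts):
            start = token.find(ocr)
            while start != -1:
                cand = token[:start] + intended + token[start + len(ocr):]
                out[cand] = max(out.get(cand, 0.0), self.prob(ocr, intended))
                start = token.find(ocr, start + 1)
        return out


def rank(token, model, lexicon):
    # Score each in-lexicon candidate by prior * channel probability
    # (Bayes' rule up to a constant) and return the best first.
    total = sum(lexicon.values())
    scored = [
        (cand, p * lexicon[cand] / total)
        for cand, p in model.candidates(token).items()
        if cand in lexicon
    ]
    return [c for c, _ in sorted(scored, key=lambda x: x[1], reverse=True)]


# Hypothetical data: two static mappings and word frequencies.
model = ConfusionModel({("rn", "m"): 3, ("1", "l"): 2})
lexicon = {"modern": 5, "model": 3}

print(rank("rnodel", model, lexicon))
print(rank("moclern", model, lexicon))   # no mapping covers "cl" yet
model.learn("cl", "d")                   # confusion learned from a user fix
print(rank("moclern", model, lexicon))
```

In this sketch, a confusion learned from one correction immediately becomes available for later tokens, which is the role the abstract assigns to confusion information gathered at the document and collection levels.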

Copyright information

© Springer-Verlag Berlin Heidelberg 2001

Authors and Affiliations

  • Kazem Taghva (1)
  • Eric Stofsky (1)
  1. Information Science Research Institute, University of Nevada, Las Vegas, Las Vegas, NV 89154-4021, USA; e-mail: taghva@isri.unlv.edu
