Table 1. Number of index terms (in thousands) and the proportional increase in comparison to the baseline.

From: Assessing the Impact of OCR Errors in Information Retrieval

Setting   Baseline   1%          5%          10%          25%          50%
ALL_NS    273        355 (30%)   523 (91%)   659 (141%)   937 (243%)   1,243 (355%)
ALL_ST    203        253 (24%)   352 (73%)   434 (113%)   605 (197%)   801 (293%)
JD_NS     273        342 (25%)   473 (73%)   574 (110%)   770 (182%)   983 (260%)
JD_ST     203        245 (20%)   324 (59%)   386 (90%)    514 (153%)   660 (224%)
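
The proportional increase in parentheses can be read as the relative growth of the index vocabulary over the baseline. The sketch below is a minimal illustration of that arithmetic, not the authors' code: the function name `proportional_increase` is hypothetical, and the counts are the rounded thousands copied from the ALL_ST row. Because the table reports counts rounded to thousands, the recomputed percentages can differ from the published ones by about a point; the published values were presumably derived from the unrounded counts.

```python
def proportional_increase(terms: float, baseline: float) -> float:
    """Return the increase of `terms` over `baseline` as a percentage."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (terms - baseline) / baseline * 100.0


# Index-term counts (in thousands) copied from the ALL_ST row of Table 1.
# The keys are the column headers of the table.
baseline = 203
all_st = {"1%": 253, "5%": 352, "10%": 434, "25%": 605, "50%": 801}

# Recompute the proportional increase for each column.
increases = {col: proportional_increase(n, baseline) for col, n in all_st.items()}

for col, pct in increases.items():
    print(f"{col:>3}: {n_terms if False else all_st[col]:>4}k terms, +{pct:.1f}% over baseline")
```

The same function applies to any other row by swapping in that row's baseline and counts.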