Automatic page analysis for the creation of a digital library from newspaper archives

International Journal on Digital Libraries

Abstract.

Digital preservation of newspaper archives aims both at rescuing endangered paper material and at creating digital library services that allow full use of the archives by all interested parties. In this paper, we address a series of issues pertaining to the retro-conversion of newspapers, i.e., the conversion of newspaper pages into digital resources. We present an integrated approach that provides solutions to problems related to newspaper page image enhancement, segmentation of pages into various items (titles, text, images, etc.), article identification and reconstruction, and, finally, recognition of the textual components. Emphasis is placed on the most difficult intermediate stages: page segmentation, and article identification and reconstruction. Detailed experimental results, obtained from a large testbed of old newspaper issues, demonstrate the applicability of our methodology to the successful retro-conversion of newspaper material.

Additional information

Received: 21 December 1998 / Revised: 25 May 1999

About this article

Cite this article

Gatos, B., Mantzaris, S., Perantonis, S. et al. Automatic page analysis for the creation of a digital library from newspaper archives. Int J Digit Libr 3, 77–84 (2000). https://doi.org/10.1007/PL00021477

  • DOI: https://doi.org/10.1007/PL00021477