Automatically Geotagging Articles in the Welsh Newspapers Online Collection

Conference paper

DOI: 10.1007/978-3-319-25032-8_28

Cite this paper as:
Sapstead S., Daniel I., Clare A. (2015) Automatically Geotagging Articles in the Welsh Newspapers Online Collection. In: Bramer M., Petridis M. (eds) Research and Development in Intelligent Systems XXXII. Springer, Cham

Abstract

The National Library of Wales’ Welsh Newspapers Online collection comprises over 16 million articles from historic newspapers. It is stored in NLW’s institutional repository and is a rich source of historic text. The text of the articles has been extracted from the digitised images using optical character recognition (OCR). This project investigates methods of determining which articles can be automatically assigned to places within Wales. We use machine learning, text mining and OpenStreetMap data as a gazetteer.
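
The abstract does not describe the matching procedure, so the following is only a minimal sketch of the general idea of gazetteer-based geotagging, not the authors' method. It assumes a tiny hand-written gazetteer in place of OpenStreetMap data; the name `geotag`, the dictionary `GAZETTEER` and the approximate coordinates are all hypothetical and chosen for illustration.

```python
import re
from collections import Counter

# Hypothetical mini-gazetteer: lower-cased place name -> (latitude, longitude).
# Coordinates are approximate and for illustration only; a real gazetteer
# would be built from OpenStreetMap data.
GAZETTEER = {
    "aberystwyth": (52.415, -4.083),
    "cardiff": (51.481, -3.179),
    "bangor": (53.228, -4.129),
}


def geotag(text, gazetteer=GAZETTEER):
    """Return (place, coords, count) for the most-mentioned known place, or None."""
    # Lower-case the text and split it into alphabetic tokens.
    tokens = re.findall(r"[a-z]+", text.lower())
    # Count only the tokens that appear in the gazetteer.
    counts = Counter(t for t in tokens if t in gazetteer)
    if not counts:
        return None
    place, n = counts.most_common(1)[0]
    return place, gazetteer[place], n


article = (
    "A meeting was held at Aberystwyth; the Aberystwyth council "
    "and Cardiff delegates attended."
)
print(geotag(article))
```

This sketch matches single-word names only and takes the most frequent match as the article's location. It makes no attempt to handle OCR errors, multi-word or Welsh-language variants of place names, or names shared by several places, all of which a practical system would need to address.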

Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  1. Department of Computer Science, Aberystwyth University, Aberystwyth, Wales, UK
  2. National Library of Wales, Aberystwyth, Wales, UK