A semantic architecture for preserving and interpreting the information contained in Irish historical vital records


Irish Record Linkage 1864–1913 is a multi-disciplinary project, started in 2014, that aims to create a platform for analyzing events captured in historical birth, marriage, and death records by applying semantic technologies to annotate, store, and infer information from the data contained in those records. This enables researchers to investigate, among other things, to what extent maternal and infant mortality rates were underreported. We report on the semantic architecture, motivate the adoption of RDF and Linked Data principles, and elaborate on the ontology construction process, which was influenced by the requirements of both the digital archivists and the historians. The digital archivists’ concerns include preserving the archival record and following best practices in preservation, cataloguing, and data protection. The historians in this project wish to discover certain patterns in those vital records. An important aspect of the semantic architecture is the clear separation of concerns that reflects those distinct requirements (the transcription and archival authenticity of the register pages on the one hand, and the interpretation of the transcribed data on the other), which led to the creation of two distinct ontologies and knowledge bases. The advantage of this clear separation is that the transcription of register pages resulted in a reusable data set fit for other research purposes. These transcriptions were enriched with metadata according to best practices in archiving for ingestion in suitable long-term digital preservation platforms.
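
The separation of concerns between the two knowledge bases can be illustrated with a minimal sketch in plain Python, with triples held as tuples. This is not the project's implementation: the property names, the `ex:` identifiers, and the `INTERP` namespace are invented for illustration, and only the records PURL is taken from the paper's footnotes (the `#` suffix is an assumption).

```python
# Namespaces. Only the records PURL comes from the paper; everything else is hypothetical.
REC = "http://purl.org/net/irish-record-linkage/records#"
INTERP = "http://example.org/irl/interpretation#"  # assumed, illustrative only

# Knowledge base 1: faithful transcription of a register entry (archival layer).
records_kb = {
    ("ex:entry/1", REC + "causeOfDeathAsWritten", "Teething"),
    ("ex:entry/1", REC + "onRegisterPage", "ex:page/1"),
}

# Knowledge base 2: the historians' interpretation, pointing back at the transcription.
interpretation_kb = {
    ("ex:death/1", INTERP + "derivedFrom", "ex:entry/1"),
    ("ex:death/1", INTERP + "classifiedAs", "ICD-Rev1:82"),
}


def objects(kb, subject, predicate):
    """Return, sorted, all objects for a (subject, predicate) pair in a set of triples."""
    return sorted(o for s, p, o in kb if s == subject and p == predicate)


def transcribed_cause(death):
    """Follow an interpreted death event back to the text transcribed from the register."""
    causes = []
    for entry in objects(interpretation_kb, death, INTERP + "derivedFrom"):
        causes.extend(objects(records_kb, entry, REC + "causeOfDeathAsWritten"))
    return causes
```

Because interpretive statements live only in the second set, the first set can be reused or archived on its own without carrying any historian's reading of the data.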

    The terms and conditions of our data sharing agreement do not permit us to make public any data that would identify any individual [7]. One can access the historic records of the GRO at its dedicated research room in Dublin, but access is restricted per diem and there is an associated charge.

    A MySQL database (https://www.mysql.com/).

    With phpMyAdmin (https://www.phpmyadmin.net/).

    Available via http://purl.org/net/irish-record-linkage/records.

    Friend-of-a-Friend: http://xmlns.com/foaf/spec/.

    Int. List of Causes of Death, Rev.1 (1900), http://www.wolfbane.com/icd/icd1h.html.

    Int. List of Causes of Death, Rev.2 (1909). http://www.wolfbane.com/icd/icd2h.html.

    Department of Commerce and Labor, Bureau of the Census. International Classification of Causes of Sickness and Death. Washington: Government Printing Office (1910).

    We used the classification systems that existed in the studied historical period rather than today’s classification systems, because current systems reflect a different understanding of disease than that of the 19th century. Diseases may be classified by etiology (cause), pathogenesis (the mechanism by which the disease is caused), or symptom(s). Nosology is the branch of medicine that deals with the classification of disease. The historical evolution of classification systems, such as ICD or ICSD, is closely related to the historical and intellectual conditions of the era. Early disease classification used by physicians was largely based, philosophically, on humoral theories of disease, with occasional suggestions that malign outside influences might cause illness or death. The first version of ICD included the principle of classifying diseases by etiology. In later years, the focus shifted first to symptoms and then to the mechanisms of disease. For example, in the historical records we observed “Teething” as a cause of death. The International List of Causes of Death, Revision 1 provides a category for this, “82 Teething”, for infants. The latest versions of the same classification (ICD10 or ICD11) have no such category as a disease or cause of death. A second reason for adopting historical classification systems is that the number of categories has expanded dramatically as medical knowledge advanced, reflecting new insights into the causes, mechanisms, and symptoms of diseases. The International List of Causes of Death, Revision 1 (1900) had 191 items, whereas the current one has more than 14,400 different codes. Mapping the historical disease classification to current ones would require examining the historical definition of each category and mapping each one to the current understanding of diseases. With such a mapping, a historian could explore how medical knowledge and social conditions affect the formation of nosologies, but it would not have served our purpose of classifying historical causes of death in the 19th century.
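
As a minimal sketch of classifying a transcribed cause of death against a period classification, the lookup below uses a one-entry excerpt of Revision 1. Only category 82, “Teething”, is taken from the text. The function name and the exact-match strategy are assumptions for illustration, not the project's method.

```python
# One-entry excerpt of the International List of Causes of Death, Rev. 1 (1900).
# Only code 82 "Teething" is taken from the text; a real table would hold all 191 items.
ICD_REV1_1900 = {82: "Teething"}


def classify(cause_as_written, classification):
    """Return (code, label) for a transcribed cause, or None if no category matches.

    Matching is a case-insensitive exact comparison; real register entries would
    need far more tolerant matching (spelling variants, abbreviations, synonyms).
    """
    wanted = cause_as_written.strip().lower()
    for code, label in classification.items():
        if label.lower() == wanted:
            return code, label
    return None
```

A cause with no category in the chosen revision yields `None` rather than a forced match, which keeps unclassified entries visible for manual review.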

This publication has emanated from research conducted within the Irish Record Linkage, 1864–1913 project supported by the RPG2013-3 Irish Research Council Interdisciplinary Research Project Grant. The Digital Repository of Ireland (formerly NAVR) gratefully acknowledges funding from the Irish HEA PRTLI programme. Christophe Debruyne is currently supported by the Science Foundation Ireland (Grant 13/RC/2106) as part of the ADAPT Centre for Digital Content Technology Platform Research at Trinity College Dublin.

  • Historical vital records
  • Cultural heritage
  • Linked data
  • Ontology engineering
  • Preservation