1 Introduction

The creation of accessible, coherent and well-integrated datasets has been demonstrated to be an important catalyst in enabling researchers to produce innovative and groundbreaking research [19]. In the Humanities, even before consideration is given to the interpretation of sources, their accessibility and complex provenances often present researchers with considerable logistical and organisational challenges [22]. In research pertaining to the Holocaust and its historical legacy these challenges are particularly acute. For numerous reasons, including the intentional destruction of evidence [26] and the widespread dislocation of people and administrative bodies following the Second World War, Holocaust-related material and archival sources are highly fragmented and dispersed. In practice, this means that researchers seeking to access important Holocaust sources must in many cases navigate a complex trans-national patchwork of archives with different mandates, cataloguing practices, and systems of arrangement.

Overcoming barriers to effective trans-national Holocaust research is one of the principal goals of the European Holocaust Research Infrastructure (EHRI), an EU-funded research project, now in its third 4-year phase and soon to transition into a permanent organisation as a European Research Infrastructure Consortium (ERIC). For over a decade, EHRI has built tools to help researchers understand and navigate the complex landscape of Holocaust research [25], cataloguing sources across many hundreds of institutions and working with numerous archives, large and small, to integrate and contextualise their collection descriptions. A major part of these efforts is the EHRI Portal [5], an online database of Holocaust-related archival sources, which enables the integration and interlinking of archival descriptions and their associated metadata from around the world.

The development of the EHRI Portal, its technologies and APIs, along with various initiatives aimed at increasing the interconnectedness of its metadata, has been described elsewhere [3]. In this paper we focus on our efforts to expose the rich metadata contained in the EHRI Portal, derived from institutions around the world as well as from EHRI’s own archival specialists, in a manner compatible with the Semantic Web and capable of better integrating with the emerging network of Linked Open Data (LOD) sources. Semantic Web technologies offer a unique means by which entities can be identified unambiguously and linked across databases, and by which new data can be automatically inferred [4], capabilities which have been demonstrated to effectively support Digital Humanities activities [27]. The Knowledge Graph (KG) of Holocaust-related descriptions presented below, based on the EHRI Portal data, serves as a first step towards increasing the visibility of this kind of material and enabling other LOD publishers to link to EHRI’s entities.

Producing and publishing LOD is a challenge common to many GLAM institutions [1, 10], where datasets of research interest are frequently siloed in legacy databases and intermingled with more closely-held administrative data that is not amenable to being made public. As described in [5], the EHRI Portal, while developed under an “open-first” approach, also includes many affordances for restricting the visibility and accessibility of material that is private to individual users, concealed from view for copyright reasons, or otherwise sensitive. We believe that the approaches described in this paper therefore have wide applicability to other practitioners who have an interest in expanding the openness of their data, particularly archival institutions. In addition, many archival institutions face a technological deficit that makes it very hard for them to adapt to new technologies and migrate old data [32]. This KG, therefore, could serve as an example for Holocaust-related institutions that wish to experiment with Semantic Web technologies, and their possibilities, without being required to make more costly and disruptive technical investments. In the future, if more institutions decide to expose their data as LOD, connections could be made both to and from this KG, allowing it to act as an authority hub for Holocaust-related material and facilitating connections between different holding institutions (see Sect. 5.3).

The rest of this paper is structured as follows: Sect. 2 describes related work; in Sect. 3 we outline EHRI’s data and services and how the transformation was carried out; Sect. 4 introduces the KG and its main characteristics; in Sect. 5 we enumerate the challenges arising from this work and how we intend to address them in future. Finally, in Sect. 6 we draw conclusions from this work.

2 Related Work

Many works have addressed the modelling of historical data as KGs. One widely-cited example is Europeana [21], which offers metadata about different types of cultural heritage material. The level of detail offered by Europeana could, however, be considered insufficient for many researchers [30], and it does not seek to contextualise subject-specific material as EHRI does. With regard to the Second World War as a whole, the authors of [6] investigated a linking algorithm to enrich WWII collections with event information modelled as LOD. Similarly, WarSampo [23] offers a Finnish KG for WWII, integrating many different data sources and offering them through a single web interface. This KG models different perspectives such as events, persons, army units, places, etc. To the best of our knowledge, however, no KG has sought to model the archival landscape of Holocaust-related sources.

Even though no KG has yet taken a holistic view of Holocaust-related archival material, a number of relevant initiatives focused on a particular region or country have appeared in recent years. Others with a more trans-national perspective that address similar topics (e.g., Jewish material) do inevitably overlap with EHRI’s scope, such as the Yerusha platform, which offers centralised access to Jewish archival heritage. To date, however, there is a dearth of linkages between these platforms, complicating both users’ access to information, as they must navigate many overlapping sources, and the task of the holding institutions in keeping their metadata up-to-date in multiple places. This plethora of siloed alternatives strengthens the case for an alternative semantic landscape in which data could be more interoperable and authority hubs (today’s aggregators) could act as linking facilitators (see Sect. 5.3).

In Cultural Heritage a number of conceptual models, vocabularies and ontologies (some of them tied to a conceptual model) have emerged aiming to cover different aspects of the field, e.g., CIDOC-CRM [12], PROV-O [24], FRBR, NIE-INE, ROAR or ARKIVO [29], among others. As regards archives specifically, a number of attempts have been made to map the Encoded Archival Description (EAD) XML schema to these aforementioned ontologies. For example, converting from EAD to CIDOC-CRM has been addressed, among others, by [7, 15, 35, 36], with different levels of EAD semantic coverage. CIDOC-CRM, however, was originally intended for the interoperability of museum objects, with only some links to archives or libraries, which limits the establishment of metadata equivalents. Moreover, due to these differences in scope, domain experts will always be more comfortable with a domain-specific model that can integrate with broader-scoped ones and which, for archives, effectively unifies the widely-adopted International Council on Archives (ICA) standards [18]. More recently, a transformation tool from EAD to the Records in Contexts Ontology (RiC-O) has been released [14], using XSLT stylesheets as the base for the mapping. As explained later, EHRI extends the ICA standards to fit some specific needs, leading us to opt for a domain-specific conversion whose shared commonalities can later be contributed back to the whole community as an EAD to RiC-O mapping.

Inside the EHRI project there have been a number of cases where semantic and/or RDF technologies were employed, in addition to those mentioned below relating to EHRI’s data model. As we have written about previously [5], EHRI uses a graph database (Neo4j) as its underlying data store, and while it functions as a “property graph” rather than a native triplestore, it shares some common characteristics. We have on two occasions experimented with automatic mapping from the internal Neo4j schema to a LOD format, once using an interface to the SAIL (Storage and Inference Layer) API, and once using the NeoSemantics (n10s) Neo4j plugin. While both approaches showed promise in some respects, we did not put them into production due to either compatibility issues stemming from tightly-coupled dependencies, or limitations in query performance and scalability resulting from the on-the-fly translation approach.

A more recent undertaking aimed to enrich data already in the portal relating to controlled vocabularies for camps and ghettos, linking them with Wikidata and georeferencing them against GeoNames [2]. Although the goal of that work was not to fully convert EHRI Portal data to RDF, it established some of the foundations that we build on here. Within the wider EHRI consortium we also want to highlight the Holocaust Victims Names database hosted by the Fondazione Centro di Documentazione Ebraica Contemporanea (CDEC) [8], for which a Shoah ontology was developed, reusing and extending existing ontologies like FOAF and BIO (extended in bio-ext) to model the information about these victims. This example motivated us to offer EHRI Portal data as LOD so that initiatives such as this one from partner institutions can be linked and jointly queried by users.

3 EHRI’s Data and Transformation

3.1 EHRI’s Data Model

Data in the EHRI Portal is based around three main entities: countries; archival institutions; and archival descriptions. Countries constitute an entry point and provide information on the situation of Holocaust research in the relevant country. Collection-holding institutions (CHIs)—typically archives or bodies with similar mandates—are grouped within their host country and include relevant contact details along with additional context and information pertaining to their holdings, as described in the International Standard for Describing Institutions with Archival Holdings (ISDIAH). Archival descriptions are contained within their holding institution and store information aligned with the General International Standard Archival Description (ISAD(G)). One notable characteristic of archival descriptions is that they can be nested to arbitrary depth to form a hierarchy, modelling the physical arrangement of the described materials (fonds, series, subseries, items, etc.).

In addition to these three main entities, the EHRI Portal also employs entities for enriching and indexing archival metadata. Authority sets are collections of people, families, or corporate bodies—as defined in the International Standard Archival Authority Record for Corporate Bodies, Persons and Families (ISAAR(CPF))—whilst a set of controlled vocabularies holds content-specific terms defined by the project for, at present, subject headings and historical places. These authoritative entities are linked from the access points and creators sections of archival descriptions, serving as a connecting point between collections and facilitating thematic search.

Finally, this structure is augmented by annotations and links, both modelled as first-class entities that can connect and add additional information to those discussed above. In the current EHRI Portal, vocabularies, annotations, and links are the only parts of the data model derived from and partially aligned with RDF, namely the Simple Knowledge Organisation System (SKOS) [28] in the case of vocabularies, and the Web Annotation Data Model [33] framework for annotations and links. EHRI’s use of the relevant standards for linking and indexing metadata records is discussed further in [3].

3.2 Ontology Alignment

As noted above, EHRI’s data is primarily aligned with the conceptual standards of the International Council on Archives (ICA). As a result, import and export of metadata pertaining to archival descriptions from the EHRI Portal was designed around EAD [31], the most well-established format derived from ISAD(G). However, while EAD is widely adopted in the archival field, it inherits the limitations of non-semantic XML technologies, as discussed in [17], along with other issues stemming from its flexibility as an encoding medium [34].

Seeking to address these limitations, the ICA has been working on a new conceptual model of the archival domain that uses a graph as its data model. Dubbed the Records in Contexts Conceptual Model (RiC-CM) [20], it is currently on its second draft version, v0.2, released in 2021, and offers a companion ontology, RiC-O, for modelling the data in RDF. As this specification is intended to supersede EAD in the future, we have used it as our base ontology for the transformation of EHRI’s data into semantic form.

Using RiC-O 0.2 as a foundation has distinct benefits. It allows us to implement a version of EHRI’s data using Records in Contexts (RiC) on top of the existing implementation, letting us test the new data model before the stable version is released. It presents a future common alignment point for other institutions that currently use ISAD(G) (and/or ISAAR) for data publication and will likewise, in future, seek to make a similar transition, potentially building on EHRI’s mapping rules for their own use cases. And it constitutes a zero-cost demonstration for EHRI partner institutions of how RiC works and its potential benefits, without them having to make a substantial investment of their own in mapping or adapting their in-house data sources.

Since not all of our required semantics are covered by the current RiC-O draft, however, it has been necessary to extend the ontology in some respects. Following best practice in ontology modelling, we have tried to reuse other ontologies and vocabularies as much as possible, using schema.org to complete some fields missing from RiC-O. Since version 3.5, schema.org has offered a set of classes dedicated to archives; these classes and their fields complement and align well with those in RiC-O. Those fields still missing, but necessary from our data perspective, have been included as properties of a future EHRI ontology (e.g., https://lod.ehri-project-test.eu/ontology#).
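To make this layering concrete, the following Turtle sketch shows how a single description might combine the three sources of terms. The instance URIs, the choice of schema:holdingArchive, and the ehri: property are illustrative assumptions rather than extracts from the actual KG:

```turtle
@prefix rico:   <https://www.ica.org/standards/RiC/ontology#> .
@prefix schema: <http://schema.org/> .
@prefix ehri:   <https://lod.ehri-project-test.eu/ontology#> .

# Hypothetical archival unit: RiC-O terms first, schema.org's archival
# classes (available since v3.5) as a complement, and an EHRI-specific
# property only where neither ontology suffices.
<https://lod.ehri-project-test.eu/units/example-unit>
    a rico:RecordSet, schema:ArchiveComponent ;
    rico:title "Example collection" ;
    schema:holdingArchive <https://lod.ehri-project-test.eu/institutions/example-chi> ;
    ehri:exampleMissingField "illustrative value" .
```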

3.3 Data Transformation

Construction of the KG consists of two main processes: harvesting and transformation. For the harvesting process we have made use of the existing (JSON-based) EHRI API endpoints as a more open and reproducible alternative to requiring privileged access to the internal database. Specifically, we have used the REST-style EHRI Search API for harvesting information about countries, archival institutions and archival descriptions, and the GraphQL API [9] for extracting additional metadata such as links between entities. For controlled vocabularies we use the existing RDF-format data, but incorporate additional harvested links in the process of building the complete KG.
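As an illustration of the harvesting step, the query below sketches a paginated GraphQL request of the kind used to extract entity metadata. The field names are assumptions for illustration and may differ from the actual EHRI GraphQL schema:

```graphql
# Hypothetical query: fetch one page of documentary units with their
# identifiers; subsequent pages would be requested via the pagination
# mechanism the API offers.
{
  documentaryUnits(first: 50) {
    items {
      id
      identifier
    }
  }
}
```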

To process the harvested data we make use of the ShExML language [16] and engine, executing mapping rules for each entity in succession, following the paginated structure of responses from the EHRI APIs. This permits resumption of the transformation if required, and was deemed necessary given the amount of data present in the EHRI Portal, which exceeds 400,000 archival descriptions. The execution of these mapping rules produces several Turtle files that are then merged together, exploiting the compositional property of RDF, along with the pre-existing SKOS-format vocabularies. All the materials and resources used for the harvesting and transformation process are open source and can be consulted on GitHub.
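For readers unfamiliar with ShExML, the heavily simplified sketch below shows the general shape of such a mapping rule: a source is declared, an iterator extracts fields from it, and a shape emits triples. The API URL, JSONPath expressions, and field names are illustrative assumptions; the actual mapping rules can be found in the GitHub repository:

```shexml
PREFIX : <https://lod.ehri-project-test.eu/>
PREFIX rico: <https://www.ica.org/standards/RiC/ontology#>

SOURCE units_json <https://portal.ehri-project.eu/api/v1/search?type=DocumentaryUnit>
ITERATOR unit_iterator <jsonpath: $.data[*]> {
    FIELD id <id>
    FIELD title <attributes.name>
}
EXPRESSION units <units_json.unit_iterator>

:Unit :[units.id] {
    rico:title [units.title] ;
}
```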

4 Dataset

4.1 Approximate Size and Characterisation

The KG consists of 6,571,095 triples, which in Turtle format comprise 767 MB of data. We have published this KG using Apache Jena Fuseki as the triple store and the LodView viewer in order to allow exploration of the data. The KG also provides a SPARQL endpoint for more complex queries.
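For orientation, figures such as the triple count above can be reproduced directly against the SPARQL endpoint with a simple aggregate query:

```sparql
# Count all triples in the default graph of the EHRI KG.
SELECT (COUNT(*) AS ?triples)
WHERE { ?s ?p ?o . }
```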

As mentioned above, we have used RiC-O as the primary modelling ontology, with some additional fields aligned to schema.org. In these cases, we have double-typed the instances that combine predicates from both specifications, allowing for better discoverability and data completeness; the double-typed classes are country and archival institution, each carrying a class from both ontologies. In the future EHRI ontology this will be made explicit with a dedicated class that inherits from both super classes. At the same time, and following the same principle, we have provided entity names under all three applicable name predicates, allowing more standardised access for existing agents.
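The following Turtle sketch illustrates the double-typing pattern; the specific class and property choices shown (rico:CorporateBody with schema:ArchiveOrganization, and the three name predicates) are plausible illustrations rather than an authoritative extract from the KG:

```turtle
@prefix rico:   <https://www.ica.org/standards/RiC/ontology#> .
@prefix schema: <http://schema.org/> .
@prefix rdfs:   <http://www.w3.org/2000/01/rdf-schema#> .

# Hypothetical double-typed archival institution, with its name
# repeated under several predicates for broader discoverability.
<https://lod.ehri-project-test.eu/institutions/example-chi>
    a rico:CorporateBody, schema:ArchiveOrganization ;
    rico:name "Example Holding Institution" ;
    schema:name "Example Holding Institution" ;
    rdfs:label "Example Holding Institution" .
```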

Inverse relations are always provided where possible, as the RiC-O specification suggests, letting users navigate the graph in a bidirectional fashion and making the graph more predictable; each forward property in the KG is accompanied by its declared inverse.

In order to better interconnect with existing or future KGs, and to allow users to explore beyond our dataset alone, we have provided the following links. For countries, we have connected each country to its DBpedia instance. In the case of archival institutions, we have linked them to the institution’s main webpage, which could potentially provide additional information in semantic format. In addition, for the controlled vocabularies concerning camps and ghettos (which were already in RDF), many entities provide a link to the equivalent entity in Wikidata [13], as established in [11]. A class diagram can be consulted in Fig. 1.
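The sketch below illustrates such outbound links in Turtle. The entity URIs and the QID are placeholders, and owl:sameAs is assumed as the linking predicate for the DBpedia and Wikidata connections:

```turtle
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix dbr: <http://dbpedia.org/resource/> .
@prefix wd:  <http://www.wikidata.org/entity/> .

# Hypothetical country linked to DBpedia, and a camp/ghetto term linked
# to Wikidata; the URIs and QID below are placeholders, not real mappings.
<https://lod.ehri-project-test.eu/countries/example-country>
    owl:sameAs dbr:Netherlands .
<https://lod.ehri-project-test.eu/terms/example-camp>
    owl:sameAs wd:Q0000000 .
```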

Fig. 1. Class diagram representing the data model followed in the conversion to RDF, using RiC-O and schema.org as the base ontologies.

4.2 Post-transformation Enrichment

In addition to the triples and links generated from the batch process, there are other kinds of links that can be included on a case-by-case basis, and that are outside the scope of the batch transformation because they potentially require manual verification and updating. For now, we perform two post-transformation enrichments: linking languages with their counterparts in DBpedia, and linking EHRI authorities (persons and corporate bodies) to their counterparts in the CDEC dataset.

In the case of DBpedia, languages are easily linked based on label similarity with the corresponding DBpedia instances. For this purpose a federated SPARQL query is run on the resulting KG and the results are supervised by content experts. For the CDEC person database links, we run another federated query that, similarly to the one used with DBpedia, establishes the links between EHRI and CDEC authority files. These triples are verified by CDEC staff and then retained for future use, such that only previously unseen relations need to be validated. Both generated link datasets are uploaded to the main triple store and added to the KG. These post-transformation enrichments allow federated SPARQL queries to be executed over multiple KGs, letting users answer more complex questions, like the example given in Listing 1. More examples can be found on the EHRI KG landing page.

Listing 1. Example of a federated SPARQL query combining the EHRI KG with an external dataset.
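A hedged sketch of the kind of federated query this enables is shown below: it retrieves, for EHRI person authorities linked to CDEC counterparts, the names recorded on the CDEC side. The owl:sameAs links, the foaf:name predicate, and the endpoint URL are illustrative assumptions:

```sparql
PREFIX owl:  <http://www.w3.org/2002/07/owl#>
PREFIX rico: <https://www.ica.org/standards/RiC/ontology#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>

# For each EHRI person with a CDEC counterpart, fetch the CDEC name.
SELECT ?ehriPerson ?cdecPerson ?name
WHERE {
  ?ehriPerson a rico:Person ;
              owl:sameAs ?cdecPerson .
  SERVICE <http://example.org/cdec/sparql> {   # placeholder endpoint
    ?cdecPerson foaf:name ?name .
  }
}
LIMIT 100
```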

5 Challenges and Future Work

5.1 Mapping Copies and Originals

Even though the majority of the data in the EHRI Portal is mapped using the techniques described in this paper, there are still some aspects for which the available ontologies do not provide satisfactory solutions. In other cases, solutions will require further consensus from the community.

One significant challenge pertaining to Holocaust-related material is the amount of copying of material that has been carried out by different archives around the world, which have then proceeded to describe the same underlying material in their own specific in-house style. From the very beginning, the EHRI Portal has had, as one of its main goals, the recontextualisation of Holocaust sources. In the project’s second phase a system was introduced allowing descriptions of copied material to link to the holder of the original sources and/or to those sources directly [3]. Where these connections are made, users can now have a clearer view of the different versions of original archival material available to them in various holding institutions.

The EHRI Portal supports four types of links, depending on the specificity of the available information: 1) copy archival unit to original archival unit (the archival unit was copied from this specific original archival unit); 2) copy archival institution to original archival institution (the institution holds copies from another institution, without specifying which); 3) copy archival unit to original archival institution (the archival unit was copied from the mentioned archival institution, without knowing from which exact collection it was copied); and 4) copy archival institution to original archival unit (the archival institution holds copies of this original archival unit, without knowing which copied archival unit holds the copies). All links can be interpreted bidirectionally: for example, archival unit X was copied from original archival unit Y, or original archival unit Y was copied into archival unit X.

Looking into the current RiC-O draft, the copy-related properties seem to cover the same semantics explained above. However, if we look at the domain and range of these properties, we see that they are bound to record resources, meaning that the relation can only be established between two entities of this type or its descendants. Ultimately, this translates to being able to map only one of the four link types supported in the EHRI Portal. A potential future solution will be to introduce the custom properties used in the EHRI Portal as properties of the planned EHRI ontology.
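One possible shape for such custom properties is sketched below in Turtle, taking link type 3 as an example; the property names and the exact domain and range classes are assumptions for illustration, not a committed design:

```turtle
@prefix ehri: <https://lod.ehri-project-test.eu/ontology#> .
@prefix rico: <https://www.ica.org/standards/RiC/ontology#> .
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

# Hypothetical property for link type 3 (copy archival unit to original
# archival institution); the other three types would be declared
# analogously, each paired with a declared inverse.
ehri:wasCopiedFromInstitution a owl:ObjectProperty ;
    rdfs:domain rico:RecordResource ;
    rdfs:range  rico:CorporateBody ;
    owl:inverseOf ehri:isOriginalInstitutionOf .
```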

In addition, the RiC-CM puts the emphasis on the distinction between a Record Resource and an Instantiation, the latter being the representation of the record in a digital or physical form. In this sense, we can see copies as different instantiations of the same record where, for example, the original may be a deed and the copy a microfilm, but in essence both refer to the same original material. Looking at the data already mapped, however, this presents an issue, as archival units (Record Resources in RiC-CM) are assumed to be held by only one institution in the EHRI Portal, with identifiers derived from this hierarchy. In order to maintain this information, therefore, we are compelled to continue creating only one instantiation per Record Resource and to make the links between them.

One alternative would be to use the corresponding RiC-O property to indicate that the resource is in fact the same. Unfortunately, this creates some additional verbosity in our mapped data, hindering the clarity of the graph and potentially affecting how users navigate it. While this does not constrain the use of the ontology for our mapped data, clarifying the semantics of such cases in RiC-O would, as this case shows, benefit both data producers and consumers. We will therefore follow the development of RiC-O closely and adapt our conversion process if this point becomes clearer in future revisions.

5.2 Incremental Updates

As mentioned in Sect. 3.3, we opted for a batch approach for the conversion of data from the EHRI APIs to the KG. This means that at some point data could be added, updated or deleted in the EHRI Portal, making parts of the KG obsolete or incomplete. Many strategies could be adopted to cope with this issue. One possible approach would be to execute the batch process as a nightly task and exchange the old KG for the newly generated one. However, given the size of the dataset, this process would be time-consuming and fairly inefficient. We have therefore designed a workflow that, while based on the batch approach, incorporates only the updates that have taken place since the previous harvesting operation, without impacting overall performance. Our envisioned solution is to process change events from the EHRI Portal as a stream, incorporating them into an append-only historical log in which all changes since the creation of the KG can be tracked. This would facilitate not only processing the changes as they arrive, but also reconstructing update events where it is necessary to replicate them for further migrations or installations, or to recover from downtime. From each of these events it is possible to download the new contents from the harvesting source (Search API or GraphQL API) and, depending on the type of event (creation, deletion, update), run the necessary SPARQL INSERT and/or DELETE queries against the SPARQL endpoint. This workflow can be seen in Fig. 2. We will undertake the implementation of the proposed data update architecture as future work in order to keep the KG up-to-date in a more timely and efficient manner.
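To make the event handling concrete, the sketch below shows the kind of SPARQL Update an event processor might issue for an "update" event: the entity's existing triples are removed and the freshly harvested ones inserted. The entity URI and the replacement triples are illustrative:

```sparql
PREFIX rico: <https://www.ica.org/standards/RiC/ontology#>

# Handle an "update" event for a single entity: drop its current
# outgoing triples, then insert the re-harvested description.
DELETE WHERE {
  <https://lod.ehri-project-test.eu/units/example-unit> ?p ?o .
};
INSERT DATA {
  <https://lod.ehri-project-test.eu/units/example-unit>
      a rico:RecordSet ;
      rico:title "Updated example title" .
}
```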

Fig. 2. Architecture of the envisioned stream-based incremental update system to align the KG with the EHRI Portal’s live dataset. The EHRI Portal will emit events in the form of Server-Sent Events (SSE); an SSE handler will process them and send them to a topic in a distributed event streaming platform (e.g., Apache Kafka); finally, an event processor subscribed to the streaming platform will create the corresponding SPARQL INSERT and/or DELETE queries based on the events’ contents and post them to the SPARQL endpoint of the triple store.

5.3 EHRI KG as an Authority Hub

The EHRI Portal acts as an aggregator for information about Holocaust documentation, allowing users to seamlessly access metadata about collections, and institutions to contextualise their own records in a larger, trans-national landscape. This enhanced contextualisation happens in the EHRI Portal, where researchers can benefit from it, but the metadata providers themselves cannot easily reflect it in their own data. In this sense, the centralised approach presents challenges when it comes to improving access for automated agents and the reusability of the data contributed by the institutions themselves.

Federated approaches, however, pose other challenges, such as how to actively manage the links between different nodes, how to manage widely-used persistent and unique identifiers, and how to foster node discoverability. These challenges can be mitigated by an aggregator, which can promote visibility, link an institution’s data to other data in the network, and manage coherent and consistent identifiers across the network. We should, therefore, take these advantages and put them to work in a federated manner. In this regard, we see this KG as a first step towards establishing an authority hub (as opposed to a data aggregator) where the source of truth is the institutions’ own data. This allows for a more lightweight KG in which only general metadata about collections is served, with the rest available on demand (via Semantic Web technologies) according to users’ requirements. The authority hub, then, would have the responsibility of maintaining the links among different providers’ items, allowing institutions to search across the network through the hub, or even to reuse the data in their own systems.

Moreover, as more institutions start to work following these principles, fewer data integration procedures will be required, reducing the social, technical, and institutional challenges of keeping aggregated metadata sufficiently up-to-date.

5.4 Engaging User Communities

Institutions dealing with Holocaust-related material have varying degrees of technical capacity, ranging from those that already offer data as LOD, like CDEC, to others whose data is either not shareable or is available only as PDF files or in some other form similarly unsuitable for machine processing. In addition, technical investments typically come at a high cost for these institutions, both financially and in terms of staff training. Advances in this area, therefore, will not be made lightly, and we envisage the KG presented here as a way to showcase the benefits that Semantic Web technologies can deliver to Holocaust-related institutions without requiring them to make financial commitments.

We anticipate that users of the KG will be as varied as users of the EHRI Portal itself, which currently counts around 35,000 monthly user sessions and has grown at over 20% per year for the past four years. Among these we find researchers and educators who are engaged with digital methods and for whom this KG could answer new research questions. We also find archivists and other knowledge professionals who see in the EHRI Portal an opportunity to increase their collections’ visibility, discoverability and outreach. For the latter, the greater interconnectedness of archival descriptions explored in this paper would allow them to better contextualise their work alongside that of other institutions handling similar material, thereby increasing the aforementioned visibility and discoverability of sources. This should make Semantic Web technologies a stronger consideration in driving technical choices within these organisations.

It is also worth noting that while this KG is focused on Holocaust-related material (the scope of the EHRI project), the approach taken here is subject-matter agnostic and therefore just as applicable to the wider archival field, and indeed to any users of ICA conceptual standards such as ISAD(G). If we can encourage more institutions within the EHRI consortium and the wider archival space to publish their data as LOD, connected to this KG, we will be able to offer more information (e.g., via federated SPARQL queries) in the EHRI Portal, improving its completeness and usefulness to users of all stripes. In recent months this work has been presented within the consortium, where feedback has been positive, particularly in regard to EHRI seeking an enlarged role as an authority hub and strengthening connections with platforms like Wikidata and other KG projects in overlapping domains.

In light of these circumstances, the EHRI KG is currently available for public use in a testing capacity, in order to gain feedback on the data representation and experience in running the services. This is made explicit by the use of a placeholder domain name for URIs containing the “-test” suffix, which will be replaced by the permanent “ehri-project.eu” domain in use elsewhere for EHRI’s production services. When this migration takes place, web redirections will be put in place from the test to the production domain in order to ensure that early adopters can straightforwardly migrate to the production platform.

6 Conclusions

Given that the RiC conceptual model has not yet reached its first non-draft release, the work described here is also evolving. We have described in Sect. 3 the general shape of EHRI’s data, how we have approached schema alignment, and where it has been necessary to extend or work around limitations of the ontology. Likewise, we have described how the transformation is implemented, using EHRI’s existing APIs and the ShExML mapping language. The resulting dataset, described in Sect. 4, is further enriched with connections to more general KGs, such as DBpedia, and to others within the same domain, such as CDEC’s person database. In Sect. 5 we described a number of planned advancements to the EHRI KG, including the incorporation of more information about the provenance of Holocaust sources.

The vision described above in Sect. 5.3, of a distributed LOD environment in which each custodian of Holocaust-related material can publish its own metadata, integrating with a common set of vocabularies and authorities curated by domain-specific entities like EHRI or more general ones like Wikidata, is appealing for many reasons. Researchers can benefit enormously from efforts to bring coherence and a deeper level of contextualisation to domains like Holocaust research which are, as discussed in the introduction to this paper, fraught with historical and organisational complexity. Centralised approaches to data integration, whilst necessary at today’s level of LOD adoption in the archival domain, are complex to administer and invariably compromised in how up-to-date and comprehensive they can manage to be.

By expanding EHRI’s LOD capabilities, building on efforts by the creators of RiC and other such systems, we can hope to foster a greater degree of knowledge interoperability in the domain of Holocaust research. If more data providers can justify the necessary technical investments to eventually publish their own linked datasets, perhaps using the techniques described here as a blueprint with which to do so, this will correspondingly benefit EHRI’s goals in contextualising Holocaust sources and bringing greater clarity to the domain.

Supplemental Material Availability: The presented Knowledge Graph and the accompanying documentation are available at: https://lod.ehri-project-test.eu/. The source code for the conversion can be openly consulted at https://github.com/herminiogg/EHRI2LOD and a persistent version of the code used for this paper can be found at https://doi.org/10.5281/zenodo.8185859.