1 The LUCID Endpoint

The LUCID endpoint provides the necessary technology stack to manage and publish Linked Data, as well as to consume Linked Data from other LUCID endpoints.

This includes authentication and authorization mechanisms that guarantee data consumers access only the data the endpoint owner explicitly allows. Consumer authentication is handled via OAuth2 [6]. Access control rules can be defined at the level of individual named graphs.
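For illustration, the following minimal sketch shows what such a named-graph-level rule check could look like; the rule structure, client identifier, and graph IRI are hypothetical and not the actual LUCID configuration format.

```python
# Illustrative sketch only: a minimal named-graph access check, assuming a
# simple mapping from OAuth2 client identifiers to readable graphs.
# Names and structure are hypothetical, not the LUCID implementation.

ACCESS_RULES = {
    # OAuth2 client id -> set of named graph IRIs the client may read
    "client-4711": {"https://example.org/graphs/master-data"},
}

def may_read(client_id: str, named_graph: str) -> bool:
    """Return True if the authenticated client may read the named graph."""
    return named_graph in ACCESS_RULES.get(client_id, set())

# A request authenticated as "client-4711" may read the master data graph;
# requests for any other graph (or from unknown clients) are denied.
assert may_read("client-4711", "https://example.org/graphs/master-data")
assert not may_read("client-4711", "https://example.org/graphs/internal")
```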

Once a local graph has been modified, the endpoint notifies its subscribers by sending the latest change sets for inclusion. An example of this approach is depicted in Fig. 1. Because only the modifications are sent instead of the complete new dataset, subscribers can easily identify the changed triples without having to compute expensive data diffs themselves. Subscriber endpoints can then apply those modifications to their local dataset clones automatically. To describe these dataset changes, a vocabulary and exchange format are needed, which we explain in the next section.
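As a rough sketch of the subscriber side, the following Python fragment applies a received change set to a local dataset clone using rdflib; the change-set structure (plain lists of inserted and deleted triples per named graph) and all IRIs are assumptions for illustration, and the actual payload format is the one introduced in the next section.

```python
# Sketch of applying a received change set to a local dataset clone with
# rdflib. The change-set structure is an assumption for illustration; the
# real LUCID payload follows the eccenca Revision Vocabulary (Sect. 2).
from rdflib import Dataset, URIRef, Literal

def apply_change_set(dataset: Dataset, graph_iri: str, inserted, deleted):
    """Apply triple deletions and insertions to one named graph."""
    graph = dataset.graph(URIRef(graph_iri))
    for triple in deleted:
        graph.remove(triple)      # drop triples deleted upstream
    for triple in inserted:
        graph.add(triple)         # add triples inserted upstream

# Example: the publisher changed a company's phone number.
clone = Dataset()
g = "https://publisher.example.org/dataset/master-data"
site = URIRef("https://publisher.example.org/resource/site-1")
phone = URIRef("http://xmlns.com/foaf/0.1/phone")
apply_change_set(
    clone, g,
    inserted=[(site, phone, Literal("+49 341 0000-1"))],
    deleted=[(site, phone, Literal("+49 341 0000-0"))],
)
```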

Fig. 1. The LUCID endpoint data management and publication process consists of the following steps: (1) modification of a dataset via SPARQL or a GUI; (2) update of the dataset as well as revisioning; (3) publication of the updates to all subscribers; (4) application of the changes to the cloned datasets on the subscriber side; (5) user notification.

2 The Eccenca Revision Vocabulary

In order to keep track of modifications on the local quad store and to notify subscribers about them, we developed the eccenca Revision Vocabulary. This vocabulary is modelled in OWL (OWL 2 DL profile) and extends and reuses several concepts of the PROV-O ontology [7].

Unlike other approaches, such as [1], which describe changes on higher semantic levels, our approach is based on triple (or rather quad) changes: each revision or modification event (called a commit) contains a diff representing the changed (inserted and/or deleted) quads. This simple model enables applications to rebuild and revert each commit as well as to merge diverged evolution branches, as explained in [3].
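The following minimal sketch (plain Python sets, not the actual implementation) illustrates why this quad-diff model makes commits revertible: reverting a commit simply swaps its insertion and deletion sets.

```python
# Minimal sketch of commit application and reversal over sets of quads.
# Data structures are illustrative, not the LUCID implementation.

def apply_commit(store: set, inserted: set, deleted: set) -> set:
    """Apply a commit to a set of quads."""
    return (store - deleted) | inserted

def revert_commit(store: set, inserted: set, deleted: set) -> set:
    """Undo a commit by swapping its insertion and deletion sets."""
    return apply_commit(store, inserted=deleted, deleted=inserted)

# Round trip: applying and then reverting a commit restores the store,
# provided the deleted quads were actually present beforehand.
before = {("s", "p", "o1", "g")}
commit = {"inserted": {("s", "p", "o2", "g")},
          "deleted": {("s", "p", "o1", "g")}}
after = apply_commit(before, commit["inserted"], commit["deleted"])
assert revert_commit(after, commit["inserted"], commit["deleted"]) == before
```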

Our data modelling approach is built on top of the one proposed in [5], but instead of holding a separate revision history for each revisioned named graph, our approach keeps a unified revision history across any number of named graphs. This enables applications to track revisions across different graphs or for the whole quad store.

Figure 2 illustrates the main parts of the vocabulary: the Commit class defines an instantaneous event containing a set of graph revisions. This class also holds the metadata associated with the event, such as author, date, and commit message. Each revision (modelled as the Revision class) refers to a specific named graph that was changed. Changes in an RDF store are defined either as triple insertions (deltaInsertion) or deletions (deltaDeletions), in line with the approach in [2].
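To make the model more concrete, the following hedged example shows what a commit instance could look like in Turtle, parsed here with rdflib. The rev: namespace IRI, the commitMessage/hasRevision/revisionOf property names, and the choice of PROV-O properties for author and date are illustrative assumptions; only deltaInsertion and deltaDeletions follow the names used above.

```python
# Hedged example of a commit instance, loosely following the class and
# property names given in the text (Commit, Revision, deltaInsertion,
# deltaDeletions). The "rev:" IRI is a placeholder, not necessarily the
# published eccenca Revision Vocabulary IRI; PROV-O terms are standard.
from rdflib import Graph

COMMIT_TTL = """
@prefix rev:  <https://vocab.example.org/revision/> .   # placeholder IRI
@prefix prov: <http://www.w3.org/ns/prov#> .
@prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .

<urn:commit:42> a rev:Commit ;
    prov:wasAssociatedWith <https://publisher.example.org/user/alice> ;
    prov:atTime "2016-04-01T12:00:00Z"^^xsd:dateTime ;
    rev:commitMessage "Update phone number of site-1" ;
    rev:hasRevision <urn:commit:42:rev:1> .

<urn:commit:42:rev:1> a rev:Revision ;
    rev:revisionOf <https://publisher.example.org/dataset/master-data> ;
    rev:deltaInsertion <urn:commit:42:rev:1:insertions> ;
    rev:deltaDeletions <urn:commit:42:rev:1:deletions> .
"""

g = Graph().parse(data=COMMIT_TTL, format="turtle")
print(g.serialize(format="turtle"))
```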

Further work to support branching, commit signing and blank nodes is in progress.

Fig. 2. LUCID revision vocabulary & example commit instance.

3 Demonstration Use-Case: Master Data Management

Our setup for the demonstration of the LUCID endpoint implements a very basic but pressing use case in business-to-business communication: master data management. Enterprise master data is the single source of basic business information used across all systems, applications, and processes of an enterprise. This includes resources such as persons, company sites, and subsidiaries as well as contact details.

Our proposed demo consists of the following parts:

  • Publishing of master data datasets with a browser-based user interface: A LUCID endpoint provides a dataset for each account. The account owner is free to upload any data to this dataset. All resources from the dataset namespace are available as Linked Data and are enabled for publish/subscribe as well as OAuth (in case the dataset is non-public). In addition to the generic access via SPARQL, the user can utilize a master data management application. This single-page JavaScript application allows for the creation of master data resources such as company subsidiaries and contact details. The RDF data model for these resources is based on the master data model from Odette International, a collaboration platform for the automotive supply chain.

  • Versioning of the dataset changes on the SPARQL endpoint backend: All changes to the user dataset are logged as part of the internal LUCID endpoint triple store. The changed triples are calculated directly by the SPARQL query processor and added to the versioning store.

  • Subscription to datasets of another LUCID endpoint by employing the dataset URL: All resources that are accessible as Linked Data are also enabled for publish/subscribe activities. The user interface can manage subscriptions to other endpoints and provide a preview of the incoming data.

  • A publish/subscribe mechanism which uses commit push notifications based on the eccenca Revision Vocabulary described in Sect. 2: For each resource, a change log dataset is available, which provides the latest Commit information. In addition, notifications with this Commit information as payload are pushed to all subscribers whenever a change occurs. The subscribing endpoint adds the incoming data to its dataset clone and maintains the change log in order to provide versioning information to the user (a minimal subscriber-side sketch follows this list).
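As a rough illustration of the subscriber side of this mechanism, the following sketch accepts pushed Commit payloads over HTTP and records them in a local change log. The route path, payload media type, and use of Flask are assumptions for illustration, not the actual LUCID implementation.

```python
# Sketch of a subscriber-side handler for commit push notifications. The
# HTTP route and Turtle payload are assumptions; the real LUCID endpoints
# exchange Commit payloads based on the eccenca Revision Vocabulary (Sect. 2).
from flask import Flask, request
from rdflib import Dataset, Graph

app = Flask(__name__)
clone = Dataset()          # local clone of the subscribed dataset
change_log = Graph()       # local change log holding received Commit data

@app.route("/notifications", methods=["POST"])
def receive_commit():
    """Accept a pushed Commit payload and record it in the change log."""
    payload = request.get_data(as_text=True)
    commit_graph = Graph().parse(data=payload, format="turtle")
    change_log += commit_graph   # keep versioning information for the user
    # Applying the contained insertions/deletions to `clone` would follow
    # here, e.g. with a helper like the apply_change_set() sketch above.
    return ("", 204)

if __name__ == "__main__":
    app.run(port=8080)
```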

Fig. 3. Screenshots of the master data management user interface: (left) Any user can subscribe to changes of other datasets by employing the subscription URL provided by the publisher. After completing the subscription process, the current version of the data model is fetched with an HTTP Linked Data request. (right) The publisher of a dataset can create its company master data, including sites, contacts, and other structures, using the master data manager, a browser-based user interface to an OAuth2-enabled [6] SPARQL endpoint.

Figure 3 depicts two screenshots of the master data management user interface, which sits on top of the versioning- and OAuth2-enabled SPARQL endpoint.