Handling Inconsistencies Due to Class Disjointness in SPARQL Updates
 5 Citations
 2k Downloads
Abstract
The problem of updating ontologies has received increased attention in recent years. In the approaches proposed so far, either the update language is restricted to sets of ground atoms or, where the full SPARQL update language is allowed, the TBox language is restricted so that no inconsistencies can arise. In this paper we discuss directions to overcome these limitations. Starting from a DLLite fragment covering RDFS and concept disjointness axioms, we define three semantics for SPARQL instancelevel (ABox) update: under cautious semantics, inconsistencies are resolved by rejecting updates potentially introducing conflicts; under brave semantics, instead, conflicts are overridden in favor of new information where possible; finally, the fainthearted semantics is a compromise between the former two approaches, designed to accommodate as much of the new information as possible, as long as consistency with the prior knowledge is not violated. We show how these semantics can be implemented in SPARQL via rewritings of polynomial size and draw first conclusions from their practical evaluation.
Keywords
Variable Binding Triple Pattern Triple Store Class Disjointness Disjointness Condition1 Introduction
RDF has become one of the most important data formats for interoperability, knowledge representation and querying. SPARQL, the W3C standardized language for managing RDF data [11], has grown to offer great power and flexibility of querying, including support for efficient reasoning, rooted in more than a decade of intensive research in description logics. With respect to updates however, SPARQL is currently far less mature. In particular, the interplay between updates and reasoning remains completely open.
In [1], we discussed semantics of SPARQL updates for RDFS ontologies, for the cases in which the knowledge base ABox is fully materialized or to the contrary, is reduced to its minimal core that cannot be derived using TBox axioms. The present paper continues this study of SPARQL updates focusing on the role of inconsistency in supporting SPARQL ABox updates over materialized stores. As a minimalistic ontology language allowing for inconsistencies, we consider RDFS\(_{\lnot }\), an extension of RDFS [12] with class disjointness axioms of the form {P disjointWith Q} from OWL [16].
Consider an ABox with data on student tutors that happen to attend each other’s classes: \(\mathcal {A}_1 = \{\) :jim :attendsClassOf :ann. :ann :attendsClassOf :jim \(\}\). Here, u would create two assertions :jim :studentOf :ann and :ann :studentOf :jim. Due to the range and domain constraints in \(\mathcal {T}\), these assertions result in clashes both for Jim and for Ann. Note that all inconsistencies are in the new data, and thus we say that u is intrinsically inconsistent for the particular ABox \(\mathcal {A}_1\). We discuss how such updates can be fixed using SPARQL rewritings.
Now, let \(\mathcal {A}_2\) be the ABox \(\{\) :jim :attendsClassOf :ann. :jim a :Professor \(\}\). It is clear that after the update u, the ABox will become inconsistent with respect to \(\mathcal {T}\) due to the property assertion :jim :studentOf :ann, implying that Jim is both a professor and a student which contradicts the disjointness axiom. In contrast to the previous case, the clash here is between the prior knowledge and the new data. Based on [1] we propose three update semantics for this case, and provide efficient SPARQL rewriting algorithms for implementing them in the RDFS\(_{\lnot }\) setting.
The topic of knowledge base updates is extremely broad. Our aim in this paper is to adapt the basic belief revision operators for efficient implementation of ABox updates expressed in SPARQL 1.1, in the presence of RDFS\(_{\lnot }\) TBox axioms. In contrast to our setting, most of existing works on knowledge base evolution consider updates based on sets of ground facts to be inserted or deleted. Restricting negation to class disjointness allowed us to keep the presentation clear. It is not difficult to lift our rewritings to theories with role disjointness, functionality and inequality (owl:differentFrom). We discuss related work in more detail in Sect. 6.
In the remainder of the paper, after some short preliminaries (Sect. 2) we discuss checking for intrinsic inconsistencies in Sect. 3. Then in Sect. 4 we present three semantics for dealing with general inconsistencies in the context of materialized triple stores. Sect. 5 describes our practical evaluation of the semantics. Finally, Sect. 6 puts our work in the context of existing research and provides concluding remarks.
2 Preliminaries
DLLite \(_{\textsc {rdfs}_{\lnot }}\) assertions vs. RDF(S), where A, \(A'\) denote concept (or, class) names, P, \(P'\) denote role (or, property) names, \(\varGamma \) is the set of IRI constants (excl. the OWL/RDF(S) vocabulary) and \(x,y \in \varGamma \). For RDF(S), we use abbreviations (rsc, \(\textsf {sp}\), \(\textsf {dom}\), \(\textsf {rng}\), \(\textsf {a}\)) as introduced in [17].
Definition 1
( RDFS \(_{\lnot }\) ABox, TBox, Triple Store). We call a set \(\mathcal {T}\) of inclusion assertions of the forms 1–5 in Table 1 an (RDFS\(_{\lnot }\)) TBox, a set \(\mathcal {A}\) of assertions of the forms 6–7 in Table 1 an (RDF) ABox, and the union \(G = \mathcal {T}\cup \mathcal {A}\) an (RDFS\(_{\lnot }\)) triple store.
Definition 2

each atomic concept A to a subset \(A^{\mathcal {I}}\) of \(\varDelta ^{\mathcal {I}}\),

each negation of atomic concept to \((\lnot A^\mathcal {I}) = \varDelta ^\mathcal {I} \setminus A^\mathcal {I}\),

each atomic role P to a binary relation \(P^{\mathcal {I}}\) over \(\varDelta ^{\mathcal {I}}\), and

each element of \(\varGamma \) to an element of \(\varDelta ^{\mathcal {I}}\).
For expressions \(\exists P\) and \(\exists P^\), the interpretation function is defined as \((\exists P)^{\mathcal {I}} = \{x \in \varDelta ^{\mathcal {I}}\mid \exists y. (x,y) \in P^{\mathcal {I}}\}\) and \((\exists P^)^{\mathcal {I}} = \{y \in \varDelta ^{\mathcal {I}}\mid \exists x. (x,y) \in P^{\mathcal {I}}\}\), resp. An interpretation \(\mathcal {I}\) satisfies an inclusion assertion \(E_1\sqsubseteq E_2\) (of one of the forms 1–5 in Table 1), if \(E_1^{\mathcal {I}}\subseteq E_2^{\mathcal {I}}\). Analogously, \(\mathcal {I}\) satisfies ABox assertions of the form A(x), if \(x^{\mathcal {I}}\in A^{\mathcal {I}}\), and of the form P(x, y), if \((x^{\mathcal {I}},y^{\mathcal {I}}) \in P^{\mathcal {I}}\). An interpretation \(\mathcal {I}\) is called a model of a triple store G (resp., a TBox \(\mathcal {T}\), an ABox \(\mathcal {A}\)), denoted \(\mathcal {I}\models G\) (resp., \(\mathcal {I}\models \mathcal {T}\), \(\mathcal {I}\models \mathcal {A}\)), if \(\mathcal {I}\) satisfies all assertions in G (resp., \(\mathcal {T}\), \(\mathcal {A}\)). Finally, G is called consistent, if it does not entail both C(x) and \(\lnot C(x)\) for any concept C and constant \(x\in \varGamma \), where entailment is defined as usual.
Definition 3
(BGP, CQ, UCQ, Query Answer). A conjunctive query (CQ) q, or basic graph pattern (BGP), is a set of atoms of the form 6–7 from Table 1, where now \(x,y\in \varGamma \cup \mathcal {V}\), \(\mathcal {V}\) a countably infinite set of variables (written as ’?’prefixed alphanumeric strings). A union of conjunctive queries (UCQ) Q, or union pattern, is a set of CQs. We denote with \(\mathcal {V}(q)\) (or \(\mathcal {V}(Q)\)) the set of variables from \(\mathcal {V}\) occurring in q (resp., Q). An answer (under RDFS\(_{\lnot }\) Entailment) to a CQ q over a triple store G is a substitution \(\theta \) of the variables in \(\mathcal {V}(q)\) with constants in \(\varGamma \) such that every model of G satisfies all facts in \(q\theta \). We denote the set of all such answers with \( ans _{\textsf {\tiny rdfs}}(q,G)\) (or simply \( ans (q,G)\)). The set of answers to a UCQ Q is \(\bigcup _{q\in Q} ans (q,G)\).
Query answering in the presence of ontologies is done either by rulebased prematerialization of the ABox or by query rewriting. In the RDFS\(_{\lnot }\) case, materialization in polynomial time is feasible. Let \( mat (G)\) be the triple store obtained from exhaustive application of the inference rules in Fig. 1 on a consistent triple store G. We also define a special notation \({{\mathrm{chase}}}(q, \mathcal {T})\) to denote the “materialization” (also known as chase) of an ABox resp. a BGP q w.r.t. the TBox \(\mathcal {T}\). We call all triples occurring in \({{\mathrm{chase}}}(q, \mathcal {T})\) but not in q the effects of q w.r.t. \(\mathcal {T}\).
We now adapt the semantics for SPARQL update operations from [1].
Definition 4
We call a triple store G (resp. the ABox of G) materialized if the equality \(G \setminus \mathcal {T}= mat (G) \setminus \mathcal {T}\) holds. In this paper, we will always consider G to be materialized and focus on “materialization preserving” semantics for SPARQL update operations, which we dubbed Sem \(^{ mat }_{2}\) in [1] and which preserves a materialized triple store. We recall the intuition behind Sem \(^{ mat }_{2}\), given an update \(u=(P_d, P_i, P_w)\): (i) delete the instantiations of \(P_d\) along with all their causes; (ii) insert the instantiations of \(P_i\) plus all their effects.
The notion of “causes” is made precise as follows. Given an ABox assertion A, \(A^\mathrm {caus}= \{B \mid A \in {{\mathrm{chase}}}(\{B\}, \mathcal {T})\}\). In the definition of \(A^\mathrm {caus}\), if A is a class membership \(\texttt {(x \textsf {a} C)}\) where \(x\in \varGamma \cup \mathcal {V}\), then B is one of \(\texttt {(x \textsf {a} C')}\), \(\texttt {(x \textsf {P} ?Y)}\), \(\texttt {(?Y \textsf {P} x)}\) for some fresh variable ?Y, class \(\texttt {C'}\) and role \(\textsf {P}\). If A is a role participation assertion \(\texttt {(x \textsf {R} z)}\), B is of the form \(\texttt {(x \textsf {P} z)}\), for some role \(\textsf {P}\). For a SPARQL triple (possibly with variables) C we use \(C^\mathrm {caus}\) to denote a BGP computed in the same way as for the ABox assertion A above.
Definition 5
We refer to [1] for further details, but stress that as such, Sem \(^{ mat }_{2}\) is not able to detect or deal with inconsistencies arising from extending G with instantiations of \(P_i\). In what follows, we will discuss how this can be remedied.
Remark 1
Note that although the DELETE clause \(P_d\) is syntactically a BGP, its semantics is different. Namely, triples occurring in \(P_d\) are mutually independent (cf. Definition 4), so that for every \(\theta \in ans (P_w,G)\), each atom in \(P_d\theta \cap G\) is deleted from G no matter which other atoms of \(P_d\theta \) occur in G. Therefore, \(P_d^\mathrm {caus}\) is computed atomwise, unlike CQ rewriting [4]. Note that \(A^\mathrm {caus} = O(\mathcal {T})\) where \(\mathcal {T}\) denotes the vocabulary size of \(\mathcal {T}\): in each RDFS\(_{\lnot }\) derivation, a class membership assertion can occur at most once for each class in \(\mathcal {T}\), and a role membership assertion can occur at most twice for every role in \(\mathcal {T}\). Thus, \(P_d^\mathrm {caus} \le 2P_d\cdot \mathcal {T}\) and \(P_i^\mathrm {eff} \le P_i\cdot \mathcal {T}\), so both can be computed in polytime. This underpins the polynomial complexity of our rewritings.
3 Checking Consistency of a SPARQL Update
In the literature on the evolution of DLLite knowledge bases [5, 7], updates represented by pairs of ABoxes \(\mathcal {A}_d, \mathcal {A}_i\) have been studied. However, whereas such update might be viewed to fit straightforwardly to the corresponding \(\mathcal {A}_d, \mathcal {A}_i\) in Definition 4, it is typically assumed that \(\mathcal {A}_i\) is consistent with the TBox, and thus one only needs to consider how to deal with inconsistencies between the update and the old state of the knowledge base. However, this a priori assumption may be insufficient for SPARQL updates, where concrete values for inserted triples are obtained from variable bindings in the WHERE clause, and depending on the bindings, the update can be either consistent or not. This is demonstrated by the update u from Sect. 1 which, when applied to the ABox \(\mathcal {A}_1\), results in an inconsistent set \(\mathcal {A}_i\) of insertions. We call this intrinsic inconsistency of an update relative to a triple store \(G = \mathcal {T}\cup \mathcal {A}\).
Definition 6
Let G be a triple store. The update u is said to be intrinsically consistent w.r.t. G if the set of new assertions \(\mathcal {A}_i\) from Definition 4 generated by applying u to G, taken in isolation from the ABox of G, does not contradict the TBox of G. Otherwise, the update is said to be intrinsically inconsistent w.r.t. G.
Intrinsic inconsistency of the update differs crucially from the inconsistency w.r.t. the old state of the knowledge base, illustrated by the ABox \(\mathcal {A}_2\) from Sect. 1. This latter case can be addressed by adopting an update policy that prefers newer assertions in case of conflicts, as studied in the context of DLLite KB evolutions [5], which we will discuss in Sect. 4 below. Intrinsic inconsistencies however are harder to deal with, since there is no cue which assertion should be discarded in order to avoid the inconsistency. Our proposal here is thus to discard all mutually inconsistent pairs of insertions.
Remark 2
Our rewriting algorithms rely on producing fresh copies of the WHERE clause. Assume \(\theta \), \(\theta _1\), \(\theta _2\), ... to be substitutions replacing each variable in a given formula with a distinct fresh one. For a substitution \(\sigma \), we also define \(\theta [\sigma ]\) resp. \(\theta _i[\sigma ]\) to be an extension of \(\sigma \), renaming each variable at positions not affected by \(\sigma \) with a distinct fresh one. For instance, let F be a triple (?Z :studentOf ?Y). Now, \(F\theta \) makes a variable disjoint copy of F: \(?Z_1\) :studentOf \(?Y_1\) for fresh \(?Z_1, ?Y_1\). \(F[?Z \mapsto ?X]\) is just a substitution of ?Z by ?X in F. Finally, \(F\theta [?Z \mapsto ?X]\) results in ?X :studentOf \(?Y_2\) for fresh \(?Y_2\). We assume that all occurrences of \(F\theta [\sigma ]\) stand for syntactically the same query, but that \(F\theta [\sigma _1]\) and \(F\theta [\sigma _2]\), for distinct \(\sigma _1\) and \(\sigma _2\), can only have variables in \(range(\sigma _1) \cap range(\sigma _2)\) in common. That is, the choice of fresh variables is defined by the parameterizing substitution \(\sigma \). \(\blacksquare \)
Using this notation, the possibility of unifying two variables ?X and ?Y in \(P_w\) on a given triple store can be tested with the query \(\{P_w\theta _1[?X \mapsto ?Z]\} \{ P_w\theta _2[?Y \mapsto ?Z]\}\) where \(\theta _1\) and \(\theta _2\) are variable renamings as in Remark 2 and ?Z is a fresh variable.
Example 1
(In this and subsequent examples we omit the trivial \(\,\mathsf {FILTER}( False )\) union branch used in rewritings to initialize variables with disjunctive conditions, such as W in Algorithm 1) \(\blacksquare \)
Suppose that an insert is not intrinsically consistent for a given triple store. One solution would be to discard it completely, should the above ASK query return \( True \). Another option which we consider here is to only discard those variable bindings from the WHERE clause, which make the INSERT clause \(P_i\) inconsistent. This is the task of the safe rewriting \({{\mathrm{safe}}}(\cdot )\) in Algorithm 2, removing all variable bindings that participate in a clash between different triples of \(P_i\). Let \(P_w\) be a WHERE clause, in which the variables ?X and ?Y should not be unified to avoid clashes. With \(\theta _1\), \(\theta _2\) being “fresh” variable renamings as in Remark 2, Algorithm 2 uses the union of \(P_w\theta _1[?X\mapsto ?Y]\) and \(P_w\theta _2[?Y\mapsto ?X]\) to eliminate unsafe bindings that send ?X and ?Y to the same value.
Example 2
Note that the safe rewriting can make the update void. For instance, \({{\mathrm{safe}}}(u)\) has no effect on the ABox \(\mathcal {A}_1\) from Sect. 1, since there is no cue, which of :jim :attendsClassOf :ann, :ann :attendsClassOf :jim needs to be dismissed to avoid the clash. However, if we extend this ABox with assertions both satisfying the WHERE clause of u and not causing undesirable variable unifications, \({{\mathrm{safe}}}(u)\) would make insertions based on such bindings. For instance, adding the fact :bob :attendsClassOf :alice to \(\mathcal {A}_1\) would assert :bob :studentOf :alice as a result of \({{\mathrm{safe}}}(u)\). \(\blacksquare \)
We conclude this section by formalizing the intuition of update safety. For a triple store G and an update \(u = (P_d, P_i, P_w)\), let \(\llbracket P_w \rrbracket ^u_G\) denote the set of variable bindings computed by the query “\(\,\mathsf {SELECT}?X_1, \ldots , ?X_k \,\mathsf {WHERE}\ P_w\)” over G, where \(?X_1, \ldots , ?X_k\) are the variables occurring in \(P_i\) or in \(P_d\).
Theorem 1
 (1)
\(q_u(G) = True \ iff \; \exists \mu , \mu ' \in \llbracket P_w \rrbracket ^u_G s.t.\ \mu (P_i) \wedge \mu '(P_i) \wedge \mathcal {T}\models \bot \);
 (2)
\( \llbracket P_w \rrbracket ^u_G \setminus \llbracket P'_w \rrbracket ^u_G = \{ \mu \in \llbracket P_w \rrbracket ^u_G \mid \exists \mu ' \in \llbracket P_w \rrbracket ^u_G \text { s.t. } \mu (P_i) \wedge \mu '(P_i) \wedge \mathcal {T}\models \bot \}. \)
4 Materialization Preserving Update Semantics
In this section we discuss resolution of inconsistencies between triples already in the triple store and newly inserted triples. Our baseline requirement for each update semantics is formulated as the following property.
Definition 7
(Consistencypreserving). Let G be a triple store and \(u(P_d, P_i, P_w)\) an update. A materialization preserving update semantics \( Sem \) is called consistency preserving in RDFS\(_{\lnot }\) if the evaluation of update u, i.e., \(G^{ Sem }_{u(P_d,P_i,P_w)}\), results in a consistent triple store.
Our consistency preserving semantics are respectively called brave, cautious and fainthearted. The brave semantics always gives priority to newly inserted triples by discarding all preexisting information that contradicts the update. The cautious semantics is exactly the opposite, discarding inserts that are inconsistent with facts already present in the triple store; i.e., the cautious semantics never deletes facts unless explicitly required by the DELETE clause of the SPARQL update. Finally, the fainthearted semantics executes the update partially, only performing insertions for those variable bindings which do not contradict existing knowledge (again, taking into account deletions).
All semantics rely upon incremental update semantics Sem \(^{ mat }_{2}\), introduced in Sect. 2, which we aim to extend to take into account class disjointness. Note that for the present section we assume updates to be intrinsically consistent, which can be checked or enforced beforehand in a preprocessing step by the safe rewriting discussed in Sect. 3. In this section, we lift our definition of update operation to include also updates \((P_d, P_i, P_w)\) with \(P_w\) produced by the safe rewriting Algorithm 2 from some update satisfying Definition 4. What remains to be defined is the handling of clashes between newly inserted triples and triples already present in the triple store.

brave semantics Sem \(^{ mat }_{\textit{brave}}\): (i) delete all instantiations of \(P_d\) and their causes, plus all the nondeleted triples in G clashing with instantiations of triples in \(P_i\) to be inserted, again also including the causes of these triples; (ii) insert the instantiations of \(P_i\) plus all their effects.

cautious semantics Sem \(^{ mat }_{\textit{caut}}\): (i) delete all instantiations of \(P_d\) and their causes; (ii) insert all instantiations of \(P_i\) plus all their effects, unless they clash with some nondeleted triples in G: in this latter case, do not perform the update.

fainthearted semantics Sem \(^{ mat }_{\textit{faint}}\): (i) delete all instantiations of \(P_d\) and their causes; (ii) insert those instantiations of \(P_i\) (plus all their effects) which do not clash with nondeleted triples in G.
Remark 3
Note that Sem \(^{ mat }_{2}\) is not able to cope with so called “dangling” effects – that is, triples inserted at some point for the sake of materialization, whose causes have been subsequently deleted. As pointed out in [1], one way to deal with this issue is to combine Sem \(^{ mat }_{2}\) with marking of explicitly inserted triples. This approach was implemented as a semantics Sem \(^{ mat }_{1b}\) in [1], splitting the ABox \(\mathcal {A}\) into the explicit part \(\mathcal {A}_{ex}\) and the implicit part \(\mathcal {A}_{im} = \mathcal {A}\setminus \mathcal {A}_{ex}\). \(\mathcal {A}_{ex}\) can be maintained, e.g., in a separate RDF graph using a straightforward update rewriting. Now, deleting \(P_d\) would not only retract \(P_d^\mathrm {caus}\) from \(\mathcal {A}\), but also the triples in \({{\mathrm{chase}}}(P_d^\mathrm {caus},\mathcal {T}) \setminus {{\mathrm{chase}}}(\mathcal {A}_{ex} \setminus P_d^\mathrm {caus},\mathcal {T})\). That is, the effects of \(P_d^\mathrm {caus}\) are removed unless they can be derived from facts remaining in \(\mathcal {A}\) after enforcing the deletion \(P_d\). Such an aggressive removal of dangling triples can lead to counterintuitive behavior (cf. Example 9 in [1]), and requires maintaining the explicit ABox \(\mathcal {A}_{ex}\), which is why we opted to preserve dangling effects in our rewritings.
4.1 Brave Semantics
The rewriting in Algorithm 3 implements the brave update semantics Sem \(^{ mat }_{\textit{brave}}\); it can be viewed as combining the idea of FastEvol[5] with Sem \(^{ mat }_{2}\) to handle inconsistencies by giving priority to triples that ought to be inserted, and deleting all those triples from the store that clash with the new ones.
Example 3
The DELETE clause removes potential clashes for the inserted triples. Note that also property assertions implying clashes need to be deleted, which introduces fresh variables ?X1 and ?Y1. These variables have to be bound in the WHERE clause, and therefore \(P_d^\textit{fvars}\) adds two optional clauses to the WHERE clause, which is a computationally reasonable implementation of the concept \(P^\textit{fvars}\) from Definition 5. \(\blacksquare \)
The DELETE clause \(P'_d\) of the rewritten update is initialized in Algorithm 3 with the set \(P_d\) of triples from the input update. Rewriting ensures that also all “causes” of deleted facts are removed from the store, since otherwise the materialization will reinsert deleted triples. To this end, line 1 of Algorithm 3 adds to \(P'_d\) all facts from which \(P_d\) can be derived. Then, for each triple implied by \(P_i\) (that is, for each triple in \(P_i^\mathrm {eff}\)) the algorithm computes the patterns of clashing triples and adds them to the DELETE clause \(P'_d\), along with their causes. Note that it suffices to only consider disjointness assertions that are syntactically contained in \(\mathcal {T}\) (and not those implied by \(\mathcal {T}\)), since we assume that the store G is materialized. Finally, the WHERE clause of the rewritten update is extended to satisfy the syntactic restriction that all variables in \(P'_d\) must be bound: bindings of “fresh” variables introduced to \(P'_d\) due to the domain or range constraints in \(\mathcal {T}\) are provided by the part \(P_d^\textit{fvars}\), cf. Definition 5 and Example 3. The rewritten update is evaluated over the triple store, computing its new materialized and consistent state.
In the RDFS\(_{\lnot }\) ontology language and under the restriction that only ABox updates are allowed, the brave semantics is a belief revision operator [10, 20], performing a minimal change of the RDF graph (which due to materialization can be seen both as a deductive closure of the formula representing the ABox as well as the minimal model of this formula). There is a unique way of resolving inconsistencies since the only deduction rule with more than one ABox assertion in the premise, is the clash due to class disjointness (Fig. 1): assuming intrinsic consistency, the choice of which class membership assertion to remove in order to avoid clash is univocal (new knowledge is always preferred).
Theorem 2
Algorithm 3, given a SPARQL update u and a consistent materialized triple store \(G = \mathcal {T}\cup \mathcal {A}\), computes a new consistent and materialized state w.r.t. brave semantics. The rewriting in lines 1–6 takes time polynomial in the size of u and \(\mathcal {T}\).
4.2 Cautious Semantics
Cautious semantics thus treats insertions and deletions asymmetrically: the former depend on the latter but not the other way round. The rationale is that deletions never cause inconsistencies and can remove clashes between the old and the new data.
As in the case of brave semantics, cautious semantics is implemented using rewriting, presented in Algorithm 4. First, the algorithm issues an ASK query to check that no clashes will be generated by the INSERT clause, provided that the DELETE part of the update is executed. If no clashes are expected, in which case the ASK query returns \( False \), the brave update from the previous section is applied.
For a safe update \(u = (P_d,P_i,P_w)\), the ASK query is generated as follows. For each triple pattern \(\{?X ~\textsf {a}~ C\}\) among the effects of \(P_i\), at line 3 Algorithm 4 enumerates all concepts \(C'\) that are explicitly mentioned as disjoint with C in \(\mathcal {T}\). As in the case of brave semantics, this syntactic check is sufficient due to the assumption that the update is applied to a materialized store; by the same reason also no property assertions need to be taken into account.
For each concept \(C'\) disjoint with C, we need to check that a triple matching the pattern \(\{?X ~\textsf {a}~ C'\}\) is in the store G and will not be deleted by u. Deletion happens if there is a pattern \(\{?Y~\textsf {a}~C'\} \in P_d^\mathrm {caus}\) such that the variable ?Y can be bound to the same value as ?X in the WHERE clause \(P_w\). Line 6 of Algorithm 4 produces such a check, using a copy of \(P_w\), in which the variable ?Y is replaced by ?X and all other variables are replaced with distinct fresh ones. Since there can be several such triple patterns in \(P_d^\mathrm {caus}\), testing for clash elimination via the DELETE clause requires a disjunctive graph pattern \(\varTheta ^_{C'}\) constructed at line 6 and combined with \(\{?X\, \textsf {a}\, C'\}\) using \(\,\mathsf {MINUS}\,\) at line 7.
Finally, the resulting pattern is appended to the list W of clash checks using \(\,\mathsf {UNION}\,\). As a result, \(\{P_w\}.\{W\}\) queries for triples that are not deleted by u and clash with an instantiation of some class membership assertion \(\{?X ~\textsf {a}~C\} \in P_i^\mathrm {eff}\).
Theorem 3
Algorithm 4, given a SPARQL update u and a consistent materialized triple store \(G = \mathcal {T}\cup \mathcal {A}\), computes a new consistent and materialized state w.r.t. cautious semantics. The rewriting in lines 1–8 takes time polynomial in the size of u and \(\mathcal {T}\).
Example 4
\(\blacksquare \)
4.3 Fainthearted Semantics
Our realization of the idea of accommodating nonclashing inserts is based on decoupling the insert and the delete part of an update: whereas the delete is executed for all variable bindings satisfying the WHERE clause, one dismisses the inserts for variable bindings that yield clashes with the state of the store after the delete. That is, we deviate from the notion of update as an atomic operation in a different way than in the safe rewriting where both deletions and insertions are dismissed for variable bindings leading to clashes. Our motivation for such a design decision is explained next.
Assume that for each variable binding \(\mu \) returned by the WHERE pattern, we want to either insert \({{\mathrm{gr}}}(P_i\mu )\) along with deleting \({{\mathrm{gr}}}(P_d\mu )\), or dismiss \(\mu \) altogether. As an example, consider the update \(u'\) from Example 4 and the ABox \(\{\) :jim :attendsClassOf :ann. :jim a :Professor. :bob :attendsClassOf :jim \(\}\). With the variable binding \(\mu _1 = [?X\mapsto \mathtt {{:}jim}, ?Y \mapsto \mathtt {{:}ann}]\) we insert :jim a :Student knowing that the clashing fact :jim a :Professor will be deleted by the binding \(\mu _2 = [?X\mapsto \mathtt {{:}bob}, ?Y \mapsto \mathtt {{:}jim}]\). However, if the update is atomic, this anticipated deletion will only happen if \({{\mathrm{gr}}}(P_i\mu _2)\) does not introduce clashes. Assume this is the case (i.e. also {:bob a :Professor} is in the ABox): we have to look one more step ahead and check if this triple will be deleted by some variable binding \(\mu _3\), and so on. This behaviour could be realized with SPARQL path expressions, which would however stipulate severe syntactic restrictions on the WHERE clause \(P_w\) of the original update.
As mentioned above, our interpretation of fainthearted semantics assumes independence between the INSERT and DELETE parts of the update. To implement this, we rely on SPARQL’s flexible handling of variable bindings. Namely, we rename the variables in the DELETE clause apart from the rest of the update, and put this renamed apart copy of the WHERE clause in a new UNION branch. The original WHERE clause is then rewritten (using MINUS operator, similarly to the case of cautious semantics) to ensure that insertions are only done for variable bindings where clashes are removed by the DELETE clause with some variable binding. The implementation can be found in Algorithm 5.
Example 5
We conclude with a claim of correctness and polynomial complexity of the rewriting, similar to those made for the brave and cautious semantics.
Theorem 4
Algorithm 5, given a SPARQL update u and a materialized triple store \(G = \mathcal {T}\cup \mathcal {A}\) w.r.t. fainthearted semantics, computes a new consistent and materialized state. The rewriting in lines 1–9 takes time polynomial in the size of u and \(\mathcal {T}\).
5 Experimental Evaluation
For each of the three semantics discussed in the previous section, we provided a preliminary implementation using the Jena API (http://jena.apache.org) and evaluated them against Jena TDB triple store which implements the latest SPARQL 1.1 specification. As before, for computing the initial materialization of a triple store mat(G) we rely onboard, forwardchaining materialization in Jena TDB using the minimal RDFS rules as in Fig. 1.
For our experiments, we used the data generated by the EUGen generator [15] of for the size range of 5 to 50 Universities. We opted for using this generator as it extends the LUBM ontology [9] with chains of subclasses, making the rewritings more challenging. In our case we have used the default of \(i=20\) subclasses for each LUBM concept (e.g., Subj i Students) and made such subclasses pairwise disjoint. Moreover, we have added more disjointness axioms where appropriate, e.g., :AssociateProfessor \(\textsf {dw}\) :FullProfessor. All these TBox axioms are merged with our previous reduced RDFS version of LUBM used in our previous work [1]. To compare the experimental results with the previous work, for our experiments we adapted the seven updates from [1]. Our prototype, as well as files containing the data, ontology, and the updates used for experiments, are made available on a dedicated Web page^{1}.
Evaluation results in seconds for LUBM 50
Update #  1  2  3  4  5  6  7 

Sem \(^{ mat }_{\textit{brave}}\)  12,4  14,8  0,1  22,1  46,0  15,3  13,6 
Sem \(^{ mat }_{\textit{caut}}\)  0,3  0,2  0,2  44,0  0,2  3,9  2,3 
Sem \(^{ mat }_{\textit{faint}}\)  2,2  2,8  0,01  17,4  3,3  16,7  15,3 
Original  0,2  0,2  0,2  10,2  0,2  6,6  5,4 
6 Related Work and Conclusions
In this paper we have taken a step further from our previous work, in combining SPARQL Update and RDFS entailment by adding concept disjoints as a first step towards dealing with inconsistencies in the context of SPARQL Updates. We distinguish the case of intrinsic inconsistency, localized within instantiations of the INSERT clause of a SPARQL update, and the usual case when the new information is inconsistent with the old knowledge. In the former case, our solution was to discard all solutions of the WHERE query that participate in an inconsistency. For the latter case, we discussed several reconciliation strategies, well suited for efficient implementation in SPARQL. Our preliminary implementation shows the feasibility of all proposed approaches on top of an offtheshelf triple store supporting SPARQL and SPARQL update (Apache TDB).
The problem of knowledge based update and belief revision has been extensively studied in the literature, although not in the context of SPARQL updates where facts to be deleted or inserted come from a query. As argued in Sect. 4.1, brave semantics implements the most established approach of adapting the new information fully via a minimal change, which is feasible in the setting of fixed RDFS\(_{\lnot }\) TBoxes. Also semantics deliberating between accepting and discarding change are known (see [10] for a survey). In [18] an approach involving user interaction to decide whether to accept or reject an individual axiom is considered, with some part of the update being computed automatically in order to ensure its consistency. We do not consider interactive procedures here (although they clearly make sense in the case of more complex TBoxes or for TBox updates). Instead, we rely on the resolution strategies which are simple for the user to understand and can be efficiently encoded in SPARQL. In a practical KB editing system, one should probably combine the two approaches, e.g. for resolving the intrinsic inconsistency. Likewise, the approaches [3, 7, 13] consider grounded updates only, whereas our focus is on implementation of updates in SPARQL. The approach in [7] captures RDFS and several additional types of constraints and is close in spirit to our brave semantics.
Intrinsic consistency of an update is a common assumption in knowledge base update (e.g. [5, 6, 7, 14]), which can be easily violated in the case of SPARQL updates. It is worth noting that our resolution strategy for intrinsic inconsistency called safe rewriting can be combined with all three update semantics using just the basic SPARQL operators.
Much interesting work remains to be done in order to optimize rewritten updates. Moreover, we plan to further extend our work towards increasing coverage of more expressive logics and OWL profiles, namely additional axioms from OWL2 RL or OWL 2 QL [16].
Footnotes
Notes
Acknowledgements
This work was supported by the Vienna Science and Technology Fund (WWTF), project ICT12SEE, and EU IP project Optique (Scalable Enduser Access to Big Data), grant agreement n. FP7318338.
References
 1.Ahmeti, A., Calvanese, D., Polleres, A.: Updating RDFS ABoxes and TBoxes in SPARQL. In: Mika, P., et al. (eds.) ISWC 2014, Part I. LNCS, vol. 8796, pp. 441–456. Springer, Heidelberg (2014)Google Scholar
 2.Beckett, D., BernersLee, T., Prud’hommeaux, E., Carothers, G.: RDF 1.1 Turtle  Terse RDF Triple Language. W3C Recommendation, World Wide Web Consortium, February 2014Google Scholar
 3.Benferhat, S., Bouraoui, Z., Papini, O., Würbel, E.: A prioritized assertionalbased revision for DLLite knowledge bases. In: Fermé, E., Leite, J. (eds.) JELIA 2014. LNCS, vol. 8761, pp. 442–456. Springer, Heidelberg (2014)Google Scholar
 4.Calvanese, D., De Giacomo, G., Lembo, D., Lenzerini, M., Rosati, R.: Tractable reasoning and efficient query answering in description logics: the DLLite family. J. Autom. Reasoning 39(3), 385–429 (2007)MathSciNetCrossRefzbMATHGoogle Scholar
 5.Calvanese, D., Kharlamov, E., Nutt, W., Zheleznyakov, D.: Evolution of DL \(\) Lite knowledge bases. In: PatelSchneider, P.F., Pan, Y., Hitzler, P., Mika, P., Zhang, L., Pan, J.Z., Horrocks, I., Glimm, B. (eds.) ISWC 2010, Part I. LNCS, vol. 6496, pp. 112–128. Springer, Heidelberg (2010)CrossRefGoogle Scholar
 6.De Giacomo, G., Lenzerini, M., Poggi, A., Rosati, R.: On instancelevel update and erasure in description logic ontologies. J. Log. Comput. 19(5), 745–770 (2009)MathSciNetCrossRefzbMATHGoogle Scholar
 7.Flouris, G., Konstantinidis, G., Antoniou, G., Christophides, V.: Formal foundations for RDF/S kb evolution. Knowl. Inf. Syst. 35(1), 153–191 (2013)CrossRefGoogle Scholar
 8.Gearon, P., Passant, A., Polleres, A.: SPARQL 1.1 update. W3C Recommendation, World Wide Web Consortium, March 2013Google Scholar
 9.Guo, Y., Pan, Z., Heflin, J.: LUBM: a benchmark for OWL knowledge base systems. J. Web Seman. 3(2–3), 158–182 (2005)CrossRefGoogle Scholar
 10.Hansson, S.: A survey of nonprioritized belief revision. Erkenntnis 50(2–3), 413–427 (1999)MathSciNetCrossRefzbMATHGoogle Scholar
 11.Harris, S., Seaborne, A.: SPARQL 1.1 query language. W3C Recommendation, World Wide Web Consortium, March 2013Google Scholar
 12.Hayes, P., PatelSchneider, P.: RDF 1.1 semantics. W3C Recommendation, World Wide Web Consortium, February 2014Google Scholar
 13.Kharlamov, E., Zheleznyakov, D., Calvanese, D.: Capturing modelbased ontology evolution at the instance level: the case of DLLite. J. Comput. Syst. Sci. 79(6), 835–872 (2013)MathSciNetCrossRefzbMATHGoogle Scholar
 14.Liu, H., Lutz, C., Milicic, M., Wolter, F.: Updating description logic aboxes. In: Doherty, P., Mylopoulos, J., Welty, C.A. (eds.) KR, pp. 46–56. AAAI Press (2006)Google Scholar
 15.Lutz, C., Seylan, I., Toman, D., Wolter, F.: The combined approach to OBDA: taming role hierarchies using filters. In: Kagal, L., et al. (eds.) ISWC 2013, Part I. LNCS, vol. 8218, pp. 314–330. Springer, Heidelberg (2013)CrossRefGoogle Scholar
 16.Motik, B., Grau, B.C., Horrocks, I., Wu, Z., Fokoue, A., Lutz, C.: Owl 2 web ontology language profiles, 2nd edn. W3C Recommendation, World Wide Web Consortium, December 2012Google Scholar
 17.Muñoz, S., Pérez, J., Gutierrez, C.: Minimal deductive systems for RDF. In: Franconi, E., Kifer, M., May, W. (eds.) ESWC 2007. LNCS, vol. 4519, pp. 53–67. Springer, Heidelberg (2007)CrossRefGoogle Scholar
 18.Nikitina, N., Rudolph, S., Glimm, B.: Interactive ontology revision. Web Seman. Sci. Serv. Agents World Wide Web 12–13, 118–130 (2012). reasoning with context in the Semantic WebCrossRefGoogle Scholar
 19.Polleres, A., Hogan, A., Delbru, R., Umbrich, J.: RDFS and OWL reasoning for linked data. In: Rudolph, S., Gottlob, G., Horrocks, I., van Harmelen, F. (eds.) Reasoning Weg 2013. LNCS, vol. 8067, pp. 91–149. Springer, Heidelberg (2013)Google Scholar
 20.Winslett, M.: Updating Logical Databases. Cambridge University Press, Cambridge (2005)zbMATHGoogle Scholar