# WORL: a nonmonotonic rule language for the semantic web

DOI: 10.1007/s40595-013-0009-y

- Cite this article as:
- Cao, S.T., Nguyen, L.A. & Szałas, A. Vietnam J Comput Sci (2014) 1: 57. doi:10.1007/s40595-013-0009-y


## Abstract

We develop a new Web ontology rule language, called WORL, which combines a variant of OWL 2 RL with eDatalog\(^\lnot \). We allow additional features like negation, the minimal number restriction and unary external checkable predicates to occur at the left-hand side of concept inclusion axioms. Some restrictions are adopted to guarantee a translation into eDatalog\(^\lnot \). We also develop the well-founded semantics and the stable model semantics for WORL as well as the standard semantics for stratified WORL (SWORL) via translation into eDatalog\(^\lnot \). Both WORL with respect to the well-founded semantics and SWORL with respect to the standard semantics have PTime data complexity. In contrast to the existing combined formalisms, in WORL and SWORL negation in concept inclusion axioms is interpreted using nonmonotonic semantics.

### Keywords

OWL 2 RL, Datalog with negation, Semantic Web, Rule languages

## 1 Introduction

In recent years, the Semantic Web area has developed rapidly and attracted considerable attention. A central idea of the Semantic Web is that ontologies are a proper bridge between users and search engines, ensuring more accurate search results. Therefore, the Web Ontology Language (OWL), built on top of XML and RDF, serves as an important tool for specifying ontologies and reasoning about them. Together with rule languages, it serves as a main knowledge representation formalism for the Semantic Web.

Description logics (DLs) are the main semantic and logical foundation of OWL. Such logics represent the domain of interest in terms of concepts, individuals, and roles. A concept is interpreted as a set of individuals, while a role is interpreted as a binary relation between individuals. A knowledge base in a DL consists of an RBox of role axioms, a TBox of terminological axioms and an ABox of facts about individuals.

OWL 2, the second version of OWL, recommended by the W3C in 2009, is based on the DL \(\mathcal {SROIQ}\). This logic is highly expressive but has intractable combined complexity (N2ExpTime-complete) and data complexity (NP-hard) for basic reasoning problems. Thus, the W3C also recommended the profiles OWL 2 EL, OWL 2 QL and OWL 2 RL, which are restricted sublanguages of OWL 2 Full and enjoy PTime data complexity. These profiles are based on the families of description logics \(\mathcal {EL}\) [3, 4], DL-Lite [5] and Description Logic Programs (DLP) [13], respectively. There are also more sophisticated fragments of DLs with PTime data complexity: Horn-\(\mathcal {SHIQ}\) [15], Horn-\(\mathcal {SROIQ}\) [21] and Horn-DL [20].

Rule languages provide very useful knowledge representation formalisms applicable to the Semantic Web. Some fragments of DLs like DLP [13] can be translated into rule languages. But most importantly, rule languages can be combined with DLs to develop more expressive formalisms. An early attempt to achieve such a combination was SWRL [14], a rule language using only concept names, role names and the equality predicate. However, without restrictions its combination with OWL DL is undecidable.

A knowledge base in other combined languages is usually specified as a pair \(\langle \mathcal {O},\mathcal {P}\rangle \), where \(\mathcal {O}\) is an ontology in some DL and \(\mathcal {P}\) is a set of rules, e.g., specified in Datalog or its suitable extension, which can use concept names and role names. Interaction between \(\mathcal {O}\) and \(\mathcal {P}\) is either one-way (\(\mathcal {O}\) affects \(\mathcal {P}\)) or two-way (where \(\mathcal {P}\) may also affect \(\mathcal {O}\)). The approach of defining a knowledge base as a pair \(\langle \mathcal {O},\mathcal {P}\rangle \) is adopted in a considerable number of works, including [8] (on \(\mathcal {AL}\)-log), [17] (on *CARIN*), [19] (on *DL-safe rules*), [24] (on \(\mathcal {DL}\)+log), [16, 18] (on *hybrid MKNF*), [9] (on *hybrid programs*), [23] (on *OntoDLV*), [10] (on *dl-programs*). In these works, if negation is allowed in \(\mathcal {P}\) then \(\mathcal {P}\) and its interaction with \(\mathcal {O}\) are interpreted using some nonmonotonic semantics (e.g., the stable model semantics, the MKNF semantics or the well-founded semantics). However, \(\mathcal {O}\) is always interpreted using the usual (monotonic) semantics.

In contrast, our approach interprets negation in \(\mathcal {O}\) using a nonmonotonic semantics (the well-founded semantics, the stable model semantics, or the standard semantics for stratified knowledge bases); this differs from all the above-mentioned works [8–10, 16–19, 23, 24].

We combine \(\mathcal {O}\) and \(\mathcal {P}\) into one set (called a layer, which is divided into a TBox consisting of concept inclusion axioms/program clauses and an ABox consisting of facts). This allows for a tighter integration between DLs and rules. It may seem similar to the approach of SWRL, but we also allow ordinary predicates, use a nonmonotonic semantics for negation, and design the language appropriately to get decidability and PTime data complexity (w.r.t. the well-founded semantics, and the standard semantics for stratified knowledge bases).

To reflect modularity of ontologies (e.g., the import feature of ontologies), we define a knowledge base to be a hierarchy of layers (a tree or a rooted directed acyclic graph of layers). Each layer in turn may be stratifiable and divided further into strata. The granulation is not substantial for the well-founded semantics, as the whole knowledge base will be flattened to a set of program clauses and facts.

However, it is substantial for the stable model semantics (see Example 8). Furthermore, when each layer of the considered knowledge base is stratifiable and the standard semantics is used for it, layers not only emphasize modularity but also affect the semantics (flattening the knowledge base may result in an unstratifiable layer).

WORL combines a variant of OWL 2 RL with eDatalog\(^\lnot \). With respect to OWL 2 RL, we:

- disallow those features of OWL 2 RL that play the role of constraints (i.e., the ones that are translated to negative clauses of the form \(\varphi \rightarrow \bot \));
- allow unary external checkable predicates;
- allow additional features like negation and the constructor \(\ge \!n\,R.C\) to occur at the left-hand side of \(\sqsubseteq \) in concept inclusion axioms.

This paper is a revised and extended version of our conference paper [7]. Compared with [7], the current paper additionally provides the stable model semantics for WORL, a direct method for checking stratifiability of TBoxes, all the proofs and a number of illustrative examples. The three semantics for eDatalog\(^\lnot \) that we consider are now presented in a uniform manner.

The rest of this paper is structured as follows. In Sect. 2 we introduce eDatalog\(^\lnot \), stratified eDatalog\(^\lnot \), and their semantics. In Sect. 3 we present WORL, a translation of WORL into eDatalog\(^\lnot \), and its well-founded semantics and stable model semantics. Section 4 is devoted to SWORL and its standard semantics. Section 5 concludes this work. In the Appendix, we present a direct method for checking stratifiability of TBoxes.

## 2 Preliminaries

We denote the set of *concept names* by \(\mathrm {CNames}\), and the set of *role names* by \(\mathrm {RNames}\).

We distinguish two basic types: *individual* (i.e. *object*) and *literal* [22] (i.e. *data constant*). We denote the *individual* type by \( IType \), and the *literal* type by \( LType \). Thus, a concept name is a unary predicate of type \(P( IType )\), a *data type* is a unary predicate of type \(P( LType )\), an *object role name* is a binary predicate of type \(P( IType \times IType )\), and a *data role name* is a binary predicate of type \(P( IType \times LType )\). For simplicity, we do not provide specific data types like integer, real or string. Apart from concept names and role names, we will also use a set \(\mathrm {OPreds}\) of *ordinary predicates* (including data types) and a set \(\mathrm {ECPreds}\) of *external checkable predicates*. We assume that the sets \(\mathrm {CNames}\), \(\mathrm {RNames}\), \(\mathrm {OPreds}\) and \(\mathrm {ECPreds}\) are finite and pairwise disjoint. By the set of *defined predicates* we mean:

$$\begin{aligned} \mathrm {DPreds}= \mathrm {CNames}\cup \mathrm {RNames}\cup \mathrm {OPreds}. \end{aligned}$$

We assume that, if \(p\) is a \(k\)-ary predicate from \(\mathrm {ECPreds}\) and \(d_1,\ldots ,d_k\) are constants of \( LType \), then the truth value of \(p(d_1,\ldots , d_k)\) is fixed and computable in polynomial time (in the number of bits used for \(d_1,\ldots ,d_k\)).

For example, one may want to use the binary predicates \(>\), \(\ge \), \(<\), \(\le \) on real numbers with the usual semantics.

We assume there is only one equality predicate ‘\(=\)’, which belongs to \(\mathrm {OPreds}\) and has the type \(P( IType \times IType )\). For data constants, we assume the Unique Names Assumption instead.

A *term* is either an individual (of type \( IType \)) or a literal (of type \( LType \)) or a *variable* (of type \( IType \) or \( LType \)). If \(p\) is a predicate of type \(P(T_1 \times \cdots \times T_k)\), and for \(1 \le i \le k\), \(t_i\) is a term of type \(T_i\), then \(p(t_1,\ldots ,t_k)\) is an *atomic formula* (also called an *atom*). An atom is *ground* if it contains no variables.

An *interpretation* \(\mathcal {I}= \langle \Delta _o^\mathcal {I}, \Delta _d^\mathcal {I}, \cdot ^\mathcal {I}\rangle \) consists of a non-empty set \(\Delta _o^\mathcal {I}\) called the *object domain* of \(\mathcal {I}\), a non-empty set \(\Delta _d^\mathcal {I}\) disjoint from \(\Delta _o^\mathcal {I}\) called the *data domain* of \(\mathcal {I}\), and a function \(\cdot ^\mathcal {I}\) which maps:

- every individual \(a\) to an element \(a^\mathcal {I}\in \Delta _o^\mathcal {I}\),
- every literal \(d\) to a unique\(^{1}\) element \(d^\mathcal {I}\in \Delta _d^\mathcal {I}\),
- every concept name \(A\) to a subset \(A^\mathcal {I}\) of \(\Delta _o^\mathcal {I}\),
- every data type \( DT \) to a subset \( DT ^\mathcal {I}\) of \(\Delta _d^\mathcal {I}\),
- every predicate of type \(P(T_1 \times \cdots \times T_k)\) in \(\mathrm {DPreds}\) different from ‘\(=\)’ to a subset of \(\Delta _1 \times \cdots \times \Delta _k\), where \(\Delta _i = \Delta _o^\mathcal {I}\) if \(T_i = IType \), and \(\Delta _i = \Delta _d^\mathcal {I}\) if \(T_i = LType \),
- predicate ‘\(=\)’ to a congruence of \(\mathcal {I}\).\(^{2}\)

A *Herbrand interpretation* is a set of ground atoms of predicates from \(\mathrm {DPreds}\). An *ABox* is a finite Herbrand interpretation.

The *size* of a ground atom is the number of bits used for its representation. The *size* of an ABox is the sum of the sizes of its atoms.

A Herbrand interpretation \(\mathcal {H}\) is *closed w.r.t.* \( EqAxioms \) if for every ground instance \(\varphi _1 \wedge \cdots \wedge \varphi _k \rightarrow \psi \) (with \(k \ge 0\)) of an axiom in \( EqAxioms \) using the individuals and data constants occurring in \(\mathcal {H}\), if \(\{\varphi _1, \ldots , \varphi _k\} \subseteq \mathcal {H}\) then \(\psi \in \mathcal {H}\).

Let \(\mathcal {H}\) be a Herbrand interpretation closed w.r.t. \( EqAxioms \). The interpretation \(\mathcal {I}\) specified as follows is called the *traditional interpretation corresponding to* \(\mathcal {H}\):

- \(\Delta _o^\mathcal {I}\) is the set of all individuals occurring in \(\mathcal {H}\),
- \(\Delta _d^\mathcal {I}\) is the set of all data constants occurring in \(\mathcal {H}\),
- for every \(k\)-ary predicate \(p \in \mathrm {DPreds}\), $$\begin{aligned} p^\mathcal {I}= \{\langle t_1,\ldots ,t_k\rangle \mid p(t_1,\ldots ,t_k) \in \mathcal {H}\}. \end{aligned}$$
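As an illustration, the passage from a Herbrand interpretation to the corresponding traditional interpretation can be sketched in Python. The tuple encoding of ground atoms, the function name `traditional_interpretation` and the helper `is_literal` are assumptions of this sketch and do not come from the paper.

```python
from collections import defaultdict


def traditional_interpretation(herbrand, is_literal):
    """Return (object domain, data domain, predicate extensions).

    herbrand: set of ground atoms encoded as tuples (predicate, arg_1, ..., arg_k).
    is_literal: hypothetical helper that tells data constants from individuals.
    """
    object_domain, data_domain = set(), set()
    extension = defaultdict(set)
    for predicate, *args in herbrand:
        # p^I collects exactly the argument tuples of the atoms of p in H.
        extension[predicate].add(tuple(args))
        for term in args:
            (data_domain if is_literal(term) else object_domain).add(term)
    return object_domain, data_domain, dict(extension)


# Toy Herbrand interpretation; integers play the role of data constants here.
H = {("Person", "ann"), ("age", "ann", 30), ("=", "ann", "ann")}
domain_o, domain_d, ext = traditional_interpretation(H, lambda t: isinstance(t, int))
```

Predicates without any atom in the Herbrand interpretation simply get no entry in `ext`, which corresponds to an empty extension.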

### 2.1 The rule language eDatalog\(^\lnot \)

In [6], we defined eDatalog as an extension of Datalog with the equality predicate, external checkable predicates, and a relaxed range-restrictedness condition. In this subsection, we define the rule language eDatalog\(^\lnot \) similarly as an extension of Datalog\(^\lnot \), but using the full range-restrictedness condition.

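An *eDatalog\(^\lnot \) program clause* is a formula of the form

$$\begin{aligned}&(\varphi _1 \wedge \cdots \wedge \varphi _h \wedge \lnot \psi _1 \wedge \cdots \wedge \lnot \psi _k \wedge \xi _1 \wedge \cdots \wedge \xi _l\\&\quad \wedge \ \lnot \zeta _1 \wedge \cdots \wedge \lnot \zeta _m) \rightarrow \alpha \end{aligned} \tag{1}$$

where \(h, k, l, m \ge 0\), \(\varphi _1, \ldots , \varphi _h\), \(\psi _1, \ldots , \psi _k\) and \(\alpha \) are atoms of predicates from \(\mathrm {DPreds}\), \(\xi _1, \ldots , \xi _l\), \(\zeta _1, \ldots , \zeta _m\) are atoms of predicates from \(\mathrm {ECPreds}\), and every variable occurring in the clause occurs also in some \(\varphi _i\) (this is called the *range-restrictedness condition*).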

The atom \(\alpha \) in (1) is called the *head* of the program clause. If \(p\) is the predicate of \(\alpha \) then the clause is called a *program clause defining p*. The formula at the left-hand side of \(\rightarrow \) in (1) is called the *body* of the program clause.

An *eDatalog\(^\lnot \) program* is a finite set of eDatalog\(^\lnot \) program clauses. An *eDatalog\(^\lnot \) knowledge base* is a pair \(\langle \mathcal {P},\mathcal {A}\rangle \) consisting of an eDatalog\(^\lnot \) program \(\mathcal {P}\) and an ABox \(\mathcal {A}\). A *query* is defined to be a formula that can be the body of an eDatalog\(^\lnot \) program clause.

**Example 1**

### 2.2 Stratified eDatalog\(^\lnot \)

A *stratification* of an eDatalog\(^\lnot \) program \(\mathcal {P}\) is a sequence of eDatalog\(^\lnot \) programs \(\mathcal {P}_1, \ldots , \mathcal {P}_n\) such that:

- \(\{\mathcal {P}_1,\ldots ,\mathcal {P}_n\}\) is a partition of \(\mathcal {P}\cup EqAxioms \),
- for some mapping \(f : \mathrm {DPreds}\rightarrow \{1,\ldots ,n\}\), every predicate \(p \in \mathrm {DPreds}\) satisfies the following conditions:
  - the program clauses in \(\mathcal {P}\cup EqAxioms \) defining \(p\) are in \(\mathcal {P}_{f(p)}\),
  - if \(\mathcal {P}\cup EqAxioms \) contains a program clause defining \(p\) in the form $$\begin{aligned}&(\varphi _1 \wedge \cdots \wedge \varphi _h \wedge \lnot \psi _1 \wedge \cdots \wedge \lnot \psi _k \wedge \xi _1 \wedge \cdots \wedge \xi _l\\&\quad \wedge \ \lnot \zeta _1 \wedge \cdots \wedge \lnot \zeta _m) \rightarrow \alpha \end{aligned}$$ then for every \(1 \le i \le h\) and \(1 \le j \le k\):
    - if \(p'_i\) is the predicate of \(\varphi _i\) then \(f(p'_i) \le f(p)\),
    - if \(p''_j\) is the predicate of \(\psi _j\) then \(f(p''_j) < f(p)\).

Each \(\mathcal {P}_i\) is called a *stratum* of the stratification, and \(f\) is called the *stratification mapping*. Let us emphasize that \(f(\text {`='}) \le f(p)\) for all \(p \in \mathrm {DPreds}\).
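The two conditions on \(f\) can be checked mechanically. The following Python sketch uses a standard level-raising iteration to compute a candidate stratification mapping; it is not an algorithm taken from the paper. Clauses are abstracted to triples (head predicate, predicates of positive body atoms, predicates of negated body atoms) over defined predicates only, and the caller is assumed to include the dependencies contributed by \( EqAxioms \). The name `stratification_mapping` is hypothetical.

```python
def stratification_mapping(clauses):
    """Return a mapping f: predicate -> level, or None if no stratification exists.

    clauses: list of (head, positive_preds, negated_preds), all defined predicates.
    """
    preds = {head for head, _, _ in clauses}
    for _, pos, neg in clauses:
        preds.update(pos)
        preds.update(neg)
    f = {p: 1 for p in preds}
    changed = True
    while changed:
        changed = False
        for head, pos, neg in clauses:
            # f(p') <= f(p) for positive atoms, f(p'') < f(p) for negated atoms.
            need = max([f[head]] + [f[q] for q in pos] + [f[q] + 1 for q in neg])
            if need > len(preds):
                return None  # a cycle through negation keeps raising levels
            if need > f[head]:
                f[head] = need
                changed = True
    return f


reachability = [
    ("reach", ["edge"], []),
    ("reach", ["reach", "edge"], []),
    ("unreach", ["node"], ["reach"]),
]
f_ok = stratification_mapping(reachability)
f_bad = stratification_mapping([("p", [], ["q"]), ("q", [], ["p"])])
```

Grouping the clauses by the level of their head predicate then yields the strata. The second call returns `None` because `p` and `q` depend on each other through negation.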

An eDatalog\(^\lnot \) program \(\mathcal {P}\) is called a *stratified eDatalog\(^\lnot \) program* if it has a stratification. It is called a *semipositive eDatalog\(^\lnot \) program* if it has a stratification with only one stratum.\(^{3}\)

A pair \(\langle \mathcal {P},\mathcal {A}\rangle \) is called a *stratified eDatalog\(^\lnot \) knowledge base* if it is an eDatalog\(^\lnot \) knowledge base with \(\mathcal {P}\) being a stratified eDatalog\(^\lnot \) program.

**Example 2**

The program \(\mathcal {P}\) given in Example 1 is a stratified eDatalog\(^\lnot \) program with two strata. Each program clause of \(\mathcal {P}\) forms a stratum.

### 2.3 Semantics of eDatalog\(^\lnot \)

Let \(\langle \mathcal {P},\mathcal {A}\rangle \) be an eDatalog\(^\lnot \) knowledge base. By \(\mathcal {P}^{ gr }_\mathcal {A}\) we denote the set of all ground instances of the program clauses of \(\mathcal {P}\cup EqAxioms \) that use only individuals and data constants occurring in \(\mathcal {P}\) or \(\mathcal {A}\).
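A minimal sketch of the grounding step in Python follows. It assumes that atoms are tuples, that variables are strings starting with an upper-case letter, and that all constants are lower-case strings; it also ignores the distinction between \( IType \) and \( LType \), which a faithful implementation would have to respect. The names `is_variable` and `ground_instances` are hypothetical.

```python
from itertools import product


def is_variable(term):
    """Sketch convention: variables are strings starting with an upper-case letter."""
    return isinstance(term, str) and term[:1].isupper()


def ground_instances(clause, constants):
    """Yield all ground instances of a clause over the given constants.

    clause: (head_atom, body) where body is a list of (is_positive, atom) pairs
    and atoms are tuples (predicate, term_1, ..., term_k).
    """
    head, body = clause
    atoms = [head] + [atom for _, atom in body]
    variables = sorted({t for atom in atoms for t in atom[1:] if is_variable(t)})
    for values in product(sorted(constants), repeat=len(variables)):
        theta = dict(zip(variables, values))

        def apply(atom):
            return (atom[0],) + tuple(theta.get(t, t) for t in atom[1:])

        yield apply(head), [(positive, apply(atom)) for positive, atom in body]


# q(X) <- p(X), not r(X), grounded over the constants a and b
clause = (("q", "X"), [(True, ("p", "X")), (False, ("r", "X"))])
instances = list(ground_instances(clause, {"a", "b"}))
```

With a fixed program, the number of variables per clause is bounded, so the number of ground instances is polynomial in the number of constants.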

**Example 3**

We define:

- the well-founded model of an eDatalog\(^\lnot \) knowledge base \(\langle \mathcal {P},\mathcal {A}\rangle \) to be the well-founded model of the ground Datalog\(^\lnot \) program \(\mathcal {P}_\mathcal {A}\cup \mathcal {A}\) [11],
- a stable model of an eDatalog\(^\lnot \) knowledge base \(\langle \mathcal {P},\mathcal {A}\rangle \) to be a stable model of the ground Datalog\(^\lnot \) program \(\mathcal {P}_\mathcal {A}\cup \mathcal {A}\) [12],
- the standard model of a stratified eDatalog\(^\lnot \) knowledge base \(\langle \mathcal {P},\mathcal {A}\rangle \) to be the standard model of the stratified Datalog\(^\lnot \) program \(\mathcal {P}_\mathcal {A}\cup \mathcal {A}\) [1].

A ground substitution \(\theta \) for all the variables of a query \(\varphi \) is called an *answer* to \(\varphi \) w.r.t. \(\langle \mathcal {P},\mathcal {A}\rangle \) and the *well-founded semantics* if \(\varphi \theta \) holds in the well-founded model of \(\langle \mathcal {P},\mathcal {A}\rangle \).\(^{4}\) Similarly, \(\theta \) is called an *answer* to \(\varphi \) w.r.t. \(\langle \mathcal {P},\mathcal {A}\rangle \) and the *stable model semantics* if \(\varphi \theta \) holds in a stable model of \(\langle \mathcal {P},\mathcal {A}\rangle \). If \(\langle \mathcal {P},\mathcal {A}\rangle \) is stratifiable then \(\theta \) is called an *answer* to \(\varphi \) w.r.t. \(\langle \mathcal {P},\mathcal {A}\rangle \) and the *standard semantics* if \(\varphi \theta \) holds in the standard model of \(\langle \mathcal {P},\mathcal {A}\rangle \).
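For a ground program, the well-founded model can be computed by the well-known alternating fixpoint construction. The Python sketch below is one possible rendering, under the assumption that external checkable atoms have already been evaluated away and that facts are written as clauses with empty bodies. Ground atoms are plain strings and the function names are hypothetical.

```python
def least_model(clauses, assumed):
    """Least model of the reduct of a ground program w.r.t. the set `assumed`.

    Clauses with a negated atom in `assumed` are dropped; the remaining
    negated atoms are ignored, leaving a negation-free program.
    """
    reduct = [(h, pos) for h, pos, neg in clauses if not any(a in assumed for a in neg)]
    model, changed = set(), True
    while changed:
        changed = False
        for head, pos in reduct:
            if head not in model and all(a in model for a in pos):
                model.add(head)
                changed = True
    return model


def well_founded(clauses):
    """Return (true atoms, atoms that are true or undefined)."""
    true = set()
    while True:
        possible = least_model(clauses, true)      # overestimate of the true atoms
        new_true = least_model(clauses, possible)  # underestimate of the true atoms
        if new_true == true:
            return true, possible
        true = new_true


# p <- not q.   q <- not p.   r.   s <- r, not t.
program = [("p", [], ["q"]), ("q", [], ["p"]), ("r", [], []), ("s", ["r"], ["t"])]
true_atoms, possible_atoms = well_founded(program)
```

Atoms outside `possible_atoms` are false in the well-founded model, and atoms in `possible_atoms` but not in `true_atoms` are undefined. In the toy program, `r` and `s` are true, `t` is false, and `p` and `q` are undefined.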

As a Datalog\(^\lnot \) program may have zero or more than one stable model, an eDatalog\(^\lnot \) knowledge base may also have zero or more than one stable model. Note that we adopt the answer set programming approach to deal with the case when an eDatalog\(^\lnot \) knowledge base has more than one stable model.
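The following Python sketch shows on tiny ground programs how the number of stable models can be zero or greater than one. It enumerates candidate sets by brute force and checks each one against the Gelfond-Lifschitz reduct. The enumeration is exponential and is meant only as an illustration; the encoding of ground clauses as triples of strings and the function names are assumptions of the sketch.

```python
from itertools import combinations


def least_model(clauses, assumed):
    """Least model of the reduct of a ground program w.r.t. `assumed`."""
    reduct = [(h, pos) for h, pos, neg in clauses if not any(a in assumed for a in neg)]
    model, changed = set(), True
    while changed:
        changed = False
        for head, pos in reduct:
            if head not in model and all(a in model for a in pos):
                model.add(head)
                changed = True
    return model


def stable_models(clauses):
    """All sets M of head atoms such that M is the least model of the reduct w.r.t. M."""
    heads = sorted({head for head, _, _ in clauses})
    models = []
    for size in range(len(heads) + 1):
        for candidate in combinations(heads, size):
            if least_model(clauses, set(candidate)) == set(candidate):
                models.append(set(candidate))
    return models


two_models = stable_models([("p", [], ["q"]), ("q", [], ["p"])])  # p <- not q. q <- not p.
no_model = stable_models([("p", [], ["p"])])                      # p <- not p.
```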

**Proposition 1**

The data complexity of eDatalog\(^\lnot \) with respect to the well-founded semantics is in PTime.

*Proof*

Let \(\langle \mathcal {P},\mathcal {A}\rangle \) be an eDatalog\(^\lnot \) knowledge base. The set \(\mathcal {P}^{ gr }_\mathcal {A}\) can be constructed in polynomial time and has polynomial size in the size of \(\mathcal {A}\). As the truth values of the atoms of external checkable predicates that occur in \(\mathcal {P}^{ gr }_\mathcal {A}\) can be computed in polynomial time, \(\mathcal {P}_\mathcal {A}\) can also be constructed in polynomial time and has polynomial size in the size of \(\mathcal {A}\). It is well known that the well-founded model of the Datalog\(^\lnot \) program \(\mathcal {P}_\mathcal {A}\cup \mathcal {A}\) can be constructed in polynomial time and has polynomial size in the size of \(\mathcal {P}_\mathcal {A}\cup \mathcal {A}\) (see, e.g., [1]). Thus, the well-founded model of \(\langle \mathcal {P},\mathcal {A}\rangle \) can be constructed in polynomial time and has polynomial size in the size of \(\mathcal {A}\). Consequently, answering queries to \(\langle \mathcal {P},\mathcal {A}\rangle \) w.r.t. the well-founded semantics can be done in polynomial time in the size of \(\mathcal {A}\).\(\square \)

**Lemma 1**

Given an eDatalog\(^\lnot \) knowledge base \( KB = \langle \mathcal {P},\mathcal {A}\rangle \) with \(\mathcal {P}\) being a semipositive eDatalog\(^\lnot \) program, the standard Herbrand model of \( KB \) can be computed in polynomial time and has polynomial size in the size of \(\mathcal {A}\).

*Proof*

**Corollary 1**

Given a stratified eDatalog\(^\lnot \) knowledge base \( KB = \langle \mathcal {P},\mathcal {A}\rangle \), the standard Herbrand model of \( KB \) can be computed in polynomial time and has polynomial size in the size of \(\mathcal {A}\). As a consequence, the data complexity of stratified eDatalog\(^\lnot \) with respect to the standard semantics is in PTime.
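The stratum-by-stratum computation behind this corollary can be sketched as follows. The sketch assumes a ground program whose external checkable atoms have already been evaluated, and a stratification mapping `f` on predicates supplied by the caller. Atoms are tuples and the function name `standard_model` is hypothetical.

```python
def standard_model(clauses, facts, f):
    """Evaluate a ground stratified program stratum by stratum.

    clauses: list of (head_atom, positive_atoms, negated_atoms), atoms as tuples.
    facts: set of ground atoms (the ABox).
    f: stratification mapping from predicate names to levels.
    """
    model = set(facts)
    for level in sorted(set(f.values())):
        stratum = [c for c in clauses if f[c[0][0]] == level]
        changed = True
        while changed:
            changed = False
            for head, pos, neg in stratum:
                # Negated atoms belong to lower strata, so their truth is already final.
                if (head not in model and all(a in model for a in pos)
                        and not any(a in model for a in neg)):
                    model.add(head)
                    changed = True
    return model


clauses = [
    (("reach", "a"), [("start", "a")], []),
    (("reach", "b"), [("reach", "a"), ("edge", "a", "b")], []),
    (("unreach", "b"), [("node", "b")], [("reach", "b")]),
    (("unreach", "c"), [("node", "c")], [("reach", "c")]),
]
facts = {("start", "a"), ("edge", "a", "b"), ("node", "b"), ("node", "c")}
f = {"start": 1, "edge": 1, "node": 1, "reach": 1, "unreach": 2}
model = standard_model(clauses, facts, f)
```

Each stratum is a semipositive program relative to the model of the lower strata, so every inner loop is a monotone fixpoint computation that adds at most polynomially many atoms.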

## 3 The web ontology rule language WORL

### 3.1 Syntax and notation of WORL

We use:

- the truth symbol \(\top \) to denote \(\textit{owl:Thing}\) [22],
- \(a\) and \(b\) to denote *individuals* (i.e. *objects*),
- \(d\) to denote a *literal* (i.e. a data constant),
- \(A\) and \(B\) to denote concept names (i.e. \(\textit{Class}\) elements [22]),
- \(C\) and \(D\) to denote *concepts* (i.e. \(\textit{ClassExpression}\) elements [22]),
- \(lC_\pm \) and \(lC\) to denote concepts like a \(\textit{subClassExpression}\) of [22],
- \(rC\) to denote a concept like a \(\textit{superClassExpression}\) of [22],
- \(eC\) to denote a concept like an \(\textit{equivClassExpression}\) of [22],
- \( DT \) to denote a *data type* (i.e. a \(\textit{Datatype}\) of [22]),
- \( DR \) to denote a *data range* (i.e. a \(\textit{DataRange}\) of [22]),
- \(p_{uec}\) to denote a unary predicate from \(\mathrm {ECPreds}\),
- \(r\) and \(s\) to denote *object role names* (i.e. \(\textit{ObjectProperty}\) elements [22]),
- \(R\) and \(S\) to denote *object roles* (i.e. \(\textit{ObjectPropertyExpr.}\) elements [22]),
- \(\sigma \) and \(\varrho \) to denote *data role names* (i.e. \(\textit{DataProperty}\) elements [22]).

*safeness* (*range-restrictedness*) condition.

Comparing with [6], it can be seen that \(\lnot A\), \(\,\ge \! n\,R.lC_\pm \) and \(\exists \sigma .p_{uec}\) for \(lC_\pm \) are additional features w.r.t. OWL 2 RL.

The class constructor \(\textit{ObjectOneOf}\) [22] can be written as \(\{a_1,\ldots ,a_k\}\) and expressed as \(\{a_1\} \sqcup \cdots \sqcup \{a_k\}\). We will use the following abbreviations: \(\mathsf {Func}\) (Functional), \(\mathsf {InvFunc}\) (InverseFunctional), \(\mathsf {Sym}\) (Symmetric), \(\mathsf {Trans}\) (Transitive), \(\mathsf {Key}\) (HasKey).

A *DL TBox axiom*, like a \(\textit{ClassAxiom}\) or a \(\textit{Datatype}\textit{Definition}\) or a \(\textit{HasKey}\) axiom of OWL 2 RL [22], is an expression of one of the following forms, where \(h, k \ge 0\) and \(h+k \ge 1\):

An *RBox axiom*, like an \(\textit{ObjectPropertyAxiom}\) or a \(\textit{Data}\textit{PropertyAxiom}\) of OWL 2 RL [22], is an expression of one of the following forms:

An RBox axiom of the form \(\exists R.\top \sqsubseteq rC\) (resp. \(\top \sqsubseteq \forall R.rC\), \(\exists \sigma \sqsubseteq rC\), \(\top \sqsubseteq \forall \sigma . DR \)) stands for an \(\textit{ObjectPropertyDomain}\) (resp. \(\textit{ObjectPropertyRange}\), \(\textit{Data}\textit{PropertyDomain}\), \(\textit{DataPropertyRange}\)) axiom as in [22].

One can classify these latter axioms as DL TBox axioms instead of RBox axioms. Similarly, \(\mathsf {Key}(\ldots )\) axioms can be classified as RBox axioms instead.

- A (WORL) *TBox axiom* is either a DL TBox axiom (as defined by (3)) or an RBox axiom (as defined by (4)) or an eDatalog\(^\lnot \) program clause.
- A (WORL) *TBox* is a finite set of TBox axioms.
- A *WORL knowledge layer* is a pair \(\mathcal {L}= \langle \mathcal {T},\mathcal {A}\rangle \) consisting of a TBox \(\mathcal {T}\) and an ABox \(\mathcal {A}\).

*WORL knowledge bases* are defined inductively as follows:

- a WORL knowledge layer is a WORL knowledge base,
- if \(\mathcal {L}\) is a WORL knowledge layer and \( KB _1, \ldots , KB _k\) are WORL knowledge bases then \( KB = \langle \mathcal {L}, \{ KB _1, \ldots , KB _k\}\rangle \) is a WORL knowledge base.

**Example 4**

### 3.2 Translating WORL into eDatalog\(^\lnot \)

We first define a translation \(\pi \) that translates a TBox axiom to a set of formulas of classical first-order logic. After that we will refine \(\pi \) to get a translation that converts a TBox to an eDatalog\(^\lnot \) program.

For an eDatalog\(^\lnot \) program clause \(\varphi \), let \(\pi (\varphi ) = \{\varphi \}\).

For \(\pi _{(x)}(\varphi )\) in the cases when \(\varphi \) is \(\exists R.C\), \(\exists R.\top \), \(\ge \!n\,R.C\), \(\exists \sigma . DR \) or \(\exists \sigma .p_{uec}\), note that \(\varphi \) occurs in the left-hand side of \(\rightarrow \) and the introduced variables are existentially quantified. Those quantifiers change to universal when taken out of the scope of \(\rightarrow \).
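As a worked instance of this remark, assume the usual first-order reading of \(\exists R.C\) as \(\exists y\,(R(x,y) \wedge C(y))\) and consider an inclusion with this concept on the left-hand side and a concept name \(B\) on the right-hand side (the names \(B\) and \(y\) are chosen here only for illustration). Since \(y\) does not occur in \(B(x)\), the existential quantifier in the premise becomes a universal quantifier over the whole implication:

$$\begin{aligned} \forall x\,\bigl ((\exists y\,(R(x,y) \wedge C(y))) \rightarrow B(x)\bigr ) \ \equiv \ \forall x\,\forall y\,\bigl ((R(x,y) \wedge C(y)) \rightarrow B(x)\bigr ). \end{aligned}$$

The right-hand formula has the shape of a program clause whose variables are implicitly universally quantified.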

The translation \(\pi \) is very intuitive and we use it also for specifying the meanings of TBox axioms. Given an interpretation \(\mathcal {I}\) and a DL TBox axiom or an RBox axiom \(\varphi \), we define that \(\mathcal {I}\models \varphi \) iff \(\mathcal {I}\models \pi (\varphi )\), where the latter satisfaction relation \(\models \) is defined as usual. We say that \(\mathcal {I}\) is a model of a TBox \(\mathcal {T}\), denoted by \(\mathcal {I}\models \mathcal {T}\), if \(\mathcal {I}\models \varphi \) for all \(\varphi \in \mathcal {T}\).

**Example 5**

**Example 6**

The auxiliary translations \(\pi _{2,l}\) and \(\pi _2\) have the following properties:

- when \(\pi _{2,l}\) is applicable to a formula \(\psi \) of predicate logic, \(\pi _{2,l}(\psi )\) is a set of conjunctions of atomic formulas such that, for any interpretation \(\mathcal {I}\), \(\mathcal {I}\models \bigvee \pi _{2,l}(\psi )\) iff \(\mathcal {I}\models \psi \); for example, $$\begin{aligned}&\pi _{2,l}(r(x,y) \wedge (A_1(y) \vee A_2(y)))\\&\quad = \{r(x,y) \wedge A_1(y),\ r(x,y) \wedge A_2(y) \}; \end{aligned}$$
- when \(\pi _2\) is applicable to a formula \(\psi \) of predicate logic, \(\pi _2(\psi )\) is a set of eDatalog\(^\lnot \) program clauses such that, for any interpretation \(\mathcal {I}\), \(\mathcal {I}\models \bigwedge \pi _2(\psi )\) iff \(\mathcal {I}\models \psi \).

The translation \(\pi _3\) is defined as follows:

- if \(\varphi \) is an eDatalog\(^\lnot \) program clause then \(\pi _3(\varphi ) = \{\varphi \}\),
- if \(\varphi \) is a DL TBox axiom or an RBox axiom then $$\begin{aligned} \pi _3(\varphi ) = \bigcup _{\psi \in \pi (\varphi )} \pi _2(\psi ), \end{aligned}$$
- if \(\varphi \) is a TBox \(\mathcal {T}\) then \(\pi _3(\mathcal {T}) = \bigcup _{\varphi \in \mathcal {T}} \pi _3(\varphi )\).
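The distributive step illustrated by the example for \(\pi _{2,l}\) can be sketched in Python. The sketch covers only the distribution of \(\wedge \) over \(\vee \), not the complete translation, and its formula encoding (strings for atoms, tagged tuples for connectives) and the name `disjuncts` are assumptions made for illustration.

```python
def disjuncts(formula):
    """Split a formula built from atoms, 'and' and 'or' into its disjuncts.

    Each disjunct is returned as a list of atoms read as a conjunction.
    Atoms are strings; compound formulas are tuples (op, left, right).
    """
    if isinstance(formula, str):
        return [[formula]]
    op, left, right = formula
    if op == "or":
        return disjuncts(left) + disjuncts(right)
    if op == "and":
        # Distribute: every disjunct of the left part combines with every one of the right.
        return [l + r for l in disjuncts(left) for r in disjuncts(right)]
    raise ValueError(f"unsupported connective: {op}")


# r(x,y) AND (A1(y) OR A2(y))
result = disjuncts(("and", "r(x,y)", ("or", "A1(y)", "A2(y)")))
```

Each disjunct can then serve as the body of a separate program clause with the same head, which is how a disjunction on the left-hand side of \(\rightarrow \) is eliminated.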

**Lemma 2**

For any (WORL) TBox \(\mathcal {T}\), \(\pi _3(\mathcal {T})\) is an eDatalog\(^\lnot \) program equivalent to \(\mathcal {T}\) in the sense that, for any interpretation \(\mathcal {I}\), \(\mathcal {I}\models \pi _3(\mathcal {T})\) iff \(\mathcal {I}\models \mathcal {T}\).

*Proof*

Whenever the translations are applicable to a formula \(\psi \) of predicate logic, for any interpretation \(\mathcal {I}\):

- \(\mathcal {I}\models \bigvee \pi _{2,l}(\psi )\) iff \(\mathcal {I}\models \psi \),
- \(\mathcal {I}\models \bigwedge \pi _2(\psi )\) iff \(\mathcal {I}\models \psi \).

Together with the definition of \(\pi _3\), this gives the claimed equivalence. It remains to show that \(\pi _3(\mathcal {T})\) is an eDatalog\(^\lnot \) program.

Observe that:

- if \(C\) is a concept of the \(lC\) family then \(\pi _{(x)}(C)\) is a formula \(\psi \) of the \(l\psi \) family such that translating \(\psi \) to the disjunctive normal form by using the distributive laws of \(\wedge \) and \(\vee \) results in \(\psi _1 \vee \ldots \vee \psi _k\) (where each \(\psi _i\) does not contain \(\vee \)) such that variable \(x\) occurs in each \(\psi _i\),
- if \(C\) is a concept of the \(rC\) family then \(\pi _{(x)}(C)\) is a formula of the \(r\psi \) family such that if a variable \(y\) different from \(x\) occurs in the formula then it occurs (among others) in the left-hand side of some \(\rightarrow \) in the formula,
- if \(\psi \) is a formula of the \(l\psi \) family then \(\pi _{2,l}(\psi )\) is a set of formulas of the \(l\psi \) family without the connective \(\vee \) and atoms of the form \(r^-(t,t')\),
- if \(\varphi \) is a DL TBox axiom or an RBox axiom then \(\pi (\varphi )\) is a set of formulas of the \(r\psi \) family such that every variable occurring in a formula from \(\pi (\varphi )\) occurs (among others) in some positive atom of the formula in the left-hand side of some \(\rightarrow \),
- if \(\varphi \) is a DL TBox axiom or an RBox axiom and \(\psi \in \pi (\varphi )\) then \(\pi _2(\psi )\) is a set of eDatalog\(^\lnot \) program clauses.

### 3.3 The well-founded semantics of WORL

The *flattened version* of a WORL knowledge base \( KB \) is the WORL knowledge layer denoted by \(\textit{flatten}( KB )\) and defined as follows:

- if \( KB \) is a layer then \(\textit{flatten}( KB ) = KB \),
- else if \( KB = \langle \mathcal {L}, \{ KB _1, \ldots , KB _k\}\rangle \), \(\mathcal {L}= \langle \mathcal {T},\mathcal {A}\rangle \) and \(\textit{flatten}( KB _i) = \langle \mathcal {T}_i,\mathcal {A}_i\rangle \) for \(1 \le i \le k\), then $$\begin{aligned} \textit{flatten}( KB ) = \langle \mathcal {T}\cup \mathcal {T}_1 \cup \cdots \cup \mathcal {T}_k, \mathcal {A}\cup \mathcal {A}_1 \cup \cdots \cup \mathcal {A}_k\rangle . \end{aligned}$$
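The recursive definition of \(\textit{flatten}\) translates directly into code. In the Python sketch below, the class `KnowledgeBase`, its field names and the string encoding of axioms and facts are assumptions made for illustration; a single layer is represented as a knowledge base with no sub-bases.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KnowledgeBase:
    """A layer <T, A> together with the knowledge bases it is built on."""
    tbox: frozenset
    abox: frozenset
    imports: tuple = ()  # empty for a knowledge base that is a single layer


def flatten(kb):
    """Return the pair (TBox, ABox) of the flattened version of kb."""
    tbox, abox = set(kb.tbox), set(kb.abox)
    for sub_kb in kb.imports:
        sub_tbox, sub_abox = flatten(sub_kb)
        tbox |= sub_tbox
        abox |= sub_abox
    return tbox, abox


base = KnowledgeBase(frozenset({"A [= B"}), frozenset({"A(a)"}))
top = KnowledgeBase(frozenset({"B [= C"}), frozenset({"r(a,b)"}), (base,))
flat_tbox, flat_abox = flatten(top)
```

Because the result is built from set unions, a layer shared by several branches of a rooted directed acyclic graph contributes its axioms and facts only once.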

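Let \( KB \) be a WORL knowledge base and let \(\textit{flatten}( KB ) = \langle \mathcal {T},\mathcal {A}\rangle \). The *well-founded (Herbrand) model* of \( KB \), denoted by \(\mathsf {WF}_ KB \), is defined to be the well-founded model of the eDatalog\(^\lnot \) knowledge base \( KB ' = \langle \pi _3(\mathcal {T}),\mathcal {A}\rangle \).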

An *answer* to a query \(\varphi \) w.r.t. that \( KB \) and the *well-founded semantics* is an answer to \(\varphi \) w.r.t. that \( KB '\) and the well-founded semantics of eDatalog\(^\lnot \).

The *data complexity* of WORL w.r.t. the well-founded semantics is the complexity of the problem of finding all answers to a query \(\varphi \) w.r.t. a WORL knowledge base \( KB \) and the well-founded semantics, measured w.r.t. the sum of the sizes of all ABoxes used in \( KB \) when assuming that \(\mathrm {DPreds}\), \(\varphi \) and all the TBoxes used in \( KB \) are fixed and checking whether a ground atom of an external checkable predicate is true or false can be done in polynomial time.

The following theorem immediately follows from Proposition 1.

**Theorem 1**

The data complexity of WORL with respect to the well-founded semantics is in PTime.

**Example 7**

### 3.4 The stable model semantics of WORL

An *answer set* of a WORL knowledge base is defined inductively as follows:

- An *answer set* of a WORL knowledge layer \(\langle \mathcal {T},\mathcal {A}\rangle \) is defined to be the set of all ground atoms of predicates from \(\mathrm {DPreds}\) that hold in a stable model of \(\langle \mathcal {T},\mathcal {A}\rangle \) (each stable model of \(\langle \mathcal {T},\mathcal {A}\rangle \) gives an answer set).
- An *answer set* of a WORL knowledge base \( KB \) of the form \(\langle \mathcal {L},\{ KB _1,\ldots , KB _k\}\rangle \), where \(\mathcal {L}= \langle \mathcal {T},\mathcal {A}\rangle \), is defined to be an answer set of the WORL knowledge layer \(\langle \mathcal {T},\mathcal {A}\cup \mathcal {A}_1 \cup \cdots \cup \mathcal {A}_k\rangle \), where each \(\mathcal {A}_i\) is an answer set of the WORL knowledge base \( KB _i\).

A ground substitution \(\theta \) for all the variables of a query \(\varphi \) is called an *answer* to \(\varphi \) w.r.t. a WORL knowledge base \(\langle \mathcal {P},\mathcal {A}\rangle \) and the *stable model semantics* if \(\varphi \theta \) holds in the interpretation that corresponds to an answer set of \(\langle \mathcal {P},\mathcal {A}\rangle \) (notice that the answer set programming approach is adopted here).

**Example 8**

## 4 Stratified WORL

A TBox \(\mathcal {T}\) is said to be *stratifiable* if \(\pi _3(\mathcal {T})\) is a stratified eDatalog\(^\lnot \) program. In the “Appendix” we present a direct method for checking stratifiability of a TBox without using translation.

A WORL knowledge layer \(\langle \mathcal {T},\mathcal {A}\rangle \) is called a *SWORL knowledge layer* if \(\mathcal {T}\) is stratifiable. A WORL knowledge base is called a *SWORL knowledge base* if it is either a SWORL knowledge layer or a pair \(\langle \mathcal {L}, \{ KB _1, \ldots , KB _k\}\rangle \) where \(\mathcal {L}\) is a SWORL knowledge layer and each \( KB _i\) is a SWORL knowledge base.

Note that flattening a SWORL knowledge base \(\langle \mathcal {L}, \{ KB _1, \ldots , KB _k\}\rangle \) may result in a WORL knowledge layer that is not stratifiable.

The *standard Herbrand model* of a SWORL knowledge base \( KB \), denoted by \(\mathcal {H}_ KB \), is defined as follows:

- If \( KB \) is a SWORL knowledge layer \(\langle \mathcal {T},\mathcal {A}\rangle \) then \(\mathcal {H}_ KB \) is the standard Herbrand model of the stratified eDatalog\(^\lnot \) knowledge base \(\langle \pi _3(\mathcal {T}),\mathcal {A}\rangle \).
- If \( KB = \langle \mathcal {L}, \{ KB _1, \ldots , KB _k\}\rangle \) and \(\mathcal {L}= \langle \mathcal {T},\mathcal {A}\rangle \) then \(\mathcal {H}_ KB \) is the standard Herbrand model of the stratified eDatalog\(^\lnot \) knowledge base \(\langle \pi _3(\mathcal {T}),\mathcal {A}\cup \mathcal {H}_{ KB _1} \cup \cdots \cup \mathcal {H}_{ KB _k}\rangle \).

The *standard model* of a SWORL knowledge base \( KB \) is defined to be the traditional interpretation corresponding to \(\mathcal {H}_ KB \) and is denoted by \(\mathcal {M}_ KB \).
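The inductive definition, and the effect of layering on the semantics, can be illustrated by a small propositional sketch in Python. It assumes that each layer is already given as a ground program in which every atom is its own predicate, and it represents a knowledge base as a triple (clauses, facts, sub-bases). These encodings and all function names are assumptions of the sketch.

```python
def strata(clauses):
    """Stratification mapping of a propositional program, or None if there is none."""
    atoms = {h for h, _, _ in clauses} | {a for _, pos, neg in clauses for a in pos + neg}
    f = {a: 1 for a in atoms}
    changed = True
    while changed:
        changed = False
        for head, pos, neg in clauses:
            need = max([f[head]] + [f[a] for a in pos] + [f[a] + 1 for a in neg])
            if need > len(atoms):
                return None
            if need > f[head]:
                f[head] = need
                changed = True
    return f


def layer_model(clauses, facts):
    """Standard model of one stratifiable layer, computed stratum by stratum."""
    f = strata(clauses)
    if f is None:
        raise ValueError("layer is not stratifiable")
    model = set(facts)
    for level in sorted(set(f.values())):
        changed = True
        while changed:
            changed = False
            for head, pos, neg in clauses:
                if (f[head] == level and head not in model
                        and all(a in model for a in pos)
                        and not any(a in model for a in neg)):
                    model.add(head)
                    changed = True
    return model


def standard_herbrand_model(kb):
    """kb = (clauses, facts, sub_kbs); models of sub-bases are added to the facts."""
    clauses, facts, sub_kbs = kb
    inherited = set(facts)
    for sub_kb in sub_kbs:
        inherited |= standard_herbrand_model(sub_kb)
    return layer_model(clauses, inherited)


lower = ([("q", [], ["p"])], set(), [])     # q <- not p
upper = ([("p", [], ["q"])], set(), [lower])  # p <- not q, built on `lower`
layered = standard_herbrand_model(upper)
flattened = strata(upper[0] + lower[0])
```

Each of the two layers is stratifiable on its own, and the layered evaluation derives `q` in the lower layer before the upper layer is considered. Flattening puts both clauses into one program, for which no stratification mapping exists.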

The notions of an *answer* to a query w.r.t. a SWORL knowledge base and of the data complexity of SWORL are defined as usual:

- Given a SWORL knowledge base \( KB \) and a query \(\varphi \), a (correct) *answer* to \(\varphi \) w.r.t. \( KB \) and the *standard semantics* is a ground substitution \(\theta \) for all the variables of \(\varphi \) such that \(\mathcal {M}_ KB \models \varphi \theta \), where \(\models \) is the satisfaction relation defined in the usual way.
- The *data complexity* of SWORL w.r.t. the standard semantics is the complexity of the problem of finding all answers to a query \(\varphi \) w.r.t. a SWORL knowledge base \( KB \) and the standard semantics, measured w.r.t. the sum of the sizes of all ABoxes used in \( KB \) when assuming that \(\mathrm {DPreds}\), \(\varphi \), the structure of \( KB \) and all the TBoxes used in \( KB \) are fixed and checking whether a ground atom of an external checkable predicate is true or false can be done in polynomial time.

**Theorem 2**

The data complexity of SWORL with respect to the standard semantics is in PTime.

*Proof*

Let \( KB \) be a SWORL knowledge base and \(n\) be the sum of the sizes of all ABoxes used in \( KB \). We prove by induction on the structure of \( KB \) that the standard Herbrand model \(\mathcal {H}_ KB \) of \( KB \) can be computed in polynomial time and has polynomial size in \(n\,\):

- If \( KB \) is a SWORL knowledge layer \(\langle \mathcal {T},\mathcal {A}\rangle \) then \(\mathcal {H}_ KB \) is the standard Herbrand model of the stratified eDatalog\(^\lnot \) knowledge base \(\langle \pi _3(\mathcal {T}),\mathcal {A}\rangle \), and by Corollary 1, \(\mathcal {H}_ KB \) can be computed in polynomial time and has polynomial size in \(n\).
- If \( KB = \langle \langle \mathcal {T},\mathcal {A}\rangle , \{ KB _1, \ldots , KB _k\}\rangle \) then:
  - By the inductive assumption, \(\mathcal {H}_{ KB _1}\), ..., \(\mathcal {H}_{ KB _k}\) can be computed in polynomial time and have polynomial size in \(n\).
  - \(\mathcal {H}_ KB \) is the standard Herbrand model of the stratified eDatalog\(^\lnot \) knowledge base \(\langle \pi _3(\mathcal {T}),\mathcal {A}\cup \mathcal {H}_{ KB _1} \cup \cdots \cup \mathcal {H}_{ KB _k}\rangle \), and by Corollary 1, \(\mathcal {H}_ KB \) can be computed in polynomial time and has polynomial size in the size of \(\mathcal {A}\cup \mathcal {H}_{ KB _1} \cup \cdots \cup \mathcal {H}_{ KB _k}\).

Hence, \(\mathcal {H}_ KB \) can be computed in polynomial time and has polynomial size in \(n\).
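The last step of the induction can be spelled out as follows. The symbols \(p\) and \(q\) are introduced here only for illustration and do not occur in the original proof: \(p\) stands for a polynomial bounding the size of the model given by Corollary 1 in terms of the size of the input facts, and \(q\) for a polynomial bounding the sizes of \(\mathcal {H}_{ KB _1}\), ..., \(\mathcal {H}_{ KB _k}\) in terms of \(n\), as provided by the inductive assumption. Since the size of \(\mathcal {A}\) is at most \(n\),

\[
|\mathcal {A}\cup \mathcal {H}_{ KB _1} \cup \cdots \cup \mathcal {H}_{ KB _k}| \le n + k \cdot q(n)
\quad \text{and hence}\quad
|\mathcal {H}_ KB | \le p\bigl (n + k \cdot q(n)\bigr ).
\]

As the structure of \( KB \) is fixed, \(k\) and the nesting depth are constants, so the right-hand side is a composition of a fixed number of polynomials and is therefore polynomial in \(n\). The same argument applies to the computation time.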

The standard semantics of SWORL coincides with the well-founded semantics when restricting to SWORL knowledge bases that are single layers and to queries of the form \((\varphi _1 \wedge \cdots \wedge \varphi _h \wedge \xi _1 \wedge \cdots \wedge \xi _l \wedge \lnot \zeta _1 \wedge \cdots \wedge \lnot \zeta _m)\), where \(\varphi _1\), ..., \(\varphi _h\) are atoms of predicates from \(\mathrm {DPreds}\) and \(\xi _1\), ..., \(\xi _l\), \(\zeta _1\), ..., \(\zeta _m\) are atoms of predicates from \(\mathrm {ECPreds}\).

### 4.1 Example: apartment renting

In this subsection we discuss apartment renting, a common activity that is often tedious and time-consuming. The example is based on the one in [2]; the difference is that we use SWORL instead of defeasible logic.

Carlos is looking for an apartment of at least 45 m\(^2\) with at least two bedrooms. If it is on the third floor or higher, the house must have an elevator. Also, pet animals must be allowed.

Carlos is willing to pay $300 for a centrally located 45 m\(^2\) apartment, and $250 for a similar flat in the suburbs. In addition, he is willing to pay an extra $5 per m\(^2\) for a larger apartment, and $2 per m\(^2\) for a garden.

He is unable to pay more than $400 in total. If given the choice, he would go for the cheapest option. His second priority is the presence of a garden; his lowest priority is additional space.

The following predicates are used to describe apartments:

- \(\textit{hasSize}(X,Y)\): \(Y\) is the size of apartment \(X\),
- \(\textit{bedrooms}(X,Y)\): apartment \(X\) has \(Y\) bedrooms,
- \(\textit{hasPrice}(X,Y)\): \(Y\) is the rent price of apartment \(X\),
- \(\textit{floor}(X,Y)\): apartment \(X\) is on the \(Y^{ th }\) floor,
- \(\textit{garden}(X,Y)\): apartment \(X\) has a garden of size \(Y\),
- \(\textit{withLift}(X)\): there is an elevator in the house of \(X\),
- \(\textit{allowsPets}(X)\): pets are allowed in apartment \(X\),
- \(\textit{central}(X)\): apartment \(X\) is centrally located.

Let \(\mathcal {T}\) = {(5), ..., (17)}. It is a stratifiable TBox. Only (11) is a DL TBox axiom, while the other axioms are eDatalog\(^\lnot \) program clauses. The program clauses (5), (13), (15) and (17) can also be expressed as DL TBox axioms, treating \(\textit{withGarden}\), \(\textit{acceptable}\), \(\textit{excluded}_1\), \(\textit{preferable}_1\), \(\textit{excluded}_2\), \(\textit{preferable}_2\), \(\textit{excluded}_3\) and \(\textit{mayRent}\) as concept names.

Flat | Bedrooms | Size (m\(^2\)) | Central | Floor | Lift | Pets | Garden (m\(^2\)) | Price ($) |
---|---|---|---|---|---|---|---|---|
a1 | 1 | 50 | Yes | 1 | No | Yes | | 300 |
a2 | 2 | 45 | Yes | 0 | No | Yes | | 335 |
a3 | 2 | 65 | No | 2 | No | Yes | | 350 |
a4 | 2 | 55 | No | 1 | Yes | No | 15 | 330 |
a5 | 3 | 55 | Yes | 0 | No | Yes | 15 | 350 |
a6 | 2 | 60 | Yes | 3 | No | No | | 370 |
a7 | 3 | 65 | Yes | 1 | No | Yes | 12 | 375 |

For example, \(\textit{bedrooms}(a1,1)\), \(\textit{hasSize}(a1,50)\), \(\textit{central}(a1)\), \(\textit{floor}(a1,1)\), \(\textit{allowsPets}(a1)\) and \(\textit{hasPrice}(a1,300)\) are the atoms of \(\mathcal {A}\) that involve apartment \(a1\). As ABoxes contain only positive information, \(\textit{withLift}(a4)\) is the only atom of predicate \(\textit{withLift}\) that occurs in \(\mathcal {A}\).

The pair \( KB = \langle \mathcal {T},\mathcal {A}\rangle \) is a SWORL knowledge layer (and a SWORL knowledge base). The standard Herbrand model \(\mathcal {H}_ KB \) contains atoms \(\textit{acceptable}(X)\) only for \(X \in \{a3\), \(a5\), \(a7\}\) and atoms \(\textit{preferable}_1(X)\) only for \(X \in \{a3\), \(a5\}\). Only atom \(\textit{preferable}_2(a5)\) of predicate \(\textit{preferable}_2\) and atom \(\textit{mayRent}(a5)\) of predicate \(\textit{mayRent}\) occur in \(\mathcal {H}_ KB \).
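The stated contents of \(\mathcal {H}_ KB \) can be cross-checked with a small Python sketch. It does not implement the axioms (5), ..., (17). Instead, it encodes Carlos's informal requirements directly, under the following assumptions: an apartment is acceptable if it meets the hard constraints and its price does not exceed the amount Carlos is willing to pay, and each \(\textit{excluded}_i\) set is computed first and then subtracted, mimicking stratified negation. The names `Flat`, `offer` and `select` are hypothetical, and a garden size of 0 encodes the absence of a \(\textit{garden}\) atom.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Flat:
    name: str
    bedrooms: int
    size: int      # m^2
    central: bool
    floor: int
    lift: bool
    pets: bool
    garden: int    # m^2; 0 encodes "no garden atom in the ABox" (assumption)
    price: int     # $


FLATS = [
    Flat("a1", 1, 50, True, 1, False, True, 0, 300),
    Flat("a2", 2, 45, True, 0, False, True, 0, 335),
    Flat("a3", 2, 65, False, 2, False, True, 0, 350),
    Flat("a4", 2, 55, False, 1, True, False, 15, 330),
    Flat("a5", 3, 55, True, 0, False, True, 15, 350),
    Flat("a6", 2, 60, True, 3, False, False, 0, 370),
    Flat("a7", 3, 65, True, 1, False, True, 12, 375),
]


def offer(f):
    """Amount Carlos is willing to pay, read off the informal description."""
    base = 300 if f.central else 250
    return base + 5 * (f.size - 45) + 2 * f.garden


def acceptable(f):
    return (f.size >= 45 and f.bedrooms >= 2
            and (f.floor < 3 or f.lift)      # lift required from the 3rd floor up
            and f.pets
            and f.price <= 400
            and f.price <= offer(f))


def select(flats):
    """Apply the three priorities in order: price, garden, size."""
    acc = {f for f in flats if acceptable(f)}
    # excluded_1: some acceptable flat is strictly cheaper
    excluded1 = {x for x in acc if any(y.price < x.price for y in acc)}
    pref1 = acc - excluded1
    # excluded_2: no garden, while some flat in preferable_1 has one
    excluded2 = {x for x in pref1
                 if x.garden == 0 and any(y.garden > 0 for y in pref1)}
    pref2 = pref1 - excluded2
    # excluded_3: some flat in preferable_2 is strictly larger
    excluded3 = {x for x in pref2 if any(y.size > x.size for y in pref2)}
    may_rent = pref2 - excluded3
    return {
        "acceptable": {f.name for f in acc},
        "preferable1": {f.name for f in pref1},
        "preferable2": {f.name for f in pref2},
        "mayRent": {f.name for f in may_rent},
    }


result = select(FLATS)
print(result)
```

Under these assumptions the sketch yields the same sets as stated above: a3, a5 and a7 are acceptable, a3 and a5 are the cheapest among them, and a5 is the only one of the two with a garden. For instance, the offer for a7 is 300 + 5·20 + 2·12 = 424, which covers its price of 375.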

## 5 Conclusions

We have developed the Web ontology rule languages WORL and SWORL together with the well-founded semantics and the stable model semantics for WORL and the standard semantics for SWORL. Both WORL with respect to the well-founded semantics and SWORL with respect to the standard semantics have PTime data complexity.

As WORL can be translated into eDatalog\(^\lnot \) and SWORL can be translated into stratified eDatalog\(^\lnot \), the languages WORL and SWORL are not more expressive than eDatalog\(^\lnot \) and stratified eDatalog\(^\lnot \), respectively. However, WORL and SWORL also allow the syntax of description logics (and hence also of OWL). This has the same benefits as in the case of OWL 2 RL compared to eDatalog, and is very useful for applications of the Semantic Web. As Web ontology rule languages, WORL and SWORL have the advantage of using efficient computational methods of Datalog\(^\lnot \) (extended for eDatalog\(^\lnot \)).

Using nonmonotonic semantics for negation in concept inclusion axioms is a novelty of our approach. Modularity of SWORL is also worth mentioning.

## Acknowledgments

This work was supported by the Polish National Science Center (NCN) under Grants No. 2011/01/B/ST6/02769 and 2011/01/B/ST6/02759. The first author would also like to thank the Warsaw Center of Mathematics and Computer Science for support.