In this section we introduce our approach for knowledge representation on the web, based on prototypes. First, we provide an informal overview of the approach, illustrating the main concepts. Then we introduce a formal syntax and semantics.
3.1 Informal Presentation
To illustrate the prototype system we use an example about two Early Netherlandish painters, the brothers van Eyck. First, we look at a simple representation of the Arnolfini Portrait in Fig. 2 (Footnote 2). This figure contains the prototype of the portrait which is derived from the empty prototype (\(\mathtt {P}_{\emptyset }\), see Sect. 3.2) and has two properties. The first property is dc:creator and has value Jan van Eyck (Footnote 3). The second property describes the format of the artwork. We also display the example using a concrete syntax.
Next, we make use of the prototype nature of the representation. Starting from the Arnolfini Portrait, we derive the Ghent Altarpiece. This painting was created by the same painter, but his brother Hubert van Eyck was also involved in its creation. Figure 3 illustrates how this inheritance works in practice; we create a prototype for the second work and indicate that its base is the first one (using the big open arrow). Then, we add a property asserting that the other brother is also a creator of the work. The resulting prototype has the properties we defined directly as well as those inherited from its base.
Often, the base prototype will have properties which are not correct for the derived prototype. In the example shown in Fig. 4 we added the example:location property to the Arnolfini Portrait with the value National Gallery, London. The Ghent Altarpiece is, however, located in the Saint Bavo Cathedral, Ghent. Hence, we first remove the example:location property inherited from the Arnolfini Portrait before we add the correct location to the second painting. In effect, the resulting prototype inherits the properties of its base, can remove unneeded ones, and can add its own properties as needed.
Another way to arrive at the same final state would be to derive from a base without any properties and add all the properties needed. The predefined empty prototype (proto:P_0) has no properties. All other prototypes derive from an already existing prototype; circular derivation is not permitted. To obtain this alternative, we let the prototype which we are creating derive directly from the empty prototype and add all of its properties explicitly. This flattening of inherited properties produces the prototype’s fixpoint. The fixpoint of the prototype created in Fig. 4 can be found in Fig. 5.
In the proposed system we apply the closed world and the unique name assumptions. If the system used the open world assumption and one asked whether the Arnolfini Portrait is located in Beijing, the system would only be able to answer that it does not know. In a closed world setting, the system will answer that the painting is not in Beijing. This conclusion does not follow from the system knowing that the painting is located in England, but from the absence of any statement that it is in Beijing. Without the unique name assumption, the system would not be able to answer how many paintings it knows about. Instead, it would only be able to tell that there are one or more, because the resource names Ghent Altarpiece and Arnolfini Portrait might refer to the same real-world instance.
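The difference between the two assumptions can be sketched in a few lines of Python. This is only an illustration: the triple representation, the function names, and the identifiers example:National_Gallery_London and example:Beijing are assumptions made for this sketch and are not part of the proposed system.

```python
# Hypothetical fact base: a set of (subject, property, value) triples.
facts = {
    ("example:Arnolfini_Portrait", "example:location",
     "example:National_Gallery_London"),
}


def ask_closed_world(s, p, o):
    """Closed world: anything that is not stated is taken to be false."""
    return (s, p, o) in facts


def ask_open_world(s, p, o):
    """Open world: absence of a statement only means 'unknown' (None)."""
    return True if (s, p, o) in facts else None


query = ("example:Arnolfini_Portrait", "example:location", "example:Beijing")
closed_answer = ask_closed_world(*query)  # False: the painting is not in Beijing
open_answer = ask_open_world(*query)      # None: the system does not know

# Under the unique name assumption, distinct names denote distinct paintings,
# so counting names is the same as counting paintings.
names = {"example:Ghent_Altarpiece", "example:Arnolfini_Portrait"}
painting_count = len(names)
```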
3.2 Formal Presentation
The goal of this section is to give a formal presentation of the concepts discussed in the previous section. We separate the formal definition into two parts. First, we define the syntax of our prototype language. Then, we present the semantic interpretation and a couple of definitions which we used informally above.
Prototype Syntax. In this section we define the formal syntax of prototype-based knowledge bases. We define a set of syntactic material first, before we define the language.
Definition 1
(Prototype Expressions). Let ID be a set of absolute IRIs according to RFC 3987 [7] without the IRI proto:P_0. The IRI proto:P_0 is the empty prototype and will be denoted as \(\mathtt {P}_{\emptyset }\). We define expressions as follows:
- Let \(p \in ID\) and \(r_1,\dots , r_m \in ID\) with \(1 \le m\). An expression \((p, \{r_1,\dots , r_m \})\) or \( (p, *)\) is called a simple change expression. p is called the simple change expression ID, or its property. The set \(\{r_1,\dots , r_m\}\) or \(*\) is called the values of the simple change expression.
- Let \(id \in ID\) and \(base \in ID \cup \{\mathtt {P}_{\emptyset }\}\) and let add and remove be two sets of simple change expressions (called change expressions) such that each simple change expression ID occurs at most once in each of the add and remove sets and \(*\) does not occur in the add set. An expression (id, (base, add, remove)) is called a prototype expression. id is called the prototype expression ID.
Let \(\mathtt {PROTO}\) be the set of all prototype expressions. The tuple \(PL=(\mathtt {P}_{\emptyset }, ID, \mathtt {PROTO})\) is called the Prototype Language.
Informally, a prototype expression contains the parts of a prototype which we introduced in the previous subsection. It has an id, a base (a reference to the prototype it derives from), and a description of the properties which are added and removed.
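As a minimal sketch of how a prototype expression could be held in memory, the following Python fragment mirrors Definition 1. The class name PrototypeExpression, the constant names, and the choice of a dictionary per change expression (which enforces that each property occurs at most once) are assumptions of this sketch. The removal of example:location with \(*\) is purely illustrative and is not taken from the figures.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, Union

P_EMPTY = "proto:P_0"  # IRI of the empty prototype
STAR = "*"             # wildcard value, only allowed in remove sets

Values = Union[FrozenSet[str], str]  # a non-empty set of IRIs, or STAR


@dataclass(frozen=True)
class PrototypeExpression:
    """(id, (base, add, remove)); each dict maps a property to its values."""
    id: str
    base: str
    add: Dict[str, Values] = field(default_factory=dict)
    remove: Dict[str, Values] = field(default_factory=dict)

    def __post_init__(self):
        if self.id == P_EMPTY:
            raise ValueError("the empty prototype cannot be redefined")
        for name, changes in (("add", self.add), ("remove", self.remove)):
            for prop, values in changes.items():
                if values == STAR:
                    if name == "add":
                        raise ValueError("'*' may not occur in the add set")
                elif len(values) < 1:
                    raise ValueError(f"{prop}: at least one value is required")


ap = PrototypeExpression(
    id="example:Arnolfini_Portrait",
    base=P_EMPTY,
    add={"dc:creator": frozenset({"example:Jan_Van_Eyck"})},
)
ga = PrototypeExpression(
    id="example:Ghent_Altarpiece",
    base=ap.id,
    add={"dc:creator": frozenset({"example:Hubert_Van_Eyck"})},
    remove={"example:location": STAR},  # illustrative removal only
)
```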
As an example, we could write down the example of Fig. 4 using this syntax. The prototype expression of the Arnolfini Portrait would look like this:
.
The prototype for the Altarpiece would be written down as follows:
.
This syntax is trivially transformable into the concrete syntax which we used in Fig. 4b and the other examples in the previous subsection.
Definition 2
(dom). The domain of a finite subset \(S \subseteq \mathtt {PROTO}\), written dom(S), is the set of the prototype expression IDs of all prototype expressions in S.
Definition 3
(Grounded). Let \(PL=(\mathtt {P}_{\emptyset }, ID, \mathtt {PROTO})\) be the Prototype Language. Let \(S \subseteq \mathtt {PROTO}\) be a finite subset of \(\mathtt {PROTO}\). The set \(\mathcal {G}\) is defined as:
1. \(\mathtt {P}_{\emptyset }\in \mathcal {G}\)
2. If there is a prototype \((id, (base, add,remove)) \in S\) and \(base \in \mathcal {G}\) then \(id \in \mathcal {G}\).
3. \(\mathcal {G}\) is the smallest set satisfying (1) and (2).
S is called grounded iff \(\mathcal {G} = dom(S) \cup \{\mathtt {P}_{\emptyset }\}\). This condition ensures that all prototypes derive (recursively) from \(\mathtt {P}_{\emptyset }\) and hence ensures that no cycles occur.
To illustrate how cycles are avoided by this definition, imagine that \(S=\lbrace (A, (\mathtt {P}_{\emptyset }, \emptyset , \emptyset )), (B, (C, \emptyset , \emptyset )), (C, (B, \emptyset , \emptyset )) \rbrace \). There is a cycle between B and C. If we now construct the set \(\mathcal {G}\), we get \(\mathcal {G} = \{\mathtt {P}_{\emptyset }, A\}\) while \(dom(S) \cup \{\mathtt {P}_{\emptyset }\} = \{A,B,C,\mathtt {P}_{\emptyset }\}\), and hence the condition for being grounded is not fulfilled.
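The construction of \(\mathcal {G}\) can be sketched as a simple iteration that keeps adding prototype IDs until nothing changes. The function names grounded_set and is_grounded and the tuple encoding of prototype expressions are assumptions of this sketch.

```python
P_EMPTY = "proto:P_0"  # IRI of the empty prototype


def grounded_set(S):
    """Smallest set G with P_EMPTY in G that is closed under rule (2).

    S is a list of (id, (base, add, remove)) tuples.
    """
    G = {P_EMPTY}
    changed = True
    while changed:
        changed = False
        for pid, (base, _add, _remove) in S:
            if base in G and pid not in G:
                G.add(pid)
                changed = True
    return G


def is_grounded(S):
    """S is grounded iff G equals dom(S) together with P_EMPTY."""
    dom = {pid for pid, _ in S}
    return grounded_set(S) == dom | {P_EMPTY}


# The cyclic example from the text: B and C derive from each other.
S = [
    ("A", (P_EMPTY, {}, {})),
    ("B", ("C", {}, {})),
    ("C", ("B", {}, {})),
]
cyclic_G = grounded_set(S)   # contains only P_EMPTY and "A"
cyclic_ok = is_grounded(S)   # False, B and C are never reached
```

Because B and C only become reachable once the other one is already in \(\mathcal {G}\), neither is ever added, which is exactly how the definition excludes cycles.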
Definition 4
(Prototype Knowledge Base). Let \(PL=(\mathtt {P}_{\emptyset }, ID, \mathtt {PROTO})\) be the Prototype Language. Let \(KB \subseteq \mathtt {PROTO}\) be a finite subset of \(\mathtt {PROTO}\). KB is called a Prototype Knowledge Base iff (1) KB is grounded, (2) no two prototype expressions in KB have the same prototype expression ID, and (3) for each prototype expression \((id, (base, add,remove)) \in KB\), each of the values of the simple change expressions in add is also in dom(KB).
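The three conditions of Definition 4 can be checked mechanically. The following is a sketch under the assumption that prototype expressions are encoded as (id, (base, add, remove)) tuples, with add and remove given as dictionaries from a property to a set of values; the function name is_knowledge_base and the short identifiers are hypothetical.

```python
P_EMPTY = "proto:P_0"  # IRI of the empty prototype


def is_knowledge_base(exprs):
    """Check conditions (1) to (3) for a list of prototype expressions."""
    ids = [pid for pid, _ in exprs]
    dom = set(ids)

    # (2) no two prototype expressions share the same ID
    if len(ids) != len(dom):
        return False

    # (1) grounded: every ID must be reachable from the empty prototype
    G = {P_EMPTY}
    grew = True
    while grew:
        grew = False
        for pid, (base, _add, _remove) in exprs:
            if base in G and pid not in G:
                G.add(pid)
                grew = True
    if G != dom | {P_EMPTY}:
        return False

    # (3) every value in an add set must itself be a prototype ID in dom(KB)
    return all(
        value in dom
        for _pid, (_base, add, _remove) in exprs
        for values in add.values()
        for value in values
    )


incomplete = [("AP", (P_EMPTY, {"creator": {"JVE"}}, {}))]
complete = incomplete + [("JVE", (P_EMPTY, {}, {}))]
```

In this sketch, incomplete fails condition (3) because JVE is used as a value but is not defined as a prototype, while complete satisfies all three conditions.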
Definition 5
(R). Let KB be a prototype knowledge base and \(id \in ID\). Then, the resolve function R is defined as: \( R(KB, id) = \) the prototype expression in KB which has prototype expression ID equal to id.
Prototype Semantics
Definition 6
(Prototype-Structure). Let SID be a set of identifiers. A tuple \(pv=(p, \{v_1,\dots ,v_n\})\) with \(p, v_i \in SID\) is called a Value-Space for the ID-Space SID. A tuple \(o = (id, \{pv_1, \dots , pv_m\})\) with \(id \in SID\) and Value-Spaces \(pv_i, 1 \le i \le m\) for the ID-Space SID is called a Prototype for the ID-Space SID. A Prototype-Structure \(O=(SID, OB, I)\) for a Prototype Language PL consists of an ID-Space SID, a Prototype-Space OB consisting of all Prototypes for the ID-Space SID and an interpretation function I, which maps IDs from PL to elements of SID.
Definition 7
(Herbrand-Interpretation). Let \(O=(SID, OB, I_h)\) be a Prototype-Structure for the prototype language \(PL=(\mathtt {P}_{\emptyset }, ID, \mathtt {PROTO})\). \(I_h\) is called a Herbrand-Interpretation if \(I_h\) maps every element of ID to exactly one distinct element of SID.
As per the usual convention used for Herbrand-Interpretations, we assume that ID and SID are identical.
Next, we define the meaning of the constituents of a prototype. We start with the interpretation functions \(I_{s}\) and \(I_{c}\) which give the semantic meaning of the syntax symbols related to change expressions. These functions (and some of the following ones) are parametrized (one might say contextualized) by the knowledge base. This is needed to link the prototypes together.
Definition 8
(\(I_{s}\)). Interpretation for the values of a simple change expression. Let KB be a prototype knowledge base and v the values of a simple change expression. Then, the interpretation for the values of the simple change expression \(I_s(KB, v)\) is a subset of SID defined as follows:
$$\begin{aligned} I_s(KB, v) = {\left\{ \begin{array}{ll} SID, &{} \text {if } v = * \\ \{ I_h(r_1), I_h(r_2), \dots , I_h(r_n) \}, &{} \text {if } v =\{r_1,\dots ,r_n\} \end{array}\right. } \end{aligned}$$
Definition 9
(\(I_c\)). Interpretation of a change expression. Let KB be a prototype knowledge base and let the function \(ce=\{(p_1,vs_1), (p_2, vs_2), \dots \}\) be a change expression with \(p_1, p_2, \dots \in ID\) and the \(vs_i\) the values of the simple change expressions. Let \(W = ID \setminus \{p_1,p_2, \dots \}\). Then, the interpretation of the change expression \(I_c(KB, ce)\) is a function defined as follows (we will refer to this interpretation as a change set; note that this set defines a function):
$$\begin{aligned} I_c(KB, ce) = \{ (I_h(p_1), I_s(KB, vs_1)), (I_h(p_2), I_s(KB, vs_2)), \dots \} \cup \bigcup _{w \in W} \{(I_h(w), \emptyset )\} \end{aligned}$$
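The two interpretation functions can be sketched in Python as follows. Two simplifications are assumed here: the Herbrand-Interpretation is the identity, and a finite identifier set sid stands in for SID (and for the KB parameter, which only serves as context in the definitions). The short identifiers are hypothetical.

```python
STAR = "*"  # wildcard value of a simple change expression


def I_h(iri):
    """Herbrand-Interpretation: every identifier denotes itself."""
    return iri


def I_s(sid, v):
    """Interpret the values of a simple change expression."""
    if v == STAR:
        return frozenset(sid)            # '*' stands for all of SID
    return frozenset(I_h(r) for r in v)


def I_c(sid, ce):
    """Interpret a change expression as a total function on sid.

    ce maps properties to values; every other identifier (the set W in
    the definition) is mapped to the empty set.
    """
    change_set = {I_h(w): frozenset() for w in sid}
    for p, vs in ce.items():
        change_set[I_h(p)] = I_s(sid, vs)
    return change_set


sid = {"creator", "location", "JVE", "HVE", "NG"}
add_change_set = I_c(sid, {"creator": {"HVE"}})
remove_change_set = I_c(sid, {"location": STAR})
```

Here add_change_set["creator"] is the set containing HVE and add_change_set["location"] is empty, whereas remove_change_set["location"] is all of sid.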
Next, we define J which defines what it means for a prototype to have a property.
Definition 10
(J). The value for a property of a prototype. Let KB be a prototype knowledge base and \(id, p \in ID\). Let \(R(KB, id) = (id, (b, a, r))\) (the resolve function applied to id), where b is the base, a the add set, and r the remove set. Then the value for the property p of the prototype id, i.e., J(KB, id, p) is:
$$\begin{aligned} J(KB, id, p) = {\left\{ \begin{array}{ll} I_c(KB, a) (I_h(p)), &{} \text {if } b = \mathtt {P}_{\emptyset } \\ (J(KB, b, p) \setminus I_c(KB,r) (I_h(p))) \cup I_c(KB, a) (I_h(p)), &{} \text {otherwise} \end{array}\right. } \end{aligned}$$
Informally, this function maps a prototype and a property to (1) the set of values defined for this property in the base of the prototype (2) minus what is in the remove set (3) plus what is in the add set.
As an example, let us try to find out what the value for the creator of the Ghent Altarpiece described in the example of the previous subsection would evaluate to assuming that these prototypes were part of a Prototype Knowledge Base KB. For brevity we will write example:Ghent_Altarpiece as GA, example:Arnolfini_Portrait as AP, dc:creator as creator, example:Jan_Van_Eyck as JVE, and example:Hubert_Van_Eyck as HVE.
Concretely, we have to evaluate \(J(KB, GA, creator) = (J(KB, AP, creator) \setminus I_c(KB, \emptyset )(creator)) \cup I_c(KB, add)(creator)\) where add is the add change set of the GA prototype expression. First we compute the recursive part, \(J(KB, AP, creator) = I_c(KB, add_{ap}) (creator) = \{ (creator,\{JVE\}), \dots \} (creator) = \{JVE\}\), where \(add_{ap}\) is the add change set of the AP prototype expression. The second part (what is removed) becomes \(I_c(KB, \emptyset )(creator) = \emptyset \). The final part (what this prototype is adding) becomes \(I_c(KB, add)(creator) = \{ (creator,\{HVE\} ), \dots \} (creator) = \{HVE\}\). Hence, the original expression becomes \((\{JVE\} \setminus \emptyset ) \cup \{HVE\} = \{JVE, HVE\}\) as expected.
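The same evaluation can be reproduced with a small recursive function. This is a sketch under stated assumptions: the knowledge base is a dictionary from an ID to (base, add, remove), which plays the role of the resolve function R; the Herbrand-Interpretation is the identity; a finite set sid stands in for SID; and the location identifiers NG and SB are hypothetical abbreviations introduced only for this illustration.

```python
P_EMPTY = "proto:P_0"  # IRI of the empty prototype
STAR = "*"             # wildcard value, only used in remove sets


def values_of(change, p, sid):
    """I_c(KB, change)(p): the values a change expression assigns to p."""
    v = change.get(p, frozenset())   # properties not mentioned map to {}
    return frozenset(sid) if v == STAR else frozenset(v)


def J(kb, sid, pid, p):
    """Value of property p for prototype pid, following Definition 10."""
    base, add, remove = kb[pid]      # resolve function R(KB, id)
    added = values_of(add, p, sid)
    if base == P_EMPTY:
        return added
    return (J(kb, sid, base, p) - values_of(remove, p, sid)) | added


sid = {"AP", "GA", "JVE", "HVE", "NG", "SB", "creator", "location"}
kb = {
    "AP": (P_EMPTY, {"creator": {"JVE"}, "location": {"NG"}}, {}),
    "GA": ("AP", {"creator": {"HVE"}, "location": {"SB"}},
           {"location": STAR}),
}

creators = J(kb, sid, "GA", "creator")    # JVE inherited, HVE added
locations = J(kb, sid, "GA", "location")  # NG removed, SB added
```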
Definition 11
(FP). The interpretation of a prototype expression is also called its fixpoint. Let KB be a prototype knowledge base and \(pe = (id, (base, add, remove)) \in KB\) be a prototype expression. Then the interpretation of the prototype expression in context of the prototype knowledge base KB is defined as \(FP(KB, pe) = (I_h(id), \{(I_h(p), J(KB, id, p)) \mid p \in ID, J(KB, id, p) \ne \emptyset \})\), which is a Prototype.
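Instead of evaluating J for every property separately, a fixpoint can also be computed by walking the derivation chain once, from the empty prototype down to the prototype of interest. The sketch below takes this iterative route, which is not the formulation used in the definition but should yield the same result for a grounded knowledge base (on a cyclic input the loop would not terminate). The dictionary encoding and the identifiers NG and SB are assumptions of this sketch.

```python
P_EMPTY = "proto:P_0"  # IRI of the empty prototype
STAR = "*"             # wildcard value, only used in remove sets


def fixpoint(kb, pid):
    """Flatten prototype pid into (id, {property: values}).

    kb maps an ID to (base, add, remove) and is assumed to be grounded.
    """
    # Collect the derivation chain from pid back to the empty prototype.
    chain = []
    cur = pid
    while cur != P_EMPTY:
        chain.append(cur)
        cur = kb[cur][0]

    # Replay the changes from the oldest ancestor to pid itself.
    props = {}
    for cur in reversed(chain):
        _base, add, remove = kb[cur]
        for p, values in remove.items():
            if values == STAR:
                props.pop(p, None)                 # remove every value of p
            elif p in props:
                props[p] = props[p] - set(values)
        for p, values in add.items():
            props[p] = props.get(p, set()) | set(values)

    # Keep only properties with a non-empty value set, as in FP.
    return pid, {p: frozenset(v) for p, v in props.items() if v}


kb = {
    "AP": (P_EMPTY, {"creator": {"JVE"}, "location": {"NG"}}, {}),
    "GA": ("AP", {"creator": {"HVE"}, "location": {"SB"}},
           {"location": STAR}),
}
ga_fixpoint = fixpoint(kb, "GA")
```

Applying the remove set of a prototype whose base is the empty prototype has no effect here, since there is nothing to remove yet, so the base case of J needs no special treatment.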
Definition 12
(\(I_{KB}\): Interpretation of Knowledge Base). Let \(O=(SID, OB, I_h)\) be a Prototype-Structure for the Prototype Language \(PL=(\mathtt {P}_{\emptyset }, ID, \mathtt {PROTO})\) with \(I_h\) being a Herbrand-Interpretation. Let KB be a Prototype Knowledge Base. An interpretation \(I_{KB}\) for KB is a function that maps elements of KB to elements of OB as follows: \(I_{KB}(KB, pe) = FP(KB, pe)\).
This concludes the definition of the syntactic structures and semantics of prototypes and prototype knowledge bases. For the semantics, we have adopted Herbrand-Interpretations, which are compatible with the way RDF is handled in SPARQL.