Towards Decision Making via Expressive Probabilistic Ontologies
Abstract
We propose a framework for automated multi-attribute decision making, employing the probabilistic nonmonotonic description logics proposed by Lukasiewicz in 2008. Using this framework, we can model artificial agents in decision-making situations, wherein background knowledge, available alternatives and weighted attributes are represented via probabilistic ontologies. It turns out that extending traditional utility theory with such description logics enables us to model decision-making problems in which probabilistic ignorance and default reasoning play an important role. We provide several decision functions using the notions of expected utility and probability intervals, and study their properties.
Keywords
Ontological Investigation · Artificial Agent Models · Description Logics (DL) · Expected Utility Interval · Conditional Constraints

1 Introduction
Preference representation and its link to decision support systems is an ongoing research problem in artificial intelligence, gaining more attention every day. This interest has led, on the one hand, to the analysis of decision-theoretic problems using methods common in AI and knowledge representation and, on the other hand, to the application of methods from classical decision theory to improve decision support systems. In this regard there has been a growing interest over the last decade in the use of logics to model preferences, see [1, 3, 14, 15, 16, 17, 18, 19].
Description Logics (DLs) are a family of knowledge representation languages that are based on (mostly) decidable fragments of first order logic. They were designed as formal languages for knowledge representation, becoming one of the major formalisms in this field over the last decade. Alongside this, and from a more practical point of view, they formally underpin the OWL Web Ontology Language^{1}, the semantic web's key representation and ontology standard (defined by the World Wide Web Consortium).
In this work, we propose a formal framework based on expressive probabilistic DLs [13], viz., the nonmonotonic P\(\mathcal {SHOIN}(\mathbf D )\) family of DL languages, designed to model uncertainty and to support uncertain, nonmonotonic reasoning.
In such languages one can express objective (statistical) uncertainty (terminological knowledge concerning concepts), as well as subjective (epistemic) uncertainty (assertional knowledge concerning individuals). Furthermore, due to their nonmonotonicity, one can represent and reason with default knowledge. Moreover, their probabilistic component employs imprecise probabilities to model uncertainty, which in turn allows one to model probabilistic ignorance with considerable flexibility, in contrast to classical probability theory.
We show that our framework can represent decision-theoretic problems and solve them using DL inference services, taking advantage of imprecise probabilities and background knowledge (as represented by ontologies) to compute expected utilities in a fine-grained manner that goes beyond traditional decision theory. One reason why this is possible within a DL-based decision making framework is that one can express the various dependency relations between attributes/decision criteria with rich DL concept hierarchies, and thereafter evaluate alternatives in terms of their logical implications.
Our framework can be interpreted as modeling the behavior of an agent, or as a model for systems that support group decisions. In this work, we pursue the former interpretation and focus on modeling artificial agents where each attribute/decision criterion has an independent local utility value (weight). We consider available alternatives in the form of DL individuals, and attributes in the form of DL concepts. Finally, we represent the agent’s background knowledge and beliefs via a probabilistic DL knowledge base.
In this work, we present several decision functions in order to model agents with different characteristics. Furthermore, the employed logic’s use of imprecise probabilities to model uncertainty allows considerable expressive power to model nonstandard decision behaviour that violates the axioms of (classical) expected utility, e.g., the Ellsberg paradox. Using the framework, we show that it is straightforward to provide decision functions which model ambiguity averse decision making. In so doing, we investigate the various properties of such decision functions as well as their connection to ontological knowledge.
2 Preliminaries
Preferences and Utility. Traditional utility theory [10] models the behavior of rational agents by quantifying their available choices in terms of their utility, modeling preference (and eventual courses of action) in terms of the induced partial orders and utility maximization.
Let \(A=\{a_1,\dots ,a_n\}\) be a set of alternatives; a (rational) preference is a complete and transitive binary relation \(\succeq \) on A. Then, for any \(a_i\), \(a_j \in A\) with \(i, j \in \{1, \dots , n\}\), strict preference and indifference are defined as follows: \(a_i \succ a_j\) iff \(a_i \succeq a_j\) and \(a_j \not \succeq a_i\) (strict preference); \(a_i \sim a_j\) iff \(a_i \succeq a_j\) and \(a_j \succeq a_i\) (indifference).
It is said that \(a_i\) is weakly preferred ^{2} (strictly preferred) to \(a_j\) whenever \(a_i \succeq a_j\) (\(a_i \succ a_j\)), and that \(a_i\) is indifferent to \(a_j\) whenever \(a_i \sim a_j\). Moreover, \(a_i, a_j\) are incomparable iff \(a_i \mid \mid a_j \iff a_i \not \succeq a_j\) and \(a_j \not \succeq a_i\); the presence of incomparable pairs means that \(\succeq \) is merely a partial ordering.
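As a quick illustration (not part of the formal development), the derived relations can be sketched over a weak-preference relation given extensionally as a set of pairs; all names and the sample relation below are illustrative:

```python
# Derived relations over a finite weak-preference relation, given as the set
# of pairs (x, y) with x ⪰ y. Names and the sample relation are illustrative.

def strict(weak, x, y):
    """x ≻ y iff x ⪰ y and not y ⪰ x."""
    return (x, y) in weak and (y, x) not in weak

def indifferent(weak, x, y):
    """x ~ y iff x ⪰ y and y ⪰ x."""
    return (x, y) in weak and (y, x) in weak

def incomparable(weak, x, y):
    """x || y iff neither x ⪰ y nor y ⪰ x (possible only if ⪰ is partial)."""
    return (x, y) not in weak and (y, x) not in weak

weak = {("a1", "a2"), ("a2", "a1"), ("a1", "a3"), ("a2", "a3")}
```

Here \(a_1 \sim a_2\) and \(a_1 \succ a_3\) hold, matching the definitions above.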
In order to represent the preference relation numerically, one introduces the notion of utility, a function that maps each alternative to a real number reflecting its degree of desirability. For a decision-theoretic framework, two questions are essential given a (finite) set of alternatives: (i) which alternative(s) are best? (ii) what does the whole preference relation look like, i.e., the complete ordering of the alternatives (e.g., \(a_1 \succ a_3 \succeq \dots \))? Throughout the paper, these two main questions will be our concern, along with a restriction to single (nonsequential) decisions.
Formally, given a finite set of alternatives \(A=\{a_1,\dots ,a_n\}\), and preference \(\succeq \) on A, \(u: A \rightarrow \mathbb {R}\) is a utility function iff for any \(a_i\), \(a_j \in A\) with \(i,j \in \{1,\dots ,n\}\), \(a_i \succ a_j \iff u(a_i) > u(a_j)\), \(a_i \succeq a_j \iff u(a_i) \ge u(a_j), a_i \sim a_j \iff u(a_i) = u(a_j)\).
For the proof that such a function exists, we refer the reader to the so-called representation theorems in [7].
The basic principle of utility theory is that a rational agent should always try to maximize its utility, i.e., should take the choice with the highest utility. Utility functions modeling behaviour based on more than one attribute (i.e., n-ary ones) are called multi-attribute utility functions. Let \(X=\{X_1,\dots ,X_n\}\) be a set of attributes where \(n\ge 2\), and let \(\varOmega = X_1 \times \dots \times X_n\) be the set of outcomes over which the agent’s preference relation is defined. An alternative/outcome is a tuple \((x_1,\dots , x_n) \in \varOmega \). Let \(\succeq \) be the preference relation defined over \(\varOmega \); then u is a multi-attribute utility function representing \(\succeq \) iff for all \((x_1, \dots , x_n), (y_1, \dots , y_n) \in \varOmega \), \((x_1, \dots , x_n)\succeq (y_1, \dots , y_n) \iff u(x_1, \dots , x_n) \ge u(y_1, \dots , y_n).\) For an introductory text on multi-attribute utility theory, see [10].
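For instance, a weighted additive utility is one common multi-attribute form (the value function used later in this paper is also a summation); the attribute names, weights, and outcomes below are purely illustrative:

```python
# A weighted additive multi-attribute utility — one common (assumed) form.
# Attribute weights and the two sample outcomes are illustrative.

WEIGHTS = {"price": 0.5, "location": 0.3, "service": 0.2}

def u(outcome):
    """u(x1, ..., xn) as a weight–value dot product over named attributes."""
    return sum(WEIGHTS[attr] * value for attr, value in outcome.items())

def prefers(x, y):
    """(x1,...,xn) ⪰ (y1,...,yn) iff u(x) >= u(y)."""
    return u(x) >= u(y)

x = {"price": 8, "location": 5, "service": 9}
y = {"price": 6, "location": 9, "service": 4}
```

With these numbers, `u(x)` is 7.3 and `u(y)` is 6.5, so `prefers(x, y)` holds.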
Moreover, a utility function u is said to be unique up to positive affine transformation iff for any real numbers \(m>0\) and c, \(u(x) \ge u(x')\) iff \(m\cdot u(x)+c \ge m\cdot u(x')+c\).
Throughout the paper, we will use two running examples to point out two important limitations of traditional decision theory that we will overcome with description logics: Ellsberg's paradox and a touristic agent example.
1st Gamble: If you guess correctly, you get the prize.
2nd Gamble: If you guess correctly, or the ball is yellow, you get the prize.
Syntax and semantics of the DL \(\mathcal {SHOIN}(\mathbf D )\) (Table 1). Notice that \(\mathbf D \) refers to concrete domains. The first block introduces individuals. The second block recursively defines concepts (others can be introduced by explicit definition), while the third does the same for roles. The fourth introduces terminological statements, resp. concept (ISA) and role inclusion statements. Finally, the fifth block introduces assertional facts, a.k.a. membership assertions, resp. concept and role membership assertions. A TBox \(\mathcal {T}\) is a set of terminological statements, an ABox \(\mathcal {A}\) is a set of assertions, and a KB is a pair \(T = (\mathcal {T},\mathcal {A})\). Entailment and satisfiability are defined in the usual way. The syntax and semantics of P\(\mathcal {SHOIN}(\mathbf D )\) extend this definition.
Syntax  Semantics w.r.t. classical interpretation \(\mathcal {I}= (\Delta ^{\mathcal {I}}, \cdot ^{\mathcal {I}})\) 

i  \(i^{\mathcal {I}}\in \Delta ^{\mathcal {I}}\) 
A  \(A^{\mathcal {I}} \subseteq \Delta ^{\mathcal {I}}\) 
D  \(D^{\mathcal {I}} \subseteq \mathbf D = \textsc {Num} \cup \textsc {String}\) 
OneOf \((i_1,\dots ,i_n)\)  \((\textit{OneOf}(i_1,\dots ,i_n))^{\mathcal {I}} := \{i_1,\dots ,i_n\}\) 
\(\lnot \phi \)  \((\lnot \phi )^{\mathcal {I}} := \Delta ^{\mathcal {I}}\setminus \phi ^{\mathcal {I}}\) 
\(\exists r .\phi \)  \((\exists r .\phi )^{\mathcal {I}} := \{d \mid \text {exists } e \text { s.t. } (d,e) \in r^{\mathcal {I}} \text { and } e \in \phi ^{\mathcal {I}}\}\) 
\(\exists _{\le k} r\)  \((\exists _{\le k} r)^{\mathcal {I}} := \{d \mid \text {there are at most } k \text { elements } e \text { s.t. } (d,e) \in r^{\mathcal {I}}\}\) 
\(\phi _1 \sqcap \phi _2\)  \((\phi _1 \sqcap \phi _2)^{\mathcal {I}} := \phi _1^{\mathcal {I}} \cap \phi _2^{\mathcal {I}}\) 
p  \( p^{\mathcal {I}} \subseteq \Delta ^{\mathcal {I}}\times \Delta ^{\mathcal {I}}\) 
\(r^-\)  \((r^-)^{\mathcal {I}} := \{(d,e) \mid (e,d) \in r^{\mathcal {I}}\}\) 
Tr(r)  \( (\textit{Tr}(r))^{\mathcal {I}} := \text { the transitive closure of } r \text { in } \Delta ^{\mathcal {I}}\times \Delta ^{\mathcal {I}}\) 
\(\phi _1\sqsubseteq \phi _2\)  \(\mathcal {I}\models \phi _1\sqsubseteq \phi _2 \text { iff } \phi _1^{\mathcal {I}} \subseteq \phi _2^{\mathcal {I}}\) 
\(r_1\sqsubseteq r_2\)  \(\mathcal {I}\models r_1\sqsubseteq r_2 \text { iff } r_1^{\mathcal {I}} \subseteq r_2^{\mathcal {I}}\) 
\(\phi (i)\)  \(\mathcal {I}\models \phi (i) \text { iff } i^{\mathcal {I}} \in \phi ^{\mathcal {I}}\) 
r(i, j)  \(\mathcal {I}\models r(i,j) \text { iff } (i^{\mathcal {I}},j^{\mathcal {I}}) \in r^{\mathcal {I}}\) 
The P\(\mathcal {SHOIN}(\mathbf D )\) Probabilistic DL. Lukasiewicz’s probabilistic description logics (DLs), see [13], extend classical DLs with probabilistic, nonmonotonic reasoning. DLs are logics – typically fragments of first order logic – specifically designed to represent and reason over structured knowledge, where domains of interest are represented as composed of objects structured into: (i) concepts, corresponding to classes, denoting sets of objects; (ii) roles, corresponding to (binary) relationships, denoting binary relations on objects. Knowledge is expressed through so-called assertions, i.e., logical axioms, organized into an intensional component (called TBox, for “terminological box”) and an extensional one (called ABox, for “assertional box”): the former consists of a set of universal statements, the latter of a set of atomic facts. A DL knowledge base (KB) is then defined as the combination of a TBox and an ABox.
For simplicity, we restrict the discussion in this paper to the P\(\mathcal {SHOIN}(\mathbf D )\) family of probabilistic logics, an extension of the well-known \(\mathcal {SHOIN}(\mathbf D )\) DL, whose syntax and semantics we briefly recall in Table 1. \(\mathcal {SHOIN}(\mathbf D )\) underpins the OWL DL fragment of OWL (in the OWL 1.1 standard).
Example 1
DL KBs can be used to formally model domain knowledge and to reason over it. Consider the hotel domain, and the KB with TBox \(T = \{ \textit{OneStarHotel} \sqsubseteq \textit{Hotel} \sqcap \exists \textit{hasService.ExtendedBreakfast} \}\), which states that every one star hotel is a hotel offering an extended breakfast service, and ABox \(A = \{ \textit{OneStarHotel}(\textit{tapir})\}\), which says that Tapir is a one star hotel. ^{3} Following \(\mathcal {SHOIN}(\mathbf D )\) semantics, we conclude that Tapir is a hotel with an extended breakfast service: \((T, A) \models (\textit{Hotel} \sqcap \exists \textit{hasService.ExtendedBreakfast})(\textit{tapir})\). \(\clubsuit \)
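The inference in Example 1 can be illustrated with a toy forward-chaining sketch over atomic subsumption axioms. This is only a didactic approximation: a real \(\mathcal {SHOIN}(\mathbf D )\) reasoner handles much more (existentials, negation, number restrictions), and here the existential \(\exists \textit{hasService.ExtendedBreakfast}\) is flattened into an assumed atomic concept `HasExtendedBreakfast`:

```python
# Toy forward-chaining over atomic subsumption axioms. A real SHOIN(D)
# reasoner handles far richer constructors; names here are illustrative,
# and the existential restriction is flattened into an atomic concept.

TBOX = {  # φ ⊑ ψ, stored as φ -> set of direct superconcepts
    "OneStarHotel": {"Hotel", "HasExtendedBreakfast"},
}
ABOX = {"tapir": {"OneStarHotel"}}

def entailed_concepts(individual):
    """All concepts φ with (T, A) ⊨ φ(individual), via transitive closure."""
    known = set(ABOX.get(individual, set()))
    frontier = list(known)
    while frontier:
        concept = frontier.pop()
        for sup in TBOX.get(concept, set()):
            if sup not in known:
                known.add(sup)
                frontier.append(sup)
    return known
```

Calling `entailed_concepts("tapir")` yields the asserted concept plus both entailed superconcepts, mirroring the conclusion of the example.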
Given that the semantics of the P\(\mathcal {SHOIN}(\mathbf D )\) family is very rich, we avoid giving a full description of it (which would go beyond the scope of this paper); rather, we provide a basic overview of its syntax and semantics, and cover its main properties (on which our results rely) in a succinct Appendix. For its full definition and properties, we refer the reader to [13]. A general remark is that the framework we present here is (w.l.o.g.) independent of the particular choice of PDL, provided it covers numeric domains (more generally, data types).
Syntax. The P\(\mathcal {SHOIN}(\mathbf D )\) family extends the syntax of \(\mathcal {SHOIN}(\mathbf D )\) with the language of conditional constraints, defined as follows: \(\mathbf I _P\) is the set of probabilistic individuals o, disjoint from the classical individuals \(\mathbf I _C = \mathbf I \backslash \mathbf I _P\); \(\mathcal {C}\) is a finite nonempty set of basic classification concepts, or basic c-concepts, which are (not necessarily atomic) concepts in \(\mathcal {SHOIN}(\mathbf D )\) free of individuals from \(\mathbf I _P\). Informally, they are the DL concepts relevant for defining probabilistic relationships. In what follows we overload the notation for concepts with that for c-concepts.
In addition to probabilistic individuals, TBoxes and ABoxes can be extended in P\(\mathcal {SHOIN}(\mathbf D )\) to probabilistic TBoxes (PTBoxes P) and PABoxes (\(P_o\)), via so-called conditional constraints, expressing (or encoding) uncertain, default knowledge about domains of interest. A PTBox conditional constraint is an expression \((\psi \mid \phi )[l,u]\), where \(\psi \) and \(\phi \) are c-concepts, and \(l, u \in [0,1]\). Informally, \((\psi \mid \phi )[l,u]\) encodes that the probability of \(\psi \) given \(\phi \) lies, by default, within [l, u]. A PABox constraint \((\psi \mid \phi )[l,u] \in P_o\), however, relativizes the constraint \((\psi \mid \phi )[l,u]\) to the individual o.
A probabilistic KB \(\mathcal {K}:= (T,P, (P_o)_{o \in \mathbf I _P})\) consists of a classical KB T ^{4}, a PTBox P (a set of conditional constraints), and a collection of PABoxes, one for each probabilistic individual \(o \in \mathbf I _P\), each a (possibly empty) set of relativized conditional constraints.
Semantics. A world I is a finite set of basic c-concepts \(\phi \in \mathcal {C}\) such that \(\{ \phi (i) \mid \phi \in I \} \cup \{ \lnot \phi (i) \mid \phi \in \mathcal {C} \backslash I\} \) is satisfiable, where i is a new individual (intuitively, a world specifies an individual unique up to identity); \(\mathcal {I}_\mathcal {C}\) is the set of all worlds relative to \(\mathcal {C}\). \(I \models T\) iff \(T \cup \{\phi (i) \mid \phi \in I\} \cup \{ \lnot \phi (i) \mid \phi \in \mathcal {C}\backslash I \}\) is satisfiable. \(I \models \phi \) iff \(\phi \in I\). \(I \models \lnot \phi \) iff \(I \models \phi \) does not hold. For c-concepts \(\phi \) and \(\psi \), \(I \models \psi \sqcap \phi \) iff \(I \models \psi \) and \(I \models \phi \). Note that the above notion of satisfiability based on worlds is compatible with the satisfiability of classical knowledge bases; that is, there is a classical interpretation \(\mathcal {I} = (\Delta ^{\mathcal {I}} , \cdot ^\mathcal {I})\) that satisfies T iff there is a world \(I \in \mathcal {I}_{\mathcal {C}}\) that satisfies T.^{5}
A probabilistic interpretation Pr is a probability function \(Pr: \mathcal {I}_\mathcal {C} \rightarrow [0, 1]\) with \(\sum _{I \in \mathcal {I}_\mathcal {C}} Pr(I) = 1\). \(Pr \models T\) iff \(I \models T\) for every \(I \in \mathcal {I}_\mathcal {C}\) such that \(Pr(I)>0\). The probability of a c-concept \(\phi \) in Pr is defined as \(Pr(\phi ) = \sum _{I \models \phi } Pr(I)\). For c-concepts \(\phi \) and \(\psi \) with \(Pr(\phi ) > 0\), we write \(Pr(\psi \mid \phi )\) to abbreviate \(Pr(\psi \sqcap \phi )/ Pr(\phi )\). For a conditional constraint \((\psi \mid \phi )[l,u]\), \(Pr \models (\psi \mid \phi )[l,u] \) iff \(Pr(\phi )=0\) or \(Pr(\psi \mid \phi ) \in [l,u]\). For a set of conditional constraints \(\mathcal {F}\), \(Pr \models \mathcal {F}\) iff \(Pr \models F\) for all \(F \in \mathcal {F}\). Notice that T has a satisfying classical interpretation \(\mathcal {I} = (\Delta ^\mathcal {I} , \cdot ^ \mathcal {I} )\) iff there is a probabilistic interpretation Pr such that \(Pr \models T\) ^{6}. We provide further technical details in the Appendix.
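A minimal sketch of these semantic notions, with worlds represented as sets of concept names and an illustrative probability function Pr (satisfiability of worlds w.r.t. T is not checked here):

```python
# A probabilistic interpretation Pr over worlds (worlds as frozensets of
# basic c-concepts); the distribution and concept names are illustrative.
from fractions import Fraction

Pr = {
    frozenset({"Hotel"}): Fraction(1, 2),
    frozenset({"Hotel", "FiveStarHotel"}): Fraction(1, 4),
    frozenset(): Fraction(1, 4),
}
assert sum(Pr.values()) == 1  # Pr must be a probability function

def prob(concept):
    """Pr(φ) = Σ_{I ⊨ φ} Pr(I), where I ⊨ φ iff φ ∈ I."""
    return sum(p for world, p in Pr.items() if concept in world)

def cond(psi, phi):
    """Pr(ψ | φ) = Pr(ψ ⊓ φ) / Pr(φ), defined when Pr(φ) > 0."""
    joint = sum(p for world, p in Pr.items() if psi in world and phi in world)
    return joint / prob(phi)

def satisfies(psi, phi, l, u):
    """Pr ⊨ (ψ | φ)[l, u] iff Pr(φ) = 0 or Pr(ψ | φ) ∈ [l, u]."""
    return prob(phi) == 0 or l <= cond(psi, phi) <= u
```

With this Pr, \(Pr(\textit{Hotel}) = 3/4\) and \(Pr(\textit{FiveStarHotel} \mid \textit{Hotel}) = 1/3\), so the constraint \((\textit{FiveStarHotel} \mid \textit{Hotel})[1/4, 1/2]\) is satisfied.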
Satisfaction and entailment in \(\mathcal {SHOIN}(\mathbf D )\) can be extended to probabilistic interpretations Pr; see the Appendix. More important for our purposes are the defeasible entailment relations induced by P\(\mathcal {SHOIN}(\mathbf D )\), viz., lexicographic entailment \(\mid \mid \!\sim ^{lex}\) and tight lexicographic entailment \(\mid \mid \!\sim ^{lex}_{tight}\). Probabilistic KBs in general, and conditional constraints in particular, encode, as we said, probable, default knowledge, and tolerate inconsistency (w.r.t. classical knowledge) to some degree. Lexicographic entailment supports such tolerance by, intuitively, (i) partitioning P and (ii) selecting the lexicographically least set in this partition consistent with T. See the Appendix for the technicalities.
Reasoning Problems. A reasoning problem that will be of interest to us is PCmem (probabilistic concept membership): given a consistent probabilistic KB \(\mathcal {K}\), a probabilistic individual \(o \in \mathbf I _P\), and a c-concept \(\psi \), compute \(l, u \in [0, 1]\) such that \(\mathcal {K}\mid \mid \!\sim ^{lex} _{tight} (\psi \mid \top )[l, u]\) for o.
3 Representing Decision Making Problems
In this section we introduce probabilistic DL decision bases. Regarding notation, we will try to stick to that of [13] as much as possible, to give the reader easy access to that paper.
Attributes and Preferences. We define the nonempty set of attributes X as a subset of the c-concepts derived from the basic c-concepts \(\mathcal {C}\). Informally, every world I determines a subset of attributes to be satisfied. We allow the set of attributes X to contain redundancies.
Decision Base. We define a decision base that models an agent in a decision situation: the agent's background knowledge is modelled by a probabilistic knowledge base, the finite set of available alternatives by a set of individuals, and the agent's preferences by a weight function defined over the set of attributes, from which the preference relation will be derived.
Definition 1
(Decision Base). A decision base is a triple \(\mathcal {D} = (\mathcal {K}, \mathcal {A}, \mathcal {U})\), where:

\(\mathcal {K} = (T, P, (P_o)_{o \in \mathbf I _P })\) is a consistent probabilistic KB encoding background knowledge,

\(\mathcal {A} \subseteq \mathbf I \) is the set of alternatives,

\(\mathcal {U}\) is a UBox, that is, the finite graph of a bounded real-valued function \(w: X \longrightarrow \mathbb {R^+}\) with \(w(\bot )=0\). \(\dagger \)
Informally, the role of \(\mathcal {K}\) is to provide assertional information about the alternatives at hand, along with the general terminological knowledge that the agent may require to reason further over the alternatives; indeed, X is the set of concepts \(\phi \) such that \(\mathcal {K}\) logically entails \(\phi (a)\). Moreover, \(\mathcal {U}\) can be defined to include negative weights as well (i.e., \(w: X \longrightarrow \mathbb {R}\) instead of \(\mathbb {R^+}\)) to model undesirable outcomes or punishments.^{7} However, for the sake of brevity, we will consider here only positive weights.
Alternatives with Classical Knowledge. In this particular setting, we assume we are in possession of certain information about the alternatives, and consider only the certain subsumption relations between concepts. We do this by providing a value function for alternatives, defined over the classical component of the framework (i.e., the classical DL KB T in the decision base).
Definition 2
In this work, for the sake of simplicity, we define U as a summation. Note, however, that U could be virtually any utility function, e.g., \(U(a) = 2(p_1 w(\phi )\cdot p_2 w(\psi )) + p_3 \exp (w(\gamma )) + c\), were a to satisfy \(\psi , \phi , \gamma \in X\), where \(p_i, c \in \mathbb {R}\) for \(i=1,2,3\). Furthermore, we assume that w is defined without knowledge of the exact knowledge base and the transitive closure of its subsumption relation, i.e., without complete knowledge of the ontological relations between the attributes.
Notice that each alternative corresponds to an outcome. Using U, we define the preference relation \(\succ \) over alternatives \(A = \{a_1,\dots ,a_n\}\): \(a_i \succ a_j \text { iff } U(a_i) > U(a_j)\), for \(i,j \in \{1,\dots ,n\}\); \(\succeq \) and \(\sim \) are defined similarly.
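A minimal sketch of this value-based ordering, with entailment stubbed by a precomputed table (in the full framework a DL reasoner supplies it); all concept names, weights and alternatives are illustrative:

```python
# U(a) = Σ w(φ) over the attributes φ ∈ X entailed for a. Entailment is
# stubbed by a table — a DL reasoner would provide it. Names are illustrative.

UBOX = {"Hotel": 1.0, "NearBeach": 3.0, "ExtendedBreakfast": 2.0}  # graph of w

ENTAILED = {  # stub for {φ ∈ X | 𝒦 ⊨ φ(a)}
    "tapir": {"Hotel", "ExtendedBreakfast"},
    "meridian": {"Hotel", "NearBeach"},
}

def U(a):
    """Value of alternative a: the sum of the weights of its attributes."""
    return sum(w for phi, w in UBOX.items() if phi in ENTAILED[a])

def best(alternatives):
    """The alternatives maximizing U, i.e., the agent's top choices."""
    top = max(U(a) for a in alternatives)
    return {a for a in alternatives if U(a) == top}
```

Here `U("tapir")` is 3.0 and `U("meridian")` is 4.0, so `best` selects `meridian`; the induced \(\succ \) is exactly the comparison of these values.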
Definition 3
Intuitively, the function U measures the value of an alternative with respect to the concepts (possibly deduced) to which it belongs. The following proposition is an immediate consequence.
Proposition 1
Let T be the classical part of the knowledge base of \(\mathcal {D}\) and \(a_1, a_2 \in \mathcal {A}\) be any two alternatives. If for every \( \psi \in X\) with \(T \models \psi (a_2)\), there is a \(\phi \in X\) with \(T \models \phi (a_1) \) such that \(T \models \phi \sqsubseteq \psi \), then \(a_1 \succeq a_2\).
Proof
Let \(\psi \) be any basic c-concept such that \(T \models \psi (a_2)\) and \((\psi , w(\psi )) \in \mathcal {U}\). By assumption, there is a \(\phi \in X\) such that \(T \models \phi \sqsubseteq \psi \) and \(T \models \phi (a_1)\), hence \(T\models \psi (a_1)\). Thus every attribute satisfied by \(a_2\) is also satisfied by \(a_1\); since weights are nonnegative, it follows that \(U(a_1) \ge U(a_2)\), therefore \(a_1 \succeq a_2\). \(\square \)
Intuitively, ceteris paribus (everything else remaining the same), anything that belongs to a subconcept should be at least as desirable as something that belongs to the corresponding superconcept; for instance, a new sport car is at least as desirable as a sport car (since anything that is a new sport car is a sport car, i.e., new sport car \(\sqsubseteq \) sport car).^{8} The following result says that two alternatives are equally desirable iff they belong to exactly the same concepts.
Corollary 1
Let \(\mathcal {D}\) be a decision base with a classical knowledge base T and a set of alternatives \(\mathcal {A}\). Then for any two alternatives \(a, a' \in \mathcal {A}\), \(a \sim a'\) iff \( \{\psi \mid \psi \in X, T \models \psi (a)\} = \{\phi \mid \phi \in X, T \models \phi (a')\}\).
Proof
By applying Proposition 1 in both directions: the set equality yields \(a \succeq a'\) and \(a' \succeq a\), hence \(a \sim a'\), and conversely. \(\square \)
The intuitive explanation of Corollary 1 is that we measure the desirability (and non-desirability) of things according to what they are, i.e., to which concepts they belong. This brings forward the importance of reasoning, since it might not be obvious at all that two alternatives actually belong to exactly the same concepts w.r.t. the attributes.
Example 2
Properties of the Utility Function. Since every individual corresponds to the subset of attributes that it satisfies, in this section we treat U as if it were formally defined over the set of attributes X rather than over individuals, so that we can discuss some common properties of U following the definitions given in [3].
Proposition 2
Suppose that U is a value function. Then U is (a) normalized, (b) nonnegative, (c) monotone, (d) concave, (e) subadditive, and (f) unique up to positive affine transformation.
Proof
 (a)
This holds when the individual satisfies no attributes, whence \(U(\emptyset )=0\).
 (b)
Follows from Proposition 1 and property (a).
 (c)
Follows from Proposition 1.
 (d)
Let \(Y, Z, W \subseteq X\) with \(Z \subseteq Y\). Since the classical part of the logic is monotonic and weights are positive, whenever \(\mathcal {I} \models Y\), also \(\mathcal {I} \models Z\), which implies \(U(W \cup Y) - U(Y ) \le U(W \cup Z) - U(Z)\).
 (e)
Follows from (d).
 (f)
Let \(Y, Z \subseteq X\) with \(U(Y) \ge U(Z)\) and \(M(x) = ax + b\) with \(a>0\) and \(b \in \mathbb {R}\); then \(M(U(Y)) \ge M(U(Z))\) iff \(U(Y) \ge U(Z)\), since \(a > 0\). \(\square \)
Decisions with Default Ontological Reasoning. In many situations, preferential statements made by human agents are not meant to be strict statements, say, as in the formal sciences, nor do they take full ontological knowledge into account. When someone asserts that she prefers a suite to a standard room (i.e., \(\textit{suite} \succ \textit{standard room}\)), it is often the case that the statement is not meant to hold for every suite, e.g., a burned suite (burned suite \(\not \succ \textit{standard room}\)). We would like to model such preferential statements in our framework, but they potentially violate Proposition 1. Indeed, the decision rule for classical ontologies (Definition 3) cannot deal with such cases. To do so we need to go beyond classical KBs, and consider full P\(\mathcal {SHOIN}(\mathbf D )\) KBs and their reasoning techniques.
Example 3
With classical \(\mathcal {SHOIN}(\mathbf D )\) reasoning, it follows that any trip that has a bad-famed five star hotel is a trip that has a five star hotel, in symbols \(\exists \textit{hasHotel.BadFamedFiveStarHotel} \sqsubseteq \exists \textit{hasHotel.FiveStarHotel}\). Note that in the light of this information, it is entailed that trip1 is desirable. However, if the agent also learns (i.e., adds to its knowledge base) that meridian is a bad-famed five star hotel, then trip1 will no longer be desirable^{9}.
Decisions with Ontological Probabilistic Reasoning. In this section, we generalize our previously introduced choice functions with probabilities, which results in different behavioral characteristics in the presence of uncertainty. These behavioral characteristics can be interpreted as different types of agents (optimistic, pessimistic, etc.), or as a decision support system that orders alternatives with respect to different criteria (best possible uncertain outcome, worst possible uncertain outcome, etc.) and user preferences.
A remark on notation before defining expected utility intervals: we will use the notation \([\textsc {PCmem}(\mathcal {K}, a, \phi )]\) to denote the tight interval [l, u] that is the answer to the query PCmem with regard to the knowledge base \(\mathcal {K}\), individual \(a\in \mathbf I _\mathbf{P }\) and c-concept \(\phi \). Moreover, \(l=\llcorner \textsc {PCmem}(\mathcal {K}, a, \phi )\lrcorner \) and \(u = \ulcorner \textsc {PCmem}(\mathcal {K}, a, \phi )\urcorner \).
As we have a set of probability functions instead of a single probability function, which results in probability intervals, we also get an interval of expected utilities. That is, \(EU(a) = \sum _{\phi \in X}\Pr (\phi ) \cdot w(\phi )\) is the expected utility of an alternative a w.r.t. Pr, and EI is the expected utility interval, defined as follows.
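A minimal sketch of computing such an interval from the tight per-attribute membership intervals \([l_\phi, u_\phi]\) (as PCmem would return), under the simplifying assumption that the per-attribute bounds can be attained jointly; all weights, concept names and intervals are illustrative:

```python
# EI(a) from tight membership intervals [l_φ, u_φ], assuming (as a
# simplification) that per-attribute bounds are jointly attainable.
# Weights, concept names and intervals are illustrative.

WEIGHTS = {"NearBeach": 3.0, "GoodService": 2.0}

def expected_interval(membership):
    """membership maps φ to (l_φ, u_φ); returns (EI_lower(a), EI_upper(a))."""
    lo = sum(l * WEIGHTS[phi] for phi, (l, _) in membership.items())
    hi = sum(u * WEIGHTS[phi] for phi, (_, u) in membership.items())
    return lo, hi

trip1 = {"NearBeach": (0.4, 0.8), "GoodService": (0.5, 0.5)}
```

For `trip1` this gives the interval [2.2, 3.4]; a precise membership (such as `GoodService` above) contributes a single point to both bounds.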
Definition 4
Now, using expected utility intervals, we will define some decision functions (mainly from the literature on imprecise probabilities) which generalize the notion of choice by maximum expected utility. A decision function \(\delta \) maps a nonempty set of alternatives \(\mathcal {A}\) to a subset of \(\mathcal {A}\), where \(a \in \delta (\mathcal {A})\) iff \(a \succeq a'\) for every \(a' \in \mathcal {A}\) ^{10}.
We proceed to define decision functions characterizing different kinds of rational agents. In terms of their use of intervals, they are similar to \(\varGamma \)-maximax, \(\varGamma \)-minimax, interval dominance and E-admissibility in the literature on imprecise probabilities [8].
Definition 5
Definition 6
(Cautious Choice). The decision function \(\delta \) is said to be cautious iff \(\delta _{id}(\mathcal {A}):=\{ a \in \mathcal {A} \mid \underline{EI}(a) \ge \overline{EI}(a')\) for all \(a' \in \mathcal {A} \setminus \{a\} \}\). \(\dagger \)
We will denote the preference ordering of cautious choices by \(\succeq _{id}\) (id for interval dominance). Interval dominance offers a formalisation of incomparability: if the expected utility intervals of two alternatives a and \(a'\) overlap, so that neither dominates the other (which means that the agent cannot decide between them), then \(a \mid \mid a'\). Notice that \(\succeq _{id}\) is a partial weak order whereas \(\succeq _{\underline{opt}}\) and \(\succeq ^{\overline{opt}}\) are total weak orders.
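Sketches of these interval-based decision functions, with EI given directly as (lower, upper) pairs; the intervals are illustrative, and Definition 6's quantifier is read as ranging over the *other* alternatives:

```python
# Interval-based decision functions: optimistic (Γ-maximax-like),
# pessimistic (Γ-minimax-like), and cautious (interval dominance, δ_id).
# EI is given directly as (lower, upper); all intervals are illustrative.

def delta_opt(EI):
    """Optimistic: pick the alternatives with the greatest upper bound."""
    top = max(hi for _, hi in EI.values())
    return {a for a, (_, hi) in EI.items() if hi == top}

def delta_pess(EI):
    """Pessimistic: pick the alternatives with the greatest lower bound."""
    top = max(lo for lo, _ in EI.values())
    return {a for a, (lo, _) in EI.items() if lo == top}

def delta_id(EI):
    """Cautious: a is chosen iff EI_lower(a) >= EI_upper(a') for all other a'."""
    return {a for a, (lo, _) in EI.items()
            if all(lo >= hi for b, (_, hi) in EI.items() if b != a)}

EI = {"a1": (5.0, 7.0), "a2": (1.0, 4.0), "a3": (2.0, 4.5)}
```

With these intervals, all three functions select `a1`; when intervals overlap (e.g., [3, 6] vs. [4, 5]), `delta_id` returns the empty set, reflecting incomparability.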
Example 4
Notice that interval dominance is a very strict requirement, and hence not very helpful in normative settings. We give a less strict version, based on Levi’s notion of E-admissibility [8, 12] (E for expected).
Definition 7
(EAdmissible Choice). An alternative \(a \in \mathcal {A}\) is Eadmissible (\(a \in \delta _e(\mathcal {A})\)) iff for every \(\phi \in X\), there is a \(Pr(\phi ) \in [l, u]\) s.t. \(K \mid \mid \!\sim ^{lex}_{tight} a: \phi [l,u]\), and for every \(a' \in A \backslash \{a\}\) and for every \(Pr'(\phi ) \in [l', u']\) s.t. \(K \mid \mid \!\sim ^{lex}_{tight} a': \phi [l',u']\), \(Pr(\phi )>Pr'(\phi )\) holds. We denote the preference relation with \(\succeq _{e}\).\(\dagger \)
Informally, \(\delta _e\) looks for a probability distribution that lets an alternative weakly dominate every other.
Example 5
Consider alternatives \(\mathcal {A} = \{a_1, a_2, a_3\}\) with expected utility intervals [5, 7], [1, 10] and [1, 8] on a single attribute. Assume that there are two distributions Pr and \(Pr'\) such that the expected utilities of the alternatives are 5, 7, 6 w.r.t. Pr, and 6, 7, 8 w.r.t. \(Pr'\), respectively. Also assume that there is no \(Pr''\) such that \(EU(a_1) \ge EU(a_2)\) and \(EU(a_1) \ge EU(a_3)\). Then \(\delta _e(\mathcal {A}) = \{a_2, a_3\}\); that is, \(a_3 \mid \mid _e a_2\) and \(a_2 \succeq _e a_1\) as well as \(a_3 \succeq _e a_1 \).
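Example 5 can be sketched as follows, approximating the (generally infinite) set of admissible distributions by a finite list of rows, each giving the expected utilities of all alternatives under one candidate distribution (an assumption for illustration):

```python
# E-admissibility over a finite set of candidate distributions. The credal
# set is approximated, as a simplification, by enumerated rows; each row
# maps alternatives to their expected utility under one distribution Pr.

def delta_e(rows):
    """Alternatives that are optimal under at least one distribution."""
    admissible = set()
    for eu in rows:  # eu: alternative -> expected utility under this Pr
        top = max(eu.values())
        admissible |= {a for a, v in eu.items() if v == top}
    return admissible

rows = [
    {"a1": 5.0, "a2": 7.0, "a3": 6.0},  # expected utilities w.r.t. Pr
    {"a1": 6.0, "a2": 7.0, "a3": 8.0},  # expected utilities w.r.t. Pr'
]
```

On the rows of Example 5, `delta_e` returns {a2, a3}: each of them is best under some distribution, while a1 never is.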
Proposition 3
The following statements hold: (i) \(\succeq ^{\overline{opt}}\,\subseteq \,\succeq _{e}\) and, on the other hand, (ii) \(\succeq _{id}\,\subseteq \,\succeq ^{\overline{opt}} \cap \succeq _{\underline{opt}}\).
Proof
 (i)
Let \((a,a') \in \succeq ^{\overline{opt}}\). By definition of \(\overline{Opt}(\mathcal {A})\), there is a \(Pr(\phi ) \in [l,u]\) (indeed \(Pr(\phi ) = u\)) such that, on the one hand, \(\overline{EI}(\overline{Opt}(\mathcal {A})) = Pr(\phi ) \cdot w(\phi )\), and, on the other, \(\overline{EI}(\overline{Opt}(\mathcal {A})) \ge \overline{EI}(\overline{Opt}(\mathcal {A}\backslash \overline{Opt}(\mathcal {A})))\). These facts together imply that \((a,a') \in ~\succeq _{e}\).
 (ii)
Let \((a,a') \in \succeq _{id}\); then \(\underline{EI}(a) \ge \overline{EI}(a')\). Hence \(\overline{EI}(a) \ge \underline{EI}(a) \ge \overline{EI}(a')\) and \(\underline{EI}(a) \ge \overline{EI}(a') \ge \underline{EI}(a')\), whence \((a,a') \in ~\succeq ^{\overline{opt}} \cap \succeq _{\underline{opt}}\). \(\square \)
Modeling Ambiguity Averse Decisions. As is commonly argued in the imprecise probability literature, the classical theory of probability is not able to make distinctions between different layers of uncertainty. One common example is decision making under complete ignorance.
In this section, we will encode the Ellsberg example in our framework and show that it is possible to model ambiguity averse decisions.
One popular interpretation of the behaviour described in the preliminaries is that human agents tend to prefer more precise outcomes to less precise ones. That is, one feels safer where one has an idea about the risk (one is less ignorant about the outcomes). The theory of imprecise probabilities offers a straightforward representation of this problem.
Definition 8
(Ellsberg-like Choice). Given alternatives \(a, a' \in \mathcal {A}\), \(a \succ _{ebg} a'\) holds iff \((\underline{EI}(a) + \overline{EI}(a))/2 = (\underline{EI}(a') + \overline{EI}(a'))/2\), and \(\overline{EI}(a) - \underline{EI}(a) < \overline{EI}(a') - \underline{EI}(a')\). We will denote the corresponding decision function by \(\delta _{ebg}\) and call it an Ellsberg-like choice. \(\dagger \)
Informally, such a function chooses the tighter interval when the means are the same. The reader is invited to verify that the preference relation Ellsberg-dominates, denoted \(\succ _{ebg}\), behaves according to the experimental scenario given in the Preliminaries (Sect. 2).
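As an illustrative sketch (not part of the formal framework), the comparison of Definition 8 can be coded directly on interval endpoints. The urn numbers below follow the classical Ellsberg setup, with a hypothetical payoff of 300 for a correct colour guess so that all endpoints are integers:

```python
# A minimal sketch of the Ellsberg-like comparison of Definition 8:
# prefer the tighter expected-utility interval when midpoints coincide.

def ellsberg_dominates(a, b):
    """a, b: (lower, upper) expected-utility intervals. Does a >_ebg b?"""
    la, ua = a
    lb, ub = b
    same_mean = (la + ua) == (lb + ub)      # equal midpoints, via sums
    tighter = (ua - la) < (ub - lb)         # strictly narrower interval
    return same_mean and tighter

# Ellsberg's urn: 30 red balls, 60 black-or-yellow in unknown proportion;
# hypothetical payoff 300 for a correct guess.
bet_red   = (100, 100)   # precise: P(red) = 1/3, EU = 100
bet_black = (0, 200)     # P(black) anywhere in [0, 2/3]
print(ellsberg_dominates(bet_red, bet_black))  # True: the precise bet wins
```

Both bets have midpoint 100, but betting on red has a point-valued interval, so the ambiguity-averse agent prefers it, as in the experiment.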
Example 6
Note that this criterion is still rather strict, and one may not expect it to hold often. Below, we give a more tolerant form of this function.
Definition 9
(Ambiguity Averse Opportunist Choice). Given alternatives \(a, a' \in \mathcal {A}\), \(a \succ _{ag} a'\) iff \(\overline{EI}(a) \le \overline{EI}(a')\), \(\underline{EI}(a) \ge \underline{EI}(a')\) and \(\underline{EI}(a) - \underline{EI}(a') \ge \overline{EI}(a') - \overline{EI}(a)\). We call the induced choice function an ambiguity averse choice. \(\dagger \)
Intuitively, this adds the extra condition that the mean of \(a\) must be greater than or equal to that of \(a'\). The following result shows that \(\succ _{ebg}\) is a special case of \(\succ _{ag}\).
Proposition 4
Let \(a, a'\) be two alternatives. Then, \(a \succ _{ebg} a'\) implies \(a \succ _{ag} a'\).
Proof
Assume that (i) \((\underline{EI}(a) + \overline{EI}(a))/2 = (\underline{EI}(a') + \overline{EI}(a'))/2\) and also (ii) \(\overline{EI}(a) - \underline{EI}(a) < \overline{EI}(a') - \underline{EI}(a')\). By (i), it follows that \(\underline{EI}(a) + \overline{EI}(a) = \underline{EI}(a') + \overline{EI}(a')\) (iii), that is, \(\underline{EI}(a) - \underline{EI}(a') = \overline{EI}(a') - \overline{EI}(a)\). Rearranging (ii) yields \(\overline{EI}(a) - \overline{EI}(a') < \underline{EI}(a) - \underline{EI}(a')\) (iv). By (iii) and (iv), \(\overline{EI}(a) - \overline{EI}(a') < \overline{EI}(a') - \overline{EI}(a)\), so \(\overline{EI}(a') - \overline{EI}(a) > 0\), hence \(\overline{EI}(a') \ge \overline{EI}(a)\). An analogous argument yields \(\underline{EI}(a) \ge \underline{EI}(a')\), and the third condition of \(\succ _{ag}\) holds with equality by (iii). \(\square \)
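Proposition 4 can also be sanity-checked numerically. The sketch below (with hypothetical intervals of our choosing; equal midpoints are tested via endpoint sums to stay in exact integer arithmetic) verifies the implication on all sampled pairs:

```python
# Numeric sanity check of Proposition 4: every pair related by >_ebg
# should also be related by >_ag. Intervals are (lower, upper) pairs.

def ellsberg_dominates(a, b):
    # Definition 8: equal midpoints, strictly tighter interval.
    (la, ua), (lb, ub) = a, b
    return (la + ua) == (lb + ub) and (ua - la) < (ub - lb)

def aa_opportunist_dominates(a, b):
    # Definition 9: upper end no larger, lower end no smaller, and the
    # gain at the bottom at least offsets the loss at the top.
    (la, ua), (lb, ub) = a, b
    return ua <= ub and la >= lb and (la - lb) >= (ub - ua)

intervals = [(0, 10), (2, 8), (4, 6), (5, 5), (1, 7), (3, 9)]
for a in intervals:
    for b in intervals:
        if ellsberg_dominates(a, b):
            assert aa_opportunist_dominates(a, b)
print("Proposition 4 holds on all sampled pairs")
```

For instance, \((2, 8) \succ _{ebg} (0, 10)\) (same midpoint 5, tighter width), and indeed all three conditions of Definition 9 hold for that pair.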
In a loose sense, one can combine these with the previously mentioned functions (e.g., \(\delta _{e + ag}\)) in order to model more complex behaviours. However, we leave their compositions and compatibilities, along with their subtle connections to probabilistic ontologies, to future work.
4 Related Work
Our framework can be seen as a part of the literature on weighted logics for representing preferences [3, 11], with an emphasis on agent modeling. Our notion of a UBox for generating utility functions was, for instance, partially derived from the notion of goal bases (occasionally defined in terms of multisets) as understood in the literature on propositional languages for preferences [11, 19]. There is also a substantial tradition of defeasible reasoning over preferences, see [2, 4, 5, 9], on which we have built.
On the DL side, several weighted DL languages have been proposed, albeit without covering uncertainty over instances [16, 17]. There, constructs similar to goal bases, called “preference sets”, are used, and elements of multiattribute utility theory are partially incorporated into their settings.
Further loosely related (sensu stricto non-utility-theoretic) recent approaches include an application of DL-based ontologies to CP-nets [15], and a probabilistic logic-based setting [14] built on Markov logic (with precise probabilities) that uses Markov networks to model and reason about preferences.
An uncertainty-based approach that focuses on multi-criteria decision making (MCDM) problems is [18]; it is mainly based on the application of fuzzy logic to MCDM problems. Although the terms utility and preference are not used explicitly, it refers to preferences implicitly.
5 Conclusions and Further Work
We have introduced a description logic based framework to express and solve non-sequential multi-attribute decision-making problems effectively.
As the major part of the decision theory literature takes uncertainty into account, we based our approach on Lukasiewicz's P-\(\mathcal {SHOIN}(\mathbf D )\) family of probabilistic description logics [13]. We have shown that it is straightforward to define decision functions representing ambiguity aversion, a case that violates the axioms of expected utility. In doing so, one can define preference relations and decision functions that, we believe, better model the decisions of rational (human) agents.
Another major direction is to investigate the value of information (structured knowledge, in this context) in different ontological frameworks, viz., to explore in which ways, and to what extent, prior knowledge influences the decisions agents are about to take.
Furthermore, it would be interesting to extend the framework to sequential decisions (e.g., a sequence of decision bases \(\mathcal {D}_i \rightarrow \mathcal {D}_{i+1}\)). This is possible since the language makes extensive use of conditional constraints. Once a sequential extension is defined, one can express strategies and game-theoretic issues. It would also be interesting to apply the framework, or an appropriate modification of it, to common problems such as fair division, voting, and preference aggregation.
We are currently working on an implementation of the framework as a Protégé^{11} plugin, motivated by the aim of demonstrating the benefits of our approach in a range of application scenarios where decision making is involved.
Footnotes
 1.
 2.
It is also called the preference-indifference relation, since it is the union of the strict preference and indifference relations.
 3.
By convention, objects are written with lower case.
 4.
Note that T no longer denotes a classical TBox but rather the whole classical knowledge base, i.e., TBox and ABox.
 5.
See Proposition 4.8 in [13].
 6.
See Proposition 4.9 in [13].
 7.
Alternatively, \(\mathcal {U}\) can be studied as a two-part partition, that is, the set of pairs with non-negative weights (denoted \(\mathcal {U}^+\)) and the set of pairs with negative weights (denoted \(\mathcal {U}^-\)). In the extreme cases, \(\mathcal {U} = \mathcal {U}^+\) when \(\mathcal {U}^- = \emptyset \) (and similarly \(\mathcal {U} = \mathcal {U}^-\) when \(\mathcal {U}^+ = \emptyset \)).
 8.
Recall that we concern ourselves with desirable attributes, i.e., weights are non-negative.
 9.
This is done via Lehmann's lexicographic entailment; in this particular example the z-partition is \((P_0, P_1)\), where \(P_0 = \{(\lnot \textit{Desirable} \mid \exists \textit{hasHotel.FiveStarHotel})[1,1]\}\) and \(P_1 = \{(\textit{Desirable} \mid \exists \textit{hasHotel.FiveStarHotel})[1,1]\}\); that is, \((T, P) \cup \{\textit{BadFamedFiveStar}\textit{Hotel}(\textit{meridian})\} \mid \!\sim ^{lex} \lnot \textit{Desirable}(\textit{trip1})\).
 10.
Note that this definition essentially coincides with that of choice functions in the imprecise probability literature [8], with the exception that it is allowed to return the empty set.
 11.
References
 1. Bienvenu, M., Lang, J., Wilson, N.: From preference logics to preference languages, and back. In: Proceedings of the International Conference on Principles of Knowledge Representation and Reasoning, KR (2010)
 2. Boutilier, C.: Toward a logic for qualitative decision theory. In: Proceedings of the International Conference on Principles of Knowledge Representation and Reasoning, KR (1994)
 3. Chevaleyre, Y., Endriss, U., Lang, J.: Expressive power of weighted propositional formulas for cardinal preference modeling. In: Proceedings of the International Conference on Principles of Knowledge Representation and Reasoning, KR (2006)
 4. Delgrande, J.P., Schaub, T.: Expressing preferences in default logic. Artif. Intell. 123(1–2), 41–87 (2000)
 5. Delgrande, J.P., Schaub, T., Tompits, H., Wang, K.: A classification and survey of preference handling approaches in nonmonotonic reasoning. Comput. Intell. 20(2), 308–334 (2004)
 6. Ellsberg, D.: Risk, ambiguity, and the Savage axioms. Q. J. Econ. 75, 643–669 (1961)
 7. Fishburn, P.C.: Utility Theory for Decision Making. Robert E. Krieger Publishing Co., Huntington, New York (1969)
 8. Huntley, N., Hable, R., Troffaes, M.C.M.: Decision making. In: Augustin, T., Coolen, F.P.A., de Cooman, G., Troffaes, M.C.M. (eds.) Introduction to Imprecise Probabilities, pp. 190–206. Wiley, Chichester (2014)
 9. Kaci, S., van der Torre, L.: Reasoning with various kinds of preferences: logic, non-monotonicity, and algorithms. Ann. OR 163(1), 89–114 (2008)
 10. Keeney, R.L., Raiffa, H.: Decisions with Multiple Objectives: Preferences and Value Tradeoffs. Wiley, New York (1976)
 11. Lafage, C., Lang, J.: Logical representation of preferences for group decision making. In: Proceedings of the International Conference on Principles of Knowledge Representation and Reasoning, KR, San Francisco (2000)
 12. Levi, I.: The Enterprise of Knowledge. MIT Press, Cambridge, MA (1980)
 13. Lukasiewicz, T.: Expressive probabilistic description logics. Artif. Intell. 172(6–7), 852–883 (2008)
 14. Lukasiewicz, T., Martinez, M.V., Simari, G.I.: Probabilistic preference logic networks. In: Proceedings of the European Conference on Artificial Intelligence, ECAI (2014)
 15. Di Noia, T., Lukasiewicz, T.: Combining CP-nets with the power of ontologies. In: AAAI (Late-Breaking Developments) (2013)
 16. Ragone, A., Di Noia, T., Donini, F.M., Di Sciascio, E., Wellman, M.P.: Computing utility from weighted description logic preference formulas. In: Baldoni, M., Bentahar, J., van Riemsdijk, M.B., Lloyd, J. (eds.) DALT 2009. LNCS, vol. 5948, pp. 158–173. Springer, Heidelberg (2010)
 17. Ragone, A., Di Noia, T., Donini, F.M., Di Sciascio, E., Wellman, M.P.: Weighted description logic preference formulas for multiattribute negotiation. In: Godo, L., Pugliese, A. (eds.) SUM 2009. LNCS, vol. 5785, pp. 193–205. Springer, Heidelberg (2009)
 18. Straccia, U.: Multi criteria decision making in fuzzy description logics: a first step. In: Velásquez, J.D., Ríos, S.A., Howlett, R.J., Jain, L.C. (eds.) KES 2009, Part I. LNCS, vol. 5711, pp. 78–86. Springer, Heidelberg (2009)
 19. Uckelman, J., Chevaleyre, Y., Endriss, U., Lang, J.: Representing utility functions via weighted goals. Math. Log. Q. 55(4), 341–361 (2009)