3.1 Introduction

The theory of imprecise probability is a generalization of classical ‘precise’ probability theory that allows modeling imprecision and indecision. Why is such a theory necessary? Because in many practical applications a lack of information—e.g., about model parameters—and a paucity of data—especially if we also consider conditional models—make it impossible to create a reliable model.

For example, consider a Bayesian context where a so-called prior probability distribution must be chosen as part of the modeling effort. The lack of information may make it difficult to determine the type of the prior distribution, let alone its parameters. Then, even if we assume some prior has been chosen—e.g., a normal one—in a somewhat arbitrary way, a paucity of data will make the parameters of the posterior—updated—distribution depend to a large degree on the prior’s somewhat arbitrary parameters. The consequence is that conclusions drawn from the posterior are unreliable and decisions based on it somewhat arbitrary.

The theory of imprecise probability provides us with a set of tools for dealing with the problem described above. For the example above, instead of choosing a single prior distribution, a whole set of priors is used, one that is large enough to sufficiently reduce or even eliminate the arbitrariness of this modeling step. The consequence is that conclusions drawn from an imprecise probabilistic model are more reliable by being less committal—more vague, if you wish; some would say ‘more honest’—and that decisions based on it allow for indecision.

In this chapter, we will go over the basic concepts of the theory of imprecise probability. To keep the presentation simple, we will consider ‘small’ problems, with finite possibility spaces. However, the theory can be applied to infinite—countable and uncountable—possibility spaces as well. Also, only the basics of more advanced topics such as conditioning will be touched upon. But, and this is the chapter’s goal, after having understood the material we do treat, you should find the imprecise probability literature substantially more accessible. Good extensive general treatments are available [2, 16, 20] and the proceedings of the ISIPTA conferences provide an extensive selection of papers developing imprecise probability theory or applying it [1, 3, 4, 6,7,8,9,10,11,12].

Concretely, we start with a discussion of the fundamental concepts in Sect. 3.2. This is done in terms of the more basic notion of sets of acceptable gambles. Probabilities only appear thereafter, in Sect. 3.3, together with the related notion of prevision (expectation). The connection with sets of probabilities is made next, in Sect. 3.4. Then we touch upon conditioning, in Sect. 3.5, and before closing add some remarks about continuous possibility spaces, in Sect. 3.6. Throughout we will spend ample time on a running example to illustrate the theory that is introduced.

3.2 Fundamental Concepts

In this section, we introduce the fundamental concepts of the theory of imprecise probability [18] [20, §3.7]. First, in Sect. 3.2.1, we get started with some basic concepts. Then, in Sect. 3.2.2, we list and discuss the coherence criteria on which the whole theory is built.

3.2.1 Basic Concepts

Consider an agent reasoning about an experiment with an uncertain outcome. This experiment is modeled using a possibility space—a set—\(\mathcal {X}\) of outcomes \(x\). Now consider the linear space \(\mathcal {L}= \mathcal {X}\rightarrow \mathbb {R}\) of real-valued functions over the outcomes. We view these functions as gambles: they assign a payoff to each outcome, and because the outcome is uncertain, so is the payoff. A special class of gambles consists of the outcome indicators \(1_{x}\) and subset indicators \(1_{B}\), which take the value one on that outcome or subset and zero elsewhere.

The agent can then express her uncertainty by specifying a set of gambles, called an assessment \(\mathcal {A}_{\text {}}\), that she considers acceptable. Starting from such an assessment, she can reason about other gambles and decide whether she should also accept them or not. If she were to do this for all gambles, then the natural extension \(\mathcal {E}_{\text {}}\) of her assessment would be the set of all acceptable gambles. To reason in a principled way, she needs some guiding criteria; these are the next section’s topic.

Let us now introduce our running example:

Wiske and Yoko Tsuno want to bet on Belgium vs. Japan

Given a sports match between Belgium and Japan, there is uncertainty about which country’s team will win. So we consider the possibility space \(\left\{ {\textsc {be}},{\textsc {jp}}\right\} \). There are two agents (gamblers): Wiske and Yoko Tsuno, two comic book heroines. Each has an assessment consisting of a single gamble that she finds acceptable:

  • Wiske accepts losing 5 coins if Japan wins for the opportunity to win 1 coin if Belgium wins; so \(\mathcal {A}_{\text {W}}=\left\{ 1_{{\textsc {be}}}-5\cdot 1_{{\textsc {jp}}}\right\} \).

  • Yoko Tsuno accepts losing 4 coins if Belgium wins for the opportunity to win 1 coin if Japan wins; so \(\mathcal {A}_{\text {Y}}=\left\{ -4\cdot 1_{{\textsc {be}}}+1_{{\textsc {jp}}}\right\} \).

The heroines are also discussing joining forces and forming a betting pool. The pools they consider are

  • ‘Simple’, formed by combining their assessments; so

    $$\begin{aligned} \mathcal {A}_{\text {SP}}=\left\{ 1_{{\textsc {be}}}-5\cdot 1_{{\textsc {jp}}},-4\cdot 1_{{\textsc {be}}}+1_{{\textsc {jp}}}\right\} . \end{aligned}$$
  • ‘Empty’ in case of disagreement, without any acceptable gambles; so \(\mathcal {A}_{\text {EP}}=\emptyset \).

3.2.2 Coherence

In the theory of imprecise probabilities, the classical rationality criteria used for reasoning about assessments are called coherence criteria. These are typically formulated as four rules that should apply to any gambles \(f\) and \(g\). (There are different variants in the literature, but the differences are not relevant in this introductory text.) We divide the criteria into two classes.

 

Constructive:

State how to generate acceptable gambles from the assessment:

figure a
Background:

State which gambles are always or never acceptable:

figure b

 

These criteria are quite broadly seen as reasonable, under the assumption that the payoffs are ‘not too large’.
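
To make this concrete, one common way of writing down such criteria (the exact variant used in the figures above may differ in detail) is the following: gambles from the assessment and nonnegative gambles are acceptable, nonnegative scaling and addition of acceptable gambles again yield acceptable gambles, and no gamble that is negative on every outcome is acceptable; in symbols,

$$\begin{aligned}&f\in \mathcal {A}_{\text {}}\cup \left\{ {g\in \mathcal {L}}:{g\ge 0} \right\} \Rightarrow f\in \mathcal {E}_{\text {}},\qquad f\in \mathcal {E}_{\text {}}\text { and }\lambda \ge 0\Rightarrow \lambda \cdot f\in \mathcal {E}_{\text {}},\\ &f,g\in \mathcal {E}_{\text {}}\Rightarrow f+g\in \mathcal {E}_{\text {}},\qquad \sup _{x\in \mathcal {X}}f(x)<0\Rightarrow f\notin \mathcal {E}_{\text {}}. \end{aligned}$$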

The last criterion, ‘Avoiding sure loss’, puts a constraint on what is considered coherent; if it is violated, we say that an assessment incurs sure loss. The first three rules can be used to create an explicit expression for the natural extension:

$$\begin{aligned} \mathcal {E}_{\text {}}= \left\{ {\textstyle \sum _{f\in \mathcal {K}}\lambda _f\cdot f}:{\mathcal {K}\Subset \mathcal {A}_{\text {}}\cup \left\{ {f\in \mathcal {L}}:{f\ge 0} \right\} \text { and } (\forall f\in \mathcal {K}:\lambda _f\ge 0)} \right\} , \end{aligned}$$

where \(\Subset \) denotes the finite subset relation. Then \(\mathcal {E}_{\text {}}\) is the smallest convex cone of gambles encompassing the assessment \(\mathcal {A}_{\text {}}\) and the nonnegative gambles—including the zero gamble.

Let us apply the natural extension to our running example:

The natural extensions of Wiske, Yoko Tsuno, and the betting pools

For our finite possibility space,

$$\begin{aligned} \left\{ {f\in \mathcal {L}}:{f\ge 0} \right\} = \left\{ {\textstyle \sum _{x\in \mathcal {X}}\mu _x\cdot 1_{x}}:{(\forall x\in \mathcal {X}:\mu _x\ge 0)} \right\} \end{aligned}$$

So, with \(\lambda _{\text {A}},\mu _x\ge 0\) for all outcomes \(x\) and agent identifiers \(\text {A}\), we get the following expressions that characterize the natural extensions:

$$\begin{aligned} \text {Wiske} \quad&\begin{aligned}&\lambda _{\text {W}}\cdot (1_{{\textsc {be}}}-5\cdot 1_{{\textsc {jp}}})+\mu _{{\textsc {be}}}\cdot 1_{{\textsc {be}}}+\mu _{{\textsc {jp}}}\cdot 1_{{\textsc {jp}}}\\&\qquad \qquad \qquad \qquad \qquad = (\lambda _{\text {W}}+\mu _{{\textsc {be}}})\cdot 1_{{\textsc {be}}}+(-5\cdot \lambda _{\text {W}}+\mu _{{\textsc {jp}}})\cdot 1_{{\textsc {jp}}}, \end{aligned}\\ \text {Yoko Tsuno} \quad&\begin{aligned}&\lambda _{\text {Y}}\cdot (-4\cdot 1_{{\textsc {be}}}+1_{{\textsc {jp}}})+\mu _{{\textsc {be}}}\cdot 1_{{\textsc {be}}}+\mu _{{\textsc {jp}}}\cdot 1_{{\textsc {jp}}}\\&\qquad \qquad \qquad \qquad \qquad = (-4\cdot \lambda _{\text {Y}}+\mu _{{\textsc {be}}})\cdot 1_{{\textsc {be}}}+(\lambda _{\text {Y}}+\mu _{{\textsc {jp}}})\cdot 1_{{\textsc {jp}}}, \end{aligned}\\ \text {Simple pool} \quad&(\lambda _{\text {W}}-4\cdot \lambda _{\text {Y}}+\mu _{{\textsc {be}}})\cdot 1_{{\textsc {be}}}+(-5\cdot \lambda _{\text {W}}+\lambda _{\text {Y}}+\mu _{{\textsc {jp}}})\cdot 1_{{\textsc {jp}}},\\ \text {Empty pool} \quad&\mu _{{\textsc {be}}}\cdot 1_{{\textsc {be}}}+\mu _{{\textsc {jp}}}\cdot 1_{{\textsc {jp}}}. \end{aligned}$$

To check whether the natural extension incurs sure loss, we must determine whether the coefficients of \(1_{{\textsc {be}}}\) and \(1_{{\textsc {jp}}}\) can become negative at the same time. Only the simple pool incurs sure loss; e.g., fill in \(\lambda _{\text {W}}=\lambda _{\text {Y}}=1\) and \(\mu _{{\textsc {be}}}=\mu _{{\textsc {jp}}}=0\) to convince yourself. (Convince yourself as well that the others indeed avoid sure loss.)
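
Checking for sure loss can also be automated. The following is a minimal numerical sketch (representing gambles as payoff vectors over \(({\textsc {be}},{\textsc {jp}})\) and using scipy's linear program solver; the function name is chosen for illustration and is not part of the running example):

```python
import numpy as np
from scipy.optimize import linprog

def incurs_sure_loss(assessment):
    """An assessment (a list of accepted gambles, each a payoff vector over a
    finite possibility space) incurs sure loss exactly when some nonnegative
    combination of its gambles is negative on every outcome.  We minimize t
    subject to sum_i lam_i * g_i(x) <= t for all x and lam_i >= 0; the LP is
    unbounded below precisely in that case (otherwise the optimum is t = 0)."""
    G = np.asarray(assessment, dtype=float)        # shape: (num gambles, |X|)
    m, n = G.shape
    c = np.r_[np.zeros(m), 1.0]                    # variables: lam_1..lam_m, t
    A_ub = np.hstack([G.T, -np.ones((n, 1))])      # G^T lam - t <= 0
    res = linprog(c, A_ub=A_ub, b_ub=np.zeros(n),
                  bounds=[(0, None)] * m + [(None, None)])
    return res.status == 3                         # status 3: unbounded

wiske = [1, -5]                                    # 1_BE - 5 * 1_JP
yoko = [-4, 1]                                     # -4 * 1_BE + 1_JP
print(incurs_sure_loss([wiske]))                   # False: Wiske avoids sure loss
print(incurs_sure_loss([wiske, yoko]))             # True: the simple pool does not
```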

3.3 Previsions and Probabilities

In this section, we move from modeling uncertainty using sets of acceptable gambles to the more familiar language of expectation—or, synonymously, prevision—and probability [17]. We first transition from acceptable gambles to previsions in Sect. 3.3.1 [18, §1.6.3] [17, §2.2] and in a second step, in Sect. 3.3.2, give the connection to probabilities [20, §2.6]. Next, in Sect. 3.3.3, we consider assessments in terms of previsions and what the other fundamental concepts of Sect. 3.2 then look like [17, §2.2.1, §2.2.4] [20, §2.4–5, §3.1]. Finally, in Sect. 3.3.4, we consider the important special case of assessments in terms of previsions defined on a linear space of gambles [17, §2.2.1] [20, §2.3.2–6].

3.3.1 Previsions as Prices for Gambles

Before we start: in much of the imprecise probability literature, ‘prevision’ is used as a synonym for ‘expectation’; we follow that tradition here.

Now, how do we get an agent’s prevision for a gamble—equivalently: the expectation of a random variable—given that we know the agent’s assessment as a set of acceptable gambles \(\mathcal {A}_{\text {}}\)? We first define a price to be a constant gamble and identify this constant gamble with its constant payoff value. Then we define the agent’s previsions as specific types of acceptable prices:

  • The lower prevision \(\underline{P}(f)\) is the supremum acceptable buying price of \(f\):

    $$\begin{aligned} \underline{P}(f)=\sup \left\{ {\nu \in \mathbb {R}}:{f-\nu \in \mathcal {E}_{\text {}}} \right\} . \end{aligned}$$
  • The upper prevision \(\overline{P}(f)\) is the infimum acceptable selling price of \(f\):

    $$\begin{aligned} \overline{P}(f)=\inf \left\{ {\kappa \in \mathbb {R}}:{\kappa -f\in \mathcal {E}_{\text {}}} \right\} . \end{aligned}$$

If \(\mathcal {E}_{\text {}}\) is coherent, then \(\underline{P}\) and \(\overline{P}\) are also called coherent. There is a conjugacy relation between coherent lower and upper previsions: \(\overline{P}(f)=-\underline{P}(-f)\). It allows us to work in terms of either type of prevision; we will mainly use the lower one.

If \(\underline{P}(f)=\overline{P}(f)\), then \(P(f)=\underline{P}(f)\) is called the (precise) prevision of the gamble \(f\).

3.3.2 Probabilities as Previsions of Indicator Gambles

Now that we have definitions for lower and upper previsions, we can derive probabilities from those. For classical probability, we have that the probability of an event—a subset \(B\) of the possibility space \(\mathcal {X}\)—is the prevision of the indicator for that event. For lower and upper previsions, we get:

  • The lower probability: \(\underline{P}(B)=\underline{P}(1_{B})\).

  • The upper probability: \(\overline{P}(B)=\overline{P}(1_{B})\).

Notice that we reuse the same symbol for the prevision and probability functions, as is common in the literature. As long as the nature of the argument—gamble or event—is clear, this does not cause ambiguity. If \(\underline{P}\) and \(\overline{P}\) are coherent as previsions, then they are also coherent as probabilities. The conjugacy relationship also translates to coherent lower and upper probabilities; let \(B^{\text {c}}=\mathcal {X}\setminus B\), then

$$\begin{aligned} \overline{P}(B) = \overline{P}(1_{B}) = \overline{P}(1-1_{B^{\text {c}}}) = -\underline{P}(-1+1_{B^{\text {c}}}) = 1-\underline{P}(1_{B^{\text {c}}}) = 1-\underline{P}(B^{\text {c}}). \end{aligned}$$

If \(\underline{P}(B)=\overline{P}(B)\), then \(P(B)=\underline{P}(B)\) is called the (precise) probability of \(B\).

To make the definitions for lower and upper previsions and probabilities concrete, let us apply them to our running example:

Lower and upper probabilities for all events and agents

We work out the calculation of Wiske’s lower probability that Belgium will win.

$$\begin{aligned} \underline{P}_{\text {W}}({\textsc {be}}) =&\underline{P}_{\text {W}}(1_{{\textsc {be}}}) \qquad (\text {def. lower probability})\\&= \sup \left\{ {\nu \in \mathbb {R}}:{1_{{\textsc {be}}}-\nu \in \mathcal {E}_{\text {W}}} \right\} \qquad (\text {def. lower prevision})\\&= \begin{aligned}&\sup \left\{ {\nu \in \mathbb {R}}:{ \begin{bmatrix}1-\nu \\ 0-\nu \end{bmatrix} = \begin{bmatrix} \lambda _{\text {W}}+\mu _{{\textsc {be}}}\\ -5\cdot \lambda _{\text {W}}+\mu _{{\textsc {jp}}} \end{bmatrix}, \lambda _{\text {W}}\ge 0,\mu _{{\textsc {be}}}\ge 0,\mu _{{\textsc {jp}}}\ge 0} \right\} \\&\qquad \qquad \qquad \qquad \qquad \qquad (\text {write out natural extension~} \mathcal {E}_{\text {W}} \text { of } \mathcal {A}_{\text {W}} \text {)} \end{aligned}\\&= \begin{aligned}&\sup \left\{ {5\cdot \lambda _{\text {W}}-\mu _{{\textsc {jp}}}}:{ 1-5\cdot \lambda _{\text {W}}+\mu _{{\textsc {jp}}} = \lambda _{\text {W}}+\mu _{{\textsc {be}}},\lambda _{\text {W}}\ge 0,\mu _{{\textsc {be}}}\ge 0,\mu _{{\textsc {jp}}}\ge 0 } \right\} \\&\qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad (\text {eliminate }\nu \text {)} \end{aligned}\\&= \begin{aligned}&\sup \left\{ {5\cdot \lambda _{\text {W}}-\mu _{{\textsc {jp}}}}:{ \lambda _{\text {W}}=\tfrac{1}{6}(1+\mu _{{\textsc {jp}}}-\mu _{{\textsc {be}}}), \lambda _{\text {W}}\ge 0,\mu _{{\textsc {be}}}\ge 0,\mu _{{\textsc {jp}}}\ge 0} \right\} \\&\qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad \text { (solve constraint for } \lambda _{\text {W}}\text {)} \end{aligned}\\&= \sup \left\{ { \tfrac{5}{6}-\tfrac{1}{6}\mu _{{\textsc {jp}}}-\tfrac{5}{6}\mu _{{\textsc {be}}} }:{1+\mu _{{\textsc {jp}}}\ge \mu _{{\textsc {be}}},\mu _{{\textsc {be}}}\ge 0,\mu _{{\textsc {jp}}}\ge 0} \right\} \qquad (\text {eliminate }\lambda _{\text {W}}\text {)}\\&= \frac{5}{6} \qquad (\text {feasible solution }\mu _{{\textsc {be}}}=0,\mu _{{\textsc {jp}}}=0 \text { maximizes expression)} \end{aligned}$$

Do the calculations also for the other agents and for the event that Japan wins. Then apply conjugacy to find the following table of lower and upper probabilities:

figure c

While in the above example the calculation of the lower prevision can be done by hand, in general it realistically requires a linear program solver.
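
As a minimal sketch of such a computation (again representing gambles as payoff vectors over \(({\textsc {be}},{\textsc {jp}})\) and using scipy; the function name is chosen for illustration), the lower prevision of a gamble with respect to an assessment can be computed as follows:

```python
import numpy as np
from scipy.optimize import linprog

def lower_prevision(f, assessment):
    """Natural-extension lower prevision of gamble f given an assessment (a
    list of accepted gambles).  The gamble f - nu lies in the natural
    extension exactly when it dominates some nonnegative combination of the
    accepted gambles, so we maximize nu subject to
        nu + sum_i lam_i * g_i(x) <= f(x)  for every outcome x,  lam_i >= 0.
    Returns None when the LP is unbounded, i.e., the assessment incurs sure loss."""
    f = np.asarray(f, dtype=float)
    G = np.asarray(assessment, dtype=float)        # shape: (num gambles, |X|)
    m, n = G.shape
    c = np.r_[-1.0, np.zeros(m)]                   # variables: nu, lam_1..lam_m
    A_ub = np.hstack([np.ones((n, 1)), G.T])       # nu + G^T lam <= f
    res = linprog(c, A_ub=A_ub, b_ub=f,
                  bounds=[(None, None)] + [(0, None)] * m)
    return -res.fun if res.success else None

# Wiske accepts 1_BE - 5 * 1_JP; her lower probability of BE is 5/6:
print(lower_prevision([1, 0], [[1, -5]]))          # -> 0.8333...
```

Applying the same routine to \(1_{{\textsc {jp}}}\) and to the other agents’ assessments, and then using conjugacy, yields the other entries of the table above (for the simple pool, which incurs sure loss, the linear program is unbounded).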

3.3.3 Assessments of Lower Previsions

Up until now, we assumed a set of acceptable gambles \(\mathcal {A}_{\text {}}\)—an agent’s assessment—to be given. But often the agent will directly specify lower and upper probabilities or previsions, e.g., as bounds on precise probabilities and previsions. However, the coherence criteria and expression for the natural extension are based on having a set of acceptable gambles. In this section we will provide expressions based on an assessment specified as lower prevision values for gambles in a given set \(\mathcal {K}\).

The approach is to derive an assessment as a set of acceptable gambles \(\mathcal {A}_{\text {}}\) from these lower previsions. Irrespective of what its natural extension \(\mathcal {E}_{\text {}}\) actually looks like, it follows from the definition of the lower prevision as a supremum acceptable buying price that

$$\begin{aligned} 0&\le \sup \left\{ {\nu - \underline{P}(f)}:{\nu \in \mathbb {R}\wedge f-\nu \in \mathcal {E}_{\text {}}\supseteq \mathcal {A}_{\text {}}} \right\} \\&= \sup \left\{ {\kappa \in \mathbb {R}}:{f-(\kappa +\underline{P}(f))\in \mathcal {E}_{\text {}}} \right\} = \sup \left\{ {\kappa \in \mathbb {R}}:{(f-\underline{P}(f))-\kappa \in \mathcal {E}_{\text {}}} \right\} . \end{aligned}$$

This implies that \(f-\underline{P}(f)+\varepsilon \in \mathcal {E}_{\text {}}\) for any \(\varepsilon >0\), because of coherence. We cannot take \(\varepsilon =0\), because the corresponding so-called marginal gamble \(f-\underline{P}(f)\) is not included in \(\mathcal {E}_{\text {}}\) in general, as the supremum value \(\kappa =0\) is not necessarily attained inside the set. We therefore take \(\mathcal {A}_{\text {}}=\bigcup _{f\in \mathcal {K}}\left\{ {f-\underline{P}(f)+\varepsilon }:{\varepsilon >0} \right\} \).

We can then apply the theory described above to this assessment \(\mathcal {A}_{\text {}}\). This leads to the following nontrivial results for a lower prevision \(\underline{P}\) defined on a set of gambles \(\mathcal {K}\):

  • It avoids sure loss if and only if for all \(n\ge 0\) and \(f_k\in \mathcal {K}\) it holds that

    $$ \sup _{x\in \mathcal {X}}\sum _{k=1}^n \left( f_k(x)-\underline{P}(f_k)\right) \ge 0. $$
  • It is coherent if and only if for all \(n,m\ge 0\) and \(f_0,f_1,\dots ,f_n\in \mathcal {K}\) it holds that

    $$\begin{aligned} \sup _{x\in \mathcal {X}}\left( \sum _{k=1}^n \left( f_k(x)-\underline{P}(f_k)\right) -m\cdot \left( f_0(x)-\underline{P}(f_0)\right) \right) \ge 0. \end{aligned}$$
  • Its natural extension to any gamble \(f\) in \(\mathcal {L}\) is

    $$ \underline{E}(f) = \sup \left\{ { \inf _{x\in \mathcal {X}}\left\{ f(x)-\sum _{k=1}^n\lambda _k\cdot \bigl (f_k(x)-\underline{P}(f_k)\bigr ) \right\} }:{ n\ge 0, f_k\in \mathcal {K}, \lambda _k\ge 0 } \right\} . $$
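
The last expression is again a linear program in the coefficients \(\lambda _k\). The following minimal sketch evaluates it for a purely hypothetical lower-prevision assessment on \(\left\{ {\textsc {be}},{\textsc {jp}}\right\} \) (the values \(0.5\) and \(0.3\) are illustrative and not taken from the running example):

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical assessment: lower previsions P(1_BE) = 0.5 and P(1_JP) = 0.3.
# Natural extension of the gamble f = 2 * 1_BE + 1 * 1_JP: maximize nu subject to
#   nu <= f(x) - sum_k lam_k * (f_k(x) - P(f_k))  for every outcome x,  lam_k >= 0.
f = np.array([2.0, 1.0])
marginals = np.array([[0.5, -0.5],      # 1_BE - 0.5
                      [-0.3, 0.7]])     # 1_JP - 0.3
n_gambles, n_outcomes = marginals.shape
c = np.r_[-1.0, np.zeros(n_gambles)]                       # variables: nu, lam_k
A_ub = np.hstack([np.ones((n_outcomes, 1)), marginals.T])
res = linprog(c, A_ub=A_ub, b_ub=f,
              bounds=[(None, None)] + [(0, None)] * n_gambles)
print(-res.fun)                                            # -> 1.5
```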

3.3.4 Working on Linear Spaces of Gambles

The coherence criterion we gave in the preceding section for lower previsions on an arbitrary set \(\mathcal {K}\) of gambles is quite involved. However, in case \(\mathcal {K}\) is a linear space of gambles, this criterion becomes considerably simpler. Namely, a lower prevision \(\underline{P}\) must then satisfy the following criteria for all gambles \(f\) and \(g\) in \(\mathcal {K}\) and \(\lambda >0\):

figure d

Expressed for upper previsions \(\overline{P}\), these coherence criteria are very similar:

figure e

From the coherence criteria, many useful properties can be derived for a coherent lower prevision \(\underline{P}\) and its conjugate upper prevision \(\overline{P}\). We provide a number of key ones, which hold for all gambles \(f\) and \(g\) in \(\mathcal {K}\) and \(\mu \in \mathbb {R}\); \(\underline{\overline{P}}\) denotes either \(\underline{P}\) or \(\overline{P}\):

figure f

3.4 Sets of Probabilities

In Sect. 3.2 we modeled uncertainty using a set of acceptable gambles. In Sect. 3.3 we showed how this can also be done in terms of lower or upper previsions (or probabilities). In this section, we add a third representation, one using credal sets—sets of precise previsions [17, §2.2.2], [18, §1.6.2]. In Sect. 3.4.1 we show how to derive the credal set corresponding to a given lower prevision. In Sect. 3.4.2 we go in the other direction and show how to go from a credal set to lower prevision values [20, §3.3].

3.4.1 From Lower Previsions to Credal Sets

A credal set is a subset of the set \(\mathcal {P}\) of all precise previsions. (For possibility spaces \(B\) different from \(\mathcal {X}\), we write \(\mathcal {P}_B\).) The set \(\mathcal {P}\) is convex, meaning that any convex mixture of precise previsions is again a precise prevision, and a gamble’s prevision, viewed as a function of the precise prevision, is linear over this set. A lower—and upper—prevision can be seen as providing a bound on the value of the precise prevision for that gamble and thereby represents a linear constraint on the precise previsions. So the credal set \(\mathcal {M}\) corresponding to a lower prevision \(\underline{P}\) defined on a set of gambles \(\mathcal {K}\) is the subset of \(\mathcal {P}\) satisfying this constraint for all gambles in \(\mathcal {K}\):

$$\begin{aligned} \mathcal {M}= \bigcap _{f\in \mathcal {K}}\left\{ {P\in \mathcal {P}}:{P(f)\ge \underline{P}(f)} \right\} . \end{aligned}$$

Being defined as such an intersection, these credal sets are closed and convex.

The rationality criteria for a lower prevision \(\underline{P}\) we encountered before can also be expressed using its corresponding credal set \(\mathcal {M}\):

  • \(\underline{P}\) incurs sure loss if and only if \(\mathcal {M}\) is equal to the empty set.

  • \(\underline{P}\) is coherent if and only if all constraints are ‘tight’, i.e., if for every \(f\) in \(\mathcal {K}\) there exists a \(P\) in \(\mathcal {M}\) such that \(P(f)=\underline{P}(f)\).
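
Both characterizations can be checked numerically. The following minimal sketch (representing probability mass functions as vectors over \(({\textsc {be}},{\textsc {jp}})\) and using scipy; the function name is chosen for illustration) computes the minimum expectation over \(\mathcal {M}\) directly from the constraints:

```python
import numpy as np
from scipy.optimize import linprog

def credal_minimum(f, gambles, lower_values):
    """Minimum of P(f) over the credal set {p a pmf : p . g_k >= lower_values[k]}.
    Returns None if the credal set is empty, i.e., the lower prevision incurs
    sure loss; coherence means the returned minimum equals the assessed lower
    prevision value for every gamble g_k."""
    f = np.asarray(f, dtype=float)
    G = np.asarray(gambles, dtype=float)
    n = f.size
    res = linprog(f,
                  A_ub=-G, b_ub=-np.asarray(lower_values, dtype=float),
                  A_eq=np.ones((1, n)), b_eq=[1.0],
                  bounds=[(0, None)] * n)
    return res.fun if res.success else None

# Yoko Tsuno's lower probability of JP is 4/5 (cf. Sect. 3.3); the bound is attained:
print(credal_minimum([0, 1], [[0, 1]], [0.8]))    # -> 0.8
```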

Let us make the concept of a credal set concrete using our running example:

Yoko Tsuno’s credal set

For a finite possibility space such as the one of our running example, a precise prevision \(P\) is completely determined by the corresponding probability mass function \(p\), defined by \(p_x=P(\left\{ x\right\} )\) for \(x\) in \(\mathcal {X}=\left\{ {\textsc {be}},{\textsc {jp}}\right\} \). The set of all precise previsions can therefore be represented by the probability simplex—the set of all probability mass functions—on \(\mathcal {X}\). This set and the example probability mass function \((\frac{1}{2},\frac{1}{2})\) are shown below left. Below right, we illustrate how Yoko Tsuno’s lower prevision \(\underline{P}_{\text {Y}}({\textsc {jp}})=\frac{4}{5}\) generates the credal set \(\mathcal {M}_{\text {Y}}\): The gamble \(1_{{\textsc {jp}}}\) as a linear function over the simplex is shown as an inclined line. This linear relationship between \(p\)—equivalently, the corresponding prevision \(P_p\)—and \(P_p(1_{{\textsc {jp}}})=P_p({\textsc {jp}})\) transforms the bounds \(\frac{4}{5}\le P_p({\textsc {jp}})\le 1\) into \(\mathcal {M}_{\text {Y}}\).

figure g

The set of extreme points \(\mathcal {M}^*_{\text {Y}}\) of \(\mathcal {M}_{\text {Y}}\) as probability mass functions is \(\left\{ (\frac{1}{5},\frac{4}{5}), (0,1)\right\} \).

3.4.2 From Credal Sets to Lower Previsions

Now we assume that the agent’s credal set \(\mathcal {M}\) is given. Most generally this can be any set of precise previsions, i.e., any subset of \(\mathcal {P}\). Often, to ensure equivalence between coherent lower previsions and non-empty credal sets, they are required to be closed and convex. In that case, a credal set is determined completely by its set of extreme points \(\mathcal {M}^*\) in the sense that all other elements are convex mixtures of these.

To determine the lower prevision corresponding to any credal set, we determine its value for each gamble \(f\) of interest using the lower envelope theorem:

$$\begin{aligned} \underline{P}(f) = \min \left\{ {P_p(f)}:{p\in \mathcal {M}} \right\} = \min \left\{ {P_p(f)}:{p\in \mathcal {M}^*} \right\} . \end{aligned}$$
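
Numerically, this amounts to a minimum over finitely many expectations once the extreme points are known. A minimal sketch (with probability mass functions as vectors over \(({\textsc {be}},{\textsc {jp}})\); the function name is chosen for illustration):

```python
import numpy as np

def lower_envelope(f, extreme_points):
    """Lower prevision of gamble f as the minimum expectation over the
    extreme points of a closed and convex credal set."""
    f = np.asarray(f, dtype=float)
    return min(float(p @ f) for p in np.asarray(extreme_points, dtype=float))

# Yoko Tsuno's credal set has extreme points (1/5, 4/5) and (0, 1):
print(lower_envelope([0, 1], [[0.2, 0.8], [0.0, 1.0]]))    # lower prob. of JP: 4/5
print(-lower_envelope([0, -1], [[0.2, 0.8], [0.0, 1.0]]))  # conjugate upper prob.: 1
```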

Let us again use the running example to provide a feeling for what this all means:

A credal set for the empty pool facing penalties

Consider the empty pool. Because its assessment is empty, its credal set \(\mathcal {M}_{\text {EP}}\) is the trivial one corresponding to all probability mass functions on \(\mathcal {X}=\left\{ {\textsc {be}},{\textsc {jp}}\right\} \). Now we add an extra element to the possibility space, ‘Penalties’. Below left we show \(\mathcal {M}_{\text {EP}}\) embedded in the corresponding larger probability simplex. Wiske and Yoko Tsuno decide to add the uniform probability mass function to it. Below right, you see the convex hull \(\mathcal {M}_{\text {EUP}}\) of this extra probability mass function and the original credal set.

figure h

If we want to calculate lower and upper prevision values, we can here use the extreme-point version of the lower envelope theorem and its analogue for upper previsions. For example, for the pool’s upper probability for Penalties:

$$\begin{aligned} \overline{P}_{\text {EP}}({\textsc {p}})=\overline{P}_{\text {EP}}(1_{{\textsc {p}}})=\max \left\{ {p^\top (0,0,1)}:{p\in \left\{ (1,0,0),(0,1,0),(\tfrac{1}{3},\tfrac{1}{3},\tfrac{1}{3})\right\} } \right\} =\frac{1}{3}. \end{aligned}$$

To make explicit where this maximum is achieved, we show, above right, the line of probability mass functions \(p\) such that \(P_p(\textsc {p})=\frac{1}{3}\).

3.5 Basics of Conditioning

Conditioning an uncertainty model is the act of restricting attention to a subset \(B\) of the possibility space. It is often used to update an uncertainty model after having observed the event \(B\) [20, §6.1].

In the theory of imprecise probability, conditioning is a specific case of natural extension [17, §2.3.3], [20, §6.4.1]. In terms of acceptable gambles, conditioning on \(B\) corresponds to restricting the space of gambles to those that are zero outside \(B\) [18, §1.3.3]. For lower previsions, this translates to the following conditioning rule for all gambles \(f\) in \(\mathcal {L}\):

$$ \underline{E}(f\,\vert \,B) = {\left\{ \begin{array}{ll} \inf _{x\in B}f(x) &{} \text {if } \underline{P}(B)=0,\\ \max \left\{ {\mu \in \mathbb {R}}:{\underline{P}\bigl (1_{B}\cdot (f-\mu )\bigr )=0} \right\} &{} \text {if } \underline{P}(B)>0. \end{array}\right. } $$

Conditioning a credal set \(\mathcal {M}\) corresponds to taking the credal set \(\mathcal {M}\vert B\) formed by conditioning each of the precise previsions in \(\mathcal {M}\):

$$ \mathcal {M}\vert B = {\left\{ \begin{array}{ll} \mathcal {P}_B &{} \text {if }\exists P\in \mathcal {M}: P(B)=0,\\ \left\{ {P{(\cdot }\,\vert \,{B)}}:{P\in \mathcal {M}} \right\} &{} \text {if }\forall P\in \mathcal {M}: P(B)>0. \end{array}\right. } $$

These rules based on natural extension give vacuous conditionals whenever the lower probability of the conditioning event is zero. Regular extension is a less imprecise updating rule [17, §2.3.4], [18, §1.6.6], [20, App. J]: In credal set terms, it removes those precise previsions \(P\) such that \(P(B)=0\) from \(\mathcal {M}\).

Let us apply the conditioning rules discussed here to our running example:

Conditioning the empty-uniform pool’s credal set

We condition the empty-uniform pool’s credal set on \(\left\{ {\textsc {jp}},\textsc {p}\right\} \), i.e., Belgium not winning in regular time. Further down on the left, we show what happens if we apply natural extension: the conditional model is vacuous because \(P_{(1,0,0)}(\left\{ {\textsc {jp}},\textsc {p}\right\} )=P_{(1,0,0)}(1_{\left\{ {\textsc {jp}},\textsc {p}\right\} })=(1,0,0)^\top (0,1,1)=0\). Further down on the right, we apply regular extension and therefore remove \(P_{(1,0,0)}\) from \(\mathcal {M}_{\text {EUP}}\); this results in a non-vacuous conditional credal set.

figure i
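
The conditioning illustrated above can be reproduced numerically. A minimal sketch (assuming the credal set is given by the extreme points of its probability mass functions, with outcomes ordered \(({\textsc {be}},{\textsc {jp}},\textsc {p})\); the function name is chosen for illustration):

```python
import numpy as np

def condition_extreme_points(extreme_points, B, regular=False):
    """Condition a credal set, given by its extreme probability mass functions
    (rows), on an event B (a list of outcome indices).  With natural extension,
    any extreme point assigning zero probability to B makes the conditional
    model vacuous (returned as None); regular extension instead drops such
    points before conditioning."""
    pts = np.asarray(extreme_points, dtype=float)
    indicator = np.zeros(pts.shape[1]); indicator[list(B)] = 1.0
    probs = pts @ indicator                      # P(B) for each extreme point
    if np.any(probs == 0):
        if not regular:
            return None                          # vacuous conditional model
        pts, probs = pts[probs > 0], probs[probs > 0]
    return pts * indicator / probs[:, None]      # restrict to B and renormalize

M_EUP = [[1, 0, 0], [0, 1, 0], [1/3, 1/3, 1/3]]  # empty-uniform pool
print(condition_extreme_points(M_EUP, B=[1, 2]))                # None: vacuous
print(condition_extreme_points(M_EUP, B=[1, 2], regular=True))  # (0,1,0), (0,1/2,1/2)
```

When the result is not vacuous, conditional lower and upper previsions follow by taking minima and maxima of expectations over the conditioned extreme points, as in Sect. 3.4.2.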

3.6 Remarks About Infinite Possibility Spaces

The theory we presented is also applicable to denumerable and continuous possibility spaces, with some technical amendments to the coherence criteria and by considering only bounded gambles. However, the running example used a finite possibility space, so it could not give a feeling for applications with infinite possibility spaces. Therefore we here give some remarks about imprecise probabilistic uncertainty models on continuous possibility spaces:

  • They are mostly defined using credal sets whose extreme points are parametric distributions where the parameters vary in a set. Prime examples are the imprecise Dirichlet model [21] and its generalizations [19].

  • They are also commonly defined using probability mass assignments to subsets of the possibility space. This is in some way a reduction to the finite case. Examples are belief functions [13, §5.2.1.1], some P-boxes [14, §4.6.4], and NPI models [5, §7.6].

  • Furthermore, models which bound some specific description of a precise prevision, such as cumulative distribution functions and probability density functions, are also popular in some domains. The extreme points of their credal set are, however, not known. General P-boxes [15] and lower and upper density functions [20, §4.6.3] are examples of this class.

  • Calculating lower and upper previsions—i.e., performing natural extension—can quickly lead to difficult optimization problems, so computational tractability should be a key consideration when choosing a specific type of model.

3.7 Conclusion

This introduction to the theory of imprecise probability has prepared you for accessing the broader literature on this topic and its applications. For those who wish to apply imprecise probabilistic techniques, this text provides only the first step: You should dive into the literature and contact experts to obtain the necessary knowledge and feedback. The references of this chapter and their authors or editors provide a starting point for that.