In this section, we formally define the decision criteria which are studied in the following sections. As game theory can be seen as interactive decision theory, the relevant decision criteria are introduced in a decision-theoretic setting. In doing so, however, we pay attention to use formal notation that is consistent with the multigame setting developed in the next sections.
This paper investigates four main alternatives: the maximization of expected utility, maximin expected utility, and two forms of regret minimization. Throughout the analysis, the term rationality is understood solely in its ecological sense, that is, as what is advantageous and beneficial for the individual to survive and thrive.
By and large, decision criteria are functions associating a decision problem with an action choice. A decision problem can be formalized as a tuple (S, A, Z, c), where S is a set of possible states of the world, A is the agent’s set of actions, Z is a set of outcomes, and c is the outcome function
$$\begin{aligned} c:S\times A\rightarrow Z. \end{aligned}$$
On top of the objective structure (S, A, Z, c) of the decision problem, there are two crucial subjective components: the agent’s subjective utility u and the agent’s subjective belief B. The subjective utility is a real-valued function associating each possible outcome with a numerical subjective utility
$$\begin{aligned} u:Z\rightarrow \mathbb {R}. \end{aligned}$$
The subjective belief of the agent is encoded by a set B, whose elements can be objects of different nature, depending on the belief representation in the model. For instance, in a qualitative approach, the belief set B may be just a subset of possible states of the world \(B\subseteq S\) which the agent considers possible or most plausible. In a more quantitative approach, the belief set may be a set of probability measures \(B\subseteq \Delta (S)\), where \(\Delta (S)\) denotes the set of all probability measures p over the set S.
All decision criteria considered here can be thus thought of as functions associating a subjective utility and a subjective belief with an action choice:
$$\begin{aligned} \text {Decision Criterion: Utilities}\times \text {Beliefs}\rightarrow \text {Actions}. \end{aligned}$$
In order to output an action, the decision criterion must be able to resolve the kind of uncertainty encoded by the belief set B. For instance, if the belief set B is not a singleton set containing a unique probability measure, but rather a set of states \(B\subseteq S\), actions cannot be associated with expected utilities, and the maximization of expected utility is hence unfit to solve the choice problem. In those cases, the agent has to resort to other criteria, such as maximin or regret minimization.
In the following, we assume that B is a nonempty set of probability measures. The advantage of this choice is that it encompasses the main approaches to the formal representation of beliefs mentioned above. Models where the agent’s belief is given in terms of a set of states \(B=S'\subseteq S\) can alternatively be expressed by the set of probability measures \(B=\Delta (S')\). Models assuming Bayesian agents with probabilistic beliefs are the special cases where the set \(B\subseteq \Delta (S)\) is a singleton.
Expected utility maximization
Of all decision criteria advanced in the literature, the maximization of a subjective expected utility has gained most favor for defining rational choice. Given a decision problem where the belief set B is represented by a single probability measure \(p\in \Delta (S)\), the subjective expected utility of each action \(a\in A\) is
$$\begin{aligned} E_{p}[u|a]=\sum _{s\in S}u(c(s,a))\cdot p(s). \end{aligned}$$
According to expected utility maximization, the rational choice to make is to pick an action \(a^{*}\) such that
$$\begin{aligned} a^{*}\in \text {argmax}_{a\in A}E_{p}[u|a]. \end{aligned}$$
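The criterion can be sketched in a few lines of code. The sketch below is illustrative only: the two states, two actions, and payoffs are hypothetical assumptions, not an example from this paper.

```python
# Sketch of expected utility maximization on a hypothetical toy problem.
# The states, actions, and payoffs are illustrative assumptions.

def expected_utility(payoff, p, a):
    """E_p[u | a] = sum over states s of u(c(s, a)) * p(s)."""
    return sum(payoff[(s, a)] * p[s] for s in p)

def eu_maximizers(payoff, p, actions):
    """All actions attaining the maximal expected utility under p."""
    best = max(expected_utility(payoff, p, a) for a in actions)
    return sorted(a for a in actions
                  if expected_utility(payoff, p, a) == best)

# Toy problem: action a1 pays 2 in state s1 and 0 in s2; a2 pays 1 in both.
payoff = {("s1", "a1"): 2, ("s2", "a1"): 0,
          ("s1", "a2"): 1, ("s2", "a2"): 1}
p = {"s1": 0.5, "s2": 0.5}
print(eu_maximizers(payoff, p, ["a1", "a2"]))  # ['a1', 'a2']: both have EU 1.0
```

Under the uniform measure both actions have expected utility 1.0, so the argmax set contains both; any tie-breaking rule is left to the modeler.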
Whenever the agent’s uncertainty is not measurable, i.e., whenever the set B is not representable by a single probability measure, each action a is associated with one expected utility \(E_{q}[u|a]\) for each probability measure \(q\in B\). Under unmeasurable uncertainty, also known as ambiguity, the maximization of subjective expected utility is thus unable to assign each option a unique expected-utility value and to prescribe a course of action accordingly. When faced with unmeasurable uncertainty, the agent must resort to more general decision criteria to evaluate her options.
Maximin expected utility
The most famous criterion for decision making under ambiguity is probably the maximinimization of subjective expected utility, or simply maximin expected utility. This criterion dictates ranking the actions according to their minimal expected utility and then choosing among the top-ranked:
$$\begin{aligned} a^{*}\in \text {argmax}_{a\in A}\min _{p\in B}E_{p}[u|a]. \end{aligned}$$
Note that maximin expected utility is adequate for choice under both unmeasurable and measurable uncertainty. When the uncertainty is measurable, that is, when B is a singleton, maximin expected utility reduces to the maximization of expected utility.
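As a sketch (the toy payoffs and belief set below are illustrative assumptions, not drawn from the paper), both the criterion and its reduction to expected utility maximization under a singleton belief set can be checked directly:

```python
# Sketch of maximin expected utility on a hypothetical toy problem.
# Payoffs and belief sets are illustrative assumptions.

def expected_utility(payoff, p, a):
    return sum(payoff[(s, a)] * p[s] for s in p)

def maximin_value(payoff, B, a):
    """min over p in B of E_p[u | a]."""
    return min(expected_utility(payoff, p, a) for p in B)

def maximin_choices(payoff, B, actions):
    """Actions maximizing the minimal expected utility over B."""
    best = max(maximin_value(payoff, B, a) for a in actions)
    return sorted(a for a in actions if maximin_value(payoff, B, a) == best)

payoff = {("s1", "a1"): 2, ("s2", "a1"): 0,
          ("s1", "a2"): 1, ("s2", "a2"): 1}
# Ambiguous belief set: the agent only knows that p(s1) is 0.25 or 0.75.
B = [{"s1": 0.25, "s2": 0.75}, {"s1": 0.75, "s2": 0.25}]
print(maximin_choices(payoff, B, ["a1", "a2"]))  # ['a2']: worst case 1.0 vs 0.5

# With a singleton belief set, maximin reduces to EU maximization.
print(maximin_choices(payoff, [{"s1": 0.5, "s2": 0.5}], ["a1", "a2"]))
```

In the ambiguous case the safe action a2 wins (its worst-case expected utility is 1.0, against 0.5 for a1); in the singleton case the two actions tie, exactly as under expected utility maximization.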
Linear regret minimization
Another important criterion for decisions under both measurable and unmeasurable uncertainty is regret minimization, which comes in at least two different forms. The more common version of regret minimization is the one we call linear; it has been axiomatized recently by Hayashi (2008) and Stoye (2011).
To formalize linear regret minimization, let us first define the linear regret of action a given probability measure p as the following quantity:
$$\begin{aligned} r_{L}(a,p)\,:=\,E_{p}\left[ \max _{a'\in A}u(c(s,a'))-u(c(s,a))\right] \,=\,\sum _{s\in S}p(s)\cdot \left( \max _{a'\in A}u(c(s,a'))-u(c(s,a))\right) . \end{aligned}$$
Linear regret minimization prescribes to pick an action \(a^{*}\) such that
$$\begin{aligned} a^{*}\in \text {argmin}_{a\in A}\max _{p\in B}\,r_{L}(a,p). \end{aligned}$$
In the case of measurable uncertainty, linear regret minimization also prescribes the same action choice as expected utility maximization (see Appendix A).
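The definition translates directly into code. The sketch below uses hypothetical toy payoffs and a hypothetical two-element belief set, assumed purely for illustration; note that the maximum over alternative actions is taken state by state, inside the expectation.

```python
# Sketch of linear regret minimization on a hypothetical toy problem.
# Payoffs and belief set are illustrative assumptions.

def linear_regret(payoff, p, a, actions):
    """r_L(a, p): expected shortfall from the state-wise best action."""
    return sum(p[s] * (max(payoff[(s, b)] for b in actions) - payoff[(s, a)])
               for s in p)

def linear_regret_minimizers(payoff, B, actions):
    """Actions minimizing the worst-case linear regret over B."""
    worst = {a: max(linear_regret(payoff, p, a, actions) for p in B)
             for a in actions}
    best = min(worst.values())
    return sorted(a for a in actions if worst[a] == best)

payoff = {("s1", "a1"): 2, ("s2", "a1"): 0,
          ("s1", "a2"): 1, ("s2", "a2"): 1}
B = [{"s1": 0.25, "s2": 0.75}, {"s1": 0.5, "s2": 0.5}]
print(linear_regret_minimizers(payoff, B, ["a1", "a2"]))  # ['a2']
```

Here the linear regret of a1 is p(s2) (its shortfall of 1 in state s2), while that of a2 is p(s1); the worst cases over B are 0.75 and 0.5 respectively, so a2 is chosen.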
Nonlinear regret minimization
An equally reasonable version of regret minimization is the one we call nonlinear regret minimization. Let us define the nonlinear regret of action a given probability measure p as the quantity
$$\begin{aligned} r_{N}(a,p)\,:=\,\max _{a'\in A}E_{p}[u|a']-E_{p}[u|a]\,=\,\max _{a'\in A}\sum _{s\in S}p(s)\cdot u(c(s,a'))-\sum _{s\in S}p(s)\cdot u(c(s,a)). \end{aligned}$$
Nonlinear regret minimization accordingly prescribes to opt for an action \(a^{*}\) such that
$$\begin{aligned} a^{*}\in \text {argmin}_{a\in A}\max _{p\in B}\,r_{N}(a,p). \end{aligned}$$
Like linear regret minimization, nonlinear regret minimization agrees with expected utility maximization on the action to be chosen in the case of measurable uncertainty (see Appendix A).
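The nonlinear variant differs from the linear one in where the maximum is taken: over expected utilities, outside the expectation, rather than state by state inside it. A minimal sketch, again on hypothetical toy payoffs and a hypothetical belief set assumed for illustration:

```python
# Sketch of nonlinear regret minimization on a hypothetical toy problem.
# Payoffs and belief set are illustrative assumptions.

def expected_utility(payoff, p, a):
    return sum(payoff[(s, a)] * p[s] for s in p)

def nonlinear_regret(payoff, p, a, actions):
    """r_N(a, p): shortfall from the best *expected* utility under p."""
    return (max(expected_utility(payoff, p, b) for b in actions)
            - expected_utility(payoff, p, a))

def nonlinear_regret_minimizers(payoff, B, actions):
    """Actions minimizing the worst-case nonlinear regret over B."""
    worst = {a: max(nonlinear_regret(payoff, p, a, actions) for p in B)
             for a in actions}
    best = min(worst.values())
    return sorted(a for a in actions if worst[a] == best)

payoff = {("s1", "a1"): 2, ("s2", "a1"): 0,
          ("s1", "a2"): 1, ("s2", "a2"): 1}
B = [{"s1": 0.25, "s2": 0.75}, {"s1": 0.5, "s2": 0.5}]
print(nonlinear_regret_minimizers(payoff, B, ["a1", "a2"]))  # ['a2']
```

On this particular toy problem the two regret notions agree; the example in the next subsection shows a case where they come apart.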
Example 1
To better appreciate the behavioral difference between the two ways of understanding and minimizing regret, consider the following example. A bag contains ten marbles, each either blue or red. Seven marbles are red, one is blue, and nothing is known about the remaining two, so the bag contains one, two, or three blue marbles. The corresponding belief set can be expressed simply as \(\{0.1,0.2,0.3\}\), where it suffices to state the probability of drawing a blue marble, since drawing a red marble is the complementary event. A marble will be drawn from the bag and an agent has the chance to bet on its color according to the payoffs given in Fig. 1b: a winning bet on blue yields 3, a winning bet on red yields 1, and losing bets cost nothing.
A linear regret minimizer will look at Fig. 1a and pick the action corresponding to the line whose highest point within the dotted interval is lower. A linear regret minimizer is hence indifferent between R and B, as the two lines reach equally high maxima within the dotted interval (where the maximum of \(r_{L}(R)\) is reached when the probability of a blue marble is 0.3, and the maximum of \(r_{L}(B)\) when that probability is 0.1). On the contrary, a nonlinear regret minimizer will prefer to bet on red: as depicted in Fig. 1c, the maximum reached by \(r_{N}(R)\) within the dotted interval is lower than the maximum of \(r_{N}(B)\).
It is worth stressing that in order to resolve the same uncertainty about the bag of marbles through expected utility maximization, a single probability distribution is needed. Given the knowledge of the composition of the bag (7 red marbles, 1 blue marble, 2 marbles that are either red or blue), it is reasonable for an expected utility maximizer to discard all probability distributions outside the interval delimited by the two vertical dotted lines and to opt for a distribution that assigns the blue marble a probability between 0.1 and 0.3. In what follows, we assume that the expected utility maximizer reaches a probabilistic belief by averaging the points of the belief set, e.g., \((0.1+0.2+0.3)/3=0.2\) in the case of the bag of marbles. The expected utility maximizer may thus be viewed as resolving unmeasurable uncertainty by means of the principle of insufficient reason, or principle of indifference, which prescribes assigning equal probabilistic weight to all possible alternatives. In the following, by expected utility maximizer we hence mean a decision maker who applies the principle of insufficient reason first and the maximization of expected utility next.
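The numbers behind Fig. 1 can be checked directly. The sketch below recomputes the worst-case linear and nonlinear regrets for the bag-of-marbles example, using exact rational arithmetic to avoid floating-point noise; the action labels B and R for betting on blue and red are our own shorthand.

```python
from fractions import Fraction as F

# Example 1: bet on Blue (B) or Red (R); payoffs as in Fig. 1b.
payoff = {("blue", "B"): 3, ("red", "B"): 0,
          ("blue", "R"): 0, ("red", "R"): 1}
actions = ["B", "R"]
# Belief set: the probability of drawing a blue marble is 0.1, 0.2, or 0.3.
beliefs = [{"blue": q, "red": 1 - q}
           for q in (F(1, 10), F(2, 10), F(3, 10))]

def eu(p, a):
    return sum(payoff[(s, a)] * p[s] for s in p)

def r_lin(p, a):
    return sum(p[s] * (max(payoff[(s, b)] for b in actions) - payoff[(s, a)])
               for s in p)

def r_nonlin(p, a):
    return max(eu(p, b) for b in actions) - eu(p, a)

worst_lin = {a: max(r_lin(p, a) for p in beliefs) for a in actions}
worst_nonlin = {a: max(r_nonlin(p, a) for p in beliefs) for a in actions}

# Linear regret: both worst cases equal 9/10, hence indifference between B and R.
print(worst_lin["B"], worst_lin["R"])        # 9/10 9/10
# Nonlinear regret: R's worst case (1/5) beats B's (3/5), hence bet on red.
print(worst_nonlin["B"], worst_nonlin["R"])  # 3/5 1/5
# The expected utility maximizer averages the belief set to q = 1/5 and bets on red.
q_avg = {"blue": F(1, 5), "red": F(4, 5)}
print(eu(q_avg, "R") > eu(q_avg, "B"))       # True (4/5 > 3/5)
```

The computation confirms the discussion of Fig. 1: the linear regret minimizer is exactly indifferent, the nonlinear regret minimizer strictly prefers betting on red, and so does the expected utility maximizer after applying the principle of insufficient reason.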
All candidate criteria for rational choice that we have seen agree in all those cases where the belief of the agent is represented by a unique probability measure. The disagreement about the course of action to be taken arises when the agent is unmeasurably uncertain. Maximin expected utility and both versions of regret minimization may all prescribe different behaviors, whereas the maximization of expected utility is of no help unless the uncertainty is reduced to a probabilistic belief.