1 Introduction

We examine infinite horizon decision problems in which the decision maker’s payoff is a bounded function of the infinite sequence of the actions chosen. Within our framework, probability measures are so-called charges: they are finitely additive, albeit not necessarily countably additive. We assume that the decision maker uses behavioral strategies. A behavioral strategy assigns a charge on the available actions depending on the actions chosen in the past. In order to assign an expected payoff to each behavioral strategy, it is necessary to define the charge induced by the behavioral strategy on the set of infinite sequences of actions.

Following established literature, we explore four distinct algebras on the set of infinite sequences of actions, and correspondingly, we define the charge induced by a behavioral strategy on each of these algebras.

Yet, because the payoff function is only assumed to be bounded, it may not be measurable with respect to these algebras. Consequently, to calculate expected payoffs through integrals, we need to extend the charge induced by a behavioral strategy from these algebras to encompass the entire power set. However, the extension of these charges is generally not unique, and thus the expected payoff under a behavioral strategy might be ambiguous.

This naturally gives rise to the question: what conditions ensure that the expected payoff is unambiguous? We address this question by finding conditions on the payoff function that guarantee an unambiguous expected payoff regardless of which behavioral strategy the decision maker adopts. Most of these conditions have a topological nature. We illustrate the results by several examples.

Related literature. Charges were advocated by de Finetti (1975), Savage (1972), and Dubins and Savage (2014). They facilitate constructions such as a uniform probability distribution over the natural numbers (cf. Schirokauer and Kadane (2007)), and avoid the problem of measure (cf. Aliprantis and Border (2005)). For a summary of the history of charges, we refer to Bingham (2010).

In decision theory, charges have been used in various models, notably in de Finetti (1975) and Savage (1972), and they are also regularly used to model beliefs (e.g. Gilboa and Marinacci (2016)). Models building on these ideas can be found in Al-Najjar (2009) and Pomatto et al. (2014). Sudderth (2016) discusses finitely additive dynamic programming, where the payoff is some aggregation of daily payoffs. Charges have also gained recognition in game-theoretic models, such as in Maitra and Sudderth (1993), Marinacci (1997), Maitra and Sudderth (1998), Harris et al. (2005), Capraro and Scarsini (2013), Al-Najjar et al. (2014), Flesch et al. (2017), Milchtaich (2020) and Cerreia-Vioglio et al. (2022).

Decision problems of the same type as ours are studied in Dubins and Savage (2014), Dubins (1974), Purves and Sudderth (1976) and Flesch et al. (2019), and the algebras we consider on the set of infinite sequences of actions already appear in these papers. We will mention the specific connections throughout the paper.

Structure of the paper. In Sect. 2, we discuss some preliminaries on charges. In Sect. 3, we introduce the model and the main question. In Sect. 4, we discuss the charges induced by behavioral strategies and the corresponding expected payoffs. In Sect. 5, as a preparation, we define a few classes of payoff functions. In Sect. 6, we present our main results: conditions for an unambiguously defined expected payoff. In Sect. 7, we illustrate our results with examples. In Sect. 8, we provide some concluding remarks. For convenience, an overview of the most important notation can be found before the Appendices.

The remaining sections contain technical issues and the proofs: Sect. 9 contains further properties on the algebras and expected payoffs, Sect. 10 contains most of the proofs, and Sect. 11 contains an additional example.

2 Preliminaries on charges

In this section we provide a brief summary of charges. For further reading, we refer to Rao and Rao (1983) and Dunford and Schwartz (1964).

Let X be a nonempty set. A collection \(\mathscr {P}\) of subsets of X is called an algebra if it has the following properties: (1) \(X \in \mathscr {P}\), (2) if \(E, F \in \mathscr {P}\), then \(E \cup F \in \mathscr {P}\), (3) if \(E\in \mathscr {P}\), then \(X{\setminus } E \in \mathscr {P}\). It follows that an algebra is closed under taking finite unions. An algebra is called a sigma-algebra if it is moreover closed under taking countable unions.

Let \(\mathscr {P}\) be an algebra on X. A finitely additive probability measure, also called a charge, on \((X, \mathscr {P})\) is a function \(\mu :\mathscr {P}\rightarrow [0,1]\) such that \(\mu (X) = 1\) and for all disjoint sets \(E, F \in \mathscr {P}\) it holds that \(\mu (E \cup F) = \mu (E) + \mu (F)\). We denote the set of all charges on \((X, \mathscr {P})\) by \({\mathcal {C}}(X,\mathscr {P})\). For \(x \in X\), we denote the Dirac charge on x by \(\delta _x\), i.e., for every set \(B \in \mathscr {P}\), we have \(\delta _x(B) =1\) if \(x \in B\) and \(\delta _x(B)=0\) if \(x \notin B\).

The following statement follows from Theorem 2 in Loś and Marczewski (1949) together with Zorn's Lemma, and is also shown in Theorem C.3 in Flesch et al. (2017). If \(\mathscr {P}\) is an algebra, and \(\mu\) is a charge on \(\mathscr {P}\), then \(\mu\) can be extended to a charge on \(2^X\). That is, there exists a charge \(\nu\) on \((X, 2^X)\) such that \(\nu (E)=\mu (E)\) for all \(E \in \mathscr {P}\). The extension \(\nu\) is generally not unique.

Let \(\mathscr {P}\) be an algebra on X. A function \(s :X \rightarrow \mathbb {R}\) is called a \(\mathscr {P}\)-measurable simple function if there are \(c_1, \ldots , c_m \in \mathbb {R}\) and a partition \(\{B_1, \ldots , B_m\}\) of X with \(B_1, \ldots , B_m \in \mathscr {P}\) such that \(s = \sum _{i=1}^m c_i \mathbb {I}_{B_i}\), where \(\mathbb {I}_{B_i}\) is the characteristic function of the set \(B_i\). Let \(\mu\) be a charge on \((X, \mathscr {P})\). The integral of s with respect to the charge \(\mu\) is defined by \(\int _{x\in X} s(x)\,\mu (dx) = \sum _{i=1}^m c_i \cdot \mu (B_i)\).
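
To make the definition concrete, the following minimal Python sketch (our illustration, not part of the paper) evaluates the integral of a simple function against a charge; the charge only needs to be queried on the finitely many sets of the partition.

```python
# Minimal sketch (illustration only): integrating a simple function
# s = sum_i c_i * 1_{B_i} against a charge mu.  The charge is modelled as a
# callable returning mu(B) for the finitely many partition elements we query.

def integrate_simple(partition_values, mu):
    """partition_values: list of (B_i, c_i) pairs, where the sets B_i partition X.
    mu: callable mapping a set B_i to mu(B_i) in [0, 1].
    Returns sum_i c_i * mu(B_i)."""
    return sum(c * mu(B) for B, c in partition_values)

# Example on X = {1, ..., 6} with the uniform charge (here even countably
# additive); s equals 1 on {1, 2, 3} and 4 on {4, 5, 6}.
X = frozenset(range(1, 7))
uniform = lambda B: len(B) / len(X)
s = [(frozenset({1, 2, 3}), 1.0), (frozenset({4, 5, 6}), 4.0)]
print(integrate_simple(s, uniform))  # 0.5 * 1 + 0.5 * 4 = 2.5
```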

Let \(\mu\) be a charge on \((X, 2^X)\). For every bounded function \(f :X \rightarrow \mathbb {R}\) and every \(\varepsilon >0\), there exists a (\(2^X\)-measurable) simple function s such that \(s \le f \le s + \varepsilon\). Let \(f :X \rightarrow \mathbb {R}\) be a bounded function. The integral \(\int _{x\in X} f(x)\,\mu (dx)\) is defined as the supremum of all real numbers \(\int _{x\in X} s(x)\,\mu (dx)\), where s is a simple function with \(s \le f\). Since f is bounded, the integral is finite. The integral is linear over the set of bounded real-valued functions. We remark that the integral \(\int _{x\in X} f(x)\,\mu (dx)\) is equal to the infimum of all real numbers \(\int _{x\in X} s(x)\,\mu (dx)\), where s is a simple function and \(s \ge f\).
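
The approximating simple function in the preceding paragraph can be obtained by flooring the values of f to a grid of width \(\varepsilon\): since f is bounded, the floored function takes only finitely many values, and each level set lies in \(2^X\). The following Python sketch (our illustration, restricted to a finite X so that it runs as is) carries out this construction.

```python
import math

def lower_simple_function(f, X, eps):
    """Sketch (finite X): return a pointwise table of a simple function s with
    s <= f <= s + eps, obtained by flooring f to the grid eps * Z."""
    return {x: eps * math.floor(f(x) / eps) for x in X}

X = range(5)
f = lambda x: math.sin(x)                    # any bounded function
s = lower_simple_function(f, X, 0.1)
assert all(s[x] <= f(x) <= s[x] + 0.1 for x in X)
```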

When X is countably infinite, we say that a charge \(\mu \in {\mathcal {C}}(X, 2^X)\) is diffuse (also called purely finitely additive) if \(\mu ( \{ x \} ) = 0\) for every \(x \in X\). Assuming the Axiom of Choice, diffuse charges exist. Note that diffuse charges are not countably additive.
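
A standard route towards a diffuse charge starts from the algebra of finite and cofinite subsets of X: assigning charge 0 to every finite set and 1 to every cofinite set is finitely additive, vanishes on all singletons, and by the extension result above it extends (non-constructively) to a diffuse charge on \(2^X\). The following Python sketch (our illustration, not from the paper) represents this charge on the finite-cofinite algebra, with sets encoded symbolically.

```python
# Sketch (illustration only): the finite/cofinite algebra on a countably
# infinite X, with the charge that is 0 on finite sets and 1 on cofinite ones.
# A set is encoded as ("finite", S) or ("cofinite", S), meaning S or X \ S.

def mu(E):
    kind, _ = E
    return 0.0 if kind == "finite" else 1.0

def complement(E):
    kind, S = E
    return ("cofinite" if kind == "finite" else "finite", S)

def union(E, F):
    (k1, S1), (k2, S2) = E, F
    if k1 == "finite" and k2 == "finite":
        return ("finite", S1 | S2)
    if k1 == "cofinite" and k2 == "cofinite":
        return ("cofinite", S1 & S2)        # (X\S1) u (X\S2) = X \ (S1 n S2)
    S_fin, S_cof = (S1, S2) if k1 == "finite" else (S2, S1)
    return ("cofinite", S_cof - S_fin)      # S_fin u (X\S_cof) = X \ (S_cof\S_fin)

# Finite additivity on two disjoint sets, and mu({x}) = 0 for every singleton;
# mu is not countably additive: X is a countable union of singletons of charge 0.
E, F = ("finite", frozenset({1, 3})), ("finite", frozenset({2, 4}))
assert mu(union(E, F)) == mu(E) + mu(F) == 0.0
assert mu(("finite", frozenset({7}))) == 0.0
assert mu(complement(("finite", frozenset({7})))) == 1.0
```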

3 The model and the main question

In the entire paper, we assume the Axiom of Choice.

The decision problem. Let A be an action set, having at least two elements. Let H denote the set of finite sequences in A, including the empty sequence ø. Elements of A are called actions, elements of H are called histories and elements of \(A^\mathbb {N}\) are called plays. Let \(u:A^\mathbb {N}\rightarrow \mathbb {R}\) be a bounded function, called the payoff function.

Consider the following decision problem. At each period \(t=1,2,\ldots\), the decision maker chooses an action \(a_t\) from A, knowing his previous choices \((a_1,\ldots ,a_{t-1})\in H\). This induces a play \(\vec a=(a_1,a_2,\ldots )\). The payoff of the decision maker is \(u(\vec a)\).

Behavioral strategy. A behavioral strategy is a function \(b:H\rightarrow {\mathcal {C}}(A,2^A)\). The interpretation is that if history h arises during the decision problem, then the strategy b recommends the decision maker to choose an action according to the charge b(h).

Main question. To explain our main question informally: suppose that an algebra \(\mathscr {P}\) is given on the set \(A^\mathbb {N}\) of plays, and suppose also that for each behavioral strategy b the induced charge \(\mathbb {P}^\mathscr {P}_b\) on \(\mathscr {P}\) is known. We would like to investigate which payoff functions have an unambiguous expected payoff for every \(\mathbb {P}^\mathscr {P}_b\).

More precisely, following the literature, we will consider specific algebras \(\mathscr {P}\) on the set \(A^\mathbb {N}\) of plays. Given any such algebra \(\mathscr {P}\), we will formally define a charge \(\mathbb {P}_b^\mathscr {P}\) on \((A^\mathbb {N},\mathscr {P})\) for each behavioral strategy b (cf. Section 4). Intuitively, \(\mathbb {P}_b^\mathscr {P}\) is the charge that the behavioral strategy b induces on the algebra \(\mathscr {P}\); that is, for each set \(Q\in \mathscr {P}\) of plays, \(\mathbb {P}_b^\mathscr {P}(Q)\) is the probability under b that the realized play belongs to Q.

Since the payoff function u may not be measurable with respect to \(\mathscr {P}\), but, being bounded, it can be integrated against any charge on the power set of \(A^\mathbb {N}\), we extend the induced charges \(\mathbb {P}_b^\mathscr {P}\) to the power set of \(A^\mathbb {N}\). More precisely, we denote by \([\mathbb {P}_b^\mathscr {P}]\) the set of charges on the power set of \(A^\mathbb {N}\) that extend \(\mathbb {P}_b^\mathscr {P}\) from the algebra \(\mathscr {P}\). For each charge \(B\in \hspace{0.1cm} [\mathbb {P}_b^\mathscr {P}]\), we obtain an expected payoff

$$\begin{aligned} u (B)\; =\; \int _{\vec a \in A^\mathbb {N}} u(\vec a)\; B (d \vec a). \end{aligned}$$
(1)

Hence, the set of possible expected payoffs for the behavioral strategy b, with respect to \(\mathscr {P}\), is

$$\begin{aligned}{}[u^\mathscr {P}(b)]\;=\;\{u(B):\;B\in \hspace{0.1cm} [\mathbb {P}_b^\mathscr {P}]\}. \end{aligned}$$
(2)

We say that a behavioral strategy b induces an unambiguous expected payoff with respect to \(\mathscr {P}\), if the set \([u^\mathscr {P}(b)]\) is a singleton.

Our main question is to identify conditions on the payoff function u under which all behavioral strategies induce an unambiguous expected payoff, i.e., \([u^\mathscr {P}(b)]\) is a singleton for each behavioral strategy b. This question depends heavily on the chosen algebra \(\mathscr {P}\) and on how the induced charges \(\mathbb {P}_b^\mathscr {P}\) for the behavioral strategies are specified.

4 Algebras on the set of plays, induced charges on the algebras, and expected payoff

In this section, we define four different algebras, and for each algebra and each behavioral strategy, we define the charge induced by this behavioral strategy on the algebra. By extending these induced charges to the entire power set of plays, we define expected payoffs under behavioral strategies.

The topology on the set of plays. We endow the action set A with the discrete topology and the set \(A^\mathbb {N}\) of plays with the induced product topology, denoted by \({\mathcal {T}}\). The elements of \({\mathcal {T}}\) are called the open sets, and their complements are called the closed subsets of \(A^\mathbb {N}\). Thus, a subset of \(A^\mathbb {N}\) is open exactly when it is a union of cylinder sets, where the cylinder set corresponding to a history h is the set of plays that have h as their initial segment (see p.16 for a formal definition). A subset of \(A^\mathbb {N}\) that is both open and closed is called clopen.

The topological space \((A^\mathbb {N}, {\mathcal {T}})\) is completely metrizable, for instance by the metric \(d:A^\mathbb {N}\times A^\mathbb {N}\rightarrow \mathbb {R}\) defined as: if \(\vec a=\vec a'\) then \(d(\vec a,\vec a')=0\), and otherwise \(d(\vec a,\vec a')=2^{-k(\vec a,\vec a')}\) where \(k(\vec a,\vec a') \in \mathbb {N}\) is the first period at which \(\vec a\) and \(\vec a'\) differ. Thus, a sequence of plays \((\vec a_n)_{n \in \mathbb {N}}\) converges to a play \(\vec a\) if for every \(k \in \mathbb {N}\) there exists \(N_k \in \mathbb {N}\) such that for every \(n \ge N_k\) the first k coordinates of \(\vec a_n\) coincide with those of \(\vec a\). The Borel algebra \(\mathcal {R}\) on \(A^\mathbb {N}\) is the smallest algebra of subsets of \(A^\mathbb {N}\) that contains all open sets.
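
The metric d is easy to evaluate on finite prefixes of plays. The following Python sketch (our illustration) returns the exact value of d whenever the two plays differ within the inspected horizon.

```python
def play_metric(a, b):
    """Sketch: d(a, b) = 2 ** -(first period at which a and b differ).
    a and b are finite prefixes of plays; the value is exact as soon as the
    plays differ within the common prefix that is inspected."""
    for t, (x, y) in enumerate(zip(a, b), start=1):
        if x != y:
            return 2.0 ** (-t)
    return 0.0  # no difference seen within the inspected horizon

print(play_metric([1, 2, 3, 4], [1, 2, 5, 4]))   # first difference at period 3: 0.125
print(play_metric([1, 1, 1], [1, 1, 1]))         # 0.0 within this horizon
```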

The charge induced by a behavioral strategy on the Borel algebra. For each behavioral strategy b, we define the induced charge \(\mathbb {P}_b\) on the Borel algebra \(\mathcal {R}\). Our definition is in accordance with the literature, in particular with Dubins and Savage (2014) and Dubins (1974). For the formal definition, see Theorem 4.1 below.

We first need a bit of terminology and notation. Consider a decision problem G, a period \(k \in \mathbb {N}\) and a history \(h \in A^k\). We define the subproblem that starts at history h. This subproblem is played as follows: At periods \(n\ge k+1\), the decision maker chooses an action \(a_n \in A\), which induces a play \((a_{k+1},a_{k+2},\ldots )\) and a corresponding payoff \(u|_h(a_{k+1},a_{k+2},\ldots )=u(h,a_{k+1},a_{k+2},\ldots )\). The subproblem that starts at h is denoted by \(G|_h\). A behavioral strategy b in decision problem G induces a behavioral strategy \(b|_h\) in the subproblem \(G|_h\) as follows: \(b|_h(a_{k+1},a_{k+2},\ldots , a_{k'})=b(h,a_{k+1},a_{k+2},\ldots , a_{k'})\), for every \(k'\ge k\). The strategy \(b|_h\) is called the continuation strategy of b at history h.
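
In programmatic terms, a behavioral strategy is a map from histories (finite tuples of actions) to charges on A, and the continuation strategy \(b|_h\) simply prepends h to every queried history. The sketch below (our illustration, with a finite action set and charges represented as probability dictionaries) shows this construction.

```python
# Sketch (finite A, probability dictionaries instead of general charges):
# a behavioral strategy maps a history, i.e. a tuple of actions, to a
# distribution over A; the continuation strategy at h prefixes h to every
# queried history.

def continuation(b, h):
    """Return b|_h, i.e. the strategy h' -> b(h + h')."""
    return lambda h_prime: b(tuple(h) + tuple(h_prime))

# Example: play 'c' in the first two periods, then mix uniformly over {c, s}.
def b(history):
    return {"c": 1.0} if len(history) < 2 else {"c": 0.5, "s": 0.5}

b_h = continuation(b, ("c",))
print(b_h(()))        # {'c': 1.0}; the empty continuation history is period 2
print(b_h(("c",)))    # {'c': 0.5, 's': 0.5}
```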

A specification is a mapping \(\psi\) that to each behavioral strategy b assigns a charge \(\psi (b)\) on the Borel algebra \(\mathcal {R}\) of \(A^\mathbb {N}\). As is shown in Theorem 2.8.1 of Dubins and Savage (2014) and Theorem 2 in Dubins (1974), there is a unique specification \(\psi\) that satisfies two natural conditions.

Theorem 4.1

(Dubins and Savage (2014), and Dubins (1974)) There is a unique specification \(\psi\) that satisfies the following two conditions:

  1.

    Consistency for clopen (closed and open) sets: for every behavioral strategy b, for every history \(h\in A^{k-1}\) at any period k and for every clopen set \(Q\in {\mathcal {T}}\) it holds that

    $$\begin{aligned} \psi (b|_h)(Q|_h) \;=\; \int _{a \in A} \ \psi (b|_{ha})(Q|_{ha}) \ b(h) (da) , \end{aligned}$$
    (3)

    where \(Q|_h=\{(a_k,a_{k+1},\ldots ):(h,a_k,a_{k+1},\ldots )\in Q\}\) is the continuation of Q after h, and similarly \(Q|_{ha}\) is the continuation of Q after the history ha.

  2.

    Regularity for open sets: for every behavioral strategy b and for every open set \(O\in {\mathcal {T}}\)

    $$\begin{aligned} \psi (b)(O)\;=\;\sup \,\{\psi (b)(Q):\text {clopen}\;Q\in {\mathcal {T}}\text { and }\;Q\subseteq O\}. \end{aligned}$$
    (4)

Remark. Condition 1 (consistency) was first proposed by Dubins and Savage (2014). The intuition behind this condition is the following. Consider a behavioral strategy b, a history h and a clopen set \(Q\in {\mathcal {T}}\). The specification \(\psi\) assigns to the continuation strategy \(b|_h\) a charge \(\psi (b|_h)\), and for each action \(a\in A\) it also assigns to the continuation strategy \(b|_{ha}\) a charge \(\psi (b|_{ha})\). The left-hand side of (3) considers the subproblem at history h, and the right-hand side of (3) considers the subproblem at each history ha. Thus, Condition 1 requires the following consistency property between the charges \(\psi (b|_h)\) and \(\psi (b|_{ha})\), where \(a\in A\): the probability of \(Q|_h\) under the charge \(\psi (b|_h)\) should equal the expectation, taken with respect to b(h) over the action a chosen after history h, of the probability of \(Q|_{ha}\) under \(\psi (b|_{ha})\).

Condition 2 (regularity) was proposed by Dubins (1974). It requires inner regularity for the probabilities of open sets: the probability of each open set is equal to the supremum of the probabilities of the clopen sets contained in it. \(\Diamond\)
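
For a finite action set and a set Q that is determined by the first n coordinates, Condition 1 pins down \(\psi (b)(Q)\) by a finite backward recursion: at histories of length n the probability of Q is 0 or 1, and at shorter histories it is the b(h)-expectation over the next action. The following Python sketch (our illustration, with b(h) given as probability dictionaries) implements this recursion.

```python
# Sketch (illustration only): for finite A and an event Q determined by the
# first n coordinates, condition (3) unfolds into the recursion
#   P(h) = 1_Q(h)                          if len(h) == n,
#   P(h) = sum_a b(h)[a] * P(h + (a,))     if len(h) <  n,
# and psi(b)(Q) = P(empty history).

def prob_of_finite_horizon_event(b, A, in_Q, n, h=()):
    if len(h) == n:
        return 1.0 if in_Q(h) else 0.0
    return sum(b(h)[a] * prob_of_finite_horizon_event(b, A, in_Q, n, h + (a,))
               for a in A)

# Example: A = {c, s}, the uniform behavioral strategy, Q = "no s in periods 1-3".
A = ("c", "s")
b = lambda h: {"c": 0.5, "s": 0.5}
in_Q = lambda prefix: all(x == "c" for x in prefix)
print(prob_of_finite_horizon_event(b, A, in_Q, 3))   # (1/2)**3 = 0.125
```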

Now, for each behavioral strategy b, the induced charge is \(\mathbb {P}_b=\psi (b)\) where \(\psi\) is the unique specification in Theorem 4.1. By \([u(b)]\) we denote the set of possible expected payoffs, which is calculated according to (2) (with \(\mathscr {P}\) being the Borel algebra \(\mathcal {R}\)).

Smaller algebras. As mentioned earlier, from a conceptual point of view the literature has also considered three alternative algebras, instead of the Borel algebra \(\mathcal {R}\). All three algebras are included in \(\mathcal {R}\).

  I.

    The finite horizon algebra: This algebra, denoted by \(\mathcal {R}^{I}\), consists of all subsets Q of \(A^\mathbb {N}\) such that we already know at some period whether or not the induced play belongs to Q. Formally, \(Q\subseteq A^\mathbb {N}\) belongs to \(\mathcal {R}^{I}\) exactly when Q satisfies the following property: there is a period n such that for all plays \(\vec a, \vec a'\) that coincide up to period n, either both \(\vec a,\vec a'\) belong to Q or both \(\vec a,\vec a'\) belong to \(A^\mathbb {N}{\setminus } Q\). This is a small but very natural algebra. It is examined in Flesch et al. (2019); Dubins and Savage (2014, Sect. 2.6) also refer to events that depend on only finitely many coordinates. For finite A, a small encoding sketch of sets in \(\mathcal {R}^{I}\) is given after the inclusion chain (5) below.

  II.

    The clopen algebra: This algebra, denoted by \(\mathcal {R}^{II}\), consists of all clopen subsets of \(A^\mathbb {N}\). This algebra is examined in detail in Dubins and Savage (2014).

  III.

    The clopen+singleton algebra: This algebra, denoted by \(\mathcal {R}^{III}\), is the smallest algebra that contains all clopen subsets of \(A^\mathbb {N}\) plus the singleton sets \(\{\vec a\}\), for all \(\vec a \in A^\mathbb {N}\). This algebra is examined in detail in Flesch et al. (2019).

We obviously have

$$\begin{aligned} \mathcal {R}^{I}\subseteq \mathcal {R}^{II}\subseteq \mathcal {R}^{III}\subseteq \mathcal {R}. \end{aligned}$$
(5)
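
The sketch announced in item I above: for a finite action set A, a set in the finite horizon algebra \(\mathcal {R}^{I}\) can be stored as a horizon n together with the set of allowed length-n prefixes, and the algebra operations reduce to finite set operations after lifting both operands to a common horizon. This encoding is our illustration and is not used in the paper.

```python
from itertools import product

# Sketch (finite A): encode a set in R^I as (n, prefixes), meaning "all plays
# whose first n actions form a tuple in `prefixes`".  Complements and unions
# stay inside R^I after lifting both operands to a common horizon.

A = ("c", "s")

def lift(event, n):
    """Rewrite (m, prefixes) with m <= n as an equivalent event with horizon n
    by appending every possible continuation of length n - m."""
    m, prefixes = event
    return (n, {p + ext for p in prefixes for ext in product(A, repeat=n - m)})

def union(e1, e2):
    n = max(e1[0], e2[0])
    return (n, lift(e1, n)[1] | lift(e2, n)[1])

def complement(event):
    n, prefixes = event
    return (n, set(product(A, repeat=n)) - prefixes)

# "The first action is c" and "the second action is s":
E = (1, {("c",)})
F = (2, {("c", "s"), ("s", "s")})
print(union(E, F))       # horizon 2, prefixes {(c,c), (c,s), (s,s)}
print(complement(E))     # horizon 1, prefixes {('s',)}
```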

For each \(i=I,II,III\) and for each behavioral strategy b, we denote by \(\mathbb {P}_b^i\) the restriction of the charge \(\mathbb {P}_b\) from the Borel algebra \(\mathcal {R}\) to the algebra \(\mathcal {R}^i\). Due to the previous observation, we have \([\mathbb {P}^I_b]\;\supseteq \; [\mathbb {P}^{II}_b]\;\supseteq \; [\mathbb {P}^{III}_b]\;\supseteq \; [\mathbb {P}_b]\).

With regard to \(\mathcal {R}^i\), we can define the set of possible expected payoffs for each behavioral strategy b according to (2), and we denote this set by \([u^i(b)]\). We then have \([u^I(b)]\;\supseteq \; [u^{II}(b)]\;\supseteq \; [u^{III}(b)]\;\supseteq \; [u(b)]\). We say that a behavioral strategy b induces an unambiguous expected payoff with respect to \(\mathcal {R}^i\) if \([u^i(b)]\) is a singleton.

Details and further discussion of these three smaller algebras are deferred to Sect. 9. In particular, Proposition 9.2 shows that the algebras \(\mathcal {R}^I\), \(\mathcal {R}^{II}\), \(\mathcal {R}^{III}\) and \(\mathcal {R}\) are all essentially different; that is, our main question of unambiguous expected payoffs for all behavioral strategies is a genuinely different question for each of these algebras.

5 Classes of payoff functions

In order to be able to present our main results (cf. Sect. 6), we define several classes of payoff functions. We discuss the relation between these classes at the end of this section.

Uniformly approachable payoff functions. Let \(\mathscr {P}\) be an algebra on \(A^\mathbb {N}\). A payoff function \(u: A^\mathbb {N}\rightarrow \mathbb {R}\) is uniformly \(\mathscr {P}\)-approachable if for every \(\varepsilon >0\) there exists a \(\mathscr {P}\)-measurable simple function \(u':A^\mathbb {N}\rightarrow \mathbb {R}\) such that \(|u(\vec a)-u'(\vec a)|\le \varepsilon\) for every \(\vec a\in A^\mathbb {N}\).
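
As an illustration (ours, not from the paper), consider the payoff \(u(\vec a)=a_1/(a_1+1)\) of Example 7.1 below, which depends only on the first action. It is uniformly \(\mathcal {R}^{I}\)-approachable: bucket the first action into the finitely many values \(1,\ldots ,N\) and one tail bucket, with N of order \(1/\varepsilon\). The following Python sketch constructs such an approximation.

```python
import math

def approx_first_coordinate_payoff(eps):
    """Sketch for u(a) = a1 / (a1 + 1): return a horizon-1 bucketing defining
    the R^I-measurable simple function
        u'(a) = k/(k+1)  if a1 = k <= N,      u'(a) = 1  if a1 > N,
    which satisfies |u - u'| <= eps everywhere."""
    N = max(1, math.ceil(1.0 / eps))   # for a1 > N: 1 - a1/(a1+1) <= 1/(N+2) < eps
    values = {k: k / (k + 1) for k in range(1, N + 1)}
    values["tail"] = 1.0               # a single bucket for all a1 > N
    return N, values

N, values = approx_first_coordinate_payoff(0.1)
assert 1.0 - (N + 1) / (N + 2) <= 0.1  # worst case in the tail bucket: a1 = N + 1
```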

Semicontinuous, continuous and uniformly continuous payoff functions. A payoff function \(u: A^\mathbb {N}\rightarrow \mathbb {R}\) is called upper semicontinuous if for every \(r \in \mathbb {R}\) the set \(u^{-1}([r,\infty ))\) is closed. Similarly, u is called lower semicontinuous if for every \(r \in \mathbb {R}\) the set \(u^{-1}((-\infty ,r])\) is closed. A function is continuous if and only if it is both upper and lower semicontinuous. The payoff function u is called uniformly continuous if for every \(\varepsilon >0\) there exists \(\delta >0\) such that for all plays \(\vec a,\vec a'\in A^\mathbb {N}\) with \(d(\vec a,\vec a')< \delta\) we have \(|u(\vec a)-u(\vec a')|<\varepsilon\).

Tame payoff functions. The oscillation of u at the play \(\vec a\) is defined as

$$\begin{aligned} o_u(\vec a) = \lim _{\varepsilon \downarrow 0} \sup _{\vec a',\vec a'' \in {\mathcal {N}}_\varepsilon (\vec a)} |u(\vec a')-u(\vec a'')|, \end{aligned}$$

where \({\mathcal {N}}_\varepsilon (\vec a)=\{\vec a' \in A^\mathbb {N}: d(\vec a, \vec a')< \varepsilon \}\) is the \(\varepsilon\)-neighbourhood of \(\vec a\). We call the payoff function u weakly tame if for every \(r > 0\) the set \(\{\vec a \in A^\mathbb {N}: o_u(\vec a) \ge r\}\) is finite.

We say that the payoff function \(u: A^\mathbb {N}\rightarrow \mathbb {R}\) has a limit at a play \(\vec a\in A^\mathbb {N}\) if there is \(\ell \in \mathbb {R}\) with the following property: \(u(\vec a_n)\) converges to \(\ell\) for each sequence \(\vec a_n\) in \(A^\mathbb {N}\) such that (1) \(\vec a_n\) converges to \(\vec a\) as \(n\rightarrow \infty\), and (2) \(\vec a_n\ne \vec a\) for all \(n\in \mathbb {N}\). If u has a limit at \(\vec a\), we denote it by \(L_u(\vec a)\).

For the payoff function u, let \(D_u\) denote the set of plays at which u is not continuous. We say that a discontinuity \(\vec a \in D_u\) is removable if u has a limit at \(\vec a\). We call the payoff function u strongly tame if the following two conditions hold: u is weakly tame, and each discontinuity in \(D_u\) is removable. For an illustration of these concepts, we refer to Example 11.1 in Sect. 11.
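
As a small illustration (ours, not from the paper), consider the payoff of Example 7.4 below: \(u(n,n,\ldots )=n/(n+1)\) and \(u=0\) otherwise. For every k, the \(2^{-k}\)-neighbourhood of the play \((n,n,\ldots )\) contains both this play and a play that deviates after period k, so \(o_u(n,n,\ldots )=n/(n+1)\); hence the set \(\{\vec a: o_u(\vec a)\ge 1/2\}\) is infinite and u is not weakly tame. The Python sketch below evaluates u on the two witness plays.

```python
# Sketch (illustration only): oscillation of the Example 7.4 payoff at the
# constant play (n, n, ...).  We evaluate u only on eventually constant plays,
# encoded as (prefix, tail_action): follow prefix, then repeat tail_action.

def u_repeat_indefinitely(prefix, tail_action):
    first = prefix[0] if prefix else tail_action
    constant = all(a == first for a in prefix) and tail_action == first
    return first / (first + 1) if constant else 0.0

def oscillation_lower_bound(n, k):
    """Lower bound on o_u at (n, n, ...) from two plays in its 2**-k neighbourhood:
    the play itself and the play that switches to n + 1 after period k."""
    stay = u_repeat_indefinitely((n,) * k, n)
    deviate = u_repeat_indefinitely((n,) * k, n + 1)
    return abs(stay - deviate)

# The bound n/(n+1) does not shrink as k grows, so o_u((n, n, ...)) = n/(n+1).
print([oscillation_lower_bound(3, k) for k in (1, 5, 20)])   # [0.75, 0.75, 0.75]
```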

Relation between classes of payoff functions. If u is continuous, then by definition, u is both upper and lower semicontinuous. If u is uniformly continuous, then u is also continuous, and if A is finite, then by compactness of \(A^\mathbb {N}\) the converse also holds.

Note that u is continuous at \(\vec a\) if and only if u has a limit at \(\vec a\) and \(L_u(\vec a)=u(\vec a)\). Also, u is continuous at \(\vec a\) if and only if \(o_u(\vec a)=0\). A continuous function is clearly strongly tame.

6 Main results

In this section we present our main results. The proofs can be found in Sect. 10.

The first theorem identifies a condition that guarantees that the expected payoffs are unambiguous under behavioral strategies with respect to any of the algebras.

Theorem 6.1

Let \(\mathscr {P}\) be any of the algebras \(\mathcal {R}^{I},\mathcal {R}^{II},\mathcal {R}^{III},\mathcal {R}\). Suppose that the payoff function u is uniformly \(\mathscr {P}\)-approachable. Then, with respect to \(\mathscr {P}\), each behavioral strategy b induces an unambiguous expected payoff.

We remark that Marinacci (1997) and Harris et al. (2005) also consider uniformly approachable functions, but in a very different setting. They consider one-shot multi-player simultaneous-move games in which the players’ strategies are charges. In their setting each strategy profile induces an unambiguous expected payoff if the payoff function is uniformly approachable with respect to a specific algebra on the set of action profiles. Our Theorem 6.1 is set in infinite horizon decision problems, and hence it does not follow from their results.

The reverse implication of Theorem 6.1 also holds for the algebras \(\mathcal {R}^I\) and \(\mathcal {R}^{II}\), as is shown respectively in Theorems  6.2 and 6.3 below. However, Example  11.2 will demonstrate that the reverse implication is no longer valid for the algebras \(\mathcal {R}^{III}\) and \(\mathcal {R}\).

The next theorems provide, for each algebra separately, connections between unambiguous expected payoffs under behavioral strategies and properties of the payoff function.

For the finite horizon algebra \(\mathcal {R}^{I}\), uniform continuity plays the crucial role.

Theorem 6.2

(for the finite horizon algebra \(\mathcal {R}^{I}\)) The following are equivalent:

  1.

    With respect to \(\mathcal {R}^I\), each behavioral strategy b induces an unambiguous expected payoff.

  2.

    The payoff function u is uniformly continuous.

  3.

    The payoff function u is uniformly \(\mathcal {R}^{I}\)–approachable.

The following theorem, which follows from Theorem 2.8.5 in Dubins and Savage (2014), gives necessary and sufficient conditions for unambiguous expected payoffs for the clopen algebra \(\mathcal {R}^{II}\). One of these conditions is continuity of the payoff function.

Theorem 6.3

(for the clopen algebra \(\mathcal {R}^{II}\)) The following are equivalent:

  1.

    With respect to \(\mathcal {R}^{II}\), each behavioral strategy b induces an unambiguous expected payoff.

  2.

    The payoff function u is continuous.

  3.

    The payoff function u is uniformly \(\mathcal {R}^{II}\)–approachable.

For the algebras \(\mathcal {R}^{III}\) and \(\mathcal {R}\) we obtain sufficient conditions for unambiguous expected payoffs.

Theorem 6.4

(for the clopen+singleton algebra \(\mathcal {R}^{III}\)) Suppose that the payoff function u is strongly tame. Then u is uniformly \(\mathcal {R}^{III}\)–approachable, and hence with respect to \(\mathcal {R}^{III}\) each behavioral strategy b induces an unambiguous expected payoff.

Regarding a converse of Theorem 6.4, we present in Example 11.2 a decision problem in which with respect to \(\mathcal {R}^{III}\) each behavioral strategy induces an unambiguous expected payoff, yet the payoff function is not uniformly \(\mathcal {R}^{III}\)-approachable, and hence not strongly tame either.

Theorem 6.5

(for the Borel algebra \(\mathcal {R}\)) The payoff function u is uniformly \(\mathcal {R}\)–approachable if any of the following conditions hold:

  1.

    the function u is weakly tame,

  2.

    the function u is upper semicontinuous,

  3.

    the function u is lower semicontinuous.

Hence, under any of these conditions, with respect to \(\mathcal {R}\) each behavioral strategy b induces an unambiguous expected payoff.

7 Examples

In this section we discuss a number of illustrative and thought-provoking examples, with a focus on intuition and main ideas. In each example, the payoff function has a simple structure, and we look at the possible expected payoffs for focal behavioral strategies that highlight specific features of the payoff function at hand.

Example 7.1

(Play the largest number). Consider the decision problem with action space \(A=\mathbb {N}\) and the following payoff function: for a play \(\vec a=(a_1,a_2,\ldots )\in A^\mathbb {N}\), the payoff is \(u(\vec a)=\tfrac{n}{n+1}\) where \(n=a_1\). So, the payoff is determined after the first period. Intuitively, the decision maker would like to choose an action as large as possible at period 1.

The interesting question is what the expected payoff is under the following strategy b: at period 1 the decision maker uses a diffuse charge to choose an action (for the definition of a diffuse charge, cf. Section 2). The strategy b has an unambiguous expected payoff with respect to all four algebras \(\mathcal {R}^I\), \(\mathcal {R}^{II}\), \(\mathcal {R}^{III}\), \(\mathcal {R}\). Indeed, this follows by Theorem  6.2 as the payoff function u is uniformly continuous.

In fact, the behavioral strategy b induces an expected payoff of 1, and therefore it is optimal, with respect to these algebras. To see this, note that, for each \(n\in \mathbb {N}\), the set \(\cup _{k=n}^\infty Q_k\) has probability 1, where \(Q_k\) is the set of plays that start with action k (cf. Lemma 10.2). This implies that, for each \(n\in \mathbb {N}\), the expected payoff is at least \(\frac{n}{n+1}\), which proves that the expected payoff under the strategy b is 1. \(\triangleleft\)

Example 7.2

(Repeat n times). Consider the decision problem with action space \(A=\mathbb {N}\) and the following payoff function: for a play \(\vec a=(a_1,a_2,\ldots )\in A^\mathbb {N}\), if \(a_1=\ldots =a_{n+1}=n\) where \(n\in \mathbb {N}\) then the payoff is \(u(\vec a)=\tfrac{n}{n+1}\) and otherwise \(u(\vec a)=0\). If action n is played at period 1, then the payoff is determined in period \(n+1\). Consequently, the payoff is determined in finite but unbounded time. Intuitively, the decision maker would like to choose an action n, as large as possible, at period 1 and repeat it at periods \(2,\ldots ,n+1\).

The behavioral strategy b we consider is similar to the one in Example 7.1. In order to choose a large action at period 1, the decision maker chooses the first action according to a diffuse charge. If action n is chosen at period 1, then he places probability 1 on action n at periods \(2,\ldots ,n+1\).

The strategy b has an unambiguous expected payoff with respect to the algebras \(\mathcal {R}^{II}\), \(\mathcal {R}^{III}\) and \(\mathcal {R}\), but b has an ambiguous expected payoff with respect to the algebra \(\mathcal {R}^{I}\). Indeed, this follows by Theorems 6.2 and 6.3 as the payoff function u is continuous but not uniformly continuous.

In fact, with respect to the algebras \(\mathcal {R}^{II}\), \(\mathcal {R}^{III}\) and \(\mathcal {R}\), the strategy b has an expected payoff of 1, and therefore it is optimal. Indeed, for each \(k\in \mathbb {N}\), let \(Q_k\) be the set of plays that start with \({(k,\ldots ,k)}\) up to period \(k+1\). Then, for each \(n\in \mathbb {N}\), the set \(\cup _{k=n}^\infty Q_k\) is clopen, and it follows from Condition 1 of Theorem 4.1 that \(\mathbb {P}_b(\cup _{k=n}^\infty Q_k) =1\) for every \(n\in \mathbb {N}\). Hence, \(\mathbb {P}^{II}_b(\cup _{k=n}^\infty Q_k) =1\) for every \(n\in \mathbb {N}\). This implies that, for each \(n\in \mathbb {N}\), the expected payoff is at least \(\frac{n}{n+1}\), and therefore, the expected payoff under the strategy b is 1. \(\triangleleft\)

The next two examples are stopping problems. In these decision problems the decision maker has two actions, one of which could be interpreted as “continue” and the other as “stop”. The payoff is determined by the first period when the decision maker plays the latter action.

Example 7.3

(Continue indefinitely). Consider the decision problem with action space \(A=\{c,s\}\) and the following payoff function: for the play \(\vec c=(c, c, \ldots )\) let \(u(\vec c)=1\), and for all \(\vec a\in A^\mathbb {N}\setminus \{\vec c\}\) let \(u(\vec a)=0\).

Consider the strategy b that chooses action c with probability 1 at every period. The strategy b has an unambiguous expected payoff with respect to the algebras \(\mathcal {R}^{III}\) and \(\mathcal {R}\), but b has an ambiguous expected payoff with respect to the algebras \(\mathcal {R}^I\) and \(\mathcal {R}^{II}\). Indeed, this follows by Theorems 6.3 and 6.4 as the payoff function u is strongly tame but not continuous. In fact, b has an expected payoff of 1, and is therefore optimal, with respect to the algebras \(\mathcal {R}^{III}\) and \(\mathcal {R}\). \(\triangleleft\)

Example 7.4

(Repeat indefinitely). Consider the decision problem with action space \(A=\mathbb {N}\) and the following payoff function: for any \(n \in \mathbb {N}\), the play \((n, n, \ldots )\) has a payoff of \(u(n, n, \ldots )=\tfrac{n}{n+1}\), and otherwise the payoff is 0. Thus, the decision maker would like to choose an action, as large as possible, in period 1 and then to repeat it in each following period.

The interesting question is what the expected payoff is under the following strategy b: at period 1 the decision maker uses a diffuse charge to choose an action, and at every other period he places probability 1 on the action chosen in period 1.

The strategy b has an unambiguous expected payoff of 1 with respect to the Borel algebra \(\mathcal {R}\), which follows by Theorem  6.5 as the payoff function u is upper semicontinuous.

On the other hand, b has an ambiguous expected payoff with respect to the algebras \(\mathcal {R}^I\), \(\mathcal {R}^{II}\) and \(\mathcal {R}^{III}\). Indeed, u is not continuous, so this claim follows for \(\mathcal {R}^I\) and \(\mathcal {R}^{II}\) by Theorem 6.3. The proof for \(\mathcal {R}^{III}\) follows from Part 3 of Proposition 9.2. \(\triangleleft\)

The algebra \(\mathcal {R}\) is the largest algebra that we consider. As one might suspect, even using the algebra \(\mathcal {R}\), not all behavioral strategies have an unambiguous expected payoff. The following example is very similar to the example presented in Sections 4 and 5 in Purves and Sudderth (1976).

Example 7.5

(Do not play a infinitely many times). Consider a decision problem with action space \(A=\{a, a'\}\). Let Q be the set of plays \(\vec {a}\) such that action a appears only finitely many times in \(\vec {a}\). Note that Q is in the Borel sigma-algebra of \(A^\mathbb {N}\), but not in the algebra \(\mathcal {R}\). Consider the following payoff function: \(u(\vec {a})=1\) for every play \(\vec {a}\in Q\), and \(u(\vec {a})=0\) otherwise.

Consider the behavioral strategy b which chooses an action uniformly at random at every period: \(b(h)(a)=\tfrac{1}{2}\) and \(b(h)(a')=\tfrac{1}{2}\) for every history \(h \in H\). We briefly argue that b does not induce an unambiguous expected payoff with respect to \(\mathcal {R}\), that is, \([u(b)]\) is not a singleton.

Note that the only open set that is contained in Q is the empty set, and the only open set that contains Q is the entire set \(A^\mathbb {N}\). That is, inner approximations of Q by open sets give probability 0, and outer approximations of Q by open sets give probability 1. Similarly, inner approximations of \(A^\mathbb {N}\setminus Q\) by open sets give probability 0, and outer approximations of \(A^\mathbb {N}\setminus Q\) by open sets give probability 1. As one can show (by induction on the complexity of the sets in \(\mathcal {R}\)), this implies that inner approximations of Q by sets in \(\mathcal {R}\) all give probability 0, and the same for \(A^\mathbb {N}\setminus Q\). By Theorem C.3 in Flesch et al. (2017), there exists an extension \(B\in [\mathbb {P}_b]\) such that \(B(Q)=0\) and also there exists an extension \(B'\in [\mathbb {P}_b]\) such that \(B'(Q)=1\). This completes the proof. \(\triangleleft\)

8 Concluding remarks

We examined infinite horizon decision problems with arbitrary bounded payoff functions, in which the decision maker uses finitely additive behavioral strategies. Because we only assume that the payoff function is bounded, a behavioral strategy does not necessarily induce an unambiguous expected payoff. To address this ambiguity, we derived conditions on the payoff function that guarantee an unambiguous expected payoff regardless of the behavioral strategy adopted. Our approach involved a systematic exploration of various alternatives proposed in the literature on how to define the finitely additive probability measures on the set of infinite plays induced by behavioral strategies.

In the remainder of this section, we discuss extensions of our results.

Multiple players playing sequentially. If there are multiple players, each having her own payoff function, and these players play sequentially (that is, there are no simultaneous moves), then our analytic framework remains applicable. Indeed, a strategy profile, which consists of one strategy for each player, defines a charge on the set of actions at each history, just like a strategy in the case of a single decision maker. Consequently, each strategy profile induces an unambiguous expected payoff for a given player under exactly the same conditions on her payoff function as in the case of a single decision maker.

Multiple players playing simultaneously. When considering multiple players playing simultaneously, with finite action sets, our analytic framework remains applicable. The reason is that, in this case, under each strategy profile, there is a unique probability measure on the set of action profiles at each history. This situation contrasts starkly with the case of infinite action sets, because then a strategy profile can induce a set of possible charges on the set of action profiles. We refer to Flesch et al. (2017) for more details.

Extending the charge \(\mathbb {P}_b\), for a behavioral strategy b, from the Borel algebra \(\mathcal {R}\) to a larger algebra. Given a specific behavioral strategy b, Purves and Sudderth (1976) extended the charge \(\mathbb {P}_b\) from the algebra \(\mathcal {R}\) to an algebra \({\mathcal {A}}(b)\). The algebra \({\mathcal {A}}(b)\) depends on the behavioral strategy b, but always includes the Borel sigma-algebra on \(A^{\mathbb {N}}\). Even with these larger algebras, however, ambiguous expected payoffs still play a role: Purves and Sudderth present an example with a specific payoff function and a behavioral strategy b under which the expected payoff is ambiguous.

Glossary

Let \(\mathbb {N}=\{1,2,\ldots \}\). We provide a table below containing the most frequently used symbols and notations with reference to the Section where they are defined (in the table, \(i\in \{I,II,III\}\)).

Notation | Meaning | Section
--- | --- | ---
A | Action set | 3
H | Set of histories | 3
\(A^\mathbb {N}\) | Set of plays | 3
u | Payoff function | 3
b | Behavioral strategy | 3
d | Metric on \(A^\mathbb {N}\) | 4
\(\mathcal {R}\) | Borel algebra on \(A^\mathbb {N}\) | 4
\(\mathcal {R}^{I}\) | Finite horizon algebra on \(A^\mathbb {N}\) | 4
\(\mathcal {R}^{II}\) | Clopen (closed and open) algebra on \(A^\mathbb {N}\) | 4
\(\mathcal {R}^{III}\) | Clopen+singleton algebra on \(A^\mathbb {N}\) | 4
\(\mathbb {P}_b\) and \(\mathbb {P}_b^i\) | Charge induced on \(\mathcal {R}\) resp. \(\mathcal {R}^{i}\) by the behavioral strategy b | 4
[u(b)] and \([u^i(b)]\) | Set of possible expected payoffs when \(\mathbb {P}_b\) resp. \(\mathbb {P}_b^i\) is extended from \(\mathcal {R}\) resp. \(\mathcal {R}^i\) to the power set of \(A^\mathbb {N}\) | 4
\(o_u\) | Oscillation of u | 5
h(t) and \(\vec a(t)\) | Action in history h resp. in play \(\vec a\) at period t | 10
\(\preceq\) and \(\prec\) | Notation for one sequence extending another | 10
[h] and [Q] | Set of plays extending the history h resp. a set Q of histories | 10