Abstract
In this paper, a new concept of equilibrium in dynamic games with incomplete or distorted information is introduced. In the games considered, players have incomplete information about crucial aspects of the game and formulate beliefs about the probabilities of various future scenarios. The concept of belief distorted Nash equilibrium combines optimization based on given beliefs and self-verification of those beliefs. Existence and equivalence theorems are proven, and this concept is compared to existing ones. Theoretical results are illustrated using several examples: extracting a common renewable resource, a large minority game, and a repeated Prisoner’s Dilemma.
1 Introduction
Nash equilibrium is the only concept of a solution which can be sustained in a game where rational players, besides knowing their own strategy set and payoff as a function of their own strategy, have complete information about the game they are playing. A player has complete information when they know the following: They are participating in a game (i.e., interacting with other players, conscious decision makers with their own goals, not random “nature”), the number of those players, their strategy sets and payoff functions, the influence of the choices of the others on payoffs, and that the other players are rational. In dynamic games, it is assumed that players can either directly observe the choices of the other players, or their influence on the state variable.
Actually, in the majority of real-life decision-making problems of a game-theoretic nature, these assumptions are not fulfilled. Usually, the other players’ payoff functions and sets of strategies are not known exactly.
This fact results in a need to introduce various concepts of equilibria with imperfect, incomplete, or distorted information. This branch of game theory is developing rapidly, with numerous concepts based on various assumptions on what kind of imperfection is allowed: Bayesian equilibria, introduced by Harsanyi [1], \(\varDelta \)-rationalizability by Battigalli and Siniscalchi [2], conjectural equilibria by Battigalli and Guaitoli [3], cursed equilibrium considered by Eyster and Rabin [4], self-confirming equilibria by Fudenberg and Levine [5], and studied, among others, by Azrieli [6], conjectural categorical equilibria introduced by Azrieli [7], stereotypical beliefs by Cartwright and Wooders [8], subjective equilibria by Kalai and Lehrer [9, 10], rationalizable conjectural equilibria by Rubinstein and Wolinsky [11], correlated equilibria by Aumann [12, 13] (to some extent), and belief distorted Nash equilibria for set-valued beliefs introduced by Wiszniewska-Matyszkiel ([14, 15], with prerequisites in [16]). Most of them assume that players are rational. A detailed review of these concepts can be found in Wiszniewska-Matyszkiel [14].
Only two of the aforementioned concepts can deal with information which is not only incomplete, but can be seriously distorted: the subjective equilibria of Kalai and Lehrer [9, 10] and belief distorted Nash equilibria (BDNE) for set-valued beliefs of Wiszniewska-Matyszkiel [14, 15]. Only BDNE is applicable in dynamic games, including those with an infinite time horizon, in which information is gradually disclosed during play.
In the approaches presented in a previous, theoretical paper of the author on this subject [14], and the paper applying these concepts to environmental economics [15], beliefs take the form of a multivalued correspondence. Such a form of beliefs suggests a way of defining the “anticipated” future payoff of a player as the payoff which can be obtained given the worst realization under this belief, assuming that the player will choose optimally in the future. We refer to this approach as the “inf-approach” (due to the alternative used in assessing the future payoff), while the approach used in this paper is referred to as the “exp-approach.” Let us emphasize that the inf-approach, with its pessimistic attitude to the future, is not very realistic.
To address this issue in this paper, beliefs are assumed to take the form of probability distributions over the set of possible future trajectories of states and statistics. If players are able to estimate the probability distribution of these parameters in the future, it is inherent that they take into account the expected future payoff, and the verification of beliefs can be assessed quantitatively using this probability distribution.
We introduce the concepts of pre-belief distorted Nash equilibrium (pre-BDNE), \(\varepsilon \)-belief distorted Nash equilibrium (\(\varepsilon \)-BDNE), belief distorted Nash equilibrium (BDNE), and various concepts of the self-verification of beliefs of the form considered. Existence and equivalence theorems are proven, and the notion of BDNE is compared to the notions of Nash equilibrium, subjective equilibrium and BDNE for set-valued beliefs. These concepts are illustrated by several examples: extracting a common renewable resource, a large minority game, and a repeated Prisoner’s Dilemma.
It is worth emphasizing that the concept of BDNE, both for set-valued and probabilistic beliefs, is not a concept of bounded rationality. In our approach, we assume that players are rational, although they may have false information about the game they are playing. This false information is such that it cannot be falsified during subsequent play.
The paper is composed as follows. The problem is defined in Sect. 2, where the formal definition in Sect. 2.2 is preceded by a brief introduction in Sect. 2.1. The concepts of pre-BDNE, notions of the self-verification of beliefs, and finally, \(\varepsilon \)-BDNE and BDNE are defined in Sect. 3, where theoretical results on existence and equivalence are also stated. In Sect. 4, some examples are studied. Appendix A is devoted to large games, and Appendix B contains a very general existence result, using a higher level of mathematical abstraction than the main part of the paper.
2 Formulation of the Problem
This section defines a dynamic game, as well as a derived game with distorted information and probabilistic beliefs. The dynamic game is identical to that considered in Wiszniewska-Matyszkiel [14] based on the inf-approach, while the game with distorted information substantially differs, particularly in the structure of beliefs and expected payoffs.
2.1 Brief Introduction of the Problem and Concepts
Before giving a detailed introduction of the problem, we briefly describe it, without full mathematical precision.
We consider a discrete time dynamic game with the set of players \(\mathbb {I}\), where the payoff of player i under strategy profile S can be written as \(\varPi _{i}(S):=\sum _{t=t_0}^T\frac{P_i(S_i(t),u^S(t),X^S(t))}{(1+r_i)^{t-t_0}}+\frac{G_i(X^S(T+1))}{(1+r_i)^{T+1-t_0}}\) (or only the first part in the case of an infinite time horizon), where \(u^S\) denotes a statistic describing the players’ behavior under S (e.g., the aggregate of all the strategies used by the players), observable ex post, while \(X^S\) denotes the trajectory of the state variable resulting from choosing S, which is defined by \(X^S(t+1)=\phi (X^S(t),u^S(t))\) with \(X^S(t_0)=\bar{x}.\) All past and current states are observable.
At a Nash equilibrium, the basic concept of a solution to a noncooperative game, each player maximizes their payoff given the strategies of the remaining players.
We assume that players do not have complete information about the game they are playing. Therefore, at each stage of the game, they formulate beliefs about the future path of \(X^S\) and \(u^S\). The beliefs formulated at time t, \(B_i(t,a,H^S)\), are based on the current decision a and the already observed part of the history \(H^S=(X^S,u^S)\), denoted \(H^S_t\). They take the form of a probability distribution on the set of future paths of \((X^S,u^S)\).
These beliefs define the expected payoff of player i, \(\varPi _i^e(t,H^S_t,S(t))\) as the sum of the actual current payoff at time t, \(P_i(S(t),u^S(t),X^S(t))\), and the expected (with respect to beliefs) value of the future optimal payoff along \(H^S\).
A preliminary concept of a pre-belief distorted Nash equilibrium (pre-BDNE) says that a profile S is a pre-BDNE iff at each stage each player maximizes \(\varPi _i^e(t,H^S_t,S(t))\). A belief distorted Nash equilibrium (BDNE) is a pre-BDNE S for which \((X^S,u^S)\) is the most likely path (with maximum likelihood normalized to 1), while an \(\varepsilon \)-belief distorted Nash equilibrium (\(\varepsilon \)-BDNE) is a pre-BDNE S for which \((X^S,u^S)\) has likelihood at least \(1-\varepsilon \).
2.2 Formal Introduction
A game with distorted information is a tuple of the following objects:
\(((\mathbb {I},\mathfrak {I},\lambda ), \mathbb {T}, \mathbb {X}, \{D_{i}\}_{i\in \mathbb {I}}, U, \phi , \{P_{i}\}_{i\in \mathbb {I}}, \{G_{i}\}_{i\in \mathbb {I}},\{B_{i}\}_{i\in \mathbb {I}}, \{r_{i}\}_{i\in \mathbb {I}}, L)\), i.e., the space of players, set of time points, set of states, sets of the players’ possible decisions, statistic, the system’s reaction function, current payoffs, terminal payoffs, beliefs, discount rates, and likelihood, respectively, briefly described in Sect. 2.1, and in detail in the sequel.
The set of players is denoted by \(\mathbb {I}\). In order that the definitions encompass both games with finitely many players and large games, we introduce a structure on \(\mathbb {I}\) consisting of a \(\sigma \)-field \(\mathfrak {I}\) of its subsets and a measure \(\lambda \) on it (in standard games with finitely many players, \(\mathfrak {I}\) is the whole power set, while \(\lambda \equiv 1\)). For readers who are not familiar with games involving a measure space of players, there is a short introduction in Appendix A.
The game is dynamic, played over a discrete set of times \(\mathbb {T}=\{t_{0},t_{0}+1,\ldots ,T\}\) or \(\mathbb {T}=\{t_{0},t_{0}+1,\ldots \}\). We also introduce the symbol \(\overline{\mathbb {T}}\) denoting \(\{t_{0},t_{0}+1,\ldots ,T+1\}\) for finite T and equal to \(\mathbb {T}\) in the opposite case.
At each moment, player i chooses a decision from their decision set \(D_{i}\). We also denote the common superset of these sets as \(\mathbb {D}\), the set of the (combined) decisions of the players, with a chosen \(\sigma \)-field of its subsets denoted by \(\mathcal {D}\).
We call any measurable function \(\delta :\mathbb {I\rightarrow D}\) with \(\delta (i)\in D_i\), a static profile. The set of all static profiles is denoted by \(\Sigma ^{static }\). We assume that it is nonempty.
The next important object is a finite, m-dimensional statistic of the whole profile, which influences players’ payoffs. Such statistics might be, e.g., aggregate extraction in models of the exploitation of renewable resources, or prices in models of markets. Such a definition does not reduce generality, since in games with finitely many players this statistic may be the whole profile. Formally, a statistic of a static profile is a function \(U:\Sigma ^{static }\mathop {\rightarrow }\limits ^{onto }\mathbb {U}\subset \mathbb {R}^{m}\) defined by
\[U(\delta ):=\left[ \int _{\mathbb {I}}g_{k}(i,\delta (i))\,d\lambda (i)\right] _{k=1}^{m}\]
for measurable functions \(g_{k}:\mathbb {I}\times \mathbb {D} \rightarrow \mathbb {R}\). The resultant set \(\mathbb {U}\) is called the set of profile statistics.
If \(\varDelta :\mathbb {T}\rightarrow \Sigma ^{static } \) represents the choices resulting from static profiles at various moments, then we denote by \(u^{\varDelta }\) the function \(u^{\varDelta }:\mathbb {T}\rightarrow \mathbb {U}\) such that \(u^{\varDelta }(t)=U(\varDelta (t))\).
The game is played in an environment (or system) with set of states \(\mathbb {X}\).
Given a function \(u^{\varDelta }:\mathbb {T}\rightarrow \mathbb {U}\), the state variable evolves according to the equation
\[X(t+1)=\phi (X(t),u^{\varDelta }(t)), \quad \text {with the initial condition } X(t_{0})=\bar{x},\]
where \(\phi : \mathbb X \times \mathbb U \rightarrow \mathbb X\) is called the reaction function of the system.
At each moment, player i obtains a current payoff, \(P_{i}:D_i\times \mathbb {U}\times \mathbb {X} \rightarrow \mathbb {R}\cup \{-\infty \}\). In the case of a finite time horizon, player i also obtains a terminal payoff (at the end of the game) defined by the function \(G_{i}:\mathbb {X}\rightarrow \mathbb {R}\cup \{-\infty \}\).
Players sequentially observe the history of the game: At time t, they know the states X(s) for \(s\le t\) and the statistics u(s) for the chosen static profiles at moments \(s<t\). In order to simplify the notation, we introduce the set of histories of the game \(\mathbb {H}:=\mathbb {X}^{T-t_{0}+2}\times \mathbb {U}^{T-t_{0}+1}\) and for such a history \(H\in \mathbb {H}\), we denote the history observed at time t by \(H_{t}\).
Given the history observed at time t, \(H_{t}\), players formulate their suppositions about future values of u and X, depending on their decision a made at time t. This is formalized as a function describing the beliefs of player i, \(B_{i}:\mathbb {T}\times D_i\times \mathbb {H}\rightarrow \mathcal {M}_{1}\left( \mathbb {H}\right) \), where \(\mathcal {M}_{1}\left( \mathbb {H}\right) \) denotes the set of all probability measures on \(\mathbb {H}\). We assume that beliefs \(B_{i}(t,a,H)\) only depend on H through \(H_{t}\), and that for every \(H^{\prime } \) in the support of \(B_{i}(t,a,H)\), we have \(H^{\prime }_{t}=H_{t}\).
Players have compound strategies dependent on time and the history of the game observed at this time. The strategy of player i is a function \(S_{i}:\mathbb {T}\times \mathbb {H} \rightarrow D_i\) such that \(S_i(t,H)\) only depends on H through \(H_t\). Combining the players’ strategies, we obtain the function \(S:\mathbb {I}\times \mathbb {T}\times \mathbb {H}\rightarrow \mathbb {D}\).
A profile (of strategies) is a combination of strategies such that for each t and H, the function \(S_{\bullet }(t,H)\) is a static profile. The set of all profiles is denoted by \(\Sigma \). Since the choice of a profile S determines the history of the game, we denote this history, consisting of trajectory \(X^S\) and statistic \(u^S\) (defined in Eqs. (1) and (2), respectively), by \(H^{S}\).
To simplify the notation, we consider the open-loop form of profile S, \(S^{OL}:\mathbb {T}\rightarrow \Sigma ^{static } \), defined by
\[S_{i}^{OL}(t):=S_{i}(t,H^{S}).\]
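If the players choose a profile S, then the discounted payoff of player i, \(\varPi _{i}:\Sigma \rightarrow \overline{\mathbb {R}}\), depends only on the open-loop form of the profile and is equal to
\[\varPi _{i}(S):=\sum _{t=t_{0}}^{T}\frac{P_{i}(S_{i}^{OL}(t),u^{S}(t),X^{S}(t))}{(1+r_{i})^{t-t_{0}}}+\frac{G_{i}(X^{S}(T+1))}{(1+r_{i})^{T+1-t_{0}}}, \tag{4}\]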
where \(r_{i}>0\) is the discount rate of player i. For infinite T, we set \(G_i\equiv 0\). We assume that the \(\varPi _{i}(S)\) are well defined.
This ends the definition of the dynamic game.
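As a minimal illustration of the objects defined above, the following Python sketch generates the trajectory \(X^S\), the statistic \(u^S\) and the discounted payoff \(\varPi _i(S)\) for an open-loop profile in a finite-horizon game with finitely many players. The function names `simulate` and `discounted_payoff`, the statistic (the sum of decisions), the reaction function, and the payoff forms are hypothetical choices made only for illustration; they are not taken from the paper.

```python
def simulate(profile, x0, phi, statistic):
    """Return (states, stats) generated by an open-loop profile.

    profile[k][i] is the decision of player i at stage t_0 + k.
    """
    states, stats = [x0], []
    for decisions in profile:
        u = statistic(decisions)           # u(t) = U(static profile at time t)
        stats.append(u)
        states.append(phi(states[-1], u))  # X(t+1) = phi(X(t), u(t))
    return states, stats


def discounted_payoff(i, profile, states, stats, P, G, r):
    """Discounted payoff of player i over a finite horizon."""
    stages = len(profile)                  # equals T - t_0 + 1
    total = sum(P(profile[k][i], stats[k], states[k]) / (1.0 + r) ** k
                for k in range(stages))
    # Terminal payoff, discounted by (1 + r)^(T + 1 - t_0).
    return total + G(states[stages]) / (1.0 + r) ** stages


# Toy data (assumed): two players, two stages, initial resource stock 10.
profile = [[1, 2], [1, 1]]
states, stats = simulate(profile, 10, phi=lambda x, u: x - u, statistic=sum)
payoff0 = discounted_payoff(0, profile, states, stats,
                            P=lambda a, u, x: a, G=lambda x: x, r=1.0)
print(states, stats, payoff0)  # [10, 7, 5] [3, 2] 2.75
```

The stage index `k` plays the role of \(t-t_0\), so the sketch does not depend on the value of \(t_0\).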
However, the players do not know the profile. Therefore, in their calculations, they can only use the expected payoff functions, \(\varPi _{i}^{e}:\mathbb {T}\times \mathbb {H} \times {\varSigma }^{{{static }}}\rightarrow \overline{\mathbb {R}}\), corresponding to their beliefs. For \(H=(X,u)\),
\[\varPi _{i}^{e}(t,H,\delta ):=P_{i}(\delta (i),U(\delta ),X(t))+\frac{1}{1+r_{i}}\,V_{i}(t+1,B_{i}(t,\delta (i),H)), \tag{5}\]
where \(V_{i}:\overline{\mathbb {T}}\times \mathcal {M}_{1}(\mathbb {H})\rightarrow \overline{\mathbb {R}}\) (the function defining the expected future payoff) represents the discounted value of player i’s expected future payoff assuming that they act optimally in the future under beliefs, i.e.,
\[V_{i}(t,\beta ):=\int _{\mathbb {H}}v_{i}(t,H)\,d\beta (H), \tag{6}\]
where the function \(v_{i}:\overline{\mathbb {T}} \times \mathbb {H} \rightarrow \overline{\mathbb {R}}\) is the present value of the future payoff of player i under the assumption that they act optimally in the future, given u and X:
\[v_{i}(t,(X,u)):=\sup _{d:\mathbb {T}\rightarrow D_{i}}\left[ \sum _{s=t}^{T}\frac{P_{i}(d(s),u(s),X(s))}{(1+r_{i})^{s-t}}+\frac{G_{i}(X(T+1))}{(1+r_{i})^{T+1-t}}\right] . \tag{7}\]
Note that this definition of expected payoff mimics, to some extent, the Bellman equation for calculating the best responses of players to the strategies of the others, used to derive Nash equilibria.
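The structure of the expected payoff (current payoff plus the discounted, belief-weighted value of acting optimally in the future) can be sketched in Python for beliefs with finite support. The sketch assumes a finite horizon, \(t_0=0\), a finite decision set that is the same at every stage, and a belief already evaluated at the current decision and given as a list of (probability, history) pairs; all names and the toy payoffs are hypothetical. Because a believed history fixes X and u, the optimization over future plans separates into one maximization per stage, which the code exploits.

```python
def future_value(t, history, decisions, P, G, r, T):
    """Present value at time t of acting optimally along a fixed history (X, u)."""
    X, u = history
    # With (X, u) fixed, the best future plan maximizes each stage separately.
    stage_sum = sum(max(P(a, u[s], X[s]) for a in decisions) / (1.0 + r) ** (s - t)
                    for s in range(t, T + 1))
    return stage_sum + G(X[T + 1]) / (1.0 + r) ** (T + 1 - t)


def expected_future_value(t, belief, decisions, P, G, r, T):
    """Belief-weighted average of future_value; belief = [(prob, history), ...]."""
    return sum(p * future_value(t, h, decisions, P, G, r, T) for p, h in belief)


def expected_payoff(t, a, u_t, x_t, belief, decisions, P, G, r, T):
    """Current payoff plus the discounted expected future value."""
    future = expected_future_value(t + 1, belief, decisions, P, G, r, T)
    return P(a, u_t, x_t) + future / (1.0 + r)


# Toy data (assumed): horizon T = 1, decisions {0, 1}, two equally likely histories.
T = 1
decisions = [0, 1]
P = lambda a, u, x: a * x
G = lambda x: x
belief = [(0.5, ([4, 2, 0], [2, 2])), (0.5, ([4, 4, 4], [0, 0]))]
value = expected_payoff(0, 1, 2, 4, belief, decisions, P, G, 1.0, T)
print(value)  # 6.0
```

In a full model the belief would be recomputed for every candidate decision a, since beliefs may depend on the player's own current choice.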
3 Nash Equilibria and Belief Distorted Nash Equilibria
One of the basic concepts in game theory, Nash equilibrium, assumes that every player (almost every in the case of games with a continuum of players) chooses a strategy which maximizes their payoff given the strategies of the remaining players.
Notational convention. For any profile S and strategy d (both static and dynamic) of player i, \(S^{i,d}\) represents the modification of the profile S where the strategy of player i is replaced by d.
Definition 3.1
A profile S is a Nash equilibrium iff for a.e. \(i\in \mathbb {I}\) and for every strategy \(d \in D_i\), \(\varPi _{i}(S)\ge \varPi _{i}(S^{i,d})\).
The abbreviation “a.e.” (almost every) in games with finitely many players reduces to “every.”
3.1 Toward Belief Distorted Nash Equilibria: Pre-Belief Distorted Nash Equilibria and their Properties
The assumption that a player knows the strategies of the remaining players, or at least the statistic for these strategies which influences their payoff, is not usually fulfilled in real-life situations. Moreover, the details of the other players’ payoff functions or available strategy sets are sometimes not known precisely. Therefore, given their beliefs, players maximize their expected payoffs.
Definition 3.2
A profile S is a pre-belief distorted Nash equilibrium (pre-BDNE for short) for belief B iff for a.e. \(i\in \mathbb {I}\), for every decision a of player i and every \(t\in \mathbb {T}\), we have \(\varPi _{i}^{e}(t,H^S,S^{OL}(t))\ge \varPi _{i}^{e}(t,H^S, (S^{OL}(t))^{i,a})\).
In other words, a profile S is a pre-BDNE iff almost every player maximizes their expected payoff given the current values of \(X^S\) and \(u^S\) and beliefs about their future values.
Remark 3.1
In one-shot games (i.e., for \(T=t_{0}\) and \(G\equiv 0\)), a profile is a pre-BDNE iff it is a Nash equilibrium.
\(\square \)
Next, we state an existence result for games with a continuum of players.
Theorem 3.1
Let \((\mathbb {I},\mathfrak {I},\lambda )\) be an atomless measure space and let \(D_i\subseteq \mathbb {R}^{n}\), together with the \(\sigma \)-field of Borel subsets. Assume that for every t, x, H and for almost every i, the following continuity assumptions hold: The functions \(P_{i}(a,u,x)\) and \(V_{i}(t,B_{i}(t,a,H))\) are upper semicontinuous in (a, u) jointly, while for every a, they are continuous in u and for all k, the functions \(g_{k}(i,a)\) are continuous in a for \(a\in D_{i}\).
Assume also that the sets \(D_{i}\) are compact and the following measurability assumptions hold: The graph of \(D_{\bullet }\) is measurable, and for every t, x, u, k, and H, the \(P_{i}(a,u,x)\), \(r_{i}\), \(V_{i}(t,B_{i}(t,a,H))\), and \(g_{k}(i,a)\) are measurable in (i, a). Moreover, assume that for each k, \(g_k\) is integrably bounded, i.e., there exists an integrable function \(\Gamma :\mathbb {I}\rightarrow \mathbb {R}\) such that for every \(a\in D_{i}\), \(\left| g_{k}(i,a)\right| \le \Gamma (i)\).
Under these assumptions, there exists a pre-BDNE for B.
Theorem 3.1 states that under some measurability, compactness, and continuity assumptions, there exists a pre-BDNE. This is a conclusion from a more general existence result, Theorem B.2, which is proved using a general Nash equilibrium result from Wiszniewska-Matyszkiel [17] and the concept of analyticity of sets. Since it requires introducing specific terminology, for the sake of coherence and also for readers who are less interested in nonstandard mathematics, Theorem B.2 is stated and proven in Appendix B. The proof of Theorem 3.1 is also given in Appendix B, after the formulation and proof of Theorem B.2.
Now we turn to show some properties of pre-BDNE for a special kind of belief.
Definition 3.3
A belief \(B_{i}\) has perfect foresight for a profile S, iff for all t, \(B_{i}(t,S_{i}^{OL}(t),H^{S})\) is concentrated at \({\{H^{S}\}}\).
For perfect foresight, we state the equivalence between Nash equilibria and pre-BDNE for a continuum of players.
Theorem 3.2
Let \((\mathbb {I},\mathfrak {I},\lambda )\) be an atomless measure space. Assume that for all i, x, and u, the \(P_i(\bullet ,u,x)\) are upper semicontinuous, \(D_i\) are compact, \(\sup _{S\in \Sigma }\varPi _i(S)<+\infty \) and for every \(S\in \Sigma \), \(\max _{d:\mathbb {T} \rightarrow D_i}\varPi _i(S^{i,d})\) is attained.

(a)
Any Nash equilibrium profile \(\bar{S}\) is a pre-BDNE for any belief corresponding to perfect foresight at \(\bar{S}\) and all profiles \(\bar{S}^{i,d}\).

(b)
If a profile \(\bar{S}\) is a pre-BDNE for a belief B with perfect foresight at both \(\bar{S}\) and \(\bar{S}^{i,d}\) for a.e. player i and any of their strategies d, then it is a Nash equilibrium.
Proof
In all the subsequent reasonings, we consider player i, who is not a member of the set of players of measure 0 for whom the condition of payoff maximization (actual for Nash equilibrium, expected for BDNE) does not hold.
In the case of a continuum of players, the statistics for the profiles, and consequently the trajectories corresponding to them, are identical for \(\bar{S}\) and all \(\bar{S}^{i,d}\). We denote this statistic by u and the corresponding trajectory by X.
To continue, we need the next lemma stating that along the path of perfect foresight, the equation for the expected payoff of player i becomes the Bellman equation for optimizing their actual payoff, while \(V_{i}\) is the value function.
Lemma 3.1
Let i be a player maximizing their payoff at a Nash equilibrium \(\bar{S}\), while \(\tilde{V}_i\) is the value function for this maximization. If B has perfect foresight for both profile \(\bar{S}\) and profiles \(\bar{S}^{i,d}\) for any \(d\in D_i\), then for all t, the values of \(V_i\) and \(\tilde{V_i}\) coincide and \(\bar{S}_i^{OL}(t)\in Argmax _{a\in D_{i}}\varPi _{i}^{e}(t,H^{\bar{S}},(\bar{S}^{OL}(t))^{i,a})\).
Proof
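(of Lemma 3.1) Note that, given the profile of strategies of the remaining players coincides with \(\bar{S}\), the value function for the decision problem of player i can be written as \(\widetilde{V}_{i}:\mathbb {T}\rightarrow \overline{\mathbb {R}}\), unlike in standard dynamic optimization problems, since, because of the negligibility of every single player, the trajectory X is fixed:
\[\widetilde{V}_{i}(t)=\sup _{d:\mathbb {T}\rightarrow D_{i}}\left[ \sum _{s=t}^{T}\frac{P_{i}(d(s),u(s),X(s))}{(1+r_{i})^{s-t}}+\frac{G_{i}(X(T+1))}{(1+r_{i})^{T+1-t}}\right] \tag{8}\]
(recall that for infinite T, we take \(G\equiv 0\)).
Since the payoff is well defined and the maximum is attained at \(\bar{S}_i\), the value function fulfills the Bellman equation
\[\widetilde{V}_{i}(t)=\sup _{a\in D_{i}}\left[ P_{i}(a,u(t),X(t))+\frac{1}{1+r_{i}}\cdot \widetilde{V}_{i}(t+1)\right] \tag{9}\]
and
\[\bar{S}_{i}(t)\in Argmax _{a\in D_{i}}\left[ P_{i}(a,u(t),X(t))+\frac{1}{1+r_{i}}\cdot \widetilde{V}_{i}(t+1)\right] . \tag{10}\]
Using Eq. (8) to substitute an expression for \(\widetilde{V}_{i}(t+1)\) on the r.h.s. of the Bellman equation, Eq. (9), we obtain
\[\widetilde{V}_{i}(t)=\sup _{a\in D_{i}}\left[ P_{i}(a,u(t),X(t))+\frac{1}{1+r_{i}}\sup _{d:\mathbb {T}\rightarrow D_{i}}\left( \sum _{s=t+1}^{T}\frac{P_{i}(d(s),u(s),X(s))}{(1+r_{i})^{s-(t+1)}}+\frac{G_{i}(X(T+1))}{(1+r_{i})^{T-t}}\right) \right] .\]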
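Note that the last supremum is equal not only to \(\widetilde{V}_{i}(t+1)\), but also to \(v_{i}(t+1,(X,u))\). Since the belief assigns probability one to the history (X, u) for all profiles \(\bar{S}^{i,d}\), this supremum is also equal to \(V_{i}(t+1,B_{i}(t,a,H^{\bar{S}^{i,d}}))\). Therefore,
\[\widetilde{V}_{i}(t)=\sup _{a\in D_{i}}\left[ P_{i}(a,u(t),X(t))+\frac{1}{1+r_{i}}\cdot V_{i}(t+1,B_{i}(t,a,H^{\bar{S}}))\right] .\]
We only have to show that \(\bar{S}_{i}(t)\in Argmax _{a\in D_{i}}\varPi _{i}^{e}(t,H^{\bar{S}},\left( \bar{S}^{OL}(t)\right) ^{i,a}).\)
From the definition of \(\varPi _i^e\), Eq. (5), this set is equal to
\[Argmax _{a\in D_{i}}\left[ P_{i}(a,u(t),X(t))+\frac{1}{1+r_{i}}\cdot V_{i}(t+1,B_{i}(t,a,H^{\bar{S}}))\right] =Argmax _{a\in D_{i}}\left[ P_{i}(a,u(t),X(t))+\frac{1}{1+r_{i}}\cdot \widetilde{V}_{i}(t+1)\right] ,\]
which is the set in Expression (10).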
Hence Eqs. (8) and (10) are satisfied, which ends the Proof of Lemma 3.1. \(\square \)
Statement (a) from Theorem 3.2 is an immediate conclusion from Lemma 3.1.
Statement (b): Let \(\bar{S}\) be a pre-BDNE for B which has perfect foresight at \(\bar{S}\) and all \(\bar{S}^{i,d}\). We consider \(\widetilde{V}_{i}\) as in Lemma 3.1, Eq. (8). From the definition of pre-BDNE and perfect foresight, \(\bar{S}_{i}(t)\in Argmax _{a\in D_{i}}\varPi _i^{e}(t,H^{\bar{S}},\left( \bar{S}^{OL}(t)\right) ^{i,a})\), which is equal to \(Argmax _{a\in D_{i}}\left[ P_{i}(a,u(t),X(t))+\frac{1}{1+r_{i}}\max _{d:\mathbb {T}\rightarrow D_{i}}\left( \sum _{s=t+1}^{T}\frac{P_{i}(d(s),u(s),X(s))}{\left( 1+r_{i}\right) ^{s-(t+1)}}+\frac{G_{i}\left( X(T+1)\right) }{\left( 1+r_{i}\right) ^{T-t}}\right) \right] \). From Eq. (8), this set is equal to \(Argmax _{a\in D_{i}}\left[ P_{i}(a,u(t),X(t))+\left( \frac{1}{1+r_{i}}\right) \cdot \widetilde{V}_{i}(t+1)\right] \), the set in Expression (10).
At this stage, we need to show the sufficiency of the Bellman equation with the appropriate terminal condition. For the finite time horizon case, Eq. (9) and Expression (10) (with d instead of \(\bar{S}_i\)), together with \(\widetilde{V}_{i}(T+1)=G_{i}(X(T+1))\), are sufficient conditions for the function \(\widetilde{V}_{i}\) and strategy d to be the value function and optimal strategy, respectively.
In the infinite horizon case, the standard form of the terminal condition does not work in the case of unbounded payoffs, so we use a weaker version from Wiszniewska-Matyszkiel [18], Theorem 1. The required conditions for our problem are

(i)
\(\mathrm {limsup}_{t\rightarrow \infty }\widetilde{V}_{i}(t)\cdot \left( \frac{1}{1+r_{i}}\right) ^{t-t_{0}}\le 0\) and

(ii)
if \(\mathrm {limsup}_{t\rightarrow \infty }\widetilde{V}_{i}(t)\cdot \left( \frac{1}{1+r_{i}}\right) ^{t-t_{0}}< 0\), then for every \(d:\mathbb {T}\rightarrow D_i\), \(\varPi _i(\bar{S}^{i,d})=-\infty \).
Condition (i) holds from the assumption that the \(\varPi _i\) are bounded from above, while (ii) holds, since the existence of a \(t_k\rightarrow \infty \) such that \(\lim _{k\rightarrow \infty }\widetilde{V}_{i}(t_k)\cdot \left( \frac{1}{1+r_{i}}\right) ^{t_k-t_{0}}<0\) when at least one of \(\varPi _i(S^{i,d})\) is greater than \(-\infty \) contradicts the convergence of the series in the definition of \(\varPi _i\), see Eq. (4).
Since \(\widetilde{V}_{i}\) fulfills (9), the set \(Argmax _{a\in D_{i}}\) \(\left[ P_{i}(a,u(t),X(t))+\left( \frac{1}{1+r_{i}}\right) \cdot \widetilde{V}_{i}(t+1) \right] \) is the set of optimal actions of player i at time t, given u and X. Since we have this property for a.e. i, \(\bar{S}\) is a Nash equilibrium.
\(\square \)
The next equivalence result holds for repeated games.
Theorem 3.3
Consider a repeated game where players’ belief functions are independent of their strategies, such that for every player i, \(\sup _{d,u} P_i(d,u,\bar{x})<+\infty \).

(a)
If \((\mathbb {I},\mathfrak {I},\lambda )\) is an atomless measure space, then a profile S is a pre-BDNE for B, iff it is a Nash equilibrium, iff it is a sequence of Nash equilibria in static one-stage games.

(b)
Any profile S where the strategies of a.e. player are independent of the observed history is a pre-BDNE for B, iff it is a Nash equilibrium, iff it is a sequence of Nash equilibria in static one-stage games.
Proof
In repeated games, the only variable influencing future payoffs (via the dependence of the strategies of the remaining players on the history) is the statistic of the profile.

(a)
In games with an atomless space of players, the decision of a single player does not influence the statistic. Therefore, the optimization problem faced by player i can be decomposed into the optimization of \(P_{i}(a,u(t),\bar{x})\) at each separate moment (the discounted payoffs obtained in the future are finite, since the current payoffs are bounded).

(b)
If the strategies of the remaining players do not depend on the history of the game, then the current decision of a player does not influence their future payoffs, actual or expected. Therefore, the optimization problem faced by player i can be decomposed into the optimization of \(P_{i}(a,u(t),\bar{x})\) at each separate moment (again, the discounted payoffs obtained in the future are finite, since the current payoffs are bounded).
\(\square \)
3.2 Toward Belief Distorted Nash Equilibrium: Self-Verification
In this subsection, we concentrate on the problem of the consistency of a game’s history with players’ beliefs.
In dynamic games with many stages, especially games with an infinite time horizon, we cannot check whether beliefs are consistent with reality by assuming that the game is repeated many times.
Using the inf-approach, where beliefs are given by the sets of histories regarded as possible, the method of verification is obvious. If a history regarded as impossible happens, it means that beliefs have been falsified. Otherwise, there is no need to update beliefs and we can regard them as being consistent with reality. Without any ranking of trajectories regarded as being possible, this is the only reasonable method of verification. In the case of probabilistic beliefs, the method of verification is not so obvious. It could be adapted from the inf-approach, where the support of a distribution is treated as the set of possible histories, but this leads to a loss of the information introduced by the probability distribution. Therefore, we introduce a function measuring the consistency of beliefs with a game’s history.
First, given a probability distribution \(\beta \) on \(\mathbb {H}\), we introduce a function, called the likelihood function, that measures to what extent the histories corroborate \(\beta \). It assigns to each probability distribution on \(\mathbb {H}\) a function on the set of infinite histories corresponding to the belief.
Definition 3.4
A function \(L:\mathcal {M}_{1}(\mathbb {H})\rightarrow [0,1]^{\mathbb {H}}\) is called a likelihood function iff

(a)
If \(\bar{H}\) is an atom of \(\beta \), then \(L(\beta )(\bar{H}):=\frac{\beta (\bar{H})}{\max _{H \in \mathbb {H}}\beta (H)}\).

(b)
If \(\beta \) is a continuous probability distribution with density \(\mu \),
then \(L(\beta )(\bar{H}) :=\frac{\mu (\bar{H})}{\max _{H \in \mathbb {H}}\mu (H)}\) if the maximum is attained.

(c)
Otherwise, the function L satisfies

(i)
if \(\beta (\{H_{1}\})>\beta (\{H_{2}\})\), then \(L\left( \beta \right) (H_{1}) > L\left( \beta \right) (H_{2})\);

(ii)
for each \( \beta \), there exists H with \(L(\beta )(H)=1\) (the “most likely history” is always of likelihood 1).

This definition gives a unique function in the case of discrete distributions. In the case of continuous distributions, we can take any density function, which leads to certain nonuniqueness. In the case of mixed distributions, we can choose any function satisfying (a)–(c), since the relation between atoms and the atomless part is not predefined.
From this moment on, we fix a likelihood function L, which is used in further definitions.
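For discrete beliefs, the likelihood function reduces to dividing each probability by the largest one, so that the most likely history has likelihood 1. A minimal Python sketch of this case follows; the dictionary representation of a belief and the history labels are assumptions made for illustration only.

```python
def likelihood(belief):
    """Normalize a discrete belief so that its most likely history scores 1.

    belief: dict mapping a (hashable) history label to its probability.
    """
    top = max(belief.values())
    return {history: prob / top for history, prob in belief.items()}


# Toy belief (assumed) over three labelled histories.
beta = {"H1": 0.5, "H2": 0.25, "H3": 0.25}
print(likelihood(beta))  # {'H1': 1.0, 'H2': 0.5, 'H3': 0.5}
```

The sketch covers only distributions consisting of atoms; continuous and mixed distributions require a choice of density or of a function satisfying the conditions of Definition 3.4.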
The first thing that we consider is verification of beliefs.
Given a likelihood function, we can define a measure of the consistency of beliefs along a profile \(\bar{S}\) as the minimum likelihood of \(H^{\bar{S}}\), taken over time, according to that belief. However, we have to solve a technical problem resulting from the notational convention of using elements from the set \(\mathbb {H}\) to denote both the observed history \(H_t\) and predictions of the future for all t. In fact, given the beliefs at time t, we only want to measure the likelihood of the predictions: X(s) and u(s) for \(s>t\). The observed history, \(H_t\), i.e., X(s) for \(s\le t\) and u(s) for \(s<t\), does not cause any problem, since with probability one, \(H_t=H^{\bar{S}}_t\). So, only u(t) may cause problems. Note that u(t) has no effect on \(B_i(t,\bullet ,\bullet )\). Hence, if we replace it by something else, we do not change any of the previously defined concepts. Therefore, to define the method of verifying beliefs, we slightly modify this irrelevant part of the history:
Definition 3.5
A function \(l^{\bar{S}}:\mathbb {I}\times \mathcal {M}_{1}(\mathbb {H})\rightarrow \mathbb {R}_{+}\) is called a measure of the ex post consistency of beliefs \(\{B_{i}\}_{i\in \mathbb {I}}\) with reality for profile \(\bar{S}\) iff \(l_{i}^{\bar{S}}(B_{i}):=\inf _{t\in \mathbb {T}}L(\bar{B}^t_{i}(t,\bar{S}_{i}^{OL}(t),H^{\bar{S}}))(H^{\bar{S}})\).
Given \(\varepsilon \ge 0\), we can define the following properties of \(\varepsilon \)-self-verification of beliefs.
Definition 3.6

(a)
A collection of beliefs \(B=\{B_{i}\}_{i\in \mathbb {I}}\) is perfectly \(\varepsilon \)-self-verifying iff for every pre-BDNE \(\bar{S}\) for B and for a.e. \(i\in \mathbb {I}\), we have \(l_{i}^{\bar{S}}(B_{i})\ge 1-\varepsilon \).

(b)
A collection of beliefs \(B=\{B_{i}\}_{i\in \mathbb {I}}\) is potentially \(\varepsilon \)-self-verifying iff there exists a pre-BDNE \(\bar{S}\) for B such that for a.e. \(i\in \mathbb {I}\), we have \(l_{i}^{\bar{S}}(B_{i})\ge 1-\varepsilon \).
In order to interpret these concepts, let us assume that players, who respond best to their beliefs, have an incentive to change their beliefs only if the measure of their ex post consistency is less than \(1-\varepsilon \). In this case, perfect \(\varepsilon \)-self-verification of beliefs means that players never have any incentive to change their beliefs, while potential \(\varepsilon \)-self-verification of beliefs means that there is a possibility that they will have no incentive to change their beliefs.
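The consistency check behind these definitions can be sketched in Python for a single player with discrete beliefs and a finite horizon, where the infimum over time becomes a minimum. The sketch is a simplification: it ignores the modification of the irrelevant part u(t) of the history, it takes the sequence of beliefs held along the realized play as given, and all names and numbers are hypothetical.

```python
def ex_post_consistency(beliefs_over_time, realized):
    """Minimum over time of the likelihood given to the realized history.

    beliefs_over_time: list of dicts (history label -> probability), one per stage.
    realized: label of the history that actually occurred.
    """
    values = []
    for belief in beliefs_over_time:
        top = max(belief.values())  # the most likely history has likelihood 1
        values.append(belief.get(realized, 0.0) / top)
    return min(values)


def is_eps_consistent(beliefs_over_time, realized, eps):
    """True iff the consistency measure is at least 1 - eps."""
    return ex_post_consistency(beliefs_over_time, realized) >= 1.0 - eps


# Toy beliefs (assumed) held at two consecutive stages.
beliefs = [{"A": 0.6, "B": 0.4}, {"A": 0.5, "B": 0.5}]
print(is_eps_consistent(beliefs, "B", 0.5))   # True
print(is_eps_consistent(beliefs, "B", 0.25))  # False
print(is_eps_consistent(beliefs, "A", 0.0))   # True
```

Under the interpretation given above, a player holding these beliefs and observing history "B" would keep them for \(\varepsilon =0.5\) but would have an incentive to revise them for \(\varepsilon =0.25\).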
3.3 Belief Distorted Nash Equilibrium
Definition 3.7
A profile S is an \(\varepsilon \)-belief distorted Nash equilibrium for a collection of beliefs \(B=\{B_{i}\}_{i\in \mathbb {I}}\) (\(\varepsilon \)-BDNE for short) iff it is a pre-BDNE for B and \(l^S(B)(H^S)\ge 1-\varepsilon \).
A 0-BDNE is called a BDNE.
If we assume that players feel an incentive to change their beliefs only if the measure of the beliefs’ ex post consistency is less than \(1-\varepsilon \), then at an \(\varepsilon \)-BDNE, beliefs are never changed.
Proposition 3.1
Theorems 3.2 and 3.3 and Remark 3.1 still hold when pre-BDNE for specific beliefs is replaced by BDNE for those beliefs. \(\square \)
This means that we have equivalence between BDNE and Nash equilibria for those classes of games for which equivalence results hold for pre-BDNE and Nash equilibria: under assumptions of boundedness, when beliefs are independent of a player’s own decision, in games with a continuum of players, one-shot games and repeated games.
3.4 Comparison of BDNE and \(\varepsilon \)-BDNE to Similar Concepts
We can compare the notions of BDNE and \(\varepsilon \)-BDNE introduced in this paper with Nash equilibria, subjective equilibria, as well as BDNE for set-valued beliefs.
First, we compare our concept to Nash equilibria. From Proposition 3.1, Nash equilibria and BDNE coincide, for example, in games with a continuum of players or repeated games with bounded payoffs and when beliefs are independent of a player’s decisions. However, in general, the concept of BDNE is neither equivalent nor an extension to the concept of Nash equilibrium. In the examples considered in Sect. 4, we compare Nash equilibria to BDNE for specific models.
Next, we compare BDNE to the most related concepts of equilibrium in games with incomplete information. When distorted information is considered, as mentioned before, only two of the concepts of equilibrium without complete information can deal with it.
The subjective equilibria of Kalai and Lehrer [9, 10] are used in the environment of repeated games or games that can be repeated. Decisions are taken at each moment without foreseeing the future, and beliefs—a stochastic environmental response function—are based on history and the decision applied at the present stage of the game. It is assumed that the current decision does not influence future play. Hence, players just optimize given their beliefs at each stage separately. The condition applied in subjective equilibrium theory is that beliefs are not contradicted by observations, i.e., that the frequencies of various results correspond to the assumed probability distributions.
When we compare the concept of BDNE for probabilistic beliefs to subjective equilibria, there is an apparent similarity: Beliefs are probabilistic, players optimize the expected value of their payoff given those beliefs, and the condition that beliefs are not contradicted by observations is added. However, subjective equilibria are adapted to repeated games, and their extension to multistage games is not obvious. Moreover, using the subjective equilibrium approach, a player’s beliefs, based on the history observed, describe the probability distribution of reactions to the decision of a player by the unknown system (which plays the role of a statistic in our formulation) at this stage only. Given such beliefs, players optimize their expected payoffs. No equilibrium condition is added, only the condition that the frequencies of various reactions correspond to the assumed belief.
Belief distorted Nash equilibria (BDNE) for set-valued beliefs (the inf-approach), introduced by the author in [14], like our current concept of BDNE, apply to multistage games. At each stage, players choose decisions maximizing, given their belief correspondences, their guaranteed payoffs (for the realization regarded as being the worst possible) from that moment on. In order for such a profile of decisions to be a pre-BDNE, we add the condition that the value of the statistic of the profile which influences players’ payoffs and the behavior of the state variable is foreseen correctly at each stage, as considered in this paper. Under the assumption that beliefs have perfect foresight, in games with a continuum of players, this notion coincides with the concept of Nash equilibrium (a result analogous to Theorem 3.2 of this paper). Finally, a profile is a BDNE iff it is a pre-BDNE and the actual trajectory of the game is in the belief correspondence. In this paper, beliefs are modelled using a set of probability measures instead of a multivalued correspondence, and the optimal expected payoff replaces the optimal guaranteed payoff. A likelihood function is introduced to verify the consistency of beliefs.
It is worth emphasizing that, if we compare set-valued beliefs and probabilistic beliefs with a uniform distribution on the same set, then the concepts of self-verification in both approaches are exactly the same. However, the BDNE are different, since using the inf-approach, the guaranteed future payoff is considered instead of the expected payoff, which leads to more risk-averse behavior.
4 Examples
As the first example of a game with distorted information, we consider a model of a renewable resource which is the common property of all its users. This model, in a slightly different formulation, was first defined in Wiszniewska-Matyszkiel [19] and afterward examined in Wiszniewska-Matyszkiel [14] as an example showing some interesting properties of belief distorted Nash equilibria under the inf-approach. Here, we use it to illustrate the exp-approach.
4.1 A Common Ecosystem
Let us consider two versions of a game of exploiting a common ecosystem: either with n players (\(\{1,\dots ,n\}\) with the normalized counting measure) or using the unit interval [0, 1] with the Lebesgue measure to describe the set of players. The statistic is the aggregate of the profile, i.e., \(g(i,a)=a\). The reaction function is \(\phi (x,u)=x(1-\max (0,u-\zeta ))\), where \(\zeta >0\) is the regeneration rate, and the initial state is \(\bar{x}>0\). The sets of available strategies are given by \(D_{i}=[0,(1+\zeta )]\). The current payoff functions are \(P_{i}(a,u,x)=\ln (ax)\), where \(\ln 0\) is understood as \(-\infty \). The discount rate for all players is \(r>0\). The time horizon is \(+\infty \). In this example, the so-called tragedy of the commons is present in a very drastic form: in the continuum of players case, the players deplete the resource in a finite time at every Nash equilibrium.
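To make the state dynamics concrete, the following Python sketch iterates the reaction function, read here as \(\phi (x,u)=x(1-\max (0,u-\zeta ))\), for a given sequence of aggregate extraction levels. The helper names `next_state` and `trajectory` and the numerical values are illustrative assumptions, not part of the model's original presentation.

```python
from typing import List, Sequence


def next_state(x: float, u: float, zeta: float) -> float:
    """One step of the resource dynamics: x * (1 - max(0, u - zeta))."""
    return x * (1.0 - max(0.0, u - zeta))


def trajectory(x0: float, aggregates: Sequence[float], zeta: float) -> List[float]:
    """State path X(0), X(1), ... generated by the aggregates u(0), u(1), ..."""
    states = [x0]
    for u in aggregates:
        states.append(next_state(states[-1], u, zeta))
    return states
```

For example, with \(\zeta =0.25\) and \(\bar{x}=2\), maximal aggregate extraction \(u=1+\zeta \) depletes the resource in one step, extraction at the regeneration rate \(u=\zeta \) keeps the state constant, and \(u=0.75\) halves the state at every step.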
The fundamental results from Wiszniewska-Matyszkiel [19] regard the Nash equilibria of this game. We need them as the starting point for analysis, since we want to compare pre-BDNE and \(\varepsilon \)-BDNE to Nash equilibria. Rewritten to fit the formulation of this paper, the results regarding Nash equilibria are as follows.
Proposition 4.1
Let \(\mathbb {I}=[0,1]\). No dynamic profile such that any set of players of positive measure get finite payoffs is an equilibrium, and every dynamic profile yielding depletion of the system at any finite time (i.e., \(\exists \bar{t} \text { s.t. } X(\bar{t})=0\)) is a Nash equilibrium. At every Nash equilibrium, for every player, the payoff is \(-\infty \).
Proposition 4.2
Let \(\mathbb {I}=\{1,\dots ,n\}\). \(\bar{S}\equiv \max \left( \frac{nr(1+r)}{1+nr},\zeta \right) \) is a Nash equilibrium, and at every Nash equilibrium, the payoffs are finite.
The proof of Proposition 4.2 uses a standard technique for solving the Bellman equation, while the proof of Proposition 4.1 applies a decomposition method from Wiszniewska-Matyszkiel [20].
By Theorem 3.2 and Proposition 3.1, in the case of a continuum of players, any Nash equilibrium is a BDNE for perfect foresight.
We are interested in pre-BDNE that are not Nash equilibria. One interesting problem is to find a belief for which the resource is not depleted at any pre-BDNE in the continuum of players case. Moreover, we want to design a belief such that it is enough to “teach” a relatively small set of players, while the others still hold their original beliefs. The belief we are going to consider is of the following form: “it is me who can save the system: if I restrict my exploitation to some level, then with probability one the system will not be destroyed within a finite time, while if I exceed this limit, the system will be destroyed in a finite time with positive probability.” Formally, we state the following proposition for a general class of such beliefs.
Proposition 4.3
Let \(\mathbb {I}=[0,1]\). Consider any belief correspondence such that for every \(i\in \mathbb {J}\subset \mathbb {I}\), where \(\mathbb {J}\) is of positive measure, \(t\in \mathbb {T}\), \(H\in \mathbb {H}\), there exist \(\varepsilon _{1},\varepsilon _{2},\varepsilon _{3}>0\) and constants \(\left( 1+\zeta \right)>\varepsilon _{1}>\delta (i,t,H)>0\) such that \(B_{i}(t,a,H)\) assigns a positive measure to the set of histories (X, u) such that for every \(s>t\), \(X(s)=0\) if \(a> \left( 1+\zeta \right) -\delta (i,t,H)\), while if \(a\le (1+\zeta )-\delta (i,t,H)\), then for every \(s>t\), we have \(X(t+1)\ge \varepsilon _{2}\cdot e^{-\varepsilon _{3}t}\) with probability 1. For every profile S which is a pre-BDNE for this belief, for a.e. \(i\in \mathbb {J}\), we have \(S_{i}^{OL}(t)\le (1+\zeta )-\delta (i,t,H)\), and \(X(t)>0\) for every t.
Proof
Obviously, for every player \(i\in \mathbb {J}\), the decision at time t maximizing \({\varPi }_{i}^{e}\) for any strategy profile of the remaining players is not greater than \( (1+\zeta )-\delta (i,t,H)\), the maximal level of extraction such that \(V_{i}(t+1,B_{i}(t,a,H^{S}))\ne -\infty \), since \(\varPi _{i}^{e}(t,H^{S},S^{OL}(t))\ge \sum _{s=t}^{T}\ln \left( \left( (1+\zeta )-\delta (i,t,H)\right) \cdot \varepsilon _{2}\cdot e^{-\varepsilon _{3}t}\right) \cdot (1+r)^{-t}\ge \sum _{s=t}^{T}-\varepsilon _{3}\cdot t\cdot \ln \left( \left( (1+\zeta )-\varepsilon _{1}\right) \cdot \varepsilon _{2}\right) \cdot (1+r)^{-t}> -\infty \).
We have \(X(t_{0})>0.\) If \(\nu :=\int _{\mathbb {J}}\delta (i,t,H)d\lambda (i)\), then \(X(t+1)\ge X(t)\cdot \left( 1-\left( \left( (1+\zeta )-\nu \right) -\zeta \right) \right) =X(t)\cdot \nu >0\), so \(X(t)>0\) implies \(X(t+1)>0\). \(\square \)
This result has an obvious interpretation: Ecological education can make people sacrifice their current utility in order to protect the system even if they, in fact, constitute a continuum. It is sufficient that they believe their decisions really have an influence on the system. The opposite situation is also possible: If people believe that they individually have no influence on the system, then they behave like a continuum. Depletion of the resource, which is impossible at a Nash equilibrium from Proposition 4.2, may happen at a pre-BDNE, which we prove as the next result.
Proposition 4.4
Let \(\mathbb {I}=\{1,\dots ,n\}\). Consider a belief correspondence such that there exists t such that for every i and H, \(B_{i}(t,a,H)\) assigns a positive probability to the set of (X, u) for which for some \(s>t\), \(X(s)=0\). Then any dynamic profile, including profiles S such that for some \(\bar{t}\), \(X^{S}(\bar{t})=0\), is a pre-BDNE.
Proof
For every i, t, and a, \(V_{i}(t+1,B_{i}(t,a,H))=-\infty \). Therefore, each choice of the players is in the set of best responses to such a belief. \(\square \)
Now let us consider the problem of \(\varepsilon \)-self-verification of such beliefs and check whether pre-BDNE are \(\varepsilon \)-BDNE.
Proposition 4.5

(a)
Let \(\mathbb {J}\) be a set of players of positive measure. Assume that the beliefs of the remaining players, \(\mathbb {I}\backslash \mathbb {J}\), are independent of their own decisions and assign probability 0 to the set of histories for which \(X(t)=0\) for some t. There exists a belief that is perfectly \(\varepsilon \)-self-verifying for some \(\varepsilon <1\) such that for each player from \(\mathbb {J}\), the assumptions of Proposition 4.3 are fulfilled.

(b)
A profile \(\bar{S}\) for which players from \(\mathbb {J}\) choose \((1+\zeta )-\delta (i,t,H)\), while the remaining players choose \((1+\zeta )\), is an \(\varepsilon \)-BDNE for these beliefs.

(c)
Consider a belief correspondence such that there exists t such that for a.e. i and every H, \(B_{i}(t,a,H)\) assigns a positive probability to the set of histories \(H^{\prime }\) which are admissible (i.e., there exists a profile S such that \(H^{\prime }=H^{S}\)) and, given this, there exists a time moment \(s_t>t\) for which \(X(s_t)=0\). Any such belief correspondence is potentially \(\varepsilon \)-self-verifying for some \(\varepsilon <1\).

(d)
Every profile \(\bar{S}\) resulting in depletion of the resource in a finite time is an \(\varepsilon \)-BDNE for some beliefs defined in (c).

(e)
Points (a)–(d) hold for \(\varepsilon =0\).
Proof
(a) and (b) We construct such a belief. For the players from \(\mathbb {I}\backslash \mathbb {J}\), this belief does not depend on a and is concentrated on the set \(\left\{ (X,u):\forall t\ X(t)\ne 0\right\} \). We specify this belief after some calculations.
Let \(\nu :=\lambda (\mathbb {J})\). Consider a strategy profile \(\bar{S}\) such that the players \(i\in \mathbb {J}\) choose \(\bar{S}_{i}(t,H)=\alpha \) for some \( \alpha \in [\zeta , 1+ \zeta ] \), while for \(i\notin \mathbb {J}\), \(\bar{S}_{i}(t,H)=(1+\zeta )\). Then the statistic for this profile at time t is equal to \(u(t)= (1+\zeta )(1-\nu )+\nu \alpha \), while the trajectory corresponding to it fulfills \(X(t+1)=X(t)(1-\max (0,u(t)-\zeta ))=X(t)\cdot \nu \cdot \left( (1+\zeta )-\alpha \right) \).
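The closed form of the one-step growth factor can be checked numerically. The sketch below assumes the reading \(u(t)=(1+\zeta )(1-\nu )+\nu \alpha \) and \(X(t+1)=X(t)(1-\max (0,u(t)-\zeta ))\), and compares the result with \(\nu \cdot ((1+\zeta )-\alpha )\); the function names are hypothetical.

```python
import math


def aggregate(nu: float, alpha: float, zeta: float) -> float:
    """Aggregate extraction when a mass nu of players chooses alpha
    and the remaining mass 1 - nu chooses the maximum 1 + zeta."""
    return (1.0 + zeta) * (1.0 - nu) + nu * alpha


def growth_factor(nu: float, alpha: float, zeta: float) -> float:
    """Ratio X(t+1) / X(t) computed directly from the reaction function."""
    return 1.0 - max(0.0, aggregate(nu, alpha, zeta) - zeta)


def closed_form(nu: float, alpha: float, zeta: float) -> float:
    """Claimed simplification nu * ((1 + zeta) - alpha)."""
    return nu * ((1.0 + zeta) - alpha)


def identity_holds(nu: float, alpha: float, zeta: float, tol: float = 1e-12) -> bool:
    """Compare both expressions up to floating-point tolerance."""
    return math.isclose(growth_factor(nu, alpha, zeta),
                        closed_form(nu, alpha, zeta), abs_tol=tol)
```

The identity relies on \(u(t)\ge \zeta \), which holds for \(\alpha \in [\zeta ,1+\zeta ]\) and \(\nu \in (0,1]\), since then \(u(t)-\zeta =1-\nu ((1+\zeta )-\alpha )\ge 0\).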
We consider a belief \(B_{i}\) such that for \(s>t\):

(i)
every history in its support fulfills \(u(s)=(1+\zeta )(1-\nu )+\nu \alpha \),

(ii)
\(X(s+1)\ge X(s)\cdot \nu \cdot \left( (1+\zeta )-\alpha \right) \) for all \(i\notin \mathbb {J}\) whatever a is and for \(i\in \mathbb {J}\) only for \(a\le \alpha \),

(iii)
for \(i\in \mathbb {J}\) and \(a>\alpha \), \(B_{i}(t,a,H)\) assigns a positive probability to the set of histories with \(X(s)=0\) for some \(s>t\),

(iv)
\(L(B_{i}(t,a,H))\ge 1-\varepsilon \) on the set of histories fulfilling (i)–(iii).
For such a belief, the decision maximizing \(\varPi _{i}^{e}\) for every player i from \(\mathbb {J}\) is \(\alpha \), while for the players from \(\mathbb {I}\backslash \mathbb {J}\) the optimal choice is \((1+\zeta )\). Therefore, all the pre-BDNE for this belief fulfill the above assumptions, which implies perfect \(\varepsilon \)-self-verification.
(c) and (d) Since, by Proposition 4.4, every profile is a pre-BDNE for such a belief correspondence, \(\bar{S}\) is a pre-BDNE. Hence, if \(L(B_{i}(t,a,H))\ge 1-\varepsilon \) on a set of admissible trajectories such that for some \(s>t\), \(X(s)=0\), and this set contains \(\bar{S}\), we have potential \(\varepsilon \)-self-verification.
(e) First, rewrite the proof of (a) with the additional assumption in the definition of the beliefs that \(H^{\bar{S}}\) for \(\bar{S}\) from (b) is of maximal likelihood for beliefs along \(H^{\bar{S}}\). Analogously, do the same for \(\bar{S}\) from (d) while rewriting (c). \(\square \)
4.2 The El Farol Bar Problem with a Continuum of Players or a Public Good with Congestion
Here we present an extension of the model presented by Brian Arthur [21] as the El Farol bar problem to a large game. There are players who choose at each time whether to stay at home, represented by 0, or to go to the bar, represented by 1. If the bar is overcrowded, then it is better to stay at home. The less it is crowded, the better it is to go.
Consider the space of players represented by the unit interval with the Lebesgue measure. The game is repeated. Hence, the state variable is trivial and is omitted in the notation. The statistic of a static profile is \(U(\delta ):=\int _{\mathbb {I}}\delta (i)d\lambda (i)\).
In our model, the effects of congestion are reflected by the current payoff function, \(P_{i}(d,u):=d\cdot \left( \frac{1}{2}-u\right) \).
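As an illustration, the following Python sketch evaluates this payoff, read as \(d\cdot (\frac{1}{2}-u)\), and lists the stage decisions that are optimal against a given attendance level u. The function names `payoff` and `best_responses` are hypothetical; an individual player is assumed to have no influence on u, as in the continuum case.

```python
from typing import Set


def payoff(d: int, u: float) -> float:
    """Stage payoff d * (1/2 - u): staying home (d = 0) yields 0,
    going (d = 1) yields 1/2 - u."""
    return d * (0.5 - u)


def best_responses(u: float) -> Set[int]:
    """Decisions in {0, 1} maximizing the stage payoff for a fixed u."""
    best = max(payoff(0, u), payoff(1, u))
    return {d for d in (0, 1) if payoff(d, u) == best}
```

When fewer than half of the players attend, going is the unique best response; when more than half attend, staying home is; at \(u=\frac{1}{2}\), both decisions are optimal.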
First, we state the equivalence between Nash equilibria, pre-BDNE, BDNE, and subjective equilibria for this model.
Proposition 4.6

(a)
If the \(B_{i}\) are independent of a, then the set of pre-BDNE coincides with the set of Nash equilibria, which is equal to the set of profiles such that for every t, \(u(t)=\frac{1}{2}\).

(b)
The union of the sets of BDNE over beliefs that are independent of a player’s own actions coincides with the set of Nash equilibria and pure strategy subjective equilibria. For every profile in this set, for every t, \(u(t)=\frac{1}{2}\).
Proof

(a)
The former equivalence is implied by Theorem 3.2 or 3.3. The latter one is trivial.

(b)
We know that the set of pre-BDNE for beliefs independent of a player’s own choice coincides with the set of Nash equilibria. So, the set of BDNE is a subset of the set of Nash equilibria. What remains to be proved is the fact that each Nash equilibrium is a BDNE for some beliefs from this class.
To prove this, let us take a profile S whose statistic, u, is equal to \(\frac{1}{2}\) for all t and beliefs B having perfect foresight for S (so, they are concentrated on this u) and all \(S^{i,d}\). This profile is a pre-BDNE and BDNE for B.
Since every Nash equilibrium is a subjective equilibrium, the only fact that remains to be proved is that at every subjective equilibrium, \(u(t)=\frac{1}{2}\). The environmental response function assigns a probability distribution describing a player’s beliefs about u(t). All the players who believe that \(P[u(t)>\frac{1}{2}]\) is greater than \(P[u(t)<\frac{1}{2}]\) choose 0, while those who believe the opposite choose 1; the remaining players may choose either of the two strategies. If the number of players choosing 0 is greater than the number of those choosing 1 with positive probability, then the event \(u(t)<\frac{1}{2}\) happens more frequently than the event \(u(t)>\frac{1}{2}\), which contradicts the beliefs of the players who choose 0. \(\square \)
Next, let us state some self-verification results.
Proposition 4.7
Consider a belief independent of the players’ own decisions.

(a)
Assume \(B=\{B_{i}\}_{i\in \mathbb {I}}\) is such that for every profile S which is a pre-BDNE for B, for a.e. i, every \(a\in \{0,1\}\), and every time t, \(L\left( B_{i}(t,a,H^{S})\right) (\frac{1}{2},\frac{1}{2},\ldots )\ge 1-\varepsilon \) for some \(\varepsilon <1\). Then B is perfectly \(\varepsilon \)-self-verifying and S is an \(\varepsilon \)-BDNE.

(b)
Assume \(B=\{B_{i}\}_{i\in \mathbb {I}}\) is such that for every profile S which is a pre-BDNE for B, there exists \(\bar{t}\) such that for a.e. i, every \(a\in \{0,1\}\), \(B_{i}(\bar{t},a,H^{S})(\{u:\exists t>\bar{t}, u(t)\ne \frac{1}{2}\})=1\) with \(L\left( B_{i}(t,a,H^{S})\right) \) equal to 0 outside this set. For every profile \(\bar{S}\) which is a pre-BDNE for this belief, for a.e. i, we do not have potential \(\varepsilon \)-self-verification for any \(\varepsilon <1\).
\(\square \)
Proposition 4.7 states that every pre-BDNE for beliefs assigning a sufficiently large likelihood to \(u\equiv \frac{1}{2}\) is an \(\varepsilon \)-BDNE and such beliefs are perfectly \(\varepsilon \)-self-verifying, while beliefs which for every pre-BDNE have \(u(t)\ne \frac{1}{2}\) for some t with probability one are not even potentially \(\varepsilon \)-self-verifying.
4.3 Repeated Prisoner’s Dilemma
Although the concepts of BDNE are better adapted to games with many players, we present a simple example of a two-player game: the Prisoner’s Dilemma repeated infinitely many times.
There are two players who have two available strategies at each stage: cooperate, coded as 1, and defect, coded as 0. The decisions are made simultaneously. Therefore, a player does not know the decision of their opponent. We assume that the statistic is the whole profile. If both players cooperate, then they get a payoff of C. If they both defect, they get a payoff of N. If only one of the players cooperates, then the cooperator gets a payoff of A, while the defector gets R. These payoffs are ranked as follows \(A<N<C<R\).
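Using the notation of this paper, with \(u=(u_{1},u_{2})\) denoting the profile and \(j\ne i\) the other player, the payoff function can be written as \(P_{i}(a,u)=C\) if \(a=1\) and \(u_{j}=1\); \(P_{i}(a,u)=A\) if \(a=1\) and \(u_{j}=0\); \(P_{i}(a,u)=R\) if \(a=0\) and \(u_{j}=1\); and \(P_{i}(a,u)=N\) if \(a=0\) and \(u_{j}=0\).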
Obviously, the strictly dominant pair of defecting strategies (0, 0) is the only Nash equilibrium in the one-stage game, while a sequence of such decisions also constitutes a Nash equilibrium in the repeated game, as well as a BDNE, which we can easily prove by considering beliefs that are independent of a player’s current decision. We check whether a pair of cooperative strategies can also constitute a BDNE. In order to do this, let us consider any beliefs \(\bar{B}\) of the form “if I defect now, then the other player will always defect, while if I cooperate now, then the other player will always cooperate” with \(B_i(t,1,H)\) assigning maximal probability to the history \(H'\) with \(H'_t=H_t\) and \(H'(s)=(1,1)\) for \(s \ge t\).
The pair of grim trigger \(GT \) strategies “cooperate until the first defection of the other player, then defect” constitutes a Nash equilibrium if the discount rates are small. This does not hold for a pair of “cooperate” \(CE \) strategies.
Proposition 4.8
If both \(r_{i}\) are small enough, then:

(a)
The pair \((GT ,GT )\) is a BDNE for \(\bar{B}\) and a Nash equilibrium; moreover, the interval of \(r_{i}\) for which \(GT \) is a BDNE is larger than the interval for which it is a Nash equilibrium;

(b)
All profiles of the same open-loop form as \((GT ,GT )\), including the pairs \((CE ,CE )\) and \((GT ,CE )\), are also BDNE for \(\bar{B}\), while profiles of any other open-loop form are not pre-BDNE for \(\bar{B}\);

(c)
There exist beliefs \(\bar{B}\) fulfilling (a) which are perfectly self-verifying.
Proof
We consider \(\bar{B}\) such that for every \(H \in \mathbb {H}_{\infty }\), \(\bar{B}_{i}(t,0,H)\) is concentrated on \(\{H' \in \mathbb {H}_{\infty }: \forall s>t \ (H'(s))_{i}=0\}\) and \(\bar{B}_{i}(t,1,H)\) is concentrated on \(\{H' \in \mathbb {H}_{\infty }: \forall s>t \ (H'(s))_{i}=1\}\) with \(B_i(t,1,H)( \{ H': H'_t=H_t, \ \forall s \ge t \ H'(s)=(1,1) \} )\) being maximal (over the set of all histories \( H'\) with \(H'_t=H_t\)).
(a) We start by proving that the profile \((GT ,GT )\) is a Nash equilibrium. To do this, consider a player’s best response to \(GT \) from moment t onwards. Assume that at time t this player chooses to defect. Then their maximal payoff for such a profile from time t on is \(R+\sum _{s=t+1}^{\infty } \frac{N}{(1+r_{i})^{(s-t)}}\), while by playing \(GT \), their payoff is \(C+\sum _{s=t+1}^{\infty } \frac{C}{(1+r_{i})^{(s-t)}}\). The condition for \(GT \) to be optimal is \((R-C)\cdot r_{i}<C-N\), which holds for small \(r_{i}\).
Next, we prove that \(GT \) is also a BDNE for \(\bar{B}\), i.e., that it is a pre-BDNE and that the actual history is of likelihood 1 at every moment t. Consider moment t and history H.
We have \(V_{i}(t,B_{i}(t,0,H))=\sum _{s=t}^{\infty } \frac{N}{(1+r_{i})^{(s-t)}}=\frac{(1+r_{i})\cdot N}{r_{i}}\) and \(V_{i}(t,B_{i}(t,1,H))=\sum _{s=t}^{\infty } \frac{R}{(1+r_{i})^{(s-t)}}=\frac{(1+r_{i})\cdot R}{r_{i}}\).
Therefore, for player i, without loss of generality player 1, \(\varPi _{1}^{e}(t,H,(0,1))=R+\frac{N}{r_{1}}\), while \(\varPi _{1}^{e}(t,H,(1,1))=C+\frac{R}{r_{1}}\).
Hence, cooperation is better than defection when \((R-C)\cdot r_{1}<R-N\). For these values of \(r_{1}\), \(GT \) is a pre-BDNE for \(\bar{B}\) and no profile with a different open-loop form can be a pre-BDNE for \(\bar{B}\). Since the statistic for \((GT ,GT )\) is equal to (1, 1) at every moment, the likelihood of the resulting history is equal to one at every moment t. Therefore, the measure of the consistency of beliefs is one and the profile \((GT ,GT )\) is a BDNE.
(b) This follows from (a) and the fact that both strategies \(GT \) and \(CE \) behave in the same way if the other player does not defect, which leads to the same open-loop form as \((GT ,GT )\).
(c) The perfect self-verification of \(\bar{B}\) is a consequence of this and the fact that at every pre-BDNE for \(\bar{B}\), the history is \(H^{(GT ,GT )}\equiv (1,1)\), which is of maximal probability and therefore, of likelihood 1 at every moment t. \(\square \)
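The two conditions on the discount rate derived in the proof, \((R-C)\cdot r_{i}<C-N\) for the Nash equilibrium and \((R-C)\cdot r_{i}<R-N\) for the pre-BDNE, can be compared with a short Python sketch. The function names and the sample payoffs \(A=0\), \(N=1\), \(C=3\), \(R=5\) are illustrative assumptions that satisfy \(A<N<C<R\); they are not taken from the paper.

```python
def cooperation_payoff(r: float, C: float) -> float:
    """Payoff from playing GT against GT from time t on: C + C / r."""
    return C + C / r


def deviation_payoff(r: float, R: float, N: float) -> float:
    """Best payoff after defecting against GT at time t: R + N / r."""
    return R + N / r


def gt_is_nash(r: float, N: float, C: float, R: float) -> bool:
    """Nash condition from the proof: (R - C) * r < C - N."""
    return (R - C) * r < C - N


def gt_is_pre_bdne(r: float, N: float, C: float, R: float) -> bool:
    """Pre-BDNE condition for the beliefs considered: (R - C) * r < R - N."""
    return (R - C) * r < R - N


def nash_threshold(N: float, C: float, R: float) -> float:
    """Upper bound on r for the Nash condition."""
    return (C - N) / (R - C)


def bdne_threshold(N: float, C: float, R: float) -> float:
    """Upper bound on r for the pre-BDNE condition."""
    return (R - N) / (R - C)
```

Since \(C<R\), the pre-BDNE bound \((R-N)/(R-C)\) always exceeds the Nash bound \((C-N)/(R-C)\). With the sample payoffs, the bounds are 2 and 1, respectively, so a discount rate of 1.5 supports \((GT ,GT )\) as a pre-BDNE but not as a Nash equilibrium.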
Since this game is repeated, we can compare the concept of BDNE with subjective equilibria. At a subjective equilibrium, players maximize their expected payoff at each stage given their beliefs about the current decision of the opponent. Since defection dominates cooperation, players should defect at each stage, regardless of their beliefs. It should also be noted that under the concept of subjective equilibrium, punishment is impossible.
5 Conclusions
This paper introduces a new notion of equilibrium, the Belief Distorted Nash Equilibrium (BDNE) for probabilistic beliefs. The notion of BDNE is especially applicable in dynamic games and repeated games. Existence and equivalence theorems are proved and concepts of self-verification are introduced. These theoretical results are illustrated by examples: extraction of a common renewable resource, a large minority game, and a repeated Prisoner’s Dilemma. The self-verification of various beliefs is analyzed for these examples. In the case of the model of extracting a common resource, the results suggest that appropriate ecological education is of great importance, since, in some cases, it can be the only way to guarantee sustainability. This paper shows that we have to be conscious of the existence of beliefs, which, although often inconsistent with reality, can be regarded as rational if they have the property of self-verification. If we replace the word “beliefs” by “academic models of dynamic decision-making problems of a game-theoretic nature, used by their participants,” then our results indicate the danger that models inconsistent with reality may be regarded as scientifically valid, since they have the property of self-verification: They suggest behavior which results in the confirmation of the theories assumed.
References
Harsanyi, J.C.: Games with incomplete information played by Bayesian players, Part I. Manag. Sci. 14, 159–182 (1967)
Battigalli, P., Siniscalchi, M.: Rationalization and incomplete information. Adv. Theor. Econ. 3(1), 1534–5963 (2003)
Battigalli, P., Guaitoli, D.: Conjectural equilibria and rationalizability in a game with incomplete information. In: Battigalli, P., Montesano, A., Panunzi, F. (eds.) Decisions, Games and Markets, pp. 97–124. Kluwer, Dordrecht (1997)
Eyster, E., Rabin, M.: Cursed equilibrium. Econometrica 73, 1623–1672 (2005)
Fudenberg, D., Levine, D.K.: Self-confirming equilibrium. Econometrica 61, 523–545 (1993)
Azrieli, Y.: On pure conjectural equilibrium with nonmanipulable information. Int. J. Game Theory 38, 209–219 (2009)
Azrieli, Y.: Categorizing others in large games. Games Econ. Behav. 67, 351–362 (2009)
Cartwright, E., Wooders, M.: Correlated equilibrium, conformity and stereotyping in social groups. The Becker Friedman Institute for Research in Economics, Working paper 2012014 (2012)
Kalai, E., Lehrer, E.: Subjective equilibrium in repeated games. Econometrica 61, 1231–1240 (1993)
Kalai, E., Lehrer, E.: Subjective games and equilibria. Games Econ. Behav. 8, 123–163 (1995)
Rubinstein, A., Wolinsky, A.: Rationalizable conjectural equilibrium: between Nash and rationalizability. Games Econ. Behav. 6, 299–311 (1994)
Aumann, R.J.: Subjectivity and correlation in randomized strategies. J. Math. Econ. 1, 67–96 (1974)
Aumann, R.J.: Correlated equilibrium as an expression of bounded rationality. Econometrica 55, 1–19 (1987)
Wiszniewska-Matyszkiel, A.: Belief distorted Nash equilibria: introduction of a new kind of equilibrium in dynamic games with distorted information. Annals Oper. Res., 147–177 (2016). doi:10.1007/s10479-015-1920-7
Wiszniewska-Matyszkiel, A.: When beliefs about future create future - exploitation of a common ecosystem from a new perspective. Strategic Behav. Environ. 4, 237–261 (2014)
Wiszniewska-Matyszkiel, A.: Stock market as a dynamic game with continuum of players. Control Cybern. 37, 617–647 (2008)
Wiszniewska-Matyszkiel, A.: Existence of pure equilibria in games with nonatomic space of players. Topol. Methods Nonlinear Anal. 16, 339–349 (2000)
Wiszniewska-Matyszkiel, A.: On the terminal condition for the Bellman equation for dynamic optimization with an infinite horizon. Appl. Math. Lett. 24, 943–949 (2011)
Wiszniewska-Matyszkiel, A.: A dynamic game with continuum of players and its counterpart with finitely many players. In: Nowak, A.S., Szajowski, K. (eds.) Annals of the International Society of Dynamic Games, vol. 7, pp. 455–469. Birkhäuser, Boston-Basel-Berlin (2005)
Wiszniewska-Matyszkiel, A.: Static and dynamic equilibria in games with continuum of players. Positivity 6, 433–453 (2002)
Brian Arthur, W.: Inductive reasoning and bounded rationality. Am. Econ. Rev. 84, 406–411 (1994)
Aumann, R.J.: Markets with a continuum of traders. Econometrica 32, 39–50 (1964)
Vind, K.: Edgeworth-allocations in an exchange economy with many traders. Int. Econ. Rev. 5, 165–177 (1964)
Schmeidler, D.: Equilibrium points of nonatomic games. J. Stat. Phys. 17, 295–300 (1973)
Mas-Colell, A.: On the theorem of Schmeidler. J. Math. Econ. 13, 201–206 (1984)
Balder, E.: A unifying approach to existence of Nash equilibria. Int. J. Game Theory 24, 79–94 (1995)
Wieczorek, A.: Large games with only small players and finite strategy sets. Appl. Math. 31, 79–96 (2004)
Carmona, G., Podczeck, K.: On the existence of pure-strategy equilibria in large games. J. Econ. Theory 144, 1300–1319 (2009)
Balbus, Ł., Dziewulski, P., Reffett, K., Woźny, Ł.: Differential information in large games with strategic complementarities. Econ. Theory 59, 201–243 (2015)
Lasry, J.M., Lions, P.L.: Mean field games. Jpn. J. Math. 2, 229–260 (2007)
Wiszniewska-Matyszkiel, A.: Open and closed loop Nash equilibria in games with a continuum of players. J. Optim. Theory Appl. 160, 280–301 (2014)
Wiszniewska-Matyszkiel, A.: Common resources, optimality and taxes in dynamic games with increasing number of players. J. Math. Anal. Appl. 337, 840–841 (2008)
Acknowledgements
The project was financed by funds from the National Science Centre granted by decision number DEC2013/11/B/HS4/00857.
Appendices
Appendix A
1.1 Games with a Measure Space of Players
Games with a measure space of players are usually perceived as a synonym of games with infinitely many players also called large games. In order to make it possible to evaluate the influence of an infinite set of players on aggregate variables, a measure is introduced on the \(\sigma \)field of subsets of the set of players. However, the notion games with a measure space of players also encompasses games with finitely many players, since the counting measure on a power set may be considered.
Large games were introduced in order to illustrate situations where the number of agents is large enough to make a single agent insignificant—negligible. However, the impact of a set of players of positive measure is not negligible. This happens in many reallife situations: in competitive markets, on the stock exchange, or when we consider the emission of greenhouse gases or the similar global effects of exploitation of the common global ecosystem.
Although it is possible to construct models with countably many players that illustrate this phenomenon, they are generally difficult to cope with. Therefore, the simplest examples of large games are socalled games with continuum of players, where players constitute an atomless measure space, usually the unit interval with the Lebesgue measure. If, additionally, we consider at least one atomic player, then we call such a game a mixed large game.
The first attempts to use models with a continuum of players are contained in Aumann [22] and Vind [23]. The following are theoretical works on large games: Schmeidler [24], MasColell [25], Balder [26], Wieczorek [27], WiszniewskaMatyszkiel [17], Carmona and Podczeck [28], and Balbus et al. [29]. The general theory of dynamic games with a continuum of players is still being developed by, e.g., WiszniewskaMatyszkiel [20] for games with a common global state variable, Lasry and Lions [30] for stochastic mean field games where each player is associated with a private state variable and WiszniewskaMatyszkiel [31] for games with both common global and private state variables.
Introducing a continuum of players rather than a finite number, however large, can substantially change the properties of equilibria and the way in which they are derived, even if the measure of the space of players is preserved in order to make the results comparable. Such an aggregate-preserving modification to games does not reflect a situation in which new decision makers enter a game, but a situation in which the same “mass” of individuals participating in the game is decomposed into smaller units, each being a decision maker. For example, in models of global ecological problems, the same “mankind” can be decomposed into a set of players consisting either of countries (n players) or of individuals or firms (a continuum of players). In spite of the differences between the methods used and some qualitative differences, some limit properties can be proven. Such comparisons were made by the author in [19, 32].
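The negligibility of a single player under an aggregate-preserving decomposition can be illustrated numerically. The following is a minimal sketch under assumptions that are not taken from the paper: there are n players of equal weight 1/n (so the total mass is always 1), the aggregate is the weighted sum of decisions, and the function names are hypothetical.

```python
# Hypothetical illustration: n players, each of weight 1/n, so the total
# "mass" of players equals 1 regardless of n (aggregate-preserving).

def aggregate(decisions):
    """Weighted aggregate of decisions with equal weights 1/n."""
    return sum(decisions) / len(decisions)


def shift_from_deviation(n, share, delta=1.0):
    """Change in the aggregate when a group holding `share` of the total
    mass raises its decision by `delta`; all other players choose 0.
    At least one player always deviates, so share=0.0 models one player."""
    deviators = max(1, round(share * n))
    before = [0.0] * n
    after = [delta] * deviators + [0.0] * (n - deviators)
    return aggregate(after) - aggregate(before)


if __name__ == "__main__":
    for n in (10, 1_000, 100_000):
        single = shift_from_deviation(n, share=0.0)       # one player
        coalition = shift_from_deviation(n, share=0.25)   # a quarter of the mass
        print(n, single, coalition)
```

As n grows, the effect of one player's deviation shrinks like 1/n, while the effect of a coalition holding a fixed share of the mass stays close to that share. This is only a finite approximation of the continuum case, in which the single-player effect is exactly zero.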
Appendix B
1.1 General Existence Result and Proof of Theorem 3.1
Here, we prove and generalize Theorem 3.1, using a theorem on the existence of a Nash equilibrium from Wiszniewska-Matyszkiel [17].
Before formulating this theorem, we have to define some notions used in it.
Definition B.1
The symbol \(diag \mathbb {X}\) denotes the diagonal in \(\mathbb {X}^{2}\): \(diag \mathbb {X}=\left\{ (x,x) : x\in \mathbb {X}\right\} \).
If \((X,\mathcal {X},\lambda )\) is a measure space, then \(\overline{\mathcal {X}}\) denotes the completion of \(\mathcal {X}\) with respect to \(\lambda \).
A family of sets \(\mathcal {X}\) is called compact iff for every sequence of sets \(\left\{ X_{n}\in \mathcal {X}\right\} _{n\in \mathbb {N}}\) with the finite intersection property (i.e., every finite subfamily has a nonempty intersection), the intersection \({\bigcap }_{n\in \mathbb {N}}X_{n}\) is nonempty.
If \((\mathbb {X},\mathcal {X})\) is a measurable space, then a subset of \(\mathbb {X}\) is called \(\mathcal {X}\)-analytic iff it can be obtained as a projection of a measurable subset of \(\mathbb {X}\times [0,1]\) (with the \(\sigma \)-field of Borel subsets considered on [0, 1]). The family of all \(\mathcal {X}\)-analytic sets is denoted by \(\mathcal {A}(\mathcal {X})\).
If \((\mathbb {X},\mathcal {X})\) is a measurable space, then a function \(f:\mathbb {X} \rightarrow \overline{\mathbb {R}}\) is called \(\mathcal {X}\)-analytically measurable iff the inverse images of the Borel subsets of \(\overline{\mathbb {R}}\) are \(\mathcal {X}\)-analytic.
We consider a game with an atomless space of players \((\mathbb {I},\mathfrak {I},\lambda )\), the space of strategies \((\mathbb {D},\mathcal {D})\), sets of strategies \(D_{i}\), with statistics defined as \(U(\delta )=\int _{\mathbb {I}}g_{k}(i,\delta _{i})d\lambda (i)\) and payoff functions \(\varPi _{i}(\delta )=P_{i}(\delta _{i},U(\delta ))\). The following assumptions are made:

A1.
The space of strategies \(\mathbb {D}\) is such that the diagonal \(diag \mathbb {D}\) is \(\mathcal {D}\otimes \mathcal {D}\)-measurable and \(\mathbb {D}\) is a measurable image of a measurable space \((\mathbb {Z},\mathcal {Z})\) which is an analytic subspace of a measurable space \((\mathbb {Q},\mathcal {Q})\) such that the \(\sigma \)-field \(\mathcal {Q}\) is contained in \(\mathcal {A}(\mathcal {V})\), where \(\mathcal {V}\) is generated by a compact countable family of sets.

A1’.
The space of strategies \(\mathbb {D}\) is such that \(diag \mathbb {D}\) is \(\mathcal {D}\otimes \mathcal {D}\)-measurable and \(\mathbb {D}\) is a measurable image of a measurable space \((\mathbb {Z},\mathcal {Z})\) which is an analytic subspace of a separable compact topological space \(\mathbb {Q}\) (with the \(\sigma \)-field of Borel subsets \(\mathcal {B}(\mathbb {Q})\)).

A2.
For almost every i, the set \(D_{i}\) is nonempty and compact.

A3.
The function \(P_{i}\) is upper semicontinuous for almost every i.

A4.
The graph of D is \(\overline{\mathfrak {I}}\otimes \mathcal {D}\)-analytic.

A5.
The function \(P_{i}(a,\bullet )\) is continuous for almost every i and every \(a\in D_{i}\).

A6.
For every u, the function \(P_{\bullet }(\bullet ,u)|_{Gr D}\) is \(\overline{\mathfrak {I}}\otimes \mathcal {D}\)-analytically measurable.

A7.
The functions \(g_{k}\) are measurable and integrably bounded, and the functions \(g_{k}(i,\bullet )\) are continuous on \(D_{i}\) for almost every i.
Theorem B.1
If assumptions A1 and A2–A7 are fulfilled and the space \(\left( \mathbb {I},\mathfrak {I},\lambda \right) \) is complete, or if A1’ and A2–A7 are fulfilled, then there exists a Nash equilibrium.
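To make the objects in Theorem B.1 concrete, the following is a minimal numerical sketch for a toy game that is an assumption made purely for illustration and does not come from the paper: players are \(i\in [0,1]\) with the Lebesgue measure, \(D_{i}=[0,1]\), \(g(i,d)=d\), and \(P_{i}(a,u)=-(a-i(1-u))^{2}\). This game appears to satisfy A1’ and A2–A7, but that is not verified here. The continuum is approximated by a midpoint grid, and the equilibrium is found by a damped fixed-point iteration on the statistic; the function names are hypothetical.

```python
# Toy game (an assumption for illustration, not taken from the paper):
#   players i in [0, 1] with the Lebesgue measure, D_i = [0, 1],
#   g(i, d) = d, so U(delta) is the average decision,
#   P_i(a, u) = -(a - i * (1 - u))**2.

def best_response(i, u):
    """Maximizer of P_i(., u) on [0, 1]. The statistic u is taken as given
    because a single player has measure zero and cannot change it."""
    return min(1.0, max(0.0, i * (1.0 - u)))


def statistic(profile):
    """Midpoint-rule approximation of U(delta) = integral of delta_i di."""
    return sum(profile) / len(profile)


def find_equilibrium(n_grid=1000, damping=0.5, iterations=100):
    """Damped iteration u <- (1 - damping) * u + damping * U(BR(u))."""
    players = [(k + 0.5) / n_grid for k in range(n_grid)]  # cell midpoints
    u = 0.0
    for _ in range(iterations):
        profile = [best_response(i, u) for i in players]
        u = (1.0 - damping) * u + damping * statistic(profile)
    profile = [best_response(i, u) for i in players]
    return players, profile, u


if __name__ == "__main__":
    players, profile, u = find_equilibrium()
    print("equilibrium statistic:", u)
    print("consistency gap:", abs(statistic(profile) - u))
```

For this particular toy game, the equilibrium statistic can also be computed by hand from \(u=\int _{0}^{1}i(1-u)\,di=(1-u)/2\), which gives \(u=1/3\). The damped iteration is a choice of convenience for this example and is not claimed to converge for general games covered by the theorem.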
We want to apply Theorem B.1 to stage games with distorted information \(\mathfrak {G}_{t,H}\), i.e., games with the set of players \(\mathbb {I}\), the sets of their strategies \(D_{i}\) and the payoff functions \(\varPi _{i}^{e}(t,H,\delta )\). Some of the conditions have to be rewritten.
A3\(_{t,H}\). The functions \(P_{i}(a,u,x)\) and \(V_{i}(t+1,B_{i}(t,a,H))\) are upper semicontinuous in (a, u) for almost every i (where \(x=X(t)\) for \(H=(X,u)\)).
A5\(_{t,H}\). The functions \(P_{i}(a,u,x)\) and \(V_{i}(t+1,B_{i}(t,a,H))\) are continuous in u for almost every i and every \(a\in D_{i}\).
A6\(_{t,H}\). For every u, the functions \((i,a)\mapsto P_{i}(a,u,x)\) and \( (i,a)\mapsto V_{i}(t+1,B_{i}(t,a,H))\) are \(\overline{\mathfrak {I}}\otimes \mathcal {D}\)-analytically measurable, while r is \(\overline{\mathfrak {I}}\)-analytically measurable.
Theorem B.2
Let \((\mathbb {I},\mathfrak {I},\lambda )\) be an atomless measure space.

(a)
If \((\mathbb {I},\mathfrak {I},\lambda )\) is complete and A1, A2, A4, A7, and for all (t, H), A3\(_{t,H}\), A5\(_{t,H}\) and A6\(_{t,H}\) are fulfilled, then there exists a pre-BDNE for B.

(b)
If A1’, A2, A4, A7, and for all (t, H), A3\(_{t,H}\), A5\(_{t,H}\) and A6\(_{t,H}\) are fulfilled, then there exists a pre-BDNE for B.
Proof
Since a pre-BDNE is a sequence of profiles of decisions constituting Nash equilibria in the games \(\mathfrak {G}_{t,H^{S}}\), we consider a specific time moment t and the actual history of the game \(H=H^{S}\).
We show that the expected payoff function, \(P_{i}(a,u,x)+\frac{1}{1+r_{i}} V_{i}(t+1,B_{i}(t,a,H))\), together with \(D_{i}\), fulfills the assumptions of Theorem B.1.
Assumptions A3 and A5 follow directly from A3\(_{t,H}\) and A5\(_{t,H}\). Only the measurability assumption A6 is not immediate. Since the composition \(f\circ g\) of an analytically measurable function f with a measurable function g is analytically measurable and the analytical measurability of functions into \(\overline{\mathbb {R}}\) is preserved by addition, multiplication and division (if well defined), the function \((i,a)\mapsto P_{i}(a,u,x)+\frac{1}{1+r_{i}}V_{i}(t+1,B_{i}(t,a,H))\) is \(\overline{\mathfrak {I}}\otimes \mathcal {D}\)-analytically measurable.
Therefore, all the assumptions of Theorem B.1 hold for \(\mathfrak {G}_{t,H^{S}}\), which implies the existence of a Nash equilibrium in \(\mathfrak {G}_{t,H^{S}}\). Since this holds for all t and H, the resulting sequence of equilibria constitutes a pre-BDNE in our game with distorted information. \(\square \)
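The measurability step in the proof above can be unpacked as the following chain. This is one possible reading of the argument, not a restatement of the author's derivation; in particular, the condition \(r_{i}\ne -1\) is an assumption used here to interpret "if well defined", and the use of the projection \((i,a)\mapsto i\) as the measurable inner function is likewise an interpretation.

\[
\begin{aligned}
&(i,a)\mapsto P_{i}(a,u,x) &&\text{analytically measurable by A6}_{t,H},\\
&(i,a)\mapsto V_{i}(t+1,B_{i}(t,a,H)) &&\text{analytically measurable by A6}_{t,H},\\
&(i,a)\mapsto r_{i} &&\text{composition of } r \text{ with the measurable projection } (i,a)\mapsto i,\\
&(i,a)\mapsto \frac{1}{1+r_{i}} &&\text{addition and division, assuming } r_{i}\ne -1,\\
&(i,a)\mapsto \frac{1}{1+r_{i}}\,V_{i}(t+1,B_{i}(t,a,H)) &&\text{multiplication},\\
&(i,a)\mapsto P_{i}(a,u,x)+\frac{1}{1+r_{i}}\,V_{i}(t+1,B_{i}(t,a,H)) &&\text{addition}.
\end{aligned}
\]

Each line is \(\overline{\mathfrak {I}}\otimes \mathcal {D}\)-analytically measurable given the lines above it, which yields assumption A6 of Theorem B.1 for the stage game.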
Let us note that the assumptions of Theorem B.2 are stated in quite a complicated form. Reducing them to conditions on the original functions and correspondences is not always possible; e.g., proving the continuity of \(v_{i}\) in the second argument can be done under the assumption of the compactness of the set of dynamic strategies available to a player for a given history. However, this cannot be assumed in the case of an infinite time horizon.
Theorem B.2 immediately implies Theorem 3.1.
Proof
(of Theorem 3.1) Obviously, a measurable set is analytic, while the measurability of a function implies its analytic measurability. The measurable space \(\mathbb {R}^n\) with Borel subsets fulfills both conditions A1 and A1’. Therefore, from Theorem B.2, there exists a pre-BDNE. \(\square \)
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
Wiszniewska-Matyszkiel, A. Redefinition of Belief Distorted Nash Equilibria for the Environment of Dynamic Games with Probabilistic Beliefs. J Optim Theory Appl 172, 984–1007 (2017). https://doi.org/10.1007/s1095701610347
Keywords
 Distorted information
 Dynamic games
 Nash equilibrium
 Belief distorted Nash equilibrium (BDNE)
 Self-verification of beliefs
Mathematics Subject Classification
 49N30
 91B02
 91B52
 91A10
 91B76
 91A50
 91B06
 91A13