Abstract
Doxastic characterizations of the set of Nash equilibrium outcomes and of the set of backward-induction outcomes are provided for general perfect-information games (where there may be multiple backward-induction solutions). We use models that are behavioral, rather than strategy-based, where a state only specifies the actual play of the game and not the hypothetical choices of the players at nodes that are not reached by the actual play. The analysis is completely free of counterfactuals and no belief revision theory is required, since only the beliefs at reached histories are specified.
Notes
See, for example, Aumann (1995), Balkenborg and Winter (1997), Ben-Porath (1997), Bonanno (2013), Clausing (2003), Halpern (2001), Perea (2012, 2014), Quesada (2003), Samet (1996, 2013), Stalnaker (1998). Surveys of the literature on the epistemic foundations of backward induction are provided in Brandenburger (2007), Perea (2007a) and Perea (2012, p. 463).
If one identifies histories with nodes in the tree, then \(h\prec h^{\prime }\) means that node h is a predecessor of node \(h^{\prime }\).
Behavioral models were first introduced in Samet (1996).
For simplicity, the characterization is provided for games where no player moves more than once along any play, but we explain how to extend the result to general games. The words ‘outcome’, ‘play’ and ‘terminal history’ will be used interchangeably.
This characterization is not restricted to games where no player moves more than once along any play.
As is customary, we take \(\omega {\mathcal {B}}(\omega ^{\prime })\) and \((\omega ,\omega ^{\prime })\in {\mathcal {B}}\) as interchangeable.
For more details see Battigalli and Bonanno (1999).
Thus it would be more precise to write \({\mathcal {B}}_{\iota (h)}\) instead of \({\mathcal {B}}_{h}\), but we have chosen the lighter notation since there is no ambiguity, because at every decision history there is a unique player who is active there.
For a critical analysis of the use of counterfactuals in dynamic games see Bonanno (2015).
This issue is further discussed in Sect. 7.3.
The root of the tree corresponds to the null history \(\emptyset \), Player 2’s decision node corresponds to history \(a_{1}\), Player 3’s decision node to history \(a_{1}a_{2}\) and Player 1’s last decision node to history \(a_{1}a_{2}a_{3}\).
In other words, for any two states \(\omega \) and \(\omega ^{\prime }\) that are enclosed in a rounded rectangle, \(\{(\omega ,\omega ),(\omega ,\omega ^{\prime }),(\omega ^{\prime },\omega ),(\omega ^{\prime },\omega ^{\prime })\}\subseteq {\mathcal {B}}\) (that is, the relation is total on the set of states contained in the rectangle) and if there is an arrow from a state \( \omega \) to a rounded rectangle then, for every \(\omega ^{\prime }\) in the rectangle, \((\omega ,\omega ^{\prime })\in {\mathcal {B}}\).
Thus \({\mathcal {B}}_{\emptyset }(\omega )=\{\alpha ,\beta ,\gamma \}\) for every \(\omega \in \varOmega =\{\alpha ,\beta ,\gamma ,\delta ,\epsilon \}\), \({\mathcal {B}}_{a_{1}}(\omega )=\{\beta ,\gamma \}\) for every \(\omega \in \{\beta ,\gamma ,\delta ,\epsilon \}\), \({\mathcal {B}}_{a_{1}a_{2}}(\omega )=\{\gamma , \delta \}\) for every \(\omega \in \{\gamma , \delta , \epsilon \}\) and \({\mathcal {B}}_{a_{1}a_{2}a_{3}}(\omega )=\{\delta ,\epsilon \}\) for every \(\omega \in \{\delta ,\epsilon \}.\)
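These belief sets can be checked mechanically against the defining requirements that beliefs are non-empty exactly at reached histories and that every believed state reaches the conditioning history. A minimal Python sketch, where histories are encoded as tuples of actions and the plays \(\zeta (\cdot )\) are a hypothetical assignment chosen to be consistent with the sets listed above (the text does not spell them out here):

```python
# Hypothetical encoding of the five-state model; beliefs[(h, w)] plays the
# role of B_h(w), and states missing from the dict have empty beliefs.
states = ["alpha", "beta", "gamma", "delta", "epsilon"]
zeta = {  # assumed plays, consistent with the belief sets below
    "alpha": ("d1",),
    "beta": ("a1", "d2"),
    "gamma": ("a1", "a2", "d3"),
    "delta": ("a1", "a2", "a3", "d4"),
    "epsilon": ("a1", "a2", "a3", "a4"),
}
beliefs = {}
for w in states:
    beliefs[((), w)] = {"alpha", "beta", "gamma"}
for w in ["beta", "gamma", "delta", "epsilon"]:
    beliefs[(("a1",), w)] = {"beta", "gamma"}
for w in ["gamma", "delta", "epsilon"]:
    beliefs[(("a1", "a2"), w)] = {"gamma", "delta"}
for w in ["delta", "epsilon"]:
    beliefs[(("a1", "a2", "a3"), w)] = {"delta", "epsilon"}

def reaches(h, z):
    """True if decision history h is a proper prefix of play z."""
    return len(h) < len(z) and z[:len(h)] == h

def consistent(histories):
    """Check: B_h(w) is non-empty iff h is reached at w, and every
    believed state's play goes through h."""
    for h in histories:
        for w in states:
            b = beliefs.get((h, w), set())
            if bool(b) != reaches(h, zeta[w]):
                return False
            if any(not reaches(h, zeta[w2]) for w2 in b):
                return False
    return True
```

Running `consistent` over the four decision histories of the example confirms that the assumed plays fit the belief sets.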
Note that rationality in the traditional sense of expected utility maximization implies rationality in our sense; thus anything that is implied by our weak notion will also be implied by the stronger notion of expected utility maximization. On the other hand, our notion has the advantage that it does not rely on the assumption of von Neumann–Morgenstern preferences: the utility functions can be just ordinal utility functions.
In fact, \({\mathbf {R}}_{\emptyset } = {\mathbf {R}}_{a_1}=\{\alpha ,\beta \}\).
In this model \({\mathbf {T}}_{\emptyset } =\{\alpha ,\beta ,\gamma \}, {\mathbf {T}}_{a_1} =\{\alpha ,\beta \}, {\mathbf {T}}_{a_2} =\{\gamma ,\delta \}, {\mathbf {R}}_{\emptyset } = \varOmega , {\mathbf {R}}_{a_1} = \{\beta \}\) and \({\mathbf {R}}_{a_2} = \{\delta \}\).
If Player 2’s strategy selects choice \(b_1\) at decision history \(a_1\), then Player 1’s best reply is to play \(a_2\) rather than \(a_1\).
Because, for every \(\omega \in \varOmega \), \(\alpha ,\beta \in {\mathcal {B}}_{\emptyset }(\omega ), \ a_1\prec \zeta (\alpha ), \ a_1\prec \zeta (\beta )\) and \(\zeta (\alpha )=a_1b_2\ne \zeta (\beta )=a_1b_1\).
The so-called agent form of a game is obtained by treating a player at different decision histories as different players with the same payoff function. Thus the agent form of a game satisfies the no-consecutive-moves condition (but the latter is a weaker condition). Several papers in the literature on the epistemic foundations of backward induction in perfect-information games restrict attention to games in agent form (see, for example, Balkenborg and Winter 1997; Stalnaker 1998).
At the end of this section we discuss how this restriction can be relaxed.
In this model, \({\mathbf {T}}=\{\alpha ,\beta \},{\mathbf {R}}=\{\alpha \}\) and \({\mathbf {C}}=\varOmega \), so that \({\mathbf {T}}\cap {\mathbf {R}}\cap {\mathbf {C}}=\{\alpha \}\).
\(\beta \in {\mathcal {B}}_{\emptyset }(\alpha ), a_1\prec \zeta (\beta )=a_1d_2, \gamma \in {\mathcal {B}}_{a_1}(\beta ), a_1a_2\prec \zeta (\gamma )=a_1a_2d_3, \delta \in {\mathcal {B}}_{a_1a_2}(\gamma )\) and \(a_1a_2a_3\prec \zeta (\delta )=a_1a_2a_3d_4\). Another sequence that leads from \(\alpha \) to \(a_1a_2a_3\) is \(\langle \alpha ,\gamma ,\gamma ,\delta \rangle \).
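The two sequences can be verified mechanically against the three conditions of Definition 9. A Python sketch, where histories are tuples of actions and the belief pairs encoded are only those stated or implied by this footnote:

```python
# Plays and (partial) beliefs from the footnote; beliefs[(h, w)] = B_h(w).
zeta = {
    "beta": ("a1", "d2"),
    "gamma": ("a1", "a2", "d3"),
    "delta": ("a1", "a2", "a3", "d4"),
}
beliefs = {
    ((), "alpha"): {"beta", "gamma"},
    (("a1",), "beta"): {"gamma"},
    (("a1",), "gamma"): {"gamma"},
    (("a1", "a2"), "gamma"): {"delta"},
}

def leads_to(seq, h):
    """Does seq = <w_0, ..., w_m> lead from w_0 to decision history h?
    (1) seq has the right length; (2) each a_1...a_i is a proper prefix
    of zeta(w_i); (3) each w_i is believed at a_1...a_{i-1} by w_{i-1}."""
    m = len(h)
    if len(seq) != m + 1:
        return False
    for i in range(1, m + 1):
        z = zeta[seq[i]]
        if len(z) <= i or z[:i] != h[:i]:                              # (2)
            return False
        if seq[i] not in beliefs.get((h[:i - 1], seq[i - 1]), set()):  # (3)
            return False
    return True
```

Both sequences from the footnote pass the check, while a sequence that visits \(\beta \) after history \(a_1\) fails condition (2).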
Proof. Let \(h=a_1 \ldots a_m\) (\(m\ge 1\)). By Point 1 of Definition 2, \({\mathcal {B}}_{\emptyset }(\omega ) \ne \varnothing \) (since \(\emptyset \) is a prefix of every history, in particular of history \(\zeta (\omega )\)). Hence, since \(a_1\in A(\emptyset )\), by Point 4 of Definition 2 there exists an \(\omega _1\in {\mathcal {B}}_{\emptyset }(\omega )\) such that \(a_1\preceq \zeta (\omega _1)\). Thus, since \(a_2\in A(a_1)\), by Point 4 of Definition 2, there exists an \(\omega _2\in {\mathcal {B}}_{a_1}(\omega _1)\) such that \(a_1a_2\preceq \zeta (\omega _2)\), etc.
A perfect-information game has no relevant ties if, \(\forall i\in N\), \(\forall h\in D_{i}\), \(\forall a,a^{\prime }\in A(h)\) with \(a\ne a^{\prime } \), \(\forall z,z^{\prime }\in Z\), if ha is a prefix of z and \(ha^{\prime } \) is a prefix of \(z^{\prime }\) then \(u_{i}(z)\ne u_{i}(z^{\prime })\). All games in generic position satisfy this condition.
Proof. It is clear that \(({\mathbf {T}}\cap {\mathbf {R}}\cap {\mathbf {C}})\cap {\mathbf {I}}_{TRC} \subseteq ({\mathbf {T}}_{\emptyset }\cap {\mathbf {R}}_{\emptyset }\cap {\mathbf {C}}_{\emptyset })\cap {\mathbf {I}}_{TRC}\) since \({\mathbf {T}}\cap {\mathbf {R}}\cap {\mathbf {C}}\subseteq {\mathbf {T}}_{\emptyset } \cap {\mathbf {R}}_{\emptyset }\cap {\mathbf {C}}_{\emptyset }\). To prove the converse, let \(\omega \in ({\mathbf {T}}_{\emptyset }\cap {\mathbf {R}}_{\emptyset }\cap {\mathbf {C}}_{\emptyset })\cap {\mathbf {I}}_{TRC}\) and let \(h=a_1 \ldots a_m\) (\(m\ge 1\)) be a decision history such that \(h\prec \zeta (\omega )\); we need to show that \(\omega \in {\mathbf {T}}_{h}\cap {\mathbf {R}}_{h}\cap {\mathbf {C}}_{h}\). Since \(\omega \in {\mathbf {I}}_{TRC}\), it will be sufficient to show that h is reachable from \(\omega \) via the constant sequence \(\langle \omega _0,\omega _1,\ldots ,\omega _m\rangle \) with \(\omega _i=\omega \) for every \(i=0,1,\ldots ,m\). Point (1) of Definition 9 is trivially true and point (2) follows from the hypothesis that \(h\prec \zeta (\omega )\) and the fact that, for every \(i=1,\ldots ,m-1\), \(a_1 \ldots a_i\prec h\). As for Point (3), we have, first of all, that \(\omega _1=\omega \in {\mathcal {B}}_{\emptyset }(\omega _{0})\) (where \(\omega _0=\omega \)) because \(\omega \in {\mathbf {T}}_{\emptyset }\). Thus \(a_1\) is reachable from \(\omega \) through the sequence \(\langle \omega ,\omega \rangle \) and hence, since \(\omega \in {\mathbf {I}}_{TRC}\), \(\omega \in {\mathbf {T}}_{a_1}\), that is, \(\omega \in {\mathcal {B}}_{a_1}(\omega )\). It follows that \(a_1a_2\) is reachable from \(\omega \) through the sequence \(\langle \omega ,\omega ,\omega \rangle \) and hence, since \(\omega \in {\mathbf {I}}_{TRC}\), \(\omega \in {\mathbf {T}}_{a_1a_2}\), that is, \(\omega \in {\mathcal {B}}_{a_1a_2}(\omega )\), and so forth.
The proof is by induction. At a “last” decision node (that is, a decision node followed only by terminal nodes) there is a unique rational choice, since there are no ties. Hence at an immediately preceding node, the active player, who believes that after each of her choices the corresponding player will play rationally, can have no uncertainty about the subsequent choices of those future players; hence, since there are no relevant ties, this player too has a unique rational choice. One then extends this argument backwards in the tree by induction.
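The backward-induction computation underlying this argument can be illustrated on a toy tree. A minimal Python sketch (the game below is hypothetical, chosen to have no relevant ties, so the solution is unique):

```python
# Minimal backward induction on a perfect-information tree: decision nodes
# map actions to subtrees, leaves carry a payoff profile (one entry per
# player).  With no relevant ties, the maximizing action is unique.
def backward_induction(node):
    """Return (payoff_profile, play) selected by backward induction."""
    if "payoffs" in node:            # terminal node
        return node["payoffs"], []
    player = node["player"]
    best = None
    for action, child in node["children"].items():
        payoffs, play = backward_induction(child)
        if best is None or payoffs[player] > best[0][player]:
            best = (payoffs, [action] + play)
    return best

# Toy two-stage game: player 0 moves at the root, player 1 after a1.
game = {
    "player": 0,
    "children": {
        "a1": {"player": 1, "children": {
            "b1": {"payoffs": [2, 1]},
            "b2": {"payoffs": [0, 3]},
        }},
        "d1": {"payoffs": [1, 0]},
    },
}
```

At the last decision node player 1 uniquely prefers \(b_2\); anticipating this, player 0 uniquely prefers \(d_1\), mirroring the induction step described above.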
The event \({\mathbf {I}}_{TR}\) is defined as in Definition 10 but without reference to the events \({\mathbf {C}}_h\): \(\omega \in {\mathbf {I}}_{{\mathbf {T}}{\mathbf {R}}}\) if and only if, for every decision history \(h=a_1 \ldots a_m\) (\(m\ge 1)\), and for every sequence \(\langle \omega _0,\omega _1,\ldots ,\omega _m\rangle \) leading from \(\omega \) to h, \(\omega _m\in {\mathbf {T}}_{h}\cap {\mathbf {R}}_{h}\).
The interpretation of the event \({\mathbf {I}}_{TRC}\) given below in terms of “forward belief in rationality” is conceptually similar to the notion of “forward belief in material rationality” given in Perea (2007a, Definition 2.7). However, the latter definition is obtained in a class of models where the space of uncertainty is the set of the opponents’ strategies, rather than the set of terminal histories (furthermore, Perea uses the “type-space” approach rather than the state-space approach followed in this paper). The difference between the two classes of models is discussed in Sect. 7.2.
That is, at state \(\alpha \) and history \(\emptyset \), player \(\iota (\emptyset )=1\) believes that at history \(a_1\) player \(\iota (a_1)=2\) will act rationally.
In this model, \({\mathbf {R}}_{\emptyset }=\{\beta ,\epsilon ,\eta \}, {\mathbf {R}}_{a_1}=\{\epsilon \}, {\mathbf {R}}_{a_2}=\{\beta ,\delta \}, {\mathbf {R}}_{a_1b_1}=\{\alpha ,\eta \}, {\mathbf {R}}=\{\beta ,\epsilon \}, {\mathbf {T}}_{\emptyset }=\{\alpha ,\beta ,\delta ,\epsilon \}, {\mathbf {T}}_{a_1}=\{\epsilon ,\eta \}, {\mathbf {T}}_{a_2}=\{\beta ,\gamma ,\delta \}, {\mathbf {T}}_{a_1b_1}=\{\alpha ,\eta \}, {\mathbf {T}}=\{\beta ,\delta ,\epsilon \}, {\mathbf {C}}_{\emptyset }=\varOmega , {\mathbf {C}}_{a_1}=\{\alpha ,\epsilon ,\eta \}, {\mathbf {C}}_{a_2}=\{\beta ,\gamma ,\delta \}, {\mathbf {C}}_{a_1b_1}=\{\alpha ,\eta \}, {\mathbf {C}}=\varOmega , [a_1]=\{\alpha ,\epsilon ,\eta \}, [a_2]=\{\beta ,\gamma ,\delta \}\) and \([b_1]=\{\alpha ,\eta \}\).
Note that, if state \(\omega \) and decision history h are such that h is not reached at \(\omega \) (that is, \(h\not \prec \zeta (\omega )\)), then, by Definition 2, \({\mathcal {B}}_h(\omega )=\varnothing \) and therefore \({\mathcal {B}}_h(\omega )\subseteq E\), for every event E, that is, \(\omega \in {\mathbb {B}}_hE\). For example, in the model of Fig. 7, \({\mathbb {B}}_{a_1}(\{\alpha ,\beta \})=\{\beta ,\gamma ,\delta \}\), since, for every \(\omega \in \{\beta ,\gamma ,\delta \}\), \({\mathcal {B}}_{a_1}(\omega )=\varnothing \).
It should also be noted that, for perfect-information games with no relevant ties, Battigalli et al. (2013) shows that in every type structure there is a unique play consistent with common strong belief of material rationality and that play is a Nash equilibrium play.
Stalnaker (1968) postulates a “selection function” \(f:\varOmega \times 2^{\varOmega }\rightarrow \varOmega \) that associates with every state \(\omega \) and event E a unique state \(f(\omega ,E)\in E\), while Lewis (1973) postulates a selection function \(F:\varOmega \times 2^{\varOmega }\rightarrow 2^{\varOmega }\) that associates with every state \(\omega \) and event E a set of states \(F(\omega ,E)\subseteq E\). Stalnaker declares the proposition ‘if E then G’ true at \(\omega \) if and only if \(f(\omega ,E)\in G\), while Lewis requires that \(F(\omega ,E)\subseteq G\).
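The difference between the two semantics can be made concrete with a toy state space and hand-picked selection functions (the functions below are purely illustrative, not from the text):

```python
# Toy comparison of the two conditional semantics: a Stalnaker selection
# function f picks a single state in E, a Lewis selection function F picks
# a subset of E.
def stalnaker_true(omega, E, G, f):
    """'if E then G' holds at omega iff the single selected state is in G."""
    return f(omega, E) in G

def lewis_true(omega, E, G, F):
    """'if E then G' holds at omega iff every selected state is in G."""
    return F(omega, E) <= G

f = lambda omega, E: min(E)   # toy closeness order: smaller index = closer
F = lambda omega, E: set(E)   # toy: all E-states count as equally close
```

With \(E=\{2,3\}\) and \(G=\{2\}\), the Stalnaker conditional is true at state 1 (the unique selected state 2 lies in G) while the Lewis conditional is false (the selected set \(\{2,3\}\) is not contained in G), illustrating how the two truth conditions can come apart.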
For an extensive discussion of this issue see Bonanno (2015).
As the author notes, Theorem 2 does not provide a full characterization of Nash equilibrium outcomes as there are Nash equilibria that are inconsistent with extensive-form rationality. However, if only normal-form rationality is assumed, that is, if one assumes that a player optimizes only with respect to her initial beliefs (and not necessarily at every node), then the conditions of Theorem 2 provide a full characterization of Nash equilibrium outcomes.
Furthermore, Ben-Porath uses the “type space” approach where a state is identified with an n-tuple of types, one for each player (n is the number of players); the type of a player specifies his strategy as well as a belief function that assigns, for every node in the tree, a probabilistic belief over the set of profiles of types of the other players. Each player is assumed to know his own type; in particular, each player knows his own strategy.
Epistemic characterizations of Nash equilibrium in strategic-form games have not relied on the condition of common belief of rationality. For example, in their seminal paper Aumann and Brandenburger (1995) showed that, in games with more than two players, if there exists a common prior then mutual belief in rationality and payoffs as well as common belief in each player’s conjecture about the opponents’ strategies imply Nash equilibrium. However, Polak (1999) later showed that in complete-information games, Aumann and Brandenburger’s conditions actually do imply common belief in rationality. More recently, Barelli (2009) generalized Aumann and Brandenburger’s result by substituting the common prior assumption with the weaker property of action-consistency, and common belief in conjectures with a weaker condition stating that conjectures are constant in the support of the action-consistent distribution. Thus, he provided sufficient epistemic conditions for Nash equilibrium without requiring common belief in rationality. Later, Bach and Tsakas (2014) obtained a further generalization by introducing even weaker epistemic conditions for Nash equilibrium than those in Barelli (2009): their characterization of Nash equilibrium is based on introducing pairwise epistemic conditions imposed only on some pairs of players (contrary to the characterizations in Aumann and Brandenburger (1995) and Barelli (2009), which correspond to pairwise epistemic conditions imposed on all pairs of players). Not only do these conditions not imply common belief in rationality but they do not even imply mutual belief in rationality.
Defined as follows: players are rational and always believe in their opponents’ present and future rationality and believe that every opponent always believes in his opponents’ present and future rationality and that every opponent always believes that every other player always believes in his opponents’ present and future rationality, and so on.
In the first round the algorithm eliminates, at every information set of player i, strategies of player i himself that are strictly dominated at present and future information sets, as well as strategies of players other than i that are strictly dominated at present and future information sets. In every further round k those strategies are eliminated that are strictly dominated at a present or future information set \(I_{i}\) of player i, given the opponents’ strategies that have survived up to round k at that information set \(I_{i}\). The strategies that eventually survive the elimination process constitute the output of the backward dominance procedure.
Perea (2014) also suggests that, in general extensive-form games, the two notions of common belief in future rationality and sequential equilibrium reflect the difference between BIS and SPE.
Aumann’s claim is that common knowledge of substantive rationality implies the backward induction solution in perfectinformation games without relevant ties, while Stalnaker maintains that it does not. Roughly speaking, a player is substantively rational if, for every history h of hers, if the play of the game were to reach h, then she would be rational at h.
In the model of Fig. 9 we have that \({\mathbf {R}}_{\emptyset }=\{\delta ,\epsilon ,\eta ,\theta ,\lambda \}, {\mathbf {R}}_{a_1}=\{\delta \}, {\mathbf {R}}_{a_2} =\{\alpha ,\beta \}, {\mathbf {R}}_{a_1b_1} =\{\eta ,\theta \}, {\mathbf {R}}_{a_1b_1c_1} =\{\epsilon \}, {\mathbf {R}}_{a_1b_1c_2} =\{\eta \}, {\mathbf {R}}_{a_2d_2} =\{\alpha \}\) so that \({\mathbf {R}}=\{\delta \}\). Furthermore, \({\mathbf {T}}=\{\gamma ,\delta \}\) and \({\mathbf {C}}=\{\delta ,\epsilon ,\eta ,\theta ,\lambda \}\). Hence \({\mathbf {T}}\cap {\mathbf {R}}\cap {\mathbf {C}}=\{\delta \}\).
For instance, in the example of Fig. 9, one can complete the above-mentioned partial strategy profile by adding \(\sigma (a_2d_2)=g_2\) and \(\sigma (a_1b_1c_2)=f_1\) (even though \(g_2\) and \(f_1\) are “irrational” choices).
Proof. Let \(h\in D\) and \(z=f_{\sigma }(h)\). Let \(a=\sigma (h)\) be the action prescribed by \(\sigma \) at h. Then \(f_{\sigma }(h)=f_{\sigma }(ha)\). By (4), \(f_{\sigma }(ha)\in {\mathcal {B}}_{h}(z)\) and thus \(z\in {\mathcal {B}}_{h}(z)\), that is, \(z\in {\mathbf {T}}_h\).
Proof. Let \(h\in D\) and \(z\in Z\) be such that \(h\prec z\). Fix an arbitrary \(a\in A(h)\) and let \(z^{\prime },z^{\prime \prime }\in {\mathcal {B}}_h(z)\) be such that \(ha\preceq z^{\prime }\) and \(ha\preceq z^{\prime \prime }\). Then, by (4), \(z^{\prime }=z^{\prime \prime }=f_{\sigma }(ha)\); hence \(z\in {\mathbf {C}}_h\).
References
Artemov S (2010) Robust knowledge and rationality. Technical report, CUNY
Aumann R (1995) Backward induction and common knowledge of rationality. Games Econ Behav 8:6–19
Aumann R (1996) Reply to Binmore. Games Econ Behav 17:138–146
Aumann R (1998) On the centipede game. Games Econ Behav 23:97–105
Aumann R, Brandenburger A (1995) Epistemic conditions for Nash equilibrium. Econometrica 63:1161–1180
Bach C, Tsakas E (2014) Pairwise epistemic conditions for Nash equilibrium. Games Econ Behav 85:48–59
Bach CW, Heilmann C (2011) Agent connectedness and backward induction. Int Game Theory Rev 13:195–208
Balkenborg D, Winter E (1997) A necessary and sufficient epistemic condition for playing backward induction. J Math Econ 27:325–345
Baltag A, Smets S, Zvesper J (2009) Keep hoping for rationality: a solution to the backward induction paradox. Synthese 169:301–333
Barelli P (2009) Consistency of beliefs and epistemic conditions for Nash and correlated equilibria. Games Econ Behav 67:363–375
Battigalli P, Bonanno G (1999) Recent results on belief, knowledge and the epistemic foundations of game theory. Res Econ 53:149–225
Battigalli P, DiTillio A, Samet D (2013) Strategies and interactive beliefs in dynamic games. In: Acemoglu D, Arellano M, Dekel E (eds) Advances in Economics and Econometrics. Theory and Applications: Tenth World Congress, Volume 1. Cambridge University Press, Cambridge, pp 391–422
Ben-Porath E (1997) Nash equilibrium and backwards induction in perfect information games. Rev Econ Stud 64:23–46
Binmore K (1996) A note on backward induction. Games Econ Behav 17:135–137
Binmore K (1997) Rationality and backward induction. J Econ Methodol 4:23
Bonanno G (2013) A dynamic epistemic characterization of backward induction without counterfactuals. Games Econ Behav 78:31–43
Bonanno G (2015) Reasoning about strategies and rational play in dynamic games. In: van Benthem J, Ghosh S, Verbrugge R (eds), Models of strategic reasoning, Springer, New York, pp 34–62
Brandenburger A (2007) The power of paradox: some recent developments in interactive epistemology. Int J Game Theory 35:465–492
Clausing T (2003) Doxastic conditions for backward induction. Theor Decis 54:315–336
Clausing T (2004) Belief revision in games of perfect information. Econ Philos 20:89–115
Feinberg Y (2005) Subjective reasoning—dynamic games. Games Econ Behav 52:54–93
Gilboa I (1999) Can free choice be known? In: Bicchieri C, Jeffrey R, Skyrms B (eds) The logic of strategy. Oxford University Press, Oxford, pp 163–174
Ginet C (1962) Can the will be caused? Philos Rev 71:49–55
Goldman A (1970) A theory of human action. Princeton University Press, Princeton
Halpern J (2001) Substantive rationality and backward induction. Games Econ Behav 37:425–435
Kaminski MM (2009) Backward induction and subgame perfection: the justification of a “folk algorithm”. Technical report, University of California, Irvine
Ledwig M (2005) The no-probabilities-for-acts principle. Synthese 144:171–180
Levi I (1986) Hard choices. Cambridge University Press, Cambridge
Levi I (1997) The covenant of reason: rationality and the commitments of thought. Cambridge University Press, Cambridge
Lewis D (1973) Counterfactuals. Harvard University Press, Cambridge, MA
Osborne M, Rubinstein A (1994) A course in game theory. MIT Press, Cambridge
Penta A (2009) Robust dynamic mechanism design. Technical report, University of Wisconsin, Madison
Perea A (2007) Epistemic foundations for backward induction: an overview. In: van Benthem J, Gabbay D, Löwe B (eds), Interactive logic, Proceedings of the 7th Augustus de Morgan Workshop, vol 1 of Texts in logic and games, Amsterdam University Press, Amsterdam, pp 159–193
Perea A (2007) A oneperson doxastic characterization of Nash strategies. Synthese 158:251–271
Perea A (2012) Epistemic game theory: reasoning and choice. Cambridge University Press, Cambridge
Perea A (2014) Belief in the opponents’ future rationality. Games Econ Behav 83:231–254
Polak B (1999) Epistemic conditions for Nash equilibrium, and common knowledge of rationality. Econometrica 67:673–676
Quesada A (2003) From common knowledge of rationality to backward induction. Int Game Theory Rev 5:127–137
Samet D (1996) Hypothetical knowledge and games with perfect information. Games Econ Behav 17:230–251
Samet D (2013) Common belief of rationality in games of perfect information. Games Econ Behav 79:192–200
Shackle GLS (1958) Time in economics. North Holland Publishing Company, Amsterdam
Spohn W (1977) Where Luce and Krantz do really generalize Savage’s decision model. Erkenntnis 11:113–134
Spohn W (1999) Strategic rationality, volume 24 of Forschungsberichte der DFG-Forschergruppe Logik in der Philosophie. Konstanz University
Stalnaker R (1968) A theory of conditionals. In: Rescher N (ed) Studies in logical theory. Blackwell, Oxford, pp 98–112
Stalnaker R (1996) Knowledge, belief and counterfactual reasoning in games. Econ Philos 12:133–163
Stalnaker R (1998) Belief revision in games: forward and backward induction. Math Soc Sci 36:31–56
Acknowledgements
I am grateful to two anonymous referees for helpful and constructive comments.
A Proofs
Before proving Proposition 1 we introduce some notation and a definition.
Let G be a perfect-information game and \(\sigma \) a pure-strategy profile of G. Let \(f_{\sigma } : H \rightarrow Z\) (recall that H is the set of histories and Z is the set of terminal histories) be defined as follows: if \(z\in Z\) then \(f_{\sigma }(z)=z\) and if \(h\in D\) (recall that D is the set of decision histories) then \(f_{\sigma }(h)\) is the terminal history reached from h by following the choices prescribed by \(\sigma \).
Definition 11
Let G be a perfect-information game and \(\sigma \) a pure-strategy profile of G. The model of G generated by \(\sigma \) is the following model:
\(\varOmega = Z\).
\(\zeta : Z \rightarrow Z\) is the identity function: \(\zeta (z) = z, \forall z\in Z\).
For every \(h\in D\), \({\mathcal {B}}_h \subseteq Z\times Z\) is defined as follows: \({\mathcal {B}}_h (z)\ne \varnothing \) if and only if \(h\prec z\) and \(z^{\prime }\in {\mathcal {B}}_h (z)\) if and only if \(z^{\prime }=f_{\sigma }(ha)\) for some \(a\in A(h)\) (recall that A(h) is the set of actions available at h). That is, the active player at decision history h believes that if she takes action a then the outcome will be the terminal history reached from ha by \(\sigma \).
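The map \(f_{\sigma }\) and the generated beliefs can be sketched in code. A minimal Python version under a hypothetical encoding (histories as tuples of actions, `children[h]` the actions available at decision history h, `sigma[h]` the prescribed action):

```python
# Sketch of Definition 11: the model generated by a strategy profile sigma.
def f_sigma(h, children, sigma):
    """Terminal history reached from h by following sigma."""
    while h in children:           # h is a decision history
        h = h + (sigma[h],)
    return h                       # h is terminal

def generated_beliefs(h, z, children, sigma):
    """B_h(z): empty unless h strictly precedes z; otherwise one believed
    outcome per available action at h, namely f_sigma(ha)."""
    if z[:len(h)] != h or len(z) == len(h):
        return set()
    return {f_sigma(h + (a,), children, sigma) for a in children[h]}

# Toy game: at the root choose a1 or d1; after a1, choose b1 or b2.
children = {(): ["a1", "d1"], ("a1",): ["b1", "b2"]}
sigma = {(): "a1", ("a1",): "b1"}
```

In this toy example the generated play is \(f_{\sigma }(\emptyset )=(a_1,b_1)\), and it belongs to its own belief set at the root, in line with Remark 7 below.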
Figure 8 shows an extensive form with perfect information and the model generated by the strategy profile \(\sigma = (a_1,b_1,c_1,d_1)\) (\(\sigma \) is highlighted by double edges).
Remark 7
Let G be a perfect-information game and \({\mathcal {M}}\) the model generated by a pure-strategy profile \(\sigma \) of G. Then the no-uncertainty condition (Definition 7) is satisfied at every state, that is, \({\mathbf {C}} = Z\). Furthermore, if \(z^{*}\) is the play generated by \(\sigma \) (that is, \(z^{*} = f_{\sigma }(\emptyset )\)), then \(z^{*}\in {\mathcal {B}}_h (z^{*})\) for all \(h\in D\) such that \(h\prec z^{*}\); that is, \(z^{*}\in {\mathbf {T}}\).
Proof of Proposition 1
(A) Fix a perfect-information game G (not necessarily one that satisfies the no-consecutive-moves condition) and let \(\sigma \) be a pure-strategy Nash equilibrium of G. If h is a decision history, to simplify the notation we shall write \(\sigma (h)\) instead of \(\sigma _{\iota (h)}(h)\) to denote the choice selected by \(\sigma \) at h. Consider the model generated by \(\sigma \) (Definition 11). Let \(z^{*}\) be the play generated by \(\sigma \), that is, \(z^{*} = f_{\sigma }(\emptyset )\). By Remark 7, \(z^{*}\in {\mathbf {C}}\cap {\mathbf {T}}\). Thus it only remains to show that \(z^{*}\in {\mathbf {R}}\), that is, that \(z^{*}\in {\mathbf {R}}_h\), for all \(h\in D\) such that \(h\prec z^{*}\). Fix an arbitrary \(h\in D\) such that \(h\prec z^{*}\) and let a be the action at h such that \(ha\preceq z^{*}\), that is, \(\sigma (h)=a\); then \(f_{\sigma }(ha)=f_{\sigma }(\emptyset )= z^{*}\). Suppose that \(z^{*}\notin {\mathbf {R}}_h\). Then there is an action \(b\in A(h){\setminus } \{a\}\) that guarantees a higher utility to player \(\iota (h)\), that is, if \(z^{\prime }\in {\mathcal {B}}_h (z^{*})\) is such that \(hb\preceq z^{\prime }\), then \(u_{\iota (h)}(z^{\prime })>u_{\iota (h)}(z^{*})\). By Definition 11, \(z^{\prime }=f_{\sigma }(hb)\) and thus \(u_{\iota (h)}(f_{\sigma }(hb))>u_{\iota (h)}(f_{\sigma }(ha))\) so that by unilaterally changing his strategy at h from a to b (and leaving the rest of his strategy unchanged), player \(\iota (h)\) can increase his payoff, contradicting the assumption that \(\sigma \) is a Nash equilibrium.
(B) Let G be a perfect-information game that satisfies the no-consecutive-moves condition (Definition 8) and consider a model of it where there is a state \(\alpha \) such that \(\alpha \in {\mathbf {T}}\cap {\mathbf {R}}\cap {\mathbf {C}}\). We need to construct a pure-strategy Nash equilibrium \(\sigma \) of G such that \(f_{\sigma }(\emptyset )=\zeta (\alpha )\).
Step 1. For every \(h\in D\) such that \(h\prec \zeta (\alpha )\), let \(\sigma (h)=a\) where \(a\in A(h)\) is the action at h such that \(ha\preceq \zeta (\alpha )\).
Step 2. Fix an arbitrary \(h\in D\) such that \(h\prec \zeta (\alpha )\) and an arbitrary \(b\in A(h)\) such that \(b\ne \sigma (h)\) (\(\sigma (h)\) was defined in Step 1). Since \(\alpha \in {\mathbf {C}}\), for every \(\omega ,\omega ^{\prime }\in {\mathcal {B}}_h(\alpha )\) such that \(hb\preceq \zeta (\omega )\) and \(hb\preceq \zeta (\omega ^{\prime })\), \(\zeta (\omega )=\zeta (\omega ^{\prime })\). Select an arbitrary \(\omega \in {\mathcal {B}}_h(\alpha )\) such that \(hb\preceq \zeta (\omega )\) and define, for every \(h^{\prime }\in D\) such that \(hb\preceq h^{\prime }\prec \zeta (\omega )\), \(\sigma (h^{\prime })=c\) where \(c\in A(h^{\prime })\) is the action at \(h^{\prime }\) such that \(h^{\prime } c\preceq \zeta (\omega )\).
So far we have defined the choices prescribed by \(\sigma \) along the play to \(\zeta (\alpha )\) and for paths at one-step deviations from this play. This is illustrated in Fig. 9, where \({\mathbf {T}}\cap {\mathbf {R}}\cap {\mathbf {C}}=\{\delta \}\) (footnote 47). Focusing on state \(\delta \), the above two steps yield the following partial strategy profile (which is highlighted by double edges). By Step 1, \(\sigma (\emptyset )=a_1, \sigma (a_1)=b_2\) and, by Step 2, \(\sigma (a_2)=d_1, \sigma (a_1b_1)=c_1, \sigma (a_1b_1c_1)=e_1\), while \(\sigma (a_2d_2)\) and \(\sigma (a_1b_1c_2)\) are left undefined by Steps 1 and 2.
Step 3. Complete \(\sigma \) in an arbitrary way (footnote 48).
Because of Step 1, \(\zeta (\alpha )=f_{\sigma }(h)\), for every \(h\preceq \zeta (\alpha )\). We want to show that \(\sigma \) is a Nash equilibrium. Suppose not. Then there is a decision history h with \(h\prec \zeta (\alpha )\) such that, by changing her choice at h from \(\sigma (h)\) to a different choice, player \(\iota (h)\) can increase her payoff (recall the assumption that the game satisfies the no-consecutive-moves condition and thus there are no successors of h that belong to player \(\iota (h)\)). Let \(\sigma (h)=a\) (thus \(ha\preceq \zeta (\alpha )\)) and let b be the choice at h that yields a higher payoff to player \(\iota (h)\); that is,
\(u_{\iota (h)}(f_{\sigma }(hb))>u_{\iota (h)}(\zeta (\alpha )). \qquad (2)\)
Let \(\omega \in {\mathcal {B}}_h(\alpha )\) be such that \(hb\preceq \zeta (\omega )\) (such an \(\omega \) exists by Point 4 of Definition 2). Since \(\alpha \in {\mathbf {C}}\), for every \(\omega ^{\prime }\in {\mathcal {B}}_h(\alpha )\) such that \(hb\preceq \zeta (\omega ^{\prime })\), \(\zeta (\omega )=\zeta (\omega ^{\prime })\). By Step 2 above,
\(\zeta (\omega )=f_{\sigma }(hb). \qquad (3)\)
It follows from (3) that, at state \(\alpha \) and history h, player \(\iota (h)\) believes that if she plays b her payoff will be \(u_{\iota (h)}(f_{\sigma }(hb))\). Since \(\alpha \in {\mathbf {T}}\), \(\alpha \in {\mathcal {B}}_h(\alpha )\), and since \(\alpha \in {\mathbf {C}}\), for every \(\omega ^{\prime }\in {\mathcal {B}}_h(\alpha )\) such that \(ha\preceq \zeta (\omega ^{\prime })\), \(\zeta (\omega ^{\prime })=\zeta (\alpha )\). Thus, at state \(\alpha \) and history h, player \(\iota (h)\) believes that if she plays a her payoff will be \(u_{\iota (h)}(\zeta (\alpha ))\). It follows from this and (2) that at \(\alpha \) and h player \(\iota (h)\) believes that action b is better than action a, which implies that \(\alpha \notin {\mathbf {R}}_h\), contradicting the assumption that \(\alpha \in {\mathbf {R}}\subseteq {\mathbf {R}}_h\). \(\square \)
Before proving Proposition 2 we need to define the length of a game.
Definition 12
The length of a history h, denoted by L(h), is defined recursively as follows: \(L(\emptyset )=0\) and, for every \(a\in A(h)\), \(L(ha)=L(h)+1\); thus the length of history h is the number of actions in h. The length of a game, denoted by \(\ell \), is the length of a longest history in the game: \(\ell = \mathop {\max }\limits _{h \in H} \{ L(h)\}\).
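The recursion in Definition 12 is straightforward to transcribe; a small Python sketch with histories encoded as tuples of actions:

```python
# Definition 12 in code: L(empty) = 0 and L(ha) = L(h) + 1, so the length
# of a history is its number of actions; the length of the game is the
# length of a longest history.
def length(h):
    return 0 if h == () else length(h[:-1]) + 1

def game_length(histories):
    return max(length(h) for h in histories)
```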
Proof of Proposition 2
(A) Fix a perfect-information game G and let the pure-strategy profile \(\sigma \) be a backward-induction solution of G (that is, a possible output of the backward-induction algorithm). Consider the model generated by \(\sigma \) (Definition 11); then—by construction—for every terminal history z and every decision history h such that \(h\prec z\),
\({\mathcal {B}}_{h}(z)=\{f_{\sigma }(ha):a\in A(h)\}. \qquad (4)\)
It follows from this that (footnote 49)
\(f_{\sigma }(h)\in {\mathbf {T}}_{h}, \text { for every } h\in D, \qquad (5)\)
and (footnote 50)
\({\mathbf {C}}_{h}=Z, \text { for every } h\in D. \qquad (6)\)
Let \(z^{*}\) be the play generated by \(\sigma \), that is, \(z^{*} = f_{\sigma }(\emptyset )\). Since every backward-induction solution is a Nash equilibrium, it follows from Part A of Proposition 1 that \(z^{*}\in {\mathbf {T}}\cap {\mathbf {R}}\cap {\mathbf {C}}\). Let \(\ell \) be the length of the game. If \(\ell = 1\) there is nothing further to prove. Assume, therefore, that \(\ell \ge 2\). We need to show that \(z^{*}\in {\mathbf {I}}_{TRC}\), that is, that, for every decision history \(h=a_1 \ldots a_m\) (\(m \ge 1\)) and for every sequence \(\langle z_0,z_1,\ldots ,z_m\rangle \) that leads from \(z^{*}\) to h (see Definition 9), \(z_m\in {\mathbf {T}}_h\cap {\mathbf {R}}_h\cap {\mathbf {C}}_h\); however, by (6), we only need to show that \(z_m\in {\mathbf {T}}_h\cap {\mathbf {R}}_h\). Let \(h=a_1 \ldots a_m\) (\(m\ge 1\)) be a decision history and let \(\langle z_0,z_1,\ldots ,z_m\rangle \) be a sequence that leads from \(z^{*}\) to h (such a sequence exists: see Remark 3). By Definition 9, \(z_m\in {\mathcal {B}}_{a_1 \ldots a_{m-1}}(z_{m-1})\), so that, by (4), \(z_m=f_{\sigma }(a_1 \ldots a_{m-1}b)\) for some \(b\in A(a_1 \ldots a_{m-1})\); hence \(a_1 \ldots a_{m-1}b\preceq z_m\). Again by Definition 9, \(a_1 \ldots a_{m-1}a_m=h\preceq z_m\) and thus \(b=a_m\) so that
\(z_m=f_{\sigma }(h). \qquad (7)\)
Hence, by (5), \(z_m\in {\mathbf {T}}_h\).
Let \(a\in A(h)\) be the action taken at h at state \(z_m\) (that is, \(ha\preceq z_m\)). It follows from (7) that \(a=\sigma (h)\). Furthermore, by (4), for every \(z^{\prime }\in {\mathcal {B}}_h(z_m)\), \(z^{\prime }=f_{\sigma }(ha^{\prime })\) for some \(a^{\prime }\in A(h)\). Hence at state \(z_m\) and history h player \(\iota (h)\) believes that after any choice \(a^{\prime }\) at h the outcome will be the one generated by \(\sigma \) starting from \(ha^{\prime }\) (that is, the backward-induction outcome induced by \(\sigma \) in the subtree that starts at history \(ha^{\prime }\)). Furthermore, by (7), the action that she takes at h is \(\sigma (h)\), the backward-induction choice prescribed by \(\sigma \). Hence player \(\iota (h)\) is rational at h and \(z_m\), that is, \(z_m\in {\mathbf {R}}_h\).
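The function \(f_{\sigma }\) used throughout Part A (the play generated by \(\sigma \) starting from a history) can be sketched as follows. This is an illustrative encoding, not the paper's: `moves` maps each decision history to its available actions (terminal histories are absent from `moves`), and a pure-strategy profile `sigma` assigns one action to each decision history.

```python
# A minimal sketch, assuming histories are tuples of actions; the names
# `moves`, `sigma` and `f_sigma` are illustrative, not from the paper.

def f_sigma(h, moves, sigma):
    """The play generated by sigma starting from history h: follow the
    action prescribed by sigma at each decision history until a terminal
    history is reached."""
    while h in moves:            # h is a decision history
        h = h + (sigma[h],)      # append the prescribed action
    return h                     # terminal history reached

# Example: root () with actions a/b; after a, a further choice between c/d.
moves = {(): ("a", "b"), ("a",): ("c", "d")}
sigma = {(): "a", ("a",): "d"}
# f_sigma((), moves, sigma) follows sigma: () -> (a,) -> (a, d).
```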
(B) Let G be a perfect-information game. Consider a model of G and a state \(\alpha \) in that model such that \(\alpha \in ({\mathbf {T}}\cap {\mathbf {R}}\cap {\mathbf {C}})\cap {\mathbf {I}}_{TRC}\). We want to show that \(\zeta (\alpha )\) is a backward-induction outcome. Let \(\ell \) be the length of the game. If \(\ell = 1\) then every successor of \(\emptyset \) (the root of the tree) is a terminal history. Hence, since \(\alpha \in {\mathbf {R}}\subseteq {\mathbf {R}}_{\emptyset }\), the action chosen at \(\emptyset \) at state \(\alpha \) maximizes player \(\iota (\emptyset )\)’s payoff and thus is a backward-induction choice. Assume, therefore, that \(\ell \ge 2\).
Step 1. First we show that, at every decision history of length \(\ell - 1\) that is reachable from \(\alpha \), the action chosen there is a backward-induction choice. Fix an arbitrary decision history \(h=a_1 \ldots a_{\ell -1}\) of length \(\ell -1\) (thus every successor of h is a terminal history) and let \(\langle \omega _0,\omega _1, \ldots ,\omega _{\ell -1}\rangle \) be a sequence in \(\varOmega \) that leads from \(\alpha \) to h (such a sequence exists: see Remark 3), that is, (1) \(\omega _0=\alpha \), (2) for every \(i=1,\ldots ,\ell -1\), \(a_1 \ldots a_i \prec \zeta (\omega _i)\), (3) \(\omega _1\in {\mathcal {B}}_{\emptyset }(\alpha )\) and, for every \(i=2,\ldots ,\ell -1\), \(\omega _i\in {\mathcal {B}}_{a_1 \ldots a_{i-1}}(\omega _{i-1})\). Since, by hypothesis, \(\alpha \in {\mathbf {I}}_{TRC}\), \(\omega _{\ell -1}\in {\mathbf {R}}_{h}\) and thus if b is the action taken at history h at state \(\omega _{\ell -1}\) (that is, \(\zeta (\omega _{\ell -1})=hb\)), then b maximizes the payoff of player \(\iota (h)\), that is, b is a backward-induction choice at h.
Step 2. Next we show that, at every decision history h of length \(\ell - 2\), the active player believes that, for every \(a\in A(h)\), if ha is a decision history then the action chosen at ha is a backward-induction action. Fix an arbitrary decision history \(h=a_1 \ldots a_{\ell -2}\) of length \(\ell -2\) and let \(\langle \omega _0,\omega _1, \ldots ,\omega _{\ell -2}\rangle \) be a sequence in \(\varOmega \) that leads from \(\alpha \) to h (see Remark 3). Let \(a\in A(h)\) be such that ha is a decision history and let \(\omega \in {\mathcal {B}}_h(\omega _{\ell -2})\) be such that \(ha\preceq \zeta (\omega )\) (such an \(\omega \) exists by Point 4 of Definition 2). Then the sequence \(\langle \omega _0,\omega _1, \ldots ,\omega _{\ell -2},\omega \rangle \) reaches ha from \(\alpha \) and thus, by Step 1, the action chosen by the active player at ha is a backward-induction action (that is, if \(\zeta (\omega )=hab\), with \(b\in A(ha)\), then b is a backward-induction choice at ha). Furthermore, since \(\alpha \in {\mathbf {I}}_{TRC}\), \(\omega _{\ell -2}\in {\mathbf {C}}_{h}\) and thus, for every other \(\omega ^{\prime }\in {\mathcal {B}}_h(\omega _{\ell -2})\) such that \(ha\preceq \zeta (\omega ^{\prime })\), \(\zeta (\omega ^{\prime })=\zeta (\omega )\); thus, at h and \(\omega _{\ell -2}\), player \(\iota (h)\) believes that if she takes action a at h then the ensuing outcome is the backward-induction outcome \(\zeta (\omega )\). From \(\alpha \in {\mathbf {I}}_{TRC}\) it also follows that \(\omega _{\ell -2}\in {\mathbf {R}}_{h}\) and thus the action chosen by player \(\iota (h)\) at h at state \(\omega _{\ell -2}\) is optimal given her beliefs that after every choice a at h the outcome following ha is a backward-induction outcome.
Finally, from \(\alpha \in {\mathbf {I}}_{TRC}\) it follows that \(\omega _{\ell -2}\in {\mathbf {T}}_{h}\), so that \(\omega _{\ell -2}\in {\mathcal {B}}_h(\omega _{\ell -2})\) and thus player \(\iota (h)\) has correct beliefs at h and at state \(\omega _{\ell -2}\) about the outcome following the action actually taken at h and at \(\omega _{\ell -2}\) (that is, if \(\hat{a}\) is such that \(h\hat{a}\preceq \zeta (\omega _{\ell -2})\) then player \(\iota (h)\) believes that if she takes action \(\hat{a}\) then the outcome will be the backward-induction outcome \(\zeta (\omega _{\ell -2})\)). Thus \(\zeta (\omega _{\ell -2})\) is a backward-induction outcome in the subtree that starts at history h.
Step 3. Iterate the argument of Step 2 backwards to conclude that if \(a\in A(\emptyset )\) is a decision history of length 1 that is reachable from \(\alpha \) via a sequence of the form \(\langle \alpha ,\beta \rangle \), then \(\zeta (\beta )\) is a backward-induction outcome in the subtree that starts at history a.
Step 4. Use the fact that \(\alpha \in {\mathbf {T}}_{\emptyset }\cap {\mathbf {C}}_{\emptyset }\) to conclude that at state \(\alpha \) and history \(\emptyset \) player \(\iota (\emptyset )\) has correct and certain beliefs about the outcomes following decision histories in \(A(\emptyset )\) and thus, using the fact that \(\alpha \in {\mathbf {R}}_{\emptyset }\) and the conclusion of Step 3, deduce that the action taken at state \(\alpha \) by \(\iota (\emptyset )\) is a backward-induction action, so that \(\zeta (\alpha )\) is a backward-induction outcome. \(\square \)
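The backward-induction algorithm that the proof characterizes can be sketched on the same kind of tuple encoding of histories. All names here are illustrative, not the paper's: `moves` maps each decision history to its available actions, `player` maps it to the index of the active player, and `payoff[z]` is the payoff vector at terminal history z. Ties are broken in favour of the first maximizing action, so when there are multiple backward-induction solutions only one of them is returned.

```python
# A minimal sketch of the backward-induction algorithm under the stated
# assumptions (illustrative encoding, not the paper's formalism).

def backward_induction(h, moves, player, payoff):
    """One backward-induction solution of the subtree rooted at history h.
    Returns (sigma, z, pay): a choice for every decision history in the
    subtree (on and off the induced play), the induced play z, and its
    payoff vector pay."""
    if h not in moves:                         # terminal history
        return {}, h, payoff[h]
    i = player[h]
    sigma, best = {}, None
    for a in moves[h]:
        sub, z, pay = backward_induction(h + (a,), moves, player, payoff)
        sigma.update(sub)                      # keep off-path choices too
        if best is None or pay[i] > best[1][i]:
            best = (a, pay, z)
    a, pay, z = best
    sigma[h] = a                               # the choice at h itself
    return sigma, z, pay

# Example: player 1 (index 0) moves at the root; after action a, player 2
# (index 1) chooses between c and d.  Payoffs at terminal histories:
moves = {(): ("a", "b"), ("a",): ("c", "d")}
player = {(): 0, ("a",): 1}
payoff = {("a", "c"): (2, 1), ("a", "d"): (0, 0), ("b",): (1, 2)}
sigma, z, pay = backward_induction((), moves, player, payoff)
# Player 2 picks c at (a,), so player 1 prefers a: the backward-induction
# outcome is (a, c) with payoffs (2, 1).
```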
Bonanno, G. Behavior and deliberation in perfect-information games: Nash equilibrium and backward induction. Int J Game Theory 47, 1001–1032 (2018). https://doi.org/10.1007/s00182-017-0595-5
Keywords
 Perfect-information game
 Behavioral model
 Nash equilibrium outcome
 Backward-induction outcome