
Behavior and deliberation in perfect-information games: Nash equilibrium and backward induction


Abstract

Doxastic characterizations of the set of Nash equilibrium outcomes and of the set of backward-induction outcomes are provided for general perfect-information games (where there may be multiple backward-induction solutions). We use models that are behavioral, rather than strategy-based, where a state only specifies the actual play of the game and not the hypothetical choices of the players at nodes that are not reached by the actual play. The analysis is completely free of counterfactuals and no belief revision theory is required, since only the beliefs at reached histories are specified.


Notes

  1. See, for example, Aumann (1995), Balkenborg and Winter (1997), Ben-Porath (1997), Bonanno (2013), Clausing (2003), Halpern (2001), Perea (2012, 2014), Quesada (2003), Samet (1996, 2013), Stalnaker (1998). Surveys of the literature on the epistemic foundations of backward induction are provided in Brandenburger (2007), Perea (2007a) and Perea (2012, p. 463).

  2. If one identifies histories with nodes in the tree, then \(h\prec h^{\prime }\) means that node h is a predecessor of node \(h^{\prime }\).

  3. Behavioral models were first introduced in Samet (1996).

  4. See also Gilboa (1999), Ginet (1962), Goldman (1970), Ledwig (2005), Spohn (1977), Spohn (1999).

  5. See, for example, Aumann (1995), Aumann (1998), Battigalli et al. (2013), Samet (1996).

  6. For simplicity, the characterization is provided for games where no player moves more than once along any play, but we explain how to extend the result to general games. The words ‘outcome’, ‘play’ and ‘terminal history’ will be used interchangeably.

  7. This characterization is not restricted to games where no player moves more than once along any play.

  8. As is customary, we take \(\omega {\mathcal {B}}\,\omega ^{\prime }\) and \((\omega ,\omega ^{\prime })\in {\mathcal {B}}\) as interchangeable.

  9. For more details see Battigalli and Bonanno (1999).

  10. Thus it would be more precise to write \({\mathcal {B}}_{\iota (h)}\) instead of \({\mathcal {B}}_{h}\), but we have chosen the lighter notation since there is no ambiguity: at every decision history there is a unique active player.

  11. For a critical analysis of the use of counterfactuals in dynamic games see Bonanno (2015).

  12. This issue is further discussed in Sect. 7.3.

  13. The root of the tree corresponds to the null history \(\emptyset \), Player 2’s decision node corresponds to history \(a_{1}\), Player 3’s decision node to history \(a_{1}a_{2}\) and Player 1’s last decision node to history \(a_{1}a_{2}a_{3}\).

  14. In other words, for any two states \(\omega \) and \(\omega ^{\prime }\) that are enclosed in a rounded rectangle, \(\{(\omega ,\omega ),(\omega ,\omega ^{\prime }),(\omega ^{\prime },\omega ),(\omega ^{\prime },\omega ^{\prime })\}\subseteq {\mathcal {B}}\) (that is, the relation is total on the set of states contained in the rectangle) and if there is an arrow from a state \( \omega \) to a rounded rectangle then, for every \(\omega ^{\prime }\) in the rectangle, \((\omega ,\omega ^{\prime })\in {\mathcal {B}}\).

  15. Thus \({\mathcal {B}}_{\emptyset }(\omega )=\{\alpha ,\beta ,\gamma \}\) for every \(\omega \in \varOmega =\{\alpha ,\beta ,\gamma ,\delta ,\epsilon \}\), \({\mathcal {B}}_{a_{1}}(\omega )=\{\beta ,\gamma \}\) for every \(\omega \in \{\beta ,\gamma ,\delta ,\epsilon \}\), \({\mathcal {B}}_{a_{1}a_{2}}(\omega )=\{\gamma , \delta \}\) for every \(\omega \in \{\gamma , \delta , \epsilon \}\) and \({\mathcal {B}}_{a_{1}a_{2}a_{3}}(\omega )=\{\delta ,\epsilon \}\) for every \(\omega \in \{\delta ,\epsilon \}.\)

  16. Note that rationality in the traditional sense of expected utility maximization implies rationality in our sense; thus anything that is implied by our weak notion will also be implied by the stronger notion of expected utility maximization. On the other hand, our notion has the advantage that it does not rely on the assumption of von Neumann–Morgenstern preferences: the utility functions can be just ordinal utility functions.

  17. In fact, \({\mathbf {R}}_{\emptyset } = {\mathbf {R}}_{a_1}=\{\alpha ,\beta \}\).

  18. In this model \({\mathbf {T}}_{\emptyset } =\{\alpha ,\beta ,\gamma \}, {\mathbf {T}}_{a_1} =\{\alpha ,\beta \}, {\mathbf {T}}_{a_2} =\{\gamma ,\delta \}, {\mathbf {R}}_{\emptyset } = \varOmega , {\mathbf {R}}_{a_1} = \{\beta \}\) and \({\mathbf {R}}_{a_2} = \{\delta \}\).

  19. If Player 2’s strategy selects choice \(b_1\) at decision history \(a_1\), then Player 1’s best reply is to play \(a_2\) rather than \(a_1\).

  20. Because, for every \(\omega \in \varOmega \), \(\alpha ,\beta \in {\mathcal {B}}_{\emptyset }(\omega ), \ a_1\prec \zeta (\alpha ), \ a_1\prec \zeta (\beta )\) and \(\zeta (\alpha )=a_1b_2\ne \zeta (\beta )=a_1b_1\).

  21. The so-called agent form of a game is obtained by treating a player at different decision histories as different players with the same payoff function. Thus the agent form of a game satisfies the no-consecutive-moves condition (but the latter is a weaker condition). Several papers in the literature on the epistemic foundations of backward induction in perfect-information games restrict attention to games in agent form (see, for example, Balkenborg and Winter 1997; Stalnaker 1998).

  22. At the end of this section we discuss how this restriction can be relaxed.

  23. In this model, \({\mathbf {T}}=\{\alpha ,\beta \},{\mathbf {R}}=\{\alpha \}\) and \({\mathbf {C}}=\varOmega \), so that \({\mathbf {T}}\cap {\mathbf {R}}\cap {\mathbf {C}}=\{\alpha \}\).

  24. \(\beta \in {\mathcal {B}}_{\emptyset }(\alpha ), a_1\prec \zeta (\beta )=a_1d_2, \gamma \in {\mathcal {B}}_{a_1}(\beta ), a_1a_2\prec \zeta (\gamma )=a_1a_2d_3, \delta \in {\mathcal {B}}_{a_1a_2}(\gamma )\) and \(a_1a_2a_3\prec \zeta (\delta )=a_1a_2a_3d_4\). Another sequence that leads from \(\alpha \) to \(a_1a_2a_3\) is \(\langle \alpha ,\gamma ,\gamma ,\delta \rangle \).

  25. Proof. Let \(h=a_1 \ldots a_m\) (\(m\ge 1\)). By Point 1 of Definition 2, \({\mathcal {B}}_{\emptyset }(\omega ) \ne \varnothing \) (since \(\emptyset \) is a prefix of every history, in particular of history \(\zeta (\omega )\)). Hence, since \(a_1\in A(\emptyset )\), by Point 4 of Definition 2 there exists an \(\omega _1\in {\mathcal {B}}_{\emptyset }(\omega )\) such that \(a_1\preceq \zeta (\omega _1)\). Thus, since \(a_2\in A(a_1)\), by Point 4 of Definition 2, there exists an \(\omega _2\in {\mathcal {B}}_{a_1}(\omega _1)\) such that \(a_1a_2\preceq \zeta (\omega _2)\), etc.

  26. A perfect-information game has no relevant ties if, \(\forall i\in N\), \(\forall h\in D_{i}\), \(\forall a,a^{\prime }\in A(h)\) with \(a\ne a^{\prime } \), \(\forall z,z^{\prime }\in Z\), if ha is a prefix of z and \(ha^{\prime } \) is a prefix of \(z^{\prime }\) then \(u_{i}(z)\ne u_{i}(z^{\prime })\). All games in generic position satisfy this condition.

  27. Proof. It is clear that \(({\mathbf {T}}\cap {\mathbf {R}}\cap {\mathbf {C}})\cap {\mathbf {I}}_{TRC} \subseteq ({\mathbf {T}}_{\emptyset }\cap {\mathbf {R}}_{\emptyset }\cap {\mathbf {C}}_{\emptyset })\cap {\mathbf {I}}_{TRC}\) since \({\mathbf {T}}\cap {\mathbf {R}}\cap {\mathbf {C}}\subseteq {\mathbf {T}}_{\emptyset } \cap {\mathbf {R}}_{\emptyset }\cap {\mathbf {C}}_{\emptyset }\). To prove the converse, let \(\omega \in ({\mathbf {T}}_{\emptyset }\cap {\mathbf {R}}_{\emptyset }\cap {\mathbf {C}}_{\emptyset })\cap {\mathbf {I}}_{TRC}\) and let \(h=a_1 \ldots a_m\) (\(m\ge 1\)) be a decision history such that \(h\prec \zeta (\omega )\); we need to show that \(\omega \in {\mathbf {T}}_{h}\cap {\mathbf {R}}_{h}\cap {\mathbf {C}}_{h}\). Since \(\omega \in {\mathbf {I}}_{TRC}\), it will be sufficient to show that h is reachable from \(\omega \) via the constant sequence \(\langle \omega _0,\omega _1,\ldots ,\omega _m\rangle \) with \(\omega _i=\omega \) for every \(i=0,1,\ldots ,m\). Point (1) of Definition 9 is trivially true and Point (2) follows from the hypothesis that \(h\prec \zeta (\omega )\) and the fact that, for every \(i=1,\ldots ,m-1\), \(a_1 \ldots a_i\prec h\). As for Point (3), we have, first of all, that \(\omega _1=\omega \in {\mathcal {B}}_{\emptyset }(\omega _0)\) (with \(\omega _0=\omega \)) because \(\omega \in {\mathbf {T}}_{\emptyset }\). Thus \(a_1\) is reachable from \(\omega \) through the sequence \(\langle \omega ,\omega \rangle \) and hence, since \(\omega \in {\mathbf {I}}_{TRC}\), \(\omega \in {\mathbf {T}}_{a_1}\), that is, \(\omega \in {\mathcal {B}}_{a_1}(\omega )\). It follows that \(a_1a_2\) is reachable from \(\omega \) through the sequence \(\langle \omega ,\omega ,\omega \rangle \) and hence, since \(\omega \in {\mathbf {I}}_{TRC}\), \(\omega \in {\mathbf {T}}_{a_1a_2}\), that is, \(\omega \in {\mathcal {B}}_{a_1a_2}(\omega )\), and so forth.

  28. The proof is by induction. At a “last” decision node (that is, a decision node followed only by terminal nodes) there is a unique rational choice, since there are no ties. Hence, at an immediately preceding node, an active player who believes that after each of her choices the relevant player will act rationally can have no uncertainty about those future choices; since there are no relevant ties, this player too has a unique rational choice. One then extends this argument backwards through the tree by induction.

  29. The event \({\mathbf {I}}_{TR}\) is defined as in Definition 10 but without reference to the events \({\mathbf {C}}_h\): \(\omega \in {\mathbf {I}}_{TR}\) if and only if, for every decision history \(h=a_1 \ldots a_m\) (\(m\ge 1)\), and for every sequence \(\langle \omega _0,\omega _1,\ldots ,\omega _m\rangle \) leading from \(\omega \) to h, \(\omega _m\in {\mathbf {T}}_{h}\cap {\mathbf {R}}_{h}\).

  30. The interpretation of the event \({\mathbf {I}}_{TRC}\) given below in terms of “forward belief in rationality” is conceptually similar to the notion of “forward belief in material rationality” given in Perea (2007a, Definition 2.7). However, the latter definition is obtained in a class of models where the space of uncertainty is the set of the opponents’ strategies, rather than the set of terminal histories (furthermore, Perea uses the “type-space” approach rather than the state-space approach followed in this paper). The difference between the two classes of models is discussed in Sect. 7.2.

  31. That is, at state \(\alpha \) and history \(\emptyset \), player \(\iota (\emptyset )=1\) believes that at history \(a_1\) player \(\iota (a_1)=2\) will act rationally.

  32. In this model, \({\mathbf {R}}_{\emptyset }=\{\beta ,\epsilon ,\eta \}, {\mathbf {R}}_{a_1}=\{\epsilon \}, {\mathbf {R}}_{a_2}=\{\beta ,\delta \}, {\mathbf {R}}_{a_1b_1}=\{\alpha ,\eta \}, {\mathbf {R}}=\{\beta ,\epsilon \}, {\mathbf {T}}_{\emptyset }=\{\alpha ,\beta ,\delta ,\epsilon \}, {\mathbf {T}}_{a_1}=\{\epsilon ,\eta \}, {\mathbf {T}}_{a_2}=\{\beta ,\gamma ,\delta \}, {\mathbf {T}}_{a_1b_1}=\{\alpha ,\eta \}, {\mathbf {T}}=\{\beta ,\delta ,\epsilon \}, {\mathbf {C}}_{\emptyset }=\varOmega , {\mathbf {C}}_{a_1}=\{\alpha ,\epsilon ,\eta \}, {\mathbf {C}}_{a_2}=\{\beta ,\gamma ,\delta \}, {\mathbf {C}}_{a_1b_1}=\{\alpha ,\eta \}, {\mathbf {C}}=\varOmega , [a_1]=\{\alpha ,\epsilon ,\eta \}, [a_2]=\{\beta ,\gamma ,\delta \}\) and \([b_1]=\{\alpha ,\eta \}\).

  33. Note that, if state \(\omega \) and decision history h are such that h is not reached at \(\omega \) (that is, \(h\not \prec \zeta (\omega )\)), then, by Definition 2, \({\mathcal {B}}_h(\omega )=\varnothing \) and therefore \({\mathcal {B}}_h(\omega )\subseteq E\), for every event E, that is, \(\omega \in {\mathbb {B}}_hE\). For example, in the model of Fig. 7, \({\mathbb {B}}_{a_1}(\{\alpha ,\beta \})=\{\beta ,\gamma ,\delta \}\), since, for every \(\omega \in \{\beta ,\gamma ,\delta \}\), \({\mathcal {B}}_{a_1}(\omega )=\varnothing \). (A small computational sketch of this belief operator is given right after these notes.)

  34. It should also be noted that, for perfect-information games with no relevant ties, Battigalli et al. (2013) shows that in every type structure there is a unique play consistent with common strong belief of material rationality and that play is a Nash equilibrium play.

  35. Stalnaker (1968) postulates a “selection function” \(f:\varOmega \times 2^{\varOmega }\rightarrow \varOmega \) that associates with every state \(\omega \) and event E a unique state \(f(\omega ,E)\in E\), while Lewis (1973) postulates a selection function \(F:\varOmega \times 2^{\varOmega }\rightarrow 2^{\varOmega }\) that associates with every state \(\omega \) and event E a set of states \(F(\omega ,E)\subseteq E\). Stalnaker declares the proposition ‘if E then G’ true at \(\omega \) if and only if \(f(\omega ,E)\in G\), while Lewis requires that \(F(\omega ,E)\subseteq G\).

  36. For an extensive discussion of this issue see Bonanno (2015).

  37. As the author notes, Theorem 2 does not provide a full characterization of Nash equilibrium outcomes as there are Nash equilibria that are inconsistent with extensive-form rationality. However, if only normal-form rationality is assumed, that is, if one assumes that a player optimizes only with respect to her initial beliefs (and not necessarily at every node), then the conditions of Theorem 2 provide a full characterization of Nash equilibrium outcomes.

  38. Furthermore, Ben-Porath uses the “type space” approach, where a state is identified with an n-tuple of types, one for each player (n being the number of players); the type of a player specifies his strategy as well as a belief function that assigns, for every node in the tree, a probabilistic belief over the set of profiles of types of the other players. Each player is assumed to know his own type; in particular, each player knows his own strategy.

  39. Epistemic characterizations of Nash equilibrium in strategic-form games have not relied on the condition of common belief of rationality. For example, in their seminal paper Aumann and Brandenburger (1995) showed that, in games with more than two players, if there exists a common prior then mutual belief in rationality and payoffs as well as common belief in each player’s conjecture about the opponents’ strategies imply Nash equilibrium. However, Polak (1999) later showed that in complete-information games, Aumann and Brandenburger’s conditions actually do imply common belief in rationality. More recently, Barelli (2009) generalized Aumann and Brandenburger’s result by substituting the common prior assumption with the weaker property of action-consistency, and common belief in conjectures with a weaker condition stating that conjectures are constant in the support of the action-consistent distribution. Thus, he provided sufficient epistemic conditions for Nash equilibrium without requiring common belief in rationality. Later, Bach and Tsakas (2014) obtained a further generalization by introducing even weaker epistemic conditions for Nash equilibrium than those in Barelli (2009): their characterization of Nash equilibrium is based on introducing pairwise epistemic conditions imposed only on some pairs of players (contrary to the characterizations in Aumann and Brandenburger (1995) and Barelli (2009), which correspond to pairwise epistemic conditions imposed on all pairs of players). Not only do these conditions not imply common belief in rationality but they do not even imply mutual belief in rationality.

  40. See Balkenborg and Winter (1997), Baltag et al. (2009), Clausing (2003, 2004), Feinberg (2005), Perea (2014), Stalnaker (1998).

  41. The conditions for backward induction provided in Bonanno (2013) are conceptually the same as those expressed by the event \(({\mathbf {T}}\cap {\mathbf {R}})\cap {\mathbf {I}}_{TR}\) (see Remark 5).

  42. Defined as follows: players are rational and always believe in their opponents’ present and future rationality and believe that every opponent always believes in his opponents’ present and future rationality and that every opponent always believes that every other player always believes in his opponents’ present and future rationality, and so on.

  43. In the first round the algorithm eliminates, at every information set of player i, strategies of player i himself that are strictly dominated at present and future information sets, as well as strategies of players other than i that are strictly dominated at present and future information sets. In every further round k those strategies are eliminated that are strictly dominated at a present or future information set \(I_{i}\) of player i, given the opponents’ strategies that have survived up to round k at that information set \(I_{i}\). The strategies that eventually survive the elimination process constitute the output of the backward dominance procedure.

  44. Perea (2014) also suggests that, in general extensive-form games, the two notions of common belief in future rationality and sequential equilibrium reflect the difference between BIS and SPE.

  45. Aumann’s claim is that common knowledge of substantive rationality implies the backward induction solution in perfect-information games without relevant ties, while Stalnaker maintains that it does not. Roughly speaking, a player is substantively rational if, for every history h of hers, if the play of the game were to reach h, then she would be rational at h.

  46. It is worth noting that, as pointed out by Samet (2013, p. 194), while Aumann (1995) states the weaker claim that common knowledge of substantive rationality implies the backward-induction play, he actually proves that it implies the backward-induction strategies.

  47. In the model of Fig. 9 we have that \({\mathbf {R}}_{\emptyset }=\{\delta ,\epsilon ,\eta ,\theta ,\lambda \}, {\mathbf {R}}_{a_1}=\{\delta \}, {\mathbf {R}}_{a_2} =\{\alpha ,\beta \}, {\mathbf {R}}_{a_1b_1} =\{\eta ,\theta \}, {\mathbf {R}}_{a_1b_1c_1} =\{\epsilon \}, {\mathbf {R}}_{a_1b_1c_2} =\{\eta \}, {\mathbf {R}}_{a_2d_2} =\{\alpha \}\) so that \({\mathbf {R}}=\{\delta \}\). Furthermore, \({\mathbf {T}}=\{\gamma ,\delta \}\) and \({\mathbf {C}}=\{\delta ,\epsilon ,\eta ,\theta ,\lambda \}\). Hence \({\mathbf {T}}\cap {\mathbf {R}}\cap {\mathbf {C}}=\{\delta \}\).

  48. For instance, in the example of Fig. 9, one can complete the above-mentioned partial strategy profile by adding \(\sigma (a_2d_2)=g_2\) and \(\sigma (a_1b_1c_2)=f_1\) (even though \(g_2\) and \(f_1\) are “irrational” choices).

  49. Proof. Let \(h\in D\) and \(z=f_{\sigma }(h)\). Let \(a=\sigma (h)\) be the action prescribed by \(\sigma \) at h. Then \(f_{\sigma }(h)=f_{\sigma }(ha)\). By (4), \(f_{\sigma }(ha)\in {\mathcal {B}}_{h}(z)\) and thus \(z\in {\mathcal {B}}_{h}(z)\), that is, \(z\in {\mathbf {T}}_h\).

  50. Proof. Let \(h\in D\) and \(z\in Z\) be such that \(h\prec z\). Fix an arbitrary \(a\in A(h)\) and let \(z^{\prime },z^{\prime \prime }\in {\mathcal {B}}_h(z)\) be such that \(ha\preceq z^{\prime }\) and \(ha\preceq z^{\prime \prime }\). Then, by (4), \(z^{\prime }=z^{\prime \prime }=f_{\sigma }(ha)\); hence \(z\in {\mathbf {C}}_h\).
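To make the doxastic apparatus in these notes concrete, the following is a minimal sketch under an assumed encoding of our own (state and history names as strings; none of it is the paper's code): it encodes the belief correspondences \({\mathcal {B}}_h\) listed in footnote 15 and computes the belief operator \({\mathbb {B}}_h\) described in footnote 33.

```python
# A hedged sketch, not the paper's code: the beliefs of footnote 15 and the
# belief operator of footnote 33.
Omega = {"alpha", "beta", "gamma", "delta", "epsilon"}

# B[h][w] is the set of states the active player at h considers possible at
# state w; states at which h is not reached are omitted, so B_h(w) is empty.
B = {
    "":       {w: {"alpha", "beta", "gamma"} for w in Omega},
    "a1":     {w: {"beta", "gamma"} for w in Omega - {"alpha"}},
    "a1a2":   {w: {"gamma", "delta"} for w in {"gamma", "delta", "epsilon"}},
    "a1a2a3": {w: {"delta", "epsilon"} for w in {"delta", "epsilon"}},
}

def believes(h, E):
    """The event B_h E = {w : B_h(w) is a subset of E}; vacuously satisfied
    at states where B_h(w) is empty, exactly as footnote 33 points out."""
    return {w for w in Omega if B[h].get(w, set()) <= E}

# At the null history, every state believes the event {alpha, beta, gamma}.
assert believes("", {"alpha", "beta", "gamma"}) == Omega
```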

References

  • Artemov S (2010) Robust knowledge and rationality. Technical report, CUNY

  • Aumann R (1995) Backward induction and common knowledge of rationality. Games Econ Behav 8:6–19

  • Aumann R (1996) Reply to Binmore. Games Econ Behav 17:138–146

  • Aumann R (1998) On the centipede game. Games Econ Behav 23:97–105

  • Aumann R, Brandenburger A (1995) Epistemic conditions for Nash equilibrium. Econometrica 63:1161–1180

  • Bach C, Tsakas E (2014) Pairwise epistemic conditions for Nash equilibrium. Games Econ Behav 85:48–59

  • Bach CW, Heilmann C (2011) Agent connectedness and backward induction. Int Game Theory Rev 13:195–208

  • Balkenborg D, Winter E (1997) A necessary and sufficient epistemic condition for playing backward induction. J Math Econ 27:325–345

  • Baltag A, Smets S, Zvesper J (2009) Keep hoping for rationality: a solution to the backward induction paradox. Synthese 169:301–333

  • Barelli P (2009) Consistency of beliefs and epistemic conditions for Nash and correlated equilibria. Games Econ Behav 67:363–375

  • Battigalli P, Bonanno G (1999) Recent results on belief, knowledge and the epistemic foundations of game theory. Res Econ 53:149–225

  • Battigalli P, Di Tillio A, Samet D (2013) Strategies and interactive beliefs in dynamic games. In: Acemoglu D, Arellano M, Dekel E (eds) Advances in economics and econometrics. Theory and applications: tenth world congress, vol 1. Cambridge University Press, Cambridge, pp 391–422

  • Ben-Porath E (1997) Nash equilibrium and backwards induction in perfect information games. Rev Econ Stud 64:23–46

  • Binmore K (1996) A note on backward induction. Games Econ Behav 17:135–137

  • Binmore K (1997) Rationality and backward induction. J Econ Methodol 4:23–41

  • Bonanno G (2013) A dynamic epistemic characterization of backward induction without counterfactuals. Games Econ Behav 78:31–43

  • Bonanno G (2015) Reasoning about strategies and rational play in dynamic games. In: van Benthem J, Ghosh S, Verbrugge R (eds) Models of strategic reasoning. Springer, New York, pp 34–62

  • Brandenburger A (2007) The power of paradox: some recent developments in interactive epistemology. Int J Game Theory 35:465–492

  • Clausing T (2003) Doxastic conditions for backward induction. Theory Decis 54:315–336

  • Clausing T (2004) Belief revision in games of perfect information. Econ Philos 20:89–115

  • Feinberg Y (2005) Subjective reasoning—dynamic games. Games Econ Behav 52:54–93

  • Gilboa I (1999) Can free choice be known? In: Bicchieri C, Jeffrey R, Skyrms B (eds) The logic of strategy. Oxford University Press, Oxford, pp 163–174

  • Ginet C (1962) Can the will be caused? Philos Rev 71:49–55

  • Goldman A (1970) A theory of human action. Princeton University Press, Princeton

  • Halpern J (2001) Substantive rationality and backward induction. Games Econ Behav 37:425–435

  • Kaminski MM (2009) Backward induction and subgame perfection: the justification of a “folk algorithm”. Technical report, University of California, Irvine

  • Ledwig M (2005) The no probabilities for acts-principle. Synthese 144:171–180

  • Levi I (1986) Hard choices. Cambridge University Press, Cambridge

  • Levi I (1997) The covenant of reason: rationality and the commitments of thought. Cambridge University Press, Cambridge

  • Lewis D (1973) Counterfactuals. Harvard University Press, Cambridge, MA

  • Osborne M, Rubinstein A (1994) A course in game theory. MIT Press, Cambridge

  • Penta A (2009) Robust dynamic mechanism design. Technical report, University of Wisconsin, Madison

  • Perea A (2007a) Epistemic foundations for backward induction: an overview. In: van Benthem J, Gabbay D, Löwe B (eds) Interactive logic. Proceedings of the 7th Augustus de Morgan workshop, Texts in logic and games, vol 1. Amsterdam University Press, Amsterdam, pp 159–193

  • Perea A (2007b) A one-person doxastic characterization of Nash strategies. Synthese 158:251–271

  • Perea A (2012) Epistemic game theory: reasoning and choice. Cambridge University Press, Cambridge

  • Perea A (2014) Belief in the opponents’ future rationality. Games Econ Behav 83:231–254

  • Polak B (1999) Epistemic conditions for Nash equilibrium, and common knowledge of rationality. Econometrica 67:673–676

  • Quesada A (2003) From common knowledge of rationality to backward induction. Int Game Theory Rev 5:127–137

  • Samet D (1996) Hypothetical knowledge and games with perfect information. Games Econ Behav 17:230–251

  • Samet D (2013) Common belief of rationality in games of perfect information. Games Econ Behav 79:192–200

  • Shackle GLS (1958) Time in economics. North Holland Publishing Company, Amsterdam

  • Spohn W (1977) Where Luce and Krantz do really generalize Savage’s decision model. Erkenntnis 11:113–134

  • Spohn W (1999) Strategic rationality. Forschungsberichte der DFG-Forschergruppe Logik in der Philosophie, vol 24. Konstanz University

  • Stalnaker R (1968) A theory of conditionals. In: Rescher N (ed) Studies in logical theory. Blackwell, Oxford, pp 98–112

  • Stalnaker R (1996) Knowledge, belief and counterfactual reasoning in games. Econ Philos 12:133–163

  • Stalnaker R (1998) Belief revision in games: forward and backward induction. Math Soc Sci 36:31–56

Acknowledgements

I am grateful to two anonymous referees for helpful and constructive comments.

Author information

Correspondence to Giacomo Bonanno.

A Proofs

Before proving Proposition 1 we introduce some notation and a definition.

Let G be a perfect-information game and \(\sigma \) a pure-strategy profile of G. Let \(f_{\sigma } : H \rightarrow Z\) (recall that H is the set of histories and Z is the set of terminal histories) be defined as follows: if \(z\in Z\) then \(f_{\sigma }(z)=z\) and if \(h\in D\) (recall that D is the set of decision histories) then \(f_{\sigma }(h)\) is the terminal history reached from h by following the choices prescribed by \(\sigma \).

Definition 11

Let G be a perfect-information game and \(\sigma \) a pure-strategy profile of G. The model of G generated by \(\sigma \) is the following model:

  • \(\varOmega = Z\).

  • \(\zeta : Z \rightarrow Z\) is the identity function: \(\zeta (z) = z, \forall z\in Z\).

  • For every \(h\in D\), \({\mathcal {B}}_h \subseteq Z\times Z\) is defined as follows: \({\mathcal {B}}_h (z)\ne \varnothing \) if and only if \(h\prec z\) and \(z^{\prime }\in {\mathcal {B}}_h (z)\) if and only if \(z^{\prime }=f_{\sigma }(ha)\) for some \(a\in A(h)\) (recall that A(h) is the set of actions available at h). That is, the active player at decision history h believes that if she takes action a then the outcome will be the terminal history reached from ha by \(\sigma \).

Figure 8 shows an extensive form with perfect information and the model generated by the strategy profile \(\sigma = (a_1,b_1,c_1,d_1)\) (\(\sigma \) is highlighted by double edges).
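For concreteness, here is a minimal computational sketch of \(f_{\sigma }\) and of the beliefs of Definition 11, under an assumed encoding of our own (histories as tuples of actions, `children[h]` for the actions available at h, `sigma[h]` for the action the profile selects there); it is an illustration, not the paper's construction.

```python
# A hedged sketch of f_sigma and of Definition 11, in our own encoding.
def f_sigma(h, children, sigma):
    """Terminal history reached from h by following the choices of sigma."""
    while children.get(h):                 # h is a decision history
        h = h + (sigma[h],)
    return h

def generated_beliefs(h, z, children, sigma):
    """B_h(z) in the model generated by sigma: empty unless h strictly
    precedes z, in which case it equals {f_sigma(ha) : a in A(h)}."""
    if not (len(h) < len(z) and z[:len(h)] == h):
        return set()
    return {f_sigma(h + (a,), children, sigma) for a in children[h]}

# Tiny example: the root offers a1 (terminal) and a2 (a decision history).
children = {(): ["a1", "a2"], ("a2",): ["b1", "b2"]}
sigma = {(): "a1", ("a2",): "b2"}
assert f_sigma((), children, sigma) == ("a1",)
assert generated_beliefs((), ("a1",), children, sigma) == {("a1",), ("a2", "b2")}
```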

Remark 7

Let G be a perfect-information game and \({\mathcal {M}}\) the model generated by a pure-strategy profile \(\sigma \) of G. Then the no-uncertainty condition (Definition 7) is satisfied at every state, that is, \({\mathbf {C}} = Z\). Furthermore, if \(z^{*}\) is the play generated by \(\sigma \) (that is, \(z^{*} = f_{\sigma }(\emptyset )\)), then \(z^{*}\in {\mathcal {B}}_h (z^{*})\) for all \(h\in D\) such that \(h\prec z^{*}\); that is, \(z^{*}\in {\mathbf {T}}\).
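Remark 7 can then be checked mechanically. Using the hypothetical encoding sketched after Definition 11, membership in \({\mathbf {T}}_h\) and \({\mathbf {C}}_h\) (as used in footnotes 49 and 50) amounts to:

```python
def in_T_h(z, h, children, sigma):
    """z is in T_h: the beliefs held at h at state z contain z itself."""
    return z in generated_beliefs(h, z, children, sigma)

def in_C_h(z, h, children, sigma):
    """z is in C_h: for each action a at h, at most one believed state has
    ha as a prefix (no uncertainty about the outcome of any own action)."""
    beliefs = generated_beliefs(h, z, children, sigma)
    for a in children.get(h, []):
        followers = {w for w in beliefs if w[:len(h) + 1] == h + (a,)}
        if len(followers) > 1:
            return False
    return True
```

In the generated model each action a at h is followed by the single believed state \(f_{\sigma }(ha)\), which is why \({\mathbf {C}} = Z\) in Remark 7.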

Fig. 8 A game and the model generated by the strategy profile \(\sigma = (a_1,b_1,c_1,d_1)\)

Proof of Proposition 1

(A) Fix a perfect-information game G (not necessarily one that satisfies the no-consecutive-moves condition) and let \(\sigma \) be a pure-strategy Nash equilibrium of G. If h is a decision history, to simplify the notation we shall write \(\sigma (h)\) instead of \(\sigma _{\iota (h)}(h)\) to denote the choice selected by \(\sigma \) at h. Consider the model generated by \(\sigma \) (Definition 11). Let \(z^{*}\) be the play generated by \(\sigma \), that is, \(z^{*} = f_{\sigma }(\emptyset )\). By Remark 7, \(z^{*}\in {\mathbf {C}}\cap {\mathbf {T}}\). Thus it only remains to show that \(z^{*}\in {\mathbf {R}}\), that is, that \(z^{*}\in {\mathbf {R}}_h\), for all \(h\in D\) such that \(h\prec z^{*}\). Fix an arbitrary \(h\in D\) such that \(h\prec z^{*}\) and let a be the action at h such that \(ha\preceq z^{*}\), that is, \(\sigma (h)=a\); then \(f_{\sigma }(ha)=f_{\sigma }(\emptyset )= z^{*}\). Suppose that \(z^{*}\notin {\mathbf {R}}_h\). Then there is an action \(b\in A(h){\setminus } \{a\}\) that guarantees a higher utility to player \(\iota (h)\), that is, if \(z^{\prime }\in {\mathcal {B}}_h (z^{*})\) is such that \(hb\preceq z^{\prime }\), then \(u_{\iota (h)}(z^{\prime })>u_{\iota (h)}(z^{*})\). By Definition 11, \(z^{\prime }=f_{\sigma }(hb)\) and thus \(u_{\iota (h)}(f_{\sigma }(hb))>u_{\iota (h)}(f_{\sigma }(ha))\) so that by unilaterally changing his strategy at h from a to b (and leaving the rest of his strategy unchanged), player \(\iota (h)\) can increase his payoff, contradicting the assumption that \(\sigma \) is a Nash equilibrium.

(B) Let G be a perfect-information game that satisfies the no-consecutive-moves condition (Definition 8) and consider a model of it where there is a state \(\alpha \) such that \(\alpha \in {\mathbf {T}}\cap {\mathbf {R}}\cap {\mathbf {C}}\). We need to construct a pure-strategy Nash equilibrium \(\sigma \) of G such that \(f_{\sigma }(\emptyset )=\zeta (\alpha )\).

Step 1. For every \(h\in D\) such that \(h\prec \zeta (\alpha )\), let \(\sigma (h)=a\) where \(a\in A(h)\) is the action at h such that \(ha\preceq \zeta (\alpha )\).

Step 2. Fix an arbitrary \(h\in D\) such that \(h\prec \zeta (\alpha )\) and an arbitrary \(b\in A(h)\) such that \(b\ne \sigma (h)\) (\(\sigma (h)\) was defined in Step 1). Since \(\alpha \in {\mathbf {C}}\), for every \(\omega ,\omega ^{\prime }\in {\mathcal {B}}_h(\alpha )\) such that \(hb\preceq \zeta (\omega )\) and \(hb\preceq \zeta (\omega ^{\prime })\), \(\zeta (\omega )=\zeta (\omega ^{\prime })\). Select an arbitrary \(\omega \in {\mathcal {B}}_h(\alpha )\) such that \(hb\preceq \zeta (\omega )\) and define, for every \(h^{\prime }\in D\) such that \(hb\preceq h^{\prime }\prec \zeta (\omega )\), \(\sigma (h^{\prime })=c\) where \(c\in A(h^{\prime })\) is the action at \(h^{\prime }\) such that \(h^{\prime } c\preceq \zeta (\omega )\).

So far we have defined the choices prescribed by \(\sigma \) along the play to \(\zeta (\alpha )\) and for paths at one-step deviations from this play. This is illustrated in Fig. 9, where \({\mathbf {T}}\cap {\mathbf {R}}\cap {\mathbf {C}}=\{\delta \}\) (see footnote 47). Focusing on state \(\delta \), the above two steps yield the following partial strategy profile (which is highlighted by double edges). By Step 1, \(\sigma (\emptyset )=a_1, \sigma (a_1)=b_2\) and, by Step 2, \(\sigma (a_2)=d_1, \sigma (a_1b_1)=c_1, \sigma (a_1b_1c_1)=e_1\), while \(\sigma (a_2d_2)\) and \(\sigma (a_1b_1c_2)\) are left undefined by Steps 1 and 2.

Fig. 9 A game and a model of it

Step 3. Complete \(\sigma \) in an arbitrary way (see footnote 48).

Because of Step 1, \(\zeta (\alpha )=f_{\sigma }(h)\), for every \(h\preceq \zeta (\alpha )\). We want to show that \(\sigma \) is a Nash equilibrium. Suppose not. Then there is a decision history h with \(h\prec \zeta (\alpha )\) such that, by changing her choice at h from \(\sigma (h)\) to a different choice, player \(\iota (h)\) can increase her payoff (recall that the game satisfies the no-consecutive-moves condition, so that no successor of h belongs to player \(\iota (h)\)). Let \(\sigma (h)=a\) (thus \(ha\preceq \zeta (\alpha )\)) and let b be the choice at h that yields a higher payoff to player \(\iota (h)\); that is,

$$\begin{aligned} u_{\iota (h)}(f_{\sigma }(hb))>u_{\iota (h)}(\zeta (\alpha )). \end{aligned}$$
(2)

Let \(\omega \in {\mathcal {B}}_h(\alpha )\) be such that \(hb\preceq \zeta (\omega )\) (such an \(\omega \) exists by Point 4 of Definition 2). Since \(\alpha \in {\mathbf {C}}\), for every \(\omega ^{\prime }\in {\mathcal {B}}_h(\alpha )\) such that \(hb\preceq \zeta (\omega ^{\prime })\), \(\zeta (\omega )=\zeta (\omega ^{\prime })\). By Step 2 above,

$$\begin{aligned} \zeta (\omega )=f_{\sigma }(hb). \end{aligned}$$
(3)

It follows from (3) that, at state \(\alpha \) and history h, player \(\iota (h)\) believes that if she plays b her payoff will be \(u_{\iota (h)}(f_{\sigma }(hb))\). Since \(\alpha \in {\mathbf {T}}\), \(\alpha \in {\mathcal {B}}_h(\alpha )\), and since \(\alpha \in {\mathbf {C}}\), for every \(\omega ^{\prime }\in {\mathcal {B}}_h(\alpha )\) such that \(ha\preceq \zeta (\omega ^{\prime })\), \(\zeta (\omega ^{\prime })=\zeta (\alpha )\). Thus, at state \(\alpha \) and history h, player \(\iota (h)\) believes that if she plays a her payoff will be \(u_{\iota (h)}(\zeta (\alpha ))\). It follows from this and (2) that at \(\alpha \) and h player \(\iota (h)\) believes that action b is better than action a, which implies that \(\alpha \notin {\mathbf {R}}_h\), contradicting the assumption that \(\alpha \in {\mathbf {R}}\subseteq {\mathbf {R}}_h\). \(\square \)
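To make Steps 1–3 of part (B) concrete, here is a hedged transcription in the same hypothetical encoding used in the earlier sketches (with `B_at(h, w)` standing for \({\mathcal {B}}_h(\omega )\) and `zeta` for the outcome function; these names are ours, not the paper's):

```python
def strategy_from_state(alpha, zeta, B_at, children):
    """Sketch of Steps 1-3: build a profile sigma from a state alpha."""
    sigma = {}
    z_star = zeta[alpha]
    # Step 1: along the play to zeta(alpha), choose the on-path action.
    for m in range(len(z_star)):
        if children.get(z_star[:m]):
            sigma[z_star[:m]] = z_star[m]
    # Step 2: at one-step deviations b, follow a believed outcome after hb.
    for m in range(len(z_star)):
        h = z_star[:m]
        for b in children.get(h, []):
            if b == sigma.get(h):
                continue
            for w in B_at(h, alpha):
                z = zeta[w]
                if z[:m + 1] == h + (b,):      # hb is a prefix of zeta(w)
                    for k in range(m + 1, len(z)):
                        if children.get(z[:k]):
                            sigma[z[:k]] = z[k]
                    break
    # Step 3: complete sigma arbitrarily (here: the first available action).
    for h, actions in children.items():
        if actions and h not in sigma:
            sigma[h] = actions[0]
    return sigma
```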

Before proving Proposition 2 we need to define the length of a game.

Definition 12

The length of a history h, denoted by L(h), is defined recursively as follows: \(L(\emptyset )=0\) and, for every \(a\in A(h)\), \(L(ha)=L(h)+1\); thus the length of history h is the number of actions in h. The length of a game, denoted by \(\ell \), is the length of a longest history in the game: \(\ell = \mathop {\max }\limits _{h \in H} \{ L(h)\}\).
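Under the tuple encoding of the earlier sketches, this bookkeeping is immediate (a trivial but faithful illustration, with names of our choosing):

```python
def history_length(h):
    """L(h): the number of actions in history h (histories are tuples)."""
    return len(h)

def game_length(histories):
    """The length of the game: the length of a longest history."""
    return max(len(h) for h in histories)
```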

Proof of Proposition 2

(A) Fix a perfect-information game G and let the pure-strategy profile \(\sigma \) be a backward-induction solution of G (that is, a possible output of the backward-induction algorithm). Consider the model generated by \(\sigma \) (Definition 11); then—by construction—for every terminal history z and every decision history h such that \(h\prec z\),

$$\begin{aligned} \forall z^{\prime }\in Z,\ z^{\prime }\in {\mathcal {B}}_{h}(z) \text { if and only if }\ z^{\prime }=f_{\sigma }(ha) \ \text {for some } \ a\in A(h). \end{aligned}$$
(4)

It follows from this that (see footnote 49)

$$\begin{aligned} \forall h\in D,\ \text { if } z=f_{\sigma }(h)\ \text { then } z\in {\mathbf {T}}_h \end{aligned}$$
(5)

and (see footnote 50)

$$\begin{aligned} \forall z\in Z \quad \text { and }\quad \forall h\in D \quad \text {such that }\quad h\prec z, \ z\in {\mathbf {C}}_h. \end{aligned}$$
(6)

Let \(z^{*}\) be the play generated by \(\sigma \), that is, \(z^{*} = f_{\sigma }(\emptyset )\). Since every backward-induction solution is a Nash equilibrium, it follows from Part A of Proposition 1 that \(z^{*}\in {\mathbf {T}}\cap {\mathbf {R}}\cap {\mathbf {C}}\). Let \(\ell \) be the length of the game. If \(\ell = 1\) there is nothing further to prove. Assume, therefore, that \(\ell \ge 2\). We need to show that \(z^{*}\in {\mathbf {I}}_{TRC}\), that is, that, for every decision history \(h=a_1 \ldots a_m\) (\(m \ge 1\)) and for every sequence \(\langle z_0,z_1,\ldots ,z_m\rangle \) that leads from \(z^{*}\) to h (see Definition 9), \(z_m\in {\mathbf {T}}_h\cap {\mathbf {R}}_h\cap {\mathbf {C}}_h\); however, by (6), we only need to show that \(z_m\in {\mathbf {T}}_h\cap {\mathbf {R}}_h\). Let \(h=a_1 \ldots a_m\) (\(m\ge 1\)) be a decision history and let \(\langle z_0,z_1,\ldots ,z_m\rangle \) be a sequence that leads from \(z^{*}\) to h (such a sequence exists: see Remark 3). By Definition 9, \(z_m\in {\mathcal {B}}_{a_1 \ldots a_{m -1}}(z_{m-1})\), so that, by (4), \(z_m=f_{\sigma }(a_1 \ldots a_{m -1}b)\) for some \(b\in A(a_1 \ldots a_{m -1})\); hence \(a_1 \ldots a_{m -1}b\preceq z_m\). Again by Definition 9, \(a_1 \ldots a_{m -1}a_m=h\preceq z_m\) and thus \(b=a_m\) so that

$$\begin{aligned} z_m=f_{\sigma }(h). \end{aligned}$$
(7)

Hence, by (5), \(z_m\in {\mathbf {T}}_h\).

Let \(a\in A(h)\) be the action taken at h at state \(z_m\) (that is, \(ha\preceq z_m\)). It follows from (7) that \(a=\sigma (h)\). Furthermore, by (4), for every \(z^{\prime }\in {\mathcal {B}}_h(z_m)\), \(z^{\prime }=f_{\sigma }(ha^{\prime })\) for some \(a^{\prime }\in A(h)\). Hence at state \(z_m\) and history h player \(\iota (h)\) believes that after any choice \(a^{\prime }\) at h the outcome will be the one generated by \(\sigma \) starting from \(ha^{\prime }\) (that is, the backward-induction outcome induced by \(\sigma \) in the subtree that starts at history \(ha^{\prime }\)). Furthermore, by (7), the action that she takes at h is \(\sigma (h)\), the backward-induction choice prescribed by \(\sigma \). Hence player \(\iota (h)\) is rational at h and \(z_m\), that is, \(z_m\in {\mathbf {R}}_h\).

(B) Let G be a perfect-information game. Consider a model of G and a state \(\alpha \) in that model such that \(\alpha \in ({\mathbf {T}}\cap {\mathbf {R}}\cap {\mathbf {C}})\cap {\mathbf {I}}_{TRC}\). We want to show that \(\zeta (\alpha )\) is a backward-induction outcome. Let \(\ell \) be the length of the game. If \(\ell = 1\) then every successor of \(\emptyset \) (the root of the tree) is a terminal history. Hence, since \(\alpha \in {\mathbf {R}}\subseteq {\mathbf {R}}_{\emptyset }\), the action chosen at \(\emptyset \) at state \(\alpha \) maximizes player \(\iota (\emptyset )\)’s payoff and thus is a backward-induction choice. Assume, therefore, that \(\ell \ge 2\).

Step 1. First we show that, at every decision history of length \(\ell - 1\) that is reachable from \(\alpha \), the action chosen there is a backward-induction choice. Fix an arbitrary decision history \(h=a_1 \ldots a_{\ell -1}\) of length \(\ell -1\) (thus every successor of h is a terminal history) and let \(\langle \omega _0,\omega _1, \ldots ,\omega _{\ell -1}\rangle \) be a sequence in \(\varOmega \) that leads from \(\alpha \) to h (such a sequence exists: see Remark 3), that is, (1) \(\omega _0=\alpha \), (2) for every \(i=1,\ldots ,\ell -1\), \(a_1 \ldots a_i \prec \zeta (\omega _i)\), (3) \(\omega _1\in {\mathcal {B}}_{\emptyset }(\alpha )\) and, for every \(i=2,\ldots ,\ell -1\), \(\omega _i\in {\mathcal {B}}_{a_1 \ldots a_{i-1}}(\omega _{i-1})\). Since, by hypothesis, \(\alpha \in {\mathbf {I}}_{TRC}\), \(\omega _{\ell -1}\in {\mathbf {R}}_{h}\) and thus if b is the action taken at history h at state \(\omega _{\ell -1}\) (that is, \(\zeta (\omega _{\ell -1})=hb\)), then b maximizes the payoff of player \(\iota (h)\), that is, b is a backward-induction choice at h.

Step 2. Next we show that, at every decision history h of length \(\ell - 2\), the active player believes that, for every \(a\in A(h)\), if ha is a decision history then the action chosen at ha is a backward-induction action. Fix an arbitrary decision history \(h=a_1 \ldots a_{\ell -2}\) of length \(\ell -2\) and let \(\langle \omega _0,\omega _1, \ldots ,\omega _{\ell -2}\rangle \) be a sequence in \(\varOmega \) that leads from \(\alpha \) to h (see Remark 3). Let \(a\in A(h)\) be such that ha is a decision history and let \(\omega \in {\mathcal {B}}_h(\omega _{\ell -2})\) be such that \(ha\preceq \zeta (\omega )\) (such an \(\omega \) exists by Point 4 of Definition 2). Then the sequence \(\langle \omega _0,\omega _1, \ldots ,\omega _{\ell -2},\omega \rangle \) reaches ha from \(\alpha \) and thus, by Step 1, the action chosen by the active player at ha is a backward-induction action (that is, if \(\zeta (\omega )=hab\), with \(b\in A(ha)\), then b is a backward-induction choice at ha). Furthermore, since \(\alpha \in {\mathbf {I}}_{TRC}\), \(\omega _{\ell -2}\in {\mathbf {C}}_{h}\) and thus, for every other \(\omega ^{\prime }\in {\mathcal {B}}_h(\omega _{\ell -2})\) such that \(ha\preceq \zeta (\omega ^{\prime })\), \(\zeta (\omega ^{\prime })=\zeta (\omega )\); hence, at h and \(\omega _{\ell -2}\), player \(\iota (h)\) believes that if she takes action a at h then the ensuing outcome is the backward-induction outcome \(\zeta (\omega )\). From \(\alpha \in {\mathbf {I}}_{TRC}\) it also follows that \(\omega _{\ell -2}\in {\mathbf {R}}_{h}\) and thus the action chosen by player \(\iota (h)\) at h at state \(\omega _{\ell -2}\) is optimal given her beliefs that after every choice a at h the outcome following ha is a backward-induction outcome. Finally, from \(\alpha \in {\mathbf {I}}_{TRC}\) it follows that \(\omega _{\ell -2}\in {\mathbf {T}}_{h}\) so that \(\omega _{\ell -2}\in {\mathcal {B}}_h(\omega _{\ell -2})\) and thus player \(\iota (h)\) has correct beliefs at h and at state \(\omega _{\ell -2}\) about the outcome following the action actually taken at h and at \(\omega _{\ell -2}\) (that is, if \(\hat{a}\) is such that \(h\hat{a}\preceq \zeta (\omega _{\ell -2})\) then player \(\iota (h)\) believes that if she takes action \(\hat{a}\) then the outcome will be the backward-induction outcome \(\zeta (\omega _{\ell -2})\)). Thus \(\zeta (\omega _{\ell -2})\) is a backward-induction outcome in the subtree that starts at history h.

Step 3. Iterate the argument of Step 2 backwards to conclude that if \(a\in A(\emptyset )\) is a decision history of length 1 that is reachable from \(\alpha \) via a sequence of the form \(\langle \alpha ,\beta \rangle \), then \(\zeta (\beta )\) is a backward-induction outcome in the subtree that starts at history a.

Step 4. Use the fact that \(\alpha \in {\mathbf {T}}_{\emptyset }\cap {\mathbf {C}}_{\emptyset }\) to conclude that at state \(\alpha \) and history \(\emptyset \) player \(\iota (\emptyset )\) has correct and certain beliefs about the outcomes following decision histories in \(A(\emptyset )\) and thus, using the fact that \(\alpha \in {\mathbf {R}}_{\emptyset }\) and the conclusion of Step 3, deduce that the action taken at state \(\alpha \) by \(\iota (\emptyset )\) is a backward-induction action, so that \(\zeta (\alpha )\) is a backward-induction outcome. \(\square \)
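Finally, for reference, the backward-induction algorithm around which Propositions 1 and 2 revolve can be sketched as follows (our own encoding again, not the paper's; when the game has no relevant ties the maximizer at each step, and hence the solution, is unique):

```python
def backward_induction(h, children, iota, payoff):
    """Return (terminal history, payoff profile) selected by backward
    induction from history h. iota[h] is the active player at decision
    history h; payoff[z] is the tuple of utilities at terminal history z."""
    if not children.get(h):                # h is a terminal history
        return h, payoff[h]
    i = iota[h]
    best = None
    for a in children[h]:                  # evaluate each subtree below h
        z, u = backward_induction(h + (a,), children, iota, payoff)
        if best is None or u[i] > best[1][i]:
            best = (z, u)
    return best
```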

Cite this article

Bonanno, G. Behavior and deliberation in perfect-information games: Nash equilibrium and backward induction. Int J Game Theory 47, 1001–1032 (2018). https://doi.org/10.1007/s00182-017-0595-5
