
AGM-consistency and perfect Bayesian equilibrium. Part II: from PBE to sequential equilibrium


Abstract

In (Bonanno, Int J Game Theory 42:567–592, 2013) a general notion of perfect Bayesian equilibrium (PBE) for extensive-form games was introduced and shown to be intermediate between subgame-perfect equilibrium and sequential equilibrium. Besides sequential rationality, the ingredients of the proposed notion are (1) the existence of a plausibility order on the set of histories that rationalizes the given assessment and (2) the notion of Bayesian consistency relative to the plausibility order. We show that a cardinal property of the plausibility order and a strengthening of the notion of Bayesian consistency provide necessary and sufficient conditions for a PBE to be a sequential equilibrium.


Notes

  1. As shown in Bonanno (2011), these notions can be derived from the primitive concept of a player’s epistemic state, which encodes the player’s initial beliefs and her disposition to revise those beliefs upon receiving (possibly unexpected) information. The existence of a plausibility order that rationalizes the epistemic state of each player guarantees that the belief revision policy of each player satisfies the so-called AGM axioms for rational belief revision, which were introduced in Alchourrón et al. (1985).

  2. As in Bonanno (2013) we use the notation \(h\precsim h^{\prime }\) rather than the, perhaps more natural, notation \(h\succsim h^{\prime }\), for two reasons: (1) it is the standard notation in the extensive literature that deals with AGM belief revision (for a recent survey of this literature see the special issue of the Journal of Philosophical Logic, Vol. 40 (2), April 2011) and (2) when representing the order \(\precsim \) numerically it is convenient to assign lower values to more plausible histories. An alternative reading of \(h\precsim h^{\prime }\) is “history h (weakly) precedes \(h^{\prime }\) in terms of plausibility” .

  3. A behavior strategy profile is a list of probability distributions, one for every information set, over the actions available at that information set. A system of beliefs is a collection of probability distributions, one for every information set, over the histories in that information set.

  4. The precise definition is as follows. Let Z denote the set of terminal histories and, for every player i, let \(U_{i}:Z\rightarrow \mathbb {R} \) be player i’s von Neumann-Morgenstern utility function. Given a decision history h, let Z(h) be the set of terminal histories that have h as a prefix. Let \(\mathbb {P}_{h,\sigma }\) be the probability distribution over Z(h) induced by the strategy profile \(\sigma \), starting from history h (that is, if z is a terminal history and \(z=ha_{1}\ldots a_{m}\) then \(\mathbb {P }_{h,\sigma }(z)=\mathop {\prod }\nolimits _{j=1}^{m}\sigma (a_{j})\)). Let I be an information set of player i and let \(u_{i}(I|\sigma ,\mu )=\mathop {\sum }\nolimits _{h\in I}\mu (h)\mathop {\sum }\nolimits _{z\in Z(h)}\mathbb {P}_{h,\sigma }(z)U_{i}(z)\) be player i’s expected utility at I if \(\sigma \) is played, given her beliefs at I (as specified by \(\mu \)). We say that player i’s strategy \(\sigma _{i}\) is sequentially rational at I if \(u_{i}(I|(\sigma _{i},\sigma _{-i}),\mu )\ge u_{i}(I|(\tau _{i},\sigma _{-i}),\mu )\) for every strategy \(\tau _{i}\) of player i (where \(\sigma _{-i}\) denotes the strategy profile of the players other than i). An assessment \((\sigma ,\mu )\) is sequentially rational if, for every player i and for every information set I of player i, \(\sigma _{i}\) is sequentially rational at I.  Note that there are two definitions of sequential rationality: the weakly local one—which is the one adopted here—according to which at an information set a player can contemplate changing her choice not only there but possibly also at subsequent information sets of hers, and a strictly local one, according to which at an information set a player contemplates changing her choice only there. If the definition of perfect Bayesian equilibrium (Definition 5 below) is modified by using the strictly local definition of sequential rationality, then an extra condition needs to be added, namely the “pre-consistency” condition identified in Hendon et al. (1996) and Perea (2002) as being necessary and sufficient for the equivalence of the two notions. For simplicity we have chosen the weakly local definition.
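
To make the expected-utility formula above concrete, here is a minimal Python sketch on a hypothetical toy tree; the strategy profile sigma, the beliefs mu, the utilities U and the two-history information set are illustrative assumptions, not taken from the paper.

```python
# Sketch of u_i(I | sigma, mu) on a hypothetical toy tree.
# Histories are strings with one character per action; all values are illustrative.
sigma = {"a": 0.5, "b": 0.5, "L": 1.0, "R": 0.0}   # behavior strategy profile: action -> probability
mu    = {"a": 0.7, "b": 0.3}                        # beliefs over the histories in I
U     = {"aL": 2, "aR": 0, "bL": 1, "bR": 3}        # player i's utility on terminal histories

def prob_from(h, z, sigma):
    """P_{h,sigma}(z): probability of reaching terminal history z from h under sigma."""
    p = 1.0
    for a in z[len(h):]:                            # actions appended after h
        p *= sigma[a]
    return p

def expected_utility(I, sigma, mu, U):
    """u_i(I | sigma, mu) = sum_{h in I} mu(h) * sum_{z in Z(h)} P_{h,sigma}(z) * U_i(z)."""
    total = 0.0
    for h in I:
        Z_h = [z for z in U if z.startswith(h)]     # Z(h): terminal histories with prefix h
        total += mu[h] * sum(prob_from(h, z, sigma) * U[z] for z in Z_h)
    return total

print(expected_utility(["a", "b"], sigma, mu, U))   # 0.7*2 + 0.3*1 = 1.7
```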

  5. By PL1 of Definition 1, \(b\precsim be\) and, by P1 of Definition 2, it is not the case that \(b\sim be\) because e is not assigned positive probability by \(\sigma \). Thus \(b\prec be\).

  6. Note that if \(h,h^{\prime }\in E\) and \(h^{\prime }=ha_{1}\ldots a_{m}\), then \(\sigma (a_{j})>0\), for all \(j=1,...,m\). In fact, since \(h^{\prime }\sim h\), every action \(a_{j}\) is plausibility preserving and therefore, by Property P1 of Definition 2, \(\sigma (a_{j})>0\).

  7. I am grateful to an anonymous reviewer for raising this question.

  8. By B2 of Definition 3, \(\nu _{E}(f)=\nu _{E}(\emptyset )\times \sigma (f)=\nu _{E}(\emptyset )\times 1=\nu _{E}(\emptyset )\) and, by B1, the support of \(\nu _{E}\) is \(E\cap D_{\mu }^{+}=\{\emptyset ,f\}\).

  9. Nothing of substance would change if one considered dates 0, 1, 2 and 3 (note that the length of the game, that is, the length of its maximal histories, is 3): in the expressions below, instead of multiplying by \(\frac{1 }{3}\) one would multiply by \(\frac{1}{4}\); furthermore, one would have that \(P(\mathbf {h}=fA~|~\mathbf {t}=3)=P(\mathbf {h}=fA~|~\mathbf {t}=2)=1\) and \(P(\mathbf {h}=\emptyset ~|~\mathbf {t}=3)=P(\mathbf {h}=f~|~\mathbf {t}=3)=0.\)

  10. That is, \(\hat{\nu }_{F}(b)=\frac{1}{9}\alpha \), \(\hat{\nu }_{F}(c)=\frac{2}{9} \alpha \), \(\hat{\nu }_{F}(d)=\frac{1}{4}(1-\alpha )\), \(\hat{\nu }_{F}(e)=\frac{ 1}{12}(1-\alpha )\) and \(\hat{\nu }_{F}(er)=\frac{1}{15}(1-\alpha )\).

  11. The probability of reaching history h is the sum of the probabilities that the play of the game is currently at h or at a history that has h as a proper prefix; it is thus interpreted as the probability that the play of the game either is currently at h or was at h at some earlier time.

  12. As noted above, the reason why we take the support of \(\nu _{E}\) to be \( E\cap D_{\mu }^{+}\), rather than E, is that terminal histories, as well as decision histories h with \(\mu (h)=0\), are irrelevant for the notion of Bayesian consistency and \(\nu _{E}\) so defined is a much simpler object (compare, for instance, the simpler function \(\nu (h)\) with the more extensive function P(h, t) in the above example).

    In general, \(\nu _{E}(h)\overset{def}{=}P(\mathbf {h}=h~\wedge ~\mathbf {t} =t(h))=P(\mathbf {h}=h~|~\mathbf {t}=t(h))\times P(~\mathbf {t}=t(h))\) where \(P( \mathbf {h}=h~|~\mathbf {t}=t(h))\) is given as follows. First of all, if \( h\notin E\), then \(P(\mathbf {h}=h~|~\mathbf {t}=t(h))=0.\) Secondly, if \(h\in E\) and \(ha\in E\) then \(P(\mathbf {h}=ha~|~\mathbf {t}=t(h)+1)=P(\mathbf {h}=h~|~ \mathbf {t}=t(h))\times \sigma (a).\) Thirdly, for every terminal history \( z\in E,\) if \(t>t(z)\) then \(P(\mathbf {h}=z~|~\mathbf {t}=t)=P(\mathbf {h}=z~|~ \mathbf {t}=t(z))\). Thus we only need to specify \(P(\mathbf {h}=h~|~\mathbf {t} =t(h))\) for “minimal histories” in E, that is, for histories that do not have proper prefixes in E (like histories bcd and e in the equivalence class F in the above example based on Fig. 2). Let \(\{I_{j}\}_{j=1,\ldots ,m}\) be the collection of different information sets to which the minimal histories in E belong (in the equivalence class F in the above example, \(m=2\) and the two different information sets are \(I_{1}=\{a,b,c\}\) and \(I_{2}=\{d,e\}\)) and let \(\{\alpha _{j}\}_{j=1,\ldots ,m}\) be an arbitrary collection of real numbers strictly between 0 and 1, whose sum is equal to 1; call \(\alpha _{j}\) the weight associated with information set \(I_{j}\). Then, if \(h\in E\cap I_{j}\) is a minimal history, \(P(\mathbf {h}=h~|~\mathbf {t}=t(h))=\alpha _{j}\times \mu (h),\) where \(\alpha _{j}\) is the weight associated with information set \( I_{j}.\)

    Assuming that any two dates are equally likely is natural if one takes the point of view of an external observer, who does not know how many moves have been made so far: for such an observer the principle of insufficient reason requires assigning equal probability to each date.
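
A minimal Python sketch of the recursive specification in this note, assuming a small hypothetical equivalence class E; the weights, beliefs and action probabilities below are made up for illustration and are not the Fig. 2 example.

```python
# Sketch: nu_E(h) = P(h | t = t(h)) * P(t = t(h)), seeded at the "minimal" histories of E
# (weight alpha_j of their information set times mu) and propagated forward along sigma.
sigma   = {"c": 0.5}                       # action taken with positive probability inside E
mu      = {"a": 0.6, "b": 0.4, "e": 0.5}   # beliefs at the information sets of the minimal histories
alpha   = {"a": 0.5, "b": 0.5, "e": 0.5}   # weight of the information set each minimal history belongs to
E       = ["a", "b", "e", "ac"]            # minimal histories: a, b (in I_1) and e (in I_2)
n_dates = 3                                # dates 0, 1, 2 assumed equally likely

def minimal(h, E):
    """A history is minimal in E if no proper prefix of it lies in E (histories are strings)."""
    return not any(g != h and h.startswith(g) for g in E)

P_cond = {h: alpha[h] * mu[h] for h in E if minimal(h, E)}    # seed: P(h | t = t(h)) = alpha_j * mu(h)
for h in sorted(E, key=len):                                  # forward: P(ha | t(h)+1) = P(h | t(h)) * sigma(a)
    if h not in P_cond:
        P_cond[h] = P_cond[h[:-1]] * sigma[h[-1]]

nu_E = {h: P_cond[h] / n_dates for h in E}                    # times the uniform date probability
print(nu_E)   # e.g. nu_E('ac') = 0.5 * 0.6 * 0.5 / 3
```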

  13. The example of Fig. 1 shows that PBE is a strict refinement of subgame-perfect equilibrium. It is shown in Bonanno (2013) that, in turn, sequential equilibrium is a strict refinement of PBE.

  14. For example, Kohlberg and Reny (1997) adopt this interpretation. For a subjective interpretation of perfect Bayesian equilibrium see Bonanno (2011).

  15. That is, for every \(h\in D\backslash \{\emptyset \}\), \(\mu ^{m}(h)=\frac{ \mathop {\prod }\nolimits _{a\in A_{h}}\sigma ^{m}(a)}{\mathop {\sum }\nolimits _{h^{\prime }\in I(h)}\mathop {\prod }\nolimits _{a\in A_{h^{\prime }}}\sigma ^{m}(a)}\), where \(A_{h}\) is the set of actions that occur in history h. Since \(\sigma ^{m}\) is completely mixed, \(\sigma ^{m}(a)>0\) for every \(a\in A\) and thus \(\mu ^{m}(h)>0\) for all \(h\in D\backslash \{\emptyset \}.\)
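
The formula in this note is plain Bayesian updating from a completely mixed profile; here is a minimal Python sketch with a hypothetical information set and hypothetical action probabilities.

```python
# Sketch of mu^m(h) = prod_{a in A_h} sigma^m(a) / sum_{h' in I(h)} prod_{a in A_{h'}} sigma^m(a).
from math import prod

def beliefs_from_mixed(I, sigma_m):
    """Beliefs induced on an information set I by a completely mixed profile sigma_m."""
    weight = {h: prod(sigma_m[a] for a in h) for h in I}   # histories as strings, one char per action
    total = sum(weight.values())
    return {h: w / total for h, w in weight.items()}

sigma_m = {"a": 0.9, "b": 0.1, "c": 0.05, "d": 0.95}       # completely mixed: every action has prob > 0
print(beliefs_from_mixed(["ac", "bd"], sigma_m))            # {'ac': ~0.321, 'bd': ~0.679}
```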

  16. Because, by P2 of Definition 2, any such plausibility order \( \precsim \) would have to satisfy \(a\prec b\) and \(be\prec ae\), so that any integer-valued representation F of it would be such that \(F(b)-F(a)>0\) and \(F(be)-F(ae)<0\).

  17. As can be seen by taking \(\nu \) to be the uniform distribution over the set \( D=\left\{ \emptyset ,a,b,ae,be\right\} \) (UB1 is clearly satisfied and UB2 is also satisfied, since \(\frac{\nu (a)}{\nu (b)}=\frac{\frac{1}{5}}{ \frac{1}{5}}=\frac{\nu (ae)}{\nu (be)}\)).

  18. Because, by P2 of Definition 2, any such plausibility order \(\precsim \) would have to satisfy \(a\sim b\) and \(ae\sim be\). Letting E be the equivalence class \(\{a,b,ad,bd\}\) and F the equivalence class \(\{ae,be,aeg,beg\}\) (thus \(E\cap D_{\mu }^{+}=\{a,b\}\) and \(F\cap D_{\mu }^{+}=\{ae,be\}\)), if \(\nu \) is any common prior then \(\nu _{E}(a)=\frac{\nu (a)}{\nu (a)+\nu (b)}\) and \(\nu _{E}(b)=\frac{\nu (b)}{\nu (a)+\nu (b)}\). By B3 of Definition 3, \(\mu (a)=\frac{\nu _{E}(a)}{\nu _{E}(a)+\nu _{E}(b)}\) and \(\mu (b)=\frac{\nu _{E}(b)}{\nu _{E}(a)+\nu _{E}(b)}\). Thus \(\frac{\nu (a)}{\nu (b)}=\frac{\nu _{E}(a)}{\nu _{E}(b)}=\frac{\mu (a)}{\mu (b)}=3\); similarly, \(\frac{\nu (ae)}{\nu (be)}=\frac{\nu _{F}(ae)}{\nu _{F}(be)}=\frac{\mu (ae)}{\mu (be)}=\frac{1}{3}\), yielding a violation of UB2 of Definition 9.

  19. Proof. Let \(\precsim \) be a choice measurable plausibility order that rationalizes \((\sigma ,\mu )\) and let F be a cardinal representation of it. Since \(\mu (b)>0\) and \(\mu (c)>0\), by P2 of Definition 2, \( b\sim c\) and thus \(F(b)=F(c)\). By choice measurability, \( F(b)-F(c)=F(bB)-F(cB)\) and thus \(F(bB)=F(cB)\), so that \(bB\sim cB\). Since \( \sigma (f)>0\), by P1 of Definition 2, \(cB\sim cBf\) and therefore, by transitivity of \(\precsim \), \(bB\sim cBf\). Hence if \(\mu (bB)>0 \) then, by P2 of Definition 2, \(bB\in Min_{\precsim }\{bB,cBf,d\}\) (for any \(S\subseteq H\), \(Min_{\precsim }S\) is defined as \( \left\{ h\in S:h\precsim h^{\prime },\forall h^{\prime }\in S\right\} \)) and thus \(cBf\in Min_{\precsim }\{bB,cBf,d\}\) so that, by P2 of Definition 2, \(\mu (cBf)>0\). The proof that if \(\mu (cBf)>0\) then \(\mu (bB)>0\) is analogous.

  20. Proof. Suppose that \(\mu (b)>0\), \(\mu (c)>0\) (so that \(b\sim c\)) and \(\mu (bB)>0.\) Let \(\nu \) be a full-support common prior that satisfies the properties of Definition 9. Then, by UB2, \(\frac{\nu (c)}{\nu (b)}=\frac{\nu (cB)}{\nu (bB)}\) and, by UB1, since \(\sigma (f)=1\), \(\nu (cBf)=\nu (cB)\times \sigma (f)=\nu (cB)\). Let E be the equivalence class that contains b. Then \(E\cap D_{\mu }^{+}=\{b,c\}\). Since \(\nu _{E}(\cdot )=\nu (\cdot ~|~E\cap D_{\mu }^{+})\), by B3 of Definition 3, \(\mu (b)=\frac{\nu (b)}{\nu (b)+\nu (c)}\) and \(\mu (c)=\frac{\nu (c)}{\nu (b)+\nu (c)}\), so that \(\frac{\mu (c)}{\mu (b)}=\frac{\nu (c)}{\nu (b)}.\) Let G be the equivalence class that contains bB. Then, since, by hypothesis, \(\mu (bB)>0\), it follows from (7) that either \(G\cap D_{\mu }^{+}=\{bB,cBf\}\) or \(G\cap D_{\mu }^{+}=\{bB,cBf,d\}\). Since \(\nu _{G}(\cdot )=\nu (\cdot ~|~G\cap D_{\mu }^{+})\), by B3 of Definition 3, in the former case \(\mu (bB)=\frac{\nu (bB)}{\nu (bB)+\nu (cBf)}\) and \(\mu (cBf)=\frac{\nu (cBf)}{\nu (bB)+\nu (cBf)}\), and in the latter case \(\mu (bB)=\frac{\nu (bB)}{\nu (bB)+\nu (cBf)+\nu (d)}\) and \(\mu (cBf)=\frac{\nu (cBf)}{\nu (bB)+\nu (cBf)+\nu (d)}\); thus in both cases \(\frac{\mu (cBf)}{\mu (bB)}=\frac{\nu (cBf)}{\nu (bB)}.\) Hence, since \(\nu (cBf)=\nu (cB)\), \(\frac{\mu (cBf)}{\mu (bB)}=\frac{\nu (cB)}{\nu (bB)}\) and, therefore, since, as shown above, \(\frac{\nu (cB)}{\nu (bB)}=\frac{\nu (c)}{\nu (b)}\) and \(\frac{\nu (c)}{\nu (b)}=\frac{\mu (c)}{\mu (b)}\), we have that \(\frac{\mu (cBf)}{\mu (bB)}=\frac{\mu (c)}{\mu (b)}\).

  21. It follows from Proposition 11 and the fact that \((\sigma ,\mu )\) is sequentially rational and rationalized by the following choice-measurable plausibility order: \(\left( \begin{array}{ll} \precsim ~: & F:\\ \emptyset ,a & 0 \\ b,c,bT,cT & 1 \\ d,bB,cB,cBf,dL,bBL,cBfL & 2 \\ bBR,cBe,cBfR,dR & 3 \end{array} \right) \) and is uniformly Bayesian relative to it: letting \(E_{1},E_{2}\) and \(E_{3}\) be the top three equivalence classes, the following probability density functions satisfy the properties of Definition 3: \(\nu _{E_{1}}(\emptyset )=1\), \(\nu _{E_{2}}(b)=\frac{7}{10}\), \(\nu _{E_{2}}(c)=\frac{3}{10}\) and \(\nu _{E_{3}}(d)=\frac{8}{21}\), \(\nu _{E_{3}}(bB)=\frac{7}{21}\), \(\nu _{E_{3}}(cB)=\nu _{E_{3}}(cBf)=\frac{3}{21}\); then the following is a full-support uniform common prior: \(\nu (\emptyset )=\frac{9}{40}\), \(\nu (b)=\nu (bB)=\frac{7}{40}\), \(\nu (c)=\nu (cB)=\nu (cBf)=\frac{3}{40}\), \(\nu (d)=\frac{8}{40}\).

  22. Both \((\sigma ,\tilde{\mu })\) and \((\sigma ,\hat{\mu })\) are sequentially rational and are rationalized by the choice measurable plausibility order given in Footnote 21; \((\sigma ,\hat{\mu })\) is Bayesian relative to that plausibility order but cannot be uniformly Bayesian relative to any rationalizing order, because it fails to satisfy (8).

  23. By “Bayesian updating as long as possible” we mean the following: (1) when information causes no surprises, because the play of the game is consistent with the most plausible play(s) (that is, when information sets are reached that have positive prior probability), then beliefs should be updated using Bayes’ rule and (2) when information is surprising (that is, when an information set is reached that had zero prior probability) then new beliefs can be formed in an arbitrary way, but from then on Bayes’ rule should be used to update those new beliefs, whenever further information is received that is consistent with those beliefs.
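
A minimal Python sketch of this updating rule, with hypothetical beliefs; the function names and numbers are illustrative only.

```python
# Sketch of "Bayesian updating as long as possible": condition the current belief on the
# information received whenever it has positive probability; otherwise adopt some (arbitrary)
# new belief on the information set and resume Bayes' rule from there.
def update(belief, info, fallback):
    """belief: dict history -> prob; info: set of histories observed; fallback: belief adopted on a surprise."""
    total = sum(belief.get(h, 0.0) for h in info)
    if total > 0:                                     # no surprise: Bayes' rule
        return {h: belief.get(h, 0.0) / total for h in info}
    return dict(fallback)                             # surprise: start afresh; Bayes applies thereafter

prior = {"a": 0.5, "b": 0.5, "c": 0.0}
print(update(prior, {"a", "b"}, None))                # Bayes: {'a': 0.5, 'b': 0.5}
print(update(prior, {"c"}, {"c": 1.0}))               # zero-probability event: new belief {'c': 1.0}
```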

  24. Blume and Zame (1994) provides an indirect proof of the fact that consistent assessments are determined by finitely many algebraic equations and inequalities.

  25. At the 13th SAET conference in July 2013 Streufert presented a characterization of KW-consistent assessments in terms of additive plausibility (under the different name of “additive absurdity”) and a condition that he called “pseudo-Bayesianism”, which is essentially a reformulation of one of the conditions given in Streufert (2007, Theorem 2.1, p. 11). He also showed the two conditions to be independent of each other.

  26. Battigalli (1996) shows that in games with observable deviators weak independence suffices for KW-consistency.

  27. Kreps and Wilson themselves (Kreps and Wilson 1982, p. 876) express dissatisfaction with their definition of sequential equilibrium: “We shall proceed here to develop the properties of sequential equilibrium as defined above; however, we do so with some doubts of our own concerning what ‘ought’ to be the definition of a consistent assessment that, with sequential rationality, will give the ‘proper’ definition of a sequential equilibrium.” In a similar vein, Osborne and Rubinstein (Osborne and Rubinstein 1994, p. 225) write “we do not find the consistency requirement to be natural, since it is stated in terms of limits; it appears to be a rather opaque technical assumption” . In these quotations “consistency” corresponds to what we called “KW-consistency”.

  28. In the sense that if \(h,h^{\prime }\in E\cap D_{\mu }^{+}\) and \(h^{\prime }=ha_{1}\ldots a_{m}\), then by Step 1 (and B2 of Definition 3) \( f_{E_{i}}(h^{\prime })=f_{E_{i}}(h)\times \sigma (a_{1})\times ~\cdots ~\times ~\sigma (a_{m})\) and by Step 2 (if applicable), for every \(j=1,\ldots ,m\), \( f_{E_{i}}(ha_{1}\ldots a_{j})=f_{E_{i}}(h)\times \sigma (a_{1})\times ~\cdots ~\times ~\sigma (a_{j})\).

  29. Recall that \(\emptyset \) denotes the null history, that is, the root of the tree. If \(\hat{F}\) is an integer-valued representation of \(\precsim \) that satisfies property CM, then F defined by \(F(h)=\hat{F}(h)-\hat{F}(\emptyset )\) is also an integer-valued representation of \(\precsim \) that satisfies property CM; clearly, \(F(\emptyset )=0\).

  30. If \(h=\emptyset \) then \(h^{\prime }=h\) and there is nothing to prove because \(A_{h}=A^{0}\cap A_{h}=\varnothing \).

  31. The proof in Bonanno (2013) makes use of Lemma A.1 in Kreps and Wilson (1982), whose proof Streufert (2012) finds faulty and repairs.

References

  • Alchourrón C, Gärdenfors P, Makinson D (1985) On the logic of theory change: partial meet contraction and revision functions. J Symb Logic 50:510–530

  • Aumann R, Hart S, Perry M (1997) The absent-minded driver. Games Econ Behav 20:102–116

  • Battigalli P (1996) Strategic independence and perfect Bayesian equilibria. J Econ Theory 70:201–234

  • Blume L, Zame WR (1994) The algebraic geometry of perfect and sequential equilibrium. Econometrica 62:783–794

  • Bonanno G (2011) AGM belief revision in dynamic games. In: Apt KR (ed) Proceedings of the 13th conference on theoretical aspects of rationality and knowledge (TARK XIII). ACM, New York, pp 37–45

  • Bonanno G (2013) AGM-consistency and perfect Bayesian equilibrium. Part I: definition and properties. Int J Game Theory 42:567–592

  • Grove A, Halpern J (1997) On the expected value of games with absentmindedness. Games Econ Behav 20:51–65

  • Hendon E, Jacobsen J, Sloth B (1996) The one-shot-deviation principle for sequential rationality. Games Econ Behav 12:274–282

  • Kohlberg E, Reny P (1997) Independence on relative probability spaces and consistent assessments in game trees. J Econ Theory 75:280–313

  • Kreps D, Wilson R (1982) Sequential equilibrium. Econometrica 50:863–894

  • Osborne M, Rubinstein A (1994) A course in game theory. MIT Press, Cambridge

  • Perea A, Jansen M, Peters H (1997) Characterization of consistent assessments in extensive-form games. Games Econ Behav 21:238–252

  • Perea A (2001) Rationality in extensive form games. Kluwer Academic Publishers, Norwell

  • Perea A (2002) A note on the one-deviation property in extensive form games. Games Econ Behav 40:322–338

  • Piccione M, Rubinstein A (1997) On the interpretation of decision problems with imperfect recall. Games Econ Behav 20:3–24

  • Selten R (1975) Re-examination of the perfectness concept for equilibrium points in extensive games. Int J Game Theory 4:25–55

  • Streufert P (2007) Characterizing consistency with monomials. Working paper, University of Western Ontario

  • Streufert P (2012) Additive plausibility characterizes the supports of consistent assessments. Research Report # 2012-3, University of Western Ontario


Additional information

I am grateful to an Associate Editor and two anonymous reviewers for helpful comments and suggestions.

Appendix: Proofs

Proof of Lemma 8

We shall construct a full-support common prior that satisfies the properties of Lemma 8. Let \(\mathcal {E}\) be the finite collection of equivalence classes of \(\precsim \) and let \((E_{1},\ldots ,E_{m})\) be the ordering of \(\mathcal {E}\) according to decreasing plausibility, that is, \(\forall h,h^{\prime }\in H\), \(\forall i,j\in \{1,\ldots ,m\}\), if \(h\in E_{i}\) and \(h^{\prime }\in E_{j}\) then \(h\prec h^{\prime }\) if and only if \(i<j\). Let \(\mathcal {E}^{+}=\left\{ E\in \mathcal {E}:E\cap D_{\mu }^{+}\ne \varnothing \right\} \) and let \(\mathcal {N }=\left\{ \nu _{E}\right\} _{E\in \mathcal {E}^{+}}\) be an arbitrary collection of probability density functions that satisfy the properties of Definition 3. Fix an equivalence class \(E_{i}\) and define the function \(f_{E_{i}}:H\rightarrow [0,1]\) recursively as follows.

Step 0. For every \(h\notin E_{i}\) set \(f_{E_{i}}(h)=0\).

Step 1. For every \(h\in E_{i}\cap D_{\mu }^{+}\) set \( f_{E_{i}}(h)=\nu _{E_{i}}(h),\) where \(\nu _{E_{i}}(\cdot )\) is the relevant element of \(\mathcal {N}\). Note that, by Property B2 of Definition 3, if \(h,ha\in E_{i}\cap D_{\mu }^{+}\) then \( f_{E_{i}}(ha)=f_{E_{i}}(h)\times \sigma (a)\).

Step 2. Let \(h,ha\in E_{i}\); then (a) if \(h\notin D_{\mu }^{+}\) and \(ha\in D_{\mu }^{+}\), set \(f_{E_{i}}(h)=\frac{f_{E_{i}}(ha) }{\sigma (a)}\) (note that, by P1 of Definition 2, \(h\sim ha\) implies \(\sigma (a)>0\)) and (b) if \(h\in D_{\mu }^{+}\) and \(ha\notin D_{\mu }^{+}\), set \(f_{E_{i}}(ha)=f_{E_{i}}(h)\times \sigma (a)\). Note that, because of Property B2 of Definition 3, the values assigned under Step 2 cannot be inconsistent with the values assigned under Step 1.Footnote 28

Step 3. After completing Steps 1 and 2, the only histories \(h\in E_{i}\) for which \(f_{E_{i}}(h)\) has not been defined yet are those that satisfy the following properties: (1) there is no prefix \( h^{\prime }\) of h such that \(h^{\prime }\in E_{i}\cap D_{\mu }^{+}\) and (2) there is no \(h^{\prime }\in E_{i}\cap D_{\mu }^{+}\) such that h is a prefix of \(h^{\prime }\). Let \(\hat{E}_{i}\subseteq E_{i}\) be the set of such histories (it could be that \(\hat{E}_{i}=\varnothing \)). A maximal path in \( \hat{E}_{i}\) is a sequence \(\left\langle h,ha_{1},ha_{1}a_{2},\ldots ,ha_{1}a_{2}\ldots a_{p}\right\rangle \) in \(\hat{E}_{i}\) such that \(ha_{1}a_{2}\ldots a_{p}\) is a terminal history and there is no \( h^{\prime }\in \hat{E}_{i}\) which is a proper prefix of h. Fix an arbitrary maximal path \(\left\langle h,ha_{1},\ldots ,ha_{1}a_{2}\ldots a_{p}\right\rangle \) in \(\hat{E}_{i}\) and define \( f_{E_{i}}(h)=1\) and, for every \(j=0,\ldots ,p-1\), \(f_{E_{i}}(ha_{1}\ldots a_{j+1})=\) \(f_{E_{i}}(ha_{1}\ldots a_{j})\times \sigma (a_{j+1})\) (defining \(ha_{0}\) to be h; note that, by P1 of Definition 2, \(\sigma (a_{j})>0\) for all \(j=1,\ldots ,p\)).

By construction, the function \(f_{E_{i}}\) satisfies the following property:

$$\begin{aligned} \begin{array}{c} \forall h\in E_{i}, \forall a\in A(h)\text {, if }\sigma (a)>0\text { then (}ha\in E_{i}\text { and)} \\ f_{E_{i}}(ha)=f_{E_{i}}(h)\times \sigma (a)\text { and thus } f_{E_{i}}(ha)\le f_{E_{i}}(h). \end{array} \end{aligned}$$
(9)

Note that Steps 1–3 always assign positive values in (0, 1]; thus if \(h\in E_{i}\) then \(f_{E_{i}}(h)\in (0,1]\).
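
The following Python sketch mirrors Steps 0-3 on a tiny hypothetical equivalence class; histories are strings with one character per action, and the class, the set \(D_{\mu }^{+}\), the density \(\nu _{E_{i}}\) and \(\sigma \) are made-up inputs. It is only an illustration of the recursion, not the general construction.

```python
# Sketch of Steps 0-3 for one equivalence class E_i (all concrete values are hypothetical).
def build_f(E_i, D_plus, nu_Ei, sigma):
    f = {h: nu_Ei[h] for h in E_i if h in D_plus}          # Step 1: copy nu_{E_i} on E_i ∩ D_mu^+
    changed = True
    while changed:                                         # Step 2: propagate along pairs h, ha in E_i
        changed = False
        for h in E_i:
            parent, a = h[:-1], h[-1:]
            if parent in E_i:
                if parent in f and h not in f:
                    f[h] = f[parent] * sigma[a]; changed = True
                if h in f and parent not in f:
                    f[parent] = f[h] / sigma[a]; changed = True
    for h in sorted(E_i, key=len):                         # Step 3: remaining maximal paths
        if h not in f:
            parent, a = h[:-1], h[-1:]
            f[h] = f[parent] * sigma[a] if parent in f else 1.0
    return f                                               # Step 0 is implicit: f = 0 outside E_i

# Hypothetical class: b, c in D_mu^+ with nu_{E_i}(b) = 0.75, nu_{E_i}(c) = 0.25, and cF follows c.
print(build_f(["b", "c", "cF"], {"b", "c"}, {"b": 0.75, "c": 0.25}, {"F": 1.0}))
```

On this example the output satisfies (9): \(f_{E_{i}}(cF)=f_{E_{i}}(c)\times \sigma (F)\).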

Next we show that there exist weights \(\lambda _{1},\ldots ,\lambda _{m}\in (0,1) \) such that \(\lambda _{1}+\cdots +\lambda _{m}<1\) and, \(\forall i,j\in \{1,\ldots ,m\}\), if \(h\in E_{i}\), \(h^{\prime }\in E_{j}\) and \(i<j\) then \( \lambda _{j}\times f_{E_{j}}(h^{\prime })\le \lambda _{i}\times f_{E_{i}}(h) \). Fix an arbitrary \(\lambda _{1}\in (0,1)\) and let \( a=\min _{h\in E_{1}}\left\{ \lambda _{1}\times f_{E_{1}}(h)\right\} \) (clearly, \(a\in (0,1)\)). Let \(b=\max _{h\in E_{2}}\left\{ f_{E_{2}}(h)\right\} \) and choose a \(\lambda _{2}\in (0,1)\) such that (1) \( \lambda _{1}+\lambda _{2}<1\) and (2) \(\lambda _{2}\times b\le a\). Then, for every \(h\in E_{1}\) and \(h^{\prime }\in E_{2}\), \(\lambda _{2}\times f_{E_{2}}(h^{\prime })\le \lambda _{1}\times f_{E_{1}}(h)\). Repeat this procedure to choose a weight \(\lambda _{i}\) for every \(i\in \{3,\ldots ,m\}.\) Define \(\alpha _{i}=\frac{\lambda _{i}}{\lambda _{1}+\cdots +\lambda _{m}}\) and \( \bar{\nu }:H\rightarrow [0,1]\) by \(\bar{\nu }(h)=\sum _{i=1}^{m}\left[ \alpha _{i}\times f_{E_{i}}(h)\right] \). We want to show that, \(\forall h\in H\), \( \forall a\in A(h)\),

$$\begin{aligned} \begin{array}{c} \text {if }ha\in D\text { then (}A\text {) }\bar{\nu }(ha)\le \bar{\nu }(h)\text { and } \\ \text {(}B\text {) if }\sigma (a)>0\text { then }\bar{\nu }(ha)=\bar{\nu } (h)\times \sigma (a). \end{array} \end{aligned}$$
(10)

Fix an arbitrary \(ha\in D\). Suppose first that \(h\prec ha\). Then, by Property P1 of Definition 2, \(\sigma (a)=0,\) so that (B) of (10) is trivially satisfied. Let \(E_{i}\) be the equivalence class to which h belongs and \(E_{j}\) the equivalence class to which ha belongs, so that \(i<j\). Then \(\bar{\nu }(h)=\alpha _{i}\times f_{E_{i}}(h)\) and \(\bar{\nu }(ha)=\alpha _{j}\times f_{E_{j}}(ha)\) and thus, since \(\lambda _{j}\times f_{E_{j}}(ha)\le \lambda _{i}\times f_{E_{i}}(h)\), dividing both sides by \((\lambda _{1}+\cdots +\lambda _{m})\) we get that \(\bar{\nu }(ha)\le \bar{\nu }(h)\). Suppose now that \(h\sim ha\) (so that, by Property P1 of Definition 2, \(\sigma (a)>0\)). Let \(E_{i}\) be the equivalence class to which both h and ha belong; then \(\bar{\nu }(h)=\alpha _{i}\times f_{E_{i}}(h)\) and \(\bar{\nu }(ha)=\alpha _{i}\times f_{E_{i}}(ha)\) and thus (10) follows from (9).

Finally, define \(\nu :D\rightarrow (0,1]\) by \(\nu (h)=\frac{\bar{\nu }(h)}{\sum _{h^{\prime }\in D}\bar{\nu }(h^{\prime })}\). Then it follows from (10) that if \(ha\in D\) then (A) \(\nu (ha)\le \nu (h)\) and (B) if \( \sigma (a)>0\) then \(\nu (ha)=\nu (h)\times \sigma (a)\). It only remains to prove that \(\nu \) is a common prior of \(\mathcal {N}=\left\{ \nu _{E}\right\} _{E\in \mathcal {E}^{+}}\), that is, that, \(\forall i\in \{1,\ldots ,m\}\) and \( \forall h\in E_{i}\cap D_{\mu }^{+}\), \(\nu (h~|~E_{i}\cap D_{\mu }^{+})\overset{def}{=}\frac{\nu (h)}{\sum _{h^{\prime }\in E_{i}\cap D_{\mu }^{+}}~\nu (h^{\prime })}=\nu _{E_{i}}(h)\), where \(\nu _{E_{i}}(\cdot )\) is the relevant element of \(\mathcal {N}\). Now, \(\frac{\nu (h)}{\sum _{h^{\prime }\in E_{i}\cap D_{\mu }^{+}}\nu (h^{\prime })}=\frac{\nu (h)\times \sum _{h^{\prime \prime }\in D}\bar{\nu }(h^{\prime \prime })}{\sum _{h^{\prime }\in E_{i}\cap D_{\mu }^{+}}\left[ \nu (h^{\prime })\times \sum _{h^{\prime \prime }\in D}\bar{\nu }(h^{\prime \prime })\right] }=\frac{\bar{\nu }(h)}{\sum _{h^{\prime }\in E_{i}\cap D_{\mu }^{+}}\bar{\nu }(h^{\prime })}=\frac{\alpha _{i}\times f_{E_{i}}(h)}{\sum _{h^{\prime }\in E_{i}\cap D_{\mu }^{+}}\left[ \alpha _{i}\times f_{E_{i}}(h^{\prime })\right] }=\frac{\alpha _{i}\times \nu _{E_{i}}(h)}{\alpha _{i}\times \sum _{h^{\prime }\in E_{i}\cap D_{\mu }^{+}}\nu _{E_{i}}(h^{\prime })}=\nu _{E_{i}}(h)\), since, by construction, \(f_{E_{i}}(h)=\nu _{E_{i}}(h)\) for all \(h\in E_{i}\cap D_{\mu }^{+}\), and \(Supp(\nu _{E_{i}})=E_{i}\cap D_{\mu }^{+}\), so that \(\sum _{h^{\prime }\in E_{i}\cap D_{\mu }^{+}}\nu _{E_{i}}(h^{\prime })=1\). \(\square \)
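
To illustrate the last two stages of the proof (the choice of the weights \(\lambda _{1},\ldots ,\lambda _{m}\) and the normalization of \(\bar{\nu }\) into the common prior \(\nu \)), here is a compact Python sketch continuing the hypothetical two-class example of the previous sketch; the concrete numbers are assumptions.

```python
# Sketch: choose lambda_i so that every history in a less plausible class gets weight no larger
# than any history in a more plausible class, mix the f_{E_i}, and normalize.
f_classes = [{"": 1.0}, {"b": 0.75, "c": 0.25, "cF": 0.25}]    # f_{E_1}, f_{E_2}: most to least plausible

lambdas = [0.5]                                                # fix an arbitrary lambda_1 in (0,1)
for f_next in f_classes[1:]:
    a = min(lambdas[-1] * v for v in f_classes[len(lambdas) - 1].values())
    b = max(f_next.values())
    lambdas.append(min(a / b, (1.0 - sum(lambdas)) / 2))       # lambda_j * b <= a and total sum < 1

alpha = [lam / sum(lambdas) for lam in lambdas]
nu_bar = {h: alpha[i] * v for i, f in enumerate(f_classes) for h, v in f.items()}
total = sum(nu_bar.values())
nu = {h: v / total for h, v in nu_bar.items()}                 # full-support prior on D

# Conditioning nu on E_2 ∩ D_mu^+ = {b, c} recovers the original density nu_{E_2}:
sub = nu["b"] + nu["c"]
print(round(nu["b"] / sub, 2), round(nu["c"] / sub, 2))        # 0.75 0.25
```

The final check in the sketch is exactly the common-prior property established in general at the end of the proof.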

In order to prove Proposition 11 we will exploit the characterization of sequential equilibrium given in Perea et al. (1997). First some notation and terminology. Let A be the set of actions (as in Bonanno (2013), we assume that no action is available at more than one information set, that is, if \(h^{\prime }\notin I(h)\) then \(A(h^{\prime })\ne A(h)\)). If h is a history, we denote by \(A_{h}\) the set of actions that occur in history h (thus while h is a sequence of actions, \(A_{h}\) is the set of actions in that sequence; note that, for every history h, \(A_{h}\ne \varnothing \) if and only if \(h\ne \emptyset \)). Given an assessment \((\sigma ,\mu )\) we denote by \(A^{0}=\{a\in A:\sigma (a)=0\}\) the set of actions that are assigned zero probability by the strategy profile \(\sigma \). Recall that \(D_{\mu }^{+}\) denotes the set of decision histories to which \(\mu \) assigns positive probability (\(D_{\mu }^{+}=\left\{ h\in D:\mu (h)>0\right\} \)) and that \(h^{\prime }\in I(h)\) means that h and \(h^{\prime }\) belong to the same information set. A pseudo behavior strategy profile (PBSP) is a generalization of the notion of behavior strategy profile that allows the sum of the “probabilities” of the actions at an information set to be larger than 1, that is, a PBSP is a function \(\bar{\sigma }:A\rightarrow [0,1].\) A PBSP \(\bar{\sigma }\) is a completely mixed extension of a behavior strategy profile \(\sigma \) if, \(\forall a\in A\), (1) \(\bar{\sigma }(a)>0\) and (2) if \(\sigma (a)>0\) then \(\bar{\sigma }(a)=\sigma (a)\). Given a PBSP \(\bar{\sigma }\), for every history h let \(\mathbb {P}_{\bar{\sigma }}(h)=\left\{ \begin{array}{ll} 1 &{} \quad \text {if }h=\emptyset \\ \bar{\sigma }(a_{1})\times \cdots \times ~\bar{\sigma }(a_{m}) &{} \quad \text {if } h=a_{1}\ldots a_{m} \end{array} \right. .\) The following proposition is proved in Perea et al. (1997) (see also Perea 2001, p. 74, and Streufert 2007).
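
A minimal Python sketch of the two objects just defined, namely a completely mixed extension \(\bar{\sigma }\) of a behavior strategy profile and the induced \(\mathbb {P}_{\bar{\sigma }}\); the strategy profile and the size of the perturbation are hypothetical.

```python
# Sketch: a completely mixed extension sigma_bar of a hypothetical sigma, and P_sigma_bar(h).
# Note that sigma_bar need not sum to one over an information set; PBSPs allow this.
from math import prod

sigma = {"a": 1.0, "b": 0.0, "c": 0.7, "d": 0.3}                  # hypothetical behavior strategy profile
eps = 1e-3
sigma_bar = {x: (p if p > 0 else eps) for x, p in sigma.items()}  # positive everywhere, agrees with sigma on A \ A^0

def P(h, sb):
    """P_sigma_bar(h): 1 for the null history, otherwise the product of sb over the actions in h."""
    return prod(sb[x] for x in h) if h else 1.0

print(P("", sigma_bar), P("ac", sigma_bar), P("bd", sigma_bar))   # 1.0, 0.7, 0.0003 (approx.)
```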

Proposition 13

(Perea et al. 1997, Theorem 3.1, p. 241) Fix an extensive game and let \((\sigma ,\mu )\) be an assessment. Then (A) and (B) below are equivalent:

(A)

  (A.1) There exists a function \(\varepsilon :A^{0}\rightarrow (0,1)\) such that, \(\forall h,h^{\prime }\in D\) with \(h^{\prime }\in I(h)\),

    (A.1a) if \(h,h^{\prime }\in D_{\mu }^{+}\) then \(\prod \nolimits _{a\in A^{0}\cap A_{h}}~\varepsilon (a)=\prod \nolimits _{a\in A^{0}\cap A_{h^{\prime }}}~\varepsilon (a)\), and

    (A.1b) if \(h\in D_{\mu }^{+}\) and \(h^{\prime }\notin D_{\mu }^{+}\) then \(\prod \nolimits _{a\in A^{0}\cap A_{h}}~\varepsilon (a)>\prod \nolimits _{a\in A^{0}\cap A_{h^{\prime }}}~\varepsilon (a)\); and

  (A.2) there is a PBSP \(\bar{\sigma }\) which is a completely mixed extension of \(\sigma \) and is such that, \(\forall h,h^{\prime }\in D_{\mu }^{+}\) with \(h^{\prime }\in I(h)\), \(\frac{\mathbb {P}_{\bar{\sigma }}(h)}{\mathbb {P}_{\bar{\sigma }}(h^{\prime })}=\frac{\mu (h)}{\mu (h^{\prime })}\).

(B)

  \((\sigma ,\mu )\) is KW-consistent.

Proof of Proposition 11

\((I)\Rightarrow (II)\). Let \( (\sigma ,\mu )\) be a perfect Bayesian equilibrium which is rationalized by a choice measurable plausibility order \(\precsim \) and is uniformly Bayesian relative to it. We need to show that \((\sigma ,\mu )\) is a sequential equilibrium. Since \((\sigma ,\mu )\) is a perfect Bayesian equilibrium, it is sequentially rational and thus we only need to show that \((\sigma ,\mu )\) is KW-consistent. We shall use choice measurability (Definition 7) to obtain the function \(\varepsilon \) of Proposition 13 (a similar argument can be found in Streufert (2012)) and the full-support common prior \( \nu \) of Definition 9 to obtain the PBSP \(\bar{\sigma }\) .

By hypothesis \(\precsim \) is choice measurable. Fix a cardinal integer-valued representation F of \(\precsim \) and normalize it so that \( F(\emptyset )=0\).Footnote 29 For every action \(a\in A\), define \(\varepsilon (a)=e^{\left[ F(h)-F(ha)\right] }\) for some h such that \(a\in A(h)\). By definition of choice measurability, if \(h^{\prime }\in I(h)\) then \(F(h)-F(ha)=F(h^{\prime })-F(h^{\prime }a)\) and thus the function \(\varepsilon (a)\) is well defined. Furthermore, if \(a\in A^{0}\) (that is, \( \sigma (a)=0\)) it follows from P1 of Definition 2 that \(h\prec ha\) and thus \(F(h)-F(ha)<0\) so that \(0<\varepsilon (a)<1\), while if \(a\notin A^{0}\) (that is, \(\sigma (a)>0\)) then, by P1 of Definition 2, \( h\sim ha\) and thus \(F(h)-F(ha)=0\) so that \(\varepsilon (a)=1\); hence,

$$\begin{aligned} \text {if }A^{0}\cap A_{h}\ne \varnothing ,\text { then }\prod \nolimits _{a\in A_{h}}~\varepsilon (a)=\prod \nolimits _{a\in A^{0}\cap A_{h}}~\varepsilon (a). \end{aligned}$$
(11)
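
A small Python sketch of this construction of \(\varepsilon \), using a hypothetical (choice-measurable) integer-valued representation F on a handful of histories; the histories and values are assumptions, not the paper's example.

```python
# Sketch: epsilon(a) = e^(F(h) - F(ha)) for some h with a in A(h), where F is normalized so F(empty) = 0.
from math import exp

F = {"": 0, "a": 0, "b": 1, "ae": 1, "be": 2}   # F(h) - F(ha) is the same across an information set

def epsilon(a, h):
    return exp(F[h] - F[h + a])

print(epsilon("a", ""))                        # sigma(a) > 0, so h ~ ha and epsilon(a) = e^0 = 1
print(epsilon("b", ""))                        # sigma(b) = 0, so h strictly precedes hb and epsilon(b) = e^(-1) < 1
print(epsilon("e", "a") * epsilon("a", ""))    # product over A_h for h = ae equals e^(-F(ae)), as in (12)
```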

We want to show that the restriction of \(\varepsilon (\cdot )\) to \(A^{0}\) satisfies (A.1) of Proposition 13. Fix an arbitrary history \( h=a_{1}a_{2}\ldots a_{m}\). Since \(\left[ F(\emptyset )-F(a_{1})\right] +\left[ F(a_{1})-F(a_{1}a_{2})\right] +\cdots +\left[ F(a_{1}\ldots a_{m-1})-F(h)\right] =-F(h)\) (recall that \(F(\emptyset )=0\)), it follows that,

$$\begin{aligned} \forall h\in H\backslash \{\emptyset \}, \prod \nolimits _{a\in A_{h}}~\varepsilon (a)=e^{-F(h)}. \end{aligned}$$
(12)

Fix arbitrary \(h,h^{\prime }\in D\backslash \{\emptyset \}\) with \(h^{\prime }\in I(h)\) and \(h\in D_{\mu }^{+}\) (that is, \(\mu (h)>0\)).Footnote 30 Suppose first that \(h^{\prime }\in D_{\mu }^{+}\). Then, by P2 of Definition 2, \(h\sim h^{\prime }\) and thus \(F(h)=F(h^{\prime })\), so that, by (12), \(\prod \nolimits _{a \in A_{h}}~\varepsilon (a)=\prod \nolimits _{a\in A_{h^{\prime }}}~\varepsilon (a).\) Thus, by (11), (A.1a) of Proposition 13 is satisfied. Suppose now that \(h^{\prime }\notin D_{\mu }^{+}\). Then, by P2 of Definition 2, \(h\prec h^{\prime }\) and thus \(F(h)<F(h^{\prime })\) so that \(e^{-F(h^{\prime })}<e^{-F(h)}\) and thus, by (12), \( \prod \nolimits _{a\in A_{h}}~\varepsilon (a)>\prod \nolimits _{a\in A_{h^{\prime }}}~\varepsilon (a)\). Thus, by (11), (A.1b) of Proposition 13 is also satisfied.

Next we prove (A.2). Denote by \(\bar{A}\) the set of “non-terminal actions” , that is, \(\bar{A}=\{a\in A:ha\in D\) for some h with \(a\in A(h)\}\). Let \(\nu \) be a uniform full-support common prior (Definition 9). Define \(\bar{\sigma }:\bar{A}\rightarrow (0,1]\) as follows: \(\bar{\sigma }(a)=\frac{\nu (ha)}{\nu (h)}\) for some h such that \(a\in A(h)\) and \(ha\in D\). By Property UB2 of Definition 9, if \(h^{\prime }\in I(h)\) then \(\frac{\nu (h^{\prime }a)}{\nu (h^{\prime })}=\frac{\nu (ha)}{\nu (h)}\) and thus \(\bar{\sigma }\) is well defined; furthermore, since \(\nu (h)>0\) for all \(h\in D\), \(\bar{\sigma } (a)>0. \) By (A) of Property UB1 of Definition 9, \(\bar{\sigma }(a)\le 1.\) Finally, by (B) of Property UB1 of Definition 9, if \(\sigma (a)>0\) then \(\bar{\sigma }(a)=\sigma (a)\). Thus \(\bar{\sigma }\) is a PBSP which is a completely mixed extension of \(\sigma \). We need to show that \(\forall h,h^{\prime }\in D_{\mu }^{+}\) with \(h^{\prime }\in I(h)\), \(\frac{\mathbb {P}_{\bar{\sigma }}(h)}{\mathbb {P}_{\bar{\sigma }}(h^{\prime }) }=\frac{\mu (h)}{\mu (h^{\prime })}\). If \(h=\emptyset \) it is trivially true because \(h^{\prime }=h\) and \(\mathbb {P}_{\bar{\sigma }}(\emptyset )=\mu (\emptyset )=1\). Fix arbitrary \(h,h^{\prime }\in D_{\mu }^{+}\backslash \{\emptyset \}\) with \(h^{\prime }\in I(h).\) Let \(h=a_{1}a_{2}\ldots a_{p}\) (\( p\ge 1\)) and \(h^{\prime }=b_{1}b_{2}\ldots b_{r}\) (\(r\ge 1\)). By definition of \(\bar{\sigma }\), \(\mathbb {P}_{\bar{\sigma }}(h)=\bar{\sigma }(a_{1})\times \bar{ \sigma }(a_{2})\times \cdots \times ~\bar{\sigma }(a_{p})=\frac{\nu (a_{1})}{\nu (\emptyset )}\times \frac{\nu (a_{1}a_{2})}{\nu (a_{1})}\times \cdots \times \frac{\nu (h)}{\nu (a_{1}a_{2}\ldots a_{p-1})}=\frac{\nu (h)}{\nu (\emptyset )}\) . Similarly, \(\mathbb {P}_{\bar{\sigma }}(h^{\prime })=\frac{\nu (h^{\prime }) }{\nu (\emptyset )}\). Thus \(\frac{\mathbb {P}_{\bar{\sigma }}(h)}{\mathbb {P}_{ \bar{\sigma }}(h^{\prime })}=\frac{\nu (h)}{\nu (h^{\prime })}\). Dividing numerator and denominator of the right-hand-side by \(\mathop {\sum }\nolimits _{h^{\prime \prime }\in E_{i}\cap D_{\mu }^{+}}\nu (h^{\prime \prime })\) and using the fact that (since \(\nu \) is a common prior) \(\frac{\nu (h)}{ \mathop {\sum }\nolimits _{h^{\prime \prime }\in E_{i}\cap D_{\mu }^{+}}\nu (h^{\prime \prime })}=\nu _{E_{i}}(h)\) and \(\frac{\nu (h^{\prime })}{ \mathop {\sum }\nolimits _{h^{\prime \prime }\in E_{i}\cap D_{\mu }^{+}}\nu (h^{\prime \prime })}=\nu _{E_{i}}(h^{\prime })\), where \(E_{i}\) is the equivalence class to which both h and \(h^{\prime }\) belong, we get that \(\frac{\mathbb { P}_{\bar{\sigma }}(h)}{\mathbb {P}_{\bar{\sigma }}(h^{\prime })}=\frac{\nu _{E_{i}}(h)}{\nu _{E_{i}}(h^{\prime })}\); now, dividing numerator and denominator of the right-hand-side by \(\mathop {\sum }\nolimits _{h^{\prime \prime }\in I(h)}\nu _{E_{i}}(h^{\prime \prime })\) and using the fact that, by B3 of Definition 3, \(\frac{\nu _{E_{i}}(h)}{\mathop {\sum }\nolimits _{h^{\prime \prime }\in I(h)}\nu _{E_{i}}(h^{\prime \prime })}=\mu (h)\) and \(\frac{\nu _{E_{i}}(h^{\prime })}{\mathop {\sum }\nolimits _{h^{\prime \prime }\in I(h)}\nu _{E_{i}}(h^{\prime \prime })}=\mu (h^{\prime })\), we obtain \(\frac{\mathbb {P} _{\bar{\sigma }}(h)}{\mathbb {P}_{\bar{\sigma }}(h^{\prime })}=\frac{\mu (h)}{ \mu (h^{\prime })}\), so that (A.2) of Proposition 13 also holds. Hence, by Proposition 13, \((\sigma ,\mu )\) is KW-consistent.
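
The definition of \(\bar{\sigma }\) from the uniform full-support prior \(\nu \) can be illustrated as follows; the prior below is a hypothetical one that happens to satisfy UB1 and UB2 on these few histories.

```python
# Sketch: sigma_bar(a) = nu(ha) / nu(h).  The hypothetical prior nu sums to 1 over the decision
# histories, is monotone along actions (UB1) and has equal ratios across the information set {a, b} (UB2).
nu = {"": 0.4, "a": 0.3, "b": 0.1, "ae": 0.15, "be": 0.05}

def sigma_bar(a, h):
    return nu[h + a] / nu[h]

print(sigma_bar("a", ""))     # 0.75
print(sigma_bar("e", "a"))    # 0.5
print(sigma_bar("e", "b"))    # 0.5 -- the same value, as UB2 requires, so sigma_bar is well defined
```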

\((II)\Rightarrow (I)\). Let \((\sigma ,\mu )\) be a sequential equilibrium. That \((\sigma ,\mu )\) is rationalized by a choice measurable plausibility order \(\precsim \) and is Bayesian relative to it was proved in Bonanno (2013).Footnote 31 Thus we only need to show that it is uniformly Bayesian. By Proposition 13, there exists a completely mixed PBSP \(\bar{\sigma }\) that extends \(\sigma \) and is such that,

$$\begin{aligned} \forall h,h^{\prime }\in D_{\mu }^{+}\quad \text { with }h^{\prime }\in I(h),~~ \frac{\mathbb {P}_{\bar{\sigma }}(h)}{\mathbb {P}_{\bar{\sigma }}(h^{\prime })}= \frac{\mu (h)}{\mu (h^{\prime })}. \end{aligned}$$
(13)

Define \(\bar{\nu }:D\rightarrow (0,1]\) recursively as follows: \(\bar{\nu }(\emptyset )=1\) and, if \(a\in A(h)\) and \(ha\in D\), \(\bar{\nu }(ha)=\bar{\nu }(h)\times \bar{\sigma }(a)\). Since, \(\forall a\in A\), \(\bar{\sigma }(a)\in (0,1]\) and \(\bar{\sigma }(a)=\sigma (a)\) whenever \(\sigma (a)>0\), it follows that

$$\begin{aligned} \begin{array}{c} \text {if }a\in A(h)\text { and }ha\in D\text {, then (}A\text {) }\bar{\nu } (ha)\le \bar{\nu }(h)\text { and} \\ \text {(}B\text {) if }\sigma (a)>0\text { then }\bar{\nu }(ha)=\bar{\nu } (h)\times \sigma (a). \end{array}\nonumber \\ \end{aligned}$$
(14)

Define the probability density function \(\nu :D\rightarrow (0,1]\) by \(\nu (h)=\frac{\bar{\nu }(h)}{\mathop {\sum }\nolimits _{h^{\prime }\in D}\bar{\nu }(h^{\prime })}.\) Then, by (14), \(\nu \) satisfies Property UB1 of Definition 9. Furthermore, if \(a\in A(h)\), h and \(h^{\prime }\) belong to the same information set and \(ha,h^{\prime }a\in D\), then \(\frac{\bar{\nu }(ha)}{\bar{\nu }(h^{\prime }a)}=\frac{\bar{\nu } (h)\times \bar{\sigma }(a)}{\bar{\nu }(h^{\prime })\times \bar{\sigma }(a)}= \frac{\bar{\nu }(h)}{\bar{\nu }(h^{\prime })}\) and thus, dividing numerator and denominator by \(\mathop {\sum }\nolimits _{h^{\prime \prime }\in D}\bar{\nu } (h^{\prime \prime })\), we get that \(\nu \) satisfies Property UB2 of Definition 9.
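
A minimal Python sketch of this converse construction, from a completely mixed extension \(\bar{\sigma }\) to \(\bar{\nu }\) and then to the normalized prior \(\nu \); the decision histories and probabilities below are hypothetical.

```python
# Sketch: nu_bar(empty) = 1, nu_bar(ha) = nu_bar(h) * sigma_bar(a), then normalize to obtain nu.
sigma_bar = {"a": 1.0, "b": 0.001, "e": 0.5}        # completely mixed; sigma(b) = 0 was extended to 0.001
D = ["", "a", "b", "ae", "be"]                      # decision histories, prefix-closed

nu_bar = {"": 1.0}
for h in sorted(D, key=len)[1:]:
    nu_bar[h] = nu_bar[h[:-1]] * sigma_bar[h[-1]]   # nu_bar(ha) = nu_bar(h) * sigma_bar(a)

total = sum(nu_bar.values())
nu = {h: v / total for h, v in nu_bar.items()}

# UB2 holds by construction: the ratio within an information set is preserved by appending e.
print(nu["a"] / nu["b"], nu["ae"] / nu["be"])       # both 1000.0
```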

Furthermore, as shown above,

$$\begin{aligned} \forall h\in D\backslash \{\emptyset \},~\mathbb {P}_{\bar{\sigma }}(h)\overset{def}{=}\mathop {\prod }\limits _{a\in A_{h}}\bar{\sigma }(a)=\frac{\bar{\nu }(h)}{\bar{\nu }(\emptyset )}=\bar{\nu }(h)\text { (since }\bar{\nu }(\emptyset )=1\text {).} \end{aligned}$$
(15)

Thus it follows from (13) and (15) that

$$\begin{aligned} \forall h,h^{\prime }\in D_{\mu }^{+}\text { with }h^{\prime }\in I(h),~~ \frac{{\bar{\nu }}(h)}{{\bar{\nu }}(h^{\prime })}=\frac{\mu (h)}{ \mu (h^{\prime })}\quad \text { and thus }\frac{{\nu }(h)}{{\nu } (h^{\prime })}=\frac{\mu (h)}{\mu (h^{\prime })}. \end{aligned}$$
(16)

Fix an arbitrary equivalence class E of the plausibility order that rationalizes \((\sigma ,\mu )\) such that \(E\cap D_{\mu }^{+}\ne \varnothing \) and define \(\nu _{E}:H\rightarrow [0,1]\) as follows:

$$\begin{aligned} \nu _{E}(h)=\left\{ \begin{array}{ll} \frac{\nu (h)}{\sum _{h^{\prime }\in E\cap D_{\mu }^{+}}\nu (h^{\prime })} &{} \quad \hbox {if}\,h\in E\cap D_{\mu }^{+}\\ 0 &{} \quad \hbox {if}\,h\notin E\cap D_{\mu }^{+}. \end{array} \right. \end{aligned}$$
(17)

By construction \(\nu _{E}\) satisfies Property B1 of Definition 3 and, by UB1 of Definition 9 (proved above), \(\nu _{E}\) satisfies also B2 of Definition 3 (recall that if \(h,h^{\prime }\in E\) with \(h^{\prime }=ha_{1}\ldots a_{m}\) then, by P1 of Definition 2, \(\sigma (a_{i})>0\) for all \(i=1,\ldots ,m\)). It only remains to prove that Property B3 of Definition 3 is satisfied, namely that if \(h\in E\cap D_{\mu }^{+}\) then, for every \( h^{\prime }\in I(h)\), \(\mu (h^{\prime })=\frac{\nu _{E}(h^{\prime })}{ \sum _{h^{\prime \prime }\in I(h)}\nu _{E}(h^{\prime \prime })}\). Number the elements of \(E\cap D_{\mu }^{+}\) from 1 to m in such a way that \(h_{1}=h\) and the first p elements belong to \(I(h_{1})\) and the remaining elements (if any) do not belong to \(I(h_{1})\), that is, \(E\cap D_{\mu }^{+}=\left\{ h_{1},\ldots ,h_{p},h_{p+1},\ldots ,h_{m}\right\} \) with \(h_{1}=h\), \(I(h_{1})\cap E\cap D_{\mu }^{+}=\{h_{1},\ldots ,h_{p}\}\) and, for \(i>p,\) \(h_{i}\notin I(h_{1}) \). We shall prove that

$$\begin{aligned} \frac{\nu _{E}(h_{1})}{\sum _{h^{\prime \prime }\in I(h_{1})}\nu _{E}(h^{\prime \prime })}=\mu (h_{1}). \end{aligned}$$
(18)

The proof for \(1<j\le p\) is similar. By (16), for every \(j=1,\ldots ,m\), \(\frac{{\nu }(h_{j})}{{\nu }(h_{1})}= \frac{\mu (h_{j})}{\mu (h_{1})}\). Thus

$$\begin{aligned} \frac{\sum _{j=1}^{p}{\nu }(h_{j})}{{\nu }(h_{1})}=\frac{ \sum _{j=1}^{p}{\mu }(h_{j})}{{\mu }(h_{1})}. \end{aligned}$$
(19)

By definition of \(\mu \), \(\sum _{j=1}^{p}{\mu }(h_{j})=1\) (since, for any \(h^{\prime }\in I(h_{1})\) that does not belong to \(E\cap D_{\mu }^{+}\), \(\mu (h^{\prime })=0\): recall that, by Property P2 of Definition 2, if \(h^{\prime }\in I(h_{1})\) is such that \(\mu (h^{\prime })>0\) then \(h^{\prime }\sim h\), that is, \(h^{\prime }\in E\)). Hence \(\frac{{\nu }(h_{1})}{\sum _{j=1}^{p}{\nu }(h_{j})}=\mu (h_{1}).\) By (17), dividing the numerator and the denominator of the left-hand side by \(\sum _{i=1}^{m}{\nu }(h_{i})\) we obtain

$$\begin{aligned} \frac{{\nu }_{E}(h_{1})}{\sum _{j=1}^{p}{\nu }_{E}(h_{j})}=\mu (h_{1}) \end{aligned}$$
(20)

Since, by (17), for any \(h^{\prime }\in I(h_{1})\) that does not belong to \(E\cap D_{\mu }^{+}\), \(\nu _{E}(h^{\prime })=0\), we have \(\sum _{h^{\prime \prime }\in I(h_{1})}\nu _{E}(h^{\prime \prime })=\sum _{j=1}^{p}{\nu }_{E}(h_{j})\). Thus (20) yields the desired (18). Since, by construction, \(\nu \) is a full-support common prior of the collection of probability density functions \(\nu _{E}\) given in (17), which have been shown to satisfy the properties of Definition 3, the proof that \((\sigma ,\mu )\) is uniformly Bayesian is complete. \(\square \)

Cite this article

Bonanno, G. AGM-consistency and perfect Bayesian equilibrium. Part II: from PBE to sequential equilibrium. Int J Game Theory 45, 1071–1094 (2016). https://doi.org/10.1007/s00182-015-0506-6
