Common belief in future and restricted past rationality

We introduce the idea that a player believes at every stage of a dynamic game that his opponents will choose rationally in the future and have chosen rationally in a restricted way in the past. This is summarized by the concept of common belief in future and restricted past rationality, which is defined epistemically. Moreover, it is shown that every properly rationalizable strategy of the normal form of a dynamic game can be chosen in the dynamic game under common belief in future and restricted past rationality. We also present an algorithm that uses strict dominance, and show that it selects exactly those strategies that can be chosen under common belief in future and restricted past rationality.


Introduction
Epistemic game theory deals with the reasoning processes of an individual about his opponents before he makes a decision. This requires a belief about the choices of his opponents, but also a belief about the opponents' beliefs about their opponents' choices, and so on.
Such reasoning processes have been studied thoroughly in the framework of static games, in various forms of the concept of common belief in rationality. However, the extension of these concepts to the framework of dynamic games is not entirely trivial. One possible way to extend the idea of common belief in rationality would require that the players believe their opponents make only rational choices, in particular that past choices have been rational. However, in many cases this is not possible, since there may be stages in the game where players have to conclude that an opponent has chosen irrationally in the past.
To solve this problem, some alternative concepts have been proposed. Battigalli and Siniscalchi (2002) propose the concept of common strong belief in rationality, in which players, whenever possible, must believe that their opponents are implementing rational strategies. Perea (2014) proposed the concept of common belief in future rationality, in which at each decision point a player must believe that all players are rational in the present and in the future, but which allows players to believe that irrational choices have been made in the past. This concept is similar to sequential rationalizability, proposed by Dekel et al. (1999) and Asheim and Perea (2005). Reny (1992, 1993) studies the idea of common belief in past and future rationality at all information sets, coming to the conclusion that in most games it is not possible to reason under such a concept.
Taking as a starting point the concept of common belief in future rationality, which allows players to believe that past choices were irrational, we consider a concept that additionally assumes a restricted notion of belief in past rationality. The example presented in Fig. 1 will be used to illustrate the concepts under discussion; it also serves as one of the motivations for developing a new rationality concept.
The key idea in the new concept we propose is that a player does not only believe that his opponents choose rationally in the future, but also that the decisions made in the past were rational among a restricted set of choices. In Fig. 1 we can see that at ∅ the optimal choice for player 1 is c. However, if the game were to reach h 1 , player 2 must believe that a suboptimal choice was made at ∅ . Under the concept of common belief in future rationality player 2 can assume either a or b was chosen at ∅ , as there is no restriction on the beliefs about choices made in the past. We propose that player 2 should reason about the choice made at ∅ by considering only those choices that reach h 1 and from those find which are optimal: in this case, we can see that a is the best choice for player 1 from those that reach h 1 assuming he would choose f afterwards. Hence, under the new concept, player 2 must believe at h 1 that player 1 chose a in the past.
The concept proposed here, which we call "common belief in future and restricted past rationality", is a refinement of common belief in future rationality. The difference is that common belief in future rationality does not reason about the choices made in the past, while the addition of "restricted past rationality" makes players consider the subset of past choices that reach an information set and find the optimal choice within this subset. In other terms, a player at an information set h may still believe that the opponents chose irrationally in the past, but if that is the case, he must believe that the opponents chose the "least irrational" strategies among the strategies that reach h. A key feature of our new concept is that belief in the opponents' restricted past rationality is always possible, whereas belief in the opponents' (unrestricted) past rationality is only possible at information sets that can be reached by rational strategies, and thus disregards the rest of the game.
Since we are ranking the possible mistakes, we expect a connection between the concept of proper rationalizability, proposed by Schuhmacher (1999) and Asheim (2001), for the normal form of a dynamic game and common belief in future and restricted past rationality for the dynamic game itself. In Theorem 1 it is shown that properly rationalizable strategies in the normal form can rationally be chosen under common belief in future and restricted past rationality. Since properly rationalizable strategies exist for every finite normal form game, it follows that strategies that can rationally be chosen under common belief in future and restricted past rationality exist for every finite dynamic game. This shows that common belief in future and restricted past rationality imposes plausible restrictions which, when studying the normal form, allow players to make reasonable decisions in the dynamic game. In addition we propose an algorithm for this concept, and we show that it delivers exactly the strategies that can rationally be chosen under common belief in future and restricted past rationality.
Note that the example presented in Fig. 1 has unobserved past choices. This is intentional, because we can easily show that for games with observable past choices in which we possibly allow for simultaneous moves, common belief in future rationality and common belief in future and restricted past rationality are equivalent. To see this, note that under observable choices, every information set for every player is a singleton which implies that each player knows exactly what the previous choices were. Since this happens, players do not have to reason about what possible choices were made earlier, since such choices are already given and known to everyone, reducing the reasoning only to future choices. This also shows that the algorithm presented here and the backward dominance procedure proposed in Perea (2014) coincide when the games have observable past choices, as the description of the second set Ŝ k −i (h) in Step k of Algorithm 1 reduces to a description that is equivalent to the sets Γ k (h) in the inductive step of the backward dominance procedure.
For one-shot games, common belief in future rationality and common belief in future and restricted past rationality both coincide with common belief in rationality, and hence are equivalent. Since it is well known that common belief in rationality is weaker than proper rationalizability, there are one-shot games in which a strategy can rationally be chosen under common belief in future and restricted past rationality, but that same strategy is not properly rationalizable in the normal form. Hence, the converse of Theorem 1 does not hold. Such a game is presented in Fig. 2, which does not have generic payoffs. In this game, strategies b and d can rationally be chosen under common belief in future and restricted past rationality, but b and d are not properly rationalizable. We conjecture that for generic payoffs, proper rationalizability is equivalent to common belief in future and restricted past rationality.
It was shown by Asheim (2001) that every choice that has positive probability in some proper equilibrium is optimal for some properly rationalizable type. Therefore proper rationalizability can be seen as the non-equilibrium analogue of proper equilibrium. van Damme (1984) proves that for every dynamic game, the proper equilibria of its normal form induce quasi-perfect equilibria of the dynamic game. In this way he shows that it is possible to reason about a dynamic game in terms of its normal form and obtain equilibria of the dynamic game by looking at the normal form only. This is precisely one of the driving ideas behind the present paper, in which the concept of proper rationalizability, which is less restrictive than proper equilibrium, is linked to a concept for dynamic games that, in contrast to common belief in future rationality, takes into account a restricted version of rationality in the past. Also in contrast to common strong belief in rationality, it makes players reason about the optimality of choices at every information set, even if an information set can only be reached by past choices that are suboptimal. Another motivation for proposing a new reasoning concept is that common belief in future rationality only reasons about what can happen from a certain point in time onwards, without caring about how the game got to that point. On the other hand, strong belief in rationality may impose no restrictions at information sets that can only be reached by irrational past choices. Our concept, in contrast, may impose restrictions even in such situations.
Indeed, if the game reaches a certain information set h, the player that has to choose knows the game could reach this information set only if his opponents have previously made choices that can reach h. Therefore, according to our concept, this player believes his opponents chose the most plausible among the choices that actually reach h. That is, he concentrates on those opponents' strategies that reach h, that are optimal at all future information sets, and that are optimal at all past information sets among the strategies that reach h.
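The belief restriction just described can be sketched in code. The sketch below is illustrative, not the paper's formal machinery: the strategy labels, information-set names and utility numbers are hypothetical placeholders. Given the opponent's strategies that reach h and his utility at each past information set, we keep the strategies that are optimal at every past information set among those reaching h.

```python
def restricted_past_rational(reaching_h, past_sets, utility):
    """Keep the opponent's strategies that reach h and are optimal at every
    past information set, among the strategies that reach h."""
    keep = set(reaching_h)
    for h_past in past_sets:
        best = max(utility[h_past][s] for s in reaching_h)
        keep &= {s for s in reaching_h if utility[h_past][s] == best}
    return keep

# Fig. 1 schematically: strategies (a,f), (a,g) and b reach h1; the only past
# information set is the empty history.  The utility numbers are made up,
# chosen so that a (followed by f) is best among the strategies reaching h1.
utility = {"empty": {"af": 3, "ag": 1, "b": 2}}
print(restricted_past_rational({"af", "ag", "b"}, ["empty"], utility))  # {'af'}
```

Under these illustrative numbers, player 2 at h1 must believe that player 1 chose a, matching the reasoning in the example above.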
The structure of the paper is as follows. In Sect. 2 we discuss a few examples that highlight some properties of our new concept. In Sect. 3 we introduce dynamic games. In Sect. 4 we present the concept of proper rationalizability for the normal form of a dynamic game. In Sect. 5 we introduce the notion of common belief in future and restricted past rationality for a dynamic game. In Sect. 6 the two rationalizability concepts are connected, by showing that strategies that are properly rationalizable can also be chosen under common belief in future and restricted past rationality. In Sect. 7 we describe an algorithm and show that it yields precisely those strategies that can be chosen under common belief in future and restricted past rationality. Section 8 contains concluding remarks and Sect. 9 contains all the proofs of this paper.

Examples
We present examples that illustrate properties of the concept of common belief in future and restricted past rationality which differentiate it from previously known concepts for dynamic games.
The example in Fig. 1 has shown that common belief in future and restricted past rationality can be more restrictive than common belief in future rationality in terms of strategies. The example presented in Fig. 3 shows that it can also be more restrictive in terms of outcomes.
Under common belief in future rationality, player 2 can rationally choose d or e, and hence player 1 can rationally choose a or c. Therefore, common belief in future rationality allows for the outcomes (a, d), (a, e) and c. However, since a is better than b for player 1, common belief in future and restricted past rationality requires player 2 to believe at h 1 that player 1 chose a. Therefore, player 2 must choose d and player 1 must choose a. Hence, common belief in future and restricted past rationality only allows for the outcome (a, d).
Moreover, it is not forward induction reasoning that is used in the concept presented here. To see this, consider the game in Fig. 4, the Battle of the sexes with an "Outside" option.
Under forward induction, the only possible outcome is player 1 choosing a and player 2 choosing d: player 2, upon noticing that the game has reached h 1 , reasons that player 1 must have chosen a, since that is the only way player 1 can get more than by choosing c. Therefore, player 2 must choose d.
Under our concept, however, the only choice that is eliminated is b for player 1, because b is strictly dominated by c at ∅ . Player 2 can still believe at h 1 that player 1 chose b. Indeed, if player 1 believes that player 2 chooses e, strategy b is better than a, and hence player 2 may believe at h 1 that player 1 chose b. Consequently, player 2 may choose e under our concept, and player 1 may choose c. However, c is not the forward induction strategy for player 1.

The example shown in Fig. 5 illustrates why the definition of the concept presented here requires players to reason about every past information set, and not just some of the previous information sets. Note that a is better than b for player 1 at ∅ , and that f is better than g for player 1 at h 2 . Therefore, under our concept, player 2 must believe at h 4 that the second node has been reached, which determines player 2's choice there. To reach this conclusion, it is important that player 2 reasons about both ∅ and h 2 . Reasoning only about ∅ , or only about h 2 and h 3 , would not be sufficient to draw this conclusion.

Dynamic games
In this section we define the dynamic games we consider, and some general notions that will be used throughout the paper. In what follows we assume the players have perfect recall.
Definition 1 (Dynamic game) A dynamic game G is a tuple consisting of the following elements:

• I is the finite set of players;
• C i is the finite set of choices for each player i ∈ I;
• X is the set of non-terminal histories, which are sequences of profiles of choices x = (x 1 , … , x k ) , with x m = (c i ) i∈Î ∈ × i∈Î C i for some non-empty Î ⊆ I , and for all ℓ < k , (x 1 , … , x ℓ ) is also a history. As Î may contain more than one player, simultaneous moves are allowed;
• Z is the set of terminal histories of the game. If z = (x 1 , … , x k ) ∈ Z , then for every ℓ < k , (x 1 , … , x ℓ ) ∈ X;
• H i is a finite collection of information sets for player i. The information sets h ∈ H i are non-empty sets of non-terminal histories. If h contains more than one history, then player i does not know with certainty which history was realized to arrive at h. The collections of information sets of the players are not necessarily disjoint since we allow for simultaneous moves, so the same information set may belong to two or more players at the same time. The collection of all information sets for all players in the game is denoted by H;
• C i (h) ⊆ C i is the finite set of choices available to player i at the information set h ∈ H i . We say c ∈ C i (h) if there are a history x ∈ h and a profile x m = (c j ) j∈Î with i ∈ Î and c i = c such that (x, x m ) ∈ X ∪ Z ; and
• u i ∶ Z → ℝ is player i's utility function.
As an example, the game described in Fig. 1 is a two-player dynamic game in extensive form.

We define a partial order on the information sets of a game. An information set h ′ immediately follows h, or h immediately precedes h ′ , if there exists a non-empty profile of choices x m such that (x, x m ) ∈ h ′ for some history x ∈ h.

During the game, each player makes one or more choices, sometimes depending on his previous choices or on the choices of other players. However, if a player's choice prevents him from making some other choices, there is no reason to make a plan that includes both the former choice and any of the latter ones. Therefore, we restrict ourselves to studying plans that only prescribe choices at information sets that are reachable under the earlier choices: a "plan of action", as described in Rubinstein (1991). These plans we will call strategies. We also identify those strategies that can potentially reach an information set.
Looking at the game shown in Fig. 1, the sets of strategies are S 1 = {(a, f ), (a, g), b, c} and S 2 = {d, e} . In classical game theory, other sequences such as (b, f) would also qualify as strategies; however, player 1 prevents himself from choosing f by choosing b at an earlier information set, rendering the choice f unnecessary.

The set of strategies for player i is denoted by S i , and the set of strategy combinations for the opponents of i by S −i = × j≠i S j . A strategy combination for all players is written (s i , s −i ) ∈ S i × S −i . The set of strategies for player i that lead to h is denoted by S i (h) , and the set of strategy combinations for the opponents of i that lead to h by S −i (h) . The set of information sets for player i that strategy s i leads to is denoted by H i (s i ).
Finally, we identify those strategy combinations that reach a particular information set. For a strategy combination (s i , s −i ) ∈ S i × S −i , we define H(s i , s −i ) as the collection of information sets h such that s i ∈ S i (h) and s −i ∈ S −i (h) ; these are the information sets that can be reached with the strategy combination (s i , s −i ).
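As an illustration, the reach sets of Fig. 1 can be encoded directly and H(s 1 , s 2 ) computed from them. This is a minimal sketch: the sets S 1 (h 1 ) and S 2 (h 2 ) below are the ones stated in the worked examples later in the paper, while the entry for S 1 (h 2 ) is an assumption about the game's structure.

```python
INFO_SETS = ["empty", "h1", "h2"]

# S[i][h]: strategies of player i that lead to information set h (Fig. 1).
# Values at h1 and h2 follow the worked example; S[1]["h2"] is an assumption.
S = {
    1: {"empty": {"af", "ag", "b", "c"}, "h1": {"af", "ag", "b"}, "h2": {"af", "ag"}},
    2: {"empty": {"d", "e"},             "h1": {"d", "e"},        "h2": {"d"}},
}

def H(s1, s2):
    """H(s1, s2): the information sets reachable under the strategy profile,
    i.e. those h with s1 in S_1(h) and s2 in S_2(h)."""
    return {h for h in INFO_SETS if s1 in S[1][h] and s2 in S[2][h]}

print(sorted(H("af", "d")))  # ['empty', 'h1', 'h2']
print(sorted(H("c", "e")))   # ['empty']
```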

Proper rationalizability
To connect the rationalizability concepts in dynamic games with related rationalizability concepts in normal form games, we also need to connect a dynamic game with a related game in its normal form.
Definition 2 (Normal form of a dynamic game) Let G be a dynamic game. The normal form of G is the game G ′ = (I, (S i ) i∈I , (v i ) i∈I ) in which all players i simultaneously choose a strategy s i ∈ S i , and each player i receives the utility v i ((s j ) j∈I ) = u i (z) , where z is the terminal history reached in G under the strategy combination (s j ) j∈I .

We define a structure called an epistemic model with types, which serves as a compact way to encode belief hierarchies, so we can derive the various levels of belief for each type in the epistemic model. Then we define strategy-type combinations, which are the objects on which beliefs are constructed, and lexicographic beliefs.

Definition 3 (Epistemic model for a normal form game) An epistemic model M = (T i , b i ) i∈I consists of a finite set of types T i for each player i, and for each type t i ∈ T i a lexicographic belief b i (t i ) = (b 1 i (t i ), b 2 i (t i ), … , b K i (t i )) , where every level b k i (t i ) is a probability distribution on S −i × T −i , the set of strategy-type combinations of i's opponents.
To derive a lexicographic belief hierarchy for every type, consider a type t i and its lexicographic belief b i (t i ) = (b 1 i (t i ), … , b K i (t i )) . For the first order of the lexicographic belief hierarchy of t i , we have that player i deems the strategies in the support of b 1 i (t i ) infinitely more likely than the strategies that are in the support of b 2 i (t i ) but not in the support of b 1 i (t i ) ; and deems the strategies in the support of b 2 i (t i ) infinitely more likely than the strategies that are in the support of b 3 i (t i ) but not in the support of a previous level; and so on. For the second order of the lexicographic belief hierarchy of t i , we have that player i deems the lexicographic beliefs of each type that appears in b 1 i (t i ) infinitely more likely than the lexicographic beliefs of each type that appears in b 2 i (t i ) but did not appear in b 1 i (t i ) ; and deems the lexicographic beliefs of each type that appears in b 2 i (t i ) but did not appear in b 1 i (t i ) infinitely more likely than the lexicographic beliefs of each type that appears in b 3 i (t i ) but did not appear in a previous level; and so on. Continuing this way we obtain the full lexicographic belief hierarchy.
We say type t j is deemed possible by type t i if some level of the lexicographic belief b i (t i ) assigns positive probability to a strategy-type combination involving t j . If positive probability is assigned to a strategy-type combination at a level ℓ , earlier than another strategy-type combination at a level k, with ℓ < k , we say that the first combination is deemed infinitely more likely than the second one.

Definition 4 (Strategy-type combinations deemed infinitely more likely) Let b i (t i ) = (b 1 i (t i ), … , b K i (t i )) be a lexicographic belief for type t i for player i. We say t i deems a strategy-type combination (s j , t j ) infinitely more likely than (s ′ j , t ′ j ) if (s j , t j ) receives positive probability at an earlier level of b i (t i ) than (s ′ j , t ′ j ).

We focus on a particular kind of lexicographic belief, in which, for every type combination of i's opponents that is deemed possible in the belief, every strategy combination of i's opponents receives positive probability at some level k.

Definition 5 (Cautious lexicographic belief) Consider an epistemic model M = (T i , b i ) i∈I . A lexicographic belief b i (t i ) is cautious if, for every opponents' type combination t −i deemed possible by t i and every strategy combination s −i ∈ S −i , some level b k i (t i ) assigns positive probability to (s −i , t −i ) . A type t i is cautious if its lexicographic belief is cautious.
In order to compare strategies for a player we define the expected utility for a given lexicographic belief. Note that it is defined by levels, and the comparison is made at the first level in which two strategies disagree in their expected utility.
Given a type t i for player i with lexicographic belief b i (t i ) , the expected utility of choosing strategy s i at level k is u k i (s i , t i ) = ∑ (s −i ,t −i ) b k i (t i )(s −i , t −i ) v i (s i , s −i ) . Type t i prefers s i to s ′ i if there is a level k such that u k i (s i , t i ) > u k i (s ′ i , t i ) and u ℓ i (s i , t i ) = u ℓ i (s ′ i , t i ) for every ℓ < k. A strategy s i is optimal for t i if there is no other s ′ i ∈ S i such that t i prefers s ′ i to s i . Now we define the notion of rationalizability that will be used for normal form games: respect of preferences, due to Asheim (2001), which in turn is used to define the concept of proper rationalizability.
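The level-by-level comparison can be made concrete with a small sketch. Everything here is illustrative: the belief levels and the utility table are hypothetical numbers, chosen only so that the lexicographic tie-breaking at deeper levels is visible.

```python
def expected_utilities(s, levels, v):
    """One expected utility per belief level; each level is a probability
    distribution over the opponent's strategies, v[(s, t)] the utility."""
    return [sum(p * v[(s, t)] for t, p in level.items()) for level in levels]

def prefers(s, s_prime, levels, v):
    """Lexicographic preference: decided at the first level where the
    expected utilities of s and s_prime differ."""
    for a, b in zip(expected_utilities(s, levels, v),
                    expected_utilities(s_prime, levels, v)):
        if a != b:
            return a > b
    return False  # equal at every level: no strict preference

# Illustrative beliefs and payoffs: the opponent's strategy d is deemed
# infinitely more likely than e, and x and y tie against d.
levels = [{"d": 1.0}, {"e": 1.0}]
v = {("x", "d"): 2, ("x", "e"): 3, ("y", "d"): 2, ("y", "e"): 1}
print(prefers("x", "y", levels, v))  # True: tie at level 1, x wins at level 2
```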

Definition 6 (Respect of preferences) Consider an epistemic model M = (T i , b i ) i∈I and let b i (t i ) be a lexicographic belief for type t i for player i. We say t i respects j's preferences if for every type t j of player j deemed possible by t i , and all strategies s j , s ′ j ∈ S j such that t j prefers s j to s ′ j , t i deems the strategy-type combination (s j , t j ) infinitely more likely than (s ′ j , t j ) . We say t i respects the opponents' preferences if t i respects j's preferences for all j ∈ I⧵{i}.

Definition 7 (k-fold and common full belief in caution)

1. Type t i expresses 1-fold full belief in caution if t i only deems possible opponents' types that are cautious.
2. For every k > 1 , type t i expresses k-fold full belief in caution if t i only deems possible opponents' types that express (k − 1)-fold full belief in caution.
3. Type t i expresses common full belief in caution if t i expresses k-fold full belief in caution for all k ∈ ℕ.
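With finitely many types, common full belief in caution can be checked by iterated elimination. The sketch below is one reasonable implementation under simplifying assumptions, not the paper's formal construction: `deems_possible[t]` is a hypothetical map from each type to the set of opponents' types it deems possible, and the fixed point contains the cautious types whose deemed-possible types all survive every round.

```python
def common_full_belief_in_caution(types, cautious, deems_possible):
    """Start from the cautious types and repeatedly discard any type that
    deems possible an already-discarded type; return the fixed point."""
    surviving = {t for t in types if cautious[t]}
    while True:
        nxt = {t for t in surviving if deems_possible[t] <= surviving}
        if nxt == surviving:
            return surviving
        surviving = nxt

# Toy example: t3 is not cautious; t1 and t2 only deem each other possible.
types = {"t1", "t2", "t3"}
cautious = {"t1": True, "t2": True, "t3": False}
deems_possible = {"t1": {"t2"}, "t2": {"t1"}, "t3": {"t1"}}
print(sorted(common_full_belief_in_caution(types, cautious, deems_possible)))
```

If instead t2 deemed possible only the non-cautious t3, the elimination would cascade and no type would survive.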
In a similar way we can define k-fold and common full belief in respect of preferences. Now we can define proper rationalizability, which was introduced by Schuhmacher (1999). However, in this section we use the characterization of this concept given by Asheim (2001) which uses lexicographic beliefs.
Definition 8 (Proper rationalizability) Type t i is properly rationalizable if t i is cautious, respects the opponents' preferences and expresses common full belief in caution and common full belief in respect of preferences.
A strategy s i for player i is properly rationalizable if there exists an epistemic model M = (T i , b i ) i∈I and some type t i ∈ T i such that t i is properly rationalizable, and strategy s i is optimal for type t i .
For the game in Fig. 1, consider the epistemic model given in Table 1. The first level of belief b 1 (t 1 ) is the Dirac measure that assigns probability 1 to the strategy-type pair (d, t 2 ) , and the second level is the Dirac measure that assigns probability 1 to (e, t 2 ) . Analogously the belief b 2 (t 2 ) is also shorthand for a collection of Dirac measures. We shall check that each type is properly rationalizable.
Type t 1 only deems possible type t 2 , and the strategy-type combinations (d, t 2 ) and (e, t 2 ) appear at some level of b 1 (t 1 ) , so t 1 is cautious. Similarly t 2 only deems possible type t 1 , and the strategy-type combinations ((a, f ), t 1 ) , ((a, g), t 1 ) , (b, t 1 ) and (c, t 1 ) appear at some level of b 2 (t 2 ) , so t 2 is cautious.
Type t 1 believes player 2 is of type t 2 , which believes at the first level of b 2 (t 2 ) that player 1 will choose c, and at the second level that player 1 will choose (a, f), in which case the order of preference for player 2 is d, then e, so t 1 respects the opponent's preferences.
Type t 2 believes player 1 is of type t 1 , which believes at the first level of b 1 (t 1 ) that player 2 will choose d, in which case the order of preference for player 1 is c, then (a, f), followed by b and finally (a, g), so t 2 respects the opponent's preferences.
Since all the types in the epistemic model are cautious and respect the opponent's preferences, all the types are properly rationalizable. For player 1, c is a strategy that is optimal for t 1 , and for player 2, d is a strategy that is optimal for t 2 . Therefore c and d are properly rationalizable.

Common belief in future and restricted past rationality
Now we turn to dynamic games, and we will define the concept of common belief in future and restricted past rationality. In Sect. 6 we will connect the concept to proper rationalizability of the normal form.
We first define an epistemic model for a dynamic game, which is rather similar to the definition for normal form games, except the beliefs depend on the information set.
Definition 9 (Epistemic model for a dynamic game) An epistemic model M = (T i , β i ) i∈I for a dynamic game G consists of a finite set of types T i for each player i, and for each type t i ∈ T i and each information set h ∈ H i a conditional belief β i (t i , h) , which is a probability distribution on S −i (h) × T −i . Given a type t i , an information set h ∈ H i and a conditional belief β i (t i , h) , a strategy s i ∈ S i (h) is optimal for t i at h if no other strategy in S i (h) yields a higher expected utility at h with respect to β i (t i , h).

Now we define the key conditions that will be used: belief in future rationality as defined in Perea (2014), Bayesian updating, and a new notion that we propose, which requires players to think about the past rationality of the opponents, insofar as it concerns the strategies that reach the information set at which the player is. We define all three notions separately, then we define common belief in future rationality and common belief in restricted past rationality in an iterative way, to combine them into one concept that refines common belief in future rationality. We should point out that in Definitions 10 and 13 we allow for information sets that weakly follow or precede one another. That is because, with our definition of a dynamic game, the same information set may belong to two or more players, as we allow for simultaneous moves.
Definition 10 (Belief in the opponents' future rationality) We say that a type t i believes in j's future rationality if at every information set h ∈ H i , the conditional belief β i (t i , h) only assigns positive probability to strategy-type combinations (s j , t j ) where s j is optimal for t j at every information set h ′ ∈ H j (s j ) that weakly follows h. Type t i believes in the opponents' future rationality if t i believes in j's future rationality for all players j ∈ I⧵{i}.
Definition 11 (k-fold and common belief in future rationality)

1. Type t i expresses 1-fold belief in future rationality if t i believes in the opponents' future rationality.
2. For every k > 1 , type t i expresses k-fold belief in future rationality if at every information set h ∈ H i , t i only assigns positive probability to opponents' types that express (k − 1)-fold belief in future rationality.
3. Type t i expresses common belief in future rationality if t i expresses k-fold belief in future rationality for every k ∈ ℕ.
Definition 12 (Bayesian updating) Type t i satisfies Bayesian updating if for every pair of information sets h, h ′ ∈ H i such that h ′ follows h and β i (t i , h) assigns positive probability to S −i (h ′ ) × T −i , the conditional belief β i (t i , h ′ ) is the restriction of β i (t i , h) to S −i (h ′ ) × T −i , normalized.

Definition 13 (Belief in the opponents' restricted past rationality) We say that a type t i believes in j's restricted past rationality if at every information set h ∈ H i , the conditional belief β i (t i , h) only assigns positive probability to strategies s j such that, at every information set h ′ ∈ H j weakly preceding h with s j ∈ S j (h ′ ) , s j is optimal for j at h ′ among the strategies in S j (h ′ ) that also reach h. Type t i believes in the opponents' restricted past rationality if t i believes in j's restricted past rationality for all players j ∈ I⧵{i}.
The previous definition establishes that type t i must reason at h about those strategies of his opponents that can be chosen at a previous information set h ′ , but only if those strategies can reach the information set h too. That is, i considers at h only those strategies at h ′ that give the highest utility to the opponent at h ′ from those strategies that actually reach h.
We can define k-fold and common belief in restricted past rationality, and k-fold and common belief in Bayesian updating in an analogous way to the definition of k-fold and common belief in future rationality.
A strategy s i for player i can rationally be chosen under common belief in future and restricted past rationality and common belief in Bayesian updating if there exists an epistemic model M = (T i , β i ) i∈I and some type t i ∈ T i such that t i expresses common belief in future and restricted past rationality and common belief in Bayesian updating, and strategy s i is optimal for type t i at every information set h ∈ H i (s i ).
Returning to the example shown in Fig. 1, consider the epistemic model given in Table 2, for which we check that every type expresses common belief in future and restricted past rationality and satisfies Bayesian updating.
At ∅ ∈ H 1 , t 1 believes that player 2 chooses d and is of type t 2 . Type t 2 believes at h 1 , which weakly follows ∅ , that player 1 chooses (a, f), so the optimal strategy in S 2 (h 1 ) = {d, e} for player 2 is d. Therefore t 1 believes in the opponent's future rationality at ∅ . Since there are no information sets for player 2 that weakly precede ∅ , t 1 believes in the opponent's restricted past rationality at ∅.
At h 2 ∈ H 1 there are no information sets for player 2 that weakly follow h 2 , so t 1 believes in the opponent's future rationality at h 2 . Now, type t 1 believes at h 2 that player 2 chooses d and is of type t 2 ; in fact S 2 (h 1 ) ∩ S 2 (h 2 ) = {d} . Therefore t 1 believes in the opponent's restricted past rationality at h 2 . Moreover, t 1 satisfies Bayesian updating if the game moves from ∅ to h 2 .
At h 1 ∈ H 2 , t 2 believes that player 1 chooses (a, f) and is of type t 1 . Type t 1 believes at h 2 , which weakly follows h 1 , that player 2 chooses d at h 1 , so the optimal strategy in S 1 (h 2 ) for player 1 is (a, f). Therefore t 2 believes in the opponent's future rationality. Type t 1 believes at ∅ , which weakly precedes h 1 , that player 2 chooses d at h 1 , so the optimal strategy in S 1 (∅) ∩ S 1 (h 1 ) = {(a, f ), (a, g), b} for player 1 is (a, f). Therefore t 2 believes in the opponent's restricted past rationality. Finally it can easily be seen that t 2 satisfies Bayesian updating, as h 1 is player 2's only information set. We can see that among all strategies in S 1 (∅) , (a, f) is not optimal for t 1 at ∅ , as c gives a higher utility.
Since all the types in the epistemic model believe in the opponent's future and restricted past rationality and satisfy Bayesian updating, then all the types express common belief in future and restricted past rationality and common belief in Bayesian updating. For player 1, c is optimal for type t 1 at information set ∅ , and for player 2, d is optimal for type t 2 at information set h 1 . Therefore c and d can rationally be chosen under common belief in future and restricted past rationality and common belief in Bayesian updating.

Connection with proper rationalizability
In this section we prove one of our main theorems, which states that proper rationalizability of a strategy in the normal form implies optimality of the same strategy under common belief in future and restricted past rationality with Bayesian updating in the dynamic game.
In order to do so, we break down the proof into four smaller parts. We start by showing that optimality of a strategy for a cautious type in the normal form of the game implies optimality of the same strategy for the induced type in the dynamic game. Then we go on to show that respect of the opponents' preferences in the normal form implies belief in the opponents' future and restricted past rationality and Bayesian updating in the dynamic game. As a consequence, proper rationalizability in the normal form implies common belief in future and restricted past rationality and common belief in Bayesian updating in the dynamic game. This finally implies that every strategy which is properly rationalizable in the normal form can rationally be chosen under common belief in future and restricted past rationality with Bayesian updating in the dynamic game.
Theorem 1 Consider a dynamic game G. If a strategy s i is properly rationalizable in the normal form of G, then s i can rationally be chosen under common belief in future and restricted past rationality and common belief in Bayesian updating in the dynamic game G.
This result has a connection with van Damme (1984), who showed that every proper equilibrium in the normal form of a game implies a quasi-perfect equilibrium in the dynamic game, which in turn implies a sequential equilibrium in the dynamic game. The non-equilibrium analogue for proper equilibria is proper rationalizability. Moreover, every sequential equilibrium is a subgame perfect equilibrium, which, as shown by Perea and Predtetchinski (2019), is the equilibrium counterpart of common belief in future rationality in the case of two-player games. In this way, our theorem may be viewed as a non-equilibrium analogue to van Damme's result.
As a first step to establishing Theorem 1, we define a way to transform an epistemic model of the normal form into an epistemic model for the dynamic game.
Let M = (T_i, b_i)_{i∈I} be an epistemic model for the normal form of the game in which every type t_i ∈ T_i is cautious, for all i ∈ I. We define the induced epistemic model for the dynamic game, M̂ = (T̂_i, β_i)_{i∈I}, in the following way: for each player i take the bijective mapping f_i : T_i → T̂_i, effectively a renaming of the types, and let the conditional belief of type f_i(t_i) at the information set h ∈ H_i be defined by

β_i(f_i(t_i), h)(s_{-i}, f_{-i}(t_{-i})) = b_i^k(t_i)(s_{-i}, t_{-i}) / b_i^k(t_i)(S_{-i}(h) × T_{-i}) for every s_{-i} ∈ S_{-i}(h),

where k is the smallest number for which b_i^k(t_i)(S_{-i}(h) × T_{-i}) > 0; that is, we take the first level k of the lexicographic belief for t_i in which there is at least one strategy combination for i's opponents that reaches h, and normalize the probabilities accordingly. By this construction, the conditional beliefs of the induced types satisfy Bayesian updating. Although some information that could be useful for tie-breaking is lost when constructing the conditional beliefs for the dynamic game, such information is not required for our purposes. To illustrate how cautious lexicographic beliefs are transformed into conditional beliefs, consider the game from Fig. 1. If the epistemic model for its normal form is the one in Table 3, then the epistemic model induced for the dynamic game is the one in Table 4. Now that we have a way to relate epistemic models of the normal form to those of the dynamic game, we examine how the rationalizability concepts relate to each other. First we show that optimality of a strategy for a cautious type in the normal form of the game implies optimality of the same strategy for the induced type in the dynamic game. This is presented in the following lemma.
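As an aside, the normalization step in this construction can be sketched in code. The snippet below is an illustrative sketch only, not the paper's formal machinery: a lexicographic belief is represented as a list of probability dictionaries over opponents' strategy-type pairs, and `reaches_h` is a hypothetical predicate indicating whether a strategy combination reaches the information set h.

```python
def conditional_belief(lex_belief, reaches_h):
    """Turn a cautious lexicographic belief into a conditional belief at h.

    lex_belief: list of dicts mapping (strategy, type) pairs to probabilities,
        ordered from the most plausible level downwards.
    reaches_h: predicate on strategies, True if the strategy combination
        reaches the information set h.

    We take the first level that assigns positive probability to some pair
    whose strategy reaches h, restrict to those pairs, and renormalize.
    """
    for level in lex_belief:
        mass = sum(p for (s, t), p in level.items() if reaches_h(s))
        if mass > 0:
            return {(s, t): p / mass
                    for (s, t), p in level.items() if reaches_h(s)}
    # For a cautious type, every opponents' strategy receives positive
    # probability at some level, so this point is never reached.
    raise ValueError("no level of the lexicographic belief reaches h")
```

For a cautious type every opponents' strategy appears at some level, so the construction is well defined; this is exactly why cautiousness is assumed before inducing the dynamic-game model.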

Lemma 1 Let M be an epistemic model of the normal form in which all types are cautious, h ∈ H_i, h′ an information set that weakly follows or weakly precedes h, and t_i a type for player i in M. If a strategy s_i ∈ S_i(h) is not optimal for the induced type f_i(t_i) at h among the strategies in S_i(h) ∩ S_i(h′), then there is a strategy ŝ_i ∈ S_i(h) ∩ S_i(h′) such that t_i prefers ŝ_i to s_i.
The optimality implication described above will be very useful for showing the relations between the rationalizability concepts that we are studying.

Table 3 An epistemic model for the normal form

Table 4 The epistemic model of the dynamic game induced by Table 3

The next step is to show that respect of preferences in the normal form of the game implies belief in future and restricted past rationality.

Lemma 2 If t_i respects player j's preferences, then f_i(t_i) believes in j's future and restricted past rationality, and f_i(t_i) satisfies Bayesian updating.
Moreover, proper rationalizability in the normal form implies common belief in future and restricted past rationality in the dynamic game.

Lemma 3 If t i is properly rationalizable, then f i (t i ) expresses common belief in future and restricted past rationality and common belief in Bayesian updating.
Since for every normal form game there exists at least one properly rationalizable type for every player (cf. Asheim 2001; Perea 2012), Lemma 3 implies the following result.

Corollary 1 For every dynamic game G there exists for every player i an epistemic model M and a type t i in it that expresses common belief in future and restricted past rationality and common belief in Bayesian updating.
Taken together, Lemmas 1 and 3 imply Theorem 1. Therefore, if we transform a dynamic game into its normal form and find an epistemic model in which the types are properly rationalizable, we can construct an induced epistemic model for the dynamic game in which the types express common belief in future and restricted past rationality and common belief in Bayesian updating. Moreover, by Theorem 1, the strategies that can be chosen under proper rationalizability can also be chosen under common belief in future and restricted past rationality and common belief in Bayesian updating.
We can check that the epistemic model in Table 2 is induced by the epistemic model in Table 1 via the transformation described before, and we have seen that all types in Table 1 are properly rationalizable. Since strategy c is optimal for type t 1 and strategy d is optimal for type t 2 , both strategies can rationally be chosen under common belief in future and restricted past rationality and common belief in Bayesian updating according to Theorem 1.
As we can see, at information sets ∅ and h 2 , type t 1 of player 1 believes type t 2 of player 2 will be and has been rational. However, if the game reaches information set h 1 , then this means that player 1 was not rational before. Nevertheless, player 2 believes that if h 1 was reached, then player 1 is choosing optimally among strategies that lead to h 1 . Therefore, type t 2 believes that player 1 will choose (a, f). Hence, player 2 can only rationally choose d under common belief in future and restricted past rationality and common belief in Bayesian updating.
Under common strong belief in rationality, if player 2 sees that h 1 has been reached, then, if possible, he must believe that player 1 made a choice that is rational at ∅ . But choosing c at ∅ gives the highest utility for player 1, so it is not possible for player 2 to believe that player 1 made a rational choice under common strong belief in rationality. Therefore, player 2 can believe player 1 chose any strategy that leads to h 1 , so both d and e can rationally be chosen at h 1 under common strong belief in rationality.
Under common belief in future rationality, if player 2 sees that h 1 was reached, then he may believe that player 1 chose irrationally at ∅ , but he must believe that from now on, player 1 will choose rationally. Therefore, player 2 can believe player 1 chose a or b at ∅ , so both d and e can rationally be chosen under common belief in future rationality.

Algorithm
In this section, whenever we say common belief in future and restricted past rationality, we actually mean common belief in future and restricted past rationality and common belief in Bayesian updating. Hence, we always assume common belief in Bayesian updating.
In order to find the strategies that can rationally be chosen under common belief in future and restricted past rationality, we propose an algorithm based on the backward dominance procedure in Perea (2014). Then we show that the strategies that survive the algorithm are exactly those strategies that can be chosen under common belief in future and restricted past rationality.
As can be seen from the proof in Sect. 9, the algorithm also characterizes those strategies that can be chosen under common belief in future and restricted past rationality without requiring (common belief in) Bayesian updating. Hence, for the strategies that can rationally be chosen it is not relevant whether we require Bayesian updating or not.

Definition 15 (Strict dominance by a randomization) Let h ∈ H_i be an information set for player i, let D_i ⊆ S_i(h) be a set of strategies for player i, and let D_{-i} ⊆ S_{-i}(h) be a set of strategy combinations for i's opponents. A strategy s_i ∈ D_i is strictly dominated on D_{-i} by a randomization on D_i if there is a probability distribution μ_i on D_i such that u_i(μ_i, s_{-i}) > u_i(s_i, s_{-i}) for every s_{-i} ∈ D_{-i}.
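To make the dominance notion concrete, here is a small, self-contained sketch. To keep the code elementary it only checks domination by a pure strategy or by a mixture of two strategies; the general test over arbitrary mixtures requires a linear program. All names (`u`, `dominated_by_randomization`) are illustrative, not notation from the paper.

```python
from itertools import combinations

def dominated_by_randomization(u, s, pool, opp_profiles, eps=1e-9):
    """Return True if strategy s is strictly dominated on opp_profiles by a
    pure strategy in pool or by a mixture of two strategies in pool.
    u(s_i, s_minus_i) is the payoff function. Restricting to two-strategy
    mixtures is a simplification for illustration; in general one solves a
    linear program over all mixtures on pool."""
    # Domination by a single pure strategy.
    for a in pool:
        if a != s and all(u(a, so) > u(s, so) for so in opp_profiles):
            return True
    # Domination by a mixture p*a + (1-p)*b: each opponent profile yields a
    # linear inequality in p; s is dominated iff the inequalities admit a
    # common solution p in [0, 1].
    for a, b in combinations(pool, 2):
        lo, hi = 0.0, 1.0
        feasible = True
        for so in opp_profiles:
            coef = u(a, so) - u(b, so)   # need p*coef > u(s,so) - u(b,so)
            rhs = u(s, so) - u(b, so)
            if abs(coef) < eps:
                if rhs >= -eps:          # inequality fails for every p
                    feasible = False
                    break
            elif coef > 0:
                lo = max(lo, rhs / coef + eps)
            else:
                hi = min(hi, rhs / coef - eps)
        if feasible and lo <= hi:
            return True
    return False
```

For instance, a "hedging" strategy with a constant payoff of 1 is strictly dominated by an equal mixture of two strategies paying 3 and 0 against opposite opponent moves, even though neither pure strategy dominates it.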
Algorithm 1 Set S_i^0(h) = S_i(h) and Ŝ_{-i}^0(h) = S_{-i}(h) for all i ∈ I and all h ∈ H_i. For every k ≥ 1 we have:

Step k: For every player i and every information set h ∈ H_i, we define

S_i^k(h) = {s_i ∈ S_i^{k-1}(h) : s_i is not strictly dominated on Ŝ_{-i}^{k-1}(h′) by a randomization on S_i(h′) for every h′ ∈ H_i(s_i) weakly following h, and s_i is not strictly dominated on Ŝ_{-i}^{k-1}(h′′) by a randomization on S_i(h) ∩ S_i(h′′) for every h′′ ∈ H_i(s_i) weakly preceding h},

Ŝ_{-i}^k(h) = {(s_j)_{j≠i} ∈ Ŝ_{-i}^{k-1}(h) : for every j ≠ i, s_j is not strictly dominated on Ŝ_{-j}^{k-1}(h′) by a randomization on S_j(h′) for every h′ ∈ H_j(s_j) weakly following h, and s_j is not strictly dominated on Ŝ_{-j}^{k-1}(h′′) by a randomization on S_j(h) ∩ S_j(h′′) for every h′′ ∈ H_j(s_j) weakly preceding h}.

The algorithm ends after K steps if S_i^{K+1}(h) = S_i^K(h) and Ŝ_{-i}^{K+1}(h) = Ŝ_{-i}^K(h) for every i ∈ I and every h ∈ H_i. We now have the following result, showing that the algorithm identifies the strategies that can be chosen under k-fold belief in future and restricted past rationality, and those that can be chosen under common belief in future and restricted past rationality.

Theorem 2 For every k ≥ 1, the strategies that can rationally be chosen by a type that expresses up to k-fold belief in future and restricted past rationality and up to k-fold belief in Bayesian updating are exactly the strategies s_i such that s_i ∈ S_i^k(h) for all h ∈ H_i(s_i).
The strategies that can rationally be chosen by a type that expresses common belief in future and restricted past rationality and common belief in Bayesian updating are exactly the strategies that survive the full algorithm, that is, the strategies s i such that s i ∈ S k i (h) for all k ≥ 1 and all h ∈ H i (s i ).
To illustrate the algorithm, we use the game from Fig. 1. We have H_1 = {∅, h_2} and H_2 = {h_1}, and we start from the initial sets of strategies. After the first step is applied, we obtain the reduced decision problems. Observe that at ∅, b is strictly dominated by (a, f) ∈ S_1^0(h_1) ∩ S_1^0(∅). We also have that at h_2, (a, g) is strictly dominated by (a, f) ∈ S_1^0(h_2). Therefore the only strategy that remains in Ŝ_{-2}^1(h_1) is (a, f). We then apply the second iteration of the algorithm.
We see that at h 1 , e is strictly dominated on Ŝ 1 −2 (h 1 ) by d, so the only strategy in Ŝ 2 −1 (∅) and S 2 2 (h 1 ) is d. Since all the sets are singletons, the algorithm stops. Therefore the surviving strategies are c for player 1 and d for player 2, which are exactly the strategies that we found in Sect. 5 as those that can be chosen under common belief in future and restricted past rationality.
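Structurally, Algorithm 1 is an iterated elimination procedure that stops at a fixed point. Abstracting away the per-information-set bookkeeping, the outer loop can be sketched as follows; `dominated` stands for any survival test of the kind used in the algorithm and is a placeholder, not the paper's notation.

```python
def iterate_elimination(strategies, dominated):
    """Repeatedly remove strategies flagged by `dominated` relative to the
    current surviving set, until one pass removes nothing.
    dominated(s, pool) should return True if s fails the survival test
    against pool. Termination is guaranteed for finite strategy sets, since
    every pass either strictly shrinks the set or stops."""
    current = set(strategies)
    while True:
        survivors = {s for s in current if not dominated(s, current)}
        if survivors == current:
            return current
        current = survivors
```

In the game of Fig. 1 this loop stabilizes after two passes, matching the computation above.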

Concluding remarks
A new reasoning concept for dynamic games was introduced, which not only assumes rationality of the opponents in the future, but also assumes that players reason about what happened in the past in the following way: if the game reaches an information set, players should consider only those opponents' strategies that actually reach that information set, and believe that the opponent has chosen rationally in the past among that restricted set of strategies. In this way, players reason at every information set about the past, but only about a restricted part of it. We have also shown that common belief in future and restricted past rationality can be obtained from proper rationalizability in the normal form of the dynamic game, connecting the two concepts. Additionally, we defined a procedure that starts from the decision problems in the dynamic game and, using strict dominance, selects exactly the strategies that can be chosen under common belief in future and restricted past rationality.
An interesting continuation would be to study the robustness of the concept presented here to inessential transformations of the dynamic game, as defined in Thompson (1952) and Kohlberg and Mertens (1986). As a first indication, in Example 1 it is possible to transform the game by first allowing player 1 to choose between "c" and "not c". If player 1 chooses not c, then he can subsequently choose between a and b. It is even possible to switch the order in which decisions are taken after not c, as in Fig. 6, and in spite of that we obtain the same prediction for the game under common belief in future and restricted past rationality, whereas concepts such as common belief in future rationality fail to remain invariant under these transformations.
We can see that in the original game player 1 can choose c, whereas player 2 can choose both d and e under common belief in future rationality. However, in the modified game player 1 can choose c, while player 2 can only choose d under common belief in future rationality, since player 2 must believe at h_1 that player 1 will rationally choose a and f in the future. Under common belief in future and restricted past rationality, in the modified game we also get that player 1 must choose c and player 2 must choose d. Although further analysis would be required, this example suggests that common belief in future and restricted past rationality is more robust to inessential transformations than common belief in future rationality. Future research could also include the application of this concept to other classes of games, such as infinite games, repeated games and stochastic games, as well as finding, for each class, an algorithm that identifies the choices that can be made under common belief in future and restricted past rationality.
Another problem that could be investigated in future work is whether we can find an equilibrium analogue to common belief in future and restricted past rationality, and how it would relate to existing equilibrium concepts for dynamic games. Such a search could be based on Perea and Predtetchinski (2019), who have shown that for two-player stochastic dynamic games with perfect information, subgame perfect equilibrium is equivalent to common belief in future rationality with a correct beliefs assumption. Since players have perfect information there, the addition of restricted past rationality does not affect the result, so a natural extension would be to study the case of dynamic games with imperfect information. Chen and Micali (2013) and Perea (2017) have shown that for finite dynamic games, the outcomes obtained under common strong belief in rationality are also reachable under common belief in future rationality, which makes common strong belief in rationality the more restrictive concept in terms of outcomes. It would be interesting to study the relation, in terms of outcomes, between common strong belief in rationality and common belief in future and restricted past rationality.
To show that ŝ i ∈ S i (h � ) we distinguish two cases: whether h ′ weakly precedes h or h ′ weakly follows h.
If h′ weakly precedes h, then at every h′′ ∈ H(s′_i, s_{-i}) weakly following h and weakly followed by h′ we have by definition ŝ_i(h′′) = s′_i(h′′), and at every h′′ ∈ H(s′_i, s_{-i}) such that h follows h′′ we know that ŝ_i(h′′) = s_i(h′′). But by perfect recall of player i, there exists a unique choice c*_i(h′′) at the information set h′′ such that h can be reached. Since both s_i, s′_i ∈ S_i(h), both strategies must choose c*_i(h′′). Therefore ŝ_i chooses c*_i(h′′) at every h′′ ∈ H(s′_i, s_{-i}) such that h weakly follows h′′. Since we have seen that ŝ_i(h′′) = s′_i(h′′) for all h′′ ∈ H(s′_i, s_{-i}) weakly following h and weakly preceding h′, the strategy combination (ŝ_i, s_{-i}) reaches h′, and ŝ_i ∈ S_i(h′).
By the two results above, letting (b_i^1(t_i), b_i^2(t_i), …) be the cautious lexicographic belief for type t_i, let k be the smallest number such that b_i^ℓ(t_i) assigns probability zero to the opponents' strategy combinations reaching h for all ℓ < k. Moreover, (1a) and (1b) have been used in the third equality, and the inequality is obtained using (*) and the fact that b_i^k is the first level that reaches h. Hence we have the result we wanted to prove. ◻

Proof (Lemma 2) First we prove that respect of preferences implies belief in future rationality.
Let h ∈ H_i. Suppose f_i(t_i) does not believe at h in player j's future rationality. Then f_i(t_i) assigns positive probability at h to some pair (s_j, f_j(t_j)) such that s_j ∈ S_j(h′) is a suboptimal strategy for f_j(t_j) at some h′ ∈ H_j(s_j) that weakly follows h.
By Lemma 1 there exists ŝ_j ∈ S_j(h) ∩ S_j(h′) such that t_j prefers ŝ_j to s_j. By hypothesis, t_i respects j's preferences, so it must deem (ŝ_j, t_j) infinitely more likely than (s_j, t_j). Hence, there is some k such that b_i^k(t_i)(ŝ_j, t_j) > 0 and b_i^m(t_i)(s_j, t_j) = 0 for all m ≤ k. Since ŝ_j ∈ S_j(h), the construction of the conditional belief at h then implies that f_i(t_i) assigns probability zero to (s_j, f_j(t_j)) at h. But this is a contradiction. Therefore, f_i(t_i) believes at h in player j's future rationality for all h ∈ H_i. We now prove, with a similar argument, that respect of preferences implies belief in restricted past rationality.
Let h ∈ H_i. Suppose f_i(t_i) does not believe at h in player j's restricted past rationality. Then f_i(t_i) assigns positive probability at h to some pair (s_j, f_j(t_j)) such that s_j ∈ S_j(h) ∩ S_j(h′′) is a suboptimal strategy for f_j(t_j) among the strategies in S_j(h) ∩ S_j(h′′) at some h′′ ∈ H_j(s_j) that weakly precedes h. By Lemma 1 there exists ŝ_j ∈ S_j(h) ∩ S_j(h′′) such that t_j prefers ŝ_j to s_j. By hypothesis, t_i respects j's preferences, so it must deem (ŝ_j, t_j) infinitely more likely than (s_j, t_j). Since ŝ_j ∈ S_j(h), the construction of the conditional belief at h then implies, by an argument analogous to the one above, that f_i(t_i) assigns probability zero to (s_j, f_j(t_j)) at h, which is a contradiction. Therefore f_i(t_i) believes at h in player j's restricted past rationality. Finally, by construction, f_i(t_i) satisfies Bayesian updating. ◻

We define the set T*(t_i) as the set of types in t_i's belief hierarchy in the normal form; that is, T*(t_i) is the smallest set with the property that t_i ∈ T*(t_i), and for every t_j ∈ T*(t_i), if t_j deems possible t_k, then t_k ∈ T*(t_i).
Similarly, we define T̂*(t̂_i) as the set of types in t̂_i's belief hierarchy in the dynamic game. More precisely, T̂*(t̂_i) is the smallest set such that t̂_i ∈ T̂*(t̂_i) and, for every t̂_j ∈ T̂*(t̂_i), if β_j(t̂_j, h)(s_k, t̂_k) > 0 for some h ∈ H_j, then t̂_k ∈ T̂*(t̂_i).
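Computationally, these hierarchy sets are just the types reachable from the initial type in the directed graph whose edges point from a type to the types it deems possible. A minimal sketch, assuming `deems_possible(t)` is a hypothetical accessor (not notation from the paper) returning that set of types:

```python
def hierarchy_types(t0, deems_possible):
    """Compute the closure T*(t0): the smallest set containing t0 and closed
    under the deems-possible relation. This is standard graph reachability
    via depth-first search."""
    seen = {t0}
    frontier = [t0]
    while frontier:
        t = frontier.pop()
        for t2 in deems_possible(t):
            if t2 not in seen:
                seen.add(t2)
                frontier.append(t2)
    return seen
```

The same routine serves for both the normal-form and the dynamic-game hierarchies, since only the deems-possible relation differs between the two.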
Proof (Lemma 3) Let t i ∈ T i and construct the set T * (t i ) . Since t i is properly rationalizable, every type in T * (t i ) is cautious and respects the opponents' preferences.
By construction, every type in T * (t i ) induces a type in T * (f i (t i )) . It then follows, by Lemma 2, that all types in T * (f i (t i )) believe in the opponents' future and restricted past rationality and believe the opponents satisfy Bayesian updating.
Since all of the types in T̂*(f_i(t_i)) refer only to types in T̂*(f_i(t_i)), by definition they all express common belief in future and restricted past rationality and common belief in Bayesian updating.
Hence, in particular, f_i(t_i) expresses common belief in future and restricted past rationality and common belief in Bayesian updating. ◻

Proof (Theorem 1) Since s_i is properly rationalizable, there is a properly rationalizable type t_i such that s_i is optimal for t_i. By Lemma 3, f_i(t_i) expresses common belief in future and restricted past rationality and common belief in Bayesian updating. We now show that s_i is also optimal for type f_i(t_i) at every information set h ∈ H_i(s_i).
Suppose that s i is suboptimal for f i (t i ) at information set h. By Lemma 1, choosing h � = h , there is a strategy ŝ i ∈ S i (h) such that t i prefers ŝ i to s i . Then s i is not an optimal strategy for t i , which is a contradiction. ◻

Proofs for Section 6
Before we prove Theorem 2 we require some auxiliary results, and the construction of an epistemic model according to the algorithm, which will have the desired properties.
We state the following result, first proved in Pearce (1984) for games with two players. A general proof can be found in Perea (2012).

Theorem 3 (Pearce's lemma) Consider a reduced decision problem with strategy set D_i for player i and set D_{-i} of opponents' strategy combinations. A strategy s_i ∈ D_i is optimal for some probabilistic belief on D_{-i} if and only if s_i is not strictly dominated on D_{-i} by a randomization on D_i.
Let B_{-i}^k(h) be the set of opponents' strategy combinations (s_j)_{j≠i} ∈ S_{-i}(h) such that there is some type t_i expressing up to k-fold belief in future and restricted past rationality that at h assigns positive probability to (s_j)_{j≠i}.

Lemma 4 For every player i ∈ I, every information set h ∈ H_i and every k ≥ 1, we have B_{-i}^k(h) ⊆ Ŝ_{-i}^k(h).
Proof We prove this statement by induction on k.
Let k = 1. Consider a player i ∈ I, an information set h ∈ H_i, and let s_{-i} ∈ B_{-i}^1(h). Then there is a type t_i expressing up to 1-fold belief in future and restricted past rationality such that t_i assigns positive probability to s_{-i} at h. Now consider an opponent j ≠ i. Since t_i believes in j's future and restricted past rationality, for every h′ ∈ H_j(s_j) weakly following h we can find a conditional belief β_j(t_j, h′) for which s_j is optimal among the strategies in S_j(h′), and for every h′′ ∈ H_j(s_j) weakly preceding h we can find a conditional belief β_j(t_j, h′′) for which s_j is optimal among the strategies in S_j(h) ∩ S_j(h′′).
Then, by Pearce's lemma, for every h′ ∈ H_j(s_j) weakly following h, s_j is not strictly dominated on Ŝ_{-j}^0(h′) by a randomization on S_j(h′), and for every h′′ ∈ H_j(s_j) weakly preceding h, s_j is not strictly dominated on Ŝ_{-j}^0(h′′) by a randomization on S_j(h) ∩ S_j(h′′). Hence s_{-i} ∈ Ŝ_{-i}^1(h), and this is true for all players i ∈ I and every information set h ∈ H_i.

Now we proceed with the induction step. Fix k ≥ 2 and assume that for every player i ∈ I and every information set h ∈ H_i we have B_{-i}^{k-1}(h) ⊆ Ŝ_{-i}^{k-1}(h).

For the inequality, we have used the assumption that β_i(h′)(S_{-i}(h′′)) > 0. However, this would mean that u_i(s′′_i, β_i(h′)) > u_i(s_i, β_i(h′)). As s′′_i ∈ S_i(h′) ∩ S_i(h), this contradicts our assumption that s_i is optimal for β_i(h′) among strategies in S_i(h′) ∩ S_i(h). Hence, s_i must be optimal for β_i(h′′) among strategies in S_i(h′′) ∩ S_i(h). This completes the proof. ◻

Then, for every player j ≠ i, we have for every information set h′′ ∈ H_j(s_j) weakly following h′ that s_j is not strictly dominated on Ŝ_{-j}^{k-1}(h′′) by a randomization on S_j(h′′), and for every information set h′′′ ∈ H_j(s_j) weakly preceding h′ that s_j is not strictly dominated on Ŝ_{-j}^{k-1}(h′′′) by a randomization on S_j(h′) ∩ S_j(h′′′).
Take an information set h �� ∈ H j (s j ) that weakly follows h. Then h ′′ weakly follows h ′ , and we know from above that s j is not strictly dominated on Ŝ k−1 −j (h �� ) by a randomization on S j (h �� ).
Now take an information set h ��� ∈ H j (s j ) that weakly precedes h. Then either h ′′′ weakly precedes h ′ , or h ′′′ weakly follows h ′ .
If h′′′ weakly precedes h′, then we know from above that s_j is not strictly dominated on Ŝ_{-j}^{k-1}(h′′′) by a randomization on S_j(h′) ∩ S_j(h′′′), and hence not by a randomization on the smaller set S_j(h) ∩ S_j(h′′′). On the other hand, if h′′′ weakly follows h′, then we know from above that s_j is not strictly dominated on Ŝ_{-j}^{k-1}(h′′′) by a randomization on S_j(h′′′). Hence, in particular, s_j is not strictly dominated on Ŝ_{-j}^{k-1}(h′′′) by a randomization on S_j(h) ∩ S_j(h′′′).

Construction of the epistemic model
We start with the construction of beliefs for the model. For i ∈ I take an information set h ∈ H and let D^k be defined accordingly, and otherwise as in (4).

Here, the first and the last equality follow from the fact that t_j = t_j^{s_j,h′′} for every j ≠ i, the second equality from (3) applied to h′′, the third equality from (4), and the fourth equality from (3).