1 Introduction

This paper provides a full characterization of the limit set, as the time horizon increases, of the set of pure strategy subgame perfect Nash equilibrium payoff vectors of any finitely repeated game. The obtained characterization is in terms of appropriate notions of feasible and individually rational payoff vectors of the stage-game. These notions are based on Smith’s (1995) notion of Nash decomposition and appropriately generalize the classic notion of feasible payoff vectors as well as the notion of effective minimax payoff defined by Wen (1994). The main theorem nests earlier results of Benoit and Krishna (1985), Smith (1995), and Demeze-Jouatsa and Wilson (2019).

Whether non-Nash outcomes of the stage-game can be sustained via subgame perfect Nash equilibria of the finitely repeated game depends on whether players can be incentivized to abandon their short-term interests and to follow collusive paths that have greater long-run average payoffs. There are two extreme cases. On the one hand, in any finite repetition of a stage-game that has a unique Nash equilibrium payoff vector, such as the prisoners’ dilemma, only the stage-game Nash equilibrium payoff vector is sustainable by subgame perfect Nash equilibria of finite repetitions of that stage-game. On the other hand, for stage-games in which each player receives distinct payoffs across Nash equilibria, such as the battle of the sexes, the limit perfect folk theorem holds: any feasible and individually rational payoff vector of the stage-game is achievable as the limit payoff vector of a sequence of subgame perfect Nash equilibria of the finitely repeated game as the time horizon goes to infinity.

Benoit and Krishna (1985) established that for the limit perfect folk theorem to hold, it is sufficient that the dimension of the set of feasible payoff vectors of the stage-game equals the number of players and that each player receives distinct payoffs at Nash equilibria of the stage-game.Footnote 1 Smith (1995) provided a weaker, necessary and sufficient condition for the limit perfect folk theorem to hold. Smith (1995) showed that it is necessary and sufficient that the Nash decomposition of the stage-game is complete.Footnote 2 The distinct Nash payoffs condition and the full dimensionality of the set of feasible payoff vectors as in Benoit and Krishna (1985) or the complete Nash decomposition of Smith (1995) allow us to construct credible punishment schemes and to (recursively) leverage the behavior of any player near the end of the finitely repeated game. These are essential to generate a limit perfect folk theorem. In the case that the stage-game admits a unique Nash equilibrium payoff vector, Benoit and Krishna (1985) demonstrated that the set of subgame perfect Nash equilibrium payoff vectors of the finitely repeated game is reduced to the unique stage-game Nash equilibrium payoff vector.

A part of the puzzle remains unresolved. Namely, for stage-games that do not admit a complete Nash decomposition, what is the exact range of payoff vectors that are achievable as the limit payoff vector of a sequence of subgame perfect Nash equilibria of finite repetitions of that stage-game?Footnote 3

If the stage-game has an incomplete Nash decomposition, then the set of players naturally splits into two blocks, where the first block contains all players whose behavior can recursively be leveraged near the end of the finitely repeated game (see Footnote 2 for details). In contrast, it is not possible to control the short-run incentives of players in the second block. Therefore, each player of the second block has to play a stage-game pure best response at any profile that occurs on a pure strategy subgame perfect Nash equilibrium play path. The stage-game action profiles eligible for pure strategy subgame perfect Nash equilibrium play paths of the finitely repeated game are therefore exactly the pure Nash equilibria of what one could call the effective one-shot game: the game obtained from the initial stage-game by setting the utility function of each player of the first block equal to a constant.

This restriction of the set of eligible actions for pure strategy subgame perfect Nash equilibrium play paths has two main implications. Firstly, for a feasible payoff vector to be approachable via pure strategy subgame perfect Nash equilibria of the finitely repeated game, it has to lie in the convex hull of the set of payoff vectors to action profiles that are Nash equilibria of the effective one-shot game. I call a payoff vector enforceable if it belongs to this convex hull. Secondly, as subgame perfect Nash equilibria are protected against unilateral deviations even off the equilibrium path, any player of the second block has to be at her best response at any action profile occurring on a credible punishment path. Therefore, only pure Nash equilibria of the effective one-shot game are eligible for credible punishment paths in any finite repetition of the original stage-game. Consequently, a player of the first block can secure a payoff that may be strictly greater than her effective minimax payoff. I call this new reservation payoff the enforceable minimax payoff.

The main finding of this paper says that, as the time horizon increases, the set of payoff vectors to pure strategy subgame perfect Nash equilibria of the finitely repeated game converges to the set of enforceable payoff vectors that dominate the enforceable minimax payoff vector.

The paper proceeds as follows. In Sect. 2 I introduce the model and the definitions. Section 3 states the main finding of the paper and sketches the proof, and Sect. 4 concludes the paper. Proofs are provided in the Appendix.

2 The model

2.1 The stage-game

Let \(G=(N,A=\times _{i \in N}A_{i},u=(u_{i})_{i\in N})\) be a stage-game where the set of players \(N=\{1,\ldots ,n\}\) is finite and where for each player \(i \in N\) the set \(A_{i}\) of actions of player i is compact. Given a player \(i\in N\) and an action profile \(a=(a_{1},\ldots ,a_{n})\in A\), let \(u_{i}(a)\) denote the stage-game utility of player i given the action profile a. Given an action profile \(a\in A\), a player \(i\in N\), and an action \(a_{i}^{\prime }\in A_{i}\) of player i, let \((a_{i}^{\prime },a_{-i})\) denote the action profile in which all players except player i choose the same action as in a, while player i chooses \(a_{i}^{\prime }\). A stage-game pure best response of player i to the action profile a is an action \(b_{i}(a)\in A_{i}\) that maximizes the stage-game payoff of player i given that the choices of the other players are given by \(a_{-i}\). An action profile \(a\in A\) is a pure Nash equilibrium of the stage-game G (denoted by \(a\in {\text {Nash}}(G)\)) if \(u_{i}(a_{i}^{\prime },a_{-i})\le u_{i}(a)\) for every player \(i\in N\) and every action \(a_{i}^{\prime }\in A_{i}\).
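
For a finite stage-game, these definitions can be checked mechanically. The following sketch is my own illustration (not from the paper): it stores one payoff array per player, indexed by the action profile, and enumerates the pure Nash equilibria.

```python
import itertools
import numpy as np

def pure_nash(payoffs):
    """All profiles a with u_i(a'_i, a_-i) <= u_i(a) for every player i, action a'_i."""
    shape = payoffs[0].shape
    equilibria = []
    for a in itertools.product(*map(range, shape)):
        # check that no player has a profitable unilateral deviation at a
        if all(payoffs[i][a] >= payoffs[i][a[:i] + (ai,) + a[i + 1:]]
               for i in range(len(payoffs)) for ai in range(shape[i])):
            equilibria.append(a)
    return equilibria

# Prisoners' dilemma: action 0 = cooperate, 1 = defect.
u1 = np.array([[3, 0], [4, 1]])
u2 = u1.T
print(pure_nash([u1, u2]))  # [(1, 1)]: mutual defection is the unique pure equilibrium
```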

Each stage-game considered in this paper is compact in the sense that each \(A_i\) is compact and u is continuous. A stage-game could, for instance, be finite, the mixed extension of a finite stage-game, or a game with a continuum of actions for some players.

Let \(\gamma \) be a real number that is strictly greater than any payoff a player might receive in the stage-game G.Footnote 4 A player is said to have distinct pure Nash payoffs in the stage-game if there exist two pure Nash equilibria of the stage-game in which this player receives different payoffs. Let \(\tau (G)=(N,A,(u'_i)_{i \in N}) \) be the normal form game where the utility function of player i is defined by

$$\begin{aligned} u_{i}^{\prime }=\left\{ \begin{array}{ll} \gamma &{} \text {if }i\text { has distinct Nash payoffs in }G\\ u_{i} &{} \text {otherwise}. \end{array} \right. \end{aligned}$$

Let \(G^{0}:=G\) and \(G^{l+1}:=\tau (G^{l})\) for all \(l\ge 0\). For all \(l\ge 0,\) let \(N_{l}\) be the set of players whose utility function is constant and equal to \(\gamma \) in the game \(G^{l}\). As N is finite, there is a smallest integer \(h\ge 0\) such that \(N_{l+1}=N_{l}\) for all \(l \ge h\). Let \({\widetilde{A}}={\text {Nash}}(G^{h})\) be the set of pure Nash equilibria of the game \(G^h\). We call \({\widetilde{A}}\) the enforceable action set. The set of enforceable payoff vectors of the game G is defined as the convex hull \(Conv[u({\widetilde{A}})]\) of the set \(u( {\widetilde{A}})=\{u(a) \mid a \in {\widetilde{A}}\}\). The sequence \(\emptyset \varsubsetneq N_1 \varsubsetneq \cdots \varsubsetneq N_h\) is the Nash decomposition of the game G, and the Nash decomposition is complete if \(N_h=N\).Footnote 5
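
For a finite game, the iteration \(G^0, G^1=\tau(G^0), \ldots\) can be carried out directly. The code below is a hedged sketch (function names are my own): it repeatedly replaces the utility of every player with distinct pure Nash payoffs by the constant \(\gamma\), returning the Nash decomposition and the enforceable action set \({\widetilde{A}}={\text{Nash}}(G^h)\).

```python
import itertools
import numpy as np

def pure_nash(payoffs):
    shape = payoffs[0].shape
    return [a for a in itertools.product(*map(range, shape))
            if all(payoffs[i][a] >= payoffs[i][a[:i] + (ai,) + a[i + 1:]]
                   for i in range(len(payoffs)) for ai in range(shape[i]))]

def nash_decomposition(payoffs):
    """Return ([N_1, ..., N_h], Nash(G^h)) for a finite game."""
    n = len(payoffs)
    gamma = max(float(p.max()) for p in payoffs) + 1.0  # strictly above all stage payoffs
    current = [p.astype(float) for p in payoffs]        # working copy of G^l
    leveraged = set()   # players whose utility has been replaced by the constant gamma
    blocks = []
    while True:
        eq = pure_nash(current)
        new = {i for i in range(n) if i not in leveraged
               and len({float(current[i][a]) for a in eq}) > 1}  # distinct Nash payoffs
        if not new:
            return blocks, eq  # eq is the enforceable action set
        leveraged |= new
        blocks.append(set(leveraged))
        for i in new:
            current[i] = np.full_like(current[i], gamma)

# Battle of the sexes: both players have distinct pure Nash payoffs, so the
# Nash decomposition is complete in one step and every profile is enforceable.
u1 = np.array([[2, 0], [0, 1]])
u2 = np.array([[1, 0], [0, 2]])
blocks, enforceable = nash_decomposition([u1, u2])
print(blocks, len(enforceable))  # [{0, 1}] 4
```

For the prisoners’ dilemma the decomposition is empty and the enforceable action set reduces to the unique stage-game Nash equilibrium, matching the two extreme cases discussed in the introduction.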

Let \(\sim \) be the equivalence relation defined on the set of players as follows: player i is equivalent to j (denoted by \(i\sim j\)) if there exist \(\alpha _{ij}>0\) and \(\beta _{ij} \in {\mathbb {R}}\) such that for all \(a\in {\widetilde{A}}\), we have \(u_{i}(a)=\alpha _{ij}\cdot u_{j}(a)+\beta _{ij}\). For all \(i\in N\), let \({\mathcal {J}}(i)\) be the equivalence class of player i and let

$$\begin{aligned} {\widetilde{\mu }}_{i}=\min _{a\in {\widetilde{A}}}\max _{j\in {\mathcal {J}}(i)}\max _{a_{j}^{\prime }\in A_{j}}\left[ \alpha _{ij}\cdot u_{j}(a_{j}^{\prime },a_{-j})+\beta _{ij}\right] \text { and } {\widetilde{\mu }}=({\widetilde{\mu }}_{1},\ldots ,\widetilde{\mu }_{n}). \end{aligned}$$

The payoff \({\widetilde{\mu }}_{i}\) is the enforceable minimax of player i in the stage-game G.Footnote 6
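
For intuition, the formula simplifies when every equivalence class \({\mathcal {J}}(i)\) is the singleton \(\{i\}\) (so \(\alpha _{ij}=1\) and \(\beta _{ij}=0\)): then \({\widetilde{\mu }}_{i}=\min _{a\in {\widetilde{A}}}\max _{a_{i}^{\prime }\in A_{i}}u_{i}(a_{i}^{\prime },a_{-i})\). The sketch below makes that simplifying assumption and is not the paper's general formula.

```python
import numpy as np

def enforceable_minimax(payoffs, enforceable):
    """mu_i = min over enforceable profiles a of max over a'_i of u_i(a'_i, a_-i)."""
    mu = []
    for i in range(len(payoffs)):
        mu.append(int(min(
            max(payoffs[i][a[:i] + (ai,) + a[i + 1:]]
                for ai in range(payoffs[i].shape[i]))   # best response of i to a_-i
            for a in enforceable)))                      # worst enforceable punishment
    return mu

# Prisoners' dilemma: the only enforceable profile is mutual defection (1, 1),
# so each player's enforceable minimax is her payoff there.
u1 = np.array([[3, 0], [4, 1]])
u2 = u1.T
print(enforceable_minimax([u1, u2], [(1, 1)]))  # [1, 1]
```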

Call a payoff vector e-rational if it dominates the enforceable minimax payoff vector \({\widetilde{\mu }}\). Let \({\widetilde{I}}=\{x=(x_1,\ldots ,x_n) \in {\mathbb {R}}^n \mid x_i \ge {\widetilde{\mu }}_i \text { for all } i \in N \}\) be the set of e-rational payoff vectors.

The name “enforceable action” comes from Fudenberg et al.’s (2009) notion of enforceability, which requires an action to be incentive compatible given some set of continuation payoffs. The concepts of enforceable payoff and enforceable minimax respectively generalize the classic concepts of feasible payoff and effective minimax, viewed as the relevant benchmarks for a perfect folk theorem for finitely repeated games. If each player receives (recursively) distinct payoffs at Nash equilibria of the stage-game, then the behavior of each player can be leveraged if the game is finitely repeated. In that case, the enforceable action set \({\widetilde{A}}\) equals the whole set A of action profiles, the enforceable minimax equals the classic effective minimax, and the set of enforceable payoff vectors equals the set of feasible payoff vectors. Otherwise, the set of enforceable actions \({\widetilde{A}}\) is a proper subset of the set of pure action profiles, the set of enforceable payoff vectors is a proper subset of the classic set of feasible payoff vectors, and the enforceable minimax of a player can be strictly greater than her effective minimax.Footnote 7 Figure 1 uses the example of Footnote 3 to illustrate the differences between the newly introduced concepts and the classic ones. Only the payoffs of players 1 and 2 are displayed. In that game, pure action profiles in which player 1 chooses OM and player 2 chooses C are not enforceable. The effective minimax payoff vector equals (1, 0, 0, 0) and the enforceable minimax payoff vector equals (1, 1, 0, 0).Footnote 8

Fig. 1 Equilibrium payoff vectors of players 1 and 2

2.2 The finitely repeated game

Let G be the stage-game. Given \(T>0\), let G(T) denote the T-fold repeated game obtained by repeating the stage-game T times. A pure strategy of player i in the repeated game G(T) is a contingent plan that provides for each history the action chosen by player i given this history. That is, a strategy is a function \(\sigma _{i}:\bigcup \nolimits _{t=1}^{T}A^{t-1}\rightarrow A_{i}\) where \(A^{0}\) contains only the empty history.Footnote 9 The strategy profile \(\sigma =(\sigma _{1},\ldots ,\sigma _{n})\) of G(T) generates a play path \(\pi (\sigma )=[\pi _1({\sigma }),\ldots ,\pi _T({\sigma })]\in A^{T}\) and player \(i\in N\) receives a sequence \((u_{i}(\pi _{t}(\sigma )))_{1\le t\le T}\) of payoffs. The preferences of player \(i\in N\) among strategy profiles are represented by the average payoff \(u_{i}^{T}(\sigma )=\frac{1}{T} \sum \nolimits _{t=1}^{T}u_{i}[\pi _t(\sigma )]\). A strategy profile \(\sigma =(\sigma _{1},\ldots ,\sigma _{n})\) is a pure strategy Nash equilibrium of G(T) if \(u_{i}^{T}(\sigma _{i}^{\prime },\sigma _{-i})\le u_{i}^{T}(\sigma )\) for all \(i\in N\) and for all pure strategies \(\sigma _{i}^{\prime }\) of player i. A strategy profile \(\sigma =(\sigma _{1},\ldots ,\sigma _{n})\) is a pure strategy subgame perfect Nash equilibrium of G(T) if given any \( t\in \{1,\ldots ,T\}\) and any history \(h^{t}\in A^{t-1}\), the restriction \(\sigma _{\mid h^{t}}\) of \(\sigma \) to the history \(h^{t}\) is a Nash equilibrium of the finitely repeated game \(G(T-t+1)\).

For any \(T>0\), let E(T) be the set of pure strategy subgame perfect Nash equilibrium payoff vectors of G(T), and let E be the set to which E(T) converges in the Hausdorff distance as T goes to infinity.Footnote 10 The set E is the Hausdorff limit of the set of pure strategy subgame perfect Nash equilibrium payoff vectors of the finitely repeated game. As I show in Appendix A, the limit set E exists, is nonempty, convex, and compact.
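
For small finite games, the sets E(T) can be computed by backward induction in the spirit of Benoit and Krishna (1985): an action profile is supportable in the current period if, for each player, the one-shot deviation gain is outweighed by switching the continuation to the equilibrium of the remaining game that is worst for the deviator. The recursion below is my own sketch of this construction (in total, unnormalized payoffs), not code from the paper.

```python
import itertools
import numpy as np

def pure_nash(payoffs):
    shape = payoffs[0].shape
    return [a for a in itertools.product(*map(range, shape))
            if all(payoffs[i][a] >= payoffs[i][a[:i] + (ai,) + a[i + 1:]]
                   for i in range(len(payoffs)) for ai in range(shape[i]))]

def spne_payoffs(payoffs, T):
    """Total-payoff vectors of pure strategy SPNE of the T-fold repetition."""
    n, shape = len(payoffs), payoffs[0].shape
    V = {tuple(int(p[a]) for p in payoffs) for a in pure_nash(payoffs)}  # E(1)
    for _ in range(T - 1):
        worst = [min(v[i] for v in V) for i in range(n)]  # harshest credible punishment
        new_V = set()
        for a in itertools.product(*map(range, shape)):
            # best one-shot deviation payoff of each player at profile a
            dev = [max(payoffs[i][a[:i] + (ai,) + a[i + 1:]] for ai in range(shape[i]))
                   for i in range(n)]
            for w in V:  # on-path continuation drawn from E(T-1)
                if all(payoffs[i][a] + w[i] >= dev[i] + worst[i] for i in range(n)):
                    new_V.add(tuple(int(payoffs[i][a]) + w[i] for i in range(n)))
        V = new_V
    return V

# Prisoners' dilemma: a unique stage-game Nash payoff, so (1, 1) per period is
# the only SPNE payoff at every horizon (Benoit and Krishna 1985).
u1 = np.array([[3, 0], [4, 1]])
u2 = u1.T
print(spne_payoffs([u1, u2], 5))  # {(5, 5)}
```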

3 Main result

Theorem 1

Let G be a compact stage-game. As the time horizon increases, the set of pure strategy subgame perfect Nash equilibrium (average) payoff vectors of the finitely repeated game converges (in the Hausdorff sense) to the set of enforceable and e-rational payoff vectors.

A constructive proof of Theorem 1 is provided in the Appendix. It uses four main lemmata. Lemma 3 states that, as the time horizon increases, the set of pure strategy subgame perfect Nash equilibrium payoffs of the finitely repeated game converges to a well-defined set, \({\text {ASPNE}}(G)\), the set of payoffs that are approachable via pure strategy subgame perfect Nash equilibria. Lemmata 4 and 5 together say that the limit set of the set of pure strategy subgame perfect Nash equilibrium payoff vectors, which equals \({\text {ASPNE}}(G)\), is included in the set of enforceable and e-rational payoff vectors. Lemma 6 states that every enforceable and e-rational payoff vector belongs to \({\text {ASPNE}}(G)\). Enforceability and e-rationality are therefore necessary and sufficient conditions for a feasible payoff vector to be approachable via pure strategy subgame perfect Nash equilibria of the finitely repeated game.

Theorem 1 assumes no discounting. This assumption is without loss of generality. Indeed, as the discount factor goes to 1, the discounted average payoff converges to the average payoff. Therefore, if the average payoff of a player on a path \(\pi \) is strictly greater than her average payoff on another path \(\pi ^{\prime }\), then her discounted average payoff on \(\pi \) is strictly greater than her discounted average payoff on \(\pi ^{\prime }\), provided that the discount factor is high enough. One can also make use of the payoff continuation lemma for finitely repeated games and prove a stronger result.Footnote 11 With a fixed discount factor, one can show that the limit set of the set of pure strategy subgame perfect Nash equilibrium payoffs of the discounted finitely repeated game equals the set of enforceable and e-rational payoffs, provided that the discount factor exceeds a threshold \({\underline{\delta }}\).
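
The convergence claim is easy to check numerically. The snippet below is a simple illustration of my own: it computes the normalized discounted average \(\frac{1-\delta }{1-\delta ^{T}}\sum _{t=1}^{T}\delta ^{t-1}u_{t}\) of a payoff stream and shows that it approaches the plain average as \(\delta \rightarrow 1\), so rankings of paths by average payoff are preserved for high enough discount factors.

```python
import numpy as np

def discounted_average(path_payoffs, delta):
    """Normalized discounted average: (1-delta)/(1-delta^T) * sum_t delta^(t-1) u_t."""
    T = len(path_payoffs)
    weights = (1 - delta) / (1 - delta ** T) * delta ** np.arange(T)
    return float(weights @ path_payoffs)

path = np.array([3.0, 0.0, 4.0, 1.0])  # an arbitrary stream of stage payoffs
for delta in (0.5, 0.9, 0.9999):
    print(discounted_average(path, delta))  # approaches the plain average
print(path.mean())  # 2.0: the no-discounting average
```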

4 Conclusion

This paper analysed the set of pure strategy subgame perfect Nash equilibrium payoff vectors of finitely repeated games with complete information. The main finding is an effective folk theorem: a complete characterization of the limit set, as the time horizon increases, of the set of pure strategy subgame perfect Nash equilibrium payoff vectors of the finitely repeated game. The limit set always exists, is compact and convex, and can lie strictly between the convex hull of the set of stage-game Nash equilibrium payoff vectors and the classic set of feasible and individually rational payoff vectors. This finding exhibits the exact range of cooperative payoffs that players can achieve via subgame perfect Nash equilibria of the finitely repeated game.

A further contribution of this work is a full characterization of the optimal punishment payoff of finitely repeated games with complete information and perfect monitoring (Benoit and Krishna 1985; Gossner and Hörner 2010).

The method of this paper also applies to the Nash equilibrium case. In that case, to leverage the behaviour of a player near the end of the finitely repeated game, it is necessary and sufficient that this player either has a pure Nash equilibrium payoff that is strictly greater than her pure minimax payoff, or that there exists a recursive Nash equilibrium in which she receives a payoff that is different from her pure minimax payoff. Pure action profiles eligible for pure strategy Nash equilibrium play paths of the finitely repeated game are therefore the pure Nash equilibria of a new stage-game obtained from the original stage-game by setting the utility functions of players whose behaviour can be leveraged near the end of the finitely repeated game to a constant. I refer to the convex hull of the set of original payoffs to eligible profiles as the set of Nash-feasible payoffs. As the time horizon increases, the set of pure strategy Nash equilibrium payoff vectors of the finitely repeated game converges to the set of Nash-feasible payoff vectors that dominate the pure minimax payoff vector. This characterization of the limit set of the set of Nash equilibrium payoff vectors of the finitely repeated game nests earlier results of Benoit and Krishna (1987) and González-Díaz (2006).

One might wonder whether a similar method applies when players can employ unobservable mixed strategies (Gossner 1995), when the monitoring technology is imperfect (see Fudenberg et al. 2007 for the infinite horizon case with public monitoring), or when equilibrium strategies are protected against renegotiation (Benoit and Krishna 1993).