Markov perfect equilibria in a dynamic decision model with quasi-hyperbolic discounting

We study a discrete-time non-stationary decision model in which the preferences of the decision maker change over time and are described by quasi-hyperbolic discounting. A time-consistent optimal solution in this model corresponds to a Markov perfect equilibrium in a stochastic game with uncountable state space played by countably many short-lived players. We show that Markov perfect equilibria may be constructed using a generalized policy iteration algorithm. This method is in part inspired by the fundamental works of Mertens and Parthasarathy (in: Raghavan, Ferguson, Parthasarathy, Vrieze (eds) Stochastic games and related topics, Kluwer Academic Publishers, Dordrecht, 1991; in: Neyman, Sorin (eds) Stochastic games and applications, Academic Publishers, Dordrecht, 2003) devoted to subgame perfect equilibria in standard n-person discounted stochastic games. If the one-period utilities and transition probabilities are independent of time, we obtain new existence results on stationary Markov perfect equilibria in models with utilities unbounded from above.


Introduction
The issue of dynamic inconsistency in sequential decision models with preferences changing over time was pointed out in the seminal paper of Strotz (1956). Studies of time preference found that discount rates are much greater in the short run than in the long run. To model this phenomenon, researchers have adopted discount functions from the class of generalized hyperbolas, see Ainslie (1992) or Harris and Laibson (2001) and references therein. The discrete-time analog of quasi-hyperbolic discounting involves the functions 1, δβ, δβ^2, ..., where β ∈ (0, 1) is a long-run discount factor and δ > 0 is a short-run discount factor. Such discounting was first used by Phelps and Pollak (1968). Specifically, they noticed that a time-consistent solution may be obtained by looking for a Nash equilibrium in a certain game played by countably many short-lived players, assuming that each player can act only once. Recently, Montiel Olea and Strzalecki (2014) provided an axiomatic characterization of quasi-hyperbolic preferences.
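To make the present bias behind these weights concrete, the following minimal sketch lists the weights 1, δβ, δβ^2, ... and the ratios of consecutive weights. The function names and the parameter values β = 0.95, δ = 0.7 are illustrative assumptions and are not taken from the cited works.

```python
def quasi_hyperbolic_weights(beta, delta, horizon):
    """Weights 1, delta*beta, delta*beta**2, ... for delays 0, ..., horizon-1."""
    return [1.0 if k == 0 else delta * beta ** k for k in range(horizon)]


def one_period_discount_factors(weights):
    """Ratio of consecutive weights, i.e. the discount factor between adjacent delays."""
    return [weights[k + 1] / weights[k] for k in range(len(weights) - 1)]


# Hypothetical parameter values, chosen only for illustration.
weights = quasi_hyperbolic_weights(beta=0.95, delta=0.7, horizon=5)
factors = one_period_discount_factors(weights)
```

Between "now" and the next period the weight drops by the factor δβ, whereas between any two later adjacent periods it drops only by β. With δ < 1 the short-run discounting is therefore steeper than the long-run discounting. Much of the literature on beta-delta preferences uses the two symbols with the roles reversed.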
The existence of time-consistent solutions in models with changing tastes is a difficult problem. A similar problem also arises in altruistic economic growth models, known as intergenerational games. Initial positive results expressed in terms of intergenerational games (with a relatively simple utility function for each generation) were given by Bernheim and Ray (1983) and Leininger (1986). Peleg and Yaari (1973), on the other hand, provided some counterexamples for models with slightly less restrictive assumptions. The existence of a stationary Markov perfect equilibrium in a stochastic game with finite state and action spaces and quasi-hyperbolic discounting was first stated in Alj and Haurie (1983). Applications of quasi-hyperbolic discounting in ecological problems can be found in Haurie (2005) and Karp (2005) and the references cited therein. For additional examples (especially in macroeconomics) and a detailed discussion of the literature the reader is referred to Balbus et al. (2016) and Jaśkiewicz and Nowak (2018).
In this paper, we study a discrete-time consumption/savings model with preferences changing over time. To construct a time-consistent solution in this model, the decision maker is represented by a sequence of temporal players called "selves". The state space is S = [0, +∞) and represents the set of stocks of a renewable resource. The current self (self t) decides how to split the available resource stock into consumption and investment for future selves. The outcome of the investment (saving) is determined by some transition probability function. We assume that the stage utility functions and transition probability functions may depend on time and that self t measures his satisfaction using so-called short-run and long-run discount factors according to the formula used in quasi-hyperbolic discounting. The total utility of self t depends on his own consumption and the consumptions of all future selves. The model is actually a non-cooperative stochastic game with an uncountable state space and a denumerable set of players. This model was studied successfully by Harris and Laibson (2001) within a stationary framework. They proved the existence of a stationary Markov perfect equilibrium and derived a hyperbolic Euler relation that is useful in characterizing equilibria. The transition probability function in their model is non-atomic and has a special form with additive noise. More general non-atomic transition functions were examined in Balbus et al. (2015a), where the state space was assumed to be a compact interval. Related results were obtained in Balbus and Nowak (2008), Balbus et al. (2014) and Balbus et al. (2015c), but with very specific transition functions being a convex combination of probability measures on the state space. The proof of the existence of a stationary Markov perfect equilibrium in Harris and Laibson (2001) and Balbus et al. (2015a) was based on a fixed point argument in a space of functions with locally bounded variation. The existence of an equilibrium in the class of Lipschitz continuous functions, on the other hand, was shown by Balbus et al. (2015c), who also discussed computational issues.
The existence of subgame perfect equilibria in a large class of standard discounted n-person stochastic games was proved by Mertens and Parthasarathy (2003). Their proof is based on some set-valued recursive equations whose solutions are the sets of Nash equilibrium vector payoffs in one-shot games involving the stage payoffs and the transition probability. Mertens and Parthasarathy (2003) also proposed an algorithm extending the well-known "value iteration" in discounted dynamic programming (see, e.g., Blackwell 1965). A simplified version of the algorithm was given in Mertens and Parthasarathy (1991). A certain modification of their method was applied by Balbus and Woźny (2016) to characterize the sets of Markov perfect equilibria in supermodular stochastic games and in the consumption/savings problem with quasi-hyperbolic discounting and with very specific transition functions. The existence of an equilibrium is proved in two steps. First, they show that a system of recursive equations in the value function space has a solution. Second, relatively strong concavity assumptions imposed on the transition operator allow one to obtain a Lipschitz continuous equilibrium. Finally, it is worth mentioning that a large class of dynamic games for which the Markov perfect equilibria exist and have a natural interpretation was indicated by Maskin and Tirole (2001).
This paper proposes a new method for proving the existence of Markov perfect equilibria in a dynamic consumption/savings model with quasi-hyperbolic discounting. Our approach is also inspired by the method of Mertens and Parthasarathy (2003), but it more closely resembles the "policy iteration" algorithm in dynamic programming. Our assumptions on the primitive data are much weaker than in the previous works of Harris and Laibson (2001) and Balbus et al. (2015a). The state space and the utility functions may be unbounded. Our first model concerns a transition probability function that is weakly continuous in investments and has no atoms in the positive part of the state space (0, ∞). The weighted-norm approach, proposed by Wessels (1977) and applied to this model, allows us to consider a large class of unbounded (e.g., power) utility functions. The second model studied in this paper deals with a special additive type of transition function that may involve positive atoms. It is worth pointing out that a Markov perfect equilibrium is obtained in this paper from the non-empty intersection of a nested family of compact subsets of the space of strategy profiles of all selves. Therefore, fixed point theorems are not applied in our approach. However, the equilibrium is non-stationary, even if the period utility functions and transition probabilities are independent of time.
This paper is organized as follows. In Sect. 2, we describe a general decision model and basic definitions. Section 3 contains our main results on Markov perfect equilibria in a model with non-atomic transitions, whereas the transitions involving atoms are studied in Sect. 4. Additional comments on our results and related literature are given in Sect. 5.

The model
Let R be the set of all real numbers and N be the set of all positive integers. Define S := [0, +∞), S_+ := (0, +∞) and, for each s ∈ S, A(s) := [0, s]. The set S is referred to as the state space. It represents the set of "levels" for the renewable resource and A(s) is the set of available actions (possible consumption levels) in state s ∈ S. In a dynamic choice model with quasi-hyperbolic preferences and the state space S we envision an individual decision maker as a sequence of autonomous temporal selves. These selves are indexed by the period numbers t ∈ T := N in which they make their choices. More precisely, for a given state s_t ∈ S at the beginning of the t-th period, self t chooses a consumption level a_t ∈ A(s_t) and the remaining part y_t := s_t − a_t is invested for future selves. Self t's satisfaction is measured by the period utility functions u_τ : S → S for all τ ≥ t.
Let q_t be a transition probability from S to S. Then, the state s_{t+1} is generated by q_t(·|y_t) depending on the investment y_t ∈ A(s_t).
Let Φ be the set of all Borel measurable functions φ : S → S such that φ(s) ∈ A(s) for each s ∈ S. A Markov strategy for self t is a function c_t ∈ Φ. We put i_t(s) = s − c_t(s), s ∈ S. This is an investment (or saving) strategy of self t for the following selves. For any sequence (c_n) ∈ Φ^∞ := Φ × Φ × ··· of strategies of all selves and any t ∈ T, we define c^t := (c_t, c_{t+1}, ...). For any state s_t and any c^t ∈ Φ^∞, the transition probabilities q_τ(·|i_τ(s)) induced by q_τ with τ ≥ t generate, by the Ionescu-Tulcea theorem (see Proposition V.1.1 in Neveu (1965)), a unique probability measure P^{c^t}_{s_t} on S^∞ endowed with the product σ-algebra. Let E^{c^t}_{s_t} denote the expectation operator corresponding to the measure P^{c^t}_{s_t}. Assume that for each τ ∈ T, the function u_τ : S → S is continuous. Note that it is non-negative.
The expected utility of self t is
\[ E^{c^t}_{s_t}\Big( u_t(c_t(s_t)) + \delta \sum_{\tau=t+1}^{\infty} \beta^{\tau-t}\, u_\tau(c_\tau(s_\tau)) \Big), \tag{1} \]
where β ∈ (0, 1) is a long-run discount factor and δ ∈ (0, 1] is a short-run discount factor. The idea of using utility functions of the form (1) goes back to Phelps and Pollak (1968). A detailed discussion with some applications of time preferences represented by utility functions of the type considered here can be found in Harris and Laibson (2001) and Montiel Olea and Strzalecki (2014). Clearly, the expression defined in (1) is non-negative. In the sequel, we give conditions under which it is finite. For any c^n = (c_n, c_{n+1}, ...) ∈ Φ^∞, n ≥ 2 and s_n ∈ S, let
\[ J_n(c^n)(s_n) := E^{c^n}_{s_n}\Big( \sum_{\tau=n}^{\infty} \beta^{\tau-n}\, u_\tau(c_\tau(s_\tau)) \Big). \]
Then the expression in (1) equals
\[ u_t(c_t(s_t)) + \delta\beta \int_S J_{t+1}(c^{t+1})(s')\, q_t(ds' \mid i_t(s_t)). \]
Definition 2 A Stationary Markov Perfect Equilibrium (SMPE) is an MPE ĉ = (ĉ_t)_{t∈T} ∈ Φ^∞ such that ĉ_t = c_0 for some c_0 ∈ Φ and for all t ∈ T.
In a stationary MPE every self uses the same consumption strategy. One can think of every self as a short-lived player in a non-cooperative game who acts only once. The payoff function of self t ∈ T is given by (1). Then an MPE ĉ = (ĉ_t)_{t∈T} ∈ Φ^∞ is a Nash equilibrium in this game.
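Since later sections refer to "function (3)", to P_t(a, c^{t+1})(s) and to best responses br_t(c^{t+1}), it may help to spell out the objects involved. The block below is a hedged sketch assembled from the way these symbols are used in the proofs. The exact form and numbering of the displays in the source may differ, and the equilibrium condition is an assumed reading of the notion of MPE, not a quotation.

```latex
% Assumed form of the function referred to later as (3): the payoff of self t
% from consuming a in state s when the following selves use c^{t+1}.
\[
P_t(a, c^{t+1})(s) := u_t(a) + \delta\beta \int_S J_{t+1}(c^{t+1})(s')\, q_t(ds' \mid s - a),
\qquad a \in A(s),
\]
% where J_{t+1}(c^{t+1})(s') is the expected beta-discounted sum of the period
% utilities of selves t+1, t+2, ... when the state in period t+1 is s'.

% Assumed equilibrium condition: every self best-responds to the followers.
\[
\hat c_t(s) \in \arg\max_{a \in A(s)} P_t\big(a, \hat c^{\,t+1}\big)(s)
\qquad \text{for all } t \in T,\ s \in S .
\]
```

Under this reading, self t takes the strategies of all later selves as given and maximizes only over the own current consumption a, which is exactly the Nash property described above.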
Let c = (c t ) t∈T ∈ Φ ∞ and u c t+n (s t+n ) := u t+n (c t+n (s t+n )), n ∈ T, s t+n ∈ S.

For any Borel measurable function
Consider the composition Q^c_{t+1} ··· Q^c_{t+n} ũ^c_{t+1+n}(s_{t+1}). In particular, note that and By the Ionescu-Tulcea theorem, it follows that Clearly, (7) is well-defined, non-negative, but it can be infinite. Below we make the following assumptions.
(W1) There exists a continuous increasing function w : S → [1, ∞) such that 0 ≤ u_t(s) ≤ w(s) for all s ∈ S and t ∈ T.
(W2) There exists a constant α > 0 such that αβ < 1 and ∫_S w(s) q_t(ds|y) ≤ αw(y) for all y ∈ S, t ∈ T.
for all s_t ∈ S and c^t ∈ Φ^∞. Assumptions of the above type were used to study discounted Markov decision processes with unbounded stage utility functions, see Wessels (1977), Hernández-Lerma and Lasserre (1996) or Jaśkiewicz and Nowak (2011).
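The role of (W1) and (W2) can be illustrated by a short computation. The derivation below is a sketch that uses only (W1), (W2), the monotonicity of w and the constraint 0 ≤ i_τ(s) ≤ s. It is not a reproduction of the displays missing from this copy, although the constant αβ/(1 − αβ) agrees with the value of λ quoted later in the text.

```latex
% One-step drift: the next state is drawn from q_tau( . | i_tau(s_tau)) and i_tau(s_tau) <= s_tau.
\[
E\big[ w(s_{\tau+1}) \,\big|\, s_\tau \big]
  = \int_S w(s')\, q_\tau\big(ds' \mid i_\tau(s_\tau)\big)
  \le \alpha\, w\big(i_\tau(s_\tau)\big)
  \le \alpha\, w(s_\tau).
\]
% Iterating k times from the state s_t:
\[
E^{c^t}_{s_t}\, w(s_{t+k}) \le \alpha^{k}\, w(s_t), \qquad k \ge 0 .
\]
% By (W1) and c_{t+k}(s) <= s: u_{t+k}(c_{t+k}(s_{t+k})) <= w(s_{t+k}); hence, since alpha*beta < 1,
\[
E^{c^t}_{s_t} \sum_{k=0}^{\infty} \beta^{k}\, u_{t+k}\big(c_{t+k}(s_{t+k})\big)
  \le \sum_{k=0}^{\infty} (\alpha\beta)^{k}\, w(s_t)
  = \frac{w(s_t)}{1-\alpha\beta} .
\]
% Applying the same bound from period t+1 on and then (W2) once more:
\[
\beta \int_S \frac{w(s')}{1-\alpha\beta}\, q_t(ds' \mid y)
  \le \frac{\alpha\beta}{1-\alpha\beta}\, w(y), \qquad y \in S .
\]
```

In particular, the expected utility of every self is finite for every strategy profile and is dominated by a multiple of the weight function w.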

Markov perfect equilibria in models with non-atomic transitions
Let Pr(S) be the set of all probability measures on the state space S. We recall that a sequence (μ_n)_{n∈N} of probability measures on S converges weakly to some μ_0 ∈ Pr(S) (μ_n ⇒ μ_0 in short) if, for any bounded continuous function f : S → R, it holds that
\[ \lim_{n\to\infty} \int_S f(s)\,\mu_n(ds) = \int_S f(s)\,\mu_0(ds). \]
We now formulate our main assumptions in this section:
(U) For each t ∈ T, the function u_t : S → S is increasing, strictly concave and continuous at s = 0.
(Q) The transition probability q_t is weakly continuous on S, that is, for each y_0 ∈ S and y_m → y_0, we have q_t(·|y_m) ⇒ q_t(·|y_0) as m → ∞. Moreover, for each y ∈ S_+, the probability measure q_t(·|y) is non-atomic and q_t(·|0) has no atoms in S_+.
We now define some special classes of strategies of the players. By F, we denote the set of all continuous from the left mappings c : S → S such that the function i(s) := s − c(s) is non-decreasing and 0 ≤ c(s) ≤ s for all s ∈ S. Note that i is lower semicontinuous. Thus, c ∈ F is upper semicontinuous.
We can now state our first two main results.

Theorem 2 Assume that (W1)-(W3), (U) and (Q) hold and the model is stationary, i.e., q_t = q and u_t = u are independent of t ∈ T. Then there exists an SMPE ĉ
In the proof of Theorem 1 we define by backward induction a nested family of subsets of F^∞ that has a non-empty intersection. A Markov perfect equilibrium is a sequence that belongs to this intersection. This method resembles a "set-valued dynamic programming" approach and is based on two important factors: the continuity of expected utilities with respect to some natural topology on the space F^∞ and the continuity of self t's best response mapping defined on the space of sequences of strategies of future selves. Observe that the assertion of Theorem 2 is stronger than that of Theorem 1, but the assumptions are also stronger (stationarity of the model). To obtain an SMPE in Theorem 2 we shall need to apply a fixed point theorem, which is not needed in the proof of Theorem 1.
Let X be the vector space of all left-continuous functions f : S → R such that f(0) = 0. It is also assumed that each f ∈ X has bounded variation on every interval [0, m], m ∈ N. We assume that X is endowed with the topology of weak convergence. Recall that a sequence (f_n)_{n∈N} converges weakly to f ∈ X if and only if f_n(s) → f(s) as n → ∞ at any continuity point s ∈ S of f. Here we point out that s = 0 is considered as a continuity point of f. Note that every c ∈ F is upper semicontinuous and continuous from the left. The function s → s − c(s) belonging to the space I is non-decreasing and lower semicontinuous. Observe that I ⊂ X. Moreover, s = 0 is a continuity point of every function in F or I.

Lemma 1 The sets I and F are convex and sequentially compact in X.
Proof It is obvious that I is convex. For any f ∈ I and m ∈ N, we define the function f^m as follows: f^m(s) = f(s) for all s ∈ [0, m] and f^m(s) = m for all s > m. Then f^m can be viewed as a left-continuous "distribution function" of some non-negative countably additive measure ρ_m such that ρ_m(S) = m. Consider an arbitrary sequence (f_n)_{n∈N} of functions in I. We now apply the standard "diagonal method". By Helly's theorem (see Billingsley 1968), there exists a subsequence (n_1(k)) of (n) such that (f^1_{n_1(k)})_{k∈N} converges weakly (as k → ∞) to some f^1_o ∈ I. Next, there exists a subsequence (n_2(k)) of (n_1(k)) such that (f^2_{n_2(k)})_{k∈N} converges weakly to some f^2_o. Proceeding in this way, we infer that for any r ≥ 2, there exists a subsequence (n_r(k)) of (n_{r−1}(k)) such that (f^r_{n_r(k)})_{k∈N} converges weakly to some f^r_o. Thus, I is sequentially compact. Since F is obtained from I by a simple continuous transformation, the result also holds for F.
By Lemma 1 and Tychonoff's theorem, we obtain the following auxiliary result.

Corollary 1
The space F ∞ endowed with the product topology is sequentially compact.
Lemma 3 Let the assumptions of Lemma 2 be satisfied. Assume that g : S → R is a Borel measurable function and S d is a denumerable subset of S + such that for any s ∈ S\S d and This fact and Lemma 2 imply that Hence (11) follows.
Let S c denote the set of continuity points of c ∈ X. If c ∈ F, then 0 ∈ S c and S\S c is a denumerable set. The proof of the following result is the same as that of Lemma 3.5 in Balbus et al. (2015a).
Observe that assumption (U) implies that every function u_t is continuous on the space S. Now from Lemmas 3 and 5 with g_m = g = J_{t+1}(c^{t+1}) (note that λ = αβ/(1 − αβ) in Lemma 2, compare with (8)), we conclude the following auxiliary result.

Lemma 6 If (W1)-(W3), (U) and (Q) hold, then for any c t+1
The set B R t (c t+1 )(s) can be regarded as the set of all best responses of self t ∈ T in state s, given that the following selves are going to use c t+1 ∈ F ∞ . Under our assumptions, this set is non-empty and compact by Lemma 6. For any s ∈ S and t ∈ T let us define A simple adaptation of the arguments given in the proof of Theorem 6.3 in Topkis (1978) gives the following result.

Lemma 7 Let (W1)-(W3), (U) and (Q) be satisfied. Then
s ∈ S and i t (c t+1 ) is non-decreasing and continuous from the left.
Proof For parts (a)-(c) consult Lemmas 3.2 and 3.3 in Balbus et al. (2015a). If φ ∈ F, then the function i given by i(s) = s −φ(s) is non-decreasing and continuous from the left. If s ∈ S φ , then we have φ(s) = br t (c t+1 )(s) by (c). Assume that s o ∈ S\S φ . Since i is continuous from the left, non-decreasing and the correspondence s → B R t (c t+1 )(s) is strongly ascending, (17) is continuous.
Note that x t (σ ) is the best response of self t to σ = c t+1 = (c t+1 , c t+2 , . . .) assumed to be chosen by following selves, x t−1 (σ ) is the best response of self t − 1 to the sequence (br t (c t+1 ), c t+1 ) and so on.
Proof of Theorem 1 First note that for any t ∈ T, we have G_{t+1} ⊂ G_t. Since all best response mappings are continuous (Lemma 8) and the space F^∞ is compact, every set G_t is non-empty and compact. Therefore, the set G := ∩_{t∈T} G_t is non-empty and compact. Choose any ĉ = (ĉ_1, ĉ_2, ...) ∈ G. Then ĉ ∈ G_t for every t ∈ T. This implies immediately that, for every t ∈ T, we have ĉ_t = br_t(ĉ^{t+1}). Hence, ĉ is an MPE.
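The backward construction of the first t coordinates of an element of G_t can be illustrated numerically. The sketch below is only a discretized illustration and not the construction used in the proof: it assumes power utility u(a) = a^σ, the transition s' = y + ξ√y with ξ uniform on [0, 1] (an assumed affine form of the shock in the example discussed later in this section), a state grid truncated at a hypothetical bound S_MAX, and a tail σ in which all selves after t consume the whole stock. All names and parameter values are hypothetical.

```python
import math

SIGMA, BETA, DELTA = 0.5, 0.9, 0.7            # hypothetical parameter values
S_MAX, N = 4.0, 41                            # truncated grid on [0, S_MAX] (assumption)
GRID = [S_MAX * k / (N - 1) for k in range(N)]
SHOCKS = [(j + 0.5) / 10 for j in range(10)]  # midpoint nodes for a uniform shock


def u(a):
    """Power utility; note that u(0) = 0."""
    return a ** SIGMA


def interp(values, s):
    """Piecewise-linear interpolation on GRID; states above S_MAX are clipped."""
    s = min(max(s, 0.0), S_MAX)
    pos = s / S_MAX * (N - 1)
    k = min(int(pos), N - 2)
    w = pos - k
    return (1 - w) * values[k] + w * values[k + 1]


def expected(values, y):
    """Approximate E[values(s')] for s' = y + xi * sqrt(y), xi uniform on [0, 1]."""
    return sum(interp(values, y + z * math.sqrt(y)) for z in SHOCKS) / len(SHOCKS)


def best_response(J_next):
    """Grid best response of the current self and the continuation value it induces."""
    policy, J = [], []
    for k, s in enumerate(GRID):
        best_a, best_val = 0.0, -math.inf
        for a in GRID[: k + 1]:               # feasible consumption levels in [0, s]
            val = u(a) + DELTA * BETA * expected(J_next, s - a)
            if val > best_val:
                best_a, best_val = a, val
        policy.append(best_a)
        # Earlier selves discount this self's stream with beta only (no delta).
        J.append(u(best_a) + BETA * expected(J_next, s - best_a))
    return policy, J


def backward_profile(t):
    """Grid strategies of selves 1, ..., t when all later selves consume everything."""
    J = [u(s) for s in GRID]                  # tail value: consume all, stock stays at 0
    profile = []
    for _ in range(t):
        policy, J = best_response(J)
        profile.append(policy)
    profile.reverse()                         # order the strategies as selves 1, ..., t
    return profile


profile = backward_profile(3)
```

Each self maximizes with the weight δβ on the continuation value, while the value passed on to earlier selves is accumulated with β alone. The proof itself works with all tails σ ∈ F^∞ at once and passes to the intersection of the sets G_t instead of fixing one tail and one horizon.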
Proof of Theorem 2 In the stationary case, we can restrict attention to constant sequences (c, c, ...) ∈ F^∞. Every such sequence can be identified with c ∈ F. Function (3) can be regarded as a function P(a, c)(s), where a ∈ A(s), c ∈ F. The best response mapping br(c) := br_t(c^{t+1}) (where t is arbitrary and c^{t+1} = (c, c, ...) can be identified with c ∈ F) is by Lemma 8 continuous on the convex compact set F in the space X (Lemma 1). From the Schauder-Tychonoff fixed point theorem (see Aliprantis and Border 2006), it follows that there exists c* ∈ F such that c* = br(c*). Clearly, the constant sequence (c*, c*, ...) is an SMPE.
A natural example of transition probability that satisfies assumption (Q) is induced by the following equation where y_t = s_t − a_t is the investment in state s_t, (ξ_t)_{t∈T} is a sequence of i.i.d. random "shocks" having a probability distribution π. The function f is continuous and for any Borel set D in S and investment y ∈ S q_t(D|y) where f_1, f_2 : S → S are continuous, increasing and such that f_1(y) < f_2(y) for each y ∈ S_+ and f_1(0) = f_2(0) = 0. For instance, let f_1(y) = y and f_2(y) = y + √y. In addition, π is a non-atomic probability measure on [0, 1]. For any y > 0, q(·|y) is a non-atomic measure such that q([f_1(y), f_2(y)]|y) = 1. In particular, if π is the uniform distribution on [0, 1], then q(·|y) is the uniform distribution on [f_1(y), f_2(y)]. Further assume that the utility function is independent of t, e.g., u(s) = s^σ with σ ∈ (0, 1). Then, setting w(s) = (s + r)^σ, r ≥ 1, it follows that the assumptions (W1) and (U) are satisfied. Moreover, (W3) and (Q) also hold. Note that q({0}|0) = 1. Now we prove (W2). Assuming that each ξ_t has the uniform distribution on [0, 1], we obtain by Jensen's inequality that The function r → η(r) is decreasing and lim_{r→∞} η(r) = 1. Hence, for any β ∈ (0, 1) we may choose a sufficiently large r ≥ 1 such that βη^σ(r) < 1. This shows that (W2) holds with α := η^σ(r). The functions u, f_1 and f_2 may depend on t.
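The choice of r in this example can be checked numerically. The sketch below assumes f_1(y) = y, f_2(y) = y + √y, a uniform shock, and the affine form s' = f_1(y) + ξ(f_2(y) − f_1(y)), which is one transition consistent with the description above and is an assumption. Under it, Jensen's inequality gives E w(s') ≤ (y + √y/2 + r)^σ, and the closed form used for `eta` is the supremum over y of (y + √y/2 + r)/(y + r), attained at y = r. This `eta` is a worked value for the specific example and may differ from the definition of η in the source. The helper names are hypothetical.

```python
import math


def eta(r):
    """sup over y >= 0 of (y + sqrt(y)/2 + r) / (y + r); the supremum is attained at y = r."""
    return 1.0 + 1.0 / (4.0 * math.sqrt(r))


def smallest_r(beta, sigma, r=1.0):
    """Double r, starting from 1, until beta * eta(r)**sigma < 1."""
    while beta * eta(r) ** sigma >= 1.0:
        r *= 2.0
    return r


def drift_ratio(y, r, sigma, n=2000):
    """Midpoint-rule estimate of E[w(s')] / w(y) for s' = y + xi * sqrt(y), w(s) = (s + r)**sigma."""
    def w(s):
        return (s + r) ** sigma

    mean = sum(w(y + (j + 0.5) / n * math.sqrt(y)) for j in range(n)) / n
    return mean / w(y)


beta, sigma = 0.99, 0.5          # hypothetical parameter values
r = smallest_r(beta, sigma)
alpha = eta(r) ** sigma          # candidate constant for (W2)
```

Since η(r) decreases to 1 as r grows, the loop in `smallest_r` terminates for every β ∈ (0, 1). The estimate `drift_ratio(y, r, sigma)` can then be compared with `alpha` on a set of test investments y to confirm the inequality in (W2) for this example.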

Example 3 The model with multiplicative shocks
where f is as in Example 2 and the probability measure π is non-atomic with support included in [0, + ∞).
Other functions w for which the transition probabilities of the type discussed above satisfy conditions (W1)-(W3) and (Q), (U) can be obtained by an adaptation of examples from Section 4 in Jaśkiewicz and Nowak (2011).

Markov perfect equilibria in models with transitions having atoms
The model considered in the previous section does not include deterministic transitions. In the deterministic case the basic continuity lemmas are doubtful in the class of discontinuous strategies F ∞ . A discussion of this issue can be found in Balbus et al. (2015a). In this section, we study some models involving atoms in the set S + , but the transition probability has an additive form.
With any c ∈ Φ we associate i ∈ Φ given by i(s) := s − c(s) and next define F L := {c ∈ Φ : c and i are non-decreasing}.
It is easy to see that F L consists of Lipschitz functions with constant one. We now assume that q = q t for each t ∈ T and the following conditions hold.

It is known (see Topkis 1998) that (A4) holds if and only if for any non-decreasing function
Some comments on the above assumptions are included in Remarks 1 and 2 at the end of this section.
We now state our main results in this section. Let C(S) be the space of all real-valued continuous functions on S. We assume that C(S) is endowed with the well-known topology of uniform convergence on compact sets. Recall that a sequence ( f n ) n∈N of functions in C(S) converges to some f ∈ C(S) if and only if for any compact interval K ⊂ S we have that lim n→∞ sup s∈K | f n (s) − f (s)| = 0.

Lemma 9 The set F L is convex and sequentially compact in C(S).
Proof Clearly, the set F_L is convex. In order to show the sequential compactness we proceed similarly as in Lemma 1. However, in this proof, if f ∈ F_L and m ∈ N, then f^m is the restriction of f to the interval [0, m]. Let (f_n)_{n∈N} be any sequence of functions in F_L and let us apply the standard "diagonal method". By the Arzelà-Ascoli theorem (see Billingsley 1968), there exists a subsequence (n_1(k)) of (n) such that (f^1_{n_1(k)})_{k∈N} converges uniformly (as k → ∞) to some f^1_o ∈ F_L on [0, 1]. Next, we choose a subsequence (n_2(k)) of (n_1(k)) such that (f^2_{n_2(k)})_{k∈N} converges uniformly to some f^2_o.
By Lemma 9 and Tychonoff's theorem, we conclude the following fact.

Corollary 2
The space F ∞ L endowed with the product topology is sequentially compact.
From Lemma 10, we conclude the following result. Since the assertion follows from (26) and the dominated convergence theorem.

Lemma 12 Assume that (A1)-(A4) and (U) hold. Then for any c t+1 in F
Proof First, we show that s → J^N_{t+1}(c^{t+1})(s) is non-decreasing. Clearly, s → u_{t+1+k}(c_{t+1+k}(s)) is non-decreasing by (U). Moreover, Hence, by (20), (A2) and the fact that s → s − c_{t+1}(s) is non-decreasing, it follows that s → Q^c_{t+k} ũ^c_{t+1+k}(s) is non-decreasing as well. Continuing the procedure, we finally claim that
The next result states that the best response set BR_t(c^{t+1})(s) of self t ∈ T is a singleton.

Lemma 13 Let (A1)-(A4) and (U) be satisfied. Assume that c t+1 in F
Proof Let y := s − a and observe first that the function y → ∫_S J_{t+1}(c^{t+1})(s′) q(ds′|y) is concave. Indeed, by (A1) we obtain Due to Lemma 12 and (20) we know that for every Thus, the above inequalities, assumption (A2) and (27) lead to the conclusion. Assume that at least one inequality (28) is strict. Then we can conclude that y → P_t(s − y, c^{t+1})(s) and a → P_t(a, c^{t+1})(s) are strictly concave on A(s) for every s ∈ S_+. Therefore, the sets BR_t(c^{t+1})(s) and BR_t(c^{t+1})(s) are singletons for every s ∈ S. Moreover, we have From the strict concavity of u_t and Lemma 3.2 in Balbus et al. (2015a), it follows that the function s → i_t(c^{t+1})(s) is non-decreasing. From the strict concavity of y → ∫_S J_{t+1}(c^{t+1})(s′) q(ds′|y) and Lemma 3.2 in Balbus et al. (2015a), we conclude that the function s → br_t(c^{t+1})(s) is also non-decreasing. We obviously know that the functions i_t(c^{t+1}) and br_t(c^{t+1}) satisfy 0 ≤ i_t(c^{t+1})(s) ≤ s and 0 ≤ br_t(c^{t+1})(s) ≤ s, s ∈ S. Thus, br_t(c^{t+1}) ∈ F_L. If for every i = 1, ..., l, we have equality in (28), then
Lemma 14 Assume that (A1)-(A3) and (U) hold. Then, the mapping br_t : F^∞_L → F_L is continuous.
Proof Suppose that c t+1,m → c t+1 as m → ∞. Put ψ m := br t (c t+1,m ). Let ψ be any accumulation point of the sequence (ψ m ) in F L being a compact space. We have to show that ψ = br t (c t+1 ). However, this fact easily follows from the dominated convergence theorem, Lemmas 10, 11 and 13.
Similarly as in (18), we define the sets Ĝ_t, t ∈ T.
Proof of Theorem 3 We note that Ĝ_{t+1} ⊂ Ĝ_t. Since all best response mappings are continuous by Lemma 14 and the space F^∞_L is compact, every set Ĝ_t is non-empty and compact. Therefore, the set Ĝ := ∩_{t∈T} Ĝ_t is non-empty and compact. Choose any ĉ = (ĉ_1, ĉ_2, ...) ∈ Ĝ. Then ĉ ∈ Ĝ_t for every t ∈ T. This implies that ĉ_t = br_t(ĉ^{t+1}) for every t ∈ T. Hence, ĉ is an MPE.
Proof of Theorem 4 In the stationary case, we again restrict attention to constant sequences (c, c, ...) ∈ F^∞_L. Clearly, such a sequence can be identified with c ∈ F_L. Function (3) can be regarded as a function P(a, c)(s), where a ∈ A(s), c ∈ F_L. By Lemma 14 the best response mapping br(c) := br_t(c^{t+1}) is continuous on the convex compact set F_L in the space C(S) (Lemma 9). From the Schauder-Tychonoff fixed point theorem (see Aliprantis and Border 2006), it follows that there exists c* ∈ F_L such that c* = br(c*). Hence, the constant sequence (c*, c*, ...) is an SMPE.
Remark 1 In assumptions (A1)-(A4) the functions h i and the measures ν i , (i = 0, . . . , l) may depend on t ∈ T. Then, Lemmas 10-14 and Theorems 3, 4 remain valid. However, for the sake of clarity of notation we skip this dependence.

Remark 2
The special transition structure in this section and condition (U) imply that both functions y → P t (s − y, c t+1 )(s) and a → P t (a, c t+1 )(s) are strictly concave on A(s) for every s ∈ S + and c t+1 ∈ F ∞ L . This plays a crucial role in getting the best reply br t (c t+1 ) ∈ F L . If the transition probability has some atoms in S + and does not have the special additive form, then there is a problem with the continuity of a → P t (a, c t+1 )(s) for some c t+1 ∈ F ∞ and the best response br t (c t+1 ) to c t+1 may not exist.

Comments
In this section we give some remarks on the relation of our results with the literature.
Remark 3 The decision model studied in this paper can be viewed as a game between generations. Self t can be considered as generation t having the total utility depending on all its descendants. Further comments on this issue can be found in Balbus et al. (2015a). In the intergenerational game setting, it is desirable to assume that the period utility functions u t and transition functions q t depend on generation t ∈ T. This natural requirement motivates us to consider the non-stationary models. Theorems 1 and 3 are new results in the area of decision processes with quasi-hyperbolic discounting (or intergenerational games). It is interesting to note that their proofs are not based on any fixed point argument. The idea is to consider the intersection G of the sets G t and is partly inspired by the fundamental work of Mertens and Parthasarathy (2003) on subgame perfect equilibria in n-person discounted stochastic games with simultaneous moves of the players. Balbus and Woźny (2016) also deal with a similar stationary model but with a compact state space and a relatively narrow class of transition functions.
Remark 4 Theorems 2 and 4 establish the existence of an SMPE in stationary models and have some predecessors in the literature. A version of Theorem 2 with compact state space follows from Theorem 3.1 in Balbus et al. (2015a). In this case the function u is bounded. Here, on the other hand, we deal with an unbounded state space and unbounded period utility functions. Such a setting covers more applications in economics, e.g., models with power or logarithmic utility functions. A related result to Theorem 1 is stated in Harris and Laibson (2001). Nonetheless, some of their assumptions are much stronger. They assume that S = [0, +∞) but the transition probability is induced by a difference equation with additive non-atomic noise with values in [y_1, y_2] ⊂ S_+. The period utility function u may be unbounded from below. However, it has a bounded relative risk-aversion coefficient. The class of strategies F (applied in Sect. 3) was used by Bernheim and Ray (1983) to study equilibria in altruistic growth models and by Majumdar and Sundaram (1991) to study Nash equilibria in symmetric stochastic games of resource extraction. Our paper owes much to their contributions.

Remark 5
The additivity assumptions similar to (A1)-(A3) were used in the study of models of intergenerational stochastic games or decision processes with quasi-hyperbolic discounting. In Balbus and Nowak (2008) a class of stochastic games between generations is considered where each generation consists of finitely many players. The proof of the main results is based on different ideas than in this work. Jaśkiewicz and Nowak (2014) also study models with additive non-atomic transitions but with risk-sensitive preferences. The most relevant work on models satisfying assumptions of type (A1)-(A3) is the paper of Balbus et al. (2014) where the transition probability depends on some unknown parameter chosen by a malevolent nature. Therefore, the notion of a robust Markov perfect equilibrium is introduced. Balbus et al. (2014) assume that the state space is a compact interval and the utility function u is bounded from above. The additivity condition on the transition probability function is also made in Balbus et al. (2015c), but with an additional restrictive assumption that ν 0 is the Dirac measure concentrated at zero. Thus, s = 0 is an absorbing state. Theorem 4 is not a corollary to their results. It should be noted, however, that they study an n-dimensional state space model.

Remark 6
The existence of an MPE in models with quasi-hyperbolic discounting and deterministic transitions is open and seems to be difficult. Examples 5.1 and 5.2 in Balbus et al. (2015a) suggest that Lemma 6 may fail to hold in the class F ∞ of strategies. On the other hand, working with the class F ∞ L as a strategy set leads to very restrictive assumptions.
Remark 7 Our proofs in both models (in atomic and non-atomic transitions) heavily rely on the assumption that the set S is one-dimensional. For example, the definition of br t (c t+1 ) given in (17) is based on the fact that B R t (c t+1 )(s) ⊂ R. Furthermore, the monotonicity of best reply functions with respect to state variable is very difficult to obtain in the model with multidimensional state space. We conjecture that a solution to dynamic decision models with quasi-hyperbolic preferences and more than one resource requires some new ideas and methods.
For any functions f 1 , f 2 ∈ I and m ∈ N, we can define ρ m ( f 1 , f 2 ) := d m ( p( f m 1 ), p( f m 2 )). Note that ρ m is a semimetric on the space I. Now we can define the metric ρ on I as Let f n converge weakly to f in I as n → ∞. This is equivalent to saying that ρ( f n , f ) → 0 as n → ∞.
Let f̃_1, f̃_2 ∈ F. Define f_i(s) = s − f̃_i(s), i = 1, 2, s ∈ S. Then f_i ∈ I and we can define the metric on F as ρ̃(f̃_1, f̃_2) := ρ(f_1, f_2). Clearly, convergence with respect to the metric ρ̃ on F is equivalent to convergence in the weak topology on F.