A class of linear quadratic dynamic optimization problems with state dependent constraints

In this paper, we analyse a wide class of discrete time one-dimensional dynamic optimization problems with strictly concave current payoffs, linear state dependent constraints on the control parameter, and non-negativity constraints on the state variable and the control. This model suits economic problems such as extraction of a renewable resource (e.g. a fishery or forest harvesting). The class of sub-problems considered encompasses a linear quadratic optimal control problem as well as models with a maximal carrying capacity of the environment (saturation). The problem is also interesting from a theoretical point of view: although it seems simple in its linear quadratic form, calculating the optimal control is nontrivial because of the constraints, and the solution has a complicated form. We consider both the infinite time horizon problem and its finite horizon truncations.


Introduction
As a motivating example, let us consider the simplest possible discrete time linear quadratic model of extraction of a single renewable resource, e.g. a fishery, by a monopolist facing a linear demand and having a strictly convex quadratic cost, and let us focus on the infinite horizon. To make such a model realistic, constraints have to appear: an obvious constraint is that it is impossible to extract more than the current biomass or number of fish or, equivalently, that the number of fish is always nonnegative. This model is very simple, the constraint is inherent to the problem, and the problem of renewable resources is crucial for contemporary economics. It is therefore difficult to imagine that, to the best of our knowledge, the problem has not been solved so far, apart from a simpler sub-problem in which the interest rate is by assumption identical to the rate of growth of the resource, examined in Singh and Wiszniewska-Matyszkiel (2018). Even that simple problem has quite surprising properties, although its level of complexity is not comparable to that of the results of this paper. This motivating example was the starting point for the analysis in this paper. The class of optimization problems examined in this paper is the largest class that fits the above description under the assumption that the decision maker is more impatient than in the model of Singh and Wiszniewska-Matyszkiel (2018), which is equivalent to a higher interest rate. This class of problems has been further generalized to encompass one more feature important in renewable resource extraction problems: the presence of a saturation point of the resource.
Linear quadratic dynamic optimization problems seem to be the most extensively examined class of dynamic optimization problems, and their solutions are regarded as standard textbook material [although continuous time models are more common, there are also extensive studies of the discrete time case in e.g. Anderson and Moore (2007)]. However, this is mainly true for the unconstrained linear quadratic regulator problem in its simplest form.
The most general form of the discrete time infinite horizon linear quadratic regulator problem with discounting (for consistency with our paper, we write it as a maximization problem) is to

maximize $-\sum_{t=0}^{\infty} \beta^{t} \left[ S(t)^{T} R S(t) + X(t)^{T} Q X(t) + X(t)^{T} W S(t) \right]$,
for the state trajectory X defined by $X(t+1) = AX(t) + BS(t)$ with $X(0) = x_0$, if we consider the open loop information structure (the control S depends on time only and the initial condition $x_0$ is known), or

maximize $-\sum_{t=0}^{\infty} \beta^{t} \left[ S(X(t))^{T} R S(X(t)) + X(t)^{T} Q X(t) + X(t)^{T} W S(X(t)) \right]$,
for the state trajectory X defined by $X(t+1) = AX(t) + BS(X(t))$ with $X(0) = x_0$, if we consider the feedback information structure (the control S depends on the current state only).
In some papers, especially with finite time horizon, feedback controls are functions of both current state and time.
The square matrices R and Q are assumed to be nonnegative definite, usually positive definite, while the discount factor β ∈ (0, 1], usually β = 1. The most common, simplest form of this problem is without the mixed term, i.e. with the matrix W = 0. When the infinite horizon linear regulator problem with constraints on states or controls is considered, the theoretical papers in this stream analyse its simplest form: undiscounted, with R and Q positive definite and W = 0, with fixed constraints of the form Hx ≤ h and Gs ≤ g, without any state dependent constraints on controls [e.g., Zhao and Lin (2008); Bemporad et al. (2002); Scokaert and Rawlings (1998); Thomas (1975); Chmielewski and Manousiouthakis (1996); Grieder et al. (2004); Sznaier and Damborg (1987)]. To the best of our knowledge, no theoretical analysis of linear quadratic optimization problems with nontrivial discounting, non-zero mixed term and state dependent constraints on controls has been published.
In current papers related to automatic control in linear quadratic regulator problems with constraints, the Model Predictive Control (MPC) method is frequently used, in which at each stage a dynamic optimization problem with a fixed finite horizon T (moving horizon or receding horizon) is solved, and it is usually assumed that after T the control is equal to the optimal control for the unconstrained problem [e.g., Zhao and Lin (2008); Bemporad et al. (2002); Scokaert and Rawlings (1998); Thomas (1975); Chmielewski and Manousiouthakis (1996); Grieder et al. (2004)]. The MPC method in its simplest and computationally fastest form results in an open loop approximation of the open loop optimal control.
Another way of obtaining an optimal control in the open loop form is by a discrete time equivalent of the infinite horizon Pontryagin Maximum Principle [extensively studied in Blot and Hayek (2014); Aseev et al. (2016) or Brunovský and Holecyová (2018)].
On the contrary, the approach based on solving the Bellman equation, first stated by Bellman (1957) (often called the Hamilton-Jacobi-Bellman equation, like its counterpart in continuous time) and introduced for problems with discounting by Blackwell (1965), returns an optimal control in the feedback form (sometimes also called closed loop or closed loop no memory). This is the approach which we use in this paper.
There are at least two reasons why feedback solutions are desired. Firstly, solutions in the class of feedback controls, although equivalent in simple dynamic optimization problems to solutions in the class of open loop controls, are usually not equivalent if we consider an extension of a dynamic optimization problem to a dynamic game, that is, a set of coupled dynamic optimization problems. In dynamic games, the feedback results are more plausible because of the property of being strongly subgame perfect (see e.g., Başar et al. 2018). Another disadvantage of open loop solutions, especially important in the infinite time horizon case, is the fact that even a small error in the estimation of the initial state may result in a substantial change of the optimal control. This may happen even in much simpler finite horizon problems.
A technique based on a receding horizon is also used for feedback controls. The receding horizon approach in such papers means either that, as in our paper, a sequence of finite horizon truncations of the initial problem is considered, or that the unconstrained solution is used as the continuation. However, such papers usually concern convergence results for finding a control in feedback form that stabilizes the system at 0 [e.g. Keerthi and Gilbert (1988) for a nonlinear problem; Sznaier and Damborg (1987); Scokaert and Rawlings (1998) and Bemporad et al. (2002) for the linear quadratic regulator without the mixed term]. Zhang et al. (2009) develop an algorithm for finding an approximate feedback optimal control for a class of discrete-time constrained systems with nonlinear dynamics in which the constraints are related to saturation.
Classes of dynamic optimization problems closer to the problem considered in our paper are examined in, e.g., Le Van and Morhaim (2002), in which, instead of stabilizing the system at 0, optimization of a feedback law on consumption over time is considered in an economy with a productive asset.
Constraints, including state constraints and state dependent constraints on controls, are inherent in a vast majority of economic problems: there may be some borrowing constraints, so current spending cannot exceed accumulated wealth by more than some fixed amount; nonnegativity of consumption or production is often implicitly assumed; in a renewable or non-renewable natural resource extraction problem, the decision maker cannot extract more than the biomass currently available in their area.
Because of the high level of complexity caused by introducing constraints that are active at the optimum, problems in which there are constraints but the solutions are always interior are very popular in applications. One class of such problems are resource extraction problems with logarithmic payoffs [mainly the so called Fish Wars, e.g., Levhari and Mirman (1980); Fischer and Mirman (1992); Wiszniewska-Matyszkiel (2005); Mazalov and Rettieva (2010); Breton and Keoula (2014); Doyen et al. (2018)]. Another class are problems with quadratic payoffs in which the constraints are deliberately large and, therefore, not active at the optimum [more often in continuous time models, e.g., Ehtamo and Hämäläinen (1993)]. The inherent nonnegativity constraint on controls like production, advertising effort or extraction has been noticed in some papers on sticky prices, and the fact that this constraint is active at the optimum or equilibrium led to a non-quadratic value function (Fershtman and Kamien 1987; Wiszniewska-Matyszkiel et al. 2015), while in the majority of the other papers on the subject only the steady state solutions are considered, which are interior and, therefore, equal to the solutions of the unconstrained problem (at least whenever the initial state equals the steady state). Similarly, the problem of international environmental agreements concerning pollution of Rubio and Ulph (2007) with constraints results in a non-quadratic value function.
In papers on problems with constraints in which boundary solutions are also possible, authors often concentrate on the interior solutions [e.g., Santos (1991) considers a discrete time Ramsey-type model of capital accumulation and proves that if the objective function is strictly concave and twice continuously differentiable, then any interior optimal path is continuously differentiable with respect to the initial state]. To some extent, this is also the case in our problem. Although there is no solution which is interior for all nonzero initial conditions, there are solutions which are interior for some open set of initial conditions. Nevertheless, such solutions are what we are least interested in, since they are not realistic: in the interpretation as a fishery extraction problem, they correspond to a set of initial conditions describing an abundance of fish, whose biomass cannot be decreased even by greedy myopic fishing.
It is worth mentioning that linear quadratic problems with state dependent constraints have been theoretically considered in the more complex environment of linear quadratic dynamic games: the theory of such games has been studied in Reddy and Zaccour (2015, 2017), where linear quadratic dynamic games with linear constraints in discrete time with a finite time horizon are considered, both in the open loop (Reddy and Zaccour 2015) and in the feedback form of strategies (Reddy and Zaccour 2017).
Although, generally, continuous time models are better examined, discrete time models are more natural for many problems related to ecological issues, like water consumption with natural seasonality (see e.g., Yakowitz 1982 or Krawczyk and Tidball 2006), exploitation of fisheries or forests with natural subsequent harvesting seasons [like the interpretation of this model, the model of Singh and Wiszniewska-Matyszkiel (2018) and the papers on Fish Wars mentioned above], pollution problems with international environmental agreements with variable participation (e.g., Rubio and Ulph 2007; Breton et al. 2010) or river pollution (e.g. Krawczyk and Zaccour 1999).
In this paper, we solve analytically a class of one-dimensional discrete time linear quadratic dynamic optimization problems with state dependent constraints on the control and a nonnegativity state constraint, using a sufficient condition based on the Bellman or Hamilton-Jacobi-Bellman equation with an appropriate terminal condition (Stokey et al. 1989 or Wiszniewska-Matyszkiel 2011). Our problem does not belong to the class of constrained linear quadratic dynamic optimization problems solved in the theoretical papers mentioned above.
Our class of problems has an obvious application in resource extraction problems: it covers the single dynamic optimization problem mentioned in the motivating example, in the case of no externalities and a single well defined owner of the resource, and it can also be extended to Fish Wars type problems, i.e. dynamic games of common resource extraction.
Because of that application, we extend the results for the basic constrained linear quadratic model to a problem in which the carrying capacity of the environment, i.e. a saturation point of the state variable, is introduced.
The theoretical part is a building block in the theory of linear quadratic dynamic optimization problems with constraints, especially state-dependent constraints on controls. Although the problem is simple and all its constituents can be easily interpreted by a non-mathematician, it is extremely complicated to solve. Moreover, this model constitutes an example in which a methodological approach often used in applied papers on optimal control or dynamic games does not work, and in which some often used simplifications, if numerical methods are used to verify the Bellman equation, may lead to substantial errors.
The model is related to a linear quadratic multistage game that has been considered in Wiszniewska-Matyszkiel (2018, 2019) in which it also has properties contrary to expectations of readers familiar with linear quadratic dynamic games.
The paper is organized as follows. The problem in its simplest linear quadratic form is formulated in Sect. 2. Finite horizon truncations of the problem are solved in Sect. 3 with the limiting behaviour described in Sect. 3.1, while the basic infinite horizon problem is solved in Sect. 4. Section 5 is devoted to modification of the dynamics taking into account carrying capacity of the environment. Section 6 is devoted to a potential trap related to using the undetermined coefficients method to solve the infinite horizon problem. Technical lemmata are in the "Appendix".

Formulation of the basic problem
We consider a discrete time linear quadratic dynamic optimization problem with linear constraints and its slight modification taking into account a saturation point of the state variable. Because it is important for applications, we restrict our attention to non-negative values of the state variable and the control parameter, which are both in $\mathbb{R}_+$. Without loss of generality, the time set will be either $\mathbb{N}$ or $\{0, \dots, N\}$ (the latter case is called a truncation of the problem with horizon N).
We consider the strictly concave quadratic current payoff function $P : \mathbb{R}_+ \to \mathbb{R}$,
$$P(s) = As - \frac{B}{2}s^{2}, \tag{1}$$
for A > 0 and B > 0. Nevertheless, the optimal control, which we define in the sequel, remains unchanged for any strictly concave quadratic payoff function which attains its maximum at a positive s, since such a payoff has the form P(s) + c for a constant c, so the optimal control does not change.
There are linear state dependent constraints on available control parameters, i.e., $s \in [0, (1+\xi)x]$ for some ξ > 0, whose interpretation becomes obvious after introducing the dynamics. This results in the set of admissible state and control parameter pairs
$$\Omega = \{(x, s) \in \mathbb{R}_+^{2} : s \le (1+\xi)x\}.$$
We consider the optimization problem in the class of feedback controls: in the infinite horizon case, they are functions $S : \mathbb{R}_+ \to \mathbb{R}_+$, with the argument representing the state variable x, while in the case with the finite horizon N, they are functions $S : \mathbb{R}_+ \times \{0, \dots, N\} \to \mathbb{R}_+$. The set of admissible feedback controls is denoted by $\mathcal{S}$ in the infinite horizon case and by $\mathcal{S}_N$ in the case with horizon N.
The behaviour equation of the dynamical system, given a control $S \in \mathcal{S}$, is
$$X(t+1) = \phi(X(t), S(X(t))) \tag{2}$$
with $X(0) = x_0 \ge 0$ as the initial state of the system. For $S \in \mathcal{S}_N$, analogously. Initially, the function $\phi : \Omega \to \mathbb{R}_+$ is
$$\phi(x, s) = (1+\xi)x - s, \tag{3}$$
where ξ > 0 denotes the rate of regeneration. However, in the sequel, we shall also consider its modification on some subset of arguments. Note that for this dynamics, given a nonnegative initial state, the state always remains nonnegative due to the state dependent constraint on the control, so we do not have to impose additional state constraints for positive t.
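As a quick illustration of why the state dependent constraint keeps the state nonnegative, the following minimal sketch simulates the linear dynamics $X(t+1) = (1+\xi)X(t) - S(X(t))$ under a feedback rule. The function names, the clipping helper and the two feedback rules are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
def phi(x, s, xi):
    """Linear regeneration: next state after extracting s from the stock x."""
    return (1.0 + xi) * x - s


def simulate(x0, control, xi, steps):
    """Return the trajectory X(0), ..., X(steps) under a feedback control.

    The control value is clipped to the admissible interval [0, (1 + xi) * x],
    which is what guarantees that the state never becomes negative.
    """
    trajectory = [x0]
    x = x0
    for _ in range(steps):
        s = min(max(control(x), 0.0), (1.0 + xi) * x)
        x = phi(x, s, xi)
        trajectory.append(x)
    return trajectory


# Hypothetical rule 1: extract half of what is available at each stage.
path = simulate(x0=100.0, control=lambda x: 0.5 * 1.02 * x, xi=0.02, steps=5)

# Hypothetical rule 2: ask for far more than is available; clipping depletes the stock.
greedy = simulate(x0=100.0, control=lambda x: 1e9, xi=0.02, steps=3)
```

In the second run the requested extraction is cut down to the whole available stock, so the resource is exhausted after one stage and stays at zero.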

Remark 1
The set of admissible trajectories as well as the set of admissible controls remain unchanged if, instead of or besides the state dependent constraint on available control parameters, we consider the state space $\mathbb{R}$, the state transition equation $\phi : \mathbb{R} \times \mathbb{R}_+ \to \mathbb{R}$, nonnegative initial states only, and we impose the nonnegativity state constraint $X(t) \ge 0$ for all t.
The optimal control remains unchanged for such an equivalent statement of the problem.
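The payoff for the dynamic optimization problem in the infinite horizon is $J : \mathbb{R}_+ \times \mathcal{S} \to \mathbb{R}$,
$$J(x_0, S) = \sum_{t=0}^{\infty} \beta^{t} P(S(X(t))), \tag{4}$$
with a discount factor $\beta = \frac{1}{1+\xi} - \epsilon$ for some $0 < \epsilon < \frac{1}{1+\xi}$. Whenever we want to emphasize the dependence on the initial condition, we write J(x, S) for arbitrary x ≥ 0.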
In the finite horizon truncations with the horizon N, the payoff is $J_N : \mathbb{R}_+ \times \{0, \dots, N\} \times \mathcal{S}_N \to \mathbb{R}$ (the second argument corresponds to the time moment from which we start calculating the payoff),
$$J_N(x_0, t_0, S) = \sum_{t=t_0}^{N} \beta^{t-t_0} P(S(X(t), t)), \tag{5}$$
for X being the solution of (2) (rewritten in the form suitable for $S \in \mathcal{S}_N$) with the initial condition $X(t_0) = x_0$. Again, if we want to emphasize the dependence on the initial condition, we write $J_N(x, t, S)$ for arbitrary x ≥ 0 and t ∈ {0, . . . , N + 1}.
Example 1 (An example of application) Consider a problem of extraction of a renewable resource, e.g. a fishery, with quadratic cost $c_1 s + c_2 s^2$, with the controller being a monopolist selling the catch at a market with linear inverse demand, i.e., the price per unit if the firm catches and sells s is $A_1 - A_2 s$. The state variable denotes the biomass of fish calculated at the beginning of the year (just after spawning; the year is calculated from the beginning of the spawning time), while the control parameter denotes the catch during the whole year outside a closed season around the spawning time. The nonnegativity constraint is obvious both for the state variable (the biomass of fish) and the control parameter (the catch). The state dependent constraint on the control translates as "the firm cannot fish out more than is available, that is, all the fish including the current year's offspring". In the presence of a closed season, using discrete time is well justified.
This example is a modification of the dynamic game problem with at least two players studied in Wiszniewska-Matyszkiel (2018, 2019) for a specific value of B and for $\epsilon = 0$, and of the related social optimum problem.
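To connect Example 1 with the abstract current payoff, the sketch below expands the monopolist's one-period profit and reads off the coefficients of a payoff of the form $P(s) = As - \frac{B}{2}s^2$. That normalisation of P, the function names and the numerical values are assumptions made for illustration only.

```python
def profit(s, A1, A2, c1, c2):
    """One-period profit: revenue at price A1 - A2*s minus cost c1*s + c2*s**2."""
    return (A1 - A2 * s) * s - (c1 * s + c2 * s * s)


def payoff_coefficients(A1, A2, c1, c2):
    """Coefficients (A, B) such that profit(s) = A*s - (B/2)*s**2 (assumed form of P)."""
    return A1 - c1, 2.0 * (A2 + c2)


# Illustrative (hypothetical) parameter values.
A, B = payoff_coefficients(A1=10.0, A2=0.5, c1=2.0, c2=0.5)
s_myopic = A / B  # unconstrained maximiser of the one-period profit
```

Under this reading, A > 0 requires the demand intercept to exceed the linear cost coefficient, and the myopic catch A/B is the quantity that the state dependent constraint may prevent the firm from reaching when the stock is low.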

The value function, optimal control and the sufficient condition for the infinite horizon
Theorem 1 A sufficient condition for $V : \mathbb{R}_+ \to \mathbb{R}$ to be the value function and $S : \mathbb{R}_+ \to \mathbb{R}_+$ to be an optimal control of the dynamic optimization problem given by Eqs. (1)-(4) consists of the Bellman equation (6) and the terminal condition given by Eq. (7), together with the fact that S(x) maximizes the right hand side of the Bellman equation, as stated in Eq. (8):
$$V(x) = \sup_{s \in [0, (1+\xi)x]} \left[ P(s) + \beta V(\phi(x, s)) \right], \tag{6}$$
with the terminal condition, for every admissible trajectory X,
$$\lim_{t \to \infty} \beta^{t} V(X(t)) = 0, \tag{7}$$
while the optimal control S is any profile which fulfils
$$S(x) \in \operatorname*{Argmax}_{s \in [0, (1+\xi)x]} \left[ P(s) + \beta V(\phi(x, s)) \right]. \tag{8}$$
Under (6) and (7), V is the value function of the dynamic optimization problem considered, the maximal sum of payoffs is equal to $V(x_0)$, and the optimal profile of strategies can be found by inclusion (8) for every x (for simpler reference, from now on, we call it the Bellman inclusion). Generally, (7) is not a necessary condition [see e.g. an example from Wiszniewska-Matyszkiel (2011)] and the Bellman equation (6)
This version of the sufficient condition is often related to Stokey et al. (1989), Theorem 4.3. However, the problem considered by Stokey, Lucas and Prescott has a slightly different formulation. Theorem 1 is also a consequence of a weaker sufficient condition, the main result of Wiszniewska-Matyszkiel (2011).
In our initial optimization problem, the Bellman equation (6) becomes
$$V(x) = \sup_{s \in [0, (1+\xi)x]} \left[ As - \frac{B}{2}s^{2} + \beta V((1+\xi)x - s) \right], \tag{9}$$
while the Bellman inclusion (8) becomes
$$S(x) \in \operatorname*{Argmax}_{s \in [0, (1+\xi)x]} \left[ As - \frac{B}{2}s^{2} + \beta V((1+\xi)x - s) \right]. \tag{10}$$

The value function, optimal control and the necessary and sufficient condition for the finite horizon N
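The value function $V_N : \mathbb{R}_+ \times \{0, \dots, N+1\} \to \mathbb{R}$ obviously depends also directly on time; it is defined by
$$V_N(x, t) = \sup_{S \in \mathcal{S}_N} J_N(x, t, S),$$
while the optimal control $S_N : \mathbb{R}_+ \times \{0, \dots, N\} \to \mathbb{R}_+$ is any admissible control at which this supremum is attained. The sufficient condition given in Theorem 1 changes slightly: the Bellman equation (6) and inclusion (8) change to
$$V_N(x, t) = \sup_{s \in [0, (1+\xi)x]} \left[ P(s) + \beta V_N(\phi(x, s), t+1) \right], \tag{11}$$
$$S_N(x, t) \in \operatorname*{Argmax}_{s \in [0, (1+\xi)x]} \left[ P(s) + \beta V_N(\phi(x, s), t+1) \right], \tag{12}$$
while the terminal condition changes to
$$V_N(x, N+1) = 0. \tag{13}$$
Remark 2 Equations (11)-(13) are not only sufficient but also necessary for $V_N$ to be the value function and $S_N$ to be the optimal control in the finite truncations of the initial problem.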
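The finite horizon conditions can be turned into a brute force numerical check by backward induction on a grid. The sketch below is only a rough approximation under stated assumptions: the payoff is taken as $P(s) = As - \frac{B}{2}s^2$, the dynamics as $\phi(x,s) = (1+\xi)x - s$, states above the hypothetical bound `x_max` are clamped, and the grid sizes and parameter values are arbitrary. It is not the analytic solution method of the paper.

```python
def backward_induction(A, B, xi, beta, N, x_max, n_x=201, n_s=201):
    """Approximate V_N(x, 0) and S_N(x, 0) on a uniform state grid."""
    xs = [x_max * i / (n_x - 1) for i in range(n_x)]
    step = xs[1] - xs[0]

    def interp(values, x):
        # Piecewise linear interpolation; states above x_max are clamped.
        pos = min(x, x_max) / step
        i = min(int(pos), n_x - 2)
        w = pos - i
        return (1.0 - w) * values[i] + w * values[i + 1]

    V_next = [0.0] * n_x  # terminal condition: V_N(x, N + 1) = 0
    policy = [0.0] * n_x
    for _ in range(N + 1):  # stages t = N, N - 1, ..., 0
        V_now = []
        for j, x in enumerate(xs):
            cap = (1.0 + xi) * x  # state dependent upper bound on the control
            best_val, best_s = float("-inf"), 0.0
            for m in range(n_s):
                s = cap * m / (n_s - 1)
                val = A * s - 0.5 * B * s * s + beta * interp(V_next, cap - s)
                if val > best_val:
                    best_val, best_s = val, s
            V_now.append(best_val)
            policy[j] = best_s
        V_next = V_now
    return xs, V_next, policy


# Hypothetical parameters with beta < 1 / (1 + xi).
xs, V0, S0 = backward_induction(A=10.0, B=1.0, xi=0.1, beta=0.8, N=0, x_max=20.0)
_, V1, S1 = backward_induction(A=10.0, B=1.0, xi=0.1, beta=0.8, N=1, x_max=20.0)
```

Clamping the state at `x_max` is harmless only when `x_max` lies well inside the region where the value function is already flat; for longer horizons the bound has to be chosen accordingly.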

The finite horizon solutions versus the infinite horizon solution
Although in finite horizon problems the dynamic programming method based on the Bellman equation determines the optimal solution explicitly by backward induction, this cannot be done in the infinite horizon case with a terminal condition at infinity.
So, we are going to solve it in a different way: as a limit of finite horizon solutions. However, it has to be emphasized that the convergence of the finite horizon feedback optimal controls for the truncated problems does not imply that the limit is the optimal control for the infinite horizon problem, and the sufficient condition [a simple one from Stokey et al. (1989) or a more general one from Wiszniewska-Matyszkiel (2011)] has to be checked for the limit [see counterexamples: Ex. 7 and 8 from Wiszniewska-Matyszkiel and Singh (2018)]. Moreover, generally, even the value functions of the truncated problems may converge to a limit which is not the infinite horizon value function [Ex. 5 from Wiszniewska-Matyszkiel and Singh (2018)]. So, we use the limits of the finite horizon value functions and optimal controls only as candidates, for which we check the sufficient condition. We also show that the frequently used undetermined coefficient (Ansatz) method is not of much use in the case of our class of problems.
Our method is based on finding a candidate for the value function and the optimal control as limits of the values, at the initial time moment, of the value functions and optimal controls, respectively, of the finite horizon truncations of the problem, and on verifying whether the limits are indeed the value function and the optimal control, using the sufficient condition given by Eqs. (6)-(8).

Solution of the dynamic optimization problem for finite horizon truncations of the initial problem
Before we solve the more interesting infinite horizon problem, we are first going to solve its finite horizon truncations with the horizon N . We can prove the following form of the value function and optimal control.
Theorem 2 The following function $V_N : \mathbb{R}_+ \times \{0, \dots, N+1\} \to \mathbb{R}_+$ is the value function, while the following function $S_N : \mathbb{R}_+ \times \{0, \dots, N\} \to \mathbb{R}_+$ is the optimal control for the truncation of the initial problem with time horizon N.
The number i corresponds to the time to resource exhaustion for x in the interval $(\hat{x}_{i-1}, \hat{x}_i)$: for $\hat{x}_{i-1} < x < \hat{x}_i$, the resource will be depleted in i stages. So, $V_i$ and $S_i$ correspond to time to resource exhaustion i + 1, $\hat{x}_i$ is the highest state such that if $x_0 = \hat{x}_i$, then the optimal trajectory fulfils X(i) = 0, while $\hat{y}_N$ is the lowest state such that if $x_0 = \hat{y}_N$, then $\hat{s}$ is available in each of N + 1 stages. Equivalently, $\hat{x}_i$ is the lowest state at which a control S with $S(X(0), 0) = a_i X(0)$
Equivalently, $a_i$, $b_i$ and $\hat{x}_{i+1}$ can be rewritten as
We present the results of Theorem 2, the optimal control $S_N$ and the value function $V_N$, in Fig. 1.
To show how the optimal control and the value function at the initial time change as the time horizon increases, we illustrate them for the same values of parameters A = 1000, B = 1, $\epsilon = 0.01$ and ξ = 0.02 and for four values of N = 0, 1, 10, 100 in Fig. 2 (the optimal control) and Fig. 3 (the value function). Small diamonds correspond to subsequent $\hat{x}_i$ and $\hat{y}_i$; $\hat{y}_{100}$ is out of the range. The non-differentiability of $S_N$ is clearly visible, and so is the differentiability of $V_N$. As we can see, the optimal control is nonincreasing, while the value function is non-decreasing in N.
To prove Theorem 2, we need the following sequence of lemmata. The parts of the proofs which are less interesting, although elaborate, are moved to the Appendix.

Lemma 1 (a) $\phi(x, S_N(x))$ is non-decreasing in x for all N and strictly increasing in x for N ≥ 1.

Lemma 2 For all N, $V_N$ is concave in x, it is strictly concave for $x < \hat{y}_N$ and it is differentiable for all x.
Proof This proof is based on basic properties of strictly concave functions and their derivatives or superdifferentials [an analogue of properties of convex functions and their subdifferentials, see e.g., Rockafellar (2015)].
Since by Lemma 4 from the Appendix $H_i < 0$, the functions $V_i$ are strictly concave and differentiable. Note that $U_i$ is constant for every i, so it is also concave.
Since $V_i$ is strictly concave and, by Lemma 9(a) from the Appendix, $V_N$ is continuous, $\frac{\partial V_i}{\partial x}$ is strictly decreasing, and since by Lemma 10 from the Appendix, $V_N(\cdot, 0)$ is concave on the whole domain.
(9) is attained.

(b) If for some $s \in [0, (1+\xi)x]$, $\frac{\partial \left( P(s) + \beta V_N(\phi(x, s), 0) \right)}{\partial s} = 0$, then s is the unique optimum of the right hand side of the Bellman equation.
Proof (a) Immediately by Lemma 2 and boundedness of P and V N from above.
(b) If a point fulfils the first order condition for optimization of a strictly concave function then it is the unique optimum.

Proof of Theorem 2
We prove it inductively in two ways: by forward induction with respect to the horizon N and within a fixed horizon N , by backward induction corresponding to the dynamic programming techniques, which we rewrite to forward induction with respect to time to resource exhaustion.
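For N = 0 it can be easily verified that the value function
$$V_0(x, 0) = \begin{cases} P((1+\xi)x) & \text{for } x < \hat{y}_0, \\ P(\hat{s}) & \text{for } x \ge \hat{y}_0, \end{cases}$$
fulfils the Bellman equation (11) and there is a unique optimal control which fulfils the Bellman inclusion (12). Assume that the value function and the optimal control are given by Eqs. (14) and (15) for N and prove it for N + 1. The Bellman equation (11) has the form
$$V_{N+1}(x, t) = \sup_{s \in [0, (1+\xi)x]} \left[ P(s) + \beta V_{N+1}(\phi(x, s), t+1) \right], \tag{20}$$
while the Bellman inclusion, a necessary and sufficient condition for a control to be optimal, is
$$S_{N+1}(x, t) \in \operatorname*{Argmax}_{s \in [0, (1+\xi)x]} \left[ P(s) + \beta V_{N+1}(\phi(x, s), t+1) \right]. \tag{21}$$
By the Bellman optimality principle (Bellman 1957), at time t + 1, the solution has to coincide with the optimal solution of the N horizon problem with the state resulting from the first decision. Since the only dependence on time in the functions of the model is through discounting, $V_{N+1}(x, 1) = V_N(x, 0)$ and $S_{N+1}(x, 1) = S_N(x, 0)$. By analogous reasoning, we have $V_{N+1}(x, t+1) = V_N(x, t)$ and $S_{N+1}(x, t+1) = S_N(x, t)$ for all t ≤ N. Thus, we only have to check Eqs. (20) and (21) for t = 0. So, Eqs. (20) and (21) can be rewritten as
$$V_{N+1}(x, 0) = \sup_{s \in [0, (1+\xi)x]} \left[ P(s) + \beta V_N(\phi(x, s), 0) \right], \tag{22}$$
$$S_{N+1}(x, 0) \in \operatorname*{Argmax}_{s \in [0, (1+\xi)x]} \left[ P(s) + \beta V_N(\phi(x, s), 0) \right]. \tag{23}$$
We are going to locate the maximum. It depends on the interval to which x belongs.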
then for this x, Eq. (22) reduces to Eq. (24). So, what remains to be proven is the fact that $S_{k+1}$ is really the maximizer of the r.h.s. of Eq. (24) and that this equation is fulfilled. We do it by induction with respect to k. By substitution, we get it for k = 0 if we use the auxiliary $V_{-1} \equiv 0$. Now we assume that it is fulfilled for k and prove it for k + 1.
The first order condition for s to be optimal is
$$A - Bs - \beta \left( G_k + H_k \left( (1+\xi)x - s \right) \right) = 0.$$
By solving this equation for s, we get the optimal $S_{k+1}(x) = a_{k+1}x + b_{k+1}$ with the constants $a_{k+1} = \frac{\beta H_k (1+\xi)}{\beta H_k - B}$ and $b_{k+1} = \frac{\beta G_k - A}{\beta H_k - B}$. Substituting the value from Eq. (25) into Eq. (24), we obtain $V_{k+1}(x) = K_{k+1} + G_{k+1}x + \frac{H_{k+1}}{2}x^{2}$, with the recurrence equation for the constants as in Eq. (17). So, what remains to be proven are two cases: $x \in [\hat{x}_{N+1}, \hat{y}_{N+1})$ and $x \ge \hat{y}_{N+1}$. In the latter case, obviously, $\phi(x, \hat{s}) \ge \hat{y}_N$, and the Bellman equation (22) is fulfilled with $s = \hat{s}$.
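The induction step above can be sketched numerically. The code below computes $a_{k+1}$ and $b_{k+1}$ from the formulas in the text; the updates for $H_{k+1}$ and $G_{k+1}$ are our own derivation by the envelope theorem for an interior optimum and are an assumption, since Eq. (17) is not reproduced here. The function names and the numbers are hypothetical.

```python
def induction_step(H, G, A, B, beta, xi):
    """Return (a, b, H_next, G_next) for V_k(x) = K + G*x + H*x**2/2."""
    denom = beta * H - B  # strictly negative when H <= 0 < B
    a = beta * H * (1.0 + xi) / denom  # slope of S_{k+1}, as in the text
    b = (beta * G - A) / denom  # intercept of S_{k+1}, as in the text
    # Envelope theorem (assumed derivation, valid while the optimum is interior):
    # V_{k+1}'(x) = (1 + xi) * (A - B * (a*x + b)).
    H_next = -(1.0 + xi) * B * a
    G_next = (1.0 + xi) * (A - B * b)
    return a, b, H_next, G_next


def foc_residual(s, x, H, G, A, B, beta, xi):
    """Derivative with respect to s of P(s) + beta * V_k((1 + xi)*x - s)."""
    y = (1.0 + xi) * x - s
    return A - B * s - beta * (G + H * y)


def value_next(x, H, G, A, B, beta, xi):
    """P(s*) + beta * V_k(y*) with K = 0, evaluated at the stationary point s*."""
    a, b, _, _ = induction_step(H, G, A, B, beta, xi)
    s = a * x + b
    y = (1.0 + xi) * x - s
    return A * s - 0.5 * B * s * s + beta * (G * y + 0.5 * H * y * y)


# Hypothetical numbers, chosen only to exercise the formulas.
params = dict(A=10.0, B=1.0, beta=0.8, xi=0.1)
H_k, G_k = -1.2, 11.0
a, b, H_next, G_next = induction_step(H_k, G_k, **params)
residual = foc_residual(a * 2.0 + b, 2.0, H_k, G_k, **params)
```

Because the next value function is quadratic, a second difference of `value_next` reproduces `H_next` up to rounding, which gives an independent check of the assumed update.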
In the former case we have two sub-cases: , s)), then the reasoning is the same as for  N (φ(x, s)) and it is fulfilled with s =ŝ.

Limit properties of the finite horizon truncations of the problem
In this subsection, we examine the limit properties of the finite horizon truncations of the problem. Let us introduce $\hat{x}$
The first interesting property concerns the limits of $V_N$ and $S_N$, which for $x < \hat{x}$ are attained in finitely many steps.

Proof Immediate
Another interesting issue is the limit of $V_N(x_N, 0)$ and $S_N(x_N, 0)$ for a sequence $x_N \nearrow \hat{x}$. To calculate it, first, we need to check convergence of the sequences $H_N$,

(b) The limit of a i is given by
Proof (a) Consider the recurrence relation for $H_i$ given by (17). By calculating the fixed points, we obtain two values: 0 and $\frac{B(1 - \beta(1+\xi)^{2})}{\beta}$. By Lemma 4, we know that $H_i$ is increasing and bounded from above by 0. So the limit exists and it is non-positive. Consider the following cases.
Proposition 3 Consider $F_i := \frac{G_i}{H_i}$. (a) The limit of $F_i$ is given by $\lim_{i \to \infty} F_i = -\frac{A}{B\xi}$. (b) The limit of $G_i$ is given by

(c) The limit of b i is given by
Proof (a) We calculate the fixed point of $F_i$, which is $-\frac{\hat{s}}{\xi}$. By Lemma 5 from the Appendix, $F_i$ is decreasing. Let us consider any sequence given by Eq. (33) without predetermined initial condition and denote it by $\{f_i\}$. If
(b) It is immediate by Proposition 3(a) and 2(a). (c) Immediate by (b) and Eq. (19).
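The fixed points quoted in the proofs can be checked by iterating the recurrences numerically. In the sketch below, the recurrence for $H_i$, the recurrence for $F_i$ and the initial values $H_0$ and $F_0$ are our own reconstruction (from the first order condition and from the one stage problem in which the whole stock is extracted); Eqs. (17) and (33) themselves are not reproduced here, so all of these are assumptions. The parameter values are the ones used for the figures.

```python
def iterate_limits(A, B, xi, eps, n):
    """Iterate assumed recurrences for H_i and F_i = G_i / H_i for n steps."""
    beta = 1.0 / (1.0 + xi) - eps
    s_hat = A / B  # maximiser of the current payoff
    H = -B * (1.0 + xi) ** 2  # assumed H_0: curvature of P((1 + xi) * x)
    F = -s_hat / (1.0 + xi)  # assumed F_0 = G_0 / H_0
    for _ in range(n):
        H = -B * (1.0 + xi) ** 2 * beta * H / (beta * H - B)
        F = (F - s_hat) / (1.0 + xi)
    return beta, H, F, F * H  # last entry: G_n recovered as F_n * H_n


A, B, xi, eps = 1000.0, 1.0, 0.02, 0.01
beta, H_lim, F_lim, G_lim = iterate_limits(A, B, xi, eps, 5000)

# Nonzero fixed point of the H recurrence and the limit of F_i stated in the text.
H_fixed = B * (1.0 - beta * (1.0 + xi) ** 2) / beta
F_fixed = -A / (B * xi)
```

For these parameter values $\beta(1+\xi)^2 > 1$, so the nonzero fixed point is negative and the iteration settles there; when $\beta(1+\xi)^2 \le 1$ the only non-positive fixed point is 0.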
Proof Immediate by the definition of $\hat{y}_i$, $\hat{x}_i$ given in Eq. (16) and the limits of $a_i$ and $b_i$.

Proposition 5 Consider a sequence $x_N \nearrow \hat{x}$.
Proof We know the limits of the sequences $H_i$, $G_i$, $a_i$ and $b_i$. To prove the result, we need to prove the convergence and to find the limit of $K_i$. By Lemma 9(b), $V_i$ is continuous at $\hat{y}_i$, so $K_i = \hat{k} - \frac{H_i}{2}\hat{y}_i^{2} - G_i \hat{y}_i$. By taking the limit, we obtain that $\lim_{i \to \infty} K_i = \hat{k} - \lim_{i \to \infty}$
Since $\lim_{N \to \infty} x_N = \hat{x}$, there exists an increasing sequence of integers $n_N$ such that for N large enough $\hat{x}_{n_N} \le x_N < \hat{x}$. By monotonicity of the functions $V_i$, N (x, 0). Taking the limit ends the proof for V. The proof for S is analogous. Similarly , with opposite side monotonicity of the derivative resulting from concavity of $V_N$.

The infinite time horizon
In this subsection we return to the infinite horizon problem.
Note that the optimal controls of the truncated problems converge. In such a case, in many papers, the limit is automatically treated as the optimal control for the infinite horizon problem. As we shall prove, the limit in our case is, indeed, the optimal control. To make sure that the limit of finite horizon solutions is really the optimal control, we check the sufficient condition given by Theorem 1.

Theorem 3 The value function is,
for $\hat{k}$, $\hat{x}$ defined by Eq. (26), $\hat{x}_0 = 0$ and $\hat{x}_N$ defined by Eq. (18) and $V_N$ given by Eq. (14), while the optimal control is where $S_N$ is given by Eq. (15).
We present the results of Theorem 3 for the values of parameters A = 1000, B = 1, $\epsilon = 0.01$ and ξ = 0.02 in Fig. 4 (approximation reduced to 1000 intervals below $\hat{x}$). Small diamonds correspond to subsequent $\hat{x}_i$ and $\hat{x}$. The consecutive $\hat{x}_i$ correspond to the consecutive numbers of time moments to depletion of the state variable below $\hat{x}$, while above $\hat{x}$ the resource is not depleted.
Proof By Propositions 4 and 1, Lemma 9 and properties of concave functions [an analogue of properties of convex functions in e.g., Rockafellar (2015)], V is continuous, differentiable and concave for all x. So, the maximization problem on the r.h.s. of the Bellman equation (9) has a unique solution, either given by the zero derivative point or equal to s = (1 + ξ)x. We have checked where the zero derivative point is when $x \in [\hat{x}_N, \hat{x}_{N+1})$, while proving Theorem 2. If $x \ge \hat{x}$, then the optimum is attained at $\hat{s}$. So, V fulfils the Bellman equation (9), while S fulfils the Bellman inclusion (10) with V.
The terminal condition (7) is obviously fulfilled, since V is bounded. So, V is the value function, while S is the optimal control.
So, the optimal control of the infinite horizon problem coincides with the optimal control at the initial time of its truncation with finite horizon N or longer for x ∈ R + \ (x N ,x), while the value function coincides with the value function at the initial time of its truncation with finite horizon N or longer for x ∈ [0,x N ].

The modified problem: introduction of the carrying capacity in ecological applications
As stated in the introduction, we are interested in applications to exploitation of renewable resources. In real-life applications, linear dynamics is reasonable only for low states, and a saturation point should exist. Therefore, we consider a modification of the problem introduced in Sect. 2: we modify the dynamics of the resource by considering Eq. (2) with φ : Ω → R + being any function such that
Theorem 4 Theorems 2 and 3 remain unchanged if we consider the modified dynamics.
Proof Immediately by repeating the procedures used to prove Theorems 2 and 3 with the new φ: since ŝ remains the global unconstrained maximum of the current payoff, its availability does not change and the set [x, +∞) is invariant for the new dynamics given s = ŝ.
The saturation point may be introduced in many different ways. The simplest one is as follows.
Example 2 Consider Example 1 with φ(x, s) = min{(1 + ξ)x − s, Mx} for a constant M ≥ 1. Then we obtain the fishery model with the saturation point Mx. Theorems 2 and 3 hold for this model.
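The saturated dynamics of Example 2 can be illustrated numerically. The following is a minimal sketch under assumptions that are not stated in this section: a quadratic payoff P(s) = As − Bs²/2 (so that the unconstrained maximizer is ŝ = A/B), the abundance level taken as ŝ/ξ (the state at which extracting ŝ leaves the stock unchanged), and a hypothetical value M = 2. The names phi_saturated, simulate, s_hat, x_hat and cap are illustrative only.

```python
def phi_saturated(x, s, xi, cap):
    """Next state: linear growth minus extraction, capped at the saturation point."""
    return min((1.0 + xi) * x - s, cap)


# A, B, xi follow the values used for the figures; M = 2 is a hypothetical choice.
A, B, xi, M = 1000.0, 1.0, 0.02, 2.0

s_hat = A / B        # assumption: P(s) = A*s - B*s**2/2, maximized at A/B
x_hat = s_hat / xi   # assumption: level at which extracting s_hat keeps the stock constant
cap = M * x_hat      # saturation point of Example 2


def simulate(x0, steps):
    """Iterate the saturated dynamics under the constant control s_hat."""
    xs = [x0]
    for _ in range(steps):
        xs.append(phi_saturated(xs[-1], s_hat, xi, cap))
    return xs


path = simulate(60_000.0, 200)
print(min(path) >= x_hat, path[-1] == cap)
```

Starting above the abundance level and always extracting ŝ, the simulated path never falls below that level and eventually stays at the saturation point, which is the invariance property used in the proof of Theorem 4.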
Recall thatx is a state which is so large that the greedy myopic fishing does not reduce the population. Obviously, in present-day real-life situations, such a large biomass of fish is unrealistic. So, if a modification of the state equation above such an unrealistic level, in the general way described by (29), does not influence the optimal behaviour, there is no need to work on a more realistic approximation of the behaviour around the saturation point.

An important methodological issue
This subsection is devoted to a potential trap resulting from using the standard undetermined coefficient/Ansatz method for solving the infinite horizon problem, especially if the solution is not analytic, but only numerical.
Theorem 3 states that the actual value function is piecewise quadratic with infinitely many pieces, while the optimal control is piecewise linear with infinitely many pieces. This fact implies that obtaining such solutions by the Ansatz method is at least difficult, if not impossible. We show that it may also be misleading.
If we try to solve the infinite horizon problem by the undetermined coefficient method, we can find a function V False for which the Bellman equation (9) is fulfilled except on a small interval [0, x min (ε)], with x min (ε) → 0 as ε → 0 + , and a function S False for which the Bellman inclusion (10) with V False is fulfilled everywhere (for details see Proposition 6).
So, for arbitrarily small η > 0, there exists an ε > 0 such that in the problem with this ε, the sufficient condition is fulfilled except on an interval of length less than η.
The consequence of this error on such a small interval is that the value function and the optimal control are incorrectly calculated on the whole interval (0,x).

Proposition 6 Consider the function
Bξ ((1+ξ) −1) and x min =x (1 + ξ) and the function
The functions V False and S False have been derived in the way in which the undetermined coefficient method is commonly used when constraints appear, which we reflect in the proof.
Assume that whether the Bellman equation is fulfilled is checked only numerically, and that ε is small enough. Then the length of the interval on which the Bellman equation does not hold may be below the level of accuracy of the floating-point arithmetic used. The remaining two parts of the sufficient condition do hold. So, in a class of problems similar to ours (obviously, 0 is well represented in each arithmetic, but a slight modification of the model may result in the critical point being nonzero), the sufficient condition may be treated as numerically proven.
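The numerical side of this trap can be illustrated with a toy check. The sketch below does not use the model of this paper: residual is a hypothetical stand-in for the error in a sufficient condition, equal to 1 on a short interval and 0 elsewhere, and the grid, the interval position and its lengths are arbitrary illustrative choices. It assumes Python 3.9 or later for math.ulp.

```python
import math


def residual(x, left, width):
    """Hypothetical residual: 1 on [left, left + width), 0 elsewhere."""
    return 1.0 if left <= x < left + width else 0.0


def passes_on_grid(lo, hi, n, left, width, tol=1e-9):
    """Check |residual| <= tol at the n + 1 nodes of a uniform grid on [lo, hi]."""
    step = (hi - lo) / n
    return all(abs(residual(lo + i * step, left, width)) <= tol
               for i in range(n + 1))


# (a) A violation interval of length 1e-6 starting at 0.3 lies strictly between
#     the grid nodes 0, 50, 100, ..., so the grid check reports no violation.
grid_ok = passes_on_grid(0.0, 50_000.0, 1000, left=0.3, width=1e-6)
violated_inside = residual(0.3, 0.3, 1e-6) > 0.0

# (b) Near 50000 the spacing of doubles exceeds 1e-13, so the right end of an
#     interval of that length rounds back to its left end and the computed
#     interval is empty: the violation cannot be detected at any double.
spacing = math.ulp(50_000.0)
interval_is_empty = (50_000.0 + 1e-13 == 50_000.0)

print(grid_ok, violated_inside, spacing, interval_is_empty)
```

In case (a) a finer or adaptive grid would reveal the violation, while in case (b) no refinement in the same arithmetic can, which corresponds to the situation described above.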
We return to the reason for this trap after the proof. Before the proof, we compare graphically the actual optimal control and value function for the infinite horizon (multicolour thin transparent line) to S False and V False (red thick line) in Fig. 5 and Fig. 6, respectively. The graphs are drawn for the values A = 1000, B = 1, ε = 0.01 and ξ = 0.02. Small diamonds correspond to subsequent Note that the point x min (ε), corresponding to the interval [0, x min (ε)] on which the Bellman equation is not fulfilled, is so small that in the graph it is almost indistinguishable from 0 (the first diamond on the red thick line corresponds to x min (ε)). Nevertheless, there is a substantial difference between the actual optimal control and value function for the infinite horizon and S False and V False , respectively, and there is a perceivable difference on the whole interval [0,x).
Next, we proceed to the proof of Proposition 6. Generally, it is enough to substitute the functions S False and V False into Eqs. (6)-(8) and meticulously check that they hold as stated in Proposition 6: Eq. (6) except on [0, x min (ε)), the remaining ones everywhere.
Instead, we derive those functions to illustrate an analogue of the process of looking for the optimal control by the undetermined coefficient method for a problem with constraints. This proof follows similar lines as the proof of Theorem 2 from Singh  (9) is fulfilled for x ≥ x min (ε) only. The terminal condition (7) cannot be neglected in this proof, since, e.g., the dynamic optimization problem from Singh and Wiszniewska-Matyszkiel (2018) has two solutions and the "most obvious" quadratic solution is not the value function, which is another potential trap that can appear while solving a linear quadratic problem with constraints by the undetermined coefficient method.

Proof of Proposition 6
Using the Ansatz method, let us assume that the value function is of quadratic form:V (x) = k + gx + hx^2/2. We look for a solution of the Bellman equation (9) in this class of functions.
Afterwards, we find s maximizing the right hand side of the Bellman equation (9) over the set of available decisions.
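A worked version of this maximization step may help. The sketch below assumes, beyond what is stated in this section, a quadratic payoff P(s) = As − Bs²/2 with A, B > 0 and the linear dynamics φ(x, s) = (1 + ξ)x − s; for other payoffs the formula changes accordingly.

```latex
\begin{align*}
0 &= \frac{\partial}{\partial s}\Bigl[P(s) + \beta V\bigl((1+\xi)x - s\bigr)\Bigr]
   = A - Bs - \beta\bigl(g + h\,((1+\xi)x - s)\bigr), \\
s &= \frac{A - \beta g - \beta h\,(1+\xi)\,x}{B - \beta h}.
\end{align*}
```

For h ≤ 0 the denominator B − βh is positive, so under these assumptions the zero-derivative point is well defined for every x.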
We check the first order condition for the internal s from the Bellman inclusion (10) and get the value of s as follows
Next, we substitute the value of s from Eq. (30) into the Bellman equation (9), which allows us to calculate the constants for which this equation is fulfilled, assuming that the supremum is attained at the zero-derivative point. In this way, we obtain three sets of values of unknowns as follows.
Nevertheless, since h ≤ 0 for all such sets of constants, s defined by Eq. (30), if s ∈ [0, (1 + ξ)x], is the global maximizer and it is unique. We consider the following cases.
Case 1. The values of unknowns k, g and h in V are as in (31a), which yieldsV 1 (x) := The candidate for the social optimum in this case is equal toS 1 ( otherwise it exceeds (1 + ξ)x. So, we changeS 1 to the exceeded limit (1 + ξ)x on [0, x min (ε)] and obtainS corr 1 (x). Since the maximized function is strictly concave,S corr 1 (x) defines the unique maximizer for case 1.
However, the functionV 1 (x) does not fulfil the terminal condition (7), since lim t→+∞V1 (X 0 (t))β t = −∞ for X 0 being the trajectory corresponding to the profile S ≡ 0. So we have to continue looking for solutions.
Here, we want to note that in the normal procedure of looking for a value function, once a solution of the Bellman equation has been found, the necessary terminal condition has to be checked. A part of it has the form "limsup t→+∞V 1 (X (t))β t < 0 implies that J (x 0 , t 0 , S) = −∞ for every S for which X is the corresponding trajectory (Theorem 4(b) of Wiszniewska-Matyszkiel and Singh (2018))". In our case it is not fulfilled.
Case 2. The values of unknowns k, g and h are as in (31b), which yields: Hence, the terminal condition (7) is obviously fulfilled, sinceV 2 is constant. The Bellman equation (9) has the formk = sup s∈ [0,(1+ξ)x] P(s) + βk. Therefore, the candidate for solution of the Bellman inclusion is independent of x and equal toŝ.
Since for x <x,ŝ > (1 + ξ)x, in this case the Bellman equation is not fulfilled on a large interval [0,x).
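The constant in Case 2 can be made explicit. As a sketch, at any state where ŝ belongs to [0, (1 + ξ)x] the supremum equals P(ŝ), so the reduced Bellman equation becomes a fixed-point equation for the constant, written k below. The closed form on the right additionally assumes the quadratic payoff P(s) = As − Bs²/2 and 0 < β < 1, neither of which is restated in this section.

```latex
\[
k = P(\hat{s}) + \beta k
\quad\Longrightarrow\quad
k = \frac{P(\hat{s})}{1-\beta} = \frac{A^{2}}{2B\,(1-\beta)} .
\]
```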
Case 3. Consider a combination of case 1 and case 2.
Let us try the continuous, piecewise defined function with two piecesV 1 andV 2 ; it is unique and it equals V False .
First, note that V False is not only continuous, but also differentiable, sinceV 1 (x) = V 2 (x), and concave.
The terminal condition given by Eq. (7) is obviously fulfilled, since V False is bounded. The corresponding zero-derivative point is We correctS 3 on the interval [0, x min (ε)] by the exceeded limit (1 + ξ)x and we obtain S False as the unique solution of the Bellman inclusion with V False . So, what remains is to check that the Bellman equation is really fulfilled on [x min (ε), +∞) and the Bellman inclusion on the whole R + by the piecewise defined functions.
We denote the set of s for which φ(x, s) ≤x by S I , while the set of the remaining s by S II . S I is always nonempty.
If for some x, S II is empty, which may hold only for x ≤x, then for this x, the Bellman equation (9) reduces toV 1 (x) = sup s∈ [0,(1+ξ)x] P(s)+βV 1 (φ(x, s)), and the supremum in this case is attained atS corr 1 (x), so the Bellman inclusion (10) is fulfilled, and so is the Bellman equation on [x min (ε), +∞) (which we have already solved during calculation of coefficients in case 1).
So, let us consider the case when both S I and S II are nonempty. This situation can be decomposed into two cases.
(I) If x ≤x, note that sup s P(s) + βV 1 (φ(x, s)) is attained atS corr 1 , which is at least ξ x, so it obviously belongs to S I . Since the right hand side of the Bellman equation is strictly concave,S corr 1 defines the unique maximizer. The remaining calculations in this case reduce to case 1.
(II) If x >x, then the Bellman equation (9) can be rewritten as V 2 (x) = max{sup s∈S I P(s) + βV 1 (φ(x, s)), sup s∈S II P(s) + βV 2 (φ(x, s))}. First, let us consider optimization over S II . As we have checked in Case 2, P(s)+βV 2 (φ(x, s)) attains maximum atŝ, which is within the constraints, since x >x. Since the right hand side of the Bellman equation is strictly concave,ŝ is its unique maximizer. The remaining calculations in this case reduce to case 2.
In the above proof we have followed, as closely as is applicable in this case, the typical way of solving an optimal control problem with constraints by the Ansatz method.
We can add that in such proofs sometimes (e.g. Fershtman and Kamien 1987 and Wiszniewska-Matyszkiel et al. 2015), at the last stage of the proof, the value function on the set where the constraint is exceeded by the zero-derivative solution is replaced by some obvious value resulting from the guess implied by the fact that the control is made equal to the constraint. In Fershtman and Kamien (1987) and Wiszniewska-Matyszkiel et al. (2015) it was related to the nonnegativity constraint, so an obvious guess was to wait for the state variable to increase to the level at which the zero-derivative point is nonnegative (which resulted in a non-quadratic value function; in those papers the procedure ended successfully). In our case, such an obvious guess related to the control value (1 + ξ)x on [0, x min (ε)) is to correct V False on this set to P((1 + ξ)x), because nothing remains for the next stage. Since it is less than V False , the Bellman equation as well as the Bellman inclusion are still fulfilled on [x min (ε), ∞).
An obvious question is why an error in the Bellman equation on only a small set resulted in a substantial error in the value function and optimal control on a large set [0,x) and, consequently, what kind of errors should be avoided.
What is specific to our case is that the error in the Bellman equation appears in an arbitrarily small neighbourhood of a stable steady state (if we consider the dynamics corresponding to the optimal control), and the resulting error in the value function is propagated over the whole basin of attraction of this steady state and influences the optimal control on it.
Thus, the Bellman equation around a stable steady state has to be checked especially carefully.

Conclusions and further research
In this paper, we have analysed a wide class of discrete time discounted dynamic optimization problems with strictly concave quadratic current payoffs and linear state dependent constraints on the control parameter, as well as non-negativity constraints on the one-dimensional state variable and control. This model suits well economic problems like extraction of renewable resources (e.g. fishery harvesting). The class of sub-problems considered encompasses a linear quadratic optimal control problem as well as models with a saturation level of the state variable.
The optimal control we have obtained is piecewise linear, with infinitely many pieces in the infinite time horizon case. Consecutive pieces correspond to consecutive numbers of time moments to depletion of the state variable; only on the last interval is the state variable not depleted.
From a theoretical point of view, we derive a solution for a class of linear quadratic dynamic optimization problems, both the infinite time horizon problem and its finite horizon truncations, with applications in economics and environmental economics, and for modifications of this problem with nonlinear dynamics reflecting the existence of a saturation point of the state variable. The proof, besides inductive methods and calculus, was nontrivially based on properties of concave functions and their derivatives. The standard undetermined coefficient method turned out to be ineffective for this class of problems.
This paper has four obvious extensions. The first one is related to examining other relations between the discount rate and the rate of growth of the state variable, i.e. the case when ε is negative, related to a high rate of growth of the resource compared to the interest rate in the economic interpretation. Currently, we can only state that the solution is not, as often suggested in elementary solutions, "wait until the last stage before the resource grows to the abundance level, to have exactly this value of the state variable from the next stage on".
The second one is an extension of the problem to more dimensions (in the economic fishery interpretation, this may be interpreted as introducing more than one species of interacting fish or nonuniform spatial allocation). This may result in essential difficulties related to the technique of the proof: proving that all the finite and infinite horizon value functions are concave and strictly concave below a certain level (related to abundance of fish) involves complicated, highly nonlinear formulae for the coefficients.
The third one is returning to examining this problem as an infinite horizon dynamic game, currently studied only with vestigial results, with complete results either in the case of a continuum of players in Singh and Wiszniewska-Matyszkiel (2018) or for two stage truncations for ε = 0 in Singh and Wiszniewska-Matyszkiel (2019).
(c) For i = 0, we prove (34) by just substituting and simplifying. Next, we consider i > 0 and in the part of formula (19) , we substitute all the four constants by their values from the recursive formulae (17). We do it only once (we do not substitute for G i and H i which appear after substitution for G i+1 and H i+1 ). We simplify the formula to get We rewrite Eq. (19) to get G i−1 =x i (H i − H i−1 ) + G i and we substitute it into (35). After simplification, we get Eq. (34).

Lemma 7 (a) For all i,x i fulfilsx
or, equivalently,x Proof (a) It can be easily verified thatx 1 =x 0 +b 1 (1+ξ)−a 1 To prove for another i, we rewrite the right hand side of Eq. (36) and we get (1+ξ) . This is the right hand side of Eq. (34), so it equalŝ x i+1 . (b) It is immediate thatx 0 <ŷ 0 . Next, we assume thatx i <ŷ i .