On the solution stability of parabolic optimal control problems

The paper investigates stability properties of solutions of optimal control problems constrained by semilinear parabolic partial differential equations. Hölder or Lipschitz dependence of the optimal solution on perturbations is obtained for problems in which the equation and the objective functional are affine with respect to the control. The perturbations may appear in both the equation and the objective functional and may depend nonlinearly on the state and control variables. The main results are based on an extension of recently introduced assumptions on the joint growth of the first and second variation of the objective functional. The stability of the optimal solution is obtained as a consequence of a more general result obtained in the paper: the metric subregularity of the mapping associated with the system of first order necessary optimality conditions. This property also enables error estimates for approximation methods. A Lipschitz estimate for the dependence of the optimal control on the Tikhonov regularization parameter is obtained as a by-product.


Introduction
Let $\Omega \subset \mathbb{R}^n$, $1 \le n \le 3$, be a bounded domain with Lipschitz boundary $\partial\Omega$. For a finite $T > 0$, denote by $Q := \Omega \times (0,T)$ the space-time cylinder and by $\Sigma := \partial\Omega \times (0,T)$ its lateral boundary. In the present paper, we investigate the following optimal control problem: (P) $\min_{u \in U} J(u) := \int_Q L(x,t,y(x,t),u(x,t))\,dx\,dt$, (1.1) subject to $\frac{\partial y}{\partial t} + Ay + f(\cdot,y) = u$ in $Q$, $y = 0$ on $\Sigma$, $y(\cdot,0) = y_0$ on $\Omega$. (1.2) Denote by $y_u$ the unique solution to the semilinear parabolic equation (1.2) that corresponds to the control $u \in L^r(Q)$, where $r$ is a fixed number satisfying the inequality $r > 1 + \frac{n}{2}$. For functions $u_a, u_b \in L^\infty(Q)$ such that $u_a < u_b$ a.e. in $Q$, the set of feasible controls is given by $U := \{ u \in L^\infty(Q) : u_a \le u \le u_b \text{ a.e. in } Q \}$. (1.3) The objective integrand in (1.1) is defined as $L(x,t,y,u) := L_0(x,t,y) + (my + g)u$, (1.4) where $m$ is a number, $g$ is a function in $L^\infty(Q)$ and $L_0$ satisfies an appropriate smoothness condition (see Assumption 2 in Subsection 1.1). The goal of the present paper is to obtain stability results for the optimal solution of problem (1.1)-(1.3). The notion of "stability" we focus on is as follows. Given a reference optimal control $\bar u$ and the corresponding solution $y_{\bar u}$, the goal is to estimate the distance (call it $\Delta$) from the optimal solutions $(u, y_u)$ of a disturbed version of problem (1.1)-(1.3) to the pair $(\bar u, y_{\bar u})$, in terms of the size of the perturbations (call it $\delta$). The perturbations may enter either in the objective integrand or in the state equation, and the meaning of "distance" and "size" in the previous sentence will be clarified in the sequel in terms of appropriate norms. If an estimate $\Delta \le \mathrm{const}\cdot\delta^{\theta}$ holds with $\theta \in (0,1)$, we talk about Hölder stability, while in the case $\theta = 1$ we have Lipschitz stability.
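As a small illustration of the distinction between Hölder and Lipschitz stability, the following sketch estimates an observed exponent $\theta$ from sampled pairs $(\delta, \Delta)$ by fitting a line in log-log coordinates. The function name `estimate_holder_exponent` and the synthetic data are hypothetical and serve only as an example; the fit yields an empirical rate, not a proof of an estimate of the form $\Delta \le \mathrm{const}\cdot\delta^{\theta}$.

```python
import math


def estimate_holder_exponent(deltas, distances):
    """Least-squares slope of log(distance) against log(delta).

    deltas:    positive perturbation sizes (hypothetical samples)
    distances: positive distances of perturbed solutions to the reference one
    """
    if len(deltas) != len(distances) or len(deltas) < 2:
        raise ValueError("need at least two (delta, distance) pairs")
    xs = [math.log(d) for d in deltas]
    ys = [math.log(e) for e in distances]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Synthetic data obeying distance = 3 * delta**0.5 (a Hoelder-type law).
deltas = [10.0 ** (-k) for k in range(1, 6)]
distances = [3.0 * d ** 0.5 for d in deltas]
theta = estimate_holder_exponent(deltas, distances)
```

A fitted slope close to 1 would be consistent with Lipschitz behaviour on the sampled range, while a slope in $(0,1)$ points to a Hölder-type dependence.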
A powerful technique for establishing stability properties of the solutions of optimization problems is based on regularity properties of the system of first order necessary optimality conditions (see e.g. [18]). In the case of problem (1.1)-(1.3), these are represented by a differential variational inequality (see e.g. [16,24]), consisting of two parabolic equations (the primal equation (1.2) and the corresponding adjoint equation) and one variational inequality representing the condition for minimization of the Hamiltonian associated with the problem. The Lipschitz or Hölder stability of the solution of problem (1.1)-(1.3) is then a consequence of the property of metric subregularity (see [15,18]) of the mapping defining this differential variational inequality. An advantage of this approach is that it unifies in a compact way the study of stability of optimal solutions under a variety of perturbations (linear or nonlinear). Therefore, the main result in the present paper focuses on conditions for metric subregularity of the mapping associated with the first order optimality conditions for problem (1.1)-(1.3). These conditions are related to appropriate second order sufficient optimality conditions, which are revisited and extended in the paper. Several results for stability of the solutions are obtained as a consequence.
The commonly used second order sufficient optimality conditions for ODE or PDE optimal control problems involve a coercivity condition, requiring strong positive definiteness of the objective functional as a function of the control in a Hilbert space. We stress that problem (1.1)-(1.3) is affine with respect to the control variable and such a coercivity condition is not fulfilled. The theory of sufficient optimality conditions and the regularity theory for affine optimal control of ODE systems have been developed in the past decade, see [23] and the bibliography therein. Sufficient conditions for weak or strong local optimality for optimal control problems with constraints given by elliptic or parabolic equations are developed in [2,3,4,8,10,12,17]. A detailed discussion thereof is provided in Section 2.1. In contrast with the elliptic setting, there are only a few stability results for semilinear parabolic optimal control problems. Progress in this regard for a tracking type objective functional was made for instance in [9,10], where stability with respect to perturbations in the objective functional was studied, and in [11], where stability with respect to perturbations in the initial data was investigated. We mention that for a linear state equation and a tracking type objective functional, Lipschitz estimates were obtained in [29] under an additional assumption on the structure of the optimal control. A more comprehensive discussion of the sufficiency theory and stability can be found in Section 2.
The main novelty in the present paper is the study of the subregularity property of the optimality mapping associated with problem (1.1)-(1.3). In contrast with the case of coercive problems, our assumptions in the affine case jointly involve the first and the second order variations of the objective functional with respect to the control. These assumptions are weaker than the ones in the existing literature in the context of sufficient optimality conditions; however, they are strong enough to imply metric subregularity of the optimality mapping. The subregularity result is used to obtain new Hölder and Lipschitz estimates for the solution of the considered optimal control problem. An error estimate for the Tikhonov regularization is obtained as a consequence.
The subregularity result obtained here provides a basis for convergence and error analysis of discretization methods applied to problem (1.1)-(1.3). The point is that numerical solutions of the discretized versions of the problem typically satisfy the first order optimality conditions for the discretized problem only approximately and, after appropriate embedding in the continuous setting (1.1)-(1.3), satisfy the optimality conditions for the latter problem with a residual depending on the approximation and the discretization error. Then the subregularity property of the optimality mapping associated with (1.1)-(1.3) provides an error estimate. Notice that the (Lipschitz) stability of the solution alone is not enough for such a conclusion, and this is an important motivation for studying subregularity of the optimality mapping rather than only stability of the solutions. However, we do not go into this subject, postponing it to a later paper based on the present one.
The paper is organized as follows. The analysis of the optimal control problem (1.1)-(1.3) begins in Section 2. We recall the state of the art regarding second order sufficient conditions for weak and strong (local) optimality, as well as known sufficient conditions for stability of optimal controls and states under perturbations. In Section 3 we formulate and discuss the assumptions on which our further analysis of sufficiency and stability is based. The strong subregularity of the optimality mapping is proved in Section 4. In Section 5, we obtain stability results for the optimal control problem under non-linear perturbations, postponing some technicalities to Appendix A. Finally, we support the theoretical results with some examples.

Preliminaries
We begin with some basic notations and definitions. Given a non-empty, bounded and Lebesgue measurable set $X \subset \mathbb{R}^n$, we denote by $L^p(X)$, $1 \le p \le \infty$, the Banach spaces of all measurable functions $f : X \to \mathbb{R}$ for which the usual norm $\|f\|_{L^p(X)}$ is finite. For a bounded Lipschitz domain $X \subset \mathbb{R}^n$ (that is, a set with Lipschitz boundary), the Sobolev space $H_0^1(X)$ consists of functions that vanish on the boundary (in the trace sense) and that have weak first order derivatives in $L^2(X)$. The space $H_0^1(X)$ is equipped with its usual norm denoted by $\|\cdot\|_{H_0^1(X)}$. By $H^{-1}(X)$ we denote the topological dual of $H_0^1(X)$, equipped with the standard norm $\|\cdot\|_{H^{-1}(X)}$. Given a real Banach space $Z$, the space $L^p(0,T;Z)$ consists of all strongly measurable functions $y : [0,T] \to Z$ that satisfy $\|y\|_{L^p(0,T;Z)} := \bigl(\int_0^T \|y(t)\|_Z^p\,dt\bigr)^{1/p} < \infty$ or, for $p = \infty$, $\|y\|_{L^\infty(0,T;Z)} := \operatorname{ess\,sup}_{t \in (0,T)} \|y(t)\|_Z < \infty$. The Hilbert space $W(0,T)$ consists of all functions in $L^2(0,T;H_0^1(\Omega))$ that have a distributional derivative in $L^2(0,T;H^{-1}(\Omega))$, i.e. $W(0,T) := \{ y \in L^2(0,T;H_0^1(\Omega)) : \frac{\partial y}{\partial t} \in L^2(0,T;H^{-1}(\Omega)) \}$.
where a i,j ∈ L ∞ (Ω) satisfy the uniform ellipticity condition The matrix with components a i,j is denoted by A.
The functions f, L 0 : Q × R −→ R of the variables (x, t, y), and the "initial" function y 0 have the following properties.
Assumption 2. For every $y \in \mathbb{R}$, $f(\cdot,\cdot,y) \in L^r(Q)$ and $L_0(\cdot,\cdot,y) \in L^1(Q)$; moreover, $y_0 \in L^\infty(\Omega)$. For a.e. $(x,t) \in Q$ the first and the second derivatives of $f$ and $L_0$ with respect to $y$ exist and are locally bounded and locally Lipschitz continuous, uniformly with respect to $(x,t) \in Q$. Moreover, $\frac{\partial f}{\partial y}(x,t,y) \ge 0$ for a.e. $(x,t) \in Q$ and for all $y \in \mathbb{R}$.
Remark 1. The last condition in Assumption 2 can be relaxed in the following way: $\frac{\partial f}{\partial y}(x,t,y) \ge C_f$ for a.a. $(x,t) \in Q$ and all $y \in \mathbb{R}$, (1.6) see [3,8]. However, this leads to complications in the proofs.
1. For each u ∈ L 2 (Q) the linear parabolic equation (1.7) has a unique weak solution y u ∈ W (0, T ). Moreover, there exists a constantĈ > 0 independent of u and α such that 2. If, additionally, u ∈ L r (Q) (we remind (1.5)) then the weak solution y u of (1.7) belongs to W (0, T )∩C(Q). Moreover, there exists a constant C r > 0 independent of u and α such that Besides the independence of the constantsĈ, and C r on α all claims of the theorem are well known, see [28,Theorem 3.13,Theorem 5.5]. A proof of a similar independence statement can be found in [2] for a linear elliptic PDE of non-monotone type.
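The independence of the bounds from $\alpha$ can be observed numerically in a very simple setting. The sketch below is only an illustration under strong simplifying assumptions that are not part of the paper: one space dimension with $\Omega = (0,1)$, $A = -\partial_{xx}$, a constant coefficient $\alpha \ge 0$, zero initial data, and an explicit finite-difference scheme with a time step small enough for a discrete maximum principle. The function name `solve_linear_heat` and all grid parameters are hypothetical.

```python
def solve_linear_heat(alpha, source, n_cells=20, final_time=0.1):
    """Explicit Euler / central differences for
    y_t - y_xx + alpha * y = source(x) on (0, 1), y = 0 on the boundary,
    y(., 0) = 0.  Returns the nodal values at the final time."""
    dx = 1.0 / n_cells
    dt = 0.25 * dx * dx
    lam = dt / (dx * dx)
    # The scheme is monotone (discrete maximum principle) only if the
    # coefficient of y[i] in the update is nonnegative.
    if 1.0 - 2.0 * lam - alpha * dt < 0.0:
        raise ValueError("time step too large for this value of alpha")
    n_steps = int(round(final_time / dt))
    y = [0.0] * (n_cells + 1)
    for _ in range(n_steps):
        new_y = [0.0] * (n_cells + 1)  # boundary nodes stay at zero
        for i in range(1, n_cells):
            laplacian = (y[i - 1] - 2.0 * y[i] + y[i + 1]) / (dx * dx)
            new_y[i] = y[i] + dt * (laplacian - alpha * y[i] + source(i * dx))
        y = new_y
    return y


# Nonnegative source u = 1; compare sup norms for increasing alpha.
alphas = (0.0, 1.0, 10.0, 50.0)
sup_norms = [max(abs(v) for v in solve_linear_heat(a, lambda x: 1.0))
             for a in alphas]
```

For a nonnegative source the computed sup norms do not increase with $\alpha$, so the value for $\alpha = 0$ bounds all the others, which mirrors the comparison with the case $\alpha \equiv 0$ on which the uniform constants rely.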
Proof. For convenience of the reader, we prove that the estimates are independent of α. This is done along the lines of the proof of [2, Lemma 2.2]. By y 0,u we denote a solution of (1.7) for α ≡ 0. It is well known that in this case there exist constants C r ,Ĉ > 0 such that To apply this, we decompose u in positive and negative parts, u = u + − u − , u + , u − ≥ 0. By the weak maximum principle [14,Theorem 11.9], it follows that y α,u + , y α,u − ≥ 0. Again by the weak maximum principle, the equation ∂ ∂t (y α,u + − y 0,u + ) + A(y α,u + − y 0,u + ) + α(y α,u + − y 0,u + ) = −αy 0,u + implies 0 ≤ y α,u + ≤ y 0,u + , thus y α,u + C(Q) ≤ y 0,u + C(Q) . By the same reasoning, it follows that 0 ≤ y α,u − ≤ y 0,u − and y α,u − C(Q) ≤ y 0,u − C(Q) . Hence, The estimate for L 2 (0, T, H 1 0 (Ω)) can be obtained by similar arguments as in [2]. The next lemma is motivated by an analogous result for linear elliptic equations [2, Lemma 2.3], although, according to the nature of the parabolic setting, the interval of feasible numbers s, is smaller. Lemma 2. Let u ∈ L r (Q) and 0 ≤ α ∈ L ∞ (Q). Let y u be the unique solution of (1.7) and let p u be a solution of the problem − ∂p ∂t + A * p + αp = u in Q, p = 0 on Σ, p(·, T ) = 0 on Ω. (1.10) Then, for any s n ∈ [1, n+2 n ) there exists a constant C s ′ n > 0 independent of u and α such that max{ y u L sn (Q) , p u L sn (Q) } ≤ C s ′ n u L 1 (Q) . (1.11) Here s ′ n denotes the Hölder conjugate of s n . Proof. First we observe that by Theorem 1, y u ∈ C(Q) ∩ W (0, T ) and as a consequence, |y u | sn−1 sign(y u ) ∈ L s ′ n (Q). Moreover, s n < n+2 n implies that s ′ n > 1 + n 2 . By change of variables, see for instance [28,Lemma 3.17], a solution of equation (1.10) transforms into a solutions of (1.7). Thus according to Theorem 1, the solution q of − ∂q ∂t + A * q + αq = |y u | sn−1 sign(y u ) in Q, q = 0 on Σ, q(·, T ) = 0 on Ω. belongs to W (0, T ) ∩ C(Q) and satisfies n is independent of a and v. 
Using these facts we derive the equalities y u sn L sn (Q) = Q |y u | sn dx = − ∂q ∂t + A * q + αq, y u = ∂y u ∂t + Ay u + αy u , q This proves (1.11) for y u . To obtain (1.11) for p u , one tests (1.10) with a weak solution of ∂y ∂t + Ay + αy = |q u | sn−1 sign(q u ) in Q, y = 0 on Σ, y(·, 0) = 0 on Ω, and argues in an analogous way.
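The relation between the exponents used at the start of the proof, namely that $s_n < \frac{n+2}{n}$ forces $s_n' > 1 + \frac{n}{2}$, follows from a short computation. The sketch below spells it out for $s_n > 1$ (for $s_n = 1$ the conjugate exponent is $\infty$); it uses only the definition of the Hölder conjugate.

```latex
\[
s_n' = \frac{s_n}{s_n - 1} = 1 + \frac{1}{s_n - 1},
\qquad
1 < s_n < \frac{n+2}{n}
\;\Longrightarrow\;
0 < s_n - 1 < \frac{2}{n}
\;\Longrightarrow\;
s_n' > 1 + \frac{n}{2}.
\]
```

Thus a right-hand side in $L^{s_n'}(Q)$ meets the integrability requirement $r > 1 + \frac{n}{2}$ with $r = s_n'$, which appears to be what makes Theorem 1 applicable to the auxiliary equation for $q$.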
Below we remind several results for the semilinear equation (1.2), which will be used further. A proof of the next theorem can be found in [5,Theorem 2.1] Theorem 3. For any u ∈ L 2 (Q) the semilinear parabolic initial-boundary value problem (1.2) has a unique weak solution y u ∈ W (0, T ). If u ∈ L r (Q) (see (1.5)) then y u ∈ W (0, T ) ∩ L ∞ (Q). If additionally y 0 ∈ C(Ω), then y u ∈ C(Q). Moreover, there exists a constant D r > 0, independent of u, f, y 0 such that Finally, if u k ⇀ u weakly in L r (Q), then The differentiability of the control-to-state operator under the assumptions 1 and 2 is well known, see among others [8,Theorem 2.4].

Remark 2.
By the boundedness of U in L ∞ (Q) and by Theorem 3, there exists a constant M U > 0 such that (1.16)

Estimates associated with differentiability
We employ results of the last subsection to derive estimates for the state equation (1.2) and its linearisation (1.14). These estimates constitute a key ingredient to derive stability results in the later sections. The next lemma extends [2, Lemma 2.7] from elliptic equations to parabolic ones.
Lemma 5. The following statements are fulfilled.
(i) There exists a positive constant M 2 such that for every u,ū ∈ U and v ∈ L r (Q) Then there exists ε > 0 such that for every u,ū ∈ U with y u − yū C(Q) < ε the following inequalities are satisfied (1. 19) The proof, that is a consequence of Lemma 28, is given in Appendix A.

The control problem
The optimal control problem (1.1)-(1.3) is well posed under assumptions 1 and 2. Using the direct method of calculus of variations one can easily prove that there exists at least one global minimizer, see [28,Theorem 5.7].
On the other hand, the semilinear state equation makes the optimal control problem nonconvex, therefore we allow global minimizers as well as local ones. In the literature, weak and strong local minimizers are considered.
We say that ū ∈ U is a strong local minimum of (P) if there exists ε > 0 such that We say that ū ∈ U is a strict (weak or strong) local minimum if the above inequalities are strict for u ≠ ū.
Relations between these types of optimality are obtained in [3, Lemma 2.8].
As a consequence of Theorem 4 and the chain rule, we obtain the differentiability of the objective functional with respect to the control.

Sufficient conditions for optimality and stability
In this subsection we discuss the state of the art in the theory of sufficient second order optimality conditions in PDE optimal control, as well as related stability results for the optimal solution. For this purpose, we recall the definitions of several cones that are useful in the study of sufficient conditions. Given a triplet $(\bar y, \bar p, \bar u)$ satisfying the optimality system in Theorem 7, and abbreviating $\frac{\partial H}{\partial u}(x,t) := \frac{\partial H}{\partial u}(x,t,\bar y,\bar p,\bar u)$, we have from This motivates considering the following set Sufficient second order conditions for (local) optimality based on (2.9) are given in [8,3,10]. Following the usual approach in mathematical programming, one can define the critical cone at $\bar u$ as follows: Obviously, this cone is trivial if $\frac{\partial H}{\partial u}(x,t) \neq 0$ for a.e. $(x,t)$ (which implies bang-bang structure of $\bar u$), thus no additional information can be gained based on $C_{\bar u}$. To address this issue, it was proposed in [19,21] to consider larger cones on which second order conditions can be posed. Namely, for $\tau > 0$ one defines The cones $D^\tau_{\bar u}$, $E^\tau_{\bar u}$ and $G^\tau_{\bar u}$ were introduced in [4,10] as extensions of the usual critical cone. It was proven in [4,9,10] that the condition: is sufficient for weak (in the case $G = D^\tau_{\bar u}$) or strong (in the case $G = E^\tau_{\bar u}$) local optimality in the elliptic and parabolic setting. Most recently, the cone $C^\tau_{\bar u}$ was defined in [3] and also used in [6]. It was proved in [3] that (2.14) with $C = C^\tau_{\bar u}$ is sufficient for strong local optimality. Under (2.14) it is possible to obtain some stability results. In [9] and [10] the authors obtain Lipschitz stability in the $(L^2 - L^\infty)$-sense for the states¹, under perturbations appearing in a tracking type objective functional and under the assumption that the perturbations are Lipschitz. Further, they obtain Hölder stability for the states under a Tikhonov type perturbation. Hölder stability under (2.14) with exponent 1/2 was proved in [11] with respect to perturbations in the initial condition.
To improve the stability results an additional assumption is needed. This role is usually played by a structural assumption on the adjoint state or, more generally, on the derivative of the Hamiltonian with respect to the control. In the case of an elliptic state equation, [25] uses the structural assumption In the parabolic case this assumption (with $\Omega$ replaced with $Q$) is used in [11]. We recall that the assumption (2.15) implies that $\bar u$ is of bang-bang type. Further, (2.15) implies the existence of a constant $\kappa > 0$ such that the following growth property holds: For a proof see [1], [22] or [26]. If the control constraints satisfy $u_a < u_b$ almost everywhere on $Q$, both conditions, (2.15) and (2.16), are equivalent, see [17, Proposition 6.4]. In [25], using (2.15) and (2.14) with $G = D^\tau_{\bar u}$, the authors prove $L^1$-Lipschitz stability of the controls for an elliptic semilinear optimal control problem under perturbations appearing simultaneously in the objective functional and the state equation. Assuming (2.15), (2.14) may also be weakened to the case of negative curvature, In [12], [13] it was proved that (2.15) together with (2.17) implies, for the semilinear elliptic case, weak local optimality in $L^1(\Omega)$. Lipschitz stability results were also obtained in [17] in the elliptic case. Finally, for a semilinear parabolic equation with perturbed initial data, [11, Theorem 4.6] obtains, under (2.14) and (2.15), $L^2 - L^2$ and $L^1 - L^2$ Hölder stability (see Footnote 1), with exponent $\frac{2}{3}$, for the optimal states and controls respectively. Additionally, Lipschitz dependence is obtained on perturbations in $L^\infty(Q)$.
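The bang-bang structure referred to in this subsection can be illustrated pointwise. The sketch below assumes the standard pointwise consequence of the minimum condition for a Hamiltonian that is affine in the control: where the switching function $\frac{\partial H}{\partial u}$ is positive the control sits at the lower bound $u_a$, where it is negative at the upper bound $u_b$, and where it vanishes the condition gives no information. The function names and the representation of functions by finitely many grid samples are hypothetical and chosen only for illustration.

```python
def hamiltonian_minimizing_control(switching, lower, upper):
    """Pointwise minimizer of sigma * u over [u_a, u_b] at each sample.

    switching: samples of the switching function dH/du
    lower, upper: samples of the bounds u_a and u_b
    Returns None where the switching function vanishes, because any
    value in [u_a, u_b] is then a minimizer.
    """
    control = []
    for sigma, u_a, u_b in zip(switching, lower, upper):
        if sigma > 0.0:
            control.append(u_a)
        elif sigma < 0.0:
            control.append(u_b)
        else:
            control.append(None)
    return control


def is_bang_bang(switching):
    """True if the switching function is nonzero at every sample."""
    return all(sigma != 0.0 for sigma in switching)


example = hamiltonian_minimizing_control(
    [0.5, -2.0, 0.0], [-1.0, -1.0, -1.0], [1.0, 1.0, 1.0]
)
```

In the example the last sample has a vanishing switching function, so the control is not determined there; this is the situation in which the larger cones and the joint first and second order conditions discussed in this subsection become relevant.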

A unified sufficiency condition
In this section, we introduce an assumption that unifies the first and second order conditions presented in the previous section.
Assumption 3. For a number k ∈ {0, 1, 2}, at least one of the following conditions is fulfilled: (A k ): There exist constants α k , γ k > 0 such that In the context of optimal control of PDE's the assumptions (A 0 ) and (B 0 ) were first introduced in [17] and for k = 1, 2 in [2]. Assumption 3(B 0 ) originates from optimal control theory of ODE's where it was first introduced in [23] to deal with nonlinear affine optimal control problems. The cases k = 1, 2 are extensions, adapted to the nature of the PDE setting, while the case k = 0 can be hard to verify if a structural assumption like (2.15) is not imposed. The assumptions corresponding to k = 1, 2 are applicable for the case of optimal controls that need not be bang-bang, especially the case k = 2 seems natural for obtaining state stability. Assumption (A k ) implies strong (local) optimality, while Assumption (B k ) leads to weak (local) optimality. As seen below, in some cases the two assumptions are equivalent.
For an optimal control problem subject to a semilinear elliptic equation, the claim of the next proposition with k = 0 was proven in [2, Proposition 5.2].
The proof is given in Appendix A.
Remark 3. We compare the items in Assumption 3 to the ones using (2.15) and (2.17) or (2.14). 2. Assumption 3(A 1 ) is implied by the structural assumption (2.15) together with (2.14). This is clear by (2.16) and by using v and w as defined in Lemma 13 and arguing as in Corollary 14, both presented below in this section.
3. Assumption 3(A 2 ) is implied by (2.14) together with the first order necessary condition.

Sufficiency for optimality of the unified condition
In this subsection we show that assumptions 3(A k ) and (B k ) are sufficient either for strict weak or strict strong local optimality, correspondingly.
Theorem 9. The following holds.
Letū ∈ U satisfy the optimality conditions (2.6)-(2.8) and Assumption 3(A k ) with some k ∈ {0, 1, 2}. Then, there exist ε k , κ k > 0 such that: Before presenting a proof of Theorem 9, we establish some technical results. The following lemma was proved for various types of objective functionals, see e.g. [10,Lemma 6], [9,Lemma 3.11]. Nevertheless, our objective functional is more general, therefore we present in Appendix A an adapted proof.

For every
For the assumptions with k ∈ {0, 1}, we need the subsequent corollary, which is also given in Appendix A.
The same assertions hold for m = 0 if one requires u −ū L 1 (Q) to be small instead of y u −ȳ C(Q) .
The next lemma claims that Assumption 3 implies a growth similar to (3.2) of the first derivative of the objective functional in a neighborhood of ū.
Lemma 12. The following claims are fulfilled.
1. Let m = 0 andū satisfy assumption (A k ), for some k ∈ {0, 1, 2}. Then, there existᾱ k ,γ k > 0 such that Proof. Since J is of class C 2 we can use the mean value theorem to infer the existence of a function θ : and under (A k ) in Assumption 3, we infer the existence of positive constants γ k and α k such that for all u ∈ U with y u −ȳ C(Q) < α k . Using Lemma 10, we obtain that holds. Using Corollary 11 and the estimate , proves the case for (3.6).
Finally, we conclude this subsection with the proof of Theorem 9.
Proof of Theorem 9. Using the Taylor expansion and the optimality condition J ′ (ū)(u −ū) ≥ 0 we have We continue this inequality, using that by Assumption 3 there exist α k > 0 and γ k > 0 such that (3.2) holds: depending on the chosen assumption (A k ) or (B k ). Now, either by Lemma 10 or Corollary 11 (depending on the assumption) there exist ε > 0 andγ k < γ k such that for every u ∈ U with y u −ȳ C(Q) < ε. We may chooseᾱ k > 0 andγ k > 0 according to Lemma 12 and depending on the chosen assumption therein. Inserting this estimate in the above expression and applying (1.18) gives To complete the proof of the second claim of the theorem we use that to apply Lemma 10 or Corollary 11 depending on k ∈ {0, 1, 2}.

Some equivalence results for the assumptions on cones
In this subsection we show that some of the items in Assumption 3 can be formulated equivalently on the cones D τ u or C τ u respectively. This applies to (B k ) or to (A k ) depending on whether the objective functional explicitly depends on the control or not. We need the next lemma, the proof of which uses a result from [7].
Proof. We defineũ,û ∈ U bỹ On the other hand, by (1.18), zū ,u−ū L ∞ (Q) < ε implies y u − yū L ∞ (Q) < 2ε. If m, g = 0, we can argue as in [7] using u −ū ∈ G τ u and the definition of w, to estimate Thus by Theorem 1 and (1.16) r . For zū ,v , we estimate with C := 2(C 0 + 1) In the second case the estimate holds trivially. Now we continue with the equivalence properties.
Corollary 14. For k ∈ {0, 2}, Assumption 3(B k ) is equivalent to the following condition (B k ): there exist constants α k , γ k , τ > 0 such that for all u ∈ U for which (u −ū) ∈ D τ u and u −ū L 1 (Q) < α k . Proof. Let k ∈ {0, 2}. If (B k ) holds then (B k ) is obviously also fulfilled. Now let (B k ) hold. The numbersα k andγ k > 0 will be chosen later so that assumption (B k ) will hold with these numbers. For now we only require that 0 <α k < α k . Choose an arbitrary u ∈ U with u −ū L 1 (Q) <α k . We only need to prove (3.1) in the case u −ū / ∈ D τ u . Take v and w as defined in Lemma 13. Clearly by definition v ∈ D τ u . As a direct consequence of (2.3)-(2.4) and Assumption 1 and 2 there exists a constant C 0 > 0 such that (3.10) We estimate Sinceα k < α k and v ∈ D τ u we may apply (3.8) with v instead of u −ū. Using also (3.11), we estimate In the last inequality we use that by choosingα k > 0 sufficiently small we may ensure that This is implied by the inequalities zū ,w L ∞ , zū ,v L ∞ (Q) ≤ C rα 1 r k resulting from Lemma 13. Further, we find where we used that u −ū L 1 (Q) < 2M U for all u ∈ U and For k = 2: This proves that (3.1) is satisfied with an appropriate numberγ k .
If the control does not appear explicitly in the objective functional, we obtain a stronger result.
Corollary 15. Let m, g = 0. Then Assumption 3(A 2 ) is equivalent to the following condition (Ā 2 ): there exist constants α 2 , γ 2 , τ > 0 such that for all u ∈ U for which (u −ū) ∈ C τ u and y u −ȳ L ∞ (Q) < α 2 . Proof. It is obvious that (A 2 ) implies (Ā 2 ). For the reverse, if u −ū ∈ C τ u the estimate holds trivially. We need to consider the cases u −ū / ∈ G τ u and u −ū / ∈ D τ u with u −ū ∈ G τ u . For the first, we argue as follows. Since For the second case u −ū ∈ G τ u and u −ū / ∈ D τ u , letα > 0 be smaller than α 2 , so that (3.12) and the prerequisite of Lemma 13 is satisfied. We define w, v as in Lemma 13. By the choice of α 2 , Lemma 13 gives the existence of a constant C > 0 such that zū ,u−ū L ∞ < α 2 implies

Now we can proceed by the same arguments as in Corollary 14.
Finally, we use the estimate

Strong metric Hölder subregularity and auxiliary results
We study the strong metric Hölder subregularity property (SMHSr) of the optimality map. This is an extension of the strong metric subregularity property (see, [18,Section 3I] or [15,Section 4]) dealing with Lipschitz stability of set-valued mappings. The SMHSr property is especially relevant to the parabolic setting where Lipschitz stability may fail.

The optimality mapping
We begin by defining some operators used to represent the optimality map in a more convenient way. This is done analogously to [17, Section 2.1]. Given the initial data $y_0$ in (1.2), we define the set $D(L) := \{ y \in W(0,T) \cap L^\infty(Q) : (\frac{d}{dt} + A) y \in L^r(Q),\ y(\cdot,0) = y_0 \}$.
To shorten notation, we define L : D(L) → L r (Q) by L := d dt + A. Additionally, we define the operator With the operators L and L * , we recast the semilinear state equation (1.2) and the linear adjoint equation (2.7) in a short way: Ly = u − f (·, y) L * p = L y (·, y u , u) − pf y (·, y u ) = ∂H ∂y (·, y u , p, u).
The normal cone to the set U at u ∈ L 1 (Q) is defined in the usual way: The first order necessary optimality condition for problem (1.1)-(1.3) in Theorem 7 can be recast as With the abbreviation ψ := (y, p, u), the system (4.1) can be rewritten as the inclusion 0 ∈ Φ(ψ). Our goal is to study the stability of system (4.1), or equivalently, the stability of the solutions of the inclusion 0 ∈ Φ(ψ) under perturbations. For elements ξ, η ∈ L r (Ω) and ρ ∈ L ∞ (Ω) we consider the perturbed system which is equivalent to the inclusion ζ := (ξ, η, ρ) ∈ Φ(ψ).

Strong metric Hölder subregularity: main result
This subsection contains one of the main results in this paper: estimates of the difference between the solutions of the perturbed system (4.4) and a reference solution of the unperturbed one, (4.1), by the size of the perturbations. This will be done using the notion of strong metric Hölder subregularity introduced in the next paragraphs. Given a metric space (X , d X ), we denote by B X (c, α) the closed ball of center c ∈ X and radius α > 0. The spaces Y and Z, introduced in (4.2), are endowed with the metrics where ψ i = (y i , p i , u i ) and ζ i = (ξ i , η i , ρ i ), i ∈ {1, 2}. From now on, we denoteψ := (yū, pū,ū) to simplify notation.
In the next assumption we introduce a restriction on the set of admissible perturbations, call it Γ, which is valid for the remaining part of this section.  For any u ∈ U and ζ ∈ Γ we denote by (y ζ u , p ζ u , u) a solution of the first two equations in (4.4). Using (1.12) in Theorem 3 we obtain the existence of a constant K y such that y ζ u L ∞ (Q) ≤ K y ∀u ∈ U ∀ζ ∈ Γ. (4.7) Then for every u ∈ U, every admissible disturbance ζ, and the corresponding solution y of the first equation in (4.4) it holds that (y ζ u (x, t), u(x, t)) ∈ R := [−K y , K y ] × [u a , u b ].

Remark 4.
We apply the local properties in Assumption 2 to the interval [−K y , K y ], and denote further bȳ C a constant that majorates the bounds and the Lipschitz constants of f and L 0 and their first and second derivatives with respect to y ∈ [−K y , K y ].
By increasing the constant K y , if necessary, we may also estimate the adjoint state: This follows from Theorem 1 with α = − ∂f ∂y (x, t, y ζ u ) and with ∂L ∂y (x, t, y ζ u , u) at the place of u. We need some technical lemmas before stating our main result.
Lemma 18. Let u ∈ U be given and v, η ∈ L r (Q), ξ ∈ L ∞ (Q). Consider solutions y u , y ξ u , p u and p η u of the equations There exists constants β i > 0, i ∈ {1, 2}, independent of ζ ∈ Γ, such that the following inequalities hold whereĈ is the constant given in (1.8) and s ∈ [1, n+2 n ). Proof. Subtracting the state equations in (4.8) and using the mean value theorem we obtain Then, (1.8) implies (4.10). To prove (4.11) we subtract the equations (4.9) satisfied by z ξ u,v and z u,v to obtain d dt Now, using (1.8), the mean value theorem, and (4.6) we obtain . The proof for estimate (4.12) follows by the same argumentation but using (1.11). We denote by β 1 > the maximum of the constants appearing in the estimate above and its analog for (4.12). Finally, we subtract the adjoint states and employ the mean value theorem to find The claim follows using (1.8), (1.16), and (4.7) to estimate 2]. Let u ∈ U and let y u , p u be the corresponding state and adjoint state. Further, let y ζ u and p ζ u be solutions to the perturbed state and adjoint equation in (4.4) for the control u. There exist constants C,C > 0, independent of ζ ∈ Γ, such that for v ∈ U, the following estimates hold.
2. For a general m ∈ R: Proof. We consider the first case, m = 0. We begin with integrating by parts For the first term we use the Hölder inequality, the mean value theorem, (1.11), (1.16), and (4.10) to estimate . Here we used that by Theorem 1 and Lemma 1.11 it holds and noticing that 2−s 2s ′ + s 2 = 1 − 2−s 2s . The second term is estimated by using (1.16), Hölder's inequality, and (4.11): where d 1 := ψ MU L s ′ (Q) and d 2 := β 1 max{d 1 , |Q| 1 r C pe }C . For last term we estimate We prove the second case (4.16). By applying (1.9) and arguing as in the proof of (4.10) and (4.13) but for r, we infer the existence of a constant, again denoted byC > 0, such that: The main result in the paper follows.
Let y u and p u denote the solutions to the unperturbed problem with respect to u, i.e. u = Ly u + f (·, ·, y u ) and 0 = L * p u − ∂H ∂y (·, y u , p u , u).
By Lemma 18, there exist $\hat C, \beta_2 > 0$ independent of $\psi$ and $\zeta$ such that By the definition of the normal cone, $\rho \in \frac{\partial H}{\partial u}(\cdot,\cdot,y^\zeta_u,p^\zeta_u) + N_U(u)$ is equivalent to We conclude for $w = \bar u$, By Lemma 19, we have an estimate on the third term. Since $\|u - \bar u\|_{L^1(Q)} < \alpha_0$, we estimate by Lemma 12 and Lemma 19 and consequently for an adapted constant, denoted in the same way To estimate the states, we use the estimate for the controls. We notice that $(2-s)/(2s') + s/2 = 1 + (s-2)/(2s)$ and obtain Thus, for a constant again denoted by $C$ and with $\bigl(1 + \frac{s-2}{2s}\bigr)\frac{2s}{s+2} = \frac{3s-2}{2+s}$, Next, we realize that by Lemma 18 and (4.2) Using $\|p_{\bar u} - p_u\|_{L^2(Q)} \le \hat C \|y_{\bar u} - y_u\|_{L^2(Q)}$ and (4.13), the same estimate holds for the adjoint state. Subsequently, we define $\kappa := \max\{C, \hat C\}$. Finally, we consider the case $m \neq 0$. Using estimate (4.16) in (4.25) and arguing from there as for the case $m = 0$, we infer the existence of a constant $C > 0$ such that This implies under (4.26) the estimate for the states and adjoint states.
To determine θ and θ 0 we notice that the functions s → s−2 2s and s → 3s−2 2+s are monotone. Inserting the value for n+2 2 for each case n ∈ {1, 2, 3} completes the proof. To obtain results under Assumption 3 for k ∈ {1, 2}, we need additional restrictions. We either don't allow perturbations ρ (appearing in the inclusion in (4.4)) or they need to satisfy ρ ∈ D(L * ).
Proof. We first notice that if the perturbation ρ satisfies (4.27), it holds Under Assumption (A 1 ), we can proceed as in the proof of Theorem 20 using Lemma 12 and (4.15) in Lemma 19, to infer the existence of constants α, κ 1 > 0 such that and by standard estimates the existence of a constantĈ > 0 and using (1.18) for all u ∈ U with y u −ȳ L ∞ (Q) < α or u −ū L 1 (Q) < α depending on the assumption. From here on, one can proceed as in the proof of Theorem 20 and define the final constant κ > 0 and the exponent θ 0 accordingly. Finally, by similar reasoning, under Assumption (A 2 ) with Lemma 12 and Lemma 19, one obtains the existence of a constant κ > 0 such that for all u ∈ U with y u −ȳ L ∞ (Q) < α or u −ū L 1 (Q) < α. Again, proceeding as in Theorem 20 and increasing the constant κ if needed, proves the claim.
Remark 5. Theorems 20 and 21 concern perturbations which are functions of x and t only. On the other hand, [15,Theorem ] suggests that SMHSr implies a similar stability property under classes of perturbations that depend (in a non-linear way) on the state and control. This fact will be used and demonstrated in the next section.

Stability of the optimal solution
In this section we obtain stability results for the optimal solution under non-linear perturbations in the objective functional. Namely, we consider a disturbed problem subject to dy dt + Ay + f (x, t, y) = u + ξ in Q, y = 0 on Σ, y(·, 0) = y 0 in Ω, (5.2) where ζ := (ξ, η) is a perturbation. The corresponding solution will be denoted by y ζ u . In contrast with the previous section, the perturbation η may be state and control dependent. For this reason, here we change the notation of the set of admissible perturbations toΓ. However, Assumption 4 will still be valid for the setΓ. We also use the notations C pe , K y and R with the same meaning as in Subsection 4.2.
In addition to Assumption 4, we require the following assumption, which holds throughout the remainder of the section.
Assumption 5. The perturbation η ∈ L 1 (Q × R) for every (ξ, η) ∈Γ. For a.e. (x, t) ∈ Q the function η(x, t, ·, ·) is of class C 2 and is convex with respect to the last argument, u. Moreover, the functions ∂η ∂y and ∂ 2 η ∂y 2 are bounded on Q × R, and the second one is continuous in (y, u) ∈ R, uniformly with respect to (t, x) ∈ Q.
Due to the linearity of (5.2) and the convexity of the objective functional (5.1) with respect to u, the proof of the next theorem is standard.
In the next two theorems, we consider sequences of problems {(P ζ k )} with ζ k ∈Γ. The proofs repeat the arguments in [2,Theorem 4.2,Theorem 4.3].
Theorem 23. Let a sequence {ζ k ∈Γ} k converge to zero in L 2 (Q) × L 2 (Q × R) and let u k be a local solution of problem (P ζ k ), k = 1, 2, . . . Then any control ū that is a weak* limit in L ∞ (Q) of this sequence is a weak local minimizer in problem (P), and for the corresponding solutions it holds that y u k → ȳ in L 2 (0, T ; H 1 0 (Ω)) ∩ L ∞ (Q).
Theorem 24. Let {ζ k } k be as in Theorem 23. Let ū be a strict strong local minimizer of (P). Then there exists a sequence of strong local minimizers {u k } of problems (P ζ k ) such that u k * ⇀ ū in L ∞ (Q) and y u k converges strongly in L 2 (0, T ; H 1 0 (Ω)) ∩ L ∞ (Q).
The next theorem is central in this section.
Theorem 25. Let Assumption 3(A 0 ) be fulfilled for the reference weakly optimal control ū in problem (P) and the corresponding ȳ and p̄. Then there exist positive numbers α and C for which the following is fulfilled. For every perturbation ζ ∈Γ and for every weak local solution u ζ of problem (P ζ ) with u ζ − ū L 1 (Q) ≤ α, the following estimates hold: .
Here θ 0 and θ are defined as in Theorem 20.
Since it is assumed that u ζ −ū L 1 (Q) ≤ α we may apply Theorem 20 (here we choose the same α as in this theorem) to prove the inequalities in the theorem.
Theorem 26. Let m = 0 and Assumption 3(A 1 ) be fulfilled for the reference strongly optimal controlū in problem (P). Then there exist positive numbers α and C for which the following is fulfilled. For every perturbation ζ ∈Γ and for every local solution u ζ of problem (P ζ ) with y u ζ −ȳ L ∞ (Q) ≤ α, the following estimates hold.
and all together where θ 0 is defined in Theorem 20.
Theorem 27. Let m = 0 and let Assumption 3(A 2 ) be fulfilled for the reference strongly optimal control ū in problem (P). Then there exist positive numbers α and C for which the following is fulfilled. For every perturbation ζ ∈Γ and for every local solution u ζ of problem (P ζ ) with y u ζ − ȳ L ∞ (Q) ≤ α, the following estimates hold:
Remark 6. The requirement in the theorems above that u ζ be close to the reference solution ū is not a severe restriction. Indeed, Assumption 3 implies that ū satisfies (3.2). Hence, ū is a strict strong local minimizer of (P) and, consequently, Theorem 24 ensures the existence of a family {u ζ k }, ζ k ∈Γ, of strong local minimizers of problems (P ζ k ) satisfying the conditions of Theorem 20 or 21.
Example 1 (Tikhonov regularization). We consider the optimal control problem subject to (1.2) and (1.3). As before, ū denotes a strict strong solution of problem (P) ≡ (P 0 ). We assume that ū satisfies Assumption 3(A 0 ). From Theorem 24 we know that for every sequence λ k > 0 converging to zero there exists a sequence of strong local minimizers {u λ k } ∞ k=1 such that u λ k → ū in L 1 (Q) for k → ∞; thus, for a sufficiently large k 0 , we have that for all k > k 0 where θ is defined in Theorem 20.
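The dependence of a bang-bang control on the Tikhonov parameter can be illustrated on a toy problem. The sketch below is not the parabolic problem of Example 1: it assumes a state-free objective, a regularization term of the form (λ/2)∫u², the interval (0, 1) in place of Q, bounds u a = −1 and u b = 1, and the hypothetical weight g(x) = x − 1/2. All function names are introduced here for illustration only. In this setting the minimizers are available pointwise, and the L 1 distance between the regularized and the unregularized control can be approximated by a midpoint rule.

```python
# Toy, state-free illustration (an assumption, not the PDE problem of Example 1):
#   minimize  int_0^1 g(x) u(x) dx + (lam/2) int_0^1 u(x)^2 dx,   -1 <= u <= 1,
# with the hypothetical weight g(x) = x - 0.5.

def g(x):
    """Hypothetical linear weight; it vanishes only at x = 0.5."""
    return x - 0.5

def u_bang(x):
    """Pointwise minimizer for lam = 0: the bang-bang control -sign(g)."""
    return -1.0 if g(x) > 0 else 1.0

def u_reg(x, lam):
    """Pointwise minimizer for lam > 0: projection of -g/lam onto [-1, 1]."""
    return max(-1.0, min(1.0, -g(x) / lam))

def l1_distance(lam, n=100_000):
    """Midpoint-rule approximation of the L^1(0,1) distance of u_reg and u_bang."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h  # midpoints never hit x = 0.5 exactly when n is even
        total += abs(u_reg(x, lam) - u_bang(x))
    return total * h

if __name__ == "__main__":
    for lam in (0.2, 0.1, 0.05, 0.025):
        print(f"lam = {lam:<6} L1 distance ~ {l1_distance(lam):.5f}")
```

In this toy case the two controls differ only where |g| < λ, and a direct integration gives a distance equal to λ for λ ≤ 1/2, that is, a Lipschitz dependence on λ. The rate in the parabolic setting is governed by the exponent θ of Theorem 20 and is not claimed by this sketch.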

Examples
Here we present two examples that show particular applications in which different assumptions are involved.
Example 2 (Negative curvature). We begin with an optimal control problem that has negative curvature. The parabolic equation has the form dy dt + Ay + exp(y) = u in Q, y = 0 on Σ, y(·, 0) = y 0 on Ω. (6.1) Let 0 ≤ g ∈ L 2 (Q) be a function satisfying the structural assumption (2.15). We consider the optimal control problem min u∈U J(u) := Q (y u + gu) dx dt subject to (6.1) and with control constraints By the weak maximum principle, y ua − y u ≤ 0 for all u ∈ U, and ū := u a constitutes an optimal solution. Further, by the weak maximum principle, the adjoint state p̄ and the linearized states zū ,u−ū , for all u ∈ U, are non-negative. Moreover, we have for all u ∈ U. Since g satisfies the structural assumption, there exists a constant C > 0 such that On the other hand, integrating by parts we obtain If for u ∈ U with u − ū L 1 (Q) or y u − ȳ L ∞ (Q) sufficiently small such that we can absorb the term J ′′ (ū)(u − ū) 2 by estimating where the last inequality is a consequence of the boundedness of U ⊂ L ∞ (Q), which implies the existence of a constant K > 0 such that zū ,u−ū L 1 (Q) ≥ K zū ,u−ū 2 L 2 (Q) for all u ∈ U. Altogether, we find Thus, Assumption 3(A 1 ) is fulfilled and we can apply Theorem 21 to obtain a stability result.
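The inequality relating the L 1 and L 2 norms of the linearized state in Example 2 can be traced in one line. The sketch below writes z for zū ,u−ū and assumes a uniform bound M > 0 on the L ∞ (Q) norm of z over u ∈ U; the symbol M is notation introduced here, and the bound is the consequence of the boundedness of U in L ∞ (Q) that the example invokes.

```latex
% Assumption: \|z\|_{L^\infty(Q)} \le M for all u \in U (M is illustrative notation).
\begin{align*}
\|z\|_{L^2(Q)}^{2}
  = \int_Q |z|\,|z| \, dx\, dt
  \le \|z\|_{L^\infty(Q)} \, \|z\|_{L^1(Q)}
  \le M \, \|z\|_{L^1(Q)}
\quad\Longrightarrow\quad
\|z\|_{L^1(Q)} \ge K \, \|z\|_{L^2(Q)}^{2},
\qquad K := \frac{1}{M}.
\end{align*}
```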
Example 3 (State stability). We consider a tracking-type objective functional in which the control does not appear explicitly and for which we will verify (A 2 ). As perturbations we consider functions ζ = (ξ, η, ρ) ∈ D(L * ) × L r (Q) × L r (Q) × D(L * ). Denote by y d the solution of this equation with u = u a and consider the problem subject to the same constraints as in Example 2. For a local minimizer ū of the unperturbed problem (ζ = 0), it holds where p solves − dp dt + A * p + exp(ȳ)p = ȳ − y d in Q, p = 0 on Σ, p(·, T ) = 0 on Ω.
If the optimal state tracks y d such that ȳ − y d L r (Q) ≤ 1 2Cr exp(ȳ) L ∞ (Q) we find that (A 2 ) holds. From Theorem 26 we obtain the existence of a constant κ > 0 such that for every perturbation ζ ∈Γ and for every local solution u ζ of problem (P) with y u ζ − ȳ L ∞ (Q) ≤ α.
Using (1.9) in Theorem 1 we obtain that Then, byα k := α r k C r r (2MU ) r−1 , we obtain that (A k ) implies (B k ) with γ k =γ k . To prove the converse implication, (B k )⇒(A k ), we assume that (B k ) holds, but (A k ) fails. Then for every integer l ≥ 1 there exists an element u l ∈ U such that J ′ (ū)(u l −ū) + J ′′ (ū)(u l −ū) 2 < 1 l u l −ū 2−k L 1 (Q) zū ,u l −ū k L 2 (Q) and y u l −ȳ C(Q) < 1 l . (A.6) Since {u l } ∞ l=1 ⊂ U is bounded in L ∞ (Q), we can extract a subsequence, denoted in the same way, such that u l * ⇀ u in L ∞ (Q). On one side, (A.6) implies that y u l →ȳ in L ∞ (Q). On the other side, u l * ⇀ u in L ∞ (Q) implies weak convergence in L r (Q). From (1.13), the convergence y u l → y u in L ∞ (Q) follows. Then, y u =ȳ and, consequently, u =ū holds. But Assumption(B 0 ) implies thatū is bang-bang, and hence the weak convergence u l * ⇀ū in L ∞ (Q) yields the strong convergence u l →ū in L 1 (Q); see [17,Proposition 4.1 and Lemma 4.2]. Then, for k = 0, (A.6) contradicts (B 0 ). The same argument holds for (B 1 ) and (B 2 ) under the additional condition thatū is bang-bang and noticing that zū ,u l −ū C(Q) ≤ 3/2 y u l −ȳ C(Q) by Lemma 5.
A proof of the following lemma can be found in [2, Lemma 3.5] or [8, Lemma 3.5].