Maximum Principle for Stochastic Control of SDEs with Measurable Drifts

In this paper, we consider stochastic optimal control of systems driven by stochastic differential equations with an irregular drift coefficient. We establish a necessary and sufficient stochastic maximum principle. To achieve this, we first derive an explicit representation of the first variation process (in the Sobolev sense) of the controlled diffusion. Since the drift coefficient is not smooth, the representation is given in terms of the local time of the state process. We then construct a sequence of optimal control problems with smooth coefficients by an approximation argument. Finally, we use Ekeland's variational principle to obtain an approximating adjoint process from which we derive the maximum principle by passing to the limit. The work is notably motivated by the optimal consumption problem of investors paying wealth tax.


Introduction
Let T ∈ (0, ∞) be a given deterministic time horizon, let d ∈ N, and let Ω := C([0, T], R^d) be the canonical space of continuous paths. We denote by B the canonical process and by P the Wiener measure, and we equip Ω with (F_t)_{t∈[0,T]}, the P-completion of the canonical filtration of B. Given a d-dimensional vector σ and a function b : [0, T] × R × R^m → R, we consider a controlled diffusion of the form

dX^α(t) = b(t, X^α(t), α(t)) dt + σ dB(t), t ∈ [0, T], X^α(0) = x_0, (1)

and the control problem

V(x_0) := sup_{α ∈ A} J(α). (2)

Here, the performance functional J is given by

J(α) := E[ ∫_0^T f(s, X^α(s), α(s)) ds + g(X^α(T)) ],

where f and g may be seen as profit and bequest functions, respectively. The set A of admissible controls is defined as the set of progressively measurable processes α valued in a closed convex set A ⊆ R^m such that (1) admits a unique strong solution. The goal of the present article is to derive the maximum principle for the above control problem when the drift b is merely measurable in the state variable x.
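Although the analysis below is purely probabilistic, the objects in (1) are easy to visualize numerically. The following minimal sketch simulates one Euler-Maruyama path of a controlled diffusion whose drift is merely measurable in the state variable; the concrete discontinuous drift, the constant control, and all parameter values are hypothetical choices for illustration only.

```python
import numpy as np

def simulate_path(x0=0.0, T=1.0, n_steps=1000, sigma=1.0, seed=42):
    """One Euler-Maruyama path of dX = b(t, X, alpha) dt + sigma dB."""
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    x = x0
    for _ in range(n_steps):
        alpha = 0.5                       # hypothetical constant control
        b1 = 1.0 if x < 0.0 else -1.0     # measurable, discontinuous part b_1
        b2 = alpha                        # smooth controlled part b_2
        x += (b1 + b2) * dt + sigma * np.sqrt(dt) * rng.standard_normal()
    return x

print(simulate_path())
```

Despite the discontinuity of b_1 at zero, the scheme behaves stably, in line with the regularization-by-noise phenomenon invoked later in the introduction.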
The stochastic maximum principle is arguably one of the most prominent ways to tackle stochastic control problems such as (2) by purely probabilistic methods. It is the direct generalization to the stochastic framework of the maximum principle of Pontryagin [28] in deterministic control. It gives a necessary condition of optimality in the form of a two-point boundary value problem and a maximum condition on the Hamiltonian. More precisely, let the Hamiltonian H be defined as

H(t, x, y, a) := f(t, x, a) + b(t, x, a)y,

and assume for a moment that the functions b, f and g are continuously differentiable. Then, if α ∈ A is an optimal control, the stochastic maximum principle states that

H(t, X^α(t), Y(t), α(t)) ≥ H(t, X^α(t), Y(t), a) P ⊗ dt-a.s. for every a ∈ A,

where (Y, Z) is a pair of adapted processes solving the so-called adjoint equation

dY(t) = −( ∂_x f(t, X^α(t), α(t)) + ∂_x b(t, X^α(t), α(t)) Y(t) ) dt + Z(t) dB(t), Y(T) = ∂_x g(X^α(T)).
Under additional convexity conditions, this necessary condition is also sufficient. The interest of the maximum principle is that it reduces the solvability of the control problem (2) to that of a (scalar) variational problem, and therefore allows one to derive (sometimes explicit) characterizations of optimal controls. We refer for instance to [5; 30] for proofs and historical remarks. The maximum principle has far-reaching consequences and is widely used in the stochastic control and stochastic differential game literature [6; 7; 27; 16; 13]. Its use is also fueled by recent progress on the theory of forward-backward SDEs. We refer the reader for instance to [8; 17; 19; 31; 18] and the references therein.
The maximum principle roughly presented above naturally requires differentiability of the coefficients of the control problem, which precludes the applicability of this method to control problems with non-smooth coefficients. The effort to extend the stochastic maximum principle to problems with non-smooth coefficients started with the work of Mezerdi [24], who derived a necessary condition of optimality for a problem with a drift that is Lipschitz continuous, but not necessarily differentiable everywhere in the state and the control variables. His result was further extended, notably to degenerate diffusion cases and singular control problems, in [3; 2; 1]. See also [29] for the infinite horizon case.
The present work considers the case where b is Borel measurable in x and bounded, and we derive both necessary and sufficient conditions of optimality. At this point, an immediate natural question is: what form should the adjoint equation take in this case? The starting point of our argument is the following simple observation: when b is differentiable, the adjoint equation is explicitly solvable, with the solution given by

Y(t) = E[ Φ^α(t, T) ∂_x g(X^α(T)) + ∫_t^T Φ^α(t, s) ∂_x f(s, X^α(s), α(s)) ds | F_t ],

where the process

Φ^α(t, s) = e^{∫_t^s ∂_x b(u, X^α(u), α(u)) du}, 0 ≤ t ≤ s ≤ T, (3)

is the first variation process (in the Sobolev sense) of the dynamical system X^{α,x} solving (1) with initial condition X^{α,x}(0) = x. This suggests the form of the adjoint process when b is not differentiable, since it is well known that, despite the roughness of the drift b, the dynamical system X^{α,x} is still differentiable (at least in the Sobolev sense) due to Brownian regularization [25], and therefore admits a flow. The crux of our argument is to use this Sobolev differentiable stochastic flow to define the adjoint process (rather than the adjoint equation) in the non-smooth case, and thereby to prove necessary and sufficient conditions of optimality.
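To see why this conditional-expectation formula is the natural candidate, one can verify directly in the smooth case that it solves the adjoint equation. The following is a sketch of this standard computation, using only the flow property of Φ^α.

```latex
% Sketch (smooth case): the representation solves the adjoint equation.
% By the flow property \Phi^\alpha(0,t)\,\Phi^\alpha(t,s) = \Phi^\alpha(0,s), the process
\begin{aligned}
M(t) &:= \Phi^\alpha(0,t)\,Y(t)
        + \int_0^t \Phi^\alpha(0,s)\,\partial_x f(s,X^\alpha(s),\alpha(s))\,ds \\
     &\,= \mathbb{E}\Big[\Phi^\alpha(0,T)\,\partial_x g(X^\alpha(T))
        + \int_0^T \Phi^\alpha(0,s)\,\partial_x f(s,X^\alpha(s),\alpha(s))\,ds
        \,\Big|\,\mathcal{F}_t\Big]
\end{aligned}
% is a martingale, so dM(t) = \widetilde Z(t)\,dB(t). Since \Phi^\alpha(0,\cdot) has
% finite variation,
%   d\big(\Phi^\alpha(0,t)\,Y(t)\big)
%     = \Phi^\alpha(0,t)\big(dY(t) + \partial_x b(t,X^\alpha(t),\alpha(t))\,Y(t)\,dt\big),
% and equating the two expressions for dM(t) yields
%   dY(t) = -\big(\partial_x f(t,X^\alpha(t),\alpha(t))
%           + \partial_x b(t,X^\alpha(t),\alpha(t))\,Y(t)\big)\,dt + Z(t)\,dB(t),
% with Z(t) := \Phi^\alpha(0,t)^{-1}\,\widetilde Z(t) and Y(T) = \partial_x g(X^\alpha(T)).
```

This computation is only valid when b is differentiable; the point of the paper is that the formula itself still makes sense when only the Sobolev flow exists.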
Throughout this work, the functions f and g are assumed to be continuously differentiable with bounded first derivatives. In particular, f and g are of linear growth. The main results of this work are the following necessary and sufficient conditions in the Pontryagin stochastic maximum principle.

Theorem 1.1. Assume that the drift admits the decomposition

b(t, x, a) = b_1(t, x) + b_2(t, x, a),

where b_1 is a bounded, Borel measurable function and b_2 is bounded, measurable, and continuously differentiable in its second and third variables with bounded derivatives. Let α ∈ A be an optimal control and let X^α be the associated optimal trajectory. Then the flow Φ^α of X^α is well-defined and it holds that

E[ ∫_0^T ∂_a H(t, X^α(t), Y^α(t), α(t)) · (β(t) − α(t)) dt ] ≤ 0 for every β ∈ A, (4)

where Y^α is the adjoint process given by

Y^α(t) = E[ Φ^α(t, T) ∂_x g(X^α(T)) + ∫_t^T Φ^α(t, s) ∂_x f(s, X^α(s), α(s)) ds | F_t ]. (5)

Theorem 1.2. Let the conditions of Theorem 1.1 be satisfied, and further assume that g and (x, a) ↦ H(t, x, y, a) are concave. Let α ∈ A satisfy

H(t, X^α(t), Y^α(t), α(t)) = sup_{a ∈ A} H(t, X^α(t), Y^α(t), a) P ⊗ dt-a.s., (6)

with Y^α given by (5). Then α is an optimal control.
We will elaborate on the conditions imposed in the above theorems in Section 3.1. Let us at this point remark that these results correspond exactly to the classical version of the stochastic maximum principle when b is smooth. The only difference here is that the process Φ^α may seem abstract, as it is obtained from an existence result (for the flow). As noted in [4], it turns out that when the drift is not smooth, the flow Φ^α still admits an explicit representation very similar to (3). This representation will be extended to the present controlled case (see Theorem A.1) and will be used in the proof of the maximum principle.
The remainder of the article is dedicated to the proofs of Theorems 1.1 and 1.2. The necessary condition is proved in the next section and the sufficient condition in Section 3. The paper ends with an appendix on explicit representations of the flows of SDEs with measurable and random drifts.

The necessary condition for optimality
The goal of this section is to prove Theorem 1.1. Let us first make precise the definition of the set of admissible controls. Let A ⊆ R^m be a closed convex subset of R^m. The set of admissible controls A is defined as the set of progressively measurable processes α : [0, T] × Ω → A such that (1) has a unique strong solution and such that a uniform moment bound holds for some constant M > 0. The difficulty in the existence and uniqueness for (1) is the fact that the drift b is both non-smooth and depends on the random term α. Such equations were treated in [22]. In fact, consider the set A′ of progressively measurable processes α : [0, T] × Ω → A which are Malliavin differentiable (with Malliavin derivative D_s α(t)) and for which there are constants C, η > 0 (possibly depending on α) such that the conditions (7) and (8) on D_s α(t) hold. It follows from [22, Theorem 1.2] that if the drift satisfies the conditions of Theorem 1.1, then the SDE (1) is uniquely solvable for every α ∈ A′. Since we do not make use of Malliavin differentiability in the present article, we restrict ourselves to the set of admissible controls A. For later reference, note that for every α ∈ A the solution X^α satisfies standard moment estimates.

In the rest of the article, we let b_n be the sequence of functions defined by b_n := b_{1,n} + b_2, where (b_{1,n}) is a sequence of smooth functions with compact support converging a.e. to b_1. Since b_1 is bounded, the sequence (b_{1,n}) can also be taken bounded. We denote by X^α_n the solution of the SDE (1) with drift b replaced by b_n. This process is clearly well-defined since b_n is Lipschitz continuous in the state variable. Similarly, we denote by J_n and V_n the performance functional and the value function, respectively, of the problem in which the drift b is replaced by b_n; that is, we put

J_n(α) := E[ ∫_0^T f(s, X^α_n(s), α(s)) ds + g(X^α_n(T)) ] and V_n(x_0) := sup_{α ∈ A} J_n(α).

Furthermore, we denote by δ the distance on A with respect to which Ekeland's variational principle will be applied. The general idea of the proof is to start by showing that an optimal control for the problem (2) is also optimal for an appropriate perturbation of the approximating problem with value V_n(x_0). This is due to the celebrated variational principle of Ekeland. The maximum principle for control problems with smooth drifts will involve the state process X^{ᾱ_n}_n and its flow Φ^{ᾱ_n}_n. The last and most demanding step is to pass to the limit and show some form of "stability" of the maximum principle. We first address this limit step by a few intermediary technical lemmas that will be brought together to prove Theorem 1.1 at the end of this section.
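For the reader's convenience, we recall the form of Ekeland's variational principle used below (a standard statement, see e.g. [12]):

```latex
\textbf{Ekeland's variational principle.} Let $(E,\delta)$ be a complete metric space and
let $F : E \to \mathbb{R} \cup \{+\infty\}$ be lower semicontinuous and bounded from below.
If $F(\alpha_\varepsilon) \le \inf_{\alpha \in E} F(\alpha) + \varepsilon$ for some
$\varepsilon > 0$ and $\alpha_\varepsilon \in E$, then there exists $\bar\alpha \in E$
such that
\begin{aligned}
&\text{(a)}\;\; F(\bar\alpha) \le F(\alpha_\varepsilon), \qquad
 \text{(b)}\;\; \delta(\bar\alpha, \alpha_\varepsilon) \le \sqrt{\varepsilon}, \\
&\text{(c)}\;\; F(\bar\alpha) < F(\alpha) + \sqrt{\varepsilon}\,\delta(\alpha,\bar\alpha)
 \quad \text{for all } \alpha \ne \bar\alpha.
\end{aligned}
```

In the proof of Theorem 1.1, the principle is applied with E = A and F an appropriate sign-adjusted, perturbed performance functional.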
Lemma 2.1. We have the following convergences: (i) for every sequence (α_n) in A converging to some α ∈ A, it holds that X^{α_n}_n(t) → X^α(t) in L^2 for every t ∈ [0, T]; (ii) given k ∈ N, for every sequence (α_n) in A converging to some α ∈ A, it holds that X^{α_n}_k(t) → X^α_k(t) in L^2 for every t ∈ [0, T].

Proof. Adding and subtracting the same term and then using the fundamental theorem of calculus, we arrive at a decomposition of the drift difference in which the process Λ_n(λ, t) := λX^{α_1}_n(t) + (1 − λ)X^{α_2}(t) appears. Therefore, the difference X^{α_1}_n − X^{α_2} admits an explicit integral representation. Hence, taking expectations on both sides and then using the Cauchy-Schwarz inequality twice, we obtain an estimate (9) in two terms. By the Lipschitz continuity of b_2, the last term on the right hand side is estimated using the Lipschitz constant of b_2, giving (10).

Moreover, the second integral on the right hand side of (9) can be further estimated as follows: for some constant C > 0, it is controlled under the probability measure Q whose density is given in (11). Note that we used the Cauchy-Schwarz inequality and then the fact that b is bounded to get E[(dQ/dP)^{−1}] ≤ C. By Girsanov's theorem, under the measure Q the process (X^{α_2}(t) − x_0)σ^⊤/|σ|^2 is a Brownian motion. Thus, using the density of the Brownian motion, we obtain for every p ≥ 1 a Gaussian-kernel bound, and by Fubini's theorem this shows the convergence (12). Let us now turn our attention to the first term in (9). Since Λ(λ, t) is an affine combination of X^{α_1}_n(t) and X^{α_2}(t), we use Jensen's inequality, Girsanov's theorem as above and the Lipschitz continuity of b_2 to obtain (13) and (14), where we used the fact that b_{λ,α_2} is bounded. Therefore, putting together (9), (10), (12) and (14) concludes the proof of (i).
Since b_k is Lipschitz continuous, the convergence (ii) follows by classical arguments; the proof is omitted.
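The Girsanov-type change of measure used in the proof above can be illustrated with a quick Monte Carlo sanity check. For a constant drift θ and a bounded test function h, Girsanov's theorem gives E[h(B_T + θT)] = E[h(B_T) exp(θB_T − θ²T/2)]; the drift value and test function below are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
T, theta, n = 1.0, 0.7, 2_000_000
B = np.sqrt(T) * rng.standard_normal(n)    # samples of B_T ~ N(0, T)

h = np.tanh                                # arbitrary bounded test function
lhs = h(B + theta * T).mean()              # simulate the drifted process directly
# reweight driftless samples with the Girsanov density exp(theta*B_T - theta^2 T/2)
rhs = (h(B) * np.exp(theta * B - 0.5 * theta**2 * T)).mean()

print(lhs, rhs)
```

The two Monte Carlo estimates agree up to sampling error, which is exactly the mechanism by which the proofs trade the diffusion X^{α_2} for a Brownian motion under Q.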
Lemma 2.2. Let α ∈ A and let (α_n) be a sequence of admissible controls such that δ(α_n, α) → 0. Then it holds that J_k(α_n) → J_k(α) for every fixed k ∈ N, and J_n(α_n) → J(α). In particular, the function J_k is continuous on (A, δ).

Proof. (i) The continuity of J_k easily follows from the Lipschitz continuity of f and g. In fact, |J_k(α_n) − J_k(α)| is controlled by the distance between the corresponding state processes, and the convergence follows by dominated convergence and Lemma 2.1.

(ii) is also a direct consequence of Lemma 2.1, since Fubini's theorem and the Lipschitz continuity of f and g, used as in part (i) above, imply an estimate of |J_n(α_n) − J(α)| whose right hand side converges to zero by Lemma 2.1.
The next lemma pertains to the stability of the adjoint process with respect to the drift and the control process. This result is based on similar stability properties for stochastic flows. Given x ∈ R and the solution X^{α,x} of the SDE (1) with initial condition X^{α,x}(t) = x, the first variation process of X^{α,x} is the derivative Φ^α(t, s) of the function x ↦ X^{α,x}(s). Existence and properties of this Sobolev differentiable flow have been extensively studied by Kunita [15] for equations with sufficiently smooth coefficients. In particular, when the drift b is Lipschitz continuous and continuously differentiable, the function Φ^α(t, s) exists and, for almost every ω, is the (classical) derivative of x ↦ X^{α,x}(s). The case of measurable (deterministic) drifts is studied by Mohammed et al. [25] and extended to measurable and random drifts in [22]. These works show that, when b is measurable, X^{α,·}(s) ∈ L^2(Ω, W^{1,p}(U)) for every s ∈ [t, T] and p > 1, where W^{1,p}(U) is the usual Sobolev space and U is an open and bounded subset of R. That is, Φ^α(t, s) exists and is the weak derivative of X^{α,·}(s).
The proof of the stability result will make use of an explicit representation of the process Φ^α in terms of the time-space local time. Recall that for a ∈ R and a continuous semimartingale X = {X(t), t ≥ 0}, the local time L^X(t, a) of X at a is defined by the Tanaka-Meyer formula

|X(t) − a| = |X(0) − a| + ∫_0^t sgn(X(s) − a) dX(s) + L^X(t, a),

where sgn(x) = −1_{(−∞,0]}(x) + 1_{(0,+∞)}(x). The local time-space integral plays a crucial role in the representations of the Sobolev derivative of the flow of the solution to the SDE (1). It is defined for functions in the space (H_x, ‖·‖_x), defined (see e.g. [9]) as the space of Borel measurable functions f : [0, T] × R → R with finite norm ‖f‖_x. Since b_1 is bounded, we obviously have b_1 ∈ H_x for every x. Moreover, it follows from [11] (see also [4]) that for every continuous semimartingale X, the local time-space integral of f ∈ H_x with respect to L^X(t, z) is well defined and satisfies the integration by parts formula (15) for every function f ∈ H_x that is continuous in space and admits a continuous derivative ∂_x f(s, ·); see [11, Lemma 2.3]. This representation allows us to derive the following:

Lemma 2.3. There is a constant C > 0 such that E[e^{∫_t^s ∫_R b_{1,n}(u,z) L^{X^{α,x}}(du,dz)}] ≤ C for every n ∈ N, and the same bound holds with b_{1,n} replaced by b_1.

Proof. First observe that for every n ∈ N, it follows by the Cauchy-Schwarz inequality that the expectation can be estimated under the probability measure Q given as in (11), with α_2 therein replaced by α. Hence, since (X^{α,x} − x_0)σ^⊤/|σ|^2 is a Brownian motion under Q, it follows by (15) that the bound holds for some constant C > 0 which does not depend on n, where the latter inequality follows by Lemma A.3. Since b_1 is bounded and b_{1,n} converges to b_1 pointwise, it follows from [11] that ∫_t^s ∫_R b_{1,n}(u, z) L^{X^{α,x}}(du, dz) converges to ∫_t^s ∫_R b_1(u, z) L^{X^{α,x}}(du, dz) as n goes to infinity. Thus, it follows by dominated convergence that E[e^{∫_t^s ∫_R b_1(u,z) L^{X^{α,x}}(du,dz)}] ≤ C.
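The Tanaka-Meyer definition above can be checked numerically against the occupation-time characterization L^X(t, a) = lim_{ε→0} (1/2ε)|{s ≤ t : |X(s) − a| ≤ ε}| for Brownian motion. The discretization parameters below are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(1)
T, n = 1.0, 400_000
dt = T / n
dB = np.sqrt(dt) * rng.standard_normal(n)
B = np.concatenate(([0.0], np.cumsum(dB)))       # Brownian path with B(0) = 0

a = 0.0
# sgn convention from the text: sgn(x) = -1 on (-inf, 0], +1 on (0, inf)
sgn = np.where(B[:-1] - a > 0.0, 1.0, -1.0)
# discrete Tanaka-Meyer formula: L(t,a) = |B_t - a| - |B_0 - a| - int sgn(B - a) dB
tanaka = np.abs(B[-1] - a) - np.abs(B[0] - a) - np.sum(sgn * dB)

eps = 0.02
# occupation-time approximation of the local time at level a
occupation = np.sum(np.abs(B[:-1] - a) < eps) * dt / (2.0 * eps)

print(tanaka, occupation)
```

Both quantities approximate the same pathwise local time L^B(1, 0), which is why the local time-space integral below is a meaningful substitute for the missing derivative of b_1.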
We are now ready to prove stability of the flow and of the adjoint processes.
Lemma 2.4. Let α ∈ A and let (α_n) be a sequence of admissible controls such that δ(α_n, α) → 0. Then the processes X^{α_n}_n and X^α admit Sobolev differentiable flows, denoted Φ^{α_n}_n and Φ^α respectively, and for every 0 ≤ t ≤ s ≤ T it holds that (i) Φ^{α_n}_n(t, s) → Φ^α(t, s) in L^2(Ω) and (ii) Y^{α_n}_n(t) → Y^α(t) in L^1(Ω), where Y^α is the adjoint process defined in (5) and Y^{α_n}_n is defined similarly, with (X^α, α, Φ^α) replaced by (X^{α_n}_n, α_n, Φ^{α_n}_n).

Proof. The existence of the process Φ^{α_n}_n is standard; it follows for instance from [14]. The existence of the flow Φ^α follows from [22, Theorem 1.3]. We start by proving the first convergence claim. As explained above, these processes admit explicit representations in terms of the time-space local time. In fact, it follows from Theorem A.1 that Φ^α admits the representation

Φ^α(t, s) = e^{∫_t^s ∫_R b_1(u,z) L^{X^{α,x}}(du,dz)} e^{∫_t^s ∂_x b_2(u, X^{α,x}(u), α(u)) du},

and Φ^{α_n}_n admits the same representation with (b_1, X^{α,x}, α) replaced by (b_{1,n}, X^{α_n,x}_n, α_n). Using these explicit representations and Hölder's inequality, then splitting up the fourth-power terms and applying Hölder's and Young's inequalities, we can continue the estimation in terms of six quantities I_1, I_{2,n}, I_{3,n}, I_{4,n}, I_{5,n} and I_{6,n}. It follows from Lemma 2.3 that I_1 and I_{5,n} are bounded. Since ∂_x b_2 is bounded, it follows that I_{2,n} and I_{4,n} are also bounded, with bounds independent of n. Let us now show that I_{3,n} and I_{6,n} converge to zero. We show only the convergence of I_{6,n}, since that of I_{3,n} follows (at least along a subsequence) from Lemma 2.1 and dominated convergence, because ∂_x b_2 is continuous and bounded.
To that end, further define the processes A^{α_n}_n and A^α by A^{α_n}_n := e^{∫_t^s ∫_R b_{1,n}(u,z) L^{X^{α_n,x}_n}(du,dz)} and A^α := e^{∫_t^s ∫_R b_1(u,z) L^{X^{α,x}}(du,dz)}.
In order to show that A^{α_n}_n converges to A^α in L^2(Ω), we will show that A^{α_n}_n converges weakly to A^α in L^2(Ω) and that the second moments converge. We first prove the weak convergence. Since the stochastic exponentials of deterministic integrands span a dense subspace of L^2(Ω), it is enough to show convergence of the expectations tested against such random variables. Denote by X̃^{α_n,x}_n and X̃^{α,x} the processes obtained by shifting the driving path by the Cameron-Martin direction ϕ. Observe that these processes are well-defined, and we have X̃^{α,x}(t, ω) = X^{α,x}(t, ω + ϕ) and X̃^{α_n,x}_n(t, ω) = X^{α_n,x}_n(t, ω + ϕ). Using the Cameron-Martin-Girsanov theorem as in the proof of Lemma 2.3, and then another application of Hölder's inequality, we obtain an estimate in terms of five quantities J_{1,n}, ..., J_{5,n}. Using Lemma A.2, it follows that J_{2,n} converges to zero, and by dominated convergence J_{5,n} also converges to zero. Thanks to Lemma A.3 and the boundedness of b_{1,n} (respectively b_1), the term J_{3,n} (respectively J_{4,n}) is bounded. The bound on J_{1,n} follows from the uniform boundedness of (u_n).
It remains to show the convergence of the second moments. Using the Girsanov transform as in the proof of Lemma 2.3, the elementary inequality |e^x − e^y| ≤ |x − y||e^x + e^y| and the Cauchy-Schwarz inequality, we obtain an estimate involving the random variables F_{1,n} and V_n introduced above. By Lemma A.2, F_{1,n} converges to zero in L^2(Ω). Using arguments similar to [4, Lemma A.3], one can show that V_n converges to zero in L^2(Ω) by the boundedness of (u_n) and the definition of the distance δ. Observe, however, that in this case u_n depends on α_n and not on α as in [4, Lemma A.3]. Nevertheless, using the fact that b_{1,n}, b_1 and b_2 are bounded and Lipschitz continuous in the second variable, one can show by the dominated convergence theorem and reasoning similar to (10) that the overall term converges to zero. It is also worth mentioning that the other terms are uniformly bounded by application of Girsanov's theorem and/or Lemma A.3 to the uniformly bounded sequences (u_n)_{n≥1}, (b_{1,n})_{n≥1} and the bounded functions u, b_1.
Let us now turn our attention to the proof of (ii). Compute the difference Y^{α_n}_n(t) − Y^α(t), add and subtract the terms Φ^α(t, T)∂_x g(X^{α_n}_n(T)) and ∫_t^T Φ^α(t, u) ∂_x f(u, X^{α_n}_n(u), α_n(u)) du, and then apply Hölder's inequality to obtain an estimate (23) in four terms, for some constant C_T depending only on T. Since the process Φ^α is square integrable (see [22, Theorem 1.3]), it follows by boundedness and continuity of ∂_x g and ∂_x f as well as Lemma 2.1 that the first and third terms converge to zero as n goes to infinity. Moreover, by boundedness of ∂_x f and ∂_x g and the L^2 convergence of Φ^{α_n}_n(t, u) to Φ^α(t, u) established in part (i), we conclude that the second and last terms in (23) converge to zero, which shows (ii).
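The Sobolev differentiability of x ↦ X^{α,x}(s) discussed above can be probed numerically by a finite-difference quotient along a common Brownian path. The drift, control and step sizes below are illustrative assumptions only, not the setting of any specific result in the paper.

```python
import numpy as np

def euler_path(x0, dB, dt, alpha=0.5):
    """Euler scheme for dX = (b1(X) + alpha) dt + dB with measurable b1."""
    x = x0
    for db in dB:
        b1 = 1.0 if x < 0.0 else -1.0   # discontinuous drift part b_1
        x += (b1 + alpha) * dt + db
    return x

rng = np.random.default_rng(3)
T, n = 1.0, 100_000
dt = T / n
dB = np.sqrt(dt) * rng.standard_normal(n)   # one common noise path

h = 1e-4
# finite-difference proxy for the first variation Phi(0, T) at x = 0.5
flow_fd = (euler_path(0.5 + h, dB, dt) - euler_path(0.5, dB, dt)) / h
print(flow_fd)
```

The quotient stays finite and stable even though b_1 is discontinuous, which is the numerical shadow of the Sobolev differentiable flow whose existence [22, Theorem 1.3] guarantees.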
Proof (of Theorem 1.1). Let α be an optimal control and n ≥ 1 fixed. Observe that by the linear growth of f and g, the function J_n is bounded from above. By Lemma 2.2, the function J_n is also continuous on (A, δ), and there exists ε_n ↓ 0 such that |J_n(α′) − J(α′)| ≤ ε_n for every α′ ∈ A; in particular, −J_n(α) ≤ inf_{α′ ∈ A} (−J_n(α′)) + 2ε_n. Thus, by Ekeland's variational principle (applied to −J_n, see e.g. [12]), there is a control ᾱ_n ∈ A such that δ(α, ᾱ_n) ≤ (2ε_n)^{1/2} and −J_n(ᾱ_n) ≤ −J_n(α′) + (2ε_n)^{1/2} δ(ᾱ_n, α′) for every α′ ∈ A. In other words, putting J^ε_n(α′) := J_n(α′) − (2ε_n)^{1/2} δ(ᾱ_n, α′), the control process ᾱ_n is optimal for the problem with performance functional J^ε_n. Now, let β ∈ A be an arbitrary control and ε > 0 a fixed constant. By convexity of A, it follows that ᾱ_n + εη ∈ A, with η := β − ᾱ_n. Since b_n is sufficiently smooth, it is standard that the functional J_n is Gâteaux differentiable (see [5, Lemma 4.8]) and its Gâteaux derivative in the direction η is given by

dJ_n(ᾱ_n) · η = E[ ∂_x g(X^{ᾱ_n}_n(T)) V_n(T) + ∫_0^T ( ∂_x f(t, X^{ᾱ_n}_n(t), ᾱ_n(t)) V_n(t) + ∂_a f(t, X^{ᾱ_n}_n(t), ᾱ_n(t)) η(t) ) dt ],

where V_n is the stochastic process solving the linear equation

dV_n(t) = ( ∂_x b_n(t, X^{ᾱ_n}_n(t), ᾱ_n(t)) V_n(t) + ∂_a b_n(t, X^{ᾱ_n}_n(t), ᾱ_n(t)) η(t) ) dt, V_n(0) = 0,

and V_n satisfies a moment bound with constant C_M > 0 depending on the constant M introduced in the definition of A. Therefore, J^ε_n is also Gâteaux differentiable and, since ᾱ_n is optimal for J^ε_n, we have dJ_n(ᾱ_n) · η ≤ C_M (2ε_n)^{1/2}.
We now take the limit on both sides as n goes to infinity. It follows by Lemma 2.1 and Lemma 2.4, respectively, that X^{ᾱ_n}_n(t) → X^α(t) and Y^{ᾱ_n}_n(t) → Y^α(t) P-a.s. for every t ∈ [0, T]. Since ᾱ_n → α, we therefore conclude that

E[ ∫_0^T ∂_a H(t, X^α(t), Y^α(t), α(t)) · (β(t) − α(t)) dt ] ≤ 0.

This shows (4), which concludes the proof.

The sufficient condition for optimality
Let us now turn to the proof of the sufficient condition of optimality. Since we need to preserve, after approximation, the concavity of H assumed in Theorem 1.2, we specifically assume that the function b_n is defined by standard mollification. Then H_n(t, x, y, a) := f(t, x, a) + b_n(t, x, a)y is a mollification of H and thus remains concave in (x, a).
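The fact that mollification preserves concavity, used above, follows because a convolution (f ∗ ρ)(x) = ∫ f(x − y) ρ(y) dy with a nonnegative kernel ρ is a mixture of concave translates of f, hence concave. A small numerical sketch (the grid and kernel are arbitrary illustrative choices):

```python
import numpy as np

# Concave but non-smooth function, e.g. f(x) = -|x|.
h = 0.01
x = np.arange(-5.0, 5.0 + h, h)
f = -np.abs(x)

# Nonnegative mollifier (truncated Gaussian), normalized to sum to 1.
u = np.arange(-1.0, 1.0 + h, h)
ker = np.exp(-u**2 / (2 * 0.1**2))
ker /= ker.sum()

g = np.convolve(f, ker, mode="same")     # smoothed version of f

# Discrete midpoint concavity: second differences must be <= 0,
# checked away from the convolution boundary artifacts.
second_diff = g[2:] - 2 * g[1:-1] + g[:-2]
interior = slice(len(u), len(x) - len(u))
print(second_diff[interior].max())
```

The maximal interior second difference is nonpositive up to floating-point error, confirming that smoothing did not destroy concavity.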
Proof (of Theorem 1.2). Let α ∈ A satisfy (6) and let α′ be an arbitrary element of A. We would like to show that J(α) ≥ J(α′). Let n ∈ N be arbitrarily chosen. By definition, we can write the difference J_n(α) − J_n(α′) in terms of H_n, where we use the definition of H_n and the fact that g is concave. Since Y^α_n is defined as in (5) with smooth coefficients, it follows by the martingale representation theorem and Itô's formula that there is a square integrable progressive process (Y^α_n, Z^α_n) such that Y^α_n satisfies the (linear) equation

dY^α_n(t) = −( ∂_x f(t, X^α_n(t), α(t)) + ∂_x b_n(t, X^α_n(t), α(t)) Y^α_n(t) ) dt + Z^α_n(t) dB(t), Y^α_n(T) = ∂_x g(X^α_n(T)).

Recall that since b_n is smooth, so is H_n. Therefore, by Itô's formula once again, applied to the product Y^α_n (X^α_n − X^{α′}_n), and since the stochastic integral above is a local martingale, a standard localization argument allows us to take expectations on both sides; the resulting lower bound on J_n(α) − J_n(α′) then follows by concavity of H_n.
Coming back to the expression of interest J(α) − J(α′): since b_{1,n} does not depend on the control, the terms involving b_{1,n} in H_n(t, x, y, α(t)) − H_n(t, x, y, α′(t)) cancel. Therefore, taking the limit as n goes to infinity, it follows by Lemmas 2.1, 2.2 and 2.4 that the lower bound obtained above holds with (H_n, X^α_n, Y^α_n) replaced by (H, X^α, Y^α). Since α satisfies (6), this lower bound is nonnegative, and we therefore conclude that J(α) ≥ J(α′).
3.1. Concluding remarks. Let us conclude the paper by briefly discussing our assumptions. The condition b = b_1 + b_2 seems essential to derive existence and uniqueness results for the controlled system. For instance, the crucial bound (14) derived in [4; 21] is unknown when b_1 depends on α. This condition is also vital in obtaining the explicit representation of the Sobolev derivative of the flow of the solution to the SDE in terms of its local time. This representation cannot be expected in several dimensions due to the non-commutativity of matrices and the local time. Therefore, much stronger (regularity) conditions are needed to derive the maximum principle in that case (see for example [1; 2; 3]). Note in addition that the boundedness assumption on b is made mostly to simplify the presentation. The results should also hold for b of linear growth in the spatial variable, albeit with more involved computations and with T small enough, since the flow in this case is expected to exist only in small time.
Given the drift b, some known conditions on the control α that guarantee existence and uniqueness of the strong solution to the SDE (1) satisfied by the controlled process are given by (7) and (8). These conditions involve the Malliavin derivative of α. Let us remark that Malliavin differentiability of the control is not an uncommon assumption. This condition appears implicitly in the works [20; 23; 26] on the stochastic maximum principle, where the coefficients are required to be at least twice differentiable with bounded derivatives.
We have, by using the Cameron-Martin theorem, the inequality |e^x − e^y| ≤ |x − y||e^x + e^y|, Hölder's inequality and the boundedness of b, that the first term is bounded. The second term on the right hand side converges to zero, since one can show as in Lemma 2.1 that X̃^{α,x}_n(s) converges strongly to X^{α,x}(s) in L^2 and b′_2 is bounded and continuous. We now show that the remaining term converges to zero, by establishing weak convergence and convergence in mean square. Using the Cameron-Martin-Girsanov theorem as above for every Cameron-Martin direction ϕ, and then the inequality |e^x − e^y| ≤ |x − y||e^x + e^y| and Hölder's inequality, we obtain an estimate in terms of J_{1,n}, ..., J_{5,n}. Lemma A.2 shows that J_{2,n} converges to zero, and the convergence to zero of J_{5,n} follows by dominated convergence. Thanks to Lemma A.3 and the boundedness of b_{1,n} and b_1, respectively, the term J_{3,n} (respectively J_{4,n}) is bounded. The bound on J_{1,n} follows from the uniform boundedness of (u_n).

Each term above converges to zero. We give the details only for the first term; the treatment of the two other terms is analogous. Given p > 1, using the density of the Brownian motion, we have, as in the proof of Lemma 2.1 (see (12)), a bound by a constant multiple of

∫_R |b_{1,n}(s, y) − b_1(s, y)|^p e^{−y²/(4s)} dy.

Since b_{1,n} converges to b_1 a.e., it follows from the dominated convergence theorem that each term in the above inequality converges to zero.

The following lemma corresponds to [4, Lemma A.2] and gives an exponential bound for the local time-space integral of a bounded function.

Lemma A.3. Let b : [0, T] × R → R be a bounded and measurable function. Then for t ∈ [0, T], λ ∈ R and every compact subset K ⊂ R, we have

sup_{x∈K} E[ exp( λ ∫_0^t ∫_R b(s, y) L^{B^x}(ds, dy) ) ] ≤ C(‖b‖_∞),

where C is an increasing function and L^{B^x}(ds, dy) denotes integration with respect to the local time of the Brownian motion B^x in both time and space. In addition, if (b_n) is an approximating sequence of b such that the b_n are uniformly bounded by ‖b‖_∞, then the above bound still holds with the constant independent of n.