Impulse control of conditional McKean-Vlasov jump diffusions

This paper establishes a verification theorem for impulse control problems involving conditional McKean-Vlasov jump diffusions. We obtain a Markovian system by combining the state equation of the problem with the stochastic Fokker-Planck equation for the conditional probability law of the state. We derive sufficient variational inequalities for a function to be the value function of the impulse control problem, and for an impulse control to be the optimal control. We illustrate our results by applying them to the study of an optimal stream of dividends under transaction costs. We obtain the solution explicitly by finding a function and an associated impulse control which satisfy the verification theorem.

Throughout, N(dt, dζ) denotes a Poisson random measure with Lévy measure ν(dζ), and Z ∈ L²(P) is a random variable independent of F. Here L²(P) denotes the set of all d-dimensional F-measurable random variables X such that E[|X|²] < ∞, where E denotes expectation with respect to P. We consider the state process X(t) ∈ R^d given as the solution of the conditional McKean-Vlasov jump equation

dX(t) = α(t, X(t), µ_t) dt + β(t, X(t), µ_t) dB(t) + ∫_{R^k} γ(t, X(t⁻), µ_t, ζ) Ñ(dt, dζ);  X(0) = Z,   (1.1)

where Ñ is the compensated Poisson random measure and µ_t = L(X(t)|F^(1)_t) denotes the conditional probability distribution of X(t) given the filtration F^(1)_t generated by the first component {B_1(u); u ≤ t} of the Brownian motion up to time t. Loosely speaking, equation (1.1) models a McKean-Vlasov dynamics which is subject to a so-called "common noise" coming from the Brownian motion B_1(t), which is observed and influences the dynamics of the system. So defined, µ_t is a Borel probability measure on R^d for all t ∈ [0, T], ω ∈ Ω. In particular, µ_t ∈ M_0, where M_0 denotes the set of deterministic Radon measures, i.e. Borel measures which are finite on compact sets, outer regular on all Borel sets and inner regular on all open sets. Notice that all Borel probability measures on R^d are Radon measures. From now on we denote by M the set of random measures λ(dx, ω) which are Radon measures with respect to x for each ω. We refer to [9] for more information. We assume that the coefficients α(t, x, µ), β(t, x, µ) and γ(t, x, µ, ζ) are bounded and F-predictable for all x, µ, ζ, and continuous with respect to t and x for all µ, ζ. Under hypotheses of Lipschitz continuity and at most linear growth, one easily sees that (1.1) has a unique solution for all t ∈ [0, T].
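The role of the common noise B_1 can be illustrated by a particle approximation: the conditional law µ_t is approximated by the empirical measure of n particles which all share one realization of B_1 but carry independent copies of the remaining noise. The following is a minimal Euler-scheme sketch in one dimension without jumps; the specific coefficients (drift α(t, x, µ) = a⟨µ, q⟩ with q(x) = x, constant diffusion coefficients) are illustrative choices, not taken from the paper.

```python
import numpy as np

def simulate_particles(n=2000, steps=100, T=1.0, a=0.1, b1=0.2, b2=0.3,
                       x0=1.0, seed=0):
    """Euler scheme for dX = a*m_t dt + b1 dB1 + b2 dB2, m_t = E[X_t | B1].

    All particles share one path of the common noise B1; the conditional
    law mu_t is approximated by the empirical measure of the particles.
    """
    rng = np.random.default_rng(seed)
    dt = T / steps
    x = np.full(n, x0)
    for _ in range(steps):
        m = x.mean()                                # <mu_t, q>: empirical conditional mean
        dB1 = rng.normal(0.0, np.sqrt(dt))          # common noise: one draw for all particles
        dB2 = rng.normal(0.0, np.sqrt(dt), size=n)  # idiosyncratic noise: one draw per particle
        x = x + a * m * dt + b1 * dB1 + b2 * dB2
    return x

x = simulate_particles()
print(round(x.mean(), 3))
```

Because the increment dB1 is shared, the empirical mean itself stays random across runs, which is exactly the "conditional" feature of the law µ_t.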
The purpose of this paper is to study impulse control problems for conditional McKean-Vlasov jump diffusions. In particular, we define a performance criterion and then seek a policy that maximizes it within the admissible impulse strategies. Using a verification theorem approach, we establish a general form of quasi-variational inequalities and identify sufficient conditions for a candidate function to be the value function; see the precise formulation below. Standard impulse control problems can be solved by using the Dynkin formula; we refer, e.g., to Bensoussan & Lions [4] in the continuous case and to Øksendal & Sulem [12] in the setting of jump diffusions. Impulse control problems arise naturally in many concrete applications, in particular when an operator, because of the intervention costs, decides to control the system by intervening only at a discrete set of times with a chosen intervention size: a sequence of stopping times (τ_1, τ_2, …, τ_k, …) is chosen at which to intervene and exercise the control. At the time τ_k of the player's k-th intervention, the player chooses an intervention of size ζ_k. The impulse control consists of the sequence {(τ_k, ζ_k)}_{k≥1}. Impulse control has sparked great interest in the financial field and beyond; see, for example, [10] for applications to portfolio theory, [2] for energy markets, and [6] for insurance. All of these works are based on quasi-variational inequalities and employ a verification approach. Despite its adaptability to more realistic financial models, few papers have studied mean-field problems with impulse control. We refer to [3] for a discussion of a more special type of impulse, where the only admissible impulse is to add something to the system. That paper considers a mean-field game (MFG) in which the mean field (only the empirical mean) appears as an approximation of a many-player game; the authors use the smooth-fit principle (as used in the present work) to solve a specific MFG explicitly. We refer also to [7] for
an MFG impulse control approach; specifically, a problem of optimal harvesting in natural resource management is addressed there. A maximum principle for regime-switching control of mean-field jump diffusions is studied in [11], but the problem considered in that paper is not really an impulse control problem, because the intervention times are fixed in advance. In our setting, we do not consider an MFG setup as in the above-mentioned works; we consider only a single decision maker who chooses the control to optimise a certain reward. Moreover, the mean field appears as a conditional probability distribution, and to overcome the lack of the Markov property, we introduce the equation for this measure, which is of stochastic Fokker-Planck type. In [8], the authors can handle non-Markovian dynamics; however, the impulse control there is given in a particular compact form and only a fixed number of impulses is allowed. They use a Snell envelope approach and related reflected backward stochastic differential equations.

The paper is organised as follows. In the next section, we introduce some notation and present some preliminary results. In Section 3, we state the optimal control problem and prove the verification theorem. In Section 4, we apply the previous results to solve an explicit problem of optimal dividend streams under transaction costs.

Preliminaries
The process X(t) given by (1.1) is not in itself Markovian. To be able to use the Dynkin formula, we therefore extend the system to the process Y defined by Y(t) := (s + t, X(t), µ_t) for some arbitrary starting time s ≥ 0, with state dynamics given by X(t), conditional law of the state given by µ_t, and with X(0) = Z, µ_0 = L(X(0)). This system is Markovian, in virtue of the following Fokker-Planck equation for the conditional law µ_t, proved in [1].
Theorem 2.1 (Conditional stochastic Fokker-Planck equation) Let X(t) be as in (1.1) and let µ_t = µ_t(dx, ω) be the regular conditional distribution of X(t) given F^(1)_t. Then µ_t satisfies the following SPIDE (in the sense of distributions):

dµ_t = A*_0 µ_t dt + A*_1 µ_t dB_1(t),   (2.1)

where A*_0 is the integro-differential operator

A*_0 µ = −Σ_{j=1}^d D_j[α_j µ] + ½ Σ_{n,j=1}^d D_{n,j}[(ββ^(T))_{n,j} µ] + Σ_{ℓ=1}^k ∫_R { µ^(γ^(ℓ)) − µ + Σ_{j=1}^d D_j[γ^(ℓ)_j µ] } ν_ℓ(dζ),

and A*_1 is the differential operator

A*_1 µ = −Σ_{j=1}^d D_j[β_{j,1} µ],

where β^(T) denotes the transpose of the d × m matrix β = (β_{j,k})_{1≤j≤d, 1≤k≤m} and γ^(ℓ) is column number ℓ of the matrix γ.
For notational simplicity, we use D_j and D_{n,j} to denote ∂/∂x_j and ∂²/(∂x_n ∂x_j) in the sense of distributions. We have also used the following notation, taken from [1]. For fixed t, µ, ζ and ℓ = 1, 2, …, k, we write for simplicity γ^(ℓ) = γ^(ℓ)(t, x, µ, ζ) for column number ℓ of the d × k matrix γ, and ν_ℓ denotes the Lévy measure of N_ℓ for each ℓ. Note that for given µ ∈ M the map

g ↦ ⟨µ^(γ^(ℓ)), g⟩ := ∫_{R^d} g(x + γ^(ℓ)(t, x, µ, ζ)) µ(dx)

is a bounded linear map on C_0(R^d), which is defined to be the uniform closure of the space C_c(R^d) of continuous functions with compact support. Here ⟨µ^(γ^(ℓ)), g⟩ denotes the action of the measure µ^(γ^(ℓ)) on g. We call µ^(γ^(ℓ)) the γ^(ℓ)-shift of µ. Note that µ^(γ^(ℓ)) is positive and absolutely continuous with respect to µ.
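The γ^(ℓ)-shift can be checked directly on an empirical measure: if µ = (1/n) Σ_i δ_{x_i}, then ⟨µ^(γ), g⟩ = (1/n) Σ_i g(x_i + γ(x_i)). A small sketch, where the particular γ and test function g are illustrative choices:

```python
import numpy as np

def shift_pairing(xs, gamma, g):
    """Action <mu^(gamma), g> of the gamma-shift of the empirical measure
    mu = (1/n) sum_i delta_{x_i}, i.e. (1/n) sum_i g(x_i + gamma(x_i))."""
    xs = np.asarray(xs, dtype=float)
    return np.mean(g(xs + gamma(xs)))

xs = [0.0, 1.0, 2.0]
gamma = lambda x: 0.5 * x   # illustrative jump coefficient
g = lambda x: x**2          # illustrative test function

print(shift_pairing(xs, gamma, g))  # (0 + 2.25 + 9)/3 = 3.75
```

Positivity of µ^(γ) is evident here: the shift only moves the support points, keeping the (positive) weights.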

A General Formulation and a Verification Theorem
As noted above, in virtue of the Fokker-Planck equation (2.1) we can extend the system (1.1) to a Markovian system by defining the [0, ∞) × L²(P) × M-valued process

Y(t) := (s + t, X(t), µ_t),  t ≥ 0,

where X(t) and µ_t satisfy the equations (1.1) and (2.1), respectively. The process Y(t) starts at y = (s, Z, µ). We shall denote by µ the initial probability distribution L(X(0)), or the generic value of the conditional law µ_t := L(X(t)|F^(1)_t) when there is no ambiguity. Similarly, we use the following notation:

• X to denote a generic value of the random variable X(t, ·) ∈ L²(P);
• when the meaning is clear from the context, we use x in both situations.
The concept of impulse control is simple and intuitive: at any time, the agent can make an intervention ζ into the system. Due to the cost of each intervention, the agent can intervene only at discrete times τ_1, τ_2, …. The impulse problem is to find out at what times it is optimal to intervene, and what the corresponding optimal intervention sizes are. We now proceed to formulate our impulse control problem for conditional McKean-Vlasov jump diffusions precisely.
Suppose that at any time t and any state y = (s, X, µ) we are free to intervene and give the state X an impulse ζ ∈ Z ⊂ R^d, where Z is a given set (the set of admissible impulse values). Suppose the result of giving the state X the impulse ζ is that the state jumps immediately from X to Γ(X, ζ), where Γ : L²(P) × Z → L²(P) is a given function. In many applications the impulse acts as a simple translation, i.e. Γ(X, ζ) = X + ζ. Simultaneously, the conditional law jumps from µ_t = L(X|F^(1)_t) to

µ^Γ(X,ζ) := L(Γ(X, ζ)|F^(1)_t).   (3.2)

An impulse control for this system is a double (possibly finite) sequence

v = (τ_1, τ_2, …; ζ_1, ζ_2, …),

where τ_1 ≤ τ_2 ≤ … are stopping times (the intervention times) and ζ_1, ζ_2, … are the corresponding impulses at these times. Mathematically, we assume that each τ_j is a stopping time with respect to a suitable filtration {F_t}_{t≥0}, with τ_{j+1} ≥ τ_j, and that ζ_j is F_{τ_j}-measurable for all j. We let V denote the set of all impulse controls.
Given v ∈ V, the corresponding controlled process Y^(v) is obtained by applying the impulses ζ_j to the state at the times τ_j. Note that we distinguish between the (possible) jump of X^(v)(τ_j) stemming from the random measure N(t, ·), denoted by ∆_N X^(v)(τ_j), and the jump caused by the intervention v, denoted by ∆_v X^(v)(τ_j).

Consider a fixed open set S ⊂ [0, ∞) × R^d × M, called the solvency region. It represents the set in which the game takes place, since the game ends once the controlled process leaves S. In portfolio optimization problems, for instance, the game ends in case of bankruptcy, which may be modelled by choosing S to be the set of states where the capital is above a certain threshold. Define

τ_S := inf{t ∈ (0, ∞) : Y^(v)(t) ∉ S}.

Suppose we are given a continuous profit function f : S → R and a continuous bequest function g : S → R. Moreover, suppose the profit/utility of making an intervention with impulse ζ ∈ Z when the state is y is K(y, ζ), where K : S × Z → R is a given continuous function.
We assume we are given a set V of admissible impulse controls, contained in the set of all v = (τ_1, τ_2, …; ζ_1, ζ_2, …) such that a unique solution Y^(v) of (3.3)-(3.5) exists for all v ∈ V, and such that the following additional properties hold, ensuring that the performance functional below is well-defined, where E^y denotes expectation given that Y(0) = y.
We now define the performance criterion, which consists of three parts: a continuous-time running profit on [0, τ_S], a terminal bequest value if the game ends, and a discrete-time intervention profit, namely

J^(v)(y) := E^y[ ∫_0^{τ_S} f(Y^(v)(t)) dt + g(Y^(v)(τ_S)) 1_{τ_S < ∞} + Σ_{j: τ_j ≤ τ_S} K(Y^(v)(τ_j⁻), ζ_j) ].

We consider the following impulse control problem: find Φ and v* ∈ V such that

Φ(y) = sup_{v ∈ V} J^(v)(y) = J^(v*)(y).

The function Φ(y) is called the value function and v* is called an optimal control.
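As a toy illustration of the discrete-time intervention part of such a criterion, one can Monte-Carlo the expected discounted dividends of a fixed threshold strategy in a one-dimensional diffusion. The dynamics below (geometric, no common noise or jumps) and all parameter values are illustrative simplifications, not the model of this paper:

```python
import numpy as np

def expected_dividends(u_bar, x0=1.0, alpha=0.03, sigma=0.2, rho=0.05,
                       c=0.1, lam=0.1, T=50.0, steps=2000, n_paths=100, seed=0):
    """Monte-Carlo estimate of E[ sum_k exp(-rho*tau_k) * zeta_k ] for the
    threshold strategy: the first time X >= u_bar, take out
    zeta = (X - c)/(1 + lam), paying the fixed cost c and the proportional
    cost lam*zeta, which reduces the state to 0 and ends the game."""
    rng = np.random.default_rng(seed)
    dt = T / steps
    total = 0.0
    for _ in range(n_paths):
        x = x0
        z = rng.normal(size=steps)
        for k in range(steps):
            x += alpha * x * dt + sigma * x * np.sqrt(dt) * z[k]
            if x <= 0.0:
                break                       # bankruptcy: the game ends
            if x >= u_bar:
                zeta = (x - c) / (1 + lam)  # take out as much as possible
                total += np.exp(-rho * (k + 1) * dt) * zeta
                break                       # state drops to 0: game over
        # paths that never reach u_bar before T contribute no dividend
    return total / n_paths

val = expected_dividends(u_bar=1.5)
print(round(val, 3))
```

Sweeping `u_bar` in such a simulation gives a crude numerical counterpart of maximizing J^(v) over threshold strategies.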
The following concept is crucial for the solution of this problem.
Definition 3.3 Let H be the space of all measurable functions h : S → R. The intervention operator M : H → H is defined by

Mh(y) = sup_{ζ∈Z} { h(s, Γ(X, ζ), µ^Γ(X,ζ)) + K(y, ζ) },  y = (s, X, µ),

where µ^Γ(X,ζ) is given by (3.2).
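For intuition, the intervention operator can be evaluated numerically by a grid search over admissible impulses. The sketch below uses Γ(x, ζ) = x − c − (1 + λ)ζ as in the dividend example of Section 4, together with an intervention profit K = e^{−ρs}ζ (the discounted dividend, an assumption consistent with the performance criterion there); the dependence of h on the measure argument is suppressed, and the grid and parameters are illustrative:

```python
import numpy as np

def intervention_operator(h, x, s, c=0.5, lam=0.1, rho=0.05, n_grid=1000):
    """Grid approximation of Mh(x) = sup_zeta { h(Gamma(x, zeta)) + K(zeta) }
    with Gamma(x, zeta) = x - c - (1 + lam)*zeta and K(zeta) = exp(-rho*s)*zeta.
    Admissible zeta: zeta > 0 and Gamma(x, zeta) >= 0 (stay solvent)."""
    zeta_max = (x - c) / (1 + lam)
    if zeta_max <= 0:
        return -np.inf  # no admissible intervention from this state
    zetas = np.linspace(1e-9, zeta_max, n_grid)
    new_states = x - c - (1 + lam) * zetas
    values = h(new_states) + np.exp(-rho * s) * zetas
    return values.max()

h = lambda x: np.zeros_like(x)  # illustrative continuation value h = 0
print(round(intervention_operator(h, x=2.0, s=0.0), 4))
```

With h = 0 the supremum is attained by taking out as much as possible, mirroring the explicit solution found in Section 4.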
Let C^(1,2,2)(S) denote the family of functions ϕ(s, x, µ) : S → R which are continuously differentiable w.r.t. s and twice continuously Fréchet differentiable w.r.t. x ∈ R^d and µ ∈ M. We let ∇_µ ϕ ∈ L(M, R) (the set of bounded linear functionals on M) denote the Fréchet derivative (gradient) of ϕ with respect to µ ∈ M. Similarly, D²_µ ϕ denotes the double derivative of ϕ with respect to µ; it belongs to L(M × M, R) (see the Appendix for further details). The infinitesimal generator G of the Markov jump diffusion process Y(t) is defined on functions ϕ ∈ C^(1,2,2)(S), where, as before, A*_0 is the integro-differential operator from Theorem 2.1. We can now state a verification theorem for conditional McKean-Vlasov impulse control problems, providing sufficient conditions that a given function is the value function and that a given impulse control is optimal. The verification theorem links the impulse control problem to a suitable system of quasi-variational inequalities. Since the process Y(t) is Markovian, we can, with appropriate modifications, use the approach of Chapter 9 in [12].
For simplicity of notation we will in the following write φ = φ(y), and we let D = {y ∈ S : φ(y) > Mφ(y)} denote the continuation region. Assume:

(iv) ∂D is a Lipschitz surface.
(b) Suppose in addition that the following conditions hold. Here Mφ(y) is computed from the new state Γ(X, ζ), its conditional law µ^Γ(X,ζ), and the intervention profit K. Therefore MΦ(y) represents the optimal new value if the agent decides to make an intervention at y. Note that by (ii), Φ ≥ MΦ on S, so it is not always optimal to intervene. At the time τ̂_j, the operator should intervene with impulse ζ̂_j when the controlled process leaves the continuation region, that is, when Y(τ̂_j⁻) ∉ D.

Proof.
Then by (x) we get equality in (3.8), and by our choice of ζ_j = ζ̂_j we have equality in (3.9). Hence φ(y) = J^(v̂)(y), which combined with (a) completes the proof.

Example: Optimal stream of dividends under transaction costs
In this section, we solve explicitly a problem of optimal dividend streams under transaction costs. To this end, for v = (τ_1, τ_2, …; ζ_1, ζ_2, …) ∈ V we define the controlled cash flow X(t) = X^(v)(t) by

dX(t) = E[X(t)|F^(1)_t] [ α_0 dt + σ_1 dB_1(t) + σ_2 dB_2(t) + ∫_R γ_0(z) Ñ(dt, dz) ];  X(0⁻) = x > 0,   (4.1)

where α_0, σ_1 ≠ 0, σ_2 ≠ 0, λ ≥ 0 and c > 0 are constants, and γ_0(z) ≥ −1 a.s. ν. Here X(t) represents the amount available at time t of a cash flow, and we assume that it satisfies the McKean-Vlasov equation (4.1). Note that at any time τ_i, i = 0, 1, 2, …, the system jumps from X^(v)(τ_i⁻) to

X^(v)(τ_i) = Γ(X^(v)(τ_i⁻), ζ_i) = X^(v)(τ_i⁻) − c − (1 + λ)ζ_i,

where the quantity c + λζ_i represents the transaction cost, with a fixed part c and a proportional part λζ_i, while ζ_i is the amount we decide to take out at time τ_i. At the same time, µ_{τ_i⁻} jumps to µ^Γ(X,ζ_i).

Problem 4.1 We want to find Φ and v* ∈ V such that

Φ(s, x, µ) = sup_{v∈V} J^(v)(s, x, µ) = J^(v*)(s, x, µ),

where

J^(v)(s, x, µ) = E[ Σ_{k: τ_k ≤ τ_S} e^{−ρ(s+τ_k)} ζ_k ],  with discount rate ρ > 0,

is the expected discounted total dividend up to time τ_S, where

τ_S = inf{t > 0 : X^(v)(t) ≤ 0}

is the time of bankruptcy.
To put this problem into the context above, we compare with our theorem and see that in this case d = 1, m = 2, k = 1, and we put q(x) = x, so that ⟨µ_t, q⟩ = E[X(t)|F^(1)_t]. The operator G and the adjoint operators then take the corresponding forms. In this case the intervention operator takes the form

Mh(s, x, µ) = sup{ h(s, x − c − (1 + λ)ζ, µ^Γ(x,ζ)) + e^{−ρs}ζ : ζ > 0, x − c − (1 + λ)ζ ≥ 0 }.

Note that the condition on ζ is due to the fact that the impulse must be positive and that x − c − (1 + λ)ζ must belong to S. We distinguish between two cases:

1. α_0 > ρ. In this case, suppose we wait until some time t_1 and then take out as much as possible. Noting that E^y[X(t)] = x exp(α_0 t) for t < t_1, we see that the corresponding performance tends to ∞ as t_1 → ∞, so Φ = ∞.

2. α_0 < ρ. In this case we proceed as follows.
Substituting this into the expression for G_0 ψ we get, with u = ⟨µ, q⟩, an ordinary equation for ψ(u). By condition (x) we are required to have G_0 ψ(u) = 0 for all u ∈ (0, ū), and this equation has the general solution

ψ(u) = C_1 u^{γ_1} + C_2 u^{γ_2},

where γ_1 > 1, γ_2 < 0 and C_1, C_2 are constants. Since we expect φ to be bounded near 0, we guess that C_2 = 0. We guess further that it is optimal to wait until u = ⟨µ_t, q⟩ = E^y[X(t)|F^(1)_t] reaches or exceeds a value ū > c and then take out as much as possible, i.e., reduce E^y[X(t)|F^(1)_t] to 0. Taking the transaction costs into account, this means that we should take out

ζ̂(u) = (u − c)/(1 + λ) for u ≥ ū.

We therefore propose that ψ(u) has the form

ψ(u) = C_1 u^{γ_1} for 0 < u < ū,  ψ(u) = (u − c)/(1 + λ) for u ≥ ū.

Continuity and differentiability of ψ(u) at u = ū give the equations

C_1 ū^{γ_1} = (ū − c)/(1 + λ) and γ_1 C_1 ū^{γ_1 − 1} = 1/(1 + λ).

Combining these we get

ū = γ_1 c/(γ_1 − 1) and C_1 = (ū − c)/((1 + λ) ū^{γ_1}).

With these values of ū and C_1, we have to verify that ψ satisfies all the requirements of Theorem 3.4. We check some of them:

(ii) φ ≥ Mφ on S.
In our case we have Γ(x, ζ) = x − c − (1 + λ)ζ, and hence, putting k(u) := ψ(u) − Mψ(u), a direct computation shows that k(u) > 0 for 0 < u < ū, so (ii) holds.
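The smooth-fit relations can be checked numerically: given a root γ_1 > 1 of the characteristic equation (taken here as an assumed input, since the equation depends on the model parameters), the values ū = γ_1 c/(γ_1 − 1) and C_1 = (ū − c)/((1 + λ)ū^{γ_1}) make the two branches of ψ meet in value and in first derivative at ū. A minimal sketch with illustrative parameter values:

```python
gamma1 = 1.8        # assumed root with gamma1 > 1 (depends on model parameters)
c, lam = 0.5, 0.1   # fixed and proportional transaction-cost parameters

u_bar = gamma1 * c / (gamma1 - 1)                 # smooth-fit threshold
C1 = (u_bar - c) / ((1 + lam) * u_bar ** gamma1)  # smooth-fit constant

psi_cont = lambda u: C1 * u ** gamma1     # continuation branch, 0 < u < u_bar
psi_stop = lambda u: (u - c) / (1 + lam)  # payout branch, u >= u_bar

# value match at the threshold
print(abs(psi_cont(u_bar) - psi_stop(u_bar)) < 1e-12)
# first derivatives match as well (central difference vs 1/(1+lam))
eps = 1e-6
d_cont = (psi_cont(u_bar + eps) - psi_cont(u_bar - eps)) / (2 * eps)
print(abs(d_cont - 1 / (1 + lam)) < 1e-4)
```

The derivative match is in fact exact: γ_1 C_1 ū^{γ_1 − 1} = γ_1 (ū − c)/((1 + λ)ū) = 1/(1 + λ), since ū − c = c/(γ_1 − 1).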

Appendix: Double Fréchet derivatives
In this section we recall some basic facts that we use about Fréchet derivatives of a function f : V → W, where V and W are given Banach spaces. As a motivating example we consider f : M → R given by f(µ) = ⟨µ, q⟩², where q(x) = x.


Definition 5.1 We say that f has a Fréchet derivative ∇_x f = Df(x) at x ∈ V if there exists a bounded linear map A : V → W such that

lim_{h→0} ||f(x + h) − f(x) − A(h)||_W / ||h||_V = 0.

Then we call A the Fréchet derivative of f at x, and we put Df(x) = A. Note that Df(x) ∈ L(V, W) (the space of bounded linear maps from V to W) for each x.

Definition 5.2 We say that f has a double Fréchet derivative D²f(x) at x if there exists a bounded bilinear map A(h, k) : V × V → W such that, for each h ∈ V,

lim_{k→0} ||Df(x + k)(h) − Df(x)(h) − A(h, k)||_W / ||k||_V = 0.

Example 5.3 Let f : M → R be given by f(µ) = ⟨µ, q⟩², where q(x) = x. Expanding f(µ + h) = ⟨µ, q⟩² + 2⟨µ, q⟩⟨h, q⟩ + ⟨h, q⟩², we read off

Df(µ)(h) = 2⟨µ, q⟩⟨h, q⟩ and D²f(µ)(h, k) = 2⟨h, q⟩⟨k, q⟩.
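The example can be checked numerically by identifying a discrete measure µ with its vector of weights, so that f(µ) = ⟨µ, q⟩² becomes f(w) = (Σ_i w_i x_i)² and the claimed derivative is Df(µ)(h) = 2⟨µ, q⟩⟨h, q⟩. A sketch where the support points and weights are illustrative:

```python
import numpy as np

xs = np.array([0.0, 1.0, 2.0])  # support points of the discrete measure
q = xs                          # q(x) = x evaluated on the support

def f(w):
    """f(mu) = <mu, q>^2 for mu = sum_i w_i delta_{x_i}."""
    return float(np.dot(w, q)) ** 2

def Df(w, h):
    """Claimed Frechet derivative: Df(mu)(h) = 2 <mu, q> <h, q>."""
    return 2.0 * np.dot(w, q) * np.dot(h, q)

w = np.array([0.2, 0.3, 0.5])
h = np.array([0.1, -0.05, 0.02])
t = 1e-6
numeric = (f(w + t * h) - f(w)) / t  # directional derivative along h
print(abs(numeric - Df(w, h)) < 1e-4)
```

The finite-difference quotient differs from Df by the quadratic remainder t⟨h, q⟩², which vanishes as t → 0, matching Definition 5.1.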