Sequential Systems of Reflected Backward Stochastic Differential Equations with Application to Impulse Control

We consider a system of finite horizon, sequentially interconnected, obliquely reflected backward stochastic differential equations (RBSDEs) with stochastic Lipschitz coefficients. We show existence of solutions to our system of RBSDEs by applying a Picard iteration approach. Uniqueness then follows by relating the limit to an auxiliary impulse control problem. Moreover, we show that the solution to our system of RBSDEs is connected to weak solutions of a stochastic differential game where one player implements an impulse control while the opponent plays a continuous control that enters the drift term. As all our arguments are probabilistic and hence hold in a non-markovian framework, we are able to consider the setting where the underlying uncertainty in the game stems from an impulsively and continuously controlled path-dependent stochastic differential equation driven by Brownian motion.


Introduction
Backward stochastic differential equations (BSDEs) have been a topic of rapid development during the last decades. Non-linear BSDEs were independently introduced in [24] and [7] and have since found numerous applications.
El Karoui et al. introduced the notion of reflected backward stochastic differential equations (RBSDEs) and demonstrated a link between RBSDEs and optimal stopping in [12]. This was later exploited in a series of articles [15,18,20] as a means of finding solutions to optimal switching problems. In [18] existence and uniqueness of solutions to an interconnected system of reflected BSDEs was shown. Furthermore, it was shown that the solutions are related to optimal switching problems under Knightian uncertainty. Important contributions from the perspective of the present work are also [1,4], which consider BSDEs where the Lipschitz coefficient on the z-variable of the driver is a stochastic process, and the more recent work presented in [10], where a RBSDE with a stochastic Lipschitz coefficient is solved. (This work was supported by the Swedish Energy Agency through Grant Number 42982-1.)
Although the literature on discretely indexed systems of RBSDEs related to switching problems has grown considerably in the last decade, there is thus far no work that deals with systems of RBSDEs related to general impulse control problems. In the present work we aim to add to the literature on RBSDEs by considering a sequentially arranged system of RBSDEs, namely (1.1) below (the notation will be explained later). As opposed to the setting in previous works, our family of RBSDEs is continuously parameterized (the parameter v is an impulse control). Moreover, to make our results more applicable, we allow the driver f^v to satisfy a Lipschitz condition on the z-variable that is formulated in terms of a stochastic process. We rely on a Picard iteration approach, and the main obstacle we face is showing continuity of the map that appears in the barrier of (1.1). In particular, we cannot use a "no free loop" property (see e.g. Step 5 in the proof of Theorem 3.2 in [18]). Instead we rely on a uniform convergence argument that requires some intricate analysis.
A strong motivation for the introduction of non-linear BSDEs was their close connection to various types of stochastic control problems (see e.g. [13,25,30]) and stochastic differential games [16,17,19]. Analogously, our main motivation for studying (1.1) is its relation to stochastic differential games (SDGs) of impulse versus continuous control.
In impulse control the control-law takes the form u = (τ_1, . . . , τ_N; β_1, . . . , β_N), where τ_1 ≤ τ_2 ≤ · · · ≤ τ_N is a sequence of times at which the operator intervenes on the system and β_j is the impulse with which the operator affects the system at time τ_j. We restrict our attention to the case of a Brownian filtration F := {F_t}_{t≥0} and assume that the τ_j are F-stopping times and that each β_j is F_{τ_j}-measurable and takes values in a compact subset U of R^d.
We extend the results in [22] to the two-player, zero-sum game setting by considering a weak formulation of the problem of maximizing the reward functional over impulse controls, u, while a minimization is simultaneously performed over continuous controls α := (α_s)_{0≤s≤T}, taking values in a compact subset A of R^d. Here, [u]_j := (τ_1, . . . , τ_{N∧j}; β_1, . . . , β_{N∧j}) and X^{u,α} solves the impulsively controlled path-dependent SDE

  X^{u,α}_t = x_0 + \int_0^t a(s, (X^{u,α}_r)_{r≤s}, α_s)\,ds + \int_0^t σ(s, (X^{u,α}_r)_{r≤s})\,dW_s,  ∀t ∈ [0, τ_1),
  X^{u,α}_t = X^{u,α}_{τ_j} + \int_{τ_j}^t a(s, (X^{u,α}_r)_{r≤s}, α_s)\,ds + \int_{τ_j}^t σ(s, (X^{u,α}_r)_{r≤s})\,dW_s,  ∀t ∈ [τ_j, τ_{j+1}),   (1.4)

for j = 1, . . . , N, with τ_{N+1} := ∞. By considering systems of reflected BSDEs with drivers that satisfy a stochastic Lipschitz condition, we are able to relax the common assumption that |σ^{-1}(t, x)a(t, x, α)| is bounded and instead assume linear growth, i.e. that |σ^{-1}(t, x)a(t, x, α)| ≤ k_L(1 + sup_{s∈[0,t]} |x_s|) for some constant k_L > 0.

The remainder of the article is organised as follows. In the next section we set the notation and specify what we mean by a solution to (1.1). Moreover, in this section we formulate a stability and a moment estimate for solutions to RBSDEs with a stochastic Lipschitz coefficient. The proofs of these estimates are postponed to Appendix A. In Section 3 we turn to sequential systems of RBSDEs and show that (1.1) admits a unique solution. Then, in Section 4, we show that the above formulated game has a saddle-point and give a representation of the corresponding optimal controls by relating solutions of (1.1) to weak formulations of the SDG at hand. This is followed by some concluding remarks in Section 5.
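In discrete time, dynamics of the kind in (1.4) can be sketched with an Euler-Maruyama scheme. Everything below (function names, the scalar state, the purely additive impulse) is our own illustrative choice, not the paper's construction:

```python
import numpy as np

def simulate_impulse_sde(x0, a, sigma, impulses, T=1.0, n=1000, seed=0):
    """Euler-Maruyama scheme for a scalar path-dependent SDE with impulses.

    a(t, path, alpha) and sigma(t, path) may depend on the whole past path;
    impulses is a list of (tau_j, beta_j) pairs, and an impulse simply adds
    beta_j to the state (a toy stand-in for the paper's impulse map)."""
    rng = np.random.default_rng(seed)
    dt = T / n
    t = np.linspace(0.0, T, n + 1)
    x = np.empty(n + 1)
    x[0] = x0
    pending = sorted(impulses)
    for i in range(n):
        # apply every impulse whose intervention time has been reached
        while pending and pending[0][0] <= t[i]:
            x[i] += pending.pop(0)[1]
        path = x[: i + 1]                      # the past (X_r)_{r <= t_i}
        dW = np.sqrt(dt) * rng.standard_normal()
        x[i + 1] = x[i] + a(t[i], path, None) * dt + sigma(t[i], path) * dW
    return t, x
```

The linear-growth bound above can likewise be tracked along a simulated path, e.g. `k_L * (1 + np.maximum.accumulate(np.abs(x)))` for the process k_L(1 + sup_{s≤t} |x_s|).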

Preliminaries
We let (Ω, F, F, P) be a complete filtered probability space, where F := (F_t)_{0≤t≤T} is the augmented natural filtration of a d-dimensional Brownian motion W and F := F_T, where T ∈ (0, ∞) is the horizon. Throughout, we will use the following notation:

• We let E denote expectation with respect to P and, for any other probability measure Q on (Ω, F), we denote by E^Q expectation with respect to Q.
• P_F is the σ-algebra of F-progressively measurable subsets of [0, T] × Ω.
• For p ≥ 1, we let S^p be the set of all R-valued, P_F-measurable, continuous processes (Z_t : t ∈ [0, T]) such that ‖Z‖_{S^p} := E[sup_{t∈[0,T]} |Z_t|^p]^{1/p} < ∞.
• For p ≥ 1, we let S^p_l be the set of all R-valued, P_F-measurable, càglàd processes (Z_t : t ∈ [0, T]) such that ‖Z‖_{S^p} < ∞.
• We let H^p denote the set of all R^d-valued, P_F-measurable processes (Z_t : t ∈ [0, T]) such that ‖Z‖_{H^p} := E[(\int_0^T |Z_t|^2\,dt)^{p/2}]^{1/p} < ∞.
• For any probability measure Q, we let S^p_Q and H^p_Q be defined as S^p and H^p, respectively, except that the norm is defined with expectation taken with respect to Q, i.e. ‖Z‖_{S^p_Q} := E^Q[sup_{t∈[0,T]} |Z_t|^p]^{1/p} and ‖Z‖_{H^p_Q} := E^Q[(\int_0^T |Z_t|^2\,dt)^{p/2}]^{1/p}.
• We let T be the set of all F-stopping times and, for each η ∈ T, we let T_η be the corresponding subset of stopping times τ such that τ ≥ η, P-a.s.
• For each τ ∈ T, we let I(τ) be the set of all F_τ-measurable random variables taking values in U, so that I(τ) is the set of all admissible interventions at time τ.
• We let U be the set of all u = (τ_1, . . . , τ_N; β_1, . . . , β_N), where (τ_j)_{j≥1} is a nondecreasing sequence of F-stopping times taking values in [0, T], β_j ∈ I(τ_j) and N is an F_T-measurable, integer-valued random variable such that N ≥ j on {τ_j < T}. Throughout, we also set τ_0 := 0.
• We let U_f denote the subset of u ∈ U for which N is P-a.s. finite (i.e. U_f := {u ∈ U : P({ω ∈ Ω : N > k, ∀k > 0}) = 0}) and, for all k ≥ 0, we let U^k be the subset of u ∈ U_f with at most k interventions.
• We let D_f be the subset of D with all finite sequences and, for k ≥ 0, we let D_k be the subset of sequences with precisely k interventions, i.e. sequences of the type (t_1, . . . , t_k; b_1, . . . , b_k). We let v, where the number of interventions n is possibly infinite, denote a generic element of D.
• We let ∗ denote stochastic integration and set (X ∗ W)_{t,s} = \int_t^s X_r\,dW_r.
• We let E denote the Doléans-Dade exponential, E(ζ ∗ W)_t := exp(\int_0^t ζ_s\,dW_s − \frac{1}{2}\int_0^t |ζ_s|^2\,ds).
• For any P_F-measurable process ζ such that E[E(ζ ∗ W)_T] = 1, we define Q^ζ to be the probability measure, equivalent to P, such that dQ^ζ = E(ζ ∗ W)_T dP.
• For any non-negative, P_F-measurable càdlàg process L, we let P_L denote the set of all probability measures Q on (Ω, F) such that dQ = E(ζ ∗ W)_T dP for some P_F-measurable process ζ with |ζ_t| ≤ L_t for all t ∈ [0, T] (outside of a P-null set).
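As a numerical sanity check of the objects just defined, the sketch below (our own toy discretization) computes the Doléans-Dade exponential along simulated Brownian paths and verifies that its terminal expectation is close to one, which is exactly the condition under which Q^ζ is a probability measure:

```python
import numpy as np

def doleans_exponential(zeta, dW, dt):
    """Discrete Doleans-Dade exponential E(zeta * W)_t along each path:
    exp( sum_i zeta_i dW_i - (1/2) sum_i zeta_i^2 dt )."""
    return np.exp(np.cumsum(zeta * dW, axis=-1) - 0.5 * np.cumsum(zeta**2 * dt))

# Monte Carlo check that E[ E(zeta * W)_T ] = 1 for a bounded zeta,
# i.e. that the density defining Q^zeta integrates to one.
rng = np.random.default_rng(1)
n_steps, n_paths, T = 200, 20000, 1.0
dt = T / n_steps
dW = np.sqrt(dt) * rng.standard_normal((n_paths, n_steps))
zeta = 0.8 * np.ones(n_steps)          # |zeta_t| <= L_t with L = 0.8, as in P_L
density_T = doleans_exponential(zeta, dW, dt)[:, -1]
mean_density = density_T.mean()         # should be close to 1
```

The step-by-step identity E[exp(ζΔW − ζ²Δt/2)] = 1 holds exactly for Gaussian increments, so the only error here is Monte Carlo noise.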
In addition, we will throughout assume that, unless otherwise specified, all inequalities hold in the P-a.s. sense. Furthermore, we define the following set: is jointly continuous.

Definition 2.2
We refer to a family of processes ((X^v_t)_{0≤t≤T} :

One of the main objectives of the present work is to show that (1.1) admits a unique solution. We, therefore, need to define what we mean by a solution to (1.1).

RBSDEs with Stochastic Lipschitz Coefficient
Our approach will rely heavily on the available theory of reflected backward SDEs. In particular, we have the following result (a proof of which can be found in Appendix A):

Proposition 2.4 Assume that
(i) There is a P-a.s. non-negative, P F -measurable, continuous process
(iv) The barrier S is real-valued, P F -measurable and continuous with S + ∈ S p for each p ≥ 1 and S T ≤ ξ , P-a.s.
In addition, Y can be interpreted as a Snell envelope in the following way. In particular, with D_t := inf{r ≥ t : Y_r = S_r} ∧ T we have the corresponding representation, and K_{D_t} − K_t = 0, P-a.s.
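In discrete time, the Snell envelope of Proposition 2.4 is the backward recursion Y_t = max(S_t, E[Y_{t+1} | F_t]); the following minimal sketch (a recombining binomial tree of our own choosing, with terminal condition ξ = S_T) illustrates it:

```python
import numpy as np

def snell_envelope(barrier, p=0.5):
    """Backward recursion Y_t = max(S_t, E[Y_{t+1} | F_t]) on a recombining
    binomial tree; barrier[t] holds the t+1 node values of S at time t."""
    T = len(barrier) - 1
    Y = [None] * (T + 1)
    Y[T] = np.asarray(barrier[T], dtype=float)   # terminal condition xi = S_T
    for t in range(T - 1, -1, -1):
        # continuation value: expectation over the up/down move
        cont = p * Y[t + 1][1:] + (1 - p) * Y[t + 1][:-1]
        Y[t] = np.maximum(np.asarray(barrier[t], dtype=float), cont)
    return Y
```

For the increasing barrier S_t = t the recursion gives Y_t ≡ T, so the stopping time D_t of the proposition only triggers at the terminal date; for a constant barrier, Y coincides with S everywhere.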

Sequential Systems of Reflected BSDEs in Finite Horizon
In this section we move on to the sequential system of reflected BSDEs in (1.1). To be able to use our results for impulse control, we must allow the stochastic Lipschitz coefficient in (1.1) to depend on the control parameter v ∈ U_f, giving us a family of stochastic Lipschitz coefficients (L^v : v ∈ U_f).

Assumptions
We introduce the following sets of probability measures on ( , F).
We also use the shorthands P 0 := P ∅ 0 and K 0 := K ∅ 0 . To streamline presentation we will formulate our assumptions on the coefficients in terms of the existence of a family of bounding processes:

Definition 3.2 We say that a family of processes
is a bounding family if for each k ≥ 0 and p, κ ≥ 1, there is a C > 0 and a p ≥ 1 such that for all v ∈ U f and v, v ∈ D κ and some q > 1, we have: Moreover, for all ζ ∈ K v and Q ∈ P v we have ess sup for all Q ∈ P v t .
Before moving on to show existence and uniqueness of solutions to (1.1) under Assumption 3.3 we give the following auxiliary result: where W ζ is a Q ζ -Brownian motion, are both P F -measurable, càdlàg processes and s ≥ x} ∧ T and note that for θ > 1, we have by right-continuity that However, for each > 0, there is a u ∈ U f τ x and a ζ , In particular, it follows that Since > 0 was arbitrary we find that P sup For any θ > 1 and arbitrary p ≥ 1 the coefficient pθq(r ) (r ) 2 −r 2 can be made arbitrarily small by choosing r > 1 sufficiently small and since there is a C ≥ 0 such that In particular, using integration by parts, we find that showing that R v ∈ S p with norm uniformly bounded in v. The result forR v follows similarly.

Lemma 3.5
For each p ≥ 1 there is a r > 1 such thatR v defined as Since > 0 was arbitrary and the first term is bounded by 2 pθ−1 sup u∈U f R u S pθ the result follows by repeating the last steps in the proof of Lemma 3.4.
Throughout this section, we assume that r > 1 is small enough that the S^3-norms of R^v and the related processes of Lemmas 3.4 and 3.5 are bounded uniformly in v ∈ U_f, and we let r′ be such that 1/r + 1/r′ = 1.

An Approximating Sequence
In this section we outline a Picard-type approximation scheme that will ultimately lead us to the conclusion that (1.1) has a solution under Assumption 3.3. We note that (3.10) admits a unique solution by Proposition 2.4 (note that (3.10) can be seen as a reflected BSDE with barrier S ≡ −∞). We thus consider the following sequence of families of reflected BSDEs for k ≥ 1.

Hypothesis (RBSDE.l) We will make use of the following induction hypothesis: is jointly continuous (outside of a P-null set) and, moreover,

Recall Definition 2.3, specifying what we mean by a solution to (2.3). We extend this definition to incorporate solutions to (3.10) and (3.11) as well. Through this definition, the first condition in Hypothesis RBSDE.l implies that sup_{u∈U_f} ‖Y^{v,k}‖ < ∞ and also dictates the regularity of the map. The second statement, on the other hand, is a stronger version of the consistency property for families of processes introduced in Definition 2.2. To simplify presentation, we will refer to the second property as strong consistency.

For κ ≥ 1 and v′, v ∈ D_κ we have by Proposition 2.4, Assumption 3.3 and Definition 3.2.(iv) that for each p ≥ 1, there is a p′ ≥ 1 such that Kolmogorov's continuity theorem (see e.g. Theorem 72 in Chapter IV of [28]) guarantees the existence of a family of processes that solves the corresponding BSDE (3.10) and that the map v →

To establish the strong consistency in Hypothesis RBSDE.0-(ii), we need to show that for any v ∈ U_f, the pair (Ŷ^v, Ẑ^v) ∈ S^2 × H^2 solves the BSDE corresponding to the control v. We let (v_j)_{j≥0} be an approximating sequence in U_f (i.e. v_j ∈ U_f and v_j → v, P-a.s. as j → ∞) taking values in a countable subset of D_f. By continuity we have for each η ∈ T, and strong consistency follows.
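The Picard iteration underlying the scheme can be illustrated on its deterministic backbone (Z ≡ 0, no noise, no reflection, a Lipschitz driver): iterate Y^{k+1}_t = ξ + ∫_t^T f(s, Y^k_s) ds and watch the iterates contract. The discretization below is our own sketch, not the paper's scheme:

```python
import numpy as np

def picard_bsde(f, xi, T=1.0, n=200, iters=12):
    """Picard scheme Y^{k+1}_t = xi + int_t^T f(s, Y^k_s) ds for the
    deterministic backbone of a BSDE; returns the grid and all iterates
    so convergence can be inspected."""
    t = np.linspace(0.0, T, n + 1)
    dt = T / n
    Y = np.full(n + 1, xi, dtype=float)          # Y^0 := terminal value
    out = [Y.copy()]
    for _ in range(iters):
        integrand = f(t, Y)
        # tail[i] approximates int_{t_i}^T f(s, Y^k_s) ds (right-point sum)
        tail = np.concatenate(([0.0], np.cumsum(integrand[::-1][:-1] * dt)))[::-1]
        Y = xi + tail
        out.append(Y.copy())
    return t, out
```

With f(s, y) = −y and ξ = 1 the fixed point is Y_t = e^{t−T}, and the standard contraction estimate makes successive iterates agree to many digits after about a dozen sweeps.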
We now turn to the reflected BSDEs (3.11). To obtain estimates for the triple (Y^{v,k}, Z^{v,k}, K^{v,k}) we rely on Proposition 2.4 to reduce the system of reflected BSDEs to a single non-reflected BSDE with jumps. We thus introduce the following BSDE: whenever a unique solution exists, and let U^{v,u} ≡ −∞ otherwise.
Proof Existence of a unique solution to (3.12) follows from repeated use of Proposition 2.4 since the intervention costs belong to L p ( , F, P) for all p ≥ 1. Moreover, a similar argument gives that H q and the assertion follows.
In addition, we introduce the following notation: and let Q^{v,u} := Q^{ζ^{v,u}}, the probability measure, equivalent to P, under which W^{v,u} := W − \int_0^· ζ^{v,u}_s\,ds is a Brownian motion.
Before we move on to show that Hypothesis RBSDE.l holds for all l ≥ 0, we give three helpful lemmas.

Lemma 3.10 Assume that Hypothesis RBSDE.l holds for some l ≥ 0, then for each

cṽ(s, b)}} ∧ T and have by Proposition 2.4 and consistency that
where β*_1 can be chosen to be F_{τ*_1}-measurable by continuity of the map b → Y^{ṽ•(τ*_1, b)} and the measurable selection theorem (see e.g. Chapter 7 in [3] or [11]). Now, we can continue and inductively define τ*_j, j = 2, . . . , l, and take β*_j to be the corresponding F_{τ*_j}-measurable maximizer. By induction we get (3.14), where N* := max{j ∈ {1, . . . , l} : τ*_j < T} and using the convention that τ*_0 = 0 and τ*_{N*+1} = T. In particular, (3.14) implies by comparison and positivity of the intervention cost that U^{ṽ,∅}, where the last inequality follows by Assumption 3.3.
We can thus apply Hölder's inequality to find that by Lemma 3.4 and sinceK v, p t ∈ S 1 for all p ≥ 1.

Lemma 3.11
Assume that Hypothesis RBSDE.l holds for some l ≥ 0. Then for each p ≥ 1 there is a C > 0, which does not depend on l, such that

Proof This is immediate from Proposition 2.4 and Lemma 3.10.

Lemma 3.12
Assume that Hypothesis RBSDE.l holds for some l ≥ 0, then for each where e_t := e^{\int_0^t γ_s\,ds} with is a Brownian motion under the measure Q^ζ ∈ P^v_t given by dQ^ζ = E(ζ ∗ W)_T dP.
For u ∈ U^{l+1}_t, by again appealing to Proposition 3.8, we have that δV^u ∈ H^2_{Q^ζ} and, since e^{−k_f T} ≤ e_t ≤ e^{k_f T}, taking the conditional expectation in (3.18) gives where we have used the Lipschitz condition on f to arrive at the last inequality. In particular, since N ≤ l + 1, P-a.s., this gives that Applying the standard change-of-measure approach it follows that Changing back to P-expectation, we get by the Girsanov theorem that Hence, repeated application of Hölder's inequality gives, with δY_t :=. By Lemma 3.10 we find that

Whenever the statement in Hypothesis RBSDE.l holds for some l ≥ 0, Proposition 2.4 guarantees the existence of a unique triple By (2.5) of Proposition 2.4, Assumption 3.3 and Lemma 3.12 it thus follows, by repeating the argument in the proof of Proposition 3.6, that for each p ≥ 1 there is a p′ ≥ 1 such that For v′, v ∈ D_κ we let p = (m + 1)κ + 1 and Kolmogorov's continuity theorem implies the existence of a family of processes that is continuous (and that v → \int_s^T Z^v_r\,dW_r is continuous in v uniformly in s) and uniformly continuous, respectively, and moreover

Furthermore, for any v ∈ U_f and any approximating sequence (v_j)_{j≥0} taking values in a countable dense subset of D_f with v_j → v, P-a.s., and v_j ∈ U_f, we have, by repeating the argument in the proof of Proposition 3.6, that for all η ∈ T. Finally, by Helly's convergence theorem (see e.g. [23], p. 370) the remaining term tends to zero, P-a.s., as j → ∞, and we conclude that Hypothesis RBSDE.(l + 1) holds as well. The statement of the proposition now follows by an induction argument.

Convergence of the Scheme
We now show that there exists a limit family of triples : v ∈ U f ) that solves the sequential system of reflected BSDEs (1.1). This result relies heavily upon the following two lemmas and their corollaries.
Proof For the bound on U v,u * we note that by (3.15) we have where we have used Jensen's inequality and (3.3) to reach the last inequality. The first bound then follows by Jensen's inequality sinceR v t R v t ≥ 1, P-a.s.
We apply Itô's formula to (U^{v,u*})^2 and get that where the last term appears after applying the relation On the other hand, applying the usual manipulations to (3.12) we get Rearranging terms now gives us (with e_{t,·} = e^{v,u*}_{t,·}) Put together, this gives Raising both sides to the power p/2 and taking the conditional expectation we find that by choosing κ > 0 sufficiently large. Under P this can be rewritten as The desired result now follows by setting p ← pr^2 in (3.19) and using Jensen's inequality while noting that R̄^v_t ≥ 1, P-a.s.

Corollary 3.15
For Proof This follows immediately by making suitable manipulations, i.e. setting v ← v • (t, β), in the proof of Lemma 3.14.

Lemma 3.16
For v ∈ U f and k ≥ 0, assume that u * ∈ U k t is such that U v,u * t = ess sup u∈U k t U v,u t . Then, for each p ≥ 1, there is a C > 0 that does not depend on v or k, such that Proof Since the intervention costs are bounded from below by δ > 0, we have from which the first inequality follows. Now, from (3.20) we have where we have used (3.21) to get the last inequality. Now the result is immediate from the last part in the proof of Lemma 3.14.

Corollary 3.17
For Proof This follows by repeating the argument in the proof of Lemma 3.16 after making the swap v ← v • (t, β).
We are now ready to tackle the convergence of the sequence Y^{v,k}; this is done in the following proposition.

Proposition 3.18 There exists a limit family (Ȳ^v : v ∈ U_f) such that for all v ∈ U_f (outside of a P-null set) and each p ≥ 2, we have

Proof The sequence (Y^{v,k})_{k≥0} is non-decreasing and P-a.s. bounded by Lemma 3.10. Thus it converges pointwise, P-a.s., and (i) follows.
We now turn our focus to the second claim and note that for each t ∈ [0, T], continuity and measurable selection imply that there is a β ∈ I(t) such that To simplify notation we set ṽ ← v • (t, β) and have by Corollary 3.17 that if For any k′ with 0 ≤ k′ ≤ k, the truncation [u*]_{k′} belongs to U^{k′}_t. We thus have Moreover, since the intervention costs are positive, we have that Setting e_{t,s} := e^{\int_t^s γ_r\,dr} with Taking the conditional expectation, using that V^{ṽ,u*}, V^{ṽ,[u*]_{k′}} ∈ H^2_{Q^ζ} by Proposition 3.8 and noting that the right-hand side is non-zero only when N* > k′ gives By Hölder's inequality we find that where we have used (3.1) and Corollary 3.15 to arrive at the last inequality. Since the map is continuous and the right-hand side of the above equation is a càdlàg process, this extends to all t ∈ [0, T] (outside of a P-null set) and we can take the S^1-norm followed by Hölder's inequality to get that where C > 0 is independent of k, k′. The last inequality holds since there is a C > 0 such that for all v ∈ U_f. Finally, taking the limit as k → ∞, (i) and Fatou's lemma give that a solution to (1.1).
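The monotone-in-k structure exploited in Proposition 3.18 (values nondecreasing in the number of allowed interventions and converging as k → ∞) can be seen in a toy dynamic-programming analogue; the random-walk model, reward and reset impulse below are entirely our own illustrative choices:

```python
from functools import lru_cache

def impulse_values(T=6, c=0.4, K=4, x0=2):
    """Impulse control of a symmetric +/-1 random walk with terminal
    reward -|X_T|: an impulse resets the state to 0 at cost c, and at
    most k impulses may be used.  Returns [V^0, ..., V^K] evaluated at
    (t=0, x=x0) -- a nondecreasing sequence, mirroring the iteration in k."""
    @lru_cache(maxsize=None)
    def V(t, x, k):
        if t == T:
            return -abs(x)                       # terminal reward
        cont = 0.5 * V(t + 1, x + 1, k) + 0.5 * V(t + 1, x - 1, k)
        if k == 0:
            return cont                          # no interventions left
        # intervention operator: jump to 0 now and pay the cost c
        return max(cont, -c + V(t, 0, k - 1))
    return [V(0, x0, k) for k in range(K + 1)]
```

Since each extra allowed intervention only enlarges the option set, the sequence is nondecreasing, and the strictly positive cost c plays the role of the intervention cost bound δ that controls the number of profitable interventions.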
In particular, (Z v,k ) k≥0 is a Cauchy sequence in the Hilbert space H 2 and we conclude that there is aZ L v s ≥ l} ∧ T and by Lemma 3.11 we have thatK v ∈ S 2 . Since L v is continuous, and thus has P-a.s. bounded trajectories, we find that

Application to SDGs Involving Impulse Control
We now apply the above results to find solutions to stochastic differential games of impulse versus continuous control under weak formulation. In particular, we are interested in finding a saddle point for the game, i.e. a pair (u*, α*) ∈ U_f × A such that for all (u, α) ∈ U_f × A. Since we consider a weak formulation, each pair (u, α) ∈ U_f × A gives rise to a probability measure Q and a corresponding Brownian motion W^Q such that (1.3) and (1.4) admit a solution on (Ω, F, F, Q, W^Q), and J(u, α) is to be interpreted as the corresponding cost functional under the expectation induced by Q. Throughout, we assume the following forms of the drift and volatility terms in the forward SDEs (1.3) and (1.4), where a is of at most linear growth in the data x and σ is uniformly bounded. The drift is split into two terms a_1: and X^u is the unique solution to the impulsively controlled forward SDE Our approach to solving the above optimization problem is to define a measure Q^{u,α} under which W^{u,α}_t = W_t − \int_0^t ȃ(s, (X^u_r)_{r≤s}, α*_s)\,ds is a Brownian motion, where α* is a measurable selection of a minimizer in (4.2). In particular, we note that for any (u, α) ∈ U_f × A, the 6-tuple (Ω, F, F, Q^{u,α}, X^u, W^{u,α}) is a weak solution to (1.3) and (1.4) with impulse control u and continuous control α.
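The measure change at the heart of the weak formulation can be checked numerically: simulate the driftless dynamics under P and reweight by the Girsanov density with ζ = σ^{-1}a. The constant-coefficient setup below is a toy sketch of our own, not the paper's model:

```python
import numpy as np

# Weak formulation in a nutshell: simulate the DRIFTLESS dynamics under P,
# then put the drift back in through the Girsanov density E(zeta * W)_T.
rng = np.random.default_rng(7)
n_steps, n_paths, T = 100, 40000, 1.0
dt = T / n_steps
mu, sig = 0.5, 1.0                     # toy constant drift a and volatility sigma

dW = np.sqrt(dt) * rng.standard_normal((n_paths, n_steps))
W_T = dW.sum(axis=1)
X_T = sig * W_T                         # driftless X_T under P (x_0 = 0)

zeta = mu / sig                         # sigma^{-1} a, bounded in this toy case
density = np.exp(zeta * W_T - 0.5 * zeta**2 * T)   # E(zeta * W)_T

# Under Q (dQ = density dP) the path has drift mu, so E^Q[X_T] = mu * T
est = np.mean(density * X_T)
```

A direct computation with Gaussian moments shows E[density · X_T] = mu · T exactly, so the estimate should sit near 0.5 up to Monte Carlo noise.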
Before we move on to show optimality of the above scheme, we give assumptions on a and σ and on the rewards φ and ψ under which the sequential system (1.1) with driver given by (4.2) attains a unique solution.

Assumption 4.1 For any t, t′ ≥ 0, b, b′ ∈ U, ξ, ξ′ ∈ R^d, x, x′ ∈ D and α ∈ A and for some ρ ≥ 0 we have: (iii) The running reward φ : Moreover, we have the growth condition for some C_g > 0 and all ξ ∈ R^d, and there is a nondecreasing function C_L : R_+ → R_+ such that for each K > 0, and locally Lipschitz in ξ and Hölder continuous in t, i.e. there is a nondecreasing function C_L : R_+ → R_+ such that for each K > 0, whenever |ξ| ∨ |ξ′| ≤ K, for some C_H, ς > 0.
Under these assumptions, we note that f u as defined in (4.2) is stochastic Lipschitz with Lipschitz coefficient where C > 0 is chosen to eliminate jumps. Moreover, for some k L > 0, we have

Some Preliminary Estimates
We now present some preliminary estimates of moments and stability of solutions to (4.3) and (4.4). Towards the end of the section we will prove that any necessary changes of measure are feasible.
where C = C( p) and ess sup where C = C(ρ, p).
Proof See Proposition 5.4 in [26]. In particular, both moment estimates follow by noting that For any κ ≥ 1 and all v: Proof See the proof of Lemma 5.5 in [26].
A fundamental assumption in Section 3 is the existence of a q > 1 and a C > 0 such that sup_{ζ∈K_0} E^Q[|E(ζ ∗ W^Q)_T|^q] ≤ C for all Q ∈ P_0. In the following two lemmas we show that, since L^v ≤ k_L(1 + sup_{s∈[0,·]} |X^v_s|), this statement is true.

Lemma 4.4 For v ∈ U_f and ς > 1, let ϒ^{v,ς} be the set of all P_F-measurable processes ζ with |ζ_t| ≤ L^v_t for all t ∈ [0, T] (outside of a P-null set) such that E[E(rζ ∗ W)_T] = 1 for all r ∈ [1, ς]. Then, there is a q > 1 such that where h : R_+ → R_+ is bounded on compacts, and we conclude that the left-hand side is finite for q > 1 sufficiently small.

Lemma 4.5 Let ζ be a P F -measurable process with trajectories in
Proof We will reach the result by adapting the proof of Lemma 7 in [14] to solutions of impulsively controlled path-dependent SDEs (see also Lemma 0 of Sect. 5 in [2]).
Here, the second property follows from Proposition 4.2. Moreover, let where X^{u,ζ} solves (1.3) and (1.4) with drift a = ã + σζ. We first restrict our attention to the situation when v ∈ U^k for some k ≥ 0, and note that (by arguing as in the proof of Lemma 4.3) we have Combining these gives that Moreover, by property (a) above and right-continuity we have that Since ε > 0 was arbitrary, this proves the assertion whenever v ∈ U^k for some finite k ≥ 0. To get the result for arbitrary v := (τ_1, . . . , τ_N; β_1, . . . , β_N) ∈ U_f, we define the sets for all k ≥ 0 and let Now, for any v ∈ U_f we have by definition that P({ω : N > k, ∀k ≥ 0}) = 0, and so we can, by again appealing to Lemma 4.4, find a k ≥ 0 such that and the assertion follows as ε > 0 was arbitrary.

Corollary 4.6
There is a q > 1 such that

Proof Lemma 4.5 shows that for each ς > 1 the set ϒ^{v,ς} in Lemma 4.4 is in fact all P_F-measurable processes ζ with |ζ_t| ≤ k_L(1 + sup_{s∈[0,t]} |X^v_s|) for all t ∈ [0, T] (outside of a P-null set).
The above corollary gives the following:  (4.9) where C = C( p) and ess sup where C = C(ρ, p). Moreover, there is a q > 1 and a C > 0 such that for each v ∈ U f and all Proof Existence of a weak solution to (1.3) and (1.4) follows by taking ζ t = a(t, (X u s ) s≤t , α t ) and using Lemma 4.5 to conclude that under Q u,α , the process W u,α := W − · 0ȃ (t, (X s ) s≤t , α t )dt is a Brownian motion. The moment estimates (4.9) and (4.10) now follow by repeating the steps in the proof of Proposition 5.4 in [26] and the last assertion follows by repeating the steps in Lemmas 4.4 and 4.5 while referring to the bound (4.9) rather than (4.6).

Lemma 4.3 and Assumption 4.1 imply the existence of a family
satisfying the conditions in Definition 3.2 and Assumption 3.3 and it follows by Proposition 3.19 and Theorem 3.20 that there is a unique family (Y v,m,n , Z v,m,n , K v,m,n ) that solves the sequential system of reflected BSDEs in (4.12).
We have, Taking the conditional expectation under the measure Q n,n where W t − where n := n ∧ n . As H * ,m,n and H * ,m,n have the same z-coefficient and thus also the same stochastic Lipschitz coefficient we find that Q n,n ∈ P v•u * . In particular, we have The result now follows by Proposition 4.7.
Now, as clearly sup b∈U | n (·,

For each m ≥ 0 we note that (Y^{v,m,n})_{n≥0} is a non-increasing sequence of continuous processes that is P-a.s. bounded, and we have that Y^{v,m,n} converges pointwise to a progressively measurable process Y^{v,m} := lim_{n→∞} Y^{v,m,n}. Furthermore, by Lemma 4.9 we find that Y^{v,m} is continuous and thus belongs to S^2.

A Solution to the Stochastic Differential Game
We are now ready to solve the SDG by relating optimal controls to solutions of the sequential system of reflected BSDEs (4.11). However, before proceeding we need to narrow down the set of admissible impulse controls that we search over in order to guarantee that (4.13) admits a unique solution.
Definition 4.13 We let U^m be the subset of U_f of all u = (τ_1, . . . , τ_N; β_1, . . . , β_N) such that N has moments of all orders, i.e. for each k ≥ 0 we have E[N^k] < ∞.

Lemma 4.14 For each u ∈ U_f, there is a û ∈ U^m such that whenever a solution exists in S^2_l × H^2. We define the set of sensible impulse controls, U_s, as the subset of u ∈ U_f such that for each t ∈ [0, T] and α ∈ A, where Then for each u ∈ U_f we obtain a u′ ∈ U_s by removing all future interventions whenever (4.18) does not hold. Moreover, u′ dominates u in the sense that J(u′, α) > J(u, α) for all α ∈ A. Now, whenever u ∈ U_s there is a (B^{u,α}, E^{u,α}) ∈ S^2_l × H^2 that solves (4.17).

We will build on the argument in Lemma 3.14 to show that U_s ⊂ U^m. We thus assume that u ∈ U_s. Rearranging the terms in (4.17) gives where we know that all terms on the right-hand side, except for the last (martingale) term, have moments of all orders. By Itô's formula we have for κ > 0 (where we have used that (τ_j, X. Using (4.19), the growth conditions on φ and ψ and the fact that u ∈ U_s together with (4.8) gives that Raising both sides to the power p ≥ 1, followed by taking the expectation and applying the Burkholder-Davis-Gundy inequality, gives

The following theorem shows that we can extract the optimal pair (u*, α*) from the family of maps (α^v(t, ω, z) : v ∈ U_f) and the solution to (4.11). Then the and N* = sup{j : τ*_j < T}, with τ*_0 := 0 and with τ*_{N+1} := ∞, is optimal in the sense of (4.1) and Y^∅_0 = J(u*, α*).

Proof For u ∈ U^m we let (U^u, V^u) ∈ S^2_l × H^2 be the unique solution to U^u_t = ψ(X^u_T) + Then ‖V^u‖_{H^p} < ∞ for all p ≥ 1, implying that V^u ∈ H^2_Q for all Q ∈ P_0 and, since U^u_0 is F_0-measurable and F_0 is trivial, we have where now Q^{u,α} is the measure, equivalent to P, under which W^{u,α} := W − \int_0^· ȃ(s, (X^v_r)_{r≤s}, α_s)\,ds is a martingale.
Moreover, a straightforward extension of Lemma 3.16 to impulse controls with an unbounded number of interventions shows that u * ∈ U m and we conclude by Theorem 3.20 (see Remark 4.12) that Combined with Lemma 4.14, the above gives To show that α * is an optimal response it is enough to show that α * is a minimizer of α → J (u * , α). However, for any (u, α) ∈ U m × A, we have and we conclude that (u * , α * ) is a saddle-point for the game.
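The saddle-point inequalities (4.1) have a transparent finite analogue; the payoff matrix below is our own toy example, in which the lower and upper values of the game coincide:

```python
import numpy as np

# Toy illustration of the saddle-point condition (4.1): the impulse player
# (rows) maximizes J, the continuous-control player (columns) minimizes.
# This matrix has a pure saddle point at (1, 1) with value 2.
J = np.array([[3.0, 1.0, 4.0],
              [2.0, 2.0, 5.0],
              [0.0, 1.0, 6.0]])

maximin = J.min(axis=1).max()   # value the maximizer can guarantee
minimax = J.max(axis=0).min()   # value the minimizer can guarantee
# Saddle point: J[u*, a] >= J[u*, a*] >= J[u, a*] for all u, a
```

When maximin and minimax agree, neither player can improve by deviating unilaterally, which is exactly the structure established for (u*, α*) above.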

Conclusions
The present work considers a sequentially arranged system of reflected BSDEs parameterized by impulse controls. Using only probabilistic arguments, we show existence and uniqueness of solutions under a stochastic Lipschitz condition on the driver. Moreover, we relate the solution of our system of BSDEs to a weakly formulated stochastic differential game where one player implements an impulse control while the adversary plays a continuous control that does not enter the diffusion coefficient. Since all arguments are probabilistic, we do not need to rely on any Markov property of solutions to the underlying controlled SDE, enabling us to handle path-dependence in the drift and diffusion coefficients as well as in the impulse-to-jump map of the impulse controller.

A practical application of the result lies within robust control, where the impulse controller is ambiguous about the model for the drift term in the controlled SDE. A cautious operator could incorporate this ambiguity into the model by considering the worst case, thus rendering a zero-sum stochastic differential game where the adversary (nature) controls the drift term.

In this regard, a number of questions are left open. An issue of clear interest is when the adversary player is allowed to control the diffusion coefficient as well. In the robust control framework this setting corresponds to ambiguity about both the drift and the diffusion coefficient. A recent development in this direction is [27], where path-dependent PDEs [8,9] and second order BSDEs [29] (abbreviated 2BSDEs) are used to represent solutions to path-dependent zero-sum stochastic differential games of continuous versus continuous control in a weak formulation. A natural next step would thus be to develop the present work to incorporate 2BSDEs.
Our game is one of impulse versus continuous control. Alternatively, one could consider the case where both players play impulse controls. This problem was approached in the Markovian framework in [6] but the extension to the path-dependent setting remains an open problem.
Finally, we leave the important issue of tractable numerical approximation algorithms as a topic of future research.
Funding Open access funding provided by Linnaeus University.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

By the Girsanov theorem, it follows that under the probability measure Q^ζ defined as dQ^ζ := E(ζ ∗ W)_T dP, the process W^ζ := W − \int_0^· ζ_s\,ds is a Brownian motion.