Nonzero-sum stochastic differential games between an impulse controller and a stopper

We study a two-player nonzero-sum stochastic differential game in which one player controls the state variable via additive impulses, while the other player can stop the game at any time. The main goal of this work is to characterize Nash equilibria through a verification theorem, which identifies a new system of quasi-variational inequalities whose solution gives the equilibrium payoffs together with the corresponding strategies. Moreover, we apply the verification theorem to a game with a one-dimensional state variable, evolving as a scaled Brownian motion, with linear payoffs and costs for both players. Two types of Nash equilibrium are fully characterized, i.e. semi-explicit expressions for the equilibrium strategies and the associated payoffs are provided. Both equilibria are of threshold type: in one equilibrium the players' interventions are never simultaneous, while in the other the first player induces her competitor to stop the game. Finally, we provide some numerical results describing the qualitative properties of both types of equilibrium.


Introduction
Controller-stopper games are two-player stochastic dynamic games whose payoffs depend on the evolution over time of some state variable: one player can control its dynamics, while the other player can stop the game. The study of these games started with Maitra and Sudderth's work [23] in a zero-sum, discrete-time setting. Later on, many authors investigated such games in continuous time, especially in the zero-sum case, while very little has been done in the nonzero-sum one. Indeed, apart from Karatzas and Sudderth [17], all the other articles focus on the zero-sum case, and in all of them the controller uses regular controls, i.e. controls that are absolutely continuous with respect to the Lebesgue measure. Here we mention Karatzas and Sudderth [18], who derived the explicit solution for a game whose state process is a one-dimensional diffusion with absorption at the endpoints of a bounded interval; Karatzas and Zamfirescu [20,21], who developed a martingale approach to a general class of controller-stopper games; and Bayraktar and Huang [5], who showed that the value function of such games is the unique viscosity solution to an appropriate Hamilton-Jacobi-Bellman equation. Moreover, Hernandez et al. [16] analysed the case where the controller plays singular controls and derived a set of variational inequalities characterizing the game's value functions. On the whole, this class of games is motivated by a variety of applications in finance, insurance and economics. In this regard we quote, among others, Bayraktar et al. [6] on convex risk measures, Nutz and Zhang [24] on sub-hedging of American options under volatility uncertainty, Bayraktar and Young [7] on minimization of the lifetime ruin probability, and Karatzas and Wang [19] on pricing and hedging of American contingent claims.

In this paper, we study a nonzero-sum game in which the controller acts through impulses, and we characterize Nash equilibria through a system of quasi-variational inequalities (QVIs) whose solution will give equilibrium payoffs. In this setting, the relevant QVI system reads

  M V_1 − V_1 ≤ 0 in R^d,
  V_2 − k ≥ 0 in R^d,
  H V_2 − V_2 = 0 in {M V_1 − V_1 = 0},
  V_1 − h = 0 in {V_2 = k},
  max{ A V_1 − r_1 V_1 + f, M V_1 − V_1 } = 0 in {V_2 > k},
  max{ A V_2 − r_2 V_2 + g, k − V_2 } = 0 in {M V_1 − V_1 < 0},

where M and H are suitable intervention operators defined in Section 2.1.
One of the main contributions of this paper is the Verification Theorem 2.1, establishing that if two functions V_1 and V_2 are regular enough and solve the system above, then they coincide with the equilibrium payoff functions of the game, and a characterization of the related equilibrium strategies is possible.
Furthermore, building on the verification theorem, we present an example of a solvable impulse controller and stopper game. More in detail, we consider a game with a one-dimensional state variable X, modelled as a real-valued (scaled) Brownian motion. Both players have linear running payoffs. When P1 intervenes, she faces a penalty while P2 receives a gain, both consisting of a fixed part and a variable part proportional to the size of the impulse. Moreover, when P2 stops the game, he may suffer a loss proportional to the state variable, while P1 may receive a gain proportional to X as well. Hence, the players' objective functions are

  J_1(x; u, η) = E_x[ ∫_0^η e^{−rt}(X_t − s) dt − Σ_{n: τ_n ≤ η} e^{−rτ_n}(c + λδ_n) + e^{−rη} a X_η ],
  J_2(x; u, η) = E_x[ ∫_0^η e^{−rt}(q − X_t) dt + Σ_{n: τ_n ≤ η} e^{−rτ_n}(d + γδ_n) − e^{−rη} b X_η ],

with suitable coefficients satisfying some assumptions that will be specified later. Some preliminary heuristics on the QVIs above leads us to consider two pairs of candidates for the functions V_i. Then, a careful application of the verification theorem shows that such candidates actually coincide with equilibrium payoff functions. In particular, we are able to identify two kinds of Nash equilibria, both of threshold type, which can be shortly described as follows: (i) in the first type of equilibrium, P1 intervenes when the state X is smaller than some threshold x̄_1 and moves the process to some endogenously determined target x*_1, while P2 terminates the game when the state X is bigger than some x̄_2; in this kind of equilibrium the optimal target of P1, x*_1, is strictly smaller than x̄_2, so the two players intervene separately.
(ii) In the second type, P1 intervenes when the state X is smaller than some (possibly different) threshold x̄_1 and moves the state variable into the intervention region of P2, who is then forced by P1 to end the game. In this case, the players' interventions are simultaneous.
We provide quasi-explicit expressions for the value functions and for the thresholds x̄_i, x*_1 in both equilibria. Finally, we perform some numerical experiments exhibiting several cases in which one of the two equilibria emerges. Whether there are cases in which the two types of equilibria can coexist is still an open question.
The paper is organised as follows. Section 2 gives the general formulation of the impulse controller and stopper game, in particular the notion of admissible strategies; more importantly, we state and prove a verification theorem giving sufficient conditions, in terms of the system of QVIs, for a given couple of payoffs to be a Nash equilibrium. In Section 3 we consider the one-dimensional example with linear payoffs and provide quasi-explicit characterizations of the two types of Nash equilibria sketched above. Finally, some numerical experiments illustrate the qualitative behaviour of such equilibria.

Description of the game
In this section we gather the main theoretical results on a general class of nonzero-sum impulse controller and stopper games. We start with a detailed description of the game, together with all technical assumptions and the definition of admissible strategies.
Let (Ω, F, P) be a probability space equipped with a complete and right-continuous filtration F = (F_t)_{t≥0}. On this space we consider the uncontrolled state variable X ≡ X^x, defined as the solution of the following time-homogeneous SDE:

  dX_t = b(X_t) dt + σ(X_t) dW_t,  X_0 = x,   (2.1)

where (W_t)_{t≥0} is an F-Brownian motion and the coefficients b : R^d → R^d and σ : R^d → R^{d×k} are assumed to be globally Lipschitz continuous, i.e. there exists a constant C > 0 such that

  |b(x) − b(y)| + |σ(x) − σ(y)| ≤ C |x − y|  for all x, y ∈ R^d,

so that the existence of a unique strong solution is granted and X is well-defined. We consider two players, which we call P1 and P2. Equation (2.1) describes the evolution of the state process when neither player intervenes. During the game, P1 can affect X's dynamics by applying some impulse δ ∈ Z in an additive fashion, moving the state variable from X_{τ−} to its new value X_τ = X_{τ−} + δ, where τ denotes the intervention time. The controlled state variable, denoted by X^{x,u}, then satisfies

  X^{x,u}_t = x + ∫_0^t b(X^{x,u}_s) ds + ∫_0^t σ(X^{x,u}_s) dW_s + Σ_{n: τ_n ≤ t} δ_n.

On the other hand, P2 can stop the game by choosing any stopping time η with values in [0, ∞]. We now give a proper definition of such strategies.
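The controlled dynamics just described can be sketched numerically. The snippet below simulates a discretized path of X^{x,u} with an Euler scheme, applying P1's impulses additively; the drift, diffusion and impulse schedule are hypothetical stand-ins chosen for illustration, not objects from the paper.

```python
import numpy as np

def simulate_controlled_path(x0, b, sigma, impulses, T=1.0, n=1000, seed=0):
    """Euler-Maruyama discretization of dX_t = b(X_t) dt + sigma(X_t) dW_t,
    with P1's impulses applied additively: X_tau = X_{tau-} + delta.
    `impulses` is a list of (time, size) pairs, assumed sorted by time."""
    rng = np.random.default_rng(seed)
    dt = T / n
    x = np.empty(n + 1)
    x[0] = x0
    pending = list(impulses)
    for i in range(n):
        x[i + 1] = x[i] + b(x[i]) * dt + sigma(x[i]) * np.sqrt(dt) * rng.standard_normal()
        # apply every impulse whose intervention time falls inside this step
        while pending and pending[0][0] <= (i + 1) * dt:
            x[i + 1] += pending.pop(0)[1]
    return x

# one-dimensional toy example: scaled Brownian motion (b = 0, sigma constant),
# shifted by a single impulse of size +2 at time 0.5
path = simulate_controlled_path(0.0, lambda y: 0.0, lambda y: 0.3, [(0.5, 2.0)])
```

With the noise switched off (sigma = 0) the path is flat except for the jump of size 2 at the impulse time, which makes the additive action of the controller easy to see.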
Remark 2.1 We observe that simultaneous interventions are possible in this game. This is in contrast with games where both players intervene with impulses, in which simultaneous interventions are usually not allowed, since they would be very difficult to handle from a modelling perspective (cf. [1]). Here, on the other hand, due to the different nature of the two players' strategies, one can safely allow for simultaneous actions. This has an interesting consequence for our analysis: as we will see in the linear game of the next section, at least two types of Nash equilibria are possible, and in one of them P1 induces P2 to stop instantaneously.
The players want to maximize their respective objectives, each featuring three discounted terms: a running payoff, P1's intervention cost/gain and a terminal payoff. The players' discount factors may differ. More precisely, for each i = 1, 2, r_i > 0 denotes the discount rate of player i, f, g : R^d → R are their running payoffs, h, k : R^d → R their terminal payoffs, and φ, ψ : R^d × Z → R are the intervention cost and gain, respectively. Throughout the whole paper we will work under the assumption that all these functions are continuous. Hence, we can define the payoffs as follows.
Definition 2.2 Let x ∈ R^d and let (u, η) be a pair of strategies. Provided that the right-hand sides exist and are finite, we set

  J_1(x; u, η) := E_x[ ∫_0^η e^{−r_1 t} f(X^{x,u}_t) dt − Σ_{n: τ_n ≤ η} e^{−r_1 τ_n} φ(X^{x,u}_{τ_n −}, δ_n) + e^{−r_1 η} h(X^{x,u}_η) 1_{{η<∞}} ],
  J_2(x; u, η) := E_x[ ∫_0^η e^{−r_2 t} g(X^{x,u}_t) dt + Σ_{n: τ_n ≤ η} e^{−r_2 τ_n} ψ(X^{x,u}_{τ_n −}, δ_n) + e^{−r_2 η} k(X^{x,u}_η) 1_{{η<∞}} ],

where the subscript in the expectation denotes conditioning with respect to the starting point.
In order for J 1 and J 2 to be well defined, we now introduce the set of admissible strategies.

Definition 2.3 Let x ∈ R^d be some initial state and let (u, η) be some strategy profile. We say that the pair (u, η) is x-admissible if: (i) the random variables appearing in Definition 2.2 (the running, intervention and terminal terms) are all in L^1(Ω); (ii) for each p ∈ N, the random variable X^{x,u}_∞ := sup_{t≥0} |X^{x,u}_t| is in L^p(Ω). We denote by A_x the set of all x-admissible pairs.

Remark 2.2 Notice that, as formulated above, admissibility is a joint condition on the strategies of both players. Under condition (ii) above, and if all the functions f, g, h, k, φ and ψ have at most polynomial growth in their respective variables, the set of all jointly admissible strategies factorizes as A_x = A^1_x × A^2_x, where A^i_x is Pi's set of (individually) admissible strategies, i = 1, 2, defined as follows: A^1_x is the set of all P1's strategies u = (τ_n, δ_n)_{n≥1} such that Σ_{n≥1} |δ_n| ∈ L^p(Ω) for all p ≥ 1, while A^2_x is the set of all [0, ∞]-valued stopping times. Indeed, for P1's strategies, for instance, using classical a priori L^p-estimates for the (uncontrolled) state variable, there exists a constant c > 0 such that

  E[ sup_{t≥0} |X^{x,u}_t|^p ] ≤ c ( 1 + |x|^p + E[ (Σ_{n≥1} |δ_n|)^p ] ).

Moreover, similar estimates can be performed for the other expectations in Definition 2.3(i).
We conclude this section with the classical definition of Nash equilibrium and of the corresponding equilibrium payoffs.

Definition 2.4 (Nash equilibrium) Given x ∈ R^d, we say that (u*, η*) ∈ A_x is a Nash equilibrium if

  J_1(x; u*, η*) ≥ J_1(x; u, η*)  for every u such that (u, η*) ∈ A_x,
  J_2(x; u*, η*) ≥ J_2(x; u*, η)  for every η such that (u*, η) ∈ A_x.

Finally, the equilibrium payoffs of any Nash equilibrium (u*, η*) ∈ A_x are defined as

  V_i(x) := J_i(x; u*, η*),  i = 1, 2.

The system of quasi-variational inequalities
Now we introduce the differential problem that will be satisfied by the equilibrium payoff functions of our game. Let V_1, V_2 : R^d → R be two measurable functions such that

  δ(x) ∈ arg max_{δ ∈ Z} { V_1(x + δ) − φ(x, δ) },  x ∈ R^d,   (2.2)

for some measurable function δ : R^d → Z. Moreover, we define the following two intervention operators:

  MV_1(x) := V_1(x + δ(x)) − φ(x, δ(x)),   (2.3)
  HV_2(x) := V_2(x + δ(x)) + ψ(x, δ(x)),   (2.4)

for each x ∈ R^d. The expressions in (2.2), (2.3) and (2.4) have the following natural interpretation: let x be the current state of the process; if P1 intervenes immediately with impulse δ(x), P1's payoff changes to V_1(x + δ(x)) − φ(x, δ(x)), given by the payoff in the new state minus the intervention cost, which is exactly MV_1(x); similarly, HV_2(x) is P2's payoff right after such an intervention, given by the payoff in the new state plus the intervention gain. Therefore, δ(x) in (2.2) is the optimal impulse that P1 would apply in case of intervention.
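To make the operators M and H concrete, here is a small numerical sketch that evaluates them by brute force over a grid of impulse sizes. Everything in it is a made-up illustration: the candidate payoffs V1, V2, the cost phi and the gain psi are hypothetical stand-ins for the paper's generic data, and the grid approximates the impulse set Z.

```python
import numpy as np

deltas = np.linspace(0.0, 5.0, 501)          # grid approximating the impulse set Z

def M(V1, phi, x):
    """MV1(x) = sup_delta { V1(x + delta) - phi(x, delta) }; the argmax is the
    optimal impulse delta(x) that P1 would apply at state x."""
    vals = V1(x + deltas) - phi(x, deltas)
    i = int(np.argmax(vals))
    return vals[i], deltas[i]

def H(V2, psi, x, delta_x):
    """HV2(x) = V2(x + delta(x)) + psi(x, delta(x)): P2's payoff right after
    P1 intervenes with the optimal impulse delta(x)."""
    return V2(x + delta_x) + psi(x, delta_x)

V1 = lambda y: -(y - 2.0) ** 2               # toy payoff candidates
V2 = lambda y: -np.abs(y)
phi = lambda x, d: 0.5 + 0.5 * d             # fixed + proportional cost
psi = lambda x, d: 0.2 + 0.2 * d             # fixed + proportional gain

m_val, d_star = M(V1, phi, x=0.0)
h_val = H(V2, psi, x=0.0, delta_x=d_star)
```

For this toy data the maximizer can be found by hand (the first-order condition gives delta = 1.75), which makes the grid search easy to sanity-check.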
Moreover, for any V ∈ C^2(R^d) we can consider the infinitesimal generator of the uncontrolled state variable X:

  AV(x) := b(x) · ∇V(x) + (1/2) tr( σ(x) σ^t(x) D^2 V(x) ),

where b, σ are as in (2.1), σ^t denotes the transpose of σ, and ∇V and D^2 V are the gradient and the Hessian matrix of V, respectively. We are interested in the following quasi-variational inequalities (QVIs, for short) for V_1, V_2:

  MV_1 − V_1 ≤ 0 in R^d,   (2.5)
  V_2 − k ≥ 0 in R^d,   (2.6)
  HV_2 − V_2 = 0 in {MV_1 − V_1 = 0},   (2.7)
  V_1 − h = 0 in {V_2 = k},   (2.8)
  max{ AV_1 − r_1 V_1 + f, MV_1 − V_1 } = 0 in {V_2 > k},   (2.9)
  max{ AV_2 − r_2 V_2 + g, k − V_2 } = 0 in {MV_1 − V_1 < 0}.   (2.10)

Each part of the QVI system above can be interpreted in the following way: (2.5) means that it is not always optimal for P1 to intervene, and it is a standard condition in impulse control theory [10,8]; (2.6) if the current state is x and P2 chooses to stop the game immediately, i.e. η = 0, he gains k(x), and since this is in general a suboptimal strategy we must have V_2(x) ≥ k(x) for all x ∈ R^d; (2.7) by the definition of Nash equilibrium we expect that P2 does not lose anything when P1 intervenes, as in [1], since otherwise P2 would like to deviate, contradicting the notion of equilibrium; (2.9) before P2 stops the game, P1 plays as in a classic impulse control problem (e.g. [10]); (2.10) similarly, when P1 does not intervene, P2 solves his own optimal stopping problem (e.g. [11]).
After all this preparation, we are ready to move to our main result, which is a verification theorem linking Nash equilibria and solutions to the QVI system above.

The verification theorem
In this subsection we state and prove our main verification theorem. This result will be key in order to compute Nash equilibria in specific examples.
Theorem 2.1 Let V_1, V_2 : R^d → R be two given functions. Assume that (2.2) holds and set

  C_1 := {MV_1 − V_1 < 0},  C_2 := {V_2 > k},

with MV_1 as in (2.3). Moreover, assume that:
• V_1 and V_2 are solutions of the system of QVIs (2.5)-(2.10);
• V_i ∈ C^2(C_j \ ∂C_i) ∩ C^1(C_j) ∩ C(R^d) for i ≠ j, and both functions have at most polynomial growth;
• ∂C_i is a Lipschitz surface (i.e. locally the graph of a Lipschitz function), and V_i's second-order derivatives are locally bounded near ∂C_i, for i = 1, 2.
Finally, let x ∈ R^d and assume that (u*, η*) ∈ A_x, where u* = (τ*_n, δ*_n)_{n≥1} is given by

  τ*_0 := 0,  τ*_n := inf{ t > τ*_{n−1} : X_t ∈ C_1^c },  δ*_n := δ(X_{τ*_n −}),

and η* := inf{ t ≥ 0 : X_t ∈ C_2^c }, with X the correspondingly controlled process. Then (u*, η*) is a Nash equilibrium and V_1, V_2 are the associated equilibrium payoff functions, i.e. V_i(x) = J_i(x; u*, η*) for i = 1, 2.

Remark 2.3 First, we stress that, unlike in usual impulse control problems, the candidates V_1, V_2 are not required to be twice differentiable everywhere, but only in {V_2 > k} and {MV_1 − V_1 < 0}, respectively. Moreover, we observe that, for the equilibrium strategies in the theorem above, the right-continuity of (X^{x,u}_t)_{t≥0} implies that the controlled process lies in the closure of C_2 for every s ∈ [0, η) and in the closure of C_1 at every τ_k < ∞, for every pair of strategies u and η such that both (u*, η) and (u, η*) belong to A_x.
The proof is performed in three steps.
Step 1: We show that V_1(x) ≥ J_1(x; u, η*). Let u be a strategy such that (u, η*) ∈ A_x. Thanks to the regularity assumptions, and by the approximation arguments of Theorem 2.1 in [25] (for more details see the proof of Theorem 3.3 in [1]), we can assume without loss of generality that V_1 is twice differentiable everywhere in C_2. For each r > 0 and n ∈ N, we set τ_{r,n} := τ_n ∧ inf{ t ≥ 0 : X_t ∉ B_r(x) }, where B_r(x) is the open ball with radius r and centre x. As usual, we adopt the convention inf ∅ = +∞. Applying Itô's formula to e^{−r_1 s} V_1(X_s) between time zero and τ_{r,n}, and taking conditional expectations on both sides, gives an estimate valid for all s ∈ [0, η*); moreover, using (2.5), the same estimate holds across P1's intervention times.
Observe that, by admissibility, the terms involved are bounded in L^1(Ω), uniformly in r and n, by a quantity of the form C(1 + E[sup_{t≥0} |X^{x,u}_t|^p]) for some constants C > 0 and p ∈ N. Thus, we can use the dominated convergence theorem and pass to the limit, first as r → ∞ and then as n → ∞. Finally, because of (2.8), we obtain V_1(x) ≥ J_1(x; u, η*).

Step 2: We show that V_2(x) ≥ J_2(x; u*, η) for every η such that (u*, η) ∈ A_x. Thanks to the regularity assumptions, and by the same approximation argument as before, we can assume again without loss of generality that V_2 is twice differentiable everywhere in C_1. Arguing exactly as in Step 1, we obtain the analogous estimate for all s ∈ [0, η). Moreover, due to (2.7) and (2.13), P2's payoff does not decrease at P1's intervention times. Then, as before, we can use the dominated convergence theorem and pass to the limit, so that using (2.8) we obtain V_2(x) ≥ J_2(x; u*, η).

Step 3: We show that V_1(x) = J_1(x; u*, η*). We argue as in Step 1, with equalities instead of inequalities, by the defining properties of u*. Similarly, for P2 we obtain V_2(x) = J_2(x; u*, η*).

An impulse controller-stopper game with linear payoffs
In the next sections 3.1-3.4 we provide an application of the Verification Theorem 2.1 to an impulse game with a one-dimensional state variable evolving essentially as a Brownian motion, which can be shifted by P1's impulses and stopped by P2, and where both players want to maximise linear payoffs. We find two types of Nash equilibria for this game, depending on whether P1 finds it convenient or not to force P2 to stop the game. For both types, we provide quasi-explicit expressions for the equilibrium payoff functions and related strategies. Our findings will be illustrated by some numerical examples.

Setting
We now place ourselves in a more specific setting. The state variable is one-dimensional, and the players have the following linear payoffs, for x ∈ R:

  f(x) = x − s,  g(x) = q − x,  h(x) = ax,  k(x) = −bx,
  φ(x, δ) = c + λδ,  ψ(x, δ) = d + γδ,

with s, c, λ, a, q, d, γ, b positive constants fulfilling the assumptions in (3.1). Hence, given an initial state x and an impulse strategy u = (τ_n, δ_n)_{n≥1}, we define the controlled process X^{x,u} as

  X^{x,u}_t = x + σ W_t + Σ_{n: τ_n ≤ t} δ_n,

where W is a standard one-dimensional Brownian motion and σ > 0 is a fixed parameter. Moreover, we assume that the two players have the same discount factor r_1 = r_2 = r, with r such that 1 − ar > 0, 1 − br > 0 and 1 − λr > 0 (condition (3.2)). The players' payoff functions are then given by

  J_1(x; u, η) = E_x[ ∫_0^η e^{−rt}(X_t − s) dt − Σ_{n: τ_n ≤ η} e^{−rτ_n}(c + λδ_n) + e^{−rη} a X_η ],
  J_2(x; u, η) = E_x[ ∫_0^η e^{−rt}(q − X_t) dt + Σ_{n: τ_n ≤ η} e^{−rτ_n}(d + γδ_n) − e^{−rη} b X_η ].

Therefore, in this game P1 can shift the state variable X by intervening with impulses in order to keep it high enough, paying some cost at each intervention time, until the end of the game, which is decided by P2. In addition, P2, who wants to keep X low, may gain something each time P1 intervenes. At the end of the game, P1 (resp. P2) receives (resp. loses) some amount proportional to X. Hence, depending on whether her terminal payoff is high enough, P1 might want to end the game soon by forcing P2 to do so.
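To get a feel for these payoff functionals, the following crude Monte Carlo sketch estimates J_1 and J_2 under a pair of threshold strategies of the kind studied below (P1 lifts X from a lower barrier up to a target, P2 stops at an upper barrier). All coefficients and thresholds are made up for illustration and are not taken from the paper's scenarios; the finite horizon T simply truncates the infinite-horizon game.

```python
import numpy as np

def mc_payoffs(x0, lo, target, hi, par, T=10.0, dt=0.01, n_paths=200, seed=1):
    """Monte Carlo estimate of (J1, J2) when P1 jumps X from below `lo` up to
    `target` and P2 stops the game as soon as X >= `hi`.
    `par` collects the (hypothetical) coefficients of the linear game."""
    s, q, a, b = par["s"], par["q"], par["a"], par["b"]
    c, d, lam, gam = par["c"], par["d"], par["lam"], par["gam"]
    r, sigma = par["r"], par["sigma"]
    rng = np.random.default_rng(seed)
    J1 = J2 = 0.0
    for _ in range(n_paths):
        x, t, j1, j2 = x0, 0.0, 0.0, 0.0
        while t < T:
            if x >= hi:                          # P2 stops: terminal payoffs
                j1 += np.exp(-r * t) * a * x
                j2 -= np.exp(-r * t) * b * x
                break
            if x < lo:                           # P1 intervenes with impulse delta
                delta = target - x
                j1 -= np.exp(-r * t) * (c + lam * delta)
                j2 += np.exp(-r * t) * (d + gam * delta)
                x = target
            j1 += np.exp(-r * t) * (x - s) * dt  # discounted running payoffs
            j2 += np.exp(-r * t) * (q - x) * dt
            x += sigma * np.sqrt(dt) * rng.standard_normal()
            t += dt
        J1 += j1 / n_paths
        J2 += j2 / n_paths
    return J1, J2

# purely illustrative parameters and thresholds lo < target < hi
par = dict(s=1.0, q=1.0, a=0.1, b=0.1, c=0.5, d=0.2, lam=0.5, gam=0.2, r=0.05, sigma=0.4)
J1, J2 = mc_payoffs(x0=1.2, lo=0.5, target=1.2, hi=2.5, par=par)
```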
Our goal is to find some Nash equilibrium by solving the QVI problem (2.5)-(2.10). More specifically, a heuristic analysis of the QVI system will help us find a couple of quasi-explicit candidates W_1, W_2 for the equilibrium payoff functions V_1, V_2. We recall the optimal impulse size and the intervention operators in this setting,

  δ(x) ∈ arg max_δ { W_1(x + δ) − c − λδ },
  MW_1(x) = sup_δ { W_1(x + δ) − c − λδ },  HW_2(x) = W_2(x + δ(x)) + d + γδ(x),

together with the infinitesimal generator of the uncontrolled state variable, AV = (σ²/2) V''. Before giving the QVI system in this case, let us introduce the continuation regions of the two players,

  C_1 = {MW_1 − W_1 < 0},  C_2 = {W_2(x) > −bx},

so that the respective intervention regions are given by C_i^c, i = 1, 2. Now, the QVI system becomes (3.3)-(3.4). A first look at the system suggests the following representation for W_1 and W_2: in their respective continuation regions, W_i coincides with a function ϕ_i, where ϕ_1 and ϕ_2 are solutions to the ODEs

  (σ²/2) ϕ_1''(x) − r ϕ_1(x) + x − s = 0,  (σ²/2) ϕ_2''(x) − r ϕ_2(x) + q − x = 0.   (3.5)

Hence, for each x ∈ R, we have

  ϕ_1(x) = C_{11} e^{θx} + C_{12} e^{−θx} + (x − s)/r,  ϕ_2(x) = C_{21} e^{θx} + C_{22} e^{−θx} + (q − x)/r,   (3.6)

where C_{11}, C_{12}, C_{21}, C_{22} are real parameters and θ := √(2r)/σ.
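As a quick sanity check on these general solutions, the snippet below verifies by central finite differences that ϕ_1 of the stated form solves (σ²/2)ϕ'' − rϕ + (x − s) = 0; the constants C_11, C_12 and the model parameters are arbitrary illustrative values, not calibrated to anything in the paper.

```python
import math

# phi_1(x) = C11*exp(theta*x) + C12*exp(-theta*x) + (x - s)/r should satisfy
# (sigma^2/2)*phi'' - r*phi + (x - s) = 0 with theta = sqrt(2r)/sigma.
r, sigma, s = 0.05, 0.4, 1.0        # arbitrary illustrative constants
C11, C12 = 0.3, -0.7
theta = math.sqrt(2 * r) / sigma

def phi1(x):
    return C11 * math.exp(theta * x) + C12 * math.exp(-theta * x) + (x - s) / r

def ode_residual(x, h=1e-4):
    # central-difference approximation of the second derivative
    second = (phi1(x + h) - 2 * phi1(x) + phi1(x - h)) / h**2
    return 0.5 * sigma**2 * second - r * phi1(x) + (x - s)

worst = max(abs(ode_residual(x)) for x in (-1.0, 0.0, 0.7, 2.0))
```

The residual is zero up to finite-difference error, confirming that both the homogeneous exponentials (via (σ²/2)θ² = r) and the affine particular solution (x − s)/r cancel in the ODE.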

An equilibrium with no simultaneous interventions
In this subsection we push our heuristics further by focusing on a first type of Nash equilibrium, in which simultaneous interventions do not occur. By this we mean that we look for an equilibrium of threshold type, where P1 intervenes each time X falls below a certain level, say x̄_1, in which case she applies an impulse moving the state variable to an optimal level x*_1 belonging to the continuation region of both players. On the other hand, P2 waits until X is too high for him, i.e. until X crosses some upper level, say x̄_2, at which point he decides to stop the game. The heuristics will lead us to propose candidates for the equilibrium payoffs and the related strategies, which will then be checked to be the correct ones subject to some additional conditions. Such additional conditions will be verified in some numerical examples.
Heuristics. Loosely speaking, since P1 is happy when X is high while P2 prefers it to be low, we make the following ansatz on the continuation regions: C_1 = (x̄_1, +∞) and C_2 = (−∞, x̄_2). Hence, we can rewrite (3.3)-(3.4) accordingly on these regions. Let us find more explicit expressions for the operators MW_1 and HW_2. In this example it is natural to restrict the analysis to δ ≥ 0, since P1 prefers high values of X^{x,u}: whenever she intervenes, she will always move the process to the right, so that

  MW_1(x) = sup_{y ≥ x} { W_1(y) − λ(y − x) − c }.

Here we focus on the case where the maximum point x*_1 belongs to (x̄_1, x̄_2); in other words, P1 does not force P2 to stop. In particular, we have W_1(x*_1) = ϕ_1(x*_1) and

  MW_1(x) = ϕ_1(x*_1) − λ(x*_1 − x) − c.

Therefore, we obtain the explicit form (3.7)-(3.8) of the candidates W_1 and W_2. The parameters appearing in the expressions for W_1 and W_2 must be chosen so as to satisfy the regularity assumptions in the verification theorem, i.e. continuity and smooth pasting at the thresholds.
We can summarize the description of our candidates for equilibrium payoffs in the following

Ansatz 3.1 Let W_1 and W_2 be as in (3.7)-(3.8), where the parameters involved satisfy the order condition x̄_1 < x*_1 < x̄_2 (3.9) and the continuity and smooth-pasting equations (3.10).

Reparameterization. We now conveniently reparameterize the equations above in order to reduce their complexity. Using the expressions in (3.6), we can rewrite (3.10) as the system (3.11). Subtracting (3.11b) from (3.11a) we obtain a first relation; then, adding (3.11c) to (3.11g), we find a second one. Hence, by substitution, we are reduced to solving the sub-system (3.12). Now, the change of variable z = e^{θ(x*_1 − x̄_1)} turns equation (3.12a) into

  F(z) := ln z − 2 (z − 1)/(z + 1) − crθ/(1 − λr) = 0,   (3.13)

which has a unique solution z̄ > 1. Indeed, F satisfies F′(z) > 0 for all z > 1, while F(1) < 0 and lim_{z→+∞} F(z) = +∞; moreover, z = e^{θ(x*_1 − x̄_1)} > 1 due to the order condition (3.9). Therefore, there is exactly one value z̄ > 1 such that F(z̄) = 0, and it can be easily computed numerically. Now, in order to solve (3.12b) and (3.12c), we perform a second change of variable, w = e^{θ(x̄_2 − x̄_1)}, leading to the equations (3.14). Notice that (3.14a) is linear in x̄_2, hence it can be easily solved in terms of z̄ and w, to get the expression (3.15). Equation (3.14b), in turn, can be rewritten as a quartic equation (3.16) in w, for which explicit root formulae are available. However, since these formulae are quite cumbersome and not easy to use, we will solve the equation numerically, leaving the analysis for later. Once the two new parameters z̄ and w̄ are found, by numerically solving the respective equations above, the thresholds x̄_1, x̄_2 and the optimal level for P1, x*_1, can be deduced automatically. It remains to check under which additional conditions such thresholds correspond to a Nash equilibrium of our original linear game. This will be done in the next paragraph.
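Since z̄ is characterized as the unique root of the increasing function F on (1, ∞), it can be computed by plain bisection on an expanding bracket. The sketch below does this for arbitrary illustrative values of c, λ, r and σ (not the paper's scenarios).

```python
import math

# F(z) = ln z - 2(z-1)/(z+1) - c*r*theta/(1 - lam*r) is increasing on (1, inf)
# with F(1) < 0, so an expanding bracket plus bisection finds the unique root.
c, lam, r, sigma = 0.5, 0.5, 0.05, 0.4   # illustrative values only
theta = math.sqrt(2 * r) / sigma
const = c * r * theta / (1 - lam * r)

def F(z):
    return math.log(z) - 2 * (z - 1) / (z + 1) - const

lo, hi = 1.0 + 1e-9, 2.0
while F(hi) < 0:            # expand the bracket until F changes sign
    hi *= 2.0
for _ in range(200):        # plain bisection
    mid = 0.5 * (lo + hi)
    if F(mid) < 0:
        lo = mid
    else:
        hi = mid
zbar = 0.5 * (lo + hi)
```

In practice a library root-finder (e.g. Brent's method) would do the same job; bisection is shown because the monotonicity argument in the text guarantees it converges to the unique root.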
Characterization of the equilibrium and verification. The next proposition summarizes our findings and establishes the link between the solutions z̄ and w̄ of the equations above and the Nash equilibrium of threshold type we are looking for, provided some additional inequalities are fulfilled.
Proposition 3.1 Assume that there exist solutions z̄ > 1 and w̄ of (3.13) and (3.16) satisfying (3.17)-(3.18). Then a Nash equilibrium for the game in Section 3 exists and is given by the pair (u*, η*), where u* = (τ_n, δ_n)_{n≥1} shifts the process to x*_1 whenever it reaches x̄_1, while η* stops the game as soon as the process reaches x̄_2, and where the thresholds x̄_1, x*_1 and x̄_2 satisfy (3.19). Moreover, the functions W_1, W_2 in Ansatz 3.1 coincide with the equilibrium payoff functions V_1, V_2, i.e. V_i(x) = J_i(x; u*, η*) for i = 1, 2.
Proof. The proof consists in checking all the conditions needed to apply the Verification Theorem 2.1. First, notice that by construction the functions W_1 and W_2 satisfy all the required regularity properties, i.e. W_1 and W_2 have polynomial growth, W_1 ∈ C^2((−∞, x̄_2) \ {x̄_1}) ∩ C^1((−∞, x̄_2)) ∩ C(R), and similarly for W_2. Next, we show that for all x ∈ {MW_1 − W_1 = 0} = (−∞, x̄_1] we have W_2(x) = HW_2(x); this follows from the definition of HW_2 and the explicit expression of W_2. We then have to prove that max{AW_2(x) − rW_2(x) + q − x, −bx − W_2(x)} = 0 for all x. For x in the stopping region [x̄_2, +∞), notice that W_2(x) = −bx and AW_2(x) = 0. Hence, we are reduced to checking the inequality q − (1 − br)x ≤ 0 there. Since by assumption 1 − br > 0, the function x ↦ q − (1 − br)x is decreasing, so we just need to check whether the inequality holds at x̄_2, i.e. (1 − br)x̄_2 − q ≥ 0, which is satisfied by (3.17).
To conclude our verification that the candidate equilibrium payoffs satisfy the QVI system, we are left with checking that −bx − W_2(x) = 0 implies W_1(x) = ax and that, on the other side, −bx − W_2(x) < 0 implies max{AW_1(x) − rW_1(x) + x − s, MW_1(x) − W_1(x)} = 0. Now, the first implication holds by definition, while the second one boils down to proving that AW_1(x) − rW_1(x) + x − s ≤ 0 on (−∞, x̄_1]. For x ∈ (x̄_1, x̄_2) we have MW_1(x) − W_1(x) < 0 and, as before, AW_1(x) − rW_1(x) + x − s = 0, as ϕ_1 is a solution to the ODE (3.5). For x ∈ (−∞, x̄_1] we know that MW_1(x) − W_1(x) = 0, and therefore we have to check that AW_1(x) − rW_1(x) + x − s ≤ 0. To do that, recall the explicit form of W_1 on this set and notice that, since by assumption 1 − λr > 0, the relevant function of x is increasing. As a result, we only need to prove that the desired inequality holds for x = x̄_1, i.e. −rϕ_1(x̄_1) + x̄_1 − s ≤ 0, which is verified since Aϕ_1(x̄_1) − rϕ_1(x̄_1) + x̄_1 − s = 0 and Aϕ_1(x̄_1) ≥ 0. To finish the proof, we check that the equilibrium strategies are x-admissible for every x ∈ R. By construction, the controlled process never exits from (x̄_1, x̄_2) ∪ {x}, so that sup_{t≥0} |X_t| ∈ L^p(Ω) holds. It is easy to check that all the other conditions are satisfied provided we show that

  Σ_{k≥1} E[e^{−rτ_k}] < ∞.   (3.20)

To start, let us assume that the initial state x is x*_1. The idea is to write τ_k as a sum of independent and identically distributed copies of a suitable exit time (as in the proof of Proposition 4.7 in [1]). Denote by µ the exit time of the process x*_1 + σW from (x̄_1, x̄_2), where W is a one-dimensional Brownian motion. Then each time τ_k can be decomposed as τ_k = Σ_{l=1}^{k} ζ_l, where the ζ_l are i.i.d. random variables with the same law as µ. We can now show (3.20).
Indeed, E[e^{−rτ_k}] = E[e^{−r Σ_{l=1}^k ζ_l}] and, by the Fubini-Tonelli theorem and the independence of (ζ_l)_{l≥1}, we get E[e^{−rτ_k}] = (E[e^{−rµ}])^k, so that the series in (3.20) is a convergent geometric series, since E[e^{−rµ}] < 1 (recall that µ > 0 almost surely). Then, for any x ∈ (x̄_1, x̄_2) the same arguments hold, whereas, when x ∈ [x̄_2, +∞), P2 stops as soon as the game starts and, as a consequence, P1 cannot apply any impulse, hence the condition is satisfied. Finally, if x ∈ (−∞, x̄_1], P1 intervenes immediately, and the remaining integrability conditions hold since sup_{t≥0} |X_t| ∈ L^p(Ω).
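The geometric-series bound rests on ρ := E[e^{−rµ}] < 1, where µ is the exit time of x*_1 + σW from (x̄_1, x̄_2). A crude Monte Carlo check of this fact, with illustrative thresholds and parameters (not the paper's), might look as follows.

```python
import numpy as np

def laplace_exit(x_start, lo, hi, r, sigma, dt=0.01, n_paths=500, seed=2):
    """Monte Carlo estimate of rho = E[exp(-r*mu)], with mu the exit time of
    x_start + sigma*W from (lo, hi), via a random-walk discretization."""
    rng = np.random.default_rng(seed)
    total = 0.0
    for _ in range(n_paths):
        x, t = x_start, 0.0
        while lo < x < hi:
            x += sigma * np.sqrt(dt) * rng.standard_normal()
            t += dt
        total += np.exp(-r * t)
    return total / n_paths

rho = laplace_exit(x_start=1.2, lo=0.5, hi=2.5, r=0.05, sigma=0.4)
# sum_{k>=1} E[exp(-r*tau_k)] = sum_k rho^k = rho/(1 - rho), finite since rho < 1
geom_sum = rho / (1 - rho)
```

Since µ > 0 on every path, each sample of e^{−rµ} is strictly below 1, so the estimated ratio ρ of the geometric series stays in (0, 1) and the series converges.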

An equilibrium where the controller activates the stopper
We now turn to another kind of Nash equilibrium, in which P1 behaves similarly to the previous type, with the main difference that this time, when the state variable X falls below a given threshold, she intervenes and sends X directly to the stopping region of P2, hence forcing him to stop the game instantaneously. In particular, this is an equilibrium in which the two players act at the same time. The approach we use to characterize such an equilibrium follows the same steps as in the previous subsection.
Heuristics. We start with some heuristics leading us to formulate a conjecture on the equations that the thresholds characterizing this equilibrium should reasonably satisfy. Arguing as before, we expect the candidates for the equilibrium payoffs to be of the same threshold type, so that we can again rewrite (3.3)-(3.4) accordingly. Now, according to the type of equilibrium we want to identify, we investigate the case in which the maximum point of the function y ↦ W_1(y) − λy belongs to [x̄_2, ∞), meaning that when P1 intervenes she applies an optimal impulse moving the state variable to the stopping region of her competitor. Since W_1(y) = ay for y ≥ x̄_2, in this case we have

  MW_1(x) = sup_{y ≥ x̄_2} { ay − λ(y − x) − c }.
Therefore, we have the following scenarios: the supremum above is attained at y = x̄_2 if a < λ, is independent of y if a = λ, and is infinite if a > λ. Clearly, the only interesting case is a < λ, so that x*_1 = x̄_2. As a consequence, this type of equilibrium is characterized by only two thresholds. Similarly to the previous subsection, we characterize the parameters (C_{11}, C_{12}, C_{21}, C_{22}) and the thresholds (x̄_1, x̄_2) by exploiting the continuity and smooth-pasting conditions coming from the regularity assumptions postulated in Theorem 2.1. By doing so, we obtain the system (3.23), together with the order condition x̄_1 < x̄_2.
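The case distinction above can be seen numerically: for a < λ the map y ↦ ay − λ(y − x) − c is strictly decreasing on [x̄_2, ∞), so its maximum over any grid sits at the left endpoint, i.e. P1's optimal target coincides with P2's stopping threshold. The numbers below are illustrative only.

```python
import numpy as np

# For a < lam, MW1(x) = sup_{y >= xbar2} { a*y - lam*(y - x) - c } is attained
# at y = xbar2: the objective has slope a - lam < 0 in y.
a, lam, c, xbar2, x = 0.1, 0.5, 0.5, 2.0, 0.0   # illustrative values, a < lam
ys = np.linspace(xbar2, xbar2 + 20.0, 2001)
vals = a * ys - lam * (ys - x) - c
y_star = ys[int(np.argmax(vals))]
m_val = vals.max()
```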
Reparameterization. We first rewrite (3.23) as a system of equations; for instance,

  C_{11} e^{θx̄_2} + C_{12} e^{−θx̄_2} + (x̄_2 − s)/r = a x̄_2.   (3.24c)

Then, dividing (3.24a) by θ and adding it to (3.24d), we obtain a linear equation in C_{11} that can be solved, and consequently C_{12} follows. A similar manipulation of equations (3.24b) and (3.24e) yields a relation which, noting that x̄_1 = x̄_2 − (ln w)/θ and applying the change of variable w = e^{θ(x̄_2 − x̄_1)}, can be rewritten as a linear equation in x̄_2, yielding the expression (3.29). Proceeding analogously with (3.24f), we obtain an alternative expression (3.30) for x̄_2. Then, by equating (3.29) and (3.30), we obtain an equation G(w) = 0 in the single unknown w, (3.31), which has at least one solution w > 1, since lim_{w→+∞} G(w) = +∞ and lim_{w→1} G(w) = −∞. The first limit follows from the highest-order term, w² ln w, being multiplied by (1 − λr)/(1 − ar) > 0 (cf. (3.2)); the second limit follows from (3.1).

Characterization of the equilibrium and verification. The next proposition summarizes our characterization of this Nash equilibrium in terms of only one parameter, w, provided some further conditions hold, which will be checked numerically in the next subsection.
Proposition 3.2 Assume that there exists a solution w of (3.31) such that (3.32)-(3.33) hold. Then a Nash equilibrium for the game in Section 3 exists and is given by the strategies (u*, η*), with u* = (τ_n, δ_n)_{n≥1} shifting the process from x̄_1 directly to x̄_2, and η* stopping the game as soon as the process reaches x̄_2, where the thresholds satisfy x̄_1 = x̄_2 − (ln w)/θ. Moreover, the functions W_1, W_2 of the corresponding ansatz coincide with the equilibrium payoff functions V_1, V_2, i.e. V_i(x) = J_i(x; u*, η*) for i = 1, 2.
Proof. We proceed as for the previous equilibrium, checking all the conditions necessary to apply the verification theorem. First of all, the functions W_1, W_2 satisfy by construction all the required regularity properties (piecewise C² with C¹ pasting at the relevant thresholds, and continuity on R), and both have at most polynomial growth. Next, Lemmas A.3 and A.4 give W_2 = HW_2 on {MW_1 − W_1 = 0} = (−∞, x̄_1]. In order to prove that max{AW_2(x) − rW_2(x) + q − x, −bx − W_2(x)} = 0 for all x, we consider two separate cases, as for the previous equilibrium. First, for x ∈ (x̄_1, x̄_2), we have −bx − W_2(x) < 0 and AW_2(x) − rW_2(x) + q − x = 0, since ϕ_2 is a solution to the ODE (3.5), so the maximum of the two terms is zero. Second, for x ∈ [x̄_2, +∞) we have W_2(x) = −bx. Since AW_2(x) = 0, we are reduced to verifying the inequality q − (1 − br)x ≤ 0. Given that x ↦ q − (1 − br)x is decreasing, due to 1 − br > 0, it suffices to show the inequality above at the point x̄_2, i.e. (1 − br)x̄_2 − q ≥ 0, which is implied by (3.33).
To complete the verification that W_1, W_2 are solutions to the QVI system, we show that −bx − W_2(x) = 0 implies W_1(x) = ax and that −bx − W_2(x) < 0 yields max{AW_1(x) − rW_1(x) + x − s, MW_1(x) − W_1(x)} = 0. The first implication holds by definition. For the second one, we have to prove that AW_1(x) − rW_1(x) + x − s ≤ 0 on (−∞, x̄_1]. For x ∈ (x̄_1, x̄_2) we have MW_1(x) − W_1(x) < 0 and, as before, AW_1(x) − rW_1(x) + x − s = 0, as ϕ_1 is a solution to the ODE (3.5). For any x ∈ (−∞, x̄_1] we know that MW_1(x) − W_1(x) = 0, hence we have to check that AW_1(x) − rW_1(x) + x − s ≤ 0. To do so, we notice that, since by assumption 1 − λr > 0, the function x ↦ (1 − λr)x + cr − s − (a − λ)r x̄_2 is increasing in x. Therefore, we only need to prove the desired inequality for x = x̄_1, which is given by Lemma A.3. Finally, the optimal strategies are x-admissible for every x ∈ R. Indeed, by construction, the controlled process never exits from (x̄_1, x̄_2) ∪ {x}, so that sup_{t≥0} |X_t| ∈ L^p(Ω) for all p ≥ 1. It is easy to check that all the other conditions are satisfied as for the first type of equilibrium.

Numerical experiments
In this section we give some numerical illustrations of the equilibrium payoff functions and a selection of comparative statics regarding the two types of Nash equilibria identified in the previous subsections. It is useful to remember that, in order for the solutions to the QVI system to be Nash equilibria of one of the two types, they have to satisfy either (3.17)-(3.18) or (3.32)-(3.33). Before we start, let us recall the meaning of the parameters involved:
• s and q can be interpreted as exogenous costs and gains, respectively. Note that P1's running payoff is f(x) = x − s, hence, in order to make a profit, P1 needs x to be greater than s, which can fairly be considered as P1's expense; an analogous reasoning applies to P2, but in the opposite direction, since g(x) = q − x;
• a and b can be considered as the terminal payoffs' sensitivities to the underlying process X_t, as we have h(x) = ax and k(x) = −bx, respectively;
• at each intervention time, P1 faces a fixed cost, c, while P2 receives a fixed gain, d;
• moreover, λ is P1's proportional cost parameter, while γ is P2's proportional gain parameter;
• finally, r is the discount rate, the same for both players, and σ is the volatility of the state variable.
Equilibrium with no simultaneous interventions. In order to fulfil (3.17)-(3.18), we observe that both inequalities are satisfied for high enough values of w̄. It is possible to show, via graphical analysis, that w̄, the solution to (3.16), is decreasing in a, b, s and increasing in c, d, q, λ and γ. Therefore, we have chosen small values of a, b and s to obtain the first equilibrium, Scenario A, whereas for Scenario B we have looked for higher values and increased q and d in order to find an equilibrium. In Figure 1(ii) we can see how a reduction in the volatility seems to shrink the continuation region; hence, the players become more cautious, reducing their intervention regions, when there is more uncertainty. Another interesting fact to note is how the relative distance between x̄_1 and x̄_2 becomes smaller. This can be due to the increase in P2's terminal payoff sensitivity, b, and the increase in P1's exogenous cost, s. In one direction, P2 loses more money when she decides to terminate the game, therefore she will not stop when the state process value is too high, hence she reduces her threshold x̄_2. In the other, since P1 faces higher exogenous costs, she pushes the target, x*_1, as far as she can, making sure the state process is not going too low, raising the barrier x̄_1. Figures 1(iii)-1(vi) represent some comparative statics of the thresholds x̄_1, x*_1 and x̄_2 for Scenario B. Similar graphs hold for Scenario A as well, and are therefore omitted. First, in Figure 1(iii) we can observe how an increase in P1's fixed cost expands the gap between x̄_1 and x*_1. The more P1 has to pay at each intervention time, the less often she will intervene, lowering the threshold, x̄_1, and increasing the target, x*_1. This allows P2, who does not like high values of x, to slightly lower her threshold, x̄_2, so as to pay less when she stops the game. In Figure 1(iv), the behaviour with respect to the proportional cost is quite different.
P1 reduces her interventions for higher λ, with the distance between x̄_1 and x_1^* left nearly unchanged, while P2 keeps the barrier x̄_2 at a constant level. In particular, P1 tends never to intervene when the proportional cost reaches its maximum, set by the condition 1 − λr > 0. This behavior shows how P1 is quite indifferent to changes in the proportional cost when it is not too big, while she is really sensitive once it gets high. Finally, in Figures 1(v)-1(vi) we can see that, when the amount P2 gains at each of P1's interventions increases, P2 is happy to play for longer, raising the threshold x̄_2, since she is receiving more money.
Equilibrium where P1 induces P2 to stop. To satisfy (3.32)-(3.33), we want w to be neither too high nor too low; in particular, a high λ should help in (3.32), as a high w does in (3.33). As before, via graphical analysis it is possible to show that w, the solution to (3.31), is decreasing in a, b, s and increasing in c, d, q, λ and γ. Therefore, the first instance of Nash equilibrium, Scenario B, has been selected to have high λ and w, choosing high values of c, d, q and γ and low values of b and s, whereas for Scenario A we have looked for lower values of λ and adapted the others.
As before, Figures 2(i)-2(ii) represent the equilibrium payoff functions in the selected examples. First, we can observe that the continuation region in Scenario A is shifted to the right with respect to the one in Scenario B, while its width has not changed much from one case to the other. Furthermore, we can notice that Scenario B is more profitable for P2 and less profitable for P1. These two facts might be explained by the following changes from Scenario B to Scenario A: P1's exogenous cost, s, increases, so P1 cannot tolerate low levels of x, increasing her threshold x̄_1. Moreover, although P2's gains, q, d and γ, decrease, we do not see her threshold scale down, as would be expected now that the game is less profitable. This is probably due to the reduction in b, which leads P2 to stop at higher values of x, raising x̄_2, since she is going to lose less when she decides to stop.
Now, let us comment on the comparative statics in Figures 2(iii)-2(iv)-2(v)-2(vi). When P1's costs, c and λ, increase, Figures 2(iii)-2(iv), P1 intervenes at lower values of x and the distance x̄_2 − x̄_1 increases, even though x̄_2 gets lower as well. This can be explained as follows: as the costs increase, P1 is less willing to intervene, reducing x̄_1, and this shift allows P2 to lower her threshold, x̄_2, since she likes low values of x.
When the fixed gain, d, rises, Figure 2(v), P2 can afford to let the game run longer, raising x̄_2, as she will gain more when P1 makes her stop. Moreover, this makes P1 raise x̄_1 in order to limit the increase in proportional costs. Lastly, we have a similar behavior for the proportional gain, γ, Figure 2(vi). The main difference is the speed with which the distance between the thresholds increases, which is higher for proportional gain increments. This happens because, in the case of proportional gain increments, P2 has a stronger incentive to push x̄_2 far away, since the bigger the impulse, the higher the revenue. An analogous behavior in the case of fixed gain increments would instead lead to a loss in the terminal payoff outrunning the additional profit: the gain, d, does not depend on the size of the impulse P1 plays, while the losses increase, since they depend on P2's threshold through −bx̄_2.
Comparison between the two equilibria. We conclude with a short discussion of the reasons why P1 would play aggressively, forcing P2 to stop. To do so, we first compare the two scenarios A and B in both equilibria. Going from Type I to Type II in Scenario A, we see a reduction in the proportional gain, γ, an increase in P2's terminal payoff sensitivity, b, and a reduction in P2's exogenous gain, q, making P2 lower her threshold, x̄_2, to reduce the losses at the end of the game. Then, P1's exogenous cost, s, increases, making P1 raise both the threshold and the target, x̄_1 and x_1^* respectively. Furthermore, P1's terminal payoff sensitivity, a, increases, which intuitively incentivizes P1 to let P2 end the game sooner, so as to receive the terminal payoff. More specifically, since w̄ is decreasing in a, its increase makes ln w̄ = θ(x̄_2 − x̄_1) decrease; hence, since the distance between the two thresholds is now smaller, P1's target, x_1^*, is closer to P2's barrier, up to the point where they coincide, x_1^* ≡ x̄_2. Regarding Scenario B, again from Type I to Type II, we observe increments in the terminal payoff sensitivities of the two players, a and b; in particular, P1's sensitivity rises much more than in the first scenario, hence P1 has a stronger incentive to let P2 end the game. Another important change regards the proportional cost, λ, which is very high in the case where P1 induces P2 to stop. As we have seen in the comparative statics in Figure 1(iv), P1 intervenes less and less as the proportional cost becomes higher, so it is more convenient to intervene only once, inducing P2 to stop.
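The relation ln w̄ = θ(x̄_2 − x̄_1) used above pins down the distance between the two thresholds. A minimal numerical sketch of its monotonicity, assuming (as the exponentials e^{±θx} in ϕ_2 suggest for a σ-scaled Brownian motion) θ = √(2r)/σ; the values of w̄, r and σ below are hypothetical:

```python
import math

def threshold_distance(w_bar, r=0.05, sigma=1.0):
    """Distance xbar2 - xbar1 implied by ln(w_bar) = theta * (xbar2 - xbar1),
    assuming theta = sqrt(2 r) / sigma, the exponent appearing in phi_2."""
    theta = math.sqrt(2.0 * r) / sigma
    return math.log(w_bar) / theta

# A smaller w_bar (e.g. after increasing a, in which w_bar is decreasing)
# brings the two thresholds closer together:
# threshold_distance(3.0) > threshold_distance(2.0) > 0
```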
We finally observe that while we have managed to find numerical values for which only one of the two types of Nash equilibria emerges at a time, the problem of whether the two equilibria can coexist remains open.
As a consequence, Γ has a unique global maximum at x_1^*, so that in (x_1^*, ∞). This implies the first part of our statement, i.e. δ(x) = (x_1^* − x) 1_{(−∞, x_1^*]}(x). Now, to show (A.1), notice first that, where we set in (x_1^*, ∞). Next, we prove that ζ > 0. By C^0-pasting at x̄_1, which is strictly positive since Γ is increasing in (x̄_1, x_1^*]. Hence, ζ is strictly positive and we have
Proof. First, we recall that, where ϕ_2(x) = C_{21} e^{θx} + C_{22} e^{−θx} + (q − x)/r.
We want to prove that ϕ_2(x) > −bx in (x̄_1, x̄_2) and HW_2(x) > −bx in (−∞, x̄_1]. For the first inequality we are interested in the conditions such that, for all x ∈ (x̄_1, x̄_2), we have