1 Introduction

In recent years, the study of so-called price impact games (also referred to as market impact games) in the context of optimal portfolio liquidation problems has attracted considerable attention in the financial mathematics literature. These games investigate the strategic interaction of financial agents who simultaneously trade in the same risky asset in order to cost-efficiently liquidate their positions while affecting the asset’s execution price through jointly generated price impact, that is, by influencing the price in an adverse manner when they execute their buy or sell orders. Price impact games provide a tractable way to formalize the competition between agents for a risky asset’s liquidity. Among the first game-theoretic approaches carried out to investigate possible phenomena in a competitive equilibrium where agents seek to liquidate their positions in the same risky asset are, e.g., Brunnermeier and Pedersen [6], Attari et al. [4], Carlin et al. [9], Schöneborn [32], Schöneborn and Schied [33], Carmona and Yang [10], and Schied and Zhang [29].

Our goal in this paper is to extend these works by formulating and studying the competition between two strategic agents for liquidity when both agents are trading simultaneously in an illiquid risky asset affected by price impact, because each agent seeks to track their own exogenously given stochastic target strategy like, e.g., a frictionless delta hedge to dynamically hedge the fluctuations of their random endowments. Single-agent optimal tracking problems in the presence of price impact were first considered by Rogers and Singh [28], Naujokat and Westray [26], Horst and Naujokat [22], and Cartea and Jaimungal [11]. To the best of our knowledge, the present manuscript is the first to study a dynamic tracking problem in a competitive two-player price impact game setting. Specifically, we extend the single-player cost-optimal benchmark portfolio tracking problem studied in Bank et al. [5] in the presence of temporary and permanent price impact as proposed by Almgren and Chriss [2] to a two-player stochastic differential game. Both strategic agents are fully aware of the opponent’s individual tracking objectives and they compete for available liquidity as the jointly caused price impact on the execution price directly feeds into their trading performances. We also allow for individual stochastic terminal state constraints on each agent’s final portfolio position. Our aim is to shed light on the strategic interplay between the agents and to make transparent how each agent takes into account the other agent’s trading targets in an optimal cost-minimizing manner by solving for a Nash equilibrium in this two-player price impact game.

The paper most closely related to ours is Schied and Zhang [29]. Therein, the authors determine a unique open-loop Nash equilibrium within the class of deterministic strategies of agents aiming to liquidate a given asset position by maximizing a mean-variance criterion in an Almgren and Chriss [2] framework. Their study is an extension of the corresponding deterministic differential game solved in Carlin et al. [9] of liquidating risk-neutral agents who maximize expected revenues. Other extensions of the latter game include, e.g., Schöneborn and Schied [33], Carmona and Yang [10], Moallemi et al. [25], and Chu et al. [14]. In contrast to these papers, which focus on optimal portfolio liquidation only, we additionally allow the agents to track their own general predictable target strategies as in the single-player case investigated in [5]. Moreover, facing the same time horizon, the players’ terminal portfolio positions are also restricted to some exogenously predetermined stochastic levels which are revealed gradually over time. As a consequence, both agents will choose their dynamic trading strategies from a suitable set of adapted stochastic processes rather than opting for static strategies from a set of deterministic functions as in the papers cited above (except for the numerical study in [10]).

Other recent work on both finite-player and infinite-player mean field price impact games with Almgren-Chriss type price impact includes, e.g., Cardaliaguet and Lehalle [8], Huang et al. [23], Casgrain and Jaimungal [12, 13], Fu et al. [21], Fu and Horst [19], Evangelista and Thamsten [18], and Drapeau et al. [15], where finitely and infinitely many agents pursue optimal liquidation of their initial positions and interact through common aggregated permanent and temporary price impact. Price impact games of liquidating agents in a market model with transient price impact are analyzed, e.g., in Luo and Schied [24], Schied and Zhang [30], Schied et al. [31], Strehle [34]; and very recently in Fu et al. [20] and Neuman and Voß [27]. However, these works are all portfolio liquidation games where the agents steer their initial portfolio positions towards zero (with strict liquidation constraints enforced in [18,19,20,21]). In particular, the agents neither track any individual stochastic running trading targets nor do they aim for reaching an individual random terminal target position. In contrast, as mentioned above, our present study formulates and solves a two-player price impact portfolio tracking game with random terminal state constraints between two heterogeneous agents who have their own individual trading targets.

Our main result is an explicit description of a unique open-loop Nash equilibrium within the class of progressively measurable strategies for our two-player stochastic differential game, where both agents track their own target strategies as in [5] and interact through temporary and permanent price impact as in [29] and [9]. Mathematically, we solve a linear quadratic stochastic differential game with random terminal state constraints. Inspired by the analysis in [5], we follow a probabilistic and convex-analytic approach in the style of Pontryagin’s stochastic maximum principle. This also allows us to consider general predictable strategies as the agents’ tracking targets and not necessarily Markovian or continuous diffusion-type processes. We prove uniqueness of the Nash equilibrium and derive its characterization, which takes the form of a four-dimensional coupled system of linear forward-backward stochastic differential equations (FBSDEs). Due to the stochastic terminal state constraints the FBSDE system has singular terminal conditions. As a consequence, explicitly computing a solution to the constrained stochastic differential game is a nontrivial task. The manuscript shows how this can be achieved. Solving the singular FBSDE system provides us with the agents’ optimal trading strategies in equilibrium in closed form and unveils a rich phenomenology for their optimal behaviour.

In fact, it turns out that in equilibrium, similar to the single-player solution presented in [5], both agents anticipate their individual running target portfolio by gradually trading in the direction of a weighted average of expected future target positions of the target strategy. However, being aware of the competitor’s tracking goals, each agent also assesses a weighted average of the expected future positions of the opponent’s target strategy and chooses to trade accordingly. Interestingly, it turns out that the agents’ trading directions with respect to the adversary’s target strategy are not invariant but depend on the relation between temporary and permanent price impact. Conceptually, our explicit results extend the analysis carried out by Schöneborn and Schied [33]. Therein, the authors identify two distinct types of illiquid markets: a plastic market where the price impact is predominantly permanent, and an elastic market where the major part of incurred price impact is temporary. Their model predicts that a competitor who is conscious of the other agent’s liquidation intention engages in predatory trading in a plastic market (in the sense that the competitor partly trades in the same direction as her opponent), while she tends to cooperate and provides liquidity in an elastic market (in the sense that she trades in the opposite direction of her opponent’s trading); cf. also the detailed discussion in Schöneborn and Schied [33]. Our closed-form Nash equilibrium solution of our more general price impact portfolio tracking game corroborates this. The novelty of our contribution comes from the fact that both predation by simultaneously trading in the same direction as the opponent, as well as cooperation by trading in the opposite direction, can occur in a coexisting manner, depending on whether the market is plastic or elastic. As a consequence, different behavioral paradigms can emerge as optimal in our Nash equilibrium; see the illustrations in Sect. 4.

The remainder of the paper is organized as follows. In Sect. 2 we introduce our two-player stochastic differential price impact game by extending the framework of Carlin et al. [9] and Schied and Zhang [29] to a stochastic tracking problem of general predictable target strategies and random terminal state constraints. Our main result, an explicit description of a unique open-loop Nash equilibrium of the game, is presented in Sect. 3. Section 4 contains some illustrations and discusses the qualitative behaviour of the two players’ optimal strategies in equilibrium.

Notation: Throughout this manuscript we use superscripts for enumerating purposes as, e.g., in \(X^1\), \(X^2\), \(\alpha ^1\), \(\alpha ^2\), or other quantities like \(\xi ^1\), \(\xi ^2\) etc., to mark all objects which are associated with player 1 and player 2, respectively; or, to itemize objects as \(w^1\), \(w^2\), \(w^3\) etc. In particular, \(X^2\), \(\alpha ^2\), \(\xi ^2\) is not to be confused with quadratic powers, which will be explicitly denoted with brackets like \((\alpha )^2\), or, if necessary, as \((\alpha ^2)^2\).

2 Problem formulation

Let \(T >0\) denote a finite deterministic time horizon and fix a filtered probability space \((\Omega ,{\mathscr {F}},({\mathscr {F}}_t)_{0 \le t \le T},{\mathbb {P}})\) satisfying the usual conditions of right continuity and completeness. We consider two agents (preferred pronouns she/her/hers and he/him/his, respectively) who are trading in a financial market consisting of one risky asset, e.g., a stock. The numbers of shares that agent 1 and agent 2 hold at time \(t \in [0,T]\) are defined, respectively, as

$$\begin{aligned} X^1_t \triangleq x^1 + \int _0^t \alpha ^1_s ds \qquad \text {and} \qquad X^2_t \triangleq x^2 + \int _0^t \alpha ^2_s ds \end{aligned}$$
(1)

with initial positions \(x^1, x^2 \in {\mathbb {R}}\). The real-valued stochastic processes \((\alpha ^1_t)_{0 \le t \le T}\) and \((\alpha ^2_t)_{0 \le t \le T}\) represent the turnover rate at which each agent trades in the risky asset and belong to the general class of stochastic processes

$$\begin{aligned} {\mathscr {A}}\triangleq \left\{ \alpha : \alpha \text { progressively measurable s.t. } {\mathbb {E}} \left[ \int _0^T (\alpha _t)^2 dt \right] < \infty \right\} . \end{aligned}$$
(2)

We adopt the framework from Carlin et al. [9] and Schied and Zhang [29] and suppose that the agents’ trading incurs linear temporary and permanent price impact à la Almgren and Chriss [2] in the sense that trades in the risky asset are executed at prices

$$\begin{aligned} S_ t \triangleq P_t + \lambda (\alpha ^1_t + \alpha ^2_t)+ \gamma ((X^1_t - x^1) + (X^2_t - x^2)) \quad (0 \le t \le T) \end{aligned}$$
(3)

with some unaffected price process \(P_\cdot = P_0 + \sqrt{\sigma } W_\cdot \) following a Brownian motion \((W_t)_{0 \le t \le T}\) with respect to the underlying filtration with variance \(\sigma > 0\). The trading of both agents in the risky asset consumes available liquidity and instantaneously affects the execution price in (3) in an adverse manner through temporary price impact \(\lambda > 0\). In addition, the agents’ total accumulated trading activity also leaves a trace in the execution price which is captured by the permanent price impact parameter \(\gamma >0\).
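The price dynamics in (3) can be made concrete with a small simulation. The following is only a minimal sketch on a uniform time grid with left-endpoint holdings; the function name `execution_price`, the default parameter values, and the discretization itself are illustrative assumptions and not part of the model above.

```python
import numpy as np

def execution_price(alpha1, alpha2, T=1.0, p0=100.0, sigma=0.04,
                    lam=0.1, gamma=0.05, rng=None):
    """Left-endpoint discretization of the execution price (3).

    alpha1, alpha2: trading rates of agent 1 and agent 2 on a uniform grid.
    Returns the unaffected price P and the execution price S on that grid.
    All default values are hypothetical and chosen for illustration only.
    """
    alpha1 = np.asarray(alpha1, dtype=float)
    alpha2 = np.asarray(alpha2, dtype=float)
    n = alpha1.size
    dt = T / n
    rng = np.random.default_rng(0) if rng is None else rng
    # Unaffected price P = P_0 + sqrt(sigma) * W at grid times t_0, ..., t_{n-1}
    dW = rng.normal(0.0, np.sqrt(dt), size=n)
    W = np.concatenate(([0.0], np.cumsum(dW)[:-1]))
    P = p0 + np.sqrt(sigma) * W
    # Accumulated net trading (X^1_t - x^1) + (X^2_t - x^2) up to each grid time
    traded = dt * np.concatenate(([0.0], np.cumsum(alpha1 + alpha2)[:-1]))
    # Temporary impact acts on current rates, permanent impact on accumulated trades
    S = P + lam * (alpha1 + alpha2) + gamma * traded
    return P, S
```

With both rates set to zero the execution price coincides with the unaffected price; with constant buying rates the gap `S - P` starts at the temporary impact level and then grows linearly through the permanent component.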

Similar to the single-agent setup in Bank et al. [5] we assume that agent 1 and agent 2 are trading in this illiquid risky asset because each agent seeks to track their own exogenously given target strategy \((\xi _t^1)_{0 \le t \le T}\) and \((\xi _t^2)_{0 \le t \le T}\), respectively. Both processes \(\xi ^1\) and \(\xi ^2\) are supposed to be real-valued predictable processes in \(L^2({\mathbb {P}}\otimes dt)\) and can be thought of, for instance, as hedging strategies adopted from a frictionless market. Moreover, the agents are also required to reach a predetermined terminal portfolio target position \(\Xi ^1_T\) and \(\Xi ^2_T\) in \(L^2({\mathbb {P}},{\mathscr {F}}_T)\) at time T. Mathematically, we can formalize their objectives as follows: For a given strategy \((\alpha ^2_t)_{0 \le t \le T}\) of her competitor agent 2, agent 1 aims to choose her trading rate \((\alpha ^1_t)_{0 \le t \le T}\) in order to minimize the cost functional

$$\begin{aligned} \begin{aligned} J^1(\alpha ^1;\alpha ^2)&\triangleq \; {\mathbb {E}}\left[ \frac{1}{2} \sigma \int _0^T (X^1_t - \xi ^1_t)^2 dt \right. \\&\quad \left. + \frac{1}{2} \lambda \int _0^T \alpha ^1_t \left( \alpha ^1_t + \alpha ^2_t \right) dt + \frac{1}{2} \gamma \int _0^T \alpha ^1_t \left( X^2_t - x^2 \right) dt \right] , \end{aligned} \end{aligned}$$
(4)

whereas agent 2 wishes to minimize

$$\begin{aligned} \begin{aligned} J^2(\alpha ^2;\alpha ^1)&\triangleq \; {\mathbb {E}}\left[ \frac{1}{2} \sigma \int _0^T (X^2_t - \xi ^2_t)^2 dt \right. \\&\quad \left. + \frac{1}{2} \lambda \int _0^T \alpha ^2_t \left( \alpha ^1_t + \alpha ^2_t \right) dt + \frac{1}{2} \gamma \int _0^T \alpha ^2_t \left( X^1_t - x^1 \right) dt \right] \end{aligned} \end{aligned}$$
(5)

via his trading rate \((\alpha ^2_t)_{0 \le t \le T}\) in response to a given strategy \((\alpha ^1_t)_{0 \le t \le T}\) of his opponent agent 1. As in the single-agent problem in Bank et al. [5], the first term in (4) and (5) reflects the agents’ running after their individual target strategies \(\xi ^1\) and \(\xi ^2\), respectively, through minimizing the corresponding squared deviation from their respective portfolio positions \(X^1\) and \(X^2\). The common weight parameter \(\sigma \) measures price fluctuations of the underlying unaffected price process. The second and third terms in (4) and (5) take into account the additional incurred linear quadratic illiquidity costs which are induced by temporary and permanent price impact while both agents are trading in the risky asset as stipulated in (3) (see also Carlin et al. [9] and Schied and Zhang [29]). Note, however, that due to each agent’s individual terminal state constraint \(X^i_T = \Xi ^i_T\) \({\mathbb {P}}\)-a.s. (for \(i=1,2\)) only the competitor’s accrued permanent price impact feeds into their respective cost functional. Indeed, integration by parts yields that the i-th agent’s permanent impact from their own trading always creates the same costs, proportional to \(\gamma (X^i_T - x^i)^2=\gamma (\Xi ^i_T - x^i)^2\), independent of their chosen trading rate and therefore can be neglected in their own objective functional. We obtain the following individual optimal stochastic control problems for agent 1 and agent 2, namely,

$$\begin{aligned} J^1(\alpha ^1;\alpha ^2) \rightarrow \min _{\alpha ^1 \in {\mathscr {A}}^1} \end{aligned}$$
(6)

for any fixed strategy \(\alpha ^2 \in {\mathscr {A}}^2\), and

$$\begin{aligned} J^2(\alpha ^2;\alpha ^1) \rightarrow \min _{\alpha ^2 \in {\mathscr {A}}^2}, \end{aligned}$$
(7)

for any fixed strategy \(\alpha ^1 \in {\mathscr {A}}^1\), where \({\mathscr {A}}^{i}\), \(i=1,2\), is the set of admissible constrained policies defined as

$$\begin{aligned} {\mathscr {A}}^{i} \triangleq \left\{ \alpha ^i : \alpha ^i \in {\mathscr {A}}\text { satisfying } X^i_T = x^i + \int _0^T \alpha ^i_t dt = \Xi ^i_T \; {\mathbb {P}}\text {-a.s.} \right\} . \end{aligned}$$
(8)
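The integration-by-parts step invoked before (6) for an agent's own permanent impact can be spelled out as follows. This is only a sketch, using nothing beyond \(dX^i_t = \alpha ^i_t dt\), \(X^i_0 = x^i\) from (1) and the terminal constraint in (8):

$$\begin{aligned} \int _0^T \alpha ^i_t (X^i_t - x^i) dt = \int _0^T (X^i_t - x^i) dX^i_t = \frac{1}{2} (X^i_T - x^i)^2 = \frac{1}{2} (\Xi ^i_T - x^i)^2 \quad {\mathbb {P}}\text {-a.s.} \end{aligned}$$

The right-hand side does not depend on the choice of \(\alpha ^i \in {\mathscr {A}}^i\). Including the agent's own permanent impact would therefore only shift \(J^i\) by a constant multiple of \(\gamma \, {\mathbb {E}}[(\Xi ^i_T - x^i)^2]\) and would not change the minimizer.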

Similar to Bank et al. [5] we further assume that the target positions \(\Xi ^1_T, \Xi ^2_T \in L^2({\mathbb {P}},{\mathscr {F}}_T)\) satisfy

$$\begin{aligned} {\mathbb {E}}\left[ \int _0^T \frac{1}{T-s} d\langle M^+ \rangle _s \right]< \infty \quad \text {and} \quad {\mathbb {E}}\left[ \int _0^T \frac{1}{T-s} d\langle M^- \rangle _s \right] < \infty , \end{aligned}$$
(9)

where \(M^+_t \triangleq {\mathbb {E}}[\Xi ^1_T + \Xi ^2_T \vert {\mathscr {F}}_t]\) and \(M^-_t \triangleq {\mathbb {E}}[\Xi ^1_T - \Xi ^2_T \vert {\mathscr {F}}_t]\) for \(0 \le t \le T\).
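To get a feeling for condition (9), consider the following illustrative special case, which is an assumption made here and not part of the setup above. Let \(\Xi ^2_T \equiv 0\) and \(\Xi ^1_T = \int _0^T (T-s)^{\kappa } dW_s\) for a hypothetical exponent \(\kappa \ge 0\). Then \(M^+_t = M^-_t = \int _0^t (T-s)^{\kappa } dW_s\) with \(d\langle M^{\pm } \rangle _s = (T-s)^{2\kappa } ds\), and

$$\begin{aligned} {\mathbb {E}}\left[ \int _0^T \frac{1}{T-s} d\langle M^{\pm } \rangle _s \right] = \int _0^T (T-s)^{2\kappa - 1} ds = {\left\{ \begin{array}{ll} \frac{T^{2\kappa }}{2\kappa } &{} \text {if } \kappa > 0, \\ \infty &{} \text {if } \kappa = 0. \end{array}\right. } \end{aligned}$$

Hence any \(\kappa > 0\) is compatible with (9), whereas the terminal target \(\Xi ^1_T = W_T\) (the case \(\kappa = 0\)) is not: in this example the information about the target has to stop arriving sufficiently quickly as \(t \uparrow T\).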

Remark 2.1

  1.

    As in Carlin et al. [9] and Schied and Zhang [29] the agents’ individual optimization problems in (6) and (7) are intertwined through common aggregated temporary and permanent price impact affecting their performance functionals \(J^1\) and \(J^2\) in (4) and (5) (in contrast to, e.g., Huang et al. [23], Casgrain and Jaimungal [12, 13] or Ekren and Nadtochiy [17] where agents only interact through permanent or temporary price impact, respectively). One can think of both players as strategic agents who compete for liquidity while concurrently trading in a single illiquid risky asset to meet their tracking objectives for the purpose of, e.g., hedging fluctuations of random endowments. Note that both agents are fully aware of the opponent’s trading targets \(\xi ^i\) and \(\Xi _T^i\) (\(i=1,2\)), as well as the jointly caused price impact on the execution prices in (3). That is, our game is one of complete information as in the related studies in Brunnermeier and Pedersen [6], Carlin et al. [9], Schöneborn and Schied [33], Carmona and Yang [10], and Schied and Zhang [29].

  2.

    For further motivation for the tracking cost functionals in (4) and (5) we refer to the single-player optimization problems studied, e.g., in Rogers and Singh [28], Naujokat and Westray [26], Horst and Naujokat [22], Almgren and Li [3], Bank et al. [5], and Cai et al. [7]. Observe that the squared tracking error also incorporates a risk aversion on each player’s inventory. In this regard, both agents are homogeneous in their inventory risk.

  3.

    Note that the coefficients \(\sigma , \lambda , \gamma > 0\) in the cost functionals in (4) and (5) are constants. This is an important assumption for obtaining a closed-form solution for the stochastic differential game, which is our primary focus of interest. In fact, the only sources of randomness in the game are the target strategies \((\xi ^1_t)_{0 \le t \le T}\), \((\xi ^2_t)_{0 \le t \le T}\) and the random terminal conditions \(\Xi ^1_T\), \(\Xi ^2_T\), which will force the agents’ optimal policies to be random processes as well.

  4.

    Analogously to the study in Bank et al. [5] the assumption in (9) will ensure that \({\mathscr {A}}^i \ne \varnothing \) for \(i=1,2\) (cf. also the Proof of Theorem 3.5 in Sect. 3 below). In fact, for given random variables \(\Xi ^i_T \in L^2({\mathbb {P}},{\mathscr {F}}_T)\) only known at time T the terminal state constraint \(X^i_T = \Xi ^i_T\) \({\mathbb {P}}\)-a.s. (\(i=1,2\)) is quite demanding. Thus, loosely speaking, the condition in (9) requires that the speed at which information on the random ultimate target positions \(\Xi ^1_T\), \(\Xi ^2_T\) is revealed as \(t \uparrow T\) is sufficiently fast.

Our goal is to compute a Nash equilibrium in which both agents solve their minimization problems in (6) and (7) simultaneously, given the strategy of their competitor, in the following sense:

Definition 2.2

A pair of admissible strategies \( ({\hat{\alpha }}^1,{\hat{\alpha }}^2) \in {\mathscr {A}}^1 \times {\mathscr {A}}^2\) is called an open-loop Nash equilibrium if for all admissible strategies \(\alpha ^1 \in {\mathscr {A}}^1\) and \(\alpha ^2 \in {\mathscr {A}}^2\) it holds that

$$\begin{aligned} J^1 ({\hat{\alpha }}^1;{\hat{\alpha }}^2) \le J^1 (\alpha ^1;{\hat{\alpha }}^2) \quad \text {and} \quad J^2 ({\hat{\alpha }}^2;{\hat{\alpha }}^1) \le J^2 (\alpha ^2;{\hat{\alpha }}^1). \end{aligned}$$

In other words, in a Nash equilibrium neither player has an incentive to deviate from the chosen strategy.

Remark 2.3

In the special case of optimally liquidating the agents’ initial risky asset holdings \(x^1, x^2 \in {\mathbb {R}}\) without tracking exogenously given target strategies, i.e., \(\xi ^1 \equiv \xi ^2 \equiv 0\), and with non-random terminal target positions \(\Xi ^1_T = \Xi ^2_T = 0\) \({\mathbb {P}}\)-almost surely, the above formulated two-player (deterministic) differential game is solved in Carlin et al. [9] setting \(\sigma = 0\) in the performance functionals in (4) and (5); and in Schied and Zhang [29] allowing for \(\sigma > 0\) instead. In both studies, the authors obtain a unique open-loop Nash equilibrium in the sense of Definition 2.2 in closed form within the class of deterministic strategies.

3 Main result

Our main result is an explicit description of a unique open-loop Nash equilibrium in the sense of Definition 2.2 of the two-player stochastic differential game formulated in Sect. 2. Inspired by Bank et al. [5] we will use tools from convex analysis and simple calculus of variations arguments to derive the equilibrium strategies.

First, a strict convexity property of each player’s objective in (4) and (5) is established in the following

Lemma 3.1

For every \(\alpha ^2 \in {\mathscr {A}}^2\) fixed, the functional \(\alpha ^1 \mapsto J^1(\alpha ^1;\alpha ^2)\) in (4) is strictly convex in \(\alpha ^1 \in {\mathscr {A}}^1\). Similarly, for every \(\alpha ^1 \in {\mathscr {A}}^1\) fixed, the functional \(\alpha ^2 \mapsto J^2(\alpha ^2;\alpha ^1)\) in (5) is strictly convex in \(\alpha ^2 \in {\mathscr {A}}^2\).

Proof

We only show strict convexity of the first agent’s objective in (4). The reasoning for the second agent’s objective in (5) follows analogously. To this end, let \(\alpha ^2 \in {\mathscr {A}}^2\) be fixed. Consider \(\alpha ^1,{\tilde{\alpha }}^1 \in {\mathscr {A}}^1\) such that \(\alpha ^1 \ne {\tilde{\alpha }}^1\) \(d{\mathbb {P}}\otimes dt\text {-a.e. on } \Omega \times [0,T]\) and denote by \(X^1, {\tilde{X}}^1\) the corresponding share holdings. For every \(\varepsilon \in (0,1)\) it holds that \(\varepsilon \alpha ^1 + (1-\varepsilon ) {\tilde{\alpha }}^1 \in {\mathscr {A}}^1\) with share holdings \(X^{\varepsilon \alpha ^1 + (1-\varepsilon ) {\tilde{\alpha }}^1} = \varepsilon X^{1} + (1-\varepsilon ) {\tilde{X}}^1\). We have to show that

$$\begin{aligned} \varepsilon J^1(\alpha ^1;\alpha ^2) + (1-\varepsilon ) J^1({\tilde{\alpha }}^1;\alpha ^2) - J^1(\varepsilon \alpha ^1 + (1-\varepsilon ) {\tilde{\alpha }}^1; \alpha ^2) > 0. \end{aligned}$$

In fact, a straightforward computation reveals that

$$\begin{aligned} \begin{aligned}&\varepsilon J^1(\alpha ^1;\alpha ^2) + (1-\varepsilon ) J^1({\tilde{\alpha }}^1;\alpha ^2) - J^1(\varepsilon \alpha ^1 + (1-\varepsilon ) {\tilde{\alpha }}^1; \alpha ^2) \\&\quad = \frac{1}{2} \varepsilon (1-\varepsilon ) {\mathbb {E}} \left[ \int _0^T \left( \sigma (X^1_t- {\tilde{X}}^1_t)^2+\lambda (\alpha ^1_t - {\tilde{\alpha }}^1_t)^2 \right) dt \right] >0 \end{aligned} \end{aligned}$$

because \(\alpha ^1 \ne {\tilde{\alpha }}^1\) \(d{\mathbb {P}}\otimes dt\text {-a.e. on } \Omega \times [0,T]\). \(\square \)
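The identity used in the proof of Lemma 3.1 is purely algebraic, so it can be checked numerically on a time grid. The sketch below does this for deterministic paths with a left-endpoint Riemann sum; the function names, the parameter values and the discretization are illustrative assumptions, and the terminal constraint is not imposed because the identity does not rely on it.

```python
import numpy as np

def cost_J1(alpha1, alpha2, xi1, x1=0.0, x2=0.0, T=1.0,
            sigma=1.0, lam=0.5, gamma=0.3):
    """Riemann-sum version of (4) for deterministic paths on a uniform grid."""
    alpha1, alpha2, xi1 = (np.asarray(a, dtype=float)
                           for a in (alpha1, alpha2, xi1))
    dt = T / alpha1.size
    # Holdings at left grid points: X_k = x + dt * sum_{j<k} alpha_j
    X1 = x1 + dt * np.concatenate(([0.0], np.cumsum(alpha1)[:-1]))
    X2 = x2 + dt * np.concatenate(([0.0], np.cumsum(alpha2)[:-1]))
    integrand = (0.5 * sigma * (X1 - xi1) ** 2
                 + 0.5 * lam * alpha1 * (alpha1 + alpha2)
                 + 0.5 * gamma * alpha1 * (X2 - x2))
    return dt * integrand.sum()

def convexity_gap(a, b, alpha2, xi1, eps):
    """eps*J(a) + (1-eps)*J(b) - J(eps*a + (1-eps)*b) for fixed alpha2."""
    mix = eps * a + (1.0 - eps) * b
    return (eps * cost_J1(a, alpha2, xi1)
            + (1.0 - eps) * cost_J1(b, alpha2, xi1)
            - cost_J1(mix, alpha2, xi1))

def closed_form_gap(a, b, eps, T=1.0, sigma=1.0, lam=0.5):
    """Discrete analogue of the right-hand side in the proof of Lemma 3.1."""
    d = np.asarray(a, dtype=float) - np.asarray(b, dtype=float)
    dt = T / d.size
    dX = dt * np.concatenate(([0.0], np.cumsum(d)[:-1]))  # X^1 - tilde X^1
    return 0.5 * eps * (1.0 - eps) * dt * np.sum(sigma * dX ** 2 + lam * d ** 2)
```

For any two distinct rate vectors `a` and `b`, `convexity_gap` and `closed_form_gap` should agree up to floating-point error and be strictly positive, since the terms that are linear in \(\alpha ^1\) cancel and only the quadratic parts survive.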

As an important consequence we obtain

Lemma 3.2

There exists at most one Nash equilibrium in the sense of Definition 2.2.

Proof

We adapt the argument from Schied and Zhang [29, Lemma 4.1] (see also Schied et al. [31, Proposition 4.8]) to our stochastic differential game and prove the claim by contradiction. Specifically, assume that there exist two distinct Nash equilibria \(({\hat{\alpha }}^1,{\hat{\alpha }}^2)\) and \(({\tilde{\alpha }}^1,{\tilde{\alpha }}^2)\) in \({\mathscr {A}}^1 \times {\mathscr {A}}^2\), i.e.,

$$\begin{aligned} \begin{aligned} J^1 ({\hat{\alpha }}^1;{\hat{\alpha }}^2)&\le J^1 (\alpha ^1;{\hat{\alpha }}^2) \quad \text {and} \quad J^2 ({\hat{\alpha }}^2;{\hat{\alpha }}^1) \le J^2 (\alpha ^2;{\hat{\alpha }}^1), \\ J^1 ({\tilde{\alpha }}^1;{\tilde{\alpha }}^2)&\le J^1 (\alpha ^1;{\tilde{\alpha }}^2) \quad \text {and} \quad J^2 ({\tilde{\alpha }}^2;{\tilde{\alpha }}^1) \le J^2 (\alpha ^2;{\tilde{\alpha }}^1), \end{aligned} \end{aligned}$$
(10)

for all admissible strategies \(\alpha ^1 \in {\mathscr {A}}^1\) and \(\alpha ^2 \in {\mathscr {A}}^2\). Then we can define for all \(\varepsilon \in [0,1]\) the function

$$\begin{aligned} \begin{aligned} f(\varepsilon )&\triangleq \, J^1(\varepsilon {\tilde{\alpha }}^1 + (1-\varepsilon ) {\hat{\alpha }}^1;{\hat{\alpha }}^2) + J^2(\varepsilon {\tilde{\alpha }}^2 + (1-\varepsilon ) {\hat{\alpha }}^2;{\hat{\alpha }}^1) \\&\quad \, + J^1((1-\varepsilon ) {\tilde{\alpha }}^1 + \varepsilon {\hat{\alpha }}^1 ;{\tilde{\alpha }}^2) + J^2((1-\varepsilon ) {\tilde{\alpha }}^2 + \varepsilon {\hat{\alpha }}^2 ;{\tilde{\alpha }}^1) . \end{aligned} \end{aligned}$$
(11)

Note that due to Lemma 3.1 and the assumption that the two Nash equilibria \(({\hat{\alpha }}^1,{\hat{\alpha }}^2)\) and \(({\tilde{\alpha }}^1,{\tilde{\alpha }}^2)\) are distinct, the function \(f(\varepsilon )\) is strictly convex in \(\varepsilon \) on [0, 1]. Moreover, in light of (10) it attains its unique minimum at \(\varepsilon = 0\). It follows that

$$\begin{aligned} \lim _{\varepsilon \downarrow 0} \frac{f(\varepsilon )-f(0)}{\varepsilon } = \frac{d}{d\varepsilon } f(\varepsilon ) \Big \vert _{\varepsilon = 0+} \ge 0. \end{aligned}$$
(12)

Next, denoting the corresponding share holdings of \({\hat{\alpha }}^1\) and \({\tilde{\alpha }}^1\) with \({\hat{X}}^1\) and \({\tilde{X}}^1\), respectively, and noting that \(X^{\varepsilon {\tilde{\alpha }}^1 + (1-\varepsilon ) {\hat{\alpha }}^1} = \varepsilon {\tilde{X}}^1 + (1-\varepsilon ) {\hat{X}}^1\), we can compute

$$\begin{aligned} \begin{aligned}&\frac{d}{d\varepsilon } J^1(\varepsilon {\tilde{\alpha }}^1 + (1-\varepsilon ) {\hat{\alpha }}^1;{\hat{\alpha }}^2) \Big \vert _{\varepsilon = 0+} \\&\quad = {\mathbb {E}} \left[ \sigma \! \int _0^T ({\hat{X}}^1_t - \xi ^1_t) ({\tilde{X}}^1_t - {\hat{X}}^1_t) dt + \int _0^T ({\tilde{\alpha }}^1_t-{\hat{\alpha }}^1_t) \left( \frac{1}{2} \lambda (2{\hat{\alpha }}^1_t + {\hat{\alpha }}^2_t) + \frac{1}{2} \gamma ({\hat{X}}^2_t - x^2) \right) dt \right] , \end{aligned} \end{aligned}$$

as well as the derivatives of the remaining three terms in (11) in a very similar manner in order to ultimately obtain

$$\begin{aligned} \begin{aligned}&\frac{d}{d\varepsilon } f(\varepsilon ) \Big \vert _{\varepsilon = 0+} \\&\quad = - \sigma {\mathbb {E}} \left[ \int _0^T \left( ({\tilde{X}}^1_t - {\hat{X}}^1_t)^2 + ({\tilde{X}}^2_t - {\hat{X}}^2_t)^2 \right) dt \right] \\&\qquad + \frac{1}{2} \gamma {\mathbb {E}}\left[ \int _0^T ({\tilde{\alpha }}^1_t - {\hat{\alpha }}^1_t) ({\hat{X}}^2_t - {\tilde{X}}^2_t) dt \right] + \frac{1}{2} \gamma {\mathbb {E}}\left[ \int _0^T ({\tilde{\alpha }}^2_t - {\hat{\alpha }}^2_t) ({\hat{X}}^1_t - {\tilde{X}}^1_t) dt \right] \\&\qquad - \lambda {\mathbb {E}} \left[ \int _0^T \left( ({\tilde{\alpha }}^1_t - {\hat{\alpha }}^1_t) + ({\tilde{\alpha }}^2_t - {\hat{\alpha }}^2_t) \right) ^2 dt \right] , \end{aligned} \end{aligned}$$

where \({\hat{X}}^2\) and \({\tilde{X}}^2\) denote the share holdings of \({\hat{\alpha }}^2\) and \({\tilde{\alpha }}^2\), respectively. Observing that integration by parts yields

$$\begin{aligned} \int _0^T ({\tilde{\alpha }}^1_t - {\hat{\alpha }}^1_t) ({\hat{X}}^2_t - {\tilde{X}}^2_t) dt = - \int _0^T ({\tilde{\alpha }}^2_t - {\hat{\alpha }}^2_t) ({\hat{X}}^1_t - {\tilde{X}}^1_t) dt \end{aligned}$$

because \({\tilde{X}}^i_0 = {\hat{X}}^i_0 = x^i\) and \({\hat{X}}^i_T = {\tilde{X}}^i_T = \Xi ^i_T\) for both \(i \in \{1,2\}\), we obtain

$$\begin{aligned} \begin{aligned} \frac{d}{d\varepsilon } f(\varepsilon ) \Big \vert _{\varepsilon = 0+}&= \, - \sigma {\mathbb {E}} \left[ \int _0^T \left( ({\tilde{X}}^1_t - {\hat{X}}^1_t)^2 + ({\tilde{X}}^2_t - {\hat{X}}^2_t)^2 \right) dt \right] \\&\quad \, - \lambda {\mathbb {E}} \left[ \int _0^T \left( ({\tilde{\alpha }}^1_t - {\hat{\alpha }}^1_t) + ({\tilde{\alpha }}^2_t - {\hat{\alpha }}^2_t) \right) ^2 dt \right] \end{aligned} \end{aligned}$$

which is strictly negative because the two Nash equilibria \(({\hat{\alpha }}^1,{\hat{\alpha }}^2)\) and \(({\tilde{\alpha }}^1,{\tilde{\alpha }}^2)\) are distinct. But this contradicts (12). \(\square \)
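The integration-by-parts identity used in the proof above can be verified directly. As a sketch, write \(\delta ^i \triangleq {\tilde{\alpha }}^i - {\hat{\alpha }}^i\) and \(\Delta ^i \triangleq {\tilde{X}}^i - {\hat{X}}^i\) for \(i \in \{1,2\}\) (shorthand introduced only for this computation), so that \(d\Delta ^i_t = \delta ^i_t dt\) and \(\Delta ^i_0 = \Delta ^i_T = 0\). The product rule then gives

$$\begin{aligned} 0 = \Delta ^1_T \Delta ^2_T - \Delta ^1_0 \Delta ^2_0 = \int _0^T \delta ^1_t \Delta ^2_t dt + \int _0^T \delta ^2_t \Delta ^1_t dt, \end{aligned}$$

and therefore

$$\begin{aligned} \int _0^T \delta ^1_t ({\hat{X}}^2_t - {\tilde{X}}^2_t) dt = - \int _0^T \delta ^1_t \Delta ^2_t dt = \int _0^T \delta ^2_t \Delta ^1_t dt = - \int _0^T \delta ^2_t ({\hat{X}}^1_t - {\tilde{X}}^1_t) dt, \end{aligned}$$

so the two \(\gamma \)-terms in the derivative of \(f\) cancel each other.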

Next, for any arbitrary but fixed controls \({\tilde{\alpha }}^2 \in {\mathscr {A}}^2\) and \({\tilde{\alpha }}^1 \in {\mathscr {A}}^1\), we can introduce the Gâteaux derivatives of the mappings \(\alpha ^1 \mapsto J^1(\alpha ^1;{\tilde{\alpha }}^2)\) at \(\alpha ^1 \in {\mathscr {A}}^1\) and \(\alpha ^2 \mapsto J^2(\alpha ^2;{\tilde{\alpha }}^1)\) at \(\alpha ^2 \in {\mathscr {A}}^2\), respectively, in any directions \(\beta ^1, \beta ^2 \in {\mathscr {A}}^0 \triangleq \{ \beta : \beta \in {\mathscr {A}}\text { satisfying } \int _0^T \beta _t dt = 0 \; {\mathbb {P}}\text {-a.s.}\}\), namely,

$$\begin{aligned} \langle \nabla J^1(\alpha ^1; {\tilde{\alpha }}^2), \beta ^1 \rangle&\triangleq \; \lim _{\varepsilon \rightarrow 0} \frac{J^1(\alpha ^1 +\varepsilon \beta ^1;{\tilde{\alpha }}^2)-J^1(\alpha ^1;{\tilde{\alpha }}^2)}{\varepsilon },\\ \langle \nabla J^2(\alpha ^2;{\tilde{\alpha }}^1), \beta ^2 \rangle&\triangleq \; \lim _{\varepsilon \rightarrow 0} \frac{J^2(\alpha ^2 +\varepsilon \beta ^2; {\tilde{\alpha }}^1 )-J^2(\alpha ^2;{\tilde{\alpha }}^1)}{\varepsilon }. \end{aligned}$$

They admit the following explicit expressions, presented in

Lemma 3.3

Let \({\tilde{\alpha }}^2 \in {\mathscr {A}}^2\) be fixed with corresponding share holdings \({\tilde{X}}^2\). Then for all \(\alpha ^1 \in {\mathscr {A}}^1\) we have

$$\begin{aligned}&\langle \nabla J^1(\alpha ^1; {\tilde{\alpha }}^2), \beta ^1 \rangle \nonumber \\&\quad = {\mathbb {E}}\left[ \int _0^T \beta ^1_s \left( \lambda \alpha ^1_s + \frac{\lambda }{2} {\tilde{\alpha }}^2_s + \frac{\gamma }{2} ({\tilde{X}}^2_s - x^2) + \int _s^T (X^1_t - \xi ^1_t) \sigma dt \right) ds \right] \end{aligned}$$
(13)

for any \(\beta ^1 \in {\mathscr {A}}^0\). Similarly, let \({\tilde{\alpha }}^1 \in {\mathscr {A}}^1\) be fixed with corresponding share holdings \({\tilde{X}}^1\). Then for all \(\alpha ^2 \in {\mathscr {A}}^2\) we have

$$\begin{aligned}&\langle \nabla J^2(\alpha ^2; {\tilde{\alpha }}^1), \beta ^2 \rangle \nonumber \\&\quad = {\mathbb {E}}\left[ \int _0^T \beta ^2_s \left( \lambda \alpha ^2_s + \frac{\lambda }{2} {\tilde{\alpha }}^1_s + \frac{\gamma }{2} ({\tilde{X}}^1_s - x^1) + \int _s^T (X^2_t - \xi ^2_t) \sigma dt \right) ds \right] \end{aligned}$$
(14)

for any \(\beta ^2 \in {\mathscr {A}}^0\).

Proof

We only compute the Gâteaux derivative in (13). The same computations apply for (14). Fix \({\tilde{\alpha }}^2 \in {\mathscr {A}}^2\) with share holdings \({\tilde{X}}^2\) and let \(\alpha ^1 \in {\mathscr {A}}^1, \beta ^1 \in {\mathscr {A}}^0\) as well as \(\varepsilon > 0\). Note that \(\alpha ^1 +\varepsilon \beta ^1 \in {\mathscr {A}}^1\) with share holdings \(X^{\alpha ^1 +\varepsilon \beta ^1} = X^1 + \varepsilon \int _0^\cdot \beta ^1_s ds\). Moreover, since

$$\begin{aligned}&J^1(\alpha ^1 +\varepsilon \beta ^1; {\tilde{\alpha }}^2)-J^1(\alpha ^1;{\tilde{\alpha }}^2) \\&\quad = \, \varepsilon {\mathbb {E}}\left[ \int _0^T \left( \frac{\lambda }{2} \beta ^1_t (2 \alpha ^1_t + {\tilde{\alpha }}^2_t) + \left( \int _0^t \beta ^1_s ds \right) (X^1_t - \xi ^1_t) \sigma +\frac{\gamma }{2} \beta ^1_t ({\tilde{X}}^2_t -x^2) \right) dt \right] \\&\qquad + \frac{1}{2} \varepsilon ^2 {\mathbb {E}}\left[ \int _0^T \left( \lambda (\beta ^1_t)^2 + \left( \int _0^t \beta ^1_sds \right) ^2 \sigma \right) dt \right] , \end{aligned}$$

we obtain the desired result in (13) after applying Fubini’s theorem. \(\square \)
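The formula in (13), including the Fubini step that turns \(\int _0^T (\int _0^t \beta ^1_s ds)(X^1_t - \xi ^1_t) \sigma dt\) into \(\int _0^T \beta ^1_s \int _s^T (X^1_t - \xi ^1_t) \sigma dt \, ds\), can be sanity-checked against a finite difference. The sketch below does so for deterministic paths on a uniform grid; all names, the parameter values and the left-endpoint discretization are illustrative assumptions.

```python
import numpy as np

SIGMA, LAM, GAMMA, T = 1.0, 0.5, 0.3, 1.0  # hypothetical parameter values

def holdings(alpha, x0, dt):
    """Left-endpoint holdings X_k = x0 + dt * sum_{j<k} alpha_j."""
    return x0 + dt * np.concatenate(([0.0], np.cumsum(alpha)[:-1]))

def cost_J1(alpha1, alpha2, xi1, x1, x2):
    """Riemann-sum version of (4)."""
    dt = T / alpha1.size
    X1, X2 = holdings(alpha1, x1, dt), holdings(alpha2, x2, dt)
    integrand = (0.5 * SIGMA * (X1 - xi1) ** 2
                 + 0.5 * LAM * alpha1 * (alpha1 + alpha2)
                 + 0.5 * GAMMA * alpha1 * (X2 - x2))
    return dt * integrand.sum()

def gateaux_formula(alpha1, alpha2, xi1, beta1, x1, x2):
    """Discrete analogue of the right-hand side of (13)."""
    dt = T / alpha1.size
    X1, X2 = holdings(alpha1, x1, dt), holdings(alpha2, x2, dt)
    g = SIGMA * (X1 - xi1)
    # Discrete counterpart of int_s^T sigma (X^1_t - xi^1_t) dt: sum over k > j
    tail = dt * (np.cumsum(g[::-1])[::-1] - g)
    bracket = LAM * alpha1 + 0.5 * LAM * alpha2 + 0.5 * GAMMA * (X2 - x2) + tail
    return dt * np.sum(beta1 * bracket)

def gateaux_numeric(alpha1, alpha2, xi1, beta1, x1, x2, eps=1e-3):
    """Central finite difference of eps -> J1(alpha1 + eps * beta1; alpha2)."""
    up = cost_J1(alpha1 + eps * beta1, alpha2, xi1, x1, x2)
    down = cost_J1(alpha1 - eps * beta1, alpha2, xi1, x1, x2)
    return (up - down) / (2.0 * eps)
```

Because the discretized cost is quadratic in \(\alpha ^1\), the central difference is exact up to rounding, so `gateaux_numeric` and `gateaux_formula` should coincide for any direction `beta1`, in particular for directions that sum to zero as required by \({\mathscr {A}}^0\).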

Having at hand the explicit expressions in (13) and (14) we can now derive a necessary and sufficient first-order condition for the Nash equilibrium in terms of a system of coupled forward-backward stochastic differential equations (FBSDEs).

Lemma 3.4

A pair of controls \(({\hat{\alpha }}^1,{\hat{\alpha }}^2) \in {\mathscr {A}}^1 \times {\mathscr {A}}^2\) is a Nash equilibrium in the sense of Definition 2.2 if and only if \(({\hat{X}}^1, {\hat{X}}^2, {\hat{\alpha }}^1,{\hat{\alpha }}^2)\) solves the following coupled forward-backward SDE system

$$\begin{aligned} \left\{ \begin{aligned} dX^1_t =&\; \alpha ^1_t dt, \qquad X^1_0 = x^1, \\ dX^2_t =&\; \alpha ^2_t dt, \qquad X^2_0 = x^2, \\ d\alpha ^1_t =&\; \frac{\sigma }{\lambda } (X^1_t - \xi ^1_t) dt - \frac{\gamma }{2\lambda } \alpha ^2_t dt - \frac{1}{2} d\alpha ^2_t + dM^1_t, \qquad X^1_T = \Xi ^1_T,\\ d\alpha ^2_t =&\; \frac{\sigma }{\lambda } (X^2_t - \xi ^2_t) dt - \frac{\gamma }{2\lambda } \alpha ^1_t dt - \frac{1}{2} d\alpha ^1_t + dM^2_t, \qquad X^2_T = \Xi ^2_T, \end{aligned} \right. \end{aligned}$$
(15)

for two suitable square integrable martingales \((M^1_t)_{0 \le t < T}\) and \((M^2_t)_{0 \le t < T}\).

Proof

Sufficiency: Assume first that \(({\hat{X}}^1, {\hat{X}}^2,{\hat{\alpha }}^1,{\hat{\alpha }}^2, M^1, M^2)\) with \(({\hat{\alpha }}^1,{\hat{\alpha }}^2) \in {\mathscr {A}}^1 \times {\mathscr {A}}^2\) solves the FBSDE system in (15). We have to show that \({\hat{\alpha }}^1\) minimizes \(\alpha ^1 \mapsto J^1(\alpha ^1;{\hat{\alpha }}^2)\) over \({\mathscr {A}}^1\), and, vice versa, that \({\hat{\alpha }}^2\) minimizes \(\alpha ^2 \mapsto J^2(\alpha ^2;{\hat{\alpha }}^1)\) over \({\mathscr {A}}^2\). Since we are minimizing strictly convex functionals due to Lemma 3.1, a sufficient condition for the optimality of \({\hat{\alpha }}^1\) and \({\hat{\alpha }}^2\), respectively, is given by

$$\begin{aligned} \langle \nabla J^1({\hat{\alpha }}^1; {\hat{\alpha }}^2), \beta ^1 \rangle = 0 \text { for all } \beta ^1 \in {\mathscr {A}}^0 \end{aligned}$$
(16)

and

$$\begin{aligned} \langle \nabla J^2({\hat{\alpha }}^2; {\hat{\alpha }}^1), \beta ^2 \rangle = 0 \text { for all } \beta ^2 \in {\mathscr {A}}^0; \end{aligned}$$
(17)

cf., e.g., Ekeland and Témam [16]. We start with the proof of (16). By assumption we have the representation

$$\begin{aligned} {\hat{\alpha }}^1_t&= \; {\hat{\alpha }}^1_0 + \frac{\sigma }{\lambda } \int _0^t ({\hat{X}}^1_s - \xi ^1_s) ds- \frac{\gamma }{2\lambda } \int _0^t {\hat{\alpha }}^2_s ds \\&\quad -\frac{1}{2} ({\hat{\alpha }}^2_t - {\hat{\alpha }}^2_0) + M^1_t-M^1_0 \quad d{\mathbb {P}}\otimes dt \text {-a.e. on } \Omega \times [0,T) \end{aligned}$$

for some square integrable martingale \((M^1_t)_{0 \le t < T}\). Moreover, since \({\hat{\alpha }}^1, {\hat{\alpha }}^2, \xi ^1 \in L^2({\mathbb {P}}\otimes dt)\) it follows that \({\mathbb {E}}[\int _0^T (M_s^1)^2 ds] < \infty \). Next, introducing the square integrable martingale

$$\begin{aligned} N_s \triangleq {\mathbb {E}}\left[ \int _0^T ({\hat{X}}^1_t - \xi ^1_t) \sigma dt \, \bigg \vert \, {\mathscr {F}}_s \right] \quad (0 \le s \le T) \end{aligned}$$

and plugging the above representation of \({\hat{\alpha }}^1\) in the Gâteaux derivative in (13) we obtain

$$\begin{aligned}&\langle \nabla J^1({\hat{\alpha }}^1; {\hat{\alpha }}^2), \beta ^1 \rangle \\&\quad = {\mathbb {E}}\left[ \int _0^T \beta ^1_s \left( \lambda {\hat{\alpha }}^1_s + \frac{\lambda }{2} {\hat{\alpha }}^2_s + \frac{\gamma }{2} ({\hat{X}}^2_s - x^2) + \int _s^T ({\hat{X}}^1_t - \xi ^1_t) \sigma dt \right) ds \right] \\&\quad = {\mathbb {E}}\left[ \int _0^T \beta ^1_s \left( \lambda {\hat{\alpha }}^1_0 + \frac{\lambda }{2} {\hat{\alpha }}^2_0 + N_T + \lambda M^1_s - \lambda M^1_0 \right) ds \right] \\&\quad = {\mathbb {E}}\left[ \left( \lambda {\hat{\alpha }}^1_0 + \frac{\lambda }{2} {\hat{\alpha }}^2_0 + N_T - \lambda M^1_0 \right) \int _0^T \beta ^1_s ds \right] + \lambda {\mathbb {E}}\left[ \int _0^T \beta ^1_s M^1_s ds \right] \\&\quad = 0 \text { for all } \beta ^1 \in {\mathscr {A}}^0, \end{aligned}$$

where we used Bank et al. [5, Lemma 5.3] in the last line. Hence, as desired, the first order optimality condition in (16) is satisfied by \({\hat{\alpha }}^1 \in {\mathscr {A}}^1\). The same computations show that \({\hat{\alpha }}^2 \in {\mathscr {A}}^2\) satisfies the first order optimality condition in (17). Therefore, we can conclude that \(({\hat{\alpha }}^1,{\hat{\alpha }}^2) \in {\mathscr {A}}^1 \times {\mathscr {A}}^2\) is a Nash equilibrium in the sense of Definition 2.2.

Necessity: Finally, as shown in the Proof of Theorem 3.5 below (which does not use the necessity assertion of the present lemma) the pair of controls \(({\hat{\alpha }}^1, {\hat{\alpha }}^2) \in {\mathscr {A}}^1 \times {\mathscr {A}}^2\) presented in (21) below satisfies the coupled forward backward SDE system in (15). Therefore, by uniqueness of the Nash equilibrium via Lemma 3.2 the assertion is indeed also necessary. \(\square \)

We are now ready to state our main result. To do so, it is convenient to introduce the following nonnegative constants

$$\begin{aligned} \delta ^+ \triangleq \frac{\gamma ^2}{4} + 6 \lambda \sigma , \qquad \delta ^- \triangleq \frac{\gamma ^2}{4} + 2 \lambda \sigma , \end{aligned}$$
(18)

the nonnegative functions

$$\begin{aligned} \begin{aligned} c^+_t&\triangleq \; \frac{1}{3} \sqrt{\delta ^+} \coth (\sqrt{\delta ^+}(T-t)/(3\lambda )) + \frac{1}{6} \gamma , \\ c^-_t&\triangleq \; \sqrt{\delta ^-} \coth (\sqrt{\delta ^-}(T-t)/\lambda ) -\frac{1}{2}\gamma \end{aligned} \qquad (0 \le t \le T) \end{aligned}$$
(19)

such that \(\lim _{t \uparrow T} c_t^{\pm } = +\infty \), as well as the weight functions

$$\begin{aligned} \begin{aligned} w^1_t&\triangleq \frac{\sqrt{\delta ^+} \, e^{\frac{\gamma }{6\lambda }(T-t)}}{3 (c^+_t + c^-_t) \sinh (\sqrt{\delta ^+}(T-t) /(3\lambda ))}, \\ w^2_t&\triangleq \; \frac{\sqrt{\delta ^-} \, e^{-\frac{\gamma }{2\lambda }(T-t)}}{(c^+_t + c^-_t) \sinh (\sqrt{\delta ^-}(T-t)/\lambda )}, \\ w^3_t&\triangleq \; \frac{c^+_t}{c^+_t + c^-_t} - w^1_t, \qquad w^4_t \triangleq \frac{c^-_t}{c^+_t + c^-_t} - w^2_t, \qquad w^5_t \triangleq \frac{c^+_t - c^-_t}{c^+_t + c^-_t} \end{aligned} \end{aligned}$$
(20)

for all \(t \in [0,T]\). An explicit description of the unique Nash equilibrium is provided in the following

Theorem 3.5

There exists a unique open-loop Nash equilibrium \(({\hat{\alpha }}^1, {\hat{\alpha }}^2)\) in \({\mathscr {A}}^1 \times {\mathscr {A}}^2\) in the sense of Definition 2.2. The corresponding equilibrium share holdings \({\hat{X}}^1_\cdot = x^1 + \int _0^\cdot {\hat{\alpha }}^1_tdt\) of agent 1 and \({\hat{X}}^2_\cdot = x^2 + \int _0^\cdot {\hat{\alpha }}^2_tdt\) of agent 2 satisfy the random linear coupled ODE

$$\begin{aligned} \begin{aligned} {\hat{X}}^1_0 =&\; x^1,&\quad d{\hat{X}}^1_t =&\; \frac{c^+_t+c^-_t}{2\lambda } \left( {\hat{\xi }}^1_t - w^5_t {\hat{X}}^2_t - {\hat{X}}^1_t \right) dt, \\ {\hat{X}}^2_0 =&\; x^2,&\quad d{\hat{X}}^2_t =&\; \frac{c^+_t+c^-_t}{2\lambda } \left( {\hat{\xi }}^2_t - w^5_t {\hat{X}}^1_t - {\hat{X}}^2_t \right) dt \end{aligned} \quad (0 \le t < T), \end{aligned}$$
(21)

where, for \(0 \le t \le T\), we let

$$\begin{aligned} \begin{aligned} {\hat{\xi }}^{1}_t&\triangleq \; w^1_t \cdot {\mathbb {E}}[\Xi ^1_T + \Xi ^2_T \,\vert \, {\mathscr {F}}_t] + w^2_t \cdot {\mathbb {E}}[\Xi ^1_T - \Xi ^2_T \,\vert \, {\mathscr {F}}_t] \\&\quad + w^3_t \cdot {\mathbb {E}}\left[ \int _t^T (\xi ^1_u + \xi ^2_u) \cdot K^1(t,u) \,du \,\Big \vert \,{\mathscr {F}}_t \right] \\&\quad + w^4_t \cdot {\mathbb {E}}\left[ \int _t^T (\xi ^1_u - \xi ^2_u) \cdot K^2(t,u) \, du \, \Big \vert \, {\mathscr {F}}_t \right] \end{aligned} \end{aligned}$$
(22)

and

$$\begin{aligned} \begin{aligned} {\hat{\xi }}^2_t&\triangleq \; w^1_t \cdot {\mathbb {E}}[\Xi ^2_T + \Xi ^1_T \,\vert \, {\mathscr {F}}_t] + w^2_t \cdot {\mathbb {E}}[\Xi ^2_T - \Xi ^1_T \,\vert \, {\mathscr {F}}_t] \\&\quad + w^3_t \cdot {\mathbb {E}}\left[ \int _t^T (\xi ^2_u + \xi ^1_u) \cdot K^1(t,u) \,du \,\Big \vert \,{\mathscr {F}}_t \right] \\&\quad + w^4_t \cdot {\mathbb {E}}\left[ \int _t^T (\xi ^2_u - \xi ^1_u) \cdot K^2(t,u) \, du \, \Big \vert \, {\mathscr {F}}_t \right] \end{aligned} \end{aligned}$$
(23)

with nonnegative kernels

$$\begin{aligned} \begin{aligned} K^1(t,u)&\triangleq \; \frac{w^1_t}{w^3_t} \frac{2\sigma e^{-\frac{\gamma }{6\lambda }(T-u)} \sinh (\sqrt{\delta ^+}(T-u)/(3\lambda ))}{\sqrt{\delta ^+}}, \\ K^2(t,u)&\triangleq \; \frac{w^2_t}{w^4_t} \frac{2\sigma e^{\frac{\gamma }{2\lambda }(T-u)} \sinh (\sqrt{\delta ^-}(T-u)/\lambda )}{\sqrt{\delta ^-}} \end{aligned} \;\; (0 \le t \le u < T) \end{aligned}$$
(24)

which, for each \(t \in [0,T)\), integrate to one over \([t,T]\). The solution \(({\hat{X}}^1, {\hat{X}}^2)\) of (21) satisfies the terminal state constraints in the sense that

$$\begin{aligned} \lim _{t \uparrow T} {\hat{X}}^1_t = \Xi ^1_T \quad \text {and} \quad \lim _{t \uparrow T} {\hat{X}}^2_t = \Xi ^2_T \quad {\mathbb {P}}\text {-a.s.} \end{aligned}$$
(25)

The proof of Theorem 3.5 consists of verifying that the pair \(({\hat{\alpha }}^1,{\hat{\alpha }}^2)\) with dynamics in (21) is admissible (i.e., belongs to \({\mathscr {A}}^1\times {\mathscr {A}}^2\)) and satisfies the FBSDE system in (15). An explanation of how the Nash equilibrium \(({\hat{\alpha }}^1,{\hat{\alpha }}^2)\) can be constructed is provided in the appendix.
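The claim that the kernels in (24) integrate to one can be sanity-checked numerically. The sketch below evaluates \(c^\pm \), the weights from (20) and the kernels from (24) for one hypothetical parameter set (the values of `LAM`, `GAM`, `SIG`, `T` are assumptions chosen for illustration) and integrates \(u \mapsto K^i(t,u)\) over \([t,T]\) with a composite Simpson rule.

```python
import math

# Hypothetical parameter values, chosen only for illustration.
LAM, GAM, SIG, T = 1.0, 0.5, 2.0, 1.0
D_PLUS = GAM**2 / 4 + 6 * LAM * SIG    # delta^+ in (18)
D_MINUS = GAM**2 / 4 + 2 * LAM * SIG   # delta^- in (18)
RP, RM = math.sqrt(D_PLUS), math.sqrt(D_MINUS)

def c_plus(t):
    # c^+ in (19), written with coth(x) = 1 / tanh(x).
    return RP / 3 / math.tanh(RP * (T - t) / (3 * LAM)) + GAM / 6

def c_minus(t):
    # c^- in (19).
    return RM / math.tanh(RM * (T - t) / LAM) - GAM / 2

def weights(t):
    """Return (w1, w2, w3, w4) from (20) for t < T."""
    s = c_plus(t) + c_minus(t)
    w1 = RP * math.exp(GAM * (T - t) / (6 * LAM)) / (3 * s * math.sinh(RP * (T - t) / (3 * LAM)))
    w2 = RM * math.exp(-GAM * (T - t) / (2 * LAM)) / (s * math.sinh(RM * (T - t) / LAM))
    return w1, w2, c_plus(t) / s - w1, c_minus(t) / s - w2

def kernel1(t, u):
    # K^1(t, u) in (24).
    w1, _, w3, _ = weights(t)
    return w1 / w3 * 2 * SIG * math.exp(-GAM * (T - u) / (6 * LAM)) * math.sinh(RP * (T - u) / (3 * LAM)) / RP

def kernel2(t, u):
    # K^2(t, u) in (24).
    _, w2, _, w4 = weights(t)
    return w2 / w4 * 2 * SIG * math.exp(GAM * (T - u) / (2 * LAM)) * math.sinh(RM * (T - u) / LAM) / RM

def simpson(f, a, b, n=2000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def kernel_mass(kernel, t):
    """Numerical value of the integral of u -> kernel(t, u) over [t, T]."""
    return simpson(lambda u: kernel(t, u), t, T)

print(kernel_mass(kernel1, 0.3), kernel_mass(kernel2, 0.3))
```

The normalization by \(w^1_t/w^3_t\) and \(w^2_t/w^4_t\) is exactly what makes each kernel a probability density in \(u\). A check of this kind is a quick way to catch sign or exponent slips when implementing (20) and (24).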

Proof of Theorem 3.5

In view of Lemma 3.4 we merely have to show that \(({\hat{X}}^1, {\hat{X}}^2, {\hat{\alpha }}^1, {\hat{\alpha }}^2)\) with dynamics described in Theorem 3.5, Eq. (21), is a solution of the FBSDE system in (15) with some suitable square integrable martingales \((M^1_t)_{0 \le t < T}\) and \((M^2_t)_{0 \le t < T}\). Uniqueness of the Nash equilibrium then follows together with Lemma 3.2.

Step 1: We start with computing the dynamics of the controls \({\hat{\alpha }}^1\) and \({\hat{\alpha }}^2\) in (21) and verify that they satisfy the dynamics of the FBSDE system in (15). To this end, it is convenient to rewrite \(w^1, w^2\) in (20), as well as \({\hat{\xi }}^1\) in (22) and \({\hat{\xi }}^2\) in (23) by introducing

$$\begin{aligned} {\tilde{w}}_t^1&\triangleq \; (c^+_t + c^-_t) w^1_t, \quad {\tilde{w}}_t^2 \triangleq (c^+_t + c^-_t) w^2_t \quad (0 \le t < T) \end{aligned}$$
(26)

and

$$\begin{aligned} {\tilde{\xi }}_t^1 \triangleq&\; (c^+_t + c^-_t) {\hat{\xi }}^1_t, \quad {\tilde{\xi }}_t^2 \triangleq (c^+_t + c^-_t) {\hat{\xi }}^2_t \quad (0 \le t < T). \end{aligned}$$
(27)

Moreover, setting

$$\begin{aligned} \begin{aligned} Y_t^+ \triangleq&\; \int _0^t (\xi ^1_s + \xi ^2_s) \frac{2\sigma }{\sqrt{\delta ^+}} e^{-\frac{\gamma }{6\lambda }(T-s)} \sinh (\sqrt{\delta ^+}(T-s)/(3\lambda )) ds, \\ M_t^+ \triangleq&\; {\mathbb {E}}\left[ \Xi ^1_T + \Xi ^2_T + Y_T^+ \,\vert \, {\mathscr {F}}_t\right] \end{aligned} \end{aligned}$$
(28)

and

$$\begin{aligned} \begin{aligned} Y_t^- \triangleq&\; \int _0^t (\xi ^1_s - \xi ^2_s) \frac{2\sigma }{\sqrt{\delta ^-}} e^{\frac{\gamma }{2\lambda }(T-s)} \sinh (\sqrt{\delta ^-}(T-s)/\lambda ) ds, \\ M_t^- \triangleq&\; {\mathbb {E}}\left[ \Xi ^1_T - \Xi ^2_T + Y_T^- \,\vert \, {\mathscr {F}}_t\right] \end{aligned} \end{aligned}$$
(29)

for all \(0 \le t \le T\), we obtain the representations

$$\begin{aligned} \begin{aligned} {\tilde{\xi }}^1_t =&\; {\tilde{w}}^1_t ( M^+_t - Y^+_t ) + {\tilde{w}}^2_t ( M^-_t - Y^-_t ), \\ {\tilde{\xi }}^2_t =&\; {\tilde{w}}^1_t ( M^+_t - Y^+_t ) - {\tilde{w}}^2_t ( M^-_t - Y^-_t ) \end{aligned} \qquad (0 \le t < T). \end{aligned}$$
(30)

In particular,

$$\begin{aligned} {\tilde{\xi }}^1_t + {\tilde{\xi }}^2_t = 2 {\tilde{w}}^1_t ( M^+_t - Y^+_t ), \quad {\tilde{\xi }}^1_t - {\tilde{\xi }}^2_t = 2 {\tilde{w}}^2_t ( M^-_t - Y^-_t ) \end{aligned}$$
(31)

on [0, T). Note that \(\Xi ^1_T, \Xi ^2_T, Y_T^+, Y_T^- \in L^2({\mathbb {P}})\) implies that \((M_t^+)_{0 \le t \le T}\) and \((M_t^-)_{0 \le t \le T}\) are square integrable martingales. Also, observe that the processes \(Y^+, M^+, Y^-, M^- \in L^2({\mathbb {P}}\otimes dt)\). We can now rewrite (21) as

$$\begin{aligned} \begin{aligned} {\hat{\alpha }}^1_t =&\; \frac{1}{2\lambda } ( {\tilde{\xi }}^1_t - c^+_t {\hat{X}}^2_t + c^-_t {\hat{X}}^2_t - c^+_t {\hat{X}}^1_t - c^-_t {\hat{X}}^1_t), \\ {\hat{\alpha }}^2_t =&\; \frac{1}{2\lambda } ( {\tilde{\xi }}^2_t - c^+_t {\hat{X}}^1_t + c^-_t {\hat{X}}^1_t - c^+_t {\hat{X}}^2_t - c^-_t {\hat{X}}^2_t) \end{aligned} \qquad (0 \le t < T). \end{aligned}$$
(32)

Next, for \({\tilde{w}}^1\), \({\tilde{w}}^2\) in (26) one can easily check that

$$\begin{aligned} ({\tilde{w}}^1_t)' = {\tilde{w}}^1_t \left( \frac{1}{\lambda } c^+_t - \frac{\gamma }{3\lambda } \right) , \quad ({\tilde{w}}^2_t)' = {\tilde{w}}^2_t \left( \frac{1}{\lambda } c^-_t + \frac{\gamma }{\lambda } \right) \quad (0 \le t < T). \end{aligned}$$
(33)

Hence, by applying integration by parts in (30) we obtain the dynamics

$$\begin{aligned} \begin{aligned} d{\tilde{\xi }}^1_t =&\; {\tilde{w}}^1_t ( M^+_t - Y^+_t ) \left( \frac{1}{\lambda } c^+_t - \frac{\gamma }{3\lambda } \right) dt -\frac{2}{3} \sigma (\xi ^1_t + \xi ^2_t) dt \\&+ {\tilde{w}}^2_t ( M^-_t - Y^-_t ) \left( \frac{1}{\lambda } c^-_t + \frac{\gamma }{\lambda } \right) dt - 2\sigma (\xi ^1_t - \xi ^2_t) dt \\&+ {\tilde{w}}^1_t dM^+_t + {\tilde{w}}^2_t dM^-_t \qquad (0 \le t < T) \end{aligned} \end{aligned}$$
(34)

and

$$\begin{aligned} \begin{aligned} d{\tilde{\xi }}^2_t =&\; {\tilde{w}}^1_t ( M^+_t - Y^+_t ) \left( \frac{1}{\lambda } c^+_t - \frac{\gamma }{3\lambda } \right) dt -\frac{2}{3} \sigma (\xi ^1_t + \xi ^2_t) dt \\&- {\tilde{w}}^2_t ( M^-_t - Y^-_t ) \left( \frac{1}{\lambda } c^-_t + \frac{\gamma }{\lambda } \right) dt + 2 \sigma (\xi ^1_t - \xi ^2_t) dt \\&+ {\tilde{w}}^1_t dM^+_t - {\tilde{w}}^2_t dM^-_t \qquad (0 \le t < T). \end{aligned} \end{aligned}$$
(35)

Now, having at hand (34) and (35), as well as the fact that the functions \(c^+, c^-\) in (19) satisfy the ordinary Riccati differential equations

$$\begin{aligned} (c^+_t)' = \frac{(c^+_t)^2}{\lambda } - \frac{\gamma }{3\lambda } c^+_t - \frac{2}{3} \sigma , \quad (c^-_t)' = \frac{(c^-_t)^2}{\lambda } + \frac{\gamma }{\lambda } c^-_t - 2\sigma \quad (0 \le t < T), \end{aligned}$$
(36)

an elementary but tedious computation reveals that the dynamics of \({\hat{\alpha }}^1\) and \({\hat{\alpha }}^2\) in (32) on [0, T) are given by

$$\begin{aligned} \begin{aligned} d{\hat{\alpha }}^1_t =&\; {\hat{X}}^1_t \left( \frac{4\sigma }{3\lambda } + \frac{\gamma }{6\lambda ^2} c^+_t - \frac{\gamma }{2\lambda ^2} c^-_t \right) dt - \frac{4\sigma }{3\lambda } \xi ^1_t dt + \frac{\gamma }{6\lambda ^2} {\tilde{\xi }}^1_t dt \\&+ {\hat{X}}^2_t \left( -\frac{2\sigma }{3\lambda } + \frac{\gamma }{6\lambda ^2} c^+_t + \frac{\gamma }{2\lambda ^2} c^-_t \right) dt + \frac{2\sigma }{3\lambda } \xi ^2_t dt - \frac{\gamma }{3\lambda ^2} {\tilde{\xi }}^2_t dt \\&+ \frac{{\tilde{w}}^1_t}{2\lambda } dM^+_t + \frac{{\tilde{w}}^2_t}{2\lambda } dM^-_t \end{aligned} \end{aligned}$$
(37)

and, similarly, by

$$\begin{aligned} \begin{aligned} d{\hat{\alpha }}^2_t =&\; {\hat{X}}^2_t \left( \frac{4\sigma }{3\lambda } + \frac{\gamma }{6\lambda ^2} c^+_t - \frac{\gamma }{2\lambda ^2} c^-_t \right) dt - \frac{4\sigma }{3\lambda } \xi ^2_t dt + \frac{\gamma }{6\lambda ^2} {\tilde{\xi }}^2_t dt \\&+ {\hat{X}}^1_t \left( -\frac{2\sigma }{3\lambda } + \frac{\gamma }{6\lambda ^2} c^+_t + \frac{\gamma }{2\lambda ^2} c^-_t \right) dt + \frac{2\sigma }{3\lambda } \xi ^1_t dt - \frac{\gamma }{3\lambda ^2} {\tilde{\xi }}^1_t dt \\&+ \frac{{\tilde{w}}^1_t}{2\lambda } dM^+_t - \frac{{\tilde{w}}^2_t}{2\lambda } dM^-_t, \end{aligned} \end{aligned}$$
(38)

where we also employed the identities in (31). As a consequence, using the representations in (32) we obtain

$$\begin{aligned}&d{\hat{\alpha }}^1_t + \frac{1}{2} d{\hat{\alpha }}^2_t \\&\quad = \frac{\sigma }{\lambda } ({\hat{X}}^1_t - \xi ^1_t) dt - \frac{\gamma }{4\lambda ^2} ({\tilde{\xi }}^2_t - c^+_t {\hat{X}}^1_t + c^-_t {\hat{X}}^1_t - c^+_t {\hat{X}}^2_t - c^-_t {\hat{X}}^2_t ) dt \\&\qquad + \frac{3}{4\lambda } {\tilde{w}}^1_t dM^+_t + \frac{1}{4\lambda } {\tilde{w}}^2_t dM^-_t \\&\quad = \frac{\sigma }{\lambda } ({\hat{X}}^1_t - \xi ^1_t) dt - \frac{\gamma }{2\lambda } {\hat{\alpha }}^2_t dt + \frac{3}{4\lambda } {\tilde{w}}^1_t dM^+_t + \frac{1}{4\lambda } {\tilde{w}}^2_t dM^-_t \qquad (0 \le t < T) \end{aligned}$$

and

$$\begin{aligned}&d{\hat{\alpha }}^2_t + \frac{1}{2} d{\hat{\alpha }}^1_t \\&\quad = \frac{\sigma }{\lambda } ({\hat{X}}^2_t - \xi ^2_t) dt - \frac{\gamma }{4\lambda ^2} ({\tilde{\xi }}^1_t - c^+_t {\hat{X}}^2_t + c^-_t {\hat{X}}^2_t - c^+_t {\hat{X}}^1_t - c^-_t {\hat{X}}^1_t ) dt \\&\qquad + \frac{3}{4\lambda } {\tilde{w}}^1_t dM^+_t - \frac{1}{4\lambda } {\tilde{w}}^2_t dM^-_t \\&\quad = \frac{\sigma }{\lambda } ({\hat{X}}^2_t - \xi ^2_t) dt - \frac{\gamma }{2\lambda } {\hat{\alpha }}^1_t dt + \frac{3}{4\lambda } {\tilde{w}}^1_t dM^+_t - \frac{1}{4\lambda } {\tilde{w}}^2_t dM^-_t \qquad (0 \le t < T). \end{aligned}$$

In other words, the pair \(({\hat{\alpha }}^1, {\hat{\alpha }}^2)\) described in (21) satisfies the dynamics of the FBSDE system in (15), where \(\int _0^\cdot {\tilde{w}}_t^{1} dM_t^{+}\), \(\int _0^\cdot {\tilde{w}}_t^{2} dM_t^{-}\) are square integrable martingales on [0, T) providing the ingredients for \(M^1\) and \(M^2\).
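The two analytic ingredients of Step 1, the weight identities in (33) and the Riccati equations in (36), can be checked by finite differences. The following sketch does this for one hypothetical parameter set (all numerical values are assumptions). It is a plausibility check of the formulas, not a substitute for the computation in the proof.

```python
import math

# Hypothetical parameter values, chosen only for illustration.
LAM, GAM, SIG, T = 1.0, 0.5, 2.0, 1.0
RP = math.sqrt(GAM**2 / 4 + 6 * LAM * SIG)   # sqrt(delta^+)
RM = math.sqrt(GAM**2 / 4 + 2 * LAM * SIG)   # sqrt(delta^-)

def c_plus(t):
    return RP / 3 / math.tanh(RP * (T - t) / (3 * LAM)) + GAM / 6

def c_minus(t):
    return RM / math.tanh(RM * (T - t) / LAM) - GAM / 2

def w1_tilde(t):
    # (c^+ + c^-) * w^1 from (26).
    return RP * math.exp(GAM * (T - t) / (6 * LAM)) / (3 * math.sinh(RP * (T - t) / (3 * LAM)))

def w2_tilde(t):
    # (c^+ + c^-) * w^2 from (26).
    return RM * math.exp(-GAM * (T - t) / (2 * LAM)) / math.sinh(RM * (T - t) / LAM)

def ddt(f, t, h=1e-5):
    """Central finite-difference approximation of f'(t)."""
    return (f(t + h) - f(t - h)) / (2 * h)

def residuals(t):
    """Left-hand side minus right-hand side of (36) and (33) at time t."""
    cp, cm = c_plus(t), c_minus(t)
    return (
        ddt(c_plus, t) - (cp**2 / LAM - GAM / (3 * LAM) * cp - 2 * SIG / 3),
        ddt(c_minus, t) - (cm**2 / LAM + GAM / LAM * cm - 2 * SIG),
        ddt(w1_tilde, t) - w1_tilde(t) * (cp / LAM - GAM / (3 * LAM)),
        ddt(w2_tilde, t) - w2_tilde(t) * (cm / LAM + GAM / LAM),
    )

for t in (0.1, 0.5, 0.8):
    print(t, residuals(t))
```

All four residuals should be of the order of the finite-difference error. Times very close to \(T\) are avoided because \(c^\pm \) and \({\tilde{w}}^{1}, {\tilde{w}}^{2}\) blow up there.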

Step 2: Next, we have to check the terminal conditions of the FBSDE system in (15), that is, that \(\lim _{t \uparrow T} {\hat{X}}^1_t = \Xi ^1_T\) and \(\lim _{t \uparrow T} {\hat{X}}^2_t = \Xi ^2_T\) hold \({\mathbb {P}}\)-a.s. for the solution \(({\hat{X}}^1, {\hat{X}}^2)\) of the coupled ODE in (21). We adapt the argument from Bank et al. [5], which employs a simple comparison principle for ordinary differential equations, to our current setting. Specifically, note that it suffices to show that

$$\begin{aligned} \lim _{t \uparrow T} ({\hat{X}}^1_t + {\hat{X}}^2_t) =&\; \Xi ^1_T + \Xi ^2_T\quad {\mathbb {P}}\text {-a.s. and} \end{aligned}$$
(39)
$$\begin{aligned} \lim _{t \uparrow T} ({\hat{X}}^1_t - {\hat{X}}^2_t) =&\; \Xi ^1_T - \Xi ^2_T \quad {\mathbb {P}}\text {-a.s.}, \end{aligned}$$
(40)

where, using the dynamics in (21) and the definition of \(w^5\) in (20), the processes \({\hat{X}}^1 + {\hat{X}}^2\) and \({\hat{X}}^1 - {\hat{X}}^2\) satisfy, respectively, the ODE

$$\begin{aligned} \begin{aligned} d({\hat{X}}^1_t + {\hat{X}}^2_t) =&\; \frac{c^+_t + c^-_t}{2\lambda } \left( {\hat{\xi }}^1_t + {\hat{\xi }}^2_t - w^5_t {\hat{X}}^1_t - w^5_t {\hat{X}}^2_t - {\hat{X}}^1_t - {\hat{X}}^2_t \right) dt \\ =&\; \frac{c^+_t}{\lambda } \left( \frac{{\hat{\xi }}^1_t + {\hat{\xi }}^2_t}{1+w^5_t} - ({\hat{X}}^1_t + {\hat{X}}^2_t) \right) dt \quad (0 \le t <T) \end{aligned} \end{aligned}$$
(41)

and

$$\begin{aligned} \begin{aligned} d({\hat{X}}^1_t - {\hat{X}}^2_t) =&\; \frac{c^+_t + c^-_t}{2\lambda } \left( {\hat{\xi }}^1_t - {\hat{\xi }}^2_t + w^5_t {\hat{X}}^1_t - w^5_t {\hat{X}}^2_t - {\hat{X}}^1_t + {\hat{X}}^2_t \right) dt \\ =&\; \frac{c^-_t}{\lambda } \left( \frac{{\hat{\xi }}^1_t - {\hat{\xi }}^2_t}{1-w^5_t} - ({\hat{X}}^1_t - {\hat{X}}^2_t) \right) dt \quad (0 \le t <T). \end{aligned} \end{aligned}$$
(42)

Note that \(w^5_t \in (-1,1)\) for all \(t \in [0,T]\) by virtue of Lemma 3.7 1.). First, analogously to (30) let us rewrite \({\hat{\xi }}^1\) and \({\hat{\xi }}^2\) in (22) and (23) as

$$\begin{aligned} \begin{aligned} {\hat{\xi }}^1_t =&\; w^1_t ( M^+_t - Y^+_t ) + w^2_t ( M^-_t - Y^-_t ), \\ {\hat{\xi }}^2_t =&\; w^1_t ( M^+_t - Y^+_t ) - w^2_t ( M^-_t - Y^-_t ) \end{aligned} \qquad (0 \le t \le T) \end{aligned}$$
(43)

with \(Y^+, M^+,Y^-,M^-\) as defined in (28) and (29). Hence, we can consider a càdlàg version of the processes \(({\hat{\xi }}^1_t)_{0 \le t \le T}\) and \(({\hat{\xi }}^2_t)_{0 \le t \le T}\) and obtain, together with Lemma 3.7, 2.), the \({\mathbb {P}}\)-a.s. limits

$$\begin{aligned} \begin{aligned} \lim _{t \uparrow T} {\hat{\xi }}^1_t =&\; \frac{1}{2} {\mathbb {E}}[\Xi ^1_T + \Xi ^2_T \, \vert \, {\mathscr {F}}_{T-}] + \frac{1}{2} {\mathbb {E}}[\Xi ^1_T - \Xi ^2_T \, \vert \, {\mathscr {F}}_{T-}] = \Xi ^1_T \quad \text {and} \\ \lim _{t \uparrow T} {\hat{\xi }}^2_t =&\; \frac{1}{2} {\mathbb {E}}[\Xi ^1_T + \Xi ^2_T \, \vert \, {\mathscr {F}}_{T-}] - \frac{1}{2} {\mathbb {E}}[\Xi ^1_T - \Xi ^2_T \, \vert \, {\mathscr {F}}_{T-}] = \Xi ^2_T \end{aligned} \end{aligned}$$

due to \({\mathscr {F}}_{T-}\)-measurability of \(\Xi ^1_T\) and \(\Xi ^2_T\) by virtue of our assumption in (9). In particular, since \(\lim _{t\uparrow T} w^5_t = 0\) because of Lemma 3.7, 2.), it also holds that

$$\begin{aligned} \lim _{t \uparrow T} \frac{{\hat{\xi }}^1_t + {\hat{\xi }}^2_t}{1+w^5_t} = \Xi ^1_T + \Xi ^2_T \quad \text {and} \quad \lim _{t \uparrow T} \frac{{\hat{\xi }}^1_t - {\hat{\xi }}^2_t}{1-w^5_t} = \Xi ^1_T - \Xi ^2_T \quad {\mathbb {P}}\text {-a.s.} \end{aligned}$$
(44)
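The limits behind (44) rest on \(w^1_t, w^2_t \rightarrow 1/2\) and \(w^5_t \rightarrow 0\) as \(t \uparrow T\). As a hedged illustration, the snippet below evaluates the weights from (20) at times approaching \(T\) for one hypothetical parameter set (the numerical values are assumptions).

```python
import math

# Hypothetical parameter values, chosen only for illustration.
LAM, GAM, SIG, T = 1.0, 0.5, 2.0, 1.0
RP = math.sqrt(GAM**2 / 4 + 6 * LAM * SIG)   # sqrt(delta^+)
RM = math.sqrt(GAM**2 / 4 + 2 * LAM * SIG)   # sqrt(delta^-)

def weights_near_terminal(tau):
    """Return (w1, w2, w5) from (20) at time t = T - tau, for tau > 0."""
    cp = RP / 3 / math.tanh(RP * tau / (3 * LAM)) + GAM / 6
    cm = RM / math.tanh(RM * tau / LAM) - GAM / 2
    s = cp + cm
    w1 = RP * math.exp(GAM * tau / (6 * LAM)) / (3 * s * math.sinh(RP * tau / (3 * LAM)))
    w2 = RM * math.exp(-GAM * tau / (2 * LAM)) / (s * math.sinh(RM * tau / LAM))
    w5 = (cp - cm) / s
    return w1, w2, w5

for tau in (1e-1, 1e-2, 1e-3, 1e-4):
    print(tau, weights_near_terminal(tau))
```

Heuristically, both \(c^+_t\) and \(c^-_t\) behave like \(\lambda /(T-t)\) near \(T\), so their difference stays bounded while their sum diverges. This is why \(w^5\) vanishes and the two leading weights split evenly.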

Let us now start with proving the limit in (39). As a consequence of (44), for every \(\varepsilon > 0\) there exists a (random) time \(\tau _\varepsilon \in [0,T)\) such that \({\mathbb {P}}\)-a.s.

$$\begin{aligned} \Xi ^1_T + \Xi ^2_T - \varepsilon \le \frac{{\hat{\xi }}^1_t + {\hat{\xi }}^2_t}{1+w^5_t} \le \Xi ^1_T + \Xi ^2_T + \varepsilon \quad \text {for all } t \in [\tau _\varepsilon ,T). \end{aligned}$$
(45)

Next, define \(Y^{+,\varepsilon }_t \triangleq \Xi ^1_T + \Xi ^2_T + \varepsilon - ({\hat{X}}^1_t + {\hat{X}}^2_t)\) for all \(t \in [0,T)\) so that

$$\begin{aligned} Y^{+,\varepsilon }_t \ge \frac{{\hat{\xi }}^1_t + {\hat{\xi }}^2_t}{1+w^5_t} - ({\hat{X}}^1_t + {\hat{X}}^2_t) \quad \text {for all } t \in [\tau _\varepsilon ,T). \end{aligned}$$
(46)

Together with the dynamics of \({\hat{X}}^1+{\hat{X}}^2\) in (41) this yields

$$\begin{aligned} \begin{aligned} d Y^{+,\varepsilon }_t =&\; -d({\hat{X}}^1_t + {\hat{X}}^2_t) = - \frac{c^+_t}{\lambda } \left( \frac{{\hat{\xi }}^1_t + {\hat{\xi }}^2_t}{1+w^5_t} - ({\hat{X}}^1_t + {\hat{X}}^2_t) \right) dt \\ \ge&\; -\frac{c_t^+}{\lambda } Y^{+,\varepsilon }_t dt \quad \text {on } [\tau _\varepsilon ,T). \end{aligned} \end{aligned}$$
(47)

Moreover, since for all \(\omega \in \Omega \) the linear ODE on \([\tau _\varepsilon (\omega ),T)\) given by

$$\begin{aligned} Z^{+,\varepsilon }_{\tau _\varepsilon (\omega )} = Y^{+,\varepsilon }_{\tau _\varepsilon (\omega )}(\omega ), \quad dZ^{+,\varepsilon }_t = -\frac{c_t^+}{\lambda } Z^{+,\varepsilon }_t dt \end{aligned}$$

admits the solution

$$\begin{aligned} Z^{+,\varepsilon }_t =&\; Y^{+,\varepsilon }_{\tau _\varepsilon (\omega )}(\omega ) \cdot e^{-\int _{\tau _\varepsilon }^{t} \frac{c^+_s}{\lambda } ds} \\ =&\; Y^{+,\varepsilon }_{\tau _\varepsilon }(\omega ) \cdot e^{-\frac{\gamma }{6\lambda } (t-\tau _\varepsilon )} \cdot \frac{\sinh (\sqrt{\delta ^+}(T-t)/(3\lambda ))}{\sinh (\sqrt{\delta ^+}(T-\tau _{\varepsilon })/(3\lambda ))} \quad (\tau _\varepsilon \le t < T) \end{aligned}$$

with \(\lim _{t \uparrow T} Z^{+,\varepsilon }_t = 0\), the comparison principle for ODEs in (47) implies that \(Y^{+,\varepsilon }_t \ge Z^{+,\varepsilon }_t\) for all \(t \in [\tau _\varepsilon , T)\) and thus

$$\begin{aligned} \liminf _{t \uparrow T} Y^{+,\varepsilon }_t \ge \lim _{t \uparrow T} Z^{+,\varepsilon }_t = 0 \quad {\mathbb {P}}\text {-a.s.}, \end{aligned}$$

or, equivalently,

$$\begin{aligned} \limsup _{t \uparrow T} ({\hat{X}}^1_t + {\hat{X}}^2_t) \le \Xi ^1_T + \Xi ^2_T + \varepsilon \quad {\mathbb {P}}\text {-a.s.} \end{aligned}$$
(48)

Next, in a similar way, set \({\tilde{Y}}^{+,\varepsilon }_t \triangleq \Xi ^1_T + \Xi ^2_T - \varepsilon - ({\hat{X}}^1_t + {\hat{X}}^2_t)\) for all \(t \in [0,T)\) and observe as above from (45) that \({\mathbb {P}}\)-a.s. on \([\tau _\varepsilon , T)\) it holds that \(d{\tilde{Y}}^{+,\varepsilon }_t \le -\frac{c_t^+}{\lambda } {\tilde{Y}}^{+,\varepsilon }_t dt\) and hence

$$\begin{aligned} \limsup _{t \uparrow T} {\tilde{Y}}^{+,\varepsilon }_t \le \lim _{t \uparrow T} Z^{+,\varepsilon }_t = 0 \quad {\mathbb {P}}\text {-a.s.} \end{aligned}$$

by the comparison principle. That is,

$$\begin{aligned} \liminf _{t \uparrow T} ({\hat{X}}^1_t + {\hat{X}}^2_t) \ge \Xi ^1_T + \Xi ^2_T - \varepsilon \quad {\mathbb {P}}\text {-a.s.}, \end{aligned}$$

which, together with (48) yields the limit in (39).
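To see the mechanism behind (39) in the simplest setting, consider the special case \(\xi ^1 \equiv \xi ^2 \equiv 0\) with a deterministic terminal sum \(\Xi ^1_T + \Xi ^2_T\), so that \(M^+\) is constant. The sketch below integrates the ODE (41) with an explicit Euler scheme up to a time slightly before \(T\), where \(c^+\) blows up. The parameter values, the step count and the cutoff `stop` are assumptions made for illustration only.

```python
import math

# Hypothetical parameter values, chosen only for illustration.
LAM, GAM, SIG, T = 1.0, 0.5, 2.0, 1.0
RP = math.sqrt(GAM**2 / 4 + 6 * LAM * SIG)   # sqrt(delta^+)
K = RP / (3 * LAM)

def c_plus(t):
    return RP / 3 / math.tanh(K * (T - t)) + GAM / 6

def effective_target(t, terminal_sum):
    """(xi1_hat + xi2_hat) / (1 + w5) in the special case xi^1 = xi^2 = 0
    with deterministic terminal sum, i.e. w1_tilde * M^+ / c^+ with M^+ constant."""
    tau = T - t
    return RP * math.exp(GAM * tau / (6 * LAM)) / (3 * c_plus(t) * math.sinh(K * tau)) * terminal_sum

def simulate_sum(x0, terminal_sum, n_steps=100_000, stop=1e-3):
    """Explicit Euler for (41) on [0, T - stop]; returns X^1 + X^2 at time T - stop."""
    dt = (T - stop) / n_steps
    x, t = x0, 0.0
    for _ in range(n_steps):
        x += c_plus(t) / LAM * (effective_target(t, terminal_sum) - x) * dt
        t += dt
    return x

print(simulate_sum(3.0, 1.0))    # initial sum 3, terminal sum 1
print(simulate_sum(-2.0, 4.0))   # initial sum -2, terminal sum 4
```

Since \(c^+_t/\lambda \) grows like \(1/(T-t)\), the ODE pulls the aggregate position ever more strongly towards its moving target as \(t \uparrow T\). The comparison argument above makes this pull rigorous for general random targets.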

In fact, it can now be argued along the same lines as above that also the limit in (40) holds true. Indeed, simply note that (44) implies similar to (45) that \({\mathbb {P}}\)-a.s. for every \(\varepsilon > 0\) there exists a (random) time \(\tau '_\varepsilon \in [0,T)\) such that

$$\begin{aligned} \Xi ^1_T - \Xi ^2_T - \varepsilon \le \frac{{\hat{\xi }}^1_t - {\hat{\xi }}^2_t}{1-w^5_t} \le \Xi ^1_T - \Xi ^2_T + \varepsilon \quad \text {for all } t \in [\tau '_\varepsilon ,T). \end{aligned}$$

Then, introduce the processes \(Y^{-,\varepsilon }_t \triangleq \Xi ^1_T - \Xi ^2_T + \varepsilon - ({\hat{X}}^1_t - {\hat{X}}^2_t)\) and \({\tilde{Y}}^{-,\varepsilon }_t \triangleq \Xi ^1_T - \Xi ^2_T - \varepsilon - ({\hat{X}}^1_t - {\hat{X}}^2_t)\) for all \(t \in [0,T)\). By using the dynamics of \({\hat{X}}^1 - {\hat{X}}^2\) in (42) we can once more apply the comparison principle on the interval \([\tau '_\varepsilon ,T)\) for the ODEs of \(Y^{-,\varepsilon }\) and \({\tilde{Y}}^{-,\varepsilon }\) together with the linear ODE

$$\begin{aligned} Z^{-,\varepsilon }_{\tau '_\varepsilon } = z \in {\mathbb {R}}, \quad dZ^{-,\varepsilon }_t = -\frac{c_t^-}{\lambda } Z^{-,\varepsilon }_t dt, \end{aligned}$$

which admits the solution

$$\begin{aligned} Z^{-,\varepsilon }_t = z e^{-\int _{\tau _\varepsilon '}^{t} \frac{c^-_s}{\lambda } ds} = z e^{\frac{\gamma }{2\lambda } (t-\tau _\varepsilon ')} \frac{\sinh (\sqrt{\delta ^-}(T-t)/\lambda )}{\sinh (\sqrt{\delta ^-}(T-\tau _{\varepsilon }')/\lambda )} \quad (\tau '_\varepsilon \le t < T) \end{aligned}$$

such that \(\lim _{t \uparrow T} Z^{-,\varepsilon }_t = 0\) to finally conclude that

$$\begin{aligned} \Xi ^1_T - \Xi ^2_T - \varepsilon \le \liminf _{t \uparrow T} ({\hat{X}}^1_t - {\hat{X}}^2_t) \le \limsup _{t \uparrow T} ({\hat{X}}^1_t - {\hat{X}}^2_t) \le \Xi ^1_T - \Xi ^2_T + \varepsilon \end{aligned}$$

as desired.

Step 3: It is left to argue that the controls \({\hat{\alpha }}^1, {\hat{\alpha }}^2\) described in (21) belong to the set \({\mathscr {A}}\) in (2), i.e., \({\hat{\alpha }}^1, {\hat{\alpha }}^2 \in L^2({\mathbb {P}}\otimes dt)\). To achieve this we will follow a similar strategy as in Bank et al. [5]. For simplicity, we will assume without loss of generality that \(x^1=x^2=0\). Because of the coupling of \({\hat{\alpha }}^1, {\hat{\alpha }}^2\) in (21) it is more convenient to prove that \({\hat{\alpha }}^+ \triangleq {\hat{\alpha }}^1 + {\hat{\alpha }}^2 \in L^2({\mathbb {P}}\otimes dt)\) and \({\hat{\alpha }}^- \triangleq {\hat{\alpha }}^1 - {\hat{\alpha }}^2 \in L^2({\mathbb {P}}\otimes dt)\), where we set \({\hat{X}}^+_\cdot \triangleq \int _0^\cdot {\hat{\alpha }}^+_s ds\) and \({\hat{X}}^-_\cdot \triangleq \int _0^\cdot {\hat{\alpha }}^-_s ds\). Recall from (41) and (42) above that we then have

$$\begin{aligned} {\hat{\alpha }}^+_t = \frac{c^+_t}{\lambda } \left( \frac{{\hat{\xi }}^1_t + {\hat{\xi }}^2_t}{1+w^5_t} - {\hat{X}}^+_t\right) , \quad {\hat{\alpha }}^-_t = \frac{c^-_t}{\lambda } \left( \frac{{\hat{\xi }}^1_t - {\hat{\xi }}^2_t}{1-w^5_t} - {\hat{X}}^-_t \right) \end{aligned}$$
(49)

on [0, T), where

$$\begin{aligned} {\hat{\xi }}^1_t + {\hat{\xi }}^2_t = 2 w^1_t (M^+_t - Y^+_t), \quad {\hat{\xi }}^1_t - {\hat{\xi }}^2_t = 2 w^2_t (M^-_t - Y^-_t) \quad (0 \le t \le T) \end{aligned}$$
(50)

because of (43) (recall that \(M^+,Y^+\) are given in (28) and \(M^-,Y^-\) are given in (29)).

We start with showing that \({\hat{\alpha }}^+ \in L^2({\mathbb {P}}\otimes dt)\). For this purpose, observe that it suffices to examine the following two cases \(\xi ^1\equiv \xi ^2\equiv 0\) and \(\Xi ^1_T=\Xi ^2_T=0\) separately. Indeed, let us denote \({\hat{\alpha }}^{+,\xi ^1,\xi ^2,\Xi ^1,\Xi ^2} \triangleq {\hat{\alpha }}^{+}\) to emphasize also the dependence on \(\xi ^1,\xi ^2,\Xi ^1,\Xi ^2\). Then, due to the linear dependence of \({\hat{\alpha }}^+\) in (49) on \(\xi ^1,\xi ^2,\Xi ^1,\Xi ^2\), it holds that

$$\begin{aligned} {\hat{\alpha }}^{+,\xi ^1,\xi ^2,\Xi ^1,\Xi ^2} = {\hat{\alpha }}^{+,0,0,\Xi ^1,\Xi ^2} + {\hat{\alpha }}^{+,\xi ^1,\xi ^2,0,0}. \end{aligned}$$
(51)

Hence, it suffices to show that \({\hat{\alpha }}^{+,0,0,\Xi ^1,\Xi ^2} \in L^2({\mathbb {P}}\otimes dt)\) and \({\hat{\alpha }}^{+,\xi ^1,\xi ^2,0,0} \in L^2({\mathbb {P}}\otimes dt)\).

Case 1.1: \(\xi ^1\equiv \xi ^2\equiv 0\):

From (50) it follows that \({\hat{\xi }}^1_t + {\hat{\xi }}^2_t = 2 w^1_t M^+_t\). Moreover, the explicit solutions in (66) and (67) yield

$$\begin{aligned} \begin{aligned} {\hat{X}}^+_t =&\; e^{-\int _0^t \frac{c^+_u}{\lambda } du} \int _0^t \frac{c_s^+ + c^-_s}{\lambda } w^1_s M^+_s e^{\int _0^s \frac{c^+_u}{\lambda } du} ds \\ =&\; e^{\frac{\gamma }{6\lambda } (T-t)} \sinh (\sqrt{\delta ^+}(T-t)/(3\lambda )) \\&\int _0^t M^+_s \frac{\sqrt{\delta ^+}}{3\lambda \sinh (\sqrt{\delta ^+}(T-s)/(3\lambda ))^2} ds \qquad (0 \le t < T). \end{aligned} \end{aligned}$$
(52)

Introducing the deterministic and differentiable function \(f^+_s \triangleq 1/\sinh (\sqrt{\delta ^+}(T-s)/(3\lambda ))\) on [0, T) allows us to rewrite the integral in (52) by applying integration by parts as

$$\begin{aligned}&\int _0^t M^+_s \frac{\sqrt{\delta ^+}}{3\lambda \sinh (\sqrt{\delta ^+}(T-s)/(3\lambda ))^2} ds = \int _0^t {\tilde{M}}^+_s df^+_s \nonumber \\&\quad = {\tilde{M}}^+_t f^+_t - {\tilde{M}}^+_0 f^+_0 - \int _0^t f_s^+ d{\tilde{M}}^+_s \qquad (0 \le t < T), \end{aligned}$$
(53)

where \({\tilde{M}}^+_t \triangleq M_t^+/\cosh (\sqrt{\delta ^+}(T-t)/(3\lambda ))\) for all \(t \in [0,T)\). Moreover, we have that

$$\begin{aligned} \frac{{\hat{\xi }}^1_t + {\hat{\xi }}^2_t}{1+w^5_t} = \frac{\sqrt{\delta ^+} e^{\frac{\gamma }{6\lambda }(T-t)}}{3 c^+_t \sinh (\sqrt{\delta ^+}(T-t)/(3\lambda ))} M^+_t \quad (0 \le t \le T). \end{aligned}$$
(54)

Now, plugging back (54) and (52) together with (53) into \({\hat{\alpha }}^+\) in (49) yields, after some elementary computations,

$$\begin{aligned} \begin{aligned} {\hat{\alpha }}^+_t =&\; -\frac{\gamma }{6\lambda } e^{\frac{\gamma }{6\lambda }(T-t)} {\tilde{M}}^+_t + \frac{c^+_t}{\lambda } e^{\frac{\gamma }{6\lambda } (T-t)} \sinh (\sqrt{\delta ^+}(T-t)/(3\lambda )) {\tilde{M}}^+_0 f^+_0 \\&\; + \frac{c^+_t}{\lambda } e^{\frac{\gamma }{6\lambda } (T-t)} \sinh (\sqrt{\delta ^+}(T-t)/(3\lambda )) \int _0^t f_s^+ d{\tilde{M}}^+_s \quad (0 \le t < T). \end{aligned} \end{aligned}$$
(55)

In fact, since \(c^+_t \sinh (\sqrt{\delta ^+}(T-t)/(3\lambda ))\) is bounded on [0, T] (recall from (19) that \(c^+_t = \frac{1}{3} \sqrt{\delta ^+} \coth (\sqrt{\delta ^+}(T-t)/(3\lambda )) + \frac{1}{6}\gamma \)) and \({\tilde{M}}^+ \in L^2({\mathbb {P}}\otimes dt)\) (recall that \(M^+\) in (28) belongs to \(L^2({\mathbb {P}}\otimes dt)\)) the first two terms in (55) are in \(L^2({\mathbb {P}}\otimes dt)\). For the stochastic integral, we obtain

$$\begin{aligned} \int _0^t f_s^+ d{\tilde{M}}^+_s =&\; \int _0^t \frac{\sqrt{\delta ^+} M_s^+}{3\lambda \cosh (\sqrt{\delta ^+}(T-s)/(3\lambda ))^2} ds \\&\; + \int _0^t \frac{f_s^+}{\cosh (\sqrt{\delta ^+}(T-s)/(3\lambda ))} dM^+_s, \end{aligned}$$

where the first integral on the right is again an element of \(L^2({\mathbb {P}}\otimes dt)\). The second integral satisfies

$$\begin{aligned} \begin{aligned}&{\mathbb {E}}\left[ \int _0^T \left( \int _0^t \frac{f_s^+}{\cosh (\sqrt{\delta ^+}(T-s)/(3\lambda ))} dM^+_s \right) ^2 dt \right] \\&\quad = {\mathbb {E}}\left[ \int _0^T \int _0^t \left( \frac{f_s^+}{\cosh (\sqrt{\delta ^+}(T-s)/(3\lambda ))} \right) ^2 d\langle M^+ \rangle _s dt \right] \\&\quad = {\mathbb {E}}\left[ \int _0^T (T-s) \frac{(f_s^+)^2}{\cosh (\sqrt{\delta ^+}(T-s)/(3\lambda ))^2} d\langle M^+ \rangle _s \right] \\&\quad \le \frac{9\lambda ^2}{\delta ^+} {\mathbb {E}}\left[ \int _0^T \frac{1}{T-s} d\langle M^+ \rangle _s \right] < \infty \end{aligned} \end{aligned}$$
(56)

by our assumption in (9), where we also used Fubini’s theorem twice and the fact that \(\sinh (\tau ) \ge \tau \) and \(\cosh (\tau ) \ge 1\) for all \(\tau \ge 0\). That is, we obtain that \({\hat{\alpha }}^+ \in L^2({\mathbb {P}}\otimes dt)\) in this case.

Case 1.2: \(\Xi ^1_T=\Xi ^2_T=0\):

In this case, we obtain from the expressions in (22) and (23) that

$$\begin{aligned} {\hat{\xi }}^1_t + {\hat{\xi }}^2_t = 2 w^3_t {\mathbb {E}}\left[ \int _t^T (\xi ^1_u + \xi ^2_u) K^1(t,u) du \, \Big \vert \, {\mathscr {F}}_t \right] \quad (0 \le t \le T) \end{aligned}$$

and thus, using again the explicit representation for \({\hat{X}}^+={\hat{X}}^1 + {\hat{X}}^2\) from (66) and (67), \({\hat{\alpha }}^+\) in (49) becomes

$$\begin{aligned} {\hat{\alpha }}^+_t&= \frac{c^+_t}{\lambda } \left( \frac{{\hat{\xi }}^1_t + {\hat{\xi }}^2_t}{1+w^5_t} - {\hat{X}}^+_t\right) \nonumber \\&= \frac{2c^+_tw^3_t}{\lambda (1+w^5_t)} {\mathbb {E}}\left[ \int _t^T (\xi ^1_u + \xi ^2_u) K^1(t,u) du \, \bigg \vert \, {\mathscr {F}}_t \right] \nonumber \\&\quad - \frac{c^+_t}{\lambda } e^{-\int _0^t \frac{c^+_u}{\lambda } du} \nonumber \\&\qquad \int _0^t \frac{(c_s^+ + c^-_s)w^3_s}{\lambda } e^{\int _0^s \frac{c^+_u}{\lambda } du} {\mathbb {E}}\left[ \int _s^T (\xi ^1_u + \xi ^2_u) K^1(s,u) du \, \bigg \vert \, {\mathscr {F}}_s \right] ds. \end{aligned}$$
(57)

In fact, it holds that all the ratios in (57) involving \(c^+\), \(c^-\) are bounded on [0, T]. Moreover, by Lemma 3.8 we have

$$\begin{aligned} {\mathbb {E}}\left[ \int _t^T (\xi ^1_u + \xi ^2_u) K^1(t,u) du \, \bigg \vert \, {\mathscr {F}}_t \right] \in L^2({\mathbb {P}}\otimes dt), \end{aligned}$$

as well as

$$\begin{aligned}&{\mathbb {E}}\left[ \int _0^T \left( \int _0^t {\mathbb {E}}\left[ \int _s^T (\xi ^1_u + \xi ^2_u) K^1(s,u) du \, \bigg \vert \, {\mathscr {F}}_s \right] ds \right) ^2 dt \right] \\&\quad \le \frac{T^2}{2} {\mathbb {E}}\left[ \int _0^T \left( {\mathbb {E}}\left[ \int _s^T (\xi ^1_u + \xi ^2_u) K^1(s,u) du \, \bigg \vert \, {\mathscr {F}}_s \right] \right) ^2 ds \right] < \infty \end{aligned}$$

by using Jensen’s inequality. As a consequence, we can also conclude in this case that \({\hat{\alpha }}^+\) belongs to \(L^2({\mathbb {P}}\otimes dt)\).
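The last estimate rests on the elementary bound \((\int _0^t g_s ds)^2 \le t \int _0^t g_s^2 ds\), integrated over \(t \in [0,T]\). The following Python sketch checks the resulting inequality numerically on a grid. The deterministic integrand `g` is a hypothetical stand-in for the conditional expectation, and the function name and grid size are assumptions of this illustration, not part of the paper.

```python
import math

def check_time_integral_bound(g, T, n=2000):
    """Compare int_0^T (int_0^t g ds)^2 dt with (T^2/2) int_0^T g^2 ds on a grid."""
    dt = T / n
    inner = 0.0   # running value of int_0^t g_s ds
    lhs = 0.0     # accumulates int_0^T (inner)^2 dt
    g_sq = 0.0    # accumulates int_0^T g_s^2 ds
    for i in range(n):
        s = (i + 0.5) * dt        # midpoint of the i-th grid cell
        inner += g(s) * dt
        lhs += inner ** 2 * dt
        g_sq += g(s) ** 2 * dt
    rhs = 0.5 * T ** 2 * g_sq
    return lhs, rhs

# Hypothetical test integrand; any square-integrable g could be used instead.
LHS, RHS = check_time_integral_bound(lambda s: math.sin(3.0 * s) + 0.5, T=2.0)
```

For this choice of `g` the left-hand side stays well below the right-hand side, as the bound requires.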

Let us now argue that \({\hat{\alpha }}^-\) in (49) also belongs to \(L^2({\mathbb {P}}\otimes dt)\). The argument is very similar to the one presented above, so we only sketch the main steps. Again, it is enough to investigate the two cases \(\xi ^1\equiv \xi ^2\equiv 0\) and \(\Xi ^1_T=\Xi ^2_T=0\) separately, because \({\hat{\alpha }}^-\) in (49) can be decomposed in the same way as \({\hat{\alpha }}^+\) in (51).

Case 2.1: \(\xi ^1\equiv \xi ^2\equiv 0\):

Similar to (52) above, using \({\hat{\xi }}^1_t - {\hat{\xi }}^2_t = 2 w^2_t M^-_t\) from (50) we obtain via (66) and (67) the representation

$$\begin{aligned} \begin{aligned} {\hat{X}}^-_t =&\; e^{-\int _0^t \frac{c^-_u}{\lambda } du} \int _0^t \frac{c_s^+ + c^-_s}{\lambda } w^2_s M^-_s e^{\int _0^s \frac{c^-_u}{\lambda } du} ds \\ =&\; e^{-\frac{\gamma }{2\lambda } (T-t)} \sinh (\sqrt{\delta ^-}(T-t)/\lambda ) \\&\int _0^t M^-_s \frac{\sqrt{\delta ^-}}{\lambda \sinh (\sqrt{\delta ^-}(T-s)/\lambda )^2} ds \qquad (0 \le t < T). \end{aligned} \end{aligned}$$
(58)

Setting \(f^-_s \triangleq 1/\sinh (\sqrt{\delta ^-}(T-s)/\lambda )\) on [0, T) we can rewrite the integral in (58) as

$$\begin{aligned} \int _0^t {\tilde{M}}^-_s df^-_s = {\tilde{M}}^-_t f^-_t - {\tilde{M}}^-_0 f^-_0 - \int _0^t f_s^- d{\tilde{M}}^-_s \qquad (0 \le t < T) \end{aligned}$$
(59)

with \({\tilde{M}}^-_t \triangleq M_t^-/\cosh (\sqrt{\delta ^-}(T-t)/\lambda )\) for all \(t \in [0,T)\). In addition,

$$\begin{aligned} \frac{{\hat{\xi }}^1_t - {\hat{\xi }}^2_t}{1-w^5_t} = \frac{\sqrt{\delta ^-} e^{-\frac{\gamma }{2\lambda }(T-t)}}{c^-_t \sinh (\sqrt{\delta ^-}(T-t)/\lambda )} M^-_t \quad (0 \le t \le T). \end{aligned}$$
(60)

Inserting (60) and (58) together with (59) into \({\hat{\alpha }}^-\) in (49) then yields

$$\begin{aligned} \begin{aligned} {\hat{\alpha }}^-_t =&\; \frac{\gamma }{2\lambda } e^{-\frac{\gamma }{2\lambda }(T-t)} {\tilde{M}}^-_t + \frac{c^-_t}{\lambda } e^{-\frac{\gamma }{2\lambda } (T-t)} \sinh (\sqrt{\delta ^-}(T-t)/\lambda ) {\tilde{M}}^-_0 f^-_0 \\&\; + \frac{c^-_t}{\lambda } e^{-\frac{\gamma }{2\lambda } (T-t)} \sinh (\sqrt{\delta ^-}(T-t)/\lambda ) \int _0^t f_s^- d{\tilde{M}}^-_s \quad (0 \le t < T), \end{aligned} \end{aligned}$$
(61)

where

$$\begin{aligned} \int _0^t f_s^- d{\tilde{M}}^-_s =&\; \int _0^t \frac{\sqrt{\delta ^-} M_s^-}{\lambda \cosh (\sqrt{\delta ^-}(T-s)/\lambda )^2} ds \nonumber \\&\; + \int _0^t \frac{{\tilde{f}}_s^-}{\cosh (\sqrt{\delta ^-}(T-s)/\lambda )} dM^-_s. \end{aligned}$$
(62)

Observe as in (55) above that \(c^-_t \sinh (\sqrt{\delta ^-}(T-t)/\lambda )\) is bounded on [0, T] (recall from (19) that \(c^-_t = \sqrt{\delta ^-} \coth (\sqrt{\delta ^-}(T-t)/\lambda )-\frac{1}{2}\gamma \)) and that \({\tilde{M}}^- \in L^2({\mathbb {P}}\otimes dt)\). Therefore, we only need to justify that the stochastic integral in (62) belongs to \(L^2({\mathbb {P}}\otimes dt)\). Indeed, by the same computations as in (56), we obtain via our assumption in (9) that

$$\begin{aligned} \begin{aligned}&{\mathbb {E}}\left[ \int _0^T \left( \int _0^t \frac{{\tilde{f}}_s^-}{\cosh (\sqrt{\delta ^-}(T-s)/\lambda )} dM^-_s \right) ^2 dt \right] \\&\quad \le \frac{\lambda ^2}{\delta ^-} {\mathbb {E}}\left[ \int _0^T \frac{1}{T-s} d\langle M^- \rangle _s \right] < \infty . \end{aligned} \end{aligned}$$
(63)

Hence, we can conclude that \({\hat{\alpha }}^- \in L^2({\mathbb {P}}\otimes dt)\) in this case.

Case 2.2: \(\Xi ^1_T=\Xi ^2_T=0\):

Here, similar to (57) above, (22) and (23) imply that

$$\begin{aligned} {\hat{\xi }}^1_t - {\hat{\xi }}^2_t = 2 w^4_t {\mathbb {E}}\left[ \int _t^T (\xi ^1_u - \xi ^2_u) K^2(t,u) du \, \Big \vert \, {\mathscr {F}}_t \right] \quad (0 \le t \le T) \end{aligned}$$

and hence, together with \({\hat{X}}^-={\hat{X}}^1 - {\hat{X}}^2\) from (66) and (67), \({\hat{\alpha }}^-\) in (49) can be written as

$$\begin{aligned} {\hat{\alpha }}^-_t&= \frac{c^-_t}{\lambda } \left( \frac{{\hat{\xi }}^1_t - {\hat{\xi }}^2_t}{1-w^5_t} - {\hat{X}}^-_t\right) \nonumber \\&= \frac{2c^-_tw^4_t}{\lambda (1-w^5_t)} {\mathbb {E}}\left[ \int _t^T (\xi ^1_u - \xi ^2_u) K^2(t,u) du \, \bigg \vert \, {\mathscr {F}}_t \right] \nonumber \\&\quad - \frac{c^-_t}{\lambda } e^{-\int _0^t \frac{c^-_u}{\lambda } du} \nonumber \\&\qquad \int _0^t \frac{(c_s^+ + c^-_s)w^4_s}{\lambda } e^{\int _0^s \frac{c^-_u}{\lambda } du} {\mathbb {E}}\left[ \int _s^T (\xi ^1_u - \xi ^2_u) K^2(s,u) du \, \bigg \vert \, {\mathscr {F}}_s \right] ds. \end{aligned}$$
(64)

As in (57), all the ratios in (64) involving the functions \(c^+\), \(c^-\) are bounded on [0, T], and we can conclude along the same lines as in case 1.2 by virtue of Lemma 3.8 that \({\hat{\alpha }}^- \in L^2({\mathbb {P}}\otimes dt)\) in this case as well.

Step 4: Finally, we have to argue that the functions \(K^1(t,u)\) and \(K^2(t,u)\) defined in (24) are nonnegative kernels which integrate to one over [t, T) as functions in \(u \in [t,T)\). To this end, observe that \(c^+_t > 0\) and \(c^-_t > 0\) for all \(t \in [0,T]\), which implies that \(w^1_\cdot , w^2_\cdot > 0\) on [0, T). Moreover, a direct computation yields that for all \(t \in [0,T)\) we have

$$\begin{aligned} \begin{aligned} 0<&\; \int _t^T \frac{2\sigma }{\sqrt{\delta ^+}} e^{-\frac{\gamma }{6\lambda }(T-u)} \sinh (\sqrt{\delta ^+}(T-u)/(3\lambda )) du = \frac{w^3_t}{w^1_t}, \\ 0 <&\; \int _t^T \frac{2\sigma }{\sqrt{\delta ^-}} e^{\frac{\gamma }{2\lambda }(T-u)} \sinh (\sqrt{\delta ^-}(T-u)/\lambda ) du = \frac{w^4_t}{w^2_t}. \end{aligned} \end{aligned}$$
(65)

Thus, we also obtain that \(w^3_\cdot , w^4_\cdot > 0\) on [0, T). But this implies for the functions defined in (24) that \(K^1(t,u) > 0\) and \(K^2(t,u) > 0\) for all \(0 \le t \le u < T\), as well as that \(\int _t^T K^1(t,u) du = \int _t^T K^2(t,u) du = 1\) for all \(t \in [0,T)\). \(\square \)
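The first integral in (65) can be cross-checked numerically. The Python sketch below compares a midpoint-rule approximation with an elementary antiderivative of \(e^{-gx}\sinh (kx)\), where \(g = \gamma /(6\lambda )\) and \(k = \sqrt{\delta ^+}/(3\lambda )\). The antiderivative is worked out for this illustration and is not quoted from the paper. The value `delta_plus = 3.01` is a placeholder assumption, since \(\delta ^+\) is defined outside this section, and all function names are hypothetical.

```python
import math

def integrand_65(u, T, sigma, lam, gamma, delta_plus):
    """Integrand of the first line of (65), evaluated at time u."""
    tau = T - u
    k = math.sqrt(delta_plus) / (3.0 * lam)
    g = gamma / (6.0 * lam)
    return 2.0 * sigma / math.sqrt(delta_plus) * math.exp(-g * tau) * math.sinh(k * tau)

def ratio_numeric(t, T, sigma, lam, gamma, delta_plus, n=4000):
    """Midpoint-rule approximation of the integral over [t, T]."""
    du = (T - t) / n
    total = 0.0
    for i in range(n):
        total += integrand_65(t + (i + 0.5) * du, T, sigma, lam, gamma, delta_plus)
    return total * du

def ratio_closed(t, T, sigma, lam, gamma, delta_plus):
    """Same integral via an antiderivative of exp(-g x) sinh(k x); needs k != g."""
    tau = T - t
    k = math.sqrt(delta_plus) / (3.0 * lam)
    g = gamma / (6.0 * lam)
    primitive = (math.exp(-g * tau) * (g * math.sinh(k * tau) + k * math.cosh(k * tau)) - k)
    return 2.0 * sigma / math.sqrt(delta_plus) * primitive / (k * k - g * g)

# Placeholder parameters (delta_plus is an assumption, see the text above).
PARAMS = dict(T=5.0, sigma=1.0, lam=1.0, gamma=0.2, delta_plus=3.01)
NUMERIC = ratio_numeric(1.0, **PARAMS)
CLOSED = ratio_closed(1.0, **PARAMS)
```

Both values are strictly positive and agree up to the quadrature error, in line with the positivity used in Step 4.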

The equilibrium share holdings prescribed by the linear coupled ODE in (21) can also be computed explicitly.

Corollary 3.6

The solution \(({\hat{X}}^1, {\hat{X}}^2)\) to the linear ODE in (21) is given by

$$\begin{aligned} {\hat{X}}^{1}_t =&\; \frac{1}{2} (x^1 + x^2) e^{-\int _0^t \frac{c^+_s}{\lambda } ds} + \frac{1}{4\lambda } \int _0^t (c^+_s + c^-_s) ({\hat{\xi }}^1_s+{\hat{\xi }}^2_s) e^{-\int _s^t \frac{c^+_u}{\lambda } du} ds \nonumber \\&\; + \frac{1}{2} (x^1 - x^2) e^{-\int _0^t \frac{c^-_s}{\lambda } ds} + \frac{1}{4\lambda } \int _0^t (c^+_s + c^-_s) ({\hat{\xi }}^1_s-{\hat{\xi }}^2_s) e^{-\int _s^t \frac{c^-_u}{\lambda } du} ds \end{aligned}$$
(66)

and, similarly, by

$$\begin{aligned} {\hat{X}}^{2}_t= & {} \frac{1}{2} (x^2 + x^1) e^{-\int _0^t \frac{c^+_s}{\lambda } ds} + \frac{1}{4\lambda } \int _0^t (c^+_s + c^-_s) ({\hat{\xi }}^2_s+{\hat{\xi }}^1_s) e^{-\int _s^t \frac{c^+_u}{\lambda } du} ds \nonumber \\&+ \frac{1}{2} (x^2 - x^1) e^{-\int _0^t \frac{c^-_s}{\lambda } ds} + \frac{1}{4\lambda } \int _0^t (c^+_s + c^-_s) ({\hat{\xi }}^2_s-{\hat{\xi }}^1_s) e^{-\int _s^t \frac{c^-_u}{\lambda } du} ds \end{aligned}$$
(67)

for all \(t \in [0,T]\).

Proof

Recall that from the dynamics of \({\hat{X}}^1\) and \({\hat{X}}^2\) in (21) we obtain that the processes \({\hat{X}}^1 + {\hat{X}}^2\) and \({\hat{X}}^1 - {\hat{X}}^2\) satisfy, respectively, the linear ODEs in (41) and (42) with initial values \(x^1+x^2\) and \(x^1-x^2\). Applying the variation of constants formula then yields

$$\begin{aligned} {\hat{X}}^1_t \pm {\hat{X}}^2_t = (x^1 \pm x^2) e^{-\int _0^t \frac{c^{\pm }_s}{\lambda } ds} + \int _0^t \frac{c^+_s+c^-_s}{2\lambda } ({\hat{\xi }}^1_s \pm {\hat{\xi }}^2_s) e^{-\int _s^t \frac{c_u^{\pm }}{\lambda } du} ds \end{aligned}$$

and hence the assertion in (66) and (67) via the obvious relation

$$\begin{aligned} {\hat{X}}^{1,2}_t = \frac{1}{2} ({\hat{X}}^1_t + {\hat{X}}^2_t) \pm \frac{1}{2} ({\hat{X}}^1_t - {\hat{X}}^2_t). \end{aligned}$$

\(\square \)
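The variation of constants formula used in this proof can be illustrated numerically. The Python sketch below solves a scalar linear ODE \(X' = -a(t) X + b(t)\) once by an explicit Euler scheme and once by the integral formula, with \(a = c^+/\lambda \) taken from the expression for \(c^+\) recalled from (19). The forcing term `b`, the value of `delta_plus`, and all function names are hypothetical choices made only for this illustration.

```python
import math

def c_plus(t, T, lam, gamma, delta_plus):
    """c^+_t as recalled from (19); finite only for t < T."""
    root = math.sqrt(delta_plus)
    return root / (3.0 * math.tanh(root * (T - t) / (3.0 * lam))) + gamma / 6.0

def solve_by_euler(x0, a, b, t_end, n=20000):
    """Explicit Euler scheme for X' = -a(t) X + b(t), X_0 = x0."""
    dt = t_end / n
    x = x0
    for i in range(n):
        t = i * dt
        x += (-a(t) * x + b(t)) * dt
    return x

def solve_by_formula(x0, a, b, t_end, n=2000):
    """Variation of constants: x0 e^{-A(t)} + int_0^t b_s e^{-(A(t) - A(s))} ds."""
    dt = t_end / n
    A_grid = [0.0]   # A_grid[i] approximates int_0^{i dt} a(u) du
    for i in range(n):
        A_grid.append(A_grid[-1] + a((i + 0.5) * dt) * dt)
    A_end = A_grid[-1]
    integral = 0.0
    for i in range(n):
        s = (i + 0.5) * dt
        A_s = 0.5 * (A_grid[i] + A_grid[i + 1])   # A at the cell midpoint
        integral += b(s) * math.exp(-(A_end - A_s)) * dt
    return x0 * math.exp(-A_end) + integral

# Placeholder parameters; delta_plus and the forcing b are assumptions.
T, LAM, GAMMA, DELTA_PLUS = 2.0, 1.0, 0.2, 3.01
a = lambda t: c_plus(t, T, LAM, GAMMA, DELTA_PLUS) / LAM
b = lambda t: 0.3 * math.cos(t)
X_EULER = solve_by_euler(1.0, a, b, t_end=1.5)
X_FORMULA = solve_by_formula(1.0, a, b, t_end=1.5)
```

The evaluation stops at `t_end = 1.5 < T` because \(c^+_t\) blows up as \(t \uparrow T\).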

Lastly, the following simple properties of the weight functions introduced in (20) help illuminate the structure of the Nash equilibrium presented in Theorem 3.5.

Lemma 3.7

The weight functions \(w^1, w^2, w^3, w^4, w^5\) defined in (20) satisfy

  1.

    \(w_\cdot ^5 \in (-1,1)\), \(w_{\cdot }^{1,2,3,4} > 0\) on [0, T) and \(w^1_\cdot + w^2_\cdot + w^3_\cdot + w^4_\cdot =1\) on [0, T],

  2.

    \(\lim _{t \uparrow T} w_t^{1,2} = 1/2\) and \(\lim _{t \uparrow T} w_t^{3,4,5} = 0\).

Proof

1. First, recall from the Proof of Theorem 3.5, Step 4, above that \(w^1_\cdot , w^2_\cdot ,w^3_\cdot ,w^4_\cdot > 0\) on [0, T). Moreover, from the definition in (20) we immediately obtain that \(w^1_t + w^2_t + w^3_t + w^4_t =1\) for all \(t \in [0,T]\). Together with the fact that \(c^+_\cdot > 0\) and \(c^-_\cdot > 0\) on [0, T], we also observe that \(w^5_t \in (-1,1)\) for all \(t \in [0,T]\).

2. Concerning the limiting behaviour of the weight functions, it suffices to note that

$$\begin{aligned} \lim _{t \uparrow T} \frac{\sinh (\sqrt{\delta ^+}(T-t)/(3\lambda ))}{\sinh (\sqrt{\delta ^-}(T-t)/\lambda )} = \frac{\sqrt{\delta ^+}}{3\sqrt{\delta ^-}}. \end{aligned}$$

Then, rewriting \(w^1\), \(w^2\) in (20) by plugging in \(c^+\), \(c^-\) from (19) to obtain the representations

$$\begin{aligned} w^1_t = \frac{\sqrt{\delta ^+} e^{\frac{\gamma }{6\lambda }(T-t)}}{d^1_t}, \qquad w^2_t = \frac{3 \sqrt{\delta ^-} e^{-\frac{\gamma }{2\lambda }(T-t)}}{d^2_t} \end{aligned}$$

with

$$\begin{aligned} d^1_t \triangleq&\; \sqrt{\delta ^+}\cosh (\sqrt{\delta ^+}(T-t)/(3\lambda ))-\gamma \sinh (\sqrt{\delta ^+}(T-t)/(3\lambda )) \\&+ 3\sqrt{\delta ^-} \sinh (\sqrt{\delta ^+}(T-t)/(3\lambda )) \coth (\sqrt{\delta ^-}(T-t)/\lambda ), \\ d^2_t \triangleq&\; 3 \sqrt{\delta ^-} \cosh (\sqrt{\delta ^-}(T-t)/\lambda ) -\gamma \sinh (\sqrt{\delta ^-}(T-t)/\lambda ) \\&+ \sqrt{\delta ^+}\sinh (\sqrt{\delta ^-}(T-t)/\lambda ) \coth (\sqrt{\delta ^+}(T-t)/(3\lambda )) \end{aligned}$$

yields

$$\begin{aligned} \lim _{t\uparrow T} w^1_t = \frac{\sqrt{\delta ^+}}{\sqrt{\delta ^+} + \sqrt{\delta ^+}} = \frac{1}{2}, \qquad \lim _{t\uparrow T} w^2_t = \frac{\sqrt{\delta ^-}}{\sqrt{\delta ^-} + \sqrt{\delta ^-}} = \frac{1}{2}. \end{aligned}$$

Similarly, with

$$\begin{aligned} \frac{c_t^+}{c^+_t+c^-_t} =&\; \frac{2\sqrt{\delta ^+} \coth (\sqrt{\delta ^+}(T-t)/(3\lambda )) + \gamma }{2\sqrt{\delta ^+} \coth (\sqrt{\delta ^+}(T-t)/(3\lambda )) + 6\sqrt{\delta ^-} \coth (\sqrt{\delta ^-}(T-t)/\lambda )-2\gamma } \\ \frac{c_t^-}{c^+_t+c^-_t} =&\; \frac{6 \sqrt{\delta ^-} \coth (\sqrt{\delta ^-}(T-t)/\lambda ) - 3 \gamma }{2\sqrt{\delta ^+} \coth (\sqrt{\delta ^+}(T-t)/(3\lambda )) + 6\sqrt{\delta ^-} \coth (\sqrt{\delta ^-}(T-t)/\lambda )-2\gamma } \end{aligned}$$

we also have

$$\begin{aligned} \lim _{t\uparrow T} \frac{c_t^+}{c^+_t+c^-_t} = \frac{\sqrt{\delta ^+}}{\sqrt{\delta ^+} + \sqrt{\delta ^+}} = \frac{1}{2}, \qquad \lim _{t\uparrow T} \frac{c_t^-}{c^+_t+c^-_t} = \frac{\sqrt{\delta ^-}}{\sqrt{\delta ^-} + \sqrt{\delta ^-}} = \frac{1}{2} \end{aligned}$$

and hence

$$\begin{aligned} \lim _{t\uparrow T} w^3_t = \lim _{t\uparrow T} w^4_t = \lim _{t\uparrow T} w^5_t = 0 \end{aligned}$$

as desired. \(\square \)
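The limits in this proof are easy to sanity-check numerically. The Python sketch below evaluates, for a small time to maturity \(\tau = T-t\), the ratio of hyperbolic sines, the share \(c^+/(c^++c^-)\) with \(c^\pm \) as recalled from (19), and \(w^2\) via the representation with \(d^2\) given above. The values of `delta_plus` and `delta_minus` are placeholder assumptions (both are defined outside this section), and the function name is hypothetical.

```python
import math

def coth(x):
    return 1.0 / math.tanh(x)

def weights_near_maturity(tau, lam, gamma, delta_plus, delta_minus):
    """Evaluate three quantities from the proof at time to maturity tau > 0."""
    rp, rm = math.sqrt(delta_plus), math.sqrt(delta_minus)
    a = rp * tau / (3.0 * lam)
    b = rm * tau / lam
    c_plus = rp * coth(a) / 3.0 + gamma / 6.0
    c_minus = rm * coth(b) - gamma / 2.0
    d2 = 3.0 * rm * math.cosh(b) - gamma * math.sinh(b) + rp * math.sinh(b) * coth(a)
    return {
        "sinh_ratio": math.sinh(a) / math.sinh(b),
        "c_plus_share": c_plus / (c_plus + c_minus),
        "w2": 3.0 * rm * math.exp(-gamma * tau / (2.0 * lam)) / d2,
    }

# Placeholder parameters; delta_plus and delta_minus are assumptions.
DELTA_PLUS, DELTA_MINUS = 3.01, 1.01
NEAR_T = weights_near_maturity(1e-6, lam=1.0, gamma=0.2,
                               delta_plus=DELTA_PLUS, delta_minus=DELTA_MINUS)
SINH_LIMIT = math.sqrt(DELTA_PLUS) / (3.0 * math.sqrt(DELTA_MINUS))
```

At \(\tau = 10^{-6}\) the three entries are numerically indistinguishable from \(\sqrt{\delta ^+}/(3\sqrt{\delta ^-})\), 1/2 and 1/2, respectively.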

The final lemma provides estimates with respect to the \(L^2({\mathbb {P}}\otimes dt)\)-norm which are used in the Proof of Theorem 3.5 above.

Lemma 3.8

Let \((\zeta _t)_{0 \le t \le T} \in L^2({\mathbb {P}}\otimes dt)\) be progressively measurable. Moreover, let \(K^1(t,u)\), \(K^2(t,u)\), \(0 \le t \le u < T\), denote the kernels from Theorem 3.5.

  (a)

    For \(\zeta ^{K^1}_t \triangleq {\mathbb {E}}[ \int _t^T \zeta _u K^1(t,u) du \vert {\mathscr {F}}_t]\), \(0 \le t < T\), it holds that

    $$\begin{aligned} \Vert \zeta ^{K^1} \Vert _{L^2({\mathbb {P}}\otimes dt)} \le c \Vert \zeta \Vert _{L^2({\mathbb {P}}\otimes dt)} \end{aligned}$$

    for some constant \(c>0\).

  (b)

    For \(\zeta ^{K^2}_t \triangleq {\mathbb {E}}[ \int _t^T \zeta _u K^2(t,u) du \vert {\mathscr {F}}_t]\), \(0 \le t < T\), it holds that

    $$\begin{aligned} \Vert \zeta ^{K^2} \Vert _{L^2({\mathbb {P}}\otimes dt)} \le c \Vert \zeta \Vert _{L^2({\mathbb {P}}\otimes dt)} \end{aligned}$$

    for some constant \(c>0\).

Proof

Both upper bounds can be verified in a similar fashion as in the proof of Lemma 5.5 in Bank et al. [5]. We will thus omit it here. \(\square \)

Remark 3.9

Following up on Remark 2.3, setting \(\xi ^1 \equiv \xi ^2 \equiv 0\) and \(\Xi ^1_T = \Xi ^2_T = 0\) \({\mathbb {P}}\)-almost surely, our Theorem 3.5 together with Corollary 3.6 retrieves the two-player results from Carlin et al. [9,  Result 1] for the case \(\sigma = 0\) and from Schied and Zhang [29,  Corollary 2.6] for the case \(\sigma > 0\). Note that this configuration yields \({\hat{\xi }}^1 \equiv {\hat{\xi }}^2 \equiv 0\) in (22) and (23), which in turn implies that the Nash equilibrium trading rates in (21) and the corresponding share holdings in (66) and (67) are deterministic.

We end this section with a brief qualitative discussion of the Nash equilibrium obtained in Theorem 3.5. Much as in the single-player solution in [5], the trading rates \({\hat{\alpha }}^1\) and \({\hat{\alpha }}^2\) in (21) prescribe gradually trading in the direction of an optimal signal process \({\hat{\xi }}^1_t\) and \({\hat{\xi }}^2_t\), respectively (rather than toward the actual target position \(\xi ^1_t\), \(\xi ^2_t\)), which is further adjusted by a fraction \(w^5_t \in (-1,1)\) of the opponent’s respective current portfolio position \({\hat{X}}^2_t\) and \({\hat{X}}^1_t\). The optimal signal processes \({\hat{\xi }}^1\) in (22) and \({\hat{\xi }}^2\) in (23) are convex combinations of weighted averages of expected future target positions of the processes \(\xi ^1\), \(\xi ^2\) and the expected terminal positions \(\Xi ^1_T\), \(\Xi ^2_T\), where the weights \(w^1_t, w^2_t, w^3_t, w^4_t\) systematically shift toward the desired individual terminal state as \(t \uparrow T\) (Lemma 3.7 implies that \(\lim _{t \uparrow T} {\hat{\xi }}^i_t = \Xi ^i_T\) \({\mathbb {P}}\)-a.s. for both players \(i=1,2\)). The increasing urgency rate \((c^+_t+c^-_t)/(2\lambda ) \uparrow \infty \) for \(t \uparrow T\), together with \(\lim _{t\uparrow T} w^5_t = 0\), then forces both strategies in (21) to end up in the predetermined terminal portfolio position at maturity T (see also the Proof of Theorem 3.5 above). Interestingly, we note that the first agent’s optimal signal process \({\hat{\xi }}^1\) not only seeks to anticipate the future evolution of her own target strategy \(\xi ^1\) but, conscious of her competitor’s trading goals, does so also for the opponent’s target strategy \(\xi ^2\). In other words, besides following her own objectives, she also takes into account the other agent’s known trading intentions. Moreover, the weights \(w^3_t\) and \(w^4_t\) dictate the actual trading direction with respect to the other agent’s tracking target.
Indeed, observe that if \(w^3_t\) dominates \(w^4_t\) in (22), the first player’s optimal signal \({\hat{\xi }}^1\) directs her to trade in parallel with the second player, that is, in the direction of the expected future average positions of \(\xi ^2\). In contrast, if \(w^4_t\) outweighs \(w^3_t\), then the optimal signal directs her to trade in the opposite direction of the second player’s target strategy, i.e., toward the expected weighted averages of \(-\xi ^2\). The former case can be viewed as a predatory trading action of the first agent against the second agent, whereas the latter case can be regarded as cooperative behaviour. The same applies to the second player in (23) by symmetry. In our illustrations in Sect. 4 below it becomes apparent that which of the two cases occurs depends on the relationship between the permanent and temporary price impact parameters \(\gamma \) and \(\lambda \). Loosely speaking, in a plastic market where \(\gamma \gg \lambda \), the weight \(w^3\) tends to be larger than \(w^4\), and in an elastic market with \(\lambda \gg \gamma \) the weight \(w^4\) tends to be larger than \(w^3\) (see also the graphical illustration of the weight functions in Fig. 1 below). In this regard, depending on the illiquidity parameters the optimal signal processes \({\hat{\xi }}^1\) and \({\hat{\xi }}^2\) account for different types of regimes. It turns out that this leads to qualitatively different behavioral patterns in the Nash equilibrium where both predation and cooperation between the agents can occur, even in a coexisting manner.

4 Illustrations

In this section we present some case studies to illustrate the qualitative behaviour of the two-player Nash equilibrium presented in Theorem 3.5.

4.1 Optimal liquidation revisited

We start with revisiting the differential game of optimal portfolio liquidation studied in Schied and Zhang [29]. Specifically, the first agent seeks to liquidate her initial portfolio position of \(x^1=1\) shares in the risky asset by time \(T=2\) and hence requires her terminal position to satisfy \(\Xi ^1_T=0\) \({\mathbb {P}}\)-a.s. at final time. Vigilant about her share holdings and in line with her selling intention she also wants her inventory to be close to 0 throughout by tracking \(\xi ^1 \equiv 0\) on [0, T]. The second agent, on the contrary, does not pursue any predetermined buying or selling objectives but solely chooses to trade in the risky asset because he knows about the intentions of the first liquidating agent. That is, possessing no shares at time 0 (\(x^2=0\)) he gives himself the constraints \(\xi ^2_t = \Xi ^2_T = 0\) \({\mathbb {P}}\)-a.s. for all \(t \in [0,T]\). In this case, following Theorem 3.5, we have \({\hat{\xi }}^1 \equiv {\hat{\xi }}^2 \equiv 0\) \({\mathbb {P}}\)-a.s. on [0, T] in (22) and (23), and the deterministic equilibrium trading rates of both players in (21) reduce to

$$\begin{aligned} {\hat{\alpha }}^1_t = \frac{c^+_t+c^-_t}{2\lambda } \left( - w^5_t {\hat{X}}^2_t - {\hat{X}}^1_t \right) \quad \text {and} \quad {\hat{\alpha }}^2_t = \frac{c^+_t+c^-_t}{2\lambda } \left( - w^5_t {\hat{X}}^1_t - {\hat{X}}^2_t \right) \end{aligned}$$
(68)

on [0, T); cf. also the result in [29,  Corollary 2.6] with a slightly different representation. We observe in (68) that the first agent’s portfolio position \({\hat{X}}^1_t\) is not gradually reverting towards 0 but takes the effect of the second agent’s actions into account via the correction term \(-w^5_t {\hat{X}}^2_t\). Similarly, concerning the second agent, it is optimal for him to systematically trade in the direction of the liquidating agent’s current portfolio position \({\hat{X}}^1_t\) weighted with \(w^5_t \in (-1,1)\).
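The coupled rates in (68) can be simulated with a few lines of Python. The sketch below is a rough illustration under stated assumptions, not the paper's implementation. It uses \(c^\pm \) as recalled from (19) and takes \(w^5_t = (c^+_t - c^-_t)/(c^+_t + c^-_t)\), which is inferred here by matching (68) with the representations of \({\hat{\alpha }}^\pm \) in (49) rather than quoted from (20). The values of `delta_plus` and `delta_minus` are placeholders and all function names are hypothetical. As a consistency check, the simulated sum \({\hat{X}}^1+{\hat{X}}^2\) is compared with \((x^1+x^2) e^{-\int _0^t c^+_u/\lambda \, du}\) from (66) and (67), where the exponential is evaluated in closed form.

```python
import math

def coth(x):
    return 1.0 / math.tanh(x)

def c_plus(t, T, lam, gamma, delta_plus):
    root = math.sqrt(delta_plus)
    return root * coth(root * (T - t) / (3.0 * lam)) / 3.0 + gamma / 6.0

def c_minus(t, T, lam, gamma, delta_minus):
    root = math.sqrt(delta_minus)
    return root * coth(root * (T - t) / lam) - gamma / 2.0

def simulate_liquidation_game(x1, x2, T, lam, gamma, delta_plus, delta_minus,
                              t_end, n=20000):
    """Explicit Euler scheme for the rates in (68) on [0, t_end], with t_end < T."""
    dt = t_end / n
    for i in range(n):
        t = i * dt
        cp = c_plus(t, T, lam, gamma, delta_plus)
        cm = c_minus(t, T, lam, gamma, delta_minus)
        w5 = (cp - cm) / (cp + cm)        # ASSUMPTION: inferred, not quoted from (20)
        urgency = (cp + cm) / (2.0 * lam)
        a1 = urgency * (-w5 * x2 - x1)    # first agent's rate in (68)
        a2 = urgency * (-w5 * x1 - x2)    # second agent's rate in (68)
        x1, x2 = x1 + a1 * dt, x2 + a2 * dt
    return x1, x2

def sum_closed_form(x_sum, t, T, lam, gamma, delta_plus):
    """(x^1 + x^2) exp(-int_0^t c^+_u / lam du), using int coth = log sinh."""
    k = math.sqrt(delta_plus) / (3.0 * lam)
    return x_sum * math.exp(-gamma * t / (6.0 * lam)) * math.sinh(k * (T - t)) / math.sinh(k * T)

# Placeholder parameters; delta_plus and delta_minus are assumptions.
X1_END, X2_END = simulate_liquidation_game(
    1.0, 0.0, T=2.0, lam=1.0, gamma=0.2,
    delta_plus=3.01, delta_minus=1.01, t_end=1.8)
SUM_EXACT = sum_closed_form(1.0, 1.8, T=2.0, lam=1.0, gamma=0.2, delta_plus=3.01)
```

Because the placeholder values of \(\delta ^\pm \) need not match the paper's definitions, the sketch should not be used to read off the sign of \({\hat{X}}^2\); it only illustrates the mechanics of (68).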

Fig. 1

Exemplary illustration of the weight functions \(w^1\), \(w^2\), \(w^3\), \(w^4\), \(w^5\) on [0, T] defined in (20). The parameters are \(T=5\), \(\sigma = 1\), \(\lambda = 1\), as well as \(\gamma = 4\) (left panel), \(\gamma = 0.2\) (right panel)

Fig. 2

The two-player Nash equilibrium strategies \({\hat{X}}^1\) for the liquidating agent 1 (green) and \({\hat{X}}^2\) for agent 2 (orange) on [0, T], together with the corresponding processes \(-w^5 {\hat{X}}^i\) \((i=1,2)\) from the trading rates in (68) (same-color dashed lines). The optimal single-player liquidation strategy from (69) is depicted in black. The parameters are \(T=2\), \(\sigma = 1\), \(\lambda = 1\), as well as \(\gamma = 4\) (left panel), \(\gamma = 0.2\) (right panel)

As shown in Fig. 2, this leads to predation on the first agent by the second agent in a plastic market where, e.g., \(\gamma = 4 > 1 = \lambda \). Indeed, during the first half of the trading period the second agent short-sells the risky asset in parallel to the selling of the first agent and then steadily unwinds his accrued short position by buying back shares to become “hands-clean” by final time T. In contrast, in an elastic market with, e.g., \(\gamma = 0.2 < 1 = \lambda \), the Nash equilibrium strategy directs the second agent to cooperate with the seller: by time T/2 he moderately buys almost one-tenth of the shares that agent 1 is concurrently selling, before starting to liquidate his portfolio so as to finish with zero inventory at T. Note that the weight function \(w^5_\cdot \) in (68) flips sign depending on the market’s illiquidity regime (see also Fig. 1). As a consequence, compared to the single-player optimal liquidation strategy \({\hat{X}}_t = 1 + \int _0^t {\hat{\alpha }}_s ds\), \(t \in [0,T]\), which satisfies

$$\begin{aligned} {\hat{\alpha }}_t = -\sqrt{\frac{\sigma }{\lambda }} \coth \left( \sqrt{\frac{\sigma }{\lambda }} (T-t) \right) {\hat{X}}_t \qquad (0 \le t < T) \end{aligned}$$
(69)

(cf., e.g., Almgren [1]), and does not depend on \(\gamma \), we observe in Fig. 2 that, due to the presence of the second agent’s trading activity which directly feeds into the first agent’s turnover rate \({\hat{\alpha }}^1\) via \(-w^5 {\hat{X}}^2\) in (68), her optimal portfolio liquidation strategy becomes more prudent in a plastic market and slightly more aggressive in an elastic market environment. To sum up, in equilibrium, depending on the illiquid market type, either predation or cooperation between both agents occurs; see also the discussion in [29,  Sect. 3].
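For reference, the single-player benchmark in (69) is straightforward to reproduce. The Python sketch below integrates (69) with an explicit Euler scheme and compares the result with the closed form \({\hat{X}}_t = \sinh (\kappa (T-t))/\sinh (\kappa T)\), \(\kappa = \sqrt{\sigma /\lambda }\), which solves (69) with \({\hat{X}}_0 = 1\) as can be checked by differentiation. Function names and the grid size are assumptions of this illustration.

```python
import math

def single_player_euler(T, sigma, lam, t_end, n=20000):
    """Explicit Euler scheme for (69) with X_0 = 1 on [0, t_end], t_end < T."""
    kappa = math.sqrt(sigma / lam)
    dt = t_end / n
    x = 1.0
    for i in range(n):
        t = i * dt
        alpha = -kappa / math.tanh(kappa * (T - t)) * x   # trading rate from (69)
        x += alpha * dt
    return x

def single_player_closed_form(t, T, sigma, lam):
    """sinh(kappa (T - t)) / sinh(kappa T), the solution of (69) with X_0 = 1."""
    kappa = math.sqrt(sigma / lam)
    return math.sinh(kappa * (T - t)) / math.sinh(kappa * T)

# Parameters of Fig. 2: T = 2, sigma = 1, lambda = 1 (gamma does not enter).
X_EULER = single_player_euler(2.0, 1.0, 1.0, t_end=1.8)
X_EXACT = single_player_closed_form(1.8, 2.0, 1.0, 1.0)
```

The scheme stops shortly before T because the rate in (69) diverges as \(t \uparrow T\); the closed form confirms that the inventory tends to zero at maturity.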

4.2 Piecewise constant inventory targets

The next two case studies are again simple deterministic examples but this time with nonzero optimal signal processes \({\hat{\xi }}^1\) and \({\hat{\xi }}^2\).

Fig. 3

The two-player Nash equilibrium strategies \({\hat{X}}^1\) for Player 1 (green) and \({\hat{X}}^2\) for Player 2 (orange), together with the processes \({\hat{\xi }}^i-w^5 {\hat{X}}^j\) \((i\ne j \in \{1,2\})\) from the optimal trading rates in (21) (same-color dashed lines). The first agent’s buying program \(\xi ^1 = 1_{[0,5)}+2 \cdot 1_{[5,10]}\) is plotted in grey. For comparison, the corresponding single-player optimal tracking strategy with associated optimal signal process from [5] is depicted in black (solid and dashed). The parameters are \(T=10\), \(\sigma = 1\), \(\lambda = 1\), as well as \(\gamma = 4\) (left panel), \(\gamma = 0.2\) (right panel)

In the first example, as in the optimal liquidation problem above, we suppose that agent 2 only trades in the risky asset because of his awareness of the trading activity of the first agent. That is, with \(x^2 = 0\) initial shares his inventory targets are \( \xi ^2_t = \Xi ^2_T = 0\) \({\mathbb {P}}\)-a.s. for all \(t \in [0,T]\). Concerning the first agent, starting with no inventory \(x^1=0\) she wants to follow a stock-buying schedule over a time period of \(T=10\) that prescribes holding one share until time T/2 and then doubling and holding her position up to time T. Her inventory target is thus \(\xi ^1_t = 1 \cdot 1_{\{0 \le t < 5\}}+2 \cdot 1_{\{5 \le t \le 10\}}\) on [0, T] with terminal constraint \(\Xi ^1_T = 2\). Note that in this game setup the optimal signal processes \({\hat{\xi }}^1\) and \({\hat{\xi }}^2\) of both agents in (22) and (23) in equilibrium are nonzero. In particular, similar to the single-player case in [5] they anticipate and smooth out the jump in \(\xi ^1\) at time T/2 via the averaging through the kernels \(K^1\) and \(K^2\). The associated Nash equilibrium trading strategies \({\hat{X}}^1\) and \({\hat{X}}^2\) from Theorem 3.5 are presented in Fig. 3. As expected from the liquidation problem above, if the market is plastic \((\gamma > \lambda )\) the second agent heavily preys on the first agent by buying shares in the same direction halfway through the trading period. Accordingly, in comparison to the first agent’s single-player optimal tracking strategy from [5] (which does not depend on \(\gamma \)), her pursuit of the buying schedule \(\xi ^1\) is affected by the presence of the preying second agent and falls behind the single-player solution in the second half of the trading period (also recall the adjustment \({\hat{\xi }}^1 -w^5{\hat{X}}^2\) of the first agent’s optimal signal process in her trading rate in (21)).
However, if the market is elastic \((\lambda > \gamma )\) the second agent’s optimal behaviour in equilibrium changes. Interestingly, we observe that his strategy turns out to be a succession of round-trips during which he either provides liquidity to his opponent by short-selling the risky asset, e.g., during the first quarter of the trading period, or engages in predatory trading by concurrently building up some inventory in parallel to his adversary’s buying efforts, as is the case during the second quarter of the trading period. Accordingly, compared to her single-player optimal strategy, the first agent buys slightly faster or slower at the corresponding times in the two-player setup. Overall, it turns out that predation and cooperation coexist in equilibrium in this case.

Fig. 4

The two-player Nash equilibrium strategies \({\hat{X}}^1\) for Player 1 (green) and \({\hat{X}}^2\) for Player 2 (orange), together with the processes \({\hat{\xi }}^i-w^5 {\hat{X}}^j\) \((i\ne j \in \{1,2\})\) from the optimal trading rates in (21) (same-color dashed lines). Both agents’ inventory targets \(\xi ^1 \equiv 1\) and \(\xi ^2 \equiv 0.1\) are plotted in grey. For comparison, the corresponding single-player optimal tracking strategies with associated optimal signal processes from [5] are depicted in black (solid and dashed). The parameters are \(T=10\), \(\sigma = 1\), \(\lambda = 1\), as well as \(\gamma = 4\) (left panel), \(\gamma = 0.2\) (right panel)

As a second example, let us examine the situation where both agents with zero initial inventory \(x^1=x^2=0\) seek to gradually build up and hold a positive fraction of the risky asset over some time period [0, T] with \(T=10\). Concretely, assume that \(\xi ^1 \equiv \Xi ^1_T = 1\) and \(\xi ^2 \equiv \Xi ^2_T = 0.1\), i.e., agent 1 wants her inventory to be close to 1 and ten times larger than the desired inventory level of agent 2 all through the trading period [0, T]. The associated Nash equilibrium strategies \({\hat{X}}^1\) and \({\hat{X}}^2\) from Theorem 3.5 are presented in Fig. 4. Again, as expected from the analysis above, in a plastic market it is optimal for agent 2 to excessively prey on the first agent, who aims for a much larger asset position, by buying up to three times more shares than his actual target inventory prescribes. In response, the acquisition of the first agent is slowed down compared to her single-player optimal strategy from [5]. By contrast, in an elastic market environment it turns out to be optimal for the second agent to initially ignore his own tracking target and to trade away from his desired inventory level in order to provide liquidity to the higher-volume seeking first agent by short-selling some shares. Also note how in this case the second agent’s single-player optimal tracking strategy from [5] strongly differs from his optimal behaviour in the two-player Nash equilibrium at the beginning of the trading period.

4.3 Running after the delta

In the final two examples we want to investigate a situation where the target strategies \(\xi ^1\) and \(\xi ^2\) are adapted stochastic processes. Specifically, let us suppose that the first agent wants to hedge an at-the-money call option with maturity T on the underlying unaffected price process \(P=P_0+\sqrt{\sigma } W\) in (3) by tracking the corresponding frictionless (Bachelier-)delta-hedging strategy

$$\begin{aligned} \xi ^1_t \triangleq \Phi \left( \frac{P_t - P_0}{\sqrt{\sigma (T-t)}} \right) \quad (0 \le t \le T). \end{aligned}$$
(70)

Here, \(\Phi \) denotes the cumulative distribution function of the standard normal distribution. We further suppose that her initial position in the risky asset coincides with the frictionless delta \(x^1=\xi ^1_0 = 1/2\) and that \(\Xi ^1_T = 0\) \({\mathbb {P}}\)-a.s., i.e., she wants to systematically unwind her hedging portfolio when approaching maturity T.

Lemma 4.1

The process \((\xi ^1_t)_{0 \le t \le T}\) in (70) is a martingale on [0, T].

Proof

Obviously, \((\xi ^1_t)_{0 \le t \le T}\) is adapted, bounded and hence integrable. Moreover, using the property that for any \(a,b \in {\mathbb {R}}\) a standard normally distributed random variable Z satisfies \({\mathbb {E}}[\Phi (a Z + b)] = \Phi (b/\sqrt{1+a^2})\) we obtain for all \(0 \le s \le t < T\)

$$\begin{aligned} {\mathbb {E}}\left[ \Phi \left( \frac{P_t - P_0}{\sqrt{\sigma (T-t)}} \right) \bigg \vert \, {\mathscr {F}}_s \right] = {\mathbb {E}}\left[ \Phi \left( \frac{\sqrt{\sigma (t-s)} Z + P_s - P_0}{\sqrt{\sigma (T-t)}} \right) \right] = \Phi \left( \frac{P_s - P_0}{\sqrt{\sigma (T-s)}} \right) \end{aligned}$$

as desired. \(\square \)
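The Gaussian identity used in this proof can be verified numerically. The Python sketch below approximates \({\mathbb {E}}[\Phi (aZ+b)]\) by a trapezoid rule against the standard normal density and compares it with \(\Phi (b/\sqrt{1+a^2})\). The function names, the truncation level `z_max`, and the sample values of s, t, T and \(P_s - P_0\) are assumptions of this illustration.

```python
import math

def Phi(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def expected_phi(a, b, z_max=10.0, n=20000):
    """Trapezoid-rule approximation of E[Phi(a Z + b)] for Z ~ N(0, 1)."""
    dz = 2.0 * z_max / n
    total = 0.0
    for i in range(n + 1):
        z = -z_max + i * dz
        weight = 0.5 if i in (0, n) else 1.0   # trapezoid end-point weights
        density = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
        total += weight * Phi(a * z + b) * density * dz
    return total

def conditional_delta(p_diff, s, t, T, sigma):
    """E[xi^1_t | F_s] for P_s - P_0 = p_diff, via the substitution in the proof."""
    a = math.sqrt(sigma * (t - s)) / math.sqrt(sigma * (T - t))
    b = p_diff / math.sqrt(sigma * (T - t))
    return expected_phi(a, b)

# Hypothetical sample values: s = 1, t = 3, T = 5, sigma = 1, P_s - P_0 = 0.4.
COND = conditional_delta(0.4, s=1.0, t=3.0, T=5.0, sigma=1.0)
XI_S = Phi(0.4 / math.sqrt(1.0 * (5.0 - 1.0)))
```

Here `COND` and `XI_S` agree up to quadrature error, which is the martingale property \({\mathbb {E}}[\xi ^1_t \vert {\mathscr {F}}_s] = \xi ^1_s\) at the chosen sample point.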

Firstly, we assume that the second agent does not pursue any specific predetermined trading objectives, that is, \(x^2 = \xi ^2 = \Xi ^2_T = 0\) \({\mathbb {P}}\)-a.s. Since \(\xi ^1\) in (70) is a martingale on [0, T] the optimal signal processes \({\hat{\xi }}^1\) and \({\hat{\xi }}^2\) in (22) and (23) simplify to

$$\begin{aligned} {\hat{\xi }}^1_t = (w^3_t + w^4_t) \xi ^1_t \quad \text {and} \quad {\hat{\xi }}^2_t = (w^3_t - w^4_t) \xi ^1_t \qquad (0 \le t \le T), \end{aligned}$$
(71)

using Fubini’s theorem and the fact that for each \(t \in [0,T)\) the kernels \(K^1(t,u)\) and \(K^2(t,u)\) as functions in \(u \in [t,T)\) integrate to one over [t, T). The Nash equilibrium strategies \({\hat{X}}^1\) and \({\hat{X}}^2\) from Theorem 3.5 are plotted in Fig. 5, together with the corresponding realisation of the delta-hedge \(\xi ^1\) in the case where the call option expires in the money.

Fig. 5

The two-player Nash equilibrium strategies \({\hat{X}}^1\) for Player 1 (green) and \({\hat{X}}^2\) for Player 2 (orange), together with the processes \({\hat{\xi }}^i-w^5 {\hat{X}}^j\) \((i\ne j \in \{1,2\})\) from the optimal trading rates in (21) (same-color dashed lines). The first agent’s frictionless delta-hedge \(\xi ^1\) is plotted in grey. For comparison, her corresponding single-player optimal hedging strategy with associated optimal signal process from [5] is depicted in black (solid and dashed). The parameters are \(T=5\), \(\sigma = 1\), \(\lambda = 1\), as well as \(\gamma = 4\) (left panel), \(\gamma = 0.2\) (right panel)

Depending on the illiquidity parameters, we observe the same behavioral patterns in equilibrium as in the deterministic cases analyzed above: In a plastic market environment, the second agent engages in predatory trading on the first agent by trading in parallel in the same direction as the delta-hedge. When the market is elastic, he turns into a liquidity provider instead and partially takes the opposite side of the hedger’s transactions. Also note that the sign of the second agent’s optimal signal process in (71) is determined by the relation between the weights \(w^3\) and \(w^4\), which is in turn affected by the relation between \(\gamma \) and \(\lambda \) (cf. also Fig. 1).

Fig. 6

The two-player Nash equilibrium strategies \({\hat{X}}^1\) for Player 1 (green) and \({\hat{X}}^2\) for Player 2 (orange), together with the processes \({\hat{\xi }}^i-w^5 {\hat{X}}^j\) \((i\ne j \in \{1,2\})\) from the optimal trading rates in (21) (same-color dashed lines). Only the second agent’s frictionless delta-hedge \(\xi ^2=\xi ^1/10\) is plotted in grey (the first agent’s target strategy \(\xi ^1\) is the same as in Fig. 5 and omitted here). For comparison, the corresponding single-player optimal hedging strategies of the two agents together with their associated optimal signal processes from [5] are depicted in black (solid and dashed). The parameters are \(T=5\), \(\sigma = 1\), \(\lambda = 1\), as well as \(\gamma = 4\) (left panel), \(\gamma = 0.2\) (right panel)

Secondly, let us now assume that the second agent also hedges one tenth of the same call option, i.e., \(\xi ^2 = \xi ^1/10\) (with initial and final portfolio positions \(x^2=1/20\) and \(\Xi ^2_T =0\) \({\mathbb {P}}\)-a.s.). The resulting Nash equilibrium strategies from Theorem 3.5 are presented in Fig. 6, where we used the same realisation of the delta-hedge as in Fig. 5. As in the deterministic case above, the second agent’s optimal behaviour in the two-player Nash equilibrium changes notably compared to his optimal single-player frictional hedging strategy from [5]: he focusses more on preying on the first agent’s larger hedging portfolio in a plastic market, or on providing liquidity to her in an elastic market.