A two-player portfolio tracking game

Voß, Moritz

doi:10.1007/s11579-022-00324-6


Open access
Published: 26 July 2022

Volume 16, pages 779–809, (2022)

Mathematics and Financial Economics


Moritz Voß (ORCID: orcid.org/0000-0002-5047-6280)


Abstract

We study the competition of two strategic agents for liquidity in the benchmark portfolio tracking setup of Bank et al. (Math Financial Economics 11(2):215–239 2017). Specifically, both agents track their own stochastic running trading targets while interacting through common aggregated temporary and permanent price impact à la Almgren and Chriss (J Risk 3:5–39 2001). The resulting stochastic linear quadratic differential game with terminal state constraints allows for a unique and explicitly available open-loop Nash equilibrium. Our results reveal how the equilibrium strategies of the two players take into account the other agent’s trading targets: either in an exploitative intent or by providing liquidity to the competitor, depending on the relation between temporary and permanent price impact. As a consequence, different behavioral patterns can emerge as optimal in equilibrium. These insights complement and extend existing studies in the literature on predatory trading models examined in the context of optimal portfolio liquidation games.


1 Introduction

In recent years, the study of so-called price impact games (also referred to as market impact games) in the context of optimal portfolio liquidation problems has attracted a lot of attention in the financial mathematics literature. These games investigate the strategic interaction of financial agents who simultaneously trade in the same risky asset in order to cost-efficiently liquidate their positions while affecting the asset’s execution price through jointly generated price impact, that is, by moving the price in an adverse manner when they execute their buy or sell orders. These price impact games provide a tractable way to formalize the competition between agents for a risky asset’s liquidity. Among the first game-theoretic approaches carried out to investigate possible phenomena in a competitive equilibrium where agents seek to liquidate their positions in the same risky asset are, e.g., Brunnermeier and Pedersen [6], Attari et al. [4], Carlin et al. [9], Schöneborn [32], Schöneborn and Schied [33], Carmona and Yang [10], and Schied and Zhang [29].

Our goal in this paper is to extend these works by formulating and studying the competition between two strategic agents for liquidity when both agents are trading simultaneously in an illiquid risky asset affected by price impact, because each agent seeks to track her own exogenously given stochastic target strategy like, e.g., a frictionless delta hedge to dynamically hedge the fluctuations of their random endowments. Single-agent optimal tracking problems in the presence of price impact have first been considered by Rogers and Singh [28], Naujokat and Westray [26], Horst and Naujokat [22], and Cartea and Jaimungal [11]. To the best of our knowledge, the present manuscript is the first to study a dynamic tracking problem in a competitive two-player price impact game setting. Specifically, we extend the single-player cost optimal benchmark portfolio tracking problem studied in Bank et al. [5] in the presence of temporary and permanent price impact as proposed by Almgren and Chriss [2] to a two-player stochastic differential game. Both strategic agents are fully aware of the opponent’s individual tracking objectives and they compete for available liquidity as the jointly caused price impact on the execution price directly feeds into their trading performances. We also allow for individual stochastic terminal state constraints on each agent’s final portfolio position. Our aim is to shed light on the strategic interplay between the agents and to make transparent how each agent takes into account the other agent’s trading targets in an optimal cost minimizing manner by solving for a Nash equilibrium in this two-player price impact game.

The paper most closely related to ours is Schied and Zhang [29]. Therein, the authors determine a unique open-loop Nash equilibrium within the class of deterministic strategies of agents aiming to liquidate a given asset position by maximizing a mean-variance criterion in an Almgren and Chriss [2] framework. Their study is an extension of the corresponding deterministic differential game solved in Carlin et al. [9] of liquidating risk-neutral agents who maximize expected revenues. Other extensions of the latter game include, e.g., Schöneborn and Schied [33], Carmona and Yang [10], Moallemi et al. [25], Chu et al. [14]. In contrast to these papers, which focus on optimal portfolio liquidation only, we additionally allow the agents to track their own general predictable target strategies as in the single-player case investigated in [5]. Moreover, facing the same time horizon, the players’ terminal portfolio positions are also restricted to some exogenously predetermined stochastic levels which reveal gradually over time. As a consequence, both agents will choose their dynamic trading strategies from a suitable set of adapted stochastic processes rather than opting for static strategies from a set of deterministic functions as in the papers cited above (except for the numerical study in [10]).

Other recent work on both finite-player and infinite-player mean field price impact games with Almgren-Chriss type price impact includes, e.g., Cardaliaguet and Lehalle [8], Huang et al. [23], Casgrain and Jaimungal [12, 13], Fu et al. [21], Fu and Horst [19], Evangelista and Thamsten [18], and Drapeau et al. [15], where finitely and infinitely many agents pursue optimal liquidation of their initial positions and interact through common aggregated permanent and temporary price impact. Price impact games of liquidating agents in a market model with transient price impact are analyzed, e.g., in Luo and Schied [24], Schied and Zhang [30], Schied et al. [31], Strehle [34]; and very recently in Fu et al. [20] and Neuman and Voß [27]. However, these works are all portfolio liquidation games where the agents steer their initial portfolio positions towards zero (with strict liquidation constraints enforced in [18,19,20,21]). In particular, the agents neither track any individual stochastic running trading targets nor do they aim for reaching an individual random terminal target position. In contrast, as mentioned above, our present study formulates and solves a two-player price impact portfolio tracking game with random terminal state constraints between two heterogeneous agents who have their own individual trading targets.

Our main result is an explicit description of a unique open-loop Nash equilibrium within the class of progressively measurable strategies to our two-player stochastic differential game, where both agents track their own target strategies as in [5] and interact through temporary and permanent price impact as in [29] and [9]. Mathematically, we solve a linear quadratic stochastic differential game with random terminal state constraints. Inspired by the analysis in [5], we follow a probabilistic and convex-analytic approach in the style of Pontryagin’s stochastic maximum principle. This also allows us to consider general predictable strategies as the agents’ tracking targets and not necessarily Markovian or continuous diffusion-type processes. We prove uniqueness of the Nash equilibrium and derive its characterization, which takes the form of a four-dimensional coupled system of linear forward-backward stochastic differential equations (FBSDEs). Due to the stochastic terminal state constraints the FBSDE system has singular terminal conditions. As a consequence, explicitly computing a solution to the constrained stochastic differential game is a nontrivial task. The manuscript shows how this can be achieved. Solving the singular FBSDE system provides us with the agents’ optimal trading strategies in equilibrium in closed-form and unveils a rich phenomenology for their optimal behaviour.

In fact, it turns out that in equilibrium, similar to the single-player solution presented in [5], both agents anticipate their individual running target portfolio by gradually trading in the direction of a weighted average of expected future target positions of the target strategy. However, being aware of the competitor’s tracking goals, each agent also assesses a weighted average of the expected future positions of the opponent’s target strategy and chooses to trade accordingly. Interestingly, it arises that the agents’ trading directions with respect to the adversary’s target strategy are not invariant but depend on the relation between temporary and permanent price impact. Conceptually, our explicit results extend the analysis carried out by Schöneborn and Schied [33]. Therein, the authors identify two distinct types of illiquid markets: A plastic market where the price impact is predominantly permanent, and an elastic market where the major part of incurred price impact is temporary. Their model predicts that a competitor who is conscious of the other agent’s liquidation intention engages in predatory trading in a plastic market (in the sense that the competitor partly trades in the same direction as her opponent), while she tends to cooperate and provides liquidity in an elastic market (in the sense that she trades in the opposite direction of her opponent’s trading); cf. also the detailed discussion in Schöneborn and Schied [33]. Our closed-form Nash equilibrium solution of our more general price impact portfolio tracking game corroborates this. The novelty of our contribution comes from the fact that both predation by simultaneously trading in the same direction as the opponent, as well as cooperation by trading in the opposite direction can occur in a coexisting manner; depending on whether the market is plastic or elastic. As a consequence, different behavioral paradigms can emerge as optimal in our Nash equilibrium; see the illustrations in Sect. 4.

The remainder of the paper is organized as follows. In Sect. 2 we introduce our two-player stochastic differential price impact game by extending the framework of Carlin et al. [9] and Schied and Zhang [29] to a stochastic tracking problem of general predictable target strategies and random terminal state constraints. Our main result, an explicit description of a unique open-loop Nash equilibrium of the game, is presented in Sect. 3. Section 4 contains some illustrations and discusses the qualitative behaviour of the two players’ optimal strategies in equilibrium.

Notation: Throughout this manuscript we use superscripts for enumerating purposes as, e.g., in $X^1$, $X^2$, $\alpha ^1$, $\alpha ^2$, or other quantities like $\xi ^1$, $\xi ^2$ etc., to mark all objects which are associated with player 1 and player 2, respectively; or, to itemize objects as $w^1$, $w^2$, $w^3$ etc. In particular, $X^2$, $\alpha ^2$, $\xi ^2$ is not to be confused with quadratic powers, which will be explicitly denoted with brackets like $(\alpha )^2$, or, if necessary, as $(\alpha ^2)^2$.

2 Problem formulation

Let $T >0$ denote a finite deterministic time horizon and fix a filtered probability space $(\Omega ,{\mathscr {F}},({\mathscr {F}}_t)_{0 \le t \le T},{\mathbb {P}})$ satisfying the usual conditions of right continuity and completeness. We consider two agents (preferred pronouns she/her/hers and he/him/his, respectively) who are trading in a financial market consisting of one risky asset, e.g., a stock. The numbers of shares that agent 1 and agent 2 hold at time $t \in [0,T]$ are defined, respectively, as

$$\begin{aligned} X^1_t \triangleq x^1 + \int _0^t \alpha ^1_s ds \qquad \text {and} \qquad X^2_t \triangleq x^2 + \int _0^t \alpha ^2_s ds \end{aligned}$$

(1)

with initial positions $x^1, x^2 \in {\mathbb {R}}$. The real-valued stochastic processes $(\alpha ^1_t)_{0 \le t \le T}$ and $(\alpha ^2_t)_{0 \le t \le T}$ represent the turnover rate at which each agent trades in the risky asset and belong to the general class of stochastic processes

$$\begin{aligned} {\mathscr {A}}\triangleq \left\{ \alpha : \alpha \text { progressively measurable s.t. } {\mathbb {E}} \left[ \int _0^T (\alpha _t)^2 dt \right] < \infty \right\} . \end{aligned}$$

(2)

We adopt the framework from Carlin et al. [9] and Schied and Zhang [29] and suppose that the agents’ trading incurs linear temporary and permanent price impact à la Almgren and Chriss [2] in the sense that trades in the risky asset are executed at prices

$$\begin{aligned} S_t \triangleq P_t + \lambda (\alpha ^1_t + \alpha ^2_t)+ \gamma ((X^1_t - x^1) + (X^2_t - x^2)) \quad (0 \le t \le T) \end{aligned}$$

(3)

with some unaffected price process $P_\cdot = P_0 + \sqrt{\sigma } W_\cdot $ following a Brownian motion $(W_t)_{0 \le t \le T}$ with respect to the underlying filtration with variance $\sigma > 0$. The trading of both agents in the risky asset consumes available liquidity and instantaneously affects the execution price in (3) in an adverse manner through temporary price impact $\lambda > 0$. In addition, the agents’ total accumulated trading activity also leaves a trace in the execution price which is captured by the permanent price impact parameter $\gamma >0$.
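To make the price dynamics in (3) concrete, the following is a minimal simulation sketch on a uniform time grid. It is an illustration under stated assumptions rather than part of the model: the function name `execution_price`, the left-endpoint discretization, and all numerical parameter values are hypothetical choices.

```python
import numpy as np


def execution_price(alpha1, alpha2, T=1.0, P0=100.0,
                    sigma=0.04, lam=0.1, gamma=0.05, seed=0):
    """Simulate the unaffected price P and the execution price S from (3).

    alpha1, alpha2: trading rates on the grid t_k = k*T/n, k = 0..n-1.
    All numerical defaults are illustrative assumptions, not calibrated values.
    """
    alpha1 = np.asarray(alpha1, dtype=float)
    alpha2 = np.asarray(alpha2, dtype=float)
    n = alpha1.size
    dt = T / n
    rng = np.random.default_rng(seed)

    # Unaffected price P_t = P_0 + sqrt(sigma) * W_t at the grid points.
    dW = rng.normal(0.0, np.sqrt(dt), size=n)
    W = np.concatenate(([0.0], np.cumsum(dW)[:-1]))
    P = P0 + np.sqrt(sigma) * W

    # Shares traded up to t_k, i.e. X^i_t - x^i, as left-endpoint sums.
    traded1 = np.concatenate(([0.0], np.cumsum(alpha1)[:-1])) * dt
    traded2 = np.concatenate(([0.0], np.cumsum(alpha2)[:-1])) * dt

    temporary = lam * (alpha1 + alpha2)      # instantaneous impact of both rates
    permanent = gamma * (traded1 + traded2)  # impact of accumulated trading
    return P, P + temporary + permanent


# Both agents buy at constant rates 1 and 2 over [0, 1].
P, S = execution_price(np.ones(4), 2.0 * np.ones(4))
print(S - P)  # price impact component only
```

With constant buying rates the temporary component is flat while the permanent component grows linearly in time, which is the adverse price drift both agents have to pay for.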

Similar to the single-agent setup in Bank et al. [5] we assume that agent 1 and agent 2 are trading in this illiquid risky asset because each agent seeks to track their own exogenously given target strategy $(\xi _t^1)_{0 \le t \le T}$ and $(\xi _t^2)_{0 \le t \le T}$, respectively. Both processes $\xi ^1$ and $\xi ^2$ are supposed to be real-valued predictable processes in $L^2({\mathbb {P}}\otimes dt)$ and can be thought of, for instance, as hedging strategies adopted from a frictionless market. Moreover, the agents are also required to reach a predetermined terminal portfolio target position $\Xi ^1_T$ and $\Xi ^2_T$ in $L^2({\mathbb {P}},{\mathscr {F}}_T)$ at time T. Mathematically, we can formalize their objectives as follows: For a given strategy $(\alpha ^2_t)_{0 \le t \le T}$ of her competitor agent 2, agent 1 aims to choose her trading rate $(\alpha ^1_t)_{0 \le t \le T}$ in order to minimize the cost functional

$$\begin{aligned} \begin{aligned} J^1(\alpha ^1;\alpha ^2)&\triangleq \; {\mathbb {E}}\left[ \frac{1}{2} \sigma \int _0^T (X^1_t - \xi ^1_t)^2 dt \right. \\&\quad \left. + \frac{1}{2} \lambda \int _0^T \alpha ^1_t \left( \alpha ^1_t + \alpha ^2_t \right) dt + \frac{1}{2} \gamma \int _0^T \alpha ^1_t \left( X^2_t - x^2 \right) dt \right] , \end{aligned} \end{aligned}$$

(4)

whereas agent 2 wishes to minimize

$$\begin{aligned} \begin{aligned} J^2(\alpha ^2;\alpha ^1)&\triangleq \; {\mathbb {E}}\left[ \frac{1}{2} \sigma \int _0^T (X^2_t - \xi ^2_t)^2 dt \right. \\&\quad \left. + \frac{1}{2} \lambda \int _0^T \alpha ^2_t \left( \alpha ^1_t + \alpha ^2_t \right) dt + \frac{1}{2} \gamma \int _0^T \alpha ^2_t \left( X^1_t - x^1 \right) dt \right] \end{aligned} \end{aligned}$$

(5)

via his trading rate $(\alpha ^2_t)_{0 \le t \le T}$ in response to a given strategy $(\alpha ^1_t)_{0 \le t \le T}$ of his opponent agent 1. As in the single-agent problem in Bank et al. [5], the first term in (4) and (5) reflects the agents’ tracking of their individual target strategies $\xi ^1$ and $\xi ^2$, respectively, through minimizing the corresponding squared deviation from their respective portfolio positions $X^1$ and $X^2$. The common weight parameter $\sigma $ measures price fluctuations of the underlying unaffected price process. The second and third terms in (4) and (5) take into account the additional incurred linear quadratic illiquidity costs which are induced by temporary and permanent price impact while both agents are trading in the risky asset as stipulated in (3) (see also Carlin et al. [9] and Schied and Zhang [29]). Note, however, that due to each agent’s individual terminal state constraint $X^i_T = \Xi ^i_T$ ${\mathbb {P}}$-a.s. (for $i=1,2$) only the competitor’s accrued permanent price impact feeds into their respective cost functional. Indeed, integration by parts yields that the i-th agent’s permanent impact from their own trading always creates the same costs $\gamma (X^i_T - x^i)^2=\gamma (\Xi ^i_T - x^i)^2$ independent of their chosen trading rate and therefore can be neglected in their own objective functional. We thus obtain the following individual optimal stochastic control problems for agent 1 and agent 2, namely,

$$\begin{aligned} J^1(\alpha ^1;\alpha ^2) \rightarrow \min _{\alpha ^1 \in {\mathscr {A}}^1} \end{aligned}$$

(6)

for any fixed strategy $\alpha ^2 \in {\mathscr {A}}^2$, and

$$\begin{aligned} J^2(\alpha ^2;\alpha ^1) \rightarrow \min _{\alpha ^2 \in {\mathscr {A}}^2}, \end{aligned}$$

(7)

for any fixed strategy $\alpha ^1 \in {\mathscr {A}}^1$, where ${\mathscr {A}}^{i}$, $i=1,2$, is the set of admissible constrained policies defined as

$$\begin{aligned} {\mathscr {A}}^{i} \triangleq \left\{ \alpha ^i : \alpha ^i \in {\mathscr {A}}\text { satisfying } X^i_T = x^i + \int _0^T \alpha ^i_t dt = \Xi ^i_T \; {\mathbb {P}}\text {-a.s.} \right\} . \end{aligned}$$

(8)
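The integration-by-parts step used to drop each agent’s own permanent impact from the objective can be spelled out as a short sketch. The normalizing constant in front of $\gamma $ depends on how the cost is scaled and is immaterial for the argument. For any $\alpha ^i \in {\mathscr {A}}^i$ we have $dX^i_t = \alpha ^i_t dt$ and hence

$$\begin{aligned} \int _0^T \alpha ^i_t (X^i_t - x^i) dt = \int _0^T (X^i_t - x^i) dX^i_t = \frac{1}{2} (X^i_T - x^i)^2 = \frac{1}{2} (\Xi ^i_T - x^i)^2 . \end{aligned}$$

The right-hand side is a fixed random variable that does not depend on the particular admissible rate $\alpha ^i$. The agent’s own permanent impact therefore only adds a constant to the objective and does not change the minimizer.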

Similar to Bank et al. [5] we further assume that the target positions $\Xi ^1_T, \Xi ^2_T \in L^2({\mathbb {P}},{\mathscr {F}}_T)$ satisfy

$$\begin{aligned} {\mathbb {E}}\left[ \int _0^T \frac{1}{T-s} d\langle M^+ \rangle _s \right]< \infty \quad \text {and} \quad {\mathbb {E}}\left[ \int _0^T \frac{1}{T-s} d\langle M^- \rangle _s \right] < \infty , \end{aligned}$$

(9)

where $M^+_t \triangleq {\mathbb {E}}[\Xi ^1_T + \Xi ^2_T \vert {\mathscr {F}}_t]$ and $M^-_t \triangleq {\mathbb {E}}[\Xi ^1_T - \Xi ^2_T \vert {\mathscr {F}}_t]$ for $0 \le t \le T$.

Remark 2.1

1. As in Carlin et al. [9] and Schied and Zhang [29] the agents’ individual optimization problems in (6) and (7) are intertwined through common aggregated temporary and permanent price impact affecting their performance functionals $J^1$ and $J^2$ in (4) and (5) (in contrast to, e.g., Huang et al. [23], Casgrain and Jaimungal [12, 13] or Ekren and Nadtochiy [17] where agents only interact through permanent or temporary price impact, respectively). One can think of both players as strategic agents who compete for liquidity while concurrently trading in a single illiquid risky asset to meet their tracking objectives for the purpose of, e.g., hedging fluctuations of random endowments. Note that both agents are fully aware of the opponent’s trading targets $\xi ^i$ and $\Xi _T^i$ ($i=1,2$), as well as the jointly caused price impact on the execution prices in (3). That is, our game is one of complete information as in the related studies in Brunnermeier and Pedersen [6], Carlin et al. [9], Schöneborn and Schied [33], Carmona and Yang [10], and Schied and Zhang [29].
2. For further motivation for the tracking cost functionals in (4) and (5) we refer to the single-player optimization problems studied, e.g., in Rogers and Singh [28], Naujokat and Westray [26], Horst and Naujokat [22], Almgren and Li [3], Bank et al. [5], and Cai et al. [7]. Observe that the squared tracking error also incorporates a risk aversion on each player’s inventory. In this regard, both agents are homogeneous in their inventory risk.
3. Note that the coefficients $\sigma , \lambda , \gamma > 0$ in the cost functionals in (4) and (5) are constants. This is an important assumption for obtaining a closed-form solution for the stochastic differential game, which is our primary focus of interest. In fact, the only sources of randomness in the game are the target strategies $(\xi ^1_t)_{0 \le t \le T}$, $(\xi ^2_t)_{0 \le t \le T}$ and the random terminal conditions $\Xi ^1_T$, $\Xi ^2_T$, which will force the agents’ optimal policies to be random processes as well.
4. Analogously to the study in Bank et al. [5] the assumption in (9) will ensure that ${\mathscr {A}}^i \ne \varnothing $ for $i=1,2$ (cf. also the Proof of Theorem 3.5 in Sect. 3 below). In fact, for given random variables $\Xi ^i_T \in L^2({\mathbb {P}},{\mathscr {F}}_T)$ only known at time T the terminal state constraint $X^i_T = \Xi ^i_T$ ${\mathbb {P}}$-a.s. ($i=1,2$) is quite demanding. Thus, loosely speaking, the condition in (9) requires that the speed at which information on the random ultimate target positions $\Xi ^1_T$, $\Xi ^2_T$ is revealed as $t \uparrow T$ is sufficiently fast.
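To illustrate how restrictive condition (9) is, consider the following worked example, which is our own illustration and not taken from the paper. Assume the filtration carries a Brownian motion $W$ (for instance the one driving $P$), take $\Xi ^2_T = 0$, so that $M^+ = M^- = M$ with $M_t \triangleq {\mathbb {E}}[\Xi ^1_T \vert {\mathscr {F}}_t]$, and compare two square-integrable choices of $\Xi ^1_T$:

$$\begin{aligned} \Xi ^1_T = W_T:&\quad M_t = W_t, \quad d\langle M \rangle _s = ds, \quad {\mathbb {E}}\left[ \int _0^T \frac{1}{T-s} d\langle M \rangle _s \right] = \int _0^T \frac{ds}{T-s} = \infty , \\ \Xi ^1_T = \int _0^T W_s ds:&\quad M_t = \int _0^t W_s ds + (T-t) W_t, \quad dM_t = (T-t) dW_t, \\&\quad {\mathbb {E}}\left[ \int _0^T \frac{1}{T-s} d\langle M \rangle _s \right] = \int _0^T (T-s) ds = \frac{(T)^2}{2} < \infty . \end{aligned}$$

In the first case new information about the target keeps arriving at full speed up to time $T$ and (9) fails. In the second case the remaining uncertainty about the target vanishes fast enough as $t \uparrow T$ and (9) holds.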

Our goal is to compute a Nash equilibrium in which both agents solve their minimization problems in (6) and (7) simultaneously, given the strategy of their competitor, in the following sense:

Definition 2.2

A pair of admissible strategies $ ({\hat{\alpha }}^1,{\hat{\alpha }}^2) \in {\mathscr {A}}^1 \times {\mathscr {A}}^2$ is called an open-loop Nash equilibrium if for all admissible strategies $\alpha ^1 \in {\mathscr {A}}^1$ and $\alpha ^2 \in {\mathscr {A}}^2$ it holds that

$$\begin{aligned} J^1 ({\hat{\alpha }}^1;{\hat{\alpha }}^2) \le J^1 (\alpha ^1;{\hat{\alpha }}^2) \quad \text {and} \quad J^2 ({\hat{\alpha }}^2;{\hat{\alpha }}^1) \le J^2 (\alpha ^2;{\hat{\alpha }}^1). \end{aligned}$$

In other words, in a Nash equilibrium neither player has an incentive to deviate from the chosen strategy.

Remark 2.3

In the special case of optimally liquidating the agents’ initial risky asset holdings $x^1, x^2 \in {\mathbb {R}}$ without tracking exogenously given target strategies, i.e., $\xi ^1 \equiv \xi ^2 \equiv 0$, and with non-random terminal target positions $\Xi ^1_T = \Xi ^2_T = 0$ ${\mathbb {P}}$-almost surely, the above formulated two-player (deterministic) differential game is solved in Carlin et al. [9] setting $\sigma = 0$ in the performance functionals in (4) and (5); and in Schied and Zhang [29] allowing for $\sigma > 0$ instead. In both studies, the authors obtain a unique open-loop Nash equilibrium in the sense of Definition 2.2 in closed form within the class of deterministic strategies.

3 Main result

Our main result is an explicit description of a unique open-loop Nash equilibrium in the sense of Definition 2.2 of the two-player stochastic differential game formulated in Sect. 2. Inspired by Bank et al. [5] we will use tools from convex analysis and simple calculus of variations arguments to derive the equilibrium strategies.

First, a strict convexity property of each player’s objective in (4) and (5) is established in the following

Lemma 3.1

For every $\alpha ^2 \in {\mathscr {A}}^2$ fixed, the functional $\alpha ^1 \mapsto J^1(\alpha ^1;\alpha ^2)$ in (4) is strictly convex in $\alpha ^1 \in {\mathscr {A}}^1$. Similarly, for every $\alpha ^1 \in {\mathscr {A}}^1$ fixed, the functional $\alpha ^2 \mapsto J^2(\alpha ^2;\alpha ^1)$ in (5) is strictly convex in $\alpha ^2 \in {\mathscr {A}}^2$.

Proof

We only show strict convexity of the first agent’s objective in (4). The reasoning for the second agent’s objective in (5) follows analogously. To this end, let $\alpha ^2 \in {\mathscr {A}}^2$ be fixed. Consider $\alpha ^1,{\tilde{\alpha }}^1 \in {\mathscr {A}}^1$ such that $\alpha ^1 \ne {\tilde{\alpha }}^1$ $d{\mathbb {P}}\otimes dt\text {-a.e. on } \Omega \times [0,T]$ and denote by $X^1, {\tilde{X}}^1$ the corresponding share holdings. For every $\varepsilon \in (0,1)$ it holds that $\varepsilon \alpha ^1 + (1-\varepsilon ) {\tilde{\alpha }}^1 \in {\mathscr {A}}^1$ with share holdings $X^{\varepsilon \alpha ^1 + (1-\varepsilon ) {\tilde{\alpha }}^1} = \varepsilon X^{1} + (1-\varepsilon ) {\tilde{X}}^1$. We have to show that

$$\begin{aligned} \varepsilon J^1(\alpha ^1;\alpha ^2) + (1-\varepsilon ) J^1({\tilde{\alpha }}^1;\alpha ^2) - J^1(\varepsilon \alpha ^1 + (1-\varepsilon ) {\tilde{\alpha }}^1; \alpha ^2) > 0. \end{aligned}$$

In fact, a straightforward computation reveals that

$$\begin{aligned} \begin{aligned}&\varepsilon J^1(\alpha ^1;\alpha ^2) + (1-\varepsilon ) J^1({\tilde{\alpha }}^1;\alpha ^2) - J^1(\varepsilon \alpha ^1 + (1-\varepsilon ) {\tilde{\alpha }}^1; \alpha ^2) \\&\quad = \frac{1}{2} \varepsilon (1-\varepsilon ) {\mathbb {E}} \left[ \int _0^T \left( \sigma (X^1_t- {\tilde{X}}^1_t)^2+\lambda (\alpha ^1_t - {\tilde{\alpha }}^1_t)^2 \right) dt \right] >0 \end{aligned} \end{aligned}$$

because $\alpha ^1 \ne {\tilde{\alpha }}^1$ $d{\mathbb {P}}\otimes dt\text {-a.e. on } \Omega \times [0,T]$. $\square $
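The computation behind the convexity gap in the proof of Lemma 3.1 can be sketched as follows. It rests on the elementary identity, valid for all $a, b \in {\mathbb {R}}$ and $\varepsilon \in (0,1)$,

$$\begin{aligned} \varepsilon (a)^2 + (1-\varepsilon ) (b)^2 - (\varepsilon a + (1-\varepsilon ) b)^2 = \varepsilon (1-\varepsilon ) (a-b)^2 . \end{aligned}$$

Applying it pointwise with $a = X^1_t - \xi ^1_t$, $b = {\tilde{X}}^1_t - \xi ^1_t$ produces the $\sigma $-term, and with $a = \alpha ^1_t$, $b = {\tilde{\alpha }}^1_t$ the $\lambda $-term. The remaining contributions to $J^1$, namely $\frac{1}{2} \lambda \alpha ^1_t \alpha ^2_t$ and $\frac{1}{2} \gamma \alpha ^1_t (X^2_t - x^2)$, are linear in $\alpha ^1$ for fixed $\alpha ^2$ and therefore cancel in the convex combination.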

As an important consequence we obtain

Lemma 3.2

There exists at most one Nash equilibrium in the sense of Definition 2.2.

Proof

We adapt the argument from Schied and Zhang [29, Lemma 4.1] (see also Schied et al. [31, Proposition 4.8]) to our stochastic differential game and prove the claim by contradiction. Specifically, assume that there exist two distinct Nash equilibria $({\hat{\alpha }}^1,{\hat{\alpha }}^2)$ and $({\tilde{\alpha }}^1,{\tilde{\alpha }}^2)$ in ${\mathscr {A}}^1 \times {\mathscr {A}}^2$, i.e.,

$$\begin{aligned} \begin{aligned} J^1 ({\hat{\alpha }}^1;{\hat{\alpha }}^2)&\le J^1 (\alpha ^1;{\hat{\alpha }}^2) \quad \text {and} \quad J^2 ({\hat{\alpha }}^2;{\hat{\alpha }}^1) \le J^2 (\alpha ^2;{\hat{\alpha }}^1), \\ J^1 ({\tilde{\alpha }}^1;{\tilde{\alpha }}^2)&\le J^1 (\alpha ^1;{\tilde{\alpha }}^2) \quad \text {and} \quad J^2 ({\tilde{\alpha }}^2;{\tilde{\alpha }}^1) \le J^2 (\alpha ^2;{\tilde{\alpha }}^1), \end{aligned} \end{aligned}$$

(10)

for all admissible strategies $\alpha ^1 \in {\mathscr {A}}^1$ and $\alpha ^2 \in {\mathscr {A}}^2$. Then we can define for all $\varepsilon \in [0,1]$ the function

$$\begin{aligned} \begin{aligned} f(\varepsilon )&\triangleq \, J^1(\varepsilon {\tilde{\alpha }}^1 + (1-\varepsilon ) {\hat{\alpha }}^1;{\hat{\alpha }}^2) + J^2(\varepsilon {\tilde{\alpha }}^2 + (1-\varepsilon ) {\hat{\alpha }}^2;{\hat{\alpha }}^1) \\&\quad \, + J^1((1-\varepsilon ) {\tilde{\alpha }}^1 + \varepsilon {\hat{\alpha }}^1 ;{\tilde{\alpha }}^2) + J^2((1-\varepsilon ) {\tilde{\alpha }}^2 + \varepsilon {\hat{\alpha }}^2 ;{\tilde{\alpha }}^1) . \end{aligned} \end{aligned}$$

(11)

Note that due to Lemma 3.1 and the assumption that the two Nash equilibria $({\hat{\alpha }}^1,{\hat{\alpha }}^2)$ and $({\tilde{\alpha }}^1,{\tilde{\alpha }}^2)$ are distinct, the function $f(\varepsilon )$ is strictly convex in $\varepsilon $ on [0, 1]. Moreover, in light of (10) it attains its unique minimum at $\varepsilon = 0$. It follows that

$$\begin{aligned} \lim _{\varepsilon \downarrow 0} \frac{f(\varepsilon )-f(0)}{\varepsilon } = \frac{d}{d\varepsilon } f(\varepsilon ) \Big \vert _{\varepsilon = 0+} \ge 0. \end{aligned}$$

(12)

Next, denoting the corresponding share holdings of ${\hat{\alpha }}^1$ and ${\tilde{\alpha }}^1$ with ${\hat{X}}^1$ and ${\tilde{X}}^1$, respectively, and noting that $X^{\varepsilon {\tilde{\alpha }}^1 + (1-\varepsilon ) {\hat{\alpha }}^1} = \varepsilon {\tilde{X}}^1 + (1-\varepsilon ) {\hat{X}}^1$, we can compute

$$\begin{aligned} \begin{aligned}&\frac{d}{d\varepsilon } J^1(\varepsilon {\tilde{\alpha }}^1 + (1-\varepsilon ) {\hat{\alpha }}^1;{\hat{\alpha }}^2) \Big \vert _{\varepsilon = 0+} \\&\quad = {\mathbb {E}} \left[ \sigma \! \int _0^T ({\hat{X}}^1_t - \xi ^1_t) ({\tilde{X}}^1_t - {\hat{X}}^1_t) dt + \int _0^T ({\tilde{\alpha }}^1_t-{\hat{\alpha }}^1_t) \left( \frac{1}{2} \lambda (2{\hat{\alpha }}^1_t + {\hat{\alpha }}^2_t) + \frac{1}{2} \gamma ({\hat{X}}^2_t - x^2) \right) dt \right] , \end{aligned} \end{aligned}$$

as well as the derivatives of the remaining three terms in (11) in a very similar manner in order to ultimately obtain

$$\begin{aligned} \begin{aligned}&\frac{d}{d\varepsilon } f(\varepsilon ) \Big \vert _{\varepsilon = 0+} \\&\quad = - \sigma {\mathbb {E}} \left[ \int _0^T \left( ({\tilde{X}}^1_t - {\hat{X}}^1_t)^2 + ({\tilde{X}}^2_t - {\hat{X}}^2_t)^2 \right) dt \right] \\&\qquad + \frac{1}{2} \gamma {\mathbb {E}}\left[ \int _0^T ({\tilde{\alpha }}^1_t - {\hat{\alpha }}^1_t) ({\hat{X}}^2_t - {\tilde{X}}^2_t) dt \right] + \frac{1}{2} \gamma {\mathbb {E}}\left[ \int _0^T ({\tilde{\alpha }}^2_t - {\hat{\alpha }}^2_t) ({\hat{X}}^1_t - {\tilde{X}}^1_t) dt \right] \\&\qquad - \lambda {\mathbb {E}} \left[ \int _0^T \left( ({\tilde{\alpha }}^1_t - {\hat{\alpha }}^1_t) + ({\tilde{\alpha }}^2_t - {\hat{\alpha }}^2_t) \right) ^2 dt \right] , \end{aligned} \end{aligned}$$

where ${\hat{X}}^2$ and ${\tilde{X}}^2$ denote the share holdings of ${\hat{\alpha }}^2$ and ${\tilde{\alpha }}^2$, respectively. Observing that integration by parts yields

$$\begin{aligned} \int _0^T ({\tilde{\alpha }}^1_t - {\hat{\alpha }}^1_t) ({\hat{X}}^2_t - {\tilde{X}}^2_t) dt = - \int _0^T ({\tilde{\alpha }}^2_t - {\hat{\alpha }}^2_t) ({\hat{X}}^1_t - {\tilde{X}}^1_t) dt \end{aligned}$$

because ${\tilde{X}}^i_0 = {\hat{X}}^i_0 = x^i$ and ${\hat{X}}^i_T = {\tilde{X}}^i_T = \Xi ^i_T$ for both $i \in \{1,2\}$, we obtain

$$\begin{aligned} \begin{aligned} \frac{d}{d\varepsilon } f(\varepsilon ) \Big \vert _{\varepsilon = 0+}&= \, - \sigma {\mathbb {E}} \left[ \int _0^T \left( ({\tilde{X}}^1_t - {\hat{X}}^1_t)^2 + ({\tilde{X}}^2_t - {\hat{X}}^2_t)^2 \right) dt \right] \\&\quad \, - \lambda {\mathbb {E}} \left[ \int _0^T \left( ({\tilde{\alpha }}^1_t - {\hat{\alpha }}^1_t) + ({\tilde{\alpha }}^2_t - {\hat{\alpha }}^2_t) \right) ^2 dt \right] \end{aligned} \end{aligned}$$

which is strictly negative because the two Nash equilibria $({\hat{\alpha }}^1,{\hat{\alpha }}^2)$ and $({\tilde{\alpha }}^1,{\tilde{\alpha }}^2)$ are distinct. But this contradicts (12). $\square $

Next, for any arbitrary but fixed controls ${\tilde{\alpha }}^2 \in {\mathscr {A}}^2$ and ${\tilde{\alpha }}^1 \in {\mathscr {A}}^1$, we can introduce the Gâteaux derivatives of the mappings $\alpha ^1 \mapsto J^1(\alpha ^1;{\tilde{\alpha }}^2)$ at $\alpha ^1 \in {\mathscr {A}}^1$ and $\alpha ^2 \mapsto J^2(\alpha ^2;{\tilde{\alpha }}^1)$ at $\alpha ^2 \in {\mathscr {A}}^2$, respectively, in any directions $\beta ^1, \beta ^2 \in {\mathscr {A}}^0 \triangleq \{ \beta : \beta \in {\mathscr {A}}\text { satisfying } \int _0^T \beta _t dt = 0 \; {\mathbb {P}}\text {-a.s.}\}$, namely,

$$\begin{aligned} \langle \nabla J^1(\alpha ^1; {\tilde{\alpha }}^2), \beta ^1 \rangle&\triangleq \; \lim _{\varepsilon \rightarrow 0} \frac{J^1(\alpha ^1 +\varepsilon \beta ^1;{\tilde{\alpha }}^2)-J^1(\alpha ^1;{\tilde{\alpha }}^2)}{\varepsilon },\\ \langle \nabla J^2(\alpha ^2;{\tilde{\alpha }}^1), \beta ^2 \rangle&\triangleq \; \lim _{\varepsilon \rightarrow 0} \frac{J^2(\alpha ^2 +\varepsilon \beta ^2; {\tilde{\alpha }}^1 )-J^2(\alpha ^2;{\tilde{\alpha }}^1)}{\varepsilon }. \end{aligned}$$

They allow for the following explicit expressions presented in

Lemma 3.3

Let ${\tilde{\alpha }}^2 \in {\mathscr {A}}^2$ be fixed with corresponding share holdings ${\tilde{X}}^2$. Then for all $\alpha ^1 \in {\mathscr {A}}^1$ we have

$$\begin{aligned}&\langle \nabla J^1(\alpha ^1; {\tilde{\alpha }}^2), \beta ^1 \rangle \nonumber \\&\quad = {\mathbb {E}}\left[ \int _0^T \beta ^1_s \left( \lambda \alpha ^1_s + \frac{\lambda }{2} {\tilde{\alpha }}^2_s + \frac{\gamma }{2} ({\tilde{X}}^2_s - x^2) + \int _s^T (X^1_t - \xi ^1_t) \sigma dt \right) ds \right] \end{aligned}$$

(13)

for any $\beta ^1 \in {\mathscr {A}}^0$. Similarly, let ${\tilde{\alpha }}^1 \in {\mathscr {A}}^1$ be fixed with corresponding share holdings ${\tilde{X}}^1$. Then for all $\alpha ^2 \in {\mathscr {A}}^2$ we have

$$\begin{aligned}&\langle \nabla J^2(\alpha ^2; {\tilde{\alpha }}^1), \beta ^2 \rangle \nonumber \\&\quad = {\mathbb {E}}\left[ \int _0^T \beta ^2_s \left( \lambda \alpha ^2_s + \frac{\lambda }{2} {\tilde{\alpha }}^1_s + \frac{\gamma }{2} ({\tilde{X}}^1_s - x^1) + \int _s^T (X^2_t - \xi ^2_t) \sigma dt \right) ds \right] \end{aligned}$$

(14)

for any $\beta ^2 \in {\mathscr {A}}^0$.

Proof

We only compute the Gâteaux derivative in (13). The same computations apply for (14). Fix ${\tilde{\alpha }}^2 \in {\mathscr {A}}^2$ with share holdings ${\tilde{X}}^2$ and let $\alpha ^1 \in {\mathscr {A}}^1, \beta ^1 \in {\mathscr {A}}^0$ as well as $\varepsilon > 0$. Note that $\alpha ^1 +\varepsilon \beta ^1 \in {\mathscr {A}}^1$ with share holdings $X^{\alpha ^1 +\varepsilon \beta ^1} = X^1 + \varepsilon \int _0^\cdot \beta ^1_s ds$. Moreover, since

$$\begin{aligned}&J^1(\alpha ^1 +\varepsilon \beta ^1; {\tilde{\alpha }}^2)-J^1(\alpha ^1;{\tilde{\alpha }}^2) \\&\quad = \, \varepsilon {\mathbb {E}}\left[ \int _0^T \left( \frac{\lambda }{2} \beta ^1_t (2 \alpha ^1_t + {\tilde{\alpha }}^2_t) + \left( \int _0^t \beta ^1_s ds \right) (X^1_t - \xi ^1_t) \sigma +\frac{\gamma }{2} \beta ^1_t ({\tilde{X}}^2_t -x^2) \right) dt \right] \\&\qquad + \frac{1}{2} \varepsilon ^2 {\mathbb {E}}\left[ \int _0^T \left( \lambda (\beta ^1_t)^2 + \left( \int _0^t \beta ^1_sds \right) ^2 \sigma \right) dt \right] , \end{aligned}$$

we obtain the desired result in (13) after applying Fubini’s theorem. $\square $
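The Fubini step used above exchanges the order of integration in $\int _0^T ( \int _0^t \beta ^1_s ds ) (X^1_t - \xi ^1_t) \sigma dt = \int _0^T \beta ^1_s \int _s^T (X^1_t - \xi ^1_t) \sigma dt \, ds$. The following is a minimal discrete sketch of this identity on a single path, not part of the original proof. The array `g` stands in for $(X^1_t - \xi ^1_t)\sigma $, and the function name, grid, and random inputs are illustrative assumptions.

```python
import random


def lhs_rhs(beta, g, dt):
    """Return both sides of the discretized Fubini identity.

    lhs ~ int_0^T (int_0^t beta_s ds) g_t dt
    rhs ~ int_0^T beta_s (int_s^T g_t dt) ds
    """
    n = len(beta)
    # Left side: running integral of beta, weighted by g.
    lhs, running = 0.0, 0.0
    for t in range(n):
        running += beta[t] * dt
        lhs += running * g[t] * dt
    # Right side: tail integral of g, weighted by beta.
    rhs, tail = 0.0, 0.0
    for s in reversed(range(n)):
        tail += g[s] * dt
        rhs += beta[s] * tail * dt
    return lhs, rhs


random.seed(0)
n, dt = 200, 1.0 / 200
beta = [random.gauss(0.0, 1.0) for _ in range(n)]  # stand-in for a direction beta^1
g = [random.gauss(0.0, 1.0) for _ in range(n)]     # stand-in for (X^1 - xi^1) * sigma
lhs, rhs = lhs_rhs(beta, g, dt)
print(abs(lhs - rhs))  # agreement up to floating-point rounding
```

Both sums run over the same index pairs $s \le t$, which is why they coincide up to rounding.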

Having at hand the explicit expressions in (13) and (14), we can now derive a necessary and sufficient first-order condition for the Nash equilibrium in terms of a system of coupled forward-backward stochastic differential equations (FBSDE).

Lemma 3.4

A pair of controls $({\hat{\alpha }}^1,{\hat{\alpha }}^2) \in {\mathscr {A}}^1 \times {\mathscr {A}}^2$ is a Nash equilibrium in the sense of Definition 2.2 if and only if $({\hat{X}}^1, {\hat{X}}^2, {\hat{\alpha }}^1,{\hat{\alpha }}^2)$ solves the following coupled forward-backward SDE system

$$\begin{aligned} \left\{ \begin{aligned} dX^1_t =&\; \alpha ^1_t dt, \qquad X^1_0 = x^1, \\ dX^2_t =&\; \alpha ^2_t dt, \qquad X^2_0 = x^2, \\ d\alpha ^1_t =&\; \frac{\sigma }{\lambda } (X^1_t - \xi ^1_t) dt - \frac{\gamma }{2\lambda } \alpha ^2_t dt - \frac{1}{2} d\alpha ^2_t + dM^1_t, \qquad X^1_T = \Xi ^1_T,\\ d\alpha ^2_t =&\; \frac{\sigma }{\lambda } (X^2_t - \xi ^2_t) dt - \frac{\gamma }{2\lambda } \alpha ^1_t dt - \frac{1}{2} d\alpha ^1_t + dM^2_t, \qquad X^2_T = \Xi ^2_T, \end{aligned} \right. \end{aligned}$$

(15)

for two suitable square integrable martingales $(M^1_t)_{0 \le t < T}$ and $(M^2_t)_{0 \le t < T}$.

Proof

Sufficiency: Assume first that $({\hat{X}}^1, {\hat{X}}^2,{\hat{\alpha }}^1,{\hat{\alpha }}^2, M^1, M^2)$ with $({\hat{\alpha }}^1,{\hat{\alpha }}^2) \in {\mathscr {A}}^1 \times {\mathscr {A}}^2$ solves the FBSDE system in (15). We have to show that ${\hat{\alpha }}^1$ minimizes $\alpha ^1 \mapsto J^1(\alpha ^1;{\hat{\alpha }}^2)$ over ${\mathscr {A}}^1$, and, vice versa, that ${\hat{\alpha }}^2$ minimizes $\alpha ^2 \mapsto J^2(\alpha ^2;{\hat{\alpha }}^1)$ over ${\mathscr {A}}^2$. Since we are minimizing strictly convex functionals due to Lemma 3.1, a sufficient condition for the optimality of ${\hat{\alpha }}^1$ and ${\hat{\alpha }}^2$, respectively, is given by

$$\begin{aligned} \langle \nabla J^1({\hat{\alpha }}^1; {\hat{\alpha }}^2), \beta ^1 \rangle = 0 \text { for all } \beta ^1 \in {\mathscr {A}}^0 \end{aligned}$$

(16)

and

$$\begin{aligned} \langle \nabla J^2({\hat{\alpha }}^2; {\hat{\alpha }}^1), \beta ^2 \rangle = 0 \text { for all } \beta ^2 \in {\mathscr {A}}^0; \end{aligned}$$

(17)

cf., e.g., Ekeland and Témam [16]. We start with the proof of (16). By assumption we have the representation

$$\begin{aligned} {\hat{\alpha }}^1_t&= \; {\hat{\alpha }}^1_0 + \frac{\sigma }{\lambda } \int _0^t ({\hat{X}}^1_s - \xi ^1_s) ds- \frac{\gamma }{2\lambda } \int _0^t {\hat{\alpha }}^2_s ds \\&\quad -\frac{1}{2} ({\hat{\alpha }}^2_t - {\hat{\alpha }}^2_0) + M^1_t-M^1_0 \quad d{\mathbb {P}}\otimes dt \text {-a.e. on } \Omega \times [0,T) \end{aligned}$$

for some square integrable martingale $(M^1_t)_{0 \le t < T}$. Moreover, since ${\hat{\alpha }}^1, {\hat{\alpha }}^2, \xi ^1 \in L^2({\mathbb {P}}\otimes dt)$ it follows that ${\mathbb {E}}[\int _0^T (M_s^1)^2 ds] < \infty $. Next, introducing the square integrable martingale

$$\begin{aligned} N_s \triangleq {\mathbb {E}}\left[ \int _0^T ({\hat{X}}^1_t - \xi ^1_t) \sigma dt \, \bigg \vert \, {\mathscr {F}}_s \right] \quad (0 \le s \le T) \end{aligned}$$

and plugging the above representation of ${\hat{\alpha }}^1$ in the Gâteaux derivative in (13) we obtain

$$\begin{aligned}&\langle \nabla J^1({\hat{\alpha }}^1; {\hat{\alpha }}^2), \beta ^1 \rangle \\&\quad = {\mathbb {E}}\left[ \int _0^T \beta ^1_s \left( \lambda {\hat{\alpha }}^1_s + \frac{\lambda }{2} {\hat{\alpha }}^2_s + \frac{\gamma }{2} ({\hat{X}}^2_s - x^2) + \int _s^T ({\hat{X}}^1_t - \xi ^1_t) \sigma dt \right) ds \right] \\&\quad = {\mathbb {E}}\left[ \int _0^T \beta ^1_s \left( \lambda {\hat{\alpha }}^1_0 + \frac{\lambda }{2} {\hat{\alpha }}^2_0 + N_T + \lambda M^1_s - \lambda M^1_0 \right) ds \right] \\&\quad = {\mathbb {E}}\left[ \left( \lambda {\hat{\alpha }}^1_0 + \frac{\lambda }{2} {\hat{\alpha }}^2_0 + N_T - \lambda M^1_0 \right) \int _0^T \beta ^1_s ds \right] + \lambda {\mathbb {E}}\left[ \int _0^T \beta ^1_s M^1_s ds \right] \\&\quad = 0 \text { for all } \beta ^1 \in {\mathscr {A}}^0, \end{aligned}$$

where we used the result from Bank et al. [5, Lemma 5.3] in the last line. Hence, as desired, the first order optimality condition in (16) is satisfied by ${\hat{\alpha }}^1 \in {\mathscr {A}}^1$. The same computations show that ${\hat{\alpha }}^2 \in {\mathscr {A}}^2$ also satisfies the first order optimality condition in (17). Therefore, we can conclude that $({\hat{\alpha }}^1,{\hat{\alpha }}^2) \in {\mathscr {A}}^1 \times {\mathscr {A}}^2$ is a Nash equilibrium in the sense of Definition 2.2.

Necessity: Finally, as shown in the Proof of Theorem 3.5 below (which does not use the necessity assertion of the present lemma) the pair of controls $({\hat{\alpha }}^1, {\hat{\alpha }}^2) \in {\mathscr {A}}^1 \times {\mathscr {A}}^2$ presented in (21) below satisfies the coupled forward backward SDE system in (15). Therefore, by uniqueness of the Nash equilibrium via Lemma 3.2 the assertion is indeed also necessary. $\square $

We are now ready to state our main result. To do so, it is convenient to introduce the following nonnegative constants

$$\begin{aligned} \delta ^+ \triangleq \frac{\gamma ^2}{4} + 6 \lambda \sigma , \qquad \delta ^- \triangleq \frac{\gamma ^2}{4} + 2 \lambda \sigma , \end{aligned}$$

(18)

the nonnegative functions

$$\begin{aligned} \begin{aligned} c^+_t&\triangleq \; \frac{1}{3} \sqrt{\delta ^+} \coth (\sqrt{\delta ^+}(T-t)/(3\lambda )) + \frac{1}{6} \gamma , \\ c^-_t&\triangleq \; \sqrt{\delta ^-} \coth (\sqrt{\delta ^-}(T-t)/\lambda ) -\frac{1}{2}\gamma \end{aligned} \qquad (0 \le t \le T) \end{aligned}$$

(19)

such that $\lim _{t \uparrow T} c_t^{\pm } = +\infty $, as well as the weight functions

$$\begin{aligned} \begin{aligned} w^1_t&\triangleq \frac{\sqrt{\delta ^+} \, e^{\frac{\gamma }{6\lambda }(T-t)}}{3 (c^+_t + c^-_t) \sinh (\sqrt{\delta ^+}(T-t) /(3\lambda ))}, \\ w^2_t&\triangleq \; \frac{\sqrt{\delta ^-} \, e^{-\frac{\gamma }{2\lambda }(T-t)}}{(c^+_t + c^-_t) \sinh (\sqrt{\delta ^-}(T-t)/\lambda )}, \\ w^3_t&\triangleq \; \frac{c^+_t}{c^+_t + c^-_t} - w^1_t, \qquad w^4_t \triangleq \frac{c^-_t}{c^+_t + c^-_t} - w^2_t, \qquad w^5_t \triangleq \frac{c^+_t - c^-_t}{c^+_t + c^-_t} \end{aligned} \end{aligned}$$

(20)

for all $t \in [0,T]$. An explicit description of the unique Nash equilibrium is provided in the following

Theorem 3.5

There exists a unique open-loop Nash equilibrium $({\hat{\alpha }}^1, {\hat{\alpha }}^2)$ in ${\mathscr {A}}^1 \times {\mathscr {A}}^2$ in the sense of Definition 2.2. The corresponding equilibrium share holdings ${\hat{X}}^1_\cdot = x^1 + \int _0^\cdot {\hat{\alpha }}^1_tdt$ of agent 1 and ${\hat{X}}^2_\cdot = x^2 + \int _0^\cdot {\hat{\alpha }}^2_tdt$ of agent 2 satisfy the random linear coupled ODE

$$\begin{aligned} \begin{aligned} {\hat{X}}^1_0 =&\; x^1,&\quad d{\hat{X}}^1_t =&\; \frac{c^+_t+c^-_t}{2\lambda } \left( {\hat{\xi }}^1_t - w^5_t {\hat{X}}^2_t - {\hat{X}}^1_t \right) dt, \\ {\hat{X}}^2_0 =&\; x^2,&\quad d{\hat{X}}^2_t =&\; \frac{c^+_t+c^-_t}{2\lambda } \left( {\hat{\xi }}^2_t - w^5_t {\hat{X}}^1_t - {\hat{X}}^2_t \right) dt \end{aligned} \quad (0 \le t < T), \end{aligned}$$

(21)

where, for $0 \le t \le T$, we let

$$\begin{aligned} \begin{aligned} {\hat{\xi }}^{1}_t&\triangleq \; w^1_t \cdot {\mathbb {E}}[\Xi ^1_T + \Xi ^2_T \,\vert \, {\mathscr {F}}_t] + w^2_t \cdot {\mathbb {E}}[\Xi ^1_T - \Xi ^2_T \,\vert \, {\mathscr {F}}_t] \\&\quad + w^3_t \cdot {\mathbb {E}}\left[ \int _t^T (\xi ^1_u + \xi ^2_u) \cdot K^1(t,u) \,du \,\Big \vert \,{\mathscr {F}}_t \right] \\&\quad + w^4_t \cdot {\mathbb {E}}\left[ \int _t^T (\xi ^1_u - \xi ^2_u) \cdot K^2(t,u) \, du \, \Big \vert \, {\mathscr {F}}_t \right] \end{aligned} \end{aligned}$$

(22)

and

$$\begin{aligned} \begin{aligned} {\hat{\xi }}^2_t&\triangleq \; w^1_t \cdot {\mathbb {E}}[\Xi ^2_T + \Xi ^1_T \,\vert \, {\mathscr {F}}_t] + w^2_t \cdot {\mathbb {E}}[\Xi ^2_T - \Xi ^1_T \,\vert \, {\mathscr {F}}_t] \\&\quad + w^3_t \cdot {\mathbb {E}}\left[ \int _t^T (\xi ^2_u + \xi ^1_u) \cdot K^1(t,u) \,du \,\Big \vert \,{\mathscr {F}}_t \right] \\&\quad + w^4_t \cdot {\mathbb {E}}\left[ \int _t^T (\xi ^2_u - \xi ^1_u) \cdot K^2(t,u) \, du \, \Big \vert \, {\mathscr {F}}_t \right] \end{aligned} \end{aligned}$$

(23)

with nonnegative kernels

$$\begin{aligned} \begin{aligned} K^1(t,u)&\triangleq \; \frac{w^1_t}{w^3_t} \frac{2\sigma e^{-\frac{\gamma }{6\lambda }(T-u)} \sinh (\sqrt{\delta ^+}(T-u)/(3\lambda ))}{\sqrt{\delta ^+}}, \\ K^2(t,u)&\triangleq \; \frac{w^2_t}{w^4_t} \frac{2\sigma e^{\frac{\gamma }{2\lambda }(T-u)} \sinh (\sqrt{\delta ^-}(T-u)/\lambda )}{\sqrt{\delta ^-}} \end{aligned} \;\; (0 \le t \le u < T) \end{aligned}$$

(24)

which, for each $t \in [0,T)$, integrate to one over [t, T]. The solution $({\hat{X}}^1, {\hat{X}}^2)$ of (21) satisfies the terminal state constraints in the sense that

$$\begin{aligned} \lim _{t \uparrow T} {\hat{X}}^1_t = \Xi ^1_T \quad \text {and} \quad \lim _{t \uparrow T} {\hat{X}}^2_t = \Xi ^2_T \quad {\mathbb {P}}\text {-a.s.} \end{aligned}$$

(25)
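The claim that the kernels $K^1(t,\cdot )$ and $K^2(t,\cdot )$ in (24) integrate to one over $[t,T]$ can be sanity-checked numerically. The sketch below is only an illustration under assumed parameter values ($\lambda = \gamma = \sigma = T = 1$). The function names and the midpoint quadrature are our own choices and are not taken from the paper.

```python
import math

# Illustrative parameter values (assumptions, not taken from the paper).
LAM, GAM, SIG, T = 1.0, 1.0, 1.0, 1.0
RP = math.sqrt(GAM**2 / 4 + 6 * LAM * SIG)  # sqrt(delta^+), see (18)
RM = math.sqrt(GAM**2 / 4 + 2 * LAM * SIG)  # sqrt(delta^-), see (18)


def weights(t):
    """Return (w1, w2, w3, w4) from (20) at a time t strictly before T."""
    r = T - t
    cp = RP / 3 / math.tanh(RP * r / (3 * LAM)) + GAM / 6  # c^+_t in (19)
    cm = RM / math.tanh(RM * r / LAM) - GAM / 2            # c^-_t in (19)
    tot = cp + cm
    w1 = RP * math.exp(GAM * r / (6 * LAM)) / (3 * tot * math.sinh(RP * r / (3 * LAM)))
    w2 = RM * math.exp(-GAM * r / (2 * LAM)) / (tot * math.sinh(RM * r / LAM))
    return w1, w2, cp / tot - w1, cm / tot - w2


def kernel_masses(t, n=4000):
    """Midpoint-rule approximation of int_t^T K^i(t,u) du for i = 1, 2."""
    w1, w2, w3, w4 = weights(t)
    h = (T - t) / n
    m1 = m2 = 0.0
    for j in range(n):
        r = T - (t + (j + 0.5) * h)  # r = T - u at the midpoint u
        m1 += 2 * SIG * math.exp(-GAM * r / (6 * LAM)) * math.sinh(RP * r / (3 * LAM)) / RP * h
        m2 += 2 * SIG * math.exp(GAM * r / (2 * LAM)) * math.sinh(RM * r / LAM) / RM * h
    return w1 / w3 * m1, w2 / w4 * m2


for t in (0.0, 0.3, 0.7):
    print(t, kernel_masses(t))  # both entries should be close to 1
```

By construction the four weights $w^1_t, \ldots , w^4_t$ sum to one, so together with the unit-mass kernels the signals ${\hat{\xi }}^1, {\hat{\xi }}^2$ are weighted averages of the terminal and running target combinations.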

The proof of Theorem 3.5 consists of a verification that the pair $({\hat{\alpha }}^1,{\hat{\alpha }}^2)$ with dynamics in (21) is admissible (i.e., belongs to ${\mathscr {A}}^1\times {\mathscr {A}}^2$) and satisfies the FBSDE system in (15). An explanation of how the Nash equilibrium $({\hat{\alpha }}^1,{\hat{\alpha }}^2)$ can be constructed is provided in the appendix.

Proof of Theorem 3.5

In view of Lemma 3.4 we merely have to show that $({\hat{X}}^1, {\hat{X}}^2, {\hat{\alpha }}^1, {\hat{\alpha }}^2)$ with dynamics described in Theorem 3.5, Eq. (21), is a solution of the FBSDE system in (15) with some suitable square integrable martingales $(M^1_t)_{0 \le t < T}$ and $(M^2_t)_{0 \le t < T}$. Uniqueness of the Nash equilibrium then follows together with Lemma 3.2.

Step 1: We start with computing the dynamics of the controls ${\hat{\alpha }}^1$ and ${\hat{\alpha }}^2$ in (21) and verify that they satisfy the dynamics of the FBSDE system in (15). To this end, it is convenient to rewrite $w^1, w^2$ in (20), as well as ${\hat{\xi }}^1$ in (22) and ${\hat{\xi }}^2$ in (23) by introducing

$$\begin{aligned} {\tilde{w}}_t^1&\triangleq \; (c^+_t + c^-_t) w^1_t, \quad {\tilde{w}}_t^2 \triangleq (c^+_t + c^-_t) w^2_t \quad (0 \le t < T) \end{aligned}$$

(26)

and

$$\begin{aligned} {\tilde{\xi }}_t^1 \triangleq&\; (c^+_t + c^-_t) {\hat{\xi }}^1_t, \quad {\tilde{\xi }}_t^2 \triangleq (c^+_t + c^-_t) {\hat{\xi }}^2_t \quad (0 \le t < T). \end{aligned}$$

(27)

Moreover, setting

$$\begin{aligned} \begin{aligned} Y_t^+ \triangleq&\; \int _0^t (\xi ^1_s + \xi ^2_s) \frac{2\sigma }{\sqrt{\delta ^+}} e^{-\frac{\gamma }{6\lambda }(T-s)} \sinh (\sqrt{\delta ^+}(T-s)/(3\lambda )) ds, \\ M_t^+ \triangleq&\; {\mathbb {E}}\left[ \Xi ^1_T + \Xi ^2_T + Y_T^+ \,\vert \, {\mathscr {F}}_t\right] \end{aligned} \end{aligned}$$

(28)

and

$$\begin{aligned} \begin{aligned} Y_t^- \triangleq&\; \int _0^t (\xi ^1_s - \xi ^2_s) \frac{2\sigma }{\sqrt{\delta ^-}} e^{\frac{\gamma }{2\lambda }(T-s)} \sinh (\sqrt{\delta ^-}(T-s)/\lambda ) ds, \\ M_t^- \triangleq&\; {\mathbb {E}}\left[ \Xi ^1_T - \Xi ^2_T + Y_T^- \,\vert \, {\mathscr {F}}_t\right] \end{aligned} \end{aligned}$$

(29)

for all $0 \le t \le T$, we obtain the representations

$$\begin{aligned} \begin{aligned} {\tilde{\xi }}^1_t =&\; {\tilde{w}}^1_t ( M^+_t - Y^+_t ) + {\tilde{w}}^2_t ( M^-_t - Y^-_t ), \\ {\tilde{\xi }}^2_t =&\; {\tilde{w}}^1_t ( M^+_t - Y^+_t ) - {\tilde{w}}^2_t ( M^-_t - Y^-_t ) \end{aligned} \qquad (0 \le t < T). \end{aligned}$$

(30)

In particular,

$$\begin{aligned} {\tilde{\xi }}^1_t + {\tilde{\xi }}^2_t = 2 {\tilde{w}}^1_t ( M^+_t - Y^+_t ), \quad {\tilde{\xi }}^1_t - {\tilde{\xi }}^2_t = 2 {\tilde{w}}^2_t ( M^-_t - Y^-_t ) \end{aligned}$$

(31)

on [0, T). Note that $\Xi ^1_T, \Xi ^2_T, Y_T^+, Y_T^- \in L^2({\mathbb {P}})$ implies that $(M_t^+)_{0 \le t \le T}$ and $(M_t^-)_{0 \le t \le T}$ are square integrable martingales. Also, observe that the processes $Y^+, M^+, Y^-, M^- \in L^2({\mathbb {P}}\otimes dt)$. We can now rewrite (21) as

$$\begin{aligned} \begin{aligned} {\hat{\alpha }}^1_t =&\; \frac{1}{2\lambda } ( {\tilde{\xi }}^1_t - c^+_t {\hat{X}}^2_t + c^-_t {\hat{X}}^2_t - c^+_t {\hat{X}}^1_t - c^-_t {\hat{X}}^1_t), \\ {\hat{\alpha }}^2_t =&\; \frac{1}{2\lambda } ( {\tilde{\xi }}^2_t - c^+_t {\hat{X}}^1_t + c^-_t {\hat{X}}^1_t - c^+_t {\hat{X}}^2_t - c^-_t {\hat{X}}^2_t) \end{aligned} \qquad (0 \le t < T). \end{aligned}$$

(32)

Next, for ${\tilde{w}}^1$, ${\tilde{w}}^2$ in (26) one can easily check that

$$\begin{aligned} ({\tilde{w}}^1_t)' = {\tilde{w}}^1_t \left( \frac{1}{\lambda } c^+_t - \frac{\gamma }{3\lambda } \right) , \quad ({\tilde{w}}^2_t)' = {\tilde{w}}^2_t \left( \frac{1}{\lambda } c^-_t + \frac{\gamma }{\lambda } \right) \quad (0 \le t < T). \end{aligned}$$

(33)

Hence, by applying integration by parts in (30) we obtain the dynamics

$$\begin{aligned} \begin{aligned} d{\tilde{\xi }}^1_t =&\; {\tilde{w}}^1_t ( M^+_t - Y^+_t ) \left( \frac{1}{\lambda } c^+_t - \frac{\gamma }{3\lambda } \right) dt -\frac{2}{3} \sigma (\xi ^1_t + \xi ^2_t) dt \\&+ {\tilde{w}}^2_t ( M^-_t - Y^-_t ) \left( \frac{1}{\lambda } c^-_t + \frac{\gamma }{\lambda } \right) dt - 2\sigma (\xi ^1_t - \xi ^2_t) dt \\&+ {\tilde{w}}^1_t dM^+_t + {\tilde{w}}^2_t dM^-_t \qquad (0 \le t < T) \end{aligned} \end{aligned}$$

(34)

and

$$\begin{aligned} \begin{aligned} d{\tilde{\xi }}^2_t =&\; {\tilde{w}}^1_t ( M^+_t - Y^+_t ) \left( \frac{1}{\lambda } c^+_t - \frac{\gamma }{3\lambda } \right) dt -\frac{2}{3} \sigma (\xi ^1_t + \xi ^2_t) dt \\&- {\tilde{w}}^2_t ( M^-_t - Y^-_t ) \left( \frac{1}{\lambda } c^-_t + \frac{\gamma }{\lambda } \right) dt + 2 \sigma (\xi ^1_t - \xi ^2_t) dt \\&+ {\tilde{w}}^1_t dM^+_t - {\tilde{w}}^2_t dM^-_t \qquad (0 \le t < T). \end{aligned} \end{aligned}$$

(35)

Now, having at hand (34) and (35), as well as the fact that the functions $c^+, c^-$ in (19) satisfy the ordinary Riccati differential equations

$$\begin{aligned} (c^+_t)' = \frac{(c^+_t)^2}{\lambda } - \frac{\gamma }{3\lambda } c^+_t - \frac{2}{3} \sigma , \quad (c^-_t)' = \frac{(c^-_t)^2}{\lambda } + \frac{\gamma }{\lambda } c^-_t - 2\sigma \quad (0 \le t < T), \end{aligned}$$

(36)

an elementary but tedious computation reveals that the dynamics of ${\hat{\alpha }}^1$ and ${\hat{\alpha }}^2$ in (32) on [0, T) are given by

$$\begin{aligned} \begin{aligned} d{\hat{\alpha }}^1_t =&\; {\hat{X}}^1_t \left( \frac{4\sigma }{3\lambda } + \frac{\gamma }{6\lambda ^2} c^+_t - \frac{\gamma }{2\lambda ^2} c^-_t \right) dt - \frac{4\sigma }{3\lambda } \xi ^1_t dt + \frac{\gamma }{6\lambda ^2} {\tilde{\xi }}^1_t dt \\&+ {\hat{X}}^2_t \left( -\frac{2\sigma }{3\lambda } + \frac{\gamma }{6\lambda ^2} c^+_t + \frac{\gamma }{2\lambda ^2} c^-_t \right) dt + \frac{2\sigma }{3\lambda } \xi ^2_t dt - \frac{\gamma }{3\lambda ^2} {\tilde{\xi }}^2_t dt \\&+ \frac{{\tilde{w}}^1_t}{2\lambda } dM^+_t + \frac{{\tilde{w}}^2_t}{2\lambda } dM^-_t \end{aligned} \end{aligned}$$

(37)

and, similarly, by

$$\begin{aligned} \begin{aligned} d{\hat{\alpha }}^2_t =&\; {\hat{X}}^2_t \left( \frac{4\sigma }{3\lambda } + \frac{\gamma }{6\lambda ^2} c^+_t - \frac{\gamma }{2\lambda ^2} c^-_t \right) dt - \frac{4\sigma }{3\lambda } \xi ^2_t dt + \frac{\gamma }{6\lambda ^2} {\tilde{\xi }}^2_t dt \\&+ {\hat{X}}^1_t \left( -\frac{2\sigma }{3\lambda } + \frac{\gamma }{6\lambda ^2} c^+_t + \frac{\gamma }{2\lambda ^2} c^-_t \right) dt + \frac{2\sigma }{3\lambda } \xi ^1_t dt - \frac{\gamma }{3\lambda ^2} {\tilde{\xi }}^1_t dt \\&+ \frac{{\tilde{w}}^1_t}{2\lambda } dM^+_t - \frac{{\tilde{w}}^2_t}{2\lambda } dM^-_t, \end{aligned} \end{aligned}$$

(38)

where we also employed the identities in (31). As a consequence, using the representations in (32) we obtain

$$\begin{aligned}&d{\hat{\alpha }}^1_t + \frac{1}{2} d{\hat{\alpha }}^2_t \\&\quad = \frac{\sigma }{\lambda } ({\hat{X}}^1_t - \xi ^1_t) dt - \frac{\gamma }{4\lambda ^2} ({\tilde{\xi }}^2_t - c^+_t {\hat{X}}^1_t + c^-_t {\hat{X}}^1_t - c^+_t {\hat{X}}^2_t - c^-_t {\hat{X}}^2_t ) dt \\&\qquad + \frac{3}{4\lambda } {\tilde{w}}^1_t dM^+_t + \frac{1}{4\lambda } {\tilde{w}}^2_t dM^-_t \\&\quad = \frac{\sigma }{\lambda } ({\hat{X}}^1_t - \xi ^1_t) dt - \frac{\gamma }{2\lambda } {\hat{\alpha }}^2_t dt + \frac{3}{4\lambda } {\tilde{w}}^1_t dM^+_t + \frac{1}{4\lambda } {\tilde{w}}^2_t dM^-_t \qquad (0 \le t < T) \end{aligned}$$

and

$$\begin{aligned}&d{\hat{\alpha }}^2_t + \frac{1}{2} d{\hat{\alpha }}^1_t \\&\quad = \frac{\sigma }{\lambda } ({\hat{X}}^2_t - \xi ^2_t) dt - \frac{\gamma }{4\lambda ^2} ({\tilde{\xi }}^1_t - c^+_t {\hat{X}}^2_t + c^-_t {\hat{X}}^2_t - c^+_t {\hat{X}}^1_t - c^-_t {\hat{X}}^1_t ) dt \\&\qquad + \frac{3}{4\lambda } {\tilde{w}}^1_t dM^+_t - \frac{1}{4\lambda } {\tilde{w}}^2_t dM^-_t \\&\quad = \frac{\sigma }{\lambda } ({\hat{X}}^2_t - \xi ^2_t) dt - \frac{\gamma }{2\lambda } {\hat{\alpha }}^1_t dt + \frac{3}{4\lambda } {\tilde{w}}^1_t dM^+_t - \frac{1}{4\lambda } {\tilde{w}}^2_t dM^-_t \qquad (0 \le t < T). \end{aligned}$$

In other words, the pair $({\hat{\alpha }}^1, {\hat{\alpha }}^2)$ described in (21) satisfies the dynamics of the FBSDE system in (15), where $\int _0^\cdot {\tilde{w}}_t^{1} dM_t^{+}$, $\int _0^\cdot {\tilde{w}}_t^{2} dM_t^{-}$ are square integrable martingales on [0, T) providing the ingredients for $M^1$ and $M^2$.
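The deterministic identities used in Step 1, namely the Riccati equations (36) for $c^\pm $ and the derivative formulas (33) for ${\tilde{w}}^1, {\tilde{w}}^2$, can be sanity-checked with central finite differences. The sketch below is an illustration only. The parameter values, function names, and step size are assumptions of ours.

```python
import math

# Illustrative parameter values (assumptions, not taken from the paper).
LAM, GAM, SIG, T = 1.0, 1.0, 1.0, 1.0
RP = math.sqrt(GAM**2 / 4 + 6 * LAM * SIG)  # sqrt(delta^+), see (18)
RM = math.sqrt(GAM**2 / 4 + 2 * LAM * SIG)  # sqrt(delta^-), see (18)


def c_plus(t):
    return RP / 3 / math.tanh(RP * (T - t) / (3 * LAM)) + GAM / 6


def c_minus(t):
    return RM / math.tanh(RM * (T - t) / LAM) - GAM / 2


def wt1(t):
    """tilde w^1_t = (c^+_t + c^-_t) w^1_t, see (20) and (26)."""
    return RP * math.exp(GAM * (T - t) / (6 * LAM)) / (3 * math.sinh(RP * (T - t) / (3 * LAM)))


def wt2(t):
    """tilde w^2_t = (c^+_t + c^-_t) w^2_t, see (20) and (26)."""
    return RM * math.exp(-GAM * (T - t) / (2 * LAM)) / math.sinh(RM * (T - t) / LAM)


def ddt(f, t, h=1e-5):
    """Central finite-difference approximation of f'(t)."""
    return (f(t + h) - f(t - h)) / (2 * h)


def residuals(t):
    """Left-hand side minus right-hand side of (36) and (33) at time t."""
    cp, cm = c_plus(t), c_minus(t)
    return (
        ddt(c_plus, t) - (cp**2 / LAM - GAM / (3 * LAM) * cp - 2 * SIG / 3),
        ddt(c_minus, t) - (cm**2 / LAM + GAM / LAM * cm - 2 * SIG),
        ddt(wt1, t) - wt1(t) * (cp / LAM - GAM / (3 * LAM)),
        ddt(wt2, t) - wt2(t) * (cm / LAM + GAM / LAM),
    )


for t in (0.1, 0.5, 0.8):
    print(t, residuals(t))  # all four residuals should be close to zero
```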

Step 2: Next, we have to check the terminal conditions of the FBSDE system in (15), that is, that $\lim _{t \uparrow T} {\hat{X}}^1_t = \Xi ^1_T$ and $\lim _{t \uparrow T} {\hat{X}}^2_t = \Xi ^2_T$ hold ${\mathbb {P}}$-a.s. for the pair of solutions $({\hat{X}}^1, {\hat{X}}^2)$ of the coupled ODE in (21). We adapt to our current setting the argument from Bank et al. [5], which employs a simple comparison principle for ordinary differential equations. Specifically, note that it suffices to show that

$$\begin{aligned} \lim _{t \uparrow T} ({\hat{X}}^1_t + {\hat{X}}^2_t) =&\; \Xi ^1_T + \Xi ^2_T\quad {\mathbb {P}}\text {-a.s. and} \end{aligned}$$

(39)

$$\begin{aligned} \lim _{t \uparrow T} ({\hat{X}}^1_t - {\hat{X}}^2_t) =&\; \Xi ^1_T - \Xi ^2_T \quad {\mathbb {P}}\text {-a.s.}, \end{aligned}$$

(40)

where, using the dynamics in (21) and the definition of $w^5$ in (20), the processes ${\hat{X}}^1 + {\hat{X}}^2$ and ${\hat{X}}^1 - {\hat{X}}^2$ satisfy, respectively, the ODE

$$\begin{aligned} \begin{aligned} d({\hat{X}}^1_t + {\hat{X}}^2_t) =&\; \frac{c^+_t + c^-_t}{2\lambda } \left( {\hat{\xi }}^1_t + {\hat{\xi }}^2_t - w^5_t {\hat{X}}^1_t - w^5_t {\hat{X}}^2_t - {\hat{X}}^1_t - {\hat{X}}^2_t \right) dt \\ =&\; \frac{c^+_t}{\lambda } \left( \frac{{\hat{\xi }}^1_t + {\hat{\xi }}^2_t}{1+w^5_t} - ({\hat{X}}^1_t + {\hat{X}}^2_t) \right) dt \quad (0 \le t <T) \end{aligned} \end{aligned}$$

(41)

and

$$\begin{aligned} \begin{aligned} d({\hat{X}}^1_t - {\hat{X}}^2_t) =&\; \frac{c^+_t + c^-_t}{2\lambda } \left( {\hat{\xi }}^1_t - {\hat{\xi }}^2_t + w^5_t {\hat{X}}^1_t - w^5_t {\hat{X}}^2_t - {\hat{X}}^1_t + {\hat{X}}^2_t \right) dt \\ =&\; \frac{c^-_t}{\lambda } \left( \frac{{\hat{\xi }}^1_t - {\hat{\xi }}^2_t}{1-w^5_t} - ({\hat{X}}^1_t - {\hat{X}}^2_t) \right) dt \quad (0 \le t <T). \end{aligned} \end{aligned}$$

(42)

Note that $w^5_t \in (-1,1)$ for all $t \in [0,T]$ by virtue of Lemma 3.7 1.). First, analogously to (30) let us rewrite ${\hat{\xi }}^1$ and ${\hat{\xi }}^2$ in (22) and (23) as

$$\begin{aligned} \begin{aligned} {\hat{\xi }}^1_t =&\; w^1_t ( M^+_t - Y^+_t ) + w^2_t ( M^-_t - Y^-_t ), \\ {\hat{\xi }}^2_t =&\; w^1_t ( M^+_t - Y^+_t ) - w^2_t ( M^-_t - Y^-_t ) \end{aligned} \qquad (0 \le t \le T) \end{aligned}$$

(43)

with $Y^+, M^+,Y^-,M^-$ as defined in (28) and (29). Hence, we can consider a càdlàg version of the processes $({\hat{\xi }}^1_t)_{0 \le t \le T}$ and $({\hat{\xi }}^2_t)_{0 \le t \le T}$ and obtain, together with Lemma 3.7, 2.), the ${\mathbb {P}}$-a.s. limits

$$\begin{aligned} \begin{aligned} \lim _{t \uparrow T} {\hat{\xi }}^1_t =&\; \frac{1}{2} {\mathbb {E}}[\Xi ^1_T + \Xi ^2_T \, \vert \, {\mathscr {F}}_{T-}] + \frac{1}{2} {\mathbb {E}}[\Xi ^1_T - \Xi ^2_T \, \vert \, {\mathscr {F}}_{T-}] = \Xi ^1_T \quad \text {and} \\ \lim _{t \uparrow T} {\hat{\xi }}^2_t =&\; \frac{1}{2} {\mathbb {E}}[\Xi ^1_T + \Xi ^2_T \, \vert \, {\mathscr {F}}_{T-}] - \frac{1}{2} {\mathbb {E}}[\Xi ^1_T - \Xi ^2_T \, \vert \, {\mathscr {F}}_{T-}] = \Xi ^2_T \end{aligned} \end{aligned}$$

due to ${\mathscr {F}}_{T-}$-measurability of $\Xi ^1_T$ and $\Xi ^2_T$ by virtue of our assumption in (9). In particular, since $\lim _{t\uparrow T} w^5_t = 0$ because of Lemma 3.7, 2.), it also holds that

$$\begin{aligned} \lim _{t \uparrow T} \frac{{\hat{\xi }}^1_t + {\hat{\xi }}^2_t}{1+w^5_t} = \Xi ^1_T + \Xi ^2_T \quad \text {and} \quad \lim _{t \uparrow T} \frac{{\hat{\xi }}^1_t - {\hat{\xi }}^2_t}{1-w^5_t} = \Xi ^1_T - \Xi ^2_T \quad {\mathbb {P}}\text {-a.s.} \end{aligned}$$

(44)

Let us now start with proving the limit in (39). As a consequence of (44), for every $\varepsilon > 0$ there exists a (random) time $\tau _\varepsilon \in [0,T)$ such that ${\mathbb {P}}$-a.s.

$$\begin{aligned} \Xi ^1_T + \Xi ^2_T - \varepsilon \le \frac{{\hat{\xi }}^1_t + {\hat{\xi }}^2_t}{1+w^5_t} \le \Xi ^1_T + \Xi ^2_T + \varepsilon \quad \text {for all } t \in [\tau _\varepsilon ,T). \end{aligned}$$

(45)

Next, define $Y^{+,\varepsilon }_t \triangleq \Xi ^1_T + \Xi ^2_T + \varepsilon - ({\hat{X}}^1_t + {\hat{X}}^2_t)$ for all $t \in [0,T)$ so that

$$\begin{aligned} Y^{+,\varepsilon }_t \ge \frac{{\hat{\xi }}^1_t + {\hat{\xi }}^2_t}{1+w^5_t} - ({\hat{X}}^1_t + {\hat{X}}^2_t) \quad \text {for all } t \in [\tau _\varepsilon ,T). \end{aligned}$$

(46)

Together with the dynamics of ${\hat{X}}^1+{\hat{X}}^2$ in (41) this yields

$$\begin{aligned} \begin{aligned} d Y^{+,\varepsilon }_t =&\; -d({\hat{X}}^1_t + {\hat{X}}^2_t) = - \frac{c^+_t}{\lambda } \left( \frac{{\hat{\xi }}^1_t + {\hat{\xi }}^2_t}{1+w^5_t} - ({\hat{X}}^1_t + {\hat{X}}^2_t) \right) dt \\ \ge&\; -\frac{c_t^+}{\lambda } Y^{+,\varepsilon }_t dt \quad \text {on } [\tau _\varepsilon ,T). \end{aligned} \end{aligned}$$

(47)

Moreover, since for all $\omega \in \Omega $ the linear ODE on $[\tau _\varepsilon (\omega ),T)$ given by

$$\begin{aligned} Z^{+,\varepsilon }_{\tau _\varepsilon (\omega )} = Y^{+,\varepsilon }_{\tau _\varepsilon (\omega )}(\omega ), \quad dZ^{+,\varepsilon }_t = -\frac{c_t^+}{\lambda } Z^{+,\varepsilon }_t dt \end{aligned}$$

admits the solution

$$\begin{aligned} Z^{+,\varepsilon }_t =&\; Y^{+,\varepsilon }_{\tau _\varepsilon (\omega )}(\omega ) \cdot e^{-\int _{\tau _\varepsilon }^{t} \frac{c^+_s}{\lambda } ds} \\ =&\; Y^{+,\varepsilon }_{\tau _\varepsilon }(\omega ) \cdot e^{-\frac{\gamma }{6\lambda } (t-\tau _\varepsilon )} \cdot \frac{\sinh (\sqrt{\delta ^+}(T-t)/(3\lambda ))}{\sinh (\sqrt{\delta ^+}(T-\tau _{\varepsilon })/(3\lambda ))} \quad (\tau _\varepsilon \le t < T) \end{aligned}$$

with $\lim _{t \uparrow T} Z^{+,\varepsilon }_t = 0$, the comparison principle for ODEs in (47) implies that $Y^{+,\varepsilon }_t \ge Z^{+,\varepsilon }_t$ for all $t \in [\tau _\varepsilon , T)$ and thus

$$\begin{aligned} \liminf _{t \uparrow T} Y^{+,\varepsilon }_t \ge \lim _{t \uparrow T} Z^{+,\varepsilon }_t = 0 \quad {\mathbb {P}}\text {-a.s.}, \end{aligned}$$

or, equivalently,

$$\begin{aligned} \limsup _{t \uparrow T} ({\hat{X}}^1_t + {\hat{X}}^2_t) \le \Xi ^1_T + \Xi ^2_T + \varepsilon \quad {\mathbb {P}}\text {-a.s.} \end{aligned}$$

(48)

Next, in a similar way, set ${\tilde{Y}}^{+,\varepsilon }_t \triangleq \Xi ^1_T + \Xi ^2_T - \varepsilon - ({\hat{X}}^1_t + {\hat{X}}^2_t)$ for all $t \in [0,T)$ and observe as above from (45) that ${\mathbb {P}}$-a.s. on $[\tau _\varepsilon , T)$ it holds that $d{\tilde{Y}}^{+,\varepsilon }_t \le -\frac{c_t^+}{\lambda } {\tilde{Y}}^{+,\varepsilon }_t dt$ and hence

$$\begin{aligned} \limsup _{t \uparrow T} {\tilde{Y}}^{+,\varepsilon }_t \le \lim _{t \uparrow T} Z^{+,\varepsilon }_t \le 0 \quad {\mathbb {P}}\text {-a.s.} \end{aligned}$$

by the comparison principle. That is,

$$\begin{aligned} \liminf _{t \uparrow T} ({\hat{X}}^1_t + {\hat{X}}^2_t) \ge \Xi ^1_T + \Xi ^2_T - \varepsilon \quad {\mathbb {P}}\text {-a.s.}, \end{aligned}$$

which, together with (48) yields the limit in (39).
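The explicit solution of the comparison ODE for $Z^{+,\varepsilon }$ rests on evaluating $\exp (-\int _{\tau _\varepsilon }^t c^+_s/\lambda \, ds)$ in closed form. As an illustrative check only, the sketch below compares this closed form (with unit initial value) against a midpoint-rule evaluation of the integral. The parameter values and function names are assumptions of ours.

```python
import math

# Illustrative parameter values (assumptions, not taken from the paper).
LAM, GAM, SIG, T = 1.0, 1.0, 1.0, 1.0
RP = math.sqrt(GAM**2 / 4 + 6 * LAM * SIG)  # sqrt(delta^+), see (18)


def c_plus(t):
    return RP / 3 / math.tanh(RP * (T - t) / (3 * LAM)) + GAM / 6


def decay_closed_form(tau, t):
    """Explicit solution of dZ = -(c^+/lam) Z dt on [tau, t] started from Z_tau = 1."""
    k = RP / (3 * LAM)
    return math.exp(-GAM * (t - tau) / (6 * LAM)) * math.sinh(k * (T - t)) / math.sinh(k * (T - tau))


def decay_quadrature(tau, t, n=20000):
    """Same quantity via a midpoint-rule approximation of int_tau^t c^+_s / lam ds."""
    h = (t - tau) / n
    integral = sum(c_plus(tau + (j + 0.5) * h) / LAM for j in range(n)) * h
    return math.exp(-integral)


print(decay_closed_form(0.2, 0.9), decay_quadrature(0.2, 0.9))
print(decay_closed_form(0.2, 0.999))  # small: the solution vanishes as t approaches T
```

The second printed value illustrates why $\lim _{t \uparrow T} Z^{+,\varepsilon }_t = 0$: the factor $\sinh (\sqrt{\delta ^+}(T-t)/(3\lambda ))$ vanishes at $t = T$.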

In fact, it can now be argued along the same lines as above that also the limit in (40) holds true. Indeed, simply note that (44) implies similar to (45) that ${\mathbb {P}}$-a.s. for every $\varepsilon > 0$ there exists a (random) time $\tau '_\varepsilon \in [0,T)$ such that

$$\begin{aligned} \Xi ^1_T - \Xi ^2_T - \varepsilon \le \frac{{\hat{\xi }}^1_t - {\hat{\xi }}^2_t}{1-w^5_t} \le \Xi ^1_T - \Xi ^2_T + \varepsilon \quad \text {for all } t \in [\tau '_\varepsilon ,T). \end{aligned}$$

Then, introduce the processes $Y^{-,\varepsilon }_t \triangleq \Xi ^1_T - \Xi ^2_T + \varepsilon - ({\hat{X}}^1_t - {\hat{X}}^2_t)$ and ${\tilde{Y}}^{-,\varepsilon }_t \triangleq \Xi ^1_T - \Xi ^2_T - \varepsilon - ({\hat{X}}^1_t - {\hat{X}}^2_t)$ for all $t \in [0,T)$. By using the dynamics of ${\hat{X}}^1 - {\hat{X}}^2$ in (42) we can once more apply the comparison principle on the interval $[\tau '_\varepsilon ,T)$ for the ODEs of $Y^{-,\varepsilon }$ and ${\tilde{Y}}^{-,\varepsilon }$ together with the linear ODE

$$\begin{aligned} Z^{-,\varepsilon }_{\tau '_\varepsilon } = z \in {\mathbb {R}}, \quad dZ^{-,\varepsilon }_t = -\frac{c_t^-}{\lambda } Z^{-,\varepsilon }_t dt, \end{aligned}$$

which admits the solution

$$\begin{aligned} Z^{-,\varepsilon }_t = z e^{-\int _{\tau _\varepsilon '}^{t} \frac{c^-_s}{\lambda } ds} = z e^{\frac{\gamma }{2\lambda } (t-\tau _\varepsilon ')} \frac{\sinh (\sqrt{\delta ^-}(T-t)/\lambda )}{\sinh (\sqrt{\delta ^-}(T-\tau _{\varepsilon }')/\lambda )} \quad (\tau '_\varepsilon \le t < T) \end{aligned}$$

such that $\lim _{t \uparrow T} Z^{-,\varepsilon }_t = 0$ to finally conclude that

$$\begin{aligned} \Xi ^1_T - \Xi ^2_T - \varepsilon \le \liminf _{t \uparrow T} ({\hat{X}}^1_t - {\hat{X}}^2_t) \le \limsup _{t \uparrow T} ({\hat{X}}^1_t - {\hat{X}}^2_t) \le \Xi ^1_T - \Xi ^2_T + \varepsilon \end{aligned}$$

as desired.
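To illustrate the terminal behavior established in Step 2, one can integrate the coupled ODE (21) numerically in a toy setting. The sketch below assumes constant deterministic running targets $\xi ^1, \xi ^2$ and deterministic terminal targets $\Xi ^1_T, \Xi ^2_T$, so that all conditional expectations in (22) and (23) are trivial. The parameter values, the explicit Euler scheme, and the function names are our own illustrative assumptions.

```python
import math

# Illustrative parameter values (assumptions, not taken from the paper).
LAM, GAM, SIG, T = 1.0, 1.0, 1.0, 1.0
RP = math.sqrt(GAM**2 / 4 + 6 * LAM * SIG)  # sqrt(delta^+), see (18)
RM = math.sqrt(GAM**2 / 4 + 2 * LAM * SIG)  # sqrt(delta^-), see (18)


def coefficients(t):
    """Return c^+_t, c^-_t from (19) and w^1_t, w^2_t from (20) for t before T."""
    r = T - t
    cp = RP / 3 / math.tanh(RP * r / (3 * LAM)) + GAM / 6
    cm = RM / math.tanh(RM * r / LAM) - GAM / 2
    w1 = RP * math.exp(GAM * r / (6 * LAM)) / (3 * (cp + cm) * math.sinh(RP * r / (3 * LAM)))
    w2 = RM * math.exp(-GAM * r / (2 * LAM)) / ((cp + cm) * math.sinh(RM * r / LAM))
    return cp, cm, w1, w2


def simulate(x1, x2, xi1, xi2, Xi1, Xi2, n=2000):
    """Explicit Euler scheme for (21) with constant, deterministic targets."""
    h = T / n
    X1, X2 = x1, x2
    for j in range(n):
        cp, cm, w1, w2 = coefficients(j * h)
        tot = cp + cm
        w3, w4, w5 = cp / tot - w1, cm / tot - w2, (cp - cm) / tot
        # For constant targets the kernel integrals in (22)-(23) reduce to the
        # targets themselves, because K^1 and K^2 have unit mass on [t, T].
        s = w1 * (Xi1 + Xi2) + w3 * (xi1 + xi2)
        d = w2 * (Xi1 - Xi2) + w4 * (xi1 - xi2)
        hat1, hat2 = s + d, s - d
        dX1 = tot / (2 * LAM) * (hat1 - w5 * X2 - X1)
        dX2 = tot / (2 * LAM) * (hat2 - w5 * X1 - X2)
        X1, X2 = X1 + h * dX1, X2 + h * dX2
    return X1, X2


X1_T, X2_T = simulate(x1=1.0, x2=-0.5, xi1=0.3, xi2=0.8, Xi1=2.0, Xi2=-1.0)
print(X1_T, X2_T)  # both should be close to the terminal targets 2.0 and -1.0
```

The scheme never evaluates the coefficients at $t = T$, where $c^\pm $ explode. The mean-reversion rate $(c^+_t + c^-_t)/(2\lambda )$ grows like $1/(T-t)$, which is what forces the positions onto the terminal targets.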

Step 3: It is left to argue that the controls ${\hat{\alpha }}^1, {\hat{\alpha }}^2$ described in (21) belong to the set ${\mathscr {A}}$ in (2), i.e., ${\hat{\alpha }}^1, {\hat{\alpha }}^2 \in L^2({\mathbb {P}}\otimes dt)$. To achieve this we will follow a similar strategy as in Bank et al. [5]. For simplicity, we will assume without loss of generality that $x^1=x^2=0$. Because of the coupling of ${\hat{\alpha }}^1, {\hat{\alpha }}^2$ in (21) it is more convenient to prove that ${\hat{\alpha }}^+ \triangleq {\hat{\alpha }}^1 + {\hat{\alpha }}^2 \in L^2({\mathbb {P}}\otimes dt)$ and ${\hat{\alpha }}^- \triangleq {\hat{\alpha }}^1 - {\hat{\alpha }}^2 \in L^2({\mathbb {P}}\otimes dt)$, where we set ${\hat{X}}^+_\cdot \triangleq \int _0^\cdot {\hat{\alpha }}^+_s ds$ and ${\hat{X}}^-_\cdot \triangleq \int _0^\cdot {\hat{\alpha }}^-_s ds$. Recall from (41) and (42) above that we then have

$$\begin{aligned} {\hat{\alpha }}^+_t = \frac{c^+_t}{\lambda } \left( \frac{{\hat{\xi }}^1_t + {\hat{\xi }}^2_t}{1+w^5_t} - {\hat{X}}^+_t\right) , \quad {\hat{\alpha }}^-_t = \frac{c^-_t}{\lambda } \left( \frac{{\hat{\xi }}^1_t - {\hat{\xi }}^2_t}{1-w^5_t} - {\hat{X}}^-_t \right) \end{aligned}$$

(49)

on [0, T), where

$$\begin{aligned} {\hat{\xi }}^1_t + {\hat{\xi }}^2_t = 2 w^1_t (M^+_t - Y^+_t), \quad {\hat{\xi }}^1_t - {\hat{\xi }}^2_t = 2 w^2_t (M^-_t - Y^-_t) \quad (0 \le t \le T) \end{aligned}$$

(50)

because of (43) (recall that $M^+,Y^+$ are given in (28) and $M^-,Y^-$ are given in (29)).

We start with showing that ${\hat{\alpha }}^+ \in L^2({\mathbb {P}}\otimes dt)$. For this purpose, observe that it suffices to examine the following two cases $\xi ^1\equiv \xi ^2\equiv 0$ and $\Xi ^1_T=\Xi ^2_T=0$ separately. Indeed, let us denote ${\hat{\alpha }}^{+,\xi ^1,\xi ^2,\Xi ^1,\Xi ^2} \triangleq {\hat{\alpha }}^{+}$ to emphasize also the dependence on $\xi ^1,\xi ^2,\Xi ^1,\Xi ^2$. Then, due to the linear dependence of ${\hat{\alpha }}^+$ in (49) on $\xi ^1,\xi ^2,\Xi ^1,\Xi ^2$, it holds that

$$\begin{aligned} {\hat{\alpha }}^{+,\xi ^1,\xi ^2,\Xi ^1,\Xi ^2} = {\hat{\alpha }}^{+,0,0,\Xi ^1,\Xi ^2} + {\hat{\alpha }}^{+,\xi ^1,\xi ^2,0,0}. \end{aligned}$$

(51)

Hence, it suffices to show that ${\hat{\alpha }}^{+,0,0,\Xi ^1,\Xi ^2} \in L^2({\mathbb {P}}\otimes dt)$ and ${\hat{\alpha }}^{+,\xi ^1,\xi ^2,0,0} \in L^2({\mathbb {P}}\otimes dt)$.

Case 1.1: $\xi ^1\equiv \xi ^2\equiv 0$:

From (50) it follows that ${\hat{\xi }}^1_t + {\hat{\xi }}^2_t = 2 w^1_t M^+_t$. Moreover, the explicit solutions in (66) and (67) yield

$$\begin{aligned} \begin{aligned} {\hat{X}}^+_t =&\; e^{-\int _0^t \frac{c^+_u}{\lambda } du} \int _0^t \frac{c_s^+ + c^-_s}{\lambda } w^1_s M^+_s e^{\int _0^s \frac{c^+_u}{\lambda } du} ds \\ =&\; e^{\frac{\gamma }{6\lambda } (T-t)} \sinh (\sqrt{\delta ^+}(T-t)/(3\lambda )) \\&\int _0^t M^+_s \frac{\sqrt{\delta ^+}}{3\lambda \sinh (\sqrt{\delta ^+}(T-s)/(3\lambda ))^2} ds \qquad (0 \le t < T). \end{aligned} \end{aligned}$$

(52)

Introducing the deterministic and differentiable function $f^+_s \triangleq 1/\sinh (\sqrt{\delta ^+}(T-s)/(3\lambda ))$ on [0, T) allows us to rewrite the integral in (52) by applying integration by parts as

$$\begin{aligned}&\int _0^t M^+_s \frac{\sqrt{\delta ^+}}{3\lambda \sinh (\sqrt{\delta ^+}(T-s)/(3\lambda ))^2} ds = \int _0^t {\tilde{M}}^+_s df^+_s \nonumber \\&\quad = {\tilde{M}}^+_t f^+_t - {\tilde{M}}^+_0 f^+_0 - \int _0^t f_s^+ d{\tilde{M}}^+_s \qquad (0 \le t < T), \end{aligned}$$

(53)

where ${\tilde{M}}^+_t \triangleq M_t^+/\cosh (\sqrt{\delta ^+}(T-t)/(3\lambda ))$ for all $t \in [0,T)$. Moreover, we have that

$$\begin{aligned} \frac{{\hat{\xi }}^1_t + {\hat{\xi }}^2_t}{1+w^5_t} = \frac{\sqrt{\delta ^+} e^{\frac{\gamma }{6\lambda }(T-t)}}{3 c^+_t \sinh (\sqrt{\delta ^+}(T-t)/(3\lambda ))} M^+_t \quad (0 \le t \le T). \end{aligned}$$

(54)

Now, plugging back (54) and (52) together with (53) into ${\hat{\alpha }}^+$ in (49) yields, after some elementary computations,

$$\begin{aligned} \begin{aligned} {\hat{\alpha }}^+_t =&\; -\frac{\gamma }{6\lambda } e^{\frac{\gamma }{6\lambda }(T-t)} {\tilde{M}}^+_t + \frac{c^+_t}{\lambda } e^{\frac{\gamma }{6\lambda } (T-t)} \sinh (\sqrt{\delta ^+}(T-t)/(3\lambda )) {\tilde{M}}^+_0 f^+_0 \\&\; + \frac{c^+_t}{\lambda } e^{\frac{\gamma }{6\lambda } (T-t)} \sinh (\sqrt{\delta ^+}(T-t)/(3\lambda )) \int _0^t f_s^+ d{\tilde{M}}^+_s \quad (0 \le t < T). \end{aligned} \end{aligned}$$

(55)
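The elementary computations behind the first term of (55) can be checked numerically. The following Python sketch is an illustration only and not part of the original argument: for sample values of $\gamma $, $\lambda $, $\delta ^+$, $t$ and $M^+_t$ (all chosen by us), it adds the contribution of (54) to that of the boundary term ${\tilde{M}}^+_t f^+_t$ from (53) and compares the sum with $-\frac{\gamma }{6\lambda } e^{\frac{\gamma }{6\lambda }(T-t)} {\tilde{M}}^+_t$. The function name is hypothetical, and $\delta ^+$ is treated as a free positive input.

```python
import math


def alpha_plus_leading_terms(m, t, T, gamma, lam, delta_plus):
    """Return (lhs, rhs) for the cancellation that produces the first term of (55).

    m stands for a sample value of M^+_t; delta_plus is a free positive input.
    """
    tau = T - t
    root = math.sqrt(delta_plus)
    a = root * tau / (3.0 * lam)
    sinh_a, cosh_a = math.sinh(a), math.cosh(a)
    expo = math.exp(gamma * tau / (6.0 * lam))
    # c^+_t as recalled from (19) in the text
    c_plus = root / (3.0 * math.tanh(a)) + gamma / 6.0

    m_tilde = m / cosh_a   # tilde M^+_t
    f_plus = 1.0 / sinh_a  # f^+_t

    # (c^+/lambda) times the right-hand side of (54)
    from_54 = (c_plus / lam) * root * expo / (3.0 * c_plus * sinh_a) * m
    # -(c^+/lambda) * prefactor of (52) * boundary term tilde M_t f_t of (53)
    from_53 = -(c_plus / lam) * expo * sinh_a * m_tilde * f_plus

    lhs = from_54 + from_53
    rhs = -gamma / (6.0 * lam) * expo * m_tilde
    return lhs, rhs
```

The two returned numbers agree up to rounding because $\frac{1}{3}\sqrt{\delta ^+}\coth (\cdot ) - c^+_t = -\frac{1}{6}\gamma $ by (19); the remaining two terms of (55) come from the other two terms in (53).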

In fact, since $c^+_t \sinh (\sqrt{\delta ^+}(T-t)/(3\lambda ))$ is bounded on [0, T] (recall from (19) that $c^+_t = \frac{1}{3} \sqrt{\delta ^+} \coth (\sqrt{\delta ^+}(T-t)/(3\lambda )) + \frac{1}{6}\gamma $) and ${\tilde{M}}^+ \in L^2({\mathbb {P}}\otimes dt)$ (recall that $M^+$ in (28) belongs to $L^2({\mathbb {P}}\otimes dt)$) the first two terms in (55) are in $L^2({\mathbb {P}}\otimes dt)$. For the stochastic integral, we obtain

$$\begin{aligned} \int _0^t f_s^+ d{\tilde{M}}^+_s =&\; \int _0^t \frac{\sqrt{\delta ^+} M_s^+}{3\lambda \cosh (\sqrt{\delta ^+}(T-s)/(3\lambda ))^2} ds \\&\; + \int _0^t \frac{{\tilde{f}}_s^+}{\cosh (\sqrt{\delta ^+}(T-s)/(3\lambda ))} dM^+_s, \end{aligned}$$

where the first integral on the right is again an element of $L^2({\mathbb {P}}\otimes dt)$. The second integral satisfies

$$\begin{aligned} \begin{aligned}&{\mathbb {E}}\left[ \int _0^T \left( \int _0^t \frac{{\tilde{f}}_s^+}{\cosh (\sqrt{\delta ^+}(T-s)/(3\lambda ))} dM^+_s \right) ^2 dt \right] \\&\quad = {\mathbb {E}}\left[ \int _0^T \int _0^t \left( \frac{{\tilde{f}}_s^+}{\cosh (\sqrt{\delta ^+}(T-s)/(3\lambda ))} \right) ^2 d\langle M^+ \rangle _s dt \right] \\&\quad = {\mathbb {E}}\left[ \int _0^T (T-s) \frac{({\tilde{f}}_s^+)^2}{\cosh (\sqrt{\delta ^+}(T-s)/(3\lambda ))^2} d\langle M^+ \rangle _s \right] \\&\quad \le \frac{9\lambda ^2}{\delta ^+} {\mathbb {E}}\left[ \int _0^T \frac{1}{T-s} d\langle M^+ \rangle _s \right] < \infty \end{aligned} \end{aligned}$$

(56)

by our assumption in (9), where we also used Fubini’s theorem twice and the fact that $\sinh (\tau ) \ge \tau $ and $\cosh (\tau ) \ge 1$ for all $\tau \ge 0$. That is, we obtain that ${\hat{\alpha }}^+ \in L^2({\mathbb {P}}\otimes dt)$ in this case.
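The last inequality in (56) rests on a purely deterministic estimate, which the following Python sketch checks on a grid. It is an illustration under our own reading: we take the coefficient of $dM^+_s$ to be $1/(\sinh (\cdot )\cosh (\cdot ))$, as produced by the product rule for ${\tilde{M}}^+$, and we treat $\delta ^+$, $\lambda $ and $T$ as free positive inputs. All function names are hypothetical.

```python
import math


def weighted_integrand(s, T, lam, delta_plus):
    """(T - s) / (sinh(x)^2 * cosh(x)^2) with x = sqrt(delta_plus) (T - s) / (3 lam)."""
    x = math.sqrt(delta_plus) * (T - s) / (3.0 * lam)
    return (T - s) / (math.sinh(x) ** 2 * math.cosh(x) ** 2)


def upper_bound(s, T, lam, delta_plus):
    """Right-hand side integrand of (56): 9 lam^2 / (delta_plus (T - s))."""
    return 9.0 * lam ** 2 / (delta_plus * (T - s))


def bound_holds_on_grid(T, lam, delta_plus, n=1000):
    """Check the pointwise bound for s = 0, T/n, ..., T(n-1)/n (so s < T)."""
    grid = [T * k / n for k in range(n)]
    return all(
        weighted_integrand(s, T, lam, delta_plus)
        <= upper_bound(s, T, lam, delta_plus) * (1.0 + 1e-12)
        for s in grid
    )
```

The check succeeds for any positive inputs because $\sinh (\tau ) \ge \tau $ and $\cosh (\tau ) \ge 1$; the bound is sharp only as $s \uparrow T$, which is where assumption (9) is needed.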

Case 1.2: $\Xi ^1_T=\Xi ^2_T=0$:

In this case, we obtain from the expressions in (22) and (23) that

$$\begin{aligned} {\hat{\xi }}^1_t + {\hat{\xi }}^2_t = 2 w^3_t {\mathbb {E}}\left[ \int _t^T (\xi ^1_u + \xi ^2_u) K^1(t,u) du \, \Big \vert \, {\mathscr {F}}_t \right] \quad (0 \le t \le T) \end{aligned}$$

and thus, using again the explicit representation for ${\hat{X}}^+={\hat{X}}^1 + {\hat{X}}^2$ from (66) and (67), ${\hat{\alpha }}^+$ in (49) becomes

$$\begin{aligned} {\hat{\alpha }}^+_t&= \frac{c^+_t}{\lambda } \left( \frac{{\hat{\xi }}^1_t + {\hat{\xi }}^2_t}{1+w^5_t} - {\hat{X}}^+_t\right) \nonumber \\&= \frac{2c^+_tw^3_t}{\lambda (1+w^5_t)} {\mathbb {E}}\left[ \int _t^T (\xi ^1_u + \xi ^2_u) K^1(t,u) du \, \bigg \vert \, {\mathscr {F}}_t \right] \nonumber \\&\quad - \frac{c^+_t}{\lambda } e^{-\int _0^t \frac{c^+_u}{\lambda } du} \nonumber \\&\qquad \int _0^t \frac{(c_s^+ + c^-_s)w^3_s}{\lambda } e^{\int _0^s \frac{c^+_u}{\lambda } du} {\mathbb {E}}\left[ \int _s^T (\xi ^1_u + \xi ^2_u) K^1(s,u) du \, \bigg \vert \, {\mathscr {F}}_s \right] ds. \end{aligned}$$

(57)

In fact, it holds that all the ratios in (57) involving $c^+$, $c^-$ are bounded on [0, T]. Moreover, by Lemma 3.8 we have

$$\begin{aligned} {\mathbb {E}}\left[ \int _t^T (\xi ^1_u + \xi ^2_u) K^1(t,u) du \, \bigg \vert \, {\mathscr {F}}_t \right] \in L^2({\mathbb {P}}\otimes dt), \end{aligned}$$

as well as

$$\begin{aligned}&{\mathbb {E}}\left[ \int _0^T \left( \int _0^t {\mathbb {E}}\left[ \int _s^T (\xi ^1_u + \xi ^2_u) K^1(s,u) du \, \bigg \vert \, {\mathscr {F}}_s \right] ds \right) ^2 dt \right] \\&\quad \le \frac{T^2}{2} {\mathbb {E}}\left[ \int _0^T \left( {\mathbb {E}}\left[ \int _s^T (\xi ^1_u + \xi ^2_u) K^1(s,u) du \, \bigg \vert \, {\mathscr {F}}_s \right] \right) ^2 ds \right] < \infty \end{aligned}$$

by using Jensen’s inequality. As a consequence, we can also conclude in this case that ${\hat{\alpha }}^+$ belongs to $L^2({\mathbb {P}}\otimes dt)$.
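The Jensen-type estimate above, $\int _0^T (\int _0^t g_s ds)^2 dt \le \frac{T^2}{2} \int _0^T g_s^2 ds$, can be illustrated pathwise with a small Python sketch. The test function $g$ and the left Riemann discretization are our own choices and only serve as an example; the function name is hypothetical.

```python
import math


def jensen_sides(g, T, n=2000):
    """Left Riemann sums of both sides of the estimate for a deterministic path g."""
    h = T / n
    values = [g(k * h) for k in range(n)]
    running = 0.0  # approximates int_0^{t_k} g_s ds
    lhs = 0.0      # approximates int_0^T (int_0^t g_s ds)^2 dt
    for v in values:
        lhs += running ** 2 * h
        running += v * h
    rhs = 0.5 * T ** 2 * sum(v * v for v in values) * h
    return lhs, rhs
```

For instance, `jensen_sides(lambda s: math.sin(5.0 * s) + s, 2.0)` returns a pair whose first entry does not exceed the second. Taking expectations of the pathwise inequality gives the bound used in the proof.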

Let us now argue that also ${\hat{\alpha }}^-$ in (49) belongs to $L^2({\mathbb {P}}\otimes dt)$. The argumentation is very similar to the one presented above so that we only sketch the main steps. Again, it is enough to investigate the following two cases $\xi ^1\equiv \xi ^2\equiv 0$ and $\Xi ^1_T=\Xi ^2_T=0$ separately because ${\hat{\alpha }}^-$ in (49) can similarly be decomposed as ${\hat{\alpha }}^+$ in (51).

Case 2.1: $\xi ^1\equiv \xi ^2\equiv 0$:

Similar to (52) above, using ${\hat{\xi }}^1_t - {\hat{\xi }}^2_t = 2 w^2_t M^-_t$ from (50) we obtain via (66) and (67) the representation

$$\begin{aligned} \begin{aligned} {\hat{X}}^-_t =&\; e^{-\int _0^t \frac{c^-_u}{\lambda } du} \int _0^t \frac{c_s^+ + c^-_s}{\lambda } w^2_s M^-_s e^{\int _0^s \frac{c^-_u}{\lambda } du} ds \\ =&\; e^{-\frac{\gamma }{2\lambda } (T-t)} \sinh (\sqrt{\delta ^-}(T-t)/\lambda ) \\&\int _0^t M^-_s \frac{\sqrt{\delta ^-}}{\lambda \sinh (\sqrt{\delta ^-}(T-s)/\lambda )^2} ds \qquad (0 \le t < T). \end{aligned} \end{aligned}$$

(58)

Setting $f^-_s \triangleq 1/\sinh (\sqrt{\delta ^-}(T-s)/\lambda )$ on [0, T) we can rewrite the integral in (58) as

$$\begin{aligned} \int _0^t {\tilde{M}}^-_s df^-_s = {\tilde{M}}^-_t f^-_t - {\tilde{M}}^-_0 f^-_0 - \int _0^t f_s^- d{\tilde{M}}^-_s \qquad (0 \le t < T) \end{aligned}$$

(59)

with ${\tilde{M}}^-_t \triangleq M_t^-/\cosh (\sqrt{\delta ^-}(T-t)/\lambda )$ for all $t \in [0,T)$. In addition,

$$\begin{aligned} \frac{{\hat{\xi }}^1_t - {\hat{\xi }}^2_t}{1-w^5_t} = \frac{\sqrt{\delta ^-} e^{-\frac{\gamma }{2\lambda }(T-t)}}{c^-_t \sinh (\sqrt{\delta ^-}(T-t)/\lambda )} M^-_t \quad (0 \le t \le T). \end{aligned}$$

(60)

Inserting (60) and (58) together with (59) into ${\hat{\alpha }}^-$ in (49) then yields

$$\begin{aligned} \begin{aligned} {\hat{\alpha }}^-_t =&\; \frac{\gamma }{2\lambda } e^{-\frac{\gamma }{2\lambda }(T-t)} {\tilde{M}}^-_t + \frac{c^-_t}{\lambda } e^{-\frac{\gamma }{2\lambda } (T-t)} \sinh (\sqrt{\delta ^-}(T-t)/\lambda ) {\tilde{M}}^-_0 f^-_0 \\&\; + \frac{c^-_t}{\lambda } e^{-\frac{\gamma }{2\lambda } (T-t)} \sinh (\sqrt{\delta ^-}(T-t)/\lambda ) \int _0^t f_s^- d{\tilde{M}}^-_s \quad (0 \le t < T), \end{aligned} \end{aligned}$$

(61)

where

$$\begin{aligned} \int _0^t f_s^- d{\tilde{M}}^-_s =&\; \int _0^t \frac{\sqrt{\delta ^-} M_s^-}{\lambda \cosh (\sqrt{\delta ^-}(T-s)/\lambda )^2} ds \nonumber \\&\; + \int _0^t \frac{{\tilde{f}}_s^-}{\cosh (\sqrt{\delta ^-}(T-s)/\lambda )} dM^-_s. \end{aligned}$$

(62)

Observe as in (55) above that $c^-_t \sinh (\sqrt{\delta ^-}(T-t)/\lambda )$ is bounded on [0, T] (recall from (19) that $c^-_t = \sqrt{\delta ^-} \coth (\sqrt{\delta ^-}(T-t)/\lambda )-\frac{1}{2}\gamma $) and that ${\tilde{M}}^- \in L^2({\mathbb {P}}\otimes dt)$. Therefore, we only need to justify that the stochastic integral in (62) belongs to $L^2({\mathbb {P}}\otimes dt)$. Indeed, by the same computations as in (56), we obtain via our assumption in (9) that

$$\begin{aligned} \begin{aligned}&{\mathbb {E}}\left[ \int _0^T \left( \int _0^t \frac{{\tilde{f}}_s^-}{\cosh (\sqrt{\delta ^-}(T-s)/\lambda )} dM^-_s \right) ^2 dt \right] \\&\quad \le \frac{\lambda ^2}{\delta ^-} {\mathbb {E}}\left[ \int _0^T \frac{1}{T-s} d\langle M^- \rangle _s \right] < \infty . \end{aligned} \end{aligned}$$

(63)

Hence, we can conclude that ${\hat{\alpha }}^- \in L^2({\mathbb {P}}\otimes dt)$ in this case.

Case 2.2: $\Xi ^1_T=\Xi ^2_T=0$:

Here, similar to (57) above, (22) and (23) imply that

$$\begin{aligned} {\hat{\xi }}^1_t - {\hat{\xi }}^2_t = 2 w^4_t {\mathbb {E}}\left[ \int _t^T (\xi ^1_u - \xi ^2_u) K^2(t,u) du \, \Big \vert \, {\mathscr {F}}_t \right] \quad (0 \le t \le T) \end{aligned}$$

and hence, together with ${\hat{X}}^-={\hat{X}}^1 - {\hat{X}}^2$ from (66) and (67), ${\hat{\alpha }}^-$ in (49) can be written as

$$\begin{aligned} {\hat{\alpha }}^-_t&= \frac{c^-_t}{\lambda } \left( \frac{{\hat{\xi }}^1_t - {\hat{\xi }}^2_t}{1-w^5_t} - {\hat{X}}^-_t\right) \nonumber \\&= \frac{2c^-_tw^4_t}{\lambda (1-w^5_t)} {\mathbb {E}}\left[ \int _t^T (\xi ^1_u - \xi ^2_u) K^2(t,u) du \, \bigg \vert \, {\mathscr {F}}_t \right] \nonumber \\&\quad - \frac{c^-_t}{\lambda } e^{-\int _0^t \frac{c^-_u}{\lambda } du} \nonumber \\&\qquad \int _0^t \frac{(c_s^+ + c^-_s)w^4_s}{\lambda } e^{\int _0^s \frac{c^-_u}{\lambda } du} {\mathbb {E}}\left[ \int _s^T (\xi ^1_u - \xi ^2_u) K^2(s,u) du \, \bigg \vert \, {\mathscr {F}}_s \right] ds. \end{aligned}$$

(64)

As in (57), all the ratios in (64) involving the functions $c^+$, $c^-$ are bounded on [0, T], and we can conclude along the same lines as in case 1.2 by virtue of Lemma 3.8 that ${\hat{\alpha }}^- \in L^2({\mathbb {P}}\otimes dt)$ in this case as well.

Step 4: Finally, we have to argue that the functions $K^1(t,u)$ and $K^2(t,u)$ defined in (24) are nonnegative kernels which integrate to one over [t, T) as functions in $u \in [t,T)$. To this end, observe that $c^+_t > 0$ and $c^-_t > 0$ for all $t \in [0,T]$, which implies that $w^1_\cdot , w^2_\cdot > 0$ on [0, T). Moreover, a direct computation yields that for all $t \in [0,T)$ we have

$$\begin{aligned} \begin{aligned} 0<&\; \int _t^T \frac{2\sigma }{\sqrt{\delta ^+}} e^{-\frac{\gamma }{6\lambda }(T-u)} \sinh (\sqrt{\delta ^+}(T-u)/(3\lambda )) du = \frac{w^3_t}{w^1_t}, \\ 0 <&\; \int _t^T \frac{2\sigma }{\sqrt{\delta ^-}} e^{\frac{\gamma }{2\lambda }(T-u)} \sinh (\sqrt{\delta ^-}(T-u)/\lambda ) du = \frac{w^4_t}{w^2_t}. \end{aligned} \end{aligned}$$

(65)

Thus, we also obtain that $w^3_\cdot , w^4_\cdot > 0$ on [0, T). But this implies for the functions defined in (24) that $K^1(t,u) > 0$ and $K^2(t,u) > 0$ for all $0 \le t \le u < T$, as well as that $\int _t^T K^1(t,u) du = \int _t^T K^2(t,u) du = 1$ for all $t \in [0,T)$. $\square $

The equilibrium share holdings prescribed by the linear coupled ODE in (21) can also be computed explicitly.

Corollary 3.6

The solution $({\hat{X}}^1, {\hat{X}}^2)$ to the linear ODE in (21) is given by

$$\begin{aligned} {\hat{X}}^{1}_t =&\; \frac{1}{2} (x^1 + x^2) e^{-\int _0^t \frac{c^+_s}{\lambda } ds} + \frac{1}{4\lambda } \int _0^t (c^+_s + c^-_s) ({\hat{\xi }}^1_s+{\hat{\xi }}^2_s) e^{-\int _s^t \frac{c^+_u}{\lambda } du} ds \nonumber \\&\; + \frac{1}{2} (x^1 - x^2) e^{-\int _0^t \frac{c^-_s}{\lambda } ds} + \frac{1}{4\lambda } \int _0^t (c^+_s + c^-_s) ({\hat{\xi }}^1_s-{\hat{\xi }}^2_s) e^{-\int _s^t \frac{c^-_u}{\lambda } du} ds \end{aligned}$$

(66)

and, similarly, by

$$\begin{aligned} {\hat{X}}^{2}_t= & {} \frac{1}{2} (x^2 + x^1) e^{-\int _0^t \frac{c^+_s}{\lambda } ds} + \frac{1}{4\lambda } \int _0^t (c^+_s + c^-_s) ({\hat{\xi }}^2_s+{\hat{\xi }}^1_s) e^{-\int _s^t \frac{c^+_u}{\lambda } du} ds \nonumber \\&+ \frac{1}{2} (x^2 - x^1) e^{-\int _0^t \frac{c^-_s}{\lambda } ds} + \frac{1}{4\lambda } \int _0^t (c^+_s + c^-_s) ({\hat{\xi }}^2_s-{\hat{\xi }}^1_s) e^{-\int _s^t \frac{c^-_u}{\lambda } du} ds \end{aligned}$$

(67)

for all $t \in [0,T]$.

Proof

Recall that from the dynamics of ${\hat{X}}^1$ and ${\hat{X}}^2$ in (21) we obtain that the processes ${\hat{X}}^1 + {\hat{X}}^2$ and ${\hat{X}}^1 - {\hat{X}}^2$ satisfy, respectively, the linear ODEs in (41) and (42) with initial values $x^1+x^2$ and $x^1-x^2$. Applying the variation of constants formula then yields

$$\begin{aligned} {\hat{X}}^1_t \pm {\hat{X}}^2_t = (x^1 \pm x^2) e^{-\int _0^t \frac{c^{\pm }_s}{\lambda } ds} + \int _0^t \frac{c^+_s+c^-_s}{2\lambda } ({\hat{\xi }}^1_s \pm {\hat{\xi }}^2_s) e^{-\int _s^t \frac{c_u^{\pm }}{\lambda } du} ds \end{aligned}$$

and hence the assertion in (66) and (67) via the obvious relation

$$\begin{aligned} {\hat{X}}^{1,2}_t = \frac{1}{2} ({\hat{X}}^1_t + {\hat{X}}^2_t) \pm \frac{1}{2} ({\hat{X}}^1_t - {\hat{X}}^2_t). \end{aligned}$$

$\square $
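The variation of constants formula used in this proof can be sanity-checked numerically. The Python sketch below is a generic illustration: `k` plays the role of $c^{\pm }_s/\lambda $ and `b` the role of $\frac{c^+_s+c^-_s}{2\lambda } ({\hat{\xi }}^1_s \pm {\hat{\xi }}^2_s)$, but both are arbitrary deterministic sample functions chosen by us, and the function names are hypothetical.

```python
import math


def variation_of_constants(x0, k, b, t, n=4000):
    """X_t = x0 exp(-int_0^t k) + int_0^t b(s) exp(-int_s^t k) ds (midpoint rule)."""
    h = t / n
    big_k = 0.0     # running value of int_0^s k(u) du
    integral = 0.0  # running value of int_0^s b(r) exp(int_0^r k(u) du) dr
    for i in range(n):
        mid = (i + 0.5) * h
        k_mid = k(mid)
        integral += h * b(mid) * math.exp(big_k + 0.5 * h * k_mid)
        big_k += h * k_mid
    return math.exp(-big_k) * (x0 + integral)


def euler(x0, k, b, t, n=20000):
    """Explicit Euler scheme for the linear ODE X' = -k(s) X + b(s), X_0 = x0."""
    h = t / n
    x = x0
    for i in range(n):
        s = i * h
        x += h * (-k(s) * x + b(s))
    return x
```

For constant coefficients the formula reduces to $x e^{-kt} + \frac{b}{k}(1 - e^{-kt})$, which the first function reproduces up to quadrature error; for time-dependent coefficients it agrees with the Euler solution up to discretization error. Near $t = T$ the rates $c^{\pm }_t/\lambda $ explode, so a naive Euler scheme is not recommended for the equilibrium ODE itself.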

Lastly, the following simple properties of the weight functions introduced in (20) help to illuminate the structure of the Nash equilibrium presented in Theorem 3.5.

Lemma 3.7

The weight functions $w^1, w^2, w^3, w^4, w^5$ defined in (20) satisfy

1. $w_\cdot ^5 \in (-1,1)$, $w_{\cdot }^{1,2,3,4} > 0$ on [0, T) and $w^1_\cdot + w^2_\cdot + w^3_\cdot + w^4_\cdot =1$ on [0, T],
2. $\lim _{t \uparrow T} w_t^{1,2} = 1/2$ and $\lim _{t \uparrow T} w_t^{3,4,5} = 0$.

Proof

1. First, recall from the Proof of Theorem 3.5, Step 4, above that $w^1_\cdot , w^2_\cdot ,w^3_\cdot ,w^4_\cdot > 0$ on [0, T). Moreover, from the definition in (20) we immediately obtain that $w^1_t + w^2_t + w^3_t + w^4_t =1$ for all $t \in [0,T]$. Together with the fact that $c^+_\cdot > 0$ and $c^-_\cdot > 0$ on [0, T], we also observe that $w^5_t \in (-1,1)$ for all $t \in [0,T]$.

2. Concerning the limiting behaviour of the weight functions, it suffices to note that

$$\begin{aligned} \lim _{t \uparrow T} \frac{\sinh (\sqrt{\delta ^+}(T-t)/(3\lambda ))}{\sinh (\sqrt{\delta ^-}(T-t)/\lambda )} = \frac{\sqrt{\delta ^+}}{3\sqrt{\delta ^-}}. \end{aligned}$$

Then, rewriting $w^1$, $w^2$ in (20) by plugging in $c^+$, $c^-$ from (19) to obtain the representations

$$\begin{aligned} w^1_t = \frac{\sqrt{\delta ^+} e^{\frac{\gamma }{6\lambda }(T-t)}}{d^1_t}, \qquad w^2_t = \frac{3 \sqrt{\delta ^-} e^{-\frac{\gamma }{2\lambda }(T-t)}}{d^2_t} \end{aligned}$$

with

$$\begin{aligned} d^1_t \triangleq&\; \sqrt{\delta ^+}\cosh (\sqrt{\delta ^+}(T-t)/(3\lambda ))-\gamma \sinh (\sqrt{\delta ^+}(T-t)/(3\lambda )) \\&+ 3 \sqrt{\delta ^-} \sinh (\sqrt{\delta ^+}(T-t)/(3\lambda )) \coth (\sqrt{\delta ^-}(T-t)/\lambda ), \\ d^2_t \triangleq&\; 3 \sqrt{\delta ^-} \cosh (\sqrt{\delta ^-}(T-t)/\lambda ) -\gamma \sinh (\sqrt{\delta ^-}(T-t)/\lambda ) \\&+ \sqrt{\delta ^+}\sinh (\sqrt{\delta ^-}(T-t)/\lambda ) \coth (\sqrt{\delta ^+}(T-t)/(3\lambda )) \end{aligned}$$

yields

$$\begin{aligned} \lim _{t\uparrow T} w^1_t = \frac{\sqrt{\delta ^+}}{\sqrt{\delta ^+} + \sqrt{\delta ^+}} = \frac{1}{2}, \qquad \lim _{t\uparrow T} w^2_t = \frac{\sqrt{\delta ^-}}{\sqrt{\delta ^-} + \sqrt{\delta ^-}} = \frac{1}{2}. \end{aligned}$$

Similarly, with

$$\begin{aligned} \frac{c_t^+}{c^+_t+c^-_t} =&\; \frac{2\sqrt{\delta ^+} \coth (\sqrt{\delta ^+}(T-t)/(3\lambda )) + \gamma }{2\sqrt{\delta ^+} \coth (\sqrt{\delta ^+}(T-t)/(3\lambda )) + 6\sqrt{\delta ^-} \coth (\sqrt{\delta ^-}(T-t)/\lambda )-2\gamma } \\ \frac{c_t^-}{c^+_t+c^-_t} =&\; \frac{6 \sqrt{\delta ^-} \coth (\sqrt{\delta ^-}(T-t)/\lambda ) - 3 \gamma }{2\sqrt{\delta ^+} \coth (\sqrt{\delta ^+}(T-t)/(3\lambda )) + 6\sqrt{\delta ^-} \coth (\sqrt{\delta ^-}(T-t)/\lambda )-2\gamma } \end{aligned}$$

we also have

$$\begin{aligned} \lim _{t\uparrow T} \frac{c_t^+}{c^+_t+c^-_t} = \frac{\sqrt{\delta ^+}}{\sqrt{\delta ^+} + \sqrt{\delta ^+}} = \frac{1}{2}, \qquad \lim _{t\uparrow T} \frac{c_t^-}{c^+_t+c^-_t} = \frac{\sqrt{\delta ^-}}{\sqrt{\delta ^-} + \sqrt{\delta ^-}} = \frac{1}{2} \end{aligned}$$

and hence

$$\begin{aligned} \lim _{t\uparrow T} w^3_t = \lim _{t\uparrow T} w^4_t = \lim _{t\uparrow T} w^5_t = 0 \end{aligned}$$

as desired. $\square $
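The statements of Lemma 3.7 can be explored numerically. The following Python sketch is an illustration under explicit assumptions and not the paper's own code. The definitions (19) and (20) are not restated in this section, so we (i) use $c^+$, $c^-$ as recalled from (19) in the proof of Theorem 3.5, (ii) write the denominators of $w^1$, $w^2$ as $3(c^+_t+c^-_t)\sinh (\cdot )$, (iii) obtain $w^3$, $w^4$ from the integrals in (65), evaluated with antiderivatives we computed ourselves, (iv) set $w^5 = (c^+ - c^-)/(c^+ + c^-)$, which is what matching (49) with (21) suggests, and (v) assume $\delta ^+ = \gamma ^2/4 + 6\lambda \sigma $ and $\delta ^- = \gamma ^2/4 + 2\lambda \sigma $, the values for which the four weights sum to one. The function name and the parameter values in the usage note are hypothetical.

```python
import math


def weights(t, T, gamma, lam, sigma):
    """Return (w1, w2, w3, w4, w5) at time t < T under the assumptions stated above."""
    tau = T - t
    # ASSUMPTION: these values of delta^+ and delta^- are inferred, not quoted.
    delta_p = gamma ** 2 / 4.0 + 6.0 * lam * sigma
    delta_m = gamma ** 2 / 4.0 + 2.0 * lam * sigma
    rp, rm = math.sqrt(delta_p), math.sqrt(delta_m)
    a, b = rp / (3.0 * lam), rm / lam
    g, gm = gamma / (6.0 * lam), gamma / (2.0 * lam)

    # c^+ and c^- as recalled from (19)
    c_p = rp / (3.0 * math.tanh(a * tau)) + gamma / 6.0
    c_m = rm / math.tanh(b * tau) - gamma / 2.0
    total = c_p + c_m

    w1 = rp * math.exp(g * tau) / (3.0 * total * math.sinh(a * tau))
    w2 = rm * math.exp(-gm * tau) / (total * math.sinh(b * tau))

    # Closed forms of the two integrals in (65), i.e. w3/w1 and w4/w2.
    i1 = (2.0 * sigma / rp) * (
        math.exp(-g * tau) * (g * math.sinh(a * tau) + a * math.cosh(a * tau)) - a
    ) / (a * a - g * g)
    i2 = (2.0 * sigma / rm) * (
        math.exp(gm * tau) * (-gm * math.sinh(b * tau) + b * math.cosh(b * tau)) - b
    ) / (b * b - gm * gm)

    w3, w4 = w1 * i1, w2 * i2
    w5 = (c_p - c_m) / total
    return w1, w2, w3, w4, w5
```

Evaluating `weights(0.0, 2.0, 4.0, 1.0, 1.0)` (a plastic example with $\gamma > \lambda $) and `weights(0.0, 2.0, 0.2, 1.0, 1.0)` (an elastic example with $\lambda > \gamma $) gives positive $w^1,\ldots ,w^4$ that sum to one, with $w^3 > w^4$ and $w^5 > 0$ in the first case and $w^4 > w^3$ and $w^5 < 0$ in the second. Evaluating close to $T$ illustrates the limits in part 2.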

The final lemma provides estimates with respect to the $L^2({\mathbb {P}}\otimes dt)$-norm which are used in the Proof of Theorem 3.5 above.

Lemma 3.8

Let $(\zeta _t)_{0 \le t \le T} \in L^2({\mathbb {P}}\otimes dt)$ be progressively measurable. Moreover, let $K^1(t,u)$, $K^2(t,u)$, $0 \le t \le u < T$, denote the kernels from Theorem 3.5.

(a) For $\zeta ^{K^1}_t \triangleq {\mathbb {E}}[ \int _t^T \zeta _u K^1(t,u) du \vert {\mathscr {F}}_t]$, $0 \le t < T$, it holds that
$$\begin{aligned} \Vert \zeta ^{K^1} \Vert _{L^2({\mathbb {P}}\otimes dt)} \le c \Vert \zeta \Vert _{L^2({\mathbb {P}}\otimes dt)} \end{aligned}$$
for some constant $c>0$.
(b) For $\zeta ^{K^2}_t \triangleq {\mathbb {E}}[ \int _t^T \zeta _u K^2(t,u) du \vert {\mathscr {F}}_t]$, $0 \le t < T$, it holds that
$$\begin{aligned} \Vert \zeta ^{K^2} \Vert _{L^2({\mathbb {P}}\otimes dt)} \le c \Vert \zeta \Vert _{L^2({\mathbb {P}}\otimes dt)} \end{aligned}$$
for some constant $c>0$.

Proof

Both upper bounds can be verified in a similar fashion as in the proof of Lemma 5.5 in Bank et al. [5]. We will thus omit it here. $\square $

Remark 3.9

Following up on Remark 2.3, setting $\xi ^1 \equiv \xi ^2 \equiv 0$ and $\Xi ^1_T = \Xi ^2_T = 0$ ${\mathbb {P}}$-almost surely, our Theorem 3.5 together with Corollary 3.6 retrieves the two-player results from Carlin et al. [9, Result 1] for the case $\sigma = 0$ and from Schied and Zhang [29, Corollary 2.6] for the case $\sigma > 0$. Note that this configuration yields ${\hat{\xi }}^1 \equiv {\hat{\xi }}^2 \equiv 0$ in (22) and (23), which in turn implies that the Nash equilibrium trading rates in (21) and the corresponding share holdings in (66) and (67) are deterministic.

We end this section by briefly discussing qualitatively the Nash equilibrium obtained in Theorem 3.5. Very similar to the single-player solution in [5] it turns out that the trading rates ${\hat{\alpha }}^1$ and ${\hat{\alpha }}^2$ in (21) prescribe, respectively, to gradually trade in the direction of an optimal signal process ${\hat{\xi }}^1_t$ and ${\hat{\xi }}^2_t$ (rather than toward the actual target position $\xi ^1_t$, $\xi ^2_t$), which is further adjusted by a fraction $w^5_t \in (-1,1)$ of the opponent’s respective current portfolio position ${\hat{X}}^2_t$ and ${\hat{X}}^1_t$. The optimal signal processes ${\hat{\xi }}^1$ in (22) and ${\hat{\xi }}^2$ in (23) are convex combinations of weighted averages of expected future target positions of the processes $\xi ^1$, $\xi ^2$ and the expected terminal positions $\Xi ^1_T$, $\Xi ^2_T$, where the weights $w^1_t, w^2_t, w^3_t, w^4_t$ systematically shift toward the desired individual terminal state as $t \uparrow T$ (Lemma 3.7 implies that $\lim _{t \uparrow T} {\hat{\xi }}^i_t = \Xi ^i_T$ ${\mathbb {P}}$-a.s. for both players $i=1,2$). The increasing urgency rate $(c^+_t+c^-_t)/(2\lambda ) \uparrow \infty $ for $t \uparrow T$, together with $\lim _{t\uparrow T} w^5_t = 0$, then forces both strategies in (21) to end up in the predetermined terminal portfolio position at maturity T (see also the Proof of Theorem 3.5 above). Interestingly, we note that the first agent’s optimal signal process ${\hat{\xi }}^1$ not only seeks to anticipate the future evolution of her own target strategy $\xi ^1$ but, conscious of her competitor’s trading goals, does so also for the opponent’s target strategy $\xi ^2$. In other words, besides following her own objectives, she also takes into account the other agent’s known trading intentions. Moreover, the weights $w^3_t$ and $w^4_t$ dictate the actual trading direction with respect to the other agent’s tracking target. 
Indeed, observe that if $w^3_t$ dominates $w^4_t$ in (22), the first player’s optimal signal ${\hat{\xi }}^1$ directs her to trade in parallel with the second player, that is, in the direction of the expected future average positions of $\xi ^2$. In contrast, if $w^4_t$ outweighs $w^3_t$, then the optimal signal prescribes trading in the opposite direction of the second player’s target strategy, i.e., toward the expected weighted averages of $-\xi ^2$. The former case can be viewed as a predatory trading action of the first agent against the second agent, whereas the latter case can be regarded as cooperative behaviour. The same applies to the second player in (23) due to symmetry. In our illustrations in Sect. 4 below it becomes apparent that both these cases depend on the relationship between the permanent and temporary price impact parameters $\gamma $ and $\lambda $. Loosely speaking, in a plastic market where $\gamma \gg \lambda $, the weight $w^3$ tends to be larger than $w^4$, and in an elastic market with $\lambda \gg \gamma $ the weight $w^4$ tends to be larger than $w^3$ (see also the graphical illustration of the weight functions in Fig. 1 below). In this regard, depending on the illiquidity parameters, the optimal signal processes ${\hat{\xi }}^1$ and ${\hat{\xi }}^2$ account for different types of regimes. It turns out that this leads to qualitatively different behavioral patterns in the Nash equilibrium, where both predation and cooperation between the agents can occur, even in a coexisting manner.

4 Illustrations

In this section we present some case studies to illustrate the qualitative behaviour of the two-player Nash equilibrium presented in Theorem 3.5.

4.1 Optimal liquidation revisited

We start with revisiting the differential game of optimal portfolio liquidation studied in Schied and Zhang [29]. Specifically, the first agent seeks to liquidate her initial portfolio position of $x^1=1$ shares in the risky asset by time $T=2$ and hence requires her terminal position to satisfy $\Xi ^1_T=0$ ${\mathbb {P}}$-a.s. at final time. Vigilant about her share holdings and in line with her selling intention she also wants her inventory to be close to 0 throughout by tracking $\xi ^1 \equiv 0$ on [0, T]. The second agent, on the contrary, does not pursue any predetermined buying or selling objectives but solely chooses to trade in the risky asset because he knows about the intentions of the first liquidating agent. That is, possessing no shares at time 0 ($x^2=0$) he gives himself the constraints $\xi ^2_t = \Xi ^2_T = 0$ ${\mathbb {P}}$-a.s. for all $t \in [0,T]$. In this case, following Theorem 3.5, we have ${\hat{\xi }}^1 \equiv {\hat{\xi }}^2 \equiv 0$ ${\mathbb {P}}$-a.s. on [0, T] in (22) and (23), and the deterministic equilibrium trading rates of both players in (21) reduce to

$$\begin{aligned} {\hat{\alpha }}^1_t = \frac{c^+_t+c^-_t}{2\lambda } \left( - w^5_t {\hat{X}}^2_t - {\hat{X}}^1_t \right) \quad \text {and} \quad {\hat{\alpha }}^2_t = \frac{c^+_t+c^-_t}{2\lambda } \left( - w^5_t {\hat{X}}^1_t - {\hat{X}}^2_t \right) \end{aligned}$$

(68)

on [0, T); cf. also the result in [29, Corollary 2.6] with a slightly different representation. We observe in (68) that the first agent’s portfolio position ${\hat{X}}^1_t$ is not gradually reverting towards 0 but takes the effect of the second agent’s actions into account via the correction term $-w^5_t {\hat{X}}^2_t$. Similarly, concerning the second agent, it is optimal for him to systematically trade in the direction of the liquidating agent’s current portfolio position ${\hat{X}}^1_t$ weighted with $w^5_t \in (-1,1)$.

As shown in Fig. 2, this leads to predation on the first agent in a plastic market where, e.g., $\gamma = 4 > 1 = \lambda $. Indeed, during the first half of the trading period the second agent short-sells the risky asset in parallel to the selling of the first agent and then steadily unwinds his accrued short position by buying back shares to become “hands-clean” by final time T. In contrast, in an elastic market with, e.g., $\gamma = 0.2 < 1 = \lambda $, the Nash equilibrium strategy dictates that the second agent cooperates with the seller: by time T/2 he moderately buys almost up to one-tenth of the shares that agent 1 is concurrently selling, before he starts liquidating his portfolio to finish with zero inventory at T. Note that the weight function $w^5_\cdot $ in (68) flips sign depending on the market’s illiquidity regime (see also Fig. 1). As a consequence, compared to the single-player optimal liquidation strategy ${\hat{X}}_t = 1 + \int _0^t {\hat{\alpha }}_s ds$, $t \in [0,T]$, which satisfies

$$\begin{aligned} {\hat{\alpha }}_t = -\sqrt{\frac{\sigma }{\lambda }} \coth \left( \sqrt{\frac{\sigma }{\lambda }} (T-t) \right) {\hat{X}}_t \qquad (0 \le t < T) \end{aligned}$$

(69)

(cf., e.g., Almgren [1]), and does not depend on $\gamma $, we observe in Fig. 2 that, due to the presence of the second agent’s trading activity which directly feeds into the first agent’s turnover rate ${\hat{\alpha }}^1$ via $-w^5 {\hat{X}}^2$ in (68), her optimal portfolio liquidation strategy becomes more prudent in a plastic market and slightly more aggressive in an elastic market environment. To sum up, in equilibrium, depending on the illiquid market type, either predation or cooperation between both agents occurs; see also the discussion in [29, Sect. 3].
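The liquidation example can be reproduced qualitatively with a few lines of Python. The sketch below is an illustration under assumptions, not the code behind Fig. 2. With ${\hat{\xi }}^1 \equiv {\hat{\xi }}^2 \equiv 0$, the formulas (66) and (67) only involve the exponentials $e^{-\int _0^t c^{\pm }_s/\lambda \, ds}$, which we integrate in closed form from the expressions for $c^{\pm }$ recalled from (19) (our own computation). We assume $\delta ^+ = \gamma ^2/4 + 6\lambda \sigma $ and $\delta ^- = \gamma ^2/4 + 2\lambda \sigma $, since (19) is not restated here, and we pick $\sigma = 1$ for illustration, as this section does not report the value used in the figures. The function names are hypothetical.

```python
import math


def liquidation_holdings(t, T, gamma, lam, sigma, x1=1.0, x2=0.0):
    """Equilibrium holdings (X1_t, X2_t) from (66), (67) with zero signal processes."""
    # ASSUMPTION: inferred values of delta^+ and delta^-.
    a = math.sqrt(gamma ** 2 / 4.0 + 6.0 * lam * sigma) / (3.0 * lam)
    b = math.sqrt(gamma ** 2 / 4.0 + 2.0 * lam * sigma) / lam
    # exp(-int_0^t c^+/lam) and exp(-int_0^t c^-/lam) in closed form
    e_plus = math.exp(-gamma * t / (6.0 * lam)) * math.sinh(a * (T - t)) / math.sinh(a * T)
    e_minus = math.exp(gamma * t / (2.0 * lam)) * math.sinh(b * (T - t)) / math.sinh(b * T)
    big_x1 = 0.5 * (x1 + x2) * e_plus + 0.5 * (x1 - x2) * e_minus
    big_x2 = 0.5 * (x1 + x2) * e_plus - 0.5 * (x1 - x2) * e_minus
    return big_x1, big_x2


def single_player_holdings(t, T, lam, sigma, x=1.0):
    """Solution of (69): X_t = x sinh(r (T - t)) / sinh(r T) with r = sqrt(sigma / lam)."""
    r = math.sqrt(sigma / lam)
    return x * math.sinh(r * (T - t)) / math.sinh(r * T)
```

With $T=2$, $\lambda = 1$ and the illustrative $\sigma = 1$, evaluating at $t = T/2$ gives a negative position for the second agent when $\gamma = 4$ and a positive one when $\gamma = 0.2$, while the first agent holds more than in the single-player solution in the first case and less in the second. The magnitudes depend on $\sigma $ and need not match Fig. 2.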

4.2 Piecewise constant inventory targets

The next two case studies are again simple deterministic examples but this time with nonzero optimal signal processes ${\hat{\xi }}^1$ and ${\hat{\xi }}^2$.

In the first example, as in the optimal liquidation problem above, we suppose that agent 2 only trades in the risky asset because of his awareness of the trading activity of the first agent. That is, with $x^2 = 0$ initial shares his inventory targets are $ \xi ^2_t = \Xi ^2_T = 0$ ${\mathbb {P}}$-a.s. for all $t \in [0,T]$. Concerning the first agent, starting with no inventory $x^1=0$ she wants to follow a stock-buying schedule over a time period of $T=10$ that prescribes holding one share until time T/2 and then doubling her position and holding it up to time T. Her inventory target is thus $\xi ^1_t = 1 \cdot 1_{\{0 \le t < 5\}}+2 \cdot 1_{\{5 \le t \le 10\}}$ on [0, T] with terminal constraint $\Xi ^1_T = 2$. Note that in this game setup the optimal signal processes ${\hat{\xi }}^1$ and ${\hat{\xi }}^2$ of both agents in (22) and (23) in equilibrium are nonzero. In particular, similar to the single-player case in [5], they anticipate and smooth out the jump in $\xi ^1$ at time T/2 via the averaging through the kernels $K^1$ and $K^2$. The associated Nash equilibrium trading strategies ${\hat{X}}^1$ and ${\hat{X}}^2$ from Theorem 3.5 are presented in Fig. 3. As expected from the liquidation problem above, if the market is plastic $(\gamma > \lambda )$ the second agent heavily preys on the first agent by trading in the same direction and buying shares halfway through the trading period. Accordingly, in comparison to the first agent’s single-player optimal tracking strategy from [5] (which does not depend on $\gamma $), her pursuit of the buying schedule $\xi ^1$ is affected by the presence of the preying second agent and falls behind the single-player solution in the second half of the trading period (also recall the adjustment ${\hat{\xi }}^1 -w^5{\hat{X}}^2$ of the first agent’s optimal signal process in her trading rate in (21)). However, if the market is elastic $(\lambda > \gamma )$ the second agent’s optimal behaviour in equilibrium changes.
Interestingly, we observe that his strategy turns out to be a succession of round-trips during which he either provides liquidity to his opponent by short-selling the risky asset like, e.g., during the first quarter of the trading period, or engages in predatory trading by concurrently building up some inventory in parallel to his adversary’s buying efforts as it is the case during the second quarter of the trading period. Thus, compared to the first agent’s single-player optimal strategy, she suitably buys slightly faster and slower in the two-player setup. Overall, it turns out that predation and cooperation coexist in equilibrium in this case.

As a second example, let us examine the situation where both agents with zero initial inventory $x^1=x^2=0$ seek to gradually build up and hold a positive fraction of the risky asset over some time period [0, T] with $T=10$. Concretely, assume that $\xi ^1 \equiv \Xi ^1_T = 1$ and $\xi ^2 \equiv \Xi ^2_T = 0.1$, i.e., agent 1 wants her inventory to be close to 1 and ten times larger than the desired inventory level of agent 2 all through the trading period [0, T]. The associated Nash equilibrium strategies ${\hat{X}}^1$ and ${\hat{X}}^2$ from Theorem 3.5 are presented in Fig. 4. Again, as expected from the analysis above, in a plastic market it is optimal for agent 2 to excessively prey on the first agent who aims for a much larger asset position by buying up to three times more shares than his actual target inventory predetermines. In response, the acquisition of the first agent is slowed down compared to her single-player optimal strategy from [5]. By contrast, in an elastic market environment it turns out to be optimal for the second agent to initially ignore his own tracking target and to trade away from his desired inventory level in order to provide liquidity to the higher-volume seeking first agent by short-selling some shares. Also note how in this case the second agent’s single-player optimal tracking strategy from [5] strongly differs from his optimal behaviour in the two-player Nash equilibrium at the beginning of the trading period.

4.3 Running after the delta

In the final two examples we want to investigate a situation where the target strategies $\xi ^1$ and $\xi ^2$ are adapted stochastic processes. Specifically, let us suppose that the first agent wants to hedge an at-the-money call option with maturity T on the underlying unaffected price process $P=P_0+\sqrt{\sigma } W$ in (3) by tracking the corresponding frictionless (Bachelier-)delta-hedging strategy

$$\begin{aligned} \xi ^1_t \triangleq \Phi \left( \frac{P_t - P_0}{\sqrt{\sigma (T-t)}} \right) \quad (0 \le t \le T). \end{aligned}$$

(70)

Here, $\Phi $ denotes the cumulative distribution function of the standard normal distribution. We further suppose that her initial position in the risky asset coincides with the frictionless delta $x^1=\xi ^1_0 = 1/2$ and that $\Xi ^1_T = 0$ ${\mathbb {P}}$-a.s., i.e., she wants to systematically unwind her hedging portfolio when approaching maturity T.

Lemma 4.1

The process $(\xi ^1_t)_{0 \le t \le T}$ in (70) is a martingale on [0, T].

Proof

Obviously, $(\xi ^1_t)_{0 \le t \le T}$ is adapted, bounded and hence integrable. Moreover, using the property that for any $a,b \in {\mathbb {R}}$ a standard normal distributed random variable Z satisfies ${\mathbb {E}}[\Phi (a Z + b)] = \Phi (b/\sqrt{1+a^2})$ we obtain

$$\begin{aligned} {\mathbb {E}}\left[ \Phi \left( \frac{P_t - P_0}{\sqrt{\sigma (T-t)}} \right) \bigg \vert \, {\mathscr {F}}_s \right] = {\mathbb {E}}\left[ \Phi \left( \frac{\sqrt{\sigma (t-s)} Z + P_s - P_0}{\sqrt{\sigma (T-t)}} \right) \right] = \Phi \left( \frac{P_s - P_0}{\sqrt{\sigma (T-s)}} \right) \end{aligned}$$

as desired. $\square $
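The Gaussian identity behind Lemma 4.1 can be verified numerically. The Python sketch below is an illustration with hypothetical function names and sample inputs chosen by us: it computes ${\mathbb {E}}[\Phi (aZ+b)]$ by a midpoint rule on a truncated interval and compares the resulting conditional expectation of $\xi ^1_t$ given $P_s$ with $\xi ^1_s$ from (70).

```python
import math


def Phi(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def expected_phi(a, b, n=20000, zmax=10.0):
    """E[Phi(a Z + b)] for standard normal Z, midpoint rule on [-zmax, zmax]."""
    h = 2.0 * zmax / n
    total = 0.0
    for i in range(n):
        z = -zmax + (i + 0.5) * h
        total += Phi(a * z + b) * math.exp(-0.5 * z * z)
    return total * h / math.sqrt(2.0 * math.pi)


def delta(p, p0, sigma, t, T):
    """Bachelier delta xi^1_t from (70) for price level p at time t < T."""
    return Phi((p - p0) / math.sqrt(sigma * (T - t)))


def conditional_delta(p_s, p0, sigma, s, t, T):
    """E[xi^1_t | P_s = p_s] for s < t < T, using P_t = P_s + sqrt(sigma (t - s)) Z."""
    a = math.sqrt((t - s) / (T - t))
    b = (p_s - p0) / math.sqrt(sigma * (T - t))
    return expected_phi(a, b)
```

For example, `conditional_delta(101.0, 100.0, 1.0, 0.3, 0.7, 1.0)` and `delta(101.0, 100.0, 1.0, 0.3, 1.0)` agree up to quadrature error, in line with the martingale property.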

Firstly, we assume that the second agent does not pursue any specific predetermined trading objectives, that is, $x^2 = \xi ^2 = \Xi ^2_T = 0$ ${\mathbb {P}}$-a.s. Since $\xi ^1$ in (70) is a martingale on [0, T] the optimal signal processes ${\hat{\xi }}^1$ and ${\hat{\xi }}^2$ in (22) and (23) simplify to

$$\begin{aligned} {\hat{\xi }}^1_t = (w^3_t + w^4_t) \xi ^1_t \quad \text {and} \quad {\hat{\xi }}^2_t = (w^3_t - w^4_t) \xi ^1_t \qquad (0 \le t \le T), \end{aligned}$$

(71)

using Fubini’s theorem and the fact that for each $t \in [0,T)$ the kernels $K^1(t,u)$ and $K^2(t,u)$ as functions in $u \in [t,T)$ integrate to one over [t, T]. The Nash equilibrium strategies ${\hat{X}}^1$ and ${\hat{X}}^2$ from Theorem 3.5 are plotted in Fig. 5, together with the corresponding realisation of the delta-hedge $\xi ^1$ in the case where the call option expires in the money.

Depending on the illiquidity parameters, we observe the same behavioral patterns in equilibrium as in the deterministic cases analyzed above: in a plastic market environment, the second agent engages in predatory trading on the first agent by trading in parallel in the same direction as the delta-hedge. When the market is elastic he turns into a liquidity provider instead and partially takes the opposite side of the hedger’s transactions. Also note that the sign of the second agent’s optimal signal process in (71) is determined by the relation between the weights $w^3$ and $w^4$, which is in turn affected by the relation between $\gamma $ and $\lambda $ (cf. also Fig. 1).

Secondly, let us now assume that the second agent also hedges a one-tenth fraction of the same call option, i.e., $\xi ^2 = \xi ^1/10$ (with initial and final portfolio positions $x^2=1/20$ and $\Xi ^2_T =0$ ${\mathbb {P}}$-a.s.). The resulting Nash equilibrium strategies from Theorem 3.5 are presented in Fig. 6 where we used the same realisation of the delta-hedge as in Fig. 5. In a similar vein as in the deterministic case above, the second agent’s optimal behaviour in the two-player Nash equilibrium changes notably compared to his optimal single-player frictional hedging strategy from [5]; focussing more on preying on the first agent’s larger hedging portfolio in a plastic market, or on providing liquidity to the latter in an elastic market.

Data Availability

No data was used in this article.

References

Almgren, R.: Optimal trading with stochastic liquidity and volatility. SIAM J. Financial Math. 3(1), 163–181 (2012). https://doi.org/10.1137/090763470
Almgren, R., Chriss, N.: Optimal execution of portfolio transactions. J. Risk 3, 5–39 (2001)
Almgren, R., Li, T.M.: Option hedging with smooth market impact. Market Microstructure Liquidity 02(01), 1650002 (2016). https://doi.org/10.1142/S2382626616500027
Attari, M., Mello, A.S., Ruckes, M.E.: Arbitraging arbitrageurs. J. Finance 60(5), 2471–2511 (2005)
Bank, P., Soner, H.M., Voß, M.: Hedging with temporary price impact. Math. Financial Economics 11(2), 215–239 (2017). https://doi.org/10.1007/s11579-016-0178-4
Brunnermeier, M.K., Pedersen, L.H.: Predatory Trading. J. Finance 60(4), 1825–1863 (2005). https://doi.org/10.1111/j.1540-6261.2005.00781.x
Cai, J., Rosenbaum, M., Tankov, P.: Asymptotic lower bounds for optimal tracking: A linear programming approach. Ann. Appl. Probab. 27(4), 2455–2514 (2017). https://doi.org/10.1214/16-AAP1264
Cardaliaguet, P., Lehalle, C.-A.: Mean field game of controls and an application to trade crowding. Math. Financial Economics 12(3), 335–363 (2018). https://doi.org/10.1007/s11579-017-0206-z
Carlin, B.I., Lobo, M.S., Viswanathan, S.: Episodic liquidity crises: Cooperative and predatory trading. J. Finance 62(5), 2235–2274 (2007). https://doi.org/10.1111/j.1540-6261.2007.01274.x
Carmona, R., Yang, J.: Predatory Trading: a Game on Volatility and Liquidity, (2008). https://carmona.princeton.edu/download/fe/PredatoryTradingGameQF.pdf
Cartea, Á., Jaimungal, S.: A closed-form execution strategy to target volume weighted average price. SIAM J. Financial Math. 7(1), 760–785 (2016). https://doi.org/10.1137/16M1058406
Casgrain, P., Jaimungal, S.: Mean Field Games with Partial Information for Algorithmic Trading, (2018). arXiv:1803.04094
Casgrain, P., Jaimungal, S.: Mean-field games with differing beliefs for algorithmic trading. Math. Finance 30(3), 995–1034 (2020). https://doi.org/10.1111/mafi.12237
Chu, C.S., Lehnert, A., Passmore, W.: Strategic trading in multiple assets and the effects on market volatility. Int. J. Central Banking 5(4), 143–172 (2009)
Drapeau, S., Luo, P., Schied, A., Xiong, D.: An FBSDE approach to market impact games with stochastic parameters. Probability, Uncertainty Quantitative Risk 6(3), 237–260 (2021)
Ekeland, I., Témam, R.: Convex Analysis and Variational Problems. Soc. Ind. Appl. Math. (1999). https://doi.org/10.1137/1.9781611971088
Ekren, I., Nadtochiy, S.: Utility-based pricing and hedging of contingent claims in Almgren-Chriss model with temporary price impact. Math. Finance 32(1), 172–225 (2022). https://doi.org/10.1111/mafi.12330
Evangelista, D., Thamsten, Y.: On finite population games of optimal trading, (2020). arXiv:2004.00790
Fu, G., Horst, U.: Mean-field leader-follower games with terminal state constraint. SIAM J. Control Optim. 58(4), 2078–2113 (2020). https://doi.org/10.1137/19M1241878
Fu, G., Horst, U., Xia, X.: Portfolio liquidation games with self-exciting order flow, (2020). arXiv:2011.05589
Fu, G., Graewe, P., Horst, U., Popier, A.: A mean field game of optimal portfolio liquidation. Math. Oper. Res. 46(4), 1250–1281 (2021). https://doi.org/10.1287/moor.2020.1094
Horst, U., Naujokat, F.: When to cross the spread? Trading in Two-Sided Limit Order Books. SIAM J. Financial Math. 5(1), 278–315 (2014). https://doi.org/10.1137/110849341
Huang, X., Jaimungal, S., Nourian, M.: Mean-field game strategies for optimal execution. Appl. Math. Finance 26(2), 153–185 (2019). https://doi.org/10.1080/1350486X.2019.1603183
Luo, X., Schied, A.: Nash equilibrium for risk-averse investors in a market impact game with transient price impact. Market Microstructure Liquidity 05(01n04), 2050001 (2019). https://doi.org/10.1142/S238262662050001X
Moallemi, C.C., Park, B., Van Roy, B.: Strategic execution in the presence of an uninformed arbitrageur. J. Financial Markets 15(4), 361–391 (2012)
Naujokat, F., Westray, N.: Curve following in illiquid markets. Math. Financial Economics 4(4), 299–335 (2011). https://doi.org/10.1007/s11579-011-0042-5
Neuman, E., Voß, M.: Trading with the Crowd, (2021). arXiv:2106.09267
Rogers, L.C.G., Singh, S.: The cost of illiquidity and its effects on hedging. Math. Finance 20(4), 597–615 (2010). https://doi.org/10.1111/j.1467-9965.2010.00413.x
Schied, A., Zhang, T.: A state-constrained differential game arising in optimal portfolio liquidation. Math. Finance 27(3), 779–802 (2017). https://doi.org/10.1111/mafi.12108
Schied, A., Zhang, T.: A market impact game under transient price impact. Math. Oper. Res. 44(1), 102–121 (2019). https://doi.org/10.1287/moor.2017.0916
Schied, A., Strehle, E., Zhang, T.: High-frequency limit of Nash equilibria in a market impact game with transient price impact. SIAM J. Financial Math. 8(1), 589–634 (2017). https://doi.org/10.1137/16M107030X
Schöneborn, T.: Trade execution in illiquid markets: Optimal stochastic control and multi-agent equilibria. PhD thesis, Technische Universität Berlin, (2008)
Schöneborn, T., Schied, A.: Liquidation in the Face of Adversity: Stealth vs. Sunshine Trading, (2009). https://papers.ssrn.com/sol3/papers.cfm?abstract_id=1007014
Strehle, E.: Optimal execution in a multiplayer model of transient price impact. Market Microstructure Liquidity 3(4), 1850007 (2017). https://doi.org/10.1142/S2382626618500077


Acknowledgements

I am grateful to Jean-Pierre Fouque for encouraging and illuminating discussions. The paper has also profoundly benefited from the valuable comments and suggestions of the anonymous referee and the Editor-in-Chief Ulrich Horst.

Author information

Authors and Affiliations

Department of Mathematics, University of California Los Angeles, Los Angeles, CA, 90095, USA
Moritz Voß


Corresponding author

Correspondence to Moritz Voß.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

Since the Proof of Theorem 3.5 is a verification of a proposed Nash equilibrium, we briefly explain for the reader’s convenience how the candidate Nash equilibrium strategies $({\hat{\alpha }}^1,{\hat{\alpha }}^2)$ provided in (21) can be constructed. Suppose we replace the constrained optimization problems in (6) and (7) by their unconstrained versions

$$\begin{aligned} J^{1,n}(\alpha ^1;\alpha ^2) \triangleq&\, J^1(\alpha ^1;\alpha ^2) + \frac{n}{2} \, {\mathbb {E}}[(X^1_T - \Xi ^1_T)^2] \rightarrow \min _{\alpha ^1 \in {\mathscr {A}}}, \end{aligned}$$

(72)

$$\begin{aligned} J^{2,n}(\alpha ^2;\alpha ^1) \triangleq&\, J^2(\alpha ^2;\alpha ^1) + \frac{n}{2} \, {\mathbb {E}}[(X^2_T - \Xi ^2_T)^2] \rightarrow \min _{\alpha ^2 \in {\mathscr {A}}} \end{aligned}$$

(73)

with some penalty parameter $n \in {\mathbb {N}}$. Then, along the same lines as Lemmas 3.1, 3.2, 3.3 and 3.4 above, solving (72) and (73) simultaneously amounts to solving the following coupled FBSDE system

$$\begin{aligned} \left\{ \begin{aligned} dX^1_t =&\; \alpha ^1_t dt, \qquad X^1_0 = x^1, \\ dX^2_t =&\; \alpha ^2_t dt, \qquad X^2_0 = x^2, \\ d\alpha ^1_t =&\, \frac{\sigma }{\lambda } (X^1_t - \xi ^1_t) dt - \frac{\gamma }{2\lambda } \alpha ^2_t dt - \frac{1}{2} d\alpha ^2_t + d{\tilde{M}}^1_t, \\ \alpha ^1_T =&\, -\frac{n}{\lambda } (X^1_T - \Xi ^1_T) - \frac{1}{2} \alpha ^2_T -\frac{\gamma }{2\lambda } (X^2_T - x^2), \\ d\alpha ^2_t =&\; \frac{\sigma }{\lambda } (X^2_t - \xi ^2_t) dt - \frac{\gamma }{2\lambda } \alpha ^1_t dt - \frac{1}{2} d\alpha ^1_t + d{\tilde{M}}^2_t, \\ \alpha ^2_T =&\, -\frac{n}{\lambda } (X^2_T - \Xi ^2_T) - \frac{1}{2} \alpha ^1_T -\frac{\gamma }{2\lambda } (X^1_T - x^1) \end{aligned} \right. \end{aligned}$$

(74)

for two suitable square integrable martingales $({\tilde{M}}^1_t)_{0 \le t \le T}$ and $({\tilde{M}}^2_t)_{0 \le t \le T}$. The system in (74) can be decoupled by adding and subtracting both forward and backward equations to obtain the two autonomous systems

$$\begin{aligned} \left\{ \begin{aligned} d(X^1_t+X^2_t) =&\; (\alpha ^1_t + \alpha ^2_t) dt, \qquad X^1_0 + X^2_0 = x^1 + x^2, \\ d(\alpha ^1_t + \alpha ^2_t) =&\, \frac{2\sigma }{3\lambda } \left( (X^1_t+X^2_t) - (\xi ^1_t+\xi ^2_t) \right) dt - \frac{\gamma }{3\lambda } (\alpha ^1_t+\alpha ^2_t) dt \!+\! \frac{2}{3} d({\tilde{M}}^1_t \!\!+\!\! {\tilde{M}}^2_t), \\ \alpha ^1_T + \alpha ^2_T =&\, -\frac{2n}{3\lambda } \left( (X^1_T + X^2_T) - (\Xi ^1_T+\Xi ^2_T) \right) -\frac{\gamma }{3\lambda } \left( (X^1_T + X^2_T) - (x^1 + x^2) \right) , \end{aligned} \right. \end{aligned}$$

(75)

and

$$\begin{aligned} \left\{ \begin{aligned} d(X^1_t-X^2_t) =&\; (\alpha ^1_t - \alpha ^2_t) dt, \qquad X^1_0 - X^2_0 = x^1 - x^2, \\ d(\alpha ^1_t - \alpha ^2_t) \!=\!&\, \frac{2\sigma }{\lambda } \left( (X^1_t-X^2_t) - (\xi ^1_t-\xi ^2_t) \right) dt \!+\! \frac{\gamma }{\lambda } (\alpha ^1_t-\alpha ^2_t) dt \!+\! 2 d({\tilde{M}}^1_t - {\tilde{M}}^2_t), \\ \alpha ^1_T - \alpha ^2_T =&\, -\frac{2n}{\lambda } \left( (X^1_T - X^2_T) - (\Xi ^1_T-\Xi ^2_T) \right) + \frac{\gamma }{\lambda } \left( (X^1_T - X^2_T) - (x^1 - x^2) \right) . \end{aligned} \right. \end{aligned}$$

(76)
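
For the reader’s convenience, the summation step can be spelled out as follows; this is only a sketch that uses nothing beyond the equations in (74). Adding the two backward equations in (74) gives

$$\begin{aligned} d(\alpha ^1_t + \alpha ^2_t) =&\, \frac{\sigma }{\lambda } \left( (X^1_t+X^2_t) - (\xi ^1_t+\xi ^2_t) \right) dt - \frac{\gamma }{2\lambda } (\alpha ^1_t+\alpha ^2_t) dt \\&- \frac{1}{2} d(\alpha ^1_t + \alpha ^2_t) + d({\tilde{M}}^1_t + {\tilde{M}}^2_t). \end{aligned}$$

Collecting the terms $d(\alpha ^1_t + \alpha ^2_t)$ on the left-hand side produces the prefactor $3/2$, and multiplying by $2/3$ yields the second line of (75). Subtracting the two backward equations instead flips the sign of both coupling terms, so that the prefactor becomes $1/2$; multiplying by $2$ then gives the second line of (76). The terminal conditions are obtained in the same way.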

The decoupled FBSDEs in (75) and (76) are linear. To solve them, we make a linear ansatz of the following form

$$\begin{aligned} \lambda (\alpha ^1_t + \alpha ^2_t) = b^{+,n}_t - c^{+,n}_t (X^1_t+X^2_t),\quad \lambda (\alpha ^1_t - \alpha ^2_t) = b^{-,n}_t - c^{-,n}_t (X^1_t-X^2_t). \end{aligned}$$

(77)

Plugging this ansatz into (75) and (76), respectively, and comparing coefficients yields two deterministic Riccati equations for $c^{+,n}$ and $c^{-,n}$ given by

$$\begin{aligned} \begin{aligned} (c^{+,n}_t)' =&\, \frac{(c^{+,n}_t)^2}{\lambda } - \frac{\gamma }{3\lambda } c^{+,n}_t - \frac{2}{3} \sigma , \quad c^{+,n}_T = \frac{1}{3} (2 n + \gamma ), \\ (c^{-,n}_t)' =&\, \frac{(c^{-,n}_t)^2}{\lambda } + \frac{\gamma }{\lambda } c^{-,n}_t - 2\sigma , \quad c^{-,n}_T = (2 n - \gamma ); \end{aligned} \end{aligned}$$

(78)

as well as two linear BSDEs for $b^{+,n}$ and $b^{-,n}$ given by

$$\begin{aligned} \begin{aligned} db^{+,n}_t =&\, \left( \left( \frac{c^{+,n}_t}{\lambda } - \frac{\gamma }{3\lambda } \right) b^{+,n}_t - \frac{2\sigma }{3} (\xi ^1_t + \xi ^2_t) \right) dt - \frac{2\lambda }{3} d({\tilde{M}}^1_t + {\tilde{M}}^2_t), \\ b^{+,n}_T =&\, \frac{2n}{3} (\Xi ^1_T + \Xi ^2_T) + \frac{\gamma }{3} (x^1+x^2), \\ db^{-,n}_t =&\, \left( \left( \frac{c^{-,n}_t}{\lambda } + \frac{\gamma }{\lambda } \right) b^{-,n}_t - 2\sigma (\xi ^1_t - \xi ^2_t) \right) dt - 2\lambda d({\tilde{M}}^1_t - {\tilde{M}}^2_t), \\ b^{-,n}_T =&\, 2n (\Xi ^1_T - \Xi ^2_T) - \gamma (x^1-x^2). \end{aligned} \end{aligned}$$

(79)
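
To indicate where (78) and the drift terms in (79) come from, here is a sketch of the coefficient comparison for the sum system; the shorthand $Y_t \triangleq X^1_t + X^2_t$ is introduced only for this illustration. Differentiating the ansatz (77) and using $dY_t = \frac{1}{\lambda } (b^{+,n}_t - c^{+,n}_t Y_t) dt$ gives

$$\begin{aligned} \lambda \, d(\alpha ^1_t + \alpha ^2_t) = db^{+,n}_t - (c^{+,n}_t)' \, Y_t \, dt - \frac{c^{+,n}_t}{\lambda } \left( b^{+,n}_t - c^{+,n}_t Y_t \right) dt, \end{aligned}$$

whereas by (75) and (77) the $dt$-part of the same quantity equals

$$\begin{aligned} \frac{2\sigma }{3} \left( Y_t - (\xi ^1_t+\xi ^2_t) \right) dt - \frac{\gamma }{3\lambda } \left( b^{+,n}_t - c^{+,n}_t Y_t \right) dt. \end{aligned}$$

Equating the coefficients of $Y_t \, dt$ yields the first Riccati equation in (78), and the remaining $dt$-terms give the drift of $b^{+,n}$ in (79). The terminal conditions follow by matching the last line of (75) with the ansatz at time $T$, and the difference system (76) is treated analogously.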

The ODEs in (78) can be solved in closed form with solutions

$$\begin{aligned} c^{+,n}_t = \frac{1}{6} \gamma + \frac{1}{3} \sqrt{\delta ^+} \frac{e^{\frac{2\sqrt{\delta ^+}}{3\lambda } (T-t)} \kappa ^+_n - 1}{e^{\frac{2\sqrt{\delta ^+}}{3\lambda } (T-t)} \kappa ^+_n + 1}, \quad c^{-,n}_t = -\frac{\gamma }{2} + \sqrt{\delta ^-} \frac{e^{\frac{2\sqrt{\delta ^-}}{\lambda } (T-t)} \kappa ^-_n - 1}{e^{\frac{2\sqrt{\delta ^-}}{\lambda } (T-t)} \kappa ^-_n + 1}, \end{aligned}$$

(80)

where $\kappa ^+_n \triangleq \frac{2\sqrt{\delta ^+} + \gamma + 4n}{2\sqrt{\delta ^+}-\gamma -4n}$ and $\kappa ^-_n \triangleq \frac{2\sqrt{\delta ^-} - \gamma + 4n}{2\sqrt{\delta ^-}+\gamma -4n}$ (with $\delta ^+, \delta ^-$ introduced in (18)). Also the linear BSDEs in (79) have explicit solutions given by

$$\begin{aligned} \begin{aligned} b^{+,n}_t&= \; {\mathbb {E}}\left[ \left( \frac{2n}{3} (\Xi ^1_T + \Xi ^2_T) + \frac{\gamma }{3} (x^1+x^2) \right) e^{-\int _t^T \big ( \frac{c^{+,n}_s}{\lambda } - \frac{\gamma }{3\lambda } \big ) ds} \right. \\&\quad + \left. \int _t^T \frac{2\sigma }{3} (\xi ^1_s + \xi ^2_s) e^{-\int _t^s \big ( \frac{c^{+,n}_u}{\lambda } - \frac{\gamma }{3\lambda } \big ) du} \, ds \; \bigg \vert \; {\mathscr {F}}_t \right] , \\ b^{-,n}_t =&\; {\mathbb {E}}\left[ \left( 2n (\Xi ^1_T - \Xi ^2_T) - \gamma (x^1-x^2) \right) e^{-\int _t^T \big ( \frac{c^{-,n}_s}{\lambda } + \frac{\gamma }{\lambda } \big ) ds} \right. \\&\quad + \left. \int _t^T 2\sigma (\xi ^1_s - \xi ^2_s) e^{-\int _t^s \big ( \frac{c^{-,n}_u}{\lambda } + \frac{\gamma }{\lambda } \big ) du} \, ds \; \bigg \vert \; {\mathscr {F}}_t \right] . \end{aligned} \end{aligned}$$

(81)
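
The closed-form expressions in (80) can be sanity-checked numerically against the Riccati equations in (78). The following Python sketch does this with a hand-written Runge-Kutta scheme run backward from $T$. Since (18) is not reproduced in this appendix, the code assumes $\delta ^+ = \gamma ^2/4 + 6\lambda \sigma $ and $\delta ^- = \gamma ^2/4 + 2\lambda \sigma $, which are the values forced by substituting (80) into (78); the parameter values and all function names are illustrative choices, not taken from the paper.

```python
import math

def sqrt_deltas(lam, gamma, sigma):
    # Assumption: delta^+ and delta^- as implied by plugging (80) into (78);
    # they are meant to coincide with (18), which is not restated here.
    return (math.sqrt(gamma**2 / 4 + 6 * lam * sigma),
            math.sqrt(gamma**2 / 4 + 2 * lam * sigma))

def c_plus_closed(t, T, lam, gamma, sigma, n):
    """Closed form for c^{+,n}_t as in (80)."""
    root, _ = sqrt_deltas(lam, gamma, sigma)
    kappa = (2 * root + gamma + 4 * n) / (2 * root - gamma - 4 * n)
    e = math.exp(2 * root / (3 * lam) * (T - t)) * kappa
    return gamma / 6 + root / 3 * (e - 1) / (e + 1)

def c_minus_closed(t, T, lam, gamma, sigma, n):
    """Closed form for c^{-,n}_t as in (80)."""
    _, root = sqrt_deltas(lam, gamma, sigma)
    kappa = (2 * root - gamma + 4 * n) / (2 * root + gamma - 4 * n)
    e = math.exp(2 * root / lam * (T - t)) * kappa
    return -gamma / 2 + root * (e - 1) / (e + 1)

def rhs_plus(c, lam, gamma, sigma):
    # Right-hand side of the first Riccati equation in (78).
    return c * c / lam - gamma / (3 * lam) * c - 2 * sigma / 3

def rhs_minus(c, lam, gamma, sigma):
    # Right-hand side of the second Riccati equation in (78).
    return c * c / lam + gamma / lam * c - 2 * sigma

def rk4_backward(rhs, c_T, T, steps, lam, gamma, sigma):
    """Integrate c' = rhs(c) from t = T down to t = 0 with classical RK4."""
    h = -T / steps  # negative step: we move backward in time
    c = c_T
    for _ in range(steps):
        k1 = rhs(c, lam, gamma, sigma)
        k2 = rhs(c + 0.5 * h * k1, lam, gamma, sigma)
        k3 = rhs(c + 0.5 * h * k2, lam, gamma, sigma)
        k4 = rhs(c + h * k3, lam, gamma, sigma)
        c += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    return c

# Illustrative parameters. Note that kappa is undefined when its
# denominator vanishes (for these values this happens at n = 1).
lam, gamma, sigma, T, n = 1.0, 1.0, 1.0, 1.0, 10

num_plus = rk4_backward(rhs_plus, (2 * n + gamma) / 3, T, 20000, lam, gamma, sigma)
num_minus = rk4_backward(rhs_minus, 2 * n - gamma, T, 20000, lam, gamma, sigma)
print(num_plus, c_plus_closed(0.0, T, lam, gamma, sigma, n))
print(num_minus, c_minus_closed(0.0, T, lam, gamma, sigma, n))
```

Each printed pair should agree up to discretization error, which supports both the terminal values in (78) and the form of $\kappa ^\pm _n$.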

Putting everything together with the ansatz in (77), we obtain (for every $n \in {\mathbb {N}}$) a pair $(\alpha ^1, \alpha ^2)$ of candidate solutions which simultaneously solve (72) and (73), namely

$$\begin{aligned} \begin{aligned} \alpha ^1_t =&\, \frac{1}{2\lambda } \left( \lambda (\alpha ^1_t + \alpha ^2_t) + \lambda (\alpha ^1_t - \alpha ^2_t) \right) \\ =&\, \frac{c^{+,n}_t+c^{-,n}_t}{2\lambda } \left( \frac{b^{+,n}_t + b^{-,n}_t}{c^{+,n}_t+c^{-,n}_t} - \frac{c^{+,n}_t-c^{-,n}_t}{c^{+,n}_t+c^{-,n}_t} X^{2}_t - X^{1}_t \right) , \\ \alpha ^{2}_t =&\, \frac{1}{2\lambda } \left( \lambda (\alpha ^1_t + \alpha ^2_t) - \lambda (\alpha ^1_t - \alpha ^2_t) \right) \\ =&\, \frac{c^{+,n}_t+c^{-,n}_t}{2\lambda } \left( \frac{b^{+,n}_t - b^{-,n}_t}{c^{+,n}_t+c^{-,n}_t} - \frac{c^{+,n}_t-c^{-,n}_t}{c^{+,n}_t+c^{-,n}_t} X^1_t - X^2_t \right) . \end{aligned} \end{aligned}$$

(82)
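
The role of the penalty parameter can be illustrated in a deterministic special case. The sketch below assumes $\xi ^1 = \xi ^2 \equiv 0$ and $\Xi ^1_T = \Xi ^2_T = 0$, so that the conditional expectations in (81) are trivial, and solves the sum and difference systems via the ansatz (77) on a time grid. As before, $\delta ^\pm $ are assumed to take the values implied by (78) and (80), and the parameters, step count and function names are hypothetical choices made only for this example.

```python
import math

LAM, GAMMA, SIGMA, T = 1.0, 1.0, 1.0, 1.0            # illustrative values
ROOT_P = math.sqrt(GAMMA**2 / 4 + 6 * LAM * SIGMA)   # assumed sqrt(delta^+)
ROOT_M = math.sqrt(GAMMA**2 / 4 + 2 * LAM * SIGMA)   # assumed sqrt(delta^-)

def c_plus(t, n):
    # c^{+,n}_t from (80)
    kappa = (2 * ROOT_P + GAMMA + 4 * n) / (2 * ROOT_P - GAMMA - 4 * n)
    e = math.exp(2 * ROOT_P / (3 * LAM) * (T - t)) * kappa
    return GAMMA / 6 + ROOT_P / 3 * (e - 1) / (e + 1)

def c_minus(t, n):
    # c^{-,n}_t from (80)
    kappa = (2 * ROOT_M - GAMMA + 4 * n) / (2 * ROOT_M + GAMMA - 4 * n)
    e = math.exp(2 * ROOT_M / LAM * (T - t)) * kappa
    return -GAMMA / 2 + ROOT_M * (e - 1) / (e + 1)

def solve_linear(c, rate_shift, b_T, z0, steps):
    """Return z_T for lam * z' = b - c * z with z_0 = z0, where
    b_t = b_T * exp(-int_t^T (c_s / lam + rate_shift) ds), cf. (81) with xi = 0."""
    h = T / steps
    cs = [c(k * h) for k in range(steps + 1)]
    rates = [ck / LAM + rate_shift for ck in cs]
    bs = [0.0] * (steps + 1)
    bs[steps] = b_T
    integral = 0.0
    for k in range(steps - 1, -1, -1):       # trapezoid rule, from T backward
        integral += 0.5 * h * (rates[k] + rates[k + 1])
        bs[k] = b_T * math.exp(-integral)
    z = z0
    for k in range(steps):                   # Heun's method, forward in time
        f0 = (bs[k] - cs[k] * z) / LAM
        f1 = (bs[k + 1] - cs[k + 1] * (z + h * f0)) / LAM
        z += 0.5 * h * (f0 + f1)
    return z

def terminal_positions(n, x1=1.0, x2=0.0, steps=20000):
    """Terminal inventories (X^1_T, X^2_T) of the penalized problem."""
    y_T = solve_linear(lambda t: c_plus(t, n), -GAMMA / (3 * LAM),
                       GAMMA / 3 * (x1 + x2), x1 + x2, steps)
    d_T = solve_linear(lambda t: c_minus(t, n), GAMMA / LAM,
                       -GAMMA * (x1 - x2), x1 - x2, steps)
    return 0.5 * (y_T + d_T), 0.5 * (y_T - d_T)

for n in (10, 100, 1000):
    print(n, terminal_positions(n))
```

In this setup both terminal inventories should move towards the target value $0$ as $n$ grows, in line with the passage to the limit described next.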

Since all terms in (82) can be explicitly computed, one can identify the limit in (82) as the penalty parameter $n$ in (72) and (73) goes to infinity. This yields $({\hat{\alpha }}^1,{\hat{\alpha }}^2)$ in (21), a candidate for the Nash equilibrium strategies for the original constrained stochastic differential game from Sect. 2. It then only remains to show that $({\hat{\alpha }}^1,{\hat{\alpha }}^2)$ is indeed the unique Nash equilibrium and belongs to ${\mathscr {A}}^1 \times {\mathscr {A}}^2$. This verification is carried out in the Proof of Theorem 3.5 in Sect. 3 above.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.


About this article

Cite this article

Voß, M. A two-player portfolio tracking game. Math Finan Econ 16, 779–809 (2022). https://doi.org/10.1007/s11579-022-00324-6


Received: 20 February 2021
Accepted: 08 July 2022
Published: 26 July 2022
Issue Date: October 2022
DOI: https://doi.org/10.1007/s11579-022-00324-6
