Abstract
We study the competition of two strategic agents for liquidity in the benchmark portfolio tracking setup of Bank et al. (Math Financial Economics 11(2):215–239 2017). Specifically, both agents track their own stochastic running trading targets while interacting through common aggregated temporary and permanent price impact à la Almgren and Chriss (J Risk 3:5–39 2001). The resulting stochastic linear quadratic differential game with terminal state constraints allows for a unique and explicitly available open-loop Nash equilibrium. Our results reveal how the equilibrium strategies of the two players take into account the other agent’s trading targets: either in an exploitative intent or by providing liquidity to the competitor, depending on the relation between temporary and permanent price impact. As a consequence, different behavioral patterns can emerge as optimal in equilibrium. These insights complement and extend existing studies in the literature on predatory trading models examined in the context of optimal portfolio liquidation games.
1 Introduction
In recent years, the study of so-called price impact games (also referred to as market impact games) in the context of optimal portfolio liquidation problems has attracted considerable attention in the financial mathematics literature. These games investigate the strategic interaction of financial agents who simultaneously trade in the same risky asset in order to cost-efficiently liquidate their positions while affecting the asset’s execution price through jointly generated price impact, that is, by influencing the price in an adverse manner when they execute their buy or sell orders. These price impact games provide a tractable way to formalize the competition between agents for a risky asset’s liquidity. Among the first game-theoretic approaches carried out to investigate possible phenomena in a competitive equilibrium where agents seek to liquidate their positions in the same risky asset are, e.g., Brunnermeier and Pedersen [6], Attari et al. [4], Carlin et al. [9], Schöneborn [32], Schöneborn and Schied [33], Carmona and Yang [10], and Schied and Zhang [29].
Our goal in this paper is to extend these works by formulating and studying the competition between two strategic agents for liquidity when both agents are trading simultaneously in an illiquid risky asset affected by price impact, because each agent seeks to track their own exogenously given stochastic target strategy such as a frictionless delta hedge to dynamically hedge the fluctuations of their random endowments. Single-agent optimal tracking problems in the presence of price impact were first considered by Rogers and Singh [28], Naujokat and Westray [26], Horst and Naujokat [22], and Cartea and Jaimungal [11]. To the best of our knowledge, the present manuscript is the first to study a dynamic tracking problem in a competitive two-player price impact game setting. Specifically, we extend the single-player cost optimal benchmark portfolio tracking problem studied in Bank et al. [5] in the presence of temporary and permanent price impact as proposed by Almgren and Chriss [2] to a two-player stochastic differential game. Both strategic agents are fully aware of the opponent’s individual tracking objectives and they compete for available liquidity as the jointly caused price impact on the execution price directly feeds into their trading performances. We also allow for individual stochastic terminal state constraints on each agent’s final portfolio position. Our aim is to shed light on the strategic interplay between the agents and to make transparent how each agent takes into account the other agent’s trading targets in an optimal cost-minimizing manner by solving for a Nash equilibrium in this two-player price impact game.
The paper most closely related to ours is Schied and Zhang [29]. Therein, the authors determine a unique open-loop Nash equilibrium within the class of deterministic strategies of agents aiming to liquidate a given asset position by maximizing a mean-variance criterion in an Almgren and Chriss [2] framework. Their study is an extension of the corresponding deterministic differential game solved in Carlin et al. [9] of liquidating risk-neutral agents who maximize expected revenues. Other extensions of the latter game include, e.g., Schöneborn and Schied [33], Carmona and Yang [10], Moallemi et al. [25], and Chu et al. [14]. In contrast to these papers, which focus on optimal portfolio liquidation only, we additionally allow the agents to track their own general predictable target strategies as in the single-player case investigated in [5]. Moreover, facing the same time horizon, the players’ terminal portfolio positions are also restricted to some exogenously predetermined stochastic levels which are revealed gradually over time. As a consequence, both agents will choose their dynamic trading strategies from a suitable set of adapted stochastic processes rather than opting for static strategies from a set of deterministic functions as in the papers cited above (except for the numerical study in [10]).
Other recent work on both finite-player and infinite-player mean field price impact games with Almgren-Chriss type price impact includes, e.g., Cardaliaguet and Lehalle [8], Huang et al. [23], Casgrain and Jaimungal [12, 13], Fu et al. [21], Fu and Horst [19], Evangelista and Thamsten [18], and Drapeau et al. [15], where finitely and infinitely many agents pursue optimal liquidation of their initial positions and interact through common aggregated permanent and temporary price impact. Price impact games of liquidating agents in a market model with transient price impact are analyzed, e.g., in Luo and Schied [24], Schied and Zhang [30], Schied et al. [31], Strehle [34]; and very recently in Fu et al. [20] and Neuman and Voß [27]. However, these works are all portfolio liquidation games where the agents steer their initial portfolio positions towards zero (with strict liquidation constraints enforced in [18,19,20,21]). In particular, the agents neither track any individual stochastic running trading targets nor do they aim for reaching an individual random terminal target position. In contrast, as mentioned above, our present study formulates and solves a two-player price impact portfolio tracking game with random terminal state constraints between two heterogeneous agents who have their own individual trading targets.
Our main result is an explicit description of a unique open-loop Nash equilibrium within the class of progressively measurable strategies to our two-player stochastic differential game, where both agents track their own target strategies as in [5] and interact through temporary and permanent price impact as in [29] and [9]. Mathematically, we solve a linear quadratic stochastic differential game with random terminal state constraints. Inspired by the analysis in [5], we follow a probabilistic and convex-analytic approach in the style of Pontryagin’s stochastic maximum principle. This also allows us to consider general predictable strategies as the agents’ tracking targets and not necessarily Markovian or continuous diffusion-type processes. We prove uniqueness of the Nash equilibrium and derive its characterization, which takes the form of a four-dimensional coupled system of linear forward-backward stochastic differential equations (FBSDEs). Due to the stochastic terminal state constraints the FBSDE system has singular terminal conditions. As a consequence, explicitly computing a solution to the constrained stochastic differential game is a nontrivial task. The manuscript shows how this can be achieved. Solving the singular FBSDE system provides us with the agents’ optimal trading strategies in equilibrium in closed form and unveils a rich phenomenology for their optimal behaviour.
In fact, it turns out that in equilibrium, similar to the single-player solution presented in [5], both agents anticipate their individual running target portfolio by gradually trading in the direction of a weighted average of expected future target positions of the target strategy. However, being aware of the competitor’s tracking goals, each agent also assesses a weighted average of the expected future positions of the opponent’s target strategy and chooses to trade accordingly. Interestingly, the agents’ trading directions with respect to the adversary’s target strategy are not invariant but depend on the relation between temporary and permanent price impact. Conceptually, our explicit results extend the analysis carried out by Schöneborn and Schied [33]. Therein, the authors identify two distinct types of illiquid markets: a plastic market where the price impact is predominantly permanent, and an elastic market where the major part of incurred price impact is temporary. Their model predicts that a competitor who is conscious of the other agent’s liquidation intention engages in predatory trading in a plastic market (in the sense that the competitor partly trades in the same direction as her opponent), while she tends to cooperate and provides liquidity in an elastic market (in the sense that she trades in the opposite direction of her opponent’s trading); cf. also the detailed discussion in Schöneborn and Schied [33]. Our closed-form Nash equilibrium solution of our more general price impact portfolio tracking game corroborates this. The novelty of our contribution comes from the fact that both predation by simultaneously trading in the same direction as the opponent, as well as cooperation by trading in the opposite direction can occur in a coexisting manner, depending on whether the market is plastic or elastic. As a consequence, different behavioral paradigms can emerge as optimal in our Nash equilibrium; see the illustrations in Sect. 4.
The remainder of the paper is organized as follows. In Sect. 2 we introduce our two-player stochastic differential price impact game by extending the framework of Carlin et al. [9] and Schied and Zhang [29] to a stochastic tracking problem of general predictable target strategies and random terminal state constraints. Our main result, an explicit description of a unique open-loop Nash equilibrium of the game, is presented in Sect. 3. Section 4 contains some illustrations and discusses the qualitative behaviour of the two players’ optimal strategies in equilibrium.
Notation: Throughout this manuscript we use superscripts for enumerating purposes as, e.g., in \(X^1\), \(X^2\), \(\alpha ^1\), \(\alpha ^2\), or other quantities like \(\xi ^1\), \(\xi ^2\) etc., to mark all objects which are associated with player 1 and player 2, respectively; or, to itemize objects as \(w^1\), \(w^2\), \(w^3\) etc. In particular, \(X^2\), \(\alpha ^2\), \(\xi ^2\) is not to be confused with quadratic powers, which will be explicitly denoted with brackets like \((\alpha )^2\), or, if necessary, as \((\alpha ^2)^2\).
2 Problem formulation
Let \(T >0\) denote a finite deterministic time horizon and fix a filtered probability space \((\Omega ,{\mathscr {F}},({\mathscr {F}}_t)_{0 \le t \le T},{\mathbb {P}})\) satisfying the usual conditions of right continuity and completeness. We consider two agents (preferred pronouns she/her/hers and he/him/his, respectively) who are trading in a financial market consisting of one risky asset, e.g., a stock. The number of shares agent 1 and agent 2 are holding at time \(t \in [0,T]\) are defined, respectively, as
\[ X^1_t \triangleq x^1 + \int _0^t \alpha ^1_s \, ds, \qquad X^2_t \triangleq x^2 + \int _0^t \alpha ^2_s \, ds, \]
with initial positions \(x^1, x^2 \in {\mathbb {R}}\). The real-valued stochastic processes \((\alpha ^1_t)_{0 \le t \le T}\) and \((\alpha ^2_t)_{0 \le t \le T}\) represent the turnover rate at which each agent trades in the risky asset and belong to the general class of stochastic processes
We adopt the framework from Carlin et al. [9] and Schied and Zhang [29] and suppose that the agents’ trading incurs linear temporary and permanent price impact à la Almgren and Chriss [2] in the sense that trades in the risky asset are executed at prices
with some unaffected price process \(P_\cdot = P_0 + \sqrt{\sigma } W_\cdot \) following a Brownian motion \((W_t)_{0 \le t \le T}\) with respect to the underlying filtration with variance \(\sigma > 0\). The trading of both agents in the risky asset consumes available liquidity and instantaneously affects the execution price in (3) in an adverse manner through temporary price impact \(\lambda > 0\). In addition, the agents’ total accumulated trading activity also leaves a trace in the execution price which is captured by the permanent price impact parameter \(\gamma >0\).
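To make the roles of \(\sigma \), \(\lambda \) and \(\gamma \) concrete, the following is a minimal simulation sketch on a uniform time grid. The additive form of the execution price used here (unaffected price plus \(\lambda \) times the aggregate trading rate plus \(\gamma \) times the aggregate traded volume) is an assumption in the spirit of Almgren and Chriss [2] and is not copied from (3); the function name and all parameter values are hypothetical.

```python
import numpy as np

def simulate_execution_prices(alpha1, alpha2, T, p0, sigma, lam, gamma, rng):
    """Return (unaffected price, execution price) on the grid t_1, ..., t_n.

    Assumed form (illustrative, not taken from the paper's display (3)):
        S_t = P_t + lam * (alpha1_t + alpha2_t)
                  + gamma * ((X1_t - x1) + (X2_t - x2)).
    """
    alpha1 = np.asarray(alpha1, dtype=float)
    alpha2 = np.asarray(alpha2, dtype=float)
    n = alpha1.size
    dt = T / n
    # Unaffected price P = p0 + sqrt(sigma) * W with Brownian increments dW.
    dW = rng.normal(0.0, np.sqrt(dt), size=n)
    unaffected = p0 + np.sqrt(sigma) * np.cumsum(dW)
    # Aggregate traded volume (X1_t - x1) + (X2_t - x2).
    traded = np.cumsum(alpha1 + alpha2) * dt
    execution = unaffected + lam * (alpha1 + alpha2) + gamma * traded
    return unaffected, execution
```

Under this assumed form, joint buying pushes the execution price above the unaffected price, and without trading the two prices coincide.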
Similar to the single-agent setup in Bank et al. [5] we assume that agent 1 and agent 2 are trading in this illiquid risky asset because each agent seeks to track their own exogenously given target strategy \((\xi _t^1)_{0 \le t \le T}\) and \((\xi _t^2)_{0 \le t \le T}\), respectively. Both processes \(\xi ^1\) and \(\xi ^2\) are supposed to be real-valued predictable processes in \(L^2({\mathbb {P}}\otimes dt)\) and can be thought of, for instance, as hedging strategies adopted from a frictionless market. Moreover, the agents are also required to reach a predetermined terminal portfolio target position \(\Xi ^1_T\) and \(\Xi ^2_T\) in \(L^2({\mathbb {P}},{\mathscr {F}}_T)\) at time T. Mathematically, we can formalize their objectives as follows: For a given strategy \((\alpha ^2_t)_{0 \le t \le T}\) of her competitor agent 2, agent 1 aims to choose her trading rate \((\alpha ^1_t)_{0 \le t \le T}\) in order to minimize the cost functional
whereas agent 2 wishes to minimize
via his trading rate \((\alpha ^2_t)_{0 \le t \le T}\) in response to a given strategy \((\alpha ^1_t)_{0 \le t \le T}\) of his opponent agent 1. As in the single-agent problem in Bank et al. [5], the first term in (4) and (5) reflects the agents’ running after their individual target strategies \(\xi ^1\) and \(\xi ^2\), respectively, through minimizing the corresponding square deviation from their respective portfolio positions \(X^1\) and \(X^2\). The common weight parameter \(\sigma \) measures price fluctuations of the underlying unaffected price process. The second and third terms in (4) and (5) take into account the additional incurred linear-quadratic illiquidity costs which are induced by temporary and permanent price impact while both agents are trading in the risky asset as stipulated in (3) (see also Carlin et al. [9] and Schied and Zhang [29]). Note, however, that due to each agent’s individual terminal state constraint \(X^i_T = \Xi ^i_T\) \({\mathbb {P}}\)-a.s. (for \(i=1,2\)) only the competitor’s accrued permanent price impact feeds into their respective cost functional. Indeed, integration by parts yields that the \(i\)th agent’s permanent impact from their own trading always creates the same costs \(\gamma (X^i_T - x^i)^2=\gamma (\Xi ^i_T - x^i)^2\) independent of their chosen trading rate and therefore can be neglected in their own objective functional. We obtain the following individual optimal stochastic control problems for agent 1 and agent 2, namely,
for any fixed strategy \(\alpha ^2 \in {\mathscr {A}}^2\), and
for any fixed strategy \(\alpha ^1 \in {\mathscr {A}}^1\), where \({\mathscr {A}}^{i}\), \(i=1,2\), is the set of admissible constrained policies defined as
Similar to Bank et al. [5] we further assume that the target positions \(\Xi ^1_T, \Xi ^2_T \in L^2({\mathbb {P}},{\mathscr {F}}_T)\) satisfy
where \(M^+_t \triangleq {\mathbb {E}}[\Xi ^1_T + \Xi ^2_T \vert {\mathscr {F}}_t]\) and \(M^-_t \triangleq {\mathbb {E}}[\Xi ^1_T - \Xi ^2_T \vert {\mathscr {F}}_t]\) for \(0 \le t \le T\).
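The integration-by-parts observation used above, namely that an agent’s own permanent impact cost depends only on the terminal position and not on the path taken, can be checked numerically. The sketch below assumes that this cost has the form \(\kappa \int _0^T (X_t - x)\alpha _t \, dt\) for a generic constant \(\kappa \); the name `kappa` and its normalization are hypothetical, so the prefactor need not match the constant appearing in the text. With a midpoint rule the discrete sum telescopes exactly to \(\tfrac{\kappa }{2}(X_T - x)^2\).

```python
import numpy as np

def own_permanent_cost(alpha, dt, kappa):
    """Midpoint-rule value of kappa * int_0^T (X_t - x) * alpha_t dt.

    alpha: trading rates, constant on each grid cell of length dt.
    kappa: hypothetical impact constant (normalization is an assumption).
    """
    alpha = np.asarray(alpha, dtype=float)
    # Displacement X_t - x at the grid points t_0, ..., t_n.
    disp = np.concatenate(([0.0], np.cumsum(alpha) * dt))
    # Midpoint of the displacement on each cell.
    mid = 0.5 * (disp[:-1] + disp[1:])
    return kappa * np.sum(mid * alpha) * dt

rng = np.random.default_rng(0)
n, dt, kappa = 500, 1.0 / 500, 0.7
path_a = rng.normal(size=n)
# A second rate path with the same terminal position: add a zero-mean perturbation.
bump = rng.normal(size=n)
path_b = path_a + (bump - bump.mean())
terminal_disp = np.sum(path_a) * dt
print(own_permanent_cost(path_a, dt, kappa),
      own_permanent_cost(path_b, dt, kappa),
      0.5 * kappa * terminal_disp ** 2)
```

The three printed numbers agree up to rounding, which is the path-independence that allows this term to be dropped from each agent’s own objective once the terminal constraint is imposed.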
Remark 2.1

1. As in Carlin et al. [9] and Schied and Zhang [29] the agents’ individual optimization problems in (6) and (7) are intertwined through common aggregated temporary and permanent price impact affecting their performance functionals \(J^1\) and \(J^2\) in (4) and (5) (in contrast to, e.g., Huang et al. [23], Casgrain and Jaimungal [12, 13] or Ekren and Nadtochiy [17] where agents only interact through permanent or temporary price impact, respectively). One can think of both players as strategic agents who compete for liquidity while concurrently trading in a single illiquid risky asset to meet their tracking objectives for the purpose of, e.g., hedging fluctuations of random endowments. Note that both agents are fully aware of the opponent’s trading targets \(\xi ^i\) and \(\Xi _T^i\) (\(i=1,2\)), as well as the jointly caused price impact on the execution prices in (3). That is, our game is one of complete information as in the related studies in Brunnermeier and Pedersen [6], Carlin et al. [9], Schöneborn and Schied [33], Carmona and Yang [10], and Schied and Zhang [29].

2. For further motivation for the tracking cost functionals in (4) and (5) we refer to the single-player optimization problems studied, e.g., in Rogers and Singh [28], Naujokat and Westray [26], Horst and Naujokat [22], Almgren and Li [3], Bank et al. [5], and Cai et al. [7]. Observe that the square tracking error also incorporates a risk aversion on each player’s inventory. In this regard, both agents are homogeneous in their inventory risk.

3. Note that the coefficients \(\sigma , \lambda , \gamma > 0\) in the cost functionals in (4) and (5) are constants. This is an important assumption for obtaining a closed-form solution for the stochastic differential game, which is our primary focus of interest. In fact, the only sources of randomness in the game are the target strategies \((\xi ^1_t)_{0 \le t \le T}\), \((\xi ^2_t)_{0 \le t \le T}\) and the random terminal conditions \(\Xi ^1_T\), \(\Xi ^2_T\), which will force the agents’ optimal policies to be random processes as well.

4. Analogously to the study in Bank et al. [5] the assumption in (9) will ensure that \({\mathscr {A}}^i \ne \varnothing \) for \(i=1,2\) (cf. also the Proof of Theorem 3.5 in Sect. 3 below). In fact, for given random variables \(\Xi ^i_T \in L^2({\mathbb {P}},{\mathscr {F}}_T)\) only known at time T the terminal state constraint \(X^i_T = \Xi ^i_T\) \({\mathbb {P}}\)-a.s. (\(i=1,2\)) is quite demanding. Thus, loosely speaking, the condition in (9) requires that the speed at which information on the random ultimate target positions \(\Xi ^1_T\), \(\Xi ^2_T\) is revealed as \(t \uparrow T\) is sufficiently fast.
Our goal is to compute a Nash equilibrium in which both agents solve their minimization problems in (6) and (7) simultaneously, given the strategy of their competitor, in the following sense:
Definition 2.2
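A pair of admissible strategies \( ({\hat{\alpha }}^1,{\hat{\alpha }}^2) \in {\mathscr {A}}^1 \times {\mathscr {A}}^2\) is called an open-loop Nash equilibrium if for all admissible strategies \(\alpha ^1 \in {\mathscr {A}}^1\) and \(\alpha ^2 \in {\mathscr {A}}^2\) it holds that
\[ J^1({\hat{\alpha }}^1;{\hat{\alpha }}^2) \le J^1(\alpha ^1;{\hat{\alpha }}^2) \quad \text {and} \quad J^2({\hat{\alpha }}^2;{\hat{\alpha }}^1) \le J^2(\alpha ^2;{\hat{\alpha }}^1). \]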
In other words, in a Nash equilibrium neither player has an incentive to deviate from the chosen strategy.
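The no-profitable-deviation property in Definition 2.2 can be illustrated on a toy static analogue of the game. The sketch below is not the dynamic game of this paper: each player picks a single number, and the cost function (a quadratic tracking penalty plus a temporary-impact-style coupling term) is a hypothetical stand-in chosen only because its first-order conditions form a \(2 \times 2\) linear system.

```python
import numpy as np

def cost(a_own, a_other, target, sigma, lam):
    """Hypothetical static cost: tracking penalty plus shared impact term."""
    return 0.5 * sigma * (a_own - target) ** 2 + lam * a_own * (a_own + a_other)

def nash_equilibrium(targets, sigma, lam):
    """Solve both first-order conditions simultaneously.

    Stationarity for player i reads
        (sigma + 2*lam) * a_i + lam * a_j = sigma * target_i,
    and each cost is strictly convex in the player's own action.
    """
    A = np.array([[sigma + 2.0 * lam, lam],
                  [lam, sigma + 2.0 * lam]])
    b = sigma * np.asarray(targets, dtype=float)
    return np.linalg.solve(A, b)

sigma, lam = 1.0, 0.3
targets = (1.0, -0.5)
a1, a2 = nash_equilibrium(targets, sigma, lam)
deviations = np.linspace(-3.0, 3.0, 601)
# Unilateral deviations of player 1 never lower player 1's cost.
gain1 = cost(a1 + deviations, a2, targets[0], sigma, lam) - cost(a1, a2, targets[0], sigma, lam)
# Likewise for player 2.
gain2 = cost(a2 + deviations, a1, targets[1], sigma, lam) - cost(a2, a1, targets[1], sigma, lam)
print(gain1.min() >= -1e-12, gain2.min() >= -1e-12)
```

The same logic, fixing the opponent’s strategy and checking optimality of one’s own, is what the open-loop equilibrium concept requires in the dynamic setting, with strategies ranging over \({\mathscr {A}}^1\) and \({\mathscr {A}}^2\).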
Remark 2.3
In the special case of optimally liquidating the agents’ initial risky asset holdings \(x^1, x^2 \in {\mathbb {R}}\) without tracking exogenously given target strategies, i.e., \(\xi ^1 \equiv \xi ^2 \equiv 0\), and with nonrandom terminal target positions \(\Xi ^1_T = \Xi ^2_T = 0\) \({\mathbb {P}}\)-almost surely, the above formulated two-player (deterministic) differential game is solved in Carlin et al. [9] setting \(\sigma = 0\) in the performance functionals in (4) and (5); and in Schied and Zhang [29] allowing for \(\sigma > 0\) instead. In both studies, the authors obtain a unique open-loop Nash equilibrium in the sense of Definition 2.2 in closed form within the class of deterministic strategies.
3 Main result
Our main result is an explicit description of a unique open-loop Nash equilibrium in the sense of Definition 2.2 of the two-player stochastic differential game formulated in Sect. 2. Inspired by Bank et al. [5] we will use tools from convex analysis and simple calculus of variations arguments to derive the equilibrium strategies.
First, a strict convexity property of each player’s objective in (4) and (5) is established in the following
Lemma 3.1
For every \(\alpha ^2 \in {\mathscr {A}}^2\) fixed, the functional \(\alpha ^1 \mapsto J^1(\alpha ^1;\alpha ^2)\) in (4) is strictly convex in \(\alpha ^1 \in {\mathscr {A}}^1\). Similarly, for every \(\alpha ^1 \in {\mathscr {A}}^1\) fixed, the functional \(\alpha ^2 \mapsto J^2(\alpha ^2;\alpha ^1)\) in (5) is strictly convex in \(\alpha ^2 \in {\mathscr {A}}^2\).
Proof
We only show strict convexity of the first agent’s objective in (4). The reasoning for the second agent’s objective in (5) follows analogously. To this end, let \(\alpha ^2 \in {\mathscr {A}}^2\) be fixed. Consider \(\alpha ^1,{\tilde{\alpha }}^1 \in {\mathscr {A}}^1\) such that \(\alpha ^1 \ne {\tilde{\alpha }}^1\) \(d{\mathbb {P}}\otimes dt\text {-a.e. on } \Omega \times [0,T]\) and denote by \(X^1, {\tilde{X}}^1\) the corresponding share holdings. For every \(\varepsilon \in (0,1)\) it holds that \(\varepsilon \alpha ^1 + (1-\varepsilon ) {\tilde{\alpha }}^1 \in {\mathscr {A}}^1\) with share holdings \(X^{\varepsilon \alpha ^1 + (1-\varepsilon ) {\tilde{\alpha }}^1} = \varepsilon X^{1} + (1-\varepsilon ) {\tilde{X}}^1\). We have to show that
\[ J^1(\varepsilon \alpha ^1 + (1-\varepsilon ) {\tilde{\alpha }}^1;\alpha ^2) < \varepsilon J^1(\alpha ^1;\alpha ^2) + (1-\varepsilon ) J^1({\tilde{\alpha }}^1;\alpha ^2). \]
In fact, a straightforward computation reveals that
because \(\alpha ^1 \ne {\tilde{\alpha }}^1\) \(d{\mathbb {P}}\otimes ds\text {-a.e. on } \Omega \times [0,T]\). \(\square \)
As an important consequence we obtain
Lemma 3.2
There exists at most one Nash equilibrium in the sense of Definition 2.2.
Proof
We adapt the argument from Schied and Zhang [29, Lemma 4.1] (see also Schied et al. [31, Proposition 4.8]) to our stochastic differential game and prove the claim by contradiction. Specifically, assume that there exist two distinct Nash equilibria \(({\hat{\alpha }}^1,{\hat{\alpha }}^2)\) and \(({\tilde{\alpha }}^1,{\tilde{\alpha }}^2)\) in \({\mathscr {A}}^1 \times {\mathscr {A}}^2\), i.e.,
for all admissible strategies \(\alpha ^1 \in {\mathscr {A}}^1\) and \(\alpha ^2 \in {\mathscr {A}}^2\). Then we can define for all \(\varepsilon \in [0,1]\) the function
Note that due to Lemma 3.1 and the assumption that the two Nash equilibria \(({\hat{\alpha }}^1,{\hat{\alpha }}^2)\) and \(({\tilde{\alpha }}^1,{\tilde{\alpha }}^2)\) are distinct, the function \(f(\varepsilon )\) is strictly convex in \(\varepsilon \) on [0, 1]. Moreover, in light of (10) it has a unique minimum in \(\varepsilon = 0\). It follows that
Next, denoting the corresponding share holdings of \({\hat{\alpha }}^1\) and \({\tilde{\alpha }}^1\) with \({\hat{X}}^1\) and \({\tilde{X}}^1\), respectively, and noting that \(X^{\varepsilon {\tilde{\alpha }}^1 + (1-\varepsilon ) {\hat{\alpha }}^1} = \varepsilon {\tilde{X}}^1 + (1-\varepsilon ) {\hat{X}}^1\), we can compute
as well as the derivatives of the remaining three terms in (11) in a very similar manner in order to ultimately obtain
where \({\hat{X}}^2\) and \({\tilde{X}}^2\) denote the share holdings of \({\hat{\alpha }}^2\) and \({\tilde{\alpha }}^2\), respectively. Observing that integration by parts yields
because \({\tilde{X}}^i_0 = {\hat{X}}^i_0 = x^i\) and \({\hat{X}}^i_T = {\tilde{X}}^i_T = \Xi ^i_T\) for both \(i \in \{1,2\}\), we obtain
which is strictly negative because the two Nash equilibria \(({\hat{\alpha }}^1,{\hat{\alpha }}^2)\) and \(({\tilde{\alpha }}^1,{\tilde{\alpha }}^2)\) are distinct. But this contradicts (12). \(\square \)
Next, for any arbitrary but fixed controls \({\tilde{\alpha }}^2 \in {\mathscr {A}}^2\) and \({\tilde{\alpha }}^1 \in {\mathscr {A}}^1\), we can introduce the Gâteaux derivatives of the mappings \(\alpha ^1 \mapsto J^1(\alpha ^1;{\tilde{\alpha }}^2)\) at \(\alpha ^1 \in {\mathscr {A}}^1\) and \(\alpha ^2 \mapsto J^2(\alpha ^2;{\tilde{\alpha }}^1)\) at \(\alpha ^2 \in {\mathscr {A}}^2\), respectively, in any directions \(\beta ^1, \beta ^2 \in {\mathscr {A}}^0 \triangleq \{ \beta : \beta \in {\mathscr {A}}\text { satisfying } \int _0^T \beta _t dt = 0 \; {\mathbb {P}}\text {-a.s.}\}\), namely,
They allow for the following explicit expressions presented in
Lemma 3.3
Let \({\tilde{\alpha }}^2 \in {\mathscr {A}}^2\) be fixed with corresponding share holdings \({\tilde{X}}^2\). Then for all \(\alpha ^1 \in {\mathscr {A}}^1\) we have
for any \(\beta ^1 \in {\mathscr {A}}^0\). Similarly, let \({\tilde{\alpha }}^1 \in {\mathscr {A}}^1\) be fixed with corresponding share holdings \({\tilde{X}}^1\). Then for all \(\alpha ^2 \in {\mathscr {A}}^2\) we have
for any \(\beta ^2 \in {\mathscr {A}}^0\).
Proof
We only compute the Gâteaux derivative in (13). The same computations apply for (14). Fix \({\tilde{\alpha }}^2 \in {\mathscr {A}}^2\) with share holdings \({\tilde{X}}^2\) and let \(\alpha ^1 \in {\mathscr {A}}^1, \beta ^1 \in {\mathscr {A}}^0\) as well as \(\varepsilon > 0\). Note that \(\alpha ^1 +\varepsilon \beta ^1 \in {\mathscr {A}}^1\) with share holdings \(X^{\alpha ^1 +\varepsilon \beta ^1} = X^1 + \varepsilon \int _0^\cdot \beta ^1_s ds\). Moreover, since
we obtain the desired result in (13) after applying Fubini’s theorem. \(\square \)
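The role of Fubini’s theorem in this computation can be seen in a deterministic, discretized single-agent stand-in. The functional below, \(\alpha \mapsto \sum _k [\tfrac{1}{2}\sigma (X_k - \xi _k)^2 + \lambda \alpha _k^2] \, \Delta t\), is a simplified assumption and not the functional (4); it omits the competitor’s terms. Swapping the order of summation turns the tracking term of the directional derivative into an expression that is linear in \(\beta \) with a tail-sum weight, mirroring the structure obtained after Fubini.

```python
import numpy as np

def holdings(x0, alpha, dt):
    """Discrete share holdings X_k = x0 + dt * sum_{j <= k} alpha_j."""
    return x0 + np.cumsum(alpha) * dt

def cost(alpha, xi, x0, dt, sigma, lam):
    """Simplified tracking cost (assumed stand-in, not the paper's J^1)."""
    X = holdings(x0, alpha, dt)
    return np.sum(0.5 * sigma * (X - xi) ** 2 + lam * alpha ** 2) * dt

def gateaux_after_fubini(alpha, beta, xi, x0, dt, sigma, lam):
    """Directional derivative written as sum_j beta_j * (...)_j * dt.

    The weight on beta_j is 2*lam*alpha_j + sigma * sum_{k >= j} (X_k - xi_k) * dt,
    obtained by exchanging the order of the double sum.
    """
    X = holdings(x0, alpha, dt)
    tail = np.cumsum(((X - xi) * dt)[::-1])[::-1]
    return np.sum(beta * (2.0 * lam * alpha + sigma * tail)) * dt

rng = np.random.default_rng(0)
n = 200
dt = 1.0 / n
alpha = rng.normal(size=n)
raw = rng.normal(size=n)
beta = raw - raw.mean()            # direction with zero integral, as in A^0
xi = np.sin(np.linspace(0.0, 3.0, n))
eps = 1e-4
finite_diff = (cost(alpha + eps * beta, xi, 0.0, dt, 1.0, 0.5)
               - cost(alpha - eps * beta, xi, 0.0, dt, 1.0, 0.5)) / (2.0 * eps)
print(finite_diff, gateaux_after_fubini(alpha, beta, xi, 0.0, dt, 1.0, 0.5))
```

Because the stand-in functional is quadratic, the central difference reproduces the directional derivative up to rounding, so the two printed values should agree closely.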
Having at hand the explicit expressions in (13) and (14) we can now derive a necessary and sufficient first-order condition for the Nash equilibrium in terms of a system of coupled forward-backward stochastic differential equations (FBSDE).
Lemma 3.4
A pair of controls \(({\hat{\alpha }}^1,{\hat{\alpha }}^2) \in {\mathscr {A}}^1 \times {\mathscr {A}}^2\) is a Nash equilibrium in the sense of Definition 2.2 if and only if \(({\hat{X}}^1, {\hat{X}}^2, {\hat{\alpha }}^1,{\hat{\alpha }}^2)\) solve the following coupled forward-backward SDE system
for two suitable square integrable martingales \((M^1_t)_{0 \le t < T}\) and \((M^2_t)_{0 \le t < T}\).
Proof
Sufficiency: Assume first that \(({\hat{X}}^1, {\hat{X}}^2,{\hat{\alpha }}^1,{\hat{\alpha }}^2, M^1, M^2)\) with \(({\hat{\alpha }}^1,{\hat{\alpha }}^2) \in {\mathscr {A}}^1 \times {\mathscr {A}}^2\) solves the FBSDE system in (15). We have to show that \({\hat{\alpha }}^1\) minimizes \(\alpha ^1 \mapsto J^1(\alpha ^1;{\hat{\alpha }}^2)\) over \({\mathscr {A}}^1\), and, vice versa, that \({\hat{\alpha }}^2\) minimizes \(\alpha ^2 \mapsto J^2(\alpha ^2;{\hat{\alpha }}^1)\) over \({\mathscr {A}}^2\). Since we are minimizing strictly convex functionals due to Lemma 3.1, a sufficient condition for the optimality of \({\hat{\alpha }}^1\) and \({\hat{\alpha }}^2\), respectively, is given by
and
cf., e.g., Ekeland and Témam [16]. We start with the proof of (16). By assumption we have the representation
for some square integrable martingale \((M^1_t)_{0 \le t < T}\). Moreover, since \({\hat{\alpha }}^1, {\hat{\alpha }}^2, \xi ^1 \in L^2({\mathbb {P}}\otimes dt)\) it follows that \({\mathbb {E}}[\int _0^T (M_s^1)^2 ds] < \infty \). Next, introducing the square integrable martingale
and plugging the above representation of \({\hat{\alpha }}^1\) in the Gâteaux derivative in (13) we obtain
where we used the result from Bank et al. [5, Lemma 5.3] in the last line. Hence, as desired, we obtain that the first order optimality condition in (16) is satisfied by \({\hat{\alpha }}^1 \in {\mathscr {A}}^1\). In fact, the same computations apply to show that also \({\hat{\alpha }}^2 \in {\mathscr {A}}^2\) is satisfying the first order optimality condition in (17). Therefore, we can conclude that \(({\hat{\alpha }}^1,{\hat{\alpha }}^2) \in {\mathscr {A}}^1 \times {\mathscr {A}}^2\) is a Nash equilibrium in the sense of Definition 2.2.
Necessity: Finally, as shown in the Proof of Theorem 3.5 below (which does not use the necessity assertion of the present lemma) the pair of controls \(({\hat{\alpha }}^1, {\hat{\alpha }}^2) \in {\mathscr {A}}^1 \times {\mathscr {A}}^2\) presented in (21) below satisfies the coupled forward-backward SDE system in (15). Therefore, by uniqueness of the Nash equilibrium via Lemma 3.2 the assertion is indeed also necessary. \(\square \)
We are now ready to state our main result. To do so, it is convenient to introduce the following nonnegative constants
the nonnegative functions
such that \(\lim _{t \uparrow T} c_t^{\pm } = +\infty \), as well as the weight functions
for all \(t \in [0,T]\). An explicit description of the unique Nash equilibrium is provided in the following
Theorem 3.5
There exists a unique open-loop Nash equilibrium \(({\hat{\alpha }}^1, {\hat{\alpha }}^2)\) in \({\mathscr {A}}^1 \times {\mathscr {A}}^2\) in the sense of Definition 2.2. The corresponding equilibrium share holdings \({\hat{X}}^1_\cdot = x^1 + \int _0^\cdot {\hat{\alpha }}^1_t \, dt\) of agent 1 and \({\hat{X}}^2_\cdot = x^2 + \int _0^\cdot {\hat{\alpha }}^2_t \, dt\) of agent 2 satisfy the random linear coupled ODE
where, for \(0 \le t \le T\), we let
and
with nonnegative kernels
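which, for each \(t \in [0,T)\), integrate to one over [t, T]. The solution \(({\hat{X}}^1, {\hat{X}}^2)\) of (21) satisfies the terminal state constraints in the sense that
\[ \lim _{t \uparrow T} {\hat{X}}^1_t = \Xi ^1_T \quad \text {and} \quad \lim _{t \uparrow T} {\hat{X}}^2_t = \Xi ^2_T \quad {\mathbb {P}}\text {-a.s.} \]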
The Proof of Theorem 3.5 consists of a verification that the pair \(({\hat{\alpha }}^1,{\hat{\alpha }}^2)\) with dynamics in (21) is admissible (i.e., belongs to \({\mathscr {A}}^1\times {\mathscr {A}}^2\)) and satisfies the FBSDE system in (15). An explanation on how the Nash equilibrium \(({\hat{\alpha }}^1,{\hat{\alpha }}^2)\) can be constructed is provided in the appendix.
Proof of Theorem 3.5
In view of Lemma 3.4 we merely have to show that \(({\hat{X}}^1, {\hat{X}}^2, {\hat{\alpha }}^1, {\hat{\alpha }}^2)\) with dynamics described in Theorem 3.5, Eq. (21), is a solution of the FBSDE system in (15) with some suitable square integrable martingales \((M^1_t)_{0 \le t < T}\) and \((M^2_t)_{0 \le t < T}\). Uniqueness of the Nash equilibrium then follows together with Lemma 3.2.
Step 1: We start with computing the dynamics of the controls \({\hat{\alpha }}^1\) and \({\hat{\alpha }}^2\) in (21) and verify that they satisfy the dynamics of the FBSDE system in (15). To this end, it is convenient to rewrite \(w^1, w^2\) in (20), as well as \({\hat{\xi }}^1\) in (22) and \({\hat{\xi }}^2\) in (23) by introducing
and
Moreover, setting
and
for all \(0 \le t \le T\), we obtain the representations
In particular,
on [0, T). Note that \(\Xi ^1_T, \Xi ^2_T, Y_T^+, Y_T^- \in L^2({\mathbb {P}})\) implies that \((M_t^+)_{0 \le t \le T}\) and \((M_t^-)_{0 \le t \le T}\) are square integrable martingales. Also, observe that the processes \(Y^+, M^+, Y^-, M^- \in L^2({\mathbb {P}}\otimes dt)\). We can now rewrite (21) as
Next, for \({\tilde{w}}^1\), \({\tilde{w}}^2\) in (26) one can easily check that
Hence, by applying integration by parts in (30) we obtain the dynamics
and
Now, having at hand (34) and (35), as well as the fact that the functions \(c^+, c^-\) in (19) satisfy the ordinary Riccati differential equations
an elementary but tedious computation reveals that the dynamics of \({\hat{\alpha }}^1\) and \({\hat{\alpha }}^2\) in (32) on [0, T) are given by
and, similarly, by
where we also employed the identities in (31). As a consequence, using the representations in (32) we obtain
and
In other words, the pair \(({\hat{\alpha }}^1, {\hat{\alpha }}^2)\) described in (21) satisfies the dynamics of the FBSDE system in (15), where \(\int _0^\cdot {\tilde{w}}_t^{1} dM_t^{+}\), \(\int _0^\cdot {\tilde{w}}_t^{2} dM_t^{-}\) are square integrable martingales on [0, T) providing the ingredients for \(M^1\) and \(M^2\).
Step 2: Next, we have to check the terminal conditions of the FBSDE system in (15), that is, \(\lim _{t \uparrow T} {\hat{X}}^1_t = \Xi ^1_T\) and \(\lim _{t \uparrow T} {\hat{X}}^2_t = \Xi ^2_T\) \({\mathbb {P}}\)-a.s. holds true for the pair of solutions \(({\hat{X}}^1, {\hat{X}}^2)\) of the coupled ODE in (21). We adapt the argument from Bank et al. [5], which employs a simple comparison principle for ordinary differential equations, to our current setting. Specifically, note that it suffices to show that
\[ \lim _{t \uparrow T} ({\hat{X}}^1_t + {\hat{X}}^2_t) = \Xi ^1_T + \Xi ^2_T \quad {\mathbb {P}}\text {-a.s.} \tag{39} \]
\[ \lim _{t \uparrow T} ({\hat{X}}^1_t - {\hat{X}}^2_t) = \Xi ^1_T - \Xi ^2_T \quad {\mathbb {P}}\text {-a.s.} \tag{40} \]
where, using the dynamics in (21) and the definition of \(w^5\) in (20), the processes \({\hat{X}}^1 + {\hat{X}}^2\) and \({\hat{X}}^1 - {\hat{X}}^2\) satisfy, respectively, the ODE
and
Note that \(w^5_t \in (-1,1)\) for all \(t \in [0,T]\) by virtue of Lemma 3.7, 1.). First, analogously to (30), let us rewrite \({\hat{\xi }}^1\) and \({\hat{\xi }}^2\) in (22) and (23) as
with \(Y^+, M^+, Y^-, M^-\) as defined in (28) and (29). Hence, we can consider a càdlàg version of the processes \(({\hat{\xi }}^1_t)_{0 \le t \le T}\) and \(({\hat{\xi }}^2_t)_{0 \le t \le T}\) and obtain, together with Lemma 3.7, 2.), the \({\mathbb {P}}\)-a.s. limits
due to the \({\mathscr {F}}_{T-}\)-measurability of \(\Xi ^1_T\) and \(\Xi ^2_T\) by virtue of our assumption in (9). In particular, since \(\lim _{t\uparrow T} w^5_t = 0\) because of Lemma 3.7, 2.), it also holds that
Let us now start with proving the limit in (39). As a consequence of (44), for every \(\varepsilon > 0\) there exists a (random) time \(\tau _\varepsilon \in [0,T)\) such that \({\mathbb {P}}\)-a.s.
Next, define \(Y^{+,\varepsilon }_t \triangleq \Xi ^1_T + \Xi ^2_T + \varepsilon - ({\hat{X}}^1_t + {\hat{X}}^2_t)\) for all \(t \in [0,T)\) so that
Together with the dynamics of \({\hat{X}}^1+{\hat{X}}^2\) in (41) this yields
Moreover, since for all \(\omega \in \Omega \) the linear ODE on \([\tau _\varepsilon (\omega ),T)\) given by
admits the solution
with \(\lim _{t \uparrow T} Z^{+,\varepsilon }_t = 0\), the comparison principle for ODEs in (47) implies that \(Y^{+,\varepsilon }_t \ge Z^{+,\varepsilon }_t\) for all \(t \in [\tau _\varepsilon , T)\) and thus
or, equivalently,
Next, in a similar way, set \({\tilde{Y}}^{+,\varepsilon }_t \triangleq \Xi ^1_T + \Xi ^2_T - \varepsilon - ({\hat{X}}^1_t + {\hat{X}}^2_t)\) for all \(t \in [0,T)\) and observe as above from (45) that \({\mathbb {P}}\)-a.s. on \([\tau _\varepsilon , T)\) it holds that \(d{\tilde{Y}}^{+,\varepsilon }_t \le -\frac{c_t^+}{\lambda } {\tilde{Y}}^{+,\varepsilon }_t dt\) and hence
by the comparison principle. That is,
which, together with (48) yields the limit in (39).
In fact, it can now be argued along the same lines as above that the limit in (40) also holds true. Indeed, simply note that (44) implies, similarly to (45), that \({\mathbb {P}}\)-a.s. for every \(\varepsilon > 0\) there exists a (random) time \(\tau '_\varepsilon \in [0,T)\) such that
Then, introduce the processes \(Y^{-,\varepsilon }_t \triangleq \Xi ^1_T - \Xi ^2_T + \varepsilon - ({\hat{X}}^1_t - {\hat{X}}^2_t)\) and \({\tilde{Y}}^{-,\varepsilon }_t \triangleq \Xi ^1_T - \Xi ^2_T - \varepsilon - ({\hat{X}}^1_t - {\hat{X}}^2_t)\) for all \(t \in [0,T)\). By using the dynamics of \({\hat{X}}^1 - {\hat{X}}^2\) in (42) we can once more apply the comparison principle on the interval \([\tau '_\varepsilon ,T)\) for the ODEs of \(Y^{-,\varepsilon }\) and \({\tilde{Y}}^{-,\varepsilon }\) together with the linear ODE
which admits the solution
such that \(\lim _{t \uparrow T} Z^{-,\varepsilon }_t = 0\) to finally conclude that
as desired.
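The mechanism behind Step 2 can be illustrated numerically. The following Python sketch assumes that the comparison ODE has the homogeneous linear form \(dZ_t = -(c^+_t/\lambda ) Z_t dt\), uses the expression for \(c^+\) recalled from (19) in Step 3 below, and treats \(\delta ^+\) as a given positive input (its definition in (19) is not reproduced here). All function names and parameter values are illustrative choices and not part of the original argument.

```python
import math


def c_plus(t, T, lam, gamma, delta_plus):
    """c^+_t = (1/3) sqrt(delta^+) coth(sqrt(delta^+) (T - t) / (3 lam)) + gamma / 6, for t < T."""
    root = math.sqrt(delta_plus)
    return root / (3.0 * math.tanh(root * (T - t) / (3.0 * lam))) + gamma / 6.0


def z_closed_form(t, T, lam, gamma, delta_plus, z0=1.0):
    """Solution of dZ_t = -(c^+_t / lam) Z_t dt with Z_0 = z0 (assumed form of the comparison ODE)."""
    rate = math.sqrt(delta_plus) / (3.0 * lam)
    return (z0 * math.sinh(rate * (T - t)) / math.sinh(rate * T)
            * math.exp(-gamma * t / (6.0 * lam)))


def z_euler(t, T, lam, gamma, delta_plus, z0=1.0, steps=20000):
    """Explicit Euler scheme for the same ODE on [0, t] with t < T, used only as a cross-check."""
    h = t / steps
    z = z0
    for k in range(steps):
        z -= h * c_plus(k * h, T, lam, gamma, delta_plus) / lam * z
    return z


if __name__ == "__main__":
    # Arbitrary illustrative values; delta_plus is NOT derived from the model parameters here.
    T, lam, gamma, delta_plus = 2.0, 1.0, 4.0, 7.0
    for t in (0.0, 1.0, 1.9, 1.999, 2.0):
        print(f"t = {t:5.3f}   Z_t = {z_closed_form(t, T, lam, gamma, delta_plus):.6f}")
```

The factor \(\sinh (\sqrt{\delta ^+}(T-t)/(3\lambda ))\) vanishes at \(t = T\), which is what drives \(Z_t\) to zero; the Euler scheme is included only to confirm that the closed form solves the assumed ODE away from the singularity at T.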
Step 3: It is left to argue that the controls \({\hat{\alpha }}^1, {\hat{\alpha }}^2\) described in (21) belong to the set \({\mathscr {A}}\) in (2), i.e., \({\hat{\alpha }}^1, {\hat{\alpha }}^2 \in L^2({\mathbb {P}}\otimes dt)\). To achieve this we will follow a similar strategy as in Bank et al. [5]. For simplicity, we will assume without loss of generality that \(x^1=x^2=0\). Because of the coupling of \({\hat{\alpha }}^1, {\hat{\alpha }}^2\) in (21) it is more convenient to prove that \({\hat{\alpha }}^+ \triangleq {\hat{\alpha }}^1 + {\hat{\alpha }}^2 \in L^2({\mathbb {P}}\otimes dt)\) and \({\hat{\alpha }}^- \triangleq {\hat{\alpha }}^1 - {\hat{\alpha }}^2 \in L^2({\mathbb {P}}\otimes dt)\), where we set \({\hat{X}}^+_\cdot \triangleq \int _0^\cdot {\hat{\alpha }}^+_s ds\) and \({\hat{X}}^-_\cdot \triangleq \int _0^\cdot {\hat{\alpha }}^-_s ds\). Recall from (41) and (42) above that we then have
on [0, T), where
because of (43) (recall that \(M^+,Y^+\) are given in (28) and \(M^-,Y^-\) are given in (29)).
We start with showing that \({\hat{\alpha }}^+ \in L^2({\mathbb {P}}\otimes dt)\). For this purpose, observe that it suffices to examine the following two cases \(\xi ^1\equiv \xi ^2\equiv 0\) and \(\Xi ^1_T=\Xi ^2_T=0\) separately. Indeed, let us denote \({\hat{\alpha }}^{+,\xi ^1,\xi ^2,\Xi ^1,\Xi ^2} \triangleq {\hat{\alpha }}^{+}\) to emphasize also the dependence on \(\xi ^1,\xi ^2,\Xi ^1,\Xi ^2\). Then, due to the linear dependence of \({\hat{\alpha }}^+\) in (49) on \(\xi ^1,\xi ^2,\Xi ^1,\Xi ^2\), it holds that
Hence, it suffices to show that \({\hat{\alpha }}^{+,0,0,\Xi ^1,\Xi ^2} \in L^2({\mathbb {P}}\otimes dt)\) and \({\hat{\alpha }}^{+,\xi ^1,\xi ^2,0,0} \in L^2({\mathbb {P}}\otimes dt)\).
Case 1.1: \(\xi ^1\equiv \xi ^2\equiv 0\):
From (50) it follows that \({\hat{\xi }}^1_t + {\hat{\xi }}^2_t = 2 w^1_t M^+_t\). Moreover, the explicit solutions in (66) and (67) yield
Introducing the deterministic and differentiable function \(f^+_s \triangleq 1/\sinh (\sqrt{\delta ^+}(T-s)/(3\lambda ))\) on [0, T) allows us to rewrite the integral in (52) by applying integration by parts as
where \({\tilde{M}}^+_t \triangleq M_t^+/\cosh (\sqrt{\delta ^+}(T-t)/(3\lambda ))\) for all \(t \in [0,T)\). Moreover, we have that
Now, plugging back (54) and (52) together with (53) into \({\hat{\alpha }}^+\) in (49) yields, after some elementary computations,
In fact, since \(c^+_t \sinh (\sqrt{\delta ^+}(T-t)/(3\lambda ))\) is bounded on [0, T] (recall from (19) that \(c^+_t = \frac{1}{3} \sqrt{\delta ^+} \coth (\sqrt{\delta ^+}(T-t)/(3\lambda )) + \frac{1}{6}\gamma \)) and \({\tilde{M}}^+ \in L^2({\mathbb {P}}\otimes dt)\) (recall that \(M^+\) in (28) belongs to \(L^2({\mathbb {P}}\otimes dt)\)), the first two terms in (55) are in \(L^2({\mathbb {P}}\otimes dt)\). For the stochastic integral, we obtain
where the first integral on the right is again an element of \(L^2({\mathbb {P}}\otimes dt)\). The second integral satisfies
by our assumption in (9), where we also used Fubini’s theorem twice and the fact that \(\sinh (\tau ) \ge \tau \) and \(\cosh (\tau ) \ge 1\) for all \(\tau \ge 0\). That is, we obtain that \({\hat{\alpha }}^+ \in L^2({\mathbb {P}}\otimes dt)\) in this case.
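For completeness, the boundedness of \(c^+_t \sinh (\sqrt{\delta ^+}(T-t)/(3\lambda ))\) used in Case 1.1 can be read off directly from the expression for \(c^+\) recalled from (19). The display below is a worked step added for the reader's convenience rather than part of the original argument:

$$\begin{aligned} c^+_t \sinh \Big (\tfrac{\sqrt{\delta ^+}(T-t)}{3\lambda }\Big ) = \tfrac{1}{3}\sqrt{\delta ^+} \cosh \Big (\tfrac{\sqrt{\delta ^+}(T-t)}{3\lambda }\Big ) + \tfrac{1}{6}\gamma \sinh \Big (\tfrac{\sqrt{\delta ^+}(T-t)}{3\lambda }\Big ), \qquad 0 \le t < T. \end{aligned}$$

The right-hand side extends continuously to the compact interval [0, T], with value \(\sqrt{\delta ^+}/3\) at \(t=T\), and is therefore bounded.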
Case 1.2: \(\Xi ^1_T=\Xi ^2_T=0\):
In this case, we obtain from the expressions in (22) and (23) that
and thus, using again the explicit representation for \({\hat{X}}^+={\hat{X}}^1 + {\hat{X}}^2\) from (66) and (67), \({\hat{\alpha }}^+\) in (49) becomes
In fact, it holds that all the ratios in (57) involving \(c^+\), \(c^-\) are bounded on [0, T]. Moreover, by Lemma 3.8 we have
as well as
by using Jensen’s inequality. As a consequence, we can also conclude in this case that \({\hat{\alpha }}^+\) belongs to \(L^2({\mathbb {P}}\otimes dt)\).
Let us now argue that \({\hat{\alpha }}^-\) in (49) also belongs to \(L^2({\mathbb {P}}\otimes dt)\). The argument is very similar to the one presented above so that we only sketch the main steps. Again, it is enough to investigate the two cases \(\xi ^1\equiv \xi ^2\equiv 0\) and \(\Xi ^1_T=\Xi ^2_T=0\) separately because \({\hat{\alpha }}^-\) in (49) can be decomposed in the same way as \({\hat{\alpha }}^+\) in (51).
Case 2.1: \(\xi ^1\equiv \xi ^2\equiv 0\):
Similar to (52) above, using \({\hat{\xi }}^1_t - {\hat{\xi }}^2_t = 2 w^2_t M^-_t\) from (50) we obtain via (66) and (67) the representation
Setting \(f^-_s \triangleq 1/\sinh (\sqrt{\delta ^-}(T-s)/\lambda )\) on [0, T) we can rewrite the integral in (58) as
with \({\tilde{M}}^-_t \triangleq M_t^-/\cosh (\sqrt{\delta ^-}(T-t)/\lambda )\) for all \(t \in [0,T)\). In addition,
Inserting (60) and (58) together with (59) into \({\hat{\alpha }}^-\) in (49) then yields
where
Observe as in (55) above that \(c^-_t \sinh (\sqrt{\delta ^-}(T-t)/\lambda )\) is bounded on [0, T] (recall from (19) that \(c^-_t = \sqrt{\delta ^-} \coth (\sqrt{\delta ^-}(T-t)/\lambda ) - \frac{1}{2}\gamma \)) and that \({\tilde{M}}^- \in L^2({\mathbb {P}}\otimes dt)\). Therefore, we only need to justify that the stochastic integral in (62) belongs to \(L^2({\mathbb {P}}\otimes dt)\). Indeed, by the same computations as in (56), we obtain via our assumption in (9) that
Hence, we can conclude that \({\hat{\alpha }}^- \in L^2({\mathbb {P}}\otimes dt)\) in this case.
Case 2.2: \(\Xi ^1_T=\Xi ^2_T=0\):
Here, similar to (57) above, (22) and (23) imply that
and hence, together with \({\hat{X}}^-={\hat{X}}^1 - {\hat{X}}^2\) from (66) and (67), \({\hat{\alpha }}^-\) in (49) can be written as
As in (57), all the ratios in (64) involving the functions \(c^+\), \(c^-\) are bounded on [0, T], and we can conclude along the same lines as in Case 1.2 by virtue of Lemma 3.8 that \({\hat{\alpha }}^- \in L^2({\mathbb {P}}\otimes dt)\) in this case as well.
Step 4: Finally, we have to argue that the functions \(K^1(t,u)\) and \(K^2(t,u)\) defined in (24) are nonnegative kernels which integrate to one over [t, T) as functions in \(u \in [t,T)\). To this end, observe that \(c^+_t > 0\) and \(c^-_t > 0\) for all \(t \in [0,T]\), which implies that \(w^1_\cdot , w^2_\cdot > 0\) on [0, T). Moreover, a direct computation yields that for all \(t \in [0,T)\) we have
Thus, we also obtain that \(w^3_\cdot , w^4_\cdot > 0\) on [0, T). But this implies for the functions defined in (24) that \(K^1(t,u) > 0\) and \(K^2(t,u) > 0\) for all \(0 \le t \le u < T\), as well as that \(\int _t^T K^1(t,u) du = \int _t^T K^2(t,u) du = 1\) for all \(t \in [0,T)\). \(\square \)
The equilibrium share holdings prescribed by the linear coupled ODE in (21) can also be computed explicitly.
Corollary 3.6
The solution \(({\hat{X}}^1, {\hat{X}}^2)\) to the linear ODE in (21) is given by
and, similarly, by
for all \(t \in [0,T]\).
Proof
Recall that from the dynamics of \({\hat{X}}^1\) and \({\hat{X}}^2\) in (21) we obtain that the processes \({\hat{X}}^1 + {\hat{X}}^2\) and \({\hat{X}}^1 - {\hat{X}}^2\) satisfy, respectively, the linear ODEs in (41) and (42) with initial values \(x^1+x^2\) and \(x^1-x^2\). Applying the variation of constants formula then yields
and hence the assertion in (66) and (67) via the obvious relation
\(\square \)
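As a reminder of the tool used in this proof, the following display sketches the variation of constants formula for a generic linear ODE \(x'_t = a_t (g_t - x_t)\) with initial value \(x_0\). Here \(a\) and \(g\) are placeholder symbols (an assumption made for illustration) standing for the rate \(c^\pm /\lambda \) and for the forcing terms in (41) and (42); the explicit exponential factors follow from the expressions for \(c^+\) and \(c^-\) recalled from (19) in the Proof of Theorem 3.5:

$$\begin{aligned} x_t&= x_0 \, e^{-\int _0^t a_r dr} + \int _0^t a_s g_s \, e^{-\int _s^t a_r dr} ds, \\ e^{-\int _s^t \frac{c^+_r}{\lambda } dr}&= \frac{\sinh (\sqrt{\delta ^+}(T-t)/(3\lambda ))}{\sinh (\sqrt{\delta ^+}(T-s)/(3\lambda ))} \, e^{-\frac{\gamma (t-s)}{6\lambda }}, \qquad e^{-\int _s^t \frac{c^-_r}{\lambda } dr} = \frac{\sinh (\sqrt{\delta ^-}(T-t)/\lambda )}{\sinh (\sqrt{\delta ^-}(T-s)/\lambda )} \, e^{\frac{\gamma (t-s)}{2\lambda }}, \end{aligned}$$

for \(0 \le s \le t < T\). The individual holdings are then recovered from the sum and the difference via \({\hat{X}}^1 = \frac{1}{2}(({\hat{X}}^1+{\hat{X}}^2) + ({\hat{X}}^1-{\hat{X}}^2))\) and \({\hat{X}}^2 = \frac{1}{2}(({\hat{X}}^1+{\hat{X}}^2) - ({\hat{X}}^1-{\hat{X}}^2))\).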
Lastly, the following simple properties of the weight functions introduced in (20) help to illuminate the structure of the Nash equilibrium presented in Theorem 3.5.
Lemma 3.7
The weight functions \(w^1, w^2, w^3, w^4, w^5\) defined in (20) satisfy

1. \(w_\cdot ^5 \in (-1,1)\), \(w_{\cdot }^{1,2,3,4} > 0\) on [0, T) and \(w^1_\cdot + w^2_\cdot + w^3_\cdot + w^4_\cdot =1\) on [0, T],

2. \(\lim _{t \uparrow T} w_t^{1,2} = 1/2\) and \(\lim _{t \uparrow T} w_t^{3,4,5} = 0\).
Proof
1. First, recall from the Proof of Theorem 3.5, Step 4, above that \(w^1_\cdot , w^2_\cdot ,w^3_\cdot ,w^4_\cdot > 0\) on [0, T). Moreover, from the definition in (20) we immediately obtain that \(w^1_t + w^2_t + w^3_t + w^4_t =1\) for all \(t \in [0,T]\). Together with the fact that \(c^+_\cdot > 0\) and \(c^-_\cdot > 0\) on [0, T], we also observe that \(w^5_t \in (-1,1)\) for all \(t \in [0,T]\).
2. Concerning the limiting behaviour of the weight functions, it suffices to note that
Then, rewriting \(w^1\), \(w^2\) in (20) by plugging in \(c^+\), \(c^-\) from (19) to obtain the representations
with
yields
Similarly, with
we also have
and hence
as desired. \(\square \)
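The limiting behaviour near T can also be checked numerically. The Python sketch below uses the expressions for \(c^+\) and \(c^-\) recalled from (19) in the Proof of Theorem 3.5 and, as an assumption that is consistent with the qualitative description of \(w^5\) in the text but is not copied from (20), takes \(w^5 = (c^+ - c^-)/(c^+ + c^-)\). The parameters \(\delta ^+\) and \(\delta ^-\) are treated as given positive inputs with \(\sqrt{\delta ^-} \ge \gamma /2\); the numerical values are arbitrary.

```python
import math


def c_plus(t, T, lam, gamma, delta_plus):
    """c^+_t = (1/3) sqrt(delta^+) coth(sqrt(delta^+) (T - t) / (3 lam)) + gamma / 6, for t < T."""
    root = math.sqrt(delta_plus)
    return root / (3.0 * math.tanh(root * (T - t) / (3.0 * lam))) + gamma / 6.0


def c_minus(t, T, lam, gamma, delta_minus):
    """c^-_t = sqrt(delta^-) coth(sqrt(delta^-) (T - t) / lam) - gamma / 2, for t < T."""
    root = math.sqrt(delta_minus)
    return root / math.tanh(root * (T - t) / lam) - gamma / 2.0


def w5(t, T, lam, gamma, delta_plus, delta_minus):
    """ASSUMED form w^5 = (c^+ - c^-) / (c^+ + c^-); hypothetical, not taken from (20)."""
    cp = c_plus(t, T, lam, gamma, delta_plus)
    cm = c_minus(t, T, lam, gamma, delta_minus)
    return (cp - cm) / (cp + cm)


if __name__ == "__main__":
    # Arbitrary illustrative values; the deltas are inputs, not derived from the model.
    T, lam, gamma, dp, dm = 2.0, 1.0, 4.0, 7.0, 7.0
    for gap in (1.0, 1e-1, 1e-3, 1e-6):
        t = T - gap
        print(f"T - t = {gap:7.0e}   (T-t) c^+ = {gap * c_plus(t, T, lam, gamma, dp):.6f}"
              f"   (T-t) c^- = {gap * c_minus(t, T, lam, gamma, dm):.6f}"
              f"   w5 = {w5(t, T, lam, gamma, dp, dm):+.6f}")
```

Since \(\coth (x) \sim 1/x\) as \(x \downarrow 0\), both \((T-t) c^+_t\) and \((T-t) c^-_t\) tend to \(\lambda \), so that under the assumed form the difference \(c^+ - c^-\) stays bounded while the sum explodes, which forces \(w^5_t \rightarrow 0\).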
The final lemma provides estimates with respect to the \(L^2({\mathbb {P}}\otimes dt)\)-norm which are used in the Proof of Theorem 3.5 above.
Lemma 3.8
Let \((\zeta _t)_{0 \le t \le T} \in L^2({\mathbb {P}}\otimes dt)\) be progressively measurable. Moreover, let \(K^1(t,u)\), \(K^2(t,u)\), \(0 \le t \le u < T\), denote the kernels from Theorem 3.5.

(a) For \(\zeta ^{K^1}_t \triangleq {\mathbb {E}}[ \int _t^T \zeta _u K^1(t,u) du \vert {\mathscr {F}}_t]\), \(0 \le t < T\), it holds that
$$\begin{aligned} \Vert \zeta ^{K^1} \Vert _{L^2({\mathbb {P}}\otimes dt)} \le c \Vert \zeta \Vert _{L^2({\mathbb {P}}\otimes dt)} \end{aligned}$$
for some constant \(c>0\).

(b) For \(\zeta ^{K^2}_t \triangleq {\mathbb {E}}[ \int _t^T \zeta _u K^2(t,u) du \vert {\mathscr {F}}_t]\), \(0 \le t < T\), it holds that
$$\begin{aligned} \Vert \zeta ^{K^2} \Vert _{L^2({\mathbb {P}}\otimes dt)} \le c \Vert \zeta \Vert _{L^2({\mathbb {P}}\otimes dt)} \end{aligned}$$
for some constant \(c>0\).
Proof
Both upper bounds can be verified in a similar fashion as in the proof of Lemma 5.5 in Bank et al. [5]. We thus omit the details here. \(\square \)
Remark 3.9
Following up on Remark 2.3, setting \(\xi ^1 \equiv \xi ^2 \equiv 0\) and \(\Xi ^1_T = \Xi ^2_T = 0\) \({\mathbb {P}}\)-almost surely, our Theorem 3.5 together with Corollary 3.6 recovers the two-player results from Carlin et al. [9, Result 1] for the case \(\sigma = 0\) and from Schied and Zhang [29, Corollary 2.6] for the case \(\sigma > 0\). Note that this configuration yields \({\hat{\xi }}^1 \equiv {\hat{\xi }}^2 \equiv 0\) in (22) and (23), which in turn implies that the Nash equilibrium trading rates in (21) and the corresponding share holdings in (66) and (67) are deterministic.
We end this section by briefly discussing qualitatively the Nash equilibrium obtained in Theorem 3.5. Very similar to the single-player solution in [5], it turns out that the trading rates \({\hat{\alpha }}^1\) and \({\hat{\alpha }}^2\) in (21) prescribe, respectively, to gradually trade in the direction of an optimal signal process \({\hat{\xi }}^1_t\) and \({\hat{\xi }}^2_t\) (rather than toward the actual target position \(\xi ^1_t\), \(\xi ^2_t\)), which is further adjusted by a fraction \(w^5_t \in (-1,1)\) of the opponent’s respective current portfolio position \({\hat{X}}^2_t\) and \({\hat{X}}^1_t\). The optimal signal processes \({\hat{\xi }}^1\) in (22) and \({\hat{\xi }}^2\) in (23) are convex combinations of weighted averages of expected future target positions of the processes \(\xi ^1\), \(\xi ^2\) and the expected terminal positions \(\Xi ^1_T\), \(\Xi ^2_T\), where the weights \(w^1_t, w^2_t, w^3_t, w^4_t\) systematically shift toward the desired individual terminal state as \(t \uparrow T\) (Lemma 3.7 implies that \(\lim _{t \uparrow T} {\hat{\xi }}^i_t = \Xi ^i_T\) \({\mathbb {P}}\)-a.s. for both players \(i=1,2\)). The increasing urgency rate \((c^+_t+c^-_t)/(2\lambda ) \uparrow \infty \) for \(t \uparrow T\), together with \(\lim _{t\uparrow T} w^5_t = 0\), then forces both strategies in (21) to end up in the predetermined terminal portfolio position at maturity T (see also the Proof of Theorem 3.5 above). Interestingly, we note that the first agent’s optimal signal process \({\hat{\xi }}^1\) not only seeks to anticipate the future evolution of her own target strategy \(\xi ^1\) but, conscious of her competitor’s trading goals, does so also for the opponent’s target strategy \(\xi ^2\). In other words, besides following her own objectives, she also takes into account the other agent’s known trading intentions. Moreover, the weights \(w^3_t\) and \(w^4_t\) dictate the actual trading direction with respect to the other agent’s tracking target.
Indeed, observe that if \(w^3_t\) predominates over \(w^4_t\) in (22), the first player’s optimal signal \({\hat{\xi }}^1\) directs her to also trade in parallel in the same direction as the second player, that is, in the direction of the expected future average positions of \(\xi ^2\). In contrast, if \(w^4_t\) outweighs \(w^3_t\), then the optimal signal directs her to trade in the opposite direction of the second player’s target strategy, i.e., toward the expected weighted averages of \(-\xi ^2\). The former case can be viewed as a predatory trading action of the first agent against the second agent, whereas the latter case can be regarded as a cooperative behaviour. The same applies for the second player in (23) due to symmetry. In our illustrations in Sect. 4 below it becomes apparent that both these cases depend on the relationship between the permanent and temporary price impact parameters \(\gamma \) and \(\lambda \). Loosely speaking, in a plastic market where \(\gamma \gg \lambda \), the weight \(w^3\) tends to be larger than \(w^4\), and in an elastic market with \(\lambda \gg \gamma \) we have that \(w^4\) tends to be larger than \(w^3\) (see also the graphical illustration of the weight functions in Fig. 1 below). In this regard, depending on the illiquidity parameters the optimal signal processes \({\hat{\xi }}^1\) and \({\hat{\xi }}^2\) account for different types of regimes. It turns out that this leads to qualitatively different behavioral patterns in the Nash equilibrium where both predation and cooperation between the agents can occur, even in a coexisting manner.
4 Illustrations
In this section we present some case studies to illustrate the qualitative behaviour of the two-player Nash equilibrium presented in Theorem 3.5.
4.1 Optimal liquidation revisited
We start with revisiting the differential game of optimal portfolio liquidation studied in Schied and Zhang [29]. Specifically, the first agent seeks to liquidate her initial portfolio position of \(x^1=1\) shares in the risky asset by time \(T=2\) and hence requires her terminal position to satisfy \(\Xi ^1_T=0\) \({\mathbb {P}}\)-a.s. at final time. Vigilant about her share holdings and in line with her selling intention she also wants her inventory to be close to 0 throughout by tracking \(\xi ^1 \equiv 0\) on [0, T]. The second agent, on the contrary, does not pursue any predetermined buying or selling objectives but solely chooses to trade in the risky asset because he knows about the intentions of the first liquidating agent. That is, possessing no shares at time 0 (\(x^2=0\)) he gives himself the constraints \(\xi ^2_t = \Xi ^2_T = 0\) \({\mathbb {P}}\)-a.s. for all \(t \in [0,T]\). In this case, following Theorem 3.5, we have \({\hat{\xi }}^1 \equiv {\hat{\xi }}^2 \equiv 0\) \({\mathbb {P}}\)-a.s. on [0, T] in (22) and (23), and the deterministic equilibrium trading rates of both players in (21) reduce to
on [0, T); cf. also the result in [29, Corollary 2.6] with a slightly different representation. We observe in (68) that the first agent’s portfolio position \({\hat{X}}^1_t\) is not gradually reverting towards 0 but takes the effect of the second agent’s actions into account via the correction term \(w^5_t {\hat{X}}^2_t\). Similarly, concerning the second agent, it is optimal for him to systematically trade in the direction of the liquidating agent’s current portfolio position \({\hat{X}}^1_t\) weighted with \(w^5_t \in (-1,1)\).
As shown in Fig. 2, this leads to predation on the first agent in a plastic market where, e.g., \(\gamma = 4 > 1 = \lambda \). Indeed, during the first half of the trading period the second agent short-sells the risky asset in parallel to the selling of the first agent and then steadily unwinds his accrued short position by buying back shares to become “hands-clean” by final time T. In contrast, in an elastic market with, e.g., \(\gamma = 0.2 < 1 = \lambda \), the Nash equilibrium strategy dictates the second agent to cooperate with the seller and to moderately buy, by time T/2, almost up to one-tenth of the shares that agent 1 is concurrently selling, before he starts liquidating his portfolio to end up with zero inventory at T. Note that the weight function \(w^5_\cdot \) in (68) flips sign depending on the market’s illiquidity regime (see also Fig. 1). As a consequence, compared to the single-player optimal liquidation strategy \({\hat{X}}_t = 1 + \int _0^t {\hat{\alpha }}_s ds\), \(t \in [0,T]\), which satisfies
(cf., e.g., Almgren [1]), and does not depend on \(\gamma \), we observe in Fig. 2 that, due to the presence of the second agent’s trading activity which directly feeds into the first agent’s turnover rate \({\hat{\alpha }}^1\) via \(w^5 {\hat{X}}^2\) in (68), her optimal portfolio liquidation strategy becomes more prudent in a plastic market and slightly more aggressive in an elastic market environment. To sum up, in equilibrium, depending on the illiquid market type, either predation or cooperation between both agents occurs; see also the discussion in [29, Sect. 3].
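The liquidation example can be explored with a few lines of code. The Python sketch below assumes that the rates in (68) take the form \({\hat{\alpha }}^1_t = -\frac{c^+_t + c^-_t}{2\lambda }({\hat{X}}^1_t + w^5_t {\hat{X}}^2_t)\) and symmetrically for the second agent, with \(w^5 = (c^+ - c^-)/(c^+ + c^-)\); this form is inferred from the verbal description in the text and is an assumption, not a transcription of (68). Under it, the sum and the difference of the holdings decouple and decay at the rates \(c^+/\lambda \) and \(c^-/\lambda \), respectively. The quantities \(\delta ^+\), \(\delta ^-\) are treated as free positive inputs, and all names and numerical values are hypothetical.

```python
import math


def c_plus(t, T, lam, gamma, delta_plus):
    """c^+_t as recalled from (19) in the Proof of Theorem 3.5, for t < T."""
    root = math.sqrt(delta_plus)
    return root / (3.0 * math.tanh(root * (T - t) / (3.0 * lam))) + gamma / 6.0


def c_minus(t, T, lam, gamma, delta_minus):
    """c^-_t as recalled from (19) in the Proof of Theorem 3.5, for t < T."""
    root = math.sqrt(delta_minus)
    return root / math.tanh(root * (T - t) / lam) - gamma / 2.0


def holdings(t, x1, x2, T, lam, gamma, delta_plus, delta_minus):
    """Holdings (X^1_t, X^2_t) when xi^1 = xi^2 = 0 and Xi^1_T = Xi^2_T = 0 (assumed dynamics)."""
    rp = math.sqrt(delta_plus) / (3.0 * lam)
    rm = math.sqrt(delta_minus) / lam
    # exp(-int_0^t c^+/lam ds) and exp(-int_0^t c^-/lam ds) in closed form
    total = (x1 + x2) * math.sinh(rp * (T - t)) / math.sinh(rp * T) * math.exp(-gamma * t / (6.0 * lam))
    diff = (x1 - x2) * math.sinh(rm * (T - t)) / math.sinh(rm * T) * math.exp(gamma * t / (2.0 * lam))
    return 0.5 * (total + diff), 0.5 * (total - diff)


def assumed_rates(t, X1, X2, T, lam, gamma, delta_plus, delta_minus):
    """Right-hand side of the ASSUMED form of (68)."""
    cp = c_plus(t, T, lam, gamma, delta_plus)
    cm = c_minus(t, T, lam, gamma, delta_minus)
    urgency = (cp + cm) / (2.0 * lam)
    w5 = (cp - cm) / (cp + cm)
    return -urgency * (X1 + w5 * X2), -urgency * (X2 + w5 * X1)


if __name__ == "__main__":
    # x^1 = 1, x^2 = 0, T = 2 as in the text; delta values are arbitrary inputs.
    T, lam, gamma, dp, dm = 2.0, 1.0, 4.0, 7.0, 7.0
    for k in range(5):
        t = T * k / 4
        X1, X2 = holdings(t, 1.0, 0.0, T, lam, gamma, dp, dm)
        print(f"t = {t:4.2f}   X1 = {X1:+.4f}   X2 = {X2:+.4f}")
```

Working with the sum and the difference avoids integrating the coupled system directly and makes the terminal constraints hold exactly in floating point, since both closed-form factors vanish at \(t = T\).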
4.2 Piecewise constant inventory targets
The next two case studies are again simple deterministic examples but this time with nonzero optimal signal processes \({\hat{\xi }}^1\) and \({\hat{\xi }}^2\).
In the first example, as in the optimal liquidation problem above, we suppose that agent 2 only trades in the risky asset because of his awareness of the trading activity of the first agent. That is, with \(x^2 = 0\) initial shares his inventory targets are \( \xi ^2_t = \Xi ^2_T = 0\) \({\mathbb {P}}\)-a.s. for all \(t \in [0,T]\). Concerning the first agent, starting with no inventory \(x^1=0\) she wants to follow a stock-buying schedule over a time period of \(T=10\) that prescribes to hold one share until time T/2 and then to double and hold her position up to time T. Her inventory target is thus \(\xi ^1_t = 1 \cdot 1_{\{0 \le t < 5\}}+2 \cdot 1_{\{5 \le t \le 10\}}\) on [0, T] with terminal constraint \(\Xi ^1_T = 2\). Note that in this game setup the optimal signal processes \({\hat{\xi }}^1\) and \({\hat{\xi }}^2\) of both agents in (22) and (23) in equilibrium are nonzero. In particular, similar to the single-player case in [5] they are anticipating and smoothing out the jump in \(\xi ^1\) at time T/2 via the averaging through the kernels \(K^1\) and \(K^2\). The associated Nash equilibrium trading strategies \({\hat{X}}^1\) and \({\hat{X}}^2\) from Theorem 3.5 are presented in Fig. 3. As expected from the liquidation problem above, if the market is plastic \((\gamma > \lambda )\) the second agent heavily preys on the first agent by trading halfway through the trading period in the same direction and buying shares. Accordingly, in comparison to the first agent’s single-player optimal tracking strategy from [5] (which does not depend on \(\gamma \)) her running after the buying schedule \(\xi ^1\) gets affected due to the presence of the preying second agent and falls behind the single-player solution in the second half of the trading period (also recall the adjustment \({\hat{\xi }}^1 - w^5{\hat{X}}^2\) of the first agent’s optimal signal process in her trading rate in (21)).
However, if the market is elastic \((\lambda > \gamma )\) the second agent’s optimal behaviour in equilibrium changes. Interestingly, we observe that his strategy turns out to be a succession of round trips during which he either provides liquidity to his opponent by short-selling the risky asset like, e.g., during the first quarter of the trading period, or engages in predatory trading by concurrently building up some inventory in parallel to his adversary’s buying efforts as is the case during the second quarter of the trading period. Thus, compared to the first agent’s single-player optimal strategy, she suitably buys slightly faster and slower in the two-player setup. Overall, it turns out that predation and cooperation coexist in equilibrium in this case.
As a second example, let us examine the situation where both agents with zero initial inventory \(x^1=x^2=0\) seek to gradually build up and hold a positive fraction of the risky asset over some time period [0, T] with \(T=10\). Concretely, assume that \(\xi ^1 \equiv \Xi ^1_T = 1\) and \(\xi ^2 \equiv \Xi ^2_T = 0.1\), i.e., agent 1 wants her inventory to be close to 1 and ten times larger than the desired inventory level of agent 2 all through the trading period [0, T]. The associated Nash equilibrium strategies \({\hat{X}}^1\) and \({\hat{X}}^2\) from Theorem 3.5 are presented in Fig. 4. Again, as expected from the analysis above, in a plastic market it is optimal for agent 2 to excessively prey on the first agent who aims for a much larger asset position by buying up to three times more shares than his actual target inventory predetermines. In response, the acquisition of the first agent is slowed down compared to her single-player optimal strategy from [5]. By contrast, in an elastic market environment it turns out to be optimal for the second agent to initially ignore his own tracking target and to trade away from his desired inventory level in order to provide liquidity to the higher-volume seeking first agent by short-selling some shares. Also note how in this case the second agent’s single-player optimal tracking strategy from [5] strongly differs from his optimal behaviour in the two-player Nash equilibrium at the beginning of the trading period.
4.3 Running after the delta
In the final two examples we want to investigate a situation where the target strategies \(\xi ^1\) and \(\xi ^2\) are adapted stochastic processes. Specifically, let us suppose that the first agent wants to hedge an at-the-money call option with maturity T on the underlying unaffected price process \(P=P_0+\sqrt{\sigma } W\) in (3) by tracking the corresponding frictionless (Bachelier) delta-hedging strategy
Here, \(\Phi \) denotes the cumulative distribution function of the standard normal distribution. We further suppose that her initial position in the risky asset coincides with the frictionless delta \(x^1=\xi ^1_0 = 1/2\) and that \(\Xi ^1_T = 0\) \({\mathbb {P}}\)-a.s., i.e., she wants to systematically unwind her hedging portfolio when approaching maturity T.
Lemma 4.1
The process \((\xi ^1_t)_{0 \le t \le T}\) in (70) is a martingale on [0, T].
Proof
Obviously, \((\xi ^1_t)_{0 \le t \le T}\) is adapted, bounded and hence integrable. Moreover, using the property that for any \(a,b \in {\mathbb {R}}\) a standard normal distributed random variable Z satisfies \({\mathbb {E}}[\Phi (a Z + b)] = \Phi (b/\sqrt{1+a^2})\) we obtain
as desired. \(\square \)
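The Gaussian identity used in this proof can be verified numerically. The Python sketch below checks \({\mathbb {E}}[\Phi (aZ+b)] = \Phi (b/\sqrt{1+a^2})\) by a trapezoidal rule and then applies it to a conditional expectation of the delta. For the second part it assumes that (70) has the standard Bachelier form \(\xi ^1_t = \Phi ((P_t - P_0)/\sqrt{\sigma (T-t)})\), which is consistent with \(\xi ^1_0 = 1/2\) but is stated here as an assumption; the function names are illustrative.

```python
import math


def Phi(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def expected_phi(a, b, n=20001, lim=10.0):
    """E[Phi(a Z + b)] for Z ~ N(0, 1), by the trapezoidal rule on [-lim, lim]."""
    h = 2.0 * lim / (n - 1)
    total = 0.0
    for k in range(n):
        z = -lim + k * h
        weight = 0.5 if k in (0, n - 1) else 1.0
        total += weight * Phi(a * z + b) * math.exp(-0.5 * z * z)
    return total * h / math.sqrt(2.0 * math.pi)


def conditional_delta(p_minus_p0, t, u, T, sigma):
    """E[xi^1_u | P_t - P_0 = p_minus_p0] for t < u < T under the ASSUMED Bachelier form of (70)."""
    a = math.sqrt((u - t) / (T - u))            # scaled std. dev. of the increment P_u - P_t
    b = p_minus_p0 / math.sqrt(sigma * (T - u))
    return expected_phi(a, b)


if __name__ == "__main__":
    a, b = 2.0, -1.0
    print(expected_phi(a, b), Phi(b / math.sqrt(1.0 + a * a)))
    # Martingale property: the conditional mean at time u equals the delta at time t.
    x, t, u, T, sigma = 0.4, 0.5, 1.5, 2.0, 1.0
    print(conditional_delta(x, t, u, T, sigma), Phi(x / math.sqrt(sigma * (T - t))))
```

With \(a^2 = (u-t)/(T-u)\) one has \(1 + a^2 = (T-t)/(T-u)\), so that \(b/\sqrt{1+a^2} = (P_t - P_0)/\sqrt{\sigma (T-t)}\), which is exactly the martingale property under the assumed form of (70).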
Firstly, we assume that the second agent does not pursue any specific predetermined trading objectives, that is, \(x^2 = \xi ^2 = \Xi ^2_T = 0\) \({\mathbb {P}}\)-a.s. Since \(\xi ^1\) in (70) is a martingale on [0, T] the optimal signal processes \({\hat{\xi }}^1\) and \({\hat{\xi }}^2\) in (22) and (23) simplify to
using Fubini’s theorem and the fact that for each \(t \in [0,T)\) the kernels \(K^1(t,u)\) and \(K^2(t,u)\) as functions in \(u \in [t,T)\) integrate to one over [t, T]. The Nash equilibrium strategies \({\hat{X}}^1\) and \({\hat{X}}^2\) from Theorem 3.5 are plotted in Fig. 5, together with the corresponding realisation of the delta-hedge \(\xi ^1\) in the case where the call option expires in the money.
Depending on the illiquidity parameters, we observe the same behavioral patterns in equilibrium as in the deterministic cases analyzed above: In a plastic market environment, the second agent engages in predatory trading on the first agent by trading in parallel in the same direction as the delta-hedge. When the market is elastic he turns into a liquidity provider instead and partially takes the opposite side of the hedger’s transactions. Also note that the sign of the second agent’s optimal signal process in (71) is determined by the relation between the weights \(w^3\) and \(w^4\), which is in turn affected by the relation between \(\gamma \) and \(\lambda \) (cf. also Fig. 1).
Secondly, let us now assume that the second agent also hedges a one-tenth fraction of the same call option, i.e., \(\xi ^2 = \xi ^1/10\) (with initial and final portfolio positions \(x^2=1/20\) and \(\Xi ^2_T =0\) \({\mathbb {P}}\)-a.s.). The resulting Nash equilibrium strategies from Theorem 3.5 are presented in Fig. 6 where we used the same realisation of the delta-hedge as in Fig. 5. In a similar vein as in the deterministic case above, the second agent’s optimal behaviour in the two-player Nash equilibrium changes notably compared to his optimal single-player frictional hedging strategy from [5]; focussing more on preying on the first agent’s larger hedging portfolio in a plastic market, or on providing liquidity to the latter in an elastic market.
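The simplification rests on a single interchange of conditional expectation and time integral. As a worked step (added for illustration, with \(K^i\) denoting either of the deterministic kernels from Theorem 3.5), the martingale property of \(\xi ^1\) gives, for \(t \in [0,T)\),

$$\begin{aligned} {\mathbb {E}}\Big [ \int _t^T \xi ^1_u K^i(t,u) du \,\Big \vert \, {\mathscr {F}}_t \Big ] = \int _t^T {\mathbb {E}}[\xi ^1_u \vert {\mathscr {F}}_t] \, K^i(t,u) du = \xi ^1_t \int _t^T K^i(t,u) du = \xi ^1_t, \qquad i = 1,2. \end{aligned}$$

Hence every kernel-weighted average of expected future values of \(\xi ^1\) collapses to its current value \(\xi ^1_t\).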
Data Availability
No data was used in this article.
References
Almgren, R.: Optimal trading with stochastic liquidity and volatility. SIAM J. Financial Math. 3(1), 163–181 (2012). https://doi.org/10.1137/090763470
Almgren, R., Chriss, N.: Optimal execution of portfolio transactions. J. Risk 3, 5–39 (2001)
Almgren, R., Li, T.M.: Option hedging with smooth market impact. Market Microstructure Liquidity 02(01), 1650002 (2016). https://doi.org/10.1142/S2382626616500027
Attari, M., Mello, A.S., Ruckes, M.E.: Arbitraging arbitrageurs. J. Finance 60(5), 2471–2511 (2005)
Bank, P., Soner, H.M., Voß, M.: Hedging with temporary price impact. Math. Financial Economics 11(2), 215–239 (2017). https://doi.org/10.1007/s11579-016-0178-4
Brunnermeier, M.K., Pedersen, L.H.: Predatory Trading. J. Finance 60(4), 1825–1863 (2005). https://doi.org/10.1111/j.1540-6261.2005.00781.x
Cai, J., Rosenbaum, M., Tankov, P.: Asymptotic lower bounds for optimal tracking: A linear programming approach. Ann. Appl. Probab. 27(4), 2455–2514 (2017). https://doi.org/10.1214/16-AAP1264
Cardaliaguet, P., Lehalle, C.-A.: Mean field game of controls and an application to trade crowding. Math. Financial Economics 12(3), 335–363 (2018). https://doi.org/10.1007/s11579-017-0206-z
Carlin, B.I., Lobo, M.S., Viswanathan, S.: Episodic liquidity crises: Cooperative and predatory trading. J. Finance 62(5), 2235–2274 (2007). https://doi.org/10.1111/j.1540-6261.2007.01274.x
Carmona, R., Yang, J.: Predatory Trading: a Game on Volatility and Liquidity, (2008). https://carmona.princeton.edu/download/fe/PredatoryTradingGameQF.pdf
Cartea, Á., Jaimungal, S.: A closed-form execution strategy to target volume weighted average price. SIAM J. Financial Math. 7(1), 760–785 (2016). https://doi.org/10.1137/16M1058406
Casgrain, P., Jaimungal, S.: Mean Field Games with Partial Information for Algorithmic Trading, (2018). arXiv:1803.04094
Casgrain, P., Jaimungal, S.: Mean-field games with differing beliefs for algorithmic trading. Math. Finance 30(3), 995–1034 (2020). https://doi.org/10.1111/mafi.12237
Chu, C.S., Lehnert, A., Passmore, W.: Strategic trading in multiple assets and the effects on market volatility. Int. J. Central Banking 5(4), 143–172 (2009)
Drapeau, S., Luo, P., Schied, A., Xiong, D.: An FBSDE approach to market impact games with stochastic parameters. Probability, Uncertainty Quantitative Risk 6(3), 237–260 (2021)
Ekeland, I., Témam, R.: Convex Analysis and Variational Problems. Soc. Ind. Appl. Math. (1999). https://doi.org/10.1137/1.9781611971088
Ekren, I., Nadtochiy, S.: Utility-based pricing and hedging of contingent claims in Almgren–Chriss model with temporary price impact. Math. Finance 32(1), 172–225 (2022). https://doi.org/10.1111/mafi.12330
Evangelista, D., Thamsten, Y.: On finite population games of optimal trading, (2020). arXiv:2004.00790
Fu, G., Horst, U.: Mean-field leader-follower games with terminal state constraint. SIAM J. Control Optim. 58(4), 2078–2113 (2020). https://doi.org/10.1137/19M1241878
Fu, G., Horst, U., Xia, X.: Portfolio liquidation games with self-exciting order flow, (2020). arXiv:2011.05589
Fu, G., Graewe, P., Horst, U., Popier, A.: A mean field game of optimal portfolio liquidation. Math. Oper. Res. 46(4), 1250–1281 (2021). https://doi.org/10.1287/moor.2020.1094
Horst, U., Naujokat, F.: When to cross the spread? Trading in Two-Sided Limit Order Books. SIAM J. Financial Math. 5(1), 278–315 (2014). https://doi.org/10.1137/110849341
Huang, X., Jaimungal, S., Nourian, M.: Mean-field game strategies for optimal execution. Appl. Math. Finance 26(2), 153–185 (2019). https://doi.org/10.1080/1350486X.2019.1603183
Luo, X., Schied, A.: Nash equilibrium for risk-averse investors in a market impact game with transient price impact. Market Microstructure Liquidity 05(01n04), 2050001 (2019). https://doi.org/10.1142/S238262662050001X
Moallemi, C.C., Park, B., Van Roy, B.: Strategic execution in the presence of an uninformed arbitrageur. J. Financial Markets 15(4), 361–391 (2012)
Naujokat, F., Westray, N.: Curve following in illiquid markets. Math. Financial Economics 4(4), 299–335 (2011). https://doi.org/10.1007/s11579-011-0042-5
Neuman, E., Voß, M.: Trading with the Crowd, (2021). arXiv:2106.09267
Rogers, L.C.G., Singh, S.: The cost of illiquidity and its effects on hedging. Math. Finance 20(4), 597–615 (2010). https://doi.org/10.1111/j.1467-9965.2010.00413.x
Schied, A., Zhang, T.: A state-constrained differential game arising in optimal portfolio liquidation. Math. Finance 27(3), 779–802 (2017). https://doi.org/10.1111/mafi.12108
Schied, A., Zhang, T.: A market impact game under transient price impact. Math. Oper. Res. 44(1), 102–121 (2019). https://doi.org/10.1287/moor.2017.0916
Schied, A., Strehle, E., Zhang, T.: High-frequency limit of Nash equilibria in a market impact game with transient price impact. SIAM J. Financial Math. 8(1), 589–634 (2017). https://doi.org/10.1137/16M107030X
Schöneborn, T.: Trade execution in illiquid markets: Optimal stochastic control and multi-agent equilibria. PhD thesis, Technische Universität Berlin, (2008)
Schöneborn, T., Schied, A.: Liquidation in the Face of Adversity: Stealth vs. Sunshine Trading, (2009). https://papers.ssrn.com/sol3/papers.cfm?abstract_id=1007014
Strehle, E.: Optimal execution in a multiplayer model of transient price impact. Market Microstructure Liquidity 3(4), 1850007 (2017). https://doi.org/10.1142/S2382626618500077
Acknowledgements
I am grateful to Jean-Pierre Fouque for encouraging and illuminating discussions. The paper has also profoundly benefited from the valuable comments and suggestions of the anonymous referee and the Editor-in-Chief Ulrich Horst.
Appendix
Since the Proof of Theorem 3.5 is a verification of a proposed Nash equilibrium, we briefly explain for the reader’s convenience how the candidate Nash equilibrium strategies \(({\hat{\alpha }}^1,{\hat{\alpha }}^2)\) provided in (21) can be constructed. Suppose we replace the constrained optimization problems in (6) and (7) by their unconstrained versions
with some penalty parameter \(n \in {\mathbb {N}}\). Then, along the same lines as Lemmas 3.1, 3.2, 3.3 and 3.4 above, solving (72) and (73) simultaneously amounts to solving the following coupled FBSDE system
for two suitable square integrable martingales \(({\tilde{M}}^1_t)_{0 \le t \le T}\) and \(({\tilde{M}}^2_t)_{0 \le t \le T}\). The system in (74) can be decoupled by adding and subtracting both forward and backward equations to obtain the two autonomous systems
and
The decoupled FBSDEs in (75) and (76) are linear. To solve them, we make a linear ansatz of the following form
Plugging this ansatz into (75) and (76), respectively, and comparing coefficients yields two deterministic Riccati equations for \(c^{+,n}\) and \(c^{-,n}\) given by
as well as two linear BSDEs for \(b^{+,n}\) and \(b^{-,n}\) given by
The ODEs in (78) can be solved in closed form with solutions
where \(\kappa ^+_n \triangleq \frac{2\sqrt{\delta ^+} + \gamma + 4n}{2\sqrt{\delta ^+} - \gamma - 4n}\) and \(\kappa ^-_n \triangleq \frac{2\sqrt{\delta ^-} - \gamma + 4n}{2\sqrt{\delta ^-} + \gamma - 4n}\) (with \(\delta ^+, \delta ^-\) introduced in (18)). Also the linear BSDEs in (79) have explicit solutions given by
Putting everything together with the ansatz in (77), we obtain (for every \(n \in {\mathbb {N}}\)) a pair \((\alpha ^1, \alpha ^2)\) of candidate solutions which simultaneously solve (72) and (73), namely
Since all terms in (82) can be explicitly computed, one can identify the limit in (82) as the penalty parameter n in (72) and (73) goes to infinity. This yields \(({\hat{\alpha }}^1,{\hat{\alpha }}^2)\) in (21), a candidate for the Nash equilibrium strategies for the original constrained stochastic differential game from Sect. 2. It then only remains to show that \(({\hat{\alpha }}^1,{\hat{\alpha }}^2)\) is indeed the unique Nash equilibrium and belongs to \({\mathscr {A}}^1 \times {\mathscr {A}}^2\). This verification is carried out in the proof of Theorem 3.5 in Sect. 3 above.
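As a small numerical illustration of the passage to the limit, the following sketch evaluates the two constants \(\kappa ^+_n\) and \(\kappa ^-_n\) for growing penalty parameter n. It assumes the reading \(\kappa ^+_n = (2\sqrt{\delta ^+} + \gamma + 4n)/(2\sqrt{\delta ^+} - \gamma - 4n)\) and \(\kappa ^-_n = (2\sqrt{\delta ^-} - \gamma + 4n)/(2\sqrt{\delta ^-} + \gamma - 4n)\). The numerical values of \(\delta ^+\), \(\delta ^-\) and \(\gamma\) are hypothetical and chosen only for illustration, and the function names are not taken from the paper. Under these assumptions both constants tend to \(-1\) as n grows, which is one elementary ingredient when identifying the limit of the penalized solutions.

```python
import math


def kappa_plus(n, delta_plus, gamma):
    """kappa^+_n under the assumed reading of the definition above."""
    s = 2.0 * math.sqrt(delta_plus)
    return (s + gamma + 4.0 * n) / (s - gamma - 4.0 * n)


def kappa_minus(n, delta_minus, gamma):
    """kappa^-_n under the assumed reading of the definition above."""
    s = 2.0 * math.sqrt(delta_minus)
    return (s - gamma + 4.0 * n) / (s + gamma - 4.0 * n)


if __name__ == "__main__":
    # Hypothetical parameter values, not taken from the paper.
    delta_plus, delta_minus, gamma = 1.5, 0.5, 0.2
    for n in (1, 10, 100, 1000, 10000):
        kp = kappa_plus(n, delta_plus, gamma)
        km = kappa_minus(n, delta_minus, gamma)
        # Both printed values approach -1 as the penalty parameter n increases.
        print(f"n={n:>6d}  kappa_plus={kp:.6f}  kappa_minus={km:.6f}")
```

Both expressions are ratios of affine functions of n whose leading coefficients are \(4\) and \(-4\), so the limit \(-1\) does not depend on the particular illustrative values chosen here.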
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Voß, M. A two-player portfolio tracking game. Math Finan Econ 16, 779–809 (2022). https://doi.org/10.1007/s11579-022-00324-6