Abstract
We present an equilibrium model of dynamic trading, learning, and pricing by strategic investors with trading targets and price impact. Since trading targets are private, investors filter the child order flow dynamically over time to estimate the latent underlying parent trading demand imbalance and to forecast its impact on subsequent price-pressure dynamics. We prove existence of an equilibrium and solve for equilibrium trading strategies and prices as the solution to a system of coupled ODEs. Trading strategies are combinations of trading towards investor targets, liquidity provision for other investors’ demands, and speculation based on learning about latent underlying trading-demand imbalances.
Introduction
The price formation process in financial markets involves equating supply and demand for securities over time for arriving investors with heterogeneous trading preferences. In present-day markets, large investors act on their underlying trading preferences, sometimes called parent demands, by splitting their trading into dynamic sequences of smaller orders, called child orders (see O’Hara [32]), to minimize their price impact. Since the parent demands driving child-order trading are private information, investors use information from arriving child orders to form inferences over time about the dynamically evolving fundamental state of the market. In particular, investors learn about imbalances in the underlying aggregate parent demands and the associated pressure on future market-clearing prices and incorporate this information in their current child orders. Given the widespread prevalence of optimized order-splitting of parent orders into flows of child orders, dynamic learning about aggregate parent demands is a critical part of market dynamics.^{Footnote 1}
This paper is the first to provide an analytically tractable equilibrium model of dynamic learning, trading, and pricing with parent trading demands. We consider a continuous-time model with high-frequency trading at times \(t\in [0,1]\) over short time horizons with \([0,1]\) being a day or an hour. Trading occurs between price-sensitive optimizing traders with two different types of parent trading targets: One group has fixed individual targets, and the other group wants to track a stochastically evolving target over time. Since parent targets are initially not public, information about parent demand imbalances is partially revealed through market-clearing stock prices. Our analysis models the equilibrium dynamic learning process, stock holdings, and stock-price processes.
Our main results are:

We construct and solve two different equilibrium models: A simpler price-friction equilibrium and a subgame perfect Nash financial-market equilibrium. In the price-friction equilibrium, price impact is due to an exogenous trading friction, but in the subgame perfect Nash equilibrium, price impact includes both exogenous frictions and an endogenous price impact due to market clearing with constrained market asset-holding capacity. We find that these two equilibria are numerically similar.

Intraday price drifts due to price pressure change over the trading day and are path-dependent. This leads to time-varying incentives for investors to provide liquidity to the child orders of other investors.

A practical application of our model is that we can compute total trading costs for investors given the effects of dynamic learning and optimal trading by other investors. We show these costs are quadratic in the rebalancers’ trading targets.

Trading in our model reflects a combination of liquidity provision and speculation but not predatory trading. We conjecture that the absence of predatory trading is because our model replaces the exogenous price-elastic residual supply used in both Brunnermeier and Pedersen [9] and Carlin, Lobo, and Viswanathan [10] with endogenous demands coming from rational profit-maximizing investors.
Our paper advances several strands of research on market microstructure. First, dynamic learning and trading have been extensively studied in the context of markets with strategic investors with long-lived asymmetric information as in Kyle [29]. However, equilibrium trading, learning, and pricing with optimal dynamic order-splitting by large uninformed investors are less understood. Thus, we model price pressure to equate supply and demand rather than adverse selection. Second, Grossman and Miller [21] model pricing and liquidity provision with impatient traders who submit single orders equal to their parent demands and with symmetric payoff information. In contrast, we model liquidity provision with optimal order-splitting of parent demands into child order flows. Third, Choi, Larsen, and Seppi [12] construct an equilibrium with optimal dynamic trading and learning in a market with a strategic rebalancer with an end-of-day trading target and an informed investor who trades on private long-lived asset-payoff information. By filtering the order flow over time, the rebalancer learns about the underlying asset payoff, the informed investor learns about the rebalancer’s trading target, and market makers learn about both when setting prices. That earlier paper provides a characterization result for equilibrium and gives numerical examples but does not have an existence proof or analytic solutions. In contrast, our model is solved analytically and gives the equilibrium in closed form. Fourth, Brunnermeier and Pedersen [9] and Carlin, Lobo, and Viswanathan [10] show how dynamic rebalancing by a large investor can lead to predatory trading. However, these papers abstract from the learning problem by assuming the parent trading needs are publicly observable. They also make an ad hoc assumption about the price sensitivity of a residual market-maker trading demand due to exogenous price-elastic noise traders. In contrast, our model assumes the underlying parent trading demands are private information, which leads to a learning problem. In addition, our prices are rationally set with no ad hoc residual demand. Fifth, a large body of research models optimal order-splitting strategies for a single strategic investor given an exogenous pricing rule with no learning about latent trading demands of other investors (see, e.g., Almgren and Chriss [3, 4], Almgren [2], and Schied and Schöneborn [34]). In contrast, we solve for optimal trades, learning, and pricing jointly. van Kervel, Kwan, and Westerholm [25] solve for optimal trading strategies for two dynamic rebalancers with learning over time about each other’s latent trading demands. This leads to predictions about the effect of aggregate parent demand on individual investor child orders, which are then verified empirically. However, they assume an ad hoc linear pricing rule, and there are no existence proofs or analytic solutions. In contrast, price pressure in our Nash model is partly endogenously determined in equilibrium, and we solve our model analytically. As in van Kervel, Kwan, and Westerholm [25], trading in our model is a combination of speculation on expected future price changes and trading-demand accommodation.
The mathematics of our model is tractable because we use a modeling approach from the asset-pricing literature for non-dividend paying stocks. The simplification involves finding equilibrium price drifts that clear the market without determining the levels of market-clearing prices as discounted future cash flows. Karatzas and Shreve [27, Chap. 4] use this approach in complete market settings, and Cuoco and He [14] consider an extension to incomplete markets. Atmaz and Basak [1] show that non-dividend paying stocks are relevant for asset pricing. However, the non-dividend paying stock approach is new in the mainstream microstructure literature. Gârleanu and Pedersen [20], Bouchard, Fukasawa, Herdegen, and Muhle-Karbe [7], and Noh and Weston [31] use the zero-dividend stock approach to model prices given exogenous transaction costs. We extend this approach to include learning and endogenous price impact.
Model
We model equilibrium trading, learning, and pricing in a market with a risky stock and a riskless bank account over a short time horizon \([0,1]\) (e.g., a trading day). For simplicity, the net supplies of both the stock and the bank account are set to zero. Since the time horizon is short, the risk-free interest rate on the bank account is set to zero. Stock differs from the bank account in two ways: First, investors have individual parent demands for the stock. Second, stock prices are stochastic over time. Stock valuation can be viewed as the sum of two components: One component is a fundamental valuation of future dividends absent price pressure from trading targets. The other component is incremental price pressure for markets to clear given parent trading demand imbalances. It is the price pressure component that is the focus of our analysis. Our analysis treats these two components as being orthogonal and, for simplicity, normalizes the dividend valuation component to zero. Thus, hereafter, when we refer to the “stock price”, this is shorthand for the “price pressure valuation component of stock prices.” Our prices are random due to random trading demand imbalances. In a more complicated model, a separate fundamental dividend valuation component could be added to our stock-price pressure valuation to get the full stock price.
Two different groups of investors trade in our equilibrium model.

(i)
Price-sensitive rebalancers. Rebalancer \(i\in \{1,...,M\}\) maximizes her expected profit subject to a parent trading target \(\tilde{a}_i\) where \(\tilde{a}_i\) is private information for \(i\). The targets \((\tilde{a}_1,...,\tilde{a}_M)\) are assumed independent and homogeneously distributed \(\tilde{a}_i \sim \mathcal {N}(0,\sigma _{\tilde{a}}^2)\) for all rebalancers \(i\in \{1,...,M\}\) with identical zero means and standard deviations \(\sigma _{\tilde{a}}\). The aggregate target is
$$\begin{aligned} \tilde{a}_\Sigma := \sum _{i=1}^M \tilde{a}_i. \end{aligned}\tag{2.1}$$
Rebalancer \(i\)’s control is her stock holdings, which are denoted by \((\theta _{i,t})_{t\in [0,1]}\) for \(i\in \{1,...,M\}\). For simplicity, the initial endowed holdings of both the bank account and the stock are normalized to zero for all rebalancers. When \(\tilde{a}_i\) is close to zero \((\tilde{a}_i\approx 0)\), rebalancer \(i\) is a “high-frequency” liquidity provider with inventory penalties. Because \(\tilde{a}_i\) is private information for \(i\), other traders \(k\), \(k\ne i\), do not know whether rebalancer \(i\) has an active latent trading demand \((\tilde{a}_i \gg 0)\) or is a liquidity provider \((\tilde{a}_i\approx 0)\).

(ii)
Price-sensitive trackers. Trackers \(j\in \{M+1,...,M+\bar{M}\}\) all track a dynamic target given by a common exogenous Brownian motion process \(w_t\) over time \(t\in [0,1]\)
$$\begin{aligned} w_t := w_0 + w^\circ _t,\quad t\in (0,1], \end{aligned}\tag{2.2}$$
where the initial target is \(w_0 \sim {{\mathcal {N}}}(0,\sigma ^2_{w_0})\), and \(w^\circ _t\) is a standard Brownian motion that starts at zero and has zero drift and unit volatility.^{Footnote 2} While trackers observe the same \(w_t\) at time \(t\in [0,1]\), rebalancers do not and instead filter \(w_t\) over time \(t\in [0,1]\). Tracker \(j\)’s control is her stock holdings, which are denoted by \((\theta _{j,t})_{t\in [0,1]}\) for \(j\in \{M+1,...,M+\bar{M}\}\). Their initial stock and money market holdings are also normalized to zero. We assume the random variables \((\tilde{a}_1,...,\tilde{a}_M)\), \(w_0\), and \((w^\circ _t)_{t\in [0,1]}\) are all independent.
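The two sources of exogenous randomness described in (i) and (ii) can be sketched numerically. The following Python snippet is only an illustration: the function name `simulate_targets`, the parameter values, and the Euler time grid are assumptions made here and are not part of the model specification.

```python
import math
import random

def simulate_targets(M, sigma_a, sigma_w0, n_steps, seed=0):
    """Draw rebalancer targets and one tracker-target path on [0, 1].

    Hypothetical helper: names and the discretization are illustrative only.
    """
    rng = random.Random(seed)
    # Rebalancer targets: independent N(0, sigma_a^2); aggregate as in (2.1).
    a = [rng.gauss(0.0, sigma_a) for _ in range(M)]
    a_sum = sum(a)
    # Tracker target as in (2.2): w_t = w_0 + w°_t, with w° a standard
    # Brownian motion started at zero (simulated with Gaussian increments).
    dt = 1.0 / n_steps
    w = [rng.gauss(0.0, sigma_w0)]
    for _ in range(n_steps):
        w.append(w[-1] + math.sqrt(dt) * rng.gauss(0.0, 1.0))
    return a, a_sum, w

a, a_sum, w = simulate_targets(M=5, sigma_a=1.0, sigma_w0=0.5, n_steps=1000)
print(len(a), len(w))
```

In this sketch the rebalancers would see only their own entry of `a`, while the trackers would see the path `w`.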
van Kervel, Kwan, and Westerholm [25] show that interactions between multiple heterogeneous investors are an empirically important part of the trading process. Our model with \(M \ge 1\) and \({\bar{M}} \ge 1\) lets us analyze such trading interactions. In the following, index \(k\in \{1,...,M+{\bar{M}}\}\) denotes any generic trader, index \(i\in \{1,...,M\}\) denotes a rebalancer, and index \(j\in \{M+1,...,M+{\bar{M}}\}\) denotes a tracker. Because the stock is in zero net supply, this allows us to express the stock-market clearing condition as
$$\begin{aligned} \sum _{k=1}^{M+{\bar{M}}} \theta _{k,t} = 0,\quad t\in [0,1]. \end{aligned}\tag{2.3}$$
Investor stock demands change over time due to stochastic shocks to the tracker target \(w_t\) and due to randomness in imperfect learning about the rebalancer targets. As a result, the stock-price process that clears the market as in (2.3) changes randomly over time. Thus, stock-price randomness in our model, given that the fundamental dividend valuation is normalized to zero, comes from learning about traders’ parent targets (which are initially private information of the individual rebalancers and the trackers) and from random changes over time in the trackers’ target \(w_t\).^{Footnote 3}
Investor information is represented as generic filtrations \({\mathcal F}_{i,t}\) and \({\mathcal F}_{j,t}\) for rebalancers and trackers. These filtrations are constructed explicitly in the equilibria considered below. In the price-friction equilibrium in Sect. 3, the filtrations \(\mathcal {F}_{i,t}\) and \(\mathcal {F}_{j,t}\) are
$$\begin{aligned} \mathcal {F}_{i,t} := \sigma (\tilde{a}_i,S_{i,u})_{u\in [0,t]},\quad \mathcal {F}_{j,t} := \sigma (w_u,S_{j,u})_{u\in [0,t]}, \end{aligned}\tag{2.4}$$
where \(S_{i,t}\) and \(S_{j,t}\) denote perceived stock-price processes for a rebalancer \(i\) and a tracker \(j\). In the Nash equilibrium in Sect. 4, more complicated filtrations are needed to derive traders’ optimal off-equilibrium response functions.
Our model is a model of dynamic learning. As we shall see, trackers infer the aggregate target \(\tilde{a}_\Sigma \) in (2.1) from the initial stock price, and so trackers have no need to filter the rebalancers’ individual targets \((\tilde{a}_1, ..., \tilde{a}_M)\). The situation is different for each rebalancer \(i\in \{1,...,M\}\), who only observes her own target \(\tilde{a}_i\) and past and current stock prices. When \(\sigma _{w_0} >0\), these observations are insufficient to infer \(\tilde{a}_\Sigma \) and \(w_t\) separately, so rebalancer \(i\) filters based on \(\tilde{a}_i\) and on past and current stock-price observations to learn about the underlying latent parent demands \(\tilde{a}_\Sigma \) and \(w_t\). In contrast, when \(\sigma _{w_0}:=0\), the model only has static learning about \(\tilde{a}_\Sigma \) at time \(t=0\) from the initial stock price. At later times \(t\in (0,1]\), the rebalancers can infer \(w_t\) from their stock-price observations. The static learning model with \(\sigma _{w_0}:=0\) was developed in Choi, Larsen, and Seppi [13].
Individual maximization problems
This section introduces the individual maximization problems. A generic trader k’s optimal stock holdings are determined in terms of a tradeoff between expected terminal wealth \(X_{k,1}\) and a penalty for deviations of their holdings \(\theta _{k,t}\) over time from their parent target \(\tilde{a}_i\) (rebalancers) or Brownian motion \(w_t\) (trackers). An investor’s terminal wealth \(X_{k,1}\) depends on the stock prices \(S_{k,t}\) associated with k’s holdings \(\theta _{k,t}\) over time. An exogenous continuous (deterministic) function \(\kappa :[0,1]\rightarrow [0,\infty ]\) models the severity of the target penalty over time.^{Footnote 4} For example, more severe target penalties later in the day would be associated with a penalty severity function \(\kappa \) that is increasing with time t. The rebalancer and tracker objectives are
where \(\tilde{a}_i\) is the ideal holdings for rebalancer \(i\) and \(w_t\) is the ideal holdings for tracker \(j\) at time \(t\in [0,1]\). However, stock-market clearing prevents \(\theta _{i,t}\) and \(\theta _{j,t}\) from being \(\tilde{a}_i\) and \(w_t\). The suprema in (2.5) are taken over progressively measurable holding processes \(\theta _{i,t}\) and \(\theta _{j,t}\) with respect to traders’ filtrations \(\mathcal {F}_{i,t}\) and \(\mathcal {F}_{j,t}\). As we shall see in Sects. 3 and 4 below, our traders optimally use controls given as smooth functions evaluated at a finite set of state processes (i.e., Markov controls). The next section constructs such a set of Markovian state processes. To rule out doubling strategies, we require square integrability
Terminal wealth \(X_{k,1}\) in (2.5) is generated by trader k’s perceived wealth process
which is affected by \(k\)’s holdings \(\theta _{k,t}\) both directly and also indirectly via the impact of \(k\)’s holdings on an associated perceived stock-price process \(S_{k,t}\). Trader \(k\)’s holdings \(\theta _{k,t}\) are price sensitive because market-clearing price pressure affects price drifts and, thus, investor wealth. In (2.7), the zero initial wealth \(X_{k,0}=0\) is because trader \(k\)’s initial endowed money market and stock holdings are normalized to zero. Thus, \(\tilde{a}_i\) and \(w_t\) are ideal holding changes relative to investors’ normalized initial zero holdings. Given the objectives in (2.5), trading reflects a combination of motives: Investors seek to have stock holdings close to their own targets \(\tilde{a}_i\) and \(w_t\), but they also seek to increase their expected terminal wealth by trading on price pressure from other investors trading on their targets. Thus, traders demand liquidity (to come close to their targets), supply liquidity for markets to clear (by being willing to deviate from their targets so that other traders can trade towards their targets, given the appropriate price incentives), and speculate on future predictable price pressure.
Our remaining model construction involves specifying investor stock-price perceptions \(S_{i,t}\) and \(S_{j,t}\) and the associated investor filtrations \({\mathcal F}_{i,t}\) and \({\mathcal F}_{j,t}\). We then state conditions that these perceptions and filtrations must satisfy in equilibrium. Finally, we give theoretical results that ensure equilibria exist.
State processes
The fundamental underlying state of the market in our model depends on the aggregate parent demand imbalances \(\tilde{a}_\Sigma \) and \(\bar{M} w_t\). As already noted, there is a significant informational difference between trackers and rebalancers. Each tracker directly observes \(w_t\) in (2.2) and — as we shall see — can therefore infer the aggregate rebalancer target \(\tilde{a}_\Sigma \) in (2.1) from the initial stock price. In contrast, rebalancers learn about \(w_t\) and \(\tilde{a}_\Sigma \) using dynamic filtering. Thus, the rebalancer filtrations \(\mathcal {F}_{i,t}\), \(i\in \{1,...,M\}\), and tracker filtrations \(\mathcal {F}_{j,t}\), \(j\in \{M+1,...,M+{\bar{M}}\}\), are not nested. Rebalancers know prices and their individual target \(\tilde{a}_i\), whereas trackers know \(\tilde{a}_\Sigma \), \(w_t\), and prices.
Before considering specific stock-price perceptions in Sects. 3 and 4 below, we describe a set of conjectured state processes \((Y_t,\eta _t,q_{i,t},w_{i,t})\) for rebalancer \(i\in \{1,...,M\}\). These processes are all endogenous in the equilibria we construct. However, it is convenient to describe the state processes’ informational properties first, before showing how they arise in equilibrium. The processes \((Y_t,\eta _t)\) are public in that they are adapted to \(\mathcal {F}_{k,t}\) for all traders \(k\in \{1,...,M+\bar{M}\}\). Furthermore, \(\eta _t\) will be adapted to \(\sigma (Y_u)_{u\in [0,t]}\). The state processes \((q_{i,t},w_{i,t})\) are specific to individual rebalancers. They are adapted to \(i\)’s filtration \(\mathcal {F}_{i,t}\), but they are not adapted to other traders’ filtrations \(\mathcal {F}_{k,t}\) for \(k\ne i\).
Rebalancers learn by extracting information about aggregate demand imbalances from stock prices. In the equilibria we construct, the information extracted from stock prices over time t is a state process \(Y_t\), which has the form
where \(B:[0,1]\rightarrow \mathbb {R}\) is a smooth deterministic function of time that is endogenously determined in equilibrium. The function B(t) controls how \(\tilde{a}_\Sigma \) and \(w_t\) are mixed in stock prices. The process \(Y_t\) is not directly observable for the rebalancers, but Lemma 3.1 below shows that \(Y_t\) can be inferred from stock prices. Because rebalancer \(i\in \{1,...,M\}\) also knows her own target \(\tilde{a}_i\), by knowing \(Y_t\) over time \(t\in [0,1]\), she equivalently knows
Unlike \(Y_t\) in (2.8), the process \(Y_{i,t}\) is independent of rebalancer i’s private trading target \(\tilde{a}_i\) and satisfies
Rebalancers use knowledge of \(Y_t\) to estimate \(\tilde{a}_\Sigma \) and \(w_t\) from stock prices at time t. For a continuously differentiable function \(B:[0,1]\rightarrow \mathbb {R}\), we define two processes
for each rebalancer \(i \in \{1,...,M\}\) and \( t\in [0,1]\). The expectation \(q_{i,t}\) describes what rebalancer \(i\) has learned up through time \(t\) about the aggregate target \(\tilde{a}_\Sigma - \tilde{a}_i\) of the other rebalancers.^{Footnote 5} In particular, \(q_{i,t}\) is a path-dependent process because it depends on the path of \(Y_{i,s}\) over time \(s\in [0,t]\).
Let the function \(\Sigma (t)\) denote the remaining variance
where the second equality follows from the zero-mean assumptions for \((\tilde{a}_1,...,\tilde{a}_M)\) and \(w_0\). Because the targets \((\tilde{a}_1,...,\tilde{a}_M)\) are assumed independent and homogeneously distributed \({{\mathcal {N}}}(0,\sigma ^2_{\tilde{a}})\), the initial variance \(\Sigma (0)=\mathbb {E}[(\tilde{a}_\Sigma - \tilde{a}_i - q_{i,0})^2]\) is identical across all rebalancers \(i\in \{1,...,M\}\). This property and the formula for \(\Sigma (t)\) in (2.15) below imply that \(\Sigma (t)\) is also identical for all indices \(i\in \{1,...,M\}\) for all \(t\in [0,1]\).
Now consider the \(w_{i,t}\) processes. Eq. (2.11) gives the dynamics of \(Y_{i,t}\) as
The following result is a special case of the Kalman-Bucy result from filtering theory (see Appendix B for details).
Lemma 2.1
(Kalman-Bucy) For a continuously differentiable function \(B:[0,1]\rightarrow \mathbb {R}\), the process \(w_{i,t}\) is independent of \(\tilde{a}_i\), is a Brownian motion, and satisfies (modulo \(\mathbb {P}\)-null sets)
Furthermore, the remaining variance at time t is given by
\(\diamondsuit \)
Because the process \(w_{i,t}\) is independent of \(\tilde{a}_i\), \(w_{i,t}\) is also a Brownian motion with respect to the filtration \(\sigma (\tilde{a}_i,w_{i,u})_{u\in [0,t]}\). Furthermore, Lemma 2.1 shows that \((\tilde{a}_i,w_{i,t})\) are informationally equivalent with \((\tilde{a}_i,Y_{i,t})\) in the sense that (2.14) holds. However, while \(w_{i,t}\) on the left in (2.11) is observable by rebalancers, the individual terms \(w_t\) and \(\tilde{a}_\Sigma \) in \(w_{i,t}\)’s decomposition on the right of (2.11) are not.
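A minimal numerical sketch of the filtering recursion behind Lemma 2.1 may help fix ideas. It rests on assumptions stated here rather than taken from the displays above: the observation process is taken to be \(Y_{i,t} = w_t - B(t)(\tilde{a}_\Sigma - \tilde{a}_i)\), for which the standard Kalman-Bucy equations read \(dq_{i,t} = -B'(t)\Sigma (t)\,dw_{i,t}\), \(dw_{i,t} = dY_{i,t} + B'(t)q_{i,t}\,dt\), and \(\Sigma '(t) = -B'(t)^2\Sigma (t)^2\). The loading \(B(t) = -(1+t)\), the parameter values, and all function names are hypothetical.

```python
import math
import random

def B(t):
    # Hypothetical loading function (negative, smooth); illustration only.
    return -(1.0 + t)

def B_prime(t):
    return -1.0

def run_filter(M=5, sigma_a=1.0, sigma_w0=0.5, n_steps=2000, seed=1):
    """Euler scheme for the assumed Kalman-Bucy recursion of one rebalancer."""
    rng = random.Random(seed)
    dt = 1.0 / n_steps
    # Latent state: aggregate target of the other M - 1 rebalancers.
    x = sum(rng.gauss(0.0, sigma_a) for _ in range(M - 1))
    w0 = rng.gauss(0.0, sigma_w0)
    y0 = w0 - B(0.0) * x                      # assumed initial observation
    var_x = (M - 1) * sigma_a ** 2
    denom = sigma_w0 ** 2 + B(0.0) ** 2 * var_x
    q = -B(0.0) * var_x / denom * y0          # Gaussian projection at t = 0
    Sigma0 = var_x * sigma_w0 ** 2 / denom    # conditional variance at t = 0
    Sigma = Sigma0
    for n in range(n_steps):
        t = n * dt
        dw = math.sqrt(dt) * rng.gauss(0.0, 1.0)
        dy = dw - B_prime(t) * x * dt         # observation increment
        d_innov = dy + B_prime(t) * q * dt    # innovation increment
        q += -B_prime(t) * Sigma * d_innov
        Sigma += -(B_prime(t) ** 2) * Sigma ** 2 * dt
    # Closed form of the Riccati equation: 1/Sigma(t) = 1/Sigma(0) + int B'^2;
    # here B'^2 = 1, so the integral over [0, 1] equals 1.
    Sigma_closed = 1.0 / (1.0 / Sigma0 + 1.0)
    return x, q, Sigma0, Sigma, Sigma_closed

x, q_end, Sigma0, Sigma_end, Sigma_closed = run_filter()
print(Sigma0, Sigma_end, Sigma_closed)
```

Under these assumptions the remaining variance is deterministic and decreasing, which is why it can be common to all rebalancers even though the estimates \(q_{i,t}\) are path-dependent.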
The stock-market clearing condition (2.3) lets us relate prices to the state processes driving investor demands. The sum \(\sum _{i=1}^M q_{i,t}\) is an important term in this relation, so the following decomposition results are useful:
Lemma 2.2
Let \(B:[0,1]\rightarrow \mathbb {R}\) be a continuously differentiable function.

1.
The decomposition
$$\begin{aligned} \sum _{i=1}^M q_{i,t} = \eta _t + A(t) \tilde{a}_\Sigma ,\quad t\in [0,1], \end{aligned}\tag{2.16}$$
holds with the process \(\eta _t\) being adapted to \(\sigma (Y_u)_{u\in [0,t]}\) with \(Y_t\) in (2.8) and
$$\begin{aligned} A'(t)&= -\big (B'(t)\big )^2\Sigma (t)\big (A(t) +1\big ),\quad A(0)=-\tfrac{(M-1)B(0)^2\sigma ^2_{\tilde{a}}}{\sigma ^2_{w_0} +(M-1)B(0)^2\sigma ^2_{\tilde{a}}},\\ d\eta _t&= -\big (B'(t)\big )^2\Sigma (t)\eta _t\,dt -MB'(t)\Sigma (t)\,dY_t,\quad \eta _0 =-\tfrac{M(M-1)B(0)\sigma ^2_{\tilde{a}}}{\sigma ^2_{w_0} +(M-1)B(0)^2\sigma ^2_{\tilde{a}}}Y_0. \end{aligned}\tag{2.17}$$
2.
The inverse relation
$$\begin{aligned} q_{i,t}&= \frac{\eta _t}{M} - F_1(t)\left( \tfrac{(M-1)B(0)^2 \sigma _{\tilde{a}}^2}{\sigma _{w_0}^2 + (M-1)B(0)^2\sigma _{\tilde{a}}^2} +F_2(t) \right) \tilde{a}_i \end{aligned}\tag{2.18}$$
holds with deterministic functions \(F_1(t)\) and \(F_2(t)\) given by the ODEs
$$\begin{aligned} F_1'(t)&=-B'(t)^2 \Sigma (t) F_1(t), \quad F_1(0)=1,\\ F_2'(t)&=\tfrac{B'(t)^2 \Sigma (t)}{F_1(t)}, \quad F_2(0)=0. \end{aligned}\tag{2.19}$$
\(\diamondsuit \)
There are two key points: First, no investor knows \(\sum _{i=1}^Mq_{i,t}\), but it can be decomposed into a public term \(\eta _t\) and a term \(A(t)\tilde{a}_\Sigma \) that trackers know but rebalancers do not. Second, from (2.17), the process \(\eta _t\) depends on the path of \(Y_s\) over time \(s\in [0,t]\). Thus, the state process \(\eta _t\) reflects common path dependence due to \(w_t\). The expression (2.18) shows that the individual rebalancer expectation \(q_{i,t}\) includes a common learning component \(\frac{\eta _t}{M}\) and then the effect of \(i\)’s private information \(\tilde{a}_i\). In particular, it follows from (2.19) that \(F_1(t)\) and \(F_2(t)\) are both positive for \(t\in (0,1]\) so that, consistent with intuition, the loading on \(\tilde{a}_i\) is negative in (2.18).
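Summing (2.18) over \(i\) and comparing with (2.16) suggests the identity \(A(t) = -F_1(t)\big (c_0 + F_2(t)\big )\), where \(c_0\) denotes the constant fraction appearing in (2.18). The Python sketch below checks this numerically. It assumes the sign conventions \(A' = -(B')^2\Sigma (A+1)\) with \(A(0) = -c_0\) and \(F_1' = -(B')^2\Sigma F_1\), a remaining variance of the form \(1/\Sigma (t) = 1/\Sigma (0) + \int _0^t B'(s)^2ds\), and the hypothetical choice \(B(t) = -(1+t)\); the function name and parameter values are likewise illustrative.

```python
def decomposition_check(M=5, sigma_a=1.0, sigma_w0=0.5, n_steps=2000):
    """Euler-integrate the ODEs for A, F1, F2 and track |A + F1 (c0 + F2)|."""
    B0, Bp = -1.0, -1.0   # hypothetical B(t) = -(1 + t): B(0) = -1, B'(t) = -1
    var_x = (M - 1) * sigma_a ** 2
    denom = sigma_w0 ** 2 + B0 ** 2 * var_x
    c0 = B0 ** 2 * var_x / denom            # constant fraction in (2.18)
    Sigma0 = var_x * sigma_w0 ** 2 / denom  # assumed initial remaining variance
    dt = 1.0 / n_steps
    A, F1, F2 = -c0, 1.0, 0.0
    max_gap = 0.0
    for n in range(n_steps):
        t = n * dt
        Sigma = 1.0 / (1.0 / Sigma0 + Bp ** 2 * t)  # closed form for constant B'
        rate = Bp ** 2 * Sigma
        dA = -rate * (A + 1.0) * dt
        dF1 = -rate * F1 * dt
        dF2 = rate / F1 * dt
        A, F1, F2 = A + dA, F1 + dF1, F2 + dF2
        max_gap = max(max_gap, abs(A + F1 * (c0 + F2)))
    return A, F1, F2, max_gap

A_end, F1_end, F2_end, max_gap = decomposition_check()
print(A_end, F1_end, F2_end, max_gap)
```

The gap stays at the level of the discretization error, and the terminal values of `F1` and `F2` are positive, in line with the negative loading on \(\tilde{a}_i\).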
Price-friction equilibrium
Investor perceptions of the impact of their trading on stock prices are a key part of the optimizations in (2.5) and the resulting market equilibrium. We consider two specifications of investor stock-price perceptions. This section presents a simplified model in which perceived price impact is a fully exogenous trading friction. This approach is analogous to the exogenous price impact used in van Kervel, Kwan, and Westerholm [25]. We then solve for the endogenous stock-price process that clears the market (and also satisfies some weak consistency conditions) and the associated optimized investor-holding processes. Sect. 4 presents a richer model of price impact in which investor stock-price perceptions are partially endogenized in a subgame perfect Nash financial-market equilibrium.
Our equilibrium construction is a conjecture-and-verify analysis. Section 3.1 conjectures functional forms for investor perceptions of stock-price dynamics. Section 3.2 defines equilibrium and then solves for equilibrium price-perception coefficients and the associated price dynamics and holdings that satisfy the definition of equilibrium.
Stock-price perceptions
Recall that price pressure is different from the value of future dividends. It is a valuation adjustment needed to clear the stock market given trading demand imbalances. This allows us to model price pressure as zero-dividend asset prices as in, e.g., Karatzas and Shreve [27, Chap. 4].
Rebalancers optimize (2.5) with respect to perceived stock-price processes of the form
where \(f_0,f_1,f_2,f_3:[0,1]\rightarrow \mathbb {R}\) are continuous (deterministic) functions of time \(t\in [0,1]\) and \((\alpha ,\gamma )\) are constants. The “f” superscript indicates that the perceived price \(S^f_{i,t}\) is defined with respect to a particular set of coefficient functions \(f\) in (3.1). The stock-price drift in (3.1) is perceived by rebalancer \(i\) to be affine in a set of state processes. Consistent with intuition, we will see that in equilibrium the loadings \(f_0(t)\) and \(f_3(t)\) on \(Y_t\) and \(\eta _t\) are negative. In particular, \(Y_t\) with \(B(t) < 0\) measures a mix of aggregate demand from rebalancers and trackers, and \(\eta _t\) reflects public expectations of aggregate private rebalancer expectations about other rebalancers’ parent-demand imbalances, both of which depress price change expectations. The other coefficients describe the perceived impact of rebalancer \(i\) on the stock-price drift.
Theorem 3.5 below endogenously determines \((f_0,f_1,f_2,f_3)\) in equilibrium. The exogenous parameters \((\alpha ,\gamma )\) can be found by calibrating model output to empirical data. The term \(\alpha \theta _{i,t}\) allows for ad hoc trading frictions. The price-friction parameter \(\alpha \) is an exogenous model input. Price taking is a special case with \(\alpha :=0\), whereas the empirically relevant case is \(\alpha <0\) such that buy (sell) orders decrease (increase) the future stock-price drifts.
The innovations in the rebalancers’ perceived stock prices \(dw_{i,t}\) come from new information rebalancer \(i\) learns over time about the underlying parent-demand state variable \(Y_t\), which has both a direct effect on the future stock-price drift and an additional indirect effect via its impact on \(\eta _t\) since \(\eta _t\) is adapted to \(\sigma (Y_u)_{u\in [0,t]}\) from Lemma 2.2.
The zero-dividend stock valuation approach (see, e.g., Chapter 4 in Karatzas and Shreve [27]) has several consequences: First, we model perceived and equilibrium stock-price drifts rather than price levels. Second, in (3.1), the stock’s volatility and initial value are not determined in equilibrium but rather are model inputs. For simplicity, we set the volatility to be a constant \(\gamma > 0\) (i.e., positive demand innovations \(dw_{i,t}\) increase prices), and the initial price is set to be \(Y_0\) in (3.1). However, other choices of \(S_0\) would work equally well as long as \(S_0\) satisfies \(\sigma (S_0) = \sigma (Y_0)\).
The next result shows that \(w_{i,t}\) is rebalancer i’s innovations process in the sense that \(w_{i,t}\) is a Brownian motion relative to i’s filtration defined with perceived stock prices \(S^f_{i,t}\) in (3.1) and such that \(S^f_{i,t}\) and \(w_{i,t}\) generate the same information.
Lemma 3.1
Let \(f_0,f_1,f_2,f_3:[0,1]\rightarrow \mathbb {R}\) be continuous functions and let \(B:[0,1]\rightarrow \mathbb {R}\) be a continuously differentiable function. For a rebalancer \(i\in \{1,...,M\}\), let \(\theta _{i,t}\) satisfy (2.6) and be progressively measurable with respect to \(\mathcal {F}_{i,t}:=\sigma (\tilde{a}_i,S^f_{i,u})_{u\in [0,t]}\) with \(S^f_{i,t}\) defined in (3.1) and \(Y_t\) defined in (2.8). Then, modulo \(\mathbb {P}\)-null sets, we have
\(\diamondsuit \)
Thus, given a path of perceived prices generated by a price process \(S^f_{i,t}\) of the form in (3.1) and her personal target \(\tilde{a}_i\), rebalancer \(i\) can infer the path of \(w_{i,t}\). Furthermore, given the path \(w_{i,t}\), rebalancer \(i\) can infer \(Y_{i,t}\) using (2.14) and, thus, can infer \(Y_t\) from (2.10). Consequently, rebalancer \(i\) can infer \((q_{i,t},\eta _t)\) where we recall from Lemma 2.2 that \(\eta _t\) is adapted to \(\sigma (Y_u)_{u\in [0,t]}\).
Trackers optimize (2.5) with respect to a perceived stock-price process of the form
where \(\bar{f}_3,\bar{f}_4,\bar{f}_5:[0,1]\rightarrow \mathbb {R}\) are continuous (deterministic) functions, and \(\alpha \) is a constant.^{Footnote 6} Trackers have different information in that they observe \(w_t\) directly and can infer \(\tilde{a}_\Sigma \) from the initial stock price \(Y_0\) using (2.8) and their knowledge of \(w_0\). Therefore, their perceived stock prices differ from those of the rebalancers. Theorem 3.5 below endogenously determines \((\bar{f}_3,\bar{f}_4,\bar{f}_5)\) in equilibrium, and \((\alpha ,\gamma )\) are exogenous model inputs. Again, \(\alpha :=0\) is the special case of price-taking.
The motivation for these price perceptions for the trackers is as follows. First, the perceptions in (3.3) allow trackers to condition their perceived price drift to take into account price pressure from target imbalances \(\tilde{a}_\Sigma \) and \(w_t\) that depress expected price changes. Since trackers and rebalancers trade differently on their targets, the price-drift impacts \(\bar{f}_4\) and \(\bar{f}_5\) are in general different. Second, the trackers understand that the state process \(Y_t\) affects the rebalancer demand and, thus, the stock-price drift. However, \(Y_t\) does not need to be included explicitly in the tracker perceived price drift in (3.3) since \(Y_t\) is given by a linear combination of \(\tilde{a}_\Sigma \) and \(w_t\), which are already included in the drift. Third, trackers know that rebalancers can infer \(\eta _t\) and that this potentially affects their price perceptions in (3.1) and, in turn, their trading and equilibrium pricing. Thus, trackers allow for the pricing effect of \(\eta _t\) in their perceptions in (3.3). Fourth, as already noted, \(\alpha \) allows for possible exogenous trading frictions, if any.
An important difference between rebalancer and tracker perceived prices in (3.1) and (3.3) is that rebalancer price dynamics are based on the informational innovations \(dw_{i,t}\), whereas tracker price dynamics are based on the tracker target changes \(dw_t\). Reconciling the price perceptions of rebalancers and trackers will impose restrictions on equilibrium price perceptions and holdings and will rely on the relation between \(dw_{i,t}\) and \(dw_t\) in (2.11).
Given the price perceptions in (3.1) and (3.3), we solve (2.5) for optimal rebalancer and tracker holdings.
Lemma 3.2
Let \(f_0,f_1,f_2,f_3,\bar{f}_3,\bar{f}_4,\bar{f}_5:[0,1]\rightarrow \mathbb {R}\) and \(\kappa :[0,1]\rightarrow (0,\infty ]\) be continuous functions, let \(B:[0,1]\rightarrow \mathbb {R}\) be continuously differentiable, let \(\alpha \le 0\), and let the perceived stockprice process in the wealth dynamics (2.7) be as in (3.1) and (3.3). Then, for \(\mathcal {F}_{i,t}:=\sigma (\tilde{a}_i,S^f_{i,u})_{u\in [0,t]}\) and \(\mathcal {F}_{j,t}:=\sigma (w_u,S^{{\bar{f}}}_{j,u})_{u\in [0,t]}\), and, provided the holding processes
satisfy (2.6), the traders’ maximizers for (2.5) are \(\hat{\theta }_{i,t} \) for rebalancer \(i\in \{1,...,M\}\) and \(\hat{\theta }_{j,t}\) for tracker \(j\in \{M+1,...,M+\bar{M}\}\). \(\diamondsuit \)
The proof of Lemma 3.2 shows that pointwise quadratic maximization gives the maximizers for (2.5) for rebalancers and trackers for arbitrary f and \({\bar{f}}\) functions.
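The pointwise argument can be illustrated with a minimal sketch. Since the objective (2.5) is not restated in this section, the block assumes that the time-t integrand has the form \(\theta (\mu _0+\alpha \theta )-\kappa (\text{target}-\theta )^2\), where \(\mu _0\) collects the perceived-drift terms that do not depend on the investor's own holdings. This form, the function names, and the numbers are illustrative assumptions, not the paper's stated formulas.

```python
def integrand(theta, mu0, alpha, kappa, target):
    """Assumed time-t integrand: perceived drift gain minus tracking penalty."""
    return theta * (mu0 + alpha * theta) - kappa * (target - theta) ** 2


def pointwise_maximizer(mu0, alpha, kappa, target):
    """Closed-form maximizer of the assumed integrand over theta.

    First-order condition: mu0 + 2*alpha*theta + 2*kappa*(target - theta) = 0.
    The problem is strictly concave when alpha < kappa, which holds for
    alpha <= 0 and kappa > 0.
    """
    if not alpha < kappa:
        raise ValueError("need alpha < kappa for a strictly concave problem")
    return (mu0 + 2.0 * kappa * target) / (2.0 * (kappa - alpha))


if __name__ == "__main__":
    mu0, alpha, kappa, target = -0.3, -0.5, 1.0, 0.8  # illustrative numbers
    theta_hat = pointwise_maximizer(mu0, alpha, kappa, target)
    # Brute-force check on a grid centered at the candidate maximizer.
    grid = [theta_hat + (k - 2000) * 1e-3 for k in range(4001)]
    best = max(grid, key=lambda th: integrand(th, mu0, alpha, kappa, target))
    print(theta_hat, best)
    # The same holding written in terms of the drift evaluated at theta_hat.
    mu_hat = mu0 + alpha * theta_hat
    print((mu_hat + 2.0 * kappa * target) / (2.0 * kappa - alpha))
```

Under this assumed form, writing the maximizer in terms of the drift evaluated at the optimum, \(\hat{\mu }=\mu _0+\alpha \hat{\theta }\), changes the denominator from \(2(\kappa -\alpha )\) to \(2\kappa -\alpha \); both expressions describe the same holding.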
Stock-price perceptions play two interconnected roles in our model. First, rebalancers and trackers solve their optimization problems in (2.5) based on their perceptions in (3.1) and (3.3) for how hypothetical holdings \(\theta _{i,t}\) and \(\theta _{j,t}\) affect price dynamics. Second, investor stock-price perceptions affect how they learn from observed prices. In particular, Lemma 3.1 shows that rebalancers use their stock-price perceptions (3.1) to infer the aggregate demand state variable \(Y_t\) based on past and current stock prices. In other words, dynamic learning by rebalancers depends critically on their stock-price perceptions. Similarly, trackers also use their stock-price perception of \(Y_0\) in (3.3) to infer the aggregate parent demand \(\tilde{a}_\Sigma \) from the initial price at time \(t=0\). However, thereafter, there is no additional learning from prices by the trackers at \(t>0\) since they directly observe their target \(w_t\).
Equilibrium
This section defines the first of our two equilibrium concepts and then derives price-perception coefficients for the conjectured functional form in Sect. 3.1 that satisfy the equilibrium definition along with the associated equilibrium price dynamics and holdings. The notion of equilibrium in our first construction is relatively simple, being based just on market clearing and consistency of investor price perceptions.
Definition 3.3
Deterministic functions of time \(f_0,f_1,f_2,f_3,\bar{f}_3,\bar{f}_4,\bar{f}_5,B:[0,1]\rightarrow \mathbb {R}\) constitute a price-friction equilibrium if:

(i)
Maximizers \(\hat{\theta }_{k,t}\) for (2.5) exist for traders \(k \in \{1,...,M+\bar{M}\}\) given the stock-price perceptions (3.1) and (3.3) for filtrations \(\mathcal {F}_{i,t}:=\sigma (\tilde{a}_i,S^f_{i,u})_{u\in [0,t]}\) and \(\mathcal {F}_{j,t}:=\sigma (w_u,S^{{\bar{f}}}_{j,u})_{u\in [0,t]}\).

(ii)
Inserting trader k’s maximizer \(\hat{\theta }_{k,t}\) into the perceived stock-price processes (3.1) and (3.3) produces identical stock-price processes across all traders \(k\in \{1,...,M+\bar{M}\}\). This common equilibrium stock-price process is denoted by \(\hat{S}_t\).

(iii)
The money and stock markets clear. \(\diamondsuit \)
Definition 3.3 places only minimal restrictions on the perceived stock-price coefficient functions in (3.1) and (3.3): Markets must clear and result in consistent perceived stock-price processes when all investors use their equilibrium strategies. Section 4 below considers a subgame perfect Nash extension of our basic model that imposes more restrictions on allowable off-equilibrium stock-price perceptions such as off-equilibrium market clearing and various consistency requirements.
Definition 3.3(ii) requires that in equilibrium rebalancers and trackers perceive identical stockprice dynamics when using their equilibrium holdings. However, rebalancers and trackers have different information (i.e., rebalancers form imperfect inferences about \(w_t\) and \(\tilde{a}_\Sigma \), whereas trackers observe \(w_t\) directly and infer \(\tilde{a}_\Sigma \) at time 0). The resolution of this apparent paradox is investors’ different information sets: Trackers and rebalancers all agree on \(d\hat{S}_t\), but they disagree on how to decompose \(d\hat{S}_t\) into drift and volatility components. Because the trackers observe \(w_t\), they can use \(dw_t\) in their decomposition of \(d\hat{S}_t\). However, \(w_t\) is not adapted to the rebalancers’ filtrations and can therefore not be used in their \(d\hat{S}_t\) decompositions. Instead, rebalancers use their innovations processes \(dw_{i,t}\) when decomposing \(d\hat{S}_t\) into drift and volatility. By replacing \(dw_{i,t}\) in \(dS^f_{i,t}\) in (3.1) with the decomposition of \(dw_{i,t}\) in terms of \(dw_t\) from (2.11), we can rewrite \(dS^f_{i,t}\) in (3.1) as
Therefore, to ensure identical equilibrium stockprice perceptions for all rebalancers and trackers, it suffices to match the drift of \(dS^{{\bar{f}}}_{j,t}\) in (3.3) for the equilibrium holdings \(\theta _{j,t} = \hat{\theta }_{j,t}\), \(j\in \{M+1,...,M+\bar{M}\}\), with the drift of \(dS^f_{i,t}\) in (3.5) for the equilibrium holdings \(\theta _{i,t}:= \hat{\theta }_{i,t}\), \(i\in \{1,...,M\}\). This produces the following equilibrium requirement:
for all rebalancers \(i \in \{1,...,M\}\) and all trackers \(j\in \{M+1,...,M+\bar{M}\}\). We note that the righthand side of (3.6) does not depend on the rebalancer index i. Matching up coefficients in front of \((\tilde{a}_i,\tilde{a}_{\Sigma },q_{i,t},\eta _t,w_t)\) in (3.6) using \(\hat{\theta }_{i,t}\) and \(\hat{\theta }_{j,t}\) in (3.4) and \(Y_t\) in (2.8) produces five equations. In addition, inserting \(\hat{\theta }_{i,t}\) and \(\hat{\theta }_{j,t}\) in (3.4) into the marketclearing condition (2.3) and using (2.16) produce three more equations from matching \((\tilde{a}_{\Sigma },\eta _t,w_t)\) coefficients. All in all, we have eight equilibrium restrictions for \((f_0,f_1,f_2,f_3,\bar{f}_3,\bar{f}_4,\bar{f}_5)\) and \(B'\), which give the equilibrium coefficient functions (A.1) in Appendix A and the ODE for B(t) in (3.7) below.
Our equilibrium existence result is based on the following technical lemma. It guarantees the existence of a solution to an autonomous system of coupled ODEs. In particular, given rebalancer stock-price perceptions of the form in (3.1) with an aggregate demand state variable \(Y_t\) process of the form in (2.8) (and the associated \(\eta _t\) process), we must construct a deterministic function B(t) that gives an equilibrium.
Lemma 3.4
Let \(\kappa :[0,1]\rightarrow [0,\infty ]\) be a continuous and integrable function (i.e., \(\int _0^1 \kappa (t)dt <\infty \)). For an initial constant \(B(0) \in \mathbb {R}\), the coupled ODEs
have unique solutions with \(\Sigma (t) \ge 0\), \(\Sigma (t)\) decreasing, \(A(t) \in [-1,0]\), A(t) decreasing for \(t\in [0,1]\), and \(B(t),B'(t)<0\) when \(\bar{M}B(0) +1< 0\). \(\diamondsuit \)
The solutions to the ODEs for A(t) and \(\Sigma (t)\) in (3.7) agree with the expressions in (2.15) and (2.17). The exogenous price-friction coefficient \(\alpha \) does not appear in the ODEs (3.7). It is possible to restate the ODE system (3.7) using a single path-dependent ODE. The special case \(B(0) := -\frac{1}{\bar{M}}\) produces a model with no dynamic learning because \(B'(t)=0\) implies \(\Sigma '(t)=0\) and so \(dq_{i,t}=0\).
The following theorem gives the price-friction equilibrium in terms of the ODEs (3.7). In this theorem, the price-friction parameter \(\alpha \), volatility \(\gamma \), and initial value \(B(0)\in \mathbb {R}\) are free parameters. The intuition for B(0) being free is discussed after our equilibrium construction in Theorem 3.5.
Theorem 3.5
Let \(\kappa :[0,1]\rightarrow (0,\infty )\) be continuous, let the functions \((B,A,\Sigma )\) be as in Lemma 3.4, and let \(\alpha \le 0\). Then, we have:

(i)
A price-friction equilibrium exists and is given by the price-perception functions (A.1) in Appendix A.

(ii)
The equilibrium in (i) has holdings \(\hat{\theta }_{i,t}\) for rebalancer i and \(\hat{\theta }_{j,t}\) for tracker j given by
$$\begin{aligned} \begin{aligned} \hat{\theta }_{i,t}&=-\tfrac{\gamma B'(t)-2 \kappa (t)}{2 \kappa (t)-\alpha }\tilde{a}_i -\tfrac{\gamma B'(t)}{2 \kappa (t)-\alpha }q_{i,t} \\&\qquad +\tfrac{\gamma B'(t)}{(M+\bar{M}) (2 \kappa (t)-\alpha )}\eta _t-\tfrac{2 \bar{M} \kappa (t)}{(M+\bar{M}) (2 \kappa (t)-\alpha )}Y_t,\quad i\in \{1,...,M\}, \\ \hat{\theta }_{j,t}&=\tfrac{\gamma B'(t)}{(M+\bar{M}) (2 \kappa (t)-\alpha )}\eta _t +\tfrac{2 M \kappa (t)}{(M+\bar{M}) (2 \kappa (t)-\alpha )}w_t \\&\qquad +\tfrac{\gamma (A(t)-M+1) B'(t)-2 \kappa (t)}{(M+\bar{M}) (2 \kappa (t)-\alpha )}\tilde{a}_\Sigma ,\quad j\in \{M+1,...,M+\bar{M}\}. \end{aligned} \end{aligned}$$(3.8)
(iii)
The equilibrium in (i) has the equilibrium stock-price process \(\hat{S}_t\) given by \(\hat{S}_0 := w_0 - B(0)\tilde{a}_\Sigma \) and dynamics with respect to the trackers’ filtrations \(\mathcal {F}_{j,t}:=\sigma (w_u,S^{{\bar{f}}}_{j,u})_{u\in [0,t]}\) given by
$$\begin{aligned} \begin{aligned} d\hat{S}_t&=\Big \{\tfrac{\gamma B'(t)}{M+\bar{M}}\eta _t-\tfrac{2 \bar{M} \kappa (t)}{M+\bar{M}}w_t +\tfrac{\gamma (A(t)-M+1) B'(t)-2 \kappa (t)}{M+\bar{M}}\tilde{a}_\Sigma \Big \}dt + \gamma dw_t, \end{aligned} \end{aligned}$$(3.9) and dynamics with respect to the rebalancers’ filtrations \(\mathcal {F}_{i,t}:=\sigma (\tilde{a}_i,S^f_{i,u})_{u\in [0,t]}\) given by
$$\begin{aligned} \begin{aligned} d\hat{S}_t&=\Big \{\tfrac{\gamma B'(t)}{M+\bar{M}}\eta _t-\tfrac{2 \bar{M} \kappa (t)}{M+\bar{M}}Y_t-\gamma B'(t)\big ( \tilde{a}_i + q_{i,t}\big ) \Big \}dt+ \gamma dw_{i,t}. \end{aligned} \end{aligned}$$(3.10) \(\diamondsuit \)
Several observations follow from Theorem 3.5:

1.
Lemma 3.1 ensures that rebalancer i can infer her innovations process \(w_{i,t}\) from perceived prices \(S^f_{i,t}\) and \(\tilde{a}_i\), but rebalancer i cannot infer the trackers’ target \(w_t\) from the equilibrium prices \(\hat{S}_t\) in (3.9). This is because the aggregate target \(\tilde{a}_\Sigma \) also appears in the drift of \(d\hat{S}_t\) and \(\tilde{a}_\Sigma \) is not observed by individual rebalancers.

2.
The equilibrium holdings (3.8) follow from inserting the equilibrium f and \({\bar{f}}\) functions in (A.1) in Appendix A into (3.4). Thus, the holdings in (3.8) are expressed in terms of the investors’ state processes, which, in particular, are adapted to the investors’ filtrations. However, these state processes are not mutually independent, and so we give representations of (3.8) in terms of independent variables in (A.2) and (A.3) in Appendix A. First, the price-friction equilibrium rebalancer holdings \(\hat{\theta }_{i,t}\) in (3.8) can be written in terms of the independent variables \((\tilde{a}_i, \tilde{a}_\Sigma -\tilde{a}_i, w_0)\) and a residual orthogonal term given as a stochastic integral with respect to \(w^\circ _t\) of a deterministic function of time. Likewise, the price-friction equilibrium tracker holdings \(\hat{\theta }_{j,t}\) can be written in terms of the independent variables \((\tilde{a}_\Sigma , w_0)\) and a residual orthogonal term given in terms of a stochastic integral with respect to \(w^\circ _t\) of a deterministic function of time. Both these residual terms are Gaussian. Section 3.4 illustrates the loading coefficients on these independent state processes.

3.
Because the exogenous price-friction coefficient \(\alpha \le 0\) does not appear in the ODEs (3.7), \(\alpha \) is irrelevant for the equilibrium stock-price dynamics (3.9). However, \(\alpha \) does affect the equilibrium holdings in (3.8).

4.
The stock-price volatility \(\gamma \) affects the stock-price drift and holdings via its impact on B(t) in (3.7) and, thus, on (A.1).

5.
It can seem paradoxical that trackers and rebalancers all perceive the same equilibrium stock-price process \({\hat{S}}_t\), but they decompose its dynamics \(d{\hat{S}}_t\) into different perceived drifts and martingale terms (i.e., they have different Itô decompositions). The resolution lies in the rebalancers and trackers having different filtrations:^{Footnote 7} The drift and martingale terms in (3.10) are not adapted to \(\mathcal {F}_{j,t}\) and the drift and martingale terms in (3.9) are not adapted to \(\mathcal {F}_{i,t}\). The dynamics (3.9) and (3.10) both produce the same process \(\hat{S}_t\) because the innovations process \(w_{i,t}\) in (2.11) links \(dw_t\) with \(dw_{i,t}\) and the drift term \(B'(t)(\tilde{a}_\Sigma -\tilde{a}_i-q_{i,t})dt\).

6.
Investors’ off-equilibrium perceived stock-price drifts differ linearly from their equilibrium drifts due to the differences \(\theta _{k,t}-\hat{\theta }_{k,t}\) between their off-equilibrium and equilibrium holdings.^{Footnote 8} Rebalancer i’s perceived stock-price drift in (3.1) can be decomposed for arbitrary holdings \(\theta _{i,t}\) as
$$\begin{aligned} \begin{aligned}&f_0(t)Y_t +f_1(t)\tilde{a}_i +f_2(t)q_{i,t}+f_3(t)\eta _t+ \alpha \theta _{i,t}\\&= \tfrac{\gamma B'(t)}{M+\bar{M}}\eta _t-\tfrac{2 \bar{M} \kappa (t)}{M+\bar{M}}Y_t -\gamma B'(t)\big ( \tilde{a}_i + q_{i,t}\big ) +\alpha (\theta _{i,t} - \hat{\theta }_{i,t}), \end{aligned} \end{aligned}$$(3.11) where we have used the formulas for \((f_0,f_1,f_2,f_3)\) in (A.1) in Appendix A. Likewise, for arbitrary holdings \(\theta _{j,t}\), tracker j’s perceived stock-price drift in (3.3) is
$$\begin{aligned} \begin{aligned}&\bar{f}_3(t)\eta _t+\bar{f}_4(t)\tilde{a}_\Sigma +\bar{f}_5(t)w_t+ \alpha \theta _{j,t}\\&=\tfrac{\gamma B'(t)}{M+\bar{M}}\eta _t-\tfrac{2 \bar{M} \kappa (t)}{M+\bar{M}}w_t +\tfrac{\gamma (A(t)-M+1) B'(t)-2 \kappa (t)}{M+\bar{M}}\tilde{a}_\Sigma +\alpha (\theta _{j,t} - \hat{\theta }_{j,t}), \end{aligned} \end{aligned}$$(3.12) where we have used the formulas for \((\bar{f}_3,\bar{f}_4,\bar{f}_5)\) in (A.1) in Appendix A. Continuity between equilibrium and off-equilibrium is a reasonable property of investor stock-price perceptions. The representation of the perceived rebalancer drift in (3.11) relative to \(\hat{\theta }_{i,t}\) from (3.8) also explains the presence of the rebalancer-specific terms \((\tilde{a}_i,q_{i,t})\) in the rebalancers’ perceptions in (3.1).

7.
Investors initially use block trades at time 0 to trade to positions \(\theta _{i,0}\) and \(\theta _{j,0}\) from (3.8) that are generically different from their initial normalized holdings of 0. Thereafter, investors trade continuously at times \(t > 0\).

8.
Theorem 3.5 verifies that price-perception coefficients in (3.1) and (3.3) can be constructed such that an equilibrium satisfying Definition 3.3 exists. However, as with many other rational expectations models, we do not have a proof of uniqueness. For example, there may be other public state variables in addition to \(\eta _t\) that could hypothetically be included in the perceived price drifts that might also be associated with other equilibria as defined in Definition 3.3.
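The reconciliation described in observation 5 can be sketched as a short worked computation. Since (2.8) and (2.11) are not restated in this section, the sketch assumes that \(Y_t = w_t - B(t)\tilde{a}_\Sigma \) and that the innovations satisfy \(dw_{i,t} = dw_t - B'(t)(\tilde{a}_\Sigma -\tilde{a}_i-q_{i,t})dt\); these sign conventions are inferred from the surrounding text rather than quoted. Substituting both relations into the rebalancer dynamics gives
$$\begin{aligned} d\hat{S}_t&=\Big \{\tfrac{\gamma B'(t)}{M+\bar{M}}\eta _t-\tfrac{2 \bar{M} \kappa (t)}{M+\bar{M}}\big (w_t-B(t)\tilde{a}_\Sigma \big )-\gamma B'(t)\big (\tilde{a}_i+q_{i,t}\big )-\gamma B'(t)\big (\tilde{a}_\Sigma -\tilde{a}_i-q_{i,t}\big )\Big \}dt+\gamma dw_t\\&=\Big \{\tfrac{\gamma B'(t)}{M+\bar{M}}\eta _t-\tfrac{2 \bar{M} \kappa (t)}{M+\bar{M}}w_t+\Big (\tfrac{2 \bar{M} \kappa (t)B(t)}{M+\bar{M}}-\gamma B'(t)\Big )\tilde{a}_\Sigma \Big \}dt+\gamma dw_t. \end{aligned}$$
Under these assumptions, the rebalancer-specific terms \((\tilde{a}_i,q_{i,t})\) cancel, and the remaining drift depends only on \((\eta _t,w_t,\tilde{a}_\Sigma )\), which are exactly the quantities appearing in the trackers' decomposition.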
The function B(t) from (3.7) is key both in constructing the equilibrium and for interpreting the equilibrium price and holding processes. First, there is the issue that the initial value B(0) is a free input in Theorem 3.5. The intuition is that our model determines equilibrium stock-price drifts but not price levels. As can be seen in Theorem 3.5(iii), B(0) controls the initial price level in our model. Second, the relation between B(t) and price levels allows us to impose additional structure on B(t). In particular, \(w_t\) and \(\tilde{a}_\Sigma \) represent different types of demand imbalances. Thus, if \(B(t) < 0\), then \(Y_t\) in (2.8) plays the role of an aggregate demand state variable. How the two component quantities \(w_t\) and \(\tilde{a}_\Sigma \) are mixed in the aggregate demand state variable \(Y_t\) is different given the two components’ different informational dynamics (i.e., \(\tilde{a}_\Sigma \) is not time dependent while \(w_t\) changes randomly over time) and given their different impacts on investor demands (i.e., each rebalancer only knows their personal \(\tilde{a}_i\) component of \(\tilde{a}_\Sigma \) where other rebalancers’ targets do not affect investor i’s parent demand whereas \(w_t\) affects both an individual tracker’s parent demand and is also information about other trackers’ parent demands). It seems reasonable that the sign of the impact of \(w_t\) and \(\tilde{a}_\Sigma \) on the price level should be the same, which imposes the additional restriction that \(B(t) < 0\). From Lemma 3.4, a sufficient condition for \(B(t) < 0\) for all \(t \in [0,1]\) is \({{\bar{M}}}B(0) + 1<0\), which implies \(B'(t)<0\).^{Footnote 9}
With the economically reasonable parametric restriction that \(B'(t) < 0\) and given that \(\alpha \le 0\) so that \(\alpha - 2 \kappa (t) < 0\), we can sign the impact of various quantities in the model on holdings and prices, which leads to the following comparative statics:

1.
In (3.8), the equilibrium holdings \(\hat{\theta }_{i,t}\) of rebalancers are positively related to their parent targets \(\tilde{a}_i\). This is intuitive because rebalancers want holdings close to \(\tilde{a}_i\). Rebalancer holdings \(\hat{\theta }_{i,t}\) are negatively related to the aggregate demand imbalance state variable \(Y_t\). The fact that \(\hat{\theta }_{i,t}\) is decreasing in \(Y_t\) is consistent with the theoretical results and empirical evidence in van Kervel, Kwan, and Westerholm [25] that investors buy less when there is a positive parent-demand imbalance for other investors in the market. The same intuition applies to the negative impact of the common component \(\eta _t\) on \(\hat{\theta }_{i,t}\). However, the impact of \(q_{i,t}\) on \(\hat{\theta }_{i,t}\) is positive. The intuition is that when rebalancer i expects the other rebalancers (given i’s ability to filter using her private target information \(\tilde{a}_i\)) to have a net positive parent-demand imbalance \(\mathbb {E}[\tilde{a}_\Sigma - \tilde{a}_i \,|\, \mathcal {F}_{i,t}]\) from (2.11), she buys at time t to speculate on the resulting anticipated positive drift in future price pressure in (3.10).

2.
In (3.8), the equilibrium holdings \(\hat{\theta }_{j,t}\) of trackers are increasing in \(w_t\) (which reflects both a tracker’s own parent demand and also information about the parent demands of other trackers). Tracker holdings \(\hat{\theta }_{j,t}\) are also decreasing in \(\eta _t\), which is related to imbalances in rebalancers’ aggregate parent demand expectations. The negative effect of \(\eta _t\) is consistent with the van Kervel, Kwan, and Westerholm [25] liquidity-provision result and empirical evidence. However, the impact of \(\tilde{a}_\Sigma \) is ambiguous in (3.8), and numerical calculations in Sect. 3.4 show that the sign is positive. This is again consistent with speculation on future predicted price pressure due to the tracker’s superior information about aggregated latent parent-demand imbalances.

3.
The equilibrium stock-price drift in (3.9) is decreasing in the tracker parent demand \(w_t\). However, the impact of \(\tilde{a}_\Sigma \) in the price drift is again ambiguous, which is related to information about \(\tilde{a}_\Sigma \) being useful in forecasting future price pressure.
Tractability and model structure
This section discusses the key modeling features that make our model tractable. First, we assume all traders seek to maximize their individual objectives in (2.4). Linear-quadratic objectives have been used extensively in the literature because of their tractability. Such objectives have been used in, e.g., Sannikov and Skrzypacz [33], Gârleanu and Pedersen [20], and Bouchard, Fukasawa, Herdegen, and Muhle-Karbe [7]. The linear-quadratic objectives (2.5) allow us to solve for the optimal holdings in Lemma 3.2 using quadratic pointwise optimization. In the price-friction equilibrium, we could equivalently use dynamic programming to produce the same optimal holdings.
Second, our stock does not pay dividends, which means that only the stock drift can be endogenously determined in equilibrium. Models with non-dividend-paying stocks have been used extensively in the literature. The monograph Karatzas and Shreve [27] gives an overview.^{Footnote 10} In particular, non-dividend-paying stock models have been used for short horizon models like ours where consumption only takes place at the terminal time.^{Footnote 11} The rebalancers’ dynamic learning produces forward-running filtering equations and, by considering a non-dividend-paying stock, we circumvent having additional backward-running equations. Equilibrium models with both forward- and backward-running equations include Kyle [29], Foster and Viswanathan [18, 19], Back, Cao, and Willard [5], and Choi, Larsen, and Seppi [12].
Third, price impact is often modeled as the impact of investor holdings and orders on price levels (e.g., as in Almgren [2]) and as the impact of orders on price changes (e.g., Kyle [29]). However, for the sake of tractability, we follow Cuoco and Cvitanić [15] and model price impact in terms of the impact of investor holdings on the price drift. Price impact matters for the trading decisions of strategic investors because of its effect on future expected price changes (e.g., high holding demand raises prices which lowers expected future price appreciation). Our price-friction specification simply assumes directly that investor holdings affect expected future price changes. Thus, while our price impact specification is a simplification, it is a reasonable simplification that preserves the essential economics of price impact.
Fourth, instead of exogenous noise traders, we use optimizing trackers with a Brownian motion target \(w_t\). Grossman and Stiglitz [22] and Kyle [29] are standard references with an exogenous Gaussian stock supply. Gaussian noise traders are also used in the predatory trading models in Brunnermeier and Pedersen [9] and Carlin, Lobo, and Viswanathan [10]. In our setting, we could eliminate trackers by setting \(\bar{M}:=0\) and replace the stock-market clearing condition (2.3) by using \(w_t\) to model the exogenous stock supply as in
Including noise traders as in (3.13) in the model would be tractable in the price-friction equilibrium. However, surprisingly, exogenous noise traders complicate constructing a Nash equilibrium with dynamic learning, whereas, as we show in Sect. 4, optimizing trackers and market clearing in (2.3) produce a subgame perfect Nash financial-market equilibrium in closed form. The models in Sannikov and Skrzypacz [33] and Choi, Larsen, and Seppi [13] have optimizing trackers but no dynamic learning.
Numerics
Our price-friction equilibrium is straightforward to compute numerically. This is because equilibrium stock prices and holdings are available in closed form given the solutions to the associated coupled ODEs in (3.7). We illustrate our model for several different parameterizations. In these parameterizations, there are \(M := 5\) rebalancers and \({\bar{M}} := 10\) trackers. The penalty function is a constant over the trading day and set to \(\kappa (t):=1\). The rebalancer target volatility is normalized to \(\sigma _{\tilde{a}} := 1\) whereas we consider \(\sigma _{w_0} \in \{\frac{1}{10}, 1\}\) to illustrate the impact of dynamic learning. Recall that \(\sigma _{w_0}:=0\) gives the model with only initial learning of \(\tilde{a}_\Sigma \) as developed in Choi, Larsen, and Seppi [13]. To be consistent with our negative B(t) restriction, we consider an initial value \(B(0) :=-0.2\). We consider two stock-price volatility parameters \(\gamma \in \{\frac{1}{2},1\}\) and a zero price-friction parameter \(\alpha :=0\) (i.e., the competitive equilibrium). As noted above, \(\alpha \) does not affect the endogenous price-drift coefficients, but \(\alpha \) does affect investor holdings.
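As a minimal sketch, the parameterizations above can be collected in a small grid together with a check of the sufficient condition \(\bar{M}B(0)+1<0\) from Lemma 3.4. The block takes \(B(0)=-0.2\), the negative value that the restriction on B(t) requires; the variable and function names are illustrative and do not come from the paper.

```python
from itertools import product

# Baseline inputs described in the text (names are illustrative).
M, M_BAR = 5, 10          # numbers of rebalancers and trackers
KAPPA = 1.0               # constant penalty function kappa(t)
SIGMA_A = 1.0             # rebalancer target volatility
B0 = -0.2                 # initial value B(0), taken to be negative
ALPHA = 0.0               # zero price friction (competitive case)
SIGMA_W0_VALUES = (0.1, 1.0)
GAMMA_VALUES = (0.5, 1.0)


def satisfies_sign_condition(b0, m_bar):
    """Sufficient condition from Lemma 3.4 for B(t) < 0 and B'(t) < 0."""
    return m_bar * b0 + 1 < 0


def parameter_grid():
    """All (sigma_w0, gamma) combinations with the shared baseline inputs."""
    return [
        {"M": M, "M_bar": M_BAR, "kappa": KAPPA, "sigma_a": SIGMA_A,
         "B0": B0, "alpha": ALPHA, "sigma_w0": s, "gamma": g}
        for s, g in product(SIGMA_W0_VALUES, GAMMA_VALUES)
    ]


if __name__ == "__main__":
    for params in parameter_grid():
        ok = satisfies_sign_condition(params["B0"], params["M_bar"])
        print(params["sigma_w0"], params["gamma"], ok)
```

With these inputs, \(\bar{M}B(0)+1=-1\), so the condition holds in all four cases, whereas the boundary value \(B(0)=-\frac{1}{\bar{M}}\) fails the strict inequality.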
Equilibrium holdings
First, we consider equilibrium holdings. Fig. 1 shows the coefficient functions for the equilibrium stock holdings \(\hat{\theta }_{k,t}\) in (3.8) for rebalancers and trackers using their orthogonal representations in (A.2) and (A.3) in Appendix A. Alternatively, we could plot coefficient loadings on the state processes \((\tilde{a}_i, q_{i,t}, \eta _t,Y_t)\) and \((\eta _t,w_t,\tilde{a}_\Sigma )\) in (3.8). We prefer to illustrate orthogonal loadings to avoid cancelation effects in the different state processes.
Fig. 1E shows rebalancer i’s loadings over time on her own parent target \(\tilde{a}_i\). As expected, these loadings are positive, but they are less than 1 because trading towards a positive target depresses equilibrium price drifts in order for markets to clear. The rebalancer loading on \(\tilde{a}_i\) is over 0.9, which implies a large initial block trade at time \(t=0\). The negative coefficients on \(\tilde{a}_\Sigma - \tilde{a}_i\) (for rebalancer i) in Fig. 1A and \(\tilde{a}_\Sigma \) (for tracker j) in Fig. 1B reflect demand accommodation. In particular, rebalancers and trackers reduce their holdings when other rebalancers want to buy. The loadings on \(w_0\) in Fig. 1C and 1D are more subtle. When the initial tracker target \(w_0\) has a high volatility (as in the red and amber trajectories), the tracker holdings load positively on \(w_0\) over time in Fig. 1D, and the negative rebalancer loadings in Fig. 1C indicate demand accommodation by the rebalancers. However, when the initial tracker target has low volatility (as in the green and blue trajectories), the initial positive tracker loadings on \(w_0\) eventually flip signs as do the initial negative rebalancer loadings. At first glance, this is puzzling. The explanation is that, as noted above, the trackers and rebalancers have different stock-price drift perceptions in (3.9) and (3.10) given their different filtrations. In particular, there is dynamic learning over time by the rebalancers based on the information \(Y_t\) inferred from prices, whereas the trackers are fully informed about \(\tilde{a}_\Sigma \) and \(w_t\) (trackers infer \(\tilde{a}_\Sigma \) at time 0). Fig. 3C and 3D below illustrate that the rebalancers’ and trackers’ stock-drift perceptions are quite different in these two low \(\sigma _{w_0}\) parameterizations.
In addition to the effects illustrated in Fig. 1, investor holdings are also affected by the realized path of \(w_t = w_0+w^\circ _t\) over time. This is because of fluctuations in the underlying tracker parent demand and also due to the effect of \(w^\circ _t\) on dynamic learning by the rebalancers. Appendix A shows the exact specification of this term in the tracker holdings (given as a \(dw^\circ _u\) integral of a deterministic function). Given the linearity of investor holdings and since the Brownian motion \(w^\circ _t\) has zero expected increments, this random path effect disappears in ex ante expected investor holdings.
To summarize, Fig. 1 shows there are three main drivers of investor holdings: First, investors’ holdings in most cases are drawn partially towards their own targets \(\tilde{a}_i\) and \(w_t\). Second, investors provide partial accommodation to other investors’ parent demands. Third, dynamic learning and speculation on the price drift affect demand accommodation. Interestingly, there is no evidence in Fig. 1 of predatory trading. Predatory trading differs from demand accommodation in that a predator first trades in the same direction as another investor and then subsequently unwinds her position. In this context, the hump shape of the blue trajectories (for low \(\sigma _{w_0})\) in Fig. 1C and 1D does not indicate predatory trading: Because \(w_0\) is the trackers’ own target, the blue hump in Fig. 1D cannot reflect predatory trading. Furthermore, the blue hump-shaped trajectory in Fig. 1C also differs from predatory trading because the tracker and rebalancer loadings have opposite signs as seen in Fig. 1D. This is due to market clearing. For example, when the rebalancers are buying given \(w_0> 0\), the trackers are actually selling. Instead of predatory trading, we shall see below that the blue trajectories are explained by price perceptions and dynamic learning.
Fig. 2 plots the instantaneous intraday unconditional trading autocorrelations
for the price-friction equilibrium holding processes for both the rebalancer and tracker in (3.8). These autocorrelations are scaled by the time step \(h>0\) (the unscaled versions converge to zero as \(h\downarrow 0\)).
Thus, consistent with empirical evidence, trading is autocorrelated due to order splitting. Fig. 2 shows that rebalancers’ orders are positively autocorrelated (2A) whereas trackers’ orders exhibit negative autocorrelation (2B).
Market clearing forces the intraday instantaneous unconditional cross-correlation between rebalancers’ and trackers’ holdings to be perfectly negative
for all \(i\in \{1,...,M\}\) and \(j\in \{M+1,...,M+{\bar{M}}\}\).
Equilibrium prices
Next, we consider the price-friction equilibrium stock-price dynamics in (3.9) and (3.10). For the trackers, we can rewrite the drift in (3.9) in terms of the independent random variables \((\tilde{a}_\Sigma , w_0)\) and a residual orthogonal term given as a stochastic integral with respect to \(w^\circ _t\) of a deterministic function of time. For the rebalancers, we can rewrite the perceived drift in (3.10) in terms of the independent random variables \((\tilde{a}_\Sigma -\tilde{a}_i, w_0, \tilde{a}_i)\) and a residual orthogonal term given as a stochastic integral with respect to \(w^\circ _t\) of a deterministic function of time. These formulas are given in (A.4) and (A.5) in Appendix A and are illustrated in Fig. 3.
Fig. 3 shows that positive parent demands \(\tilde{a}_i\), \(\tilde{a}_\Sigma - \tilde{a}_i\), and \(\tilde{a}_\Sigma \) all depress perceived stock-price drifts. The same is true for the tracker perceived stock-price drift loading on the initial tracker parent demand \(w_0\). However, the relation between the rebalancer perceived drift and \(w_0\) is more nuanced. When the initial tracker demand volatility \(\sigma _{w_0}\) is high (red and amber lines in Fig. 3C and 3D), then rebalancers perceive that \(w_0\) depresses the price drift. However, when \(\sigma _{w_0}\) is low, then the dynamic learning process, given the inability of rebalancers to observe \(w_0\) directly, causes the rebalancer perceived stock-price drift loading on \(w_0\) to change sign. The blue and green lines in Fig. 3C and 3D illustrate that low values of \(\sigma _{w_0}\) make the trackers use their superior knowledge of \(w_0\) to manipulate stock-price perceptions to create gains from trade that outweigh their penalties. More specifically, the blue and green lines in Fig. 1C and 1D show that rebalancers have large positive stock holdings and trackers have large negative holdings based on a positive realization \(w_0>0\). Such large negative holdings imply that trackers incur large inventory penalties because they deviate from the target trajectory \(w_t= w_0 + w^\circ _t\). Trackers find this behavior optimal because their blue and green lines in Fig. 3D are negative (giving trackers large gains from trade) and rebalancers are willing to hold these large positive stock positions because their blue and green lines in Fig. 3C are positive (also giving rebalancers large gains from trade).
Fig. 4A plots the instantaneous intraday unconditional stock-price autocorrelation, which is again scaled relative to h
for the equilibrium stock-price process \(\hat{S}_t\).
Price pressure from persistent parent demands leads to rising intraday price autocorrelation over the trading day. Fig. 4B plots the time trajectory of the unconditional variance of intraday price drifts over the trading day based on the trackers’ equilibrium perceptions in (3.9). Predictable price drifts are important in actual markets as incentives for intraday liquidity provision by HFT market makers (represented in our model by rebalancers with realizations \(\tilde{a}_i \approx 0\)). We see that price-drift variability due to price pressure increases over the trading day.
Equilibrium learning
Fig. 5A shows that \(\sigma _{w_0}>0\) controls the starting point \(\Sigma (0)>0\) whereas \(\gamma >0\) controls the speed of learning (i.e., how negative the slope of \(\Sigma (t)\) is). For example, the green and red lines (\(\gamma =1\)) illustrate a slower speed of learning relative to the amber and blue lines (\(\gamma =0.1\)). These effects on \(\Sigma '(t)\) come from Fig. 5B and the formula for \(\Sigma (t)\) in terms of \(B'(t)\) in (2.15). The red line in Fig. 5A also shows that the remaining variance \(\Sigma (1)\) at \(t=1\) can be substantial.
Equilibrium welfare
In this section, we study the impact of the exogenous model input \(B(0)\in \mathbb {R}\) on equilibrium welfare. There are many ways to measure social welfare (see, e.g., Vayanos [35, Section 6]). We follow Du and Zhu [17, Eq. 42] and consider maximizing the expected aggregate certainty equivalent for the \(M+\overline{M}\) investors. The certainty equivalent CE\(_k\in \mathbb {R}\) for investor \(k\in \{1,...,M+\overline{M}\}\) is defined by the expressions in (2.5). The aggregate expected welfare is given by
where the expectation in (3.17) is ex ante in the sense that it is taken over the random variables \((\tilde{a}_1,...,\tilde{a}_M)\) and \(w_0\) (Gaussian and independent).
Fig. 6 shows that in the pricefriction equilibrium with \(\alpha :=0\), expected welfare is maximized at \(B(0)=\frac{1}{{\bar{M}}}\). This is not too surprising because \(B(0)=\frac{1}{{\bar{M}}}\) implies full revelation and no dynamic learning takes place for \(t>0\) (see the discussion after Lemma 3.4). Aggregate welfare is decreasing in the initial tracker parent standard deviation \(\sigma _{w_0}\) both because more demand accommodation is required and also because the rebalancer learning problem is more difficult. This effect can be seen by comparing the blue and green (low initial standard deviation) and amber and red (high initial standard deviation) cases in Fig. 6.
Subgame perfect Nash equilibrium
This section builds on the analysis in Section 3 by endogenizing stockprice perceptions and price impact. In particular, we partially endogenize the impact of an investor’s hypothetical offequilibrium holdings on offequilibrium marketclearing stock prices based on her perceptions of how other investors perceive prices and on other investors’ resulting optimal response functions to her offequilibrium holdings. More specifically, a subgame perfect Nash equilibrium involves describing how each trader \(k_0\) (who might be a rebalancer \(i_0\) or a tracker \(j_0\) with their different filtrations) perceives all other traders’ price perceptions.
The major difference between the pricefriction equilibrium in Sect. 3 and our subgame perfect Nash equilibrium lies in the traders’ stockprice perceptions. For a subgame perfect Nash equilibrium, investor stockprice perceptions must be such that:

(i)
Trader \(k_0\)’s own stockprice perceptions must be consistent with marketclearing for any offequilibrium holdings \(\theta _{k_0,t}\) used by \(k_0\), when other traders’ holding responses are optimal given the stockprice dynamics \(k_0\) perceives other traders \(k\ne k_0\) to have. This offequilibrium marketclearing requirement can be found in, e.g., Vayanos [35].

(ii)
Trader \(k_0\)’s equilibrium holdings are found by solving her optimization problem using her own marketclearing stockprice dynamics from (i).

(iii)
All optimizers from (i) must be consistent with traders’ equilibrium holdings in (ii).
Definition 4.3 below makes properties (i)–(iii) operational. We refer to the last property (iii) as a consistency requirement between off- and on-equilibrium holdings.
Optimal offequilibrium responses
In our subgame perfect Nash model, a generic trader \(k_0\) perceives that other rebalancers and trackers have stockprice perceptions of the form
where \(W_{i,t}\) and \(W_{j,t}\) are Brownian motions and \(Z_t\) is an arbitrary Itô process (i.e., \(Z_t\) is a sum of drift and volatility). The “Z” superscript in (4.1) indicates that the perceived stock prices \(S_{i,t}^Z\) and \(S_{j,t}^Z\) are defined with respect to \(Z_t\). We use the marketclearing condition (2.3) to construct two such Itô processes in (4.5) and (4.8) below. These \(Z_t\) processes differ from \(Y_t\) in (3.1) and (3.3) in that we use \(Z_t\) to capture the effect of arbitrary offequilibrium stock holdings by trader \(k_0\) on marketclearing prices given optimal responses by other investors k, \(k\ne k_0\). We then go on to determine endogenously the deterministic functions \((\mu _1,\mu _2,\mu _3,\bar{\mu }_4, \bar{\mu }_5)\) in equilibrium in Theorem 4.5 below.
Lemma 4.1 gives traders’ optimal response to an arbitrary Itô process \(Z_t\) and is the Nash equilibrium analogue of Lemma 3.2.
Lemma 4.1
(Optimal responses to \(Z_t\)) Let \(\mu _1,\mu _2,\mu _3,\bar{\mu }_4,\bar{\mu }_5:[0,1]\rightarrow \mathbb {R}\) and \(\kappa :[0,1]\rightarrow (0,\infty ]\) be continuous functions, let \(\alpha \le 0\), let \((Z_t)_{t\in [0,1]}\) be an Itô process, and let the perceived stockprice process in the wealth dynamics (2.7) be as in (4.1). Then, \(Z_t\) is adapted to both \(\mathcal {F}_{i,t}:=\sigma (\tilde{a}_i,Y_u,W_{i,u},S^Z_{i,u})_{u\in [0,t]}\) and \(\mathcal {F}_{j,t}:=\sigma (\tilde{a}_\Sigma ,w_u,Y_u,W_{j,u},S^Z_{j,u})_{u\in [0,t]}\) and, provided
satisfy (2.6), the maximizer for (2.5) is \(\theta ^Z_{i,t} \) for rebalancer \(i\in \{1,...,M\}\) and \(\theta ^Z_{j,t}\) for tracker \(j\in \{M+1,...,M+\bar{M}\}\). \(\diamondsuit \)
Similar to Lemma 3.2, Lemma 4.1 is proven using pointwise quadratic maximization. Unlike \(Y_t\) in Lemma 3.2, there is no Markov structure imposed on \(Z_t\) in Lemma 4.1, which makes dynamic programming inapplicable. Therefore, the simplicity of the linear-quadratic objectives in (2.5) is crucial for the proof of the optimality of \(\theta ^Z_{i,t}\) and \(\theta ^Z_{j,t}\) in (4.2).
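To illustrate what pointwise quadratic maximization means, the following minimal sketch maximizes a stylized time-\(t\) integrand. The objective \(\theta (m+\alpha \theta ) - \kappa (a-\theta )^2\) is an assumption made for illustration only (a perceived drift \(m+\alpha \theta \) with price friction \(\alpha \le 0\) and a quadratic target penalty with \(\kappa >0\)); it is not the objective (2.5), and the function names are hypothetical.

```python
def pointwise_objective(theta, m, a, kappa, alpha):
    """Stylized integrand (an assumption, not the paper's (2.5)):
    holdings times perceived drift, minus a quadratic target penalty."""
    return theta * (m + alpha * theta) - kappa * (a - theta) ** 2


def pointwise_maximizer(m, a, kappa, alpha):
    """Closed-form maximizer of the stylized integrand.

    First-order condition: m + 2*alpha*theta + 2*kappa*(a - theta) = 0,
    so theta = (m + 2*kappa*a) / (2*(kappa - alpha)).
    The second derivative 2*(alpha - kappa) is negative when
    kappa > 0 and alpha <= 0, so the stationary point is a maximum.
    """
    if kappa <= 0 or alpha > 0:
        raise ValueError("need kappa > 0 and alpha <= 0 for strict concavity")
    return (m + 2.0 * kappa * a) / (2.0 * (kappa - alpha))


# Hypothetical inputs chosen only for illustration.
theta_star = pointwise_maximizer(m=0.3, a=1.0, kappa=2.0, alpha=-0.5)
```

Because the maximization is carried out separately at each time \(t\), no Markov structure on the driving process is needed in this stylized setting, which mirrors the role of the linear-quadratic structure described above.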
Marketclearing stockprice perceptions
Investor \(k_0\)’s perceptions about other investors’ stock-price perceptions ensure that the stock market clears for any choice of \(k_0\)’s holdings. Thus, when solving for trader \(k_0\)’s individual equilibrium holdings, we require that \(k_0\)’s perceived stock-price process (denoted by \(S^\nu _{k_0,t}\) below) clear the stock market for arbitrary hypothetical holdings \(\theta _{k_0,t}\). We assume that a given trader \(k_0\in \{1,...,M+\bar{M}\}\) perceives that other traders \(k\ne k_0\) perceive the stock-price processes in (4.1). Hence, trader \(k_0\) perceives that other traders k, \(k\ne k_0\), optimally hold \(\theta ^Z_{k,t}\) in (4.2) shares of stock. Given this, we then find market-clearing \(Z_{k_0,t}\) processes associated with arbitrary hypothetical holdings \(\theta _{k_0,t}\) for trader \(k_0\).
First, consider a rebalancer \(i_0\in \{1,...,M\}\). We construct a process \(Z_{i_0,t}\) such that the stock market clears in the sense
where \(\theta _{i_0,t}\) denotes an arbitrary stock-holdings process for rebalancer \(i_0\) and other investors’ responses \(\theta ^{Z_{i_0}}_{k,t}\) are from (4.2) for \(Z_t := Z_{i_0,t}\). Clearly, any solution \(Z_{i_0,t}\) of (4.3) is specific to rebalancer \(i_0\). To describe one particular solution \(Z_{i_0,t}\), we insert (4.2) into (4.3). This produces an affine equation in \((\theta _{i_0,t},Z_{i_0,t}, \tilde{a}_{i_0}, q_{i_0,t},\eta _t,w_t,\tilde{a}_\Sigma )\). Because rebalancer \(i_0\) can neither observe nor infer \(w_t\) and \(\tilde{a}_\Sigma \) separately, she has to filter based on observing the linear combination of \(w_t\) and \(\tilde{a}_\Sigma \) given by \(Y_t := w_t - B(t)\tilde{a}_\Sigma \), where \(B:[0,1]\rightarrow \mathbb {R}\) is a continuously differentiable function satisfying
where A(t) is as in (2.17). The specific form of (4.4) comes from rewriting (4.3) in terms of \((\theta _{i_0,t},Z_{i_0,t}, \tilde{a}_{i_0}, q_{i_0,t},\eta _t,Y_t)\) rather than \((\theta _{i_0,t},Z_{i_0,t}, \tilde{a}_{i_0}, q_{i_0,t},\eta _t,w_t,\tilde{a}_\Sigma )\). Because A(t) in (2.17) depends on B(t), Eq. (4.4) is a fixed-point requirement for B(t). Below, we show that the coupled ODEs in (4.19) characterize (A, B) in (4.4), and we give conditions ensuring that (4.19) has a solution. Given a solution B(t) to (4.4), we use \(Y_t:=w_t - B(t)\tilde{a}_\Sigma \) from (2.8) to express a solution of (4.3) as^{Footnote 12}
The process \(Z_{i_0,t}\) in (4.5) captures the impact of arbitrary holdings \(\theta _{i_0,t}\) by rebalancer \(i_0\) on marketclearing stock prices given \(i_0\)’s perceptions of how other traders \(k\ne i_0\) optimally respond using \(\theta _{k,t}^{Z_{i_0}}\) from (4.2) with \(Z_t := Z_{i_0,t}\).
Next, we describe rebalancer \(i_0\)’s stock-price perceptions for \(i_0\in \{1,...,M\}\). Rebalancer \(i_0\) filters based on her own target \(\tilde{a}_{i_0}\) and on observations of past and current perceived market-clearing stock prices \(S^\nu _{i_0,u}\) defined by
where \((\tilde{a}_{i_0},\theta _{i_0,t})\) are known and \((Z_{i_0,t} ,q_{i_0,t},\eta _t)\) are inferred by rebalancer \(i_0\). The “\(\nu \)” superscript in (4.6) indicates that the perceived stock prices are defined with respect to a particular set of deterministic functions \((\nu _0,\nu _1,\nu _2,\nu _3)\), which we endogenously determine in Theorem 4.5 below. More specifically, by observing \(\tilde{a}_{i_0}\) and \((S^\nu _{i_0,u})_{u\in [0,t]}\) defined in (4.6), rebalancer \(i_0\) infers \(Y_t:=w_t - B(t)\tilde{a}_\Sigma \) from (2.8) using the Volterra argument behind Lemma 3.1. To see this, we insert (4.5) into (4.6) to produce rebalancer \(i_0\)’s perceived market-clearing stock-price dynamics
Because the expressions multiplying \((\tilde{a}_{i_0},q_{i_0,t},\eta _t,Y_t,\theta _{i_0,t})\) in (4.7) are continuous (deterministic) functions of time \(t\in [0,1]\), Lemma 3.1 applies and shows that by observing \(\tilde{a}_{i_0}\) and \((S^\nu _{i_0,u})_{u\in [0,t]}\) in (4.7) over time \(t\in [0,1]\), rebalancer \(i_0\) can infer \(w_{i_0,t}\). Subsequently, rebalancer \(i_0\) can use (2.10) and (2.14) to also infer \(Y_t\) over time \(t\in [0,1]\).
Next, consider a tracker \(j_0\in \{M+1,...,M+\bar{M}\}\). For arbitrary offequilibrium holdings \(\theta _{j_0,t}\), the marketclearing solution \(Z_{j_0,t}\) from
is given by
where A(t) is as in (2.17). Once again, \(Z_{j_0,t}\) captures tracker \(j_0\)’s perceptions of the impact of her holdings \(\theta _{j_0,t}\) on marketclearing stock prices given \(j_0\)’s perceptions of other investors’ \(k\ne j_0\) responses \(\theta _{k,t}^{Z_{j_0}}\) to \(\theta _{j_0,t}\).
Tracker \(j_0\)’s perceived marketclearing stockprice process is defined as
where \(\bar{\nu }_3,\bar{\nu }_4,\bar{\nu }_5:[0,1]\rightarrow \mathbb {R}\) are deterministic functions of time (endogenously determined in Theorem 4.5 below). Inserting (4.9) into (4.10) gives tracker \(j_0\)’s perceived market-clearing stock-price dynamics
We note that tracker \(j_0\)’s perceived marketclearing stockprice dynamics \(dS^{\bar{\nu }}_{j_0,t}\) in (4.11) are driven by the exogenous Brownian motion \(w_t\) from (2.2) whereas rebalancer \(i_0\)’s stock prices \(dS^\nu _{i_0,t}\) in (4.7) are driven by \(i_0\)’s innovations process \(dw_{i_0,t}\) from (2.11). This is due to the different information sets of rebalancers and trackers.
Unlike the price-friction equilibrium in Theorem 3.5, we see from (4.7) and (4.11) that, even with no direct price impact in the sense \(\alpha := 0\) in (4.6) and (4.10), the remaining net price impacts \(\frac{2 \nu _0(t) \kappa (t)}{M+\bar{M}-1}\) and \(\frac{2 \kappa (t)}{M+\bar{M}-1}\) of \(\theta _{i,t}\) and \(\theta _{j,t}\) are nonzero. This is because price pressure in (4.7) and (4.11) clears the stock market for arbitrary holdings \(\theta _{i,t}\) and \(\theta _{j,t}\).
The next result gives the optimal holdings \(\theta ^*_{k,t}\) for all traders \(k_0:=k\in \{1,...,M+\bar{M}\}\) given their perceptions of marketclearing stock prices in (4.7) and (4.11). While both \(\theta ^*_{k,t}\) and the optimal response holdings \(\theta ^{Z}_{k,t}\) in (4.2) maximize (2.5), they differ because they are based on different perceived stockprice processes. On one hand, the optimal responses \(\theta ^{Z}_{k,t}\) in (4.2) are based on the stockprice perceptions in (4.1). On the other hand, the optimizer \(\theta ^*_{k,t}\) is based on the marketclearing stockprice perceptions in (4.7) and (4.11).
Lemma 4.2
Let \(\nu _0,\nu _1\), \(\nu _2,\nu _3,\bar{\nu }_3,\bar{\nu }_4,\bar{\nu }_5:[0,1]\rightarrow \mathbb {R}\) and \(\kappa :[0,1]\rightarrow (0,\infty ]\) be continuous functions with \(\nu _0>0\) and assume \(\alpha \le 0\). Let the perceived marketclearing stockprice processes in the wealth dynamics (2.7) be given by (4.7) and (4.11) with corresponding filtrations \(\mathcal {F}_{i,t}:= \sigma (\tilde{a}_i,S^\nu _{i,u})_{u\in [0,t]}\) and \(\mathcal {F}_{j,t}:= \sigma (w_u,S^{\bar{\nu }}_{j,u})_{u\in [0,t]}\) for \( i\in \{1,...,M\}\) and \(j\in \{M+1,...,M+\bar{M}\}\). Then, provided the holding processes
satisfy (2.6), the traders’ maximizers for (2.5) are \(\theta _{i,t}^*\) for rebalancer \(i\in \{1,...,M\}\) and \(\theta _{j,t}^*\) for tracker \(j\in \{M+1,...,M+\bar{M}\}\). \(\diamondsuit \)
From Lemma 4.2, we note that a generic rebalancer \(i_0\) has filtration \(\sigma (\tilde{a}_{i_0},S^\nu _{i_0,u})_{u\in [0,t]}\) whereas she perceives that other rebalancers \(i\ne i_0\) have filtrations \(\sigma (\tilde{a}_i,Y_u,W_{i,u},S^Z_{i,u})_{u\in [0,t]}\) as in Lemma 4.1. Because these are \(i_0\)’s offequilibrium perceptions, this is allowable as long as they are consistent with i’s equilibrium holdings. We require this consistency in Definition 4.3(iii) below. We also note from Lemma 4.1 that rebalancer i can infer \(Z_{i_0,t}\) in (4.5). In turn, this allows rebalancer i, \(i\ne i_0\), to also know the process
However, knowing (4.13) is insufficient for rebalancer i, \(i\ne i_0\), to infer rebalancer \(i_0\)’s private target \(\tilde{a}_{i_0}\).
Equilibrium
Definition 4.3
Deterministic functions of time \(\mu _1,\mu _2,\mu _3,\bar{\mu }_4,\bar{\mu }_5,\nu _0,\nu _1,\nu _2,\nu _3,\bar{\nu }_3,\bar{\nu }_4,\bar{\nu }_5:[0,1]\rightarrow \mathbb {R}\) constitute a subgame perfect Nash financial-market equilibrium if:

(i)
For \(k \in \{1,...,M+\bar{M}\}\), trader k’s maximizer \(\theta ^*_{k,t}\) for (2.5) exists given the marketclearing stockprice perceptions (4.7) and (4.11).

(ii)
For \(k\in \{1,...,M+\bar{M}\}\), inserting trader k’s maximizer \(\theta ^*_{k,t}\) into the perceived marketclearing stockprice processes (4.7) and (4.11) produces identical stockprice processes across all traders. This common equilibrium stockprice process is denoted by \(S^*_t\).

(iii)
Optimizers and equilibrium holdings must be consistent in the sense that trader k’s perceived response to trader \(k_0\)’s maximizer \(\theta ^*_{k_0,t}\) is trader k’s maximizer \(\theta ^*_{k,t}\).

(iv)
The money and stock markets clear. \(\diamondsuit \)
The identical stockprice requirement in Definition 4.3(ii) is similar to the one in Definition 3.3(ii). We see from the rebalancers’ perceptions (4.6) that both the drifts and the martingale terms have i dependence. Similar to (3.5), we replace \(dw_{i,t}\) in \(dS^\nu _{i,t}\) in (4.6) with the decomposition of \(dw_{i,t}\) in terms of \(dw_t\) in (2.11) and rewrite \(dS^\nu _{i,t}\) in (4.6) as
Therefore, to ensure identical equilibrium stockprice perceptions for all traders \(k\in \{1,...,M+\bar{M}\}\), it suffices to match the drift of \(dS^{\bar{\nu }}_{j,t}\) in (4.10) for \(j\in \{M+1,...,M+\bar{M}\}\) with the drift of \(dS^\nu _{i,t}\) in (4.14) for the optimal holdings \(\theta _{i,t}:= \theta ^*_{i,t}\) for \(i\in \{1,...,M\}\). This produces the requirement
for all rebalancers \(i \in \{1,...,M\}\) and all trackers \(j\in \{M+1,...,M+\bar{M}\}\). The righthand side of (4.15) does not depend on the rebalancer index i. In (4.15), the process \(Z_{i,t}^*\) is (4.5) evaluated at \(\theta _{i,t}:= \theta ^*_{i,t}\), and \(Z_{j,t}^*\) is (4.9) evaluated at \(\theta _{j,t}:= \theta ^*_{j,t}\) so that:
for rebalancers \( i\in \{1,...,M\}\) and trackers \(j\in \{M+1,...,M+\bar{M}\}\).
As for the consistency requirement in Definition 4.3(iii), we first fix a rebalancer \(i_0\in \{1,...,M\}\). We require that the response holdings in (4.2) are consistent with \(\theta ^*_{i_0,t}\) in the sense that
for rebalancers \(i\in \{1,...,M\}\setminus \{i_0\}\) and trackers \(j\in \{M+1,...,M+\bar{M}\}\). Second, we fix a tracker \(j_0\in \{M+1,...,M+\bar{M}\}\) and require that the response holdings in (4.2) must be consistent with \(\theta ^*_{j_0,t}\) in the sense that
for rebalancers \(i\in \{1,...,M\}\) and trackers \(j\in \{M+1,...,M+\bar{M}\}\setminus \{j_0\}\).
Similar to the pricefriction equilibrium, our Nash equilibrium existence result is based on a technical lemma, which guarantees the existence of a solution to an autonomous system of coupled ODEs.
Lemma 4.4
Let \(\kappa :[0,1]\rightarrow (0,\infty ]\) be a continuous and integrable function (i.e., \(\int _0^1 \kappa (t)dt <\infty \)), let \(M+\bar{M}>2\), and let \(\alpha \le 0\). For a constant \(B(0) \in \mathbb {R}\), the coupled ODEs
have unique solutions with \(\Sigma (t) \ge 0\), \(\Sigma (t)\) decreasing, \(A(t) \in [-1,0]\), and A(t) decreasing for \(t\in [0,1]\). \(\diamondsuit \)
The affine ODE for B(t) in (4.19) is more complicated than the corresponding affine ODE in (3.7) because the Nash equilibrium has the additional fixed-point requirement in (4.4) that is absent in the price-friction equilibrium. However, both ODEs for B(t) are affine. It is possible to restate the ODE system (4.19) using a single path-dependent ODE. The special case \(\alpha :=0\) and \(B(0):=\frac{1}{{\bar{M}}}+\frac{1}{{\bar{M}} (M+{\bar{M}}-1)^2}\) produces a Nash model with no dynamic learning because \(B'(t)=0\) implies \(\Sigma '(t)=0\) and so \(d\eta _t=dq_{i,t}=0\). The resulting subgame perfect Nash equilibrium model only has learning at \(t=0\) and can be seen as a special case of Choi, Larsen, and Seppi [13].
Our main theoretical result gives a Nash equilibrium in terms of the ODEs (4.19). In this theorem, the pricefriction parameter \(\alpha \le 0 \), volatility \(\gamma >0\), and initial value \(B(0)\in \mathbb {R}\) are free parameters.
Theorem 4.5
Let \(\kappa :[0,1]\rightarrow (0,\infty )\) be continuous, let the functions \((B,A,\Sigma )\) be as in Lemma 4.4, let \(M+\bar{M}>2\), and let \(\alpha \le 0\). Then, we have:

(i)
A subgame perfect Nash financialmarket equilibrium exists and is given by the functions in (A.6) in Appendix A.

(ii)
The Nash equilibrium in (i) has holdings given by
$$\begin{aligned} \theta _{i,t}^*&:= \frac{(M+\bar{M}2) \left( 2 \kappa (t)\gamma B'(t)\right) }{\alpha (M+\bar{M})2 (M+\bar{M}1) \kappa (t)}\tilde{a}_i\nonumber \\&+\frac{\gamma (M+\bar{M}2) B'(t)}{\alpha (M+\bar{M})2 (M+\bar{M}1) \kappa (t)}q_{i,t}\nonumber \\&\frac{\begin{array}{l}\Big \{\gamma (M+\bar{M}2)^2 B'(t) (\alpha (M+\bar{M}+1)2 (M+\bar{M}) \kappa (t))\Big \}\nonumber \\ \end{array}}{\begin{array}{l} \Big \{(\alpha (M+\bar{M})2 (M+\bar{M}1) \kappa (t)) \big (\alpha \big ((3 M1) \bar{M}^2+M (3 M2) \bar{M}\nonumber \\ +(M2) M (M+1)+\bar{M}^3\big )2 \left( (M+\bar{M}2) (M+\bar{M})^2+\bar{M}\right) \kappa (t)\big )\Big \} \end{array} }\eta _t\nonumber \\&+\frac{\begin{array}{l}\Big \{2 \bar{M} (M+\bar{M}2) (M+\bar{M}1) \kappa (t)\Big \}\\ \end{array}}{\begin{array}{l} \Big \{\alpha \left( (3 M1) \bar{M}^2+M (3 M2) \bar{M}+(M2) M (M+1)+\bar{M}^3\right) \\ 2 \left( (M+\bar{M}2) (M+\bar{M})^2+\bar{M}\right) \kappa (t)\Big \} \end{array} } Y_t, \end{aligned}$$(4.20)$$\begin{aligned} \begin{aligned} \theta _{j,t}^* :&= \tfrac{\gamma (M+\bar{M}2) (M+\bar{M}1) B'(t)}{\alpha \left( (3 M1) \bar{M}^2+M (3 M2) \bar{M}+(M2) M (M+1)+\bar{M}^3\right) 2 \left( (M+\bar{M}2) (M+\bar{M})^2+\bar{M}\right) \kappa (t)}\eta _t \\&\qquad \tfrac{2 M (M+\bar{M}2) (M+\bar{M}1) \kappa (t)}{\alpha \left( (3 M1) \bar{M}^2+M (3 M2) \bar{M}+(M2) M (M+1)+\bar{M}^3\right) 2 \left( (M+\bar{M}2) (M+\bar{M})^2+\bar{M}\right) \kappa (t)}w_t \\&\qquad +\tfrac{(M+\bar{M}2) (M+\bar{M}1) \left( \gamma (A(t)+M1) B'(t)+2 \kappa (t)\right) }{\alpha \left( (3 M1) \bar{M}^2+M (3 M2) \bar{M}+(M2) M (M+1)+\bar{M}^3\right) 2 \left( (M+\bar{M}2) (M+\bar{M})^2+\bar{M}\right) \kappa (t)} \tilde{a}_\Sigma , \end{aligned} \end{aligned}$$for rebalancers \(i\in \{1,...,M\}\) and trackers \( j\in \{M+1,...,M+\bar{M}\}\).

(iii)
The Nash equilibrium in (i) has the stock-price process \(S^*_t\) given by \(S^*_0 := w_0 - B(0)\tilde{a}_\Sigma \) and dynamics with respect to the trackers’ filtrations \(\mathcal {F}_{j,t}:=\sigma (w_u,S^{\bar{\nu }}_{j,u})_{u\in [0,t]}\) given by
$$\begin{aligned} \begin{aligned} dS^*_t&=\Big \{\tfrac{\gamma (M+\bar{M}2) B'(t) (\alpha (M+\bar{M}+1)2 (M+\bar{M}) \kappa (t))}{\alpha \left( (3 M1) \bar{M}^2+M (3 M2) \bar{M}+(M2) M (M+1)+\bar{M}^3\right) 2 \left( (M+\bar{M}2) (M+\bar{M})^2+\bar{M}\right) \kappa (t)}\eta _t\\&\qquad \tfrac{2 \bar{M} (M+\bar{M}1) \kappa (t) (\alpha (M+\bar{M})2 (M+\bar{M}1) \kappa (t))}{\alpha \left( (3 M1) \bar{M}^2+M (3 M2) \bar{M}+(M2) M (M+1)+\bar{M}^3\right) 2 \left( (M+\bar{M}2) (M+\bar{M})^2+\bar{M}\right) \kappa (t)}w_t \\&\qquad \tfrac{(M+\bar{M}2) (\alpha (M+\bar{M}+1)2 (M+\bar{M}) \kappa (t)) \left( \gamma (A(t)+M1) B'(t)+2 \kappa (t)\right) }{\alpha \left( (3 M1) \bar{M}^2+M (3 M2) \bar{M}+(M2) M (M+1)+\bar{M}^3\right) 2 \left( (M+\bar{M}2) (M+\bar{M})^2+\bar{M}\right) \kappa (t)}\tilde{a}_\Sigma \Big \}dt \\&\qquad + \gamma dw_t, \end{aligned} \end{aligned}$$(4.21)and dynamics with respect to the rebalancers’ filtrations \(\mathcal {F}_{i,t}:=\sigma (\tilde{a}_i,S^\nu _{i,u})_{u\in [0,t]}\) given by
$$\begin{aligned} \begin{aligned} dS^*_t&=\Big \{\tfrac{\gamma (M+\bar{M}2) B'(t) (\alpha (M+\bar{M}+1)2 (M+\bar{M}) \kappa (t))}{\alpha \left( (3 M1) \bar{M}^2+M (3 M2) \bar{M}+(M2) M (M+1)+\bar{M}^3\right) 2 \left( (M+\bar{M}2) (M+\bar{M})^2+\bar{M}\right) \kappa (t)}\eta _t\\&\tfrac{2 \bar{M} (M+\bar{M}1) \kappa (t) (\alpha (M+\bar{M})2 (M+\bar{M}1) \kappa (t))}{\alpha \left( (3 M1) \bar{M}^2+M (3 M2) \bar{M}+(M2) M (M+1)+\bar{M}^3\right) 2 \left( (M+\bar{M}2) (M+\bar{M})^2+\bar{M}\right) \kappa (t)}Y_t \\&\gamma B'(t)(\tilde{a}_i+q_{i,t})\Big \}dt + \gamma dw_{i,t}. \end{aligned} \end{aligned}$$(4.22)\(\diamondsuit \)
The following observations follow from Theorem 4.5:

1.
The logic for the initial value B(0) being a free input parameter is the same as in the pricefriction equilibrium.

2.
The price-friction parameter \(\alpha \) and stock-price volatility \(\gamma \) affect the stock-price drift and holdings via their impact on B(t) in (4.19). The dependence on \(\alpha \) is different from the price-friction equilibrium where the corresponding B(t) in (3.7) is independent of \(\alpha \). The reason is that \(\alpha \) affects the perceived optimal responses in (4.2).

3.
Similar to (3.11) and (3.12), for an arbitrary trader \(k_0 \in \{1,...,M+\bar{M}\}\) and her arbitrary holdings \(\theta _{k_0,t}\), the optimal responses in (4.2) can be decomposed as
$$\begin{aligned} \theta ^{Z_{k_0}}_{i,t}&= \theta ^*_{i,t} - \frac{1}{M+\bar{M}-1} (\theta _{k_0,t}-\theta ^*_{k_0,t}),\quad i \in \{1,...,M\},\\ \theta ^{Z_{k_0}}_{j,t}&=\theta ^*_{j,t} - \frac{1}{M+\bar{M}-1} (\theta _{k_0,t}-\theta ^*_{k_0,t}),\quad j\in \{M+1,...,M+\bar{M}\}, \end{aligned}$$ (4.23) where the equilibrium holdings \((\theta ^*_{i,t}, \theta ^*_{j,t},\theta ^*_{k_0,t})\) are in (4.20).^{Footnote 13}

4.
The subgame perfect Nash financial-market equilibrium is attractive because of its reasonable off-equilibrium market-clearing perceptions. However, although much of the mathematical structure is similar, the expressions for the equilibrium stock price and holding coefficients are algebraically more complex. Nonetheless, our numerical results in Sect. 3.4 below show that the differences between the price-friction and the subgame perfect Nash financial-market equilibria are quantitatively small. This, in turn, suggests that the economic logic from the price-friction equilibrium carries over to the Nash equilibrium.
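As a check on the decomposition in observation 3, the following short computation (a sketch that reads the other traders’ responses as \(\theta ^*_{k,t} - \frac{1}{M+\bar{M}-1}(\theta _{k_0,t}-\theta ^*_{k_0,t})\) for each \(k\ne k_0\)) sums the \(M+\bar{M}-1\) responses and adds trader \(k_0\)’s arbitrary holdings:

$$\begin{aligned} \sum _{k\ne k_0}\theta ^{Z_{k_0}}_{k,t} + \theta _{k_0,t}&= \sum _{k\ne k_0}\theta ^*_{k,t} - \frac{M+\bar{M}-1}{M+\bar{M}-1}\,(\theta _{k_0,t}-\theta ^*_{k_0,t}) + \theta _{k_0,t}\\&= \sum _{k=1}^{M+\bar{M}}\theta ^*_{k,t}. \end{aligned}$$

Aggregate holdings therefore coincide with aggregate equilibrium holdings for any \(\theta _{k_0,t}\): each of the other traders absorbs an equal share of \(k_0\)’s deviation, so a market that clears at the equilibrium holdings also clears off equilibrium.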
Numerics
We have experimented extensively with the numerics of the subgame perfect Nash model, and they are very similar to those of the price-friction equilibrium in Sect. 3. The numerical similarity of the two equilibria suggests that the intuitions for the signs of the various coefficients in the price-friction equilibrium carry over to the subgame perfect Nash financial-market equilibrium. Because the two equilibria produce similar numerics, it appears that the in-equilibrium market-clearing requirement (common to both equilibria) has a much larger effect on equilibrium prices than the off-equilibrium market-clearing requirement (present only in the subgame perfect Nash equilibrium).
Empirical predictions
The primary contribution of our analysis is theoretical. The Kyle model has provided a tractable framework for a large body of theoretical research on price discovery and dynamic order splitting given long-lived asymmetric information about stock cash flows. However, no corresponding tractable framework exists for modeling price discovery and dynamic order splitting with private trading targets (e.g., by large index funds). Our model provides such a framework. While our zero-dividend modeling approach precludes statements about the impact of orders on price levels, our analysis does have empirical implications for intraday price drifts:
First, intraday price predictability is an important empirical driver of highfrequency liquidity provision. Our model’s equilibrium price dynamics in (3.9) and (4.21) suggest that intraday price drifts are path dependent (via the \(\eta _t\) term) and also that learning about parentdemand imbalances early in the trading day is associated with predictable price drifts later in the day.
Second, our analysis provides insights about the determinants of price impact as it relates to imbalancerelated parent trading demands and toxic cumulative order flow. In particular, the holdings \(\theta _{k,t}\) are cumulative trading up through time t, and large parent targets \(\tilde{a}_i\) lead to toxic streams of orders. Our subgame perfect Nash model endogenizes the pricedrift impact of investor holdings (i.e., cumulative trading). The Nash model’s pricefriction coefficient in the rebalancer’s perceived stockprice dynamics (4.7) is given by
where we have inserted \(\nu _0(t)\) from (A.6). An implication of (5.1) is that if, as is widely believed, investor target penalties become stronger as time passes (i.e., if \(\kappa (t)\) increases with time), then our Nash model predicts that the total price impact in (5.1) should increase. On its face, this is contrary to evidence in Barardehi and Bernhardt [6] that price impact declines over the trading day. We conjecture, however, that a richer model can be reconciled with these stylized facts if the number of investors (and, thus, the available inventory bearing capacity to absorb aggregate parent demand imbalances) is also allowed to grow as the market approaches the end of the trading day. Increased investor participation toward the end of the trading day is also empirically common.
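The trade-off in this conjecture can be illustrated with a minimal sketch. It does not use (5.1); as a stand-in, it uses the trackers’ net price impact \(\frac{2 \kappa (t)}{M+\bar{M}-1}\) stated in Sect. 4 for \(\alpha :=0\). The function name and all numerical values are hypothetical and chosen only for illustration.

```python
def tracker_net_price_impact(kappa: float, n_traders: int) -> float:
    """Net price impact 2*kappa / (n_traders - 1) for alpha = 0.

    kappa is the penalty severity at a given time and n_traders
    stands for M + M_bar, the total number of traders. This is a
    stand-in expression, not (5.1).
    """
    if kappa <= 0:
        raise ValueError("kappa must be positive")
    if n_traders <= 2:
        raise ValueError("need more than two traders")
    return 2.0 * kappa / (n_traders - 1)


# Hypothetical values: the penalty doubles between early and late trading.
early = tracker_net_price_impact(kappa=1.0, n_traders=10)
late_fixed_participation = tracker_net_price_impact(kappa=2.0, n_traders=10)
late_more_participation = tracker_net_price_impact(kappa=2.0, n_traders=28)
```

With a fixed number of traders, the doubled penalty doubles the impact. In this example, growth in participation from 10 to 28 traders more than offsets it, so the late-day impact falls below the early-day impact.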
Measuring execution costs
As an application, this section gives a measure of a rebalancer’s costs of rebalancing from zero endowed shares at time \(t=0\) to a given target \(\tilde{a}_i\). We present the measure in the pricefriction equilibrium in Sect. 3 (the Nash analogue is logically similar and produces similar numerics). In the pricefriction equilibrium, rebalancer i’s value function is
where \(\hat{\theta }_{i,t}\) denotes rebalancer i’s equilibrium stock holdings in (3.8) and \(\mathcal {F}_{i,t}:=\sigma (\tilde{a}_i,S^f_{i,u})_{u\in [0,t]}\) where the f coefficient functions are as in (A.1) in Appendix A for \(i\in \{1,...,M\}\). We seek a value function \(J= J(\tilde{a}_i,s,\eta ,Y,q_i)\) such that the process
is a martingale with respect to \(\mathcal {F}_{i,t}\). Because rebalancer i’s objective in (2.5) is linearquadratic, the value function J is again linearquadratic in the state processes. Thus, J can be written as
for deterministic functions of time \((J_0, J_\eta , J_Y, J_{q_i},J_{\eta \eta },J_{\eta Y}, J_{YY},J_{q_iq_i},J_{q_i\eta },J_{q_iY})\). These functions are given by a coupled set of ODEs with zero terminal conditions (we omit the ODEs for brevity). In (6.3), the dummy variables \((\eta ,Y,q_i)\) are real numbers and \(s\in [0,1]\).
To quantify the costs associated with rebalancer i’s trading target \(\tilde{a}_i\), the quadratic mapping RC (Rebalancing Costs) defined by
measures the change in profit (i.e., the change in value function) associated with a nonzero target \(\tilde{a}_i\). The rebalancing cost RC in (6.4) for a target \(\tilde{a}_i\) is computed as the difference between the value function evaluated at \(\tilde{a}_i\) and the function evaluated at \(\tilde{a}_i = 0\). Since the value function J is highest at \(\tilde{a}_i = 0\), the measure RC is positive.
Figure 7 plots the rebalancer’s value function J for different target values \(\tilde{a}_i\) for different model parameterizations. When the target \(\tilde{a}_i\) is close to zero, the rebalancers become high-frequency liquidity providers. Their value function is positive due to expected profit from liquidity provision and price-pressure speculation. As the target moves away from zero, the rebalancer starts to have larger stock-holding penalties that eventually drive the rebalancer’s value function negative. Interestingly, the impact of the stock-price volatility parameter \(\gamma \) on the rebalancer’s value function can be positive or negative. Liquidity-providing rebalancers are better off with a small \(\gamma \) whereas rebalancers with large rebalancing targets are better off when \(\gamma \) is large.
Conclusion
This paper presents the first analytically tractable model of dynamic learning about parent trading-demand imbalances with optimized order-splitting. In particular, we provide closed-form expressions for prices and stock holdings in terms of solutions to systems of coupled ODEs in both the price-friction and subgame perfect Nash equilibria. Trading in our models reflects a combination of reaching investors’ own trading targets, liquidity provision so that markets can clear, and speculation based on predictions of future price pressure.
There are many interesting directions for future research based on our analysis. First, replacing the zero-dividend stock approach with valuation based on a terminal payoff would be a significant technical step. Second, the model could be enriched by allowing for investor heterogeneity in the form of different penalty functions \(\kappa (t)\) and by having multiple tracker targets (which would weaken the trackers’ informational advantage). Third, it would be interesting to investigate if other off-equilibrium refinements have larger equilibrium effects. Fourth, incorporating risk aversion into the investors’ objectives would be interesting too. For example, how can Lemma 4.1 be extended if the objectives in (2.5) are changed to exponential utilities?
Notes
Adding a volatility coefficient \(\sigma _w\) in front of \(w^\circ _t\) in (2.2) does not increase model flexibility because — as we shall see — the stock volatility \(\gamma \) is a free model parameter and \(\gamma \) and \(\sigma _w\) would play identical roles. Moreover, our model can be extended to include a drift term \(\mu _w t\) for a constant \(\mu _w\) in (2.2).
Our model features asymmetric information and learning about parent demands. However, because there are no stock dividends, there can be no asymmetric information related to future dividends.
Our analysis can be extended to allow for different penalty functions for the two groups of traders.
The process \(Y_{i,t}\) is also informative about the current value of the trackers’ target \(w_t\). Using (2.9) and (2.11), we have \(\mathbb {E}[w_t \mid \sigma (Y_{i,u})_{u\in [0,t]}] = \mathbb {E}[Y_{i,t} + B(t)(\tilde{a}_\Sigma - \tilde{a}_i) \mid \sigma (Y_{i,u})_{u\in [0,t]}] =Y_{i,t} +B(t) q_{i,t}\).
Our model can be extended to allow for a different pricefriction coefficient \(\bar{\alpha }\) with \(\bar{\alpha }\ne \alpha \) for the trackers.
We nickname this our “Rashomon Theorem” after the 1950 movie in which different characters perceive the same event differently given their different perspectives. Rebalancers and trackers both start with private information so their filtrations are not nested. However, in equilibrium, stockprice dynamics depend on \(w_t\) and \(\tilde{a}_\Sigma \). Because the trackers know \(w_0\) at time \(t=0\), they infer \(\tilde{a}_\Sigma \) from \(S_{j,0}=w_0B(0)\tilde{a}_\Sigma \), and so they have no need to filter at later times. On the other hand, rebalancer i only has noisy dynamic predictions \(\mathbb {E}[\tilde{a}_\Sigma \mathcal {F}_{i,t}] = q_{i,t}+\tilde{a}_i\) of the aggregate parent imbalance \(\tilde{a}_\Sigma \) given her inferences based on the individual parent targets \(\tilde{a}_i\) and stockprice observations.
Similar to a money market account, a non-dividend-paying stock is a financial asset in the sense that holding one stock at time \(t=1\) gives one unit of consumption at \(t=1\). Likewise, being short one stock at \(t=1\) means the trader provides one unit of consumption at \(t=1\). Both the bank account and the non-dividend-paying stock have exogenous initial prices and volatilities. It is customary for the money market account’s initial price to be one and its volatility to be zero. For the non-dividend-paying stock, we set the initial price to be \(Y_0\) and its volatility to be a positive constant \(\gamma >0\), and we determine the drift endogenously.
There are long-lived non-dividend-paying stocks too. For example, Atmaz and Basak [1] write: “For example, Hartzmark and Solomon [24] find that over the long-sample of 1927–2011, the average proportion of no-dividend stocks is around 35% and accounts for 21.3% of the aggregate US stock market capitalization. Similarly, by taking into account of rising share repurchase programs since the mid-1980ies, Boudoukh et al. [8] report that over the 1984–2003 period, the average proportion of no-dividend stocks is 64% and no-payout stocks, i.e., no dividends or no share repurchases, is 51% with the relative market capitalizations of 16.4% and 14.2%, respectively.”
This is similar to Eq. (2.16) in Chen, Choi, Larsen, and Seppi [11].
References
Atmaz, A., Basak, S.: Stock market and no-dividend stocks. J Finance 77(1), 545–599 (2022)
Almgren, R.: Optimal execution with nonlinear impact functions and trading-enhanced risk. Appl Math Finance 10, 1–18 (2003)
Almgren, R., Chriss, N.: Value under liquidation. Risk 12, 61–63 (1999)
Almgren, R., Chriss, N.: Optimal execution of portfolio transactions. J Risk 3, 5–39 (2000)
Back, K., Cao, H., Willard, G.: Imperfect competition among informed traders. J Finance 55, 2117–2155 (2000)
Barardehi, Y.H., Bernhardt, D.: Uncovering the impacts of endogenous liquidity consumption in intraday trading patterns, working paper (2021)
Bouchard, B., Fukasawa, M., Herdegen, M., Muhle-Karbe, J.: Equilibrium returns with transaction costs. Finance Stochast 22, 569–601 (2018)
Boudoukh, J., Michaely, R., Richardson, M., Roberts, M.R.: On the importance of measuring payout yield: implications for empirical asset pricing. J Finance 62, 877–915 (2007)
Brunnermeier, M.K., Pedersen, L.H.: Predatory trading. J Finance 60, 1825–1863 (2005)
Carlin, B., Lobo, M., Viswanathan, S.: Episodic liquidity crises: cooperative and predatory trading. J Finance 62, 2235–2274 (2007)
Chen, X., Choi, J.H., Larsen, K., Seppi, D.: Asset-pricing puzzles and price-friction, working paper (2021)
Choi, J.H., Larsen, K., Seppi, D.: Information and trading targets in a dynamic market equilibrium. J Financ Econ 132, 22–49 (2019)
Choi, J.H., Larsen, K., Seppi, D.: Equilibrium effects of intraday order-splitting benchmarks. Math Financ Econ 15, 315–352 (2021)
Cuoco, D., He, H.: Dynamic equilibrium in infinite-dimensional economies with incomplete financial markets, Wharton working paper (1994)
Cuoco, D., Cvitanić, J.: Optimal consumption choices for a large investor. J Econ Dyn Control 22, 401–436 (1998)
Davis, M.H.A.: Linear estimation and stochastic control. Wiley, New Jersey (1977)
Du, S., Zhu, H.: What is the optimal trading frequency in financial markets? Rev Econ Stud 84, 1606–1651 (2017)
Foster, F., Viswanathan, S.: Strategic trading with asymmetrically informed traders and long-lived information. J Financ Quant Anal 29, 499–518 (1994)
Foster, F., Viswanathan, S.: Strategic trading when agents forecast the forecasts of others. J Finance 51, 1437–1478 (1996)
Gârleanu, N., Pedersen, L.H.: Dynamic portfolio choice with frictions. J Econ Theory 165, 487–516 (2016)
Grossman, S.J., Miller, M.: Liquidity and market structure. J Finance 43, 617–633 (1988)
Grossman, S.J., Stiglitz, J.E.: On the impossibility of informationally efficient markets. Am Econ Rev 70, 393–408 (1980)
Hartman, P.: Ordinary differential equations, 2nd Ed., SIAM Classics in Applied Mathematics (2002)
Hartzmark, S.M., Solomon, D.H.: The dividend month premium. J Financ Econ 109, 640–660 (2013)
van Kervel, V., Kwan, A., Westerholm, P.: Order splitting and interacting with a counterparty, working paper (2020)
van Kervel, V., Menkveld, A.: High-frequency trading around large institutional orders. J Finance 74, 1091–1137 (2019)
Karatzas, I., Shreve, S.E.: Methods of mathematical finance. Springer, New York (1998)
Korajczyk, R.A., Murphy, D.: High-frequency market making to large institutional trades. Rev Financ Stud 32, 1034–1067 (2019)
Kyle, A.: Continuous auctions and insider trading. Econometrica 53, 1315–1336 (1985)
Liptser, R.S., Shiryaev, A.N.: Statistics of random processes I. Springer, Berlin (2001)
Noh, E., Weston, K.: Price impact equilibrium with transaction costs and TWAP trading. Math Financ Econ 16, 187–204 (2022)
O’Hara, M.: High frequency market microstructure. J Financ Econ 116, 257–270 (2015)
Sannikov, Y., Skrzypacz, A.: Dynamic trading: Price inertia and front running, working paper (2016)
Schied, A., Schöneborn, T.: Risk aversion and the dynamics of optimal liquidation strategies in illiquid markets. Finance Stochast 13, 181–204 (2009)
Vayanos, D.: Strategic trading and welfare in a dynamic market. Rev Econ Stud 66, 219–254 (1999)
Funding
The authors have benefited from helpful comments from Dan Bernhardt, John Kuong (Paris Finance discussant), and participants at the SIAM and INFORMS math finance conferences (2021), the Paris Finance meeting (2021), and Carnegie Mellon. Jin Hyuk Choi is supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (Nos. 2020R1C1C1A01014142 and 2021R1A4A1032924). Kasper Larsen has been supported by the National Science Foundation under Grant No. DMS 1812679 (2018–2022). Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation (NSF). The datasets generated during and/or analyzed during the current study are available from the corresponding author on reasonable request.
Appendices
A Formulas
A.1 Price-perception coefficients for the price-friction equilibrium
A.2 Orthogonal representations for the price-friction equilibrium
Let the deterministic functions \(F_1(t)\) and \(F_2(t)\) be as in (2.19).
A.2.1 Price-friction equilibrium holdings
The price-friction equilibrium holdings \(\hat{\theta }_{i,t}\) in (3.8) for rebalancer \( i\in \{1,...,M\}\) have an orthogonal representation given by
The price-friction equilibrium holdings \(\hat{\theta }_{j,t}\) in (3.8) for tracker \( j\in \{M+1,...,M+{\bar{M}}\}\) have an orthogonal representation given by
A.2.2 Price-friction equilibrium stock dynamics
For the trackers, we can rewrite the drift in (3.9) in terms of \((\tilde{a}_\Sigma , w_0)\) and a residual orthogonal term as
For the rebalancers, we can rewrite the drift in (3.10) in terms of \((\tilde{a}_\Sigma -\tilde{a}_i, w_0, \tilde{a}_i)\) and a residual orthogonal term as
A.3 Price-perception coefficients for the Nash equilibrium
B Kalman–Bucy filtering
The proof of Lemma 2.1 follows from the well-known Kalman–Bucy result in filtering theory and can be found in, e.g., Liptser and Shiryaev [30, Chapter 8]. We note that the solution to the Riccati equation (B.3) below is given by (2.15).
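The claim about the Riccati equation can be checked in one line. The sketch below is a reconstruction under an assumption, not a quotation of (B.3) or (2.15): it assumes that (B.3) is the scalar Riccati ODE of a Kalman–Bucy filter with unit-variance observation noise and observation loading \(-B'(t)\), which is consistent with the expression for \(\Sigma (t)\) quoted in the proof of Lemma 4.4.

```latex
% Assumed form of the Riccati ODE (B.3):
\Sigma'(t) = -\big(B'(t)\big)^2\,\Sigma(t)^2 .
% Differentiate the reciprocal:
\frac{d}{dt}\,\frac{1}{\Sigma(t)} = -\frac{\Sigma'(t)}{\Sigma(t)^2} = \big(B'(t)\big)^2 .
% Integrate from 0 to t and invert:
\Sigma(t) = \frac{1}{\dfrac{1}{\Sigma(0)} + \displaystyle\int_0^t \big(B'(s)\big)^2\,ds} .
```

Because the denominator is at least \(1/\Sigma (0)>0\), this expression stays in \((0,\Sigma (0)]\) regardless of how large \(B'\) becomes, which is the non-explosion property used in the proof of Lemma 3.4.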
Theorem B.1
(Kalman–Bucy) Let \(B:[0,1]\rightarrow \mathbb {R}\) be a continuously differentiable function and consider the Gaussian observation process \(Y_{i,t}:= w_t - B(t)(\tilde{a}_\Sigma -\tilde{a}_i)\) from (2.9) with dynamics
and corresponding innovations process \(w_{i,t}\) in (2.11). Then, (2.14) holds and the filtering property in (2.11) holds if \(q_{i,t}\) has dynamics given by
and the remaining variance is given by
with initial value
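As a numerical illustration of this type of filter, the following Python sketch simulates a scalar Kalman–Bucy filter for a constant latent variable. It is a sketch under stated assumptions rather than the paper's exact specification: the latent variable `x` stands in for \(\tilde{a}_\Sigma -\tilde{a}_i\), the observation increments are taken to be \(dY = dw - B'(t)\,x\,dt\) with a standard Brownian motion \(w\), the filter is started at \(q_0=0\) with prior variance \(\Sigma (0)\), and the loading \(B(t)=1+t^2\) and all function names are hypothetical choices made only for this example.

```python
import math
import random


def b_prime(t):
    """Derivative of the hypothetical loading B(t) = 1 + t**2 (an assumption)."""
    return 2.0 * t


def sigma_closed_form(t, sigma0, n_quad=1000):
    """Sigma(t) = 1 / (1/Sigma(0) + int_0^t B'(s)^2 ds), via midpoint quadrature."""
    h = t / n_quad
    integral = sum(b_prime((k + 0.5) * h) ** 2 for k in range(n_quad)) * h
    return 1.0 / (1.0 / sigma0 + integral)


def sigma_euler(t, sigma0, n_steps=20000):
    """Euler scheme for the assumed Riccati ODE Sigma' = -(B')^2 * Sigma^2."""
    h = t / n_steps
    sigma = sigma0
    for k in range(n_steps):
        sigma -= (b_prime(k * h) * sigma) ** 2 * h
    return sigma


def simulate_filter(sigma0, n_steps, rng):
    """Simulate one path on [0, 1] and return (latent x, filter estimate q_1)."""
    dt = 1.0 / n_steps
    x = rng.gauss(0.0, math.sqrt(sigma0))  # latent constant, prior N(0, sigma0)
    q, sigma = 0.0, sigma0
    for k in range(n_steps):
        bp = b_prime(k * dt)
        dy = rng.gauss(0.0, math.sqrt(dt)) - bp * x * dt  # observation increment
        q += -bp * sigma * (dy + bp * q * dt)             # gain times innovation
        sigma -= (bp * sigma) ** 2 * dt                   # Riccati step
    return x, q


def mean_squared_error(sigma0=1.0, n_paths=2000, n_steps=200, seed=0):
    """Monte Carlo estimate of E[(x - q_1)^2], to compare with Sigma(1)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_paths):
        x, q = simulate_filter(sigma0, n_steps, rng)
        total += (x - q) ** 2
    return total / n_paths
```

With these choices \(\int _0^1 (B'(s))^2ds = 4/3\), so the closed form gives \(\Sigma (1)=3/7\) when \(\Sigma (0)=1\); the Euler solution of the Riccati ODE and the simulated mean squared filtering error should both be close to that value, up to discretization and sampling error.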
C Remaining proofs
Proof of Lemma 2.2
To see that (2.16) holds, we use the Kalman–Bucy filter (B.2) to write
Then,
To explicitly solve for \(\sum _{i=1}^M q_{i,t}\), we note
We get the solution \(\sum _{i=1}^M q_{i,t}\) by integrating
Thus, the decomposition (2.16) holds with
For the second part, we write the solution to the Ornstein–Uhlenbeck SDE for \(d\eta _t\) in (2.17) as
where the deterministic functions \(F_1(t)\) and \(F_2(t)\) are given by the ODEs in (2.19). Similarly, the Ornstein–Uhlenbeck SDE for \(dq_{i,t}\) in (B.2) has solution
By comparing (C.6) and (C.7), we get (2.18). \(\diamondsuit \)
Proof of Lemma 3.1
The inclusion “\(\supseteq \)” in (3.2) follows from (2.4), (2.10), and (2.14). To see the inclusion “\(\subseteq \)”, we use \(Y_t\) in (2.8), \(\eta _t\) in (C.5), and \(q_{i,t}\) in (B.2) to find deterministic functions \(h_0,h\), and H such that
We define
The inclusion “\(\subseteq \)” in (3.2) will follow from the inclusion
To see (C.10), let \(t_0 \in [0,t]\) be arbitrary and let f(s), \(s\in [0,t]\), solve the following Volterra integral equation of the second kind (such f exists by Lemma 4.3.3 in Davis [16] because \(\gamma \ne 0\)):
This gives us
\(\diamondsuit \)
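To make the Volterra step in this proof more concrete, the following Python sketch solves a generic Volterra integral equation of the second kind, \(f(s) = g(s) + \int _0^s K(s,u) f(u)\,du\), with the trapezoidal rule. The forcing term `g`, the kernel `kernel`, and the function name are placeholders introduced here for illustration; they are not the specific equation used in the proof, whose kernel depends on the model coefficients.

```python
import math


def solve_volterra_second_kind(g, kernel, t_end, n_steps):
    """Trapezoidal scheme for f(s) = g(s) + int_0^s kernel(s, u) f(u) du on [0, t_end].

    Returns the grid and the approximate values of f on that grid.
    """
    h = t_end / n_steps
    grid = [k * h for k in range(n_steps + 1)]
    f = [g(grid[0])]  # at s = 0 the integral vanishes, so f(0) = g(0)
    for n in range(1, n_steps + 1):
        s = grid[n]
        # Trapezoidal weights: 1/2 at the left endpoint, 1 at interior nodes.
        acc = 0.5 * kernel(s, grid[0]) * f[0]
        acc += sum(kernel(s, grid[m]) * f[m] for m in range(1, n))
        # The right endpoint involves the unknown f(s); solve for it explicitly.
        denom = 1.0 - 0.5 * h * kernel(s, s)
        f.append((g(s) + h * acc) / denom)
    return grid, f


# Usage: g = 1 and kernel = 1 give f(s) = 1 + int_0^s f(u) du, i.e. f(s) = exp(s).
grid, values = solve_volterra_second_kind(lambda s: 1.0, lambda s, u: 1.0, 1.0, 200)
```

For continuous `g` and `kernel`, equations of this type have a unique continuous solution, which is why the scheme can solve forward in `s` one node at a time.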
Proof of Lemma 3.2
Consider a rebalancer \(i\in \{1,...,M\}\). For arbitrary holdings \(\theta _{i,t}\), the expectation in the i’th objective in (2.5) is
The equality in (C.13) follows from the square integrability condition (2.6), which ensures that the stochastic integral \(\int _0^s \theta _{i,t} dw_{i,t}\) is a martingale with zero expectation. We can maximize the integrand in (C.13) pointwise because the second-order condition \(\alpha <\kappa (t)\) holds. This gives the first formula in (3.4).
The second formula for a tracker j in (3.4) is proved similarly. \(\diamondsuit \)
Proof of Lemma 3.4
The local Lipschitz property of the ODEs (3.7) ensures that there exists a maximal interval of existence \([0, \tau )\) with \(\tau \in (0,\infty ]\) by the Picard–Lindelöf theorem (see, e.g., Theorem II.1.1 in Hartman [23]). We assume that \(\tau <1\) and construct a contradiction. To this end, we set
First, the Riccati ODE for \(\Sigma (t)\) has the explicit solution in (2.15), which cannot explode as \(t\uparrow \tau \) (even if B(t) should explode as \(t\uparrow \tau \)).
Second, the initial value A(0) in (3.7) ensures \(A(0)\ge 1\). To see that this implies \(A(t)\ge 1\) for all \(t\in [0,\tau )\), we note
which implies
This shows that A(t) cannot explode as \(t\uparrow \tau \) (even if B(t) should explode as \(t\uparrow \tau \)).
Third, we show that B(t) is uniformly bounded for \(t\in [0,\tau )\); hence, B(t) cannot explode as \(t\uparrow \tau \) either. This then gives the desired contradiction because of Theorem II.3.1 in Hartman [23]. The affine ODE for B(t) in (3.7) has the explicit solution
We can use K in (C.14) to produce the upper bound
In turn, the bound (C.18) and (C.17) imply
for \(t\in [0,\tau )\). Because the upper bound in (C.19) is uniform over \(t\in [0,\tau )\), B(t) cannot explode as \(t\uparrow \tau \). \(\diamondsuit \)
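The third step follows a standard pattern for affine ODEs, which can be summarized generically as follows. The coefficient names \(a(t)\) and \(b(t)\) are placeholders introduced here for illustration; in the proof they correspond to the coefficients of the \(B(t)\) equation in (3.7), which depend on \(A(t)\), \(\Sigma (t)\), and \(\kappa (t)\). This is a sketch of the general argument, not a restatement of the paper's displays (C.17)–(C.19).

```latex
% Generic affine ODE with continuous placeholder coefficients a(t), b(t):
B'(t) = a(t)\,B(t) + b(t), \qquad t\in[0,\tau).
% Variation of constants gives the explicit solution:
B(t) = e^{\int_0^t a(u)\,du}\,B(0) + \int_0^t e^{\int_s^t a(u)\,du}\,b(s)\,ds .
% Since e^{\int_s^t a(u)\,du} \le e^{\int_0^t |a(u)|\,du} for 0 \le s \le t:
|B(t)| \le e^{\int_0^t |a(u)|\,du}\Big(|B(0)| + \int_0^t |b(s)|\,ds\Big).
```

If \(a\) and \(b\) are bounded on \([0,\tau )\) with \(\tau <1\), the right-hand side is bounded uniformly in \(t\), so the solution cannot explode at \(\tau \).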
Proof of Theorem 3.5
To see that the holdings in (3.8) satisfy the square integrability condition (2.6), we insert \(B'(t)\) from (3.7) to get
Because \(\kappa :[0,1]\rightarrow (0,\infty )\) is continuous, \(\kappa (t)\) is uniformly bounded. This gives us that \(B'(t)\) in (3.7) is also uniformly bounded. As a consequence, the variances \(\mathbb {V}[q_{i,t}], \mathbb {V}[\eta _t]\), and \(\mathbb {V}[Y_t]\) are also uniformly bounded functions of \(t\in [0,1]\). Therefore, the holding processes in (C.20) satisfy (2.6) if the coefficient functions for \((\tilde{a}_i, q_{i,t},\eta _t,Y_t, w_t,\tilde{a}_\Sigma )\) are square integrable over \(t\in [0,1]\). For example, the coefficient function for \(\tilde{a}_i\) in \(\hat{\theta }_{i,t}\) is bounded because
which is continuous for \(t\in [0,1]\). Similarly, the remaining coefficient functions can be seen to be bounded too. The optimality in Definition 3.3(i) then follows from Lemma 3.2 and the fact that the holdings (3.8) are those in (3.4) with the f functions in (A.1) inserted.
Definition 3.3(ii)+(iii) are ensured by the specific f functions in (A.1).
\(\diamondsuit \)
Proof of Lemma 4.1
Lemma A.1 in Choi, Larsen, and Seppi [13] and the continuity of \(Z_t\)’s paths imply that \(Z_t\) is adapted to both \(\mathcal {F}_{i,t}\) and \(\mathcal {F}_{j,t}\). The rest of this proof is similar to the proof of Lemma 3.2 given above and is therefore omitted. \(\diamondsuit \)
Proof of Lemma 4.2
The rebalancers’ second-order condition is
whereas the trackers’ second-order condition is \(\alpha <\kappa (t)\). Inequality (C.22) holds because \(\nu _0(t)\ge 0\) and \(\alpha <\kappa (t)\). The rest of this proof is similar to the proof of Lemma 3.2 given above and is therefore omitted. \(\diamondsuit \)
Proof of Lemma 4.4
The proof only requires minor changes to the proof of Lemma 3.4. As before, we let \([0, \tau )\) be the maximal interval of existence with \(\tau \in (0,\infty ]\) and assume that \(\tau <1\) to construct a contradiction. As in the proof of Lemma 3.4, \(\Sigma (t)=\frac{1}{\frac{1}{\Sigma (0)}+\int _0^t(B'(s))^2ds}\) and \(A(t)\ge 1\). Next, to show that B(t) is bounded on \([0,\tau )\), we rewrite the ODE for B(t) in (4.19) as
where the deterministic function c(t) is defined as
Because \(\alpha \le 0\) and \(\kappa (t)>0\), we have \(c(t) >0\). Furthermore, c(t) is bounded because
where the inequality follows from \(2(M+{\bar{M}}) > (M+{\bar{M}} +1)\) and the positivity of \(\kappa (t)\). Because \(A(t) + 1\ge 0\) and \(c(t)>0\) we get the two estimates
where K is as in (C.14). Similar to (C.17), the explicit solution of (C.23) is
Combining this expression for B(t) with the bounds (C.26) produces
\(\diamondsuit \)
Proof of Theorem 4.5
From (C.23) we see that
where \(c_0\) is defined in (C.25). Because \(\kappa (t)\) is continuous on \(t\in [0,1]\), \(\kappa (t)\) is bounded and from (C.19) we know that B(t) is bounded too. Therefore, from (C.29), we see that \(B'(t)\) is also uniformly bounded. Consequently, the variances \(\mathbb {V}[q_{i,t}], \mathbb {V}[\eta _t]\), and \(\mathbb {V}[Y_t]\) are also uniformly bounded functions of \(t\in [0,1]\).
As before, the coefficient functions for \((\tilde{a}_i, q_{i,t},\eta _t,Y_t, w_t,\tilde{a}_\Sigma )\) in (4.20) are all uniformly bounded for \(t\in [0,1]\). Therefore, the square-integrability condition (2.6) holds.
The requirements in Definition 4.3 follow from the definition of the functions in (A.6). \(\diamondsuit \)
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Chen, X., Choi, J.H., Larsen, K. et al. Learning about latent dynamic trading demand \(^*\). Math Finan Econ (2022). https://doi.org/10.1007/s11579-022-00317-5
Keywords
 Order-splitting
 Optimal order execution
 Subgame perfect Nash equilibrium
 Dynamic learning
 Trading targets
 Speculation
JEL codes
 G11
 G12