1 Introduction

Optimal investment is a tremendously rich source of mathematical challenges in stochastic control theory. The key driver in this problem is the tradeoff between risk and return. Thus, information on investment opportunities plays a role that is as important for the mathematical theory as it is in practice, where investors go to great lengths to secure even the slightest advantage in knowledge. So it is no wonder that insider information has been widely studied in the literature; see, for instance, [2, 3, 5, 20, 24] where an investor obtains extra information about the stock price evolution at some fixed point in time. In contrast to these studies, the present paper takes a more dynamic view on information gathering and affords the investor the opportunity to continually peek \(\Delta \) units of time into the future. Closest to such an investor in reality may be high-frequency traders (“frontrunners”) that get access to order flow information earlier or are able to process it faster than their competition. To the best of our knowledge, this paper is the first continuous-time stochastic control paper with such a feature, apart from the optimal stopping problem of [9].

Of course, perfect knowledge about future stock prices easily lets optimal investment problems degenerate and so it is of great interest to understand how market mechanisms may curb an investor’s ability to take advantage of this extra information. A most satisfactory approach from an economic point of view is the equilibrium approach due to Kyle [22] where the insider knows the terminal stock price right from the start and internalizes the impact of her orders on market prices. Generalizations of this approach are challenging; see [6, 7, 28] and the references therein. For dynamic information advantages in this context, we refer to [11, 12] who consider an insider receiving a dynamic signal on, respectively, the terminal asset price or the traded firm’s default time. These models, however, do not get close to addressing the intrinsically infinite-dimensional information structure of our peek-ahead setting. Fortunately, the much simpler market impact model of [1], which just imposes quadratic transaction costs on the investor, also turns out to provide sufficient friction to make the optimal investment problem viable. In an insider model where additional information is obtained just once, [4] use such a friction for optimal portfolio liquidation. A combination of Kyle’s equilibrium setting and quadratic price impact costs is solved in [8]. With peek-ahead information as in the present paper, [14] study super-replication, albeit in a discretized version of the Bachelier model.

It is in the continuous-time Bachelier model that the present paper provides its main result, namely the explicit optimal investment strategy for an exponential utility maximizer who knows about future prices \(\Delta \) time units before they materialize in the market, but cannot freely take advantage of her extra knowledge due to quadratic transaction costs. The optimal policy turns out to be a combination of two trading incentives. On the one hand, there is the urge to trade towards the optimal frictionless position given by the well-known Merton ratio. On the other hand, there is the desire to take advantage of the next stock price moves and this contributes to the optimal turnover rate through an explicitly given average of stock prices over the window of length \(\Delta \) on which our investor has extra information.

Due to its peek-ahead feature, our optimal control problem can be viewed as a contribution to pathwise stochastic control. A closely related work is [10] where the authors studied a hidden stochastic volatility model with a controller who has full information on the extra noise. The theory of delayed or partial information also shares the infinite-dimensional pathwise control issues we need to address here; see the recent papers [25, 26] and the references therein. Finally, our control theoretic setting is also related to models discussed in the monograph [15].

Instead of dynamic programming (which would be challenging in this infinite memory setting; cf. [15]), our methodology is based on duality. For the case of exponential utility and quadratic transaction costs, this theory is developed with flexible information flow in great generality in an essentially self-contained appendix. It shows that the primal optimal control is determined by the conditional expectation of the terminal stock price under the dual optimal probability measure. For the Brownian framework that we focus on in the main body of the paper, we derive a particularly convenient representation of the dual target functional which leads to deterministic variational problems. These problems can be solved explicitly, and results from the theory of Gaussian Volterra integral equations [18, 19] allow us both to construct the solution to the dual problem and to compute the primal optimal strategy. These Gaussian Volterra integral equations also occur in [13] albeit in the rather different context of (no) arbitrage criteria in fractionally perturbed financial models.

In Section 2 we specify our model and formulate and interpret our main result. Section 3 contains the proof of the main result, and Appendix A presents the duality results necessary for these developments.

2 Problem Formulation and Main Result

We consider an investor who knows about market movements some time before they happen, but cannot arbitrarily exploit them due to market frictions. Specifically, apart from a riskless savings account bearing zero interest (for simplicity), the investor has the opportunity to trade in a risky asset with Bachelier price dynamics

$$\begin{aligned} S_t=s_0+\mu t+\sigma W_t, \quad t \ge 0, \end{aligned}$$

where \(s_0 \in {\mathbb {R}}\) is the initial asset price, \(\mu \in \mathbb {R}\) is the constant drift, \(\sigma >0\) is the constant volatility and W is a one-dimensional Brownian motion supported on a complete probability space \((\Omega , \mathcal {F}, {\mathbb {P}})\). Rather than having access to just the natural augmented filtration \((\mathcal {F}^S_t)_{t \ge 0}\) for making investment decisions, we assume that our investor can peek \(\Delta \in [0,\infty )\) time units into the future, and so her information flow is given by the filtration

$$\begin{aligned} {\mathcal {G}}^{\Delta }_t:={\mathcal {F}}^S_{t+\Delta }, \quad t\ge 0. \end{aligned}$$
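For readers who like to experiment, the price dynamics and the peek-ahead information flow can be mimicked on a time grid. The following Python sketch is only an illustration under our own assumptions: the uniform grid, the function names `bachelier_path` and `visible_prices`, and the numerical values are hypothetical choices and not part of the model specification.

```python
import math
import random


def bachelier_path(s0, mu, sigma, horizon, n_steps, seed=0):
    """Sample S_t = s0 + mu*t + sigma*W_t on a uniform grid with n_steps steps."""
    rng = random.Random(seed)
    dt = horizon / n_steps
    path = [s0]
    for _ in range(n_steps):
        dw = rng.gauss(0.0, math.sqrt(dt))  # Brownian increment over one step
        path.append(path[-1] + mu * dt + sigma * dw)
    return path


def visible_prices(path, i, lookahead_steps):
    """Prices known at grid index i when peeking lookahead_steps grid points ahead.

    This is the discrete counterpart of G^Delta_t = F^S_{t+Delta}, truncated
    at the end of the simulated path.
    """
    last_known = min(i + lookahead_steps, len(path) - 1)
    return path[: last_known + 1]


path = bachelier_path(s0=0.0, mu=0.1, sigma=0.3, horizon=10.0, n_steps=1000)
known_now = visible_prices(path, i=200, lookahead_steps=100)  # Delta = 1 on this grid
```

With a grid step of 0.01, `lookahead_steps=100` corresponds to \(\Delta =1\); an uninformed investor corresponds to `lookahead_steps=0`.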

Remark 2.1

As suggested by an anonymous referee, one could more generally consider a non-decreasing time shift \(\tau :[0,\infty ) \rightarrow [0,\infty )\) with \(\tau (t) \ge t\) to model time-varying ability to peek ahead. To keep the exposition here as simple as possible, we leave this extension of our model as a topic for future research.

Taking advantage of the inside information is impeded by the investor’s adverse market impact. Following [1], we model this impact in a temporary linear form and, thus, when at time t the investor turns over her position \(\Phi _t\) at the rate \(\phi _t=\dot{\Phi }_t\) the execution price is \(S_t + \frac{\Lambda }{2} \phi _t\) for some constant \(\Lambda >0\). As a result, the profits and losses from trading are given by

$$\begin{aligned} V^{\Phi _0,\phi }_T:=\Phi _0(S_T-S_0)+\int _{0}^T \phi _t(S_T-S_t)dt-\frac{\Lambda }{2} \int _{0}^T \phi ^2_t dt, \end{aligned}$$
(2.1)

where, for convenience, we assume that the investor marks to market her position \(\Phi _T = \Phi _0+\int _0^T \phi _t dt\) in the risky asset that she has acquired by time \(T>0\).
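To make the bookkeeping behind (2.1) concrete, the following Python sketch evaluates a Riemann-sum version of the profits and losses and compares it with the equivalent accounting in terms of execution prices \(S_t+\frac{\Lambda }{2}\phi _t\) and a marked-to-market final position. The discretization and the function names are our own illustrative assumptions.

```python
def pnl(phi, prices, dt, phi0, lam):
    """Riemann-sum approximation of V_T in (2.1).

    phi    : turnover rates at grid points 0..n-1
    prices : prices at grid points 0..n (prices[-1] plays the role of S_T)
    """
    s_T = prices[-1]
    initial_position_gain = phi0 * (s_T - prices[0])
    trading_gain = sum(p * (s_T - s) for p, s in zip(phi, prices)) * dt
    impact_cost = 0.5 * lam * sum(p * p for p in phi) * dt
    return initial_position_gain + trading_gain - impact_cost


def pnl_via_execution_prices(phi, prices, dt, phi0, lam):
    """Same quantity: final position marked to market minus cash spent on trades."""
    final_position = phi0 + sum(phi) * dt
    cash_spent = sum(p * (s + 0.5 * lam * p) for p, s in zip(phi, prices)) * dt
    return final_position * prices[-1] - phi0 * prices[0] - cash_spent


example = pnl([1.0, 1.0], [0.0, 1.0, 2.0], dt=1.0, phi0=2.0, lam=0.5)
```

Both functions agree because \(\Phi _T S_T-\Phi _0S_0-\int _0^T\phi _t(S_t+\frac{\Lambda }{2}\phi _t)dt\) is just a rearrangement of the right-hand side of (2.1).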

Fixing a time horizon \(T>0\), the natural class of admissible strategies is then

$$\begin{aligned} {\mathcal {A}}^\Delta :=\left\{ \phi =(\phi _t)_{t\in [0,T]}: \ \phi \text { is } \ {\mathcal {G}}^{\Delta }\text {-optional with } \int _{0}^T \phi ^2_t dt<\infty \ \text { a.s.}\right\} . \end{aligned}$$

The investor’s preferences are described by an exponential utility function

$$\begin{aligned} u(x):=-\exp (-\alpha x), \quad x\in {\mathbb {R}}, \end{aligned}$$

with constant absolute risk aversion parameter \(\alpha >0\), and her goal is thus to

$$\begin{aligned} \text {Maximize } {\mathbb {E}}\left[ u(V^{\Phi _0,\phi }_T)\right] = {\mathbb {E}}\left[ -\exp \left( -\alpha V^{\Phi _0,\phi }_T\right) \right] \text { over } {\phi \in {\mathcal {A}}^\Delta }. \end{aligned}$$
(2.2)

The paper’s main result is the following solution to this optimization problem:

Theorem 2.2

In the utility maximization problem (2.2), the investor’s optimal turnover rate \(\hat{\phi }_t\) at time \(t \in [0,T]\) depends on the risk-liquidity ratio

$$\begin{aligned} \rho :=\frac{\alpha \sigma ^2}{\Lambda }, \end{aligned}$$

on the position \(\hat{\Phi }_t=\Phi _0+\int _0^t \hat{\phi }_sds\) acquired so far and the privileged information on the next stock prices \((S_{t+s})_{s \in [0,\Delta ]}\) in the feedback form

$$\begin{aligned} \hat{\phi }_t=&\frac{1}{\Lambda }\left( \bar{S}^\Delta _t-S_t\right) +\frac{\Upsilon ^\Delta (T-t)}{\Delta } \left( \frac{\mu }{\alpha \sigma ^2} -\hat{\Phi }_t\right) , \end{aligned}$$
(2.3)

where \(\bar{S}^\Delta \) is the stock price average given by

$$\begin{aligned} \bar{S}^\Delta _t := \left( 1-\Upsilon ^\Delta (T-t)\right) S_{(t+\Delta ) \wedge T} +\Upsilon ^\Delta (T-t)\frac{1}{\Delta } \int _0^{\Delta } S_{t+s}ds \end{aligned}$$
(2.4)

with \(\Upsilon ^\Delta (\tau )=\Delta \sqrt{\rho }\tanh (\sqrt{\rho }(\tau -\Delta )^{+})/(1+\Delta \sqrt{\rho }\tanh (\sqrt{\rho }(\tau -\Delta )^+))\). The maximal utility this policy generates is

$$\begin{aligned}&\max _{\phi \in {\mathcal {A}}^\Delta } {\mathbb {E}} \left[ -\exp \left( -\alpha V^{\Phi _0,\phi }_T\right) \right] \nonumber \\&\quad =-\exp \left( \frac{\alpha \Lambda \sqrt{\rho }}{2\coth \left( \sqrt{\rho }T\right) } {\left( \Phi _0-\frac{\mu }{\alpha \sigma ^2}\right) ^2}-\frac{1}{2}\frac{\mu ^2}{\sigma ^2}T\right) \nonumber \\&\qquad \cdot \exp \left( -\frac{1}{2}\int _{0}^T \frac{(s\wedge \Delta )\rho }{1+(s\wedge \Delta ) \sqrt{\rho }\tanh \left( \sqrt{\rho }(T-s)\right) }ds\right) . \end{aligned}$$
(2.5)
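The weight \(\Upsilon ^\Delta \) and the tracking speed \(\Upsilon ^\Delta /\Delta \) appearing in (2.3) and (2.4) are easy to tabulate. The Python sketch below is a minimal illustration; the function names are hypothetical, and the parameter values are those quoted in the caption of Fig. 1.

```python
import math


def urgency(time_to_go, delta, rho):
    """Upsilon^Delta(time_to_go) / Delta, written so that delta = 0 is allowed."""
    th = math.sqrt(rho) * math.tanh(math.sqrt(rho) * max(time_to_go - delta, 0.0))
    return th / (1.0 + delta * th)


def upsilon(time_to_go, delta, rho):
    """Weight Upsilon^Delta(time_to_go) put on the window average in (2.4)."""
    return delta * urgency(time_to_go, delta, rho)


rho = 0.03 * 0.3**2 / 0.01  # alpha * sigma^2 / Lambda
for time_to_go in (10.0, 2.0, 1.0, 0.5):
    print(time_to_go, upsilon(time_to_go, 1.0, rho), urgency(time_to_go, 1.0, rho))
```

Writing the weight as \(\Delta \) times the urgency avoids a division by zero for \(\Delta =0\), in which case `urgency` reduces to \(\sqrt{\rho }\tanh (\sqrt{\rho }\tau )\).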

Our feedback description (2.3) can be interpreted as follows: First, without privileged information, i.e. for \(\Delta =0\), we have \(\bar{S}^\Delta _t=S_t\) and, therefore, the first term in (2.3) vanishes leaving us with the optimal policy

$$\begin{aligned} \hat{\phi }_t=\sqrt{\rho }\tanh (\sqrt{\rho }(T-t)) \left( \frac{\mu }{\alpha \sigma ^2} -{\hat{\Phi }}_t\right) , \quad t \in [0,T]. \end{aligned}$$

So the uninformed agent will trade towards the optimal position \(\mu /(\alpha \sigma ^2)\) well known from the frictionless Merton problem. Due to the impact costs, she does so with finite urgency \(\sqrt{\rho }\tanh (\sqrt{\rho }(T-t))\). With a long time to go, this urgency is essentially \(\sqrt{\rho }\) and thus dictated by the risk/liquidity ratio \(\rho =\alpha \sigma ^2/\Lambda \); as t approaches the time horizon T, the urgency vanishes because, towards the end, position improvements have an ever shorter time to yield risk premia but the investor still has to pay the same impact costs that obtain at the start of trading (see Footnote 1).
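Since the uninformed policy is deterministic, it can be checked numerically. The sketch below integrates the feedback rule with an Euler scheme and compares the terminal position with the closed-form solution \(\mu /(\alpha \sigma ^2)-(\mu /(\alpha \sigma ^2)-\Phi _0)/\cosh (\sqrt{\rho }T)\) of the linear ODE; this closed form is our own elementary computation and is not stated in the text, and the function names are hypothetical.

```python
import math


def uninformed_position(phi0, merton, rho, horizon, n_steps):
    """Euler scheme for dPhi/dt = sqrt(rho) tanh(sqrt(rho)(T-t)) (merton - Phi)."""
    dt = horizon / n_steps
    position = phi0
    for i in range(n_steps):
        time_to_go = horizon - i * dt
        urgency = math.sqrt(rho) * math.tanh(math.sqrt(rho) * time_to_go)
        position += urgency * (merton - position) * dt
    return position


def uninformed_position_exact(phi0, merton, rho, horizon):
    """Terminal position obtained by solving the linear ODE in closed form."""
    return merton - (merton - phi0) / math.cosh(math.sqrt(rho) * horizon)


rho = 0.03 * 0.3**2 / 0.01      # parameters as quoted for Fig. 1
merton = 0.1 / (0.03 * 0.3**2)
approx = uninformed_position(0.0, merton, rho, 10.0, 5000)
exact = uninformed_position_exact(0.0, merton, rho, 10.0)
```

The position approaches the Merton ratio without ever reaching it, in line with the vanishing urgency towards the time horizon.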

For the informed agent, i.e. for \(\Delta >0\), the desire to be close to the Merton ratio persists, but the urgency reduces to

$$\begin{aligned} \frac{\Upsilon ^\Delta (T-t)}{\Delta }=\frac{\sqrt{\rho } \tanh (\sqrt{\rho }(T-t-\Delta )^{+})}{1+\Delta \sqrt{\rho } \tanh (\sqrt{\rho }(T-t-\Delta )^+)}, \end{aligned}$$

leaving “some air” to take advantage of the knowledge on future price movements. This is done by averaging out in (2.4) the latest relevant stock price available to the investor, \(S_{(t+\Delta )\wedge T},\) with the mean stock price \(\frac{1}{\Delta } \int _0^{\Delta } S_{t+s}ds\) to be realized over the next \(\Delta \) time units in an effort to assess the earnings potential over today’s stock price \(S_t\). Put into relation with the impact costs \(\Lambda \), this yields the second contribution \((\bar{S}^\Delta _t-S_t)/\Lambda \) to the optimal turnover rate. The weight that this assessment of earnings assigns to the average stock prices is given by \(\Upsilon ^\Delta (T-t) \in [0,1]\); it is about \(\Delta \sqrt{\rho }/(1+\Delta \sqrt{\rho })\) when there is still a lot of time to go, but vanishes completely as soon as \(T-t\le \Delta \), i.e. as soon as full knowledge of stock price movements over the relevant time span [0, T] is attained. In this terminal regime also the ambition to be close to the Merton ratio is wiped out and the investor just chases the earning potential \(S_T-S_t\) from the stock, of course still in a tradeoff against the liquidity costs \(\Lambda \); this latter effect is also immediate from separate, pointwise optimization over \(\phi _t\) in the representation (2.1) of profits and losses (which leads to \(\phi _t^*=(S_T-S_t)/\Lambda \), \(t \in [0,T]\), an admissible strategy as soon as \(S_T\) becomes known). Figure 1 illustrates the different components important for the optimal strategy along a typical trajectory of price fluctuations.
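The feedback rule (2.3) can also be traced along a simulated path. The following Python sketch is an illustration under our own assumptions: a uniform time grid on which \(\Delta \) is a multiple of the step size, a trapezoidal approximation of the window average, and the hypothetical function name `simulate_informed`.

```python
import math
import random


def simulate_informed(s0, mu, sigma, alpha, lam, delta, horizon, phi0, n_steps, seed=1):
    """Euler discretization of the feedback policy (2.3) along one Bachelier path."""
    rng = random.Random(seed)
    dt = horizon / n_steps
    k = round(delta / dt)  # look-ahead window in grid steps (delta assumed a multiple of dt)
    prices = [s0]
    for _ in range(n_steps):
        prices.append(prices[-1] + mu * dt + sigma * rng.gauss(0.0, math.sqrt(dt)))
    rho = alpha * sigma**2 / lam
    sq = math.sqrt(rho)
    merton = mu / (alpha * sigma**2)
    position, rates = phi0, []
    for i in range(n_steps):
        time_to_go = horizon - i * dt
        th = sq * math.tanh(sq * max(time_to_go - delta, 0.0))
        urgency = th / (1.0 + delta * th)      # Upsilon / Delta
        weight = delta * urgency               # Upsilon
        peek = prices[min(i + k, n_steps)]     # S_{(t + Delta) ^ T}
        if weight > 0.0:
            window = prices[i:i + k + 1]
            # trapezoidal approximation of (1/Delta) * int_0^Delta S_{t+s} ds
            mean = (sum(window) - 0.5 * (window[0] + window[-1])) / k
        else:
            mean = peek                        # irrelevant, its weight is zero
        s_bar = (1.0 - weight) * peek + weight * mean          # (2.4)
        rate = (s_bar - prices[i]) / lam + urgency * (merton - position)  # (2.3)
        rates.append(rate)
        position += rate * dt
    return prices, rates, position


prices, rates, final_position = simulate_informed(
    s0=0.0, mu=0.1, sigma=0.3, alpha=0.03, lam=0.01,
    delta=1.0, horizon=10.0, phi0=0.0, n_steps=1000)
```

In the terminal regime \(T-t\le \Delta \) the simulated rate reduces to \((S_T-S_t)/\Lambda \), the pointwise maximizer of \(\phi (S_T-S_t)-\frac{\Lambda }{2}\phi ^2\).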

Fig. 1 The first of these illustrations shows an evolution of the stock price S (blue), the corresponding average \(\bar{S}^\Delta \) (orange) along with the underlying weight \(\Upsilon ^\Delta \); the second shows the resulting trading rates due to “frontrunning” (grey) and due to tracking the Merton portfolio (black); the third display shows the ensuing stock position \(\Phi \) (red) together with the Merton ratio \(\mu /(\alpha \sigma ^2)\) (light red). Parameters were chosen as \(s_0=0\), \(\mu =.1\), \(\sigma =.3\), \(T=10\), \(\Delta =1\), \(\alpha =.03\), \(\Phi _0=0\), \(\Lambda = .01\).

The monetary value of being able to peek ahead by \(\Delta \) is best described by the certainty equivalent

$$\begin{aligned} c(\Delta )&= -\frac{1}{\alpha } \log \frac{ \max _{\phi \in \mathcal A^{\Delta }} {\mathbb {E}}\left[ -\exp \left( -\alpha V^{\Phi _0,\phi }_T\right) \right] }{ \max _{\phi \in {\mathcal {A}}^{0}} {\mathbb {E}}\left[ -\exp \left( -\alpha V^{\Phi _0,\phi }_T\right) \right] }\nonumber \\&=\frac{1}{2\alpha } \int _{0}^T \frac{(s\wedge \Delta )\rho }{1+(s\wedge \Delta ) \sqrt{\rho }\tanh \left( \sqrt{\rho }(T-s)\right) }ds \end{aligned}$$
(2.6)

determined by comparing the utility attainable for an informed investor (with admissible strategy set \({\mathcal {A}}^\Delta \)) and an uninformed one (who is confined to strategies from the smaller class \({\mathcal {A}}^0\); see Footnote 2). Interestingly, the certainty equivalent does not depend on the stock’s risk premium \(\mu \), but is determined by the risk aversion \(\alpha \), the risk/liquidity ratio \(\rho =\alpha \sigma ^2/\Lambda \), the investor’s time horizon T and the time units \(\Delta \) she can look ahead. Except for a period of length \(\Delta \) and with a lot of time to go, it accrues at about the rate \(\Delta \rho /(2\alpha (1+\Delta \sqrt{\rho }))\), as can be read off from the integrand in (2.6). This rate increases with \(\Delta \) to the upper bound \(\sqrt{\rho }/(2\alpha )\), revealing again the curb that frictions put on the earning potential of even extreme information advantages.
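The integral in (2.6) has no elementary closed form in general, but it is straightforward to evaluate numerically. The following sketch uses a midpoint rule; the function name, the quadrature and the choice of parameters (those quoted for Fig. 1) are illustrative assumptions.

```python
import math


def certainty_equivalent(delta, alpha, sigma, lam, horizon, n_steps=10_000):
    """Midpoint-rule approximation of c(Delta) in (2.6)."""
    rho = alpha * sigma**2 / lam
    sq = math.sqrt(rho)
    ds = horizon / n_steps
    total = 0.0
    for i in range(n_steps):
        s = (i + 0.5) * ds
        m = min(s, delta)  # s ^ Delta
        total += m * rho / (1.0 + m * sq * math.tanh(sq * (horizon - s))) * ds
    return total / (2.0 * alpha)


values = {delta: certainty_equivalent(delta, alpha=0.03, sigma=0.3, lam=0.01, horizon=10.0)
          for delta in (0.0, 0.5, 1.0, 2.0, 10.0, 20.0)}
```

The computed values increase with \(\Delta \) and stop changing once \(\Delta \ge T\), since then \(s\wedge \Delta =s\) on the whole integration range.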

The proof of Theorem 2.2 is carried out in the next section. It is obtained by solving the dual problem to

$$\begin{aligned}&\qquad \text {Minimize}\nonumber \\&\qquad \qquad {\mathbb {E}}_{{\mathbb {Q}}}\left[ \Phi _0(S_T-S_0) +\frac{1}{2\Lambda }\int _{0}^T\left| {\mathbb {E}}_{{\mathbb {Q}}}\left[ S_T|\mathcal G^{\Delta }_t\right] -S_t\right| ^2 dt\right] +\frac{1}{\alpha } {\mathbb {E}}_{{\mathbb {Q}}}\left[ \log \frac{d{\mathbb {Q}}}{d{\mathbb {P}}}\right] \nonumber \\&\qquad \qquad \hbox {over } {\mathbb {Q}} \sim {\mathbb {P}} \hbox { with finite relative entropy } {\mathbb {E}}_{{\mathbb {Q}}}\left[ \log \frac{d{\mathbb {Q}}}{d\mathbb P}\right] <\infty . \end{aligned}$$
(2.7)

The corresponding duality theory holds true beyond the Brownian framework specified here and is developed in a self-contained manner in Appendix A as a second key contribution of our paper.

3 Proof of Theorem 2.2

Let us first note that it suffices to treat the case

$$\begin{aligned} S=W\text {, i.e., without loss of generality } s_0=0, \sigma =1, \mu =0. \end{aligned}$$
(3.1)

Indeed, by passing from \(\alpha \), \(\Lambda \), \(\mu \) to, respectively, \(\alpha '=\alpha \sigma \), \(\Lambda '=\Lambda /\sigma \), \(\mu '=\mu /\sigma \), the utility with \(\sigma '=1\) obtained from a given strategy will coincide with the one obtained from this strategy under the original parameters. Moreover, rewriting the expected utility under \({\mathbb {P}}' \sim {\mathbb {P}}\) with density

$$\begin{aligned} \frac{d{\mathbb {P}}'}{d{\mathbb {P}}}{|{\mathcal {F}}^W_T}:=\exp \left( -\mu ' W_T-\frac{1}{2}\mu '^2 T\right) , \end{aligned}$$

under which \(W'_t=W_t+\mu ' t\), \(t \ge 0\), is a driftless Brownian motion, the expected utilities under \({\mathbb {P}}\) coincide, up to the factor \(\exp \left( \frac{1}{2}\mu '^2 T\right) \), with those under \(\mathbb P'\) if we start with \(\Phi _0'=\Phi _0-{\mu }/({\alpha \sigma ^2})\) rather than \(\Phi _0\) risky assets.
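The first part of this reduction is a pure scaling identity that can be verified path by path. The sketch below checks, for a discretized version of (2.1), that \(\alpha V\) computed with the original parameters equals \(\alpha 'V'\) computed with prices divided by \(\sigma \) and impact parameter \(\Lambda '\). The discretization, the sample numbers and the function names are hypothetical.

```python
def pnl(phi, prices, dt, phi0, lam):
    """Riemann-sum approximation of V_T in (2.1)."""
    s_T = prices[-1]
    gains = phi0 * (s_T - prices[0]) + sum(p * (s_T - s) for p, s in zip(phi, prices)) * dt
    return gains - 0.5 * lam * sum(p * p for p in phi) * dt


def reduce_parameters(alpha, lam, mu, sigma):
    """Map (alpha, Lambda, mu) to the equivalent parameters for sigma' = 1."""
    return alpha * sigma, lam / sigma, mu / sigma


alpha, lam, mu, sigma = 0.03, 0.01, 0.1, 0.3
alpha_r, lam_r, mu_r = reduce_parameters(alpha, lam, mu, sigma)

prices = [0.0, 0.2, -0.1, 0.4]            # an arbitrary sample path
phi, dt, phi0 = [1.0, -2.0, 0.5], 0.5, 3.0
original = alpha * pnl(phi, prices, dt, phi0, lam)
reduced = alpha_r * pnl(phi, [p / sigma for p in prices], dt, phi0, lam_r)
```

The Merton ratio is invariant under the same reparametrization, \(\mu '/\alpha '=\mu /(\alpha \sigma ^2)\), which is why the shift of the initial position can be expressed in the original parameters.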

The proof of Theorem 2.2 will be accomplished via the dual problem whose properties are summarized in the following proposition which is an immediate consequence of the general duality results presented in Appendix A.

Proposition 3.1

Denoting by \({\mathcal {Q}}\) the set of all probability measures \({\mathbb {Q}}\sim {\mathbb {P}}\) with finite entropy

$$\begin{aligned} {\mathbb {E}}_{{\mathbb {Q}}}\left[ \log \left( \frac{d{\mathbb {Q}}}{d\mathbb P}\right) \right] <\infty \end{aligned}$$

relative to \({\mathbb {P}}\), we have

$$\begin{aligned}&\max _{\phi \in {\mathcal {A}}^\Delta } \left\{ -\frac{1}{\alpha }\log \mathbb E\left[ \exp \left( -\alpha V^{\Phi _0,\phi }_T\right) \right] \right\} \nonumber \\&\quad =\min _{{\mathbb {Q}}\in {\mathcal {Q}}}{\mathbb {E}}_{\mathbb Q}\left[ \Phi _0(S_T-S_0)+\frac{1}{\alpha }\log \left( \frac{d\mathbb Q}{d{\mathbb {P}}}\right) +\frac{1}{2\Lambda }\int _{0}^T\left| {\mathbb {E}}_{{\mathbb {Q}}}(S_T|{\mathcal {G}}^{\Delta }_t) -S_t\right| ^2 dt\right] . \end{aligned}$$
(3.2)

Furthermore, the minimizer \(\hat{{\mathbb {Q}}}\) for the dual problem is unique and yields via

$$\begin{aligned} {\hat{\phi }}_t:=\frac{{\mathbb {E}}_{\hat{{\mathbb {Q}}}}\left[ S_T|\mathcal G^{\Delta }_t\right] -S_t}{\Lambda }, \quad t\in [0,T], \end{aligned}$$
(3.3)

the unique optimal portfolio for the primal problem.

Proof

Follows from Proposition A.2 below with the choice \(\mathcal {G}_{t}:=\mathcal {G}_{t}^{\Delta }\) after noting that \(S_{t}/\sqrt{t}\) is standard Gaussian and so

$$\begin{aligned} \sup _{t\in [0,T]}{\mathbb {E}}[\exp (a S_{t}^{2})]\le {\mathbb {E}}[\exp (a S_{T}^{2})]<\infty , \end{aligned}$$

clearly holds for some small enough \(a>0\). \(\square \)

In order to solve the utility maximization problem it therefore suffices to find the minimizer \(\hat{{\mathbb {Q}}}\) of the dual problem and work out the conditional expectation in (3.3). This is the path we will follow for the rest of this section. In a first step we derive a particularly convenient representation for the target functional of our dual problem:

Lemma 3.2

The dual infimum in (3.2) coincides with the one taken over all \({\mathbb {Q}} \in {\mathcal {Q}}\) whose densities take the form

$$\begin{aligned} \frac{d{\mathbb {Q}}}{d{\mathbb {P}}} = \exp \left( -\int _0^T\theta _t dW_t-\frac{1}{2}\int _0^T\theta _t^2dt\right) \end{aligned}$$
(3.4)

for some bounded and adapted \(\theta \) changing values only at finitely many deterministic times. For such \({\mathbb {Q}}\) the induced value (2.7) for the dual problem can be written as

$$\begin{aligned}&{\mathbb {E}}_{\mathbb Q}\left[ \Phi _0(S_T-S_0)+\frac{1}{\alpha }\log \left( \frac{d\mathbb Q}{d{\mathbb {P}}}\right) +\frac{1}{2\Lambda }\int _{0}^T\left| \mathbb E_{{\mathbb {Q}}}\left[ S_T|\mathcal G^{\Delta }_t\right] -S_t\right| ^2 dt\right] \\&\quad =-\Phi _0\int _{0}^T a_t dt+\frac{1}{2\alpha }\int _{0}^T a^2_t dt+\frac{1}{2\Lambda }\int _{0}^T \left( \int _t^T a_udu\right) ^2 dt\\&\qquad +\int _{0}^T {\mathbb {E}}_{{\mathbb {Q}}}\left[ \frac{1}{2\alpha }\int _{s}^{T} l^2_{t,s} dt + \frac{1}{2\Lambda } \int _{s}^{T}\left( \int _t^T l_{u,s}du\right) ^2dt \right. \\&\qquad \left. +\frac{s\wedge \Delta }{2\Lambda }\left( 1-\int _s^Tl_{u,s}du\right) ^2\right] ds \end{aligned}$$

where, for \(t\in [0,T]\), \(a_t\) and \(l_{t,.}\) are determined by the Itô-representations

$$\begin{aligned} \theta _t = a_t + \int _0^t l_{t,s} dW^{{\mathbb {Q}}}_s \end{aligned}$$
(3.5)

with respect to the \({\mathbb {Q}}\)-Brownian motion \(W^{\mathbb Q}_s=W_s+\int _0^s \theta _r dr\), \(s \ge 0\).

Proof

For any \({\mathbb {Q}} \in {\mathcal {Q}}\) the martingale representation property of Brownian motion gives us a predictable \(\theta \) with \({\mathbb {E}}_{{\mathbb {Q}}}[\log (d{\mathbb {Q}}/d{\mathbb {P}})]=\mathbb E_{{\mathbb {Q}}}[\int _0^T\theta ^2_sds]/2<\infty \) such that the density \(d{\mathbb {Q}}/d{\mathbb {P}}\) takes the form (3.4). Using this density to rewrite the dual target functional as an expectation under \({\mathbb {P}}\), we can follow standard density arguments to see that the infimum over \({\mathbb {Q}} \in {\mathcal {Q}}\) can be realized by considering the \({\mathbb {Q}}\) induced via (3.4) by simple \(\theta \) as described in the lemma’s formulation. As a consequence, the Itô representations of \(\theta _t\) in (3.5) can be chosen in such a way that the resulting \((a_t,l_{t,.})\) are also measurable in t: in fact they only change when \(\theta \) changes its value, i.e., at finitely many deterministic times. This joint measurability will allow us below to freely apply Fubini’s theorem.

Let us rewrite the dual target functional in terms of a and l. In terms of \(\theta \) and the \({\mathbb {Q}}\)-Brownian motion \(W^{{\mathbb {Q}}}\), it reads

$$\begin{aligned}&{\mathbb {E}}_{\mathbb Q}\left[ \Phi _0(S_T-S_0)+\frac{1}{\alpha }\log \left( \frac{d\mathbb Q}{d{\mathbb {P}}}\right) +\frac{1}{2\Lambda }\int _{0}^T\left( {\mathbb {E}}_{{\mathbb {Q}}}\left[ S_T|\mathcal G^{\Delta }_t\right] -S_t\right) ^2 dt\right] \nonumber \\&\quad ={\mathbb {E}}_{{\mathbb {Q}}}\Bigg [-\Phi _0\int _{0}^T \theta _t dt+\frac{1}{2\alpha }\int _{0}^T \theta ^2_u du\nonumber \\&\qquad +\frac{1}{2\Lambda }\int _{0}^T \left( W^{{\mathbb {Q}}}_{(t+\Delta )\wedge T}-W^{{\mathbb {Q}}}_t-\mathbb E_{{\mathbb {Q}}}\left[ \int _{t}^T\theta _u du|\mathcal G^{\Delta }_t\right] \right) ^2 dt\Bigg ]. \end{aligned}$$
(3.6)

From Itô’s isometry and Fubini’s theorem we obtain

$$\begin{aligned} {\mathbb {E}}_{{\mathbb {Q}}}\left[ \int _{0}^T \theta ^2_u du\right] =\int _{0}^T a^2_t dt+\int _{0}^T\int _{s}^{T}\mathbb E_{{\mathbb {Q}}}\left[ l^2_{t,s} \right] dt\, ds. \end{aligned}$$
(3.7)

Again by Fubini’s theorem it follows that

$$\begin{aligned} {\mathbb {E}}_{{\mathbb {Q}}}\left[ \int _{t}^T \theta _udu|\mathcal G^{\Delta }_t\right]&= \int _{t}^T a_u du+{\mathbb {E}}_{{\mathbb {Q}}}\left[ \int _{0}^T\int _{t\vee s}^T l_{u,s}du\, dW^{{\mathbb {Q}}}_s|\mathcal G^{\Delta }_t\right] \\&=\int _{t}^T a_u du+\int _{0}^{(t+\Delta )\wedge T} \int _{t\vee s}^T l_{u,s} du\, dW^{{\mathbb {Q}}}_s \end{aligned}$$

for any \(t\in [0,T]\), where the last equality follows from the martingale property of stochastic integrals. Thus, another application of Itô’s isometry yields

$$\begin{aligned}&{\mathbb {E}}_{{\mathbb {Q}}}\left[ \left( W^{{\mathbb {Q}}}_{(t+\Delta )\wedge T}-W^{{\mathbb {Q}}}_t-{\mathbb {E}}_{{\mathbb {Q}}}\left[ \int _{t}^T \theta _u du|{\mathcal {G}}_t^\Delta \right] \right) ^2 \right] \\&\quad =\left( \int _{t}^T a_u du\right) ^2+ {\mathbb {E}}_{\mathbb Q}\left[ \int _{0}^t \left( \int _{t}^T l_{u,s} du\right) ^2ds \right] \\&\qquad +{\mathbb {E}}_{{\mathbb {Q}}}\left[ \int _{t}^{(t+\Delta )\wedge T} \left( 1-\int _{s}^T l_{u,s} du\right) ^2ds \right] . \end{aligned}$$

Plugging this together with (3.7) into (3.6) and using Fubini’s theorem then provides us with the claimed formula for our dual target value:

$$\begin{aligned}&{\mathbb {E}}_{{\mathbb {Q}}}\left[ \Phi _0(S_T-S_0)+\frac{1}{\alpha }\log \left( \frac{d\mathbb Q}{d{\mathbb {P}}}\right) +\frac{1}{2\Lambda }\int _{0}^T\left( {\mathbb {E}}_{{\mathbb {Q}}}\left[ S_T|\mathcal G^{\Delta }_t\right] -S_t\right) ^2 dt\right] \\&\quad =-\Phi _0\int _{0}^T a_t dt+\frac{1}{2\alpha }\int _{0}^T a^2_t dt+\frac{1}{2\Lambda }\int _{0}^T \left( \int _t^T a_udu\right) ^2 dt\\&\qquad +\int _{0}^T {\mathbb {E}}_{{\mathbb {Q}}}\left[ \frac{1}{2\alpha }\int _{s}^{T} l^2_{t,s} dt+ \frac{1}{2\Lambda } \int _{s}^{T}\left( \int _t^T l_{u,s}du\right) ^2dt\right. \\&\qquad \left. +\frac{s\wedge \Delta }{2\Lambda }\left( 1-\int _s^Tl_{u,s}du\right) ^2\right] ds. \end{aligned}$$

\(\square \)
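The weight \(s\wedge \Delta \) in the last term of this formula stems from the final application of Fubini’s theorem: for integrable f, \(\int _0^T\int _t^{(t+\Delta )\wedge T}f(s)\,ds\,dt=\int _0^T(s\wedge \Delta )f(s)\,ds\), because a given s lies in \([t,(t+\Delta )\wedge T]\) exactly for \(t \in [(s-\Delta )^+,s]\). The following Python sketch checks this identity numerically for a test function; the quadrature and the function names are our own illustrative choices.

```python
def iterated_integral(antiderivative, delta, horizon, n=2000):
    """Midpoint rule in t for int_0^T int_t^{(t+Delta)^T} f(s) ds dt.

    The inner integral is evaluated exactly through an antiderivative of f.
    """
    dt = horizon / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * dt
        total += (antiderivative(min(t + delta, horizon)) - antiderivative(t)) * dt
    return total


def weighted_integral(f, delta, horizon, n=2000):
    """Midpoint rule for int_0^T (s ^ Delta) f(s) ds."""
    ds = horizon / n
    return sum(min((i + 0.5) * ds, delta) * f((i + 0.5) * ds) for i in range(n)) * ds


lhs = iterated_integral(lambda s: s**3 / 3.0, delta=0.5, horizon=2.0)  # f(s) = s^2
rhs = weighted_integral(lambda s: s**2, delta=0.5, horizon=2.0)
```

For this test function both sides equal 1.328125 up to discretization error.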

The crucial point of the above representation is that taking the minimum separately over a and over \(l_{.,s}\) for each \(s \in [0,T]\) yields deterministic variational problems that can be solved explicitly (see Lemma 3.3 below). The resulting deterministic minimum is a lower bound for the dual target value and, using some Gaussian process theory, it will ultimately be shown to coincide with it (see Lemma 3.4 below).

Lemma 3.3

Recall our notation \(\rho =\alpha \sigma ^2/\Lambda =\alpha /\Lambda \) (because \(\sigma =1\); cf. (3.1)).

  1. (i)

    The minimum of the functional

    $$\begin{aligned} -\Phi _0\int _{0}^T a_t dt+\frac{1}{2\alpha }\int _{0}^T a^2_t dt+\frac{1}{2\Lambda }\int _{0}^T \left( \int _t^T a_udu\right) ^2 dt \end{aligned}$$

    over \(a \in L^2([0,T],dt)\) is attained for \(\hat{a}\Phi _0\) where

    $$\begin{aligned} \hat{a}_t = \frac{\alpha \cosh (\sqrt{\rho }(T-t))}{\cosh (\sqrt{\rho } T)}, \quad t \in [0,T]. \end{aligned}$$
    (3.8)

    The resulting minimum value is \(-\hat{A}_T\Phi _0^2\) where

    $$\begin{aligned} \hat{A}_T=\Lambda \sqrt{\rho }\tanh (\sqrt{\rho } T)/2. \end{aligned}$$
    (3.9)
  2. (ii)

    For any \(s \in [0,T]\), the minimum of the functional

    $$\begin{aligned} \frac{1}{2\alpha }\int _{s}^{T} l^2_{t} dt+ \frac{1}{2\Lambda } \int _{s}^{T}\left( \int _t^T l_{u}du\right) ^2dt +\frac{s\wedge \Delta }{2\Lambda }\left( 1-\int _s^Tl_{u}du\right) ^2 \end{aligned}$$

    over \(l \in L^2([s,T],dt)\) is attained at

    $$\begin{aligned} \hat{l}_{t,s} = \frac{ \rho (s \wedge \Delta ) \cosh (\sqrt{\rho }(T-t))}{\cosh (\sqrt{\rho }(T-s))+ \sqrt{\rho } (s \wedge \Delta ) \sinh (\sqrt{\rho }(T-s))}, \quad t\in [s,T]. \end{aligned}$$
    (3.10)

    The corresponding minimum value is

    $$\begin{aligned} \hat{L}_s=\frac{1}{2\Lambda }\frac{s \wedge \Delta }{1+(s\wedge \Delta )\sqrt{\rho } \tanh (\sqrt{\rho }(T-s))}. \end{aligned}$$
    (3.11)

Proof

We start with (ii). The uniqueness follows from strict convexity of the functional to be minimized over \(l \in L^2([s,T],dt)\). To write this as a standard variational problem, put \(H(u,v):=\frac{1}{2\Lambda }u^2+ \frac{1}{2\alpha } v^2\) for \( u,v\in {\mathbb {R}}\), reparametrize l via \(g(t) = \int _t^T l_u du\), \(t \in [s,T]\), and consider, for any \(s \in [0,T]\) and any \(\Theta \in {\mathbb {R}}\), the problem to minimize \(\int _{s}^T H(g_t,\dot{g}_t) dt\) over \(g \in C^1[s,T]\) subject to the constraints \(g(s)=\Theta \), \(g(T)=0\).

The optimization problem is convex and so it has a unique solution which has to satisfy the Euler–Lagrange equation (for details see Section 1 in [16])

$$\begin{aligned} \frac{d}{dt}\frac{\partial H}{\partial \dot{g}}=\frac{\partial H}{\partial {g}}. \end{aligned}$$

Thus, the optimizer is the unique solution of the linear ODE

$$\begin{aligned} \ddot{g}=\rho g, \ \ g(s)=\Theta , \ \ g(T)=0, \end{aligned}$$

namely

$$\begin{aligned} g^{\Theta ,s}(t):=\frac{\Theta \sinh \left( \sqrt{\rho }(T-t)\right) }{\sinh \left( \sqrt{\rho }(T-s)\right) }, \ \ t\in [s,T]. \end{aligned}$$

Next, observe that for the function \(g(t):=\int _{t}^T l_u du\), \(t\in [s,T]\) we have \(\dot{g}=-l\) where \(\dot{g}\) is the weak derivative of g, and so,

$$\begin{aligned} \frac{1}{2\alpha }\int _{s}^T l^2_t dt+\frac{1}{2\Lambda }\int _{s}^T\left( \int _{t}^{T}l_u du\right) ^2dt=\int _{s}^T H(g_t,\dot{g}_t) dt. \end{aligned}$$

Thus, from simple density arguments (needed since g is not necessarily smooth) we obtain that

$$\begin{aligned}&\inf _{l\in L^2 ([s,T],dt)} \left\{ \frac{1}{2\alpha }\int _{s}^{T}l^2_{t} dt+ \frac{1}{2\Lambda } \int _{s}^{T}\left( \int _t^T l_{u}du\right) ^2dt +\frac{s\wedge \Delta }{2\Lambda }\left( 1-\int _s^T l_{u}du\right) ^2\right\} \\&\quad =\inf _{\Theta \in {\mathbb {R}}}\left\{ \int _{s}^T H(g^{\Theta ,s}_t,\dot{g}^{\Theta ,s}_t) dt+\frac{s\wedge \Delta }{2\Lambda }(1-\Theta )^2\right\} \\&\quad =\frac{1}{2\Lambda }\inf _{\Theta \in {\mathbb {R}}} \left\{ \frac{\coth \left( \sqrt{\rho }(T-s)\right) }{\sqrt{\rho }}\Theta ^2+(s\wedge \Delta ) (1-\Theta )^2\right\} \end{aligned}$$

where the last equality follows from simple computations.

Finally, the minimum of the above quadratic expression in \(\Theta \) is attained at

$$\begin{aligned} \Theta ^{*}=\frac{ (s\wedge \Delta )\sqrt{\rho }}{\coth \left( \sqrt{\rho }(T-s)\right) +(s\wedge \Delta )\sqrt{\rho }}. \end{aligned}$$

This gives (3.10)–(3.11).

The proof of (i) is almost the same as that of (ii), but slightly simpler. Observe that

$$\begin{aligned}&\inf _{a\in L^2([0,T],dt)} \left\{ -\Phi _0\int _{0}^T a_t dt+\frac{1}{2\alpha }\int _{0}^T a^2_t dt+\frac{1}{2\Lambda }\int _{0}^T \left( \int _t^T a_udu\right) ^2 dt\right\} \\&\qquad =\inf _{\Theta \in {\mathbb {R}}}\left\{ -\Phi _0 \Theta +\int _{0}^T H(g^{\Theta ,0}_t,\dot{g}^{\Theta ,0}_t) dt\right\} \\&\qquad =\inf _{\Theta \in {\mathbb {R}}}\left\{ -\Phi _0 \Theta +\frac{\coth \left( \sqrt{\rho }T\right) }{2\sqrt{\rho }\Lambda }\Theta ^2\right\} . \end{aligned}$$

The minimum of the above quadratic expression in \(\Theta \) is attained at

$$\begin{aligned} \tilde{\Theta }^{*}=\Phi _0\sqrt{\rho }\Lambda \tanh \left( \sqrt{\rho }T\right) . \end{aligned}$$

This gives (3.8)–(3.9). \(\square \)
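As a sanity check on part (ii) of Lemma 3.3, one can discretize the functional, plug in \(\hat{l}_{.,s}\) from (3.10) and compare with the claimed minimum (3.11). The sketch below does this with a midpoint rule in the reduced case \(\sigma =1\); the discretization, the parameter values and the function names are illustrative assumptions.

```python
import math


def l_hat(t, s, delta, rho, horizon):
    """Candidate minimizer (3.10), for s <= t <= T."""
    m, sq = min(s, delta), math.sqrt(rho)
    denom = math.cosh(sq * (horizon - s)) + sq * m * math.sinh(sq * (horizon - s))
    return rho * m * math.cosh(sq * (horizon - t)) / denom


def min_value(s, delta, alpha, lam, horizon):
    """Claimed minimum (3.11), with rho = alpha / Lambda since sigma = 1."""
    m, sq = min(s, delta), math.sqrt(alpha / lam)
    return m / (2.0 * lam * (1.0 + m * sq * math.tanh(sq * (horizon - s))))


def functional(l, s, delta, alpha, lam, horizon, n=4000):
    """Midpoint discretization of the functional in Lemma 3.3 (ii); l maps t to l_t."""
    dt = (horizon - s) / n
    values = [l(s + (i + 0.5) * dt) for i in range(n)]
    tail = 0.0            # running approximation of int_t^T l_u du
    energy = risk = 0.0
    for v in reversed(values):
        g = tail + 0.5 * v * dt   # integral from the cell midpoint up to T
        energy += v * v * dt
        risk += g * g * dt
        tail += v * dt
    terminal = min(s, delta) / (2.0 * lam) * (1.0 - tail) ** 2
    return energy / (2.0 * alpha) + risk / (2.0 * lam) + terminal


s, delta, alpha, lam, horizon = 1.0, 0.5, 2.0, 0.5, 3.0
optimal = lambda t: l_hat(t, s, delta, alpha / lam, horizon)
value_at_optimum = functional(optimal, s, delta, alpha, lam, horizon)
```

Scaling the candidate up or down increases the value of the functional, as expected from strict convexity.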

The previous two lemmas suggest a way to construct a candidate for the solution to the dual problem: Find \(\hat{{\mathbb {Q}}} \sim {\mathbb {P}}\) whose density is given by

$$\begin{aligned} \frac{d\hat{{\mathbb {Q}}}}{d{\mathbb {P}}} = \exp \left( -\int _0^T{\hat{\theta }}_t dW_t-\frac{1}{2}\int _0^T{\hat{\theta }}_t^2dt\right) \end{aligned}$$
(3.12)

with

$$\begin{aligned} \hat{\theta }_s =\hat{a}_s\Phi _0+ \int _0^s\hat{l}_{s,r} d\hat{W}^{\hat{{\mathbb {Q}}}}_r, \quad s \in [0,T]. \end{aligned}$$
(3.13)

For the associated Brownian motion \(\hat{W}=W^{\hat{{\mathbb {Q}}}}=W+\int _0^. \hat{\theta }_r dr\) this implies the Volterra-type integral equation

$$\begin{aligned} W_t+\int _0^t \hat{a}_s \Phi _0 ds = \hat{W}_t-\int _0^t \int _0^s\hat{l}_{s,r} d\hat{W}_rds, \quad t \in [0,T]. \end{aligned}$$
(3.14)

Integral equations of this type occur in [18, 19]; see also [13]. By considering \(W+\int _0^. \hat{a}_r \Phi _0 dr\) as a Brownian motion with respect to some probability measure which is equivalent to \({\mathbb {P}}\), we can apply the results from Section 6.4 in [18] (in particular see Theorem 6.3 and its proof there). We obtain that (3.14) has a unique solution given by

$$\begin{aligned} \hat{W}_t&= W_t+\Phi _0\int _0^t \hat{a}_s ds -\int _{0}^t\int _{0}^s {\hat{k}}_{s,r} \left( dW_r+\Phi _0\hat{a}_r dr\right) ds \nonumber \\&=W_t-\int _{0}^t\int _{0}^s {\hat{k}}_{s,r} dW_r ds+\Phi _0\left( \int _{0}^t\hat{a}_s ds-\int _{0}^t\int _{0}^s {\hat{k}}_{s,r}{\hat{a}}_r dr ds\right) \end{aligned}$$
(3.15)

where \(\hat{k}\) is the associated resolvent kernel characterized by the equation

$$\begin{aligned} \hat{k}_{t,s}+\hat{l}_{t,s}=\int _{s}^t \hat{l}_{t,u}\hat{k}_{u,s}du, \quad 0 \le s \le t \le T. \end{aligned}$$
(3.16)

Moreover, \(\hat{W}\) is a Brownian motion with respect to \(\hat{{\mathbb {Q}}}\) which is well defined by (3.12).

Since our \(\hat{l}\) is multiplicatively separable, (3.16) can be reduced to a linear ODE from which we compute the explicit solution

$$\begin{aligned} \hat{k}_{t,s}=-\exp \left( \int _s^t \hat{l}_{u,u}du\right) \hat{l}_{t,s}, \quad 0 \le s \le t \le T . \end{aligned}$$
(3.17)
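As a numerical sanity check of (3.17), the following sketch verifies the kernel identity (3.16) for a multiplicatively separable kernel. The factors `g` and `h` are hypothetical stand-ins chosen purely for illustration and are not the kernel \(\hat{l}\) of (3.10); the quadrature helper `simpson` is likewise an assumption of this sketch.

```python
import math

# Hypothetical separable kernel l(t, s) = g(t) * h(s); these factors are
# illustrative assumptions and NOT the kernel of (3.10).
def g(t):
    return math.cos(t)

def h(s):
    return 1.0 + s

def l(t, s):
    return g(t) * h(s)

def diag_integral(s, t):
    # Closed form of int_s^t l(u, u) du, using the antiderivative
    # (1 + u) sin(u) + cos(u) of (1 + u) cos(u).
    F = lambda u: (1.0 + u) * math.sin(u) + math.cos(u)
    return F(t) - F(s)

def k(t, s):
    # Candidate resolvent kernel as in (3.17).
    return -math.exp(diag_integral(s, t)) * l(t, s)

def simpson(f, a, b, n=2000):
    # Composite Simpson rule with an even number n of subintervals.
    if a == b:
        return 0.0
    step = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(a + i * step)
    return total * step / 3.0

def resolvent_residual(t, s):
    # Left-hand side minus right-hand side of (3.16).
    lhs = k(t, s) + l(t, s)
    rhs = simpson(lambda u: l(t, u) * k(u, s), s, t)
    return lhs - rhs

if __name__ == "__main__":
    for (t, s) in [(0.5, 0.1), (1.3, 0.2), (2.0, 0.0)]:
        print(t, s, resolvent_residual(t, s))
```

The residuals should vanish up to quadrature error, in line with the observation that for \(l_{t,s}=g(t)h(s)\) the integrand in (3.16) is an exact derivative in u.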

We are now in a position to solve the dual problem:

Lemma 3.4

The dual infimum (3.2) is attained by \(\hat{{\mathbb {Q}}} \sim {\mathbb {P}}\) with density

$$\begin{aligned} \frac{d\hat{{\mathbb {Q}}}}{d{\mathbb {P}}} = \exp \left( -\int _0^T \hat{\theta }_tdW_t-\frac{1}{2}\int _0^T \hat{\theta }_t^2dt\right) \end{aligned}$$

for \(\hat{\theta }\) constructed in (3.13) with \(\hat{W}\) as given by (3.15); this \(\hat{W}\) coincides with the \(\hat{{\mathbb {Q}}}\)-Brownian motion induced by the \(\mathbb P\)-Brownian motion W via Girsanov’s theorem. The value of the dual problem is

$$\begin{aligned} -\frac{\Lambda \Phi ^2_0 \sqrt{\rho }}{2\coth \left( \sqrt{\rho }T\right) } +\int _{0}^T \frac{1}{2\Lambda }\frac{(s\wedge \Delta )}{1+(s\wedge \Delta ) \sqrt{\rho }\tanh \left( \sqrt{\rho }(T-s)\right) }ds. \end{aligned}$$
(3.18)

Proof

The construction of \(\hat{{\mathbb {Q}}}\), \(\hat{W}\) and \(\hat{\theta }\) has already been established by the preceding discussion. It is readily checked that \(\hat{{\mathbb {Q}}}\) has finite entropy relative to \({\mathbb {P}}\) and so \(\hat{{\mathbb {Q}}} \in {\mathcal {Q}}\). Note that \(\hat{W}\) and W generate the same filtration because of (3.14) and (3.15) and so we can follow the same reasoning as in the proof of Lemma 3.2 to obtain its representation for the dual target functional also for \(\hat{\mathbb Q}\). Recalling the minimizing properties of \(\hat{a}\) and \(\hat{l}_{.,s}\), \(s \in [0,T]\), it then follows that \(\hat{\mathbb Q}\) solves the dual problem with value (3.18). \(\square \)

By (3.2) the above value (3.18) of the dual problem already yields the claimed value (2.5) for our primal utility maximization problem. For the completion of the proof of Theorem 2.2 it therefore remains to work out the optimal turnover policy \(\hat{\phi }\). Due to its dual description (3.3), it suffices to compute \({\mathbb {E}}_{\hat{{\mathbb {Q}}}}\left[ S_T|{\mathcal {G}}^{\Delta }_t\right] = {\mathbb {E}}_{\hat{{\mathbb {Q}}}}\left[ W_T|\mathcal F_{t+\Delta }\right] \). Recalling the Volterra-type equation (3.14) and using Fubini’s theorem we can write

$$\begin{aligned} W_T&= \hat{W}_T- \int _0^T \left( \hat{a}_u\Phi _0+ \int _0^u\hat{l}_{u,s} d\hat{W}_s\right) du \\&= \int _0^T \left( 1-\int _s^T \hat{l}_{u,s}du\right) d\hat{W}_s-\int _0^T\hat{a}_u du\Phi _0. \end{aligned}$$

Thus, for any \(t \in [0,T]\), we find

$$\begin{aligned}&{\mathbb {E}}_{\hat{{\mathbb {Q}}}}\left[ S_T|{\mathcal {G}}_t^\Delta \right] \\&\quad = \int _0^{(t+\Delta ) \wedge T} \left( 1-\int _s^T \hat{l}_{u,s}du\right) d\hat{W}_s- \int _0^T\hat{a}_u du\Phi _0 \\&\quad = \int _0^{(t+\Delta ) \wedge T} \left( 1-\int _s^T \hat{l}_{u,s}du\right) \left( dW_s-\int _{0}^s {\hat{k}}_{s,r} d W_r ds\right) \\&\qquad + \Phi _0 \left( \int _0^{{(t+\Delta ) \wedge T}}\left( 1-\int _s^T \hat{l}_{u,s}du\right) \left( \hat{a}_sds-\int _0^s\hat{k}_{s,r} \hat{a}_rdr\,ds\right) -\int _0^T\hat{a}_u du\right) \end{aligned}$$

where in the second step we used (3.15) to get an expression in terms of the original input to our problem W rather than \(\hat{W}\). The structure of this expression suggests considering, for \(X=W\) and \(X=\int _0^. \hat{a}_s ds\), the integral operator

$$\begin{aligned} {\mathcal {I}}_{t}^{T}(X) := \int _0^{(t+\Delta )\wedge T} \left( 1-\int _s^T \hat{l}^T_{u,s}du\right) \left( dX_s-\int _{0}^s {\hat{k}}^T_{s,r} d X_r ds\right) \end{aligned}$$

for continuous paths X. Notice that the dX-integrals can be defined through integration by parts, which reveals in particular that \({\mathcal {I}}^T_t(X)\) depends continuously on X; notice also that we used the notation \(\hat{l}^T\) and \(\hat{k}^T\) in lieu of \(\hat{l}\) and \(\hat{k}\) to emphasize that these kernels depend on the time horizon T. In conjunction with (3.3) and \(S_t=W_t\), this provides us with a (somewhat) explicit ‘open loop’ expression of the optimal turnover policy:

$$\begin{aligned} \hat{\phi }_t = \frac{1}{\Lambda } \left( \mathcal {I}_{t}^{T}\left( W\right) -W_t +\Phi _0\left( \mathcal {I}_{t}^{T}\left( \int _0^.\hat{a}^T_sds\right) -\int _0^T\hat{a}^T_udu\right) \right) , \end{aligned}$$
(3.19)

where, again, \(\hat{a}^T\) is used to recall that \(\hat{a}\) of (3.8) depends on T.

To establish the policy’s more informative feedback description given in Theorem 2.2, we note next that dynamic programming holds for our problem:

Lemma 3.5

The optimal policy \(\hat{\phi }\) of (3.19) can alternatively be described in the form

$$\begin{aligned} \hat{\phi }_t =&\frac{1}{\Lambda } \Bigg (\mathcal {I}_{0}^{T-t}\left( W_{t+.}-W_t\right) -W_t\nonumber \\&+\hat{\Phi }_t\left( \mathcal {I}_{0}^{T-t} \left( \int _0^.\hat{a}^{T-t}_sds\right) -\int _0^{T-t}\hat{a}^{T-t}_udu\right) \Bigg ) \end{aligned}$$
(3.20)

where \(\hat{\Phi }_t=\Phi _0+\int _0^t \hat{\phi }_sds\) for \(t \in [0,T]\).

Proof

The right-hand side of (3.19) gives us for each time horizon T a continuous functional \(\Psi ^T:{\mathbb {R}}\times C[0,T]\rightarrow C[0,T]\) such that for any initial stock position \(\Phi _0 \in {\mathbb {R}}\) and any stock price evolution W, \(\Psi ^T(\Phi _0,W|_{[0,T]})\) is the correspondingly optimal strategy \(\hat{\phi }\) for the utility maximization problem.

Assume for contradiction that the statement of our lemma does not hold. Then, by continuity of sample paths of \(\hat{\phi }\), there exists \(t_0\in [0,T]\) such that with positive probability \(\hat{\phi }_{t_0}\) does not coincide with the right-hand side of (3.20). Now consider the strategy \(\tilde{\phi }\) that coincides with \(\hat{\phi }\) up to time \(t_0\) when it switches to

$$\begin{aligned} \tilde{\phi }_t:=\Psi ^{T-t_0}\left( {\hat{\Phi }}_{t_0},W_{.+t_0}-W_{t_0}\right) _{t-t_0}, \quad t \in [t_0,T]. \end{aligned}$$

For any strategy \(\phi \), we can write the contribution over the interval \([t_0,T]\) to the resulting terminal wealth as

$$\begin{aligned} V^{\Phi _0,{\phi }}_T - V^{\Phi _0,{\phi }}_{t_0} =\Phi _{t_0}(S_T-S_{t_0})+\int _{t_0}^T{\phi }_t(S_T-S_t)dt -\frac{\Lambda }{2}\int _{t_0}^T{\phi }^2_tdt=:V_{[t_0,T]}^{\Phi _{t_0},\phi }, \end{aligned}$$

where \(\Phi _{t_0}:=\Phi _0+\int _0^{t_0}\phi _t dt\). Of course, \(\tilde{\Phi }_{t_0}:=\Phi _0+\int _0^{t_0}\tilde{\phi }_t dt=\hat{\Phi }_{t_0}\). So, by the Markov property of Brownian motion and choice of \(\tilde{\phi }\) as the unique optimal policy as of time \(t_0\), this allows us to observe that

$$\begin{aligned} \mathbb {E}\left[ -\exp \left( -\alpha V_{[t_0,T]}^{\tilde{\Phi }_{t_0},\tilde{\phi }}\right) |\mathcal G^\Delta _{t_0}\right] \ge \mathbb {E}\left[ -\exp \left( -\alpha V_{[t_0,T]}^{\hat{\Phi }_{t_0}, \hat{\phi }}\right) |\mathcal G^\Delta _{t_0}\right] , \end{aligned}$$

with “>” holding on \(\{\hat{\phi }_{t_0} \not = \tilde{\phi }_{t_0}\}\) (i.e. where (3.20) is violated) because continuity of \(\hat{\phi }\) and \(\tilde{\phi }\) ensures that they will differ on an open interval once they differ at all. Since by assumption this happens with positive probability, it follows for the unconditional expected utility from \(\tilde{\phi }\) that

$$\begin{aligned} \mathbb {E} \left[ -\exp (-\alpha V^{\Phi _0,\tilde{\phi }}_T)\right]&=\mathbb {E} \left[ \exp (-\alpha V_{t_0}^{\Phi _0,\tilde{\phi }}) \mathbb {E}\left[ -\exp \left( -\alpha V_{[t_0,T]}^{\tilde{\Phi }_{t_0}, \tilde{\phi }}\right) |\mathcal G^\Delta _{t_0}\right] \right] \\&>\mathbb {E} \left[ \exp (-\alpha V_{t_0}^{\Phi _0,\tilde{\phi }}) \mathbb {E}\left[ -\exp \left( -\alpha V_{[t_0,T]}^{\hat{\Phi }_{t_0},{\hat{\phi }}}\right) |\mathcal G^\Delta _{t_0}\right] \right] \\&=\mathbb {E} \left[ -\exp (-\alpha V^{\Phi _0,\hat{\phi }}_T)\right] , \end{aligned}$$

contradicting the optimality of \(\hat{\phi }\). \(\square \)

As a consequence of this dynamic programming result, it suffices to verify our feedback policy description (2.3) for time \(t=0\):

Lemma 3.6

The optimal initial turnover rate is

$$\begin{aligned} \hat{\phi }_0=&\frac{1}{1 +\Delta \sqrt{\rho }\tanh (\sqrt{\rho }(T-\Delta )^{+})}\frac{S_{\Delta \wedge T}}{\Lambda }\\&+\frac{\sqrt{\rho }}{\coth (\sqrt{\rho }(T-\Delta )^{+}) +\Delta \sqrt{\rho }} \int _0^{\Delta \wedge T} \frac{S_{s}}{\Lambda }ds-\frac{S_0}{\Lambda }\nonumber \\&+\frac{\sqrt{\rho }}{\coth (\sqrt{\rho }(T-\Delta )^{+}) +\Delta \sqrt{\rho }} \left( \frac{\mu }{\alpha \sigma ^2}-\Phi _0\right) .\nonumber \end{aligned}$$
(3.21)

Proof

In view of (3.19), we need to compute for \(X=W\) and \(X=\int _0^.\hat{a}_udu\) the operator

$$\begin{aligned} \mathcal {I}_{0}^T\left( X\right)&= \int _0^{\Delta \wedge T} \left( 1-\int _s^T \hat{l}_{t,s}dt\right) \left( dX_s-\int _{0}^s {\hat{k}}_{s,r} d X_r ds\right) \nonumber \\&= \int _0^{\Delta \wedge T} \left( 1-\int _s^T \hat{l}_{t,s}dt\right) dX_s-I \end{aligned}$$
(3.22)

with

$$\begin{aligned} I&:=\int _0^{\Delta \wedge T} \left( 1-\int _s^T\hat{l}_{t,s}dt\right) \int _{0}^s {\hat{k}}_{s,r} d X_r ds\nonumber \\&= \int _0^{\Delta \wedge T} \left( \int _r^{\Delta \wedge T} \hat{k}_{s,r}ds -\int _r^T\int _r^{t\wedge \Delta } \hat{l}_{t,s}\hat{k}_{s,r}ds \ dt \right) dX_r \end{aligned}$$
(3.23)

where the last equality is due to Fubini’s theorem. For \(t \in [r,\Delta \wedge T]\), the kernel identity (3.16) shows that the second ds-integral in (3.23) gives \(\hat{k}_{t,r}+\hat{l}_{t,r}\). For \(t \in (\Delta \wedge T,T]\), we note that \(\Delta <T\) and we let \(n_t\) denote the numerator in the definition of \(\hat{l}_{t,.}\) in (3.10) to write \(\hat{l}_{t,r}=\hat{l}_{\Delta ,r} n_t/n_{\Delta }\). It follows by another use of the kernel identity (3.16) that for such t the second ds-integral above amounts to

$$\begin{aligned} \int _r^{\Delta } \hat{l}_{t,s}\hat{k}_{s,r}ds&= \int _r^{\Delta } \hat{l}_{\Delta ,s}\hat{k}_{s,r}ds \frac{n_t}{n_{\Delta }} = \left( \hat{k}_{\Delta ,r}+\hat{l}_{\Delta ,r}\right) \frac{n_t}{n_{\Delta }}\\&=\hat{l}_{t,r}-\exp \left( \int _r^{\Delta }\hat{l}_{u,u}du\right) \hat{l}_{t,r} \end{aligned}$$

where we used (3.17) in the final step. Plugging all this into (3.23), we see that the contribution to the dt-integral from \([r,\Delta \wedge T]\) is partially cancelled by the first ds-integral there, leaving us with

$$\begin{aligned} I=\int _0^{\Delta \wedge T}\left( -\int _r^T \hat{l}_{t,r}dt+\int _{\Delta \wedge T}^T\exp \left( \int _r^{\Delta }\hat{l}_{u,u}du\right) \hat{l}_{t,r}dt\right) dX_r. \end{aligned}$$

Inserting this into (3.22), we see a cancellation of integrals over \(\hat{l}\) and arrive at

$$\begin{aligned} \mathcal {I}_{0}^T\left( X\right)&= \int _0^{\Delta \wedge T} \left( 1-\int _{\Delta \wedge T}^T\exp \left( \int _s^{\Delta }\hat{l}_{u,u}du\right) \hat{l}_{t,s}dt\right) dX_s\nonumber \\&=\int _0^{\Delta \wedge T} \left( 1-f_T(s)\right) dX_s \end{aligned}$$
(3.24)

where in view of (3.10) we have

$$\begin{aligned} f_{T}(s) :=&\exp \left( \int _s^{\Delta \wedge T} \frac{u\rho }{1+u\sqrt{\rho }\tanh (\sqrt{\rho }(T-u))}du\right) \nonumber \\&\cdot \frac{s\sqrt{\rho }\sinh (\sqrt{\rho }(T-\Delta )^+)}{\cosh (\sqrt{\rho }(T-s)) +s\sqrt{\rho }\sinh (\sqrt{\rho }(T-s))}\nonumber \\ =&\frac{s\sqrt{\rho }\sinh (\sqrt{\rho }(T-\Delta )^+)}{\cosh (\sqrt{\rho }(T-\Delta )^{+}) +\Delta \sqrt{\rho }\sinh (\sqrt{\rho }(T-\Delta )^{+})}. \end{aligned}$$
(3.25)
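The simplification in (3.25) rests on the fact that the integrand in the exponent is minus the logarithmic derivative of \(u \mapsto \cosh (\sqrt{\rho }(T-u))+u\sqrt{\rho }\sinh (\sqrt{\rho }(T-u))\). The following sketch checks numerically that the two expressions for \(f_T(s)\) agree, with the argument of the first \(\cosh \) taken to be \(\sqrt{\rho }(T-s)\). The parameter values in the example run and the helper `simpson` are assumptions made for illustration only.

```python
import math

def simpson(f, a, b, n=2000):
    # Composite Simpson rule with an even number n of subintervals.
    if a == b:
        return 0.0
    step = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(a + i * step)
    return total * step / 3.0

def f_integral_form(s, T, Delta, rho):
    # First expression in (3.25): exponential factor times quotient.
    r = math.sqrt(rho)
    x = r * max(T - Delta, 0.0)          # sqrt(rho) * (T - Delta)^+
    upper = min(Delta, T)
    integrand = lambda u: u * rho / (1.0 + u * r * math.tanh(r * (T - u)))
    growth = math.exp(simpson(integrand, s, upper))
    denom = math.cosh(r * (T - s)) + s * r * math.sinh(r * (T - s))
    return growth * s * r * math.sinh(x) / denom

def f_closed_form(s, T, Delta, rho):
    # Final expression in (3.25), linear in s.
    r = math.sqrt(rho)
    x = r * max(T - Delta, 0.0)
    return s * r * math.sinh(x) / (math.cosh(x) + Delta * r * math.sinh(x))

if __name__ == "__main__":
    T, Delta, rho = 2.0, 0.5, 1.5        # illustrative values only
    for s in (0.0, 0.1, 0.3, 0.5):
        print(s, f_integral_form(s, T, Delta, rho), f_closed_form(s, T, Delta, rho))
```

For \(\Delta \ge T\) both expressions vanish because \((T-\Delta )^+=0\).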

Now we apply (3.24) to \(X=W\) and \(X=\int _.^T\hat{a}_udu\) to rewrite the open loop description (3.19) of \(\hat{\phi }_0\) as

$$\begin{aligned} \hat{\phi }_0 =&\frac{W_{\Delta \wedge T}}{\Lambda } (1-f_T(\Delta \wedge T)) +\int _0^{\Delta \wedge T} \frac{W_{s}}{\Lambda } f'_T(s)ds\\&- \Phi _0\left( \int _{\Delta \wedge T}^T\frac{\hat{a}_u}{\Lambda }du (1-f_T(\Delta \wedge T))+\int _0^{\Delta \wedge T}\int _{s}^T\frac{\hat{a}_u}{\Lambda }du f'_T(s)ds\right) . \end{aligned}$$

We conclude the claimed representation for the optimal policy (3.21) by inserting (3.25) and

$$\begin{aligned} \int _{s}^T\frac{\hat{a}_u}{\Lambda }du = \frac{\sqrt{\rho }\sinh (\sqrt{\rho }(T-s))}{\cosh (\sqrt{\rho }T)}, \quad s \in [0,T], \end{aligned}$$

in the above formula for \(\hat{\phi }_0\).
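To make the last insertion step concrete for the \(\Phi _0\)-term, the sketch below checks numerically that the bracket multiplying \(\Phi _0\) in the formula for \(\hat{\phi }_0\) collapses to \(\sqrt{\rho }/(\coth (\sqrt{\rho }(T-\Delta )^{+})+\Delta \sqrt{\rho })\), the coefficient of \(\Phi _0\) in (3.21). It uses only the closed form of \(f_T\) from (3.25) and the expression for \(\int _s^T \hat{a}_u/\Lambda \, du\) displayed above; the parameter values and the helper `simpson` are illustrative assumptions.

```python
import math

def simpson(f, a, b, n=2000):
    # Composite Simpson rule with an even number n of subintervals.
    if a == b:
        return 0.0
    step = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(a + i * step)
    return total * step / 3.0

def phi0_coefficient(T, Delta, rho):
    # Bracket multiplying Phi_0 in the open-loop formula for phi_0.
    r = math.sqrt(rho)
    x = r * max(T - Delta, 0.0)                  # sqrt(rho) * (T - Delta)^+
    end = min(Delta, T)
    # f_T is linear in s by (3.25), so f_T'(s) is the constant slope below.
    slope = r * math.sinh(x) / (math.cosh(x) + Delta * r * math.sinh(x))
    # tail(s) = int_s^T a_u / Lambda du as displayed above.
    tail = lambda s: r * math.sinh(r * (T - s)) / math.cosh(r * T)
    return tail(end) * (1.0 - slope * end) + slope * simpson(tail, 0.0, end)

def claimed_coefficient(T, Delta, rho):
    # sqrt(rho) / (coth(x) + Delta sqrt(rho)), rewritten via tanh so that
    # the case x = 0 needs no special treatment.
    r = math.sqrt(rho)
    th = math.tanh(r * max(T - Delta, 0.0))
    return r * th / (1.0 + Delta * r * th)

if __name__ == "__main__":
    for (T, Delta, rho) in [(2.0, 0.5, 1.5), (1.0, 0.25, 4.0), (1.0, 2.0, 1.5)]:
        print(T, Delta, rho, phi0_coefficient(T, Delta, rho), claimed_coefficient(T, Delta, rho))
```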

As a final step, we need to recall the simplifying steps from the beginning of this section where we reduced everything to the case \(S=W\) underpinning our calculations so far. Reversing these steps then leads to the formulae given in the present lemma which work for the general case required in our main theorem. \(\square \)
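For illustration, formula (3.21) can be evaluated on a discretely sampled price path as in the following sketch. The function name, the uniform sampling grid and the trapezoidal rule are assumptions of this sketch rather than part of the lemma; \(\rho \) is passed in as a given parameter, and the coefficient \(\sqrt{\rho }/(\coth (\cdot )+\Delta \sqrt{\rho })\) is rewritten via \(\tanh \) so that the case \(\Delta \ge T\) is covered without dividing by zero.

```python
import math

def initial_turnover_rate(S, dt, T, Delta, rho, Lam, mu, alpha, sigma, Phi0):
    """Evaluate (3.21) for samples S[i] = S_{i*dt} covering [0, min(Delta, T)].

    Assumes a uniform grid with (len(S) - 1) * dt == min(Delta, T); the
    time integral of S is approximated by the trapezoidal rule.
    """
    r = math.sqrt(rho)
    th = math.tanh(r * max(T - Delta, 0.0))      # tanh(sqrt(rho) (T - Delta)^+)
    w_end = 1.0 / (1.0 + Delta * r * th)         # weight of S_{Delta ^ T}
    # Equals sqrt(rho) / (coth(.) + Delta sqrt(rho)); vanishes when Delta >= T.
    w_int = r * th / (1.0 + Delta * r * th)
    integral = dt * (sum(S) - 0.5 * (S[0] + S[-1]))
    merton = mu / (alpha * sigma ** 2)
    return (w_end * S[-1] + w_int * integral - S[0]) / Lam + w_int * (merton - Phi0)

if __name__ == "__main__":
    # Illustrative values only: flat price path, position at mu/(alpha sigma^2).
    S = [100.0] * 51
    print(initial_turnover_rate(S, 0.01, 2.0, 0.5, 1.5, 0.1, 0.05, 2.0, 0.2, 0.625))
```

On a flat price path with \(\Phi _0=\mu /(\alpha \sigma ^2)\) the three price terms cancel and the last term vanishes, so the sketch returns zero up to rounding; this serves as a simple consistency check of the weights in (3.21).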