1 Introduction

In today’s financial markets, every tick is archived. When analyzing events in the distant past (the 1970s) or in less automated markets such as credit default swaps or emerging market bonds (roughly pre-2013), the only data typically available are the open, high, low, and close. An entire field, chartist analysis, uses these descriptors as the “sufficient statistics” for prediction. This paper defines the probability distribution of \(B(t\,|\,\mathrm{high}, \mathrm{low}, \mathrm{close})\) and calculates its expectation. Our formulas allow us to interpolate the price signal as \(E[B(t\,|\,\mathrm{open}, \mathrm{high}, \mathrm{low}, \mathrm{close})]\) over all times in [0,1] given any data source that has only open, high, low, and close data. We think of the open, high, low and close as “statistics” which we will use to estimate the mean and variance of the process.

Classically, most financial forecasting based on charts uses at most four pieces of information for each day: the opening price (open), the closing price (close), the maximum price (high) and the minimum price (low) (Morris 2006). We address the issue of how much additional information the high and low carry beyond that of the open and close. We measure the “value” by the reduction of the variance of the Brownian motion given one or both of the high, h, and low, \(\ell \). The variance of the path of a Brownian motion is \(V(t) =t\), which integrates to \(\int _0^1 V(s)\,ds =1/2\). For the Brownian bridge pinned to \(B(t=1)= c\), the variance is independent of the terminal value, c, and satisfies \(V(t)=t(1-t)\). Integrating the variance of the Brownian bridge from zero to one yields an average variance of \(\int _0^1 V(s)\,ds =1/6\). Thus knowledge of both the open and the close significantly reduces the variance of the process. Our results allow us to calculate the variance of Brownian motion given the high, low and close, \(V(t|h,\ell ,c)\).
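As a quick numerical check of these two time-averaged variances, here is a minimal Monte Carlo sketch (ours, for illustration only; it assumes \(\sigma =1\) and a uniform time grid):

```python
import numpy as np

rng = np.random.default_rng(0)
n_paths, n_steps = 20_000, 500
dt = 1.0 / n_steps
t = np.arange(1, n_steps + 1) * dt

# Brownian paths as cumulative sums of Gaussian increments.
B = np.cumsum(rng.normal(0.0, np.sqrt(dt), (n_paths, n_steps)), axis=1)
print(B.var(axis=0).mean())        # time average of Var[B(t)]: ~1/2

# Pinning each path to a closing value turns it into a bridge with
# variance t(1 - t), independent of the close; here we pin to c = 0.
bridge = B - B[:, [-1]] * t
print(bridge.var(axis=0).mean())   # ~1/6
```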

There have been a number of studies that use the open, high, low and close to improve the estimate of the volatility (standard deviation) of the Brownian motion (Garman and Klass 1980; McLeish 2002; Meillijson 2011; Rogers and Satchell 1991). In contrast, we assume that the variance is given and standardized to \(\sigma ^2\). In reality, the volatility of a financial time series is unknown, bursty, and temporally non-uniform on many time scales. Given a model of the time dependence of the volatility, one must transform time to an equal-volatility time. For this paper (except for Sect. 10), we neglect this difficult problem and proceed with the study of standardized Brownian motion.

Let B(t) be a Brownian motion on [0,1] and \(B_c(t)\) be the Brownian motion restricted to \(B_c(t=1)=c\). We allow an arbitrary constant variance scale, \(E[B(1)^2]=\sigma ^2\). Our notation tracks the excellent compendium of results by Devroye (2010). Many of the results summarized in Sect. 2 can be found there. We consider the distribution of B(t) conditioned on one or more of the statistics: \(B(t=1)=c\), \(\max _{t \in [0,1]}B(t)=h\) and \(\min _{t \in [0,1]}B(t)=\ell \). We evaluate the conditional densities of \(B(t|close, \max )\) and \(B(t|close, \max , \min )\) using Chapman–Kolmogorov type calculations. The conditional densities of \(B(t|\max )\) and \(B(t|\max , \min )\) are found by integrating the earlier densities over c. Our primary goal is to evaluate the conditional mean and conditional variance of B(t) in these cases. For several cases, explicit formulas for the moments are given. The locations of the minimum and of the maximum are unknown and not used in our analysis. A number of other studies examine the distribution of Brownian motion given its maximum and the location of its maximum (Shepp 1979; Pitman and Yor 1996; Imhof 1992; Riedel 2020). Integrating densities over the location of the maximum is sometimes proposed. Our experience is that this is analytically intractable.

All of our results are for Brownian motion, B(t), on [0,1] with \(Var[B(1)]=\sigma ^2\). We will often use the notation \(\sigma ^2_t \equiv \sigma ^2 t(1-t)\). Section 2 reviews results on the density/distribution of extrema of Brownian motion. Section 3 derives analytic formulae for \(E[B(t|c,h)]\) and \(Var[B(t|c,h)]\). Section 4 derives analytic formulae for \(E[B(t|c,h,\ell )]\) and \(Var[B(t|c,h,\ell )]\). Section 5 reviews our numerical simulations. Section 6 plots E[B(t|c)] and Var[B(t|c)] as well as E[B(t|h)] and Var[B(t|h)]. It then computes Feller’s distribution for the range, \(\Delta =h-\ell \), and compares it with our simulation results. Section 7 plots \(E[B(t|c,h)]\) and \(Var[B(t|c,h)]\) for a variety of different values of (c,h). Section 8 compares the analytic formulae in Sect. 3 with the simulation results in Sect. 7. Section 9 compares the analytic formulae in Sect. 4 with the simulation results. Section 10 applies the resulting estimators to SP500 index futures data. Section 11 derives the distribution \(p(t|h,\ell )\) by integrating over the closing value in \(p(t,h,\ell ,c)\). Section 12 discusses and summarizes results. Especially important are Table 2 and Fig. 18 as they demonstrate the variance reduction from using the high and low in the estimation of B(t).

2 Distributions of Brownian extrema

The study of Brownian extrema dates back to the founders of the field (Lévy 1948). Our brief discussion follows Devroye (2010) with additional results taken from Shepp (1979), Durrett et al. (1977a, 1977b), Imhof (1985), Borodin and Salminen (2002). See also (Bertoin and Pitman 1994; Karatzas and Shreve 1998; Pitman and Yor 1996; Yor 1997). We denote the Gaussian density by \(\phi _{\sigma ^2}(x) = (2\pi \sigma ^2)^{-1/2} \exp (-x^2/2\sigma ^2)\). The density of the high (the maximum of B(t)), h, is that of the half normal: \(p(h)=2 \phi _{\sigma ^2}(h) =\sqrt{\frac{2}{\pi \sigma ^2}} \exp (-h^2/2\sigma ^2)\), \(h>0\). The classic result (Williams 1970; Imhof 1984), derived using the reflection principle, is

Theorem 2.1

The joint distribution of the close, c, and the high, h, is

$$\begin{aligned} P(h,c)= P(\max \{B(s),\ s \in [0,1]\}\le h, B(1)=c) = \phi _{\sigma ^2}(c) - \phi _{\sigma ^2}(2h-c) \ . \end{aligned}$$
(2.1)

The corresponding density satisfies

$$\begin{aligned} p(h,c) \equiv p(h=\max \{B(s),\ s \in [0,1]\}, B(1)=c) = \frac{2(2h-c)}{\sigma ^2} \phi _{\sigma ^2}(2h-c) \end{aligned}$$
(2.2)

where \( h\ge 0\), \(h \ge c\).

Here P(h,c) is a distribution in h and a density in c. The conditional density, p(c|h), is given by

$$\begin{aligned} p(c|h) =p(h,c)/p(h) = \ \frac{(2h-c) \exp (-(2h-c)^2/2\sigma ^2)}{\sigma ^2\exp (-h^2/2\sigma ^2)} \ . \end{aligned}$$
(2.3)

Using (2.3), we find

$$\begin{aligned} E[c|h]= & {} h - \sigma \sqrt{\frac{\pi }{2}} erfc\left( \frac{h}{\sqrt{2\sigma ^2}}\right) \exp \left( \frac{h^2}{2\sigma ^2}\right) \ , \quad \nonumber \\ E[c^2|h]= & {} h^2 +2\sigma ^2-\,4h\sigma \sqrt{\frac{\pi }{2}} erfc\left( \frac{h}{\sqrt{2\sigma ^2}}\right) \exp \left( \frac{h^2}{2\sigma ^2}\right) \ , \end{aligned}$$
(2.4)
$$\begin{aligned} Var[c|h]= & {} 2\sigma ^2-2h\sigma \sqrt{\frac{\pi }{2}} erfc\left( \frac{h}{\sqrt{2\sigma ^2}}\right) \exp \left( \frac{h^2}{2\sigma ^2}\right) \nonumber \\&- \,\frac{\pi \sigma ^2}{2} \left[ erfc\left( \frac{h}{\sqrt{2\sigma ^2}}\right) \exp \left( \frac{h^2}{2\sigma ^2}\right) \right] ^2 \ . \end{aligned}$$
(2.5)
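The erfc-times-exponential products in (2.4)–(2.5) overflow for moderately large \(h/\sigma \) if evaluated literally; a small sketch (ours) uses scipy’s erfcx, \(erfcx(x)=e^{x^2} erfc(x)\), to evaluate them stably:

```python
import numpy as np
from scipy.special import erfcx   # erfcx(x) = exp(x**2) * erfc(x)

def close_given_high(h, sigma=1.0):
    """Mean and variance of the close given the high, Eqs. (2.4)-(2.5)."""
    A = sigma * np.sqrt(np.pi / 2.0) * erfcx(h / (np.sqrt(2.0) * sigma))
    return h - A, 2.0 * sigma**2 - 2.0 * h * A - A**2

# E[c|h] changes sign near h = 0.7518 (cf. Sect. 6.2).
print(close_given_high(0.7517915247))
```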

A result that goes back at least to Lévy (1948) (see also Choi and Roh 2013) is

Theorem 2.2

The joint distribution of the close, c, the high, h, and the low, \(\ell \), is

$$\begin{aligned} P(h,\ell ,c)= & {} P(B(1)=c, \ell \le \{B(s),\ s \in [0,1]\}\le h) \nonumber \\= & {} \sum _{k=-\infty }^{\infty } \phi _{\sigma ^2}(c-2k(h-\ell ) ) - \phi _{\sigma ^2}(c-2h- 2k(h-\ell )) \end{aligned}$$
(2.6)
$$\begin{aligned}= & {} \phi _{\sigma ^2}(c) - \sum _{k=0}^{\infty } \left[ \phi _{\sigma ^2}(c-2h- 2k\Delta ) + \phi _{\sigma ^2}(c-2\ell + 2k\Delta ) \right] \nonumber \\&+\, \sum _{k=1}^{\infty }\left[ \phi _{\sigma ^2}(c-2k\Delta )+ \phi _{\sigma ^2}(c+2k\Delta )\right] \end{aligned}$$
(2.7)

where \(\Delta \equiv (h-\ell )\).

The symmetric form, (2.7), not only treats h and \(\ell \) symmetrically, but also shows that the series is alternating. Here \(P(h,\ell ,c)\) is a distribution in \(h,\ell \) and a density in c. To calculate the density, we use \(p(h,\ell ,c) = - \partial _{\ell } \partial _h P(B(t=1)=c, \ell \le B(t) \le h)\).
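A minimal sketch (ours) of evaluating the series (2.6) with a symmetric truncation \(|k| \le K\); K = 10 is ample except when \(\Delta \) is very small:

```python
import numpy as np

def phi(x, var):
    """Gaussian density with variance var."""
    return np.exp(-x**2 / (2.0 * var)) / np.sqrt(2.0 * np.pi * var)

def P_hlc(h, ell, c, sigma2=1.0, K=10):
    """Truncation of (2.6); a distribution in (h, ell) and a density in c."""
    delta = h - ell
    k = np.arange(-K, K + 1)
    return np.sum(phi(c - 2 * k * delta, sigma2)
                  - phi(c - 2 * h - 2 * k * delta, sigma2))

print(P_hlc(h=1.0, ell=-1.0, c=0.3))   # requires ell <= 0 <= h, ell <= c <= h
```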

Corollary 2.3

The density, \(p(B(1)=c, \max \{B(s),\ s \in [0,1]\}=h, \min \{B(s)\}=\ell )\), is given by

$$\begin{aligned} p(h,\ell ,c,\sigma ^2)= & {} \frac{4}{\sigma ^2}\sum _{k=-\infty }^{\infty } k^2 a_k(c,\Delta )\phi _{\sigma ^2}(c-2k\Delta ) \nonumber \\&-\, k(k+1)a_k(c-2h,\Delta ) \phi _{\sigma ^2}(c-2h-2k\Delta ) \end{aligned}$$
(2.8)

where \(a_k(c,\Delta )\equiv (c-2k \Delta )^2/\sigma ^2-1\).

A number of estimators of \(\sigma ^2\) given \((h,\ell ,c)\) have been proposed (Garman and Klass 1980; McLeish 2002; Meillijson 2011; Rogers and Satchell 1991). The maximum likelihood estimator, \(\hat{\sigma }^2\equiv \mathrm{argmax}_{\sigma ^2}\, p(h,\ell ,c,\sigma ^2) \), was proposed in Ball and Torous (1984), Siegmund (1985).
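A sketch of that likelihood maximization (ours; the density (2.8) is truncated at \(|k| \le K\) and a plain grid search stands in for a proper optimizer):

```python
import numpy as np

def phi(x, var):
    return np.exp(-x**2 / (2.0 * var)) / np.sqrt(2.0 * np.pi * var)

def p_hlc(h, ell, c, sigma2, K=10):
    """Joint density (2.8) of (high, low, close), truncated at |k| <= K."""
    d = h - ell
    k = np.arange(-K, K + 1)
    a = lambda y: (y - 2 * k * d) ** 2 / sigma2 - 1.0
    return (4.0 / sigma2) * np.sum(
        k**2 * a(c) * phi(c - 2 * k * d, sigma2)
        - k * (k + 1) * a(c - 2 * h) * phi(c - 2 * h - 2 * k * d, sigma2))

def sigma2_mle(h, ell, c, grid=np.linspace(0.01, 4.0, 2000)):
    """argmax over a sigma^2 grid of the likelihood (2.8)."""
    return grid[np.argmax([p_hlc(h, ell, c, s2) for s2 in grid])]

print(sigma2_mle(1.0, -0.8, 0.3))
```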

Corollary 2.4

The density, \(p(\max \{B(s),\ s \in [0,1]\}=h,\ \min \{B(s),\ s \in [0,1]\}=\ell )\), satisfies

$$\begin{aligned}&p(h,\ell ) = - \int _{\ell }^h \partial _{\ell } \partial _hP(B(t=1)=c, \ell \le B(t) \le h)dc = \end{aligned}$$
(2.9)
$$\begin{aligned}&\frac{-4}{\sigma ^2} \sum _{k=-\infty }^{\infty } k^2 [h_k \phi _{\sigma ^2}(h_k)-\ell _k \phi _{\sigma ^2}(\ell _k)] - k(k+1)[(h_k-2h)\phi _{\sigma ^2}(h_k-2h) \nonumber \\&\qquad - \,(\ell _k-2h)\phi _{\sigma ^2}(\ell _k-2h)] \ \end{aligned}$$
(2.10)

where \(h_{k}\equiv h-2k\Delta \) and \(\ell _{k}\equiv \ell -2k\Delta \).

3 Density given high and close

The classical results in Sect. 2 are for \(t=1\). Our focus for the remainder of the article is on the density and moments for \(t<1\). We derive the density, \(p(B(t)=x |B(1)=c,\ \max \{B(s)\}=h)\) and then compute the first and second moments. We interpret the high and close as “statistics” in the sense of estimation theory.

Theorem 3.1

The distribution, \(F(x,t, h, c) \equiv P(B(t)=x, B(1)=c, B(s) \le h\ \mathrm{for}\ s\ \in \ [0,1])\), satisfies

$$\begin{aligned} F(x,t, h, c)\equiv & {} P(B(t)=x, B(s) \le h\ \mathrm{for}\ s\ \in \ [0,t]) \times \ P(B'(1-t)=c-x, B'(s) \nonumber \\\le & {} h-x\ \mathrm{for}\ s\ \in [0,1-t]) \ \end{aligned}$$
(3.1)

where \(B'\) is a second independent Brownian motion,

$$\begin{aligned} P_{t,x}(h)\equiv & {} P(\max \{B_x(s),\ s \in [0,t]\} \le h) = \phi _{t\sigma ^2}(x) - \phi _{t\sigma ^2}(2h-x) \ , \end{aligned}$$
(3.2)
$$\begin{aligned} P_{1-t,c-x}(h-x)\equiv & {} P(\max \{B_{c-x}(s),\ s \in [t,1]\} \le h-x) \nonumber \\= & {} \phi _{(1-t)\sigma ^2}(c-x) - \phi _{(1-t)\sigma ^2}(2h-c-x) \ . \end{aligned}$$
(3.3)

Similar results to Theorem 3.1 for the case of Brownian meanders and excursions appear in Chung (1976), Durrett et al. (1977b), but we have not found precisely this result. Here F(x,t,h,c) is a distribution in h and a density in x and c.

Corollary 3.2

The conditional density, \(p(B(t)=x |h,c) \equiv p(B(t)=x |B(1)=c,\ \max \{B(s)\}=h)\), satisfies:

$$\begin{aligned} p(x,t|h,c)\equiv p(B(t)=x|h,c) = p(B(t)=x, B(1)=c,\ \max \{B(s)\}=h) / p(h,c),\nonumber \\ \end{aligned}$$
(3.4)

where the divisor, p(h,c), is given by (2.2) and

$$\begin{aligned} p(B(t)=x, B(1)= & {} c,\ \max \{B(s)\}=h) =P_{t,x}(h) q_{1-t,c-x}(h-x)\nonumber \\&+\, q_{t,x}(h) P_{1-t,c-x}(h-x) \ . \end{aligned}$$
(3.5)

Here

$$\begin{aligned} q_{t,x}(h)\equiv & {} p(\max \{B_x(s),\ s \in [0,t]\} = h) = \frac{d P_{t,x}(h)}{dh} \nonumber \\= & {} \frac{2(2h-x)}{t\sigma ^2} \phi _{t\sigma ^2}(2h-x) ,\end{aligned}$$
(3.6)
$$\begin{aligned} q_{1-t,c-x}(h-x)\equiv & {} p(\max \{B_{c-x}(s),\ s \in [t,1]\} = h-x) \nonumber \\= & {} \frac{2(2h-c-x)}{(1-t)\sigma ^2} \phi _{(1-t)\sigma ^2}(2h-c-x) \ . \end{aligned}$$
(3.7)

Equation (3.5) simply states that if the realization goes through the points (t, x) and (1, c) and has a high value of h, then it reaches h either in [0, t] or in [t, 1]. Equation (3.5) is the kernel of the Chapman–Kolmogorov representations for this restricted Brownian motion problem (Karush 1961).
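A sketch (ours) that assembles the Chapman–Kolmogorov product (3.5) from (3.2)–(3.3) and (3.6)–(3.7) and recovers \(E[B(t)|h,c]\) by quadrature in x; it is a useful cross-check on the closed forms (3.12)–(3.14) below:

```python
import numpy as np

def phi(x, var):
    return np.exp(-x**2 / (2.0 * var)) / np.sqrt(2.0 * np.pi * var)

def joint_xhc(x, t, h, c, s2=1.0):
    """Eq. (3.5): p(B(t)=x, B(1)=c, max = h), for x <= h and c <= h."""
    P_left = phi(x, t * s2) - phi(2*h - x, t * s2)                    # (3.2)
    P_right = phi(c - x, (1-t) * s2) - phi(2*h - c - x, (1-t) * s2)   # (3.3)
    q_left = 2 * (2*h - x) / (t * s2) * phi(2*h - x, t * s2)          # (3.6)
    q_right = (2 * (2*h - c - x) / ((1-t) * s2)
               * phi(2*h - c - x, (1-t) * s2))                        # (3.7)
    return P_left * q_right + q_left * P_right

def cond_mean(t, h, c, s2=1.0, n=4000):
    """E[B(t)|h,c] = M_1/M_0 via a simple Riemann sum over x in (-inf, h]."""
    x = np.linspace(h - 10.0 * np.sqrt(s2), h, n)
    p = joint_xhc(x, t, h, c, s2)
    return np.sum(x * p) / np.sum(p)

print(cond_mean(0.5, h=1.0, c=0.0))
```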

Lemma 3.3

F(x,t,h,c) is the signed sum of four Gaussians:

$$\begin{aligned} F(x,t, h, c) = f_1(x,t,c) -f_2(x,t,c,h) - f_3(x,t,c,h) +f_4(x,t,c,h) \ . \end{aligned}$$
(3.8)

The \(f_i\) are of the form:

$$\begin{aligned} f_i(x,t,h,c)= & {} \frac{1}{2\pi \sigma \sigma _t}\exp \left( -\frac{(1-t)(x-a_i)^2 +t(x-b_i)^2}{2t(1-t)\sigma ^2}\right) \nonumber \\= & {} \phi _{\sigma ^2_t}(x- \mu _i(t,c,h))\,\frac{e^{-g_i(c,h)}}{\sqrt{2\pi }\sigma } \end{aligned}$$
(3.9)

where \(\sigma _t^2 \equiv t(1-t)\sigma ^2\), \(a_1=0\), \(b_1=c\), \(a_2=0\), \(b_2=2h-c\), \(a_3=2h\), \(b_3=c\), \(a_4=2h\), \(b_4=2h-c\). In (3.9), \(\mu _i(t,c,h)\) and \(g_i\) are defined by

$$\begin{aligned} \mu _i(t,c,h) \equiv a_i(1-t) +b_it\ ;\quad g_i \equiv \frac{[a_i^2(1-t) +t b_i^2]-\mu _i^2}{2\sigma _t^2}=\frac{(a_i - b_i)^2}{2\sigma ^2} \ . \end{aligned}$$
(3.10)

Thus, \(\mu _1= ct\), \(g_1=c^2/2\sigma ^2\), \(\mu _2= (2h-c)t\), \(g_2=(2h-c)^2/2\sigma ^2\), \(\mu _3=2h(1-t) +ct\), \(g_3=(2h-c)^2/2\sigma ^2\) and \(\mu _4= 2h-ct\), \(g_4=c^2/2\sigma ^2\).

Note that \(f_i(x=h,t,h,c)= \phi _{t\sigma ^2}(h)\phi _{(1-t)\sigma ^2}(h-c)\). The equality of the four terms at \(x=h\) will allow us to cancel terms when we integrate by parts. We also define \(\psi _i = e^{-g_i(c,h)} / \sqrt{2\pi }\sigma \) so \(\psi _1=\psi _4=\phi _{\sigma ^2}(c)\) and \(\psi _2=\psi _3=\phi _{\sigma ^2}(2h-c)\). To simplify our calculations, observe

$$\begin{aligned} \partial _h f_i = \left[ \frac{(x-\mu _i)}{\sigma _t^2} \partial _h \mu _i -\partial _h g_i\right] f_i \quad \mathrm{and}\ \partial _x f_i = -\frac{(x-\mu _i)}{\sigma _t^2} f_i \ . \end{aligned}$$
(3.11)

Note \(f_1\) is independent of h and therefore may be ignored. Derivatives of F(xthc) with respect to h only enter through h dependencies in \(\mu _i\) and \(g_i\). We now evaluate the moments with respect to x for a given time, t, and fixed (hc).

Theorem 3.4

Consider \(M_m(t, h, c) \equiv \int _{-\infty }^h x^m p(x,t, h, c)\,dx\). The zeroth moment is \(M_0(t,c,h) =p(h,c)\) where p(h,c) is given in (2.2).

$$\begin{aligned} M_1\equiv & {} \phi _{\sigma ^2}(c)\left[ 1 +erf\left( \frac{ct-h}{\sqrt{2}\sigma _t}\right) \right] \nonumber \\&+\,\phi _{\sigma ^2}(r)\left[ \frac{2hr}{\sigma ^2}-1 +p_{h,r,t} erf\left( \frac{h-rt}{\sqrt{2}\sigma _t}\right) \right] - 4 r t(1-t) \phi _{\sigma ^2}(r) \phi _{\sigma _t^2}(h-rt)\nonumber \\ \end{aligned}$$
(3.12)

where \(r\equiv 2h-c\) and \(p_{h,r,t}\equiv (1-2t) +\frac{2r(rt-h)}{\sigma ^2}\).

$$\begin{aligned}&M_2= 2(2h-ct) \phi _{\sigma ^2}(c)\left[ 1 +erf\left( \frac{ct-h}{\sqrt{2}\sigma _t}\right) \right] - 8 r h t(1-t)\, \phi _{\sigma ^2}(r)\, \phi _{\sigma _t^2}(h-rt) \end{aligned}$$
(3.13)
$$\begin{aligned}&\quad +\, 2\phi _{\sigma ^2}(r) \left[ r t(1-t) + q_1(h,c,t) + q_2(h,c,t)\, erf\left( \frac{h-rt}{\sqrt{2} \sigma _t}\right) \right] \end{aligned}$$
(3.14)

where \(q_1(h,c,t)\equiv -(rt^2 +(1-t)(2h-rt) ) +r (h^2+ (h-rt)^2)/\sigma ^2\) and \(q_2(h,c,t)\equiv (2h(1-t)-rt) + 2hr(rt-h)/\sigma ^2\).

Note \(M_1(t=1,c,h) =c\, p(c,h)\) and \(M_2(t=1,c,h) =c^2 p(c,h)\), as must be the case.

To compute the moments, we use

$$\begin{aligned}&M_m(h,c)=\int _{-\infty }^h x^m \partial _h F(x,t, h, c)\,dx = \sum _{i=2}^4 s_i \int _{-\infty }^h x^m \left[ -\partial _h \mu _i \partial _x f_i -\partial _h g_i f_i\right] dx = \nonumber \\\end{aligned}$$
(3.15)
$$\begin{aligned}&\sum _{i=2}^4 s_i \int _{-\infty }^{h}\left[ m x^{m-1} \partial _h \mu _i -\partial _h g_i x^m \right] f_idx = \end{aligned}$$
(3.16)
$$\begin{aligned}&\sum _{i=2}^4 s_i \psi _i\int _{-\infty }^{h-\mu _i} \left[ m(x+\mu _i)^{m-1} \tau _i - \frac{2(2h-c)}{\sigma ^2}(x+\mu _i)^m(1-\delta _{i,4})\right] \phi _{\sigma _t^2}(x)dx.\qquad \end{aligned}$$
(3.17)

Here \(s_i = -1\) for \(i=2,3\) and \(s_i=1\) for \(i=1,4\) and we define \(\tau _i \equiv \partial _h \mu _i\) so that \(\tau _2= 2t\), \(\tau _3= 2(1-t)\), \(\tau _4= 2\). To go from (3.15) to (3.16), we use that the three terms evaluated at \(x=h\) cancel. The remainder of the evaluation of the moments \(M_1(t,h,c)\) and \(M_2(t,h,c)\) is deferred to the Appendix. \(\square \)

Given the moments, \(M_i(t,h,c)\), the conditional mean and conditional second moment are \(E[B(t|h,c)]=M_1(t,h,c)/p(h,c)\) and \(E[B^2(t|h,c)]=M_2(t,h,c)/p(h,c)\). We treat \(E[B(t|h,c)]=M_1(t,h,c)/p(h,c)\) as an estimator of B(t|h,c). An alternative estimator is the maximum likelihood estimate given by maximizing (3.5) with respect to x for each time t.

4 Density given high, low and close

We now consider using the open, high, low and close together as statistics to estimate a realization of a Brownian process. After writing down the density conditional on these statistics, we spend the bulk of this section evaluating the moments of the density as summarized by (4.3). We begin by applying the Chapman–Kolmogorov equation to \(P(B(t)=x, \ell \le B(s) \le h)\) (Karush 1961):

Theorem 4.1

Let \(G(x,t, h, \ell , c) \equiv P(B(t)=x, B(1)=c, \ell \le B(s) \le h\ \mathrm{for}\ s \in [0,1] )\), \(Q(x,t, h,\ell ) \equiv P(B(t)=x, \ell \le B(s) \le h\ \mathrm{for}\ s\le t)\) and \(Q_R(x,t, h,\ell ,c) = P(B'(1-t)=c-x, \ell -x \le B'(s) \le h-x\ \mathrm{for}\ s\le 1-t)\). Here \(B'\) is a second independent Brownian motion. Then

$$\begin{aligned} P(B(t)= & {} x, \ell \le B(s) \le h|s\le t) \nonumber \\= & {} \sum _{j=-\infty }^{\infty } \bigg [ \phi _{t\sigma ^2}(x-2j(h-\ell )) - \phi _{t\sigma ^2}(x-2h+ 2j(h-\ell )) \bigg ] \ , \end{aligned}$$
(4.1)
$$\begin{aligned} P(B'(1-t)= & {} c-x, \ell -x \le B'(s) \le h-x) \nonumber \\= & {} \sum _{k=-\infty }^{\infty } \bigg [ \phi _{\hat{t}\sigma ^2}(x-c+2k\Delta ) - \phi _{\hat{t}\sigma ^2}(x-(2h-c) -2k\Delta ) \bigg ] \end{aligned}$$
(4.2)

where \(\Delta \equiv h-\ell \) is the range (high minus low) on [0,1] and \(\hat{t}\equiv (1-t)\). The probability distribution, \(G(x,t,h,\ell ,c)\), satisfies

$$\begin{aligned} G(x,t, h, \ell , c) = Q(x,t, h,\ell ) Q_R(x,t, h,\ell ,c) \end{aligned}$$
(4.3)

Proof

We apply (2.6) in the time interval \(s \in [0,t]\) to yield (4.1) and to \(s \in [t,1]\) to yield (4.2). The Markovian property yields (4.3). \(\square \)

Clearly, \(Q_R(x,t, h,\ell , c)= Q(c-x,1-t, h-x,\ell -x)\). Here \(G(x,t,h,\ell ,c)\) is a distribution in \(h,\ell \) and a density in x, c. To derive the density \(p(x;t,h,\ell ,c)\), we need to consider four terms: the probability that both the high and the low are attained to the left of t, the probability that just the low is attained to the left of t, the probability that just the high is attained to the left of t, and the probability that both the high and the low are attained to the right of t.
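A sketch of (4.1)–(4.3) (ours; each series truncated at \(|j| \le K\)). The identity \(Q_R(x,t, h,\ell , c)= Q(c-x,1-t, h-x,\ell -x)\) lets a single helper serve both factors:

```python
import numpy as np

def phi(x, var):
    return np.exp(-x**2 / (2.0 * var)) / np.sqrt(2.0 * np.pi * var)

def Q(x, t, h, ell, s2=1.0, K=10):
    """Truncation of (4.1): P(B(t)=x, ell <= B(s) <= h for s <= t)."""
    d = h - ell
    j = np.arange(-K, K + 1)
    return np.sum(phi(x - 2*j*d, t * s2) - phi(x - 2*h + 2*j*d, t * s2))

def G(x, t, h, ell, c, s2=1.0, K=10):
    """Eq. (4.3): G = Q * Q_R, with Q_R = Q(c - x, 1 - t, h - x, ell - x)."""
    return Q(x, t, h, ell, s2, K) * Q(c - x, 1.0 - t, h - x, ell - x, s2, K)
```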

Corollary 4.2

The joint density, \(p(x;t,h,\ell ,c)\equiv p(B(t)=x,\ B(1)=c,\ \max \{B(s)\}=h,\ \min \{B(s)\}=\ell )\), satisfies

$$\begin{aligned} p(x;t,h,\ell ,c) = - \partial _{\ell }\partial _h G(x,t, h,\ell , c) = -\partial _{\ell }\partial _h Q(x,t, h,\ell ) Q_R(x,t, h,\ell ,c) \ . \end{aligned}$$
(4.4)

Furthermore,

$$\begin{aligned} p(x;t,h,\ell ,c)= & {} p(t_{\ell }<t, t_{h}<t)+p(t_{\ell }<t, t_{h} \ge t)+ p(t_{\ell } \ge t, t_{h} < t) \nonumber \\&+\, p(t_{\ell }\ge t, t_{h} \ge t) \end{aligned}$$
(4.5)

where \(t_{\ell }\) is the first time that B reaches its minimum and \(t_{h}\) is the first time that B reaches its maximum.

Analogous to (3.5), Eq. (4.4) is the kernel of the Chapman–Kolmogorov representation for this restricted Brownian motion problem. The four terms in (4.5) correspond to applying the product rule of calculus to \(Q(x,t, h,\ell ) Q_R(x,t, h,\ell ,c)\). As in (3.9), the generator, \(G(x,t, h, \ell , c)\), is composed of a sum of Gaussians in x. The remainder of this section and the Appendix are devoted to evaluating the moments of \(p(x;t,h,\ell ,c)\).

Theorem 4.3

The probability distribution \(G(x,t, h, \ell , c)\) has the following representations:

$$\begin{aligned} G(x,t, h,\ell , c)\equiv & {} \sum _{j,k=-\infty }^{\infty } \sum _{i=1}^4 \frac{s_{i}}{2\pi \sigma \sigma _t} \exp \left( \frac{-(x-a_{ij})^2}{2t\sigma ^2} -\frac{(x-b_{ik})^2 }{2(1-t)\sigma ^2}\right) \nonumber \\= & {} \sum _{j,k=-\infty }^{\infty }\sum _{i=1}^4 s_{i} f_{ijk}(x) \ . \end{aligned}$$
(4.6)

Here \(s_i = -1\) for \(i=2,3\) and \(s_i=1\) for \(i=1,4\) and \(\sigma _t^2\equiv t(1-t)\sigma ^2\). The parameters are defined as \(a_{1,j}=2j \Delta \), \(b_{1,k}=c-2k\Delta \), \(a_{2,j}=2j \Delta \), \(b_{2,k}=2h-c+ 2k\Delta \),\(a_{3,j}=2h-2j\Delta \), \(b_{3,k}=c-2k\Delta \), \(a_{4,j}=2h-2j\Delta \), \(b_{4,k}=2h-c+2k\Delta \). The shifted Gaussian representation is

$$\begin{aligned} G(x,t, h,\ell , c)= & {} \sum _{j,k=-\infty }^{\infty } \sum _{i=1}^4 \frac{s_{i}}{\sqrt{2\pi }\sigma }\phi _{\sigma _t^2}(x- \mu _{ijk}(h,\ell )) e^{-g_{ijk}(h,\ell )} \nonumber \\= & {} \sum _{j,k=-\infty }^{\infty } \sum _{i=1}^4 s_{i}\phi _{\sigma _t^2}(x-\mu _{ijk}(h,\ell )) \psi _{ijk}(h,\ell ) \ \end{aligned}$$
(4.7)

where \(\mu _{ijk}(t,c,h) \equiv a_{ij}(1-t) +b_{ik}t\), \(g_{ijk} =\left( [a_{ij}^2(1-t) +t b_{ik}^2]-\mu _{ijk}^2\right) /(2\sigma _t^2)=(a_{ij} - b_{ik})^2/2\sigma ^2 \) and \(\psi _{ijk}(h,\ell ) \equiv e^{-g_{ijk}(h,\ell )} /\sqrt{2\pi }\sigma \).

Let \(v_{j,k} =2(j(1-t) +kt)\), \({\tilde{v}}_{j,k} =2(j(1-t) -kt)\), \(w_{j,k} =2(j+k)\), \({\tilde{w}}_{j,k} =2(j-k)\). Then \(\mu _{1jk}= ct+{\tilde{v}}_{j,k}\Delta \), \(g_{1jk}= (c-w_{j,k}\Delta )^2/2\sigma ^2\), \(\mu _{2jk}= (2h-c)t+v_{j,k}\Delta \), \(g_{2jk}=(2h-c-{\tilde{w}}_{j,k}\Delta )^2/2\sigma ^2\), \(\mu _{3jk}=2h(1-t) +ct-v_{j,k}\Delta \), \(g_{3jk}=(2h-c-{\tilde{w}}_{j,k}\Delta )^2/2\sigma ^2\) and \(\mu _{4jk}= 2h-ct-{\tilde{v}}_{j,k}\Delta \), \(g_{4jk}=(c-w_{j,k}\Delta )^2/2\sigma ^2\).

As in Sect. 3, we evaluate the moments in x for a given time, t, and fixed \((h,\ell ,c)\).

Lemma 4.4

The moments, \(M_m(t,h,\ell ,c)\),

$$\begin{aligned} M_m(t,h,\ell ,c)\equiv & {} -\int _{\ell }^h x^m \partial _h \partial _{\ell } G(x,t, h,\ell , c)\,dx\nonumber \\= & {} \int _{\ell }^h x^m \sum _{j,k=-\infty }^{\infty } \sum _{i=1}^4 -s_{i} \partial _h \partial _{\ell } f_{ijk}(x,h,\ell )\,dx \ \ . \end{aligned}$$
(4.8)

The (ijk)th term inside the integral satisfies

$$\begin{aligned}&-\partial _h \partial _{\ell } \phi _{\sigma ^2_t}(x- \mu _{ijk}(t,h,\ell ))e^{-g_{ijk}(h,\ell ,t)} \nonumber \\&\quad = H_{ijk}(x- \mu _{ijk}) \phi _{\sigma _t^2}(x- \mu _{ijk}(t,h,\ell ))e^{-g_{ijk}} \ \ . \end{aligned}$$
(4.9)

Here \(H_{ijk}(z)\) is a quadratic polynomial in z defined as

$$\begin{aligned} H_{ijk}(z)\equiv -\left[ \frac{z \tau _{ijk} }{\sigma _t^2} -\partial _h g_{ijk}\right] \left[ \frac{z \hat{\tau }_{ijk}}{\sigma _t^2} - \partial _{\ell } g_{ijk}\right] + \frac{\tau _{ijk}\hat{\tau }_{ijk}}{\sigma _t^{2}} + \partial _{\ell }\partial _h g_{ijk}\ \end{aligned}$$
(4.10)

where \(\tau _{ijk} \equiv \partial _{h} \mu _{ijk}\) and \(\hat{\tau }_{ijk} \equiv \partial _{\ell } \mu _{ijk}\). Thus \(\tau _{1jk} ={\tilde{v}}_{j,k}=-\hat{\tau }_{1jk}\), \(\tau _{2jk} =2t+v_{j,k}\), \(\hat{\tau }_{2jk} =-v_{j,k}\), \(\tau _{3jk}=2(1-t)-v_{j,k}\), \(\hat{\tau }_{3jk} =v_{j,k}\), \(\tau _{4jk} =2-{\tilde{v}}_{j,k}\) and \(\hat{\tau }_{4jk} ={\tilde{v}}_{j,k}\). Here \(\tau _{.}\) and \(\hat{\tau }_{.}\) have no dependence on h and \(\ell \).

We group the terms in (4.10) by powers of z and define \(A_{ijk} = \tau _{ijk} \hat{\tau }_{ijk}\), \(B_{ijk} = [\tau _{ijk}\partial _{\ell } g_{ijk}+\hat{\tau }_{ijk}\partial _{h} g_{ijk}]\) and \(C_{ijk} =\Gamma _{ijk}+\sigma _t^{-2} \tau _{ijk} \hat{\tau }_{ijk} \) and \(\Gamma _{ijk}\equiv -\partial _h g_{ijk}\partial _{\ell } g_{ijk} + \partial _{\ell }\partial _h g_{ijk}\). Note that \(\Gamma _{4jk} = \Gamma _{1jk} = (2g_{1jk}-1)w_{jk}^2/\sigma ^2\) and \(\Gamma _{3jk} = \Gamma _{2jk} = (2g_{2jk}-1){\tilde{w}}_{jk}({\tilde{w}}_{jk}-2)/\sigma ^2\). Thus \(H_{ijk}(z) = -A_{ijk}z^2/\sigma _t^4 +B_{ijk}z/\sigma _t^2 +C_{ijk}\).

To simplify the moment calculation, we evaluate the derivatives by h and \(\ell \) and recast them as derivatives with respect to x so that we can integrate by parts:

$$\begin{aligned} \partial _h \partial _{\ell } f_{ijk} = \tau _{ijk}\hat{\tau }_{ijk}\partial _x^2 f_{ijk} + B_{ijk}\partial _x f_{ijk} - \Gamma _{ijk} f_{ijk} \ \ . \end{aligned}$$
(4.11)

Note that \(\sum _{i=1}^4 s_i f_{ijk}(x=h)=0\), \(\sum _{i=1}^4 s_i \partial _h f_{ijk}(x=h)=0\) and \(\sum _{i=1}^4 s_i \partial _{\ell } f_{ijk}(x=h)=0\). This allows us to integrate by parts and drop terms.

We define the moments, \(M_m\), where the limits of integration, H and L, are to be set to h and \(\ell \) after we differentiate \(\partial _h \partial _{\ell } G\). This is done because integration by parts should not include the dependence on the limits of integration.

We integrate by parts and find from (4.11):

$$\begin{aligned} M_m(t,h,\ell ,c)= & {} \sum _{j,k=-\infty }^{\infty } \sum _{i=1}^4 s_{i} \int _L^H \left( mx^{m-1} \left[ A_{ijk}\partial _{x}f_{ijk} +B_{ijk} f_{ijk} \right] + x^m \Gamma _{ijk} f_{ijk}\right) dx\nonumber \\&+ \sum _{i,j,k} s_i ({\mathcal {B}}_{ijk}(x=h) -{\mathcal {B}}_{ijk}(x=\ell ) ) \end{aligned}$$
(4.12)

where \({\mathcal {B}}_{ijk}\) is the boundary term. In the Appendix 13.5, we show that the boundary terms sum to zero.

In this section, we will often need the triple sum, \(\sum _{j,k=-\infty }^{\infty } \sum _{i=1}^4\). For notational simplicity, we replace the triple sum with a simple \(\sum _{ijk}\) when appropriate.

Lemma 4.5

The moments have the representation:

$$\begin{aligned} M_m(t, h,\ell ,c)= & {} \sum _{ijk} s_{i} \psi _{ijk}\left[ \frac{-m A_{ijk}}{\sigma ^2_t}G_{m-1,1}(\mu _{ijk}) +m B_{ijk}G_{m-1,0}(\mu _{ijk},\sigma ^2_t)\right. \nonumber \\&\left. +\Gamma _{ijk}G_{m0}(\mu _{ijk},\sigma ^2_t) \right] \ \end{aligned}$$
(4.13)

where the functions \(G_{mn}\) are defined by

$$\begin{aligned} G_{mn}(\mu ,h,\ell ,\sigma )\equiv \int _{\ell -\mu }^{h-\mu } (x+\mu )^m x^n \phi _{\sigma ^2}(x)\,dx \ . \end{aligned}$$
(4.14)

As in Theorem 4.3, \(\psi _{ijk} \equiv e^{-g_{ijk}(h,\ell ,t)} /\sqrt{2\pi } \sigma \).

Proof

We substitute the definitions in (4.14) into (4.12). \(\square \)

In Appendix 13.2, we evaluate the functions \(G_{mn}\) in terms of \(\phi _{\sigma ^2_t}(h-\mu _{ijk})\), \(\phi _{\sigma ^2_t}(\ell -\mu _{ijk})\) and the corresponding error functions. Collecting terms from above and using Appendix 13.2 yields

Theorem 4.6

For \(m \le 2\), Eq. (4.13) becomes

$$\begin{aligned} M_m(t)=\sum _{ijk} s_{i}\psi _{ijk}\left[ a_{ijk}^{(m)} \phi _{\sigma ^2_t}(h-\mu _{ijk}) -{\hat{a}}_{ijk}^{(m)} \phi _{\sigma ^2_t}(\ell -\mu _{ijk}) + e_{ijk}^{(m)}R_{\sigma _t}(h,\ell ,\mu _{ijk})\right] \nonumber \\ \end{aligned}$$
(4.15)

where \(R_{\sigma _t}(h,\ell ,\mu _{ijk})\equiv [ E_{\sigma _t}(h-\mu _{ijk}) - E_{\sigma _t}(\ell -\mu _{ijk})]\) and \(E_{\sigma }(x)\) is the scaled erf function, \(E_{\sigma }(x)\equiv \frac{1}{2}\, erf(x/\sqrt{2}\sigma )\). For \(m=1\), the coefficients are

$$\begin{aligned} a_{ijk}^{(1)}= {\hat{a}}_{ijk}^{(1)}= A_{ijk}- \Gamma _{ijk}\,\sigma ^2_t \ \ \ , \ \ \ e_{ijk}^{(1)}=B_{ijk}+ \Gamma _{ijk}\mu _{ijk} \ . \end{aligned}$$
(4.16)

For \(m=2\), the coefficients are

$$\begin{aligned} a_{ijk}^{(2)}&=2h A_{ijk} - 2B_{ijk}\sigma ^2_t - \Gamma _{ijk}\sigma ^2_t (\mu _{ijk}+h) \ \ , \nonumber \\ {\hat{a}}_{ijk}^{(2)}&=2\ell A_{ijk} -2B_{ijk}\sigma ^2_t - \Gamma _{ijk}\sigma ^2_t(\mu _{ijk}+\ell ) \ \ , \nonumber \\ e_{ijk}^{(2)}&=2B_{ijk}\mu _{ijk}-2A_{ijk} + \Gamma _{ijk}\,(\mu _{ijk}^2+\sigma ^2_t) \ \ . \end{aligned}$$
(4.17)

For \(m=0\), \(a_{ijk}^{(0)}=0\), \({\hat{a}}_{ijk}^{(0)}=0\) and \(e_{ijk}^{(0)}=\Gamma _{ijk}\).

Proof

We substitute in the \(G_{mn}\) expressions into (4.13) and collect terms. \(\square \)

Some further simplifications of the coefficients in (4.16)–(4.17) can be found in Appendix 13.4 for \(m\le 2\). When \(m>2\), the terms multiplying \(E_{\sigma }(h-\mu _{ijk})\) and \(E_{\sigma }(\ell -\mu _{ijk})\) are different.

Corollary 4.7

\(M_0(t,h,\ell ,c)= p(h,\ell ,c)\) as given by (2.8).

Proof

See the Appendix 13.6. \(\square \)

We treat \(E[B(t|h,\ell ,c)]=M_1(t,h,\ell ,c)/p(h,\ell ,c)\) as an estimator of \(B(t|h,\ell ,c)\). To evaluate (4.13) numerically, we need to truncate the expansion in j and k. Luckily, the Feller distribution of Sect. 6.3 shows that very few realizations have small values of \(\Delta \). Thus the double expansion in j and k converges quickly for the vast majority of Brownian realizations.

A second method to evaluate the probability of (4.3) is to numerically evaluate Q, \(\partial _h Q\), \(\partial _{\ell } Q\) and \(\partial _{\ell }\partial _h Q\) in (4.1) and to numerically evaluate \(Q_R\), \(\partial _h Q_R\), \(\partial _{\ell } Q_R\) and \(\partial _{\ell }\partial _h Q_R\) in (4.2). We then numerically integrate the moments of (4.18).

$$\begin{aligned} prob(x,t,h,\ell ,c)= & {} \partial _{\ell }\partial _h Q(x,t,h,\ell )\, Q_R + \partial _h Q(x,t,h,\ell )\, \partial _{\ell } Q_R \nonumber \\&+\, \partial _{\ell } Q\, \partial _h Q_R+ Q\, \partial _{\ell }\partial _h Q_R(x,t,h,\ell ,c) \end{aligned}$$
(4.18)

times \(x^m\) from \(x=\ell \) to \(x=h\). Each of the terms in the integral involves truncating only one of the j or k sums. Thus the additional work involved in evaluating Q and \(Q_R\) at many points to evaluate the integral is partially compensated by the single infinite sums as opposed to a doubly infinite sum.
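A sketch of this second route (ours): instead of coding the analytic derivatives, one can difference \(Q\, Q_R\) in \((h,\ell )\) numerically and then integrate; the Q helper from the sketch above is repeated so the block stands alone:

```python
import numpy as np

def phi(x, var):
    return np.exp(-x**2 / (2.0 * var)) / np.sqrt(2.0 * np.pi * var)

def Q(x, t, h, ell, s2=1.0, K=10):
    d = h - ell
    j = np.arange(-K, K + 1)
    return np.sum(phi(x - 2*j*d, t * s2) - phi(x - 2*h + 2*j*d, t * s2))

def joint_xhlc(x, t, h, ell, c, s2=1.0, eps=1e-3):
    """-d_h d_ell [Q Q_R] by central differences, cf. (4.4)."""
    g = lambda hh, ll: (Q(x, t, hh, ll, s2)
                        * Q(c - x, 1.0 - t, hh - x, ll - x, s2))
    return -(g(h + eps, ell + eps) - g(h + eps, ell - eps)
             - g(h - eps, ell + eps) + g(h - eps, ell - eps)) / (4.0 * eps**2)

def cond_moment(m, t, h, ell, c, s2=1.0, n=600):
    """Numerical M_m/M_0: m-th conditional moment of B(t) given (h, ell, c)."""
    xs = np.linspace(ell, h, n)
    p = np.array([joint_xhlc(x, t, h, ell, c, s2) for x in xs])
    return np.sum(xs**m * p) / np.sum(p)

print(cond_moment(1, 0.5, h=1.0, ell=-1.0, c=0.3))   # E[B(1/2)|h,ell,c]
```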

An alternative estimator is the maximum likelihood estimate given by maximizing likelihood of \(p(x,t|h,\ell ,c)\) with respect to x for each time t. Here \(p(x,t|h,\ell ,c) = -\partial _h \partial _{\ell }G(x,t, h, \ell , c) / p(h,\ell ,c)\). Using the series representation yields

$$\begin{aligned} p(x,t|h,\ell ,c) =\frac{\sum _{j,k=-\infty }^{\infty } \sum _{i=1}^4 \frac{s_{i}}{\sqrt{2\pi }\sigma } H_{ijk}(x- \mu _{ijk}(h,\ell )) \phi _{\sigma _t^2}(x- \mu _{ijk}(h,\ell )) e^{-g_{ijk}(h,\ell )}}{p(h,\ell ,c)}.\nonumber \\ \end{aligned}$$
(4.19)

In practice, the estimator \(E[B(t|h,\ell ,c)]=M_1(t,h,\ell ,c)/p(h,\ell ,c)\) is much faster to evaluate than the maximum likelihood estimate.

5 Numerical methods

Simply put, we generate a large number of Brownian paths, bin the paths in \((close,\ max,\ min)\) space and calculate the mean and variance for each time and bin. We order the coordinates of phase space, \((q_1,q_2, q_3)\), so that \(q_1 = B(1)\), \(q_2=\max _{0 \le t \le 1}B(t)\) and \(q_3=\min _{0 \le t \le 1}B(t)\). We also consider the case where we replace one or more of these operators with argmax or argmin. The results for the argmax case are found in Riedel (2020).

A very straightforward algorithm is

1. Specify a timestep, dt, a number of bins in each direction, nbins, and a number of sample paths, \(N_{samp}\), with typically \(N_{samp} \approx \kappa \ \mathrm{nbins}^3\) where \(\kappa \) denotes the typical number of simulations in a bin. More generally, for any choice of grids for the bins, we want at least \(\kappa \) simulations in each bin where \(\kappa \) is a large number. Generate a large array of scaled Gaussian random variables, size \((N_{samp}, 1/dt)\). Cumsum them to generate an array of Brownian paths (see the sketch following this list). We often use a nonuniform time step where the time step is smaller near \(t=0\) and near \(t=1\).

2. In the first phase space direction, compute bin boundaries so that the number of curves is roughly equal in each bin. For each one-dimensional bin, compute bin boundaries in the second coordinate direction so that the number of curves in each bin is roughly equal. Finally, for each of the two-dimensional bins, compute bins in the third direction.

3. For each of the \(nbins^3\) bins, assign a triple index, \(\left( i,j,k\right) \), compute the mean of the coordinates, \(({\bar{q}}_1,{\bar{q}}_2,{\bar{q}}_3)\), and compute the mean, \(\mu (t;{\bar{q}}_1,{\bar{q}}_2,{\bar{q}}_3)\), and variance, \(V(t;{\bar{q}}_1,{\bar{q}}_2,{\bar{q}}_3)\), of \(\{B(t)\}\) in the bin.

4. Test for convergence of \(\mu (t;{\bar{q}}_1,{\bar{q}}_2,{\bar{q}}_3)\) and \(V(t;{\bar{q}}_1,{\bar{q}}_2,{\bar{q}}_3)\) in \(N_{samp}\), nbins, and dt. This involves interpolation since the grid boundaries are random functions of the particular ensemble of paths. Note that the grid boundaries for the first coordinate direction are independent of the other two coordinate directions, but the average value of \(q_1\) will depend on all three indices, (i,j,k). We find that interpolation from one grid to another broadens the width of peaked functions, especially when argmax is one of the given variables.
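A condensed sketch of steps 1–3 (ours; one binning direction shown, and sizes kept modest so it runs in memory):

```python
import numpy as np

rng = np.random.default_rng(1)
n_samp, n_steps, nbins = 100_000, 250, 20
dt = 1.0 / n_steps

# Step 1: Brownian paths via cumulative sums of Gaussian increments.
B = np.cumsum(rng.normal(0.0, np.sqrt(dt), (n_samp, n_steps)), axis=1)
close, high, low = B[:, -1], B.max(axis=1), B.min(axis=1)

# Step 2 (first direction only): near-equal-count bins in q1 = close;
# conditional boundaries in q2 and q3 are built the same way per q1-bin.
edges = np.quantile(close, np.linspace(0.0, 1.0, nbins + 1))
idx = np.clip(np.searchsorted(edges, close, side="right") - 1, 0, nbins - 1)

# Step 3: per-bin mean and variance of B(t) at every time step.
mu = np.stack([B[idx == b].mean(axis=0) for b in range(nbins)])
var = np.stack([B[idx == b].var(axis=0) for b in range(nbins)])
```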

There is a bias versus variance tradeoff. If the bins are too large, the variation of the mean and variance will be obscured. If the bins are too small, there will be too few curves in each bin and the sample variance will dominate. Each of the close, max and min has a Gaussian or half Gaussian distribution individually, so the tails of the distribution will be spread out. The situation is actually somewhat better since the high and low are exponentially distributed given the closing value. Nevertheless, exponential distributions have very few points in the tail of the distribution. Again, a low density of curves will significantly inflate the size of the tail bins and thereby add larger bias to the computation of the bin variance. Thus convergence of the mean and variance on the outermost bins is tenuous. When we compute the population average variance, we are tempted to downweight or even exclude the outer bins. While this is probably a smart thing to do, we report the simple ensemble average instead of a more complex limit that reduces the underweighting as the bin size goes to zero.

Assume that we wish to generate bins in the \({\bar{q}}\) direction. We sort the Brownian realizations in the \({\bar{q}}_1\) direction. To generate the grid boundaries, we initially tried equi-spaced quantile bins. This results in very large bins in the low density region. These large bins bias our estimates of both the expectation and the variance. Let the density of points/curves be \(n({\bar{q}})\). To reduce the size of the largest bins, we select bin boundaries that keep \(\int _{{\bar{q}}_k}^{{\bar{q}}_{k+1}} n({\bar{q}})^\alpha d{\bar{q}}\) approximately equal, where \(\{{\bar{q}}_k\}\) are the bin boundaries. We use \(\alpha =.7\)–.75 while \(\alpha =1\) corresponds to equal quantiles. We find that the first and last bins converge very slowly in (nSim, nbin) space, especially when using quantile based gridding. Using equal bins of \(n({\bar{q}})^\alpha \) partially but not completely alleviates this problem.
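A sketch of this boundary selection (ours; a histogram stands in for the density \(n({\bar{q}})\), and \(\alpha =1\) recovers plain quantile bins):

```python
import numpy as np

def powered_edges(samples, nbins, alpha=0.75, ngrid=2000):
    """Bin edges that roughly equalize the per-bin integral of n(q)**alpha."""
    counts, grid = np.histogram(samples, bins=ngrid)
    w = np.cumsum(counts.astype(float) ** alpha)
    w /= w[-1]
    targets = np.linspace(0.0, 1.0, nbins + 1)[1:-1]
    inner = grid[1:][np.searchsorted(w, targets)]
    return np.concatenate(([grid[0]], inner, [grid[-1]]))
```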

Wiener’s Fourier representation of Brownian paths on [0,1] is

$$\begin{aligned} B(t) = \xi _0 t + \sqrt{2}\sum _{n=1}^{\infty } \xi _n \frac{\sin (\pi n t)}{\pi n},\ \ {\mathrm{where }}\, \{\xi _k \}\ {\mathrm{are \ independent \ standard \ normal.}} \end{aligned}$$
(5.1)

Given an ensemble of Brownian paths, \(\{B_i(t)\}\), we can create an equivalent ensemble of Brownian paths, \(\{B_i(t,c)\}\), with right endpoint c, using the formula \(B_i(t,c) \equiv B_i(t) - (B_i(t=1) -c)\, t\). This allows us to take one set of Brownian paths and reuse them on a grid of final values. This significantly reduces the number of realizations we need to cover phase space. Thus if the closing value is the first parameter direction that we examine, a 3-dimensional parameterization is reduced to a sequence of two-dimensional parameterizations.
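A short sketch of the pinning trick (ours):

```python
import numpy as np

rng = np.random.default_rng(2)
n_steps = 1000
t = np.linspace(1.0 / n_steps, 1.0, n_steps)
B = np.cumsum(rng.normal(0.0, np.sqrt(1.0 / n_steps), n_steps))

def pin(B, t, c):
    """B_i(t, c) = B_i(t) - (B_i(1) - c) t: same path, new closing value."""
    return B - (B[-1] - c) * t

for c in (-1.0, 0.0, 1.0):        # one simulated path serves a grid of closes
    assert abs(pin(B, t, c)[-1] - c) < 1e-12
```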

6 Single conditional value

6.1 Brownian bridge

We begin with plots of our simulation for the Brownian bridge case, i.e. Brownian motion constrained to a given closing value. For this simulation, we use 15 million simulations with nsteps=1500. For a given value of \(B(1)=c\), the simulation yields a straight line in time for \(E[B(t)|B(1)=c]\). Figure 1 plots the time dependent variance, \(Var[B(t)|B(1)=c]\), for a variety of c. The closing values are chosen to be the bin values numbered \((0,2, \ldots , nbin-3, nbin-1)\), where the third through eighth bins are equi-spaced in bin number. The theoretical value is \(t(1-t)\) and is displayed as the red curve in Fig. 1. All but the first and last curves match the theoretical values. This occurs because the first and last bins cover a very large range of c. We are averaging different values of \(E[B(t)|B(1)= c]\) and the squared bias is miscounted as variance.

Fig. 1

Var[B(t|c)] for various final values, c

6.2 Given high

To calculate the probability, p(x,t,h), we integrate \(p(x,t, h)= \int _{-\infty }^h p(x,t, h, c)\,dc\) using (3.5)–(3.7). The result is

Theorem 6.1

The probability density, \(p(x, t, h) \equiv p(B(t)=x |\max \{B(s)\}=h)\) satisfies

$$\begin{aligned} p(x,t, h)= & {} 2[\phi _{t\sigma ^2}(x) - \phi _{t\sigma ^2}(2h-x)]\phi _{(1-t)\sigma ^2}(h-x)\nonumber \\&+ \frac{2(2h-x)}{t\sigma ^2} \phi _{t\sigma ^2}(2h-x)\, erf\left( \frac{h-x}{\sqrt{2(1-t)}\,\sigma }\right) . \end{aligned}$$
(6.1)

The theoretical values of E[B(t)|h] and Var[B(t)|h] can be calculated by computing moments with respect to (6.1) and then dividing by \(p(h) = 2 \phi _{\sigma ^2}(h)\), \(h\ge 0\). Unfortunately, we have not found a tractable analytic form for the integrals and therefore we compute them numerically (Fayed and Atiya 2014). Another, very abstract expression for p(x,t|h) can be found in Bertoin et al. (1999).
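A sketch of that numerical computation (ours; the division by p(h) happens implicitly through the normalizing sum):

```python
import numpy as np
from scipy.special import erf

def phi(x, var):
    return np.exp(-x**2 / (2.0 * var)) / np.sqrt(2.0 * np.pi * var)

def p_xth(x, t, h, s2=1.0):
    """Joint density (6.1) of (B(t)=x, max = h), for x <= h."""
    a = 2.0 * (phi(x, t * s2) - phi(2*h - x, t * s2)) * phi(h - x, (1-t) * s2)
    b = (2.0 * (2*h - x) / (t * s2) * phi(2*h - x, t * s2)
         * erf((h - x) / np.sqrt(2.0 * (1 - t) * s2)))
    return a + b

def mean_var_given_h(t, h, s2=1.0, n=4000):
    """E[B(t)|h] and Var[B(t)|h] by a Riemann sum over x in (-inf, h]."""
    x = np.linspace(h - 10.0 * np.sqrt(s2), h, n)
    p = p_xth(x, t, h, s2)
    m1 = np.sum(x * p) / np.sum(p)
    m2 = np.sum(x**2 * p) / np.sum(p)
    return m1, m2 - m1**2

print(mean_var_given_h(0.5, h=0.8))
```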

Figure 2 plots the expectation of B(t) for ten values of the high. Not surprisingly, if the high occurs near \(t=0\), the expectation decreases monotonically beyond the argmax, \(\theta \), and decreases faster for smaller t. Let \(f(t,h)\equiv E[B(t)|\max B=h]\). It appears that f is smooth in t and \(|\frac{\partial f}{\partial t}|\) is decreasing in time. For large values of h, f(t,h) grows approximately linearly. We see that the zero of \(f(t=1,h)\) occurs somewhere between \(.68<h<.95\); using (2.4), the precise value is .7517915247. Figure 3 plots the variance of a bin as a function of time. Again, the computed variance includes the squared bias from effectively assuming that the expectation is constant in each bin. Since f(t,h) varies from the smallest value of h in the bin to the largest value of h in the bin, we are systematically overestimating the variance. For this particular computation, we define vrAvg to be the time and ensemble average of the variance. Looking at the dependence as a function of nbins, the number of bins, we find \(vrAvg(nbins=80)=0.16033\), \(vrAvg(nbins=160)=0.16021\), \(vrAvg(nbins=320)=0.16018\) and \(vrAvg(nbins=480)=0.16017\). Knowing the value of the high is slightly better than knowing the close at reducing the time averaged variance, since \(vrAvg< 1/6\).

Returning to Fig. 3, we see that \(var(t,h)\equiv Var[B(t)|\max B= h]\) is monotonically increasing for small values of h, up to at least \(h=.67\). For larger values of h, the variance is non-monotone. This non-monotonicity occurs because at large values of h, the maximum of B(t) is likely to be near \(t=1\). In these simulations, we use an ensemble of 36,000,000 realizations computed with 1530 steps and bin the results into 100 bins. For each of the ten values of the high, we display both the simulation curve and the analytic curve from numerically computing the moments of (6.1). The simulated curves have the symbols overstruck on them. The point is that the match of simulation with (6.1) is very good.

Fig. 2

Expectation of B(t) given \(max\{B(s)\}=h\) for various values of the high, h. If the maximum is small, the expectation decreases for most of its domain. If the maximum is large, the expectation increases

Fig. 3

Variance of B(t) given \(max\{B(s)\}=h\) for various values of the high, h

6.3 Feller range

To look at convergence, we examine the distribution of the range as a function of the number of steps in the Brownian motion simulation. The theoretical distribution was calculated by Feller (1951): the range \(R_t\equiv \max _{0\le s \le t}B(s) -\min _{0\le s \le t} B(s) \) at time t is distributed like \(\sqrt{t}R_1\), and the density of \(R_1\) is the function \(f(x)=8 \sum _{k=1}^{\infty } (-1)^{k+1}k^2 \phi (kx)\) defined on \((0,\infty )\), where \(\phi \) denotes the standard normal density. As noted by Feller: “In this form it is not even obvious that the function is positive”. We compute Feller’s formula for the density of the range of a Brownian motion. It converges very slowly near zero: to evaluate \(f(x=.005)\), we need between 300 and 400 terms. The formula is useful for comparing our Brownian motion computations with the theoretical results. Although Feller’s article is almost seventy years old, we are unaware of any previous numerical study of its convergence or even a computation of it. Figure 4 compares the empirical density with Feller’s result. The blue curve is computed from Feller’s expansion; the black curve is the empirical density from four million realizations with 2000 time steps. The green curve uses only 500 time steps. We see very good agreement. The main difference is that the empirical distribution is shifted slightly to the left. Less than \(0.1\%\) of the distribution lies below \(range<0.7\). In Sect. 4, the density given high and low bounds involves an expansion in \(\sum _k \exp (- k^2(h-\ell )^2)\). This expansion converges very quickly for the vast majority of the ensemble of Brownian paths.
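A sketch of the computation behind Fig. 4 (ours; K = 400 terms, matching the count quoted above, with the grid starting at x = .05 where this truncation is reliable):

```python
import numpy as np

def feller_range_density(x, K=400):
    """f(x) = 8 sum_{k>=1} (-1)**(k+1) k**2 phi(k x), phi standard normal."""
    k = np.arange(1, K + 1)[:, None]
    terms = (-1.0) ** (k + 1) * k**2 * np.exp(-(k * x) ** 2 / 2.0)
    return 8.0 * np.sum(terms, axis=0) / np.sqrt(2.0 * np.pi)

x = np.linspace(0.05, 4.0, 1000)
f = feller_range_density(x)
print(f.sum() * (x[1] - x[0]))        # ~1: the density integrates to one
print((x * f).sum() * (x[1] - x[0]))  # ~1.596 = E[R_1] = 2*sqrt(2/pi)
```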

Fig. 4

Density of the range: \(\max _{0\le s \le 1}B(s) -\min _{0\le s \le 1} B(s)\). Computation of Feller’s formula uses 400 terms. Empirical distribution uses 4,000,000 simulations with 500 and 2000 time steps

We see that the shift of the empirical distribution decreases as the step size decreases. For a step size of .0005, the shift of the center of mass of the distribution is .0066 from the theoretical result. Using a timestep four times larger doubles the shift.

The density of the range is very small for \(range<.5\) and this region is poorly approximated by the Feller expansion. There is a clear opportunity for an asymptotic expansion in the region of small range.

7 Figures conditional on close, high

In this section, we plot E[B(t|c,h)] and Var[B(t|c,h)] for a variety of different values of (c,h). Specifically, we choose quantiles (.2, .5, .8) of the bin values for the close. For our robustified grid, this corresponds to \(close = -1.011, 0.0152, 1.055\). In each plot, we plot the expectation E[B(t|c,h)] for ten values of h. The ten values of h are chosen to be equi-spaced in the bin coordinate from the second bin to the second-to-last bin. We then repeat for Var[B(t|c,h)]. We conclude with plots for the time average of E[B(t|c,h)] and Var[B(t|c,h)].

For these plots, we use 1530 time steps on each simulation for a total of 18 million simulations with 100 bins in each parameter direction. The curves overstruck by symbols are the simulation curves. The analytic formula curves have the same color but no symbol.

7.1 Time dependent mean given close, high

Figure 5 shows that the expectation is nearly monotonically decreasing for strongly negative values of the close and near zero values of the high. We say nearly decreasing because we have not examined the behavior near \(time=0\). For large values of the high, the expectation peaks near the middle of the time interval. Figure 6 shows the expectation is nearly symmetric in time when the close is near zero.

Fig. 5

\(E[B(t |close=-1.011, various\ high)]\) where the values of the maximum are given in the legend

Fig. 6

\(E[B(t |close=0.0152, various\ high)]\). The smooth curves with no symbol are given by (3.12) while the noisy curves are our simulation

Fig. 7

\(E[B(t |close=1.055, various\ high)]\). The values of the high are given in the legend

Figures 5 and 7 display the following reflection symmetry: \(E[B(t |-c, h)] = E[B(1-t |c, h+c)] -c\) where \(c>0\).

7.2 Time dependent variance given close, high

Figures 8, 9 and 10 display \(Var[B(t|close, high)]\) for \(close = -1.011, 0.0152, 1.055\). The smooth curves with no symbol are the analytic results from (3.12) and (3.13). In many cases, the variance is multimodal in time. Figures 8 and 10 display the following reflection symmetry: \(Var[B(t |-c, h)] = Var[B(1-t |c, h+c)]\) where \(c>0\).

Fig. 8

\(Var[B(t |close=-1.011, high)]\)

Fig. 9

\(Var[B(t |close=0.0152, high)]\)

Fig. 10

\(Var[B(t |close=1.055, high)]\)

Fig. 11

Comparison of simulation with (3.12) for four values of (c,h). \(5\%\) worst MSE Mean:.0000117 at close:0.622, high:0.718    \(2\%\) worst MSE Mean:.0000124 at close:1.67, high:1.739    \(1\%\) worst MSE Mean:.000013 at close:-0.294, high:0.0581    \(0.2\%\) worst MSE Mean:.0000136 at close:0.373, high:0.446

8 Comparison of theory and simulation given close and high

In this section, we plot the simulation and the theoretical calculation given by (3.12) and (13.5). For this comparison, we use 30 million realizations, each with 1530 steps. The results are then binned in 120 bins in each direction for a total of 1.73 million bins. We compute the MSE for each bin and sort them. Figure 11 displays the fits for the worst .05, .02, .01 and .002 quantiles of the bins. To put the curves to scale, we plot all the curves together. The curves overstruck by symbols are the simulation curves. The analytic formula curves have the same color but no symbol. The differences are due to (a) averaging realizations for different values of (c,h); (b) discretization errors from the finite time step of the Brownian motion. Figure 12 compares the simulated variance in four separate bins with the analytic expression in (13.5). Here again, we compute the squared error for each of the 1.73 million bins. We then plot the fits for the worst .05, .02, .01 and .002 quantiles of the bins. The worst fits for the variance have different parameters than the worst fits to the empirical mean. To put the curves to scale, we plot all the curves together.

Fig. 12

Comparison of simulation with (13.5) for four values of (c,h).    \(5\%\) worst MSE Var:.0000000363 at close:1.996, high:2.167    \(2\%\) worst MSE Var:.0000000567 at close:1.007, high:1.148    \(1\%\) worst MSE Var:.00000236 at close:3.125, high:4.036    \(0.2\%\) worst MSE Var:.00000277 at close:−0.836, high:1.518

9 Comparison of theory and simulation given close, high and low

In this section, we plot the simulation and the theoretical calculation given by (4.13). For this comparison, we use 30 million realizations, each with 1530 steps. The results are then binned in 120 bins in each direction, for a total of 1.73 million bins. We compute the MSE for each bin and sort them. We then display the fits for the worst .05, .02, .01 and .002 quantiles of the bins. To put the curves to scale, we plot all the curves together (Fig. 13).

Fig. 13

Comparison of simulation with (4.13) for four values of \((c,h,\ell )\).    \(5\%\) worst MSE Mean:.0000114 at close:−1.289, high:0.109, low:−1.502;    \(2\%\) worst MSE Mean:.000014 at close:0.836, high:0.932, low:−0.487;    \(1\%\) worst MSE Mean:.0000165 at close:0.972, high:1.256, low:−1.198;    \(0.2\%\) worst MSE Mean:.0000257 at close:0.242, high:0.875, low:−1.056

Figure 14 compares the simulated variance in four separate bins with the analytic expression in (13.5). Here again, we compute the squared error for each of the 1.73 million bins. We then plot the fits for the worst .05, .02, .01 and .002 quantiles of the bins. The worst fits for the variance have different parameters than the parameters for the worst fits to the empirical mean. To put the curves to scale, we plot all the curves together.

Fig. 14

Comparison of simulation with (4.13) for four values of \((c,h,\ell )\). \(5\%\) worst MSE Var:.00000124 at close:−0.038, high:0.756, low:−0.611; \(2\%\) worst MSE Var:.0000021 at close:3.125, high:3.286, low:−0.276; \(1\%\) worst MSE Var:.00000311 at close:−3.125, high:0.307, low:−3.642; \(0.2\%\) worst MSE Var:.00000695 at close:−3.125, high:0.911, low:−3.211

10 Estimation of SP500 prices

We now estimate the SP500 index future, ES, given only prices at the open, high, low and close. Our real applications use the formulas in Sect. 4 to estimate the time evolution of series with only open, high, low and close data. We choose the SP500 because we have the time history and can test the performance of various estimators. For 1309 days between 2005 and 2015, we compute ten-second bars between 9:30 am and 4:00 pm for a total of \(2340=6*390\) prices per day. We exclude half days. For each day, we define \(X(t,day)\equiv \log (price)(t,day) - \log (price)(t=0,day)\). Since \(X(t=0,day)=0\), we do not use the first point each day. The volatility varies throughout the day, being larger near the beginning and end of the day.

We estimate the time dependence as

$$\begin{aligned} \hat{\sigma }^2(t) = \frac{1}{N_d}\sum _{day} [X(t,day) - X(t-h,day)]^2 \end{aligned}$$
(10.1)

where \(N_d\) is the number of days in the sum and h denotes the ten-second bar spacing (not the high). The time dependent volatility profile is taken to be independent of the day. The sum of the volatilities is \(\sum _{j=1}^N \hat{\sigma }^2(t_j) =.007216^2\), corresponding to a daily volatility of 1.6.

We define volatility time \(\tau _i\) by

$$\begin{aligned} \tau _k =\tau (t_k) = \sum _{i=1}^k \hat{\sigma }^2(t_i) / \sum _{j=1}^N \hat{\sigma }^2(t_j). \end{aligned}$$
(10.2)

In volatility time, the diffusion rate of \(X(\tau (t))\) is time independent and matches the assumptions of Brownian motion.
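A sketch of (10.1)–(10.2) (ours; X is a days-by-bars array of cumulative log price changes, so the first column is measured from the day’s open):

```python
import numpy as np

def volatility_time(X):
    """Per-bar variance profile (10.1) and normalized volatility time (10.2).

    X: array of shape (n_days, n_bars) with X measured from the open.
    """
    inc = np.diff(X, axis=1, prepend=0.0)         # X(t) - X(t-h), bar by bar
    sigma2_t = np.mean(inc**2, axis=0)            # Eq. (10.1): average over days
    tau = np.cumsum(sigma2_t) / np.sum(sigma2_t)  # Eq. (10.2): tau in (0, 1]
    return sigma2_t, tau
```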

We score our various estimates, \({\hat{X}}(t,day)\), with the MSE in volatility time:

$$\begin{aligned} MSE= \frac{1}{N*N_d} \sum _{i,day} \hat{\sigma }^2(t_i) [X(t_i,day) - {\hat{X}}(t_i,day)]^2. \end{aligned}$$
(10.3)

To normalize the MSE, we use the relative mean square error:

$$\begin{aligned} RMSE= \sum _{i,day} \hat{\sigma }^2(t_i) [X(t_i,day) - {\hat{X}}(t_i,day)]^2 / \sum _{i,day} \hat{\sigma }^2(t_i) X(t_i,day)^2 \end{aligned}$$
(10.4)

We also give the mean relative squared error:

$$\begin{aligned} MRSE= \sum _{day} \frac{\sum _i\hat{\sigma }^2(t_i) [X(t_i,day) - {\hat{X}}(t_i,day)]^2}{ \sum _{i} \hat{\sigma }^2(t_i) X(t_i,day)^2}. \end{aligned}$$
(10.5)

We include the estimated variance, \(\hat{\sigma }^2(t_i)\), in the loss measure because it corresponds to a time integral in volatility time.

The estimators from Sect. 4 require \(\sigma ^2\) and this must be estimated. The simplest is the date independent estimate \(\hat{\sigma }^2_{const} = Mean[c(day)^2]\). The Garman–Klass estimate is \(\hat{\sigma }^2_{GK}(h,\ell ,c)= K_1(h-\ell )^2 -K_2[c(h+\ell )-2h\ell ] - K_3 c^2 \) where \(K_1=.511\), \(K_2=.019\) and \(K_3=.383\) (Garman and Klass 1980; Rogers and Satchell 1991). The maximum likelihood estimator, \(\hat{\sigma }^2_{ML}(h,\ell ,c)\), is based on (2.8). The Meillijson estimator, \(\hat{\sigma }^2_{M}(h,\ell ,c)\), is given in Meillijson (2011). We reject the Rogers–Satchell estimator because it gives \(\hat{\sigma }^2_{R}(h=0,\ell ,c)=0\) when the low happens at the close (\(\ell =c\)) and \(h=0\). This case does occur in financial data even though it never occurs in Brownian motion.
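A sketch of the two simplest plug-ins (ours; prices are normalized so that the open is zero, matching the conventions above):

```python
import numpy as np

def sigma2_const(closes):
    """Date independent estimate: mean squared daily close."""
    return np.mean(np.asarray(closes) ** 2)

def sigma2_gk(h, ell, c, K1=0.511, K2=0.019, K3=0.383):
    """Garman-Klass estimate from one day's (high, low, close)."""
    return K1 * (h - ell)**2 - K2 * (c * (h + ell) - 2.0 * h * ell) - K3 * c**2
```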

Table 1 compares the MSE of the estimates. Our first estimate is the Brownian bridge, \({\hat{X}}(t_i,day) = c_{day} t_i\), and the second row is the Brownian volatility bridge, \({\hat{X}}(t_i,day) = c_{day} \tau _i\). The remaining estimates are \(E[B(t|h,\ell ,c)]\) as given by Theorem 4.6 with various plug-in estimates of \(\sigma \). In Table 1, \(E[B(\tau |h,\ell ,c)]\) denotes using volatility time. We see the use of volatility time only slightly improves the fit. For these fits, we estimate the volatility every day separately, but use the volatility time calculated for the whole data set. The maximum likelihood estimate, \(\hat{\sigma }^2_{ML}(h,\ell ,c)\), does slightly better than \(\hat{\sigma }^2_{GK}(h,\ell ,c)\), with \(\hat{\sigma }^2_{M}(h,\ell ,c)\) coming in third. Using the information from \((h,\ell ,c)\) reduces the error to \(56\%\) of the error of the Brownian bridge.

Table 1 Performance of various estimators on SP500 data

11 Distribution given high and low

We evaluate the distribution \(p(x, t,h,\ell )\) by integrating over the closing value in \(p(x,t,h,\ell ,c)\) using (4.6). As before, the limits of integration, H and L, are to be set to h and \(\ell \) after differentiation.

Theorem 11.1

Let \(G_{HL}(x,t, h, \ell ) \equiv P(B(t)=x, \ell \le B(s) \le h\ \mathrm{for}\ s \in [0,1] )\). Then

$$\begin{aligned} G_{HL}(x,t, h,\ell )\equiv & {} \int _L^H G(x,t, h,\ell , c) dc = Q(x,t, h, \ell ) \sum _{k=-\infty }^{\infty }\left[ R_{1k} (x,t, h, \ell ;H,L)\right. \nonumber \\&\left. -R_{2k}(x,t, h, \ell ;H,L)\right] \end{aligned}$$
(11.1)

where \(Q(x,t, h, \ell )\) is defined in (4.1) and

$$\begin{aligned} R_{1k}&= \frac{1}{2}\left[ erf\left( \frac{H-x -2k \Delta }{\sqrt{2(1-t)}\sigma }\right) - erf\left( \frac{{L-x -2k \Delta } }{\sqrt{2(1-t)}\sigma }\right) \right] , \ \\ R_{2k}&=\frac{1}{2} \left[ erf\left( \frac{H+x -2h +2k \Delta }{\sqrt{2(1-t)}\sigma }\right) - erf\left( \frac{{L+x -2h +2k \Delta } }{\sqrt{2(1-t)}\sigma }\right) \right] \ \ . \end{aligned}$$

The density satisfies \(p(x;t,h,\ell ) = - \lim _{H\rightarrow h, L \rightarrow \ell }\partial _{\ell }\partial _h G_{HL}(x,t, h,\ell ;H,L)\).

Proof

(11.1) is \(Q(x,t, h, \ell )\int _L^H Q_R(x,t, h,\ell , c) dc\), integrated term by term. We again use the artificial limits L and H to indicate that the limits should not be differentiated in evaluating the density. \(\square \)

To get the density conditional on the high and low, we divide \(p(x;t,h,\ell )\) by \(p(h,\ell )\) as given by (2.9). The theoretical values of \(E[B(t)|h,\ell ]\) and \(Var[B(t)|h,\ell ]\) can be calculated by computing moments with respect to (11.1). Unfortunately, we have not found a tractable analytic form for the integrals and therefore we compute them numerically. We display the simulation results for \(E[B(t)|h,\ell ]\) for a small value of \(h=.304\), the median value of \(h=.816\) and a large value of \(h=1.572\) (Figs. 15, 16 and 17).

Fig. 15

\(E[B(t |high=0.816, various\ low)]\). If \(h> |\ell |\), the maximum typically occurs after the minimum; if \(h< |\ell |\), the maximum typically occurs first

Fig. 16

\(E[B(t |high=0.304, various\ low)]\). The smaller the low, the lower the curve

Fig. 17

\(E[B(t |high=1.572, various\ low)]\)

12 Summary

By calculating \(E[B(t)|\max , \min , \mathrm {close}]\), we are able to interpolate in time any dataset where only the open, high, low and close are given. In practice, we interpolate on the log scale using the logarithms of the open, high, low and close. For most applications, we are interested in relative price changes, so the log scale is appropriate. If one is truly interested in the actual price, our formulas need to be modified for log Brownian motion.

Our simulations have calculated the ensemble average of the mean square error in Brownian motion for a variety of different given statistics. The time dependence of the variance is displayed in Fig. 18. In Fig. 18,

$$\begin{aligned} V(t|h,\ell ,c)&\equiv \int Var[B(t|h,\ell ,c)] dp(h,\ell ,c),\nonumber \\ V(t|h,c)&\equiv \int Var[B(t|h,c)] dp(h,c) \ . \end{aligned}$$
(12.1)

We ensemble average the variance expressions over all paths. For a given value of the statistics, h or (h,c) or \((h,\ell ,c)\), the results of the previous sections should be used for a more accurate evaluation of the variance. The fifth curve in Fig. 18 is the case when the location of the maximum is specified in addition to (h,c); this is borrowed from Riedel (2020). The variance is symmetric in time when the final value, c, is specified. If just the high, or the high and the low, are specified, the variance is nonmonotonic in time.

Fig. 18

Time dependence of ensemble averaged variance given conditional variables

The time averaged variance in Fig. 18 is presented in Table 2. The values for Table 2 are from the simulation. We plan to compute these ensemble averages using the analytic results in Sects. 3 and 4. Table 2 shows that using the open, high, low and close reduces the variance to just \(14\%\) of the variance using only the initial value or only the final value. This shows that the use of only the open, high, low and close in chartist forecasting (Morris 2006) keeps most of the information about the time history of the process. Table 2 answers interesting questions such as whether it is better to know the maximum or the final value of the Brownian motion when predicting B(t) on [0, 1]. By a ratio of 0.96 to 1, it is slightly better to know the high than the closing value. Similarly, rows 5–8 of Table 2 show that it is better to know the close and the high than the high, or the low, or the close and the time of the high. The last two rows show that it is better to know the close, high and low than to know the close, high and time of the high. Finally, we see the expected variance when using all of the open, high, low and close is just \(42\%\) of the variance from using just the open and close.

Table 2 Expected time averaged variance by conditioning variables. The third column multiplies the variance by 6 to compare with knowing only the final value, c. Here argMax is the first location of the maximum of B(t); the rows containing argMax are taken from Riedel (2020).

Table 1 shows the performance of our estimator on the log of the SP500 price. For the financial data, we obtain an MSE reduction of \(54\%\) relative to the Brownian bridge. This is a significant improvement, but it falls short of the theoretical value of \(42\%\). The reason is clear: for real world data, we must estimate \(\sigma \), while our theorems and simulations take \(\sigma \) as given. Furthermore, the volatility varies from day to day in practice.

Our moment expressions for \(M_1(h,\ell ,c)\) and \(M_2(h,\ell ,c)\) in Theorem 4.6 and Sect. 13.4 are two dimensional sums of Gaussian terms. We are unable to collapse the two dimensional sum over j and k into a single infinite sum, as was possible in the \(m=0\) case of Corollary 4.7. The double expansion converges quickly in j and k for all but the set of Brownian paths where \(\Delta =h-\ell \) is very small. The Feller distribution of Sect. 6.3 shows that the measure of the small \(\Delta \) paths is very small.

13 Appendix: integral evaluations

13.1 Close and high

We now evaluate the integrals \(M_m\) in (3.15)–(3.17). Set \(r\equiv 2h-c\), \(z_2= h-\mu _2= -z_3 = h -(2h-c)t\) and \(z_4=h-\mu _4=ct-h\). As a check on the normalization,

$$\begin{aligned} M_0=\left[ - \sum _{i=2}^4 s_i \tau _i f_i(x=h)\right] + 2(2h-c)\int _{-\infty }^H (f_2 +f_3) dx = p(h,c) \ . \end{aligned}$$
(13.1)

For \(m=1\), (3.16) reduces to

$$\begin{aligned}&M_1\equiv \sum _{i=2}^4 s_i \psi _i \int _{-\infty }^{h-\mu _i}\left[ \tau _i -\frac{2(2h-c)}{\sigma ^2}(x+\mu _i)(1-\delta _{i,4})\right] \phi _{\sigma _t^2}(x)dx \end{aligned}$$
(13.2)
$$\begin{aligned}&=\sum _{i=2}^4 s_i \psi _i \left[ \frac{\tau _i}{2} -\frac{r \mu _i}{\sigma ^2} (1-\delta _{i,4})\right] \left[ 1 +erf\left( \frac{h-\mu _i}{\sqrt{2}\sigma _t}\right) \right] - 4 r t(1-t) \phi _{\sigma ^2}(r) \phi _{\sigma ^2_t}(h-rt) \nonumber \\ \end{aligned}$$
(13.3)
$$\begin{aligned}&=\phi _{\sigma ^2}(c)\left[ 1 +erf\left( \frac{ct-h}{\sqrt{2}\sigma _t}\right) \right] + \phi _{\sigma ^2}(r)\left[ \frac{2hr}{\sigma ^2} -1 +p_{h,r,t} erf\left( \frac{h-rt}{\sqrt{2}\sigma _t}\right) \right] \nonumber \\&- 4 r t(1-t) \phi _{\sigma ^2}(r) \phi _{\sigma ^2_t}(h-rt) \ . \end{aligned}$$
(13.4)

Here we define \(r= r(h,c)=2h-c\) and use \(\mu _2= rt\), \(g_2=r^2/2\), \(\mu _3=2h-rt\), \(g_3=r^2/2\). \(p_{h,r,t}= (\tau _3 -\tau _2)/2 +\frac{r(\mu _2-\mu _3)}{\sigma ^2} = (1-2t) +\frac{2r(rt-h)}{\sigma ^2}\). For the second moment, (3.17) reduces to

$$\begin{aligned}&M_2\equiv \sum _{i=2}^4 s_i \psi _i \int _{-\infty }^{h-\mu _i}\left[ 2(x+\mu _i) \tau _i -\frac{2(2h-c)}{\sigma ^2}(x+\mu _i)^2(1-\delta _{i,4})\right] \phi _{\sigma _t^2}(x)dx \end{aligned}$$
(13.5)
$$\begin{aligned}&= \sum _{i=2}^4 s_i \tau _i \psi _i \left[ \mu _i\left[ 1 +erf\left( \frac{h-\mu _i}{\sqrt{2}\sigma _t}\right) \right] -2\sigma ^2_t \phi _{\sigma ^2_t}(h-\mu _i)\right] \end{aligned}$$
(13.6)
$$\begin{aligned}&+ \frac{2(2h-c) \phi _{\sigma ^2}(r) }{\sigma ^2} \left[ \sum _{i=2}^3 \frac{(\mu _i^2+\sigma _t^2)}{2} \left[ 1 + erf\left( \frac{h-\mu _i}{\sqrt{2} \sigma _t}\right) \right] - 4h\sigma _t^2 \phi _{\sigma ^2_t}(h-\mu _2) \right] \end{aligned}$$
(13.7)
$$\begin{aligned}&= 2 \mu _4 \phi _{\sigma ^2}(c)\left[ 1 +erf\left( \frac{ct-h}{\sqrt{2}\sigma _t}\right) \right] \nonumber \\&\quad - \phi _{\sigma ^2}(r)\left[ (\tau _2 \mu _2 + \tau _3 \mu _3) + (\tau _2 \mu _2 - \tau _3 \mu _3) erf\left( \frac{h-rt}{\sqrt{2} \sigma _t}\right) \right] \end{aligned}$$
(13.8)
$$\begin{aligned}&+ 2r \phi _{\sigma ^2}(r) \left( \left[ t(1-t) +\frac{\mu _2^2+\mu _3^2}{2\sigma ^2} +\frac{\mu _2^2-\mu _3^2}{2\sigma ^2} erf\left( \frac{z_2}{\sqrt{2} \sigma _t}\right) \right] - 4 h t(1-t) \phi _{\sigma ^2_t}(z_2) \right) \ = \nonumber \\\end{aligned}$$
(13.9)
$$\begin{aligned}&2 (2h-ct)\phi _{\sigma ^2}(c) \left[ 1 +erf\left( \frac{ct-h}{\sqrt{2}\sigma _t}\right) \right] - \phi _{\sigma ^2}(r) \left[ q_3(h,c,t) + q_4(h,c,t) erf\left( \frac{h-rt}{\sqrt{2}\sigma _t}\right) \right] \nonumber \\ \end{aligned}$$
(13.10)
$$\begin{aligned}&+ 2r \phi _{\sigma ^2}(r) \left( \left[ t(1-t) + \frac{q_5(h,c,t)}{\sigma ^2}+\frac{2h(rt-h)}{\sigma ^2} erf\left( \frac{h-rt}{\sqrt{2}\sigma _t}\right) \right] \right. \nonumber \\&\left. \quad - 4 h t(1-t) \phi _{\sigma ^2_t}(h-rt) \right) \end{aligned}$$
(13.11)

where we define \(q_3(h,c,t)\equiv \mu _2 \tau _2 +\mu _3 \tau _3 =2(rt^2 +(1-t)(2h-rt) )\), \(q_4(h,c,t)\equiv \mu _2 \tau _2 - \mu _3 \tau _3 =2(rt-2h(1-t)) \), \(\mu _2^2+\mu _3^2=r^2t^2+ (2h-rt)^2\) and \(\mu _2^2-\mu _3^2= 4h(rt-h)\). Let \(q_5(h,c,t)\equiv (\mu _2^2+\mu _3^2)/2 = 2 h^2 + r^2t^2 - 2hrt = h^2 + (h-rt)^2\).
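These algebraic identities are easy to machine-check. The sympy sketch below takes \(\tau _i=\partial _h \mu _i\) (so \(\tau _2=2t\) and \(\tau _3=2(1-t)\)), an assumption consistent with the stated results rather than a definition reproduced from Sect. 3.

```python
# Symbolic check of the p_{h,r,t} and q_i identities of Sect. 13.1,
# taking tau_i = d(mu_i)/dh with mu_2 = r t, mu_3 = 2h - r t, r = 2h - c.
import sympy as sp

h, c, t, sigma = sp.symbols('h c t sigma', positive=True)
r = 2*h - c
mu2, mu3 = r*t, 2*h - r*t
tau2, tau3 = sp.diff(mu2, h), sp.diff(mu3, h)          # 2t and 2(1-t)

p = (tau3 - tau2)/2 + r*(mu2 - mu3)/sigma**2
assert sp.simplify(p - ((1 - 2*t) + 2*r*(r*t - h)/sigma**2)) == 0

assert sp.expand(mu2*tau2 + mu3*tau3 - 2*(r*t**2 + (1 - t)*(2*h - r*t))) == 0  # q3
assert sp.expand(mu2*tau2 - mu3*tau3 - 2*(r*t - 2*h*(1 - t))) == 0             # q4
assert sp.expand(mu2**2 - mu3**2 - 4*h*(r*t - h)) == 0
assert sp.expand((mu2**2 + mu3**2)/2 - (h**2 + (h - r*t)**2)) == 0             # q5
```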

For (13.8), we use

$$\begin{aligned}&\int _{-\infty }^{H-\mu } (x+\mu ) \phi _{\sigma ^2_t}(x)\,dx = \frac{\mu }{2} \left[ erf\left( \frac{H-\mu }{\sqrt{2} \sigma _t}\right) + 1 \right] - \sigma _t^2 \phi _{\sigma _t^2}(H-\mu ) \ , \end{aligned}$$
(13.12)
$$\begin{aligned}&\int _{-\infty }^{H-\mu }(x+\mu )^2 \phi _{\sigma ^2_t}(x)\,dx = \frac{\mu ^2+\sigma _t^2}{2} \left[ erf\left( \frac{H-\mu }{\sqrt{2} \sigma _t}\right) + 1 \right] - \sigma _t^2 (H+\mu ) \phi _{\sigma ^2_t}(H-\mu ).\nonumber \\ \end{aligned}$$
(13.13)

These formulas have been verified by numerically integrating the moments of \(\partial _h F(x,t,h,c)\) from \(-\infty \) to h. This completes the proof of Theorem 3.4. \(\square \)
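A minimal version of such a check for (13.12)–(13.13), taking the variance factors to be \(\sigma _t^2\) throughout (consistent with (13.6)), is the following sketch.

```python
# Quadrature check of (13.12)-(13.13); assumes the variance factors are
# sigma_t^2 throughout, consistent with (13.6). Test values are arbitrary.
import numpy as np
from math import erf, sqrt, pi
from scipy.integrate import quad

mu, H, st = 0.4, 1.1, 0.7

def phi(x, s2):                          # Gaussian density, variance s2
    return np.exp(-x * x / (2 * s2)) / sqrt(2 * pi * s2)

lhs1 = quad(lambda x: (x + mu) * phi(x, st**2), -np.inf, H - mu)[0]
rhs1 = mu / 2 * (erf((H - mu) / (sqrt(2) * st)) + 1) - st**2 * phi(H - mu, st**2)

lhs2 = quad(lambda x: (x + mu)**2 * phi(x, st**2), -np.inf, H - mu)[0]
rhs2 = (mu**2 + st**2) / 2 * (erf((H - mu) / (sqrt(2) * st)) + 1) \
       - st**2 * (H + mu) * phi(H - mu, st**2)

assert abs(lhs1 - rhs1) < 1e-7 and abs(lhs2 - rhs2) < 1e-7
```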

13.2 \(G_{mn}\) evaluation

We evaluate the integrals \(G_{mn}\) of (4.14). We define the scaled erf function \(E_{\sigma }(x)\equiv \tfrac{1}{2}\, erf(x/\sqrt{2}\sigma )\). For \(m=0\), \(G_{00}(\mu ,h,\ell ) = E_{\sigma }(h-\mu )- E_{\sigma }(\ell -\mu )\) and \(G_{01}(\mu ,h,\ell ) = \sigma ^2 \left[ \phi _{\sigma ^2}(\ell -\mu )- \phi _{\sigma ^2}(h-\mu )\right] \).

$$\begin{aligned}&G_{10}(\mu ,h,\ell ,\sigma ) = \mu \left[ E_{\sigma }(h-\mu )-E_{\sigma }(\ell -\mu ) \right] +\sigma ^2 \left[ \phi _{\sigma ^2}(\ell -\mu )- \phi _{\sigma ^2}(h-\mu )\right] , \nonumber \\\end{aligned}$$
(13.14)
$$\begin{aligned}&G_{20}(\mu ,h,\ell ,\sigma ) = (\sigma ^2+\mu ^2) \left[ E_{\sigma }(h-\mu )-E_{\sigma }(\ell -\mu ) \right] + \sigma ^2 \left[ (\ell +\mu )\phi _{\sigma ^2}(\ell -\mu )\right. \nonumber \\&\quad \left. - (h+\mu )\phi _{\sigma ^2}(h-\mu )\right] , \end{aligned}$$
(13.15)
$$\begin{aligned}&G_{11}(\mu ,h,\ell ,\sigma ) = \sigma ^2 \left[ E_{\sigma }(h-\mu )-E_{\sigma }(\ell -\mu ) \right] + \sigma ^2 \left[ \ell \phi _{\sigma ^2}(\ell -\mu )- h \phi _{\sigma ^2}(h-\mu )\right] .\nonumber \\ \end{aligned}$$
(13.16)
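The closed forms for the \(G_{mn}\) can likewise be checked against quadrature. The sketch below assumes \(G_{mn}(\mu ,h,\ell ) = \int _{\ell }^{h} x^m (x-\mu )^n \phi _{\sigma ^2}(x-\mu )\,dx\), the interpretation consistent with the identities above; (4.14) itself is not reproduced in this appendix.

```python
# Quadrature check of the G_mn closed forms, assuming
# G_mn(mu, h, l) = int_l^h x^m (x - mu)^n phi_{sigma^2}(x - mu) dx,
# the interpretation consistent with (13.14)-(13.16).
import numpy as np
from math import erf, sqrt, pi
from scipy.integrate import quad

mu, h, l, s = 0.3, 1.2, -0.8, 1.0        # arbitrary test values
s2 = s * s

def phi(x, s2):                          # Gaussian density, variance s2
    return np.exp(-x * x / (2 * s2)) / sqrt(2 * pi * s2)

def E(x):                                # scaled erf, E_sigma(x)
    return 0.5 * erf(x / (sqrt(2) * s))

def G_quad(m, n):
    return quad(lambda x: x**m * (x - mu)**n * phi(x - mu, s2), l, h)[0]

G00 = E(h - mu) - E(l - mu)
G01 = s2 * (phi(l - mu, s2) - phi(h - mu, s2))
G10 = mu * G00 + G01                                               # (13.14)
G20 = (s2 + mu**2) * G00 + s2 * ((l + mu) * phi(l - mu, s2)
                                 - (h + mu) * phi(h - mu, s2))     # (13.15)
G11 = s2 * G00 + s2 * (l * phi(l - mu, s2) - h * phi(h - mu, s2))  # (13.16)

for (m, n), val in [((0, 0), G00), ((0, 1), G01), ((1, 0), G10),
                    ((2, 0), G20), ((1, 1), G11)]:
    assert abs(val - G_quad(m, n)) < 1e-7
```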

13.3 Centering at the lower limit

To simplify the lower limit values at \(\ell \) in (4.15), we define the analog of \(\mu _{ijk}\) with the definitions centered at the lower limit. Let \(\nu _{1,j,k}= ct+{\tilde{v}}_{j,k}\Delta \), \({\tilde{g}}_1= (c-w_{j,k}\Delta )^2/2\sigma ^2\), \(\nu _2= (2\ell -c)t+v_{j,k}\Delta \), \({\tilde{g}}_2=(2\ell -c-{\tilde{w}}_{j,k}\Delta )^2/2\sigma ^2\), \(\nu _3=2\ell (1-t) +ct-v_{j,k}\Delta \) and \(\nu _4= 2\ell -ct-{\tilde{v}}_{j,k}\Delta \). Of course, \({\tilde{g}}_4={\tilde{g}}_1= g_1\) and \({\tilde{g}}_3={\tilde{g}}_2\). The analog of (4.7), centered at the lower limits, is

$$\begin{aligned} G(x,t, h,\ell , c)=\sum _{j,k>-\infty }^{\infty } \sum _{i=1}^4 \frac{s_{i}}{\sqrt{2\pi }\sigma }\phi _{\sigma _t^2}(x- \nu _{ijk}) e^{-{\tilde{g}}_{ijk}(h,\ell )} \ . \end{aligned}$$
(13.17)

We further define \(\tilde{\tau }_{ijk} \equiv \partial _{h} \nu _{ijk}\) and \(\hat{\tilde{\tau }}_{ijk} \equiv \partial _{\ell } \nu _{ijk}\). Thus \(\tilde{\tau }_{1jk} ={\tilde{v}}_{j,k}=-\hat{\tilde{\tau }}_{1jk}\), \(\tilde{\tau }_{2jk} =v_{j,k}\), \(\hat{\tilde{\tau }}_{2jk} =2t-v_{j,k}\), \(\tilde{\tau }_{3jk}=-v_{j,k}\), \(\hat{\tilde{\tau }}_{3jk} = 2(1-t)+v_{j,k}\), \(\tilde{\tau }_{4jk} =-{\tilde{v}}_{j,k}\) and \(\hat{\tilde{\tau }}_{4jk} =2+{\tilde{v}}_{j,k}\). Finally, we need \(A^{\ell }_{ijk} = \tilde{\tau }_{ijk} \hat{\tilde{\tau }}_{ijk}\), \(B^{\ell }_{ijk} = [\tilde{\tau }_{ijk}\partial _{\ell } {\tilde{g}}_{ijk}+\hat{\tilde{\tau }}_{ijk}\partial _{h} {\tilde{g}}_{ijk}]\) and \(C^{\ell }_{ijk} =\tilde{\Gamma }_{ijk}+\sigma _t^{-2} \tilde{\tau }_{ijk} \hat{\tilde{\tau }}_{ijk} \), where \(\tilde{\Gamma }_{ijk}\equiv -\partial _h {\tilde{g}}_{ijk}\partial _{\ell } {\tilde{g}}_{ijk} + \partial _{\ell }\partial _h {\tilde{g}}_{ijk}\). Note \(\tilde{\Gamma }_{1jk}=\Gamma _{1jk}\) and \(\tilde{\Gamma }_{2jk}= (2{\tilde{g}}_{2jk} -1) {\tilde{w}}_{jk}({\tilde{w}}_{jk} +2)/\sigma ^2\).

13.4 Further simplification of Theorem 4.6

We now simplify (4.15) by summing the Gaussian terms over i. The terms involving the error function do not simplify much and are left as in Theorem 4.6.

Corollary 13.1

Theorem 4.6 may be re-expressed as

$$\begin{aligned} M_m= \sum _{j,k} U_{jk}^{m,h} f_{1jk}(h-\mu _{1jk}) - U_{jk}^{m,\ell } f_{1jk}(\ell -\mu _{1jk}) + \sum _{ijk} s_{i} e_{ijk}^{(m)} \psi _{ijk} R(h,\ell ,\mu _{ijk})\nonumber \\ \end{aligned}$$
(13.18)

where \(R(h,\ell ,\mu _{ijk}) = E_{\sigma _t}(h-\mu _{ijk})-E_{\sigma _t}(\ell -\mu _{ijk})\). The coefficients \(e_{ijk}^{(m)}\) are defined below (4.15). The coefficients satisfy

$$\begin{aligned} U_{jk}^{1h} = \sum _i s_i A_{ijk} + 2(\Gamma _{2jk}-\Gamma _{1jk})\sigma ^2_t ={\bar{A}}_{jk} + 2(\Gamma _{2jk}-\Gamma _{1jk})\sigma ^2_t \end{aligned}$$
(13.19)

where \({\bar{A}}_{jk}=\sum _i s_i A_{ijk} = [32 jk +8(j-k)]\ t(1-t)\). For the lower limit,

$$\begin{aligned} U_{jk}^{1\ell } = \sum _i s_i A^{\ell }_{ijk} + 2(\tilde{\Gamma }_{2jk}-\Gamma _{1jk})\sigma ^2_t ={\bar{A}}^{\ell }_{jk} + 2(\tilde{\Gamma }_{2jk}-\Gamma _{1jk})\sigma ^2_t \end{aligned}$$
(13.20)

where \({\bar{A}}^{\ell }_{jk}=\sum _i s_i A^{\ell }_{ijk} = [32 jk-8(j-k)]\ t(1-t)\). For the second moment,

$$\begin{aligned} U_{jk}^{2h}= & {} 2h\sum _i s_i A_{ijk}- 2 \sigma ^2_t \sum _i s_i B_{ijk} + 4 h\sigma ^2_t(\Gamma _{2jk}-\Gamma _{1jk}) \nonumber \\= & {} 2 h {\bar{A}}_{jk}+ {\bar{B}}_{jk} + 4h \sigma ^2_t(\Gamma _{2jk}-\Gamma _{1jk}), \end{aligned}$$
(13.21)
$$\begin{aligned} U_{jk}^{2\ell }= & {} 2\ell \sum _i s_i A^{\ell }_{ijk}- 2 \sigma ^2_t\sum _i s_i B_{ijk}^{\ell } + 4 \ell \sigma ^2_t(\tilde{\Gamma }_{2jk}-\Gamma _{1jk}) \nonumber \\= & {} 2 \ell {\bar{A}}^{\ell }_{jk} +\bar{{\tilde{B}}}_{jk} + 4 \ell \sigma ^2_t(\tilde{\Gamma }_{2jk}-\Gamma _{1jk}) \end{aligned}$$
(13.22)

where \({\bar{B}}_{jk}= \sum _i s_i B_{ijk}= \frac{8}{\sigma ^2} [-4jk\Delta +cj -h(j-k)] \) and \(\bar{{\tilde{B}}}_{jk}= \sum _i s_i B^{\ell }_{ijk}=\frac{8}{\sigma ^2} [4jk\Delta -cj +\ell (j-k)] \).

Proof

We begin with

$$\begin{aligned}&{\bar{A}}_{jk}\equiv \sum _i s_i A_{ijk} = 2(v_{j,k}^2-{\tilde{v}}_{j,k}^2) + 2{\tilde{v}}_{j,k} +2(2t-1)v_{j,k}\nonumber \\&\quad = [32 jk +8(j-k)]t(1-t) \ \ , \end{aligned}$$
(13.23)
$$\begin{aligned}&{\bar{A}}^{\ell }_{jk} \equiv \sum _i s_i A^{\ell }_{ijk}= 2(v_{j,k}^2-{\tilde{v}}_{j,k}^2) - 2{\tilde{v}}_{j,k} -2(2t-1)v_{j,k}\nonumber \\&\quad =[32 jk -8(j-k)]t(1-t) \ \ . \end{aligned}$$
(13.24)
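These sums can be verified symbolically. Since the Sect. 4 definitions of \(v_{j,k}\) and \({\tilde{v}}_{j,k}\) are not reproduced in this appendix, the sketch below assumes \(v_{j,k}=2j(1-t)+2kt\) and \({\tilde{v}}_{j,k}=2j(1-t)-2kt\), values that are consistent with (13.23), (13.24) and (13.30).

```python
# Symbolic check of (13.23)-(13.24), assuming (as a working hypothesis
# consistent with (13.30)) v_jk = 2j(1-t) + 2kt, vtilde_jk = 2j(1-t) - 2kt.
import sympy as sp

j, k, t = sp.symbols('j k t')
v = 2*j*(1 - t) + 2*k*t
vt = 2*j*(1 - t) - 2*k*t

A_bar = 2*(v**2 - vt**2) + 2*vt + 2*(2*t - 1)*v        # (13.23)
A_bar_l = 2*(v**2 - vt**2) - 2*vt - 2*(2*t - 1)*v      # (13.24)

assert sp.expand(A_bar - (32*j*k + 8*(j - k))*t*(1 - t)) == 0
assert sp.expand(A_bar_l - (32*j*k - 8*(j - k))*t*(1 - t)) == 0
```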

To evaluate \(\sum _i s_i B_{ijk}\), we first sum \(B_{ijk}\) in pairs:

$$\begin{aligned} B_{1jk}+B_{4jk}&= (\tau _{1jk}+\tau _{4jk})\partial _{\ell } g_{1jk}+(\hat{\tau }_{1jk}+\hat{\tau }_{4jk})\partial _{h} g_{1jk} = 2 \partial _{\ell } g_1 \ \ , \nonumber \\ B_{2jk}+B_{3jk}&= (\tau _{2jk}+\tau _{3jk})\partial _{\ell } g_{2jk}+(\hat{\tau }_{2jk}+\hat{\tau }_{3jk})\partial _{h} g_{2jk} = 2 \partial _{\ell } g_2 \ . \end{aligned}$$
(13.25)

Thus

$$\begin{aligned} {\bar{B}}_{jk}&\equiv \sum _i s_i B_{ijk} =2[\partial _{\ell } g_1 -\partial _{\ell } g_2] = \frac{2}{\sigma ^2}\left[ (c-w_{jk} \Delta ) w_{jk} - (2h-c - {\tilde{w}}_{jk} \Delta ){\tilde{w}}_{jk}\right] \end{aligned}$$
(13.26)
$$\begin{aligned}&=\frac{2}{\sigma ^2}\left[ ({\tilde{w}}_{jk}^2-w_{jk}^2)\Delta + c w_{jk} -(2h-c) {\tilde{w}}_{jk}\right] = \frac{-32jk \Delta -8(h-c)j+ 8hk}{\sigma ^2}. \end{aligned}$$
(13.27)

Similarly,

$$\begin{aligned} \bar{{\tilde{B}}}_{jk}\equiv \sum _i s_i B^{\ell }_{ijk}=2\partial _{h} ({\tilde{g}}_1 -{\tilde{g}}_2) = \frac{32jk \Delta +8(\ell -c)j- 8\ell k}{\sigma ^2}. \end{aligned}$$
(13.28)

\(\square \)
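The \({\bar{B}}_{jk}\) and \(\bar{{\tilde{B}}}_{jk}\) algebra can be checked in the same way; the sketch below assumes \(w_{jk}=2(j+k)\) and \({\tilde{w}}_{jk}=2(j-k)\), values which reproduce the closed forms stated in Corollary 13.1.

```python
# Symbolic check of (13.26)-(13.28), assuming w_jk = 2(j+k) and
# wtilde_jk = 2(j-k) as a working hypothesis; these values reproduce
# the closed forms stated in Corollary 13.1.
import sympy as sp

j, k, h, l, c, sigma = sp.symbols('j k h l c sigma')
Delta = h - l
w, wt = 2*(j + k), 2*(j - k)

B_bar = 2*((wt**2 - w**2)*Delta + c*w - (2*h - c)*wt) / sigma**2   # (13.26)
assert sp.expand(B_bar - (-32*j*k*Delta - 8*(h - c)*j + 8*h*k)/sigma**2) == 0
# the form stated in Corollary 13.1:
assert sp.expand(B_bar - 8*(-4*j*k*Delta + c*j - h*(j - k))/sigma**2) == 0
# (13.28) versus the Corollary 13.1 form:
Bt_bar = (32*j*k*Delta + 8*(l - c)*j - 8*l*k)/sigma**2
assert sp.expand(Bt_bar - 8*(4*j*k*Delta - c*j + l*(j - k))/sigma**2) == 0
```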

It is possible to make small additional simplifications of (13.18), but the resulting moment computations are not materially simpler; the computation remains a two dimensional infinite sum.

13.5 High, low, close boundary terms

We begin by showing the boundary terms in (4.12) vanish: \({\mathcal {B}}_{ijk}(x=H) = - h^m [(A_{ijk}/\sigma _t^2) \partial _{x}f_{ijk}(h)\) \(+ B_{ijk}f_{ijk}(h)] \). To show these boundary terms vanish, we note \(f_{ijk}(h) = f_{1jk}(h)\) and \(\partial _{x}f_{ijk}(h) = (\mu _{ijk}-h)f_{1jk}(h)/\sigma ^2_t \). Thus \(\partial _{x}f_{4jk}(h) =-\partial _{x}f_{2jk}(h)\). Also note that \(0 =\sum _i \partial _{h}f_{ijk}(x=h) =\sum _i [(\mu _{ijk}-h)/\sigma ^2_t +\partial _h g_{ijk}] \). Since \(\tau _{4jk}\hat{\tau }_{4jk}=\tau _{1jk}\hat{\tau }_{1jk}+2 {\tilde{v}}_{jk}\) and \(\tau _{3jk}\hat{\tau }_{3jk}=\tau _{2jk}\hat{\tau }_{2jk}+2 v_{jk}\), we have

$$\begin{aligned} \sum _{i=1}^4 s_i A_{ijk} \partial _x f_{ijk}(h) = \frac{f_{1jk}(h)}{\sigma ^2_t} \sum _{i=1}^4 s_i A_{ijk} (\mu _{ijk}-h). \end{aligned}$$
(13.29)

Simplifying

$$\begin{aligned} \frac{1}{2}\sum _{i=1}^4 s_i A_{ijk} (\mu _{ijk}-h)&= \left[ {\tilde{v}}_{jk}(\mu _{4jk}-h)-v_{jk}(\mu _{3jk}-h)\right] \nonumber \\&=\,\left[ {\tilde{v}}_{jk}(h-ct-{\tilde{v}}_{j,k}\Delta )-v_{jk}(h+ct-2ht-v_{jk}\Delta )\right] \nonumber \\&=\,\left[ (v_{jk}^2-{\tilde{v}}_{jk}^2)\Delta - ct (v+{\tilde{v}}_{jk}) +h({\tilde{v}}_{jk}-v_{jk}) + 2ht v_{jk}\right] \nonumber \\&=\,\left[ 16jkt(1-t)\Delta - 4cjt(1-t)- 4hkt +2htv_{jk}\right] \nonumber \\&=\,\left[ 16jk\Delta - 4cj +4h(j-k) \right] t(1-t). \end{aligned}$$
(13.30)

This precisely cancels with \({\bar{B}}_{jk}=\sum _i s_i B_{ijk}\) from (13.26). For the lower boundary, we regroup the sum using \(j\rightarrow {\hat{j}}=j+1\), \(k\rightarrow {\hat{k}}=k-1\). This corresponds to centering the generator relative to \(\ell \) instead of h. \(\square \)
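Under the same working assumptions as in Sect. 13.4 (together with \(\sigma _t^2=\sigma ^2 t(1-t)\)), the simplification (13.30) and its cancellation against \({\bar{B}}_{jk}\) can be verified symbolically:

```python
# Symbolic check of (13.30) and its cancellation against B_bar from
# (13.26), under the working assumptions v_jk = 2j(1-t) + 2kt,
# vtilde_jk = 2j(1-t) - 2kt, mu_4 - h = h - ct - vtilde*Delta,
# mu_3 - h = h(1-2t) + ct - v*Delta, and sigma_t^2 = sigma^2 t(1-t).
import sympy as sp

j, k, h, l, c, t, sigma = sp.symbols('j k h l c t sigma')
Delta = h - l
v = 2*j*(1 - t) + 2*k*t
vt = 2*j*(1 - t) - 2*k*t

half_sum = vt*(h - c*t - vt*Delta) - v*(h*(1 - 2*t) + c*t - v*Delta)
target = (16*j*k*Delta - 4*c*j + 4*h*(j - k))*t*(1 - t)       # (13.30)
assert sp.expand(half_sum - target) == 0

B_bar = 8*(-4*j*k*Delta + c*j - h*(j - k))/sigma**2           # (13.26)
sigma_t2 = sigma**2 * t*(1 - t)
assert sp.simplify(2*half_sum/sigma_t2 + B_bar) == 0          # cancellation
```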

13.6 Proof of Corollary 4.7

Proof

For \(M_0\), the coefficients in (4.15) satisfy \(a_{ijk}^{(0)}= {\hat{a}}_{ijk}^{(0)}= 0\) and \(e_{ijk}^{(0)}= \Gamma _{ijk}\). Only the last term in (4.15) is nonzero and the sum reduces to

$$\begin{aligned} M_0= & {} \sum _{ijk}\frac{s_{i}}{\sqrt{2\pi }\sigma } \Gamma _{ijk}e^{-g_{ijk}}[E_{\sigma _t}(h-\mu _{ijk})- E_{\sigma _t}(\ell -\mu _{ijk})] \\= & {} \sum _{jk}\sum _{i=1}^2 \frac{s_{i}}{\sqrt{2\pi }\sigma } \Gamma _{ijk}e^{-g_{ijk}} [E_{\sigma _t}(\ell -\mu _{ijk}+2\Delta )- E_{\sigma _t}(\ell -\mu _{ijk})] \ . \end{aligned}$$

Here \(E_{\sigma _t}\) is the scaled erf function of Sect. 13.2, \(E_{\sigma _t}(x)\equiv \tfrac{1}{2}\, erf(x/\sqrt{2}\sigma _t)\). To simplify the first sum, we used \(h-\mu _{4jk}= \mu _{1jk}-h\), \(h-\mu _{3jk}= \mu _{2jk}-h\), \(E_{\sigma _t}(\ell -\mu _{4jk}) = -E_{\sigma _t}(\ell -\mu _{1jk} + 2 \Delta )\) and \(E_{\sigma _t}(\ell -\mu _{3jk}) = -E_{\sigma _t}(\ell -\mu _{2jk} + 2 \Delta )\). The \(\Gamma _{ijk}\) satisfy \(\Gamma _{1jk}=(2g_{1jk}-1)w_{jk}^2/\sigma ^2=w_{jk}^2[(c-w_{j,k}\Delta )^2-\sigma ^2]/\sigma ^4\) and \( \Gamma _{2jk} =(2g_{2jk}-1){\tilde{w}}_{jk}({\tilde{w}}_{jk}-2)/\sigma ^2={\tilde{w}}_{jk}{\tilde{w}}_{j,k+1}[(2h-c-{\tilde{w}}_{j,k}\Delta )^2-\sigma ^2]/\sigma ^4\).

To sum these terms, we reparametrize \(k(j,{\hat{k}})\). For \(i=1,4\), we set \(k={\hat{k}}-j\), \({\hat{k}}=k+j\). For \(i=2,3\), we set \(k=j-{\hat{k}}\), \({\hat{k}}=k-j\). With these transformations, \(v_{j,{\hat{k}}}= j-{\hat{k}}t\), \({\tilde{v}}_{j,{\hat{k}}}= j-{\hat{k}}t\), \(w_{j{\hat{k}}}= 2 {\hat{k}}\), \({\tilde{w}}_{j{\hat{k}}}= 2 {\hat{k}}\), \(\mu _{1,j{\hat{k}}} =ct+{\tilde{v}}_{j,{\hat{k}}}\Delta \), \(g_1= (c-w_{j,{\hat{k}}}\Delta )^2/2\sigma ^2\), \(\mu _2= (2h-c)t+v_{j,{\hat{k}}}\Delta \) and \(g_2=(2h-c-{\tilde{w}}_{j,{\hat{k}}}\Delta )^2/2\sigma ^2\). The \(g_i\) depend only on \({\hat{k}}\) and not on j; hence so do the \(\Gamma _{ij{\hat{k}}}\). The double sum therefore splits into a single sum

$$\begin{aligned} \sum _{i{\hat{k}}}\Gamma _{i{\hat{k}}}e^{-g_{ij{\hat{k}}}}\sum _j [E_{\sigma _t}(\ell -\mu _{ij{\hat{k}}}+2\Delta )- E_{\sigma _t}(\ell -\mu _{ij{\hat{k}}})]=\sum _{{\hat{k}}}\Gamma _{1{\hat{k}}}e^{-g_{1{\hat{k}}}} - \Gamma _{2{\hat{k}}}e^{-g_{2{\hat{k}}}}\nonumber \\ \end{aligned}$$
(13.31)

where we have dropped the j dependence of g and \(\Gamma \). We use the fact that, for a given \({\hat{k}}\), the integrals for \((1,j,{\hat{k}})\) and \((4,j,{\hat{k}})\), summed over j, cover the region from \(-\infty \) to \(\infty \). This allows us to collapse the sum over \(i \in (1,4)\) and j. Similarly, the sums over \((2,j,{\hat{k}})\) and \((3,j,{\hat{k}})\) collapse. The expression in (13.31) corresponds precisely to \(M_0(t,h,\ell ,c)= p(h,\ell ,c)\) as given by (2.8). \(\square \)

We would very much like to have expressions for the first and second moments that reduce the double sum to a single sum. This does not appear possible because the \(a_{ijk}^{(m)}\) and \({\hat{a}}_{ijk}^{(m)}\) do not vanish for \(m \ge 1\).