1 Introduction

The analysis of first-passage problems for diffusion processes, and especially for Brownian motion, is a rich field in applied probability with important applications in areas such as mathematical finance, statistics, physics, engineering, and biology. For instance, in mathematical finance, a default event can be modelled as the first time a stochastic process representing firm value crosses a certain, possibly time-varying, barrier; similarly, a barrier option can be exercised if the underlying value process reaches a predefined boundary. In biology, extinction of a population can be described by the event that the number of individuals first passes a threshold value (for more applications in biology, we refer the reader to Ricciardi et al. 1999). Another area of application arises in statistics, in particular in sequential analysis and change-point problems.

Let \((W(t))_{t\ge 0}\) denote a standard Brownian motion, i.e. a Gaussian process with continuous paths, \(W(0)=0\), \({{{\mathbb {E}}}}(W(t))=0\) and \({{{\mathbb {E}}}}(W(s)W(t))=s\wedge t\). Let \(b:[0,\infty )\rightarrow {\mathbb {R}}\) denote a function with \(b(0)\ge 0\), the upper boundary. Define the first-exit time

$$\begin{aligned} \tau =\inf \{t>0\mid W(t)\ge b(t)\}. \end{aligned}$$
(1)

The random variable \(\tau \) is a stopping time. Denote its distribution by F. Under regularity conditions, F is absolutely continuous with a density f that is continuous and strictly positive on \((0,\infty )\).

The direct first-passage-time problem identifies F for given b; the inverse problem computes b for given F or f. Finding the boundary function b is of importance in many fields, and the application areas are similar to those of the direct problem, e.g. credit risk modelling in mathematical finance, or neural activity in biology (for more details see Abundo 2015). An extensive literature exists for both problems. Since the distribution F of \(\tau \) can be computed in closed form for only a few boundaries, numerical procedures, approximate solutions of corresponding partial differential equations, or Monte Carlo methods are necessary. For details we refer to Durbin (1971), Lerche (1986), Salminen (1988), Novikov et al. (1999) or Pötzelberger and Wang (2001).
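As a concrete illustration of why Monte Carlo methods are used for the direct problem, the following sketch (all names are illustrative, not from the paper) estimates the crossing probability of a constant boundary by simulating Brownian paths on an Euler grid and compares the result with the closed-form value \(P(\tau \le T)=2{\bar{\varPhi }}(b/\sqrt{T})\):

```python
import math
import random

def phi_bar(x):
    """Survival function of the standard normal distribution."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def mc_crossing_prob(b, T=1.0, n_steps=500, n_paths=5000, seed=0):
    """Crude Monte Carlo estimate of P(tau <= T) for a constant boundary b,
    monitoring the Brownian path only at the grid points.  Crossings between
    grid points are missed, so the estimate is biased slightly downwards."""
    rng = random.Random(seed)
    dt = T / n_steps
    sd = math.sqrt(dt)
    crossed = 0
    for _ in range(n_paths):
        w = 0.0
        for _ in range(n_steps):
            w += rng.gauss(0.0, sd)
            if w >= b:
                crossed += 1
                break
    return crossed / n_paths

# Closed form for a constant boundary: P(tau <= T) = 2 * phi_bar(b / sqrt(T)).
exact = 2.0 * phi_bar(1.0)      # b = 1, T = 1
estimate = mc_crossing_prob(1.0)
```

The downward bias of discrete monitoring is well known; a standard remedy is a continuity correction that shifts the barrier by a multiple of \(\sqrt{h}\).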

Inverse problems are often based on a Volterra integral equation, the so-called master equation, of which \((b, F)\) is a solution. Let \(z\ge b(t)\). Then

$$\begin{aligned} {\bar{\varPhi }}\left( \frac{z}{\sqrt{t}}\right) =\int _0^t {\bar{\varPhi }}\left( \frac{z-b(u)}{\sqrt{t-u}}\right) \, dF(u), \end{aligned}$$
(2)

with \({\bar{\varPhi }}=1-\varPhi \) the survival function of the standard normal distribution. Setting \(z=b(t)\) in (2), and differentiating (2) with respect to z at \(z=b(t)\), give

$$\begin{aligned} {\bar{\varPhi }}\left( \frac{b(t)}{\sqrt{t}}\right)= & {} \int _0^t {\bar{\varPhi }}\left( \frac{b(t)-b(u)}{\sqrt{t-u}}\right) \, dF(u), \end{aligned}$$
(3)
$$\begin{aligned} \phi \left( \frac{b(t)}{\sqrt{t}}\right) \frac{1}{\sqrt{t}}= & {} \int _0^t \phi \left( \frac{b(t)-b(u)}{\sqrt{t-u}}\right) \frac{1}{\sqrt{t-u}}\, dF(u). \end{aligned}$$
(4)

See Peskir (2002), Zucca and Sacerdote (2009) or Abundo (2015) for a thorough discussion of the inverse problem.

If f is given, an approximation of b(t) on the grid \(\{ t_i\mid t_i=h i,\ i=1,\ldots ,n\}\) is the solution of the system of equations

$$\begin{aligned} {\bar{\varPhi }}\left( \frac{b(t_i)}{\sqrt{t_i}}\right) =\sum _{j=1}^i {\bar{\varPhi }}\left( \frac{b(t_i)-b(t_j)}{\sqrt{t_i-t_j}}\right) \,f(t_j)\, h, \quad (i=1,\ldots , n). \end{aligned}$$
(5)
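The system (5) can be solved recursively: each \(b(t_i)\) is the root of a scalar equation in which \(b(t_1),\ldots ,b(t_{i-1})\) are already known. A minimal sketch (with illustrative names), using the known first-passage density of the constant boundary \(b\equiv 1\) as a test case; the diagonal term \(j=i\) is taken with weight 1/2, since the integrand tends to \({\bar{\varPhi }}(0)=1/2\) as \(u\rightarrow t\):

```python
import math

def phi_bar(x):
    """Survival function of the standard normal distribution."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def levy_density(t, b=1.0):
    """First-passage density of standard Brownian motion over the constant
    boundary b (a known closed-form test case)."""
    return b / math.sqrt(2.0 * math.pi * t ** 3) * math.exp(-b * b / (2.0 * t))

def solve_inverse(f, h=0.01, n=100):
    """Recursive solution of the Euler scheme (5): b(t_i) is found by
    bisection, given b(t_1), ..., b(t_{i-1})."""
    t = [h * i for i in range(1, n + 1)]
    bvals = []
    for i in range(n):
        def residual(bi):
            rhs = 0.5 * f(t[i]) * h          # diagonal term with weight 1/2
            for j in range(i):
                rhs += phi_bar((bi - bvals[j]) / math.sqrt(t[i] - t[j])) * f(t[j]) * h
            return phi_bar(bi / math.sqrt(t[i])) - rhs
        lo, hi = -10.0, 20.0                 # residual > 0 at lo, < 0 at hi
        for _ in range(80):
            mid = 0.5 * (lo + hi)
            if residual(mid) > 0.0:
                lo = mid
            else:
                hi = mid
        bvals.append(0.5 * (lo + hi))
    return t, bvals

t_grid, b_grid = solve_inverse(levy_density)   # should recover b(t) close to 1
```

For this smooth test case the recovered boundary is close to the true constant boundary away from \(t=0\); near zero the density is nearly flat at 0 and the reconstruction is less accurate.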

In this paper we analyze the statistical inverse first-passage-time problem: Given a sample \(\tau _1,\ldots , \tau _n\) of independent first-exit times, we approximate the unknown boundary b(t) by an estimator \({\hat{b}}_n(t)\). We propose the empirical estimator, which is the solution of (2), when F is replaced by the empirical distribution \({\hat{F}}_n\).

This paper is organized as follows. In Sect. 2 we prove that the empirical estimator is strongly consistent with rate \(o((\log n + \eta \log \log n)^{1/2}n^{-1/2})\) for every \(\eta >1/2\), uniformly on \(t\in [0,T]\), for all \(T>0\).

We compare the performance of the empirical estimator to an approximate conditional likelihood method, namely a Bayes estimator. The approximate conditional likelihood is the density of the first-exit time \(\tau \) when the boundary b is approximated by a piecewise linear boundary \(b_m\), i.e. a boundary that is linear on the intervals \([t_{i-1}, t_i]\), with \(0=t_0<t_1 <\cdots <t_m=T\) a partition of [0, T]. For \(b_m\), the density of \(\tau \) given \(W(t_1), \ldots , W(t_m)\) can be computed in closed form. In Sect. 3 we compute this approximate conditional density. Section 4 concludes with the results of Monte Carlo experiments for the empirical estimator and a Bayes estimator derived from the approximate likelihood.

2 Empirical estimator

Let \(\tau _1,\ldots ,\tau _n\) be an i.i.d. sample of first-exit times corresponding to the boundary b. Note that \(\tau _i=\infty \) if the Brownian motion (W(t)) never crosses the boundary b. We denote the empirical distribution of \(\tau _i\), \(i\in \{1,\ldots ,n\}\) by \({\hat{F}}_n\). The empirical estimator \({\tilde{b}}_n(t)\) of the boundary b(t) is the solution of

$$\begin{aligned} {\bar{\varPhi }}\left( \frac{{\tilde{b}}_n(t)}{\sqrt{t}}\right) =\int _0^t {\bar{\varPhi }}\left( \frac{{\tilde{b}}_n(t)-{\tilde{b}}_n(u)}{\sqrt{t-u}}\right) \,d{\hat{F}}_n(u). \end{aligned}$$
(6)

The empirical estimator is consistent. Equation (6) has a solution for all t; however, it is convenient to solve a corresponding system of equations at the sample points. Let \(\tau _{(1)}\le \tau _{(2)}\le \cdots \le \tau _{(n)}\) denote the order statistics of the sample. Note that for finite order statistics \(\tau _{(i)}\), we have \(\tau _{(i)}<\tau _{(i+1)}\) a.s.

We define the estimator \({\hat{b}}_n\) at \(\tau _{(i)}\), i.e. \({\hat{b}}_n(\tau _{(1)}),\ldots , {\hat{b}}_n(\tau _{(n)})\), as the solution of

$$\begin{aligned} {\bar{\varPhi }}\left( \frac{{\hat{b}}_n(\tau _{(1)})}{\sqrt{\tau _{(1)}}}\right)= & {} \frac{1}{2n},\\ {\bar{\varPhi }}\left( \frac{{\hat{b}}_n(\tau _{(k)})}{\sqrt{\tau _{(k)}}}\right)= & {} \sum _{i=1}^{k-1} {\bar{\varPhi }}\left( \frac{{\hat{b}}_n(\tau _{(k)}) -{\hat{b}}_n(\tau _{(i)})}{\sqrt{\tau _{(k)} -\tau _{(i)}}}\right) \frac{1}{n}+\frac{1}{2n}, \quad (k=2,\ldots ,n). \end{aligned}$$

In case not all \(\tau _i\)’s are finite, we define \({\hat{b}}_n(\infty )=\infty \). For \(t\not \in \{\tau _1,\ldots ,\tau _n\}\) we interpolate \({\hat{b}}_n(t)\) linearly.
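The defining system is triangular in the order statistics, so the values \({\hat{b}}_n(\tau _{(k)})\) can be computed by successive one-dimensional root searches. A sketch under illustrative names, tested on the constant boundary \(b\equiv 1\), for which \(\tau \) has the same law as \(1/Z^2\) with Z standard normal, so the first-exit times can be sampled exactly; the diagonal term is taken with weight \({\bar{\varPhi }}(0)/n=1/(2n)\), as in the first equation of the system:

```python
import math
import random

def phi_bar(x):
    """Survival function of the standard normal distribution."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def empirical_estimator(sample):
    """Solve the triangular system for hat{b}_n at the order statistics by
    bisection, one equation at a time."""
    taus = sorted(sample)
    n = len(taus)
    bhat = []
    for k in range(n):
        def residual(bk):
            rhs = 0.5 / n                    # diagonal term, weight 1/(2n)
            for i in range(k):
                rhs += phi_bar((bk - bhat[i]) / math.sqrt(taus[k] - taus[i])) / n
            return phi_bar(bk / math.sqrt(taus[k])) - rhs
        lo, hi = -20.0, 50.0
        for _ in range(80):
            mid = 0.5 * (lo + hi)
            if residual(mid) > 0.0:
                lo = mid
            else:
                hi = mid
        bhat.append(0.5 * (lo + hi))
    return taus, bhat

# Constant boundary b = 1: tau has the law of 1/Z^2, Z ~ N(0, 1).
rng = random.Random(1)
sample = [1.0 / rng.gauss(0.0, 1.0) ** 2 for _ in range(200)]
taus, bhat = empirical_estimator(sample)
middle = sorted(bhat[len(bhat) // 3: 2 * len(bhat) // 3])
med = middle[len(middle) // 2]   # median of the estimates over the middle third
```

For large \(\tau _{(k)}\) both sides of the defining equation approach 1/2, so the estimates at the largest order statistics are noisy; the middle of the sample is the informative range.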

Theorem 1

Let b be continuously differentiable with \(b(0)>0\). The empirical estimator is strongly consistent:

For \(\eta >1/2\), let \(\epsilon _n= (\log n + \eta \log \log n)^{1/2}n^{-1/2}\). Then for all \(T>0\),

$$\begin{aligned} P\left( \lim _{n\rightarrow \infty }\sup _{ \theta \le T} |{\hat{b}}_n(\theta )-b(\theta )|/\epsilon _n=0\right) =1. \end{aligned}$$
(7)

Proof

The empirical estimator can be considered as a discretization scheme of the master equation with the order statistics \(\tau _{(i)}\) as random knots.

Zucca and Sacerdote (2009) proved the consistency of the Euler scheme for the (deterministic) master equation. We follow their argument, indicating the necessary modifications. Define, for fixed \(T>0\) and knots \(0=t_0<t_1<\cdots <t_n=T\), the solution of the Euler scheme (5) by \(b^*(t_k)\) and the local consistency error by

$$\begin{aligned} \delta (h,t_k)= & {} \int _0^{t_k}{\bar{\varPhi }}\left( \frac{b(t_k)-b(u)}{\sqrt{t_k-u}}\right) dF(u) \\&-\sum _{1\le j<k}{\bar{\varPhi }}\left( \frac{b(t_k)-b(t_j)}{\sqrt{t_k-t_j}}\right) (F(t_j)-F(t_{j-1})), \end{aligned}$$

with \(h=\max _{1\le j <k} (t_j-t_{j-1})\). The proof of Theorem 6.2 in Zucca and Sacerdote (2009) shows that there is a constant \({\tilde{c}}>0\) (depending on b and T only), such that for all i,

$$\begin{aligned} |b(t_i)-b^*(t_i)|\le {\tilde{c}}\delta (h, t_i). \end{aligned}$$

Define for \(\theta >0\),

$$\begin{aligned} Z_j(\theta )={\bar{\varPhi }}\left( \frac{b(\theta )-b(\tau _{(j)})}{\sqrt{\theta -\tau _{(j)}}}\right) I_{[0,\theta ]}(\tau _{(j)}) \end{aligned}$$

and

$$\begin{aligned} \delta _n(\theta )= & {} \int _0^{\theta }{\bar{\varPhi }}\left( \frac{b(\theta )-b(u)}{\sqrt{\theta -u}}\right) dF(u) -\sum _{\tau _{(j)}\le \theta }{\bar{\varPhi }} \left( \frac{b(\theta )-b(\tau _{(j)})}{\sqrt{\theta -\tau _{(j)}}}\right) \frac{1}{n}\\= & {} \frac{1}{n}\sum _{j\le n} ({{{\mathbb {E}}}}(Z_j(\theta ))-Z_j(\theta )). \end{aligned}$$

Since \(|Z_j(\theta )|\le 1\) for all j, Hoeffding’s inequality gives for all \(\epsilon >0\),

$$\begin{aligned} P\left( |\delta _n(\theta )|>\epsilon \right) \le 2e^{-2n\epsilon ^2}. \end{aligned}$$
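Hoeffding's inequality applies here because the \(Z_j(\theta )\) are bounded by 1. A quick simulation (illustrative only, with uniform variables in place of the \(Z_j\)) confirms that the empirical exceedance frequency of the sample mean stays below the bound \(2e^{-2n\epsilon ^2}\):

```python
import math
import random

# Empirical check of Hoeffding's inequality for i.i.d. variables bounded in
# [0, 1]: P(|mean - E(mean)| > eps) <= 2 * exp(-2 * n * eps**2).
rng = random.Random(0)
n, eps, reps = 200, 0.1, 2000
bound = 2.0 * math.exp(-2.0 * n * eps ** 2)
exceed = 0
for _ in range(reps):
    mean = sum(rng.random() for _ in range(n)) / n
    if abs(mean - 0.5) > eps:
        exceed += 1
freq = exceed / reps   # the true exceedance probability here is roughly 1e-6
```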

For \(\epsilon _n= (\log n + \eta \log \log n)^{1/2}n^{-1/2}\) we get for \(\eta '=2\eta -1>0\) and fixed \(\theta \),

$$\begin{aligned} P(|\delta _n(\theta )|>\epsilon _n) \le \frac{2}{n^2(\log n)^{1+\eta '}}. \end{aligned}$$

Then

$$\begin{aligned} P(\max _{1\le k\le n}|\delta _n(\tau _{(k)})|>\epsilon _n ) \le \frac{2}{n(\log n)^{1+\eta '}}. \end{aligned}$$
(8)

Since

$$\begin{aligned} \sum _{n=2}^{\infty }\frac{2}{n(\log n)^{1+\eta '}} < \infty , \end{aligned}$$

the Borel–Cantelli lemma implies (7). \(\square \)

Remark

In case of censoring at T, the empirical estimator is still consistent if a consistent estimator of F(T) is available. If this estimator is even strongly consistent with rate \(\epsilon '_n\), then the empirical estimator is strongly consistent with rate \(\epsilon '_n\vee \epsilon _n\), with \(\epsilon _n\) defined in Theorem 1.

Let us briefly comment on the asymptotic distribution of the residuals \(({\hat{\epsilon }}_n(t))\) defined as

$$\begin{aligned} {\hat{\epsilon }}_n(t)=\sqrt{n}( {\hat{b}}_n(t)-b(t)). \end{aligned}$$
(9)

Denote by \({\hat{U}}_n(t)\) the empirical process

$$\begin{aligned} {\hat{U}}_n(t)=\sqrt{n}({\hat{F}}_n(t)-F(t)). \end{aligned}$$

\(({\hat{U}}_n(t))\) converges in distribution to \((U^F(t))\), with \((U^F(t))\) a Brownian bridge, i.e. a Gaussian process with continuous paths, \({{{\mathbb {E}}}}(U^F(t))=0\) and \({{\mathbb {C}}ov}(U^F(s),U^F(t))=F(s \wedge t)-F(s)F(t)\). There are processes \(({\hat{U}}^*_n)\) and a Brownian bridge \((U^{F *})\) with the same distributions as \(({\hat{U}}_n)\) and \((U^F)\), such that with probability 1, \(\Vert {\hat{U}}^*_n-U^{F*}\Vert \rightarrow 0\) (see Shorack and Wellner 2009). To simplify the exposition, and since we are interested in the asymptotic distribution of the residuals only, we may assume that a.s. \(\Vert {\hat{U}}_n-U^{F}\Vert \rightarrow 0\).

We have

$$\begin{aligned} {\bar{\varPhi }}\left( \frac{b(t)+n^{-1/2}{\hat{\epsilon }}_n(t)}{\sqrt{t}}\right)&=\int _0^t {\bar{\varPhi }} \left( \frac{b(t)-b(u)+n^{-1/2}({\hat{\epsilon }}_n(t) -{\hat{\epsilon }}_n(u))}{\sqrt{t-u}}\right) \,dF(u) \\&\quad +\frac{1}{\sqrt{n}} \int _0^t {\bar{\varPhi }} \left( \frac{b(t)-b(u)+n^{-1/2}({\hat{\epsilon }}_n(t) -{\hat{\epsilon }}_n(u))}{\sqrt{t-u}}\right) \,d{\hat{U}}_n(u),\\ {\bar{\varPhi }}\left( \frac{b(t)}{\sqrt{t}}\right) -\phi \left( \frac{b(t)}{\sqrt{t}}\right) \frac{{\hat{\epsilon }}_n(t)}{\sqrt{n}\sqrt{t}}&=\int _0^t {\bar{\varPhi }}\left( \frac{b(t)-b(u)}{\sqrt{t-u}}\right) \,dF(u) \\&\quad +\frac{1}{\sqrt{n}}\int _0^t {\bar{\varPhi }}\left( \frac{b(t) -b(u)}{\sqrt{t-u}}\right) \,d{\hat{U}}_n(u)\\&\quad -{\frac{1}{\sqrt{n}}}\int _0^t\phi \left( \frac{b(t)-b(u)}{\sqrt{t-u}} \right) \frac{{\hat{\epsilon }}_n(t)-{\hat{\epsilon }}_n(u)}{\sqrt{t-u}}dF(u)+o\left( \frac{1}{\sqrt{n}}\right) . \end{aligned}$$

Therefore

$$\begin{aligned} \phi \left( \frac{b(t)}{\sqrt{t}}\right) \frac{{\hat{\epsilon }}_n(t)}{\sqrt{t}}&= \int _0^t\phi \left( \frac{b(t)-b(u)}{\sqrt{t-u}}\right) \frac{{\hat{\epsilon }}_n(t) -{\hat{\epsilon }}_n(u)}{\sqrt{t-u}}dF(u) \\&\quad -\int _0^t {\bar{\varPhi }}\left( \frac{b(t)-b(u)}{\sqrt{t-u}} \right) \,d{\hat{U}}_n(u)+o(1). \end{aligned}$$

Assume that, for all t, \({\hat{\epsilon }}_n(t)\) converged to a limit \(\epsilon (t)\). The process \((\epsilon (t))\) would then solve

$$\begin{aligned} \phi \left( \frac{b(t)}{\sqrt{t}}\right) \frac{\epsilon (t)}{\sqrt{t}}&= \int _0^t\phi \left( \frac{b(t)-b(u)}{\sqrt{t-u}}\right) \frac{\epsilon (t)-\epsilon (u)}{\sqrt{t-u}}dF(u)\\&\quad -\int _0^t {\bar{\varPhi }}\left( \frac{b(t)-b(u)}{\sqrt{t-u}}\right) \,dU^F(u). \end{aligned}$$

With (4), \((\epsilon (t))\) would solve the stochastic linear Abel integral equation

$$\begin{aligned} \int _0^t\phi \left( \frac{b(t)-b(u)}{\sqrt{t-u}}\right) \frac{\epsilon (u)}{\sqrt{t-u}}dF(u)= -\int _0^t {\bar{\varPhi }}\left( \frac{b(t)-b(u)}{\sqrt{t-u}}\right) \,dU^F(u). \end{aligned}$$
(10)

However, there is no “classical” solution \((\epsilon (t))\) of (10). To see this, let \(b(t)=b\) be constant. Recall that the density f of \(\tau \) is continuous and bounded. Then Eq. (10) is

$$\begin{aligned} \int _0^t\frac{\epsilon (u)f(u)}{\sqrt{t-u}}\, du= -\sqrt{\frac{\pi }{2}}U^F(t). \end{aligned}$$
(11)

Applying the Abel transform (see Gorenflo and Vessella 1991), we get

$$\begin{aligned} \int _0^t \epsilon (u)f(u)\, du =H_t:= -\frac{1}{\sqrt{2\pi }} \int _0^t \frac{U^F(u)}{\sqrt{t-u}}\, du. \end{aligned}$$
(12)

The process \((H_t)\) is a Gaussian process with \({{{\mathbb {E}}}}(H_t)=0\) and for \(s\le t\),

$$\begin{aligned} K(s,t)={{\mathbb {C}}ov}(H_s,H_t)= \frac{1}{2\pi }\left( 2\int _0^s\left( \sqrt{\frac{s-u}{t-u}}+\sqrt{\frac{t-u}{s-u}}\right) F(u)\,du -g(s)g(t)\right) , \end{aligned}$$
(13)

with

$$\begin{aligned} g(s)=\int _0^s \frac{1}{\sqrt{s-u}}F(u)\,du. \end{aligned}$$

Remark

  1.

    If \(\int _0^t \epsilon (u)f(u)\,du\) were of bounded variation with \(\epsilon (t)f(t)=\frac{d}{dt}H_t\), then

    $$\begin{aligned} {{\mathbb {V}}ar}(\epsilon (t)f(t))= & {} \frac{\partial ^2}{\partial s\,\partial t}K(s,t)\mid _{t=s}\\= & {} \int _0^s \frac{1}{\sqrt{(s-u)(t-u)}}f(u)\, du -g'(s)g'(t)\\= & {} \int _0^t \frac{1}{t-u}f(u)\, du -\left( \int _0^t \frac{1}{\sqrt{t-u}}f(u)\, du\right) ^2 =\infty . \end{aligned}$$
  2.

    Note that Lévy’s theorem on the modulus of continuity implies that the Brownian bridge \(U^F\) has modulus of continuity \(\sqrt{2h\log (1/h)}\), i.e. it holds

    $$\begin{aligned} \lim _{h\rightarrow 0+}\sup _{|s|\le h} \frac{|U^F(t+s)-U^F(t)|}{\sqrt{2h\log (1/h)}}=1 \quad \text {a.s.} \end{aligned}$$

    It follows that the modulus of continuity of \((H_t)\) is \(ch\sqrt{\log (1/h)}\), with \(c>0\) a constant.

Proposition 1

Let b(t) be constant. The process \(\left( \int _0^t{\hat{\epsilon }}_n(u)\,dF(u)\right) \) of integrated weighted residuals converges in distribution to the centered Gaussian process \((H_t)\) defined by (12).

3 Approximate likelihood

For piecewise linear boundaries the following conditional boundary crossing probability allows the computation of an approximate conditional likelihood function. Let \(b_m\) be continuous and linear on the intervals \([t_i, t_{i+1}]\), where \(0=t_0<t_1<\cdots <t_m=T<\infty \). Let \(\tau ^m\) denote the corresponding first-exit time, \(W^m=(W(t_i))_{i \le m}\) the discrete skeleton of the Brownian motion and \(w^m=(w_1,\ldots ,w_m)\in {\mathbb {R}}^m\). Wang and Pötzelberger (1997) prove that

$$\begin{aligned} P(\tau ^m>T)= {{\mathbb {E}}}(\nu _m(W(t_1),\ldots ,W(t_m),T)), \end{aligned}$$
(14)

with

$$\begin{aligned} \nu _m(w_1,\ldots ,w_m,T)= & {} \prod _{i=1}^m \left( 1-\exp \left\{ -\frac{2(b_m(t_{i-1})-w_{i-1})(b_m(t_i)-w_{i})}{\varDelta t_i}\right\} \right) \nonumber \\&\times I_{\{w_{i}<b_m(t_i)\}}. \end{aligned}$$
(15)

Let \(f(t\mid b_m)\) denote the density of \(\tau ^m\), the first-exit time for the boundary \(b_m\), and let \(f(t\mid b_m, w^m)\) denote the conditional density of \(\tau ^m\) given \(W^m=w^m\). Then \( f(t\mid b_m)={{\mathbb {E}}}(f(t\mid b_m, W^m))\).
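Since (15) is available in closed form, the expectation (14) can be evaluated by plain Monte Carlo over the Gaussian skeleton. A sketch with illustrative names; for a constant boundary (a special piecewise linear boundary) the formula is exact, so the estimate can be checked against \(P(\tau >T)=1-2{\bar{\varPhi }}(b/\sqrt{T})\):

```python
import math
import random

def phi_bar(x):
    """Survival function of the standard normal distribution."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def nu_m(w, b, t):
    """Conditional non-crossing probability (15) for a piecewise linear
    boundary b on the grid t, given the skeleton w with w[0] = W(0) = 0."""
    p = 1.0
    for i in range(1, len(t)):
        if w[i] >= b[i]:               # indicator I{w_i < b_m(t_i)}
            return 0.0
        dt = t[i] - t[i - 1]
        p *= 1.0 - math.exp(-2.0 * (b[i - 1] - w[i - 1]) * (b[i] - w[i]) / dt)
    return p

def survival_prob(b, t, n_paths=40000, seed=2):
    """Monte Carlo evaluation of (14): average nu_m over Gaussian skeletons."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_paths):
        w = [0.0]
        for i in range(1, len(t)):
            w.append(w[-1] + rng.gauss(0.0, math.sqrt(t[i] - t[i - 1])))
        total += nu_m(w, b, t)
    return total / n_paths

# Constant boundary b = 1 on [0, 1]: P(tau > 1) = 1 - 2 * phi_bar(1).
t = [0.0, 0.25, 0.5, 0.75, 1.0]
est = survival_prob([1.0] * 5, t)
```

Because the skeleton is sampled exactly and (15) is the exact conditional probability, this estimator has no discretization bias for piecewise linear boundaries, only Monte Carlo error.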

Proposition 2

Define for given \(t_d\): \(t_u=t_{d+1}\) and \(\varDelta =t_{u}-t_d\). Let

$$\begin{aligned} \mu _t = \frac{w_d(t_u-t) + w_u (t- t_d)}{\varDelta },\quad \sigma ^2_t= \frac{(t_u-t) (t- t_d)}{\varDelta }. \end{aligned}$$
(16)

1. For \(t_{d}<t<t_{u}\) and \(w_u\ge b(t_u)\),

$$\begin{aligned} f(t\mid b_m,w^m)= & {} \nu _{m-1}(w_1,\ldots ,w_d,t_d) \nonumber \\&\times \, \phi \left( \frac{b_m(t)-\mu _t}{\sigma _t}\right) \frac{\varDelta ^{1/2}(b(t_d)-w_d)}{(t_u-t)^{1/2}(t-t_d)^{3/2}} \end{aligned}$$
(17)

2. For \(t_{d}<t<t_{u}\) and \(w_u< b(t_u)\), define

$$\begin{aligned} \eta _t= & {} \mu _t+\frac{2(b(t_u)-w_u)(t-t_d)}{\varDelta },\\ \theta _t= & {} \mu _t+\frac{2(b(t_u)-w_u)(t-t_d)}{\varDelta } + \frac{2(b(t_d)-w_d)(t_u-t)}{\varDelta }. \end{aligned}$$

Then \(f(t\mid b_m,w^m) =\nu _{m-1}(w_1,\ldots ,w_d,t_d)\, g(t\mid b_m,w^m)\), where

$$\begin{aligned} g(t\mid b_m,w^m)= & {} \phi \left( \frac{b_m(t)-\eta _t}{\sigma _t}\right) \nonumber \\&\times \left( \frac{\varDelta ^{1/2}(b(t_d)-w_d)}{2(t_u-t)^{1/2}(t-t_d)^{3/2}} + \frac{\varDelta ^{1/2}(b(t_u)-w_u)}{2(t_u-t)^{3/2}(t-t_d)^{1/2}} \right) \nonumber \\&+ \exp \left[ \frac{2(b(t_d)-w_d)(b(t_u)-w_u)}{\varDelta }\right] \phi \left( \frac{b_m(t)-\theta _t}{\sigma _t}\right) \nonumber \\&\times \left( \frac{\varDelta ^{1/2}(b(t_d)-w_d)}{2(t_u-t) ^{1/2}(t-t_d)^{3/2}} - \frac{\varDelta ^{1/2}(b(t_u)-w_u)}{2(t_u-t)^{3/2}(t-t_d)^{1/2}} \right) \end{aligned}$$
(18)

Proof

Let \(\tau ^m=t\in [t_d,t_u]\). Conditional on \(W(t_d)=w_d\) and \(W(t_u)=w_u\), \((W(s))_{t_d\le s\le t_u}\) is a Brownian bridge, for which the crossing probabilities are given in closed form. The conditional distribution of W(t) is Gaussian with parameters (16). To compute \(P(\tau ^m>t\mid \tau ^m\in [t_d,t_u])\), condition on \(W(t)=v\) with \(v<b(t)\). There is no crossing in \([t_d,t]\) and a crossing in \([t,t_u]\). Note that in case \(W(t_u)\ge b(t_u)\) the latter conditional probability is 1. Taking expectation w.r.t. W(t) and finally the derivative w.r.t. t gives (17) and (18). \(\square \)

Approximate likelihood inference replaces the exact likelihood function by the approximate one, i.e. the boundary b is approximated by a piecewise linear boundary \(b_m\). Estimates for errors, especially on \(|P(\tau>t)-P(\tau ^m>t)|\), are derived in Pötzelberger and Wang (2001), Borovkov and Novikov (2005), Zucca and Sacerdote (2009) and Pötzelberger (2012), among others.

4 Monte Carlo experiments

Monte Carlo simulation experiments were performed to evaluate the performance of the empirical estimator for finite sample sizes. Since no theoretical result on the properties of the Bayes estimator is available, the Monte Carlo experiments can indicate whether likelihood-based methods have the potential to outperform the empirical estimator. We estimate four boundaries (a constant boundary, a linearly increasing boundary, a linearly decreasing boundary and a Daniels boundary) for which the first-exit time distribution is known in closed form, on [0, T] with \(T=1\). A fifth boundary corresponds to exponentially distributed first-exit times (see Abundo 2015).

Fig. 1 Empirical estimator, \(n=10^3\). The true boundaries are shown by the black solid lines. The estimated boundaries are shown by the red solid lines (color figure online)

Table 1 Empirical estimator: MISE for \(K=100\) replications

The results for the empirical estimator are given in Fig. 1. The mean integrated squared errors (MISE) reported in Tables 1 and 2 are estimates of \(\int _0^T({\hat{b}}_n(t)-b(t))^2\,dt\). For the empirical estimator, we generate \(K=100\) samples of first-exit times of size n. For each sample

$$\begin{aligned} \sum _{i=1}^n ({\hat{b}}_n(\tau _{(i)})-b(\tau _{(i)}))^2 (\tau _{(i)}-\tau _{(i-1)}) \end{aligned}$$
(19)

(with \(\tau _{(0)}=0\)) is computed. The MISE is the mean over these \(K=100\) samples.
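The quantity (19) is a simple Riemann-type sum; an illustrative implementation with a hand-checked example:

```python
def mise_contribution(taus, bhat, b_true):
    """Riemann-type approximation (19) of the integrated squared error,
    with tau_(0) = 0; taus must be sorted in increasing order."""
    total, prev = 0.0, 0.0
    for t, bh in zip(taus, bhat):
        total += (bh - b_true(t)) ** 2 * (t - prev)
        prev = t
    return total

# Hand-checked example: 0.1^2 * 0.5 + 0.2^2 * 0.5 = 0.025.
val = mise_contribution([0.5, 1.0], [1.1, 0.8], lambda t: 1.0)
```

The MISE is then the average of this quantity over the K replications.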

Fig. 2 Bayes estimator, \(n=10^3\). The true boundaries are shown by the black solid lines. The posterior means are shown by the red dashed lines together with \(95\%\)-credible intervals in gray (color figure online)

Table 2 Bayes—posterior mean: MISE for \(K=100\) replications

We compare the performance of the empirical estimator to an approximate Bayes estimator: Let \(t_i=Ti/m\). The boundary b(t) is approximated by \(b_m(t)\), which is linear on the intervals \([t_{i-1},t_i]\). For the parameter \({\mathbf {b}}=(b(0), b(t_1), \ldots , b(t_m))\) and \({\mathbf {w}}=(w_1,\ldots ,w_m)\in {\mathbb {R}}^m\), the conditional approximate density of \(\tau _i\) is given by (17) (if \(w_u\ge b(t_u)\)), (18) (if \(w_u< b(t_u)\)) or (15) (if \(\tau \ge T\)). The product of these conditional approximate likelihoods is denoted by \(L({\mathbf {b}},{\mathbf {w}}^1,\ldots ,{\mathbf {w}}^n, \tau _1,\ldots ,\tau _n)\). The problem can be formulated as a latent variable model with a suitably chosen prior for the parameter \({\mathbf {b}}\). The Bayes estimator is the posterior mean, computed from a sample of parameters \({\mathbf {b}}\) generated by a Markov chain Monte Carlo scheme:

  • Parameter \({\mathbf {b}}\).

  • Data A sample of i.i.d. first hitting times \({\tau }=(\tau _1,\dots , \tau _n)\).

  • Latent state space \({\mathbf {W}}^1,\ldots ,{\mathbf {W}}^n\).

  • Prior We assume that the slopes of \({\mathbf {b}}\) follow a random walk with the double gamma shrinkage prior on the process variances, (see Bitto and Frühwirth-Schnatter 2019)

    $$\begin{aligned} d_{t_i}= & {} \frac{b_m(t_{i+1}) - b_m(t_i)}{t_{i+1} - t_i},\\ d_{t_i}= & {} d_{t_{i-1}}+\epsilon _{t_i}, \\ \epsilon _{t_i}\sim & {} {\mathcal {N}}(0, \theta ^2_i), \\ \theta _i^2\sim & {} {\mathcal {G}}(0.5, 0.5/\xi ^2), \\ \xi ^2\sim & {} {\mathcal {G}}(a^\xi , a^\xi \kappa ^2). \end{aligned}$$
  • Computational details The estimation is performed in JAGS (see Plummer 2015) and the results shown are based on one chain, burn-in 5000, 10,000 iterations with a thinning of 10 and hyperparameters \(a^{\xi }=0.1\), \(\kappa ^2=1\).
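The hierarchical prior can be inspected by forward simulation (the estimation itself is done in JAGS). The Python sketch below draws one boundary from the prior; the shape/rate parametrization of \({\mathcal {G}}\), one variance per slope increment, the initial slope, and the value b(0) are assumptions of this sketch, not fixed by the text:

```python
import math
import random

def sample_boundary_from_prior(m=20, T=1.0, b0=1.0, a_xi=0.1, kappa2=1.0, seed=3):
    """Forward draw from the random-walk slope prior with double gamma
    shrinkage on the process variances.  Gamma(shape, rate) is converted to
    Python's shape/scale convention; b0 and the zero initial slope are
    assumptions of this illustration."""
    rng = random.Random(seed)
    # xi^2 ~ G(a_xi, a_xi * kappa2) with rate a_xi * kappa2, i.e. scale 1/(a_xi * kappa2)
    xi2 = rng.gammavariate(a_xi, 1.0 / (a_xi * kappa2))
    dt = T / m
    b = [b0]
    d = 0.0                                        # assumed initial slope
    for _ in range(m):
        theta2 = rng.gammavariate(0.5, xi2 / 0.5)  # theta^2 ~ G(0.5, 0.5/xi^2)
        d += rng.gauss(0.0, math.sqrt(theta2))     # random-walk step on the slope
        b.append(b[-1] + d * dt)
    return b

boundary = sample_boundary_from_prior()
```

Small \(\xi ^2\) shrinks the increment variances towards zero, producing nearly linear boundaries, which is the intended regularization effect of the double gamma prior.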

Results are given in Fig. 2 and Table 2. The MISE in Table 2 is defined analogously to the case of the empirical estimator, with (19) replaced by

$$\begin{aligned} \sum _{i=1}^n ({\hat{b}}_n(t_i)-b(t_i))^2 (t_i-t_{i-1}). \end{aligned}$$
(20)

Remark and Conclusion The computation of the empirical estimator is straightforward and takes negligible time, and a tight upper bound for its asymptotic error is available. The Bayes estimator based on the approximate likelihood can incorporate prior knowledge and has its potential if the class of boundaries can be parametrized by a finite-dimensional parameter. In the nonparametric case, the numerical experiments revealed drawbacks, at least compared to the empirical estimator. The computation was costly in terms of time. The study was performed on 10 nodes of a cluster of workstations; each node has two six-core Intel Xeon X5670 @ 2.93 GHz processors and was used for one boundary with a given n. The execution times are around 7 min and 2 h 20 min for the Bayes estimator (compared to around 1 s and 9 s for the empirical estimator) for \(n=10^2\) and \(n=10^3\), respectively.

As can be seen in Fig. 2, in all cases considered the Bayes estimator showed a strong positive bias. This bias appears to be a result of the data-augmentation procedure: the discrete Brownian motion is always below the boundary up to the observed exit time, and, conditional on the n discrete Brownian motions, the newly sampled boundary lies above all those discrete Brownian motions that have not crossed the boundary up to t. These findings do not depend on the chosen prior; alternative priors, such as (discrete) Ornstein–Uhlenbeck processes, have been considered with qualitatively the same result.