Parameter inference from hitting times for perturbed Brownian motion

A latent internal process describes the state of some system, e.g. the social tension in a political conflict, the strength of an industrial component or the health status of a person. When this process reaches a predefined threshold, the process terminates and an observable event occurs, e.g. the political conflict finishes, the industrial component breaks down or the person has a heart attack. Imagine an intervention, e.g., a political decision, maintenance of a component or a medical treatment, is initiated to the process before the event occurs. How can we evaluate whether the intervention had an effect? To answer this question we describe the effect of the intervention through parameter changes of the law governing the internal process. Then, the time interval between the start of the process and the final event is divided into two subintervals: the time from the start to the instant of intervention, denoted by $S$, and the time between the intervention and the threshold crossing, denoted by $R$. The first question studied here is: What is the joint distribution of $(S,R)$? The theoretical expression is provided and serves as a basis to answer the main question: Can we estimate the parameters of the model from observations of $S$ and $R$ and compare them statistically? Maximum likelihood estimators are illustrated on simulated data under the assumption that the process before and after the intervention is described by the same type of model, i.e. a Brownian motion, but with different parameters.


Monte Carlo simulation study
In this online material we extend the description of the results of the simulations with further details. In the simulations we are mainly concerned with illustrating the performance of the estimators. It is of interest to evaluate the effect of the variability and correlation of S and R on estimation, to evaluate sample sizes needed for the asymptotic results of tests and confidence intervals to be valid, to investigate the influence of right censored data on estimation, to illustrate different special submodels which simplify estimation, and finally to evaluate how much information is gained on parameters of S by taking into account observations of R.
In the simulations, three scenarios are considered: no information about the parameters is available, i.e. all parameters can vary freely; we assume equal variances σ 2 1 = σ 2 2 = σ 2 ; or we assume σ 2 i = kµ i , as in Section 4.1 of the paper. That is, either φ = (µ 1 , σ 2 1 , µ 2 , σ 2 2 ), φ = (µ 1 , µ 2 , σ 2 ) or φ = (µ 1 , µ 2 , k) are estimated. We assume both the parametric form of the underlying process and the relations between parameters, if any, to be known. It can be discussed if these assumptions are realistic. Equality of diffusion coefficients, or the assumption of variance proportional to the mean, can be checked by likelihood ratio test. Parameter values are chosen such that the mean of T in the case of no intervention is five times its standard deviation. This is obtained by setting B = 10, µ 1 = 1, σ 2 1 = 0.4, yielding E[T ] = 10, and Var[T ] = 4. Then µ 1 and σ 2 1 are varied to investigate different regimes of the model. Then the effect of the intervention is varied through the parameters µ 2 and σ 2 2 . Samples of size n = 100 are simulated, and for each set of parameter values, we repeat simulation of data set and estimation 1000 times, obtaining 1000 statistically independent trials. To illustrate the effect of right censored data, we choose the most general scenario, i.e. parameters vary freely.  Table 1 Averages, empirical and asymptotic SEs and CPs in percentage over 1000 estimates of φ = (µ 1 , σ 2 1 , µ 2 , σ 2 2 ) for n = 100, when µ 1 = 1, σ 2 1 = 0.4, µ 2 = 0.1, and σ 2 2 = 0.026, 0.059, 0.094, or 0.131, yielding an approximate CV(R) = 0.60, 0.65, 0.70 or 0.75, respectively. In all cases, CV(S) = 0.62.
Coverage probabilities (CPs) were calculated, defined as the probability that the CI covers the true value, to evaluate the performance of the CIs. The CP should be close to 1 − α, where α = 0.05 is the significance level, and the CI should be narrow for a reliable estimator.
The computing environment R has been used to carry out both the simulations of (s i , r i ), including right censored data where relevant, and the parameter estimation. A description of the simulation procedure is reported in Appendix B of the paper.
Averages and empirical SEs of the estimates, as well as medians of the asymptotic SEs and the CPs of the CIs are reported in Table 1. All estimators appear unbiased and with acceptable SEs. Not surprisingly, the performance improves when the CV of R decreases. This holds also forμ 1 andσ 2 1 , highlighting the dependence between S and R: a large variability after the intervention deteriorates estimation of parameters governing the process before the intervention. Coverage probabilities of drift parameters are close to the desired 95%, whereas the diffusion parameters σ 2 1 and σ 2 2 need a larger n. In the second case, we let µ 2 vary in the interval [0.1, 10], and fix σ 2 2 = 0.1. Here the response to the intervention either slows down or accelerates the process, depending on whether µ 2 < µ 1 or µ 1 < µ 2 , respectively.
A relevant question is how much, if at all, the estimators of µ 1 and σ 2 1 improve by considering the more complicated likelihood based on eq. (12) compared to the simple likelihood based on eq. (7), where information from R is ignored. All estimators appear unbiased (figures not shown). The estimates of µ 1 and σ 2 1 obtained from observations of (S, R) outperform those obtained only from observations of S, as can be seen comparing both their empirical and asymptotic SEs in Fig. 1. When µ 2 increases, the performance ofμ 1 and σ 2 1 improve and that ofμ 2 andσ 2 2 get worse even if CV of R decrease. Moreover, the difference between the empirical and the asymptotic SEs forμ 2 and σ 2 2 increases with µ 2 , and thus, for large µ 2 , a larger sample size is needed for asymptotics to be valid. Otherwise the empirical and asymptotic SEs are approximately equal, and thus the asymptotic values appear acceptable for inference purposes. In the following we only report the asymptotic values.
Equal variances When σ 2 1 = σ 2 2 = σ 2 , we put σ 2 = 0.1, 0.4, 1 and 2, respectively, with either µ 1 = 1 and µ 2 ∈ [0.1, 10] or µ 2 = 1 and µ 1 ∈ [0.1, 10]. The variability of the estimators for different values of µ 1 and µ 2 is reported in Fig. 2, where the SEs of the estimators are plotted against µ 2 . The estimators appear unbiased (results not shown). All of them improve when σ 2 decreases, since that reduces the variability of both S and R. The performance ofμ i improves while that ofμ j gets worse when µ j increases, for i, j = 1, 2 and i = j. Interestingly, the performance ofσ 2 seems to be constant with respect to µ, unless σ 2 is large.
A likelihood ratio test is performed for testing the hypothesis H 0 : µ 1 = µ 2 at a 5% significance level and the percentage of rejections of H 0 as a function of µ 2 is reported in Fig. 3. If µ 1 = µ 2 , the percentage should be around 5%, while if µ 1 = µ 2 , the percentage represents the power of the test, i.e. the probability of correctly rejecting the null hypothesis, and it should be as  Fig. 4 Asymptotic SEs over 1000 estimates of µ i and k for n = 100 rescaled by µ i as a function of k when the variance is proportional to the mean, σ 2 i = kµ i , i = 1, 2. The parameters are µ 1 = 1 and µ 2 = 2. The results for SE(μ 1 )/µ 1 and SE(μ 2 )/µ 2 are almost indistinguishable. The same results hold for other combinations of (µ 1 , µ 2 ) and are therefore not reported.
high as possible. When µ 1 = µ 2 = 1, this percentage is around 5% for all values of σ 2 , suggesting that n = 100 is sufficient for asymptotics to be valid. Not surprisingly, the power of the test decreases when σ 2 increases, but it is worthwhile noting that it is larger than 50% when |µ 1 − µ 2 | > 0.2 and around 100% if |µ 1 − µ 2 | ≥ 0.4, indicating a satisfactory performance of the test.
Variance proportional to the mean Assume σ 2 i = kµ i , for k > 0. The parameter values are k ∈ [0.1, 10] and µ 1 , µ 2 ∈ {0.1, 1, 2}. The performance of the estimators is reported in Fig. 4, where SE(μ i )/µ i and SE(k) are plotted against k. Also in this case, estimators appear unbiased (results not shown). As expected from the theoretical results in Section 4.1, the performance ofμ 1 andμ 2 appears similar, and it does not depend on µ 2 and µ 1 , respectively. Interestingly, the asymptotic SE ofk depends neither on µ 1 nor on µ 2 , but only on k. This may be due to the fact that neither the CVs of S and R nor their correlation depend on µ 1 and µ 2 , see eqs. (13), (14) and (17).

Right censoring
To illustrate the effect of censoring, we consider the more general scenario, i.e. no assumptions on the parameters, and l = 5, 25, 50% right censored data for different sample sizes, namely n = 100, 200, 500 and 1000. The parameters are µ 1 = 1, σ 2 1 = 0.4, µ 2 = 2 and σ 2 2 = 0.1 chosen as worst case scenario giving the worst performance in terms of SEs ofμ 2 andσ 2 2 for non-censored data, as shown in Fig. 1. Boxplots ofφ for different values of l and n are reported in Fig. 5. The boxes contain the estimates between the 1st and the 3rd quartiles, while the 2nd quartile, i.e. the median of the estimates, is marked with the black horizontal line. The bars show the range of the estimates, except the outliers, defined to be the points outside the bars that are more than 1.5 times the interquartile range from the box. As expected, the performance ofφ gets worse when the percentage of right censored data increases and thus a larger sample size is needed.