Abstract
We study approximations of boundary crossing probabilities for the maximum of moving weighted sums of i.i.d. random variables. We consider a particular case of weights obtained from a trapezoidal weight function which, under certain parameter choices, can also result in an unweighted sum. We demonstrate that the approximations based on classical results of extreme value theory leave some scope for improvement, particularly for the range of values required in practical applications.
1 Introduction: statement of the problem
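Let \(\varepsilon _1,\varepsilon _2,\ldots \) be a sequence of independent identically distributed random variables with finite mean \(\mu \), variance \(\sigma ^2\) and c.d.f. F. Define the moving weighted sum as
\[ \mathcal{S}_{n;L,Q}=\sum _{s=n+1}^{n+L+Q-1} w_{L,Q}(s-n)\,\varepsilon _s, \qquad n=0,1,\ldots , \tag{1} \]
where the weight function \(w_{L,Q}(\cdot )\) is defined by
\[ w_{L,Q}(t)= \begin{cases} t & \text{for } 0\le t\le Q,\\ Q & \text{for } Q\le t\le L,\\ L+Q-t & \text{for } L\le t\le L+Q-1, \end{cases} \tag{2} \]
where L and Q are positive integers with \(Q \le L\).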
The weight function \(w_{L,Q}(\cdot )\) is depicted in Fig. 1. In the special case \(Q=1\), the weighted moving sum (1) becomes an ordinary moving sum.
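As an illustration, the moving weighted sum can be computed as in the following minimal sketch. It assumes that the trapezoidal weights take the values \(w_{L,Q}(t)=\min (t,Q,L+Q-t)\) for \(t=1,\ldots ,L+Q-1\); the function names `trapezoid_weights` and `moving_weighted_sums` are hypothetical and are not taken from the paper.

```python
import numpy as np

def trapezoid_weights(L, Q):
    """Assumed trapezoidal weights w(t), t = 1, ..., L+Q-1."""
    if not (1 <= Q <= L):
        raise ValueError("need 1 <= Q <= L")
    t = np.arange(1, L + Q)
    # min(t, Q, L+Q-t): rising part, plateau at height Q, falling part
    return np.minimum(np.minimum(t, Q), L + Q - t).astype(float)

def moving_weighted_sums(eps, L, Q):
    """S_n = sum_{s=n+1}^{n+L+Q-1} w(s-n) * eps_s, with eps[0] playing the role of eps_1."""
    w = trapezoid_weights(L, Q)
    # 'valid' correlation slides the (unflipped) weights along the series
    return np.correlate(np.asarray(eps, dtype=float), w, mode="valid")
```

For \(Q=1\) all weights equal 1, so the function returns ordinary moving sums of L consecutive terms.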
The main aim of this paper is to study the precision of different approximations of boundary crossing probabilities for the maximum of the moving weighted sum; that is,
\[ P\Big (\max _{n=0,1,\ldots ,M} \mathcal{S}_{n;L,Q} \ge H\Big ), \tag{3} \]
where H is a given threshold, M is reasonably large and L, Q are fixed parameters.
This paper is structured as follows. In Sect. 2 we reformulate the problem and explain why a trapezoidal weight function is considered. In Sect. 3, a number of approximations to (3) are introduced based on the classical extreme value theory. Using the classical approximations, which do not perform very well, we also derive another approximation (called ‘combined’) which appears to be more accurate. The performance of these approximations is analyzed by a large simulation study described in Sect. 4.
2 Boundary crossing probabilities: discrete and continuous time
2.1 Reformulation of the problem
For convenience of dealing with the probability (3), we standardise the moving weighted sum \(\mathcal{S}_{n;L,Q}\). Derivation of the following lemma is straightforward.
Lemma 1
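The mean and variance of \(\mathcal{S}_{n;L,Q}\) are
\[ E\mathcal{S}_{n;L,Q}=\mu LQ, \qquad \mathrm{var}(\mathcal{S}_{n;L,Q})=\frac{\sigma ^2 Q\,(3LQ-Q^2+1)}{3}. \tag{4} \]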
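The moments in Lemma 1 can be checked numerically. The sketch below assumes the weights \(w_{L,Q}(t)=\min (t,Q,L+Q-t)\), \(t=1,\ldots ,L+Q-1\), and compares the sums of the weights and of their squares with the closed forms \(\mu LQ\) and \(\sigma ^2Q(3LQ-Q^2+1)/3\); both function names are hypothetical.

```python
import numpy as np

def moments_from_weights(L, Q, mu, sigma2):
    """Mean and variance of S_n computed directly from the assumed weights."""
    t = np.arange(1, L + Q)
    w = np.minimum(np.minimum(t, Q), L + Q - t)
    # E S = mu * sum(w), var S = sigma^2 * sum(w^2) for i.i.d. inputs
    return mu * w.sum(), sigma2 * (w ** 2).sum()

def moments_closed_form(L, Q, mu, sigma2):
    """Closed forms consistent with the correlation formula used in Sect. 3."""
    return mu * L * Q, sigma2 * Q * (3 * L * Q - Q ** 2 + 1) / 3
```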
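We now define the standardized random variables (r.v.)
\[ \zeta _n=\frac{\mathcal{S}_{n;L,Q}-E\mathcal{S}_{n;L,Q}}{\sqrt{\mathrm{var}(\mathcal{S}_{n;L,Q})}} =\frac{\mathcal{S}_{n;L,Q}-\mu LQ}{\sigma \sqrt{Q\,(3LQ-Q^2+1)/3}}, \tag{5} \]
\(n=0,1,\ldots .\) If the r.v. \(\varepsilon _1, \varepsilon _2, \ldots \) are normal then the r.v. \(\zeta _1, \zeta _2, \ldots \) are also normal. Otherwise, using the Central Limit Theorem, we obtain that \( \zeta _{n} \sim N(0,1)\, \) holds asymptotically, as \(L\rightarrow \infty \).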
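Using the notation \(\zeta _n\), our problem (3) is equivalent to studying approximations for the boundary crossing probability (abbreviated BCP)
\[ P_{M,h}(\zeta _n):=P\Big (\max _{n=0,1,\ldots ,M}\zeta _n \ge h\Big ), \tag{6} \]
where
\[ h=\frac{H-\mu LQ}{\sigma \sqrt{Q\,(3LQ-Q^2+1)/3}}. \]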
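The passage from the original threshold H to the standardized threshold h can be sketched as follows. The sketch assumes the mean \(\mu LQ\) and the variance \(\sigma ^2Q(3LQ-Q^2+1)/3\) for \(\mathcal{S}_{n;L,Q}\); the function name `standardized_threshold` is hypothetical.

```python
import math

def standardized_threshold(H, L, Q, mu, sigma):
    """Map a threshold H for S_n to the threshold h for the standardized zeta_n."""
    mean = mu * L * Q                                      # assumed E S_n
    var = sigma ** 2 * Q * (3 * L * Q - Q ** 2 + 1) / 3    # assumed var S_n
    return (H - mean) / math.sqrt(var)
```

For example, with \(Q=1\), \(L=3\), \(\mu =0\) and \(\sigma =1\) the sum has variance 3, so \(H=3\) corresponds to \(h=\sqrt{3}\).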
A number of approaches could be used to approximate (6). We could have ignored the dependence structure of the sequence of moving weighted sums and used either asymptotic normality alone or the limiting extreme value distribution to choose h. Instead, in what follows we study several approximations of (6) which are based on approximating the sequence \(\zeta _n\) by a continuous time random process. Before we proceed, let us consider a special case of \(\varepsilon _j\), which has important practical significance.
2.2 Motivation for the problem
If we let \(\varepsilon _j=\xi _j^2\), where \(\xi _1,\xi _2,\ldots \) are i.i.d. random variables with zero mean, variance \(\delta ^2\) and finite fourth moment \(\mu _4=E\xi _j^4\), then \(\mathcal{S}_{n;L,Q}\) can be seen as a moving weighted sum of squares. In this case, the mean \(\mu =E\varepsilon _j=\delta ^2\) and \(\sigma ^2=\mathrm{var }(\varepsilon _j) = \mu _4- \delta ^4\). By approximating (3) we are considering a particularly interesting case linked to the SSA change-point detection algorithm proposed in Moskvina and Zhigljavsky (2003). A good approximation for the BCP for the maximum of the moving weighted sums of squares is needed in the theory of sequential change-point detection because the BCP defines the significance levels for the SSA change-point detection statistic. For an extensive introduction to SSA, see Golyandina et al. (2001) and Golyandina and Zhigljavsky (2013).
2.3 Continuous time approximation
By definition, the probability \( P_{M, h} (\zeta _n)\) is an \((M+1)\)-dimensional integral which is difficult to compute. We assume that \(L\rightarrow \infty \) and consider a transformation, described below in Sect. 3, from the time series \(\zeta _n\), \(n=0,1,\ldots , M\), to a continuous-time process \(\zeta _t, t\in [0,{ T}],\) where \(T=M/\sqrt{LQ}\) for large Q, see (10), and \(T=M/{L}\) in the case of small Q, see the beginning of Sect. 3.2. Like the time series \(\zeta _n\), the process \(\zeta _t\) is standardized so that \(E\zeta _t=0\) and \(E\zeta _t^2=1\) for all t. Also, the process \(\zeta _t\) is Gaussian and stationary with some autocorrelation function \(R(s)= E\zeta _0\zeta _{s}\).
By such a transformation, the probability \( P_{M, h} (\zeta _n)\) is approximated by \(P({ T},h, \zeta _t)\), which is the probability of reaching the threshold h by the process \(\zeta _t\) on the interval [0, T]; that is,
\[ P(T,h,\zeta _t):=P\Big (\max _{0\le t\le T}\zeta _t\ge h\Big ). \tag{7} \]
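For the continuous process \(\zeta _t\), two useful characteristics are the probability density function of the time at which the threshold h is reached for the first time,
\[ q(t,h,\zeta _t):=\frac{d}{dt}P(t,h,\zeta _t), \qquad 0<t<\infty , \tag{8} \]
and the average time \({\varrho }({h,\zeta _t})\) until the process \(\zeta _t\) reaches the threshold h,
\[ \varrho (h,\zeta _t)=\int _0^\infty t\,q(t,h,\zeta _t)\,dt. \]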
From a practical point of view, we are interested in finding good approximations of (6) for small and moderate M, but the mathematical theory guarantees accurate approximations only for large M.
To proceed further, we need to discuss results concerning the autocorrelation function of the continuous process \(\zeta _t\). This can be done through computing the correlations between \(\mathcal{S}_{n;L,Q}\) and \(\mathcal{S}_{n+\nu ,L,Q}\) for \(\nu >0\).
2.4 Correlation between \(\mathcal{S}_{n;L,Q}\) and \(\mathcal{S}_{n+1;L,Q}\)
For fixed L and Q, the moving weighted sum \(\mathcal{S}_{n;L,Q}\) is a function of n. The index n can be treated as time and thus the sequence \(\mathcal{S}_{0;L,Q}\), \(\mathcal{S}_{1;L,Q}, \ldots \) defined in (1) can be considered as a time series. In order to derive our approximations, we need explicit expressions for the correlation Corr(\(\mathcal{S}_{n;L,Q},\mathcal{S}_{n+1;L,Q})\). The general case Corr(\(\mathcal{S}_{n;L,Q},\mathcal{S}_{n+\nu ;L,Q})\), \(\nu > 1\) need not be considered for these approximations.
Without loss of generality, we can assume that \(n=0\) and we denote \( \mathcal{S}_{\nu }:=\mathcal{S}_{\nu ;L,Q} \) where \(\nu =0,1\).
Lemma 2
The correlation \(\mathrm{Corr}(\mathcal{S}_0,\mathcal{S}_1)=\mathrm{Corr}(\mathcal{S}_{n;L,Q},\mathcal{S}_{n+1;L,Q})\), where \(\mathcal{S}_{n;L,Q}\) is defined in (1), is
\[ \mathrm{Corr}(\mathcal{S}_0,\mathcal{S}_1)=1-\frac{3}{3LQ-Q^2+1}. \]
Proof
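From the definition (1), the sums \(\mathcal{S}_{0}\) and \(\mathcal{S}_{1}\) can be represented as
\[ \mathcal{S}_0=\sum _{s=1}^{L+Q-1} w_{L,Q}(s)\,\varepsilon _s \]
and
\[ \mathcal{S}_1=\sum _{s=2}^{L+Q} w_{L,Q}(s-1)\,\varepsilon _s. \]
Using these representations, we can easily obtain \( E(\mathcal{S}_0\mathcal{S}_1)=E\mathcal{S}_0^2-Q\sigma ^2\, . \) Since \(E\mathcal{S}_0=E\mathcal{S}_1\), this is equivalent to \(\mathrm{Cov}(\mathcal{S}_0,\mathcal{S}_1)=\mathrm{var}(\mathcal{S}_0)-Q\sigma ^2\). Dividing by \(\mathrm{var}(\mathcal{S}_0)=\mathrm{var}(\mathcal{S}_1)\) and substituting the explicit expression (4) for \(\mathrm{var}(\mathcal{S}_0)\), we obtain the desired result.\(\square \)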
Note that the correlation does not depend on the distribution of errors \(\varepsilon _j\) (unlike the covariance which depends on the mean \(\mu \) and variance \(\sigma ^2\) of \(\varepsilon _j\)). This also can be seen in relation to the fact (see, for example, Priestley 1981) that the spectral density of the moving average process depends only on the weight function, which is \(w_{L,Q}(t)\) in our case.
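The distribution-free nature of the correlation is easy to verify numerically, since only the weights enter. The sketch below assumes the weights \(w_{L,Q}(t)=\min (t,Q,L+Q-t)\), \(t=1,\ldots ,L+Q-1\), and compares the lag-one correlation obtained from them with the closed form \(1-3/(3LQ-Q^2+1)\); the function names are hypothetical.

```python
import numpy as np

def lag1_correlation(L, Q):
    """Corr(S_0, S_1) computed from the assumed trapezoidal weights."""
    t = np.arange(1, L + Q)
    w = np.minimum(np.minimum(t, Q), L + Q - t).astype(float)
    # S_0 and S_1 share the inputs eps_2, ..., eps_{L+Q-1};
    # their weights on the shared inputs are w[1:] and w[:-1] respectively
    return np.dot(w[1:], w[:-1]) / np.dot(w, w)

def lag1_closed_form(L, Q):
    """Closed form 1 - 3 / (3LQ - Q^2 + 1)."""
    return 1 - 3 / (3 * L * Q - Q ** 2 + 1)
```

For instance, \(L=3\), \(Q=2\) gives weights 1, 2, 2, 1 and a correlation of \(8/10=0.8\) by both routes.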
3 Approximations of the boundary crossing probabilities
In this section we formulate four different approximations for the BCP \(P_{M, h} (\zeta _n)\) defined in (6). These approximations depend on the behaviour of the autocorrelation function \(R(s)= E\zeta _0\zeta _{s}\) at 0, which in turn depends on the parameters Q and L of the weight function in (2). We consider the following two cases: (i) large Q and large L, (ii) small Q and large L.
3.1 Case of large Q and large L
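Consider the sequence of random variables \(\zeta _0, \zeta _1,\ldots ,\zeta _{M}\) defined in (5). In view of Lemma 2, the correlation between \(\zeta _n\) and \(\zeta _{n\, +\, 1}\) is
\[ \mathrm{Corr}(\zeta _n,\zeta _{n+1})=1-\frac{3}{3LQ-Q^2+1}. \tag{9} \]
Assume that both L and Q are large. Moreover, assume that L and Q tend to infinity in such a way that the limit \(\lambda =\lim Q/L \) exists and \(0<\lambda \le 1\). Set \(\varDelta ={1}/{\sqrt{LQ}}\) and
\[ t_n=n\varDelta , \quad n=0,1,\ldots ,M, \qquad T=M\varDelta . \tag{10} \]
Define a piece-wise linear continuous-time process \({\zeta _t^{(L)}}, t \in [0,T],\) as follows:
\[ \zeta _t^{(L)}=\frac{1}{\varDelta }\big [(t_n-t)\,\zeta _{n-1}+(t-t_{n-1})\,\zeta _n\big ] \quad \text{for } t\in [t_{n-1},t_n],\; n=1,\ldots ,M. \tag{11} \]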
By construction, the process \({\zeta _t^{(L)}}\) is such that \({\zeta _{t_n}^{(L)}}=\zeta _{n} \; \mathrm{for } \; n=0,\ldots ,{M}\). Also we have that \({\zeta _t^{(L)}}\) is a second-order stationary process in the sense that \(E\zeta _t^{(L)},\, \mathrm{var}(\zeta _t^{(L)})\) and the autocorrelation function \(R_\zeta ^{(L)}(t,t+k\varDelta )=\mathrm{Corr}( \zeta _t^{(L)}, \zeta _{t+k\varDelta }^{(L)})\) do not depend on t.
Lemma 3
Let \(\lambda =\lim _{L,Q\rightarrow \infty } Q/L\) and assume that \(0 < \lambda \le 1\). Consider the process \(\zeta _t^{(L)}\) defined in (11). The limiting process \(\zeta _t=\lim _{L,Q\rightarrow \infty } {\zeta _t^{(L)}}\) is stationary Gaussian with some autocorrelation function \(R_\zeta (t,t+s)=R(s)\). Moreover, \(R'(0)=0\) and \(R''(0)=-6/(3\, -\, \lambda )\).
Proof
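For the autocorrelation function \(R(\cdot )\) we have \(R'(0)=0\) since
\[ R'(0)=\lim _{\varDelta \rightarrow 0}\frac{R(\varDelta )-R(0)}{\varDelta }=-\lim _{L,Q\rightarrow \infty }\frac{3\sqrt{LQ}}{3LQ-Q^2+1}=0, \]
where we used the relations \(\varDelta \, =\, {1}/{\sqrt{LQ}}\), \(R(\varDelta )\, =\, 1-{3}\, /\, (3LQ-Q^2\, +\, 1)\) and \(R(0)\,=\, 1\). Using the symmetry \(R(-\varDelta )=R(\varDelta )\), we similarly obtain
\[ R''(0)=\lim _{\varDelta \rightarrow 0}\frac{2\,(R(\varDelta )-R(0))}{\varDelta ^2}=-\lim _{L,Q\rightarrow \infty }\frac{6LQ}{3LQ-Q^2+1}=-\frac{6}{3-\lambda }. \]
\(\square \)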
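The limit \(R''(0)=-6/(3-\lambda )\) can be illustrated numerically for finite L and Q. The sketch below assumes that the second derivative is approximated by the symmetric second difference \(2(R(\varDelta )-1)/\varDelta ^2\) with \(\varDelta =1/\sqrt{LQ}\); the function name is hypothetical.

```python
def second_derivative_estimate(L, Q):
    """Finite-L value of 2 * (R(Delta) - 1) / Delta^2 with Delta = 1 / sqrt(L * Q)."""
    # R(Delta) - 1 written directly to avoid floating-point cancellation
    r_minus_one = -3.0 / (3 * L * Q - Q ** 2 + 1)
    delta_squared = 1.0 / (L * Q)
    return 2.0 * r_minus_one / delta_squared
```

With \(L=10^4\) and \(Q=5000\) (so \(\lambda =0.5\)) the value is close to \(-6/2.5=-2.4\).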
For a Gaussian stationary process \(\zeta _t\) with \(E\zeta _t=0\) and \(E\zeta _t^2=1\) and autocorrelation function \(R(\cdot )\) such that \(R'(0)=0\) and \(R''(0)<0\) we can use the following two well-known approximations.
Approximation 1
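(App 1) From Theorem 8.2.7 in Leadbetter et al. (1983) we have
\[ \lim _{T\rightarrow \infty }P\Big \{\max _{0\le t\le T}\zeta _t\le \gamma +\frac{u-c}{\gamma }\Big \}=\exp (-e^{-u}), \]
with \(\gamma \) and c as defined in (13) below. Expressing u in terms of h, we obtain Approximation 1
\[ P(T,h,\zeta _t)\cong 1-\exp (-e^{-u}) \]
with \( {u=\gamma (h-\gamma )+c}, \) where
\[ \gamma =\sqrt{2\log T}, \qquad c=\log \Big (\frac{2\pi }{\sqrt{-R''(0)}}\Big ). \tag{13} \]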
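Approximation 1 could be evaluated as in the following minimal sketch. It assumes the Gumbel-type limit of Theorem 8.2.7 with \(\gamma =\sqrt{2\log T}\) and \(c=\log (2\pi /\sqrt{-R''(0)})\), so that \(u=\gamma (h-\gamma )+c\) and the approximation is \(1-\exp (-e^{-u})\). The function name and the argument `r2` (standing for \(R''(0)\)) are hypothetical.

```python
import math

def approximation_1(h, T, r2):
    """Sketch of Approximation 1; r2 is R''(0) < 0 and T must exceed 1."""
    if T <= 1 or r2 >= 0:
        raise ValueError("need T > 1 and R''(0) < 0")
    gamma = math.sqrt(2 * math.log(T))
    c = math.log(2 * math.pi / math.sqrt(-r2))
    u = gamma * (h - gamma) + c
    return 1 - math.exp(-math.exp(-u))
```

In the setting of this subsection one would take \(T=M/\sqrt{LQ}\) and `r2` equal to \(-6/(3-Q/L)\).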
Approximation 2
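(App 2) From Cramér (1965), we have
\[ \lim _{T\rightarrow \infty }P\Big \{\max _{0\le t\le T}\zeta _t\le \sqrt{2\log \mu }+\frac{v}{\sqrt{2\log \mu }}\Big \}=\exp (-e^{-v}), \]
where
\[ \mu =\frac{T\sqrt{-R''(0)}}{2\pi }. \]
Expressing v in terms of h, we obtain Approximation 2
\[ P(T,h,\zeta _t)\cong 1-\exp (-e^{-v}) \]
with
\[ v=\sqrt{2\log \mu }\,\big (h-\sqrt{2\log \mu }\big ). \]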
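A corresponding sketch for Approximation 2 is given below. It assumes Cramér's normalisation with \(\mu =T\sqrt{-R''(0)}/(2\pi )\) and \(v=\sqrt{2\log \mu }\,(h-\sqrt{2\log \mu })\), which requires \(\mu >1\); the function name and the argument `r2` (standing for \(R''(0)\)) are hypothetical.

```python
import math

def approximation_2(h, T, r2):
    """Sketch of Approximation 2; r2 is R''(0) < 0."""
    if r2 >= 0:
        raise ValueError("need R''(0) < 0")
    mu = T * math.sqrt(-r2) / (2 * math.pi)
    if mu <= 1:
        raise ValueError("need mu > 1 so that log(mu) is positive")
    s = math.sqrt(2 * math.log(mu))
    v = s * (h - s)
    return 1 - math.exp(-math.exp(-v))
```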
Note that \({2\log \mu }\, =\, {\gamma ^2\, \, -\,2c}\) and
as \(\gamma \rightarrow \infty \), where \(\gamma \) and c are defined in (13). Therefore, for large T (and, therefore, large \(\gamma \)) we have
Let us construct another approximation by combining the Approximations 1 and 2.
Approximation 3
(Combined) Consider the approximation
where
Formally, \(\lambda =\lim _{L,Q\rightarrow \infty } Q/L=0\) still satisfies Lemma 3 in the sense that \(R'(0)=0\) and \(R''(0)=-2<0\); however, the above approximations are poor when Q is small; this shall be demonstrated in Sect. 4. The case of small Q should be treated differently and is considered in the following subsection.
3.2 Case of small Q and large L
Consider again the sequence of random variables \(\zeta _n\) defined by (5). Unlike in Sect. 3.1, now we look at the asymptotic transformation when \(L\rightarrow \infty \) but Q is fixed. Set \(\varDelta = 1/L\) and \(T= {M} \varDelta \). Define \(t_n\), \(n=0,1,\ldots ,{M},\) as in (10) and consider the piece-wise linear continuous-time process \(\zeta _t^{(L)}\) defined by (11).
Lemma 4
Let Q be fixed. The limiting process \(\zeta _t\) as \(L \rightarrow \infty \) is a Gaussian second-order stationary process with autocorrelation function \(R_\zeta (t,t+s)=R(s)\). Moreover, \(R'(0+)=-\frac{1}{Q} \ne 0\).
Proof
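We first note that
\[ R'(0+)=\lim _{\varDelta \rightarrow 0+}\frac{R(\varDelta )-R(0)}{\varDelta }. \]
Using (9) and the fact that \(\varDelta ={1}/{L}\), we have
\[ R'(0+)=-\lim _{L\rightarrow \infty }\frac{3L}{3LQ-Q^2+1}=-\frac{1}{Q}. \]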
\(\square \)
Let us now formulate the tangent approximation suggested in Durbin (1985); it is one of the best-known approximations for the density function \(q(t,h,\zeta _t)\) of the first passage time defined in (8). Using this, we can approximate the first passage probability \(P({ T}, h,\zeta _t)\) defined in (7) in the case of a Gaussian process \(\zeta (t)\) on [0, T] with \(E\zeta (t)=0\), some autocorrelation function \(R_{\zeta }(t,s)\) and a possibly non-constant threshold \(h=h(t)\).
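The Durbin approximation for \(q(t,h,\zeta _t)\) can be written as
\[ q(t,h,\zeta _t)\cong b_0(t,h)\,f(t,h), \]
where
\[ b_0(t,h)=-h(t)\,\frac{\partial R_\zeta (t,s)}{\partial s}\Big |_{s=t+}-\frac{dh(t)}{dt}, \qquad f(t,h)=\frac{1}{\sqrt{2\pi }}\,e^{-h^2(t)/2}. \]
In view of (8) the related approximation for the first passage probability \(P({ T}, h,\zeta _t)\) is
\[ P(T,h,\zeta _t)\cong \int _0^T b_0(t,h)\,f(t,h)\,dt. \]
In the case when the threshold \(h(t)=h\) is constant, using Lemma 4 we obtain
\[ b_0(t,h)=-h\,R'(0+)=\frac{h}{Q} \]
and therefore we obtain the following approximation.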
Approximation 4
(App 4) The Durbin approximation for the BCP (7) is
\[ P(T,h,\zeta _t)\cong \frac{hT}{\sqrt{2\pi }\,Q}\,e^{-h^2/2}, \qquad T=\frac{M}{L}. \tag{16} \]
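The Durbin approximation for a constant threshold could be evaluated as in the sketch below. It assumes the form \(hT\,e^{-h^2/2}/(\sqrt{2\pi }\,Q)\) with \(T=M/L\), which follows from integrating a constant first-passage density \(h\,e^{-h^2/2}/(\sqrt{2\pi }\,Q)\) over [0, T] when \(R'(0+)=-1/Q\); the function name is hypothetical.

```python
import math

def durbin_approximation(h, M, L, Q):
    """Sketch of the Durbin-type approximation for a constant threshold h."""
    T = M / L                                     # time horizon for small Q
    density = h / (Q * math.sqrt(2 * math.pi)) * math.exp(-h * h / 2)
    # the approximate first-passage density does not depend on t,
    # so integrating over [0, T] is a multiplication by T
    return T * density
```

The returned value grows linearly in T and is not constrained to lie below 1, so it is only meaningful as a probability when h is large relative to T.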
4 Simulation study
In this section we study the quality of approximations for the BCP \(P_{M, h} (\zeta _n)\) defined in (6), where \(\varepsilon _j\) are normal r.v.’s with mean 0 and variance 1. Asymptotically (for large L and M), the approximations we study can also be used for the BCP connected to the weighted sum of squares discussed in Sect. 2.2 and therefore for setting significance levels for the SSA change-point statistic defined in Moskvina and Zhigljavsky (2003).
In Figs. 2, 3, 4, 5, and 6, the ‘Sum of normal’ line corresponds to the empirical value of (6) computed from 100,000 simulations with different values of L, Q and M. In simulations leading to Figs. 2, 3, and 4 the value of Q can be considered as large and hence we compare Approximations 1–3. In Fig. 5 we present analysis demonstrating the lack of accuracy of Approximations 1–3 when Q is small. We then analyse the performance of the Durbin approximation in Fig. 6, which is constructed specifically under the assumption that Q is small; in this case we set \(Q=1\). We observe that for large L and Q Approximation 3 is typically superior to Approximations 1 and 2 for all h (note that Approximations 1 and 3 coincide for large values of h). Listed in Tables 1, 2, 3, and 4 are the approximated threshold values h (for Approximations 1 and 2 only) for a specified true BCP, when this BCP is small enough. In these tables, R.E. denotes the relative error.
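An empirical BCP of this kind can be estimated by Monte Carlo along the lines of the following sketch. It is not the authors' code: it assumes standard normal inputs, the weights \(w_{L,Q}(t)=\min (t,Q,L+Q-t)\), \(t=1,\ldots ,L+Q-1\), and a default of 2000 replications chosen only to keep the run short; the function name `empirical_bcp` is hypothetical.

```python
import numpy as np

def empirical_bcp(h, M, L, Q, n_sim=2000, seed=0):
    """Monte Carlo estimate of P(max_{n=0..M} zeta_n >= h) for N(0,1) inputs."""
    rng = np.random.default_rng(seed)
    t = np.arange(1, L + Q)
    w = np.minimum(np.minimum(t, Q), L + Q - t).astype(float)
    w /= np.sqrt(np.dot(w, w))    # unit-variance sums, i.e. standardized zeta_n
    n_eps = M + L + Q - 1         # inputs needed for zeta_0, ..., zeta_M
    hits = 0
    for _ in range(n_sim):
        eps = rng.standard_normal(n_eps)
        zeta = np.correlate(eps, w, mode="valid")   # M + 1 values
        hits += int(zeta.max() >= h)
    return hits / n_sim
```

Because the inputs have mean zero, dividing the weights by the square root of their sum of squares standardizes the sums directly, so no separate centring step is needed.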
As seen in Fig. 2 and Table 1, for the chosen parameters Approximation 2 is generally poor; for small BCP we see particularly high relative errors in Table 1. On the other hand, Approximation 1 performs well for small BCP and, although discrepancies can be seen for small h, we see that Approximation 3 performs quite well across all values of h.
As shown in Fig. 3 and Table 2, Approximation 2, whilst still being considerably worse than Approximations 1 and 3, shows signs of improvement with this choice of L and Q. At the BCP of 0.05, Approximation 1 produces the lowest relative error with the parameter choices considered so far.
As shown in Fig. 4 and Table 3, we see a considerable improvement in Approximation 2 with the increase in M from 1000 to 2000; however, Approximation 3 still remains far superior. For this larger M, Approximation 1 shows the smallest relative error at a BCP of 0.05, which is arguably the most important case.
We shall now consider the performance of Approximations 1–3 for small Q. We conclude that all three approximations perform poorly when Q is not large enough (of order L).
As can be seen from Fig. 5 and Table 4, all three approximations are poor for \(Q=5\). Relative errors are high and thus the use of these approximations for the case of small Q and large L cannot be justified.
For checking the quality of the Durbin approximation we used the same settings as for the Approximations 1, 2 and 3. In Fig. 6, we show results for the Durbin approximation for a few particular values of L and Q.
We can conclude that the quality of the Durbin approximation (16) is poor unless the threshold h is very large. This is seen graphically in Fig. 6 as well as numerically in Table 5, where there is a sharp increase in the relative error as the BCP increases. For the BCP of 0.05 the relative error for the Durbin approximation is higher than all relative errors of Approximation 1 considered in this paper.
5 Conclusion
A number of approximations of boundary crossing probabilities for the maximum of moving weighted sums of i.i.d. random variables have been considered. The particular weights are obtained from a trapezoidal weight function that has important links to the SSA change-point detection algorithm described in Moskvina and Zhigljavsky (2003). We have seen that Approximations 1–3 perform rather well for large Q and L, with Approximation 3 consistently outperforming Approximations 1 and 2 across all values of the threshold h. The case of small Q must be considered separately since Approximations 1–3 perform poorly there. The Durbin approximation, developed for small Q, is not satisfactory unless the threshold h is very large.
References
Cramér H (1965) A limit theorem for the maximum values of certain stochastic processes. Theory Probab Appl 10(1):126–128
Durbin J (1985) The first-passage density of a continuous Gaussian process to a general boundary. J Appl Probab 22:99–122
Golyandina N, Zhigljavsky A (2013) Singular spectrum analysis for time series. Springer briefs in statistics. Springer, Berlin
Golyandina N, Nekrutkin V, Zhigljavsky AA (2001) Analysis of time series structure: SSA and related techniques, monographs on statistics and applied probability, vol 90. Chapman & Hall, London
Leadbetter MR, Lindgren G, Rootzén H (1983) Extremes and related properties of random sequences and processes, vol 21. Springer, New York
Moskvina V, Zhigljavsky A (2003) An algorithm based on singular spectrum analysis for change-point detection. Commun Stat 32(2):319–352
Priestley MB (1981) Spectral analysis and time series. Academic Press, London
Acknowledgements
The authors are grateful to both referees for their constructive comments.
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
About this article
Cite this article
Noonan, J., Zhigljavsky, A. Approximations of the boundary crossing probabilities for the maximum of moving weighted sums. Stat Papers 59, 1325–1337 (2018). https://doi.org/10.1007/s00362-018-1015-z
DOI: https://doi.org/10.1007/s00362-018-1015-z