1 Introduction

In change point detection, the null hypothesis is typically stationarity, but there are different types of alternatives, such as at most one change point or multiple change points. In this article, we are interested in testing stationarity against the so-called epidemic change or changed segment alternative: We have a random sample \(X_1, X_2, \ldots , X_n\) (with values in a sample space \(({\mathbb {S}}, {\mathcal {S}})\) and distributions \(P_{X_1}, P_{X_2}, \ldots , P_{X_n}\)) and we wish to test the null hypothesis

$$\begin{aligned} H_0: P_{X_1} = P_{X_2}= \cdots = P_{X_n}, \end{aligned}$$

versus the alternative

$$\begin{aligned} H_1: \ \,&\text {there is a segment}\ \ I^*:=\{k^*+1, \ldots , m^*\}\subset I_n:=\{1, 2, \ldots , n\}\ \ \text {such that}\\&P_{X_i}={\left\{ \begin{array}{ll} P \ \ &{}\text {for} \ \ i\in I_n{\setminus } I^*\\ Q \ \ &{}\text {for}\ \ i\in I^*, \end{array}\right. }\ \ \text {and}\ \ P\not =Q. \end{aligned}$$

Under \(H_1\) the sample \((X_i, i\in I^*)\) constitutes a changed segment starting at \(k^*\) and having length \(\ell ^*=m^*-k^*\), and Q is the corresponding distribution in the changed segment. This type of alternative is of special relevance in epidemiology and was first studied by Levin and Kline (1985) in the case of a change in mean. Their test statistic is a generalization of the CUSUM (cumulative sum) statistic. Simultaneously, epidemic-type models were introduced by Commenges et al. (1986) in connection with experimental neurophysiology.

If the changed segment is rather short compared to the sample size, tests that give higher weight to short segments have more power. Asymptotic critical values for such tests have been derived by Siegmund (1988) in the Gaussian case [see also Siegmund and Venkatraman (1995)]. The logarithmic case was treated in Kabluchko and Wang (2014), and the regularly varying case in Mikosch and Račkauskas (2010). Yao (1993) and Hušková (1995) compared tests with different weightings. Račkauskas and Suquet (2004, 2007) have suggested a compromise weighting that allows the limit distribution of the test statistic to be expressed as a function of a Brownian motion. However, in order to apply the continuous mapping theorem to this statistic, it is necessary to establish the weak convergence of the partial sum process to a Brownian motion with respect to the Hölder norm.

It is well known that the CUSUM statistic is sensitive to outliers in the data, see e.g. Prášková and Chochola (2014). The problem becomes worse if higher weights are given to shorter segments. A common strategy to obtain a robust change point test is to adapt robust two-sample tests such as the Wilcoxon test. This was first used by Darkhovsky (1976) and by Pettitt (1979) in the context of detecting at most one change in a sequence of independent observations. For a comparison of different change point tests see Wolfe and Schechtman (1984). The results on the Wilcoxon type change point statistic were generalized to long range dependent time series by Dehling et al. (2013). The Wilcoxon statistic can be expressed either as a rank statistic or as a (two-sample) U-statistic. This motivated Csörgő and Horváth (1989) to study more general U-statistics for change point detection, followed by Ferger (1994) and Gombay (2001). Orasch (2004) and Döring (2010) have studied U-statistics for detecting multiple change-points in a sequence of independent observations. Results for change point tests based on general two-sample U-statistics were given by Dehling et al. (2015) for short range dependent time series and by Dehling et al. (2017) for long range dependent time series. Betken (2016) has suggested a self-normalized change-point test based on the Wilcoxon statistic. By using self-normalization, it is possible to avoid the estimation of unknown parameters in the limit distribution.

Gombay (1994) has suggested using a Wilcoxon type test also for the epidemic change problem. The aim of this paper is to generalize these results in three aspects: to study more general U-statistics, to allow the random variables to exhibit some form of short range dependence, and to introduce weightings to the statistic. This way, we obtain a robust test which still has good power for detecting short changed segments. To obtain asymptotic critical values, we will prove a functional central limit theorem for U-processes in Hölder spaces.

The article is organized as follows. Section 2 introduces U-statistic type test statistics to deal with the epidemic change point problem. In Sect. 3 some experimental results are presented and discussed, whereas Sect. 4 deals with a concrete data set. Sections 5 and 6 constitute the theoretical part of the paper, where asymptotic results are established under the null hypothesis. Consistency under the alternative of a changed segment is discussed in Sect. 7. Finally, in Sect. 8, we present a table of asymptotic critical values for the tests under consideration.

2 Tests for changed segment based on U-statistics

A general approach for constructing procedures to detect a changed segment is to use a measure of heterogeneity \(\Delta _n(k,m)\) between two segments

$$\begin{aligned} \{X_i, i\in I(k,m)\}\ \ \text {and}\ \ \{X_i, i\in I^c(k,m)\}, \ \ 0\le k<m\le n, \end{aligned}$$

where \(I(k,m)=\{k+1, \ldots , m\}\) and \(I^c(k, m)=I_n{\setminus } I(k, m)\). As neither the beginning \(k^*\) nor the end \(m^*\) of the changed segment is known, the statistic

$$\begin{aligned} T_n:= \max _{0\le k<m\le n}\frac{1}{\rho _n(m-k)}\Delta _n(k, m) \end{aligned}$$

may be used to test for the presence of a changed segment in the sample \((X_i)\), where \(\rho _n(m-k)\) is a factor damping the influence of either too short or too long data windows. In this paper we consider a class of U-statistic type measures of heterogeneity \(\Delta _n(k, m)\), defined via a measurable function \(h:{\mathbb {S}}\times {\mathbb {S}}\rightarrow {\mathbb {R}}\) by

$$\begin{aligned} \Delta _n(k, m)=\Delta _{h, n}(k, m):=\sum _{i\in I(k, m)}\sum _{j\in I_n{\setminus } I(k, m)}h(X_i, X_j), \end{aligned}$$

and the corresponding test statistics

$$\begin{aligned} T_{n}(\gamma , h)=\max _{0\le k<m\le n}\frac{|\Delta _{h, n}(k, m)|}{\rho _\gamma ((m-k)/n)}, \end{aligned}$$
(1)

where \(0\le \gamma <1/2\) and

$$\begin{aligned} \rho _{\gamma }(t)=[t(1-t)]^{\gamma }, \ 0<t<1. \end{aligned}$$

Although other weighting functions are possible, our choice is limited by the application of a functional central limit theorem in Hölder spaces.

Recall that the kernel h is symmetric if \(h(x, y)=h(y, x)\) and antisymmetric if \(h(x, y)=-h(y, x)\) for all \(x, y\in {\mathbb {S}}\). Any non-symmetric kernel h can be antisymmetrized by considering

$$\begin{aligned} \widetilde{h}(x, y)=h(x, y)-h(y, x), x, y\in {\mathbb {S}}. \end{aligned}$$

Note that the kernel h is antisymmetric if and only if \(E[h(X,Y)]=0\) for all pairs of independent, identically distributed random variables X, Y for which the expectation exists. That antisymmetry implies \(E[h(X,Y)]=0\) follows from Fubini's theorem together with antisymmetry, since then \(E[h(X,Y)]=E[h(Y,X)]=-E[h(X,Y)]\). For the converse, first consider the one point distribution \(X=x\) and \(Y=x\) almost surely to conclude that \(h(x,x)=0\) for all x. Next, consider the two point distribution \(P(X=x)=P(X=y)=1/2\) and conclude that \(0=E[h(X,Y)]=(h(x,x)+h(y,y)+h(x,y)+h(y,x))/4\) and thus \(h(x,y)=-h(y,x)\). So a U-statistic with an antisymmetric kernel has expectation 0 if the observations are independent and identically distributed, which makes such kernels good candidates for change point tests. We only consider antisymmetric kernels in this paper.

In the case of a real valued sample, examples of antisymmetric kernels include the CUSUM kernel \(h_C(x,y)=x-y\) and the Wilcoxon kernel \(h_W(x,y)=\varvec{1}_{\{x<y\}}-\varvec{1}_{\{y<x\}}\). The kernel \(h_W\) leads to Wilcoxon type statistics

$$\begin{aligned} T_n(\gamma , h_W):=\max _{0\le k<m\le n}\frac{1}{\rho _\gamma ((m-k)/n)}\Big |\sum _{i\in I(k, m)}\sum _{j\in I_n{\setminus } I(k, m)}\Big [\varvec{1}_{\{X_i<X_j\}}-\varvec{1}_{\{X_j< X_i\}}\Big ]\Big | \end{aligned}$$

whereas with the kernel \(h_C\) we get CUSUM type statistics

$$\begin{aligned} n^{-1}T_n(\gamma , h_C)=\max _{0\le k<m\le n}\frac{1}{\rho _\gamma ((m-k)/n)}\Big |\sum _{i=k+1}^m[X_i-\overline{X}_n]\Big |, \end{aligned}$$

where \(\overline{X}_n:=n^{-1}\sum _{i=1}^n X_i\). More general classes of kernels and corresponding statistics arise from the CUSUM test of transformed data (\(h(x, y):=\psi (x)-\psi (y)\)) or from tests based on two-sample M-estimators (\(h(x,y)=\psi (x-y)\) for some monotone function \(\psi \); see Dehling et al. 2017).
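To illustrate how \(T_{n}(\gamma , h)\) can be evaluated, here is a minimal Python sketch (the function and kernel names are ours, not from any package). It uses the fact that for an antisymmetric kernel the double sum over pairs inside the segment vanishes, so \(\Delta _{h, n}(k, m)=\sum _{i=k+1}^m r_i\) with \(r_i=\sum _{j=1}^n h(X_i, X_j)\), which reduces the maximization to increments of a single cumulative sum:

```python
import numpy as np

def epidemic_stat(x, h, gamma):
    """T_n(gamma, h) from (1) for an antisymmetric kernel h.

    Antisymmetry makes the within-segment double sum vanish, so
    Delta_{h,n}(k, m) = S_m - S_k with S_k = sum_{i<=k} r_i and
    r_i = sum_j h(X_i, X_j); the maximization is then O(n^2).
    """
    x = np.asarray(x, dtype=float)
    n = len(x)
    r = h(x[:, None], x[None, :]).sum(axis=1)   # r_i = sum_j h(X_i, X_j)
    S = np.concatenate([[0.0], np.cumsum(r)])   # S_0, S_1, ..., S_n
    best = 0.0
    for k in range(n):
        for m in range(k + 1, n + 1):
            t = (m - k) / n
            rho = (t * (1 - t)) ** gamma if t < 1 else 1.0  # rho_gamma(t)
            best = max(best, abs(S[m] - S[k]) / rho)
    return best

h_W = lambda xi, xj: np.sign(xj - xi)   # Wilcoxon kernel 1{x<y} - 1{y<x}
h_C = lambda xi, xj: xi - xj            # CUSUM kernel
```

The guard at \(t=1\) is harmless: for an antisymmetric kernel \(\Delta _{h,n}(0, n)=0\) anyway, since the complement of the segment is empty.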

Based on invariance principles in Hölder spaces discussed in the next section, we derive the limit distribution of test statistics \(T_n(\gamma , h)\). Theorems 1 and 2 provide examples of our results. Let \(W=(W(t), t\ge 0)\) be a standard Wiener process and \(B=(B(t), 0\le t\le 1)\) be a corresponding Brownian bridge. Define for \(0\le \gamma <1/2\),

$$\begin{aligned} T_{\gamma }:=\sup _{0\le s<t\le 1}\frac{|B(t)-B(s)|}{\rho _\gamma (t-s)}. \end{aligned}$$

Theorem 1

If \((X_i)_{i\in {\mathbb {N}}}\) are independent and identically distributed random elements and h is an antisymmetric kernel with \(E[|h(X_1,X_2)|^{p}]<\infty \) for some \(p>2\), then for any \(\gamma <(p-2)/2p\), we have

$$\begin{aligned} \lim _{n\rightarrow \infty }P(n^{-3/2}\sigma _{h}^{-1}T_n(\gamma , h)\le x)=P(T_\gamma \le x),\ \ \text {for all}\ \ x\in {\mathbb {R}}, \end{aligned}$$

where the variance parameter \(\sigma _h\) is defined by \(\sigma _h^2=\mathrm {var}(h_1(X_1))\) with \(h_1(x)=E[h(x,X_1)]\).

Note that in practice, the random variables \(X_i\) might not have high moments, but if we use a bounded kernel like \(h_W\), the moment condition of the theorem holds for every \(p\in (2,\infty )\), so we have the convergence for any \(\gamma <1/2\). Also, in practical applications, the variance parameter has to be estimated. This can be done by

$$\begin{aligned} {\hat{\sigma }}^2_{n,h}:=\frac{1}{n}\sum _{i=1}^n {\hat{h}}_1^2(X_i) \end{aligned}$$
(2)

with \({\hat{h}}_1(x)=n^{-1}\sum _{j=1}^nh(x,X_j)\).
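As a minimal illustration (function names are ours), estimator (2) can be computed as follows:

```python
import numpy as np

def sigma2_hat(x, h):
    """Estimator (2): hat h_1(X_i) = n^{-1} sum_j h(X_i, X_j), then the
    mean of their squares. For an antisymmetric kernel the hat h_1(X_i)
    sum to zero, so no further centering is needed."""
    x = np.asarray(x, dtype=float)
    h1 = h(x[:, None], x[None, :]).mean(axis=1)   # hat h_1(X_i)
    return float(np.mean(h1 ** 2))

h_W = lambda xi, xj: np.sign(xj - xi)   # Wilcoxon kernel from Sect. 2
```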

For the case of a dependent sample, we consider absolutely regular sequences of random elements (also called \(\beta \)-mixing). Recall that the coefficients of absolute regularity \((\beta _m)_{m\in {\mathbb {N}}}\) are defined by

$$\begin{aligned} \beta _m=E\sup _{A\in {\mathcal {F}}_{m}^\infty }\left( P(A|{\mathcal {F}}_{-\infty }^{0})-P(A)\right) , \end{aligned}$$

where \({\mathcal {F}}_a^b:=\sigma (X_a,X_{a+1},\ldots ,X_b)\) is the \(\sigma \)-field generated by \(X_a,X_{a+1},\ldots ,X_b\).

Theorem 2

Let \((X_i)_{i\in {\mathbb {N}}}\) be a stationary, absolutely regular sequence and h be an antisymmetric kernel, and assume that the following conditions are satisfied:

  1. (i)

    \(\sup _{i,j\in {\mathbb {N}}}E|h(X_i,X_j)|^{q}<\infty \) for some \(q>2\);

  2. (ii)

    \(\sum _{k=1} ^{\infty }k\beta ^{1-2/q}_k<\infty \) and \(\sum _k k^{r/2-1}\beta ^{1-r/q}_k<\infty \) for some \(2<r<q\).

Then for any \(0\le \gamma <1/2-1/r\), we have

$$\begin{aligned} \lim _{n\rightarrow \infty }P(n^{-3/2}\sigma _{\infty }^{-1}T_n(\gamma , h)\le x)=P(T_\gamma \le x),\ \ \text {for all}\ \ x\in {\mathbb {R}}, \end{aligned}$$

where the long run variance parameter \(\sigma _\infty \) is given by

$$\begin{aligned} \sigma _\infty ^2=\mathrm {var}\big (h_1(X_1)\big )+2\sum _{k=2}^\infty {\text {cov}}\big (h_1(X_1),h_1(X_k)\big ) \end{aligned}$$

For a bounded kernel h, condition (ii) on the decay of the coefficients of absolute regularity reduces to

  1. (ii’)

    \(\sum _k \max \{k, k^{r/2-1}\}\beta _k<\infty \) for some \(r>2\).

Following Vogel and Wendler (2017), \(\sigma ^2_\infty \) can be estimated using a kernel variance estimator. For this, define autocovariance estimators \({\hat{\rho }}(k)\) by

$$\begin{aligned} {\hat{\rho }}(k)=\frac{1}{n}\sum _{i=1}^{n-k}{\hat{h}}_1(X_i){\hat{h}}_1(X_{i+k}) \end{aligned}$$

with \({\hat{h}}_1(x)=n^{-1}\sum _{j=1}^nh(x,X_j)\). Then, for some Lipschitz continuous function K with \(K(0)=1\) and finite integral, we set

$$\begin{aligned} {\hat{\sigma }}_{\infty }^2={\hat{\sigma }}_h^2+2\sum _{k=1}^{n-1}K(k/b_n){\hat{\rho }}(k), \end{aligned}$$

where \(b_n\) is a bandwidth such that \(b_n\rightarrow \infty \) and \(b_n/\sqrt{n}\rightarrow 0\) as \(n\rightarrow \infty \).
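A sketch of this kernel variance estimator (the Bartlett kernel serves here only as one admissible example of K, being Lipschitz with \(K(0)=1\) and finite integral; all names are ours):

```python
import numpy as np

def sigma2_infty_hat(x, h, b_n, K=lambda u: max(0.0, 1.0 - abs(u))):
    """Long-run variance estimator: hat sigma_h^2 plus the K-weighted
    autocovariances hat rho(k) of the hat h_1(X_i), as defined above."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    h1 = h(x[:, None], x[None, :]).mean(axis=1)   # hat h_1(X_i)
    s2 = np.mean(h1 ** 2)                         # hat sigma_h^2 of (2)
    for k in range(1, n):
        rho_k = (h1[: n - k] @ h1[k:]) / n        # hat rho(k), note 1/n
        s2 += 2.0 * K(k / b_n) * rho_k
    return float(s2)

h_C = lambda xi, xj: xi - xj   # CUSUM kernel
```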

With the help of the limit distribution and the variance estimators, we obtain critical values for our test statistic. Simulated quantiles for the limit distribution can be found in Sect. 8.
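Quantiles of the limit variable \(T_\gamma \) can be approximated by Monte Carlo with the Brownian bridge discretized on a grid. The following rough sketch (names are ours; the discretization truncates the supremum, so the resulting quantiles are biased slightly downwards) indicates the idea:

```python
import numpy as np

def simulate_T_gamma(gamma, n_grid=100, n_rep=2000, seed=1):
    """Monte Carlo replicates of T_gamma = sup |B(t)-B(s)| / rho_gamma(t-s),
    with the Brownian bridge B evaluated on a grid of n_grid + 1 points."""
    rng = np.random.default_rng(seed)
    u = np.arange(n_grid + 1) / n_grid                  # grid 0, 1/n, ..., 1
    out = np.empty(n_rep)
    for rep in range(n_rep):
        w = np.cumsum(rng.standard_normal(n_grid)) / np.sqrt(n_grid)
        b = np.concatenate([[0.0], w - u[1:] * w[-1]])  # bridge values
        diff = np.abs(b[None, :] - b[:, None])          # |B(t) - B(s)|
        d = np.abs(u[None, :] - u[:, None])             # |t - s|
        mask = (d > 0) & (d < 1)                        # avoid rho = 0
        rho = np.ones_like(d)
        rho[mask] = (d[mask] * (1.0 - d[mask])) ** gamma
        out[rep] = np.where(mask, diff / rho, 0.0).max()
    return out

# e.g. np.quantile(simulate_T_gamma(0.1), 0.95) approximates a 95% value
```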

To discuss the behavior of the test statistics \(T_{n}(\gamma , h)\) under the alternative we assume that for each \(n\ge 1\) we have two probability measures \(P_n\) and \(Q_n\) on \(({\mathbb {S}}, {\mathcal {S}})\) and a random sample \((X_{ni})_{1\le i\le n}\) such that for \(k^*_n, \ell ^*_n\in \{1, \ldots , n\}\),

$$\begin{aligned} P_{X_{ni}}={\left\{ \begin{array}{ll} Q_n, \ \, &{}\text {for}\ \ i\in I^*:=\{k^*_n+1, \ldots , k^*_n+\ell ^*_n\}\\ P_n, \ \, &{}\text {for}\ \ i\in I_n{\setminus } I^*. \end{array}\right. } \end{aligned}$$

Set

$$\begin{aligned} \delta _n=\int _{{\mathbb {S}}}\int _{{\mathbb {S}}} h(x, y)Q_n(dx)P_n(dy),\ \ \nu _n=\int _{{\mathbb {S}}}\int _{{\mathbb {S}}} (h(x, y)-\delta _n)^2Q_n(dx)P_n(dy). \end{aligned}$$

Theorem 3

Let \(0\le \gamma <1\). Assume that for all \(n\in {\mathbb {N}}\), the random variables \(X_{n1},\ldots ,X_{nn}\) are independent and let h be an antisymmetric kernel.

If

$$\begin{aligned} \lim _{n\rightarrow \infty }\sqrt{n}\left| \delta _n\right| \Big [\frac{\ell ^*_n}{n}\Big (1-\frac{\ell ^*_n}{n}\Big )\Big ]^{1-\gamma }=\infty \ \ \text {and}\ \ \sup _n \Big [\frac{\ell ^*_n}{n}\Big (1-\frac{\ell ^*_n}{n}\Big )\Big ]^{1-2\gamma }\nu _n<\infty , \end{aligned}$$
(3)

then

$$\begin{aligned} n^{-3/2}T_n(\gamma , h)\xrightarrow [n\rightarrow \infty ]{\mathrm {P}}\infty . \end{aligned}$$
(4)

For dependent random variables, we get a similar theorem:

Theorem 4

Assume that for all \(n\in {\mathbb {N}}\), the random variables \(X_{n1},\ldots ,X_{nn}\) are absolutely regular with mixing coefficients \((\beta _k)_{k\in {\mathbb {N}}}\) not depending on n, such that \(\sum _{k=1}^{\infty }k^{q/2}\beta ^{1/2-1/q}_k<\infty \) for some \(q>2\). Let h be an antisymmetric kernel such that there exists \(C_q<\infty \) with \(E[|h(X_{ni},X_{nj})|^{q}]\le C_q\) for all \(n\in {\mathbb {N}}\), \(i,j\le n\). Furthermore, let \(0\le \gamma <1\) and assume that

$$\begin{aligned} \lim _{n\rightarrow \infty }\sqrt{n}\left| \delta _n\right| \Big [\frac{\ell ^*_n}{n}\Big (1-\frac{\ell ^*_n}{n}\Big )\Big ]^{1-\gamma }=\infty . \end{aligned}$$
(5)

Then (4) holds.

This implies that a test based on the statistic \(T_n(\gamma , h)\) is consistent. For more on consistency, see Sect. 7. The proofs of Theorems 1 and 2 are given in Sect. 6.

3 Simulation results

We compare the CUSUM type and the Wilcoxon type test statistic in a Monte Carlo simulation study. The model is an autoregressive process \((Y_n)_{n\in {\mathbb {N}}}\) of order 1 with \(Y_i=aY_{i-1}+\epsilon _i\), where the \((\epsilon _i)_{i\in {\mathbb {N}}}\) are either normally distributed, exponentially distributed or \(t_5\)-distributed. We assume that the first L observations are shifted, so that we observe

$$\begin{aligned} X_i:={\left\{ \begin{array}{ll}Y_i/\sqrt{\mathrm {var}(Y_i)}+\delta _n\, &{}\text {for }i=1,\ldots ,L\\ Y_i/\sqrt{\mathrm {var}(Y_i)} \, &{}\text {for }i=L+1,\ldots ,n\end{array}\right. } \end{aligned}$$
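The data-generating mechanism above can be sketched as follows (Gaussian innovations shown; the function name and the burn-in length are our choices, and the burn-in only approximates stationarity):

```python
import numpy as np

def sample_with_changed_segment(n, L, a, delta, rng, burn=100):
    """AR(1) sample Y_i = a*Y_{i-1} + eps_i, standardized to unit
    variance via var(Y_i) = 1/(1 - a^2), with the first L observations
    shifted by delta."""
    eps = rng.standard_normal(n + burn)
    y = np.empty(n + burn)
    y[0] = eps[0]
    for i in range(1, n + burn):
        y[i] = a * y[i - 1] + eps[i]
    x = y[burn:] * np.sqrt(1.0 - a ** 2)   # divide by sqrt(var(Y_i))
    x[:L] += delta                         # shift the changed segment
    return x
```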

Under independence, the distribution of the change-point statistics does not depend on the beginning of the changed segment, only on its length. In Table 1, we show some simulation results comparing the power for a changed segment at the beginning of the data and in the middle for a dependent sequence (autoregressive parameter \(a=0.5\)). The rejection frequencies do not differ much, so we restrict further simulations to segments of the form \(I^*=\{1,\ldots ,L\}\).

Table 1 Empirical rejection frequency under the alternative for an AR(1)-process of length \(N=480\) with AR-parameter 0.5 and \(t_5\)-distributed innovations, changed segment from 1 to 160 or from 161 to 320, change height \(\delta _n=0.58\), level \(\alpha =5\%\)
Fig. 1

Rejection frequency under the alternative versus rejection frequency under the hypothesis for \(N=240\) independent observations using the true variance, normal (upper panels), exponential (middle panels) or \(t_5\) distribution (lower panels) with changed segment of length \(L=80\) and height \(\delta _n=0.58\) (left panels), changed segment of length \(L=30\) and height \(\delta _n=0.78\) (right panels), for the CUSUM type test (open circle) and for the Wilcoxon type test (open diamond) with \(\gamma =0.1\) (solid line) or \(\gamma =0.3\) (dashed line)

Fig. 2

Rejection frequency under the alternative versus rejection frequency under the hypothesis for an AR(1)-process of length \(N=480\) using an estimated variance (fixed bandwidth \(b_n=4\)), normal (upper panels), exponential (middle panels) or \(t_5\) distribution (lower panels) with changed segment of length \(L=160\) and height \(\delta _n=0.58\) (left panels), changed segment of length \(L=60\) and height \(\delta _n=0.78\) (right panels), for the CUSUM type test (open circle) and for the Wilcoxon type test (open diamond) with \(\gamma =0.1\) (solid line) or \(\gamma =0.3\) (dashed line)

In Fig. 1, the results for \(n=240\) independent observations (\(a=0\)) are shown. In this case, we use the known variance of our observations and do not estimate the variance. The relative rejection frequency of 3000 simulation runs under the alternative is plotted against the relative rejection frequency under the hypothesis for theoretical significance levels of 1%, 2.5%, 5% and 10%. As expected, the CUSUM test has a better performance than the Wilcoxon test for normally distributed data. For the exponential and the \(t_5\) distribution, the Wilcoxon type test has higher power. For the long changed segment (\(L=80\)), the weighted tests with \(\gamma =0.1\) outperform the tests with \(\gamma =0.3\). For the short changed segment (\(L=30\)), the Wilcoxon type test has more power with weight \(\gamma =0.3\). The same holds for the CUSUM type test under normality. For the other two distributions, however, the empirical size is also higher for \(\gamma =0.3\), so that the size corrected power is not improved.

In Fig. 2, we show the results for \(n=480\) dependent observations (AR(1) with \(a=0.5\)). In this case, we estimated the long run variance with a kernel estimator, using the quadratic spectral kernel and the fixed bandwidth \(b=4\). Both tests become too liberal now, with typical rejection rates of 13% to 15% for a theoretical level of 10%. For the long changed segment (\(L=160\)) it is better to use the weight \(\gamma =0.1\), for the short segment (\(L=60\)) the weight \(\gamma =0.3\). Under normality, the CUSUM type test has a better performance, though the difference in power is not very large. For the other two distributions, the Wilcoxon type test has better power.

In practice, the strength of dependence is usually not known beforehand, so it would make sense to use a data-adaptive bandwidth for the variance estimation. However, the bias of the variance estimator under the alternative might get worse for data-adaptive bandwidths, and this might lead to a nonmonotonic power of change-point tests, see e.g. Vogelsang (1999) or Shao and Zhang (2010). For this reason, we propose to estimate the variance in the following way: Split the data set into five shorter parts of equal length and use a variance estimator with data-adaptive bandwidth separately for each of the parts. Then take the median of the five estimators for standardizing the test statistic. The beginning and the end of the changed segment will affect at most two of the parts, so at least three estimates remain unaffected. In the simulations in Fig. 3, we study again an AR(1)-process and use the standard setting of the R function lvar for the data-adaptive choice of the bandwidths in the five parts. With this method, we do not observe a loss of power compared to the fixed bandwidth. Under the hypothesis, all tests become strongly oversized. The Wilcoxon type test statistic clearly outperforms the CUSUM type statistic for non-normal innovations.
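The proposed splitting rule can be sketched as follows (self-contained; a fixed-bandwidth Bartlett-type long-run variance estimator stands in for the data-adaptive estimator used in the simulations, and all names are ours):

```python
import numpy as np

def split_median_variance(x, h, b_n, parts=5):
    """Split the sample into `parts` equal-length pieces, estimate the
    long-run variance on each piece, and return the median: a changed
    segment intersects at most two pieces, so at least three of the
    estimates remain unaffected by it."""
    def lrv(y):
        n = len(y)
        h1 = h(y[:, None], y[None, :]).mean(axis=1)   # hat h_1 on the piece
        s2 = np.mean(h1 ** 2)
        for k in range(1, n):
            w = max(0.0, 1.0 - k / b_n)               # Bartlett weight
            s2 += 2.0 * w * (h1[: n - k] @ h1[k:]) / n
        return s2
    pieces = np.array_split(np.asarray(x, dtype=float), parts)
    return float(np.median([lrv(p) for p in pieces]))

h_C = lambda xi, xj: xi - xj   # CUSUM kernel
```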

Fig. 3

Rejection frequency under alternative versus rejection frequency under the hypothesis for an AR(1)-process of length \(N=480\) using the median of five variance estimates with data-adaptive bandwidth, normal (upper panels), exponential (middle panels) or \(t_5\) distribution (lower panels) with changed segment of length \(L=160\) and height \(\delta _n=0.58\) (left panels), length \(L=60\) and height \(\delta _n=0.78\) (right panels), for the CUSUM type test (open circle) and for the Wilcoxon type test (open diamond) with \(\gamma =0.1\) (solid line) or \(\gamma =0.3\) (dashed line)

Another problem in many practical applications is the unknown length of the changed segment, which makes it difficult to choose the value \(\gamma \in [0,1/2)\) that achieves the optimal power. If there is no a-priori knowledge of the typical length of an epidemic change, it would also be possible to use the maximum of (suitably standardized) test statistics for different values of \(\gamma \). Another straightforward application of Theorem 15 leads to the asymptotic distribution of this combined test statistic and critical values could be obtained via simulations, but this goes beyond the scope of this paper.

4 Data example

We investigate the frequency of searches for the term ‘Harry Potter’ from January 2004 until February 2019, obtained from Google Trends. The time series is plotted in Fig. 4. We apply the CUSUM type and the Wilcoxon type change-point test with weight parameters \(\gamma \in \{0,0.1,\ldots ,0.4\}\). The lag one autocovariance is estimated as 0.457, so that we have to allow for dependence in our testing procedure. We estimate the long run variance with a kernel estimator, using the quadratic spectral kernel and the fixed bandwidth \(b=4\).

The CUSUM type test does not reject the hypothesis of stationarity for a significance level of 5%, regardless of the choice of \(\gamma \). In contrast, the Wilcoxon type test detects a changed segment for any \(\gamma \in \{0,0.1,\ldots ,0.4\}\), even at a significance level of 1%. The beginning and end of the changed segment are estimated differently for different values of \(\gamma \): The unweighted Wilcoxon type test with \(\gamma =0\) leads to a segment from January 2008 to June 2016. For \(\gamma =0.1,0.2,0.3\), we obtain January 2012 to June 2016 as an estimate. \(\gamma =0.4\) leads to an estimated changed segment from January 2012 to May 2016.

By visual inspection of the time series, we come to the conclusion that the estimated changed segment for values \(\gamma \ge 0.1\) fits the data better, because this segment coincides with a period of only low search frequencies. Furthermore, the spikes of this time series can be explained by the release of movies, and the estimated changed segment lies between the release of the last Harry Potter movie in July 2011 and the release of ‘Fantastic Beasts and Where to Find Them’ in November 2016.

Fig. 4

Frequency of search queries for ‘Harry Potter’ obtained from Google Trends: the CUSUM type statistic does not detect a change for any \(\gamma \in \{0,0.1,0.2,0.3,0.4\}\). The changed segment detected by the Wilcoxon type statistic with \(\gamma =0\) is indicated by a blue solid line, the changed segment detected for \(\gamma \in \{0.1,0.2,0.3\}\) by a red dashed line

5 Double partial sum process

Throughout this section we assume that the sequence \((X_i)\) is stationary and \(P_X:=P_{X_i}\) is the distribution of each \(X_i\). Consider for a kernel \(h:{\mathbb {S}}\times {\mathbb {S}}\rightarrow {\mathbb {R}}\) the double partial sums

$$\begin{aligned} U_{h,0}=U_{h,n}=0, \ \ U_{h,k}=\sum _{i=1}^k\sum _{j=k+1}^n h(X_i, X_j),\ \ 1\le k<n \end{aligned}$$

and the corresponding polygonal line process \({\mathcal {U}}_{h,n}=({\mathcal {U}}_{h,n}(t), t\in [0, 1])\) defined by

$$\begin{aligned} {\mathcal {U}}_{h,n}(t):=U_{h,\lfloor nt \rfloor }+(nt-\lfloor nt \rfloor )(U_{h,\lfloor nt \rfloor +1}-U_{h,\lfloor nt \rfloor }),\ \ t\in [0, 1], \end{aligned}$$
(6)

where for a real number \(a\ge 0\), \(\lfloor a\rfloor :=\max \{k:\, k\in {\mathbb {N}},\,k\le a\}\), \({\mathbb {N}}=\{0,1,\ldots \}\), denotes the floor function. So \({\mathcal {U}}_{h,n}=({\mathcal {U}}_{h,n}(t), t\in [0, 1])\) is a random polygonal line with vertices \((k/n, U_{h, k})\), \(k=0, 1, \ldots , n\). As a functional framework for the process \({\mathcal {U}}_{h,n}\) we consider Banach spaces of Hölder functions. Recall that the space C[0, 1] of continuous functions on [0, 1] is endowed with the norm

$$\begin{aligned} ||x||=\max _{0\le t\le 1}|x(t)|. \end{aligned}$$

The Hölder space \({\mathcal {H}}_\gamma ^{o}[0, 1], 0\le \gamma <1\), of functions \(x\in C[0, 1]\) such that

$$\begin{aligned} \omega _\gamma (x, \delta ):=\sup _{0<|s-t|\le \delta }\frac{|x(t)-x(s)|}{|t-s|^\gamma }\rightarrow 0\ \ \text {as}\ \ \delta \rightarrow 0, \end{aligned}$$

is endowed with the norm

$$\begin{aligned} ||x||_\gamma :=|x(0)|+\omega _\gamma (x, 1). \end{aligned}$$

Both C[0, 1] and \({\mathcal {H}}^o_\gamma [0, 1]\) are separable Banach spaces. The space \({\mathcal {H}}^o_0[0, 1]\) is isomorphic to C[0, 1].

Definition 5

For a kernel h and a number \(0\le \gamma <1\) we say that \((X_i)\) satisfies \((h, \gamma )\)-FCLT if there is a Gaussian process \({\mathcal {U}}_h=({\mathcal {U}}_h(t), t\in [0, 1])\in {\mathcal {H}}^o_\gamma [0, 1]\), such that

$$\begin{aligned} n^{-3/2}{\mathcal {U}}_{h, n}\xrightarrow [n\rightarrow \infty ]{{\mathcal {D}}}{\mathcal {U}}_h \ \ \text {in the space}\ \ {\mathcal {H}}_\gamma ^o[0, 1]. \end{aligned}$$

In order to make use of results for partial sum processes, we decompose the U-statistics into a linear part and a so-called degenerate part. Hoeffding’s decomposition of the kernel h reads

$$\begin{aligned} h(x, y)=h_1(x) - h_1(y) + g(x, y),\ \ x, y\in {\mathbb {S}}, \end{aligned}$$

where

$$\begin{aligned} h_1(x)=\int _{{\mathbb {S}}} h(x, y)P_X(\mathrm {d}y),\ \ \text {and}\ \ g(x, y)=h(x, y)-h_1(x)+h_1(y), \ \ x, y\in {\mathbb {S}}, \end{aligned}$$

and leads to the splitting

$$\begin{aligned} {\mathcal {U}}_{h, n}(t)=n[W_{h_1,n}(t)-tW_{h_1, n}(1)]+{\mathcal {U}}_{g, n}(t),\ \ t\in [0, 1], \end{aligned}$$
(7)

where

$$\begin{aligned} W_{h_1,n}(t)=\sum _{i=1}^{\lfloor nt \rfloor }h_1(X_i)+ (nt-\lfloor nt \rfloor )h_1(X_{\lfloor nt \rfloor +1}),\ \ t\in [0, 1], \end{aligned}$$

is the polygonal line process defined by the partial sums of the random variables \((h_1(X_i))\). Decomposition (7) reduces the \((h, \gamma )\)-FCLT to a Hölderian invariance principle for the random variables \((h_1(X_i))\) via the following lemma.
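Indeed, at the vertex points \(t=k/n\) the splitting (7) can be checked directly from Hoeffding's decomposition:

$$\begin{aligned} U_{h, k}&=\sum _{i=1}^{k}\sum _{j=k+1}^{n}\big [h_1(X_i)-h_1(X_j)+g(X_i, X_j)\big ]\\ &=(n-k)\sum _{i=1}^{k}h_1(X_i)-k\sum _{j=k+1}^{n}h_1(X_j)+U_{g, k}\\ &=n\Big [\sum _{i=1}^{k}h_1(X_i)-\frac{k}{n}\sum _{i=1}^{n}h_1(X_i)\Big ]+U_{g, k}\\ &=n\big [W_{h_1, n}(k/n)-(k/n)W_{h_1, n}(1)\big ]+U_{g, k}, \end{aligned}$$

and (7) then extends to all \(t\in [0, 1]\), since both sides are polygonal lines with vertices at the points \(k/n\).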

Lemma 6

If there exists a constant \(C>0\) such that for any integers \(0\le k<m\le n\)

$$\begin{aligned} E(U_{g,m}-U_{g,k})^2\le C(m-k)(n-(m-k)) \end{aligned}$$
(8)

then

$$\begin{aligned} ||n^{-3/2}{\mathcal {U}}_{g, n}||_{\gamma }=o_P(1) \end{aligned}$$

for any \(0\le \gamma <1/2\).

Remark 7

For an antisymmetric kernel h the condition (8) follows from the following one: there exists a constant \(C>0\) such that for any \(0 \le m_1< n_1 \le m_2 < n_2\),

$$\begin{aligned} E\left( \sum ^{n_1}_{i=m_1+1}\sum ^{n_2}_{j=m_2+1}g(X_i, X_j)\right) ^2\le C(n_1 - m_1)(n_2 - m_2). \end{aligned}$$
(9)

Indeed, by antisymmetry

$$\begin{aligned} U_{g,m}-U_{g,k}=\sum _{i=k+1}^m\sum _{j=m+1}^n h(X_i, X_j)+\sum _{i=k+1}^m\sum _{j=1}^k h(X_i, X_j), \end{aligned}$$

so that (9) yields

$$\begin{aligned}&E(U_{g,m}-U_{g,k})^2\\&\quad \le 2C[(m-k)(n-m)+(m-k)(k-1)]\le 2C(m-k)(n-(m-k)). \end{aligned}$$

Before we proceed with the proof of Lemma 6 we need some preparation. Let \(D_j\) be the set of dyadic numbers of level j in [0, 1], that is \(D_0 :=\{0,1\}\) and for \(j\ge 1\), \(D_j:= \bigl \{(2l-1)2^{-j};\;1\le l \le 2^{j-1}\bigr \}\). For \(r\in D_j\) set \(r^-:=r-2^{-j}\), \(r^+:=r+2^{-j}\), \(j\ge 0\). For \(f:[0,1]\rightarrow {\mathbb {R}}\) and \(r\in D_j\) define

$$\begin{aligned} \lambda _r(f):= {\left\{ \begin{array}{ll} f(r^+)+f(r^-)-2f(r) &{} \text {if }j \ge 1,\\ f(r) &{}\text {if }j=0. \end{array}\right. } \end{aligned}$$

The sequential norm on \({\mathcal {H}}^o_\gamma [0, 1]\) defined by

$$\begin{aligned} 2^{-1}||f||^{\mathrm {seq}}_\gamma :=\sup _{j\ge 0}2^{\gamma j}\max _{r\in D_j}|\lambda _r(f)|, \end{aligned}$$

is equivalent to the norm \(||f||_\gamma \), see Ciesielski (1960): there is a positive constant \(c_\gamma \) such that

$$\begin{aligned} ||f||^{\mathrm {seq}}_\gamma \le ||f||_\gamma \le c_\gamma ||f||^{\mathrm {seq}}_\gamma ,\ \ \ f\in {\mathcal {H}}^o_\gamma [0, 1]. \end{aligned}$$
(10)

Set \({\mathcal {D}}_j:=\{k2^{-j},\ 0\le k\le 2^j\}\). In what follows, we denote by \(\log \) the logarithm to base 2 (so that \(\log 2 = 1\)).

Lemma 8

For any \(0\le \gamma \le 1\) there is a constant \(c_\gamma >0\) such that, if \(V_n\) is a polygonal line function with vertices \((0, 0), (k/n, V_n(k/n)), k=1, \ldots , n\), then

$$\begin{aligned} ||V_{n}||_\gamma \le c_\gamma \max _{0\le j\le \log n}2^{\gamma j}\max _{r\in {\mathcal {D}}_j}\Big |V_{n}(\lfloor nr+n2^{-j} \rfloor /n)-V_{n}(\lfloor nr \rfloor /n)\Big |. \end{aligned}$$

Proof

First we remark that for any \(j\ge 1\),

$$\begin{aligned} \max _{r\in D_j}|\lambda _r(V_n)| \le \max _{r\in D_j}|V_n(r^+)-V_n(r)| + \max _{r\in D_j}|V_n(r)-V_n(r^-)|. \end{aligned}$$

As \(r^+\) and \(r^-\) belong to \({\mathcal {D}}_j\), this gives,

$$\begin{aligned} \sup _{j\ge 1}2^{\gamma j}\max _{r\in D_j}|\lambda _r(V_n)| \le 2\sup _{j\ge 1}2^{\gamma j}\max _{r\in {\mathcal {D}}_j}|V_n(r+2^{-j})-V_n(r)| \end{aligned}$$

and it follows by (10),

$$\begin{aligned} ||V_n||_\gamma \le 2 c_\gamma \sup _{j\ge 0}2^{\gamma j}\max _{r\in {\mathcal {D}}_j}|V_n(r+2^{-j})-V_n(r)|. \end{aligned}$$

If s and \(t>s\) belong to the same interval, say, \([(k-1)/n, k/n]\), then, observing that the slope of \(V_n\) in this interval is precisely \(n[V_n(k/n)-V_n((k-1)/n)]\), we have

$$\begin{aligned} |V_n(t)-V_n(s)|&= n(t-s)|V_n(k/n)-V_n((k-1)/n)|\le n(t-s)\Delta _n, \end{aligned}$$

where \(\Delta _n=\max _{1\le k\le n}|V_n(k/n)-V_n((k-1)/n)|\). If \(s\in [(k-1)/n, k/n), t\in [k/n, (k+1)/n)\) then

$$\begin{aligned} |V_n(t)-V_n(s)|&\le |V_n(t)-V_n(k/n)|+|V_n(k/n)-V_n(s)| \le n(t-s)\Delta _n. \end{aligned}$$

If \(s\in [(k-1)/n, k/n)\), \(t\in [(j-1)/n, j/n)\) and \(j>k+1\), then

$$\begin{aligned} |V_n(t)-V_n(s)|&\le |V_n(t)-V_n((j-1)/n)|+|V_n(k/n)-V_n((j-1)/n)|\\&\quad +|V_n(k/n)-V_n(s)|\\&\le |V_n(k/n){-}V_n((j-1)/n)|{+}n[(k/n-s)+(t-(j-1)/n)]\Delta _n. \end{aligned}$$

We apply these three configurations to \(s=r\) and \(t=r+2^{-j}\). If \(j\ge \log n\) then only the first two configurations are possible and we deduce

$$\begin{aligned} \max _{j\ge \log n}2^{\gamma j}\max _{r\in {\mathcal {D}}_j}|V_n(r+2^{-j})-V_n(r)| \le \max _{j\ge \log n}2^{\gamma j}n2^{-j}\Delta _n= 2n^\gamma \Delta _n. \end{aligned}$$

If \(j<\log n\) then we apply the third configuration to obtain

$$\begin{aligned} \max _{j<\log n}2^{\gamma j}&\max _{r\in {\mathcal {D}}_j}|V_n(r+2^{-j})-V_n(r)| \\&\le \max _{j< \log n}2^{\gamma j}\max _{r\in {\mathcal {D}}_j}|V_n(\lfloor nr+n2^{-j} \rfloor /n) -V_n(\lfloor nr \rfloor /n)|\\&\quad + 2\max _{j<\log n}2^{\gamma j}n2^{-j}\max _{1\le k\le n}|V_n(k/n)-V_n((k-1)/n)|\\&\le \max _{j< \log n}2^{\gamma j}\max _{r\in {\mathcal {D}}_j}|V_n(\lfloor nr+n2^{-j} \rfloor /n)-V_n(\lfloor nr \rfloor /n)| + 2n^\gamma \Delta _n. \end{aligned}$$

To complete the proof just observe that \(\lfloor nr+n2^{-j} \rfloor =\lfloor nr \rfloor +1\) if \(j=\log n\) and so \(\Delta _n\le \max _{j\le \log n}2^{\gamma j}\max _{r\in {\mathcal {D}}_j}|V_n(\lfloor nr+n2^{-j} \rfloor /n)-V_n(\lfloor nr \rfloor /n)|\). \(\square \)

Proof of Lemma 6

By Lemma 8 we have with some constant \(C>0\),

$$\begin{aligned} E||{\mathcal {U}}_{g,n}||^2_\gamma \le C\sum _{j=0}^{\log n}2^{2\gamma j}2^j\max _{r\in {\mathcal {D}}_j}E\Big ({\mathcal {U}}_{g,n}(\lfloor nr+n2^{-j} \rfloor /n)- {\mathcal {U}}_{g,n}(\lfloor nr \rfloor /n)\Big )^2. \end{aligned}$$

Condition (8) gives

$$\begin{aligned} E\Big ({\mathcal {U}}_{g, n}(m/n)-{\mathcal {U}}_{g, n}(k/n)\Big )^2\le C(m-k)(n-(m-k)). \end{aligned}$$

This yields, taking into account that \(\lfloor nr+n2^{-j} \rfloor -\lfloor nr \rfloor \le n2^{-j}\) for \(r\in {\mathcal {D}}_j\),

$$\begin{aligned} E||n^{-3/2}{\mathcal {U}}_{g,n}||^{2}_\gamma \le C_\gamma n^{-3}\sum _{j=1}^{\log n}2^{2\gamma j} 2^j[n2^{-j}(n-n2^{-j})]\le C_\gamma n^{-1+2\gamma }. \end{aligned}$$

This completes the proof due to the restriction \(0\le \gamma <1/2\). \(\square \)

The next lemma gives general conditions for the tightness of the sequence \((n^{-1/2}W_{h_1, n})\) in Hölder spaces.

Lemma 9

Assume that the sequence \((X_i)_{i\in {\mathbb {N}}}\) is stationary and that for some \(q> 2\) there is a constant \(c_q>0\) such that for any \(0\le k<m\le n\)

$$\begin{aligned} E\Big |\sum _{i=k+1}^m h_1(X_i)\Big |^q\le c_q(m-k)^{q/2}. \end{aligned}$$
(11)

Then for any \(0\le \gamma <1/2-1/q\) the sequence \((n^{-1/2}W_{h_1,n})\) is tight in the space \({\mathcal {H}}^o_{\gamma }[0, 1]\).

Proof

Fix \(\beta >0\) such that \(0\le \gamma<\beta <1/2-1/q\). By the Arzelà–Ascoli theorem the embedding \({\mathcal {H}}^o_\beta [0, 1]\rightarrow {\mathcal {H}}^o_\gamma [0, 1]\) is compact; hence, it is enough to prove

$$\begin{aligned} \lim _{a\rightarrow \infty }\sup _{n\ge 1}P(||n^{-1/2}W_{h_1,n}||_\beta >a)=0. \end{aligned}$$
(12)

By Lemma 8,

$$\begin{aligned} P(||n^{-1/2}W_{h_1,n}||_\beta >a)\le I_n(a), \end{aligned}$$

where

$$\begin{aligned} I_{n}(a)=P\Big (\max _{0\le j\le \log n}2^{\beta j}\max _{r\in {\mathcal {D}}_j}\Big |W_{h_1,n}(\lfloor nr+n2^{-j} \rfloor /n)-W_{h_1,n}(\lfloor nr \rfloor /n)\Big |\ge c_\beta n^{1/2}a\Big ) \end{aligned}$$

with some constant \(c_\beta >0\). Since \(\lfloor nr+n2^{-j} \rfloor -\lfloor nr \rfloor \le n2^{-j}\) we have by condition (11),

$$\begin{aligned} I_n(a)&\le cn^{-q/2}a^{-q}\sum _{j=1}^{\log n}2^{q\beta j}2^j\max _{r\in {\mathcal {D}}_j}E\Big |W_{h_1, n}(\lfloor nr+n2^{-j} \rfloor /n)-W_{h_1, n}(\lfloor nr \rfloor /n)\Big |^q\\&=cn^{-q/2}a^{-q}\sum _{j=1}^{\log n}2^{q\beta j}2^j\max _{r\in {\mathcal {D}}_j}E\Big |\sum ^{\lfloor nr+n2^{-j} \rfloor }_{i=\lfloor nr \rfloor +1}h_1(X_i)\Big |^q\\&\le cn^{-q/2}a^{-q}\sum _{j=1}^{\log n}2^{q\beta j}2^j (n2^{-j})^{q/2}\\&\le ca^{-q}\sum _{j=1}^{\log n} 2^{-j(q/2-q\beta -1)}, \end{aligned}$$

with some constant \(c>0\). Since \(q/2-q\beta -1>0\), we obtain \(I_n(a)\le ca^{-q}\), which completes the proof of (12) and of the lemma. \(\square \)

Summing up we have the following functional limit theorem for the process \({\mathcal {U}}_{h,n}\).

Theorem 10

Assume that \((X_i)\) is a stationary sequence of \({\mathbb {S}}\)-valued random elements. Let h be an antisymmetric kernel and \(E|h(X_1, X_2)|^p<\infty \) for some \(p>2\). If

  1. (i)

    there is a constant \(C>0\) such that for any \(0\le m_1<n_1\le m_2<n_2\) the inequality (9) is satisfied;

  2. (ii)

    for some \(2<q\le p\) the inequality (11) is satisfied;

  3. (iii)

    there is a Gaussian process \({\mathcal {U}}_h\) such that

    $$\begin{aligned} n^{-1/2}W_{h_1,n}\xrightarrow [n\rightarrow \infty ]{\mathrm {fdd}}{\mathcal {U}}_h, \end{aligned}$$

then

$$\begin{aligned} n^{-3/2}{\mathcal {U}}_{h,n}\xrightarrow {{\mathcal {D}}}{\mathcal {U}}^o_h\ \ \text {in the space}\ \ {\mathcal {H}}^o_\gamma [0, 1] \end{aligned}$$

for any \(0\le \gamma <1/2-1/q,\) where \({\mathcal {U}}_h^o=({\mathcal {U}}_h(t)-t{\mathcal {U}}_h(1), t\in [0, 1])\).

5.1 iid sample

In this subsection we establish the \((h, \gamma )\)-FCLT for independent identically distributed sequences \((X_i)_{i\in {\mathbb {N}}}\).

Theorem 11

Assume that \((X_i)\) are independent and identically distributed random elements in \({\mathbb {S}}\) and the measurable function \(h:{\mathbb {S}}\times {\mathbb {S}}\rightarrow {\mathbb {R}}\) is antisymmetric. If \(E|h(X_1, X_2)|^q<\infty \) for some \(q>2\), then \((X_i)\) satisfies the \((h, \gamma )\)-FCLT for any \(0\le \gamma <1/2-1/q\) with the limit process \({\mathcal {U}}_{h}=\sigma _{h_1} B\), where \(B=(B(t), t\in [0, 1])\) is a standard Brownian bridge.

Particularly, if the kernel h is antisymmetric and bounded, then \((X_i)\) satisfies \((h, \gamma )\)-FCLT for any \(0\le \gamma <1/2\).

Proof

We need to check conditions (i)–(iii) of Theorem 10. Starting with (i) we have

$$\begin{aligned} E\left( \sum ^{n_1}_{i=m_1+1}\sum ^{n_2}_{j=m_2+1}g(X_i, X_j)\right) ^2 =\sum _{i, i'=m_1+1}^{n_1}\sum _{j, j'=m_2+1}^{n_2}Eg(X_i, X_j)g(X_{i'}, X_{j'}) \end{aligned}$$

and observe that \(Eg(X_i, X_j)g(X_{i'}, X_{j'})=0\) if either \(i\not =i'\) or \(j\not =j'\). Indeed, it is enough to observe that \(Eg(X_1, x)=0\) for each x:

$$\begin{aligned} Eg(X_1, x)&=E[h(X_1, x)-h_1(X_1)+h_1(x)]\\&=E[-h(x, X_1)-h_1(X_1)+h_1(x)]\\&=Eh_1(X_1)=0. \end{aligned}$$

Now, if \(i\not =i'\), \(j=j'\) then we have

$$\begin{aligned} Eg(X_i, X_j)g(X_{i'}, X_{j'})=\int _{{\mathbb {S}}}Eg(X_i, x)Eg(X_{i'}, x)P_X(dx)=0. \end{aligned}$$

Hence,

$$\begin{aligned} E\left( \sum ^{n_1}_{i=m_1+1}\sum ^{n_2}_{j=m_2+1}g(X_i, X_j)\right) ^2&= \sum ^{n_1}_{i=m_1+1}\sum ^{n_2}_{j=m_2+1}Eg^2(X_i, X_j)\\&=(n_1-m_1)(n_2-m_2)Eg^2(X_1, X_2)\\&\le 4(n_1-m_1)(n_2-m_2)Eh^2(X_1, X_2). \end{aligned}$$

Condition (ii) is obtained via Rosenthal’s inequality. Since the moment assumption gives \(E|h_1(X_1)|^q=E[|E[h(X_1, X_2)|X_1]|^q]\le E|h(X_1, X_2)|^q<\infty \) we have

$$\begin{aligned} E\Big |\sum _{i=k+1}^m h_1(X_i)\Big |^q&\le c_q\Big [\Big (\sum _{i=k+1}^m Eh_1^2(X_i)\Big )^{q/2}+\sum _{i=k+1}^m E|h_1(X_i)|^q\Big ]\\&\le 2c_q(m-k)^{q/2}E|h_1(X_1)|^q. \end{aligned}$$

As the convergence \(n^{-1/2}W_{h_1, n}\xrightarrow [n\rightarrow \infty ]{\mathrm {fdd}}\sigma _{h_1}W \) is well known, the proof is completed. \(\square \)
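As a quick illustration of Theorem 11 (not from the paper; the kernel \(h(x,y)=y-x\), the Gaussian data, and all parameter choices are assumptions made here), note that for this kernel \({\mathcal {U}}_{h,n}(k/n)=kS_n-nS_k\) with \(S_k=X_1+\cdots +X_k\), so the statistic reduces to a weighted CUSUM; the sketch compares its size under the null with an epidemic change:

```python
import random

random.seed(5)

def stat(x, gamma=0.25):
    """n^(-3/2) T_n(gamma, h) for h(x, y) = y - x, using U(k) = k*S_n - n*S_k."""
    n = len(x)
    S = [0.0]
    for v in x:
        S.append(S[-1] + v)
    U = [k * S[n] - n * S[k] for k in range(n + 1)]
    best = 0.0
    for k in range(n):
        for m in range(k + 1, n + 1):
            u = (m - k) / n
            if u < 1.0:   # skip (0, n), where weight and increment both vanish
                best = max(best, abs(U[m] - U[k]) / (u * (1 - u)) ** gamma)
    return best / n ** 1.5

n = 200
null_sample = [random.gauss(0.0, 1.0) for _ in range(n)]   # H0: iid N(0, 1)
alt_sample = [v + (3.0 if 100 <= i < 150 else 0.0)         # changed segment
              for i, v in enumerate(null_sample)]
null_stat, alt_stat = stat(null_sample), stat(alt_sample)
print("H0:", null_stat, " changed segment:", alt_stat)
```

Under the mean shift on the segment the statistic is far larger, in line with the consistency results of Sect. 7.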

5.2 Mixing sample

In this subsection we establish the \((h, \gamma )\)-FCLT for \(\beta \)-mixing sequences \((X_i)_{i\in {\mathbb {N}}}\). For \(A\subset {\mathbb {Z}}\) we will denote by \(P_A\) the joint distribution of \(\{X_i, i\in A\}\). We write \(P_X\) for the distribution of \(X_i\). We need some auxiliary lemmas:

Lemma 12

Let \(i_1<i_2<\cdots <i_k\) be arbitrary integers. Let \(f:{\mathbb {S}}^k\rightarrow {\mathbb {R}}\) be a measurable function such that for any j, \(1\le j\le k-1\),

$$\begin{aligned} \int _{{\mathbb {S}}^k} |f|^{1+\delta } d\Big [P_{X_{i_1},\ldots , X_{i_k}}+P_{X_{i_1}, \ldots , X_{i_j}}\otimes P_{X_{i_{j+1}}, \ldots , X_{i_k}}\Big ]<M, \end{aligned}$$

for some \(\delta >0\). Then

$$\begin{aligned} \Big |\int _{{\mathbb {S}}^k}f \ d\Big (P_{X_{i_1},\ldots , X_{i_k}}-P_{X_{i_1}, \ldots , X_{i_j}}\otimes P_{X_{i_{j+1}}, \ldots , X_{i_k}}\Big )\Big |\le 4M^{1/(1+\delta )}\beta ^{\delta /(1+\delta )}_{i_{j+1}-i_j}. \end{aligned}$$

Proof

The proof goes along the lines of the proof of Lemma 1 in Yoshihara (1976).

\(\square \)

Lemma 13

Assume that for some \(\delta >0\) there is a constant M such that

$$\begin{aligned} E|h(X_i, X_j)|^{2(1+\delta )}\le M \end{aligned}$$

for any \(1\le i, j\le n\) and

$$\begin{aligned} \sum _{k=0}^\infty k\beta ^{\delta /(1+\delta )}_k<\infty . \end{aligned}$$

Then for any \(0\le m_1<n_1\le m_2<n_2\),

$$\begin{aligned} I(m_1, n_1, m_2,n_2):=E\left( \sum _{i=m_1+1}^{n_1}\sum _{j=m_2+1}^{n_2}g(X_i, X_j)\right) ^2\le C(n_1-m_1)(n_2-m_2). \end{aligned}$$

Proof

We have

$$\begin{aligned} I(m_1, n_1, m_2,n_2)=\sum _{i_1, i_2=m_1+1}^{n_1}\sum _{j_1, j_2=m_2+1}^{n_2}J(i_1, i_2, j_1, j_2), \end{aligned}$$

where

$$\begin{aligned} J(i_1, i_2, j_1, j_2)=Eg(X_{i_1}, X_{j_1})g(X_{i_2}, X_{j_2}). \end{aligned}$$

First consider the case where \(i_1<i_2\) and \(j_1<j_2\). If \(j_2-j_1\ge i_2-i_1\) then by Lemma 12 we have

$$\begin{aligned} \Big |J(i_1, i_2, j_1, j_2)-\int _{{\mathbb {S}}^4}g(x_1, x_2)g(x_3, x_4)dP_{X_{i_1}, X_{i_2}, X_{j_1}}\otimes P_{X_{j_2}}\Big |\le 4M^{1/(1+\delta )}\beta ^{\delta /(1+\delta )}_{j_2-j_1}. \end{aligned}$$

If \(i_2-i_1>j_2-j_1\) then

$$\begin{aligned} \Big |J(i_1, i_2, j_1, j_2)-\int _{{\mathbb {S}}^4}g(x_1, x_2)g(x_3, x_4)dP_{X_{i_1}}\otimes P_{X_{i_2}, X_{j_1}, X_{j_2}}\Big |\le 4M^{1/(1+\delta )}\beta ^{\delta /(1+\delta )}_{i_2-i_1}. \end{aligned}$$

Note that for any \(y\in {\mathbb {S}}\),

$$\begin{aligned} \int _{{\mathbb {S}}} g(y, x)P_{X_{j_2}}(dx)=\int _{{\mathbb {S}}} [h(y,x)-(h_1(y)-h_1(x))]P_{X_{j_2}}(dx)=0 \end{aligned}$$

and

$$\begin{aligned} \int _{{\mathbb {S}}} g(x, y)P_{X_{i_1}}(dx)=\int _{{\mathbb {S}}} [h(x,y)-(h_1(x)-h_1(y))]P_{X_{i_1}}(dx)=0. \end{aligned}$$

Treating the other cases in the same way, we deduce that for any \(m_1<i_1,i_2\le n_1\le m_2<j_1,j_2\le n_2\),

$$\begin{aligned} |J(i_1, i_2, j_1, j_2)|\le 4M^{1/(1+\delta )}\beta ^{\delta /(1+\delta )}_{\max \{|i_2-i_1|,|j_2-j_1|\}}. \end{aligned}$$

This yields

$$\begin{aligned} \left| I(m_1, n_1, m_2,n_2)\right| \le C\sum _{i_1, i_2=m_1+1}^{n_1}\sum _{j_1, j_2=m_2+1}^{n_2}\beta ^{\delta /(1+\delta )}_{\max \{|i_2-i_1|,|j_2-j_1|\}}. \end{aligned}$$

If \(k:=\max \{|i_2-i_1|,|j_2-j_1|\}=|i_2-i_1|\), then there are fewer than \(n_1-m_1\) choices for \(i_1\) and at most 2 choices for \(i_2\), as \(i_2\in \{i_1-k,i_1+k\}\). Furthermore, there are fewer than \(n_2-m_2\) choices for \(j_1\) and, because \(|j_2-j_1|\le k\), at most \(2k+1\) choices for \(j_2\). In the case \(k=|j_2-j_1|\), a similar reasoning applies. In total, for a given \(k\ge 1\), there are fewer than \(12(n_1-m_1)(n_2-m_2)k\) ways to choose the indices. We arrive at

$$\begin{aligned} \left| I(m_1, n_1, m_2,n_2)\right| \le C(n_1-m_1)(n_2-m_2)\sum _{k=0}^\infty k\beta ^{\delta /(1+\delta )}_k=C(n_1-m_1)(n_2-m_2) \end{aligned}$$

provided that \(\sum _k k\beta ^{\delta /(1+\delta )}_k<\infty \).

\(\square \)
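The index-counting step can be checked by brute force. In the following sketch (illustrative; the geometric coefficients \(\beta _k=\rho ^k\) and the block length are assumptions made here) the fourfold sum of the mixing coefficient indexed by the larger of the two differences stays of the order of the product of the block lengths:

```python
rho, N = 0.5, 30          # beta_k = rho^k; two index blocks of length N each

# fourfold sum of beta indexed by the larger gap max(|i2-i1|, |j2-j1|)
total = sum(rho ** max(abs(i2 - i1), abs(j2 - j1))
            for i1 in range(N) for i2 in range(N)
            for j1 in range(N) for j2 in range(N))

# k = 0 contributes N^2 quadruples; each k >= 1 fewer than 12*N^2*k of them
bound = N * N * (1 + 12 * sum(k * rho ** k for k in range(1, 300)))
print(total, bound)
```

Summability of \(k\beta _k\) is exactly what keeps the bound proportional to \(N^2\) rather than a higher power of N.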

Lemma 14

Assume that

$$\begin{aligned} \int _{{\mathbb {S}}}\left| \int _{{\mathbb {S}}}h(x, y)P_X(dy)\right| ^{r+\delta }P_X(dx)<\infty \end{aligned}$$

for some \(r>2\) and \(\delta >0\). If

$$\begin{aligned} \sum _k k^{r/2-1}\beta ^{\delta /(r+\delta )}_k<\infty \end{aligned}$$

then there is a constant \(c_{r, \delta }>0\) such that for any \(0\le k<m\le n\),

$$\begin{aligned} E\Big |\sum _{i=k+1}^m h_1(X_i)\Big |^r\le c_{r, \delta } (m-k)^{r/2}. \end{aligned}$$

Proof

This lemma is proved in Yokoyama (1980) for real valued strongly mixing random variables. We only need to note that if \((X_i)\) is \(\beta \)-mixing then \((h_1(X_i))\) is \(\beta \)-mixing as well for any measurable \(h_1:{\mathbb {S}}\rightarrow {\mathbb {R}}\), and such a sequence is in particular strongly mixing. \(\square \)

Theorem 15

Assume that \((X_i)\) is a strictly stationary \(\beta \)-mixing sequence of random elements in \({\mathbb {S}}\) and the measurable function \(h:{\mathbb {S}}\times {\mathbb {S}}\rightarrow {\mathbb {R}}\) is antisymmetric. If \(E|h(X_1, X_2)|^{q}<\infty \) and

$$\begin{aligned} \sum _k k\beta ^{1-2/q}_k<\infty ,\ \ \sum _k k^{r/2-1}\beta ^{1-r/q}_k<\infty , \end{aligned}$$
(13)

for some \(q>2\) and \(2<r<q\), then \((X_i)\) satisfies the \((h, \gamma )\)-FCLT for any \(0\le \gamma <1/2-1/r\) with the limit process \({\mathcal {U}}_h=\sigma _\infty B\), where \(B=(B(t), t\in [0, 1])\) is a standard Brownian bridge and

$$\begin{aligned} \sigma ^2_\infty =\mathrm {var}\big (h_1(X_1)\big )+2\sum _{k=2}^\infty {\text {cov}}\big (h_1(X_1),h_1(X_k)\big ). \end{aligned}$$

Particularly, if the kernel h is antisymmetric and bounded then condition (13) becomes \(\sum _k k^{r/2-1}\beta _k<\infty \), and in this case \((X_i)\) satisfies the \((h, \gamma )\)-FCLT for any \(0\le \gamma <1/2-1/r\).

Proof

We need to check conditions (i)–(iii) of Theorem 10. First we check (i) using Lemma 13 with \(\delta =(q-2)/2\). Condition (ii) follows immediately from Lemma 14. Finally, the convergence of the finite dimensional distributions can be deduced from invariance principles for \(\alpha \)-mixing sequences proved by a number of authors (see, e.g., Herrndorf 1985 and references therein). \(\square \)
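In applications the long-run variance \(\sigma ^2_\infty \) must be estimated. The following Monte Carlo sketch (the AR(1) model, the choice \(h_1(x)=-x\) coming from the kernel \(h(x,y)=y-x\), and the Bartlett truncation are all illustrative assumptions, not part of the paper) estimates it from a single trajectory:

```python
import random

random.seed(1)
phi, n, bandwidth = 0.5, 20000, 50

# AR(1): X_i = phi * X_{i-1} + eps_i, a beta-mixing stationary sequence
x = [0.0]
for _ in range(n):
    x.append(phi * x[-1] + random.gauss(0.0, 1.0))
x = x[1:]

h1 = [-v for v in x]          # h1 for the kernel h(x, y) = y - x (mean-zero data)
mean = sum(h1) / n

def autocov(lag):
    return sum((h1[i] - mean) * (h1[i + lag] - mean) for i in range(n - lag)) / n

# Bartlett-weighted truncation of var(h1(X_1)) + 2 * sum of lag covariances
sigma2 = autocov(0) + 2 * sum((1 - k / (bandwidth + 1)) * autocov(k)
                              for k in range(1, bandwidth + 1))
print(sigma2)   # the theoretical value is 1/(1-phi)^2 = 4 for phi = 0.5
```

The truncated, downweighted lag sum is a standard device for making the estimate of the infinite series stable.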

6 Asymptotic distribution under null

In the following, we show how the asymptotic behaviour of the statistic \(T_{n}(\gamma , h)\) follows from the functional limit results for U-processes:

Theorem 16

Let \(0\le \gamma <1/2\) and let the kernel \(h:{\mathbb {S}}\times {\mathbb {S}}\rightarrow {\mathbb {R}}\) be antisymmetric. Assume that \((X_i)\) is a stationary sequence and satisfies \((h, \gamma )\)-FCLT with the limit process \({\mathcal {U}}_h\). Then

$$\begin{aligned} n^{-3/2}T_n(\gamma , h)\xrightarrow [n\rightarrow \infty ]{{\mathcal {D}}}T_{\gamma , h}:=\sup _{0\le s<t\le 1} \frac{|{\mathcal {U}}_h(t)-{\mathcal {U}}_h(s)|}{[(t-s)(1-(t-s))]^\gamma }. \end{aligned}$$

Proof

Set for \(f\in {\mathcal {H}}^o_\gamma [0, 1]\), and \(0\le s<t\le 1\),

$$\begin{aligned} I(f; s, t) := \frac{|f(t) - f(s) - (t - s)(f(1)-f(0))|}{\rho _\gamma (t - s)}. \end{aligned}$$

Consider the functions

$$\begin{aligned} F_n(f):=\max _{0\le k<m\le n}I(f; k/n, m/n),\ \ \text {and}\ \ F(f):=\sup _{0\le s<t\le 1}I(f; s, t),\ \ f\in {\mathcal {H}}^o_\gamma [0, 1]. \end{aligned}$$

Since \({\mathcal {U}}_h(0)={\mathcal {U}}_h(1)\) we see that \(F({\mathcal {U}}_h)=T_{\gamma , h}\). Due to the antisymmetry of h we have, for any \(0\le k<m\le n\),

$$\begin{aligned} {\mathcal {U}}_{h, n}(m/n)-{\mathcal {U}}_{h, n}(k/n)&=\sum _{i=1}^m\sum _{j=m+1}^n h(X_i, X_j)-\sum _{i=1}^k\sum _{j=k+1}^n h(X_i, X_j)\\&=\sum _{i=k+1}^m\sum _{j=m+1}^n h(X_i, X_j){+}\sum _{i=1}^k\Big [\sum _{j=m+1}^n-\sum _{j=k+1}^n\Big ]h(X_i, X_j)\\&= \sum _{i=k+1}^m\sum _{j=m+1}^nh(X_i, X_j)-\sum _{i=1}^k\sum _{j=k+1}^m h(X_i, X_j)\\&=\Delta _{h,n}(k, m). \end{aligned}$$

Hence, \(F_n(n^{-3/2}{\mathcal {U}}_{h,n})=n^{-3/2}T_n(\gamma , h)\). We prove next that

$$\begin{aligned} F_n(n^{-3/2}{\mathcal {U}}_{h,n}(\cdot ))=F(n^{-3/2}{\mathcal {U}}_{h,n}(\cdot ))+o_P(1). \end{aligned}$$
(14)

To this end we apply the following simple lemma (for the proof, see Lemma 13 in Račkauskas and Suquet (2004)).

Lemma 17

Let \((\eta _n)_{n\ge 1}\) be a tight sequence of random elements in the separable Banach space \({\mathbb {B}}\) and \(g_n\), g be continuous functionals \({\mathbb {B}}\rightarrow {\mathbb {R}}\). Assume that \(g_n\) converges pointwise to g on \({\mathbb {B}}\) and that \((g_n)_{n\ge 1}\) is equicontinuous. Then

$$\begin{aligned} g_n(\eta _n) = g(\eta _n) + o_P(1). \end{aligned}$$

We check the continuity of the function F first. If \(t-s\le 1/2\), then \(\rho _\gamma (t-s)\ge 2^{-\gamma }(t-s)^\gamma \), and this yields

$$\begin{aligned} I(f; s,t)&\le 2^\gamma \sup _{0\le s<t\le 1}\frac{|f(t)-f(s)-(t-s)(f(1)-f(0))|}{(t-s)^\gamma } \\&\le 2^{1+\gamma }||f||_\gamma . \end{aligned}$$

If \(t-s>1/2\) then \(1-(t-s)>1-t\) and \(1-(t-s)>s\). This yields

$$\begin{aligned} I(f; s,t)&\le 2^\gamma \left\{ \frac{|f(t)-f(1)|}{(1-t)^\gamma }+\frac{|f(0)-f(s)|}{s^\gamma } +\frac{(1-(t-s))|f(1)-f(0)|}{(1-(t-s))^\gamma }\right\} \\&\le 3\cdot 2^\gamma ||f||_\gamma . \end{aligned}$$

Hence, \(F(f)\le 6||f||_\gamma \) and this yields the continuity, since the inequality \(|F(f)-F(g)|\le F(f-g)\) is easily checked. Similarly we have \(|F_n(f)-F_n(g)|\le F_n(f-g)\le 3\cdot 2^\gamma ||f-g||_\gamma \); therefore the sequence \((F_n)\) is equicontinuous on \({\mathcal {H}}^o_\gamma [0, 1]\). To check the pointwise convergence of \(F_n\) to F on \({\mathcal {H}}^o_\gamma [0, 1]\), it is enough to show that for each \(f\in {\mathcal {H}}^o_\gamma [0, 1]\) the function \((s, t) \mapsto I (f; s, t)\) can be extended by continuity to the compact set \(T = \{(s, t)\in [0, 1]^2, 0\le s\le t\le 1\}\). As above we get, for \(t-s<1/2\), \(I(f; s, t)\le 2^\gamma \omega _\gamma (f; t-s)+ 2^\gamma |f(1)-f(0)|(t-s)^{1-\gamma }\), which allows the continuous extension along the diagonal by putting \(I(f; s, s):= 0\). If \(t-s>1/2\) we get \(I(f; s, t)\le 2^\gamma \omega _\gamma (f; 1-(t-s))+2^\gamma |f(1)-f(0)|(1-(t-s))^{1-\gamma }\), which allows the continuous extension at the point (0, 1) by putting \(I(f; 0, 1):= 0\).

The pointwise convergence of \((F_n)\) being now established, and observing that by the \((h, \gamma )\)-FCLT the sequence \(n^{-3/2}{\mathcal {U}}_{h,n}\) is tight, Lemma 17 gives (14). Since F is continuous, the continuous mapping theorem together with the \((h, \gamma )\)-FCLT yields

$$\begin{aligned} F(n^{-3/2}{\mathcal {U}}_{h,n}(\cdot ))\xrightarrow [n\rightarrow \infty ]{{\mathcal {D}}}F({\mathcal {U}}_h)= T_{\gamma , h}. \end{aligned}$$

By (14) we get

$$\begin{aligned} n^{-3/2}T_{n}(\gamma , h)=F_n(n^{-3/2}{\mathcal {U}}_{h,n})\xrightarrow [n\rightarrow \infty ]{{\mathcal {D}}}T_{\gamma , h}. \end{aligned}$$

This completes the proof. \(\square \)
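The index manipulation behind \(\Delta _{h,n}(k, m)\) used in this proof can be verified by brute force. The following sketch (illustrative; the particular antisymmetric kernel and the sample are arbitrary choices made here) checks the identity for all pairs \(k<m\):

```python
import random

random.seed(2)
n = 12
X = [random.random() for _ in range(n)]

def h(x, y):                       # an arbitrary antisymmetric kernel
    return (x - y) ** 3 + (x - y)

def U(k):                          # U_{h,n}(k/n) = sum_{i<=k} sum_{j>k} h(X_i, X_j)
    return sum(h(X[i], X[j]) for i in range(k) for j in range(k, n))

for k in range(n):
    for m in range(k + 1, n + 1):
        lhs = U(m) - U(k)
        rhs = (sum(h(X[i], X[j]) for i in range(k, m) for j in range(m, n))
               - sum(h(X[i], X[j]) for i in range(k) for j in range(k, m)))
        assert abs(lhs - rhs) < 1e-9
print("identity verified for all 0 <= k < m <=", n)
```

The identity is pure index algebra, so the check passes up to floating-point rounding for any kernel and any data.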

Combining this general result with Theorems 11 and 15 gives the proofs of Theorems 1 and 2, respectively.

7 Behavior under the alternative

To discuss the behaviour of the test statistics \(T_{n}(\gamma , h)\) under the alternative we assume that for each \(n\ge 1\) we have two probability measures \(P_n\) and \(Q_n\) on \(({\mathbb {S}}, {\mathcal {S}})\) and a random sample \((X_{ni})_{1\le i\le n}\) such that for \(k^*_n, m^*_n\in \{1, \ldots , n\}\),

$$\begin{aligned} P_{X_{ni}}={\left\{ \begin{array}{ll} Q_n, \ \, &{}\text {for}\ \ i\in I^*:=\{k^*_n+1, \ldots , m^*_n\}\\ P_n, \ \, &{}\text {for}\ \ i\in I_n{\setminus } I^*. \end{array}\right. } \end{aligned}$$

We will write \(k^*=k_n^*\), \(m^*=m_n^*\) and \(\ell ^*=m^*-k^*\) for short. Set

$$\begin{aligned} \delta _n=\delta (P_n, Q_n)=\int _{{\mathbb {S}}}\int _{{\mathbb {S}}} h(x, y)Q_n(dx)P_n(dy). \end{aligned}$$

Note that \(\delta _n\) measures, in a sense, the difference between the probability distributions \(P_n\) and \(Q_n\). If \(P_n=Q_n\), then \(\delta _n=0\). If \(h(x, y)=h_c(x, y)\) then \(\delta _n=\int x P_n(dx)-\int xQ_n(dx)\). If \(h=h_W\) then \(\delta _n=\int P_n(x)Q_n(dx)-\int Q_n(x)P_n(dx)\). The general consistency result is given in the following elementary lemma.

Lemma 18

If

$$\begin{aligned} \frac{1}{\rho _\gamma (\ell ^*/n)}n^{-3/2}\sum _{i\in I^*}\sum _{j\in I_n{\setminus } I^*}\big [h(X_{ni}, X_{nj})-\delta _n\big ]=O_P(1) \end{aligned}$$
(15)

and

$$\begin{aligned} \sqrt{n}\big |\delta _n\big |\Big [\frac{\ell ^*}{n}\Big (1-\frac{\ell ^*}{n}\Big )\Big ]^{1-\gamma }\rightarrow \infty , \end{aligned}$$
(16)

then

$$\begin{aligned} n^{-3/2}T_n(\gamma , h)\xrightarrow [n\rightarrow \infty ]{\mathrm {P}}\infty . \end{aligned}$$
Table 2 Upper quantiles of \(T_1\) (upper half) and \(T_2\) (lower half)

Proof of Theorem 3

Set for \(i\in I^*, j\in I_n{\setminus } I^*\),

$$\begin{aligned} Z_{ij}=h(X_{ni}, X_{nj})-\delta _n. \end{aligned}$$

Noting that \(EZ_{ij}=0\) and \(EZ^2_{ij}= \nu _n\) for any \(i\in I^*, j\in I_n{\setminus } I^*\), we obtain

$$\begin{aligned} E\Big (\sum _{i\in I^*}\sum _{j\in I_n{\setminus } I^*}Z_{ij}\Big )^2=\sum _{i,i'\in I^*}\sum _{j,j'\in I_n{\setminus } I^*}E(Z_{ij}Z_{i'j'})\le n\ell ^*(n-\ell ^*)\nu _n. \end{aligned}$$

This yields (15) by (3), which completes the proof. \(\square \)
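The quantity \(\delta (P, Q)\) can be estimated from two samples by the corresponding two-sample U-statistic. In the following sketch (illustrative assumptions throughout: the Wilcoxon-type antisymmetric kernel \(h(x,y)={\mathbf {1}}\{x<y\}-{\mathbf {1}}\{y<x\}\), which may differ from the paper's convention for \(h_W\), and Gaussian P, Q), a unit mean shift gives a clearly nonzero \(\delta \):

```python
import random

random.seed(3)
m = 1000
xs = [random.gauss(0.0, 1.0) for _ in range(m)]   # sample from P
ys = [random.gauss(1.0, 1.0) for _ in range(m)]   # sample from Q (shifted)

def h(x, y):
    return (x < y) - (y < x)

# delta = E h(Y, X), Y ~ Q, X ~ P; here P(Y < X) - P(X < Y) = 2*Phi(-1/sqrt(2)) - 1
delta_hat = sum(h(y, x) for y in ys for x in xs) / (m * m)
print(delta_hat)   # population value is about -0.52
```

A nonzero \(\delta _n\) of this size easily satisfies the rate condition (16) for segments of length proportional to n.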

Proof of Theorem 4

We will use a Hoeffding decomposition adjusted to the changing distribution. To this end we define

$$\begin{aligned} h_{1,n}(x)&:=\int _{{\mathbb {S}}}h(x, y)Q_n(dy)-\delta _n,\\ h_{2,n}(y)&:=\int _{{\mathbb {S}}}h(x, y)P_n(dx)-\delta _n,\\ g_n(x, y)&:= h(x,y)-h_{1,n}(x) - h_{2,n}(y) - \delta _n. \end{aligned}$$

Next we show that the following estimates hold with an absolute constant \(C>0\):

$$\begin{aligned}&E\left[ \bigg (\sum _{i=k^*+1}^{k^*+\ell ^*}h_{2,n}(X_{i,n})\bigg )^2\right] \le C\ell ^*, \end{aligned}$$
(17)
$$\begin{aligned}&E\left[ \bigg (\sum _{i=1 }^{k^*} h_{1,n}(X_{i,n})+\sum _{i=k^*+\ell ^*+1 }^n h_{1,n}(X_{i,n})\bigg )^2\right] \le C(n-\ell ^*), \end{aligned}$$
(18)
$$\begin{aligned}&E\left[ \bigg (\sum _{i=1 }^{k^*}\sum _{j=k^*+1}^{k^*+\ell ^*}g_n(X_{i,n},X_{j,n})+\sum _{i=k^*+\ell ^*+1 }^n \sum _{j=k^*+1}^{k^*+\ell ^*}g_n(X_{i,n},X_{j,n})\bigg )^2\right] \nonumber \\&\quad \le C\ell ^*(n-\ell ^*). \end{aligned}$$
(19)

These estimates yield

$$\begin{aligned} E\left( \sum _{i\in I^*}\sum _{j\in I_n{\setminus } I^*}[h(X_{ni}, X_{nj})-\delta _n] \right) ^2\le Cn\ell ^*(n-\ell ^*) \end{aligned}$$

with an absolute constant \(C>0\) and (15) follows by (5). Hence, it remains to prove (17)–(19).

Estimates (17) and (18) follow from Lemma 14, and (19) follows from Lemma 13. \(\square \)

8 Critical values

In Table 2, we give the upper quantiles of the limit distribution of the one-sided and two-sided test statistics, that is,

$$\begin{aligned} T_1&:=\sup _{s,t\in [0,1],s<t}\frac{B(t)-B(s)}{(t-s)^{\gamma }(1-(t-s))^\gamma }\\ T_2&:=\sup _{s,t\in [0,1],s<t}\frac{\big |B(t)-B(s)\big |}{(t-s)^{\gamma }(1-(t-s))^\gamma }, \end{aligned}$$

where B is a standard Brownian bridge. The distribution was evaluated on a grid of size 10,000 and we run a Monte-Carlo-simulation with 30,000 runs.