A weighted U-statistic based change point test for multivariate time series

Hu, Junwei; Wang, Lihong

doi:10.1007/s00362-022-01341-9

Regular Article
Published: 19 July 2022

Statistical Papers, Volume 64, pages 753–778 (2023)

Abstract

In this paper we study change point detection for the mean of multivariate time series. We construct weighted U-statistic change point tests based on the weight function $1/{\sqrt{t(1-t)}}$ and suitable kernel functions. We establish the asymptotic distribution of the test statistic under the null hypothesis and its consistency under the alternatives. A bootstrap procedure is applied to approximate the distribution of the test statistic, and it is proved that the test statistic based on bootstrap sampling has the same asymptotic distribution as the original statistic. Numerical simulation and real data analysis show the good performance of the weighted change point test, especially when the change point is not located in the middle of the observation period.


References

Akbari S, Reddy MJ (2018) Detecting changes in regional rainfall series in India using binary segmentation-based multiple change-point detection techniques. In: Singh VP, Yadav S, Yadava RN (eds) Climate change impacts. Springer Nature, Singapore, pp 103–116
Aue A, Horváth L (2013) Structural breaks in time series. J Time Ser Anal 34(1):1–16
Bardet JM, Dion C (2019) Robust semi-parametric multiple change-points detection. Signal Process 156:145–155
Berkes I, Gombay E, Horváth L (2009) Testing for changes in the covariance structure of linear processes. J Stat Plan Inference 139(6):2044–2063
Betken A (2016) Testing for change-points in long-range dependent time series by means of a self-normalized Wilcoxon test. J Time Ser Anal 37(6):785–809
Billingsley P (1999) Convergence of probability measures, 2nd edn. Wiley, New York
Cao Y, Thompson A, Wang M et al (2019) Sketching for sequential change-point detection. EURASIP J Adv Signal Process 2019:42
Cho H, Fryzlewicz P (2012) Multiscale and multilevel technique for consistent segmentation of nonstationary time series. Stat Sin 22(1):207–229
Cox DR, Hinkley DV (1979) Theoretical statistics. Chapman & Hall, London
Csörgő M, Horváth L (1988) Invariance principles for change point problems. J Multivar Anal 27(1):151–168
Csörgő M, Horváth L (1997) Limit theorems in change-point analysis. Wiley, Chichester
Dehling H, Fried R, García I et al (2015) Change-point detection under dependence based on two-sample U-statistics. In: Dawson DA, Kulik R, Ould Haye M et al (eds) Asymptotic laws and methods in stochastics: honour of Miklós Csörgő. Springer, New York, pp 195–220
Dehling H, Rooch A, Taqqu MS (2011) Nonparametric change-point tests for long-range dependent data. Scand J Stat 40(1):153–173
Dehling H, Vuk K, Wendler M (2021) Change-point detection based on weighted two-sample U-statistics. arXiv:2003.12573
Dehling H, Wendler M (2010) Law of the iterated logarithm for U-statistics of weakly dependent observations. Dependence in probability, analysis and number theory. Kendrick Press, Heber City, pp 177–194
Dehling H, Wendler M (2010) Central limit theorem and the Bootstrap for U-Statistics of strongly mixing data. J Multivar Anal 101(1):126–137
Franke J, Hefter M, Herzwurm A, et al (2020) Adaptive quantile computation for brownian bridge in change-point analysis. arXiv:2101.00064
Harlé F, Chatelain F, Gouy-Pailler C, et al (2014) Rank-based multiple change-point detection in multivariate time series. 22nd European Signal Processing Conference (EUSIPCO)
Hlávka Z, Hušková M, Meintanis SG (2020) Change-point methods for multivariate time-series: paired vectorial observations. Stat Pap 61:1351–1383
Horváth L, Kokoszka P, Steinebach J (1999) Testing for changes in multivariate dependent observations with an application to temperature changes. J Multivar Anal 68(1):96–119
Inclán C, Tiao GC (1994) Use of cumulative sums of squares for retrospective detection of change of variance. J Am Stat Assoc 89(427):913–923
Li Q, Wang L (2020) Robust change point detection method via adaptive LAD-LASSO. Stat Pap 61:109–121
Liu B, Zhou C, Zhang X (2019) A tail adaptive approach for change point detection. J Multivar Anal 169:33–48
Liu B, Zhou C, Zhang X et al (2020) A unified data-adaptive framework for high dimensional change point detection. J R Stat Soc Ser B 82(4):933–963
Lung-Yut-Fong A, Lévy-Leduc C, Cappé O (2011) Homogeneity and change-point detection tests for multivariate data using rank statistics. Statistics 123(3):523–531
Messer M, Albert S, Schneider G (2018) The multiple filter test for change point detection in time series. Metrika 81(6):589–607
Muggeo VMR, Adelfio G (2011) Efficient change point detection for genomic sequences of continuous measurements. Bioinformatics 27(2):161–166
Ngatchou-Wandji J, Elharfaoui E, Harel M (2021) On change-points tests based on two-samples U-Statistics for weakly dependent observations. Stat Pap. https://doi.org/10.1007/s00362-021-01242-3
Pešta M, Wendler M (2020) Nuisance parameters free changepoint detection in non-stationary series. TEST 29:379–408
Pettitt AN (1979) A non-parametric approach to the change-point problem. J R Stat Soc Ser C 28(2):126–135
Schmitz A (2011) Limit theorems in change-point analysis for dependent data. Doctoral Dissertation. University of Cologne
Sharipov OS, Wendler M (2012) Bootstrap for the sample mean and for U-statistics of mixing and near epoch dependent processes. J Nonparametr Stat 24(2):317–342
Shi X, Gallagher C, Lund R, et al (2021) A comparison of single and multiple changepoint techniques for time series data. arXiv:2101.01960
Shi X, Wu Y (2021) An empirical-characteristic-function-based change-point test for detection of multiple distributional changes. J Stat. Theory Practice 15(2):1–16
Zhang L, Lin J, Karim R (2018) Adaptive kernel density-based anomaly detection for nonlinear systems. Knowl-Based Syst 139:50–63


Acknowledgements

The authors sincerely wish to thank the two referees and the editors for their queries and many insightful remarks and suggestions which have led to significantly improving the presentation of the results.

Author information

Authors and Affiliations

Department of Mathematics, Nanjing University, Nanjing, 210093, China
Junwei Hu & Lihong Wang

Corresponding author

Correspondence to Lihong Wang.

Ethics declarations

Conflicts of interest

On behalf of all authors, the corresponding author states that there is no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This work was supported by National Natural Science Foundation of China (NSFC) under Grant 11671194.

Appendix

The proofs of Theorem 1 and Theorem 2 proceed along lines similar to the proofs of Theorems 3 and 4 in Liu et al. (2019). Hence we only briefly indicate the extra steps that are needed here.

For each coordinate k, define the Hoeffding decomposition of the kernel function h(x, y) as

$$\begin{aligned} h(x, y)=\theta _{k}+h_{1, k}(x)+h_{2, k}(y)+g_{k}(x, y), \end{aligned}$$

where $\theta _{k}=E(h(Y_1, Y_2))$, $h_{1, k}(x)=E(h(x, Y_2))-\theta _{k}$, $h_{2, k}(y)=E(h(Y_1, y))-\theta _{k}$, $g_{k}(x, y)=h(x, y)-h_{1, k}(x)-h_{2, k}(y)-\theta _{k}$, and $Y_1$ and $Y_2$ are independent random variables with the same marginal distribution as $X_{1, k}$ under $H_0$.
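As a concrete illustration (an example chosen here, not one taken from the text), consider the antisymmetric kernel $h(x, y)=x-y$ and write $\mu _k=E(Y_1)$. The decomposition then reads

$$\begin{aligned} \theta _{k}&=0,\quad h_{1,k}(x)=x-\mu _k,\quad h_{2,k}(y)=\mu _k-y,\\ g_{k}(x, y)&=(x-y)-(x-\mu _k)-(\mu _k-y)-0=0, \end{aligned}$$

so the degenerate part vanishes identically and the weighted U-statistic reduces to a weighted CUSUM of the centred observations. For a nonlinear kernel the term $g_k$ is in general nonzero, which is why it has to be controlled separately.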

Lemma 4.5 in Dehling and Wendler (2010a) proved that, if h(x, y) is a 2-Lipschitz continuous function, then its degenerate part $g_k(x, y)$ is also a 2-Lipschitz continuous function.

The following lemma follows from Theorem 10.2 of Billingsley (1999) (see also Theorem 7 of Dehling et al. (2015)). Let $\eta _1, \ldots , \eta _n$ be random variables, and define $S_i=\eta _1+\ldots +\eta _i$ ($S_0=0$) and $M_n =\max _{0\le i\le n} |S_i|$.

Lemma 1

Suppose that $\alpha >1$ and there exist nonnegative numbers $a_1, \ldots , a_n$ such that for all positive $\zeta $,

$$\begin{aligned} P\Big (|S_j-S_i|>\zeta \Big ) \le \frac{1}{\zeta ^2} \Big (\sum _{i<l\le j} a_l\Big )^\alpha , \ \ 0\le i\le j\le n. \end{aligned}$$

Then there exists a constant C depending only on $\alpha $ such that

$$\begin{aligned} P\Big (M_n>\zeta \Big ) \le \frac{C}{\zeta ^2} \Big (\sum _{0<l\le n} a_l\Big )^\alpha . \end{aligned}$$

Lemma 2

Suppose that Assumption (S1) holds and $g_k(x, y)$ is a 2-Lipschitz continuous function. Then, as $n\rightarrow \infty $, for each $1\le {k}\le {K}$,

$$\begin{aligned} \sup _{t\in {(0, 1)}}\Big |\frac{1}{\sqrt{t(1-t)}}\frac{1}{n^{3/2}}\sum _{i=1}^{[(n+1)t]}\sum _{j=[(n+1)t]+1}^{n}g_k(\xi _{i,k}, \xi _{j,k})\Big |\rightarrow _P{0}, \end{aligned}$$

where $\rightarrow _P$ denotes convergence in probability.

Proof

Let

$$\begin{aligned} G_{k}(t)=\frac{1}{\sqrt{t(1-t)}}\frac{1}{n^{3/2}}\sum _{i=1}^{[(n+1)t]}\sum _{j=[(n+1)t]+1}^{n}g_k(\xi _{i,k}, \xi _{j,k}),~1\le {k}\le {K}, \\ {\widetilde{G}}_{k}(t)=\frac{1}{n^{3/2}}\sum _{i=1}^{[(n+1)t]}\sum _{j=[(n+1)t]+1}^{n}g_k(\xi _{i,k}, \xi _{j,k}),~1\le {k}\le {K}. \end{aligned}$$

Let $\epsilon =\epsilon _n=O(n^{-\beta })$ with $0<\beta <\frac{M-2}{3M}$ for some large enough constant M. Note that $G_k(t)=0$ for $0<t<1/(n+1)$ and $1-1/(n+1)<t<1$. Therefore, it suffices to show that

$$\begin{aligned} \sup _{t\in {[\epsilon , 1-\epsilon ]}}\big |G_k(t)\big |\rightarrow _P {0}, \end{aligned}$$

(6)

$$\begin{aligned} \sup _{t\in {[1/(n+1), \epsilon )}}\big |G_k(t)\big |\rightarrow _P {0}, \end{aligned}$$

(7)

and

$$\begin{aligned} \sup _{t\in {(1-\epsilon , 1- 1/(n+1)]}}\big |G_k(t)\big |\rightarrow _P {0}. \end{aligned}$$

(8)

Let $t_u=u/(n+1)$, $t_v=v/(n+1)$ for $[(n+1)\epsilon ]\le u<v\le n+1-[(n+1)\epsilon ]$. We have

$$\begin{aligned} \max _{\epsilon \le t_v\le {1-\epsilon }}\frac{1}{t_v(1-t_v)}\le Cn^\beta \end{aligned}$$

(9)

for some positive constant C. On the other hand, the mean value theorem yields

$$\begin{aligned} \Big (\frac{1}{\sqrt{t_v(1-t_v)}}-\frac{1}{\sqrt{t_u(1-t_u)}}\Big )^{2}=\frac{1}{4}(t_v-t_u)^2(\rho (1-\rho ))^{-3}(1-2\rho )^2\le C(v-u)n^{3\beta -1}, \end{aligned}$$

where $\epsilon \le t_u\le \rho \le t_v\le {1-\epsilon }$. Thus, by Chebyshev’s inequality, for any $\zeta >0$,

$$\begin{aligned}&P\Big (\Big |G_{k}(t_v)-G_{k}(t_u)\Big |>\zeta \Big )\le \frac{1}{\zeta ^2}E\Big (G_{k}(t_v)-G_{k}(t_u)\Big )^{2}\\&\ \ =\frac{1}{\zeta ^2}E\Big (\frac{1}{\sqrt{t_v(1-t_v)}}{\widetilde{G}}_{k}(t_v)-\frac{1}{\sqrt{t_u(1-t_u)}}{\widetilde{G}}_{k}(t_u)\Big )^{2}\\&\ \ \le \frac{2}{\zeta ^2}E\Big (\frac{1}{\sqrt{t_v(1-t_v)}}{\widetilde{G}}_{k}(t_v)-\frac{1}{\sqrt{t_v(1-t_v)}}{\widetilde{G}}_{k}(t_u)\Big )^{2}\\&\ \ \ \ +\frac{2}{\zeta ^2}E\Big (\frac{1}{\sqrt{t_v(1-t_v)}}{\widetilde{G}}_{k}(t_u)-\frac{1}{\sqrt{t_u(1-t_u)}}{\widetilde{G}}_{k}(t_u)\Big )^{2}\\&\ \ =\frac{2}{t_v(1-t_v)\zeta ^2}E\Big ({\widetilde{G}}_{k}(t_v)-{\widetilde{G}}_{k}(t_u)\Big )^{2}\\&\ \ \ \ +\Big (\frac{1}{\sqrt{t_v(1-t_v)}}-\frac{1}{\sqrt{t_u(1-t_u)}}\Big )^{2}\frac{2}{\zeta ^2}E\Big ({\widetilde{G}}_{k}(t_u)\Big )^{2}\\&\ \ \le \frac{Cn^{\beta } }{\zeta ^2}E\Big ({\widetilde{G}}_{k}(t_v)-{\widetilde{G}}_{k}(t_u)\Big )^{2} +\frac{C(v-u)n^{3\beta -1}}{\zeta ^2}E\Big ({\widetilde{G}}_{k}(t_u)\Big )^{2}. \end{aligned}$$

It follows from Lemma 2 of Dehling et al. (2015) that there exists a constant C such that

$$\begin{aligned} E\Big ({\widetilde{G}}_{k}(t_v)-{\widetilde{G}}_{k}(t_u)\Big )^{2}\le \frac{C(v-u)}{n^{2}}, \end{aligned}$$

and

$$\begin{aligned} E\Big ({\widetilde{G}}_{k}(t_u)\Big )^{2}\le \frac{Cu}{n^{2}}. \end{aligned}$$

Thus we obtain

$$\begin{aligned} P\Big (\Big |G_{k}(t_v)-G_{k}(t_u)\Big |>\zeta \Big )\le & {} \frac{Cn^{\beta } }{\zeta ^2}\frac{(v-u)}{n^2}+\frac{C(v-u)n^{3\beta -1}}{\zeta ^2}\frac{u}{n^2}\\\le & {} \frac{Cn^{\beta } }{\zeta ^2}\frac{(v-u)}{n^2}+\frac{C(v-u)n^{3\beta }}{\zeta ^2}\frac{1}{n^2}\\\le & {} \frac{C(v-u)n^{3\beta }}{n^{2}\zeta ^2}\\\le & {} \frac{1}{\zeta ^2}\Big (\frac{(Cn^{3\beta })^{M/(M+1)}}{n^{(2M-1)/(M+1)}}(v-u)\Big )^{(M+1)/M}\\= & {} \frac{1}{\zeta ^2}\Big (\sum _{u<l\le v}\frac{(Cn^{3\beta })^{M/(M+1)}}{n^{(2M-1)/(M+1)}}\Big )^{(M+1)/M}. \end{aligned}$$

For every $1\le k \le K$, we define the random variable $\eta _i = G_k(i/(n+1))-G_k((i-1)/(n+1))$ for $i = 1,\ldots , n$. We also define $S_i = \eta _1+\ldots +\eta _i$ with $S_0 = 0$. Then we have $S_i = G_k(i/(n+1))$. Therefore the conditions of Lemma 1 are satisfied, where $\alpha =(M+1)/M>1$ and $a_l=(Cn^{3\beta })^{M/(M+1)}/n^{(2M-1)/(M+1)}$. Hence we have

$$\begin{aligned} \begin{aligned} P\Big (\max _{[(n+1)\epsilon ]\le {v}\le {n+1-[(n+1)\epsilon ]}}\Big |G_{k}(t_v)\Big |>\zeta \Big ) \le \frac{C}{\zeta ^2}\Big (\sum _{0<l\le n}\frac{(Cn^{3\beta })^{M/(M+1)}}{n^{(2M-1)/(M+1)}}\Big )^{(M+1)/M} \le \frac{C}{\zeta ^2}n^{3\beta +\frac{2-M}{M}}. \end{aligned} \end{aligned}$$

This, together with the assumption that $0<\beta <\frac{M-2}{3M}$, yields (6).
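For readers who want the exponent arithmetic behind the last bound spelled out, the following routine check uses only the symbols already introduced. Since the summand does not depend on $l$,

$$\begin{aligned} \Big (n\cdot \frac{(Cn^{3\beta })^{M/(M+1)}}{n^{(2M-1)/(M+1)}}\Big )^{(M+1)/M} =Cn^{3\beta }\, n^{\frac{M+1}{M}-\frac{2M-1}{M}} =Cn^{3\beta +\frac{2-M}{M}}, \end{aligned}$$

and the exponent $3\beta +\frac{2-M}{M}$ is negative exactly when $\beta <\frac{M-2}{3M}$.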

By similar arguments, we obtain

$$\begin{aligned} P\Big (\Big |{{\widetilde{G}}}_{k}(t_v)-{{\widetilde{G}}}_{k}(t_u)\Big |>\zeta \Big )\le & {} \frac{1}{\zeta ^2}E\Big ({{\widetilde{G}}}_{k}(t_v)-{{\widetilde{G}}}_{k}(t_u)\Big )^{2}\\\le & {} \frac{C(v-u)}{n^{2}\zeta ^2}\le \frac{1}{\zeta ^2}\Big (\sum _{u<l\le v}\frac{C^{M/(M+1)}}{n^{(2M-1)/(M+1)}}\Big )^{(M+1)/M}, \end{aligned}$$

and hence, with $\epsilon =O(n^{-\beta })$, we arrive at

$$\begin{aligned} P\Big (\max _{1\le {v}<[(n+1)\epsilon ]}n^{\frac{\beta +1}{2}-\frac{1}{M}}\big |{{\widetilde{G}}}_{k}(t_v)\big |>\zeta \Big )\le & {} \frac{C}{\zeta ^2}\Big (\sum _{0<l\le [(n+1)\epsilon ]}\frac{(Cn^{\beta +1-\frac{2}{M}})^{M/(M+1)}}{n^{(2M-1)/(M+1)}}\Big )^{(M+1)/M}\\\le & {} \frac{Cn^{\beta +1-\frac{2}{M}}}{\zeta ^2}\frac{(n\epsilon )^{(M+1)/M}}{n^{(2M-1)/M}}\le \frac{C}{\zeta ^2}n^{-\beta /M}. \end{aligned}$$

This implies that

$$\begin{aligned} \max _{1\le {v}<[(n+1)\epsilon ]}\big |{{\widetilde{G}}}_{k}(t_v)\big |=o_P\big (n^{-\frac{\beta +1}{2}+\frac{1}{M}}\big ). \end{aligned}$$

On the other hand,

$$\begin{aligned} \sup _{t\in {[1/(n+1), \epsilon )}}\frac{1}{\sqrt{t(1-t)}}=O(n^{1/2}). \end{aligned}$$

Hence we obtain

$$\begin{aligned} \sup _{t\in {[1/(n+1), \epsilon )}}\big |G_k(t)\big |=O(n^{1/2})o_P\big (n^{-\frac{\beta +1}{2}+\frac{1}{M}}\big )=o_P(1), \end{aligned}$$

if $\beta $ is chosen such that $\frac{2}{M}<\beta <\frac{M-2}{3M}$ for some large enough M. This implies (7), and (8) can be proved in a similar way. This completes the proof of Lemma 2. $\square $
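The requirement that M be large enough in the proof above can be made explicit by a short calculation. The lower and upper constraints on $\beta $ are compatible only if

$$\begin{aligned} \frac{2}{M}<\frac{M-2}{3M}\ \Longleftrightarrow \ 6<M-2\ \Longleftrightarrow \ M>8, \end{aligned}$$

so any $M>8$ leaves a nonempty interval of admissible values of $\beta $. For instance, $M=20$ gives $0.1<\beta <0.3$.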

Proof of Theorem 1

The antisymmetry of h(x, y) implies that $ \theta _{k}+\theta _{k}=E(h(Y_1, Y_2))+E(-h(Y_2, Y_1))=0$, and $ h_{1,k}(x)=E (h(x,Y_2))-\theta _{k}=-E(h(Y_2, x))-\theta _{k}=-E(h(Y_1, x))+\theta _{k}=-h_{2,k}(x).$

Thus the U-statistic $U_{k}(t)$ can be written as

$$\begin{aligned} U_{k}(t)= & {} \frac{1}{\sqrt{t(1-t)}}\frac{1}{n^{3/2}}\sum _{i=1}^{[(n+1)t]}\sum _{j=[(n+1)t]+1}^{n}\Big (\theta _{k}+h_{1,k}(X_{i,k})+h_{2,k}(X_{j,k}) +g_{k}(X_{i,k},X_{j,k})\Big )\\= & {} \frac{1}{\sqrt{t(1-t)}}\Big (\frac{1}{\sqrt{n}}\sum _{i=1}^{[(n+1)t]}h_{1,k}(X_{i,k})-\frac{[(n+1)t]}{n^{3/2}}\sum _{i=1}^{n}h_{1,k}(X_{i,k})\Big )\\&\ \ +\frac{1}{\sqrt{t(1-t)}}\frac{1}{n^{3/2}}\sum _{i=1}^{[(n+1)t]}\sum _{j=[(n+1)t]+1}^{n}g_{k}(X_{i,k},X_{j,k})\\\triangleq & {} V_{k}(t)+G_{k}(t). \end{aligned}$$

Let $V(t)=(V_{1}(t), V_{2}(t),\ldots , V_{K}(t))'$ and $G(t)=(G_{1}(t), G_{2}(t),\ldots , G_{K}(t))'$. Lemma 2 implies that $\sup _{t\in {(0,1)}}G(t)'G(t)=o_{P}(1)$. Hence, to prove Theorem 1, it suffices to show that

$$\begin{aligned} \sup _{t\in {(0,1)}} V(t)'V(t) \rightarrow _{{\mathcal {D}}}\sup _{t\in {(0, 1)}}\frac{1}{t(1-t)}B(t)'B(t). \end{aligned}$$

(10)

Let ${{\widetilde{V}}}_{k}(t)=\frac{1}{\sqrt{n}}\sum _{i=1}^{[(n+1)t]}h_{1,k}(X_{i,k})$ and ${{\widetilde{V}}}(t)=({{\widetilde{V}}}_{1}(t), {{\widetilde{V}}}_{2}(t),\ldots , {{\widetilde{V}}}_{K}(t))'$. From the proof of Theorem 3 of Liu et al. (2019) (see (26) and (27) of the supplementary materials to Liu et al. (2019)), we have the finite dimensional convergence and the tightness of $\{{{\widetilde{V}}}(t)\}_{t\in {[0,1]}}$. Therefore $\{{{\widetilde{V}}}(t)\}_{t\in {[0,1]}}$ converges weakly to $\{W(t)\}_{t\in {[0,1]}}$ in the space $(D[0,1])^{K}$ equipped with the sup-norm, where $W(t)=(W_{1}(t), W_{2}(t),\ldots , W_{K}(t))'$ is a K-dimensional Brownian motion with cov$(W_{k_1}(t),W_{k_2}(s))=\min (t, s)\sigma ^2_{ k_{1}, k_{2}}$ for $k_1, k_2\in \{1, 2, \ldots , K\}$ and $t, s\in [0, 1]$. Although Theorem 3 of Liu et al. (2019) is proved for their special kernel function, a careful examination of their proof reveals that the weak convergence result holds for any kernel satisfying Assumption (S2).

Thus, by the continuous mapping theorem in the space $(D[0,1])^{K}$, we obtain

$$\begin{aligned} \sup _{t\in {(0, 1)}}\frac{1}{t(1-t)}({{\widetilde{V}}}(t)-t{{\widetilde{V}}}(1))'({{\widetilde{V}}}(t)-t{{\widetilde{V}}}(1))\rightarrow _{{\mathcal {D}}}\sup _{t\in {(0, 1)}}\frac{1}{t(1-t)}B(t)'B(t). \end{aligned}$$

This leads to (10) and hence completes the proof of Theorem 1. $\square $
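The statistic $\sup _t U(t)'U(t)$ analysed in the proof above can be evaluated numerically. The following is a minimal sketch, not the authors' implementation: the function name `weighted_u_statistic`, the default kernel $h(x, y)=x-y$, and the choice to evaluate the weight only on the grid $t=m/(n+1)$ are assumptions made here for illustration.

```python
import numpy as np


def weighted_u_statistic(X, kernel=lambda x, y: x - y):
    """Approximate sup_t sum_k U_k(t)^2 on the grid t = m/(n+1).

    X has shape (n, K). `kernel` must be antisymmetric and accept
    broadcastable NumPy arrays. Both defaults are illustrative assumptions.
    """
    X = np.asarray(X, dtype=float)
    n, K = X.shape
    best = 0.0
    for m in range(1, n):  # m plays the role of [(n+1)t]
        t = m / (n + 1)
        weight = 1.0 / np.sqrt(t * (1.0 - t))
        total = 0.0
        for k in range(K):
            left, right = X[:m, k], X[m:, k]
            # double sum of h(X_i, X_j) over i <= m < j
            s = kernel(left[:, None], right[None, :]).sum()
            total += (weight * s / n**1.5) ** 2
        best = max(best, total)
    return best


if __name__ == "__main__":
    toy = np.array([[0.0], [0.0], [1.0], [1.0]])
    print(weighted_u_statistic(toy))
```

The double loop costs on the order of $n^3K$ operations, which is acceptable only for short series; it is written this way to mirror the formula rather than to be fast. A different antisymmetric kernel, for example `lambda x, y: np.sign(y - x)`, can be passed in the same way.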

Proof of Theorem 2

Recall that, for each $1\le {k}\le {K}$,

$$\begin{aligned} U_{k}^{b}(t)=\frac{1}{\sqrt{t(1-t)}}\frac{1}{(LM)^{3/2}}\sum _{i=1}^{[(LM+1)t]}\sum _{j=[(LM+1)t]+1}^{LM}h(X_{i,k}^{b},X_{j,k}^{b}). \end{aligned}$$

Similarly to the proof of Theorem 1, we decompose the statistic $U_{k}^{b}(t)$ as follows,

$$\begin{aligned} U_{k}^{b}(t)= & {} \frac{1}{\sqrt{t(1-t)}}\Big (\frac{1}{\sqrt{LM}}\sum _{i=1}^{[(LM+1)t]}h_{1,k}(X_{i,k}^{b})-\frac{[(LM+1)t]}{(LM)^{3/2}}\sum _{i=1}^{LM}h_{1,k}(X_{i,k}^{b})\\&+\frac{1}{(LM)^{3/2}}\sum _{i=1}^{[(LM+1)t]}\sum _{j=[(LM+1)t]+1}^{LM}g_{k}(X_{i,k}^{b},X_{j,k}^{b})\Big )\\\triangleq & {} V^b_{k}(t)+G^b_{n,k}(t). \end{aligned}$$

Let ${{\widetilde{V}}}^b_{k}(t)=\frac{1}{\sqrt{LM}}\sum _{i=1}^{[(LM+1)t]}h_{1,k}(X_{i,k}^{b})$ and let $V^{b}(t)$, ${{\widetilde{V}}}^{b}(t)$ and $G_{n}^{b}(t)$ be the corresponding vector processes. Then it follows from Assumptions (S1) and (S2) and Theorem 2.8 of Sharipov and Wendler (2012) that ${{\widetilde{V}}}^{b}(t)$ converges weakly to $\{W(t)\}_{t\in {[0,1]}}$ in the space $(D[0,1])^{K}$. Again the continuous mapping theorem in the space $(D[0,1])^{K}$ yields

$$\begin{aligned} \sup _{t\in {(0, 1)}}V^b(t)'V^b(t)\rightarrow ^*_{\mathcal {D}}\sup _{t\in {(0, 1)}}\frac{1}{t(1-t)}B(t)'B(t). \end{aligned}$$

Therefore, to complete the proof of Theorem 2, it suffices to prove $\sup _{t\in {(0, 1)}} G^b_{n}(t)'G^b_{n}(t)\rightarrow 0$ in probability conditionally on $\{X_{i}\}_{i\in {\mathbb {N}}}$. To this end, we shall show that, for each $1\le {k}\le {K}$,

$$\begin{aligned} P^{*}\Big (\sup _{t\in {(0, 1)}}|G_{n,k}^b(t)|\rightarrow {0}\Big )=1 \end{aligned}$$

almost surely, where $P^*$ denotes the probability conditionally on $\{X_{i}\}_{i\in {\mathbb {N}}}$. By Fubini’s Theorem, it is sufficient to prove that

$$\begin{aligned} P\Big (\sup _{t\in {(0, 1)}}|G^b_{n,k}(t)|\rightarrow {0}\Big )=1. \end{aligned}$$

We proceed in two steps. In the first step, we show that

$$\begin{aligned} P\Big (\sup _{t\in [\epsilon , 1-\epsilon ]}|G^b_{n,k}(t)|\rightarrow {0}\Big )=1, \end{aligned}$$

(11)

where $\epsilon $ is defined as in the proof of Lemma 2. In the second step, we prove that

$$\begin{aligned} P\Big ( \sup _{t\in {[1/(n+1), \epsilon )}}\big |G^b_{n,k}(t)\big |\rightarrow {0}\Big )=1,\ \ \text{ and }\ \ P\Big (\sup _{t\in {(1-\epsilon , 1- 1/(n+1)]}}\big |G^b_{n,k}(t)\big |\rightarrow {0}\Big )=1. \end{aligned}$$

(12)

For the proof of (11), by the method of subsequences, it suffices to show that, as $l\rightarrow \infty $,

$$\begin{aligned} \sup _{t\in {[\epsilon , 1-\epsilon ]}}|G^b_{2^{l},k}(t)|\rightarrow {0}, \end{aligned}$$

(13)

and

$$\begin{aligned} \max _{2^{l-1}\le {n}<2^{l}}\big |\sup _{t\in {[\epsilon , 1-\epsilon ]}}|G^b_{n,k}(t)|-\sup _{t\in {[\epsilon , 1-\epsilon ]}}|G^b_{2^{l-1},k}(t)|\big |\rightarrow {0} \end{aligned}$$

(14)

almost surely.

By Chebyshev’s inequality, for any $\zeta >0$, we have

$$\begin{aligned} \sum _{l=1}^{\infty }P\Big (\sup _{t\in {[\epsilon , 1-\epsilon ]}}|G^b_{2^{l},k}(t)|>\zeta \Big )\le \sum _{l=1}^{\infty }\frac{1}{{\zeta }^{2}} E\Big (E^{*}\big (\sup _{t\in {[\epsilon , 1-\epsilon ]}}|G^b_{2^{l},k}(t)|\big )^{2}\Big ). \end{aligned}$$

It follows from the proofs of Lemmas 3.6 and 3.7 of Dehling and Wendler (2010b) that

$$\begin{aligned} E\Big (E^{*}\Big (\sum _{s=1}^{n-1}\Big (\sum _{i=1}^{s}\sum _{j=s+1}^{n}|g_{k}(X_{i,k}^{b},X_{j,k}^{b})| \Big )^{2}\Big )\Big ) =O(n^{3+\tau -\gamma }) \end{aligned}$$

(15)

with $\tau $ satisfying $0<\tau <\gamma $. Combining this with (9) yields that

$$\begin{aligned}&E\Big (E^{*}\big (\sup _{t\in {[\epsilon , 1-\epsilon ]}}|G^b_{2^{l},k}(t)|\big )^{2}\Big )\nonumber \\&\ \ =E\Big (E^{*}\Big (\sup _{s\in {[T_{\epsilon },2^{l}+1-T_{\epsilon }]}}\frac{1}{\sqrt{\frac{s}{2^{l}+1}(1-\frac{s}{2^{l}+1})}}\frac{1}{(2^l)^{3/2}} \Big |\sum _{i=1}^{s}\sum _{j=s+1}^{2^l}g_{k}(X_{i,k}^{b},X_{j,k}^{b})\Big | \Big )^{2}\Big )\nonumber \\&\ \ \le \sup _{s\in {[T_{\epsilon },2^{l}+1-T_{\epsilon }]}}\frac{1}{{\frac{s}{2^{l}+1}(1-\frac{s}{2^{l}+1})}} 2^{-3l} E\Big (E^{*}\Big (\sum _{s=1}^{2^l-1}\Big (\sum _{i=1}^{s}\sum _{j=s+1}^{2^l}|g_{k}(X_{i,k}^{b},X_{j,k}^{b})| \Big )^{2}\Big )\Big ) \nonumber \\&\ \ \le C2^{l(\tau -\gamma +\beta )}, \end{aligned}$$

(16)

where $T_{\epsilon }=(2^{l}+1)\epsilon $. Hence we obtain

$$\begin{aligned} \sum _{l=1}^{\infty }P\Big (\sup _{t\in {[\epsilon , 1-\epsilon ]}}|G^b_{2^{l},k}(t)|>\zeta \Big )\le \frac{C}{{\zeta }^{2}}\sum _{l=1}^{\infty }2^{l(\tau -\gamma +\beta )}. \end{aligned}$$

Choosing $0<\beta <\gamma -\tau $ makes the series on the right-hand side converge, so the Borel-Cantelli lemma implies (13).
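The summability used in this step is that of a geometric series. Writing $a=\gamma -\tau -\beta >0$ (a shorthand introduced here only for this remark),

$$\begin{aligned} \sum _{l=1}^{\infty }2^{l(\tau -\gamma +\beta )}=\sum _{l=1}^{\infty }2^{-la}=\frac{1}{2^{a}-1}<\infty , \end{aligned}$$

so the probabilities are summable for every fixed $\zeta >0$.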

For the proof of (14), we apply the chaining technique as in the proof of Theorem 2.8 in Sharipov and Wendler (2012),

$$\begin{aligned}&\max _{2^{l-1}\le {n}<2^{l}}\Big |\sup _{t\in {[\epsilon , 1-\epsilon ]}}|G^b_{n,k}(t)|-\sup _{t\in {[\epsilon , 1-\epsilon ]}}|G^b_{2^{l-1},k}(t)|\Big |\\&\ \ \le \sum _{m=1}^{l}\max _{i=1,2,\ldots ,2^{l-m}}\Big |\sup _{t\in {[\epsilon , 1-\epsilon ]}}|G^b_{2^{l-1}+i2^{m-1},k}(t)| -\sup _{t\in {[\epsilon , 1-\epsilon ]}}|G^b_{2^{l-1}+(i-1)2^{m-1},k}(t)|\Big |. \end{aligned}$$
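One way to read the chaining bound above is that every $n$ with $2^{l-1}\le n<2^{l}$ is reached from $2^{l-1}$ by at most one step at each dyadic scale $m$, and each such step joins two neighbouring grid points $2^{l-1}+(i-1)2^{m-1}$ and $2^{l-1}+i2^{m-1}$. The sketch below lists these steps from the binary expansion of $n-2^{l-1}$; the function name `dyadic_path` is hypothetical and the code only illustrates the index bookkeeping, not the probabilistic argument.

```python
def dyadic_path(n, l):
    """Steps (m, start, end) leading from 2**(l-1) to n.

    Requires 2**(l-1) <= n < 2**l. Each step has length 2**(m-1) and
    starts at 2**(l-1) + (i-1)*2**(m-1) for some integer i >= 1.
    """
    base = 2 ** (l - 1)
    if not base <= n < 2 ** l:
        raise ValueError("need 2**(l-1) <= n < 2**l")
    offset = n - base
    current = 0
    steps = []
    for m in range(l - 1, 0, -1):  # largest scale first
        size = 2 ** (m - 1)
        if offset & size:  # this binary digit of the offset is set
            steps.append((m, base + current, base + current + size))
            current += size
    return steps


if __name__ == "__main__":
    # 13 = 8 + 4 + 1: one step at scale m = 3 and one at scale m = 1
    print(dyadic_path(13, 4))
```

Because at most one step is taken per scale, the difference between the values at $n$ and at $2^{l-1}$ telescopes into at most $l$ increments, each bounded by the maximum over $i$ at its scale.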

Thus we have

$$\begin{aligned}&E\Big (E^{*}\Big (\max _{2^{l-1}\le {n}<2^{l}}\Big |\sup _{t\in {[\epsilon , 1-\epsilon ]}}|G^b_{n,k}(t)| -\sup _{t\in {[\epsilon , 1-\epsilon ]}}|G^b_{2^{l-1},k}(t)|\Big |\Big )^{2}\Big )\\&\ \ \le E\Big (E^{*}\Big (\sum _{m=1}^{l}\max _{i=1,2,\ldots ,2^{l-m}}\Big |\sup _{t\in {[\epsilon , 1-\epsilon ]}}|G^b_{2^{l-1}+i2^{m-1},k}(t)| -\sup _{t\in {[\epsilon , 1-\epsilon ]}}|G^b_{2^{l-1}+(i-1)2^{m-1},k}(t)|\Big |\Big )^{2}\Big )\\&\ \ \le {l}\sum _{m=1}^{l}\sum _{i=1}^{2^{l-m}}E\Big (E^{*}\Big ( \Big |\sup _{t\in {[\epsilon , 1-\epsilon ]}}|G^b_{2^{l-1}+i2^{m-1},k}(t)| -\sup _{t\in {[\epsilon , 1-\epsilon ]}}|G^b_{2^{l-1}+(i-1)2^{m-1},k}(t)|\Big |^{2}\Big )\Big )\\&\ \ \le 2l\sum _{m=1}^{l}\sum _{i=1}^{2^{l-m}}\Big ( E\Big (E^{*}\Big (\Big |\sup _{t\in {[\epsilon , 1-\epsilon ]}}|G^b_{2^{l-1}+i2^{m-1},k}(t)|\Big |^{2}\Big )\Big ) \\&+E\Big (E^{*}\Big (\Big |\sup _{t\in {[\epsilon , 1-\epsilon ]}}|G^b_{2^{l-1}+(i-1)2^{m-1},k}(t)|\Big |^{2}\Big )\Big )\Big ). \end{aligned}$$

By arguments similar to those in (15) and (16), we obtain

$$\begin{aligned}&\sum _{m=1}^{l}\sum _{i=1}^{2^{l-m}}E\Big (E^{*}\Big (\Big |\sup _{t\in {[\epsilon , 1-\epsilon ]}}|G^b_{2^{l-1}+i2^{m-1},k}(t)|\Big |^{2}\Big )\Big )\\&\ \ \le C2^{l\beta -3l}\sum _{m=1}^{l}\sum _{i=1}^{2^{l-m}} E\Big (E^{*}\Big (\sum _{s=1}^{2^{l-1}+i2^{m-1}-1}\Big (\sum _{i=1}^{s}\sum _{j=s+1}^{2^l}|g_{k}(X_{i,k}^{b},X_{j,k}^{b})| \Big )^{2}\Big )\Big ) \nonumber \\&\ \ \le C2^{l\beta -3l}\sum _{m=1}^{l} E\Big (E^{*}\Big (\sum _{s=1}^{2^{l}-1}\Big (\sum _{i=1}^{s}\sum _{j=s+1}^{2^l}|g_{k}(X_{i,k}^{b},X_{j,k}^{b})| \Big )^{2}\Big )\Big ) \nonumber \\&\ \ \le C2^{l(\tau -\gamma +\beta )} \end{aligned}$$

and

$$\begin{aligned} \sum _{m=1}^{l}\sum _{i=1}^{2^{l-m}}E\Big (E^{*}\Big (\Big |\sup _{t\in {[\epsilon , 1-\epsilon ]}}|G^b_{2^{l-1}+(i-1)2^{m-1},k}(t)|\Big |^{2}\Big )\Big ) \le {C2^{l(\tau -\gamma +\beta )}}. \end{aligned}$$

These bounds yield that

$$\begin{aligned} E\Big (E^{*}\Big (\max _{2^{l-1}\le {n}<2^{l}}\Big |\sup _{t\in {[\epsilon , 1-\epsilon ]}}|G^b_{n,k}(t)| -\sup _{t\in {[\epsilon , 1-\epsilon ]}}|G^b_{2^{l-1},k}(t)|\Big |\Big )^{2}\Big )\le C2^{l(\tau -\gamma +\beta )}. \end{aligned}$$

Now Chebyshev’s inequality leads to

$$\begin{aligned} \sum _{l=1}^{\infty }P\Big (\max _{2^{l-1}\le {n}<2^{l}}\Big |\sup _{t\in {[\epsilon , 1-\epsilon ]}}|G^b_{n,k}(t)| -\sup _{t\in {[\epsilon , 1-\epsilon ]}}|G^b_{2^{l-1},k}(t)|\Big | >\zeta \Big )\le \frac{C}{{\zeta }^{2}}\sum _{l=1}^{\infty }2^{l(\tau -\gamma +\beta )}<\infty . \end{aligned}$$

This, together with the Borel-Cantelli lemma, implies (14). Hence (11) follows from (13) and (14). Now we proceed to establish the first result of (12). Again by studying the proofs of Lemmas 3.6 and 3.7 of Dehling and Wendler (2010b), we conclude

$$\begin{aligned} E\Big (E^{*}\Big (\sum _{s=1}^{[(n+1)\epsilon ]}\Big (\sum _{i=1}^{s}\sum _{j=s+1}^{n}|g_{k}(X_{i,k}^{b},X_{j,k}^{b})| \Big )^{2}\Big )\Big ) \le Cn\big ((n+1)\epsilon \big )^{2+\tau -\gamma }=O(n^{3+\tau -\gamma -\beta (2+\tau -\gamma )}). \end{aligned}$$

This in turn gives

$$\begin{aligned}&E\Big (E^{*}\big (\sup _{t\in {[1/(n+1), \epsilon )}}|G^b_{2^{l},k}(t)|\big )^{2}\Big )\nonumber \\&\ \ =E\Big (E^{*}\Big (\sup _{1/(n+1)\le s/(2^{l}+1)<{\epsilon }}\frac{1}{\sqrt{\frac{s}{2^{l}+1}(1-\frac{s}{2^{l}+1})}}\frac{1}{(2^l)^{3/2}} \Big |\sum _{i=1}^{s}\sum _{j=s+1}^{2^l}g_{k}(X_{i,k}^{b},X_{j,k}^{b})\Big | \Big )^{2}\Big )\nonumber \\&\ \ \le \sup _{1/(n+1)\le s/(2^{l}+1)<{\epsilon }}\frac{1}{{\frac{s}{2^{l}+1}(1-\frac{s}{2^{l}+1})}} 2^{-3l} E\Big (E^{*}\Big (\sum _{s=1}^{[(2^l+1)\epsilon ]}\Big (\sum _{i=1}^{s}\sum _{j=s+1}^{2^l}|g_{k}(X_{i,k}^{b},X_{j,k}^{b})| \Big )^{2}\Big )\Big ) \nonumber \\&\ \ \le C2^{l(1+\tau -\gamma -\beta (2+\tau -\gamma ))}, \end{aligned}$$

where $\beta $ is chosen to satisfy $(1+\tau -\gamma )/(2+\tau -\gamma )<\beta <\gamma -\tau $. Consequently, we arrive at

$$\begin{aligned} \sum _{l=1}^{\infty }P\Big (\sup _{t\in {[1/(n+1), \epsilon )}}|G^b_{2^{l},k}(t)|>\zeta \Big )\le \frac{C}{{\zeta }^{2}}\sum _{l=1}^{\infty }2^{l(1+\tau -\gamma -\beta (2+\tau -\gamma ))}<\infty . \end{aligned}$$

The latter combined with the Borel-Cantelli lemma leads to $\sup _{t\in {[1/(n+1), \epsilon )}}|G^b_{2^{l},k}(t)|\rightarrow {0}$ almost surely. In a similar fashion, we obtain $\max _{2^{l-1}\le {n}<2^{l}}\big |\sup _{t\in {[1/(n+1), \epsilon )}}|G^b_{n,k}(t)|-\sup _{t\in {[1/(n+1), \epsilon )}} |G^b_{2^{l-1},k}(t)|\big |\rightarrow {0}$ almost surely. These imply the first result of (12), and a similar argument yields the second part of (12). In view of (11) and (12), the result of Theorem 2 follows. $\square $
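To make the role of the bootstrap statistic $U_k^b(t)$ concrete, the sketch below approximates a p-value by resampling $L$ blocks of length $M$. It is an illustration under stated assumptions, not the procedure of the paper: it assumes a circular block bootstrap (block starts drawn uniformly, indices wrapped around), the kernel $h(x, y)=x-y$, and a weight evaluated on the grid $t=m/(n+1)$. The names `max_weighted_stat` and `block_bootstrap_pvalue` are hypothetical.

```python
import numpy as np


def max_weighted_stat(X):
    """Grid version of sup_t sum_k U_k(t)^2 for the kernel h(x, y) = x - y."""
    n, K = X.shape
    m = np.arange(1, n)  # split points
    t = m / (n + 1)
    csum = np.cumsum(X, axis=0)  # csum[m-1] = X_1 + ... + X_m
    # for h(x, y) = x - y: sum_{i<=m<j} (X_i - X_j) = n*S_m - m*S_n
    double_sum = n * csum[:-1] - m[:, None] * csum[-1]
    U = double_sum / (n**1.5 * np.sqrt(t * (1.0 - t))[:, None])
    return float((U**2).sum(axis=1).max())


def block_bootstrap_pvalue(X, block_len, n_boot=200, seed=0):
    """Bootstrap p-value using L = n // block_len circular blocks (assumed scheme)."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    L = n // block_len
    observed = max_weighted_stat(X)
    offsets = np.arange(block_len)
    exceed = 0
    for _ in range(n_boot):
        starts = rng.integers(0, n, size=L)  # uniform block starts
        idx = (starts[:, None] + offsets[None, :]) % n  # wrap around the sample end
        if max_weighted_stat(X[idx.ravel()]) >= observed:
            exceed += 1
    return (exceed + 1) / (n_boot + 1)


if __name__ == "__main__":
    X = np.concatenate([np.zeros(60), np.full(60, 5.0)])[:, None]
    print(block_bootstrap_pvalue(X, block_len=5))
```

Resampling whole blocks keeps the short-range dependence inside each block while destroying the ordering that carries the change point, which is why the resampled statistic can serve as a reference distribution for the null hypothesis. The block length here is a free tuning parameter of the sketch.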

Proof of Theorem 3

It suffices to prove that, for each $1\le {k}\le {K}$,

$$\begin{aligned} \sup _{t\in (0,1)}|U_k(t)|\rightarrow \infty . \end{aligned}$$

Note that

$$\begin{aligned} \sup _{t\in (0,1)}|U_k(t)|\ge \frac{1}{\sqrt{nk^*(n-k^*)}}\Big |\sum _{i=1}^{k^*}\sum _{j=k^*+1}^{n}h(X_{i,k},X_{j,k})\Big |. \end{aligned}$$

It is enough to show that

$$\begin{aligned} \frac{1}{\sqrt{nk^*(n-k^*)}}\Big |\sum _{i=1}^{k^*}\sum _{j=k^*+1}^{n}h(X_{i,k},X_{j,k})\Big |\rightarrow \infty . \end{aligned}$$

It follows from Hoeffding's decomposition that

$$\begin{aligned} h(X_{i,k},X_{j,k})=\theta '_{k}+h_{1,k}(X_{i,k})+h_{2,k}(X_{j,k}) +g_{k}(X_{i,k},X_{j,k}), \ \ 1\le i\le k^*<j\le n, \end{aligned}$$

where $\theta '_{k}=E (h(X_{1,k}, X'_{n,k}))$, $h_{1, k}(x)=E(h(x, X'_{n,k}))-\theta '_{k}$, $h_{2, k}(y)=E(h(X_{1,k}, y))-\theta '_{k}$, $g_{k}(x, y)=h(x, y)-h_{1, k}(x)-h_{2, k}(y)-\theta '_{k},$ where $X'_{n,k}$ is independent of $X_{1, k}$ with the same distribution as $X_{n,k}$. Hence we obtain

$$\begin{aligned}&\frac{1}{\sqrt{nk^*(n-k^*)}}\Big |\sum _{i=1}^{k^*}\sum _{j=k^*+1}^{n}h(X_{i,k},X_{j,k})\Big |\\&\ \ =\frac{1}{\sqrt{nk^*(n-k^*)}}\Big |\sum _{i=1}^{k^*}\sum _{j=k^*+1}^{n}\Big (\theta '_{k}+h_{1,k}(X_{i,k})+h_{2,k}(X_{j,k}) +g_{k}(X_{i,k},X_{j,k})\Big )\Big |\\&\ \ \ge \theta '_k\sqrt{nt^*(1-t^*)}-\sqrt{\frac{n-k^*}{nk^*}}\Big |\sum _{i=1}^{k^*}h_{1,k}(X_{i,k})\Big |-\sqrt{\frac{k^*}{n(n-k^*)}}\Big |\sum _{j=k^*+1}^{n}h_{2,k}(X_{j,k})\Big |\\&\ \ \ \ -\frac{1}{\sqrt{nk^*(n-k^*)}}\Big |\sum _{i=1}^{k^*}\sum _{j=k^*+1}^{n}g_{k}(X_{i,k},X_{j,k})\Big |. \end{aligned}$$

Theorem 1.8 of Dehling and Wendler (2010b) implies that

$$\begin{aligned} \sqrt{\frac{n-k^*}{nk^*}}\Big |\sum _{i=1}^{k^*}h_{1,k}(X_{i,k})\Big |=O_P(1),\ \ \text{ and }\ \ \sqrt{\frac{k^*}{n(n-k^*)}}\Big |\sum _{j=k^*+1}^{n}h_{2,k}(X_{j,k})\Big |=O_P(1). \end{aligned}$$

Since $\theta '_k\sqrt{nt^*(1-t^*)}\rightarrow \infty $, we only need to show

$$\begin{aligned} \frac{1}{\sqrt{nk^*(n-k^*)}}\Big |\sum _{i=1}^{k^*}\sum _{j=k^*+1}^{n}g_{k}(X_{i,k},X_{j,k})\Big |=o_P(1). \end{aligned}$$

Let ${{\tilde{h}}}(x, y)=h(x+\mu _k, y+\mu _k+\lambda _k)-\theta _k'$. Then ${{\tilde{h}}}(\xi _{i,k}, \xi _{j,k})=h(X_{i,k}, X_{j,k})-\theta _k'$, and ${{\tilde{g}}}_k(\xi _{i,k}, \xi _{j,k})=g_k(X_{i,k}, X_{j,k})$, where ${{\tilde{g}}}_k(x, y)={{\tilde{h}}}(x, y)-{{\tilde{h}}}_{1,k}(x)-{{\tilde{h}}}_{2,k}(y)$, ${{\tilde{h}}}_{1,k}(x)=E({{\tilde{h}}}(x, \xi _{j,k}))$ and ${{\tilde{h}}}_{2,k}(y)=E({{\tilde{h}}}(\xi _{i,k}, y))$.

Since ${{\tilde{h}}}(x, y)$ is a 2-Lipschitz continuous function, Lemma 4.5 in Dehling and Wendler (2010a) implies that its degenerate part ${{\tilde{g}}}_k(x, y)$ is also a 2-Lipschitz continuous function. This, together with Assumption (S1), verifies that ${{\tilde{g}}}_k(\xi _{i,k}, \xi _{j,k})$ satisfies the conditions of Lemma 2. Then Lemma 2 results in

$$\begin{aligned} \frac{1}{\sqrt{nk^*(n-k^*)}}\Big |\sum _{i=1}^{k^*}\sum _{j=k^*+1}^{n}g_{k}(X_{i,k},X_{j,k})\Big |=\frac{1}{\sqrt{nk^*(n-k^*)}}\Big |\sum _{i=1}^{k^*}\sum _{j=k^*+1}^{n}{{\tilde{g}}}_k(\xi _{i,k}, \xi _{j,k})\Big |=o_P(1). \end{aligned}$$

This concludes the proof of Theorem 3. $\square $
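The divergence rate in the proof above can be seen in a small simulation. This is a sketch under assumptions made here, not an experiment from the paper: a single coordinate, independent standard normal noise, a mean shift `shift` after the change point, and the kernel $h(x, y)=y-x$, for which $\theta '_k$ equals the shift. The function name `stat_at_change` is hypothetical.

```python
import numpy as np


def stat_at_change(n, t_star=0.3, shift=1.0, seed=0):
    """Normalised double sum at the true change point and its leading term."""
    rng = np.random.default_rng(seed)
    k = round(n * t_star)  # change point k*
    x = rng.standard_normal(n)
    x[k:] += shift  # mean shift after the change point
    # sum_{i<=k<j} (x_j - x_i) for the kernel h(x, y) = y - x
    double_sum = k * x[k:].sum() - (n - k) * x[:k].sum()
    value = abs(double_sum) / np.sqrt(n * k * (n - k))
    t = k / n
    leading = shift * np.sqrt(n * t * (1.0 - t))  # theta' * sqrt(n t*(1 - t*))
    return value, leading


if __name__ == "__main__":
    for n in (250, 1000, 4000):
        value, leading = stat_at_change(n)
        print(n, round(value, 2), round(leading, 2))
```

In this setting the statistic stays within a bounded random distance of the leading term, which grows like $\sqrt{n}$, in line with the $O_P(1)$ and $o_P(1)$ bounds on the remaining terms.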

About this article

Hu, J., Wang, L. A weighted U-statistic based change point test for multivariate time series. Stat Papers 64, 753–778 (2023). https://doi.org/10.1007/s00362-022-01341-9

Received: 11 September 2021
Revised: 14 March 2022
Accepted: 05 July 2022
Published: 19 July 2022
Issue Date: June 2023
DOI: https://doi.org/10.1007/s00362-022-01341-9
