
Change-point methods for multivariate time-series: paired vectorial observations

  • Regular Article
Statistical Papers

Abstract

We consider paired and two-sample break-detection procedures for vectorial observations and multivariate time series. The new methods involve L2-type criteria based on empirical characteristic functions and are easy to compute regardless of dimension. We obtain asymptotic results that allow for application of the methods to a wide range of settings involving on-line as well as retrospective circumstances with dependence between the two time series as well as with dependence within each series. In the ensuing Monte Carlo study the new detection methods are implemented by means of resampling procedures which are properly adapted to the type of data at hand, be it independent or paired, autoregressive or GARCH structured, medium or heavy-tailed. The new methods are also applied to a real dataset from the financial sector over a time period which includes the Brexit referendum.


References

  • Alippi C, Boracchi G, Carrera D, Roveri M (2015) Change detection in multivariate datastreams: likelihood and detectability loss. In: Proceedings of the twenty-fifth international joint conference on artificial intelligence, pp 1368–1374

  • Aminikhanghahi S, Cook DJ (2017) A survey of methods for time series change point detection. Knowl Inf Syst 51(2):339–367

  • Aue A, Hörmann S, Horváth L, Reimherr M (2009) Break detection in the covariance structure of multivariate time series models. Ann Stat 37(6B):4046–4087

  • Billingsley P (2013) Convergence of probability measures. Wiley, Hoboken

  • Chan NH, Yau CY, Zhang RM (2014) Group LASSO for structural break time series. J Am Stat Assoc 109(506):590–599

  • Chen F, Meintanis SG, Zhu L (2019) On some characterizations and multidimensional criteria for testing homogeneity, symmetry and independence. J Multivar Anal 173:125–144

  • Doukhan P (1994) Mixing: properties and examples, vol 85. Lecture notes in statistics. Springer, New York

  • Doukhan P, Lang G, Leucht A, Neumann MH (2015) Dependent wild bootstrap for the empirical process. J Time Ser Anal 36(3):290–314

  • Edelmann D, Fokianos K, Pitsillou M (2019) An updated literature review of distance correlation and its applications to time series. Int Stat Rev 87(2):237–262

  • Galeano P, Peña D (2007) Covariance changes detection in multivariate time series. J Stat Plan Inference 137(1):194–211

  • Groen JJ, Kapetanios G, Price S (2013) Multivariate methods for monitoring structural change. J Appl Econ 28(2):250–274

  • Henze N, Klar B, Zhu LX et al (2005) Checking the adequacy of the multivariate semiparametric location shift model. J Multivar Anal 93(2):238–256

  • Hlávka Z, Hušková M, Kirch C, Meintanis SG (2012) Monitoring changes in the error distribution of autoregressive models based on Fourier methods. Test 21(4):605–634

  • Hlávka Z, Hušková M, Kirch C, Meintanis SG (2017a) Fourier-type tests involving martingale difference processes. Econ Rev 36(4):468–492

  • Hlávka Z, Hušková M, Meintanis SG (2017b) Change point detection with multivariate observations based on characteristic functions. In: Ferger D, González Manteiga W, Schmidt T, Wang JL (eds) From statistics to mathematical finance. Festschrift in Honour of Winfried Stute. Springer, New York, pp 273–290

  • Holmes M, Kojadinovic I, Quessy JF (2013) Nonparametric tests for change-point detection à la Gombay and Horváth. J Multivar Anal 115:16–32

  • Horváth L, Rice G (2014) Extensions of some classical methods in change point analysis. Test 23(2):219–255

  • Horváth L, Hušková M, Kokoszka P, Steinebach J (2004) Monitoring changes in linear models. J Stat Plan Inference 126(1):225–251

  • Hušková M, Kirch C (2012) Bootstrapping sequential change-point tests for linear regression. Metrika 75(5):673–708

  • Hušková M, Meintanis SG (2006) Change point analysis based on empirical characteristic functions. Metrika 63(2):145–168

  • Hušková M, Meintanis SG (2008) Tests for the multivariate k-sample problem based on the empirical characteristic function. J Nonparam Stat 20(3):263–277

  • Ibragimov I, Chasminskij R (1981) Statistical estimation: asymptotic theory. Springer, New York

  • James NA, Matteson DS (2014) ecp: an R package for nonparametric multiple change point analysis of multivariate data. J Stat Softw 62(7)

  • Jiménez-Gamero MD, Alba-Fernández M, Jodrá P, Barranco-Chamorro I (2017) Fast tests for the two-sample problem based on the empirical characteristic function. Math Comput Simul 137:390–410

  • Kim AY, Marzban C, Percival DB, Stuetzle W (2009) Using labeled data to evaluate change detectors in a multivariate streaming environment. Signal Process 89(12):2529–2536

  • Kirch C (2008) Bootstrapping sequential change-point tests. Seq Anal 27(3):330–349

  • Kuncheva LI (2013) Change detection in streaming multivariate data using likelihood detectors. IEEE Trans Knowl Data Eng 25(5):1175–1180

  • Lahiri SN (1999) Theoretical comparisons of block bootstrap methods. Ann Stat 27(1):386–404

  • Lahiri S (2003) Resampling methods for dependent data. Springer, New York

  • Lung-Yut-Fong A, Lévy-Leduc C, Cappé O (2015) Homogeneity and change-point detection tests for multivariate data using rank statistics. J Soc Franc Stat 156:133–162

  • Lütkepohl H (2005) New introduction to multiple time series analysis. Springer, New York

  • Ma TF, Yau CY (2016) A pairwise likelihood-based approach for changepoint detection in multivariate time series models. Biometrika 103(2):409–421

  • Matteson DS, James NA (2014) A nonparametric approach for multiple change point analysis of multivariate data. J Am Stat Assoc 109(505):334–345

  • Merlevède F, Peligrad M (2020) Functional clt for nonstationary strongly mixing processes. Stat Probab Lett 156:108581

  • Meynaoui A, Albert M, Laurent-Bonneau B, Marrel A (2019) Adaptive test of independence based on HSIC measures. arXiv:1902.06441

  • Nolan JP (2013) Multivariate elliptically contoured stable distributions: theory and estimation. Comput Stat 28(5):2067–2089

  • Politis DN, White H (2004) Automatic block-length selection for the dependent bootstrap. Econ Rev 23(1):53–70

  • Preuss P, Puchstein R, Dette H (2015) Detection of multiple structural breaks in multivariate time series. J Am Stat Assoc 110(510):654–668

  • Quessy JF, Éthier F (2012) Cramér-von Mises and characteristic function tests for the two and k-sample problems with dependent data. Comput Stat Data Anal 56(6):2097–2111

  • Quessy JF, Éthier F (2014) Two bootstrap strategies for a k-problem up to location-scale with dependent samples. J Probab Stat 2014:1–12

  • Rio E (2017) Asymptotic theory of weakly dependent random processes. Springer, New York

  • Rizzo ML, Székely GJ et al (2010) Disco analysis: a nonparametric extension of analysis of variance. Ann Appl Stat 4(2):1034–1055

  • Shao X (2010) The dependent wild bootstrap. J Am Stat Assoc 105(489):218–235

  • Spokoiny V et al (2009) Multiscale local change point detection with applications to value-at-risk. Ann Stat 37(3):1405–1436

  • Steland A, Von Sachs R et al (2017) Large-sample approximations for variance-covariance matrices of high-dimensional time series. Bernoulli 23(4A):2299–2329

  • Székely GJ, Rizzo ML (2005) Hierarchical clustering via joint between-within distances: extending Ward’s minimum variance method. J Classif 22(2):151–183

  • Székely GJ, Rizzo ML (2013) Energy statistics: a class of statistics based on distances. J Stat Plan Inference 143(8):1249–1272

  • Székely GJ, Rizzo ML (2016) Energy distance. Wiley Interdiscipl Rev 8(1):27–38

  • Tenreiro C (2019) On the automatic selection of the tuning parameter appearing in certain families of goodness-of-fit tests. J Stat Comput Simul 89(10):1780–1797

  • Yokoyama R (1980) Moment bounds for stationary mixing sequences. Z Wahrscheinlichkeitstheorie Verwandte Gebiete 52(1):45–57

Acknowledgements

The research was supported by the Czech Science Foundation, Grant GAČR 18-08888S.

Author information

Corresponding author

Correspondence to Zdeněk Hlávka.

Additional information

To the memory of Theophilos Cacoullos (1932–2020).

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Simos G. Meintanis—On sabbatical leave from the University of Athens.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 206 KB)

Appendix

This section contains the proofs of the assertions formulated in Sect. 3. We start with a useful auxiliary relationship and auxiliary lemmas. D denotes a generic constant which may vary from line to line.

Direct calculations give for \(t=1,\ldots ,T\)

$$\begin{aligned}&\int _{{\mathbb {R}}^p}|{\widehat{\varphi }}_{x,t}({\varvec{u}})- {\widehat{\varphi }}_{y,t}({\varvec{u}})|^2 W({\varvec{u}}) d {\varvec{u}} \\&\quad = \frac{1}{t^2} \int _{{\mathbb {R}}^p}\Big |\sum _{\tau =1}^t\{\exp (i {\varvec{u}}^\top {\varvec{x}}_{\tau })- \exp (i {\varvec{u}}^\top {\varvec{y}}_{\tau })\}\Big |^2 W({\varvec{u}}) d{\varvec{u}} \\&\quad = \frac{T}{t^2} \int _{{\mathbb {R}}^p}\Big ( Z_T(t/T, {\varvec{u}}) \Big )^2 W({\varvec{u}}) d {\varvec{u}}, \end{aligned}$$

where

$$\begin{aligned} Z_T(s, {\varvec{u}})&= \frac{1}{\sqrt{T}}\sum _{\tau =1}^{\lfloor sT\rfloor } h_\tau ({\varvec{u}}),\quad s \in (0,1],\, {\varvec{u}}\in {\mathbb {R}}^p,\\ h_\tau ({\varvec{u}})&= \cos ({\varvec{u}}^\top \varvec{x}_{\tau }) + \sin ({\varvec{u}}^\top {\varvec{x}}_{\tau }) -\{\cos ({\varvec{u}}^\top {\varvec{y}}_{\tau }) + \sin (\varvec{u}^\top {\varvec{y}}_{\tau })\} \end{aligned}$$
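The auxiliary relationship above can be checked numerically. The following is a minimal sketch, not the authors' implementation, under the assumption that the weight function satisfies \(W({\varvec{u}})=W(-{\varvec{u}})\) and is a probability density, so that the integral can be approximated by an average over draws of \({\varvec{u}}\). Writing \(A\) and \(B\) for the sums of the cosine and sine differences, the squared modulus equals \(A^2+B^2\) while \((\sum _\tau h_\tau )^2=A^2+B^2+2AB\); the cross term is odd in \({\varvec{u}}\) and therefore cancels over antithetic pairs \(\pm {\varvec{u}}\). The function names, the standard normal weight and the simulated data are illustrative assumptions.

```python
import numpy as np

def h(u, x, y):
    """h_tau(u) for each row of u (M x p) and each observation (t x p); returns M x t."""
    ux = u @ x.T
    uy = u @ y.T
    return np.cos(ux) + np.sin(ux) - np.cos(uy) - np.sin(uy)

def l2_via_ecf(u, x, y):
    """Average over u of |ecf_x(u) - ecf_y(u)|^2 (left-hand side)."""
    diff = np.exp(1j * (u @ x.T)).mean(axis=1) - np.exp(1j * (u @ y.T)).mean(axis=1)
    return np.mean(np.abs(diff) ** 2)

def l2_via_h(u, x, y):
    """Average over u of (sum_tau h_tau(u))^2 / t^2 (right-hand side)."""
    t = x.shape[0]
    return np.mean(h(u, x, y).sum(axis=1) ** 2) / t**2

rng = np.random.default_rng(0)
p, t, m = 3, 50, 2000
x = rng.normal(size=(t, p))
y = x + rng.normal(scale=0.5, size=(t, p))   # paired observations (assumed design)

# Antithetic draws from an assumed standard normal weight: the set is symmetric in u.
half = rng.normal(size=(m, p))
u = np.vstack([half, -half])

lhs = l2_via_ecf(u, x, y)
rhs = l2_via_h(u, x, y)
print(lhs, rhs)  # the two values agree up to floating-point error
```

Because the draws come in pairs \(\pm {\varvec{u}}\), the agreement is exact up to rounding rather than only approximate; with non-symmetric draws the two averages would differ by a Monte Carlo error.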

The next lemma summarizes the required assertions on partial sums of \(\alpha \)-mixing random variables.

Lemma 1

(a) Let the sequence \(\{({\varvec{x}}_t,{\varvec{y}}_t),\, t=1,\ldots \}\) be a 2p-dimensional strictly stationary \(\alpha \)-mixing sequence with coefficients \(\alpha (j)\), and let \(E h_t({\varvec{u}})=0\). Then for arbitrary \({\varvec{u}}\in {\mathbb {R}}^p\) and any \(\xi >0\)

$$\begin{aligned} E\Big | \sum _{t=1}^T h_t({\varvec{u}}) \Big |^2\le D T \Big ( E| h_1({\varvec{u}})|^{2+\xi }\Big )^{2/(2+\xi )} \sum _{j=1}^{\infty }(\alpha (j))^{\xi /(2+\xi )} \end{aligned}$$

and for any \(b_1\ge \cdots \ge b_T>0\)

$$\begin{aligned}&E\Big | \max _{1\le j\le T} b_j\sum _{t=1}^j h_t({\varvec{u}}) \Big |^{2}\le \\&D (\log T)^2 T^{(2+\delta )2} \Big ( E| h_1(\varvec{u})|^{2+\xi }\Big )^{2/(2+\xi )} \sum _{j=1}^{\infty }(\alpha (j))^{\xi /(2+\xi )} \end{aligned}$$

The inequalities remain true if \( h_t({\varvec{u}})\) is replaced by \( h_t({\varvec{u}}_1)- h_t({\varvec{u}}_2)\). If additionally

$$\begin{aligned} 0<\sigma ^2= \mathrm{{var}}\{ h_1({\varvec{u}})\} + 2\sum _{j=1}^{\infty }\mathrm{{cov}}\{ h_1({\varvec{u}}),\, h_{j+1}({\varvec{u}})\}<\infty \end{aligned}$$
(25)

then, as \(T\rightarrow \infty \), the normalized sum \(\frac{1}{\sqrt{T}} \sum _{t=1}^T h_t({\varvec{u}})\) is asymptotically normal with zero mean and variance \(\sigma ^2\).

(b) Let assumption (A.1) and (25) be satisfied. Then

$$\begin{aligned} E\Big | \sum _{t=1}^T h_t({\varvec{u}}) \Big |^{2+\delta }\le D T^{(2+\delta )/2} \Big ( E| h_1(\varvec{u})|^{2+\xi }\Big )^{2/(2+\delta )} \end{aligned}$$

and

$$\begin{aligned} E\Big |\max _{1\le t\le T} \sum _{\ell =1}^t h_\ell ({\varvec{u}}) \Big |^{2+\delta }\le D T^{(2+\delta )/2} \Big ( E| h_1(\varvec{u})|^{2+\xi }\Big )^{2/(2+\delta )}. \end{aligned}$$

Proof

The assertions follow from classical results on \(\alpha \)-mixing sequences of random variables, see e.g. Yokoyama (1980) and Doukhan (1994). \(\square \)
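The long-run variance \(\sigma ^2\) in (25) is an infinite sum of autocovariances, so in practice it has to be estimated with some truncation. The sketch below uses a Bartlett-kernel (Newey-West type) estimator, which is a standard technique substituted here for illustration and is not taken from the paper. The function names, the bandwidth, the frequency vector `u` and the simulated paired data are all assumptions.

```python
import numpy as np

def h_series(u, x, y):
    """h_t(u) for a single frequency vector u (length p) and data x, y (T x p)."""
    ux, uy = x @ u, y @ u
    return np.cos(ux) + np.sin(ux) - np.cos(uy) - np.sin(uy)

def long_run_variance(h, bandwidth):
    """Bartlett-kernel estimate of var(h_1) + 2 * sum_j cov(h_1, h_{j+1})."""
    h = np.asarray(h, dtype=float)
    n = h.size
    c = h - h.mean()                      # centre by the sample mean
    lrv = np.dot(c, c) / n                # lag-0 autocovariance
    for j in range(1, bandwidth + 1):
        gamma_j = np.dot(c[:-j], c[j:]) / n
        lrv += 2.0 * (1.0 - j / (bandwidth + 1.0)) * gamma_j
    return lrv

rng = np.random.default_rng(2)
T, p = 500, 2
common = rng.normal(size=(T, p))          # shared component makes the pair dependent
x = common + 0.5 * rng.normal(size=(T, p))
y = common + 0.5 * rng.normal(size=(T, p))
u = np.array([0.7, -0.3])                 # arbitrary illustrative frequency

h_vals = h_series(u, x, y)
sigma2_hat = long_run_variance(h_vals, bandwidth=5)
print(sigma2_hat)
```

With `bandwidth=0` the estimator reduces to the sample variance; the Bartlett weights keep the estimate nonnegative for any bandwidth.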

The following lemma is essential for the proof of Theorem 1.

Lemma 2

Let assumptions (A.1)–(A.2) and (25) be satisfied and let the null hypothesis hold true. Then

(a) for any compact subset F of \({\mathbb {R}}^p\) it holds

$$\begin{aligned} \sup _T E \int _F Z^2_T(s, {\varvec{u}}) W ({\varvec{u}}) d{\varvec{u}} < \infty , \end{aligned}$$

(b) there exist \(a>0\) and \(0<C<\infty \) such that for any \(0\le s\le 1\) and any \({\varvec{u}}_1\) and \({\varvec{u}}_2\) it holds

$$\begin{aligned} \sup _T E \Big | Z_T(s, {\varvec{u}}_1)^2 -Z_T(s, \varvec{u}_2)^2\Big |\le C \Vert {\varvec{u}}_1-{\varvec{u}}_2\Vert ^a, \end{aligned}$$

(c) the marginal distributions of \(\{Z_T(s, {\varvec{u}})\}\) converge to the marginal distributions of the Gaussian process \(\{Z(s, {\varvec{u}})\}\) with zero mean and covariance structure \((0\le s_1<s_2\le 1)\)

$$\begin{aligned} \mathrm{{cov}}\{ Z(s_1, {\varvec{u}}_1), Z(s_2, {\varvec{u}}_2)\}= s_1 \Big \{ E\big ( h_\tau ({\varvec{u}}_1) h_\tau ({\varvec{u}}_2)\big ) + 2\sum _{\ell =1}^{ \infty } E\big ( h_\tau ({\varvec{u}}_1) h_{\tau +\ell }({\varvec{u}}_2)\big )\Big \}. \end{aligned}$$

Proof

The proof follows the lines of Lemma 7.1 in Hlávka et al. (2017a), the basic difference being that the present paper works with functionals of partial sums of \(\alpha \)-mixing random vectors, while Hlávka et al. (2017a) consider functionals of martingale differences. In other words, assertions on partial sums of martingale differences are replaced by the analogous results on partial sums of \(\alpha \)-mixing random vectors.

By the assumptions, for each \({\varvec{u}}\) the sequence \(\{ h_\tau ({\varvec{u}})\}_{\tau }\) is \(\alpha \)-mixing with the same coefficients as the sequence \(\{({\varvec{x}}_\tau , \varvec{y}_{\tau })\}_{\tau }\). Moreover,

$$\begin{aligned} E Z_T(s, {\varvec{u}})&=0, \\ E Z_T^2(s, {\varvec{u}})&=\frac{\lfloor Ts\rfloor }{T} \Big (E h^2_1({\varvec{u}}) + 2 \sum _{\ell =1}^{\infty } E h_1({\varvec{u}}) h_{\ell +1}({\varvec{u}})\Big ). \end{aligned}$$

By the assumptions \(E Z_T^2(s, {\varvec{u}})\) is uniformly bounded and therefore (a) holds true.

Concerning assertion (b) notice that

$$\begin{aligned} E\big |&Z^2_T(s, {\varvec{u}}_1) -Z^2_T(s, {\varvec{u}}_2)\big | \le D \left\{ E\left| Z_T(s, {\varvec{u}}_1) -Z_T(s, \varvec{u}_2)\right| ^2\right\} ^{1/2} \\&\le D\left[ E \left\{ h_1({\varvec{u}}_1)- h_1(\varvec{u}_2)\right\} ^2\right] ^{1/2} \le D \Vert {\varvec{u}}_1- \varvec{u}_2\Vert ^a \end{aligned}$$

for some \(a>0\). Here we used the Cauchy–Schwarz inequality together with the moment inequalities for \(\alpha \)-mixing sequences from Lemma 1.

Assertion (c) is a consequence of Lemma 1. \(\square \)
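The last inequality in the proof of assertion (b) can be motivated as follows. This is only a sketch: the moment condition \(E\Vert {\varvec{x}}_1\Vert ^{2a}+E\Vert {\varvec{y}}_1\Vert ^{2a}<\infty \) for some \(a\in (0,1]\) is assumed here for illustration and may differ from the exact form of assumptions (A.1)–(A.2). Since \(|\cos \alpha -\cos \beta |\) and \(|\sin \alpha -\sin \beta |\) are both bounded by \(\min (2,|\alpha -\beta |)\), and \(\min (2,z)\le 2^{1-a}z^a\) for \(z\ge 0\),

$$\begin{aligned} | h_1({\varvec{u}}_1)- h_1({\varvec{u}}_2)|&\le 2\min (2,\Vert {\varvec{u}}_1-{\varvec{u}}_2\Vert \Vert {\varvec{x}}_1\Vert ) + 2\min (2,\Vert {\varvec{u}}_1-{\varvec{u}}_2\Vert \Vert {\varvec{y}}_1\Vert ) \\&\le 2^{2-a}\Vert {\varvec{u}}_1-{\varvec{u}}_2\Vert ^a \big (\Vert {\varvec{x}}_1\Vert ^a+\Vert {\varvec{y}}_1\Vert ^a\big ), \end{aligned}$$

so that

$$\begin{aligned} \left[ E \left\{ h_1({\varvec{u}}_1)- h_1({\varvec{u}}_2)\right\} ^2\right] ^{1/2} \le D \Vert {\varvec{u}}_1-{\varvec{u}}_2\Vert ^a \big (E\Vert {\varvec{x}}_1\Vert ^{2a}+E\Vert {\varvec{y}}_1\Vert ^{2a}\big )^{1/2}. \end{aligned}$$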

Proof

(Theorem 1) From Lemma 7.1 in Hlávka et al. (2017a), Theorem 22 in Ibragimov and Chasminskij (1981, pp. 380, 381) and Lemma 2 we get that for each \(s\in (0,1]\) as \(T\rightarrow \infty \)

$$\begin{aligned} \int _F Z_T^2 (s, {\varvec{u}}) W({\varvec{u}}) d{\varvec{u}} {\mathop {\rightarrow }\limits ^{{\mathcal {L}}}} \int _F V^2 (s, {\varvec{u}}) W({\varvec{u}}) d{\varvec{u}} \end{aligned}$$

for any compact \(F\subset {\mathbb {R}}^p\). Since \(W(\cdot )\) is integrable and \( EZ^2_T(s,{\varvec{u}})\le D\), we also have

$$\begin{aligned} \int _{{\mathbb {R}}^p} Z_T^2 (s, {\varvec{u}}) W({\varvec{u}}) d{\varvec{u}} {\mathop {\rightarrow }\limits ^{{\mathcal {L}}}} \int _{{\mathbb {R}}^p} V^2 (s, {\varvec{u}}) W({\varvec{u}}) d{\varvec{u}} \end{aligned}$$

which proves (a).

Next we study the limit behavior of the process

$$\begin{aligned} S_T(s)=\sqrt{\int _{{\mathbb {R}}^p} Z_T^2 (s, {\varvec{u}}) W({\varvec{u}}) d{\varvec{u}}}\quad s\in (0,1] \end{aligned}$$

Convergence of the finite-dimensional distributions follows from Lemma 1 in combination with the continuous mapping theorem. To obtain tightness, note that by the Minkowski inequality we have

$$\begin{aligned} |S_T(s_1) - S_T(s_2) |\le D\sqrt{\int |Z_T(s_1,{\varvec{u}})- Z_T(s_2,{\varvec{u}})|^2 W({\varvec{u}}) d {\varvec{u}}} \end{aligned}$$

and further, by the Markov and Jensen inequalities, for any \(\varepsilon >0\)

$$\begin{aligned}&P(|S_T(s_1) - S_T(s_2) |\ge \varepsilon )\\&\quad \le D E\Big (\int |Z_T(s_1,{\varvec{u}})- Z_T(s_2,{\varvec{u}})|^2 W({\varvec{u}}) d{\varvec{u}}\Big )^{1+\delta /2} \varepsilon ^{-(1+\delta /2)}\\&\quad \le D \int E|Z_T(s_1,{\varvec{u}})- Z_T(s_2,\varvec{u})|^{2+\delta } W({\varvec{u}}) d{\varvec{u}} \, \varepsilon ^{-(1+\delta /2)} \\&\quad \le D|s_1-s_2|^a. \end{aligned}$$

Then by Billingsley (2013, Theorem 15.6) we conclude that

$$\begin{aligned} S_T(\cdot ) {\mathop {\rightarrow }\limits ^{D[0,1]}} \sqrt{\int _{{\mathbb {R}}^p} V^2 (\cdot , {\varvec{u}}) W({\varvec{u}}) d{\varvec{u}}} \end{aligned}$$

which further implies that

$$\begin{aligned} \max _{s_1<s\le 1} S_T^2(s)s^{\eta -2}{\mathop {\rightarrow }\limits ^{{\mathcal {L}}}} \max _{s_1<s\le 1} s^{\eta -2}\int _{{\mathbb {R}}^p} V^2 (s, {\varvec{u}}) W({\varvec{u}}) d{\varvec{u}}. \end{aligned}$$

\(\square \)
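For a concrete weight function the process \(S_T^2(s)=\int Z_T^2(s,{\varvec{u}})W({\varvec{u}})d{\varvec{u}}\) has a closed form that involves no numerical integration. The sketch below assumes, for illustration only, that \(W\) is the density of \(N(0,\gamma ^2 I_p)\), for which \(\int \cos ({\varvec{u}}^\top {\varvec{d}})W({\varvec{u}})d{\varvec{u}}=\exp (-\gamma ^2\Vert {\varvec{d}}\Vert ^2/2)\). The function names, the parameter `scale` (playing the role of \(\gamma \)) and the simulated data are hypothetical and not taken from the paper.

```python
import numpy as np

def gaussian_kernel(a, b, scale):
    """exp(-scale^2 * ||a_j - b_k||^2 / 2) for all pairs (j, k) of rows."""
    sq = np.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=2)
    return np.exp(-0.5 * scale**2 * sq)

def s_t_squared(x, y, scale=1.0):
    """S_T^2(t/T) for t = 1, ..., T, assuming W is the N(0, scale^2 I) density."""
    T = x.shape[0]
    # g[j, k] is the integral of h_j(u) h_k(u) W(u) du under the assumed weight.
    g = (gaussian_kernel(x, x, scale) + gaussian_kernel(y, y, scale)
         - gaussian_kernel(x, y, scale) - gaussian_kernel(y, x, scale))
    # Sum of g[j, k] over j, k <= t is the diagonal of the two-way cumulative sum.
    partial = np.diag(np.cumsum(np.cumsum(g, axis=0), axis=1))
    return partial / T

rng = np.random.default_rng(1)
T, p = 200, 4
x = rng.normal(size=(T, p))
y = rng.normal(size=(T, p))

s2 = s_t_squared(x, y)
print(s2[-1])  # value of S_T^2 at s = 1
```

The cost is quadratic in \(T\) and only linear in the dimension \(p\), which is one way to read the remark in the abstract that the criteria are easy to compute regardless of dimension.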

Proof

(Theorem 2) The proof follows the lines of the proof of Theorem 1 and is therefore omitted. \(\square \)

Proof

(Theorem 3) By the inequalities in Lemma 2 we note that for any \(s\in (s_0,1]\)

$$\begin{aligned} \int _{{\mathbb {R}}^p} \Big (Z_T (s, {\varvec{u}})-EZ_T (s, \varvec{u})\Big )^2 W({\varvec{u}}) d{\varvec{u}} =O_P(1) \end{aligned}$$

and by assumptions

$$\begin{aligned} \int _{{\mathbb {R}}^p} \Big (EZ_T (s, {\varvec{u}})\Big )^2 W({\varvec{u}}) d{\varvec{u}} \rightarrow \infty . \end{aligned}$$

Combining the two displays completes the proof. \(\square \)
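The final step can be spelled out as follows. This is a sketch in which \(\Vert f\Vert _W=\{\int _{{\mathbb {R}}^p} f^2({\varvec{u}})W({\varvec{u}})d{\varvec{u}}\}^{1/2}\) is notation introduced here only for convenience. By the triangle inequality in \(L^2(W)\),

$$\begin{aligned} \Vert Z_T(s,\cdot )\Vert _W \ge \Vert EZ_T(s,\cdot )\Vert _W - \Vert Z_T(s,\cdot )-EZ_T(s,\cdot )\Vert _W . \end{aligned}$$

The first term on the right-hand side tends to infinity by the second display, while the second term is \(O_P(1)\) by the first display. Hence \(\int _{{\mathbb {R}}^p} Z_T^2(s,{\varvec{u}})W({\varvec{u}})d{\varvec{u}}\rightarrow \infty \) in probability.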

Proof

(Theorem 4) The proof is analogous to that of Theorem 3 and is therefore omitted. \(\square \)


Cite this article

Hlávka, Z., Hušková, M. & Meintanis, S.G. Change-point methods for multivariate time-series: paired vectorial observations. Stat Papers 61, 1351–1383 (2020). https://doi.org/10.1007/s00362-020-01175-3


  • DOI: https://doi.org/10.1007/s00362-020-01175-3
