1 Introduction

As a common feature of ‘big data’, change points arise in many areas such as signal processing (Basseville [1]), finance (Chen and Gupta [2]), ecology (Hawkins [3]), disease outbreak surveillance (Sparks et al. [4]), and neuroscience (Ratnam et al. [5]; Lena et al. [6]), and they have been extensively investigated over the last few decades. To detect a change point and estimate its location, a number of approaches have emerged, including least squares (LS, Bai [7]), Bayesian methods (Fearnhead [8]), maximum likelihood (Zou et al. [9]), and several nonparametric methods (Matteson and James [10]; Haynes et al. [11]). The cumulative sum (CUSUM) method, based on LS estimation, is particularly attractive for detecting a variance change in a sequence because it avoids assumptions about the underlying error distribution and is simple to compute (Gombay et al. [12]). For independent sequences, Gombay et al. [12] constructed the CUSUM statistic to detect and estimate a change of variance. Wang and Wang [13] used the CUSUM test to detect a variance change in a linear process with long memory errors. Zhao et al. [14] considered a ratio test for a variance change in a linear process. Qin et al. [15] investigated the strong convergence rate of the CUSUM estimator of a variance change in linear processes.

However, most of the references above assume that there is a single change point in the sequence, which is a serious restriction in practical applications. For multiple change point detection, Inclán and Tiao [16] employed cumulative sums of squares to detect multiple changes of variance in uncorrelated sequences. Lavielle [17] obtained the convergence rate of multiple change detection for strongly mixing and strongly dependent processes. Li and Zhao [18] gave the convergence rate for multiple change-point estimation in moving-average processes. More recently, Haynes et al. [11] proposed a computationally efficient nonparametric approach to change point detection, and Laurentiu et al. [19] offered a Bayesian loss-based approach to the change point problem. However, both require information about the underlying error distribution, which may complicate the computation.

In this contribution, we consider the following multiple variance change model:

$$ {{Y_{t}} = \mu + {\sigma _{i}} {e_{t}},\quad t_{i-1}^{*} < t \le t_{i} ^{*}, 1 \le i \le r+1,} $$
(1)

where r is the known number of change points, μ and \({\sigma _{i}}\) (\(1 \le i \le r+1\)) are parameters, \(t_{i}^{*} \), \(1 \le i \le r\), with \(t_{0}^{*} = 0\) and \(t_{r + 1}^{*} = n\), are the true change locations, \(t_{i}^{*} = [ {\tau _{i}^{*} n} ]\), where \([x]\) denotes the integer part of x, \(\boldsymbol{\tau }^{*} = ( {\tau _{1}^{*} ,\tau _{2}^{*} , \ldots ,\tau _{r}^{*}} )\) is the vector of change points, and \({e_{t}}\) is a linear process given as follows:

$$ {{e_{t}} = \sum_{j = 0}^{\infty }{{a_{j}} {\varepsilon _{t - j}}},} $$
(2)

where \(\{{a_{j}}\}\) is a sequence of real numbers satisfying \(\sum_{j = 0}^{\infty }{a_{j}^{2}} < \infty \), and \(\{ {{\varepsilon _{m}},m \in \mathbb{Z}} \}\) is a sequence of stationary random variables.

Under independence or various dependence assumptions on \(\{{{\varepsilon _{m}},m \in \mathbb{Z}} \}\), convergence rates of single change point estimators have been established for the linear process (2). We refer to Bai [7] and Qin et al. [15] for the independent case, to Li and Zhao [18] for linear negative quadrant dependence, and to Wang and Wang [13] for long range dependence. In this article, we consider the multiple variance change model and, at the same time, allow \(\{ {{\varepsilon _{m}},m \in \mathbb{Z}} \}\) to be negatively super-additive dependent (NSD), a notion defined through super-additive functions.

Definition 1

(Hu [20])

A function ϕ is called super-additive if

$$ \phi ( {\boldsymbol{x} \vee \boldsymbol{y}} ) + \phi ( {\boldsymbol{x} \wedge \boldsymbol{y}} ) \ge \phi ( \boldsymbol{x} ) + \phi ( \boldsymbol{y} ) $$

for all \(\boldsymbol{x},\boldsymbol{y} \in \mathbb{R}^{n}\), where “∨” is the componentwise maximum and “∧” the componentwise minimum.

Definition 2

(Hu [20])

A random vector \(( {{X_{1}}, {X_{2}}, \ldots ,{X_{n}}} )\) is said to be NSD if

$$ E\phi (X_{1},X_{2},\ldots ,X_{n} )\leq E \phi \bigl(X_{1} ^{*},X_{2}^{*}, \ldots ,X_{n}^{*} \bigr), $$
(3)

where \(\{ {X_{m}^{*},1 \le m \le n} \}\) are independent random variables such that \(X_{m}^{*}\) and \({X_{m}}\) have the same marginal distribution for each m, and ϕ is a super-additive function such that the expectations in (3) exist.

Definition 3

(Wang et al. [21])

A sequence of random variables \(( {{X_{1}},{X_{2}}, \ldots ,{X_{n}}, \ldots } )\) is called NSD if, for all \(n \ge 1\), \(( {{X_{1}},{X_{2}}, \ldots ,{X_{n}}} )\) is NSD.
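As a simple illustration (our own example, for \(n = 2\)): the function \(\phi ( {{x_{1}},{x_{2}}} ) = {x_{1}}{x_{2}}\) is super-additive, so applying (3) to an NSD pair \(( {{X_{1}},{X_{2}}} )\) gives

$$ E ( {{X_{1}} {X_{2}}} ) \le E \bigl( {X_{1}^{*} X_{2}^{*}} \bigr) = E{X_{1}} \, E{X_{2}}, $$

that is, \(\operatorname{Cov} ( {{X_{1}},{X_{2}}} ) \le 0\). This non-positive correlation is exactly the property exploited in formula (5) of Sect. 4.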

NSD has received considerable attention since it includes the well-known negative association (see Christofides and Vaggelatou [22]). Eghbal et al. [23] explored the strong law of large numbers and the rate of convergence for NSD sequences under higher-order moment conditions. Shen et al. [24] and Wu et al. [25] obtained almost sure convergence and complete convergence, respectively, for NSD random variables. Wang et al. [26] investigated the complete convergence, and Yu et al. [27] established the central limit theorem for weighted sums of NSD random variables. Moreover, NSD samples have been introduced into various models; for example, under NSD errors, Yu et al. [27] considered the M-test problem of regression parameters in a linear model, Wang et al. [28] studied the strong and weak consistency of the LS estimators in an EV regression model, and Yu et al. [29] obtained the convergence rates of wavelet thresholding estimators in a nonparametric regression model.

The aim of this study is to detect multiple change points for linear processes under NSD. We propose a CUSUM-type change point estimator in model (1) and establish the weak convergence rate of the estimator with the mean parameter μ replaced by its LS estimator. Moreover, some simulations are implemented in R to compare the CUSUM-type estimator with several competing methods. The results indicate that the CUSUM-type change point estimator is broadly comparable with those obtained by typical methods.

The remainder of this paper is organized as follows. In Sect. 2, we describe the CUSUM-type multiple change point estimator and give its weak convergence rate. We also give a multiple variance-change iterative (MVCI) algorithm to compute the estimator. In Sect. 3, some simulations are presented to show the performance of the estimator. Finally, the proofs of the main results are given in Sect. 4.

2 Estimation and main results

Let \({\tilde{Y}_{t}} = {Y_{t}} - {\hat{\mu }_{n}}\), where \({\hat{\mu } _{n}} = \frac{1}{n}\sum_{t = 1}^{n} {{Y_{t}}} \) is the LS estimator of the mean μ. Let

$$ {A_{n,r} = \bigl\{ { ( {{t_{0}},{t_{1}}, \ldots ,{t_{r + 1}}} ), {t_{0}} = 0 < {t_{1}} < \cdots < {t_{r}} < {t_{r + 1}} = n} \bigr\} } $$

denote the set of allowable r-partitions. We further consider the following restricted set of allowable r-partitions:

$$ A_{n,r}^{{\delta _{n}}} = \bigl\{ { ( {{t_{0}},{t_{1}}, \ldots , {t_{r + 1}}} ) \in A_{n,r}:{t_{i}} - {t_{i - 1}} \ge n{\delta _{n}}, 1 \le i \le r+1} \bigr\} , $$

where \({\delta _{n}}\) is a non-increasing non-negative sequence satisfying \({\delta _{n}} \to 0\) and \(n{\delta _{n}} \to \infty \).

For each \(t_{i}\), \(1 \le i \le r\), we define

$$ R ( {{t_{i}}} ) = \frac{{ ( {{t_{i}} - {t_{i - 1}}} ) ( {{t_{i + 1}} - {t_{i}}} )}}{{{{ ( {{t_{i + 1}} - {t_{i - 1}}} )}^{2}}}} \Biggl\vert { \frac{1}{{{t_{i}} - {t_{i - 1}}}} \sum_{t = {t_{i - 1}} + 1}^{{t_{i}}} { \tilde{Y}_{t}^{2}} - \frac{1}{ {{t_{i + 1}} - {t_{i}}}}\sum _{t = {t_{i}} + 1}^{{t_{i + 1}}} {\tilde{Y}_{t}^{2}} } \Biggr\vert . $$

The CUSUM-type multiple change point estimator is \(\hat{\boldsymbol{\tau }}^{{\delta _{n}}} = {{\hat{\boldsymbol{t}}^{{\delta _{n}}}} / n}\), where

$$ {\hat{\boldsymbol{t}}^{{\delta _{n}}} = \mathop{\arg \max } \limits _{\boldsymbol{t} \in A_{n,r}^{{\delta _{n}}}} \frac{1}{n} \sum_{i = 1}^{r} {R ( {{t_{i}}} )}.} $$
(4)
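As an illustration, a minimal R sketch of the contrast \(R ( {{t_{i}}} )\) appearing in (4) could look as follows; the function and variable names are ours (hypothetical), and y2 stands for the centered squares \(\tilde{Y}_{t}^{2}\).

```r
# Sketch: the CUSUM-type contrast R(t_i) for one interior point t_i of a
# candidate partition (t_{i-1}, t_i, t_{i+1}); y2 holds the centered squares.
# All names are illustrative, not taken from the authors' code.
R_stat <- function(y2, t_im1, t_i, t_ip1) {
  n_left  <- t_i - t_im1                        # length of (t_{i-1}, t_i]
  n_right <- t_ip1 - t_i                        # length of (t_i, t_{i+1}]
  w <- n_left * n_right / (t_ip1 - t_im1)^2     # weighting factor in R(t_i)
  left_mean  <- mean(y2[(t_im1 + 1):t_i])
  right_mean <- mean(y2[(t_i + 1):t_ip1])
  w * abs(left_mean - right_mean)
}

# Toy example: a variance change at t = 100 yields a large contrast there
set.seed(1)
y  <- c(rnorm(100, sd = 1), rnorm(100, sd = 3))
y2 <- (y - mean(y))^2
R_stat(y2, t_im1 = 0, t_i = 100, t_ip1 = 200)
```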

To derive our results, we list several conditions as follows.

  1. (A1)

    \(\{ {{\varepsilon _{m}},m \in \mathbb{Z}} \}\) are stationary NSD random variables with \(E{\varepsilon _{m}} = 0\) and \(\operatorname{Var} ( {{\varepsilon _{m}}} ) = {\sigma ^{2}} < \infty \).

  2. (A2)

    For all \(l \ge 1\), we have \(\sum_{m:| {l - m} | \ge u} {| {\operatorname{Cov} ( {{\varepsilon _{l}},{\varepsilon _{m}}} )} |} \to 0\), as \(u \to \infty \).

  3. (A3)

    \(E\varepsilon _{m}^{4} < \infty \) holds for all \(m \ge 1\).

  4. (A4)

    \(\sum_{j = 0}^{\infty }{| {{a_{j}}} |} < \infty \).

Remark 1

Conditions (A1) and (A2) are easily satisfied (see Yu et al. [29]). (A3) is often used to obtain the convergence rate of a change point estimator (e.g., Qin et al. [15]; Shi et al. [30]). Condition (A4) is weaker than the corresponding condition in Bai [7], which requires \(\sum_{j = 0}^{\infty }{j| {{a_{j}}} |} < \infty \). Furthermore, condition (A4) implies that \(\sum_{j = 0}^{\infty }{a_{j}^{2}} < \infty \) and \(\sum_{j = 0}^{\infty }{a_{j}^{4}} < \infty \).
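For instance, the weights \({a_{j}} = {2^{ - j}}\) used in Sect. 3 satisfy (A4), since

$$ \sum_{j = 0}^{\infty } \vert {a_{j}} \vert = \sum_{j = 0}^{\infty } 2^{-j} = 2 < \infty ,\qquad \sum_{j = 0}^{\infty } a_{j}^{2} = \frac{4}{3},\qquad \sum_{j = 0}^{\infty } a_{j}^{4} = \frac{16}{15}. $$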

Theorem 1

Assume that conditions (A1)–(A4) hold. Then, for all \(1 \le j \le r\), we have

$$ \hat{\tau } _{j}^{{\delta _{n}}} \to \tau _{j}^{*}, \quad \textit{in probability}. $$

When the mean μ is known, Qin et al. [15] established the strong convergence of the CUSUM estimator. Theorem 1 clearly remains true when μ is known, and we state the following corollary without proof.

Corollary 1

If the mean is known (\(\mu = 0\)) and conditions (A1)–(A4) hold, then the conclusion of Theorem 1 remains valid.

Under assumptions (A1)–(A4), we can further establish the convergence rate of the CUSUM-type multiple change point estimator \(\hat{\boldsymbol{\tau }}^{{\delta _{n}}}\).

Theorem 2

Let \(M(n)\) be a sequence of natural numbers with \(M(n)\rightarrow \infty \). Then, under the conditions of Theorem 1, we further have

$$ \hat{\tau } _{j}^{{\delta _{n}}} - \tau _{j}^{*} = o \bigl( {{{M ( n )} / n}} \bigr), \quad \textit{in probability}. $$

To implement the CUSUM-type multiple change-point method, we also give the multiple variance-change iterative (MVCI) algorithm based on Qin et al. [15] and Shi et al. [30] as follows:

Step 1. Choose \(\eta \ge 1\), compute \({\hat{\mu }_{n}}\) and \(\{ {\tilde{Y}_{t}^{2}} \}\).

Step 2. Set \(i = 1\), \(m = 0\), and \(l = [ {n{\delta _{n}}} ]\). Divide the sample into L subintervals \({I_{j}}\) of equal length l.

Step 3. For each subinterval \({I_{j}}\), \(j = 1,2, \ldots ,L\), find \(\hat{t}_{j}^{ ( i )} = \arg \max _{t \in ( {1 + m, m + l} )} R ( t )\).

Step 4. Compute the set \(\Delta = \{ {R ( {\hat{t} _{j}^{ ( i )}} )} \}\), and select the r change locations corresponding to the r largest values of \(R ( {\hat{t} _{j}^{ ( i )}} )\) in Δ.

Step 5. For each selected change location \(\hat{t}_{j} ^{ ( i )}\), \(j = 1, \ldots ,r\), find \(\hat{t}_{j}^{ ( {i + 1} )} = \arg \max _{t \in ( {\hat{t}_{j}^{ ( i )} - 2M ( l ), \hat{t}_{j}^{ ( i )} + 2M ( l )} )} R ( t )\).

Step 6. Set \(l = 4M ( l )\) and \(m = \hat{t}_{j} ^{ ( i )} - 2M ( l )\).

Step 7. If \({ \Vert { \hat{\boldsymbol{t}} ^{ ( {i + 1} )} - \hat{\boldsymbol{t}}^{ ( i )}} \Vert _{ \infty }} < \eta \), then proceed to Step 8, otherwise set \(i=i+1\), go back to Step 3.

Step 8. \({ \hat{\boldsymbol{t}}_{\mathrm{MVCI}}} = \hat{\boldsymbol{t}}^{ ( i )}\) and \({\hat{\tau }_{ \mathrm{MVCI}}} = {{ \hat{\boldsymbol{t}}^{ ( i )}} / n}\).
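A rough R sketch of the initial scan (Steps 1–4) is given below; it reuses the hypothetical R_stat function from Sect. 2, treats every point of a subinterval as a candidate, and omits the local refinement of Steps 5–7 for brevity. All names are ours, not the authors' code, and \(L \ge r\) is assumed.

```r
# Sketch of Steps 1-4 of the MVCI algorithm (initial grid scan only).
# Assumes the R_stat() sketch given earlier; names are illustrative.
mvci_initial_scan <- function(y, r, delta_n) {
  n  <- length(y)
  y2 <- (y - mean(y))^2                 # Step 1: center and square
  l  <- floor(n * delta_n)              # Step 2: subinterval length
  L  <- floor(n / l)                    # number of subintervals
  cand  <- integer(L)
  score <- numeric(L)
  for (j in seq_len(L)) {               # Step 3: maximize R within each subinterval
    ts <- ((j - 1) * l + 1):min(j * l, n - 1)
    Rs <- vapply(ts, function(t)
      R_stat(y2, t_im1 = max(t - l, 0), t_i = t, t_ip1 = min(t + l, n)),
      numeric(1))
    cand[j]  <- ts[which.max(Rs)]
    score[j] <- max(Rs)
  }
  sort(cand[order(score, decreasing = TRUE)[1:r]])  # Step 4: keep the r largest
}
```

The local refinement of Steps 5–7 would then re-maximize \(R\) over a window of half-width \(2M ( l )\) around each selected location until the estimates stabilize.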

3 Simulation studies

We present a set of simulation studies to illustrate the performance of the CUSUM-type MVCI algorithm using R. Additionally, we implement some available competitors, including segment neighborhood (SN), pruned exact linear time (PELT), binary segmentation (BS), and wild binary segmentation (WBS), for comparison with the MVCI algorithm.

In model (1), we take \(r = 4\), \(\mu = 0\), \({\sigma _{1}} = 2\), \({\sigma _{2}} = 4\), \({\sigma _{3}} = 8\), \({\sigma _{4}} = 4\), \({\sigma _{5}} = 2\), and suppose that the true change locations are \(t_{1}^{*} = 100\), \(t_{2}^{*} = 200\), \(t_{3}^{*} = 300\), \(t_{4}^{*} = 400\). We model the NSD sequence \(\{ {{\varepsilon _{m}},m \in \mathbb{Z}} \}\) as a multivariate mixture of normal distributions with joint distribution \(N ( {0,0,1,4; -0.5} )\). The sample size is taken to be \(n = 500\), and the weights satisfy \({a_{j}} = {2^{ - j}}\), \(j \ge 0\). Figure 1 displays the simulated sequence \({Y_{t}}\), \(1 \le t \le 500\), and the true change locations.

Figure 1. The simulated time sequence of \({Y_{t}}\); the red vertical lines mark the true change locations.
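A sketch of how such a series could be generated in R is given below; it is an illustration of the design, not the authors' exact code. The NSD innovations are approximated by negatively correlated bivariate normal pairs with variances 1 and 4 and correlation −0.5 (our reading of the \(N ( {0,0,1,4; -0.5} )\) notation; negatively correlated normals are negatively associated and hence NSD), and the infinite sum in (2) is truncated at lag 30.

```r
# Sketch of the simulation design in Sect. 3 (our reading, not the authors' code):
# negatively correlated normal pairs stand in for the NSD innovations, the
# filter a_j = 2^{-j} is truncated, and the variance changes at t = 100, ..., 400.
library(MASS)                                    # for mvrnorm()
set.seed(2019)
n <- 500; r <- 4
sig <- c(2, 4, 8, 4, 2)                          # segment standard deviations
tau <- c(0, 100, 200, 300, 400, n)               # true change locations
Sig <- matrix(c(1, -1, -1, 4), 2, 2)             # variances 1 and 4, correlation -0.5
lagmax <- 30
pairs <- mvrnorm(ceiling((n + lagmax) / 2), mu = c(0, 0), Sigma = Sig)
eps <- as.vector(t(pairs))[1:(n + lagmax)]       # interleave the pairs
a <- 2^(-(0:lagmax))                             # weights a_j = 2^{-j}
e <- sapply((lagmax + 1):(n + lagmax),
            function(t) sum(a * eps[t - (0:lagmax)]))  # truncated linear process
seg <- rep(seq_len(r + 1), times = diff(tau))    # segment label of each t
Y <- sig[seg] * e                                # model (1) with mu = 0
```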

To carry out SN (Auger and Lawrence [31]), PELT (Killick et al. [32]), and BS (Killick and Eckley [33]), we use the penalized likelihood method implemented in the changepoint package (Killick [34]). For WBS, we utilize the wbsts package (Korkas and Fryzlewicz [35]) with the threshold \({\lambda _{n}} = C\sqrt{2} {\log ^{{1 / 2}}}n\), where \(C = 1\). We take \({\delta _{n}} = {n^{ {{ - 1} / 2}}}\) in the MVCI algorithm. The mean squared error (MSE) of the variance change point estimator of \(\boldsymbol{\tau }^{*}\) is defined as \(\mathrm{MSE} = \frac{1}{r}\sum_{i = 1}^{r} {{{ ( {\hat{\tau }_{i} - \tau _{i}^{*}} )} ^{2}}} \), and the performance of the above methods is summarized in Table 1 (all simulations are run with 100 replicates).
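For reference, two of the competing methods from the changepoint package could be invoked roughly as follows; this is a sketch that assumes the objects Y, n, and r from the previous block, and the penalty and Q settings here are illustrative rather than necessarily those behind Table 1.

```r
# Sketch: variance-change detection with the changepoint package on the
# simulated series Y, plus the MSE of the estimated change fractions.
# Penalty and Q choices are illustrative, not necessarily those of Table 1.
library(changepoint)
tau_star <- c(100, 200, 300, 400) / n              # true change fractions

fit_pelt <- cpt.var(Y, method = "PELT",   penalty = "MBIC")
fit_bs   <- cpt.var(Y, method = "BinSeg", penalty = "MBIC", Q = r)

mse <- function(tau_hat) mean((sort(tau_hat) - tau_star)^2)  # assumes r points
mse(cpts(fit_pelt) / n)   # PELT may return a different number of change points
mse(cpts(fit_bs) / n)
```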

Table 1 Comparison of the MVCI algorithm with the SN, PELT, BS, and WBS methods

Table 1 presents the average MSEs of the MVCI, SN, PELT, BS, and WBS methods. Generally, the first change point is overestimated and the remaining change points are underestimated. When the sample size is large (\(n=500\)), all of the methods estimate the change points effectively, but the MVCI method is superior in terms of the average MSE. This also indicates that the CUSUM-type variance-change method is competitive with some of the best change point estimation methods.

4 Proofs of the theorems

Throughout the proofs, let C denote a generic positive constant, and let \({c_{0}},{c_{1}},{c_{2}},{C_{0}},{C_{1}}, \ldots ,{C_{4}}\) be specific positive constants. Denote \({x^{+} } = xI ( {x \ge 0} )\) and \({x^{-} } = - xI ( {x < 0} )\). We first state some lemmas that will be needed.

Lemma 1

(Hu [20])

An NSD random sequence \(\{ {{X_{m}},m \ge 1} \}\) possesses the following properties.

  1. (P1)

For any \({x_{1}},{x_{2}}, \ldots ,{x_{n}}\),

    $$ P ( {{X_{1}} \le {x_{1}},{X_{2}} \le {x_{2}}, \ldots ,{X_{n}} \le {x_{n}}} ) \le \prod_{m = 1}^{n} {P ( {{X_{m}} \le {x_{m}}} )} . $$
  2. (P2)

    \(\{ { - {X_{1}}, - {X_{2}}, \ldots , - {X_{n}}} \}\) is also NSD.

  3. (P3)

    Let \({f_{1}},{f_{2}}, \ldots \) be a sequence of non-decreasing Borel functions, then \(\{ {f_{n}}({X_{n}}),n \ge 1\}\) is still an NSD random sequence.

Lemma 2

(Wang et al. [21])

Suppose that \(\{ {{X_{m}},m \ge 1} \}\) is an NSD random sequence with \(E{X_{m}} = 0\) and \(E{| {{X_{m}}} |^{\alpha }} < \infty \) for some \(\alpha \ge 2\), then for all n,

$$ {E \Biggl( \max_{1 \le k \le n} { \Biggl\vert {\sum _{m = 1}^{k} {{X_{m}}} } \Biggr\vert ^{\alpha }} \Biggr) \le C \Biggl\{ {\sum_{m = 1}^{n} {E{{ \vert {{X_{m}}} \vert }^{\alpha }} + {{ \Biggl( { \sum_{m = 1}^{n} {EX_{m}^{2}} } \Biggr)}^{{\alpha / 2}}}} } \Biggr\} .} $$

Lemma 3

Suppose that \(\{ {{\varepsilon _{m}},m \in \mathbb{Z}} \}\) is an NSD random sequence for which conditions (A1)–(A2) hold, and that \(\{ {{a_{m}}, m \ge 1} \}\) is a sequence of real numbers satisfying \(\sum_{m = 1}^{\infty }{a_{m}^{2}} < \infty \). Then

$$ {\sigma _{n}^{2} = \operatorname{Var} \Biggl( {\sum _{m = 1}^{ \infty }{{a_{m}} { \varepsilon _{n-m}}} } \Biggr) \le C{\sigma ^{2}}.} $$

Proof

Write \({X_{m}} = {\varepsilon _{m}}\) for brevity. For a pair of NSD random variables X, Y, by property (P1) in Lemma 1, we have

$$ {H ( {x,y} ) = P ( {X \le x,Y \le y} ) - P ( {X \le x} )P ( {Y \le y} ) \le 0.} $$

The covariance of X and Y is verified to be negative by

$$ \operatorname{Cov} ( {X,Y} ) = E ( {XY} ) - E ( X )E ( Y ) = \int { \int {H ( {x,y} )\,dx\,dy} } \le 0. $$
(5)

Then, for \(u \ge 1\),

$$\begin{aligned} &\sum_{l,m = 1, \vert {l - m} \vert \ge u} { \bigl\vert {{a_{l}} {a_{m}} \operatorname{Cov} ( {{X_{l}},{X_{m}}} )} \bigr\vert } \\ &\quad \le \sum _{l = 1} {\sum_{m = l + u} { \bigl(a_{l}^{2} + a _{m}^{2} \bigr)} } \bigl\vert {\operatorname{Cov} ( {{X_{l}},{X_{m}}} )} \bigr\vert \\ &\quad \le \sum_{l = 1} {a_{l}^{2} \sum_{m = l + u} { \bigl\vert {\operatorname{Cov} ( {{X_{l}},{X_{m}}} )} \bigr\vert } } + \sum _{m = u + 1} {a_{m}^{2}\sum _{l = 1}^{m - u} { \bigl\vert {\operatorname{Cov} ( {{X_{l}},{X_{m}}} )} \bigr\vert } } \\ &\quad \le \sum_{l = 1} {a_{l}^{2} \sum_{ \vert {m - l} \vert \ge u} { \bigl\vert {\operatorname{Cov} ( {{X_{l}},{X_{m}}} )} \bigr\vert } } \\ &\quad \le \sup_{l} {\sum_{m = 1, \vert {l - m} \vert \ge u} | {\operatorname{Cov} ( {{X_{l}},{X_{m}}} )} } | \biggl( {\sum_{m = 1} {a_{m}^{2}} } \biggr). \end{aligned}$$

Hence, by condition (A2), for a fixed small \(\varepsilon > 0\), there exists a positive integer \(u = {u_{\varepsilon }}\) such that

$$ \sum_{l,m = 1, \vert {l - m} \vert \ge u} { \bigl\vert {{a_{l}} {a_{m}} \operatorname{Cov} ( {{X_{l}},{X_{m}}} )} \bigr\vert } \le \varepsilon . $$

Set \(K = [{1 / \varepsilon }]\) and \({Y_{m}} = \sum_{l = um + 1}^{u ( {m + 1} )} {a_{l}}{X_{l}}\), \(m = 0,1, \ldots , n \),

$$ {\varUpsilon _{m}} = \Biggl\{ {l:2Km \le l \le 2Km + K, \bigl\vert { \operatorname{Cov}( {Y_{l}},{Y_{l + 1}})} \bigr\vert \le \frac{2}{K}\sum_{l = 2Km}^{2Km + K} { \operatorname{Var}({Y_{m}})} } \Biggr\} . $$

Define \({h_{0}} = 0\), \({h_{m + 1}} = \min \{ {h:h > {h_{m}},h \in {\varUpsilon _{m}}} \}\), and put

$$\begin{aligned}& {Z_{m}} = \sum_{l = {h_{m}} + 1}^{{h_{m + 1}}} {{Y_{l}}} ,\quad m = 0,1, \ldots ,n, \\& {\varLambda _{m}} = \bigl\{ {u({h_{m}} + 1) + 1, \ldots ,u({h_{m + 1}} + 1)} \bigr\} . \end{aligned}$$

Note that

$$ {Z_{m}} = \sum_{l \in {\varLambda _{m}}} {{a_{l}}} {X_{l}},\quad m = 0,1, \ldots ,n. $$

It is easy to see that \(\# {\varLambda _{m}} \le 3Ku\), where # stands for the cardinality of a set. From Lemma 2, it follows that

$$\begin{aligned} \sigma _{n}^{2} =& E\sum _{m = 1} {Z_{m}^{2}} + \sum _{1 \le m < l \le n} { \bigl\vert {\operatorname{Cov} ( {{Z_{m}}, {Z_{l}}} )} \bigr\vert } \\ =& \sum_{m = 1} E { \biggl( {\sum _{l \in {\varLambda _{m}}} {{a_{l}}} {X_{l}}} \biggr)^{2}} + \sum_{1 \le m < l \le n, \vert {m - l} \vert = 1} { \bigl\vert {\operatorname{Cov} ( {{Z_{m}},{Z_{l}}} )} \bigr\vert }\\ &{} + \sum_{1 \le m < l \le n, \vert {m - l} \vert > 1} { \bigl\vert {\operatorname{Cov} ( {{Z_{m}},{Z_{l}}} )} \bigr\vert } \\ \le & \sum_{m = 1} {a_{m}^{2}E{{ \biggl( {\sum_{l \in {\varLambda _{m}}} {{X_{l}}} } \biggr)}^{2}}} + \sum_{m = 1} { \bigl\vert {\operatorname{Cov} ( {{Y_{{h_{m}}}},{Y _{{h_{m + 1}}}}} )} \bigr\vert }\\ &{} + \sum_{1 \le m < l \le n, \vert {m - l} \vert \ge u} { \vert {{a_{m}} {a_{l}}} \vert \bigl\vert { \operatorname{Cov} ( {{Z_{m}},{Z_{l}}} )} \bigr\vert } \\ \le & \sum_{m = 1} {\sum _{l \in {\varLambda _{m}}} {a_{l}^{2}E{{ ( {{X_{l}}} )}^{2}}} } + \frac{1}{K}\sum _{m = 1} {\operatorname{Var} ( {{Y_{{h_{m}}}}} )} + \varepsilon \\ \le & \sum_{m = 1} {\sum _{l \in {\varLambda _{m}}} {a_{l}^{2}E{{ ( {{X_{l}}} )}^{2}}} } + \frac{u}{K}\sum _{m = 1} {\sum_{l \in {\varLambda _{m}}} {a_{l}^{2}E {{ ( {{X_{l}}} )}^{2}}} } + \varepsilon \\ \le & \sum_{m = 1} {E{{ ( {{a_{m}} {X_{m}}} )} ^{2}}} + \frac{{Cu}}{K}\sum _{m = 1} {E{{ ( {{a_{m}} {X_{m}}} )}^{2}}} + \varepsilon = C{\sigma ^{2}}, \end{aligned}$$

which completes the proof of Lemma 3. □

Lemma 4

Suppose that \({e_{t}}\) is the linear process (2) with NSD innovations for which conditions (A1)–(A4) hold, and let \(\sigma _{\varepsilon }^{2} = E ( {e_{t}^{2}} )\). Then

$$ E{ \Biggl( {\sum_{t = 1}^{k} { \bigl( {e_{t}^{2} - \sigma _{\varepsilon }^{2}} \bigr)} } \Biggr)^{2}} \le Ck. $$

Proof

According to Lemma 3, we obtain

$$ \sigma _{\varepsilon }^{2} = E{ \Biggl( {\sum _{j = 0}^{\infty } {{a_{j}} {\varepsilon _{t - j}}} } \Biggr)^{2}} \le C{\sigma ^{2}} < \infty . $$

Obviously, there exists a positive number \(c_{0}\) such that

$$ e_{t}^{2} - \sigma _{\varepsilon }^{2} = \sum_{j = 0}^{\infty } {a_{j}^{2} \bigl( {\varepsilon _{t - j}^{2} - {c_{0}} \sigma ^{2}} \bigr)} + 2\sum _{0 \le s < l < \infty } {{a_{s}} {a_{l}}} \varepsilon _{t - s}\varepsilon _{t - l}. $$

Hence

$$\begin{aligned} E{ \bigl( {e_{t}^{2} - \sigma _{\varepsilon }^{2}} \bigr)^{2}} =& E { \Biggl[ {\sum _{s = 0}^{\infty }{a_{s}^{2} \bigl( {\varepsilon _{t - s} ^{2} - {c_{0}} \sigma ^{2}} \bigr)} } \Biggr]^{2}}+ 4E{ \biggl( {\sum_{0 \le s < l < \infty } {{a_{s}} {a_{l}}} \varepsilon _{t - s}\varepsilon _{t - l}} \biggr)^{2}}\\ &{}+ 4E \Biggl( { \Biggl[ {\sum _{l = 0}^{\infty }{a_{l}^{2} \bigl( {\varepsilon _{t - l} ^{2} - {c_{0}} \sigma ^{2}} \bigr)} } \Biggr] \biggl[ {\sum _{0 \le s < l < \infty } {{a_{s}} {a_{l}}} \varepsilon _{t - s} \varepsilon _{t - l}} \biggr]} \Biggr) \\ = & E \Biggl[ {\sum_{s = 0}^{\infty }{a_{s}^{4}{{ \bigl( {\varepsilon _{t - s}^{2} - {c_{0}} \sigma ^{2}} \bigr)}^{2}}} } \Biggr] + 2E \biggl[ {\sum_{0 \le s < l < \infty } {a_{l}^{2}a_{s}^{2} \bigl( {\varepsilon _{t - l}^{2} - {c_{0}} \sigma ^{2}} \bigr)} } \biggr] \bigl( {\varepsilon _{t - s}^{2} - {c_{0}}\sigma ^{2}} \bigr) \\ &{}+ 4E \biggl( { \biggl[ {\sum_{0 \le s' < l' < \infty } {{a_{s'}} {a_{l'}}} \varepsilon _{t - s'} \varepsilon _{t - l'}} \biggr] \biggl[ {\sum _{0 \le s' < l' < \infty } {{a_{s'}} {a_{l'}}} \varepsilon _{t - s'}\varepsilon _{t - l'}} \biggr]} \biggr)\\ &{} + 4E \Biggl( { \Biggl[ {\sum_{j = 0}^{\infty }{ \sum_{0 \le s < l < \infty } {a_{j}^{2}{a_{s}} {a_{l}}} \bigl( {\varepsilon _{t - s}^{2} - {c_{0}}\sigma ^{2}} \bigr) \varepsilon _{t - s}\varepsilon _{t - l}} } \Biggr]} \Biggr) \\ \le & E{ \bigl( {\varepsilon _{j}^{2} - {c_{0}} {\sigma ^{2}}} \bigr) ^{2}}\sum _{s = 0}^{\infty }{a_{s}^{4}} + 4\sum_{0 \le s < l < \infty } {a_{s}^{2}a_{l}^{2}} E \bigl( {\varepsilon _{t - s}^{2}\varepsilon _{t - l}^{2}} \bigr) \\ = :& {H_{1}} + {H_{2}}. \end{aligned}$$

By conditions (A3) and (A4), we obtain

$$ {H_{1}} \le C \bigl( {E\varepsilon _{t}^{4} - 2c_{0}{\sigma ^{2}}E \varepsilon _{t}^{2} + c_{0}^{2}{\sigma ^{4}}} \bigr) < + \infty . $$
(6)

Decomposing \(\varepsilon _{t}\) as \(\varepsilon _{t} = \varepsilon _{t}^{+} - \varepsilon _{t}^{-} \), from properties (P2) and (P3) in Lemma 1, one can see that \(\varepsilon _{t}^{+} \), \(\varepsilon _{t}^{-} \), \({ ( {\varepsilon _{t}^{-} } )^{2}}\) and \({ ( {\varepsilon _{t} ^{+} } )^{2}}\) are NSD random sequences. From formula (5), we have \(E ( {XY} ) \leq E ( X )E ( Y )\), then

$$ \begin{aligned}[b] {H_{2}} &= 4\sum _{0 \le s < l < \infty } {a_{s}^{2}a_{l} ^{2}} E \bigl( {{{ \bigl( {\varepsilon _{t - s}^{+} - \varepsilon _{t - s}^{-} } \bigr)}^{2}} {{ \bigl( {\varepsilon _{t - l}^{+} - \varepsilon _{t - l}^{-} } \bigr)}^{2}}} \bigr) \\ &\le 4\sum_{0 \le s < l < \infty } {a_{s}^{2}a_{l}^{2}} E \bigl( { \bigl[ {{{ \bigl( {\varepsilon _{t - s}^{+} } \bigr)}^{2}} + {{ \bigl( {\varepsilon _{t - s}^{-} } \bigr)}^{2}}} \bigr] \cdot \bigl[ {{{ \bigl( {\varepsilon _{t - l}^{+} } \bigr)}^{2}} + {{ \bigl( { \varepsilon _{t - l}^{-} } \bigr)}^{2}}} \bigr]} \bigr) \\ &\le 16\sum_{0 \le s < l < \infty } {a_{s}^{2}a_{l}^{2}} {\sigma ^{4}} < \infty . \end{aligned} $$
(7)

Combining (6) and (7), we get

$$ E{ \bigl( {e_{t}^{2} - \sigma _{\varepsilon }^{2}} \bigr)^{2}} < \infty . $$
(8)

Now we consider the cross term. For any \(t < j\), one can see that

$$\begin{aligned} &E \bigl[ { \bigl( {e_{t}^{2} - \sigma _{\varepsilon }^{2}} \bigr) \bigl( {e_{j}^{2} - \sigma _{\varepsilon }^{2}} \bigr)} \bigr] \\ &\quad = E \Biggl[ {\sum_{s = 0}^{\infty }{a_{s}^{2} \bigl( {\varepsilon _{t - s} ^{2} - \sigma ^{2}} \bigr) + 2\sum_{0 \le s < l < \infty } {{a_{s}} {a_{l}}} \varepsilon _{t - s} \varepsilon _{t - l}} } \Biggr]\\ &\qquad {} \cdot \Biggl[ {\sum _{s' = 0}^{\infty }{a_{s'}^{2} \bigl( {\varepsilon _{j - s'}^{2} - {c_{0}} \sigma ^{2}} \bigr) + 2 \sum _{0 \le s' < l' < \infty } {{a_{s'}} {a_{l'}}} \varepsilon _{t - s'}\varepsilon _{j - l'}} } \Biggr] \\ &\quad = E \Biggl\{ \sum_{s = 0}^{\infty }a_{s}^{2} \bigl( {\varepsilon _{t - s} ^{2} - {c_{0}} \sigma ^{2}} \bigr) \cdot \sum _{\tilde{s} = 0} ^{\infty }a_{s}^{2} \bigl( {\varepsilon _{t - s}^{2} - \sigma ^{2}} \bigr) \\ &\qquad {}+ 2\sum_{s = 0}^{\infty }{a_{s}^{2} \bigl( {\varepsilon _{t - s} ^{2} - {c_{0}} \sigma ^{2}} \bigr)} \sum _{0 \le s' < l' < \infty } {{a_{s'}} {a_{l'}}} \varepsilon _{t - s'}\varepsilon _{j - l'} \Biggr\} \\ &\qquad {}+2E \Biggl\{ {\sum_{s' = 0}^{\infty }{a_{s'}^{2} \bigl( {\varepsilon _{t - s'}^{2} - {c_{0}} \sigma ^{2}} \bigr)} \sum _{0 \le s < l < \infty } {{a_{s}} {a_{l}}} \varepsilon _{t - s} \varepsilon _{t - l}} \Biggr\} \\ &\qquad {}+ 4E \biggl\{ {\sum_{0 \le s < l < \infty } {{a_{s}} {a_{l}}} \varepsilon _{t - s} \varepsilon _{t - l}\sum_{0 \le s' < l' < \infty } {{a_{s}} {a_{l}}} \varepsilon _{j - s'}\varepsilon _{j - l'}} \biggr\} \\ &\quad = \sum_{s = 0}^{\infty }{\sum _{s' = 0}^{\infty }{a _{s}^{2}a_{s'}^{2}} } E \bigl\{ { \bigl( {\varepsilon _{t - s}^{2} - {c_{0}} {\sigma ^{2}}} \bigr) \bigl( {\varepsilon _{t - s'}^{2} - {c_{0}} {\sigma ^{2}}} \bigr)} \bigr\} \\ &\qquad {}+ 2\sum_{s = 0} { \sum_{0 \le s' < l' < \infty } {a_{s}^{2}a_{s'}a_{l'} } } E \bigl\{ { \bigl( {\varepsilon _{t - s}^{2} - {c_{0}} {\sigma ^{2}}} \bigr)\varepsilon _{j - s'}^{2}} \bigr\} \varepsilon _{j - l'} ^{2} \\ &\qquad {}+ 2\sum_{s' = 0}^{\infty }{\sum _{0 \le s < l < \infty } {a_{s'}^{2}{a_{s}}a_{l}} } E \bigl\{ { \bigl( {\varepsilon _{t - s'}^{2} - {c_{0}} {\sigma ^{2}}} \bigr)\varepsilon _{t - s}\varepsilon _{t - l} } \bigr\} \\ &\qquad {}+ 4\sum _{0 \le s < l < \infty } {\sum_{0 \le s' < l' < \infty } {a_{s}{a_{l}}a_{s'}a_{l'} } } E \{ {\varepsilon _{t - s}\varepsilon _{t - l} \varepsilon _{j - s'}\varepsilon _{j - l'}} \}. \end{aligned}$$

Let \(s' = j - t + s\), \(l' = j - t + l\), similar to the proof of inequality (8), we have

$$ \begin{aligned} &E \bigl[ { \bigl( {e_{t}^{2} - \sigma _{\varepsilon }^{2}} \bigr) \bigl( {e_{j}^{2} - \sigma _{\varepsilon }^{2}} \bigr)} \bigr]\\ &\quad = \sum _{s = 0}^{\infty }{a_{s}^{2}a_{j - t + s}^{2}} E{ \bigl( {\varepsilon _{t - s}^{2} - {c_{0}} \sigma ^{2}} \bigr)^{2}} + 4 \sum _{0 \le s < l < \infty } {{a_{s}} {a_{l}}} a_{j - t + s} a_{j - t + l}E{ ( {\varepsilon _{t - s} \varepsilon _{t - l} } )^{2}} < \infty . \end{aligned} $$

Hence

$$\begin{aligned} &\sum_{1 \le t < j \le k} { \bigl\vert {E \bigl[ { \bigl( {e_{t}^{2} - \sigma _{\varepsilon }^{2}} \bigr) \bigl( {e_{j}^{2} - \sigma _{\varepsilon }^{2}} \bigr)} \bigr]} \bigr\vert } \\ &\quad = \sum_{1 \le t < j \le k} { \Biggl\vert {E{{ \bigl( {\varepsilon _{t}^{2} - {c_{0}}\sigma ^{2}} \bigr)}^{2}}\sum_{s = 0}^{\infty } {a_{s}^{2}a_{j - t + s}^{2}} + 4\sigma ^{2}\sum_{0 \le s < l < \infty } {{a_{s}} {a_{l}}} a_{j - t + s}a _{j - t + l}} \Biggr\vert } \\ &\quad \le C\sigma ^{4}\sum _{t = 1}^{k - 1} {\sum_{j = t + 1}^{k} {\sum_{s = 0}^{\infty }{a_{s}^{2}a_{j - t + s}^{2}} } } + 4\sigma ^{4}\sum _{t = 1}^{k - 1} {\sum_{j = t + 1} ^{k} {\sum_{0 \le s < l < \infty } { \vert {{a_{s}} {a_{l}}a_{j - t + s}a_{j - t + l}} \vert } } } \\ &\quad = C \Biggl\{ {\sum_{t = 1}^{k - 1} {\sum _{s = 0}^{ \infty }{a_{s}^{2}} \sum_{j = t + 1}^{k} {a_{j - t + s}^{2}} } + \sum_{t = 1}^{k - 1} {\sum _{0 \le s < l < \infty } { \vert {{a_{s}} {a_{l}}} \vert \sum_{j = t + 1}^{k} { \vert {a_{j - t + s} a_{j - t + l}} \vert } } } } \Biggr\} \\ &\quad \le C \Biggl\{ {k{{ \Biggl( {\sum_{s = 0}^{\infty }{a_{s}^{2}} } \Biggr)}^{2}} + k{{ \Biggl( {\sum_{s = 0}^{\infty }{a_{s} } } \Biggr)}^{2}} \Biggl( {\sum_{u = 0}^{\infty }{a_{u}^{2}} } \Biggr)} \Biggr\} \le Ck. \end{aligned}$$

 □

Lemma 5

Let \({Y_{1}},{Y_{2}}, \ldots ,{Y_{n}}\) be a sample from model (1), and set \({\tilde{Y}_{t}} = {Y_{t}} - {\hat{\mu } _{n}}\) with \({\hat{\mu }_{n}} = \frac{1}{n}\sum_{t = 1}^{n} {{Y _{t}}}\). If assumptions (A1)–(A4) hold, then for any \(\varepsilon > 0\),

$$ P \Biggl( \max_{1 \le k \le n} \frac{1}{k} \Biggl\vert { \sum_{t = 1}^{k} { \bigl( { \tilde{Y}_{t}^{2} - E\tilde{Y}_{t}^{2}} \bigr)} } \Biggr\vert > \varepsilon \Biggr) \le \frac{C}{{\sqrt{n} }}. $$

Proof

Note that

$$ {\sum_{t = 1}^{k} {\tilde{Y}_{t}^{2}} = \sum_{t = 1} ^{k} {\sigma _{i}^{2} e_{t}^{2}} - 2 ( {{{ \hat{\mu }}_{n}} - \mu } )\sum_{t = 1}^{k} {\sigma _{i}^{2} e_{t}^{2}} + k { ( {{{\hat{\mu }}_{n}} - \mu } )^{2}},} $$

where \({e_{t}} = ({Y_{t}} - \mu )/\sigma _{i}\), then

$$\begin{aligned} &P \Biggl( \max_{1 \le k \le n} \frac{1}{k} \Biggl\vert {\sum_{t = 1}^{k} { \bigl( { \tilde{Y}_{t}^{2} - E\tilde{Y}_{t}^{2}} \bigr)} } \Biggr\vert > \varepsilon \Biggr) \\ &\quad \le {P \Biggl( \max_{1 \le k \le n} \frac{\sigma _{i}^{2}}{k} \Biggl\vert {\sum_{t = 1}^{k} { \bigl( {e_{t}^{2} - Ee_{t} ^{2}} \bigr)} } \Biggr\vert > \frac{\varepsilon }{3} \Biggr)} + P \biggl( { \bigl\vert {{{ ( {{{\hat{\mu }}_{n}} - \mu } )}^{2}} - E{{ ( {{{\hat{\mu }}_{n}} - \mu } )}^{2}}} \bigr\vert > \frac{\varepsilon }{3}} \biggr) \\ &\qquad {}+{ P \Biggl( \max_{1 \le k \le n} \frac{\sigma _{i} ^{2}}{k} \Biggl\vert {2 ( {{{\hat{\mu }}_{n}} - \mu } )\sum _{t = 1}^{k} {e_{t}^{2}} - 2E ( {{{\hat{\mu }}_{n}} - \mu } )\sum_{t = 1}^{k} {e_{t}^{2}} } \Biggr\vert > \frac{\varepsilon }{3} \Biggr)} \\ &\quad =: {J_{1}} + {J_{2}} + {J_{3}}. \end{aligned}$$

Applying Lemma 4, it is easy to see that \({J_{1}} \le C/{\sqrt{n} }\). From Markov’s inequality, \({J_{2}}\) is bounded by

$$\begin{aligned} {J_{2}} =& P \Biggl( {{\sigma _{i}^{2}} \Biggl\vert {{{ \Biggl( {\frac{1}{n} \sum _{t= 1}^{n} {e_{t}} } \Biggr)}^{2}} - E{{ \Biggl( {\frac{1}{n} \sum _{t = 1}^{n} {e_{t}} } \Biggr)}^{2}}} \Biggr\vert > \frac{\varepsilon }{3}} \Biggr) \\ \le & \frac{3{\sigma _{i}^{2}}}{\varepsilon }E \Biggl\vert {{{ \Biggl( {\frac{1}{n} \sum_{t = 1}^{n} {e_{t}} } \Biggr)}^{2}} - E{{ \Biggl( {\frac{1}{n}\sum _{t = 1}^{n} {e_{t}} } \Biggr)}^{2}}} \Biggr\vert \\ \le & \frac{6{\sigma _{i}^{2}}}{\varepsilon }E{ \Biggl( {\frac{1}{n} \sum _{t = 1}^{n} {e_{t}} } \Biggr)^{2}} \le \frac{C}{n}. \end{aligned}$$

Now we show that \({J_{3}} \le C/{\sqrt{n} }\). By the Cauchy–Schwarz inequality, it follows that

$$ \begin{aligned} &\max_{1 \le k \le n} E \Biggl\vert { ( {{{\hat{\mu }}_{n}} - \mu } )\frac{1}{k}\sum _{t = 1}^{k} {{{\sigma _{i} e _{t}}}} } \Biggr\vert \\ &\quad \le {\sigma _{i}^{2}} \max_{1 \le k \le n} { \Biggl( {E{{ \Biggl( {\frac{1}{n}\sum _{t = 1}^{n} {{e_{t}}} } \Biggr)}^{2}}} \Biggr)^{{1 / 2}}}\max_{1 \le k \le n} { \Biggl( {E{{ \Biggl( {\frac{1}{k}\sum_{t = 1}^{k} {{e_{t}}} } \Biggr)} ^{2}}} \Biggr)^{{1 / 2}}} \le \frac{C}{{\sqrt{n} }}. \end{aligned} $$

Therefore

$$\begin{aligned} &P \Biggl( \max_{1 \le k \le n} \frac{1}{k} \Biggl\vert {2 ( {{{\hat{\mu }}_{n}} - \mu } )\sum _{t = 1}^{k} {{{\sigma _{i}}e_{t}}} } \Biggr\vert > \frac{\varepsilon }{6} \Biggr)\\ &\quad \le P \Biggl( \max _{1 \le k \le n} \Biggl\vert {{{ \Biggl( {\sum _{t = 1}^{k} {{e_{t}}} } \Biggr)}^{2}} + \frac{1}{k} \Biggl( {\sum _{t = 1}^{k} {e_{t}^{2}} } \Biggr)} \Biggr\vert > \frac{\varepsilon }{6 {\sigma _{i}^{2}}} \Biggr) \\ &\quad \le P \Biggl( \max_{1 \le k \le n} {{ \Biggl( {\sum _{t = 1}^{k} {{e_{t}}} } \Biggr)}^{2}} > \frac{\varepsilon }{{12{\sigma _{i}^{2}}}} \Biggr) + P \Biggl( \max _{1 \le k \le n} \frac{1}{k} \Biggl( {\sum _{t = 1}^{k} {e_{t}^{2}} } \Biggr) > \frac{\varepsilon }{{12{\sigma _{i}^{2}}}} \Biggr) \\ &\quad \le \frac{{{c_{1}}}}{{\sqrt{n} }} + \frac{{{c_{2}}}}{n} \le \frac{C}{ {\sqrt{n} }}. \end{aligned}$$

Thus the proof of Lemma 5 is completed. □

Proof of Theorem 1

Let \({\delta _{0}} = { ( {\sigma _{i} ^{2} - \sigma _{i - 1}^{2}} )}\sum_{j = 0}^{\infty } {a_{j}^{2}} \). For \({t_{i}} \le t_{i}^{*} \), we have

$$\begin{aligned} &ER ( {{t_{i}}} ) \\ &\quad = \frac{{ ( {{t_{i}} - {t_{i - 1}}} ) ( {{t_{i + 1}} - {t_{i}}} )}}{{{{ ( {{t_{i + 1}} - {t_{i - 1}}} )}^{2}}}} \Biggl\vert {E \Biggl\{ {\frac{1}{{{t_{i}} - {t_{i - 1}}}}\sum_{t = {t_{i - 1}} + 1}^{{t_{i}}} {\tilde{Y} _{t}^{2}} - \frac{1}{{{t_{i + 1}} - {t_{i}}}}\sum _{t = {t_{i}} + 1}^{{t_{i + 1}}} {\tilde{Y}_{t}^{2}} } \Biggr\} } \Biggr\vert \\ &\quad = \frac{{ ( {{t_{i}} - {t_{i - 1}}} ) ( {{t_{i + 1}} - {t_{i}}} )}}{{{{ ( {{t_{i + 1}} - {t_{i - 1}}} )} ^{2}}}} \Biggl\vert {E \Biggl\{ {\frac{1}{{{t_{i}} - {t_{i - 1}}}}\sum _{t = {t_{i - 1}} + 1}^{{t_{i}}} {\tilde{Y}_{t}^{2}} - \frac{1}{ {{t_{i + 1}} - {t_{i}}}}\sum_{t = {t_{i}} + 1}^{t_{i}^{*} } {\tilde{Y}_{t}^{2}} - \frac{1}{{{t_{i + 1}} - {t_{i}}}}\sum _{t = t_{i}^{*} + 1}^{{t_{i + 1}}} {\tilde{Y}_{t}^{2}} } \Biggr\} } \Biggr\vert \\ &\quad = \frac{{ ( {{t_{i}} - {t_{i - 1}}} ) ( {{t_{i + 1}} - {t_{i}}} )}}{{{{ ( {{t_{i + 1}} - {t_{i - 1}}} )} ^{2}}}} \Biggl\vert {\sigma _{i - 1}^{2} \sum_{j = 0}^{\infty }{a_{j}^{2}} - \frac{{t_{i}^{*} - {t_{i}}}}{{{t_{i + 1}} - {t_{i}}}}\sigma _{i - 1} ^{2}\sum _{j = 0}^{\infty }{a_{j}^{2}} - \frac{{{t_{i + 1}} - t _{i}^{*} }}{{{t_{i + 1}} - {t_{i}}}}\sigma _{i}^{2}\sum _{j = 0} ^{\infty }{a_{j}^{2}} } \Biggr\vert \\ &\quad = \frac{{ ( {{t_{i}} - {t_{i - 1}}} ) ( {{t_{i + 1}} - {t_{i}}} )}}{{{{ ( {{t_{i + 1}} - {t_{i - 1}}} )} ^{2}}}}\frac{{ ( {{t_{i + 1}} - t_{i}^{*} } )}}{{{t_{i + 1}} - {t_{i}}}} \vert {{\delta _{0}}} \vert \\ &\quad = \frac{{ ( {{t_{i}} - {t_{i - 1}}} ) ( {{t_{i + 1}} - t_{i}^{*} } )}}{{{{ ( {{t_{i + 1}} - {t_{i - 1}}} )} ^{2}}}} \vert {{\delta _{0}}} \vert . \end{aligned}$$

Similarly, for \({t_{i}} \ge t_{i}^{*} \),

$$\begin{aligned} &ER ( {{t_{i}}} ) \\ &\quad = \frac{{ ( {{t_{i}} - {t_{i - 1}}} ) ( {{t_{i + 1}} - {t_{i}}} )}}{{{{ ( {{t_{i + 1}} - {t_{i - 1}}} )}^{2}}}} \Biggl\vert {E \Biggl\{ {\frac{1}{{{t_{i}} - {t_{i - 1}}}}\sum_{t = {t_{i - 1}} + 1}^{t_{i}^{*} } {\tilde{Y}_{t}^{2}} + \frac{1}{{{t_{i}} - {t_{i - 1}}}}\sum _{t = t_{i}^{*} + 1}^{t_{i}} {\tilde{Y}_{t}^{2}} - \frac{1}{ {{t_{i + 1}} - {t_{i}}}}\sum_{t = t_{i}^{*} + 1}^{{t_{i + 1}}} { \tilde{Y}_{t}^{2}} } \Biggr\} } \Biggr\vert \\ &\quad = \frac{{ ( {{t_{i}} - {t_{i - 1}}} ) ( {{t_{i + 1}} - {t_{i}}} )}}{{{{ ( {{t_{i + 1}} - {t_{i - 1}}} )} ^{2}}}} \Biggl\vert {\frac{{t_{i}^{*} - t_{i}}}{{{t_{i + 1}} - {t_{i}}}} \sigma _{i - 1}^{2}\sum_{j = 0}^{\infty }{a_{j}^{2}} + \frac{ {{t_{i}} - t_{i}^{*} }}{{{t_{i}} - {t_{i - 1}}}}\sigma _{i}^{2}\sum _{j = 0}^{\infty }{a_{j}^{2}} - \sigma _{i}^{2}\sum_{j = 0}^{\infty }{a_{j}^{2}} } \Biggr\vert \\ &\quad = \frac{{ ( {{t_{i + 1}} - {t_{i}}} ) ( {t_{i}^{*} - {t_{i - 1}}} )}}{{{{ ( {{t_{i + 1}} - {t_{i - 1}}} )} ^{2}}}} \vert {{\delta _{0}}} \vert . \end{aligned}$$

Note that \(ER ( {{t_{i}}} )\) is increasing for \({t_{i}} \le t_{i}^{*} \) and decreasing for \({t_{i}} \ge t_{i}^{*} \); thus the maximum of \(ER ( {{t_{i}}} )\) is

$$ \bigl\vert {ER \bigl( {t_{i}^{*} } \bigr)} \bigr\vert = \frac{{ ( {{t_{i + 1}} - t_{i}^{*} } ) ( {t_{i}^{*} - {t_{i - 1}}} )}}{ {{{ ( {{t_{i + 1}} - {t_{i - 1}}} )}^{2}}}} \vert {{\delta _{0}}} \vert . $$

By direct calculation, it follows that

$$ \bigl\vert {ER \bigl( {t_{i}^{*} } \bigr)} \bigr\vert - \bigl\vert {ER ( {t_{i}} )} \bigr\vert \ge \frac{{ ( {t_{i}^{*} - {t_{i - 1}}} ) \wedge ( {{t_{i + 1}} - t_{i}^{*} } )}}{{{{ ( {{t _{i + 1}} - {t_{i - 1}}} )}^{2}}}} \bigl\vert {{t_{i}} - t_{i}^{*} } \bigr\vert \vert {{\delta _{0}}} \vert \ge \frac{{{C_{0}}}}{n} \bigl\vert {{t_{i}} - t_{i}^{*} } \bigr\vert = {C_{0}} \bigl\vert {{\tau _{i}} - \tau _{i}^{*} } \bigr\vert . $$

In order to prove Theorem 1, it suffices to show that, for any \(\varepsilon > 0\),

$$ P \bigl( {{{ \bigl\Vert {\hat{\boldsymbol{\tau }} - \boldsymbol{\tau } ^{*} } \bigr\Vert }_{\infty }} \ge \varepsilon } \bigr) \to 0. $$

Since \(R ( {{t_{i}}} )=R ( {{t_{i}}} )-ER ( {{t_{i}}} )+ ( ER ( {{t_{i}}} )-R ( {{t_{i}}^{*}} ) )+R ( {{t_{i}}^{*}} )\) and \(| {R ( {t_{i}} ^{*} ) - ER ( {t_{i} } ^{*} )} |\leq \max_{t \in A_{n,r}^{{\delta _{n}}}} | {R ( {{t_{i}}} ) - ER ( {{t_{i}}} )} |\), then

$$\begin{aligned} \bigl\vert {R ( {{t_{i}}} )} \bigr\vert - \bigl\vert {R \bigl( {t_{i} }^{*} \bigr)} \bigr\vert \le & \bigl\vert {R ( {{t_{i}}} ) - ER ( {{t_{i}}} )} \bigr\vert + \bigl\vert {R \bigl( {t_{i}} ^{*} \bigr) - ER \bigl( {t_{i} } ^{*} \bigr)} \bigr\vert + ER ( {{t_{i}}} ) - ER \bigl( {t_{i} }^{*} \bigr) \\ \le & 2\max_{t \in A_{n,r}^{{\delta _{n}}}} \bigl\vert {R ( {{t_{i}}} ) - ER ( {{t_{i}}} )} \bigr\vert + ER ( {{t_{i}}} ) - ER \bigl( {t_{i} }^{*} \bigr). \end{aligned}$$

Define \({\varLambda _{n,r}} = \{ { \boldsymbol{t} \in A_{n,r}^{ {\delta _{n}}},{{ \Vert { \boldsymbol{t} - { \boldsymbol{t}^{*} }} \Vert } _{\infty }} \ge n\varepsilon } \}\), then

$$ \begin{aligned}[b] P \bigl( {{{ \bigl\Vert { \hat{\boldsymbol{\tau }} - \boldsymbol{\tau } ^{*} } \bigr\Vert }_{\infty }} \ge \varepsilon } \bigr) &\le P \Biggl( \max_{ \boldsymbol{t} \in {\varLambda _{n,r}}} \sum _{i = 1}^{r} { \bigl\{ { \bigl\vert {R ( {t_{i}} )} \bigr\vert - \bigl\vert {R \bigl( {t_{i}^{*} } \bigr)} \bigr\vert } \bigr\} } \ge 0 \Biggr) \\ &\le P \Biggl( 2\max_{ \boldsymbol{t} \in {\varLambda _{n,r}}} \sum _{i = 1}^{r} { \bigl\{ { \bigl\vert {R ( {t_{i}} ) - ER ( {t_{i}} )} \bigr\vert } \bigr\} } - \sum_{i = 1}^{r} {{C_{0}} {{ \bigl\Vert { \hat{\boldsymbol{\tau }} - \boldsymbol{\tau } ^{*} } \bigr\Vert } _{\infty }}} \ge 0 \Biggr) \\ &\le P \Bigl( \max_{1 \le i \le r} \bigl\vert {R ( {t_{i}} ) - ER ( {t_{i}} )} \bigr\vert \ge \delta \Bigr), \end{aligned} $$
(9)

where \(\delta = {{{C_{0}}\varepsilon } / 2}\) is an arbitrarily small positive number. According to the definition of \(ER ( {{t_{i}}} )\), one can see that

$$ \begin{aligned}[b] &\max_{1 \le i \le r} \bigl\vert {R ( {t_{i}} ) - ER ( {t_{i}} )} \bigr\vert \\ &\quad = \max_{1 \le {t_{i - 1}} < {t_{i}} < {t_{i + 1}} \le n} \frac{{ ( {{t_{i}} - {t_{i - 1}}} ) ( {{t_{i + 1}} - {t_{i}}} )}}{ {{{ ( {{t_{i + 1}} - {t_{i - 1}}} )}^{2}}}} \Biggl\vert \Biggl\{ \frac{1}{{{t_{i}} - {t_{i - 1}}}}\sum_{t = {t_{i - 1}} + 1} ^{{t_{i}}} { \bigl( {\tilde{Y}_{t}^{2} - E\tilde{Y}_{t}^{2}} \bigr)} \\ &\qquad {}- \frac{1}{{{t_{i + 1}} - {t_{i}}}}\sum _{t = {t_{i}} + 1}^{ {t_{i + 1}}} { \bigl( {\tilde{Y}_{t}^{2} - E\tilde{Y}_{t}^{2}} \bigr)} \Biggr\} \Biggr\vert \\ &\quad \le \max_{1 \le {t_{i - 1}} < {t_{i}} \le n} \frac{1}{ {{t_{i}} - {t_{i - 1}}}} \Biggl\vert { \sum_{t = {t_{i - 1}} + 1}^{{t_{i}}} { \bigl( {\tilde{Y}_{t}^{2} - E\tilde{Y}_{t}^{2}} \bigr)} } \Biggr\vert \\ &\qquad {}+ \max_{1 \le {t_{i}} < {t_{i + 1}} \le n} \frac{1}{ {{t_{i + 1}} - {t_{i}}}} \Biggl\vert {\sum_{t = {t_{i}} + 1}^{{t_{i + 1}}} { \bigl( {\tilde{Y}_{t}^{2} - E\tilde{Y}_{t}^{2}} \bigr)} } \Biggr\vert . \end{aligned} $$
(10)

From (9) and (10), the proof of Theorem 1 will be completed by showing

$$ P \Biggl( {\max_{1 \le {t_{i - 1}} < {t_{i}} \le n} \frac{1}{ {{t_{i}} - {t_{i - 1}}}} \Biggl\vert { \sum_{t = {t_{i - 1}} + 1}^{{t_{i}}} { \bigl( {\tilde{Y}_{t}^{2} - E\tilde{Y}_{t}^{2}} \bigr)} } \Biggr\vert > \delta } \Biggr) \to 0,\quad n \to \infty , $$
(11)

and

$$ P \Biggl( \max_{1 \le {t_{i}} < {t_{i + 1}} \le n} \frac{1}{ {{t_{i + 1}} - {t_{i}}}} \Biggl\vert { \sum_{t = {t_{i}} + 1}^{{t_{i + 1}}} { \bigl( {\tilde{Y}_{t}^{2} - E\tilde{Y}_{t}^{2}} \bigr)} } \Biggr\vert > \delta \Biggr) \to 0,\quad n \to \infty . $$
(12)

Since Eq. (12) can be proved similarly to (11), we only consider Eq. (11); the proof of Theorem 1 is then finished by taking \(k = {t_{i}} - {t_{i - 1}}\) in Lemma 5. □

Proof of Theorem 2

Let θ be a constant in the interval \(( {0,1} )\). Denote \(D_{n,r}^{M ( n )} = \{ t \in A_{n,r}^{{\delta _{n}}}, n\theta > {{ \Vert {t - {t^{*} }} \Vert }_{\infty }} > M ( n ) \}\). By Theorem 1, we have

$$\begin{aligned} P \bigl( {{{ \bigl\Vert { \hat{\boldsymbol{\tau }} - \boldsymbol{\tau } ^{*} } \bigr\Vert }_{\infty }} > {{M ( n )} / n}} \bigr) \le & P \bigl( {{{ \bigl\Vert {\hat{\boldsymbol{\tau }} - \boldsymbol{\tau } ^{*} } \bigr\Vert } _{\infty }} \ge \theta } \bigr) + P \bigl( {\theta > {{ \bigl\Vert {\hat{\boldsymbol{\tau }} - \boldsymbol{\tau } ^{*} } \bigr\Vert } _{\infty }} > {{M ( n )} / n}} \bigr) \\ \le & \varepsilon + P \Biggl( \max_{t \in D_{n,r}^{M ( n )}} \sum _{i = 1}^{r} { \bigl\{ { \bigl\vert {R ( {t_{i}} )} \bigr\vert - \bigl\vert {R \bigl( {t_{i} ^{*} } \bigr)} \bigr\vert } \bigr\} } \ge 0 \Biggr). \end{aligned}$$

Without loss of generality, we assume that \({\delta _{0}} < 0\). In view of the fact that \(| x | \ge | y |\) is equivalent to (i) \(x - y \ge 0\) and \(x + y \ge 0\), or (ii) \(x - y \le 0\) and \(x + y \le 0\), then

$$\begin{aligned} &P \Biggl( \max_{\boldsymbol{t} \in D_{n,r}^{M ( n )}} \sum_{i = 1}^{r} { \bigl\{ { \bigl\vert {R ( {t_{i}} )} \bigr\vert - \bigl\vert {R \bigl( {t_{i}^{*} } \bigr)} \bigr\vert } \bigr\} } \ge 0 \Biggr) \\ &\quad \le P \Biggl( \max_{\boldsymbol{t} \in D_{n,r}^{M ( n )}} \sum _{i = 1}^{r} { \bigl\{ {R ( {{t_{i}}} ) - R \bigl( {t_{i}^{*} } \bigr)} \bigr\} } \ge 0 \Biggr) + P \Biggl( \max_{\boldsymbol{t} \in D_{n,r}^{M ( n )}} \sum_{i = 1}^{r} { \bigl\{ {R ( {{t_{i}}} ) + R \bigl( {t_{i}^{*} } \bigr)} \bigr\} } < 0 \Biggr) \\ &\quad \le P \Biggl( \max_{\boldsymbol{t} \in D_{n,r}^{M ( n )},{t_{i}} < t _{i}^{*} } \sum _{i = 1}^{r} { \bigl\{ {R ( {{t_{i}}} ) - R \bigl( {t_{i}^{*} } \bigr)} \bigr\} } \ge 0 \Biggr) + P \Biggl( \max_{\boldsymbol{t} \in D_{n,r}^{M ( n )},{t_{i}} \ge t_{i}^{*} } \sum_{i = 1}^{r} { \bigl\{ {R ( {{t_{i}}} ) - R \bigl( {t_{i}^{*} } \bigr)} \bigr\} } \ge 0 \Biggr) \\ &\qquad {}+ P \Biggl( \max_{\boldsymbol{t} \in D_{n,r}^{M ( n )}} \sum _{i = 1}^{r} { \bigl\{ {R ( {{t_{i}}} ) + R \bigl( {t_{i}^{*} } \bigr)} \bigr\} } < 0 \Biggr) \\ &\quad =: {T_{1}} + {T_{2}} + {T_{3}}. \end{aligned}$$

For \({t_{i}} < t_{i}^{*} \), we have

$$\begin{aligned} & \bigl\vert {R ( {{t_{i}}} ) - ER ( {{t_{i}}} ) - \bigl( {R \bigl( {t_{i}^{*} } \bigr) - ER \bigl( {t_{i}^{*} } \bigr)} \bigr)} \bigr\vert \\ &\quad \le \frac{{{C_{1}} \vert {{t_{i}} - t_{i}^{*} } \vert }}{{{{ ( {{t_{i + 1}} - {t_{i - 1}}} )}^{2}}}} \Biggl\vert {\sum _{t = {t_{i - 1}} + 1} ^{{t_{i}}} { \bigl( {\tilde{Y}_{t}^{2} - E\tilde{Y}_{t}^{2}} \bigr)} } \Biggr\vert + \frac{{{C_{2}} \vert {{t_{i}} - t_{i}^{*} } \vert }}{{{{ ( {{t_{i + 1}} - {t_{i - 1}}} )}^{2}}}} \Biggl\vert {\sum_{t = t_{i}^{*} + 1}^{ {t_{i + 1}}} { \bigl( {\tilde{Y}_{t}^{2} - E\tilde{Y}_{t}^{2}} \bigr)} } \Biggr\vert \\ &\qquad {}+\frac{{{C_{3}}}}{{{t_{i + 1}} - {t_{i - 1}}}} \Biggl\vert {\sum _{t = {t_{i - 1}} + 1}^{t_{i}^{*} } { \bigl( {\tilde{Y}_{t} ^{2} - E\tilde{Y}_{t}^{2}} \bigr)} } \Biggr\vert + \frac{{{C_{4}}}}{{{t_{i + 1}} - {t_{i - 1}}}} \Biggl\vert {\sum_{t = {t_{i - 1}} + 1}^{t_{i}^{*} } { \bigl( {\tilde{Y}_{t}^{2} - E\tilde{Y}_{t}^{2}} \bigr)} } \Biggr\vert . \end{aligned}$$

Since \({\delta _{0}} < 0\), \(E{R} ( {{t_{i}}} ) \ge 0\), then for \(1 \le i \le r\),

$$\begin{aligned} {T_{1}} \le & P \Biggl( \bigcup_{\boldsymbol{t} \in D_{n,r}^{M ( n )},{t_{i}} < t _{i}^{*} } \Biggl\{ {\sum_{i = 1}^{r} { \bigl\vert {R ( {{t_{i}}} ) - ER ( {{t_{i}}} ) - \bigl( {R \bigl( {t_{i} ^{*} } \bigr) - ER \bigl( {t_{i}^{*} } \bigr)} \bigr)} \bigr\vert } \ge \sum_{i = 1}^{r} { \bigl( {ER \bigl( {t_{i}^{*} } \bigr) - ER ( {{t_{i}}} )} \bigr)} } \Biggr\} \Biggr) \\ \le & P \Biggl( \bigcup_{\boldsymbol{t} \in D_{n,r}^{M ( n )},{t_{i}} < t _{i}^{*} } \Biggl\{ \sum _{i = 1}^{r} { \bigl\vert {R ( {{t_{i}}} ) - ER ( {{t_{i}}} ) - \bigl( {R \bigl( {t_{i} ^{*} } \bigr) - ER \bigl( {t_{i}^{*} } \bigr)} \bigr)} \bigr\vert } \\ &{}\ge \sum_{i = 1}^{r} {{{{C_{0}} \bigl( {t_{i}^{*} - t_{i}} \bigr)} / { ( {{t_{i + 1}} - {t_{i - 1}}} )}}} \Biggr\} \Biggr) \\ \le & P \Biggl( \max_{n{\delta _{n}} \le {t_{i}} - {t_{i - 1}} \le n} \frac{1}{ {{t_{i + 1}} - {t_{i - 1}}}} \Biggl\vert {\sum_{t = {t_{i - 1}} + 1}^{ {t_{i}}} { \bigl( { \tilde{Y}_{t}^{2} - E\tilde{Y}_{t}^{2}} \bigr)} } \Biggr\vert \ge {C_{1}} \Biggr) \\ &{}+ P \Biggl( \max _{n{\delta _{n}} \le {t_{i}} - {t_{i - 1}} \le n} \frac{1}{ {{t_{i + 1}} - {t_{i - 1}}}} \Biggl\vert {\sum _{t = t_{i}^{*} + 1}^{{t _{i + 1}}} { \bigl( {\tilde{Y}_{t}^{2} - E\tilde{Y}_{t}^{2}} \bigr)} } \Biggr\vert \ge {C_{2}} \Biggr) \\ &{}+ P \Biggl( \max_{M ( n ) \le t_{i}^{*} - {t_{i}} \le \theta n} \frac{1}{ {t_{i}^{*} - {t_{i}}}} \Biggl\vert {\sum_{t = {t_{i - 1}} + 1}^{t_{i}^{*} } { \bigl( { \tilde{Y}_{t}^{2} - E\tilde{Y}_{t}^{2}} \bigr)} } \Biggr\vert \ge {C_{3}} \Biggr) \\ &{}+ P \Biggl( \max _{M ( n ) \le t_{i}^{*} - {t_{i}} \le \theta n} \frac{1}{ {t_{i}^{*} - {t_{i}}}} \Biggl\vert {\sum _{t = {t_{i - 1}} + 1}^{t_{i}^{*} } { \bigl( {\tilde{Y}_{t}^{2} - E \tilde{Y}_{t}^{2}} \bigr)} } \Biggr\vert \ge {C_{4}} \Biggr) \\ =:& {Q_{1}} + {Q_{2}} + {Q_{3}} + {Q_{4}}. \end{aligned}$$

In view of \(n{\delta _{n}} \to \infty \) and \(M ( n ) \to \infty \), Lemma 5 yields

$$ {Q_{i}} \to 0,\quad i = 1,2,3,4. $$

Thus \({T_{1}} \to 0\). We can treat \({T_{2}}\) analogously to \({T_{1}}\); hence \({T_{2}} \to 0\).

To complete the proof of Theorem 2, it is sufficient to show \({T_{3}} \to 0\). Since \(R ( {{t_{i}}} ) + R ( {t_{i} ^{*} } ) \le 0\) implies that \(R ( {{t_{i}}} ) - ER ( {{t_{i}}} ) + R ( {t_{i}^{*} } ) - ER ( {t_{i}^{*} } ) \le - ER ( {{t_{i}}} ) - ER ( {t_{i}^{*} } ) \le - ER ( {t_{i}^{*} } )\), we obtain

$$ R ( {{t_{i}}} ) - ER ( {{t_{i}}} ) \le - {{ER \bigl( {t_{i}^{*} } \bigr)} / 2} \quad \text{or} \quad R \bigl( {t_{i}^{*} } \bigr) - ER \bigl( {t_{i}^{*} } \bigr) \le - {{ER \bigl( {t_{i}^{*} } \bigr)} / 2}. $$
(13)

According to \(ER ( {t_{i}^{*} } ) \ge 0\) (\({\delta _{0}} < 0\)), inequality (13) implies that

$$ \bigl\vert {R ( {{t_{i}}} ) - ER ( {{t_{i}}} )} \bigr\vert \ge {{ER \bigl( {t_{i}^{*} } \bigr)} / 2}\quad \text{or} \quad \bigl\vert {R \bigl( {t_{i}^{*} } \bigr) - ER \bigl( {t_{i}^{*} } \bigr)} \bigr\vert \ge {{ER \bigl( {t_{i}^{*} } \bigr)} / 2}. $$

Hence

$$\begin{aligned} {T_{3}} \le & P \Biggl( \bigcup _{\boldsymbol{t} \in D_{n,r}^{M ( n )}} \sum_{i = 1}^{r} { \bigl\{ { \bigl\vert {R ( {{t_{i}}} ) - ER ( {{t_{i}}} )} \bigr\vert \ge {{ER \bigl( {t_{i}^{*} } \bigr)} / 2}} \bigr\} } \Biggr) \\ &{}+ P \Biggl( \bigcup _{\boldsymbol{t} \in D_{n,r}^{M ( n )}} \sum_{i = 1}^{r} { \bigl\{ { \bigl\vert {R \bigl( {t_{i}^{*} } \bigr) - ER \bigl( {t_{i}^{*} } \bigr)} \bigr\vert \ge {{ER \bigl( {t_{i} ^{*} } \bigr)} / 2}} \bigr\} } \Biggr) \\ \le & 2P \Biggl( \bigcup_{\boldsymbol{t} \in D_{n,r}^{M ( n )}} \sum _{i = 1}^{r} { \bigl\vert {R ( {{t_{i}}} ) - ER ( {{t _{i}}} )} \bigr\vert \ge {{ER \bigl( {t_{i}^{*} } \bigr)} / 2}} \Biggr) \\ \le & 2rP \Bigl( \max_{1 \le i \le r} \bigl\vert {{R} ( {t_{i}} ) - E{R} ( {t_{i}} )} \bigr\vert \ge {{ER \bigl( {t_{i}^{*} } \bigr)} / 2} \Bigr). \end{aligned}$$

Combining (4), (5), (6), and (7), we get

$$ P \Bigl( \max_{1 \le i \le r} \bigl\vert {R ( {t_{i} } ) - ER ( {t_{i}} )} \bigr\vert \ge {{ER \bigl( {t_{i}^{*} } \bigr)} / 2} \Bigr) \to 0. $$

Thus \({T_{3}} \to 0\). This completes the proof of Theorem 2. □

5 Conclusions

In this study, we have considered the multiple variance change model and developed a CUSUM-type methodology for change point estimation, assuming that the errors form a linear process with NSD innovations. The weak convergence rate of the change point estimator has been established. Recently, Qin et al. [15] and Shi et al. [30] concentrated on the strong convergence of the CUSUM-type estimator; we believe that the estimator proposed in this paper also possesses the strong convergence property. Additionally, estimating the change points when their number is unknown is an interesting topic, which we leave for future work.