
A change-point problem in relative error-based regression


Abstract

A nonparametric relative error-based method is proposed to detect and estimate the change point in multiplicative regression models. The asymptotic distribution of the proposed test statistic under the null hypothesis of no change-point effect is established. We prove the \(n\)-consistency of the proposed estimator of the change point. Simulation studies demonstrate that change-point detection and estimation with relative errors perform reasonably well in many practical situations. The method is illustrated with an application to a financial dataset.
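To fix ideas, the following minimal Python sketch simulates the multiplicative model \(Y_i=\exp (X_i^{\top }\beta )\epsilon _i\), fits \(\beta \) by the least product relative error (LPRE) criterion of Chen et al. (2013), and forms a standardized CUSUM-type statistic of the kind underlying Theorem 1. The definitions of \(h_i\), \(S_k\) and \(\Sigma \) are taken from the Appendix below; all variable names and simulation settings are our own illustrative choices, not the paper's code.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(1)
n, p = 400, 2
X = rng.standard_normal((n, p))
beta0 = np.array([1.0, -0.5])
eps = np.exp(0.3 * rng.standard_normal(n))    # positive errors; E(eps - 1/eps) = 0 here
y = np.exp(X @ beta0) * eps                   # multiplicative regression model

# LPRE criterion (Chen et al. 2013): sum_i { y_i exp(-x_i'b) + y_i^{-1} exp(x_i'b) }
lpre = lambda b: np.sum(y * np.exp(-X @ b) + np.exp(X @ b) / y)
beta_hat = minimize(lpre, np.zeros(p)).x

# partial-sum score process S_k(beta) = n^{-1} sum_{i<=k} x_i h_i(beta), with
# h_i(beta) = y_i^{-1} exp(x_i'beta) - y_i exp(-x_i'beta), as in the Appendix
h = np.exp(X @ beta_hat) / y - y * np.exp(-X @ beta_hat)
S = np.cumsum(X * h[:, None], axis=0) / n

# standardized sup-norm statistic, to be compared with sup_t ||B^p(t)|| quantiles
Sigma_hat = (X * (h ** 2)[:, None]).T @ X / n
L = np.linalg.cholesky(Sigma_hat)
T = np.linalg.norm(np.sqrt(n) * S @ np.linalg.inv(L).T, axis=1).max()
```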


References

  • Bai J (1995) Least absolute deviation estimation of a shift. Econom Theory 11:403–436

  • Bai J (1997) Estimating multiple breaks one at a time. Econom Theory 13:315–352

  • Basseville M, Nikiforov IV (1993) Detection of abrupt changes: theory and applications. Prentice Hall, Englewood Cliffs

  • Chen K, Guo S, Lin Y, Ying Z (2010) Least absolute relative error estimation. J Am Stat Assoc 105:1104–1112

  • Chen K, Lin Y, Wang Z, Ying Z (2013) Least product relative error estimation. http://arxiv.org/abs/1309.0220

  • Ciuperca G (2014) Estimation in a change-point nonlinear quantile model. http://arxiv.org/abs/1401.4883

  • Csörgő M, Horváth L (1997) Limit theorems in change point analysis. Wiley, New York

  • Dempfle A, Stute W (2002) Nonparametric estimation of a discontinuity in regression. Stat Neerl 56:233–242

  • Fokianos K, Gombay E, Hussein A (2014) Retrospective change detection for binary time series models. J Stat Plan Inference 145:102–112

  • Gijbels I, Gürler U (2003) Estimation of a change point in a hazard function based on censored data. Lifetime Data Anal 9:395–411

  • Gijbels I, Hall P, Kneip A (1999) On the estimation of jump points in smooth curves. Ann Inst Stat Math 51:231–251

  • Gurevich G, Vexler A (2005) Change point problems in the model of logistic regression. J Stat Plan Inference 131:313–331

  • Hall P, Molchanov I (2003) Sequential methods for design-adaptive estimation of discontinuities in regression curves and surfaces. Ann Stat 31:921–941

  • Kirch C, Kamgaing JT (2014) Monitoring time series based on estimating functions. Technical report, Fachbereich Mathematik, Technische Universität Kaiserslautern

  • Koul HL, Qian L (2002) Asymptotics of maximum likelihood estimation in a two-phase linear regression model. J Stat Plan Inference 108:99–119

  • Krishnaiah PR, Miao BQ (1988) Review about estimation of change points. Handb Stat 7:375–402

  • Liu Y, Zou C, Zhang R (2008) Empirical likelihood ratio test for a change-point in linear regression model. Commun Stat Theory Methods 37:2551–2563

  • Loader CR (1996) Change point estimation using nonparametric regression. Ann Stat 24:1667–1678

  • Müller HG (1992) Change-points in nonparametric regression analysis. Ann Stat 20:737–761

  • Müller HG, Song K (1997) Two-stage change-point estimators in smooth regression models. Stat Probab Lett 34:323–335

  • Oka T, Qu Z (2011) Estimating structural changes in regression quantiles. J Econom 162:248–267

  • Park H, Stefanski LA (1998) Relative-error prediction. Stat Probab Lett 40:227–236

  • Pons O (2003) Estimation in a Cox regression model with a change-point at an unknown time. Ann Stat 31:442–463

  • Quandt RE, Ramsay JB (1978) Estimating mixtures of normal distributions and switching regressions. J Am Stat Assoc 73:730–752

  • Rao CR, Wu Y, Shi X (2010) An M-estimation-based criterion for simultaneous change point analysis and variable selection in a regression problem. J Stat Theory Pract 4:773–801

  • Ritov Y (1990) Asymptotic efficient estimation of the change point with unknown distributions. Ann Stat 18:1829–1839

  • Wang ZF, Wu YH, Zhao LC (2007) Change-point estimation for censored regression model. Sci China Ser A 50:63–72

  • Ye J (2007) Price models and the value relevance of accounting information. Technical report, Baruch College-Stan Ross Dept. of Accountancy, City University of New York

  • Zhang Q, Wang Q (2012) Local least absolute relative error estimating approach for partially linear multiplicative model. Stat Sin 23:1091–1116

  • Zhou Z (2013) Heteroscedasticity and autocorrelation robust structural change detection. J Am Stat Assoc 108:726–740


Acknowledgments

The authors are grateful to the Editor-in-Chief, Professor A.F. Militino, the Associate Editor, and the anonymous referees for comments and suggestions that led to substantial improvements in the paper. Wang’s work is supported by the State Key Program of the National Natural Science Foundation of China (Grant No. 11231010) and the Fundamental Research Funds for the Central Universities (Grant No. WK2040000010). Lin’s work is supported by the Hong Kong Research Grants Council (Grant No. 509413) and a Direct Grant for Research at the Chinese University of Hong Kong (Grant No. 4053136).

Author information


Correspondence to Yuanyuan Lin.

Appendix A: Proofs

A1. Proof of Theorem 1

The following lemmas are needed to prove Theorem 1. We first state Theorem 3 in Chen et al. (2013) as our Lemma 1, which will be used in the proof of Lemma 2.

Lemma 1

Suppose Assumptions (A1)–(A3) hold. Then, under \(H_0\), \(\sqrt{n}(\hat{\beta }_n-\beta _0)\) is asymptotically normal with mean 0 and covariance matrix \(D^{-1}\Sigma D^{-1}\), where \(D=E\{XX^{\top }(\epsilon +\epsilon ^{-1})\}\) and \(\Sigma =E\{XX^{\top }(\epsilon -\epsilon ^{-1})^2\}\).
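In practice, the limiting covariance in Lemma 1 can be estimated by plugging the fitted errors \(\hat{\epsilon }_i=Y_i\exp (-X_i^{\top }\hat{\beta }_n)\) into the expressions for \(D\) and \(\Sigma \). A minimal sketch, with our own naming (the paper's own variance estimator may differ in details):

```python
import numpy as np

def lpre_sandwich_cov(X, y, beta_hat):
    """Plug-in sandwich covariance D^{-1} Sigma D^{-1} / n from Lemma 1,
    with D = E{XX'(eps + 1/eps)} and Sigma = E{XX'(eps - 1/eps)^2}."""
    n = len(y)
    eps = y * np.exp(-X @ beta_hat)               # fitted multiplicative errors
    D = (X * (eps + 1 / eps)[:, None]).T @ X / n
    Sig = (X * ((eps - 1 / eps) ** 2)[:, None]).T @ X / n
    D_inv = np.linalg.inv(D)
    return D_inv @ Sig @ D_inv / n                # approximate Cov(beta_hat)
```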

Lemma 2

Suppose Assumptions (A1)–(A3) hold. Then, under \(H_0\),

$$\begin{aligned} \max _{1 \le k \le n}\sqrt{n}\left\| S_k(\hat{\beta }_n)-\left\{ S_k(\beta _0)-\frac{k}{n}S_n(\beta _0)\right\} \right\| =o_p(1). \end{aligned}$$

Proof of Lemma 2

Denote \(h_i(\beta )=Y_i^{-1}\exp (X_i^{\top }\beta )-Y_i\exp (-X_i^{\top }\beta )\). Then, \(h_i(\beta _0)=\epsilon _i^{-1}-\epsilon _i\). For any \(x\in R^p\), let \(x^{(i)}\) denote the \(i\)th component of \(x\), \(i=1, \ldots , p\). In view of \(S_n(\hat{\beta }_n)=0\), by a Taylor expansion, we have

$$\begin{aligned}&\sqrt{n}S_k^{(i)}(\hat{\beta }_n) \\&\quad = \sqrt{n}\left\{ S_k^{(i)}(\hat{\beta }_n)-\frac{k}{n}S_n^{(i)}(\hat{\beta }_n)\right\} \\&\quad = \sqrt{n}\left\{ S_k^{(i)}(\beta _0)-\frac{k}{n}S_n^{(i)}(\beta _0)\right\} +\sqrt{n}\sum _{j=1}^{p}\{\hat{\beta }_n^{(j)}-\beta _0^{(j)}\} \frac{1}{n}\left\{ \sum _{l=1}^kX_l^{(i)}X_l^{(j)}h_l(\beta _0) \right. \\&\qquad \left. - \frac{k}{n}\sum _{l=1}^nX_l^{(i)}X_l^{(j)}h_l(\beta _0)\right\} +E_{k,n}^{(i)}, \end{aligned}$$

where \(E_{k,n}^{(i)}\) is the remainder term to be discussed later. By Assumption (A2), \(EX_l^{(i)}X_l^{(j)}h_l(\beta _0)=0\). Since \((X_i, \epsilon _i)\), \(i=1,\ldots , n\), are i.i.d., by the Hájek–Rényi inequality and Assumption (A3), we have

$$\begin{aligned} \max _{1 \le k \le n}\left| \frac{1}{k}\sum _{l=1}^kX_l^{(i)}X_l^{(j)}h_l(\beta _0)\right| =O_p(1). \end{aligned}$$

This gives

$$\begin{aligned} \max _{1 \le k \le \frac{n}{\log n}}\frac{k}{n}\left| \frac{1}{k}\sum _{l=1}^kX_l^{(i)}X_l^{(j)}h_l(\beta _0)\right| =o_p(1). \end{aligned}$$
(7)

On the other hand, by the Hájek–Rényi inequality and Assumption (A3), we can show

$$\begin{aligned} \max _{\frac{n}{\log n} < k \le n}\frac{k}{n}\left| \frac{1}{k}\sum _{l=1}^kX_l^{(i)}X_l^{(j)}h_l(\beta _0)\right| =o_p(1). \end{aligned}$$
(8)

Combining (7) and (8), we have

$$\begin{aligned} \max _{1 \le k \le n}\frac{1}{n}\left| \sum _{l=1}^kX_l^{(i)}X_l^{(j)}h_l(\beta _0)\right| =o_p(1), \end{aligned}$$
(9)

which leads to

$$\begin{aligned} \max _{1 \le k \le n}\frac{k}{n}\left| \frac{1}{n}\sum _{l=1}^nX_l^{(i)}X_l^{(j)}h_l(\beta _0)\right| =o_p(1). \end{aligned}$$
(10)

Thus, (9) and (10) together imply

$$\begin{aligned} \max _{1 \le k \le n}\frac{1}{n}\left| \sum _{l=1}^kX_l^{(i)}X_l^{(j)}h_l(\beta _0)-\frac{k}{n}\sum _{l=1}^nX_l^{(i)}X_l^{(j)}h_l(\beta _0)\right| =o_p(1). \end{aligned}$$
(11)

It follows from Lemma 1 that \(\sqrt{n}|\hat{\beta }_n^{(j)}-\beta _0^{(j)}|=O_p(1)\). Together with (11), we have shown

$$\begin{aligned}&\max _{1 \le k \le n}\sqrt{n}\left| \sum _{j=1}^{p}(\hat{\beta }_n^{(j)}-\beta _0^{(j)})\frac{1}{n}\left\{ \sum _{l=1}^kX_l^{(i)}X_l^{(j)}h_l(\beta _0) \right. \right. \\&\qquad \qquad \qquad \left. \left. -\frac{k}{n}\sum _{l=1}^nX_l^{(i)}X_l^{(j)}h_l(\beta _0)\right\} \right| =o_p(1). \end{aligned}$$

Moreover, the remainder \(E_{k,n}^{(i)}\) is of higher order in \(\hat{\beta }_n^{(j)}-\beta _0^{(j)}\). Under Assumption (A3), it can be shown to be \(o_p(1)\) uniformly in \(1\le k\le n\). Thus, the proof of Lemma 2 is complete.

Proof of Theorem 1

First, by the Functional Central Limit Theorem,

$$\begin{aligned} \left\{ \sqrt{n}\, S_{[nt]}(\beta _0):0\le t\le 1\right\} \rightarrow \{W(t):0\le t\le 1\} \end{aligned}$$

in distribution as \(n\rightarrow \infty \), where \(\{W(t):0\le t\le 1\}\) is a Wiener process with covariance matrix \(\Sigma =E\{XX^{\top }(\epsilon ^{-1}-\epsilon )^2\}\). Thus, as \(n\rightarrow \infty \),

$$\begin{aligned} \sqrt{n}\, \Sigma ^{-1/2} \left\{ S_{[nt]}(\beta _0)-\frac{[nt]}{n}S_n(\beta _0)\right\} \rightarrow B^p(t) \end{aligned}$$

in distribution, where \(B^p\) denotes a \(p\)-dimensional standard Brownian bridge. Second, it follows from the consistency of \(\hat{\beta }_n\) that \(\hat{\Sigma }_n\) is a consistent estimate of \(\Sigma \). Therefore, by Lemma 2 and Slutsky’s theorem, Theorem 1 is proved.
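The quantiles of \(\sup _{0\le t\le 1}\Vert B^p(t)\Vert \) are not available in closed form for \(p>1\), but are easily simulated. A hedged Monte Carlo sketch (grid size, replication number and function name are arbitrary choices of ours):

```python
import numpy as np

def bridge_sup_quantiles(p, probs=(0.90, 0.95, 0.99),
                         n_grid=1000, n_rep=5000, seed=0):
    """Simulate sup_{0<=t<=1} ||B^p(t)|| for a p-dimensional standard
    Brownian bridge B^p and return the requested quantiles."""
    rng = np.random.default_rng(seed)
    stats = np.empty(n_rep)
    t = np.arange(1, n_grid + 1) / n_grid
    for r in range(n_rep):
        incr = rng.standard_normal((n_grid, p)) / np.sqrt(n_grid)
        w = incr.cumsum(axis=0)            # Brownian motion on the grid
        b = w - t[:, None] * w[-1]         # pinned at t = 1: Brownian bridge
        stats[r] = np.linalg.norm(b, axis=1).max()
    return np.quantile(stats, probs)

# e.g. bridge_sup_quantiles(2) gives approximate 90/95/99% critical values for p = 2
```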

A2. Proof of Theorem 2

Since closed-form estimators of \(\beta _1\) and \(\beta _2\) for fixed \(k\) are not available, analogous to Bai (1995), we obtain the rate of convergence of \(\hat{t}_n\) by evaluating the global behavior of \(G_n(\beta _1,\beta _2,k)\) over the whole parameter space of \(\beta _1\), \(\beta _2\) and \(k\). We first reparameterize \(G_n(\beta _1,\beta _2,k)\) as

$$\begin{aligned} V_n(\gamma _1,\gamma _2,\nu )=G_n(\beta _{10}+\gamma _1,\beta _{20}+\gamma _2,k(\nu ))-G_n(\beta _{10},\beta _{20},k_0), \end{aligned}$$

where \(\gamma _1 = \beta _1-\beta _{10}\), \(\gamma _2 = \beta _2-\beta _{20}\), and \(k(\nu )=[k_0+\nu c_n]\) with \(c_n=O(1/{\Vert \lambda _n\Vert ^2})\), where \(\nu \) is a real scalar and \([x]\) denotes the integer part of \(x\). In this way, as \(\nu \) varies over \(R\), \(k(\nu )\) visits every integer between \(1\) and \(n\). We set \(k(\nu )=1\) if \(k(\nu )\le 1\) and \(k(\nu )=n\) if \(k(\nu )\ge n\). Note that the reparameterization does not change the minimization problem. Denote

$$\begin{aligned} \hat{\nu }_n\equiv \arg \min _{\nu \in R} \min _{\gamma _1\in B_1,\gamma _2\in B_2} V_n(\gamma _1,\gamma _2,\nu ), \end{aligned}$$

where \(B_1=\{\beta -\beta _{10}:\beta \in B\}\) and \(B_2=\{\beta -\beta _{20}:\beta \in B\}\) are the reparameterized parameter spaces. Then, we have \([\hat{\nu }_nc_n]=n(\hat{t}_n-t_0)=\hat{k}_n-k_0\). Furthermore, Theorem 2 is equivalent to \(\hat{\nu }_n=O_p(1)\). In view of \(V_n(0,0,0)=0\), Theorem 2 is a consequence of the following lemma.
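For concreteness, the estimation step itself admits a simple sketch: under the two-segment form of \(G_n\) implicit in the expansion of \(V_n\) below (segment-wise LPRE criteria), \(\hat{k}_n\) minimizes the pooled criterion over candidate split points. The trimming fraction and the optimizer here are our own hypothetical choices, not the paper's.

```python
import numpy as np
from scipy.optimize import minimize

def lpre_loss(beta, X, y):
    """Segment-wise LPRE criterion for y_i = exp(x_i'beta) * eps_i."""
    xb = X @ beta
    return np.sum(y * np.exp(-xb) + np.exp(xb) / y)

def estimate_change_point(X, y, trim=0.1):
    """Grid search over k: fit beta_1 on observations 1..k and beta_2 on
    k+1..n, and return the k minimizing G_n(beta_1, beta_2, k)."""
    n, p = X.shape
    lo, hi = int(n * trim), int(n * (1 - trim))
    crit = np.full(n, np.inf)
    for k in range(lo, hi):
        g1 = minimize(lpre_loss, np.zeros(p), args=(X[:k], y[:k])).fun
        g2 = minimize(lpre_loss, np.zeros(p), args=(X[k:], y[k:])).fun
        crit[k] = g1 + g2
    return int(np.argmin(crit))                   # k_hat; t_hat = k_hat / n
```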

Lemma 3

Suppose Assumptions (A1)–(A4) hold. Then, for every \(\varepsilon >0\) and \(C>0\), there exists \(\nu _1>0\) such that, for \(n\) sufficiently large,

$$\begin{aligned} P\left( \inf _{|\nu |>\nu _1}\inf _{\gamma _1\in B_1,\gamma _2\in B_2} V_n(\gamma _1,\gamma _2,\nu )<C\right) <\varepsilon . \end{aligned}$$
(12)

This lemma characterizes the global behavior of \(V_n(\gamma _1,\gamma _2,\nu )\), or equivalently of \(G_n(\beta _1,\beta _2,k)\): it shows that \(G_n(\beta _1,\beta _2,k)\) cannot achieve its global minimum when \(k\) is far from \(k_0\). To prove Lemma 3, by symmetry, we may consider only the case \(\nu \le 0\) without loss of generality. Then, for \(k(\nu ) \le k_0\),

$$\begin{aligned}&V_n(\gamma _1,\gamma _2,\nu ) \\&\quad = \sum _{i=1}^{k(\nu )}\{\epsilon _i\exp (-X_i^{\top }\gamma _1)+\epsilon _i^{-1}\exp (X_i^{\top }\gamma _1) -(\epsilon _i+\epsilon _i^{-1})\}\\&\qquad + \sum _{i=k(\nu )+1}^{k_0}\{\epsilon _i\exp (-X_i^{\top }\gamma _2-X_i^{\top }\lambda _n)+\epsilon _i^{-1}\exp (X_i^{\top }\gamma _2+X_i^{\top }\lambda _n) -(\epsilon _i+\epsilon _i^{-1})\}\\&\qquad + \sum _{i=k_0+1}^n\{\epsilon _i\exp (-X_i^{\top }\gamma _2)+\epsilon _i^{-1}\exp (X_i^{\top }\gamma _2) -(\epsilon _i+\epsilon _i^{-1})\}. \end{aligned}$$

Write \(U(l,j,\gamma )=\sum _{i=l}^{j}\{\epsilon _i\exp (-X_i^{\top }\gamma )+\epsilon _i^{-1}\exp (X_i^{\top }\gamma )-(\epsilon _i+\epsilon _i^{-1})\}.\) Then,

$$\begin{aligned} V_n(\gamma _1,\gamma _2,\nu )=U(1,k(\nu ),\gamma _1)+U(k(\nu )+1,k_0,\gamma _2+\lambda _n)+U(k_0+1,n,\gamma _2). \end{aligned}$$

In the following discussion, we put \(U_k(\gamma )=U(1,k,\gamma )\). We need the following lemma to prove Lemma 3.

Lemma 4

Under Assumptions (A1)–(A3), we have the following:

(i) for any \(\delta \in (0,1)\),

$$\begin{aligned} \sup _{n\delta \le k \le n}\left| \inf _{\gamma }U_k(\gamma )\right| =O_p(1); \end{aligned}$$

(ii)

$$\begin{aligned} \sup _{1\le k\le n}\left| \inf _{\gamma }U_k(\gamma )\right| =O_p(\log n); \end{aligned}$$

(iii) for any \(\delta \in (0,1)\), \(\varepsilon > 0\) and \(D>0\), for \(n\) sufficiently large,

$$\begin{aligned} P\left( \inf _{n\delta \le k \le n} \inf _{\Vert \gamma \Vert \ge n^{-1/2}\log n}U_k(\gamma )<D \log n\right) <\varepsilon ; \end{aligned}$$

(iv) let \(h_n\) and \(d_n\) be positive sequences such that \(h_n\) is nondecreasing, \(h_n \rightarrow \infty \), \(d_n \rightarrow \infty \) and \(h_n d_n^2/n \rightarrow h\) for some \(0<h<\infty \). Then, for any \(\varepsilon >0\) and \(D>0\), there exists \(A>0\) such that, for \(n\) sufficiently large,

$$\begin{aligned} P\left( \inf _{Ah_n\le k\le n} \inf _{\Vert \gamma \Vert \ge n^{-1/2}d_n}U_k(\gamma )<D \right) <\varepsilon . \end{aligned}$$

Proof of Lemma 4

First, by a Taylor expansion, we obtain

$$\begin{aligned} U_k(\gamma )&=\gamma ^{\top } \sum _{i=1}^k X_i(-\epsilon _i +\epsilon _i^{-1})\nonumber \\&\quad + \gamma ^{\top } \sum _{i=1}^k X_iX_i^{\top }\{\epsilon _i \exp (-X_i^{\top }\gamma ^{*})+\epsilon _i^{-1}\exp (X_i^{\top }\gamma ^{*})\}\gamma , \end{aligned}$$
(13)

where \(\gamma ^{*}\) is between \(0\) and \(\gamma \).

Step 1. To prove part (i). Write

$$\begin{aligned} \hat{\gamma }_k\equiv \arg \min _{\gamma }\sum _{i=1}^k\{\epsilon _i\exp (-X_i^{\top }\gamma )+\epsilon _i^{-1}\exp (X_i^{\top }\gamma )\}. \end{aligned}$$

Lemma 1 implies that \(\sup _{n\delta \le k \le n} \sqrt{n} \Vert \hat{\gamma }_k\Vert =O_p(1)\). Thus, it suffices to prove that for each \(A>0\),

$$\begin{aligned} \sup _{1\le k \le n}\sup _{\Vert \gamma \Vert \le An^{-1/2}}|U_k(\gamma )|=O_p(1). \end{aligned}$$

In view of (13), we have

$$\begin{aligned}&\sup _{1\le k \le n}\sup _{\Vert \gamma \Vert \le An^{-1/2}}|U_k(\gamma )|\\&\quad \le \sup _{1\le k \le n}\sup _{\Vert \gamma \Vert \le An^{-1/2}}\left| \gamma ^{\top } \sum _{i=1}^kX_i(-\epsilon _i +\epsilon _i^{-1})\right| \\&\qquad + \sup _{1\le k \le n}\sup _{\Vert \gamma \Vert \le An^{-1/2}}\left| \gamma ^{\top } \sum _{i=1}^kX_iX_i^{\top }\{\epsilon _i \exp (-X_i^{\top }\gamma ^{*})+\epsilon _i^{-1}\exp (X_i^{\top }\gamma ^{*})\}\gamma \right| \\&\quad \le \sup _{1\le k \le n}\frac{A}{\sqrt{n}}\left\| \sum _{i=1}^kX_i(-\epsilon _i +\epsilon _i^{-1})\right\| +\sup _{1\le k \le n}\frac{A^2}{n} \sum _{i=1}^k\Vert X_i\Vert ^2(\epsilon _i+\epsilon _i^{-1})\exp (\Vert X_i\Vert \eta ) \\&\quad = O_p(1), \end{aligned}$$

for some \(\eta >0\), since \(\Vert \gamma ^{*}\Vert \le \Vert \gamma \Vert \le An^{-1/2}\rightarrow 0\). The last equality holds by the Hájek–Rényi inequality, the law of large numbers and Assumptions (A2)–(A3).
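For completeness, the maximal inequality invoked here and in the remaining steps is the following Hájek–Rényi-type bound (this particular formulation is ours; see, e.g., Csörgő and Horváth 1997 for versions of this inequality):

$$\begin{aligned} P\left( \max _{m\le k\le n}\frac{1}{k}\left\| \sum _{i=1}^{k}\xi _i\right\| \ge \lambda \right) \le \frac{E\Vert \xi _1\Vert ^2}{\lambda ^2}\left( \frac{1}{m}+\sum _{k=m+1}^{n}\frac{1}{k^2}\right) \end{aligned}$$

for i.i.d. mean-zero random vectors \(\xi _i\) with \(E\Vert \xi _1\Vert ^2<\infty \); it is applied with \(\xi _i=X_i(-\epsilon _i+\epsilon _i^{-1})\) and, after centering, with \(\xi _i=X_iX_i^{\top }-EXX^{\top }\), as in the display (18) below.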

Step 2. To prove part (ii). Analogous to the proof of Lemma A.3 in Oka and Qu (2011), we obtain \(\sup _{1\le k \le n}(\log n)^{-1/2}k^{1/2}\Vert \hat{\gamma }_k\Vert =O_p(1)\). Thus, it suffices to show that for each \(B>0\),

$$\begin{aligned} \sup _{1\le k \le n}\sup _{\Vert \gamma \Vert \le B\sqrt{\log n}/\sqrt{k}}\left| \frac{1}{\log n} U_k(\gamma )\right| =O_p(1). \end{aligned}$$

In view of (13), we have

$$\begin{aligned}&\sup _{1\le k \le n}\sup _{\Vert \gamma \Vert \le B\sqrt{\log n}/\sqrt{k}} \left| \frac{1}{\log n} U_k(\gamma )\right| \\&\quad \le \sup _{1\le k \le n} \sup _{\Vert \gamma \Vert \le B\sqrt{\log n}/\sqrt{k}} \left| \frac{1}{\log n}\gamma ^{\top } \sum _{i=1}^kX_i(-\epsilon _i +\epsilon _i^{-1})\right| \\&\qquad + \sup _{1\le k \le n}\sup _{\Vert \gamma \Vert \le B\sqrt{\log n}/\sqrt{k}} \left| (\log n)^{-1}\gamma ^{\top } \sum _{i=1}^kX_iX_i^{\top }\{\epsilon _i \exp (-X_i^{\top }\gamma ^{*}) \right. \\&\qquad \left. + \epsilon _i^{-1}\exp (X_i^{\top }\gamma ^{*})\}\gamma \right| \\&\quad \le \sup _{1\le k \le n}\frac{B\sqrt{\log n}}{\sqrt{k}}\left\| \sum _{i=1}^kX_i(-\epsilon _i +\epsilon _i^{-1})\right\| \\&\qquad + \sup _{1\le k \le n}B^2k^{-1}\sum _{i=1}^k\Vert X_i\Vert ^2(\epsilon _i+\epsilon _i^{-1})\exp (\Vert X_i\Vert {\eta })\\&\quad = O_p(1), \end{aligned}$$

for some \(\eta >0\), since \(\gamma ^*\) lies in a compact set; the last equality holds by the Hájek–Rényi inequality and Assumptions (A2)–(A3) after centering the second term.

Step 3. To prove part (iii). Due to the convexity of \(U_k(\gamma )\) in \(\gamma \), it suffices to show

$$\begin{aligned} P\left( \inf _{n\delta \le k \le n} \inf _{\Vert \gamma \Vert =\log n/\sqrt{n}}U_k(\gamma )<D \log n\right) <\varepsilon . \end{aligned}$$

For the second term of (13), note that \(\epsilon _i \exp (-X_i^{\top }\gamma ^{*})+\epsilon _i^{-1}\exp (X_i^{\top }\gamma ^{*})\ge 2\) for each \(i\) by the arithmetic–geometric mean inequality, so that

$$\begin{aligned} U_k(\gamma )\ge \gamma ^{\top }\sum _{i=1}^k X_i(-\epsilon _i +\epsilon _i^{-1})+2\gamma ^{\top } \sum _{i=1}^kX_iX_i^{\top }\gamma \equiv \widetilde{U}_k(\gamma ). \end{aligned}$$

Hence,

$$\begin{aligned}&\inf _{n\delta \le k \le n} \inf _{\Vert \gamma \Vert = \log n/\sqrt{n} }U_k(\gamma ) \\&\quad \ge \inf _{n\delta \le k \le n} \inf _{\Vert \gamma \Vert = \log n/\sqrt{n} }\widetilde{U}_k(\gamma )\\&\quad \ge \inf _{n\delta \le k \le n} \inf _{\Vert \gamma \Vert = \log n/\sqrt{n} }E\widetilde{U}_k(\gamma ) -\sup _{n\delta \le k \le n} \sup _{\Vert \gamma \Vert = \log n/\sqrt{n} }|\widetilde{U}_k(\gamma )-E\widetilde{U}_k(\gamma )|. \end{aligned}$$

By Assumptions (A1)–(A2), we have for some \(L > 0\),

$$\begin{aligned} \inf _{n\delta \le k\le n}\inf _{\Vert \gamma \Vert = \log n/\sqrt{n} }E\widetilde{U}_k(\gamma )&= \inf _{n\delta \le k\le n}\inf _{\Vert \gamma \Vert =n^{-1/2}\log n}2k\gamma ^{\top }EXX^{\top }\gamma \nonumber \\&\ge 2\delta (\log n)^2L. \end{aligned}$$
(14)

On the other hand, under Assumptions (A1)–(A3),

$$\begin{aligned}&\sup _{n\delta \le k \le n} \sup _{\Vert \gamma \Vert = \log n/\sqrt{n} }(\log n)^{-2}|\widetilde{U}_k(\gamma )-E\widetilde{U}_k(\gamma )|\nonumber \\&\quad \le \sup _{n\delta \le k \le n} \sup _{\Vert \gamma \Vert = \log n/\sqrt{n} }(\log n)^{-2}\left| \gamma ^{\top }\sum _{i=1}^kX_i(-\epsilon _i+\epsilon _i^{-1})\right| \nonumber \\&\qquad + \sup _{n\delta \le k \le n} \sup _{\Vert \gamma \Vert = \log n/\sqrt{n} }(\log n)^{-2}\left| 2\gamma ^{\top }\sum _{i=1}^k(X_iX_i^{\top }-EXX^{\top })\gamma \right| \nonumber \\&\quad \le \sup _{n\delta \le k\le n}n^{-1/2}(\log n)^{-1}\left\| \sum _{i=1}^kX_i(-\epsilon _i+\epsilon _i^{-1})\right\| \nonumber \\&\qquad + \sup _{n\delta \le k\le n}2n^{-1}\left\| \sum _{i=1}^k(X_iX_i^{\top }-EXX^{\top })\right\| \nonumber \\&\quad = o_p(1). \end{aligned}$$
(15)

Again, the last equality holds by the Hájek–Rényi inequality. Combining (14) and (15), part (iii) is proved.

Step 4. To prove part (iv). Due to the convexity in \(\gamma \), it suffices to show

$$\begin{aligned} P\left( \inf _{Ah_n\le k\le n} \inf _{\Vert \gamma \Vert = n^{-1/2}d_n}U_k(\gamma )<D\right) <\varepsilon . \end{aligned}$$

It follows from the lines of the proof of part (iii) that

$$\begin{aligned}&\inf _{Ah_n \le k \le n} \inf _{\Vert \gamma \Vert = n^{-1/2}d_n}U_k(\gamma ) \nonumber \\&\quad \ge \inf _{Ah_n \le k \le n} \inf _{\Vert \gamma \Vert = n^{-1/2}d_n}\widetilde{U}_k(\gamma )\nonumber \\&\quad \ge \inf _{Ah_n \le k \le n} \inf _{\Vert \gamma \Vert = n^{-1/2}d_n}E\widetilde{U}_k(\gamma ) -\sup _{Ah_n \le k \le n} \sup _{\Vert \gamma \Vert = n^{-1/2}d_n}|\widetilde{U}_k(\gamma )-E\widetilde{U}_k(\gamma )|. \end{aligned}$$
(16)

By Assumptions (A1)–(A2), for some \(L > 0\),

$$\begin{aligned} \inf _{Ah_n \le k \le n} \inf _{\Vert \gamma \Vert = n^{-1/2}d_n}\frac{1}{k}E\widetilde{U}_k(\gamma )&=\inf _{Ah_n \le k \le n} \inf _{\Vert \gamma \Vert = n^{-1/2}d_n}2\gamma ^{\top }E\{XX^{\top }\}\gamma \nonumber \\&\ge 2\frac{d_n^2}{n}L, \end{aligned}$$
(17)

which implies that the expected value of \(\widetilde{U}_k(\gamma )\) is large. To prove (iv), it thus suffices to show that the fluctuation of \(\widetilde{U}_k(\gamma )\) around its expected value is dominated by that expected value. More precisely,

$$\begin{aligned}&P\left( \sup _{Ah_n \le k \le n} \sup _{\Vert \gamma \Vert =n^{-1/2}d_n}\frac{1}{k}|\widetilde{U}_k(\gamma )-E\widetilde{U}_k(\gamma )|>\frac{d_n^2}{n}L\right) \nonumber \\&\quad \le P\left( \sup _{Ah_n \le k \le n} \sup _{\Vert \gamma \Vert =n^{-1/2}d_n}\frac{1}{k} \left| \gamma ^{\top }\sum _{i=1}^kX_i(-\epsilon _i+\epsilon _i^{-1})\right| >\frac{d_n^2}{2n}L\right) \nonumber \\&\qquad + P\left( \sup _{Ah_n \le k \le n} \sup _{\Vert \gamma \Vert =n^{-1/2}d_n}\frac{1}{k} \left| 2\gamma ^{\top }\sum _{i=1}^k(X_iX_i^{\top }-EXX^{\top })\gamma \right| >\frac{d_n^2}{2n}L\right) \nonumber \\&\quad \le P\left( \sup _{Ah_n \le k \le n} \sup _{\Vert \gamma \Vert =n^{-1/2}d_n}\frac{1}{k} \left\| \sum _{i=1}^kX_i(-\epsilon _i+\epsilon _i^{-1})\right\| >\frac{d_n}{2\sqrt{n}}L\right) \nonumber \\&\qquad + P\left( \sup _{Ah_n \le k \le n} \sup _{\Vert \gamma \Vert =n^{-1/2}d_n}\frac{1}{k} \left\| \sum _{i=1}^k(X_iX_i^{\top }-EXX^{\top })\right\| >\frac{L}{4}\right) \nonumber \\&\quad \le \frac{4n\sigma _1^2}{L^2d_n^2}\left( \frac{1}{Ah_n}+\sum _{k=Ah_n+1}^{n}\frac{1}{k^2}\right) +\frac{16\sigma _2^2}{L^2}\left( \frac{1}{Ah_n}+\sum _{k=Ah_n+1}^{n}\frac{1}{k^2}\right) \nonumber \\&\quad \le \frac{12\sigma _1^2}{L^2A}\left( \frac{h_nd_n^2}{n}\right) ^{-1}+\frac{48\sigma _2^2}{L^2}(Ah_n)^{-1} < \varepsilon , \end{aligned}$$
(18)

where \(\sigma _1^2=E\Vert X_i(-\epsilon _i+\epsilon _i^{-1})\Vert ^2<\infty \) and \(\sigma _2^2=E\Vert X_iX_i^{\top }-EXX^{\top }\Vert ^2<\infty \) under Assumption (A3). The third inequality holds by the Hájek–Rényi inequality, and the fourth uses the fact that \(\sum _{k=Ah_n+1}^{n}{1}/{k^2}\le 2/(Ah_n)\). The last inequality holds for \(n\) and \(A\) sufficiently large, since \(h_nd_n^2/n \rightarrow h\) as \(n\rightarrow \infty \).

Hence, (16), (17) and (18) together imply that

$$\begin{aligned}&P\left( \inf _{Ah_n \le k \le n} \inf _{\Vert \gamma \Vert = n^{-1/2}d_n}U_k(\gamma )>A\frac{h_nd_n^2}{n}L \right) \\&\quad \ge P\left( \inf _{Ah_n \le k \le n} \inf _{\Vert \gamma \Vert = n^{-1/2}d_n}\widetilde{U}_k(\gamma )>A\frac{h_nd_n^2}{n}L \right) \\&\quad \ge P\left( \inf _{Ah_n \le k \le n} \inf _{\Vert \gamma \Vert = n^{-1/2}d_n}\frac{1}{k}\widetilde{U}_k(\gamma )>\frac{d_n^2}{n}L\right) >1-\varepsilon \end{aligned}$$

for sufficiently large \(n\) and \(A\). Moreover, since \(A{h_nd_n^2}L/n \rightarrow AhL\) as \(n\rightarrow \infty \), \(A{h_nd_n^2}L/n\) can be made greater than \(D\) by choosing \(A\) large. Hence, the proof of Lemma 4 is complete.

Proof of Lemma 3

We first divide the set \(\{(\gamma _1, \gamma _2, \nu ):\gamma _1\in R^p, \gamma _2\in R^p, \nu \le -\nu _1\}\) into the following three regions, denoted \(\mathcal{B}_1\), \(\mathcal{B}_2\), \(\mathcal{B}_3\) to avoid confusion with the reparameterized spaces \(B_1\), \(B_2\):

$$\begin{aligned} \mathcal{B}_1&=\left\{ (\gamma _1, \gamma _2, \nu ):\Vert \gamma _2\Vert \le \frac{1}{2}\Vert \lambda _n\Vert ,\ n\delta \le k(\nu )\le k_0-\nu _1c_n\right\} \\ \mathcal{B}_2&=\left\{ (\gamma _1, \gamma _2, \nu ):\Vert \gamma _2\Vert \le \frac{1}{2}\Vert \lambda _n\Vert ,\ 0\le k(\nu )\le n\delta \right\} \\ \mathcal{B}_3&=\left\{ (\gamma _1, \gamma _2, \nu ):\Vert \gamma _2\Vert \ge \frac{1}{2}\Vert \lambda _n\Vert ,\ 0\le k(\nu )\le k_0-\nu _1c_n\right\} . \end{aligned}$$

Thus, it is sufficient to show

$$\begin{aligned} P\left( \inf _{(\gamma _1, \gamma _2, \nu ) \in \mathcal{B}_i} V_n(\gamma _1, \gamma _2, \nu )<C\right) <\varepsilon , \quad i=1,2,3. \end{aligned}$$
(19)

First, on \(\mathcal{B}_1\) in (19), the behavior of \(U(1,k(\nu ),\gamma _1)\) and \(U(k_0+1,n,\gamma _2)\) is governed by Lemma 4(i). For the middle term, note that \(\Vert \gamma _2+\lambda _n\Vert \ge \Vert \lambda _n\Vert -\Vert \gamma _2\Vert \ge \Vert \lambda _n\Vert /2\). By Lemma 4(iv) with \(d_n=\frac{1}{2}\sqrt{n}\Vert \lambda _n\Vert \) (so that \(n^{-1/2}d_n=\Vert \lambda _n\Vert /2\)), \(h_n=\Vert \lambda _n\Vert ^{-2}\) and \(A=\nu _1\), for any \(D>0\), it can be shown that \(U(k(\nu )+1,k_0,\gamma _2+\lambda _n)\ge D\) with high probability when \(\nu _1\) is large. Thus, \(V_n\ge O_p(1)+D+O_p(1)\) with high probability. Second, on \(\mathcal{B}_2\) in (19), it follows from Lemma 4(i)–(iii) that

$$\begin{aligned} V_n \ge O_p(\log n)+D\log n+ O_p(1) \end{aligned}$$

with high probability. In particular, under Assumption (A4), we utilize the fact that

$$\begin{aligned} \Vert \gamma _2+\lambda _n\Vert \ge \Vert \lambda _n\Vert -\Vert \gamma _2\Vert \ge \frac{1}{2}\Vert \lambda _n\Vert \ge n^{-\frac{1}{2}}\log n \end{aligned}$$

when applying Lemma 4(iii) to \(U(k(\nu )+1,k_0, \gamma _2+\lambda _n)\). Lastly, on \(\mathcal{B}_3\) in (19), it can be shown similarly that \(V_n\ge O_p(\log n)+ O_p(\log n)+D\log n\) with high probability, by applying Lemma 4(ii) to \(U(1,k(\nu ),\gamma _1)\) and \(U(k(\nu )+1,k_0,\gamma _2+\lambda _n)\), and Lemma 4(iii) to \(U(k_0+1,n,\gamma _2)\). Therefore, Lemma 3 is proved.


About this article


Cite this article

Wang, Z., Liu, W. & Lin, Y. A change-point problem in relative error-based regression. TEST 24, 835–856 (2015). https://doi.org/10.1007/s11749-015-0438-2
