Abstract
A nonparametric relative error-based method is proposed to detect and estimate a change point in multiplicative regression models. The asymptotic distribution of the proposed test statistic under the null hypothesis of no change point is established. We prove the \(n\)-consistency of the proposed change-point estimator. Simulation studies demonstrate that change-point detection and estimation based on relative errors perform well in many practical situations. The method is illustrated with an application to a financial dataset.
References
Bai J (1995) Least absolute deviation estimation of a shift. Econom Theory 11:403–436
Bai J (1997) Estimating multiple breaks one at a time. Econom Theory 13:315–352
Basseville M, Nikiforov IV (1993) Detection of abrupt changes: theory and applications. Prentice Hall, Englewood Cliffs
Chen K, Guo S, Lin Y, Ying Z (2010) Least absolute relative error estimation. J Am Stat Assoc 105:1104–1112
Chen K, Lin Y, Wang Z, Ying Z (2013) Least product relative error estimation. http://arxiv.org/abs/1309.0220
Ciuperca G (2014) Estimation in a change-point nonlinear quantile model. http://arxiv.org/abs/1401.4883
Csörgő M, Horváth L (1997) Limit theorems in change point analysis. Wiley, New York
Dempfle A, Stute W (2002) Nonparametric estimation of a discontinuity in regression. Stat Neerl 56:233–242
Fokianos K, Gombay E, Hussein A (2014) Retrospective change detection for binary time series models. J Stat Plan Inference 145:102–112
Gijbels I, Gürler U (2003) Estimation of a change point in a hazard function based on censored data. Lifetime Data Anal 9:395–411
Gijbels I, Hall P, Kneip A (1999) On the estimation of jump points in smooth curves. Ann Inst Stat Math 51:231–251
Gurevich G, Vexler A (2005) Change point problems in the model of logistic regression. J Stat Plan Inference 131:313–331
Hall P, Molchanov I (2003) Sequential methods for design-adaptive estimation of discontinuities in regression curves and surfaces. Ann Stat 31:921–941
Kirch C, Kamgaing JT (2014) Monitoring time series based on estimating functions. Technical report, Fachbereich Mathematik, Technische Universität Kaiserslautern
Krishnaiah PR, Miao BQ (1988) Review about estimations of change-point. Handb Stat 7:375–402
Koul HL, Qian L (2002) Asymptotics of maximum likelihood estimation in a two-phase linear regression model. J Stat Plan Inference 108:99–119
Liu Y, Zou C, Zhang R (2008) Empirical likelihood ratio test for a change-point in linear regression model. Commun Stat Theory Methods 37:2551–2563
Loader CR (1996) Change point estimation using nonparametric regression. Ann Stat 24:1667–1678
Müller HG (1992) Change-points in nonparametric regression analysis. Ann Stat 20:737–761
Müller HG, Song K (1997) Two-stage change-point estimations in smooth regression models. Stat Probab Lett 34:323–335
Oka T, Qu Z (2011) Estimating structural changes in regression quantiles. J Econom 162:248–267
Park H, Stefanski LA (1998) Relative-error prediction. Stat Probab Lett 40:227–236
Pons O (2003) Estimation in a Cox regression model with a change-point at an unknown time. Ann Stat 31:442–463
Quandt RE, Ramsay JB (1978) Estimating mixtures of normal distributions and switching regressions. J Am Stat Assoc 73:730–752
Rao CR, Wu Y, Shi X (2010) An M-estimation-based criterion for simultaneous change point analysis and variable selection in a regression problem. J Stat Theory Pract 4:773–801
Ritov Y (1990) Asymptotic efficient estimation of the change point with unknown distributions. Ann Stat 18:1829–1839
Wang ZF, Wu YH, Zhao LC (2007) Change-point estimation for censored regression model. Sci China Ser A 50:63–72
Ye J (2007) Price models and the value relevance of accounting information. Technical report, Baruch College-Stan Ross Dept. of Accountancy, City University of New York
Zhang Q, Wang Q (2012) Local least absolute relative error estimating approach for partially linear multiplicative model. Stat Sin 23:1091–1116
Zhou Z (2013) Heteroscedasticity and autocorrelation robust structural change detection. J Am Stat Assoc 108:726–740
Acknowledgments
The authors are grateful to the Editor-in-Chief, Professor A.F. Militino, the Associate Editor, and the anonymous referees for comments and suggestions that led to substantial improvements in the paper. Wang’s work is supported by the State Key Program of the National Natural Science Foundation of China (Grant No. 11231010) and the Fundamental Research Funds for the Central Universities (Grant No. WK2040000010). Lin’s work is supported by the Hong Kong Research Grants Council (Grant No. 509413) and a Direct Grant for Research at the Chinese University of Hong Kong (Grant No. 4053136).
Appendix A: Proofs
A1. Proof of Theorem 1
The following lemmas are needed to prove Theorem 1. We first state Theorem 3 in Chen et al. (2013) as our Lemma 1, which will be used in the proof of Lemma 2.
Lemma 1
Suppose Assumptions (A1)–(A3) hold. Then, under \(H_0\), \(\sqrt{n}(\hat{\beta }_n-\beta _0)\) is asymptotically normal with mean 0 and covariance matrix \(D^{-1}\Sigma D^{-1}\), where \(D=E\{XX^{\top }(\epsilon +\epsilon ^{-1})\}\) and \(\Sigma =E\{XX^{\top }(\epsilon -\epsilon ^{-1})^2\}\).
Lemma 2
Suppose Assumptions (A1)–(A3) hold. Then, under \(H_0\),
Proof of Lemma 2
Denote \(h_i(\beta )=Y_i^{-1}\exp (X_i^{\top }\beta )-Y_i\exp (-X_i^{\top }\beta )\). Then, \(h_i(\beta _0)=\epsilon _i^{-1}-\epsilon _i\). For any \(x\in R^p\), let \(x^{(i)}\), \(i=1, \ldots , p\), denote the \(i\)th component of \(x\). In view of \(S_n(\hat{\beta }_n)=0\), by a Taylor expansion, we have
where \(E_{k,n}^{(i)}\) is the remainder term to be discussed later. By Assumption (A2), \(EX_l^{(i)}X_l^{(j)}h_l(\beta _0)=0\). Since \((X_i, \epsilon _i)\), \(i=1,\ldots , n\), are i.i.d., by the Hájek–Rényi inequality and Assumption (A3), we have
This gives
On the other hand, by the Hájek–Rényi inequality and Assumption (A3), we can show
Combining (7) and (8), we have
which leads to
Thus, (9) and (10) together imply
It follows from Lemma 1 that \(\sqrt{n}|\hat{\beta }_n^{(j)}-\beta _0^{(j)}|=O_p(1)\). Together with (11), we have shown
Moreover, \(E_{k,n}^{(i)}\) is of higher order in \((\hat{\beta }_n^{(j)}-\beta _0^{(j)})\); under Assumption (A3), it can be shown to be \(o_p(1)\) uniformly in \(1\le k\le n\). This completes the proof of Lemma 2.
Proof of Theorem 1
First, by the Functional Central Limit Theorem,
in distribution as \(n\rightarrow \infty \), where \(\{W(t):0\le t\le 1\}\) is a Wiener process with covariance matrix \(\Sigma =E\{XX^{\top }(\epsilon ^{-1}-\epsilon )^2\}\). Thus, as \(n\rightarrow \infty \),
in distribution. Second, it follows from the consistency of \(\hat{\beta }_n\) that \(\hat{\Sigma }_n\) is a consistent estimate of \(\Sigma \). Therefore, by Lemma 2 and Slutsky’s theorem, Theorem 1 is proved.
A2. Proof of Theorem 2
Since a closed-form estimator of \(\beta _1\) and \(\beta _2\) for fixed \(k\) is not available, analogous to Bai (1995), we obtain the rate of convergence of \(\hat{t}_n\) by evaluating the global behavior of \(G_n(\beta _1,\beta _2,k)\) over the whole parameter space for \(\beta _1\), \(\beta _2\) and \(k\). We first reparameterize \(G_n(\beta _1,\beta _2,k)\) as
where \(\gamma _1 = \beta _1-\beta _{10}\), \(\gamma _2 = \beta _2-\beta _{20}\), and \(k(\nu )=[k_0+\nu c_n]\) with \(c_n=O(1/{\Vert \lambda _n\Vert ^2})\), where \(\nu \) is a real scalar and \([x]\) denotes the integer part of \(x\). In this way, as \(\nu \) varies over \(R\), \(k(\nu )\) visits all integers between \(1\) and \(n\); we set \(k(\nu )=1\) if \(k(\nu )\le 1\) and \(k(\nu )=n\) if \(k(\nu )\ge n\). Note that the minimization problem is unchanged by this reparameterization. Denote
where \(B_1=\{\beta -\beta _{10}:\beta \in B\}\) and \(B_2=\{\beta -\beta _{20}:\beta \in B\}\) are the reparameterized spaces. Then, we have \([\hat{\nu }_nc_n]=n(\hat{t}_n-t_0)=\hat{k}_n-k_0\), so Theorem 2 is equivalent to \(\hat{\nu }_n=O_p(1)\). In view of \(V_n(0,0,0)=0\), Theorem 2 is a consequence of the following lemma.
Lemma 3
Suppose Assumptions (A1)–(A4) hold. Then, for every \(\epsilon >0\) and \(C>0\), there exists \(\nu _1>0\) such that for large \(n\)
This lemma describes the global behavior of \(V_n(\gamma _1,\gamma _2,\nu )\), or equivalently of \(G_n(\beta _1,\beta _2,k)\), and shows that \(G_n(\beta _1,\beta _2,k)\) cannot achieve its global minimum when \(k\) is far from \(k_0\). To prove Lemma 3, by symmetry it suffices to consider the case \(\nu \le 0\) without loss of generality. Then, for \(k(\nu ) \le k_0\),
Write \(U(l,j,\gamma )=\sum _{i=l}^{j}\{\epsilon _i\exp (-X_i^{\top }\gamma )+\epsilon _i^{-1}\exp (X_i^{\top }\gamma )-(\epsilon _i+\epsilon _i^{-1})\}.\) Then,
In the following discussion, we put \(U_k(\gamma )=U(1,k,\gamma )\). We need the following lemma to prove Lemma 3.
Lemma 4
Under Assumptions (A1)–(A3), we have the following
(i) for any \(\delta \in (0,1)\),
$$\begin{aligned} \sup _{n\delta \le k \le n}\left| \inf _{\gamma }U_k(\gamma )\right| =O_p(1); \end{aligned}$$
(ii)
$$\begin{aligned} \sup _{1\le k\le n}\left| \inf _{\gamma }U_k(\gamma )\right| =O_p(\log n); \end{aligned}$$
(iii) for any \(\delta \in (0,1)\), \(\varepsilon > 0\), \(D>0\) and \(n\) sufficiently large,
$$\begin{aligned} P\left( \inf _{n\delta \le k \le n} \inf _{\Vert \gamma \Vert \ge n^{-{1}/{2}}{\log n}}U_k(\gamma )<D \log n\right) <\varepsilon ; \end{aligned}$$
(iv) let \(h_n\) and \(d_n\) be positive sequences such that \(h_n\) is nondecreasing, \(h_n \rightarrow \infty \), \(d_n \rightarrow \infty \) and \(h_n d_n^2/n \rightarrow h\) for some \(0<h<\infty \). Then, for any \(\varepsilon >0\) and \(D>0\), there exists an \(A>0\) such that, for \(n\) sufficiently large,
$$\begin{aligned} P\left( \inf _{Ah_n\le k\le n} \inf _{\Vert \gamma \Vert \ge n^{-1/2}d_n}U_k(\gamma )<D \right) <\varepsilon . \end{aligned}$$
Proof of Lemma 4
First, by a Taylor expansion, we obtain
where \(\gamma ^{*}\) is between \(0\) and \(\gamma \).
Step 1. To prove part (i). Write
Lemma 1 implies that \(\sup _{n\delta \le k \le n} \sqrt{n} \Vert \hat{\gamma }_k\Vert =O_p(1)\). Thus, it suffices to prove that for each \(A>0\),
In view of (13), we have
for \(\eta >0\) as \(\Vert \gamma ^{*}\Vert \le \Vert \gamma \Vert \le An^{-1/2}\rightarrow 0\). The last equality holds by the Hájek–Rényi inequality, the law of large numbers and Assumptions (A2)–(A3).
Step 2. To prove part (ii). Analogous to the proof of Lemma A.3 in Oka and Qu (2011), we obtain \(\sup _{1\le k \le n}(\log n)^{-1/2}k^{1/2}\Vert \hat{\gamma }_k\Vert =O_p(1)\). Thus, it suffices to show that for each \(B>0\),
In view of (13), we have
for \(\eta >0\), as \(\gamma ^*\) lies in a compact set; the last equality holds by the Hájek–Rényi inequality and Assumptions (A2)–(A3) after centering the second term.
Step 3. To prove part (iii). Due to the convexity of \(U_k(\gamma )\) in \(\gamma \), it suffices to show
For the second term of (13), we utilize the arithmetic–geometric mean inequality and obtain
Hence,
By Assumptions (A1)–(A2), we have for some \(L > 0\),
On the other hand, under Assumptions (A1)–(A3),
Again, the last equality holds by the Hájek–Rényi inequality. Combining (14) and (15) proves part (iii).
Step 4. To prove part (iv). Due to the convexity in \(\gamma \), it suffices to show
It follows from the lines of the proof of part (iii) that
By Assumptions (A1)–(A2), for some \(L > 0\),
which implies that the expected value of \(\widetilde{U}_k(\gamma )\) is large. To prove (iv), it therefore suffices to show that the quantity of interest is dominated by its expected value; more precisely,
where \(\sigma _1=E\Vert X_i(-\epsilon _i+\epsilon _i^{-1})\Vert ^2<\infty \) and \(\sigma _2=E\Vert (X_iX_i^{\top }-EXX^{\top })\Vert ^2<\infty \) under Assumption (A3). The third inequality holds by the Hájek–Rényi inequality, and the fourth is based on the fact that \(\sum _{k=Ah_n+1}^{n}{1}/{k^2}\le 2/(Ah_n)\). The last inequality holds for \(n\) and \(A\) large, since \(h_n d_n^2/n \rightarrow h\) as \(n\rightarrow \infty \).
Hence, (16), (17) and (18) together imply that
for sufficiently large \(n\) and \(A\). Since \(A{h_nd_n^2}L/n \rightarrow AhL\) as \(n\rightarrow \infty \), \(A{h_nd_n^2}L/n\) can be made greater than \(D\) by choosing \(A\) large. Hence, the proof of Lemma 4 is complete.
Proof of Lemma 3
We first divide the set \(\{(\gamma _1, \gamma _2, \nu ):\gamma _1\in R^p, \gamma _2\in R^p, \nu \le -\nu _1\}\) into the following three regions:
Thus, it is sufficient to show
First, on \(B_1\) in (19), the behavior of \(U(1,k(\nu ),\gamma _1)\) and \(U(k_0,n,\gamma _2)\) is governed by Lemma 4(i), since \(\Vert \gamma _2+\lambda _n\Vert \ge \Vert \lambda _n\Vert -\Vert \gamma _2\Vert \ge \Vert \lambda _n\Vert /2\). By Lemma 4(iv) with \(d_n=\frac{1}{2}\Vert \lambda _n\Vert \), \(h_n=\Vert \lambda _n\Vert ^{-2}\) and \(A=\nu _1\), for any \(D>0\), it can be shown that \(U(k(\nu )+1,k_0,\gamma _2+\lambda _n)\ge D\) with high probability when \(\nu _1\) is large. Thus, \(V_n\ge O_p(1)+D+O_p(1)\) with high probability. Second, on \(B_2\) in (19), it follows from Lemma 4(i)–(iii) that
with high probability. In particular, under Assumption (A4), we utilize the fact that
when applying Lemma 4(iii) to \(U(k(\nu )+1,k_0, \gamma _2+\lambda _n)\). Lastly, on \(B_3\) in (19), it can be shown similarly that \(V_n\ge O_p(\log n)+ O_p(\log n)+D\log n\) with high probability by applying Lemma 4(ii) to \(U(1,k(\nu ),\gamma _1)\) and \(U(k(\nu )+1,k_0,\gamma _2+\lambda _n)\), and Lemma 4(iii) to \(U(k_0,n,\gamma _2)\), respectively. Therefore, Lemma 3 is proved.
Wang, Z., Liu, W. & Lin, Y. A change-point problem in relative error-based regression. TEST 24, 835–856 (2015). https://doi.org/10.1007/s11749-015-0438-2