Abstract
Temporally correlated error processes are commonly encountered in practice and pose significant challenges for high-dimensional statistical analysis. This paper conducts low-dimensional inference for high-dimensional linear models with stationary errors. We adopt the framework of the functional dependence measure to adequately accommodate the error correlation. A new desparsified-Lasso-based testing procedure is developed by incorporating a banded estimator of the error autocovariance matrix. Asymptotic normality of the proposed estimator is established by demonstrating the consistency of the banded autocovariance matrix estimator. The result shows how the admissible range of p narrows substantially as the moment condition on the errors weakens or the dependence becomes stronger. We further develop a data-driven choice of the banding parameter. Simulation studies illustrate the satisfactory finite-sample performance of the proposed procedure, and a real data example is presented for illustration.
Abbreviations
- i.i.d.: Independent and identically distributed
- CDF: Cumulative distribution function
- AR: Autoregressive model
- MA: Moving average model
- Cov: Coverage probability of the confidence intervals
- Len: Length of the confidence intervals
- Ave: Averaged estimated parameters
- Esd: Empirical standard deviation
- ACov: Averaged coverage probability of the confidence intervals
- ALen: Averaged length of the confidence intervals
References
Adamek R, Smeekes S, Wilms I (2020) Lasso inference for high-dimensional time series. arXiv preprint arXiv:2007.10952v1
Babii A, Ghysels E, Striaukas J (2020) Inference for high-dimensional regressions with heteroskedasticity and autocorrelation. arXiv preprint arXiv:1912.06307v2
Basu S, Michailidis G (2015) Regularized estimation in sparse high-dimensional time series models. Ann Stat 43:1535–1567
Bickel PJ, Levina E (2008) Regularized estimation of large covariance matrices. Ann Stat 36:199–227
Bühlmann P, van de Geer S (2011) Statistics for high-dimensional data: methods, theory and applications. Springer, New York
Candes E, Tao T (2007) The Dantzig selector: statistical estimation when p is much larger than n. Ann Stat 35:2313–2404
Chernozhukov V, Härdle WK, Huang C, Wang W (2020) Lasso-driven inference in time and space. arXiv preprint arXiv:1806.05081v4
de Mol C, Giannone D, Reichlin L (2008) Forecasting using a large number of predictors: is Bayesian shrinkage a valid alternative to principal components? J Econom 146:318–328
Deshpande Y, Mackey L, Syrgkanis V, Taddy M (2018) Accurate inference for adaptive linear models. In: Proceedings of the 35th international conference on machine learning, pp 1202–1211
Deshpande Y, Javanmard A, Mehrabi M (2020) Online debiasing for adaptively collected high-dimensional data with applications to time series analysis. arXiv preprint arXiv:1911.01040v3
Fan JQ, Li RZ (2001) Variable selection via nonconcave penalized likelihood and its oracle properties. J Am Stat Assoc 96:1348–1360
Fan JQ, Lv JC (2010) A selective overview of variable selection in high dimensional feature space. Stat Sin 20:101–148
Fan JQ, Qi L, Tong X (2016) Penalized least squares estimation with weakly dependent data. Sci China Math 59:2335–2354
Grenander U, Szegö G (1958) Toeplitz forms and their applications. Cambridge University Press, London
Gupta S (2012) A note on the asymptotic distribution of lasso estimator for correlated data. Sankhya A 74:10–28
Han Y, Tsay R (2020) High-dimensional linear regression for dependent data with applications to nowcasting. Stat Sin, to appear
Hastie T, Tibshirani R, Wainwright M (2015) Statistical learning with sparsity: the Lasso and generalizations. Taylor & Francis Group, New York
Javanmard A, Montanari A (2014) Confidence intervals and hypothesis testing for high-dimensional regression. J Mach Learn Res 15:2869–2909
Liu WD, Wu WB (2010) Asymptotics of spectral density estimates. Econom Theory 26:1218–1245
Politis DN, Romano JP, Wolf M (1999) Subsampling. Springer, New York
Raskutti G, Wainwright MJ, Yu B (2010) Restricted eigenvalue properties for correlated Gaussian designs. J Mach Learn Res 11:2241–2259
Shao X, Wu WB (2007) Asymptotic spectral theory for nonlinear time series. Ann Stat 35:1773–1801
Smith SM (2012) The future of FMRI connectivity. NeuroImage 62:1257–1266
Tibshirani R (1996) Regression shrinkage and selection via the lasso. J R Stat Soc B 58:267–288
van de Geer S, Bühlmann P (2009) On the conditions used to prove oracle results for the Lasso. Electron J Stat 3:1360–1392
van de Geer S, Bühlmann P, Ritov Y, Dezeure R (2014) On asymptotically optimal confidence regions and tests for high-dimensional models. Ann Stat 42:1166–1202
Wainwright MJ (2019) High-dimensional statistics: a non-asymptotic viewpoint. Cambridge University Press, Cambridge
Wang H, Li G, Tsai CL (2007) Regression coefficient and autoregressive order shrinkage and selection via the lasso. J R Stat Soc B 69:63–78
Wong K, Li Z, Tewari A (2020) Lasso guarantees for \(\beta \)-mixing heavy-tailed time series. Ann Stat 48:1124–1142
Wu WB (2005) Nonlinear system theory: another look at dependence. Proc Natl Acad Sci USA 102:14150–14154
Wu WB, Pourahmadi M (2009) Banding sample autocovariance matrices of stationary processes. Stat Sin 19:1755–1768
Wu WB, Wu YN (2016) Performance bounds for parameter estimates of high-dimensional linear models with correlated errors. Electron J Stat 10:352–379
Xie F, Xiao ZJ (2018) Square-root lasso for high-dimensional sparse linear systems with weakly dependent errors. J Time Ser Anal 39:212–238
Zhang CH (2010) Nearly unbiased variable selection under minimax concave penalty. Ann Stat 38:894–942
Zhang CH, Zhang SS (2014) Confidence intervals for low dimensional parameters in high dimensional linear models. J R Stat Soc B 76:217–242
Zhang K, Janson L, Murphy S (2020) Inference for batched bandits. arXiv preprint arXiv:2002.03217v2
Zou H (2006) The adaptive lasso and its oracle properties. J Am Stat Assoc 101:1418–1429
Acknowledgements
Xiao Guo’s research is supported by the National Natural Science Foundation of China, grants 12071452, 72091212, 11601500, and USTC Research Funds of the Double First-Class Initiative, grant YD2040002013. The authors also thank the Editor, Associate Editor, and two anonymous referees for their constructive comments that have led to a substantial improvement of this paper.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendices
Appendix: Proofs of main results
Technical lemmas
Recall that \(\varepsilon _i=g(\ldots ,\xi _{i-1},\xi _i)=g({\mathcal {F}}_i)\) in (6). Define
$$\begin{aligned} \varepsilon _i^*={\mathbb {E}}(\varepsilon _i\mid {\mathcal {F}}_{i-m,i}), \end{aligned}$$
where \({\mathcal {F}}_{i-m,i}\) is the \(\sigma \)-algebra generated by \((\xi _{i-m},\ldots ,\xi _i)\). Then \(\{\varepsilon _i^*\}\) is a sequence of m-dependent random variables with mean zero. To prove the main theorems of this paper, we first give some lemmas.
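To make the m-dependent approximation concrete, here is a small numerical sketch (our illustration, not part of the original argument). It simulates the causal linear process \(\varepsilon _i=\sum _{j\ge 0}\rho ^j\xi _{i-j}\), for which \(\varepsilon _i^*={\mathbb {E}}(\varepsilon _i\mid {\mathcal {F}}_{i-m,i})\) is simply the lag-m truncation of the series; the AR(1) form and all constants are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n, burn, rho = 2000, 500, 0.6

# Innovations xi_i driving the causal process eps_i = g(F_i).
xi = rng.standard_normal(n + burn)

# Linear process eps_i = sum_{j>=0} rho^j xi_{i-j}, truncated at `burn` lags.
def causal_linear(xi, rho, burn):
    eps = np.zeros(len(xi))
    for i in range(len(xi)):
        j = np.arange(min(i + 1, burn))
        eps[i] = np.sum(rho ** j * xi[i - j])
    return eps

eps = causal_linear(xi, rho, burn)[burn:]

# m-dependent approximation eps_i^* = E(eps_i | xi_{i-m}, ..., xi_i):
# for this linear process it is the lag-m truncation of the series.
for m in (1, 3, 5, 10):
    eps_star = np.array([np.sum(rho ** np.arange(m + 1)
                                * xi[burn + i - np.arange(m + 1)])
                         for i in range(n)])
    print(m, np.max(np.abs(eps - eps_star)))  # shrinks like rho^(m+1)
```

The printed coupling errors decay geometrically in m, which is the mechanism the m-dependent approximation exploits below.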
Lemma A.1
(Concentration inequalities under dependence)
(i) Suppose \({\vert \vert \vert \varvec{\varepsilon }. \vert \vert \vert }_{q,\alpha }<\infty \), where \(q>2\) and \(\alpha >0\), and \(\sum _{i=1}^na_i^2=n\). Let \({\mathbf {a}}=(a_1,\ldots ,a_n)^\top \), and let \(\zeta _n=1\) (resp. \(\zeta _n=(\log n)^{1+2q}\) or \(\zeta _n= n^{q/2-1-\alpha q}\)) if \(\alpha >1/2-1/q\) (resp. \(\alpha =1/2-1/q\) or \(\alpha <1/2-1/q\)). Then, with \(S_n=\sum _{i=1}^na_i\varepsilon _i\), for all \(x>0\) we have
$$\begin{aligned} {\mathbb {P}}(|S_n|\ge x)\le K_1\frac{\zeta _n\Vert {\mathbf {a}}\Vert _q^q{\vert \vert \vert \varvec{\varepsilon }. \vert \vert \vert }_{q,\alpha }^q}{x^q}+K_2\exp \Big (-\frac{K_3x^2}{n{\vert \vert \vert \varvec{\varepsilon }. \vert \vert \vert }_{2,\alpha }^2}\Big ), \end{aligned}$$where \(K_1, K_2, K_3\) are constants that depend only on q and \(\alpha \).
(ii) Suppose \({\mathcal {D}}_z<\infty \) and \(\sum _{i=1}^na_i^2=n\). Let \(\alpha =2/(1+2z)\), and let \(c_\alpha \) be a constant depending only on \(\alpha \). Then, with \(S_n=\sum _{i=1}^na_i\varepsilon _i\), for all \(x>0\) we have
$$\begin{aligned} {\mathbb {P}}(|S_n|\ge nx)\le (2+\sqrt{2}c_\alpha )\exp \Big ( -\frac{(\sqrt{n}x/{\mathcal {D}}_z)^\alpha }{2e\alpha } \Big ). \end{aligned}$$
Proof
See Theorems 2 and 3 of Wu and Wu (2016). Details are omitted. \(\square \)
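As a sanity check on the shape of the bound in part (i), the following Monte Carlo sketch (our illustration; the AR(1) error model and the Gaussian reference term are arbitrary choices, and no attempt is made to match the constants \(K_1, K_2, K_3\)) compares the empirical tail of \(S_n=\sum _{i=1}^na_i\varepsilon _i\) with a Gaussian-type term.

```python
import numpy as np

rng = np.random.default_rng(1)
n, reps, rho = 500, 2000, 0.5
a = np.ones(n)                       # weights with sum a_i^2 = n

def ar1_errors(n, rho, rng):
    eps = np.zeros(n)
    for i in range(1, n):
        eps[i] = rho * eps[i - 1] + rng.standard_normal()
    return eps

S = np.array([a @ ar1_errors(n, rho, rng) for _ in range(reps)])

lrv = 1.0 / (1 - rho) ** 2           # long-run variance of the AR(1) errors
for x in (1.0, 2.0, 3.0):
    emp = np.mean(np.abs(S) >= x * np.sqrt(n * lrv))
    gauss = 2 * np.exp(-x ** 2 / 2)  # Gaussian-type term, illustrative constants
    print(x, emp, gauss)
```

The empirical tail stays below the Gaussian-type term here, as expected when the polynomial term of the bound is negligible.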
Lemma A.2
Suppose that \(\varDelta _{0,q}<\infty \) for \(q\ge 2\). Let \(a_1,a_2,\ldots \in {\mathbb {R}} \), \(A_n=(\sum _{i=1}^n a_i^2)^{1/2}\), and \(C_q=18q^{3/2}(q-1)^{-1/2}.\) Then (i) \({\vert \vert \vert \sum _{i=1}^n a_i\varepsilon _i \vert \vert \vert }_q\le C_qA_n\varDelta _{0,q}\), and (ii) \({\vert \vert \vert \sum _{i=1}^n a_i(\varepsilon _i-\varepsilon _i^*) \vert \vert \vert }_q\le C_qA_n\varDelta _{m+1,q}.\)
Proof
This result can be proved by using the arguments of Lemma 1 in Liu and Wu (2010). \(\square \)
Lemma A.3
Suppose \( \varDelta _{m,q}<\infty \) for \(2 <q\le 4\) and let \(d=q/2\). Then for any \(j\in {\mathbb {Z}}\),
where
Proof
See Lemma 1 of Wu and Pourahmadi (2009). \(\square \)
Lemma A.4
For \(j\in \{1,\ldots ,p \}\), we have
Proof
This lemma follows from the KKT conditions for the nodewise Lasso.
First, notice that \(\widehat{z}_j^\top \widehat{\varvec{\gamma }}_j=\Vert \widehat{\varvec{\gamma }}_j\Vert _1\), where \(\widehat{z}_j\) is a subgradient of \(\Vert \varvec{\gamma }\Vert _1\) at \(\widehat{\varvec{\gamma }}_j\), so that \(\Vert \widehat{z}_j\Vert _\infty \le 1\). By the KKT conditions for the nodewise Lasso (5), we have
thus
Dividing each side of the above display by \(\widehat{\tau }_j^2\) yields
so that
Moreover, note that the KKT conditions for the nodewise lasso (5) can be written as
using \(\Vert \widehat{z}_j\Vert _\infty \le 1\) yields
which is equivalent to
since \((X_j-{\mathbf {X}}_{-j}\widehat{\varvec{\gamma }}_j)={\mathbf {X}}\widehat{C}_j.\) Then, dividing both sides of the above display by \(\widehat{\tau }_j\) and using that \(\widehat{\varvec{\varTheta }}_j=\widehat{ C}_j/\widehat{\tau }_j^2\), we have
and combining this with \(X_j^\top {\mathbf {X}}\widehat{\varvec{\varTheta }}_j/n=1\), we obtain
\(\square \)
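The conclusion of Lemma A.4, namely the entrywise bound \(\Vert \widehat{\varvec{\varSigma }}\widehat{\varvec{\varTheta }}_j-e_j\Vert _\infty \le \lambda _j/\widehat{\tau }_j^2\), can be verified numerically. A minimal sketch, assuming the nodewise objective \(\Vert X_j-{\mathbf {X}}_{-j}\varvec{\gamma }\Vert _2^2/n+2\lambda _j\Vert \varvec{\gamma }\Vert _1\), which sklearn's Lasso with alpha \(=\lambda _j\) solves; the data and tuning values are illustrative:

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(2)
n, p, j, lam = 200, 50, 0, 0.1
X = rng.standard_normal((n, p))

# Nodewise Lasso: regress X_j on the remaining columns.
# sklearn minimizes (1/(2n))||y - Xw||^2 + alpha*||w||_1, which matches
# (1/n)||X_j - X_{-j} g||^2 + 2*lam*||g||_1 when alpha = lam.
fit = Lasso(alpha=lam, fit_intercept=False).fit(np.delete(X, j, axis=1), X[:, j])
gamma = fit.coef_

resid = X[:, j] - np.delete(X, j, axis=1) @ gamma
tau2 = resid @ resid / n + lam * np.abs(gamma).sum()  # tau_j^2 as in the paper

C_j = np.insert(-gamma, j, 1.0)   # 1 at position j, -gamma elsewhere
Theta_j = C_j / tau2              # j-th row of the relaxed inverse

Sigma_hat = X.T @ X / n
e_j = np.eye(p)[j]
# KKT-based bound of Lemma A.4 (holds up to solver tolerance):
print(np.max(np.abs(Sigma_hat @ Theta_j - e_j)), "<=", lam / tau2)
```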
Lemma A.5
Suppose Assumptions 3–4 hold and \(\log p=o(\sqrt{n})\). Then there exist constants c and C such that, for all j,
Proof
By the definition of \(\widehat{C}_j\), we have \(\Vert {\mathbf {X}}\widehat{C}_j\Vert _2^2/n=\Vert X_{j}-{\mathbf {X}}_{-j} \widehat{\varvec{\gamma }}_{j}\Vert _{2}^2 / n :=\widetilde{\tau }_{j}^2,\) and \(\widehat{\varvec{\varTheta }}_j=\widehat{ C}_j/\widehat{\tau }_j^2\). Thus
Note that \(\widehat{\tau }_j^2=\widetilde{\tau }_j^2+\lambda _j\Vert \widehat{\varvec{\gamma }}_j\Vert _1\); thus we get \(\Vert {\mathbf {X}}\widehat{\varvec{\varTheta }}_j\Vert _2^2/n\le C\) since \(\widetilde{\tau }_j^2\le \widehat{\tau }_j^2\le C\). Furthermore, by Lemma A.4, Assumptions 3–4 and \(\log p=o(\sqrt{n})\), we get
Thus
\(\square \)
Lemma A.6
Consider the linear model in (2) and the Lasso in (3).
(i) Suppose the error series \(\{\varepsilon _i\}\) has finite q-th moment for some \(q>2\), and \({\vert \vert \vert \varvec{\varepsilon }. \vert \vert \vert }_{q,\alpha }<\infty \) for \( \alpha \ge 0.\) Define
$$\begin{aligned} v=\left\{ \begin{array}{ll} {1}/{2}, &{} { \text{ if } \alpha >1 / 2-1 / q} ,\\ {{1}/{q}+\alpha }, &{} { \text{ if } \alpha <1 / 2-1 / q}, \end{array} \right. \end{aligned}$$and let \(\lambda =A\max \{(n^{-1}\log p )^{1/2}{\vert \vert \vert \varvec{\varepsilon }. \vert \vert \vert }_{2,\alpha }, n^{-v}(p\log p)^{\frac{1}{q}} {\vert \vert \vert \varvec{\varepsilon }. \vert \vert \vert }_{q,\alpha }\},\) where A is a sufficiently large constant. Then, with probability at least \(1- C_1(\log p)^{-1}-C_2p^{1-K_3A^2}\), we have
$$\begin{aligned} \Vert \widehat{\varvec{\beta }}-\varvec{\beta }^*\Vert _1\le Cs_0\lambda \quad \mathrm{and }\quad \Vert {\mathbf {X}}(\widehat{\varvec{\beta }}-\varvec{\beta }^*)\Vert _2^2/n\le Cs_0\lambda ^2. \end{aligned}$$(A.1)
(ii) Suppose the error series \(\{\varepsilon _i\}\) has finite exponential moment, that is, \({\mathcal {D}}_z<\infty \). Let \(\alpha =2/(1+2z)\), let \(c_\alpha \) be a constant depending only on \(\alpha \), and let \(\lambda =An^{-1/2}(\log p)^{1/\alpha }{\mathcal {D}}_z\), where A is a sufficiently large constant. Then, with probability at least \(1- C_1p^{1-C_2A^\alpha }\), the bounds (A.1) hold.
Proof
Since \(\widehat{\varvec{\beta }}\) minimizes (3), we have
Define the event \({\mathcal {T}}:=\{\max _{1\le j\le p}|\varGamma _j|\le \lambda /c\}\) for some constant \(c>0\), where \(\varGamma _j=n^{-1}\sum _{i=1}^nx_{ij}\varepsilon _i.\) Then on the event \({\mathcal {T}}\), using the fact
inequality (A.2) implies that
On the left-hand side of (A.3), using the triangle inequality,
whereas on the right-hand side of (A.3), we can use
Thus, we have
In particular, the above implies
Then Assumption 1 shows that
combining (A.4), we get
Furthermore,
we have
Similarly,
Now we need to control the probability \({\mathbb {P}}({\mathcal {T}}).\) We consider two cases.
(i) Suppose the error series \(\{\varepsilon _i\}\) has finite q-th moment for some \(q>2\), and \({\vert \vert \vert \varvec{\varepsilon }. \vert \vert \vert }_{q,\alpha }<\infty \) for \( \alpha \ge 0.\) For \(\alpha >1/2-1/q,\) let \(v=1/2\); then, by the inequality of Lemma A.1 (i) with \(\zeta _n=1\), we have
$$\begin{aligned} \begin{aligned} {\mathbb {P}}(|\varGamma _j|\ge \lambda /c)&\le K_1{\vert \vert \vert \varvec{\varepsilon }. \vert \vert \vert }_{q,\alpha }^q\frac{\sum _{i=1}^n|x_{ij}|^q}{(n\lambda )^q}+K_2\exp \Big (\frac{-K_3n\lambda ^2}{{\vert \vert \vert \varvec{\varepsilon }. \vert \vert \vert }_{2,\alpha }^2}\Big )\\&\le K_1\frac{{\vert \vert \vert \varvec{\varepsilon }. \vert \vert \vert }_{q,\alpha }^q}{(\sqrt{n}\lambda )^q}+K_2\exp \Big (\frac{-K_3n\lambda ^2}{{\vert \vert \vert \varvec{\varepsilon }. \vert \vert \vert }_{2,\alpha }^2}\Big ) \end{aligned} \end{aligned}$$Hence
$$\begin{aligned} \begin{aligned} {\mathbb {P}}({\mathcal {T}}^c)&\le \sum _{j=1}^p K_1\frac{{\vert \vert \vert \varvec{\varepsilon }. \vert \vert \vert }_{q,\alpha }^q}{(\sqrt{n}\lambda )^q}+K_2p\exp \Big (\frac{-K_3n\lambda ^2}{{\vert \vert \vert \varvec{\varepsilon }. \vert \vert \vert }_{2,\alpha }^2}\Big )\\&= K_1p\frac{{\vert \vert \vert \varvec{\varepsilon }. \vert \vert \vert }_{q,\alpha }^q}{(\sqrt{n}\lambda )^q}+K_2p\exp \Big (\frac{-K_3n\lambda ^2}{{\vert \vert \vert \varvec{\varepsilon }. \vert \vert \vert }_{2,\alpha }^2}\Big ), \end{aligned} \end{aligned}$$under our choice of \(\lambda =A\max \{(n^{-1}\log p )^{1/2}{\vert \vert \vert \varvec{\varepsilon }. \vert \vert \vert }_{2,\alpha }, n^{-1/2}(p\log p)^{\frac{1}{q}} {\vert \vert \vert \varvec{\varepsilon }. \vert \vert \vert }_{q,\alpha }\}\) where A is a sufficiently large constant, we have
$$\begin{aligned} {\mathbb {P}}({\mathcal {T}}^c)\le C_1(\log p)^{-1}+C_2p^{1-K_3A^2}. \end{aligned}$$The case \(\alpha <1/2-1/q\) is proved similarly.
(ii) Suppose the error series \(\{\varepsilon _i\}\) satisfies the stronger moment condition \({\mathcal {D}}_z<\infty \). By Lemma A.1 (ii), we have
$$\begin{aligned} {\mathbb {P}}(|\varGamma _j|\ge \lambda /c)\le (2+\sqrt{2}c_\alpha )\exp \Big ( -\frac{(\sqrt{n}\lambda /(c{\mathcal {D}}_z) )^\alpha }{2e\alpha }\Big ). \end{aligned}$$Hence
$$\begin{aligned} \begin{aligned} {\mathbb {P}}({\mathcal {T}}^c)\le \sum _{j=1}^p{\mathbb {P}}(|\varGamma _j|\ge \lambda /c)= (2+\sqrt{2}c_\alpha )p\exp \Big ( -\frac{(\sqrt{n}\lambda /(c {\mathcal {D}}_z ) )^\alpha }{2e\alpha }\Big ). \end{aligned} \end{aligned}$$Letting \(\lambda =An^{-1/2}(\log p)^{1/\alpha }{\mathcal {D}}_z\), we get
$$\begin{aligned} {\mathbb {P}}({\mathcal {T}}^c)\le C_1p^{1-C_2A^\alpha }. \end{aligned}$$
This completes the proof. \(\square \)
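To see the rates of (A.1) at work, here is a simulation sketch (ours; the design, the AR(1) error model, and the constant A are arbitrary illustrative choices). It scales \(\lambda \) by the long-run standard deviation of the errors, in the spirit of part (i), and reports the two losses next to their nominal orders \(s_0\lambda \) and \(s_0\lambda ^2\):

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(3)
n, p, s0, rho, A = 300, 500, 5, 0.5, 1.0

X = rng.standard_normal((n, p))
beta = np.zeros(p); beta[:s0] = 1.0

eps = np.zeros(n)                   # AR(1): stationary, short-range dependent errors
for i in range(1, n):
    eps[i] = rho * eps[i - 1] + rng.standard_normal()
y = X @ beta + eps

lrv = 1.0 / (1 - rho) ** 2          # long-run variance, standing in for |||eps|||_{2,alpha}^2
lam = A * np.sqrt(lrv * np.log(p) / n)

b = Lasso(alpha=lam, fit_intercept=False).fit(X, y).coef_

print("l1 error:", np.abs(b - beta).sum(), "  s0*lam =", s0 * lam)
print("pred error:", ((X @ (b - beta)) ** 2).sum() / n, "  s0*lam^2 =", s0 * lam ** 2)
```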
Lemma A.7
Assume the conditions of Lemma A.6 hold and that \(s_0\lambda ^2=o(1)\). Then for any fixed k, we have
Proof
By using the results in Lemma A.6, we have
thus \(e_k{\mathop {\longrightarrow }\limits ^{P}}\varepsilon _k\) for \(k=0,\ldots ,n-1\). Then for any fixed k,
by the ergodicity condition, which completes the proof. \(\square \)
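Lemma A.7 is, in essence, the statement that the Lasso residuals track the true errors uniformly once \(s_0\lambda ^2=o(1)\), since \(|e_k-\varepsilon _k|\le \Vert {\mathbf {x}}_k\Vert _\infty \Vert \widehat{\varvec{\beta }}-\varvec{\beta }^*\Vert _1\). A quick self-contained check (ours; i.i.d. errors and all constants are illustrative):

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(4)
n, p, s0 = 300, 500, 5
X = rng.standard_normal((n, p))
beta = np.zeros(p); beta[:s0] = 1.0
eps = rng.standard_normal(n)
y = X @ beta + eps

lam = np.sqrt(np.log(p) / n)
b = Lasso(alpha=lam, fit_intercept=False).fit(X, y).coef_

resid = y - X @ b
# |e_k - eps_k| <= max_j |x_kj| * ||b - beta||_1, small when s0*lam^2 = o(1)
print(np.max(np.abs(resid - eps)), np.abs(b - beta).sum())
```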
Technical proofs
1.1 Proof of Theorem 1
Proof
Based on the definition of the desparsifying Lasso (4), and using
simple algebra yields
where \(\varvec{\varLambda }=-\sqrt{n}(\widehat{\varvec{\varTheta }}\widehat{\varvec{\varSigma }}-{\mathbf {I}}_p)(\widehat{\varvec{\beta }}-\varvec{\beta }^*).\) Thus,
By using the fact that for any matrix \(\mathbf{A}\in {\mathbb {R}}^{m\times n}\) and any vector \(x\in {\mathbb {R}}^{n\times 1}\),
$$\begin{aligned} \Vert \mathbf{A}x\Vert _\infty \le \Big (\max _{i,j}|A_{ij}|\Big )\Vert x\Vert _1, \end{aligned}$$
we get
By Lemma A.4, we have
(i) If the error sequence has finite q-th moment, then Lemma A.6 (i) with \(\lambda =A\max \{(n^{-1}\log p )^{1/2}{\vert \vert \vert \varvec{\varepsilon }. \vert \vert \vert }_{2,\alpha }, n^{-v}(p\log p)^{\frac{1}{q}} {\vert \vert \vert \varvec{\varepsilon }. \vert \vert \vert }_{q,\alpha }\}\) yields
$$\begin{aligned} {\mathbb {P}}\Big (\Vert \varvec{\varLambda }\Vert _\infty \ge C\sqrt{n}s_0\lambda \Big (\max \limits _j\frac{\lambda _j}{\widehat{\tau }_j^2}\Big ) \Big )\le C_1(\log p)^{-1}+C_2p^{1-K_3A^2}. \end{aligned}$$
(ii) If the error sequence has finite exponential moment, then Lemma A.6 (ii) with \(\lambda =An^{-1/2}(\log p)^{1/\alpha }\mathcal {D}_z\) yields
$$\begin{aligned} {\mathbb {P}}\Big (\Vert \varvec{\varLambda }\Vert _\infty \ge C\sqrt{n}s_0\lambda \Big (\max \limits _j\frac{\lambda _j}{\widehat{\tau }_j^2}\Big ) \Big )\le C_1p^{1-C_2A^\alpha }. \end{aligned}$$
The remaining argument applies whether the error sequence has a finite q-th moment or a finite exponential moment; we use the m-dependent approximation to prove that (7) holds.
For some constants \(2<d<\infty \) and \(2r<\eta <1/2\), where r is given in Assumption 4, let \(\zeta _n=\lfloor n^{\eta }/(\log n)^d\rfloor \), \(t_n=\lfloor n^{1/2-\eta }\rfloor \) and \(w_n=\lfloor n/\zeta _n-t_n\rfloor \). Define \(\varvec{\varepsilon }^*=(\varepsilon _1^*,\ldots ,\varepsilon _n^*)^\top \) with
Thus \(\{\varepsilon _i^*\}\) is a \(t_n\)-dependent sequence. For any \(k\in \{1,\ldots ,p\}\), according to Lemma A.2 (ii) and Lemma A.5, we have
since the process \(\{\varepsilon _i\}\) is short-range dependent. Therefore, showing
is equivalent to showing
Write \({\mathbf {v}}=n^{-1/2}\widehat{\varvec{\varTheta }}_k^\top \mathbf { X}^\top /{\sigma _k} =(v_1,\ldots ,v_n)\), where \(v_j=n^{-1/2}\widehat{\varvec{\varTheta }}_k^\top {\mathbf {x}}_j/{\sigma _k}\). Define
Therefore
For n large enough, \(\{\varvec{\varXi }_i\}_{i=1}^{\zeta _n}\) are independent and \(\{\varvec{\varXi }_i^\prime \}_{i=1}^{\zeta _n}\) are independent, since \(\{\varepsilon _i^*\}\) is \(t_n\)-dependent. In the following, we will show that
By Lemma A.5, we get
and
where \(\lambda _{\min }({\varvec{\varSigma }}_{n}^{\varvec{\varepsilon }})\) and \(\lambda _{\max }({\varvec{\varSigma }}_{n}^{\varvec{\varepsilon }})\) are the minimum and maximum eigenvalues of \({\varvec{\varSigma }}_{n}^{\varvec{\varepsilon }}\), respectively, since the eigenvalues of \({\varvec{\varSigma }}_{n}^{\varvec{\varepsilon }}=\mathrm{Cov}(\varvec{\varepsilon })\) are bounded away from zero and infinity (see Section 5.2 in Grenander and Szegö 1958). Thus we have
and
Therefore,
which, together with \({\mathbb {E}}(\sum _{i=1}^{\zeta _n}\varvec{\varXi }_i^\prime )=0\), implies \(\sum _{i=1}^{\zeta _n}\varvec{\varXi }_i^\prime =o_{{\mathbb {P}}}(1)\).
By Lemma A.2(ii), we have
Then
implying that \(\lim _{n\rightarrow \infty }\mathrm{Var}(\sum _{i=1}^{\zeta _n}\varvec{\varXi }_i)=1\), since \(\mathrm{Var}( \sum _{i=1}^{\zeta _n}\varvec{\varXi }_i^\prime )=o(1)\).
According to Lemma A.2 (ii), the Lyapunov condition follows:
for some constant \(c>0\). Hence, by the central limit theorem,
that is
which completes the proof. \(\square \)
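For reference, the estimator analyzed in this proof is the one-step correction \(\widehat{b}=\widehat{\varvec{\beta }}+\widehat{\varvec{\varTheta }}{\mathbf {X}}^\top (y-{\mathbf {X}}\widehat{\varvec{\beta }})/n\), with remainder \(\varvec{\varLambda }\) controlled via Lemmas A.4 and A.6. A schematic sketch of the debiasing step for a single coordinate (ours; all tuning choices are illustrative, and the variance step is deferred to the banded estimator below):

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(5)
n, p, s0, k = 200, 100, 3, 0
X = rng.standard_normal((n, p))
beta = np.zeros(p); beta[:s0] = 1.0
y = X @ beta + rng.standard_normal(n)

lam = np.sqrt(np.log(p) / n)
beta_hat = Lasso(alpha=lam, fit_intercept=False).fit(X, y).coef_

# Nodewise Lasso for coordinate k gives the k-th row of Theta_hat (cf. Lemma A.4).
g = Lasso(alpha=lam, fit_intercept=False).fit(np.delete(X, k, axis=1), X[:, k]).coef_
r = X[:, k] - np.delete(X, k, axis=1) @ g
tau2 = r @ r / n + lam * np.abs(g).sum()
Theta_k = np.insert(-g, k, 1.0) / tau2

# Desparsified estimate of beta_k: Lasso plus a one-step bias correction,
# which removes the Lasso shrinkage bias on active coordinates.
b_k = beta_hat[k] + Theta_k @ (X.T @ (y - X @ beta_hat)) / n
print(beta_hat[k], b_k, beta[k])
```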
1.2 Proof of Lemma 1
Proof
By using the fact that for any matrix \(\mathbf{A}\in {\mathbb {R}}^{m\times n}\), \(\Vert \mathbf{A}\Vert _2\le \sqrt{\Vert \mathbf{A}\Vert _1\Vert \mathbf{A}\Vert _\infty }\),
we have
$$\begin{aligned} \Vert \widehat{\varvec{\varSigma }}_{n, l}^{\varvec{\varepsilon }} - {\varvec{\varSigma }}_{n}^{\varvec{\varepsilon }}\Vert _2\le \Vert \widehat{\varvec{\varSigma }}_{n, l}^{\varvec{\varepsilon }} - {\varvec{\varSigma }}_{n}^{\varvec{\varepsilon }}\Vert _\infty , \end{aligned}$$since \(\widehat{\varvec{\varSigma }}_{n, l}^{\varvec{\varepsilon }} - {\varvec{\varSigma }}_{n}^{\varvec{\varepsilon }}\) is symmetric. Thus
Note that for \(i\ge 0\),
By Lemma A.3, there exists a constant \(c_q\) depending only on q such that
thus
Note that
according to Lemma A.1, we get
or
Combining this with Lemma A.6 yields \(D_2 = O_{{\mathbb {P}}}(s_0\lambda ^2).\) Moreover,
thus \(D_3 =O_{{\mathbb {P}}}(s_0^2\lambda ^2).\) Then we get
since \(D_2=o_{{\mathbb {P}}}(D_3)\) and \(D_4=O(l^2n^{-1})=o_{{\mathbb {P}}}(D_1)\). Therefore
Letting \(l\rightarrow \infty \) with \(l(n^{2/q-1}+s_0^2\lambda ^2)=o(1)\) for \(2<q\le 4\) yields
and the proof is complete. \(\square \)
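For completeness, the banded estimator \(\widehat{\varvec{\varSigma }}_{n,l}^{\varvec{\varepsilon }}\) treated in this lemma can be formed from residuals as a banded Toeplitz matrix of sample autocovariances. A minimal sketch (ours; the AR(1) pseudo-residuals and the choice \(l=10\) are purely illustrative, whereas the paper selects l in a data-driven way):

```python
import numpy as np
from scipy.linalg import toeplitz

def banded_autocov(e, l):
    """Banded Toeplitz estimate of Cov(eps): sample autocovariances up to lag l."""
    n = len(e)
    gamma = np.zeros(n)               # lags beyond l stay zero (the banding)
    for j in range(min(l + 1, n)):
        gamma[j] = e[: n - j] @ e[j:] / n
    return toeplitz(gamma)

# Toy usage with AR(1) pseudo-residuals:
rng = np.random.default_rng(6)
e = np.zeros(500)
for i in range(1, 500):
    e[i] = 0.5 * e[i - 1] + rng.standard_normal()
Sigma_hat = banded_autocov(e, l=10)
print(Sigma_hat[0, :4])   # roughly gamma(0), ..., gamma(3) of the AR(1)
```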
1.3 Proof of Theorem 2
Proof
Since \( \Vert \widehat{\varvec{\varSigma }}_{n, l}^{\varvec{\varepsilon }} - {\varvec{\varSigma }}_{n}^{\varvec{\varepsilon }}\Vert _2=o_{{\mathbb {P}}}(1)\) by Lemma 1, we get
Thus \(\widehat{\sigma }_k^2\) is a consistent estimator of \(\sigma _k^2\). Combining Theorem 1, Lemma 1 and Slutsky's theorem completes the proof. \(\square \)
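Putting the pieces together, Theorem 2 licenses the plug-in interval \(\widehat{b}_k\pm z_{1-\alpha /2}\,\widehat{\sigma }_k/\sqrt{n}\) with \(\widehat{\sigma }_k^2=\widehat{\varvec{\varTheta }}_k^\top {\mathbf {X}}^\top \widehat{\varvec{\varSigma }}_{n,l}^{\varvec{\varepsilon }}{\mathbf {X}}\widehat{\varvec{\varTheta }}_k/n\). An end-to-end schematic sketch combining the previous pieces (ours; \(\lambda \), \(\lambda _k\) and l are illustrative fixed values rather than the paper's data-driven choices):

```python
import numpy as np
from scipy.linalg import toeplitz
from scipy.stats import norm
from sklearn.linear_model import Lasso

rng = np.random.default_rng(7)
n, p, s0, k, l = 400, 200, 3, 0, 8
X = rng.standard_normal((n, p))
beta = np.zeros(p); beta[:s0] = 1.0
eps = np.zeros(n)
for i in range(1, n):                  # AR(1) stationary errors
    eps[i] = 0.5 * eps[i - 1] + rng.standard_normal()
y = X @ beta + eps

lam = 2 * np.sqrt(np.log(p) / n)
beta_hat = Lasso(alpha=lam, fit_intercept=False).fit(X, y).coef_

# Nodewise Lasso row Theta_k (Lemma A.4).
g = Lasso(alpha=lam, fit_intercept=False).fit(np.delete(X, k, axis=1), X[:, k]).coef_
r = X[:, k] - np.delete(X, k, axis=1) @ g
tau2 = r @ r / n + lam * np.abs(g).sum()
Theta_k = np.insert(-g, k, 1.0) / tau2

# Debiased estimate (Theorem 1) and banded long-run variance (Lemma 1 / Theorem 2).
b_k = beta_hat[k] + Theta_k @ (X.T @ (y - X @ beta_hat)) / n
e = y - X @ beta_hat
gamma = np.zeros(n)
for j in range(l + 1):
    gamma[j] = e[: n - j] @ e[j:] / n
sigma2_k = Theta_k @ X.T @ toeplitz(gamma) @ X @ Theta_k / n

half = norm.ppf(0.975) * np.sqrt(sigma2_k / n)
print(f"95% CI for beta_{k}: [{b_k - half:.3f}, {b_k + half:.3f}], true value {beta[k]}")
```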
About this article
Cite this article
Yuan, P., Guo, X. High-dimensional inference for linear model with correlated errors. Metrika 85, 21–52 (2022). https://doi.org/10.1007/s00184-021-00820-7