
A Bayesian-motivated test for high-dimensional linear regression models with fixed design matrix

  • Regular Article · Published in Statistical Papers

Abstract

This paper considers testing regression coefficients in a high-dimensional linear model with a fixed design matrix. The problem is highly irregular from the frequentist point of view: in fact, we prove that no test can guarantee nontrivial power even when the true model deviates greatly from the null hypothesis. Nevertheless, Bayesian methods can still produce tests with good average power behavior. We propose a new test statistic, defined as the limit of Bayes factors under the normal distribution. The null distribution of the proposed test statistic is approximated via Lindeberg’s replacement trick. Under certain conditions, we also derive the global asymptotic power function of the proposed test. Simulation studies demonstrate its finite-sample performance.

Figs. 1–4 (images omitted)

References

  • Arboretti R, Ceccato R, Corain L, Ronchi F, Salmaso L (2018) Multivariate small sample tests for two-way designs with applications to industrial statistics. Stat Pap 59(4):1483–1503
  • Bai Z, Pan G, Yin Y (2018) A central limit theorem for sums of functions of residuals in a high-dimensional regression model with an application to variance homoscedasticity test. TEST 27(4):896–920
  • Baltagi BH, Kao C, Na S (2013) Testing for cross-sectional dependence in a panel factor model using the wild bootstrap \(F\) test. Stat Pap 54(4):1067–1094
  • Bentkus V, Götze F (1996) Optimal rates of convergence in the CLT for quadratic forms. Ann Probab 24(1):466–490
  • Bühlmann P (2013) Statistical significance in high-dimensional linear models. Bernoulli 19(4):1212–1242
  • Casella G, Moreno E (2006) Objective Bayesian variable selection. J Am Stat Assoc 101(473):157–167
  • Chatterjee S (2006) A generalization of the Lindeberg principle. Ann Probab 34(6):2061–2076
  • Chatterjee S (2008) A new method of normal approximation. Ann Probab 36(4):1584–1610
  • Chen SX, Zhang LX, Zhong PS (2010) Tests for high-dimensional covariance matrices. J Am Stat Assoc 105(490):810–819
  • Cohn DL (2013) Measure theory, 2nd edn. Birkhäuser, New York
  • Cui H, Guo W, Zhong W (2018) Test for high-dimensional regression coefficients using refitted cross-validation variance estimation. Ann Stat 46(3):958–988
  • DasGupta A (2008) Asymptotic theory of statistics and probability, 1st edn. Springer, New York
  • Dezeure R, Bühlmann P, Zhang CH (2017) High-dimensional simultaneous inference with the bootstrap. TEST 26(4):685–719
  • Dicker LH, Erdogdu MA (2017) Flexible results for quadratic forms with applications to variance components estimation. Ann Stat 45(1):386–414
  • Draper NR, Pukelsheim F (1996) An overview of design of experiments. Stat Pap 37(1):1–32
  • Fan J, Liao Y, Mincheva M (2013) Large covariance estimation by thresholding principal orthogonal complements. J R Stat Soc B 75(4):603–680
  • Feng L, Zou C, Wang Z, Chen B (2013) Rank-based score tests for high-dimensional regression coefficients. Electron J Stat 7:2131–2149
  • Goddard SD, Johnson VE (2016) Restricted most powerful Bayesian tests for linear models. Scand J Stat 43(4):1162–1177
  • Goeman JJ, van de Geer SA, van Houwelingen HC (2006) Testing against a high dimensional alternative. J R Stat Soc B 68(3):477–493
  • Goeman JJ, van Houwelingen HC, Finos L (2011) Testing against a high-dimensional alternative in the generalized linear model: asymptotic type I error control. Biometrika 98(2):381–390
  • Götze F, Tikhomirov A (2002) Asymptotic distribution of quadratic forms and applications. J Theor Probab 15(2):423–475
  • Horn RA, Johnson CR (1991) Topics in matrix analysis, 1st edn. Cambridge University Press, New York
  • Ingster YI, Tsybakov AB, Verzelen N (2010) Detection boundary in sparse regression. Electron J Stat 4:1476–1526
  • Janson L, Barber RF, Candès E (2016) EigenPrism: inference for high dimensional signal-to-noise ratios. J R Stat Soc B 79(4):1037–1065
  • Girón FJ, Martínez ML, Moreno E, Torres F (2006) Objective testing procedures in linear models: calibration of the \(p\)-values. Scand J Stat 33(4):765–784
  • Jiang J (1996) REML estimation: asymptotic behavior and related topics. Ann Stat 24(1):255–286
  • de Jong P (1987) A central limit theorem for generalized quadratic forms. Probab Theory Relat Fields 75(2):261–277
  • Kass RE, Raftery AE (1995) Bayes factors. J Am Stat Assoc 90(430):773–795
  • Lan W, Wang H, Tsai CL (2014) Testing covariates in high-dimensional regression. Ann Inst Stat Math 66(2):279–301
  • Lan W, Ding Y, Fang Z, Fang K (2016a) Testing covariates in high dimension linear regression with latent factors. J Multivar Anal 144:25–37
  • Lan W, Zhong PS, Li R, Wang H, Tsai CL (2016b) Testing a single regression coefficient in high dimensional linear models. J Econom 195(1):154–168
  • Lehmann EL, Romano JP (2005) Testing statistical hypotheses, 3rd edn. Springer, New York
  • Lei L, Bickel PJ, El Karoui N (2018) Asymptotics for high dimensional regression M-estimates: fixed design results. Probab Theory Relat Fields 172(3–4):983–1079
  • Liang F, Paulo R, Molina G, Clyde MA, Berger JO (2008) Mixtures of \(g\) priors for Bayesian variable selection. J Am Stat Assoc 103(481):410–423
  • Pollard D (1984) Convergence of stochastic processes, 1st edn. Springer, New York
  • Sevast’yanov BA (1961) A class of limit distributions for quadratic forms of stochastic random variables. Theory Probab Appl 6(3):337–340
  • Wang S, Cui H (2015) A new test for part of high dimensional regression coefficients. J Multivar Anal 137:187–203
  • Xu K (2016) A new nonparametric test for high-dimensional regression coefficients. J Stat Comput Simul 87(5):855–869
  • Zhang X, Cheng G (2017) Simultaneous inference for high-dimensional linear models. J Am Stat Assoc 112(518):757–768
  • Zhang CH, Zhang SS (2014) Confidence intervals for low dimensional parameters in high dimensional linear models. J R Stat Soc B 76(1):217–242
  • Zhong PS, Chen SX (2011) Tests for high-dimensional regression coefficients with factorial designs. J Am Stat Assoc 106(493):260–274
  • Zhou Q, Guan Y (2018) On the null distribution of Bayes factors in linear regression. J Am Stat Assoc 113(523):1362–1371


Acknowledgements

The authors would like to thank the editor and two anonymous referees for their helpful comments and suggestions which helped to improve the paper significantly. This work was supported by the National Natural Science Foundation of China under Grant No. 11471035.

Author information


Corresponding author

Correspondence to Xingzhong Xu.


Appendix

Lemma 1

Under the assumptions of Theorem 1, if there exists a Borel set \(G\subset {\mathbb {R}}^n\) and a number \(M\ge 0\) such that

$$\begin{aligned} \int _{{\mathbb {R}}^n} \varphi (\mathbf {y}) \,\mathrm {d}{\mathcal {N}}_n(\mu ,\phi ^{-1} \mathbf {I}_n)(\mathbf {y}) \ge \alpha \quad \text {for all } \mu \in G \text { and } \phi >M, \end{aligned}$$

then \(\varphi (\mathbf {y}){\mathbf {1}}_{G}(\mathbf {y})\ge \alpha {\mathbf {1}}_{G}(\mathbf {y})\) a.e. \(\lambda \).

Proof

We prove the claim by contradiction. Suppose \(\lambda (\{\mathbf {y}:\varphi (\mathbf {y}) <\alpha \} \cap G )>0\). Then there exists a sufficiently small \(0< \eta <\alpha \) such that \(\lambda (\{\mathbf {y}:\varphi (\mathbf {y}) <\alpha -\eta \} \cap G)>0\). Denote \( E:=\{\mathbf {y}:\varphi (\mathbf {y}) <\alpha -\eta \} \cap G\). By the Lebesgue density theorem (Cohn 2013, Corollary 6.2.6), there exists a point \(z\in E\) such that for any \(\varepsilon >0\) there is a \(\delta _{\varepsilon }>0\) such that for any \(0< \delta ' <\delta _\varepsilon \),

$$\begin{aligned} \left| \frac{\lambda (E^\complement \cap C_{\delta '})}{\lambda (C_{\delta '})}\right| <\varepsilon , \end{aligned}$$

where \(C_{\delta '}=\prod _{i=1}^n [z_i-{\delta '}, z_i + {\delta '}]\). We put

$$\begin{aligned} \varepsilon =\left( \frac{\sqrt{\pi }}{\sqrt{2} \varPhi ^{-1}\left( 1-\frac{\eta }{6n}\right) }\right) ^n \frac{\eta }{3}. \end{aligned}$$

Then for any \(\phi >M\) and \(0<\delta ' <\delta _\varepsilon \),

$$\begin{aligned} \alpha&\le \int _{{\mathbb {R}}^n}\varphi (\mathbf {y}) \,\mathrm {d}{\mathcal {N}}_n (z, \phi ^{-1} \mathbf {I}_n) (\mathbf {y})\\&= \int _{E\cap C_{\delta '}}\varphi (\mathbf {y}) \,\mathrm {d}{\mathcal {N}}_n (z, \phi ^{-1} \mathbf {I}_n) (\mathbf {y}) + \int _{E^\complement \cap C_{\delta '}}\varphi (\mathbf {y}) \,\mathrm {d}{\mathcal {N}}_n (z, \phi ^{-1} \mathbf {I}_n) (\mathbf {y}) + \int _{C_{\delta '}^\complement }\varphi (\mathbf {y}) \,\mathrm {d}{\mathcal {N}}_n (z, \phi ^{-1} \mathbf {I}_n) (\mathbf {y})\\&\le \alpha -\eta + \int _{E^\complement \cap C_{\delta '}} \mathrm {d}{\mathcal {N}}_n (z, \phi ^{-1} \mathbf {I}_n) (\mathbf {y}) + \int _{C_{\delta '}^\complement } \mathrm {d}{\mathcal {N}}_n (z, \phi ^{-1} \mathbf {I}_n) (\mathbf {y})\\&\le \alpha -\eta + \left( \frac{\phi }{2\pi }\right) ^{n/2}\lambda (E^\complement \cap C_{\delta '}) + 2n\left( 1-\varPhi (\sqrt{\phi }\delta ')\right) \\&\le \alpha -\eta + \left( \frac{\phi }{2\pi }\right) ^{n/2} \varepsilon (2\delta ')^n + 2n\left( 1-\varPhi (\sqrt{\phi }\delta ')\right) \\&= \alpha -\eta + \left( \frac{\sqrt{\phi } \delta '}{\varPhi ^{-1}\left( 1-\frac{\eta }{6n}\right) }\right) ^{n} \frac{\eta }{3} + 2n\left( 1-\varPhi (\sqrt{\phi }\delta ')\right) . \end{aligned}$$

In the last inequality, take \(\delta '\) small enough that

$$\begin{aligned} \left( \frac{\varPhi ^{-1}\left( 1-\frac{\eta }{6n}\right) }{\delta '}\right) ^2>M, \end{aligned}$$

and put

$$\begin{aligned} \phi = \left( \frac{\varPhi ^{-1}\left( 1-\frac{\eta }{6n}\right) }{\delta '}\right) ^2. \end{aligned}$$

With this choice, \(\sqrt{\phi }\delta ' = \varPhi ^{-1}\left( 1-\frac{\eta }{6n}\right) \), so the last two terms each equal \(\eta /3\) and we obtain the contradiction \(\alpha \le \alpha -(1/3)\eta \). This completes the proof. \(\square \)
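As a sanity check on the choice of \(\phi \) above, the following sketch (with arbitrary illustrative values of \(n\), \(\eta \) and \(\delta '\); it is not part of the paper) evaluates the two remainder terms numerically and confirms that each reduces to \(\eta /3\), so together they contribute \(2\eta /3\).

```python
import math

def Phi(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def Phi_inv(p):
    """Inverse standard normal CDF by bisection (adequate for this illustration)."""
    lo, hi = -10.0, 10.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if Phi(mid) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

n, eta, delta = 5, 0.1, 0.01            # illustrative values only
q = Phi_inv(1.0 - eta / (6 * n))        # Phi^{-1}(1 - eta/(6n))
phi = (q / delta) ** 2                  # the choice of phi made in the proof

# covering term: (sqrt(phi)*delta / Phi^{-1}(1 - eta/(6n)))^n * eta/3
term1 = (math.sqrt(phi) * delta / q) ** n * (eta / 3.0)
# Gaussian tail term: 2n * (1 - Phi(sqrt(phi)*delta))
term2 = 2 * n * (1.0 - Phi(math.sqrt(phi) * delta))
```

With this \(\phi \), \(\sqrt{\phi }\delta '\) equals \(\varPhi ^{-1}(1-\eta /(6n))\) exactly, which is why both terms collapse to \(\eta /3\) regardless of the illustrative values chosen.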

Proof

(Proof of Theorem 1) We prove the claim by contradiction. Suppose there exists an \(M\ge 0\) such that

$$\begin{aligned} \int _{{\mathbb {R}}^n} \varphi (\mathbf {y}) \,\mathrm {d}{\mathcal {N}}_n \left( \mathbf {X}_a \varvec{\beta }_a + \mathbf {X}_b \varvec{\beta }_b,\phi ^{-1} \mathbf {I}_n\right) (\mathbf {y}) \ge \alpha \end{aligned}$$

for every \(\varvec{\beta }_a\in {\mathbb {R}}^q\), \(\varvec{\beta }_b \in {\mathbb {R}}^p\), \(\phi >0\) satisfying \(\phi \varvec{\beta }_b^\top \mathbf {X}_b^\top {\tilde{\mathbf {P}}}_a \mathbf {X}_b \varvec{\beta }_b >M\). Note that for any \(h>0\),

$$\begin{aligned} \left\{ (\varvec{\beta }_b^\top , \phi )^\top : \varvec{\beta }_b^\top \mathbf {X}_b^\top {\tilde{\mathbf {P}}}_a \mathbf {X}_b \varvec{\beta }_b> h \sqrt{M},\phi > h^{-1} \sqrt{M} \right\} \end{aligned}$$

is a subset of

$$\begin{aligned} \left\{ (\varvec{\beta }_b^\top , \phi )^\top : \phi \varvec{\beta }_b^\top \mathbf {X}_b^\top {\tilde{\mathbf {P}}}_a \mathbf {X}_b \varvec{\beta }_b > M\right\} . \end{aligned}$$

Then Lemma 1 implies that for any \(h>0\), \(\varphi (\mathbf {y}) {\mathbf {1}}_{G_h}(\mathbf {y})\ge \alpha {\mathbf {1}}_{G_h}(\mathbf {y})\) a.e. \(\lambda \), where

$$\begin{aligned} G_h=\left\{ \mathbf {X}_a \varvec{\beta }_a + \mathbf {X}_b \varvec{\beta }_b: \varvec{\beta }_a \in {\mathbb {R}}^{q}, \varvec{\beta }_b^\top \mathbf {X}_b^\top {\tilde{\mathbf {P}}}_a \mathbf {X}_b \varvec{\beta }_b > h \sqrt{M} \right\} . \end{aligned}$$

It can be seen that \(\lambda (\{ \cup _{n=1}^\infty G_{1/n}\}^\complement )= 0\). Hence \(\varphi (\mathbf {y}) \ge \alpha \) a.e. \(\lambda \). On the other hand, since \(\varphi (\mathbf {y})\) is a level \(\alpha \) test, for every \(\phi >0\),

$$\begin{aligned} \int _{{\mathbb {R}}^n} [\varphi (\mathbf {y})-\alpha ] \,\mathrm {d}{\mathcal {N}}_n (0,\phi ^{-1} \mathbf {I}_n)(\mathbf {y}) \le 0. \end{aligned}$$
(10)

Note that the integrand of (10) is nonnegative a.e. \(\lambda \). It follows that \(\varphi (\mathbf {y})=\alpha \) a.e. \(\lambda \), a contradiction. This completes the proof. \(\square \)

Proof

(Proof of Theorem 2) We note that \({{\,\mathrm{Var}\,}}(\varvec{\xi }^\top \mathbf {A}\varvec{\xi })= 2{{\,\mathrm{tr}\,}}(\mathbf {A}^2) + ({{\,\mathrm{E}\,}}(\xi _1^4)-3){{\,\mathrm{tr}\,}}(\mathbf {A}^{\circ 2} ) \). Let

$$\begin{aligned} {{\tilde{a}}}_{i,j}:= \frac{a_{i,j}}{ \sqrt{ 2{{\,\mathrm{tr}\,}}(\mathbf {A}^2) + ({{\,\mathrm{E}\,}}(\xi _1^4)-3){{\,\mathrm{tr}\,}}(\mathbf {A}^{\circ 2} ) } } . \end{aligned}$$

Then

$$\begin{aligned} S=\sum _{i=1}^n {{\tilde{a}}}_{i,i}(\xi _i^2-1) +2\sum _{1\le i< j \le n} {\tilde{a}}_{i,j} \xi _i \xi _j, \quad S_\tau ^* =\tau \sum _{i=1}^n {\tilde{a}}_{i,i}{\check{z}}_i +2\sum _{1\le i <j \le n} {\tilde{a}}_{i,j} z_i z_j. \end{aligned}$$

For \(l=1,\ldots , n\), define

$$\begin{aligned} S_l&= \sum _{i=1}^{l-1} {\tilde{a}}_{i,i}(\xi _i^2-1) + \tau \sum _{i=l+1}^{n} {\tilde{a}}_{i,i} {\check{z}}_i \\&\quad +\, 2\sum _{1\le i<j \le l-1} {\tilde{a}}_{i,j} \xi _i \xi _j +2\sum _{i=1}^{l-1} \sum _{j=l+1}^n {\tilde{a}}_{i,j} \xi _i z_j +2\sum _{l+1 \le i < j \le n} {\tilde{a}}_{i,j} z_i z_j , \\ h_l&= {\tilde{a}}_{l,l} (\xi _l^2 -1) +2\sum _{i=1}^{l-1} {\tilde{a}}_{i,l} \xi _i \xi _l +2\sum _{i =l +1}^n {\tilde{a}}_{i,l} z_i \xi _l , \\ g_l&= \tau {\tilde{a}}_{l,l} {\check{z}}_l +2\sum _{i =1}^{l-1} {\tilde{a}}_{i,l} \xi _i z_l +2\sum _{i = l+1}^n {\tilde{a}}_{i,l} z_i z_l . \end{aligned}$$

It can be seen that for \(l=2,\ldots , n\), \(S_{l-1}+ h_{l-1} =S_{l} + g_{l} \), and \(S=S_n + h_n\), \(S_1 + g_1=S_\tau ^*\).

Thus, for any \(f \in {\mathscr {C}}^4 ({\mathbb {R}})\),

$$\begin{aligned} \begin{aligned}&\left| {{\,\mathrm{E}\,}}f\left( S\right) - {{\,\mathrm{E}\,}}f\left( S_\tau ^*\right) \right| \\&\quad = \left| {{\,\mathrm{E}\,}}f(S_n+h_n)-{{\,\mathrm{E}\,}}f(S_1+g_1)\right| \\&\quad = \left| \sum _{l=2}^{n} \left( {{\,\mathrm{E}\,}}f(S_{l}+h_{l})-{{\,\mathrm{E}\,}}f(S_{l-1}+h_{l-1})\right) +{{\,\mathrm{E}\,}}f(S_{1}+h_{1})-{{\,\mathrm{E}\,}}f(S_{1}+g_{1})\right| \\&\quad = \left| \sum _{l=1}^{n} {{\,\mathrm{E}\,}}f(S_{l}+h_{l})-{{\,\mathrm{E}\,}}f(S_{l}+g_{l})\right| . \end{aligned} \end{aligned}$$

Applying Taylor’s theorem, for \(l=1,\ldots ,n\),

$$\begin{aligned} \begin{aligned} f(S_{l}+h_{l})=&f(S_{l}) + \sum _{k=1}^3 \frac{1}{k!} h_l^k f^{(k)} (S_{l}) + \frac{1}{24}h_{l}^4 f^{(4)} (S_{l}+\theta _1 h_{l}), \\ f(S_{l}+g_{l})=&f(S_{l}) + \sum _{k=1}^3 \frac{1}{k!} g_l^k f^{(k)} (S_{l}) + \frac{1}{24}g_{l}^4 f^{(4)} (S_{l}+\theta _{2} g_{l}), \end{aligned} \end{aligned}$$

where \(\theta _1,\theta _2\in [0,1]\). Thus,

$$\begin{aligned} \begin{aligned}&\left| {{\,\mathrm{E}\,}}f(S_{l}+h_{l})-{{\,\mathrm{E}\,}}f(S_{l}+g_{l})\right| \\&\quad \le \left| \sum _{k=1}^3 \frac{1}{k!} {{\,\mathrm{E}\,}}f^{(k)} (S_{l}) {{\,\mathrm{E}\,}}_l (h_l^k - g_l^k) \right| + \frac{1}{24} \Vert f^{(4)} \Vert _{\infty } \left( {{\,\mathrm{E}\,}}(h_{l}^4)+{{\,\mathrm{E}\,}}(g_{l}^4)\right) , \end{aligned} \end{aligned}$$

where \({{\,\mathrm{E}\,}}_l\) denotes taking expectation with respect to \(\xi _l, z_l ,{\check{z}}_l\). It is straightforward to show that

$$\begin{aligned} \begin{aligned} {{\,\mathrm{E}\,}}_l (h_l-g_l)&=0, \\ {{\,\mathrm{E}\,}}_l (h_l^2-g_l^2)&= \left( {{\,\mathrm{E}\,}}(\xi _1^4) - 1 - \tau ^2 \right) {\tilde{a}}_{l,l}^2 , \\ {{\,\mathrm{E}\,}}_l (h_l^3-g_l^3)&= {{\,\mathrm{E}\,}}(\xi _1^2-1)^3 {\tilde{a}}_{l,l}^3 + 12 ({{\,\mathrm{E}\,}}(\xi _1^4) - 1) {\tilde{a}}_{l,l} \left( \sum _{i=1}^{l-1} {\tilde{a}}_{i,l} \xi _i + \sum _{i=l+1}^n {\tilde{a}}_{i,l} z_i \right) ^2 . \end{aligned} \end{aligned}$$

Thus,

$$\begin{aligned} \begin{aligned}&\left| {{\,\mathrm{E}\,}}f(S_{l}+h_{l})-{{\,\mathrm{E}\,}}f(S_{l}+g_{l})\right| \le \frac{1}{2} \Vert f^{(2)}\Vert _\infty \left| {{\,\mathrm{E}\,}}(\xi _1^4)-1 -\tau ^2 \right| {\tilde{a}}_{l,l}^2 \\&\qquad + \frac{1}{6} \Vert f^{(3)}\Vert _{\infty } \Bigg ( \left| {{\,\mathrm{E}\,}}(\xi _1^2-1)^3\right| |{\tilde{a}}_{l,l}^3 | \\&\qquad + 12 ({{\,\mathrm{E}\,}}(\xi _1^4)-1) |{\tilde{a}}_{l,l}| {{\,\mathrm{E}\,}}\bigg ( \sum _{i=1}^{l-1} {\tilde{a}}_{i,l} \xi _i + \sum _{i=l+1}^n {\tilde{a}}_{i,l} z_i \bigg )^2 \Bigg ) \\&\qquad + \frac{1}{24} \Vert f^{(4)} \Vert _{\infty } \left( {{\,\mathrm{E}\,}}(h_{l}^4)+{{\,\mathrm{E}\,}}(g_{l}^4)\right) \\&\qquad \le \frac{ \left| {{\,\mathrm{E}\,}}(\xi _1^4)-1 - \tau ^2 \right| }{2} \Vert f^{(2)}\Vert _\infty {\tilde{a}}_{l,l}^2 \\&\qquad + \frac{ \max \left( \left| {{\,\mathrm{E}\,}}(\xi _1^2-1)^3\right| , 12 ({{\,\mathrm{E}\,}}(\xi _1^4)-1) \right) }{6} \Vert f^{(3)}\Vert _\infty |{\tilde{a}}_{l,l}| \sum _{i=1}^{n} {\tilde{a}}_{i,l}^2 \\&\qquad + \frac{1}{24} \Vert f^{(4)} \Vert _{\infty } \left( {{\,\mathrm{E}\,}}(h_{l}^4)+{{\,\mathrm{E}\,}}(g_{l}^4)\right) . \end{aligned} \end{aligned}$$
(11)

Now we bound \({{\,\mathrm{E}\,}}(h_l^4)\) and \({{\,\mathrm{E}\,}}(g_l^4)\). By direct calculation,

$$\begin{aligned} {{\,\mathrm{E}\,}}(h_l^4)= & {} {{\,\mathrm{E}\,}}(\xi _1^2 - 1)^4 {\tilde{a}}_{l,l}^4 + 24 {{\,\mathrm{E}\,}}[ \xi _1^2(\xi _1^2 -1)^2] {\tilde{a}}_{l,l}^2 {{\,\mathrm{E}\,}}\left( \sum _{i=1}^{l-1} {\tilde{a}}_{i,l} \xi _i +\sum _{i =l +1}^n {\tilde{a}}_{i,l} z_i \right) ^2 \\&+\, 16 {{\,\mathrm{E}\,}}(\xi _1^4 ) {{\,\mathrm{E}\,}}\left( \sum _{i=1}^{l-1} {\tilde{a}}_{i,l} \xi _i +\sum _{i =l +1}^n {\tilde{a}}_{i,l} z_i \right) ^4 \\= & {} {{\,\mathrm{E}\,}}(\xi _1^2 - 1)^4 {\tilde{a}}_{l,l}^4 + 24 {{\,\mathrm{E}\,}}[ \xi _1^2(\xi _1^2 -1)^2] {\tilde{a}}_{l,l}^2 \left( (\sum _{i=1}^n {\tilde{a}}_{i,l}^2) - {\tilde{a}}_{l,l}^2\right) \\&+ \,16 {{\,\mathrm{E}\,}}(\xi _1^4 ) \left( \left( {{\,\mathrm{E}\,}}(\xi _1^4) - 3\right) \sum _{i=1}^{l-1} {\tilde{a}}_{i,l}^4 + 3 \left( (\sum _{i=1}^n {\tilde{a}}_{i,l}^2) - {\tilde{a}}_{l,l}^2 \right) ^2 \right) . \end{aligned}$$

To upper bound the above quantity, we use the facts \( 24 {{\,\mathrm{E}\,}}[ \xi _1^2(\xi _1^2 -1)^2] \le 2(16{{\,\mathrm{E}\,}}(\xi _1^2 -1)^4 + (9/4) {{\,\mathrm{E}\,}}(\xi _1^4) ) \), \({{\,\mathrm{E}\,}}(\xi _1^2 - 1)^4\le {{\,\mathrm{E}\,}}(\xi _1^8)\) and

$$\begin{aligned} \left( {{\,\mathrm{E}\,}}(\xi _1^4) - 3\right) \sum _{i=1}^{l-1} {\tilde{a}}_{i,l}^4 \le \left( {{\,\mathrm{E}\,}}(\xi _1^4) - 1\right) \sum _{i=1}^{l-1} {\tilde{a}}_{i,l}^4 \le \left( {{\,\mathrm{E}\,}}(\xi _1^4) - 1\right) \left( \left( \sum _{i=1}^n {\tilde{a}}_{i,l}^2\right) - {\tilde{a}}_{l,l}^2 \right) ^2 . \end{aligned}$$

Then we obtain the bound

$$\begin{aligned} {{\,\mathrm{E}\,}}(h_l^4) \le \left( 16 {{\,\mathrm{E}\,}}(\xi _1^8) + 32 {{\,\mathrm{E}\,}}(\xi _1^4)\right) \left( \sum _{i=1}^n {\tilde{a}}_{i,l}^2\right) ^2. \end{aligned}$$
(12)

Similarly, we have

$$\begin{aligned} {{\,\mathrm{E}\,}}(g_l^4) \le \left( 48 {{\,\mathrm{E}\,}}(\xi _1^4) + 3\tau ^4 + 96 \right) \left( \sum _{i=1}^n {\tilde{a}}_{i,l}^2\right) ^2. \end{aligned}$$
(13)

Combining (11), (12) and (13) yields

$$\begin{aligned} \begin{aligned}&\sum _{l=1}^n \left| {{\,\mathrm{E}\,}}f(S_{l}+h_{l})-{{\,\mathrm{E}\,}}f(S_{l}+g_{l})\right| \le \frac{ \left| {{\,\mathrm{E}\,}}(\xi _1^4)-1 - \tau ^2 \right| }{2} \Vert f^{(2)}\Vert _\infty \sum _{l=1}^n {\tilde{a}}_{l,l}^2\\&\qquad + \frac{ \max \left( \left| {{\,\mathrm{E}\,}}(\xi _1^2-1)^3\right| , 12 ({{\,\mathrm{E}\,}}(\xi _1^4)-1) \right) }{6} \Vert f^{(3)}\Vert _\infty \sum _{l=1}^n \left( |{\tilde{a}}_{l,l}| \sum _{i=1}^{n} {\tilde{a}}_{i,l}^2 \right) \\&\qquad + \frac{ 16 {{\,\mathrm{E}\,}}(\xi _1^8) + 80 {{\,\mathrm{E}\,}}(\xi _1^4) + 3\tau ^4 + 96 }{24} \Vert f^{(4)} \Vert _{\infty } \sum _{l=1}^n \left( \sum _{i=1}^n {\tilde{a}}_{i,l}^2 \right) ^2 . \end{aligned} \end{aligned}$$

This completes the proof. \(\square \)
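The replacement scheme in the proof can be illustrated by simulation. The numpy sketch below (an illustration only, not the paper's code; the matrix \(\mathbf {A}\) and the error law are arbitrary choices) compares the normalized quadratic form \(S\) under centered exponential errors with its Gaussian surrogate \(S_\tau ^*\), taking \(\tau ^2 = {{\,\mathrm{E}\,}}(\xi _1^4)-1\).

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
B = rng.standard_normal((n, n)) / np.sqrt(n)
A = 0.5 * (B + B.T)                      # symmetric, no dominant entries

kurt = 9.0                               # E(xi^4) for standardized Exp(1) errors
tau = np.sqrt(kurt - 1.0)
diag = np.diag(A)
off = A - np.diag(diag)                  # off-diagonal part of A
# Var(xi' A xi) = 2 tr(A^2) + (E(xi^4) - 3) * sum_i a_ii^2
denom = np.sqrt(2.0 * np.trace(A @ A) + (kurt - 3.0) * np.sum(diag**2))

reps = 1000
S = np.empty(reps)
S_star = np.empty(reps)
for r in range(reps):
    xi = rng.exponential(1.0, n) - 1.0   # mean 0, variance 1, E(xi^4) = 9
    S[r] = (xi @ A @ xi - np.trace(A)) / denom
    z = rng.standard_normal(n)
    z_check = rng.standard_normal(n)
    # tau * sum_i a_ii z_check_i + 2 * sum_{i<j} a_ij z_i z_j
    S_star[r] = (tau * diag @ z_check + z @ off @ z) / denom
```

Both statistics have mean 0 and variance 1 by construction, and with this choice of \(\mathbf {A}\) their empirical distributions should agree closely, consistent with the bound of Theorem 2.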

Proof

(Proof of Theorem 3) Throughout the proof, we use notation similar to that of Theorem 2 and define

$$\begin{aligned} S=\frac{ (\sqrt{\phi }\varvec{\epsilon })^\top \mathbf {A}\sqrt{\phi }\varvec{\epsilon }- {{\,\mathrm{tr}\,}}(\mathbf {A}) }{ \sqrt{ 2 {{\,\mathrm{tr}\,}}(\mathbf {A}^2) + (\phi ^2{{\,\mathrm{E}\,}}(\epsilon _1^4)-3) {{\,\mathrm{tr}\,}}(\mathbf {A}^{\circ 2}) } } \end{aligned}$$

and

$$\begin{aligned} S_{ {\hat{\tau }}}^* = \frac{ {\hat{\tau }} \sum _{i=1}^n a_{i,i}{\check{z}}_i +2\sum _{1\le i <j \le n} a_{i,j} z_i z_j }{ \sqrt{ 2 {{\,\mathrm{tr}\,}}(\mathbf {A}^2) + (\phi ^2{{\,\mathrm{E}\,}}(\epsilon _1^4)-3) {{\,\mathrm{tr}\,}}(\mathbf {A}^{\circ 2}) } }, \end{aligned}$$

where \(z_1,\ldots , z_n, {\check{z}}_1,\ldots , {\check{z}}_n\) are iid \({\mathcal {N}}(0,1)\) random variables and are independent of \({\hat{\tau }}^2\).

By a standard subsequence argument, we only need to prove the theorem along a subsequence of \(\{n\}\). Hence, without loss of generality, we assume \({\hat{\tau }}^2 \xrightarrow {a.s.} \phi ^2 {{\,\mathrm{E}\,}}(\epsilon _1^4)-1\). Write

$$\begin{aligned} S_{{\hat{\tau }}}^*= & {} \frac{ \sqrt{\phi ^2{{\,\mathrm{E}\,}}(\epsilon _1^4)-1} \sum _{i=1}^n a_{i,i}{\check{z}}_i +2\sum _{1\le i <j \le n} a_{i,j} z_i z_j }{ \sqrt{ 2 {{\,\mathrm{tr}\,}}(\mathbf {A}^2) + (\phi ^2{{\,\mathrm{E}\,}}(\epsilon _1^4)-3) {{\,\mathrm{tr}\,}}(\mathbf {A}^{\circ 2}) } } \\&+ \frac{ ({\hat{\tau }} - \sqrt{ \phi ^2{{\,\mathrm{E}\,}}(\epsilon _1^4)-1 } ) \sum _{i=1}^n a_{i,i}{\check{z}}_i }{ \sqrt{ 2 {{\,\mathrm{tr}\,}}(\mathbf {A}^2) + (\phi ^2{{\,\mathrm{E}\,}}(\epsilon _1^4)-3) {{\,\mathrm{tr}\,}}(\mathbf {A}^{\circ 2}) } } \\=: & {} S_{{\hat{\tau }},1}^{*} + S_{{\hat{\tau }},2}^{*} . \end{aligned}$$

Note that \(S_{\hat{\tau },1}^*\) is independent of \({\hat{\tau }}\). Since \({{\,\mathrm{E}\,}}( S_{{\hat{\tau }},1}^{*2} )=1\), the distributions \({\mathcal {L}}(S_{{\hat{\tau }},1}^{*}) \) are tight as \(n\rightarrow \infty \). Hence, without loss of generality, we assume \({\mathcal {L}} (S_{{\hat{\tau }},1}^*)\) converges weakly to a limit distribution with distribution function \(F^\dagger (x)\). Let \(S^\dagger \) be a random variable with distribution function \(F^\dagger (x)\). By some algebra (see, e.g., Chen et al. (2010), Proposition A.1.(iii)), it can be shown that \({{\,\mathrm{E}\,}}(S^{*4}_{{\hat{\tau }},1})\) is uniformly bounded. Then \({\mathcal {L}} ( S_{{\hat{\tau }},1}^{*2} )\) is uniformly integrable. Hence \({{\,\mathrm{E}\,}}(S^{\dagger 2})=1\) and \(F^\dagger (x)\) cannot concentrate on a single point. Consequently, \(F^\dagger (x)\) is continuous and strictly increasing for \(x\in \{x:0<F^\dagger (x)<1\}\); see Sevast’yanov (1961) as well as the remark made by A. N. Kolmogorov on that paper.

The condition (8) implies that \({{\,\mathrm{E}\,}}[S_{{\hat{\tau }},2}^{*2}|{\hat{\tau }}]\rightarrow 0\) almost surely. Then almost surely, \({\mathcal {L}} (S^*_{{\hat{\tau }}}|{\hat{\tau }}) \rightsquigarrow {\mathcal {L}}(S^\dagger )\). Consequently, for every \(f\in {\mathscr {C}}^4 ({\mathbb {R}})\), we have \(| {{\,\mathrm{E}\,}}[f(S^*_{{\hat{\tau }}}) |{\hat{\tau }}] - {{\,\mathrm{E}\,}}f(S^\dagger ) |\rightarrow 0\) almost surely. On the other hand, Theorem 2 and the condition (8) imply \(|{{\,\mathrm{E}\,}}f(S)- {{\,\mathrm{E}\,}}[f(S^*_{{\hat{\tau }}})|{\hat{\tau }}] |\rightarrow 0\) almost surely. Thus, \(|{{\,\mathrm{E}\,}}f(S)- {{\,\mathrm{E}\,}}f(S^\dagger ) |\rightarrow 0\). That is, \({\mathcal {L}} (S)\rightsquigarrow {\mathcal {L}} (S^\dagger )\).

Note that

$$\begin{aligned} x^{(1)}= \frac{F^{-1} (1-\alpha ;\mathbf {A},{\hat{\tau }})-{{\,\mathrm{tr}\,}}\left( (\mathbf {X}_b^* \mathbf {X}_b^{*\top })^{-1}\right) }{n-q}. \end{aligned}$$

We need to deal with \(F^{-1} (1-\alpha ;\mathbf {A},{\hat{\tau }})\). Since \({\mathcal {L}} (S^*_{{\hat{\tau }}}|{\hat{\tau }}) \rightsquigarrow {\mathcal {L}}(S^\dagger )\) almost surely, the fact

$$\begin{aligned} \Pr \left( S_{{\hat{\tau }}}^* > \frac{F^{-1} (1-\alpha ;\mathbf {A},{\hat{\tau }})}{ \sqrt{ 2 {{\,\mathrm{tr}\,}}(\mathbf {A}^2) + ( \phi ^2 {{\,\mathrm{E}\,}}(\epsilon _1^4)-3) {{\,\mathrm{tr}\,}}(\mathbf {A}^{\circ 2}) } } \Bigg | \hat{\tau } \right) =\alpha \end{aligned}$$

implies that almost surely,

$$\begin{aligned} \frac{F^{-1} (1-\alpha ; \mathbf {A},{\hat{\tau }})}{ \sqrt{ 2 {{\,\mathrm{tr}\,}}(\mathbf {A}^2) + ( \phi ^2 {{\,\mathrm{E}\,}}(\epsilon _1^4)-3) {{\,\mathrm{tr}\,}}(\mathbf {A}^{\circ 2}) } } \rightarrow F^{\dagger -1}(1-\alpha ). \end{aligned}$$
(14)

We also need the fact that

$$\begin{aligned} (\sqrt{\phi }\varvec{\epsilon })^\top {\tilde{\mathbf {U}}}_a {\tilde{\mathbf {U}}}_a^\top (\sqrt{\phi }\varvec{\epsilon }) = (1+o_p(1))(n-q), \end{aligned}$$
(15)

which is a consequence of

$$\begin{aligned} {{\,\mathrm{E}\,}}\left( (\sqrt{\phi }\varvec{\epsilon })^\top {\tilde{\mathbf {U}}}_a {\tilde{\mathbf {U}}}_a^\top (\sqrt{\phi }\varvec{\epsilon }) \right) =n-q ,\quad {{\,\mathrm{Var}\,}}\left( (\sqrt{\phi }\varvec{\epsilon })^\top {\tilde{\mathbf {U}}}_a {\tilde{\mathbf {U}}}_a^\top (\sqrt{\phi }\varvec{\epsilon }) \right) =O(n-q) . \end{aligned}$$

The fact \(S\rightsquigarrow S^\dagger \), Eqs. (14), (15) and Slutsky’s theorem lead to

$$\begin{aligned}&\Pr \left( T> x^{(1)} \right) \\&\quad = \Pr \left( T> \frac{F^{-1} (1-\alpha ;\mathbf {A},{\hat{\tau }})-{{\,\mathrm{tr}\,}}\left( (\mathbf {X}_b^* \mathbf {X}_b^{*\top })^{-1} \right) }{n-q} \right) \\&\quad =\Pr \left( (\sqrt{\phi }\varvec{\epsilon })^\top \mathbf {A}(\sqrt{\phi }\varvec{\epsilon })> \frac{ (\sqrt{\phi }\varvec{\epsilon })^\top {\tilde{\mathbf {U}}}_a {\tilde{\mathbf {U}}}_a^\top (\sqrt{\phi }\varvec{\epsilon }) }{n-q} F^{-1} (1-\alpha ;\mathbf {A},{{\hat{\tau }}}) \right) \\&\quad = \Pr \left( S> \frac{ (\sqrt{\phi }\varvec{\epsilon })^\top {\tilde{\mathbf {U}}}_a {\tilde{\mathbf {U}}}_a^\top (\sqrt{\phi }\varvec{\epsilon }) }{n-q} \frac{F^{-1} (1-\alpha ; \mathbf {A}, {{\hat{\tau }}})}{ \sqrt{ 2 {{\,\mathrm{tr}\,}}(\mathbf {A}^2) + ( \phi ^2 {{\,\mathrm{E}\,}}(\epsilon _1^4)-3) {{\,\mathrm{tr}\,}}(\mathbf {A}^{\circ 2}) } } \right) \\&\quad = \Pr \left( S>(1+o_P(1)) F^{\dagger -1}(1-\alpha ) \right) \\&\quad \rightarrow \alpha . \end{aligned}$$

This proves the theorem. \(\square \)

Proof

(Proof of Proposition 1) From Bai et al. (2018), Theorem 2.1, one can obtain the explicit forms of \({{\,\mathrm{Var}\,}}\left( {\tilde{\varvec{\epsilon }}}^\top {\tilde{\mathbf {P}}}_a {\tilde{\varvec{\epsilon }}} \right) \) and \({{\,\mathrm{Var}\,}}\left( \sum _{i=1}^n {\tilde{\epsilon }}_i^4 \right) \), which involve the traces of certain matrices. By Horn and Johnson (1991), Theorem 5.5.1, the eigenvalues of these matrices are all bounded. Hence \({{\,\mathrm{Var}\,}}( {\tilde{\varvec{\epsilon }}}^\top {\tilde{\mathbf {P}}}_a {\tilde{\varvec{\epsilon }}} )=O(n)\) and \({{\,\mathrm{Var}\,}}\left( \sum _{i=1}^n {\tilde{\epsilon }}_i^4 \right) =O(n)\). Thus,

$$\begin{aligned}&{\tilde{\varvec{\epsilon }}}^\top {\tilde{\mathbf {P}}}_a {\tilde{\varvec{\epsilon }}} = (n-q) \phi ^{-1}+O_P(\sqrt{n}), \\&\sum _{i=1}^n {\tilde{\epsilon }}_i^4 = 3\phi ^{-2} {{\,\mathrm{tr}\,}}({\tilde{\mathbf {P}}}_a^{\circ 2}) +\left( {{\,\mathrm{E}\,}}(\epsilon _1^4)-3 \phi ^{-2}\right) {{\,\mathrm{tr}\,}}\left( {\tilde{\mathbf {P}}}_a^{\circ 2} \right) ^2 + O_P(\sqrt{n}). \end{aligned}$$

It follows that

$$\begin{aligned} {\hat{\tau }} ^2 = \phi ^{2}{{\,\mathrm{E}\,}}(\epsilon _1^4)-1 +O_P\left( \frac{\sqrt{n}}{{{\,\mathrm{tr}\,}}\left( {\tilde{\mathbf {P}}}_a ^{\circ 2} \right) ^2} \right) . \end{aligned}$$

Let \(\delta _{i,j}=1\) if \(i=j\) and 0 if \(i\ne j\). We have

$$\begin{aligned} n&= \sum _{i=1}^n \sum _{j=1}^n \delta _{i,j}^4 = \sum _{i=1}^n \sum _{j=1}^n \left( ({\tilde{\mathbf {P}}}_{a})_{i,j}+(\mathbf {P}_{a})_{i,j} \right) ^4 \\&\le 8\sum _{i=1}^n \sum _{j=1}^n ({\tilde{\mathbf {P}}}_{a})_{i,j}^4+ 8\sum _{i=1}^n \sum _{j=1}^n(\mathbf {P}_{a})_{i,j}^4 \\&\le 8\sum _{i=1}^n \sum _{j=1}^n ({\tilde{\mathbf {P}}}_{a})_{i,j}^4+ 8\sum _{i=1}^n \sum _{j=1}^n(\mathbf {P}_{a})_{i,j}^2 \\&= 8{{\,\mathrm{tr}\,}}\left( {\tilde{\mathbf {P}}}_a^{\circ 2} \right) ^2 +8q. \end{aligned}$$

Then

$$\begin{aligned} \frac{\sqrt{n}}{{{\,\mathrm{tr}\,}}\left( {\tilde{\mathbf {P}}}_a^{\circ 2} \right) ^2} = O\left( \frac{1}{\sqrt{n}} \right) . \end{aligned}$$

This completes the proof. \(\square \)
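The counting inequality \(n\le 8{{\,\mathrm{tr}\,}}( {\tilde{\mathbf {P}}}_a^{\circ 2} )^2 +8q\) used above is easy to verify numerically. A minimal numpy check (with a hypothetical random design \(\mathbf {X}_a\); not taken from the paper) is:

```python
import numpy as np

rng = np.random.default_rng(1)
n, q = 50, 3
X_a = rng.standard_normal((n, q))
# projection onto the column space of X_a, and its orthogonal complement
P_a = X_a @ np.linalg.inv(X_a.T @ X_a) @ X_a.T
P_tilde = np.eye(n) - P_a                         # \tilde P_a

# tr((P_tilde^{o2})^2) equals sum_{i,j} (P_tilde)_{ij}^4
lhs = n
rhs = 8.0 * np.sum(P_tilde**4) + 8 * q
```

Here `np.sum(P_tilde**4)` computes \(\sum _{i,j} ({\tilde{\mathbf {P}}}_{a})_{i,j}^4 = {{\,\mathrm{tr}\,}}( {\tilde{\mathbf {P}}}_a^{\circ 2} )^2\), and the inequality holds for any such design since \(\mathbf {P}_a\) is a rank-\(q\) projection.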

Proof

(Proof of Proposition 2) Without loss of generality, we assume \(\mathbf {A}\) is a diagonal matrix and \(|b_1|\ge \cdots \ge |b_n|\). By a standard subsequence argument, we only need to prove the result along a subsequence. Hence we can assume \(\lim _{n\rightarrow \infty }\Vert \mathbf {b}\Vert ^2/{{\,\mathrm{tr}\,}}(\mathbf {A}^2) =c \in [0,+\infty ]\). If \( c=0\), the Lyapunov central limit theorem implies that

$$\begin{aligned} \frac{Z^\top \mathbf {A}Z + b^\top Z - {{\,\mathrm{tr}\,}}(\mathbf {A})}{\sqrt{2{{\,\mathrm{tr}\,}}(\mathbf {A}^2)+\Vert \mathbf {b}\Vert ^2}} =(1+o_P(1)) \frac{Z^\top \mathbf {A}Z - {{\,\mathrm{tr}\,}}(\mathbf {A})}{\sqrt{2{{\,\mathrm{tr}\,}}(\mathbf {A}^2)}} +o_P(1) \rightsquigarrow {\mathcal {N}} (0,1). \end{aligned}$$

If \(c=+\infty \),

$$\begin{aligned} \frac{Z^\top \mathbf {A}Z + b^\top Z - {{\,\mathrm{tr}\,}}(\mathbf {A})}{\sqrt{2{{\,\mathrm{tr}\,}}(\mathbf {A}^2)+\Vert \mathbf {b}\Vert ^2}} =(1+o_P(1)) \frac{b^\top Z}{\Vert \mathbf {b}\Vert } +o_P(1) \rightsquigarrow {\mathcal {N}} (0,1). \end{aligned}$$

In what follows, we assume \(c\in (0,+\infty )\). By the Helly selection theorem, we can assume \(\lim _{n\rightarrow \infty } |b_i|/\Vert \mathbf {b}\Vert = b_i^*\in [0,1]\) for \(i=1,2,\ldots \). From Fatou’s lemma, we have \(\sum _{i=1}^{\infty } (b_i^{*})^2\le 1\). Consequently, \(\lim _{i\rightarrow \infty } b_i^* =0\).

Note that we have assumed that \(\lambda _1(\mathbf {A}^2)/{{\,\mathrm{tr}\,}}(\mathbf {A}^2)\rightarrow 0\). Then for every fixed integer \(r>0\),

$$\begin{aligned} \frac{ \sum _{i=1}^r a_{i,i}^2 }{ \sum _{i=1}^n a_{i,i}^2 } \le \frac{ r \max _{1\le i\le n} a_{i,i}^2 }{ \sum _{i=1}^n a_{i,i}^2 } \rightarrow 0. \end{aligned}$$

Then there exists a sequence of positive integers \(r(n)\rightarrow \infty \) such that \({(\sum _{i=1}^{r(n)} a_{i,i}^2) }/{( \sum _{i=1}^n a_{i,i}^2) } \rightarrow 0\) and \(r(n)/n\rightarrow 0\). Write

$$\begin{aligned} Z^\top \mathbf {A}Z + b^\top Z - {{\,\mathrm{tr}\,}}(\mathbf {A}) = \sum _{i=1}^{r(n)} a_{i,i}(z_i^2-1) + \sum _{i=1}^{r(n)} b_i z_i + \sum _{i=r(n)+1}^n \left( a_{i,i}(z_i^2-1) + b_i z_i \right) , \end{aligned}$$

which is a sum of independent random variables. The first term is negligible since \({{\,\mathrm{Var}\,}}( \sum _{i=1}^{r(n)} a_{i,i}(z_i^2-1) )=o(\sum _{i=1}^n a_{i,i}^2)\). We now deal with the third term. By the Berry–Esseen inequality (see, e.g., DasGupta 2008, Theorem 11.2), there exists an absolute constant \(C^*>0\) such that

$$\begin{aligned} \begin{aligned}&\sup _{x\in {\mathbb {R}}}\left| \Pr \left( \frac{ \sum _{i=r(n)+1}^n \left( a_{i,i}(z_i^2-1) + b_i z_i \right) }{ \sqrt{2\sum _{i=r(n)+1}^n a_{i,i}^2 + \sum _{i=r(n)+1}^n b_{i}^2} } \le x \right) -\varPhi (x) \right| \\&\quad \le C^* \frac{ \sum _{i=r(n)+1}^n {{\,\mathrm{E}\,}}\left| a_{i,i}(z_i^2-1) + b_i z_i \right| ^3 }{ \left( 2\sum _{i=r(n)+1}^n a_{i,i}^2 + \sum _{i=r(n)+1}^n b_{i}^2 \right) ^{3/2} } . \end{aligned} \end{aligned}$$

By some simple algebra, there exist absolute constants \(C_1^*,C_2^*>0\) such that for sufficiently large n,

$$\begin{aligned} \begin{aligned}&\sup _{x\in {\mathbb {R}}}\left| \Pr \left( \frac{ \sum _{i=r(n)+1}^n \left( a_{i,i}(z_i^2-1) + b_i z_i \right) }{ \sqrt{2\sum _{i=r(n)+1}^n a_{i,i}^2 + \sum _{i=r(n)+1}^n b_{i}^2} } \le x \right) -\varPhi (x) \right| \\&\quad \le C_1^* \frac{ \max _{1\le i \le n} |a_{i,i}| }{ \sqrt{ \sum _{i=1}^n a_{i,i}^2 } } + C_2^* \frac{ |b_{r(n)+1}| }{ \Vert \mathbf {b}\Vert }. \end{aligned} \end{aligned}$$

Since the right-hand side tends to 0, we have

$$\begin{aligned} \frac{ \sum _{i=r(n)+1}^n \left( a_{i,i}(z_i^2-1) + b_i z_i \right) }{ \sqrt{2\sum _{i=r(n)+1}^n a_{i,i}^2 + \sum _{i=r(n)+1}^n b_{i}^2} } \rightsquigarrow {\mathcal {N}}(0,1). \end{aligned}$$

Note that \( \sum _{i=r(n)+1}^n \left( a_{i,i}(z_i^2-1) + b_i z_i \right) \) is independent of \(\sum _{i=1}^{r(n)} b_{i} z_i\) and \(\sum _{i=1}^{r(n)} b_i z_i\sim {\mathcal {N}}(0,\sum _{i=1}^{r(n)}b_i^2)\). Thus,

$$\begin{aligned} \frac{ \sum _{i=1}^{r(n)} b_i z_i+ \sum _{i=r(n)+1}^n \left( a_{i,i}(z_i^2-1) + b_i z_i \right) }{ \sqrt{2\sum _{i=1}^n a_{i,i}^2 + \sum _{i=1}^n b_{i}^2} } \rightsquigarrow {\mathcal {N}}(0,1). \end{aligned}$$

This completes the proof. \(\square \)
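The normal limit in Proposition 2 can be illustrated with a small Monte Carlo experiment. The sketch below is our own check, not part of the paper: the diagonal of \(\mathbf {A}\) and the vector \(\mathbf {b}\) are hypothetical choices under which no single eigenvalue dominates, and we verify that the first two moments of the normalized statistic are close to those of \({\mathcal {N}}(0,1)\).

```python
import numpy as np

rng = np.random.default_rng(1)
n, reps = 400, 10000  # hypothetical dimension and number of replications

# Hypothetical diagonal of A and coefficient vector b.
a = rng.uniform(0.5, 1.5, size=n)
b = rng.standard_normal(n)

# Simulate (Z^T A Z - tr(A) + b^T Z) / sqrt(2 tr(A^2) + ||b||^2)
# for Z ~ N(0, I_n); with A diagonal, Z^T A Z - tr(A) = sum_i a_i (z_i^2 - 1).
Z = rng.standard_normal((reps, n))
stat = (Z**2 - 1) @ a + Z @ b
stat /= np.sqrt(2 * np.sum(a**2) + np.sum(b**2))

print(stat.mean(), stat.std())  # both should be close to 0 and 1, respectively
```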

Proof

(Proof of Theorem 4) We note that

$$\begin{aligned} \begin{aligned}&\Pr \left( \frac{ \mathbf {y}^{*\top } \left( \mathbf {X}_b^* \mathbf {X}_b^{*\top } \right) ^k \mathbf {y}^* }{ \mathbf {y}^{*\top } \mathbf {y}^* } \le {{\,\mathrm{E}\,}}(\gamma _I^k) +\sqrt{ \frac{2{{\,\mathrm{Var}\,}}\left( \gamma _I^k \right) }{n-q} } x \right) \\&\quad = \Pr \left( \mathbf {y}^{*\top } \left( \mathbf {X}_b^* \mathbf {X}_b^{*\top } \right) ^k \mathbf {y}^* \le \left( {{\,\mathrm{E}\,}}(\gamma _I^k) +\sqrt{ \frac{2{{\,\mathrm{Var}\,}}\left( \gamma _I^k \right) }{n-q} } x \right) \mathbf {y}^{*\top } \mathbf {y}^* \right) \\&\quad = \Pr \left( \mathbf {y}^{*\top } \mathbf {B}\mathbf {y}^* \le 0 \right) , \end{aligned} \end{aligned}$$
(16)

where

$$\begin{aligned} \mathbf {B}= \left( \mathbf {X}_b^* \mathbf {X}_b^{*\top } \right) ^k - \left( {{\,\mathrm{E}\,}}(\gamma _I^k) +\sqrt{ \frac{2{{\,\mathrm{Var}\,}}\left( \gamma _I^k \right) }{n-q} } x \right) \mathbf {I}_{n-q}. \end{aligned}$$

Since \(\mathbf {y}^{*\top } \mathbf {B}\mathbf {y}^*= \varvec{\epsilon }^\top \tilde{\mathbf {U}}_a \mathbf {B}\tilde{\mathbf {U}}_a^\top \varvec{\epsilon }+ 2\varvec{\epsilon }^\top \tilde{\mathbf {U}}_a \mathbf {B}\mathbf {X}_b^* \varvec{\beta }_b + \varvec{\beta }_b^{\top } \mathbf {X}_b^{*\top } \mathbf {B}\mathbf {X}_b^* \varvec{\beta }_b\), we have

$$\begin{aligned} \begin{aligned}&\Pr \left( \mathbf {y}^{*\top } \mathbf {B}\mathbf {y}^* \le 0 \right) \\&\quad = \Pr \left( \frac{ \varvec{\epsilon }^\top \tilde{\mathbf {U}}_a \mathbf {B}\tilde{\mathbf {U}}_a^\top \varvec{\epsilon }+ 2\varvec{\epsilon }^\top \tilde{\mathbf {U}}_a \mathbf {B}\mathbf {X}_b^* \varvec{\beta }_b -\phi ^{-1}{{\,\mathrm{tr}\,}}(\mathbf {B}) }{ \sqrt{ 2\phi ^{-2}{{\,\mathrm{tr}\,}}(\mathbf {B}^2) +4 \phi ^{-1} \varvec{\beta }_b^\top \mathbf {X}_b^{*\top } \mathbf {B}^2 \mathbf {X}_b^* \varvec{\beta }_b } } \right. \\&\quad \le \left. \frac{ -\varvec{\beta }_b^{\top } \mathbf {X}_b^{*\top } \mathbf {B}\mathbf {X}_b^* \varvec{\beta }_b -\phi ^{-1}{{\,\mathrm{tr}\,}}(\mathbf {B}) }{ \sqrt{ 2\phi ^{-2} {{\,\mathrm{tr}\,}}(\mathbf {B}^2) +4\phi ^{-1} \varvec{\beta }_b^\top \mathbf {X}_b^{*\top } \mathbf {B}^2 \mathbf {X}_b^* \varvec{\beta }_b } } \right) . \end{aligned} \end{aligned}$$

To apply Proposition 2, we need to verify the condition \(\lambda _1\left( \mathbf {B}^2 \right) /{{\,\mathrm{tr}\,}}\left( \mathbf {B}^2 \right) \rightarrow 0\). It is straightforward to show that \({{\,\mathrm{tr}\,}}(\mathbf {B}^2) = ( n-q +2x^2 ) {{\,\mathrm{Var}\,}}( \gamma _I^k )\). On the other hand,

$$\begin{aligned} \lambda _1\left( \mathbf {B}^2 \right)= & {} \max _{1\le i \le n-q} \left( \gamma _i^k - {{\,\mathrm{E}\,}}(\gamma _I^k) - \sqrt{ \frac{2{{\,\mathrm{Var}\,}}\left( \gamma _I^k \right) }{n-q} } x \right) ^2 \\\le & {} 2 \max _{1\le i \le n-q} \left( \gamma _i^k - {{\,\mathrm{E}\,}}(\gamma _I^k) \right) ^2 + 4 \frac{{{\,\mathrm{Var}\,}}\left( \gamma _I^k \right) }{n-q} x^2 . \end{aligned}$$

Thus,

$$\begin{aligned} \frac{ \lambda _1(\mathbf {B}^2) }{ {{\,\mathrm{tr}\,}}(\mathbf {B}^2) }\le & {} 2 \frac{ \max _{1\le i \le n-q} \left( \gamma _i^k - {{\,\mathrm{E}\,}}(\gamma _I^k) \right) ^2 }{ (n-q+2x^2) {{\,\mathrm{Var}\,}}(\gamma _I^k) } +4\frac{ x^2 }{ (n-q)(n-q+2x^2) } \\\le & {} 2 \frac{ \max _{1\le i \le n-q} \left( \gamma _i^k - {{\,\mathrm{E}\,}}(\gamma _I^k) \right) ^2 }{ (n-q) {{\,\mathrm{Var}\,}}(\gamma _I^k) } +\frac{ 4 }{ (n-q) }, \end{aligned}$$

which tends to 0 by the condition (9). Hence Proposition 2 implies that

$$\begin{aligned} \Pr \left( \mathbf {y}^{*\top } \mathbf {B}\mathbf {y}^* \le 0 \right) = \varPhi \left( \frac{ -\varvec{\beta }_b^{\top } \mathbf {X}_b^{*\top } \mathbf {B}\mathbf {X}_b^* \varvec{\beta }_b -\phi ^{-1} {{\,\mathrm{tr}\,}}(\mathbf {B}) }{ \sqrt{ 2\phi ^{-2} {{\,\mathrm{tr}\,}}(\mathbf {B}^2) +4\phi ^{-1} \varvec{\beta }_b^\top \mathbf {X}_b^{*\top } \mathbf {B}^2 \mathbf {X}_b^* \varvec{\beta }_b } } \right) +o(1) . \end{aligned}$$
(17)

Then the conclusion follows from (16), (17) and the following facts:

$$\begin{aligned} {{\,\mathrm{tr}\,}}(\mathbf {B})&= -(n-q) \sqrt{ \frac{ 2{{\,\mathrm{Var}\,}}\left( \gamma _I^k \right) }{ n-q } } x, \\ {{\,\mathrm{tr}\,}}(\mathbf {B}^2)&= (1+o(1)) ( n-q ) {{\,\mathrm{Var}\,}}( \gamma _I^k ), \\ \varvec{\beta }_b^\top \mathbf {X}_b^{*\top } \mathbf {B}\mathbf {X}_b^* \varvec{\beta }_b&= (n-q)\left( {{\,\mathrm{Cov}\,}}\left( \gamma _I^k, \gamma _I w_I^2 \right) - {{\,\mathrm{E}\,}}(\gamma _I w_I^2) \sqrt{\frac{2{{\,\mathrm{Var}\,}}\left( \gamma _I^k \right) }{n-q}} x \right) , \\ \varvec{\beta }_b^\top \mathbf {X}_b^{*\top } \mathbf {B}^2 \mathbf {X}_b^* \varvec{\beta }_b&= (n-q) {{\,\mathrm{E}\,}}\left[ \left( \gamma _I^k -{{\,\mathrm{E}\,}}(\gamma _I^k) -\sqrt{\frac{2{{\,\mathrm{Var}\,}}(\gamma _I^k)}{n-q}}x \right) ^2 \gamma _I w_I^2 \right] . \end{aligned}$$

\(\square \)
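The first two trace identities listed above hold exactly, which can be confirmed numerically. The sketch below is our own check, not from the paper: the eigenvalues \(\gamma _i\) are hypothetical, \(m\) stands in for \(n-q\), and since \(I\) is uniform on \(\{1,\ldots ,m\}\), \({{\,\mathrm{E}\,}}(\gamma _I^k)\) and \({{\,\mathrm{Var}\,}}(\gamma _I^k)\) are the sample mean and population variance of the \(\gamma _i^k\).

```python
import numpy as np

rng = np.random.default_rng(2)
m, k, x = 200, 2, 1.3  # m plays the role of n - q; k and x are arbitrary

# Hypothetical eigenvalues gamma_i of X_b^* X_b^{*T}.
gamma = rng.uniform(0.2, 2.0, size=m)
E_gk = np.mean(gamma**k)          # E(gamma_I^k), I uniform on {1,...,m}
V_gk = np.var(gamma**k)           # Var(gamma_I^k), population variance (ddof=0)

# B is diagonal in the eigenbasis, with eigenvalues gamma_i^k - shift.
shift = E_gk + np.sqrt(2 * V_gk / m) * x
B_diag = gamma**k - shift

# tr(B) = -(n-q) sqrt(2 Var(gamma_I^k)/(n-q)) x, exactly.
assert np.isclose(B_diag.sum(), -m * np.sqrt(2 * V_gk / m) * x)
# tr(B^2) = (n-q+2x^2) Var(gamma_I^k), exactly.
assert np.isclose(np.sum(B_diag**2), (m + 2 * x**2) * V_gk)
```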

Proof

(Proof of Proposition 3) Fix an \(x\in {\mathbb {R}}\). In view of Theorem 4, we only need to show that \({{\,\mathrm{E}\,}}(\gamma _I w_I^2) = o(1)\) and

$$\begin{aligned} {{\,\mathrm{E}\,}}\left[ \left( \frac{\gamma _I^k -{{\,\mathrm{E}\,}}(\gamma _I^k)}{\sqrt{{{\,\mathrm{Var}\,}}\left( \gamma _I^k \right) }} -\sqrt{\frac{2}{n-q}}x \right) ^2 \gamma _I w_I^2 \right] = o(1). \end{aligned}$$

The former follows from the assumption \({{\,\mathrm{E}\,}}(\gamma _I w_I^2) = O((n-q)^{-1/2})\). For the latter, we have

$$\begin{aligned}&{{\,\mathrm{E}\,}}\left[ \left( \frac{\gamma _I^k -{{\,\mathrm{E}\,}}(\gamma _I^k)}{\sqrt{{{\,\mathrm{Var}\,}}\left( \gamma _I^k \right) }} -\sqrt{\frac{2}{n-q}}x \right) ^2 \gamma _I w_I^2 \right] \\&\quad \le 2 {{\,\mathrm{E}\,}}\left[ \left( \frac{\gamma _I^k -{{\,\mathrm{E}\,}}(\gamma _I^k)}{\sqrt{{{\,\mathrm{Var}\,}}\left( \gamma _I^k \right) }} \right) ^2 \gamma _I w_I^2 \right] + {\frac{4}{n-q}} x^2 {{\,\mathrm{E}\,}}\left[ \gamma _I w_I^2 \right] \\&\quad \le 2 \left( \max _{i\in \{1,\dots ,n-q\}}\gamma _i w_i^2\right) {{\,\mathrm{E}\,}}\left( \frac{\gamma _I^k -{{\,\mathrm{E}\,}}(\gamma _I^k)}{\sqrt{{{\,\mathrm{Var}\,}}\left( \gamma _I^k \right) }} \right) ^2 + {\frac{4}{n-q}} x^2 {{\,\mathrm{E}\,}}\left[ \gamma _I w_I^2 \right] \\&\quad = 2 \left( \max _{i\in \{1,\dots ,n-q\}}\gamma _i w_i^2\right) + {\frac{4}{n-q}} x^2 {{\,\mathrm{E}\,}}\left[ \gamma _I w_I^2 \right] \\&\quad \rightarrow 0. \end{aligned}$$

This completes the proof. \(\square \)
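The chain of bounds above rests on two elementary facts: \((u-v)^2 \le 2u^2+2v^2\), and \({{\,\mathrm{E}\,}}(X^2 Y) \le (\max _i Y_i)\, {{\,\mathrm{E}\,}}(X^2)\) for nonnegative \(Y\). The sketch below is our own numerical check with hypothetical \(\gamma _i\) and weights \(w_i^2\); \(m\) again stands in for \(n-q\).

```python
import numpy as np

rng = np.random.default_rng(3)
m, k, x = 500, 2, 0.7  # m plays the role of n - q; k and x are arbitrary

# Hypothetical gamma_i and weights w_i^2; I is uniform on {1,...,m},
# so expectations below are averages over i.
gamma = rng.uniform(0.1, 1.5, size=m)
w2 = rng.uniform(0.0, 1.0, size=m) / m  # small weights: max gamma_i w_i^2 is small

gk = gamma**k
X = (gk - gk.mean()) / gk.std()   # standardized gamma_I^k, so E(X^2) = 1
c = np.sqrt(2.0 / m) * x
Y = gamma * w2                    # gamma_I w_I^2

# The quantity bounded in the proof, and the final bound from the display.
lhs = np.mean((X - c)**2 * Y)
bound = 2 * np.max(Y) + (4.0 / m) * x**2 * np.mean(Y)
assert lhs <= bound + 1e-12
```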


Wang, R., Xu, X. A Bayesian-motivated test for high-dimensional linear regression models with fixed design matrix. Stat Papers 62, 1821–1852 (2021). https://doi.org/10.1007/s00362-020-01157-5
