Abstract
This paper considers testing regression coefficients in a high-dimensional linear model with a fixed design matrix. This problem is highly irregular from the frequentist point of view. In fact, we prove that no test can guarantee nontrivial power even when the true model deviates greatly from the null hypothesis. Nevertheless, Bayesian methods can still produce tests with good average power behavior. We propose a new test statistic, which is the limit of Bayes factors under the normal distribution. The null distribution of the proposed test statistic is approximated via Lindeberg’s replacement trick. Under certain conditions, the global asymptotic power function of the proposed test is also derived. The finite-sample performance of the proposed test is demonstrated via simulation studies.
References
Arboretti R, Ceccato R, Corain L, Ronchi F, Salmaso L (2018) Multivariate small sample tests for two-way designs with applications to industrial statistics. Stat Pap 59(4):1483–1503
Bai Z, Pan G, Yin Y (2018) A central limit theorem for sums of functions of residuals in a high-dimensional regression model with an application to variance homoscedasticity test. TEST 27(4):896–920
Baltagi BH, Kao C, Na S (2013) Testing for cross-sectional dependence in a panel factor model using the wild bootstrap \(F\) test. Stat Pap 54(4):1067–1094
Bentkus V, Götze F (1996) Optimal rates of convergence in the CLT for quadratic forms. Ann Probab 24(1):466–490
Bühlmann P (2013) Statistical significance in high-dimensional linear models. Bernoulli 19(4):1212–1242
Casella G, Moreno E (2006) Objective Bayesian variable selection. J Am Stat Assoc 101(473):157–167
Chatterjee S (2006) A generalization of the Lindeberg principle. Ann Probab 34(6):2061–2076
Chatterjee S (2008) A new method of normal approximation. Ann Probab 36(4):1584–1610
Chen SX, Zhang LX, Zhong PS (2010) Tests for high-dimensional covariance matrices. J Am Stat Assoc 105(490):810–819
Cohn DL (2013) Measure theory, 2nd edn. Birkhäuser, New York
Cui H, Guo W, Zhong W (2018) Test for high-dimensional regression coefficients using refitted cross-validation variance estimation. Ann Stat 46(3):958–988
DasGupta A (2008) Asymptotic theory of statistics and probability, 1st edn. Springer, New York
Dezeure R, Bühlmann P, Zhang CH (2017) High-dimensional simultaneous inference with the bootstrap. TEST 26(4):685–719
Dicker LH, Erdogdu MA (2017) Flexible results for quadratic forms with applications to variance components estimation. Ann Stat 45(1):386–414
Draper NR, Pukelsheim F (1996) An overview of design of experiments. Stat Pap 37(1):1–32
Fan J, Yuan L, Mincheva M (2013) Large covariance estimation by thresholding principal orthogonal complements. J R Stat Soc B 75(4):603–680
Feng L, Zou C, Wang Z, Chen B (2013) Rank-based score tests for high-dimensional regression coefficients. Electron J Stat 7:2131–2149
Goddard SD, Johnson VE (2016) Restricted most powerful Bayesian tests for linear models. Scand J Stat 43(4):1162–1177
Goeman JJ, van de Geer SA, van Houwelingen HC (2006) Testing against a high dimensional alternative. J R Stat Soc B 68(3):477–493
Goeman JJ, van Houwelingen HC, Finos L (2011) Testing against a high-dimensional alternative in the generalized linear model: asymptotic type I error control. Biometrika 98(2):381–390
Götze F, Tikhomirov A (2002) Asymptotic distribution of quadratic forms and applications. J Theor Probab 15(2):423–475
Horn RA, Johnson CR (1991) Topics in matrix analysis, 1st edn. Cambridge University Press, New York
Ingster YI, Tsybakov AB, Verzelen N (2010) Detection boundary in sparse regression. Electron J Stat 4:1476–1526
Janson L, Barber RF, Candès E (2016) EigenPrism: inference for high dimensional signal-to-noise ratios. J R Stat Soc B 79(4):1037–1065
Girón FJ, Martínez ML, Moreno E, Torres F (2006) Objective testing procedures in linear models: calibration of the \(p\)-values. Scand J Stat 33(4):765–784
Jiang J (1996) REML estimation: asymptotic behavior and related topics. Ann Stat 24(1):255–286
de Jong P (1987) A central limit theorem for generalized quadratic forms. Probab Theory Relat Fields 75(2):261–277
Kass RE, Raftery AE (1995) Bayes factors. J Am Stat Assoc 90(430):773–795
Lan W, Wang H, Tsai CL (2014) Testing covariates in high-dimensional regression. Ann Inst Stat Math 66(2):279–301
Lan W, Ding Y, Fang Z, Fang K (2016a) Testing covariates in high dimension linear regression with latent factors. J Multivar Anal 144:25–37
Lan W, Zhong PS, Li R, Wang H, Tsai CL (2016b) Testing a single regression coefficient in high dimensional linear models. J Econom 195(1):154–168
Lehmann EL, Romano JP (2005) Testing statistical hypotheses, 3rd edn. Springer, New York
Lei L, Bickel PJ, Karoui NE (2018) Asymptotics for high dimensional regression M-estimates: fixed design results. Probab Theory Relat Fields 172(3–4):983–1079
Liang F, Paulo R, Molina G, Clyde MA, Berger JO (2008) Mixtures of g priors for Bayesian variable selection. J Am Stat Assoc 103(481):410–423
Pollard D (1984) Convergence of stochastic processes, 1st edn. Springer, New York
Sevast’yanov BA (1961) A class of limit distributions for quadratic forms of stochastic random variables. Theory Probab Appl 6(3):337–340
Wang S, Cui H (2015) A new test for part of high dimensional regression coefficients. J Multivar Anal 137:187–203
Xu K (2016) A new nonparametric test for high-dimensional regression coefficients. J Stat Comput Simul 87(5):855–869
Zhang X, Cheng G (2017) Simultaneous inference for high-dimensional linear models. J Am Stat Assoc 112(518):757–768
Zhang CH, Zhang SS (2014) Confidence intervals for low dimensional parameters in high dimensional linear models. J R Stat Soc B 76(1):217–242
Zhong PS, Chen SX (2011) Tests for high-dimensional regression coefficients with factorial designs. J Am Stat Assoc 106(493):260–274
Zhou Q, Guan Y (2018) On the null distribution of Bayes factors in linear regression. J Am Stat Assoc 113(523):1362–1371
Acknowledgements
The authors would like to thank the editor and two anonymous referees for their helpful comments and suggestions which helped to improve the paper significantly. This work was supported by the National Natural Science Foundation of China under Grant No. 11471035.
Appendix
Lemma 1
Under the assumptions of Theorem 1, if there exists a Borel set \(G\subset {\mathbb {R}}^n\) and a number \(M\ge 0\) such that
then \(\varphi (\mathbf {y}){\mathbf {1}}_{G}(\mathbf {y})\ge \alpha {\mathbf {1}}_{G}(\mathbf {y})\) a.e. \(\lambda \).
Proof
We prove the claim by contradiction. Suppose \(\lambda (\{\mathbf {y}:\varphi (\mathbf {y}) <\alpha \} \cap G )>0\). Then there exists a sufficiently small \(\eta \) with \(0< \eta <\alpha \) such that \(\lambda (\{\mathbf {y}:\varphi (\mathbf {y}) <\alpha -\eta \} \cap G)>0\). We denote \( E:=\{\mathbf {y}:\varphi (\mathbf {y}) <\alpha -\eta \} \cap G\). By the Lebesgue density theorem (Cohn 2013, Corollary 6.2.6), there exists a point \(z\in E\) such that, for any \(\varepsilon >0\), there is a \(\delta _{\varepsilon }>0\) such that for any \(0< \delta ' <\delta _\varepsilon \),
where \(C_{\delta '}=\prod _{i=1}^n [z_i-{\delta '}, z_i + {\delta '}]\). We put
Then for any \(\phi >M\) and \(0<\delta ' <\delta _\varepsilon \),
In the last inequality, we choose \(\delta '\) small enough that
and put
Then we obtain the contradiction \(\alpha \le \alpha -(2/3)\eta \). This completes the proof. \(\square \)
Proof
(Proof of Theorem 1) We prove the claim by contradiction. Suppose there exists an \(M\ge 0\) such that
for every \(\varvec{\beta }_a\in {\mathbb {R}}^q\), \(\varvec{\beta }_b \in {\mathbb {R}}^p\), \(\phi >0\) satisfying \(\phi \varvec{\beta }_b^\top \mathbf {X}_b^\top {\tilde{\mathbf {P}}}_a \mathbf {X}_b \varvec{\beta }_b >M\). Note that for any \(h>0\),
is a subset of
Then Lemma 1 implies that for any \(h>0\), \(\varphi (\mathbf {y}) {\mathbf {1}}_{G_h}(\mathbf {y})\ge \alpha {\mathbf {1}}_{G_h}(\mathbf {y})\) a.e. \(\lambda \), where
It can be seen that \(\lambda (\{ \cup _{n=1}^\infty G_{1/n}\}^\complement )= 0\). Hence \(\varphi (\mathbf {y}) \ge \alpha \) a.e. \(\lambda \). On the other hand, since \(\varphi (\mathbf {y})\) is a level \(\alpha \) test, for every \(\phi >0\),
Note that the integrand of (10) is nonnegative. It follows that \(\varphi (\mathbf {y})=\alpha \) a.e. \(\lambda \), a contradiction. This completes the proof. \(\square \)
Proof
(Proof of Theorem 2) We note that \({{\,\mathrm{Var}\,}}(\varvec{\xi }^\top \mathbf {A}\varvec{\xi })= 2{{\,\mathrm{tr}\,}}(\mathbf {A}^2) + ({{\,\mathrm{E}\,}}(\xi _1^4)-3){{\,\mathrm{tr}\,}}(\mathbf {A}^{\circ 2} ) \). Let
Then
For \(l=1,\ldots , n\), define
It can be seen that for \(l=2,\ldots , n\), \(S_{l-1}+ h_{l-1} =S_{l} + g_{l} \), and \(S=S_n + h_n\), \(S_1 + g_1=S_\tau ^*\).
Thus, for any \(f \in {\mathscr {C}}^4 ({\mathbb {R}})\),
Applying Taylor’s theorem, for \(l=1,\ldots ,n\),
where \(\theta _1,\theta _2\in [0,1]\). Thus,
where \({{\,\mathrm{E}\,}}_l\) denotes taking expectation with respect to \(\xi _l, z_l ,{\check{z}}_l\). It is straightforward to show that
Thus,
Now we bound \({{\,\mathrm{E}\,}}(h_l^4)\) and \({{\,\mathrm{E}\,}}(g_l^4)\). By direct calculation,
To upper bound the above quantity, we use the facts \( 24 {{\,\mathrm{E}\,}}[ \xi _1^2(\xi _1^2 -1)^2] \le 2(16{{\,\mathrm{E}\,}}(\xi _1^2 -1)^4 + (9/4) {{\,\mathrm{E}\,}}(\xi _1^4) ) \), \({{\,\mathrm{E}\,}}(\xi _1^2 - 1)^4\le {{\,\mathrm{E}\,}}(\xi _1^8)\) and
Then we obtain the bound
Similarly, we have
Combining (11), (12) and (13) yields
This completes the proof. \(\square \)
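The variance identity invoked at the start of this proof, \({{\,\mathrm{Var}\,}}(\varvec{\xi }^\top \mathbf {A}\varvec{\xi })= 2{{\,\mathrm{tr}\,}}(\mathbf {A}^2) + ({{\,\mathrm{E}\,}}(\xi _1^4)-3){{\,\mathrm{tr}\,}}(\mathbf {A}^{\circ 2})\), can be checked numerically. The sketch below is illustrative only and not part of the proof: the \(3\times 3\) symmetric matrix and the Rademacher distribution for \(\varvec{\xi }\) (mean 0, variance 1, \({{\,\mathrm{E}\,}}(\xi _1^4)=1\)) are our own choices, picked so that the variance can be computed exactly by enumerating all \(2^n\) sign vectors.

```python
from itertools import product

# Symmetric matrix A, chosen arbitrarily for illustration.
A = [[1.0, 0.5, -0.3],
     [0.5, 2.0, 0.7],
     [-0.3, 0.7, -1.0]]
n = len(A)

def quad(xi):
    # Quadratic form xi^T A xi.
    return sum(A[i][j] * xi[i] * xi[j] for i in range(n) for j in range(n))

# Exact distribution of xi^T A xi under iid Rademacher xi:
# enumerate all 2^n equally likely sign vectors.
vals = [quad(xi) for xi in product((-1.0, 1.0), repeat=n)]
mean = sum(vals) / len(vals)
var_exact = sum((v - mean) ** 2 for v in vals) / len(vals)

# Right-hand side of the identity: 2 tr(A^2) + (E xi^4 - 3) tr(A∘A),
# where tr(A∘A) = sum_i a_ii^2 is the trace of the Hadamard square.
tr_A2 = sum(A[i][j] * A[j][i] for i in range(n) for j in range(n))
tr_hadamard = sum(A[i][i] ** 2 for i in range(n))
fourth_moment = 1.0  # E(xi^4) for Rademacher variables
var_formula = 2 * tr_A2 + (fourth_moment - 3) * tr_hadamard
```

Since the enumeration is exhaustive, `var_exact` and `var_formula` agree up to floating-point rounding; for this matrix both equal 3.32.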
Proof
(Proof of Theorem 3) Throughout the proof, we use the same notation as in the proof of Theorem 2 and define
and
where \(z_1,\ldots , z_n, {\check{z}}_1,\ldots , {\check{z}}_n\) are iid \({\mathcal {N}}(0,1)\) random variables and are independent of \({\hat{\tau }}^2\).
By a standard subsequence argument, we only need to prove the theorem along a subsequence of \(\{n\}\). Hence, without loss of generality, we assume \({\hat{\tau }}^2 \xrightarrow {a.s.} \phi ^2 {{\,\mathrm{E}\,}}(\epsilon _1^4)-1\). Write
Note that \(S_{\hat{\tau },1}^*\) is independent of \({\hat{\tau }}\). Since \({{\,\mathrm{E}\,}}( S_{{\hat{\tau }},1}^{*2} )=1\), the distributions \({\mathcal {L}}(S_{{\hat{\tau }},1}^{*}) \) are tight as \(n\rightarrow \infty \). Hence, without loss of generality, we assume that \({\mathcal {L}} (S_{{\hat{\tau }},1}^*)\) converges weakly to a limit distribution with distribution function \(F^\dagger (x)\). Let \(S^\dagger \) be a random variable with distribution function \(F^\dagger (x)\). By some algebra (see, e.g., Chen et al. (2010), Proposition A.1.(iii)), it can be shown that \({{\,\mathrm{E}\,}}(S^{*4}_{{\hat{\tau }},1})\) is uniformly bounded. Then \({\mathcal {L}} ( S_{{\hat{\tau }},1}^{*2} )\) is uniformly integrable. Hence \({{\,\mathrm{E}\,}}(S^{\dagger 2})=1\), and \(F^\dagger (x)\) cannot be concentrated at a single point. Consequently, \(F^\dagger (x)\) is continuous and strictly increasing on \(\{x:0<F^\dagger (x)<1\}\); see Sevast’yanov (1961) as well as the remark made by A. N. Kolmogorov on that paper.
The condition (8) implies that \({{\,\mathrm{E}\,}}[S_{{\hat{\tau }},2}^{*2}|{\hat{\tau }}]\rightarrow 0\) almost surely. Then almost surely, \({\mathcal {L}} (S^*_{{\hat{\tau }}}|{\hat{\tau }}) \rightsquigarrow {\mathcal {L}}(S^\dagger )\). Consequently, for every \(f\in {\mathscr {C}}^4 ({\mathbb {R}})\), we have \(| {{\,\mathrm{E}\,}}[f(S^*_{{\hat{\tau }}}) |{\hat{\tau }}] - {{\,\mathrm{E}\,}}f(S^\dagger ) |\rightarrow 0\) almost surely. On the other hand, Theorem 2 and the condition (8) imply \(|{{\,\mathrm{E}\,}}f(S)- {{\,\mathrm{E}\,}}[f(S^*_{{\hat{\tau }}})|{\hat{\tau }}] |\rightarrow 0\) almost surely. Thus, \(|{{\,\mathrm{E}\,}}f(S)- {{\,\mathrm{E}\,}}f(S^\dagger ) |\rightarrow 0\). That is, \({\mathcal {L}} (S)\rightsquigarrow {\mathcal {L}} (S^\dagger )\).
Note that
We need to deal with \(F^{-1} (1-\alpha ;\mathbf {A},{\hat{\tau }})\). Since \({\mathcal {L}} (S^*_{{\hat{\tau }}}|{\hat{\tau }}) \rightsquigarrow {\mathcal {L}}(S^\dagger )\) almost surely, the fact
implies that almost surely,
We also need the fact that
which is a consequence of
The fact \(S\rightsquigarrow S^\dagger \), Eqs. (14), (15) and Slutsky’s theorem lead to
This proves the theorem. \(\square \)
Proof
(Proof of Proposition 1) From Bai et al. (2018), Theorem 2.1, one can obtain the explicit forms of \({{\,\mathrm{Var}\,}}\left( {\tilde{\varvec{\epsilon }}}^\top \left( {\tilde{\mathbf {P}}}_a \right) {\tilde{\varvec{\epsilon }}} \right) \) and \({{\,\mathrm{Var}\,}}\left( \sum _{i=1}^n {\tilde{\epsilon }}_i^4 \right) \), which involve the traces of certain matrices. By Horn and Johnson (1991), Theorem 5.5.1, the eigenvalues of these matrices are all bounded. Hence it can be deduced that \({{\,\mathrm{Var}\,}}( {\tilde{\varvec{\epsilon }}}^\top {\tilde{\mathbf {P}}}_a {\tilde{\varvec{\epsilon }}} )=O(n)\) and \({{\,\mathrm{Var}\,}}\left( \sum _{i=1}^n {\tilde{\epsilon }}_i^4 \right) =O(n)\). Thus,
It follows that
Let \(\delta _{i,j}=1\) if \(i=j\) and 0 if \(i\ne j\). We have
Then
This completes the proof. \(\square \)
Proof
(Proof of Proposition 2) Without loss of generality, we assume that \(\mathbf {A}\) is a diagonal matrix and \(|b_1|\ge \cdots \ge |b_n|\). By a standard subsequence argument, we only need to prove the result along a subsequence. Hence we can assume \(\lim _{n\rightarrow \infty }\Vert \mathbf {b}\Vert ^2/{{\,\mathrm{tr}\,}}(\mathbf {A}^2) =c \in [0,+\infty ]\). If \( c=0\), the Lyapunov central limit theorem implies that
If \(c=+\infty \),
In what follows, we assume \(c\in (0,+\infty )\). By the Helly selection theorem, we can assume \(\lim _{n\rightarrow \infty } |b_i|/\Vert \mathbf {b}\Vert = b_i^*\in [0,1]\), \(i=1,2,\ldots \). By Fatou’s lemma, we have \(\sum _{i=1}^{\infty } (b_i^{*})^2\le 1\). Consequently, \(\lim _{i\rightarrow \infty } b_i^* =0\).
Note that we have assumed that \(\lambda _1(\mathbf {A}^2)/{{\,\mathrm{tr}\,}}(\mathbf {A}^2)\rightarrow 0\). Then for every fixed integer \(r>0\),
Hence there exists a sequence of positive integers \(r(n)\rightarrow \infty \) such that \({(\sum _{i=1}^{r(n)} a_{i,i}^2) }/{( \sum _{i=1}^n a_{i,i}^2) } \rightarrow 0\) and \(r(n)/n\rightarrow 0\). Write
which is a sum of independent random variables. The first term is negligible since \({{\,\mathrm{Var}\,}}( \sum _{i=1}^{r(n)} a_{i,i}(z_i^2-1) )=o(\sum _{i=1}^n a_{i,i}^2)\). Now we deal with the third term. By the Berry–Esseen inequality (see, e.g., DasGupta 2008, Theorem 11.2), there exists an absolute constant \(C^*>0\) such that
By some simple algebra, there exist absolute constants \(C_1^*,C_2^*>0\) such that for sufficiently large \(n\),
Since the right-hand side tends to 0, we have
Note that \( \sum _{i=r(n)+1}^n \left( a_{i,i}(z_i^2-1) + b_i z_i \right) \) is independent of \(\sum _{i=1}^{r(n)} b_{i} z_i\) and \(\sum _{i=1}^{r(n)} b_i z_i\sim {\mathcal {N}}(0,\sum _{i=1}^{r(n)}b_i^2)\). Thus,
This completes the proof. \(\square \)
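The decomposition underlying this proof writes the statistic as a sum of independent summands \(a_{i,i}(z_i^2-1) + b_i z_i\) with \(z_i\sim {\mathcal {N}}(0,1)\), so its variance is \(2\sum _i a_{i,i}^2 + \sum _i b_i^2\): each \(a_{i,i}(z_i^2-1)\) has variance \(2a_{i,i}^2\), each \(b_i z_i\) has variance \(b_i^2\), and the cross term vanishes because \({{\,\mathrm{E}\,}}[z(z^2-1)]=0\). A small Monte Carlo sketch (not part of the proof; the coefficient vectors are arbitrary illustrative values):

```python
import random
import statistics

random.seed(0)
a = [0.5, -0.2, 0.8]   # arbitrary diagonal entries a_ii of A
b = [1.0, 0.3, -0.6]   # arbitrary coefficients b_i

# Theoretical variance: 2 * sum a_ii^2 + sum b_i^2.
var_theory = 2 * sum(x * x for x in a) + sum(x * x for x in b)

def draw_S():
    # One draw of S = sum_i a_ii (z_i^2 - 1) + sum_i b_i z_i, z_i iid N(0,1).
    z = [random.gauss(0.0, 1.0) for _ in a]
    return (sum(ai * (zi * zi - 1.0) for ai, zi in zip(a, z))
            + sum(bi * zi for bi, zi in zip(b, z)))

samples = [draw_S() for _ in range(200_000)]
var_mc = statistics.pvariance(samples)
```

With 200,000 draws the relative Monte Carlo error of the variance estimate is well under one percent, so `var_mc` lands close to `var_theory` (here 3.31).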
Proof
(Proof of Theorem 4) We note that
where
Since \(\mathbf {y}^{*\top } \mathbf {B}\mathbf {y}^*= \varvec{\epsilon }^\top \tilde{\mathbf {U}}_a \mathbf {B}\tilde{\mathbf {U}}_a^\top \varvec{\epsilon }+ 2\varvec{\epsilon }^\top \tilde{\mathbf {U}}_a \mathbf {B}\mathbf {X}_b^* \varvec{\beta }_b + \varvec{\beta }_b^{\top } \mathbf {X}_b^{*\top } \mathbf {B}\mathbf {X}_b^* \varvec{\beta }_b\), we have
To apply Proposition 2, we need to verify the condition \(\lambda _1\left( \mathbf {B}^2 \right) /{{\,\mathrm{tr}\,}}\left( \mathbf {B}^2 \right) \rightarrow 0\). It is straightforward to show that \({{\,\mathrm{tr}\,}}(\mathbf {B}^2) = ( n-q +2x^2 ) {{\,\mathrm{Var}\,}}( \gamma _I^k )\). On the other hand,
Thus,
which tends to 0 by the condition (9). Hence Proposition 2 implies that
Then the conclusion follows from (16), (17) and the following facts
\(\square \)
Proof
(Proof of Proposition 3) Fix an \(x\in {\mathbb {R}}\). In view of Theorem 4, we only need to show that \({{\,\mathrm{E}\,}}(\gamma _I w_I^2) = o(1)\) and
The former is a consequence of the assumption \({{\,\mathrm{E}\,}}(\gamma _I w_I^2) = O((n-q)^{-1/2})\). For the latter, we have
This completes the proof. \(\square \)
Wang, R., Xu, X. A Bayesian-motivated test for high-dimensional linear regression models with fixed design matrix. Stat Papers 62, 1821–1852 (2021). https://doi.org/10.1007/s00362-020-01157-5