
Variable Selection in High-Dimensional Error-in-Variables Models via Controlling the False Discovery Proportion


Abstract

Multiple testing has received much attention in high-dimensional statistical theory and applications, and variable selection can be regarded as a generalization of multiple testing: it aims to select the important variables among many candidates. Performing variable selection in high-dimensional linear models with measurement errors is challenging, since both the high-dimensional parameters and the measurement errors must be accounted for to avoid severe bias. We consider the problem of variable selection in error-in-variables models and introduce the DCoCoLasso-FDP procedure, a new variable selection method. By constructing consistent estimators of the false discovery proportion (FDP) and false discovery rate (FDR), our method can prioritize the important variables and control the FDP and FDR at a specified level in error-in-variables models. An extensive simulation study is conducted to compare the DCoCoLasso-FDP procedure with existing methods in various settings, and numerical results are provided to demonstrate the efficiency of our method.


References

  1. Barber, R.F., Candès, E.J.: Controlling the false discovery rate via knockoffs. Ann. Stat. 43(5), 2055–2085 (2015)

  2. Belloni, A., Chernozhukov, V., Kaul, A.: Confidence bands for coefficients in high dimensional linear models with error-in-variables. arXiv: Statistics Theory (2017)

  3. Belloni, A., Rosenbaum, M., Tsybakov, A.B.: Linear and conic programming approaches to high-dimensional errors-in-variables models. J. R. Stat. Soc. Ser. B (2014) (forthcoming)

  4. Benjamini, Y., Hochberg, Y.: Controlling the false discovery rate: a practical and powerful approach to multiple testing. J. R. Stat. Soc. Ser. B Methodol. 57(1), 289–300 (1995)

  5. Boyd, S., Parikh, N., Chu, E., Peleato, B., Eckstein, J.: Distributed optimization and statistical learning via the alternating direction method of multipliers. Found. Trends Mach. Learn. 3(1), 1–122 (2010)

  6. Chen, X., Doerge, R.W.: A strong law of large numbers related to multiple testing normal means. arXiv: Statistics Theory (2014)

  7. Datta, A., Zou, H.: CoCoLasso for high-dimensional error-in-variables regression. Ann. Stat. 45(6), 2400–2426 (2017)

  8. Fan, J., Han, X., Gu, W.: Estimating false discovery proportion under arbitrary covariance dependence. J. Am. Stat. Assoc. 107(499), 1019–1035 (2012)

  9. G'Sell, M., Wager, S., Chouldechova, A., Tibshirani, R.: Sequential selection procedures and false discovery rate control. J. R. Stat. Soc. Ser. B Stat. Methodol. 78(2), 423–444 (2016)

  10. Hartigan, J.A.: Bounding the maximum of dependent random variables. Electron. J. Stat. 8(2), 3126–3140 (2014)

  11. Jeng, X.J., Chen, X.: Predictor ranking and false discovery proportion control in high-dimensional regression. J. Multivar. Anal. 171, 163–175 (2019)

  12. Fan, J., Li, R.: Variable selection via nonconcave penalized likelihood and its oracle properties. J. Am. Stat. Assoc. 96(456), 1348–1360 (2001)

  13. Loh, P., Wainwright, M.J.: High-dimensional regression with noisy and missing data: provable guarantees with nonconvexity. Ann. Stat. 40(3), 1637–1664 (2012)

  14. Meinshausen, N., Bühlmann, P.: Stability selection. J. R. Stat. Soc. Ser. B Stat. Methodol. 72(4), 417–473 (2010)

  15. Rosenbaum, M., Tsybakov, A.B.: Sparse recovery under matrix uncertainty. Ann. Stat. 38(5), 2620–2651 (2010)

  16. Tibshirani, R.: Regression shrinkage and selection via the lasso. J. R. Stat. Soc. Ser. B Methodol. 58(1), 267–288 (1996)

  17. Van de Geer, S., Bühlmann, P., Ritov, Y., Dezeure, R.: On asymptotically optimal confidence regions and tests for high-dimensional models. Ann. Stat. 42(3), 1166–1202 (2014)

  18. Wang, Z., Xue, L.: Inference for high dimensional linear models with error-in-variables. Commun. Stat. Simul. Comput. 13, 1–10 (2019). https://doi.org/10.1080/03610918.2018.1554108

  19. Zhao, P., Yu, B.: On model selection consistency of lasso. J. Mach. Learn. Res. 7(12), 2541–2563 (2006)

  20. Zou, H.: The adaptive lasso and its oracle properties. J. Am. Stat. Assoc. 101(476), 1418–1429 (2006)


Author information


Correspondence to Xudong Huang.

Additional information

Research is supported by Grant No. 11901006 from the National Natural Science Foundation of China and by Grant Nos. 1908085MA20 and 1908085QA06 from the Natural Science Foundation of Anhui Province.

Appendix: Proof of the Main Results

1.1 Appendix A: Proof of Theorem 3.7

We first prove the following lemmas, which are essential for the proof of the theorem. The convergence-rate conclusions in Lemmas 2.1 and 2.2 have been established in the literature (see, e.g., [2, 7]); their proofs are similar and are omitted.

Proof of Lemma 3.5:

We first consider the case where the covariance matrix \({\varvec{\Sigma }}\) is unknown; the results for known \({\varvec{\Sigma }}\) then follow by taking \(\widehat{\varvec{\Theta }}={\varvec{\Sigma }}_0^{-1}\). Recalling Eq. (2.9) and letting \(\varvec{t}=\varvec{\widehat{\Theta }\zeta }_n\), we decompose \(\sqrt{n}(\varvec{\hat{b}-\beta ^0})\) into the following two terms:

$$\begin{aligned} \sqrt{n}\varvec{(\hat{b}-\beta ^0)}= & {} \sqrt{n}(\varvec{\hat{\beta }^{coco}}-\varvec{\beta }^{\varvec{0}})+\sqrt{n}\varvec{\widehat{\Theta }}\{n^{-1}\mathbf{Z}^T(\mathbf{X}\varvec{\beta }^{\varvec{0}}+ \varvec{\varepsilon }) -(n^{-1}\mathbf{Z}^T\mathbf{Z}-\varvec{\Sigma _w})\varvec{\hat{\beta }^{coco}}\}\\= & {} \sqrt{n}(\varvec{\hat{\beta }^{coco}}-\varvec{\beta }^{\varvec{0}})+\sqrt{n}\varvec{\widehat{\Theta }}\{n^{-1}\mathbf{Z}^T\varvec{\varepsilon }+n^{-1}\mathbf{Z}^T\mathbf{X}\varvec{\beta }^{\varvec{0}}- \varvec{\widehat{\Sigma }(\hat{\beta }^{coco}}-\varvec{\beta }^{\varvec{0}})-\varvec{\widehat{\Sigma }\beta ^0}\}\\= & {} \sqrt{n}\varvec{\widehat{\Theta }}\{(n^{-1}\mathbf{Z}^T\mathbf{X}-\varvec{\widehat{\Sigma })\beta ^0}+ n^{-1}\mathbf{Z}^T\varvec{\varepsilon }\}-\sqrt{n}(\widehat{\varvec{\Theta }}\widehat{\Sigma }-\mathbf{I})(\varvec{\hat{\beta }^{coco}-\beta ^0})\\= & {} \varvec{ t-\delta }, \end{aligned}$$

where \(\varvec{\widehat{\Sigma }}=n^{-1}\mathbf{Z}^T\mathbf{Z}-\varvec{\Sigma }_{\mathbf{w}}\), \(\varvec{\zeta }_n=\sqrt{n}\{(n^{-1}\mathbf{Z}^T\mathbf{X}-\varvec{\widehat{\Sigma })\beta ^0}+n^{-1}\mathbf{Z}^T\varvec{ \varepsilon }\} =n^{-1/2}\sum _{i=1}^n\{\mathbf{Z}_i(\varepsilon _i-\mathbf{W}_i^T\varvec{\beta }^{\varvec{0}})+\varvec{\Sigma _w\beta ^0}\} =n^{-1/2}\sum _{i=1}^n\varvec{\zeta }_{ni}\).

Because \(\varvec{ \varepsilon }\perp \mathbf{X}\), \(\mathbf{W}\perp \mathbf{X}\), \(\mathbf{W}\perp \varvec{\varepsilon }\), and the zero-mean conditions, we have \(E(\varvec{\zeta }_n|\mathbf{X})=0\). Observing that \(\{\zeta _{ni}\}_{i=1}^n\) are independent and identically distributed random vectors, the Lindeberg–Lévy central limit theorem gives

$$\begin{aligned} \varvec{\zeta }_n \xrightarrow {D}\mathcal {N}_p(0,{{\varvec{\Gamma }}}), \end{aligned}$$

where \( { {\varvec{\Gamma }}}=\text {var}(\zeta _{ni}|\mathbf{X})=E[\{\mathbf{Z}_i(\varepsilon _i-\mathbf{W}_i^T\varvec{\beta }^{\varvec{0}})+\varvec{\Sigma _w\beta ^0}\}\{\mathbf{Z}_i(\varepsilon _i-\mathbf{W}_i^T\varvec{\beta }^{\varvec{0}})+\varvec{\Sigma _w\beta ^0}\}^{T}|\mathbf{X}]. \) Consequently, when \({\varvec{\Sigma }}\) is assumed to be known, we have

$$\begin{aligned} \mathbf{t_0} \xrightarrow {D}\mathcal {N}_p(0,{\varvec{\Sigma }}_0^{-1}{{\varvec{\Gamma }}}{\varvec{\Sigma }}_0^{-1}), \end{aligned}$$

When \({\varvec{\Sigma }}\) is unavailable, we can plug in \(\widehat{\varvec{\Theta }}\widehat{\Gamma }\widehat{\Theta }^T\) as an estimate of \({\varvec{\Sigma }}_0^{-1}{\varvec{\Gamma }}{\varvec{\Sigma }}_0^{-1}\). Then

$$\begin{aligned} \mathbf{t} \xrightarrow {D}\mathcal {N}_p(0,\widehat{\varvec{\Theta }}\widehat{\Gamma }\widehat{\Theta }^T), \end{aligned}$$

where

$$\begin{aligned} \widehat{\Gamma }= & {} E[\{\mathbf{Z}_i(\varepsilon _i-\mathbf{W}_i^T\varvec{\hat{\beta }^{coco}})+\varvec{\Sigma _w\hat{\beta }^{coco}}\} \{\mathbf{Z}_i(\varepsilon _i-\mathbf{W}_i^T\varvec{\hat{\beta }^{coco}})+\varvec{\Sigma _w\hat{\beta }^{coco}}\}^T |\mathbf{X}]\\= & {} n^{-1}\sum _{i=1}^{n}\{\mathbf{Z}_i(Y_i-\mathbf{Z}_i^T\varvec{\hat{\beta }^{coco}})+\varvec{\Sigma _w\hat{\beta }^{coco}}\} \{\mathbf{Z}_i(Y_i-\mathbf{Z}_i^T\varvec{\hat{\beta }^{coco}})+\varvec{\Sigma _w\hat{\beta }^{coco}}\}^T. \end{aligned}$$

\(\square \)
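For readers who want to see how the quantities in Lemma 3.5 are formed from data, the following is a minimal NumPy sketch (ours, not part of the paper): it computes the bias-corrected estimator \(\hat{b}\) consistent with the decomposition of \(\sqrt{n}(\varvec{\hat{b}-\beta ^0})\) above, together with the plug-in covariance \(\widehat{\varvec{\Omega }}=\widehat{\varvec{\Theta }}\widehat{\Gamma }\widehat{\Theta }^T\), assuming \(\varvec{\Sigma }_{\mathbf{w}}\) is known and that the CoCoLasso estimate and the node-wise inverse \(\widehat{\varvec{\Theta }}\) have already been obtained elsewhere. Function and variable names are ours.

```python
import numpy as np

def debias_and_covariance(Z, y, beta_coco, Theta_hat, Sigma_w):
    """Sketch of the corrected estimator and its plug-in covariance.

    Z          : (n, p) observed design, Z = X + W
    y          : (n,)   responses
    beta_coco  : (p,)   initial CoCoLasso estimate
    Theta_hat  : (p, p) approximate inverse of Sigma_hat from node-wise regression
    Sigma_w    : (p, p) known measurement-error covariance
    """
    n = Z.shape[0]
    Sigma_hat = Z.T @ Z / n - Sigma_w                   # corrected Gram matrix
    # bias-corrected estimator b_hat, matching the decomposition
    # sqrt(n)(b_hat - beta0) = t - delta given above
    b_hat = beta_coco + Theta_hat @ (Z.T @ y / n - Sigma_hat @ beta_coco)
    # plug-in Gamma_hat with rows g_i = Z_i (Y_i - Z_i^T beta_coco) + Sigma_w beta_coco
    resid = y - Z @ beta_coco
    G = Z * resid[:, None] + Sigma_w @ beta_coco
    Gamma_hat = G.T @ G / n
    Omega_hat = Theta_hat @ Gamma_hat @ Theta_hat.T     # covariance used to standardize
    return b_hat, Omega_hat
```

Here \(\widehat{\varvec{\Omega }}\) is the covariance that is later used to standardize \(\hat{b}\) in the proof of Theorem 3.7.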

Proof of Lemma 3.6:

We only need to prove that \(\varvec{\delta }\) satisfies the conclusion in Lemma 3.6; the proof for \(\varvec{\delta _0}\) is similar. For \(\varvec{\delta }\), applying Hölder's inequality,

$$\begin{aligned} \Vert \varvec{ \delta }\Vert _\infty \le \sqrt{n}\Vert \widehat{\varvec{\Theta }}\widehat{\Sigma }-I\Vert _\infty \Vert \varvec{\hat{\beta }^{coco}-\beta ^0}\Vert _1. \end{aligned}$$

Lemma 5.5 of [18] implies

$$\begin{aligned} \Vert \widehat{\varvec{\Theta }}\widehat{\Sigma }-I\Vert _\infty \le O_\mathbb {P}\left\{ \max \limits _js_j(1+\max \limits _j\Vert \gamma _j^0\Vert _2)\sqrt{\log (p)/n}\right\} , \end{aligned}$$

with probability at least \(1-c_1\exp (-c_2\log p)\). Together with (2.13), this yields

$$\begin{aligned} \Vert \varvec{\delta }\Vert _\infty \le O_{\mathbb {P}}\left\{ (1+\Vert \varvec{\beta }^{\varvec{0}}\Vert _2)s_0\max \limits _js_j(1+\max \limits _j\Vert \gamma _j^0\Vert _2)\log p/\sqrt{n}\right\} , \end{aligned}$$

with probability at least \(1-c_1\exp (-c_2\log p)\). This completes the proof. \(\square \)

Lemma A.1

Consider the estimator \(\widehat{\varvec{\Theta }}\) from the modified node-wise regression, and suppose Assumptions 3.1 and 3.3 hold. Then for suitable tuning parameters \(\lambda _j\), \(j\in \{1,\ldots ,p\}\), we have

$$\begin{aligned}&\Vert \widehat{\varvec{\Theta }}\widehat{\Gamma }_{\mathbf{1}}-\mathbf{I}\Vert _\infty =O_{\mathbb {P}}\left( \sqrt{\frac{\log (p)}{n}}\right) ,\\&\Vert \widehat{\varvec{\Theta }}\widehat{\Gamma }_1\widehat{\Theta }^T-{\varvec{\Theta }}\Vert _\infty =o_{\mathbb {P}}\left( 1\right) . \end{aligned}$$

Furthermore,

$$\begin{aligned} \Vert {\widehat{{\varvec{\Omega }}}}\Vert _1=O_{\mathbb {P}} \left( p^2\sqrt{\frac{\max _{j}s_j\log (p)}{n}}\right) . \end{aligned}$$

Proof

Let us first simplify \({\widehat{{\varvec{\Gamma }}}}\):

$$\begin{aligned} \varvec{\widehat{\Gamma }}&=_{\quad }&\text {var}(\sqrt{n}\{(n^{-1}\mathbf{Z}^T\mathbf{X}-\varvec{\widehat{\Sigma })\hat{\beta }^{coco}}+n^{-1}\mathbf{Z}^T\varvec{\varepsilon }\})\\&=_{{(1)}}&\text {var}(n^{-1/2}\mathbf{Z}^T\varvec{\varepsilon })+\text {var}(n^{-1/2}\mathbf{Z}^T\mathbf{W}\varvec{\hat{\beta }^{coco}})\\&=_{\quad }&\text {var}(n^{-1/2}\mathbf{Z}^T{\varvec{\varepsilon }})+\text {var}(n^{-1/2}\mathbf{X}^T\mathbf{W}\varvec{\hat{\beta }^{coco}})\\&=_{\tiny {(2)}}&({\varvec{\Sigma }}+\varvec{\Sigma _w})\sigma _\varepsilon ^2+nb^2\varvec{\Sigma _w\Sigma }\\&=_{\quad }&\sigma _\varepsilon ^2\varvec{\widehat{\Gamma }_1}, \end{aligned}$$

where \(b^2=(\varvec{\hat{\beta }}^\mathbf{coco})^{T}\varvec{\hat{\beta }^{coco}}\) and \(\varvec{\widehat{\Gamma }_1=(\Sigma +\Sigma _w)}+nb^2\sigma _\varepsilon ^{-2}\varvec{\Sigma _w\Sigma }\). Thus, we see that \(\varvec{\widehat{\Omega }=\widehat{\Theta }\widehat{\Gamma }\widehat{\Theta }}^T=\sigma _\varepsilon ^2\varvec{\widehat{\Theta }\widehat{\Gamma }_1\widehat{\Theta }}^T.\)

We now briefly justify steps (1) and (2) above.

  1. (1)

Recall \(\varvec{\varepsilon \perp X, W\perp X, W\perp \varepsilon }\) and the zero-mean conditions, so \(\text {cov}(\mathbf{Z}^T\varvec{\varepsilon },\mathbf{Z}^T\mathbf{W}\varvec{\hat{\beta }^{coco}})=0\).

  2. (2)

    We define \({{\varvec{\phi }}}_n=\mathbf{Z}^T\varvec{\varepsilon }=(\mathbf{X}+\mathbf{W})^T\varvec{\varepsilon }\). For any \(j\in \{1,\ldots ,p\}\),

    $$\begin{aligned}{}[\phi _n]_j=\sum \limits _{i=1}^n\underbrace{(x_{ij}+w_{ij})\varepsilon _i}_{\nu _{ij}}. \end{aligned}$$

Since \(\mathbf{X}, \mathbf{W}, \varvec{ \varepsilon }\) are mutually independent, for each \(j\in \{1,...,p\}\), we have

$$\begin{aligned} \text {var}(\nu _{ij}|\mathbf{X})=\text {var}(\sum \limits _{i=1}^n(x_{ij}+w_{ij})\varepsilon _i) =\sum \limits _{i=1}^n\sigma _\varepsilon ^2(x_{ij}^2+w_{ij}^2), \end{aligned}$$

and for \(j\ne k\),

$$\begin{aligned} \text {cov}(\nu _{ij},\nu _{ik}|\mathbf{X})= & {} E(\nu _{ij}\nu _{ik}|\mathbf{X}) =\sum \limits _{i=1}^n\sigma _\varepsilon ^2E((x_{ij}+w_{ij})(x_{ik}+w_{ik}))\\= & {} \sum \limits _{i=1}^n\sigma _\varepsilon ^2(x_{ij}x_{ik}+w_{ij}w_{ik}). \end{aligned}$$

Therefore, we obtain \(\text {var}(n^{-1/2}\mathbf{Z}^T\varvec{\varepsilon })=({\varvec{\Sigma }}+\varvec{\Sigma _w)\sigma _\varepsilon }^2\). Similarly, \(\text {var}(n^{-1/2}\mathbf{X}^T\mathbf{W}\varvec{\hat{\beta }^{coco}})=nb^2{\varvec{\Sigma }}\varvec{\Sigma _w}\) can be obtained in the same way.
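As a quick sanity check of this variance identity (our own illustration, not part of the proof; the dimensions and covariances below are arbitrary choices), one can verify numerically that \(n^{-1/2}\mathbf{Z}^T\varvec{\varepsilon }\) has covariance close to \(({\varvec{\Sigma }}+\varvec{\Sigma _w})\sigma _\varepsilon ^2\):

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, sigma_eps = 200, 5, 1.0
Sigma = np.eye(p)                      # covariance of X (assumed for this check)
Sigma_w = 0.25 * np.eye(p)             # measurement-error covariance
reps = 2000
acc = np.zeros((p, p))
for _ in range(reps):
    X = rng.multivariate_normal(np.zeros(p), Sigma, size=n)
    W = rng.multivariate_normal(np.zeros(p), Sigma_w, size=n)
    eps = sigma_eps * rng.standard_normal(n)
    s = (X + W).T @ eps / np.sqrt(n)   # one draw of n^{-1/2} Z^T eps
    acc += np.outer(s, s)
emp_cov = acc / reps
print(np.round(emp_cov, 2))            # close to sigma_eps^2 * (Sigma + Sigma_w) = 1.25 * I
```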

When Assumption 3.3 holds, Lemma 5.5 of [18] gives

$$\begin{aligned}&\Vert \widehat{\varvec{\Theta }}{\varvec{\Sigma }}-\mathbf{I}\Vert _\infty \le O_\mathbb {P}\{\max \limits _js_j(1+\max \limits _j\Vert \gamma _j^0\Vert _2)\sqrt{\log (p)/n}\} =o_\mathbb {P}(1),\\&\Vert \varvec{\widehat{\Theta }}\Vert _{\ell _\infty }=\max \limits _j(1+\Vert \hat{\gamma _j}\Vert _1)/\hat{\tau _j}^2\le O_{\mathbb {P}}\left( \max \limits _j\sqrt{s_j}\Vert \gamma _j^0\Vert _2\right) = O_{\mathbb {P}}(\max \limits _j\sqrt{s_j}). \end{aligned}$$

Based on the above discussion, and assuming \(nb^2\sqrt{\max _j s_j}=O_\mathbb {P}(1)\), we now establish the upper bound for \(\Vert \varvec{\widehat{\Theta }\widehat{\Gamma }_1-I}\Vert _\infty \). By the triangle inequality and Hölder's inequality, we get

$$\begin{aligned} \Vert \widehat{\varvec{\Theta }}\widehat{{\varvec{\Gamma }}}_{\mathbf{1}}-\mathbf{I}\Vert _\infty\le & {} \Vert \varvec{\widehat{\Theta }\Sigma }-\mathbf{I}\Vert _\infty +\Vert \varvec{\widehat{\Theta }\Sigma _w}\Vert _\infty +nb^2\sigma _\varepsilon ^{-2}\Vert \varvec{\widehat{\Theta }\Sigma _w\Sigma }\Vert _\infty \\\le & {} \Vert \widehat{\varvec{\Theta }}{\varvec{\Sigma }}-\mathbf{I}\Vert _\infty +\Vert \varvec{\widehat{\Theta }}\Vert _{\ell _\infty }\Vert \varvec{\Sigma _w}\Vert _\infty + nb^2\sigma _\varepsilon ^{-2}\Vert {\varvec{\Sigma }}\Vert _{\ell _1}\Vert \widehat{{\varvec{\Theta }}}\Vert _{\ell _\infty }\Vert \varvec{\Sigma _w}\Vert _\infty \\\le & {} O_{\mathbb {P}}\left( \sqrt{\max _j s_j\log (p)/n}\right) +O_{\mathbb {P}}\left( \sqrt{\log (p)/n}\right) \\\le & {} o_{\mathbb {P}}(1)+O_{\mathbb {P}}\left( \sqrt{\log (p)/n}\right) , \end{aligned}$$

since \(n\gg s_j\log p\); thus the first equality in Lemma A.1 holds.

Next, we prove the second equality. For convenience, we first give the following results which have been mentioned in Lemma 5.5 of [18].

$$\begin{aligned} \Vert \widehat{{\varvec{\Theta }}}_j^T\Vert _1\le & {} O_\mathbb {P}(\max _j\sqrt{s_j}),\quad \Vert \gamma _j^0\Vert _2\le \frac{\Lambda _{\max }}{\Lambda _{\min }}\le \frac{C_{\max }}{C_{\min }},\quad \left| \ \frac{1}{\hat{\tau }_j^2}-\frac{1}{\tau _j^2}\right| \\\le & {} O_{\mathbb {P}}\left( \sqrt{s_j\log (p)/n}\right) . \end{aligned}$$

Combining these with Eq. (2.13), we obtain

$$\begin{aligned} \Vert \widehat{\varvec{\Theta }}-{\varvec{\Theta }}\Vert _\infty\le & {} \max _j\Vert \widehat{\varvec{\Theta }}_j-{\varvec{\Theta }}_j\Vert _2\\\le & {} \Vert \hat{\gamma }_j-\gamma _j^0\Vert _2/\hat{\tau }_j^2+\Vert \gamma _j^0\Vert _2(1/\hat{\tau }_j^2-1/\tau _j^2) \\\le & {} O_{\mathbb {P}}\left( (1+\Vert \gamma _j^0\Vert _2)\sqrt{s_j\log (p)/n}\right) \\= & {} o_{\mathbb {P}}(1). \end{aligned}$$

As a result,

$$\begin{aligned} \Vert \widehat{\varvec{\Theta }}\widehat{{\varvec{\Gamma }}}_{\mathbf{1}} \widehat{{\varvec{\Theta }}}^T-{\varvec{\Theta }}\Vert _\infty\le & {} \Vert {(\widehat{{\varvec{\Theta }}}\widehat{{\varvec{\Gamma }}}_{\mathbf{1}}-\mathbf{I})\hat{{\varvec{\Theta }}}}^T\Vert _\infty +\Vert \widehat{\varvec{\Theta }}-{\varvec{\Theta }}\Vert _\infty \\\le & {} \max _j\Vert (\widehat{\varvec{\Theta }}\widehat{{\varvec{\Gamma }}}_{\mathbf{1}}-\mathbf{I})\Vert _{\infty }\Vert \widehat{{\varvec{\Theta }}}_j^T\Vert _1+\Vert \widehat{\varvec{\Theta }}-{\varvec{\Theta }}\Vert _\infty \\\le & {} O_{\mathbb {P}}\left( \sqrt{\max _j s_j\log (p)/n}\right) +o_{\mathbb {P}}(1) \\= & {} o_{\mathbb {P}}(1). \end{aligned}$$

With the above conclusions, we can now prove the last claim.

$$\begin{aligned} \Vert \widehat{\varvec{\Theta }} \widehat{{\varvec{\Gamma }}}_{\mathbf{1}} \widehat{{\varvec{\Theta }}}^\mathbf{T}\Vert _1\le & {} \Vert (\widehat{{\varvec{\Theta }}}{\widehat{{\varvec{\Gamma }}}_{\mathbf{1}}}-\mathbf{I})\widehat{{\varvec{\Theta }}}^T\Vert _1+\Vert \widehat{{\varvec{\Theta }}}^T\Vert _1\\\le & {} p^2\Vert \widehat{{\varvec{\Theta }}}{\widehat{{\varvec{\Gamma }}}_{\mathbf{1}}}\widehat{{\varvec{\Theta }}}^T-\widehat{{\varvec{\Theta }}}^T\Vert _\infty +p\Vert \widehat{{\varvec{\Theta }}}_j^T\Vert _1 \\\le & {} O_{\mathbb {P}}\left( p^2\sqrt{\max _j s_j\log (p)/n}\right) +O_{\mathbb {P}}(p\sqrt{\max _j s_j}). \end{aligned}$$

Given that \(\Vert { \widehat{{\varvec{\Omega }}}}\Vert _1=O(\Vert \sigma _\varepsilon ^2\widehat{{\varvec{\Theta }}}{ \widehat{{\varvec{\Gamma }}}_{\mathbf{1}}}\widehat{{\varvec{\Theta }}}^T\Vert _1)\) and \(p\sqrt{\log (p)}\gg \sqrt{n}\), we have

$$\begin{aligned} \Vert {\widehat{{\varvec{\Omega }}}}\Vert _1=O_{\mathbb {P}}\left( p^2\sqrt{\max _{j}s_j\log (p)/n}\right) . \end{aligned}$$

These results will be used in the proofs of the theorems below. \(\square \)

Proof of Theorem 3.7:

Recall that \( \sqrt{n}(\varvec{\hat{b}-\beta ^0})=\mathbf{t}-\varvec{\delta }\), where \(\mathbf{t}|\mathbf{X}\sim \mathcal {N}_p(0,{\widehat{\varvec{\Omega }}})\). Standardizing \(\hat{b}\) accordingly, for each \(j\in \{1,\ldots ,p\}\) we obtain \(\tilde{\beta ^0_j},\tilde{t}_j,\tilde{\delta }_j\). Here,

$$\begin{aligned} \tilde{\beta ^0_j}=\sqrt{n}\beta ^0_j{\widehat{\varvec{\Omega }}}_{jj}^{-1/2}, \quad \tilde{t}_j=t_j{\widehat{\varvec{\Omega }}}_{jj}^{-1/2}, \quad \tilde{\delta }_j=\delta _j{\widehat{\varvec{\Omega }}}_{jj}^{-1/2}. \end{aligned}$$

Then, we have

$$\begin{aligned} w_j=\tilde{\beta ^0_j}+\tilde{t}_j-\tilde{\delta }_j \end{aligned}$$

and each \(\tilde{t}_j\) has unit variance. Let \(\varvec{\tilde{\beta ^0}}=(\tilde{\beta ^0_1},\ldots ,\tilde{\beta ^0_p})^T, \varvec{\tilde{t}}=(\tilde{t}_1,\ldots ,\tilde{t}_p)^T\) and \(\varvec{\tilde{\delta }}=(\tilde{\delta }_1,\ldots ,\tilde{\delta }_p)^T\). Recall \({\varvec{\Theta }}={\varvec{\Sigma }}^{-1}\). Since \(\Vert \widehat{{\varvec{\Theta }}}{ \widehat{{\varvec{\Gamma }}}_{\mathbf{1}}}\widehat{{\varvec{\Theta }}}^T-\Theta \Vert _\infty =o_{\mathbb {P}}\left( 1\right) \) by Lemma A.1, the diagonal entries \((\widehat{{\varvec{\Theta }}}{ \widehat{{\varvec{\Gamma }}}_{\mathbf{1}}}\widehat{{\varvec{\Theta }}}^T)_{jj}\) all lie between \(C_{\min }\) and \(C_{\max }\). Consequently, using Lemma 3.6, we can obtain

$$\begin{aligned} \mathbb {P}\left\{ \Vert \varvec{\tilde{\delta }}\Vert _\infty \le (\sigma _\varepsilon \sqrt{C_{\min }})^{-1} Q_p(s_0,n)\right\} \rightarrow 1. \end{aligned}$$

Note that, under Assumption 3.3, \(Q_p(s_0,n)=o(1)\), so \(\Vert \varvec{\tilde{\delta }}\Vert _\infty =o_{\mathbb {P}}(1)\).

Subsequently, we divide the proof into two steps. Step 1 establishes an upper bound for \(\max \limits _{j\in P_0}|w_j|\), and Step 2 derives a lower bound for \(\min \limits _{j\in S_0}|w_j|\).

Step 1: Applying Theorem 3.3 in [10] with unit Gaussian random variables \(\tilde{t}_j\) and an exponential variable \(\kappa \) with expectation 1, we have

$$\begin{aligned} \max \limits _{j\in P_0}|\tilde{t}_j|^2\le \log (p_0^2/2\pi )+\log \log (p_0^2/2\pi )+2\kappa \end{aligned}$$

as \(p_0\rightarrow \infty \). Combining the bounds of \(\tilde{t}_j\) and \(\tilde{\delta }_j\), as \(p_0\rightarrow \infty \), we get

$$\begin{aligned} \max \limits _{j\in P_0}|w_j|=\max \limits _{j\in P_0}|\tilde{t}_j-\tilde{\delta }_j| \le \sqrt{\log (p_0^2/2\pi )+\log \log (p_0^2/2\pi )+2\kappa }. \end{aligned}$$

Step 2: Similar to the bound for \(\max \limits _{j\in P_0}|w_j|\) in Step 1. Note that \(s_0\le p_0\); when \(s_0\rightarrow \infty \), we see that

$$\begin{aligned} \max \limits _{j\in S_0}|\tilde{t}_j|\le & {} \sqrt{\log (s_0^2/2\pi )+\log \log (s_0^2/2\pi )+2\kappa }\\\le & {} \sqrt{\log (p_0^2/2\pi )+\log \log (p_0^2/2\pi )+2\kappa }. \end{aligned}$$

Furthermore, we obtain

$$\begin{aligned} \min \limits _{j\in S_0}|w_j|=\min \limits _{j\in S_0}|\tilde{\beta ^0_j}+\tilde{t}_j-\tilde{\delta }_j| \ge \min \limits _{j\in S_0}|\tilde{\beta ^0_j}|-\sqrt{\log (p_0^2/2\pi )+\log \log (p_0^2/2\pi )+2\kappa }. \end{aligned}$$

For notational convenience, let \(L_{p_0}=\sqrt{\log (p_0^2/2\pi )+\log \log (p_0^2/2\pi )+2\kappa }\). The final part of the proof considers the probability

$$\begin{aligned} \mathbb {P}\left\{ \min \limits _{j\in S_0}|\tilde{\beta ^0_j}|- L_{p_0} \ge L_{p_0}\right\} =\mathbb {P}\left\{ \min \limits _{j\in S_0}|\tilde{\beta ^0_j}|/2 \ge L_{p_0}\right\} . \end{aligned}$$

This probability converges to 1 as \(s_0\rightarrow \infty \) if, for some positive constant \(n_0\), the following inequality

$$\begin{aligned} \min \limits _{j\in S_0}|\tilde{\beta ^0_j}|/2\ge (1+n_0)\sqrt{\log (p_0^2/2\pi )+\log \log (p_0^2/2\pi )} \end{aligned}$$

holds. By the definition of \(|\tilde{\beta ^0_j}|\) and some algebra, the above condition is equivalent to

$$\begin{aligned} \min \limits _{j\in S_0}|\beta ^0_j|\ge 2\sigma _\varepsilon \sqrt{C_{\max }/n}\left\{ (1+n_0)\sqrt{\log (p_0^2/2\pi )+\log \log (p_0^2/2\pi )}\right\} . \end{aligned}$$

The proof of Theorem 3.7 is now complete. \(\square \)
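To give a sense of the scale of this signal-strength requirement, here is a small numerical evaluation of the right-hand side of the last display (our own illustration; the values of \(n\), \(p_0\), \(\sigma _\varepsilon \), \(C_{\max }\) and \(n_0\) are arbitrary choices):

```python
import numpy as np

def beta_min_bound(n, p0, sigma_eps, C_max, n0):
    """Evaluate the lower bound on min_{j in S_0} |beta^0_j| from the last display."""
    L = np.log(p0**2 / (2 * np.pi))
    return 2 * sigma_eps * np.sqrt(C_max / n) * (1 + n0) * np.sqrt(L + np.log(L))

# For example, with n = 200, p0 = 1000, sigma_eps = 1, C_max = 1, n0 = 0.1
print(beta_min_bound(200, 1000, 1.0, 1.0, 0.1))   # roughly 0.59
```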

1.2 Appendix B: Proof of Theorem 3.8

In this section, we turn to the proof of Theorem 3.8, together with the auxiliary lemmas it relies on.

Lemma A.2

Consider the standardized DCoCoLasso estimator \(w_j\) and the definition of \(\tilde{t}_j\). Suppose Assumptions 3.1 and 3.3 hold; then

$$\begin{aligned} p_0^{-1}\left| \mathbb {E}[V_w(k)]-\mathbb {E}[V_{\tilde{t}}(k)]\right| \rightarrow 0 \quad a.s. \quad \hbox {and} \quad p_0^{-1}\left| V_w(k)-V_{\tilde{t}}(k)\right| \rightarrow 0 \quad a.s. \end{aligned}$$

Furthermore,

$$\begin{aligned} p_0^{-2}\left| \text {var}[V_w(k)]-\text {var}[V_{\tilde{t}}(k)]\right| \rightarrow 0 \quad a.s. \quad as \quad p_0\rightarrow \infty . \end{aligned}$$

Proof

Recall that \(w_j=\tilde{\beta ^0_j}+\tilde{t}_j-\tilde{\delta }_j\). For \(j\in P_0\), let \(F_{j}\) and \(\Phi _{j}\) be the cumulative distribution functions of \(w_j\) and \(\tilde{t}_j\), respectively. Then \(\tilde{\beta ^0_j}=0\) and each \(\tilde{t}_j\) has unit variance for all \(j\in P_0\). From the proof of Theorem 3.7, we get \(\Vert \varvec{\tilde{\delta }}\Vert _\infty =o_{\mathbb {P}}(1)\). By Lemma A.1, we have \(\Vert \widehat{{\varvec{\Theta }}}{ \widehat{{\varvec{\Gamma }}}_{\mathbf{1}}}\widehat{{\varvec{\Theta }}}^T-{\varvec{\Sigma }}^{-1}\Vert _\infty =o_{\mathbb {P}}(1)\), so \(\tilde{t}\) has a nondegenerate multivariate normal distribution and \(\Phi _{j}\) is absolutely continuous with respect to Lebesgue measure on \(\mathbb {R}\). Therefore, for any \(x\in \mathbb {R}\), this yields

$$\begin{aligned} \max \limits _{j\in P_0}|F_j(x)-\Phi _j(x)|=o_{\mathbb {P}}(1). \end{aligned}$$

Further, the above equality implies

$$\begin{aligned} \max \limits _{j\in P_0}|\mathbb {I}(|w_j|\le k)-\mathbb {I}(|\tilde{t}_j|\le k)|=o_{\mathbb {P}}(1). \end{aligned}$$

By the definitions of \(V_w(k)\) and \(V_{\tilde{t}}(k)\), we have

$$\begin{aligned} p_0^{-1}|\mathbb {E}[V_w(k)]-\mathbb {E}[V_{\tilde{t}}(k)]|=o_{\mathbb {P}}(1) \quad \hbox {and} \quad p_0^{-1}|V_w(k)-V_{\tilde{t}}(k)|=o_{\mathbb {P}}(1). \end{aligned}$$

We next prove the last equality in Lemma A.2. For each distinct pair of \(i,j\in \{1,\ldots ,p\}\), define \(F_{i,j}\) as the joint CDF of \((w_i,w_j)\) and \(\Phi _{i,j}\) that of \((\tilde{t}_i,\tilde{t}_j)\). Hence, for any \(x,y\in \mathbb {R}\), we get

$$\begin{aligned} \max \limits _{i\ne j,j\in P_0}|F_{i,j}(x,y)-\Phi _{i,j}(x,y)|=o_{\mathbb {P}}(1). \end{aligned}$$

Then we have

$$\begin{aligned}&\max \limits _{i\ne j,j\in P_0}\left| \text {cov}\{\mathbb {I}(|\tilde{t}_i-\tilde{\delta }_i|>k),\mathbb {I}(|\tilde{t}_j-\tilde{\delta }_j|>k)\}\right. \\&\quad \left. - \text {cov}\{\mathbb {I}(|\tilde{t}_i|>k),\mathbb {I}(|\tilde{t}_j|>k)\}\right| =o_{\mathbb {P}}(1). \end{aligned}$$

Therefore,

$$\begin{aligned}&\frac{1}{p_0^2}|\text {var}[V_w(k)]-\text {var}[V_{\tilde{t}}(k)]| \\&\quad =\underbrace{\frac{1}{p_0^2}\left( \sum \limits _{j\in P_0}\text {var}[\mathbb {I}(|\tilde{t}_j-\tilde{\delta }_j|>k)]-\sum \limits _{j\in P_0}\text {var}[\mathbb {I}(|\tilde{t}_j|>k)] \right) }_{V_1}\\&\qquad +\frac{1}{p_0^2}\left( \sum \limits _{i\ne j,j\in P_0}\text {cov}[\mathbb {I}(|\tilde{t}_i-\tilde{\delta }_i|>k),\mathbb {I}(|\tilde{t}_j-\tilde{\delta }_j|>k)] -\text {cov}[\mathbb {I}(|\tilde{t}_i|>k),\mathbb {I}(|\tilde{t}_j|>k)] \right) \\&\quad =o_{\mathbb {P}}(1). \end{aligned}$$

Here, \(V_1\) is \(o(1)\) as \(p_0\rightarrow \infty \). This completes the proof. \(\square \)

Lemma A.3

If Assumptions 3.1 and 3.3 hold, then

$$\begin{aligned}&p^{-2}\left| \text {var}[R_{\tilde{t}}(k)]\right| =O_{\mathbb {P}}\left\{ \max \left( 1/p,\sqrt{\max _js_j\log (p)/n}\right) \right\} ,\\&p_0^{-2}\left| \text {var}[V_{\tilde{t}}(k)]\right| =O_{\mathbb {P}}\left\{ \max \left( 1/p_0,\sqrt{\max _js_j\log (p)/n}\right) \right\} . \end{aligned}$$

Proof

For \(i\ne j\), let \(\rho _{ij}\) be the correlation between \(\tilde{t}_i\) and \(\tilde{t}_j\), and \(\xi _j=\mathbb {I}(|\tilde{t}_j|\le k)\). Define the following sets:

$$\begin{aligned} \left\{ \begin{array}{l} I_{1,p}=\{(i,j): 1\le i, j\le p, i\ne j, |\rho _{ij}|<1\},\\ I_{2,p}=\{(i,j): 1\le i, j\le p, i\ne j, |\rho _{ij}|=1\}. \end{array}\right. \end{aligned}$$

Namely, \(I_{2,p}\) records distinct pairs \((\xi _i,\xi _j)\) for \(i\ne j\) such that \(\tilde{t}_i\) and \(\tilde{t}_j\) are linearly dependent. Then

$$\begin{aligned} |\text {var}[R_{\tilde{t}}(k)]|=\sum \limits _{j=1}^p\text {var}(\xi _j)+\sum \limits _{(i,j)\in I_{2,p}}\text {cov}(\xi _i,\xi _j) +\sum \limits _{(i,j)\in I_{1,p}}\text {cov}(\xi _i,\xi _j). \end{aligned}$$

The term \(\sum \limits _{j=1}^p\text {var}(\xi _j)=O(p)\). Letting \(\mathbf{R}\) be the correlation matrix of \(\tilde{t}\), we have

$$\begin{aligned} \sum \limits _{(i,j)\in I_{2,p}}\text {cov}(\xi _i,\xi _j)=O(\Vert \mathbf{R}\Vert _1)=O(\Vert {\widehat{{\varvec{\Omega }}}\Vert _{\mathbf{1}}}). \end{aligned}$$

Inequality (A.17) of [6] implies

$$\begin{aligned} \sum \limits _{(i,j)\in I_{1,p}}\text {cov}(\xi _i,\xi _j)\le C\Vert \mathbf{R}\Vert _1=O(\Vert {\widehat{{\varvec{\Omega }}}\Vert _{\mathbf{1}}}). \end{aligned}$$

Consequently, by Lemma A.1, we have that

$$\begin{aligned} p^{-2}|\text {var}[R_{\tilde{t}}(k)]|= & {} O(1/p)+2O_{\mathbb {P}}\left( \sqrt{\max _js_j\log (p)/n}\right) \\= & {} O_{\mathbb {P}}\left\{ \max \left( 1/p,\sqrt{\max _js_j\log (p)/n}\right) \right\} . \end{aligned}$$

The second equality of Lemma A.3 follows from the same argument with \(p\) replaced by \(p_0\). This completes the proof of the lemma. \(\square \)

Proof of Theorem 3.8:

To derive the theorem compactly, we give the following additional definition.

Define Marginal FDR as

$$\begin{aligned} \mathrm{mFDR}_w(k)=\frac{\mathbb {E}[V_w(k)]}{\mathbb {E}[R_w(k)\vee 1]}. \end{aligned}$$

Recall that \(R_w(k)=\sum \limits _{j=1}^p\mathbb {I}(|w_j|>k)\) and \(V_w(k)=\sum \limits _{j\in P_0}\mathbb {I}(|w_j|>k)\); we then define

$$\begin{aligned} R_{\tilde{t}}(k)=\sum \limits _{j=1}^p\mathbb {I}(|\tilde{t}_j|>k) \quad and \quad V_{\tilde{t}}(k)=\sum \limits _{j\in P_0}\mathbb {I}(|\tilde{t}_j|>k). \end{aligned}$$

In addition, we see \(R_w(k)\vee 1=R_w(k)\) for all p large enough.

Next, we divide the proof into three steps.

$$\begin{aligned} \mathrm{Step} \quad \mathrm{A}&:&\lim \limits _{p\rightarrow \infty }\left| \mathrm{FDP}_w(k)-\widehat{\mathrm{FDP}}(k)\right| =0 \qquad \qquad \quad a.s. \\ \mathrm{Step} \quad \mathrm{B}&:&\lim \limits _{p\rightarrow \infty }\left| \widehat{\mathrm{FDP}}(k)-\frac{\mathbb {E}[V_w(k)]}{\mathbb {E}[R_w(k)]}\right| =0 \qquad \quad \qquad a.s.\\ \mathrm{Step} \quad \mathrm{C}&:&\lim \limits _{p\rightarrow \infty }\left| \widehat{\mathrm{FDP}}(k)-\mathrm{FDR}_w(k)\right| =0 \qquad \qquad \quad a.s. \end{aligned}$$

Step A. By Lemma A.3, under Assumption 3.3, we can apply Chebyshev's inequality to \(p_0^{-1}V_{\tilde{t}}(k)\), which gives

$$\begin{aligned} p_0^{-1}|V_{\tilde{t}}(k)-\mathbb {E}[V_{\tilde{t}}(k)]|\rightarrow 0 \quad a.s. \quad p_0\rightarrow \infty . \end{aligned}$$

Combining this with \(p_0^{-1}|V_w(k)-V_{\tilde{t}}(k)|\rightarrow 0 \quad a.s.\quad p_0\rightarrow \infty \) from Lemma A.2, we get

$$\begin{aligned} p_0^{-1}|V_w(k)-\mathbb {E}[V_{\tilde{t}}(k)]|\rightarrow 0 \quad a.s.\quad p_0\rightarrow \infty . \end{aligned}$$

Subsequently, we show that \(R_w(k)\) is bounded below. Because \(R_w(k)\ge V_w(k)\) and \(\mathbb {E}[V_{\tilde{t}}(k)]=\sum \limits _{j\in P_0}\mathbb {P}(|\tilde{t}_j|>k)=2p_0\Phi (-k)\), we have

$$\begin{aligned} \lim \limits _{p_0\rightarrow \infty }\mathbb {P}\{R_w(k)\ge 2p_0\Phi (-k)\}=1. \end{aligned}$$

Finally, we have

$$\begin{aligned} \left| \frac{p_0^{-1}V_w(k)}{p^{-1}R_w(k)}-\frac{p_0^{-1}\mathbb {E}[V_{\tilde{t}}(k)]}{p^{-1}R_w(k)}\right| \rightarrow 0 \quad a.s.\quad p_0\rightarrow \infty . \end{aligned}$$

Since \(p=p_0+s_0\) and \(s_0/p=o(1)\), we have

$$\begin{aligned} \left| \frac{V_w(k)}{R_w(k)}-\frac{\mathbb {E}[V_{\tilde{t}}(k)]}{R_w(k)}\right| \rightarrow 0 \quad a.s.\quad p_0\rightarrow \infty . \end{aligned}$$

This proves \(\lim \limits _{p\rightarrow \infty }\left| \mathrm{FDP}_w(k)-\widehat{\mathrm{FDP}}(k)\right| =0 \quad a.s.\), completing Step A.

Step B. From Step A, we know that \(R_w(k)\) is bounded away from 0 uniformly in p with probability tending to 1. Therefore, it is sufficient to show

$$\begin{aligned} \lim \limits _{p\rightarrow \infty }\left| \frac{\mathbb {E}[V_{\tilde{t}}(k)]}{R_w(k)}-\frac{\mathbb {E}[V_w(k)]}{\mathbb {E}[R_w(k)]}\right| =0 \quad a.s. \end{aligned}$$

Given that \(p_0^{-1}\left| \mathbb {E}[V_w(k)]-\mathbb {E}[V_{\tilde{t}}(k)]\right| \rightarrow 0 \quad a.s.\) by Lemma A.2, we next show that

$$\begin{aligned} p^{-1}|R_w(k)-\mathbb {E}[R_w(k)]|\rightarrow 0 \quad a.s. \quad p\rightarrow \infty . \end{aligned}$$

It thus suffices to show \(p^{-2}\text {var}[R_w(k)]\rightarrow 0 \quad a.s. \quad p\rightarrow \infty \). To accomplish this, we will prove \( p_0^{-2}\text {var}[V_w(k)]\rightarrow 0\) and \(p^{-2}\text {var}[R_w(k)]-p_0^{-2}\text {var}[V_w(k)]\rightarrow 0\).

For \( p_0^{-2}\text {var}[V_w(k)]\rightarrow 0\): by Lemma A.3, when \(n\gg s_j\log p\), we have

$$\begin{aligned} p_0^{-2}\left| \text {var}[V_{\tilde{t}}(k)]\right| \rightarrow 0 \quad a.s. \quad p_0\rightarrow \infty . \end{aligned}$$

Combining

$$\begin{aligned} p_0^{-2}\left| \text {var}[V_w(k)]-\text {var}[V_{\tilde{t}}(k)]\right| \rightarrow 0 \quad a.s. \quad as \quad p_0\rightarrow \infty \end{aligned}$$

in Lemma A.2, we obtain the result.

Then for \(p^{-2}\text {var}[R_w(k)]-p_0^{-2}\text {var}[V_w(k)]\rightarrow 0\), observe that

$$\begin{aligned} p^{-1}R_w(k)= & {} p^{-1}V_w(k)+p^{-1}\sum \limits _{j \in S_0}\mathbb {I}(|w_j|>k)\\= & {} (p_0/p)[p_0^{-1}V_w(k)]+(s_0/p)\sum \limits _{j \in S_0}\mathbb {I}(|w_j|>k)/s_0, \end{aligned}$$

then we have \(p^{-2}\text {var}[R_w(k)]-p_0^{-2}\text {var}[V_w(k)]\rightarrow 0\) because \(p_0/p\rightarrow 1\) and \(s_0/p=o(1)\).

Step C. Following the results in Step A and Step B, we have

$$\begin{aligned} \lim \limits _{p\rightarrow \infty }\left| \mathrm{FDP}_w(k)-\frac{\mathbb {E}[V_w(k)]}{\mathbb {E}[R_w(k)]}\right| =0 \quad a.s. \end{aligned}$$

Since \(R_w(k)\) is bounded away from 0 uniformly in p,

$$\begin{aligned} \left| \mathrm{FDP}_w(k)-\frac{\mathbb {E}[V_w(k)]}{\mathbb {E}[R_w(k)]}\right| \le 2 \quad a.s. \end{aligned}$$

Applying the dominated convergence theorem, we obtain

$$\begin{aligned} \lim \limits _{p\rightarrow \infty }\mathbb {E}\left[ \mathrm{FDP}_w(k)-\frac{\mathbb {E}[V_w(k)]}{\mathbb {E}[R_w(k)]}\right] =0. \end{aligned}$$

Finally, we decompose \(\widehat{\mathrm{FDP}}(k)-\mathrm{FDR}_w(k)\) using the above results:

$$\begin{aligned} \widehat{\mathrm{FDP}}(k)-\mathrm{FDR}_w(k)= & {} \widehat{\mathrm{FDP}}(k)-\mathbb {E}[\mathrm{FDP}_w(k)]\\= & {} \widehat{\mathrm{FDP}}(k)-\frac{\mathbb {E}[V_w(k)]}{\mathbb {E}[R_w(k)]}-\left( \mathbb {E}[\mathrm{FDP}_w(k)]-\frac{\mathbb {E}[V_w(k)]}{\mathbb {E}[R_w(k)]}\right) \\= & {} \widehat{\mathrm{FDP}}(k)-\frac{\mathbb {E}[V_w(k)]}{\mathbb {E}[R_w(k)]}-\mathbb {E}\left( \mathrm{FDP}_w(k)-\frac{\mathbb {E}[V_w(k)]}{\mathbb {E}[R_w(k)]}\right) . \end{aligned}$$

Therefore, we have \(\lim \limits _{p\rightarrow \infty }\left| \widehat{\mathrm{FDP}}(k)-\mathrm{FDR}_w(k)\right| =0\). \(\square \)

Proof of Corollary 3.9:

By the definition of \(k_\alpha \) and \(\widehat{\mathrm{FDP}}(k_\alpha )\), we have

$$\begin{aligned} \mathbb {P}\{\widehat{\mathrm{FDP}}(k_\alpha )\le \alpha \}=\mathbb {P}\left\{ \frac{2p\Phi (-k_\alpha )}{R_w(k_\alpha )}\le \alpha \right\} =1. \end{aligned}$$

Further, for a small fixed constant \(\alpha \), we see that \(\mathbb {P}\{2p\Phi (-k_\alpha )\le p\alpha \}=1\). This implies that \(k_\alpha \) does not approach 0 as \(p\rightarrow \infty \); hence we need only consider positive constant values of \(k_\alpha \), which satisfy the conditions of the theorem.
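For concreteness, here is a minimal sketch (ours, under the assumption that \(\hat{b}\) and \(\widehat{\varvec{\Omega }}\) have been computed as sketched after Lemma 3.5, and that \(w_j=\sqrt{n}\hat{b}_j\widehat{\varvec{\Omega }}_{jj}^{-1/2}\) is the standardized statistic) of how a data-driven threshold satisfying \(\widehat{\mathrm{FDP}}(k)\le \alpha \) could be obtained; the grid search over the observed \(|w_j|\) is one simple way to realize \(k_\alpha \):

```python
import numpy as np
from scipy.stats import norm

def select_threshold(b_hat, Omega_hat, n, alpha=0.1):
    """Standardize b_hat and pick the smallest k with estimated FDP <= alpha (sketch)."""
    p = b_hat.shape[0]
    w = np.sqrt(n) * b_hat / np.sqrt(np.diag(Omega_hat))   # standardized statistics w_j
    for k in np.sort(np.abs(w)):                            # candidate thresholds
        R = np.sum(np.abs(w) > k)                           # R_w(k): number of selections
        if 2 * p * norm.cdf(-k) / max(R, 1) <= alpha:       # estimated FDP at threshold k
            return k, np.abs(w) > k                         # threshold and selected set
    return np.inf, np.zeros(p, dtype=bool)
```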

Theorem 3.8 tells us that, as \(p\rightarrow \infty \),

$$\begin{aligned} \left| \mathrm{FDP}_w(k_\alpha )-\widehat{\mathrm{FDP}}(k_\alpha )\right| \rightarrow 0 . \end{aligned}$$

Then, for any positive constant t, using Lebesgue's dominated convergence theorem, we have

$$\begin{aligned}&\lim \limits _{p\rightarrow \infty }\mathbb {P}\{|\mathrm{FDP}_w(k_\alpha )-\widehat{\mathrm{FDP}}(k_\alpha )|> t\}\\&\quad = \lim \limits _{p\rightarrow \infty }\mathbb {E}\left[ \mathbb {E}[\mathbb {I}\{|\mathrm{FDP}_w(k_\alpha )-\widehat{\mathrm{FDP}}(k_\alpha )|> t\}|k_\alpha ]\right] \\&\quad =\mathbb {E}\left[ \lim \limits _{p\rightarrow \infty } \mathbb {E}[\mathbb {I}\{|\mathrm{FDP}_w(k_\alpha )-\widehat{\mathrm{FDP}}(k_\alpha )|> t\}|k_\alpha ] \right] \\&\quad = 0. \end{aligned}$$

Therefore, \(\lim \limits _{p\rightarrow \infty }\mathbb {P}\{\mathrm{FDP}_w(k_\alpha )\le \alpha \}=1\).

The statement \(\lim \limits _{p\rightarrow \infty }\mathbb {P}\{\mathrm{FDR}_w(k_\alpha )\le \alpha \}=1\) can be proved by a similar argument. Combining this with \(\left| \widehat{\mathrm{FDP}}(k_\alpha )-\mathrm{FDR}_w(k_\alpha )\right| \rightarrow 0\) from Theorem 3.8, we obtain

$$\begin{aligned} \lim \limits _{p\rightarrow \infty }\mathbb {P}\{|\mathrm{FDR}_w(k_\alpha )-\widehat{\mathrm{FDP}}(k_\alpha )|> t\}=0. \end{aligned}$$

Together with \(\mathbb {P}\{\widehat{\mathrm{FDP}}(k_\alpha )\le \alpha \}=1\), we complete the proof. \(\square \)


Cite this article

Huang, X., Bao, N., Xu, K. et al. Variable Selection in High-Dimensional Error-in-Variables Models via Controlling the False Discovery Proportion. Commun. Math. Stat. 10, 123–151 (2022). https://doi.org/10.1007/s40304-020-00233-4
