A unified precision matrix estimation framework via sparse column-wise inverse operator under weak sparsity

Annals of the Institute of Statistical Mathematics

Abstract

In this paper, we estimate the high-dimensional precision matrix under the weak sparsity condition, where many entries are nearly zero. We revisit the sparse column-wise inverse operator (SCIO) estimator and derive its general error bounds under the weak sparsity condition. A unified framework is established to deal with various cases, including heavy-tailed data, nonparanormal data, and matrix-variate data. The new methods achieve the same convergence rates as existing methods and can be implemented efficiently.



Acknowledgements

We thank the Editor, an Associate Editor, and anonymous reviewers for their insightful comments.

Author information

Correspondence to Cheng Wang.


Funding

Wang’s research is supported by the National Natural Science Foundation of China (11971017, 12031005) and the NSF of Shanghai (21ZR1432900). Liu’s research is supported by the National Program on Key Basic Research Project (2018AAA0100704), NSFC Grants 11825104 and 11690013, the Youth Talent Support Program, and the Shanghai Municipal Science and Technology Major Project (2021SHZDZX0102).

Appendix

This section includes all the technical proofs of the main theorems and some necessary lemmas.

1.1 Proof of Theorem 1

Let \(\varvec{e}\) be a column of the identity matrix \(\mathbf {I}\); then \(\varvec{\beta }^*=\varvec{\varSigma }^{-1} \varvec{e}\) is the corresponding column of the target precision matrix \(\varvec{\varOmega }\). For an arbitrary estimator \(\widehat{\varvec{\varSigma }}\) of \(\varvec{\varSigma }\), we consider the SCIO estimator

$$\begin{aligned} \widehat{\varvec{\beta }}=\underset{\varvec{\beta }\in {\mathbb {R}}^{p}}{{{\,\mathrm{arg\,min}\,}}}\left\{ \frac{1}{2} \varvec{\beta }^\top \widehat{\varvec{\varSigma }}\varvec{\beta }-\varvec{e}^\top \varvec{\beta }+\lambda |\varvec{\beta }|_{1}\right\} . \end{aligned}$$
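
This column problem is a lasso-type quadratic program, so any standard \(\ell _1\) solver applies. Below is a minimal coordinate-descent sketch in numpy; it is an illustration rather than the authors' implementation, and the function name scio_column, the sweep cap, and the tolerance are our own choices. Each coordinate update solves its one-dimensional KKT condition exactly by soft-thresholding.

```python
import numpy as np

def soft_threshold(z, lam):
    """Soft-thresholding operator: sgn(z) * max(|z| - lam, 0)."""
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

def scio_column(Sigma_hat, j, lam, n_sweeps=500, tol=1e-8):
    """Coordinate descent for the column problem
        min_beta 0.5 * beta' Sigma_hat beta - e_j' beta + lam * |beta|_1.
    Each coordinate update solves its one-dimensional KKT condition
    exactly by soft-thresholding; requires Sigma_hat[k, k] > 0."""
    p = Sigma_hat.shape[0]
    beta = np.zeros(p)
    e = np.zeros(p)
    e[j] = 1.0
    r = e - Sigma_hat @ beta  # residual e - Sigma_hat beta of the KKT system
    for _ in range(n_sweeps):
        max_change = 0.0
        for k in range(p):
            old = beta[k]
            # partial residual with coordinate k's own contribution removed
            z = r[k] + Sigma_hat[k, k] * old
            beta[k] = soft_threshold(z, lam) / Sigma_hat[k, k]
            if beta[k] != old:
                r -= Sigma_hat[:, k] * (beta[k] - old)
                max_change = max(max_change, abs(beta[k] - old))
        if max_change < tol:
            break
    return beta
```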

By the KKT condition, we have

$$\begin{aligned} \widehat{\varvec{\varSigma }}\widehat{\varvec{\beta }}-\varvec{e}+\lambda {\text {sgn}}(\widehat{\varvec{\beta }})=0, \end{aligned}$$
(6)

which ensures the basic inequality

$$\begin{aligned} (\widehat{\varvec{\beta }}-\varvec{\beta }^*)^\top (\widehat{\varvec{\varSigma }}\widehat{\varvec{\beta }}-\varvec{e})\le -\lambda |\widehat{\varvec{\beta }}|_1+\lambda |\varvec{\beta }^*|_1. \end{aligned}$$
(7)

Writing the difference vector as \(\varvec{h}=\varvec{\beta }^*-\widehat{\varvec{\beta }}\), we have

$$\begin{aligned} (\widehat{\varvec{\beta }}-\varvec{\beta }^*)^\top (\widehat{\varvec{\varSigma }}\widehat{\varvec{\beta }}-\varvec{e})&=(\widehat{\varvec{\beta }}-\varvec{\beta }^*)^\top (\widehat{\varvec{\varSigma }}(\widehat{\varvec{\beta }}-\varvec{\beta }^*)+(\widehat{\varvec{\varSigma }}-\varvec{\varSigma })\varvec{\beta }^*)\nonumber \\&\ge (\widehat{\varvec{\beta }}-\varvec{\beta }^*)^\top (\widehat{\varvec{\varSigma }}-\varvec{\varSigma })\varvec{\beta }^*\nonumber \\&\ge -|(\widehat{\varvec{\varSigma }}-\varvec{\varSigma })\varvec{\beta }^*|_\infty |\varvec{h}|_1\nonumber \\&\ge -\Vert \varvec{\varOmega }\Vert _{L_1}\Vert \widehat{\varvec{\varSigma }}-\varvec{\varSigma }\Vert _\infty |\varvec{h}|_1. \end{aligned}$$
(8)

Combined with the basic inequality (7), this reduces to

$$\begin{aligned} -\lambda |\widehat{\varvec{\beta }}|_1+\lambda |\varvec{\beta }^*|_1 \ge -\Vert \varvec{\varOmega }\Vert _{L_1}\Vert \widehat{\varvec{\varSigma }}-\varvec{\varSigma }\Vert _\infty |\varvec{h}|_1. \end{aligned}$$

Since \(\lambda \ge 3\Vert \varvec{\varOmega }\Vert _{L_1}\Vert \widehat{\varvec{\varSigma }}-\varvec{\varSigma }\Vert _\infty \) by assumption, we obtain

$$\begin{aligned} 3\lambda (|\varvec{\beta }^*|_1-|\widehat{\varvec{\beta }}|_1)\ge -\lambda |\varvec{h}|_1, \end{aligned}$$

and hence

$$\begin{aligned} 3(|\varvec{\beta }^*|_1-|\widehat{\varvec{\beta }}|_1) \ge -|\varvec{h}|_1. \end{aligned}$$

For any index set J, we have

$$\begin{aligned} |\varvec{h}_{J^c}|_1&\le |\varvec{\beta }^*_{J^c}|_1+ |\widehat{\varvec{\beta }}_{J^c}|_1= |\varvec{\beta }^*_{J^c}|_1+ |\widehat{\varvec{\beta }}|_1-|\widehat{\varvec{\beta }}_{J}|_1\\&\le |\varvec{\beta }^*_{J^c}|_1+|\varvec{\beta }^*|_1+\frac{1}{3} |\varvec{h}|_1-|\widehat{\varvec{\beta }}_{J}|_1= 2 |\varvec{\beta }^*_{J^c}|_1+ |\varvec{\beta }^*_{J}|_1+\frac{1}{3} |\varvec{h}|_1-|\widehat{\varvec{\beta }}_{J}|_1\\&\le 2 |\varvec{\beta }^*_{J^c}|_1+|\varvec{h}_{J}|_1+\frac{1}{3} |\varvec{h}|_1= 2 |\varvec{\beta }^*_{J^c}|_1+\frac{4}{3}|\varvec{h}_{J}|_1+\frac{1}{3}|\varvec{h}_{J^c}|_1, \end{aligned}$$

and by rearranging the inequality, we get an important relation

$$\begin{aligned} |\varvec{h}_{J^c}|_1\le 2|\varvec{h}_J|_1+3|\varvec{\beta }^*_{J^c}|_1. \end{aligned}$$
(9)

By the KKT condition (6), we have \(|\widehat{\varvec{\varSigma }}\widehat{\varvec{\beta }}-\varvec{e}|_\infty \le \lambda \) and

$$\begin{aligned} |\widehat{\varvec{\varSigma }}{\varvec{\beta }^*}-\varvec{e}|_\infty \le \Vert \widehat{\varvec{\varSigma }}-\varvec{\varSigma }\Vert _\infty |\varvec{\beta }^*|_{1} \le \Vert \widehat{\varvec{\varSigma }}-\varvec{\varSigma }\Vert _\infty \Vert \varvec{\varOmega }\Vert _{L_1}\le \frac{1}{3}\lambda . \end{aligned}$$

Thus, we obtain

$$\begin{aligned} |\widehat{\varvec{\varSigma }}\varvec{h}|_\infty \le |\widehat{\varvec{\varSigma }}\widehat{\varvec{\beta }}-\varvec{e}|_\infty +|\widehat{\varvec{\varSigma }}{\varvec{\beta }^*}-\varvec{e}|_\infty \le \frac{4}{3}\lambda , \end{aligned}$$

and

$$\begin{aligned} |\varvec{\varSigma }\varvec{h}|_\infty \le |(\widehat{\varvec{\varSigma }}-\varvec{\varSigma })\varvec{h}|_\infty +|\widehat{\varvec{\varSigma }}\varvec{h}|_\infty \le \Vert \widehat{\varvec{\varSigma }}-\varvec{\varSigma }\Vert _\infty |\varvec{h}|_1+\frac{4}{3}\lambda . \end{aligned}$$

Then, we conclude that

$$\begin{aligned} |\varvec{h}|_\infty =|\varvec{\varOmega }\varvec{\varSigma }\varvec{h}|_\infty \le \Vert \varvec{\varOmega }\Vert _{L_1} |\varvec{\varSigma }\varvec{h}|_\infty&\le \Vert \varvec{\varOmega }\Vert _{L_1} \Vert \widehat{\varvec{\varSigma }}-\varvec{\varSigma }\Vert _\infty |\varvec{h}|_1+\frac{4}{3} \Vert \varvec{\varOmega }\Vert _{L_1} \lambda \nonumber \\&\le \frac{1}{3} \lambda |\varvec{h}|_1+\frac{4}{3} \Vert \varvec{\varOmega }\Vert _{L_1} \lambda , \end{aligned}$$
(10)

and

$$\begin{aligned} |\varvec{h}_J|_1\le |J| |\varvec{h}|_\infty \le \frac{1}{3} \lambda |J| ( |\varvec{h}|_1+4 \Vert \varvec{\varOmega }\Vert _{L_1}). \end{aligned}$$
(11)

Next, we split the argument into two cases.

Case 1: If \(|\varvec{h}_{J^c}|_1\ge 3|\varvec{h}_J|_1\), by the inequality (9), we have \(|\varvec{h}_J|_1\le 3|\varvec{\beta }^*_{J^c}|_1\) and then

$$\begin{aligned} |\varvec{h}|_1=|\varvec{h}_{J^c}|_1+ |\varvec{h}_{J}|_1 \le 3|\varvec{h}_{J}|_1+3|\varvec{\beta }^*_{J^c}|_1\le 12|\varvec{\beta }^*_{J^c}|_1 \end{aligned}$$
(12)

where the first inequality uses the fact (9) again.

Case 2: Otherwise, we may assume \(|\varvec{h}_{J^c}|_1< 3 |\varvec{h}_J|_1\) and then \( |\varvec{h}|_1 \le 4|\varvec{h}_J|_1\). By the bound (11),

$$\begin{aligned} |\varvec{h}|_1 \le 4|\varvec{h}_J|_1 \le \frac{4}{3} \lambda |J| ( |\varvec{h}|_1+4 \Vert \varvec{\varOmega }\Vert _{L_1}). \end{aligned}$$

If the relation \(\lambda |J| \le \frac{1}{2}\) holds, we conclude that

$$\begin{aligned} |\varvec{h}|_1 \le 16 \lambda |J| \Vert \varvec{\varOmega }\Vert _{L_1}. \end{aligned}$$
(13)

Now we construct an index set J that controls the bounds (12) and (13) simultaneously. To this end, we consider the index set

$$\begin{aligned} J=\{j|~|\varvec{\beta }_j^*|> t\}, \end{aligned}$$

for some \(t>0\). With this setting,

$$\begin{aligned} |\varvec{\beta }^{*}_{J^c}|_1=\sum _{j \in J^c}|\varvec{\beta }^{*}_j|\le t^{1-q} \sum _{j \in J^c} |\varvec{\beta }^{*}_j|^{q} \le t^{1-q}s_p, \end{aligned}$$
(14)

and for the cardinality of J, we have

$$\begin{aligned} |J|\le t^{-q}\sum _{j \in J}\left| \varvec{\beta }^{*}_{j}\right| ^q \le t^{-q}s_p. \end{aligned}$$
(15)
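
As a quick numerical illustration of (14) and (15), the following snippet checks both bounds on a synthetic weakly sparse vector; the polynomially decaying \(\varvec{\beta }^*\) and the values of \(q\) and \(t\) are arbitrary choices for the example.

```python
import numpy as np

q, t = 0.5, 0.1
# an illustrative weakly sparse vector: polynomially decaying entries
beta_star = 1.0 / np.arange(1, 201) ** 2
s_p = np.sum(np.abs(beta_star) ** q)      # weak-sparsity radius sum_j |beta*_j|^q

J = np.abs(beta_star) > t                 # index set J = {j : |beta*_j| > t}
tail_l1 = np.abs(beta_star[~J]).sum()     # |beta*_{J^c}|_1

assert tail_l1 <= t ** (1 - q) * s_p      # bound (14)
assert J.sum() <= t ** (-q) * s_p         # bound (15)
print(tail_l1, t ** (1 - q) * s_p, int(J.sum()), t ** (-q) * s_p)
```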

Combining the bounds (12) and (13), we get

$$\begin{aligned} |\varvec{h}|_1 \le \max \{12|\varvec{\beta }^*_{J^c}|_1, 16 \lambda |J| \Vert \varvec{\varOmega }\Vert _{L_1} \}\le \max \{12 t^{1-q}s_p, 16 \lambda t^{-q}s_p \Vert \varvec{\varOmega }\Vert _{L_1} \}, \end{aligned}$$

and setting \(t=\lambda \Vert \varvec{\varOmega }\Vert _{L_1}\) yields the conclusion

$$\begin{aligned} |\varvec{\beta }^*-\widehat{\varvec{\beta }}|_1\le 16\Vert \varvec{\varOmega }\Vert _{L_1}^{1-q}\lambda ^{1-q}s_p. \end{aligned}$$

It remains to check \(\lambda |J| \le \frac{1}{2}\). With \(t=\lambda \Vert \varvec{\varOmega }\Vert _{L_1}\), we have

$$\begin{aligned} \lambda |J| \le \lambda (\lambda \Vert \varvec{\varOmega }\Vert _{L_1})^{-q} s_p=\lambda ^{1-q} ( \Vert \varvec{\varOmega }\Vert _{L_1})^{-q} s_p<\frac{1}{2}, \end{aligned}$$

which holds by the assumption.

Finally, invoking (10), we conclude

$$\begin{aligned} |\varvec{\beta }^*-\widehat{\varvec{\beta }}|_\infty&\le \frac{1}{3}\lambda |\varvec{h}|_1+\frac{4}{3}\Vert \varvec{\varOmega }\Vert _{L_1}\lambda \\&\le \frac{16}{3}\Vert \varvec{\varOmega }\Vert _{L_1}^{1-q}\lambda ^{2-q}s_p+\frac{4}{3}\Vert \varvec{\varOmega }\Vert _{L_1}\lambda \\&\le 4\Vert \varvec{\varOmega }\Vert _{L_1}\lambda , \end{aligned}$$

where we use the fact \(\lambda ^{1-q} ( \Vert \varvec{\varOmega }\Vert _{L_1})^{-q} s_p<\frac{1}{2}\) again. The proof is completed. \(\square \)
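
Before moving on to Theorem 2, here is a small simulation sketch of the \(\ell _1\) bound just derived: it draws Gaussian data with a known AR(1) covariance (whose inverse is tridiagonal, hence weakly sparse), sets \(\lambda \) at the smallest value permitted by the assumption, and compares \(|\widehat{\varvec{\beta }}-\varvec{\beta }^*|_1\) with the bound. It assumes the scio_column solver sketched earlier; the covariance model, sample size, and \(q\) are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, q = 500, 30, 0.5

# AR(1) covariance; its inverse is tridiagonal, hence weakly sparse
Sigma = 0.5 ** np.abs(np.subtract.outer(np.arange(p), np.arange(p)))
Omega = np.linalg.inv(Sigma)
X = rng.multivariate_normal(np.zeros(p), Sigma, size=n)
Sigma_hat = np.cov(X, rowvar=False, bias=True)

Omega_L1 = np.abs(Omega).sum(axis=0).max()              # matrix L1 norm of Omega
lam = 3 * Omega_L1 * np.abs(Sigma_hat - Sigma).max()    # smallest admissible lambda

j = 0
beta_star = Omega[:, j]
s_p = np.sum(np.abs(beta_star) ** q)
beta_hat = scio_column(Sigma_hat, j, lam)               # solver sketched above
lhs = np.abs(beta_hat - beta_star).sum()                # |beta_hat - beta*|_1
rhs = 16 * Omega_L1 ** (1 - q) * lam ** (1 - q) * s_p   # Theorem 1 bound
print(lhs, rhs)  # theory gives lhs <= rhs once lam^{1-q} Omega_L1^{-q} s_p < 1/2
```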

1.2 Proof of Theorem 2

Note that the result of Theorem 1 holds uniformly for all \(i=1,\dots ,p\), that is,

$$\begin{aligned} \max _{i=1,\dots ,p} |\varvec{\beta }^*_i-\widehat{\varvec{\beta }}_i|_1\le 16\Vert \varvec{\varOmega }\Vert _{L_1}^{1-q}\lambda ^{1-q}s_p,~\hbox {and}~\max _{i=1,\dots ,p} |\varvec{\beta }^*_i-\widehat{\varvec{\beta }}_i|_\infty \le 4\Vert \varvec{\varOmega }\Vert _{L_1}\lambda . \end{aligned}$$

By the construction of \(\tilde{\varvec{\varOmega }}\), it is easy to show

$$\begin{aligned} \Vert \tilde{\varvec{\varOmega }}-\varvec{\varOmega }\Vert _\infty \le \max _{i=1,\cdots ,p} |\varvec{\beta }^*_i-\widehat{\varvec{\beta }}_i|_\infty \le 4\Vert \varvec{\varOmega }\Vert _{L_1}\lambda . \end{aligned}$$
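
The construction referred to here is presumably the entry-wise min-magnitude symmetrization familiar from CLIME-type estimators; that rule also delivers the property \(|\tilde{\varvec{\beta }}|_1 \le |\widehat{\varvec{\beta }}|_1\) used below. A one-function numpy sketch under that assumption:

```python
import numpy as np

def symmetrize_min(Omega_hat):
    """Entry-wise symmetrization: for each (i, j), keep whichever of
    Omega_hat[i, j] and Omega_hat[j, i] has smaller magnitude (exact
    ties keep both entries, a measure-zero event for continuous data).
    Column-wise, beta_tilde_i satisfies |beta_tilde_i|_1 <= |beta_hat_i|_1."""
    keep = np.abs(Omega_hat) <= np.abs(Omega_hat.T)
    return np.where(keep, Omega_hat, Omega_hat.T)
```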

Next, we study the effect of the symmetrization step. For a fixed \(i \in \{1,\dots ,p\}\), we write

$$\begin{aligned} \tilde{\varvec{\beta }}=\tilde{\varvec{\beta }}_i,~\widehat{\varvec{\beta }}=\widehat{\varvec{\beta }}_i,~\varvec{\beta }^*=\varvec{\beta }^*_i. \end{aligned}$$

Since \(|\tilde{\varvec{\beta }}|_1 \le |\widehat{\varvec{\beta }}|_1\) by our construction, for the index set J, we have

$$\begin{aligned} |\tilde{\varvec{\beta }}_{J^c}|_1=|\tilde{\varvec{\beta }}|_1-|\tilde{\varvec{\beta }}_{J}|_1\le |\widehat{\varvec{\beta }}|_1-|\tilde{\varvec{\beta }}_{J}|_1\le |(\widehat{\varvec{\beta }}-\tilde{\varvec{\beta }})_{J}|_1+|\widehat{\varvec{\beta }}_{J^c}|_1, \end{aligned}$$

which yields

$$\begin{aligned} |\tilde{\varvec{\beta }}-\widehat{\varvec{\beta }}|_1\le |(\widehat{\varvec{\beta }}-\tilde{\varvec{\beta }})_{J}|_1+|\widehat{\varvec{\beta }}_{J^c}|_1+|\tilde{\varvec{\beta }}_{J^c}|_1\le 2\{|(\widehat{\varvec{\beta }}-\tilde{\varvec{\beta }})_{J}|_1+|\widehat{\varvec{\beta }}_{J^c}|_1 \}. \end{aligned}$$
(16)

Recall the index set \(J=\{j|~|\varvec{\beta }_j^*|> \lambda \Vert \varvec{\varOmega }\Vert _{L_1}\}\) from the proof of Theorem 1 (with \(t=\lambda \Vert \varvec{\varOmega }\Vert _{L_1}\)), together with the bounds

$$\begin{aligned} |\varvec{\beta }^{*}_{J^c}|_1 \le ( \lambda \Vert \varvec{\varOmega }\Vert _{L_1} )^{1-q}s_p,~\hbox {and}~ |J| \le ( \lambda \Vert \varvec{\varOmega }\Vert _{L_1})^{-q}s_p. \end{aligned}$$

Thus, since \(\Vert \tilde{\varvec{\varOmega }}-\varvec{\varOmega }\Vert _\infty \le 4\Vert \varvec{\varOmega }\Vert _{L_1}\lambda \) and the triangle inequality give \(|\widehat{\varvec{\beta }}-\tilde{\varvec{\beta }}|_\infty \le |\widehat{\varvec{\beta }}-\varvec{\beta }^*|_\infty +|\tilde{\varvec{\beta }}-\varvec{\beta }^*|_\infty \le 8\Vert \varvec{\varOmega }\Vert _{L_1}\lambda \), we have

$$\begin{aligned} |(\widehat{\varvec{\beta }}-\tilde{\varvec{\beta }})_{J}|_1\le |J| |\widehat{\varvec{\beta }}-\tilde{\varvec{\beta }}|_\infty \le ( \lambda \Vert \varvec{\varOmega }\Vert _{L_1})^{-q}s_p \cdot 8 \Vert \varvec{\varOmega }\Vert _{L_1}\lambda =8 ( \lambda \Vert \varvec{\varOmega }\Vert _{L_1})^{1-q}s_p, \end{aligned}$$

and

$$\begin{aligned} |\widehat{\varvec{\beta }}_{J^c}|_1\le |(\widehat{\varvec{\beta }}-\varvec{\beta }^*)_{J^c}|_1+|\varvec{\beta }^{*}_{J^c}|_1\le |\widehat{\varvec{\beta }}-\varvec{\beta }^*|_1+|\varvec{\beta }^{*}_{J^c}|_1\le 17 ( \lambda \Vert \varvec{\varOmega }\Vert _{L_1} )^{1-q}s_p. \end{aligned}$$

Invoking the bound (16), we get

$$\begin{aligned} |\tilde{\varvec{\beta }}-\widehat{\varvec{\beta }}|_1\le 50 ( \lambda \Vert \varvec{\varOmega }\Vert _{L_1} )^{1-q}s_p, \end{aligned}$$

which ensures

$$\begin{aligned} |\tilde{\varvec{\beta }}-\varvec{\beta }^*|_1\le |\tilde{\varvec{\beta }}-\widehat{\varvec{\beta }}|_1+|\widehat{\varvec{\beta }}-\varvec{\beta }^*|_1\le 66 ( \lambda \Vert \varvec{\varOmega }\Vert _{L_1} )^{1-q}s_p. \end{aligned}$$

Since the above bound holds uniformly for all \(i=1,\dots ,p\), we conclude

$$\begin{aligned} \Vert \tilde{\varvec{\varOmega }}-\varvec{\varOmega }\Vert _{L_1} \le \max _{i=1,\dots ,p}|\tilde{\varvec{\beta }}_i-\varvec{\beta }^*_i|_1\le 66 ( \lambda \Vert \varvec{\varOmega }\Vert _{L_1} )^{1-q}s_p. \end{aligned}$$

The proof is completed. \(\square \)

1.3 Proof of Corollaries

We only prove Corollary 1; the proofs of the other corollaries follow a very similar procedure and are hence omitted.

Proof of Corollary 1:

By the assumption (A2) and Proposition 1, we have \(\mathbf {P}\left( \Vert \widehat{\varvec{\varSigma }}_1-\varvec{\varSigma }\Vert _{\infty }\ge C\sqrt{\frac{\log {p}}{n}}\right) = O(p^{-\tau })\). Similarly, by the assumption (A3) and Proposition 1, we have \(\mathbf {P}\left( \Vert \widehat{\varvec{\varSigma }}_1-\varvec{\varSigma }\Vert _{\infty }\ge C\sqrt{\frac{\log {p}}{n}}\right) = O\left( p^{-\tau }+n^{-\frac{\delta }{8}}\right) \).

We take \(\lambda =C_0M_p\sqrt{\frac{\log {p}}{n}}\). Note that \(\Vert \varvec{\varOmega }\Vert _{L_1}\le M_p\), so for a sufficiently large \(C_0\), the condition \(\lambda \ge 3\Vert \varvec{\varOmega }\Vert _{L_1}\Vert \widehat{\varvec{\varSigma }}-\varvec{\varSigma }\Vert _\infty \) required in Theorem 2 holds on the high-probability event above. By the assumption (A1), when \(n\) and \(p\) are large enough, we can obtain that \(\Vert \varvec{\varOmega }\Vert ^{-q}_{L_1}\lambda ^{1-q}s_p\le s_pM_p^{1-2q}\le \frac{1}{2}\). Applying Theorem 2 then yields the conclusion of Corollary 1. \(\square \)
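
Putting the pieces together, the proof suggests the following recipe: form \(\widehat{\varvec{\varSigma }}\), take \(\lambda =C_0M_p\sqrt{\log p/n}\), solve the \(p\) column problems, and symmetrize. The sketch below assumes the scio_column and symmetrize_min functions from the earlier snippets are in scope; the plain sample covariance and the constants C0, M_p are illustrative (the heavy-tailed and nonparanormal cases of the unified framework would substitute a robust or rank-based \(\widehat{\varvec{\varSigma }}\) instead).

```python
import numpy as np

def scio_precision(X, C0=0.5, M_p=1.0):
    """End-to-end SCIO-type estimate of Omega = Sigma^{-1}.  Assumes
    scio_column() and symmetrize_min() from the earlier sketches are in
    scope; C0, M_p, and the plain sample covariance are illustrative."""
    n, p = X.shape
    Sigma_hat = np.cov(X, rowvar=False, bias=True)
    lam = C0 * M_p * np.sqrt(np.log(p) / n)   # tuning rate from Corollary 1
    Omega_hat = np.column_stack([scio_column(Sigma_hat, j, lam) for j in range(p)])
    return symmetrize_min(Omega_hat)

# usage: Omega_tilde = scio_precision(np.random.default_rng(1).normal(size=(200, 50)))
```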

Cite this article

Wu, Z., Wang, C. & Liu, W. A unified precision matrix estimation framework via sparse column-wise inverse operator under weak sparsity. Ann Inst Stat Math 75, 619–648 (2023). https://doi.org/10.1007/s10463-022-00856-0