Abstract
This paper studies the estimation of Gaussian graphical models in the unbalanced distributed framework. This framework is useful when the available machines differ in computing power, or when the data come from sources of different sizes and cannot be pooled on a single machine. We propose a new aggregated estimator of the precision matrix and justify it with both theoretical and practical arguments. We derive the limit distribution and convergence rate of this estimator under sparsity conditions on the true precision matrix, while controlling the number of machines. We also propose a procedure for statistical inference. On the practical side, a simulation study and a real data example show that the distributed estimator performs similarly to the non-distributed estimator that uses the full data.
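To make the unbalanced setting concrete, the following is a minimal sketch, not the estimator proposed in the paper. It assumes each machine holds more observations than variables, so that the local sample covariance can be inverted, and it aggregates the local precision estimates with weights proportional to the local sample sizes. The function names, the tridiagonal true precision matrix, and the machine sizes are all hypothetical choices made for illustration.

```python
import numpy as np


def local_precision(X):
    """Local estimate: inverse of the sample covariance (assumes n_k > p)."""
    return np.linalg.inv(np.cov(X, rowvar=False))


def aggregate_precision(estimates, sizes):
    """Average local estimates with weights n_k / N (an illustrative choice)."""
    weights = np.asarray(sizes, dtype=float) / np.sum(sizes)
    return sum(w * est for w, est in zip(weights, estimates))


rng = np.random.default_rng(0)
p = 5

# Hypothetical sparse (tridiagonal) true precision matrix.
omega = 2.0 * np.eye(p) - 0.5 * (np.eye(p, k=1) + np.eye(p, k=-1))
sigma = np.linalg.inv(omega)

# Unbalanced split: three machines with very different sample sizes.
sizes = [200, 500, 1300]
X = rng.multivariate_normal(np.zeros(p), sigma, size=sum(sizes))
chunks = np.split(X, np.cumsum(sizes)[:-1])

aggregated = aggregate_precision([local_precision(c) for c in chunks], sizes)
full = local_precision(X)  # non-distributed benchmark using all the data

print("error of aggregated estimate:", np.linalg.norm(aggregated - omega))
print("error of full-data estimate: ", np.linalg.norm(full - omega))
```

In a high-dimensional sparse setting the plain inverse would typically be replaced by a regularized local estimator; the size-proportional weighting is only one plausible way to account for machines of different sizes.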
Cite this article
Nezakati, E., Pircalabelu, E. Unbalanced distributed estimation and inference for the precision matrix in Gaussian graphical models. Stat Comput 33, 47 (2023). https://doi.org/10.1007/s11222-023-10211-9
DOI: https://doi.org/10.1007/s11222-023-10211-9