Abstract
A differential network is an important tool for capturing the changes in conditional correlations between two sample conditions. In this paper, we introduce a fast iterative algorithm to recover the differential network for high-dimensional data. The computational complexity of our algorithm is linear in the sample size and the number of parameters, which is optimal in the sense that it is of the same order as computing the two sample covariance matrices. The proposed method is appealing for high-dimensional data with a small sample size. Experiments on simulated and real datasets show that the proposed algorithm outperforms existing methods.
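The iterative scheme can be sketched as a FISTA-type update (Beck and Teboulle 2009) applied to a lasso-penalized D-trace-style loss. The exact loss in the paper is equation (2.3); the form used below, \(L_1(\varDelta ) = \tfrac{1}{2}\mathrm{tr}(\varDelta ^\top S_1 \varDelta S_2) - \mathrm{tr}(\varDelta ^\top (S_1-S_2))\), is an assumption reconstructed from the appendix (where the Hessian is stated to be \(S_2 \otimes S_1\)); the published algorithm may differ in details such as the sign convention of the linear term.

```python
import numpy as np

def soft_threshold(x, t):
    """Elementwise soft-thresholding, the proximal map of the l1 penalty."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def fista_differential_network(S1, S2, lam, n_iter=500):
    """FISTA sketch for min_Delta 0.5*tr(Delta' S1 Delta S2)
    - tr(Delta' (S1 - S2)) + lam * ||Delta||_1.
    The loss form is an assumption; see the lead-in paragraph.
    """
    p = S1.shape[0]
    # Lipschitz constant of the gradient (Lemma 1.1 in the appendix):
    # L = lambda_max(S1) * lambda_max(S2)
    L = np.linalg.eigvalsh(S1)[-1] * np.linalg.eigvalsh(S2)[-1]
    Delta = np.zeros((p, p))
    Y = Delta.copy()          # extrapolation point
    t = 1.0                   # momentum parameter
    for _ in range(n_iter):
        grad = S1 @ Y @ S2 - (S1 - S2)              # gradient of the smooth part
        Delta_new = soft_threshold(Y - grad / L, lam / L)
        t_new = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        Y = Delta_new + ((t - 1.0) / t_new) * (Delta_new - Delta)
        Delta, t = Delta_new, t_new
    return Delta
```

Each iteration needs only matrix products with the two sample covariance matrices and an elementwise shrinkage, which is what makes a FISTA-type scheme attractive when the sample size is small relative to the dimension.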
References
Anderson T (2003) An introduction to multivariate statistical analysis. Wiley series in probability and statistics. Wiley, New York
Bandyopadhyay S, Mehta M, Kuo D, Sung MK, Chuang R, Jaehnig EJ, Bodenmiller B, Licon K, Copeland W, Shales M et al (2010) Rewiring of genetic networks in response to DNA damage. Science 330(6009):1385–1389
Barabási AL, Oltvai ZN (2004) Network biology: understanding the cell’s functional organization. Nat Rev Genet 5(2):101
Barabási AL, Gulbahce N, Loscalzo J (2011) Network medicine: a network-based approach to human disease. Nat Rev Genet 12(1):56
Beck A, Teboulle M (2009) A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM J Imaging Sci 2(1):183–202
Bickel PJ, Levina E (2008) Regularized estimation of large covariance matrices. Ann Stat 36(1):199–227
Boyd S, Parikh N, Chu E, Peleato B, Eckstein J (2011) Distributed optimization and statistical learning via the alternating direction method of multipliers. Found Trends Mach Learn 3(1):1–122
Cai T, Liu W (2011) Adaptive thresholding for sparse covariance matrix estimation. J Am Stat Assoc 106(494):672–684
Cai T, Zhang L (2018) A convex optimization approach to high-dimensional sparse quadratic discriminant analysis. Ann Stat (submitted)
Cai T, Liu W, Luo X (2011) A constrained \(\ell _1\) minimization approach to sparse precision matrix estimation. J Am Stat Assoc 106(494):594–607
Ding X, Yang Y, Han B, Du C, Xu N, Huang H, Cai T, Zhang A, Han ZG, Zhou W, Chen L (2014) Transcriptomic characterization of hepatocellular carcinoma with ctnnb1 mutation. PLoS ONE 9(5):e95307
Fan J, Liao Y, Liu H (2016) An overview of the estimation of large covariance and precision matrices. Econom J 19(1):C1–C32
Gambardella G, Moretti MN, De Cegli R, Cardone L, Peron A, Di Bernardo D (2013) Differential network analysis for the identification of condition-specific pathway activity and regulation. Bioinformatics 29(14):1776–1785
Gambardella G, Peluso I, Montefusco S, Bansal M, Medina DL, Lawrence N, Bernardo DD (2015) A reverse-engineering approach to dissect post-translational modulators of transcription factors activity from transcriptional data. BMC Bioinform 16(1):279
Gambardella G, Carissimo A, Chen A et al (2017) The impact of micrornas on transcriptional heterogeneity and gene co-expression across single embryonic stem cells. Nat Commun 8:14126
Hsiao TH, Chiu YC, Hsu PY, Lu TP, Lai LC, Tsai MH, Huang THM, Chuang EY, Chen Y (2016) Differential network analysis reveals the genome-wide landscape of estrogen receptor modulation in hormonal cancers. Sci Rep 6:23035
Ideker T, Krogan NJ (2014) Differential network biology. Mol Syst Biol 8(1):565
Friedman J, Hastie T, Tibshirani R (2008) Sparse inverse covariance estimation with the graphical lasso. Biostatistics 9(3):432–441
Guo J, Levina E, Michailidis G, Zhu J (2011) Joint estimation of multiple graphical models. Biometrika 98(1):1–15
Jiang B, Wang X, Leng C (2018) A direct approach for sparse quadratic discriminant analysis. J Mach Learn Res 19(31):1–37
Chiquet J, Grandvalet Y, Ambroise C (2011) Inferring multiple graphical structures. Stat Comput 21(4):537–553
Kaushik A, Ali S, Gupta D (2017) Altered pathway analyzer: a gene expression dataset analysis tool for identification and prioritization of differentially regulated and network rewired pathways. Sci Rep 7:40450
Li Q, Shao J (2015) Sparse quadratic discriminant analysis for high dimensional data. Stat Sin 25:457–473
Liu H, Lafferty J, Wasserman L (2009) The nonparanormal: semiparametric estimation of high dimensional undirected graphs. J Mach Learn Res 10:2295–2328
Meinshausen N, Bühlmann P (2006) High-dimensional graphs and variable selection with the lasso. Ann Stat 34(3):1436–1462
Nesterov Y (1983) A method for solving the convex programming problem with convergence rate \(O(1/k^2)\). Soviet Math Dokl 27:372–376
Rothman AJ, Levina E, Zhu J (2009) Generalized thresholding of large covariance matrices. J Am Stat Assoc 104(485):177–186
Tibshirani R (1996) Regression shrinkage and selection via the lasso. J R Stat Soc Ser B 58(1):267–288
Tong T, Wang C, Wang Y (2014) Estimation of variances and covariances for high-dimensional data: a selective review. Comput Stat 6(4):255–264
Wu MY, Zhang XF, Dai DQ, Le OY, Zhu Y, Yan H (2016) Regularized logistic regression with network-based pairwise interaction for biomarker identification in breast cancer. BMC Bioinform 17(1):108
Xue L, Zou H (2012) Regularized rank-based estimation of high-dimensional nonparanormal graphical models. Ann Stat 40(5):2541–2571
Yuan H, Xi R, Chen C, Deng M (2017) Differential network analysis via the lasso penalized D-trace loss. Biometrika 104(4):755–770
Zhang T, Zou H (2014) Sparse precision matrix estimation via lasso penalized D-trace loss. Biometrika 101(1):103–120
Zhao SD, Cai TT, Li H (2014) Direct estimation of differential networks. Biometrika 101(2):253–268
Zhu Y, Li L (2018) Multiple matrix gaussian graphs estimation. J R Stat Soc Ser B 80:927–950
Zou H, Li R (2008) One-step sparse estimates in nonconcave penalized likelihood models. Ann Stat 36(4):1509–1533
Acknowledgements
We thank two reviewers, an associate editor, and the editor for their most helpful comments. Yu was supported in part by National Natural Science Foundation of China grant 11671256, grant 2016YFC0902403 of the Chinese Ministry of Science and Technology, and Neil Shen's SJTU Medical Research Fund. Wang was partially supported by Shanghai Sailing Program 16YF1405700 and National Natural Science Foundation of China grants 11701367 and 11825104.
Appendix
According to the main results of Beck and Teboulle (2009), to complete the proof of Theorem 1, we only need to show that the loss function \(L_1(\varDelta )\) is convex with a Lipschitz continuous gradient, which is the content of the following lemma.
Lemma 1.1 The loss function (2.3) is a smooth convex function, and its gradient is Lipschitz continuous with Lipschitz constant \(L=\lambda _{\max }(S_1) \lambda _{\max }(S_2)\), that is,
\[ \Vert \nabla L_1(\varDelta _1) - \nabla L_1(\varDelta _2)\Vert _F \le L\, \Vert \varDelta _1 - \varDelta _2\Vert _F \quad \text {for all } \varDelta _1, \varDelta _2, \]
where \(\lambda _{\max }(S_i)\) is the largest eigenvalue of the sample covariance matrix \(S_i\) for \(i=1,2\).
Proof
The loss function (2.3) is defined by
\[ L_1(\varDelta ) = \frac{1}{2}\, \mathrm {tr}\left( \varDelta ^\top S_1 \varDelta S_2 \right) - \mathrm {tr}\left( \varDelta ^\top (S_1 - S_2) \right) . \]
Direct calculation gives the gradient of \(L_1(\varDelta )\),
\[ \nabla L_1(\varDelta ) = S_1 \varDelta S_2 - (S_1 - S_2), \]
and the Hessian matrix is \(S_2 \otimes S_1\). Because the sample covariance matrices \(S_1\) and \(S_2\) are positive semi-definite, so is their Kronecker product \(S_2 \otimes S_1\). Hence, the loss function \(L_1(\varDelta )\) is a smooth convex function.
Moreover, for any \(\varDelta _1, \varDelta _2 \in \text {dom}(\nabla L_1)\), we have
\[ \Vert \nabla L_1(\varDelta _1) - \nabla L_1(\varDelta _2)\Vert _F = \Vert S_1 (\varDelta _1 - \varDelta _2) S_2 \Vert _F \le \lambda _{\max }(S_1)\, \lambda _{\max }(S_2)\, \Vert \varDelta _1 - \varDelta _2\Vert _F = L\, \Vert \varDelta _1 - \varDelta _2\Vert _F . \]
The proof is now complete. \(\square \)
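The Lipschitz bound in Lemma 1.1 can be checked numerically. The gradient form below, \(\nabla L_1(\varDelta ) = S_1 \varDelta S_2 - (S_1 - S_2)\), is an assumption consistent with the Hessian \(S_2 \otimes S_1\) stated in the proof; it uses the matrix-norm inequality \(\Vert ABC\Vert _F \le \Vert A\Vert _2 \Vert B\Vert _F \Vert C\Vert _2\).

```python
import numpy as np

rng = np.random.default_rng(42)
p, n = 10, 30
X1 = rng.standard_normal((n, p))
X2 = rng.standard_normal((n, p))
S1 = np.cov(X1, rowvar=False)  # sample covariance matrices
S2 = np.cov(X2, rowvar=False)

def grad_L1(Delta):
    # Assumed gradient of the smooth loss, consistent with Hessian S2 (x) S1
    return S1 @ Delta @ S2 - (S1 - S2)

# Lipschitz constant from Lemma 1.1: L = lambda_max(S1) * lambda_max(S2)
L = np.linalg.eigvalsh(S1)[-1] * np.linalg.eigvalsh(S2)[-1]

# Verify ||grad(D1) - grad(D2)||_F <= L * ||D1 - D2||_F on random pairs
for _ in range(100):
    D1 = rng.standard_normal((p, p))
    D2 = rng.standard_normal((p, p))
    lhs = np.linalg.norm(grad_L1(D1) - grad_L1(D2), "fro")
    rhs = L * np.linalg.norm(D1 - D2, "fro")
    assert lhs <= rhs + 1e-8
print("Lipschitz bound holds on all sampled pairs")
```

The bound is tight when the difference \(\varDelta _1 - \varDelta _2\) aligns with the leading eigenvectors of \(S_1\) and \(S_2\), which is why \(L\) is the natural step-size constant for the iterative algorithm.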
Cite this article
Tang, Z., Yu, Z. & Wang, C. A fast iterative algorithm for high-dimensional differential network. Comput Stat 35, 95–109 (2020). https://doi.org/10.1007/s00180-019-00915-w