Abstract
In this paper, we propose a Bayesian hierarchical approach to infer network structures across multiple sample groups where both shared and differential edges may exist across the groups. In our approach, we link graphs through a Markov random field prior. This prior on network similarity provides a measure of pairwise relatedness that borrows strength only between related groups. We incorporate the computational efficiency of continuous shrinkage priors, improving scalability for network estimation in cases of larger dimensionality. Our model is applied to patient groups with increasing levels of chronic obstructive pulmonary disease severity, with the goal of better understanding the breakdown of gene pathways as the disease progresses. Our approach is able to identify critical hub genes for four targeted pathways. Furthermore, it identifies gene connections that are disrupted with increased disease severity and that characterize the disease evolution. We also demonstrate the superior performance of our approach with respect to competing methods, using simulated data.
References
Armagan A, Dunson D, Lee J (2013) Generalized double pareto shrinkage. Stat Sin 23(1):119
Atay-Kayis A, Massam H (2005) The marginal likelihood for decomposable and non-decomposable graphical Gaussian models. Biometrika 92:317–355
Bahr T et al (2013) Peripheral blood mononuclear cell gene expression in chronic obstructive pulmonary disease. Am J Respir Cell Mol Biol 49(2):316–23
Bowler R et al (2014) Plasma sphingolipids associated with COPD phenotypes. Am J Respir Crit Care Med 191(3):275–284
Chatr-Aryamontri A, Breitkreutz B, Oughtred R, Boucher L, Heinicke S, Chen D, Stark C, Kolas N, O’Donnell L, Reguly T, Nixon J, Ramage L, Winter A, Sellam A, Chang C, Hirschman J, Theesfeld C, Rust J, Livstone MS, Dolinski K, Tyers M (2015) The BioGRID interaction database: 2015 update. Nucleic Acids Res 43(Database issue):470–478
Chen Z, Kim H, Sciurba F, Lee S, Feghali-Bostwick C, Stolz D, Dhir R, Landreneau R, Schuchert M, Yousem S, Nakahira K, Pilewski J, Lee J, Zhang Y, Ryter S, Choi A (2008) Egr-1 regulates autophagy in cigarette smoke-induced chronic obstructive pulmonary disease. PLoS ONE 3(10):3316
Clyde M, George E (2004) Model uncertainty. Stat Sci 19(1):81–94
Danaher P (2012) JGL: performs the joint graphical lasso for sparse inverse covariance estimation on multiple classes. http://CRAN.R-project.org/package=JGL
Danaher P, Wang P, Witten D (2014) The joint graphical lasso for inverse covariance estimation across multiple classes. J R Stat Soc B 76(2):373–397
Dobra A, Jones B, Hans C, Nevins J, West M (2004) Sparse graphical models for exploring gene expression data. J Multivar Anal 90:196–212
Dobra A, Lenkoski A, Rodriguez A (2012) Bayesian inference for general Gaussian graphical models with application to multivariate lattice data. J Am Stat Assoc 106:1418–1433
GEO (2015) Gene expression omnibus. http://www.ncbi.nlm.nih.gov/geo
George E, McCulloch R (1993) Variable selection via Gibbs sampling. J Am Stat Assoc 88:881–889
Gottardo R, Raftery A (2008) Markov chain Monte Carlo with mixtures of mutually singular distributions. J Comput Graph Stat 17(4):949–975
Griffin J, Brown P (2010) Inference with normal-gamma prior distributions in regression problems. Bayesian Anal 5(1):171–188
Guo J, Levina E, Michailidis G, Zhu J (2011) Joint estimation of multiple graphical models. Biometrika 98(1):1–15
Hanahan D, Weinberg R (2011) Hallmarks of cancer: the next generation. Cell 144(5):646–674
Irizarry RA, Bolstad BM, Collin F, Cope LM, Hobbs B, Speed TP (2003) Summaries of Affymetrix GeneChip probe level data. Nucleic Acids Res 31(4):e15
Jones B, Carvalho C, Dobra A, Hans C, Carter C, West M (2005) Experiments in stochastic computation for high dimensional graphical models. Stat Sci 20(4):388–400
Kanehisa M, Goto S, Sato Y, Kawashima M, Furumichi M, Tanabe M (2014) Data, information, knowledge and principle: back to metabolism in KEGG. Nucleic Acids Res 42:199–205
Khondker Z, Zhu H, Chu H, Lin W, Ibrahim J (2013) The Bayesian Covariance Lasso. Stat Its Interface 6(2):243
Langfelder P, Mischel PS, Horvath S (2013) When is hub gene selection better than standard meta-analysis? PLoS ONE 8(4):e61505
Li F, Zhang N (2010) Bayesian variable selection in structured high-dimensional covariate spaces with applications in genomics. J Am Stat Assoc 105(491):1202–1214
Marwick J, Caramori G, Casolari P, Mazzoni F, Kirkham P, Adcock I, Chung K, Papi A (2010) A role for phosphoinositol 3-kinase delta in the impairment of glucocorticoid responsiveness in patients with chronic obstructive pulmonary disease. J Allergy Clin Immunol 125(5):1146–53
Mukherjee S, Speed T (2008) Network inference using informative priors. Proc Natl Acad Sci 105(38):14313–14318
Ni Y, Marchetti G, Baladandayuthapani V, Stingo F (2015) Bayesian approaches for large biological networks. In: Mitra R, Muller P (eds) Nonparametric Bayesian methods in biostatistics and bioinformatics. Springer, New York
Park T, Casella G (2008) The Bayesian lasso. J Am Stat Assoc 103(482):681–686
Parshall M (1999) Adult emergency visits for chronic cardiorespiratory disease: does dyspnea matter? Nurs Res 48(2):62–70
Peterson C, Stingo F, Vannucci M (2015) Bayesian inference of multiple Gaussian graphical models. J Am Stat Assoc 110(509):159–174
Peterson C, Stingo F, Vannucci M (2016) Joint Bayesian variable and graph selection for regression models with network-structured predictors. Stat Med 35(7):1017–1031
Regan EA et al (2010) Genetic epidemiology of COPD (COPDGene) study design. COPD 7(1):32–43
Reimand J, Wagih O, Bader G (2013) The mutational landscape of phosphorylation signaling in cancer. Sci Rep. doi:10.1038/srep02651
Roverato A (2002) Hyper-inverse Wishart distribution for non-decomposable graphs and its application to Bayesian inference for Gaussian graphical models. Scand J Stat 29:391–411
Scott J, Berger J (2010) Bayes and empirical Bayes multiplicity adjustment in the variable-selection problem. Ann Stat 38(5):2587–2619
Scott J, Carvalho C (2008) Feature-inclusion stochastic search for Gaussian graphical models. J Comput Graphical Stat 17:790–808
Singh D et al (2014) Altered gene expression in blood and sputum in COPD frequent exacerbators in the ECLIPSE cohort. http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0107381
Skrepnek G, Skrepnek S (2004) Epidemiology, clinical and economic burden, and natural history of chronic obstructive pulmonary disease and asthma. Am J Manag Care 10(5):S129–38
Stelzer G, Dalah I, Stein T, Satanower Y, Rosen N, Nativ N, Oz-Levi D, Olender T, Belinky F, Bahir I, Krug H, Perco P, Mayer B, Kolker E, Safran M, Lancet D (2011) In-silico human genomics with GeneCards. Hum Genomics 5(6):709–717
Stingo F, Marchetti G (2015) Efficient local updates for undirected graphical models. Stat Comput 25:159–171
Stingo F, Vannucci M (2011) Variable selection for discriminant analysis with Markov random field priors for the analysis of microarray data. Bioinformatics 27(4):495–501
Stingo F, Chen Y, Vannucci M, Barrier M, Mirkes P (2010) A Bayesian graphical modeling approach to microRNA regulatory network inference. Ann Appl Stat 4(4):2024
Telesca D, Mueller P, Kornblau S, Suchard M, Ji Y (2012) Modeling protein expression and protein signaling pathways. J Am Stat Assoc 107(500):1372–1384
Wang H (2012) The Bayesian graphical lasso and efficient posterior computation. Bayesian Anal 7(2):771–790
Wang H (2015) Scaling it up: stochastic search structure learning in graphical models. Bayesian Anal 10(2):351–377
Wang H, Li Z (2012) Efficient Gaussian graphical model determination under G-Wishart prior distributions. Electron J Stat 6:168–198
Yajima M, Telesca D, Ji Y, Muller P (2015) Detecting differential patterns of interaction in molecular pathways. Biostatistics 16(2):240–251
Appendix
1.1 Details on our MCMC Algorithm
In this section, we provide a detailed description of Step a and Step b of our MCMC algorithm.
Step a. Let \(V=(\upsilon _{i,j}^2)\) be a \(p\times p\) symmetric matrix with zero diagonal entries and \((\upsilon _{i,j}^2)_{i<j}\) in the upper off-diagonal entries, and set \(S=X'X\). Partitioning \(\Omega \), \(S\), and \(V\) conformably, we can focus on the last column and row to obtain
Changing variables from \((\omega _{1,2}, \omega _{2,2})\) to \((u=\omega _{1,2},\upsilon =\omega _{2,2}-\omega _{1,2}'\Omega _{1,1}^{-1}\omega _{1,2})\), we obtain the full conditionals
where \(C=\{(s_{2,2}+\lambda )\Omega _{1,1}^{-1}+{\text {diag}} (v_{1,2}^{-1})\}^{-1}\). Using this method, we can permute any column into the last position to obtain the full conditional used to generate \(\Omega |\mathbf{{G}},X\). Our full conditional on \(\mathbf{{G}}\) is then an independent Bernoulli of the form
where the quantity \(\frac{\pi }{1-\pi }\) is determined by the MRF prior on the graph structure such that
for a proposed new graph \(G_k'\) which differs from the current graph \(G_k\) only in that edge (i, j) is excluded from \(G_k'\) and included in \(G_k\).
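As an illustration of the edge-indicator update at the end of Step a, the sketch below computes a Bernoulli inclusion probability for a single edge. It assumes, following the continuous spike-and-slab setup of Wang (2015), that \(\omega _{i,j}\) is normal with a small standard deviation `v0` when the edge is excluded and a larger standard deviation `v1` when it is included, and that the MRF prior log-odds take the form \(\nu _{i,j}+2\sum _{m\ne k}\theta _{k,m}g_{m,ij}\) as in Peterson et al. (2015). Neither form is stated explicitly in this appendix, and the function names and the values of `v0` and `v1` are hypothetical.

```python
import math
import random


def normal_logpdf(x, sd):
    """Log density of N(0, sd^2) evaluated at x."""
    return -0.5 * math.log(2.0 * math.pi) - math.log(sd) - 0.5 * (x / sd) ** 2


def mrf_log_prior_odds(nu_ij, theta_row, g_other):
    """Assumed MRF log-odds log(pi / (1 - pi)) for edge (i, j) in group k.

    theta_row[m] plays the role of theta_{k,m} and g_other[m] is the 0/1
    indicator of the same edge in group m, for the groups m != k.
    """
    return nu_ij + 2.0 * sum(t * g for t, g in zip(theta_row, g_other))


def edge_inclusion_prob(omega_ij, v0, v1, log_prior_odds):
    """P(g_ij = 1 | omega_ij) under the assumed two-component normal mixture."""
    log_odds = (normal_logpdf(omega_ij, v1) - normal_logpdf(omega_ij, v0)
                + log_prior_odds)
    # Numerically stable logistic transform of the posterior log-odds.
    if log_odds >= 0:
        return 1.0 / (1.0 + math.exp(-log_odds))
    e = math.exp(log_odds)
    return e / (1.0 + e)


def sample_edge(omega_ij, v0, v1, log_prior_odds, rng=random):
    """Draw the edge indicator from its Bernoulli full conditional."""
    p_include = edge_inclusion_prob(omega_ij, v0, v1, log_prior_odds)
    return 1 if rng.random() < p_include else 0
```

Under these assumptions, a precision entry close to zero is assigned mostly to the excluded component, while a larger similarity \(\theta _{k,m}\) with a group that already contains the edge raises the prior odds of inclusion.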
Step b. In order to update \(\theta _{k,m}\) and \(\gamma _{k,m}\), we must consider the full conditional distribution. Considering only the terms of the joint prior for graphs \(G_1, \ldots , G_K\) which include \(\theta _{k,m}\), we can see that
The full conditional distribution of \(\theta _{k,m}\) and \(\gamma _{k,m}\) can then be written as
Because the normalizing constant from the joint prior on the graphs is analytically intractable, we use a Metropolis–Hastings step to sample \(\theta _{k,m}\) and \(\gamma _{k,m}\) from their joint full conditional distribution for each pair (k, m), \(1\le k<m \le 4\). Each iteration has two steps, based on the approach described by [14] for sampling from mixtures of mutually singular distributions. First, we perform a between-model move. If the current state is \(\gamma _{k,m}=1\), we propose \(\gamma ^\star _{k,m}=0\) and \(\theta ^\star _{k,m}=0\), resulting in the Metropolis–Hastings ratio
where \(\Theta ^\star \) represents the network similarity matrix \(\Theta \) with entry \(\theta _{k,m}=\theta _{k,m}^\star \). If moving instead from \(\gamma _{k,m}=0\) to \(\gamma ^\star _{k,m}=1\), the ratio is
Next, we perform the within-model move if the value of \(\gamma _{k,m}\) sampled from the between-model move is 1. Here, we propose a new value for \(\theta _{k,m}\) using the same proposal density as before. Our Metropolis–Hastings ratio is
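The between-model and within-model moves of Step b can be sketched in code as follows. This is a minimal illustration under several assumptions that are not stated in this appendix: the per-edge MRF prior is taken to be proportional to \(\exp (\nu _{i,j}\mathbf{1}'g_{ij}+g_{ij}'\Theta g_{ij})\) as in Peterson et al. (2015), with its constant evaluated by enumerating all \(2^K\) configurations; \(\theta _{k,m}\) is given a gamma slab with hypothetical parameters `prior` and prior inclusion probability `w`; and new values are proposed from a gamma density with hypothetical parameters `prop`. All function names are illustrative.

```python
import itertools
import math


def quad_form(g, Theta):
    """g' Theta g for a 0/1 vector g and a symmetric Theta with zero diagonal."""
    K = len(g)
    return sum(Theta[k][m] * g[k] * g[m] for k in range(K) for m in range(K))


def log_norm_const(nu, Theta):
    """Log of the per-edge constant, summing over all 2^K configurations."""
    K = len(Theta)
    terms = [nu * sum(g) + quad_form(g, Theta)
             for g in itertools.product((0, 1), repeat=K)]
    top = max(terms)
    return top + math.log(sum(math.exp(t - top) for t in terms))


def log_graph_prior(edges, nus, Theta):
    """Sum over edges (i, j) of log p(g_ij | nu_ij, Theta)."""
    return sum(nu * sum(g) + quad_form(g, Theta) - log_norm_const(nu, Theta)
               for g, nu in zip(edges, nus))


def gamma_logpdf(x, shape, rate):
    """Log density of a Gamma(shape, rate) distribution at x > 0."""
    return (shape * math.log(rate) - math.lgamma(shape)
            + (shape - 1.0) * math.log(x) - rate * x)


def with_entry(Theta, k, m, value):
    """Copy of Theta with the symmetric entries (k, m) and (m, k) set to value."""
    out = [row[:] for row in Theta]
    out[k][m] = out[m][k] = value
    return out


def log_target(edges, nus, Theta, k, m, w, prior):
    """Unnormalized log full conditional of (theta_km, gamma_km)."""
    theta = Theta[k][m]
    lp = log_graph_prior(edges, nus, Theta)
    if theta > 0.0:  # gamma_km = 1: assumed gamma slab
        return lp + math.log(w) + gamma_logpdf(theta, *prior)
    return lp + math.log(1.0 - w)  # gamma_km = 0: point mass at zero


def log_mh_ratio_theta(edges, nus, Theta, k, m, theta_star, w, prior, prop):
    """Log acceptance ratio for moving theta_km to theta_star.

    current > 0 and theta_star == 0: between-model move from gamma = 1 to 0
    current == 0 and theta_star > 0: between-model move from gamma = 0 to 1
    both > 0: within-model move
    """
    theta = Theta[k][m]
    Theta_star = with_entry(Theta, k, m, theta_star)
    log_r = (log_target(edges, nus, Theta_star, k, m, w, prior)
             - log_target(edges, nus, Theta, k, m, w, prior))
    if theta > 0.0:
        log_r += gamma_logpdf(theta, *prop)       # density of the reverse proposal
    if theta_star > 0.0:
        log_r -= gamma_logpdf(theta_star, *prop)  # density of the forward proposal
    return log_r
```

A proposed value would be accepted when the logarithm of a uniform draw on (0, 1) falls below the returned log ratio. Enumeration of the constant is practical here only because the number of groups is small.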
In our last step of the MCMC, we sample from the full conditional distribution of \(\nu _{i,j}\). The terms of the joint prior on the graphs including \(\nu _{i,j}\) are
Given the prior on \(\nu _{i,j}\), we can obtain the posterior full conditional given the data and all remaining parameters
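We then propose a value \(q^\star \) from the density Beta(2, 4) for each pair (i, j) where \(1\le i<j\le p\) and set \(\nu ^\star = {\text {logit}}(q^\star )\). We can write our proposal density in terms of \(\nu ^\star \) as
\[q(\nu ^\star ) = \frac{1}{B(2,4)}\, \frac{e^{2\nu ^\star }}{(1+e^{\nu ^\star })^{6}},\]
with Metropolis–Hastings ratio
\[r = \frac{p(\nu ^\star \mid \cdot )\, q(\nu _{i,j})}{p(\nu _{i,j} \mid \cdot )\, q(\nu ^\star )},\]
where \(B(\cdot ,\cdot )\) is the beta function and \(p(\cdot \mid \cdot )\) denotes the posterior full conditional of \(\nu _{i,j}\).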
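The logit-transformed Beta(2, 4) proposal for \(\nu _{i,j}\) can be sketched as follows. The density used here follows from the change of variables \(\nu = {\text {logit}}(q)\) applied to a Beta(a, b) density, which gives \(e^{a\nu }/\{B(a,b)(1+e^{\nu })^{a+b}\}\). The function names are hypothetical, and `log_target` stands in for the log of the full conditional of \(\nu _{i,j}\), which the caller must supply.

```python
import math
import random

A, B = 2.0, 4.0  # Beta(2, 4) proposal parameters, as stated in the text


def propose_nu(rng=random):
    """Draw q* ~ Beta(A, B) and return nu* = logit(q*)."""
    q = rng.betavariate(A, B)
    return math.log(q / (1.0 - q))


def log_proposal_density(nu):
    """Log density of nu = logit(q) when q ~ Beta(A, B)."""
    log_beta_fn = math.lgamma(A) + math.lgamma(B) - math.lgamma(A + B)
    # Stable evaluation of log(1 + exp(nu)).
    softplus = max(nu, 0.0) + math.log1p(math.exp(-abs(nu)))
    return A * nu - (A + B) * softplus - log_beta_fn


def log_mh_ratio_nu(log_target, nu_current, nu_star):
    """Log acceptance ratio for an independence proposal on nu."""
    return (log_target(nu_star) - log_target(nu_current)
            + log_proposal_density(nu_current) - log_proposal_density(nu_star))
```

Because the proposal does not depend on the current value, the proposal densities do not cancel and both appear in the ratio.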
1.2 Case Study: Comparison to the Fused and Joint Graphical Lasso
In this section, we compare the proposed Bayesian approach to the fused and joint graphical lasso in terms of the findings obtained from the analysis of the ECLIPSE dataset. Specifically, we focus on the Reg Auto and GPL pathways. For both the fused and joint graphical lasso, we selected the penalty parameters that minimized the AIC, as recommended by [9]. For the Reg Auto pathway, the selected penalty parameters were \(\lambda _1=0.015\) and \(\lambda _2=0.0001\) for the fused graphical lasso, and \(\lambda _1= 0.015\) and \(\lambda _2=0\) for the group lasso (this value was selected after an extensive grid search with a step size of 0.0000005). For the GPL pathway, the selected penalty parameters were \(\lambda _1=0.02\) and \(\lambda _2=0.0005\) for the fused lasso, and \(\lambda _1=0.02\) and \(\lambda _2=0\) for the group lasso. Results are summarized in the two tables below.
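The AIC-based choice of \((\lambda _1,\lambda _2)\) can be illustrated with a short sketch. It assumes the criterion has the form \(\sum _k [n_k\,{\text {tr}}(S_k\hat{\Omega }_k) - n_k \log \det \hat{\Omega }_k + 2E_k]\), where \(E_k\) counts the nonzero entries of the estimated precision matrix for group k; this form is recalled from [9] and is not given in this appendix. The estimates themselves are assumed to have been fitted elsewhere (for example with the JGL package) and passed in as plain nested lists; all function names are hypothetical.

```python
import math


def log_det_spd(A):
    """Log determinant of a symmetric positive definite matrix via Cholesky."""
    p = len(A)
    L = [[0.0] * p for _ in range(p)]
    for i in range(p):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][t] * L[j][t] for t in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return 2.0 * sum(math.log(L[i][i]) for i in range(p))


def jgl_aic(sample_sizes, S_list, Omega_list, tol=1e-8):
    """Assumed AIC, summed over groups, for one pair of penalty values."""
    total = 0.0
    for n, S, Om in zip(sample_sizes, S_list, Omega_list):
        p = len(Om)
        trace = sum(S[i][j] * Om[j][i] for i in range(p) for j in range(p))
        # Counts every nonzero entry, diagonal included (an assumption).
        n_nonzero = sum(1 for row in Om for x in row if abs(x) > tol)
        total += n * trace - n * log_det_spd(Om) + 2 * n_nonzero
    return total


def select_penalties(fits, sample_sizes, S_list):
    """Return the (lambda1, lambda2) key of `fits` with the smallest AIC.

    `fits` maps each penalty pair to its list of estimated precision matrices.
    """
    return min(fits, key=lambda lam: jgl_aic(sample_sizes, S_list, fits[lam]))
```

In a grid search, `fits` would hold one entry per candidate penalty pair, so the cost is dominated by fitting the models rather than by evaluating the criterion.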
Reg auto: method edge count comparison
| | Proposed method | Group fused lasso | Joint group lasso |
|---|---|---|---|
| Group 1 edge count | 98 | 159 | 159 |
| Group 2 edge count | 95 | 155 | 155 |
| Group 3 edge count | 89 | 155 | 155 |
| Group 4 edge count | 98 | 146 | 146 |
| Unique edge count | 153 | 190 | 190 |
GPL: method edge count comparison
| | Proposed method | Group fused lasso | Joint group lasso |
|---|---|---|---|
| Group 1 edge count | 312 | 560 | 560 |
| Group 2 edge count | 255 | 553 | 553 |
| Group 3 edge count | 288 | 545 | 545 |
| Group 4 edge count | 314 | 536 | 536 |
| Unique edge count | 539 | 802 | 802 |
For the Reg Auto pathway, the edge counts were identical for the fused lasso and the group lasso. Both lasso methods selected all 190 possible unique edges, which illustrates the high false positive rates of lasso methods and makes the results harder to interpret. Percentage overlap of unique edges for Reg Auto was computed as
and resulted in an overlap of 80%. The lasso methods identified the same hub genes as the proposed Bayesian approach, plus ATG10 and ULK3.
Similar conclusions can be drawn from the analysis of the GPL pathway. The same edges were selected by both the group and fused lasso for all disease groups; 802 out of 820 possible unique edges were selected. Of the remaining 18 edges that were not selected by the lasso methods, five were selected by our proposed method. This resulted in a percentage overlap of unique edges for GPL of 67%. The lasso methods identified the same hub genes as our proposed method, in addition to DGKE, DGKQ, and MBOAT1. Overall, the lasso methods give results similar to those of our proposed approach but produce much denser networks because of their higher false positive rates. The proposed Bayesian approach provides sparser solutions that can be more easily interpreted.
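The reported overlap percentages are consistent with dividing the number of unique edges selected by both approaches by the number of unique edges selected by the lasso methods. That definition is an assumption, since the formula itself is not shown here, and the function names below are hypothetical. The sketch reproduces the arithmetic from the edge counts in the two tables and also checks that 190 and 820 are the numbers of possible edges in graphs with 20 and 41 nodes.

```python
def possible_edges(p):
    """Number of distinct undirected edges among p nodes."""
    return p * (p - 1) // 2


def percent_overlap(n_shared, n_lasso):
    """Shared unique edges as a percentage of the lasso unique edges."""
    return 100.0 * n_shared / n_lasso


# Reg Auto: the lasso methods select every possible edge, so all 153 unique
# edges of the proposed method are shared.
reg_auto = percent_overlap(153, 190)

# GPL: 5 of the 539 unique edges of the proposed method are not among the
# 802 lasso edges, leaving 534 shared edges.
gpl = percent_overlap(539 - 5, 802)

print(f"Reg Auto: {reg_auto:.1f}%  GPL: {gpl:.1f}%")
# prints "Reg Auto: 80.5%  GPL: 66.6%"
```

These values agree with the reported 80% and 67% up to rounding.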
Cite this article
Shaddox, E., Stingo, F.C., Peterson, C.B. et al. A Bayesian Approach for Learning Gene Networks Underlying Disease Severity in COPD. Stat Biosci 10, 59–85 (2018). https://doi.org/10.1007/s12561-016-9176-6
DOI: https://doi.org/10.1007/s12561-016-9176-6