A Bayesian Approach for Learning Gene Networks Underlying Disease Severity in COPD

Shaddox, Elin; Stingo, Francesco C.; Peterson, Christine B.; Jacobson, Sean; Cruickshank-Quinn, Charmion; Kechris, Katerina; Bowler, Russell; Vannucci, Marina

doi:10.1007/s12561-016-9176-6


Published: 28 October 2016

Statistics in Biosciences, volume 10, pages 59–85 (2018)

Elin Shaddox¹,
Francesco C. Stingo ORCID: orcid.org/0000-0001-9150-8552²,
Christine B. Peterson³,
Sean Jacobson⁴,
Charmion Cruickshank-Quinn⁵,
Katerina Kechris⁶,
Russell Bowler⁴ &
Marina Vannucci¹


Abstract

In this paper, we propose a Bayesian hierarchical approach to infer network structures across multiple sample groups where both shared and differential edges may exist across the groups. In our approach, we link graphs through a Markov random field prior. This prior on network similarity provides a measure of pairwise relatedness that borrows strength only between related groups. We incorporate the computational efficiency of continuous shrinkage priors, improving scalability for network estimation in cases of larger dimensionality. Our model is applied to patient groups with increasing levels of chronic obstructive pulmonary disease severity, with the goal of better understanding the breakdown of gene pathways as the disease progresses. Our approach identifies critical hub genes for four targeted pathways. Furthermore, it identifies gene connections that are disrupted with increased disease severity and that characterize the disease evolution. Using simulated data, we also demonstrate the superior performance of our approach with respect to competing methods.


References

1. Armagan A, Dunson D, Lee J (2013) Generalized double Pareto shrinkage. Stat Sin 23(1):119
2. Atay-Kayis A, Massam H (2005) The marginal likelihood for decomposable and non-decomposable graphical Gaussian models. Biometrika 92:317–355
3. Bahr T et al (2013) Peripheral blood mononuclear cell gene expression in chronic obstructive pulmonary disease. Am J Respir Cell Mol Biol 49(2):316–23
4. Bowler R et al (2014) Plasma sphingolipids associated with COPD phenotypes. Am J Respir Crit Care Med 191(3):275–284
5. Chatr-Aryamontri A, Breitkreutz B, Oughtred R, Boucher L, Heinicke S, Chen D, Stark C, Kolas N, O’Donnell L, Reguly T, Nixon J, Ramage L, Winter A, Sellam A, Chang C, Hirschman J, Theesfeld C, Rust J, Livstone MS, Dolinski K, Tyers M (2015) The BioGRID interaction database: 2015 update. Nucleic Acids Res 43(Database issue):470–478
6. Chen Z, Kim H, Sciurba F, Lee S, Feghali-Bostwick C, Stolz D, Dhir R, Landreneau R, Schuchert M, Yousem S, Nakahira K, Pilewski J, Lee J, Zhang Y, Ryter S, Choi A (2008) Egr-1 regulates autophagy in cigarette smoke-induced chronic obstructive pulmonary disease. PLoS ONE 3(10):3316
7. Clyde M, George E (2004) Model uncertainty. Stat Sci 19(1):81–94
8. Danaher P (2012) JGL: performs the joint graphical lasso for sparse inverse covariance estimation on multiple classes. http://CRAN.R-project.org/package=JGL
9. Danaher P, Wang P, Witten D (2014) The joint graphical lasso for inverse covariance estimation across multiple classes. J R Stat Soc B 76(2):373–397
10. Dobra A, Jones B, Hans C, Nevins J, West M (2004) Sparse graphical models for exploring gene expression data. J Multivar Anal 90:196–212
11. Dobra A, Lenkoski A, Rodriguez A (2012) Bayesian inference for general Gaussian graphical models with application to multivariate lattice data. J Am Stat Assoc 106:1418–1433
12. GEO (2015) Gene expression omnibus. http://www.ncbi.nlm.nih.gov/geo
13. George E, McCulloch R (1993) Variable selection via Gibbs sampling. J Am Stat Assoc 88:881–889
14. Gottardo R, Raftery A (2008) Markov chain Monte Carlo with mixtures of mutually singular distributions. J Comput Graph Stat 17(4):949–975
15. Griffin J, Brown P (2010) Inference with normal-gamma prior distributions in regression problems. Bayesian Anal 5(1):171–188
16. Guo J, Levina E, Michailidis G, Zhu J (2011) Joint estimation of multiple graphical models. Biometrika 98(1):1–15
17. Hanahan D, Weinberg R (2011) Hallmarks of cancer: the next generation. Cell 144(5):646–674
18. Irizarry RA, Bolstad BM, Collin F, Cope LM, Hobbs B, Speed TP (2003) Summaries of Affymetrix GeneChip probe level data. Nucleic Acids Res 31(4):e15
19. Jones B, Carvalho C, Dobra A, Hans C, Carter C, West M (2005) Experiments in stochastic computation for high dimensional graphical models. Stat Sci 20(4):388–400
20. Kanehisa M, Goto S, Sato Y, Kawashima M, Furumichi M, Tanabe M (2014) Data, information, knowledge and principle: back to metabolism in KEGG. Nucleic Acids Res 42:199–205
21. Khondker Z, Zhu H, Chu H, Lin W, Ibrahim J (2013) The Bayesian covariance lasso. Stat Its Interface 6(2):243
22. Langfelder P, Mischel SHP (2013) When is hub gene selection better than standard meta-analysis? PLoS ONE 8(4):e61505
23. Li F, Zhang N (2010) Bayesian variable selection in structured high-dimensional covariate spaces with applications in genomics. J Am Stat Assoc 105(491):1202–1214
24. Marwick J, Caramori G, Casolari P, Mazzoni F, Kirkham P, Adcock I, Chung K, Papi A (2010) A role for phosphoinositol 3-kinase delta in the impairment of glucocorticoid responsiveness in patients with chronic obstructive pulmonary disease. J Allergy Clin Immunol 125(5):1146–53
25. Mukherjee S, Speed T (2008) Network inference using informative priors. Proc Natl Acad Sci 105(38):14313–14318
26. Ni Y, Marchetti G, Baladandayuthapani V, Stingo F (2015) Bayesian approaches for large biological networks. In: Mitra R, Muller P (eds) Nonparametric Bayesian methods in biostatistics and bioinformatics. Springer, New York
27. Park T, Casella G (2008) The Bayesian lasso. J Am Stat Assoc 20(1):140–157
28. Parshall M (1999) Adult emergency visits for chronic cardiorespiratory disease: does dyspnea matter? Nurs Res 48(2):62–70
29. Peterson C, Stingo F, Vannucci M (2015) Bayesian inference of multiple Gaussian graphical models. J Am Stat Assoc 110(509):159–174
30. Peterson C, Stingo F, Vannucci M (2016) Joint Bayesian variable and graph selection for regression models with network-structured predictors. Stat Med 35(7):1017–1031
31. Regan EA et al (2010) Genetic epidemiology of COPD (COPDGene) study design. COPD 7(1):32–43
32. Reimand J, Wagih O, Bader G (2013) The mutational landscape of phosphorylation signaling in cancer. Sci Rep. doi:10.1038/srep02651
33. Roverato A (2002) Hyper-inverse Wishart distribution for non-decomposable graphs and its application to Bayesian inference for Gaussian graphical models. Scand J Stat 29:391–411
34. Scott J, Berger J (2010) Bayes and empirical Bayes multiplicity adjustment in the variable-selection problem. Ann Stat 38(5):2587–2619
35. Scott J, Carvalho C (2008) Feature-inclusion stochastic search for Gaussian graphical models. J Comput Graph Stat 17:790–808
36. Singh D et al (2014) Altered gene expression in blood and sputum in COPD frequent exacerbators in the ECLIPSE cohort. http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0107381
37. Skrepnek G, Skrepnek S (2004) Epidemiology, clinical and economic burden, and natural history of chronic obstructive pulmonary disease and asthma. Am J Manag Care 10(5):S129–38
38. Stelzer G, Dalah I, Stein T, Satanower Y, Rosen N, Nativ N, Oz-Levi D, Olender T, Belinky F, Bahir I, Krug H, Perco P, Mayer B, Kolker E, Safran M, Lancet D (2011) In-silico human genomics with GeneCards. Hum Genomics 5(6):709–717
39. Stingo F, Marchetti G (2015) Efficient local updates for undirected graphical models. Stat Comput 25:159–171
40. Stingo F, Vannucci M (2011) Variable selection for discriminant analysis with Markov random field priors for the analysis of microarray data. Bioinformatics 27(4):495–501
41. Stingo F, Chen Y, Vannucci M, Barrier M, Mirkes P (2010) A Bayesian graphical modeling approach to microRNA regulatory network inference. Ann Appl Stat 4(4):2024
42. Telesca D, Mueller P, Kornblau S, Suchard M, Ji Y (2012) Modeling protein expression and protein signaling pathways. J Am Stat Assoc 107(500):1372–1384
43. Wang H (2012) The Bayesian graphical lasso and efficient posterior computation. Bayesian Anal 7(2):771–790
44. Wang H (2015) Scaling it up: stochastic search structure learning in graphical models. Bayesian Anal 10(2):351–377
45. Wang H, Li Z (2012) Efficient Gaussian graphical model determination under G-Wishart prior distributions. Electron J Stat 6:168–198
46. Yajima M, Telesca D, Ji Y, Muller P (2015) Detecting differential patterns of interaction in molecular pathways. Biostatistics 16(2):240–251


Author information

Authors and Affiliations

Department of Statistics, Rice University, Houston, USA
Elin Shaddox & Marina Vannucci
Dipartimento di Statistica, Informatica, Applicazioni “G.Parenti”, University of Florence, Florence, Italy
Francesco C. Stingo
Department of Biostatistics, UT MD Anderson Cancer Center, Houston, USA
Christine B. Peterson
Department of Medicine, National Jewish Health, Denver, CO, USA
Sean Jacobson & Russell Bowler
Department of Pharmaceutical Sciences, School of Pharmacy, University of Colorado Denver, Denver, CO, USA
Charmion Cruickshank-Quinn
Department of Biostatistics and Informatics, Colorado School of Public Health, University of Colorado Denver, Denver, CO, USA
Katerina Kechris


Corresponding author

Correspondence to Francesco C. Stingo.

Appendix

1.1 Details on our MCMC Algorithm

In this section, we provide a detailed description of Step a and Step b of our MCMC algorithm.

Step a. Let $V=(\upsilon _{i,j}^2)$ be a $p\times p$ symmetric matrix with zeros on the diagonal and $(\upsilon _{i,j}^2)_{i<j}$ in the upper off-diagonal entries, and let $S=X'X$. Focusing on the last column and row, we partition $\Omega $, $S$, and $V$ as

$$\begin{aligned} \Omega = \left( \begin{array}{cc} \Omega _{1,1}&{}\quad \omega _{1,2}\\ \omega _{1,2}' &{}\quad \omega _{2,2} \end{array}\right) , \quad S=\left( \begin{array}{cc} S_{1,1}&{}\quad s_{1,2}\\ s_{1,2}' &{}\quad s_{2,2} \end{array}\right) , \quad V=\left( \begin{array}{cc} V_{1,1} &{}\quad v_{1,2} \\ v_{1,2}' &{}\quad 0 \end{array}\right) . \end{aligned}$$

Changing variables from $(\omega _{1,2}, \omega _{2,2})$ to $(u=\omega _{1,2},\upsilon =\omega _{2,2}-\omega _{1,2}'\Omega _{1,1}^{-1}\omega _{1,2})$, we obtain the full conditionals

$$\begin{aligned} u|\cdot \sim N(-Cs_{1,2}, C) \quad {\text {and}} \; \upsilon |\cdot \sim {\text {Gamma}}\biggr (\frac{n}{2}+1, \frac{s_{2,2}+\lambda }{2}\biggr ), \end{aligned}$$

where $C=\{(s_{2,2}+\lambda )\Omega _{1,1}^{-1}+{\text {diag}} (v_{1,2}^{-1})\}^{-1}$. Permuting the columns so that each in turn occupies the last position, we obtain the full conditionals used to generate $\Omega |\mathbf{{G}},X$. The full conditional of $\mathbf{{G}}$ then factorizes into independent Bernoulli distributions of the form

$$\begin{aligned} P(g_{i,j}=1|\Omega , X)=\frac{N(\omega _{i,j}|0, \upsilon _1^2)\pi }{N(\omega _{i,j}|0, \upsilon _1^2)\pi + N(\omega _{i,j}|0,\upsilon _0^2)(1-\pi )}, \end{aligned}$$

where the quantity $\frac{\pi }{1-\pi }$ is determined by the MRF prior on the graph structure such that

$$\begin{aligned} \frac{\pi }{1-\pi }=\frac{p(G_k'|\nu _{i,j}, \Theta , \{G_m\}_{m\ne k})}{p(G_k|\nu _{i,j}, \Theta , \{G_m\}_{m\ne k})}=\exp \bigg \{\nu _{i,j}+2\sum _{m\ne k} \theta _{k,m}g_{m,i,j}\bigg \}, \end{aligned}$$

where the graphs $G_k'$ and $G_k$ differ only in that edge (i, j) is included in $G_k'$ and excluded from $G_k$, so that $\frac{\pi }{1-\pi }$ is the prior odds of including the edge.
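As a rough illustration of Step a, the following minimal sketch draws the last column of $\Omega$ and evaluates the Bernoulli inclusion probability for a single edge. It is not the authors' implementation. The function names, the argument `log_prior_odds` (standing for $\log \{\pi /(1-\pi )\}$), and the shape-rate reading of the Gamma distribution are assumptions made for illustration, and the quadratic form in the change of variables is taken to involve $\Omega _{1,1}^{-1}$.

```python
import numpy as np


def sample_last_column(Omega, S, V, lam, n, rng):
    """Draw the last column/row of Omega from the Step a full conditionals."""
    Omega11_inv = np.linalg.inv(Omega[:-1, :-1])
    s12, s22 = S[:-1, -1], S[-1, -1]
    v12 = V[:-1, -1]  # prior variances of the last column's off-diagonal entries
    # C = {(s22 + lam) * Omega11^{-1} + diag(1 / v12)}^{-1}
    C = np.linalg.inv((s22 + lam) * Omega11_inv + np.diag(1.0 / v12))
    C = (C + C.T) / 2.0  # symmetrize against round-off
    u = rng.multivariate_normal(-C @ s12, C)
    # Assumption: Gamma(shape, rate); NumPy expects scale = 1 / rate.
    v = rng.gamma(shape=n / 2.0 + 1.0, scale=2.0 / (s22 + lam))
    out = Omega.copy()
    out[:-1, -1] = u
    out[-1, :-1] = u
    out[-1, -1] = v + u @ Omega11_inv @ u  # invert the change of variables
    return out


def edge_inclusion_prob(omega_ij, v0, v1, log_prior_odds):
    """P(g_ij = 1 | omega_ij) under the two-component normal mixture."""
    def log_normal(x, sd):
        return -0.5 * np.log(2.0 * np.pi * sd**2) - 0.5 * (x / sd) ** 2

    log_odds = log_normal(omega_ij, v1) - log_normal(omega_ij, v0) + log_prior_odds
    return 1.0 / (1.0 + np.exp(-log_odds))
```

In a full sampler, the column update would be applied to each column in turn after permuting it to the last position, and each edge indicator would then be drawn as a Bernoulli variable with the probability returned by `edge_inclusion_prob`.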

Step b. To update $\theta _{k,m}$ and $\gamma _{k,m}$, we consider their joint full conditional distribution. Retaining only the terms of the joint prior on the graphs $G_1, \ldots , G_4$ that involve $\theta _{k,m}$, we have

$$\begin{aligned} p(G_1, \ldots , G_4|\nu , \Theta )&=\prod _{i<j}C(\nu _{i,j}, \Theta )^{-1}\exp \bigg (\nu _{i,j}{} \mathbf{{1}}^T\mathbf{{g_{i,j}}}+\mathbf{{g_{i,j}}}^T \Theta \mathbf{{g_{i,j}}}\bigg ) \\&\propto \prod _{i<j} C(\nu _{i,j},\Theta )^{-1}\exp \bigg (2\theta _{k,m} g_{k,i,j}g_{m,i,j}\bigg ). \end{aligned}$$
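For a small number of groups, the normalizing constant $C(\nu _{i,j}, \Theta )$ can be evaluated by brute force, since it is a sum over the $2^4=16$ binary configurations of $\mathbf{{g_{i,j}}}$. The sketch below illustrates this under the assumption that $\Theta $ is symmetric with a zero diagonal, so that $\mathbf{{g_{i,j}}}^T\Theta \mathbf{{g_{i,j}}}=2\sum _{k<m}\theta _{k,m}g_{k,i,j}g_{m,i,j}$. The function names are hypothetical.

```python
import itertools

import numpy as np


def log_norm_const(nu, Theta):
    """log C(nu, Theta): log-sum over all g in {0,1}^K of exp(nu*1'g + g'Theta g)."""
    K = Theta.shape[0]
    terms = []
    for config in itertools.product((0.0, 1.0), repeat=K):
        g = np.array(config)
        terms.append(nu * g.sum() + g @ Theta @ g)
    return np.logaddexp.reduce(terms)


def log_mrf_prior(g, nu, Theta):
    """Log prior probability of the edge-indicator vector g for one pair (i, j)."""
    g = np.asarray(g, dtype=float)
    return nu * g.sum() + g @ Theta @ g - log_norm_const(nu, Theta)
```

Working on the log scale with `np.logaddexp` avoids overflow when $\nu _{i,j}$ or the entries of $\Theta $ are large in magnitude.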

The full conditional distribution of $\theta _{k,m}$ and $\gamma _{k,m}$ can then be written as

$$\begin{aligned} p(\theta _{k,m}, \gamma _{k,m}|\cdot )&\propto p(G_1, \ldots , G_4|\nu , \Theta )p(\theta _{k,m}|\gamma _{k,m})p(\gamma _{k,m}|w)\\&\propto \biggl (\prod _{i<j} C(\nu _{i,j}, \Theta )^{-1}\exp (2\theta _{k,m}g_{k,i,j}g_{m,i,j})\biggr ) \\&\qquad \times \;\biggl ((1-\gamma _{k,m})\delta _0+\gamma _{k,m}\frac{\beta ^\alpha }{\Gamma (\alpha )}\theta _{k,m}^{\alpha -1}e^{-\beta \theta _{k,m}}\biggr ) \\&\qquad \times \;\biggl (w^{\gamma _{k,m}}(1-w)^{(1-\gamma _{k,m})}\biggr ). \end{aligned}$$

Because the normalizing constant of the joint prior on the graphs is analytically intractable, we use a Metropolis–Hastings step to sample $\theta _{k,m}$ and $\gamma _{k,m}$ from their joint full conditional distribution for each pair (k, m), $1\le k<m \le 4$. Each iteration has two steps, following the approach described by [14] for sampling from mixtures of mutually singular distributions. Throughout, $q$ denotes the proposal density for $\theta _{k,m}$; the ratios below correspond to a Gamma$(\alpha ^\star , \beta ^\star )$ proposal. First, we perform a between-model move. If the current state is $\gamma _{k,m}=1$, we propose $\gamma ^\star _{k,m}=0$ and $\theta ^\star _{k,m}=0$, resulting in the Metropolis–Hastings ratio

$$\begin{aligned} r&=\frac{p(\theta ^\star _{k,m}, \gamma ^\star _{k,m}|\cdot )\times q(\theta _{k,m})}{p(\theta _{k,m}, \gamma _{k,m}|\cdot )} =\frac{\Gamma (\alpha )}{\Gamma (\alpha ^\star )}\frac{(\beta ^\star )^{ \alpha ^\star }}{\beta ^\alpha }(\theta _{k,m})^{\alpha ^\star -\alpha }e^{ (\beta -\beta ^\star )\theta _{k,m}}\\&\quad \times \;\prod _{i<j}\frac{C(\nu _{i,j}, \Theta )\exp (-2\theta _{k,m}g_{k,i,j}g_{m,i,j})}{C(\nu _{i,j}, \Theta ^\star )}\frac{1-w}{w}, \end{aligned}$$

where $\Theta ^\star $ represents the network similarity matrix $\Theta $ with entry $\theta _{k,m}=\theta _{k,m}^\star $. If moving instead from $\gamma _{k,m}=0$ to $\gamma ^\star _{k,m}=1$, we draw $\theta ^\star _{k,m}$ from $q$, and the ratio is

$$\begin{aligned} r&=\frac{p(\theta ^\star _{k,m}, \gamma ^\star _{k,m}|\cdot )}{p(\theta _{k,m}, \gamma _{k,m}|\cdot )\times q(\theta ^\star _{k,m})} =\frac{\Gamma (\alpha ^\star )}{\Gamma (\alpha )}\frac{\beta ^{\alpha }}{(\beta ^\star )^{\alpha ^\star }}\times (\theta ^\star _{k,m})^{\alpha -\alpha ^\star } e^{(\beta ^\star -\beta )\theta ^\star _{k,m}} \\&\quad \times \;\prod _{i<j}\frac{C(\nu _{i,j}, \Theta )\exp (2\theta ^\star _{k,m}g_{k,i,j}g_{m,i,j})}{C(\nu _{i,j}, \Theta ^\star )}\frac{w}{1-w}. \end{aligned}$$

Next, we perform the within-model move if the value of $\gamma _{k,m}$ sampled from the between-model move is 1. Here, we propose a new value $\theta ^\star _{k,m}$ using the same proposal density as before, keeping $\gamma ^\star _{k,m}=\gamma _{k,m}=1$. Our Metropolis–Hastings ratio is

$$\begin{aligned} r&=\frac{p(\theta ^\star _{k,m}, \gamma ^\star _{k,m}|\cdot )\cdot q(\theta _{k,m})}{p(\theta _{k,m}, \gamma _{k,m}|\cdot )\cdot q(\theta ^\star _{k,m})} =\biggr (\frac{\theta ^\star _{k,m}}{\theta _{k,m}}\biggr )^{\alpha -\alpha ^\star } \cdot {e}^{(\beta ^\star -\beta )(\theta ^\star _{k,m}-\theta _{k,m})}\\&\quad \times \;\prod _{i<j} \frac{C(\nu _{i,j},\Theta )\exp (2(\theta ^\star _{k,m}-\theta _{k,m})g_{k,i,j} g_{m,i,j})}{C(\nu _{i,j}, \Theta ^\star )}. \end{aligned}$$
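The within-model ratio can be checked numerically. The sketch below computes $\log r$ directly from the target and proposal densities, assuming a Gamma$(\alpha ^\star , \beta ^\star )$ proposal in the shape-rate form and evaluating $C(\nu _{i,j}, \Theta )$ by enumeration. All function and argument names (`nu` as a vector over edges, `G` as an edges-by-groups indicator matrix, `alpha_s` and `beta_s` for the proposal parameters) are hypothetical.

```python
import itertools
from math import lgamma, log

import numpy as np


def log_C(nu, Theta):
    """log normalizing constant of the MRF prior, by enumerating g in {0,1}^K."""
    K = Theta.shape[0]
    configs = np.array(list(itertools.product((0.0, 1.0), repeat=K)))
    energies = nu * configs.sum(axis=1) + np.einsum(
        "ck,km,cm->c", configs, Theta, configs
    )
    return np.logaddexp.reduce(energies)


def log_gamma_pdf(x, shape, rate):
    """Log density of a Gamma(shape, rate) distribution at x > 0."""
    return shape * log(rate) - lgamma(shape) + (shape - 1.0) * log(x) - rate * x


def with_entry(Theta, k, m, value):
    """Copy of Theta with the symmetric entries (k, m) and (m, k) set to value."""
    out = Theta.copy()
    out[k, m] = out[m, k] = value
    return out


def within_model_log_ratio(theta_new, k, m, Theta, nu, G,
                           alpha, beta, alpha_s, beta_s):
    """log r for proposing theta_new in place of Theta[k, m] (gamma stays 1)."""
    theta_old = Theta[k, m]
    Theta_new = with_entry(Theta, k, m, theta_new)
    # Prior ratio for theta under Gamma(alpha, beta).
    log_r = log_gamma_pdf(theta_new, alpha, beta) - log_gamma_pdf(theta_old, alpha, beta)
    # Proposal ratio q(theta_old) / q(theta_new).
    log_r += log_gamma_pdf(theta_old, alpha_s, beta_s)
    log_r -= log_gamma_pdf(theta_new, alpha_s, beta_s)
    # MRF prior ratio, accumulated over all edges (i, j).
    for nu_ij, g in zip(nu, G):
        log_r += log_C(nu_ij, Theta) - log_C(nu_ij, Theta_new)
        log_r += 2.0 * (theta_new - theta_old) * g[k] * g[m]
    return log_r
```

A proposal would then be accepted when the logarithm of a Uniform(0, 1) draw falls below the returned value. Recomputing `log_C` for every edge is the costly part of this sketch; caching it per distinct value of $\nu _{i,j}$ would be a natural refinement.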

In our last step of the MCMC, we sample from the full conditional distribution of $\nu _{i,j}$. The terms of the joint prior on the graphs including $\nu _{i,j}$ are

$$\begin{aligned} p(G_1, \ldots , G_4|\nu , \Theta )&=\prod _{i<j}C(\nu _{i,j}, \Theta )^{-1}\exp \bigg (\nu _{i,j}{} \mathbf{{1}}^T\mathbf{{g_{i,j}}}+\mathbf{{g_{i,j}}}^T\Theta \mathbf{{g_{i,j}}}\bigg ) \\&\propto C(\nu _{i,j},\Theta )^{-1}\exp \bigg (\nu _{i,j}{} \mathbf{{1}}^T\mathbf{{g_{i,j}}}\bigg ). \end{aligned}$$

Given the prior on $\nu _{i,j}$, we obtain the full conditional given the data and all remaining parameters

$$\begin{aligned} p(\nu _{i,j}|\cdot )&\propto \frac{\exp (a\nu _{i,j})}{(1+e^{\nu _{i,j}})^ {a+b}}C(\nu _{i,j},\Theta )^{-1}\exp \bigg (\nu _{i,j}{} \mathbf{{1}}^T\mathbf{{g_{i,j}}}\bigg )\\&=\frac{\exp (\nu _{i,j}(a+\mathbf{{1}}^T\mathbf{{g_{i,j}}}))}{C(\nu _{i,j}, \Theta )\cdot (1+e^{\nu _{i,j}})^{a+b}}. \end{aligned}$$

We then propose a value $q^\star $ from a Beta$(a^\star , b^\star )$ density with $a^\star =2$ and $b^\star =4$ for each pair (i, j), $1\le i<j\le p$, and set $\nu ^\star = {\text {logit}}(q^\star )$. The proposal density in terms of $\nu ^\star $ is

$$\begin{aligned} q(\nu ^\star )=\frac{1}{B(a^\star , b^\star )}\frac{e^{a^\star \nu ^\star }}{(1+e^{\nu ^\star })^{a^\star + b^\star }}, \end{aligned}$$

with Metropolis–Hastings ratio

$$\begin{aligned} r&=\frac{p(\nu ^\star |\cdot )}{p(\nu _{i,j}|\cdot )}\frac{q(\nu _{i,j})}{q(\nu ^\star )}\\&=\frac{\exp ((\nu ^\star -\nu _{i,j})\cdot (a-a^\star +\mathbf{{1}}^ T\mathbf{{g_{i,j}}}))\cdot C(\nu _{i,j}, \Theta )\cdot (1+e^{\nu _{i,j}})^{a+b-a^\star -b^\star }}{C(\nu ^\star , \Theta )\times (1+e^{\nu ^\star })^{a+b-a^\star -b^\star }}. \end{aligned}$$
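The update of $\nu _{i,j}$ can be sketched as an independence Metropolis–Hastings step. The code below is an illustration under stated assumptions rather than the authors' implementation: the prior hyperparameters `a` and `b` are passed in by the caller, the proposal uses Beta(2, 4) as described above, $C(\nu _{i,j}, \Theta )$ is computed by enumeration, and all function names are hypothetical.

```python
import itertools

import numpy as np


def log_C(nu, Theta):
    """log normalizing constant of the MRF prior, by enumerating g in {0,1}^K."""
    K = Theta.shape[0]
    configs = np.array(list(itertools.product((0.0, 1.0), repeat=K)))
    energies = nu * configs.sum(axis=1) + np.einsum(
        "ck,km,cm->c", configs, Theta, configs
    )
    return np.logaddexp.reduce(energies)


def log_ratio_nu(nu_star, nu_cur, g_ij, Theta, a, b, a_s=2.0, b_s=4.0):
    """log Metropolis-Hastings ratio for moving from nu_cur to nu_star."""
    def log_target(v):
        # exp(v * (a + 1'g)) / {C(v, Theta) * (1 + e^v)^(a + b)}, on the log scale
        return v * (a + np.sum(g_ij)) - log_C(v, Theta) - (a + b) * np.logaddexp(0.0, v)

    def log_proposal(v):
        # Beta(a_s, b_s) density on the logit scale, up to a constant
        return a_s * v - (a_s + b_s) * np.logaddexp(0.0, v)

    return (log_target(nu_star) - log_target(nu_cur)
            + log_proposal(nu_cur) - log_proposal(nu_star))


def update_nu(nu_cur, g_ij, Theta, a, b, rng, a_s=2.0, b_s=4.0):
    """One independence-sampler update of nu for a single pair (i, j)."""
    q_star = rng.beta(a_s, b_s)
    nu_star = np.log(q_star) - np.log1p(-q_star)  # logit transform
    log_r = log_ratio_nu(nu_star, nu_cur, g_ij, Theta, a, b, a_s, b_s)
    return nu_star if np.log(rng.uniform()) < log_r else nu_cur
```

Because the proposal does not depend on the current value, the same `update_nu` call can be applied independently to every pair (i, j) within an iteration.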

1.2 Case Study: Comparison to the Fused and Joint Graphical Lasso

In this section, we compare the proposed Bayesian approach with the fused and joint graphical lasso in terms of the findings obtained from the analysis of the ECLIPSE dataset. Specifically, we focus on the Reg Auto and GPL pathways. For both lasso methods, we selected the penalty parameters that minimized the AIC, as recommended by [9]. For the Reg Auto pathway, the selected penalty parameters were $\lambda _1=0.015$ and $\lambda _2=0.0001$ for the fused lasso, and $\lambda _1= 0.015$ and $\lambda _2=0$ for the group lasso (this value was selected after an extensive grid search with a step size of 0.0000005). For the GPL pathway, the selected penalty parameters were $\lambda _1=0.02$ and $\lambda _2=0.0005$ for the fused lasso, and $\lambda _1=0.02$ and $\lambda _2=0$ for the group lasso. Results are summarized in the two tables below.

Reg auto: method edge count comparison

	Proposed method	Group fused lasso	Joint group lasso
Group 1 edge count	98	159	159
Group 2 edge count	95	155	155
Group 3 edge count	89	155	155
Group 4 edge count	98	146	146
Unique edge count	153	190	190

GPL: method edge count comparison

	Proposed method	Group fused lasso	Joint group lasso
Group 1 edge count	312	560	560
Group 2 edge count	255	553	553
Group 3 edge count	288	545	545
Group 4 edge count	314	536	536
Unique edge count	539	802	802

For the Reg Auto pathway, the edge counts were identical for the fused lasso and the group lasso. Across the four groups, both lasso methods selected all 190 possible edges; this illustrates the high false positive rates of lasso methods, which make the results more difficult to interpret. The percentage overlap of unique edges for Reg Auto was computed as

$$\begin{aligned} \frac{{\text {Unique Edges in Proposed and Lasso Method}}}{{\text {Unique Lasso Edge Count}}}, \end{aligned}$$

and resulted in an overlap of 80%. The lasso methods identified the same hub genes as the proposed Bayesian approach, plus ATG10 and ULK3.
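As a check on this figure using the counts in the Reg Auto table: the lasso methods selected all 190 possible edges, so each of the 153 unique edges found by the proposed method is necessarily shared, and the ratio is

$$\begin{aligned} \frac{153}{190}\approx 0.805. \end{aligned}$$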

Similar conclusions can be drawn from the analysis of the GPL pathway. The same edges were selected by both the group and fused lasso for all disease groups; 802 of the 820 possible unique edges were selected. Of the remaining 18 edges not selected by the lasso methods, five were selected by our proposed method. This resulted in a percentage overlap of unique edges for GPL of 67%. The lasso methods identified the same hub genes as our proposed method, in addition to DGKE, DGKQ, and MBOAT1. Overall, the lasso methods give results similar to those of our proposed approach but produce much denser networks due to their higher false positive rates. The proposed Bayesian approach provides sparser solutions that can be more easily interpreted.
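The overlap measure used in this comparison is straightforward to compute from two edge lists. The sketch below is illustrative only: the function name and the toy gene labels are hypothetical, and edges are treated as unordered pairs. The final lines reproduce the GPL figure from the reported counts, where 539 - 5 = 534 of the proposed method's unique edges are shared with the 802 lasso edges.

```python
def percent_overlap(proposed_edges, lasso_edges):
    """Share of unique lasso edges that the proposed method also selects (in %)."""
    proposed = {frozenset(edge) for edge in proposed_edges}
    lasso = {frozenset(edge) for edge in lasso_edges}
    return 100.0 * len(proposed & lasso) / len(lasso)


# Toy example with hypothetical gene labels; edge direction is ignored.
toy_proposed = [("A", "B"), ("B", "C")]
toy_lasso = [("B", "A"), ("A", "C"), ("B", "C"), ("C", "D")]
toy_result = percent_overlap(toy_proposed, toy_lasso)

# GPL pathway, from the reported counts: 539 unique proposed edges, of which
# 5 were not selected by the lasso methods, and 802 unique lasso edges.
gpl_overlap = 100.0 * (539 - 5) / 802
```

Storing each edge as a `frozenset` makes the comparison insensitive to the order in which the two genes of an edge are listed.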


About this article

Cite this article

Shaddox, E., Stingo, F.C., Peterson, C.B. et al. A Bayesian Approach for Learning Gene Networks Underlying Disease Severity in COPD. Stat Biosci 10, 59–85 (2018). https://doi.org/10.1007/s12561-016-9176-6


Received: 01 March 2016
Revised: 26 September 2016
Accepted: 15 October 2016
Published: 28 October 2016
Issue Date: April 2018
DOI: https://doi.org/10.1007/s12561-016-9176-6
