Statistics in Biosciences, Volume 10, Issue 1, pp 59–85

A Bayesian Approach for Learning Gene Networks Underlying Disease Severity in COPD

  • Elin Shaddox
  • Francesco C. Stingo
  • Christine B. Peterson
  • Sean Jacobson
  • Charmion Cruickshank-Quinn
  • Katerina Kechris
  • Russell Bowler
  • Marina Vannucci


In this paper, we propose a Bayesian hierarchical approach to infer network structures across multiple sample groups where both shared and differential edges may exist across the groups. In our approach, we link graphs through a Markov random field prior. This prior on network similarity provides a measure of pairwise relatedness that borrows strength only between related groups. We incorporate the computational efficiency of continuous shrinkage priors, which improves the scalability of network estimation in higher-dimensional settings. We apply our model to patient groups with increasing levels of chronic obstructive pulmonary disease (COPD) severity, with the goal of better understanding the breakdown of gene pathways as the disease progresses. Our approach identifies critical hub genes for four targeted pathways. Furthermore, it identifies gene connections that are disrupted with increased disease severity and that characterize the evolution of the disease. Using simulated data, we also demonstrate the superior performance of our approach relative to competing methods.
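As a rough illustration of how a Markov random field prior can borrow strength only between related groups, the sketch below enumerates the prior over the inclusion of a single edge in three groups. The functional form of the prior, the names `mrf_edge_prior`, `conditional_inclusion`, `nu`, and `theta`, and all numeric values are assumptions made for illustration; they are not taken from the abstract and may differ from the model used in the paper.

```python
from itertools import product
from math import exp


def mrf_edge_prior(nu, theta):
    """Prior over binary edge-inclusion vectors g in {0,1}^K for one edge.

    Assumed (hypothetical) form: p(g) is proportional to
    exp(nu * sum_k g[k] + sum_{k,m} theta[k][m] * g[k] * g[m]).
    """
    K = len(theta)
    weights = {}
    for g in product((0, 1), repeat=K):
        quad = sum(theta[k][m] * g[k] * g[m] for k in range(K) for m in range(K))
        weights[g] = exp(nu * sum(g) + quad)
    total = sum(weights.values())  # normalizing constant, by brute force
    return {g: w / total for g, w in weights.items()}


def conditional_inclusion(prior, k, m):
    """P(edge present in group k | edge present in group m)."""
    joint = sum(p for g, p in prior.items() if g[k] == 1 and g[m] == 1)
    given = sum(p for g, p in prior.items() if g[m] == 1)
    return joint / given


# Hypothetical relatedness matrix: groups 0 and 1 are related, group 2 is not.
theta = [[0.0, 1.0, 0.0],
         [1.0, 0.0, 0.0],
         [0.0, 0.0, 0.0]]
prior = mrf_edge_prior(nu=-2.0, theta=theta)

related = conditional_inclusion(prior, 1, 0)
unrelated = conditional_inclusion(prior, 2, 0)
print(f"P(edge in group 1 | edge in group 0) = {related:.3f}")
print(f"P(edge in group 2 | edge in group 0) = {unrelated:.3f}")
```

Under these assumed values, knowing that the edge is present in group 0 raises its prior probability in the related group 1 but leaves group 2 at its baseline sparsity level, which is the sense in which strength is shared only between related groups.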


Keywords: Gaussian graphical model; Bayesian inference; Markov random field prior; Spike-and-slab prior; Gene network; Chronic obstructive pulmonary disease (COPD)



Copyright information

© International Chinese Statistical Association 2016

Authors and Affiliations

  1. Department of Statistics, Rice University, Houston, USA
  2. Dipartimento di Statistica, Informatica, Applicazioni “G. Parenti”, University of Florence, Florence, Italy
  3. Department of Biostatistics, UT MD Anderson Cancer Center, Houston, USA
  4. Department of Medicine, National Jewish Health, Denver, USA
  5. Department of Pharmaceutical Sciences, School of Pharmacy, University of Colorado Denver, Denver, USA
  6. Department of Biostatistics and Informatics, Colorado School of Public Health, University of Colorado Denver, Denver, USA
