Abstract
Many statistical methods have been developed to screen microarray data for differentially expressed genes associated with specific phenotypes. However, it remains a major challenge to synthesize the observed expression patterns with the abundant biological knowledge available in order to gain a more complete understanding of the biological functions of genes. Various methods, including gene clustering, neural networks, Bayesian networks, and pathway analysis, have been developed toward this goal. Most of these procedures make little use of the activation and inhibition relationships among genes in the modeling steps. We propose two novel Bayesian models that integrate microarray data with the putative pathway structures obtained from the KEGG database and the directional gene–gene interactions reported in the medical literature. We define the symmetric Kullback–Leibler divergence of a pathway and use it to identify the pathway(s) most supported by the microarray data. A Markov chain Monte Carlo sampling algorithm is given for posterior computation in the hierarchical model. An illustrative example shows that the proposed method selects the most supported pathway. Finally, we apply the methodology to a real microarray data set to understand the gene expression profile of the osteoblast lineage at defined stages of differentiation. We observe that our method correctly identifies the pathways that are reported to play essential roles in modulating bone mass.
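For readers unfamiliar with the divergence named in the abstract, the following is a minimal sketch of the generic symmetric Kullback–Leibler divergence between two discrete distributions, taken here as the sum of the two directed divergences. It is not the pathway-level definition used in the paper, which the abstract does not spell out; the function names `kl_divergence` and `symmetric_kl` are illustrative assumptions, and some authors use half of this sum instead.

```python
import math


def kl_divergence(p, q):
    """Directed divergence KL(p || q) for two discrete distributions.

    p and q are equal-length sequences of probabilities. Terms with
    p_i == 0 contribute nothing; if q_i == 0 where p_i > 0, the
    divergence is infinite.
    """
    if len(p) != len(q):
        raise ValueError("p and q must have the same length")
    total = 0.0
    for pi, qi in zip(p, q):
        if pi > 0.0:
            if qi <= 0.0:
                return math.inf
            total += pi * math.log(pi / qi)
    return total


def symmetric_kl(p, q):
    """Symmetric KL divergence, assumed here to be KL(p||q) + KL(q||p)."""
    return kl_divergence(p, q) + kl_divergence(q, p)
```

Unlike the directed divergence, this quantity does not depend on the order of its arguments, which is convenient when comparing two fitted distributions where neither is the natural reference.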
Cite this article
Zhao, Y., Chen, MH., Pei, B. et al. A Bayesian Approach to Pathway Analysis by Integrating Gene–Gene Functional Directions and Microarray Data. Stat Biosci 4, 105–131 (2012). https://doi.org/10.1007/s12561-011-9046-1