Abstract
We propose a framework for learning Bayesian network models of gene networks from data and for validating them. The learning phase selects multiple locally optimal models of the data and reports the best of them. The validation phase assesses confidence in the reported model by studying the different locally optimal models obtained in the learning phase. We prove that our framework is asymptotically correct under the faithfulness assumption. Experiments with real data (320 samples of the expression levels of 32 genes involved in the pheromone response of Saccharomyces cerevisiae, i.e. baker’s yeast) show that our framework is reliable.
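The two phases described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the locally optimal models are searched for, scored, or compared, so everything below is an assumption made for illustration. Each model is assumed to be a (score, edge set) pair with higher scores being better, edges are treated as undirected, confidence in an edge is taken to be the fraction of locally optimal models that contain it, and the helper names (edge, report_best, edge_confidence), the gene labels g1 to g3, and the scores are all hypothetical toy values.

```python
from collections import Counter


def edge(a, b):
    """Undirected edge between two variables (hypothetical representation)."""
    return frozenset((a, b))


def report_best(models):
    """Learning phase, last step: return the highest-scoring (score, edges) pair."""
    return max(models, key=lambda m: m[0])


def edge_confidence(models, reported_edges):
    """Validation phase (assumed form): for each edge of the reported model,
    the fraction of locally optimal models that also contain that edge."""
    counts = Counter(e for _, edges in models for e in edges)
    return {e: counts[e] / len(models) for e in reported_edges}


# Toy stand-in for the output of four independent search runs.
# Scores and structures are made up; they are not results from the chapter.
local_optima = [
    (-10.0, {edge("g1", "g2"), edge("g2", "g3")}),
    (-12.5, {edge("g1", "g2")}),
    (-11.0, {edge("g1", "g2"), edge("g1", "g3")}),
    (-13.0, {edge("g1", "g2"), edge("g2", "g3")}),
]

best_score, best_edges = report_best(local_optima)
confidence = edge_confidence(local_optima, best_edges)

for e, c in sorted(confidence.items(), key=lambda item: sorted(item[0])):
    print("-".join(sorted(e)), c)
```

In this toy run the edge g1-g2 appears in all four local optima while g2-g3 appears in only two, so under the assumed confidence measure the first edge of the reported model would be trusted more than the second.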
Copyright information
© 2007 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Peña, J.M., Björkegren, J., Tegnér, J. (2007). Learning and Validating Bayesian Network Models of Gene Networks. In: Lucas, P., Gámez, J.A., Salmerón, A. (eds) Advances in Probabilistic Graphical Models. Studies in Fuzziness and Soft Computing, vol 213. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-68996-6_17
DOI: https://doi.org/10.1007/978-3-540-68996-6_17
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-68994-2
Online ISBN: 978-3-540-68996-6
eBook Packages: Engineering, Engineering (R0)