Learning and Validating Bayesian Network Models of Gene Networks

  • Jose M. Peña
  • Johan Björkegren
  • Jesper Tegnér
Part of the Studies in Fuzziness and Soft Computing book series (STUDFUZZ, volume 214)


We propose a framework for learning Bayesian network models of gene networks from data and for validating the learned models. The learning phase finds multiple locally optimal models of the data and reports the best of them. The validation phase assesses confidence in the reported model by studying the different locally optimal models obtained in the learning phase. We prove that our framework is asymptotically correct under the faithfulness assumption. Experiments with real data (320 samples of the expression levels of 32 genes involved in the pheromone response of Saccharomyces cerevisiae, i.e. baker’s yeast) show that our framework is reliable.
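The two phases described above can be illustrated with a small sketch. This is not the authors' algorithm: the abstract does not specify the score, the search operators, or the confidence measure, so the code below assumes a BIC score on discrete data, a randomized first-improvement hill-climber that toggles single edges, and confidence measured as the fraction of runs whose local optimum contains each adjacency of the best model. All function names (`bic_local`, `has_path`, `hill_climb`, `learn_and_validate`) and the toy data are hypothetical.

```python
import math
import random
from collections import Counter
from itertools import permutations


def bic_local(data, child, parents):
    """BIC contribution of one node given a tuple of parents (discrete data)."""
    n = len(data)
    joint = Counter((tuple(row[p] for p in parents), row[child]) for row in data)
    marg = Counter(tuple(row[p] for p in parents) for row in data)
    loglik = sum(c * math.log(c / marg[pa]) for (pa, _), c in joint.items())
    arity = lambda v: len({row[v] for row in data})
    n_params = (arity(child) - 1) * math.prod(arity(p) for p in parents)
    return loglik - 0.5 * math.log(n) * n_params


def has_path(parents, src, dst):
    """True if a directed path src -> ... -> dst exists (src is an ancestor of dst)."""
    stack, seen = [dst], set()
    while stack:
        node = stack.pop()
        if node == src:
            return True
        if node not in seen:
            seen.add(node)
            stack.extend(parents[node])
    return False


def hill_climb(data, n_vars, rng):
    """Assumed search: start from the empty DAG, take the first improving edge toggle."""
    parents = {v: set() for v in range(n_vars)}
    local = {v: bic_local(data, v, ()) for v in range(n_vars)}
    moves = list(permutations(range(n_vars), 2))
    improved = True
    while improved:
        improved = False
        rng.shuffle(moves)  # random move order leads to different local optima
        for u, v in moves:
            new_pa = parents[v] ^ {u}  # add u -> v if absent, remove it if present
            if u in new_pa and has_path(parents, v, u):
                continue  # adding u -> v would close a directed cycle
            new_score = bic_local(data, v, tuple(sorted(new_pa)))
            if new_score > local[v] + 1e-9:
                parents[v], local[v] = new_pa, new_score
                improved = True
                break
    return parents, sum(local.values())


def skeleton(parents):
    """Undirected adjacencies of a DAG stored as child -> set of parents."""
    return {frozenset((u, v)) for v, pa in parents.items() for u in pa}


def learn_and_validate(data, n_vars, n_runs=20, seed=0):
    """Learning: keep the best of n_runs local optima. Validation: edge frequencies."""
    rng = random.Random(seed)
    runs = [hill_climb(data, n_vars, rng) for _ in range(n_runs)]
    best_parents, best_score = max(runs, key=lambda r: r[1])
    counts = Counter(e for pa, _ in runs for e in skeleton(pa))
    confidence = {tuple(sorted(e)): counts[e] / n_runs
                  for e in skeleton(best_parents)}
    return best_parents, best_score, confidence


# Toy data (made up): variable 1 is a noisy copy of variable 0, variable 2 is independent.
toy_rng = random.Random(1)
data = []
for _ in range(500):
    a = toy_rng.random() < 0.5
    b = a if toy_rng.random() < 0.9 else not a
    c = toy_rng.random() < 0.5
    data.append((int(a), int(b), int(c)))

best, score, conf = learn_and_validate(data, 3)
print(best, round(score, 2), conf)
```

In this toy setting the adjacency between variables 0 and 1 should be present in every local optimum, so its confidence should be high, while any adjacency involving variable 2 would be expected to receive low confidence if it appears at all. Counting adjacencies rather than directed edges is a deliberate simplification, since equivalent Bayesian networks can differ in edge direction.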





Copyright information

© Springer 2007

Authors and Affiliations

  • Jose M. Peña (1)
  • Johan Björkegren (2)
  • Jesper Tegnér (2, 1)
  1. IFM, Linköping University, Linköping, Sweden
  2. CGB, Karolinska Institute, Stockholm, Sweden
