Machine Learning, Volume 52, Issue 1–2, pp 147–167

On Learning Gene Regulatory Networks Under the Boolean Network Model

  • Harri Lähdesmäki
  • Ilya Shmulevich
  • Olli Yli-Harja


Boolean networks are a popular model class for capturing the interactions of genes and global dynamical behavior of genetic regulatory networks. Recently, a significant amount of attention has been focused on the inference or identification of the model structure from gene expression data. We consider the Consistency as well as Best-Fit Extension problems in the context of inferring the networks from data. The latter approach is especially useful in situations when gene expression measurements are noisy and may lead to inconsistent observations. We propose simple efficient algorithms that can be used to answer the Consistency Problem and find one or all consistent Boolean networks relative to the given examples. The same method is extended to learning gene regulatory networks under the Best-Fit Extension paradigm. We also introduce a simple and fast way of finding all Boolean networks having limited error size in the Best-Fit Extension Problem setting. We apply the inference methods to a real gene expression data set and present the results for a selected set of genes.

Keywords: gene regulatory networks, network inference, Consistency Problem, Best-Fit Extension paradigm
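The two problems named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm; it only assumes the standard reading of the two problems. For a target gene and a candidate set of input genes, every observed state transition gives one example (input pattern, next value of the target). A Boolean function consistent with the data exists exactly when no input pattern is observed with both outputs, and the unweighted Best-Fit error is the number of examples any function must misclassify, which is the sum over patterns of the smaller output count. The function names, the `max_error` parameter, and the toy time series are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def best_fit_error(states, target, inputs):
    """Minimum number of transitions misclassified by any Boolean function
    that predicts gene `target` at time t+1 from genes `inputs` at time t."""
    counts = defaultdict(lambda: [0, 0])  # pattern -> [times output 0, times output 1]
    for now, nxt in zip(states, states[1:]):
        pattern = tuple(now[i] for i in inputs)
        counts[pattern][nxt[target]] += 1
    # For each pattern the best function predicts the majority output,
    # so the unavoidable error is the minority count.
    return sum(min(c) for c in counts.values())


def is_consistent(states, target, inputs):
    """True if some Boolean function of `inputs` explains every transition."""
    return best_fit_error(states, target, inputs) == 0


def candidate_input_sets(states, target, k, max_error=0):
    """All k-gene input sets whose Best-Fit error is at most `max_error`."""
    n_genes = len(states[0])
    found = []
    for inputs in combinations(range(n_genes), k):
        err = best_fit_error(states, target, inputs)
        if err <= max_error:
            found.append((inputs, err))
    return found


# Hypothetical binarized time series: one tuple per time point, one entry per gene.
states = [
    (0, 0, 0),
    (0, 1, 0),
    (1, 1, 0),
    (1, 0, 1),
    (1, 1, 0),
    (0, 0, 1),
]

print(is_consistent(states, target=2, inputs=(0, 1)))  # True
print(best_fit_error(states, target=2, inputs=(0,)))   # 1
print(candidate_input_sets(states, target=2, k=2))     # [((0, 1), 0), ((0, 2), 0)]
```

On this toy series two different input pairs explain gene 2 without error, which is why enumerating all consistent networks, rather than stopping at the first, can matter when data are scarce. Raising `max_error` above zero gives a simple version of the limited-error search.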



Copyright information

© Kluwer Academic Publishers 2003

Authors and Affiliations

  • Harri Lähdesmäki (1)
  • Ilya Shmulevich (2)
  • Olli Yli-Harja (1)

  1. Institute of Signal Processing, Digital Media Institute, Tampere University of Technology, Tampere, Finland
  2. Cancer Genomics Laboratory, University of Texas M.D. Anderson Cancer Center, Houston, USA
