Robust Learning of High-dimensional Biological Networks with Bayesian Networks

Abstract

Structure learning of Bayesian networks from gene expression data has become a promising method for estimating interactions between genes. However, because Bayesian network structure learning is NP-hard, reconstructing a full genetic network with thousands of genes is infeasible. Consequently, the network is usually restricted to a small subset of genes (each gene corresponding to one variable in the Bayesian network). Although this feature reduction step makes structure learning computationally tractable, it can distort the learned structure because relevant genes are left out of the model. In addition, gene expression data are usually very sparse with respect to the number of samples, i.e., the number of genes is much greater than the number of distinct observations. Given these problems, learning robust network features from microarray data is a challenging task. This chapter presents several approaches that address the robustness issue in order to obtain more reliable estimates of learned network features.
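NP-hardness is a statement about worst-case complexity, but a quick way to see why exhaustive structure search does not scale is to count the candidate structures. The following sketch is not taken from the chapter; it is an illustration that uses Robinson's recurrence for the number of labeled directed acyclic graphs (DAGs) on n nodes. The function name `count_dags` is hypothetical.

```python
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def count_dags(n: int) -> int:
    """Number of labeled DAGs on n nodes (Robinson's recurrence).

    a(0) = 1
    a(n) = sum_{k=1..n} (-1)^(k+1) * C(n, k) * 2^(k*(n-k)) * a(n-k)
    """
    if n == 0:
        return 1
    return sum(
        (-1) ** (k + 1) * comb(n, k) * 2 ** (k * (n - k)) * count_dags(n - k)
        for k in range(1, n + 1)
    )


if __name__ == "__main__":
    # Illustrative sizes only; real expression data sets have thousands of genes.
    for n in (1, 2, 3, 4, 5, 10, 20):
        print(f"{n:>2} variables: {count_dags(n)} candidate DAGs")
```

With 5 variables there are already 29281 candidate structures, and with 10 variables the count exceeds 10^18, which suggests why networks over thousands of genes cannot be searched exhaustively and why heuristic or restricted search is used in practice.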

Author information

Corresponding author

Correspondence to Andreas Nägele.

Copyright information

© 2008 Springer-Verlag London Limited

About this chapter

Cite this chapter

Nägele, A., Dejori, M., Stetter, M. (2008). Robust Learning of High-dimensional Biological Networks with Bayesian Networks. In: Schuster, A. (ed.) Robust Intelligent Systems. Springer, London. https://doi.org/10.1007/978-1-84800-261-6_7

  • DOI: https://doi.org/10.1007/978-1-84800-261-6_7

  • Publisher Name: Springer, London

  • Print ISBN: 978-1-84800-260-9

  • Online ISBN: 978-1-84800-261-6

  • eBook Packages: Computer Science, Computer Science (R0)
