Robust Learning of High-dimensional Biological Networks with Bayesian Networks

Nägele, Andreas; Dejori, Mathäus; Stetter, Martin

doi:10.1007/978-1-84800-261-6_7

Abstract

Structure learning of Bayesian networks from gene expression data has become a potentially useful method for estimating interactions between genes. However, because Bayesian network structure learning is NP-hard, reconstructing the full genetic network with thousands of genes is infeasible. Consequently, the network is usually restricted to a small subset of genes (each gene corresponding to a variable in the Bayesian network). Although this feature reduction step makes structure learning computationally tractable, the learned structure may be adversely affected because the omitted genes are missing from the model. Additionally, gene expression data are usually very sparse with respect to the number of samples, i.e., the number of genes is much greater than the number of observations. Given these problems, learning robust network features from microarray data is a challenging task. This chapter presents several approaches that address the robustness issue in order to obtain more reliable estimates of learned network features.
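
As a rough illustration of why exhaustive structure search does not scale, the following sketch counts the labeled directed acyclic graphs (DAGs) on n nodes using Robinson's recurrence. It is not taken from the chapter: the function name `count_dags` is hypothetical, and the size of the search space only illustrates the difficulty (it is not the NP-hardness argument itself).

```python
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def count_dags(n: int) -> int:
    """Number of labeled DAGs on n nodes, via Robinson's recurrence.

    a(0) = 1
    a(n) = sum_{k=1..n} (-1)^(k+1) * C(n, k) * 2^(k*(n-k)) * a(n-k)
    """
    if n == 0:
        return 1
    return sum(
        (-1) ** (k + 1) * comb(n, k) * 2 ** (k * (n - k)) * count_dags(n - k)
        for k in range(1, n + 1)
    )


if __name__ == "__main__":
    # Illustrative sizes only; real expression data sets have thousands of genes.
    for n in (1, 2, 3, 4, 5, 10, 20):
        print(f"{n:>2} variables: {count_dags(n)} candidate DAG structures")
```

Already at 5 variables there are 29281 candidate structures, and the count grows super-exponentially, which is consistent with the abstract's point that the set of genes usually has to be restricted before learning.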

Author information

Authors and Affiliations

Department of Computer Science, Technical University Munich, Boltzmannstr. 3, 85748, Garching, Germany
Andreas Nägele
Department of Information & Communications, Siemens Corporate Technology, Otto-Hahn-Ring 6, 81730, Munich, Germany
Andreas Nägele
Intelligent Vision and Reasoning Princeton, Siemens Corporate Research, NJ, USA
Mathäus Dejori
Department of Information and Communications, Siemens Corporate Technology, D-81730, Munich, Germany
Martin Stetter

Corresponding author

Correspondence to Andreas Nägele.

Editor information

Editors and Affiliations

School of Computing and Mathematics, University of Ulster at Jordanstown, Jordanstown, Northern Ireland, UK
Alfons Schuster

About this chapter

Cite this chapter

Nägele, A., Dejori, M., Stetter, M. (2008). Robust Learning of High-dimensional Biological Networks with Bayesian Networks. In: Schuster, A. (ed.) Robust Intelligent Systems. Springer, London. https://doi.org/10.1007/978-1-84800-261-6_7

DOI: https://doi.org/10.1007/978-1-84800-261-6_7
Publisher Name: Springer, London
Print ISBN: 978-1-84800-260-9
Online ISBN: 978-1-84800-261-6
eBook Packages: Computer Science, Computer Science (R0)