Abstract
Structure learning of Bayesian networks from gene expression data has become a potentially useful method for estimating interactions between genes. However, because Bayesian network structure learning is NP-hard, reconstructing the full genetic network with thousands of genes is infeasible. Consequently, the network size is usually restricted drastically to a small set of genes (each gene corresponding to a variable in the Bayesian network). Although this feature reduction step makes structure learning computationally tractable, the learned structure might be adversely affected because the excluded genes are missing from the model. Additionally, gene expression data are usually very sparse with respect to the number of samples, i.e., the number of genes is much greater than the number of distinct observations. Given these problems, learning robust network features from microarray data is a challenging task. This chapter presents several approaches that tackle the robustness issue in order to obtain more reliable estimates of learned network features.
Copyright information
© 2008 Springer-Verlag London Limited
About this chapter
Cite this chapter
Nägele, A., Dejori, M., Stetter, M. (2008). Robust Learning of High-dimensional Biological Networks with Bayesian Networks. In: Schuster, A. (eds) Robust Intelligent Systems. Springer, London. https://doi.org/10.1007/978-1-84800-261-6_7
DOI: https://doi.org/10.1007/978-1-84800-261-6_7
Publisher Name: Springer, London
Print ISBN: 978-1-84800-260-9
Online ISBN: 978-1-84800-261-6
eBook Packages: Computer Science, Computer Science (R0)