Protein Networks and Pathway Analysis, pp 157-175
Discovering Biological Networks from Diverse Functional Genomic Data
Abstract
Recent advances in biotechnology have produced a wealth of genomic data, which capture a variety of complementary cellular features. While these data promise to yield key insights into molecular biology, much of the available information remains underutilized because of the lack of scalable approaches for integrating signals across large, diverse data sets. A proper framework for capturing these numerous snapshots of complementary phenomena under a variety of conditions can provide the holistic view necessary for developing precise systems-level hypotheses.
Here we describe bioPIXIE, a system that combines information from diverse genomic data sets to predict biological networks. bioPIXIE uses a Bayesian framework to integrate several high-throughput genomic data types probabilistically, including gene expression, protein–protein interactions, genetic interactions, protein localization, and sequence data. The main purpose of the system is to support user-driven exploration of the inferred functional network through a public, web-based interface. We describe the features and supporting methods of this integration and discovery framework and present case examples in which bioPIXIE has been used to generate specific, testable hypotheses for Saccharomyces cerevisiae, many of which have been confirmed experimentally.
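To make the idea of probabilistic integration concrete, the following is a minimal sketch of how evidence from several data types might be combined into a single posterior probability that two genes are functionally related. It assumes a naive Bayes structure with binary evidence, which is an illustrative simplification and not necessarily the network structure used by bioPIXIE. The function name `posterior_related`, the table `CPT`, and every probability value are hypothetical.

```python
# Hypothetical conditional probabilities for binary evidence from each data
# type: P(evidence observed | related) and P(evidence observed | unrelated).
# These numbers are made up for illustration only.
CPT = {
    "coexpression": {"related": 0.6, "unrelated": 0.2},
    "physical_interaction": {"related": 0.3, "unrelated": 0.01},
    "colocalization": {"related": 0.7, "unrelated": 0.4},
}


def posterior_related(prior, evidence):
    """Return P(related | evidence) under a naive Bayes assumption.

    prior    -- prior probability that a gene pair is functionally related
    evidence -- dict mapping data type name to True/False; data types that
                are absent from the dict are treated as missing and skipped
    """
    p_rel = prior
    p_unrel = 1.0 - prior
    for name, observed in evidence.items():
        p_r = CPT[name]["related"]
        p_u = CPT[name]["unrelated"]
        # Multiply in the likelihood of this observation under each hypothesis.
        p_rel *= p_r if observed else 1.0 - p_r
        p_unrel *= p_u if observed else 1.0 - p_u
    # Normalize over the two hypotheses (Bayes' rule).
    return p_rel / (p_rel + p_unrel)


if __name__ == "__main__":
    pair_evidence = {"coexpression": True, "physical_interaction": True}
    print(posterior_related(0.1, pair_evidence))
```

In this sketch, supporting evidence from data types that are rarely observed for unrelated pairs (here, the hypothetical physical interaction entry) raises the posterior most strongly, and missing data types simply leave the estimate unchanged.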
Key words
Data integration, network inference, function prediction, Bayesian networks, pathway analysis, functional network, functional linkage