Discovering Biological Networks from Diverse Functional Genomic Data

  • Chad L. Myers
  • Camelia Chiriac
  • Olga G. Troyanskaya
Protocol
Part of the Methods in Molecular Biology book series (MIMB, volume 563)

Abstract

Recent advances in biotechnology have produced a wealth of genomic data, which capture a variety of complementary cellular features. While these data promise to yield key insights into molecular biology, much of the available information remains underutilized because of the lack of scalable approaches for integrating signals across large, diverse data sets. A proper framework for capturing these numerous snapshots of complementary phenomena under a variety of conditions can provide the holistic view necessary for developing precise systems-level hypotheses.

Here we describe bioPIXIE, a system for combining information from diverse genomic data sets to predict biological networks. bioPIXIE uses a Bayesian framework for probabilistic integration of several high-throughput genomic data types, including gene expression, protein–protein interactions, genetic interactions, protein localization, and sequence data. The main purpose of the system is to support user-driven exploration of the inferred functional network, which is enabled by a public, web-based interface. We describe the features and supporting methods of this integration and discovery framework and present case examples where bioPIXIE has been used to generate specific, testable hypotheses for Saccharomyces cerevisiae, many of which have been confirmed experimentally.
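As a rough illustration of what probabilistic integration of several evidence types can look like, the sketch below combines per-data-set likelihoods into a posterior probability that two genes are functionally linked. It assumes a naive Bayes structure (evidence sources treated as conditionally independent given the linkage state), which the abstract does not specify. The function name `posterior_functional_link`, the prior, and all likelihood values are hypothetical and chosen only to make the arithmetic visible; they are not bioPIXIE's parameters.

```python
from math import prod


def posterior_functional_link(prior, likelihoods):
    """Posterior P(linked | evidence) under a naive Bayes assumption.

    prior       -- assumed P(linked) before seeing any evidence
    likelihoods -- iterable of (P(e | linked), P(e | not linked)) pairs,
                   one pair per observed piece of evidence
    """
    likelihoods = list(likelihoods)
    # Joint probability of the evidence and each linkage state,
    # multiplying across sources (the independence assumption).
    p_linked = prior * prod(l for l, _ in likelihoods)
    p_unlinked = (1.0 - prior) * prod(u for _, u in likelihoods)
    # Bayes' rule: normalize over the two possible states.
    return p_linked / (p_linked + p_unlinked)


# Hypothetical evidence for one gene pair (illustrative numbers only).
evidence = {
    "coexpression_high": (0.30, 0.05),
    "physical_interaction": (0.20, 0.01),
    "same_localization": (0.60, 0.25),
}

score = posterior_functional_link(0.01, evidence.values())
print(f"posterior probability of functional link: {score:.3f}")
```

Computing such a score for every gene pair yields a weighted graph, which is the kind of object a user could then explore as a functional network.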

Key words

Data integration; network inference; function prediction; Bayesian networks; pathway analysis; functional network; functional linkage

Copyright information

© Humana Press, a part of Springer Science+Business Media, LLC 2009

Authors and Affiliations

  • Chad L. Myers (1)
  • Camelia Chiriac (2)
  • Olga G. Troyanskaya (3)
  1. Department of Computer Science and Engineering, University of Minnesota, Minneapolis, USA
  2. Pharmacopeia, Cranbury, USA
  3. Department of Computer Science, Princeton University, Princeton, USA
