Effects of Functional Bias on Supervised Learning of a Gene Network Model

Computational Systems Biology

Part of the book series: Methods in Molecular Biology (MIMB, volume 541)

Abstract

Gene networks have proven to be an effective approach for modeling cellular systems, capable of capturing some of the extreme complexity of cells in a formal theoretical framework. Not surprisingly, this complexity, combined with our still-limited amount of experimental data measuring the genes and their interactions, makes the reconstruction of gene networks difficult. One powerful strategy has been to analyze functional genomics data using supervised learning of network relationships based upon reference examples from our current knowledge. However, this reliance on the set of reference examples for the supervised learning can introduce major pitfalls, with misleading reference sets resulting in suboptimal learning. There are three requirements for an effective reference set: comprehensiveness, reliability, and freedom from bias. Perhaps not too surprisingly, our current knowledge about gene function is highly biased toward several specific biological functions, such as protein synthesis. This functional bias in the reference set, especially combined with the corresponding functional bias in data sets, induces biased learning that can, in turn, lead to false positive biological discoveries, as we show here for the yeast Saccharomyces cerevisiae. This suggests that careful use of current knowledge and genomics data is required for successful gene network modeling using the supervised learning approach. We provide guidance for better use of these data in learning gene networks.
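The effect of functional bias described above can be illustrated with a small sketch. It is not the authors' procedure: the gene identifiers, category labels, and the helper names `category_shares` and `precision` are all hypothetical, and the toy data are invented purely for illustration. The sketch measures how much of a reference set comes from each functional category, then scores a set of predicted gene pairs against the reference set with and without the dominant category.

```python
from collections import Counter

# Hypothetical toy reference set: (gene_a, gene_b, functional_category).
# Three of the four reference pairs come from a single category.
REFERENCE = [
    ("g1", "g2", "protein synthesis"),
    ("g1", "g3", "protein synthesis"),
    ("g2", "g3", "protein synthesis"),
    ("g4", "g5", "DNA repair"),
]

# Hypothetical predicted gene pairs from some data set.
PREDICTED = [("g1", "g2"), ("g2", "g3"), ("g3", "g1"), ("g4", "g6"), ("g5", "g7")]


def category_shares(reference_pairs):
    """Fraction of reference pairs contributed by each functional category."""
    counts = Counter(category for _, _, category in reference_pairs)
    total = sum(counts.values())
    return {category: n / total for category, n in counts.items()}


def precision(predicted_pairs, reference_pairs, exclude=None):
    """Fraction of predicted pairs found in the reference set.

    If `exclude` names a category, reference pairs from that category are
    masked: predictions matching them are dropped from the evaluation
    instead of being counted as hits or misses.
    """
    positives, masked = set(), set()
    for a, b, category in reference_pairs:
        target = masked if category == exclude else positives
        target.add(frozenset((a, b)))  # unordered pair
    kept = [frozenset(p) for p in predicted_pairs if frozenset(p) not in masked]
    if not kept:
        return 0.0
    return sum(p in positives for p in kept) / len(kept)


shares = category_shares(REFERENCE)
overall = precision(PREDICTED, REFERENCE)
unbiased = precision(PREDICTED, REFERENCE, exclude="protein synthesis")
print(shares, overall, unbiased)
```

In this toy case the predictions look reasonably precise against the full reference set, but every hit comes from the over-represented category. Once that category is masked, no predicted pair is supported, which is the kind of misleading benchmark the abstract warns about.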

Acknowledgments

This work was supported by grants from the N.S.F. (IIS-0325116, EIA-0219061, 0241180), N.I.H. (GM06779-01), Welch (F-1515), and a Packard Fellowship (E.M.M.). We thank Cynthia V. Marcotte and Ray Hardesty for help with editing.

Copyright information

© 2009 Humana Press, a part of Springer Science+Business Media, LLC

About this protocol

Cite this protocol

Lee, I., Marcotte, E.M. (2009). Effects of Functional Bias on Supervised Learning of a Gene Network Model. In: Ireton, R., Montgomery, K., Bumgarner, R., Samudrala, R., McDermott, J. (eds) Computational Systems Biology. Methods in Molecular Biology, vol 541. Humana Press. https://doi.org/10.1007/978-1-59745-243-4_20

  • DOI: https://doi.org/10.1007/978-1-59745-243-4_20

  • Publisher Name: Humana Press

  • Print ISBN: 978-1-58829-905-5

  • Online ISBN: 978-1-59745-243-4

  • eBook Packages: Springer Protocols
