Computational Analysis of the Yeast Proteome: Understanding and Exploiting Functional Specificity in Genomic Data

Huttenhower, Curtis; Myers, Chad L.; Hibbs, Matthew A.; Troyanskaya, Olga G.

doi:10.1007/978-1-59745-540-4_15

Part of the book series: Methods in Molecular Biology (MIMB, volume 548)

Summary

Modern experimental techniques have produced a wealth of high-throughput data that has enabled the ongoing genomic revolution. As the field continues to integrate experimental and computational analyses of this data, it is essential that performance evaluations of high-throughput results be carried out in a consistent and biologically informative manner. Here, we present an overview of evaluation techniques for high-throughput experimental data and computational methods, and we discuss a number of potential pitfalls in this process. These pitfalls primarily involve the biological diversity of genomic data, which can be masked or misrepresented in overly simplified global evaluations. We describe systems for preserving information about biological context during dataset evaluation, which can help to ensure that multiple different evaluations are more directly comparable. This biological variety in high-throughput data can also be exploited computationally through data integration and process specificity to produce richer systems-level predictions of cellular function. An awareness of these considerations can greatly improve the evaluation and analysis of any high-throughput experimental dataset.
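The point about global evaluations masking biological diversity can be illustrated with a small sketch. This is not the chapter's own method or software. It is a minimal Python example under stated assumptions: the gene names, the two biological processes, the gold standard, and the helper functions `precision_recall` and `recall_by_context` are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def precision_recall(predicted, positives):
    """Return (precision, recall) of a predicted set against a positive set."""
    true_pos = len(predicted & positives)
    precision = true_pos / len(predicted) if predicted else 0.0
    recall = true_pos / len(positives) if positives else 0.0
    return precision, recall


def recall_by_context(predicted, gold):
    """Return recall computed separately for each biological context."""
    by_context = defaultdict(set)
    for pair, context in gold.items():
        by_context[context].add(pair)
    return {
        context: len(predicted & pairs) / len(pairs)
        for context, pairs in by_context.items()
    }


# Hypothetical gold standard: functionally related gene pair -> process.
# Pairs are stored as ordered tuples, so ("g1", "g2") and ("g2", "g1")
# would be treated as different pairs in this toy example.
gold = {
    ("g1", "g2"): "ribosome biogenesis",
    ("g1", "g3"): "ribosome biogenesis",
    ("g2", "g3"): "ribosome biogenesis",
    ("g4", "g5"): "DNA repair",
    ("g6", "g7"): "DNA repair",
}

# Hypothetical predictions from some high-throughput dataset.
predicted = {("g1", "g2"), ("g1", "g3"), ("g2", "g3"), ("g8", "g9")}

global_precision, global_recall = precision_recall(predicted, set(gold))
context_recall = recall_by_context(predicted, gold)

print(global_precision, global_recall)
print(context_recall)
```

In this toy case the global numbers look moderate, yet the per-context breakdown shows that every recovered pair comes from a single process and the other process is missed entirely. That is the kind of information a single global score can hide.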

Author information

Authors and Affiliations

Department of Computer Science, Lewis-Sigler Institute for Integrative Genomics, Carl Icahn Laboratory, Princeton University, Princeton, NJ, 08544, USA
Curtis Huttenhower, Chad L. Myers, Matthew A. Hibbs & Olga G. Troyanskaya

Corresponding author

Correspondence to Olga G. Troyanskaya.

Cite this protocol

Huttenhower, C., Myers, C.L., Hibbs, M.A., Troyanskaya, O.G. (2009). Computational Analysis of the Yeast Proteome: Understanding and Exploiting Functional Specificity in Genomic Data. In: Stagljar, I. (ed.) Yeast Functional Genomics and Proteomics. Methods in Molecular Biology, vol 548. Humana Press. https://doi.org/10.1007/978-1-59745-540-4_15

DOI: https://doi.org/10.1007/978-1-59745-540-4_15
Published: 01 April 2009
Publisher Name: Humana Press
Print ISBN: 978-1-934115-71-8
Online ISBN: 978-1-59745-540-4
eBook Packages: Springer Protocols
