Integration of Full-Coverage Probabilistic Functional Networks with Relevance to Specific Biological Processes

  • Katherine James
  • Anil Wipat
  • Jennifer Hallinan
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5647)

Abstract

Probabilistic functional integrated networks are powerful tools with which to draw inferences from high-throughput data. However, network analyses are generally not tailored to specific biological functions or processes. This problem may be overcome by extracting process-specific sub-networks, but this approach discards useful information and is of limited use in poorly annotated areas of the network. Here we describe an extension to existing integration methods which exploits dataset biases in order to emphasise interactions relevant to specific processes, without loss of data. We apply the method to high-throughput data for the yeast Saccharomyces cerevisiae, using Gene Ontology annotations for ageing and telomere maintenance as test processes. The resulting networks perform significantly better than unbiased networks for assigning function to unknown genes, and for clustering to identify important sets of interactions. We conclude that this integration method can be used to enhance network analysis with respect to specific processes of biological interest.

Keywords

Integrated networks, relevance, network analysis, clustering

Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Katherine James (1, 2)
  • Anil Wipat (1, 2)
  • Jennifer Hallinan (1, 2)
  1. School of Computing Science, Newcastle University, Newcastle-upon-Tyne, United Kingdom
  2. Centre for Integrated Biology of Ageing and Nutrition, Ageing Research Laboratories, Institute for Ageing and Health, Newcastle University, Newcastle-upon-Tyne, United Kingdom
