Inference of Protein Function from the Structure of Interaction Networks

  • Oliver Mason
  • Mark Verwoerd
  • Peter Clifford


We consider the problem of using graph-theoretical techniques to predict the function of unannotated proteins in an organism’s proteome. Specifically, we present an overview of the major methods for predicting protein function based on interaction network structure and describe an abstract framework within which these methods can be treated in a unified fashion. We also present a comparison of the proposed methods and highlight some open theoretical and practical questions in the area.


Keywords: Protein function prediction; Graph algorithms; Graph multicuts; Markov random fields





This work was partially supported by Science Foundation Ireland (SFI) grant 03/RP1/I382 and the Irish Higher Education Authority (HEA) PRTLI Network Mathematics grant. Neither Science Foundation Ireland nor the Higher Education Authority is responsible for any use of data appearing in this publication.



Copyright information

© Springer Science+Business Media, LLC 2011

Authors and Affiliations

  1. Hamilton Institute, NUI Maynooth, Maynooth, Ireland
