Protein Complex Prediction with RNSC

Protocol
Part of the Methods in Molecular Biology book series (MIMB, volume 804)

Abstract

Graph-theoretic analysis of biological networks, such as protein–protein interaction (PPI), gene regulatory, and metabolic networks, has identified a strong relationship between the topology of these networks and the underlying cellular functions and biological processes (Sharan et al. Mol Syst Biol 3:88, 2007). We focus on PPI networks, in which nodes correspond to proteins and edges represent interactions between them. These networks keep growing in size, so efficient identification of network motifs and dense sub-networks has become necessary. Predicting highly connected sub-graphs in a PPI network is important to biologists because it may help to identify biologically meaningful protein complexes and, with further integrative analysis, may reveal the dynamic assembly of individual subunits in these complexes. In this chapter, we describe a two-step method for predicting protein complexes. The first step partitions the nodes of a PPI network (i.e., the proteins) into highly connected groups, or clusters, using the Restricted Neighbourhood Search Clustering (RNSC) algorithm. This yields a set of clusters that represent candidate complexes. The second step filters the candidate complexes on three criteria: minimum cluster size, minimum interaction density, and minimum functional homogeneity, which reflects the extent to which the proteins of a candidate cluster belong to the same functional group. Candidate complexes that pass all three criteria are put forward as predicted protein complexes. The effectiveness of this method has been investigated in previous studies (King et al. Bioinformatics 20:3013–3020, 2004; Brohee and van Helden BMC Bioinformatics 7:488, 2006; and Moschopoulos et al. BMC Bioinformatics 10(Suppl 6):S11, 2009).
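
The second, filtering step described in the abstract can be illustrated with a minimal Python sketch. It assumes the clusters have already been produced by a clustering run and does not implement RNSC itself. All names (`density`, `homogeneity`, `filter_complexes`) and the default thresholds are hypothetical, chosen only for illustration. Density is taken here as the fraction of possible protein pairs in a cluster that interact, and homogeneity as the share of proteins carrying the cluster's most common functional label; this is a simplified proxy, and the published method may define functional homogeneity differently (for example, with a statistical test).

```python
from collections import Counter
from itertools import combinations


def density(cluster, edge_set):
    """Fraction of possible protein pairs in the cluster that interact."""
    n = len(cluster)
    if n < 2:
        return 0.0
    present = sum(1 for a, b in combinations(cluster, 2)
                  if frozenset((a, b)) in edge_set)
    return present / (n * (n - 1) / 2)


def homogeneity(cluster, annotation):
    """Share of proteins with the cluster's most common label (simplified proxy)."""
    labels = [annotation[p] for p in cluster if p in annotation]
    if not labels:
        return 0.0
    return Counter(labels).most_common(1)[0][1] / len(cluster)


def filter_complexes(clusters, edges, annotation,
                     min_size=4, min_density=0.7, min_homogeneity=0.5):
    """Keep clusters that pass all three criteria (thresholds are assumptions)."""
    edge_set = {frozenset(e) for e in edges}  # undirected interactions
    return [c for c in clusters
            if len(c) >= min_size
            and density(c, edge_set) >= min_density
            and homogeneity(c, annotation) >= min_homogeneity]


if __name__ == "__main__":
    # Toy data, invented for this example only.
    edges = [("A", "B"), ("A", "C"), ("A", "D"),
             ("B", "C"), ("B", "D"), ("C", "D"), ("E", "F")]
    annotation = {"A": "ribosome", "B": "ribosome",
                  "C": "ribosome", "D": "transport"}
    clusters = [{"A", "B", "C", "D"}, {"E", "F"}, {"G", "H", "I", "J"}]
    kept = filter_complexes(clusters, edges, annotation)
    print([sorted(c) for c in kept])
```

In this toy example only the first cluster survives: the second is too small, and the third has no interactions among its members. Suitable thresholds depend on the network and the annotation source and would need to be tuned for real data.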

Key words

Protein–protein interaction network; Protein complex; Graph theory; Graph clustering; Local search; RNSC; Functional homogeneity

Notes

Acknowledgments

AK is supported by an NSERC Postdoctoral Fellowship. NP is supported by NSF CAREER grant IIS-0644424, NSF Cyber-Enabled Discovery and Innovation grant OIA-1028394, and Serbian Ministry of Education and Science Project III44006. IJ is supported in part by the Natural Sciences and Engineering Research Council of Canada (NSERC #104105), Genome Canada via the Ontario Genomics Institute, the Canada Foundation for Innovation (CFI #12301 and CFI #203383), the Canada Research Chair Program, a CIHR Catalyst grant (BIO-99745), and IBM. This research was funded in part by the Ontario Ministry of Health and Long-Term Care (OMOHLTC). The views expressed do not necessarily reflect those of the OMOHLTC. We thank Tijana Milenković for insightful comments.

References

  1. Sharan R, Ulitsky I, Shamir R. (2007) Network-based prediction of protein function. Mol Syst Biol, 3:88.
  2. King AD, Przulj N, Jurisica I. (2004) Protein complex prediction via cost-based clustering. Bioinformatics, 20:3013–3020.
  3. von Mering C, Krause R, Snel B, Cornell M, Oliver SG, Fields S, Bork P. (2002) Comparative assessment of large-scale data sets of protein–protein interactions. Nature, 417:399–403.
  4. Yu H, Braun P, Yildirim MA, Lemmens I, Venkatesan K, Sahalie J, Hirozane-Kishikawa T, Gebreab F, Li N, Simonis N, Hao T, Rual JF, Dricot A, Vazquez A, Murray RR, Simon C, Tardivo L, Tam S, Svrzikapa N, Fan C, de Smet AS, Motyl A, Hudson ME, Park J, Xin X, Cusick ME, Moore T, Boone C, Snyder M, Roth FP, Barabasi AL, Tavernier J, Hill DE, Vidal M. (2008) High-quality binary protein interaction map of the yeast interactome network. Science, 322:104–110.
  5. Li S, Armstrong CM, Bertin N, Ge H, Milstein S, Boxem M, Vidalain PO, Han JD, Chesneau A, Hao T, Goldberg DS, Li N, Martinez M, Rual JF, Lamesch P, Xu L, Tewari M, Wong SL, Zhang LV, Berriz GF, Jacotot L, Vaglio P, Reboul J, Hirozane-Kishikawa T, Li Q, Gabel HW, Elewa A, Baumgartner B, Rose DJ, Yu H, Bosak S, Sequerra R, Fraser A, Mango SE, Saxton WM, Strome S, Van Den Heuvel S, Piano F, Vandenhaute J, Sardet C, Gerstein M, Doucette-Stamm L, Gunsalus KC, Harper JW, Cusick ME, Roth FP, Hill DE, Vidal M. (2004) A map of the interactome network of the metazoan C. elegans. Science, 303:540–543.
  6. Sharan R, Ideker T. (2006) Modeling cellular machinery through biological network comparison. Nat Biotechnol, 24:427–433.
  7. de Lichtenberg U, Jensen LJ, Brunak S, Bork P. (2005) Dynamic complex formation during the yeast cell cycle. Science, 307:724–727.
  8. Schaeffer SE. (2007) Graph clustering. Comput Sci Rev, 1:27–64.
  9. King AD. (2004) Graph clustering with restricted neighbourhood search. Department of Computer Science, University of Toronto, Toronto.
  10. King AD. (2005) An efficient cost-based graph clustering algorithm. McGill University, Montreal.
  11. Van Dongen S. (2000) A Cluster Algorithm for Graphs. In Information Systems. National Research Institute for Mathematics and Computer Science in the Netherlands, Amsterdam, 1–40.
  12. Brohee S, van Helden J. (2006) Evaluation of clustering algorithms for protein–protein interaction networks. BMC Bioinformatics, 7:488.
  13. Moschopoulos CN, Pavlopoulos GA, Schneider R, Likothanassis SD, Kossida S. (2009) GIBA: a clustering tool for detecting protein complexes. BMC Bioinformatics, 10(Suppl 6):S11.
  14. Milenkovic T, Przulj N. (2008) Uncovering biological network function via graphlet degree signatures. Cancer Informatics, 6:257–273.
  15. Brown KR, Jurisica I. (2005) Online predicted human interaction database. Bioinformatics, 21:2076–2082.
  16. Brown KR, Jurisica I. (2007) Unequal evolutionary conservation of human protein interactions in interologous networks. Genome Biol, 8:R95.
  17. Brown KR, Otasek D, Ali M, McGuffin MJ, Xie W, Devani B, Toch IL, Jurisica I. (2009) NAViGaTOR: network analysis, visualization and graphing Toronto. Bioinformatics, 25:3327–3329.
  18. Peregrin-Alvarez JM, Xiong X, Su C, Parkinson J. (2009) The modular organization of protein interactions in Escherichia coli. PLoS Comput Biol, 5:e1000523.
  19. Bader GD, Hogue CW. (2003) An automated method for finding molecular complexes in large protein interaction networks. BMC Bioinformatics, 4:2.
  20. Giot L, Bader JS, Brouwer C, Chaudhuri A, Kuang B, Li Y, Hao YL, Ooi CE, Godwin B, Vitols E, Vijayadamodar G, Pochart P, Machineni H, Welsh M, Kong Y, Zerhusen B, Malcolm R, Varrone Z, Collis A, Minto M, Burgess S, McDaniel L, Stimpson E, Spriggs F, Williams J, Neurath K, Ioime N, Agee M, Voss E, Furtak K, Renzulli R, Aanensen N, Carrolla S, Bickelhaupt E, Lazovatsky Y, DaSilva A, Zhong J, Stanyon CA, Finley RL Jr, White KP, Braverman M, Jarvie T, Gold S, Leach M, Knight J, Shimkets RA, McKenna MP, Chant J, Rothberg JM. (2003) A protein interaction map of Drosophila melanogaster. Science, 302:1727–1736.

Copyright information

© Springer Science+Business Media, LLC 2012

Authors and Affiliations

  • Andrew D. King (1)
  • Nataša Pržulj (2)
  • Igor Jurisica (3)

  1. Department of Industrial Engineering and Operations Research, Columbia University, New York, USA
  2. Department of Computing, Imperial College London, London, UK
  3. Ontario Cancer Institute/UHN, Toronto, Canada
