Protein Complex Prediction with RNSC
Graph-theoretic analysis of biological networks, such as protein–protein interaction (PPI), gene regulatory, and metabolic networks, has identified a strong relationship between the topology of these networks and the underlying cellular functions and biological processes (Sharan et al. Mol Syst Biol 3:88, 2007). We focus on PPI networks, in which nodes correspond to proteins and edges represent interactions between them. These networks keep growing, so efficient identification of network motifs and dense sub-networks has become necessary. Predicting highly connected sub-graphs in a PPI network is important to biologists because it may help to identify biologically meaningful protein complexes and, with further integrative analysis, may reveal the dynamic assembly of individual subunits in these complexes. In this chapter, we describe a two-step method for predicting protein complexes. The first step partitions the nodes of a PPI network (i.e., proteins) into highly connected groups, or clusters, using the Restricted Neighbourhood Search Clustering (RNSC) algorithm. This yields a set of clusters that represent candidate complexes. The second step filters the candidate complexes on three criteria: minimum cluster size, minimum interaction density, and minimum functional homogeneity, which reflects the extent to which the proteins of a candidate cluster operate in the same functional group. Candidate complexes that pass all three criteria are put forward as predicted protein complexes. The effectiveness of this method has been investigated in previous studies (King et al. Bioinformatics 20:3013–3020, 2004; Brohee and van Helden BMC Bioinformatics 7:488, 2006; Moschopoulos et al. BMC Bioinformatics 10(Suppl 6):S11, 2009).
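The filtering step described above can be illustrated with a minimal Python sketch. It assumes the clustering step has already produced candidate clusters and is not the chapter's implementation: the function names, the threshold values, and the toy network are hypothetical. In particular, functional homogeneity is approximated here by the largest fraction of cluster proteins sharing one annotation, which is a simplified stand-in; the measure used in the chapter may be defined differently.

```python
from itertools import combinations


def density(cluster, edges):
    """Fraction of all possible protein pairs in the cluster that interact."""
    n = len(cluster)
    if n < 2:
        return 0.0
    present = sum(
        1 for u, v in combinations(sorted(cluster), 2)
        if frozenset((u, v)) in edges
    )
    return present / (n * (n - 1) / 2)


def homogeneity(cluster, annotations):
    """Simplified proxy (an assumption): largest fraction of proteins in the
    cluster that share a single functional label."""
    if not cluster:
        return 0.0
    counts = {}
    for protein in cluster:
        for label in annotations.get(protein, ()):
            counts[label] = counts.get(label, 0) + 1
    return max(counts.values(), default=0) / len(cluster)


def filter_candidates(clusters, edges, annotations,
                      min_size, min_density, min_homogeneity):
    """Keep only candidate clusters that pass all three criteria."""
    return [
        c for c in clusters
        if len(c) >= min_size
        and density(c, edges) >= min_density
        and homogeneity(c, annotations) >= min_homogeneity
    ]


# Toy PPI network (hypothetical data): edges are unordered protein pairs.
edges = {frozenset(e) for e in [
    ("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("B", "D"), ("C", "D"),
    ("D", "E"), ("E", "F"),
]}
annotations = {
    "A": {"ribosome"}, "B": {"ribosome"}, "C": {"ribosome"},
    "D": {"ribosome", "transport"}, "E": {"transport"},
    "F": {"kinase"}, "G": {"kinase"},
}
# Candidate clusters as they might come out of the clustering step.
clusters = [{"A", "B", "C", "D"}, {"E", "F", "G"}]

# Threshold values are illustrative only.
predicted = filter_candidates(clusters, edges, annotations,
                              min_size=3, min_density=0.7,
                              min_homogeneity=0.5)
```

In this toy example only the first cluster passes: it is a fully connected group of four proteins that all carry the same label, whereas the second cluster contains just one of its three possible interactions and fails the density criterion.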
Key words: Protein–protein interaction network, Protein complex, Graph theory, Graph clustering, Local search, RNSC, Functional homogeneity
AK is supported by an NSERC Postdoctoral Fellowship. NP is supported by the NSF CAREER IIS-0644424, the NSF Cyber-Enabled Discovery and Innovation OIA-1028394, and the Serbian Ministry of Education and Science Project III44006 grants. IJ is supported in part by the Natural Sciences Research Council (NSERC #104105), Genome Canada via the Ontario Genomics Institute, Canada Foundation for Innovation (CFI #12301 and CFI #203383), Canada Research Chair Program, CIHR Catalyst grant (BIO-99745), and IBM. This research was funded in part by the Ontario Ministry of Health and Long Term Care. The views expressed do not necessarily reflect those of the OMOHLTC. We thank Tijana Milenković for insightful comments.
- 4. Yu H, Braun P, Yildirim MA, Lemmens I, Venkatesan K, Sahalie J, Hirozane-Kishikawa T, Gebreab F, Li N, Simonis N, Hao T, Rual JF, Dricot A, Vazquez A, Murray RR, Simon C, Tardivo L, Tam S, Svrzikapa N, Fan C, de Smet AS, Motyl A, Hudson ME, Park J, Xin X, Cusick ME, Moore T, Boone C, Snyder M, Roth FP, Barabasi AL, Tavernier J, Hill DE, Vidal M (2008) High-quality binary protein interaction map of the yeast interactome network. Science 322:104–110.
- 5. Li S, Armstrong CM, Bertin N, Ge H, Milstein S, Boxem M, Vidalain PO, Han JD, Chesneau A, Hao T, Goldberg DS, Li N, Martinez M, Rual JF, Lamesch P, Xu L, Tewari M, Wong SL, Zhang LV, Berriz GF, Jacotot L, Vaglio P, Reboul J, Hirozane-Kishikawa T, Li Q, Gabel HW, Elewa A, Baumgartner B, Rose DJ, Yu H, Bosak S, Sequerra R, Fraser A, Mango SE, Saxton WM, Strome S, Van Den Heuvel S, Piano F, Vandenhaute J, Sardet C, Gerstein M, Doucette-Stamm L, Gunsalus KC, Harper JW, Cusick ME, Roth FP, Hill DE, Vidal M (2004) A map of the interactome network of the metazoan C. elegans. Science 303:540–543.
- 9. King AD (2004) Graph clustering with restricted neighbourhood search. Department of Computer Science, University of Toronto, Toronto.
- 10. King AD (2005) An efficient cost-based graph clustering algorithm. McGill University, Montreal.
- 11. Van Dongen S (2000) A Cluster Algorithm for Graphs. In Information Systems. National Research Institute for Mathematics and Computer Science in the Netherlands, Amsterdam, 1–40.
- 13. Moschopoulos CN, Pavlopoulos GA, Schneider R, Likothanassis SD, Kossida S (2009) GIBA: a clustering tool for detecting protein complexes. BMC Bioinformatics 10(Suppl 6):S11.
- 17. Brown KR, Otasek D, Ali M, McGuffin MJ, Xie W, Devani B, Toch IL, Jurisica I (2009) NAViGaTOR: network analysis, visualization and graphing Toronto. Bioinformatics 25:3327–3329.
- 20. Giot L, Bader JS, Brouwer C, Chaudhuri A, Kuang B, Li Y, Hao YL, Ooi CE, Godwin B, Vitols E, Vijayadamodar G, Pochart P, Machineni H, Welsh M, Kong Y, Zerhusen B, Malcolm R, Varrone Z, Collis A, Minto M, Burgess S, McDaniel L, Stimpson E, Spriggs F, Williams J, Neurath K, Ioime N, Agee M, Voss E, Furtak K, Renzulli R, Aanensen N, Carrolla S, Bickelhaupt E, Lazovatsky Y, DaSilva A, Zhong J, Stanyon CA, Finley RL Jr, White KP, Braverman M, Jarvie T, Gold S, Leach M, Knight J, Shimkets RA, McKenna MP, Chant J, Rothberg JM (2003) A protein interaction map of Drosophila melanogaster. Science 302:1727–1736.