Clustering PPI Networks

  • Sourav S. Bhowmick
  • Boon-Siew Seah
Part of the Computational Biology book series (COBO, volume 24)


With large-scale PPI networks now available, significant research effort over the last decade has gone into analyzing these networks to understand cellular organization and functioning [1].


Keywords (added by machine, not by the authors): Functional Module; Protein Pair; Ensemble Cluster; Dense Subgraph; Hierarchical Cluster Tree


References

  1. A. Zhang, Protein Interaction Networks: Computational Analysis (Cambridge University Press, 2009)
  2. S.S. Bhowmick, B.-S. Seah, Clustering and Summarizing Protein-Protein Interaction Networks: A Survey. IEEE Trans. Knowl. Data Eng. 28(3), 638–658 (2016)
  3. J. Ji, A. Zhang et al., Functional module detection from protein-protein interaction networks. IEEE Trans. Knowl. Data Eng. 26(2) (2014)
  4. F. Radicchi, C. Castellano et al., Defining and identifying communities in networks. PNAS 101(9) (2004)
  5. F. Luo, Y. Yang et al., Modular organization of protein interaction networks. Bioinformatics 23(2), 207–214 (2007)
  6. M.P.H. Stumpf, T. Thorne et al., Estimating the size of the human interactome. PNAS 105(19), 6959–6964 (2008)
  7. M.J. Barber, Modularity and community detection in bipartite networks. Phys. Rev. E 76(6) (2007)
  8. M.E.J. Newman, Modularity and community structure in networks. PNAS 103(23) (2006)
  9. J. Ruan, W. Zhang, An efficient spectral algorithm for network community discovery and its applications to biological and social networks, in Proceedings of ICDM, 2007, pp. 643–648
  10. U. Brandes, D. Delling et al., On finding graph clusterings with maximum modularity, in Graph-Theoretic Concepts in Computer Science, 2007, pp. 121–132
  11. X. Xu, N. Yuruk, Z. Feng, T.A.J. Schweiger, SCAN: a structural clustering algorithm for networks, in ACM SIGKDD, 2007
  12. H. Sun, J. Huang et al., gSkeletonClu: density-based network clustering via structure-connected tree division or agglomeration, in IEEE ICDM, 2010
  13. M. Newman, M. Girvan, Finding and evaluating community structure in networks. Phys. Rev. E 69(2) (2004)
  14. J. Huang, H. Sun et al., SHRINK: a structural clustering algorithm for detecting hierarchical communities in networks, in ACM CIKM, 2010
  15. A. Clauset, M.E.J. Newman, C. Moore, Finding community structure in very large networks. Phys. Rev. E 70(6) (2004)
  16. G. Karypis, V. Kumar, A fast and high quality multilevel scheme for partitioning irregular graphs. SIAM J. Sci. Comput. 20, 359–392 (1998)
  17. D.A. Spielman, S.-H. Teng, A local clustering algorithm for massive graphs and its application to nearly-linear time graph partitioning, Sept 2008
  18. Y. Zhou, H. Cheng, J.X. Yu, Clustering large attributed graphs: an efficient incremental approach, in IEEE ICDM, 2010
  19. T. Nepusz, H. Yu, A. Paccanaro, Detecting overlapping protein complexes in protein-protein interaction networks. Nat. Methods 9, 471–472 (2012)
  20. C.G. Rivera, R. Vakil, J.S. Bader, NeMo: network module identification in Cytoscape. BMC Bioinform. 11(Suppl 1), S61 (2010)
  21. S. Asur, D. Ucar, S. Parthasarathy, An ensemble framework for clustering protein-protein interaction networks. Bioinformatics 23, i29–40 (2007)
  22. H.N. Chua, W.-K. Sung, L. Wong, Exploiting indirect neighbours and topological weight to predict protein function from protein–protein interactions. Bioinformatics 22(13) (2006)
  23. S. Navlakha, J. White, N. Nagarajan, M. Pop, C. Kingsford, Finding biologically accurate clusterings in hierarchical tree decompositions using the variation of information. J. Comput. Biol. 17, 503–516 (2010)
  24. C. Kingsford, S. Navlakha, Exploring biological network dynamics with ensembles of graph partitions, in Pacific Symposium on Biocomputing, 2010, pp. 166–77
  25. G.D. Bader, C.W.V. Hogue, An automated method for finding molecular complexes in large protein interaction networks. BMC Bioinform. 4, 2 (2003)
  26. A.C. Gavin, M. Bosche et al., Functional organization of the yeast proteome by systematic analysis of protein complexes. Nature 415, 141–147 (2002)
  27. M.C. Costanzo, M.E. Crawford et al., YPD, PombePD and WormPD: model organism volumes of the BioKnowledge Library, an integrated resource for protein information. Nucleic Acids Res. 29(1), 75–79 (2001)
  28. A.H. Tong, B. Drees et al., A combined experimental and computational strategy to define protein interaction networks for peptide recognition modules. Science 295, 321–324 (2001)
  29. P. Uetz, L. Giot, G. Cagney, T.A. Mansfield, R.S. Judson, J.R. Knight, D. Lockshon, V. Narayan, M. Srinivasan, P. Pochart, A. Qureshi-Emili, Y. Li, B. Godwin, D. Conover, T. Kalbfleisch, G. Vijayadamodar, M. Yang, M. Johnston, S. Fields, J.M. Rothberg, A comprehensive analysis of protein-protein interactions in Saccharomyces cerevisiae. Nature 403, 623–627 (2000)
  30. B.L. Drees, B. Sundin et al., A protein interaction map for cell polarity development. J. Cell Biol. 154(3) (2001)
  31. A.E. Mayes, L. Verdone et al., Characterization of Sm-like proteins in yeast and their association with U6 snRNA. EMBO J. 18(15) (1999)
  32. T. Ito, T. Chiba, R. Ozawa, M. Yoshida, M. Hattori, Y. Sakaki, A comprehensive two-hybrid analysis to explore the yeast protein interactome. Proc. Natl. Acad. Sci. USA 98, 4569–4574 (2001)
  33. M. Altaf-Ul-Amin, Y. Shinbo et al., Development and implementation of an algorithm for detection of protein complexes in large interaction networks. BMC Bioinform. 7(1) (2006)
  34. I. Xenarios, L. Salwinski et al., DIP, the Database of Interacting Proteins: a research tool for studying cellular networks of protein interactions. Nucleic Acids Res. 30(1), 303–305 (2002)
  35. M. Li, J.-E. Chen et al., Modifying the DPClus algorithm for identifying protein complexes based on new topological structures. BMC Bioinform. 9(1) (2008)
  36. A.D. King, N. Przulj, I. Jurisica, Protein complex prediction via cost-based clustering. Bioinformatics 20, 3013–3020 (2004)
  37. C. von Mering, R. Krause et al., Comparative assessment of large-scale data sets of protein-protein interactions. Nature 417, 399–403 (2002)
  38. L. Giot, J.S. Bader et al., A protein interaction map of Drosophila melanogaster. Science 302, 1727–1736 (2003)
  39. S. Li, C.M. Armstrong et al., A map of the interactome network of the metazoan C. elegans. Science 303, 540–543 (2004)
  40. P. Pei, A. Zhang, A “seed-refine” algorithm for detecting protein complexes from protein interaction data. IEEE Trans. Nanobiosci. 6(1), 43–50 (2007)
  41. A.C. Gavin, P. Aloy et al., Proteome survey reveals modularity of the yeast cell machinery. Nature 440, 431–436 (2006)
  42. N.J. Krogan, G. Cagney, H. Yu, G. Zhong, X. Guo, A. Ignatchenko, J. Li, S. Pu, N. Datta, A.P. Tikuisis, T. Punna, J.M. Peregrín-Alvarez, M. Shales, X. Zhang, M. Davey, M.D. Robinson, A. Paccanaro, J.E. Bray, A. Sheung, B. Beattie, D.P. Richards, V. Canadien, A. Lalev, F. Mena, P. Wong, A. Starostine, M.M. Canete, J. Vlasblom, S. Wu, C. Orsi, S.R. Collins, S. Chandran, R. Haw, J.J. Rilstone, K. Gandi, N.J. Thompson, G. Musso, P. St Onge, S. Ghanny, M.H.Y. Lam, G. Butland, A.M. Altaf-Ul, S. Kanaya, A. Shilatifard, E. O’Shea, J.S. Weissman, C.J. Ingles, T.R. Hughes, J. Parkinson, M. Gerstein, S.J. Wodak, A. Emili, J.F. Greenblatt, Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature 440, 637–643 (2006)
  43. S.R. Collins, P. Kemmeren et al., Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae. Mol. Cell Proteomics 6(3), 439–450 (2007)
  44. P. Jiang, M. Singh, SPICi: a fast clustering algorithm for large biological networks. Bioinformatics 26, 1105–1111 (2010)
  45. L.J. Jensen, M. Kuhn et al., STRING 8–a global view on proteins and their functional interactions in 630 organisms. Nucleic Acids Res. 37, D412–D416 (2009)
  46. C. Huttenhower, E.M. Haley et al., Exploring the human genome with functional maps. Genome Res. 19(6) (2009)
  47. A.J. Enright, S. Van Dongen, C.A. Ouzounis, An efficient algorithm for large-scale detection of protein families. Nucleic Acids Res. 30, 1575–1584 (2002)
  48. V. Satuluri, S. Parthasarathy, D. Ucar, Markov clustering of protein interaction networks with improved balance and scalability, in Proceedings of the First ACM International Conference on Bioinformatics and Computational Biology - BCB ’10 (ACM Press, New York, USA, 2010), p. 247
  49. S. Razick, G. Magklaras, I.M. Donaldson, iRefIndex: a consolidated protein interaction database with provenance. BMC Bioinform. 9 (2008)
  50. Y.-K. Shih, S. Parthasarathy, Identifying functional modules in interaction networks through overlapping Markov clustering. Bioinformatics 28, i473–i479 (2012)
  51. L. Kiemer, S. Costa et al., WI-PHI: a weighted yeast interactome enriched for direct physical interactions. Proteomics 7, 932–943 (2007)
  52. J.B. Pereira-Leal, A.J. Enright, C.A. Ouzounis, Detection of functional modules from protein interaction networks. PROTEINS: Struct. Funct. Bioinform. 54(1), 49–57 (2004)
  53. Y.-R. Cho, W. Hwang, M. Ramanathan, A. Zhang, Semantic integration to identify overlapping functional modules in protein interaction networks. BMC Bioinform. 8(1) (2007)
  54. Y. Cho, L. Shi, A. Zhang, Functional module detection by functional flow pattern mining in protein interaction networks. BMC Bioinform. 9 (2008)
  55. X. Lei, X. Huang, L. Shi, A. Zhang, Clustering PPI data based on improved functional-flow model through quantum-behaved PSO. Int. J. Data Mining Bioinform. 6(1), 42–60 (2012)
  56. V. Spirin, L.A. Mirny, Protein complexes and functional modules in molecular networks. Proc. Natl. Acad. Sci. USA 100, 12123–12128 (2003)
  57. B. Adamcsek, G. Palla et al., CFinder: locating cliques and overlapping modules in biological networks. Bioinformatics 22(8) (2006)
  58. S. Zhang, X. Ning, X.-S. Zhang, Identification of functional modules in a PPI network by clique percolation clustering. Comput. Biol. Chem. 30(6), 445–451 (2006)
  59. G. Cui, Y. Chen et al., An algorithm for finding functional modules and protein complexes in protein-protein interaction networks. J. Biomed. Biotechnol. (2008)
  60. A. Ruepp, A. Zollner et al., The FunCat, a functional annotation scheme for systematic classification of proteins from whole genomes. Nucleic Acids Res. 32(18), 5539–5545 (2004)
  61. G. Liu, L. Wong, H.N. Chua, Complex discovery from weighted PPI networks. Bioinformatics 25, 1891–1897 (2009)
  62. Y. Ho, A. Gruhler et al., Systematic identification of protein complexes in Saccharomyces cerevisiae by mass spectrometry. Nature 415, 180–183 (2002)
  63. P. Aloy, B. Bottcher et al., Structure-based assembly of protein complexes in yeast. Science 303, 2026–2029 (2004)
  64. E. Georgii, S. Dietmann, T. Uno, P. Pagel, K. Tsuda, Enumeration of condition-dependent dense modules in protein interaction networks. Bioinformatics 25, 933–940 (2009)
  65. U. Guldener et al., MPact: the MIPS protein interaction resource on yeast. Nucleic Acids Res. 34, D436–D441 (2006)
  66. S. Kerrien, Y. Alam-Faruque, B. Aranda, I. Bancarz, A. Bridge, C. Derow, E. Dimmer, M. Feuermann, A. Friedrichsen, R. Huntley, C. Kohler, J. Khadake, C. Leroy, A. Liban, C. Lieftink, L. Montecchi-Palazzi, S. Orchard, J. Risse, K. Robbe, B. Roechert, D. Thorneycroft, Y. Zhang, R. Apweiler, H. Hermjakob, IntAct–open source resource for molecular interaction data. Nucleic Acids Res. 35, D561–D565 (2007)
  67. B.J. Frey, D. Dueck, Clustering by passing messages between data points. Science 315, 972–976 (2007)
  68. K. Macropol, T. Can, A.K. Singh, RRW: repeated random walks on genome-scale protein networks for local cluster discovery. BMC Bioinform. 10, 283 (2009)
  69. J. Chen, B. Yuan, Detecting functional modules in the yeast protein–protein interaction network. Bioinformatics 22(18) (2006)
  70. J.M. Cherry, C. Adler et al., SGD: Saccharomyces genome database. Nucleic Acids Res. 26(1) (1998)
  71. D. Dotan-Cohen, A.A. Melkman, S. Kasif, Hierarchical tree snipping: clustering guided by prior knowledge. Bioinformatics 23, 3335–3342 (2007)
  72. M. Mete, F. Tang, X. Xu, N. Yuruk, A structural approach for finding functional modules from large biological networks. BMC Bioinform. 9 (2008)
  73. M. Jayapandian, A. Chapman et al., Michigan Molecular Interactions (MiMI): putting the jigsaw puzzle together. Nucleic Acids Res. 35, D566–D571 (2006)
  74. D. Greene, G. Cagney, N. Krogan, P. Cunningham, Ensemble non-negative matrix factorization methods for clustering protein-protein interactions. Bioinformatics 24(15), 1722–1728 (2008)
  75. E. Segal, H. Wang, D. Koller, Discovering molecular pathways from protein interaction and gene expression data. Bioinformatics 19 (2003)
  76. A.P. Gasch, P.T. Spellman et al., Genomic expression program in the response of yeast cells to environmental changes. Mol. Biol. Cell 11 (2000)
  77. P.T. Spellman, G. Sherlock et al., Comprehensive identification of cell cycle-regulated genes of the yeast Saccharomyces cerevisiae by microarray hybridization. Mol. Biol. Cell 9(12) (1998)
  78. H. Lu, B. Shi et al., Integrated analysis of multiple data sources reveals modular structure of biological networks. Biochem. Biophys. Res. Commun. 345(1), 302–309 (2006)
  79. W.K. Huh, J.V. Falvo et al., Global analysis of protein localization in budding yeast. Nature 425, 686–691 (2003)
  80. J.M. Stuart, E. Segal, D. Koller, S.K. Kim, A gene-coexpression network for global discovery of conserved genetic modules. Science 302 (2003)
  81. I.A. Maraziotis, K. Dimitrakopoulou, A. Bezerianos, Growing functional modules from a seed protein via integration of protein interaction and gene expression data. BMC Bioinform. 8(1) (2007)
  82. A. Patil, H. Nakamura, Filtering high-throughput protein-protein interaction data using a combination of genomic features. BMC Bioinform. 6(100) (2005)
  83. H. Zheng, H. Wang, D.H. Glass, Integration of genomic data for inferring protein complexes from global protein–protein interaction networks. IEEE Trans. Syst. Man Cybern. Part B: Cybern. 38(1) (2008)
  84. L.J. Lu, Y. Xia et al., Assessing the limits of genomic data integration for predicting protein networks. Genome Res. 15(7) (2005)
  85. T.R. Hughes, M.J. Marton et al., Functional discovery via a compendium of expression profiles. Cell 102(1) (2000)
  86. R.J. Cho, M.J. Campbell et al., A genome-wide transcriptional analysis of the mitotic cell cycle. Mol. Cell 2(1), 65–73 (1998)
  87. I. Ulitsky, R. Shamir, Identifying functional modules using expression profiles and confidence-scored protein interactions. Bioinformatics 25(9), 1158–1164 (2009)
  88. A.P. Gasch, M. Huang et al., Genomic expression responses to DNA-damaging agents and the regulatory role of the yeast ATR homolog Mec1p. Mol. Biol. Cell 12(10) (2001)
  89. L. Shi, X. Lei, A. Zhang, Detecting protein complexes with semi-supervised learning in protein interaction networks. Proteome Sci. 9 (2011)
  90. H. Wang, W. Wang, J. Yang, P. Yu, Clustering by pattern similarity in large data sets, in ACM SIGMOD, 2002
  91. J. Sun, B. Feng, W.B. Xu, Particle swarm optimization with particles having quantum behavior, in IEEE Proceedings of Congress on Evolutionary Computation, 2004
  92. G. Palla, I. Derényi, I. Farkas, T. Vicsek, Uncovering the overlapping community structure of complex networks in nature and society. Nature 435, 814–818 (2005)
  93. K. Voevodski, S.-H. Teng, Y. Xia, Finding local communities in protein networks. BMC Bioinform. 10(1) (2009)
  94. J. Vlasblom, S.J. Wodak, Markov clustering versus affinity propagation for the partitioning of protein interaction graphs. BMC Bioinform. 10, 99 (2009)
  95. M. Girvan, M.E.J. Newman, Community structure in social and biological networks. PNAS 99(12) (2002)
  96. I. Ulitsky, R. Shamir, Identification of functional modules using network topology and high-throughput data. BMC Syst. Biol. 8(1) (2007)
  97. S. Brohee, J. van Helden, Evaluation of clustering algorithms for protein-protein interaction networks. BMC Bioinform. 7(1) (2006)

Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  1. School of Computer Science and Engineering, Nanyang Technological University, Singapore, Singapore