Survey: Enhancing protein complex prediction in PPI networks with GO similarity weighting

  • True Price
  • Francisco I. Peña III
  • Young-Rae Cho


Predicting protein complexes from protein-protein interaction (PPI) networks has been the focus of many computational approaches over the last decade. These methods tend to vary in performance based on the structure of the network and the parameters provided to the algorithm. Here, we evaluate the merits of enhancing PPI networks with semantic similarity edge weights using Gene Ontology (GO) and its annotation data. We compare the cluster features and predictive efficacy of six well-known unweighted protein complex detection methods (Clique Percolation, MCODE, DPClus, IPCA, Graph Entropy, and CoAch) against updated weighted implementations. We conclude that incorporating semantic similarity edge weighting in PPI network analysis unequivocally increases the performance of these methods.
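To illustrate what semantic similarity edge weighting means in practice, here is a minimal sketch in Python. It assumes a toy ontology, toy annotations, and a simple term-overlap (Jaccard) similarity over ancestor-closed GO term sets. The term identifiers, protein names, and function names are all hypothetical, and the overlap measure is a stand-in chosen for brevity, not necessarily the measure evaluated in this work.

```python
# Hypothetical toy ontology: each term maps to its set of parent terms.
GO_PARENTS = {
    "GO:root": set(),
    "GO:A": {"GO:root"},
    "GO:B": {"GO:root"},
    "GO:A1": {"GO:A"},
    "GO:A2": {"GO:A"},
}

# Hypothetical annotation data: protein -> directly annotated GO terms.
ANNOTATIONS = {
    "P1": {"GO:A1"},
    "P2": {"GO:A2"},
    "P3": {"GO:B"},
}

# Unweighted PPI network given as a list of edges.
EDGES = [("P1", "P2"), ("P1", "P3")]


def ancestors(term, parents):
    """Return the term together with all of its ancestors in the ontology."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in parents[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def annotation_closure(protein, annotations, parents):
    """Collect every annotated term of a protein plus all ancestor terms."""
    terms = set()
    for term in annotations.get(protein, ()):
        terms |= ancestors(term, parents)
    return terms


def weight_edges(edges, annotations, parents):
    """Assign each edge the Jaccard overlap of its endpoints' term closures."""
    weighted = {}
    for u, v in edges:
        terms_u = annotation_closure(u, annotations, parents)
        terms_v = annotation_closure(v, annotations, parents)
        union = terms_u | terms_v
        # Edges between unannotated proteins get weight 0.0 (an assumption).
        weighted[(u, v)] = len(terms_u & terms_v) / len(union) if union else 0.0
    return weighted


WEIGHTS = weight_edges(EDGES, ANNOTATIONS, GO_PARENTS)
```

In this toy network the edge between P1 and P2, whose annotations share a specific parent term, receives a higher weight than the edge between P1 and P3, which share only the root. A weighted detection method can then favor the first edge when growing or scoring a candidate cluster.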

Key words

protein-protein interactions; PPI; PPI networks; protein interaction networks; semantic similarity; protein complexes; weighted networks





Copyright information

© International Association of Scientists in the Interdisciplinary Areas and Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • True Price (1)
  • Francisco I. Peña III (1)
  • Young-Rae Cho (1)
  1. Department of Computer Science, Baylor University, Waco, USA
