Complex Prediction in Large PPI Networks Using Expansion and Stripe of Core Cliques

  • Original research article
Interdisciplinary Sciences: Computational Life Sciences

Abstract

The widespread availability of large-scale protein-protein interaction (PPI) data has prompted extensive research into the organisation and functionality of the cell through network-level analysis. In bioinformatics and data mining, network clustering has attracted considerable attention as a means of examining the topological and functional aspects of a PPI network. Over the last decade, numerous studies have shown that clustering PPI networks is an effective way to discover functional modules and to infer the functions of uncharacterised proteins, among other tasks. This study proposes a new graph mining approach that detects protein complexes from dense neighbourhoods (highly connected regions) of an interaction graph. Our technique first finds the size-3 cliques associated with each edge (protein interaction); these core cliques are then expanded to form high-density subgraphs. Loosely connected proteins are stripped from these subgraphs to produce potential protein complexes. Finally, redundancy is removed based on the Jaccard coefficient. Computational results on yeast and human protein interaction datasets are presented to highlight the efficiency of the proposed technique. The complexes predicted by the proposed approach are significantly more similar to the gold-standard complexes in the CYC-2008 and CORUM benchmark databases than those predicted by existing approaches.
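The four stages named in the abstract (core cliques, expansion, stripping, Jaccard-based redundancy removal) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not state the expansion rule, the stripping rule, or any threshold, so the `min_frac` and `max_overlap` parameters, their default values, and all function names below are assumptions chosen only to make the idea concrete.

```python
def build_adjacency(edges):
    """Undirected adjacency sets from a list of (protein, protein) pairs."""
    adj = {}
    for u, v in edges:
        if u == v:
            continue  # ignore self-interactions
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def core_cliques(adj):
    """Size-3 cliques: each common neighbour of an edge's endpoints closes one."""
    cliques = set()
    for u in adj:
        for v in adj[u]:
            for w in adj[u] & adj[v]:
                cliques.add(frozenset((u, v, w)))
    return cliques


def expand(core, adj, min_frac=0.5):
    """Assumed rule: add outside nodes linked to >= min_frac of current members."""
    cluster = set(core)
    changed = True
    while changed:
        changed = False
        candidates = set().union(*(adj[n] for n in cluster)) - cluster
        for c in sorted(candidates):
            if len(adj[c] & cluster) >= min_frac * len(cluster):
                cluster.add(c)
                changed = True
    return cluster


def strip(cluster, adj, min_frac=0.5):
    """Assumed rule: drop members linked to < min_frac of the other members."""
    cluster = set(cluster)
    changed = True
    while changed and len(cluster) > 2:
        changed = False
        for n in sorted(cluster):
            if len(adj[n] & cluster) < min_frac * (len(cluster) - 1):
                cluster.discard(n)
                changed = True
                break  # re-evaluate the smaller cluster from scratch
    return cluster


def jaccard(a, b):
    """Jaccard coefficient |a & b| / |a | b| of two non-empty sets."""
    return len(a & b) / len(a | b)


def remove_redundant(clusters, max_overlap=0.8):
    """Keep larger clusters first; skip any whose Jaccard overlap is too high."""
    kept = []
    for c in sorted(clusters, key=lambda s: (-len(s), sorted(s))):
        if all(jaccard(c, k) < max_overlap for k in kept):
            kept.append(c)
    return kept


def predict_complexes(edges):
    """Run the four stages on an edge list and return candidate complexes."""
    adj = build_adjacency(edges)
    refined = {frozenset(strip(expand(c, adj), adj)) for c in core_cliques(adj)}
    return remove_redundant([set(c) for c in refined if len(c) >= 3])


# Toy network: a 4-clique A-B-C-D, a pendant protein E, and a triangle X-Y-Z.
toy_edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("B", "D"),
             ("C", "D"), ("D", "E"), ("X", "Y"), ("Y", "Z"), ("X", "Z")]
complexes = predict_complexes(toy_edges)
```

On the toy network, every triangle inside the 4-clique expands to the same four proteins, the pendant protein E is never admitted because it touches only one member, and the isolated triangle survives unchanged. On a real PPI network the thresholds would need tuning against a benchmark such as CYC-2008 or CORUM.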

Acknowledgements

We acknowledge the infrastructure and computational facilities received from DST-FIST Bioinformatics Lab of IIIT Bhubaneswar.

Author information

Corresponding author

Correspondence to Tushar Ranjan Sahoo.

Ethics declarations

Conflict of Interest

On behalf of all authors, the corresponding author confirms that there are no conflicts of interest.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Sahoo, T.R., Vipsita, S. & Patra, S. Complex Prediction in Large PPI Networks Using Expansion and Stripe of Core Cliques. Interdiscip Sci Comput Life Sci 15, 331–348 (2023). https://doi.org/10.1007/s12539-022-00541-z

  • DOI: https://doi.org/10.1007/s12539-022-00541-z
