Complex Prediction in Large PPI Networks Using Expansion and Stripe of Core Cliques

Sahoo, Tushar Ranjan; Vipsita, Swati; Patra, Sabyasachi

doi:10.1007/s12539-022-00541-z

Original research article
Published: 28 October 2022

Volume 15, pages 331–348 (2023)
Interdisciplinary Sciences: Computational Life Sciences

Tushar Ranjan Sahoo, Swati Vipsita & Sabyasachi Patra

Abstract

The widespread availability and importance of large-scale protein-protein interaction (PPI) data have prompted extensive research into the organisation and functionality of the cell through network-level analysis of these data. In bioinformatics and data mining, network clustering has attracted considerable attention as a way to examine the topological and functional aspects of a PPI network. Over the last decade, numerous studies have shown that clustering PPI networks is an effective method for discovering functional modules and inferring the functions of unknown proteins, among other tasks. This research proposes a unique graph mining approach that detects protein complexes using dense neighbourhoods (highly connected regions) in an interaction graph. Our technique first finds the size-3 cliques associated with each edge (protein interaction); these core cliques are then expanded to form high-density subgraphs. Loosely connected proteins are stripped from these subgraphs to produce potential protein complexes. Finally, redundancy is removed based on the Jaccard coefficient. Computational results on yeast and human protein interaction datasets are presented to highlight the efficiency of the proposed technique. The protein complexes predicted by the proposed approach score significantly higher in similarity to the gold-standard complexes in the CYC-2008 and CORUM benchmark databases than those predicted by other existing approaches.
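
The four stages named in the abstract (core cliques, expansion, stripping, redundancy removal) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not give the expansion rule, the stripping rule, or any thresholds, so the density-based expansion, the degree-fraction stripping rule, the function names, and the parameter values (`min_density`, `min_fraction`, `max_jaccard`) are all assumptions chosen for illustration only.

```python
def build_adjacency(edges):
    """Undirected adjacency sets from (protein, protein) pairs; self-loops ignored."""
    adj = {}
    for u, v in edges:
        if u != v:
            adj.setdefault(u, set()).add(v)
            adj.setdefault(v, set()).add(u)
    return adj

def core_cliques(adj):
    """Size-3 cliques: each common neighbour w of an edge (u, v) closes a triangle."""
    cliques = set()
    for u in adj:
        for v in adj[u]:
            for w in adj[u] & adj[v]:
                cliques.add(frozenset((u, v, w)))
    return cliques

def density(adj, nodes):
    """Fraction of possible internal edges that are present."""
    n = len(nodes)
    if n < 2:
        return 0.0
    internal = sum(len(adj[x] & nodes) for x in nodes) / 2
    return internal / (n * (n - 1) / 2)

def expand(adj, core, min_density):
    """ASSUMED rule: greedily add the best-connected neighbour while density holds."""
    cluster = set(core)
    while True:
        frontier = set().union(*(adj[x] for x in cluster)) - cluster
        best = max(frontier, key=lambda c: (len(adj[c] & cluster), str(c)), default=None)
        if best is None or density(adj, cluster | {best}) < min_density:
            return cluster
        cluster.add(best)

def strip(adj, cluster, min_fraction):
    """ASSUMED rule: drop members linked to too small a share of the other members."""
    cluster = set(cluster)
    while len(cluster) > 3:
        weak = {x for x in cluster
                if len(adj[x] & cluster) < min_fraction * (len(cluster) - 1)}
        if not weak:
            break
        cluster -= weak
    return cluster

def jaccard(a, b):
    """Jaccard coefficient |a ∩ b| / |a ∪ b| of two non-empty sets."""
    return len(a & b) / len(a | b)

def remove_redundant(clusters, max_jaccard):
    """Keep larger clusters first; discard any that overlaps a kept one too much."""
    kept = []
    for c in sorted(clusters, key=lambda s: (-len(s), sorted(s))):
        if all(jaccard(c, k) <= max_jaccard for k in kept):
            kept.append(c)
    return kept

def predict_complexes(edges, min_density=0.6, min_fraction=0.5, max_jaccard=0.5):
    """Run the four stages; all threshold defaults are illustrative assumptions."""
    adj = build_adjacency(edges)
    candidates = set()
    for core in core_cliques(adj):
        c = strip(adj, expand(adj, core, min_density), min_fraction)
        if len(c) >= 3:
            candidates.add(frozenset(c))
    return remove_redundant(candidates, max_jaccard)
```

On a toy graph made of a four-protein clique with one pendant protein attached, the expansion step in this sketch first absorbs the pendant protein (the subgraph stays above the assumed density threshold) and the stripping step then removes it again, which is the behaviour the abstract describes for loosely connected proteins.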

Acknowledgements

We acknowledge the infrastructure and computational facilities received from DST-FIST Bioinformatics Lab of IIIT Bhubaneswar.

Author information

Swati Vipsita and Sabyasachi Patra contributed equally to this work.

Authors and Affiliations

CSE, IIIT Bhubaneswar, Gothapatna, Bhubaneswar, Odisha, 751003, India
Tushar Ranjan Sahoo, Swati Vipsita & Sabyasachi Patra

Corresponding author

Correspondence to Tushar Ranjan Sahoo.

Ethics declarations

Conflict of Interest

On behalf of all authors, the corresponding author confirms that there are no conflicts of interest.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Sahoo, T.R., Vipsita, S. & Patra, S. Complex Prediction in Large PPI Networks Using Expansion and Stripe of Core Cliques. Interdiscip Sci Comput Life Sci 15, 331–348 (2023). https://doi.org/10.1007/s12539-022-00541-z

Received: 11 July 2022
Revised: 06 October 2022
Accepted: 07 October 2022
Published: 28 October 2022
Issue Date: September 2023
DOI: https://doi.org/10.1007/s12539-022-00541-z
