Abstract
Detecting protein complexes is crucial to understand principles of cellular organization. Plenty evidences have indicated that sub-graphs with high density in protein-protein interaction (PPI) network, especially dynamic PPI network (DPIN), usually correspond to protein complexes. As a well-known density-based clustering algorithm, Density-Based Spatial Clustering of Applications with Noise (DBSCAN) has been used in many areas due to its simplicity and the ability to detect clusters of different sizes and shapes. However, one of its limitations is that the performance of DBSCAN depends on two specified parameters ε and MinPts, where ε represents the maximum radius of a neighborhood from an observing point while MinPts means the minimum number of data points contained in such a neighborhood. In this article, we develop a new method named as P-DBSCAN to detect protein complexes in DPIN by using Pigeon-Inspired Optimization (PIO) Algorithm to optimize the parameters ε and MinPts in DBSCAN. The experiments on DIP and MIPS datasets show that P-DBSCAN outperforms the state-of-the-art methods for protein complex detection in terms of several criteria such as precision, recall and f-measure.
This is a preview of subscription content, access via your institution.
References
- 1
Uetz P, Giot L, Cagney G, et al. A comprehensive analysis of protein-protein interactions in Saccharomyces cerevisiae. Nature, 2000, 403: 623–627
- 2
Zhu H, Bilgin M, Bangham R, et al. Global analysis of protein activities using proteome chips. Science, 2001, 293: 2101–2105
- 3
Xenarios I, Salwnski L, Duan X J, et al. DIP, the Database of Interacting Proteins: a research tool for studying cellular networks of protein interactions. Nucl Acids Res, 2002, 30: 303–305
- 4
Güldener U, Münsterkötter M, Kastenmller G, et al. CYGD: the comprehensive yeast genome database. Nucl Acids Res, 2005, 33: 364–368
- 5
Cherry J M. SGD: Saccharomyces Genome Database. Nucl Acids Res, 1998, 26: 73–79
- 6
Montanez G, Cho Y R. Predicting false positives of protein-protein interaction data by semantic similarity measures. Curr Bioinform, 2013, 8: 339–346
- 7
Li M, Zheng R, Zhang H, et al. Effective identification of essential proteins based on priori knowledge, network topology and gene expressions. Methods, 2014, 67: 325–333
- 8
Watts D J, Strogatz S H. Collective dynamics of ‘small-world’ networks. Nature, 1998, 393: 440–442
- 9
Antonio S, Paul O M. Small-world network approach to identify key residues in protein-protein interaction. Proteins, 2005, 58: 672–682
- 10
Rives A W, Galitski T. Modular organization of cellular networks. Proc Nat Acad Sci USA, 2003, 100: 1128–1133
- 11
Palla G, Dernyi I, Farkas I J, et al. Uncoverring the overlapping community structure of complex networks in nature and society. Nature, 2005, 435: 814–818
- 12
Adamcsek B, Palla G, Farkas I, et al. CFinder: locating cliques and overlapping modules in biological networks. Bioinformatics, 2006, 22: 1021–1023
- 13
Altaf-Ul-Amin M, Shinbo Y, Mihara K, et al. Development and implementation of an algorithm for detection of protein complexes in large interaction networks. BMC Bioinform, 2006, 7: 207–228
- 14
Li M, Chen J, Wang J, et al. Modifying the DPClus algorithm for identifying protein complexes based on new topological structures. BMC Bioinform, 2008, 9: 398–413
- 15
Peng J, Mona S. SPICi: a fast clustering algorithm for large biological networks. Bioinformatics, 2010, 26: 1105–1111
- 16
Liu G, Wong L, Chua H N. Complex discovery from weighted PPI networks. Bioinformatics, 2009, 25: 1891–1897
- 17
Leung H C M, Xiang Q, Yiu S M, et al. Predicting protein complexes from PPI data: a core-attachment approach. J Comput Biol, 2009, 16: 133–144
- 18
Wang J X, Liu B B, Li M, et al. Identifying protein complexes from interaction networks based on clique percolation and distance restriction. BMC Genom, 2010, 11: S10–S24
- 19
Ester M, Kriegel H P, Sander J, et al. A density-based algorithm for discovering clusters in large spatial databases with noise. In: Proceedings of the 2nd International Conference on Knowledge Discovery and Data Mining, Portland, 1996. 226–231
- 20
Duan H B, Qiao P X. Pigeon-inspired optimization: a new swarm intelligence optimizer for air robot path planning. Int J Intell Comput Cybern, 2014, 7: 24–37
- 21
Lei X J, Wu S, Ge L, et al. Clustering and overlapping modules detection in PPI network based on IBFO. Proteomics, 2013, 13: 278–290
- 22
Lei X J, Tian J F, Ge L, et al. The clustering model and algorithm of PPI network based on propagating mechanism of artificial bee colony. Inform Sci, 2013, 247: 21–39
- 23
Lv Q, Wu H J, Wu J Z, et al. A parallel ant colonies approach to de novo prediction of protein backbone in CASP8/9. Sci China Inf Sci, 2013, 56: 108103
- 24
Lei X J, Wang F, Wu F X, et al. Protein complex identification through Markov clustering with firefly algorithm on dynamic proteinCprotein interaction networks. Inf Sci, 2016, 329: 303–316
- 25
Lei X J, Ying C, Wu F X, et al. Clustering PPI data by combining FA and SHC method. BMC Genom, 2015, 16: S3–S12
- 26
Zhao J, Zhou R. Pigeon-inspired optimization applied to constrained gliding trajectories. Nonlinear Dyn, 2015, 82: 1781–1795
- 27
Li C, Duan H B. Target detection approach for UAVs via improved Pigeon-inspired Optimization and Edge Potential Function. Aerosp Sci Technol, 2014, 39: 352–360
- 28
Sun H, Duan H B. PID controller design based on Prey-Predator Pigeon-Inspired Optimization algorithm. In: Proceedings of the International Conference on Mechatronics and Automation, Tianjin, 2014. 1416–1421
- 29
Wang J X, Li M, Chen J, et al. A fast hierarchical clustering algorithm for functional modules discovery in protein interaction networks. IEEE/ACM Trans Comput Biol Bioinform. 2011, 8: 607–620
- 30
van Dongen S. Graph clustering by flow simulation. Dissertation for Doctoral Degree. Center for Math and Computer Science (CWI), University of Utrecht. 2000
- 31
King A D, Przulj N, Jurisica I. Protein complex prediction via cost-based clustering. Bioinformatics, 2004, 20: 3013–3020
- 32
Zhang A D. Protein interaction networks. New York: Cambridge University Press, 2009
- 33
Radicchi F, Castellano C, Cecconi F, et al. Defining and identifying communities in networks. Proc Nat Acad Sci USA, 2004, 101: 2658–2663
- 34
Washburn M P, Wolters D, Yates J R. Large-scale analysis of the yeast proteome by multidimensional protein identification technology. Nat Biotechnol, 2001, 19: 242–247
- 35
Cho Y R, Hwang H, Ramanathan M, et al. Semantic integration to identify overlapping functional modules in protein interaction networks. BMC Bioinform, 2007, 8: 265–277
- 36
Wang J X, Peng X Q, Li M, et al. Construction and application of dynamic protein interaction network based on time course gene expression data. Proteomics, 2013, 13: 301–312
- 37
Tu B P, Kudlicki A, Rowicka M, et al. Logic of the yeast metabolic cycle: temporal compartmentalization of cellular processes. Science, 2005, 310: 1152–1158
- 38
Pu S, Wong J, Turner B, et al. Up-to-date catalogues of yeast protein complexes. Nucl Acids Res 2009, 37: 825–831
- 39
Mewes H W, Amid C, Arnold R, et al. MIPS: analysis and annotation of proteins from whole genomes. Nucl Acids Res, 2004, 32: 41–44
- 40
Tang Y, Li M, Wang J X. CytoNCA: a cytoscape plugin for centrality analysis and evaluation of protein interaction networks. Biosystems, 2015, 127: 67–72
Author information
Affiliations
Corresponding author
Rights and permissions
About this article
Cite this article
Lei, X., Ding, Y. & Wu, FX. Detecting protein complexes from DPINs by density based clustering with Pigeon-Inspired Optimization Algorithm. Sci. China Inf. Sci. 59, 070103 (2016). https://doi.org/10.1007/s11432-016-5578-9
Received:
Accepted:
Published:
Keywords
- dynamic protein-protein interaction network (DPIN)
- pigeon-inspired optimization (PIO)
- protein complex
- density based clustering
- gene expression