Abstract
Identifying essential proteins from protein-protein interaction networks is important for studies of biological evolution and for drug development. Most existing criteria for prioritizing essential proteins focus on a single attribute of the proteins in the network and therefore suffer from information loss. To overcome this problem, we propose a more comprehensive and effective method for essential protein identification based on improved multicriteria decision making (MCDM), called essential proteins identification-technique for order preference by similarity to ideal solution (EPI-TOPSIS). First, considering different attributes of proteins, we propose three measures that evaluate protein significance from different aspects: gene-degree centrality (GDC), based on gene expression sequences; and subcellular-neighbor-degree centrality (SNDC) and subcellular-in-degree centrality (SIDC), based on subcellular location information and protein complexes. Then, betweenness centrality (BC) and these three measures are taken together as the multiple criteria of the decision-making model. The analytic hierarchy process is used to determine the weight of each criterion, and the essential proteins are prioritized by the ideal-solution method of MCDM, i.e., TOPSIS. Experiments are conducted on the YDIP, YMIPS, Krogan and BioGRID networks. The results indicate that EPI-TOPSIS outperforms several state-of-the-art approaches for identifying essential proteins on the performance measures considered.
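The weighting and ranking steps named in the abstract can be illustrated with a minimal sketch. It assumes the classical forms of both techniques: analytic hierarchy process weights approximated by the row geometric mean of a pairwise comparison matrix, and standard TOPSIS with vector normalization and all four criteria treated as benefit criteria. It is not the authors' improved variant, and the pairwise comparison values, protein names and criterion scores below are hypothetical placeholders, not data from the paper.

```python
import math


def ahp_weights(pairwise):
    """Approximate AHP priority weights with the row geometric-mean method."""
    n = len(pairwise)
    geo_means = [math.prod(row) ** (1.0 / n) for row in pairwise]
    total = sum(geo_means)
    return [g / total for g in geo_means]


def topsis(scores, weights):
    """Return the relative closeness of each alternative to the ideal solution.

    scores: one row per alternative, one column per criterion.
    All criteria are assumed to be benefit criteria (larger is better).
    """
    n_crit = len(weights)
    # Vector-normalize each criterion column; guard against all-zero columns.
    norms = [
        math.sqrt(sum(row[j] ** 2 for row in scores)) or 1.0
        for j in range(n_crit)
    ]
    weighted = [
        [weights[j] * row[j] / norms[j] for j in range(n_crit)]
        for row in scores
    ]
    # Positive and negative ideal solutions, taken column by column.
    best = [max(row[j] for row in weighted) for j in range(n_crit)]
    worst = [min(row[j] for row in weighted) for j in range(n_crit)]
    closeness = []
    for row in weighted:
        d_best = math.dist(row, best)
        d_worst = math.dist(row, worst)
        denom = d_best + d_worst
        closeness.append(d_worst / denom if denom else 0.0)
    return closeness


# Hypothetical pairwise comparisons for the criteria order (BC, GDC, SNDC, SIDC).
pairwise = [
    [1.0, 2.0, 2.0, 2.0],
    [0.5, 1.0, 1.0, 1.0],
    [0.5, 1.0, 1.0, 1.0],
    [0.5, 1.0, 1.0, 1.0],
]
weights = ahp_weights(pairwise)

# Hypothetical criterion scores for three illustrative proteins.
names = ["P1", "P2", "P3"]
scores = [
    [0.9, 0.8, 0.7, 0.6],
    [0.1, 0.2, 0.3, 0.1],
    [0.5, 0.5, 0.5, 0.5],
]
closeness = topsis(scores, weights)

# Rank proteins from most to least likely essential under this sketch.
ranking = [name for _, name in sorted(zip(closeness, names), reverse=True)]
print(ranking)
```

A protein that scores highest on every criterion coincides with the positive ideal solution and receives a closeness of 1, while one that scores lowest on every criterion receives 0. The geometric-mean approximation is used here only to keep the sketch dependency-free; the principal-eigenvector method is the more common way to derive AHP weights.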
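Abstract (Chinese version)
Identifying essential proteins from protein-protein interaction networks is of great significance for biological evolution and new drug development. Many current criteria for judging protein essentiality consider only a single attribute of the protein, which leads to information loss. To address this problem, this paper proposes a more comprehensive and effective essential protein identification method based on improved multicriteria decision making (EPI-TOPSIS). First, considering different attributes of proteins, protein importance is evaluated from three different aspects: gene-degree centrality, based on expression sequences; and subcellular-neighbor-degree centrality and subcellular-complex in-degree centrality, based on localization information and protein complexes. Betweenness centrality is then considered together with these three methods as the attribute criteria of the multicriteria decision-making model; the analytic hierarchy process assigns a weight to each criterion, protein essentiality is computed from the distance to the ideal solution in multicriteria decision making, and the proteins are ranked by priority. Finally, experiments on the YDIP, YMIPS, Krogan and BioGRID networks show that EPI-TOPSIS outperforms the compared algorithms.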
Additional information
Foundation item: the National Natural Science Foundation of China (Nos. 62162040 and 11861045)
Cite this article
Lu, P., Chen, Y. & Liao, Y. Novel Scheme for Essential Proteins Identification Based on Improved Multicriteria Decision Making. J. Shanghai Jiaotong Univ. (Sci.) 28, 418–431 (2023). https://doi.org/10.1007/s12204-023-2584-0
DOI: https://doi.org/10.1007/s12204-023-2584-0
Key words
- protein-protein interaction network
- essential proteins
- multicriteria decision making (MCDM)
- biological information