Abstract
Predicting essential proteins is indispensable for understanding the minimal requirements of cellular survival and development. In recent years, many methods based on the topological features of protein-protein interaction (PPI) networks have been proposed. However, most of these approaches ignore the intrinsic biological attributes of proteins. This paper proposes a method, named GSP, that integrates Gene expression data, Subcellular localization and PPI networks to identify essential proteins. We combine local average connectivity and the edge clustering coefficient with gene expression data to measure the centrality of nodes. Compared with non-essential proteins, essential proteins appear more frequently in certain subcellular localizations, such as the nucleus, and different compartments play different roles; we therefore integrate subcellular localization information into the identification of essential proteins. Computational experiments on yeast PPI networks show that the proposed method GSP outperforms other state-of-the-art methods, including DC, EC, IC, SC, NC, LAC, PeC, WDC and UDoNC.
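To make the ingredients named above concrete, the following is a minimal sketch of local average connectivity (LAC), the edge clustering coefficient (ECC) and a Pearson correlation between expression profiles, using their commonly cited definitions. The abstract does not state how GSP combines these quantities or how it weights subcellular localization, so the function `toy_score`, the toy network `edges` and the expression profiles `expr` are illustrative assumptions and not the formula of the paper; localization is left out entirely.

```python
from math import sqrt


def build_adjacency(edges):
    """Build {node: set(neighbours)} from a list of undirected edges."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def lac(adj, v):
    """Local average connectivity: mean degree of v's neighbours,
    counted inside the subgraph induced by those neighbours."""
    nv = adj[v]
    if not nv:
        return 0.0
    return sum(len(adj[w] & nv) for w in nv) / len(nv)


def ecc(adj, u, v):
    """Edge clustering coefficient: triangles through edge (u, v)
    divided by the largest number of triangles the edge could be in."""
    denom = min(len(adj[u]) - 1, len(adj[v]) - 1)
    if denom <= 0:
        return 0.0
    return len(adj[u] & adj[v]) / denom


def pearson(x, y):
    """Pearson correlation of two equally long expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def toy_score(adj, expr, v):
    """Hypothetical combination (an assumption, not GSP itself):
    LAC plus the sum of ECC * Pearson correlation over v's edges."""
    edge_term = sum(ecc(adj, v, w) * pearson(expr[v], expr[w]) for w in adj[v])
    return lac(adj, v) + edge_term


# Toy data, invented for illustration only.
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")]
expr = {"A": [1, 2, 3], "B": [2, 4, 6], "C": [3, 2, 1], "D": [1, 1, 2]}
adj = build_adjacency(edges)

for node in sorted(adj):
    print(node, round(toy_score(adj, expr, node), 3))
```

Proteins would then be ranked by score and the top-ranked candidates compared against a list of known essential proteins. Any real use would need the combination rule and the localization weighting from the full paper.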
Acknowledgments
This work was supported by the National Natural Science Foundation of China (61672334, 61502290, 91530320, and 61401263), the Industrial Research Project of Science and Technology in Shaanxi Province (2015GY016), and the Innovation Scientists and Technicians Troop Construction Projects of Henan Province (154200510012).
Copyright information
© 2017 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Lei, X., Wang, S., Pan, L. (2017). Predicting Essential Proteins Based on Gene Expression Data, Subcellular Localization and PPI Data. In: He, C., Mo, H., Pan, L., Zhao, Y. (eds) Bio-inspired Computing: Theories and Applications. BIC-TA 2017. Communications in Computer and Information Science, vol 791. Springer, Singapore. https://doi.org/10.1007/978-981-10-7179-9_8
DOI: https://doi.org/10.1007/978-981-10-7179-9_8
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-7178-2
Online ISBN: 978-981-10-7179-9
eBook Packages: Computer Science, Computer Science (R0)