Predicting Essential Proteins Based on Gene Expression Data, Subcellular Localization and PPI Data

Lei, Xiujuan; Wang, Siguo; Pan, Linqiang

doi:10.1007/978-981-10-7179-9_8

Part of the book series: Communications in Computer and Information Science (CCIS, volume 791)

Included in the following conference series:

International Conference on Bio-Inspired Computing: Theories and Applications

Abstract

Predicting essential proteins is indispensable for understanding the minimal requirements of cellular survival and development. In recent years, many methods based on the topological features of PPI networks have been proposed. However, most of these approaches ignore the intrinsic biological attributes of proteins. This paper proposes a method named GSP, which integrates Gene expression data, Subcellular localization and PPI networks to identify essential proteins. We combine local average connectivity and the edge clustering coefficient with gene expression data to measure the centrality of nodes. Because essential proteins appear more frequently than non-essential proteins in some subcellular localizations, such as the nucleus, and because different compartments play different roles, we also integrate subcellular localization information. Computational experiments on yeast PPI networks show that GSP outperforms other state-of-the-art methods, including DC, EC, IC, SC, NC, LAC, PeC, WDC and UDoNC.
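
As a rough illustration of the ingredients named in the abstract, the following Python sketch computes local average connectivity (LAC), the edge clustering coefficient (ECC), and a correlation between expression profiles on a toy network. LAC and ECC follow their commonly used definitions. The use of Pearson correlation for the gene expression term, the function names, the toy data, and in particular the way `toy_score` combines the terms with a localization weight are assumptions made for illustration only; they are not the GSP formula, which the abstract does not specify.

```python
from math import sqrt


def local_average_connectivity(adj, v):
    """LAC(v): mean degree of v's neighbours inside the subgraph induced by N(v)."""
    nbrs = adj[v]
    if not nbrs:
        return 0.0
    # Count, for each neighbour w, only its links to other neighbours of v.
    return sum(len(adj[w] & nbrs) for w in nbrs) / len(nbrs)


def edge_clustering_coefficient(adj, u, v):
    """ECC(u, v): triangles through edge (u, v) divided by the maximum possible."""
    denom = min(len(adj[u]) - 1, len(adj[v]) - 1)
    if denom <= 0:
        return 0.0
    return len(adj[u] & adj[v]) / denom


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def toy_score(adj, expr, loc_weight, v):
    """Hypothetical combination of the three data sources (NOT the GSP formula)."""
    # Edge term: topology (ECC) weighted by co-expression of the two endpoints.
    edge_part = sum(
        edge_clustering_coefficient(adj, v, w) * pearson(expr[v], expr[w])
        for w in adj[v]
    )
    # Assumed: a per-protein weight derived from its subcellular localization.
    return loc_weight.get(v, 0.0) * (local_average_connectivity(adj, v) + edge_part)


# Toy undirected PPI network: a triangle A-B-C with a pendant node D on C.
ADJ = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C"},
}
# Made-up expression profiles and localization weights, for illustration only.
EXPR = {"A": [1, 2, 3], "B": [2, 4, 6], "C": [3, 2, 1], "D": [1, 1, 1]}
LOC_WEIGHT = {"A": 1.0, "B": 0.5, "C": 1.0, "D": 0.2}

if __name__ == "__main__":
    ranking = sorted(ADJ, key=lambda p: toy_score(ADJ, EXPR, LOC_WEIGHT, p), reverse=True)
    print(ranking)
```

Proteins would then be ranked by score and the top candidates compared against a list of known essential proteins; any real evaluation would need the actual GSP definition from the full paper.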

Acknowledgments

This paper is supported by the National Natural Science Foundation of China (61672334, 61502290, 91530320, and 61401263), Industrial Research Project of Science and Technology in Shaanxi Province (2015GY016), and the Innovation Scientists and Technicians Troop Construction Projects of Henan Province (154200510012).

Author information

Authors and Affiliations

School of Computer Science, Shaanxi Normal University, Xi’an, 710119, Shaanxi, China
Xiujuan Lei & Siguo Wang
Key Laboratory of Image Information Processing and Intelligent Control, School of Automation, Huazhong University of Science and Technology, Wuhan, 430074, Hubei, China
Linqiang Pan
School of Electric and Information Engineering, Zhengzhou University of Light Industry, Zhengzhou, 450002, Henan, China
Linqiang Pan

Corresponding author

Correspondence to Xiujuan Lei.

Editor information

Editors and Affiliations

School of Automation, Huazhong University of Science and Technology, Wuhan, China
Cheng He
Automation College, Harbin Engineering University, Harbin, China
Hongwei Mo
School of Automation, Huazhong University of Science and Technology, Wuhan, China
Linqiang Pan
Automation College, Harbin Engineering University, Harbin, China
Yuxin Zhao

About this paper

Cite this paper

Lei, X., Wang, S., Pan, L. (2017). Predicting Essential Proteins Based on Gene Expression Data, Subcellular Localization and PPI Data. In: He, C., Mo, H., Pan, L., Zhao, Y. (eds) Bio-inspired Computing: Theories and Applications. BIC-TA 2017. Communications in Computer and Information Science, vol 791. Springer, Singapore. https://doi.org/10.1007/978-981-10-7179-9_8

DOI: https://doi.org/10.1007/978-981-10-7179-9_8
Published: 09 November 2017
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-7178-2
Online ISBN: 978-981-10-7179-9
eBook Packages: Computer Science, Computer Science (R0)
