Predicting Essential Proteins Based on Gene Expression Data, Subcellular Localization and PPI Data

  • Conference paper
Bio-inspired Computing: Theories and Applications (BIC-TA 2017)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 791)

Abstract

Predicting essential proteins is indispensable for understanding the minimal requirements of cellular survival and development. In recent years, many methods based on the topological features of protein-protein interaction (PPI) networks have been proposed. However, most of these approaches ignore the intrinsic biological attributes of proteins. This paper proposes a method, named GSP, that integrates Gene expression data, Subcellular localization and PPI networks to identify essential proteins. We combine local average connectivity and the edge clustering coefficient with gene expression data to measure the centrality of nodes. Because essential proteins appear more frequently than non-essential proteins in certain subcellular localizations, such as the nucleus, and because different compartments play different roles, we also integrate subcellular localization information. Computational experiments on yeast PPI networks show that GSP outperforms other state-of-the-art methods, including DC, EC, IC, SC, NC, LAC, PeC, WDC and UDoNC.
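
The abstract does not give the GSP scoring formula, so the following is only a minimal sketch of the building blocks it names, using their commonly cited definitions: the edge clustering coefficient (triangles through an edge divided by the smaller of the two endpoint degrees minus one), local average connectivity (mean degree of a node's neighbours inside the subgraph induced by those neighbours), and the Pearson correlation of two gene expression profiles. The function names, the toy network and the final `coexpression_weighted_score` combination are assumptions made for illustration and are not the authors' method.

```python
from math import sqrt


def build_graph(edges):
    """Build an undirected adjacency map {node: set(neighbours)}, skipping self-loops."""
    adj = {}
    for u, v in edges:
        if u == v:
            continue
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def edge_clustering_coefficient(adj, u, v):
    """Triangles containing edge (u, v) divided by min(deg(u) - 1, deg(v) - 1)."""
    denom = min(len(adj[u]) - 1, len(adj[v]) - 1)
    if denom <= 0:
        return 0.0  # an endpoint of degree 1 cannot be part of a triangle
    return len(adj[u] & adj[v]) / denom


def local_average_connectivity(adj, v):
    """Mean degree of v's neighbours within the subgraph induced by those neighbours."""
    nbrs = adj[v]
    if not nbrs:
        return 0.0
    return sum(len(adj[w] & nbrs) for w in nbrs) / len(nbrs)


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # a constant profile has no defined correlation
    return cov / (sx * sy)


def coexpression_weighted_score(adj, expr, v):
    """Hypothetical combination (NOT the GSP formula): sum of ECC * PCC over v's edges."""
    return sum(
        edge_clustering_coefficient(adj, v, u) * pearson(expr[v], expr[u])
        for u in adj[v]
    )


# Toy data (invented): a triangle A-B-C with a pendant node D attached to C.
adj = build_graph([("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")])
expr = {
    "A": [1.0, 2.0, 3.0],
    "B": [2.0, 4.0, 6.0],
    "C": [3.0, 2.0, 1.0],
    "D": [1.0, 1.0, 1.0],
}
scores = {v: coexpression_weighted_score(adj, expr, v) for v in adj}
```

In this toy network the pendant edge C-D contributes nothing, since it lies in no triangle, while edges inside the triangle are weighted by how strongly the two expression profiles correlate. A localization-aware method would additionally weight each protein by the compartments it is annotated to, which this sketch leaves out.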

Acknowledgments

This paper is supported by the National Natural Science Foundation of China (61672334, 61502290, 91530320, and 61401263), Industrial Research Project of Science and Technology in Shaanxi Province (2015GY016), and the Innovation Scientists and Technicians Troop Construction Projects of Henan Province (154200510012).

Author information

Corresponding author

Correspondence to Xiujuan Lei.

Copyright information

© 2017 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Lei, X., Wang, S., Pan, L. (2017). Predicting Essential Proteins Based on Gene Expression Data, Subcellular Localization and PPI Data. In: He, C., Mo, H., Pan, L., Zhao, Y. (eds) Bio-inspired Computing: Theories and Applications. BIC-TA 2017. Communications in Computer and Information Science, vol 791. Springer, Singapore. https://doi.org/10.1007/978-981-10-7179-9_8

  • DOI: https://doi.org/10.1007/978-981-10-7179-9_8

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-10-7178-2

  • Online ISBN: 978-981-10-7179-9

  • eBook Packages: Computer Science, Computer Science (R0)
