Identifying essential proteins based on dynamic protein-protein interaction networks and RNA-Seq datasets


The identification of essential proteins is important not only for understanding how organisms are structured at the molecular level, but also for drug-target detection and the prevention of genetic diseases. Traditional methods often use centrality indices of static protein-protein interaction (PPI) networks, gene expression profiles, or both to predict essential proteins. However, the prediction accuracy of most of these methods leaves room for improvement. In this study, we propose a strategy that increases the accuracy of essential protein identification in three ways. Firstly, RNA-Seq datasets are used to construct integrated dynamic PPI networks; an RNA-Seq dataset is expected to give more accurate predictions than microarray gene expression profiles. Secondly, a novel integrated dynamic PPI network is constructed by considering both the co-expression pattern and the co-expression level of the RNA-Seq data. Thirdly, a novel two-step strategy is proposed to identify essential proteins from two known centrality indices. Numerical experiments show that the proposed strategy increases prediction accuracy dramatically and that it can be generalized to many existing methods and centrality indices.
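As a rough illustration of the general idea of a dynamic PPI network, the sketch below filters a static network with expression data and then scores proteins by a centrality index. It is a minimal sketch under stated assumptions, not the construction used in the paper: the function names, the toy data, both cutoffs, the use of Pearson correlation for the co-expression pattern, a simple expression threshold for the co-expression level, and summed degree as the centrality index are all hypothetical choices made for illustration. The two-step strategy is not reproduced here, because the abstract does not describe it.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def dynamic_networks(edges, expr, level_cutoff, pattern_cutoff):
    """Return one edge list per time point (hypothetical construction).

    A static edge (u, v) is kept at time t when
      * the two expression profiles correlate at or above pattern_cutoff
        (assumed stand-in for the co-expression pattern), and
      * both proteins are expressed at or above level_cutoff at time t
        (assumed stand-in for the co-expression level).
    """
    n_times = len(next(iter(expr.values())))
    snapshots = [[] for _ in range(n_times)]
    for u, v in edges:
        if pearson(expr[u], expr[v]) < pattern_cutoff:
            continue
        for t in range(n_times):
            if expr[u][t] >= level_cutoff and expr[v][t] >= level_cutoff:
                snapshots[t].append((u, v))
    return snapshots


def summed_degree(snapshots):
    """Sum each protein's degree over all time-point networks."""
    score = {}
    for snapshot in snapshots:
        for u, v in snapshot:
            score[u] = score.get(u, 0) + 1
            score[v] = score.get(v, 0) + 1
    return score


# Toy static PPI network and made-up expression values at three time points.
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")]
expr = {
    "A": [1.0, 5.0, 9.0],
    "B": [2.0, 6.0, 8.0],
    "C": [1.5, 3.0, 7.0],
    "D": [9.0, 5.0, 1.0],
}

snapshots = dynamic_networks(edges, expr, level_cutoff=4.0, pattern_cutoff=0.8)
scores = summed_degree(snapshots)
# Rank candidates by score, breaking ties alphabetically.
ranking = sorted(scores, key=lambda p: (-scores[p], p))
```

In this toy example the edge between C and D is dropped at every time point because the two profiles are anti-correlated, and no edge is active at the first time point because no protein reaches the level cutoff. A real analysis would replace the toy values with normalized RNA-Seq expression estimates and would need a principled way of choosing the cutoffs.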





Correspondence to Bolin Chen.


Shang, X., Wang, Y. & Chen, B. Identifying essential proteins based on dynamic protein-protein interaction networks and RNA-Seq datasets. Sci. China Inf. Sci. 59, 070106 (2016).



  • essential protein
  • dynamic protein network
  • RNA-Seq data
  • gene co-expression pattern
  • M2 measure