Identification of Essential Proteins by Using Complexes and Interaction Network

  • Min Li
  • Yu Lu
  • Zhibei Niu
  • Fang-Xiang Wu
  • Yi Pan
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8492)


Essential proteins are indispensable for maintaining cellular life. Identifying essential proteins can provide a basis for drug target design, disease treatment, and the design of minimal genomes in synthetic biology. However, identifying essential proteins by experimental approaches remains time-consuming and expensive. With the development of high-throughput experimental techniques in the post-genome era, large amounts of protein-protein interaction (PPI) data and gene expression data have become available, which provides an unprecedented opportunity to study essential proteins at the network level. So far, many network topological methods have been proposed to identify essential proteins. In this paper, we propose a new method, United complex Centrality (UC), which identifies essential proteins by integrating protein complex information with topological features of the PPI network. By analyzing the relationship between protein complexes and essential proteins, we find that proteins that appear in multiple complexes are more likely to be essential than those that appear in only a single complex. The experimental results show that protein complex information helps identify essential proteins more accurately. UC outperforms traditional centrality methods (DC, IC, EC, SC, BC, CC, NC) in identifying essential proteins. Moreover, it retains a clear advantage even over Harmonic Centricity, which also uses protein complex information.
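The abstract names two ingredients, complex membership and network topology, but does not state the UC formula. The following is therefore only a minimal sketch of those two ingredients and not the authors' method: it counts how many complexes each protein appears in, computes degree centrality (DC) on an undirected PPI network, and ranks proteins with a hypothetical tie-breaking rule. The function names, the toy data, and the ranking rule are all assumptions made for illustration.

```python
from collections import defaultdict


def degree_centrality(edges):
    """Count the distinct interaction partners of each protein (undirected network)."""
    neighbors = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-interactions
            neighbors[u].add(v)
            neighbors[v].add(u)
    return {protein: len(partners) for protein, partners in neighbors.items()}


def complex_membership(complexes):
    """Count how many complexes each protein appears in."""
    counts = defaultdict(int)
    for members in complexes:
        for protein in set(members):  # count a protein once per complex
            counts[protein] += 1
    return dict(counts)


def rank_proteins(edges, complexes):
    """Hypothetical ranking (NOT the UC formula): sort by complex count,
    then by degree, then by name so the order is deterministic."""
    dc = degree_centrality(edges)
    cm = complex_membership(complexes)
    proteins = set(dc) | set(cm)
    return sorted(proteins, key=lambda p: (-cm.get(p, 0), -dc.get(p, 0), p))


# Toy data, invented for illustration only.
edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("D", "E")]
complexes = [{"A", "B", "C"}, {"A", "D"}, {"D", "E"}]

ranking = rank_proteins(edges, complexes)
print(ranking)  # ['A', 'D', 'B', 'C', 'E']
```

In this toy network, proteins A and D each belong to two complexes and so rank ahead of proteins that belong to one, which mirrors the observation that multi-complex proteins are more likely to be essential. A real evaluation would compare such a ranking against a curated list of known essential proteins.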


Keywords: essential proteins, PPI network, protein complexes, traditional centrality methods





Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Min Li (1)
  • Yu Lu (1)
  • Zhibei Niu (1)
  • Fang-Xiang Wu (3)
  • Yi Pan (1, 2)
  1. School of Information Science and Engineering, Central South University, Changsha, P.R. China
  2. Department of Computer Science, Georgia State University, Atlanta, USA
  3. Department of Mechanical Engineering and Division of Biomedical Engineering, University of Saskatchewan, Saskatoon, Canada
