The Effect of Sequence Complexity on the Construction of Protein-Protein Interaction Networks

  • Mehdi Kargar
  • Aijun An
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6334)

Abstract

This paper investigates the role of sequence complexity in the formation of important nodes in protein-protein interaction (PPI) networks. We use two measures, linguistic complexity and Shannon entropy, to quantify the complexity of protein sequences, and we draw our conclusions from three different datasets of yeast PPI networks. Two types of nodes are known to be important in PPI networks: hubs and bottlenecks. Recent work has shown that hubs and bottlenecks tend to be essential in the process of evolution. A better understanding of the properties of these two types of nodes will shed light on why proteins interact with each other in the observed manner. We show that the sequence complexity of hubs is lower than that of non-hubs, but the difference is not significant in most cases. In contrast, the sequence complexity of bottlenecks is lower than that of non-bottlenecks, and the difference is significant in most cases. Modularity also plays an important role in the construction of PPI networks. We find no significant difference in node complexity among the different modules of a PPI network.
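
The two complexity measures named in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact definitions used in the paper, so the code below assumes common formulations: Shannon entropy over single-residue frequencies, and linguistic complexity as the ratio of distinct substrings observed to the maximum possible, summed over word lengths 1 to `max_k`. The function names, the 20-letter alphabet, and the default `max_k` are illustrative assumptions, not the authors' implementation.

```python
import math
from collections import Counter

# Assumption: the 20 standard amino acids form the alphabet.
ALPHABET_SIZE = 20


def shannon_entropy(seq: str) -> float:
    """Shannon entropy, in bits per residue, of the residue frequencies in seq."""
    if not seq:
        return 0.0
    n = len(seq)
    counts = Counter(seq)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def linguistic_complexity(seq: str, max_k: int = 3,
                          alphabet_size: int = ALPHABET_SIZE) -> float:
    """Distinct substrings observed divided by the maximum possible,
    summed over word lengths 1..max_k (one common formulation)."""
    n = len(seq)
    observed = 0
    possible = 0
    for k in range(1, min(max_k, n) + 1):
        # Distinct words of length k that actually occur in the sequence.
        observed += len({seq[i:i + k] for i in range(n - k + 1)})
        # Upper bound: limited by the alphabet and by the sequence length.
        possible += min(alphabet_size ** k, n - k + 1)
    return observed / possible if possible else 0.0
```

Under these definitions a repetitive sequence such as `"AAAA"` scores low on both measures, while a sequence with no repeated words such as `"ACDE"` reaches the maximum linguistic complexity of 1.0. Comparing the score distributions of hubs against non-hubs, or bottlenecks against non-bottlenecks, would then require a significance test that the abstract does not specify.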

Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Mehdi Kargar (1)
  • Aijun An (1)
  1. Department of Computer Science and Engineering, York University, Ontario, Canada
