Prediction of Protein Interaction Based on Similarity of Phylogenetic Trees

  • Florencio Pazos
  • David Juan
  • Jose M. G. Izarzugaza
  • Eduardo Leon
  • Alfonso Valencia
Part of the Methods in Molecular Biology book series (MIMB, volume 484)


Computational methods for predicting protein interaction partners are becoming increasingly popular. Many are now mature enough for routine use by molecular biologists, who can search for proteins related to a protein of interest to infer information about its cellular context. In this chapter we describe the use of the mirrortree set of programs and related software for predicting protein interactions. All of them rest on the idea that interacting or functionally related proteins tend to have similar phylogenetic trees because of coevolution. The basic mirrortree program calculates the similarity between the phylogenetic trees implicit in the multiple sequence alignments of two protein families. The ECID database contains protein interactions and relationships for the model organism Escherichia coli, drawn from different computational and experimental sources, including those generated with mirrortree. Finally, the TSEMA server uses the concept of tree similarity between interacting families to find the best mapping between two families of interacting proteins, that is, which member of one family interacts with which member of the other.
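The tree-similarity idea can be illustrated with a minimal sketch. It assumes that similarity is measured as the Pearson correlation between the inter-species distances of the two families, restricted to the species present in both. This is an illustrative assumption, not the interface or code of the mirrortree program itself; the function names, species labels, and distance values below are all hypothetical.

```python
import math
from itertools import combinations


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant distances")
    return cov / math.sqrt(var_x * var_y)


def species_of(dist):
    """Collect every species label that appears in a distance table."""
    return set().union(*dist.keys())


def tree_similarity(dist_a, dist_b):
    """Correlate the distances of two families over their shared species.

    Each argument maps frozenset({species1, species2}) to the distance
    between the two orthologues of that family (hypothetical input format).
    """
    shared = sorted(species_of(dist_a) & species_of(dist_b))
    if len(shared) < 3:
        raise ValueError("need at least three shared species")
    pairs = [frozenset(p) for p in combinations(shared, 2)]
    xs = [dist_a[p] for p in pairs]
    ys = [dist_b[p] for p in pairs]
    return pearson(xs, ys)


# Made-up distances for family A in three species.
family_a = {
    frozenset(("ecoli", "salty")): 0.10,
    frozenset(("ecoli", "bacsu")): 0.60,
    frozenset(("salty", "bacsu")): 0.65,
}

# Family B evolves twice as fast in the same pattern and has one extra species.
family_b = {pair: 2 * d + 0.05 for pair, d in family_a.items()}
family_b[frozenset(("ecoli", "human"))] = 1.90
family_b[frozenset(("salty", "human"))] = 1.95
family_b[frozenset(("bacsu", "human"))] = 1.80

similarity = tree_similarity(family_a, family_b)
print(f"tree similarity: {similarity:.3f}")
```

Under this assumption, a score near 1 means the two families have diverged in the same pattern across the shared species, which is the signal taken as a hint of coevolution. The extra species in family B is ignored because it has no counterpart in family A.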

Key Words

Protein interaction; protein functional relationship; coevolution; similarity of phylogenetic trees; mirrortree





Copyright information

© Humana Press, Totowa, NJ 2008

Authors and Affiliations

  • Florencio Pazos (1)
  • David Juan (2)
  • Jose M. G. Izarzugaza (2)
  • Eduardo Leon (2)
  • Alfonso Valencia (3)

  1. Computational Systems Biology Group, National Centre for Biotechnology (CNB-CSIC), Madrid, Spain
  2. Structural Computational Biology Programme, Spanish National Cancer Research Centre (CNIO), Madrid, Spain
  3. Structural Computational Biology Programme, Spanish National Cancer Research Centre (CNIO), C/Melchor Fernandez Almagro, Madrid, Spain
