Comparison of sequence-based and structure-based phylogenetic trees of homologous proteins: Inferences on protein evolution

Journal of Biosciences

Abstract

Several studies based on the known three-dimensional (3-D) structures of proteins show that two homologous proteins with insignificant sequence similarity can adopt a common fold and may perform the same or similar biochemical functions. Hence, it is appropriate to use similarities in the 3-D structures of proteins, rather than amino acid sequence similarities, in modelling the evolution of distantly related proteins. Here we present an assessment of the use of 3-D structures in modelling the evolution of homologous proteins. Using a dataset of 108 protein domain families of known structure, with at least 10 members per family, we compare the extent of structural and sequence dissimilarities among pairs of proteins, which are the inputs for constructing phylogenetic trees. We find that the correlation between the structure-based and the sequence-based dissimilarity measures is usually good if the sequence similarity among the homologues is about 30% or more. For protein families with low sequence similarity among the members, the correlation between the sequence-based and the structure-based dissimilarities is poor. In these cases the structure-based dendrogram clusters proteins with the most similar biochemical functional properties better than the sequence-similarity-based dendrogram. In multi-domain protein families and disulphide-rich protein families, the correlation coefficient for the match of sequence-based and structure-based dissimilarity (SDM) measures can be poor even though the sequence identity may be higher than 30%. Hence it is suggested that protein evolution is best modelled using 3-D structures if the sequence similarities (SSM) of the homologues are very low.
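The comparison described in the abstract, correlating sequence-based and structure-based dissimilarities over all pairs of proteins in a family, can be illustrated with a minimal sketch. It assumes a Pearson correlation coefficient and uses made-up toy matrices; the abstract does not specify which correlation coefficient or which exact dissimilarity definitions were used, so the function names `condensed` and `pearson` and all numbers below are hypothetical.

```python
from itertools import combinations
from math import sqrt


def condensed(matrix):
    """Return the upper-triangle (i < j) entries of a square dissimilarity matrix."""
    n = len(matrix)
    return [matrix[i][j] for i, j in combinations(range(n), 2)]


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Toy family of four proteins (hypothetical values, not data from the paper).
# Sequence dissimilarity is assumed here to be 100 minus percent identity.
seq_dissim = [
    [0, 20, 60, 70],
    [20, 0, 65, 75],
    [60, 65, 0, 40],
    [70, 75, 40, 0],
]
# Structure dissimilarity on an arbitrary 0-1 scale (assumed for illustration).
struct_dissim = [
    [0.0, 0.1, 0.5, 0.6],
    [0.1, 0.0, 0.5, 0.7],
    [0.5, 0.5, 0.0, 0.3],
    [0.6, 0.7, 0.3, 0.0],
]

r = pearson(condensed(seq_dissim), condensed(struct_dissim))
print(f"correlation over {len(condensed(seq_dissim))} protein pairs: {r:.3f}")
```

Each unordered pair of proteins contributes one point, so a family of n members gives n(n-1)/2 values per measure. A coefficient near 1 would indicate that the two dissimilarity measures rank protein pairs similarly; a low value would correspond to the poor agreement the abstract reports for families with low sequence similarity.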

Author information

Corresponding author

Correspondence to N Srinivasan.

About this article

Cite this article

Balaji, S., Srinivasan, N. Comparison of sequence-based and structure-based phylogenetic trees of homologous proteins: Inferences on protein evolution. J Biosci 32, 83–96 (2007). https://doi.org/10.1007/s12038-007-0008-1

  • DOI: https://doi.org/10.1007/s12038-007-0008-1
