Distinct Stages of Protein Evolution as Suggested by Protein Sequence Analysis

Journal of Molecular Evolution

Abstract.

Evolution of proteins encoded in nucleotide sequences began with the advent of the triplet code. The chronological order in which amino acids appeared on the evolutionary scene, and the steps in the evolution of the triplet code, have recently been reconstructed (Trifonov, 2000b) on the basis of 40 different ranking criteria and hypotheses. According to the consensus chronology, the pair of complementary codons GGC and GCC, for the amino acids glycine and alanine, respectively, appeared first. Other codons appeared as complementary pairs as well, which divided their respective amino acids into two alphabets, encoded by triplets with either central purines or central pyrimidines: G, D, S, E, N, R, K, Q, C, H, Y, and W (Glycine alphabet G) and A, V, P, S, L, T, I, F, and M (Alanine alphabet A). It is speculated that the earliest polypeptide chains were very short, presumably of uniform length, and belonged to two alphabet types encoded in the two complementary strands of the earliest mRNA duplexes. After the fusion of the minigenes, a mosaic of the alphabets would form. Traces of the predicted mosaic structure have indeed been detected in the protein sequences of complete prokaryotic genomes, as weak oscillations with a period of 12 residues that reflect the alternation of two types of 6-residue-long units. The next stage of protein evolution corresponded to the closure of the chains into loops of 25–30 residues (Berezovsky et al., 2000). Autocorrelation analysis of the proteins of 23 complete archaebacterial and eubacterial genomes revealed that the preferred distances between valine, alanine, glycine, leucine, and isoleucine along the sequences are in the same range of 25–30 residues, indicating that the loops are primarily closed by hydrophobic interactions between the ends of the loops. The loop closure stage was followed by the formation of typical folds of 100–200 amino acids, via end-to-end fusion of the genes encoding the loop-size chains. This size was apparently dictated by the optimal ring closure for DNA. In both cases the closure into a ring (loop) conferred evolutionarily advantageous stability on the respective structures. Further gene fusions led to the formation of modern multidomain proteins. Recombinational gene splicing is likely to have appeared after the DNA circularization stage.
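The two kinds of sequence analysis mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' procedure: the abstract does not describe their normalization, background model, or statistics, so this only shows the raw idea. The function names, the `B` and `?` symbols, and the `LOOP_CLOSING` constant are assumptions introduced here; the alphabet memberships and the five residues are taken from the abstract.

```python
from collections import Counter

# Alphabets as listed in the abstract; serine (S) appears in both.
GLYCINE_ALPHABET = set("GDSENRKQCHYW")
ALANINE_ALPHABET = set("AVPSLTIFM")

# Residues whose preferred spacing is reported in the abstract
# (valine, alanine, glycine, leucine, isoleucine). Hypothetical name.
LOOP_CLOSING = set("VAGLI")


def alphabet_string(seq):
    """Recode a protein sequence by alphabet membership.

    'G' = Glycine alphabet only, 'A' = Alanine alphabet only,
    'B' = both (serine), '?' = any other symbol.
    The 'B' and '?' codes are illustrative choices, not from the paper.
    """
    out = []
    for aa in seq.upper():
        in_g = aa in GLYCINE_ALPHABET
        in_a = aa in ALANINE_ALPHABET
        if in_g and in_a:
            out.append("B")
        elif in_g:
            out.append("G")
        elif in_a:
            out.append("A")
        else:
            out.append("?")
    return "".join(out)


def pair_distance_counts(seq, residues, max_lag):
    """Count residue pairs by separation along the sequence.

    Returns a Counter mapping each lag (1..max_lag) to the number of
    position pairs i < j with both residues in `residues` and j - i == lag.
    These are raw counts with no normalization.
    """
    positions = [i for i, aa in enumerate(seq.upper()) if aa in residues]
    counts = Counter()
    for idx, i in enumerate(positions):
        for j in positions[idx + 1:]:
            lag = j - i
            if lag > max_lag:
                break  # positions are sorted, so later pairs are farther apart
            counts[lag] += 1
    return counts
```

Applied to many sequences and summed, `pair_distance_counts(seq, LOOP_CLOSING, 60)` would give a raw distance histogram in which an excess near lags 25–30 could be looked for. Any real analysis would need to compare such counts against an appropriate background expectation.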

Received: 21 December 2000 / Accepted: 28 February 2001

About this article

Cite this article

Trifonov, E., Kirzhner, A., Kirzhner, V. et al. Distinct Stages of Protein Evolution as Suggested by Protein Sequence Analysis. J Mol Evol 53, 394–401 (2001). https://doi.org/10.1007/s002390010229

  • DOI: https://doi.org/10.1007/s002390010229
