Inferences from protein and nucleic acid sequences: Early molecular evolution, divergence of kingdoms and rates of change
- Cite this article as:
- Dayhoff, M.O., Barker, W.C. & McLaughlin, P.J. Origins Life Evol Biosphere (1974) 5: 311. doi:10.1007/BF01207633
Presently the sequences of more than 150 different kinds of proteins and nucleic acids are known from the many thousands thought to exist in all living creatures. A few of these have occupied much the same functional niche within the living cell from near the beginning of life. In three of these latter, sequence evidence pointing to duplications of genetic material in a primitive ancestor is available, and in the fourth other evidence suggests it. Such a duplication, shared by the many descendant species, permits us to locate the point of earliest time on an evolutionary tree and to infer the actual order of subsequent evolutionary events. The amounts of change which have occurred in each descendant line can be estimated with good confidence. Some inferences can be made about the structure of the ancestral duplicated sequence, the evolutionary mechanisms which have operated on it, and the functional capacity of the organism in which it originated. We will describe new, sensitive, objective methods for establishing the probable common ancestry of very distantly related sequences and the quantitative evolutionary change which has taken place. These methods will be applied to the four families, and evolutionary trees will be derived where possible. Of the three families containing duplications of genetic material, two are nucleic acids: transfer RNA and 5S ribosomal RNA. Both of these structures are functional in the synthesis of coded proteins, and prototypes must have been present in the cell at the inception of the fundamental coding process that all living things share. There are many types of tRNA which recognise the various nucleotide triplets and the 20 amino acids. These types are thought to have arisen as a result of many gene duplications. Relationships among these types will be discussed.
The 5S ribosomal RNA, presently functional in both eukaryotes and prokaryotes, is very likely descended from an early form incorporating almost a complete duplication of genetic material. The amount of evolution in the various lines can again be compared. The other two families are proteins: ferredoxin and cytochrome c. Ferredoxin from photosynthetic and nonphotosynthetic bacteria shows clear evidence of a duplication of genetic material. This duplication is very possibly shared by the ferredoxin from plant plastids and the related adrenodoxin from mammalian mitochondria. If so, a chronology of the details of evolution of these groups can be inferred. From these examples of protein and nucleic acid sequence, we conclude that the amount of change in the bacterial lines is less than that in the eukaryote lines. Even though mutant bacteria are easily produced in the laboratory, their evolutionary adaptation to new drugs is very rapid, and new virulent strains often appear spontaneously, the sequences of ancient structures in the wild types have nevertheless changed less than those in the eukaryote lines. Cytochrome c sequences from many eukaryotes and the closely related cytochrome c2 from Rhodospirillum rubrum are known. Other types of cytochrome, such as c551 and c553, are probably related to these through gene duplication. Knowledge of enough of these structures to establish an early duplication will provide a time orientation for the cytochrome c evolutionary tree. This quantitative tree now contains sequences from animals, fungi, green plants, protozoa, and bacteria, examples from all five biological kingdoms.