Summary
We present compositional statistics, a new method of phylogenetic inference, which is an extension of evolutionary parsimony. Compositional statistics takes account of the base composition of the compared sequences by using nucleotide positions that evolutionary parsimony ignores. It shares with evolutionary parsimony the features of rate invariance and the fundamental distinction between transitions and transversions. Of the presently available methods of phylogenetic inference, compositional statistics is based on the fewest and mildest assumptions about the mode of DNA sequence evolution. It is therefore applicable to phylogenetic studies of the most distantly related organisms or molecules. This was illustrated by analyzing conservative positions in the DNA sequences of the large subunit of RNA polymerase from three archaebacterial groups, a eubacterium, a chloroplast, and the three eukaryotic polymerases. Internally consistent results, which are in accord with our knowledge of organelle origin and archaebacterial physiology, were achieved.
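The two quantities the summary refers to, base composition and the distinction between transitions and transversions, can be illustrated with a minimal sketch. This is not the compositional statistics method itself, whose test statistics are not given in this summary. The function names `base_composition` and `count_differences` are hypothetical and introduced only for illustration, and the sequences are assumed to be pre-aligned and of equal length.

```python
from collections import Counter

# Standard biochemical classes: purines (A, G) and pyrimidines (C, T).
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
BASES = PURINES | PYRIMIDINES


def base_composition(seq):
    """Return the fraction of A, C, G and T among the unambiguous bases of seq."""
    counts = Counter(b for b in seq.upper() if b in BASES)
    total = sum(counts.values())
    return {b: (counts[b] / total if total else 0.0) for b in "ACGT"}


def count_differences(seq_a, seq_b):
    """Count identical sites, transitions and transversions between two
    aligned sequences (hypothetical helper; gaps and ambiguity codes are skipped)."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    tally = {"identical": 0, "transition": 0, "transversion": 0}
    for x, y in zip(seq_a.upper(), seq_b.upper()):
        if x not in BASES or y not in BASES:
            continue  # skip gaps and ambiguous symbols
        if x == y:
            tally["identical"] += 1
        elif (x in PURINES) == (y in PURINES):
            # Change within one class: purine<->purine or pyrimidine<->pyrimidine.
            tally["transition"] += 1
        else:
            # Change between classes: purine<->pyrimidine.
            tally["transversion"] += 1
    return tally


if __name__ == "__main__":
    print(base_composition("AACG"))
    print(count_differences("ACGTACGT", "GCTTACAA"))
```

Skipping gaps and ambiguity codes is a simplifying assumption of this sketch. A real analysis would also have to decide which aligned positions count as conservative enough to include.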
Cite this article
Sidow, A., Wilson, A.C. Compositional statistics: An improvement of evolutionary parsimony and its application to deep branches in the tree of life. J Mol Evol 31, 51–68 (1990). https://doi.org/10.1007/BF02101792