Codon preference and primary sequence structure in protein-coding regions

Tavaré, Simon; Song, Brenda

doi:10.1007/BF02458838

Published: January 1989

Volume 51, pages 95–115 (1989)
Bulletin of Mathematical Biology

Abstract

The stochastic complexity of a data base of 365 protein-coding regions is analysed. When the primary sequence is modeled as a spatially homogeneous Markov source, the fit to observed codon preference is very poor. The situation improves substantially when a non-homogeneous model is used. Some implications for the estimation of species phylogeny and substitution rates are discussed.
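
The contrast drawn in the abstract between a homogeneous and a non-homogeneous Markov source can be illustrated with a minimal sketch. Everything specific below is an assumption made for illustration, since the abstract does not give the details: the chain is first order, the non-homogeneous model uses a separate transition matrix for each of the three codon positions, and a BIC-style penalty stands in for stochastic complexity. The function names and the toy sequence are hypothetical and do not come from the paper.

```python
from collections import Counter
from math import log


def markov_loglik(seq, period=1):
    """Maximized log-likelihood of a first-order Markov chain fitted to seq.

    period=1 uses one transition matrix everywhere (homogeneous source).
    period=3 uses one matrix per codon position, which is an assumed form
    of non-homogeneity, not necessarily the one used in the paper.
    """
    pair_counts = Counter()  # (phase, previous base, current base) -> count
    row_counts = Counter()   # (phase, previous base) -> count
    for i in range(1, len(seq)):
        phase = i % period
        pair_counts[(phase, seq[i - 1], seq[i])] += 1
        row_counts[(phase, seq[i - 1])] += 1
    # Plug in the maximum-likelihood estimates p(b | a, phase) = n_ab / n_a.
    return sum(n * log(n / row_counts[(phase, a)])
               for (phase, a, _b), n in pair_counts.items())


def n_free_params(period=1, alphabet=4):
    """Free transition probabilities: each matrix row sums to one."""
    return period * alphabet * (alphabet - 1)


def penalized_score(seq, period):
    """Negative log-likelihood plus (k/2) log n; lower is better.

    This BIC-style criterion is an assumption standing in for the
    stochastic complexity mentioned in the abstract. Requires len(seq) >= 2.
    """
    n = len(seq) - 1  # number of observed transitions
    return -markov_loglik(seq, period) + 0.5 * n_free_params(period) * log(n)


# Hypothetical toy coding sequence, read in frame from position 0.
toy = "ATGGCTGCAGCCATGGCTGCAGCC"
for period in (1, 3):
    print(period, round(penalized_score(toy, period), 2))
```

Because the homogeneous chain is a special case of the codon-position model, the position-dependent fit never has a lower maximized likelihood; the penalty term is what makes the comparison meaningful. On a sequence as short as the toy example the penalty dominates, so a comparison of this kind only becomes informative on a collection of real coding regions.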

Author information

Authors and Affiliations

Department of Mathematics, University of Utah, Salt Lake City, UT 84112, U.S.A.
Simon Tavaré & Brenda Song

About this article

Cite this article

Tavaré, S., Song, B. Codon preference and primary sequence structure in protein-coding regions. Bulletin of Mathematical Biology 51, 95–115 (1989). https://doi.org/10.1007/BF02458838

Received: 01 July 1988
Issue Date: January 1989
DOI: https://doi.org/10.1007/BF02458838
