Summary
Analysis of the sequence data available today, comprising more than 500,000 bases, confirms the previously observed phenomenon that there are distinct dinucleotide preferences in DNA sequences. Consistent behaviour is observed in the major sequence groups analysed here in prokaryotes, eukaryotes and mitochondria. Some doublet preferences are common to all groups and are found in most sequences of the Los Alamos Library. The patterns seen in such large data sets are very significant statistically and biologically. Since they are present in numerous and diverse nucleotide sequences, one may conclude that they confer evolutionary advantages on the organism.
In eukaryotes, RR and YY dinucleotides are preferred over YR and RY (where R is a purine and Y a pyrimidine). Since opposite-chain nearest-neighbour purine clashes are major determinants of DNA structure, it appears that the tight packaging of DNA in nucleosomes generally disfavours the steric repulsion that arises at YR and RY dinucleotides.
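To make the notion of a doublet preference concrete, the following is a minimal Python sketch, not the procedure used in the paper. It assumes a single-strand sequence over A, C, G and T, and it measures preference as the ratio of the observed dinucleotide frequency to the frequency expected from base composition alone; that null model, the function names and the toy sequence are all illustrative assumptions.

```python
from collections import Counter
from itertools import product

BASES = "ACGT"
PURINES = {"A", "G"}  # R = purine (A, G); Y = pyrimidine (C, T)


def doublet_ratios(seq):
    """Observed/expected ratio for each of the 16 dinucleotides.

    Expected frequency is the product of the two single-base
    frequencies (an assumed null model of independent neighbours).
    """
    seq = "".join(b for b in seq.upper() if b in BASES)
    n = len(seq)
    if n < 2:
        raise ValueError("need at least two bases")
    base_freq = {b: seq.count(b) / n for b in BASES}
    pairs = Counter(seq[i:i + 2] for i in range(n - 1))
    ratios = {}
    for x, y in product(BASES, repeat=2):
        expected = base_freq[x] * base_freq[y]
        observed = pairs[x + y] / (n - 1)
        ratios[x + y] = observed / expected if expected > 0 else float("nan")
    return ratios


def class_counts(seq):
    """Count neighbouring pairs by class: RR, YY, RY and YR."""
    seq = "".join(b for b in seq.upper() if b in BASES)
    counts = Counter()
    for a, b in zip(seq, seq[1:]):
        key = ("R" if a in PURINES else "Y") + ("R" if b in PURINES else "Y")
        counts[key] += 1
    return counts


if __name__ == "__main__":
    toy = "AAGGCCTT"  # hypothetical toy sequence, not real data
    print(class_counts(toy))
    print(round(doublet_ratios(toy)["AA"], 3))
```

A ratio above 1 indicates a dinucleotide that occurs more often than base composition predicts, and a ratio below 1 indicates one that is avoided. On real data the counts would be accumulated over many sequences before any RR/YY versus YR/RY comparison is drawn.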
Cite this article
Nussinov, R. Strong doublet preferences in nucleotide sequences and DNA geometry. J Mol Evol 20, 111–119 (1984). https://doi.org/10.1007/BF02257371