The Duplexing of the Genetic Code and Sequence-Dependent DNA Geometry

Kasman, Alex

doi:10.1007/s11538-018-0486-3

Original Article
Published: 10 August 2018

Volume 80, pages 2734–2760, (2018)

Bulletin of Mathematical Biology

Alex Kasman ORCID: orcid.org/0000-0002-9399-7430¹

Abstract

It is well known that sequences of bases in DNA are translated into sequences of amino acids in cells via the genetic code. More recently, it has been discovered that the sequence of DNA bases also influences the geometry and deformability of the DNA. These two correspondences represent a naturally arising example of duplexed codes, providing two different ways of interpreting the same DNA sequence. This paper will set up the notation and basic results necessary to mathematically investigate the relationship between these two natural DNA codes. It then undertakes two very different such investigations: one graphical approach based only on expected values and another analytic approach incorporating the deformability of the DNA molecule and approximating the mutual information of the two codes. Special emphasis is paid to whether there is evidence that pressure to maximize the duplexing efficiency influenced the evolution of the genetic code. Disappointingly, the results fail to support the hypothesis that the genetic code was influenced in this way. In fact, applying both methods to samples of realistic alternative genetic codes shows that the duplexing of the genetic code found in nature is just slightly less efficient than average. The implications of this negative result are considered in the final section of the paper.

Notes

Of course, Alice can similarly send a string of elements from \(\mathcal {S}\) to encode a string of elements from \(\mathcal {X}\), but all of the information about the code is contained in the map f acting on single elements and so the idea of using longer strings will largely be ignored below.
The specific size of the rectangles will never be specified. All that matters is that they are small enough that they do not intersect.
Because the trigonometric functions are not one-to-one, one must always be careful when working with their “inverses”, which are actually only inverses on specified intervals. The Tait–Bryan system of coordinates for angles is being used in this application because it places the endpoints of those intervals far from the orientations that appear in codon geometry. (In contrast, some of the other standard coordinate systems would assign very different angle coordinates to a geometry that bends a tiny amount in one direction from the vertical and one that bends a tiny amount in another direction.)
Unfortunately, angles in the Hassan–Calladine parameters are measured in degrees rather than radians.
The value of \(\phi \) has no effect when \(\theta _1^2+\theta _2^2=0\). It is merely for convenience that we set \(\phi =0\) in that case.
Here, we are making an assumption of normality. It does seem likely that the probability distribution for each parameter is approximately normal and centered at the expected value. However, even if that is not the case, so long as we are considering the average position for a large number of observations, the assumption should be valid by the Central Limit Theorem.
\(\text {diag}(z_1,\ldots ,z_6)\) denotes the diagonal matrix with the scalar \(z_i\) in the \(i\)th position on the diagonal. Then \(\text {diag}(z_1,\ldots ,z_6){\hat{\mathbf{s}}}(d)\) is a vector whose entries are the standard deviations of the Hassan–Calladine parameters for the dimer d, each scaled by the corresponding z-score.
Since the values of the Tait–Bryan angles are so limited in range in this application, the global topology of SO(3) can be ignored and distances computed merely as if these were points in \(\mathbb {R}^3\). The Mathematica notebook (“http://kasmana.people.cofc.edu/DNAGeometry/”) used to generate the figures and perform the calculations also includes code that uses a more sophisticated definition, one that forms an actual metric on the space of rotations, but the results were essentially the same and are not worth the extra complexity in notation required to introduce the necessary definitions.
These graphs are 3-dimensional. The figures included in this journal article are, by necessity, 2-dimensional projections. Since the third coordinate varies very little as compared to the other two, the projection from above has been selected and distances in the figure should be relatively accurate. Still, it is important to note that no such projection was used in the computation of the total length.
The alternative genetic codes considered by Itzkovitz and Alon are a subclass of the ones considered in this paper. In particular, they only consider the ones which can be produced by the composition of a permutation of the first bases, a permutation of the second bases, and a permutation of the third bases.
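
To make the subclass described in the last note concrete, the following Python sketch builds alternative codes from three independent base permutations, one per codon position. It is an illustration only, not the code from the paper's Mathematica notebook or from Itzkovitz and Alon. The names `permuted_code`, `all_base_permutations`, `identity`, and `swap_tc` are hypothetical, and the convention that the permutations are applied to the codon before the standard code is consulted is an assumption.

```python
from itertools import permutations, product

BASES = "TCAG"
# Standard genetic code in the usual TCAG ordering ('*' marks a stop codon).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def permuted_code(p1, p2, p3):
    """Return the code that reads codon b1 b2 b3 the way the standard code
    reads p1[b1] p2[b2] p3[b3]. Each p is a dict permuting the four bases
    (assumed convention: permute first, then look up)."""
    return {
        b1 + b2 + b3: STANDARD[p1[b1] + p2[b2] + p3[b3]]
        for b1, b2, b3 in product(BASES, repeat=3)
    }


def all_base_permutations():
    """Yield each of the 24 permutations of the four bases as a dict."""
    for image in permutations(BASES):
        yield dict(zip(BASES, image))


identity = dict(zip(BASES, BASES))
swap_tc = {"T": "C", "C": "T", "A": "A", "G": "G"}

# Example: exchange T and C in the second codon position only.
example = permuted_code(identity, swap_tc, identity)

# Number of (p1, p2, p3) triples, i.e. the size of this subclass when
# counted by its defining permutations.
n_codes = len(list(all_base_permutations())) ** 3
```

Because each such code is the standard code precomposed with a bijection on codons, it assigns every amino acid to the same number of codons as the standard code does. There are 24 choices per position, hence 24³ = 13,824 triples of permutations.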

References

Alexander RW, Schimmel P (2001) Wobble hypothesis. In: Brenner S, Miller JH (eds) Encyclopedia of genetics. Elsevier, Amsterdam
Barrell BG, Bankier AT, Drouin J (1979) A different genetic code in human mitochondria. Nature 282:189–194
Berg JM, Tymoczko JL, Stryer L (2002) Biochemistry, 5th edn. WH Freeman, New York. Section 5.5.1
Eslami-Mossallam B, Schram RD, Tompitak M, van Noort J, Schiessel H (2016) Multiplexing genetic and nucleosome positioning codes: a computational approach. PLoS One 11(6):e0156905. https://doi.org/10.1371/journal.pone.0156905
Fujii S, Kono H, Takenaka S, Go N, Sarai A (2007) Sequence-dependent DNA deformability studied using molecular dynamics simulations. Nucleic Acids Res 35(18):6063–6074
Hassan MA, Calladine CR (1995) The assessment of the geometry of dinucleotide steps in double-helical DNA; a new local calculation scheme. J Mol Biol 251:648–664
Itzkovitz S, Alon U (2007) The genetic code is nearly optimal for allowing additional information within protein-coding sequences. Genome Res 17(4):405–412
Kawaguchi Y, Honda H, Taniguchi-Morimura J, Iwasaki S (1989) The codon CUG is read as serine in an asporogenic yeast Candida cylindracea. Nature 341:164–166
Kiga D, Sakamoto K, Kodama K, Kigawa T, Matsuda T, Yabuki T, Shirouzu M, Harada Y, Nakayama H, Takio K (2002) An engineered Escherichia coli tyrosyl-tRNA synthetase for site-specific incorporation of an unnatural amino acid into proteins in eukaryotic translation and its application in a wheat germ cell-free system. Proc Natl Acad Sci USA 99:9715–9720
Koonin EV, Novozhilov AS (2017) Origin and evolution of the universal genetic code. Annu Rev Genet 51:45–62
Kumara B, Saini S (2016) Analysis of the optimality of the standard genetic code. Mol BioSyst 12:2642–2651
Lajoie MJ, Söll D, Church GM (2016) Overcoming challenges in engineering the genetic code. J Mol Biol 428(5 Pt B):1004–1021
Lankas F, Sponer J, Langowski J, Cheatham TE III (2003) DNA basepair step deformability inferred from molecular dynamics simulations. Biophys J 85:2872–2883
Liu CC, Schultz PG (2010) Adding new chemistries to the genetic code. Annu Rev Biochem 79:413–444
Matsumoto A, Olson WK (2002) Sequence-dependent motions of DNA: a normal mode analysis at the base-pair level. Biophys J 83:22–41
Olson WK, Gorin AA, Lu XJ, Hock LM, Zhurkin VB (1998) DNA sequence-dependent deformability deduced from protein-DNA crystal complexes. Proc Natl Acad Sci USA 95:11163–11168
Rohs R, West SM, Sosinsky A et al (2009) The role of DNA shape in protein-DNA recognition. Nature 461(7268):1248–1281
Srinivasan G, James CM (2002) Pyrrolysine encoded by UAG in Archaea. Science 296(5572):1459–1462
Wang L, Brock A, Herberich B, Schultz PG (2001) Expanding the genetic code of Escherichia coli. Science 292:498–500
Yamao F, Muto A, Kawauchi Y, Iwami M, Iwagami S, Azumi Y, Osawa S (1985) UGA is read as tryptophan in Mycoplasma capricolum. Proc Natl Acad Sci USA 82:2306–2309
Zhang Z, Yu J (2011) On the organizational dynamics of the genetic code. Genom Proteomics Bioinform 9(1–2):21–29

Acknowledgements

I am grateful to Jason Cantarella (University of Georgia), Madison Hyer (Medical University of South Carolina), Martin Jones (College of Charleston), Brenton Lemesurier (College of Charleston), Garrett Mitchener (College of Charleston), and Laura Kasman (Medical University of South Carolina) for helpful discussion and feedback. I would also like to thank Wilma Olson and the organizers of the Thematic Year on Mathematics of Molecular and Cellular Biology at the IMA where I met her and first learned about the sequence-dependent geometry of DNA.

Author information

Authors and Affiliations

College of Charleston, Charleston, USA
Alex Kasman

Corresponding author

Correspondence to Alex Kasman.

About this article

Cite this article

Kasman, A. The Duplexing of the Genetic Code and Sequence-Dependent DNA Geometry. Bull Math Biol 80, 2734–2760 (2018). https://doi.org/10.1007/s11538-018-0486-3

Received: 27 July 2017
Accepted: 03 August 2018
Published: 10 August 2018
Issue Date: October 2018
DOI: https://doi.org/10.1007/s11538-018-0486-3
