Abstract
This paper addresses the relationship between information and the structure of the genetic code. The code has two puzzling anomalies. First, when the 64 codons are viewed as the sub-cubes of a \(4 \times 4 \times 4\) cube, the codons for serine (S) are not contiguous. Second, some amino acids are encoded by a single codon, that is, with zero redundancy, which runs counter to the objective of error correction. To make sense of these anomalies, the paper argues that the genetic code must be understood not only in terms of stereochemical, co-evolution, and error-correction considerations, but also in terms of two additional factors that are significant to natural systems: the information-theoretic dimensionality of the code data and the principle of maximum entropy. One implication of the non-integer dimensionality associated with data is self-similarity at different scales, and the genetic code is shown to satisfy this property. It is further shown that the maximum entropy principle operates through a scrambling of the elements, in the sense of maximum algorithmic information complexity, generated by an appropriate exponentiation mapping. These new considerations, together with the maximum entropy transformation, create additional constraints that are likely the reasons for the non-uniform codon groups and for the codons with no redundancy.
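The two anomalies named in the abstract can be checked directly against the standard genetic code. The sketch below is only an illustration and is not taken from the paper: the names `CODE`, `codon_groups`, and `count_components` are hypothetical, and "contiguous" is assumed to mean that the codons of an amino acid are linked by single-nucleotide changes. That is the most permissive reading of adjacency in the cube, so a group that is split under it stays split under any face-adjacency layout of the \(4 \times 4 \times 4\) cube.

```python
from collections import defaultdict
from itertools import product

BASES = "UCAG"
# Standard genetic code in UCAG order (third base varies fastest); '*' marks stop.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Each codon is one cell of the 4 x 4 x 4 cube.
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def codon_groups(code):
    """Group codons by the amino acid (or stop signal) they encode."""
    groups = defaultdict(list)
    for codon, aa in code.items():
        groups[aa].append(codon)
    return dict(groups)


def differ_in_one_base(a, b):
    """True when two codons differ at exactly one position."""
    return sum(x != y for x, y in zip(a, b)) == 1


def count_components(codons):
    """Count connected pieces of a codon set under single-base changes (assumed adjacency)."""
    remaining = set(codons)
    components = 0
    while remaining:
        components += 1
        stack = [remaining.pop()]
        while stack:
            current = stack.pop()
            neighbours = {c for c in remaining if differ_in_one_base(current, c)}
            remaining -= neighbours
            stack.extend(neighbours)
    return components


groups = codon_groups(CODE)
singletons = sorted(aa for aa, cs in groups.items() if len(cs) == 1)
split = sorted(aa for aa, cs in groups.items() if count_components(cs) > 1)

print(singletons)  # ['M', 'W']  (zero-redundancy amino acids)
print(split)       # ['S']       (serine is the only non-contiguous group)
```

Under these assumptions, methionine and tryptophan are the only amino acids with a single codon, and serine is the only group whose codons (UCU, UCC, UCA, UCG versus AGU, AGC) fall into two disconnected pieces.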
Data availability
All available data are in the manuscript.
Code availability
Not applicable.
Funding
Not applicable.
Author information
Contributions
This paper represents the author's own work.
Ethics declarations
Conflict of interest
There are no financial or non-financial interests associated with the paper.
About this article
Cite this article
Kak, S. Self-similarity and the maximum entropy principle in the genetic code. Theory Biosci. 142, 205–210 (2023). https://doi.org/10.1007/s12064-023-00396-y