Abstract
An algebraic and geometrical approach is used to describe the primaeval RNA code and a proposed Extended RNA code. The former consists of all codons of the type RNY, where R denotes a purine, Y a pyrimidine, and N any of the four bases. The latter comprises the 16 codons of the type RNY plus the codons obtained by reading the RNA code in the second (NYR type) and third (YRN type) reading frames. Each of the three reading frames contains 16 triplets, which together form a set of 48 triplets that specify 17 of the 20 amino acids and include AUG, the start codon, and the three known stop codons. The other 16 codons do not belong to the Extended RNA code; they constitute the union of the triplets YYY and RRR, which we define as the RNA-less code. The codons in each of the three subsets of the Extended RNA code are represented by a four-dimensional hypercube, and the set of codons of the RNA-less code is portrayed as a four-dimensional hyperprism. Remarkably, the union of these four symmetrical, pairwise disjoint sets is precisely the already known six-dimensional hypercube of the Standard Genetic Code (SGC) of 64 triplets. These results suggest a plausible evolutionary path by which the primaeval RNA code could have given rise to the SGC, via the Extended RNA code plus the RNA-less code. We argue that the life forms that probably obeyed the Extended RNA code were intermediate between the ribo-organisms of the RNA World and the last common ancestor (LCA) of the Prokaryotes, Archaea, and Eucarya, that is, the cenancestor. A general encoding function, E, which maps each codon to its corresponding amino acid or to the stop signal, is also derived. In 45 of the 64 cases, this function takes the form of a linear transformation F, which projects the whole six-dimensional hypercube onto a four-dimensional hyperface formed by all triplets that end in cytosine.
In the remaining 19 cases the function E adopts the form of an affine transformation, i.e., the composition of F with a particular translation. Graphical representations of the four local encoding functions and of E are presented and discussed. For every amino acid and for the stop signal, a single triplet, among those that specify it, is selected as a canonical representative. From this mapping a graphical representation of the 20 amino acids and the stop signal is also derived. We conclude that the general encoding function E represents the SGC itself.
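The codon counts stated in the abstract can be checked by direct enumeration. The following is a minimal sketch, not the authors' algebraic construction: it assumes the standard genetic code table written in the usual compact UCAG ordering, and the helper names (`matches`, `codons_of`) are hypothetical, introduced only for illustration. It does not model the hypercube geometry or the functions E and F.

```python
from itertools import product

BASES = "UCAG"
PURINES, PYRIMIDINES = set("AG"), set("UC")

# Standard genetic code in UCAG order (third base varies fastest); "*" = stop.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
SGC = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def matches(codon, pattern):
    """True if each base of the codon belongs to the class named in pattern."""
    classes = {"R": PURINES, "Y": PYRIMIDINES, "N": set(BASES)}
    return all(base in classes[p] for base, p in zip(codon, pattern))


def codons_of(pattern):
    """All codons of a given type, e.g. 'RNY' (hypothetical helper)."""
    return {codon for codon in SGC if matches(codon, pattern)}


# The three reading frames of the RNA code, and the RNA-less code.
rny, nyr, yrn = codons_of("RNY"), codons_of("NYR"), codons_of("YRN")
extended = rny | nyr | yrn
rna_less = codons_of("RRR") | codons_of("YYY")

# Amino acids and stop codons specified by the Extended RNA code.
amino_acids = {SGC[c] for c in extended} - {"*"}
stops = {c for c in extended if SGC[c] == "*"}
missing = set(AMINO) - {"*"} - amino_acids

print(len(rny), len(nyr), len(yrn), len(extended), len(rna_less))
print(sorted(amino_acids), sorted(stops), sorted(missing))
```

Under these assumptions the four sets are pairwise disjoint and together cover all 64 triplets; the amino acids left to the RNA-less code are the ones whose codons are all of type RRR or YYY.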
Cite this article
José, M.V., Morgado, E.R. & Govezensky, T. An Extended RNA Code and its Relationship to the Standard Genetic Code: An Algebraic and Geometrical Approach. Bull. Math. Biol. 69, 215–243 (2007). https://doi.org/10.1007/s11538-006-9119-3