Strong Comma-Free Codes in Genetic Information

Fimmel, Elena; Michel, Christian J.; Strüngmann, Lutz

doi:10.1007/s11538-017-0307-0

Original Article
Published: 22 June 2017

Volume 79, pages 1796–1819, (2017)
Bulletin of Mathematical Biology

Elena Fimmel¹,
Christian J. Michel² &
Lutz Strüngmann¹

Abstract

Comma-free codes constitute a class of circular codes that has been widely studied, in particular by Golomb et al. (Biologiske Meddelelser, Kongelige Danske Videnskabernes Selskab 23:1–34, 1958a, Can J Math 10:202–209, 1958b), Michel et al. (Comput Math Appl 55:989–996, 2008a, Theor Comput Sci 401:17–26, 2008b, Inf Comput 212:55–63, 2012), Michel and Pirillo (Int J Comb 2011:659567, 2011), and Fimmel and Strüngmann (J Theor Biol 389:206–213, 2016). Based on a recent approach using graph theory to study circular codes (Fimmel et al., Philos Trans R Soc A 374:20150058, 2016), a new class of circular codes, called strong comma-free codes, is identified. These codes detect a frameshift during the translation process immediately after a reading window of at most two nucleotides. We describe several combinatorial properties of strong comma-free codes: enumeration, maximality, self-complementarity and the \(CF^3\)-property (the comma-free property in all three possible frames). These combinatorial results also highlight some new properties of the genetic code and its evolution. Each amino acid in the standard genetic code is coded by at least one strong comma-free code of size 1. For 9 of the 20 amino acids, \(S=\{Asn,Asp,Gln,Gly,Lys,Met,Phe,Pro,Trp\}\), the synonymous trinucleotide set (with the periodic trinucleotides \(\{AAA,CCC,GGG,TTT\}\) necessarily excluded) is a strong comma-free code. The primeval comma-free RNY code of Eigen and Schuster (Naturwissenschaften 65:341–369, 1978) is a self-complementary \(CF^3\)-code of size 16. Furthermore, it is the union of two strong comma-free codes of size 8 which are complementary to each other.
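
The statements about the RNY code can be checked by brute force. The following is a minimal sketch, not the authors' method: it assumes the classical definition of a comma-free trinucleotide code (no codeword occurs in frame 1 or 2 of any concatenation of two codewords), takes R = {A, G}, Y = {C, T} and N = any base, reads "self-complementary" as closure under reverse complement, and interprets the \(CF^3\)-property as comma-freeness of the code and of its two circular shifts. All function names are hypothetical. The strong comma-free property itself is not tested, because the abstract does not state its formal definition.

```python
from itertools import product

PURINES = "AG"      # R (assumed standard convention)
PYRIMIDINES = "CT"  # Y (assumed standard convention)
BASES = "ACGT"      # N
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def rny_code():
    """Build all trinucleotides of the form purine-any-pyrimidine."""
    return {r + n + y for r, n, y in product(PURINES, BASES, PYRIMIDINES)}


def is_comma_free(code):
    """Classical test: no codeword appears in a shifted frame of any pair xy."""
    for x, y in product(code, repeat=2):
        pair = x + y
        # pair[1:4] is the frame-1 reading, pair[2:5] the frame-2 reading.
        if pair[1:4] in code or pair[2:5] in code:
            return False
    return True


def reverse_complement(word):
    """Complement each base and reverse the word."""
    return "".join(COMPLEMENT[b] for b in reversed(word))


def is_self_complementary(code):
    """True if the code equals the set of its reverse complements."""
    return {reverse_complement(w) for w in code} == code


def shift(code, k):
    """Circularly permute every codeword by k positions to the left."""
    return {w[k:] + w[:k] for w in code}


rny = rny_code()
print("size:", len(rny))
print("comma-free:", is_comma_free(rny))
print("self-complementary:", is_self_complementary(rny))
# Assumed reading of CF^3: the code and both circular shifts are comma-free.
print("CF^3:", all(is_comma_free(shift(rny, k)) for k in range(3)))
```

Under these assumptions the four checks agree with the abstract: the code has 16 trinucleotides and is comma-free, self-complementary and comma-free in all three frames. A periodic trinucleotide such as AAA fails the test on its own, since AAAAAA contains AAA in a shifted frame, which is why periodic trinucleotides are excluded.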

References

Arquès DG, Michel CJ (1996) A complementary circular code in the protein coding genes. J Theor Biol 182:45–58
Canapa A, Cerioni PN, Barucca M, Olmo E, Caputo V (2002) A centromeric satellite DNA may be involved in heterochromatin compactness in gobiid fishes. Chromosom Res 10:297–304
Clark J, Holton DA (1991) A first look at graph theory. World Scientific, Singapore
Crick FH, Brenner S, Klug A, Pieczenik G (1976) A speculation on the origin of protein synthesis. Orig Life 7:389–397
Crick F, Griffith JS, Orgel LE (1957) Codes without commas. Proc Natl Acad Sci USA 43:416–421
Eigen M, Schuster P (1978) The hypercycle. A principle of natural self-organization. Part C: the realistic hypercycle. Naturwissenschaften 65:341–369
El Soufi K, Michel CJ (2014) Circular code motifs in the ribosome decoding center. Comput Biol Chem 52:9–17
El Soufi K, Michel CJ (2015) Circular code motifs near the ribosome decoding center. Comput Biol Chem 59:158–176
El Soufi K, Michel CJ (2016) Circular code motifs in genomes of eukaryotes. J Theor Biol 408:198–212
El Soufi K, Michel CJ (2017) Unitary circular code motifs in genomes of eukaryotes. Biosystems 153:45–62
Fimmel E, Giannerini S, Gonzalez D, Strüngmann L (2014) Circular codes, symmetries and transformations. J Math Biol. doi:10.1007/s00285-014-0806-7
Fimmel E, Strüngmann L (2015) On the hierarchy of trinucleotide n-circular codes and their corresponding amino acids. J Theor Biol 364:113–120
Fimmel E, Strüngmann L (2016) Maximal dinucleotide comma-free codes. J Theor Biol 389:206–213
Fimmel E, Michel CJ, Strüngmann L (2016) \(n\)-Nucleotide circular codes in graph theory. Philos Trans R Soc A 374:20150058
Frey G, Michel CJ (2006) Identification of circular codes in bacterial genomes and their use in a factorization method for retrieving the reading frames of genes. Comput Biol Chem 30:87–101
Gemayel R, Vinces MD, Legendre M, Verstrepen KJ (2010) Variable tandem repeats accelerate evolution of coding and regulatory sequences. Ann Rev Genet 44:445–477
Golomb SW, Delbrück M, Welch LR (1958a) Construction and properties of comma-free codes. Biologiske Meddelelser, Kongelige Danske Videnskabernes Selskab 23:1–34
Golomb SW, Gordon B, Welch LR (1958b) Comma-free codes. Can J Math 10:202–209
Michel CJ (2012) Circular code motifs in transfer and 16S ribosomal RNAs: a possible translation code in genes. Comput Biol Chem 37:24–37
Michel CJ (2013) Circular code motifs in transfer RNAs. Comput Biol Chem 45:17–29
Michel CJ (2015) The maximal \(C^3\) self-complementary trinucleotide circular code \(X\) in genes of bacteria, eukaryotes, plasmids and viruses. J Theor Biol 380:156–177
Michel CJ (2017) The maximal \(C^3\) self-complementary trinucleotide circular code \(X\) in genes of bacteria, archaea, eukaryotes, plasmids and viruses. Life 7(20):1–16
Michel CJ, Pirillo G (2011) Strong trinucleotide circular codes. Int J Comb 2011:659567. doi:10.1155/2011/659567
Michel CJ, Pirillo G, Pirillo MA (2008a) Varieties of comma free codes. Comput Math Appl 55:989–996
Michel CJ, Pirillo G, Pirillo MA (2008b) A relation between trinucleotide comma-free codes and trinucleotide circular codes. Theor Comput Sci 401:17–26
Michel CJ, Pirillo G, Pirillo MA (2012) A classification of 20-trinucleotide circular codes. Inf Comput 212:55–63
Nirenberg MW, Matthaei JH (1961) The dependence of cell-free protein synthesis in E. coli upon naturally occurring or synthetic polyribonucleotides. Proc Natl Acad Sci USA 47:1588–1602
Shepherd JCW (1981) Method to determine the reading frame of a protein from the purine/pyrimidine genome sequence and its possible evolutionary justification. Proc Natl Acad Sci USA 78:1596–1600

Author information

Authors and Affiliations

Institute of Mathematical Biology, Faculty for Computer Sciences, Mannheim University of Applied Sciences, 68163, Mannheim, Germany
Elena Fimmel & Lutz Strüngmann
Theoretical Bioinformatics, ICube, CNRS, University of Strasbourg, 300 Boulevard Sébastien Brant, 67400, Illkirch, France
Christian J. Michel

Corresponding author

Correspondence to Christian J. Michel.

About this article

Cite this article

Fimmel, E., Michel, C.J. & Strüngmann, L. Strong Comma-Free Codes in Genetic Information. Bull Math Biol 79, 1796–1819 (2017). https://doi.org/10.1007/s11538-017-0307-0

Received: 08 January 2017
Accepted: 02 June 2017
Published: 22 June 2017
Issue Date: August 2017
DOI: https://doi.org/10.1007/s11538-017-0307-0