Strong Comma-Free Codes in Genetic Information

  • Original Article
  • Bulletin of Mathematical Biology

Abstract

Comma-free codes constitute a class of circular codes that has been widely studied, in particular by Golomb et al. (Biologiske Meddelelser, Kongelige Danske Videnskabernes Selskab 23:1–34, 1958a, Can J Math 10:202–209, 1958b), Michel et al. (Comput Math Appl 55:989–996, 2008a, Theor Comput Sci 401:17–26, 2008b, Inf Comput 212:55–63, 2012), Michel and Pirillo (Int J Comb 2011:659567, 2011), and Fimmel and Strüngmann (J Theor Biol 389:206–213, 2016). Based on a recent graph-theoretic approach to the study of circular codes (Fimmel et al., Philos Trans R Soc A 374:20150058, 2016), a new class of circular codes, called strong comma-free codes, is identified. These codes detect a frameshift during the translation process immediately after a reading window of at most two nucleotides. We describe several combinatorial properties of strong comma-free codes: enumeration, maximality, self-complementarity and the \(CF^3\)-property (the comma-free property in all three possible frames). These combinatorial results also highlight some new properties of the genetic code and its evolution. Each amino acid in the standard genetic code is coded by at least one strong comma-free code of size 1. For 9 of the 20 amino acids, \(S=\{Asn,Asp,Gln,Gly,Lys,Met,Phe,Pro,Trp\}\), the synonymous trinucleotide set (necessarily excluding the periodic trinucleotides \(\{AAA,CCC,GGG,TTT\}\)) is a strong comma-free code. The primeval comma-free RNY code of Eigen and Schuster (Naturwissenschaften 65:341–369, 1978) is a self-complementary \(CF^3\)-code of size 16. Furthermore, it is the union of two strong comma-free codes of size 8 which are complementary to each other.
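
The statements about the RNY code can be checked with a short Python sketch. It is illustrative only and is not taken from the article. The comma-free test uses the standard definition (no codeword appears in a shifted frame of any concatenation of two codewords). The strong comma-free test is an assumption: the abstract does not state the formal definition, so the sketch takes "a reading window of at most two nucleotides" to mean that no suffix of length 1 or 2 of a codeword equals a prefix of length 1 or 2 of a codeword. The function names and the split of RNY by its middle base (RRY and RYY) are likewise hypothetical choices made for this example.

```python
from itertools import product

PURINES = "AG"      # R
PYRIMIDINES = "CT"  # Y
BASES = "ACGT"      # N


def is_comma_free(code):
    """Standard definition: for all codewords x, y, neither shifted
    frame of the concatenation xy contains a codeword."""
    code = set(code)
    for x, y in product(code, repeat=2):
        xy = x + y
        if xy[1:4] in code or xy[2:5] in code:
            return False
    return True


def is_strong_comma_free(code):
    """Assumed criterion (not quoted from the article): the sets of
    length-1/2 prefixes and length-1/2 suffixes are disjoint."""
    prefixes = {w[:k] for w in code for k in (1, 2)}
    suffixes = {w[-k:] for w in code for k in (1, 2)}
    return prefixes.isdisjoint(suffixes)


# RNY code: purine, any base, pyrimidine (16 trinucleotides).
rny = {r + n + y for r, n, y in product(PURINES, BASES, PYRIMIDINES)}

# Hypothetical split by the middle base: RRY and RYY (8 trinucleotides each).
rry = {w for w in rny if w[1] in PURINES}
ryy = rny - rry

print(len(rny), is_comma_free(rny), is_strong_comma_free(rny))
print(len(rry), is_strong_comma_free(rry), is_strong_comma_free(ryy))
```

Under the assumed criterion, RNY itself is comma-free but not strong comma-free (for example, AC is a suffix of AAC and a prefix of ACC), while each of the two halves passes the check. The same function can be applied to a synonymous codon set such as {AAG} for Lys.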

References

  • Arquès DG, Michel CJ (1996) A complementary circular code in the protein coding genes. J Theor Biol 182:45–58

  • Canapa A, Cerioni PN, Barucca M, Olmo E, Caputo V (2002) A centromeric satellite DNA may be involved in heterochromatin compactness in gobiid fishes. Chromosom Res 10:297–304

  • Clark J, Holton DA (1991) A first look at graph theory. World Scientific, Singapore

  • Crick FH, Brenner S, Klug A, Pieczenik G (1976) A speculation on the origin of protein synthesis. Orig Life 7:389–397

  • Crick FH, Griffith JS, Orgel LE (1957) Codes without commas. Proc Natl Acad Sci USA 43:416–421

  • Eigen M, Schuster P (1978) The hypercycle. A principle of natural self-organization. Part C: the realistic hypercycle. Naturwissenschaften 65:341–369

  • El Soufi K, Michel CJ (2014) Circular code motifs in the ribosome decoding center. Comput Biol Chem 52:9–17

  • El Soufi K, Michel CJ (2015) Circular code motifs near the ribosome decoding center. Comput Biol Chem 59:158–176

  • El Soufi K, Michel CJ (2016) Circular code motifs in genomes of eukaryotes. J Theor Biol 408:198–212

  • El Soufi K, Michel CJ (2017) Unitary circular code motifs in genomes of eukaryotes. Biosystems 153:45–62

  • Fimmel E, Giannerini S, Gonzalez D, Strüngmann L (2014) Circular codes, symmetries and transformations. J Math Biol. doi:10.1007/s00285-014-0806-7

  • Fimmel E, Strüngmann L (2015) On the hierarchy of trinucleotide n-circular codes and their corresponding amino acids. J Theor Biol 364:113–120

  • Fimmel E, Strüngmann L (2016) Maximal dinucleotide comma-free codes. J Theor Biol 389:206–213

  • Fimmel E, Michel CJ, Strüngmann L (2016) \(n\)-Nucleotide circular codes in graph theory. Philos Trans R Soc A 374:20150058

  • Frey G, Michel CJ (2006) Identification of circular codes in bacterial genomes and their use in a factorization method for retrieving the reading frames of genes. Comput Biol Chem 30:87–101

  • Gemayel R, Vinces MD, Legendre M, Verstrepen KJ (2010) Variable tandem repeats accelerate evolution of coding and regulatory sequences. Annu Rev Genet 44:445–477

  • Golomb SW, Delbrück M, Welch LR (1958a) Construction and properties of comma-free codes. Biologiske Meddelelser, Kongelige Danske Videnskabernes Selskab 23:1–34

  • Golomb SW, Gordon B, Welch LR (1958b) Comma-free codes. Can J Math 10:202–209

  • Michel CJ (2012) Circular code motifs in transfer and 16S ribosomal RNAs: a possible translation code in genes. Comput Biol Chem 37:24–37

  • Michel CJ (2013) Circular code motifs in transfer RNAs. Comput Biol Chem 45:17–29

  • Michel CJ (2015) The maximal \(C^3\) self-complementary trinucleotide circular code \(X\) in genes of bacteria, eukaryotes, plasmids and viruses. J Theor Biol 380:156–177

  • Michel CJ (2017) The maximal \(C^3\) self-complementary trinucleotide circular code \(X\) in genes of bacteria, archaea, eukaryotes, plasmids and viruses. Life 7(20):1–16

  • Michel CJ, Pirillo G (2011) Strong trinucleotide circular codes. Int J Comb 2011:659567. doi:10.1155/2011/659567

  • Michel CJ, Pirillo G, Pirillo MA (2008a) Varieties of comma free codes. Comput Math Appl 55:989–996

  • Michel CJ, Pirillo G, Pirillo MA (2008b) A relation between trinucleotide comma-free codes and trinucleotide circular codes. Theor Comput Sci 401:17–26

  • Michel CJ, Pirillo G, Pirillo MA (2012) A classification of 20-trinucleotide circular codes. Inf Comput 212:55–63

  • Nirenberg MW, Matthaei JH (1961) The dependence of cell-free protein synthesis in E. coli upon naturally occurring or synthetic polyribonucleotides. Proc Natl Acad Sci USA 47:1588–1602

  • Shepherd JCW (1981) Method to determine the reading frame of a protein from the purine/pyrimidine genome sequence and its possible evolutionary justification. Proc Natl Acad Sci USA 78:1596–1600

Author information

Corresponding author

Correspondence to Christian J. Michel.

About this article

Cite this article

Fimmel, E., Michel, C.J. & Strüngmann, L. Strong Comma-Free Codes in Genetic Information. Bull Math Biol 79, 1796–1819 (2017). https://doi.org/10.1007/s11538-017-0307-0

  • DOI: https://doi.org/10.1007/s11538-017-0307-0
