Two Proteins for the Price of One: The Design of Maximally Compressed Coding Sequences

  • Bei Wang
  • Dimitris Papamichail
  • Steffen Mueller
  • Steven Skiena
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3892)


The emerging field of synthetic biology moves beyond conventional genetic manipulation to construct novel life forms which do not originate in nature. We explore the problem of designing the provably shortest genomic sequence to encode a given set of genes by exploiting alternate reading frames. We present an algorithm for designing the shortest DNA sequence simultaneously encoding two given amino acid sequences. We show that the coding sequences of naturally occurring pairs of overlapping genes approach maximum compression. We also investigate the impact of alternate coding matrices on overlapping sequence design. Finally, we discuss an interesting application of overlapping gene design, namely the interleaving of an antibiotic resistance gene into a target gene inserted into a virus or plasmid for amplification.
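To make the design problem in the abstract concrete, the following is a minimal brute-force sketch, not the authors' algorithm. It assumes the standard genetic code, considers only same-strand overlaps, and requires exact amino acid matches, so it does not model the alternate coding matrices mentioned above. The names `overlap_at_offset` and `shortest_same_strand` are hypothetical and chosen for illustration only.

```python
# Standard genetic code, codons enumerated in TCAG order (assumption: NCBI table 1).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}

# Reverse map: amino acid -> list of codons that encode it.
CODONS_FOR = {}
for codon, aa in CODON_TABLE.items():
    CODONS_FOR.setdefault(aa, []).append(codon)


def translate(dna):
    """Translate dna in frame 0, ignoring any trailing partial codon."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


def overlap_at_offset(first, second, offset):
    """Return a DNA string encoding `first` from position 0 and `second`
    from position `offset` on the same strand, or None if none exists."""
    length = max(3 * len(first), offset + 3 * len(second))
    # Each constraint says: the codon starting at `start` must encode `aa`.
    constraints = ([(3 * i, aa) for i, aa in enumerate(first)] +
                   [(offset + 3 * j, aa) for j, aa in enumerate(second)])

    def consistent(seq):
        # Check every codon that covers the newest nucleotide: the part
        # filled in so far must be a prefix of some codon for its amino acid.
        p = len(seq) - 1
        for start, aa in constraints:
            if start <= p < start + 3:
                prefix = seq[start:p + 1]
                if not any(c.startswith(prefix) for c in CODONS_FOR[aa]):
                    return False
        return True

    def extend(seq):
        # Depth-first search, one nucleotide at a time, with pruning.
        if len(seq) == length:
            return seq
        for base in BASES:
            candidate = seq + base
            if consistent(candidate):
                found = extend(candidate)
                if found is not None:
                    return found
        return None

    return extend("")


def shortest_same_strand(prot_a, prot_b):
    """Shortest same-strand DNA encoding both proteins (brute force)."""
    best = None
    for first, second in ((prot_a, prot_b), (prot_b, prot_a)):
        # The total length never decreases as the offset grows, so the first
        # feasible offset is optimal for this ordering. An offset of
        # 3 * len(first) is plain concatenation and is always feasible.
        for offset in range(3 * len(first) + 1):
            dna = overlap_at_offset(first, second, offset)
            if dna is not None:
                if best is None or len(dna) < len(best):
                    best = dna
                break
    return best
```

For example, `shortest_same_strand("M", "W")` returns `"ATGG"`: methionine is read from `ATG` at position 0 and tryptophan from `TGG` at position 1, so four nucleotides suffice where plain concatenation would need six. The search is exponential in the worst case and is meant only for very short peptides.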


Gene Pair, Antibiotic Resistance Gene, Substitution Matrix, Human Disease Gene, Alternate Reading Frame
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Bei Wang (1)
  • Dimitris Papamichail (2)
  • Steffen Mueller (3)
  • Steven Skiena (2)

  1. Dept. of Computer Science, Duke University, Durham, USA
  2. Dept. of Computer Science, State University of New York, Stony Brook, USA
  3. Dept. of Microbiology, State University of New York, Stony Brook, USA
