An essential step of any DNA computation is encoding the input data on single or double DNA strands. Due to the biochemical properties of DNA, complementary single strands can bind to one another forming double-stranded DNA. Consequently, data-encoding DNA strands can sometimes interact in undesirable ways when used in computations. It is crucial thus to analyze properties that guard against such phenomena and study sets of sequences that ensure that no unwanted bindings occur during any computation. This paper formalizes and investigates properties of DNA languages that guarantee their robusteness during computations. After defining and investigating several types of DNA languages possessing good encoding properties, such as sticky-free and overhang-free languages, we give algorithms for deciding whether regular DNA languages are invariant under bio-operations. We also give a method for constructing DNA languages that, in addition to being invariant and sticky-free, possess error-detecting properties. Finally, we present the results of running tests that check whether several known gene languages (the set of genes of a given organism) as well as the input DNA languages used in Adleman’s DNA computing experiment, have the defined properties.
Keywords:Theoretical DNA computing DNA encodings codes formal languages.
Unable to display preview. Download preview PDF.