Abstract
In chemistry, synthesis is the process in which a target compound is produced in a step-wise manner from given base compounds. A recent, promising approach for carrying out these reactions is DNA-templated synthesis, since, as opposed to more traditional methods, this approach leads to a much higher effective molarity and makes the much-desired (sequential) one-pot synthesis possible. With this method, compounds are tagged with DNA sequences, and reactions can be controlled by bringing two compounds together via their tags. This leads to new cost optimization problems of minimizing the number of different tags or strands to be used under various conditions. We identify relevant optimization criteria, provide the first computational approach to automatically inferring DNA-templated programs, and obtain optimal and near-optimal results efficiently. We also provide a brute-force integer linear programming approach for complete solutions to smaller instances.
References
Adleman LM (1994) Molecular computation of solutions to combinatorial problems. Science 266(5187):1021–1024
Andersen JL, Flamm C, Hanczyc MM, Merkle D (2015) Towards optimal DNA-templated computing. Int J Unconv Comput 11(3–4):185–203
Benson E, Mohammed A, Gardell J, Masich S, Czeizler E, Orponen P, Högberg B (2015) DNA rendering of polyhedral meshes at the nanoscale. Nature 523:441–444
Cardelli L (2010) Two-domain DNA strand displacement. In: 6th workshop on developments in computational models, volume 26 of electronic proceedings in theoretical computer science, pp 47–61
de Werra D, Pasche C (1989) Paths, chains, and antipaths. Networks 19(1):107–115
Ershov AP (1958) On programming of arithmetic operations. Dokl Akad Nauk SSSR 118(3):427–430
Flajolet P, Raoult JC, Vuillemin J (1979) The number of registers required for evaluating arithmetic expressions. Theor Comput Sci 9(1):99–125
Goodnow RA Jr, Dumelin CE, Keefe AD (2017) DNA-encoded chemistry: enabling the deeper sampling of chemical space. Nat Rev Drug Discov 16:131–147
Gorska K, Winssinger N (2013) Reactions templated by nucleic acids: more ways to translate oligonucleotide-based instructions into emerging function. Angew Chem Int Ed 52(27):6820–6843
Hansen BN, Mihalchuk A (2015) DNA-templated computing. Master’s thesis, University of Southern Denmark, Denmark. http://cheminf.imada.sdu.dk/dna/. Accessed 25 July 2018
Hansen BN, Larsen KS, Merkle D, Mihalchuk A (2017) DNA-templated synthesis optimization. In: 23rd international conference on DNA computing and molecular programming (DNA), volume 10467 of Lecture Notes in Computer Science. Springer, pp 17–32
He Y, Liu DR (2011) A sequential strand-displacement strategy enables efficient six-step DNA-templated synthesis. J Am Chem Soc 133(26):9972–9975
Hendrickson JB (1977) Systematic synthesis design. 6. Yield analysis and convergency. J Am Chem Soc 99:5439–5450
Li X, Liu DR (2004) DNA-templated organic synthesis: nature’s strategy for controlling chemical reactivity applied to synthetic molecules. Angew Chem Int Ed 43:4848–4870
Meng W, Muscat RA, McKee ML, Milnes PJ, El-Sagheer AH, Bath J, Davis BG, Brown T, O’Reilly RK, Turberfield AJ (2016) An autonomous molecular assembler for programmable chemical synthesis. Nat Chem 8:542–548
Nakata I (1967) On compiling algorithms for arithmetic expressions. Commun ACM 10(8):492–494
Phillips A, Cardelli L (2009) A programming language for composable DNA circuits. J R Soc Interface 6(Suppl 4):S419–S436
Sethi R, Ullman JD (1970) The generation of optimal code for arithmetic expressions. J ACM 17(4):715–728
Strahler AN (1952) Hypsometric (area-altitude) analysis of erosional topography. Bull Geol Soc Am 63:1117–1142
Wickham SFJ, Bath J, Katsuda Y, Endo M, Hidaka K, Sugiyama H, Turberfield AJ (2012) A DNA-based molecular motor that can navigate a network of tracks. Nat Nanotechnol 7:169–173
Additional information
A conference version of this paper was presented at DNA23 (Hansen et al. 2017). This paper differs substantially from the conference version: We provide proofs of all theorems and have added a section on brute-force computation using integer linear programming. Furthermore, we present an empirical evaluation for the inference of DNA-templated programs with respect to different optimization criteria, and we employ a large set of synthesis plans in order to analyze the solution space of the underlying optimization question. The second and third authors were supported in part by the Danish Council for Independent Research, Natural Sciences, Grants DFF-1323-00247 and DFF-7014-00041.
Appendices
Appendix A: DNA program example
We consider an example synthesis tree with four base compounds. The actual names of the compounds are not used in any of our algorithms, but for illustration, assume the base compounds are A, B, C, and D. Furthermore, we assume that the tagged compound A reacts with the tagged compound B (\(A+B \rightarrow E\)), and that E will have the tag of B. The complete assumptions are
\[A+B \rightarrow E, \qquad C+D \rightarrow F, \qquad E+F \rightarrow X,\]
and we demonstrate one possible program computing the target compound X as a sequential one-pot synthesis.
We first tag the base compounds A at the left end of the tag a and B at the right end of the tag b. The tag a (respectively b) is depicted as a red (respectively blue) line in the following (Color figure online).
The state is as follows:
We add the complementary strand \(\overline{ba}\) in order to bring A and B in close vicinity and they react to produce E. In this process, A loses its tag.
We release the produced tagged compound E with the strand ba and E is now tagged with b. The tag a is now unattached and we add the complementary tag \(\overline{a}\) such that in the subsequent operations, it can be ignored.
Since they are no longer relevant, we will not depict the inert strands in the following.
In order to avoid unintended interference, we block the tagged compound E with a strand \(\overline{bc}\) (\(\overline{c}\) shown in orange) (Color figure online).
We proceed with the base compounds C and D in a similar manner. Note that C is tagged with a and D is tagged with b, i.e., adding them to the pot in the beginning would have led to unintended interference. By adding \(\overline{ba}\), the tagged compounds C and D react to produce F, and D loses its tag.
We then release the tagged compound F using the strand ba and pacify the tag b.
The blocked tagged compound E is released with the strand bc.
Finally, the tagged compounds E and F are brought in close vicinity using the strand \(\overline{ba}\), producing X, and F loses its tag.
In the very last step, the target compound is released using strand ba, which finalizes the synthesis.
The only non-inert tag is the tag attached to compound X, which makes it chemically easy to extract the compound from the pot. The synthesis required three different tags and two different strands (and their corresponding complementary tags and strands).
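The tag and strand counts stated above can be checked with a small bookkeeping sketch. The encoding below is hypothetical and not the notation used in the paper: each step is an (operation, strand) pair transcribed by hand from the walkthrough, each tag is assumed to be a single letter so that a strand name such as "ba" is the concatenation of its tags, and the pacification steps (adding complementary tags to inert strands) are left out.

```python
# Steps of the Appendix A program, transcribed by hand from the walkthrough.
# The (operation, strand) encoding is an illustrative assumption.
PROGRAM = [
    ("join", "ba"),     # complement of ba brings A and B together: A + B -> E
    ("release", "ba"),  # strand ba releases E, now tagged b
    ("block", "bc"),    # complement of bc blocks E
    ("join", "ba"),     # C + D -> F
    ("release", "ba"),  # strand ba releases F
    ("unblock", "bc"),  # strand bc releases the blocked E
    ("join", "ba"),     # E + F -> X
    ("release", "ba"),  # strand ba releases the target X
]


def count_resources(program):
    """Return (number of distinct tags, number of distinct strands)."""
    strands = {strand for _, strand in program}
    # Assumption: single-letter tags, so a strand name lists its tags.
    tags = {tag for strand in strands for tag in strand}
    return len(tags), len(strands)


num_tags, num_strands = count_resources(PROGRAM)
print(num_tags, num_strands)
```

Under these assumptions the counts agree with the text: the tags a, b, and c and the strands ba and bc, with complements counted together with their originals.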
The given example also illustrates how the number of tags for blocking is minimized under the assumption that only two tags are used on the compounds (see the definition of Mnt). Without loss of generality, we choose the goal compound X to be tagged with b. Given that decision, and given that we have restricted ourselves to using only two different tags on the compounds, there are no further choices for tagging: The tagging of all nodes in the tree is simply inferred as follows. The nodes A, C, and F need to be tagged with an a, and B, D, and E with a b. In this example, the subtree of the root X corresponding to \(A+B \rightarrow E\) is synthesized before the subtree corresponding to \(C+D\rightarrow F\). As we need to block the result of the former synthesis, we need an additional tag for blocking for the subtree E. With respect to the definition of Mnt, this corresponds to the recursive calculations for the inference \(\max ({\textsc {Mnt}}(E, 0, 0),{\textsc {Mnt}}(F,1,0))\) (the choice to synthesize the subtree \(C+D\rightarrow F\) first would, in this specific example, lead to the same overall result). This leads to the following base cases for the leaves: \({\textsc {Mnt}}(A, 0, 0) = 0\) and \({\textsc {Mnt}}(B, 0, 0) = 0\), and for the other subtree \({\textsc {Mnt}}(C, 1, 0) = 1\) and \({\textsc {Mnt}}(D, 1 ,0) = 1\). Consequently, \({\textsc {Mnt}}(E, 0, 0) = 0\) and \({\textsc {Mnt}}(F, 1, 0) = 1\), leading to \({\textsc {Mnt}}(X, 0, 0) = \min ( \max ({\textsc {Mnt}}(E, 0, 0), {\textsc {Mnt}}(F, 1, 0) ), \ldots ) = 1\). Thus, only one additional tag is needed for blocking.
Appendix B: Example tree used for empirical evaluation
Figure 5 shows the example tree used for the empirical evaluation.
Appendix C: Details for empirical evaluation
In order to perform an empirical evaluation of the possible sets of strands which can be used to perform a specific synthesis successfully, we randomly created 2,999,928 synthesis trees with Strahler number \(\mathscr {S}(t)=6\), using the following recursive process: If the Strahler number s is one, the tree is just one node (a leaf). Otherwise, we create a node and generate its subtrees as follows. With probability 2/3, we recursively generate two subtrees, both of which have Strahler number \(s-1\). With probability 1/3, we let one subtree have Strahler number s and choose uniformly at random between the Strahler numbers one through \(s-1\) for the other subtree. In all cases, the ordering of the subtrees (left or right) is decided upon uniformly at random.
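The recursive generation process can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used for the evaluation: the tree representation (a leaf is `None`, an inner node is a pair of subtrees) and the function names `random_tree` and `strahler` are hypothetical, and the Strahler number is computed with the standard definition in which a leaf has number one.

```python
import random


def random_tree(s, rng):
    """Generate a random binary tree with Strahler number s.

    A leaf is represented as None, an inner node as a (left, right) tuple.
    This representation is an illustrative assumption.
    """
    if s == 1:
        return None  # Strahler number one corresponds to a single leaf
    if rng.random() < 2 / 3:
        # Both subtrees have Strahler number s - 1.
        left = random_tree(s - 1, rng)
        right = random_tree(s - 1, rng)
    else:
        # One subtree keeps Strahler number s, the other gets a
        # uniformly random Strahler number from 1 to s - 1.
        left = random_tree(s, rng)
        right = random_tree(rng.randint(1, s - 1), rng)
    # The ordering of the subtrees is decided uniformly at random.
    if rng.random() < 0.5:
        left, right = right, left
    return (left, right)


def strahler(tree):
    """Compute the Strahler number of a tree (leaf = 1)."""
    if tree is None:
        return 1
    left, right = strahler(tree[0]), strahler(tree[1])
    # Equal children increase the number by one; otherwise take the larger.
    return left + 1 if left == right else max(left, right)


rng = random.Random(2018)  # fixed seed only for reproducibility of the sketch
tree = random_tree(6, rng)
print(strahler(tree))
```

Both branches of the process preserve the requested Strahler number: two subtrees with number \(s-1\) combine to s, and a subtree with number s dominates a sibling with a smaller number. The recursion terminates with probability one because the branch that keeps the number s is taken only with probability 1/3 at each step.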
Of the 37 possible strand sets which use 4 tags, only 13 were able to solve at least one synthesis plan. 23,780 of all the synthesis plans could not be solved with a strand set based on 3 tags (and 5 strands), but required 4 tags (and 5 strands). Only 9 of the 13 strand sets could be used to solve at least one of the 23,780 synthesis plans and 2 of the 9 strand sets could be used for all 23,780 synthesis plans; see Fig. 4 and the details given in Table 3.
Cite this article
Hansen, B.N., Larsen, K.S., Merkle, D. et al. DNA-templated synthesis optimization. Nat Comput 17, 693–707 (2018). https://doi.org/10.1007/s11047-018-9697-7
DOI: https://doi.org/10.1007/s11047-018-9697-7