DNA-templated synthesis optimization

Natural Computing

Abstract

In chemistry, synthesis is the process in which a target compound is produced in a step-wise manner from given base compounds. A recent, promising approach for carrying out these reactions is DNA-templated synthesis: compared with more traditional methods, it leads to a much higher effective molarity and makes the much-desired (sequential) one-pot synthesis possible. With this method, compounds are tagged with DNA sequences, and reactions are controlled by bringing two compounds together via their tags. This leads to new cost optimization problems of minimizing the number of different tags or strands to be used under various conditions. We identify relevant optimization criteria, provide the first computational approach to automatically inferring DNA-templated programs, and obtain efficient optimal and near-optimal results. We also provide a brute-force integer linear programming approach that yields complete solutions for smaller instances.

References

  • Adleman LM (1994) Molecular computation of solutions to combinatorial problems. Science 266(5187):1021–1024

  • Andersen JL, Flamm C, Hanczyc MM, Merkle D (2015) Towards optimal DNA-templated computing. Int J Unconv Comput 11(3–4):185–203

  • Benson E, Mohammed A, Gardell J, Masich S, Czeizler E, Orponen P, Högberg B (2015) DNA rendering of polyhedral meshes at the nanoscale. Nature 523:441–444

  • Cardelli L (2010) Two-domain DNA strand displacement. In: 6th workshop on developments in computational models, volume 26 of electronic proceedings in theoretical computer science, pp 47–61

  • de Werra D, Pasche C (1989) Paths, chains, and antipaths. Networks 19(1):107–115

  • Ershov AP (1958) On programming of arithmetic operations. Dokl Akad Nauk SSSR 118(3):427–430

  • Flajolet P, Raoult JC, Vuillemin J (1979) The number of registers required for evaluating arithmetic expressions. Theor Comput Sci 9(1):99–125

  • Goodnow RA Jr, Dumelin CE, Keefe AD (2017) DNA-encoded chemistry: enabling the deeper sampling of chemical space. Nat Rev Drug Discov 16:131–147

  • Gorska K, Winssinger N (2013) Reactions templated by nucleic acids: more ways to translate oligonucleotide-based instructions into emerging function. Angew Chem Int Ed 52(27):6820–6843

  • Hansen BN, Mihalchuk A (2015) DNA-templated computing. Master’s thesis, University of Southern Denmark, Denmark. http://cheminf.imada.sdu.dk/dna/. Accessed 25 July 2018

  • Hansen BN, Larsen KS, Merkle D, Mihalchuk A (2017) DNA-templated synthesis optimization. In: 23rd international conference on DNA computing and molecular programming (DNA), volume 10467 of Lecture Notes in Computer Science. Springer, pp 17–32

  • He Y, Liu DR (2011) A sequential strand-displacement strategy enables efficient six-step DNA-templated synthesis. J Am Chem Soc 133(26):9972–9975

  • Hendrickson JB (1977) Systematic synthesis design. 6. Yield analysis and convergency. J Am Chem Soc 99:5439–5450

  • Li X, Liu DR (2004) DNA-templated organic synthesis: nature’s strategy for controlling chemical reactivity applied to synthetic molecules. Angew Chem Int Ed 43:4848–4870

  • Meng W, Muscat RA, McKee ML, Milnes PJ, El-Sagheer AH, Bath J, Davis BG, Brown T, O’Reilly RK, Turberfield AJ (2016) An autonomous molecular assembler for programmable chemical synthesis. Nat Chem 8:542–548

  • Nakata I (1967) On compiling algorithms for arithmetic expressions. Commun ACM 10(8):492–494

  • Phillips A, Cardelli L (2009) A programming language for composable DNA circuits. J R Soc Interface 6(Suppl 4):S419–S436

  • Sethi R, Ullman JD (1970) The generation of optimal code for arithmetic expressions. J ACM 17(4):715–728

  • Strahler AN (1952) Hypsometric (area-altitude) analysis of erosional topography. Bull Geol Soc Am 63:1117–1142

  • Wickham SFJ, Bath J, Katsuda Y, Endo M, Hidaka K, Sugiyama H, Turberfield AJ (2012) A DNA-based molecular motor that can navigate a network of tracks. Nat Nanotechnol 7:169–173

Author information

Corresponding author

Correspondence to Daniel Merkle.

Additional information

A conference version of this paper was presented at DNA23 (Hansen et al. 2017). This paper differs substantially from the conference version: we provide proofs of all theorems and have added a section on brute-force computation using integer linear programming. Furthermore, we present an empirical evaluation of the inference of DNA-templated programs with respect to different optimization criteria, and we employ a large set of synthesis plans to analyze the solution space of the underlying optimization question. The second and third authors were supported in part by the Danish Council for Independent Research, Natural Sciences, Grants DFF-1323-00247 and DFF-7014-00041.

Appendices

Appendix A: DNA program example

We consider an example synthesis tree with four base compounds. The actual names of the compounds are not used in any of our algorithms, but for illustration, assume the base compounds are A, B, C, and D. Furthermore, we assume that the tagged compound A reacts with the tagged compound B (\(A+B \rightarrow E\)), and that E will have the tag of B. The complete assumptions are

$$\begin{aligned} A+B\rightarrow & {} E, \quad E \hbox { will inherit the tag of } B \\ C+D\rightarrow & {} F, \quad F \hbox { will inherit the tag of } C \\ E+F\rightarrow & {} X, \quad X \hbox { will inherit the tag of } E \end{aligned}$$

and we demonstrate one possible program computing the target compound X as a sequential one-pot synthesis.

We first tag the base compounds A at the left end of the tag a and B at the right end of the tag b. The tag a (respectively b) is depicted as a red (respectively blue) line in the following (Color figure online).

figure b

The state is as follows:

figure c

We add the complementary strand \(\overline{ba}\) in order to bring A and B in close vicinity and they react to produce E. In this process, A loses its tag.

figure d
figure e

\(\curvearrowright\)

figure f

\(\curvearrowright\)

figure g

We release the produced tagged compound E with the strand ba and E is now tagged with b. The tag a is now unattached and we add the complementary tag \(\overline{a}\) such that in the subsequent operations, it can be ignored.

figure h
figure i

\(\curvearrowright\)

figure j

Since they are no longer relevant, we will not depict the inert strands in the following.

In order to avoid unintended interference, we block the tagged compound E with a strand \(\overline{bc}\) (\(\overline{c}\) shown in orange) (Color figure online).

figure k
figure l

We proceed with the base compounds C and D in a similar manner. Note that C is tagged with a and D is tagged with b, i.e., adding them to the pot in the beginning would have led to unintended interference. By adding \(\overline{ba}\), the tagged compounds C and D react to produce F, and D loses its tag.

figure m
figure n

\(\curvearrowright\)

figure o

We then release the tagged compound F using the strand ba and pacify the tag b.

figure p
figure q

The blocked tagged compound E is released with the strand bc.

figure r
figure s

Finally, the tagged compounds E and F are brought in close vicinity using the strand \(\overline{ba}\), producing X, and F loses its tag.

figure t
figure u

\(\curvearrowright\)

figure v

In the very last step, the target compound is released using strand ba, which finalizes the synthesis.

figure w
figure x

\(\curvearrowright\)

figure y

The only non-inert tag is the tag attached to compound X, which makes it chemically easy to extract the compound from the pot. The synthesis required three different tags and two different strands (and their corresponding complementary tags and strands).
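The tag and strand counts stated above can be checked with a small sketch. The tuple encoding below is an assumption made for illustration, not the paper's notation: each entry records a strand-adding step of the walkthrough, the sequence involved (tags written as single letters), and whether the complementary form is added. Steps that merely add tagged compounds to the pot are omitted.

```python
# Hypothetical encoding of the strand-adding steps described in this appendix:
# (action, sequence, is_complement). Tags are assumed to be single letters.
PROGRAM = [
    ("join A and B", "ba", True),
    ("release E", "ba", False),
    ("pacify tag a", "a", True),
    ("block E", "bc", True),
    ("join C and D", "ba", True),
    ("release F", "ba", False),
    ("pacify tag b", "b", True),
    ("unblock E", "bc", False),
    ("join E and F", "ba", True),
    ("release X", "ba", False),
]


def count_resources(program):
    """Count distinct tags and distinct two-tag strands, ignoring complementation."""
    tags = set()
    strands = set()
    for _action, sequence, _is_complement in program:
        tags.update(sequence)        # every letter in a sequence is a tag
        if len(sequence) > 1:        # a strand spans more than one tag
            strands.add(sequence)
    return len(tags), len(strands)


print(count_resources(PROGRAM))  # (3, 2): tags a, b, c and strands ba, bc
```

Counting a sequence and its complement as one resource mirrors the phrase "and their corresponding complementary tags and strands" in the text.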

The example also illustrates how the number of tags used for blocking is minimized when only two tags are used on the compounds (see the definition of Mnt). Without loss of generality, we choose the goal compound X to be tagged with b. Given that decision, and given that we have restricted ourselves to two different tags on the compounds, there are no further choices for tagging: the tagging of all nodes in the tree is inferred directly. The nodes A, C, and F must be tagged with a, and B, D, and E with b.

In this example, the subtree of the root X corresponding to \(A+B \rightarrow E\) is synthesized before the subtree corresponding to \(C+D\rightarrow F\). Since the result of the former synthesis must be blocked, one additional tag is needed for blocking the subtree E. With respect to the definition of Mnt, this corresponds to the recursive calculation \(\max ({\textsc {Mnt}}(E, 0, 0),{\textsc {Mnt}}(F,1,0))\) (choosing to synthesize the subtree \(C+D\rightarrow F\) first would, in this specific example, lead to the same overall result). The base cases for the leaves are \({\textsc {Mnt}}(A, 0, 0) = 0\) and \({\textsc {Mnt}}(B, 0, 0) = 0\), and for the other subtree \({\textsc {Mnt}}(C, 1, 0) = 1\) and \({\textsc {Mnt}}(D, 1 ,0) = 1\). Consequently, \({\textsc {Mnt}}(E, 0, 0) = 0\) and \({\textsc {Mnt}}(F, 1, 0) = 1\), which gives \({\textsc {Mnt}}(X, 0, 0) = \min ( \max ({\textsc {Mnt}}(E, 0, 0), {\textsc {Mnt}}(F, 1, 0) ), \ldots ) = 1\). Thus, only one additional tag is needed for blocking.
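The forced two-tag assignment for this example can be reproduced with a short sketch. It assumes the rule suggested by the example: the reactant whose tag the product inherits carries the same tag as the product, and the other reactant carries the opposite tag. The dictionary layout and the function name `infer_tags` are hypothetical, and the sketch does not implement Mnt itself.

```python
# product -> (left reactant, right reactant, reactant whose tag is inherited),
# transcribed from the assumptions of Appendix A. The layout is illustrative.
REACTIONS = {
    "E": ("A", "B", "B"),
    "F": ("C", "D", "C"),
    "X": ("E", "F", "E"),
}


def infer_tags(reactions, root, root_tag, tags=("a", "b")):
    """Propagate a two-tag assignment from the root down to the leaves."""
    other = {tags[0]: tags[1], tags[1]: tags[0]}
    assignment = {}
    stack = [(root, root_tag)]
    while stack:
        node, tag = stack.pop()
        assignment[node] = tag
        if node in reactions:  # internal node: push both reactants
            left, right, donor = reactions[node]
            for child in (left, right):
                # The donor keeps the product's tag; the partner gets the other one.
                stack.append((child, tag if child == donor else other[tag]))
    return assignment


tagging = infer_tags(REACTIONS, "X", "b")
print(sorted(node for node, tag in tagging.items() if tag == "a"))  # ['A', 'C', 'F']
print(sorted(node for node, tag in tagging.items() if tag == "b"))  # ['B', 'D', 'E', 'X']
```

Once the root tag is fixed, every other tag follows from the inheritance choices, which is why the text notes that no further choices remain.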

Appendix B: Example tree used for empirical evaluation

Figure 5 shows the example tree used for the empirical evaluation.

Fig. 5 Example synthesis tree used for the empirical evaluation with different optimization criteria, as also visualized at http://cheminf.imada.sdu.dk/dna/. The coloring of the nodes is chosen according to the optimization of the number of tags on compounds; bold edges indicate tag inheritance; and left- and right-end tagging is indicated by the small circles to the left and right, respectively, of the large circles, which represent the compounds. The DNA-templated program, which uses 2 tags on the compounds (here red and blue) and 3 tags overall, has a length of 168 instructions. The DNA-templated program itself, a visualization of the sequence of state changes for the sequential one-pot synthesis, and statistical information can also be found via the aforementioned URL. The shortest possible DNA-templated program, which can easily be found by tagging all 37 input compounds with a different tag, has a length of 109 instructions. By exhaustively enumerating all tree traversals and employing the ILP-based brute-force approach for each traversal, a program of length 109 was found that uses 25 tags, the minimum for that program length. (Color figure online)

Appendix C: Details for empirical evaluation

To evaluate empirically which sets of strands can be used to perform a specific synthesis successfully, we randomly created 2,999,928 synthesis trees with Strahler number \(\mathscr {S}(t)=6\), using the following recursive process for a given Strahler number s: If s does not correspond to just one node (a leaf), then we create a node and generate its subtrees as follows. With probability 2/3, we recursively generate two subtrees, both of which have Strahler number \(s-1\). With probability 1/3, we let one subtree have Strahler number s and choose uniformly at random among the Strahler numbers one through \(s-1\) for the other subtree. In all cases, the ordering of the subtrees (left or right) is decided uniformly at random.
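The recursive generation process can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: a leaf is represented by the empty tuple, an internal node by a pair of subtrees, and the function names `random_tree` and `strahler` are hypothetical. The `strahler` function uses the standard definition (a leaf has number 1; a node whose children have equal numbers gets that number plus one, otherwise the larger of the two) to check the result.

```python
import random


def random_tree(s, rng):
    """Generate a random binary tree with Strahler number s (leaf = empty tuple)."""
    if s == 1:
        return ()
    if rng.random() < 2 / 3:
        # Two subtrees, both with Strahler number s - 1.
        left = random_tree(s - 1, rng)
        right = random_tree(s - 1, rng)
    else:
        # One subtree keeps s; the other gets a number drawn uniformly from 1..s-1.
        left = random_tree(s, rng)
        right = random_tree(rng.randint(1, s - 1), rng)
    if rng.random() < 0.5:
        # Decide the left/right ordering uniformly at random.
        left, right = right, left
    return (left, right)


def strahler(tree):
    """Compute the Strahler number of a tree built by random_tree."""
    if tree == ():
        return 1
    a, b = strahler(tree[0]), strahler(tree[1])
    return a + 1 if a == b else max(a, b)


rng = random.Random(2018)  # fixed seed so the sketch is reproducible
tree = random_tree(6, rng)
print(strahler(tree))  # 6
```

Both branches preserve the target number by construction: two children with number s-1 yield s, and a child with number s paired with a strictly smaller one also yields s.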

Of the 37 possible strand sets which use 4 tags, only 13 were able to solve at least one synthesis plan. Of all the synthesis plans, 23,780 could not be solved with a strand set based on 3 tags (and 5 strands), but required 4 tags (and 5 strands). Only 9 of the 13 strand sets could be used to solve at least one of the 23,780 synthesis plans, and 2 of the 9 strand sets could be used for all 23,780 synthesis plans; see Fig. 4 and the details given in Table 3.

Table 3 Shown are the 13 out of the 37 possible strand sets with 4 tags, that could be used for at least one of the 2,999,928 synthesis plans

About this article

Cite this article

Hansen, B.N., Larsen, K.S., Merkle, D. et al. DNA-templated synthesis optimization. Nat Comput 17, 693–707 (2018). https://doi.org/10.1007/s11047-018-9697-7

  • DOI: https://doi.org/10.1007/s11047-018-9697-7
