DNA-templated synthesis optimization

Hansen, Bjarke N.; Larsen, Kim S.; Merkle, Daniel; Mihalchuk, Alexei

doi:10.1007/s11047-018-9697-7

Published: 31 July 2018

Volume 17, pages 693–707, (2018)
Natural Computing

Abstract

In chemistry, synthesis is the process in which a target compound is produced in a step-wise manner from given base compounds. A recent, promising approach for carrying out these reactions is DNA-templated synthesis: as opposed to more traditional methods, it leads to a much higher effective molarity and makes the much-desired (sequential) one-pot synthesis possible. With this method, compounds are tagged with DNA sequences, and reactions are controlled by bringing two compounds together via their tags. This leads to new cost optimization problems of minimizing the number of different tags or strands to be used under various conditions. We identify relevant optimization criteria, provide the first computational approach to automatically inferring DNA-templated programs, obtain efficient optimal and near-optimal results, and provide a brute-force integer linear programming approach for complete solutions to smaller instances.


References

Adleman LM (1994) Molecular computation of solutions to combinatorial problems. Science 5187:1021–1024
Andersen JL, Flamm C, Hanczyc MM, Merkle D (2015) Towards optimal DNA-templated computing. Int J Unconv Comput 11(3–4):185–203
Benson E, Mohammed A, Gardell J, Masich S, Czeizler E, Orponen P, Högberg B (2015) DNA rendering of polyhedral meshes at the nanoscale. Nature 523:441–444
Cardelli L (2010) Two-domain DNA strand displacement. In: 6th workshop on developments in computational models, volume 26 of electronic proceedings in theoretical computer science, pp 47–61
de Werra D, Pasche C (1989) Paths, chains, and antipaths. Networks 19(1):107–115
Ershov AP (1958) On programming of arithmetic operations. Dokl Akad Nauk SSSR 118(3):427–430
Flajolet P, Raoult JC, Vuillemin J (1979) The number of registers required for evaluating arithmetic expressions. Theor Comput Sci 9(1):99–125
Goodnow RA Jr, Dumelin CE, Keefe AD (2017) DNA-encoded chemistry: enabling the deeper sampling of chemical space. Nat Rev Drug Discov 16:131–147
Gorska K, Winssinger N (2013) Reactions templated by nucleic acids: more ways to translate oligonucleotide-based instructions into emerging function. Angew Chem Int Ed 52(27):6820–6843
Hansen BN, Mihalchuk A (2015) DNA-templated computing. Master’s thesis, University of Southern Denmark, Denmark. http://cheminf.imada.sdu.dk/dna/. Accessed 25 July 2018
Hansen BN, Larsen KS, Merkle D, Mihalchuk A (2017) DNA-templated synthesis optimization. In: 23rd international conference on DNA computing and molecular programming (DNA), volume 10467 of Lecture Notes in Computer Science. Springer, pp 17–32
He Y, Liu DR (2011) A sequential strand-displacement strategy enables efficient six-step DNA-templated synthesis. J Am Chem Soc 133(26):9972–9975
Hendrickson JB (1977) Systematic synthesis design. 6. Yield analysis and convergency. J Am Chem Soc 99:5439–5450
Li X, Liu DR (2004) DNA-templated organic synthesis: nature’s strategy for controlling chemical reactivity applied to synthetic molecules. Angew Chem Int Ed 43:4848–4870
Meng W, Muscat RA, McKee ML, Milnes PJ, El-Sagheer AH, Bath J, Davis BG, Brown T, O’Reilly RK, Turberfield AJ (2016) An autonomous molecular assembler for programmable chemical synthesis. Nat Chem 8:542–548
Nakata I (1967) On compiling algorithms for arithmetic expressions. Commun ACM 10(8):492–494
Phillips A, Cardelli L (2009) A programming language for composable DNA circuits. J R Soc Interface 6(Suppl 4):S419–S436
Sethi R, Ullman JD (1970) The generation of optimal code for arithmetic expressions. J ACM 17(4):715–728
Strahler AN (1952) Hypsometric (area-altitude) analysis of erosional topography. Bull Geol Soci Am 63:1117–1142
Wickham SFJ, Bath J, Katsuda Y, Endo M, Hidaka K, Sugiyama H, Turberfield AJ (2012) A DNA-based molecular motor that can navigate a network of tracks. Nat Nanotechnol 7:169–173

Author information

Authors and Affiliations

Department of Mathematics and Computer Science, University of Southern Denmark, Campusvej 55, 5230, Odense M, Denmark
Bjarke N. Hansen, Kim S. Larsen, Daniel Merkle & Alexei Mihalchuk


Corresponding author

Correspondence to Daniel Merkle.

Additional information

A conference version of this paper was presented at DNA23 (Hansen et al. 2017). This paper differs substantially from the conference version: we provide proofs of all theorems and have added a section on brute-force computation using integer linear programming. Furthermore, we present an empirical evaluation of the inference of DNA-templated programs with respect to different optimization criteria, and we employ a large set of synthesis plans in order to analyze the solution space of the underlying optimization question. The second and third authors were supported in part by the Danish Council for Independent Research, Natural Sciences, Grants DFF-1323-00247 and DFF-7014-00041.

Appendices

Appendix A: DNA program example

We consider an example synthesis tree with four base compounds. The actual names of the compounds are not used in any of our algorithms, but for illustration, assume the base compounds are A, B, C, and D. Furthermore, we assume that the tagged compound A reacts with the tagged compound B ($A+B \rightarrow E$), and that E will have the tag of B. The complete assumptions are

$$\begin{aligned} A+B &\rightarrow E, \quad E \text{ will inherit the tag of } B \\ C+D &\rightarrow F, \quad F \text{ will inherit the tag of } C \\ E+F &\rightarrow X, \quad X \text{ will inherit the tag of } E \end{aligned}$$

and we demonstrate one possible program computing the target compound X as a sequential one-pot synthesis.

We first tag the base compounds A at the left end of the tag a and B at the right end of the tag b. The tag a (respectively b) is depicted as a red (respectively blue) line in the following (Color figure online).

The state is as follows:

We add the complementary strand $\overline{ba}$ in order to bring A and B in close vicinity and they react to produce E. In this process, A loses its tag.


We release the produced tagged compound E with the strand ba and E is now tagged with b. The tag a is now unattached and we add the complementary tag $\overline{a}$ such that in the subsequent operations, it can be ignored.


Since they are no longer relevant, we will not depict the inert strands in the following.

In order to avoid unintended interference, we block the tagged compound E with a strand $\overline{bc}$ ($\overline{c}$ shown in orange) (Color figure online).

We proceed with the base compounds C and D in a similar manner. Note that C is tagged with a and D is tagged with b, i.e., adding them to the pot in the beginning would have led to unintended interference. By adding $\overline{ba}$, the tagged compounds C and D react to produce F, and D loses its tag.


We then release the tagged compound F using the strand ba and pacify the tag b.

The blocked tagged compound E is released with the strand bc.

Finally, the tagged compounds E and F are brought in close vicinity using the strand $\overline{ba}$, producing X, and F loses its tag.


In the very last step, the target compound is released using strand ba, which finalizes the synthesis.


The only non-inert tag is the tag attached to compound X, which makes it chemically easy to extract the compound from the pot. The synthesis required three different tags and two different strands (and their corresponding complementary tags and strands).
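The tag and strand counts above can be checked mechanically. The following is a minimal sketch, assuming a hypothetical encoding of the program as a list of (operation, strand-or-tag) pairs; the operation names `join`, `release`, `pacify`, `block`, and `unblock` are illustrative and not taken from the paper. A strand or tag and its complement are counted once, as in the text.

```python
# Hypothetical encoding of the example program. Each entry names an operation
# and the strand (two tag letters) or single tag it involves. Complements are
# counted together with the strand or tag they complement.
PROGRAM = [
    ("join", "ba"),     # add the complement of ba: A and B react to E
    ("release", "ba"),  # release E, now tagged with b
    ("pacify", "a"),    # make the free tag a inert
    ("block", "bc"),    # block E with the complement of bc
    ("join", "ba"),     # C and D react to F
    ("release", "ba"),  # release F, tagged with a
    ("pacify", "b"),    # make the free tag b inert
    ("unblock", "bc"),  # release the blocked E with the strand bc
    ("join", "ba"),     # E and F react to X
    ("release", "ba"),  # release the target compound X
]


def count_resources(program):
    """Return (number of distinct tags, number of distinct strands)."""
    strands = {item for _, item in program if len(item) == 2}
    tags = {letter for _, item in program for letter in item}
    return len(tags), len(strands)


print(count_resources(PROGRAM))  # (3, 2)
```

The result agrees with the three tags (a, b, c) and two strands (ba, bc) stated above.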

The given example also illustrates how the number of tags used for blocking is minimized under the assumption that only two tags are used on the compounds (see the definition of Mnt). Without loss of generality, we choose the goal compound X to be tagged with b. Given that decision, and given that we have restricted ourselves to using only two different tags on the compounds, there are no further choices for tagging; the tagging of all nodes in the tree is inferred as follows: the nodes A, C, and F need to be tagged with a, and B, D, and E with b.

In this example, the subtree of the root X corresponding to $A+B \rightarrow E$ is synthesized before the subtree corresponding to $C+D\rightarrow F$. As we need to block the result of the former synthesis, we need an additional tag for blocking for the subtree E. With respect to the definition of Mnt, this corresponds to the recursive calculation $\max ({\textsc {Mnt}}(E, 0, 0),{\textsc {Mnt}}(F,1,0))$ (the choice to synthesize the subtree $C+D\rightarrow F$ first would, in this specific example, lead to the same overall result). This leads to the following base cases for the leaves: ${\textsc {Mnt}}(A, 0, 0) = 0$ and ${\textsc {Mnt}}(B, 0, 0) = 0$, and for the other subtree ${\textsc {Mnt}}(C, 1, 0) = 1$ and ${\textsc {Mnt}}(D, 1 ,0) = 1$. Consequently, ${\textsc {Mnt}}(E, 0, 0) = 0$ and ${\textsc {Mnt}}(F, 1, 0) = 1$, leading to ${\textsc {Mnt}}(X, 0, 0) = \min ( \max ({\textsc {Mnt}}(E, 0, 0), {\textsc {Mnt}}(F, 1, 0) ), \ldots ) = 1$. Thus, only one additional tag is needed for blocking.
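The forced two-tag labeling of the tree can be reproduced with a short top-down propagation. This is only a sketch under stated assumptions: the dictionary `REACTIONS` and the function `infer_tags` are hypothetical names, and the rule that the two reactants of a reaction carry different tags is an assumption read off the example (a strand $\overline{ba}$ joins one compound tagged a with one tagged b). It does not implement Mnt itself.

```python
# Example synthesis tree: product -> (reactant, reactant, heir), where heir is
# the reactant whose tag the product inherits (as in the assumptions above).
REACTIONS = {
    "E": ("A", "B", "B"),
    "F": ("C", "D", "C"),
    "X": ("E", "F", "E"),
}


def infer_tags(reactions, root, root_tag, tags=("a", "b")):
    """Propagate a two-tag labeling from the root down to the leaves.

    Assumption: the heir carries the same tag as the product, and the other
    reactant carries the other of the two tags.
    """
    other = {tags[0]: tags[1], tags[1]: tags[0]}
    labeling = {root: root_tag}
    stack = [root]
    while stack:
        node = stack.pop()
        if node not in reactions:
            continue  # base compound, nothing to propagate
        first, second, heir = reactions[node]
        for child in (first, second):
            same = child == heir
            labeling[child] = labeling[node] if same else other[labeling[node]]
            stack.append(child)
    return labeling


labeling = infer_tags(REACTIONS, "X", "b")
print(sorted(labeling.items()))
```

With X tagged b, the propagation assigns a to A, C, and F and b to B, D, and E, matching the labeling described in the text.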

Appendix B: Example tree used for empirical evaluation

Figure 5 shows the example tree used for the empirical evaluation.

Appendix C: Details for empirical evaluation

In order to perform an empirical evaluation of the possible sets of strands that can be used to perform a specific synthesis successfully, we randomly created 2,999,928 synthesis trees with Strahler number $\mathscr {S}(t)=6$, using the following recursive process: If the Strahler number s does not correspond to just one node (a leaf), then we create a node and generate its subtrees as follows. With probability 2/3, we recursively generate two subtrees, both of which have Strahler number $s-1$. With probability 1/3, we let one subtree have Strahler number s and choose uniformly at random among the Strahler numbers 1 through $s-1$ for the other subtree. In all cases, the ordering of the subtrees (left or right) is decided uniformly at random.
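The recursive process can be sketched as follows. This is an illustrative reading of the description, not the authors' code: the names `random_tree` and `strahler` are hypothetical, a leaf is represented as `None`, an inner node as a pair of subtrees, and a leaf is assumed to have Strahler number 1 (the usual convention, consistent with "one through $s-1$" above).

```python
import random


def random_tree(s, rng):
    """Return a random binary tree with Strahler number s (None is a leaf)."""
    if s == 1:
        return None
    if rng.random() < 2 / 3:
        # Two subtrees with Strahler number s - 1 give a node with number s.
        left, right = random_tree(s - 1, rng), random_tree(s - 1, rng)
    else:
        # One subtree keeps number s; the other gets a smaller number,
        # chosen uniformly at random from 1, ..., s - 1.
        left = random_tree(s, rng)
        right = random_tree(rng.randint(1, s - 1), rng)
    if rng.random() < 0.5:
        left, right = right, left  # random left/right ordering
    return (left, right)


def strahler(tree):
    """Compute the Strahler number of a tree built by random_tree."""
    if tree is None:
        return 1
    p, q = strahler(tree[0]), strahler(tree[1])
    return p + 1 if p == q else max(p, q)


rng = random.Random(0)
trees = [random_tree(6, rng) for _ in range(100)]
print(all(strahler(t) == 6 for t in trees))
```

Both branches of the generator preserve the target Strahler number by construction, so the final check holds for every seed.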

Of the 37 possible strand sets which use 4 tags, only 13 were able to solve at least one synthesis plan. 23,780 of all the synthesis plans could not be solved with a strand set based on 3 tags (and 5 strands), but required 4 tags (and 5 strands). Only 9 of the 13 strand sets could be used to solve at least one of the 23,780 synthesis plans and 2 of the 9 strand sets could be used for all 23,780 synthesis plans; see Fig. 4 and the details given in Table 3.

Table 3 Shown are the 13 out of the 37 possible strand sets with 4 tags, that could be used for at least one of the 2,999,928 synthesis plans



Cite this article

Hansen, B.N., Larsen, K.S., Merkle, D. et al. DNA-templated synthesis optimization. Nat Comput 17, 693–707 (2018). https://doi.org/10.1007/s11047-018-9697-7

Issue Date: December 2018