1 Introduction

With recent high-throughput technology, we can synthesize large heterogeneous collections of DNA structures [7, 13] and also read them all out precisely in a single procedure [10]. This contrasts with the older practice of assembling structures one at a time and of reading them out individually (e.g., by fluorescence), or reading them together ambiguously (e.g., by gel electrophoresis). Can we take advantage of these high-throughput and high-precision technologies, not only to do things faster but also to devise new techniques and algorithms? In this paper, we examine some DNA algorithms that assume both high-throughput synthesis and high-throughput sequencing: they would not be very practical otherwise.

A sequence ‘s’ of DNA nucleotides hybridizes (forms a double strand) with its reverse Watson-Crick complement, denoted ‘s*’; we write the resulting double strand as the underlined ‘\(\underline{\text {s}}\)’. Subsequences of ‘s’ are called domains provided they are independent of each other, that is, provided that differently identified domains do not hybridize with each other, or with significantly long parts of each other [16]. Under normal laboratory conditions, a domain ‘a’ is called short if it hybridizes reversibly with ‘a*’, and long if it hybridizes irreversibly with it.

A short single-stranded domain ‘t’, called a toehold, followed in the same sequence by a long single-stranded domain ‘a’ can initiate strand displacement. This is the process (later detailed in Fig. 3) where a single-stranded sequence ‘ta’ hybridizes to a double strand composed of ‘t*’ attached to the bottom strand of a double-stranded ‘a’. The invading ‘ta’ can displace and possibly replace the existing ‘a’ domain of the double strand through a random-walk competition between the two ‘a’ domains hybridizing to the same ‘a*’.

A nick is an interruption in one of the two strands of a double strand, at the boundary between two domains. By cascading short and long domains, occasionally separated by nicks, we can achieve multi-step strand displacements where the whole sequence of displacements can itself be reversible or irreversible. This way we can emulate reversible and irreversible chemical reactions [14] and other computational abstractions [9].

The readout of the outcome of such sequences of displacements is often done by fluorescence. Fluorophore/quencher pairs are attached to some domains that participate in the reactions, those in particular whose displacement indicates that a significant event has occurred. The displacement separates the fluorophore from the quencher and hence induces visible fluorescence. This provides a real-time account of the computation, but the readout capability is restricted: a limited number of separate events can be detected by using different fluorescence colors. This is analogous to debugging a program by inserting a limited number of print statements at a time, each one printing a single letter.

Another way of achieving a readout is via gel electrophoresis, to distinguish the sequences in a solution by their length at the end of the experiment (or at predetermined time points). Many different sequences can be identified provided they have different lengths (masses) and provided that we know their length ahead of time. Unexpected lengths can be hard to identify. This is analogous to debugging a program by using control flow counters to tell us how many times each routine is invoked, or each structure is accessed, without any insight about the order of events.

Finally, and especially with more recent high-throughput technology, we can obtain a readout by ligating the nicks and sequencing all the strands in the solution at the end of the experiment (or at predetermined time points). With high-throughput sequencing, we can inspect potentially the entire composition of the solution. The debugging analogy now is that of taking a core dump: analyzing in complete detail the entire state of a computation, but only infrequently or at the end, and again without any obvious insight on the order of events that occurred.

The order of events is usually of great interest: for example, multiple laborious gene knockout experiments are frequently carried out to determine the order of gene activations. What if we could instead take a single core dump that tells us the order of all the events of interest? To that end, we should record the order of events within the state of the system, so that we can inspect such a recording at the end as part of the core dump. Assuming high-throughput sequencing, we can embed a large amount of information within the solution. We are going to assume that we can embed \(N^2\) pieces of information, where N is the number of events of interest. This seems achievable for reasonably small N while providing a lot of information, encoding for each event whether it happened before, together with, or after any other event. Each one of the \(N^2\) event order detectors is a structure that accepts inputs but does not produce outputs: when it detects certain conditions, it locks down in a stable state and waits to be sequenced later.

Our strategy is therefore to embed a preorder (a reflexive and transitive relation) of events within the solution. This is a preorder because we may not be able to detect the precise order of two events if they happen very close to each other, in which case both directions are recorded. With \(N^2\) detectors, we can determine the order of any pair of events without needing to coordinate the detectors with each other or with a central structure; hence, each detector can be relatively simple. An alternative is to use only N detectors that sequentially add records to a central tape, but this requires a way of guaranteeing atomic access to the tape [9]. Still, event recorders of the tape variety, readable by sequencing, have been nicely demonstrated using natural DNA and protein mechanisms [12, 15].

A preorder is not the entire history of a computation. We are considering the preorder of first occurrences of events: any subsequent occurrences of the same signal are not recorded. This limited information can still provide support for causality: if an event always precedes another event over a number of runs, then this supports the first event causing the second, or the two having a common cause.

In the rest of this paper, we aim to describe the architecture of such a preorder recorder, using DNA strand displacement technology, slowly building up from simpler problems. A property of all the designs in this paper is that (apart from the single-stranded input signals) all DNA structures are nicked double strands with no additional modifications or secondary structure. Therefore, the required and potentially large numbers of components can be fabricated by bacterial cloning as a single or a few long DNA double strands, followed by enzymatic cutting and nicking [4] (see Appendix). Other technologies for high-throughput synthesis of large heterogeneous libraries exist [7, 13]. Thus, we rely on both high-throughput synthesis for producing the \(N^2\) detectors and on high-throughput sequencing to read them out.

2 Occurrence Recorder

We begin by investigating the simplest event recorder: recording the occurrence of events at any time during an experiment. By an ‘event’ here, we mean the appearance of a whole population of identical molecules, and in fact of a specific molecular structure that can be uniformly identified. Any event that does not fit that description must first be transduced into one of these uniform molecular structures. More precisely, by a signal we mean a population of one such molecular species over time, and by an event we mean the appearance of a signal population (we do not detect the disappearance of a population).

In discussions, we summarize DNA structures by a textual notation. In addition to lowercase letters like ‘a’ for single-stranded long domains, and underlined letters like ‘\(\underline{\text {a}}\)’ for the corresponding double-stranded long domains, a single short domain is used for all toeholds: ‘\({\textbf {-}}\)’ is an open (i.e., un-hybridized) toehold on a single strand or on the upper strand of a double strand, ‘\(\text {\_}\)’ is an open toehold on the lower strand of a double strand, and ‘\(\underline{{\textbf {-}}}\)’ is a covered (double-stranded) toehold. A sequence of domains on a double strand with an initial open toehold and an intermediate covered toehold looks like ‘\(\underline{\,\,\,\text {a}{\textbf {-}}\text {b}}\)’. This summary notation omits information about nicks, which are instead detailed in corresponding figures. Note that, before sequencing, all the open domains should be complemented, and all the nicks should be ligated.

Figures instead depict the corresponding single and double strands graphically (e.g., Fig. 1). A domain is a short or long sequence of dashes ‘-’ with domain delimiters ‘\(\mathtt{>}\)’ and ‘\(\mathtt{<}\)’ pointing in the 5′-to-3′ direction to indicate either a nick (an interruption in the strand) or the 3′ end, and ‘+’ to indicate the 5′ end or the logical boundary of a domain (not a nick). The name of a domain is a lowercase letter placed on top of the upper strand, with implicitly the reverse complement domain on the lower strand. All toeholds are the same sequence: they have a blank name. Reversible reactions are ‘\(\mathtt{<=>}\)’ and irreversible reactions are ‘\(\mathtt{=>}\)’.

2.1 Yes Gate

The events that we want to detect are represented by single strands ‘\({\textbf {-}}\text {a}\)’, each consisting of a (short) toehold ‘\({\textbf {-}}\)’ attached to a (long) domain ‘a’. If ‘\({\textbf {-}}\text {a}\)’ is ever present, we want to know about it: this is the purpose of the Yes gate for ‘a’.

First (Fig. 1) let us consider the traditional way of detecting ‘\({\textbf {-}}\text {a}\)’. A double-stranded structure ‘\(\underline{\,\,\,\text {a}{\textbf {-}}\text {q}}\)’ with an open toehold ‘_’ accepts the single strand ‘\({\textbf {-}}\text {a}\)’ (reversibly) and opens up another toehold, yielding ‘\(\underline{{\textbf {-}}\text {a}\,\,\,\text {q}}\)’. That structure then locks down (irreversibly) by combining with an auxiliary single strand ‘\({\textbf {-}}\text {q}\)’ to produce the fully hybridized ‘\(\underline{{\textbf {-}}\text {a}{\textbf {-}}\text {q}}\)’ and the toehold-free ‘q’.

If we attach a fluorophore (F) and quencher (Q) pair at the right end of ‘\(\underline{\,\,\,\text {a}{\textbf {-}}\text {q}}\)’ (and not at the end of ‘q’), we can detect the occurrence of ‘a’ because it separates F from Q and causes visible fluorescence. However, if we were to (ligate and) sequence the solution, it would be difficult or impossible to tell the difference between the initial and final state, because they differ only by open toeholds and by the positions of nicks that are erased by ligation.

Fig. 1. Yes gate for detection by fluorescence

Let us now consider (Fig. 2) an additional domain ‘r’ that will help us tag the desired outcome. The ‘q’ single strand is replaced by a ‘\(\underline{{\textbf {-}}\text {qr}}\)’ double strand, but with a nick on the bottom between ‘q’ and ‘r’ [Footnote 1]. The first reaction is the same as before, but the second reaction is now a 4-way strand displacement [Footnote 2] (Fig. 3 right). This detector is non-catalytic: it captures some of the input ‘a’ strands (toehold on the left) and releases ‘a’ strands with the toehold on the right (which are usually harmless).

Fig. 2. Yes gate for detection by sequencing

If this gate is triggered, then the main outcome is ‘\(\underline{{\textbf {-}}\text {a}{\textbf {-}}\text {qr}}\)’, which is a nicked but fully complemented double strand: it is ready for ligation and sequencing. If the gate is not triggered, then the outcome is the initial ‘\(\underline{\,\,\,\text {a}{\textbf {-}}\text {q}}\)’, which is distinguishable after sequencing [Footnote 3].

Fig. 3. 3-way and 4-way displacements in the first and last steps of Fig. 2

For a catalytic version (one that does not sequester the input), consider the design in Fig. 4: we add two more structures to Fig. 2 that absorb the leftover ‘a’ strand (toehold on the right) and convert it back to a free input ‘a’ (toehold on the left). Such catalytic irreversible gates avoid sequestering weak signals, while being fully activated by weak signals, leading to robust detection (if the signals are not drained too quickly by downstream processing).

Fig. 4. Catalytic Yes gate, additional reactions

2.2 Occurrence Recorder Algorithm

We can use Yes gates to detect a collection of signals in an experiment via a single high-throughput readout: we prepare a Yes gate detector for each signal, we mix them in at the beginning, and we sequence the entire solution at the end, revealing any detectors that have fired.
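The readout logic of the occurrence recorder can be sketched abstractly as follows. This is only an illustrative model, not a description of the chemistry: the function name `occurrence_readout` and the representation of signals as strings are assumptions made for this sketch.

```python
def occurrence_readout(signals, observed_events):
    """Return, for each signal with a Yes gate, whether its detector fired.

    signals: names of all signals for which a Yes gate was mixed in.
    observed_events: names of signals that appeared at any time, in any order.
    """
    fired = set()
    for name in observed_events:
        if name in signals:   # only signals with a detector can be recorded
            fired.add(name)   # firing is irreversible, so repeats change nothing
    return {name: (name in fired) for name in signals}


readout = occurrence_readout(["a", "b", "c"], ["b", "a", "b"])
print(readout)
```

The result carries no ordering information: it only says which detectors would be found in their fired state at sequencing time.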

3 Coincidence Recorder

We now move to a more interesting task: detecting the simultaneous presence of signals. The idea enabling the sequencing-based readout of gates, and in particular the novel use of 4-way displacement, is due to Chen and Seelig [5] (the Yes gate of Fig. 2 is also a special case of this). Their design was originally meant as an AND gate made of a sequenceable Join part accepting inputs, and a sequenceable Fork part producing outputs. We are going to use just a sequenceable Join half to detect the simultaneous occurrence of any pair of signals in a given set of signals, relying on high-throughput sequencing to inspect all possible combinations.

3.1 Join Gate

The design in Fig. 5 is rooted in a fluorophore-oriented Join gate, along the lines of Fig. 1, which ultimately comes from [11] and [2]. However, again here we want to find a sequencing-friendly version, where the initial structures ‘\(\underline{\,\,\,\text {a}{\textbf {-}}\text {b}{\textbf {-}}\text {q}}\)’ and ‘\(\underline{{\textbf {-}}\text {qr}}\)’ with input signals ‘a’ and ‘b’ are sequencing-distinguishable from the final structure ‘\(\underline{{\textbf {-}}\text {a}{\textbf {-}}\text {b}{\textbf {-}}\text {qr}}\)’, which indicates that both signals were present at the same time. The gate locks down when the two signals are received in turn. If one signal appears first and persists until the second arrives, this gives the same result as both signals appearing together. If one signal is removed before the other one appears, the gate reverts and the result indicates no co-occurrence. The gate can be made more kinetically symmetrical by mixing Join(a,b) with Join(b,a).

Fig. 5. Join gate for detection by sequencing

As in the previous case, we can add structures to this gate that convert it to a catalytic gate. But we need to handle the two signals together in the additional structures, because the displaced ‘a’ strand (toehold on the right) must be able to revert to the input ‘a’ (toehold on the left) when ‘b’ is not present. Hence, we use the binary structure in Fig. 6 for distinct a,b. This structure cannot coexist with a catalytic Yes(a) as it would lock down Join(a,b) on the first input: a later non-coincident ‘b’ would give a false positive. Join(a,a) must not have the additional catalytic structures for the same reason: it is best to replace it with a non-catalytic Yes(a).

Fig. 6. Catalytic Join gate, additional reactions

3.2 Coincidence Recorder Algorithm

We can use Join gates to detect the simultaneous occurrence of any pair of distinct signals in a collection: we prepare a Join gate detector for each such pair, we mix them in at the beginning, and we sequence the entire solution at the end, revealing any detectors that have fired. If we detect Join(a,b) and Join(b,c), we can deduce the coincidence of a and c, and we should also detect Join(a,c): that redundancy serves as a crosscheck. We could use fewer Join gates, but if we did not include the transitive Join(a,c) and b never came, we would not detect the coincidence of a and c.

4 Preorder Recorder

We now aim to build a device to record the order of occurrence of events in an experiment. The question is: given a set of events \(a,b,c,d,\ldots \) that occur in some order, in what order did they first occur? If some events can occur together (up to experimental uncertainty), the relationship is a preorder: a reflexive and transitive relation. We want to reconstruct the temporal preorder of events from a single observation at the end of a run, with a single mass sequencing.

Such a preorder recorder would be useful for monitoring a process over time without sampling the system at multiple time points. Our recorder does not record timing and does not record sequence, but it records the first-occurrence preorder, storing it within the system itself. Recording the order, rather than the full timing of events, means that we need not use energy during periods of inactivity, and we need not worry about how often we should sample the system. The energy expenditure is all preloaded: no additional resources are needed no matter how long or complex the events history becomes, and there can be no ‘memory overflow’ of the recording. Repeated preorder experiments can build up evidence for causality, by observing which events always happen in the same order, independently of timing and other conditions.

The algorithm below uses a number of gates that is quadratic in the number of signals N but is independent of the observation time. After the initial setup, it requires no further energy because it reacts to signals and does not actively inspect the environment for their presence. More subtly, the algorithm uses a number of distinct domains that is just \(N+4\) (+1 for toeholds). This is important to avoid crosstalk among domains, which becomes more difficult to avoid when we have more domains. The situation would be much worse if we needed \(N^2\) distinct domains in addition to \(N^2\) gates.

The coincidence recorder in the previous section was obtained by iterating a Join gate. For the preorder recorder, we iterate a choice gate, which we describe next. Instead of presenting directly the domain structures (for which there are multiple possibilities), we first describe abstractly how the choice gate behaves, and how the algorithm uses it. The DNA implementation is described later.

4.1 Choice Gate Specification

A choice gate is a two-input gate denoted a?b between input events a and b. As an abstract operator it is symmetric: \(a?b = b?a\). Its desired behavior is as follows:

  • If a arrives no later than b, then a?b produces a distinct result that we indicate \(a\le b\) or equivalently \(b\ge a\).

  • If b arrives no later than a, then a?b produces a distinct result that we indicate \(b\le a\) or equivalently \(a\ge b\).

  • If a and b arrive together, then a?b produces a result that we indicate \(a\sim b\) or equivalently \(b\sim a\). (This is in practice an equal mixture of \(a\le b\) and \(b\le a\), or an unequal mixture if they arrive slightly offset.)

  • As a special case, if a ever arrives, then a?a produces a result \(a\sim a\).

The three results between different a and b are assumed to be distinct and distinguishable by sequencing. Our algorithm requires only that there are three detectable final configurations: \(a\le b\) and \(b\le a\) depending on which of two inputs arrives first, and a mixture of the two, \(a\sim b\), if they arrive together. We may further analyze the results quantitatively: a 100%/0% mixture of \(a\le b\) and \(b\le a\) indicates that enough of a arrived to exhaust the gate population before any of b (if any) arrived. Other mixtures may indicate how much events overlapped in time, their relative strength, or some confusion between those. Weak signals may appear to have arrived together.
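The specification above can be restated as a small abstract model. This is a sketch under stated assumptions, not the DNA implementation: the function `choice`, the `arrival` dictionary of first-arrival times, and the `resolution` window within which two arrivals count as simultaneous are all hypothetical names introduced here. A pair `(x, y)` in the result stands for the outcome \(x\le y\); a signal that never arrives is assumed to be preceded by any signal that does.

```python
def choice(x, y, arrival, resolution=0.0):
    """Abstract outcome of the choice gate x?y.

    arrival: dict mapping a signal name to its first-arrival time;
             signals that never arrive are simply absent.
    resolution: assumed time window within which arrivals look simultaneous.
    Returns a set of pairs; (u, v) stands for the recorded result u <= v.
    """
    tx, ty = arrival.get(x), arrival.get(y)
    results = set()
    if tx is not None and (ty is None or tx <= ty + resolution):
        results.add((x, y))   # x arrived no later than y
    if ty is not None and (tx is None or ty <= tx + resolution):
        results.add((y, x))   # y arrived no later than x
    return results


print(choice("a", "b", {"a": 1.0, "b": 5.0}))   # only a <= b
print(choice("a", "b", {"a": 2.0, "b": 2.0}))   # both directions, i.e. a ~ b
print(choice("a", "a", {"a": 3.0}))             # reflexive case a ~ a
```

The model returns only the set of outcomes; the quantitative mixtures discussed above (for example 100%/0%) are outside its scope.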

There are many ways to achieve this specification, and we will discuss at least two. But first we describe the algorithm that uses these gates.

4.2 Preorder Recorder Algorithm

Suppose we have a (moderately large) set of events a, b, c, d, e, f, ..., like the occurrence of some mRNAs in a cell-free extract. They will activate in some order like b.cd.ae.d (b first, then cd together, then ae together, then d). We want to store that order as the events arise, and read it back at the end.

For N signals, we need \(N^2\) distinct DNA structures: all the possible combinations of two signals, including all the x?x cases. We are not going to distinguish event sequences with repetitions and oscillations: we only look at the first occurrence of a signal. For example, the sequence b.b.b is the same as b for us, and a.b.a is the same as a.b (we can still tell that the first a arrived before the first b: the second a does not confound it).

We do not provide any external timing: there are no clocks needed to sample these signals over time, and there is no predetermined sampling frequency. We just need to assume that the sequence of events is slow enough. If it is not slow enough then a.b will look just like ab (in practice, the closeness of two signals will be reflected in the relative proportions of a \(\le \) b and b \(\le \) a, so we can still get some more information). The time resolution is thus determined by the speed of the DNA reactions. If they happen to be fast enough for the intended observed system, then sampling over longer time periods does not require any more gates or any more energy: the gates just naturally sit waiting for the signals to arrive.

The input to our algorithm is a preorder of signals, like a.bc.def.g that is occurring in real time in our experiment. We initially add to the solution all the choice gates x?y such that x and y range over all those signals (including \(x=y\)). At the end, we sequence all the leftover structures (e.g., \(x\le y\)) and we reconstruct the preorder from them. The process of reconstructing the preorder graph from what is essentially its reachability matrix is called transitive reduction and has the same complexity as transitive closure and matrix multiplication [1].
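As an illustration of the record-then-reconstruct cycle, the following sketch idealizes the \(N^2\) gates as a set of ordered pairs and rebuilds the dotted notation from it. It assumes a clean readout that forms a total preorder, and it ranks signals by counting successors instead of performing the general transitive reduction of [1]. The names `record`, `reconstruct`, `arrival`, and `resolution` are hypothetical.

```python
from itertools import product


def record(signals, arrival, resolution=0.0):
    """Idealized readout of all N^2 choice gates as pairs (x, y) meaning x <= y."""
    observed = set()
    for x, y in product(signals, repeat=2):
        tx, ty = arrival.get(x), arrival.get(y)
        if tx is None:
            continue                        # x never arrived: nothing of the form x <= y
        if ty is None or tx <= ty + resolution:
            observed.add((x, y))
    return observed


def reconstruct(observed):
    """Group the arrived signals into classes of simultaneous events, earliest first."""
    arrived = sorted(x for (x, y) in observed if x == y)   # x <= x marks arrival

    def rank(x):
        # In a total preorder, earlier signals precede more of the arrived signals.
        return sum(1 for y in arrived if (x, y) in observed)

    classes = {}
    for x in arrived:
        classes.setdefault(rank(x), []).append(x)
    return [classes[k] for k in sorted(classes, reverse=True)]


signals = list("abcdefgh")
arrival = {"a": 1, "b": 2, "c": 2, "d": 3, "e": 3, "f": 3, "g": 4}  # h never arrives
observed = record(signals, arrival)
order = ".".join("".join(group) for group in reconstruct(observed))
print(order)  # a.bc.def.g
```

A real readout would be noisy and quantitative, so a practical reconstruction would first threshold the relative abundance of each pair of outcomes before applying a step like `reconstruct`.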

4.3 Crosstalking Choice Gate

We now describe a DNA implementation of the choice gate a?b. We discuss below how the gates crosstalk, and what are the consequences of crosstalking. But in summary, for our application, this implementation is sufficient, and it is also considerably more economical than a ‘proper’ non-crosstalking implementation.

The inputs are the usual two-domain signals with toehold on the left. For each abstract choice operator a?b, we use two pairs of double strands abbreviated as group [a?b| and group |b?a], with a?b = [a?b| + |b?a]. They are symmetric but different because [a?b| reacts to a ‘b’ signal strand (toehold on the left), while |b?a] reacts to an ‘a’ signal strand (toehold on the left). Conversely, [a?b| also reacts to an ‘a’ strand with the toehold on the right, and |b?a] also reacts to a ‘b’ strand with the toehold on the right, through the same toehold but in opposite directions.

In Fig. 7, each of the primary structures (top) eventually binds to one and only one of the two end caps (bottom): we arbitrarily associate one end cap with [a?b| and the other with |b?a] (the square bracket indicates the side the end cap is with), so in fact a?b = [a?b| + |b?a] = [b?a| + |a?b] = b?a. The central portions with the ‘a’ and ‘b’ domains are surrounded by four fixed domains ‘s’, ‘p’, ‘q’, ‘r’: these are the same sequences for all the choice gates, regardless of variations in ‘a’ and ‘b’ [Footnote 4]. The nameless toehold is the same sequence everywhere.

Fig. 7. Crosstalking choice gate

If a signal ‘b’ (with toehold on the left) binds to [a?b|, it blocks the toehold and displaces to the right. It also releases ‘b’ (with toehold to the right), which goes to |b?a], again blocks the toehold there, and displaces to the left, catalytically releasing a copy of the original ‘b’. The end caps can bind to the remaining open toeholds and lock down the configuration. If ‘a’ arrives later, it finds all the toeholds blocked and cannot bind to the remaining structures. Thus ‘b’ arriving first prevents ‘a’ from binding later. If ‘a’ arrives first, the situation is symmetric, with the end caps binding to the opposite structures from those in the ‘b’-first case.

In more detail, the initial binding of signals opens up new toeholds for the double-stranded ‘sp’ and ‘qr’ end caps: they cause 4-way strand displacements and stabilize the outcomes in a way that is distinguishable by sequencing. For a ‘b’ input the final structures are ‘\(\underline{\text {p}{\textbf {-}}\text {a}{\textbf {-}}\text {b}{\textbf {-}}\text {qr}}\)’ + ‘q’, which is the result we earlier called \(a\ge b\), and ‘\(\underline{\text {sp}{\textbf {-}}\text {b}{\textbf {-}}\text {a}{\textbf {-}}\text {q}}\)’ + ‘p’, which is the result we earlier called \(b\le a\) (Fig. 8, top). The opposite happens if ‘a’ arrives first (Fig. 8, bottom). If ‘a’ and ‘b’ arrive together, then both results are produced because the released ‘a’ and ‘b’ bind concurrently to as yet untouched copies of the gates.

Fig. 8. Crosstalking choice gate outcomes for a?b. Top: for input ‘b’ (red), which also releases back ‘b’ (teal, not shown). Bottom: for input ‘a’ (red), which also releases back ‘a’ (blue, not shown)

These activations are irreversible and catalytic: ‘a’ and ‘b’ are released back without requiring additional structures. This is going to help kinetically and is also less likely to perturb the system we are observing. Reflexive gates a?a work as expected: we need them to signify that a signal ‘a’ has arrived at some time. We produce the a?a structures by the general recipe, meaning as [a?a| + |a?a], hence with twice the concentration of the main structure. This is in fact what we need to keep the kinetics balanced with respect to non-reflexive gates.

A single choice gate works as described, but we need to consider the situation where there are multiple choice gates together. In a gate with [a?b|, the input ‘b’ releases ‘b’, which goes on to bind to |b?a], but also to any other |b?x]: crosstalk! Normally this would be incorrect, but here we want to activate |b?x] as well, since it tells us that ‘b’ arrived before ‘x’. If there is a |b?x], then there is also an [x?b|, which driven by ‘b’ activates |b?x] anyway. So the crosstalk between gates does not hurt in this particular instance. The most interesting consequence is that, as we noted, although we have \(N^2\) gates, we only have to encode N distinct domains (plus the 4 auxiliary ones). This greatly reduces the potential interference between domains that would be an obstacle to scaling up the number of signals. As an added benefit, these crosstalking gates are automatically catalytic (cf. Fig. 9).

As an example, for 3 signals abc, we use the following 9 choice gates (first column) and corresponding initial structures (second column):

$$\begin{array}{llll}
\text {gates} & \text {structures} & \text {after `}{\textbf {-}}\text {c'} & \text {after `}{\textbf {-}}\text {b'} \\
a?a & [a?a|\quad |a?a] & [a?a|\quad |a?a] & [a?a|\quad |a?a] \\
b?b & [b?b|\quad |b?b] & [b?b|\quad |b?b] & b\ge b\quad b\le b \\
c?c & [c?c|\quad |c?c] & c\ge c\quad c\le c & c\ge c\quad c\le c \\
a?b & [a?b|\quad |b?a] & [a?b|\quad |b?a] & a\ge b\quad b\le a \\
a?c & [a?c|\quad |c?a] & a\ge c\quad c\le a & a\ge c\quad c\le a \\
b?c & [b?c|\quad |c?b] & b\ge c\quad c\le b & b\ge c\quad c\le b
\end{array}$$

Fig. 9. Non-crosstalking choice gate

If a signal ‘c’ arrives, it initially activates 3 structures, the ones of the form [x?c|, producing outcomes \(x\ge c\). Soon after, the signal ‘c’ that is released by those activations crosstalks with the structures of the form |c?y], producing outcomes \(c\le y\) (third column). If a signal ‘b’ arrives next, it further activates some gates, but not the ones that have been used up by ‘c’ (fourth column). If we sequence the structures at this point, we can conclude (with multiple redundancies) that:

$$\begin{aligned} c\le b \le a \end{aligned}$$

That’s a definite \(c<b\), because we observe \(c\le b\) but not \(b\le c\). Moreover, we do not observe \(a\le a\) which means that a never arrived. If we were to observe \(c\le b\) and \(b\le c\), then we would deduce that cb arrived together, up to our time resolution.
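The three-signal example can be replayed with a small state model. This sketch is an abstraction made for illustration: each gate x?y is represented by its two halves as strings, and the first of its two signals to arrive locks both halves (directly for one half, via the released crosstalking strand for the other). The function name `simulate` and the string encoding are hypothetical, and kinetics, concentrations, and simultaneous arrivals are ignored.

```python
def simulate(signals, arrivals):
    """Replay signal arrivals against the crosstalking choice gates.

    signals: list of signal names; one gate per unordered pair, including x?x.
    arrivals: signal names in order of first arrival.
    Returns a dict mapping (x, y) to the pair of structures found at readout.
    """
    state = {}
    for i, x in enumerate(signals):
        for y in signals[i:]:
            state[(x, y)] = (f"[{x}?{y}|", f"|{y}?{x}]")   # initial structures
    locked = set()
    for s in arrivals:
        for pair in state:
            if s in pair and pair not in locked:
                x, y = pair
                other = y if s == x else x
                # s arrived first for this gate: both halves record s before other
                state[pair] = (f"{other}>={s}", f"{s}<={other}")
                locked.add(pair)
    return state


state = simulate(["a", "b", "c"], ["c", "b"])
for gate, structures in state.items():
    print(gate, structures)
```

Under these assumptions the final state matches the last column of the table: the a?a gate is untouched, and every gate involving ‘c’ was locked by ‘c’ before ‘b’ arrived.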

Detection of the preorder should be robust because of the redundancies. Background noise and bad gates can be tolerated, because we just need to detect which of \(a \le b\) vs. \(b\le a\) is stronger. Moreover, our set of observed structures must be transitively closed: if the input is the sequence a.b.c then we should observe \(a \le b\) (and not \(b \le a\)) and \(b \le c\) (and not \(c \le b\)), and transitively also \(a \le c\) (and not \(c \le a\)). The transitive closures can act as consistency checks.
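The transitivity check can be sketched as a simple post-processing step on the thresholded readout. Here the readout is assumed to be already reduced to a set of pairs `(x, y)` standing for observed \(x\le y\); the function name `missing_transitive_pairs` is hypothetical.

```python
def missing_transitive_pairs(observed):
    """Return pairs (x, w) implied by x <= y and y <= w but absent from the readout."""
    return {(x, w)
            for (x, y) in observed
            for (z, w) in observed
            if y == z and (x, w) not in observed}


# Readout for the input a.b.c, then the same readout with a <= c lost.
clean = {("a", "a"), ("b", "b"), ("c", "c"), ("a", "b"), ("b", "c"), ("a", "c")}
noisy = clean - {("a", "c")}

print(missing_transitive_pairs(clean))   # set()
print(missing_transitive_pairs(noisy))   # {('a', 'c')}
```

A non-empty result flags a readout that is not transitively closed, which may point to a failed gate or to a threshold set too high.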

4.4 A “Proper” Choice Gate

If we want to use a choice gate in some general and modular way within some bigger design, then we need a gate that respects all the conventions, and in particular that does not crosstalk with unrelated gates. In the design in Fig. 9 the domains called ‘axb’ and ‘bxa’ are uniquely determined by ‘a’ and ‘b’ to avoid crosstalk with other gates. Here, a ‘b’ input does not release a ‘b’ signal that connects with other gates, but rather a ‘bxa’ signal that binds uniquely to the other half of that choice gate. In our preorder recorder application, where we use \(N^2\) gates, we would now need \(N+N^2\) distinct signal domains. Other than that, this choice gate could replace the crosstalking one. A catalytic version can be obtained as in Fig. 4.
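The resource counts quoted for the two designs can be tabulated with a few lines of arithmetic. This sketch only restates the formulas given in the text (\(N^2\) gates; \(N+4\) domains plus one toehold for the crosstalking design; \(N+N^2\) signal domains for the proper design); the function name `resource_counts` and the dictionary keys are hypothetical.

```python
def resource_counts(n):
    """Gate and domain counts for n signals, following the formulas in the text."""
    return {
        "gates": n * n,
        # n signal domains, the fixed s/p/q/r domains, and one shared toehold
        "crosstalking_domains": n + 4 + 1,
        # n signal domains plus one gate-specific domain per gate
        "proper_signal_domains": n + n * n,
    }


for n in (3, 10, 30):
    print(n, resource_counts(n))
```

For example, with 10 signals the crosstalking design needs 15 distinct domains, while the proper design needs 110 signal domains, which illustrates why the crosstalking version scales more comfortably.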

5 Conclusions

We have described a class of DNA algorithms designed to take advantage of high-throughput sequencing and also relying on high-throughput synthesis. A combinatorial number of different structures are activated on demand without any timing or synchronization, operating by natural parallelism. The outcome is produced not as an output but as the final state of the system to be read by sequencing.