## Abstract

The computational power of stochastic chemical reaction networks (CRNs) varies significantly with the output convention and whether or not error is permitted. Focusing on probability 1 computation, we demonstrate a striking difference between *stable* computation that converges to a state where the output cannot change, and the notion of *limit-stable* computation where the output eventually stops changing with probability 1. While stable computation is known to be restricted to semilinear predicates (essentially piecewise linear), we show that limit-stable computation encompasses the set of predicates \(\phi :{\mathbb {N}}\rightarrow \{0,1\}\) in \(\Delta ^0_2\) in the arithmetical hierarchy (a superset of Turing-computable). In finite time, our construction achieves an error-correction scheme for Turing universal computation. We show an analogous characterization of the functions \(f:{\mathbb {N}}\rightarrow {\mathbb {N}}\) computable by CRNs with probability 1, which encode their output into the count of a certain species. This work refines our understanding of the tradeoffs between error and computational power in CRNs.

This is a preview of subscription content, access via your institution.

## Notes

A CRN where reactions can increase or decrease the count of molecules may not be consistent with the conservation of mass for a closed system. In that case the CRN represents the behavior of an open system with an implicit, potentially unbounded (or replenishable) reservoir.

The second and third reactions are equivalent to the gambler’s ruin problem (Feller 1968), which tells us that, because the probability of increasing the count of \(Y\) is twice that of decreasing its count, there is a positive probability that \(Y\)’s count never reaches \(0\). The first reaction can only increase this probability. Since whenever \(Y\)’s count reaches \(0\), we have another try, eventually with probability \(1\) we will not stop visiting the state \(\{0 Y\}\).

The notions of stable and limit-stable are incomparable in the sense that a CRN can compute a predicate under one convention but not the other. The previous paragraph showed a CRN that limit-stably computes a predicate but does not stably compute it. To see the other direction, consider removing the first reaction and starting in state \(\{1 N, 1 Y\}\), i.e., a no voter and a yes voter. This state has undefined output, as does any state with positive count of \(Y\), since there is always an \(N\) present. The state \(\{1 N, 0 Y\}\) is has defined output “no” and is stable (since we removed the first reaction). Since that state is always reachable, the CRN stably computes the predicate \(\phi =0\). However, there is a positive probability that \(\{1 N, 0 Y\}\) is never reached, so the predicate is not computed under the limit-stable convention.

Consider the contrast to stable CRN computation. Although stably computing CRNs do not know when they are finished, an outside observer

*can*compute if the CRN is done—i.e., no sequence of reactions can change the output, since this is easily reduced to the problems of deciding whether one state is reachable from another, known to be decidable (Mayr 1981), and deciding if a superset of a state is reachable from another, also decidable (Karp and Miller 1969).In Sect. 3 and beyond, we restrict attention to the case that \(|\Sigma | = 1\), i.e., single-integer inputs. Since our main result will show that the predicates computable with probability 1 by a CRN encompass all of \(\Delta ^0_2\), this restriction will not be crucial, since any computable encoding function \(e:{\mathbb {N}}^k \rightarrow {\mathbb {N}}\) that represents \(k\)-tuples of integers as a single integer, and its inverse decoding function \(d:{\mathbb {N}}\rightarrow {\mathbb {N}}^k\), can be computed by the CRN to handle a \(k\)-tuples of inputs that is encoded into the count of a single input species \(X\). Such encoding functions are provably not computable by semilinear functions, so this distinction is more crucial in the realm of stable computation by CRDs, which are limited to semilinear predicates and functions (Chen et al. 2014; Angluin et al. 2006).

In other words, species in \(\Lambda \setminus \Sigma \) must always start with the same counts, and counts of species in \(\Sigma \) are varied to represent different inputs to \({\mathcal {D}}\), similarly to a Turing machine that starts with different binary string inputs, but the Turing machine must always start with the same initial state and tape head position.

A common restriction is to assume the

*finite density constraint*, which stipulates that arbitrarily large mass cannot occupy a fixed volume, and thus the volume must grow proportionally with the total molecular count. With some minor modifications to ensure*relative*rates of reactions stay the same (even though all bimolecular reactions would be slowed down in absolute terms), our construction would work under this assumption, although the time analysis would change. For the sake of conceptual clarity, we present the construction assuming a constant volume. The issue is discussed in more detail in Sect. 4.Note that this is equivalent to requiring \(\lim _{i\rightarrow \infty } \Phi ({\mathbf {c}}_i) = b\), hence the term “limit” in the phrase “limit-stable” comes from this requirement on infinite executions with a well-defined limit output.

If the error takes the register machine \(M\) to a configuration from which it halts but possibly produces the wrong answer, then, assuming (1) is accomplished by other means, it is easy to ensure (2): the CRD can simply simulate \(M\) over and over again in an infinite loop, always updating the CRD’s output to be the most recent output of \(M\). Since (1) ensures that errors eventually stop occurring, all but finitely many simulations of \(M\) give the correct answer, causing the CRD to stabilize on this answer. Most of the complexity of the construction described in the subsequent sections is to handle the case that the error takes \(M\) to a configuration from which it does not halt if simulated correctly from that point on.

It is sufficient to bound the number of decrements, rather than total instructions, since we may assume without loss of generality that \(M\) contains no “all-increment” cycles. (If it does then either these lines are not reachable or \(M\) enters an infinite loop.) Thus any infinite computation of \(M\) must decrement infinitely often.

Intuitively, with an \(\ell \)-stage clock, if there are \(a\) molecules of \(A\), the frequency of time that \(C_\ell \) is present is less than \(\frac{1}{a^{\ell -1}}\). A stage \(\ell = 4\) clock is used to ensure that the error decreases quickly enough that with probability \(1\) a finite number of errors are ever made, and the last error occurs in finite expected time. A more complete analysis of the clock module is contained in Sect. 4.

Although there are three instructions and three reactions in this implementation of flush(

*r*,*r'*), there is not a 1–1 mapping between instructions and reactions; the three reactions collectively have the same effect as the three instructions, assuming the third reaction does not erroneously happen when the first reaction is possible.Recall that the Borel–Cantelli Lemma does not require the events to be independent, and indeed they are not in our case (e.g., failing to decrement may increase or decrease the chance of error on subsequent decrement instructions.

They actually show it is \(O(\ell ^{s-1} v / k)\), where \(v\) is the volume (1 in our case), \(k\) is the rate constant on all reactions (also 1 in our case).

Technically, we are defining, for each \(t\in {\mathbb {N}}\), a measure on the set of all states, giving the state’s probability of being reached in exactly \(t\) steps, so for each \(t\in {\mathbb {N}}\), \(\mu (\cdot ,t):{\mathbb {N}}^\Lambda \rightarrow [0,1]\) is a probability measure on \({\mathbb {N}}^\Lambda \). Since the evolution of the system is Markovian, once we know the probability \(\mu ({\mathbf {c}},t)\) of ending up in state \({\mathbf {c}}\) after precisely \(t\) steps, it does not matter the particular sequence of \(t-1\) states preceding \({\mathbf {c}}\) that got the system there, in order to determine probability of the various states that could follow \({\mathbf {c}}\).

As in Sect. 3, we focus on functions taking single integers as input for the sake of simplifying the discussion. The ideas carry through just as easily to functions \(f:{\mathbb {N}}^k \rightarrow {\mathbb {N}}\). One could modify the construction to simply have extra input registers for a direct encoding, or one could encode several inputs \(n_1,\ldots ,n_k\in {\mathbb {N}}\) into a single integer \(n\in {\mathbb {N}}\) using some injective encoding function \(e:{\mathbb {N}}^k\rightarrow {\mathbb {N}}\), which could be decoded as the first step of the register machine computation.

For convenience we are assuming a single input species, so that once we know the initial context of \({\mathcal {C}}\), we can equivalently fully describe the input to \({\mathcal {C}}\) as a single natural number \(n\) describing the initial count of \(X\).

Equivalently, \(\lim _{i\rightarrow \infty } \Phi ({\mathbf {c}}_i) = m\).

An erroneous value of \(Y\) can be made permanent if its backup is actually correct—since we only compare the backup. Thus we must ensure that even in the presence of errors, certain invariants are maintained.

The definition of \(r(n,t)\) appears to require checking an infinite number of possible natural numbers \(m\). However, only finitely many such natural numbers can have nonzero probability of being the count of \(Y\) after exactly \(t\) reactions since there are only a finite number of execution sequences of length \(t\), each terminating in a configuration accounting for one possible value of \(\#_{n,t} Y\). Therefore calculating \(r(n,t)\) requires only searching through this finite list and picking the smallest natural number that has maximum probability.

For a register machine even to meaningfully “read” an input \(n\) requires decrementing the input register at least \(n\) times to get it to 0, whereas a Turing machine can process \(n\)’s binary representation in only \(\approx \log n\) steps.

## References

Angluin D, Aspnes J, Eisenstat D (2006) Stably computable predicates are semilinear. In: PODC 2006: Proceedings of the twenty-fifth annual ACM symposium on Principles of distributed computing. ACM Press, New York, NY, USA, pp 292–299

Aspnes J, Ruppert E (2007) An introduction to population protocols. Bull Eur Assoc Theor Comput Sci 93:98–117

Brijder R (2014) Output stability and semilinear sets in chemical reaction networks and deciders. In: DNA 2014: Proceedings of the 20th international meeting on DNA computing and molecular programming, lecture notes in computer science. Springer, Berlin

Cardelli L (2011) Strand algebras for DNA computing. Nat Comput 10(1):407–428

Chen Y-J, Dalchau N, Srinivas N, Phillips A, Cardelli L, Soloveichik D, Seelig G (2013) Programmable chemical controllers made from DNA. Nat Nanotechnol 8(10):755–762

Chen H-L, Doty D, Soloveichik D (2014) Deterministic function computation with chemical reaction networks. Nat Comput 13(4):517–534 Preliminary version appeared in DNA 2012

Cook M, Soloveichik D, Winfree E, Bruck J (2009) Programmability of chemical reaction networks. In: Condon A, Harel D, Kok JN, Salomaa A, Winfree E (eds) Algorithmic Bioprocess. Springer, Berlin, pp 543–584

Feller W (1968) An introduction to probability theory and its applications, vol 1. Wiley, New York

Gillespie DT (1977) Exact stochastic simulation of coupled chemical reactions. J Phys Chem 81(25):2340–2361

Karp RM, Miller RE(1969) Parallel program schemata. J Comput Syst Sci 3(2):147–195

Mayr EW (1981) An algorithm for the general Petri net reachability problem. In: Proceedings of the thirteenth annual ACM symposium on theory of computing (STOC ’81). ACM, New York, NY, USA, pp 238–246

Petri CA (1966) Communication with automata. Technical report, DTIC Document

Rogers H Jr (1967) Theory of recursive functions and effective computability. McGraw-Hill series in higher mathematics. McGraw-Hill, New York

Shoenfield JR (1959) On degrees of unsolvability. Ann Math 69(3):644–653

Soare RI (2013) Interactive computing and relativized computability. In: Copeland BJ, Posy CJ, Shagrir O (eds) Computability: Turing, Gödel, Church, and beyond. MIT Press, Cambridge, pp 203–260

Soloveichik D, Cook M, Winfree E, Bruck J (2008) Computation with finite stochastic chemical reaction networks. Nat Comput 7(4):615–633

Soloveichik D, Seelig G, Winfree E (2010) DNA as a universal substrate for chemical kinetics. Proc Natl Acad Sci 107(12):5393 Preliminary version appeared in DNA 2008

Volterra V (1926) Variazioni e fluttuazioni del numero dindividui in specie animali conviventi. Mem Acad Lincei Roma 2:31–113

Zavattaro G, Cardelli L (2008) Termination problems in chemical kinetics. In: CONCUR 2008-concurrency theory, pp 477–491

## Acknowledgments

We thank Shinnosuke Seki, Chris Thachuk, and Luca Cardelli for many useful and insightful discussions. The first author was supported by NSF Grants CCF-1049899 and CCF-1217770, the second author was supported by NSF Grants CCF-1219274 and CCF-1162589 and the Molecular Programming Project under NSF Grant 1317694, and the third author was supported by NIGMS Systems Biology Center Grant P50 GM081879.

## Author information

### Authors and Affiliations

### Corresponding author

## Rights and permissions

## About this article

### Cite this article

Cummings, R., Doty, D. & Soloveichik, D. Probability 1 computation with chemical reaction networks.
*Nat Comput* **15**, 245–261 (2016). https://doi.org/10.1007/s11047-015-9501-x

Published:

Issue Date:

DOI: https://doi.org/10.1007/s11047-015-9501-x