Abstract
This paper presents new results from a detailed study of the structure of autocatalytic sets. We show how autocatalytic sets can be decomposed into smaller autocatalytic subsets, and how these subsets can be identified and classified. We then argue that this has important consequences for the evolvability, enablement, and emergence of autocatalytic sets. We end with some speculation on how all this might lead to a generalized theory of autocatalytic sets, which could possibly be applied to entire ecologies or even economies.
Acknowledgments
This paper was finalized while WH and SK were visiting the Computational Systems Biology Research Group of the Tampere University of Technology, Finland. MS thanks the Royal Society of New Zealand for funding support. We also thank Vera Vasas for helpful and stimulating discussions.
Appendix
Proof of Theorem 1:
Part 1: First, consider a directed graph \(G\) that has \(2k\) vertices \(r_1, r_2, \ldots, r_k\) and \(r'_1, r'_2, \ldots, r'_k\). For each \(i = 1, 2, \ldots, k-1,\) place a directed edge from \(r_i\) to \(r_{i+1}\) and also one from \(r_i\) to \(r'_{i+1}\). Next, for each \(i = 1, 2, \ldots, k-1,\) place a directed edge from \(r'_i\) to \(r_{i+1}\) and also one from \(r'_i\) to \(r'_{i+1}\). Finally, place directed edges from \(r_k\) back to \(r_1\) and to \(r'_1\); similarly, place directed edges from \(r'_k\) back to \(r_1\) and to \(r'_1\).
Notice that the number of minimal directed cycles in this digraph is \(2^k\), since we have complete freedom to select \(r_i\) or \(r'_i\) at each step in the cycle, and we must select one of them (to get a cycle) but not more than one (to get a minimal cycle).
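The count of \(2^k\) minimal cycles can be checked by brute force for small \(k\). The following is a minimal sketch, not part of the original proof: the vertex encoding `("r", i)` / `("r'", i)` and all function names are illustrative assumptions, and "minimal" is interpreted here as "the cycle's vertex set induces no edges other than the cycle's own".

```python
def build_digraph(k):
    """Return (vertices, edges) of G for k >= 2; ("r", i) is r_i and ("r'", i) is r'_i."""
    vertices = [(name, i) for i in range(1, k + 1) for name in ("r", "r'")]
    edges = set()
    for i in range(1, k + 1):
        nxt = 1 if i == k else i + 1          # layer k wraps around to layer 1
        for src in ("r", "r'"):
            for dst in ("r", "r'"):
                edges.add(((src, i), (dst, nxt)))
    return vertices, edges


def simple_cycles(vertices, edges):
    """Enumerate each simple directed cycle once, rooted at its smallest vertex."""
    succ = {v: [w for (u, w) in edges if u == v] for v in vertices}
    found = []

    def extend(path, on_path):
        for w in succ[path[-1]]:
            if w == path[0]:
                found.append(tuple(path))     # closed a cycle back to the root
            elif w not in on_path and w > path[0]:
                path.append(w)
                on_path.add(w)
                extend(path, on_path)
                on_path.remove(w)
                path.pop()

    for v in sorted(vertices):
        extend([v], {v})
    return found


def is_minimal(cycle, edges):
    """A cycle is minimal when its vertex set induces only the cycle's own edges."""
    on_cycle = set(cycle)
    induced = sum(1 for (u, w) in edges if u in on_cycle and w in on_cycle)
    return induced == len(cycle)


def count_minimal_cycles(k):
    vertices, edges = build_digraph(k)
    return sum(is_minimal(c, edges) for c in simple_cycles(vertices, edges))
```

Under these assumptions, `count_minimal_cycles(k)` should agree with \(2^k\) for small values such as \(k = 2, 3, 4\); the longer simple cycles that go around the layers twice are rejected by `is_minimal`.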
We now use this graph to construct an RAF set that has exponentially many irrRAFs as follows. Associate with \(r_i\) the reaction \(a_i + b_i \Rightarrow c_i\) and with \(r'_i\) the reaction \(a'_i + b'_i \Rightarrow c_i,\) where:

(i) the \(a_i, b_i, a'_i, b'_i\) and \(c_i\) are all distinct from each other (and across different choices of \(i\) there is no repetition), and

(ii) the \(a_i, b_i, a'_i, b'_i\) are all in the food set \(F\) (for all \(i\)).
For the catalysis set \(C\), we let \(c_i\) catalyze \(r_{i+1}\) and \(r'_{i+1}\) (for \(i = 1, 2, \ldots, k-1\)). In addition, let \(c_k\) catalyze \(r_1\) and \(r'_1\). Figure 3 illustrates this RAF set for the case \(k = 3\).
The irrRAFs in the resulting RAF set are now in one-to-one correspondence with the minimal directed cycles of the graph \(G\) described above, and there are \(2^k\) such minimal cycles, but only \(2k\) reactions and \(5k\) molecules. So, the number of irrRAFs is exponential in the size of the RAF set. Notice that this construction can be carried out within the binary polymer model.
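The construction can be explored numerically for small \(k\). The sketch below is illustrative only: the `Reaction` tuple, the molecule names, and the brute-force enumeration are assumptions made for this example (the enumeration itself is exponential and is not the polynomial-time machinery of Parts 2 and 3). A subset is treated as an RAF when it is nonempty and every reaction has all its reactants and at least one catalyst in the closure of the food set under that subset.

```python
from itertools import combinations
from typing import FrozenSet, NamedTuple


class Reaction(NamedTuple):
    name: str
    reactants: FrozenSet[str]
    products: FrozenSet[str]
    catalysts: FrozenSet[str]


def build_system(k):
    """Food set and the 2k reactions r_i, r'_i of the Part 1 construction."""
    food, reactions = set(), []
    for i in range(1, k + 1):
        prev = k if i == 1 else i - 1         # c_{i-1} catalyzes layer i; c_k catalyzes layer 1
        for tag in ("", "'"):
            a, b = f"a{tag}{i}", f"b{tag}{i}"
            food |= {a, b}
            reactions.append(Reaction(f"r{tag}{i}", frozenset({a, b}),
                                      frozenset({f"c{i}"}), frozenset({f"c{prev}"})))
    return frozenset(food), reactions


def closure(food, reactions):
    """Molecules obtainable from the food set using the given reactions."""
    molecules = set(food)
    changed = True
    while changed:
        changed = False
        for r in reactions:
            if r.reactants <= molecules and not r.products <= molecules:
                molecules |= r.products
                changed = True
    return molecules


def is_raf(subset, food):
    if not subset:
        return False
    molecules = closure(food, subset)
    return all(r.reactants <= molecules and r.catalysts & molecules for r in subset)


def irreducible_rafs(food, reactions):
    """Brute force: RAF subsets that contain no other RAF as a proper subset."""
    rafs = [frozenset(c)
            for n in range(1, len(reactions) + 1)
            for c in combinations(reactions, n)
            if is_raf(c, food)]
    return [raf for raf in rafs if not any(other < raf for other in rafs)]
```

For example, `food, reactions = build_system(3)` gives 6 reactions over 15 molecules, and `len(irreducible_rafs(food, reactions))` should equal \(2^3 = 8\), matching the count of minimal cycles.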
Part 2: For an arbitrary subset \(\mathcal{R}'' \subseteq \mathcal{R},\) let \(s(\mathcal{R}'')\) denote the (possibly empty) subset of \(\mathcal{R}\) obtained by applying the RAF algorithm to \(\mathcal{R}''\) and \(F\), and let \(\mathcal{R}''_{\neq \emptyset}\) be the set of reactions \(r\) in \(\mathcal{R}''\) for which \(s(\mathcal{R}'' - \{r\}) \neq \emptyset.\) We first establish the following result:
Claim 1: If \(\mathcal{R}'\) is any RAF, then \(\mathcal{R}''\) is a maximal proper sub-RAF of \(\mathcal{R}'\) if and only if

(a) \(\mathcal{R}'' = s(\mathcal{R}' - \{r\})\) for some reaction \(r \in \mathcal{R}'_{\neq \emptyset},\) and

(b) \(\mathcal{R}''\) is not strictly contained within any other set of type (a).
To verify this claim, suppose that \(A\) is a maximal proper sub-RAF of \(\mathcal{R}'.\) Then there is at least one reaction \(r \in \mathcal{R}' - A.\) Notice that, since \(A \subseteq \mathcal{R}' - \{r\},\) \(s(A) = A\) is a nonempty subset of \(s(\mathcal{R}' - \{r\});\) moreover, \(s(\mathcal{R}' - \{r\})\) is a strict sub-RAF of \(\mathcal{R}'\) since \(s(\mathcal{R}' - \{r\})\) does not include \(r\) while \(\mathcal{R}'\) does. Thus, since \(A\) is a maximal proper sub-RAF of \(\mathcal{R}',\) we have
\[A = s(\mathcal{R}' - \{r\}),\]
and so (a) holds. Property (b) now follows by the maximality assumption.
Conversely, suppose that (a) and (b) hold for \(\mathcal{R}''.\) Then \(\mathcal{R}'' = s(\mathcal{R}' - \{r\})\) is nonempty and so \(s(\mathcal{R}' - \{r\})\) is a proper sub-RAF of \(\mathcal{R}',\) and if it were not a maximal proper sub-RAF of \(\mathcal{R}'\) then, from the first part of the proof, \(s(\mathcal{R}' - \{r\})\) would need to be strictly contained within \(s(\mathcal{R}' - \{r'\})\) for some reaction \(r' \in \mathcal{R}'_{\neq \emptyset},\) and this is impossible since we are assuming that (b) holds.
From Claim 1, the number of maximal proper sub-RAFs is at most the number of sets of the form \(s(\mathcal{R}' - \{r\})\) for \(r \in \mathcal{R}',\) and there are at most \(|\mathcal{R}'|\) such sets across the possible choices of \(r\) from \(\mathcal{R}'.\)
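Claim 1 translates directly into a procedure for listing maximal proper sub-RAFs. The sketch below is an illustration under stated assumptions: `s` implements the RAF algorithm in its usual form (repeatedly compute the closure of the food set and discard reactions that lack a reactant or a catalyst in it), which this appendix refers to but does not spell out, and the `Reaction` tuple and the four-reaction example (the Part 1 construction with \(k = 2\)) are hypothetical names chosen for the example.

```python
from typing import FrozenSet, NamedTuple


class Reaction(NamedTuple):
    name: str
    reactants: FrozenSet[str]
    products: FrozenSet[str]
    catalysts: FrozenSet[str]


def closure(food, reactions):
    """Molecules obtainable from the food set using the given reactions."""
    molecules = set(food)
    changed = True
    while changed:
        changed = False
        for r in reactions:
            if r.reactants <= molecules and not r.products <= molecules:
                molecules |= r.products
                changed = True
    return molecules


def s(reactions, food):
    """Assumed form of the RAF algorithm: largest RAF inside `reactions`, possibly empty."""
    current = set(reactions)
    while current:
        molecules = closure(food, current)
        kept = {r for r in current if r.reactants <= molecules and r.catalysts & molecules}
        if kept == current:
            break
        current = kept
    return frozenset(current)


def maximal_proper_sub_rafs(raf, food):
    """Claim 1: nonempty sets s(R' - {r}) that are not strictly inside another such set."""
    candidates = {s(raf - {r}, food) for r in raf}    # sets of type (a), plus maybe the empty set
    candidates.discard(frozenset())
    return {c for c in candidates if not any(c < d for d in candidates)}   # condition (b)


def rx(name, reactants, product, catalyst):
    return Reaction(name, frozenset(reactants), frozenset({product}), frozenset({catalyst}))


# Hypothetical example: the Part 1 construction with k = 2.
FOOD = frozenset({"a1", "b1", "a'1", "b'1", "a2", "b2", "a'2", "b'2"})
r1 = rx("r1", {"a1", "b1"}, "c1", "c2")
r1p = rx("r'1", {"a'1", "b'1"}, "c1", "c2")
r2 = rx("r2", {"a2", "b2"}, "c2", "c1")
r2p = rx("r'2", {"a'2", "b'2"}, "c2", "c1")
FULL = frozenset({r1, r1p, r2, r2p})
```

In this example `maximal_proper_sub_rafs(FULL, FOOD)` yields four sub-RAFs of three reactions each, consistent with the bound of at most \(|\mathcal{R}'|\) sets, whereas the irreducible RAF `frozenset({r1, r2})` has none. Each call to `s` is polynomial, and it is called once per reaction.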
Part 3: Part (i) follows directly from Claim 1, since the collection of RAF sets \(\{s(\mathcal{R}' - \{r\}) : r \in \mathcal{R}'_{\neq \emptyset}\}\) can be computed in polynomial time, and property (b) in Claim 1 can then also be checked in polynomial time.
Part (ii) also follows from Claim 1, since this shows that \(\mathcal{R}'\) is the union of two proper sub-RAFs if and only if
\[\mathcal{R}' = s(\mathcal{R}' - \{r_1\}) \cup s(\mathcal{R}' - \{r_2\}) \qquad (1)\]
for some pair of distinct elements \(r_1, r_2\) of \(\mathcal{R}'_{\neq \emptyset}.\)
From this, it is clear how to obtain a polynomial-time algorithm: first construct the set \(\mathcal{R}'_{\neq \emptyset},\) and, provided this set is nonempty, search for all pairs \(r_1, r_2 \in \mathcal{R}'_{\neq \emptyset}\) for which Eqn. (1) holds; for each such pair we can set \(\mathcal{R}_i := s(\mathcal{R}' - \{r_i\})\) for \(i = 1, 2,\) so that \(\mathcal{R}' = \mathcal{R}_1 \cup \mathcal{R}_2.\) If no such pair \(r_1, r_2\) exists (or if \(\mathcal{R}'_{\neq \emptyset}\) is empty), then report that \(\mathcal{R}'\) cannot be decomposed further. This completes the proof of part (ii).
For part (iii), it suffices to verify the following:
Claim 2: If \(\mathcal{R}'\) is any RAF set and \(\mathcal{R}_0\) is any nonempty subset of \(\mathcal{R}',\) then \(\mathcal{R}_0\) is contained within every sub-RAF of \(\mathcal{R}'\) if and only if \(s(\mathcal{R}' - \{r\}) = \emptyset\) for all \(r \in \mathcal{R}_0.\)
To verify this claim, first suppose there exists \(r \in \mathcal{R}_0\) with \(s(\mathcal{R}' - \{r\}) \neq \emptyset.\) Then \(s(\mathcal{R}' - \{r\})\) is a sub-RAF of \(\mathcal{R}',\) and yet it does not contain \(\mathcal{R}_0,\) since \(s(\mathcal{R}' - \{r\})\) is a subset of \(\mathcal{R}' - \{r\}\) and so does not contain \(r \in \mathcal{R}_0.\) Conversely, suppose there exists a sub-RAF \(\mathcal{R}''\) of \(\mathcal{R}'\) which does not contain \(\mathcal{R}_0.\) Select any reaction \(r \in \mathcal{R}_0 - \mathcal{R}''.\) Then \(\mathcal{R}'' \subseteq s(\mathcal{R}' - \{r\})\) and so \(s(\mathcal{R}' - \{r\}) \neq \emptyset.\) This establishes Claim 2, as required, and completes the proof. \(\square\)
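Claim 2 gives a simple test for the reactions shared by every sub-RAF: remove each reaction in turn and check whether anything survives. The sketch below is illustrative; `s` is an assumed form of the RAF algorithm (iterated closure and pruning), and the `Reaction` tuple, the helper `rx`, and the example reactions are hypothetical.

```python
from typing import FrozenSet, NamedTuple


class Reaction(NamedTuple):
    name: str
    reactants: FrozenSet[str]
    products: FrozenSet[str]
    catalysts: FrozenSet[str]


def closure(food, reactions):
    """Molecules obtainable from the food set using the given reactions."""
    molecules = set(food)
    changed = True
    while changed:
        changed = False
        for r in reactions:
            if r.reactants <= molecules and not r.products <= molecules:
                molecules |= r.products
                changed = True
    return molecules


def s(reactions, food):
    """Assumed form of the RAF algorithm: largest RAF inside `reactions`, possibly empty."""
    current = set(reactions)
    while current:
        molecules = closure(food, current)
        kept = {r for r in current if r.reactants <= molecules and r.catalysts & molecules}
        if kept == current:
            break
        current = kept
    return frozenset(current)


def reactions_in_every_sub_raf(raf, food):
    """Claim 2: r lies in every sub-RAF exactly when s(R' - {r}) is empty."""
    return frozenset(r for r in raf if not s(raf - {r}, food))


def rx(name, reactants, product, catalyst):
    return Reaction(name, frozenset(reactants), frozenset({product}), frozenset({catalyst}))


# Hypothetical example: three reactions taken from the Part 1 construction with k = 2.
FOOD = frozenset({"a1", "b1", "a'1", "b'1", "a2", "b2", "a'2", "b'2"})
r1 = rx("r1", {"a1", "b1"}, "c1", "c2")
r1p = rx("r'1", {"a'1", "b'1"}, "c1", "c2")
r2 = rx("r2", {"a2", "b2"}, "c2", "c1")
EXAMPLE = frozenset({r1, r1p, r2})
```

Here `reactions_in_every_sub_raf(EXAMPLE, FOOD)` contains only `r2`: either of `r1` and `r'1` can be dropped, but without `r2` no catalyst `c2` is produced and the RAF algorithm returns the empty set.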
Proof of Corollary 1
The algorithm constructs the Hasse diagram from the top down, starting from the single node \(\mathcal{R}'.\) We apply Part 3(i) of Theorem 1 to list all the maximal proper sub-RAFs of \(\mathcal{R}',\) and then place edges from each of these to \(\mathcal{R}'\) (if \(\mathcal{R}'\) has no maximal proper sub-RAFs then \(\mathcal{R}'\) is irreducible and we leave the node as it is). Now we repeat this step recursively on these sub-RAFs, introducing edges as before, and also identifying any two (or more) nodes labeled by the same sub-RAF. We continue in this way until the network can be extended no further, in which case all the nodes with no children comprise the set of irrRAFs of \(\mathcal{R}'.\)
The resulting network \(N\) that we have constructed contains all the nodes of the Hasse diagram of the poset (i.e. it contains all the sub-RAFs of \(\mathcal{R}'\)); moreover, the edge set is a subset of the edges in the Hasse diagram. This last claim needs a short proof: if we have constructed an edge in \(N\) from \(\mathcal{R}_1\) to \(\mathcal{R}_2,\) where \(\mathcal{R}_1 \subset \mathcal{R}_2,\) we need to show that there is no other path in \(N\) from \(\mathcal{R}_1\) to \(\mathcal{R}_2\) via a sequence of increasing sub-RAFs (which would make the edge \((\mathcal{R}_1, \mathcal{R}_2)\) redundant). Suppose there were such a second path, and let \((\mathcal{R}_3, \mathcal{R}_2)\) be the last edge on this path. Then, referring to Claim 1 (in the proof of Part 2 of Theorem 1), \(\mathcal{R}_1 = s(\mathcal{R}_2 - \{r\})\) would be strictly contained in \(\mathcal{R}_3 = s(\mathcal{R}_2 - \{r'\})\) for some reactions \(r, r'.\) This violates condition (b) of Claim 1, so \(\mathcal{R}_1\) could not have been selected as a maximal proper sub-RAF of \(\mathcal{R}_2.\)
Thus, each edge in \(N\) is present as an edge in the Hasse diagram. Moreover, all edges in the Hasse diagram are present in \(N\). To see this, suppose that in the Hasse diagram there is an edge from \(\mathcal{R}_1\) to \(\mathcal{R}_2,\) where \(\mathcal{R}_1 \subset \mathcal{R}_2.\) Then \(\mathcal{R}_1\) must be a maximal proper sub-RAF of \(\mathcal{R}_2\) and so, by construction, the algorithm inserts an edge from \(\mathcal{R}_1\) to \(\mathcal{R}_2\) during the step at which the sub-RAF \(\mathcal{R}_2\) and its maximal proper sub-RAFs are considered.
In summary, we have verified that the algorithm described constructs exactly the Hasse diagram of sub-RAFs of \(\mathcal{R}'.\) \(\square\)
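The top-down procedure in this proof can be sketched in a few lines. This is an illustration under assumptions, not the authors' code: `s` is an assumed form of the RAF algorithm (iterated closure and pruning), `maximal_proper_sub_rafs` applies Claim 1, and the `Reaction` tuple, the helper `rx`, and the example system (the Part 1 construction with \(k = 2\)) are hypothetical. Note that the diagram itself can be exponentially large, so the running time is polynomial only relative to the size of the output.

```python
from typing import FrozenSet, NamedTuple


class Reaction(NamedTuple):
    name: str
    reactants: FrozenSet[str]
    products: FrozenSet[str]
    catalysts: FrozenSet[str]


def closure(food, reactions):
    """Molecules obtainable from the food set using the given reactions."""
    molecules = set(food)
    changed = True
    while changed:
        changed = False
        for r in reactions:
            if r.reactants <= molecules and not r.products <= molecules:
                molecules |= r.products
                changed = True
    return molecules


def s(reactions, food):
    """Assumed form of the RAF algorithm: largest RAF inside `reactions`, possibly empty."""
    current = set(reactions)
    while current:
        molecules = closure(food, current)
        kept = {r for r in current if r.reactants <= molecules and r.catalysts & molecules}
        if kept == current:
            break
        current = kept
    return frozenset(current)


def maximal_proper_sub_rafs(raf, food):
    """Claim 1: nonempty sets s(R' - {r}) not strictly inside another such set."""
    candidates = {s(raf - {r}, food) for r in raf}
    candidates.discard(frozenset())
    return {c for c in candidates if not any(c < d for d in candidates)}


def hasse_diagram(raf, food):
    """Return (nodes, edges); each edge is (child, parent) with child a maximal proper sub-RAF."""
    nodes, edges, stack = {raf}, set(), [raf]
    while stack:
        parent = stack.pop()
        for child in maximal_proper_sub_rafs(parent, food):
            edges.add((child, parent))
            if child not in nodes:            # identify nodes labeled by the same sub-RAF
                nodes.add(child)
                stack.append(child)
    return nodes, edges


def rx(name, reactants, product, catalyst):
    return Reaction(name, frozenset(reactants), frozenset({product}), frozenset({catalyst}))


# Hypothetical example: the Part 1 construction with k = 2.
FOOD = frozenset({"a1", "b1", "a'1", "b'1", "a2", "b2", "a'2", "b'2"})
r1 = rx("r1", {"a1", "b1"}, "c1", "c2")
r1p = rx("r'1", {"a'1", "b'1"}, "c1", "c2")
r2 = rx("r2", {"a2", "b2"}, "c2", "c1")
r2p = rx("r'2", {"a'2", "b'2"}, "c2", "c1")

nodes, edges = hasse_diagram(frozenset({r1, r1p, r2, r2p}), FOOD)
irr_rafs = nodes - {parent for _, parent in edges}    # nodes with no children
```

For this small example the diagram has the full RAF at the top, four three-reaction sub-RAFs below it, and four two-reaction irrRAFs at the bottom, which is the \(2^k\) count from Part 1 with \(k = 2\).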
Cite this article
Hordijk, W., Steel, M. & Kauffman, S. The Structure of Autocatalytic Sets: Evolvability, Enablement, and Emergence. Acta Biotheor 60, 379–392 (2012). https://doi.org/10.1007/s10441-012-9165-1
Keywords
 Origin of life
 Autocatalytic sets
 Evolvability
 Emergence
 Functional organization