The Structure of Autocatalytic Sets: Evolvability, Enablement, and Emergence


This paper presents new results from a detailed study of the structure of autocatalytic sets. We show how autocatalytic sets can be decomposed into smaller autocatalytic subsets, and how these subsets can be identified and classified. We then argue that this has important consequences for the evolvability, enablement, and emergence of autocatalytic sets. We end with some speculation on how all this might lead to a generalized theory of autocatalytic sets, which could possibly be applied to entire ecologies or even economies.






This paper was finalized while WH and SK were visiting the Computational Systems Biology Research Group of the Tampere University of Technology, Finland. MS thanks the Royal Society of New Zealand for funding support. We also thank Vera Vasas for helpful and stimulating discussions.

Author information



Corresponding author

Correspondence to Wim Hordijk.



Proof of Theorem 1:

Part 1: First, consider a directed graph G that has 2k vertices \(r_1, r_2, \ldots, r_k\) and \(r'_1, r'_2, \ldots, r'_k.\) For each \(i=1,2,\ldots, k-1,\) place a directed edge from \(r_i\) to \(r_{i+1}\) and also one from \(r_i\) to \(r'_{i+1}.\) Next, for each \(i=1,2,\ldots, k-1,\) place a directed edge from \(r'_i\) to \(r_{i+1}\) and also one from \(r'_i\) to \(r'_{i+1}.\) Finally, place directed edges from \(r_k\) back to \(r_1\) and to \(r'_1;\) similarly, place directed edges from \(r'_k\) back to \(r_1\) and to \(r'_1.\)

Notice that the number of minimal directed cycles in this digraph is \(2^k,\) since we have complete freedom to select \(r_i\) or \(r'_i\) at each step in the cycle, and we must select one of them (to get a cycle) but not more than one (to get a minimal cycle).

We now use this graph to construct an RAF set that has exponentially many irrRAFs as follows. Associate with \(r_i\) the reaction \(a_i+b_i \Rightarrow c_i\) and with \(r'_i\) the reaction \(a'_i + b'_i \Rightarrow c_i,\) where:

  (i) the \(a_i, b_i, a'_i, b'_i\) and \(c_i\) are all distinct from each other (and across different choices of i there is no repetition), and

  (ii) the \(a_i, b_i, a'_i, b'_i\) are all in the food set F (for all i).

For the catalysis set C, we let \(c_i\) catalyze \(r_{i+1}\) and \(r'_{i+1}\) (for \(i=1,2, \ldots, k-1\)). In addition, let \(c_k\) catalyze \(r_1\) and \(r'_1.\) Figure 3 illustrates this RAF set for the case k = 3.

The irrRAFs in this resulting RAF set are now in one-to-one correspondence with the minimal directed cycles of the graph G described above, and there are \(2^k\) such minimal cycles, but only 2k reactions and 5k molecules. So, the number of irrRAFs is exponential in the size of the RAF set. Notice that this construction can be carried out within the binary polymer model.
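As a sanity check on this counting argument, the construction can be built explicitly and its irrRAFs enumerated by brute force for small k. The sketch below is illustrative only: the function names (`build_system`, `max_raf`, `irr_rafs`) and the string labels for molecules and reactions are assumptions of this example, and `max_raf` assumes that the RAF algorithm is the usual iterative reduction (repeatedly discard reactions that lack a reactant or a catalyst in the closure of the food set).

```python
from itertools import combinations

def build_system(k):
    """r_i: a_i + b_i -> c_i and r'_i: a'_i + b'_i -> c_i, where c_i catalyzes
    r_{i+1} and r'_{i+1}, and c_k catalyzes r_1 and r'_1."""
    food, reactions = set(), {}
    for i in range(1, k + 1):
        prev = k if i == 1 else i - 1  # index of the catalyst c_prev
        catalysts = {f"c{prev}"}
        # Each reaction is (reactants, products, catalysts).
        reactions[f"r{i}"] = ({f"a{i}", f"b{i}"}, {f"c{i}"}, catalysts)
        reactions[f"r'{i}"] = ({f"a'{i}", f"b'{i}"}, {f"c{i}"}, catalysts)
        food |= {f"a{i}", f"b{i}", f"a'{i}", f"b'{i}"}
    return reactions, food

def closure(food, reactions):
    """Molecules producible from the food set (catalysis ignored)."""
    mols, changed = set(food), True
    while changed:
        changed = False
        for reactants, products, _ in reactions.values():
            if reactants <= mols and not products <= mols:
                mols |= products
                changed = True
    return mols

def max_raf(reactions, food):
    """Assumed RAF algorithm: returns the maximal RAF subset (possibly empty)."""
    current = dict(reactions)
    while True:
        mols = closure(food, current)
        kept = {n: rx for n, rx in current.items()
                if rx[0] <= mols and rx[2] & mols}
        if len(kept) == len(current):
            return kept
        current = kept

def irr_rafs(reactions, food):
    """Brute force (exponential time): RAFs with no proper subRAF."""
    found = []
    names = sorted(reactions)
    for size in range(1, len(names) + 1):
        for combo in combinations(names, size):
            sub = {n: reactions[n] for n in combo}
            if len(max_raf(sub, food)) != len(sub):
                continue  # not an RAF
            # Irreducible: removing any one reaction leaves no RAF at all.
            if all(not max_raf({m: rx for m, rx in sub.items() if m != n}, food)
                   for n in combo):
                found.append(frozenset(combo))
    return found

reactions, food = build_system(3)
print(len(reactions), len(irr_rafs(reactions, food)))  # prints "6 8"
```

For k = 3 this reports 6 reactions and 8 irrRAFs, matching 2k and \(2^k.\) The enumeration itself is exponential and is meant only to confirm the count on small instances.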

Part 2: For an arbitrary subset \(\mathcal{R}'' \subseteq \mathcal{R},\) let \(s(\mathcal{R}'')\) denote the (possibly empty) subset of \(\mathcal{R}\) obtained by applying the RAF algorithm to \(\mathcal{R}''\) and F, and let \(\mathcal{R}''_{\neq \emptyset}\) be the set of reactions r in \(\mathcal{R}''\) for which \(s(\mathcal{R}''-\{r\}) \neq \emptyset.\) We first establish the following result:

Claim 1: If \(\mathcal{R}'\) is any RAF, then \(\mathcal{R}''\) is a maximal proper subRAF of \(\mathcal{R}'\) if and only if

  (a) \(\mathcal{R}'' = s(\mathcal{R}'-\{r\})\) for some reaction \(r \in \mathcal{R}'_{\neq \emptyset},\) and

  (b) \(\mathcal{R}''\) is not strictly contained within any other set of type (a).

To verify this claim, suppose that A is a maximal proper subRAF of \(\mathcal{R}'.\) Then there is at least one reaction \(r \in \mathcal{R}'-A.\) Notice that, since \(A \subseteq \mathcal{R}'-\{r\},\,s(A)=A\) is a non-empty subset of \(s(\mathcal{R}' -\{r\});\) moreover \(s(\mathcal{R}'-\{r\})\) is a proper subRAF of \(\mathcal{R}'\) since \(s(\mathcal{R}'-\{r\})\) does not include r while \(\mathcal{R}'\) does. Thus, since A is a maximal proper subRAF of \(\mathcal{R}',\) we have

$$ A= s(A) = s({\mathcal{R}}'-\{r\}), $$

and so (a) holds. Property (b) now follows by the maximality assumption.

Conversely, suppose that (a) and (b) hold for \(\mathcal{R}''.\) Then \(\mathcal{R}''=s(\mathcal{R}'-\{r\})\) is nonempty and so \(s(\mathcal{R}'-\{r\})\) is a proper subRAF of \(\mathcal{R}',\) and if it were not a maximal proper subRAF of \(\mathcal{R}'\) then, from the first part of the proof \(s(\mathcal{R}'-\{r\})\) would need to be strictly contained within \(s(\mathcal{R}' - \{r'\})\) for some reaction \(r' \in \mathcal{R}'_{\neq \emptyset},\) and this is impossible since we are assuming that (b) holds.

From Claim 1, the number of maximal proper subRAFs is at most the number of sets of the form \(s(\mathcal{R}'-\{r\})\) for \(r \in \mathcal{R}',\) and there are at most \(|\mathcal{R}'|\) such sets across the possible choices of r from \(\mathcal{R}'.\)
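The operator \(s(\cdot)\) and the set \(\mathcal{R}''_{\neq \emptyset}\) used in Part 2 can be made concrete with a small sketch. It assumes that the RAF algorithm is the usual iterative reduction (compute the closure of the food set, discard every reaction lacking a reactant or a catalyst in that closure, and repeat until nothing changes); this section does not spell the algorithm out, so that reading is an assumption. The names `closure`, `s`, and `nonempty_after_removal`, and the three toy reactions, are hypothetical and chosen only for illustration.

```python
def closure(food, reactions):
    """Molecules producible from the food set (catalysis ignored)."""
    mols, changed = set(food), True
    while changed:
        changed = False
        for reactants, products, _ in reactions.values():
            if reactants <= mols and not products <= mols:
                mols |= products
                changed = True
    return mols

def s(reactions, food):
    """Assumed RAF algorithm: reduce to the maximal RAF subset (possibly empty)."""
    current = dict(reactions)
    while True:
        mols = closure(food, current)
        kept = {n: rx for n, rx in current.items()
                if rx[0] <= mols and rx[2] & mols}
        if len(kept) == len(current):
            return kept
        current = kept

def nonempty_after_removal(reactions, food):
    """Reactions r for which s(reactions - {r}) is non-empty."""
    return {r for r in reactions
            if s({n: rx for n, rx in reactions.items() if n != r}, food)}

# Toy system; each reaction is (reactants, products, catalysts).
food = {"a", "b"}
reactions = {
    "r1": ({"a", "b"}, {"c"}, {"c"}),
    "r2": ({"a", "c"}, {"d"}, {"e"}),  # catalyst e is never available
    "r3": ({"a", "d"}, {"f"}, {"c"}),  # depends on d, which only r2 makes
}

print(sorted(s(reactions, food)))                       # ['r1']
print(sorted(nonempty_after_removal(reactions, food)))  # ['r2', 'r3']
```

Here r2 is discarded in the first pass because its catalyst is missing, and r3 falls in the second pass because its reactant d is then no longer producible. Since there is one candidate set \(s(\mathcal{R}'-\{r\})\) per reaction, at most \(|\mathcal{R}'|\) calls to `s` are needed to list them all.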

Part 3: Part (i) follows directly from Claim 1, since the collection of RAF sets \( \{s({\mathcal{R}}'-\{r\}): r \in {\mathcal{R}}'_{\neq \emptyset}\} \) can be computed in polynomial time, and property (b) in Claim 1 can then also be checked in polynomial time.
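A minimal sketch of how Part (i) could be carried out: collect the non-empty sets \(s(\mathcal{R}'-\{r\})\) and keep those not strictly contained in another, as conditions (a) and (b) of Claim 1 require. The function `maximal_proper_subrafs` and the toy RAF are assumptions of this example, and `s` is again assumed to be the standard iterative reduction.

```python
def closure(food, reactions):
    """Molecules producible from the food set (catalysis ignored)."""
    mols, changed = set(food), True
    while changed:
        changed = False
        for reactants, products, _ in reactions.values():
            if reactants <= mols and not products <= mols:
                mols |= products
                changed = True
    return mols

def s(reactions, food):
    """Assumed RAF algorithm: reduce to the maximal RAF subset (possibly empty)."""
    current = dict(reactions)
    while True:
        mols = closure(food, current)
        kept = {n: rx for n, rx in current.items()
                if rx[0] <= mols and rx[2] & mols}
        if len(kept) == len(current):
            return kept
        current = kept

def maximal_proper_subrafs(raf, food):
    """Claim 1: sets of type (a) that satisfy the maximality condition (b)."""
    candidates = set()
    for r in raf:
        reduced = s({n: rx for n, rx in raf.items() if n != r}, food)
        if reduced:  # condition (a): r lies in R'_{!= empty}
            candidates.add(frozenset(reduced))
    # Condition (b): drop any candidate strictly inside another candidate.
    return [c for c in candidates if not any(c < other for other in candidates)]

# Toy RAF; each reaction is (reactants, products, catalysts).
food = {"a", "b"}
raf = {
    "r1": ({"a", "b"}, {"c"}, {"c"}),
    "r2": ({"a", "c"}, {"d"}, {"c"}),
    "r3": ({"b", "c"}, {"e"}, {"d"}),
}

print([sorted(m) for m in maximal_proper_subrafs(raf, food)])  # [['r1', 'r2']]
```

In this example removing r2 yields the candidate {r1} and removing r3 yields {r1, r2}; the former is strictly contained in the latter, so only {r1, r2} survives condition (b). The procedure uses one call to `s` per reaction plus a pairwise containment check, which is where the polynomial bound comes from.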

Part (ii) also follows from Claim 1, since this shows that \(\mathcal{R}'\) is the union of two proper subRAFs if and only if

$$ {\mathcal{R}}' = s({\mathcal{R}}'-\{r_1\}) \cup s({\mathcal{R}}'-\{r_2\}) \tag{1} $$

for some pair of distinct elements \(r_1, r_2\) of \(\mathcal{R}'_{\neq \emptyset}.\)

From this, it is clear how to obtain a polynomial time algorithm: first construct the set \(\mathcal{R}'_{\neq \emptyset},\) and, provided this set is non-empty, search for all pairs \(r_1, r_2 \in \mathcal{R}'_{\neq \emptyset}\) for which Eqn. (1) holds; for each such pair we can set \(\mathcal{R}_i:=s(\mathcal{R}'-\{r_i\})\) for i = 1, 2, so that \(\mathcal{R}' = \mathcal{R}_1 \cup \mathcal{R}_2.\) If no such pair \(r_1, r_2\) exists (or if \(\mathcal{R}'_{\neq \emptyset}\) is empty), then report that \(\mathcal{R}'\) cannot be decomposed further. This completes the proof of part (ii).
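The search over pairs described above can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: `decompose` is a hypothetical name, it stops at the first pair found instead of listing all pairs, and `s` is assumed to be the standard iterative RAF reduction. The example RAF is the k = 2 instance of the construction from Part 1.

```python
from itertools import combinations

def closure(food, reactions):
    """Molecules producible from the food set (catalysis ignored)."""
    mols, changed = set(food), True
    while changed:
        changed = False
        for reactants, products, _ in reactions.values():
            if reactants <= mols and not products <= mols:
                mols |= products
                changed = True
    return mols

def s(reactions, food):
    """Assumed RAF algorithm: reduce to the maximal RAF subset (possibly empty)."""
    current = dict(reactions)
    while True:
        mols = closure(food, current)
        kept = {n: rx for n, rx in current.items()
                if rx[0] <= mols and rx[2] & mols}
        if len(kept) == len(current):
            return kept
        current = kept

def decompose(raf, food):
    """Return two proper subRAFs whose union is raf, or None if none exist."""
    reduced = {}
    for r in raf:
        sub = s({n: rx for n, rx in raf.items() if n != r}, food)
        if sub:  # r belongs to R'_{!= empty}
            reduced[r] = set(sub)
    for r1, r2 in combinations(sorted(reduced), 2):
        if reduced[r1] | reduced[r2] == set(raf):
            return reduced[r1], reduced[r2]
    return None

# k = 2 instance of the Part 1 construction: (reactants, products, catalysts).
food = {"a1", "b1", "a1'", "b1'", "a2", "b2", "a2'", "b2'"}
raf = {
    "r1":  ({"a1", "b1"},   {"c1"}, {"c2"}),
    "r1'": ({"a1'", "b1'"}, {"c1"}, {"c2"}),
    "r2":  ({"a2", "b2"},   {"c2"}, {"c1"}),
    "r2'": ({"a2'", "b2'"}, {"c2"}, {"c1"}),
}

print(decompose(raf, food) is not None)  # True
print(decompose({"r1": raf["r1"], "r2": raf["r2"]}, food))  # None
```

The full four-reaction RAF splits into two three-reaction subRAFs, while the single cycle {r1, r2} is irreducible and is reported as not decomposable.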

For part (iii), it suffices to verify the following:

Claim 2: If \(\mathcal{R}'\) is any RAF set and \(\mathcal{R}_0\) is any non-empty subset of \(\mathcal{R}'\) then \(\mathcal{R}_0\) is contained within every subRAF of \(\mathcal{R}'\) if and only if \(s(\mathcal{R}'-\{r\}) = \emptyset\) for all \(r \in \mathcal{R}_0.\)

To verify this claim, first suppose there exists \(r \in \mathcal{R}_0\) with \(s(\mathcal{R}'-\{r\}) \neq \emptyset.\) Then \(s(\mathcal{R}'-\{r\})\) is a subRAF of \(\mathcal{R}',\) and yet it does not contain \(\mathcal{R}_0,\) since \(s(\mathcal{R}'-\{r\})\) is a subset of \(\mathcal{R}'-\{r\}\) and so does not contain \(r \in \mathcal{R}_0.\) Conversely, suppose there exists a subRAF \(\mathcal{R}''\) of \(\mathcal{R}'\) which does not contain \(\mathcal{R}_0.\) Select any reaction \(r \in \mathcal{R}_0-\mathcal{R}''.\) Then \(\mathcal{R}'' \subseteq s(\mathcal{R}'-\{r\})\) and so \(s(\mathcal{R}'-\{r\}) \neq \emptyset.\) This establishes Claim 2, as required, and completes the proof. \(\square\)
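Claim 2 gives a direct test for the reactions shared by every subRAF: a reaction r qualifies exactly when \(s(\mathcal{R}'-\{r\}) = \emptyset.\) The sketch below applies that test to a toy RAF. The name `in_every_subraf` and the example reactions are hypothetical, and `s` is again assumed to be the standard iterative RAF reduction.

```python
def closure(food, reactions):
    """Molecules producible from the food set (catalysis ignored)."""
    mols, changed = set(food), True
    while changed:
        changed = False
        for reactants, products, _ in reactions.values():
            if reactants <= mols and not products <= mols:
                mols |= products
                changed = True
    return mols

def s(reactions, food):
    """Assumed RAF algorithm: reduce to the maximal RAF subset (possibly empty)."""
    current = dict(reactions)
    while True:
        mols = closure(food, current)
        kept = {n: rx for n, rx in current.items()
                if rx[0] <= mols and rx[2] & mols}
        if len(kept) == len(current):
            return kept
        current = kept

def in_every_subraf(raf, food):
    """Reactions r with s(raf - {r}) empty, i.e. contained in every subRAF."""
    return {r for r in raf
            if not s({n: rx for n, rx in raf.items() if n != r}, food)}

# Toy RAF; each reaction is (reactants, products, catalysts).
food = {"a", "b"}
raf = {
    "r1": ({"a", "b"}, {"c"}, {"c"}),
    "r2": ({"a", "c"}, {"d"}, {"c"}),
    "r3": ({"b", "c"}, {"e"}, {"d"}),
}

print(in_every_subraf(raf, food))  # {'r1'}
```

In this example the subRAFs are {r1}, {r1, r2}, and {r1, r2, r3}, so r1 is the only reaction common to all of them, which agrees with the test.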

Proof of Corollary 1

The algorithm constructs the Hasse diagram from the top down, starting from the single node \(\mathcal{R}'.\) We apply Part 3(i) of Theorem 1 to list all the maximal proper subRAFs of \(\mathcal{R}',\) and then place edges from each of these to \(\mathcal{R}'\) (if \(\mathcal{R}'\) has no maximal proper subRAFs then \(\mathcal{R}'\) is irreducible and we leave the node as it is). Now we repeat this step recursively on these subRAFs, introducing edges as before, and also identifying any two (or more) nodes labeled by the same subRAF. We continue in this way until the network can be extended no further, in which case all the nodes with no children comprise the set of irrRAFs of \(\mathcal{R}'.\)

The resulting network N that we have constructed contains all the nodes of the Hasse diagram of the poset (i.e. it contains all the subRAFs of \(\mathcal{R}'\)); moreover, the edge set is a subset of the edges in the Hasse diagram. This last claim needs a short proof: if we have constructed an edge in N from \(\mathcal{R}_1\) to \(\mathcal{R}_2,\) where \(\mathcal{R}_1 \subset \mathcal{R}_2,\) we need to show that there is no other path in N from \(\mathcal{R}_1\) to \(\mathcal{R}_2\) via a sequence of increasing subRAFs (which would make the edge \((\mathcal{R}_1, \mathcal{R}_2)\) redundant). Suppose there were such a second path, and let \((\mathcal{R}_3, \mathcal{R}_2)\) be the last edge on this path. Then, referring to Claim 1 (in the proof of Part 2 of Theorem 1), \(\mathcal{R}_1 = s(\mathcal{R}_2-\{r\})\) would be strictly contained in \(\mathcal{R}_3 = s(\mathcal{R}_2-\{r'\})\) for some reactions \(r, r',\) which contradicts property (b) of Claim 1 and so would have prevented \(\mathcal{R}_1\) from being selected as a maximal proper subRAF of \(\mathcal{R}_2.\)

Thus, each edge in N will be present as an edge in the Hasse diagram. Moreover, all edges in the Hasse diagram are present in N. To see this, suppose that in the Hasse diagram there is an edge from \(\mathcal{R}_1\) to \(\mathcal{R}_2,\) where \(\mathcal{R}_1 \subset \mathcal{R}_2.\) Then \(\mathcal{R}_1\) must be a maximal proper subRAF of \(\mathcal{R}_2\) and so, by construction, the algorithm inserts an edge from \(\mathcal{R}_1\) to \(\mathcal{R}_2\) during the step at which the subRAF \(\mathcal{R}_2\) and its maximal proper subRAFs are considered.

In summary, we have verified that the algorithm described constructs exactly the Hasse diagram of subRAFs of \(\mathcal{R}' .\) \(\square\)
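The top-down construction described in this proof can be sketched as a breadth-first expansion in which nodes are identified by their reaction sets. This is a minimal illustration under stated assumptions: `hasse_diagram` and `maximal_proper_subrafs` are hypothetical names, `s` is assumed to be the standard iterative RAF reduction, and the example is the k = 2 instance of the construction from Part 1 of Theorem 1.

```python
from collections import deque

def closure(food, reactions):
    """Molecules producible from the food set (catalysis ignored)."""
    mols, changed = set(food), True
    while changed:
        changed = False
        for reactants, products, _ in reactions.values():
            if reactants <= mols and not products <= mols:
                mols |= products
                changed = True
    return mols

def s(reactions, food):
    """Assumed RAF algorithm: reduce to the maximal RAF subset (possibly empty)."""
    current = dict(reactions)
    while True:
        mols = closure(food, current)
        kept = {n: rx for n, rx in current.items()
                if rx[0] <= mols and rx[2] & mols}
        if len(kept) == len(current):
            return kept
        current = kept

def maximal_proper_subrafs(raf, food):
    """Part 3(i) of Theorem 1, via conditions (a) and (b) of Claim 1."""
    cands = {frozenset(sub) for r in raf
             if (sub := s({n: rx for n, rx in raf.items() if n != r}, food))}
    return [c for c in cands if not any(c < other for other in cands)]

def hasse_diagram(raf, food):
    """Return (nodes, edges, leaves); edges are (subRAF, superRAF) pairs."""
    top = frozenset(raf)
    nodes, edges, queue = {top}, set(), deque([top])
    while queue:
        parent = queue.popleft()
        for child in maximal_proper_subrafs({n: raf[n] for n in parent}, food):
            edges.add((child, parent))
            if child not in nodes:  # identify nodes labeled by the same subRAF
                nodes.add(child)
                queue.append(child)
    has_children = {parent for _, parent in edges}
    leaves = nodes - has_children  # nodes with no children are the irrRAFs
    return nodes, edges, leaves

# k = 2 instance of the Part 1 construction: (reactants, products, catalysts).
food = {"a1", "b1", "a1'", "b1'", "a2", "b2", "a2'", "b2'"}
raf = {
    "r1":  ({"a1", "b1"},   {"c1"}, {"c2"}),
    "r1'": ({"a1'", "b1'"}, {"c1"}, {"c2"}),
    "r2":  ({"a2", "b2"},   {"c2"}, {"c1"}),
    "r2'": ({"a2'", "b2'"}, {"c2"}, {"c1"}),
}

nodes, edges, leaves = hasse_diagram(raf, food)
print(len(nodes), len(edges), len(leaves))  # prints "9 12 4"
```

For this instance the diagram has 9 nodes (one of size four, four of size three, four of size two), 12 edges, and 4 leaves, the leaves being the four irrRAFs.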


Cite this article

Hordijk, W., Steel, M. & Kauffman, S. The Structure of Autocatalytic Sets: Evolvability, Enablement, and Emergence. Acta Biotheor 60, 379–392 (2012).



  • Origin of life
  • Autocatalytic sets
  • Evolvability
  • Emergence
  • Functional organization