Interior Operators and Their Relationship to Autocatalytic Networks

The emergence of an autocatalytic network from an available set of elements is a fundamental step in early evolutionary processes, such as the origin of metabolism. Given a set of elements, the reactions between them (chemical or otherwise), and an assignment of elements that catalyse certain reactions, a Reflexively Autocatalytic F-generated (RAF) set is a subset R′ of reactions that is self-generating from a given food set, and with each reaction in R′ being catalysed from within R′. RAF theory has been applied to various phenomena in theoretical biology, and a key feature of the approach is that it is possible to efficiently identify and classify RAFs within large systems. This is possible because RAFs can be described as the (nonempty) subsets of the reactions that are the fixed points of an (efficiently computable) interior map that operates on subsets of reactions. Although the main generic results concerning RAFs can be derived using just this property, we show that for systems with at least 12 reactions there are generic results concerning RAFs that cannot be proven using the interior operator property alone.


Introduction
Discrete graph-theoretic models have been developed to describe the emergence and structure of self-generating autocatalytic reaction networks within a larger network. This approach was pioneered by Stuart Kauffman's modelling of autocatalytic systems in a simple polymer-based origin-of-life model [13,14], as well as independent results on the appearance of cycles in random directed graphs [2,4] motivated by their relevance to the emergence of living systems. Kauffman's notion of a self-generating autocatalytic network was later formalised as the concept of a Reflexively Autocatalytic and F-generated set ('RAF', defined shortly) [11]. The subsequent theory and algorithms concerning RAFs have been applied in a number of areas, ranging from the origin and structure of primitive metabolism [19,20], to cognitive modelling in cultural evolution [6,7], to ecology [8,9], and to economics [9]. The RAF concept is related to (but different from) Robert Rosen's Metabolism-Replacement (M;R) systems in theoretical biology [12].
The task of determining whether or not a large network of 'reactions' contains a RAF, and if so finding one, is made tractable (in polynomial time) by the property of a certain RAF map defined on the subsets of the full network of reactions. Here we generalize RAF maps to interior operators and investigate the properties of such operators, as well as the extent to which such operators (on arbitrary finite sets) can be realized as RAF maps. In particular, we show that there are generic results concerning RAFs that are not provable from just the basic properties of the RAF map as an interior operator. The significance of this result in applications is that certain generic properties of RAFs may require more detailed arguments than those that can be derived using interior operator properties alone.
We begin by defining interior operators on finite sets, listing some of their basic properties, and describing how they arise naturally from directed graphs. The results are then applied to self-generating autocatalytic networks.

Interior operators and their fixed sets
In this paper, we will assume that all sets are finite, and given a set Y, we write 2^Y to denote the power set of Y. A function ψ : 2^Y → 2^Y is an interior operator on the subsets of Y if it satisfies the following three properties (nesting, monotonicity, and idempotence) for all subsets X, X′ of Y:

$$\begin{aligned}
(I_1)\quad & \psi(X) \subseteq X;\\
(I_2)\quad & X \subseteq X' \implies \psi(X) \subseteq \psi(X');\\
(I_3)\quad & \psi(\psi(X)) = \psi(X).
\end{aligned}$$

The term 'interior operator' comes from topology, since the function that assigns to any subspace S of a topological space the interior of S (the union of all the open sets contained in S) satisfies the three properties (I_1)-(I_3).
Given an interior operator ψ : 2^Y → 2^Y and a subset X of Y, let

$$F_\psi(X) = \{X' \subseteq X : \psi(X') = X'\}$$

denote the collection of subsets of X that are fixed by ψ. We refer to the collection {F_ψ(X) : X ∈ 2^Y} as the fixed sets of ψ. Note that F_ψ(X) ≠ ∅ since ∅ ∈ F_ψ(X) for any interior operator ψ.
The following lemma summarises some basic and elementary properties of interior operators (a proof is provided in the Appendix).

Lemma 1. Let ψ : 2^Y → 2^Y be an interior operator, and let X be a subset of Y.
(iii) An arbitrary collection C of subsets of Y is the collection of fixed sets for some interior operator if and only if ∅ ∈ C and C is union-closed. Moreover, in that case, there is a unique interior operator ψ_C that has C as its collection of fixed sets, and which is determined by:

$$\psi_C(X) = \bigcup \{Z \in C : Z \subseteq X\}$$

for all X ⊆ Y.
Notice that Parts (i) and (ii) of this lemma imply that ψ(X) is the unique maximal fixed set contained within X.
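The description of ψ(X) as the largest fixed set inside X can be checked by brute force on a small ground set. The following Python sketch is illustrative only: the helper names (`powerset`, `psi_from_collection`, `is_interior_operator`) and the example collection are assumptions introduced here, not taken from the paper. It builds the operator as the union of the members of a union-closed collection contained in X, and tests nesting, monotonicity and idempotence over all subsets.

```python
from itertools import combinations

def powerset(ground):
    """All subsets of `ground`, as frozensets."""
    items = list(ground)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def psi_from_collection(collection, x):
    """Union of the members of `collection` that are contained in `x`."""
    result = frozenset()
    for member in collection:
        if member <= x:
            result |= member
    return result

def is_interior_operator(psi, ground):
    """Brute-force check of nesting, monotonicity and idempotence."""
    subsets = powerset(ground)
    nesting = all(psi(x) <= x for x in subsets)
    monotone = all(psi(x) <= psi(y) for x in subsets for y in subsets if x <= y)
    idempotent = all(psi(psi(x)) == psi(x) for x in subsets)
    return nesting and monotone and idempotent

# Hypothetical union-closed collection on Y = {a, b, c} containing the empty set.
Y = frozenset("abc")
C = [frozenset(), frozenset("a"), frozenset("bc"), frozenset("abc")]
psi = lambda x: psi_from_collection(C, x)

fixed_sets = {x for x in powerset(Y) if psi(x) == x}
print(is_interior_operator(psi, Y))  # True: all three properties hold for this C
print(fixed_sets == set(C))          # True: the fixed sets are exactly the members of C
```

The exhaustive check is exponential in |Y|, so it is only meant for sanity-testing small examples.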
Next, consider any function λ : 2^Y → 2^Y that satisfies the properties (I_1) and (I_2) of an interior operator (but not necessarily (I_3)). Define a function ψ_λ : 2^Y → 2^Y by

$$\psi_\lambda(X) = \bigcap_{i \geq 0} H_i(X), \qquad (2)$$

where H_0(X) = X and H_{i+1}(X) = λ(H_i(X)) for all i ≥ 0. Notice that since Y is finite and the sets H_0(X) ⊇ H_1(X) ⊇ ⋯ are nested by (I_1), this sequence must stabilise, and thus ψ_λ(X) = H_n(X) for the first value of n for which H_n(X) = H_{n+1}(X).
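The stabilising iteration H_0(X), H_1(X), … can be sketched in a few lines of Python. This is a minimal illustration rather than the paper's implementation: the function name `psi_lambda` and the toy map `lam` are assumptions made here. The toy map is nested and monotone but not idempotent, so more than one application is needed before the sequence settles.

```python
def psi_lambda(lam, x):
    """Apply lam repeatedly, starting from x, until H_n(x) == H_{n+1}(x)."""
    current = frozenset(x)
    while True:
        nxt = frozenset(lam(current))
        if nxt == current:
            return current
        current = nxt

def lam(x):
    """Toy map (an assumption for illustration): keep 0, and keep any
    other integer only if its predecessor is also in x."""
    return frozenset(v for v in x if v == 0 or (v - 1) in x)

start = frozenset({0, 1, 3, 4})
print(sorted(lam(start)))              # [0, 1, 4]
print(sorted(lam(lam(start))))         # [0, 1]
print(sorted(psi_lambda(lam, start)))  # [0, 1]
```

Termination relies on the nesting property: each application either removes at least one element or leaves the set unchanged, so at most |X| + 1 applications are needed.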

Interior operators arising from directed graphs
Let D = (Y, A) be a finite directed graph with vertex set Y, and for any nonempty subset X of Y, let D|X be the induced sub-digraph on X (i.e., D|X has vertex set X and (u, v) is an arc of D|X if and only if (u, v) ∈ A and u, v ∈ X). We let d^+_D(v) denote the in-degree of vertex v in D, and for v ∈ X, we let d^+_{D|X}(v) denote the in-degree of vertex v in D|X. Let $\binom{Y}{k}$ denote the subsets of Y of size k, and for k ≥ 1, let:

$$C_k(D) = \left\{X \in \binom{Y}{k} : d^+_{D|X}(v) \geq 1 \text{ for all } v \in X\right\}, \qquad C(D) = \{\emptyset\} \cup \bigcup_{k \geq 1} C_k(D).$$

We say that C(D) is trivial if C(D) = {∅}. The following result (particularly Part (iii)) will play an important role in Section 3.3. The proof is provided in the Appendix.
Proposition 2.

Remarks:
• In Part (i), the claim that C(D) being nontrivial implies that D has a directed cycle was noted in [5].
• Proposition 2(iii) fails for k = 1 or k = 2; in fact, Y can be arbitrarily large in these cases (e.g., for k = 1 take the arc set … for any digraph D on vertex set Y.

Moreover, as Y becomes large, the proportion of interior operators on 2^Y that can be realised as ψ_{C(D)} for some D converges to zero. To see this, observe that there are exactly $2^{n^2}$ digraphs on a vertex set Y of size n, and each digraph uniquely determines ψ_{C(D)} (though many digraphs produce the same interior operator). By contrast, the total number of interior operators on 2^Y grows much faster, as the following result shows (a proof is provided in the Appendix).

Proposition 3. For any set Y of size n, there are at least $2^{\binom{n}{\lfloor n/2 \rfloor}}$ interior operators on 2^Y.
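To get a feel for how quickly the lower bound in Proposition 3 overtakes the number of digraphs, the two exponents can be compared directly. The sketch below is an illustration only; the function names are assumptions, and the digraph count is taken to be 2^(n^2) (loops allowed), as read from the text above.

```python
from math import comb

def log2_digraph_count(n):
    """log2 of the number of digraphs (loops allowed) on n labelled vertices."""
    return n * n

def log2_operator_lower_bound(n):
    """log2 of the lower bound on the number of interior operators (Proposition 3)."""
    return comb(n, n // 2)

for n in range(1, 13):
    print(n, log2_digraph_count(n), log2_operator_lower_bound(n))

# Smallest n at which the lower bound strictly exceeds the digraph count.
first_n = next(n for n in range(1, 100)
               if log2_operator_lower_bound(n) > log2_digraph_count(n))
print(first_n)  # 8, since comb(8, 4) = 70 > 64
```

Because the central binomial coefficient grows exponentially in n while n^2 grows polynomially, the gap between the two exponents widens rapidly beyond this point.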

Self-generating autocatalytic networks (RAFs)
A catalytic reaction system (CRS) is a quadruple Q = (X, R, C, F) consisting of a finite nonempty set X of elements (e.g., molecule types) and a finite set R of reactions; here a reaction r ∈ R refers to an ordered pair (A, B) where A and B are multisets of elements from X. In addition, C is a subset of X × R where (x, r) ∈ C has the interpretation that element x 'catalyses' reaction r, and F is a particular subset of X that has the interpretation as a set of elements that are freely available to the system. Accordingly, F is referred to as a food set. For each r ∈ R, the elements x of X for which (x, r) ∈ C are called the catalysts of r. We write

$$a_1 + \cdots + a_k \xrightarrow{[c_1, \ldots, c_r]} b_1 + \cdots + b_l$$

to denote the reaction that has the reactants A = {a_1, …, a_k}, the products B = {b_1, …, b_l}, and the catalysts {c_1, …, c_r}.
Let ρ(r) denote the set of reactants of r (i.e., A, ignoring multiplicities), and let π(r) denote the products of r (i.e., B, ignoring multiplicities). Moreover, for a subset R′ of R, let π(R′) = ⋃_{r ∈ R′} π(r). A subset R′ of R is F-generated if the reactions in R′ can be placed in some linear order r_1, r_2, …, r_k so that ρ(r_1) ⊆ F and for all j between 2 and k we have ρ(r_j) ⊆ F ∪ π({r_1, …, r_{j−1}}). In other words, the reactions in R′ are F-generated if they can proceed in some order so that the reactant(s) of each reaction are available by the time they are first required. We call such an ordered sequence of R′ an admissible ordering.
Finally, given a CRS Q = (X, R, C, F), we say that a subset R′ of R is a RAF (Reflexively Autocatalytic and F-generated set) if R′ is nonempty and is F-generated and, in addition, each reaction r ∈ R′ is catalysed by at least one element in F ∪ π(R′). For any CRS Q, let C^RAF_Q denote the set of RAFs for Q.
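The two conditions in the RAF definition can be tested directly on a small system. The following Python sketch is illustrative: the representation of a reaction as a `(reactants, products, catalysts)` triple, the function names, and the toy CRS are all assumptions introduced here. F-generation is tested greedily, which works because the set of available elements only grows as reactions fire.

```python
def is_f_generated(reactions, food):
    """Greedy test: repeatedly fire any reaction whose reactants are available."""
    available = set(food)
    pending = dict(reactions)
    progress = True
    while pending and progress:
        progress = False
        for name, (reactants, products, _cats) in list(pending.items()):
            if set(reactants) <= available:
                available |= set(products)
                del pending[name]
                progress = True
    return not pending

def is_raf(reactions, food):
    """Nonempty, F-generated, and each reaction catalysed by food or a product."""
    if not reactions or not is_f_generated(reactions, food):
        return False
    pool = set(food)
    for _reactants, products, _cats in reactions.values():
        pool |= set(products)
    return all(set(cats) & pool for _r, _p, cats in reactions.values())

# Hypothetical toy CRS: name -> (reactants, products, catalysts).
FOOD = {"f", "g"}
R = {
    "r1": ({"f", "g"}, {"x"}, {"y"}),
    "r2": ({"x", "f"}, {"y"}, {"x"}),
    "r3": ({"y", "z"}, {"w"}, {"f"}),  # z is never available
}
sub = lambda names: {n: R[n] for n in names}

print(is_raf(sub(["r1", "r2"]), FOOD))  # True
print(is_raf(sub(["r1"]), FOOD))        # False: catalyst y is not produced
print(is_raf(sub(["r2"]), FOOD))        # False: reactant x is not available
print(is_raf(R, FOOD))                  # False: r3 can never fire
```

The order in which the greedy loop removes reactions from `pending` is one admissible ordering of the subset whenever the test succeeds.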
Example 1. Consider the CRS Q = (X, R, C, F) for which X = {f, f′, x, y, z}, F = {f, f′} and the set R of reactions (with a catalyst indicated in square brackets) is given by:

In this case, R has exactly two admissible orderings (r_1, r_2, r_3 and r_1, r_3, r_2), and

The maxRAF interior operator
A basic result is that when a CRS Q has a RAF, it has a unique maximal RAF (which is the union of all the RAFs for Q), denoted maxRAF(Q) [11]. For any subset R′ of R, let Q|R′ denote the CRS obtained from Q by restricting the reaction set to R′, and let φ_Q : 2^R → 2^R be the following function:

$$\varphi_Q(R') = \begin{cases} \mathrm{maxRAF}(Q|R'), & \text{if } Q|R' \text{ has a RAF};\\ \emptyset, & \text{otherwise.} \end{cases}$$

To describe how φ_Q can be viewed as an interior operator, we will first recall some further terminology. Given a subset R′ of reactions R, a subset W of X is said to be R′-closed if the following property holds:
• If a reaction r in R′ has all its reactants in W (i.e., ρ(r) ⊆ W), then all the products of r are also in W (i.e., π(r) ⊆ W).
The union of two R′-closed sets need not be R′-closed; nevertheless, given a nonempty subset W_0 of X, there is a unique minimal R′-closed set containing W_0, denoted cl_{R′}(W_0). This can be computed in polynomial time in the size of the system by constructing a nested increasing sequence of subsets of elements W_0 ⊆ W_1 ⊆ ⋯ where:

$$W_{i+1} = W_i \cup \bigcup_{r \in R' :\, \rho(r) \subseteq W_i} \pi(r),$$

with cl_{R′}(W_0) being the first set W_i for which W_{i+1} = W_i. A nonempty subset R′ of R is then a RAF if and only if, for each r ∈ R′, the reactants of r and at least one catalyst of r are present in cl_{R′}(F). This allows us to express φ_Q as an operator of the form ψ_λ, where λ is a function on 2^R that satisfies the interior operator properties (I_1) and (I_2).
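Let λ_Q : 2^R → 2^R be the function defined by:

$$\lambda_Q(R') = \{r \in R' : \rho(r) \subseteq \mathrm{cl}_{R'}(F) \text{ and } (x, r) \in C \text{ for some } x \in \mathrm{cl}_{R'}(F)\}.$$

The function λ_Q clearly satisfies conditions (I_1) and (I_2). If we recall the definition of ψ_λ from Eqn. (2), the maxRAF operator has a representation in the following result from [18].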
This identity (φ_Q = ψ_λ) allows for a polynomial-time algorithm to compute φ_Q (cf. [18] and the references therein). In particular, a nonempty subset R′ of R is a RAF if and only if φ_Q(R′) = R′. Some new and interesting algebraic (semigroup) properties of the map φ_Q were established recently in [16] (see also [15], which considers a more general notion than a RAF, corresponding to 'pseudo-RAFs' in the RAF literature, and which we do not explore further in this paper).
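The polynomial-time procedure suggested by this identity can be sketched as follows. This is a hedged illustration, not the algorithm of [18] verbatim: it assumes that λ_Q keeps exactly those reactions whose reactants and at least one catalyst lie in the closure of the food set under the current reactions, and the data layout, function names and toy CRS are hypothetical.

```python
def closure(reactions, seed):
    """Smallest superset of `seed` that is closed under the given reactions."""
    closed = set(seed)
    changed = True
    while changed:
        changed = False
        for reactants, products, _cats in reactions.values():
            if set(reactants) <= closed and not set(products) <= closed:
                closed |= set(products)
                changed = True
    return closed

def lam_q(reactions, food):
    """Keep reactions whose reactants and at least one catalyst lie in cl(F)."""
    closed = closure(reactions, food)
    return {name: rxn for name, rxn in reactions.items()
            if set(rxn[0]) <= closed and set(rxn[2]) & closed}

def max_raf(reactions, food):
    """Iterate lam_q until it stabilises; the result is empty if there is no RAF."""
    current = dict(reactions)
    while True:
        nxt = lam_q(current, food)
        if len(nxt) == len(current):  # nxt is a subset, so equal size means equal
            return set(current)
        current = nxt

# Hypothetical toy CRS: name -> (reactants, products, catalysts).
FOOD = {"f", "g"}
R = {
    "r1": ({"f", "g"}, {"x"}, {"y"}),
    "r2": ({"x", "f"}, {"y"}, {"x"}),
    "r3": ({"y", "z"}, {"w"}, {"f"}),  # reactant z is never available
    "r4": ({"f"}, {"v"}, {"w"}),       # catalyst w is never produced
}
print(sorted(max_raf(R, FOOD)))        # ['r1', 'r2']
print(max_raf({"r4": R["r4"]}, FOOD))  # set(): no RAF
```

Each pass either removes at least one reaction or stops, so there are at most |R| + 1 passes, each of which computes one closure.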

RAFs in elementary CRS systems
At this point, it is helpful to consider a very special type of catalytic reaction system. A CRS Q = (X, R, C, F) is said to be elementary if each of its reactions has all its reactants present in the food set (formally, ρ(r) ⊆ F for each r ∈ R).
Given an elementary CRS Q = (X, R, C, F), define a digraph D(Q) = (V, A_Q) to have vertex set V = R and an arc from r to r′ (r ≠ r′) if a product of r catalyses r′; in addition, we place an arc from r to itself if either a product of r or an element of F catalyses r.
The following result is easily verified from the definitions (or see [17], Theorem 2.1) and describes the set of RAFs of an elementary CRS Q (i.e., C RAF Q ) in terms of the fixed sets of the interior operators arising from digraphs (from Section 2.1, and recalling the definition of C(D)).This will be applied in the next section.
An immediate consequence of this lemma and Proposition 2(ii) is the following.
Corollary 1. If Q is an elementary CRS which has a RAF, then for any two RAFs of …

Note that this corollary can fail without the assumption that Q is elementary; Example 1 provides a counterexample for the two RAFs R′ = {r_1} and R″ = R = {r_1, r_2, r_3}. If one removes the 'elementary' restriction on a CRS, the class of possible set systems that can be realised as RAFs of some suitably chosen CRS becomes larger and less tractable. We investigate this further in the next section, where we will apply Lemma 2 and the earlier Proposition 2(iii).
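The link between RAFs of an elementary CRS and digraph fixed sets can be explored on a toy example. The sketch below rests on an assumed reading of that correspondence, namely that the RAFs are the nonempty sets of reactions in which every reaction has an incoming arc of D(Q) from within the set (loops included). The function names, the data layout and the example system are hypothetical.

```python
from itertools import combinations

def digraph_of_elementary_crs(reactions, food):
    """Arc r -> r2 when a product of r catalyses r2 (r == r2 gives a loop);
    also a loop at r when a food element catalyses r."""
    arcs = set()
    for r, (_rho, products, _cats) in reactions.items():
        for r2, (_rho2, _pi2, cats2) in reactions.items():
            if set(products) & set(cats2):
                arcs.add((r, r2))
    for r, (_rho, _pi, cats) in reactions.items():
        if set(cats) & set(food):
            arcs.add((r, r))
    return arcs

def all_have_in_arc(vertices, arcs):
    """True if every vertex has an in-arc from within `vertices`."""
    vs = set(vertices)
    return all(any((u, v) in arcs for u in vs) for v in vs)

def is_raf_elementary(subset, reactions, food):
    """RAF test for an elementary CRS: F-generation is automatic, so only
    catalysis by food or by products of the subset is checked."""
    pool = set(food)
    for r in subset:
        pool |= set(reactions[r][1])
    return bool(subset) and all(set(reactions[r][2]) & pool for r in subset)

# Hypothetical elementary CRS: every reactant is the food element f.
FOOD = {"f"}
R = {
    "r1": ({"f"}, {"a"}, {"b"}),
    "r2": ({"f"}, {"b"}, {"a"}),
    "r3": ({"f"}, {"c"}, {"c"}),  # catalysed by its own product
    "r4": ({"f"}, {"d"}, {"e"}),  # catalyst e is never produced
}
arcs = digraph_of_elementary_crs(R, FOOD)
candidates = [frozenset(s) for k in range(1, len(R) + 1)
              for s in combinations(R, k)]
rafs = {s for s in candidates if is_raf_elementary(s, R, FOOD)}
fixed = {s for s in candidates if all_have_in_arc(s, arcs)}
print(rafs == fixed)                     # True on this example
print(sorted(sorted(s) for s in rafs))   # [['r1', 'r2'], ['r1', 'r2', 'r3'], ['r3']]
```

Agreement on one small system is only a sanity check of the assumed correspondence, not a proof of it.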

Representing an interior operator as a RAF operator
The main results in RAF theory that are generic (i.e., which hold regardless of the particular choices or restrictions on F, X, R or C in Q) can be established by using only the property that the maxRAF operator φ_Q is an (efficiently computable) interior operator (see [18]). This raises the question as to whether theorems that hold true for all RAFs can always be established from (just) this generic property. In other words, can every interior operator on every finite set Y be realised as the maxRAF operator associated with a suitably chosen catalytic reaction system Q = (X, R, C, F) in which Y is identified (via a bijection) with the set R of reactions in Q? We show that the answer is 'no' by describing a generic result in RAF theory that is not a consequence of the interior operator property of the maxRAF operator.
More precisely, we say that an interior operator ψ on 2^Y has a RAF-realisation if there exists a CRS Q = (X, R, C, F) and a bijection b : Y → R such that for each Y′ ∈ 2^Y we have:

$$\varphi_Q(R') = \beta(\psi(Y')),$$

where R′ = β(Y′) and where β : 2^Y → 2^R is the natural bijection induced by b. In other words, the corresponding diagram commutes for each Y′ ⊆ Y (i.e., φ_Q ∘ β = β ∘ ψ). Note that no restriction is placed on the sets X, F, and C in Q; in particular, they could be arbitrarily large sets.
We now show that such a realisation is not always possible, as described in Proposition 5(ii) below. For this result, an irreducible RAF (iRAF) for a CRS Q is a RAF R′ with the property that it contains no (nonempty) RAF as a proper subset (i.e., φ_Q(R′ \ {r}) = ∅ for each r ∈ R′).

Proposition 5. (i) For any integer k ≥ 3 and any CRS Q = (X, R, C, F) with |R| ≥ k^3 − 3k^2 + 4k, not all subsets of R of size k are iRAFs. (ii) For any finite set R of size at least 12, there exists an interior operator ψ on 2^R that does not have a RAF-realisation.
Proof of Proposition 5: Part (i): Let m = k^2 − 3k + 4, and suppose that |R| ≥ km and every subset of R of size k is an iRAF; we will derive a contradiction. Since |R| ≥ km there exist m disjoint subsets of R of size k, call them R_1, …, R_m. Since these are subsets of R of size k they are iRAFs for Q. Now, any RAF requires at least one reaction to have all its reactants in the food set F (this can easily be verified by considering the first reaction in any admissible ordering of the reactions in a RAF). Thus, we can select one such reaction r_i from R_i (for each i), to obtain a set R_m = {r_1, …, r_m} of m (distinct) reactions, with each reaction in R_m having all its reactants in F. … This is an elementary CRS, and so, by Lemma 2, the set of RAFs of …

It is easily verified that ψ satisfies properties (I_1), (I_2) and (I_3) and so is an interior operator, but ψ has no RAF-realisation by Part (i). □

Remarks:
• The condition that k ≥ 3 is required in Proposition 5(i) since for k ≤ 2 it is easy to construct CRS systems with an arbitrarily large set of reactions and with all subsets of R of size k being iRAFs (based on the second remark following Proposition 2).
• Note also that the value 12 in Proposition 5(i) (when k = 3) can be reduced to 4 if one restricts to RAF representations within elementary CRS systems. However, without that restriction, Proposition 5(i) does not hold if 12 is replaced by 4. An example is provided by the CRS Q consisting of X = {f, c_1, c_2, c_3, γ, x, y, z}, F = {f} and R comprising the four catalysed reactions:

For this system, each of the four subsets of R of size 3 is an iRAF of Q = (X, R, C, F).
It is possible that the value of 12 in Proposition 5(i) (when k = 3) could be reduced further (or that the lower bound of 4 provided by the example above could be increased); however, this would require more elaborate arguments.
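The arithmetic behind the threshold in Proposition 5(i) can be made explicit. The worked step below assumes m = k^2 − 3k + 4, which is the value for which km equals the bound k^3 − 3k^2 + 4k in the proposition, and compares m with the bound 1 + (k − 1)(k − 2) from Proposition 2(iii).

```latex
\begin{align*}
  k\,m &= k\,(k^2 - 3k + 4) = k^3 - 3k^2 + 4k, \\
  1 + (k-1)(k-2) &= k^2 - 3k + 3 = m - 1 < m, \\
  k = 3:\quad m &= 4, \qquad k\,m = 12, \qquad 1 + (k-1)(k-2) = 3 .
\end{align*}
```

So m always exceeds the bound from Proposition 2(iii) by exactly one, and the case k = 3 gives the value 12 that appears in Proposition 5(ii).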

Concluding comments
Proposition 2 provides set-theoretic necessary conditions for a union-closed collection of sets to be realisable by a digraph. A natural question is whether there is a set-theoretic characterisation of the class of union-closed set systems that are realisable by a digraph. A more difficult task would be to characterise the set systems that are realisable as the RAFs of some CRS. Related to the (still open) union-closed conjecture [1] is the question of whether there is always a reaction that lies in at least half the RAFs (for either an elementary or general CRS). Although we have focused on applications of interior operators arising from digraphs to autocatalytic networks, other properties of interior operators realisable by graph-based processes may also be relevant to various applications (e.g., in investigating the fixed sets present within digraph models of neuronal networks of the type discussed in [10]).
Proof of Proposition 2: Part (ii): Suppose there is no vertex w ∈ W \ U for which U ∪ {w} ∈ C(D). Then there is no arc from any vertex in U to a vertex in W \ U. However, every vertex in W \ U has an incoming arc from some vertex in W, and therefore, it has an incoming arc from some vertex in W \ U. Thus D|(W \ U) has the property that every vertex in this induced graph has in-degree at least 1, so W \ U ∈ C(D).
Part (iii): Let D = (Y, A), and suppose that C_k(D) = $\binom{Y}{k}$ and C_j(D) = ∅ for all 1 ≤ j < k, where k ≥ 3. We first show that this implies that d^+_D(v) ≤ k − 2 for each vertex v of D. Suppose, to the contrary, that d^+_D(v′) ≥ k − 1 for some vertex v′, and so there is a subset X′ of size at least k − 1 for which (x′, v′) ∈ A for each x′ ∈ X′ (note that v′ ∉ X′, since C_1(D) = ∅ rules out loops). Let X″ be any subset of X′ of size exactly k − 1. Since C_{k−1}(D) = ∅ and |X″| = k − 1, at least one element x″ ∈ X″ has no incoming arc from any other vertex in X″, which means that (v′, x″) ∈ A, since X″ ∪ {v′} ∈ C_k(D). On the other hand, (x″, v′) ∈ A (by definition of X″), which implies that {v′, x″} ∈ C_2(D), providing the required contradiction since C_2(D) = ∅ (since k ≥ 3). Thus each vertex v in D has in-degree at most k − 2, as claimed.
If we now let n = |Y| then, since |A| = Σ_{v ∈ Y} d^+_D(v), we obtain:

$$|A| \leq n(k-2). \qquad (4)$$

Now consider the set Ω of pairs (S, a) where S is a subset of k vertices from Y, and a is an arc between any two vertices of S. Formally, Ω = {(S, a) : S ∈ $\binom{Y}{k}$; a ∈ A ∩ (S × S)}. We count this set in two ways. Since n = |Y|, the number of choices for S is $\binom{n}{k}$. Moreover, for each such set S, there are precisely k arcs that form a cycle involving the k elements in S, since: (a) if any more arcs were present between the vertices of S then a set in C_j(D) for some j < k would appear, and (b) if no cycle was present involving all elements of S, then S would not lie in C_k(D), and both of these two possibilities are excluded by the two assumptions stated in Part (iii). In summary,

$$|\Omega| = k\binom{n}{k}. \qquad (5)$$

We can also count Ω by first selecting an arc (u, v) from A and counting the number of sets S ∈ $\binom{Y}{k}$ that contain u and v. By the assumptions in Part (iii), each subset of Y of size k induces a unique cycle through all the vertices (and with no other arcs present between the vertices), so the number of sets S that can be chosen for (u, v) is $\binom{n-2}{k-2}$. Thus we have:

$$|\Omega| = |A|\binom{n-2}{k-2}. \qquad (6)$$

Combining Eqns. (4), (5) and (6) gives:

$$k\binom{n}{k} \leq (k-2)\binom{n-2}{k-2}\, n,$$

which simplifies to n ≤ 1 + (k − 1)(k − 2), as claimed. □

Proof of Proposition 3: Let A = {U ⊂ Y : |U| = ⌊n/2⌋}, which is an antichain in the poset 2^Y (partially ordered by set inclusion) of size $\binom{n}{\lfloor n/2 \rfloor}$ (A is also a largest antichain by Sperner's theorem). Let S be a subset of A, and let C[S] be the collection of subsets of Y consisting of ∅, the sets in S, and all possible unions of the sets from S. In this case, C[S] satisfies the conditions of Lemma 1(iii) and so there is a unique interior operator ψ_{C[S]} that has C[S] as its collection of fixed sets. Moreover, the collection of minimal nonempty fixed sets of ψ_{C[S]} is precisely the sets in S, so if S ≠ S′, then ψ_{C[S]} ≠ ψ_{C[S′]}. Since there are $2^{\binom{n}{\lfloor n/2 \rfloor}}$ choices for S, this completes the proof. □

… and no subsets of size less than k, and so we can apply Proposition 2(iii) (with Y = R_m) to deduce that m = |R_m| ≤ 1 + (k − 1)(k − 2). But this contradicts the inequality m = |R|/k = k^2 − 3k + 4 > 1 + (k − 1)(k − 2). Part (ii): Put k = 3 in Part (i) and consider the following map ψ : 2^R → 2^R:

It follows from Proposition 2 that not every interior operator on 2^Y can be realised as ψ_{C(D)} for some digraph D. For example, if we let Y = {a, b, c} and take the union-closed set system C = {∅, {a}, {a, b, c}} then C cannot equal C(D) for any digraph D by Proposition 2(ii). Alternatively, consider the union-closed set system C^+_k = {X ∈ 2^Y : |X| ≥ k} ∪ {∅}, where k ≥ 3. This satisfies the two assumptions in Proposition 2(iii) and so for any set Y with |Y
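The three-element example C = {∅, {a}, {a, b, c}} is small enough to check exhaustively. The following Python sketch is an illustration under stated assumptions: it takes C(D) to consist of ∅ together with every vertex set whose induced sub-digraph has in-degree at least 1 at each vertex (loops counting as in-arcs), and the function names and the control collection are hypothetical. It searches all 2^9 digraphs on three labelled vertices.

```python
from itertools import combinations, product

def fixed_sets(vertices, arcs):
    """C(D): the empty set plus every vertex set whose induced sub-digraph
    has in-degree at least 1 at each vertex (loops count as in-arcs)."""
    found = {frozenset()}
    for k in range(1, len(vertices) + 1):
        for subset in combinations(vertices, k):
            if all(any((u, v) in arcs for u in subset) for v in subset):
                found.add(frozenset(subset))
    return found

def realisable(target, vertices):
    """Search all digraphs (loops allowed) on `vertices` for one with C(D) == target."""
    pairs = list(product(vertices, repeat=2))
    for mask in range(2 ** len(pairs)):
        arcs = {p for i, p in enumerate(pairs) if (mask >> i) & 1}
        if fixed_sets(vertices, arcs) == target:
            return True
    return False

Y = ("a", "b", "c")
target = {frozenset(), frozenset("a"), frozenset("abc")}
print(realisable(target, Y))   # False: no digraph on {a, b, c} has this C(D)

# Control: a chain of fixed sets that a loop at a plus arcs a->b, b->c produces.
control = {frozenset(), frozenset("a"), frozenset("ab"), frozenset("abc")}
print(realisable(control, Y))  # True
```

The search is exponential in |Y|^2, so it is only practical for very small vertex sets, but it gives an independent check of the example.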