Genetic recombination as a Generalised Gradient Flow

It is well known that the classical recombination equation for two parent individuals is equivalent to the law of mass action of a strongly reversible chemical reaction network, and can thus be reformulated as a generalised gradient system. Here, this is generalised to the case of an arbitrary number of parents. Furthermore, the gradient structure of the backward-time partitioning process is investigated.


Introduction
Genetic recombination describes the reshuffling of genetic information that occurs during the reproductive cycle of (sexual) organisms; it is one of the major mechanisms that maintain genetic diversity within populations. One of the standard models for its description is the deterministic recombination equation in continuous time, in the following simply referred to as recombination equation; for background, see [7]. This equation describes the evolution of the distribution of types in a (haploid) population under the assumption of random mating, while neglecting stochastic fluctuations.
In the case of finite sets of alleles at an arbitrary (but finite) number of sites, the recombination equation was reinterpreted by [12] as the law of mass action of a network of chemical reactions between gametes. This reaction network was shown to be strongly reversible and general theory [14,16] on chemical reaction networks thus implies that it admits a representation in terms of a generalised gradient system, with respect to entropy [11]. This strengthens an earlier result by Akin [1, Thm. III.2.5] that entropy is a strong Lyapunov function for recombination; a somewhat weaker statement can be found in [13,Thm. 6.3.5].
A generalised version of the model, which allows for a more general reproduction mechanism (involving an arbitrary number of parents) as well as more general (not necessarily discrete) type spaces, was considered in [5]. There, the authors reduced the original measure-valued, infinite-dimensional equation via a suitable ansatz function to a finite-dimensional, albeit still nonlinear, system. This system was then analysed using lattice-theoretic techniques, leading to an explicit recursion formula for its solution. A somewhat different, but simpler approach can be found in [2], which relates the evolution of the type distribution, forward in time, to an ancestral partitioning process backward in time. More precisely, the solution of the recombination equation is expressed in terms of the solution of the (linear) differential equation for the law of the partitioning process.
One obvious question is now whether this more general model can also be represented as a strongly reversible chemical reaction network, and, consequently, as a generalised gradient system like the more classical version, which involves only two parents and finite type spaces.
We shall see that, in the case of finite type spaces, the answer is affirmative, generalising the results of Müller and Hofbauer [12] to the multi-parent case. In addition, we will that the dynamics of the law of the partitioning process [2,4], which is independent of the type space, can be rewritten as as a generalised (linear) gradient system.
Finally, we reconsider the finite-dimensional, nonlinear system from [5] and show that it, too, can be rewritten in terms of a law of mass action of a chemical reaction network, which structurally resembles the one mentioned above for finite type spaces; however, it is not clear whether it is indeed a gradient system, due to the loss of reversibility incurred by the loss of information when transitioning from types to partitions.
The paper is organised as follows. First, we recall and explain the recombination equation for multiple parents. The connection to chemical reaction networks is established in Section 3, for finite type spaces. Section 4 contains the results on the gradient structure of this network, and Section 5 explains the gradient structure of a particular class of Markov chains, which covers the partitioning process. Finally, Section 6 contains the reformulation of the nonlinear system from [5] as a law of mass action.

The recombination equation
For our purposes, a genetic type is a sequence x = (x 1 , . . . , x n ) ∈ X := X 1 × · · · × X n of fixed length n, where X 1 , . . . , X n are locally compact Hausdorff spaces and the evolution of the (gametic) type distribution of the population is modelled as a differentiable one-parameter family ω = (ω t ) t 0 of (Borel) probability measures on X. This generality is useful in the context of quantitative genetics, when modelling the evolution of quantitative, polygenic traits such as body size, brain volume, growth rate or milk production; cf. [7,Ch. IV].
To understand the dynamics of ω, let us start on the level of individuals. When two or more parents jointly produce an offspring, a partition of S := {1, . . . , n}, the set of genetic sites (or loci ), describes how the type of that offspring is pieced together from the types of its parents; recall that a partition of a set M is a collection of pairwise disjoint, non-empty subsets, called blocks, whose union is M ; whenever we enumerate the blocks of a partition of S, that is, whenever we write A = {A 1 , . . . , A |A| }, where |A| is the number of blocks in A, we order the blocks such that A 1 is the block that contains 1 and, for all 2 k |A|, A k is the block that contains the smallest element not contained in k−1 j=1 A j . Formally, given k parent individuals of types for all 1 i k), and a partition A = {A 1 , . . . , A k } of S into k blocks, the type of the offspring is Here, for each A ⊆ S, π A is the projection (x i ) i∈S → (x i ) i∈A and the symbol ⊔ is used to denote the joining of gene fragments, respecting the order of the sites; given pairwise disjoint subsets U 1 , . . . , U k of S and sequences x (ℓ) in π U ℓ (X) with 1 ℓ k, we write for the sequence indexed by U 1 ∪ . . . ∪ U k that has at site i the letter x (j) i , where U j is the unique subset that contains i. In particular, in Eq. (1), y = (y 1 , . . . , y n ) where For any U ⊆ S, we denote by the marginal type space with respect to U . In the following, we denote by P (S) the set of all partitions of S, not to be confused with the set P(X) of all (Borel) probability measures on X.
To express the effect of recombination on the type distribution in a concise way, we define for any A ∈ P (S) a (nonlinear) operator on P(X) by Here, denotes measure product and the dot denotes the push-forward of probability measures; i.e., π A .ν(E) := ν π −1 A (E) , for any Borel measurable subset E ⊆ X A and ν ∈ P(X). The operators R A are called recombinators [3]. Clearly, R A (ν) is the distribution of the type of the joint offspring of |A| parents whose types are drawn independently from ν, where the letters at two different sites k and ℓ come from the same parent if and only if k and ℓ are in the same block of A; we call such an offspring A-recombined. If X is finite, the recombinator can also be written as follows.
Lemma 1. Assume that X is finite. Then, for all ν ∈ P(X) and all A = {A 1 , . . . , A k } ∈ P (S), we have .
Proof. Let us writeR for the map on P(X) defined by the right-hand side. Then, for all y ∈ X,R which implies the identity claimed.

Remark 2. As expressions of the form
are quite cumbersome, we simplify the notation by formally identifying each element m in a finite set M with the associated point (or Dirac) measure δ m . Under this convention, the statement from Lemma 2.2 reads Put differently, we identify the vector space of finite signed measures on M with the vector space R M of formal sums of its elements. Unless stated otherwise, all vectors are interpreted as column vectors. This entails that the standard scalar product v, w of any two vectors v and w can be written as v T w (where T denotes transposition) whereas vw T denotes the matrix that maps any other vector u to w, u v. ♦ Before we continue, we will need to recall from [2] a few additional notions around partitions. Given two partitions A and B, we say that A is finer than B and write A B if and only if every block of A is a subset of some block of B; this defines a partial order on P (S), and we denote the unique minimal (maximal) element by 0 := {{i} | i ∈ S} (1 := {S}). The coarsest common refinement of A and B is defined as it is the largest (coarsest) element of P (S) smaller (finer) or equal to both A and B. Furthermore, given a partition A of S and some subset U ⊆ S, we denote by We are now ready to state the recombination equation [5, Eq. (7)], where the ̺(A) are non-negative real numbers, called recombination rates. This equation expresses that in each infinitesimal time interval [t, t+ dt], for each A ∈ P (S), each individual is with probability ̺(A) dt replaced by a new A-recombined offspring, distributed as R A (ω t ).
In other words, the current type distribution ω t is replaced by the convex combination  (2) has the form where the coefficients a t (A) satisfy the coupled nonlinear differential equationṡ with initial value a 0 (1) = 1 and a 0 (A) = 0, otherwise. The sums run over all partitions of U , where the underdot marks the summation variable.
We may also (compare Remark 2) rewrite this system in vector-notation as follows.
where a t := A∈P (S) a t (A)A = A∈P (S) a t (A)δ A . This system can be solved recursively by lattice-theoretic means; compare [5]. We will consider it in greater detail in Section 6 and show that it is the law of mass action of a chemical reaction network (compare Section 3).
While the system (6) is finite-dimensional, it is still highly nonlinear. In fact, Eq. (2) can also be related to a linear system, via an ancestral partitioning process that runs backward in time. Here, we only give a brief sketch of the idea; for further background on genealogical methods in the context of recombination, the reader is referred to the excellent review [4].
The partitioning process is a Markov chain Σ = (Σ t ) t 0 in continuous time with state space P (S) and is most easily understood when started in Σ 0 = 1. Assume that we want to sample the type of a single individual (Alice, say) that is alive at time T ; Now, for 0 t T , each block σ of Σ t corresponds to an independent ancestor of Alice that lived at time T − t and from whom she inherited the letters at the sites in σ. At time T = T − 0, Alice herself is alive and corresponds to the unique block of Σ 0 = 1.
To understand the time-evolution of Σ, keep in mind that every block in Σ t corresponds to one of Alice's ancestors. Recall that, by our interpretation of Eq. (2), every individual alive at time T − t was with probability ̺(B) dt a B-recombined offspring of parents alive at time T − t − dt. This means for the evolution of the partitioning process that a block A ∈ Σ t is in the infinitesimal time step from t to t + dt, with probability ̺(B) dt, replaced by the collection of blocks of induced partition B A , which reflects the partitioning of the genome of the corresponding ancestor of Alice across its own parents. More formally, Σ is a Markov chain in continuous time with rate matrix Q, where Here, the marginal recombination rates are given by Formalising our verbal discussion above gives the following stochastic representation of the solution ω of Eq. 2. More generally for arbitrary starting values, one has the duality relation Put differently, Eq. (8) implies that any solution of Eq. (2) is of the form with initial value 1.

Chemical reaction networks
Let us recapitulate a few basic notions in chemical reaction network theory, taylored to our purposes. For an introduction, see [10].
Let S (not to be confused with S, the set of sequence sites) be a finite set, the elements of which will be thought of as the reacting species in a chemical reaction network (CRN), that is, a finite collection of chemical reactions, which are represented by symbolic expressions of the form Here, the r i and s i are reacting species (not necessarily distinct) and κ > 0 is the reaction constant. The left and right-hand sides in Eq. (9) are called the complexes of substrates and products. In our setting, we will always have m 1 = m 2 = m, as we will see later.
Remark 5. Recall from Remark 2 that we formally identified the elements of any finite set with the corresponding point (or Dirac) measures. In this sense, the addition in Eq. (9) can be understood as addition of vectors in the space of signed measures on S.
Of particular interest are strongly reversible CRNs. They are usually defined as CRNs in which the forward reaction constant agrees with the backward reaction constant for every reaction. In the present setting, where we think of reactions as unidirectional, it is more convenient to phrase this slightly differently. Definition 6. A CRN is called strongly reversible if it can be partitioned into pairs, each consisting of a reaction, together with its backward reaction, Note that the reaction constant is the same for both reactions. ♦ Given a CRN, it is natural to inquire about the dynamics of the probability vector of normalised concentrations of species. As the left and right-hand sides in Eq. (9) contain the same number of reacting species, the total mass is preserved and may therefore be normalised to one. The law of mass action translates the collection of formal expressions (9) into a system of coupled differential equations for c = (c t ) t 0 . It assumes that each reaction occurs with a rate that is proportional to the concentration of each of the substrates, and hence to their product; the proportionality factor is the reaction constant κ in Eq. (9). As each reaction decreases the concentration of substrates and increases the concentration of products, we obtain the following system of ordinary differential equations, where, again, the reacting species s 1 , . . . , s m and r 1 , . . . , r m are identified with the corresponding point measures δ s 1 , . . . , δ sm and δ r 1 , . . . , δ rm and the sum is taken over all reactions that make up the CRN. We refer the interested reader to [9, Ex. 11.1.C] for a probabilistic variation on this theme. We now return to recombination. In [12], genetic recombination is treated as a CRN with the types as reacting species, in the special case of two parents. For example, recombination This describes the process of recombination at the molecular level; first, the parental sequences (x 1 , x 2 , x 3 ) and (y 1 , y 2 , y 3 ) are split in two, according to A. Then, two new sequences are obtained by joining the leading part of one sequence with the trailing part of the other, and vice versa. For each (ordered) pair of types and each partition A, the reaction constant is or One way to resolve this ambiguity is to order the substrates and define the reaction accordingly. Thus, there may be many different reactions with a common complex of substrates. More precisely, for every C ∈ P (S) and each ordered tuple x (1) , . . . , x (|C|) ∈ X C , we define a chemical reaction via the following graphical construction, illustrated in Figure 1. First, just as in the two-parent case, the |C| types are broken up into their subsequences π C j (x (i) ) over the blocks of C. Then, they are arranged on a two-dimensional, |C|-periodic grid (or discrete torus), where π C j x (i) is placed in the i-th column and j-th row. Finally, the products are formed by joining the fragments along each diagonal line, running from north-west to southeast through the grid. Alternatively, one may think about moving the i-th row i − 1 places to the left, and then joining the fragments in each column. More formally, every choice of C and x (1) , . . . , x (|C|) defines a reaction, |C| j=1 x (j) where the indices are to be read modulo |C|.
Notice that the right-hand side depends on the order of the substrates, while the left-hand side is independent of it. For instance, in our earlier example with three sites and parents (that is, C is 0, the trivial partition into singletons), the choice x (1) = x, x (2) = y, x (3) = z leads to Eq. (10), while exchanging the roles of the second and third type leads to Eq. (11).
Proof. We will show that for all C ∈ P (S) and all ν ∈ P(X), we have Recall that, by Lemma 1, we have Here, we obtain the second equality by replacing the product ν x (1) · . . . · ν(x (|C|) ) with its cyclic permutation ν x (1−j+1) · . . . · ν x (|C|−j+1) , and subsequently renaming the indices; recall that indices are to be read modulo |C|. Similarly, keeping in mind that x∈X ν(x) = 1 because ν is a probability measure, we obtain, which completes the argument.
We have thus seen that, in the case of finite type spaces, genetic recombination can be reinterpreted as a CRN, also in the case of an arbitrary number of parents. In fact, this network is strongly reversible.
Theorem 8. The CRN from Theorem 7 is strongly reversible in the sense of Definition 6.
Proof. Let C be fixed. Define ϕ : Note that ϕ x (1) , . . . , x (|C|) contains the products in the reaction defined by C together with x (1) , . . . , x (|C|) , in reverse order (compare (12)). As short reflection on Fig. 1 reveals that ϕ is an involution and therefore partitions X |C| into orbits that contain either one or two elements. Consider first an orbit with two elements x (1) , . . . , x (|C|) and y (1) , . . . , y (|C|) . Then, the associated reactions form a forward-backward reaction pair, On the other hand, the reaction defined by a fixed point of ϕ is void, since its product and substrate complex agree.
Next, we consider the connection to gradient systems.

Gradient systems
For this section, we need a few basic notions from differential (particularly Riemannian) geometry, which we recall here for the convenience of the reader. For further background, we refer the reader to [15], in particular Chapter 5. For a real-valued differentiable function V , defined on (some subset of) R d , and a function C with the same domain and values in the symmetric positive semi-definite matrices, we call the ordinary differential equation a generalised gradient system (with respect to the potential V ). Here, is the nabla symbol and {ê 1 , . . . ,ê d } denotes the standard basis of R d .
Given x ∈ R d , a vector v in T x (R d ), the tangent space of R d at x, and a continuously differentiable curve γ in R d with γ(0) = x and γ ′ (0) = v, recall that the directional derivative of V in direction v is given by The one-form dV is called the exterior derivative of V ; note that it can be defined analogously for any real-valued function on a smooth manifold, and, in particular, does not depend on the Euclidean structure of R d . One has, by an application of the chain rule, where ·, · denotes the standard scalar product on R d . Replacing the standard scalar product by a general Riemannian metric ·, · x , (that is, a positive definite, symmetric bilinear form on the tangent space, which varies smoothly, depending on the base point), Eq. (14) can be used to define the gradient of V with respect to this metric [15,Ex. 108], denoted by grad ·,· (V ); it is the unique vectorfield that satisfies for all x and v. Geometrically, this means that, unless x is an equilibrium, grad ·,· (V )(x) points in the direction of steepest ascent of V at point x, with respect to the chosen metric.
In particular, if C(x) in Eq. (13) is invertible and we consider the metric, Thus, Eq. (13) can be thought of as a gradient system in the classical sense, if we replace the Euclidean metric on R d by a Riemannian one, at least in the case that C(x) is invertible. The interpretation is somewhat more delicate when C(x) fails to be invertible. Intuitively, one might think of the kernel of C(x) as a set of forbidden directions, and try to restrict attention to submanifolds which partition the space and are in each point x tangent to the image of C. However, this interpretation is only valid when the image of C is integrable in the sense that whenever Y and Z are two vectorfields such that Y (x) ∈ Im C(x) and Z(x) ∈ Im C(x) for all x, then also [Y, Z](x) ∈ Im C(x) for all x, where [Y, Z] denotes the Lie bracket of Y and Z; this is the content of Frobenius' theorem [15, Thm. 1.9.2]. The situation when Im C is not integrable can be understood via the theory of sub-Riemannian manifolds. Roughly speaking, this theory is concerned with Riemannian metrics which may take the value +∞; see [6] for an overview.

Remark 9.
To demonstrate that the condition of integrability is not trivial, consider the following two vectorfields on R 3 .
Note that [X 1 , X 2 ] = ∂ ∂x 3 is nowhere in the span of X 1 and X 2 ; thus, proving integrability in our case (and for the gradient systems arising in chemical reaction network theory in general) might be an interesting question in its own right. ♦ We remark that, under the assumption that (13) has a unique equilibrium, the potential V is always a strong, (global) Lyapunov function (by which we mean that V is strictly increasing along non-constant solutions). This is because by the positive semi-definiteness of C(x). Equality holds if and only if ∇V (x) is in the kernel of C(x) (implying thatẋ = 0), hence, if and only if the system is in equilibrium.
We have seen in the previous section that the general recombination equation, interpreted as a chemical reaction network, is strongly reversible. Thus, it is a gradient system in the sense of Eq. (13), by standard theory; compare [16,14], where this is proved in much greater generality. For the sake of completeness, we include the simple proof of this fact, in the special case needed for our purposes.
Theorem 10. The law of mass action for any strongly reversible CRN can be written as a generalised gradient system,ċ is called the negative free energy and C is a continuous function on P(S), which is smooth on its interior and takes values in the positive semi-definite matrices.
Proof. Due to strong reversibility (see Definition 6), the law of mass action takes the forṁ where the outer sum is taken over all forward-backward reaction pairs in the network. Define for x, y 0, (15) L(x, y) := x − y log (x) − log (y) .
It is a straightforward exercise to verify that L defines a continuous, non-negative function on R 2 0 , which is smooth on R 2 >0 . Note that Thus, setting (for each forward-backward reaction pair) we see by the multiplication rule for the logarithm, that Here, we also used that s T ∇F (c) = − log c(s) for all s ∈ S. Since a non-negative linear combination of positive semi-definite, symmetric matrices is symmetric and positive semidefinite, the claim follows.
Remark 11. Since the total mass, s∈S c t (s), is preserved in our case, we may replace the negative free energy F in Theorem 10 by the entropy For the solution of the recombination equation (Eq. (2)) this has the following consequence. It is a well-known fact that, when considering the set of probability measures on a product space which all have the same marginals, the product measure of these marginals is a maximiser for the entropy. As the one-dimensional marginals are preserved under recombination (in absence of mutation or selection), the fact that Eq. (2) can be written as a with L defined in Eq. (15) Next, we treat the slightly more complicated example of three diallelic loci (but still 2 parents); compare [12,Ex. 2]. Again, we denote the two alleles by 0 and 1. We denote the type (i 1 , i 2 , i 3 ) by g 4i i +2i 2 +i 3 ; in other words, the index of a type is just the type itself, read as a binary integer. For example, we refer to (0, 0, 0) by g 0 and to (1, 0, 1) by g 5 , and identify g i with the canonical i + 1-th basis vector of R 8 . Now, by the proof of Theorem 10, we associate to each reaction pair of the form , if g i−1 and g j−1 are on the same side of (16), , if g i−1 and g j−1 are on different sides of (16), 0, otherwise and C(ν) is then given by summing these matrices over all forward-backward reaction pairs in the network. To keep things tidy, instead of summing over all forward-backward reaction pairs, we write down the sums over each different linkage class seperately; this allows to take advantage of the following symmetry implied by our choice of indices. Namely, as 1s are only exchanged between gametes but their relative positions in the sequence remains unchanged, the sum of indices is the same for each complex that are in the same linkage class, of which there are seven; six consisting of only one forward-backward reaction pair each, and one consisting of six such pairs. Assume that M belongs to a reaction within a complex where the indices sum to ℓ. Then, it is easy to see that we have where denotes the reversal of columns and denotes the reversal of rows within a matrix and A is a ℓ+1 2 × ℓ+1 2 matrix if ℓ 7 and a 14−ℓ+1 2 matrix if ℓ > 7. For ℓ even , M is of the form  where A is now an ℓ 2 × ℓ 2 matrix if ℓ 7 and 14−ℓ 2 if ℓ > 7; Here, the extra 0 between the reflected copies of A comes from the fact that reactions of the form do not contribute to the system. Let us now write these matrices A for the different linkage classes. We abbreviate the function ν → L ν(g i 1 )ν(g i 2 ), ν(g j 1 ))ν(g j 2 ) by L i 1 i 2 ,j 1 j 2 . , representing the reactions g 0 + g 6 , representing the reactions g 0 + g 3 ̺ 2 +̺ 3 ← −−− → g 2 + g 1 and g 4 + g 7 ̺ 2 +̺ 3 ← −−− → g 5 + g 6 . Finally, the last linkage class, comprised of the six reactions g 2 + g 5

monotone Markov chains and the partitioning process
We have seen how the result of Müller and Hofbauer [12] generalises in the setting of an arbitrary number of parents, at least for finite type spaces. For more general, potentially uncountable type spaces, this approach fails because it is not clear how to even make sense of the notion of the concentration of individual types, unless ω t is pure point. Now, we show how the evolution of the law of the partitioning process, related to ω via Eq. 8 can be written as a gradient system. Ultimately, this is due to the monotonicity of its sample paths; recall from (7) that the transition rate from A to B vanishes whenever B A. In particular, the number of blocks increases strictly in each transition.
Definition 12. Let X = X t t 0 be a continuous-time Markov chain on a finite state space E with rate matrix Q(i, j) i,j∈E ; it is called a Markov chain with strictly monotone orbits (MCsmo)(with respect to a real-valued function W on E) if Q(i, j) > 0 implies that W (j) > W (i). ♦ Recall that the distribution of a finite-state Markov chain X = (X t ) t 0 can be interpreted as a probability vector, which evolves in time according to the differential equation, If X has strictly monotone orbits in the sense of Definition 12, Eq. (17) can be written as a generalised gradient system, as defined in Section 4.
Theorem 13. Let X = X t t 0 be a MCsmo with respect to W and define Ψ : R E → R, Then, Eq. (17), which describes the time evolution of p X t , is equivalent tȯ where K takes values in the symmetric, positive semi-definite matrices, is continuous on P(E) and smooth on its interior.
Since Ψ is linear with (constant) gradient Inserting p X t for p, this is exactly the right-hand side of Eq. (17) The partitioning process mentioned in Section 2 is a process of succesive refinement; in every non-silent transition, the number of blocks increases at least by one. Thus, it is a MCsmo with respect to the number of blocks.
Corollary 14. The law p Σ of the partitioning process with generator Q given in Eq. (7) satisfies a generalised gradient system with respect to N given by We conclude with an explicit example. and can be rewritten as Here, the vector (1, 2, 2, 3) T is the gradient (with respect to the euclidean metric) of Ψ(p) = p(A) + 2p(B) + 2p(C) + 3p(D).
Also, the matrix is symmetric and it is positive semi-definite, as it can be written as a sum of positive semi-definite matrices (as long as p(A), p(B), p(C) ≥ 0), is a generalised gradient system in the sense of Eq. (13).
Note that the coefficient matrix in Eq. (18) is not diagonalisable; this is because the eigenvalue −2 has algebraic multiplicity 3, but the associated eigenspace is merely two-dimensional and spanned by (0, 1, −1, 0) T and (0, 1, 0, −1) T . Its general solution will therefore contain terms of the form te −2t . This seems to contradict the fact that a linear generalised gradient system can not have resonant solutions of the form t k e λt for k ≥ 1. One has to keep in mind, however, that the gradient representation only holds on the nonnegative cone (which is forward-invariant for the system). Note also that the problematic generalised eigenspace only has a trivial intersection with R 4 0 . We conclude with one additional example.
Example 16. Let us now consider the actual partitioning process, for three loci. We have the five partitions . Identifying A i with the i-th basis vector in R 5 , the generator Q of the partitioning process (cf. Eq. (7)) reads where ̺ 1 , ̺ 2 , ̺ 3 are as in Subsection 4.1, and he gradient system then for the distribution p Σ t then readṡ where D 1 , . . . , D 5 are chosen such that the rows sum to 0 and (1, 2, 2, 3) T is the gradient ∇N of the mean number of blocks N , defined in Corollary 14. Again, the maximum of the potential, the partition {{1}, {2}, {3}} characterises linkage equilibrium ('all sites come from independent ancestors').

Nonlinear partitioning as a chemical reaction network
We have seen in the previous chapter that the evolution of the law of the partitioning process can be rewritten as a linear generalised gradient system. We now consider the nonlinear system from Theorem 4. We will see that it, too, can be interpreted as the law of mass action for a network of chemical reactions between the partitions of S. Its construction is very similar to the network from Section 3.
To motivate this result, imagine that at time t = 0, we paint every gamete in a different color. As described in Theorem 12 and Fig. 1, for every C ∈ P (S), every randomly chosen |C|-tuple of gametes undergoes a chemical reaction as in Eq. (12) at rate ̺(C) |C| . But now, instead of investigating the effect on the type distribution, we ask how the initially assigned colors are mixed in the process. To this end, we attach to each individual a partition of its sites by grouping together all sites with the same color. Now, consider the j-th gamete that results from such a reaction (compare Eq. (12)); for two sites k and ℓ in this individual to have the same color, they must come from the same individual on the left-hand side (this is due to the fact that the tuple was chosen randomly and, as there are infinitely many colors in the population, the probability that the same color occurs in more than one individual in the chosen sample is negligible). More formally, there must be an i between 1 and |C| such that k and ℓ are both in C i . If that is true, both sites come from the i + j − 1-th individual, and thus must share the same block of A i+j−1 . Put more concisely, this means that k and ℓ belong to the same block of the induced partition A i+j−1 | C i for some i ∈ {1, . . . , |C|}. Equivalently, this means that the partition that describes the coloring of the j-th product gamete is given precisely by For an illustration, see Fig. 2. Thus, the reaction network from Section 3 translates to the system consisting of the reactions one for each C and every |C|-tuple of partitions of S; as always, indices are to be read mod |C|. These reactions are of the same form as the ones between gametes in Eq. (12), after replacing the type fragments π C i x (i+j−1) with the induced partitions A i+j−1 | C i .
We finish by showing that the law of mass action of this chemical reaction network is precisely the nonlinear system from Theorem 4.
Theorem 17. The nonlinear system of ordinary differential equations that describes the dynamics of the coefficients in (5) can be written as the law of mass action for the CRN comprised of all reactions (20). More concisely, (6) is equivalent tȯ where the summation is over P (S).
Proof. We will use the following identity (the proof of which will conclude the proof of the theorem), valid for all B A and all a ∈ R P (S) , where B = {B 1 , . . . , B |B| }. Inserting (21), we see that the second sum on the right-hand side of Eq. (6), can be written as Notice that the second argument of the Kronecker function is always finer than B. Thus, the whole summand vanishes whenever B A does not hold. We may therefore ignore the restriction B A in the inner sum, which allows us then to change the order of summation. After using the Kronecker function to perform the summation with respect to A, what remains is Up to renaming B with C, this is exactly the first part of the law of mass action for the CRN described above. Using the same argument as in the proof of Theorem 7, the first sum in Eq. (6) and subsequently rename the summation indices. Thus, which finishes the proof of Eq. (21) and hence, of the theorem. Despite their similar appearance, there is one crucial difference between the CRN from Section 3, and the one above. Because the products are pieced together from partitions of subsets induced by the substrates, the total number of blocks on the right-hand side is in general strictly larger than on the left-hand side. This implies that this network is not reversible, and the question whether it can be interpreted as a gradient system remains open. The loss of reversibility appears to be the coarse-graining of the information in our system that we performed by transitioning from the (potentially infinite) set of types to the finite set of partitions. This is vaguely reminiscent of the common phenomenon in statistical mechanics where the projection of the underlying (high-dimensional) microscopic model to a smaller set of macroscopic degrees of freedom leads to a loss of reversibility.