Spatially Inhomogeneous Populations with Seed-banks: Duality, Existence and Clustering

We consider a system of interacting Moran models with seed-banks. Individuals live in colonies and are subject to resampling and migration as long as they are $active$. Each colony has a seed-bank into which individuals can retreat to become $dormant$, suspending their resampling and migration until they become active again. The colonies are labelled by $\mathbb{Z}^d$, $d \geq 1$, playing the role of a $geographic\, space$. The sizes of the active and the dormant population are $finite$ and depend on the $location$ of the colony. Migration is driven by a random walk transition kernel. Our goal is to study the equilibrium behaviour of the system as a function of the underlying model parameters. In the present paper we show that, under mild conditions on the sizes of the active population, the system is well-defined and has a dual. The dual consists of a system of $interacting$ coalescing random walks in an $inhomogeneous$ environment that switch between active and dormant. We analyse the dichotomy of $coexistence$ (= multi-type equilibria) versus $clustering$ (= mono-type equilibria), and show that clustering occurs if and only if two random walks in the dual starting from arbitrary states eventually coalesce with probability one. The presence of the seed-bank $enhances\, genetic\, diversity$. In the dual this is reflected by the presence of time lapses during which the random walks are dormant and do not move.


Background, motivation and outline
Dormancy is an evolutionary trait observed in plants, bacteria and other microbial populations, where an organism enters a reversible state of low metabolic activity as a response to adverse environmental conditions. The dormant state of an organism in a population is characterised by interruption of basic reproduction and phenotypic development during periods of environmental stress [24,29]. The dormant organisms reside in what is called a seed-bank of the population. After a varying and possibly large number of generations, dormant organisms can be resuscitated under more favourable conditions and resume reproduction after becoming active by leaving the seed-bank. This strategy is known to have important implications for the genetic diversity and overall fitness of the underlying population [24,23], since the seed-bank of a population often acts as a buffer against evolutionary forces such as genetic drift, selection and environmental variability. The importance of dormancy has led to several attempts to model seed-banks from a mathematical perspective ([2,1]; see also [3] for a broad overview).
In [2] and [1], the Fisher-Wright model with seed-bank was introduced and analysed. In the Fisher-Wright model with seed-bank, individuals live in a colony, are subject to resampling where they adopt each other's type, and move in and out of the seed-bank where they suspend resampling. The seed-bank acts as a repository for the genetic information of the population. Individuals that reside inside the seed-bank are called dormant, those that reside outside are called active. Both the long-time behaviour and the genealogy of the population were analysed for the continuum model obtained by letting the size of the colony tend to infinity, called the Fisher-Wright diffusion with seed-bank.
In [14], [15], [13] the continuum model was extended to a spatial setting in which individuals live in multiple colonies, labelled by a countable Abelian group playing the role of a geographic space. In the spatial model with seed-banks, each colony is endowed with its own seed-bank and individuals are allowed to migrate between colonies. The goal was to understand the change in behaviour compared to the spatial model without seed-bank.
Most papers on seed-banks deal with the large-colony-size limit, for which the evolution is described by a system of coupled SDE's. In [19], a multi-colony Fisher-Wright model with seed-banks was introduced where the colony sizes are finite. However, this model is restricted to homogeneous population sizes and a finite geographic space. The present paper introduces an individual-based spatial model with seed-banks in continuous time where the sizes of the underlying populations are finite and vary across colonies. These features make the model more interesting from a biological perspective, but raise extra technical challenges. The key tool that we use to tackle these challenges is stochastic duality [11,4]. The spatial model introduced in this paper fits in the realm of interacting particle systems, which often admit additional structures such as duality [25,28]. In particular, our spatial model can be viewed as a hybrid of the well-known Voter Model and the generalized Symmetric Exclusion Process, 2j-SEP, j ∈ N/2 [5,11,26]. Both the Voter Model and the 2j-SEP enjoy the stochastic duality property, and our system inherits this as well: it is dual to a system of coalescing random walks with repulsive interactions. The resulting dual process bears a striking resemblance to the dual processes of the Voter Model and the 2j-SEP, because the original process is a modified hybrid of them. It has been recognised in the literature [31,23,24,2,1] that qualitatively different behaviour may occur when the exit time of a typical individual from the seed-bank can become large. In the present paper we are able to model this phenomenon as well, due to the inhomogeneity in the seed-bank sizes. Our main goals are the following:
(1) Introduce a model with seed-banks whose sizes are finite and depend on the geographic location of the colony. Prove existence and uniqueness of the process via well-posedness of an associated martingale problem and duality with a system of interacting coalescing random walks.
(2) Identify a criterion for coexistence (= convergence towards multi-type equilibria) and clustering (= convergence towards mono-type equilibria). Show that there is a one-parameter family of equilibria controlled by the density of types.
(3) Identify the domain of attraction of the equilibria.
(4) Identify the parameter regime under which the criterion for clustering is met. In case of clustering, find out how fast the mono-type clusters grow in space-time. In case of coexistence, establish mixing properties of the equilibria.
In the present paper we settle (1) and (2). In [18] we will address (3) and (4). We focus on the situation where the individuals can be of two types. The extension to infinitely many types, called the Fleming-Viot measure-valued diffusion, only requires standard adaptations and will not be considered here. The paper is organised as follows. In Section 2 we give a quick definition of the model and state our main theorems about the well-posedness, the duality and the clustering criterion. In Section 3 we give a more detailed definition of the model, prove that the martingale problem associated with its generator is well-posed, establish duality with an interacting seed-bank coalescent, demonstrate that the system exhibits a dichotomy between clustering and coexistence, and formulate a necessary and sufficient condition for clustering to prevail in terms of the dual, called the clustering criterion. Sections 4-6 are devoted to the proof of our main theorems.

Main theorems
In Section 2.1 we give a quick definition of the system. In Section 2.2 we argue that, under mild conditions on the sizes of the active population, the system is well-defined and has a dual that consists of finitely many interacting coalescing random walks.

Quick definition of the multi-colony system
Individuals live in colonies labelled by $\mathbb{Z}^d$, $d \geq 1$, which plays the role of a geographic space. (In what follows, the geographic space can be any countable Abelian group.) Each colony has an active population and a dormant population. Each individual carries one of two types: ♥ and ♠. Individuals are subject to:
(1) Active individuals in any colony resample with active individuals in any colony.
(2) Active individuals in any colony exchange with dormant individuals in the same colony.
For (1) we assume that each active individual at colony i at rate a(i, j) uniformly draws an active individual at colony j and adopts its type. For (2) we assume that each active individual at colony i at rate λ uniformly draws a dormant individual at colony i and the two individuals trade places while keeping their type (i.e., the active individual becomes dormant and the dormant individual becomes active). Note that dormant individuals do not resample.
At each colony $i$ we register the pair $(X_i(t), Y_i(t))$, representing the number of active, respectively, dormant individuals of type ♥ at time $t$ at colony $i$. We write $(N_i, M_i)$ to denote the size of the active, respectively, dormant population at colony $i$. The resulting Markov process is denoted by

(2.1) $Z = (Z(t))_{t \geq 0}$ with $Z(t) = (X_i(t), Y_i(t))_{i \in \mathbb{Z}^d}$,

and lives on the state space $\mathcal{X} = \prod_{i \in \mathbb{Z}^d} ([N_i] \times [M_i])$, where $[n] = \{0, 1, \ldots, n\}$, $n \in \mathbb{N}$. In Section 3.2 we will show that, under mild assumptions on the model parameters, the Markov process in (2.1) is well defined and has a dual $(Z^*(t))_{t \geq 0}$. The latter consists of finite collections of particles that perform interacting coalescing random walks, with rates that are controlled by the model parameters.
Let $\mathcal{P}$ be the set of probability distributions on $\mathcal{X}$ consisting of all mixtures of the two mono-type point masses $\bigotimes_{i \in \mathbb{Z}^d} \delta_{(0,0)}$ and $\bigotimes_{i \in \mathbb{Z}^d} \delta_{(N_i, M_i)}$. We say that (2.1) exhibits clustering if the distribution of $Z(t)$ converges to a limiting distribution $\mu \in \mathcal{P}$ as $t \to \infty$. Otherwise, we say that it exhibits coexistence. In Section 3.2 we will show that clustering is equivalent to coalescence occurring eventually with probability 1 in the dual consisting of two particles. This will be the main route to the dichotomy. For simplicity we let the exchange rate $\lambda \in (0, \infty)$ be the same for every colony, and let the migration kernel be translation invariant and irreducible.

Assumption 2.1. [Homogeneous migration]
The migration kernel $a(\cdot, \cdot)$ satisfies: (i) $a(i, j) = a(0, j - i)$ for all $i, j \in \mathbb{Z}^d$ (translation invariance); (ii) $\sum_{j \in \mathbb{Z}^d} a(0, j) < \infty$ and $a(0, 0) > 0$. The former of these two assumptions ensures that the way genetic information moves between colonies is homogeneous in space, while the latter ensures that the total rate of resampling is finite and that resampling is possible also at the same colony. Since it is crucial for our analysis that the population sizes remain constant, we view migration as a change of types without the individuals actually moving themselves. In this way, genetic information moves between colonies while the individuals themselves stay put. We write $K_i := N_i / M_i$ to denote the ratio of the sizes of the active and the dormant population in colony $i$.

Well-posedness and duality
Theorem 2.2 provides us with two sufficient conditions under which the system is well-defined and has a tractable dual. It shows a trade-off: the more we restrict the tails of the migration kernel, the less we need to restrict the sizes of the active population. The sizes of the dormant population play no role because all the events (resampling, migration and exchange) in our model are initiated by active individuals, and dormant individuals do not feel the spatial extent of the geographic space. Theorem 3.10, Corollary 3.11 and Theorem 3.13 in Section 3.2 contain the fine details.

Theorem 2.3. [Equilibrium]
If the initial distribution of the system is such that each active and each dormant individual adopts a type with the same probability independently of other individuals, then the system admits a one-parameter family of equilibria.
• The family of equilibria is parameterised by the probability to have one of the two types.
• The system converges to a mono-type equilibrium if and only if two random walks in the dual starting from arbitrary states eventually coalesce with probability one.
Theorem 2.3 tells us that the system converges to an equilibrium when it is started from a specific class of initial distributions, namely, products of binomials. It also provides a criterion in terms of the dual that determines whether the equilibrium is mono-type or multi-type. Theorem 3.14, Corollary 3.15 and Theorem 3.17 in Section 3.2 contain the fine details.

Basic theorems: duality, well-posedness and clustering criterion
In Section 3.1 we define and analyse the single-colony model. In Section 3.2 we do the same for the multi-colony model. Our focus is on well-posedness, duality and convergence to equilibrium.

Definition: resampling and exchange
Consider two populations, called active and dormant, consisting of N and M haploid individuals, respectively. Individuals in the population carry one of two genetic types: ♥ and ♠. Dormant individuals reside inside the seed-bank, active individuals reside outside. The dynamics of the single-colony Moran model with seed-bank is as follows: -Each individual in the active population carries a resampling clock that rings at rate 1. When the clock rings, the individual randomly chooses an active individual and adopts its type.
-Each individual in the active population also carries an exchange clock that rings at rate λ. When the clock rings, the individual randomly chooses a dormant individual and exchanges state, i.e., becomes dormant and forces the chosen dormant individual to become active. During the exchange the two individuals retain their type.
Since the sizes of the two populations remain constant, we only need two variables to describe the dynamics of the population, namely, the number of type-♥ individuals in both populations (see Table 1).

Let $x$ and $y$ denote the number of individuals of type ♥ in the active and the dormant population, respectively. After a resampling event, $(x, y)$ can change to $(x - 1, y)$ or $(x + 1, y)$, while after an exchange event $(x, y)$ can change to $(x - 1, y + 1)$ or $(x + 1, y - 1)$.

Table 1: Transition scheme of the single-colony model.
Initial state | Event      | Final state      | Transition rate
$(x, y)$      | resampling | $(x - 1, y)$     | $\frac{x(N - x)}{N}$
$(x, y)$      | resampling | $(x + 1, y)$     | $\frac{x(N - x)}{N}$
$(x, y)$      | exchange   | $(x - 1, y + 1)$ | $\lambda x \frac{M - y}{M}$
$(x, y)$      | exchange   | $(x + 1, y - 1)$ | $\lambda (N - x) \frac{y}{M}$

Both changes in the resampling event occur at rate $\frac{x(N - x)}{N}$. In the exchange event, however, to see $(x, y)$ change to $(x - 1, y + 1)$, an exchange clock of a type-♥ individual in the active population has to ring (which happens at rate $\lambda x$), and that individual has to choose a type-♠ individual in the dormant population (which happens with probability $\frac{M - y}{M}$). Hence the total rate at which $(x, y)$ changes to $(x - 1, y + 1)$ is $\lambda x \frac{M - y}{M}$. By the same argument, the total rate at which $(x, y)$ changes to $(x + 1, y - 1)$ is $\lambda (N - x) \frac{y}{M}$. For convenience we multiply the rate of resampling by a factor $\frac{1}{2}$, in order to make it compatible with the Fisher-Wright model. Thus, the generator $G$ of the process, acting on functions $f : [N] \times [M] \to \mathbb{R}$, is given by $G = G_{\mathrm{Mor}} + G_{\mathrm{exc}}$, where

$(G_{\mathrm{Mor}} f)(x, y) = \frac{1}{2} \frac{x(N - x)}{N} \left[ f(x - 1, y) + f(x + 1, y) - 2 f(x, y) \right]$

describes the Moran resampling of active individuals at rate $\frac{1}{2}$, and

$(G_{\mathrm{exc}} f)(x, y) = \lambda x \frac{M - y}{M} \left[ f(x - 1, y + 1) - f(x, y) \right] + \lambda (N - x) \frac{y}{M} \left[ f(x + 1, y - 1) - f(x, y) \right]$

describes the exchange between active and dormant individuals at rate $\lambda$. From here onwards, we denote the Markov process associated with the generator $G$ by $Z = (Z(t))_{t \geq 0}$ with $Z(t) = (X(t), Y(t))$. Note that $Z$ is well-defined because it is a continuous-time Markov chain with finitely many states.
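The dynamics above can be sketched in a few lines of code. The following minimal Python simulation (with illustrative parameters $N$, $M$, $\lambda$; not part of the paper) implements the transition rates of Table 1, including the factor $\frac{1}{2}$ on resampling, and checks two consequences: the total number $x + y$ of type-♥ individuals has zero drift under the generator (so it is a martingale), and the empirical fixation probability of type ♥ is close to the initial frequency $(x + y)/(N + M)$.

```python
import random
from fractions import Fraction

# Transition rates of the single-colony Moran model with seed-bank, as in
# Table 1 (resampling slowed by the factor 1/2). N and M are the active and
# dormant population sizes; (x, y) counts the type-hearts individuals.
def rates(x, y, N, M, lam):
    r = Fraction(x * (N - x), 2 * N)  # resampling, both directions
    return [
        ((-1, 0), r),
        ((+1, 0), r),
        ((-1, +1), lam * x * Fraction(M - y, M)),    # active heart swaps with dormant spade
        ((+1, -1), lam * (N - x) * Fraction(y, M)),  # active spade swaps with dormant heart
    ]

N, M, lam = 4, 3, 2

# Check 1: the total count x + y of type-hearts has zero drift under the
# generator, hence it is a martingale.
for x in range(N + 1):
    for y in range(M + 1):
        drift = sum(r * (dx + dy) for (dx, dy), r in rates(x, y, N, M, lam))
        assert drift == 0

# Check 2: Gillespie simulation; the chain absorbs in an all-hearts or
# all-spades state, with frequency close to the initial frequency of hearts.
def simulate(x, y, rng):
    while 0 < x + y < N + M:
        moves = [(d, float(r)) for d, r in rates(x, y, N, M, lam) if r > 0]
        u = rng.random() * sum(r for _, r in moves)
        for (dx, dy), r in moves:
            u -= r
            if u <= 0:
                x, y = x + dx, y + dy
                break
    return x + y == N + M  # True iff type-hearts fixates

rng = random.Random(7)
est = sum(simulate(2, 1, rng) for _ in range(4000)) / 4000
assert abs(est - 3 / 7) < 0.04  # fixation probability ~ (x + y)/(N + M) = 3/7
```

The martingale property of $x + y$ is what underlies the fixation probability appearing in Corollary 3.4 below.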

Duality and equilibrium
The classical Moran model is known to be dual to the block-counting process of the Kingman coalescent. In this section we show that the single-colony Moran model with seed-bank also has a coalescent dual.

Definition 3.1. [Block-counting process]
The block-counting process of the interacting seed-bank coalescent (defined in Definition 3.5 below) is the continuous-time Markov chain $Z^* = (Z^*(t))_{t \geq 0}$ with $Z^*(t) = (n(t), m(t))$, taking values in $[N] \times [M]$ and evolving according to the transition rates

(3.6) $(n, m) \to (n - 1, m + 1)$ at rate $\lambda n \frac{M - m}{M}$, $\quad (n, m) \to (n + 1, m - 1)$ at rate $\lambda K m \frac{N - n}{N}$, $\quad (n, m) \to (n - 1, m)$ at rate $\frac{1}{2} \frac{n(n - 1)}{N}$,

where $K = N/M$ is the ratio of the sizes of the active and the dormant population. ✷ The first two transitions in (3.6) correspond to exchange, the third transition to resampling. Later in this section we describe the associated interacting seed-bank coalescent process, which gives the genealogy of $Z$.
The following result gives the duality between Z and Z * .

Theorem 3.2. [Duality]
The process $Z$ is dual to the process $Z^*$ via the duality relation

(3.7) $\mathbb{E}_{(x,y)}\left[ \frac{\binom{X(t)}{n}}{\binom{N}{n}} \frac{\binom{Y(t)}{m}}{\binom{M}{m}} \right] = \mathbb{E}_{(n,m)}\left[ \frac{\binom{x}{n(t)}}{\binom{N}{n(t)}} \frac{\binom{y}{m(t)}}{\binom{M}{m(t)}} \right]$, $\quad (x, y), (n, m) \in [N] \times [M]$, $t \geq 0$.

Note that the duality relation fixes the factorial moments, and thereby the mixed moments, of the random vector $(X(t), Y(t))$. This enables us to determine the equilibrium distribution of $Z$.
Although the above duality is new in the literature on seed-banks, the notion of factorial duality is not uncommon in mathematical models involving finite and fixed population sizes [8,12]. Similar types of dualities are often found for other models too (e.g. self-duality of independent random walks, exclusion and inclusion processes, etc. [11]). Remarkably, in the special case where N = M = 2j for some j ∈ N/2, Giardinà et al. (2009) [11, Section 3.2] identified the same duality relation as in (3.7) as a self-duality for the generalized 2j-SEP on two sites. This is not surprising given the fact that the exchange rates between active and dormant individuals defined in Table 1 are precisely the rates (up to rescaling) for the 2j-SEP on two sites. We refer the reader to Section 4.1 for further insight.
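To make the factorial duality concrete, here is a small numerical check (not from the paper; the sizes and rate are arbitrary choices): with $Q$ and $Q^*$ the rate matrices of $Z$ and $Z^*$, and $D$ the matrix of the duality function $D((x,y),(n,m)) = \binom{x}{n}\binom{y}{m}/\binom{N}{n}\binom{M}{m}$, the duality relation is equivalent to the generator-level identity $Q D = D (Q^*)^{\mathsf{T}}$.

```python
import numpy as np
from math import comb

# Verify the factorial duality at the generator level on a small state space.
N, M, lam = 3, 2, 1.5
K = N / M
states = [(x, y) for x in range(N + 1) for y in range(M + 1)]
idx = {s: k for k, s in enumerate(states)}

def Qmat(trans):
    # Build a rate matrix from a transition function; off-grid moves have rate 0.
    Q = np.zeros((len(states), len(states)))
    for s in states:
        for s2, r in trans(*s):
            if s2 in idx and r > 0:
                Q[idx[s], idx[s2]] += r
                Q[idx[s], idx[s]] -= r
    return Q

def forward(x, y):  # single-colony Moran model with seed-bank
    return [((x - 1, y), 0.5 * x * (N - x) / N),
            ((x + 1, y), 0.5 * x * (N - x) / N),
            ((x - 1, y + 1), lam * x * (M - y) / M),
            ((x + 1, y - 1), lam * (N - x) * y / M)]

def dual(n, m):  # block-counting process of Definition 3.1
    return [((n - 1, m + 1), lam * n * (M - m) / M),
            ((n + 1, m - 1), lam * K * m * (N - n) / N),
            ((n - 1, m), 0.5 * n * (n - 1) / N)]

D = np.array([[comb(x, n) * comb(y, m) / (comb(N, n) * comb(M, m))
               for (n, m) in states] for (x, y) in states])
Q, Qd = Qmat(forward), Qmat(dual)
assert np.allclose(Q @ D, D @ Qd.T)  # duality at the generator level
```

Since $Q D = D (Q^*)^{\mathsf{T}}$ implies $\mathrm{e}^{tQ} D = D\, \mathrm{e}^{t (Q^*)^{\mathsf{T}}}$ for all $t \geq 0$, this identity is equivalent to the duality relation for the semigroups.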

Corollary 3.4. [Equilibrium] Suppose that $Z$ starts from initial state $(X(0), Y(0)) = (X, Y) \in [N] \times [M]$.
Then $(X(t), Y(t))$ converges in law as $t \to \infty$ to a random vector $(X_\infty, Y_\infty)$ whose distribution is given by

$\mathbb{P}\big((X_\infty, Y_\infty) = (N, M)\big) = \frac{X + Y}{N + M}, \qquad \mathbb{P}\big((X_\infty, Y_\infty) = (0, 0)\big) = 1 - \frac{X + Y}{N + M}.$

Note that the equilibrium behaviour of $Z$ is the same as for the classical Moran model without seed-bank. The fixation probability of type ♥ is $\frac{X + Y}{N + M}$, which is nothing but the initial frequency of type-♥ individuals in the entire population. Even though the presence of the seed-bank delays the time of fixation, because its size is finite it has no significant effect on the overall qualitative behaviour of the process. We will see in Section 3.2 that the situation is different in the multi-colony model.

Interacting seed-bank coalescent
In our model, the genealogy of a sample taken from the finite population of N + M individuals is governed by a partition-valued coalescent process, similarly to the genealogy of the classical Moran model. However, due to the presence of the seed-bank, blocks of a partition are marked as A (active) or D (dormant). Unlike in the genealogy of the classical Moran model, the blocks interact with each other. This interaction is present because of the finite sizes of the active and the dormant population. For this reason, we name the block process an interacting seed-bank coalescent. For convenience, we will use the word lineage to refer to a block in a partition.
Let P k be the set of partitions of {1, 2, . . . , k}. For ξ ∈ P k , denote the number of lineages in ξ by |ξ|. Furthermore, for j, k, l ∈ N, define Before we give the formal definition, let us adopt some notation. For π, π ′ ∈ P N,M , we say that π ≻ π ′ if π ′ can be obtained from π by merging two active lineages. Similarly, we say that π ✶ π ′ if π ′ can be obtained from π by altering the state of a single lineage (A → D or D → A). We write |π| A and |π| D to denote the number of active and dormant lineages present in π, respectively.

Definition 3.5. [Interacting seed-bank coalescent]
The interacting seed-bank coalescent is the continuous-time Markov chain with state space $\mathcal{P}_{N,M}$ characterised by the following transition rates: if π ✶ π ′ by change of state of one lineage in π from D to A.

✷
The factor $1 - \frac{|\pi|_D}{M}$ in the rate at which a single active lineage of $\pi$ becomes dormant reflects the fact that, as the seed-bank fills up, it becomes more difficult for an active lineage to enter the seed-bank. Similarly, as the number of active lineages decreases due to coalescence, it becomes easier for a dormant lineage to leave the seed-bank and become active. This also tells us that there is a repulsive interaction between lineages in the same state (A or D). Due to this interaction, the coalescent is difficult to study. As $N, M$ get large, the interaction becomes weak. In fact, it can be shown that when both the time and the parameters are scaled properly, the coalescent converges weakly as $N, M \to \infty$ to the seed-bank coalescent described in [1], in which the interaction is no longer present.
We can also describe the coalescent in terms of an interacting particle system with the help of a graphical representation (see Figure 1). The interacting particle system consists of two reservoirs, called active reservoir and dormant reservoir, having N and M labeled sites, respectively, each of which can be occupied by at most one particle. The particles in the active and dormant reservoir are called active and dormant particles, respectively. The active particles can coalesce with each other, in the sense that if an active particle occupies a labeled site where an active particle is present already, then the two particles are glued together to form a single particle at that site. Active particles can become dormant by moving to an empty site in the dormant reservoir, while dormant particles can become active by moving to an empty site in the active reservoir. The transition rates are as follows:
• An active particle tries to coalesce with another active particle at rate $\frac{1}{2}$ by choosing uniformly at random a labeled site in the active reservoir. If the chosen site is empty, then it ignores the transition, otherwise it coalesces with the active particle present at the new site.
• An active particle becomes dormant at rate $\lambda$ by moving to a random labeled site in the dormant reservoir when the chosen site is empty, otherwise it remains in the active reservoir.
• A dormant particle becomes active at rate $\lambda K$ by moving to a random labeled site in the active reservoir when the chosen site is empty, otherwise it remains in the dormant reservoir.
Clearly, the particles interact with each other due to the finite capacity of the two reservoirs. If $N, M \to \infty$, then the probability to choose an empty site in a reservoir tends to 1, and so the system converges (after proper scaling) to an interacting particle system where the particles move independently between the two reservoirs. Note that if we define $n_t$ = number of active particles at time $t$ and $m_t$ = number of dormant particles at time $t$, then $Z^* = (n_t, m_t)_{t \geq 0}$ is the block-counting process defined in Definition 3.1. Also, if we remove the labels of the sites in the two reservoirs and represent the particle configuration by an element of $\mathcal{P}_{N,M}$, then we obtain the interacting seed-bank coalescent described in Definition 3.5. Even though it is natural to describe the genealogical process via a partition-valued stochastic process, we will stick with the interacting particle system description of the dual, since this will be more convenient for the multi-colony model.
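As an illustration (the parameters below are arbitrary choices, not from the paper), the block-counting process $(n_t, m_t)$ of Definition 3.1 can be simulated directly; starting from all $N$ sampled individuals active, the repulsive interaction slows the dynamics down, but every realisation still coalesces to a single lineage:

```python
import random

# Gillespie sketch of the block-counting process (n_t, m_t): n active and
# m dormant lineages with reservoir capacities N and M. The rates follow
# the reservoir description above, with K = N/M as in the text.
N, M, lam = 5, 3, 1.0
K = N / M

def step(n, m, rng):
    moves = [((n - 1, m + 1), lam * n * (M - m) / M),      # active -> dormant
             ((n + 1, m - 1), lam * K * m * (N - n) / N),  # dormant -> active
             ((n - 1, m), 0.5 * n * (n - 1) / N)]          # coalescence
    moves = [(s, r) for s, r in moves if r > 0]
    u = rng.random() * sum(r for _, r in moves)
    for s, r in moves:
        u -= r
        if u <= 0:
            return s
    return moves[-1][0]

rng = random.Random(1)
n, m = N, 0          # sample all N active individuals
while n + m > 1:
    n, m = step(n, m, rng)
assert n + m == 1    # all lineages eventually coalesce into one
```

Note that the boundary factors $(M - m)/M$ and $(N - n)/N$ automatically prevent the reservoirs from overflowing, which is exactly the repulsive interaction discussed above.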

Multi-colony model
In this section we consider multiple colonies, each with their own seed-bank. Each colony has an active population and a dormant population. We take Z d as the underlying geographic space where the colonies are located (any countable Abelian group will do). With each colony i ∈ Z d we associate a variable (X i , Y i ), with X i and Y i the number of type-♥ active and dormant individuals, respectively, at colony i. Let (N i , M i ) denote the size of the active and the dormant population at colony i. In each colony active individuals are subject to resampling and migration, and to exchange with dormant individuals that are in the same colony. Dormant individuals are not subject to resampling and migration.
Since it is crucial for our duality to keep the population sizes constant, we consider migration of types without the individuals actually moving themselves. To be precise, by a migration from colony j to colony i we mean that an active individual from colony i randomly chooses an active individual from colony j and adopts its type. In this way, the genetic information moves from colony j to colony i, while the individuals themselves stay put.

Definition: resampling, exchange and migration
We assume that each active individual at colony $i$ resamples from colony $j$ at rate $a(i, j)$, adopting the type of a uniformly chosen active individual at colony $j$. Here, the migration kernel $a(\cdot, \cdot)$ is assumed to satisfy Assumption 2.1. After a migration to colony $i$, the only variable that is affected is $X_i$, the number of type-♥ active individuals at colony $i$. The final state can be either $X_i - 1$ or $X_i + 1$, depending on whether a type-♥ active individual from colony $i$ chooses a type-♠ active individual from another colony, or a type-♠ active individual from colony $i$ chooses a type-♥ active individual from another colony. The rate at which $X_i$ changes to $X_i - 1$ due to a migration from colony $j$ is $a(i, j)\, X_i\, \frac{N_j - X_j}{N_j}$, while the rate at which $X_i$ changes to $X_i + 1$ due to a migration from colony $j$ is $a(i, j)\, (N_i - X_i)\, \frac{X_j}{N_j}$. Note that for $i = j$ the migration rate is $a(i, i)\, \frac{X_i (N_i - X_i)}{N_i}$, which has the same form as the effective birth and death rate in the single-colony Moran model. Thus, the resampling within each colony is already taken care of via the migration. It remains to define the associated exchange mechanism between the active and the dormant individuals in a colony. The exchange mechanism is the same as in the single-colony model, i.e., in each colony each active individual at rate $\lambda$ performs an exchange with a dormant individual chosen uniformly from the seed-bank of that colony. For simplicity, we take the exchange rate $\lambda$ to be the same in each colony.
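For concreteness, the migration rates just described can be tabulated for a toy two-colony system; all numbers below (population sizes, kernel values) are hypothetical and chosen only for illustration:

```python
from fractions import Fraction

# Hypothetical two-colony example of the migration/resampling rates:
# X_i -> X_i - 1 at rate sum_j a(i,j) X_i (N_j - X_j)/N_j, and
# X_i -> X_i + 1 at rate sum_j a(i,j) (N_i - X_i) X_j / N_j.
N = {0: 4, 1: 6}                                        # active population sizes
a = {(0, 0): Fraction(1, 2), (0, 1): Fraction(1, 4),
     (1, 0): Fraction(1, 4), (1, 1): Fraction(1, 2)}    # toy migration kernel

def rate_down(i, X):   # total rate of X_i -> X_i - 1
    return sum(a[i, j] * X[i] * Fraction(N[j] - X[j], N[j]) for j in N)

def rate_up(i, X):     # total rate of X_i -> X_i + 1
    return sum(a[i, j] * (N[i] - X[i]) * Fraction(X[j], N[j]) for j in N)

X = {0: 1, 1: 3}
# Hand check for i = 0: down = 1/2 * 1 * 3/4 + 1/4 * 1 * 3/6 = 3/8 + 1/8 = 1/2.
assert rate_down(0, X) == Fraction(1, 2)
# For i = j the summand reduces to a(i,i) X_i (N_i - X_i)/N_i,
# the single-colony birth/death form:
assert a[0, 0] * X[0] * Fraction(N[0] - X[0], N[0]) == Fraction(3, 8)
```

The kernel values and sizes are placeholders; the point is only that the within-colony term ($i = j$) has the same birth/death form as in the single-colony model.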
The state space $\mathcal{X}$ of the process is $\mathcal{X} = \prod_{i \in \mathbb{Z}^d} ([N_i] \times [M_i])$. The transitions and their rates are analogous to those in Table 1, with the migration rates described above added. Throughout the remainder of this paper, we adopt the convention given in (3.14) for addition and subtraction of configurations in $\mathcal{X}$. The generator $L$ for the process, acting on functions in

(3.15) $\mathcal{D} = \{ f \in C(\mathcal{X}) : f \text{ depends on finitely many coordinates} \}$,

is given by $L = L_{\mathrm{mig}} + L_{\mathrm{res}} + L_{\mathrm{exc}}$, where $L_{\mathrm{mig}}$ describes the resampling of active individuals in different colonies (= migration), $L_{\mathrm{res}}$ describes the resampling of active individuals in the same colony, and $L_{\mathrm{exc}}$ describes the exchange of active and dormant individuals in the same colony. From now on, we denote the process associated with the generator $L$ by $Z = (Z(t))_{t \geq 0}$ with $Z(t) = (X_i(t), Y_i(t))_{i \in \mathbb{Z}^d}$, where $X_i(t)$ and $Y_i(t)$ represent the number of type-♥ active and dormant individuals at colony $i$ at time $t$, respectively. Since $Z$ is an interacting particle system, in order to show existence and uniqueness of the process we can in principle follow the method described by Liggett in [25, Chapter I, Section 3]. However, for Liggett's method to work, a uniform bound on the sizes $(N_i)_{i \in \mathbb{Z}^d}$ of the active populations is required. Since we do not impose such a bound, we instead construct the process by providing a unique solution to the martingale problem for $L$. The following proposition tells us that $L$ is indeed a Markov pregenerator and thus prepares the ground for proving the well-posedness of the martingale problem for $L$.
The existence of solutions to the martingale problem will be shown by using the techniques described in [25]. In order to establish uniqueness of the solution, we will need to exploit the dual process.

Duality
The dual process is a block-counting process associated to a spatial version of the interacting seed-bank coalescent described in Section 3.1.3. We briefly describe the spatial coalescent process in terms of an interacting particle system. At each site $i \in \mathbb{Z}^d$ there are two reservoirs, an active reservoir and a dormant reservoir, with $N_i \in \mathbb{N}$ and $M_i \in \mathbb{N}$ labeled locations, respectively. Each location in a reservoir can accommodate at most one particle. As before, we refer to the particles in an active and a dormant reservoir as active particles and dormant particles, respectively. The dynamics of the interacting particle system is as follows (see Figure 2).
• An active particle at site i ∈ Z d becomes dormant at rate λ by moving to a random labeled location (out of M i many) in the dormant reservoir at site i when the chosen labeled location is empty, otherwise it remains in the active reservoir.
• A dormant particle at site $i \in \mathbb{Z}^d$ becomes active at rate $\lambda K_i$ with $K_i = N_i / M_i$ by moving to a random labeled location (out of $N_i$ many) in the active reservoir at site $i$ when the chosen labeled location is empty, otherwise it remains in the dormant reservoir.
• An active particle at site i chooses a random labeled location (out of N j many) from the active reservoir at site j at rate a(i, j) and does the following: -If the chosen location in the active reservoir at site j is empty, then the particle moves to site j and thereby migrates from the active reservoir at site i to the active reservoir at site j.
-If the chosen location in the active reservoir at site j is occupied by a particle, then it coalesces with that particle.
Note that an active particle can migrate between different sites in $\mathbb{Z}^d$ and can coalesce with another active particle even when they are at different sites in $\mathbb{Z}^d$. For simplicity, we will impose the same assumptions on the migration kernel $a(\cdot, \cdot)$ as stated in Assumption 2.1. A configuration $(\eta_i)_{i \in \mathbb{Z}^d}$ of the particle system is an element of $\prod_{i \in \mathbb{Z}^d} (\{0, 1\}^{N_i} \times \{0, 1\}^{M_i})$, where $\eta_i$ represents the state of the labeled locations in the active and the dormant reservoir at site $i$ (1 means occupied by a particle, 0 means empty). Below we give the definition of the block-counting process associated to the spatial coalescent process described above. Although it is an interesting problem to construct the block-counting process starting from a configuration with infinitely many particles, we will restrict ourselves to configurations with finitely many particles, because this makes the state space countable. Thus, the block-counting process is a continuous-time Markov chain on a countable state space, and hence, in the definition below, it suffices to specify the possible transitions and their respective rates.

Definition 3.7. [Dual]
The dual process is a continuous-time Markov chain with state space $\mathcal{X}^*$ and with transition rates given in (3.23), where the configurations $\delta_{i,A}, \delta_{i,D} \in \mathcal{X}^* \subset \mathcal{X}$ are as in (3.13), and additions and subtractions of configurations are performed in accordance with (3.14). ✷ Here, $n_i(t)$ and $m_i(t)$ are the number of active and dormant particles at site $i \in \mathbb{Z}^d$ at time $t$. The first transition describes the coalescence of an active particle at site $i$ with other active particles elsewhere. The second and third transitions describe the movement of particles between the active and the dormant reservoir at site $i$. The fourth transition describes the migration of an active particle from site $i$ to site $j$. The following lemma tells us that the dual process $Z^*$ is a well-defined and non-explosive (equivalent to uniqueness) Feller process on the countable state space $\mathcal{X}^*$.
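The transitions of the dual can be sketched as a Gillespie simulation on two colonies. All concrete numbers below (reservoir sizes, kernel values, starting configuration) are hypothetical, and excluding self-choices at a particle's own location is a modelling convention of this sketch; in every realisation the finitely many dual particles coalesce down to a single one:

```python
import random

# Gillespie sketch of the spatial block-counting dual on two colonies
# (i = 0, 1) with inhomogeneous reservoir sizes. n[i], m[i] count active
# and dormant particles at colony i.
Ns, Ms, lam = [3, 5], [2, 4], 1.0
a = {(0, 0): 0.5, (0, 1): 0.5, (1, 0): 0.5, (1, 1): 0.5}  # toy kernel

def moves(n, m):
    out = []
    for i in (0, 1):
        K = Ns[i] / Ms[i]
        out.append((('sleep', i, i), lam * n[i] * (Ms[i] - m[i]) / Ms[i]))
        out.append((('wake', i, i), lam * K * m[i] * (Ns[i] - n[i]) / Ns[i]))
        for j in (0, 1):
            others = n[j] - 1 if i == j else n[j]          # no self-coalescence
            out.append((('coal', i, j), a[i, j] * n[i] * max(others, 0) / Ns[j]))
            if i != j:
                out.append((('mig', i, j), a[i, j] * n[i] * (Ns[j] - n[j]) / Ns[j]))
    return [(mv, r) for mv, r in out if r > 0]

def run(rng):
    n, m = [2, 3], [0, 0]                                  # start with 5 dual particles
    while sum(n) + sum(m) > 1:
        opts = moves(n, m)
        u = rng.random() * sum(r for _, r in opts)
        for mv, r in opts:
            u -= r
            if u <= 0:
                break
        kind, i, j = mv
        if kind == 'sleep':
            n[i] -= 1; m[i] += 1
        elif kind == 'wake':
            n[i] += 1; m[i] -= 1
        elif kind == 'coal':
            n[i] -= 1                                      # two particles merge into one
        else:
            n[i] -= 1; n[j] += 1                           # migration i -> j
    return sum(n) + sum(m)

assert run(random.Random(3)) == 1  # the dual coalesces down to one particle
```

The boundary factors again encode the repulsive interaction: a move into a full reservoir, or a migration to a fully occupied active reservoir, has rate zero.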

Lemma 3.8. [Uniqueness of dual]
There exists a unique minimal Feller process (Z * (t)) t≥0 on X * with transition rates given in (3.23).
Before we proceed we recall the definition of the martingale problem.

Definition 3.9. [Martingale problem]
Suppose that (L, D) is a Markov pregenerator, and let η ∈ X . A probability measure P η (or, equivalently, a process with law P η ) on D([0, ∞), X ) is said to solve the martingale problem for L with initial point η if The following theorem gives the duality relation between the dual process Z * and any solution to the martingale problem for (L, D). This type of duality is sometimes referred to as martingale duality.
Theorem 3.10. [Duality]
Suppose that the sizes $(N_i)_{i \in \mathbb{Z}^d}$ of the active populations are such that, for any $T > 0$, condition (3.25) holds. Then, for any $t \geq 0$, the duality relation (3.26) holds, where the expectations are taken with respect to $\mathbb{P}_\eta$ and $\mathbb{P}_\xi$, respectively.
Note that the duality function is a product over all colonies of the duality function that appeared in the single-colony model. The infinite products are well-defined: all but finitely many factors are 1, because of our assumption that there are only finitely many particles in the dual process. Also note that there is no restriction on (M i ) i∈Z d , the sizes of the dormant populations. This is because dormant individuals do not migrate and therefore do not feel the spatial extent of the system.
At first glance it may seem that (3.25) places a severe restriction on (N i ) i∈Z d , the sizes of the active populations. However, this is not the case. The following corollary provides us with a large class of active population sizes for which Theorem 3.10 is true under mild assumptions on the migration kernel a(·, ·).

Corollary 3.11. [Duality criterion]
Suppose that Assumption 2.1 is in force. Then (3.25), and consequently the duality relation in (3.26), hold for every choice of active population sizes $(N_i)_{i \in \mathbb{Z}^d}$ in a class determined by the tails of the migration kernel. Corollary 3.11 shows a trade-off: the more we restrict the tails of the migration kernel, the less we need to restrict the sizes of the active populations.

Well-posedness
We use a martingale problem for the generator L defined in (3.16), in the sense of [9, p.173], to construct Z. The following proposition gives existence of solutions for any choice of the reservoir sizes. As for the uniqueness of solutions, we will see that a restriction on the sizes of the active populations is required.

Proposition 3.12. [Existence]
Let $L$ be the generator defined in (3.16), acting on the set of local functions $D$ defined in (3.15). Then for all $\eta \in X$ there exists a solution $P_\eta$ (a probability measure on $D([0, \infty), X)$) to the martingale problem for $(L, D)$ with initial state $\eta$.
The following theorem gives the well-posedness of the martingale problem for $(L, D)$ under a restricted class of sizes of the active populations, and thus proves the existence of a unique Feller Markov process describing our multi-colony model.

Theorem 3.13. [Well-posedness]
• $Z$ is Feller and strong Markov, and its generator is an extension of $(L, D)$.
In view of the above result, from here onwards we implicitly assume that the restriction on $(N_i)_{i \in \mathbb{Z}^d}$ to $\mathcal{N}$ is always in force.

Equilibrium
Let us set $Z_i(t) := (X_i(t), Y_i(t))$ for $i \in \mathbb{Z}^d$ and denote by $\mu(t)$ the distribution of $Z(t)$. Further, for each $\theta \in [0, 1]$ and $i \in \mathbb{Z}^d$, let $\nu_\theta^i$ be the probability measure on $[N_i] \times [M_i]$ with Binomial marginals. For $\theta \in [0, 1]$, let $\nu_\theta$ be the distribution on $X$ defined by $\nu_\theta := \bigotimes_{i \in \mathbb{Z}^d} \nu_\theta^i$. Let $D : X \times X^* \to [0, 1]$ be the duality function, and let $|Z^*(t)|$ be the total number of dual particles present at time $t$. Then:
• $\nu$ is an equilibrium for the process $Z$.

Clustering criterion
We next analyse the long-time behaviour of the multi-colony Moran model with seed-banks. Our interest is to capture the nature of the equilibrium. To be precise, we investigate whether coexistence of different types is possible in equilibrium. The measures $\bigotimes_{i \in \mathbb{Z}^d} \delta_{(0,0)}$ and $\bigotimes_{i \in \mathbb{Z}^d} \delta_{(N_i, M_i)}$ are the trivial equilibria, where the system concentrates on only one of the two types. When the system converges to an equilibrium that is not a mixture of these two trivial equilibria, we say that coexistence happens. For $i \in \mathbb{Z}^d$, let us denote the frequencies of type-♥ active and dormant individuals at colony $i$ at time $t$ by $x_i(t) := X_i(t)/N_i$ and $y_i(t) := Y_i(t)/M_i$, respectively.

Definition 3.16. [Clustering and Coexistence]
The system is said to exhibit clustering if the following hold for all $i, j \in \mathbb{Z}^d$ and any initial configuration $\eta \in X$. Otherwise, the system is said to exhibit coexistence. ✷

The above conditions make sure that if an equilibrium exists, then it is a mixture of the two trivial equilibria.
The following criterion, which follows from Corollary 3.11, gives an equivalent condition for clustering. Note that the system clusters if and only if the genetic variability at time $t$ between any two colonies converges to 0 as $t \to \infty$. From the duality relation in Theorem 3.10 it follows that this quantity is determined by the state of the dual process starting from two particles.

Proofs: duality and equilibrium for the single-colony model

Section 4.1 contains the proof of Theorem 3.2, which follows the algebraic approach to duality described in [4,30]. Section 4.2 contains the proofs of Proposition 3.3 and Corollary 3.4, which use the duality in the single-colony model.

Duality and change of representation
Before we proceed with the proof of Theorem 3.2 and other results related to stochastic duality, it is worth stressing the importance of duality theory. Though originally introduced in the context of interacting particle systems, over the last decade duality theory has gained popularity in various fields, ranging from statistical physics and stochastic analysis to population genetics. One reason behind this wide interest is the simplification that duality provides: it often allows one to extract information about a complex stochastic process through a simpler process. To date, there exist two systematic approaches towards duality in the literature, namely, the pathwise construction and the Lie-algebraic framework. The former is more practical and widespread in the context of mathematical population genetics [16,7,20,21], while the latter has been developed more recently and reveals deeper mathematical structures behind duality, and often also provides a larger class of duality functions (see e.g. [4], [10], [17], [30] for a general overview and further references). In what follows, we adopt the Lie-algebraic framework suggested by Carinci et al. (2015) [4] and prepare the ground for this setting. The downside is that this approach does not capture the underlying genealogy of the original process. However, it does offer the opportunity to obtain a larger class of duality functions by applying symmetries from the Lie algebra to an already existing duality function [11]. In this paper we refrain from exploring the latter aspect of the Lie-algebraic framework.
We start with briefly recalling that a (real) Lie algebra $\mathfrak{g}$ is a linear space over $\mathbb{R}$ endowed with a so-called Lie bracket $[\cdot, \cdot] : \mathfrak{g} \times \mathfrak{g} \to \mathfrak{g}$ that is bilinear, skew-symmetric and satisfies the Jacobi identity [30]. The requirements of bilinearity and skew-symmetry uniquely characterize a Lie bracket by its action on a basis of $\mathfrak{g}$. An example of a (real) Lie algebra is the well-known su(2)-algebra, which is the 3-dimensional vector space over $\mathbb{R}$ defined by the action of a Lie bracket on its basis elements $\{J^+, J^-, J^0\}$ as in (4.1). It is straightforward to see that the operators defined in (4.2) satisfy the same commutation relations as in (4.1). Thus, for each $\alpha \in \mathbb{N}$, the Lie homomorphism $\phi_\alpha : su(2) \to gl(V_\alpha)$, defined by its action on the generators $\{J^+, J^-, J^0\}$, is a finite-dimensional representation of su(2). Similarly, we can verify that $\{J^{\alpha,+}, J^{\alpha,-}, J^{\alpha,0}\}$, $\alpha \in \mathbb{N}$, form a representation of the dual su(2)-algebra (defined by the commutation relations in (4.1), but with opposite signs). Below we introduce the notion of duality between two operators and prove a lemma that will be crucial in the proof of duality of both the single-colony and the multi-colony model. The relevance to our context of the above discussion on su(2) and its dual algebra will become clear as we go along. An operator $A$ acting on functions on $\Omega$ is said to be dual to an operator $B$ acting on functions on $\widehat{\Omega}$ with duality function $D$ if $(A D(\cdot, y))(x) = (B D(x, \cdot))(y)$ for all $(x, y) \in \Omega \times \widehat{\Omega}$. ✷ The following lemma intertwines the su(2)-algebra and its dual with a duality function; under its hypotheses, the duality relations in (4.6) hold. Proof. By straightforward calculations, it can be shown that $d_\alpha(x, n)$ satisfies the relations from which the dualities in (4.6) follow immediately. The basic idea behind the algebraic approach to duality is to write the generator of a given process in terms of simple operators that form a representation of some known Lie algebra, and to make an Ansatz to obtain an intertwiner of the chosen representation.
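As a sanity check on the finite-dimensional representations, the sketch below numerically verifies the commutation relations $[J^+, J^-] = J^0$, $[J^0, J^\pm] = \pm 2 J^\pm$ in the $(\alpha+1)$-dimensional representation. This is one common normalization of the su(2)/sl(2) relations, which may differ from (4.1) by scalar factors; the matrices are illustrative and are not the operators in (4.2).

```python
import numpy as np

alpha = 3                 # representation acts on a space of dimension alpha + 1
dim = alpha + 1

# Matrices of J^+, J^-, J^0 on the basis v_0, ..., v_alpha
Jp = np.zeros((dim, dim))
Jm = np.zeros((dim, dim))
for k in range(1, dim):
    Jp[k - 1, k] = alpha - k + 1      # raising: v_k -> (alpha - k + 1) v_{k-1}
for k in range(dim - 1):
    Jm[k + 1, k] = k + 1              # lowering: v_k -> (k + 1) v_{k+1}
J0 = np.diag([alpha - 2 * k for k in range(dim)])

comm = lambda A, B: A @ B - B @ A
assert np.allclose(comm(Jp, Jm), J0)
assert np.allclose(comm(J0, Jp), 2 * Jp)
assert np.allclose(comm(J0, Jm), -2 * Jm)
print("su(2) commutation relations hold in dimension", dim)
```
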
The intertwiner $d_\alpha$ in the above lemma was first identified in [12, Lemma 1] as a duality function in disguise for the classical duality between the Moran model and the block-counting process of Kingman's coalescent. Recently, in [4] this duality was put in the algebraic framework by deriving it from an intertwining via $d_\alpha$ of two representations of the Heisenberg algebra H(2). The connection of $d_\alpha$ to the su(2)-algebra was also made in [11, Section 3.2], where the authors obtained a self-duality function of the 2j-SEP factorized in terms of $d_\alpha$ by considering symmetries related to the su(2)-algebra. The relation of our seed-bank model to the su(2)-algebra becomes clear once we realize that the seed-bank component in our single-colony model is an inhomogeneous version of the 2j-SEP on two sites. Thus, it is natural to expect that the classical duality of the Moran model can be retrieved from representations of the su(2)-algebra as well. The above lemma indeed provides the ingredients to establish the duality of our single-colony model from representations of the su(2)-algebra. Although it is possible to guess the dual process of the single-colony model without going into the Lie-algebraic framework, the true usefulness of this approach lies in identifying the dual of the spatial model, where such speculation is no longer feasible. Proof of Theorem 3.2. Since $\Omega$ is countable, it is enough to show the generator criterion for duality, i.e.,

(4.10) $\bigl(G D(\,\cdot\,; (n, m))\bigr)(X, Y) = \bigl(\widehat{G} D((X, Y);\,\cdot\,)\bigr)(n, m)$, for all $(X, Y), (n, m) \in \Omega$.

Remark 4.3. [Seed-bank and su(2)-algebra]
In our notation, (4.10) translates into $G \xrightarrow{D} \widehat{G}$. It is somewhat tedious to verify (4.10) by direct computation. Rather, we will write down a proof with the help of the elementary operators defined in (4.2). This approach will also reveal the underlying change of representation of the two operators $G$, $\widehat{G}$ that is embedded in the duality.
Note that (4.11) holds, where the subscripts indicate which variable of the associated function the operators act on.

Equilibrium
Proof of Proposition 3.3. For $x \in \mathbb{R}$ and $r \in \mathbb{N}$, let $(x)_r$ be the falling factorial, where we put $(x)_r = 1$ when $r = 0$. For any $n \in \mathbb{N}_0$, we can write $x^n$ as a linear combination of falling factorials, and the expectation in the last line of (4.14) is with respect to the dual process. Let $T$ be the first time at which there is only one particle left in the dual, i.e., $T = \inf\{t > 0 : n_t + m_t = 1\}$. Note that, for any initial state $(i, j) \in \Omega \setminus \{(0, 0)\}$, $T < \infty$ with probability 1, and the distribution of $(n_t, m_t)$ converges as $t \to \infty$ to the invariant distribution $\frac{N}{N+M}\,\delta_{(1,0)} + \frac{M}{N+M}\,\delta_{(0,1)}$. So (4.16) holds for any $(i, j) \in \Omega \setminus \{(0, 0)\}$, where we use that the second term after the first equality converges to 0 because $T < \infty$ with probability 1. Combining (4.16) with (4.14), we get the claim, where the last equality follows from (4.13) and the fact that $c_{n,0}\, c_{m,0} = 0$ when $(n, m) \neq (0, 0)$.
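The coefficients $c_{n,r}$ in the falling-factorial expansion of $x^n$ are, in the standard convention, the Stirling numbers of the second kind; in particular $c_{n,0} = 0$ for $n \ge 1$, consistent with the last step of the proof. A quick numerical check of the expansion (an illustration, not part of the proof):

```python
from functools import lru_cache

def falling(x, r):
    """Falling factorial (x)_r = x (x-1) ... (x-r+1), with (x)_0 = 1."""
    out = 1
    for k in range(r):
        out *= (x - k)
    return out

@lru_cache(maxsize=None)
def stirling2(n, r):
    """S(n, r): number of partitions of an n-set into r non-empty blocks."""
    if n == r:
        return 1
    if r == 0 or r > n:
        return 0
    return r * stirling2(n - 1, r) + stirling2(n - 1, r - 1)

# x^n = sum_r S(n, r) (x)_r for all non-negative integers x, n
for x in range(6):
    for n in range(6):
        assert x ** n == sum(stirling2(n, r) * falling(x, r) for r in range(n + 1))
print("x^n equals its Stirling expansion in falling factorials")
```
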

Proof of Corollary 3.4. Note that the distribution of a two-dimensional random vector $(Z_1, Z_2)$ taking values in $[N] \times [M]$ is determined by the mixed moments
where $(x, y) = f^{-1}(i)$. We can write $c = Ap$, where $p = (p_i)_{i \in I}$, $c = (c_i)_{i \in I}$ and $A$ is an invertible $(N+1)(M+1) \times (N+1)(M+1)$ matrix. Hence $p = A^{-1}c$ is uniquely determined by the mixed moments, and convergence of the mixed moments of $(X(t), Y(t))$, as shown in Proposition 3.3, is enough to conclude that $(X(t), Y(t))$ converges in distribution as $t \to \infty$ to a random vector $(X_\infty, Y_\infty)$. The distribution of $(X_\infty, Y_\infty)$ is also uniquely determined, and is given by

Proofs: duality and well-posedness for the multi-colony model
In Section 5.1, we give the proof of Lemma 3.8. In Section 5.2, we introduce equivalent versions for the multi-colony setting of the operators defined in (4.2) for the single-colony setting, and use these to prove Theorem 3.10 and Corollary 3.11. In Section 5.3 we prove Proposition 3.6, Proposition 3.12 and Theorem 3.13.

Proof of Lemma 3.8
Proof. Note that the rate matrix is nothing but the dual generator $L_{\mathrm{dual}}$ obtained from the rates specified in (3.23). The action of $L_{\mathrm{dual}}$ on a function $f : X^* \to \mathbb{R}$ is given by the corresponding sum, where $\xi = (n_i, m_i)_{i \in \mathbb{Z}^d} \in X^*$ and the configurations $\delta_{i,A}, \delta_{i,D} \in X^* \subset X$ are as in (3.13). Let us define the function $V : X^* \to (0, \infty)$ and, for $k \in \mathbb{N}$, the associated sets. Since $X^*$ contains configurations with finitely many particles, $V$ is well-defined. It is straightforward to see that the required bounds hold. Let $\xi = (n_i, m_i)_{i \in \mathbb{Z}^d} \in X^*$ be arbitrary. Note the estimate for any $i, j \in \mathbb{Z}^d$ with $i \neq j$, and so by using (5.1) we obtain the desired inequality, where $c = \sum_{i \in \mathbb{Z}^d} a(0, i) < \infty$. Hence, setting $p := \max\{1, c\} > 0$, we obtain the bound that proves the claim.

Table 3: Action of operators on $f \in C(X)$.

Duality
The same duality relations as in Lemma 4.2 hold for these operators as well. The only difference is that the duality function becomes the site-wise product of the duality functions appearing in the single-colony model.

Lemma 5.1. [Multi-colony intertwiner]
Let $D : X \times X^* \to [0, 1]$ be the function defined by the site-wise product of the single-colony duality functions. Proof. Recall that $L = L_{\mathrm{Mig}} + L_{\mathrm{Res}} + L_{\mathrm{Exc}}$, where $L_{\mathrm{Mig}}$, $L_{\mathrm{Res}}$, $L_{\mathrm{Exc}}$ are defined in (3.17)–(3.19). In terms of the operators defined earlier, these have the representations displayed below. Similarly, the generator $\widehat{L}$ of the dual process defined in Definition 3.7, acting on $f \in C(X^*)$, is given by $\widehat{L} = \widehat{L}_{\mathrm{Mig}} + \widehat{L}_{\mathrm{Exc}} + \widehat{L}_{\mathrm{King}}$, where $\xi = (n_i, m_i)_{i \in \mathbb{Z}^d} \in X^*$. The representations of these operators are

Proof of duality relation
Proof of Theorem 3.10. We combine [9, Theorem 4.11 and Corollary 4.13] and reinterpret these in our context: • Let $(\eta_t)_{t \ge 0}$ and $(\xi_t)_{t \ge 0}$ be two independent processes on $E_1$ and $E_2$ that are solutions to the martingale problems for $(L_1, D_1)$ and $(L_2, D_2)$ with initial states $x \in E_1$ and $y \in E_2$. Assume that $D : E_1 \times E_2 \to \mathbb{R}$ is such that $D(\,\cdot\,; \xi) \in D_1$ for any $\xi \in E_2$ and $D(\eta\,;\,\cdot\,) \in D_2$ for any $\eta \in E_1$. Also assume that for each $T > 0$ there exists an integrable random variable $U_T$ such that the bound in (5.14) holds. To apply the above, pick $E_1 = X$, $E_2 = X^*$, $L_1 = L$, $L_2 = L_{\mathrm{dual}}$, $D_1 = D$, $D_2 = C(X^*)$, where $L_{\mathrm{dual}}$ is the generator of the dual process $Z^*$, and set $D$ to be the function defined in Lemma 5.1. Note that, since $D$ contains local functions only, $D(\,\cdot\,; \xi) \in D$ for any $\xi \in X^*$ and, since $X^*$ is countable, $D(\eta\,;\,\cdot\,) \in C(X^*)$ for any $\eta \in X$. Fix $x = (X_i, Y_i)_{i \in \mathbb{Z}^d} \in X$ and $y = (n_i, m_i)_{i \in \mathbb{Z}^d} \in X^*$. Note that, by Proposition 5.2, $(L_1 D(\,\cdot\,; y))(x) = (L_2 D(x\,;\,\cdot\,))(y)$. Pick $(\xi_t)_{t \ge 0}$ to be the process $Z^*$ with initial state $y$; it is the unique solution to the martingale problem for $(L_{\mathrm{dual}}, C(X^*))$ with initial state $y$. Let $(\eta_t)_{t \ge 0}$ denote any solution $Z$ to the martingale problem for $(L, D)$ with initial state $x$. Fix $T > 0$ and note the corresponding bound for $0 \le s, t < T$. Now, by Definition 3.7, the process $(\xi_t)_{t \ge 0}$ is the interacting particle system with coalescence in which the total number of particles can only decrease in time. Define the random variable $U_T$ accordingly. Then, combining (5.17)–(5.18) with the fact that the function $D$ takes values in $[0, 1]$, we see that $U_T$ satisfies all the conditions in (5.14), while assumption (3.25) in Theorem 3.10 ensures the integrability of $U_T$.

Proof of duality criterion
Proof of Corollary 3.11. Let $\xi = (n_i, m_i)_{i \in \mathbb{Z}^d} \in X^*$ and $T > 0$ be fixed. By Theorem 3.10, it suffices to show the required bound for any $(N_i)_{i \in \mathbb{Z}^d} \in \mathcal{N}$, where $P_\xi$ is the law of the dual process $Z^*$ started from initial state $\xi$. Let $n = \sum_{i \in \mathbb{Z}^d} (n_i + m_i)$ be the initial number of particles, and let $N(t)$ be the total number of migration events within the time interval $[0, t]$. We will construct a Poisson process $N^*$ via coupling such that $N(t) \le N^*(t)$ for all $t \ge 0$ with probability 1. For this purpose, consider $n$ independent particles performing random walks on $\mathbb{Z}^d$ according to the migration kernel $a(\cdot, \cdot)$. For each $k = 1, \ldots, n$, let $\xi_k(t)$ and $\xi_k^*(t)$ denote the positions of the $k$-th dependent and independent particle at time $t$, respectively. We take $\xi_k(0) = \xi_k^*(0)$ and couple each $k$-th interacting particle with the $k$-th independent particle as below: • If the independent particle makes a jump from site $\xi_k^*(t)$ to $j^* \in \mathbb{Z}^d$, then the dependent particle jumps from $\xi_k(t)$ to $j = \xi_k(t) + (j^* - \xi_k^*(t))$ with probability $p_k(t)$, defined in terms of $n_j(t)$, the number of active particles at site $j$, provided the dependent particle is in an active and non-coalesced state.
• The dependent particle performs the other transitions (waking up, becoming dormant and coalescence) independently of the previous migration events, with the prescribed rates defined in Definition 3.7.
Note that, since the migration kernel is translation invariant, under the above coupling the effective rate at which a dependent particle migrates from site $i$ to $j$ is $n_i\, a(i, j)\,(1 - \frac{n_j}{N_j})$ when there are $n_i$ and $n_j$ active particles at sites $i$ and $j$, respectively. Also, if $N_k(t)$ and $N_k^*(t)$ are the numbers of migration steps made within the time interval $[0, t]$ by the $k$-th dependent and independent particle, respectively, then under this coupling $N_k(t) \le N_k^*(t)$ with probability 1. Set $N^*(\cdot) = \sum_{k=1}^n N_k^*(\cdot)$. Then, clearly, $N(t) \le N^*(t)$ for all $t \ge 0$. Also, $N^*$ is a Poisson process with intensity $cn$, since each independent particle migrates at total rate $c$. Let $Y_l, X_l \in \mathbb{Z}^d$ denote the steps at the $l$-th migration event in the dependent and independent particle systems, respectively. Note that $(X_l)_{l \in \mathbb{N}}$ are i.i.d. with distribution $(a(0, i))_{i \in \mathbb{Z}^d}$. Since, under the above coupling, a dependent particle copies the step of an independent particle with a certain probability (possibly 0), and $\Gamma(0)$ is the minimum length of the box within which all $n$ dependent particles at time 0 are located, we obtain for any $t \ge 0$ a bound in terms of $S_{N^*(t)} = \sum_{l=1}^{N^*(t)} |X_l|$. To prove part (a), note that $\mathbb{E}[e^{\delta S_{N^*(T)}}] < \infty$ and so, by Chebyshev's inequality, the tail bound holds. Thus, the inequality in (5.24) reduces to (5.26). Since, under the assumption of part (a), $\lim_{k \to \infty} \frac{1}{k} \log c_k = 0$, there exists a $K \in \mathbb{N}$ such that $c_k \le e^{\delta k/2}$ for all $k \ge K$. Hence, using (5.26), we obtain the required estimate, which settles part (a).
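The pathwise domination $N(t) \le N^*(t)$ is an instance of Poisson thinning: every dependent migration step is an accepted copy of an independent step. A minimal simulation sketch of this mechanism, with a constant acceptance probability `accept_prob` standing in for the state-dependent $p_k(t)$ (the rate $c$ and horizon $T$ are illustrative values):

```python
import random

random.seed(7)
c, T = 2.0, 10.0    # illustrative jump rate and time horizon

def coupled_counts(accept_prob):
    """One run: N*(T) counts all rate-c Poisson events, N(T) only accepted ones."""
    t, n_star, n = 0.0, 0, 0
    while True:
        t += random.expovariate(c)      # inter-event time of the Poisson process
        if t > T:
            return n, n_star
        n_star += 1
        if random.random() < accept_prob:   # thinning: dependent walk copies the jump
            n += 1

for _ in range(1000):
    n, n_star = coupled_counts(accept_prob=0.6)
    assert n <= n_star                  # pathwise domination, as in the coupling
print("N(T) <= N*(T) held in all runs")
```
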
To prove part (b), note that, under the assumption $\sum_{i \in \mathbb{Z}^d} \|i\|^\gamma\, a(0, i) < \infty$ for some $\gamma > d + \delta$, we have $\mathbb{E}[S_{N^*(T)}^\gamma] < \infty$, and since $S_{N^*(T)}$ is a positive random variable, we get the corresponding tail bound. From (5.24) we get the analogous estimate. By the assumption of part (b), there exists a $C > 0$ such that the stated bound holds, and so, using (5.28), we obtain the claim, which settles part (b).

Well-posedness
In this section we prove Proposition 3.6, Proposition 3.12 and Theorem 3.13.

Existence
Since the state space $X$ is compact, the theory described in [25, Chapter I, Section 3] is applicable in our setting without any significant changes. The interacting particle systems in [25] have state space $W^S$, where $W$ is a compact phase space and $S$ is a countable site space. In our setting, the site space is $S = \mathbb{Z}^d$, but the phase space differs from site to site, namely $[N_i] \times [M_i]$ at site $i \in \mathbb{Z}^d$. The general form of the generator of an interacting particle system in [25] is as in (5.34), where the sum is taken over all finite subsets $T$ of $S$, and $\eta^\xi$ is the configuration obtained from $\eta$ by replacing its restriction to $T$ by $\xi$. For finite $T \Subset S$, $c_T(\eta, d\xi)$ is a finite positive measure on $W_T = W^T$. To make the latter compatible with our setting, we define the rates as follows. The interpretation is that $\eta$ is the current configuration of the system, $c_T(\eta, W_T)$ is the total rate at which a transition occurs involving all the coordinates in $T$, and $c_T(\eta, d\xi)/c_T(\eta, W_T)$ is the distribution of the restriction to $T$ of the new configuration after that transition has taken place. Fix $\eta = (X_i, Y_i)_{i \in \mathbb{Z}^d} \in X$. Comparing (5.34) with the formal generator $L$ defined in (3.16), we see that the form of $c_T(\cdot, \cdot)$ is as follows. Note that the total mass is finite, since $\sum_{i \in \mathbb{Z}^d} a(0, i) < \infty$.

Lemma 5.3. [Bound on rates]
Proof of Proposition 3.6. By [25, Proposition 6.1 of Chapter I], it suffices to show (5.38), where the sum is taken over all finite subsets $T \Subset S$ containing $i \in S$. Since in our case $S = \mathbb{Z}^d$, we let $i \in \mathbb{Z}^d$ be fixed. By Lemma 5.3, the sum reduces to $c_{\{i\}}$, and clearly $c_{\{i\}} \le (c + \lambda) N_i < \infty$.
Proof of Proposition 3.12. By [25, Proposition 6.1 and Theorem 6.7 of Chapter I], to show existence of solutions to the martingale problem for (L, D), it is enough to prove that (5.38) is satisfied. But we already showed this in the proof of Proposition 3.6.

Uniqueness
Before we turn to the proof of Theorem 3.13, we state and prove the following proposition, which, along with the duality established in Corollary 3.11, will play a key role in the proof of the uniqueness of solutions to the martingale problem. Note that it is enough to show that $P_X$ is determined by the family $F$. By (4.13), the family $F$ is equivalent to the family containing the mixed moments of $(X_1, \ldots, X_n)$. Since $X$ takes a total of $N = \prod_{i=1}^n (N_i + 1)$ values, we can write the distribution $P_X$ as the $N$-dimensional vector $p = (p_1, \ldots, p_N)$, where $p_i = P_X(X = f^{-1}(i))$ and $f$ is a suitable enumeration of the values. Note that $F^*$ also contains $N$ elements, and so we can write $F^*$ as the $N$-dimensional vector $e = (e_1, \ldots, e_N)$, where $e_i = \mathbb{E}\bigl[\prod_{k=1}^n X_k^{\alpha_k}\bigr]$ with $(\alpha_1, \ldots, \alpha_n) = f^{-1}(i)$. We show that there exists an invertible linear operator that maps $p$ to $e$. Indeed, for $i = 1, \ldots, n$, define the $(N_i + 1) \times (N_i + 1)$ Vandermonde matrix $A_i$. Being Vandermonde matrices, all $A_i$ are invertible. Finally, define the $N \times N$ matrix $A$ by $A = A_1 \otimes A_2 \otimes \cdots \otimes A_n$, where $\otimes$ denotes the Kronecker product for matrices. Then $A$ is invertible because all $A_i$ are. Also, we can check that $Ap = e$, and hence the distribution of $X$, given by $p = A^{-1}e$, is uniquely determined by $e$, i.e., by the family $F^*$.
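The moment-inversion argument can be illustrated numerically: on a small grid, the Kronecker product of Vandermonde matrices maps the probability vector $p$ to the mixed-moment vector $e$ and is invertible, so $p$ is recovered from $e$. A sketch (the grid sizes `N1`, `N2` are arbitrary illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(0)
N1, N2 = 2, 3            # X1 takes values in {0..N1}, X2 in {0..N2}

# Vandermonde matrices: row index = moment order, column index = value
A1 = np.array([[x ** a for x in range(N1 + 1)] for a in range(N1 + 1)], float)
A2 = np.array([[x ** a for x in range(N2 + 1)] for a in range(N2 + 1)], float)
A = np.kron(A1, A2)      # invertible, since A1 and A2 are

# A random distribution p on the grid, flattened with X1 as the outer index
p = rng.random((N1 + 1) * (N2 + 1))
p /= p.sum()

e = A @ p                # vector of mixed moments E[X1^a1 X2^a2]
p_rec = np.linalg.solve(A, e)   # moments determine the distribution
assert np.allclose(p_rec, p)
print("distribution recovered from its mixed moments")
```
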
Proof of Theorem 3.13. We use [9, Proposition 4.7], which states the following (reinterpreted in our setting): • Let $S_1$ be compact and $S_2$ be separable. Let $x \in S_1$, $y \in S_2$ be arbitrary and $D : S_1 \times S_2 \to \mathbb{R}$ be such that the set $\{D(\,\cdot\,; z) : z \in S_2\}$ is separating on the set of probability measures on $S_1$. Assume that, for any two solutions $(\eta_t)_{t \ge 0}$ and $(\xi_t)_{t \ge 0}$ of the martingale problems for $(L_1, D_1)$ and $(L_2, D_2)$ with initial states $x$ and $y$, the duality relation holds for all $t \ge 0$. If for every $z \in S_2$ there exists a solution to the martingale problem for $(L_2, D_2)$ with initial state $z$, then for every $\eta \in S_1$ uniqueness holds for the martingale problem for $(L_1, D_1)$ with initial state $\eta$.
where $L_{\mathrm{dual}}$ is the generator of the dual process $Z^*$. Note that in our setting the martingale problem for $(L_{\mathrm{dual}}, C(X^*))$ is already well-posed (the unique solution is the dual process $Z^*$ from Lemma 3.8). Hence, combining the above observations with Proposition 5.4 and Corollary 3.11, we get uniqueness of the solutions to the martingale problem for $(L, D)$ for every initial state $\eta \in X$. The second claim follows from [25, Theorem 6.8 of Chapter I].

Proofs: equilibrium and clustering criterion
In Section 6.1 we prove Theorem 3.14 and Corollary 3.15. In Section 6.2 we derive expressions for the single-site genetic variability in terms of the dual process. In Section 6.3 we use one dual particle to write down expressions for first moments. In Section 6.4 we use two dual particles to write down expressions for second moments. In Section 6.5 we use these expressions to prove Theorem 3.17.

Convergence to equilibrium
Proof of Theorem 3.14. Since the state space $X$ is compact, the set of all probability measures on $X$ is compact as well, by Prokhorov's theorem. It therefore suffices to prove convergence of the finite-dimensional distributions of $Z(t) = (X_i(t), Y_i(t))_{i \in \mathbb{Z}^d}$. Now recall from the proof of Proposition 5.4 that the distribution of an $n$-dimensional random vector $X(t) := (X_1(t), \ldots, X_n(t))$ taking values in $\prod_{l=1}^n [N_l]$ is determined by its mixed moments. In fact, the distribution of $X(t)$ converges if and only if $\mathbb{E}\bigl[\prod_{l=1}^n \binom{X_l(t)}{\alpha_l} / \binom{N_l}{\alpha_l}\bigr]$ converges for all $(\alpha_l)_{1 \le l \le n} \in \prod_{l=1}^n [N_l]$ as $t \to \infty$. Since our duality function is of product form, it suffices to show that $\lim_{t \to \infty} \mathbb{E}_{\nu_\theta}[D(Z(t); \eta)]$ exists for all $\eta \in X^*$. Let $\eta \in X^*$ be fixed. By duality, we have the corresponding identity, where $\mathbb{E}_\xi$ denotes expectation w.r.t. the law of $Z(t)$ started at configuration $\xi \in X$, $Z^*(t) = (n_i(t), m_i(t))_{i \in \mathbb{Z}^d}$ is the dual process started at configuration $\eta$, and $\mathbb{E}_\eta$ denotes expectation w.r.t. the law of the dual process. A simple calculation shows that if $V$ is a random variable with distribution Binomial$(N, p)$, then $\mathbb{E}\binom{V}{n} / \binom{N}{n} = p^n$ for $0 \le n \le N$. Since $(X_i(0), Y_i(0))_{i \in \mathbb{Z}^d}$ are all independent under $\nu_\theta$ with Binomials as marginal distributions, we have $\mathbb{E}_{\nu_\theta}[D(Z(t); \eta)] = \mathbb{E}_\eta[\theta^{|Z^*(t)|}]$, where $|Z^*(t)| := \sum_{i \in \mathbb{Z}^d} (n_i(t) + m_i(t))$ is the total number of particles in the dual process at time $t$. Now, since the dual process is coalescing, $|Z^*(t)|$ is non-increasing in $t$. Since $\theta \in [0, 1]$, we see that $\lim_{t \to \infty} \mathbb{E}_\eta[\theta^{|Z^*(t)|}]$ exists, which proves the existence of an equilibrium measure $\nu$ such that the distribution of $Z(t)$ converges weakly to $\nu$. Proof of Corollary 3.15. This follows by choosing $\eta = \delta_{i,A}$ and $\eta = \delta_{i,D}$ in the last part of Theorem 3.14 and noting that $\mathbb{E}_\eta[\theta^{|Z^*(t)|}] = \theta$ when $|\eta| = 1$.
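The binomial identity $\mathbb{E}\binom{V}{n}/\binom{N}{n} = p^n$ used above can be checked exactly with rational arithmetic (the values `N = 7` and `p = 2/5` are arbitrary illustrative choices):

```python
from fractions import Fraction
from math import comb

N = 7
p = Fraction(2, 5)

for n in range(N + 1):
    # E[C(V, n)] for V ~ Binomial(N, p), computed exactly over all values of V
    lhs = sum(comb(v, n) * comb(N, v) * p ** v * (1 - p) ** (N - v)
              for v in range(N + 1)) / Fraction(comb(N, n))
    assert lhs == p ** n

print("E[C(V, n)] / C(N, n) = p^n verified exactly for N =", N)
```
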

Genetic variability
For $i, j \in \mathbb{Z}^d$ and $t \ge 0$, define $\Delta_{(i,A),(j,A)}(t)$, the genetic variability (also frequently referred to as 'sample heterozygosity') at time $t$ between the active populations of colonies $i$ and $j$, i.e., the probability that two individuals drawn randomly from the two populations at time $t$ are of different type, and $\Delta_{(i,A),(j,D)}(t)$, the genetic variability at time $t$ between the active population of colony $i$ and the dormant population of colony $j$. Note that the conditions in Definition 3.16 are equivalent to the vanishing of these quantities in the limit, where the expectation is taken conditional on an arbitrary initial condition $(X_i(0), Y_i(0))_{i \in \mathbb{Z}^d}$, which we suppress from the notation. We use the dual process to compute $\mathbb{E}(\Delta_{(i,A),(j,A)}(t))$ and $\mathbb{E}(\Delta_{(i,A),(j,D)}(t))$ in terms of the duality function $D$ defined in Lemma 5.1, where $\delta_{i,A}, \delta_{j,A}$ are defined in (3.13). By the duality relation in (3.26), we have (6.14), where $\eta_0 = Z^*(0)$ and the expectation in the left-hand side is taken with respect to the dual process $(\xi_t)_{t \ge 0} = Z^*$ defined in Definition 3.7. Combining the above with (6.11)–(6.12), we get (6.15) and (6.16). In Sections 6.3–6.4 we derive expressions for the terms appearing in (6.15)–(6.16).

Dual: single particle
We saw earlier that, in order to compute the first moments of $X_i(t)$ and $Y_i(t)$, we need to put a single particle at site $i$ in the active and the dormant state, respectively, as initial configuration. This motivates us to analyse the dual process when it starts with a single particle. The generator $L_{\mathrm{dual}}$ of the dual process can be written as a sum of parts, for $f \in C(X^*)$ and $\xi = (n_i, m_i)_{i \in \mathbb{Z}^d} \in X^*$. When there is a single particle in the system at time 0, and consequently at any later time, the only parts of the generator that are non-zero are $L_{AD}$, $L_{DA}$ and $L_{\mathrm{Mig}}$. Here, $L_{AD}$ turns an active particle at site $i$ into a dormant particle at site $i$ at rate $\lambda$, $L_{DA}$ turns a dormant particle at site $i$ into an active particle at site $i$ at rate $\lambda K_i$, with $K_i = N_i/M_i$, while $L_{\mathrm{Mig}}$ moves an active particle at site $i$ to site $j \neq i$ at rate $a(i, j)$. Let us denote the state of the particle at time $t$ by $\xi(t) \in \mathbb{Z}^d \times \{A, D\}$, where the first coordinate of $\xi(t)$ is the location of the particle and the second coordinate indicates whether the particle is active (A) or dormant (D). Let $P_\xi$ be the law of the process $(\xi(t))_{t \ge 0}$ with initial state $\xi$.
Proof. Recall that, via the duality relation, $\mathbb{E}[X_i(t)/N_i]$ can be written as an expectation with respect to the dual process with initial state $\delta_{i,A}$ (a single active particle at site $i$), which has law $P_{(i,A)}$. Since the term inside the expectation equals $X_k(0)/N_k$ when $\xi(t) = (k, A)$ and $Y_k(0)/M_k$ when $\xi(t) = (k, D)$, the claim follows immediately. The same argument applies to $\mathbb{E}[Y_i(t)/M_i]$ with initial condition $(i, D)$ in the dual process.
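The active/dormant component of the single dual particle is an autonomous two-state Markov chain (A $\to$ D at rate $\lambda$, D $\to$ A at rate $\lambda K_i$), whose stationary probability of being active is $K_i/(1+K_i) = N_i/(N_i+M_i)$, matching the invariant distribution quoted in the proof of Proposition 3.3. A minimal simulation sketch (migration is omitted since it does not affect the flag; the values of $\lambda$, $K$ and the horizon are illustrative):

```python
import random

random.seed(1)
lam, K = 1.0, 2.0    # K = N_i / M_i; here N = 2M, so P(active) -> N/(N+M) = 2/3
T = 10_000.0         # long time horizon for the empirical time-average

t, active = 0.0, True
time_active = 0.0
while t < T:
    rate = lam if active else lam * K     # A -> D at rate lam, D -> A at rate lam*K
    dt = random.expovariate(rate)
    if active:
        time_active += min(dt, T - t)     # clip the last holding interval at T
    t += dt
    active = not active

frac = time_active / T
print(f"empirical fraction of time active: {frac:.3f} (expected {2/3:.3f})")
```
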

Dual: two particles
We need to find expressions for the second moments appearing in (6.9)–(6.10) in order to fully specify $\mathbb{E}(\Delta_{(i,A),(j,A)}(t))$ and $\mathbb{E}(\Delta_{(i,A),(j,D)}(t))$. This requires us to analyse the dual process starting from two particles. Unlike for the single-particle system, now all parts of the generator $L_{\mathrm{dual}}$ (see (6.17)) are non-zero until the two particles coalesce into a single particle. The two particles repel each other: one particle discourages the other from coming to the same location. The rates in the two-particle system are:
• (Migration) An active particle at site $i$ migrates to site $j$ at rate $a(i, j)$ if there is no active particle at site $j$, otherwise at rate $a(i, j)\,(1 - \frac{1}{N_j})$.
• (A → D) An active particle at site $i$ becomes dormant at site $i$ at rate $\lambda$ if there is no dormant particle at site $i$, otherwise at rate $\lambda\,(1 - \frac{1}{M_i})$.
• (D → A) A dormant particle at site $i$ becomes active at site $i$ at rate $\lambda K_i$ if there is no active particle at site $i$, otherwise at rate $\lambda\,(K_i - \frac{1}{M_i})$.
• (Coalescence) An active particle at site $i$ coalesces with another active particle at site $j$ at rate $\frac{1}{N_i}$ when $j = i$, otherwise at rate $\frac{a(i,j)}{N_j}$.
Note that after coalescence has taken place, there is only one particle left in the system, which evolves as the single-particle system. Let $(\xi_1(t), \xi_2(t), c(t)) \in S = S^* \times S^* \times \{0, 1\}$ be the configuration of the two-particle system at time $t$, where $S^* = \mathbb{Z}^d \times \{A, D\}$. Here $\xi_1(t)$ and $\xi_2(t)$ represent the locations and states of the two particles. The variable $c(t)$ takes value 1 if the two particles have coalesced into a single particle by time $t$, and 0 otherwise. It is necessary to add the extra variable $c(t)$ to the configuration in order to make the process Markovian (the rates depend on whether there are one or two particles in the system). To avoid triviality we assume that $c(0) = 0$ with probability 1, i.e., the two particles at time 0 are always in a non-coalesced state. We denote the law of the process $(\xi_1(t), \xi_2(t), c(t))_{t \ge 0}$ by $P_\xi$, where the initial condition is $\xi \in S^* \times S^*$. It is to be noted that, since the numbers of active and dormant particles at a site $i$ at any time are limited by $N_i$ and $M_i$, respectively, the two-particle system is not defined whenever it is started from an initial configuration violating the maximal occupancy of the associated sites. Let $\tau$ be the first time at which the coalescence event has occurred, i.e., as in (6.24). Note that, conditional on $\tau < t$, $\xi_1(s) = \xi_2(s)$ for all $s \ge t$ with probability 1. Define $M_{(i,\alpha),(j,\beta)}(t)$ accordingly, where $i, j \in \mathbb{Z}^d$ and $\alpha, \beta \in \{A, D\}$. To avoid ambiguity, we set $M_{(i,\alpha),(j,\beta)}(\cdot) = 0$ when $((i, \alpha), (j, \beta))$ is not a valid initial condition for the two-particle system.
Proof. Note that $M_{(i,\alpha),(j,\beta)}(t) = D(Z(t); \delta_{i,\alpha} + \delta_{j,\beta})$, where $D$ is the duality function. So, via the duality relation, we have (6.28), where the expectation in the right-hand side is taken with respect to the dual process whose initial condition has one particle at site $i$ with state $\alpha$ and one particle at site $j$ with state $\beta$, which has law $P_{((i,\alpha),(j,\beta))}$. Depending on the configuration of the process at time $t$, the right-hand side of (6.28) equals the desired expression.
The following lemma provides a nice comparison between the one-particle and two-particle system.
Replacing the left-hand quantity in (6.31) with $\mathbb{E}_\eta[Y_i(t)/M_i - M_{(i,D),(j,\beta)}(t)]$ and using the same arguments, we see that the inequality for $\alpha = D$ follows.
Since, by assumption, $\tau < \infty$ with probability 1 irrespective of the initial configuration of the two-particle system, and since the left-hand quantity is positive, we have $\mathbb{E}\bigl[X_i(t)/N_i - M_{(i,A),(j,\beta)}(t)\bigr] \to 0$ as $t \to \infty$. By a similar argument the other part of (6.33) is proved as well.
If $((i, A), (j, A))$ is a valid initial condition for the two-particle system, then by using (6.15)–(6.16) and (6.33) we obtain the desired convergence. If $((i, A), (j, A))$ is not a valid initial condition, then we must have $i = j$ and $N_i = 1$, and so $\Delta_{(i,A),(j,A)}(t) = 0$ by definition. Thus, for any $i, j \in \mathbb{Z}^d$,

(6.37) $\lim_{t \to \infty} \mathbb{E}\bigl[\Delta_{(i,A),(j,A)}(t)\bigr] = 0$.
Since $((i, A), (j, D))$ is always a valid initial condition for the two-particle system, we also have the analogous convergence, and hence from (6.5) we get that, for any $i, j \in \mathbb{Z}^d$, $\mathbb{E}(\Delta_{i,j}(t)) \to 0$ as $t \to \infty$, which proves the claim.
"=⇒" Suppose that the system clusters for any initial configuration $Z(0) \in X$. Then, by dominated convergence, the system clusters for any initial distribution of $Z(0)$ as well. Fix $\theta \in (0, 1)$, and let the distribution of $Z(0)$ be $\nu_\theta$.
We will prove by contradiction that in the dual two particles with arbitrary valid initial states coalesce with probability 1, i.e., $\tau < \infty$ with probability 1. Indeed, suppose that this is not true, i.e., for some valid initial configuration $(\xi_1, \xi_2) \in S^* \times S^*$ of the two-particle system we have $P_{(\xi_1, \xi_2)}(\tau = \infty) > 0$, where $S^* = \mathbb{Z}^d \times \{A, D\}$. Since the dual process with two particles is irreducible (any valid configuration is accessible), we have $P_\xi(\tau = \infty) > 0$ for any valid initial condition $\xi \in S^* \times S^*$. Let $\rho := P_{((i,A),(i,D))}(\tau = \infty) > 0$, where $i \in \mathbb{Z}^d$ is fixed. Note that $((i, A), (i, D))$ is always a valid initial condition for the two-particle system, since $N_i, M_i \ge 1$. Let $P_{(i,A)}$ be the law of the single-particle process $(\xi(t))_{t \ge 0}$ started from initial condition $(i, A)$.