1 Introduction

The concept of adaptive dynamics is a heuristic biological theory for the evolution of a population made up of different types that was developed in the 1990s, see Metz et al. (1996), Dieckmann and Law (1996), Bolker and Pacala (1997), Bolker and Pacala (1999), Dieckmann and Law (2000). It assumes asexual, clonal reproduction with the possibility of mutation. These mutations are rare and new types can initially be neglected, but selection acts fast and the population is assumed to always be at equilibrium. This implies a separation of the fast ecological and slow evolutionary time scale. Fixation or extinction of a mutant is determined by its invasion fitness, which describes its exponential growth rate in a population at equilibrium. This notion of fitness is dependent on the current resident population and therefore changes over time. The equilibria do not need to be monomorphic and allow for coexistence and evolutionary branching. Eventually, so-called evolutionary stable states can be reached, where all possible mutants have negative invasion fitness and therefore the state of the population is final.

A special case of adaptive dynamics are so-called adaptive walks or adaptive flights. The concept of adaptive walks was introduced by Maynard Smith (1962, 1970) and further developed by Kauffman and Levin (1987), Kauffman (1992) and Orr (2003). Here, evolution is modelled as a random walk on the type space that moves towards higher fitness as the population adapts to its environment. More precisely, a discrete state space is equipped with a graph structure that marks the possibility of mutation between neighbours. A fixed, but possibly random, fitness landscape is imposed on the type space. In contrast to the above, this individual fitness is not dependent on the current state of the population. Adaptive walks move along neighbours of increasing fitness, according to some transition law, towards a local or global optimum. Adaptive flights, a term that has been introduced by Neidhart and Krug (2011), can take larger steps and jump between local fitness maxima to eventually attain a global maximum. Quantities of interest are, among others, the typical length of an adaptive walk before reaching a local fitness maximum and the distribution of maxima, see Nowak and Krug (2015), as well as the number of accessible paths, see Schmiegelt and Krug (2014) and Berestycki et al. (2016, 2017). They have been studied under various assumptions on the correlations of the fitness landscape and the transition law of the walk. Examples, mentioned by Nowak and Krug (2015), are the natural adaptive walk, where the transition probabilities are proportional to the increase in fitness, or the greedy adaptive walk, which always jumps to the fittest available neighbour.

In recent years, stochastic individual-based models have been introduced to study different aspects of evolution. They start out with a model that considers a collection of individuals. Each individual is characterised by a type, for example its genotype. The population evolves in time under the mechanisms of birth, death, and mutation, where the parameters depend on the types. The population size is not fixed but the resources of the environment, represented by the carrying capacity K, are limited. This results in a competitive interaction between the individuals, which limits the population size to the order of K. The dynamics are modelled as a continuous time Markov process, as shown by Fournier and Méléard (2004). It is of particular interest to study the convergence of this process in the limits of large populations, rare mutations, and small mutation steps.

For a finite type space, Ethier and Kurtz (1986) have shown that, rescaling the population by K, the process converges to the deterministic solution of a system of differential equations in the limit of large populations, i.e. as K tends to infinity. The differential equations are of Lotka–Volterra type with additional terms for the effects of mutation. This result was generalised for types in \({\mathbb R}^d\) by Fournier and Méléard (2004). For finite times, in the limit of rare mutations, this deterministic system converges to the corresponding mutation-free Lotka–Volterra system. Under certain conditions, these converge in time to unique equilibrium configurations, see Hofbauer and Sigmund (1998) and Champagnat et al. (2010).

Champagnat (2006), Champagnat et al. (2008), Champagnat and Méléard (2011) and others have considered the simultaneous limit of large populations and rare mutations. Here, the mutation probability \(\mu _K\) tends to zero as K tends to infinity. They make strong assumptions on the scaling of \(\mu _K\), where only very small mutation probabilities \(\mu _K\ll 1/(K\log K)\) are considered. This ensures the separation of different mutation events. With high probability, a mutant either dies out or fixates in the resident population before the next mutation occurs. To balance the rare mutations, time is rescaled by \(1/(K\mu _K)\), which corresponds to the average time until a mutation occurs. The limiting process is a Markov jump process called trait substitution sequence (TSS) or polymorphic evolution sequence (PES), depending on whether the population stays monomorphic or branches into several coexisting types. In the framework of adaptive walks, these sequences correspond to the natural walk, mentioned above.

Similar convergence results have been shown for many variations of the original individual-based model under the same scaling, including small mutational effects, fast phenotypic switches, spatial aspects, and also diploid organisms, see, e.g. Baar et al. (2017), Baar and Bovier (2018), Champagnat and Méléard (2007), Tran (2008), Leman (2016), Collet et al. (2013), Neukirch and Bovier (2017), Bovier et al. (2018).

The drawback of all these results is the strong assumption on the mutation rate. The separation of mutations, which results in small mutational effects and slow evolution, has been criticised by Barton and Polechová (2005). We therefore consider a scenario where the mutation rate is much higher, although decreasing, and the mutation events are no longer separated. This allows for several mutations to accumulate before a new type fully invades the population. To study the extreme case, as first done by Bovier and Wang (2013) and recently by Bovier et al. (2019), we consider the two limits separately. We take the deterministic model, arising from the limit of large populations, and let the mutation rate \(\mu \) tend to zero while rescaling the time by \(\ln 1/\mu \). This corresponds to the time that a mutant takes to reach a macroscopic population size of order 1, rather than the time until a mutant appears, as before. The time that the system takes to re-equilibrate is negligible on the chosen time scale and hence the resulting limit is a jump process between metastable equilibrium states.

We consider a finite type space with a graph structure representing the possibility of mutation. First, we prove that, under certain assumptions, the deterministic model converges pointwise to a deterministic jump process in the rare mutation limit. This process jumps between Lotka–Volterra equilibria of the current macroscopic types. For a (possibly polymorphic) resident population, we have to carefully track the growth of the different microscopic mutants that compete to invade the population. The first mutant to reach a macroscopically visible population size solves an optimisation problem and balances high invasion fitness and large initial conditions, where the latter is determined by the graph distance to the resident types. The limiting process can be fully described by its jump times and jump chain, which are closely related to this optimisation problem. It can make arbitrarily large jumps and may reach an evolutionary stable state.

Second, we show how we can derive different limiting processes by changing the parameters of the system. On one hand, assuming equal competition between all individuals and monomorphic initial conditions, the description of the jump process can be simplified. In this case, the invasion fitness of a type is just the difference between its own individual fitness, defined by its birth and death rate, and that of the resident type. Hence, we can relate back to the classical notion of fixed fitness landscapes in the context of adaptive walks. The limiting process resembles an adaptive flight since it always jumps to types of higher individual fitness, eventually reaching a global fitness maximum. A similar scenario was studied in the context of adaptive walks and flights by Krug and Karl (2003), Jain and Krug (2005, 2007), Jain (2007). Here the fitness is also assumed to be fixed but time steps are discrete. As in our case, the transitions between macroscopic types are determined by balancing high initial conditions, depending on the distance in the type space, and high fitness.

On the other hand, we modify the deterministic system such that the subpopulations can only reproduce when their size lies above a certain threshold. This limits the radius in which a resident population can foster mutants. A threshold of \(\mu ^\ell \) mimics the scaling of \(\mu _K\approx K^{-1/\ell }\) in the simultaneous limit, where resident types can produce mutants in a radius of \(\ell \). Bovier et al. (2019) and Champagnat, Méléard, and Tran (preprint 2019) recently studied this scaling for the type space of a discrete line. A similar scaling has also been applied to a Moran-type model by Durrett and Mayberry (2011) and an adaptive walk-type model with restricted mutation radius has been studied by Jain and Krug (2007). The resulting limit processes of the modified deterministic system are similar to the previously mentioned greedy adaptive walk. However, they are not all restricted to jumping to direct neighbours only, and thus can cross valleys in the fitness landscape and reach a global fitness maximum. Only when we choose the extreme case \(\ell =1\), the resulting limit is exactly the greedy adaptive walk.

The remainder of this paper is organised as follows. In Sect. 2, we introduce the deterministic system and the corresponding mutation-free Lotka–Volterra system and present the main theorems, stating the convergence to different jump processes in the limit of rare mutation for different scenarios. We relate the deterministic system to the individual-based stochastic model and present a modification that mimics the simultaneous limit of large populations and rare, but still overlapping, mutations. Moreover, we give a short outline of the strategy of the proofs. Sections 3 and 4 are devoted to the proof of the first convergence result. The proof is split into three parts. The analysis of the exponential growth phase of the mutants, which follows ideas from Bovier and Wang (2013), is given in Sect. 3. The following Lotka–Volterra invasion phase has been studied in detail by Champagnat et al. (2010). In Sect. 4, we show how to combine the two phases to prove the main result. Next, in Sect. 5, we consider the special case of equal competition, where we can simplify the description of the limiting jump process. Since the assumptions of the result from Champagnat et al. (2010) are no longer satisfied, we have to slightly change the proof. In Sect. 6, we finally present an extension of the original deterministic system, where we limit the range of mutation to mimic the scaling of \(\mu _K\approx K^{-1/\ell }\) in the simultaneous limit. In the extreme case, where only resident types can foster mutants, the greedy adaptive walk arises in the limit. For the intermediate cases, we present some first results on accessibility of types.

2 Model introduction and main results

In this section we introduce the deterministic model for evolution that is the focus of our studies. Similar models have been studied by Hofbauer and Sigmund (1998), who give an extensive overview of models of population dynamics in their book. We present the main result of convergence of this deterministic process in the limit of rare mutations on a divergent time scale. The limiting process is a deterministic jump process that jumps between Lotka–Volterra equilibria, involving different types. In the special case of equal competition, we derive a simplified description of this limiting process. Moreover, we relate the model to the stochastic individual-based model introduced by Fournier and Méléard (2004) and present a modification of the deterministic system that mimics the simultaneous limit of large populations and rare, but not too rare, mutations. In the case where only neighbouring types of the current resident type can arise as mutants, the limiting object is a true adaptive walk. At the end of the section we outline the proofs that are given in the following sections.

2.1 The deterministic system and relations to Lotka–Volterra systems

The model we consider is a classical Lotka–Volterra system with additional mutation terms. We consider a population consisting of subpopulations that are characterised by their types (e.g. geno- or phenotypes). In this paper we choose the n-dimensional hypercube \(\mathbb {H}^n:=\{0,1\}^n\) as our type space. The sequences of ones and zeros can, for example, be interpreted as sequences of loci with different alleles. The type \((0,\ldots ,0)\) can be seen as the wildtype while all other types have accumulated mutations on some loci. However, we will not assume to always start out with a monomorphic population of this type.

The choice of \(\mathbb {H}^n\) can easily be generalised to any finite set. We comment on this in Sect. 2.2.

The state of the system is described by \(\xi ^\mu _t=(\xi ^\mu _t(x))_{x\in \mathbb {H}^n}\), where \(\xi ^\mu _t(x)\) denotes the size of the subpopulation of type x at time t. \(\xi ^\mu _t\) can be seen as a non-negative vector or (not necessarily normalised) measure on \(\mathbb {H}^n\).

The dynamics of \((\xi ^\mu _t)_{t\ge 0}\) are determined by the system of differential equations

$$\begin{aligned} \tfrac{d}{dt}\xi ^\mu _t(x)&=\left[ b(x)-d(x)-\sum _{y\in \mathbb {H}^n}\alpha (x,y)\xi ^\mu _t(y)\right] \xi ^\mu _t(x)\nonumber \\&\quad +\mu \sum _{y\in \mathbb {H}^n}\xi ^\mu _t(y)b(y)m(y,x)-\mu \xi ^\mu _t(x)b(x)\sum _{y\in \mathbb {H}^n}m(x,y),\quad x\in \mathbb {H}^n, \end{aligned}$$
(2.1)

where the parameters are chosen as follows.

Definition 1

For \(x,y\in \mathbb {H}^n\), we define

  • \(b(x)\in {\mathbb R}_+\), the birth rate of an individual with type x,

  • \(d(x)\in {\mathbb R}_+\), the (natural) death rate of an individual with type x,

  • \(\alpha (x,y)\in {\mathbb R}_+\), the competitive pressure that is imposed upon an individual with type x by an individual with type y,

  • \(\mu \in [0,1]\), the probability of mutation at a birth event,

  • \(m(x,\cdot )\in \mathcal {M}_p(\mathbb {H}^n)\), the law of the mutant.

Here \(\mathcal {M}_p(\mathbb {H}^n)\) is the set of probability measures on \(\mathbb {H}^n\). We assume that \(m(x,x)=0\), for every \(x\in \mathbb {H}^n\). For each \(x\in \mathbb {H}^n\), we define \(r(x):=b(x)-d(x)\), its individual fitness.

Abiotic factors like temperature, chemical milieu, or other environmental properties enter through b and d, while biotic factors such as competition due to limited food supplies, segregated toxins, or predator-prey relationships are reflected in the competition kernel \(\alpha \).

We could also let the probability of mutation depend on \(x\in \mathbb {H}^n\) in a way such that it is still proportional to some \(\mu \), i.e. \(\mu M(x)\). However, this would not change the limiting process, therefore we stick with a constant \(\mu \) for simplicity of notation.

Note that the competition term ensures that solutions are always bounded. This implies Lipschitz continuity for the coefficients, and hence the classical theory for ordinary differential equations ensures existence, uniqueness, and continuity in t of such solutions \(\xi ^\mu _t\). Moreover, for non-negative initial condition \(\xi ^\mu _0\), \(\xi ^\mu _t\) is non-negative at all times.
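To make the structure of (2.1) concrete, the following Python sketch evaluates its right-hand side on a small hypercube. It is only an illustration under stated assumptions: the helper names (hypercube, hamming, uniform_kernel, rhs), the dictionary representation of \(\xi \), and the uniform mutation kernel \(m(x,y)=1/n\) for direct neighbours are our own choices and not part of the model definition.

```python
from itertools import product


def hypercube(n):
    """All types of H^n = {0,1}^n as tuples."""
    return list(product((0, 1), repeat=n))


def hamming(x, y):
    """Number of coordinates in which x and y differ."""
    return sum(a != b for a, b in zip(x, y))


def uniform_kernel(types):
    """Assumed example kernel: m(x, y) = 1/n for direct neighbours, else 0."""
    n = len(types[0])
    return {(x, y): (1.0 / n if hamming(x, y) == 1 else 0.0)
            for x in types for y in types}


def rhs(xi, b, d, alpha, mu, m):
    """Right-hand side of (2.1) for a state xi given as a dict type -> size."""
    types = list(xi)
    out = {}
    for x in types:
        competition = sum(alpha[(x, y)] * xi[y] for y in types)
        growth = (b[x] - d[x] - competition) * xi[x]
        # Mutants arriving at x from births of the other types y.
        gain = mu * sum(xi[y] * b[y] * m[(y, x)] for y in types)
        # Births of type x that are lost to mutation.
        loss = mu * xi[x] * b[x] * sum(m[(x, y)] for y in types)
        out[x] = growth + gain - loss
    return out
```

Since \(m(x,\cdot )\) is a probability measure, the two mutation terms cancel when summed over all types, so in this sketch mutation only redistributes births between types.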

Definition 2

For \(x\in \mathbb {H}^n\), we denote by \(|x|:=\sum _{i=1}^nx_i\) the 1-norm. We write \(x\sim y\) if x and y are direct neighbours on the hypercube, i.e. if \(|x-y|=1\). Else, we write \(x\not \sim y\). We denote the standard Euclidean norm by \(\left\| \cdot \right\| \).

To ensure that the mutants which a type \(x\in \mathbb {H}^n\) can produce are exactly its direct neighbours, we introduce the following assumption. It corresponds to only allowing single mutations.

(A) :

For every \(x,y\in \mathbb {H}^n\), \(m(x,y)>0\) if and only if \(x\sim y\).

Again, this assumption is not necessary and can easily be relaxed. However, it simplifies notation and does not change the method of the proofs. We comment on the case of general finite (directed) graphs as type spaces in Sect. 2.2.

Under the above assumption, (2.1) reduces to

$$\begin{aligned} \tfrac{d}{dt}\xi ^\mu _t(x)&=\left[ r(x)-\sum _{y\in \mathbb {H}^n}\alpha (x,y)\xi ^\mu _t(y)\right] \xi ^\mu _t(x)\nonumber \\&\quad +\mu \sum _{y\sim x}b(y)m(y,x)\xi ^\mu _t(y)-\mu b(x)\xi ^\mu _t(x). \end{aligned}$$
(2.2)

In the mutation-free case, where \(\mu =0\), the equations take the form of a competitive Lotka–Volterra system

$$\begin{aligned} \tfrac{d}{dt}\xi ^0_t(x)=\left[ r(x)-\sum _{y\in \mathbb {H}^n}\alpha (x,y)\xi ^0_t(y)\right] \xi ^0_t(x). \end{aligned}$$
(2.3)

Understanding this system is essential since it determines the short term dynamics of the system with mutation as \(\mu \rightarrow 0\). For a subset of types we study the stable states of the Lotka–Volterra system involving these types.

Definition 3

For a subset \(\mathbf x \subset \mathbb {H}^n\) we define the set of Lotka–Volterra equilibria by

$$\begin{aligned} \text {LVE}(\mathbf x ):=\left\{ \xi \in ({\mathbb R}_{\ge 0})^\mathbf x :\forall \ x\in \mathbf x :\ \Big [r(x)-\sum _{y\in \mathbf x }\alpha (x,y)\xi (y)\Big ]\xi (x)=0\right\} . \end{aligned}$$
(2.4)

Moreover, we let \(\text {LVE}_+(\mathbf x ):=\text {LVE}(\mathbf x )\cap ({\mathbb R}_{>0})^\mathbf x \). If \(\text {LVE}_+(\mathbf x )\) contains exactly one element, we denote it by \(\bar{\xi }_\mathbf x \), the equilibrium size of a population of coexisting types \(\mathbf x \).

Remark 1

If \(\text {LVE}_+(\mathbf x )=\{\bar{\xi }_\mathbf x \}\), this implies \(r(x)>0\) for all \(x\in \mathbf x \). In the case where \(\mathbf x =\{x\}\), we obtain \(\bar{\xi }_x(x):=\bar{\xi }_\mathbf x (x)=\frac{r(x)}{\alpha (x,x)}\).

The following assumption ensures that for a subset \(\mathbf x \subset \mathbb {H}^n\), such that \(r(x)>0\) for all \(x\in \mathbf x \), there exists a unique asymptotically stable equilibrium of the Lotka–Volterra system involving types \(\mathbf x \).

(\(\mathbf{B }_\mathbf x \)):

There exist \(\theta _x>0\), \(x\in \mathbf x \), such that

$$\begin{aligned}&\displaystyle \forall \ x,y\in \mathbf x :\ \theta _x\alpha (x,y)=\theta _y\alpha (y,x), \end{aligned}$$
(2.5)
$$\begin{aligned}&\displaystyle \forall \ u\in {\mathbb R}^\mathbf x \backslash \{0\}:\ \sum _{x,y\in \mathbf x } \theta _x\alpha (x,y)u(x)u(y)>0. \end{aligned}$$
(2.6)

This is, for example, trivially satisfied by any symmetric, positive definite matrix \((\alpha (x,y))_{x,y\in \mathbf x }\). Under this condition, Champagnat, Jabin, and Raoul have proven convergence to a unique stable equilibrium.

Theorem 1

(Champagnat et al. (2010), Prop.1) Assume (B\(_\mathbf x \)) for a subset \(\mathbf x \subset \mathbb {H}^n\) such that \(r(x)>0\), for all \(x\in \mathbf x \). Then there exists a unique \(\bar{\xi }_\mathbf x \in ({\mathbb R}_+)^\mathbf x \backslash \{0\}\) such that for any solution \(\xi ^0_t\) to (2.3) with initial condition \(\xi ^0_0\in ({\mathbb R}_{>0})^\mathbf x \times \{0\}^{\mathbb {H}^n\backslash \mathbf x }\),

$$\begin{aligned} \left. \xi ^0_t\right| _\mathbf x \rightarrow \bar{\xi }_\mathbf x \,\text { as }t\rightarrow \infty . \end{aligned}$$
(2.7)

The proof of this theorem uses the Lyapunov functional

$$\begin{aligned} L(\xi )=\frac{1}{2}\sum _{x,y\in \mathbf x } \theta _x\alpha (x,y)\xi (x)\xi (y)-\sum _{x\in \mathbf x } \theta _xr(x)\xi (x),\quad \xi \in {\mathbb R}^\mathbf x . \end{aligned}$$
(2.8)

(2.5) ensures that

$$\begin{aligned} \frac{d}{dt}L\left( \left. \xi ^0_t\right| _\mathbf x \right)&=(\nabla L)\left( \left. \xi ^0_t\right| _\mathbf x \right) \cdot \tfrac{d}{dt}\left. \xi ^0_t\right| _\mathbf x \nonumber \\&=-\sum _{x\in \mathbf x } \theta _x\left[ r(x)-\sum _{y\in \mathbf x }\alpha (x,y)\xi ^0_t(y)\right] ^2\xi ^0_t(x)\le 0, \end{aligned}$$
(2.9)

while (2.6) gives convexity of L.
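The identity (2.9) can be checked numerically at a single point. The sketch below uses a two-type example with a non-symmetric competition matrix. The values of alpha, theta, and r are assumptions chosen so that (2.5) and (2.6) hold, and the types are labelled 0 and 1 for brevity.

```python
# Assumed example: theta[0]*alpha[0][1] == theta[1]*alpha[1][0], so (2.5) holds,
# and (theta_x * alpha(x, y)) = [[2, 1], [1, 6]] is positive definite, so (2.6) holds.
alpha = [[2.0, 1.0], [0.5, 3.0]]
theta = [1.0, 2.0]
r = [1.0, 2.0]


def per_capita_growth(xi):
    """r(x) - sum_y alpha(x, y) xi(y) for each type x, as in (2.3)."""
    return [r[x] - sum(alpha[x][y] * xi[y] for y in range(2)) for x in range(2)]


def grad_L(xi):
    """Gradient of the Lyapunov functional (2.8), computed without using (2.5)."""
    return [
        0.5 * sum((theta[x] * alpha[x][y] + theta[y] * alpha[y][x]) * xi[y]
                  for y in range(2)) - theta[x] * r[x]
        for x in range(2)
    ]


def dL_dt_chain_rule(xi):
    """First line of (2.9): gradient of L times the Lotka-Volterra vector field."""
    g = per_capita_growth(xi)
    grad = grad_L(xi)
    return sum(grad[x] * g[x] * xi[x] for x in range(2))


def dL_dt_closed_form(xi):
    """Second line of (2.9): minus a weighted sum of squares."""
    g = per_capita_growth(xi)
    return -sum(theta[x] * g[x] ** 2 * xi[x] for x in range(2))
```

For any non-negative state, for instance xi = [0.3, 0.7], both functions should return the same non-positive value. The symmetry (2.5) is what turns the gradient into \(\theta _x\) times the negative per-capita growth rate.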

Remark 2

Note that (2.6) implies

$$\begin{aligned} \forall \ u\in {\mathbb R}^\mathbf x \backslash \{0\}:&\ \sum _{x,y\in \mathbf x } \theta _x\alpha (x,y)u(x)u(y)\ge \kappa _\mathbf x \left\| u\right\| ^2, \end{aligned}$$
(2.10)

where

$$\begin{aligned} \kappa _\mathbf x :=\min _{u:\left\| u\right\| =1}\sum _{x,y\in \mathbf x } \theta _x\alpha (x,y)u(x)u(y)>0. \end{aligned}$$
(2.11)

We set \(\kappa :=\min _\mathbf{x \subset \mathbb {H}^n}\kappa _\mathbf x \).

Connected to this positive definiteness property and the Lotka–Volterra equilibria, we define a norm that is used to measure the distance between the current state of the population and the equilibrium size. Since the \(\theta _x\), \(x\in \mathbf x \), in (B\(_\mathbf x \)) are not unique, we fix an arbitrary choice of such parameters. In the case where \((\alpha (x,y))_{x,y\in \mathbf x }\) is irreducible, we can choose the unique normalised version where \(\sum _{x\in \mathbf x }\theta _x=1\).

Definition 4

For \(\mathbf x \subset \mathbb {H}^n\) such that \(\text {LVE}_+(\mathbf x )=\{\bar{\xi }_\mathbf x \}\) and (B\(_\mathbf x \)) is satisfied, we define a scalar product on \({\mathbb R}^\mathbf x \) (or \({\mathcal M}(\mathbf x )\), the set of non-negative measures on \(\mathbf x \)) by

$$\begin{aligned} \langle u,v\rangle _\mathbf x :=\sum _{x\in \mathbf x }\frac{\theta _x}{\bar{\xi }_\mathbf x (x)}u(x)v(x),\quad u,v\in {\mathbb R}^\mathbf x . \end{aligned}$$
(2.12)

The corresponding norm is defined by \(\left\| u\right\| _\mathbf x :=\sqrt{\langle u,u\rangle _\mathbf x }\).

This scalar product is chosen exactly in a way such that we can use the positive definiteness (2.10) and the properties of \(\bar{\xi }_\mathbf x \). Moreover, we notice that

$$\begin{aligned} c_\mathbf x ^2\left\| u\right\| ^2:=\left( \min _{x\in \mathbf x }\frac{\theta _x}{\bar{\xi }_\mathbf x (x)}\right) \left\| u\right\| ^2\le \left\| u\right\| ^2_\mathbf x \le \left( \max _{x\in \mathbf x }\frac{\theta _x}{\bar{\xi }_\mathbf x (x)}\right) \left\| u\right\| ^2=:C_\mathbf x ^2\left\| u\right\| ^2. \end{aligned}$$
(2.13)

Remark 3

Throughout the paper, constants labelled c and C have varying values. Specific constants, such as \(c_\mathbf x \) and \(C_\mathbf x \) above, are labelled differently and referenced when used repeatedly.

While some types \(\mathbf x \) coexist at their equilibrium size \(\bar{\xi }_\mathbf x \), other types \(y\in \mathbb {H}^n\backslash \mathbf x \), which only have a small population size, grow in their presence. Considering the rate of exponential growth in (2.3), we formulate a notion of invasion fitness.

Definition 5

For \(\mathbf x \subset \mathbb {H}^n\) such that \(\text {LVE}_+(\mathbf x )=\{\bar{\xi }_\mathbf x \}\) and \(y\in \mathbb {H}^n\), we define the invasion fitness of an individual with type y in a population of coexisting types \(\mathbf x \) at equilibrium by \(f_{y,\mathbf x }:=r(y)-\sum _{x\in \mathbf x }\alpha (y,x)\bar{\xi }_\mathbf x (x)\).

Notice that \(f_{x,\mathbf x }=0\) for all \(x\in \mathbf x \). In contrast to the individual fitness r, which is fixed, this notion of fitness varies over time and depends on the current resident types.
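As an illustration of Definitions 3 and 5, the following sketch computes a candidate coexistence equilibrium and the resulting invasion fitness. It assumes that the linear system \(r(x)=\sum _{y\in \mathbf x }\alpha (x,y)\xi (y)\), \(x\in \mathbf x \), has a unique solution, and it does not check positivity or stability. The function names are hypothetical.

```python
def solve_linear(A, rhs):
    """Solve A z = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    M = [list(A[i]) + [rhs[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda row: abs(M[row][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for row in range(col + 1, n):
            factor = M[row][col] / M[col][col]
            for k in range(col, n + 1):
                M[row][k] -= factor * M[col][k]
    z = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][k] * z[k] for k in range(i + 1, n))
        z[i] = (M[i][n] - tail) / M[i][i]
    return z


def coexistence_equilibrium(residents, r, alpha):
    """Candidate equilibrium on the resident types.

    Solves r(x) = sum_y alpha(x, y) xi(y); the caller still has to check
    that all entries are positive.
    """
    A = [[alpha[(x, y)] for y in residents] for x in residents]
    z = solve_linear(A, [r[x] for x in residents])
    return dict(zip(residents, z))


def invasion_fitness(y, residents, r, alpha):
    """Invasion fitness of type y against the residents, as in Definition 5."""
    xi_bar = coexistence_equilibrium(residents, r, alpha)
    return r[y] - sum(alpha[(y, x)] * xi_bar[x] for x in residents)
```

For a single resident x this reduces to \(r(y)-\alpha (y,x)r(x)/\alpha (x,x)\), in line with Remark 1.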

2.2 Convergence to a deterministic jump process

We now come back to the system (2.2), involving mutation. We assume that the system starts out close to the equilibrium size of some subset of types \(\mathbf x \subset \mathbb {H}^n\) and study its evolution over time. We distinguish between macroscopic resident types that coexist at their equilibrium size and microscopic mutant types that have a population size that tends to 0 as \(\mu \rightarrow 0\). The initial conditions are specified as follows.

Definition 6

A collection of measures \(\xi ^\mu _0\in {\mathcal M}(\mathbb {H}^n)\), depending on \(\mu \), satisfies the initial conditions for resident types \(\mathbf x \subset \mathbb {H}^n\), \(\eta >0\), and \(\bar{c}>0\) if \(\text {LVE}_+(\mathbf x )=\{\bar{\xi }_\mathbf x \}\) and there exists a \(\mu _0\in (0,1]\) and constants \(0\le c_y\le C_y<\infty \) and \(\lambda _y\ge 0\), for each \(y\in \mathbb {H}^n\), such that, for every \(\mu \in (0,\mu _0]\),

$$\begin{aligned} \xi ^\mu _0(y)\in [c_y\mu ^{\lambda _y},C_y\mu ^{\lambda _y}], \end{aligned}$$
(2.14)

where

$$\begin{aligned} \forall ~y\in \mathbf x :&\ \lambda _y=0,\ \bar{\xi }_\mathbf x (y)-\eta \frac{\bar{c}}{\sqrt{|\mathbf x |}}\le c_y,C_y\le \bar{\xi }_\mathbf x (y)+\eta \frac{\bar{c}}{\sqrt{|\mathbf x |}}, \end{aligned}$$
(2.15)
$$\begin{aligned} \forall ~y\in \mathbb {H}^n\backslash \mathbf x :&\ \lambda _y>0,\ 0\le c_y,C_y<\infty \quad \text {or}\end{aligned}$$
(2.16)
$$\begin{aligned}&\ \lambda _y=0,\ 0\le c_y,C_y\le \frac{\eta }{3},\ f_{y,\mathbf x }<0. \end{aligned}$$
(2.17)

If \(\xi ^\mu _0(y)\equiv 0\), we choose any \(\lambda _y>\max _{z\in \mathbb {H}^n: \xi ^\mu _0(z)>0}\lambda _z+n\).

We write \(\xi ^\mu _0\in \text {IC}(\mathbf x ,\eta ,\bar{c})\).

This definition is very technical. We could choose simpler initial conditions for our main theorem, for example a monomorphic macroscopic type and no microscopic types of positive population size. However, after the first invasion step, this is exactly what the system looks like, and we want to be able to iterate our procedure. The definition roughly implies that all macroscopic types are close to their coexistence equilibrium (within an attractive domain) and all microscopic types y are of order \(\mu ^{\lambda _y }\) as \(\mu \rightarrow 0\). The types that are not part of the resident types but of order \(\mu ^0\) are assumed to be unfit and of small enough size. This ensures that they do not “trigger” the stopping time that marks the beginning of the next Lotka–Volterra phase, i.e. the time when the first fit mutant reaches a macroscopic level. This stopping time is defined in (3.1).

Let \(\mathbf x ^0\subset \mathbb {H}^n\) be the initial set of coexisting types, i.e. \(\xi ^\mu _0\in \text {IC}(\mathbf x ^0,\eta ,\bar{c})\), and set \(T_0:=0\). During a time of order 1, each type \(y\in \mathbb {H}^n\) grows to a size of order \(\mu ^{\rho ^0_y}\), where

$$\begin{aligned} \rho ^0_y:=\min _{z\in \mathbb {H}^n}[\lambda _z+|z-y|], \end{aligned}$$
(2.18)

due to incoming mutants from other types. This can be argued as follows. The population of type y collects incoming mutants from all other types z of order \(\xi ^\mu _0(z)\mu ^{|z-y|}\). These influences are summed up but in the limit of \(\mu \rightarrow 0\), the asymptotically largest summand, i.e. the smallest exponent of \(\mu \), dominates all other terms.
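The minimisation in (2.18) is straightforward to evaluate. As a small sketch, assume a monomorphic resident \((0,0,0)\) on \(\mathbb {H}^3\) with all other types initially absent. The value float("inf") is used here as a stand-in for the "large enough" exponent \(\lambda _y\) that Definition 6 assigns to absent types, which gives the same minimum. The function name initial_exponents is hypothetical.

```python
from itertools import product


def hamming(x, y):
    """Number of coordinates in which x and y differ."""
    return sum(a != b for a, b in zip(x, y))


def initial_exponents(lam):
    """rho^0_y = min_z [lambda_z + |z - y|] as in (2.18)."""
    types = list(lam)
    return {y: min(lam[z] + hamming(z, y) for z in types) for y in types}


# Assumed example: only the resident (0, 0, 0) is present at time 0.
n = 3
lam = {x: float("inf") for x in product((0, 1), repeat=n)}
lam[(0, 0, 0)] = 0.0
rho0 = initial_exponents(lam)
```

In this example every type y ends up with exponent \(|y|\), its graph distance to the resident.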

Assume now that, after the \((i-1)^\text {st}\) invasion, at time \(T_{i-1}\ln 1/\mu \), we have coexisting resident types \(\mathbf x ^{i-1}\) and all types \(y\in \mathbb {H}^n\) have population size of order \(\mu ^{\rho ^{i-1}_y}\), where \(\rho ^{i-1}_y=\min _{z\in \mathbb {H}^n}[\rho ^{i-1}_z+|z-y|]\) is satisfied. During a time of order \(\ln 1/\mu \), microscopic types grow until the first type reaches a population size of order 1. The population sizes during growth can be approximated as

$$\begin{aligned} \xi ^\mu _{t\ln \frac{1}{\mu }}(y)\approx \mu ^{\min _{z\in \mathbb {H}^n}[\rho ^{i-1}_z+|z-y|-(t-T_{i-1})f_{z,\mathbf x ^{i-1}}]}. \end{aligned}$$
(2.19)

This is a little tricky and takes into account that there are three possible sources that could dominate the growth of type y: First, the population at y could just grow at its own exponential growth rate \(f_{y,\mathbf x ^{i-1}}\). This gives \(\mu ^{\rho ^{i-1}_y -(t-T_{i-1})f_{y,\mathbf x ^{i-1}}}\). Second, it could come from mutants from the large populations in \(x\in \mathbf x ^{i-1}\). This gives \(\mu ^{|x-y|}\) since x has to mutate \(|x-y|\) times to reach y. Finally, it could come from the mutants that have grown at any other site z over the last period. This gives \(\mu ^{\rho ^{i-1}_z+ |z-y|-(t-T_{i-1})f_{z,\mathbf {x}^{i-1}}}\).

Since mutants from another type can never increase the population size past \(\mu ^1\), the first microscopic type y to reach a size of order 1 must have grown at its own rate \(f_{y,\mathbf x ^{i-1}}>0\). The time to reach this macroscopic size (after the last invasion) is of order \((\rho ^{i-1}_y/f_{y,\mathbf x ^{i-1}})\ln 1/\mu \).

Summarising these thoughts, we inductively define

$$\begin{aligned} y^i_*&:=\mathop {{{\,\mathrm{arg\,min}\,}}}\limits _{\begin{array}{c} y\in \mathbb {H}^n:\\ f_{y,\mathbf x ^{i-1}}>0 \end{array}}\frac{\rho ^{i-1}_y}{f_{y,\mathbf x ^{i-1}}}, \end{aligned}$$
(2.20)

the \(i\text {th}\) invading type (if the minimiser is unique),

$$\begin{aligned} T_i&:=T_{i-1}+\min _{\begin{array}{c} y\in \mathbb {H}^n:\\ f_{y,\mathbf x ^{i-1}}>0 \end{array}}\frac{\rho ^{i-1}_y}{f_{y,\mathbf x ^{i-1}}}, \end{aligned}$$
(2.21)

the time of the \(i\text {th}\) invasion on the time scale \(\ln 1/\mu \), and

$$\begin{aligned} \rho ^i_y&:=\min _{z\in \mathbb {H}^n}[\rho ^{i-1}_z+|z-y|-(T_i-T_{i-1})f_{z,\mathbf x ^{i-1}}] \end{aligned}$$
(2.22)

the \(\mu \)-exponent of the population size of type y at the time of the \(i^\text {th}\) invasion. If there is no \(y\in \mathbb {H}^n\) such that \(f_{y,\mathbf x ^{i-1}}>0\), we set \(T_i:=\infty \).

At time \(T_i\ln 1/\mu \), the types \(y^i_*\) and \(\mathbf x ^{i-1}\) re-equilibrate according to the mutation-free Lotka–Volterra dynamics. If it is unique, we denote the support of the new equilibrium, i.e. the new coexisting resident types, by \(\mathbf x ^i\).
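A single step of the inductive definition (2.20)–(2.22) can be sketched as follows. The invasion fitnesses \(f_{y,\mathbf x ^{i-1}}\) are taken as given input, so the sketch does not determine the new resident set \(\mathbf x ^i\), which requires the Lotka–Volterra re-equilibration. The function name invasion_step and the numerical tolerance used to detect a non-unique minimiser are assumptions.

```python
def hamming(x, y):
    """Number of coordinates in which x and y differ."""
    return sum(a != b for a, b in zip(x, y))


def invasion_step(rho, f, t_prev, tol=1e-12):
    """One step of (2.20)-(2.22).

    rho: dict type -> exponent rho^{i-1}_y
    f:   dict type -> invasion fitness of y against the current residents
    Returns (y_star, T_i, rho_new). y_star is None if no type is fit
    (then T_i is infinite) or if the minimiser is not unique.
    """
    ratios = sorted((rho[y] / f[y], y) for y in rho if f[y] > 0)
    if not ratios:
        return None, float("inf"), dict(rho)
    if len(ratios) > 1 and ratios[1][0] - ratios[0][0] < tol:
        return None, t_prev + ratios[0][0], dict(rho)
    wait, y_star = ratios[0]
    # Exponents at the time of the invasion, see (2.22).
    rho_new = {y: min(rho[z] + hamming(z, y) - wait * f[z] for z in rho)
               for y in rho}
    return y_star, t_prev + wait, rho_new
```

For example, on \(\mathbb {H}^2\) with \(\rho ^0_y=|y|\) and fitnesses 0, 1, 0.5, 1.5 for the types (0,0), (1,0), (0,1), (1,1), the ratios are 1, 2, and 4/3 for the three fit types, so the direct neighbour (1,0) invades first at time \(T_1=1\).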

Remark 4

The results in this paper can easily be generalised to finite, possibly directed graphs as type spaces, where (directed) edges mark the possibility of mutation. In these cases the Hamming distance on the hypercube (e.g. \(|z-y|\) in (2.22)) is replaced by a “directed” distance, corresponding to lengths of directed paths (e.g. by the length of the shortest path from z to y). Note that this directed distance is not a distance in the classical sense since it might not be symmetric. For ease of notation and due to the nice applicability to genetic sequences, we stick with the hypercube in this paper.

With the above notation, we can now characterise the limiting process as follows.

Theorem 2

Consider the system of differential equations (2.2) and let \(\xi ^\mu _0\in \text {IC}(\mathbf x ^0,\eta ,\bar{c})\), for \(\eta \) small enough. Assume (A) and (B\(_\mathbf{x ^{i-1}\cup y^i_*}\)), for every \(1\le i<I\), where we set \(I:=i\) for the smallest \(i\in {\mathbb N}\) where either

  (a) the minimiser in (2.20) is not unique, or

  (b) there is a \(y\in (\mathbf x ^{i-1}\cup y^i_*)\backslash \mathbf x ^i\) such that \(f_{y,\mathbf x ^i}\ge 0\),

and \(I:=\infty \) if none of these occur. In the latter case, we set \(T_\infty :=\infty \).

Then, for every \(t\in [0,T_I)\backslash \{T_i,0\le i\le I\}\),

$$\begin{aligned} \lim _{\mu \rightarrow 0}\xi ^\mu _{t\ln \frac{1}{\mu }}=\sum _{i=0}^{I-1} \mathbb {1}_{T_i\le t< T_{i+1}}\sum _{x\in \mathbf x ^i}\delta _{x}\bar{\xi }_\mathbf{x ^i}(x). \end{aligned}$$
(2.23)

Remark 5

(i) Note that (B\(_\mathbf{x ^{i-1}\cup y^i_*}\)) implies (B\(_\mathbf{x ^{i-1}}\)) and (B\(_\mathbf{x ^i}\)), with the same constants \(\theta _x\). (ii) Case (a) is very unlikely if the parameters of the model are chosen in a random fashion since it requires a very particular equality. Case (b) guarantees that we terminate the procedure as soon as the conditions of \(\text {IC}(\mathbf x ^i,\eta ,\bar{c})\) are not satisfied after the \(i^\text {th}\) invasion. For every \(y\in (\mathbf x ^{i-1}\cup y^i_*)\backslash \mathbf x ^i\), \(f_{y,\mathbf x ^i}\le 0\) is ensured (going through the proof of Theorem 1), so the only problem can arise from equality. (iii) Note that the theorem implies that, in the case of \(T_i=\infty \), even if there was a mutant type \(y\in \mathbb {H}^n\backslash \mathbf x ^{i-1}\) such that \(f_{y,\mathbf x ^{i-1}}=0\), it would not be able to invade the resident population.

The proof of this result is given in Sects. 3 and 4.

2.3 Convergence in the case of equal competition

In the context of adaptive walks and flights, the fitness landscape on the type space is possibly random, but usually fixed over time. For a monomorphic resident type, the current fitness of any type, corresponding to its invasion fitness, is determined by the difference between its individual fitness and the fitness of the resident type.

As a special case of our model, we consider equal competition between all types on the hypercube. In this case, one can simplify the description of the limit process and derive some interesting properties.

We introduce the additional assumption

(C) For every \(x,y\in \mathbb {H}^n\), \(\alpha (x,y)\equiv \alpha >0\).

This leads to a couple of nice properties of the invasion fitness \(f_{y,\{x\}}\). As in the adaptive walks framework, we obtain

$$\begin{aligned} f_{y,\{x\}}=r(y)-\alpha (y,x)\bar{\xi }_{\{x\}}(x)=r(y)-r(x), \end{aligned}$$
(2.24)

which yields

$$\begin{aligned} f_{y,\{x\}}=-f_{x,\{y\}}~\text { and }~f_{z,\{y\}}+f_{y,\{x\}}=f_{z,\{x\}}. \end{aligned}$$
(2.25)

As a consequence, there is some kind of transitivity of invasion fitness. A type z that is unfit relative to some other type y, i.e. \(f_{z,\{y\}}<0\), is unfit relative to all types that are fitter than y. This ensures that types which are once suppressed by resident types stay at a microscopic level forever. In particular, case (b) in Theorem 2 is automatically excluded by assumption (C).
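The identities (2.24) and (2.25) can be checked numerically in a few lines. This is a sketch under the assumption that the monomorphic equilibrium size is \(\bar{\xi }_{\{x\}}(x)=r(x)/\alpha \), as implied by (2.24); the rates in `r` and the value of `alpha` are hypothetical.

```python
def invasion_fitness(r, alpha, y, x):
    """f_{y,{x}} = r(y) - alpha * xi_bar_x(x) under assumption (C)."""
    equilibrium_size = r[x] / alpha  # assumed monomorphic equilibrium of x
    return r[y] - alpha * equilibrium_size


# Hypothetical rates for three types and a constant competition kernel.
r = {"x": 1.0, "y": 1.5, "z": 1.25}
alpha = 2.0

f_yx = invasion_fitness(r, alpha, "y", "x")
f_xy = invasion_fitness(r, alpha, "x", "y")
f_zy = invasion_fitness(r, alpha, "z", "y")
f_zx = invasion_fitness(r, alpha, "z", "x")
```

Here z is unfit relative to y but fit relative to x, and the three values are related exactly as in (2.25).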

As before, we terminate the procedure as soon as case (a) in Theorem 2 occurs to ensure that there is always a unique mutant that reaches the threshold of order 1 first after an invasion. Starting out with only a single type at its equilibrium size, i.e. \(\mathbf x ^0=\{x^0\}\), this also implies that we avoid coexistence and always maintain a monomorphic resident population. This is due to the fact that an invading type has to have higher rate r than the current resident type, which prevents a polymorphic Lotka–Volterra equilibrium.

Assumption (B\(_\mathbf x \)) can no longer be satisfied for constant \(\alpha \), as soon as \(|\mathbf x |\ge 2\). However, it is no longer needed since the resident types are monomorphic, i.e. \(|\mathbf x ^i|=1\), and have a lower rate r than the invading types, which implies a unique stable equilibrium of the Lotka–Volterra system involving \(\mathbf x ^i\cup y^i_*\). We comment on this in more detail in Sect. 5, where we adapt the proof of Theorem 2 to this situation.

In the case of a monomorphic resident population \(\mathbf x =\{x\}\), we use the shorthand notation \(\bar{\xi }_x:=\bar{\xi }_{\{x\}}\), \(f_{y,x}:=f_{y,\{x\}}\). For types \(x^i\), \(x^j\), we write \(f_{i,j}:=f_{x^i,x^j}\).

The limiting jump process can now be described in a simple way.

Theorem 3

Consider the system of differential equations (2.2) and let \(\xi ^\mu _0\in \text {IC}(\{x^0\},\eta ,\bar{c})\) such that \(\lambda _y\ge |y-x^0|\), for all \(y\in \mathbb {H}^n\), and \(\eta \) small enough. Assume (A) and (C) and set \(I:=i\) for the smallest \(i\in {\mathbb N}\) such that the minimiser in (2.20) is not unique, and \(I:=\infty \) if this does not occur. In the latter case, we set \(T_\infty :=\infty \).

Then, for every \(t\in [0,T_I)\backslash \{T_i,0\le i\le I\}\),

$$\begin{aligned} \lim _{\mu \rightarrow 0}\xi ^\mu _{t\ln \frac{1}{\mu }}=\sum _{i=0}^{I-1} \mathbb {1}_{T_i\le t< T_{i+1}}\delta _{x^i}\bar{\xi }_{x^i}(x^i). \end{aligned}$$
(2.26)

Moreover, the following identities hold:

$$\begin{aligned} x^i&=\mathop {{{\,\mathrm{arg\,min}\,}}}\limits _{y\in \mathbb {H}^n: f_{y,x^{i-1}}>0}\frac{|y-x^0|-|x^{i-1}-x^0|}{f_{y,x^{i-1}}}, \end{aligned}$$
(2.27)
$$\begin{aligned} T_i&=\frac{|x^i-x^0|-|x^{i-1}-x^0|}{f_{i,i-1}}. \end{aligned}$$
(2.28)

Remark 6

(i) \(\lambda _y\ge |y-x^0|\) ensures that the initial population size of all microscopic types is not larger than what they gain due to incoming mutants from \(x^0\) within a time of order 1. This is necessary to obtain the identities for \(x^i\) and \(T_i\). (ii) Uniqueness of the minimiser in (2.20) is equivalent to uniqueness of the minimiser in (2.27) and hence \(x^i\) is well-defined. (iii) In the case where \(I=\infty \), the jump process in Theorem 3 continues as long as there is a type with higher individual fitness, i.e. higher rate r. As a result, it can cross arbitrarily large valleys in the fitness landscape (defined by r) and eventually reaches a global fitness maximum, where it remains. Note that this global maximum does not have to be unique. The jump process reaches the maximum that is closest to \(x^0\) in \(\mathbb {H}^n\), which is unique if \(I=\infty \), and equally fit types cannot invade as mentioned in Remark 5 (iii). With these properties, the jump process resembles an adaptive flight. However, it does not quite fit into that framework since it is not only jumping to local fitness maxima. (iv) Every invasion step increases the distance on \(\mathbb {H}^n\) between the resident type and \(x^0\). This can be seen inductively as follows. Consider the \((i+1)^\text {st}\) invasion. \(x^i\) was a minimiser of \((|y-x^0|-|x^{i-1}-x^0|)/f_{y,x^{i-1}}\). If now y satisfies \(f_{y,x^i}>0\), then

$$\begin{aligned} \frac{|y-x^0|-|x^{i-1}-x^0|}{f_{y,x^{i-1}}}\ge \frac{|x^i-x^0|-|x^{i-1}-x^0|}{f_{i,i-1}}, \end{aligned}$$
(2.29)

and since \(f_{y,x^{i-1}}=f_{y,x^i}+f_{i,i-1}>f_{i,i-1}\) and \(|x^i-x^0|>|x^{i-1}-x^0|\) (by assumption), \(|y-x^0|-|x^{i-1}-x^0|>|x^i-x^0|-|x^{i-1}-x^0|\), and hence \(|y-x^0|>|x^i-x^0|\). The proof of Theorem 3 is found in Sect. 5.
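The recursion (2.27) and (2.28) translates directly into a small routine. The sketch below is an illustration only: it assumes that types are encoded as n-bit integers, that the rates r are supplied as exact numbers (integers here, so that ties are detected without rounding issues), and that the procedure is aborted with an error once the minimiser is not unique. The names `hamming` and `adaptive_flight` and both example landscapes are hypothetical.

```python
from fractions import Fraction


def hamming(x, y):
    """Hamming distance between two types encoded as n-bit integers."""
    return bin(x ^ y).count("1")


def adaptive_flight(r, x0, n):
    """Jump chain (T_i, x^i) according to (2.27) and (2.28).

    r[y] is the individual fitness of type y. The chain starts with (0, x0)
    and stops once no fitter type exists, i.e. at a global maximum of r.
    """
    path = [(Fraction(0), x0)]
    x = x0
    while True:
        fitter = [y for y in range(2 ** n) if r[y] > r[x]]
        if not fitter:
            return path
        ratios = {
            y: Fraction(hamming(y, x0) - hamming(x, x0)) / Fraction(r[y] - r[x])
            for y in fitter
        }
        t_next = min(ratios.values())
        minimisers = [y for y in fitter if ratios[y] == t_next]
        if len(minimisers) > 1:
            raise ValueError("minimiser in (2.27) is not unique")
        x = minimisers[0]
        path.append((t_next, x))


# Hypothetical landscapes on the square (n = 2), types 0b00, 0b01, 0b10, 0b11.
two_steps = adaptive_flight([0, 3, 0, 4], x0=0, n=2)
valley = adaptive_flight([1, 0, 0, 3], x0=0, n=2)
```

In the first landscape the process visits type 01 at time 1/3 before reaching 11 at time 1. In the second one both neighbours of the initial type are less fit, and the process jumps across this valley directly to 11 at time 1.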

2.4 Derivation from the individual-based stochastic model in the large population limit

The deterministic system studied above can be obtained as the large population limit of an individual-based Markov process. At time t, we consider a population of finite size \(N(t)\in {\mathbb N}\). Each living individual is represented by its type \(x_1(t),\ldots ,x_{N(t)}(t)\in \mathbb {H}^n\) and the state of the population is described by the finite point measure

$$\begin{aligned} \nu ^\mu _t=\sum _{i=1}^{N(t)} \delta _{x_i(t)}. \end{aligned}$$
(2.30)

\(\nu ^\mu _t(x)\) describes the number of individuals of type \(x\in \mathbb {H}^n\) at time t. The dynamics of the Markov process are determined by the same parameters b, d, \(\alpha \), \(\mu \), and m as for the deterministic system \(\xi ^\mu _t\).

To let the size of the population tend to infinity, we introduce the carrying capacity of the environment, denoted by \(K\in {\mathbb N}\). This can for example be interpreted as the amount of available space or resources. As K increases, the competitive pressure between individuals decreases and we scale \(\alpha _K(x,y)\equiv \frac{\alpha (x,y)}{K}\). This leads to an equilibrium population size of order K. To derive a finite limit for large populations, i.e. as \(K\rightarrow \infty \), we consider the rescaled measure

$$\begin{aligned} \nu ^{\mu ,K}_t:=\frac{\nu ^\mu _t}{K}. \end{aligned}$$
(2.31)

This measure-valued Markov process can be constructed similarly to (Fournier and Méléard 2004, Ch 2), with infinitesimal generator

$$\begin{aligned} \mathcal {L}^K\phi (\nu )&=\sum _{x\in \mathbb {H}^n}K\nu (x)\left( \phi \left( \nu +\frac{\delta _x}{K}\right) -\phi (\nu )\right) b(x)(1-\mu )\nonumber \\&\quad +\sum _{x\in \mathbb {H}^n}K\nu (x)\sum _{y\sim x} \left( \phi \left( \nu +\frac{\delta _y}{K}\right) -\phi (\nu )\right) b(x)\mu m(x,y)\nonumber \\&\quad +\sum _{x\in \mathbb {H}^n}K\nu (x)\left( \phi \left( \nu -\frac{\delta _x}{K}\right) -\phi (\nu )\right) \left( d(x)+\sum _{y\in \mathbb {H}^n}\frac{\alpha (x,y)}{K}K\nu (y)\right) , \end{aligned}$$
(2.32)

where \(\nu \in {\mathcal M}(\mathbb {H}^n)\) is a non-negative measure on \(\mathbb {H}^n\) and \(\phi \) a measurable bounded function from \({\mathcal M}(\mathbb {H}^n)\) to \({\mathbb R}\).

Ethier and Kurtz have shown convergence of this process to \(\xi ^\mu \) as K tends to infinity.

Theorem 4

(Ethier and Kurtz (1986), Chap.11, Thm.2.1) Assume that the initial conditions converge almost surely to a deterministic limit, i.e. \(\nu ^{\mu ,K}_0\rightarrow \xi ^\mu _0\), as \(K\rightarrow \infty \). Then, for every \(T\ge 0\), \((\nu ^{\mu ,K}_t)_{0\le t\le T}\) almost surely converges uniformly to the deterministic process \((\xi ^\mu _t)_{0\le t\le T}\), which is the unique solution to the system of differential equations (2.2) with initial condition \(\xi ^\mu _0\).
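The three lines of the generator (2.32) correspond to clonal births, mutant births, and deaths, and the process can be simulated with a standard Gillespie scheme. The following is a minimal sketch and not the construction of Fournier and Méléard: it assumes that types are encoded as n-bit integers and that the mutation kernel is uniform on the n neighbours, \(m(x,y)=1/n\). The function name `simulate_population` and all parameter values are hypothetical.

```python
import random


def simulate_population(counts, b, d, alpha, mu, K, n, t_max, rng):
    """Gillespie simulation of the individual-based model up to time t_max.

    counts[x] is the number of individuals of type x (an n-bit integer).
    Returns the rescaled state, i.e. counts divided by K.
    """
    counts = list(counts)
    types = range(2 ** n)
    events = [("birth", x) for x in types] + [("death", x) for x in types]
    t = 0.0
    while True:
        birth = [counts[x] * b[x] for x in types]
        death = [
            counts[x] * (d[x] + sum(alpha[x][y] * counts[y] for y in types) / K)
            for x in types
        ]
        total = sum(birth) + sum(death)
        if total <= 0:
            break  # the population is extinct
        t += rng.expovariate(total)
        if t > t_max:
            break
        kind, x = rng.choices(events, weights=birth + death, k=1)[0]
        if kind == "death":
            counts[x] -= 1
        elif rng.random() < mu:
            # mutant offspring: flip one uniformly chosen coordinate (assumed m)
            counts[x ^ (1 << rng.randrange(n))] += 1
        else:
            counts[x] += 1  # clonal offspring
    return [c / K for c in counts]


# Hypothetical parameters: n = 2, equal competition, start at equilibrium of type 0.
K = 200
state = simulate_population(
    counts=[K, 0, 0, 0], b=[2.0] * 4, d=[1.0] * 4,
    alpha=[[1.0] * 4 for _ in range(4)], mu=0.01, K=K, n=2,
    t_max=5.0, rng=random.Random(0),
)
```

For increasing K, the returned rescaled state is expected to stay close to the solution of (2.2) on finite time intervals, in line with the convergence stated above.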

2.5 Convergence for a limited radius of mutation

The limiting process in Theorem 3 already looks similar to the greedy adaptive walk of Nowak and Krug (2015), mentioned in the introduction. It is a monomorphic jump process on the type space that always jumps to types of higher individual fitness r. However, it can take larger steps than just to neighbouring types and we have seen that the initial type \(x^0\) plays an important role in determining the jump chain. This is due to the fact that, already after an arbitrarily small time, mutation has induced a positive population size for every possible type. These mutant populations have size of order \(\mu \) to the power of the distance to \(x^0\) on \(\mathbb {H}^n\). The next invading type is then found balancing low initial \(\mu \)-power and high (invasion) fitness.

In all our previous considerations, arbitrarily small populations were able to reproduce and foster mutants, which can lead to population sizes as small as \(\mu ^n\). This might not always fit reality well.

If we consider the stochastic model, introduced in the previous subsection, and allow for the mutation probability \(\mu _K\) to decrease as K increases, we can study the simultaneous limit of large populations and rare mutations. For a population of size \(\mu _K^n\) to be able to reproduce within a time of order 1, we need that

$$\begin{aligned} \lim _{K\rightarrow \infty }\mu _K^n\cdot K\ge 1, \end{aligned}$$
(2.33)

or equivalently

$$\begin{aligned} \lim _{K\rightarrow \infty }\frac{\mu _K}{K^{-\frac{1}{n}}}\ge 1. \end{aligned}$$
(2.34)

In this case, we would recover the deterministic system (2.2) in the limit of \(K\rightarrow \infty \).

If instead \(\mu _K\) is of order \(K^{-\frac{1}{\ell }}\) for some \(\ell <n\), then populations with a size of order \(\mu _K^\lambda \), for \(\lambda >\ell \), vanish as \(K\rightarrow \infty \) and hence cannot reproduce. If we consider a monomorphic resident type x, it spreads mutants y of population size \(\mu _K^{|y-x|}\). This means that it can initially only foster mutant populations in a radius of \(\ell \).
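To make the threshold explicit, suppose for illustration that \(\mu _K=K^{-\frac{1}{\ell }}\) exactly (the text above only fixes the order of \(\mu _K\)). A population of rescaled size \(\mu _K^\lambda \) then consists of

$$\begin{aligned} K\mu _K^\lambda =K^{1-\frac{\lambda }{\ell }} \end{aligned}$$

individuals, which diverges for \(\lambda <\ell \), equals 1 for \(\lambda =\ell \), and tends to 0 for \(\lambda >\ell \), as \(K\rightarrow \infty \). For instance, with \(\ell =2\) and \(K=10^6\), one has \(\mu _K=10^{-3}\), so that mutants at distance 2 from the resident type are represented by roughly one individual, while those at distance 3 correspond to only \(10^{-3}\) individuals.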

This regime has already been studied by Bovier et al. (2019) and Champagnat, Méléard, and Tran (preprint 2019). It is shown that, on the type space \({\mathbb N}\) (with neighbours having difference exactly 1) and on the usual time scale of \(\ln 1/\mu _K\), a fitness valley of width \(\le \ell \), but no further, can be crossed. However, crossing a wider valley is possible on a faster diverging time scale.

In the following, we want to mimic this parameter regime of the stochastic system in our deterministic model. To do so, we introduce a cut-off that freezes the dynamics of a population below the threshold of \(\bar{\xi }\mu ^\ell \), where

$$\begin{aligned}\bar{\xi }:=\min \{\bar{\xi }_x(x)/2: x\in \mathbb {H}^n,r(x)>0\}>0\end{aligned}$$

is chosen such that every resident type will eventually surpass this value (which is relevant in the case \(\ell =1\)). The new system of differential equations then reads as follows.

$$\begin{aligned} \tfrac{d}{dt}\xi ^\mu _t(x)&= \left[ b(x)\mathbb {1}_{\xi ^\mu _t(x)\ge \bar{\xi }\mu ^\ell }-d(x)-\sum _{y\in \mathbb {H}^n}\alpha (x,y)\xi ^\mu _t(y)\mathbb {1}_{\xi ^\mu _t(y)\ge \bar{\xi }\mu ^\ell }\right] \xi ^\mu _t(x)\nonumber \\&\quad +\mu \sum _{y\sim x}\xi ^\mu _t(y)\mathbb {1}_{\xi ^\mu _t(y)\ge \bar{\xi }\mu ^\ell }b(y)m(y,x)-\mu \xi ^\mu _t(x)\mathbb {1}_{\xi ^\mu _t(x)\ge \bar{\xi }\mu ^\ell }b(x). \end{aligned}$$
(2.35)

Remark 7

Reproduction (clonal and non-clonal) is stopped for types below the threshold of \(\bar{\xi }\mu ^\ell \). As a result, those types are in a kind of dormant state and can only grow due to the mutational influence of other, larger types. It does not affect the system that these dormant types remain at a low level since they do not influence the dynamics of other types and only become active again if they gain a larger amount due to incoming mutants.

The death rate of populations below \(\bar{\xi }\mu ^\ell \) is not set to zero. This is necessary to actually drop below the threshold if a population declines due to negative fitness. Otherwise, the population would remain at exactly \(\bar{\xi }\mu ^\ell \) and could immediately start growing again when its fitness becomes positive due to a change of resident types. This is however not what we want to achieve since populations that drop to the threshold are supposed to go extinct and only reappear due to incoming new mutants.

We cannot simply set the population size of a type to zero below the threshold. In that case, a zero-type would never become active since every gain due to mutation would immediately be cancelled.

As mentioned above, for \(\ell \ge n\), we just recover the original scenario of Theorem 2. This is due to the fact that, as long as there is at least one macroscopic type, every other type has population size of at least \(\mu ^n\), due to mutants from this macroscopic type.
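To see how the indicator functions in (2.35) act, it may help to evaluate the right-hand side for a concrete state. The sketch below assumes types encoded as n-bit integers and parameters stored in plain lists; the name `cutoff_rhs` and the example values (including the value 0.5 used for \(\bar{\xi }\)) are hypothetical.

```python
def cutoff_rhs(xi, b, d, alpha, m, mu, ell, xi_bar, n):
    """Right-hand side of (2.35) for every type x in {0, ..., 2**n - 1}."""
    types = range(2 ** n)
    threshold = xi_bar * mu ** ell
    # indicator that a type is above the cut-off and hence active
    active = [1.0 if xi[x] >= threshold else 0.0 for x in types]
    rhs = []
    for x in types:
        competition = sum(alpha[x][y] * xi[y] * active[y] for y in types)
        growth = (b[x] * active[x] - d[x] - competition) * xi[x]
        neighbours = [x ^ (1 << k) for k in range(n)]
        incoming = mu * sum(xi[y] * active[y] * b[y] * m[y][x] for y in neighbours)
        outgoing = mu * xi[x] * active[x] * b[x]
        rhs.append(growth + incoming - outgoing)
    return rhs


# Hypothetical example with n = 1: type 0 is resident, type 1 is dormant.
rhs = cutoff_rhs(
    xi=[1.0, 1e-9], b=[2.0, 3.0], d=[1.0, 1.0],
    alpha=[[1.0, 1.0], [1.0, 1.0]], m=[[0.0, 1.0], [1.0, 0.0]],
    mu=1e-3, ell=1, xi_bar=0.5, n=1,
)
```

In this example the dormant type 1 neither reproduces nor competes. It still dies at its natural rate and under competition from the resident type, and it grows only through the mutants arriving from type 0.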

For \(\ell =1\), if we keep assumption (C) of constant competition and a monomorphic initial condition, we obtain the greedy adaptive walk of Nowak and Krug (2015), where the process always jumps to the fittest direct neighbour of the current resident type. We re-define

$$\begin{aligned} x^i&:=\mathop {{{\,\mathrm{arg\,max}\,}}}\limits _{y\sim x^{i-1}}r(y), \end{aligned}$$
(2.36)
$$\begin{aligned} T_i&:=T_{i-1}+\frac{1}{f_{i,i-1}}, \end{aligned}$$
(2.37)

and set \(T_i:=\infty \), as soon as there exists no \(y\sim x^{i-1}\) such that \(r(y)>r(x^{i-1})\).

The convergence can be stated as follows.

Theorem 5

Consider the system of differential equations (2.35) for \(\ell =1\) and let \(\xi ^\mu _0\in \text {IC}(\{x^0\},\eta ,\bar{c})\) such that \(\lambda _y\ge 1\), for all \(y\sim x^0\), \(\lambda _y\ge 2\), for \(|y-x^0|\ge 2\), and \(\eta \) small enough. Assume (A) and (C) and set \(I:=i\) for the smallest \(i\in {\mathbb N}\) such that the maximiser in (2.36) is not unique, and \(I:=\infty \) if this does not occur. In the latter case, we set \(T_\infty :=\infty \).

Then, for every \(t\in [0,T_I)\backslash \{T_i,0\le i\le I\}\),

$$\begin{aligned} \lim _{\mu \rightarrow 0}\xi ^\mu _{t\ln \frac{1}{\mu }}=\sum _{i=0}^{I-1} \mathbb {1}_{T_i\le t< T_{i+1}}\delta _{x^i}\bar{\xi }_{x^i}(x^i). \end{aligned}$$
(2.38)

Remark 8

(i) \(\lambda _y\ge 2\), for \(|y-x^0|\ge 2\), ensures that no microscopic type has a larger initial population than what it gains due to the first incoming mutants from other types. (ii) The adaptive walk in Theorem 5 stops as soon as it reaches a local maximum of the individual fitness r since only direct neighbours of the resident type can be reached. Local maxima do not need to be strict. However, as in the previous cases, mutants with invasion fitness 0 cannot invade the resident population. (iii) It is no longer the case that every step increases the distance to \(x^0\). The walk could return to a type close to \(x^0\), which just could not be reached before because one had to go around a valley in the fitness landscape defined by r.
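The greedy walk defined by (2.36) and (2.37) can be sketched as follows. The sketch assumes types encoded as n-bit integers and integer rates r, it checks for a local maximum before checking uniqueness of the maximiser (the order of these two checks is our choice for the illustration), and the name `greedy_walk` and both landscapes are hypothetical.

```python
from fractions import Fraction


def greedy_walk(r, x0, n):
    """Jump chain (T_i, x^i) of the greedy adaptive walk, see (2.36) and (2.37)."""
    path = [(Fraction(0), x0)]
    x, t = x0, Fraction(0)
    while True:
        neighbours = [x ^ (1 << k) for k in range(n)]
        best = max(r[y] for y in neighbours)
        if best <= r[x]:
            return path  # local maximum of r: the next jump time is infinite
        maximisers = [y for y in neighbours if r[y] == best]
        if len(maximisers) > 1:
            raise ValueError("maximiser in (2.36) is not unique")
        t += 1 / Fraction(best - r[x])  # waiting time 1 / f_{i,i-1}
        x = maximisers[0]
        path.append((t, x))


# Hypothetical landscapes on the square (n = 2), types 0b00, 0b01, 0b10, 0b11.
uphill = greedy_walk([0, 3, 0, 4], x0=0, n=2)
stuck = greedy_walk([1, 0, 0, 3], x0=0, n=2)
```

In the first landscape the walk climbs via 01 (at time 1/3) to 11 (at time 4/3). In the second landscape both neighbours of the initial type are less fit, so the walk never leaves it, although the type 11 has higher fitness.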

In Sect. 6, we discuss the proof of Theorem 5, as well as the intermediate cases of \(1<\ell <n\).

2.6 Structure of the proofs

The general strategy of the proofs of all three theorems is to split the analysis of the evolution into two parts. First, the microscopic mutants grow in the presence of the coexisting resident types until one of them reaches a macroscopic population size of order 1, i.e. that does not vanish as \(\mu \rightarrow 0\). Second, this macroscopic mutant and the resident types attain a new equilibrium according to the Lotka–Volterra dynamics. The two phases are visualised in Fig. 2, found in Sect. 4, prior to the proof of Theorem 2.

The first phase is studied in detail in Sect. 3. Theorem 6 gives upper and lower bounds for the exponential growth of the non-resident types. The growth can be due to a type’s own (invasion) fitness or due to mutants from a growing neighbour. To get the correct approximation, the influences of all existing types have to be summed up. Meanwhile, the resident types stay close to their equilibrium. Corollary 1 considers the \(\ln 1/\mu \)-time scale and derives an approximation for the first time that a mutant reaches the macroscopic threshold.

After the threshold is reached, for the second phase, we can apply Theorem 1 to the Lotka–Volterra system involving the macroscopic mutant type and the resident types to derive the convergence to a new equilibrium state. This is possible since we now have a non-negative initial condition that does not vanish as \(\mu \rightarrow 0\).

In Sect. 4, this theorem is combined with Theorem 6, or rather Corollary 1, to analyse the full evolution of our system (2.2). First, in Lemma 1, the dependence of solutions on the initial condition and the size of \(\mu \) is studied to be able to approximate the full system by the Lotka–Volterra system only involving the macroscopic types. Second, in Lemma 2, continuity of the duration of the Lotka–Volterra phase in the initial condition is shown. From this, a uniform bound on the time to reach the initial conditions of Theorem 6 again is derived. All of this is then combined to show the convergence in Theorem 2, one invasion step at a time, and recursively describe the limiting process.

To prove Theorem 3, only slight changes have to be made. Since assumption (B\(_\mathbf x \)) is not satisfied, Theorem 1 can no longer be applied directly. However, the assumption is mainly needed to show uniqueness of the limiting equilibrium, which is, in this case, already implied by the structure of the individual fitness landscape. The rest of the proof, found in Sect. 5, is then devoted to simplifying the expressions for \(y^i_*\), \(T_i\), and \(\rho ^i_y\).

In Sect. 6, Theorem 5 is proved. Here, the bounds from Theorem 6 have to be revised. The rest of the argument follows the previous proofs.

3 Invasion analysis

In this section, we prove an exponential approximation of the growth of the non-resident subpopulations until the first type reaches a macroscopic threshold of order 1. We choose this threshold to be at \(\eta >0\), independent of \(\mu \), and pick \(\eta \) small enough for our purposes in the end.

Definition 7

For a resident population of \(\mathbf x \subset \mathbb {H}^n\), the time when the first mutant type reaches \(\eta >0\) is defined as

$$\begin{aligned} \tilde{T}^\mu _\eta&:=\inf \{s\ge 0:\exists \ y\in \mathbb {H}^n\backslash \mathbf x :\xi ^\mu _s(y)>\eta \}. \end{aligned}$$
(3.1)

To consider the evolutionary time scale \(\ln 1/\mu \), we define \(T^\mu _\eta \) through \(\tilde{T}^\mu _\eta =T^\mu _\eta \ln 1/\mu \).

We can now state the first result that describes the evolution of the system until \(\tilde{T}^\mu _\eta \).

Theorem 6

Consider the system of differential equations (2.2) and assume (A). Then there exist \(\tilde{\eta }>0\) and \(0<\bar{c}\le \bar{C}\), uniform in all \(\mathbf x \subset \mathbb {H}^n\) for which \(\text {LVE}_+(\mathbf x )=\{\bar{\xi }_\mathbf x \}\) and (B\(_\mathbf x \)) is satisfied, such that for \(\eta \le \tilde{\eta }\) and \(\mu <\eta \) the following holds:

If \(\xi ^\mu _0\in \text {IC}(\mathbf x ,\eta ,\bar{c})\), then, for every \(0<t_0\le t<\tilde{T}^\mu _\eta \) and every \(y\in \mathbb {H}^n\),

$$\begin{aligned} \check{c}\sum _{z\in \mathbb {H}^n}\mathrm {e}^{t(f_{z,\mathbf x }-\eta \check{C})}\mu ^{\rho _z+|z-y|}\le \xi ^\mu _t(y)\le \hat{c}\sum _{z\in \mathbb {H}^n}\mathrm {e}^{t(f_{z,\mathbf x }+\eta \hat{C})}\mu ^{\rho _z+|z-y|}(1+t)^m, \end{aligned}$$
(3.2)

where \(\rho _y:=\min _{z\in \mathbb {H}^n}(\lambda _z+|z-y|)\), \(m\in {\mathbb N}\), and \(0<\check{c},\check{C},\hat{c},\hat{C}<\infty \) are independent of \(\mu \) and \(\eta \) (but dependent on \(t_0\)).

Moreover, for all \(x\in \mathbf x \),

$$\begin{aligned} \xi ^\mu _t(x)\in [\bar{\xi }_\mathbf x (x)-\eta \bar{C},\bar{\xi }_\mathbf x (x)+\eta \bar{C}]. \end{aligned}$$
(3.3)

As a Corollary, we estimate the growth of the different subpopulations on the time scale \(\ln 1/\mu \) and derive the asymptotics of \(T^\mu _\eta \) as \(\mu \rightarrow 0\).

Corollary 1

Under the same assumptions as in Theorem 6 and with the same constants, we obtain that, for every \(y\in \mathbb {H}^n\) and every \(t_0\le t\ln 1/\mu \le \tilde{T}^\mu _\eta \),

$$\begin{aligned} \check{c}\mu ^{\min _{z\in \mathbb {H}^n}[\rho _z+|z-y|-t(f_{z,\mathbf x }-\eta \check{C})]}\le&\xi ^\mu _{t\ln \frac{1}{\mu }}(y)\nonumber \\ \le&2^n\hat{c}\mu ^{\min _{z\in \mathbb {H}^n}[\rho _z+|z-y|-t(f_{z,\mathbf x }+\eta \hat{C})]}\left( 1+t\ln \frac{1}{\mu }\right) ^m. \end{aligned}$$
(3.4)

Moreover, as long as there is a \(y\in \mathbb {H}^n\) for which \(f_{y,\mathbf x }>0\), there is an \(\bar{\eta }\le \tilde{\eta }\) such that for every \(\eta \le \bar{\eta }\)

$$\begin{aligned} \min _{\begin{array}{c} y\in \mathbb {H}^n\\ \lambda _y>0 \end{array}}\min _{\begin{array}{c} z\in \mathbb {H}^n\\ f_{z,\mathbf x }>0 \end{array}}\frac{\rho _z+|z-y|}{f_{z,\mathbf x }+\eta \hat{C}}\le \liminf _{\mu \rightarrow 0}T^\mu _\eta \le \limsup _{\mu \rightarrow 0}T^\mu _\eta \le \min _{\begin{array}{c} y\in \mathbb {H}^n\\ \lambda _y>0 \end{array}}\min _{\begin{array}{c} z\in \mathbb {H}^n\\ f_{z,\mathbf x }>0 \end{array}}\frac{\rho _z+|z-y|}{f_{z,\mathbf x }-\eta \check{C}}. \end{aligned}$$
(3.5)
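The quantities \(\rho _y\) and the two bounds in (3.5) are finite minimisations and can be evaluated directly. The following sketch only illustrates the formulas: the constants \(\hat{C}\) and \(\check{C}\) are not specified explicitly here, so the values passed as `C_hat` and `C_check` are placeholders, and the exponents `lam` and fitness values `f` are hypothetical.

```python
def hamming(x, y):
    """Hamming distance between two types encoded as n-bit integers."""
    return bin(x ^ y).count("1")


def rho(lam, n):
    """rho_y = min_z (lambda_z + |z - y|) for every type y."""
    types = range(2 ** n)
    return [min(lam[z] + hamming(z, y) for z in types) for y in types]


def invasion_time_bounds(lam, f, eta, C_hat, C_check, n):
    """Lower and upper bound in (3.5) for the rescaled invasion time."""
    types = range(2 ** n)
    rh = rho(lam, n)
    ys = [y for y in types if lam[y] > 0]
    zs = [z for z in types if f[z] > 0]
    if any(f[z] - eta * C_check <= 0 for z in zs):
        raise ValueError("eta is too large for the upper bound to be meaningful")
    lower = min((rh[z] + hamming(z, y)) / (f[z] + eta * C_hat) for y in ys for z in zs)
    upper = min((rh[z] + hamming(z, y)) / (f[z] - eta * C_check) for y in ys for z in zs)
    return lower, upper


# Hypothetical example on the square (n = 2) with resident type 0.
lam = [0, 1, 1, 2]
f = [0.0, -1.0, 0.5, 2.0]
limit = invasion_time_bounds(lam, f, eta=0.0, C_hat=1.0, C_check=1.0, n=2)
bounds = invasion_time_bounds(lam, f, eta=0.1, C_hat=1.0, C_check=1.0, n=2)
```

For \(\eta =0\) both bounds coincide, here at the value 1, which is attained by the type 11 with \(\rho =2\) and invasion fitness 2. For \(\eta >0\) the two bounds enclose this value.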

Proof of Theorem 6

The proof consists of two steps. We only derive the existence of \(\tilde{\eta }\) for a specific set \(\mathbf x \). To get a uniform parameter, we just have to minimise over the finite set of all such sets \(\mathbf x \).

First, we show that (3.3) holds up to time \(\tilde{T}^\mu _\eta \). Second, we inductively prove the upper bound in (3.2). The lower bound can be derived analogously.

Step 1 \(\xi ^\mu _t(x)\in [\bar{\xi }_\mathbf x (x)-\eta \bar{C},\bar{\xi }_\mathbf x (x)+\eta \bar{C}]\).

To prove our first claim, we analyse the distance of \(\left. \xi ^\mu _t\right| _\mathbf x :=(\xi ^\mu _t(x))_{x\in \mathbf x }\) from \(\bar{\xi }_\mathbf x \) with respect to the norm \(\left\| \cdot \right\| _\mathbf x \), defined in (2.12). We prove that, in an annulus with respect to the norm \(\left\| \cdot \right\| _\mathbf x \), this distance declines. Hence, starting inside the annulus, \(\left. \xi ^\mu _t\right| _\mathbf x \) will remain there. This argument is depicted in Fig. 1.

To approximate

$$\begin{aligned} \frac{d}{dt}\frac{\left\| \left. \xi ^\mu _t\right| _\mathbf x -\bar{\xi }_\mathbf x \right\| _\mathbf x ^2}{2}=\left\langle \left. \xi ^\mu _t\right| _\mathbf x -\bar{\xi }_\mathbf x ,\frac{d}{dt}(\left. \xi ^\mu _t\right| _\mathbf x -\bar{\xi }_\mathbf x )\right\rangle _\mathbf x \end{aligned}$$
(3.6)

from above, we split the right hand side of (2.2) into two parts.

We define \(F,V:\mathcal {M}(\mathbb {H}^n)\rightarrow {\mathbb R}^\mathbf x \),

$$\begin{aligned} F_x(\xi )=\left[ r(x)-\sum _{y\in \mathbf x }\alpha (x,y)\xi (y)\right] \xi (x),\quad x\in \mathbf x , \end{aligned}$$
(3.7)

the Lotka–Volterra part, and

$$\begin{aligned} V_x(\xi )=-\sum _{y\in \mathbb {H}^n\backslash \mathbf x }\alpha (x,y)\xi (y)\xi (x) +\mu \sum _{y\sim x}b(y)m(y,x)\xi (y)-\mu b(x)\xi (x),\quad x\in \mathbf x , \end{aligned}$$
(3.8)

the error part of the differential equation.

With this,

$$\begin{aligned} \left\langle \left. \xi ^\mu _t\right| _\mathbf x -\bar{\xi }_\mathbf x ,\frac{d}{dt}(\left. \xi ^\mu _t\right| _\mathbf x -\bar{\xi }_\mathbf x )\right\rangle _\mathbf x =\langle \left. \xi ^\mu _t\right| _\mathbf x -\bar{\xi }_\mathbf x ,F(\xi ^\mu _t)\rangle _\mathbf x +\langle \left. \xi ^\mu _t\right| _\mathbf x -\bar{\xi }_\mathbf x ,V(\xi ^\mu _t)\rangle _\mathbf x . \end{aligned}$$
(3.9)

We first approximate the norm of the error part, using that \(|\xi ^\mu _t(y)|\le \eta \) for \(y\in \mathbb {H}^n\backslash \mathbf x \). In addition, we assume that, for every \(x\in \mathbf x \), \(\xi ^\mu _t(x)\ge \eta \). We choose \(\eta \) such that this is always implied by (3.3) at the end of Step 1.

We estimate

$$\begin{aligned} |V_x(\xi ^\mu _t)|&\le \eta 2^n\max _{x\in \mathbf x ,y\in \mathbb {H}^n}\alpha (x,y)|\xi ^\mu _t(x)|+\mu n\max _{y\in \mathbb {H}^n}b(y)\max _{y\in \mathbb {H}^n}|\xi ^\mu _t(y)|\nonumber \\&\quad +\mu \max _{y\in \mathbb {H}^n}b(y)|\xi ^\mu _t(x)| \end{aligned}$$
(3.10)

and hence, using that \(\max _{y\in \mathbb {H}^n}|\xi ^\mu _t(y)|\le \left\| \left. \xi ^\mu _t\right| _\mathbf x \right\| \le c_\mathbf x ^{-1}\left\| \left. \xi ^\mu _t\right| _\mathbf x \right\| _\mathbf x \),

$$\begin{aligned} \left\| V(\xi ^\mu _t)\right\| _\mathbf x&\le \eta 2^n\max _{x\in \mathbf x ,y\in \mathbb {H}^n}\alpha (x,y)\left\| \left. \xi ^\mu _t\right| _\mathbf x \right\| _\mathbf x \nonumber \\&\quad +\mu \max _{y\in \mathbb {H}^n}b(y)\left( n\sqrt{|\mathbf x |\max _{x\in \mathbf x }\frac{\theta _x}{\bar{\xi }_\mathbf x (x)}\max _{y\in \mathbb {H}^n}|\xi ^\mu _t(y)|^2}+\left\| \left. \xi ^\mu _t\right| _\mathbf x \right\| _\mathbf x \right) \nonumber \\&\le \eta 2^n\max _{x\in \mathbf x ,y\in \mathbb {H}^n}\alpha (x,y)\left\| \left. \xi ^\mu _t\right| _\mathbf x \right\| _\mathbf x \nonumber \\&\quad +\mu \max _{y\in \mathbb {H}^n}b(y)\left( n\sqrt{|\mathbf x |} C_\mathbf x c_\mathbf x ^{-1}+1\right) \left\| \left. \xi ^\mu _t\right| _\mathbf x \right\| _\mathbf x \nonumber \\&\le \eta \left\| \left. \xi ^\mu _t\right| _\mathbf x \right\| _\mathbf x C, \end{aligned}$$
(3.11)

for some \(C<\infty \) independent of \(\eta \) and \(\mu \).

Next, we approximate the Lotka–Volterra part. To do so, we show that a slight perturbation of the positive definite matrix \((\theta _x\alpha (x,y))_{x,y\in \mathbf x }\) is still positive definite. Let \(\zeta \in {\mathbb R}^\mathbf x \) such that, for \(x\in \mathbf x \), \(|\zeta (x)-1|\le \tilde{\varepsilon }_\mathbf x \). Then

$$\begin{aligned}&\sum _{x,y\in \mathbf x }\zeta (x)\theta _x\alpha (x,y)u(x)u(y)\nonumber \\&\quad =\sum _{x,y\in \mathbf x }\theta _x\alpha (x,y)u(x)u(y)+\sum _{x,y\in \mathbf x }(\zeta (x)-1)\theta _x\alpha (x,y)u(x)u(y)\nonumber \\&\quad \ge \kappa \left\| u\right\| ^2-\max _{x\in \mathbf x }|\zeta (x)-1|\max _{x,y\in \mathbf x }(\theta _x\alpha (x,y))\sum _{x,y\in \mathbf x } |u(x)||u(y)|\nonumber \\&\quad \ge \left\| u\right\| ^2\big [\kappa -\tilde{\varepsilon }_\mathbf x |\mathbf x |^2\max _{x,y\in \mathbf x }\theta _x\alpha (x,y)\big ]\ge \frac{\kappa }{2}\left\| u\right\| ^2, \end{aligned}$$
(3.12)

as long as \(\tilde{\varepsilon }_\mathbf x \le \kappa (2|\mathbf x |^2\max _{x,y\in \mathbf x }\theta _x\alpha (x,y))^{-1}\).

We now apply this to \(\zeta (x)=\xi ^\mu _t(x)/\bar{\xi }_\mathbf x (x)\). The condition \(|\zeta (x)-1|\le \tilde{\varepsilon }_\mathbf x \) is satisfied whenever

$$\begin{aligned} |\xi ^\mu _t(x)-\bar{\xi }_\mathbf x (x)|\le \tilde{\varepsilon }_\mathbf x \bar{\xi }_\mathbf x (x), \end{aligned}$$
(3.13)

which is the case if

$$\begin{aligned} \left\| \left. \xi ^\mu _t\right| _\mathbf x -\bar{\xi }_\mathbf x \right\| _\mathbf x \le c_\mathbf x \tilde{\varepsilon }_\mathbf x \min _{x\in \mathbf x }\bar{\xi }_\mathbf x (x)=:\varepsilon _\mathbf x . \end{aligned}$$
(3.14)

Using the fact that \(\bar{\xi }_\mathbf x \) is an equilibrium of (2.4) for which \(\bar{\xi }_\mathbf x (x)>0\) holds for all \(x\in \mathbf x \), we derive

$$\begin{aligned}&\langle \left. \xi ^\mu _t\right| _\mathbf x -\bar{\xi }_\mathbf x ,F(\xi ^\mu _t)\rangle _\mathbf x \nonumber \\&\quad = \sum _{x\in \mathbf x } \frac{\theta _x}{\bar{\xi }_\mathbf x (x)}(\xi ^\mu _t(x)-\bar{\xi }_\mathbf x (x)) \left[ r(x)-\sum _{y\in \mathbf x }\alpha (x,y)\xi ^\mu _t(y)\right] \xi ^\mu _t(x)\nonumber \\&\quad = -\sum _{x\in \mathbf x } \frac{\theta _x}{\bar{\xi }_\mathbf x (x)}(\xi ^\mu _t(x)-\bar{\xi }_\mathbf x (x)) \left[ \sum _{y\in \mathbf x }\alpha (x,y)(\xi ^\mu _t(y)-\bar{\xi }_\mathbf x (y))\right] \xi ^\mu _t(x)\nonumber \\&\quad = -\sum _{x,y\in \mathbf x }\frac{\xi ^\mu _t(x)}{\bar{\xi }_\mathbf x (x)}\theta _x\alpha (x,y)(\xi ^\mu _t(x)-\bar{\xi }_\mathbf x (x))(\xi ^\mu _t(y)-\bar{\xi }_\mathbf x (y))\nonumber \\&\quad \le -\frac{\kappa }{2}\left\| \left. \xi ^\mu _t\right| _\mathbf x -\bar{\xi }_\mathbf x \right\| ^2. \end{aligned}$$
(3.15)

Combining estimates (3.11) and (3.15), we get

$$\begin{aligned} \frac{d}{dt}\frac{\left\| \left. \xi ^\mu _t\right| _\mathbf x -\bar{\xi }_\mathbf x \right\| _\mathbf x ^2}{2}&= \langle \left. \xi ^\mu _t\right| _\mathbf x -\bar{\xi }_\mathbf x ,F(\xi ^\mu _t)\rangle _\mathbf x +\langle \left. \xi ^\mu _t\right| _\mathbf x -\bar{\xi }_\mathbf x ,V(\xi ^\mu _t)\rangle _\mathbf x \nonumber \\&\le -\frac{\kappa }{2}\left\| \left. \xi ^\mu _t\right| _\mathbf x -\bar{\xi }_\mathbf x \right\| ^2 +\left\| \left. \xi ^\mu _t\right| _\mathbf x -\bar{\xi }_\mathbf x \right\| _\mathbf x \left\| V(\xi ^\mu _t)\right\| _\mathbf x \nonumber \\&\le -\frac{\kappa }{2}\left\| \left. \xi ^\mu _t\right| _\mathbf x -\bar{\xi }_\mathbf x \right\| ^2 +\left\| \left. \xi ^\mu _t\right| _\mathbf x -\bar{\xi }_\mathbf x \right\| _\mathbf x \eta \left\| \left. \xi ^\mu _t\right| _\mathbf x \right\| _\mathbf x C\nonumber \\&\le -\left\| \left. \xi ^\mu _t\right| _\mathbf x -\bar{\xi }_\mathbf x \right\| _\mathbf x ^2\left( \frac{\kappa }{2C_\mathbf x ^2}-\eta \frac{C\left\| \left. \xi ^\mu _t\right| _\mathbf x \right\| _\mathbf x }{\left\| \left. \xi ^\mu _t\right| _\mathbf x -\bar{\xi }_\mathbf x \right\| _\mathbf x }\right) \nonumber \\&\le -\left\| \left. \xi ^\mu _t\right| _\mathbf x -\bar{\xi }_\mathbf x \right\| _\mathbf x ^2\frac{\kappa }{4C_\mathbf x ^2}<0, \end{aligned}$$
(3.16)

whenever

$$\begin{aligned} \varepsilon _\mathbf x \ge \left\| \left. \xi ^\mu _t\right| _\mathbf x -\bar{\xi }_\mathbf x \right\| _\mathbf x \ge \eta C\left\| \left. \xi ^\mu _t\right| _\mathbf x \right\| _\mathbf x \frac{4C_\mathbf x ^2}{\kappa }\ge \eta C\bigg (\left\| \bar{\xi }_\mathbf x \right\| _\mathbf x -\varepsilon _\mathbf x \bigg )\frac{4C_\mathbf x ^2}{\kappa }=:\eta \underline{c}. \end{aligned}$$
(3.17)

Finally, we choose \(\tilde{\eta }\) small enough such that \(\tilde{\eta }<\varepsilon _\mathbf x /\underline{c}\).

Now we can follow the argument that was outlined in the beginning and is supported by Fig. 1. As long as \(\eta \le \tilde{\eta }\) and

$$\begin{aligned} \left\| \left. \xi ^\mu _0\right| _\mathbf x -\bar{\xi }_\mathbf x \right\| \le \eta \underline{c}C_\mathbf x ^{-1}=:\eta \bar{c}_\mathbf x , \end{aligned}$$
(3.18)

we obtain that \(\left\| \left. \xi ^\mu _0\right| _\mathbf x -\bar{\xi }_\mathbf x \right\| _\mathbf x \le \eta \underline{c}\). Because of (3.16), it follows that \(\left\| \left. \xi ^\mu _t\right| _\mathbf x -\bar{\xi }_\mathbf x \right\| _\mathbf x \le \eta \underline{c}\), for every \(0\le t\le \tilde{T}^\mu _\eta \), and hence

$$\begin{aligned} \left\| \left. \xi ^\mu _t\right| _\mathbf x -\bar{\xi }_\mathbf x \right\| \le \eta \underline{c}c_\mathbf x ^{-1}=: \eta \bar{C}_\mathbf x . \end{aligned}$$
(3.19)

For the single types, this implies, for every \(0\le t\le \tilde{T}^\mu _\eta \), that

$$\begin{aligned} \xi ^\mu _t(x)\in [\bar{\xi }_\mathbf x (x)-\eta \bar{C}_\mathbf x ,\bar{\xi }_\mathbf x (x)+\eta \bar{C}_\mathbf x ],\quad x\in \mathbf x , \end{aligned}$$
(3.20)

whenever

$$\begin{aligned} \xi ^\mu _0(x)\in \left[ \bar{\xi }_\mathbf x (x)-\eta \frac{\bar{c}_\mathbf x }{\sqrt{|\mathbf x |}},\bar{\xi }_\mathbf x (x)+\eta \frac{\bar{c}_\mathbf x }{\sqrt{|\mathbf x |}}\right] ,\quad x\in \mathbf x . \end{aligned}$$
(3.21)
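The factor \(1/\sqrt{|\mathbf x |}\) in (3.21) is what turns the coordinate-wise condition into the Euclidean one. As a short worked step, assuming (as in the caption of Fig. 1) that \(\left\| \cdot \right\| \) is the standard Euclidean norm on \({\mathbb R}^\mathbf x \), condition (3.21) gives

$$\begin{aligned} \left\| \left. \xi ^\mu _0\right| _\mathbf x -\bar{\xi }_\mathbf x \right\| ^2=\sum _{x\in \mathbf x }|\xi ^\mu _0(x)-\bar{\xi }_\mathbf x (x)|^2\le |\mathbf x |\cdot \frac{\eta ^2\bar{c}_\mathbf x ^2}{|\mathbf x |}=\eta ^2\bar{c}_\mathbf x ^2, \end{aligned}$$

which is (3.18). In the other direction, each coordinate is bounded by the Euclidean norm, \(|\xi ^\mu _t(x)-\bar{\xi }_\mathbf x (x)|\le \left\| \left. \xi ^\mu _t\right| _\mathbf x -\bar{\xi }_\mathbf x \right\| \), so (3.19) yields (3.20).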

Setting \(\bar{c}:=\min _{\mathbf y \subset \mathbb {H}^n}\bar{c}_\mathbf y \) and \(\bar{C}:=\max _{\mathbf y \subset \mathbb {H}^n}\bar{C}_\mathbf y \), and choosing \(\tilde{\eta }\le \min _{x\in \mathbf x }\bar{\xi }_\mathbf x (x)/(2\bar{C}+2)\) to ensure that \(\xi ^\mu _t(x)>\eta \), for every \(x\in \mathbf x \), we arrive at the claim.

Fig. 1

Scheme for the argument in Step 1. Dashed lines indicate balls \(B(\bar{\xi }_\mathbf x ,\eta \bar{c}_\mathbf x )\) and \(B(\bar{\xi }_\mathbf x ,\eta \bar{C}_\mathbf x )\) with respect to the standard Euclidean norm, while solid lines correspond to balls \(B_\mathbf x (\bar{\xi }_\mathbf x ,\eta \underline{c})\) and \(B_\mathbf x (\bar{\xi }_\mathbf x ,\varepsilon _\mathbf x )\) with respect to the \(\left\| \cdot \right\| _\mathbf x \) norm

Step 2 Inductive exponential bounds.

We derive the upper bound for \(\xi ^\mu _t(y)\) in (3.2) in full detail. At the end of the proof, we comment on how the same strategy can be adapted to the lower bound.

To begin, we establish an upper bound on \(\tfrac{d}{dt}\xi ^\mu _t(y)\), using the bounds (3.20) on the resident types.

$$\begin{aligned} \tfrac{d}{dt}\xi ^\mu _t(y)&\le \Big [r(y)-\sum _{x\in \mathbf x }\alpha (y,x)(\bar{\xi }_\mathbf x (x)-\eta \bar{C} )\Big ]\xi ^\mu _t(y)+\mu \sum _{z\sim y}\underbrace{b(z)m(z,y)}_{\le \tilde{C}_y\forall z\sim y}\xi ^\mu _t(z)\nonumber \\&\le \left[ r(y)-\sum _{x\in \mathbf x }\alpha (y,x)\bar{\xi }_\mathbf x (x)+\eta \underbrace{\bar{C}\sum _{x\in \mathbf x }\alpha (y,x)}_{=:\hat{C}_y}\right] \xi ^\mu _t(y)+\mu \tilde{C}_y\sum _{z\sim y}\xi ^\mu _t(z)\nonumber \\&\le [f_{y,\mathbf x }+\eta \hat{C}]\xi ^\mu _t(y)+\mu \tilde{C}\sum _{z\sim y}\xi ^\mu _t(z), \end{aligned}$$
(3.22)

where \(\hat{C}:=\max _{y\in \mathbb {H}^n}\hat{C}_y<\infty \) and \(\tilde{C}:=\max _{y\in \mathbb {H}^n}\tilde{C}_y<\infty \).

We prove by induction that, for every \(m\ge 0\), there exists a constant \(C_m<\infty \), independent of \(\mu \), \(\eta \), and y, such that, for every \(0\le t\le \tilde{T}^\mu _\eta \),

$$\begin{aligned} \xi ^\mu _t(y)\le C_m\left[ \sum _{\begin{array}{c} z\in \mathbb {H}^n\\ |z-y|\le m \end{array}}\mathrm {e}^{t(f_{z,\mathbf x }+\eta \hat{C})}\Bigg (\mu ^{\rho _z+|z-y|}+\frac{1}{\eta }\mu ^{m+1}\Bigg )(1+t)^m+\mu ^{m+1}\right] . \end{aligned}$$
(3.23)

For the case \(m=0\), we approximate

$$\begin{aligned} \tfrac{d}{dt}\xi ^\mu _t(y)\le [f_{y,\mathbf x }+\eta \hat{C}]\xi ^\mu _t(y)+\mu \tilde{C}\underbrace{\sum _{z\sim y}\mathbb {1}_{z\in \mathbf x }(\bar{\xi }_\mathbf x (z)+\eta \bar{C} )+\mathbb {1}_{z\in \mathbb {H}^n\backslash \mathbf x }\eta }_{\le C\text { uniformly in }y,z}, \end{aligned}$$
(3.24)

and hence

$$\begin{aligned} \xi ^\mu _t(y)&\le \mathrm {e}^{t(f_{y,\mathbf x }+\eta \hat{C})}\xi ^\mu _0(y)+\mu \tilde{C}C\int _0^t \mathrm {e}^{(t-s)(f_{y,\mathbf x }+\eta \hat{C})}ds\nonumber \\&\le \mathrm {e}^{t(f_{y,\mathbf x }+\eta \hat{C})}C_y\mu ^{\lambda _y}+\mu \tilde{C}C\frac{1}{f_{y,\mathbf x }+\eta \hat{C}}(\mathrm {e}^{t(f_{y,\mathbf x }+\eta \hat{C})}-1). \end{aligned}$$
(3.25)

Choose \(\tilde{\eta }>0\) small enough such that \(f_{y,\mathbf x }+\tilde{\eta }\hat{C}< 0\), for every \(y\in \mathbb {H}^n\) for which \(f_{y,\mathbf x }<0\). Then, for \(\eta \le \tilde{\eta }\) and a different constant \(C<\infty \), the second summand can be bounded from above by \(C\mu \), for \(f_{y,\mathbf x }<0\), and by \(C/\eta \cdot \mathrm {e}^{t(f_{y,\mathbf x }+\eta \hat{C})}\mu \), for \(f_{y,\mathbf x }\ge 0\), since in the latter case \(f_{y,\mathbf x }+\eta \hat{C}\ge \eta \hat{C}\). The constant C can be chosen independently of y, \(\mu \), \(\eta \le \tilde{\eta }\), and \(0\le t\le \tilde{T}^\mu _\eta \). Overall, using \(\lambda _y\ge \rho _y\), we get

$$\begin{aligned} \xi ^\mu _t(y)\le \underbrace{((\max _{y\in \mathbb {H}^n}C_y)\vee C)}_{=:C_0<\infty }\Bigg [\mathrm {e}^{t(f_{y,\mathbf x }+\eta \hat{C})}\Big (\mu ^{\rho _y}+\frac{1}{\eta }\mu \Big )+\mu \Bigg ], \end{aligned}$$
(3.26)

which is the desired bound.
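The step from (3.24) to (3.25) is the variation-of-constants bound for a linear differential inequality. As a minimal numerical sanity check, the sketch below integrates the extremal case \(x'=ax+c\) and compares it with the closed form on the right-hand side of (3.25). The names `a`, `c`, and `x0` are placeholders standing for \(f_{y,\mathbf x }+\eta \hat{C}\), \(\mu \tilde{C}C\), and \(\xi ^\mu _0(y)\); the numerical values are hypothetical.

```python
import math

def closed_form(a, c, x0, t):
    """Solution of x' = a*x + c with x(0) = x0 (a != 0), cf. (3.25)."""
    return math.exp(a * t) * x0 + c * (math.exp(a * t) - 1.0) / a

def rk4(a, c, x0, t, steps=2000):
    """Classical Runge-Kutta integration of x' = a*x + c on [0, t]."""
    h = t / steps
    x = x0
    rhs = lambda v: a * v + c
    for _ in range(steps):
        k1 = rhs(x)
        k2 = rhs(x + 0.5 * h * k1)
        k3 = rhs(x + 0.5 * h * k2)
        k4 = rhs(x + h * k3)
        x += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
    return x

# Hypothetical values: a < 0 mimics an unfit type, a > 0 a fit one.
for a in (-0.7, 0.4):
    exact = closed_form(a, c=1e-3, x0=1e-2, t=5.0)
    numeric = rk4(a, c=1e-3, x0=1e-2, t=5.0)
    print(a, exact, numeric)
```

For \(a<0\) the second summand of the closed form stays below \(c/|a|\) for all times, which is the origin of the bound \(C\mu \) for unfit types.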

Assuming that the hypothesis holds for \(m-1\) and using (3.22), we approximate

$$\begin{aligned} \tfrac{d}{dt}\xi ^\mu _t(y)&\le \ [f_{y,\mathbf x }+\eta \hat{C}]\xi ^\mu _t(y)\nonumber \\&\quad +\mu \tilde{C}\sum _{z\sim y}C_{m-1}\left[ \sum _{\begin{array}{c} u\in \mathbb {H}^n\\ |u-z|\le m-1 \end{array}}\mathrm {e}^{t(f_{u,\mathbf x }+\eta \hat{C})}\Bigg (\mu ^{\rho _u+|u-z|}+\frac{1}{\eta }\mu ^m\Bigg )(1+t)^{m-1}+\mu ^m\right] . \end{aligned}$$
(3.27)

Splitting up the second summand, Gronwall’s inequality yields

$$\begin{aligned} \xi ^\mu _t(y)&\le \mathrm {e}^{t(f_{y,\mathbf x }+\eta \hat{C})}\xi ^\mu _0(y)+\tilde{C}C_{m-1}n\mu ^{m+1} \int _0^t\mathrm {e}^{(t-s)(f_{y,\mathbf x }+\eta \hat{C})}ds\nonumber \\&\quad +\tilde{C}C_{m-1} \sum _{z\sim y}\sum _{\begin{array}{c} u\in \mathbb {H}^n\\ |u-z|\le m-1 \end{array}} \Bigg (\mu ^{\rho _u+|u-z|+1}+\frac{1}{\eta }\mu ^{m+1}\Bigg )\nonumber \\&\quad \cdot \int _0^t(1+s)^{m-1}\mathrm {e}^{s(f_{u,\mathbf x }+\eta \hat{C})}\mathrm {e}^{(t-s)(f_{y,\mathbf x }+\eta \hat{C})}ds \nonumber \\&\le \mathrm {e}^{t(f_{y,\mathbf x }+\eta \hat{C})}C_y\mu ^{\lambda _y}+C \mu ^{m+1}\Bigg (1+\frac{1}{\eta }\mathrm {e}^{t(f_{y,\mathbf x }+\eta \hat{C})}\Bigg )\nonumber \\&\quad +\tilde{C}C_{m-1} \sum _{z\sim y}\sum _{\begin{array}{c} u\in \mathbb {H}^n\\ |u-z|\le m-1 \end{array}} \Bigg (\mu ^{\rho _u+|u-z|+1}+\frac{1}{\eta }\mu ^{m+1}\Bigg )(1+t)^{m-1}\nonumber \\&\quad \cdot \int _0^t\mathrm {e}^{t(f_{y,\mathbf x }+\eta \hat{C})}\mathrm {e}^{s(f_{u,\mathbf x }-f_{y,\mathbf x })}ds , \end{aligned}$$
(3.28)

where we bound the first integral just as before in the base case.

We distinguish two cases to approximate the second integral. If \(f_{u,\mathbf x }\ne f_{y,\mathbf x }\), then

$$\begin{aligned} \int _0^t \mathrm {e}^{t(f_{y,\mathbf x }+\eta \hat{C})}\mathrm {e}^{s(f_{u,\mathbf x }-f_{y,\mathbf x })}ds&=\frac{1}{f_{u,\mathbf x }-f_{y,\mathbf x }}\bigg (\mathrm {e}^{t(f_{u,\mathbf x }+\eta \hat{C})}-\mathrm {e}^{t(f_{y,\mathbf x }+\eta \hat{C})}\bigg )\nonumber \\&=\frac{1}{|f_{u,\mathbf x }-f_{y,\mathbf x }|}\left| \mathrm {e}^{t(f_{u,\mathbf x }+\eta \hat{C})}-\mathrm {e}^{t(f_{y,\mathbf x }+\eta \hat{C})}\right| \nonumber \\&\le C'\bigg (\mathrm {e}^{t(f_{u,\mathbf x }+\eta \hat{C})}+\mathrm {e}^{t(f_{y,\mathbf x }+\eta \hat{C})}\bigg ), \end{aligned}$$
(3.29)

for some \(C'<\infty \) large enough, uniformly in y and u.

If \(f_{u,\mathbf x }=f_{y,\mathbf x }\), then

$$\begin{aligned} \int _0^t \mathrm {e}^{t(f_{y,\mathbf x }+\eta \hat{C})}\mathrm {e}^{s(f_{u,\mathbf x }-f_{y,\mathbf x })}ds=t\mathrm {e}^{t(f_{y,\mathbf x }+\eta \hat{C})}. \end{aligned}$$
(3.30)
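The two integral identities (3.29) and (3.30) can be checked numerically. The sketch below is only an illustration: `fy` and `fu` stand for \(f_{y,\mathbf x }\) and \(f_{u,\mathbf x }\), the parameter `k` stands for \(\eta \) times the constant appearing in the exponent, and all numerical values are hypothetical.

```python
import math

def lhs(fy, fu, k, t, steps=20000):
    """Composite Simpson approximation of the integral over s in [0, t]
    of exp(t*(fy + k)) * exp(s*(fu - fy)); steps must be even."""
    h = t / steps
    g = lambda s: math.exp(t * (fy + k)) * math.exp(s * (fu - fy))
    total = g(0.0) + g(t)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * g(i * h)
    return total * h / 3.0

def rhs(fy, fu, k, t):
    """Closed forms: (3.29) for fu != fy and (3.30) for fu == fy."""
    if fu == fy:
        return t * math.exp(t * (fy + k))
    return (math.exp(t * (fu + k)) - math.exp(t * (fy + k))) / (fu - fy)

# Hypothetical fitness values, one case per identity.
for fy, fu in ((-0.3, 0.5), (0.2, 0.2)):
    print(fy, fu, lhs(fy, fu, 0.05, 3.0), rhs(fy, fu, 0.05, 3.0))
```

In the case \(f_{u,\mathbf x }\ne f_{y,\mathbf x }\), the last line of (3.29) follows with any \(C'\ge 1/|f_{u,\mathbf x }-f_{y,\mathbf x }|\), and a uniform choice is possible because the type space is finite.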

Plugging this back into (3.28) we get

$$\begin{aligned} \xi ^\mu _t(y)&\le \mathrm {e}^{t(f_{y,\mathbf x }+\eta \hat{C})}C_y\mu ^{\lambda _y}+C \mu ^{m+1}\Bigg (1+\frac{1}{\eta }\mathrm {e}^{t(f_{y,\mathbf x }+\eta \hat{C})}\Bigg )\nonumber \\&\quad +\tilde{C}C_{m-1} \sum _{z\sim y}\sum _{\begin{array}{c} u\in \mathbb {H}^n\\ |u-z|\le m-1 \end{array}} \Bigg (\mu ^{\rho _u+|u-z|+1}+\frac{1}{\eta }\mu ^{m+1}\Bigg )(1+t)^{m-1}\nonumber \\&\quad \cdot C'(1+t)(\mathrm {e}^{t(f_{u,\mathbf x }+\eta \hat{C})}+\mathrm {e}^{t(f_{y,\mathbf x }+\eta \hat{C})})\nonumber \\&\le \underbrace{(n+o(1))((\max _{y\in \mathbb {H}^n}C_y)\vee C\vee \tilde{C}C_{m-1}C')}_{\le C_m\text { for }\mu <\tilde{\eta }}\nonumber \\&\quad \cdot \left[ \sum _{\begin{array}{c} z\in \mathbb {H}^n\\ |z-y|\le m \end{array}}\mathrm {e}^{t(f_{z,\mathbf x }+\eta \hat{C})}\Bigg (\mu ^{\rho _z+|z-y|}+\frac{1}{\eta }\mu ^{m+1}\Bigg )(1+t)^m+\mu ^{m+1}\right] , \end{aligned}$$
(3.31)

where we used that \(\rho _y\le (\rho _u+|u-z|+1)\wedge \lambda _y\) for all \(z\sim y\) and \(|u-z|\le m-1\), and gathered all the higher \(\mu \)-powers in the o(1) with respect to the limit \(\mu \rightarrow 0\). This concludes the proof of (3.23).

Finally, we can choose \(m\ge \max _{y\in \mathbb {H}^n}\max _{z\in \mathbb {H}^n}(\rho _z+|z-y|)\ge n\) and, since \(f_{z,\mathbf x }=0\) for all \(z\in \mathbf x \), we get

$$\begin{aligned} \xi ^\mu _t(y)&\le C_m\left[ \sum _{z\in \mathbb {H}^n}\mathrm {e}^{t(f_{z,\mathbf x }+\eta \hat{C})}\Bigg (\mu ^{\rho _z+|z-y|}+\frac{1}{\eta }\mu ^{m+1}\Bigg )(1+t)^m+\mu ^{m+1}\right] \nonumber \\&\le C_m\left[ \sum _{z\in \mathbb {H}^n}\mathrm {e}^{t(f_{z,\mathbf x }+\eta \hat{C})}(\mu ^{\rho _z+|z-y|}+\mu ^m)(1+t)^m+\sum _{z\in \mathbf x }\mathrm {e}^{t(f_{z,\mathbf x }+\eta \hat{C})}\mu ^{m+1}\right] \nonumber \\&\le 3C_m\sum _{z\in \mathbb {H}^n}\mathrm {e}^{t(f_{z,\mathbf x }+\eta \hat{C})}\mu ^{\rho _z+|z-y|}(1+t)^m. \end{aligned}$$
(3.32)

With \(\hat{c}:=3C_m\) and choosing \(\tilde{\eta }\) uniform over all subsets \(\mathbf x \subset \mathbb {H}^n\) of coexisting resident types, this yields the desired upper bound.

The proof of the lower bound is very similar. We approximate, for every \(y\in \mathbb {H}^n\),

$$\begin{aligned} \tfrac{d}{dt}\xi ^\mu _t(y)\ge [f_{y,\mathbf x }-\eta \check{C}]\xi ^\mu _t(y)+\mu \tilde{c}\sum _{z\sim y}\xi ^\mu _t(z), \end{aligned}$$
(3.33)

and then apply Gronwall’s inequality inductively, in two stages.

First, we prove that, for an arbitrarily small \(t_0>0\), \(\xi ^\mu _{t_0/2}(y)\ge c_{t_0}\mu ^{\rho _y}\), where \(c_{t_0}>0\) can be chosen uniformly in \(\mu \), \(\eta \), and y. This corresponds to mutation producing a positive population size for every type within a time of order 1.

Second, we show that, for every \(0\le m\le n\), there exists a constant \(c_m>0\), independent of \(\mu \), \(\eta \), and y, such that, for \((n+m){t_0}/(2n)\le t\le \tilde{T}^\mu _\eta \),

$$\begin{aligned} \xi ^\mu _t(y)\ge c_m \sum _{\begin{array}{c} z\in \mathbb {H}^n\\ |z-y|\le m \end{array}}\mu ^{\rho _z+|z-y|}\mathrm {e}^{t(f_{z,\mathbf x }-\eta \check{C})}. \end{aligned}$$
(3.34)

Setting \(\check{c}:=c_n\) yields the lower bound in (3.2), for \({t_0}\le t\le \tilde{T}^\mu _\eta \). \(\square \)

Proof of Corollary 1

The inequalities in (3.4) follow directly from (3.2) by inserting the new time scale. For the lower bound, only the asymptotically largest summand, corresponding to the smallest \(\mu \)-power, is kept. For the upper bound, every one of the \(2^n\) summands is estimated against this largest one.
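Both observations in this paragraph can be made concrete in a small numerical sketch. Inserting \(t=T\ln 1/\mu \) turns \(\mathrm {e}^{tf}\mu ^{a}\) into \(\mu ^{a-Tf}\), and a sum of \(2^n\) such powers lies between its largest summand and \(2^n\) times that summand. The toy data below (`rho`, `f`, `y`, `mu`, `T`) are hypothetical, and the \(\eta \)-corrections and the polynomial factor \((1+t)^m\) are left out for simplicity.

```python
import math
from itertools import product

def hamming(u, v):
    """Hamming distance |u - v| on the hypercube {0,1}^n."""
    return sum(a != b for a, b in zip(u, v))

n, mu, T = 3, 1e-4, 0.8
types = list(product((0, 1), repeat=n))

# Hypothetical toy data (not from the paper): rho_z is the distance to the
# resident type 000, and f_z is an arbitrary invasion fitness with f = 0 there.
rho = {z: sum(z) for z in types}
f = {z: 0.5 * sum(z) - 0.4 * z[0] for z in types}
y = (1, 1, 0)

# Summands on the original time scale t = T * ln(1/mu) ...
t = T * math.log(1.0 / mu)
summands = [math.exp(t * f[z]) * mu ** (rho[z] + hamming(z, y)) for z in types]

# ... and the same quantities written as powers of mu.
exponents = [rho[z] + hamming(z, y) - T * f[z] for z in types]
powers = [mu ** e for e in exponents]

largest = mu ** min(exponents)   # summand with the smallest mu-power
total = sum(summands)
print(largest <= total <= 2 ** n * largest)
```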

To prove the second part of the corollary, we first show that, for \(\mu \) small enough, the first non-resident type y that reaches the \(\eta \)-threshold, i.e. the type that determines the stopping time \(T^\mu _\eta \), satisfies \(\lambda _y>0\) and hence \(\rho _z+|z-y|>0\), for every \(z\in \mathbb {H}^n\).

Let \(y\in \mathbb {H}^n\backslash \mathbf x \) be a non-resident type for which \(\lambda _y=0\). This implies \(\xi ^\mu _0(y)\le \eta /3\) and \(f_{y,\mathbf x }<0\). Going back into the proof of (3.23) and using that \(\tilde{\eta }\) is chosen such that \(f_{y,\mathbf x }+\tilde{\eta }\hat{C}<0\), this yields

$$\begin{aligned} \xi ^\mu _t(y)&\le \mathrm {e}^{t(f_{y,\mathbf x }+\eta \hat{C})}C_y\mu ^{\lambda _y}+\mu \tilde{C}C\frac{1}{f_{y,\mathbf x }+\eta \hat{C}}(\mathrm {e}^{t(f_{y,\mathbf x }+\eta \hat{C})}-1)\nonumber \\&\le \mathrm {e}^{t(f_{y,\mathbf x }+\eta \hat{C})}\frac{\eta }{3}+\mu \tilde{C}C\frac{1}{|f_{y,\mathbf x }+\eta \hat{C}|}(1-\mathrm {e}^{t(f_{y,\mathbf x }+\eta \hat{C})})\nonumber \\&\le \frac{\eta }{3}+\frac{\mu \tilde{C}C}{|f_{y,\mathbf x }+\tilde{\eta }\hat{C}|}\le \frac{2}{3}\eta , \end{aligned}$$
(3.35)

whenever \(\mu \le \eta |f_{y,\mathbf x }+\tilde{\eta }\hat{C}|/(3\tilde{C}C)\). As a consequence, for \(\mu \) small enough, y stays strictly below \(\eta \) and does not determine \(T^\mu _\eta \).

Now we assume that \(T^\mu _\eta \) is determined by a non-resident type \(y\in \mathbb {H}^n\) for which \(\lambda _y>0\), i.e. y is the first mutant to reach the \(\eta \)-threshold. Let \(\bar{\eta }\le \tilde{\eta }\wedge 1\wedge \check{c}\). Then, assuming that \(0<\mu \le \eta \le \bar{\eta }\), the lower bound in (3.4) yields

$$\begin{aligned} \check{c}\mu ^{\min _{z\in \mathbb {H}^n}[\rho _z+|z-y|-T^\mu _\eta (f_{z,\mathbf x }-\eta \check{C})]}\le \xi ^\mu _{\tilde{T}^\mu _\eta }(y)=\eta , \end{aligned}$$
(3.36)

and hence

$$\begin{aligned} \ln (\mu )\min _{z\in \mathbb {H}^n}[\rho _z+|z-y|-T^\mu _\eta (f_{z,\mathbf x }-\eta \check{C})]\le \ln \left( \frac{\eta }{\check{c}}\right) \le 0. \end{aligned}$$
(3.37)

Since \(\ln (\mu )<0\), we obtain, for every \(z\in \mathbb {H}^n\), that

$$\begin{aligned} \rho _z+|z-y|\ge T^\mu _\eta (f_{z,\mathbf x }-\eta \check{C}), \end{aligned}$$
(3.38)

and therefore, if we choose \(\bar{\eta }\) small enough such that, for every \(\eta \le \bar{\eta }\) and every \(z\in \mathbb {H}^n\) for which \(f_{z,\mathbf x }>0\), also \(f_{z,\mathbf x }-\eta \check{C}>0\),

$$\begin{aligned} T^\mu _\eta \le \min _{\begin{array}{c} z\in \mathbb {H}^n\\ f_{z,\mathbf x }>0 \end{array}}\frac{\rho _z+|z-y|}{f_{z,\mathbf x }-\eta \check{C}}. \end{aligned}$$
(3.39)

To get a lower bound for \(T^\mu _\eta \), (3.4) implies

$$\begin{aligned} \eta =\xi ^\mu _{\tilde{T}^\mu _\eta }(y)\le 2^n\hat{c}\mu ^{\min _{z\in \mathbb {H}^n}[\rho _z+|z-y|-T^\mu _\eta (f_{z,\mathbf x }+\eta \hat{C})]}\left( 1+\tilde{T}^\mu _\eta \right) ^m, \end{aligned}$$
(3.40)

which yields

$$\begin{aligned} \ln (\mu )\min _{z\in \mathbb {H}^n}[\rho _z+|z-y|-T^\mu _\eta (f_{z,\mathbf x }+\eta \hat{C})]\ge \ln \left( \frac{\eta }{2^n\hat{c}(1+\tilde{T}^\mu _\eta )^m}\right) , \end{aligned}$$
(3.41)

and therefore there exists a \(z\in \mathbb {H}^n\) such that

$$\begin{aligned} \rho _z+|z-y|\le T^\mu _\eta (f_{z,\mathbf x }+\eta \hat{C})+\frac{\ln \left( \frac{2^n\hat{c}}{\eta }\right) +m\ln (1+\tilde{T}^\mu _\eta )}{\ln \frac{1}{\mu }}. \end{aligned}$$
(3.42)

The second summand on the right-hand side is positive and, with (3.39), converges to zero as \(\mu \rightarrow 0\). Since the left-hand side is positive, this implies that \(f_{z,\mathbf x }+\eta \hat{C}>0\) and, by our choice of \(\tilde{\eta }\) in the proof of (3.23), we obtain \(f_{z,\mathbf x }\ge 0\).

Consequently, for every fixed \(0<\eta \le \bar{\eta }\), it follows from (3.42) that

$$\begin{aligned} \liminf _{\mu \rightarrow 0}T^\mu _\eta \ge \frac{\rho _z+|z-y|}{f_{z,\mathbf x }+\eta \hat{C}}\ge \min _{\begin{array}{c} z\in \mathbb {H}^n\\ f_{z,\mathbf x }\ge 0 \end{array}}\frac{\rho _z+|z-y|}{f_{z,\mathbf x }+\eta \hat{C}}. \end{aligned}$$
(3.43)

Overall, for every fixed \(0<\eta \le \bar{\eta }\), we obtain

$$\begin{aligned} \min _{\begin{array}{c} z\in \mathbb {H}^n\\ f_{z,\mathbf x }\ge 0 \end{array}}\frac{\rho _z+|z-y|}{f_{z,\mathbf x }+\eta \hat{C}}\le \liminf _{\mu \rightarrow 0}T^\mu _\eta \le \limsup _{\mu \rightarrow 0}T^\mu _\eta \le \min _{\begin{array}{c} z\in \mathbb {H}^n\\ f_{z,\mathbf x }>0 \end{array}}\frac{\rho _z+|z-y|}{f_{z,\mathbf x }-\eta \check{C}}. \end{aligned}$$
(3.44)

If we now pick \(\bar{\eta }\) small enough, both minima are realised by the same \(z\in \mathbb {H}^n\) for which \(f_{z,\mathbf x }>0\), namely those that also realise

$$\begin{aligned} \min _{\begin{array}{c} z\in \mathbb {H}^n\\ f_{z,\mathbf x }>0 \end{array}}\frac{\rho _z+|z-y|}{f_{z,\mathbf x }}, \end{aligned}$$
(3.45)

and we can reduce to only considering \(z\in \mathbb {H}^n\) such that \(f_{z,\mathbf x }>0\) in the lower bound.

All the above considerations apply to a single y for which \(\lambda _y>0\). Considering all such \(y\in \mathbb {H}^n\) we get that asymptotically

$$\begin{aligned} \min _{\begin{array}{c} y\in \mathbb {H}^n\\ \lambda _y>0 \end{array}}\min _{\begin{array}{c} z\in \mathbb {H}^n\\ f_{z,\mathbf x }>0 \end{array}}\frac{\rho _z+|z-y|}{f_{z,\mathbf x }+\eta \hat{C}}\le \liminf _{\mu \rightarrow 0}T^\mu _\eta \le \limsup _{\mu \rightarrow 0} T^\mu _\eta \le \min _{\begin{array}{c} y\in \mathbb {H}^n\\ \lambda _y>0 \end{array}}\min _{\begin{array}{c} z\in \mathbb {H}^n\\ f_{z,\mathbf x }>0 \end{array}}\frac{\rho _z+|z-y|}{f_{z,\mathbf x }-\eta \check{C}}. \end{aligned}$$
(3.46)

For the upper bound, the minimum can be used since, if \(T^\mu _\eta \) were larger than this minimum, the minimiser would reach the \(\eta \)-level before \(\tilde{T}^\mu _\eta \), which would be a contradiction.

This finishes the proof of the corollary. \(\square \)
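The limiting quantity in (3.45) and (3.46) is a finite minimisation over the hypercube and can be evaluated directly. The following sketch is an illustration under stated assumptions: it takes \(\rho _z=\min _w(\lambda _w+|w-z|)\), which is compatible with the inequalities used above but is not restated in this section, and it uses a hypothetical fitness landscape `f` on \(\mathbb {H}^3\) with resident type 000. The parameter `shift` plays the role of \(\eta \) times a constant in the denominators of (3.46).

```python
from itertools import product

def hamming(u, v):
    """Hamming distance |u - v| on the hypercube."""
    return sum(a != b for a, b in zip(u, v))

def rho_from_lambda(types, lam):
    """Assumed definition: rho_z = min_w (lambda_w + |w - z|)."""
    return {z: min(lam[w] + hamming(w, z) for w in types) for z in types}

def first_invasion(types, lam, f, shift=0.0):
    """Return (y, z, value) minimising (rho_z + |z - y|) / (f_z + shift)
    over y with lambda_y > 0 and z with f_z > 0, cf. (3.45) and (3.46)."""
    rho = rho_from_lambda(types, lam)
    best = None
    for y in types:
        if lam[y] <= 0:
            continue
        for z in types:
            if f[z] > 0 and f[z] + shift > 0:
                value = (rho[z] + hamming(z, y)) / (f[z] + shift)
                if best is None or value < best[2]:
                    best = (y, z, value)
    return best

n = 3
types = list(product((0, 1), repeat=n))
# Hypothetical data: lambda_y is the distance to the resident 000; the single
# mutants 010 and 100 are unfit, while the double mutant 110 is very fit.
lam = {z: sum(z) for z in types}
f = dict(zip(types, (0.0, 0.3, -0.1, 0.2, -0.2, -0.3, 1.0, 0.9)))

print(first_invasion(types, lam, f))
print(first_invasion(types, lam, f, shift=0.05)[2],
      first_invasion(types, lam, f, shift=-0.05)[2])
```

In this toy landscape the minimum is attained by the double mutant 110, even though both single mutants on the way to it are unfit, and the two shifted values bracket the unshifted one as in (3.46).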

4 Construction of the jump process

In this section we combine the results of Theorem 6, or rather Corollary 1, and Theorem 1 to derive the convergence of \(\xi ^\mu \) as \(\mu \rightarrow 0\) to a jump process that moves between Lotka–Volterra equilibria of coexistence. We prove the convergence by an induction over the invasion steps and show that after each invasion the criteria for the initial conditions in Theorem 6 are again satisfied.

Before we get to the actual proof, we derive two lemmas. The first lemma treats the boundedness of solutions of (2.2), the continuity in the initial condition, and the perturbation through the mutation rate \(\mu \).

Lemma 1

Let

$$\begin{aligned} \varOmega :=\left\{ \xi \in {\mathcal M}(\mathbb {H}^n):\forall x\in \mathbb {H}^n: \xi (x)\in \left[ 0,2\frac{|r(x)|}{\alpha (x,x)}\right] \right\} . \end{aligned}$$
(4.1)

There is a \(\mu _0>0\) such that, for every \(0\le \mu <\mu _0\), for every \(\xi ^\mu _0\in \varOmega \), and for every \(t\ge 0\), we obtain \(\xi ^\mu _t\in \varOmega \), where \(\xi ^\mu _t\) is the solution of (2.2).

Moreover, there are positive, finite constants A, B such that, for every \(0\le \mu _1,\mu _2<\mu _0\), for every \(\xi ^{\mu _1}_0,\xi ^{\mu _2}_0\in \varOmega \), and every \(t\ge s\ge 0\),

$$\begin{aligned} \left\| \xi ^{\mu _1}_t-\xi ^{\mu _2}_t\right\| \le \mathrm {e}^{(t-s)A}\left( \left\| \xi ^{\mu _1}_s-\xi ^{\mu _2}_s\right\| +\sqrt{(\mu _1+\mu _2)\frac{B}{A}}\right) . \end{aligned}$$
(4.2)

Proof

To prove the first claim, assume that \(\xi ^\mu _t\in \varOmega \) and \(\xi ^\mu _t(x)=2|r(x)|/\alpha (x,x)\), for some \(x\in \mathbb {H}^n\). Then

$$\begin{aligned} \tfrac{d}{dt}\xi ^\mu _t(x)&\le [r(x)-\alpha (x,x)\xi ^\mu _t(x)]\xi ^\mu _t(x)+\mu \sum _{y\sim x}b(y)m(y,x)\xi ^\mu _t(y)\nonumber \\&\le \frac{-2|r(x)|^2}{\alpha (x,x)}+\mu 2n\max _{y\in \mathbb {H}^n}\frac{b(y)|r(y)|}{\alpha (y,y)}<0, \end{aligned}$$
(4.3)

for

$$\begin{aligned} \mu <\mu _0:=\min _{y\in \mathbb {H}^n}\frac{2|r(y)|^2}{\alpha (y,y)}\left( 2n\max _{y\in \mathbb {H}^n}\frac{b(y)|r(y)|}{\alpha (y,y)}\right) ^{-1}. \end{aligned}$$
(4.4)

Hence, \(\xi ^\mu _t\) cannot leave \(\varOmega \).

For the second claim, we approximate

$$\begin{aligned}&\frac{d}{dt}\frac{\left\| \xi ^{\mu _1}_t-\xi ^{\mu _2}_t\right\| ^2}{2}\nonumber \\&\quad = \sum _{x\in \mathbb {H}^n}(\xi ^{\mu _1}_t(x)-\xi ^{\mu _2}_t(x))r(x)(\xi ^{\mu _1}_t(x)-\xi ^{\mu _2}_t(x))\nonumber \\&\quad -\sum _{x\in \mathbb {H}^n}(\xi ^{\mu _1}_t(x)-\xi ^{\mu _2}_t(x))\sum _{y\in \mathbb {H}^n}\alpha (x,y)(\xi ^{\mu _1}_t(x)\xi ^{\mu _1}_t(y)-\xi ^{\mu _2}_t(x)\xi ^{\mu _2}_t(y))\nonumber \\&\quad +\sum _{x\in \mathbb {H}^n} (\xi ^{\mu _1}_t(x)-\xi ^{\mu _2}_t(x))\mu _1\left( \sum _{y\sim x}b(y)m(y,x)\xi ^{\mu _1}_t(y)-b(x)\xi ^{\mu _1}_t(x)\right) \nonumber \\&\quad -\sum _{x\in \mathbb {H}^n} (\xi ^{\mu _1}_t(x)-\xi ^{\mu _2}_t(x))\mu _2\left( \sum _{y\sim x}b(y)m(y,x)\xi ^{\mu _2}_t(y)-b(x)\xi ^{\mu _2}_t(x)\right) \nonumber \\&\quad \le \max _{x\in \mathbb {H}^n}|r(x)|\sum _{x\in \mathbb {H}^n}(\xi ^{\mu _1}_t(x)-\xi ^{\mu _2}_t(x))^2\nonumber \\&\quad -\sum _{x\in \mathbb {H}^n}\sum _{y\in \mathbb {H}^n}\alpha (x,y)(\xi ^{\mu _1}_t(x)-\xi ^{\mu _2}_t(x))^2\xi ^{\mu _1}_t(y)\nonumber \\&\quad +\sum _{x\in \mathbb {H}^n}\sum _{y\in \mathbb {H}^n}\alpha (x,y)|\xi ^{\mu _1}_t(x)-\xi ^{\mu _2}_t(x)|\cdot |\xi ^{\mu _1}_t(y)-\xi ^{\mu _2}_t(y)|\cdot |\xi ^{\mu _2}_t(x)|\nonumber \\&\quad +\mu _1\max _{x\in \mathbb {H}^n}b(x)\sum _{x\in \mathbb {H}^n} \max _{x\in \mathbb {H}^n}(|\xi ^{\mu _1}_t(x)|+|\xi ^{\mu _2}_t(x)|)\max _{x\in \mathbb {H}^n}|\xi ^{\mu _1}_t(x)|\left( \sum _{y\sim x}m(y,x)+1\right) \nonumber \\&\quad +\mu _2\max _{x\in \mathbb {H}^n}b(x)\sum _{x\in \mathbb {H}^n} \max _{x\in \mathbb {H}^n}(|\xi ^{\mu _1}_t(x)|+|\xi ^{\mu _2}_t(x)|)\max _{x\in \mathbb {H}^n}|\xi ^{\mu _2}_t(x)|\left( \sum _{y\sim x}m(y,x)+1\right) , \end{aligned}$$
(4.5)

which implies

$$\begin{aligned} \frac{d}{dt}\frac{\left\| \xi ^{\mu _1}_t-\xi ^{\mu _2}_t\right\| ^2}{2}&\le \left\| \xi ^{\mu _1}_t-\xi ^{\mu _2}_t\right\| ^2\Big [\max _{x\in \mathbb {H}^n}|r(x)|+2^{2n}\max _{x,y\in \mathbb {H}^n}\alpha (x,y)\left\| \xi ^{\mu _2}_t\right\| \Big ]\nonumber \\&\quad +(\mu _1+\mu _2)(2^n\cdot 2)\max _{x\in \mathbb {H}^n}b(x)\left( \left\| \xi ^{\mu _1}_t\right\| +\left\| \xi ^{\mu _2}_t\right\| \right) ^2\nonumber \\&=: \left\| \xi ^{\mu _1}_t-\xi ^{\mu _2}_t\right\| ^2A+(\mu _1+\mu _2)B, \end{aligned}$$
(4.6)

where A and B depend on \(b,r,\alpha \), and can be chosen uniformly in \(t\ge 0\), \(0\le \mu _i<\mu _0\), and initial values \(\xi ^{\mu _i}_0\in \varOmega \) since \(\left\| \xi ^{\mu _i}_t\right\| \le \max _{\xi \in \varOmega }\left\| \xi \right\| <\infty \). Applying Gronwall’s inequality and taking the square root implies the claim.
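The invariance of \(\varOmega \) can be observed in a small simulation. Equation (2.2) is not restated in this section, so the right-hand side below is read off from the terms in (4.5) and should be treated as an assumption; all parameter values (`r`, `b`, `alpha`, the uniform mutation kernel `m`) are hypothetical, and the integrator is a plain Runge-Kutta scheme rather than anything used in the paper.

```python
from itertools import product

n = 2
types = list(product((0, 1), repeat=n))
neighbours = {x: [y for y in types if sum(a != b for a, b in zip(x, y)) == 1]
              for x in types}

# Hypothetical parameters (not taken from the paper).
r = dict(zip(types, (1.0, 1.2, 0.8, 1.5)))           # net growth rates r(x)
b = {x: 2.0 for x in types}                           # birth rates b(x)
alpha = {(x, y): 1.0 for x in types for y in types}   # competition kernel
m = 1.0 / n                                           # m(y, x) for neighbours y ~ x

def rhs(xi, mu):
    """Right-hand side as read off from (4.5): Lotka-Volterra part plus mutation."""
    out = {}
    for x in types:
        comp = sum(alpha[x, y] * xi[y] for y in types)
        mut = sum(b[y] * m * xi[y] for y in neighbours[x]) - b[x] * xi[x]
        out[x] = (r[x] - comp) * xi[x] + mu * mut
    return out

def rk4_step(xi, mu, h):
    """One classical Runge-Kutta step of size h."""
    add = lambda u, v, s: {x: u[x] + s * v[x] for x in types}
    k1 = rhs(xi, mu)
    k2 = rhs(add(xi, k1, h / 2), mu)
    k3 = rhs(add(xi, k2, h / 2), mu)
    k4 = rhs(add(xi, k3, h), mu)
    return {x: xi[x] + h * (k1[x] + 2 * k2[x] + 2 * k3[x] + k4[x]) / 6 for x in types}

def mu0():
    """Threshold from (4.4)."""
    num = min(2 * abs(r[y]) ** 2 / alpha[y, y] for y in types)
    den = 2 * n * max(b[y] * abs(r[y]) / alpha[y, y] for y in types)
    return num / den

def stays_in_omega(xi0, mu, t_end=60.0, h=0.01):
    """Check 0 <= xi_t(x) <= 2|r(x)|/alpha(x,x) along the discretised path."""
    xi = dict(xi0)
    for _ in range(int(t_end / h)):
        xi = rk4_step(xi, mu, h)
        if any(xi[x] < 0 or xi[x] > 2 * abs(r[x]) / alpha[x, x] for x in types):
            return False
    return True

start = {x: 0.0 for x in types}
start[(0, 0)] = 1.0   # resident 00 at its monomorphic equilibrium r/alpha
print(mu0(), stays_in_omega(start, mu=1e-3))
```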

Theorem 6 and Corollary 1 provide us with approximations for \(\xi ^\mu _t\) during the exponential growth phase and Theorem 1 guarantees convergence to a new equilibrium during the invasion phase. To show that this second phase vanishes on the time scale \(\ln 1/\mu \), we need to bound its duration uniformly in the approximate state of the system at its beginning.

We introduce the following notation for the time until the initial conditions for the next growth phase are reached.

Definition 8

$$\begin{aligned} \tilde{\tau }^\mu _\eta (\xi , \mathbf x ):=\inf \Bigg \{t\ge 0:&\forall \ x\in \mathbf x : |\xi ^\mu _t(x)-\bar{\xi }_\mathbf x (x)|\le \eta \frac{\bar{c}}{\sqrt{|\mathbf x |}},\nonumber \\&\forall \ y\in \mathbb {H}^n\backslash \mathbf x : \xi ^\mu _t(y)\le \frac{\eta }{3}; \xi ^\mu _0=\xi \Bigg \}, \end{aligned}$$
(4.7)

In the proof of Theorem 2, we approximate the true system, solving (2.2), by the mutation-free Lotka–Volterra system during the invasion. The second lemma proves continuity in the initial condition for a slight variation of \(\tilde{\tau }^\mu _\eta (\xi ,\mathbf x )\), corresponding to the case of \(\mu =0\).

Lemma 2

Let \(\mathbf y \subset \mathbb {H}^n\) such that \(r(y)>0\), for all \(y\in \mathbf y \), and (B\(_\mathbf y \)) is satisfied. Let \(\mathbf x \subset \mathbf y \) such that the equilibrium state of the Lotka–Volterra system involving types \(\mathbf y \) is supported on \(\mathbf x \) and assume \(f_{y,\mathbf x }<0\), for every \(y\in \mathbf y \backslash \mathbf x \). Define

$$\begin{aligned} \bar{\tau }^0_\eta (\xi ,\mathbf x ,\mathbf y ):=\inf \{t\ge 0:&\left\| \left. \xi ^0_t\right| _\mathbf x -\bar{\xi }_\mathbf x \right\| _\mathbf x \le \frac{\eta \bar{c}c_\mathbf x }{2\sqrt{|\mathbf x |}},\nonumber \\&\forall \ y\in \mathbf y \backslash \mathbf x : \xi ^0_t(y)\le \frac{\eta }{6}\wedge \hat{\eta };\xi ^0_0=\xi \}, \end{aligned}$$
(4.8)

where \(\left\| \cdot \right\| _\mathbf x \) is the norm defined in (2.12), corresponding to \(\bar{\xi }_\mathbf x \), and \(\hat{\eta }:=\eta \bar{c}c_\mathbf x /(2\sqrt{|\mathbf x |}\underline{c})\). Then, for \(\eta \) small enough, \(\bar{\tau }^0_\eta (\xi ,\mathbf x ,\mathbf y )\) is continuous in \(\xi \in ({\mathbb R}_{>0})^\mathbf y \times \{0\}^{\mathbb {H}^n\backslash \mathbf y }\).

Remark 9

Theorem 1 ensures that the Lotka–Volterra system involving the types \(\mathbf y \) converges to a unique equilibrium and hence \(\mathbf x \) in Lemma 2 is uniquely determined.

Proof

Since we are considering the case of \(\mu =0\), we obtain \(\xi ^0_t\in ({\mathbb R}_{>0})^\mathbf y \times \{0\}^{\mathbb {H}^n\backslash \mathbf y }\), for all \(t\ge 0\) and \(\xi ^0_0\in ({\mathbb R}_{>0})^\mathbf y \times \{0\}^{\mathbb {H}^n\backslash \mathbf y }\). As in Step 1 of the proof of Theorem 6, it follows that, as long as \(\xi ^0_t(y)\le \hat{\eta }\), for \(y\in \mathbf y \backslash \mathbf x \), and

$$\begin{aligned} \hat{\eta } \underline{c}\le \left\| \left. \xi ^0_t\right| _\mathbf x -\bar{\xi }_\mathbf{x }\right\| _\mathbf x \le \varepsilon _\mathbf{x }, \end{aligned}$$
(4.9)

we obtain

$$\begin{aligned} \frac{d}{dt}\frac{\left\| \left. \xi ^0_t\right| _\mathbf x -\bar{\xi }_\mathbf{x }\right\| _\mathbf x ^2}{2}\le -\left\| \left. \xi ^0_t\right| _\mathbf x -\bar{\xi }_\mathbf{x }\right\| _\mathbf x ^2\frac{\kappa }{4C_\mathbf x ^2}=:-\tilde{\kappa }\left\| \left. \xi ^0_t\right| _\mathbf x -\bar{\xi }_\mathbf{x }\right\| _\mathbf x ^2. \end{aligned}$$
(4.10)

Hence

$$\begin{aligned} \left\| \left. \xi ^0_t\right| _\mathbf x -\bar{\xi }_\mathbf{x }\right\| _\mathbf x \le \mathrm {e}^{-\tilde{\kappa }(t-t_0)}\left\| \left. \xi ^0_{t_0}\right| _\mathbf x -\bar{\xi }_\mathbf{x }\right\| _\mathbf x . \end{aligned}$$
(4.11)

Moreover, (4.9) implies, for every \(x\in \mathbf x \),

$$\begin{aligned} |\xi ^0_t(x)-\bar{\xi }_\mathbf{x }(x)|\le \frac{\varepsilon _\mathbf x }{c_\mathbf x }. \end{aligned}$$
(4.12)

Since \(f_{y,\mathbf x }<0\) for every \(y\in \mathbf y \backslash \mathbf x \), we can choose \(\varepsilon _\mathbf x \) small enough such that

$$\begin{aligned} \tfrac{d}{dt}\xi ^0_t(y)&=[r(y)-\sum _{z\in \mathbb {H}^n}\alpha (y,z)\xi ^0_t(z)]\xi ^0_t(y)\nonumber \\&\le \left[ f_{y,\mathbf x }+\sum _{x\in \mathbf x }\alpha (y,x)\frac{\varepsilon _\mathbf x }{c_\mathbf x }\right] \xi ^0_t(y)\le -C\xi ^0_t(y), \end{aligned}$$
(4.13)

for some \(C>0\). Hence,

$$\begin{aligned} \xi ^0_t(y)\le \mathrm {e}^{-C(t-t_0)}\xi ^0_{t_0}(y). \end{aligned}$$
(4.14)

We have now found an attractive domain around the limiting equilibrium of the Lotka–Volterra system.

Next, we can derive the continuity of \(\bar{\tau }^0_\eta (\xi ,\mathbf x ,\mathbf y )\). Let \(\gamma >0\) be arbitrarily small such that \(\mathrm {e}^{\tilde{\kappa }\gamma },\mathrm {e}^{C\gamma }\le 2\). Let \(\xi ^{0,1}\) and \(\xi ^{0,2}\) be two versions of the process with different initial values \(\xi ^{0,1}_0\) and \(\xi ^{0,2}_0\). By Lemma 1,

$$\begin{aligned}&\left\| \left. \xi ^{0,1}_t\right| _\mathbf x -\left. \xi ^{0,2}_t\right| _\mathbf x \right\| _\mathbf x \le C_\mathbf x \left\| \left. \xi ^{0,1}_t\right| _\mathbf x -\left. \xi ^{0,2}_t\right| _\mathbf x \right\| \le \mathrm {e}^{(t-t_0)A}C_\mathbf x \left\| \left. \xi ^{0,1}_{t_0}\right| _\mathbf x -\left. \xi ^{0,2}_{t_0}\right| _\mathbf x \right\| , \end{aligned}$$
(4.15)
$$\begin{aligned}&|\xi ^{0,1}_t(y)-\xi ^{0,2}_t(y)|\le \left\| \xi ^{0,1}_t-\xi ^{0,2}_t\right\| \le \mathrm {e}^{(t-t_0)A}\left\| \xi ^{0,1}_{t_0}-\xi ^{0,2}_{t_0}\right\| . \end{aligned}$$
(4.16)

Now, if we pick initial conditions that are very similar, namely that satisfy

$$\begin{aligned} \left\| \xi ^{0,1}_0-\xi ^{0,2}_0\right\| \le \mathrm {e}^{-(\bar{\tau }^0_{\eta }(\xi ^{0,1}_0,\mathbf x ,\mathbf y )+\gamma )A}\left[ (\mathrm {e}^{\tilde{\kappa }\gamma }-1)\frac{\eta \bar{c}c_\mathbf x }{2\sqrt{|\mathbf x |}C_\mathbf x }\wedge (\mathrm {e}^{C\gamma }-1)\left( \frac{\eta }{6}\wedge \hat{\eta }\right) \right] , \end{aligned}$$
(4.17)

we can apply (4.15) and (4.16) and use the definition of \(\bar{\tau }^0_{\eta }(\xi ^{0,1}_0,\mathbf x ,\mathbf y )\) to derive

$$\begin{aligned} \left\| \left. \xi ^{0,2}_{\bar{\tau }^0_{\eta }(\xi ^{0,1}_0,\mathbf x ,\mathbf y )}\right| _\mathbf x -\bar{\xi }_\mathbf{x }\right\| _\mathbf x&\le \left\| \left. \xi ^{0,2}_{\bar{\tau }^0_{\eta }(\xi ^{0,1}_0,\mathbf x ,\mathbf y )}\right| _\mathbf x -\left. \xi ^{0,1}_{\bar{\tau }^0_{\eta }(\xi ^{0,1}_0,\mathbf x ,\mathbf y )}\right| _\mathbf x \right\| _\mathbf x +\left\| \left. \xi ^{0,1}_{\bar{\tau }^0_{\eta }(\xi ^{0,1}_0,\mathbf x ,\mathbf y )}\right| _\mathbf x -\bar{\xi }_\mathbf{x }\right\| _\mathbf x \nonumber \\&\le \mathrm {e}^{\bar{\tau }^0_{\eta }(\xi ^{0,1}_0,\mathbf x ,\mathbf y )A}C_\mathbf x \left\| \left. \xi ^{0,2}_0\right| _\mathbf x -\left. \xi ^{0,1}_0\right| _\mathbf x \right\| +\frac{\eta \bar{c}c_\mathbf x }{2\sqrt{|\mathbf x |}} \le \mathrm {e}^{\tilde{\kappa }\gamma }\frac{\eta \bar{c}c_\mathbf x }{2\sqrt{|\mathbf x |}}, \end{aligned}$$
(4.18)

and for \(y\in \mathbf y \backslash \mathbf x \),

$$\begin{aligned} \xi ^{0,2}_{\bar{\tau }^0_{\eta }(\xi ^{0,1}_0,\mathbf x ,\mathbf y )}(y)&\le |\xi ^{0,2}_{\bar{\tau }^0_{\eta }(\xi ^{0,1}_0,\mathbf x ,\mathbf y )}(y)-\xi ^{0,1}_{\bar{\tau }^0_{\eta }(\xi ^{0,1}_0,\mathbf x ,\mathbf y )}(y)|+\xi ^{0,1}_{\bar{\tau }^0_{\eta }(\xi ^{0,1}_0,\mathbf x ,\mathbf y )}(y)\nonumber \\&\le \mathrm {e}^{\bar{\tau }^0_{\eta }(\xi ^{0,1}_0,\mathbf x ,\mathbf y )A}\left\| \xi ^{0,2}_0-\xi ^{0,1}_0\right\| +\left( \frac{\eta }{6}\wedge \hat{\eta }\right) \le \mathrm {e}^{C\gamma }\left( \frac{\eta }{6}\wedge \hat{\eta }\right) . \end{aligned}$$
(4.19)

For all \(\eta >0\) such that

$$\begin{aligned} \hat{\eta }\underline{c}=\frac{\eta \bar{c}c_\mathbf x }{2\sqrt{|\mathbf x |}}\le \frac{\varepsilon _\mathbf x }{2}, \end{aligned}$$
(4.20)

we obtain

$$\begin{aligned} \left\| \left. \xi ^{0,2}_{\bar{\tau }^0_{\eta }(\xi ^{0,1}_0,\mathbf x ,\mathbf y )}\right| _\mathbf x -\bar{\xi }_\mathbf{x }\right\| _\mathbf x \le \varepsilon _\mathbf x \end{aligned}$$
(4.21)

and hence (4.11) and (4.14) can be applied to \(\xi ^{0,2}\) with \(t=\bar{\tau }^0_{\eta }(\xi ^{0,1}_0,\mathbf x ,\mathbf y )+\gamma \) and \(t_0=\bar{\tau }^0_{\eta }(\xi ^{0,1}_0,\mathbf x ,\mathbf y )\) to obtain \(\bar{\tau }^0_{\eta }(\xi ^{0,2}_0,\mathbf x ,\mathbf y )\le \bar{\tau }^0_{\eta }(\xi ^{0,1}_0,\mathbf x ,\mathbf y )+\gamma \).

Repeating the same calculation switching 1 and 2 and using this bound for \(\bar{\tau }^0_{\eta }(\xi ^{0,2}_0,\mathbf x ,\mathbf y )\) to apply (4.17), it follows that

$$\begin{aligned}&\left\| \left. \xi ^{0,1}_{\bar{\tau }^0_{\eta }(\xi ^{0,2}_0,\mathbf x ,\mathbf y )}\right| _\mathbf x -\bar{\xi }_\mathbf{x }\right\| _\mathbf x \le \mathrm {e}^{\tilde{\kappa }\gamma }\frac{\eta \bar{c}c_\mathbf x }{2\sqrt{|\mathbf x |}}, \end{aligned}$$
(4.22)
$$\begin{aligned}&\xi ^{0,1}_{\bar{\tau }^0_{\eta }(\xi ^{0,2}_0,\mathbf x ,\mathbf y )}(y)\le \mathrm {e}^{C\gamma }\left( \frac{\eta }{6}\wedge \hat{\eta }\right) , \end{aligned}$$
(4.23)

and therefore \(\bar{\tau }^0_{\eta }(\xi ^{0,1}_0,\mathbf x ,\mathbf y )\le \bar{\tau }^0_{\eta }(\xi ^{0,2}_0,\mathbf x ,\mathbf y )+\gamma \).

Hence, \(|\bar{\tau }^0_{\eta }(\xi ^{0,1}_0,\mathbf x ,\mathbf y )-\bar{\tau }^0_{\eta }(\xi ^{0,2}_0,\mathbf x ,\mathbf y )|\le \gamma \), which proves the continuity.
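The mechanism behind this continuity statement can be illustrated in the simplest one-type situation, which is far more restrictive than the setting of Lemma 2. For the logistic equation \(x'=x(r-\alpha x)\) the hitting time of a \(\delta \)-neighbourhood of the equilibrium \(r/\alpha \) is available in closed form, and nearby initial values give nearby hitting times. The function name and all numerical values below are hypothetical.

```python
import math

def hitting_time(x0, delta, r=1.0, alpha=1.0):
    """First time the solution of x' = x * (r - alpha * x), x(0) = x0 > 0,
    enters the interval [xbar - delta, xbar + delta] around xbar = r / alpha.
    Requires 0 < delta < xbar."""
    xbar = r / alpha
    if abs(x0 - xbar) <= delta:
        return 0.0
    target = xbar - delta if x0 < xbar else xbar + delta
    # Along solutions, log|x / (xbar - x)| increases linearly with slope r.
    odds = lambda x: abs(x / (xbar - x))
    return math.log(odds(target) / odds(x0)) / r

# Hypothetical values: nearby initial conditions give nearby hitting times.
for x0 in (0.1, 0.1 + 1e-6, 0.1 + 1e-3):
    print(x0, hitting_time(x0, delta=0.05))
```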

To mark the transition between the exponential growth phase and the Lotka–Volterra invasion phase, we extend the definition of \(\tilde{T}^\mu _\eta \) (Definition 7) to the \(i^\text {th}\) invasion.

Definition 9

For \(i\ge 1\), the time when the first mutant type reaches \(\eta >0\) after the \((i-1)^\text {st}\) invasion is defined as

$$\begin{aligned} \tilde{T}^\mu _{\eta ,i}:=\inf \{s\ge \tilde{T}^\mu _{\eta ,i-1}:\exists \ y\in \mathbb {H}^n\backslash (\mathbf x ^{i-2}\cup \mathbf x ^{i-1}):\xi ^\mu _s(y)>\eta \}. \end{aligned}$$
(4.24)

We set \(\tilde{T}^\mu _{\eta ,0}:=0\) and \(\mathbf x ^{-1}:=\emptyset \).

To consider the evolutionary time scale \(\ln 1/\mu \), we define \(T^\mu _{\eta ,i}\) through \(\tilde{T}^\mu _{\eta ,i}=T^\mu _{\eta ,i}\ln 1/\mu \).

Fig. 2

The two phases of \(y^i_*=x^i\) invading \(x^{i-1}\), in the case where there is no coexistence. The dashed line corresponds to \(\xi ^\mu _t(x^{i-1})\), the solid line depicts \(\xi ^\mu _t(x^i)\)

We can now turn to the proof of Theorem 2 and inductively derive the convergence of \(\xi ^\mu _{t\ln 1/\mu }\) to a jump process as \(\mu \rightarrow 0\). The two phases of an invasion (exponential growth and Lotka–Volterra) are depicted in Fig. 2.

Proof of Theorem 2

The proof is split into several parts. The main goal is to inductively approximate \(T^\mu _{\eta ,i}\) and \(\xi ^\mu _{t\ln 1/\mu }\), similar to Corollary 1. We claim that, for each \(1\le i\le I\) such that \(T_i<\infty \),

$$\begin{aligned}&\min _{\begin{array}{c} y\in \mathbb {H}^n\\ \rho ^{i-1}_y>0 \end{array}}\min _{\begin{array}{c} z\in \mathbb {H}^n\\ f_{z,\mathbf x ^{i-1}}>0 \end{array}}\frac{\rho ^{i-1}_z+|z-y|-\eta \hat{C}_{i-1}}{f_{z,\mathbf x ^{i-1}}+\eta \hat{C}}\le \liminf _{\mu \rightarrow 0}T^\mu _{\eta ,i}-T_{i-1}\nonumber \\&\quad \le \limsup _{\mu \rightarrow 0}T^\mu _{\eta ,i}-T_{i-1}\le \min _{\begin{array}{c} y\in \mathbb {H}^n\\ \rho ^{i-1}_y>0 \end{array}}\min _{\begin{array}{c} z\in \mathbb {H}^n\\ f_{z,\mathbf x ^{i-1}}>0 \end{array}}\frac{\rho ^{i-1}_z+|z-y|+\eta \check{C}_{i-1}}{f_{z,\mathbf x ^{i-1}}-\eta \check{C}}. \end{aligned}$$
(4.25)

Moreover, for each \(0\le i<I\) such that \(T_i<\infty \), \(T_i<t<T_{i+1}\), there are positive constants \(\check{c}_i\), \(\check{C}_i\), \(\hat{c}_i\), \(\hat{C}_i\), and m, such that, for every \(y\in \mathbb {H}^n\),

$$\begin{aligned} \check{c}_i&\mu ^{\min _{z\in \mathbb {H}^n}[\rho ^i_z+|z-y|-(t-T_i)(f_{z,\mathbf x ^i}-\eta \check{C})]+\eta \check{C}_i}\le \xi ^\mu _{t\ln 1/\mu }(y)\nonumber \\&\le \hat{c}_i\mu ^{\min _{z\in \mathbb {H}^n}[\rho ^i_z+|z-y|-(t-T_i)(f_{z,\mathbf x ^i}+\eta \hat{C})]-\eta \hat{C}_i}\left( 1+t\ln \frac{1}{\mu }\right) ^{(i+1)m}, \end{aligned}$$
(4.26)

while, for each \(x\in \mathbf x ^i\), \(\xi ^\mu _{t\ln 1/\mu }(x)\in [\bar{\xi }_\mathbf{x ^i}(x)-\eta \bar{C},\bar{\xi }_\mathbf{x ^i}(x)+\eta \bar{C}]\).

In the first step, we approximate \(|T^\mu _{\eta ,i}-T_i|\le \eta C\), assuming that the claim holds true. Second, we derive a uniform bound on the duration of the \(i^\text {th}\) invasion phase, using Lemma 2. In Step 3, we prove the bounds that are claimed above. Finally, we use these bounds to derive the convergence as \(\mu \rightarrow 0\).

Step 1 \(|T^\mu _{\eta ,i}-T_i|\le \eta C\).

In the case where there exists a \(y\in \mathbb {H}^n\) such that \(f_{y,x^{i-1}}>0\), we want to relate \(T_i\), as defined in (2.21), to \(T^\mu _{\eta ,i}\).

First, we prove a different identity for \(T_i\) that is similar to (4.25), namely the second equality of

$$\begin{aligned} T_i-T_{i-1}=\min _{\begin{array}{c} y\in \mathbb {H}^n:\\ f_{y,\mathbf x ^{i-1}}>0 \end{array}}\frac{\rho _y^{i-1}}{f_{y,\mathbf x ^{i-1}}}=\min _{\begin{array}{c} y\in \mathbb {H}^n\\ \rho ^{i-1}_y>0 \end{array}}\min _{\begin{array}{c} z\in \mathbb {H}^n\\ f_{z,\mathbf x ^{i-1}}>0 \end{array}}\frac{\rho ^{i-1}_z+|z-y|}{f_{z,\mathbf x ^{i-1}}}. \end{aligned}$$
(4.27)

On the one hand, \(f_{y,\mathbf x ^{i-1}}>0\) implies \(\rho ^{i-1}_y>0\). Indeed, \(\rho ^{i-1}_y=0\) only if \(y\in \mathbf x ^{i-1}\), in which case \(f_{y,\mathbf x ^{i-1}}=0\), or if \(y\in \mathbf x ^{i-2}\backslash \mathbf x ^{i-1}\), which implies \(f_{y,\mathbf x ^{i-1}}<0\) (otherwise the procedure would have terminated after the \((i-1)^\text {st}\) invasion due to case (b) in Theorem 2). Hence

$$\begin{aligned} \min _{\begin{array}{c} y\in \mathbb {H}^n\\ \rho ^{i-1}_y>0 \end{array}}\min _{\begin{array}{c} z\in \mathbb {H}^n\\ f_{z,\mathbf x ^{i-1}}>0 \end{array}}\frac{\rho ^{i-1}_z+|z-y|}{f_{z,\mathbf x ^{i-1}}} \,{\le }\,\min _{\begin{array}{c} y\in \mathbb {H}^n\\ f_{y,\mathbf x ^{i-1}}>0 \end{array}}\min _{\begin{array}{c} z\in \mathbb {H}^n\\ f_{z,\mathbf x ^{i-1}}>0 \end{array}}\frac{\rho ^{i-1}_z+|z-y|}{f_{z,\mathbf x ^{i-1}}} \,{\le }\,\min _{\begin{array}{c} y\in \mathbb {H}^n\\ f_{y,\mathbf x ^{i-1}}>0 \end{array}}\frac{\rho ^{i-1}_y}{f_{y,\mathbf x ^{i-1}}}, \end{aligned}$$
(4.28)

where we inserted \(z=y\) in the second step.

On the other hand, if we assume that \(\bar{y}\) and \(\bar{z}\) realise the minima, which implies that \(f_{\bar{z},\mathbf x ^{i-1}}>0\), we obtain

$$\begin{aligned} \min _{\begin{array}{c} y\in \mathbb {H}^n\\ \rho ^{i-1}_y>0 \end{array}}\min _{\begin{array}{c} z\in \mathbb {H}^n\\ f_{z,\mathbf x ^{i-1}}>0 \end{array}}\frac{\rho ^{i-1}_z+|z-y|}{f_{z,\mathbf x ^{i-1}}} =\frac{\rho ^{i-1}_{\bar{z}}+|\bar{z}-\bar{y}|}{f_{\bar{z},\mathbf x ^{i-1}}} \ge \frac{\rho ^{i-1}_{\bar{z}}}{f_{\bar{z},\mathbf x ^{i-1}}} \ge \min _{\begin{array}{c} y\in \mathbb {H}^n\\ f_{y,\mathbf x ^{i-1}}>0 \end{array}}\frac{\rho ^{i-1}_y}{f_{y,\mathbf x ^{i-1}}}. \end{aligned}$$
(4.29)
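The second equality in (4.27) only uses two facts: every type with positive invasion fitness has \(\rho ^{i-1}_y>0\), and \(|z-y|\ge 0\) with equality for \(z=y\). The sketch below checks the identity numerically on the hypercube \(\{0,1\}^3\). The random values of \(\rho \) and f are hypothetical test data, not quantities produced by the model, and the helper names are ours.

```python
from itertools import product
import math
import random

def hamming(a, b):
    """Hamming distance |a - b| between two vertices of the hypercube."""
    return sum(u != v for u, v in zip(a, b))

def single_min(rho, f):
    """min over y with f_y > 0 of rho_y / f_y (first form in (4.27))."""
    return min(rho[y] / f[y] for y in rho if f[y] > 0)

def double_min(rho, f):
    """min over y with rho_y > 0 and z with f_z > 0 of (rho_z + |z - y|) / f_z."""
    return min(
        (rho[z] + hamming(z, y)) / f[z]
        for y in rho if rho[y] > 0
        for z in rho if f[z] > 0
    )

random.seed(0)
cube = list(product((0, 1), repeat=3))
f = {y: random.uniform(-1.0, 1.0) for y in cube}
f[cube[-1]] = 0.7  # make sure at least one type has positive fitness
# rho is strictly positive wherever f is positive, as in the argument above
rho = {
    y: random.uniform(0.1, 2.0) if f[y] > 0 else random.choice((0.0, 1.0))
    for y in cube
}
print(single_min(rho, f), double_min(rho, f))
```

The two printed values coincide, since the double minimum is always attained at a pair with \(z=y\).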

Now, under the assumption that (4.25) holds true, we approximate

$$\begin{aligned}&\liminf _{\mu \rightarrow 0}T^\mu _{\eta ,i}-T_{i-1}\nonumber \\&\quad \ge \left( \min _{\begin{array}{c} y\in \mathbb {H}^n\\ \rho ^{i-1}_y>0 \end{array}}\min _{\begin{array}{c} z\in \mathbb {H}^n\\ f_{z,\mathbf x ^{i-1}}>0 \end{array}}\frac{\rho ^{i-1}_z+|z-y|}{f_{z,\mathbf x ^{i-1}}}\right) \left( \min _{\begin{array}{c} z\in \mathbb {H}^n\\ f_{z,\mathbf x ^{i-1}}>0 \end{array}}\frac{f_{z,\mathbf x ^{i-1}}}{f_{z,\mathbf x ^{i-1}}+\eta \hat{C}}\right) \nonumber \\&\qquad -\eta \hat{C}_{i-1}\max _{\begin{array}{c} z\in \mathbb {H}^n\\ f_{z,\mathbf x ^{i-1}}>0 \end{array}}\frac{1}{f_{z,\mathbf x ^{i-1}}+\eta \hat{C}}\nonumber \\&\quad = (T_i-T_{i-1})\left( 1-\max _{\begin{array}{c} z\in \mathbb {H}^n\\ f_{z,\mathbf x ^{i-1}}>0 \end{array}}\frac{\eta \hat{C}}{f_{z,\mathbf x ^{i-1}}+\eta \hat{C}}\right) -\eta \hat{C}_{i-1}\max _{\begin{array}{c} z\in \mathbb {H}^n\\ f_{z,\mathbf x ^{i-1}}>0 \end{array}}\frac{1}{f_{z,\mathbf x ^{i-1}}+\eta \hat{C}}\nonumber \\&\quad = (T_i-T_{i-1})-\eta ((T_i-T_{i-1})\hat{C}+\hat{C}_{i-1})\max _{\begin{array}{c} z\in \mathbb {H}^n\\ f_{z,\mathbf x ^{i-1}}>0 \end{array}}\frac{1}{f_{z,\mathbf x ^{i-1}}+\eta \hat{C}} \end{aligned}$$
(4.30)

and, analogously,

$$\begin{aligned} \limsup _{\mu \rightarrow 0}T^\mu _{\eta ,i}-T_{i-1}\le&(T_i-T_{i-1})+\eta ((T_i-T_{i-1})\check{C}+\check{C}_{i-1})\max _{\begin{array}{c} z\in \mathbb {H}^n\\ f_{z,\mathbf x ^{i-1}}>0 \end{array}}\frac{1}{f_{z,\mathbf x ^{i-1}}-\eta \check{C}}. \end{aligned}$$
(4.31)

As a result there is a constant \(C>0\) such that, for \(\eta \) and \(\mu \) small enough,

$$\begin{aligned} |T^\mu _{\eta ,i}-T_i|\le \eta C. \end{aligned}$$
(4.32)
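As an illustration of Step 1, the sketch below evaluates the two outer expressions of (4.25) for a small example and shows that they enclose \(T_i-T_{i-1}\) with a gap of order \(\eta \). The values of \(\rho \) and f and the constants `c_prev` and `c_rate` (standing in for \(\hat{C}_{i-1},\check{C}_{i-1}\) and \(\hat{C},\check{C}\)) are hypothetical, chosen only to make the computation explicit.

```python
def hamming(a, b):
    """Hamming distance between two vertices of the hypercube."""
    return sum(u != v for u, v in zip(a, b))

def sandwich(rho, f, eta, c_prev=1.0, c_rate=1.0):
    """Lower and upper expressions of (4.25) for a given eta.

    c_prev and c_rate are placeholder constants (assumptions).
    """
    positive = [z for z in rho if f[z] > 0]
    if eta * c_rate >= min(f[z] for z in positive):
        raise ValueError("eta too large: a denominator would not stay positive")
    pairs = [(y, z) for y in rho if rho[y] > 0 for z in positive]
    lower = min(
        (rho[z] + hamming(z, y) - eta * c_prev) / (f[z] + eta * c_rate)
        for y, z in pairs
    )
    upper = min(
        (rho[z] + hamming(z, y) + eta * c_prev) / (f[z] - eta * c_rate)
        for y, z in pairs
    )
    return lower, upper

# Toy data on {0,1}^2: rho is the distance to the resident (0, 0).
rho = {(0, 0): 0.0, (0, 1): 1.0, (1, 0): 1.0, (1, 1): 2.0}
f = {(0, 0): 0.0, (0, 1): 1.0, (1, 0): 0.2, (1, 1): 1.5}
exact = min(rho[y] / f[y] for y in rho if f[y] > 0)

for eta in (0.1, 0.01, 0.001):
    lo, hi = sandwich(rho, f, eta)
    print(eta, lo, exact, hi)
```

The guard on `eta` mirrors the requirement that \(\eta \) be small enough for \(f_{z,\mathbf x ^{i-1}}-\eta \check{C}\) to remain positive.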

Step 2 Uniform time bound on the Lotka–Volterra phase.

We show that, for \(\eta \) small enough,

$$\begin{aligned} \tilde{\tau }^\mu _\eta (\xi ^\mu _{\tilde{T}^\mu _{\eta ,i}},\mathbf x ^i)=\inf \big \{t\ge 0:&\forall \ x\in \mathbf x ^i: |\xi ^\mu _{\tilde{T}^\mu _{\eta ,i}+t}(x)-\bar{\xi }_\mathbf{x ^i}(x)|\le \eta \frac{\bar{c}}{\sqrt{|\mathbf x ^i|}},\nonumber \\&\forall \ y\in \mathbb {H}^n\backslash \mathbf x ^i: \xi ^\mu _{\tilde{T}^\mu _{\eta ,i}+t}(y)\le \frac{\eta }{3}\big \} \end{aligned}$$
(4.33)

is bounded by some constant \(\bar{T}_\eta \).

Since \(\text {LVE}_+(\mathbf x ^{i-1})=\{\bar{\xi }_\mathbf{x ^{i-1}}\}\) and \(f_{y^i_*,\mathbf x ^{i-1}}>0\), we obtain \(r(y)>0\), for every \(y\in (\mathbf x ^{i-1}\cup y^i_*)\). (B\(_\mathbf{x ^{i-1}\cup y^i_*}\)) holds by assumption and hence Lemma 2 can be applied to \(\mathbf y =\mathbf x ^{i-1}\cup y^i_*\) and \(\mathbf x =\mathbf x ^i\).

Let

$$\begin{aligned} \varOmega ^i_\eta :=&\{\xi : \xi (y^i_*)=\eta ,\nonumber \\&\xi (x)\in [\bar{\xi }_\mathbf{x ^{i-1}}(x)-\eta \bar{C},\bar{\xi }_\mathbf{x ^{i-1}}(x)+\eta \bar{C}]\ \forall \ x\in \mathbf x ^{i-1}, \xi (y)=0\text { else}\}, \end{aligned}$$
(4.34)

then, by continuity of \(\bar{\tau }^0_\eta (\xi ,\mathbf x ^i,\mathbf x ^{i-1}\cup y^i_*)\) in \(\xi \) (Lemma 2) and the compactness of \(\varOmega ^i_\eta \),

$$\begin{aligned} \sup _{\xi \in \varOmega ^i_\eta }\bar{\tau }^0_{\eta }(\xi ,\mathbf x ^i,\mathbf x ^{i-1}\cup y^i_*)=:\bar{T}_\eta <\infty . \end{aligned}$$
(4.35)

Using Lemma 1, for

$$\begin{aligned} \xi :={\left\{ \begin{array}{ll}\xi ^\mu _{\tilde{T}^\mu _{\eta ,i}}(x)&{}x\in \mathbf x ^{i-1}\cup y^i_*\\ 0&{}\text {else}\end{array}\right. }\ \in \varOmega ^i_\eta ,\qquad \bar{\tau }:=\bar{\tau }^0_{\eta }(\xi ,\mathbf x ^i,\mathbf x ^{i-1}\cup y^i_*), \end{aligned}$$
(4.36)

we obtain, for \(x\in \mathbf x ^i\), \(y\in \mathbf x ^{i-1}\cup y^i_*\backslash \mathbf x ^i\), \(\xi ^0_0=\xi \), and \(\mu \) small enough, that

$$\begin{aligned} |\xi ^\mu _{\tilde{T}^\mu _{\eta ,i}+\bar{\tau }}(x)-\bar{\xi }_\mathbf{x ^i}(x)|&\le \left\| \xi ^\mu _{\tilde{T}^\mu _{\eta ,i}+\bar{\tau }}-\xi ^0_{\bar{\tau }}\right\| +c_\mathbf{x ^i}^{-1}\left\| \left. \xi ^0_{\bar{\tau }}\right| _\mathbf{x ^i}-\bar{\xi }_\mathbf{x ^i}\right\| _\mathbf{x ^i}\nonumber \\&\le \mathrm {e}^{\bar{\tau } A}\left( \left\| \xi ^\mu _{\tilde{T}^\mu _{\eta ,i}}-\xi \right\| +\sqrt{\mu \frac{B}{A}}\right) +\frac{\eta \bar{c}}{2\sqrt{|\mathbf x ^i|}}\le \frac{\eta \bar{c}}{\sqrt{|\mathbf x ^i|}},\end{aligned}$$
(4.37)
$$\begin{aligned} \xi ^\mu _{\tilde{T}^\mu _{\eta ,i}+\bar{\tau }}(y)&\le \left\| \xi ^\mu _{\tilde{T}^\mu _{\eta ,i}+\bar{\tau }}-\xi ^0_{\bar{\tau }}\right\| +\xi ^0_{\bar{\tau }}(y)\nonumber \\&\le \mathrm {e}^{{\bar{\tau }} A}\left( \left\| \xi ^\mu _{\tilde{T}^\mu _{\eta ,i}}-\xi \right\| +\sqrt{\mu \frac{B}{A}}\right) +\frac{\eta }{6}\le \frac{\eta }{3}. \end{aligned}$$
(4.38)

Here we used that, for \(\eta \) small enough, \(\left\| \xi ^\mu _{\tilde{T}^\mu _{\eta ,i}}-\xi \right\| \le 2^n\max _{y\in \mathbb {H}^n\backslash (\mathbf x ^{i-1}\cup y^i_*)}\xi ^\mu _{\tilde{T}^\mu _{\eta ,i}}(y)\) tends to zero as \(\mu \rightarrow 0\). A more precise approximation for this is given in Steps 3 and 4.

Overall, \(\tilde{\tau }^\mu _{\eta }(\xi ^\mu _{\tilde{T}^\mu _{\eta ,i}},\mathbf x ^i)\le \bar{\tau }\le \bar{T}_\eta \).

Step 3 Approximation of \(\xi ^\mu _{t\ln 1/\mu }\) and \(T^\mu _{\eta ,i}\).

We now turn to the proof of (4.25) and (4.26).

In the case \(i=0\), (4.26) is given by Theorem 6 and Corollary 1, setting \(\check{c}_0:=\check{c}\), \(\check{C}_0:=0\), \(\hat{c}_0:=2^n\hat{c}\), and \(\hat{C}_0:=0\), and using that, by Step 1, for every \(t<T_1\), there are \(\eta \) and \(\mu \) small enough such that \(t<T^\mu _{\eta ,1}\). Corollary 1 also gives (4.25) for \(i=1\).

Assuming that the claims hold for \(0\le i-1<I\), \(T_i<\infty \) implies that there is some \(y'\in \mathbb {H}^n\) for which \(f_{y',\mathbf x ^{i-1}}>0\), and hence, for every \(y\in \mathbb {H}^n\),

$$\begin{aligned}&\check{c}_{i-1}\mu ^{\min _{z\in \mathbb {H}^n}[\rho ^{i-1}_z+|z-y|-(T^\mu _{\eta ,i}-T_{i-1})(f_{z,\mathbf x ^{i-1}}-\eta \check{C})]+\eta \check{C}_{i-1}}\le \xi ^\mu _{\tilde{T}^\mu _{\eta ,i}}(y)\nonumber \\&\quad \le \hat{c}_{i-1}\mu ^{\min _{z\in \mathbb {H}^n}[\rho ^{i-1}_z+|z-y|-(T^\mu _{\eta ,i}-T_{i-1})(f_{z,\mathbf x ^{i-1}}+\eta \hat{C})]-\eta \hat{C}_{i-1}}\left( 1+\tilde{T}^\mu _{\eta ,i}\right) ^{im}. \end{aligned}$$
(4.39)

Moreover, \(\xi ^\mu _{\tilde{T}^\mu _{\eta ,i}}(y^i_*)=\eta \) and, for every \(x\in \mathbf x ^{i-1}\), \(\xi ^\mu _{\tilde{T}^\mu _{\eta ,i}}(x)\in [\bar{\xi }_\mathbf{x ^{i-1}}(x)-\eta \bar{C},\bar{\xi }_\mathbf{x ^{i-1}}(x)+\eta \bar{C}]\). Similar to Corollary 1, we obtain (4.25).

Next, we estimate the evolution of the different types during the Lotka–Volterra phase. Lemma 1 gives \(\xi ^\mu _t(z)\le 2|r(z)|/\alpha (z,z)\), for all \(z\in \mathbb {H}^n\) and \(t\ge 0\), and therefore

$$\begin{aligned} \tfrac{d}{dt}\xi ^\mu _t(y)\ge \left[ r(y)-\sum _{z\in \mathbb {H}^n}\alpha (y,z)\frac{2|r(z)|}{\alpha (z,z)}-\mu b(y)\right] \xi ^\mu _t(y)\ge -K\xi ^\mu _t(y), \end{aligned}$$
(4.40)

for some \(K>0\).
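The differential inequality (4.40) turns into a pointwise lower bound by the usual Gronwall argument. As a short worked step (standard calculus, spelled out here for convenience), multiplying by the integrating factor \(\mathrm {e}^{Kt}\) gives:

```latex
\frac{d}{dt}\Big(\mathrm{e}^{Kt}\,\xi^\mu_t(y)\Big)
  =\mathrm{e}^{Kt}\Big(\tfrac{d}{dt}\xi^\mu_t(y)+K\,\xi^\mu_t(y)\Big)\ge 0
\quad\Longrightarrow\quad
\xi^\mu_{s+t}(y)\ge \mathrm{e}^{-Kt}\,\xi^\mu_s(y)
\qquad\text{for all } s,t\ge 0.
```

Applied with \(s=\tilde{T}^\mu _{\eta ,i}\) and a duration t of at most \(\bar{T}_\eta \), this produces a prefactor of at least \(\mathrm {e}^{-K\bar{T}_\eta }\).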

By Step 2, we know that \(\tilde{\tau }(\xi ^\mu _{\tilde{T}^\mu _{\eta ,i}},\mathbf x ^i)\le \bar{T}_\eta \) and hence (4.39) yields

$$\begin{aligned} \xi ^\mu _{\tilde{T}^\mu _{\eta ,i}+\tilde{\tau }(\xi ^\mu _{\tilde{T}^\mu _{\eta ,i}},\mathbf x ^i)}(y)\ge \mathrm {e}^{-K\bar{T}_\eta }\check{c}_{i-1}\mu ^{\min _{z\in \mathbb {H}^n}[\rho ^{i-1}_z+|z-y|-(T^\mu _{\eta ,i}-T_{i-1})(f_{z,\mathbf x ^{i-1}}-\eta \check{C})]+\eta \check{C}_{i-1}} \end{aligned}$$
(4.41)

Using Step 1, we can approximate

$$\begin{aligned} \min _{z\in \mathbb {H}^n}&[\rho ^{i-1}_z+|z-y|-(T^\mu _{\eta ,i}-T_{i-1})(f_{z,\mathbf x ^{i-1}}-\eta \check{C})]+\eta \check{C}_{i-1}\nonumber \\&= \min _{z\in \mathbb {H}^n}[\rho ^{i-1}_z+|z-y|-(T^\mu _{\eta ,i}-T_{i-1})f_{z,\mathbf x ^{i-1}}]+\eta (\check{C}_{i-1}+(T^\mu _{\eta ,i}-T_{i-1})\check{C})\nonumber \\&\le \rho ^i_y+\eta (\check{C}_{i-1}+(T^\mu _{\eta ,i}-T_{i-1})\check{C}+C\max _{z\in \mathbb {H}^n}f_{z,\mathbf x ^{i-1}}). \end{aligned}$$
(4.42)

We now plug this back in as the exponent and set \(\check{c}'_i:=\mathrm {e}^{-K\bar{T}_\eta }\check{c}_{i-1}\) as well as \(\check{C}'_i\ge \check{C}_{i-1}+(T^\mu _{\eta ,i}-T_{i-1})\check{C}+C\max _{z\in \mathbb {H}^n}f_{z,\mathbf x ^{i-1}}\) to derive

$$\begin{aligned} \xi ^\mu _{\tilde{T}^\mu _{\eta ,i}+\tilde{\tau }(\xi ^\mu _{\tilde{T}^\mu _{\eta ,i}},\mathbf x ^i)}(y)\ge \check{c}'_i\mu ^{\rho ^i_y+\eta \check{C}'_i}. \end{aligned}$$
(4.43)

Note that \(\check{C}'_i\) can be chosen uniformly in \(\eta \) since \(T^\mu _{\eta ,i}\le T_i+\eta C\) by Step 1, while \(\check{c}'_i\) may depend on \(\eta \).

On the other hand,

$$\begin{aligned} \tfrac{d}{dt}\xi ^\mu _t(y)\le r(y)\xi ^\mu _t(y)+\mu \tilde{C}\sum _{z\sim y}\xi ^\mu _t(z). \end{aligned}$$
(4.44)

Following the same argument as for the upper bound in (3.2) (compare Step 2 of the proof of Theorem 6, with \(t=\tilde{\tau }(\xi ^\mu _{\tilde{T}^\mu _{\eta ,i}},\mathbf x ^i)\) and \(\xi ^\mu _{\tilde{T}^\mu _{\eta ,i}}\) instead of \(\xi ^\mu _0\)), we obtain

$$\begin{aligned} \xi ^\mu _{\tilde{T}^\mu _{\eta ,i}+\tilde{\tau }(\xi ^\mu _{\tilde{T}^\mu _{\eta ,i}},\mathbf x ^i)}(y)&\le \hat{c} \mathrm {e}^{\tilde{\tau }(\xi ^\mu _{\tilde{T}^\mu _{\eta ,i}},\mathbf x ^i)\max _{z\in \mathbb {H}^n}r(z)}\Bigg (1+\tilde{\tau }(\xi ^\mu _{\tilde{T}^\mu _{\eta ,i}},\mathbf x ^i)\Bigg )^m\sum _{z\in \mathbb {H}^n}\xi ^\mu _{\tilde{T}^\mu _{\eta ,i}}(z)\mu ^{|z-y|}. \end{aligned}$$
(4.45)

By Step 1,

$$\begin{aligned} \min _{z'\in \mathbb {H}^n}&[\rho ^{i-1}_{z'}+|z'-z|-(T^\mu _{\eta ,i}-T_{i-1})(f_{z',\mathbf x ^{i-1}}+\eta \hat{C})]-\eta \hat{C}_{i-1}+|z-y|\nonumber \\ \ge&\min _{z'\in \mathbb {H}^n}[\rho ^{i-1}_{z'}+|z'-y|-(T^\mu _{\eta ,i}-T_{i-1})f_{z',\mathbf x ^{i-1}}]-\eta (\hat{C}_{i-1}+(T^\mu _{\eta ,i}-T_{i-1})\hat{C})\nonumber \\ \ge&\rho ^i_y-\eta (\hat{C}_{i-1}+(T^\mu _{\eta ,i}-T_{i-1})\hat{C}+C\max _{z\in \mathbb {H}^n}f_{z,\mathbf x ^{i-1}}). \end{aligned}$$
(4.46)

Using this and Step 2, we derive

$$\begin{aligned} \xi ^\mu _{\tilde{T}^\mu _{\eta ,i}+\tilde{\tau }(\xi ^\mu _{\tilde{T}^\mu _{\eta ,i}},\mathbf x ^i)}(y)&\le \hat{c} \mathrm {e}^{\bar{T}_\eta \max _{z\in \mathbb {H}^n}r(z)}(1+\bar{T}_\eta )^m\nonumber \\&\qquad \cdot \sum _{z\in \mathbb {H}^n}\hat{c}_{i-1}\mu ^{\rho ^i_y-\eta (\hat{C}_{i-1}+(T^\mu _{\eta ,i}-T_{i-1})\hat{C}+C\max _{z\in \mathbb {H}^n}f_{z,\mathbf x ^{i-1}})}\left( 1\,{+}\,\tilde{T}^\mu _{\eta ,i}\right) ^{im}\nonumber \\&\le \,\hat{c}'_i\left( 1+\tilde{T}^\mu _{\eta ,i}\right) ^{im}\mu ^{\rho ^i_y-\eta \hat{C}'_i} \end{aligned}$$
(4.47)

where \(\hat{c}'_i:=2^n\hat{c} \mathrm {e}^{\bar{T}_\eta \max _{z\in \mathbb {H}^n}r(z)}(1+\bar{T}_\eta )^m\hat{c}_{i-1}\) and \(\hat{C}'_i\ge \hat{C}_{i-1}+(T^\mu _{\eta ,i}-T_{i-1})\hat{C}+C\max _{z\in \mathbb {H}^n}f_{z,\mathbf x ^{i-1}}\). As above, \(\hat{C}'_i\) can be chosen uniformly in \(\eta \) since \(T^\mu _{\eta ,i}\le T_i+\eta C\) by Step 1, while \(\hat{c}'_i\) may depend on \(\eta \).

For \(\tilde{\tau }(\xi ^\mu _{\tilde{T}^\mu _{\eta ,i}},\mathbf x ^i)=\tau (\xi ^\mu _{\tilde{T}^\mu _{\eta ,i}},\mathbf x ^i)\ln \frac{1}{\mu }\) and \(\mu \) small enough, Step 1 implies

$$\begin{aligned} |T^\mu _{\eta ,i}+\tau \left( \xi ^\mu _{\tilde{T}^\mu _{\eta ,i}},\mathbf x ^i\right) -T_i|\le \eta C+\frac{\bar{T}_\eta }{\ln \frac{1}{\mu }}\le 2\eta C. \end{aligned}$$
(4.48)

For \(T_i<t<T_{i+1}\), we can now pick \(\eta \) small enough such that \(T_i+2\eta C<t<T_{i+1}-\eta C\), and hence

$$\begin{aligned} \limsup _{\mu \rightarrow 0}T^\mu _{\eta ,i}+\tau (\xi ^\mu _{\tilde{T}^\mu _{\eta ,i}},\mathbf x ^i)<t<\liminf _{\mu \rightarrow 0}T^\mu _{\eta ,i+1}. \end{aligned}$$
(4.49)

As in Corollary 1, with the above bounds on \(\xi ^\mu _{\tilde{T}^\mu _{\eta ,i}+\tilde{\tau }(\xi ^\mu _{\tilde{T}^\mu _{\eta ,i}},\mathbf x ^i)}\), we derive

$$\begin{aligned} \xi ^\mu _{t\ln 1/\mu }(y)&\ge \check{c}\check{c}'_i\mu ^{\min _{z\in \mathbb {H}^n}[\rho ^i_z+\eta \check{C}'_i+|z-y|-(t-(T^\mu _{\eta ,i}+\tau (\xi ^\mu _{\tilde{T}^\mu _{\eta ,i}},\mathbf x ^i)))(f_{z,\mathbf x ^i}-\eta \check{C})]}\nonumber \\&\ge \check{c}\check{c}'_i\mu ^{\min _{z\in \mathbb {H}^n}[\rho ^i_z+|z-y|-(t-T_i)(f_{z,\mathbf x ^i}-\eta \check{C})]+\eta (\check{C}'_i+2C\max _{z\in \mathbb {H}^n}(f_{z,\mathbf x ^i}-\eta \check{C}))}\nonumber \\&=\check{c}_i\mu ^{\min _{z\in \mathbb {H}^n}[\rho ^i_z+|z-y|-(t-T_i)(f_{z,\mathbf x ^i}-\eta \check{C})]+\eta \check{C}_i}, \end{aligned}$$
(4.50)

defining \(\check{c}_i:=\check{c}\check{c}'_i\) and \(\check{C}_i:=\check{C}'_i+2C\max _{z\in \mathbb {H}^n}(f_{z,\mathbf x ^i}-\eta \check{C})\).

Similarly, the upper bound is derived as

$$\begin{aligned} \xi ^\mu _{t\ln 1/\mu }&(y)\le 2^n\hat{c}\hat{c}'_i\mu ^{\min _{z\in \mathbb {H}^n}[\rho ^i_z-\eta \hat{C}'_i+|z-y|-(t-(T^\mu _{\eta ,i}+\tau (\xi ^\mu _{\tilde{T}^\mu _{\eta ,i}},\mathbf x ^i)))(f_{z,\mathbf x ^i}+\eta \hat{C})]}\nonumber \\&\quad \cdot (1+\tilde{T}^\mu _{\eta ,i})^{im}\left( 1+\left( t\ln \frac{1}{\mu }-(\tilde{T}^\mu _{\eta ,i}+\tilde{\tau }(\xi ^\mu _{\tilde{T}^\mu _{\eta ,i}},\mathbf x ^i))\right) \right) ^m\nonumber \\&\le 2^n\hat{c}\hat{c}'_i\mu ^{\min _{z\in \mathbb {H}^n}[\rho ^i_z+|z-y|-(t-T_i)(f_{z,\mathbf x ^i}+\eta \hat{C})]-\eta (\hat{C}'_i+2C\max _{z\in \mathbb {H}^n}(f_{z,\mathbf x ^i}+\eta \hat{C}))}\left( 1+t\ln \frac{1}{\mu }\right) ^{(i+1)m}\nonumber \\&=\hat{c}_i\mu ^{\min _{z\in \mathbb {H}^n}[\rho ^i_z+|z-y|-(t-T_i)(f_{z,\mathbf x ^i}+\eta \hat{C})]-\eta \hat{C}_i}\left( 1+t\ln \frac{1}{\mu }\right) ^{(i+1)m}, \end{aligned}$$
(4.51)

with \(\hat{c}_i:=2^n\hat{c}\hat{c}'_i\) and \(\hat{C}_i:=\hat{C}'_i+2C\max _{z\in \mathbb {H}^n}(f_{z,\mathbf x ^i}+\eta \hat{C})\). This concludes the proof of (4.26).

Notice that, although \(\check{c}_i\) and \(\hat{c}_i\) may vary for different \(\eta \), \(\check{C}_i\) and \(\hat{C}_i\) can be chosen uniformly in \(\eta \).

For every \(x\in \mathbf x ^i\), we obtain \(\xi ^\mu _{t\ln 1/\mu }(x)\in [\bar{\xi }_\mathbf{x ^i}(x)-\eta \bar{C},\bar{\xi }_\mathbf{x ^i}(x)+\eta \bar{C}]\), as in Theorem 6.

Step 4 Convergence for \(T_i<t<T_{i+1}\).

We now want to prove the actual convergence. We already know that the resident types are staying close to their equilibrium between \(T_i\) and \(T_{i+1}\) and therefore mainly have to show that the population sizes of the non-resident types vanish as \(\mu \rightarrow 0\).

We claim that, for each \(i\ge 0\), \(T_i<t<T_{i+1}\), and \(y\in \mathbb {H}^n\backslash \mathbf x ^i\),

$$\begin{aligned} \min _{z\in \mathbb {H}^n}[\rho ^i_z+|z-y|-(t-T_i)(f_{z,\mathbf x ^i}+\eta \hat{C})]-\eta \hat{C}_i\ge \gamma , \end{aligned}$$
(4.52)

for some \(\gamma >0\) and all \(\eta \) small enough, and hence

$$\begin{aligned} 0\le \lim _{\mu \rightarrow 0}\xi ^\mu _{t\ln 1/\mu }(y)\le \lim _{\mu \rightarrow 0}\hat{c}_i\mu ^\gamma \left( 1+t\ln \frac{1}{\mu }\right) ^{(i+1)m}=0. \end{aligned}$$
(4.53)
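The limit in (4.53) relies on the fact that any positive power of \(\mu \) beats a polynomial in \(\ln 1/\mu \). The following sketch evaluates the upper envelope for decreasing \(\mu \); the parameter values (\(\gamma =0.5\), \(t=1\), exponent 4, \(\hat{c}_i=1\)) are arbitrary illustrative choices.

```python
import math

def upper_envelope(mu, gamma, t, power, c_hat=1.0):
    """c_hat * mu^gamma * (1 + t ln(1/mu))^power, the right-hand side of (4.53)."""
    return c_hat * mu**gamma * (1.0 + t * math.log(1.0 / mu)) ** power

for mu in (1e-2, 1e-10, 1e-50, 1e-100):
    print(mu, upper_envelope(mu, gamma=0.5, t=1.0, power=4))
```

For moderate \(\mu \) the logarithmic factor can still dominate, so the envelope need not be small until \(\mu \) is very small; only the limit \(\mu \rightarrow 0\) is claimed.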

We distinguish several cases. If \(z\in \mathbf x ^i\), this implies \(f_{z,\mathbf x ^i}=0\), \(\rho ^i_z=0\), and \(|z-y|\ge 1\). Hence

$$\begin{aligned} \rho ^i_z+|z-y|-(t-T_i)(f_{z,\mathbf x ^i}+\eta \hat{C})-\eta \hat{C}_i\ge 1-\eta ((t-T_i)\hat{C}+\hat{C}_i). \end{aligned}$$
(4.54)

If \(z\in \mathbb {H}^n\backslash \mathbf x ^i\) and \(\rho ^i_z=0\), this implies \(f_{z,\mathbf x ^i}<0\) and

$$\begin{aligned} \rho ^i_z+|z-y|-(t-T_i)(f_{z,\mathbf x ^i}+\eta \hat{C})-\eta \hat{C}_i\ge -(t-T_i)f_{z,\mathbf x ^i}-\eta ((t-T_i)\hat{C}+\hat{C}_i). \end{aligned}$$
(4.55)

If \(z\in \mathbb {H}^n\backslash \mathbf x ^i\), \(\rho ^i_z>0\), and \(f_{z,\mathbf x ^i}\le 0\), we get

$$\begin{aligned} \rho ^i_z+|z-y|-(t-T_i)(f_{z,\mathbf x ^i}+\eta \hat{C})-\eta \hat{C}_i\ge \rho ^i_z-\eta ((t-T_i)\hat{C}+\hat{C}_i). \end{aligned}$$
(4.56)

Since \(\hat{C}_i\) does not depend on \(\eta \), all these expressions can be bounded from below by a positive constant \(\gamma \) if \(\eta \) is small enough.

Finally, if \(z\in \mathbb {H}^n\backslash \mathbf x ^i\), \(\rho ^i_z>0\), and \(f_{z,\mathbf x ^i}>0\), we obtain \(t<T_{i+1}\le \rho ^i_z/f_{z,\mathbf x ^i}+T_i\) and, for \(\eta \) and \(\gamma \) small enough, \(t-T_i<(\rho ^i_z-\eta \hat{C}_i-\gamma )/(f_{z,\mathbf x ^i}+\eta \hat{C})\). Therefore,

$$\begin{aligned} \rho ^i_z+|z-y|-(t-T_i)(f_{z,\mathbf x ^i}+\eta \hat{C})-\eta \hat{C}_i>\rho ^i_z-\eta \hat{C}_i-(\rho ^i_z-\eta \hat{C}_i-\gamma )=\gamma . \end{aligned}$$
(4.57)

This proves the claim, in particular in the case where \(T_{i+1}=\infty \) and there is no \(y\in \mathbb {H}^n\) such that \(f_{y,\mathbf x ^i}>0\).

Last, we consider the \(x\in \mathbf x ^i\). For every \(\eta \) small enough,

$$\begin{aligned} \lim _{\mu \rightarrow 0}\xi ^\mu _{t\ln 1/\mu }(x)\in [\bar{\xi }_\mathbf{x ^i}(x)-\eta \bar{C},\bar{\xi }_\mathbf{x ^i}(x)+\eta \bar{C}]. \end{aligned}$$
(4.58)

As a result, \(\lim _{\mu \rightarrow 0}\xi ^\mu _{t\ln 1/\mu }(x)=\bar{\xi }_\mathbf{x ^i}(x)\) and

$$\begin{aligned} \lim _{\mu \rightarrow 0}\xi ^\mu _{t\ln 1/\mu }=\sum _{x\in \mathbf x ^i}\delta _x\bar{\xi }_\mathbf{x ^i}(x). \end{aligned}$$
(4.59)

5 Special case of equal competition

In this section we turn to the proof of Theorem 3, the special case of equal competition between types. We go through the proof of Theorem 2 to make changes where assumptions are no longer satisfied and check the identities for \(x^i\) and \(T_i\).

Proof of Theorem 3

Unfortunately, assumption (B\(_\mathbf x \)) is not satisfied since there are no constants \(\theta _x\) such that \((\theta _x\alpha )_{x,y\in \mathbf x }\) is positive definite for \(|\mathbf x |\ge 2\). To still be able to apply the results of Theorem 2, we have to carefully go through all the places where assumption (B\(_\mathbf x \)) was used.

In the proof of Theorem 6, this property is only used for the resident types \(\mathbf x \). In the case where \(\mathbf x \) consists of a single type, the positive definiteness is trivially satisfied since \(\alpha >0\).

In the case of Theorem 1, we have to argue differently in a few places. Champagnat et al. (2010) derive Proposition 1 from a more general theorem. If one adapts the proof of this theorem to our situation, one sees that assumption (B\(_\mathbf x \)) is first used to prove that there are only finitely many equilibrium points. In our special case, we are only considering Lotka–Volterra systems involving the old resident type \(x^{i-1}\) and the minimizing mutant \(y^i_*=x^i\). An equilibrium point \(\xi ^*\in ({\mathbb R}_{\ge 0})^{\{x^{i-1},x^i\}}\) has to satisfy

$$\begin{aligned}&\xi ^*(x^{i-1})=0\text { or }r(x^{i-1})=\alpha (\xi ^*(x^{i-1})+\xi ^*(x^i)),\nonumber \\ \text {and }&\xi ^*(x^i)=0\text { or }r(x^i)=\alpha (\xi ^*(x^{i-1})+\xi ^*(x^i)). \end{aligned}$$
(5.1)

Since \(f_{x^i,x^{i-1}}>0\), we obtain \(r(x^i)>r(x^{i-1})\) and there are only three equilibrium points, namely (0, 0), \((r(x^{i-1})/\alpha ,0)\), and \((0,r(x^i)/\alpha )\).

Moreover, assumption (B\(_\mathbf x \)) is used to prove that the evolutionary stable state (if it exists) is unique. An evolutionary stable state \(\bar{\xi }\in ({\mathbb R}_{\ge 0})^{\{x^{i-1},x^i\}}\) is characterised by

$$\begin{aligned} {\left\{ \begin{array}{ll} r(x^j)-\alpha (\bar{\xi }(x^{i-1})+\bar{\xi }(x^i))\le 0,\quad \text {if }\bar{\xi }(x^j)=0,\\ r(x^j)-\alpha (\bar{\xi }(x^{i-1})+\bar{\xi }(x^i))=0,\quad \text {if }\bar{\xi }(x^j)>0, \end{array}\right. } \end{aligned}$$
(5.2)

for \(j\in \{i-1,i\}\). Since \(f_{i,i-1}>0\), only the last of the three equilibrium points satisfies these conditions,

$$\begin{aligned} r(x^{i-1})-\alpha (\bar{\xi }(x^{i-1})+\bar{\xi }(x^i))=r(x^{i-1})-\alpha \left( 0+\frac{r(x^i)}{\alpha }\right) =-f_{i,i-1}\le 0, \end{aligned}$$
(5.3)
$$\begin{aligned} r(x^i)-\alpha (\bar{\xi }(x^{i-1})+\bar{\xi }(x^i))=r(x^i)-\alpha \left( 0+\frac{r(x^i)}{\alpha }\right) =0. \end{aligned}$$
(5.4)
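The case distinction in (5.1) to (5.4) can be checked mechanically. The sketch below lists the three equilibria of the two-type system with equal competition \(\alpha \) and tests condition (5.2) for each of them. The numerical rates and the function names are hypothetical choices for illustration.

```python
def equilibria(r_old, r_new, alpha):
    """The three equilibrium points (old resident, mutant) when r_new != r_old."""
    return [(0.0, 0.0), (r_old / alpha, 0.0), (0.0, r_new / alpha)]

def is_evolutionary_stable(state, rates, alpha, tol=1e-12):
    """Check condition (5.2): zero growth for present types, non-positive for absent ones."""
    total = sum(state)
    for density, r in zip(state, rates):
        growth = r - alpha * total
        if density > 0:
            if abs(growth) > tol:
                return False
        elif growth > tol:
            return False
    return True

r_old, r_new, alpha = 1.0, 1.5, 0.5  # r_new > r_old, i.e. positive invasion fitness
stable = [
    state for state in equilibria(r_old, r_new, alpha)
    if is_evolutionary_stable(state, (r_old, r_new), alpha)
]
print(stable)
```

With these rates only the monomorphic mutant equilibrium passes the check, in line with (5.3) and (5.4).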

Finally, in Lemma 2, we are again in the situation where \(\mathbf x \) consists of only one type and hence the positive definiteness is trivial.

The only thing left is to show the identities for \(x^i\) and \(T_i\). We claim that, for \(i\ge 0\),

$$\begin{aligned} \rho ^{i+1}_y=&\min _{z_{i+1}\in \mathbb {H}^n}\ldots \min _{z_1\in \mathbb {H}^n} \left[ |y-z_{i+1}|+\sum \limits _{j=1}^{i}|z_{j+1}-z_j|+|z_1-x^0|\right. \nonumber \\&\left. -f_{z_1,x^0}T_1-\sum _{j=1}^{i} f_{z_{j+1},x^j}(T_{j+1}-T_j)\right] . \end{aligned}$$
(5.5)

From the initial condition we obtain \(\rho ^0_y=\min _{z\in \mathbb {H}^n}[\lambda _z+|z-y|]=|y-x^0|\). Hence,

$$\begin{aligned} y^1_*=\mathop {{{\,\mathrm{arg\,min}\,}}}\limits _{y\in \mathbb {H}^n:f_{y,x^0}>0}\frac{|y-x^0|}{f_{y,x^0}} \end{aligned}$$
(5.6)

and

$$\begin{aligned} T_1=\min _{\begin{array}{c} y\in \mathbb {H}^n:\\ f_{y,x^0}>0 \end{array}}\frac{|y-x^0|}{f_{y,x^0}}. \end{aligned}$$
(5.7)

Since \(f_{y^1_*,x^0}=r(y^1_*)-r(x^0)>0\), the new equilibrium is monomorphic of type \(x^1=y^1_*\) and \(T_1=|x^1-x^0|/f_{1,0}\). Moreover,

$$\begin{aligned} \rho ^1_y=\min _{z\in \mathbb {H}^n}[\rho ^0_z+|z-y|-T_1f_{z,x^0}]=\min _{z\in \mathbb {H}^n}\left[ |y-z|+|z-x^0|-f_{z,x^0}T_1\right] . \end{aligned}$$
(5.8)

Assume that \(x^i\), \(T_i\), and \(\rho ^i_y\) are of the proposed form. Then there is a unique

$$\begin{aligned}&x^{i+1}=y^{i+1}_*=\mathop {{{\,\mathrm{arg\,min}\,}}}\limits _{y\in \mathbb {H}^n:f_{y,x^i}>0}\frac{\rho ^i_y}{f_{y,x^i}}\nonumber \\&\quad =\mathop {{{\,\mathrm{arg\,min}\,}}}\limits _{y\in \mathbb {H}^n:f_{y,x^i}>0}\frac{\min _{z_i\in \mathbb {H}^n}\left[ |y-z_i|+\rho ^{i-1}_{z_i}-f_{z_i,x^{i-1}}(T_i-T_{i-1})\right] }{f_{y,x^i}}\nonumber \\&\quad =\mathop {{{\,\mathrm{arg\,min}\,}}}\limits _{y\in \mathbb {H}^n:f_{y,x^i}>0}\min _{z_i\in \mathbb {H}^n}F(y,z_i), \end{aligned}$$
(5.9)

where the last equality serves as the definition of the function \(F:\mathbb {H}^n\times \mathbb {H}^n\rightarrow {\mathbb R}_+\).

Assume that the minimum over \(z_i\) is only realised by some \(\bar{z}\ne y^{i+1}_*\), i.e.

$$\begin{aligned} \min _{\begin{array}{c} y\in \mathbb {H}^n\\ f_{y,x^i}>0 \end{array}}\min _{z_i\in \mathbb {H}^n}F(y,z_i)=\min _{z_i\in \mathbb {H}^n}F(y^{i+1}_*,z_i)=F(y^{i+1}_*,\bar{z})<F(y^{i+1}_*,y^{i+1}_*). \end{aligned}$$
(5.10)

Looking back at the definition of F and using that

$$\begin{aligned} \rho ^{i-1}_{y^{i+1}_*}&=\min _{z\in \mathbb {H}^n}[\rho ^{i-2}_z+|z-y^{i+1}_*|-(T_{i-1}-T_{i-2})f_{z,\mathbf x ^{i-2}}]\nonumber \\&\le \min _{z\in \mathbb {H}^n}[\rho ^{i-2}_z+|z-\bar{z}|-(T_{i-1}-T_{i-2})f_{z,\mathbf x ^{i-2}}]+|\bar{z}-y^{i+1}_*|\nonumber \\&=\rho ^{i-1}_{\bar{z}}+|y^{i+1}_*-\bar{z}|, \end{aligned}$$
(5.11)

this yields

$$\begin{aligned} 0\le |y^{i+1}_*-\bar{z}|+\rho ^{i-1}_{\bar{z}}-\rho ^{i-1}_{y^{i+1}_*}<(f_{\bar{z},x^{i-1}}-f_{y^{i+1}_*,x^{i-1}})(T_i-T_{i-1}) \end{aligned}$$
(5.12)

and, since \(T_i>T_{i-1}\), we obtain \(f_{\bar{z},x^{i-1}}>f_{y^{i+1}_*,x^{i-1}}>0\). But this would imply

$$\begin{aligned} \min _{z_i\in \mathbb {H}^n}F(\bar{z},z_i)\le F(\bar{z},\bar{z})<F(y^{i+1}_*,\bar{z})=\min _{\begin{array}{c} y\in \mathbb {H}^n\\ f_{y,x^i}>0 \end{array}}\min _{z_i\in \mathbb {H}^n}F(y,z_i), \end{aligned}$$
(5.13)

which is a contradiction. Hence, \(\bar{z}\) can be chosen equal to \(y^{i+1}_*\).

Repeating the previous argument shows that the minimum over \(z_1,\ldots ,z_{i+1}\) is achieved at \(z_1=\cdots =z_{i+1}=y\) and hence

$$\begin{aligned} x^{i+1}=&\mathop {{{\,\mathrm{arg\,min}\,}}}\limits _{y\in \mathbb {H}^n:f_{y,x^i}>0}\frac{\rho ^{i-1}_y-f_{y,x^{i-1}}(T_i-T_{i-1})}{f_{y,x^i}}=\cdots \nonumber \\ =&\mathop {{{\,\mathrm{arg\,min}\,}}}\limits _{y\in \mathbb {H}^n:f_{y,x^i}>0}\frac{|y-x^0|-f_{y,x^0}\frac{|x^1-x^0|}{f_{1,0}}-\sum _{j=1}^{i-1}f_{y,x^j}(T_{j+1}-T_j)}{f_{y,x^i}}\nonumber \\ =&\mathop {{{\,\mathrm{arg\,min}\,}}}\limits _{y\in \mathbb {H}^n:f_{y,x^i}>0}\frac{|y-x^0|}{f_{y,x^i}}-\frac{f_{y,x^{i-1}}}{f_{y,x^i}}T_i-\sum _{j=1}^{i-1}\frac{f_{y,x^{j-1}}-f_{y,x^j}}{f_{y,x^i}}T_j\nonumber \\ =&\mathop {{{\,\mathrm{arg\,min}\,}}}\limits _{y\in \mathbb {H}^n:f_{y,x^i}>0}\frac{|y-x^0|}{f_{y,x^i}}-\frac{f_{y,x^{i-1}}(|x^i-x^0|-|x^{i-1}-x^0|)}{f_{y,x^i}f_{i,i-1}}\nonumber \\&-\sum _{j=1}^{i-1}\frac{|x^j-x^0|-|x^{j-1}-x^0|}{f_{j,j-1}}\frac{f_{y,x^{j-1}}-f_{y,x^j}}{f_{y,x^i}}\nonumber \\ =&\mathop {{{\,\mathrm{arg\,min}\,}}}\limits _{y\in \mathbb {H}^n:f_{y,x^i}>0}\frac{|y-x^0|}{f_{y,x^i}}-(|x^i-x^0|-|x^{i-1}-x^0|)\left( \frac{1}{f_{i,i-1}}+\frac{1}{f_{y,x^i}}\right) \nonumber \\&\qquad \qquad \qquad \qquad -\frac{|x^{i-1}-x^0|-|x^0-x^0|}{f_{y,x^i}}\nonumber \\ =&\mathop {{{\,\mathrm{arg\,min}\,}}}\limits _{y\in \mathbb {H}^n:f_{y,x^i}>0}\frac{|y-x^0|-|x^i-x^0|}{f_{y,x^i}}-T_i, \end{aligned}$$
(5.14)

where we use (2.25) several times. Analogously,

$$\begin{aligned} T_{i+1}&=T_i+\min _{\begin{array}{c} y\in \mathbb {H}^n:\\ f_{y,x^i}>0 \end{array}}\frac{\rho ^i_y}{f_{y,x^i}}\nonumber \\&=T_i+\left( \frac{|x^{i+1}-x^0|-|x^i-x^0|}{f_{i+1,i}}-T_i\right) =\frac{|x^{i+1}-x^0|-|x^i-x^0|}{f_{i+1,i}}. \end{aligned}$$
(5.15)

Finally,

$$\begin{aligned} \rho ^{i+1}_y=\min _{z_{i+1}\in \mathbb {H}^n}[\rho ^i_{z_{i+1}}+|z_{i+1}-y|-(T_{i+1}-T_i)f_{z_{i+1},x^i}], \end{aligned}$$
(5.16)

which is of the desired form. This proves the claim and hence the theorem.
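The recursion used in this proof can be run directly on a small example. The sketch below iterates \(\rho ^0_y=|y-x^0|\), the jump rule (5.9), the time increment from the first line of (5.15), and the update (5.16), with \(f_{y,x}=r(y)-r(x)\) as in the equal competition case. The fitness values are hypothetical, the function name `limiting_walk` is ours, and ties in the arg min are broken arbitrarily here, whereas the construction above requires the minimiser to be unique.

```python
def hamming(a, b):
    """Hamming distance between two vertices of the hypercube."""
    return sum(u != v for u, v in zip(a, b))

def limiting_walk(r, x0):
    """Iterate (5.9), (5.15) and (5.16) until no fitter type exists."""
    types = list(r)
    rho = {y: float(hamming(y, x0)) for y in types}  # rho^0_y = |y - x^0|
    x, t = x0, 0.0
    path, times = [x0], [0.0]
    while True:
        fitness = {y: r[y] - r[x] for y in types}  # f_{y,x} under equal competition
        candidates = [y for y in types if fitness[y] > 0]
        if not candidates:
            break  # x is a global fitness maximum
        nxt = min(candidates, key=lambda y: rho[y] / fitness[y])
        dt = rho[nxt] / fitness[nxt]  # T_{i+1} - T_i
        shifted = {z: rho[z] - dt * fitness[z] for z in types}
        rho = {y: min(shifted[z] + hamming(z, y) for z in types) for y in types}
        x, t = nxt, t + dt
        path.append(x)
        times.append(t)
    return path, times

# Hypothetical fitness landscape on {0,1}^2.
r = {(0, 0): 1.0, (0, 1): 2.0, (1, 0): 1.2, (1, 1): 2.5}
path, times = limiting_walk(r, (0, 0))
print(path, times)
```

In this example the second jump time agrees with the closed form \((|x^2-x^0|-|x^1-x^0|)/f_{2,1}\) from (5.15).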

6 A first look at limited range of mutation

In this section we present the proof of Theorem 5, where \(\ell =1\), and take a first look at the intermediate cases of \(1<\ell <n\).

6.1 Proof for the case \(\ell =1\)

We again go over the previous proofs and make alterations where necessary.

Proof of Theorem 5

We only consider the first invasion step. We can assume that \(\eta <\bar{\xi }\). Consequently, up to time \(\tilde{T}^\mu _{\eta ,1}\wedge \inf \{t\ge 0:\exists \ z\in \mathbb {H}^n, |z-x^0|>1,\xi ^\mu _t(z)\ge \bar{\xi }\mu \}\), the neighbours of type \(x^0\) are the only active mutants. As before,

$$\begin{aligned} \xi ^\mu _t(x^0)\in [\bar{\xi }_{x^0}(x^0)-\eta \bar{C},\bar{\xi }_{x^0}(x^0)+\eta \bar{C}]. \end{aligned}$$
(6.1)

Moreover, as in (3.22) and (3.33), we obtain

$$\begin{aligned} {[}f_{x^0,x^0}-\eta \check{C}]\xi ^\mu _t(x^0)\le \tfrac{d}{dt}\xi ^\mu _t(x^0)\le [f_{x^0,x^0}+\eta \hat{C}]\xi ^\mu _t(x^0), \end{aligned}$$
(6.2)

and with \(f_{x^0,x^0}=0\), \(c:=\bar{\xi }_{x^0}(x^0)-\bar{c}\bar{\xi }\), and \(C:=\bar{\xi }_{x^0}(x^0)+\bar{c}\bar{\xi }\),

$$\begin{aligned} c\mathrm {e}^{-t\eta \check{C}}\le \xi ^\mu _t(x^0)\le C\mathrm {e}^{t\eta \hat{C}}. \end{aligned}$$
(6.3)

Considering the neighbours \(y\sim x^0\) of the resident type, we derive

$$\begin{aligned} {[}f_{y,x^0}-\eta \check{C}]\xi ^\mu _t(y)+\mu \tilde{c}\xi ^\mu _t(x^0)\le \tfrac{d}{dt}\xi ^\mu _t(y)\le [f_{y,x^0}+\eta \hat{C}]\xi ^\mu _t(y)+\mu \tilde{C}\xi ^\mu _t(x^0), \end{aligned}$$
(6.4)

and hence the upper bound,

$$\begin{aligned} \xi ^\mu _t(y)&\le \mathrm {e}^{t(f_{y,x^0}+\eta \hat{C})}C_y\mu ^{\lambda _y}+\mu \tilde{C}C\int _0^t\mathrm {e}^{s\eta \hat{C}}\mathrm {e}^{(t-s)(f_{y,x^0}+\eta \hat{C})}ds\nonumber \\&\le \mu \mathrm {e}^{t(f_{y,x^0}+\eta \hat{C})}\left( C_y\mu ^{\lambda _y-1}+\tilde{C}C\int _0^t\mathrm {e}^{-sf_{y,x^0}}ds\right) \nonumber \\&\le \hat{c}'\mu \mathrm {e}^{t\eta \hat{C}}\left( (1+t)\mathrm {e}^{tf_{y,x^0}}+1\right) , \end{aligned}$$
(6.5)

for some \(\hat{c}'<\infty \), uniformly in \(y\sim x^0\), \(\eta <\bar{\xi }\), and \(\mu \).

A similar lower bound can be shown and, on the \(\ln 1/\mu \)-time scale, we obtain

$$\begin{aligned} \check{c}'\mu ^{((1-tf_{y,x^0})\wedge 1)+t\eta \check{C}}\le \xi ^\mu _{t\ln \frac{1}{\mu }}(y)\le \hat{c}'\mu ^{((1-tf_{y,x^0})\wedge 1)-t\eta \hat{C}}\left( 1+t\ln \frac{1}{\mu }\right) . \end{aligned}$$
(6.6)
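For the upper bound in (6.6), the passage from (6.5) to the \(\ln 1/\mu \)-time scale is a direct substitution. As a worked step, replacing t by \(t\ln \frac{1}{\mu }\) in the last line of (6.5) and assuming \(\mu <1\) gives the following; the factor 2 is absorbed into the constant \(\hat{c}'\), which is our reading of how the constants are relabelled.

```latex
\begin{aligned}
\mu\,\mathrm{e}^{(t\ln\frac{1}{\mu})f_{y,x^0}}&=\mu^{1-tf_{y,x^0}},\qquad
\mathrm{e}^{(t\ln\frac{1}{\mu})\eta\hat{C}}=\mu^{-t\eta\hat{C}},\\
\hat{c}'\mu\,\mathrm{e}^{(t\ln\frac{1}{\mu})\eta\hat{C}}
\Big(\big(1+t\ln\tfrac{1}{\mu}\big)\mathrm{e}^{(t\ln\frac{1}{\mu})f_{y,x^0}}+1\Big)
&=\hat{c}'\mu^{-t\eta\hat{C}}
\Big(\big(1+t\ln\tfrac{1}{\mu}\big)\mu^{1-tf_{y,x^0}}+\mu\Big)\\
&\le 2\hat{c}'\mu^{((1-tf_{y,x^0})\wedge 1)-t\eta\hat{C}}\big(1+t\ln\tfrac{1}{\mu}\big).
\end{aligned}
```

The last step uses \(\mu ^{a}\le \mu ^{a\wedge 1}\) and \(\mu \le \mu ^{a\wedge 1}\) for \(\mu <1\), together with \(1\le 1+t\ln \frac{1}{\mu }\).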

Using this bound, all types z such that \(|z-x^0|=2\) can be bounded from above using the same type of calculation to derive

$$\begin{aligned} \xi ^\mu _{t\ln \frac{1}{\mu }}(z)\le C\mu ^{2-t\eta \hat{C}}\left( \left( 1+t\ln \frac{1}{\mu }\right) ^2\mu ^{-t\max _{y\sim x^0}f_{y,x^0}}+1\right) . \end{aligned}$$
(6.7)

Hence, for \(\eta \) small enough, \(\tilde{T}^\mu _{\eta ,1}\approx \inf \{t\ge 0:\exists \ z\in \mathbb {H}^n, |z-x^0|>1,\xi ^\mu _t(z)\ge \bar{\xi }\mu \}\).

As in Corollary 1, we can now argue that

$$\begin{aligned} \min _{\begin{array}{c} y\sim x^0\\ f_{y,x^0}>0 \end{array}}\frac{1}{f_{y,x^0}+\eta \hat{C}}\le \liminf _{\mu \rightarrow 0}T^\mu _{\eta ,1}\le \limsup _{\mu \rightarrow 0}T^\mu _{\eta ,1}\le \min _{\begin{array}{c} y\sim x^0\\ f_{y,x^0}>0 \end{array}}\frac{1}{f_{y,x^0}-\eta \check{C}}. \end{aligned}$$
(6.8)

The first mutant \(y^1_*\) to reach the \(\eta \)-level is the neighbour of \(x^0\) minimising \(1/f_{y,x^0}\) (given \(f_{y,x^0}>0\)), hence maximising r(y), which is unique (or else we set \(I:=i\) and terminate the procedure). This yields \(T^\mu _{\eta ,1}\approx T_1=1/f_{y^1_*,x^0}\).
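The selection of the first invading mutant can be sketched in a few lines of Python. This is only an illustration under stated assumptions: competition is taken to be constant, so that the invasion fitness reduces to \(f_{y,x}=r(y)-r(x)\); types in \(\mathbb {H}^n\) are represented as 0/1 tuples; and the names `neighbours`, `first_invasion` and the toy rates are hypothetical.

```python
from typing import Callable, List, Optional, Tuple

Type = Tuple[int, ...]  # a vertex of the hypercube H^n as a 0/1 tuple


def neighbours(x: Type) -> List[Type]:
    """All types at Hamming distance 1 from x."""
    return [x[:i] + (1 - x[i],) + x[i + 1:] for i in range(len(x))]


def first_invasion(x0: Type, r: Callable[[Type], float]) -> Optional[Tuple[Type, float]]:
    """Return (y*, T_1) with T_1 = 1 / f_{y*,x0}.

    Returns None if no neighbour has positive invasion fitness or if the
    fittest neighbour is not unique (the procedure terminates in that case).
    Assumption of this sketch: constant competition, f_{y,x} = r(y) - r(x).
    """
    fit = {y: r(y) - r(x0) for y in neighbours(x0) if r(y) - r(x0) > 0}
    if not fit:
        return None  # x0 is a local fitness maximum
    best = max(fit.values())
    winners = [y for y, f in fit.items() if f == best]
    if len(winners) != 1:
        return None  # tie between fittest neighbours
    return winners[0], 1.0 / best


# Toy landscape on H^3 (hypothetical rates; unlisted types get rate 0).
rates = {(0, 0, 0): 1.0, (1, 0, 0): 1.5, (0, 1, 0): 3.0, (0, 0, 1): 0.5}
step = first_invasion((0, 0, 0), lambda y: rates.get(y, 0.0))
```

In this toy landscape the neighbour with the largest positive fitness difference is selected, and the returned time is the reciprocal of that difference, in units of \(\ln 1/\mu \).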

The Lotka–Volterra phase can be analysed just as before. Since \(y^1_*\) satisfies \(r(y^1_*)>r(x^0)\), the new equilibrium has \(x^1=y^1_*\) as the only resident type.

Since, for every other \(y\sim x^0\), \(r(y)<r(x^1)\), these types always stay unfit, do not foster mutants above the threshold, and we do not need to consider them any further.

During the Lotka–Volterra phase, once the populations \(\xi ^\mu _t(z)\), \(z\sim x^1\), have surpassed \(\bar{\xi }\mu \), they start to grow. However, since the duration of the Lotka–Volterra phase can be bounded uniformly as before, this only results in mutant populations of order \(\mu ^{1}\), which fits the initial conditions for the next invasion step.

6.2 The intermediate cases

For now, we stick with the assumption of constant competition. In the case of \(\ell \ge n\), arbitrarily large steps can be taken. In particular, arbitrarily large valleys in the fitness landscape (defined by r) can be crossed. A (strict) global fitness maximum is reached eventually and is the only stable point. If \(\ell =1\), the limiting walk always jumps to the fittest nearest neighbour and (strict) local fitness maxima are stable points. In both cases, the microscopic types do not have to be tracked to characterise the jump process. The next step is determined only by the previous and possibly the initial resident type.

The cases \(2\le \ell \le n-1\) interpolate between the two extreme scenarios. To study the accessibility of different types, we again need to keep track of the microscopic populations. To this end, we define some new quantities.

Definition 10

The first appearance time of a type y (on the \(\ln 1/\mu \)-time scale) is denoted by

$$\begin{aligned} \tau ^\mu _y:=\inf \left\{ s\ge 0:\xi ^\mu _{s\ln \frac{1}{\mu }}(y)>0\right\} . \end{aligned}$$
(6.9)

The \(\mu \)-power the population size of type y would have at time \(t\ln 1/\mu \) due to its own growth rate (neglecting mutation from neighbours after \(\tau ^\mu _y\)) is

$$\begin{aligned} \lambda _t(y):=\mathbb {1}_{t\ge \tau ^\mu _y} \Bigg (\underbrace{\ell \wedge |y-x^0|}_\text {initial size}-\sum _{i=0}^\infty \underbrace{f_{y,x^i}(t\wedge T^\mu _{\eta ,i+1}-\tau ^\mu _y\vee T^\mu _{\eta ,i})_+}_{\begin{array}{c} \text {growth between}\\ i^\text {th}\text { and }(i+1)^\text {st}\text { invasion} \end{array}}\Bigg )+\mathbb {1}_{t<\tau ^\mu _y}\infty , \end{aligned}$$
(6.10)

where \(x^i\) and \(T^\mu _{\eta ,i}\) are just as before.

All types under the mutational influence of type y are denoted by

$$\begin{aligned} \varLambda _t(y):=\{z\in \mathbb {H}^n: |z-y|+\lambda _t(y)\le \ell \} \end{aligned}$$
(6.11)

and \(\varLambda _t:=\bigcup _{y\in \mathbb {H}^n}\varLambda _t(y)\).

Since we are assuming constant competition, the population sizes of the different types are approximated by

$$\begin{aligned} \xi ^\mu _{t\ln \frac{1}{\mu }}(y)\approx \mathbb {1}_{y\in \varLambda _t}\mu ^{\min _{z\in \varLambda _t}[|y-z|+\lambda _t(z)]}, \end{aligned}$$
(6.12)

where we drop multiplicative constants and all terms involving \(\eta \). Figure 3 visualises the interplay of \(\lambda _t(y)\), \(\xi ^\mu _{t\ln 1/\mu }(y)\), and the sets \(\varLambda _t(y)\) for a simple example.

Fig. 3 Example for the case \(\ell =3\). The mutational influence of \(x_2\) reaches \(x_1\) and \(x_3\). The population size of \(x_3\) is not determined by its own growth rate but by mutants from the resident type \(x_4\)
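To make (6.11) and (6.12) concrete, the following Python sketch evaluates the sets \(\varLambda _t(y)\) and the resulting \(\mu \)-exponents at one fixed time t. It is an illustration only: the values \(\lambda _t(y)\) are supplied by hand rather than computed from (6.10), types that have not yet appeared are assigned \(\lambda _t=\infty \), \(|\cdot |\) is taken to be the Hamming distance on \(\mathbb {H}^n\), and the function names and toy values are hypothetical.

```python
import math
from itertools import product
from typing import Dict, Set, Tuple

Type = Tuple[int, ...]  # a vertex of the hypercube H^n as a 0/1 tuple


def hamming(a: Type, b: Type) -> int:
    """Number of coordinates in which a and b differ."""
    return sum(u != v for u, v in zip(a, b))


def influence_sets(lam: Dict[Type, float], n: int, ell: int) -> Dict[Type, Set[Type]]:
    """Lambda_t(y) = {z : |z - y| + lambda_t(y) <= ell} for every y, cf. (6.11)."""
    cube = list(product((0, 1), repeat=n))
    return {
        y: {z for z in cube if hamming(z, y) + lam.get(y, math.inf) <= ell}
        for y in cube
    }


def exponents(lam: Dict[Type, float], n: int, ell: int) -> Dict[Type, float]:
    """mu-exponent of every type in Lambda_t, cf. (6.12).

    Multiplicative constants and all eta-terms are dropped, as in the text.
    """
    union = set().union(*influence_sets(lam, n, ell).values())
    return {
        y: min(hamming(y, z) + lam.get(z, math.inf) for z in union)
        for y in union
    }


# Toy configuration on H^3 with ell = 2 (hypothetical values):
# a resident of size mu^0 and one further type of size mu^1.5.
lam = {(0, 0, 0): 0.0, (1, 1, 1): 1.5}
exps = exponents(lam, n=3, ell=2)
```

In this configuration the type `(1, 1, 1)` lies outside the mutational reach of the resident, so its exponent is given by its own \(\lambda _t\), whereas the exponents of all other types are determined by their distance to the resident.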

It is not easy to make general statements about the evolution of this intermediate model. However, we state some first results on the accessibility of types.

Definition 11

A type \(y\in \mathbb {H}^n\) is called accessible if \(y\in \varLambda _\infty :=\bigcup _{t\ge 0}\varLambda _t\).

Remark 10

This is equivalent to \(\tau ^\mu _y<\infty \).

Since resident types can only produce mutants within a radius of \(\ell \), an accessible type has to be reached along a path that passes through types of increasing fitness, with consecutive such types at distance at most \(\ell \). Figure 4 gives an example of such a path.

Lemma 3

A necessary condition for a type y to be accessible is the existence of a path \((y_0=x^0,y_1,\ldots ,y_m=y)\) and indices \(i_0=0<i_1<\cdots <i_k=m\), such that

$$\begin{aligned} \forall \ 1\le j\le k:&\ |i_j-i_{j-1}|\le \ell , \end{aligned}$$
(6.13)
$$\begin{aligned} \forall \ 1\le j< k:&\ f_{y_{i_j},y_{i_{j-1}}}>0,\end{aligned}$$
(6.14)
$$\begin{aligned} \forall \ i_{j-1}<i<i_j:&\ f_{y_{i_{j-1}},y_i}>0. \end{aligned}$$
(6.15)

Fig. 4 A possible path to access y, for \(\ell =3\)

Proof

Assume that \(y\ne x^0\). If \(y\in \varLambda _0(x^0)\), this implies \(|y-x^0|\le \ell \). Hence we can choose any shortest path from \(x^0\) to y and pick the indices \(i_j\) such that the conditions are satisfied.

If y is accessible but \(y\notin \varLambda _0(x^0)\), then \(\tau ^\mu _y>0\). There is at least one \(z\ne y\) such that \(y\in \varLambda _{\tau ^\mu _y}(z)\). We choose such a z for which the rate r(z) is maximal. Consequently, \(\tau ^\mu _z<\tau ^\mu _y\) and \(\xi ^\mu _{\tau ^\mu _y\ln \frac{1}{\mu }}(z)\approx \mu ^{\lambda _{\tau ^\mu _y}(z)}\) (else z would just grow due to mutants from a fitter type, which would imply that z was not chosen such that the rate r(z) is maximal). Any direct path from z to y now only goes through types that are unfit in comparison to z. We set \(y_{i_k}:=z\).

We can now iterate this procedure with z replacing y. In addition, we know that, for the \(z'\ne z\) such that \(z\in \varLambda _{\tau ^\mu _z}(z')\) and \(r(z')\) is maximised, \(r(z)>r(z')\) (else, we would obtain \(\varLambda _t(z)\subset \varLambda _t(z')\), for all \(t\ge 0\), and z would not have been chosen maximising r(z)). We set \(y_{i_{k-1}}:=z'\) and continue until we reach \(x^0\).
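The conditions (6.13)–(6.15) of Lemma 3 can be checked mechanically for a given candidate path. The following Python sketch does so under the assumption of constant competition, where \(f_{y,x}>0\) is equivalent to \(r(y)>r(x)\); it also assumes that a path consists of nearest neighbours in \(\mathbb {H}^n\) and reads (6.15) as holding for every \(1\le j\le k\). The function name and the toy landscape are hypothetical.

```python
from typing import Callable, Sequence, Tuple

Type = Tuple[int, ...]  # a vertex of the hypercube H^n as a 0/1 tuple


def hamming(a: Type, b: Type) -> int:
    """Number of coordinates in which a and b differ."""
    return sum(u != v for u, v in zip(a, b))


def satisfies_lemma3(path: Sequence[Type], idx: Sequence[int],
                     r: Callable[[Type], float], ell: int) -> bool:
    """Check (6.13)-(6.15) for a path (y_0, ..., y_m) and indices i_0 = 0 < ... < i_k = m.

    Assumption of this sketch: constant competition, so f_{y,x} > 0 iff r(y) > r(x).
    """
    m, k = len(path) - 1, len(idx) - 1
    # The indices must start at 0, end at m and be strictly increasing.
    if idx[0] != 0 or idx[-1] != m or any(a >= b for a, b in zip(idx, idx[1:])):
        return False
    # Consecutive types of the path must be nearest neighbours.
    if any(hamming(a, b) != 1 for a, b in zip(path, path[1:])):
        return False
    for j in range(1, k + 1):
        prev, cur = idx[j - 1], idx[j]
        if cur - prev > ell:  # (6.13): steps of length at most ell
            return False
        if j < k and not r(path[cur]) > r(path[prev]):  # (6.14): increasing fitness
            return False
        # (6.15): intermediate types are unfit relative to y_{i_{j-1}}
        if any(not r(path[prev]) > r(path[i]) for i in range(prev + 1, cur)):
            return False
    return True


# Toy example on H^4 (hypothetical rates): a valley of width 2 is crossed for ell = 3.
path = [(0, 0, 0, 0), (1, 0, 0, 0), (1, 1, 0, 0), (1, 1, 1, 0), (1, 1, 1, 1)]
rates = dict(zip(path, [1.0, 0.5, 0.2, 2.0, 0.1]))
ok = satisfies_lemma3(path, [0, 3, 4], rates.__getitem__, ell=3)
```

Since the condition of the lemma is only necessary, a positive result of this check does not establish accessibility.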

Remark 11

The condition in Lemma 3 is not sufficient. Even if such a path exists, there might be a type z that is reached before \(y_{i_j}\) such that \(r(z)>r(y_{i_j})\). In this case the population of \(y_{i_j}\) is not fit to grow and might never reach the necessary size to induce mutants of type \(y_{i_{j+1}}\).

As a corollary, we can consider the non-crossing of fitness valleys. Figure 5 gives an example of a non-accessible type that is surrounded by a fitness valley.

Corollary 2

If a type y is surrounded by a fitness valley of width at least \(\ell +1\), i.e. if for all paths \((y_0=x^0,y_1,\ldots ,y_m=y)\) there exists an \(i\le m-(\ell +1)\) such that \(f_{y_i,y_j}>0\) for all \(i<j<m\), then it is non-accessible.

Fig. 5 Due to the high fitness of \(y_{i_1}\) and \(y_{i_2}\), y is not accessible for \(\ell =3\)

Proof

The claim follows directly from Lemma 3 since in this case the necessary path cannot exist.

As a result, at least in the matter of crossing fitness valleys, the intermediate cases interpolate between the extreme cases.

However, as in the case of \(\ell =n\), it is still possible to take arbitrarily large steps in the macroscopic process or the limiting jump process, respectively. If there were a sequence of types, each at distance at most \(\ell \) from its predecessor, with rapidly increasing rates r, then each population could be overtaken by its faster-growing mutants before reaching the macroscopic level \(\mu ^0\).

Overall, the microscopic types play an important role in defining the limiting process.