An approximation of populations on a habitat with large carrying capacity

We consider the stochastic dynamics of a population which starts from a small colony on a habitat with large but limited carrying capacity. A common heuristic suggests that such a population initially grows as a Galton–Watson branching process and then its size follows an almost deterministic path until it reaches the maximum sustainable by the habitat. In this paper we put forward an alternative and, in fact, more accurate approximation, which suggests that the population size behaves as a special nonlinear transformation of the Galton–Watson process from the very beginning.

1. Introduction

1.1. The model. A large population often starts from a few individuals who colonize a new habitat. Initially, with abundant resources and no competition, it grows rapidly until reaching the carrying capacity. The population then fluctuates around the carrying capacity for a very long period of time until, by chance, it eventually dies out; see, e.g., [6], [7].
This cycle is captured by a stochastic model of a density dependent branching process Z = (Z_n, n ∈ Z_+) generated by the recursion

Z_n = ∑_{j=1}^{Z_{n−1}} ξ_{n,j}, n ∈ N, (1)

started at an initial colony size Z_0. The random variables ξ_{n,j} take nonnegative integer values and, for each n ∈ N, are conditionally i.i.d. given the σ-algebra F_{n−1} generated by all previous generations. The object of our study is the density process Z̄_n := Z_n/K of the population relative to the carrying capacity parameter K > 0. The common distribution of the random variables ξ_{n,j} is assumed to depend on the density Z̄_{n−1}:

P(ξ_{n,j} = ℓ | F_{n−1}) = p_ℓ(Z̄_{n−1}), ℓ ∈ Z_+, (2)

and is determined by the functions p_ℓ : R_+ → [0, 1]. Both processes Z and Z̄ are indexed by K, but this dependence is suppressed in the notation. The mean and the variance of the offspring distribution when the density process has value x are denoted by

m(x) := ∑_ℓ ℓ p_ℓ(x) and σ²(x) := ∑_ℓ (ℓ − m(x))² p_ℓ(x), (3)

assumed to exist. Consequently, E(ξ_{n,1} | F_{n−1}) = m(Z̄_{n−1}) and Var(ξ_{n,1} | F_{n−1}) = σ²(Z̄_{n−1}).
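Although not part of the paper, the model (1)–(3) is straightforward to simulate. A minimal numerical sketch, assuming, purely for illustration, Poisson offspring with density dependent mean m(x) = ρ^{1−x}, so that m(0) = ρ > 1 and m(1) = 1; the parameter values ρ = 2, Z_0 = 5 and K = 10⁴ below are arbitrary:

```python
import numpy as np

def simulate_density(K, rho=2.0, Z0=5, steps=60, seed=1):
    """Density process Zbar_n = Z_n / K of a density dependent branching
    process with Poisson offspring of mean m(x) = rho**(1 - x)
    (a hypothetical choice with m(0) = rho > 1 and m(1) = 1)."""
    rng = np.random.default_rng(seed)
    Z = Z0
    path = [Z / K]
    for _ in range(steps):
        # Z_n is a sum of Z_{n-1} i.i.d. Poisson offspring, hence Poisson itself
        Z = rng.poisson(Z * rho ** (1.0 - Z / K))
        path.append(Z / K)
    return path
```

A typical trajectory grows geometrically at rate ρ while the density is small and then fluctuates around the capacity, i.e. around density 1, in accordance with the description above.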
If the offspring mean function satisfies

m(x) > 1 for x < 1, m(1) = 1, m(x) < 1 for x > 1, (4)

the process Z has supercritical reproduction below the capacity K, critical reproduction at K and subcritical reproduction above K. Thus a typical trajectory of Z grows rapidly until it reaches the vicinity of K. It then stays there, fluctuating, for a very long period of time and eventually goes extinct if p_0(x) > 0 for all x ∈ R_+.
Thus the lifespan of such a population roughly divides into the emergence stage, during which the population establishes itself, the quasi-stationary stage around the carrying capacity, and the decline stage, which ends with inevitable extinction.

REMARK 1. While (4) is typical for populations with a quasi-stable equilibrium at the capacity, it is not needed in the proofs and will not be assumed in what follows.

1.2. Large initial colony.
A more quantitative picture can be obtained by considering the dynamics of the density process, derived from (1) by setting f(x) := x m(x), dividing by K and rearranging:

Z̄_n = f(Z̄_{n−1}) + (1/K) ∑_{j=1}^{Z_{n−1}} (ξ_{n,j} − m(Z̄_{n−1})). (5)

The second term on the right has zero mean and conditional variance Z̄_{n−1} σ²(Z̄_{n−1})/K. Consequently (5) can be viewed as a deterministic dynamical system perturbed by small noise of order¹ O_P(K^{−1/2}). If the initial colony size is relatively large, i.e., proportional to the carrying capacity, Z_0 = [x_0 K] for some x_0 > 0, then Z̄_n converges to x_n as K → ∞, where x_n follows the unperturbed deterministic dynamics

x_n = f(x_{n−1}), n ∈ N, (6)

started at x_0. If (4) is assumed, x = 1 is the stable fixed point of f and if, in addition, f is an increasing function, then the sequence x_n increases to 1 with n when x_0 < 1. This limit also implies that the probability of early extinction tends to zero as K → ∞. Moreover, the stochastic fluctuations about the deterministic limit converge to a Gaussian process: √K(Z̄_n − x_n) converges weakly to V_n, where V_n satisfies the recursion, [14],

V_n = f′(x_{n−1}) V_{n−1} + σ(x_{n−1}) √(x_{n−1}) W_n,

with i.i.d. N(0, 1) random variables W_n. Roughly speaking, this implies that when K is large, Z_n grows towards the capacity K along the deterministic path K x_n and its fluctuations are of order O_P(K^{1/2}):

Z_n = K x_n + O_P(K^{1/2}). (7)

If p_0(x) > 0 for all x ∈ R_+ and (4) is imposed, zero is an absorbing state and hence the population eventually goes extinct. Large deviations analysis, see for example [15], [13], and [11], [10], shows that the mean of the time to extinction τ_e = inf{n ≥ 0 : Z_n = 0} grows exponentially with K. In this paper we are concerned with the establishment stage of the population, which occurs well before the ultimate extinction, on the time scale of log K.

¹ The usual notation for probabilistic orders is used throughout. In particular, ζ(K) = O_P(α(K)) for a sequence of random variables ζ(K) and a sequence of numbers α(K) means that ζ(K)/α(K) is bounded in probability.
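The deterministic approximation above can be checked numerically: started from a colony proportional to K, the density process tracks the orbit of (6) with an error that shrinks as K grows. A sketch, assuming for illustration Poisson offspring with mean m(x) = ρ^{1−x}, i.e. f(x) = x ρ^{1−x}:

```python
import numpy as np

def f(x, rho=2.0):
    # f(x) = x * m(x) for the hypothetical mean m(x) = rho**(1 - x)
    return x * rho ** (1.0 - x)

def max_gap(K, x0=0.1, steps=20, rho=2.0, seed=2):
    """Maximal distance between the density process started from
    Z_0 = x0 * K and the deterministic orbit x_n = f(x_{n-1})."""
    rng = np.random.default_rng(seed)
    Z, x, gap = int(x0 * K), x0, 0.0
    for _ in range(steps):
        Z = rng.poisson(Z * rho ** (1.0 - Z / K))  # stochastic step
        x = f(x, rho)                              # deterministic step
        gap = max(gap, abs(Z / K - x))
    return gap
```

In accordance with the O_P(K^{−1/2}) order of the noise term, the gap for K = 10⁶ should typically be an order of magnitude smaller than for K = 10⁴.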
1.3. Small initial colony. When Z_0 is a fixed integer, say Z_0 = 1, then Z_0/K → x_0 = 0 and, since f(0) = 0, the solution to (6) is x_n = 0 for all n ∈ N. In this case the approximation (7) ceases to provide useful information. An alternative way to describe the stochastic dynamics in this setting was suggested recently in [3], [4], [5]. It is based on the long known heuristic [12], [17], [16], according to which such a population initially behaves as a Galton–Watson branching process and, if it manages to avoid extinction at this early stage, continues to grow towards the carrying capacity following an almost deterministic curve.
This heuristic is made precise in [5] as follows. We couple Z to a supercritical Galton–Watson branching process Y = (Y_n, n ∈ Z_+) with offspring distribution identical to that of Z at zero density,

Y_n = ∑_{j=1}^{Y_{n−1}} η_{n,j}, P(η_{n,j} = ℓ) = p_ℓ(0). (8)

This coupling is defined under assumption (a1) below in Section 3.2.
Denote ρ := m(0) > 1, define n_c := n_c(K) = [log_ρ K^c] for some c ∈ (1/2, 1) and let Ȳ_n := Y_n/K be the density of Y. Then Z̄_n = Z_n/K is approximated in [5] by f^{n−n_c}(Ȳ_{n_c}), where f^k stands for the k-th iterate of f. As is well known [1],

ρ^{−n} Y_n → W a.s. as n → ∞,

where W is an a.s. finite random variable. Moreover, under certain technical conditions on f, the limit

H(x) := lim_{n→∞} f^n(x/ρ^n) (9)

can be shown to exist and define a continuous function.
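The limit (9) is easy to evaluate numerically: for moderately large n, the iterates f^n(x/ρ^n) stabilize to machine precision. A sketch for the hypothetical Ricker-type map f(x) = x ρ^{1−x} with ρ = 2:

```python
def f(x, rho=2.0):
    # hypothetical Ricker-type map with f'(0) = rho
    return x * rho ** (1.0 - x)

def H(x, n=40, rho=2.0):
    """Approximates H(x) = lim_n f^n(x / rho^n); the truncation
    error decays geometrically in n."""
    y = x / rho ** n
    for _ in range(n):
        y = f(y, rho)
    return y
```

One can check that H(0) = 0, that H(x)/x → 1 as x → 0 (consistent with H′(0) = 1), and that, for this particular f, H(x) approaches the capacity 1 as x → ∞.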
In particular, this result implies that when K is a large integer power of ρ and n_1 := log_ρ K, the distribution of Z̄_{n_1} is close to that of H(W). Moreover, for n ≥ n_1, Z̄_n is approximated by x_{n−n_1}, where x_n solves (6) started from the random initial condition H(W). This approximation also captures the early extinction event, since H(0) = 0 and P(W = 0) = P(lim_n Y_n = 0), the extinction probability of the Galton–Watson process Y.
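The extinction probability P(W = 0) above is the classical Galton–Watson extinction probability, the smallest fixed point of the offspring generating function. A sketch, assuming, hypothetically, Poisson(ρ) offspring with ρ = 2, comparing a Monte Carlo estimate with the fixed point of s ↦ e^{ρ(s−1)}:

```python
import numpy as np

def extinction_mc(rho=2.0, trials=20_000, horizon=30, seed=5):
    """Monte Carlo frequency of extinction for a Galton-Watson process
    with Poisson(rho) offspring, started from a single individual."""
    rng = np.random.default_rng(seed)
    extinct = 0
    for _ in range(trials):
        y = 1
        for _ in range(horizon):
            if y == 0:
                break
            y = rng.poisson(rho * y)  # next generation size
        extinct += y == 0
    return extinct / trials

def extinction_fp(rho=2.0, iters=200):
    """Smallest fixed point of the generating function s -> exp(rho*(s-1)),
    obtained by monotone iteration from s = 0."""
    s = 0.0
    for _ in range(iters):
        s = np.exp(rho * (s - 1.0))
    return s
```

For ρ = 2 the fixed point is approximately 0.203, and the empirical extinction frequency agrees with it up to sampling error.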
1.4. This paper's contribution. In this paper we address the question of the rate of convergence in (10). Note that if the probabilities in (2) are constant with respect to x, then f(x) = ρx, consequently H(x) = x, and the processes Z and Y coincide. In this case the discrepancy is of order O_P(K^{−1/2}), where the order of convergence is implied by the CLT for the Galton–Watson process [8], by which √(ρ^n)(ρ^{−n} Y_n − W) converges in distribution to a mixed normal law as n → ∞. Thus it can be expected that at best the sequence in (10) is of order O_P(K^{−1/2}) as K → ∞. However, the best rate of convergence in the approximation of Theorem 2 described above is achieved with c = 5/8, and it is only O_P(K^{−1/8} log K). This can be seen from a close examination of the proof in [5].
The goal of this paper is to put forward a different approximation with a much faster rate of convergence, of order O_P(K^{−1/2} log K). This is still slower than the rate achievable in the density independent case, but only by a logarithmic factor. The new proof reflects a better understanding of the population dynamics at the emergence stage: a sharper approximation is, in fact, given by the Galton–Watson process transformed by the nonlinear function H arising in the deterministic dynamics (9).
It is not clear at the moment whether the log K factor is avoidable and whether a central limit type theorem holds.These questions are left for further research.

2. The main result
We will make the following assumptions.
(a1) The offspring distribution F_x(t) = ∑_{ℓ≤t} p_ℓ(x) is stochastically decreasing with respect to the population density: for any y ≥ x ≥ 0,

F_y(t) ≥ F_x(t), t ∈ R_+.

(a2) The second moment of the offspring distribution, cf. (3), is L-Lipschitz for some constant L > 0.
(a3) The function f(x) = x m(x) has two continuous bounded derivatives and

‖f′‖_∞ = f′(0) = ρ > 1. (13)

REMARK 3. Assumption (a1) means that reproduction drops with population density. In particular, it implies that x ↦ m(x) is a decreasing function and hence f(y) − f(x) ≤ ρ(y − x) for all y ≥ x ≥ 0, which is only slightly weaker than (a3). The assumption (a2) is technical.

REMARK 4. The distribution of the process Z does not depend on the values of {p_ℓ(0), ℓ ∈ Z_+} for any K, while the distribution of W and, therefore, of H(W) does. This is not a contradiction since our assumptions imply continuity of x ↦ p_ℓ(x) at zero. Indeed,

∑_{t ∈ Z_+} (F_x(t) − F_0(t)) = m(0) − m(x) → 0 as x → 0,

where the convergence holds since m(x) is differentiable and a fortiori continuous at x = 0. By the stochastic order assumption (a1), F_x(t) − F_0(t) ≥ 0 for any t ≥ 0. Since both F_x and F_0 are discrete with jumps at integers, for any s ≥ 0,

0 ≤ F_x(s) − F_0(s) ≤ ∑_{t ∈ Z_+} (F_x(t) − F_0(t)).

This in turn implies that p_ℓ(x) → p_ℓ(0) as x → 0 for all ℓ.
EXAMPLE 7. The stochastic Ricker model [9] is given by a density dependent branching process with the offspring distribution

p_ℓ(x) = e^{−γx} q_ℓ, ℓ ≥ 1, and p_0(x) = 1 − e^{−γx},

where γ > 0 is a constant and (q_ℓ, ℓ ≥ 1) is a given probability distribution on the positive integers; thus no offspring are produced with probability 1 − e^{−γx}. This model satisfies the stochastic ordering assumption (a1). The mean value of the distribution (q_ℓ) is denoted by e^r, to emphasize the relation to the deterministic Ricker model. With such notation, m(x) = e^{r−γx}. Under the normalization m(0) = ρ and m(1) = 1 this becomes m(x) = ρ^{1−x}, i.e., γ = r = log ρ. A direct calculation verifies the assumptions (a2) and (a3).
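The computations of Example 7 can be verified by simulation. In the sketch below q is taken, hypothetically, as 1 + Poisson(ρ − 1), a distribution on {1, 2, …} with mean ρ = e^r, and γ = r = log ρ, so that m(x) = ρ^{1−x}:

```python
import numpy as np

def ricker_offspring(x, rho=2.0, size=200_000, seed=3):
    """Sample of offspring counts in the stochastic Ricker model:
    no offspring with probability 1 - exp(-gamma*x) = 1 - rho**(-x),
    otherwise a draw from q = 1 + Poisson(rho - 1) (hypothetical q)."""
    rng = np.random.default_rng(seed)
    reproduce = rng.random(size) < rho ** (-x)  # exp(-gamma*x), gamma = log(rho)
    return np.where(reproduce, 1 + rng.poisson(rho - 1.0, size), 0)
```

The empirical means at x = 0 and x = 1 match m(0) = ρ and m(1) = 1 up to sampling error.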

3. Proof of Theorem 5
We will construct the process Z defined in (1) and the Galton–Watson process Y from (8) on a suitable probability space so that Y_n ≥ Z_n for all n ∈ N and the trajectories of the two processes remain sufficiently close, at least for relatively small n (Section 3.2). We will then show that H is twice continuously differentiable (Section 3.1) and use Taylor's approximation to argue (Section 3.3) that Z̄_n and H(Ȳ_n) stay within O_P(K^{−1/2} log K) of each other up to times of order log_ρ K. This convergence combined with (11) implies the result. Below we write C for a generic constant whose value may change from line to line.

3.1. Properties of H.
In this section we establish the existence of the limit (9) under the standing assumptions and verify its smoothness. The proof of existence relies on a result on functional iteration shown in [2].

LEMMA 8 ([2, Lemma 1]). Let x_{m,n} be the sequence generated by the recursion

x_{m,n} = ρ x_{m−1,n} + C x_{m−1,n}², m = 1, 2, …,

subject to the initial condition x_{0,n} = x/ρ^n > 0, where ρ > 1 and C ≥ 0 are constants. There exists a locally bounded function ψ : R_+ → R_+ such that x_{n,n} ≤ ψ(x) for any n.

Throughout we will use the notation H_n(x) := f^n(x/ρ^n).
LEMMA 9. Under assumption (a3) there exist a continuous function H : R_+ → R_+ and a locally bounded function g : R_+ → R_+ such that

|H(x) − H_n(x)| ≤ g(x) ρ^{−n}, n ∈ N, x ∈ R_+.

PROOF. By assumption (a3), ‖f″‖_∞ < ∞ and hence for any y ≥ 0

f(y) ≤ ρ y + ½ ‖f″‖_∞ y², (14)

so that the iterates f^m(x/ρ^n) are dominated by the sequence x_{m,n} of Lemma 8 with C := ½ ‖f″‖_∞ and x_{0,n} = x/ρ^n. By Lemma 8 there exists a locally bounded function ψ such that for any m ≤ n

f^m(x/ρ^n) ≤ ρ^{m−n} ψ(x). (15)

Next,

H_{n+1}(x) − H_n(x) = f^n( f(x/ρ^{n+1}) ) − f^n( x/ρ^n ). (16)

Since f has bounded second derivative and f′(0) = ρ, cf. (13),

| f(x/ρ^{n+1}) − x/ρ^n | ≤ ½ ‖f″‖_∞ (x/ρ^{n+1})².

Plugging this bound into (16) and iterating the ρ-Lipschitz bound for f n times, we obtain

|H_{n+1}(x) − H_n(x)| ≤ ρ^n · ½ ‖f″‖_∞ x² ρ^{−2n−2} = g̃(x) ρ^{−n}, where g̃(x) := ½ ‖f″‖_∞ x² ρ^{−2}.

Thus the limit H(x) = lim_{n→∞} f^n(x/ρ^n) exists and satisfies the claimed bound with g(x) := g̃(x)/(1 − ρ^{−1}). Continuity of H follows since the H_n are continuous for each n and the convergence is uniform on compacts. □
COROLLARY 10. f is topologically semiconjugate to its linearization at the origin:

f(H(x)) = H(ρ x), x ∈ R_+. (17)

PROOF. Since f is continuous,

f(H(x)) = lim_{n→∞} f^{n+1}( (ρx)/ρ^{n+1} ) = H(ρ x). □

The next lemma shows that H is strictly increasing in a vicinity of the origin and is therefore a local conjugacy.

LEMMA 11. There exists an a > 0 such that H is strictly increasing on [0, a] and

f(y) = H( ρ H^{−1}(y) ), y ∈ [0, H(a)].

PROOF. Let c := ‖f″‖_∞ and r := ρ/c; then

f′(y) ≥ ρ − c y > 0, y ∈ [0, r).

Since f is ρ-Lipschitz and f(0) = 0, f^j(x/ρ^n) ≤ ρ^{j−n} x for any j = 1, …, n and x ∈ [0, r), and hence for all x ∈ [0, r)

H_n′(x) = ρ^{−n} ∏_{j=0}^{n−1} f′( f^j(x/ρ^n) ) ≥ ∏_{j=0}^{n−1} ( 1 − (c/ρ) ρ^{j−n} x ) ≥ 1 − x/( r(ρ − 1) ),

where we used the Bernoulli inequality. Thus we can choose a number a ∈ (0, r) with a < r(ρ − 1), so that H_n′ ≥ 1 − a/(r(ρ−1)) > 0 on [0, a] uniformly in n. Taking the limit n → ∞ implies that H is strictly increasing on [0, a]. Being continuous, H is invertible there and the claimed identity follows from Corollary 10. □
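The semiconjugacy of Corollary 10 can be tested numerically: f(H(x)) and H(ρx) agree up to the truncation error of the limit (9). A sketch for the hypothetical map f(x) = x ρ^{1−x}:

```python
def f(x, rho=2.0):
    return x * rho ** (1.0 - x)  # hypothetical map with f'(0) = rho

def H(x, n=40, rho=2.0):
    y = x / rho ** n
    for _ in range(n):
        y = f(y, rho)
    return y

def conjugacy_gap(x, rho=2.0):
    """|f(H(x)) - H(rho*x)|; vanishes since H semiconjugates f
    to its linearization t -> rho*t at the origin."""
    return abs(f(H(x, rho=rho), rho) - H(rho * x, rho=rho))
```

For this f one also observes numerically that H is strictly increasing near the origin, as claimed in Lemma 11.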
REMARK 12. Under the additional assumption that f is strictly increasing on the whole of R_+, the function H is furthermore a global conjugacy, i.e., (17) holds on R_+.
The next lemma establishes differentiability of H.

LEMMA 13. H has a continuous derivative

H′(x) = ∏_{j=1}^{∞} f′( H(x/ρ^j) )/ρ, (18)

where the product converges uniformly on compacts.

PROOF.
Step 1. Let us first argue that the infinite product in (18) is well defined. By assumption (a3), f is ρ-Lipschitz and hence f^n is ρ^n-Lipschitz.
Consequently, H_n is 1-Lipschitz for all n ∈ N and so is H; this will be used in the proof on several occasions. Denote the partial products of (18) by

G_n(x) := ∏_{j=1}^{n} f′( H(x/ρ^j) )/ρ. (19)

Let c := ‖f″‖_∞ and r := ½ ρ/c; then, since f′(0) = ρ and f″ is bounded,

0 ≤ 1 − f′(y)/ρ ≤ (c/ρ) y, y ∈ [0, r]. (20)

For x > 0 define j(x) := [log_ρ(x/r)]. Since H is 1-Lipschitz and H(0) = 0, we have H(x/ρ^j) ≤ x/ρ^j ≤ r for any j > j(x), so that (20) yields 1 − f′(H(x/ρ^j))/ρ ≤ (c/ρ) x ρ^{−j} ≤ ½ and therefore

−log( f′(H(x/ρ^j))/ρ ) ≤ 2(c/ρ) x ρ^{−j}, j > j(x), (21)

where we used the inequality −log(1 − u) ≤ 2u, valid for all u ∈ [0, 1/2]. The partial products in (19) can be written as G_n(x) = T(x) exp(L_n(x)) with

T(x) := ∏_{j=1}^{j(x)} f′( H(x/ρ^j) )/ρ, L_n(x) := ∑_{j=j(x)+1}^{n} log( f′( H(x/ρ^j) )/ρ ).

In view of the estimate (21), L_n(x) converges to a limit L(x), and hence G_n(x) converges to G(x) := T(x) exp(L(x)) for any x > 0. Since the logarithms in L_n are nonpositive and |T(x)| ≤ 1,

|G_n(x) − G(x)| ≤ | exp(L_n(x)) − exp(L(x)) | ≤ |L_n(x) − L(x)|. (22)

For any R > 0 and all x ∈ [0, R] the estimate (21) implies |L_n(x) − L(x)| ≤ 2(c/ρ) R ρ^{−n}/(1 − ρ^{−1}), and thus, in view of the bound (22), G_n → G uniformly on [0, R]. Since G_n is continuous for any n, this uniform convergence implies that G is continuous as well.
Step 2. To show that H is differentiable and to verify the claimed formula for the derivative, it remains to show that the sequence of derivatives H_n′ converges to G uniformly on compacts. Fix an R > 0, define J(R) := [log_ρ(R/r)] and, for n > J(R), let

P_n(x) := ∏_{j=J(R)+1}^{n} f′( f^{n−j}(x/ρ^n) )/ρ and Q_n(x) := ∏_{j=J(R)+1}^{n} f′( H(x/ρ^j) )/ρ,

so that H_n′(x) = ∏_{j=1}^{J(R)} ( f′( f^{n−j}(x/ρ^n) )/ρ ) P_n(x). Since ‖f′‖_∞ = ρ, all these functions are bounded by 1. Since f′ is continuous and the convergence H_n → H is uniform on compacts, the first J(R) factors of H_n′ converge uniformly on [0, R] to the corresponding factors of G, and hence, to complete the proof, we need to show that

sup_{x ∈ [0,R]} |P_n(x) − Q_n(x)| → 0 as n → ∞. (24)

To this end, in view of Corollary 10, f^{n−j}( H(x/ρ^n) ) = H(x/ρ^j), and hence, f being ρ-Lipschitz,

| f^{n−j}(x/ρ^n) − H(x/ρ^j) | ≤ ρ^{n−j} | x/ρ^n − H(x/ρ^n) |. (25)

For all x ∈ (0, R] and j > J(R), both arguments of f′ in the products above are smaller than r, so (20) applies to each factor. Moreover, since H is 1-Lipschitz,

| x/ρ^n − H(x/ρ^n) | = | H( H^{−1}(x/ρ^n) ) − H(x/ρ^n) | ≤ | H^{−1}(x/ρ^n) − x/ρ^n |,

where the inverses are well defined for n ≥ k := [log_ρ(R/H(a))] + 1 and a is the constant guaranteed by Lemma 11. Moreover, the sequence of functions x ↦ ρ^n ( H^{−1}(x/ρ^n) − x/ρ^n ) is monotone in n for all n large enough, since H^{−1} is increasing near the origin, and it converges to zero pointwise. It now follows from Dini's theorem that this convergence is uniform:

sup_{x ∈ [0,R]} ρ^n | H^{−1}(x/ρ^n) − x/ρ^n | → 0 as n → ∞. (26)

The convergence in (24) holds since both Q_n and P_n are bounded by 1 and their factors approach each other at the rate given by (25) and (26).

LEMMA 14. H has a continuous second derivative.

PROOF. The partial products G_n in (18) satisfy

G_n′(x) = G_n(x) ∑_{j=1}^{n} ρ^{−j} ( f″/f′ )( H(x/ρ^j) ) H′(x/ρ^j),

where the convention 0/0 = 0 is used. By assumption (a3), f″/f′ is bounded uniformly on a vicinity of the origin. H′ is continuous by Lemma 13 and is therefore bounded on compacts. Hence the series converges compactly; by Lemma 13, so does G_n. Thus G_n′ converges compactly, its limit is continuous and coincides with H″(x). □
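The product formula (18) can be checked against a finite difference approximation of H′. A sketch for the hypothetical map f(x) = x ρ^{1−x}, whose derivative f′(x) = ρ^{1−x}(1 − x log ρ) satisfies f′(0) = ρ:

```python
import math

def f(x, rho=2.0):
    return x * rho ** (1.0 - x)

def fprime(x, rho=2.0):
    return rho ** (1.0 - x) * (1.0 - x * math.log(rho))

def H(x, n=40, rho=2.0):
    y = x / rho ** n
    for _ in range(n):
        y = f(y, rho)
    return y

def H_prime(x, terms=60, rho=2.0):
    """Truncation of the infinite product H'(x) = prod_j f'(H(x/rho^j))/rho;
    the factors tend to f'(0)/rho = 1 geometrically fast."""
    prod = 1.0
    for j in range(1, terms + 1):
        prod *= fprime(H(x / rho ** j, rho=rho), rho) / rho
    return prod
```

At the origin every factor equals 1, consistent with H′(0) = 1.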
3.2. The auxiliary Galton–Watson process. Let (U_{n,j} : n ∈ N, j ∈ Z_+) be an array of i.i.d. U([0, 1]) random variables and define

ξ_{n,j}(x) = F_x^{−1}(U_{n,j}) := min{ t ≥ 0 : F_x(t) ≥ U_{n,j} },

where F_x(t) is the offspring distribution function when the population density is x, cf. assumption (a1). Then P(ξ_{n,j}(x) = k) = p_k(x) for all k ∈ Z_+. Let η_{n,j} := ξ_{n,j}(0). By assumption (a1),

ξ_{n,j}(x) ≤ η_{n,j}, x ≥ 0. (28)

Let Z = (Z_n, n ∈ Z_+) and Y = (Y_n, n ∈ Z_+) be the processes generated by the recursions

Z_n = ∑_{j=1}^{Z_{n−1}} ξ_{n,j}(Z̄_{n−1}), Y_n = ∑_{j=1}^{Y_{n−1}} η_{n,j},

started from the same initial conditions Z_0 = Y_0 = 1. By construction these processes coincide in distribution with (1) and (8) respectively. Moreover, in view of (28), by induction,

Y_n ≥ Z_n, n ∈ Z_+. (29)

3.3. The approximation. In view of (11) and since H has a continuous derivative, to prove the assertion of Theorem 5 it remains to show that

max_{n ≤ n_1} | Z̄_n − H(Ȳ_n) | = O_P(K^{−1/2} log K).

The process Ȳ satisfies

Ȳ_n = ρ Ȳ_{n−1} + (1/K) ∑_{j=1}^{Y_{n−1}} (η_{n,j} − ρ).

By Taylor's approximation and in view of Corollary 10,

H(Ȳ_n) = f( H(Ȳ_{n−1}) ) + H′(ρ Ȳ_{n−1}) (1/K) ∑_{j=1}^{Y_{n−1}} (η_{n,j} − ρ) + R_n(K), (30)

where R_n(K) is the second order Taylor residual (31). Since ‖f′‖_∞ = ρ is assumed, f is ρ-Lipschitz. By subtracting equation (5) from (30) we obtain the bound

δ_n ≤ ρ δ_{n−1} + |ε_n^{(1)}| + |ε_n^{(2)}| + |ε_n^{(3)}| + |R_n(K)|

for δ_n := | H(Ȳ_n) − Z̄_n |, subject to δ_0 = | H(1/K) − 1/K |, where the terms ε_n^{(1)}, ε_n^{(2)}, ε_n^{(3)} collect the differences between the noise terms in (5) and (30); in particular, ε_n^{(1)} is proportional to (H′(ρ Ȳ_{n−1}) − 1) K^{−1} ∑_j (η_{n,j} − ρ). Consequently,

δ_{n_1} ≤ ρ^{n_1} δ_0 + ∑_{m=1}^{n_1} ρ^{n_1−m} ( |ε_m^{(1)}| + |ε_m^{(2)}| + |ε_m^{(3)}| + |R_m(K)| ), (33)

and it is left to show that the contribution of each term at time n_1 is of the claimed order.
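The coupling of Section 3.2 can be illustrated numerically: running Z and Y on a shared array of uniforms via the quantile transform F_x^{−1}(U_{n,j}) yields the pathwise domination Y_n ≥ Z_n of (29). A sketch for the stochastic Ricker offspring of Example 7 with the hypothetical choice q = 1 + Poisson(ρ − 1):

```python
import math
import random

RHO = 2.0
GAMMA = math.log(RHO)  # normalization m(0) = rho, m(1) = 1

def q_ppf(v, lam=RHO - 1.0):
    """Quantile function of q = 1 + Poisson(lam) (hypothetical q)."""
    k, term = 0, math.exp(-lam)
    cdf = term
    while cdf < v:
        k += 1
        term *= lam / k
        cdf += term
    return 1 + k

def offspring(u, x):
    """xi(x) = F_x^{-1}(u): zero with probability 1 - exp(-GAMMA*x),
    otherwise a q-quantile; nonincreasing in x for fixed u, cf. (a1)."""
    p = math.exp(-GAMMA * x)
    if u <= 1.0 - p:
        return 0
    return q_ppf((u - (1.0 - p)) / p)

def coupled_paths(K=500, steps=12, seed=4):
    """Run Z (density dependent) and Y (Galton-Watson) on shared uniforms."""
    rng = random.Random(seed)
    Z = Y = 1
    dominated = True
    for _ in range(steps):
        x = Z / K
        us = [rng.random() for _ in range(max(Z, Y))]
        Z = sum(offspring(u, x) for u in us[:Z])    # xi_{n,j}(x)
        Y = sum(offspring(u, 0.0) for u in us[:Y])  # eta_{n,j}
        dominated = dominated and Y >= Z
    return Z, Y, dominated
```

Because the quantile transform is monotone in x for each fixed uniform, the domination holds on every sample path, not merely in distribution.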

3.3.2. Contribution of R_n(K).
To estimate the residual R_n(K) defined in (31), let us show first that the family of random variables in (34) is bounded in probability as K → ∞. By equation (32), the residual is controlled by |H″| evaluated at a random point, so if H″ is bounded then (34) is obviously bounded. Let us proceed assuming that H″ is unbounded. Define ψ(M) := max_{x ≤ M} |H″(x)|. By continuity, ψ(M) is finite, continuous and increases to ∞. Let ψ^{−1} be its generalized inverse

ψ^{−1}(t) := inf{ M ≥ 0 : ψ(M) ≥ t }.

Since ψ is continuous and unbounded, ψ^{−1} is nondecreasing (not necessarily continuous) and ψ^{−1}(t) → ∞ as t → ∞. Then, for any R ≥ 0, an application of the union bound proves that (34) is bounded in probability, and the contribution of R_n(K) in (33) can be bounded accordingly.

3.3.3. Contribution of ε^{(3)}. By conditional independence of the η_{n,j}'s, the conditional variance of ε_n^{(3)} is controlled by the difference D_{n−1}, where, in view of (29), the sequence D_m := Y_m − Z_m ≥ 0 satisfies a linear recursion; the last bound in it holds in view of (29) and the well known formula for the second moment of the Galton–Watson process. Since D_0 = 0, it follows that the contribution of ε^{(3)} in (33) is of the required order.

3.3.4. Contribution of ε^{(2)}. By assumption (a2) and the coupling bound (28), each term ε_m^{(2)} is of order K^{−3/2} ρ^m. Hence ε^{(2)} contributes

∑_{m=1}^{n_1} ρ^{n_1−m} K^{−3/2} ρ^m = n_1 ρ^{n_1} K^{−3/2} ≤ C K^{−1/2} log K. □