The critical domain size of stochastic population models

Identifying the critical domain size necessary for a population to persist is an important question in ecology. Both demographic and environmental stochasticity impact a population’s ability to persist. Here we explore ways of including this variability. We study populations with distinct dispersal and sedentary stages, which have traditionally been modelled using a deterministic integrodifference equation (IDE) framework. Individual-based models (IBMs) are the most intuitive stochastic analogues to IDEs but yield few analytic insights. We explore two alternate approaches; one is a scaling up to the population level using the Central Limit Theorem, and the other a variation on both Galton–Watson branching processes and branching processes in random environments. These branching process models closely approximate the IBM and yield insight into the factors determining the critical domain size for a given population subject to stochasticity.


Introduction
Determining the size of the domain necessary for population persistence is a well known problem in ecology. First discussed in Kierstead and Slobodkin (1953), the critical domain size is the domain size required to independently sustain a population. Often discussed in the context of conservation, the critical domain size may inform spatial management decisions for species of concern that are experiencing habitat degradation or other negative anthropogenic effects. For example, for many marine species, marine protected areas have been found to allow resident species to increase in abundance as well as in biomass (Jamieson and Levings 2001), achieving both conservation and fishery management goals. For deterministic models, the critical domain size may be described as the smallest domain for which there exists a stable non-zero steady state. However, at low population densities, deterministic models may not accurately reflect extinction risks. For example, when the need for a reserve or management area arises, the population of interest is often already depleted and thus more vulnerable to variability both in demographic rates and in its environment. For populations subject to uncertainty, we broadly define the critical domain size as the size of the domain necessary to reach some accepted measure of persistence. One such measure is the probability of extinction of a population, that is, the probability of the total population size being zero by a given time. What is deemed to be an acceptable probability of extinction will vary depending on application (e.g. Burnham and Anderson 2002;Flather et al. 2011). In this work, we will use the probability of ultimate extinction, which is the limit of the probability of extinction as time tends to infinity. The simplest domain, and the one we consider in this work, is a one-dimensional domain of length L, which is both a good starting point to develop theory as well as biologically reasonable in certain habitats such as rivers, shorelines, or narrow valleys. We will also assume very harsh conditions outside of the domain (e.g. strong negative anthropogenic influence, uninhabitable landscapes) where no individuals survive. Thus the critical domain size estimated here may be conservative if individuals may survive outside of the domain.
Integrodifference equations While reaction-diffusion, partial differential equation models have been studied extensively for their insights into critical domain sizes (Reitzel et al. 2004;Skellam 1951), they assume continuous reproduction and dispersal rather than taking into account the seasonal reproductive strategies used by the species of interest (Lutscher 2008). We are interested in populations with discrete dispersal and reproductive phases, which are commonly modelled using integrodifference equations (IDEs). Many plants, for example, disperse as seeds for a short time and then remain in one place for the remainder of their lives. The majority of marine species have dispersing, planktonic larvae, with many of those species having comparatively sedentary adult lives (Lockwood et al. 2002). For these reasons, IDEs have been used extensively to model both plant and marine species (Andersen 1991;Lockwood et al. 2002;Mistro et al. 2005;White et al. 2008).
For simplicity, we consider only non-structured populations (i.e. all of the individuals disperse and then reproduce) in order to be able to determine the effects of stochasticity on the extinction dynamics without conflating the results with those caused by more complicated population structures. We assume the simplest case of density independent linear population growth in order to separate the effects of stochasticity from those due to any nonlinearities. We also note that many of the populations we are interested in will be well below the environmental carrying capacity and a linear growth rate is a close approximation to many nonlinear growth functions at very low densities in the absence of an Allee effect. We here model only females, assuming males are sufficient for reproduction.
An IDE satisfying the above assumptions on a domain of length L has the form N n (x) = L/2 −L/2 k(x, y)r N n−1 (y)dy, where N n (x) is the population density in generation n at location x ∈ R, r denotes the linear population growth rate and k(x, y) is the dispersal kernel, a probability density function describing the probability of movement of an individual during one time step from location y to x. For example, the Laplace dispersal kernel, arises from the assumption that organisms settle out of a dispersal phase (diffusion coefficient D) at a constant rate α (Neubert et al. 1995). As has been done previously (Reimer et al. 2016), we scale the dispersal kernels using the mean of the strictly positive distribution (i.e. the mean of k(x) for k ≥ 0) and will refer to this as the mean dispersal distance. The Laplace kernel has a mean value of √ D/α (Kot 1992;Lockwood et al. 2002) and here all units are scaled to this mean dispersal distance for simplicity. We will use the Laplace kernel in the following examples, as it is one of the few kernels for which an analytic solution to the critical domain size problem for this deterministic IDE exists (Reimer et al. 2016;Van Kirk and Lewis 1997).
Existing theory To the best of our knowledge, stochastic analogues to integrodifference equations have only been studied for calculating invasion speeds of travelling waves subject to variability. Kot et al. (2004) used density independent branching random walks to simulate a stochastic IDE with linear, density independent growth, determining that stochastic variation in dispersal and reproduction does not lower the asymptotic invasion speed. Gilbert et al. (2014) explored the effects of spatial variation in demographic and dispersal parameters caused by environmental variability on spreading speeds using an IDE framework. Travis et al. (2011) have argued have argued for, and explored the advantages of, using both analytical IDEs and individual-based approaches in the context of climate driven range expansions.
We also build on previous results on non-spatial stochastic population models (e.g. Engen and Saether 1998;Lande 1993;Leigh Jr 1981;Hiebeler 1997). The branching process methods we consider here have been motivated by, and applied widely to, studies of genetics (e.g . Feller 1951;Keiding 1975). A continuous-time analogue to our spatially implicit approximation method has been considered by Samia and Lutscher (2012) with an approximation to their model of dispersing individuals in the context of populations in streams. They approximated a partial differential equation model with a spatially implicit model using the model eigenvalues, and then constructed a continuous-time Markov chain to explore extinction probabilities and the time to extinction. In this work, we combine these results for non-spatial stochastic models with stochastic analogues to IDEs in order to understand the effect of domain size on persistence in single species spatial stochastic models.
Stochastic models We are interested in stochastic models analogous to IDEs for populations with discrete dispersal and life history events. In Sect. 2, we address the effects of demographic stochasticity, which affects the fitness of each individual in a population independently and arises from the variability inherent in deaths, dispersal, and reproductive events. We incorporate demographic stochasticity using random variables with given distributions ideally obtained through statistical study (Engen and Saether 1998;Lande 1993;Shaffer 1981). Demographic stochasticity is most important for populations with low numbers, as individual level fluctuations in growth and death rates most significantly affect subsequent generations for smaller populations.
We first develop a spatially explicit individual-based model (IBM), which is the most realistic way to incorporate uncertainty into an IDE framework. IBMs allow for flexibility in incorporating individual variation in demographic and dispersal rates and can be a powerful tool for exploring species' spatial dynamics (Bocedi et al. 2014). In order to obtain more general analytic results, we develop two spatially implicit approximations to the IBM. The first scales the model up to the population level using the Central Limit Theorem. The second approximation allows for analytic results and is based on a Galton-Watson branching process modified to include loss via dispersal outside of the domain. We use classical results on branching processes to determine the domain size necessary to achieve specified conservation goals. Both of these approximations make use of the modified dispersal success approximation of Reimer et al. (2016), which is based on the idea that we can avoid explicit spatial dependence by assuming that during each time step a fixed proportion of the population is lost via dispersal to areas outside the domain. We compare these results with both the corresponding deterministic IDE as well as simulations from the stochastic IBM through an illustrative example, though the results hold over a much broader parameter space.
In Sect. 3, we consider environmental stochasticity, which is important to populations of any size (Lande 1993). This stochasticity arises from fluctuations in the environment that are assumed to affect the demographic rates of all individuals in a population simultaneously and in a similar way (e.g. variable temperatures or changes in food availability) (Lande 1993;Leigh Jr 1981). We address the same question of critical domain size using similar methods but now allowing for environmental stochasticity as well. We reformulate our IBM to include dependence on a random environmental variable and again consider the dynamics approximated at the population level. The branching process framework used to investigate the effects of demographic stochasticity is modified to include environmental variation using the theory of branching processes in random environments. Both demographic and environmental variability are known to be important in natural populations (Lande and Orzack 1988;Melbourne and Hastings 2008) and the tools developed here allow for further exploration of their comparative effects on extinction risk.

Individual-based models with demographic stochasticity
Simulating each individual in a population via an IBM is one of the simplest and most intuitive ways to examine population dynamics. Unfortunately, results generated in this way are difficult to compare and analytic tools have not yet been developed to determine the probability of extinction, making it difficult to determine sensitivity to various factors such as initial population size or demographic rates. We describe a stochastic IBM analogous to an IDE in order to be able to compare our approximations to these simulation results.
Simulations begin at generation n = 0 with an initial population of size N 0 evenly distributed throughout a one-dimensional domain of length L. We consider only females, so each individual has r offspring where r is a random variable with probability mass function { p i } for i = 0, 1, 2, . . ., where p i is the probability of an individual having i female offspring in one reproductive period and p i = 1. Note that this process will not explicitly incorporate individual deaths. Hence p 1 is the probability that either the female produced one offspring and then died, or that she produced no offspring but survived to the next generation, and p 0 is the probability that an individual produced no offspring and did not survive until the next generation. Thus parental survival is included in the term "offspring". At the start of each generation, and for each individual, we first draw the random variable for the number of offspring produced from { p i }. Each of these offspring then disperses according to a dispersal kernel k(x, y) and if they remain within the domain, they are assumed to survive to the start of the next time step and the process is repeated.
In order to investigate the probability of extinction by a given generation or the ultimate probability of extinction, it is necessary to run many simulations of this model.

Spatially implicit population level model
As has been done for IDE models, we can approximate the population level outcomes of the IBM by disregarding the location of individuals inside the domain and only considering changes to the total population size via a spatially implicit approximation. We do this by approximating the proportion of the population that successfully remains within the domain following the dispersal process from time n − 1 to n with a scalar A. Note that throughout this work, a bar is used to denote population level variables. For the IDE model Eq.
(1), the total population size within the domainN n at time n has been approximated via the deterministic recursion relation where 0 ≤Ā ≤ 1 (Lutscher and Lewis 2004;Van Kirk and Lewis 1997). We apply this same concept when approximating the total population size of the stochastic IBM according toN The IBM assumes that the number of offspring of each individual in a population is a real-valued independent identically distributed (iid) random variable r with probability mass function { p i }, expected value E[r ] = ξ , and variance σ 2 r . To scale from the individual to the population level, we employ the Central Limit Theorem; for a population of sizeN n , the average number of offspring per capitarN converges to a normally distributed random variable with mean E[rN ] = ξ and variance Var(rN ) = σ 2 r /N n . NoterN cannot realistically take a negative value, so we assume the distribution of rN to be normally distributed as above but with all of the probabilities of a negative distribution assigned to zero and the distribution then appropriately rescaled.
In order to remove explicit spatial dependence of each individual, we consider a scalar A that is the probability that an individual remains within the domain following one dispersal period. We first naively assume individuals are evenly distributed throughout a one-dimensional domain of length L at any given time.
Note that this is only the correct formulation for symmetric dispersal kernels, as discussed in Lutscher and Lewis (2004). By the Central Limit Theorem, the proportion AN of the population of sizeN n that remains within the domain is a normally distributed random variable. For individuals assumed to be evenly distributed throughout the domain, E[ĀN ] = S, where S is the average dispersal success (Van Kirk and Lewis 1997): The variance of the distribution ofĀN is Var(ĀN ) = σ 2 S /N n , where σ 2 S is the variance of s(y) over the domain: While S provides a reasonable approximation to the average dispersal success of a population for a range of deterministic IDEs (this was explored in Reimer et al. 2016), our recent work has revisited the assumption that the population is evenly distributed throughout the domain. We use the modified dispersal success approximation S throughout this work, which we found in Reimer et al. (2016) consistently outperforms S as an approximation to the proportion of individuals retained in the domain following dispersal. To obtain S, we weight the dispersal success functions by s(y)/S when approximating the population's average successful dispersal rate. This is based on the assumption that more individuals are in the centre of the domain than near the edges and that s(y) provides a good approximation to the population distribution inside the domain (Reimer et al. 2016;Van Kirk and Lewis 1997). This modified average dispersal rate is: It has been shown for IDEs that S ≥ S (Reimer et al. 2016), since it weights areas that are thought to have more individuals more heavily, based on the idea that they not only have higher dispersal success but are also at a location that is more likely to receive settling individuals for symmetric dispersal kernels. At the population level,ĀN may be drawn from a normal distribution with E[ĀN ] = S and variance Var(ĀN ) = σ 2 S /N n , where σ 2 S is the variance of s(y) 2 /S over the domain: We call this the S population level approximation forĀN . Now in each generation we start with a population of sizeN n−1 , draw the values of the random variablesrN and AN according to their distributions as described above, and calculateN n according to Eq. (4). It is important to note here that we now redefine our notion of extinction from requiring the death of all individuals to requiring the population size to fall below a certain value. This is to account for the fact that even ifĀN is less than 1 for many generations, the population level approximation will only tend asymptotically towards zero but not achieve it. This threshold, which we will call the quasi-extinction threshold (e.g. Ginzburg et al. 1982;Lande 1993), will here be 1, so that ifN < 1, our population will be considered extinct. This threshold value of 1 will be used for all population level approximations in this work. We did not find our results on the efficacy of the approximations to be sensitive to the quasi-extinction threshold for values near 1; as expected, small increases in the quasi-extinction threshold reduced extinction times for the IBM and the approximations, and vice versa. We did not explore the effects of choosing a significantly larger quasi-extinction threshold.

A brief note on the time to extinction
The number of surviving offspring of an individual over a time step is determined by the product of the two positive random variables r and A. We could also consider B = Ar as a single independent random variable. Since A and r are independent, and Since we are here interested in the population level model, we can ignore the shape of the distribution of B and look to its average value. For a population of sizeN n , the population level growth rate over one generation isB For a deterministic model of the formN n+1 =BN n whereB is a constant scalar, ifB > 1, the population tends to infinity and forB < 1, the population tends to extinction. For the deterministic model withB < 1 and an initial population of sizē N 0 , we can calculate the time to extinction by settingB nN 0 = 1 (recall that the quasi-extinction threshold is 1) and solving for n as In the stochastic population level model whereB is a random variable as described above, the expected value ofN n isB nN 0 , but now even withB > 1, extinction may still occur. As in the deterministic model, linear dependence of n on ln(N 0 ) was found by Lande (1993) when considering the time to extinction of a continuous time stochastic model. This dependence on ln(N 0 ) seems to also hold here when we include space implicitly into our discrete time stochastic model ( Fig. 1). As predicted by the deterministic model, as well as the continuous time stochastic model of (Lande 1993), the average time to extinction scales linearly with lnN 0 . This plot was made from 10,000 simulations for each value ofN 0 with a quasi-extinction threshold ofN n < 1. All simulations in this work were performed using commercial software packages (MATLAB and Statistics Toolbox Release R2014b, The MathWorks, Inc., Natick, MA, USA)

Example of the population level approximation
Let offspring be produced according to the probability mass function { p 0 = 0.1, p 1 = 0.3, p 2 = 0.6}, so that E[rN ] = 1.5 and Var(rN ) = 0.45/N n , and suppose dispersal follows a Laplace distribution Eq. (2). If we assume that the average dispersal success approximation S describes the loss of individuals in one generation outside the domain of length L, thenĀN is a random variable chosen from a normal distribution with mean and variance according to Eqs. (8) and (9) respectively. This S population level approximation closely predicts the cumulative extinction probabilities (the probability of extinction by a given generation) of the IBM (Fig. 2). In the next section, we discuss the fact that the asymptotic extinction probability is strictly less than 1 (Sect. 2.3.2) for the spatially implicit branching process approximation when the domain is of a sufficient size; this appears to also be true for both the IBM and the population level approximations (Fig. 2).

Modified Galton-Watson branching process
While the population level simulations provide a reasonable and less computationally costly approximation to the IBM, they do not provide any new understanding of the driving processes or variables. We turn to a different approximation in order to be able to explore extinction probabilities and mechanisms analytically. We consider a modified Galton-Watson process adapted to suit our populations of interest by including an extra spatially implicit dispersal step following the reproduction process.
Using branching processes allows us to deal with a population of discrete individuals experiencing stochastic variation in both their dispersal and reproduction (Kot et al. 2004).

Galton-Watson branching processes
Originally motivated by the desire of the French aristocracy to predict the extinction of family names (Keiding 1975;Watson and Galton 1875), Galton-Watson branching processes have since been applied to studies of nuclear chain reactions, the survival of mutant genes, and population biology (Feller 1951). They are based on the assumption that individuals in each generation have r offspring, independently of each other, where r is an iid random variable with probability mass function { p i }, i = 0, 1, 2, . . .. Each p i is the probability that an individual produces i offspring in one generation, so p i = 1. This corresponds to the probability generating function f (s) = p i s i for s on the unit interval.
The population size is a sequence of random variables {Z n } describing the population size in the nth generation, given an initial population Z 0 = z. The mean reproductive rate is and the expected size of the nth generation is Unlike in similar deterministic models, there remains a chance of extinction even when the mean reproductive rate is greater than 1. The realized population size Z n+1 is found via a recursion relation that sums the descendants of each of the individuals in generation n, so that where r j,n is an integer-valued random variable describing the number of offspring of the jth individual in generation n (Watson and Galton 1875).
To understand the classical results on the extinction probabilities for Galton-Watson branching processes, we first consider the case where Z 0 = 1, since the results for Z 0 = z > 1 follow easily. The probability d n of extinction by time n of the lineage of one individual in the first generation can be found via the intuitive recursion relation Here (a) is the probability that the original individual had no offspring, (b) is the probability that it produced one offspring and that this individual's lineage went extinct in the next n − 1 generations, and similarly, (c) is the probability that it produced two offspring and that both of their lineages died out in the subsequent n − 1 generations, with the pattern continuing for all possible numbers of offspring (Bartlett 1960). The ultimate extinction probability d is the limit Clearly d n must reach a limit as n increases, since it is an increasing function in n and is bounded above by 1, so as it reaches this limit, d n−1 tends to d n . We thus find the asymptotic extinction probability by setting d n = d n−1 = d in Eq. (18) and solving where f (d) is the probability generating function in d. This always has a solution of d = 1, but may also possess further solutions less than 1. A trivial case occurs when each individual produces exactly one individual in the subsequent generation; in this case p 1 = 1 and the probability of extinction is d = 0 so the population size remains constant over time.
Disregarding the trivial case, Galton and Watson's classic results show that if the mean number of offspring produced by an individual ξ (recall that ξ = f (1)) is less than or equal to one (known as the subcritical and critical cases, respectively), then the population will die out almost surely. If ξ is larger than one (the supercritical case), then the extinction probability is strictly less than one and the population will tend to grow exponentially (Bartlett 1960;Keiding 1975).

A spatially implicit modified branching process
We consider the possibility that, in addition to the birth-and-death processes of the Galton-Watson model, a certain fraction F of each new generation is lost via dispersal outside the domain where they die before they can reproduce. Now the probability of extinction by generation n is dependent upon both the usual birth-death processes as well as the dispersal behaviour of the individuals. We maintain consistency with the IBM regarding the order of these two processes within one generation so that individuals first reproduce according to the probability generating function f (s) and then disperse, leaving the domain with probability F. We thus obtain a modified version of Eq. (18), Here (a) is the probability that the original individual had no offspring, (b 1 ) describes the probability that the first individual produced one surviving offspring and that this individual's lineage went extinct in the subsequent n − 1 generations, and (b 2 ) is the probability that the single offspring of the first individual dispersed to unfavorable habitat before it could reproduce in the next generation. Following intuitively, (c 1 ) is the probability that the original individual produced two offspring and both of their lineages died out in n − 1 generations, (c 2 ) is the probability that the two offspring dispersed to unfavourable habitat and were lost, and (c 3 ) is the probability that only one of them settled outside the domain while the other remained but their lineage died out in n − 1 generations, and the recursion continues on in this way.
We may re-write Eq. (21) as and so the probability of extinction as n → ∞ is found by solving for d in where f [(1 − F)d + F] is the probability generating function in (1 − F)d + F. This is seen by rearranging Eq. (23) as Trivially, if there is no good habitat, the probability of extinction by any time step will be 1, since d n = p 0 + p 1 + p 2 + · · · when F = 1 and so d = 1 is always a solution to Eq. (23). Whether there is a second, smaller solution is now determined not only by the reproductive probabilities but also by F.
To determine whether the process is supercritical, critical, or subcritical and thus determine whether extinction is certain we need to determine the expected value of offspring of this modified branching process. Definep i as the probability of an individual producing i offspring that successfully settle within the domain following their dispersal phase. Theñ where are multinomial coefficients. The corresponding probability generating function is and so the expected reproductive value is If we make the assumption that these sums do not run to infinity but rather to some maximum number of biologically possible offspring, then these summations become finite. Eq. (28) simplifies to which can be shown by induction for both the infinite or finite sum (Appendix). Thus the expected reproductive rate of the modified branching process is the product of the expected reproductive rate without the loss of individuals outside the domain and the proportion of individuals that successfully settle within the domain. Applying the results of Watson and Galton (1875), the criticality condition for the spatially implicit branching process is and from this we can determine the proportion of individuals F * that must be retained within the domain in order to avoid certain eventual extinction. For any F < F * , the population is supercritical and for any F ≥ F * , eventual extinction is certain. To find a suitable value for F, we recall the dispersal success approximation S of Eq. (8). We let F = 1 − S and compare the resulting extinction risks with those from the spatially explicit IBM simulation.

Example of modified branching process
Suppose an individual in a population of initial size Z 0 = z reproduces according to { p 0 = 0.1, p 1 = 0.3, p 2 = 0.6}, for a mean reproductive rate E[r ] = 1.5. Assume dispersal occurs according to a Laplace distribution, Eq. (2). As in Reimer et al. (2016), the critical domain size of the deterministic IDE model with logistic growth, an intrinsic growth rate of r = 1.5, and a Laplace dispersal kernel, is L * = 2.703 times the mean dispersal distance. For a domain of this size, S = 0.6647 (from Eq. (14)) and so E[r ] (1 − F) = E[r ] S = 0.9970 < 1. Thus the branching process predicts that the critical domain size of the deterministic model is too small to sustain a population subject to demographic stochasticity (Fig. 3).
We compare the probability of eventual extinction for a range of domain lengths around L * (Fig. 4). The branching process model predicts certain extinction until some critical length value, at which point the ultimate extinction probability becomes less than one. The greater the initial population, the sharper the reduction in the ultimate extinction probability for lengths larger than the critical value. The critical domain length at which the population's ultimate extinction probability becomes less than one and then solving for L using Eq. (14), obtaining L * S = 2.722 (Fig. 4). In conservation biology, we may be interested in the probability of extinction of groups of individuals rather than of a single individual's lineage. If we assume each individual's probability of extinction is independent, then the probability d n (z) of an initial population of size z going extinct by time n is If the probability of extinction of a single individual's lineage is less than one, then the probability of extinction for the entire population tends asymptotically toward 0 as z increases. The probability of ultimate extinction for an initial population of size z can similarly be determined by For the example above, the non-spatial probability of extinction is 0.17 when z = 1, as determined by Eq. (20). As the domain length L of our modified model tends to infinity, the results on the probability of extinction tend asymptotically towards those of the standard non-spatial Galton-Watson branching process. For any L > L * and an initial population greater than one, extinction is very unlikely, regardless of how much larger than L * the length of the domain is (Fig. 4). Thus under the assumptions of our model, for an initial population of a few hundred or thousand individuals, the domain need only be slightly larger than L * to ensure a very low chance of extinction. Increasing the domain length far beyond L * does little to further decrease the probability of extinction.
Alternatively, rather than determining the critical domain size for an individual and considering the population level implications, we could set a conservation goal d for a given population. For example, let us require an ultimate extinction probability of no more than 10 %. We know that with an initial population of z individuals we need d z = 0.10. Let us suppose z = 1000, so we require d ≤ 0.998. We solve Eq. (23) to obtain F = 0.333 so the domain needs to be large enough to retain the proportion of individuals 1 − F = 0.667. We let S = 0.667 and solve for the corresponding critical length L * S = 2.726 necessary to achieve the chosen conservation goal.

Environmental stochasticity
We now turn our attention to the effects of a variable environment on population dynamics. Unlike in the case of demographic stochasticity, where the variability is due to random fluctuations in individual growth and survival rates, variation in environmental conditions affects all individuals similarly and simultaneously. In order to examine the effect this type of variability has on extinction probabilities, we will allow the population growth rates to fluctuate as a stationary time series (for examples, see Athreya and Karlin 1971;Lande 1993;Smith and Wilkinson 1969).

Individual-based models with environmental stochasticity
There are many different ways in which we could reflect the variable nature of the environment in our model. One way is to add a noise term into an otherwise deterministic framework (Melbourne and Hastings 2008). This can be either additive or multiplicative depending on the model formulation and, in effect, it serves to alter the average birth rate. We choose to incorporate environmental variability directly into the birth rate of each individual, as this is straightforward within a branching process framework and also because of the implied understanding of environmental stochasticity. We take environmental variability to affect a population by changing the growth rate of every individual in the same way, by either shifting the mean and/or variance of the probability mass functions determining growth. We introduce a sequence of iid environmental variables {ς n }, for n = 0, 1, 2, . . . where ς n describes the environmental conditions in generation n selected from a countable number of environmental states ς j , j = 0, 1, 2, . . .. We denote the probability of a certain environment ς j occurring for a given generation as h j . Each ς j uniquely determines the probability generating function of the reproductive rates of each individual, which are now determined by the probability mass function { p j i } comprised of the probabilities of an individual having i surviving offspring, given environment ς j .
To incorporate this random environmental variable into our IBM, we begin each simulation with an initial population of size Z 0 = z evenly distributed throughout the domain at generation n = 0. We first draw a value for the environmental random variable for generation ς n , which corresponds to a set of reproductive probabilities described by the probability generating function f ς n . We then determine the number of each individual's offspring according to f ς n and finally select the position of the offspring after the dispersal period according to a chosen dispersal kernel. Individuals who settle within the domain survive and repeat this process over the next generation, while those who settle outside are considered lost. As with the IBM presented in Sect. 2.1, this is computationally costly for large populations and results for different reproductive and environmental probabilities are difficult to standardize or compare. We again rely on these results from the IBM on cumulative extinction probabilities and the critical domain length as a benchmark against which we measure and compare our spatially implicit approximations.

Population level models with environmental stochasticity
We again rescale the IBM up to the population level using the Central Limit Theorem. This rescaling best mimics the IBM when the population size is not too small, due to the constraints of the Central Limit Theorem, but even for an initial population of a single individual, we find that it closely predicts the extinction probabilities.
By the same assumptions as in Sect. 2.2, we approximate the dynamics at the population level by Eq. (4). BothrN andĀN are random variables drawn from the same types of distributions as in Sect. 2.2, the difference being that we now incorporate environmental variation. As in the IBM, individual reproductive probabilities in each generation are influenced by a random variable describing the environmental conditions over that generation ς n . This may affect both the mean and variance of the population's per capita growth raterN . As in Sect. 2.2, we approximate the proportion of the population that settles in the domainĀN by a normal distribution with mean S. We again require a quasi-extinction threshold (N < 1) since the population will never achieve zero unless eitherĀN orrN are zero, signifying a catastrophic collapse of the entire population in one generation.
(2) yields a close prediction of the probability of extinction of the IBM (Fig. 5). We use this S population level approximation to examine the difference between the probability of extinction of a population subject to both environmental and demographic stochasticity or to only one or the other. In this example, for a population subject to only demographic stochasticity, it is as though h 2 = 1 and h 0 = h 3 = 0. When the population is subject only to environmental stochasticity, Var i = 0 for all i values. Our model results agree with the intuitive notion that a population subject to multiple types of stochasticity has a higher probability of extinction (Fig. 6).

A modified branching process approximation to the IBM
We again approximate the IBM by a spatially implicit branching process in order to obtain analytic results on the probability of extinction. To include environmental stochasticity, we incorporate random environments into branching processes of the  Cumulative probability of extinction for the S population level approximation to the IBM under three different kinds of stochasticity as described in Sect. 3.2.1 for a domain of length L = 2.7. Observe that the combination of both demographic and environmental stochasticity results in the highest probabilities of extinction. We here set the quasi-extinction threshold asN n < 1. This figure was generated over 10,000 simulations for an initial population of ten individuals Galton-Watson type, known as branching processes in random environments (BPRE). This was first done for iid random environments (Smith and Wilkinson 1969) and subsequent results have been generalized to include any stationary ergodic sequence (Athreya and Karlin 1971). Unfortunately, as was shown by Smith and Wilkinson (1969) in their initial study, "the elegant functional equations that play such a vital role in the theory of the classical Galton-Watson process, and many published generalizations thereof, do not arise in the present study". Results on the probability of extinction by a given generation tend to be approximations rather than explicit expressions of expected results. For insight into approximations for the time to extinction, or extinction probabilities of supercritical BPREs, see Agresti (1975), Dekking (1987), D'Souza and Hambly (1997, Geiger and Kersting (2001), Grey and Zhunwei (1991), Grey and Zhunwei (1993), Wilkinson (1969).
We can, however, explicitly determine a criticality condition for the probability of ultimate extinction, where the long time behaviour changes from certain extinction to extinction with probability strictly less than one.

Standard BPREs
We briefly describe the general BPRE framework of Smith and Wilkinson (1969) in order to be able to modify it to include implicit spatial dependence. Consider a sequence of iid random variables {ς n }, each selected from countably many possible environments {ς i } according to the probability mass function {h i }. Take the reproductive probabilities of each individual to be dependent on the environment in a given generation so that p i , the probability of an individual producing i offspring in one generation for a given environment ς j , is now p i (ς j ) = p j i . This results in a family of probability generating functions for each environment ς j , The expected number of offspring in a fixed environment ς j is then and {ξ n } is a sequence of iid random variables (Smith and Wilkinson 1969). Convention dictates that we assume P{ξ j < ∞} = 1 for all environments as well as P{ p j 0 + p j 1 < 1} > 0 for at least one value of j (that is, the generating function is strictly convex on the unit interval for at least one environment) Wilkinson (1969). {Z n } is a sequence of iid random variables whose state space is the non-negative integers describing the population size in the nth generation given an initial population size of Z 0 = z. We assume Z 0 = 1 for the remainder of this section unless otherwise stated. Smith and Wilkinson (1969) determined that certain extinction, which they termed "mortality", occurs for a given environmental state space if where the expectation is taken over all possible environments. The case where E[ln ξ ] = zero is called "critical" and cases where it is strictly less than 0, "subcriti-cal". "Supercritical" or "immortal" is used to describe the case when the probability of ultimate extinction is strictly less than one, and Smith and Wilkinson have shown that the following two conditions are both necessary and sufficient for supercriticality: where again the expectation is over all possible environments. The first condition intuitively corresponds to the deterministic requirement that the reproductive rate be greater than one to avoid stability of the zero steady state in a map. The second condition serves to ensure that "catastrophes" do not occur, wiping out the entire population in one time step (Smith and Wilkinson 1969).

A spatially implicit modified BPRE
We again incorporate the probability F of offspring loss due to dispersal outside the domain, but now within random environments. Note that here we assume F does not depend on environmental conditions. We formulate the probability generating functions incorporating F by grouping our growth rates according to how many offspring successfully settle inside the domain after the dispersal period, denoting these postdispersal rates asp and the expected reproductive value in a given environment ς j is again by induction (Appendix). As in the case with a constant environment (Sect. 2.3.2), the expected reproductive rate in a given environment is the product of the expected reproductive rate of the non-spatial model and the proportion of individuals that settle within the domain. Applying the conditions for extinction outlined in Eq. (37) for the non-spatial branching process, the criticality condition in the spatially implicit case is where the expectation is taken over all possible environments. For a given set of environmental conditions and corresponding reproductive rates, we can determine the critical proportion F * that can be lost from the domain before extinction is certain. For any F < F * , the population will be supercritical and for any F ≥ F * , eventual extinction occurs with probability one.

BPRE example
We build on our previous examples and assume an environmental state space of three possible environments, ς 1 , ς 2 , and ς 3 , which occur with probability h 1 = 0.4, h 2 = 0.4 and h 3 = 0.2, respectively. Each of these environments has a corresponding probability mass function for the random variable r , which describes the number of offspring of an individual. We assume that for the first environment, reproductive probabilities are { p 1 0 = 0.1, p 1 1 = 0.2, p 1 2 = 0.7}, for the second they are { p 2 0 = 0.1, p 2 1 = 0.3, p 2 2 = 0.6}, and for the third, { p 3 0 = 0.2, p 3 1 = 0.3, p 3 2 = 0.5}. This results in mean reproductive values ξ 1 = 1.6, ξ 2 = 1.5, and ξ 3 = 1.3 and respective variances Var 1 = 0.44, Var 2 = 0.45, and Var 3 = 0.61. The expected number of offspring ξ over all environments is Ignoring loss via dispersal, and so the non-spatial process is supercritical. To determine the criticality condition while implicitly including space, we incorporate loss of individuals outside the domain with probability F into our probabilities p j i , so that the probability generating function f j (s) now has spatially implicit coefficients as in Eq. (25). For our three environments, the above probabilities, and the knowledge that and so for the probability mass function with values h 1 , h 2 , h 3 as above, As expected, this is the sum of the non-spatial expected reproductive rate from Eq. (42) and ln(1 − F). We find the critical proportion F * from Eq. (40) by solving which results in F * = 0.331. We use S to approximate the proportion of individuals successfully settling within the domain, so S = 1 − F, and then solve for the corresponding domain lengths using Eq. (8) to obtain the branching process approximation to the critical domain size for an ultimate extinction probability strictly less than one. If we assume dispersal according to a Laplace kernel Eq. (2), then L = 2.744 (Fig. 7). Again, from Reimer et al. (2016), the critical domain size of the stochastic model is larger than that of the corresponding deterministic IDE model with an intrinsic reproductive value of 1.5, logistic growth and a Laplace dispersal kernel resulting in a critical domain size of L * = 2.703.

Discussion
The critical domain size deemed necessary for population persistence is highly dependent on model structure and assumptions. We here have considered populations that would typically be modelled using an IDE framework, with discrete dispersal and reproductive periods. We were interested in ways to include stochasticity -both demographic and environmental -into estimators of critical domain size for these populations.
First considering only demographic stochasticity, we created an IBM in order to simulate explicitly each individual's variable reproductive rate and also its random dispersal distance. While the IBM most closely mimics the assumptions of the deterministic IDE framework, it does not easily lend itself to comparison between different kernels or parameters. It is also slow to simulate for large populations due to the computational costs associated with having each individual disperse independently. We thus developed two approximations to the IBM, which both perform well for a variety of domain lengths and initial population sizes. Using the Central Limit Theorem, we found that scaling the IBM up to the population level results in a close approximation to the probability of extinction and provides faster simulation results. To obtain analytic results, we modified a Galton-Watson branching process to implicitly include space. This also closely approximated the IBM while allowing for closer examination of the affects of per capita growth rates and dispersal distances on population decline. IBMs rarely yield analytic insight and this spatially implicit branching process approximation allows us to scale individual variability up to obtain results at the population level.
Following the same model progression but now also including environmental stochasticity, we modified our IBM so that each individual reproduced and dispersed independently but now with the probability distributions of reproduction influenced by the environment for a given generation. We again used the Central Limit Theorem to obtain population level simulation results on the probability of extinction for various domain lengths. We then modified the branching process to include random environments using a BPRE framework. Unlike when we only considered demographic variability, existing results on BPREs do not allow us to determine analytically the long time probability of extinction or the probability of extinction by a given generation. Rather we must rely on the criticality condition, which separates the cases where extinction is certain from those where the population goes extinct with a probability less than one.
Both the population level, as well as the as well as the branching process, models relied on S to approximate the proportion of individuals successfully retained following each dispersal event. As was shown for the deterministic case in Reimer et al. (2016), this provides evidence that the details of the dispersal kernel are not important, but rather it is the proportion of the population retained in the domain that determines population persistence. This is consistent with the findings of Lockwood et al. (2002) who have shown that the tails of symmetric dispersal kernels are not important for the critical domain size of deterministic models. For asymmetric dispersal in deterministic models, see Lutscher et al. (2005) and Lutscher et al. (2010).
Regardless of whether we are considering the results from the IBM, the population level approximations, or the branching process models, the critical domain size of the stochastic models was always larger than that of the corresponding deterministic IDE models. This implies that the critical domain size of the deterministic model is not large enough to sustain a population subject to either demographic or environmental stochasticity. This disparity between the stochastic and deterministic models increased for decreasing mean growth rates and with greater variance in growth rates. As was shown by Watson and Galton (1875), certain extinction is determined only by the expected reproductive rate, however, if extinction is not certain, then the variance will affect the ultimate extinction probability. The probability of extinction by a given generation thus relies on both the expected reproductive rate and the variance in the probability mass function { p i }. The higher the expected reproductive rate, the lower the probability of extinction, and the greater the variance, the higher the probability of extinction.
From studying the branching process models, we concluded that once the domain is larger than the criticality condition dictates, any additional area added to the domain does not greatly affect the probability of extinction for large initial populations. This result followed from the assumption of independence of all individuals, which may not be true in real populations. Density dependent reproduction either at low or high densities may occur due to resource limitation or the effect of density on mating probabilities. In addition, when including environmental stochasticity we did not allow for the possibility that dispersal behaviour may be dependent on the environmental conditions for a given generation. Including environmental dependence of dispersal behaviour may add additional realism. Further extensions to these models could also include patchy domains connected via dispersal, as well as including stage or age structure into the populations (Caswell 2006). Lutscher and Lewis (2004) have made substantial progress obtaining analytic results for stage-structured populations on patchy domains modelled with integrodifference matrix population models, as have Fagan and Lutscher (2006). We have not, however, explored these extensions here, and while an extension to patchy domains seems intuitively straightforward, the addition of stage-structure may not lend itself to the branching process approximations.
Because both the population level and branching process approximations rely on S rather than on a comprehensive understanding of the dispersal kernel, it would be just as easy to have both the population level approximations and the branching process models represent a domain in two or three dimensions, rather than along a onedimensional domain. Given empirical data, S could be parametrized in this way, rather than requiring an explicit form for k(x, y), which may be beneficial in systems where the mechanistic underpinnings of dispersal are unknown. For our modified BPRE, it is sufficient if experimental information can be obtained on both reproductive rates in a range of environments and the proportion of individuals retained within the area of interest.
In spite of the fact that analytic results on the critical domain size of the IBM do not exist, we were able to use branching processes to address the questions of critical domain size for populations subject to demographic and environmental stochasticity. While the population level approximations provided faster simulation results and insight into the efficacy of using the S approximation to dispersal success in this framework, it is the branching process approximations that allowed for insight into the affects of varying demographic rates, dispersal distances, and environments. Demographic and environmental variability are known to be important in determining the population persistence and these tools may aid in understanding their comparative effects.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

Appendix: Proof of Eqs. (29) and (39)
Here we prove from Eq. (29) using induction. This proof also holds for Eq. (39), with a change of notation to represent the dependence of ξ on the environment ζ j . We show that for all k ∈ N, First, let k = 1 (i.e. a parent either has zero or one offspring in a generation). Then Next, assume that Eq. (46) holds for any k ∈ N, say Therefore, .
We now show that (a) = (k + 1) p k+1 (1 − F) for all k. This problem can be simplified to the proof that If we here let p = q − 1, and n = k − 1, this becomes n p=0 n! p!(n − p)!
(1 − F) p F n− p , which is the binomial formula for [(1 − F) + F] n , which is 1. So now from Eq. (46), we have that