
Protected polymorphisms and evolutionary stability of patch-selection strategies in stochastic environments

Published in: Journal of Mathematical Biology

Abstract

We consider a population living in a patchy environment that varies stochastically in space and time. The population is composed of two morphs (that is, individuals of the same species with different genotypes). In terms of survival and reproductive success, the associated phenotypes differ only in their habitat selection strategies. We compute invasion rates corresponding to the rates at which the abundance of an initially rare morph increases in the presence of the other morph established at equilibrium. If both morphs have positive invasion rates when rare, then there is an equilibrium distribution such that the two morphs coexist; that is, there is a protected polymorphism for habitat selection. Alternatively, if one morph has a negative invasion rate when rare, then it is asymptotically displaced by the other morph under all initial conditions where both morphs are present. We refine the characterization of an evolutionarily stable strategy for habitat selection from Schreiber (Am Nat 180:17–34, 2012) in a mathematically rigorous manner. We provide a necessary and sufficient condition for the existence of an ESS that uses all patches and determine when using a single patch is an ESS. We also provide an explicit formula for the ESS when there are two habitat types. We show that adding environmental stochasticity results in an ESS that, when compared to the ESS for the corresponding model without stochasticity, spends less time in patches with larger carrying capacities and possibly makes use of sink patches, thereby practicing a spatial form of bet hedging.


References

  • Anderson JT, Geber MA (2010) Demographic source-sink dynamics restrict local adaptation in Elliott’s blueberry (Vaccinium elliottii). Evolution 64:370–384

  • Beckmann JP, Berger J (2003) Using black bears to test ideal-free distribution models experimentally. J Mammal 84:594–606

  • Cantrell RS, Cosner C, DeAngelis DL, Padron V (2007) The ideal free distribution as an evolutionarily stable strategy. J Biol Dyn 1:249–271

  • Cantrell RS, Cosner C, Lou Y (2010) Evolution of dispersal and the ideal free distribution. Math Biosci Eng 7:17–36

  • Cantrell RS, Cosner C, Lou Y (2012) Evolutionary stability of ideal free dispersal strategies in patchy environments. J Math Biol 65:943–965

  • Chesson PL (2000) General theory of competitive coexistence in spatially-varying environments. Theor Popul Biol 58:211–237

  • Childs DZ, Metcalf CJE, Rees M (2010) Evolutionary bet-hedging in the real world: empirical evidence and challenges revealed by plants. Proc Royal Soc B Biol Sci 277:3055–3064

  • Cosner C (2005) A dynamic model for the ideal-free distribution as a partial differential equation. Theor Popul Biol 67:101–108

  • Cressman R, Křivan V (2006) Migration dynamics for the ideal free distribution. Am Nat 168:384–397

  • Cressman R, Křivan V (2010) The ideal free distribution as an evolutionarily stable state in density-dependent population games. Oikos 119:1231–1242

  • Cressman R, Křivan V, Garay J (2004) Ideal free distributions, evolutionary games, and population dynamics in multiple-species environments. Am Nat 164:473–489

  • Doncaster CP, Clobert J, Doligez B, Danchin E, Gustafsson L (1997) Balanced dispersal between spatially varying local populations: an alternative to the source-sink model. Am Nat 150(4):425–445

  • Dreisig H (1995) Ideal free distributions of nectar foraging bumblebees. Oikos 72:161–172

  • Edelaar P, Bolnick DI (2012) Non-random gene flow: an underappreciated force in evolution and ecology. Trends Ecol Evol 27:659–665

  • Ethier SN, Kurtz TG (2005) Markov processes: characterization and convergence. Wiley, Hoboken

  • Evans SN, Ralph P, Schreiber SJ, Sen A (2013) Stochastic growth rates in spatio-temporal heterogeneous environments. J Math Biol 66:423–476

  • Fox LR, Eisenbach J (1992) Contrary choices: possible exploitation of enemy-free space by herbivorous insects in cultivated vs. wild crucifers. Oecologia 89:574–579

  • Fretwell SD, Lucas HL Jr (1969) On territorial behavior and other factors influencing habitat distribution in birds. Acta Biotheor 19:16–36

  • Friedman A (1964) Partial differential equations of parabolic type. Prentice-Hall Inc., Englewood Cliffs

  • Gejji R, Lou Y, Munther D, Peyton J (2012) Evolutionary convergence to ideal free dispersal strategies and coexistence. Bull Math Biol 74:257–299

  • Geritz SAH, Metz JAJ, Kisdi E, Meszena G (1997) Dynamics of adaptation and evolutionary branching. Phys Rev Lett 78:2024–2027

  • Godin JJ, Keenleyside MHA (1984) Foraging on patchily distributed prey by a Cichlid fish (Teleostei, Cichlidae): a test of the ideal free distribution theory. Anim Behav 32:120–131

  • Harper DGC (1982) Competitive foraging in mallards: ideal free ducks. Anim Behav 30:575–584

  • Hastings A (1983) Can spatial variation alone lead to selection for dispersal? Theor Popul Biol 24:244–251

  • Haugen TO, Winfield IJ, Vøllestad LA, Fletcher JM, James JB, Stenseth NC (2006) The ideal free pike: 50 years of fitness-maximizing dispersal in Windermere. Proc Royal Soc B Biol Sci 273:2917–2924

  • Holt RD (1997) On the evolutionary stability of sink populations. Evol Ecol 11:723–731

  • Holt RD, Barfield M (2001) On the relationship between the ideal free distribution and the evolution of dispersal. In: Clobert J, Danchin E, Dhondt A, Nichols J (eds) Dispersal. Oxford University Press, Oxford, pp 83–95

  • Ikeda N, Watanabe S (1989) Stochastic differential equations and diffusion processes, vol 24, 2nd edn. North-Holland Publishing Co., Amsterdam

  • Jaenike J (1985) Genetic and environmental determinants of food preference in Drosophila tripunctata. Evolution 39:362–369

  • Jaenike J, Holt RD (1991) Genetic variation for habitat preference: evidence and explanations. Am Nat 137:S67–S90

  • Jansen VAA, Yoshimura J (1998) Populations can persist in an environment consisting of sink habitats only. Proc Nat Acad Sci USA 95:3696–3698

  • Kallenberg O (2002) Foundations of modern probability. Springer, New York

  • Katzenberger GS (1991) Solutions of a stochastic differential equation forced onto a manifold by a large drift. Ann Probab 19:1587–1628

  • Křivan V (1997) Dynamic ideal free distribution: effects of optimal patch choice on predator-prey dynamics. Am Nat 149:164–178

  • Le Gall J-F (1983) Applications du temps local aux équations différentielles stochastiques unidimensionnelles. In: Séminaire de probabilités XVII, Lecture notes in mathematics, vol 986. Springer, Berlin, pp 15–31

  • Li X, Mao X (2009) Population dynamical behavior of non-autonomous Lotka–Volterra competitive system with random perturbation. Discret Contin Dyn Syst 24:523–545

  • Liu M, Wang K, Wu Q (2011) Survival analysis of stochastic competitive models in a polluted environment and stochastic competitive exclusion principle. Bull Math Biol 73:1969–2012

  • Maynard Smith J, Price GR (1973) The logic of animal conflict. Nature 246:15–18

  • Mayr E (1963) Animal species and evolution. Harvard University Press, Cambridge

  • McPeek MA, Holt RD (1992) The evolution of dispersal in spatially and temporally varying environments. Am Nat 140:1010–1027

  • Milinski M (1979) An evolutionarily stable feeding strategy in sticklebacks. Zeitschrift für Tierpsychologie 51:36–40

  • Oksanen T, Power ME, Oksanen L (1995) Ideal free habitat selection and consumer-resource dynamics. Am Nat 146:565–585

  • Orians GH, Wittenberger JF (1991) Spatial and temporal scales in habitat selection. Am Nat 137:S29–S49

  • Prout T (1968) Sufficient conditions for multiple niche polymorphism. Am Nat 102:493–496

  • Ravigné V, Olivieri I, Dieckmann U (2004) Implications of habitat choice for protected polymorphisms. Evol Ecol Res 6:125–145

  • Robinson HS, Wielgus RB, Cooley HS, Cooley SW (2008) Sink populations in carnivore management: Cougar demography and immigration in a hunted population. Ecol Appl 18:1028–1037

  • Rogers LCG, Williams D (2000) Diffusions, Markov processes, and martingales: Itô calculus, vol 2, 2nd edn. Cambridge University Press, Cambridge

  • Rosenzweig ML (1981) A theory of habitat selection. Ecology 62:327–335

  • Schreiber SJ (2012) Evolution of patch selection in stochastic environments. Am Nat 180:17–34

  • Schreiber SJ, Benaïm M, Atchadé KAS (2011) Persistence in fluctuating environments. J Math Biol 62:655–683

  • Schreiber SJ, Fox LR, Getz WM (2000) Coevolution of contrary choices in host-parasitoid systems. Am Nat 155:637–648

  • Schreiber SJ, Fox LR, Getz WM (2002) Parasitoid sex allocation affects coevolution of patch selection in host-parasitoid systems. Evol Ecol Res 4:701–718

  • Schreiber SJ, Vejdani M (2006) Handling time promotes the coevolution of aggregation in predator-prey systems. Proc Royal Soc Biol Sci 273:185–191

  • Sokurenko EV, Gomulkiewicz R, Dykhuizen DE (2006) Source-sink dynamics of virulence evolution. Nat Rev Microbiol 4:548–555

  • Stroock DW (2008) Partial differential equations for probabilists, Cambridge studies in advanced mathematics, vol 112. Cambridge University Press, Cambridge

  • Tittler R, Fahrig L, Villard MA (2006) Evidence of large-scale source-sink dynamics and long-distance dispersal among Wood Thrush populations. Ecology 87:3029–3036

  • Tregenza T (1995) Building on the ideal free distribution. Adv Ecol Res 26:253–307

  • Turelli M, Schemske DW, Bierzychudek P (2001) Stable two-allele polymorphisms maintained by fluctuating fitnesses and seed banks: protecting the blues in Linanthus parryae. Evolution 55:1283–1298

  • van Baalen M, Křivan V, van Rijn PCJ, Sabelis MW (2001) Alternative food, switching predators, and the persistence of predator-prey systems. Am Nat 157:512–524

  • van Baalen M, Sabelis MW (1993) Coevolution of patch selection strategies of predator and prey and the consequences for ecological stability. Am Nat 142:646–670

  • Via S (1990) Ecological genetics and host adaptation in herbivorous insects: the experimental study of evolution in natural and agricultural systems. Ann Rev Entomol 35:421–446

  • Zhang Z, Chen D (2013) A new criterion on existence and uniqueness of stationary distribution for diffusion processes. Adv Differ Equ 2013:13


Acknowledgments

The authors thank Dan Crisan, Alison Etheridge, Tom Kurtz, and Gregory Roth for helpful discussions. S. N. Evans was supported in part by NSF grant DMS-0907639 and NIH grant 1R01GM109454-01. A. Hening was supported by EPSRC grant EP/K034316/1. S. J. Schreiber was supported in part by NSF grants EF-0928987 and DMS-1022639.

Author information

Correspondence to Alexandru Hening.

Appendices

Appendix A: Proof of Proposition 2.1

The stochastic differential equation for \(Z\) is of the form

$$\begin{aligned} d Z_t = b(Z_t) \, dt + \sigma (Z_t) \, dW_t, \end{aligned}$$
(7.1)

where \(b(z) {:=}\mu z - \kappa z^2\) and \(\sigma (z) {:=} \sigma z\). It follows from Itô’s existence and uniqueness theorem for strong solutions of stochastic differential equations that this equation has a unique strong solution up to possibly a finite but strictly positive explosion time.

Set \(R_t {:=} \log Z_t\) for \(t \ge 0\). By Itô’s lemma,

$$\begin{aligned} d R_t = \left( \mu - \frac{\sigma ^2}{2} - \kappa \exp (R_t)\right) \, dt + \sigma \, dW_t. \end{aligned}$$
(7.2)

It follows from the comparison principle of Ikeda and Watanabe (see Chapter VI, Theorem 1.1 of Ikeda and Watanabe (1989), Theorem 1.4 of Le Gall (1983), or Theorem V.43.1 of Rogers and Williams (2000)) that

$$\begin{aligned} R_t \le R_0 + \left( \mu - \frac{\sigma ^2}{2}\right) t + \sigma W_t, \end{aligned}$$
(7.3)

and so \(Z\) does not explode to \(+\infty \) in finite time. Moreover, since \(r \mapsto \mu - \kappa e^r\) is a bounded, uniformly Lipschitz function on \((-\infty ,0]\), it follows from Itô’s existence and uniqueness theorem that \(R\) does not explode to \(-\infty \) in finite time, so that \(Z\) does not hit \(0\) in finite time. We could also have established this result by using the scale function and speed measure calculated below to check Feller’s necessary and sufficient condition for a boundary point of a one-dimensional diffusion to be inaccessible; see Theorem 23.12 of Kallenberg (2002).

It is not hard to check using Itô’s lemma that an explicit solution of the SDE is

$$\begin{aligned} Z_t = \frac{Z_0 \exp ((\mu -\sigma ^2/2)t+\sigma W_t)}{1+Z_0 \kappa \int _0^t \exp ((\mu -\sigma ^2/2)s+\sigma W_s) \, ds}. \end{aligned}$$

We see from the inequality (7.3) that if \(\mu - \sigma ^2/2 < 0\), then \(\lim _{t \rightarrow \infty } Z_t = 0\) almost surely.

We use the theory based on the scale function and speed measure of a one-dimensional diffusion (see, for example, Chapter 23 of Kallenberg (2002) or Sections V.6-7 of Rogers and Williams (2000)) below to establish that \(Z\) is positive recurrent with a unique stationary distribution when \(\mu - \sigma ^2/2 > 0\). Similar calculations show that \(Z\) is null recurrent when \(\mu - \sigma ^2/2 = 0\), and hence \(\liminf _{t \rightarrow \infty } Z_t = 0\) almost surely and \(\limsup _{t \rightarrow \infty } Z_t = \infty \). It follows from (7.2) and the comparison principle that if \(Z'\) and \(Z''\) are two solutions of (7.1) with respective parameters \(\mu ',\kappa ',\sigma '\) and \(\mu '',\kappa '',\sigma ''\) satisfying \(\mu ' \le \mu ''\), \(\kappa '=\kappa ''\), \(\sigma ' = \sigma ''\) and the same initial conditions, then \(Z_t' \le Z_t''\). We will show below that

$$\begin{aligned} \lim _{t \rightarrow \infty } \frac{1}{t} \int _0^t Z_s \, ds = \frac{1}{\kappa } \cdot (\mu - \sigma ^2/2) \end{aligned}$$

almost surely when \(\mu - \sigma ^2/2 > 0\), and hence

$$\begin{aligned} \lim _{t \rightarrow \infty } \frac{1}{t} \int _0^t Z_s \, ds = 0 \end{aligned}$$

almost surely when \(\mu - \sigma ^2/2 = 0\).
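Although no numerics appear in the original argument, the trichotomy just described is easy to visualize with a simulation. The sketch below (parameter values are illustrative, not from the text) applies an Euler–Maruyama scheme to (7.2) for \(R_t = \log Z_t\): when \(\mu - \sigma ^2/2 > 0\) the time average of \(Z\) settles near \((\mu - \sigma ^2/2)/\kappa \), and when \(\mu - \sigma ^2/2 < 0\) the path decays to zero.

```python
import numpy as np

def simulate_logistic_sde(mu, kappa, sigma, z0, T=1000.0, dt=0.01, seed=0):
    """Euler-Maruyama for R_t = log Z_t, i.e. equation (7.2):
    dR = (mu - sigma^2/2 - kappa*exp(R)) dt + sigma dW.
    Returns the final value Z_T and the time average (1/T) * int_0^T Z_s ds."""
    rng = np.random.default_rng(seed)
    n = int(T / dt)
    r = np.log(z0)
    z_integral = 0.0
    for dw in sigma * np.sqrt(dt) * rng.standard_normal(n):
        z_integral += np.exp(r) * dt
        r += (mu - 0.5 * sigma**2 - kappa * np.exp(r)) * dt + dw
    return np.exp(r), z_integral / T

# mu - sigma^2/2 = 0.875 > 0: time average settles near 0.875/kappa
_, z_avg = simulate_logistic_sde(mu=1.0, kappa=1.0, sigma=0.5, z0=1.0)
# mu - sigma^2/2 = -0.4 < 0: the path collapses to zero
z_final, _ = simulate_logistic_sde(mu=0.1, kappa=1.0, sigma=1.0, z0=1.0)
```

Simulating on the log scale keeps the discretized state strictly positive, mirroring the change of variables used in the proof.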

We now identify the scale function and speed measure of the one-dimensional diffusion \(Z\). A choice for the scale function is

$$\begin{aligned} s(x)&= \int _c^x \exp \left( -\int _{a}^y \frac{2b(z)}{\sigma ^2(z)}\,dz\right) \,dy \nonumber \\&= \int _c^x \left( \frac{y}{a}\right) ^{-2\mu /\sigma ^2} e^{\frac{2\kappa }{\sigma ^2}(y-a)}\,dy \end{aligned}$$
(7.4)

for arbitrary \(a,c \in {\mathbb {R}}_{++}\) (recall that the scale function is only defined up to affine transformations). If we set \(\tilde{\sigma }= (\sigma s')\circ s^{-1}\), then

$$\begin{aligned} d s(Z_t) = \tilde{\sigma }(s(Z_t)) \, d\tilde{W}_t \end{aligned}$$

and the diffusion process \(s(Z)\) is in natural scale on the state space \(s({\mathbb {R}}_{++})\) with speed measure \(m\) that has density \(\frac{1}{\tilde{\sigma }^2}\).

The total mass of the speed measure is

$$\begin{aligned} m({\mathbb {R}}_{++})&= \int _{s({\mathbb {R}}_{++})} \frac{1}{\tilde{\sigma }^2(x)}\,dx=\int _{s({\mathbb {R}}_{++})} \frac{1}{((\sigma s')\circ s^{-1})^2(x)}\,dx =\int _0^\infty \frac{1}{\sigma ^2(u)s'(u)}\,du\nonumber \\&= \int _0^\infty \frac{1}{(\sigma u)^2\left( \frac{u}{a}\right) ^{-2\mu /\sigma ^2} e^{\frac{2\kappa }{\sigma ^2}(u-a)}}\,du\nonumber \\&= \frac{1}{\sigma ^2a^{2\mu /\sigma ^2}} \int _0^\infty u^{\frac{2\mu }{\sigma ^2} -2} e^{-\frac{2\kappa }{\sigma ^2}(u-a)}\,du. \end{aligned}$$
(7.5)

By Theorem 23.15 of Kallenberg (2002), the diffusion process \(Z\) has a stationary distribution concentrated on \({\mathbb {R}}_{++}\) if and only if the process \(s(Z)\) has \((-\infty ,+\infty )\) as its state space and the speed measure has finite total mass or \(s(Z)\) has a finite interval as its state space and the boundaries are reflecting. The introduction of an extra negative drift to geometric Brownian motion cannot make zero a reflecting boundary, so we are interested in conditions under which \(s({\mathbb {R}}_{++}) = (-\infty ,\infty )\) and the speed measure has finite total mass. We see from (7.4) and (7.5) that this happens if and only if \(\mu -\sigma ^2/2 >0\), a condition we assume holds for the remainder of the proof.

The diffusion \(s(Z)\) has a stationary distribution with density \(f{:=}\frac{1}{m({\mathbb {R}}_{++})\tilde{\sigma }^2}\) on \(s({\mathbb {R}}_{++}) = (-\infty ,+\infty )\), and so the stationary distribution of \(Z\) is the distribution on \({\mathbb {R}}_{++}\) that has density

$$\begin{aligned} g(x)&= f(s(x)) s'(x)\\&= \frac{1}{m({\mathbb {R}}_{++})\tilde{\sigma }^2(s(x))} s'(x)\\&= \frac{1}{m({\mathbb {R}}_{++})\sigma ^2(x)s'(x)}\\&= \frac{1}{m({\mathbb {R}}_{++})x^2\sigma ^2\left( \frac{x}{a}\right) ^{-2\mu /\sigma ^2} e^{\frac{2\kappa }{\sigma ^2}(x-a)}}, \quad x\in {\mathbb {R}}_{++}. \end{aligned}$$

This has the form of a \(\mathrm {Gamma}(k,\theta )\) density with parameters \(\theta {:=}\frac{\sigma ^2}{2\kappa }\) and \(k=\frac{2\mu }{\sigma ^2}-1\). Therefore,

$$\begin{aligned} g(x) = \frac{1}{\Gamma (k)\theta ^k}x^{k-1}e^{-\frac{x}{\theta }} = \frac{1}{\Gamma \left( \frac{2\mu }{\sigma ^2}-1\right) \left( \frac{\sigma ^2}{2\kappa }\right) ^{\frac{2\mu }{\sigma ^2}-1}}x^{\frac{2\mu }{\sigma ^2}-2}e^{\frac{-2\kappa x}{\sigma ^2}}, \quad x \in {\mathbb {R}}_{++}. \end{aligned}$$

Theorem 20.21 from Kallenberg (2002) implies that the shift-invariant \(\sigma \)-field is trivial for all starting points. The ergodic theorem for stationary stochastic processes then tells us that, if we start \(Z\) with its stationary distribution,

$$\begin{aligned} \lim _{t\rightarrow \infty }\frac{1}{t}\int _0^t h(Z_s)\,ds = \int _0^\infty h(x) g(x) \, dx \end{aligned}$$

for any Borel function \(h:{\mathbb {R}}_{++}\rightarrow {\mathbb {R}}\) with \(\int _0^\infty |h(x)|g(x) \, dx<\infty \). Since \(Z\) has positive continuous transition densities we can conclude that

$$\begin{aligned} \lim _{t\rightarrow \infty }\frac{1}{t}\int _0^t h(Z_s)\,ds = \int _0^\infty h(x) g(x) \, dx \end{aligned}$$

\(\mathbb P^x\)-almost surely for any \(x\in {\mathbb {R}}_{++}\).

In particular,

$$\begin{aligned} \int _{{\mathbb {R}}_{++}}x g(x) \, dx = k \theta = \frac{1}{\kappa } \cdot \left( \mu - \frac{\sigma ^2}{2}\right) . \end{aligned}$$
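As a numerical sanity check (not part of the proof), one can compare the closed-form mean \(k\theta \) with a direct quadrature of \(\int _0^\infty x\,g(x)\,dx\) for an illustrative parameter choice.

```python
import numpy as np
from math import gamma as gamma_fn

def gamma_mean_check(mu, kappa, sigma):
    """Mean of the Gamma(k, theta) stationary density with
    k = 2*mu/sigma^2 - 1 and theta = sigma^2/(2*kappa), compared against a
    trapezoidal approximation of int_0^infinity x*g(x) dx."""
    k = 2.0 * mu / sigma**2 - 1.0
    theta = sigma**2 / (2.0 * kappa)
    # integrate far enough into the Gamma tail that truncation is negligible
    x = np.linspace(1e-12, theta * (k + 40.0 * np.sqrt(k) + 40.0), 1_000_000)
    g = x**(k - 1.0) * np.exp(-x / theta) / (gamma_fn(k) * theta**k)
    integrand = x * g
    numeric = float(np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(x)))
    return k * theta, numeric
```

For \(\mu = 1\), \(\kappa = 1\), \(\sigma = 0.5\) both values agree with \((\mu -\sigma ^2/2)/\kappa = 0.875\).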

Appendix B: Proof of Theorem 4.1

To simplify our presentation, we re-write the joint dynamics of \(X\) and \(Y\) as

$$\begin{aligned} dX_t&= X_t \left( \mu \cdot \alpha - (aX_t + cY_t)\right) \,dt + \sigma _X X_t \,dU_t\\ dY_t&= Y_t \left( \mu \cdot \beta - (cX_t + bY_t)\right) \,dt + \sigma _Y Y_t \,dV_t,\nonumber \end{aligned}$$
(8.1)

where \(a{:=}\langle \alpha ,\alpha \rangle _\kappa , b {:=}\langle \beta ,\beta \rangle _\kappa , c{:=}\langle \alpha ,\beta \rangle _\kappa , \sigma _X{:=}\sqrt{\alpha \cdot \Sigma \alpha }\), and \(\sigma _Y{:=}\sqrt{\beta \cdot \Sigma \beta }\).

To prove Theorem 4.1, we need several preliminary results. First, we prove existence and uniqueness of solutions to the system (8.1) as well as a useful comparison result in Theorem 8.1. Second, in Proposition 8.3, we establish that \((X_t,Y_t)\) remains in \(\mathbb {R}_{++}^2=(0,\infty )^2\) for all \(t\ge 0\) whenever \((X_0,Y_0)\in \mathbb {R}_{++}^2\). Third, in Proposition 8.4, we show that weak limit points of the empirical measures \(\frac{1}{t}\int _0^t \mathbb {P}^{(x,y)}\{(X_s,Y_s)\in \cdot \} \, ds\) are stationary distributions for the process \((X,Y)\) thought of as a process on \({\mathbb {R}}_+^2\) (rather than \({\mathbb {R}}_{++}^2\)). Finally, we show that \(\lim _{t\rightarrow \infty } Y_t =0\) with probability one in Proposition 8.5 and conclude by showing that \(\frac{1}{t}\int _0^t \mathbb {P}^{(x,y)}\{(X_s,Y_s)\in \cdot \} \, ds\) converges weakly to \(\rho _{\bar{X}} \otimes \delta _0\) concentrated on \({\mathbb {R}}_{++} \times \{0\}\).

Theorem 8.1

The stochastic differential equation in (8.1) has a unique strong solution and \(X_t,Y_t\in L^p(\mathbb {P}^{(x,y)})\) for all \(t,p>0\) and all \((x,y) \in \mathbb {R}_{++}^2\). This solution satisfies \(X_t > 0\) and \(Y_t > 0\) for all \(t \ge 0\), \(\mathbb {P}^{(x,y)}\)-almost surely for all \((x,y) \in \mathbb {R}_{++}^2\). Let \(((\bar{X}_t ,\bar{Y}_t))_{t \ge 0}\) be the stochastic process defined by the pair of stochastic differential equations

$$\begin{aligned} d\bar{X}_t&= \bar{X}_t \left( \mu \cdot \alpha - a \bar{X}_t \right) \,dt + \sigma _X \bar{X}_t \,dU_t\\ d\bar{Y}_t&= \bar{Y}_t \left( \mu \cdot \beta - b\bar{Y}_t\right) \,dt + \sigma _Y \bar{Y}_t \,dV_t.\nonumber \end{aligned}$$
(8.2)

If \((X_0,Y_0)=(\bar{X}_0,\bar{Y}_0)\), then

$$\begin{aligned} X_t\le \bar{X}_t \end{aligned}$$

and

$$\begin{aligned} Y_t\le \bar{Y}_t \end{aligned}$$

for all \(t\ge 0\).

Proof

The existence and uniqueness of strong solutions are fairly standard; see, for example, Theorem 2.1 in Li and Mao (2009). One notes that the drift coefficients are locally Lipschitz, so strong solutions exist and are unique up to the explosion time. It is easy to show that this explosion time is almost surely infinite (see Theorem 2.1 in Li and Mao (2009)). Next, suppose that \(X_0 = \bar{X}_0\). We adapt the comparison principle of Ikeda and Watanabe (Chapter VI, Theorem 1.1 from Ikeda and Watanabe (1989)), proved by the local time techniques of Le Gall (see Theorem 1.4 from Le Gall (1983) and Theorem V.43.1 in Rogers and Williams (2000)), to show that \(\bar{X}_t - X_t \ge 0\) for all \(t \ge 0\).

Define \(\rho :{\mathbb {R}}_+\rightarrow {\mathbb {R}}_+\) by \(\rho (x)=|x|^2\). Note that

$$\begin{aligned}&\int _0^t \rho (|\bar{X}_s-X_s|)^{-1} \mathbf{1}{\{\bar{X}_s - X_s>0\} }\, d[\bar{X}- X]_s \\&\quad = \int _0^t \rho (|\bar{X}_s-X_s|)^{-1} (\sigma _X \bar{X}_s - \sigma _X X_s)^2 \mathbf{1}\{\bar{X}_s - X_s>0\}\,ds\\&\quad \le \sigma _X^2 t.\\ \end{aligned}$$

Since \(\int _{0+}\rho (u)^{-1}\,du=\infty \), by Proposition V.39.3 from Rogers and Williams (2000) the local time at 0 of \(X-\bar{X}\) is zero for all \(t\ge 0\). Put \(x^+ {:=} x \vee 0\). By Tanaka’s formula (see equation IV.43.6 in Rogers and Williams (2000)),

$$\begin{aligned} (X_t-\bar{X}_t)^+&= \int _0^t \mathbf{1}{\{X_s-\bar{X}_s>0\}} (\sigma _X X_s-\sigma _X \bar{X}_s)\,dU_s\\&\quad + \int _0^t \mathbf{1}{\{X_s-\bar{X}_s>0\}} \left[ (\mu \cdot \alpha - (aX_s+cY_s))X_s-(\mu \cdot \alpha -a\bar{X}_s)\bar{X}_s\right] \,ds. \end{aligned}$$

For \(K>0\) define the stopping time

$$\begin{aligned} T_K{:=}\inf \{t>0 : X_t\ge K ~\text {or} ~\bar{X}_t \ge K\} \end{aligned}$$

and the stopped processes \(X^K_t=X_{T_K\wedge t}\) and \(\bar{X}^K_t=\bar{X}_{T_K\wedge t}\). Then, stopping the processes at \(T_K\) and taking expectations yields

$$\begin{aligned} 0&\le \mathbb E(X^K_t-\bar{X}^K_t)^+ \\&= \mathbb E\int _0^{t\wedge T_K} \mathbf{1}\{X_s\!-\!\bar{X}_s>0\} \left[ (\mu \cdot \alpha X_s - X_s(aX_s + cY_s)) -(\mu \cdot \alpha \bar{X}_s - a\bar{X}_s^2 )\right] \,ds\\&=\mathbb E\int _0^{t\wedge T_K} \mathbf{1}\{X_s-\bar{X}_s>0\} \left[ \mu \cdot \alpha (X_s - \bar{X}_s) -a(X_s^2-\bar{X}_s^2) -cX_sY_s \right] \,ds\\&\le \mathbb E\int _0^{t\wedge T_K} \mathbf{1}\{X_s-\bar{X}_s>0\} \mu \cdot \alpha (X_s - \bar{X}_s)\,ds\\&\le \mu \cdot \alpha \, \mathbb E\int _0^{t\wedge T_K} (X_s - \bar{X}_s)^+\,ds\\&\le \mu \cdot \alpha \, \mathbb E\int _0^t (X^K_s - \bar{X}^K_s)^+\,ds. \\ \end{aligned}$$

By Gronwall’s Lemma (see, for example, Appendix 5 of Ethier and Kurtz (2005)) \(\mathbb E[(X^K_t-\bar{X}^K_t)^+] = 0\) for all \(t\ge 0\), so \(X^K_t\le \bar{X}^K_t\) for all \(t\ge 0\). Now let \(K\rightarrow \infty \) and recall that \(\bar{X}\) does not explode to get that \(X_t\le \bar{X}_t\) for all \(t\ge 0\). Since we have shown before that \(\bar{X}\) is dominated by a geometric Brownian motion, a process that has finite moments of all orders, we get that \(X_t,Y_t\in L^p(\mathbb {P}^{(x,y)})\) for all \(t,p>0\) and for all \((x,y) \in \mathbb {R}_{++}^2\). \(\square \)
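The domination argument rests on driving \(X\) and \(\bar{X}\) with the same Brownian motion \(U\). The following sketch (purely illustrative, with made-up parameter values; \(m_A\) stands for \(\mu \cdot \alpha \) and \(m_B\) for \(\mu \cdot \beta \)) couples discretizations of (8.1) and (8.2) through shared Gaussian increments and checks that \(X_t\le \bar{X}_t\) holds along the simulated path. This illustrates, but of course does not prove, the comparison.

```python
import numpy as np

def comparison_check(mA, a, c, mB, b, sx, sy, x0=0.5, y0=0.5,
                     T=50.0, dt=0.001, seed=1):
    """Euler-Maruyama on log scales (so the states stay positive) for the
    competitive system (8.1) and the logistic bound (8.2), sharing the SAME
    increments dU between X and Xbar. Returns True if X <= Xbar throughout."""
    rng = np.random.default_rng(seed)
    n = int(T / dt)
    lx, ly, lxb = np.log(x0), np.log(y0), np.log(x0)
    dominated = True
    for _ in range(n):
        dU = sx * np.sqrt(dt) * rng.standard_normal()
        dV = sy * np.sqrt(dt) * rng.standard_normal()
        x, y, xb = np.exp(lx), np.exp(ly), np.exp(lxb)
        lx += (mA - (a * x + c * y) - 0.5 * sx**2) * dt + dU
        lxb += (mA - a * xb - 0.5 * sx**2) * dt + dU
        ly += (mB - (c * x + b * y) - 0.5 * sy**2) * dt + dV
        dominated = dominated and (lx <= lxb + 1e-9)
    return bool(dominated)
```

Because \(X\) feels the extra competition term \(-cXY\) while \(\bar{X}\) does not, the coupled paths separate immediately and the ordering persists for small step sizes.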

Remark 8.2

Note that the SDEs for all the processes considered here have unique strong solutions in \(L^p\) for all \(t\ge 0, p>0\) and for all strictly positive starting points. This follows by arguments similar to those in Theorem 2.1 from Li and Mao (2009) and in Theorem 8.1, after noting that our SDEs for \((X,Y)\), \((\bar{X}, \bar{Y})\), etc. are all of the form

$$\begin{aligned} d\breve{X}_t&= \breve{X}_t\left[ \lambda _1 - \lambda _2 \breve{Y}_t -\lambda _3 \breve{X}_t \right] \,dt + \breve{X}_t \sigma _X\,dU_t\\ d\breve{Y}_t&= \breve{Y}_t\left[ \lambda _4 - \lambda _5 \breve{X}_t -\lambda _6 \breve{Y}_t \right] \,dt + \breve{Y}_t \sigma _Y\,dV_t\\ \breve{X}_0&= x\\ \breve{Y}_0&= y \end{aligned}$$

for \(\lambda _1,\dots ,\lambda _6\in {\mathbb {R}}_+\) and \(x,y\in {\mathbb {R}}_{++}\).

The next proposition tells us that none of our processes hit zero in finite time.

Proposition 8.3

Let \((X,Y)\) be the process given by (8.1). If \((X_0,Y_0) \in {\mathbb {R}}_{++}^2\), then \((X_t,Y_t) \in {\mathbb {R}}_{++}^2\) for all \(t \ge 0\) almost surely. A similar conclusion holds for all of the other processes we work with.

Proof

As an example of the method of proof, we look at the process \((X,Y)\) given by (8.1). Taking logarithms and using Itô’s lemma,

$$\begin{aligned} d \log X_t = \left( \mu \cdot \alpha - (aX_t + cY_t)-\frac{1}{2}\sigma _X^2\right) \,dt + \sigma _X \,dU_t. \end{aligned}$$

Therefore,

$$\begin{aligned} \log X_t = \log X_0 + \int _0^t\left( \mu \cdot \alpha - (aX_s + cY_s)-\frac{1}{2}\sigma _X^2\right) \,ds + \sigma _X U_t. \end{aligned}$$

The right-hand side cannot reach \(-\infty \) in finite time because \(X_t\) and \(Y_t\) do not blow up, so \(X_t\) remains strictly positive; the same argument applies to \(Y\). \(\square \)

Proposition 8.4

Let \((X,Y)\) be the process given by (8.1) and fix \((x,y) \in {\mathbb {R}}_{++}^2\). Any sequence \(\{t_n\}_{n \in \mathbb N}\) such that \(t_n\rightarrow \infty \) has a subsequence \(\{u_n\}_{n \in \mathbb N}\) such that the sequence of probability measures

$$\begin{aligned} \frac{1}{u_n} \int _0^{u_n} \mathbb {P}^{(x,y)} \{(X_s,Y_s) \in \cdot \} \, ds \end{aligned}$$

converges in the topology of weak convergence of probability measures on \({\mathbb {R}}_+^2\). Any such limit is a stationary distribution for the process \((X,Y)\) thought of as a process with state space \({\mathbb {R}}_+^2\).

Proof

Set \(\varphi (x,y) {:=}x+y\) so that \(\varphi \ge 0\) for \(x,y > 0\). Put \(\psi (x,y)=\mu \cdot \alpha x + \mu \cdot \beta y -x(ax+cy)-y(cx+by)\). Note that \(\psi \) is bounded above on the quadrant \(x,y \ge 0\) and \(\lim _{\Vert (x,y)\Vert \rightarrow \infty }\psi (x,y)=-\infty \) where \(\Vert \cdot \Vert \) is the Euclidean distance on \({\mathbb {R}}^2\). Using Itô’s lemma we get

$$\begin{aligned} \varphi (X_t,Y_t)- \int _0^t \psi (X_s,Y_s)\,ds&= \int _0^t\sigma _Y Y_s\,dV_s +\int _0^t\sigma _X X_s\,dU_s. \end{aligned}$$

Therefore, \(\varphi (X_t,Y_t)- \int _0^t \psi (X_s,Y_s)\,ds\) is a martingale. Applying Theorem 9.9 of Ethier and Kurtz (2005) completes the proof. \(\square \)

The following result is essentially Theorem 10 in Liu et al. (2011). We include the proof for completeness.

Proposition 8.5

Suppose that \(\alpha \cdot \mu - \alpha \cdot \Sigma \alpha /2>0\), \(\beta \cdot \mu -\beta \cdot \Sigma \beta /2>0\), and \(\mathcal {I}(\alpha ,\beta )<0\). If \((X,Y)\) is the process given by (8.1), then \(\lim _{t\rightarrow \infty } Y_t = 0\) \(\mathbb {P}^{(x,y)}\)-a.s. for all \((x,y) \in {\mathbb {R}}_{++}^2\).

Proof

Using Itô’s lemma and the definition of \(\mathcal {I}(\alpha ,\beta )\),

$$\begin{aligned} a\frac{\log \left( \frac{Y_t}{Y_0}\right) }{t}\!-\!c\frac{\log \left( \frac{X_t}{X_0}\right) }{t}&= a\left( \mu \cdot \beta \!-\!\frac{\sigma _Y^2}{2}\right) \!-\!c \left( \mu \cdot \alpha \!-\!\frac{\sigma _X^2}{2}\right) \\&-(ab-c^2)\frac{\int _0^t Y_s\,ds}{t}+ ~a\sigma _Y \frac{V_t}{t} -c\sigma _X\frac{U_t}{t}\\&= a \mathcal {I}(\alpha ,\beta ) - (ab-c^2)\frac{\int _0^t Y_s\,ds}{t} + a\sigma _Y \frac{V_t}{t} -c\sigma _X\frac{U_t}{t}. \end{aligned}$$

By the Cauchy–Schwarz inequality, \((ab-c^2)=\langle \alpha ,\alpha \rangle _\kappa \langle \beta ,\beta \rangle _\kappa - (\langle \alpha ,\beta \rangle _\kappa )^2 \ge 0\), and so

$$\begin{aligned} \frac{\log \left( \frac{Y_t}{Y_0}\right) }{t}&\le \frac{c}{a}\frac{\log \left( \frac{X_t}{X_0}\right) }{t} + \mathcal {I}(\alpha ,\beta ) + \sigma _Y \frac{V_t}{t} - \frac{c}{a}\sigma _X\frac{U_t}{t}. \end{aligned}$$

Let \(\bar{X}\) be the process defined by (8.2) with \(\bar{X}_0=X_0\). Proposition 2.3 implies

$$\begin{aligned} \lim _{t\rightarrow \infty }\frac{1}{t}\int _0^t \bar{X}_s\,ds =(\mu \cdot \alpha -\sigma _X^2/2)/a \quad \text {almost surely}. \end{aligned}$$
(8.3)

It follows from Theorem 8.1 that \(X_t\le \bar{X}_t\) for all \(t\ge 0\). Thus, with probability one,

$$\begin{aligned} \limsup _{t\rightarrow \infty }\frac{\log X_t}{t}&\le \limsup _{t\rightarrow \infty }\frac{\log \bar{X}_t}{t}\\&= \left( \mu \cdot \alpha - \frac{\sigma _X^2}{2}\right) - a\lim _{t\rightarrow \infty }\frac{1}{t}\int _0^t \bar{X}_s\,ds + \sigma _X \lim _{t\rightarrow \infty }\frac{U_t}{t}\\&= \left( \mu \cdot \alpha - \frac{\sigma _X^2}{2}\right) - a (\mu \cdot \alpha -\sigma _X^2/2)/a\\&= 0. \end{aligned}$$

Since \(U\) and \(V\) are Brownian motions, \(\lim _{t\rightarrow \infty } \frac{U_t}{t} = \lim _{t\rightarrow \infty } \frac{V_t}{t}=0\), and \(\limsup _{t\rightarrow \infty } \frac{\log X_t}{t}\le 0\) almost surely, so

$$\begin{aligned} \limsup _{t\rightarrow \infty } \frac{\log Y_t}{t} \le \mathcal {I}(\alpha ,\beta ) <0 \quad \text {almost surely}. \end{aligned}$$

In particular, \(\lim _{t \rightarrow \infty } Y_t = 0\) almost surely. \(\square \)
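Proposition 8.5 can be illustrated numerically (this is not part of the paper's argument). Writing \(m_A = \mu \cdot \alpha \) and \(m_B = \mu \cdot \beta \), the displayed identity gives \(\mathcal {I}(\alpha ,\beta ) = (m_B - \sigma _Y^2/2) - \tfrac{c}{a}(m_A - \sigma _X^2/2)\). The sketch below picks illustrative values with \(\mathcal {I}(\alpha ,\beta )<0\), simulates (8.1), and observes \(Y_t\rightarrow 0\) while the time average of \(X\) approaches \((m_A-\sigma _X^2/2)/a\), consistent with (8.3).

```python
import numpy as np

def invasion_rate(mA, mB, a, c, sx, sy):
    """I(alpha, beta), read off the identity
    a*I = a*(mB - sy^2/2) - c*(mA - sx^2/2)."""
    return (mB - 0.5 * sy**2) - (c / a) * (mA - 0.5 * sx**2)

def simulate_competition(mA, mB, a, b, c, sx, sy, T=400.0, dt=0.005, seed=2):
    """Euler-Maruyama for (8.1) on log scales. Returns Y_T and the time
    average of X over the second half of the run."""
    rng = np.random.default_rng(seed)
    n = int(T / dt)
    lx, ly = 0.0, 0.0  # X_0 = Y_0 = 1
    x_integral = 0.0
    for i in range(n):
        x, y = np.exp(lx), np.exp(ly)
        lx += (mA - (a * x + c * y) - 0.5 * sx**2) * dt \
            + sx * np.sqrt(dt) * rng.standard_normal()
        ly += (mB - (c * x + b * y) - 0.5 * sy**2) * dt \
            + sy * np.sqrt(dt) * rng.standard_normal()
        if i >= n // 2:
            x_integral += np.exp(lx) * dt
    return np.exp(ly), x_integral / (T / 2)
```

With \(m_A=1\), \(m_B=0.7\), \(a=b=1\), \(c=0.8\), \(\sigma _X=\sigma _Y=0.2\) one gets \(\mathcal {I}\approx -0.104<0\); the simulated \(Y_T\) is numerically zero and the time average of \(X\) is near \((1-0.02)/1 = 0.98\).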

We can now finish the proof of Theorem 4.1. Fix \(\epsilon >0\) and \(\eta >0\) sufficiently small. Define the stopping time

$$\begin{aligned} T_{\epsilon }{:=}\inf \{t \ge 0 :Y_t\ge \epsilon \} \end{aligned}$$

and the stopped process \(X^\epsilon _t{:=}X_{t\wedge T_{\epsilon }}\). By Proposition 8.5, there exists \(T>0\) such that

$$\begin{aligned} \mathbb P^{(x,y)}\{Y_t\le \epsilon ~\text {for all}~ t\ge T\}\ge 1-\eta . \end{aligned}$$

Define the process \(\check{X}\) via

$$\begin{aligned} d\check{X}_t = \check{X}_t [(\mu \cdot \alpha -c\epsilon )-a\check{X}_t]\,dt + \sigma _X \check{X}_t\,dU_t \end{aligned}$$

and the stopped process \(\check{X}^\epsilon _t{:=}\check{X}_{t\wedge T_{\epsilon }}\). Start the process \(\check{X}\) at time \(T\) with the initial condition \(\check{X}_T = X_T\). We want to show that the process \(\check{X}^\epsilon \) is dominated by the process \(X^\epsilon \); that is, \(X^\epsilon _t\ge \check{X}^\epsilon _t\) for all \(t\ge T\). By the strong Markov property, we may assume \(T=0\).

The argument is very similar to the proof of Theorem 8.1. With the notation from that proof, we have

$$\begin{aligned}&\int _0^t \rho (|\check{X}^\epsilon _s-X^\epsilon _s|)^{-1} \mathbf{1}\{\check{X}_s^\epsilon -X_s^\epsilon >0\} \, d[\check{X}^\epsilon - X^\epsilon ]_s\\&\quad =\int _0^t \rho (|\check{X}^\epsilon _s-X^\epsilon _s|)^{-1} (\sigma _X \check{X}^\epsilon _s - \sigma _X X^\epsilon _s)^2 \mathbf{1}\{\check{X}_s^\epsilon -X_s^\epsilon >0\}\,ds\\&\quad \le \sigma _X^2 t, \end{aligned}$$

so the local time of the process \(\check{X}^\epsilon -X^\epsilon \) at zero is identically zero. Then, using Tanaka’s formula,

$$\begin{aligned} (\check{X}^\epsilon _t-X^\epsilon _t)^+&= \int _0^{t\wedge T_\epsilon } \mathbf{1}\{\check{X}_s-X_s>0\}(\sigma _X \check{X}_s - \sigma _X X_s)\,dU_s \\&+ \int _0^{t\wedge T_\epsilon } \mathbf{1}\{\check{X}_s-X_s>0\} \left[ ((\mu \cdot \alpha -c\epsilon )\check{X}_s-a\check{X}_s^2)\right. \\&\left. -(\mu \cdot \alpha X_s - X_s(cY_s + aX_s))\right] \,ds. \end{aligned}$$

Taking expectations,

$$\begin{aligned} \mathbb E[(\check{X}^\epsilon _t-X^\epsilon _t)^+]&= \mathbb E\int _0^{t\wedge T_\epsilon }\mathbf{1}\{\check{X}_s-X_s>0\}[\mu \cdot \alpha (\check{X}_s-X_s)-(c\epsilon \check{X}_s -cX_sY_s)\\&~ - a(\check{X}_s^2-X_s^2)]\,ds\\&\le \mu \cdot \alpha \,\mathbb E\int _0^{t\wedge T_\epsilon } (\check{X}_s-X_s)^+\,ds\\&\le \mu \cdot \alpha \,\mathbb E\int _0^t (\check{X}^\epsilon _s-X^\epsilon _s)^+\,ds. \end{aligned}$$

By Gronwall’s Lemma, \(\mathbb E[(\check{X}^\epsilon _t-X^\epsilon _t)^+]=0\). As a result, remembering we assumed \(T=0\), we have \(\check{X}^\epsilon _t\le X^\epsilon _t\) for all \(t\ge T\). For \(\epsilon \) small enough, the process \(\check{X}\) has a stationary distribution concentrated on \({\mathbb {R}}_{++}\). For any sequence \(a_n\rightarrow \infty \), if the Cesaro averages \(\frac{1}{a_n} \int _0^{a_n} \mathbb {P}^{(x,y)}\{(X_s,Y_s) \in \cdot \} \, ds\) converge weakly, then the limit is a distribution of the form \(\varphi \otimes \delta _0\), where \(\varphi \) is a mixture of the unique stationary distribution \(\rho _{\bar{X}}\) described in Proposition 2.3 and the point mass at \(0\). By the above, this limit cannot have any mass at \((0,0)\): indeed, \(\check{X}_t\le X_t\) on the event \(\{Y_t\le \epsilon ~\text {for all}~ t\ge T\}\), which has probability \(\mathbb P^{(x,y)}\{Y_t\le \epsilon ~\text {for all}~ t\ge T\}\ge 1-\eta \). Since \(\eta >0\) was arbitrary, we conclude that \(\varphi =\rho _{\bar{X}}\), so the limit is \(\rho _{\bar{X}} \otimes \delta _0\), as required.
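The pathwise domination used above (and in Theorem 8.1) can be checked in simulation by driving the competitive process and its single-species upper bound with the same Brownian increments. The following is a sketch under hypothetical parameters, again discretizing the logarithms; all coefficients are illustrative assumptions.

```python
import math
import random

# Made-up coefficients for the pair (X, Y) of (8.1) and for the upper
# process X_bar of (8.2), which simply drops the competition term -cY.
mu_a, mu_b = 1.0, 0.9
a, b, c = 1.0, 1.0, 0.5
sx, sy = 0.3, 0.3

random.seed(2)
dt, n = 0.001, 50_000
lx = lxb = math.log(0.2)   # X and X_bar share the initial condition
ly = math.log(0.3)
domination_holds = True
for _ in range(n):
    x, xb, y = math.exp(lx), math.exp(lxb), math.exp(ly)
    du = random.gauss(0.0, math.sqrt(dt))  # shared increment of U
    dv = random.gauss(0.0, math.sqrt(dt))
    # Ito-corrected log dynamics; X_bar sees the same noise as X.
    lx += (mu_a - a * x - c * y - sx**2 / 2) * dt + sx * du
    lxb += (mu_a - a * xb - sx**2 / 2) * dt + sx * du
    ly += (mu_b - c * x - b * y - sy**2 / 2) * dt + sy * dv
    domination_holds = domination_holds and lx <= lxb + 1e-9
```

Because the two paths see identical noise, the discretized gap \(\log \bar{X}_t - \log X_t\) never becomes negative, mirroring the comparison \(X_t\le \bar{X}_t\).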

Appendix C: Proof of Theorem 4.2

Our proof is along the same lines as the proofs of Theorems 4 and 5 in Schreiber et al. (2011). We will once again simplify our notation by re-writing the SDE for the pair \((X,Y)\) as in (8.1). We assume throughout this appendix that the hypotheses of Theorem 4.2 hold; that is, \(\mathcal {I}(\alpha ,\beta ) > 0\) and \(\mathcal {I}(\beta ,\alpha ) > 0\).

Let \(((\bar{X}_t ,\bar{Y}_t))_{t \ge 0}\) be the stochastic process defined by the pair of stochastic differential equations in (8.2) with initial conditions \((\bar{X}_0,\bar{Y}_0) = (X_0, Y_0)\). We know from Theorem 8.1 that \(X_t \le \bar{X}_t\) and \(Y_t \le \bar{Y}_t\) for all \(t \ge 0\).

Note from Corollary 3.3 that \(\alpha \cdot (\mu - \Sigma \alpha /2) > 0\) and \(\beta \cdot (\mu - \Sigma \beta /2) > 0\) and hence, by Proposition 2.3, the process \((\bar{X}, \bar{Y})\) has a unique stationary distribution on \({\mathbb {R}}_{++}^2\) and is strongly ergodic.

Let

$$\begin{aligned} \Pi _t(\cdot ){:=} \frac{1}{t} \int _0^t \mathbf{1}\{(X_s, Y_s)\in \cdot \} \,ds \end{aligned}$$

be the normalized occupation measures of \((X,Y)\). We know that the random probability measures

$$\begin{aligned} \bar{\Pi }_t(\cdot ){:=} \frac{1}{t} \int _0^t \mathbf{1}\{(\bar{X}_s, \bar{Y}_s)\in \cdot \} \,ds \end{aligned}$$

converge almost surely and so, in particular, they are tight on \({\mathbb {R}}_+^2 = [0,\infty )^2\); that is, for any \(\epsilon >0\) we can find a box \([0,K] \times [0,K]\) such that

$$\begin{aligned} \frac{1}{t}\int _0^t \mathbf{1}\{(\bar{X}_s, \bar{Y}_s) \in [0,K] \times [0,K]\} \, ds > 1-\epsilon ~\text { for all}~ t > 0. \end{aligned}$$

Therefore,

$$\begin{aligned} \frac{1}{t}\int _0^t \mathbf{1}\{(X_s, Y_s) \in [0,K] \times [0,K]\} \, ds&\ge \frac{1}{t}\int _0^t \mathbf{1}\{(\bar{X}_s, \bar{Y}_s) \in [0,K] \times [0,K]\} \, ds \\&> 1-\epsilon ~\text { for all}~ t > 0, \end{aligned}$$

and hence the normalized occupation measures of \((X,Y)\) are also tight on \({\mathbb {R}}_+^2\). By Prohorov’s theorem (Kallenberg 2002, Theorem 16.3), there exists a random probability measure \(\nu \) on \({\mathbb {R}}_+^2\) and a (possibly random) sequence \((t_n)\subset {\mathbb {R}}_{++}\) with \(t_n\rightarrow \infty \) such that

$$\begin{aligned} \Pi _{t_n} \Longrightarrow \nu \end{aligned}$$
(9.1)

as \(n\rightarrow \infty \) almost surely, where \(\Longrightarrow \) denotes weak convergence of probability measures on \({\mathbb {R}}_+^2\). That is, with probability one, for every bounded continuous function \(u: {\mathbb {R}}_+^2 \rightarrow {\mathbb {R}}\) we have

$$\begin{aligned} \int _{{\mathbb {R}}_+^2} u(x,y) \, \Pi _{t_n}(dx,dy) \rightarrow \int _{{\mathbb {R}}_+^2} u(x,y) \, \nu (dx,dy) \end{aligned}$$

as \(n\rightarrow \infty \).

Proposition 9.1

The probability measure \(\nu \) is almost surely a stationary distribution for \((X,Y)\) thought of as a process with state space \({\mathbb {R}}_+^2\).

Proof

Let \((P_t)_{t\ge 0}\) be the semigroup of the process \((X,Y)\) thought of as a process on \({\mathbb {R}}_+^2\). For simplicity let us write \(Z_t{:=}(X_t,Y_t)\) for all \(t\ge 0\) and \(\nu _n{:=}\Pi _{t_n}\).

By the Strong Law of Large Numbers for martingales, we have that for all \(r\in {\mathbb {R}}_+\) and all bounded measurable functions \(f\)

$$\begin{aligned} \lim _{k\rightarrow \infty } \frac{1}{k} \sum _{i=0}^{k-1} [f(Z_{r+(i+1)t}) - P_tf(Z_{r+it})] =0 \quad \text {almost surely}. \end{aligned}$$

As a result,

$$\begin{aligned} \frac{1}{k}\left( \int _t^{(k+1)t}f(Z_s)\,ds-\int _0^{kt}P_t f(Z_s)\,ds \right)&= \frac{1}{k}\sum _{i=0}^{k-1} \int _0^t [f(Z_{r+(i+1)t})-P_tf(Z_{r+it})]\,dr\\&\rightarrow 0 ~\text {as}~k\rightarrow \infty \quad \text {almost surely}. \end{aligned}$$

This implies that

$$\begin{aligned} \lim _{u\rightarrow \infty }\frac{1}{u} \int _0^u [f(Z_{s+t})- P_tf(Z_s)]\,ds = 0 \quad \text {almost surely}. \end{aligned}$$

Thus,

$$\begin{aligned} \int f\,d\nu - \int P_t f\,d\nu&= \lim _{n\rightarrow \infty }\left( \int f\,d\nu _n - \int P_t f\,d\nu _n\right) \nonumber \\&= \lim _{n\rightarrow \infty }\frac{1}{t_n} \left[ \int _0^{t_n} (f(Z_s)-P_tf(Z_s))\,ds\right] \nonumber \\&= \lim _{n\rightarrow \infty }\frac{1}{t_n} \left[ \int _0^{t_n-t} (f(Z_{s+t})-P_tf(Z_s))\,ds\right. \nonumber \\&\left. +\int _0^t f(Z_s)\,ds - \int _{t_n-t}^{t_n} P_t f(Z_s)\,ds \right] \nonumber \\&= \lim _{n\rightarrow \infty } \frac{1}{t_n} \left[ \int _0^{t_n-t} (f(Z_{s+t})-P_tf(Z_s))\,ds \right] \nonumber \\&= 0 \quad \text {almost surely}. \end{aligned}$$
(9.2)

The last result is equivalent to saying that \(\nu \) is almost surely a stationary distribution for \((X,Y)\). \(\square \)
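A discrete-time toy analogue of Proposition 9.1 may help build intuition: for an ergodic two-state Markov chain, the normalized occupation measures converge to the unique stationary distribution, i.e., the row vector pi solving pi P = pi. The transition matrix below is a hypothetical example, not drawn from the model.

```python
import random

# Hypothetical two-state transition matrix; solving pi P = pi gives
# pi = (2/3, 1/3), since pi_0 = P[1][0] / (P[0][1] + P[1][0]).
P = [[0.9, 0.1],
     [0.2, 0.8]]

random.seed(0)
state, n = 0, 300_000
counts = [0, 0]
for _ in range(n):
    counts[state] += 1                      # accumulate occupation time
    if random.random() >= P[state][state]:  # leave the current state
        state = 1 - state
occupation = [counts[0] / n, counts[1] / n]
```

The empirical occupation frequencies approach \((2/3, 1/3)\); the continuous-time statement (9.2) says precisely that any weak limit of the normalized occupation measures is invariant under the semigroup \((P_t)_{t\ge 0}\).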

Proposition 9.2

There exists a stationary distribution \(\pi \) of \((X,Y)\) that assigns all of its mass to \({\mathbb {R}}_{++}^2\).

Proof

We argue by contradiction. Because the process stays in one of the four sets \({\mathbb {R}}_{++}^2\), \({\mathbb {R}}_{++} \times \{0\}\), \(\{0\} \times {\mathbb {R}}_{++}\), \(\{(0,0)\}\) when it is started in that set, any stationary distribution for \((X,Y)\) thought of as a process on \({\mathbb {R}}_+^2\) can be written as a convex combination of stationary distributions that respectively assign all of their masses to one of the four sets, should such a stationary distribution exist for the given set. Suppose there is no stationary distribution that is concentrated on \({\mathbb {R}}_{++}^2\). Then, any stationary distribution is the convex combination of stationary distributions that respectively assign all of their mass to the three sets \({\mathbb {R}}_{++} \times \{0\}\), \(\{0\} \times {\mathbb {R}}_{++}\), and \(\{(0,0)\}\), and hence any stationary distribution is of the form

$$\begin{aligned} p_X \mu _X + p_Y \mu _Y + p_0 \delta _{(0,0)}, \end{aligned}$$

where the random variables \(p_X, p_Y, p_0\) are nonnegative and \(p_X+p_Y + p_0 =1\) almost surely, and \(\mu _X = \rho _{\bar{X}} \otimes \delta _0\) and \(\mu _Y = \delta _0 \otimes \rho _{\bar{Y}}\) for \(\rho _{\bar{X}}\) and \(\rho _{\bar{Y}}\) the unique stationary distributions of \(\bar{X}\) and \(\bar{Y}\). Next, we proceed as in Proposition 8.5 to find the limit of \(\frac{\log X_{t_n}}{t_n}\). Let us first argue that

$$\begin{aligned} \lim _{n\rightarrow \infty }\frac{1}{t_n} \int _0^{t_n} X_s\,ds&= \int _{{\mathbb {R}}_{+}^2} x \, \nu (dx,dy)\\ \lim _{n\rightarrow \infty }\frac{1}{t_n} \int _0^{t_n} Y_s\,ds&= \int _{{\mathbb {R}}_{+}^2} y \, \nu (dx,dy) \quad \text {almost surely}.\nonumber \end{aligned}$$
(9.3)

Note that the infinitesimal generator of \((\log X, \log Y)\) thought of as a process on \({\mathbb {R}}^2\) is uniformly elliptic with smooth coefficients and so it has smooth transition densities (see, for example, Section 3.3.4 of Stroock (2008)). Moreover, an application of a suitable minimum principle for the Kolmogorov forward equation (see, for example, Theorem 5 in Section 2 of Chapter 2 of Friedman (1964)) shows that the transition densities are everywhere strictly positive. It follows that \((X,Y)\) thought of as a process on \({\mathbb {R}}_{+}^2\) has smooth transition densities that are everywhere positive.

Because the process \(\bar{X}\) also has smooth, everywhere positive transition densities for similar reasons, the almost sure behavior of \(\bar{X}\) started from a fixed point is the same as when it is started from its stationary distribution \(\rho _{\bar{X}}\). As a result, we get by Birkhoff’s pointwise ergodic theorem (Kallenberg 2002, Theorem 10.6) that, for all \(K>0\),

$$\begin{aligned} \lim _{n \rightarrow \infty } \frac{1}{t_n} \int _0^{t_n} \bar{X}_s \mathbf{1}\{\bar{X}_s > K\} \, ds = \mathbb E^{\rho _{\bar{X}}} [\bar{X}_s \mathbf{1}\{\bar{X}_s > K\}] \end{aligned}$$

\(\mathbb P^{x}\) almost surely for any \(x\in {\mathbb {R}}_+\). Therefore, by dominated convergence

$$\begin{aligned} \lim _{K \rightarrow \infty } \lim _{n \rightarrow \infty } \frac{1}{t_n} \int _0^{t_n} \bar{X}_s \mathbf{1}\{\bar{X}_s > K\} \, ds = \lim _{K \rightarrow \infty } \mathbb E^{\rho _{\bar{X}}} [\bar{X}_s \mathbf{1}\{\bar{X}_s > K\}]= 0. \end{aligned}$$

The following inequalities are immediate from the positivity of the terms:

$$\begin{aligned} \frac{1}{t_n} \int _0^{t_n} X_s \mathbf{1}\{ X_s \le K\} \,ds&\le \frac{1}{t_n} \int _0^{t_n} X_s\, ds\nonumber \\&= \frac{1}{t_n} \int _0^{t_n} X_s \mathbf{1}\{ X_s \le K\} \,ds \nonumber \\&\quad + \frac{1}{t_n} \int _0^{t_n} X_s \mathbf{1}\{ X_s > K\}\, ds. \end{aligned}$$
(9.4)

Recall that \(X_t\le \bar{X}_t\) for all \(t\ge 0\) and hence

$$\begin{aligned} \frac{1}{t_n} \int _0^{t_n} X_s \mathbf{1}\{ X_s > K\} \,ds \le \frac{1}{t_n} \int _0^{t_n} \bar{X}_s \mathbf{1}\{ \bar{X}_s > K\} \,ds. \end{aligned}$$

This implies

$$\begin{aligned} \limsup _{n\rightarrow \infty } \frac{1}{t_n} \int _0^{t_n} X_s \mathbf{1}\{ X_s > K\} \,ds \le \limsup _{n\rightarrow \infty } \frac{1}{t_n} \int _0^{t_n} \bar{X}_s \mathbf{1}\{ \bar{X}_s > K\}\, ds, \end{aligned}$$

and therefore

$$\begin{aligned} 0&\le \lim _{K\rightarrow \infty } \limsup _{n\rightarrow \infty } \frac{1}{t_n} \int _0^{t_n} X_s \mathbf{1}\{ X_s > K\} \,ds\nonumber \\&\le \lim _{K\rightarrow \infty } \limsup _{n\rightarrow \infty } \frac{1}{t_n} \int _0^{t_n} \bar{X}_s \mathbf{1}\{ \bar{X}_s > K\} \,ds =0. \end{aligned}$$
(9.5)

By (9.1) and Theorem 4.27 of Kallenberg (2002),

$$\begin{aligned} \lim _{n\rightarrow \infty } \frac{1}{t_n} \int _0^{t_n} X_s \mathbf{1}\{ X_s \le K\}\, ds = \int _{{\mathbb {R}}_{+}^2} x \mathbf{1}\{x\le K\}\, \nu (dx,dy) \end{aligned}$$

for any \(K\) such that

$$\begin{aligned} \nu (\{K\}\times {\mathbb {R}}_+)=0. \end{aligned}$$

While this last condition need not hold a priori for all \(K\), we can only have

$$\begin{aligned} \nu (\{K\}\times {\mathbb {R}}_+)>0 \end{aligned}$$

for countably many \(K\), so there exists a sequence \((K_m)\subset {\mathbb {R}}_+\) such that \(K_m\rightarrow \infty \) as \(m\rightarrow \infty \) with

$$\begin{aligned} \nu (\{K_m\}\times {\mathbb {R}}_+)=0. \end{aligned}$$

By dominated convergence,

$$\begin{aligned} \lim _{m\rightarrow \infty } \lim _{n\rightarrow \infty } \frac{1}{t_n} \int _0^{t_n} X_s \mathbf{1}\{ X_s \le K_m\}\, ds&= \lim _{K\rightarrow \infty } \int _{{\mathbb {R}}_{+}^2} x \mathbf{1}\{x\le K\} \,\nu (dx,dy) \nonumber \\&= \int _{{\mathbb {R}}_{+}^2} x \,\nu (dx,dy). \end{aligned}$$
(9.6)

Combining (9.4), (9.5) and (9.6) gives (9.3).

It follows from Itô’s formula, the observation \(\mathcal {I}(\alpha ,\alpha )=0\), (9.3), and the fact that \(\lim _{n\rightarrow \infty }\frac{U_{t_n}}{t_n}=0\) that

$$\begin{aligned} \lim _{n\rightarrow \infty } \frac{\log X_{t_n}}{t_n}&= \mu \cdot \alpha - \frac{\sigma _X^2}{2} - \mathbb E^\nu [aX_t + cY_t]\\&= p_X\left( \mu \cdot \alpha -a\mathbb E^{\rho _{\bar{X}}}[X_t] - \frac{\sigma _X^2}{2}\right) \\&+~p_Y\left( \mu \cdot \alpha -c\mathbb E^{\rho _{\bar{Y}}}[Y_t]-\frac{\sigma _X^2}{2}\right) + p_0 \left( \mu \cdot \alpha -\frac{\sigma _X^2}{2}\right) \\&= p_X \mathcal {I}(\alpha ,\alpha ) + p_Y \mathcal {I}(\beta ,\alpha ) + p_0\left( \mu \cdot \alpha -\frac{\sigma _X^2}{2}\right) \\&= p_Y \mathcal {I}(\beta ,\alpha ) + p_0\left( \mu \cdot \alpha -\frac{\sigma _X^2}{2}\right) \quad \text {almost surely}. \end{aligned}$$

By assumption, \(\mathcal {I}(\beta ,\alpha )>0\), and we have already observed that \(\mu \cdot \alpha -\frac{\sigma _X^2}{2}>0\). Because \(\bar{X}_t\) converges in distribution as \(t \rightarrow \infty \) to a distribution that assigns all of its mass to \({\mathbb {R}}_{++}\), it follows that \(\frac{\log \bar{X}_{t_n}}{t_n}\) converges in probability to \(0\). However, since \(X_t \le \bar{X}_t\) for all \(t \ge 0\), it follows that \(p_Y \mathcal {I}(\beta ,\alpha ) + p_0\left( \mu \cdot \alpha -\frac{\sigma _X^2}{2}\right) \le 0\) and hence

$$\begin{aligned} p_Y=p_0=0 \quad \text {almost surely}. \end{aligned}$$
(9.7)

The same argument applied to \((Y_t)_{t\ge 0}\) establishes

$$\begin{aligned} p_X=p_0=0 \quad \text {almost surely}. \end{aligned}$$
(9.8)

Therefore, \(p_X=p_Y=p_0=0\), and this contradicts the assumption that \(p_X+p_Y+p_0=1\). \(\square \)

We can now finish the proof of Theorem 4.2.

Proof

Proposition 9.2 implies that \((X,Y)\) has a stationary distribution \(\pi \) on \({\mathbb {R}}_{++}^2\). By Theorem 20.17 from Kallenberg (2002), our process \((X,Y)\) is either Harris recurrent or uniformly transient. We say that \((X_t,Y_t)\rightarrow \infty \) almost surely as \(t\rightarrow \infty \) if \(\mathbf{1}_K(X_t,Y_t)\rightarrow 0\) as \(t \rightarrow \infty \) for any compact set \(K\subset {\mathbb {R}}_{++}^2\). Theorem 20.21 from Kallenberg (2002) gives that if \((X,Y)\) is transient, then \((X_t,Y_t)\rightarrow \infty \) and so \((X,Y)\) cannot have a stationary distribution. Hence, since we know our process has a stationary distribution \(\pi \), it must be Harris recurrent. Theorem 20.21 from Kallenberg (2002) then gives us Eq. (4.1).

Theorem 20.18 from Kallenberg (2002) gives that any Harris recurrent Feller process on \({\mathbb {R}}_{++}^2\) with strictly positive transition densities has a locally finite invariant measure that is equivalent to Lebesgue measure and is unique up to a normalization. We already know that we have a stationary distribution, so this distribution is unique and has an almost everywhere strictly positive density with respect to Lebesgue measure. Theorem 20.12 from Kallenberg (2002) says that any Harris recurrent Feller process is strongly ergodic, and so Eq. (4.2) holds. \(\square \)

Remark 9.3

In Theorem 3.1 of Zhang and Chen (2013), the authors claim to show that the system of SDEs describing \((X,Y)\) always has a unique stationary distribution. We note that their use of moments only checks tightness in \({\mathbb {R}}_{+}^2{:=}[0,\infty )^2\) and not in \({\mathbb {R}}_{++}^2 = (0,\infty )^2\). It does not prevent mass escaping to \({\mathbb {R}}_{+}^2 \setminus {\mathbb {R}}_{++}^2 = ({\mathbb {R}}_+ \times \{0\}) \cup (\{0\} \times {\mathbb {R}}_+)\), which is exactly what can happen in our case. Thus, their proof only shows the existence of a stationary distribution on \({\mathbb {R}}_{+}^2\)—it does not show the existence of a stationary distribution on \({\mathbb {R}}_{++}^2\). Furthermore, their proof of the uniqueness of a stationary distribution on \({\mathbb {R}}_{+}^2\) breaks down because their assumption of irreducibility is false. The process \((X,Y)\) is irreducible on \({\mathbb {R}}_{++}^2\), but it is not irreducible on \({\mathbb {R}}_{+}^2\), since \(P_t((0,0),U){:=}\mathbb P^{(0,0)}\{(X_t,Y_t)\in U\}=0\) for any open subset \(U\) that lies in the interior of \({\mathbb {R}}_{+}^2\). If we work on \({\mathbb {R}}_{+}^2\), it is not true that the diffusion \((X,Y)\) has a unique stationary distribution: we can obtain infinitely many stationary distributions on \({\mathbb {R}}_{+}^2\) of the form \((u \rho _{\bar{X}} + v \delta _0)\otimes \delta _0\), where \(\rho _{\bar{X}}\) is the unique stationary distribution of \(\bar{X}\) on \({\mathbb {R}}_{++}\) and \(u,v\in {\mathbb {R}}_+\) satisfy \(u+v=1\).

Appendix D: Proof of Theorem 5.1

Assume that the matrix \(\Sigma \) is positive definite and that the dispersion proportion vector \(\alpha \) is such that \(\mu \cdot \alpha - \alpha \cdot \Sigma \alpha /2>0\), so that a population playing the strategy \(\alpha \) persists. Under these assumptions, the function \(\beta \mapsto \mathcal {I}(\alpha ,\beta )\) is strictly concave. Hence, by the method of Lagrange multipliers, a strategy \(\alpha \) with \(\alpha _i>0\) for all \(i\) satisfies \(\mathcal {I}(\alpha ,\beta )<0\) for all \(\beta \ne \alpha \) if and only if there exists a constant, which we denote by \(\lambda \), such that

$$\begin{aligned} \left. \lambda = \frac{\partial \mathcal {I}}{\partial \beta _i } (\alpha ,\beta )\right| _{\beta =\alpha } = \mu _i - \kappa _i \alpha _i (\mu \cdot \alpha - \alpha \cdot \Sigma \alpha /2)/\langle \alpha ,\alpha \rangle _\kappa - \sum _j \alpha _j \sigma _{ij} \end{aligned}$$
(10.1)

for all \(i\). Multiplying (10.1) by \(\alpha _i\) and summing with respect to \(i\), we get

$$\begin{aligned} \lambda&= \mu \cdot \alpha - \langle \alpha ,\alpha \rangle _\kappa (\mu \cdot \alpha - \alpha \cdot \Sigma \alpha /2)/\langle \alpha ,\alpha \rangle _\kappa - \alpha \cdot \Sigma \alpha \\&=- \alpha \cdot \Sigma \alpha / 2. \end{aligned}$$

This expression for the Lagrange multiplier and (10.1) provide the characterization of a mixed ESS in Eq. (5.1) when \(\alpha _i>0\) for all \(i\). The characterization of the more general case of \(\alpha _i>0\) for at least two patches follows similarly by restricting the method of Lagrange multipliers to the appropriate face of the probability simplex.

Suppose that \(\mu _i -\sigma _{ii}/2>0\) so that a population remaining in patch \(i\) and not dispersing to other patches persists. The strategy \(\alpha _i=1\) and \(\alpha _j=0\) for all \(j\ne i\) is an ESS only if

$$\begin{aligned} \left. \frac{\partial \mathcal {I}}{\partial \beta _j }(\alpha ,\beta )\right| _{\beta =\alpha }- \left. \frac{\partial \mathcal {I}}{\partial \beta _i}(\alpha ,\beta )\right| _{\beta =\alpha }<0 \end{aligned}$$

for all \(j\ne i\). Evaluating these partial derivatives gives the criterion (5.2) for the pure ESS.

We conclude by considering the case \(n=2\). Define the function \(g:[0,1]\rightarrow \mathbb {R}\) by

$$\begin{aligned} g(a)&= \left. \frac{\partial \mathcal {I}}{\partial \beta _1}((a_1,a_2),(b_1,b_2))\right| _{(a_1,a_2)=(a,1-a), (b_1,b_2)=(a,1-a)}\\&\quad ~-\left. \frac{\partial \mathcal {I}}{\partial \beta _2}((a_1,a_2),(b_1,b_2))\right| _{(a_1,a_2)=(a,1-a), (b_1,b_2)=(a,1-a)}.\\ \end{aligned}$$

The inequalities (5.2) for the pure strategies \((1,0)\) and \((0,1)\) correspond to \(g(1)>0\) and \(g(0)<0\), respectively. Hence, when both inequalities are reversed, the intermediate value theorem implies there exists \(a\in (0,1)\) such that \(g(a)=0\). Such an \(a\) satisfies the mixed ESS criterion (5.1) and, therefore, is an ESS.
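Numerically, such a root of \(g\) can be located by bisection once a sign change is in hand. The sketch below uses a hypothetical monotone stand-in for \(g\); in practice \(g\) would be assembled from the partial derivatives of \(\mathcal {I}\) evaluated at \(\beta =\alpha \), which depend on \(\mu \), \(\Sigma \), and \(\kappa \).

```python
def ess_bisect(g, lo=0.0, hi=1.0, iters=100):
    """Bisection for a root of g on [lo, hi]; assumes g(lo), g(hi) have opposite signs."""
    glo = g(lo)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if glo * g(mid) <= 0:
            hi = mid             # sign change persists in [lo, mid]
        else:
            lo, glo = mid, g(mid)
    return 0.5 * (lo + hi)

# Hypothetical stand-in with g(0) > 0 > g(1); its zero at a = 0.6 plays
# the role of the mixed-ESS weight on patch 1.
g = lambda a: 0.6 - a
a_star = ess_bisect(g)
```

Any root-bracketing method works here; bisection is used only because it needs nothing beyond the sign change guaranteed by the intermediate value theorem.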


Evans, S.N., Hening, A. & Schreiber, S.J. Protected polymorphisms and evolutionary stability of patch-selection strategies in stochastic environments. J. Math. Biol. 71, 325–359 (2015). https://doi.org/10.1007/s00285-014-0824-5
