Abstract
We study a general class of birth-and-death processes with state space \({\mathbb {N}}\) that describes the size of a population going to extinction with probability one. This class contains the logistic case. The scale of the population is measured in terms of a ‘carrying capacity’ \(K\). When \(K\) is large, the process is expected to stay close to its deterministic equilibrium for a long time but ultimately goes extinct. Our aim is to quantify the behavior of the process and the mean time to extinction in the quasi-stationary distribution as a function of \(K\), for large \(K\). We also give a quantitative description of this quasi-stationary distribution: it turns out to be close, when \(K\) is large, to a Gaussian distribution centered at the deterministic long-time equilibrium. Our analysis relies on precise estimates of the maximal eigenvalue, of the corresponding eigenvector, and of the spectral gap of a self-adjoint operator associated with the semigroup of the process.
1 Introduction
We study a general class of birth-and-death processes with state space \({\mathbb {N}}\) that describes the size of a population going to extinction with probability one. For a population of size \(n\in {\mathbb {N}}^*\), the birth rate is denoted by \(\lambda _n>0\,\) and the death rate by \(\mu _n>0\). Furthermore, we assume that
where \(\tilde{\lambda },\tilde{\mu }\) are positive functions and \(K\) is a scaling parameter describing the amount of available resources (called the ‘carrying capacity’ in ecology). We assume that \(\lambda _0=\mu _0=0\), entailing absorption at state \(0\).
In this work, we consider the case where absorption at \(0\) happens with probability one. We also assume that the time to this absorption has finite expectation. In this situation, the unique stationary probability measure is \(\delta _0\), the Dirac mass at state \(0\). In order to understand the behavior of the process before absorption, a relevant object to look at is a so-called quasi-stationary distribution, i.e., a probability distribution that is stationary when the process is conditioned to survive. Our aim is to describe what happens for large \(K\).
The prominent example is the so-called logistic birth-and-death process \((X_t^{\scriptscriptstyle {K}},t\ge 0)\) defined by the following birth and death rates
for \(n\ge 1\), where \(\tilde{\lambda }, \tilde{\mu }\) are positive parameters. It is a classical result (see e.g. [16]) that if the process starts in a state of the form \(\lfloor x_0 K\rfloor \) (\(x_0>0\)), then the rescaled process \(X_t^{\scriptscriptstyle {K}}/K\) is ‘close’, in the limit as \(K\rightarrow \infty \), during any given finite interval of time, to the solution of the differential equation
with initial condition \(x_0\). This differential equation has a unique attractive equilibrium \(x_*=\tilde{\lambda }-\tilde{\mu }\) and the integer \(\lfloor x_* K\rfloor \) can be considered as an approximation of the population size over every given finite time interval. However, for each \(K\), the process \(X_t^{\scriptscriptstyle {K}}\) goes almost surely to extinction as \(t\rightarrow \infty \), see [10].
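As a purely illustrative sketch (not part of the argument), the logistic process can be simulated with a Gillespie scheme, assuming the rates of (1.1) are \(\lambda _n=\tilde{\lambda } n\) and \(\mu _n=\tilde{\mu } n+n^2/K\); for large \(K\), a trajectory started at \(\lfloor x_0 K\rfloor \) settles near \(\lfloor x_* K\rfloor \) over any moderate time horizon:

```python
import random

def simulate_logistic(K, lam=2.0, mu=1.0, x0=1.0, t_max=20.0, seed=0):
    """Gillespie simulation of the logistic birth-and-death process,
    assuming rates lambda_n = lam*n and mu_n = mu*n + n**2/K."""
    rng = random.Random(seed)
    n, t = int(x0 * K), 0.0
    while t < t_max and n > 0:
        birth = lam * n
        death = mu * n + n * n / K
        t += rng.expovariate(birth + death)
        n += 1 if rng.random() < birth / (birth + death) else -1
    return n

K = 500
x_star = 2.0 - 1.0              # equilibrium x_* = lam - mu of the limiting ODE
n_final = simulate_logistic(K)
# n_final stays within a few sqrt(K) of floor(x_star * K) = 500 on this time scale
```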
In this paper, we consider more general processes with the same kind of behavior. One of our motivations is to quantify, as a function of \(K\), the scale of the mean time to extinction, the time-scale of convergence to the quasi-stationary distribution, and the time-scale during which the process is close to the rescaled deterministic equilibrium \(\lfloor x_* K\rfloor \) with high probability.
Our results can be colloquially described as follows. We get an upper bound of order \(K\log K\) for the time it takes for the process to be close to the quasi-stationary distribution. We also get the existence of a time interval, exponentially long in \(K\), during which the process, if we start from a population of order \(K\), is nearly distributed according to the quasi-stationary distribution.
We also prove that the total variation distance between the quasi-stationary distribution and a Gaussian distribution is bounded by \(1/\sqrt{K}\). This Gaussian distribution is centered around \(\lfloor x_* K\rfloor \) and its variance is of order \(K\).
As a by-product of our analysis we show that the mean time to extinction with respect to the quasi-stationary distribution is given by

where \(c\) is a constant independent of \(K\) that is explicitly given later on. Roughly speaking, this mean time is exponentially large in \(K\).
Motivated by population extinction in biology, many attempts have been made to analyze quasi-stationary distributions. But even in the simplest models, like the logistic one, this turned out to be a complicated task. Previous results are mostly based either on Monte-Carlo simulations or on uncontrolled approximations relying on heuristic ansatzes; see the review paper [20] and also [15, 19]. The present work is the first in which controlled mathematical approximations are obtained for the quasi-stationary distribution of a class of models encompassing the logistic one.
We are aware of only a few mathematical results related to our work. In [9], the authors do not study the quasi-stationary distribution but only the mean time to extinction starting from a state of order \(K\), for which they obtain the asymptotic behavior in \(K\) (see also [21]). Here we are able to control this quantity for all initial states and also for the quasi-stationary distribution as a starting distribution. In [2], the authors show that the quasi-stationary distribution can be approximated in total variation distance by an auxiliary process called the ‘returned process’. They also prove a bound for the total variation distance between the law of the process \(X_t^{\scriptscriptstyle {K}}\) for fixed values of \(t\) and the quasi-stationary distribution. This is somewhat related to one of our theorems (Theorem 3.6). Let us also mention the articles [5, 6, 8] about quantitative convergence to quasi-stationarity.
The main tool in this work is the analysis of an operator \(L\) that is related to the generator of the killed process. We use a weighted Hilbert space where \(L\) is self-adjoint. The operator \(L\) has a maximal eigenvalue \(-\rho _0\) that is simple and negative. The mean time to extinction is exactly \(1/\rho _{0}\). The quasi-stationary distribution is constructed from the corresponding positive eigenvector. The method of analysis of the equation \(Lu=-\rho _0 u\) is inspired by matching techniques reminiscent of the WKB method in Physics [11, 17].
2 Standing assumptions and notations
In the sequel, most quantities depend on the parameter \(K\). We will not systematically indicate this dependence in the notation, except when we want to highlight it. Recall that
In the rest of the paper, the functions \(x\mapsto \tilde{\lambda }(x)\) and \(x\mapsto \tilde{\mu }(x)\), defined on \({\mathbb {R}}_+\), are assumed to be positive, differentiable and increasing. In particular, this implies that the sequences \((\lambda _n)_n\) and \((\mu _n)_n\) are increasing.
From now on, we assume throughout the paper that the functions \(\tilde{\lambda }\) and \(\tilde{\mu }\) satisfy the following properties.
Some comments are in order about the above assumptions. The relevant assumptions from a biological viewpoint are (2.2), (2.3) and (2.4). The first one means that, when the population size gets large, deaths prevail. The second one means the opposite: at low population size, births prevail. The third one means that there is a unique equilibrium for the associated differential equation. This rules out, for instance, the so-called Allee effect, where there are two non-trivial equilibria. Assumption (2.5) is a genericity property. The remaining assumptions are technical, but they hold in the logistic case and in many other models.
We shall denote by \((X_t^{\scriptscriptstyle {K}}, t\ge 0)\) the birth-and-death process associated with the rates \((\lambda _n)\) and \((\mu _n)\). Throughout the paper we will use the classical notation
and we set \(\pi _1:=\frac{1}{\mu _1}\). The following trivial identity will be used repeatedly.
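Assuming the classical birth-and-death weights \(\pi _n = (\lambda _1\cdots \lambda _{n-1})/(\mu _1\cdots \mu _n)\) — consistent with \(\pi _1=1/\mu _1\) and with the identity \(\lambda _n\pi _n=\mu _{n+1}\pi _{n+1}\) — the weights can be generated numerically through the recursion itself; a minimal sketch:

```python
def pi_weights(lam, mu, N):
    """Classical weights pi_n = (lam_1 ... lam_{n-1}) / (mu_1 ... mu_n),
    built through the recursion mu_{n+1} pi_{n+1} = lam_n pi_n
    (assumed form of the identity used repeatedly in the text)."""
    pi = [1.0 / mu(1)]                       # pi_1 = 1/mu_1
    for n in range(1, N):
        pi.append(pi[-1] * lam(n) / mu(n + 1))
    return pi

# logistic example: lam_n = 2n, mu_n = n + n^2/K with K = 100
K = 100
pi = pi_weights(lambda n: 2.0 * n, lambda n: n + n * n / K, 250)
# the identity lam_n pi_n = mu_{n+1} pi_{n+1} holds by construction
```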
One can verify that condition (2.2), together with the facts that \((\mu _n)_n\) is increasing and that \(\tilde{\mu }(0)\) is bounded away from zero, imply the following two properties:

The property \((\star )\) implies absorption of the process at state \(0\) with probability one. The property \((\star \star )\) ensures finiteness of the expectation of the absorption time, that is, \({\mathbb {E}}_m[T_{0}]<+\infty \) for every \(m\in \mathbb {N}^{*}\), where \(T_0=\inf \{t\ge 0 : X_t^{\scriptscriptstyle {K}}=0\}\). We refer to [13, p. 384] and [1, chapter 3] for details.
Condition (2.6) implies

(See Lemma 9.1 for a proof.) As proved in [10], this is a sufficient condition for the existence and uniqueness of a quasi-stationary distribution. It turns out to be a necessary condition as well, as can be deduced from [4]. Condition (2.7) implies
This follows by applying the mean value theorem to the function \(x\mapsto \log \tilde{\mu }(x)\). We will assume that
This is a technical condition that we use in the spectral theory of the operator associated with the process.
Finally, let us recall (see e.g. [16]) that for large \(K\), the process \((X_t^{\scriptscriptstyle {K}}/K, t\ge 0)\) is close to the solution of the ordinary differential equation
during any given finite time interval. Our assumptions imply that the differential equation (2.16) has the unique non-zero equilibrium \(x_{*}\). Observe that, because of the assumptions on the functions \(x\mapsto \tilde{\lambda }(x)\) and \(x\mapsto \tilde{\mu }(x)\), one has \(\frac{\tilde{\lambda }(x)}{\tilde{\mu }(x)} >1\) for \(x<x_{*}\) and \(\frac{\tilde{\lambda }(x)}{\tilde{\mu }(x)}<1\) for \(x>x_{*}\). This implies the stability of the equilibrium \(x_*\) of the deterministic equation (2.16) and, using (2.5), we get
We shall use the notation
This quantity plays a natural role in the sequel.
An example. For the logistic birth-and-death process defined in (1.1), we have \(\tilde{\lambda }(x)=\tilde{\lambda }\) and \(\tilde{\mu }(x)=\tilde{\mu }+x\). If \(\tilde{\lambda }>\tilde{\mu }\), it is easy to check that all the above conditions are fulfilled. One has \(n_{*}{\scriptstyle (K)}=\lfloor (\tilde{\lambda }-\tilde{\mu }) K\rfloor \).
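A quick numerical check of these sign conditions for the logistic example (the parameter values \(\tilde{\lambda }=2\), \(\tilde{\mu }=0.5\) are illustrative):

```python
lam_t, mu_t = 2.0, 0.5                 # illustrative parameter values
lam_f = lambda x: lam_t                # tilde-lambda(x), constant
mu_f = lambda x: mu_t + x              # tilde-mu(x), increasing
x_star = lam_t - mu_t                  # unique zero of lam_f - mu_f

# births prevail below x_*, deaths prevail above it
assert all(lam_f(x) > mu_f(x) for x in (0.0, 0.5, 1.0))
assert all(lam_f(x) < mu_f(x) for x in (2.0, 3.0))

def n_star(K):
    return int(x_star * K)             # floor(x_* K), the discrete equilibrium

assert n_star(1000) == 1500
```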
3 Statements of the main results
3.1 The generator and its spectrum
Our goal is to link the semigroup of the process \((X_t^{\scriptscriptstyle {K}},t\ge 0)\) ‘killed’ at \(0\) to a self-adjoint operator with compact resolvent in an appropriate Hilbert space. The spectral theory for this operator lies at the core of our work.
Let us denote by \(\fancyscript{D}\) the set of sequences with finite support on \(\mathbb {N}^{*}\). Define the operator \({\tilde{L}}\) with domain \(\fancyscript{D}\) by
We introduce the following weighted space of sequences of complex numbers
where the \(\pi _n\)’s are defined in (2.10). The space \(\ell ^2(\pi )\) is a Hilbert space when endowed with the scalar product
where \( \bar{u}_{n}\) is the complex conjugate of \(u_n\). We shall denote by \(\Vert \!\cdot \!\Vert _{\pi }\) the associated norm.
The main content of the following theorem is that one can extend the operator \({\tilde{{L}}}\) to an operator \(L\) that is the infinitesimal generator of a positive and contractive semigroup in \(\ell ^2(\pi )\). Moreover, this operator has a discrete spectrum with a maximal eigenvalue that is simple and negative.
Theorem 3.1
(The operator \(L\), \(\rho _0\), \(\varphi \) and \(\rho _1\)).
-
1.
The operator \({\tilde{{L}}}\) is symmetric on \(\fancyscript{D}\). It is closable in \(\ell ^{2}(\pi )\).
-
2.
We will denote by \({L}\) its closure and by \(\mathcal {D}\) the domain of this closure. The operator \({L}\) defines a positive contraction semigroup in \(\ell ^{2}(\pi )\).
-
3.
\({L}\) is a dissipative, self-adjoint operator with a compact resolvent. Its spectrum is discrete and the maximal eigenvalue is simple and negative. We denote it by \(-\rho _{0}\). The corresponding eigenvector can be chosen positive and we denote it by \(\varphi \). Finally, we denote by \(-\rho _{1}\) the second largest eigenvalue.
The proof of this theorem is given in Sect. 4.
Remark 3.1
The construction of \(\mathcal {D}\) is general; see [14, III.5.3].
For all \(t>0\), \(n,m\in \mathbb {N}^{*}\), let

where for each \(n\), \(\mathrm {e}_{n}\) is defined by \(\mathrm {e}_{n}(k) = \delta _{n,k}\) for \(k=1,2,\ldots \). A straightforward computation shows that the ‘matrix’ \((P_{t}(m,n))_{(m,n)\in \mathbb {N}^{*}\! \times \mathbb {N}^{*}}\) is a solution of the Kolmogorov equation
Furthermore, one can verify that there exists some \(M\ge 1\) such that for all \(t\) and all \(n\), \(\left| \sum _{k=1}^\infty P_t(n,k) \right| \le M.\) The uniqueness of such a family was proved in [12, Theorem 14, p. 528] under Assumption (2.12). This implies that the symmetric sub-Markovian semigroup \((P_{t},t\ge 0)\) is the extension of the transition semigroup of the Markov process \((X^{\scriptscriptstyle {K}}_{t}, t\ge 0)\) to \(\ell ^{2}(\pi )\).
In what follows, the solution \(u^{{\scriptscriptstyle 0}}=(u_n^{{\scriptscriptstyle 0}})_{n\in \mathbb {N}^{*}}\) of the homogeneous equation
such that \(u_1^{{\scriptscriptstyle 0}}=1\) will play an important role. Using (2.11) it is easy to verify that

with the convention that \(\sum _{j=1}^{{\scriptscriptstyle 0}}=0\).
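The sequence \(u^{{\scriptscriptstyle 0}}\) can also be generated directly from the recursion, assuming the operator has the standard killed birth-and-death form \((Lu)_n=\lambda _n(u_{n+1}-u_n)+\mu _n(u_{n-1}-u_n)\) with the convention \(u_0=0\); a sketch:

```python
def u0_sequence(lam, mu, N):
    """Solution of the homogeneous equation (L u)_n = 0 for 1 <= n < N,
    normalized by u_1 = 1, with u_0 = 0 (assumed form of the generator).
    The recursion lam_n (u_{n+1} - u_n) = mu_n (u_n - u_{n-1}) shows the
    increments are prod_{k<=n} (mu_k/lam_k), i.e. 1/(lam_n pi_n) in the
    classical notation."""
    u = [0.0, 1.0]                     # u_0 = 0, u_1 = 1
    for n in range(1, N):
        u.append(u[n] + (mu(n) / lam(n)) * (u[n] - u[n - 1]))
    return u[1:]                       # (u_1, ..., u_N)

# toy example with lam_n = 2n, mu_n = n: each increment halves, so
# u_n = 2 - 2^{-(n-1)}, an increasing sequence converging to 2
u = u0_sequence(lambda n: 2.0 * n, lambda n: 1.0 * n, 30)
```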
Remark 3.2
Notice that \(u^{{\scriptscriptstyle 0}}\notin \ell ^2(\pi )\). Indeed, using (2.11), observe that
Hence
But by (2.15) the last sum tends to \(+\infty \) when \(N\) goes to infinity.
3.2 Estimates of the largest eigenvalue and of the associated eigenvector
Our first main result gives the behavior of \(\rho _{0}\) and \(\varphi \) as functions of \(K\) when \(K\) gets large. Recall that \(x_*\) and \(n_{*}{\scriptstyle (K)}\) are defined in (2.4) and (2.18), respectively, and that \(u^{{\scriptscriptstyle 0}}=(u_n^{{\scriptscriptstyle 0}})_n\) is the solution of the homogeneous equation (3.2). The function \(H\) is defined in (2.8) and recall that \(H''(x_*)>0\) (see (2.5)).
Theorem 3.2
(Estimates of \(\rho _0\) and \(\varphi \)).
For all \(K>1\), we have

Moreover, for all \(K>1\), we have
where
The proof of this theorem is given in Sect. 5. Notice that the constant \(c\) defined by
is strictly positive by the assumptions on the functions \(\tilde{\lambda }\), \(\tilde{\mu }\). It will appear several times later on.
Remark 3.3
In the logistic case, one finds

The following theorem provides a lower bound for the spectral gap.
Theorem 3.3
(Spectral gap).
There exists a constant \(d>0\) such that for all \(K>1\)
The proof of this theorem is given in Sect. 6.
Remark 3.4
As a consequence of the preceding two theorems, one has \(\rho _{0}(K) \ll \rho _{1}(K)-\rho _{0}(K)\) for large \(K\) because
3.3 Quasi-stationary distribution, survival rate and mean time to extinction
We refer to [7, 18] for background and more information about quasi-stationary distributions. As usual, we shall denote by \({\mathbb {P}}_\nu \) the law of the process starting from a distribution \(\nu \) and by \({\mathbb {P}}_n\) the law of the process starting from the state \(n\), i.e. starting from the distribution \(\delta _n\). The corresponding expectations are respectively denoted by \({\mathbb {E}}_\nu \) and \({\mathbb {E}}_n\).
Proposition 3.4
For all \(K>1\), the probability measure \(\nu =(\nu _n)_n\) on \(\mathbb {N}^{*}\) defined by
is the unique quasi-stationary distribution of the birth and death process.
Note that the quasi-stationary distribution \(\nu \) depends on \(K\) through \(\varphi \).
Proof
In order to prove that \(\nu \) is a quasi-stationary distribution, we must verify that \({\mathbb {P}}_{\nu }(X_t^{\scriptscriptstyle {K}} \in A | T_0>t)=\nu (A)\) for all \(t>0\) and for all subsets \(A\subseteq \mathbb {N}^{*}\). Observe that for all \(A\subseteq \mathbb {N}^{*}\), \({1\!\!1}_A\in \ell ^2(\pi )\). We have, using that \(L\) is self-adjoint,

Replacing \(A\) by \(\mathbb {N}^{*}\) yields the desired relation. Since we have uniqueness [by (2.13)], \(\nu \) must be the quasi-stationary distribution. \(\square \)
Before proceeding with the other results, we observe that the previous proof shows that for all \(t>0\)

The quantity \(\rho _0\) is usually called the exponential rate of survival. The mean time to extinction (starting from the quasi-stationary distribution) is thus
In view of Theorem 3.2, it is of order \(e^{cK}{/}{\sqrt{K}}\) for some positive constant \(c\). More precisely, we have the following corollary.
Corollary 3.5
(Approximation of the mean time to extinction).
For all \(K>1\) we have

Note that there is another way to obtain the above estimate of \({\mathbb {E}}_{\nu }\big [T_0\big ]\). Indeed, we have
and since (see [13])
the estimate can be obtained by using Proposition 3.4 and Theorem 3.2 to deal with \(\varphi _n\).
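As a sanity check — by simulation only, not by the paper's spectral method — one can estimate the mean extinction time for the logistic process at small values of \(K\) and observe the rapid growth in \(K\) predicted by the \(e^{cK}/\sqrt{K}\) asymptotics (the rates below are the assumed form of (1.1)):

```python
import random
import statistics

def extinction_time(K, lam=2.0, mu=1.0, seed=0):
    """Absorption time at 0 for the logistic process (assumed rates
    lambda_n = lam*n, mu_n = mu*n + n^2/K), started at floor((lam-mu)*K)."""
    rng = random.Random(seed)
    n, t = int((lam - mu) * K), 0.0
    while n > 0:
        b = lam * n
        d = mu * n + n * n / K
        t += rng.expovariate(b + d)
        n += 1 if rng.random() < b / (b + d) else -1
    return t

mean_K2 = statistics.mean(extinction_time(2, seed=s) for s in range(200))
mean_K6 = statistics.mean(extinction_time(6, seed=s) for s in range(200))
# the empirical mean grows quickly with K, consistent with exp(c*K)
# up to polynomial factors
```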
3.4 Convergence rate to the quasi-stationary distribution and Gaussian approximation
We denote by
the total variation distance between two probability measures \(\mu ^{{\scriptscriptstyle (1)}}\) and \(\mu ^{{\scriptscriptstyle (2)}}\). Recall that

where \(\fancyscript{P}({\mathbb {N}})\) is the powerset of \({\mathbb {N}}\).
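On a countable state space, the supremum over subsets is attained on the set where the first measure dominates the second, so the total variation distance reduces to half the \(\ell ^1\) distance between the probability vectors; in code:

```python
def tv_distance(p, q):
    """Total variation distance between two probability vectors given as
    dicts state -> mass: sup_A |p(A) - q(A)| = (1/2) * sum_n |p_n - q_n|."""
    states = set(p) | set(q)
    return 0.5 * sum(abs(p.get(n, 0.0) - q.get(n, 0.0)) for n in states)

assert tv_distance({1: 1.0}, {2: 1.0}) == 1.0             # disjoint supports
assert tv_distance({1: 0.5, 2: 0.5}, {1: 0.5, 2: 0.5}) == 0.0
```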
The process \((X^{\scriptscriptstyle {K}}_{t}, t\ge 0)\) is said to have a Yaglom limit if there exists a probability measure \(\mathfrak m\) on \(\mathbb {N}^{*}\) such that for every \(n\in \mathbb {N}^{*}\) and for every \(A\in \fancyscript{P}(\mathbb {N}^{*})\) one has
When it exists, the Yaglom limit is a quasi-stationary distribution (whereas the converse is false in general), see [18].
The following theorem provides a quantitative bound for the distance (in total variation) between the law of the process and a convex combination of the Dirac mass at \(0\) and the quasi-stationary distribution \(\nu \). It also shows that \(\nu \) is the Yaglom limit of \((X^{\scriptscriptstyle {K}}_{t}, t\ge 0)\) with a quantitative error bound. Recall that \(-\rho _1\) is the second largest eigenvalue of \(L\) (see Theorem 3.1).
Theorem 3.6
There exist three strictly positive constants \(a, c_1,C\) such that for all \(K>1\), for all \(n\in \mathbb {N}^{*}\) and for all \(t\ge 0\), we have

where
and where \(u^{{\scriptscriptstyle 0}}\) is defined in (3.3). Moreover

In particular, the probability measure \(\nu \) is the Yaglom limit (in total variation distance) of the process \((X_t^{\scriptscriptstyle {K}}, t\ge 0)\).
The proof of this theorem is given in Sect. 7.
Remark 3.5
The proof of the previous theorem consists in establishing the following more explicit estimate: there exist three strictly positive constants \(a,c_1,C\) such that for all \(K>1\), for all \(n\in \mathbb {N}^{*}\) and for all \(t\ge 0\), we have

Then we show that the estimates (3.5) and (3.6) follow from (3.7).
Remark 3.6
The estimate (3.5) can be interpreted as follows. Recall that, for \(K\) large, \(\rho _{0}\) is very small. Therefore, if we start with \(n=\mathcal {O}(K)\) and if \(t\) is such that \(K\log K/(\rho _{1}-\rho _{0})\ll t\ll 1/\rho _{0}\), we get the following rough estimate:

This inequality highlights the existence of an interval of time during which the process is either extinct, with probability close to \(1-\alpha _{n}{\scriptstyle (K)}\), or distributed according to the quasi-stationary distribution \(\nu \), with probability close to \(\alpha _{n}{\scriptstyle (K)}\). This interval has a length that is roughly exponentially large in \(K\).
Remark 3.7
It follows from Theorems 3.2 and 3.3 that, for \(K\) large enough,
Hence, for \(K\) large enough, the estimate (3.6) can be written as

Remark 3.8
Note that for every \(n\ge 1\), the weights \(\alpha _n{\scriptstyle (K)}\) appearing in (3.5) can be written as
for all \(K>1\). This follows by adapting the proof of Lemma 9.5.
The last result shows that the quasi-stationary distribution \(\nu \) is close, as \(K\) gets large, to a Gaussian law centered at \(n_{*}{\scriptstyle (K)}\). Recall that the function \(H\) is defined in (2.8).
Theorem 3.7
We have

where \(G^{\scriptscriptstyle K}\) is the probability measure on \(\mathbb {N}^{*}\) given by

where

and where
Recall that \(H''(x_*)>0\) by (2.17). In the logistic case, one has \(\sigma =\sqrt{\tilde{\lambda }}\). The proof of this theorem is given in Sect. 8.
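Theorem 3.7 can be probed numerically for moderate \(K\): power iteration on the sub-stochastic kernel \(\mathrm{Id}+hL_N\) of a truncated chain yields the quasi-stationary distribution as the normalized left Perron eigenvector, which can then be compared in total variation with a discretized Gaussian centered at \(n_{*}{\scriptstyle (K)}\). Everything below is a sketch under assumptions: logistic rates, truncation level \(N\), and Gaussian variance \(\sigma ^2 K\) with \(\sigma ^2=\tilde{\lambda }\) (consistent with the statement that the variance is of order \(K\)).

```python
import math

# logistic rates lambda_n = LT*n, mu_n = MT*n + n^2/K, truncated to {1,...,N}
LT, MT, K, N = 2.0, 1.0, 30, 120
lam = [LT * n for n in range(N + 1)]
mu = [MT * n + n * n / K for n in range(N + 1)]

# power iteration nu <- nu(Id + h L_N), renormalized; the quasi-stationary
# distribution is an exact fixed point since nu L = -rho_0 nu
h = 0.9 / max(lam[n] + mu[n] for n in range(1, N + 1))
nu = [0.0] + [1.0 / N] * N
for _ in range(15000):
    new = [0.0] * (N + 1)
    for n in range(1, N + 1):
        new[n] += nu[n] * (1.0 - h * (lam[n] + mu[n]))
        if n < N:
            new[n + 1] += nu[n] * h * lam[n]
        if n > 1:
            new[n - 1] += nu[n] * h * mu[n]   # death flux from state 1 is killed
    s = sum(new)
    nu = [x / s for x in new]

n_star, var = int((LT - MT) * K), LT * K      # center n_*(K); variance sigma^2*K (assumed)
g = [0.0] + [math.exp(-((n - n_star) ** 2) / (2.0 * var)) for n in range(1, N + 1)]
Z = sum(g)
g = [x / Z for x in g]
tv = 0.5 * sum(abs(nu[n] - g[n]) for n in range(1, N + 1))
# tv is already small at K = 30, in line with the O(1/sqrt(K)) bound of Theorem 3.7
```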
4 Proof of Theorem 3.1
4.1 \({\tilde{L}}\) is symmetric and closable in \(\ell ^{2}(\pi )\)
Using (2.11), the reader can verify that, for all \(u,v\in \fancyscript{D}\), one has \(\langle {\tilde{{L}}} u, v\rangle _\pi =\langle u, {\tilde{{L}}} v\rangle _\pi \). Hence \({\tilde{{L}}}\) is symmetric.
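A finite-dimensional sanity check of this symmetry, assuming the standard killed birth-and-death form of \({\tilde{L}}\) and the classical weights \(\pi _n\):

```python
import random

K, N = 50, 40
lam = [2.0 * n for n in range(N + 2)]              # assumed logistic rates
mu = [1.0 * n + n * n / K for n in range(N + 2)]

pi = [0.0, 1.0 / mu[1]]                            # classical weights pi_n
for n in range(1, N):
    pi.append(pi[n] * lam[n] / mu[n + 1])

def L_tilde(u):
    """(L~ u)_n = lam_n (u_{n+1}-u_n) + mu_n (u_{n-1}-u_n), with u_0 = 0 and
    u supported in {1,...,N} (assumed form of the operator on D)."""
    out = [0.0] * (N + 1)
    for n in range(1, N + 1):
        up = u[n + 1] if n < N else 0.0
        out[n] = lam[n] * (up - u[n]) + mu[n] * (u[n - 1] - u[n])
    return out

def inner(u, v):
    return sum(pi[n] * u[n] * v[n] for n in range(1, N + 1))

rng = random.Random(1)
u = [0.0] + [rng.random() for _ in range(N)]
v = [0.0] + [rng.random() for _ in range(N)]
lhs, rhs = inner(L_tilde(u), v), inner(u, L_tilde(v))
# lhs == rhs up to rounding: symmetry rests on mu_{n+1} pi_{n+1} = lam_n pi_n
```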
To verify closability, one can apply a result in [14, III.5.3], according to which it is equivalent to prove that, for every sequence \((y^{(k)})_{k}\subset \fancyscript{D}\) such that \(y^{(k)}\rightarrow 0\) (in \(\ell ^{2}(\pi )\)) and such that \(\tilde{L}y^{(k)}\) converges to some \(y\) (in \(\ell ^{2}(\pi )\)), one has \(y=0\). Details are left to the reader.
4.2 \({L}\) defines a positive contraction semigroup in \(\ell ^{2}(\pi )\)
The key result in proving this claim is the following.
Proposition 4.1
For every \(f\in \ell ^2(\pi )\) and every \(\rho >0\), the equation
has a unique solution \(y\in \mathcal {D}\) denoted by \(\,R_{\rho } f\). Moreover
Finally, if \(f\) is nonnegative, so is \(\,R_{\rho } f\).
It is well-known that the previous bound is a sufficient condition for \({L}\) to generate a \(C_{0}\) contraction semigroup \(Q_{t}\) in \(\ell ^{2}(\pi )\), see e.g. [24, p. 249].
The proof of this proposition requires two preliminary results. For \(1\le n\le N\) we define (on \(\ell ^{{\scriptscriptstyle \infty }}{\scriptstyle (\{1,\ldots , N\})}\)) the truncated operator \(L_{N}\) by
The operator \(L_{{\scriptscriptstyle N}}\) satisfies the following positive maximum principle.
Lemma 4.2
Let \(v\in \ell ^{{\scriptscriptstyle \infty }}{\scriptstyle (\{1,\ldots ,N\})}\) and let \(m\in \{1,\ldots ,N\}\) such that \(v_m=\sup _{1\le n\le N}v_n\). If \(v_m\ge 0\), then \((L_{{\scriptscriptstyle N}}v)_m\le 0\).
Proof
For \(2\le m\le N-1\), we get
since, by definition of \(m\), \(v_m\) is maximal. The cases \(m=1\) and \(m=N\) follow similarly. \(\square \)
Lemma 4.3
Let \(g\in \ell ^{{\scriptscriptstyle \infty }}{\scriptstyle (\{1,\ldots ,N\})}\) and \(\rho >0\). The equation \((\rho -L_{{\scriptscriptstyle N}})v=g\) has a unique solution in \(\ell ^{{\scriptscriptstyle \infty }}{\scriptstyle (\{1,\ldots ,N\})}\). Moreover, one has \(\Vert v\Vert _{\ell ^{\infty }{\scriptscriptstyle (\{1,\ldots ,N\})}} \le \Vert g\Vert _{\ell ^{\infty }{\scriptscriptstyle (\{1,\ldots ,N\})}}/\rho \). Finally, if \(g\ge 0\) then \(v\ge 0\).
Proof
If \(g\in \ell ^{{\scriptscriptstyle \infty }}{\scriptstyle (\{1,\ldots ,N\})}\) and \(\rho >0\) are such that \(g=(\rho -L_{{\scriptscriptstyle N}})v\), and if \(m\in \{1,\ldots ,N\}\) is such that \(v_{m}=\sup _{1\le n\le N}v_n\ge 0\) then, by Lemma 4.2, \(v_{m}\le g_{m}/\rho \). Considering \(-v\) and \(-g\), it follows that if \((\rho -L_{{\scriptscriptstyle N}})v=g\) and if \(m\in \{1,\ldots ,N\}\) is such that \(v_{m}=\inf _{1\le n\le N}v_n\le 0\) then \(v_{m}\ge g_{m}/\rho \). This implies that \(v\ge 0\) if \(g\ge 0\). The previous two inequalities imply \(\Vert v\Vert _{\ell ^{\infty }{\scriptscriptstyle (\{1,\ldots ,N\})}} \le \Vert g\Vert _{\ell ^{\infty }{\scriptscriptstyle (\{1,\ldots ,N\})}}/\rho \). In particular we have \(\text {Ker}(\rho -L_{{\scriptscriptstyle N}})=\{0\}\), namely \(\rho -L_{{\scriptscriptstyle N}}\) is invertible in \(\ell ^{{\scriptscriptstyle \infty }}{\scriptstyle (\{1,\ldots ,N\})}\). The lemma is proved. \(\square \)
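The lemma can be illustrated by solving \((\rho -L_{{\scriptscriptstyle N}})v=g\) explicitly with the Thomas algorithm for tridiagonal systems; the zero boundary conventions \(v_0=v_{N+1}=0\) for the truncated operator are an assumption of this sketch:

```python
def solve_truncated(rho, lam, mu, g):
    """Solve (rho - L_N) v = g, where (L_N v)_n = lam_n (v_{n+1} - v_n)
    + mu_n (v_{n-1} - v_n) on {1,...,N} with v_0 = v_{N+1} = 0 (assumed
    truncation), by the Thomas algorithm; the matrix is strictly
    diagonally dominant, so the elimination never hits a zero pivot."""
    N = len(g)
    a = [-mu[n] for n in range(1, N + 1)]          # sub-diagonal (a[0] unused)
    b = [rho + lam[n] + mu[n] for n in range(1, N + 1)]
    c = [-lam[n] for n in range(1, N + 1)]         # super-diagonal (last unused)
    cp, gp = [c[0] / b[0]], [g[0] / b[0]]
    for i in range(1, N):                          # forward elimination
        m = b[i] - a[i] * cp[i - 1]
        cp.append(c[i] / m)
        gp.append((g[i] - a[i] * gp[i - 1]) / m)
    v = [0.0] * N
    v[-1] = gp[-1]
    for i in range(N - 2, -1, -1):                 # back substitution
        v[i] = gp[i] - cp[i] * v[i + 1]
    return v

N, K, rho = 50, 10, 1.0
lam = [2.0 * n for n in range(N + 1)]
mu = [1.0 * n + n * n / K for n in range(N + 1)]
v = solve_truncated(rho, lam, mu, [1.0] * N)
# as the lemma predicts: v >= 0 and max(v) <= max(g)/rho = 1
```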
We now turn to the proof of Proposition 4.1.
Let \(f\in \fancyscript{D}\) and let \(N_0\ge 1\) be such that \(f_n=0\) for all \(n>N_0\). Applying Lemma 4.3 for \(N>N_0\) yields a \(v^{\scriptscriptstyle {(N)}}\in \ell ^{{\scriptscriptstyle \infty }}{\scriptstyle (\{1,\ldots ,N\})}\) such that
We also have that for all \(N>N_0\)
Define \(u^{\scriptscriptstyle {(N)}}\in \fancyscript{D}\) by
For all \(p\in \mathbb {N}^{*}\) we have
It is then easy to show that
by using (2.2) and (4.1). Hence, since we assume that (2.15) holds, we get that \((\rho -{L})\;u^{\scriptscriptstyle {(N)}}\) converges strongly to \(f\). Using \(u^{\scriptscriptstyle {(N)}}_{{\scriptscriptstyle N+1}}=0\) we obtain
where \(r_{\!{\scriptscriptstyle N}}=\big [(\rho +\lambda _{{\scriptscriptstyle N}}+\mu _{{\scriptscriptstyle N}}) v^{\scriptscriptstyle {(N)}}_{{\scriptscriptstyle N}}-\mu _{{\scriptscriptstyle N}} v^{\scriptscriptstyle {(N)}}_{{\scriptscriptstyle N-1}}\big ]v^{\scriptscriptstyle {(N)}}_{{\scriptscriptstyle N}} \pi _{{\scriptscriptstyle N}}\).
One gets (recall that \(u^{\scriptscriptstyle {(N)}}_{{\scriptscriptstyle N+1}}=0\))
where we used (2.11). Hence, it follows from (4.2) and the previous inequality that
Therefore we obtain
where the right hand side is the largest root of the polynomial function \(x\mapsto \rho x^2-\Vert f\Vert _\pi x-r\!_{{\scriptscriptstyle N}}\). Since \(r_{{\scriptscriptstyle N}}\) tends to \(0\) by (2.15) when \(N\) tends to infinity, \(\sup _{{\scriptscriptstyle N}}\Vert u^{\scriptscriptstyle {(N)}}\Vert _{\pi }<\infty \). Since a ball in the Hilbert space \(\ell ^{2}(\pi )\) is weakly compact [24, p. 126], we can extract from the sequence \((u^{\scriptscriptstyle {(N)}})\) a subsequence weakly converging to some \(u\in \ell ^{2}(\pi )\). Moreover
by [24, Theorem 1, p. 120]. Since the sequence \(((\rho -{L})u^{{\scriptscriptstyle (N)}})\) is also weakly convergent to \(f\) (see above, even strongly convergent in our case), we can apply [14, Problem 5.12, p. 165] to conclude that \(u\in \mathcal {D}\) and \((\rho -{L})u=f\).
At this point, we have proved that for all \(f\in \fancyscript{D}\) the equation \((\rho -{L})u=f\) has a solution in \(\mathcal {D}\).
If \(f\) is nonnegative, Lemma 4.3 implies that all the \(u^{\scriptscriptstyle {(N)}}\) are nonnegative for \(N\) large enough, hence \(u\) is nonnegative.
For every \(w\in \mathcal {D}\), there is a sequence \((w^{(n)})\), with \(w^{(n)}\in \fancyscript{D}\) for all \(n\), converging to \(w\) (in \(\ell ^{2}(\pi )\)) with \(({L}w^{(n)})\) converging to \({L}w\) in \(\ell ^{2}(\pi )\) (see [14, III.5.2]). As before,
Therefore
for all \(w\in \mathcal {D}\). This implies that the equation
has a unique solution \(u\in \mathcal {D}\) for every \(f\in \fancyscript{D}\). This solution, denoted by \(R_{\rho }f\), satisfies
and it is nonnegative if \(f\) is nonnegative. Since \(\fancyscript{D}\) is dense in \(\ell ^{2}(\pi )\), the linear operator \(R_{\rho }\) can be extended to a linear operator on \(\ell ^{2}(\pi )\) with a norm that is at most \(\rho ^{-1}\) (see [14, II.2.2]).
Since \(\fancyscript{D}\) is dense in \(\ell ^{2}(\pi )\), for each \(f\in \ell ^{2}(\pi )\) we can find a sequence \((f^{(k)})\subset \fancyscript{D}\) converging to \(f\) in \(\ell ^{2}(\pi )\). Moreover, \(\big (R_{\rho }f^{(k)})\) converges to \(R_{\rho }f\). Since, for all \(k\), \(R_{\rho }f^{(k)}\in \mathcal {D}\) and \( {L}R_{\rho } f^{(k)}=\rho R_{\rho }f^{(k)}-f^{(k)} \) converges in \(\ell ^2(\pi )\) to \(\rho R_{\rho }f-f\), we conclude, by using [14, III.5.2], that, for every \(f\in \ell ^2(\pi )\), \(R_{\rho } f\in \mathcal {D}\) and
Nonnegativity follows easily. This finishes the proof of the proposition.
We can now prove statement 2 of Theorem 3.1. Using Proposition 4.1, we can apply [24, p. 249] to show that \({L}\) generates a \(C_{0}\) contraction semigroup \(Q_{t}\) in \(\ell ^{2}(\pi )\). For all \(t\ge 0\), the operator \(Q_{t}\) maps nonnegative sequences to nonnegative sequences, since this holds for \(R_{\rho }\) for all \(\rho >0\), using [24, formula 3, p. 246].
4.3 Compactness, self-adjointness and dissipativity
\({L}\) has a compact resolvent in \(\ell ^{2}(\pi )\). From the equation \(\ (\rho - L)\, R_{\rho }= \mathrm{Id}\), we get for every \(f\in \ell ^2(\pi )\)
We are going to verify that each term is uniformly square summable at infinity with respect to the weights \((\pi _{n})\).
This is obvious for the first term since \(\lim _{n\rightarrow \infty }\frac{1}{\lambda _{n}+\mu _{n}+\rho }=0\).
For the other two terms, by using (2.11), we have for all \(N\ge 2\)
Using (2.2) and (2.14) we conclude that for all \(\varepsilon >0\), there exists \(N_{\varepsilon }\) such that for all \(N\ge N_{\varepsilon }\)
Compactness of the resolvent follows.
If \(-\rho \) is an eigenvalue, a corresponding eigenvector \(u\) (in \(\ell ^2(\pi )\)) must satisfy the identities
Therefore, \(u_{1}\) determines all the \(u_{n}\)'s. This implies that all eigenvalues are simple.
Positivity of the eigenvector associated with the maximal eigenvalue \(-\rho _{0}\) follows from the fact that the semigroup preserves nonnegativity, together with the observation that an eigenvector orthogonal to every nonnegative function would have to vanish, which is impossible.
Self-adjointness and dissipativity. Self-adjointness follows by an argument found in [14, Problem V.3.32, p. 279]. In more detail, it follows from Eq. (4.3) that \(\langle u,Lu\rangle _{\pi }\le 0\) for all \(u\in \mathcal {D}\), hence \(L\) is dissipative and the numerical range of \(L\) is contained in the negative real line. By [14, Theorem V.3.2, p. 268], the defect index is constant outside the negative real line, and it equals zero on the positive real line by Proposition 4.1. Therefore the spectrum of \(L\) is contained in the negative real line and \(L\) is self-adjoint by [14, Theorem V.3.16, p. 271].
5 Proof of Theorem 3.2
For every small number \(\rho \), we are going to consider sequences \((u_{n})_{n}\) satisfying
The strategy will be as follows. If \(\rho =0\), \(u^{{\scriptscriptstyle 0}}\) is a solution of (5.1) for all \(n\ge 1\) and the constant sequence \(\,1\,\) is a solution of (5.1) for all \(n\ge 2\). For small \(\rho \ne 0\) and \(n\le n_{*}{\scriptstyle (K)}\), we will look for a solution of (5.1) that is a small perturbation of \(u^{{\scriptscriptstyle 0}}\). Since \(u^{{\scriptscriptstyle 0}} \notin \ell ^2(\pi )\) (see Remark 3.2), we cannot use such an argument for large \(n\). For \(n\ge n_{*}{\scriptstyle (K)}-1\), we will use Levinson’s technique (see [11, 17]) to prove that there is a solution of (5.1) that is almost constant. Then we will match these two solutions in \(\{n_{*}{\scriptstyle (K)}-1, n_{*}{\scriptstyle (K)}\}\). This will be possible for a single value of \(\rho \) that has to be \(\rho _{0}\). Since (5.1) is a recursion of order \(2\), this matched sequence is a solution for all \(n\in \mathbb {N}^{*}\). Finally we will prove that this sequence belongs to \(\mathcal D\) (see Theorem 3.1 for the definition of \(\mathcal D\)).
5.1 When \(1\le n \le n_{*}{\scriptstyle (K)}\)
We look for a solution of the form
where \(u^{{\scriptscriptstyle 0}}=(u^{{\scriptscriptstyle 0}}_{n})\) is defined in (3.3).
Proposition 5.1
There exists a constant \(\widetilde{C}>0\) such that for \(K\) large enough and for each \(\rho \in \left[ -{\scriptstyle 1/(3\widetilde{C}K\log K)},{\scriptstyle 1/(3\widetilde{C}K\log K)}\right] \) the Eq. (5.1) admits for all \(n\le n_{*}{\scriptstyle (K)}\) a solution of the form
where
-
1.
\(\delta _1=0\);
-
2.
\(\delta _n\) is a solution of
$$\begin{aligned} \lambda _{n}\frac{u^{{\scriptscriptstyle 0}}_{n+1}}{u^{{\scriptscriptstyle 0}}_{n}}\big (\delta _{n+1}-\delta _{n}\big )-\mu _{n} \frac{u^{{\scriptscriptstyle 0}}_{n-1}}{u^{{\scriptscriptstyle 0}}_{n}} \big (\delta _{n}-\delta _{n-1}\big ){1\!\!1}_{\{n\ge 2\}} =-\rho \big (1+\delta _{n}\big ); \end{aligned}$$ -
3.
\(1+\delta _{n} >0\) and \( \Vert (\delta _n)\Vert _{\ell ^\infty {\scriptscriptstyle (\{1,\ldots , n_{*} (K)\})}} \le \frac{|\rho |\widetilde{C}K\log K}{1-|\rho |\widetilde{C}K\log K}. \)
-
4.
\(\delta =(\delta _n)_n\) is a smooth function of \(\rho \) and
$$\begin{aligned} \left\| \frac{\mathrm {d}\delta }{\mathrm {d}{\rho }}(\rho ) - \Delta ^{\!{\scriptscriptstyle 0}}\right\| _{\ell ^\infty {\scriptscriptstyle (\{1,\ldots , n_{*} (K)\})}} \le 4(\widetilde{C}K\log K)^2\,|\rho | \end{aligned}$$where
$$\begin{aligned} \Delta ^{\!{\scriptscriptstyle 0}}_n= \sum _{j=1}^{n-1} \frac{1}{\lambda _j \pi _j u_j^{{\scriptscriptstyle 0}} u_{j+1}^{{\scriptscriptstyle 0}}} + \sum _{j=1}^{n-1} \sum _{p=2}^{j}\frac{(u_p^{{\scriptscriptstyle 0}})^2 \pi _p}{\lambda _j\pi _j u_j^{{\scriptscriptstyle 0}} u_{j+1}^{{\scriptscriptstyle 0}}} \quad \mathrm{for\,all}\;n\ge 2 \end{aligned}$$(5.2)and \(\Delta ^{\!{\scriptscriptstyle 0}}_1=0\).
Proof
It is easy to check that
We impose \(\delta _{1}=0\) (i.e. \(v_1=1\)).
We now apply Lemma 9.7 for \(n\ge 2\) with
For \(r>s\), we have
Observing that \(\lambda _1\, u_2^{{\scriptscriptstyle 0}}\,\delta _2=-\rho \), we get
Equation (5.3) can be written as
where \(B\) is a linear operator defined as
Using Lemma 9.3 and the fact that \(\mu _\ell /\lambda _\ell <1\) for \(\ell \le n_{*}{\scriptstyle (K)}-1\), we have the bound
where \(\widetilde{C}>0\) is a constant independent of \(K\) since \(\lambda _p\ge p\tilde{\lambda }(0)\). Therefore
We denote by \(\Omega \) the complex disk centered at the origin and of radius \( \frac{1}{3\widetilde{C} K\log K}. \) For every \(\rho \in \Omega \), the operator \(\mathrm{Id} - \rho B\) is invertible and \(\delta = (\mathrm{Id} - \rho B)^{-1} \,\rho \Delta ^{\!{\scriptscriptstyle 0}}\). It follows from (5.4) that
Therefore, \(\delta \) is bounded in \(\ell ^\infty {\scriptstyle (\{1,\ldots , n_{*} {\scriptscriptstyle (K)}\})}\) by \(\frac{1}{2}\) and \(1+\delta _{n}>0\) for all \(n\le n_{*}{\scriptstyle (K)}\). It also follows that \(\delta =(\delta _n)_{1\le n\le n_{*}{\scriptscriptstyle (K)}}\) is an analytic function on \(\Omega \). We now compute its derivative in \(\Omega \):
Using (5.4) we get that for every \(\rho \in \Omega \)
This finishes the proof of the proposition. \(\square \)
5.2 When \(n\ge n_{*}{\scriptstyle (K)}-1\)
Proposition 5.2
Let \(C\) be the constant defined in Lemma 9.1. For \(K\) large enough and each \(\rho \in \left[ -{\scriptstyle 1/(3CK)},{\scriptstyle 1/(3CK)}\right] \) the Eq. (5.1) admits for all \(n\ge n_{*}{\scriptstyle (K)}-1\) a solution
where
-
1.
\(w_{n_{*}{\scriptstyle (K)}-1}=0\);
-
2.
\(w_n\) is a solution of \(\lambda _{n}(w_{n+1}-w_{n})+\mu _{n}(w_{n-1}-w_{n})=- \rho (1+w_{n})\);
-
3.
\(1+w_{n} >0\) and \( \Vert w_n\Vert _{\ell ^\infty (\{n_*{\scriptscriptstyle (K)} -1,n_*{\scriptscriptstyle (K)} ,\ldots \})}\le \frac{|\rho |CK}{1-|\rho |CK}. \)
-
4.
\(w=(w_n)\) is a smooth function of \(\rho \) and
$$\begin{aligned} \left\| \frac{\mathrm {d}w}{\mathrm {d}\rho }(\rho ) - W^{{\scriptscriptstyle 0}}\right\| _{\ell ^\infty (\{n_*{\scriptscriptstyle (K)} -1,n_*{\scriptscriptstyle (K)} ,\ldots \})} \le 4(CK)^2|\rho | \end{aligned}$$where
$$\begin{aligned} W_n^{{\scriptscriptstyle 0}} = \sum _{j=n_{*}{\scriptscriptstyle (K)}-1} ^{n-1} \sum _{p=j+1}^{\infty } \frac{ \pi _{p}}{\lambda _{j}\pi _{j}}\quad \mathrm{for\,all}\;n\ge n_{*}{\scriptstyle (K)}\end{aligned}$$(5.5)and \(W_{n_*{\scriptscriptstyle (K)} -1}=0\).
Proof
Let us define by induction for \(n\ge n_{*}{\scriptstyle (K)}\),
with \(w_{n_{*}{\scriptscriptstyle (K)} -1}=0\). It is easy to check by using (2.11) that
Equation (5.6) can be written as
where \(A\) is a linear operator defined as
The second assertion in Lemma 9.1 yields the following estimates:
We denote by \(\Omega '\) the complex disk centered at the origin and of radius \(\frac{1}{3CK}\). Thus, if \(\rho \in \Omega '\), the operator \(\mathrm{Id} - \rho A\) is invertible and \(w = (\mathrm{Id} - \rho A)^{-1} \,\rho W^{{\scriptscriptstyle 0}}\). It follows from (5.7) that
Therefore, \(w\) is bounded in \(\ell ^{{\scriptscriptstyle \infty }}(\{n_{*}{\scriptstyle (K)}-1,n_{*}{\scriptstyle (K)},\ldots \})\) by \(\frac{1}{2}\) and \(1+w_{n}>0\) for all \(n\ge n_{*}{\scriptstyle (K)}-1\). It also follows that \(w\) is analytic in \(\Omega '\). Its derivative is
Using (5.7), we get for every \(\rho \in \Omega '\)
The proof of the proposition is complete. \(\square \)
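As a consistency check on (5.5): differentiating the equation in item 2 at \(\rho =0\) (where \(w=0\)) shows that \(W^{{\scriptscriptstyle 0}}\) must satisfy \(\lambda _{n}(W^{{\scriptscriptstyle 0}}_{n+1}-W^{{\scriptscriptstyle 0}}_{n})+\mu _{n}(W^{{\scriptscriptstyle 0}}_{n-1}-W^{{\scriptscriptstyle 0}}_{n})=-1\). The sketch below verifies this numerically for assumed logistic-type rates \(\tilde{\lambda }(x)=2\), \(\tilde{\mu }(x)=1+x\), with the usual reversibility convention \(\pi _1=1\), \(\pi _{n+1}=\pi _n\lambda _n/\mu _{n+1}\) (the normalization of \(\pi \) is an assumption here).

```python
# Assumed logistic-type rates and reversibility convention
# pi_1 = 1, pi_{n+1} = pi_n * lambda_n / mu_{n+1} (normalization assumed).
K = 30
N = 200                      # inner sums in (5.5) truncated at N
nstar = K                    # n*(K) is about K for these rates
lam = {j: 2.0 * j for j in range(1, N + 1)}
mu = {j: j * (1.0 + j / K) for j in range(1, N + 1)}
pi = {1: 1.0}
for j in range(1, N):
    pi[j + 1] = pi[j] * lam[j] / mu[j + 1]

def W0(n):
    # W^0_n from (5.5); W^0_{n*(K)-1} = 0 since the outer sum is then empty
    return sum(sum(pi[p] for p in range(j + 1, N + 1)) / (lam[j] * pi[j])
               for j in range(nstar - 1, n))

# Differentiating lambda_n (w_{n+1}-w_n) + mu_n (w_{n-1}-w_n) = -rho (1+w_n)
# at rho = 0, where w = 0, gives the identity tested here.
residuals = [lam[n] * (W0(n + 1) - W0(n)) + mu[n] * (W0(n - 1) - W0(n)) + 1.0
             for n in range(nstar, nstar + 20)]
print(max(abs(r) for r in residuals))
```

The identity follows from \(\lambda _{n-1}\pi _{n-1}=\mu _n\pi _n\): the two telescoping sums differ by exactly one term, \(-\pi _n/\pi _n=-1\).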
5.3 Matching
We consider \(I=\left[ -{\scriptstyle 1/(3\widetilde{C}K\log K)},{\scriptstyle 1/(3\widetilde{C}K\log K)}\right] \) and \(K\) large enough so that \(\widetilde{C}\log K > C\). With this choice for the interval \(I\), Propositions 5.1 and 5.2 apply for any \(\rho \in I\). We will match the solutions obtained in the two previous subsections in the set \(\{n_{*}{\scriptstyle (K)}-1, n_{*}{\scriptstyle (K)}\}\), namely \(u_n^{{\scriptscriptstyle 0}} (1+\delta _n(\rho ))\) for \(n\le n_{*}{\scriptstyle (K)}\) and \(1+w_n(\rho )\) for \(n\ge n_{*}{\scriptstyle (K)}-1\). We will prove that there is a unique \(\rho \in I\) such that there exists a nonzero constant \(b\) such that for \(n=n_{*}{\scriptstyle (K)}-1\) and \(n=n_{*}{\scriptstyle (K)}\),
We have the following proposition.
Proposition 5.3
Define the function \(f\) by
The minimal positive zero \(\tilde{\rho }_0\) of \(f\) satisfies

where \(H\) is defined in (2.8).
Proof
We are going to find a symmetric interval centered at \(0\) that contains a unique solution of \(f(\rho )=0\). Define the auxiliary function \(g(\rho )=f(\rho )-f(0)\). One can check, using Propositions 5.1 and 5.2 and Lemma 9.6, that for all \(\rho \in I\) one has
where
Let
For all \(K\) large enough we have, using Lemma 9.6 items 1 and 4,
Hence
for all \(K\) large enough by Lemma 9.6. Therefore the function \(g\) is monotone increasing in the interval \((-\eta (K),\eta (K))\) and, since \(g(0)=0\), we have

Now because
we have
This implies that the equation \(g(\rho )=-f(0)\) has a unique solution \(\tilde{\rho }_0\) in \([-\eta (K),\eta (K)]\). (This is a special instance of a more general result on quantitative estimates in the inverse function theorem derived in [22].)
It follows from (5.8) that for all \(\rho \in [-\eta (K),\eta (K)]\)
which implies that
Using (5.10) and statements 1 and 4 in Lemma 9.6, the proposition follows. \(\square \)
We now end the proof of Theorem 3.2. We define a sequence \(\tilde{\varphi }\) by
where \(\delta _{n}(\tilde{\rho }_{0})\) and \(w_{n}(\tilde{\rho }_{0})\) are defined in Propositions 5.1 and 5.2, and
It also follows from these propositions that \(\tilde{\varphi }\) is bounded and hence belongs to \(\ell ^2(\pi )\). In addition, we get for \(n\ge 1\)
Let us consider the sequence \((\tilde{\varphi }^{(k)})_{k\ge 1}\) of elements of \(\ell ^2(\pi )\) defined by \(\tilde{\varphi }^{(k)}_{n} = \tilde{\varphi }_{n} {1\!\!1}_{\{n\le k\}}\). Note that for all \(k\ge 1\), \(\tilde{\varphi }^{(k)}\in \mathcal {D}\). A straightforward computation leads to
Using assumptions (2.12) and (2.15), we can easily prove that
This implies that \(\tilde{\varphi }\in \mathcal {D}\) and \(L\tilde{\varphi }=-\tilde{\rho }_{0} \tilde{\varphi }\). By Theorem 3.1, the eigenvector \(\varphi \) is positive. Hence it cannot be orthogonal in \(\ell ^2(\pi )\) to \(\tilde{\varphi }\), which is strictly positive by Propositions 5.1 and 5.2. Since \(L\) is self-adjoint, this implies that \(\rho _{0} = \tilde{\rho }_{0}\) and \(\varphi = \tilde{\varphi }\).
By Assumption (2.9), it follows that \(K\int _{0}^{{\scriptscriptstyle \frac{1}{K}}} \log \frac{\tilde{\mu }}{\tilde{\lambda }}(x)\mathrm {d}x = \log \frac{\tilde{\mu }}{\tilde{\lambda }}({\scriptstyle \frac{1}{K}}) + {\mathcal O}({\scriptstyle \frac{1}{K}})= \log \frac{\mu _{1}}{\lambda _{1}}+ \mathcal {O}({\scriptstyle \frac{1}{K}})\). Therefore, using Proposition 5.3 we obtain

The estimate for \(\varphi \) follows from Propositions 5.1 and 5.2.
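The two regimes of Theorem 3.2 (\(\varphi _n\approx u^{{\scriptscriptstyle 0}}_n\) for \(n\le n_{*}{\scriptstyle (K)}\), then almost constant) can be observed numerically. In the sketch below, \(u^{{\scriptscriptstyle 0}}\) is taken to be the \(\rho =0\) solution of (5.1) normalized by \(u^{{\scriptscriptstyle 0}}_0=0\), \(u^{{\scriptscriptstyle 0}}_1=1\) (a normalization assumed here, since (3.3) is not reproduced above), and the maximal eigenvector of the truncated symmetrized operator is compared with \(u^{{\scriptscriptstyle 0}}\) on \(\{1,\ldots ,n_{*}{\scriptstyle (K)}\}\), for the same assumed logistic-type rates.

```python
import numpy as np

# Assumed logistic-type rates: lambda~ = 2, mu~(x) = 1 + x, n*(K) ~ K.
K = 50
N = 250
idx = np.arange(1, N + 1)
lam = 2.0 * idx
mu = idx * (1.0 + idx / K)

# u^0: the rho = 0 solution of the recursion, with u^0_0 = 0, u^0_1 = 1
# (assumed normalization): increments are prod_{j<=k} mu_j/lambda_j.
ratios = np.concatenate(([1.0], np.cumprod(mu[:K] / lam[:K])))
u0 = np.cumsum(ratios)               # u0[k] = u^0_{k+1}

# Maximal eigenvector of the symmetrized operator.
S = np.diag(-(lam + mu))
off = np.sqrt(lam[:-1] * mu[1:])
S += np.diag(off, 1) + np.diag(off, -1)
w, V = np.linalg.eigh(S)
psi = V[:, -1]
psi *= np.sign(psi[np.argmax(np.abs(psi))])   # fix the sign (Perron vector)

# phi_n = psi_n / sqrt(pi_n), with pi_1 = 1, pi_{n+1} = pi_n lam_n / mu_{n+1}.
logpi = np.concatenate(([0.0], np.cumsum(np.log(lam[:-1] / mu[1:]))))
phi = psi * np.exp(-0.5 * logpi)

# Relative deviation of phi from (a multiple of) u^0 on {1, ..., n*(K)}.
dev = np.abs(phi[:K] / (phi[0] * u0[:K]) - 1.0).max()
print(dev)
```

Since \(\delta _1=0\) and \(\Vert \delta \Vert _\infty ={\mathcal O}(\rho _0 K\log K)\) with \(\rho _0\) exponentially small, the deviation is tiny.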
6 Proof of Theorem 3.3
6.1 A Poincaré inequality
The proof is based on a Poincaré inequality for the Dirichlet form defined for \(y\in \mathcal{D}\) by
Recall that \(\varphi \) is the eigenvector associated with the maximal eigenvalue \(-\rho _0\) of \(L\) (see Theorem 3.1).
Proposition 6.1
For every \(y\in \mathcal {D}\) such that \(\langle \varphi ,y\rangle _{\pi }=0\), we have
where
Proof
Take any \(y\in \mathcal {D}\). This implies that there exists some integer \(N\) such that \(y_n=0\) for all \(n>N\). We then have
where by convention \(\sum _{1}^{0}=0\). (Recall that \(\bar{y}_{n}\) is the complex conjugate of \(y_{n}\).) Hence, since \(y_{N+1}=0\),
By the Cauchy–Schwarz inequality we get
where
and
Using that \(y_{N+1}=0\) and (2.11) we obtain
since for \(n\ge 2\)
and
Note also that [since \(y_{N+1}=0\) and using (2.11)]
Therefore
and we get from (6.3) and the previous estimate
We now derive an upper bound for \(T_2\), using the assumption that \(\langle \varphi ,y\rangle _{\pi }=0\) in addition to the fact that \(y_n=0\) for all \(n\ge N+1\). In other words
Let \(\tilde{n}\) be a fixed integer over which we will optimize later on. Then we get, using the Cauchy–Schwarz inequality,
We used (6.5) for the second equality, that is, \(\sum _{q=1}^{n}y_{q}\varphi _{q}\pi _q= -\sum _{q=n+1}^{\infty }y_{q}\varphi _{q}\pi _q\). Combining (6.4) and the previous bound we thus get that, if \(\langle \varphi ,y\rangle _{\pi }=0\),
where \(g\) has been defined in (6.2). This implies (6.1) on \(\mathcal {D}\) by closure. \(\square \)
6.2 Lower estimate for the spectral gap
Lemma 6.2
The spectral gap is bounded below by \(g\) defined in (6.2):
Proof
Let us consider an eigenvector \(y\in \mathcal {D}\) with eigenvalue \(-\rho _{1}\). Since \(L\) is self-adjoint in \(\ell ^2(\pi )\), we have \(\langle \varphi ,y\rangle =0\). Therefore we get from inequality (6.1) in Proposition 6.1
and the result follows. \(\square \)
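In the coordinates \(z_n=\sqrt{\pi _n}\,y_n\) the operator \(L\) becomes a symmetric tridiagonal matrix \(S\), and the inequality used in Lemma 6.2 reduces to the spectral theorem: for \(z\) orthogonal to the maximal eigenvector, \(\langle -Sz,z\rangle \ge \rho _1\Vert z\Vert ^2\). A quick numerical check, again with assumed logistic-type rates:

```python
import numpy as np

# Assumed logistic-type rates; z_n = sqrt(pi_n) y_n turns <-Ly, y>_pi into
# the quadratic form of the symmetric matrix -S.
K = 30
N = 120
n = np.arange(1, N + 1)
lam = 2.0 * n
mu = n * (1.0 + n / K)

S = np.diag(-(lam + mu))
off = np.sqrt(lam[:-1] * mu[1:])
S += np.diag(off, 1) + np.diag(off, -1)
w, V = np.linalg.eigh(S)
rho1 = -w[-2]                        # second eigenvalue of L is -rho1

rng = np.random.default_rng(1)
ratios = []
for _ in range(20):
    z = rng.standard_normal(N)
    z -= (V[:, -1] @ z) * V[:, -1]   # impose the analogue of <phi, y>_pi = 0
    ratios.append((z @ (-S) @ z) / (z @ z))
print(min(ratios), rho1)             # the Dirichlet form dominates rho1 ||y||^2
```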
From what precedes, the proof of Theorem 3.3 boils down to proving the following proposition.
Proposition 6.3
For all \(K\ge 2\), \(g\ge \frac{{\mathcal {O}}(1)}{\log K}\) where \(g\) is defined in (6.2).
Before giving the proof of this proposition, we introduce the following technical quantities. Let
Observe that \(x_{**}<\infty \) because of (2.2). Also observe that \(x_{*} < x_{**}\) by the assumptions made on the functions \(\tilde{\lambda }\) and \(\tilde{\mu }\). We also define
We will also need to introduce an integer \(n_{***}{\scriptstyle (K)}\) that is defined as follows. By the assumptions made on the functions \(\tilde{\mu }\) and \(\tilde{\lambda }\) (see (2.3) and (2.4)), there exists a number \(\theta \) such that
Thus we can define the following real number (that is strictly smaller than \(x_*\)).
Then we define the integer
By definition
We now turn to the proof of Proposition 6.3.
Proof
From Lemma 9.3 and Theorem 3.2, we have
Therefore
We now derive an upper bound for each sum.
We first deal with the second sum in (6.10). To this end we write
where
where \(n_{**}{\scriptstyle (K)}\) is defined in (6.7). Using Young’s inequality and Lemma 9.1, we first get
Next we have
We used several facts: \((\mu _q)\) is increasing, \(\tilde{\mu }(x)\ge \tilde{\mu }(0)>0\), and the integers \(n_{*}{\scriptstyle (K)},n_{**}{\scriptscriptstyle (K)}\) are of order \(K\). Finally we have, using Lemma 9.4 and the numbers \(\Lambda _{n,m}\) defined just before that lemma,

For \(x_{*}\le s\le x_{**}\) (see (6.6) for the definition of \(x_{**}\)) we have for some positive constant \(\hat{c}\)
Hence we get

where we have isolated the term \(q=n+1\) that gives \({\mathcal {O}}(1)\). We introduce the new variables \(p=q+n+1\) and \(r=q-n-1\) to get

We now turn to the sum running from \(1\) to \(n_{*}{\scriptstyle (K)}\) in (6.10). We write
where
By using (2.3) and inverting the order of summations we get
where \(n_{***}{\scriptstyle (K)}\) is defined in (6.9). We estimate \(\hat{S}_2\) as follows.
The last estimate follows by splitting the second sum from \(1\) to \(n_{***}{\scriptscriptstyle (K)}/2\) and from \(n_{***}{\scriptscriptstyle (K)}/2\) to \(n_{***}{\scriptscriptstyle (K)}-1\).
Finally, we have the estimates

For \(x_{***}\le s\le x_{*}\) we have
for some constant \(c_{2}>0\), hence

We now use the variables \(p=q+n\) and \(r=q-n\),

Gathering all the bounds, we get the desired result. \(\square \)
7 Proof of Theorem 3.6
7.1 Preliminary estimates
We first derive some useful estimates. Recall that the constant \(c\) has been defined in (3.4).
Proposition 7.1
For all \(K>1\) we have

Proof
Recall that
Assume that \(K\) is large enough so that Propositions 5.1, 5.2 and Lemma 9.3 apply. We obtain
Observe that \(\Vert {1\!\!1}\Vert _\pi ^2=\sum _{j=1}^\infty \pi _j\) and
Now using (3.3) we get for all \(j\le n_{*}{\scriptstyle (K)}-1\)
Hence
We split this sum into three sums, \(s_1\), \(s_2\) and \(s_3\), that we define and estimate as follows. We have
since in this range \(\Lambda _{\ell +1,j}\le \theta ^{\ell -j+1}\) and \(\mu _j \ge j \tilde{\mu }(x_*)\). Next we have
We use the fact that \(\Lambda _{\ell +1,n_{***}{\scriptscriptstyle (K)}+1}\le 1\) and \(\Lambda _{n_{***}{\scriptscriptstyle (K)},j}\le \theta ^{n_{***}{\scriptscriptstyle (K)}-j}\) to get
that can be seen by estimating the sums from \(1\) to \(n_{***}{\scriptscriptstyle (K)}/2\) and from \(n_{***}{\scriptscriptstyle (K)}/2\) to \(n_{***}{\scriptscriptstyle (K)}-1\). Finally
where we first interchange the summations and then follow an argument very similar to the estimate of \(S_3\) in the proof of Proposition 6.3. Therefore we obtain
Now observe that
Since \((u_j^{{\scriptscriptstyle 0}})\) is monotone increasing and using Lemma 9.3 we get
as we have seen above.
Using (7.1) we have
The result follows using (7.2), (7.3), Lemma 9.3, and the estimate
where the first inequality follows again from Lemma 9.3 and the definition of \(V\), while the second inequality is the lower bound in statement 5 in Lemma 9.6. \(\square \)
Note that for every \(A\in \mathcal {P}(\mathbb {N}^{*})\), \({1\!\!1}_{A}\in \ell ^{2}(\pi )\).
Proposition 7.2
There exists \(\bar{C}>0\) such that for all \(t\ge 0\), for all \(K>1\) and for all \(n\in \mathbb {N}^{*}\), we have

where \(c\) is defined in (3.4).
Proof
Let \(\mathcal {Q}\) be the spectral projection on the spectral complement of \(-\rho _{0}\). By spectral theory (see e.g. [14, Theorem V.2.10, p. 260]) we have

Again by spectral theory and the Cauchy–Schwarz inequality

since \(\Vert {1\!\!1}_A\Vert _\pi ^2\le \Vert {1\!\!1}\Vert _\pi ^2 =\sum _{j=1}^\infty \pi _j\). The result follows from the definition of \(P_t\) (see (3.1)) using statement 5 of Lemma 9.6. \(\square \)
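The spectral decomposition behind Proposition 7.2 can be made concrete: conditioned on survival, the law of the process approaches the quasi-stationary distribution \(\nu _m\propto \varphi _m\pi _m\) at rate \(\rho _1-\rho _0\). A sketch with assumed logistic-type rates, computing \(e^{tL}\) through the eigendecomposition of the truncated symmetrized operator:

```python
import numpy as np

# Assumed logistic-type rates: lambda~ = 2, mu~(x) = 1 + x.
K = 30
N = 120
idx = np.arange(1, N + 1)
lam = 2.0 * idx
mu = idx * (1.0 + idx / K)

S = np.diag(-(lam + mu))
off = np.sqrt(lam[:-1] * mu[1:])
S += np.diag(off, 1) + np.diag(off, -1)
w, V = np.linalg.eigh(S)

# pi with the convention pi_1 = 1, pi_{n+1} = pi_n lam_n / mu_{n+1}.
logpi = np.concatenate(([0.0], np.cumsum(np.log(lam[:-1] / mu[1:]))))
sqpi = np.exp(0.5 * logpi)

psi0 = V[:, -1]
psi0 *= np.sign(psi0[np.argmax(np.abs(psi0))])
nu = np.clip(psi0 * sqpi, 0.0, None)
nu /= nu.sum()                       # quasi-stationary distribution

n0 = 20                              # starting state

def conditioned_law(t):
    # Row n0 of e^{tL} via e^{tL} = D^{-1/2} V e^{tw} V^T D^{1/2},
    # then condition on survival.
    p = (V * np.exp(t * w)) @ V[n0 - 1] * sqpi / sqpi[n0 - 1]
    p = np.clip(p, 0.0, None)        # clip tiny negative round-off entries
    return p / p.sum()

tv = {t: 0.5 * np.abs(conditioned_law(t) - nu).sum() for t in (1.0, 6.0)}
print(tv)                            # total variation distance shrinks in t
```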
The estimate in Proposition 7.2 is not satisfactory for large \(n\) since \(\pi _n\) tends to \(0\) as \(n\) tends to infinity. In fact, we can use the fact that the process comes down from infinity to get an estimate on the error that is uniform in \(n\).
Proposition 7.3
There exist three strictly positive constants \(a,c_1,C'\) such that for all \(t\ge 0\), for all \(K>1\) and for all \(n\in \mathbb {N}^{*}\), we have

Proof
For \(q\in {\mathbb {N}}\) define \(T_q=\inf \{t\ge 0 : X_t^{\scriptscriptstyle {K}}=q\}\). From the proof of Proposition 2.3 in [3] we obtain

where

One can prove that \(a>0\) (see Lemma 9.2 for a proof).
Using Chebyshev's inequality we get for all \(t>0\)

For every \(n\ge n_{**}{\scriptstyle (K)}\), we have

By the strong Markov property we have
Using Proposition 7.2 and Lemma 9.4 we obtain

where \(c_1>0\) is a constant independent of \(n, t, A\) and \(K\). Using the Cauchy–Schwarz inequality together with (7.4) and (7.5), we obtain

for all \(t>0\) and for \(K\) large enough so that \(2\rho _0\le a\). Hence

where we used the identity \({\mathbb {E}}_n\left[ e^{\rho _0 T_{n_{**}{\scriptscriptstyle (K)}}}\right] = \frac{\varphi _n}{\varphi _{n_{**}{\scriptscriptstyle (K)}}}\) for all \(n\ge n_{**}{\scriptstyle (K)}\). This identity comes from the fact that the process
is a martingale (where we write \(\varphi (n)\) instead of \(\varphi _n\) for the sake of readability). This relies on the equation \(L\varphi =-\rho _{0}\varphi \). The identity then follows from the Martingale Stopping Theorem (see e.g. [23]). Therefore we obtain

for all \(n\ge n_{**}{\scriptstyle (K)}\). The same bound holds for all \(n<n_{**}{\scriptstyle (K)}\) using Proposition 7.2.
\(\square \)
7.2 Proof of Theorem 3.6
We first establish inequality (3.7). Observe that for every \(B\in \mathcal {P}({\mathbb {N}})\)
Inequality (3.7) follows by using twice Proposition 7.3. This implies the first inequality in the theorem using Proposition 7.1, Theorem 3.2 and statement 3 in Proposition 5.1.
The second inequality in the theorem is proved as follows. Let \(t_1{\scriptstyle (K)}\) be such that for all \(t\ge t_1{\scriptstyle (K)}\)

We start by considering \(t\ge t_1{\scriptstyle (K)}\). We have using Proposition 7.3

The bound follows using again Propositions 7.3 and 7.1, Lemma 9.3 (twice), Theorem 3.2 and Propositions 5.1 and 5.2. To obtain the bound for all \(t<t_1{\scriptstyle (K)}\), observe that the left-hand side is at most equal to \(2\). The bound follows by possibly taking a larger constant (uniformly in \(n\), \(K\) and \(t\)).
8 Proof of Theorem 3.7
Let \(K\) be large enough such that \(n_{1}=n_{*}{\scriptstyle (K)}-\sqrt{K}\log K>0\) and \(n_{2}=n_{*}{\scriptstyle (K)}+\sqrt{K}\log K<n_{**}\scriptstyle {(K)}\). We have
For \(n\le n_{1}\), \(\Lambda _{n_{*}\scriptscriptstyle {(K)},n}\) is increasing, \(\mu _{n}\le {\mathcal {O}}(1)K\) and \(\mu _{n_{*}\scriptscriptstyle {(K)}}\ge 1\) (\(K\) large). Therefore using Lemma 9.4 we get

Using Lemma 9.3, Propositions 5.2 and 5.1, and Theorem 3.2 this implies

For \(n_{2}\le n\le n_{**}\scriptstyle {(K)}\), \(\Lambda _{n,n_{*}\scriptscriptstyle {(K)}}^{-1}\) is decreasing, \(\mu _{n}\le {\mathcal {O}}(1)K\) and \(\mu _{n_{*}\scriptscriptstyle {(K)}}\ge 1\) (\(K\) large), therefore using Lemma 9.4 we have (since \(H''(x_*)>0\))

For \(n\ge n_{**}\scriptstyle {(K)}\) we have
hence

Using Lemma 9.3, Propositions 5.2 and 5.1 and Theorem 3.2 this implies

Finally, for \(n_{*}{\scriptstyle (K)}\le n\le n_{2}\), using Lemma 9.4 we have

The same estimate holds for \(n_{1}\le n\le n_{*}{\scriptstyle (K)}\).
It is easy to verify using Lemma 9.3, Propositions 5.2 and 5.1, Theorem 3.2 and Lemma 9.4 that for \(n_{1}\le n\le n_{2}\)
This implies for \(n_{1}\le n\le n_{2}\)

Therefore, setting

we obtain
We also observe that

for some positive constant \(\tilde{c}\). Theorem 3.7 follows after some easy manipulations of the normalizations.
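For the assumed logistic-type rates used in the earlier sketches, the Gaussian shape asserted in Theorem 3.7 is easy to observe: the quasi-stationary distribution computed from the truncated operator is centered near \(n_{*}{\scriptstyle (K)}=K\) with spread of order \(\sqrt{K/H''(x_*)}\), where \(H''(x_*)=1/(1+x_*)=1/2\) for this model, i.e. about \(\sqrt{2K}\).

```python
import numpy as np

# Assumed logistic-type rates: lambda~ = 2, mu~(x) = 1 + x, so x* = 1,
# n*(K) = K and H''(x*) = 1/2.
K = 30
N = 120
idx = np.arange(1, N + 1)
lam = 2.0 * idx
mu = idx * (1.0 + idx / K)

S = np.diag(-(lam + mu))
off = np.sqrt(lam[:-1] * mu[1:])
S += np.diag(off, 1) + np.diag(off, -1)
w, V = np.linalg.eigh(S)
psi0 = V[:, -1]
psi0 *= np.sign(psi0[np.argmax(np.abs(psi0))])

# Quasi-stationary distribution nu_n proportional to phi_n * pi_n.
logpi = np.concatenate(([0.0], np.cumsum(np.log(lam[:-1] / mu[1:]))))
nu = np.clip(psi0 * np.exp(0.5 * logpi), 0.0, None)
nu /= nu.sum()

mean = (idx * nu).sum()
std = np.sqrt(((idx - mean) ** 2 * nu).sum())
# Gaussian prediction: center K = 30, spread ~ sqrt(2K) ~ 7.75
print(mean, std)
```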
References
Allen, L.J.S.: An Introduction to Stochastic Processes with Applications to Biology. CRC Press, New York (2011)
Barbour, A.D., Pollett, P.K.: Total variation approximation for quasi-stationary distributions. J. Appl. Probab. 47, 934–946 (2010)
Bansaye, V., Méléard, S., Richard, M.: How do birth and death processes come down from infinity?, preprint (2013). arXiv:1310.7402 [math.PR]
Cattiaux, P., Collet, P., Lambert, A., Martínez, S., Méléard, S., San Martín, J.: Quasi-stationary distributions and diffusion models in population dynamics. Ann. Probab. 37(5), 1926–1969 (2009)
Champagnat, N., Villemonais, D.: Exponential convergence to quasi-stationary distribution and Q-process, preprint (2014). arXiv:1404.1349v1 [math.PR]
Cloez, B., Thai, M.N.: Quantitative results for the Fleming-Viot particle system in discrete space, preprint (2014). arXiv:1312.2444v2 [math.PR]
Collet, P., Martínez, S.: Quasi-Stationary Distributions. Probability and its Applications. Springer, New York (2013)
Diaconis, P., Miclo, L.: On quantitative convergence to quasi-stationarity, preprint (2014). arXiv:1406.1805v1 [math.PR]
Doering, C., Sargsyan, K., Sander, L.: Extinction times for birth-death processes: exact results, continuum asymptotics, and the failure of the Fokker-Planck approximation. Multiscale Model. Simul. 3(2), 283–299 (2005)
van Doorn, E.: Quasi-stationary distributions and convergence to quasi-stationarity for birth-death processes. Adv. Appl. Probab. 23, 683–700 (1991)
Fedoryuk, M.: Asymptotic Analysis. Linear Ordinary Differential Equations. Springer, Berlin (1993)
Karlin, S., McGregor, J.L.: The differential equations of birth and death processes and the Stieltjes moment problem. Trans. Am. Math. Soc. 86, 489–546 (1957)
Karlin, S., Taylor, H.M.: An Introduction to Stochastic Modeling, 3rd edn. Academic Press, New York (1998)
Kato, T.: Perturbation Theory for Linear Operators. Springer, New York (1966)
Kessler, D., Shnerb, N.: Extinction rates for fluctuation-induced metastabilities: a real-space WKB approach. J. Stat. Phys. 127(5), 861–886 (2007)
Kurtz, T.G.: Solutions of ordinary differential equations as limits of pure jump Markov processes. J. Appl. Probab. 7, 49–58 (1970)
Levinson, N.: The asymptotic nature of solutions of linear systems of differential equations. Duke Math. J. 15, 111–126 (1948)
Méléard, S., Villemonais, D.: Quasi-stationary distributions and population processes. Probab. Surv. 9, 340–410 (2012)
Nåsell, I.: Extinction and quasi-stationarity in the stochastic logistic SIS model. Lecture Notes in Mathematics, Mathematical Biosciences Subseries, vol. 2022. Springer, New York (2011)
Ovaskainen, O., Meerson, B.: Stochastic models of population extinction. Trends Ecol. Evol. 25, 643–652 (2010)
Sagitov, S., Shahmerdenova, A.: Extinction times for a birth-death process with weak competition. Lith. Math. J. 53, 220–234 (2013)
Sotomayor, J.: Inversion of smooth mappings. Z. Angew. Math. Phys. 41(2), 306–310 (1990)
Revuz, D., Yor, M.: Continuous Martingales and Brownian Motion. Fundamental Principles of Mathematical Sciences, vol. 293. Springer, Berlin (1991)
Yosida, K.: Functional Analysis. Reprint of the sixth (1980) edition. Classics in Mathematics. Springer, Berlin (1995)
Acknowledgments
The third author benefited from the support of the “Chaire Modélisation Mathématique et Biodiversité” funded by Veolia Environnement, the Ecole polytechnique and the Muséum national d’Histoire naturelle. The authors thank the referees for their careful reading and comments.
9 Appendix: some technical lemmas and estimates
Let
Recall that we assume that \(I<+\infty \) [see (2.6)].
Lemma 9.1
There exists \(C\ge 1\) such that for all \(K>1\)
Proof
Using (2.1) we get
This proves the first estimate. Next, by definition of \(n_{*}{\scriptstyle (K)}\), \(n_{**}\scriptstyle {(K)}\) and \(x_{**}\) (see Sects. 2 and 6.2), we have
where we set \(C=(x_{**} + 1)I\) and where we used Young’s inequality to get the second inequality. \(\square \)
Lemma 9.2
The quantity

where \(n_{**}\scriptstyle {(K)}\) is defined in (6.7), is strictly positive.
Proof
The proof follows immediately from the proof of Lemma 9.1, noticing that
\(\square \)
Recall that \(u^{{\scriptscriptstyle 0}}\) is defined in (3.3).
Lemma 9.3
There exists a constant \(C>0\) such that for all \(K\) large enough, and for all \(1\le n\le n_{*}{\scriptstyle (K)}\)
Proof
We take \(K\) large enough such that
where \(x_{***}\) is defined in (6.8). Observe that \(u_n^{{\scriptscriptstyle 0}}\) is increasing, hence for \(n\le n_{*}{\scriptstyle (K)}\)

where \(C>0\) is independent of \(K\). \(\square \)
For \(n> m\) let
By convention we set \(\Lambda _{n,n}=1\). We have the following lemma.
Lemma 9.4
For all \(m,n\in \mathbb {N}^{*}\) such that \(n>m\) we have

where \(H\) is defined in (2.8) and where \(\sup _{m, n, K} |c(m,n,K)| <\infty \).
Proof
By definition (2.1)

where \(h(s):=\log \big (\tilde{\mu }(s)/\tilde{\lambda }(s)\big )\) (\(H'(s)=h(s)\)). Using the trapezoidal rule we get
for some \(\xi _j\in [j,j+1]\). Therefore, using (2.9), we obtain
The result follows. \(\square \)
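The trapezoidal-rule step can be checked directly. Since \(\mu _j/\lambda _j=(\tilde{\mu }/\tilde{\lambda })(j/K)\), one has \(\log \Lambda _{n,m}=\sum _{j=m}^{n-1}h(j/K)\approx K\big (H(n/K)-H(m/K)\big )-\tfrac{1}{2}\big (h(n/K)-h(m/K)\big )\), with an error of order \((n-m)/K^2\), consistent with the bounded \(c(m,n,K)\) in the lemma. A numerical sketch for the assumed rates \(\tilde{\lambda }=2\), \(\tilde{\mu }(x)=1+x\):

```python
import math

# Illustrative (assumed) rates: lambda~(x) = 2, mu~(x) = 1 + x, so
# h(x) = log(mu~/lambda~)(x) = log((1+x)/2) and H(x) = int_0^x h.
K = 200
m, n = 50, 150

def h(x):
    return math.log((1.0 + x) / 2.0)

def H(x):
    # closed-form primitive of h for this model: H(0) = 0, H' = h
    return (1.0 + x) * math.log(1.0 + x) - x - x * math.log(2.0)

# log Lambda_{n,m} = sum of h(j/K), j = m, ..., n-1 (the factors j cancel)
log_Lambda = sum(h(j / K) for j in range(m, n))
trapezoid = K * (H(n / K) - H(m / K)) - 0.5 * (h(n / K) - h(m / K))
err = abs(log_Lambda - trapezoid)
print(err)
```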
Lemma 9.5
Proof
We have

The first sum (plus \(1\)) is equal to

The second sum is bounded similarly and we get
The lemma is proved. \(\square \)
The next lemma is about estimating various quantities: \(u_{n_*{\scriptscriptstyle (K)}}^{{\scriptscriptstyle 0}}-u_{n_*{\scriptscriptstyle (K)}-1}^{{\scriptscriptstyle 0}}\) [where \(u_n^{{\scriptscriptstyle 0}}\) is defined in (3.3)], \(W_{n_*{\scriptscriptstyle (K)}}^{{\scriptscriptstyle 0}}\) [see (5.5) for the definition], \(\Delta _{n_*{\scriptscriptstyle (K)}}^{\!{\scriptscriptstyle 0}}-\Delta _{n_*{\scriptscriptstyle (K)}-1}^{\!{\scriptscriptstyle 0}}\) [where \(\Delta ^{\!{\scriptscriptstyle 0}}_n\) is defined in (5.2)] and \(D(K)\) [that is defined in (5.9)].
Lemma 9.6
For all \(K>1\) we have the following estimates.
-
1.
-
2.
\(W_{n_*{\scriptscriptstyle (K)}}^{{\scriptscriptstyle 0}}= \frac{\sqrt{2\pi }}{2x_* \tilde{\lambda }(x_*)\sqrt{K H''(x_*)}}\left( 1+\frac{(\log K)^3}{\sqrt{K}}\right) ;\)
-
3.
\(\Delta _{n_*{\scriptscriptstyle (K)}}^{\!{\scriptscriptstyle 0}}-\Delta _{n_*{\scriptscriptstyle (K)}-1}^{\!{\scriptscriptstyle 0}} = -\frac{\sqrt{2\pi }}{2x_* \tilde{\lambda }(x_*)\sqrt{K H''(x_*)}} \left( 1+\frac{(\log K)^3}{\sqrt{K}}\right) ;\)
-
4.
-
5.
There exists a constant \(\gamma \in (0,1)\), independent of \(K\), such that

where \(c\) is defined in (3.4).
Proof
The proof of the first statement follows from Lemma 9.4, namely

We continue by estimating \(W_{n_{*}\scriptscriptstyle {(K)}}^{{\scriptscriptstyle 0}}\). Write
We start by estimating \(I_3\). We again make use of Lemma 9.4.
using the monotonicity of \((\mu _{n})_n\) and the definition of \(n_{**}{\scriptstyle (K)}\).
We now estimate \(I_2\).

using the monotonicity of \(H\) and Taylor’s expansion.
Finally we estimate \(I_1\). We use again Lemma 9.4.

The estimation of
is done similarly by decomposing the sum into three sums with the same ranges as before.
The estimation for \(D(K)\) follows immediately from the above estimates and Lemma 9.5.
Finally, the upper bound in statement 5 is obtained as follows. We have
where \(\Lambda _{n,1}=\prod _{j=1}^{n-1} \frac{\mu _j}{\lambda _j}\). Using Lemma 9.4 we get
The second sum is estimated by using the fact that \(\lambda _j/\mu _j<1/2\) for \(j\ge n_{**}{\scriptstyle (K)}\). The first sum is split into a sum from \(1\) to \(n_{***}{\scriptstyle (K)}\) and a sum from \(n_{***}{\scriptstyle (K)}+1\) to \(n_{**}{\scriptstyle (K)}\). In both cases, we use Lemma 9.4 and the steepest descent method for the sum from \(n_{***}{\scriptstyle (K)}+1\) to \(n_{**}{\scriptstyle (K)}\). The lower bound in statement 5 is obtained using
and the steepest descent method as before. This finishes the proof of the lemma. \(\square \)
Consider the linear equations
where \((\alpha _n)_{n\ge 1}\), \((\beta _n)_{n\ge 1}\) and \((h_n)_{n\ge 1}\) are given sequences of real numbers. The coefficients \(\alpha _{n}\) and \(\beta _{n}\) are positive. Define
Note that for \(r\ge s\ge q\)
We have the following lemma.
Lemma 9.7
The general solution of (8.1) in the homogeneous case (\(h_n=0\) for all \(n\ge 1\)) satisfies the recurrence property
In the general case, the solution of (8.1) is
In case of convergence of \(\sum _{p=q}^{\infty }\frac{h_{p}}{\alpha _{p}\;\Theta _{p+1,q}}\), this can be rewritten as
for some constant \(\tilde{A}_{q}\). (We use the convention \(\sum _{q}^{q-1}=0\).)
Proof
For \(n\ge q\) we define \(A_{n+1}\) by
Then
i.e.
and for all \(n\ge q+1\)
where
Then for all \(n\ge q\)
Hence
This implies the first two statements of the lemma. In case of convergence this can be rewritten as
for some constant \(\tilde{A}_{q}\). Indeed, since \(j\ge p\ge q\), we have \(\Theta _{j+1,p+1} = \frac{\Theta _{j+1,q}}{\Theta _{p+1,q}}\). Thus
which implies the last statement of the lemma. \(\square \)
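Assuming (8.1) has the same structure as (5.1), namely \(\alpha _n(y_{n+1}-y_n)-\beta _n(y_n-y_{n-1})=h_n\) with \(\Theta _{r,s}=\prod _{\ell =s}^{r-1}\beta _\ell /\alpha _\ell \) (these forms are inferred from Sect. 5 and are assumptions here), the variation-of-constants formula of Lemma 9.7 can be checked against direct iteration:

```python
import random

# Assumed form of (8.1): alpha_n (y_{n+1}-y_n) - beta_n (y_n-y_{n-1}) = h_n,
# with Theta_{r,s} = prod_{l=s}^{r-1} beta_l / alpha_l (inferred, not quoted).
random.seed(0)
q, N = 1, 30
alpha = {j: random.uniform(0.5, 2.0) for j in range(q, N)}
beta = {j: random.uniform(0.5, 2.0) for j in range(q, N)}
h = {j: random.uniform(-1.0, 1.0) for j in range(q, N)}

def theta(r, s):
    p = 1.0
    for l in range(s, r):
        p *= beta[l] / alpha[l]
    return p

# Direct iteration of the recursion from the data (y_{q-1}, y_q).
y = {q - 1: 0.3, q: 1.1}
for j in range(q, N):
    y[j + 1] = y[j] + (beta[j] * (y[j] - y[j - 1]) + h[j]) / alpha[j]

# Variation of constants: the increments satisfy
# y_{n+1} - y_n = Theta_{n+1,q} (y_q - y_{q-1}
#                                + sum_{p=q}^{n} h_p / (alpha_p Theta_{p+1,q})).
dq = y[q] - y[q - 1]
z = {q - 1: y[q - 1], q: y[q]}
for j in range(q, N):
    s = sum(h[p] / (alpha[p] * theta(p + 1, q)) for p in range(q, j + 1))
    z[j + 1] = z[j] + theta(j + 1, q) * (dq + s)

err = max(abs(y[j] - z[j]) / (1.0 + abs(y[j])) for j in range(q - 1, N + 1))
print(err)
```

In the homogeneous case \(h\equiv 0\) the inner sum vanishes and the increments reduce to \(\Theta _{n+1,q}(y_q-y_{q-1})\), which is the recurrence property stated in the lemma.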
Chazottes, JR., Collet, P. & Méléard, S. Sharp asymptotics for the quasi-stationary distribution of birth-and-death processes. Probab. Theory Relat. Fields 164, 285–332 (2016). https://doi.org/10.1007/s00440-014-0612-6
Mathematics Subject Classification
- Primary 92D25
- Secondary 60J27
- 60J28
- 60J80
- 47A75
- 92D40
