1 Introduction

The simple linear birth and death process, first introduced by Feller (1939), is a widely used Markov model with applications in population growth, epidemiology, genetics and other fields. The basic idea of this process is that the probability of any individual giving birth to a new individual, or of any individual dying, is constant at any moment in time, and that all individuals are independent of each other. Many statistical properties, including the moments, the distribution function, the extinction probability and other quantities of interest, are derived explicitly in the literature; see, for example, Kendall (1949). Statistical inference for simple birth and death processes was then developed by Keiding (1975), where maximum likelihood estimators and related asymptotic results are discussed. Since the distribution function of the simple birth and death process is explicit, the construction of the likelihood function is straightforward. However, it has been pointed out in the literature that the transition probability is cumbersome and numerically unstable when the population grows large over time. A variety of alternative estimation methods have therefore been proposed. For example, Chen and Hyrien (2011) proposed quasi- and pseudo-likelihood estimators, while Crawford et al. (2014) addressed estimation as a missing data problem and applied an EM algorithm. Tavaré (2018) obtained the transition probabilities by numerical inversion of the probability generating function and then applied Bayesian methods to perform estimation. Davison et al. (2021) adopted a saddlepoint approximation method to further improve the accuracy of the transition probabilities.

The bivariate and multivariate birth and death processes were developed in Griffiths (1972, 1973). Griffiths (1972) described the transmission of malaria (the so-called host-vector situation) as a bivariate birth and death process in which there is no direct infection within the same type of population. The author then extended the model to the multivariate case (Griffiths 1973), which can be regarded as an approximation of a general epidemic with several types of infectives. However, due to the intractability of the joint probability generating function, maximum likelihood estimation of the parameters is not implementable. One possible way forward is to use an integer-valued time series to approximate the continuous birth and death process, so that maximum likelihood estimation becomes feasible.

In recent years, there has been a growing interest in modelling integer-valued time series due to the prevalence of count data in different scientific fields such as social science, healthcare, insurance, economics and finance. In particular, for the univariate case, Al-Osh and Alzaid (1987) and McKenzie (1985) were the first to consider an INAR(1) model based on the so-called binomial thinning operator. The idea is to define the operation between coefficients and variables, as well as the innovation terms, in such a way that the values are always integers. One can apply different discrete random variables to describe this operation. For more details, the interested reader may refer to Weiß (2018), Davis et al. (2016), Scotto et al. (2015) and Weiß (2008), among many others.

In this paper, we propose an integer-valued autoregressive model of order one (INAR(1)) to approximate the continuous birth and death process. In this way, the continuous process is approximated by a discrete Markov chain, so that the transition probabilities as well as the likelihood function can be written down explicitly. As the birth and death process in our setting does not include immigration, the innovation term is dropped from the proposed INAR(1) model. Similar to Nelson (1990) and Kirchner (2016), who established the relationship between discrete models and their continuous counterparts, we first need to make sure that our proposed discrete INAR(1) model converges weakly to the birth and death process. We then explore how the proposed model facilitates statistical inference. According to the probability generating function of the simple birth and death process, the death part can be described by a binomial random variable, while the birth part corresponds to a negative binomial. One can then construct a bivariate INAR model based on these random variables to describe the bivariate birth and death process, and even the multivariate one. As the transition probabilities and the likelihood function of the bivariate birth and death process cannot be written down explicitly, our main contribution is that the proposed bivariate INAR(1) model provides a feasible way to estimate the parameters of the bivariate birth and death process by maximum likelihood.

The paper is organized as follows: Sect. 2 reviews the main results on univariate and bivariate birth and death processes with constant rates. Section 3 introduces integer-valued autoregressive models as well as some of their distributional properties. Section 4 constructs discrete semimartingales from the proposed INAR models and proves the weak convergence of the constructed semimartingales to the birth and death processes. A simulation study is carried out in Sect. 5 to illustrate the estimation method via the proposed INAR models and the corresponding properties of the estimators. Some concluding remarks are given in Sect. 6.

2 Univariate and bivariate birth and death processes

In this section, we review the essential elements of simple birth-and-death processes, including moments and other distributional properties. These are well known and extensively discussed in the literature. We then discuss the bivariate case, where analytic expressions for the distribution function are not available.

2.1 Simple univariate birth-and-death process

Suppose that we have a population whose total size evolves as a simple birth and death process \( Z_t \), with constant birth rate \( \lambda \ge 0 \), death rate \( \mu \ge 0 \) and initial population \( Z_0 \in {\mathbb {N}}\). In other words, the probability that any given individual gives birth in a small time interval of length \( \Delta \) is \( \lambda \Delta \), the probability that it dies is \( \mu \Delta \) (both up to \( o(\Delta ) \) terms), and individuals are independent of each other. Let \( P_n (t) = \Pr (Z_t = n) \) be the probability that the total population is n at time t. Then the transition probabilities of the simple birth and death process are characterized by the following ordinary differential equations (ODE)

$$\begin{aligned} {\left\{ \begin{array}{ll} \frac{d P_n(t)}{dt} &{}= \lambda (n-1) P_{n-1} (t) + \mu (n+1) P_{n+1}(t) - (\lambda + \mu ) n P_{n}(t), \quad n \ge 1 \\ P_{Z_0}(0) &{}= 1 \end{array}\right. } \end{aligned}$$
(1)

Applying the linear transform \( \sum _{n} \theta ^n \) to both sides and defining \( \varphi (t,\theta ) = \sum _{n} \theta ^n P_{n}(t) \), we obtain a partial differential equation whose solution \( \varphi \) is the probability generating function of \( Z_t \) with initial population \( Z_0 = a \).

$$\begin{aligned} \begin{aligned} \frac{\partial \varphi }{\partial t}&= \lambda \theta ^2 \frac{\partial \varphi }{\partial \theta } + \mu \frac{\partial \varphi }{\partial \theta } -(\lambda + \mu )\theta \frac{\partial \varphi }{\partial \theta }\\&= (\lambda \theta - \mu ) (\theta -1) \frac{\partial \varphi }{\partial \theta } \\ \varphi (0,\theta )&= \theta ^a \end{aligned} \end{aligned}$$
(2)

This linear PDE can be solved explicitly

$$\begin{aligned} \begin{aligned} \varphi (t,\theta )&= \left( 1 - \alpha (t) + \alpha (t) \frac{\beta (t) \theta }{1 - (1-\beta (t))\theta }\right) ^{Z_0} \\ \alpha (t)&= \frac{(\lambda - \mu ) e^{(\lambda - \mu )t}}{\lambda e^{(\lambda - \mu )t} - \mu }, \quad \beta (t) = \frac{\lambda - \mu }{\lambda e^{(\lambda - \mu )t} - \mu } \end{aligned} \end{aligned}$$
(3)

This probability generating function immediately gives the construction of \( Z_t \) given \( Z_0 \), i.e. as a sum of i.i.d zero-modified geometric random variables

$$\begin{aligned} Z_t \sim \sum _{i=1}^{Z_0} B_i(\alpha (t))G_i(\beta (t)), \end{aligned}$$
(4)

where the \( B_i \) are i.i.d Bernoulli random variables with mean \( \alpha (t) \) and the \( G_i \) are i.i.d geometric random variables with mean \( \frac{1}{\beta (t)} \). Furthermore, from the definition of the transition probability, the linear birth and death process is a pure-jump semimartingale with the following characteristic triplet:

$$\begin{aligned} \begin{aligned}&Ch(Z_t) = {\left\{ \begin{array}{ll} B_t = 0 \\ C_t = 0 \\ \nu (Z_t;dt,dx) = dt K(Z_t,dx) = dt (\lambda Z_{t^-} \delta _{1}(dx) + \mu Z_{t^-} \delta _{-1}(dx)) \end{array}\right. } \\&\int _{R} \left( x^2 \wedge 1 \right) K(Z_t,dx) = (\lambda + \mu ) Z_{t^-} < \infty , \quad \text {given that}\, Z_{t^-} \, \text {is finite} \end{aligned} \end{aligned}$$
(5)
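To make the representation in Eq. (4) concrete, the following sketch draws \( Z_t \) exactly by summing \( Z_0 \) independent Bernoulli–geometric products with parameters \( \alpha (t), \beta (t) \) from Eq. (3); the function names are ours and \( \lambda \ne \mu \) is assumed.

```python
import math
import random

def alpha_beta(lam, mu, t):
    """alpha(t) and beta(t) from Eq. (3); assumes lam != mu."""
    e = math.exp((lam - mu) * t)
    denom = lam * e - mu
    return (lam - mu) * e / denom, (lam - mu) / denom

def sample_Z(lam, mu, t, z0, rng):
    """Exact draw of Z_t given Z_0 = z0: a sum of z0 i.i.d.
    Bernoulli(alpha(t)) x Geometric(beta(t)) products, as in Eq. (4)."""
    a, b = alpha_beta(lam, mu, t)
    total = 0
    for _ in range(z0):
        if rng.random() < a:          # Bernoulli part: this line survives
            g = 1                     # geometric part on {1, 2, ...}
            while rng.random() >= b:
                g += 1
            total += g
    return total
```

A quick sanity check is that the Monte Carlo average of sample_Z is close to the exact mean \( Z_0 e^{(\lambda - \mu )t} \).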

With the help of the piecewise-deterministic Markov process theory in Davis (1984), the infinitesimal generator of the simple birth and death process \( Z_t \), acting on a function f(t, Z) within its domain \( \Omega ({\mathcal {A}}) \), is given by

$$\begin{aligned} {\mathcal {A}} f(t, Z) = \frac{\partial f}{\partial t} + \lambda Z (f(t, Z+1) - f(t, Z)) + \mu Z (f(t, Z-1) - f(t, Z)), \end{aligned}$$
(6)

where \( \Omega ({\mathcal {A}}) \) is the domain of the generator \( {\mathcal {A}} \), consisting of functions f(t, Z) that are differentiable with respect to t for all t, Z and satisfy

$$\begin{aligned} \begin{aligned}&\vert f(t, Z+1) - f(t, Z) \vert< \infty \\&\vert f(t, Z-1) - f(t, Z) \vert < \infty . \end{aligned} \end{aligned}$$
(7)

The first and second moments can be derived by applying the infinitesimal generator to the functions \( f(t,Z) = Z \) and \( f(t,Z) = Z^2 \), which gives

$$\begin{aligned} \begin{aligned} {\mathcal {A}} Z&= \lambda Z(Z+1 - Z) + \mu Z (Z -1 -Z) \\ {\mathcal {A}} Z^2&= \lambda Z( (Z +1)^2 - Z^2) + \mu Z ( (Z-1)^2 - Z^2), \end{aligned} \end{aligned}$$
(8)

which leads to two ODEs,

$$\begin{aligned} \begin{aligned} \frac{d {\mathbb {E}}[Z_t]}{dt}&= (\lambda - \mu ){\mathbb {E}}[Z_t] \\ \frac{d {\mathbb {E}}[Z_t^2]}{dt}&= 2(\lambda - \mu ) {\mathbb {E}}[Z_t^2] + (\lambda + \mu ) {\mathbb {E}}[Z_t] \end{aligned} \end{aligned}$$
(9)

Then, we can solve them explicitly

$$\begin{aligned} \begin{aligned} {\mathbb {E}}[Z_t]&= Z_0 e^{(\lambda - \mu )t} \\ {\mathbb {E}}[Z_t^2]&= Z_0^2 e^{2(\lambda - \mu )t} + \frac{Z_0 (\lambda + \mu )}{(\lambda - \mu )} e^{(\lambda - \mu )t} \left( e^{(\lambda - \mu )t} - 1 \right) \\ Var(Z_t)&= \frac{Z_0 (\lambda + \mu )}{(\lambda - \mu )} e^{(\lambda - \mu )t} \left( e^{(\lambda - \mu )t} - 1 \right) \end{aligned} \end{aligned}$$
(10)

According to the analytic expression of the first moment, the expected population decays exponentially when \( \lambda < \mu \), and in this case the population is bound to become extinct.
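The closed-form moments in Eq. (10), together with the generating function (3) evaluated at \( \theta = 0 \), which gives \( \Pr (Z_t = 0) = (1-\alpha (t))^{Z_0} \), are cheap to evaluate numerically. A small sketch (helper names are ours; \( \lambda \ne \mu \) assumed):

```python
import math

def bd_moments(lam, mu, t, z0):
    """Mean and variance of Z_t from Eq. (10); assumes lam != mu."""
    r = math.exp((lam - mu) * t)
    mean = z0 * r
    var = z0 * (lam + mu) / (lam - mu) * r * (r - 1.0)
    return mean, var

def extinction_prob(lam, mu, t, z0):
    """P(Z_t = 0) = phi(t, 0) = (1 - alpha(t))^z0, with alpha(t) from Eq. (3)."""
    e = math.exp((lam - mu) * t)
    alpha = (lam - mu) * e / (lam * e - mu)
    return (1.0 - alpha) ** z0
```

For \( \lambda > \mu \), the extinction probability increases to \( (\mu /\lambda )^{Z_0} \) as \( t \rightarrow \infty \), while for \( \lambda < \mu \) it tends to one.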

2.2 Bivariate birth-and-death process

Suppose there are two populations \(\textbf{M} = (M_1,M_2)^T \) with initial population \( \textbf{M}_0 \in {\mathbb {N}}^2_+ \). The rate at which the population \( M_1 \) increases by one is \( \lambda _{21} M_2 + \lambda _{11} M_1 \), while the corresponding rate for the population \( M_2 \) is \( \lambda _{12} M_1 + \lambda _{22} M_2\). The subscripts of \( \lambda _{ij} \) indicate that the rate is contributed by population i to population j. The death rates of the two populations are \( \mu _1, \mu _2 \) respectively. The two populations are not independent as long as the cross birth rates \(\lambda _{ij} \ne 0,\ i \ne j\). Denote \( P_{mn}(t) = \Pr (M_{1,t}= m, M_{2,t}= n)\). These probabilities satisfy the following ODE

$$\begin{aligned} {\left\{ \begin{array}{ll} \frac{d P_{m,n}}{dt} =&{} \left( \lambda _{11} (m-1) + \lambda _{21} n \right) P_{m-1,n} + \mu _1 (m+1) P_{m+1,n} \\ &{}+ \left( \lambda _{12} m + \lambda _{22} (n-1) \right) P_{m,n-1} + \mu _2 (n+1) P_{m,n+1}\\ &{}- \left( (\lambda _{11} + \lambda _{12} + \mu _1) m + (\lambda _{21} + \lambda _{22} + \mu _2) n \right) P_{m,n} \\ P_{\textbf{M}_0}(0) =&{} 1, \quad M_{1,0}, M_{2,0} \in {\mathbb {N}}_+ \end{array}\right. } \end{aligned}$$
(11)

Griffiths (1972) introduced this bivariate birth and death process (with \( \lambda _{11} = \lambda _{22} = 0 \)) to describe the host-vector epidemic situation, where the birth probability of each population depends only on the size of the other population, e.g. the transmission of malaria. To obtain the joint probability generating function \( \Psi (t,\theta ,\phi ) = \sum _m \sum _n \theta ^m \phi ^n P_{mn}(t) \), we apply the linear transform \( \sum _m \sum _n \theta ^m \phi ^n \) to both sides of the ODE. The resulting PDE is

$$\begin{aligned} \begin{aligned} \frac{\partial \Psi }{\partial t}&= \lambda _{11} \theta ^2 \frac{\partial \Psi }{\partial \theta } + \lambda _{21} \theta \phi \frac{\partial \Psi }{\partial \phi } + \mu _1 \frac{\partial \Psi }{\partial \theta } + \lambda _{12} \theta \phi \frac{\partial \Psi }{\partial \theta } + \lambda _{22} \phi ^2 \frac{\partial \Psi }{\partial \phi } + \mu _2 \frac{\partial \Psi }{\partial \phi } \\&\quad - \theta (\lambda _{11} + \lambda _{12} + \mu _1) \frac{\partial \Psi }{\partial \theta } - \phi (\lambda _{21} + \lambda _{22} + \mu _2) \frac{\partial \Psi }{\partial \phi } \\&= (\lambda _{11}\theta ^2 + \lambda _{12} \theta \phi + \mu _1 - \theta (\lambda _{11} + \lambda _{12} + \mu _1)) \frac{\partial \Psi }{\partial \theta } \\&\quad + (\lambda _{22}\phi ^2 + \lambda _{21} \theta \phi + \mu _2 - \phi (\lambda _{21} +\lambda _{22} + \mu _2)) \frac{\partial \Psi }{\partial \phi } \\ \Psi (0,\theta ,\phi )&= \theta ^{M_{1,0}} \phi ^{M_{2,0}} \end{aligned} \end{aligned}$$
(12)

This is a semi-linear PDE, whose subsidiary (characteristic) equations are

$$\begin{aligned} \begin{aligned} \frac{d \Psi }{ 0} = \frac{dt}{1}&= \frac{- d\theta }{\lambda _{11}\theta ^2 + \lambda _{12} \theta \phi + \mu _1 - \theta (\lambda _{11} + \lambda _{12} + \mu _1) } \\&= \frac{-d\phi }{\lambda _{22}\phi ^2 + \lambda _{21} \theta \phi + \mu _2 - \phi (\lambda _{21} +\lambda _{22} + \mu _2)} \end{aligned} \end{aligned}$$
(13)

The first fraction does not mean dividing \( d \Psi \) by 0; rather, combined with the second fraction \( \frac{dt}{1} \), it indicates that \(\Psi \) is constant along the characteristics, following Chapter 8 of Bailey (1991). Matching the third and fourth fractions above, we have

$$\begin{aligned} \begin{aligned} \frac{d\theta }{d\phi } = \frac{\lambda _{11}\theta ^2 + \lambda _{12} \theta \phi + \mu _1 - \theta (\lambda _{11} + \lambda _{12} + \mu _1) }{\lambda _{22}\phi ^2 + \lambda _{21} \theta \phi + \mu _2 - \phi (\lambda _{21} +\lambda _{22} + \mu _2)} \end{aligned} \end{aligned}$$
(14)

There appears to be no way to solve this non-linear ODE, and therefore no explicit solution is available for the PDE. However, it can be shown that the PDE admits a unique solution by the existence-uniqueness theorem for quasilinear first-order equations. With regard to its characteristics, similar to the univariate case, this process is a pure-jump semimartingale with the following characteristic triplet:

$$\begin{aligned} \begin{aligned}&Ch(\textbf{M}_t) = {\left\{ \begin{array}{ll} B_t = 0 \\ C_t = 0 \\ \nu (\textbf{M}_t ; dt,dx) = dt K(\textbf{M}_t,dx) = \\ dt ( \tilde{\varvec{\lambda }}_1\delta _{(1,0)}(dx) + \tilde{\varvec{\lambda }}_2 \delta _{(0,1)}(dx) + \tilde{\varvec{\mu }}_1 \delta _{(-1,0)}(dx) + \tilde{\varvec{\mu }}_2 \delta _{(0,-1)}(dx)) \textbf{M}_{t^-} \end{array}\right. }\\&\int _R \left( x^2 \wedge 1 \right) K(\textbf{M}_t,dx) = (\tilde{\varvec{\lambda }}_1 + \tilde{\varvec{\lambda }}_2 + \tilde{\varvec{\mu }}_1 + \tilde{\varvec{\mu }}_2) \textbf{M}_{t^-} < \infty ,\ \text {given that}\, \textbf{M}_{t^-} \, \text {is finite}, \\&\text {where} \\&\tilde{\varvec{\lambda }}_1 = (\lambda _{11},\lambda _{21}), \quad \tilde{\varvec{\lambda }}_2 = (\lambda _{12},\lambda _{22}),\quad \tilde{\varvec{\mu }}_1 = (\mu _1,0), \quad \tilde{\varvec{\mu }}_2 = (0,\mu _2) \end{aligned} \end{aligned}$$
(15)

The moments of this bivariate process can again be derived by applying the infinitesimal generator.

Proposition 1

The first and second moments of the bivariate birth and death process \( \textbf{M}_t = (M_{1,t}, M_{2,t}) \) defined in (11) are given by

$$\begin{aligned} \begin{aligned} {\mathbb {E}}[M_{1,t}]&= M_{1,0} \left( \frac{\lambda _{12} c }{2\lambda _{12}c + \kappa _1 - \kappa _2} e^{(\lambda _{12}c - \kappa _2 ) t} + \frac{\lambda _{12} c + \kappa _1 - \kappa _2}{2\lambda _{12}c + \kappa _1 - \kappa _2} e^{-(\lambda _{12}c + \kappa _1)t} \right) \\&+ M_{2,0} \frac{\lambda _{21} }{2\lambda _{12}c + \kappa _1 - \kappa _2} \left( e^{(\lambda _{12}c - \kappa _2 ) t} - e^{-(\lambda _{12}c + \kappa _1)t} \right) \\ {\mathbb {E}}[M_{2,t}]&= M_{1,0} \frac{\lambda _{12} }{2\lambda _{12}c + \kappa _1 - \kappa _2} \left( e^{(\lambda _{12}c - \kappa _2 ) t} - e^{-(\lambda _{12}c + \kappa _1)t} \right) \\&+ M_{2,0} \left( \frac{\lambda _{12} c +\kappa _1 - \kappa _2 }{2\lambda _{12}c + \kappa _1 - \kappa _2} e^{(\lambda _{12}c - \kappa _2 ) t} + \frac{\lambda _{12} c}{2\lambda _{12}c + \kappa _1 - \kappa _2} e^{-(\lambda _{12}c + \kappa _1)t} \right) , \end{aligned} \end{aligned}$$
(16)

where

$$\begin{aligned} \kappa _1 = \mu _1 - \lambda _{11}, \quad \kappa _2 = \mu _2 - \lambda _{22}, \quad c = \frac{\kappa _2 - \kappa _1 + \sqrt{(\kappa _1 - \kappa _2)^2 + 4\lambda _{21}\lambda _{12}}}{2\lambda _{12}}. \end{aligned}$$

The second moments \( {\mathbb {E}}[M_{1,t}^2], {\mathbb {E}}[M_{2,t}^2] \) and \( {\mathbb {E}}[M_{1,t} M_{2,t}] \) are determined by the following system of ODEs,

$$\begin{aligned} \begin{aligned} \frac{d}{dt}{\mathbb {E}}[M_{1,t}^2]&= -2\kappa _1 {\mathbb {E}}[M_{1,t}^2] + 2\lambda _{21} {\mathbb {E}}[M_{1,t} M_{2,t}] + \lambda _{21} {\mathbb {E}}[M_{2,t}]+ \mu _1 {\mathbb {E}}[M_{1,t}] \\ \frac{d}{dt}{\mathbb {E}}[M_{2,t}^2]&= -2\kappa _2 {\mathbb {E}}[M_{2,t}^2] + 2\lambda _{12} {\mathbb {E}}[M_{1,t} M_{2,t}] + \lambda _{12} {\mathbb {E}}[M_{1,t}] + \mu _2 {\mathbb {E}}[M_{2,t}] \\ \frac{d}{dt} {\mathbb {E}}[M_{1,t} M_{2,t}]&= -(\kappa _1 + \kappa _2) {\mathbb {E}}[M_{1,t} M_{2,t}] + \lambda _{21} {\mathbb {E}}[M_{2,t}^2] + \lambda _{12} {\mathbb {E}}[M_{1,t}^2] \end{aligned} \end{aligned}$$
(17)

Proof

See Appendix A.1. \(\square \)

Note that to ensure that the bivariate process becomes extinct with probability one, we need the (necessary and sufficient) condition \( (\mu _1 - \lambda _{11})(\mu _2 - \lambda _{22}) > \lambda _{12} \lambda _{21} \), according to Griffiths (1973). Many interesting properties of the process have been investigated by Griffiths (1972, 1973). In general, this bivariate birth and death process is not straightforward to apply in practice because there is no explicit solution to the above PDE, and the second moments have to be evaluated by numerical methods. The discrete integer-valued model proposed in the next section provides a possible solution.
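As an illustration of Proposition 1, the first moments in (16) can be evaluated directly and cross-checked against a numerical integration of the first-moment ODEs \( \frac{d}{dt}{\mathbb {E}}[M_{1,t}] = -\kappa _1 {\mathbb {E}}[M_{1,t}] + \lambda _{21} {\mathbb {E}}[M_{2,t}] \) and \( \frac{d}{dt}{\mathbb {E}}[M_{2,t}] = \lambda _{12} {\mathbb {E}}[M_{1,t}] - \kappa _2 {\mathbb {E}}[M_{2,t}] \), which follow from the infinitesimal generator. The sketch below uses our own helper name and assumes \( \sqrt{(\kappa _1 - \kappa _2)^2 + 4\lambda _{21}\lambda _{12}} > 0 \).

```python
import math

def bivariate_mean(t, m10, m20, lam, mu):
    """First moments (E[M_1,t], E[M_2,t]) from Eq. (16).
    lam is the 2x2 birth-rate matrix ((l11, l12), (l21, l22)); mu = (mu1, mu2)."""
    (l11, l12), (l21, l22) = lam
    k1, k2 = mu[0] - l11, mu[1] - l22
    S = math.sqrt((k1 - k2) ** 2 + 4.0 * l21 * l12)  # equals 2*l12*c + k1 - k2
    a = (k2 - k1 + S) / 2.0                          # equals l12 * c
    u1, u2 = a - k2, -(a + k1)                       # the two growth exponents
    e1, e2 = math.exp(u1 * t), math.exp(u2 * t)
    m1 = m10 * (a * e1 + (a + k1 - k2) * e2) / S + m20 * l21 * (e1 - e2) / S
    m2 = m10 * l12 * (e1 - e2) / S + m20 * ((a + k1 - k2) * e1 + a * e2) / S
    return m1, m2
```

An Euler integration of the first-moment ODEs with a small step reproduces these values, which is a convenient consistency check on (16).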

3 Univariate and bivariate INAR models

In this section, we introduce integer-valued autoregressive models which will serve as discrete approximations of the continuous counterparts discussed in the previous section. The derivation of this approximation will demonstrate how to parameterize the bivariate INAR case.

3.1 Univariate INAR model

The classical integer-valued autoregressive (INAR) model is introduced by defining a so-called binomial thinning operator \( \circ \) such that \( \alpha \circ X \) is the sum of X i.i.d Bernoulli random variables with success probability \( \alpha \), i.e.

$$\begin{aligned} \alpha \circ X = \sum _{i=1}^{X} b_i, \quad b_i \overset{i.i.d}{\sim }\ \text {Bernoulli}(\alpha ) \end{aligned}$$
(18)

A well-known Poisson INAR(1) model \( X_t \) is given by

$$\begin{aligned} X_t = \alpha \circ X_{t-1} + R_t, \end{aligned}$$
(19)

where \( \{R_t\} \) are i.i.d Poisson variables with parameter \( \rho \). The key ingredient of an integer-valued model is the operator \( \circ \); one can choose different discrete random variables to construct different integer-valued models. Motivated by the transition probability of the continuous birth and death process, i.e. the sum of i.i.d zero-modified geometric random variables shown in Eq. (4), an INAR model can provide a good approximation by combining \( \circ \) with a geometric operator as defined below.
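As a short illustration of the thinning mechanism (a sketch with our own function names, not code from any referenced package), the following simulates the Poisson INAR(1) recursion (19):

```python
import math
import random

def thin(alpha, x, rng):
    """Binomial thinning alpha ∘ x: sum of x i.i.d. Bernoulli(alpha) draws."""
    return sum(1 for _ in range(x) if rng.random() < alpha)

def poisson_inar1(alpha, rho, x0, n, rng):
    """Simulate X_t = alpha ∘ X_{t-1} + R_t, R_t ~ Poisson(rho), for n steps."""
    path = [x0]
    for _ in range(n):
        u, k, p = rng.random(), 0, math.exp(-rho)  # Poisson draw by CDF inversion
        cdf = p
        while u > cdf:
            k += 1
            p *= rho / k
            cdf += p
        path.append(thin(alpha, path[-1], rng) + k)
    return path
```

A classical property of this chain is that its stationary distribution is Poisson with mean \( \rho /(1-\alpha ) \), which gives a convenient check on the simulator.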

Definition 1

A birth and death INAR(1) model with survival probability \( \alpha \in [0,1] \) and birth probability \( p \in [0,1] \) is defined as

$$\begin{aligned} X_t = p *_1 \alpha \circ X_{t-1}, \end{aligned}$$
(20)

where

  • \( \circ \) is the binomial operator

  • \( *_1 \) is a geometric (reproduction) operator such that \( p*_1 X = \sum _{i=1}^{X} g^{(1)}_i \), with the \( g^{(1)}_i \) being i.i.d geometric random variables with success probability p, whose probability mass function is given by

    $$\begin{aligned} P(g^{(1)}_i = k ) = p(1 - p)^{k-1}, \quad k = 1,2, \dots , \end{aligned}$$
  • \( p*_1 \alpha \circ X = \sum _{i=1}^{\alpha \circ X} g^{(1)}_i\)

Remark The innovation term is dropped as there is no independent immigration process in the birth and death process under investigation.
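A one-step sampler for Definition 1 can be sketched as follows (function name is ours): each of the \( X_{t-1} \) individuals survives with probability \( \alpha \), and every survivor is replaced by a geometric cluster of size at least one.

```python
import random

def bd_inar_step(x, alpha, p, rng):
    """One step of X_t = p *_1 alpha ∘ X_{t-1}: binomial thinning of the x
    individuals, then a Geometric(p) cluster on {1, 2, ...} per survivor."""
    survivors = sum(1 for _ in range(x) if rng.random() < alpha)
    total = 0
    for _ in range(survivors):
        g = 1
        while rng.random() >= p:   # geometric draw on {1, 2, ...}
            g += 1
        total += g
    return total
```

Conditionally on \( X_{t-1} \), the mean of one step is \( (\alpha /p) X_{t-1} \), consistent with Proposition 2 below.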

Proposition 2

The birth and death INAR(1) model has the following statistical properties

  1.

    The probability generating function of \(X_t\) can be iterated backwards so that

    $$\begin{aligned} \begin{aligned} \varphi ^{(I)}(t,\theta )&= {\mathbb {E}}[\theta ^{X_t}] ={\mathbb {E}}\left[ \left( 1 - \alpha + \frac{\alpha p \theta }{1 - (1-p)\theta }\right) ^{X_{t-1}}\right] \\&={\mathbb {E}}\left[ \left( 1 - \alpha _i + \frac{\alpha _i p_i \theta }{1 - (1-p_i)\theta }\right) ^{X_{t-i}}\right] , \quad i = 1, \dots , t \end{aligned} \end{aligned}$$
    (21)

    where

    $$\begin{aligned} \begin{aligned} p_i&= \frac{p^i}{d_{i-1}} \quad \alpha _i = \frac{\alpha ^i}{d_{i-1}} \\ d_i&= p^i \left( 1 + (1-p)\frac{\frac{\alpha }{p} - \left( \frac{\alpha }{p}\right) ^{i+1}}{1 - \frac{\alpha }{p}} \right) \\ \end{aligned} \end{aligned}$$
    (22)

    In other words, the birth and death operator \(p*_1 \alpha \circ \) as a whole is iterable.

    $$\begin{aligned} X_t = p_1 *_1 \alpha _1 \circ X_{t - 1} = p_2 *_1 \alpha _2 \circ X_{t-2} = \dots = p_t *_1 \alpha _t \circ X_0 \end{aligned}$$
    (23)
  2.

    Then the mean, variance and covariance are given by

    $$\begin{aligned} \begin{aligned} {\mathbb {E}}[X_t]&= \frac{\alpha _i}{p_i} {\mathbb {E}}[X_{t-i}] \\ Var(X_{t})&= \left( \frac{\alpha _i (1- p_i)}{p_i^2} + \frac{\alpha _i(1- \alpha _i)}{p_i^2} \right) {\mathbb {E}}[X_{t-i}] + \frac{\alpha _i^2}{p_i^2} Var(X_{t-i}) \\ Cov(X_t, X_{t-i})&= \frac{\alpha _i}{p_i} Var(X_{t-i}) \end{aligned} \end{aligned}$$
    (24)

Proof

See Appendix A.2. \(\square \)

Note that if \( \alpha /p < 1 \), the process \( X_t \) will become extinct eventually. The continuous birth and death process can be approximated by this discrete INAR(1) model by directly matching the probability generating function \(\varphi ^{(I)}\) to \(\varphi \) in Eq. (3), since \( p *_1 \alpha \circ X\) is the sum of X i.i.d zero-modified geometric random variables.
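The iterability in Eqs. (21)–(23) can be verified numerically: composing the one-step zero-modified-geometric probability generating function i times must reproduce the generating function with parameters \( (\alpha _i, p_i) \) from Eq. (22). A small sketch (helper names ours; \( \alpha \ne p \) assumed):

```python
def one_step_pgf(theta, alpha, p):
    """PGF of p *_1 alpha ∘ X for X = 1: a zero-modified geometric, Eq. (21)."""
    return 1.0 - alpha + alpha * p * theta / (1.0 - (1.0 - p) * theta)

def iterated_params(alpha, p, i):
    """(alpha_i, p_i) = (alpha^i, p^i) / d_{i-1} from Eq. (22); assumes alpha != p."""
    r = alpha / p
    d = p ** (i - 1) * (1.0 + (1.0 - p) * (r - r ** i) / (1.0 - r))
    return alpha ** i / d, p ** i / d
```

The i-fold composition of one_step_pgf agrees with one_step_pgf evaluated at the iterated parameters, which is exactly the statement of (23).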

3.2 Bivariate INAR model

Discrete approximation of the univariate birth and death process is relatively simple because the PDE (2) has an explicit solution and hence the distribution is already known. When the dynamics of the two populations are characterized by (11), there is no explicit solution to the corresponding PDE (12). However, from the birth and death INAR(1) model, it is clear that the birth and death probabilities are closely related to binomial and negative binomial random variables. Based on the dynamics (11) and the linear form of the first moments (16), a bivariate INAR(1) model is proposed as follows.

Definition 2

A bivariate birth and death INAR(1) model \( \textbf{Y}_t = (Y_{1,t}, Y_{2,t})^T \) with survival probabilities \( \alpha _1, \alpha _2 \in [0,1] \) and birth probabilities \( \beta _{11}, \beta _{12}, \beta _{21}, \beta _{22} \in [0,1] \) is defined as

$$\begin{aligned} \begin{aligned} Y_{1,t} = \beta _{11} *_1 \alpha _1 \circ Y_{1,t-1} + \beta _{21} *_2 Y_{2,t-1} \\ Y_{2,t} = \beta _{12} *_2 Y_{1,t-1} + \beta _{22} *_1 \alpha _2 \circ Y_{2,t-1}, \end{aligned} \end{aligned}$$
(25)

where

  • \( \circ \) is the binomial operator

  • \( *_2 \) is another geometric (reproduction) operator, different from \( *_1 \), such that \( \beta *_2 X = \sum _{i=1}^{X} g^{(2)}_i \), with the \( g^{(2)}_i \) being i.i.d geometric random variables with success probability \( \beta \) and probability mass function

    $$\begin{aligned} P(g^{(2)}_i = k ) = \beta (1 - \beta )^{k}, \quad k = 0,1,2, \dots , \end{aligned}$$
  • Conditional on \( \textbf{Y}_{t-1} \), the random variables \( \beta _{11} *_1 \alpha _1 \circ Y_{1,t-1}, \ \beta _{21} *_2 Y_{2,t-1}, \ \beta _{12} *_2 Y_{1,t-1} \ \text {and} \ \beta _{22} *_1 \alpha _2 \circ Y_{2,t-1} \) are all independent of each other.

The structure of the bivariate INAR(1) now matches the dynamics of (11), i.e. the birth probability depends on the sizes of both populations while the death probability depends only on the size of its own population. We adopt another geometric random variable \( g^{(2)} \), slightly different from \( g^{(1)} \), because if we used \( g^{(1)} \) then, for example, \( Y_{1,t} \ge Y_{2,t-1} \) for all t, which is not reasonable when \( Y_{1,t-1} < Y_{2,t-1} \).
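A direct sampler for one transition of (25) can be sketched as follows (function names ours). Note that the \( *_2 \) clusters may be empty, which is exactly why \( g^{(2)} \) rather than \( g^{(1)} \) is used for the cross terms.

```python
import random

def geo1(p, rng):
    """Geometric g^(1) on {1, 2, ...} with success probability p."""
    g = 1
    while rng.random() >= p:
        g += 1
    return g

def geo2(b, rng):
    """Geometric g^(2) on {0, 1, 2, ...} with success probability b."""
    return geo1(b, rng) - 1

def bivariate_inar_step(y1, y2, a1, a2, b11, b12, b21, b22, rng):
    """One transition of the bivariate model (25); given (y1, y2), the four
    summands are generated independently."""
    s1 = sum(1 for _ in range(y1) if rng.random() < a1)   # alpha_1 ∘ y1
    s2 = sum(1 for _ in range(y2) if rng.random() < a2)   # alpha_2 ∘ y2
    new1 = sum(geo1(b11, rng) for _ in range(s1)) + sum(geo2(b21, rng) for _ in range(y2))
    new2 = sum(geo2(b12, rng) for _ in range(y1)) + sum(geo1(b22, rng) for _ in range(s2))
    return new1, new2
```

The conditional mean of the first component is \( (\alpha _1/\beta _{11}) Y_{1,t-1} + ((1-\beta _{21})/\beta _{21}) Y_{2,t-1} \), matching Proposition 3 below.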

Proposition 3

The first and second moments of the bivariate INAR(1) defined above are characterized by the following recursive formulas

$$\begin{aligned} \begin{aligned} {\mathbb {E}}[Y_{1,t}]&= \frac{\alpha _1}{\beta _{11}} {\mathbb {E}}[Y_{1,t-1}] + \frac{1-\beta _{21}}{\beta _{21}} {\mathbb {E}}[Y_{2,t-1}] \\ {\mathbb {E}}[Y_{2,t}]&= \frac{1-\beta _{12}}{\beta _{12}} {\mathbb {E}}[Y_{1,t-1}] + \frac{\alpha _2}{\beta _{22}} {\mathbb {E}}[Y_{2,t-1}] \\ Var(Y_{1,t})&= \frac{\alpha _1^2 }{\beta _{11}^2}Var(Y_{1,t-1}) + \frac{\alpha _1(2 - \beta _{11} - \alpha _1)}{\beta _{11}^2} {\mathbb {E}}[Y_{1,t-1}] + \left( \frac{1-\beta _{21}}{\beta _{21}}\right) ^2 Var(Y_{2,t-1}) \\&+ \frac{1-\beta _{21}}{\beta _{21}^2} {\mathbb {E}}[Y_{2,t-1}] + 2\frac{\alpha _1 (1-\beta _{21})}{\beta _{11}\beta _{21}} Cov(Y_{1,t-1},Y_{2,t-1}) \\ Var(Y_{2,t})&= \left( \frac{1-\beta _{12}}{\beta _{12}}\right) ^2 Var(Y_{1,t-1}) + \frac{1-\beta _{12}}{\beta _{12}^2} {\mathbb {E}}[Y_{1,t-1}] + \frac{\alpha _2^2}{\beta _{22}^2} Var(Y_{2,t-1}) \\&+ \frac{\alpha _2 (2-\beta _{22} -\alpha _2) }{\beta ^2_{22}} {\mathbb {E}}[Y_{2,t-1}] + 2\frac{\alpha _2(1-\beta _{12})}{\beta _{12}\beta _{22}} Cov(Y_{1,t-1}, Y_{2,t-1}) \\&Cov(Y_{1,t}, Y_{2,t}) \\&= \left( \frac{\alpha _1 \alpha _2}{\beta _{11}\beta _{22}} + \frac{(1-\beta _{21})(1-\beta _{12})}{\beta _{12} \beta _{21}} \right) Cov(Y_{1,t-1},Y_{2,t-1}) \\&+ \frac{\alpha _1 (1-\beta _{12})}{\beta _{11}\beta _{12}}Var(Y_{1,t-1}) + \frac{\alpha _2 (1-\beta _{21})}{\beta _{21} \beta _{22}} Var(Y_{2,t-1}) \end{aligned} \end{aligned}$$
(26)

Proof

Similar to Proposition 2, the moments can be derived by conditional expectation. The mean and variance of the random variable \( g^{(2)}_i \) with parameter \( \beta \) are \( \frac{1-\beta }{\beta } \) and \( \frac{1-\beta }{\beta ^2} \), respectively. Then the first moment of \( Y_{1,t} \) is

$$\begin{aligned} \begin{aligned} {\mathbb {E}}[Y_{1,t} \vert \textbf{Y}_{t-1}]&= {\mathbb {E}}[\beta _{11} *_1 \alpha _1 \circ Y_{1,t-1} \vert Y_{1,t-1}] + {\mathbb {E}}[\beta _{21} *_2 Y_{2,t-1} \vert Y_{2,t-1}] \\&= \frac{\alpha _1}{\beta _{11}} Y_{1,t-1} + \frac{1-\beta _{21}}{\beta _{21}} Y_{2,t-1} \\ \end{aligned} \end{aligned}$$

The second moments are given by

$$\begin{aligned} \begin{aligned} Var(Y_{1,t} \vert \textbf{Y}_{t-1})&= Var(\beta _{11} *_1 \alpha _1 \circ Y_{1,t-1} \vert Y_{1,t-1}) + Var(\beta _{21} *_2 Y_{2,t-1} \vert Y_{2,t-1}) \\&= \frac{\alpha _1 (2 - \beta _{11} - \alpha _1)}{\beta _{11}^2} Y_{1,t-1} + \frac{1-\beta _{21}}{\beta _{21}^2} Y_{2,t-1} \\ Var(Y_{1,t})&= Var({\mathbb {E}}[ Y_{1,t} \vert \textbf{Y}_{t-1}]) + {\mathbb {E}}[ Var(Y_{1,t} \vert \textbf{Y}_{t-1})], \\ \text {where} \quad Var({\mathbb {E}}[ Y_{1,t} \vert \textbf{Y}_{t-1} ])&= \frac{\alpha _1^2}{\beta _{11}^2} Var(Y_{1,t-1}) + \left( \frac{1-\beta _{21}}{\beta _{21}}\right) ^2 Var(Y_{2,t-1}) \\&+ 2\frac{\alpha _1(1-\beta _{21})}{\beta _{11}\beta _{21}} Cov(Y_{1,t-1},Y_{2,t-1}) \\ Cov(Y_{1,t}, Y_{2,t})&= Cov(\beta _{11} *_1 \alpha _1 \circ Y_{1,t-1}, \beta _{12} *_2 Y_{1,t-1}) \\&+ Cov(\beta _{11} *_1 \alpha _1 \circ Y_{1,t-1}, \beta _{22} *_1 \alpha _2 \circ Y_{2,t-1}) \\&+ Cov(\beta _{21} *_2 Y_{2,t-1}, \beta _{12} *_2 Y_{1,t-1}) + Cov(\beta _{21} *_2 Y_{2,t-1}, \beta _{22} *_1 \alpha _2 \circ Y_{2,t-1}) \\&= \frac{\alpha _1(1-\beta _{12})}{ \beta _{11} \beta _{12}} Var(Y_{1,t-1}) + \frac{\alpha _1 \alpha _2}{\beta _{11}\beta _{22}} Cov(Y_{1,t-1}, Y_{2,t-1}) \\&+ \frac{(1-\beta _{12})(1-\beta _{21})}{\beta _{12}\beta _{21}} Cov(Y_{2,t-1}, Y_{1,t-1}) + \frac{\alpha _2 (1-\beta _{21})}{\beta _{21}\beta _{22}}Var(Y_{2,t-1}) \end{aligned} \end{aligned}$$

The first and second moments of \( Y_{2,t} \) can be derived in a similar way. \(\square \)

Proposition 4

If the eigenvalues \( \eta _1, \eta _2 \) of the following matrix

$$\begin{aligned} A = \begin{bmatrix} \frac{\alpha _1}{\beta _{11}} &{} \frac{1-\beta _{21}}{\beta _{21}} \\ \frac{1-\beta _{12}}{\beta _{12}} &{} \frac{\alpha _2}{\beta _{22}} \end{bmatrix} \end{aligned}$$
(27)

lie strictly inside the unit interval, i.e. \( \eta _1, \eta _2 \in (-1, 1) \), then the bivariate population \( \textbf{Y}_t \) will become extinct eventually.

Proof

The first moment can be expressed in a matrix form

$$\begin{aligned} {\mathbb {E}}[\textbf{Y}_t] = A {\mathbb {E}}[\textbf{Y}_{t-1}] = A^t {\mathbb {E}}[\textbf{Y}_0] \end{aligned}$$
(28)

The tth power of the matrix here denotes the t-fold matrix product. By eigen-decomposition, the power of the matrix can be expressed as

$$\begin{aligned} A^t = Q \mathop {\textrm{diag}}\nolimits (\{\eta _1^t, \eta _2^t\}) Q^{-1}, \end{aligned}$$
(29)

where \( Q = (\nu _1, \nu _2)\) is the eigenvector matrix, with \( \nu _1, \nu _2 \) the eigenvectors corresponding to \( \eta _1,\eta _2 \). Now it is clear that \( {\mathbb {E}}[\textbf{Y}_t] \) decreases to zero as t grows when \( \eta _1, \eta _2 \in (-1,1)\). \(\square \)
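The extinction criterion is easy to check numerically. The sketch below (helper names ours) builds the matrix A of (27), computes its eigenvalues from the characteristic polynomial (the discriminant is non-negative here because the off-diagonal entries of A are non-negative), and iterates the mean recursion (28):

```python
import math

def mean_matrix(a1, a2, b11, b12, b21, b22):
    """Matrix A of Proposition 4, governing E[Y_t] = A E[Y_{t-1}]."""
    return ((a1 / b11, (1 - b21) / b21),
            ((1 - b12) / b12, a2 / b22))

def eigenvalues_2x2(A):
    """Eigenvalues of a 2x2 matrix via trace and determinant; real case,
    since the discriminant (a - d)^2 + 4bc is non-negative when bc >= 0."""
    (a, b), (c, d) = A
    tr, det = a + d, a * d - b * c
    disc = math.sqrt(tr * tr - 4.0 * det)
    return (tr + disc) / 2.0, (tr - disc) / 2.0
```

When both eigenvalues lie strictly inside the unit interval, repeated multiplication by A drives the mean vector to zero, which is the content of the proposition.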

4 Weak convergence to continuous birth and death process

In this section, we construct two continuous-time processes from the INAR models proposed above. Under a suitable parametrization, these processes converge weakly to the aforementioned continuous birth and death processes as the length of the subinterval goes to 0.

4.1 Construction of continuous processes

Since the continuous birth and death processes are clearly semimartingales defined on non-negative state spaces, to apply the limit theorem for locally bounded semimartingales we need to construct 'continuous' processes on a dense subset of \( {\mathbb {R}}_+ \) (we take \( t \in [0,1] \) for convenience) and compute their characteristic triplets from the discrete INAR models. Once everything is set up, we can apply the weak convergence theorem for semimartingales to prove the result. The construction mainly follows Jacod and Shiryaev (2013, Chapter II, Section 3).

Starting with a discrete basis \( {\mathcal {B}} = (\mathbf {\Omega }, \textbf{F},({\mathcal {F}}_n)_{n \in {\mathbb {N}}},\textbf{P}) \), assume that the INAR models \( X_n \) and \( \textbf{Y}_n \) defined above are adapted to this discrete stochastic basis, as are the increment processes

$$\begin{aligned} \begin{aligned}&U_k = X_k - X_{k-1}, \quad U_0 = X_0 \\&\textbf{V}_k = \textbf{Y}_k - \textbf{Y}_{k-1}, \quad \textbf{V}_0 = \textbf{Y}_0, \quad k = 0,1,2, \dots \end{aligned} \end{aligned}$$
(30)

then we can construct 'continuous' processes via a time change.

Definition 3

Given a fixed time interval [0, 1], one can define an equally spaced grid of size n such that each subinterval has length \( \Delta = \frac{1}{n} \). The following processes:

$$\begin{aligned} Z_t^{(n)} = \sum _{k=0}^{\sigma _t} U_k, \quad \textbf{M}_t^{(n)} = \sum _{k=0}^{\sigma _t} \textbf{V}_k, \end{aligned}$$
(31)

where \( \sigma _t = \lfloor tn \rfloor \), are adapted to the continuous-time basis \( \tilde{{\mathcal {B}}} = (\mathbf {\Omega }, \textbf{F}, G = (\mathfrak {g}_t)_{t\ge 0}, \textbf{P}) \). The parameter setting for \( Z_t^{(n)} \) is

$$\begin{aligned} \alpha = \frac{(\lambda - \mu )e^{(\lambda - \mu )\Delta }}{\lambda e^{(\lambda - \mu )\Delta } - \mu }, \quad p = \frac{\lambda - \mu }{\lambda e^{(\lambda - \mu )\Delta } - \mu }. \end{aligned}$$
(32)

The parameter setting for \( M_t^{(n)} \) is

$$\begin{aligned} \begin{aligned} \alpha _1&= \frac{(\lambda _{11} - \mu _1) \omega _1(\Delta ) }{\lambda _{11}\omega _1(\Delta ) - \mu _1 }, \quad \alpha _2 = \frac{(\lambda _{22} - \mu _2) \omega _2(\Delta ) }{\lambda _{22}\omega _2(\Delta ) - \mu _2 } \\ \beta _{11}&= \frac{\lambda _{11} - \mu _1}{\lambda _{11}\omega _1(\Delta ) - \mu _1},\quad \beta _{22} = \frac{\lambda _{22} - \mu _2}{\lambda _{22}\omega _2(\Delta ) - \mu _2}\\ \beta _{21}&= \left( 1 + C_{\beta _1} \left( e^{ u_1 \Delta } - e^{u_2\Delta } \right) \right) ^{-1}, \quad \beta _{12} = \left( 1 + C_{\beta _2} \left( e^{u_1\Delta } - e^{u_2\Delta } \right) \right) ^{-1}, \end{aligned} \end{aligned}$$
(33)

where

$$\begin{aligned} \begin{aligned} \omega _1 (\Delta )&= C_{\alpha } e^{u_1\Delta } + (1 -C_{\alpha }) e^{u_2\Delta }, \quad \omega _2 (\Delta ) = (1-C_{\alpha }) e^{u_1 \Delta } + C_{\alpha } e^{u_2\Delta } \\ C_{\alpha }&= \frac{\lambda _{12} c }{2\lambda _{12}c + \mu _1 - \mu _2}, \ C_{\beta _1} = \frac{\lambda _{21} }{2\lambda _{12}c + \kappa _1 - \kappa _2}, \ C_{\beta _2} = \frac{\lambda _{12} }{2\lambda _{12}c + \kappa _1 - \kappa _2} \\ u_1&= \lambda _{12} c - \kappa _2, \quad u_2 = -(\lambda _{12} c + \kappa _1),\quad \kappa _i = \mu _i - \lambda _{ii}, \ i= 1,2 \\ c&= \frac{\kappa _2 - \kappa _1 + \sqrt{(\kappa _1 - \kappa _2)^2 + 4\lambda _{21}\lambda _{12}}}{2\lambda _{12}}. \end{aligned} \end{aligned}$$

It is straightforward to derive the parameter setting in the univariate case, since we only need to match the parameters via the probability generating functions of \( Z_t^{(n)} \) and \( Z_t \). In the other case, where a closed-form probability generating function for \( \textbf{M}_t \) is not available, we need to find another way to express \(\alpha _i\) and \(\beta _{ij}\) in terms of \(\lambda \) and \(\mu \). A direct approach is to match the first and second order moments. Matching the moment equations (26) to (16), we can find the mapping of \( \beta _{12}, \beta _{21} \) in terms of \( \lambda _{ij}, \ \mu _{i}, \ i,j \in \{1,2\} \). Unfortunately, only the ratio \( \alpha _i /\beta _{ii} \) is identified this way. Nevertheless, the parameter setting in the univariate case shows us how to distribute the ratio \( \alpha /p \) between \( \alpha \) and p, and \( \alpha _i, \beta _{ii}\) can then be set up in a similar way.
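To make this mapping concrete, the univariate parametrization in Eq. (32) can be sketched in a few lines of Python (a sketch under our own naming; the numerical experiments later in the paper use R):

```python
import math

def inar_params(lam, mu, delta):
    """Map the birth/death rates (lambda, mu) and step size delta to the
    INAR parameters (alpha, p) of Eq. (32).  Assumes lam != mu."""
    theta = lam - mu                      # net growth rate lambda - mu
    e = math.exp(theta * delta)
    alpha = theta * e / (lam * e - mu)    # thinning (survival) probability
    p = theta / (lam * e - mu)            # geometric offspring parameter
    return alpha, p

# the ratio alpha / p equals e^{(lambda - mu) delta}, the per-step growth factor
alpha, p = inar_params(lam=1.5, mu=1.0, delta=0.01)
```

Note that the ratio \( \alpha / p = e^{(\lambda - \mu )\Delta } \) is exactly the per-step growth factor used below to recover the rates.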

Proposition 5

With the above parameter setting and any non-negative integer m, the transition probabilities for \( Z_t^{(n)} \) conditional on \( Z_{t-\Delta }^{(n)} = k \) are

$$\begin{aligned} \begin{aligned}&\Pr (Z_t^{(n)} = k + m \vert Z_{t-\Delta }^{(n)} = k) = \left( {\begin{array}{c}k+m-1\\ k-1\end{array}}\right) (\lambda \Delta )^m + o(\Delta ^m) \\&\Pr (Z_t^{(n)} = k - m \vert Z_{t-\Delta }^{(n)} = k) = \left( {\begin{array}{c}k\\ k-m\end{array}}\right) (\mu \Delta )^m + o(\Delta ^m) \end{aligned} \end{aligned}$$
(34)

The above probabilities can be simplified as,

$$\begin{aligned} \begin{aligned}&\Pr \left( Z_t^{(n)} = k + 1 \vert Z_{t-\Delta }^{(n)} = k\right) = \lambda k \Delta + o(\Delta )\\&\Pr \left( Z_t^{(n)} = k - 1 \vert Z_{t-\Delta }^{(n)} = k\right) = \mu k \Delta + o(\Delta ) \\&\Pr \left( \vert Z_t^{(n)} - k \vert \ge 2 \vert Z_{t-\Delta }^{(n)} = k\right) = o(\Delta ) \end{aligned} \end{aligned}$$
(35)

On the other hand, the transition probabilities for \( \textbf{M}_t^{(n)} \) conditional on \( \textbf{M}_{t-\Delta }^{(n)} = \textbf{k} = (k_1, k_2) \) are given by

$$\begin{aligned} \begin{aligned}&\Pr (M_{i,t}^{(n)} = k_i + m \vert \textbf{M}_{t-\Delta }^{(n)} = \textbf{k} ) \\&\quad = \sum _{j = k_i}^{k_i + m} \left( {\begin{array}{c}j-1\\ k_i - 1\end{array}}\right) \left( {\begin{array}{c}k_1 + k_2 + m - j - 1\\ k_{i'} - 1\end{array}}\right) (\lambda _{ii}\Delta )^{j - k_i} (\lambda _{i',i}\Delta )^{k_i+m - j} + o(\Delta ^m) \\&\quad \Pr (M_{i,t}^{(n)} = k_i - m \vert \textbf{M}_{t-\Delta }^{(n)} = \textbf{k} ) = \left( {\begin{array}{c}k_i\\ k_i - m\end{array}}\right) (\mu _i \Delta )^m + o(\Delta ^m), \end{aligned} \end{aligned}$$
(36)

where \(i \in \{1,2\} \) and \( i' = 3 - i \). Due to the conditional independence structure of the bivariate INAR model, the joint transition probabilities for \( \textbf{M}_t^{(n)} \) conditional on \( \textbf{M}_{t-\Delta }^{(n)} \) are

$$\begin{aligned} \begin{aligned}&\Pr ( M_{1,t}^{(n)} = k_1 \pm m_1, M_{2,t}^{(n)} = k_2 \pm m_2 \vert \textbf{M}_{t-\Delta }^{(n)}=\textbf{k}) \\&\quad =\Pr (M_{1,t}^{(n)} = k_1 \pm m_1 \vert \textbf{M}_{t-\Delta }^{(n)} = \textbf{k} ) \Pr (M_{2,t}^{(n)} = k_2 \pm m_2 \vert \textbf{M}_{t-\Delta }^{(n)} = \textbf{k} ) \end{aligned} \end{aligned}$$
(37)

Similarly, the above probabilities can be simplified as

$$\begin{aligned} \begin{aligned}&\Pr (M_{i,t}^{(n)} = k_i + 1 \vert \textbf{M}_{t-\Delta }^{(n)} = \textbf{k} ) = \lambda _{ii}k_i \Delta + \lambda _{i',i}k_{i'}\Delta + o(\Delta ) \\&\Pr (M_{i,t}^{(n)} = k_i - 1 \vert \textbf{M}_{t-\Delta }^{(n)} = \textbf{k} ) = \mu _i k_i \Delta + o(\Delta ) \\&\Pr \left( \vert M_{i,t}^{(n)} - k_i \vert \ge 2 \vert \textbf{M}_{t-\Delta }^{(n)} = \textbf{k} \right) = o(\Delta ) \end{aligned} \end{aligned}$$
(38)

Proof

See Appendix A.3. \(\square \)

The above transition probabilities have exactly the same form as their continuous counterparts when \( m = 1 \). Consequently, the Lévy measures of \( Z^{(n)}_t \) and \( \textbf{M}_t^{(n)} \) have a structure similar to that of their continuous counterparts.

Proposition 6

The continuous processes \( Z_t^{(n)} \) and \( \textbf{M}_t^{(n)} \) defined above are semimartingales with the following characteristic triplets.

$$\begin{aligned} \begin{aligned}&Ch(Z_t^{(n)}) = {\left\{ \begin{array}{ll} &{}B_t = 0 \\ &{}C_t = 0 \\ &{}\nu ([0,t] \times g) = \sum _{k=1}^{\sigma _t} (g(1)\lambda + g(-1)\mu )X_{k-1}\Delta + O(\Delta ) \end{array}\right. } \\&Ch(\textbf{M}_t^{(n)}) = {\left\{ \begin{array}{ll} \textbf{B}_t = 0 \\ \textbf{C}_t = 0 \\ \mathbf {\nu }([0,t] \times g) = &{} \sum _{k=1}^{\sigma _t} \left( g(1,0)\tilde{\varvec{\lambda }}_{1} + g(-1,0)\tilde{\varvec{\mu }}_1\right) \textbf{Y}_{k-1}\Delta \\ &{}+ \left( g(0,1)\tilde{\varvec{\lambda }}_2 + g(0,-1)\tilde{\varvec{\mu }}_2\right) \textbf{Y}_{k-1}\Delta \\ {} &{}+ O(\Delta ), \end{array}\right. } \end{aligned} \end{aligned}$$
(39)

where g is a continuous, non-negative, bounded Borel function vanishing near 0, defined on the state spaces of \( Z_t^{(n)} \) and \( \textbf{M}_t^{(n)} \) respectively, the truncation function is \( h = \vert x \vert \textbf{1}_{\{\vert x \vert < 1 \}} \) and

$$\begin{aligned} \tilde{\varvec{\lambda }}_1 = (\lambda _{11},\lambda _{21}), \quad \tilde{\varvec{\lambda }}_2 = (\lambda _{12},\lambda _{22}),\quad \tilde{\varvec{\mu }}_1 = (\mu _1,0), \quad \tilde{\varvec{\mu }}_2 = (0,\mu _2) \end{aligned}$$

Proof

See Appendix A.4. \(\square \)

Theorem 7

With the definition and the parametrization above, and the initial distribution condition:

$$\begin{aligned} Z_0^{(n)} = Z_0, \quad \textbf{M}_0^{(n)} = \textbf{M}_0, \end{aligned}$$
(40)

the processes \( Z_t^{(n)} \) and \( \textbf{M}_t^{(n)} \) converge weakly to the continuous birth and death processes \( Z_t \) and \( \textbf{M}_t \).

$$\begin{aligned} \begin{aligned}&Z_t^{(n)} \overset{w}{\rightarrow }\ Z_t \\&\textbf{M}_t^{(n)} \overset{w}{\rightarrow }\ \textbf{M}_t, \end{aligned} \end{aligned}$$
(41)

when the size of subinterval \( \Delta \) goes to 0 or equivalently, \( n \rightarrow \infty \).

Proof

Here we simply apply Theorem 3.39 from Jacod and Shiryaev (2013, Chapter IX, Section 3), the limit theorem for semimartingales in the locally bounded case.

  1. The local strong majorization hypothesis: for both \( Z_t \) and \( \textbf{M}_t \), the first two terms of the characteristic triplets are 0 and the stochastic integrals with respect to the Lévy measures are clearly finite on [0, 1].

  2. Local conditions on big jumps: for both \( Z_t \) and \( \textbf{M}_t \), there is no jump with absolute size greater than 1.

  3. Local uniqueness: for every choice of initial distributions for \( Z_0 \) and \( \textbf{M}_0 \), the Lévy measures are uniquely characterized by their (joint) probability distribution functions.

  4. The continuity condition: the characteristic triplets \(B_t(\omega ), C_t(\omega ), \nu (\omega ; dt, dx) \) of \( Z_t \) and \( \textbf{M}_t \) are continuous with respect to \( \omega \).

  5. Weak convergence of the initial distributions: this is assumed in Eq. (40).

  6. Convergence of the characteristic triplets of the discrete processes to those of their continuous counterparts: this can be proved by showing uniform convergence of the Lévy measures. For every \( a >0 \), define a stopping time for the population process:

    $$\begin{aligned} S_a(X) = \inf \left\{ t: \vert X_t \vert> a, \ \text {or} \ \vert X_{t^-} \vert > a \right\} \end{aligned}$$
    (42)

For the univariate case, the stochastic integral \( g * \nu \) for any Borel function g is given by

$$\begin{aligned} \begin{aligned} (g* \nu _{t \wedge S_a})\circ Z^{(n)} =&g *\nu (Z^{(n)};[0,t \wedge S_a(Z^{(n)})],R) \\ =&\int _0^{t \wedge S_a(Z^{(n)})} \int _{R} g(x) (\lambda \delta _1(dx) + \mu \delta _{-1}(dx)) Z^{(n)}_{s^-} ds \\ =&\int _0^{t \wedge S_a(Z^{(n)})} ( g(1)\lambda + g(-1)\mu ) Z^{(n)}_{s^-} ds \\ =&\sum _{k=1}^{\sigma _{t \wedge S_a(Z^{(n)})}} \left( g(1) \lambda + g(-1) \mu \right) Z^{(n)}_{k-1}\Delta \\&+ \left( g(1)\lambda + g(-1) \mu \right) Z^{(n)}_{\sigma _{t \wedge S_a(Z^{(n)})}} \left( t \wedge S_a(Z^{(n)}) - \sigma _{t \wedge S_a(Z^{(n)})} \Delta \right) \\ \end{aligned} \end{aligned}$$
(43)

and the absolute difference of two stochastic integrals is given by,

$$\begin{aligned} \begin{aligned}&\vert g * \nu ^n_{t \wedge S_a } - (g* \nu _{t \wedge S_a})\circ Z^{(n)} \vert \\&\quad = \left| O(\Delta ) + \left( g(1)\lambda + g(-1) \mu \right) Z_{\sigma _{t \wedge S_a(Z^{(n)})}} \left( t \wedge S_a(Z^{(n)}) - \sigma _{t \wedge S_a(Z^{(n)})} \Delta \right) \right| \\&\quad \le O(\Delta ) + \vert g(1)\lambda + g(-1) \mu \vert Z_{\sigma _{t \wedge S_a(Z^{(n)})}} \left( t \wedge S_a(Z^{(n)}) - \sigma _{t \wedge S_a(Z^{(n)})} \Delta \right) \end{aligned} \end{aligned}$$
(44)

All the quantities inside \( \vert \cdot \vert \) are finite, so for every \( \xi >0 \) there exists a natural number N such that for \( n > N \), we have

$$\begin{aligned} \vert g * \nu ^n_{t \wedge S_a } - (g* \nu _{t \wedge S_a})\circ Z^{(n)} \vert < \xi \end{aligned}$$
(45)

and hence \( g * \nu ^n_{t \wedge S_a } \) converges uniformly to \( (g* \nu _{t \wedge S_a})\circ Z^{(n)} \). For the bivariate case, the stochastic integral \( g*\nu \), where \( \nu \) is the Lévy measure of \( \textbf{M} \), for any Borel function g is given by

$$\begin{aligned} \begin{aligned}&(g* \nu _{t \wedge S_a}) \circ \textbf{M}^{(n)} = g* \nu (\textbf{M}^{(n)};[0,t\wedge S_a(\textbf{M}^{(n)})],R)\\ =&\int _0^{t \wedge S_a(\textbf{M}^{(n)})} \int _{R} g(x) (\tilde{\varvec{\lambda }}_1 \delta _{(1,0)}(dx) + \tilde{\varvec{\lambda }}_2\delta _{(0,1)}(dx) \\&+ \tilde{\varvec{\mu }}_1\delta _{(-1,0)} (dx) + \tilde{\varvec{\mu }}_2 \delta _{(0,-1)}(dx))\textbf{M}_{s^-}^{(n)} ds \\ =&\int _0^{t \wedge S_a(\textbf{M}^{(n)})} \left( g(1,0)\tilde{\varvec{\lambda }}_1 + g(0,1)\tilde{\varvec{\lambda }}_2 +g(-1,0)\tilde{\varvec{\mu }}_1 + g(0,-1)\tilde{\varvec{\mu }}_2 \right) \textbf{M}_{s^-}^{(n)} ds \\ =&\sum _{k=1}^{\sigma _{t \wedge S_a(\textbf{M}^{(n)})}} \left( g(1,0)\tilde{\varvec{\lambda }}_1 + g(0,1)\tilde{\varvec{\lambda }}_2 +g(-1,0)\tilde{\varvec{\mu }}_1 + g(0,-1)\tilde{\varvec{\mu }}_2 \right) \textbf{M}_{k-1}^{(n)}\Delta \\ {}&+ \left( g(1,0)\tilde{\varvec{\lambda }}_1 + g(0,1)\tilde{\varvec{\lambda }}_2 +g(-1,0)\tilde{\varvec{\mu }}_1 + g(0,-1)\tilde{\varvec{\mu }}_2 \right) \\ {}&\times \textbf{M}^{(n)}_{\sigma _{t \wedge S_a(\textbf{M}^{(n)})}} \left( t \wedge S_a(\textbf{M}^{(n)}) - \sigma _{t \wedge S_a(\textbf{M}^{(n)})}\Delta \right) \end{aligned} \end{aligned}$$
(46)

Then the absolute difference of two stochastic integrals is given by

$$\begin{aligned} \begin{aligned}&\vert g * \nu _{t \wedge S_a(\textbf{M}^{(n)})} - (g * \nu _{t \wedge S_a}) \circ \textbf{M}^{(n)} \vert \\&\quad \le O(\Delta ) + \left| g(1,0)\tilde{\varvec{\lambda }}_1 + g(0,1)\tilde{\varvec{\lambda }}_2 +g(-1,0)\tilde{\varvec{\mu }}_1 + g(0,-1)\tilde{\varvec{\mu }}_2 \right| \\&\quad \times \textbf{M}^{(n)}_{\sigma _{t \wedge S_a(\textbf{M}^{(n)})}} \left( t \wedge S_a(\textbf{M}^{(n)}) - \sigma _{t \wedge S_a(\textbf{M}^{(n)})}\Delta \right) \end{aligned} \end{aligned}$$
(47)

Hence the uniform convergence holds by a similar argument as in the univariate case. Finally, \( Z_t^{(n)} \) and \( \textbf{M}_t^{(n)} \) converge weakly to \( Z_t \) and \( \textbf{M}_t \) respectively. \(\square \)

5 Simulation study

In this section, we outline the simulation algorithm for the bivariate birth and death process. The estimation method and the properties of the estimators are then investigated in a simulation study.

5.1 Simulation of bivariate birth and death process

The simulation algorithm of the bivariate birth and death process \( \textbf{M}_t \) can be derived straightforwardly from its ODE (11). Given the current population \( \textbf{M}_t \), the waiting time until an event (a birth or death in either population) happens follows an exponential distribution with rate

$$\begin{aligned} \rho _t =(\lambda _{11} + \lambda _{12} + \mu _1) M_{1,t} + (\lambda _{21}+ \lambda _{22} + \mu _2) M_{2,t} \end{aligned}$$

Then the probability that this event will happen in population \( M_{1,t} \) is

$$\begin{aligned} p_1 = \frac{\lambda _{21} M_{2,t} + (\lambda _{11} + \mu _1) M_{1,t}}{\rho _t} \end{aligned}$$
(48)

The probability that this event happens in population \( M_{2,t} \) is simply \( p_2 = 1 - p_1 \). Suppose now that an event happens in population \( M_{1,t} \); the probability that it adds a new individual is

$$\begin{aligned} p_1^b = \frac{\lambda _{11} M_{1,t} + \lambda _{21} M_{2,t}}{\lambda _{21} M_{2,t} + (\lambda _{11} + \mu _1) M_{1,t}}, \end{aligned}$$
(49)

and the probability that an individual dies is \( p_1^d = 1 - p_1^b \). Likewise, if the event happens in the population \( M_{2,t} \), the birth probability would be

$$\begin{aligned} p_2^b = \frac{\lambda _{12} M_{1,t} + \lambda _{22} M_{2,t}}{\lambda _{12} M_{1,t} + (\lambda _{22} + \mu _2) M_{2,t}} \end{aligned}$$
(50)

and the death probability is \( p_2^d = 1 - p_2^b \). Overall, the simulation algorithm is shown in the following Algorithm 1.

figure a
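As an illustrative companion to Algorithm 1, the event-based steps above can be sketched in Python as a Gillespie-style simulation of Eqs. (48)-(50); the function name, parameter values and random seed are ours, not the authors' code:

```python
import random

def simulate_bivariate_bd(m1, m2, rates, T, rng=random.Random(1)):
    """Event-driven simulation of the bivariate birth and death process,
    following the event probabilities in Eqs. (48)-(50).
    rates = (l11, l12, l21, l22, mu1, mu2); returns the path [(t, m1, m2)]."""
    l11, l12, l21, l22, mu1, mu2 = rates
    t, path = 0.0, [(0.0, m1, m2)]
    while t < T and m1 + m2 > 0:
        rho = (l11 + l12 + mu1) * m1 + (l21 + l22 + mu2) * m2  # total event rate
        t += rng.expovariate(rho)                # exponential waiting time
        if t >= T:
            break
        # the event lands in population 1 with probability p1 of Eq. (48)
        p1 = (l21 * m2 + (l11 + mu1) * m1) / rho
        if rng.random() < p1:
            pb = (l11 * m1 + l21 * m2) / (l21 * m2 + (l11 + mu1) * m1)  # Eq. (49)
            m1 += 1 if rng.random() < pb else -1
        else:
            pb = (l12 * m1 + l22 * m2) / (l12 * m1 + (l22 + mu2) * m2)  # Eq. (50)
            m2 += 1 if rng.random() < pb else -1
        path.append((t, m1, m2))
    return path
```

Each iteration draws an exponential waiting time at the total rate \( \rho _t \), then allocates the event first to a population and then to a birth or a death.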

On the other hand, the simulation procedure for the bivariate INAR(1) model is straightforward because the distribution of \( \textbf{Y}_t \) given \( \textbf{Y}_{t-1} \) is determined by the operators \( (\circ , \ *_1, \ *_2) \).

5.2 Statistical inference of univariate and bivariate birth and death process

5.2.1 Quasi-MLE for univariate LBD

In the univariate case, parameter estimation and the asymptotic properties of the estimators are available in Keiding (1975). Suppose we have full information about the sample path, i.e. the exact inter-arrival times of the birth and death events \( \{\tau _i\}_{i= 0,1,2,\dots }\) on the sampling interval [0, T] with \( \tau _0 = 0 \). The maximum likelihood estimators for \( Z_t \) are then

$$\begin{aligned} \hat{\lambda } = \frac{B_T}{X_T}, \quad \hat{\mu } = \frac{D_T}{X_T}, \quad X_T = \sum _{k=1}^{B_T + D_T} \tau _k Z_{\tau _{k-1}} + \left( T - \sum _{i=1}^{B_T + D_T} \tau _i\right) Z_T, \end{aligned}$$
(51)

where \( B_T, D_T \) are the total numbers of birth and death events respectively. The asymptotic properties, for fixed T and a large initial population, are given by

$$\begin{aligned} \lim _{Z_0 \rightarrow \infty } \left( \frac{Z_0(e^{(\lambda - \mu )T}-1)}{\lambda - \mu }\right) ^{\frac{1}{2}} \left( {\begin{array}{c}\hat{\lambda } - \lambda \\ \hat{\mu } - \mu \end{array}}\right) \overset{D}{\rightarrow } \textbf{N}\left( \left( {\begin{array}{c}0\\ 0\end{array}}\right) , \ \begin{pmatrix} \lambda &{} 0 \\ 0 &{} \mu \\ \end{pmatrix}\right) \end{aligned}$$
(52)

In practice, one may not have the exact inter-arrival times of the events. Instead, suppose we have records of the population sampled at a fixed interval \( \Delta \), so that \( Z_0, Z_{\Delta }, Z_{2\Delta }, \dots, Z_{n\Delta }\) are available. To estimate the parameters \( \lambda ,\mu \), one can then numerically maximize the quasi log-likelihood function of the proposed INAR(1) model \( X_k = Z_{k\Delta }, \ k =0,1,\dots ,n \). The log-likelihood function is given by

$$ \begin{aligned} \begin{aligned}&\ell (\alpha ,p) = \sum _{k=1}^n \log \Pr (X_{k-1},X_k) \\&\Pr (X_{k-1}, X_k) \\&\quad ={\left\{ \begin{array}{ll} 1, \quad &{}X_{k-1} = X_k = 0 \\ (1-\alpha )^{X_{k-1}}, \quad &{}X_{k-1}> 0 \ \& \ X_k = 0\\ \sum _{j=1}^{\min \{X_{k-1},X_k\}} f_b(j;X_{k-1},\alpha ) f_{nb}(X_k-j;j,p), \quad &{}X_{k-1}>0 \ \& \ X_k > 0, \end{array}\right. } \end{aligned} \end{aligned}$$
(53)

where \( f_b \) and \( f_{nb} \) are the probability mass functions of binomial and negative binomial random variables:

$$\begin{aligned} f_b(x;n,\alpha ) = \left( {\begin{array}{c}n\\ x\end{array}}\right) \alpha ^x(1-\alpha )^{n-x} \quad f_{nb}(x;n,\beta ) = \left( {\begin{array}{c}n+x-1\\ n-1\end{array}}\right) \beta ^{n}(1-\beta )^x \end{aligned}$$
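For illustration, the quasi log-likelihood of Eq. (53) can be sketched in Python (names are ours; in the study below the analogous computation is maximized with R's 'optim'):

```python
from math import comb, log

def f_b(x, n, a):
    """Binomial pmf."""
    return comb(n, x) * a**x * (1 - a)**(n - x)

def f_nb(x, n, b):
    """Negative binomial pmf (number of failures x before n successes)."""
    return comb(n + x - 1, n - 1) * b**n * (1 - b)**x

def quasi_loglik(path, alpha, p):
    """Quasi log-likelihood of Eq. (53) for observations X_0, ..., X_n."""
    ll = 0.0
    for prev, cur in zip(path, path[1:]):
        if prev == 0 and cur == 0:
            pr = 1.0
        elif cur == 0:
            pr = (1 - alpha)**prev
        else:
            pr = sum(f_b(j, prev, alpha) * f_nb(cur - j, j, p)
                     for j in range(1, min(prev, cur) + 1))
        ll += log(pr)
    return ll
```

The inner sum mirrors Eq. (53): j counts the thinned survivors, and the negative binomial term accounts for their offspring.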
Table 1 Parameter setting for univariate case

The simulation is conducted as follows: we generate 1000 sample paths of \( Z_t \) using the parameter settings in Table 1. Since the sample paths of \( Z_t \) are continuous in time, we set up an equally spaced grid with sampling interval \( \Delta \). The equally spaced observations \( X_k\) are then obtained by recording the population size at each discrete time \((0,\Delta , 2\Delta ,\dots ,n\Delta )\), where \( n = \frac{T}{\Delta } \). The log-likelihood function is maximized with the 'optim' function in R (method 'BFGS'). Finally, we can recover the rate estimates by inverting the parametrization in Eq. (32), so that

$$\begin{aligned} \tilde{\lambda } = \frac{1}{\Delta }\,\frac{\frac{1-\hat{p}}{\hat{p}}\log \frac{\hat{\alpha }}{\hat{p}}}{\frac{\hat{\alpha }}{\hat{p}} - 1}, \quad \tilde{\mu } = \tilde{\lambda } - \frac{1}{\Delta }\log \frac{\hat{\alpha }}{\hat{p}} \end{aligned}$$
(54)
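The back-transformation above amounts to a few lines; a Python sketch (function name ours), assuming \( \hat{\alpha } \ne \hat{p} \), i.e. \( \lambda \ne \mu \):

```python
import math

def recover_rates(alpha_hat, p_hat, delta):
    """Invert the parametrization of Eq. (32): recover (lambda, mu) from
    fitted INAR parameters (alpha_hat, p_hat) and sampling interval delta."""
    r = alpha_hat / p_hat              # equals e^{(lambda - mu) delta}
    theta = math.log(r) / delta        # net growth rate lambda - mu
    lam = theta * (1 - p_hat) / (p_hat * (r - 1))
    return lam, lam - theta            # (lambda_tilde, mu_tilde)
```

Composing this with the forward map of Eq. (32) returns the original rates, which gives a quick sanity check of the parametrization.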

In the following, we first explore how the size of \( \Delta \) affects the properties of the estimators, i.e. bias and mean square error (MSE), and how much more computational time is needed compared to the true MLE method. Four sampling interval sizes \( \Delta \in \{0.1,0.05,0.025,0.01\} \) are chosen and the results are presented in Table 2. The theoretical row shows the bias and MSE computed through Eq. (52). It is no surprise that the true MLE method from Eq. (51) performs best, with the lowest MSE and computational time. The quasi-MLE method based on the INAR model, on the other hand, improves as the sampling interval \( \Delta \) decreases, but it still performs no better than the true MLE method and requires much more computational time. The empirical distributions of these estimators are illustrated in Fig. 1; since the shapes of the distributions of \( \tilde{\lambda } \) and \( \tilde{\mu } \) differ little, we only show the distribution of \( \tilde{\lambda }\). It is clear that only the case \( \Delta = 0.01 \) has a satisfactory normal shape compared to all other cases.

Table 2 Properties of different maximum likelihood estimators
Fig. 1
figure 1

The empirical distribution of the estimated parameters. The top panel is the MLE from Eq. (51) and the remaining panels are the MLE from the INAR model. The solid lines are the true values of the parameters listed in Table 1 and the dashed lines stand for the empirical means

To achieve asymptotic normality with the quasi-MLE method from the INAR model, one needs not only a large initial population, but also a small sampling interval \( \Delta \). In the following simulation, we fix the sampling interval \( \Delta = 0.01 \) and investigate how the size of the initial population affects the asymptotic distribution of the estimators and the computational time of the estimation procedure. To explore the effect of \( Z_0 \) on the asymptotic distribution, we choose \( Z_0 \in \{5,10,30,50\}\); it appears from Fig. 2 that to ensure asymptotic normality for both estimators, one needs at least \( Z_0 = 30 \), which is a large sample size in the statistical sense.

Fig. 2
figure 2

Asymptotic distribution of \( \tilde{\lambda }, \tilde{\mu }\) with different \( Z_0 \)

The computational time with respect to \( Z_0 \in \{10,50,100,150,\dots ,500\} \) clearly shows a linear trend in Fig. 3. This is reasonable, as the number of summations involved in Eq. (53) increases linearly with \( Z_0 \).

Fig. 3
figure 3

The computational time for INAR models of 1000 sample paths

In summary, the quasi-MLE method constructed from the INAR model can reach a moderate level of estimation accuracy and asymptotic normality with a large initial population \( Z_0 \ge 30 \) and a small sampling interval \( \Delta \le 0.01 \). However, it requires much more computational time than the true MLE method. This method should therefore only be used when no information on the inter-arrival times of the birth and death events is available.

5.2.2 Quasi-MLE for bivariate LBD

Since the bivariate INAR(1) model is a bivariate Markov chain, the log-likelihood function can be written as a sum of logarithms of transition probabilities. Denote by \( \Theta = \{\alpha _1,\alpha _2,\beta _{11},\beta _{12},\beta _{21},\beta _{22}\} \) the parameter space of the bivariate INAR(1) model; the likelihood function can then be written as

$$\begin{aligned} \begin{aligned} \ell (\Theta )&= \sum _{t=1}^n \log \Pr (X_{t}, Y_{t} \vert X_{t-1}, Y_{t-1}) \\&= \sum _{t=1}^n \left( \log \Pr (X_t \vert X_{t-1}, Y_{t-1}) + \log \Pr (Y_t \vert X_{t-1}, Y_{t-1}) \right) \\&= \ell _x (\Theta _x) + \ell _y(\Theta _y), \end{aligned} \end{aligned}$$
(55)

where \( \Theta _x = \{\alpha _1, \beta _{11}, \beta _{21}\} \) and \( \Theta _y = \{\alpha _2, \beta _{12},\beta _{22}\} \). Because \( X_{t} \) and \( Y_t \) are independent of each other given the last state \( (X_{t-1}, Y_{t-1}) \), the likelihood function separates into two parts, \( \ell _x \) and \( \ell _y \). The transition probability for \( X_t \) is given by

$$ \begin{aligned} \begin{aligned}&\Pr (X_t = z_1 \vert X_{t-1} = x, Y_{t-1} = y) \\&\quad ={\left\{ \begin{array}{ll} 1 \quad &{}z_1 = x = y = 0 \\ (1- \alpha _1)^x \beta _{21}^y \quad &{}z_1 = 0 \\ f_{nb}(z_1;y,\beta _{21}) \quad &{}x = 0 \ \& \ y>0 \\ \sum _{i=1}^{\min \{x,z_1\}} f_b(i;x,\alpha _1) f_{nb}(z_1 - i ; i,\beta _{11}) \quad &{}x> 0 \ \& \ y =0 \\ \sum _{j=1}^{z_1}\sum _{i=1}^{\min \{x,j\}} f_b(i;x,\alpha _1) f_{nb}(j - i; i,\beta _{11}) f_{nb}(z_1 - j; y,\beta _{21}) \\ \quad + (1-\alpha _1)^x f_{nb}(z_1;y,\beta _{21}) \quad &{}x>0 \ \& \ y>0 \end{array}\right. } \end{aligned} \end{aligned}$$

The one for \(Y_t\) is

$$ \begin{aligned} \begin{aligned}&\Pr (Y_t = z_2 \vert X_{t-1} = x, Y_{t-1} = y) \\&\quad = {\left\{ \begin{array}{ll} 1 \quad &{}z_2 = x = y = 0 \\ (1- \alpha _2)^y \beta _{12}^x \quad &{}z_2 = 0 \\ f_{nb}(z_2;x,\beta _{12}) \quad &{}x> 0 \ \& \ y = 0 \\ \sum _{i=1}^{\min \{y,z_2\}} f_b(i;y,\alpha _2) f_{nb}(z_2 - i ; i,\beta _{22}) \quad &{}x = 0 \ \& \ y>0 \\ \sum _{j=1}^{z_2}\sum _{i=1}^{\min \{y,j\}} f_b(i;y,\alpha _2) f_{nb}(j - i; i,\beta _{22}) f_{nb}(z_2 - j; x,\beta _{12}) \\ \quad + (1-\alpha _2)^y f_{nb}(z_2;x,\beta _{12}) \quad &{}x>0 \ \& \ y>0 \end{array}\right. } \end{aligned} \end{aligned}$$

One can then numerically maximize the log-likelihood functions \( \ell _x, \ell _y \) given the random samples \( \{ (X_0, Y_0), (X_1,Y_1), \dots , (X_n,Y_n) \} \). From the estimated parameters \( \hat{\Theta } \), we can solve the following system of equations to obtain the estimates \( \Theta _{bd} =\{\lambda _{11},\lambda _{12},\lambda _{21},\lambda _{22},\mu _1,\mu _2\} \) for the bivariate birth and death process:

$$\begin{aligned} {\left\{ \begin{array}{ll} &{}\alpha _1(\Theta _{bd},\Delta ) - \hat{\alpha }_1 = 0 \\ &{}\alpha _2(\Theta _{bd},\Delta ) -\hat{\alpha }_2 = 0 \\ &{}\beta _{11}(\Theta _{bd},\Delta ) -\hat{\beta }_{11} = 0 \\ &{}\beta _{12}(\Theta _{bd},\Delta ) -\hat{\beta }_{12} = 0 \\ &{}\beta _{21}(\Theta _{bd},\Delta ) -\hat{\beta }_{21} = 0 \\ &{}\beta _{22}(\Theta _{bd},\Delta ) -\hat{\beta }_{22} = 0, \end{array}\right. } \end{aligned}$$
(56)

where the parametrization functions \( \alpha _i(\Theta _{bd},\Delta ) \) and \( \beta _{ij}(\Theta _{bd},\Delta ) \) are given in Eq. (33) and \( \Delta \) is chosen based on the interpretation of the birth and death rates. For example, when the random samples are collected on a daily basis over a year \( t= 1 \), one can define \( \Delta = t/365 \); the parameters \( \Theta _{bd} \) are then interpreted on an annual scale.
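In practice one would typically pass the system (56) to a library root finder; the following dependency-free Python sketch of Newton-Raphson with a finite-difference Jacobian indicates one way to do it (all names ours; `fun` stands in for the stacked residuals of Eq. (56)):

```python
def newton_solve(fun, x0, tol=1e-10, max_iter=50, h=1e-7):
    """Newton-Raphson with a finite-difference Jacobian, one way to solve a
    moment-matching system such as (56): fun(theta) returns the residual
    vector, e.g. (alpha_1(theta, delta) - alpha1_hat, ...)."""
    n = len(x0)
    x = list(x0)
    for _ in range(max_iter):
        f = fun(x)
        if max(abs(v) for v in f) < tol:
            return x
        # finite-difference Jacobian J[i][j] = d f_i / d x_j
        J = [[0.0] * n for _ in range(n)]
        for j in range(n):
            xp = list(x)
            xp[j] += h
            fp = fun(xp)
            for i in range(n):
                J[i][j] = (fp[i] - f[i]) / h
        # solve J * step = f by Gaussian elimination with partial pivoting
        A = [J[i] + [f[i]] for i in range(n)]
        for c in range(n):
            piv = max(range(c, n), key=lambda r: abs(A[r][c]))
            A[c], A[piv] = A[piv], A[c]
            for r in range(c + 1, n):
                m = A[r][c] / A[c][c]
                for k in range(c, n + 1):
                    A[r][k] -= m * A[c][k]
        step = [0.0] * n
        for r in range(n - 1, -1, -1):
            step[r] = (A[r][n] - sum(A[r][k] * step[k]
                                     for k in range(r + 1, n))) / A[r][r]
        x = [x[i] - step[i] for i in range(n)]
    return x

# toy usage: solve x^2 = 4 and x * y = 6
root = newton_solve(lambda v: [v[0]**2 - 4.0, v[0] * v[1] - 6.0], [1.0, 1.0])
```

For the full six-equation system, a good starting value (e.g. crude moment estimates) matters, since the parametrization in Eq. (33) is highly nonlinear in the rates.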

Table 3 Parameter setting for simulation
Fig. 4
figure 4

Empirical distributions of the estimators from the bivariate INAR model. The solid lines are the true values of the parameters listed in Table 3 and the dashed lines stand for the empirical means

In the following, we simulate \( r_2 = 1000 \) sample paths of \( \textbf{M}_t \) based on the pre-specified parameters in Table 3. An equally spaced grid with sampling interval \( \Delta \) is then set up and the random samples \( (\textbf{Y}_0, \textbf{Y}_1, \dots , \textbf{Y}_n ) \) are obtained, as in the univariate case. The likelihood functions \(\ell _x, \ell _y\) are maximized by 'optim' in R (method 'BFGS') and the maximum likelihood estimators \(\hat{\Theta }\) are obtained. Finally, we obtain the estimators \(\hat{\Theta }_{bd}\) by numerically solving the system of equations (56) via a root-finding algorithm (e.g. the Newton-Raphson method). Referring to the estimation results in the univariate case, we focus on the choices \( \Delta \in \{0.02,0.01,0.005\} \) as well as large initial populations (40, 50), in the hope of obtaining asymptotic normality for each estimator. The empirical distributions of the estimators \(\Theta _{bd}\) are illustrated in Fig. 4 and their properties are summarized in Table 4.

Table 4 Properties of different maximum likelihood estimators
Fig. 5
figure 5

Empirical distributions of the estimators from the bivariate INAR model

The bias and MSE of most estimators decrease with \( \Delta \) as expected. However, the MSEs of the birth rate estimators are much larger than those of the death rate estimators. Apart from the death rate estimators, all the birth rate estimators are skewed in different directions and clearly non-normally distributed. This may be caused by the non-normality of some of the estimators of the proposed INAR model illustrated in Fig. 5. In the classical setting where an innovation term is included, one needs a stationarity condition to ensure asymptotic normality for all parameter estimators; see Bu et al. (2008). In our case, the INAR model itself is not stationary and hence some of the estimators can be skewed.

Notice that the pairs of birth rates contributing to the same population, \( (\lambda _{11}, \lambda _{21}) \) and \( (\lambda _{12},\lambda _{22}) \), are skewed in opposite directions. It is then worthwhile to see whether the sums of these pairs of estimators have the desired asymptotic properties, and the results in Fig. 6 confirm our conjecture. Given the simulation procedure of the bivariate birth and death process, the quasi-MLE method may not be able to distinguish the pair of birth rates contributing to the same population. Instead, it provides good estimators of the scale of the total birth rates \(\bar{\lambda }_1 = \hat{\lambda }_{11}r_m + \hat{\lambda }_{21} (1-r_m) \) and \( \bar{\lambda }_2 = \hat{\lambda }_{12} r_m + \hat{\lambda }_{22} (1-r_m)\), where \( r_m = \frac{{\mathbb {E}}[M_{1,t}]}{{\mathbb {E}}[M_{1,t} + M_{2,t}]} \). Furthermore, according to the proof in Appendix A.1, the relationship between the first moments of the two populations is given by

$$\begin{aligned} {\mathbb {E}}[M_{1,t}] = c{\mathbb {E}}[M_{2,t}] + (M_{1,0} - cM_{2,0})e^{-(\lambda _{12}c - \kappa _2)t}. \end{aligned}$$
(57)

As long as the whole process does not become extinct with probability one, i.e. \(\kappa _1 \kappa _2 < \lambda _{12}\lambda _{21} \), the rate \( \lambda _{12}c - \kappa _2 \) is always positive, so the exponential term vanishes and \( {\mathbb {E}}[M_{1,t}] \approx c{\mathbb {E}}[M_{2,t}] \) when t is large. In other words, the ratio

Fig. 6
figure 6

Empirical distributions of the total birth rate estimators. The solid lines are the true values of the parameters listed in Table 3 and the dashed lines stand for the empirical means

Table 5 Properties for total birth rates estimators
Fig. 7
figure 7

Empirical distributions of the total birth rate estimators. The solid lines are the true values of the parameters listed in Table 3 and the dashed lines stand for the empirical means

Table 6 Parameter setting for simulation
$$\begin{aligned} r_m = \frac{{\mathbb {E}}[M_{1,t}]}{{\mathbb {E}}[M_{1,t} + M_{2,t}]} \rightarrow \frac{c}{1 + c}, \end{aligned}$$
(58)

becomes a constant eventually. For the parameter setting in Table 3, \( c = 1.040833 \), \( r_m \approx \frac{1}{2} \), and hence \( \hat{\lambda }_{11} + \hat{\lambda }_{21} \) serves as an estimator for the total birth rate of \( M_{1,t} \). In practice, c is unknown since the true parameters are yet to be estimated. We can then use the values at the end of the sampling period to approximate \( r_m \), i.e.

$$\begin{aligned} r_m \approx \frac{M_{1,T}}{M_{1,T} + M_{2,T} } \end{aligned}$$
(59)
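The constant c and the limiting share \( c/(1+c) \) are cheap to compute from the rates; a Python sketch with hypothetical symmetric rates (chosen purely for illustration, so the two populations split evenly):

```python
import math

def limiting_share(l12, l21, kappa1, kappa2):
    """Constant c from Sect. 4.1 and the limiting share r_m -> c / (1 + c)
    of population 1, Eq. (58)."""
    c = (kappa2 - kappa1
         + math.sqrt((kappa1 - kappa2)**2 + 4 * l21 * l12)) / (2 * l12)
    return c, c / (1 + c)

# hypothetical symmetric setting: equal cross-birth rates and kappa_1 = kappa_2
c, r_m = limiting_share(l12=0.2, l21=0.2, kappa1=0.5, kappa2=0.5)
```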

The properties of \( \bar{\lambda }_1, \bar{\lambda }_2 \) and their empirical distributions are shown in Table 5 and Fig. 7. These new estimators enjoy low bias and MSE, both of which decrease as \( \Delta \) decreases. Most importantly, they are no longer skewed and are asymptotically normal.

Table 7 Properties of different maximum likelihood estimators
Table 8 Properties for total birth rates estimators
Fig. 8
figure 8

Empirical distributions of the total birth rate estimators. The solid lines are the true values of the parameters listed in Table 3 and the dashed lines stand for the empirical means

Let us try another parameter setting, in Table 6, to verify this conjecture. The same simulation and estimation process as in the previous case is used, and the results are shown in Tables 7, 8 and Fig. 8. This time the constant c is 0.576306 and \( r_m = 0.365605 \). As in the last setting, the estimators of all the birth rates are skewed and some of them have large bias and MSE. The estimators of the total birth rates, on the other hand, have low bias and MSE and are again asymptotically normal.

Let us finally try the parameter setting in Table 9, where \( \textbf{M}_t \) eventually becomes extinct. This means that the exponential term in Eq. (57) can no longer be omitted. The results are illustrated in Table 10 and Fig. 9 and look similar to those of the first case: nice properties for the death rate estimators, but skewed and non-normal distributions for the birth rate estimators.

Table 9 Parameter setting for simulation
Table 10 Properties of different maximum likelihood estimators

6 Concluding remarks

In this paper, we propose an integer-valued autoregressive model, INAR(1), to approximate the continuous birth and death process. In the univariate case, we propose a birth-death operator \(p *_1 \alpha \circ X\), which is a sum of zero-modified geometric random variables. The parametrization of p and \(\alpha \) can be determined by matching the first and second moments of the continuous process. We then propose a bivariate INAR(1) model to approximate the bivariate birth and death process, where the birth probabilities also depend on the size of the other population. The parametrization of this model can be obtained in a similar way. The convergence of the discrete process to the continuous process is proved by applying a weak convergence theorem for locally bounded semimartingales. Due to the simple Markov structure of the INAR(1) model, maximum likelihood estimation is feasible; this is not the case for the bivariate and multivariate birth and death processes themselves. In principle, one can extend the results here to the multivariate case, i.e. approximate the multivariate birth and death process in Griffiths (1973) by a multivariate INAR(1) model using only the operators \(*_1, *_2, \circ \) together with an immigration process. However, the difficulty of expressing the parameters of the INAR(1) model in terms of the parameters of the multivariate birth and death process would increase, as we would need to find the first moments of the birth and death process explicitly.

Fig. 9
figure 9

Empirical distributions of the individual birth and death rate estimators. The solid lines are the true values of the parameters listed in Table 9 and the dashed lines stand for the empirical means