INAR approximation of bivariate linear birth and death process

In this paper, we propose a new type of univariate and bivariate Integer-valued autoregressive model of order one (INAR(1)) to approximate univariate and bivariate linear birth and death process with constant rates. Under a specific parametric setting, the dynamic of transition probabilities and probability generating function of INAR(1) will converge to that of birth and death process as the length of subintervals goes to 0. Due to the simplicity of Markov structure, maximum likelihood estimation is feasible for INAR(1) model, which is not the case for bivariate and multivariate birth and death process. This means that the statistical inference of bivariate birth and death process can be achieved via the maximum likelihood estimation of a bivariate INAR(1) model.

distribution function, extinction probability, or some other cumulative distribution of interests, are explicitly derived in the literature; see for example, Kendall (1949). The statistical inference for simple birth and death processes is then developed by Keiding (1975), where maximum likelihood estimators and other asymptotic results are discussed. Since the distribution function of simple birth and death processes is explicit, the construction of the likelihood function is straightforward. However, it is pointed out in the literature that the transition probability is actually cumbersome and numerically unstable when the size of population is large over time. At the same time, a variety of alternative estimation methods have been proposed. For example, quasi-and pseudo -likelihood estimators (Chen and Hyrien 2011;Crawford et al. 2014) addressed it as a missing data problem and apply an EM algorithm to maximize it. Tavaré (2018) found those transition probabilities by numerical inversion of the probability generating function and then applied Bayesian methods to perform estimation. Davison et al. (2021) adopted a saddle point approximation method to further improve the accuracy of transition probabilities.
The bivariate and multivariate birth and death process are developed in Griffiths (1972Griffiths ( , 1973. Griffiths (1972) described the transmission of malaria (so called host-vector situation) as a bivariate birth and death process where there is no direct infection between the same type of population. Then the author extended the model to multivariate case (Griffiths 1973) which can be regarded as an approximation of general epidemic with several types of infective. However, due to the intractability of the joint probability generating function, maximum likelihood estimation for parameters is not implementable. One possible way forward is to use integer-valued time series to approximate the continuous birth and death process and maximum likelihood estimation would then be feasible.
In recent years, there has been a growing interest in modelling integer-valued time series due to the presence of count data from different scientific fields such as social science, healthcare, insurance, economic and the financial industry. In particular, regarding to the univariate case, Al-Osh and Alzaid (1987) and McKenzie (1985) were the first to consider an INAR(1) model based on the so-called binomial thinning operator. The idea here is to manipulate the operation between coefficients and variables as well as the innovation terms in a way that the values are always integer. One can apply different discrete random variables to describe this operation. For more details, the interested reader can refer to Weiß (2018), Davis et al. (2016), Scotto et al. (2015), Weiß (2008) among many more.
In this paper, we propose an integer-valued autoregressive model of order one (INAR(1)) to approximate continuous birth and death process. In this way, the continuous process is approximated by a discrete Markov chain so that transition probabilities as well as likelihood function can be written down explicitly. As the birth and death process in our setting does not consider any immigrant, the innovation term is dropped in the proposed INAR(1) model. Similar to Nelson (1990), Kirchner (2016), where they find out the relationship between discrete models and their continuous counterparts, we also first need to make sure the our proposed discrete INAR(1) model would converge to birth and death process in weak convergence sense. Then we will explore how our proposed model would help facilitate the statistical inference. According to the probability generating function of the simple birth and death process, the death part can be described by binomial random variable while the birth part corresponds to a negative binomial. Then one can construct a bivariate INAR model based on these random variables to describe the bivariate birth and death process and even the multivariate one. As the transition probabilities and likelihood function of bivariate birth and death process cannot be written down explicitly, the main contribution is that the proposed bivariate INAR(1) model would provide a feasible way to estimate the parameters of bivariate birth and death process (Maximum likelihood estimation).
The paper is organized as follows: Sect. 2 reviews some main results of univariate and bivariate birth and death processes with constant rates. Section 3 introduces Integer-valued autoregressive models as well as some distributional properties. Section 4 constructs the discrete semimartingale using the proposed INAR models and proves the weak convergence between constructed semimartingale and birth and death processes. A simulation study is carried out in Sect. 5 to illustrate the estimation method via proposed INAR models an their corresponding properties of estimators. Some concluding remarks are in Sect. 6.

Univariate and bivariate birth and death processes
In this section, we will review the essential elements of simple birth-and-death processes, including moments and other distributional properties. These are well known and extensively discussed in the literature. Then, we will discuss the bivariate case where analytic expressions of the distribution function are not available.

Simple univariate birth-and-death process
Suppose that we have a population whose total number is evolved as a simple birth and death process Z t , with constant birth rate λ ≥ 0, death rate μ ≥ 0 and initial population Z 0 ∈ N. In other words, the probability that any individual gives birth in time is λ , and the probability that any individual dies in time is μ . Individuals are independent of each other. Let P n (t) = Pr(Z t = n) be the probability that the total population is n at time t. Then the transition probability of the simple birth and death process is characterized by the following ordinary differential equation (ODE) Applying a liner transform n θ n on both sides and defining ϕ(t, θ) = n θ n P n (t), we can get a partial differential equation whose solution ϕ is the probability generating function of Z This linear PDE can be solved explicitly This probability generating function clearly gives the construction of Z t given Z 0 , i.e. the sum of i.i.d zero-modified geometric random variables where B i are i.i.d Bernoulli random variables and G i are i.i.d Geometric random variables with mean α(t) and 1 β(t) , respectively. Furthermore, from the definition of transition probability, the linear birth and death process is a pure-jump semimartingale with following characteristic triplet: With the help of piece-wise deterministic Markov process theory in Davis (1984), the infinitesimal generator of the simple birth and death process Z t acting on a function f (t, Z ) within its domain (A) is given by The first and second moments can be derived by applying infinitesimal generator to the functions f (t, z) = Z , Z 2 such that which leads to two ODEs, Then, we can solve them explicitly According to the analytic expression of the first moment, it is clear that the population is bound to become extinct if λ < μ.

Bivariate birth-and-death process
Suppose there are two populations M = (M 1 , M 2 ) T with initial population M 0 ∈ N 2 + . The rate with which the population M 1 increases by one is λ 21 M 2 + λ 11 M 1 while the same for the population M 2 would be λ 12 M 1 + λ 22 M 2 . The subscript λ i, j means that the rate is from population i contributed to population j. The death rate for two populations would be μ 1 , μ 2 respectively. The two population is not independent as long as the cross birth rates λ i, j = 0, i = j. Then denote P mn (t) = Pr(M 1,t = m, M 2,t = n). This satisfies the following ODE Griffiths (1972) introduced this bivariate birth death process (λ 11 = λ 22 = 0) to describe the host-vector epidemic situation where the birth probability of two population depends on the size the other population only, e.g. transmission of malaria. To get the joint probability generating function of (t, θ, φ) = m n θ m φ n P mn (t), we can apply a linear transform m n θ m φ n on both sides of the ODE. The resulting PDE is This is a semi-linear PDE. The subsidiary equations are defined as The first fraction does not mean divide d by 0 and combining with the second fraction dt 1 infers that = constant, according to chapter 8 of Bailey (1991) . Matching the third and fourth differentials above, we have It seems that there is no way to solve this non-linear ODE and therefore no explicit solution is available for this PDE. However, it can be shown that this PDE gives a unique solution by Existence-Uniqueness Theorem for Quasilinear First-Order Equations. With regard to its characteristic, similar to the univariate case, this process is a pure-jump semimartingale with following characteristic triplets: The moments of this bivariate process can be derived by applying again infinitesimal generator.

Proposition 1 The first and second moments of the bivariate birth and death process
where Proof See Appendix A.1.
Note that to ensure the bivariate process becomes extinct with probability one, we need the (necessary and sufficient condition) (μ 1 − λ 11 )(μ 2 − λ 22 ) > λ 12 λ 21 according to Griffiths (1973). Many interesting properties of the process have been investigated by Griffiths (1972Griffiths ( , 1973. In general, this bivariate birth and death process is not straightforward to apply in practice because there are no explicit solutions to the above PDE, and the second moments have to be evaluated by numerical methods. The discrete integer-value model proposed in the next section would be a possible solution.

Univariate and bivariate INAR models
In this section, we will introduce integer-valued autoregressive models which will serve as discrete approximations for continuous counterparts discussed in the last section. The derivation of this approximation will demonstrate how to parameterize the bivariate INAR case.

Univariate INAR model
The classical integer-value autoregressive (INAR) model is introduced by defining a so-called binomial thinning operator • such that α • X is the sum of X i.i.d Bernoulli random variable with success probability α. i.e.
A well-known Poisson INAR(1) model X t is given by where where • • is the binomial operator • * 1 is a geometric (reproduction) operator such that p * 1 X = (1) i being i.i.d geometric random variable with success probability p whose probability mass function is given by Remark The innovation is dropped as there is no independent immigrant process in the birth and death process investigated.

Proposition 2
The birth and death INAR(1) model has the following statistical properties 1. The probability generating function of X t can be iterated backwardly such that In order words, the birth and death operator p * 1 α• as a whole is iterable.
2. Then the mean, variance and covariance are given by Proof See Appendix A.2.
Note that if α/ p < 1, the process X t will become extinct eventually. It is obvious that the continuous birth and death process can be approximated by this discrete INAR(1) model by directly matching the probability generating function ϕ (I ) to the one ϕ in Eq. (3) as the p * 1 α • X is the sum of X i.i.d zero-modified geometric random variables.

Bivariate INAR model
Discrete approximation for univariate birth and death process is somehow simple because the PDE(2) has an explicit solution and hence the distribution is already known. In the case where the dynamic of two populations are characterized by (11), no explicit solution for its PDE (12). However, from the birth and death INAR(1) model, it is clear that birth and death probability are closely related to binomial and negative binomial random variables. Based on the dynamic (11) and linear form of the first moment (16), a bivariate INAR(1) model is proposed as follows.

Definition 2 A bivariate birth and death INAR
where • • is the binomial operator • * 2 is another geometric (reproduction) operator different from * 1 such that β i being i.i.d geometric random variable whose success probability is β. The probability mass function is given by Now it seems that the structure of bivariate INAR(1) matches the the dynamics of (11), i.e. the birth probability depends on the size of both populations while death probability depends on the size of its own population. We adopt another geometric random variable g (2) which is slightly different from g (1) because for example, if we use

Proposition 3 The first and second moments of the bivariate INAR(1) defined above are characterized by the following recursive formulas
Proof Similar to Proposition 2, the moments can be derived by conditional expectation. The first and second moment for random variable g i with parameter β are 1−β β and 1−β β 2 . Then the first moment for X t are The second moments are given by The first and second moments of Y 2,t can be derived in a similar way.

Proposition 4 If the eigen-values
lie in the interval [−1, 1], then the bivariate population X t , Y t will become extinct eventually.
Proof The first moment can be expressed in a matrix form The tth power of a matrix here is defined as t times matrix multiplication. By eigendecomposition, power of a matrix can be expressed as where Q = (ν 1 , ν 2 ) is eigen vector matrix with ν 1 , ν 2 as eigen vectors for

Weak convergence to continuous birth and death process
In this section, we will construct two continuous processes from the above proposed INAR models. These processes, under a certain parametrization, will converge weakly to the aforementioned continuous birth and death processes when the length of sub-interval goes to 0.

Construction of continuous processes
Since the continuous birth and death processes are clearly semimartingale defined in nonnegative state spaces, to apply limit theorem of locally bounded semimartingales, we need to construct 'continuous' processes on a dense subsets of R + (will take t ∈ [0, 1] for convenience) and compute their characteristic triplets from the discrete INAR models. Finally, when everything is set up nicely, we can apply weak convergence of semimartingale theorem to prove the result. The construction mainly follows from Jacod and Shiryaev (2013, Chapter II, section 3). Starting with a discrete basis B = (¨, F, (F n ) n∈N , P), assume that he INAR models X n and Y n defined above are adapted to this discrete stochastic basis and so as the increment processes then we can construct 'continuous' processes via time change.
Definition 3 Given a fixed time interval [0, 1], one can define a equal-length grid with size n such that each subinterval with length = 1 n . The following the processes: where σ t = tn , are adapted to the continuous-time basisB The parameters setting for M where It is straightforward to derive the parameter setting for univariate case since we only need to match the parameter via probability generating function between Z (n) would be to match the first and second order moments to see whether it works. It is clear that we can match moment equations (26) to (16) and find out the mapping of β 12 , β 21 in terms of λ i, j , μ i , i, j ∈ {1, 2}. Unfortunately, only the ratio α i /β ii is known. Nevertheless, the parameter setting in univariate case shows us the way to distribute the ratio α/ p to α and p. Then α i , β ii can be set up in a similar way.
Proposition 5 With the above parameters setting and any non-negative integer m, the transition probabilities for Z The above probabilities can be simplified as, On the other hand, the transition probabilities for M where i ∈ {1, 2} and i = 3 − i. Due to the conditional independence of bivariate INAR models, the joint transition probabilities for M Similarly, the above probabilities can be simplified as It is obvious that the above transition probabilities have exactly the same form as continuous counterparts when m = 1. Consequently, the Lévy measures of Z where the g is a continuous, non-negative, bounded Borel function vanishing near 0 and M (n) t respectively, the truncation function is h = |x|1 {|x|<1} and Theorem 7 With the the definition and the parametrization above, and the initial distribution condition: Z when the size of subinterval goes to 0 or equivalently, n → ∞.
Proof Here we simply apply Theorem 3.39 from Jacod and Shiryaev (2013, chapter IX, section 3), the limit theorem of semimartingales for the locally bounded case.
i The local strong Majorization Hypothesis: For both cases Z t and M t , the first two terms of the characteristic triplets are 0 and stochastic integrals with respect to the function is clearly finite on [0, 1] ii Local Conditions on big jumps: For both cases Z t and M t , there is no jump with absolute size greater than 1. iii The local uniqueness: for every choices of initial distributions for Z 0 and M 0 , their Lévy measures are uniquely characterized by their (joint) probability distribution functions. iv Continuity Condition, the characteristic triplets B t (ω), C t (ω), ν(ω; dt, dx) of Z t and M t are continuous with respect to ω. v Weak convergence of initial distribution. This is stated at the beginning of this theorem. vi Convergence of characteristic triplet of discrete processes to that of their continuous counterparts. This can be proved by showing the uniform convergence of Lévy measures. For every a > 0, define a stopping time for the population process: For the univariate case, the stochastic integral with respect to g * v for any Borel function g is given by and the absolute difference of two stochastic integrals is given by, It is clear that all the quantity inside |..| are finite and for every ξ > 0, and then there exists a natural number N such that for n > N , we have and hence we have the uniform convergence for g * ν n t∧S a to (g * ν t∧S a ) • Z (n) . For the bivariate case, the stochastic integral g * ν, where ν is the Lévy measure of M, for any Borel function g is given by Then the absolute difference of two stochastic integrals is given by Hence the uniform convergence holds using similar argument as in the univariate case. Finally, the Z converge weakly to Z t and M t respectively.

Simulation study
In this section, we outline the simulation algorithm for bivariate birth and death processes. Then estimation method, properties of estimators are investigated in the simulation study.

Simulation of bivariate birth and death process
The simulation algorithm of bivariate birth and death process M t can be derived straightforwardly according to its ODE (11). Given the current population M t , the waiting time that a event (birth or death in either population) will happen follows exponential distribution with rate ρ t = (λ 11 + λ 12 + μ 1 )M 1,t + (λ 21 + λ 22 + μ 2 )M 2,t Then the probability that this event will happen in population M 1,t is The probability that this event will happen in population M 2,t would simply be p 2 = 1 − p 1 . Suppose now an event happens in population M 1,t , the probability that there is a new individual would be and the probability that an individual dies is p d 1 = 1 − p b 1 . Likewise, if the event happens in the population M 2,t , the birth probability would be and death probability p d n = 1− p b n . Overall, the simulation algorithm is shown in the following Algorithm 1.
On the other hand, the simulation procedure of bivariate INAR(1) model is straightforward because the distribution of Y t are indicated by the operator (•, * 1 , * 2 ) given Y t−1 .

Quasi-MLE for bivariate LBD
In the univariate case, parameters estimation and their asymptotic properties are available in Keiding (1975). Suppose now we have the full information of the sample path, the exact inter-arrival times for each birth and death events {τ i } {i=0,1,2,... } on the sampling interval [0, T ] where τ 0 = 0, the maximum likelihood estimators for Z t arê where B T , D T are total number of birth and death events respectively. The asymptotic properties are given by fixed T and large population In practice, one may not have exact information of inter-arrival time of the events. Instead, we have records for populations sampling over a fixed-length interval such that Z 0 , Z , Z 2 . . . Z n are available. Then to estimate the parameters λ, μ, one can numerically maximize the Quasi log-likelihood function from the proposed INAR(1) model X k = Z k , k = 0, 1, . . . , n. The log likelihood function is given by, where f b and f nb are probability mass function of binomial and negative binomial random variables  The simulation is conducted as follow: we generate 1000 sample paths of Z t using the parameters settings in Table 1. Since Z t are continuous sample paths, we set up an equaldistance grid with sampling interval . Then the equal-distance observations X t are obtained by counting the total number of population up to each discrete time (0, , 2 , . . . , n ) where n = T . The log likelihood function is then maximized by 'optim' function with method = 'BFGS' in R programming. Finally, we can recover the rate estimates by inverting the parametrization in equation (32) In the following, we will first explore how the size of would affect properties estimators, i.e. bias and mean square error (MSE), and how much more computational time we need compared to true MLE method. Four different size of sampling intervals = {0.1, 0.05, 0.025, 0.01} is chosen and the results are presented in Table 2. The theoretical row shows the biased and MSE computed through equation (52). There is no surprise that the True MLE method from Eq. (51) performs the best, with lowest MSE and computational time. The Quasi-MLE method by constructing INAR model, on the other hand, becomes better as we decreasing the size of sampling interval but it still performs no better than the true MLE method and require much more computational time. The empirical distribution of these estimators are illustrated in Fig. 1 and since the general shape of distribution ofλ and μ has little difference, we will only show the distribution ofλ. It is clear that only the case = 0.01 has satisfactory normal shape compared to all other cases. To achieve asymptotic normality for Quasi-MLE method from INAR model, one need not only large initial population, but also a small sampling interval . In the following simulation, we would fix the sampling interval = 0.01 and investigate how the size of initial population would affect the asymptotic distribution of estimators and the computational time for estimation procedure. To explore the effect of Z 0 for asymptotic distribution, we choose Z 0 ∈ {5, 10, 30, 50} and it seems from Fig. 2 that to ensure asymptotic normality for both estimators, one need at least Z 0 = 30, which is a large sample size in statistical sense.  Table 1 and the dash lines stand for empirical means The computational time with respect to Z 0 ∈ {10, 50, 100, 150, . . . , 500} clearly shows a linear trend in Fig. 3. This is reasonable as the number of summation involved in Eq. (53) increases linearly with respect to Z 0 In summary, the Quasi-MLE method constructed from INAR model can reach moderate level of estimation accuracy and asymptotic normality with large initial population Z 0 ≥ 30 and small sampling interval ≤ 0.01. However, it would require much more computational time than the true MLE method. This method should only be used in the case where we have no information on inter-arrival time of birth and death events.

Quasi-MLE for bivariate LBD
Since the bivariate INAR(1) model is a bivariate Markov Chain, the log likelihood function can be written as the sum of logarithm of transition probabilities. Denote = {α 1 , α 2 , β 11 , β 12 , β 21 , β 22 } as the parameter space of bivariate INAR(1) model, then the where x = {α 1 , β 11 , β 21 } and y = {α 2 , β 12 , β 22 }. Because X t and Y t are independent of each other given the last state (X t−1 , Y t−t ), the likelihood function can be separated into two parts, x and y respectively. Then transition probability for X t is given by The one for Y t is One can then numerically maximize the log likelihood function x , y given the random From the estimated parametersˆ , we can solve the following system of equations to get the estimates bd = {λ 11 , λ 12 , λ 21 , λ 22 , μ 1 , μ 2 } for bivariate birth and death process.
where the parametrization function .( bd , ) are given in equation 33 and is chosen based on the interpretation of birth and death rates. For example, when the random samples are collected on daily basis over a year t = 1, one can define = t/365. Then these parameters bd are interpreted on an annual scale. In the following, we will simulate the r 2 = 1000 sample paths of M t based on the prespecific parameters in Table 3. Then equal-distance gird with sampling interval is set up and random samples (Y 0 , Y 1 , . . . , Y n ) are obtained, like the way mentioned in the univariate case. Then the likelihood functions x , y are maximized by 'optim' in R with method being specified as 'BFGS' and the maximum likelihood estimatorsˆ are obtained. Finally, we can obtain the estimatorsˆ bd by numerically solving the system of equations (56) via a root-finding algorithm (e.g. Newton-Raphson method). Referring to the estimation results in univariate case, we focus on the choices of ∈ {0.02, 0.01, 0.005} as well as large initial population (40, 50), and hopefully we can obtain asymptotic normality for each estimator. The empirical distribution of these estimators bd are illustrated in Fig. 4 and their properties are summarized in Table 4.
The bias and MSE of most estimators are decreasing with respect to as expected. However, the MSE of birth rates are much larger than the estimators of death rates. Except the estimators for death rates, all other estimators for birth rate are skewed to different directions and clearly non-normal distributed. This may caused by some of non-normal estimators for proposed INAR model illustrated in Fig. 5. In the classical setting where the innovation term is included, one need stationary condition to ensure asymptotic normality for all estimators  Table 3 and the dash lines stand for empirical means  Bu et al. (2008). And in our case, INAR model itself is not stationary and hence some of the estimate can be skewed. Notice that the pair of birth rates that contributed to the same population, (λ 11 , λ 21 ) and (λ 12 , λ 22 ) are skewed in opposite directions. It is then worthwhile to see whether the sum of these pair estimators has desired asymptotic properties and the results in Fig. 6 confirms our conjecture. Combining the simulation procedure of bivariate birth and death processes, Quasi-MLE method may not be able to distinguish the pair of birth rates contributed to the same population. Instead, it would provide good estimators for the scale of total birth ratesλ 1 = λ 11 r m +λ 21 (1 − r m ) andλ 2 =λ 12 r m +λ 22 (1 − r m ) where r m = As long as the whole process is not extinct with probability one, i.e. κ 1 κ 2 < λ 12 λ 21 , the exponential power (λ 12 c − κ 2 ) will always be positive and hence E[M 1,t ] ≈ cE[M 2,t ] when t is large. In other words, the ratio  Table 3 and the dash lines stand for empirical means   Table 3 and the dash lines stand for empirical means becomes a constant eventually. For the parameter setting in Table 3, c = 1.040833, r m ≈ 1 2 and henceλ 11 +λ 21 serves as an estimator for the total birth rate of M 1,t . In practice, the c is unknown as true parameters need to be estimated. Then we can use the values at the end of sampling period to approximate r m , i.e.
The properties ofλ 1 ,λ 2 and their empirical distribution are shown in Table 5 and Fig. 7. These new estimators benefits from nice properties, low bias and MSE and they decreases as decreases. Most importantly, they are not skewed anymore and asymptotic normal. Let us try another parameter setting in Table 6 to verify this conjecture. Same simulation and estimation process as previous case and the results are shown in Tables 7, 8 and Fig. 8. This time the constant c is 0.576306 and r m = 0.365605. Similar to the last setting, the estimators for all birth rates are skewed and some of them have large bias and MSE. The estimators for total birth rate, on the other hand, are of low bias and MSE and they are again asymptotic normal.
Let us finally try another parameter setting in Table 9 where the M t is going to be extinct eventually. It means that the exponential function in Eq. (57) can no longer be omitted. The results are illustrated in Table 10 and Fig. 9 and they look similar to the results of the first case. Nice properties for death rates' estimators but skewed and non-normal for birth rates' estimators.  Table 3 and the dash lines stand for empirical means

Concluding remarks
In this paper, we propose an integer-valued autoregressive model INAR(1) to approximate the continuous birth-and-death process. In univariate case, we propose a birth-death operator p * 1 α • X which is the sum of zero-modified geometric random variable. The parametrization of p and α can be determined by matching the first and second moment of continuous process. Then we propose an bivariate INAR(1) model to approximate bivariate birth and death process where birth probabilities will also depend on the size of the other population.
The parametrization of this model can be obtained in a similar way. The convergence from discrete process to continuous process is proved by apply weak convergence theorem of locally bounded semimartingales. Due to the simple Markov structure of INAR(1) model, maximum likelihood estimation would be feasible. It is however not the case for bivariate and multivariate birth and death process. Basically, one can extend the result here to multivariate case, i.e. we can approximate multivariate birth and death process in Griffiths (1973) by multivariate INAR(1) model using the these operators * 1 , * 2 , • only as well as adding an immigrant process. However, the difficulty of expressing the parameters of INAR(1) model in terms of the parameters of multivariate birth and death process would be increasing and as we need to find out the first moment of birth and process explicitly.  Table 9 and the dash lines stand for empirical means The first two result in the following system of ODE The other three equations become (17), which is hard to solve explicitly as we need to solve an inhomogeneous ordinary differential equation system. To Make use of the second the ODE in, the first ODE can be rearranged into a ordinary equation.
Then c is the solution of the quadratic equation Both roots would result in the same moments, so just take the positive root. The function g would be the solution of following ODE and

A.2 Proof of proposition 2
For the first property, we can verify in the following way.
Suppose equation (22) holds for i = k. Then for i = k + 1, we have So Eq. (22) holds for all i = 1, 2, . . . . For the second property, the moments can be found by conditional expectation such that

A.3 Proof of Proposition 5
The transition probability of giving out m(m > 0) birth of from the process Z during an infinitesimal time interval is given by where all the exponential function are expressed as their corresponding Taylor expansion at = 0. To make comparison with the continuous birth and death process, we are interested in the coefficients in front of . First we need to check the lowest order of in above probability. That is, we would like to minimize the sum min 1≤ j≤k 2k − 2 j + m = m So for the transition probability is rearranged in the following way Then it is clear that the there is no first order term in the case where m ≥ 2 and the probability that giving out exactly one birth is On the other hand, we can derive the probability that m(1 ≤ m ≤ k) individuals die within infinitesimal time in a similar way The minimum order of is determined by Then the probability is reduced to The transition probability that only one individual dies is the probability that more than one individuals die are o( ) As the birth rate and death (λ, μ) are time-homogeneous, so as the parameters α, p, the transition probabilities stay the same for all time t ∈ [t 1 , t 2 ]. This means that the discrete birth and death INAR(1) model would result in the the same dynamic (1) of simple birth and death process when is small enough. Similar to the univariate case. It is necessary to find out the transition probabilities before proceeding to the weak convergence. The transition probability of giving out m(m ≥ 1) births of population M where all the exponential function are expressed in their corresponding Taylor expansion at = 0. The lowest order of is determined by the power of ((λ 11 − μ 1 ) + o( )) and where j =∈ {1, . . . , k 1 + m}. This leads to j = k 1 , i = k 1 − 1 and j = k 1 − 1, i = k 1 − 1, respectively. Then the transition probability reduces to Then the probability when m = 1 is given by By symmetry, the birth transition probability for the other population is given by On the other hand, the probability that m individual die in population M ( ) t given that M The lowest order of is determined by the power of ((λ 11 − μ 1 ) + o( )) and (C β 1 (u 1 − u 2 ) + o( )),which is where j ∈ {1, . . . , k 1 − m}. So the above probability reduces to It is not surprise that the transition probability only depends on its own size of population from the bivariate INAR construction. Then the death transition probability for the other population is The it is clear that the probabilities for both population that there is only one death would have the same form as in the univariate case. By conditional independence of bivariate INAR model, the joint transition probability would be the product of any of two transition probabilities shown above. For example, for any two integers m 1 , m 2 ∈ Z, Then it is straightforward to show that, to have a first order term, the only possible combinations of (m 1 , m 2 ) are {(1, 0), (0, 1), (−1, 0), (0, −1)}, which under the proposed parametrization, the joint process only allow one jump during infinitesimal time which coincide with the bivariate continuous birth and death process.
As the birth rates λ i, j and death rate are μ i time homogeneous, so as the parameters α i , β i, j , i, j ∈ {1, 2}, the transition probabilities stay the same for all time t ∈ [0, 1]. This means that this bivariate birth and death INAR(1) model would result in the the same dynamic (11) when is small enough.

A.4 Proof of Proposition 6
According to Theorem 3.11, Chapter 2 in Jacod and Shiryaev (2013). We need to make sure the following two sum is finite for any truncation function h before constructing their discrete Lévy measures.