## Abstract

Count time series data feature prominently in epidemiology, business, and environmental sciences. Often, such data exhibit zero-inflation and overdispersion in addition to serial dependence. Parametric models such as the negative binomial distribution are employed to account for overdispersion. In practice, the conditional variance structure may be unknown or may not be negative binomial. In this paper, a distribution-free approach for estimation of regression parameters of conditionally overdispersed and zero-inflated time series models is developed. Parameter estimates are optimal in the Godambe-information sense. Simulation studies indicate that our method is robust to model misspecification with small relative bias and nearly the same efficiency as that of the MLE for some observation-driven count time series processes. A case study comparing our method with fully parametric methods using weekly syphilis counts from 2007–2010 in Virginia, USA, illustrates the benefit of our method.

## Introduction

Count time series data appear in a wide variety of applications including public health, business, and environmental sciences. For recent theoretical developments and applications, see Davis et al. [4]. Often, count time series data exhibit a preponderance of zero values, in addition to overdispersion. Count time series regression models based on zero-inflated Poisson (ZIP) and zero-inflated negative binomial (ZINB) distributions are available within either a parameter-driven model framework (e.g. Yang et al. [19]), or an observation-driven model framework (e.g. Yang et al. [18]); for the distinction between parameter-driven and observation-driven time series models, see Cox et al. [2].

In practice, the conditional variance structure of an observed time series may be either unknown or may not be adequately modelled by either a Poisson or a negative binomial distribution. In Thavaneswaran and Ravishanker [14], a distribution-free approach is proposed that circumvents likelihood misspecification. Their approach is semi-parametric in that only the first few conditional distribution moments need to be specified and model parameters are estimated using the theory of estimating functions (e.g. Godambe and Heyde [5]). In Thavaneswaran and Ravishanker [14], conditional distribution moments are all taken from a parametric model (e.g. ZIP, ZINB).

Though semi-parametric models are advocated for as being robust to model misspecification, their use of higher-order moments, typically taken from a particular parametric distribution, diminishes their robustness when the true parametric model has conditional moments that are very different from those specified. For example, Tang et al. [13] compared their distribution-free approach to another distribution-free method (Yu et al. [21]) that requires second-moment specification taken from a ZIP model. Tang et al. [13] showed that when the response variable in a longitudinal model for zero-inflated count data is simulated from a ZINB distribution, regression parameter estimates, asymptotic standard errors, and empirical type one errors for Yu’s method are affected by variance misspecification. In contrast, variance misspecification is not a concern for Tang’s distribution-free method as it does not rely on specification of second and higher moments.

In this paper, we develop distribution-free zero-inflated count time series regression models that are robust to variance misspecification. Our method is an example of a functional response model (FRM). FRMs extend the usual class of GLMs and nonlinear regression models to model a broader class of regression problems involving higher-order moments and between-subject attributes (see references within Tang et al. [13] for real data applications of FRM). We leverage independent data ZIP and ZINB regression FRMs (Tang et al. [13]) to develop zero-inflated count time series regression FRMs. Unlike the approach of Thavaneswaran and Ravishanker [14], our distribution-free FRM approach is robust against model misspecification and against second- or higher-order conditional moment misspecification.

This paper is organized as follows: In Sect. 2, we review observation-driven ZIP and ZINB autoregressions. In Sect. 3, we outline our version of FRM and inference for it within the estimating function-based framework for discrete-time stochastic processes (Godambe and Heyde [5]). In Sect. 4, we compare the performance of our FRM-based estimates relative to fully parametric-based estimates using simulation studies. In Sect. 5, we illustrate our approach using real data. The paper concludes with a discussion in Sect. 6.

## Functional Response Models for Observation-Driven Count Time Series Regression

In Sect. 2.1, a brief review of observation-driven ZIP and ZINB autoregressive processes developed by Yang et al. [18] is given. The FRM approach is subsequently derived in Sect. 2.2.

### ZIP and ZINB Autoregressions

Let \(\{{ Y }_t\}\), \(t=1,\ldots ,n\) denote the response series of discrete counts and let \({\mathcal{F}}_{t-1}\) denote the \(\sigma \)-field generated by past response values and past covariate values at times up to and including time \(t-1\). Conditional on \({\mathcal{F}}_{t-1}\), \(Y_t\) is assumed to be distributed as \({\text{ ZIP }}(\lambda _t, \omega _t)\) with a probability mass function (p.m.f.) that mixes a zero-centred degenerate distribution with a Poisson distribution. The p.m.f. of \(Y_t\) given \({\mathcal{F}}_{t-1}\) is defined as follows:

\[ P(Y_t = y_t \mid {\mathcal{F}}_{t-1}) = \omega _t\, {{I}}_{(y_t=0)} + (1-\omega _t)\, \frac{e^{-\lambda _t} \lambda _t^{y_t}}{y_t!}, \qquad y_t = 0, 1, 2, \ldots , \tag{2.1} \]

where \(\lambda _t\) is the mean parameter of the non-inflated Poisson distribution, \(\omega _t\) is the zero-inflation parameter, and \({{I}}_{(y_t=0)}\) is an indicator function for observing a zero count. The ZIP autoregression conditional mean parameter \(\lambda _t\) and conditional zero-inflation parameter \(\omega _t\) are modelled using a log-linear model and a logistic regression model as in the following:

\[ \log (\lambda _t) = \eta _t = {\mathbf{x}}_{t-1}^{{\top }} {\pmb \beta } \tag{2.2} \]

and

\[ {\text{ logit }}(\omega _t) = \xi _t = {\mathbf{z}}_{t-1}^{{\top }} {\pmb \gamma }, \tag{2.3} \]

where \({\pmb \beta }= (\beta _1,\ldots ,\beta _p)^{{\top }}\) and \({\pmb \gamma }=(\gamma _1,\ldots ,\gamma _q)^{{\top }}\) are the regression coefficients for the log-linear part (2.2) and logistic part (2.3), respectively. Interest centres on estimating \({\pmb \theta }=({\pmb \beta }^{{\top }}, {\pmb \gamma }^{{\top }})^{{\top }},\) the \((p + q)\)-dimensional vector of unknown parameters. Here, \({\mathbf{x}}_{t-1} = ({ x }_{(t-1),1},\ldots , { x }_{(t-1),p})^{{\top }}\) and \({\mathbf{z}}_{t-1} = ({ z }_{(t-1),1},\ldots , { z }_{(t-1),q})^{{\top }}\) denote vectors of past explanatory variables, into which functions of the lagged response series can be incorporated to account for serial correlation. The conditional mean and variance (derived in Yang et al. [18]) are \(E({ Y }_t|{\mathcal{F}}_{t-1})=\lambda _t(1-\omega _t)\) and \({\text{ Var }}({ Y }_t|{\mathcal{F}}_{t-1})=\lambda _t(1-\omega _t)(1+\lambda _t \omega _t)\), respectively. The conditional variance exceeds the conditional mean whenever \(\omega _t > 0\), so zero-inflation alone induces overdispersion.
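The closed-form moments above can be checked numerically. The following is a minimal sketch, assuming illustrative values \(\lambda _t = 3\) and \(\omega _t = 0.3\) that are not taken from the paper; the function names `zip_pmf` and `zip_moments` are hypothetical helpers, and the support is truncated at a large count.

```python
import math

def zip_pmf(y, lam, omega):
    """P(Y = y) for a zero-inflated Poisson with Poisson mean lam and zero-inflation omega."""
    poisson = math.exp(-lam + y * math.log(lam) - math.lgamma(y + 1))
    return omega * (y == 0) + (1.0 - omega) * poisson

def zip_moments(lam, omega, y_max=200):
    """Mean and variance obtained by summing the p.m.f. over 0..y_max (truncated support)."""
    probs = [zip_pmf(y, lam, omega) for y in range(y_max + 1)]
    mean = sum(y * p for y, p in enumerate(probs))
    second = sum(y * y * p for y, p in enumerate(probs))
    return mean, second - mean ** 2

lam, omega = 3.0, 0.3  # illustrative values, not taken from the paper
mean, var = zip_moments(lam, omega)

# Closed forms quoted in the text
closed_mean = lam * (1 - omega)
closed_var = lam * (1 - omega) * (1 + lam * omega)
```

Under these assumed values the summed moments agree with the closed forms to numerical precision, and the variance is larger than the mean.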

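Conditional on \({\mathcal{F}}_{t-1},\) a \({\text{ ZINB }}(\lambda _t, \omega _t, \tau )\) distribution is obtained by mixing the degenerate distribution in (2.1) with a negative binomial distribution; the conditional p.m.f. is as follows:

\[ P(Y_t = y_t \mid {\mathcal{F}}_{t-1}) = \omega _t\, {{I}}_{(y_t=0)} + (1-\omega _t)\, \frac{\Gamma (\tau + y_t)}{\Gamma (\tau )\, y_t!} \left( \frac{\tau }{\tau + \lambda _t} \right) ^{\tau } \left( \frac{\lambda _t}{\tau + \lambda _t} \right) ^{y_t} \tag{2.4} \]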

for \(y_t=0,1,2,\ldots \). Under \({\text{ ZINB }}(\lambda _t, \omega _t, \tau )\), the conditional variance is \({\text{ Var }}(Y_t|{\mathcal{F}}_{t-1}) = \lambda _t(1-\omega _t)(1+\lambda _t \omega _t + \frac{\lambda _t}{\tau })\) and it exceeds the conditional mean given by \(E({ Y }_t|{\mathcal{F}}_{t-1})=\lambda _t(1-\omega _t)\) (see Yang [17]). Covariates are incorporated, as in ZIP autoregression, through log-linear and logistic regression models. For example,

\[ \log (\lambda _t) = {\mathbf{x}}_{t-1}^{{\top }} {\pmb \beta } \quad \text{and} \quad {\text{ logit }}(\omega _t) = {\mathbf{z}}_{t-1}^{{\top }} {\pmb \gamma }, \tag{2.5} \]

where the regression model parameters are \({\pmb \beta }= (\beta _1,\ldots ,\beta _p)^{{\top }}\) and \({\pmb \gamma }=(\gamma _1,\ldots ,\gamma _q)^{{\top }}.\) Here, \({\mathbf{x}}_{t-1} = ({ x }_{(t-1),1},\ldots , { x }_{(t-1),p})^{{\top }}\) and \({\mathbf{z}}_{t-1} = ({ z }_{(t-1),1},\ldots , { z }_{(t-1),q})^{{\top }}\) denote vectors of past explanatory variables. The unknown parameter is \({\pmb \theta }=({\pmb \beta }^{{\top }}, {\pmb \gamma }^{{\top }}, \tau )^{{\top }}\) of dimension \(p+q+1\) for ZINB autoregressions. For identifiability, we assume the dispersion parameter is constant over time. ZIP and ZINB autoregressive model parameters are estimated using maximum partial likelihood (MPL) (e.g. Kedem and Fokianos [9]). The asymptotic normality of the maximum partial likelihood estimators (MPLEs) of the ZIP autoregressive regression parameters is derived in Yang et al. [18]; the proof of asymptotic normality of ZINB autoregressive regression parameter estimates is similar.
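The ZINB conditional variance quoted above can be verified with a short mixture argument. As a sketch, write \(Y_t = B_t N_t\), where, conditional on \({\mathcal{F}}_{t-1}\), \(B_t\) is Bernoulli with success probability \(1-\omega _t\) and \(N_t\) is negative binomial with mean \(\lambda _t\) and variance \(\lambda _t + \lambda _t^2/\tau \), independent of \(B_t\). The symbols \(B_t\) and \(N_t\) are introduced here for illustration only and do not appear in the source.

\[
\begin{aligned}
E(Y_t|{\mathcal{F}}_{t-1}) &= (1-\omega _t)\lambda _t, \\
E(Y_t^2|{\mathcal{F}}_{t-1}) &= (1-\omega _t)\left( \lambda _t + \frac{\lambda _t^2}{\tau } + \lambda _t^2 \right) , \\
{\text{ Var }}(Y_t|{\mathcal{F}}_{t-1}) &= (1-\omega _t)\left( \lambda _t + \frac{\lambda _t^2}{\tau } + \lambda _t^2 \right) - (1-\omega _t)^2 \lambda _t^2 \\
&= \lambda _t (1-\omega _t)\left( 1 + \lambda _t \omega _t + \frac{\lambda _t}{\tau } \right) .
\end{aligned}
\]

Letting \(\tau \rightarrow \infty \) removes the term \(\lambda _t/\tau \) and recovers the ZIP variance \(\lambda _t(1-\omega _t)(1+\lambda _t \omega _t)\), while the conditional mean is the same in both models.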

Model specification based on the conditional mean of ZIP or ZINB, or in fact of any other zero-inflated distribution, remains valid no matter which zero-inflated data generating mechanism gives rise to the observed time series, as the conditional means are identical. Within the estimating function framework, model parameters are not identifiable based on conditional mean specification alone (Thavaneswaran and Ravishanker [14]). The estimating function method requires, at a minimum, specification of the first- and second-order conditional moments, where the second-order moment is typically borrowed from a specific parametric distribution. However, the conditional variance specified by the NB distribution, as an example, for the non-degenerate part, may not be valid for overdispersed count time series in practice. To overcome the lack of robustness against higher-order *conditional* distribution moment misspecification, in the next section, we adopt the functional response modelling approach.

### Functional Response Models

Functional response models (FRMs) are distribution-free regression models that include generalized linear models (GLMs) as a special case but free the researcher from the limitations of usual regression modelling. For a thorough treatment of FRMs, see Chapter 6 of Kowalski and Tu [10]. Distribution-free regression models based on functional response modelling simply require specification of the conditional mean of a function of the response series given past information and past regressors. The specification is

where \(\pmb f\) denotes some vector-valued function, \(\pmb h(\pmb \theta )\) is a smooth vector-valued function, and \({\pmb \theta }\) is the vector of unknown regression parameters. Our choice of \(\pmb f(.)\) function is based on ZIP autoregressive specification in (2.1), (2.2), and (2.3). Further details of our specification are provided in “Appendix 1”. Let

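The benefits of FRM become apparent when we examine estimation based on conditional mean structure of the response \({ Y }_{t}\) in contrast to the proposal in (2.7). Inference based on the conditional mean of \({ Y }_{t}\) alone, assuming a conditionally ZIP (or any conditionally zero-inflated process), remains valid via the specification

\[ E({ Y }_t|{\mathcal{F}}_{t-1}) = \lambda _t (1-\omega _t) = \frac{\exp ({\mathbf{x}}_{t-1}^{{\top }} {\pmb \beta })}{1+\exp ({\mathbf{z}}_{t-1}^{{\top }} {\pmb \gamma })}. \]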

However, \(\pmb \gamma \) and \({\pmb \beta }\) are not estimable because they are not identifiable based on solely specifying the conditional mean. One distribution-free approach to solve the lack of identifiability issue would be to specify the conditional variance along with the conditional mean and then estimate model parameters by solving the optimal Godambe estimating equation [14]. Unlike our FRM, the optimal Godambe estimating function approach is sensitive to variance (as well as higher-order moment) misspecification.
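The lack of identifiability from the conditional mean alone can be seen in a small numerical sketch. It assumes an intercept-only model with hypothetical parameters `beta0` and `gamma0`, and it uses the ZIP zero probability as one possible second functional; this choice is made here for illustration only, since the exact \(\pmb f\) used by the authors is given in their Appendix 1.

```python
import math

def zip_parameters(beta0, gamma0):
    """Map intercepts to (lambda, omega) through the log and logit links."""
    lam = math.exp(beta0)
    omega = 1.0 / (1.0 + math.exp(-gamma0))
    return lam, omega

def conditional_mean(beta0, gamma0):
    lam, omega = zip_parameters(beta0, gamma0)
    return lam * (1.0 - omega)

def zero_probability(beta0, gamma0):
    """P(Y = 0) under a ZIP distribution (assumed here for illustration)."""
    lam, omega = zip_parameters(beta0, gamma0)
    return omega + (1.0 - omega) * math.exp(-lam)

# Two different parameter vectors: (lambda=2, omega=0.5) and (lambda=4, omega=0.75)
theta_a = (math.log(2.0), 0.0)
theta_b = (math.log(4.0), math.log(3.0))

mean_a, mean_b = conditional_mean(*theta_a), conditional_mean(*theta_b)
zero_a, zero_b = zero_probability(*theta_a), zero_probability(*theta_b)
```

Both parameter vectors give a conditional mean of 1, so the mean alone cannot separate them, whereas the assumed second functional takes clearly different values for the two.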

## Estimation and Inference

For the FRM in (2.6) and (2.7), let \({\pmb \theta }=({\pmb \beta }^{{\top }}, {\pmb \gamma }^{{\top }})^{{\top }}\) and

The estimate of \({\pmb \theta }\) is found by solving the optimal martingale EF given by the following equation for \(\pmb \theta .\)

Let \({\widehat{{\pmb \theta }}}_{QL}\) be the solution of the optimal martingale EF and estimator of \({\pmb \theta }\). Under mild regularity conditions, \(\displaystyle \sqrt{n}({\widehat{{\pmb \theta }}}_{QL} - {\pmb \theta }) \xrightarrow []{D} N(\pmb 0, K_{\theta })\) as \(n \rightarrow \infty ,\) where \(K_{\theta } = B^{-1} E \left( D_t^{{\top }} V_t^{-1} S_t S_t^{{\top }} V_t^{-1} D_t \right) B^{-{\top }}\) and \(B=E(D_t^{{\top }} V_t^{-1} D_t).\) A consistent estimate of \(K_{\theta }\) is given by the sandwich estimator

with \({\widehat{B}} = \frac{1}{n-1} \sum _{t=1}^n {\widehat{D}}_t^{{\top }} {\widehat{V}}_t^{-1} {\widehat{D}}_t.\) Here, the matrices \({\widehat{B}}, {\widehat{D}}_t, {\widehat{S}}_t,\) and \({\widehat{V}}_t\) are evaluated at \({\widehat{{\pmb \theta }}}_{QL}.\) Regularity conditions for consistency and asymptotic normality of \({\widehat{{\pmb \theta }}}_{QL}\) appear in “Appendix 2”.

## Simulation Study

In the following subsections, we examine small-sample properties of ZIP autoregressions, ZINB autoregressions, and FRM. In each case, we fit the correct likelihood as well as an incorrect likelihood and contrast the resulting estimates with those obtained from the FRM method. Parametric models were fitted using **R**’s **ZIM** package [20]. Rather than using nonlinear equation solvers available in **R** to obtain parameter estimates for FRM, we elected to view the problem of solving the estimating equation as equivalent to minimizing the square of the norm of the estimating equation under certain conditions (see Contreras and Ryan [1] for details) and made use of the **optim** function in **R**. Based on past experience, we have found the minimization approach to solving an EF to be computationally faster than some of the solvers in **R**.
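The idea of solving an estimating equation by minimizing its squared norm can be sketched on a toy problem. The example below is not the authors' FRM estimating function: it assumes a Poisson intercept-only estimating function \(\sum _t (y_t - e^{\beta })\), whose root is known to be \(\log {\bar{y}}\), and it swaps **optim** for a hand-written golden-section search so that the sketch has no dependencies. The data and function names are hypothetical.

```python
import math

def estimating_function(beta, y):
    """Poisson intercept-only estimating function: sum of (y_t - exp(beta))."""
    mu = math.exp(beta)
    return sum(v - mu for v in y)

def squared_norm(beta, y):
    """Objective whose minimizer is the root of the estimating function."""
    return estimating_function(beta, y) ** 2

def golden_section_minimize(f, lo, hi, tol=1e-10):
    """Minimize a unimodal function f on [lo, hi] by golden-section search."""
    inv_phi = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    while abs(b - a) > tol:
        if f(c) < f(d):
            b = d
        else:
            a = c
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
    return (a + b) / 2

y = [0, 2, 1, 4, 3, 0, 5, 1]  # toy counts, not data from the paper
beta_hat = golden_section_minimize(lambda b: squared_norm(b, y), -5.0, 5.0)
```

Because the estimating function is monotone in \(\beta \) here, its square has a single minimum at the root, and the minimizer coincides with \(\log {\bar{y}}\). For vector-valued estimating functions the same objective is the squared Euclidean norm, and the conditions under which the two problems are equivalent are those discussed by Contreras and Ryan [1].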

In all Monte Carlo studies, \(M=5000\) time series of lengths \(n=200\) for Simulation Study 1, \(n=500\) for Simulation Study 2, and \(n=200\) for Simulation Study 3 are generated. The length of time series generated in each of the Simulation Studies 1 and 2 is based on the smallest number of observations needed, determined by trial and error, for optimal performance with respect to criteria used for comparison between estimation procedures, such as size of the test, when the likelihood is correctly specified. The length of time series generated for Simulation Study 3 is rationalized within Sect. 4.3.

Estimates are compared on the basis of mean and median point estimates, relative bias, empirical standard deviation of estimates (ESE), mean asymptotic standard error of estimates (ASE), and empirical Type I error probabilities. Relative bias is computed as the ratio of the usual bias statistic to the true parameter. Empirical Type I error probabilities are the fraction of times the null hypothesis is rejected at the nominal 5% significance level using Wald statistics whose asymptotic distribution is normal. Specifically, for each estimation method, Wald statistics were of the form \(z= ({\hat{\theta }}_i - \theta _{i0})/\widehat{{\text{ s.e. }}}({\hat{\theta }}_i)\), where \(\theta _{i0}\) is the hypothesized value (the true parameter value in these studies) and the estimated standard error in the denominator is the asymptotic standard deviation from each model (i.e. ZIP, ZINB, and FRM). For brevity, only the empirical Type I error rate for testing one of the parameters is reported in Tables 1, 2, and 3. All empirical Type I error rates are found in “Appendix 3”. The covariates used in Simulation Study 1 and Simulation Study 2 are the same as those in the final model for weekly syphilis counts between 2007 and 2010 in Maryland, USA, analyzed in Yang et al. [18]. Parameter settings in Simulation Study 1 are settings used in studies of Yang et al. [18]. In Simulation Studies 2 and 3, we used settings close to those in studies by Yang et al. [19].
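The comparison criteria can be computed from Monte Carlo output along the following lines. This is a minimal sketch with hypothetical function names; the Wald statistic is centred at the null value being tested, and the replicate estimates are drawn from a normal distribution purely to exercise the functions, not to reproduce any table in the paper.

```python
import random
import statistics

def relative_bias(estimates, true_value):
    """(mean estimate - true value) / true value."""
    return (statistics.mean(estimates) - true_value) / true_value

def empirical_type1_error(estimates, std_errors, null_value, z_crit=1.959964):
    """Fraction of replicates whose two-sided Wald test rejects at the 5% level."""
    rejections = sum(
        abs((est - null_value) / se) > z_crit
        for est, se in zip(estimates, std_errors)
    )
    return rejections / len(estimates)

# Synthetic replicates for illustration only: M = 5000 estimates of a parameter
# whose true value is 0.6, each with an asymptotic standard error of 0.1.
rng = random.Random(1)
M, truth, se = 5000, 0.6, 0.1
estimates = [rng.gauss(truth, se) for _ in range(M)]
std_errors = [se] * M

rb = relative_bias(estimates, truth)
ese = statistics.stdev(estimates)      # empirical standard deviation (ESE)
ase = statistics.mean(std_errors)      # mean asymptotic standard error (ASE)
size = empirical_type1_error(estimates, std_errors, truth)
```

When the asymptotic standard errors match the sampling variability, as in this synthetic case, ESE and ASE agree and the empirical size is close to 5%; an ASE that is too small inflates the rejection rate.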

### Simulation Study 1: ZIP Autoregression

Time series of length \(n=200\) were generated from a ZIP autoregressive model using the **rzip** function in R’s **ZIM** package [20]. We specified \(\eta _t\) and \(\xi _t\) in (2.2) and (2.3) as follows:

with parameter vector \({\pmb \theta }=(\beta _0, \beta _1, \gamma _0, \gamma _1)^{{\top }}=(1.2, 0.6, 0.4, -0.8)^{{\top }}.\) ZIP autoregression (i.e. the correct likelihood) and ZINB autoregressions were fitted using the function **zim** within the **ZIM** package in **R**.

FRM performs well for small sample sizes as indicated in Table 1. With ZIP as the correct likelihood, mean ZIP point estimates are close to the true parameter values and have smaller relative bias compared to ZINB. As mean and median estimates are nearly identical, only mean estimates are reported here. Relative bias statistics for FRM-based estimates are in close agreement with those from the correct likelihood (i.e. ZIP). Even for the misspecified likelihood (ZINB), relative bias estimates are small, as expected given correct specification of the conditional mean structure, with maximum relative bias observed to be 3.5%.

A test of \(H_0:\beta _1=0.6\) against the two-sided alternative is conducted for all three models. In all of ZIP, ZINB, and FRM, empirical standard errors are close to their asymptotic counterparts though ZINB had slightly larger standard errors as compared to ZIP and FRM. When \(n=200\), all three methods have empirical Type I error rates that are close to the nominal 5% value. We also determined empirical error rates for testing all other parameters in the model and found their empirical test sizes to be close to 5%; see table in “Appendix 3”.

### Simulation Study 2: ZINB Autoregression

In this study, ZINB autoregressions of length \(n=500\) were generated using the **rzinb** function in **R**’s **ZIM** package. The regression models used were those in Sect. 4.1 with parameter \({\pmb \theta }=(\beta _0, \beta _1, \gamma _0, \gamma _1)^{{\top }}=(3, 0.5, -1, -0.5)^{{\top }}\), and the dispersion parameter associated with the negative binomial distribution was \(\tau =1.5.\) As in Simulation Study 1, the **zim** function within the **ZIM** package in **R** was used to fit ZIP and ZINB autoregressions.

Results are reported in Table 2. Both ZIP and FRM have estimates with small relative bias, as expected since conditional means are correctly specified. When the likelihood is correctly specified as ZINB, its maximum relative bias is smaller than the maximum relative bias of either ZIP or FRM, as expected. ZINB’s maximum relative bias is 3.7%.

Mean asymptotic standard errors for FRM and ZINB agree with their corresponding empirical standard deviations. FRM’s standard errors (empirical and asymptotic) slightly underestimate ZINB’s standard errors (empirical and asymptotic). In contrast, mean asymptotic standard errors of ZIP parameter estimates for the mean (\(\beta _0, \beta _1\)) are much smaller than their empirical counterparts, while ZIP’s asymptotic and empirical standard errors of zero-inflation parameter estimates (\(\gamma _0, \gamma _1\)) are in close agreement.

We report the empirical Type I error rates of ZIP, ZINB, and FRM for testing \(H_0: \beta _1=0.5\) against the two-sided alternative in Table 2. FRM and ZINB empirical test sizes are close to the nominal significance level of 5% when \(n=500.\) For ZIP, the empirical size of the test is 0.613, which stems from underestimation of the empirical standard deviation of \({\hat{\beta }}_1\) by ZIP. We also obtained empirical Type I error rates for testing all other parameters; see table in “Appendix 3”. As ZINB is the correct likelihood, empirical Type I error rates were close to the nominal significance level of 5% across the board. Type I error rates were also close to 5% for all parameters when FRM was the method of estimation. For ZIP, the empirical Type I error rate for testing \(\beta _0\) is 61%, but interestingly, the empirical test sizes for testing parameters in the logistic part were not as badly affected by variance misspecification.

### Simulation Study 3: Dynamic ZINB Time Series Regression

Dynamic zero-inflated models are a class of parameter-driven models from which count time series with low frequencies can be generated. Dynamic zero-inflated models may be conditionally equi-dispersed or overdispersed, and autoregression is governed by a stationary autoregressive process of order *p* (AR(p)). In this section, we examine the performance of non-dynamic ZIP, non-dynamic ZINB, and FRM when data are generated from a dynamic ZINB (Yang et al. [19]). Let \(\{Y_t\}\) be the observed count time series and conditional on a latent autoregressive process (\(z_t\)) assume \(Y_t\) is ZINB(\(\lambda _t, \omega , \tau \)) with the conditional mean and zero-inflation parameter given by

Here, \(z_t\) is a stationary AR(2) process \(z_t = \phi _1 z_{t-1} + \phi _2 z_{t-2} + \varepsilon _t\) and \(\varepsilon _t\) is an independent zero-mean white noise process with standard deviation \(\sigma .\) Conditional on the latent process \(z_t\), the mean and variance of the dynamic ZINB process, derived in Yang et al. [19], are \(E(Y_t|z_t)=\lambda _t(1-\omega )\) and \(Var(Y_t|z_t)=\lambda _t(1-\omega )(1+\omega \lambda _t + \displaystyle \lambda _t/\tau )\), respectively, resulting in a conditionally overdispersed process. We used **dzim.sim** from the **ZIM** package in **R** to generate \(M=5000\) series of length \(n=200\). The parameter values were \(\beta _0=2\) and \(\gamma _0={\text{ logit }}(0.2) = -1.386, \phi _1=0.8, \phi _2=-0.6, \sigma =0.25,\) and \(\tau =2.5\). Without loss of generality, we assess an intercept only model here though covariates may enter into the conditional mean model. Non-dynamic ZIP, ZINB, and our FRM were fitted using the same R functions in Simulation Studies 1 and 2. Results of this study are found in Table 3.
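A data-generating process of this kind can be sketched without the **ZIM** package. The sketch below is not **dzim.sim**; it assumes that the latent process enters the log-linear predictor additively, \(\log \lambda _t = \beta _0 + z_t\), with constant \(\omega = {\text{ logit }}^{-1}(\gamma _0)\), which is an assumption made here because the defining equation is not reproduced in the text. The negative binomial draw uses a gamma-Poisson mixture, and all function names are hypothetical.

```python
import math
import random

def poisson_draw(rng, mean):
    """Knuth's multiplication method; adequate for the moderate means used here."""
    limit, k, prod = math.exp(-mean), 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

def simulate_dynamic_zinb(n, beta0, gamma0, phi1, phi2, sigma, tau,
                          seed=0, burn_in=200):
    """Simulate counts with a latent AR(2) process in the log-mean (assumed form)."""
    rng = random.Random(seed)
    omega = 1.0 / (1.0 + math.exp(-gamma0))  # constant zero-inflation probability
    z_prev2 = z_prev1 = 0.0
    series = []
    for t in range(n + burn_in):
        z = phi1 * z_prev1 + phi2 * z_prev2 + rng.gauss(0.0, sigma)
        z_prev2, z_prev1 = z_prev1, z
        if t < burn_in:
            continue  # discard start-up values so z_t is approximately stationary
        lam = math.exp(beta0 + z)  # assumption: log(lambda_t) = beta0 + z_t
        if rng.random() < omega:
            y = 0  # structural zero
        else:
            # Negative binomial with mean lam and size tau as a gamma-Poisson mixture
            y = poisson_draw(rng, rng.gammavariate(tau, lam / tau))
        series.append(y)
    return series

gamma0 = math.log(0.2 / 0.8)  # logit(0.2), approximately -1.386
y = simulate_dynamic_zinb(n=200, beta0=2.0, gamma0=gamma0,
                          phi1=0.8, phi2=-0.6, sigma=0.25, tau=2.5)
```

The parameter values passed in the call are those stated in the text; the burn-in length and seed are arbitrary choices for the sketch.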

The relative bias in both estimates is large (at least 5%) for FRM and non-dynamic ZIP, whereas non-dynamic ZINB estimates have the smallest relative bias (at most 7%). The empirical standard deviations of FRM and non-dynamic ZINB are close to their mean asymptotic ones, unlike non-dynamic ZIP, where the mean asymptotic standard error of \({\hat{\beta }}_0\) grossly underestimates the empirical standard deviation. The empirical size of the test of hypothesis \(H_0: \beta _0=2\) against its two-sided alternative, for each of non-dynamic ZIP, non-dynamic ZINB, and FRM, is reported in Table 3. Empirical Type I error rates for testing \(\gamma _0\) appear in “Appendix 3”. For both parameters, test size for all three methods fails to be close to the nominal 5% level. Non-dynamic ZIP performs the poorest with respect to test size, which is not surprising given that its mean asymptotic standard errors are much smaller than their empirical counterparts. Though non-dynamic ZINB performs best with respect to test size, at 11% it is not close to the nominal significance level. Even when we increased the series length to \(n=1000\) (results not shown), none of non-dynamic ZIP, non-dynamic ZINB, or FRM performed well with respect to relative bias and empirical test sizes when data arise from dynamic ZINB processes. However, with sample size as low as \(n=200\), mean asymptotic standard errors for FRM and ZINB estimates were close to their empirical standard deviations. As FRM is robust to conditional variance misspecification, this latter observation suggests FRM is amenable to estimating zero-inflated overdispersed parameter-driven processes well after adjustments for excess bias; see Sect. 6 for more details.

## Data Example

In disease surveillance, public health researchers have an interest in detecting trends in communicable diseases. The *Morbidity and Mortality Weekly Report (MMWR)*, produced by the *Centers for Disease Control and Prevention* in the USA, provides researchers with rates and proportions of communicable diseases. MMWRs are typically descriptive in nature, and statements about whether there has been an increase or decrease in disease rates within a time interval do not have *p* values or confidence intervals attached to them.

In this section, we analyze a time series of weekly syphilis cases in Virginia, USA, available from the **ZIM** package in **R**, and statistically test for the presence of a linear trend in the number of weekly syphilis cases between 2007 and 2010. We carry out hypothesis testing by fitting ZIP and ZINB autoregressions and using our FRM method. We demonstrate how our FRM aids in robustness checks for postulated parametric models in order to assess the statistical significance of linear trends.

From the time series plot shown in Fig. 1, there appears to be a slight increase in the number of weekly cases from 2007 until 2010, but further analysis is needed to justify any such statement. Of the \(n=209\) observations in the data, 56 (27%) are zero counts suggesting potential zero-inflation. From the bar chart of observed data counts in Fig. 2, it appears there is a mixture of two distributions: a degenerate distribution with point mass at zero and a separate discrete distribution. It is therefore reasonable to assume that the observed data ought to be fitted using a zero-inflated probability distribution. We begin by assessing the presence of linear trend by fitting ZIP and ZINB autoregressions using the **zim** function from the **ZIM** package in **R**.

The models we fitted were the final models selected by Yang et al. [18] for weekly syphilis cases in Maryland, USA, from 2007 until 2010. The models for the log-linear and logistic parts, respectively, were the following: \(\displaystyle \eta _t = \beta _0 + \beta _1 I_{(y_{t-1} > 0)} + \beta _2 x_t\) and \(\xi _t = \gamma _0 + \gamma _1 x_t,\) where \(x_t = t/1000\) for \(t=1,\ldots ,209\) is used to model trend. It is of interest to note that our choice of time trend covariate was also utilized by Zeger [22] for testing whether there is a decrease in observed monthly polio counts in the United States between January 1970 and December 1983 for a time series of length \(n=168.\) Another time trend scaling in the literature is of the form *t*/*n*, where *t* represents the time index and *n* is the length of the time series. Asymptotic justification for *t*/*n* type scaling of time is found on p. 492 in Davis et al. [3]. Autoregression is introduced using \(I_{(y_{t-1} > 0)}\), an indicator function for a positive count at one lagged time point.
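The covariates entering \(\eta _t\) and \(\xi _t\) can be assembled from the raw series as in the following sketch. The function name `build_design` and the dictionary layout are hypothetical, and the toy series is not the syphilis data; the first observation is dropped here because its lagged value is unavailable, which is an assumption about how the start of the series is handled.

```python
def build_design(y, scale=1000.0):
    """Rows of covariates for eta_t = b0 + b1*I(y_{t-1} > 0) + b2*x_t and xi_t = g0 + g1*x_t."""
    rows = []
    for t in range(2, len(y) + 1):       # t is 1-based; t = 1 has no lagged response
        lag_positive = 1 if y[t - 2] > 0 else 0   # indicator I(y_{t-1} > 0)
        x_t = t / scale                           # trend covariate x_t = t / 1000
        rows.append({
            "t": t,
            "y": y[t - 1],
            "log_linear": [1.0, lag_positive, x_t],  # multiplies (beta0, beta1, beta2)
            "logistic": [1.0, x_t],                  # multiplies (gamma0, gamma1)
        })
    return rows

toy_counts = [0, 3, 0, 0, 2]  # toy series for illustration only
design = build_design(toy_counts)
```

Dividing the time index by 1000 keeps \(x_t\) on a small scale, so the trend coefficients \(\beta _2\) and \(\gamma _1\) are interpreted per 1000 weeks.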

Next, we proceeded to fit our FRM to assess the robustness of parametric model assumptions. We used parameter estimates from fitted ZIP and ZINB autoregressive models as starting values for solving the FRM estimating function in Eq. (3.2). We perturbed FRM estimates and used them as starting values for FRM until the change in parameter estimates was negligible.

A comparison of parameter estimates, standard errors, and *p* values, reported in Table 4, reveals some similarities but also some differences between the models and the conclusions that can be drawn. In particular, all three methods report an increase in the mean number of weekly syphilis cases over time with fairly close parameter estimates (see estimates of \({\beta }_2\)). With a *p* value of about 0.001, the evidence for an increasing trend is overwhelming from ZIP, whereas no trend is detected from ZINB (*p* value = 0.046), and the evidence for an increase in mean weekly cases, though arguably present at the 5% significance level, is not overwhelming based on FRM (*p* value = 0.024). ZIP incorrectly identifies a trend in the mean of weekly cases because it underestimates the standard error of the trend effect; the corresponding estimated standard errors from ZINB and FRM are larger and correct for underestimation by ZIP. All three methods identify the presence of zero-inflation, though none of the methods detects a trend in the proportion of excess zero cases over time.
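The effect of an underestimated standard error on a Wald *p* value can be illustrated with a short sketch. The estimate and standard errors below are hypothetical numbers chosen for illustration and are not entries of Table 4; the function name is likewise hypothetical.

```python
import math

def wald_two_sided_p(estimate, std_error, null_value=0.0):
    """Two-sided p value for a Wald z statistic under a standard normal reference."""
    z = (estimate - null_value) / std_error
    return math.erfc(abs(z) / math.sqrt(2.0))

# Hypothetical numbers for illustration only; they are not entries of Table 4.
p_small_se = wald_two_sided_p(estimate=2.0, std_error=0.6)  # understated standard error
p_large_se = wald_two_sided_p(estimate=2.0, std_error=1.0)  # larger standard error
```

With the same point estimate, the smaller standard error yields a much smaller *p* value, which is the mechanism by which a model that understates variability can overstate the evidence for a trend.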

To choose between the conclusions from ZINB and FRM, we compared the parameter estimates of both models in more detail. For the log-linear part, ZINB and FRM parameter estimates are similar. Parameter estimates within the logit part of ZINB and FRM, however, are different leading us to question the adequacy of the negative binomial distribution as a model for the conditional variance structure.

At this point in the analysis, it would be difficult to justify either ZIP or ZINB as suitable parametric models. Typical parametric analysis involves identifying an adequate data generating mechanism, whose suitability, we argue, can be determined with the aid of our FRM. As FRM is robust against variance misspecification, it is safer to rely on inference based on FRM rather than ZINB. We conclude the evidence for an increase in mean weekly number of syphilis cases is somewhat present and warrants further investigation but is not cause for alarm.

## Discussion

Serially dependent count data occur frequently in diverse applications such as public health, economics, and environmental sciences. Often, such data exhibit an abundance of zero counts and overdispersion; time series regression models for such data are ZIP and ZINB autoregressive processes. While ZINB may provide a better fit than ZIP, the suitability of ZINB’s variance structure may be difficult to justify empirically for real data. FRM is a distribution-free approach to zero-inflated and overdispersed count data modelling that has been successfully adopted for independent data regression (Tang et al. [13]) and longitudinal studies (Zhang et al. [23]) with zero-inflation.

To the best of our knowledge, the suitability of FRM has not been investigated within discrete-time stochastic process set-ups. We investigated FRM’s suitability for regression modelling of serially dependent zero-inflated and overdispersed count data. In our simulation studies, FRM performed well when time series arise from observation-driven ZIP and ZINB autoregressive models. FRM fared poorly in the simulation study where the true process is parameter-driven (i.e. dynamic) ZINB with respect to relative bias and size of tests. On the other hand, when data arise from a dynamic ZINB process, FRM-based mean asymptotic standard errors were close to empirical standard deviations in simulations leading us to propose bias correction and prevention of FRM as a useful extension of this work. Bias correction has been extensively studied for models estimated by generalized estimating equations; for example, see Paul and Zhang [12]. Another direction for research is FRM for zero-inflated and overdispersed integer autoregressive models (e.g. Weiß et al. [16]). For our FRM approach, we have developed R code which is available from the corresponding author upon request.

An anonymous reviewer pointed out another distribution-free approach to zero-inflated count data modelling by Valle et al. [15] that appeared after we submitted our manuscript. In Valle et al. [15], data are hypothesized to arise from an ordered multinomial probit model with class membership of the observed data determined by an unobserved latent parameter. Class interval endpoints of the multinomial distribution are then estimated from data via a Bayesian framework. It is not possible to estimate the magnitude of the regression coefficients for covariate effects using the method in Valle et al. [15], but it is possible to identify which parameter estimates are different from zero as well as to determine parameter estimate sign. In contrast, regression parameters are estimable within our FRM method and estimates are optimal in the Godambe-information sense allowing for confidence interval construction in addition to hypothesis testing.

The approach of Valle et al. [15] was shown in simulation studies to correctly identify significant covariates when they enter the model as linear spline variables. The assumption of a linear trend, while sufficient for testing the presence of linear trends, does not fully capture complex shapes that can occur in time trends, especially after an intervention for infectious diseases, for example. We would like to investigate whether the approach in Valle et al. [15] is amenable to time series data. Another potential future direction for research would be to adapt our FRM approach to adjust for smooth functions of covariates to permit nonlinear time trends.

## References

1. Contreras M, Ryan LM (2000) Fitting nonlinear and constrained generalized estimating equations with optimization software. Biometrics 56(4):1268–1271

2. Cox DR, Gudmundsson G, Lindgren G, Bondesson L, Harsaae E, Laake P, Juselius K, Lauritzen SL (1981) Statistical analysis of time series: some recent developments [with discussion and reply]. Scand J Stat 8:93–115

3. Davis RA, Dunsmuir WTM, Wang Y (2000) On autocorrelation in a Poisson regression model. Biometrika 87(3):491–505

4. Davis RA, Holan SH, Lund R, Ravishanker N (2016) Handbook of discrete-valued time series. CRC Press, Boca Raton

5. Godambe VP, Heyde CC (2010) Quasi-likelihood and optimal estimation. In: Maller R, Basawa I, Hall P, Seneta E (eds) Selected works of C.C. Heyde. Selected works in probability and statistics. Springer, New York, NY

6. Godambe VP (1985) The foundations of finite sample estimation in stochastic processes. Biometrika 72(2):419–428

7. Hwang SY, Basawa IV (2011) Godambe estimating functions and asymptotic optimal inference. Stat Prob Lett 81(8):1121–1127

8. Jacod J, Sørensen M (2018) A review of asymptotic theory of estimating functions. Stat Inference Stoch Process 21:415–434. https://doi.org/10.1007/s11203-018-9178-8

9. Kedem B, Fokianos K (2005) Regression models for time series analysis, vol 488. Wiley, Hoboken

10. Kowalski J, Tu XM (2008) Modern applied U-statistics, vol 714. Wiley, Hoboken

11. Liang KY, Zeger SL (1986) Longitudinal data analysis using generalized linear models. Biometrika 73(1):13–22

12. Paul S, Zhang X (2014) Small sample GEE estimation of regression parameters for longitudinal data. Stat Med 33(22):3869–3881

13. Tang W, Lu N, Chen T, Wang W, Gunzler DD, Han Y, Tu XM (2015) On performance of parametric and distribution-free models for zero-inflated and over-dispersed count responses. Stat Med 34(24):3235–3245

14. Thavaneswaran A, Ravishanker N (2015) Estimating equation approaches for integer-valued time series models. In: Lund R, Davis RA, Holan SH, Ravishanker N (eds) Handbook of discrete-valued time series. Chapman and Hall/CRC, Boca Raton, pp 145–163

15. Valle D, Toh KB, Laporta GZ, Zhao Q (2019) Ordinal regression models for zero-inflated and/or over-dispersed count data. Sci Rep 9(1):3046

16. Weiß CH, Homburg A, Puig P (2016) Testing for zero inflation and overdispersion in INAR(1) models. Stat Pap 60:823–848

17. Yang M (2012) Statistical models for count time series with excess zeros. PhD (Doctor of Philosophy) thesis, University of Iowa. https://doi.org/10.17077/etd.bcrq9mz0

18. Yang M, Zamba GK, Cavanaugh JE (2013) Markov regression models for count time series with excess zeros: a partial likelihood approach. Stat Methodol 14:26–38

19. Yang M, Cavanaugh JE, Zamba GK (2015) State-space models for count time series with excess zeros. Stat Modell 15(1):70–90

20. Yang M, Zamba GKD, Cavanaugh JE (2017) ZIM: zero-inflated models for count time series with excess zeros. R package version 1.0.3. https://CRAN.R-project.org/package=ZIM

21. Yu Q, Chen R, Tang W, He H, Gallop R, Crits-Christoph P, Hu J, Tu XM (2013) Distribution-free models for longitudinal count responses with overdispersion and structural zeros. Stat Med 32(14):2390–2405

22. Zeger SL (1988) A regression model for time series of counts. Biometrika 75(4):621–629

23. Zhang H, Tang L, Kong Y, Chen T, Liu X, Zhang Z, Zhang B (2018) Distribution-free models for latent mixed population responses in a longitudinal setting with missing data. Stat Methods Med Res 28(10–11):3273–3285

## Acknowledgements

The authors wish to thank two anonymous referees for their careful reading of this manuscript. Their suggestions and remarks led to significant improvements in the quality and content of our original work. In addition, the authors thank Dr. Wan Tang (Tulane University) for sharing R code written for fitting independent data ZIP and ZINB regression using FRM.

## Ethics declarations

### Conflict of interest

The authors declare that they have no conflict of interest.

## Additional information

The research of M. Ghahramani is partially supported by a grant from Natural Sciences and Engineering Research Council (NSERC) of Canada. S.S. White was supported by an NSERC Undergraduate Student Research Award.

## Appendices

### Appendix 1: Conditional Mean Vector and Conditional Variance Matrix of \(\pmb f_t\)

Under the FRM model in (2.7) which assumes ZIP specification for the response variable \(Y_t\), it is easy to verify that elements of \(\pmb h_t = (h_{1t}, h_{2t})^{{\top }}=(E(f_{1t}|{\mathcal{F}}_{t-1}), E(f_{2t}|{\mathcal{F}}_{t-1}))^{{\top }}\) are given by the following expressions:

Under the FRM model in (2.7) and the assumption of ZIP specification for response \(Y_t\) once more, the entries of the variance–covariance matrix (\(V_t\)) of \(S_t =\pmb f_t - \pmb h_t\) are as follows:

Note: \(\displaystyle \frac{\partial \pmb h_t}{\partial {\pmb \theta }} = \left[ \begin{array}{cc} \frac{\partial h_{1t}}{\partial {\pmb \beta }} &{} \frac{\partial h_{1t}}{\partial \pmb \gamma } \\ \frac{\partial h_{2t}}{\partial {\pmb \beta }} &{} \frac{\partial h_{2t}}{\partial \pmb \gamma } \end{array} \right] \) is of dimension \(2 \times k,\) where \(k=p+q.\)
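As a hedged illustration of the kind of quantities this appendix refers to, the sketch below computes a conditional mean vector and variance matrix for a ZIP response. It assumes, for illustration only, the response functions \(\pmb f_t = (Y_t, I(Y_t=0))^{\top}\); the definition actually used in the paper is the one in the main text. The names `zip_moments`, `lam` (Poisson mean) and `omega` (zero-inflation probability) are hypothetical. The closed forms are standard ZIP moments and are checked against a direct summation over the probability mass function.

```python
import math

def zip_moments(lam, omega):
    """Return (h, V) for f = (Y, 1{Y = 0}) when Y ~ ZIP(lam, omega).

    Assumption: f is chosen as (Y, 1{Y = 0}); this is illustrative only.
    """
    p0 = omega + (1.0 - omega) * math.exp(-lam)  # P(Y = 0)
    mean = (1.0 - omega) * lam                   # E(Y)
    var_y = mean * (1.0 + omega * lam)           # Var(Y)
    var_i = p0 * (1.0 - p0)                      # Var(1{Y = 0})
    cov = -mean * p0                             # Cov(Y, 1{Y = 0}); Y * 1{Y = 0} = 0
    return [mean, p0], [[var_y, cov], [cov, var_i]]

def zip_pmf(y, lam, omega):
    """ZIP probability mass function."""
    pois = math.exp(-lam) * lam ** y / math.factorial(y)
    return (omega if y == 0 else 0.0) + (1.0 - omega) * pois

def brute_force_moments(lam, omega, y_max=80):
    """Same quantities by summing over the pmf (truncated at y_max)."""
    probs = [zip_pmf(y, lam, omega) for y in range(y_max + 1)]
    ey = sum(y * p for y, p in enumerate(probs))
    ey2 = sum(y * y * p for y, p in enumerate(probs))
    p0 = probs[0]
    cov = 0.0 - ey * p0                          # E(Y * 1{Y = 0}) is zero
    return [ey, p0], [[ey2 - ey ** 2, cov], [cov, p0 * (1.0 - p0)]]

if __name__ == "__main__":
    h, V = zip_moments(lam=2.5, omega=0.3)
    print("h =", h)
    print("V =", V)
```

In a regression setting, `lam` and `omega` would be replaced by time-varying quantities driven by covariates and past observations; that dependence is deliberately left out here.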

### Appendix 2: Asymptotic Normality of FRM-Based Estimator

Consider a strictly stationary and ergodic stochastic process \(\{Y_t\}.\) In what follows, let \({\mathcal{F}}_{t}\) be the \(\sigma \)-field generated by past observations and past covariate values (\(\{x_t\}\) and \(\{z_t\}\)) up to and including time *t* for \(t=1,\ldots ,n.\) Put \(S_t=\pmb f_t - \pmb h_t.\) Then, \(\{S_t\}\) is a zero-mean martingale difference sequence. Clearly, \(E(S_{t+1}|{\mathcal{F}}_{t})=\pmb 0.\) Moreover, \(E(S_t)= \pmb 0\) since \(E(S_t|{\mathcal{F}}_{t-1})= \pmb 0\) and \(E(S_t) = E\left[ E(S_t |{\mathcal{F}}_{t-1})\right] .\) Consider the class *L* of linear EFs \(U_n({\pmb \theta })\) defined by

\[ U_n({\pmb \theta }) = \sum _{t=1}^n W_{t-1}({\pmb \theta })\, S_t({\pmb \theta }), \]

where \(\{S_t\}\) is the sequence of vector martingale differences defined by Eq. (2.7) and \(W_{t-1}({\pmb \theta })\) is an \({\mathcal{F}}_{t-1}\)-measurable weight matrix of order \((p+q)\times 2.\) The following assumptions are satisfied by \(\pmb S_t\):

1. \(E\left( \frac{\partial S_t({\pmb \theta })}{\partial {\pmb \theta }} | {\mathcal{F}}_{t-1} \right) \ne \pmb 0\),

2. \(E\left( S_t({\pmb \theta }) S_t({\pmb \theta })^{{\top }}| {\mathcal{F}}_{t-1}\right) \) is invertible.

The optimal EF within *L* in the Godambe-optimal sense (Godambe [6]) is given by

The EF \(U_n^o\) was obtained by maximizing the \((p+q)\times (p+q)\) Godambe-information matrix

As \(E\left( \frac{\partial S_t({\pmb \theta })}{\partial {\pmb \theta }} | {\mathcal{F}}_{t-1} \right) = -D_t^{{\top }}\) and \(E\left( S_t({\pmb \theta }) S_t({\pmb \theta })^{{\top }}| {\mathcal{F}}_{t-1}\right) =Var(\pmb f_t|{\mathcal{F}}_{t-1})=V_t\) in Eq. (3.1), the optimal EF in Eq. (3.2) follows. Hwang and Basawa [7] showed that if the EFs in the class *L* are smooth in the sense of condition (C1), then any consistent solution \({\hat{{\pmb \theta }}}_n\) of \(U_n({\pmb \theta })=\pmb 0\) and \({\hat{{\pmb \theta }}}^o_n\) of \(U_n^o({\pmb \theta })=\pmb 0\) is asymptotically normal. Condition (C1) is as follows.

Condition (C1): For each fixed \({\pmb \theta }\), there exists an open neighbourhood \(N_n({\pmb \theta })\) about \({\pmb \theta }\) such that

and, uniformly in \({\pmb \theta }^*\) on \(N_n({\pmb \theta }),\)

The asymptotic distribution of \({\hat{{\pmb \theta }}}_n\) and \({\hat{{\pmb \theta }}}^o_n\) is given by the following theorem.

**Theorem**
*Under (C1), as* \(n \rightarrow \infty ,\)

*and*

*Proof*

This is an application of Eq. (3.7) of Theorem 2 in Hwang and Basawa [7], where \(J^o=J^o({\pmb \theta })\) is Eq. (3.3) of Hwang and Basawa [7]. In particular,

A consistent estimate of \(J^o({\pmb \theta })\) is \(\widehat{J^o}({\pmb \theta }) = n^{-1} \sum _{t=1}^n D_t^{{\top }} V_t^{-1} D_t\), and if one can construct a sequence of weakly consistent estimators \({\widehat{V}}_t\) for \(V_t\) and each \({\widehat{V}}_t\) is positive-definite, then a consistent estimator of the variance of \(\widehat{J^o}({\pmb \theta })\) is \(\widehat{J^o}({\widehat{{\pmb \theta }}}^o_n).\) As protection against variance misspecification, we use the sandwich estimator instead of \(\widehat{J^o}({\widehat{{\pmb \theta }}}^o_n)\) for the asymptotic variance (cf. Liang and Zeger [11]). Equations (3.1) and (3.2) of Hwang and Basawa [7] give the general form of the *J* and *H* matrices. \(\square \)

The existence and uniqueness of a sequence of consistent estimators that solve the equation \(U_n^o({\pmb \theta })=\pmb 0\) for EFs in class *L* can be justified using Condition 2.2 and Theorem 2.5 in Jacod and Sørensen [8].
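To make the estimating-function machinery of this appendix concrete, the following is a minimal sketch rather than the authors' R implementation. It assumes an intercept-only ZIP model with \(\lambda = e^{\beta }\) and \(\omega = 1/(1+e^{-\gamma })\), the illustrative choice \(\pmb f_t = (Y_t, I(Y_t=0))^{\top}\), and no serial dependence, so the weights are constant over *t*. The names `zip_parts` and `fit_frm` are hypothetical, and `G` denotes \(\partial \pmb h/\partial {\pmb \theta }\) (a \(2\times 2\) matrix here) to sidestep transpose conventions. The root of the optimally weighted EF is found by Fisher scoring, and the variance is estimated with a sandwich form \(J^{-1} H J^{-1}/n\).

```python
import math

def inv2(a):
    """Inverse of a 2x2 matrix stored as nested lists."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det],
            [-a[1][0] / det, a[0][0] / det]]

def matmul(a, b):
    """Product of two small matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

def zip_parts(beta, gamma):
    """h, V and G = dh/dtheta for f = (Y, 1{Y = 0}), intercept-only ZIP."""
    lam = math.exp(beta)
    omega = 1.0 / (1.0 + math.exp(-gamma))
    e = math.exp(-lam)
    mean = (1.0 - omega) * lam
    p0 = omega + (1.0 - omega) * e
    h = [mean, p0]
    V = [[mean * (1.0 + omega * lam), -mean * p0],
         [-mean * p0, p0 * (1.0 - p0)]]
    d_omega = omega * (1.0 - omega)              # d omega / d gamma
    G = [[mean, -lam * d_omega],                 # d h1 / d(beta, gamma)
         [-(1.0 - omega) * lam * e, (1.0 - e) * d_omega]]  # d h2 / d(beta, gamma)
    return h, V, G

def fit_frm(y, tol=1e-10, max_iter=100):
    """Solve sum_t G^T V^{-1} (f_t - h) = 0; return (beta, gamma) and sandwich cov."""
    n = len(y)
    f = [[float(v), 1.0 if v == 0 else 0.0] for v in y]
    fbar = [sum(r[0] for r in f) / n, sum(r[1] for r in f) / n]
    # Crude starting values (assumes some, but not all, counts are zero).
    beta = math.log(fbar[0] / (1.0 - fbar[1]))   # log mean of the positive counts
    gamma = 0.0
    for _ in range(max_iter):
        h, V, G = zip_parts(beta, gamma)
        A = matmul(transpose(G), inv2(V))        # optimal weight G^T V^{-1}
        J = matmul(A, G)                         # n^{-1} sum_t G^T V^{-1} G
        U = matmul(A, [[fbar[0] - h[0]], [fbar[1] - h[1]]])  # U_n / n
        step = matmul(inv2(J), U)                # Fisher scoring step
        beta += step[0][0]
        gamma += step[1][0]
        if max(abs(step[0][0]), abs(step[1][0])) < tol:
            break
    h, V, G = zip_parts(beta, gamma)
    A = matmul(transpose(G), inv2(V))
    Jinv = inv2(matmul(A, G))
    H = [[0.0, 0.0], [0.0, 0.0]]                 # n^{-1} sum_t (A S_t)(A S_t)^T
    for r in f:
        u = matmul(A, [[r[0] - h[0]], [r[1] - h[1]]])
        for i in range(2):
            for j in range(2):
                H[i][j] += u[i][0] * u[j][0] / n
    cov = matmul(matmul(Jinv, H), transpose(Jinv))
    return (beta, gamma), [[c / n for c in row] for row in cov]

if __name__ == "__main__":
    counts = [0] * 35 + [1] * 10 + [2] * 16 + [3] * 16 + [4] * 12 + [5] * 7 + [6] * 3 + [7]
    (b, g), cov = fit_frm(counts)
    print("lambda:", math.exp(b), "omega:", 1.0 / (1.0 + math.exp(-g)))
    print("standard errors:", math.sqrt(cov[0][0]), math.sqrt(cov[1][1]))
```

With covariates and lagged responses, `G` and `V` would vary with *t* and the sums could no longer be collapsed into sample means; the toy counts above are made up purely to exercise the code.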

### Appendix 3: Empirical Type I Error Rates from Simulation Studies

See Table 5.

## About this article

### Cite this article

Ghahramani, M., White, S.S. Time Series Regression for Zero-Inflated and Overdispersed Count Data: A Functional Response Model Approach.
*J Stat Theory Pract* **14**, 29 (2020). https://doi.org/10.1007/s42519-020-00094-8

### Keywords

- Estimating function
- Overdispersion
- Regression
- Time series
- Zero-inflation

### Mathematics Subject Classification

- 62G99
- 62M05
- 62M10
- 62P10
- 62P12
- 62P20