Abstract
The marginalized zero-inflated Poisson (MZIP) regression model quantifies the effects of explanatory variables on the marginal mean of the mixture population. In practice, covariates are often only partially observed. We first study the maximum likelihood estimator when all variables are observed. Then, assuming that the selection probability depends on mixed covariates (continuous, discrete, and categorical), we propose a semiparametric inverse-probability-weighted (SIPW) method for estimating the parameters of the MZIP model when covariates are missing at random (MAR). The asymptotic properties (consistency, asymptotic normality) of the proposed estimators are established under certain regularity conditions. The performance of the proposed estimators is evaluated through numerical studies, and the SIPW results are compared with those obtained by a semiparametric inverse-probability-weighted kernel-based (SIPWK) estimator. Finally, we apply our methodology to a dataset on health care demand in the United States.
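For readers who want to experiment numerically, the following is a minimal sketch (not from the paper; the function name `mzip_nll` and the data layout are our own choices) of the MZIP negative log-likelihood in the usual marginalized parameterization, with optional inverse-probability weights of the kind the SIPW estimator uses for complete cases:

```python
import math
import numpy as np

def mzip_nll(theta, X, Z, y, w=None):
    """Negative log-likelihood of a marginalized ZIP (MZIP) model.

    logit(psi_i) = X_i @ alpha   (excess-zero probability),
    log(nu_i)    = Z_i @ gamma   (marginal mean of Y_i),
    mu_i = nu_i / (1 - psi_i)    (Poisson mean of the count component).

    w, if given, holds inverse-probability weights for a
    complete-case analysis; w=None corresponds to fully observed data.
    """
    p = X.shape[1]
    alpha, gamma = theta[:p], theta[p:]
    psi = 1.0 / (1.0 + np.exp(-(X @ alpha)))
    nu = np.exp(Z @ gamma)
    mu = nu / (1.0 - psi)
    log_fact = np.array([math.lgamma(v + 1.0) for v in y])  # log(y!)
    ll = np.where(
        y == 0,
        np.log(psi + (1.0 - psi) * np.exp(-mu)),          # zero outcomes
        np.log1p(-psi) - mu + y * np.log(mu) - log_fact,  # positive counts
    )
    if w is None:
        w = np.ones_like(ll)
    return -float(np.sum(w * ll))
```

This function can be handed to a generic optimizer (e.g. `scipy.optimize.minimize`) to obtain the maximum likelihood or weighted estimates; the sketch deliberately omits the variance estimation discussed in the appendix.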
REFERENCES
D. Lambert, ‘‘Zero-inflated Poisson regression with an application to defects in manufacturing,’’ Technometrics 34, 1–14 (1992).
E. Dietz and D. Böhning, ‘‘On estimation of the Poisson parameter in zero-modified Poisson models,’’ Comput. Stat. Data Anal. 4, 441–459 (2000).
K. K. W. Yau and A. H. Lee, ‘‘Zero-inflated Poisson regression with random effects to evaluate an occupational injury prevention programme,’’ Stat. Med. 20, 2907–2920 (2001).
A. O. Diallo, A. Diop, and J.-F. Dupuy, ‘‘Asymptotic properties of the maximum likelihood estimator in zero-inflated binomial regression,’’ Commun. Stat. Theory Methods 46 (20), 9930–9948 (2017).
Y. B. Cheung, ‘‘Zero-inflated models for regression analysis of count data: a study of growth and development,’’ Stat. Med. 21, 1461–1469 (2002).
M. Reilly and M. S. Pepe, ‘‘A mean score method for missing and auxiliary covariate data in regression models,’’ Biometrika 82, 299–314 (1995).
A. Diallo, A. Diop, and J.-F. Dupuy, ‘‘Estimation in zero-inflated binomial regression with missing covariates,’’ Statistics 53 (4), 839–865 (2019).
T. M. Lukusa, S.-M. Lee, and C.-S. Li, ‘‘Semiparametric estimation of a zero-inflated Poisson regression model with missing covariates,’’ Metrika 79 (4), 457–483 (2016).
T. M. Lukusa and F. K. H. Phoa, ‘‘A note on the weighting-type estimations of the zero-inflated Poisson regression model with missing data in covariates,’’ preprint (2019).
D. G. Horvitz and D. J. Thompson, ‘‘A generalization of sampling without replacement from a finite universe,’’ J. Am. Stat. Assoc. 47, 663–685 (1952).
S. H. Hsieh, S. M. Lee, and P. S. Shen, ‘‘Logistic regression analysis of randomized response data with missing covariates,’’ J. Stat. Plan. Inference 140, 927–940 (2010).
D. B. Rubin, ‘‘Inference and missing data,’’ Biometrika 63 (3), 581–592 (1976).
R. V. Foutz, ‘‘On the unique consistent solution to the likelihood equations,’’ J. Am. Stat. Assoc. 72, 147–148 (1977).
D. Böhning, E. Dietz, P. Schlattmann, L. Mendonca, and U. Kirchner, ‘‘The zero-inflated Poisson model and the decayed, missing, and filled teeth index in dental epidemiology,’’ J. R. Stat. Soc. Ser. A 162, 195–209 (1999).
S. H. Hsieh, S. M. Lee, and P. S. Shen, ‘‘Semiparametric analysis of randomized response data with missing covariates in logistic regression,’’ Comput. Stat. Data Anal. 53, 2673–2692 (2009).
D. L. Long, J. S. Preisser, A. H. Herring, and C. E. Golin, ‘‘A marginalized zero-inflated Poisson regression model with overall exposure effects,’’ Stat. Med. 33, 5151–5165 (2014).
J. S. Preisser, J. W. Stamm, D. L. Long, and M. E. Kincade, ‘‘Review and recommendations for zero-inflated count regression modeling of dental caries indices in epidemiological studies,’’ Caries Research. 46 (4), 413–423 (2012) .
K. H. Benecha, J. S. Preisser, and K. Das, ‘‘Marginal zero-inflated models with missing covariates,’’ Biometrical J. (2018).
A. Henningsen and O. Toomet, ‘‘maxLik: A package for maximum likelihood estimation in R,’’ Computational Statistics 26 (3), 443–458 (2011).
D. Wang and S. X. Chen, ‘‘Empirical likelihood for estimating equations with missing values,’’ Ann. Stat. 37, 490–517 (2009).
E. A. Nadaraya, ‘‘On estimating regression,’’ Theory Probab. Appl. 9, 141–142 (1964).
S. Wang and C. Y. Wang, ‘‘A note on kernel assisted estimators in missing covariate regression,’’ Stat. Probabil. Lett. 55, 439–449 (2001).
G. S. Watson, ‘‘Smooth regression analysis,’’ Sankhya, Series A 26, 359–372 (1964).
H. A. Sturges, ‘‘The choice of a class interval,’’ J. Am. Stat. Assoc. 21 (153), 65–66 (1926).
G. F. Jenks, Optimal Data Classification for Choropleth Maps (Univ. of Kansas, Lawrence, KS, 1977).
K. J. G. Kouakou, O. Hili, and J.-F. Dupuy, ‘‘Estimation in the zero-inflated bivariate Poisson model, with an application to health-care utilization data,’’ Afrika Statistika 16 (2), 2767–2788 (2021).
E. Ali, M. L. Diop, and A. Diop, ‘‘Statistical inference in a zero-inflated Bell regression model,’’ Math. Methods Stat. 31, 91–104 (2022).
S. Mishra, C. L. Ogden, and M. Dimeler, Dietary Supplement Use in the United States: National Health and Nutrition Examination Survey, 2017–March 2020, National Health Statistics Reports (2023).
ACKNOWLEDGMENTS
The authors are grateful to the referees and the editor for their comments and suggestions, which led to significant improvements of earlier versions of this article.
Ethics declarations
The authors declare that they have no conflicts of interest.
Appendix A
PROOFS OF ASYMPTOTIC RESULTS
6.1 Proof of Theorem 1
We prove consistency of \(\hat{\theta}_{n}^{F}\) by checking the conditions of the inverse function theorem of Foutz [13]. These conditions are proved in a series of technical lemmas.
Lemma 1. As \(n\rightarrow\infty\), \(n^{-1/2}U_{F,n}(\theta_{0})\) converges in probability to 0.
Proof of Lemma 1. We decompose \(n^{-1/2}U_{F,n}(\theta_{0})\) componentwise: for every \(i=1,\ldots,n\), we have
For \(i=1,\ldots,n\) and \(l=1,\ldots,q\);
We have
Now, we have
\(\mathbb{E}(J_{i}|\mathbf{X}_{i},\mathbf{Z}_{i})=\mathbb{P}(Y_{i}=0|\mathbf{X}_{i},\mathbf{Z}_{i}),\) \(\mathbb{E}(Y_{i}|\mathbf{X}_{i},\mathbf{Z}_{i})=\nu_{i}\) and \(\mathbb{E}(1-J_{i}|\mathbf{X}_{i},\mathbf{Z}_{i})=\mathbb{P}(Y_{i}>0|\mathbf{X}_{i},\mathbf{Z}_{i}).\) It follows that \(\mathbb{E}[Z_{il}B_{i}(\theta_{0})]=0\).
Using similar arguments, we prove that, for every \(i=1,\ldots,n\) and \(j=1,\ldots,p\), \(\mathbb{E}[X_{ij}A_{i}(\theta_{0})]=0\).
Now, for every \(i=1,\ldots,n\) and \(l=1,\ldots,q\), we have
By \(\mathbf{H3}\), we have \(\mathbb{E}\left(Z_{il}^{2}B_{i}^{2}(\theta_{0})\right)<\infty\).
Using similar arguments, we prove \(\textrm{var}\left(X_{ij}A_{i}(\theta_{0})\right)<\infty\) for every \(i=1,\ldots,n\) and \(j=1,\ldots,p\).
Thus, by the weak law of large numbers, \(n^{-1/2}U_{F,n}(\theta_{0})\) converges in probability to \(0\), which concludes the proof.
Lemma 2. As \(n\rightarrow\infty\), \(n^{-1/2}\frac{\partial U_{F,n}(\theta)}{\partial\theta^{T}}\) converges in probability to a fixed function \(-\Sigma(\theta)\), uniformly in an open neighbourhood of \(\theta_{0}\).
Proof of Lemma 2. Let \(\tilde{U}_{F,n}(\theta):=n^{-1/2}\frac{\partial U_{F,n}(\theta)}{\partial\theta^{T}}\), and let \(\nu_{\theta_{0}}\) be an open neighbourhood of \(\theta_{0}\). Let \(\theta\in\nu_{\theta_{0}}\).
By the weak law of large numbers and \(\mathbf{H3}\), \(\tilde{U}_{F,n}(\theta)=\frac{1}{n}\sum_{i=1}^{n}\left\{\frac{\partial^{2}l_{i}(\theta)}{\partial\theta\partial\theta^{T}}\right\}\) converges in probability to the matrix \(-\Sigma(\theta)\) as \(n\rightarrow\infty\), where \(\Sigma(\theta)=\mathbb{E}\left[-\frac{\partial^{2}l_{1}(\theta)}{\partial\theta\partial\theta^{T}}\right]\).
By condition \(\mathbf{H4}\), the convergence of \(\tilde{U}_{F,n}(\theta)\) to \(-\Sigma(\theta)\) is uniform on \(\nu_{\theta_{0}}\).
The conditions of the inverse function theorem of Foutz [13] are thus verified, and \(\hat{\theta}_{n}^{F}\) converges in probability to \(\theta_{0}\).
Now, we prove that \(\hat{\theta}_{n}^{F}\) is asymptotically Gaussian. A Taylor expansion of \(U_{F,n}(\hat{\theta}_{n}^{F})\) around \(\theta_{0}\) yields
A direct calculation gives \(\textrm{var}(U_{F,n}(\theta_{0}))=\frac{1}{n}\sum_{i=1}^{n}\mathbb{E}\left(\dot{l_{i}}(\theta_{0})\dot{l_{i}}(\theta_{0})^{T}\right)=Q_{F}(\theta_{0})\).
Finally, by Lemma 2 and Slutsky’s theorem, \(\sqrt{n}(\hat{\theta}_{n}^{F}-\theta_{0})\) converges in distribution to a Gaussian vector with mean zero and variance \(\Delta_{F}\), where \(\Delta_{F}\) is defined in Theorem 1.
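For orientation (the displayed equations of the proof were not reproduced here), the limiting covariance in results of this type typically takes the sandwich form built from the two matrices appearing above. The display below is our own reconstruction from \(\Sigma(\theta_{0})\) and \(Q_{F}(\theta_{0})\) as defined in the proof, not a quotation of Theorem 1:

```latex
\[
\sqrt{n}\,\bigl(\hat{\theta}_{n}^{F}-\theta_{0}\bigr)
\;\xrightarrow{\;d\;}\;
\mathcal{N}\bigl(0,\;\Delta_{F}\bigr),
\qquad
\Delta_{F}
=\Sigma(\theta_{0})^{-1}\,Q_{F}(\theta_{0})\,\Sigma(\theta_{0})^{-1}.
\]
```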
Appendix B
6.2 Proof of Theorem 2
We prove consistency of \(\hat{\theta}_{n}^{ws}\) by checking the conditions of the inverse function theorem of Foutz [13]. These conditions are proved in a series of technical lemmas.
Lemma 3. As \(n\rightarrow\infty\), \(n^{-1/2}U_{w,n}(\theta_{0},\hat{\pi})\) converges in probability to \(0\).
Proof of Lemma 3. We decompose \(n^{-1/2}U_{w,n}(\theta_{0},\hat{\pi})\) as
Consider the first term of this decomposition.
Let \(\mathbf{S}^{\prime}_{i}=(\mathbf{S}^{D}_{i},\mathbf{S}^{\prime,D}_{i})\) and \(G_{n}(\theta_{0},\pi)=n^{-1/2}U_{ws,n}(\theta_{0},\hat{\pi})-n^{-1/2}U_{ws,n}(\theta_{0},\pi)\). We have
where \(o_{p}^{*}(a_{n})\) denotes a matrix whose components are uniformly \(o_{p}(a_{n})\). By the weak law of large numbers,
converges in probability to \(0\) as \(n\rightarrow\infty\).
Using conditions \(\mathbf{H3}\), we prove that \(\dot{l_{i}}(\theta_{0})\) is finite a.s. Finally, by Slutsky’s theorem
converges in probability to \(0\) as \(n\rightarrow\infty\).
Next, consider the term \(n^{-1/2}U_{ws,n}(\theta_{0},\pi(Y_{i},\mathbf{S}^{\prime}_{i}))\) in decomposition (6.1).
We show that \(n^{-1/2}U_{ws,n}(\theta_{0},\pi(Y_{i},\mathbf{S}^{\prime}_{i}))\) converges in probability to \(0\) as \(n\rightarrow\infty\).
For every \(i=1,\ldots,n\), we have
For \(i=1,\ldots,n\) and \(l=1,\ldots,q\);
Two cases should be considered, namely: (i) \(Z_{il}\) is a component of \(Z^{\textrm{obs}}\) and (ii) \(Z_{il}\) is a component of \(Z^{\textrm{miss}}\). In case (i), we have
Given \(\mathbf{V}_{i}=(Y_{i},\mathbf{S}^{\prime}_{i})\), \(Z_{il}B_{i}(\theta_{0})\) is a function of \((\mathbf{X}^{\textrm{miss}},\mathbf{Z}^{\textrm{miss}})\) only. Thus, by the MAR assumption, \(B_{i}(\theta_{0})\) and \(\Delta_{i}\) are independent
In case (ii),
Given \(\mathbf{V}_{i}\), \(Z_{il}B_{i}(\theta_{0})\) is a function of \((\mathbf{X}^{\textrm{miss}},\mathbf{Z}^{\textrm{miss}})\) only. Thus, by the MAR assumption, \(B_{i}(\theta_{0})\) and \(\Delta_{i}\) are independent
It follows that \(\mathbb{E}[\frac{\Delta_{i}}{\pi_{i}(Y_{i},\mathbf{S}^{\prime}_{i})}Z_{il}B_{i}(\theta_{0})]=0\).
Using similar arguments, we prove that \(\mathbb{E}[\frac{\Delta_{i}}{\pi_{i}(Y_{i},\mathbf{S}^{\prime}_{i})}X_{ij}A_{i}(\theta_{0})]=0\).
Now, for every \(i=1,\ldots,n\) and \(l=1,\ldots,q\), we have
By \(\mathbf{H3}\), we have \(\mathbb{E}\left(\frac{\Delta_{i}}{\pi_{i}^{2}(Y_{i},\mathbf{S}^{\prime}_{i})}Z_{il}^{2}B_{i}^{2}(\theta_{0})\right)<\infty\) .
Using similar arguments, we prove
Thus, by the weak law of large numbers, \(n^{-1/2}U_{ws,n}(\theta_{0},\pi(Y_{i},\mathbf{S}^{\prime}_{i}))\) converges in probability to \(0\) as \(n\rightarrow\infty\).
Finally, \(n^{-1/2}U_{w,n}(\theta_{0},\hat{\pi}(Y_{i},\mathbf{S}^{\prime}_{i}))\) converges in probability to \(0\), which concludes the proof.
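As a concrete illustration of the weighting in \(U_{ws,n}(\theta,\hat{\pi})\), here is a minimal sketch (our own; the function names are hypothetical) in which \(\hat{\pi}\) is estimated by stratum-wise empirical selection proportions over discretized values of \((Y,\mathbf{S}^{\prime})\), one simple semiparametric choice. The weights \(\Delta_{i}/\hat{\pi}_{i}\) are zero for incomplete cases:

```python
import numpy as np

def estimate_pi(delta, strata):
    """Empirical selection probabilities per stratum of (Y, S'):
    pi_hat(v) = (# complete cases in stratum v) / (# cases in stratum v).
    delta is the 0/1 missingness indicator; strata is an integer code
    for the observed stratum of each observation."""
    pi_hat = np.empty_like(delta, dtype=float)
    for v in np.unique(strata):
        mask = strata == v
        pi_hat[mask] = delta[mask].mean()
    return pi_hat

def ipw_weights(delta, strata):
    """Inverse-probability weights Delta_i / pi_hat_i (0 when Delta_i = 0)."""
    pi_hat = estimate_pi(delta, strata)
    w = np.zeros_like(pi_hat)
    obs = delta == 1
    w[obs] = 1.0 / pi_hat[obs]  # pi_hat > 0 wherever a complete case exists
    return w
```

Multiplying each complete-case score contribution by these weights gives a weighted estimating equation of the form studied in this appendix; the kernel-based SIPWK variant would replace the stratum proportions by a smoothed estimate.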
Lemma 4. As \(n\rightarrow\infty\), \(n^{-1/2}\frac{\partial U_{ws,n}(\theta,\hat{\pi})}{\partial\theta^{T}}\) converges in probability to a fixed function \(-\Sigma(\theta)\), uniformly in a neighbourhood of \(\theta_{0}\).
Proof of Lemma 4. Let \(\bar{U}_{ws,n}(\theta,\hat{\pi}):=n^{-1/2}\frac{\partial U_{ws,n}(\theta,\hat{\pi})}{\partial\theta^{T}}\) and \(\ddot{l_{i}}(\theta)=\frac{\partial^{2}l_{i}(\theta)}{\partial\theta\partial\theta^{T}}\). We have
Using arguments similar to those in the proof of Lemma 3, \(\bar{U}_{ws,n}(\theta,\hat{\pi})-\bar{U}_{ws,n}(\theta,\pi)\) converges in probability to \(0\). By the weak law of large numbers and \(\mathbf{H3}\),
converges in probability to the matrix \(-\Sigma(\theta)\) as \(n\rightarrow\infty\).
By \(\mathbf{H5}\), the convergence of \(\bar{U}_{ws,n}(\theta,\hat{\pi})\) to \(-\Sigma(\theta)\) is uniform.
The conditions of the inverse function theorem of Foutz [13] are thus verified, and \(\hat{\theta}_{n}^{ws}\) converges in probability to \(\theta_{0}\).
Now, we prove that \(\hat{\theta}_{n}^{ws}\) is asymptotically Gaussian. A Taylor expansion of \(U_{ws,n}(\hat{\theta}^{ws}_{n},\hat{\pi})\) around \((\theta_{0},\hat{\pi})\) yields
therefore
thus
By calculations,
Let \(H(\theta_{0},\pi)=U_{ws,n}(\theta_{0},\hat{\pi})-U_{ws,n}(\theta_{0},\pi)\)
where \(o_{p}(a_{n})\) denotes a column vector whose components are uniformly \(o_{p}(a_{n})\).
Let
In order to show that \(\mathbf{Q}_{1n}=O_{p}(1/\sqrt{n})\), we show that \(\mathbb{E}(\mathbf{Q}_{1n})=O(1/\sqrt{n})\) and \(\textrm{Var}(\mathbf{Q}_{1n})=O^{*}(1/n)\), where \(O^{*}(a_{n})\) and \(O(a_{n})\) denote, respectively, a matrix and a column vector whose components are uniformly \(O(a_{n})\). It can first be shown that
and then
Thus, we have
We have
and
Therefore, \(\mathbf{Q}_{1n}=O_{p}(\frac{1}{\sqrt{n}})\). The term \(\mathbf{Q}_{2n}\) can be expressed as follows:
where \(\Phi_{k}=\frac{\Delta_{k}-\pi(Y_{k},\mathbf{S}^{\prime}_{k})}{\pi(Y_{k},\mathbf{S}^{\prime}_{k})}\) and
We have \(\mathbb{E}\left[\Psi_{ik}(\theta_{0})|Y_{i}=Y_{k},\mathbf{S}^{\prime}_{i}=\mathbf{S}^{\prime}_{k}\right]=0\) and, hence,
Let \(\Psi_{iks}(\theta_{0})\) be the \(s\)th element of \(\Psi_{ik}(\theta_{0})\). Then, by Cauchy–Schwarz’s inequality,
Because for each element of \(\Psi_{ik}(\theta_{0})\)
we can prove that \(\mathbb{E}\left[\left|\Phi_{k}\left(\frac{1}{\sqrt{n}}\sum_{i=1}^{n}\Psi_{ik}(\theta_{0})\right)\right|\right]<\infty\).
By the weak law of large numbers, \(\frac{1}{n}\sum_{k=1}^{n}\left\{\Phi_{k}\left[\frac{1}{\sqrt{n}}\sum_{i=1}^{n}\Psi_{ik}(\theta_{0})\right]\right\}=o_{p}(1)\). Hence, \(\mathbf{Q}_{2n}\) can be expressed as \(\mathbf{Q}_{2n}=\frac{1}{\sqrt{n}}\sum_{k=1}^{n}\left[\frac{\Delta_{k}-\pi(Y_{k},\mathbf{S}^{\prime}_{k})}{\pi(Y_{k},\mathbf{S}^{\prime}_{k})}\right]\dot{l}_{k}^{*}(\theta_{0})+o_{p}(1)\).
Let \(\Sigma=\textrm{Cov}\left[U_{ws,n}(\theta_{0},\pi),U_{ws,n}(\theta_{0},\hat{\pi})-U_{ws,n}(\theta_{0},\pi)\right]\). We have
where the notation \(o^{*}(a_{n})\) denotes a matrix whose components are uniformly \(o(a_{n})\). Finally,
Thus, by the central limit theorem, \(U_{ws,n}(\theta_{0},\hat{\pi})\) converges in distribution to a Gaussian vector with mean zero and variance \(\Omega_{3}(\theta_{0},\pi)-\left[\Omega_{4}(\theta_{0},\pi)-\Omega_{5}(\theta_{0},\pi)\right]\). Because \(\left[-\bar{U}_{ws,n}^{-1}(\theta_{0},\hat{\pi})-\Sigma^{-1}(\theta_{0})\right]\) converges in probability to \(0\), by Slutsky’s theorem \(\left[-\bar{U}_{ws,n}^{-1}(\theta_{0},\hat{\pi})-\Sigma^{-1}(\theta_{0})\right]U_{ws,n}(\theta_{0},\hat{\pi})\) converges in probability to \(0\).
Finally, by Lemma 4 and Slutsky’s theorem, \(\sqrt{n}(\hat{\theta}^{ws}_{n}-\theta_{0})\) converges in distribution to the Gaussian vector of mean zero and variance
Cite this article
Amani, K.M., Hili, O. & Kouakou, K.J. Statistical Inference in Marginalized Zero-inflated Poisson Regression Models with Missing Data in Covariates. Math. Meth. Stat. 32, 241–259 (2023). https://doi.org/10.3103/S1066530723040038