Abstract
Inference on zero-inflated data is difficult, and the problem has motivated a substantial part of statistical research since the early days of the discipline. The case of multivariate zero-inflated data is still the subject of active debate. In this contribution we deal primarily with a permutation-based test for comparing two groups with multivariate zero-inflated data. Using a leading example, we formulate different questions and translate them into different inferential hypotheses. A permutation-based solution is proposed for each hypothesis, and its interpretation is discussed. Finally, we extend the method to the general case of (possibly many) continuous predictors and the presence of nuisance covariates. The data and the R code are available in the flip library on the CRAN repository and in the web appendix of this paper.
Acknowledgements
LF was supported by a grant from the University of Padua (Project CPDA158444/15).
Cite this article
Finos, L., Pesarin, F. On zero-inflated permutation testing and some related problems. Stat Papers 61, 2157–2174 (2020). https://doi.org/10.1007/s00362-018-1025-x
DOI: https://doi.org/10.1007/s00362-018-1025-x