Abstract
In this paper, we consider testing a hypothesis about the means of two independent semicontinuous distributions whose observations are zero-inflated, that is, characterized by a sizable number of zeros together with positive observations from a continuous distribution. The continuous parts of the two semicontinuous distributions are assumed to follow a density ratio model. A new two-part test is developed for this kind of data. The proposed test statistic is the sum of a test statistic for equality of the proportions of zero values and a conditional test statistic for the continuous distribution. The test statistic is proved to follow a χ² distribution with two degrees of freedom. Simulation studies show that the proposed test controls the type I error rate at the desired level, and is competitive with, and most of the time more powerful than, two popular tests. A real data example from a dietary intervention study is used to illustrate the usefulness of the proposed test.
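To make the two-part construction concrete, the following is a minimal sketch of a classical two-part statistic, not the density-ratio-model test proposed in the paper. It assumes a pooled two-proportion z-statistic for the zeros and, as a stand-in for the conditional test on the continuous part, a large-sample Welch-type statistic on the positive observations. The function name `two_part_test` and both component choices are illustrative assumptions. The sum of the two squared statistics is referred to a χ² distribution with two degrees of freedom, whose upper tail probability is exp(-x/2).

```python
import math


def two_part_test(x, y):
    """Classical two-part statistic (illustrative, not the paper's DRM-based test).

    Returns (statistic, p_value), where the statistic is the sum of two squared
    components and the p-value uses the chi-square(2) reference distribution.
    """
    n1, n2 = len(x), len(y)
    pos1 = [v for v in x if v > 0]
    pos2 = [v for v in y if v > 0]
    m1, m2 = len(pos1), len(pos2)

    # Part 1: pooled two-proportion z-statistic for the proportion of zeros.
    p1, p2 = 1 - m1 / n1, 1 - m2 / n2
    pooled = (n1 - m1 + n2 - m2) / (n1 + n2)
    var_b = pooled * (1 - pooled) * (1 / n1 + 1 / n2)
    b2 = (p1 - p2) ** 2 / var_b if var_b > 0 else 0.0

    # Part 2 (assumption): Welch-type statistic on the positive values only.
    def mean_var(values):
        mean = sum(values) / len(values)
        var = sum((v - mean) ** 2 for v in values) / (len(values) - 1)
        return mean, var

    t2 = 0.0
    if m1 > 1 and m2 > 1:
        mu1, s1 = mean_var(pos1)
        mu2, s2 = mean_var(pos2)
        var_t = s1 / m1 + s2 / m2
        if var_t > 0:
            t2 = (mu1 - mu2) ** 2 / var_t

    stat = b2 + t2
    # Survival function of chi-square with 2 degrees of freedom is exp(-x/2).
    p_value = math.exp(-stat / 2)
    return stat, p_value


if __name__ == "__main__":
    control = [0, 0, 0, 0, 1.2, 2.5, 0.8, 3.1]
    treated = [0, 2.9, 4.1, 3.3, 5.0, 2.2, 4.8, 3.9]
    print(two_part_test(control, treated))
```

With only two degrees of freedom the p-value has a closed form, so the sketch needs no statistical library. A rank-based statistic could replace the Welch-type component without changing the overall structure.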
Acknowledgement
The authors thank Dr. Tonja Nansel for helpful discussions on the CHEF study.
Additional information
Supported by the National Natural Science Foundation of China (No. 11971433), the First Class Discipline of Zhejiang-A (Zhejiang Gongshang University-Statistics), and the Intramural Research Program of the Eunice Kennedy Shriver National Institute of Child Health and Human Development.
Cite this article
Lu, Yh., Liu, Ay., Jiang, Mj. et al. A new two-part test based on density ratio model for zero-inflated continuous distributions. Appl. Math. J. Chin. Univ. 35, 203–219 (2020). https://doi.org/10.1007/s11766-020-3957-x