Abstract
We discuss the problem of testing homogeneity of the marginal distributions of a continuous bivariate distribution based on a paired sample with possibly missing components (missing completely at random). Applying the well-known two-sample Cramér–von Mises distance to the remaining data, we determine the limiting null distribution of our test statistic in this situation. We show that a new resampling approach is appropriate for approximating the unknown null distribution. We prove that the resulting test asymptotically attains the significance level and is consistent. Properties of the test under local alternatives are pointed out as well. Simulations investigate the quality of the approximation and the power of the new approach in the finite sample case. As an illustration, we apply the test to real data sets.
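To make the idea of applying a two-sample Cramér–von Mises distance to the remaining data concrete, the following is a minimal Python sketch. It is not the author's implementation: the abstract does not state the exact form of the test statistic or the resampling scheme, so the sketch assumes the classical two-sample normalization and only computes the distance, without any critical value. The names `ecdf` and `cvm_distance` and the use of `None` for a missing component are illustrative assumptions.

```python
from bisect import bisect_right


def ecdf(sorted_values, t):
    """Empirical distribution function of sorted_values, evaluated at t."""
    return bisect_right(sorted_values, t) / len(sorted_values)


def cvm_distance(pairs):
    """Two-sample Cramér-von Mises distance between the two marginals.

    pairs: list of (x, y) tuples; a missing component is given as None.
    Assumption: the classical normalization n*m/(n+m)^2 is used, which
    may differ from the statistic studied in the article.
    """
    # Keep every observed value, including those from incomplete pairs.
    xs = sorted(x for x, _ in pairs if x is not None)
    ys = sorted(y for _, y in pairs if y is not None)
    n, m = len(xs), len(ys)
    if n == 0 or m == 0:
        raise ValueError("each component needs at least one observed value")

    # Sum the squared difference of the two empirical distribution
    # functions over the pooled observations.
    pooled = xs + ys
    total = sum((ecdf(xs, z) - ecdf(ys, z)) ** 2 for z in pooled)
    return n * m / (n + m) ** 2 * total
```

For example, `cvm_distance([(1.0, 1.0), (2.0, 2.0), (None, 3.0), (3.0, None)])` is zero because the observed values of both components coincide, even though two of the pairs are incomplete. Turning the distance into a test would additionally require the resampling approach described in the article, which this sketch does not attempt to reproduce.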
Acknowledgements
The author wishes to thank the Editor, an Associate Editor, and two Referees for very helpful comments and suggestions. Special thanks go to Y. Fong from the Vaccine and Infectious Disease Division of the Fred Hutchinson Cancer Research Center in Seattle, WA, USA, for the support.
Ethics declarations
Conflict of interest
The author states that there is no conflict of interest.
Electronic supplementary material
Below is the link to the electronic supplementary material.
Online Resource 1
A simple example shows that tests for symmetry about zero and tests for exchangeability are not applicable to the testing problem of marginal homogeneity. (PDF 106KB).
Online Resource 2
R code implementing the simulations is given. (PDF 114KB).
Cite this article
Gaigall, D. Testing marginal homogeneity of a continuous bivariate distribution with possibly incomplete paired data. Metrika 83, 437–465 (2020). https://doi.org/10.1007/s00184-019-00742-5
DOI: https://doi.org/10.1007/s00184-019-00742-5