Abstract
In this paper, we consider an exact goodness-of-fit test for the binomial distribution in sparse-data situations. A conventional approach views this problem as a test of independence in a two-way contingency table. We propose an approach that improves the efficiency of the Diaconis–Sturmfels (DS) algorithm when the sample size n is much larger than m, the first parameter of the binomial distribution B(m, p), by representing the data and then using minimal Markov bases of the corresponding multinomial model. Simulation results and real data analysis indicate that our method makes the DS algorithm computationally faster.
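To make the conventional two-way-table view concrete, the following is a minimal Python sketch of a Metropolis chain in the spirit of the DS algorithm. It is not the accelerated method proposed in the paper. It assumes the data are n counts x_1, ..., x_n, each in 0..m, so that conditional on the total sum the null distribution is proportional to the product of the binomial coefficients C(m, x_i). The function names (`pearson_stat`, `propose_move`, `exact_gof_pvalue`), the choice of the Pearson statistic, and the default chain lengths are illustrative assumptions.

```python
import math
import random
from collections import Counter


def pearson_stat(x, m):
    """Pearson chi-squared statistic of counts x against B(m, p_hat)."""
    n = len(x)
    p_hat = sum(x) / (n * m)  # fixed on the fiber, since sum(x) is fixed
    counts = Counter(x)
    stat = 0.0
    for k in range(m + 1):
        expected = n * math.comb(m, k) * p_hat**k * (1 - p_hat) ** (m - k)
        if expected > 0:
            stat += (counts.get(k, 0) - expected) ** 2 / expected
    return stat


def propose_move(state, m, rng):
    """One Metropolis step using a basic move: +1 in one row, -1 in another.

    The move keeps sum(state) fixed, i.e. it stays in the conditional
    sample space (fiber) of the n x 2 table with rows (x_i, m - x_i).
    """
    i, j = rng.sample(range(len(state)), 2)
    a, b = state[i], state[j]
    if a < m and b > 0:  # otherwise the proposal leaves the fiber; stay put
        # Ratio of target probabilities, proportional to prod C(m, x_i).
        ratio = ((m - a) / (a + 1)) * (b / (m - b + 1))
        if rng.random() < min(1.0, ratio):
            state[i], state[j] = a + 1, b - 1


def exact_gof_pvalue(x, m, n_steps=20000, burn_in=2000, seed=0):
    """Monte Carlo estimate of the conditional (exact) p-value."""
    rng = random.Random(seed)
    state = list(x)
    t_obs = pearson_stat(state, m)
    hits = 0
    for step in range(n_steps + burn_in):
        propose_move(state, m, rng)
        # Small tolerance guards against floating-point ties.
        if step >= burn_in and pearson_stat(state, m) >= t_obs - 1e-12:
            hits += 1
    return hits / n_steps
```

Calling `exact_gof_pvalue(data, m)` with `data` a list of at least two observed counts returns the estimated p-value. Each step recomputes the statistic over all n observations, which is the kind of cost that becomes noticeable when n is much larger than m.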
References
Agresti A (2001) Exact inference for categorical data: recent advances and continuing controversies. Statist Med 20:2709–2722
Aigner M, Ziegler GM (1998) Proofs from THE BOOK. Springer, Berlin, pp 141–146
Aoki S, Hara H, Takemura A (2012) Markov bases in algebraic statistics. Springer, New York
Baglivo J, Oliver D, Pagano M (1988) Methods for the analysis of contingency tables with large and small cell counts. J Am Stat Assoc 83:1006–1013
Best DJ, Rayner JCW (1997) Goodness of fit for the binomial distribution. Austral J Statist 39(3):355–364
Best DJ, Rayner JCW (2006) Improved testing for the binomial distribution using chi-squared components with data-dependent cells. J Stat Comput Simul 76(1):75–81
Briales E, Campillo A, Marijuán C, Pisón P (1998) Minimal systems of generators for ideals of semigroups. J Pure Appl Algebra 124:7–30
Cheon S, Liang F, Chen Y, Yu K (2014) Stochastic approximation Monte Carlo importance sampling for approximating exact conditional probabilities. Stat Comput 24:505–520
Dalenius T, Reiss SP (1982) Data-swapping: a technique for disclosure control. J Stat Plan Inference 6:73–85
De Loera JA, Onn S (2006) Markov bases of three-way tables are arbitrarily complicated. J Symb Comput 41:173–181
Diaconis P, Sturmfels B (1998) Algebraic algorithms for sampling from conditional distributions. Ann Stat 26:363–397
Drton M, Sturmfels B, Sullivant S (2009) Lectures on algebraic statistics. Oberwolfach seminars, vol 39. Birkhäuser Verlag, Basel
Haberman SJ (1988) A warning on the use of chi-squared statistics with frequency tables with small expected cell counts. J Am Stat Assoc 83:555–560
Hara H, Takemura A, Yoshida R (2010) On connectivity of fibers with positive marginals in multiple logistic regression. J Multivar Anal 101:909–925
Klein M, Linton P (2013) On a comparison of tests of homogeneity of binomial proportions. J Stat Theory Appl 12(3):208–224
Lindsey JK (1995) Modelling frequency and count data. Oxford University Press, Oxford
Mehta CR, Patel NR (1983) A network algorithm for performing Fisher’s exact test in \(r\times c\) contingency tables. J Am Stat Assoc 78:427–434
Mehta CR, Patel NR, Senchaudhuri P (1988) Importance sampling for estimating exact probabilities in permutational inference. J Am Stat Assoc 83:999–1005
Park S, Lim J (2015) On censored cumulative residual Kullback-Leibler information and goodness-of-fit test with type II censored data. Stat Pap 56:247–256
Patefield WM (1981) Algorithm AS 159: an efficient method of generating random \(R\times C\) tables with given row and column totals. J R Stat Soc C 30:91–97
Quinino EC, Ho LL, Suyama E (2013) Alternative estimator for the parameters of a mixture of two binomial distributions. Stat Pap 54:47–69
Takemura A, Aoki S (2004) Some characterizations of minimal Markov basis for sampling from discrete conditional distributions. Ann Inst Stat Math 56(1):1–17
Takken A (1999) Monte Carlo goodness-of-fit for discrete data. Ph.D. Dissertation, Department of Statistics, Stanford University
Zardasht V, Parsi S, Mousazadeh M (2015) On empirical cumulative residual entropy and a goodness-of-fit test for exponentiality. Stat Pap 56:677–688
Acknowledgments
The authors wish to thank the Editor, an Associate Editor, and two anonymous referees for their constructive comments on early versions of this work, which led to substantial improvements in the article. This research was supported by the National Natural Science Foundation of China (Grants 11201365, 11301408) and the Fundamental Research Funds for the Central Universities (Grants K5051370016, JB140701).
Cite this article
Li, B., Fu, L. Exact test of goodness of fit for binomial distribution. Stat Papers 59, 851–860 (2018). https://doi.org/10.1007/s00362-016-0793-4
DOI: https://doi.org/10.1007/s00362-016-0793-4