Abstract
We introduce a simple combinatorial scheme for systematically running through a complete enumeration of sample reuse procedures such as the bootstrap, Hartigan's subsets, and various permutation tests. The scheme is based on Gray codes, which give ‘tours’ through various spaces, changing only one or two points at a time. We use updating algorithms to avoid recomputing statistics and achieve substantial speedups. Several practical examples and computer codes are given.
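The idea of touring a space one point at a time while updating a statistic can be illustrated with a minimal Python sketch. It is not the authors' code: the function name `gray_subset_sums` and the sample values are assumptions made for illustration. It walks through all subsets of a sample in binary reflected Gray code order, so that consecutive subsets differ by exactly one observation, and it updates a running sum with a single addition or subtraction per step instead of recomputing it.

```python
def gray_subset_sums(x):
    """Yield (mask, subset_sum) for every subset of x in Gray code order.

    Consecutive masks differ in exactly one bit, so the sum is updated by
    adding or removing a single observation instead of being recomputed.
    """
    n = len(x)
    mask = 0    # current subset as a bit mask; starts at the empty set
    total = 0   # running sum of the observations currently in the subset
    yield mask, total
    for k in range(1, 2 ** n):
        # In the binary reflected Gray code, step k flips the bit whose
        # index is the position of the lowest set bit of k.
        j = (k & -k).bit_length() - 1
        mask ^= 1 << j
        if (mask >> j) & 1:
            total += x[j]   # observation j entered the subset
        else:
            total -= x[j]   # observation j left the subset
        yield mask, total


sample = [3, 1, 4, 1, 5]   # hypothetical data, for illustration only
results = list(gray_subset_sums(sample))
```

Each step costs a constant amount of work for the sum, whereas recomputing from scratch would cost time proportional to the subset size. The same pattern would apply to any statistic that admits a cheap one-point update.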
Cite this article
Diaconis, P., Holmes, S. Gray codes for randomization procedures. Stat Comput 4, 287–302 (1994). https://doi.org/10.1007/BF00156752
DOI: https://doi.org/10.1007/BF00156752