Abstract
When the classical nonparametric bootstrap is implemented by a Monte Carlo procedure, one resamples values from a sequence of observations that are typically independent and identically distributed. But what happens when a decision has to be taken based on such resampled values? One way to quantify the loss of information due to this resampling step is to consider the deficiency distance, in the sense of Le Cam, between a statistical experiment of n independent and identically distributed observations and the experiment consisting of m observations taken from the original n by resampling with replacement. By comparing with an experiment in which only subsampling with a random subsample size has been performed, one can bound the deficiency in terms of the amount of information contained in additional observations. It follows that, for certain experiments, the deficiency distance is proportional to the expected fraction of observations missed when resampling.
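As a rough illustration of the quantity named in the last sentence, the sketch below computes the expected fraction of the n original observations that never appear among m draws with replacement, and checks it against a small simulation. The closed form (1 - 1/n)^m is a standard fact about uniform sampling with replacement; the function names are hypothetical and chosen for illustration only. The sketch does not compute the deficiency distance itself, and the constant of proportionality is not stated in the abstract.

```python
import random


def expected_missed_fraction(n: int, m: int) -> float:
    """Exact expected fraction of n observations never drawn in m uniform
    draws with replacement: each observation is missed with probability
    (1 - 1/n)**m."""
    return (1.0 - 1.0 / n) ** m


def simulated_missed_fraction(n: int, m: int, reps: int = 2000, seed: int = 0) -> float:
    """Monte Carlo estimate of the same quantity (illustrative helper)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        # Indices of the original observations hit by this resample.
        drawn = {rng.randrange(n) for _ in range(m)}
        total += (n - len(drawn)) / n
    return total / reps


if __name__ == "__main__":
    n = m = 100
    print("exact:    ", expected_missed_fraction(n, m))
    print("simulated:", simulated_missed_fraction(n, m))
```

For m = n the exact value approaches exp(-1) as n grows, which is the familiar statement that a bootstrap resample omits a little over a third of the original observations on average.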
Cite this article
Wiklund, T. The Deficiency Introduced by Resampling. Math. Meth. Stat. 27, 145–161 (2018). https://doi.org/10.3103/S1066530718020047
DOI: https://doi.org/10.3103/S1066530718020047