The resampling method via representative points

Regular Article

Abstract

The bootstrap method relies on resampling from the empirical distribution to provide inferences about the population with a distribution F. The empirical distribution serves as an approximation to the population. It is possible, however, to resample from another approximating distribution of F to conduct simulation-based inferences. In this paper, we utilize representative points to form an alternative approximating distribution of F for resampling. The representative points in terms of minimum mean squared error from F have been widely applied to numerical integration, simulation, and the problems of grouping, quantization, and classification. The method of resampling via representative points can be used to estimate the sampling distribution of a statistic of interest. A basic theory for the proposed method is established. We prove the convergence of higher-order moments of the new approximating distribution of F, and establish the consistency of sampling distribution approximation in the cases of the sample mean and sample variance under the Kolmogorov metric and Mallows–Wasserstein metric. Based on some numerical studies, it has been shown that the proposed resampling method improves the nonparametric bootstrap in terms of confidence intervals for mean and variance.


References

  • Bickel PJ, Freedman DA (1981) Some asymptotic theory for the bootstrap. Ann Stat 9:1196–1217

  • Billingsley P (1995) Probability and measure, 3rd edn. Wiley, New York

  • Chakraborty S, Roychowdhury MK, Sifuentes J (2021) High precision numerical computation of principal points for univariate distributions. Sankhya B 83:558–584

  • Chen WY, Mackey L, Gorham J, Briol FX, Oates C (2018) Stein points. In: Proceedings of the 35th international conference on machine learning (ICML), pp 844–853

  • Cox DR (1957) Note on grouping. J Am Stat Assoc 52:543–547

  • Efron B (1979) Bootstrap methods: another look at the jackknife. Ann Stat 7:1–26

  • Efron B (1981) Nonparametric standard errors and confidence intervals. Can J Stat 9(2):139–158

  • Efron B, Tibshirani R (1994) An introduction to the bootstrap. Chapman & Hall, New York

  • Fang KT, He SD (1982) The problem of selecting a given number of representative points in a normal population and a generalized Mills’ ratio. Technical report SOL ONR 327. Department of Statistics, Stanford University, Stanford

  • Fang KT, Wang Y (1994) Number-theoretic methods in statistics. Chapman and Hall, London

  • Fang KT, Wang Y, Bentler PM (1994) Some applications of number-theoretic methods in statistics. Stat Sci 9:416–428

  • Fang KT, Yuan KH, Bentler PM (1994) Applications of number-theoretic methods to quantizers of elliptically contoured distributions. In: Multivariate analysis and its applications. IMS Lecture Notes–Monograph Series, vol 24, pp 211–225

  • Fang KT, Zhou M, Wang WJ (2014) Applications of the representative points in statistical simulations. Sci China Math 57:2609–2620

  • Flury B (1990) Principal points. Biometrika 77:33–41

  • Flury B (1993) Estimation of principal points. J R Stat Soc C Appl Stat 42:139–151

  • Fei R (1991) Statistical relationship between the representative point and the population. J Wuxi Inst Light Ind 10:78–81

  • Giné E, Zinn J (1989) Necessary conditions for the bootstrap of the mean. Ann Stat 17:684–691

  • Graf S, Luschgy H (2007) Foundations of quantization for probability distributions. Springer, Berlin

  • Hall P (1990) Asymptotic properties of the bootstrap for heavy-tailed distributions. Ann Probab 18:1342–1360

  • Hall P (1992) The bootstrap and Edgeworth expansion. Springer, New York

  • Iyengar S, Solomon H (1983) Selecting representative points in normal populations. In: Rizvi MH, Rustagi J, Siegmund D (eds) Recent advances in statistics: papers in honor of Herman Chernoff on his 60th birthday. Academic Press, New York

  • Joseph VR, Dasgupta T, Tuo R, Wu CJ (2015) Sequential exploration of complex surfaces using minimum energy designs. Technometrics 57:64–74

  • Joseph VR, Wang D, Gu L, Lyu S, Tuo R (2019) Deterministic sampling of expensive posteriors using minimum energy designs. Technometrics 61:297–308

  • Linde Y, Buzo A, Gray R (1980) An algorithm for vector quantizer design. IEEE Trans Commun 28:84–95

  • Lloyd SP (1982) Least squares quantization in PCM. IEEE Trans Inform Theory 28:129–137

  • Mak S, Joseph VR (2018) Support points. Ann Stat 46:2562–2592

  • Mallows CL (1972) A note on asymptotic joint normality. Ann Math Stat 43:508–515

  • Matsuura S, Tarpey T (2020) Optimal principal points estimators of multivariate distributions of location-scale and location-scale-rotation families. Stat Pap 61:1629–1643

  • Max J (1960) Quantizing for minimum distortion. IEEE Trans Inform Theory 6:7–12

  • Pagès G (2015) Introduction to vector quantization and its applications for numerics. ESAIM Proc Surv 48:29–79

  • Pagès G (2018) Numerical probability. Universitext. Springer, Cham

  • Pagès G, Yu J (2016) Pointwise convergence of the Lloyd I algorithm in higher dimension. SIAM J Control Optim 54:2354–2382

  • Panaretos VM, Zemel Y (2019) Statistical aspects of Wasserstein distances. Annu Rev Stat Appl 6:405–431

  • Rohatgi VK, Székely GJ (1989) Sharp inequalities between skewness and kurtosis. Stat Probab Lett 8:297–299

  • Serfling RJ (1980) Approximation theorems of mathematical statistics. Wiley, New York

  • Shao J, Tu D (1995) The jackknife and bootstrap. Springer, New York

  • Shi X (1986) A note on bootstrapping U-statistics. Chin J Appl Prob Stat 2:144–148

  • Singh K (1981) On the asymptotic accuracy of Efron’s bootstrap. Ann Stat 9:1187–1195

  • Tarpey T (1997) Estimating principal points of univariate distributions. J Appl Stat 24:499–512

  • Tarpey T, Flury B (1996) Self-consistency: a fundamental concept in statistics. Stat Sci 11:229–243

  • Tarpey T, Petkova E (2010) Principal point classification: applications to differentiating drug and placebo responses in longitudinal studies. J Stat Plan Inference 140:539–550

  • Tarpey T, Li L, Flury B (1995) Principal points and self-consistent points of elliptical distributions. Ann Stat 23:103–112

  • Tarpey T, Petkova E, Lu Y, Govindarajulu U (2010) Optimal partitioning for linear mixed effects models: applications to identifying placebo responders. J Am Stat Assoc 105:968–977

  • Van der Vaart AW (2000) Asymptotic statistics. Cambridge University Press, Cambridge

  • Xu LH, Fang KT, He P (2022) Properties and generation of representative points of the exponential distribution. Stat Pap 63:197–223

  • Yang J, He P, Fang KT (2022) Three kinds of discrete approximations of statistical multivariate distributions and their applications. J Multivar Anal. https://doi.org/10.1016/j.jmva.2021.104829

  • Zoppè A (1995) Principal points of univariate continuous distributions. Stat Comput 5:127–132

Acknowledgements

Our work was supported in part by the Guangdong Provincial Key Laboratory of Interdisciplinary Research and Application for Data Science, BNU-HKBU United International College (2022B1212010006) and in part by Guangdong Higher Education Upgrading Plan (2021-2025) (UIC R0400001-22).

Author information

Corresponding author

Correspondence to Kai-Tai Fang.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix A: Notations

Notations are summarized in Table 8.

Table 8 Summary of notations

Appendix B: Proofs

1.1 Proof of Theorem 1

Proof

According to the properties of RPs, \(Y^{mse}_k\) converges to X in distribution as k approaches infinity. Part (i) then follows from the continuous mapping theorem, and part (ii) follows from the Portmanteau theorem. \(\square \)

1.2 Proof of Theorem 2

By Jensen’s inequality and the self-consistency property of RPs, inequality (B1) below is established. In general, convergence in distribution does not imply convergence of moments; additional conditions are required, the classical one being uniform integrability. A sufficient condition for uniform integrability is given in Lemma 2.

Lemma 1

(Pagès 2018, Eq. 5.10) Let \(X\sim F\) and \(Y^{mse}_k\sim F_{mse,k}\). For every convex function \(g:{\mathbb {R}}^{d} \rightarrow {\mathbb {R}}\) such that \(g(X) \in L^1(P)\), we have

$$\begin{aligned} E[g(Y^{mse}_k)]\le E[g(X)], \quad \text {for all } k\in {\mathbb {N}}^+, \end{aligned}$$
(B1)

where \(L^{1}(P)\) consists of all real valued measurable functions on the sample space \({\mathcal {X}}\) of X that satisfy \(\int _{{\mathcal {X}}}|g(x)|d P(x)<\infty \).

Lemma 2

(Billingsley 1995, p. 338) Suppose that X and \(X_{n}\) are random variables on the same probability space. Let r be a positive integer. If \({X_n} \rightarrow X\) in distribution and \(\sup _{n\in {\mathbb {N}}^{+}}E ( |X_n|^{r+\varepsilon } ) <\infty \) with some \(\varepsilon >0\), then \(E\left( |X |^r\right) <\infty \) and \(E[(X_n)^r]\rightarrow E(X^r)\) as \(n \rightarrow \infty \).

Proof

Part (i). Suppose \(E( |X |^{r+\varepsilon }) < \infty \) for some integer \(r\ge 3\) and some \(\varepsilon >0\). Let \(g_1(x)=|x|^{r+\varepsilon }\). Since \(g_1\) is convex and satisfies \(\int _{{\mathcal {X}}}|x|^{r+\varepsilon }dF(x)<\infty \), Lemma 1 gives

$$\begin{aligned} \sup _{k\in {\mathbb {N}}^+} E ( |Y^{mse}_k |^{r+\varepsilon } ) \le E ( |X |^{r+\varepsilon } ) < \infty . \end{aligned}$$
(B2)

Since \(Y^{mse}_k \rightarrow X\) in distribution and \(\sup _{k \in {\mathbb {N}}^+} E ( |Y^{mse}_k |^{r+\varepsilon } ) < \infty \), according to Lemma 2, we have \(E ( |X|^r ) <\infty \) and \(E[(Y^{mse}_k)^r]\rightarrow E(X^r)\) as \(k \rightarrow \infty \).

Part (ii). Under the same conditions, define the continuous function \(g_2(x)=|x-\mu |\), where \(\mu =E(X)\) is the mean of F; by self-consistency it is also the mean of \(F_{mse,k}\) for every \(k \in {\mathbb {N}}^+\). By the continuous mapping theorem, \(Y^{mse}_k \rightarrow X\) in distribution implies \(|Y^{mse}_k-\mu |\rightarrow |X-\mu |\) in distribution. By the Minkowski inequality, we have

$$\begin{aligned} \left[ E(|Y^{mse}_k-\mu |^p) \right] ^{\frac{1}{p}} \le \left[ E(|Y^{mse}_k|^p)\right] ^{\frac{1}{p}} + \left[ E(|\mu |^p)\right] ^{\frac{1}{p}}= \left[ E(|Y^{mse}_k|^p)\right] ^{\frac{1}{p}} + |\mu |, \end{aligned}$$
(B3)

for \(1\le p < \infty \). By (B2)–(B3),

$$\begin{aligned} \sup _{k\in {\mathbb {N}}^+} E ( |Y^{mse}_k-\mu |^{r+\varepsilon } ) \le \sup _{k\in {\mathbb {N}}^+} \left\{ \left[ E(|Y^{mse}_k|^{r+\varepsilon })\right] ^{\frac{1}{r+\varepsilon }} + |\mu |\right\} ^{r+\varepsilon } < \infty \end{aligned}$$

holds when \(E ( |X |^{r+\varepsilon } ) < \infty \). Hence, \(E(|Y^{mse}_k-\mu |^r)\rightarrow E(|X-\mu |^r)\) as \(k \rightarrow \infty \) according to Lemma 2. \(\square \)
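The moment convergence in Theorem 2 is easy to illustrate numerically. The sketch below (added for illustration, not part of the proof) assumes a standard normal population and computes its k MSE representative points by a Lloyd-type fixed-point iteration, one standard construction for univariate RPs (cf. Lloyd 1982; Fang and He 1982); the helper name normal_rps and the iteration settings are our own illustrative choices, not the paper’s code.

```python
import numpy as np
from scipy.stats import norm

def normal_rps(k, iters=2000):
    """Lloyd-type fixed-point iteration for the k MSE representative points
    of N(0,1) and their cell probabilities (illustrative sketch)."""
    xi = norm.ppf((np.arange(k) + 0.5) / k)            # quantile starting values
    for _ in range(iters):
        b = np.concatenate(([-np.inf], (xi[:-1] + xi[1:]) / 2, [np.inf]))  # Voronoi boundaries
        p = norm.cdf(b[1:]) - norm.cdf(b[:-1])          # cell probabilities P(Y = xi_i)
        xi = (norm.pdf(b[:-1]) - norm.pdf(b[1:])) / p   # conditional means (self-consistency)
    return xi, p

# E[(Y_k^mse)^2] and E[(Y_k^mse)^4] approach the N(0,1) values 1 and 3 from below,
# in line with Lemma 1 and Theorem 2.
for k in (5, 10, 20, 50):
    xi, p = normal_rps(k)
    print(f"k={k:3d}  E[Y^2]={np.sum(p * xi**2):.4f}  E[Y^4]={np.sum(p * xi**4):.4f}")
```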

1.3 Proof of Theorem 3

Lemma 3

(Bickel and Freedman 1981) Let \(Z_1(G),\ldots ,Z_n(G)\) be independent random variables with common distribution function G, and let \(G^{(n)}\) be the distribution of the statistic \(n^{-1/2}\sum _{j=1}^n\left\{ Z_j(G)-E[Z_j(G)]\right\} \). Denote by \(K^{(n)}\) the sampling distribution of the same statistic based on n random variables from a distribution K. If \(G, K\in {\mathcal {F}}_2\), then

$$\begin{aligned} \rho _2(G^{(n)},K^{(n)})\le \rho _2(G,K). \end{aligned}$$

The proof of Lemma 3 can be found in Mallows (1972, Lemma 3) and Bickel and Freedman (1981, Lemma 8.6 and Lemma 8.7).

Lemma 4

(Bickel and Freedman 1981, Lemma 8.3) Let \(G_n\) and G be distributions in \({\mathcal {F}}_{2}\). Then \(\rho _{2}(G_n,G)\rightarrow 0\) if and only if \(G_n\) converges to G in distribution and

$$\begin{aligned} \lim _{n\rightarrow \infty }\int _{-\infty }^{\infty } x^2\textrm{d}G_n(x)=\int _{-\infty }^{\infty } x^2\textrm{d}G(x). \end{aligned}$$

Proof

Part (i). Denote the law of \(n^{1/2}({\bar{X}}-\mu )\) and \(n^{1/2}({\bar{Y}}-{\tilde{\mu }}_k)\) as \(H_n\) and \({\tilde{H}}_n^{(k)}\), respectively. Since \(F \in {\mathcal {F}}_2\), so does \(F_{mse,k}\). By Lemma 3, for any \(n,k \in {\mathbb {N}}^+\), we have

$$\begin{aligned} \rho _2({\tilde{H}}_n^{(k)},H_{n})\le \rho _2(F_{mse,k},F). \end{aligned}$$
(B4)

Since \({\tilde{\mu }}_k=\mu \) for any k and \(Y^{mse}_k\) converges to X in mean square, \(F_{mse,k}\) converges to F in distribution and \(\lim _{k\rightarrow \infty } \int _{-\infty }^{\infty }y^2dF_{mse,k}(y)=\int _{-\infty }^{\infty }x^2dF(x)\). By Lemma 4, this is equivalent to \(\lim _{k\rightarrow \infty } \rho _{2}(F_{mse,k},F)=0\). By (B4), it follows that \(\rho _2({\tilde{H}}_n^{(k)},H_{n}) \rightarrow 0\) as \(k\rightarrow \infty \) for any \(n \in {\mathbb {N}}^+\). According to Graf and Luschgy (2007, Lemma 3.4, p. 33), the relationship between the Mallows–Wasserstein distance and the mean squared error function is

$$\begin{aligned} \rho _2(F_{mse,k},F)=\left( {\text {MSE}}_{X}({\mathcal {A}})\right) ^{1/2}=E^{1/2}[(X-Y^{mse}_k)^2]. \end{aligned}$$

Given k-RPs, we can calculate the upper bound of the Mallows–Wasserstein distance between \({\tilde{H}}_n^{(k)}\) and \(H_n\).
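This bound is explicit in terms of the k-RPs. The following short derivation is added here for concreteness; it is standard in the quantization literature (e.g. Graf and Luschgy 2007). By self-consistency, \(Y^{mse}_k=E[X\mid Y^{mse}_k]\), so \(E[XY^{mse}_k]=E[(Y^{mse}_k)^2]\) and therefore

$$\begin{aligned} \rho _2^2(F_{mse,k},F)=E[(X-Y^{mse}_k)^2]=E(X^2)-E[(Y^{mse}_k)^2]=\text {var}(X)-\text {var}(Y^{mse}_k), \end{aligned}$$

since X and \(Y^{mse}_k\) share the mean \(\mu \). Hence \(\rho _2({\tilde{H}}_n^{(k)},H_{n})\le [\text {var}(X)-\text {var}(Y^{mse}_k)]^{1/2}\), which can be evaluated directly from the RPs and their probabilities.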

Part (ii). Denote the variances of X and \(Y^{mse}_k\) as \(\sigma ^2\) and \({\tilde{\sigma }}^2_k\), respectively.

Let \(\Phi (x)\) be the cumulative distribution function of the standard normal distribution, which is uniformly continuous. Since \(Y^{mse}_k\) converges to X in mean square, \({\tilde{\sigma }}^2_k \rightarrow \sigma ^2\) as \(k\rightarrow \infty \), and the uniform continuity of \(\Phi \) then gives

$$\begin{aligned} \rho _{\infty } \biggl ( \Phi \biggl ( \frac{x}{\sigma }\biggr ) , \Phi \biggl ( \frac{x}{{\tilde{\sigma }}_k}\biggr )\biggr ) \rightarrow 0 \text { as } k\rightarrow \infty . \end{aligned}$$
(B5)

By the Lindeberg–Lévy central limit theorem (together with Pólya’s theorem, since the limit distribution is continuous), we obtain

$$\begin{aligned} \rho _{\infty } \biggl ( H_n(x),\Phi \biggl ( \frac{x}{\sigma }\biggr )\biggr ) \rightarrow 0 \text { as } n\rightarrow \infty , \end{aligned}$$
(B6)

and

$$\begin{aligned} \rho _{\infty } \biggl ( {\tilde{H}}_n^{(k)}(x), \Phi \biggl ( \frac{x}{{\tilde{\sigma }}_k}\biggr ) \biggr ) \rightarrow 0 \text { as } n\rightarrow \infty . \end{aligned}$$
(B7)

Using the triangle inequality, we have

$$\begin{aligned} \begin{aligned} \rho _{\infty } \biggl ({\tilde{H}}_n^{(k)}(x), H_n(x) \biggr )&\le \rho _{\infty } \biggl ( {\tilde{H}}_n^{(k)}(x),\Phi \biggl ( \frac{x}{{\tilde{\sigma }}_k}\biggr ) \biggr ) +\rho _{\infty } \biggl ( H_n(x),\Phi \biggl ( \frac{x}{\sigma } \biggr )\biggr )\\&\quad + \rho _{\infty } \biggl ( \Phi \biggl ( \frac{x}{\sigma }\biggr ),\Phi \biggl ( \frac{x}{{\tilde{\sigma }}_k}\biggr ) \biggr ). \end{aligned} \end{aligned}$$
(B8)

By (B5)–(B7), we have \(\rho _{\infty } \bigl ({\tilde{H}}_n^{(k)}(x), H_n(x) \bigr ) \rightarrow 0\) as \(n\rightarrow \infty \) and \(k\rightarrow \infty \).

For a fixed number k of RPs, the first two terms on the right-hand side of (B8) converge to 0 as \(n \rightarrow \infty \). Thus, we obtain the desired result

$$\begin{aligned} \limsup _{n \rightarrow \infty } \rho _{\infty } ( {\tilde{H}}_n^{(k)},H_n ) \le \rho _{\infty } \biggl ( \Phi \biggl ( \frac{x}{\sigma }\biggr ),\Phi \biggl ( \frac{x}{{\tilde{\sigma }}_k}\biggr )\biggr ). \end{aligned}$$

\(\square \)

1.4 Proof of Theorem 4

Lemma 5

(The Berry–Esséen inequality) If \(Z_1,\ldots , Z_n\) are independent and identically distributed random variables with \(E(Z_1)=0\), \(E(Z_1^2)=\sigma ^{2}>0\), \(E(|Z_1|^{3})<\infty \), and \({\bar{Z}}=n^{-1}\sum _{i=1}^{n}Z_i\), then we have

$$\begin{aligned} \sup _{-\infty<t<\infty } \bigg |{\text {pr}}\bigg (\frac{n^{1/2}{\bar{Z}}}{\sigma } \le t\bigg )-\Phi (t) \bigg |\le \frac{CE(|Z_1 |^{3})}{\sigma ^{3}n^{1/2}}, \end{aligned}$$

where \(\Phi (t)\) is the cumulative distribution function of the standard normal distribution and C is a positive constant that does not depend on n or on the distribution of the \(Z_i\).

Proof

Applying the triangle inequality and the Berry–Esséen inequality (Lemma 5), we have

$$\begin{aligned}&\sup _{-\infty<t<\infty } \bigg |{\text {pr}}\bigg ( \frac{n^{1/2}({\bar{X}}-\mu )}{\sigma } \le t\bigg )-{\text {pr}}\bigg ( \frac{n^{1/2}({\bar{Y}}-{\tilde{\mu }}_k)}{{\tilde{\sigma }}_k} \le t\bigg )\bigg |\\&\quad \le \sup _{-\infty<t<\infty } \bigg |{\text {pr}}\bigg (\frac{n^{1/2}({\bar{X}}-\mu )}{\sigma } \le t\bigg )-\Phi (t) \bigg |\\ {}&\qquad + \sup _{-\infty<t<\infty }\bigg |{\text {pr}}\bigg ( \frac{n^{1/2}({\bar{Y}}-{\tilde{\mu }}_k)}{{\tilde{\sigma }}_k} \le t\bigg )-\Phi (t)\bigg |\\&\quad \le \frac{C\alpha _3}{\sigma ^3n^{1/2}} + \frac{C{\tilde{\alpha }}_{3,k}}{{\tilde{\sigma }}_k^3n^{1/2}}, \end{aligned}$$

where \(\alpha _3=E(|X-\mu |^3)\) and \({\tilde{\alpha }}_{3,k}=E(|Y^{mse}_k-{\tilde{\mu }}_k|^3)\). Multiplying both sides by \(\sigma ^3n^{1/2}/\alpha _3\) yields

$$\begin{aligned}&\bigg ( \frac{\sigma ^3n^{1/2}}{\alpha _3} \bigg ) \sup _{-\infty<t<\infty }\bigg |{\text {pr}}\bigg (\frac{n^{1/2}({\bar{X}}-\mu )}{\sigma }\le t\bigg )\\ {}&\quad -{\text {pr}}\bigg (\frac{n^{1/2}({\bar{Y}}-{\tilde{\mu }}_k)}{{\tilde{\sigma }}_k} \le t\bigg ) \bigg |\le C\bigg (1+\frac{\sigma ^3}{{\tilde{\sigma }}_k^3}\frac{{\tilde{\alpha }}_{3,k}}{\alpha _3}\bigg ). \end{aligned}$$

According to Theorem 2, if \(E(|X|^{3+\epsilon })<\infty \) with some \(\epsilon >0\), then \(\lim _{k\rightarrow \infty } {\tilde{\alpha }}_{3,k} =\alpha _3\); moreover, the mean-square convergence of \(Y^{mse}_k\) to X gives \({\tilde{\sigma }}_k \rightarrow \sigma \) as \(k\rightarrow \infty \). Hence,

$$\begin{aligned} \limsup _{k\rightarrow \infty }\bigg ( \frac{\sigma ^3}{{\tilde{\sigma }}_k^3}\frac{{\tilde{\alpha }}_{3,k}}{\alpha _3}\bigg ) \le \bigg (\limsup _{k\rightarrow \infty }\frac{\sigma ^3}{{\tilde{\sigma }}_k^3}\bigg ) \bigg (\limsup _{k\rightarrow \infty }\frac{{\tilde{\alpha }}_{3,k}}{\alpha _3}\bigg )=1, \end{aligned}$$

which is the desired result.

Therefore, we establish the consistency of the RP-based resampling approximation to the sampling distribution of the standardized sample mean, that is,

$$\begin{aligned} \sup _{-\infty<t<\infty } \bigg |{\text {pr}}\bigg (\frac{n^{1/2}({\bar{X}}-\mu )}{\sigma }\le t\bigg ) -{\text {pr}}\bigg (\frac{n^{1/2}({\bar{Y}}-{\tilde{\mu }}_k)}{{\tilde{\sigma }}_k} \le t\bigg ) \bigg |\rightarrow 0, \end{aligned}$$

as both \(k\rightarrow \infty \) and \(n \rightarrow \infty \). \(\square \)
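A rough Monte Carlo check of this consistency can be added for illustration. The sketch below assumes a standard normal population, reuses the Lloyd-type RP construction from the sketch after the proof of Theorem 2, and compares the two sampling distributions by a two-sample Kolmogorov statistic; the sample size n = 20, the replication count B, and the seed are arbitrary choices of ours.

```python
import numpy as np
from scipy.stats import norm, ks_2samp

rng = np.random.default_rng(1)

def normal_rps(k, iters=2000):
    # Lloyd-type fixed-point iteration for the k MSE representative points of N(0,1)
    xi = norm.ppf((np.arange(k) + 0.5) / k)
    for _ in range(iters):
        b = np.concatenate(([-np.inf], (xi[:-1] + xi[1:]) / 2, [np.inf]))
        p = norm.cdf(b[1:]) - norm.cdf(b[:-1])
        xi = (norm.pdf(b[:-1]) - norm.pdf(b[1:])) / p
    return xi, p

n, B = 20, 20000
t_F = np.sqrt(n) * rng.standard_normal((B, n)).mean(axis=1)    # standardized means under F = N(0,1)
for k in (3, 10, 30):
    xi, p = normal_rps(k)
    mu_k = np.sum(p * xi)
    sd_k = np.sqrt(np.sum(p * (xi - mu_k) ** 2))
    y = rng.choice(xi, size=(B, n), p=p)                        # i.i.d. draws from F_mse,k
    t_k = np.sqrt(n) * (y.mean(axis=1) - mu_k) / sd_k
    # two-sample Kolmogorov distance between the two Monte Carlo sampling distributions
    print(f"k={k:3d}  approximate Kolmogorov distance = {ks_2samp(t_F, t_k).statistic:.3f}")
```

The distance is small already for moderate k and decreases further as k grows, up to Monte Carlo noise of order \(B^{-1/2}\).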

1.5 Proof of Theorem 5

The sample variance is a U-statistic estimating \(\sigma ^2=E[(X_{1}-X_{2})^{2} / 2]\) (Serfling 1980, Sect. 5). To prove Theorem 5, we use the asymptotic normality of U-statistics; Lemma 6 gives the asymptotic distribution of the sample variance under the stated assumptions.

Lemma 6

(Serfling 1980, Sect. 5.5.1) Let \(X_1,\ldots ,X_n\) be independent observations from a distribution F and \(S^2\) their sample variance. Assume that \(\sigma ^{4}<\mu _{4}<\infty \), where \(\sigma ^{2}=\text {var}(X)\) and \(\mu _{4}=E\{[X-E(X)]^{4}\}\). Then \(T(X_1,\ldots ,X_n; F)=n^{1/2}(S^2-\sigma ^{2})\) converges to \(N(0, \mu _{4}-\sigma ^{4})\) in distribution.

Lemma 7

(Rohatgi and Székely 1989, Theorem 1) Suppose \(X \sim F\) has a finite fourth moment. Let \(\mu =E(X)\) and \(\sigma ^2=\text {var}(X)>0\). Then

$$\begin{aligned} \frac{E[(X-\mu )^4]}{\sigma ^4}\ge \bigg ( \frac{E[(X-\mu )^3]}{\sigma ^3}\bigg )^2 + 1 \end{aligned}$$

holds with equality if and only if the distribution F is concentrated on at most two points.

Proof

Let \(H_n(x)=\text {pr}\lbrace n^{1/2}(S^2-\sigma ^2)\le x\rbrace \) and \({\tilde{H}}_n^{(k)}(x)=\text {pr}\lbrace n^{1/2}({\tilde{S}}^2-{\tilde{\sigma }}^2_k)\le x\rbrace \), where \({\tilde{S}}^2\) is the sample variance of n draws from \(F_{mse,k}\). Denote by \(\mu _{4}\) and \({\tilde{\mu }}_{4,k}\) the fourth central moments of F and \(F_{mse,k}\), respectively. Since F is continuous in the setting of this theorem, Lemma 7 gives \(\mu _{4}>\sigma ^{4}\). Because \(F_{mse,k}\) is supported on exactly k points, Lemma 7 likewise gives \({\tilde{\mu }}_{4,k}-{\tilde{\sigma }}_k^{4}>0\) for every integer \(k>2\). Moreover, by Lemma 6, \(n^{1/2}({\tilde{S}}^2-{\tilde{\sigma }}^2_k)\) converges to \(N(0, {\tilde{\mu }}_{4,k}-{\tilde{\sigma }}_k^{4})\) in distribution as \(n\rightarrow \infty \).

According to Theorem 2, if \(E(|X |^{4+\epsilon })<\infty \) where \(\epsilon >0\), then we have \(\lim _{k\rightarrow \infty }({\tilde{\mu }}_{4,k}-{\tilde{\sigma }}_k^{4})=\mu _{4}-\sigma ^{4}.\)

By Lemma 6, we obtain

$$\begin{aligned} \rho _{\infty }\bigg ( H_n(x),\Phi \bigg [ \frac{x}{(\mu _{4}-\sigma ^{4})^{1/2}}\bigg ] \bigg ) \rightarrow 0 \text { as } n\rightarrow \infty , \end{aligned}$$

and

$$\begin{aligned} \rho _{\infty }\bigg ( {\tilde{H}}_n^{(k)}(x) ,\Phi \bigg [ \frac{x}{({\tilde{\mu }}_{4,k}-{\tilde{\sigma }}_k^{4})^{1/2}}\bigg ]\bigg ) \rightarrow 0 \ \text{ as } \ n \rightarrow \infty . \end{aligned}$$

Similarly to the proof of Theorem 3 part (ii), the triangle inequality gives the desired result, i.e., \(\rho _{\infty }({\tilde{H}}_n^{(k)}(x), H_n(x) ) \rightarrow 0\) as both \(k \rightarrow \infty \) and \(n \rightarrow \infty \).

If we only let \(n\rightarrow \infty \), then we have

$$\begin{aligned} \limsup _{n \rightarrow \infty } \rho _{\infty }( {\tilde{H}}_n^{(k)}(x),H_n(x)) \le \rho _{\infty }\bigg ( \Phi \bigg [ \frac{x}{({\tilde{\mu }}_{4,k}-{\tilde{\sigma }}_k^{4})^{1/2}}\bigg ],\Phi \bigg [ \frac{x}{(\mu _{4}-\sigma ^{4})^{1/2}}\bigg ]\bigg ) . \end{aligned}$$

\(\square \)

1.6 Proof of Theorem 6

Proof

Denote by \(Z \sim J(z)\) a random variable from a general location-scale family with finite variance, and let \(\Xi =\{\xi _{1}^{(n)}, \ldots , \xi _{n}^{(n)}\}\) be a set of n RPs of the distribution J(z). Let \(M^{mse}_{n} \sim J_{mse,n}\) be the discrete random variable supported on \(\Xi \) with probabilities \(p_i\), \(i=1,\ldots ,n\). Then \(E(M^{mse}_{n})=\sum _{i=1}^{n}\xi _{i}^{(n)}p_i=E(Z)\), \(\text {var}(M^{mse}_{n})=\sum _{i=1}^{n}(\xi _{i}^{(n)}-E(M^{mse}_{n}))^2p_i\), and \(\lim _{n\rightarrow \infty }\text {var}(M^{mse}_{n})=\text {var}(Z)\). Let \(X \sim F\), where F belongs to the same location-scale family as Z, i.e., \(X=mZ+b\) with \(b \in {\mathbb {R}}\) and \(m \ne 0\). Denote by \(W^{mse}_{n} \sim {\hat{F}}_{mse,n}\) the discrete random variable supported on \(\lbrace {\hat{m}}_n\xi _{1}^{(n)}+{\hat{b}}_n, \ldots , {\hat{m}}_n\xi _{n}^{(n)}+{\hat{b}}_n\rbrace \) with the same probabilities \(p_i\), \(i=1,\ldots ,n\), where \({\hat{m}}_n\) and \({\hat{b}}_n\) are the maximum likelihood estimators of m and b. Thus, we obtain

$$\begin{aligned} E(W^{mse}_{n})= & {} \sum _{i=1}^{n}\left[ ({\hat{m}}_n \xi _{i}^{(n)} +{\hat{b}}_n) p_i\right] ={\hat{m}}_n E(M^{mse}_{n}) + {\hat{b}}_n, \end{aligned}$$
(B9)
$$\begin{aligned} \text {var}(W^{mse}_{n})= & {} \sum _{i=1}^{n}\left[ ({\hat{m}}_n \xi _{i}^{(n)} +{\hat{b}}_n-E(W^{mse}_{n}))^2 p_i\right] ={\hat{m}}_n^2 \text {var}(M^{mse}_{n}), \end{aligned}$$
(B10)

and

$$\begin{aligned} E |W^{mse}_{n}-E(W^{mse}_{n})|^3=|{\hat{m}}_n^3|\left( E |M^{mse}_{n}-E(M^{mse}_{n})|^3\right) . \end{aligned}$$
(B11)

By using the triangle inequality, we have

$$\begin{aligned} \begin{aligned}&\rho _{\infty }\left( {\text {pr}}\left( n^{1/2}({\bar{X}}-E(X))\le t\right) ,{\text {pr}}^{*}\left( n^{1/2}({\bar{W}}-E(W^{mse}_{n})) \le t\right) \right) \\&\quad \le \rho _{\infty }\left( {\text {pr}}( n^{1/2}({\bar{X}}-E(X))\le t), \Phi \left( \frac{t}{\text {var}(X)^{1/2}} \right) \right) \\&\qquad + \rho _{\infty }\left( {\text {pr}}^{*}(n^{1/2}({\bar{W}}-E(W^{mse}_{n})) \le t), \Phi \left( \frac{t}{\text {var}(W^{mse}_{n})^{1/2}} \right) \right) \\&\qquad + \rho _{\infty }\left( \Phi \left( \frac{t}{\text {var}(X)^{1/2}} \right) , \Phi \left( \frac{t}{\text {var}(W^{mse}_{n})^{1/2}} \right) \right) . \end{aligned} \end{aligned}$$

Applying the Berry–Esséen inequality (Lemma 5) to the first two terms on the right-hand side, the above inequality yields

$$\begin{aligned} \begin{aligned}&\rho _{\infty }\left( {\text {pr}}\left( n^{1/2}({\bar{X}}-E(X))\le t\right) ,{\text {pr}}^{*}\left( n^{1/2}({\bar{W}}-E(W^{mse}_{n})) \le t\right) \right) \\&\quad \le \frac{C}{n^{1/2}}\left( \frac{ E|X-E(X)|^3}{\text {var}(X)^{3/2}} + \frac{ E|W^{mse}_{n}-E(W^{mse}_{n})|^3}{\text {var}(W^{mse}_{n})^{3/2}}\right) \\&\qquad + \rho _{\infty }\left( \Phi \left( \frac{t}{\text {var}(X)^{1/2}} \right) , \Phi \left( \frac{t}{\text {var}(W^{mse}_{n})^{1/2}} \right) \right) . \end{aligned} \end{aligned}$$
(B12)

Notice that the maximum likelihood estimators \({\hat{m}}_n\) and \({\hat{b}}_n\) converge almost surely to m and b, respectively. By (B9), we know \(E(W^{mse}_{n}) \rightarrow mE(M^{mse}_{n})+b=E(X)\) almost surely as \(n \rightarrow \infty \). By (B10), we know \(\text {var}(W^{mse}_{n}) \rightarrow m^2\text {var}(Z)=\text {var}(X)\) almost surely as \(n \rightarrow \infty \).

If \(E(|X|^{3+\epsilon })<\infty \) with some \(\epsilon >0\), then \(\lim _{n\rightarrow \infty }E |M^{mse}_{n}-E(M^{mse}_{n})|^3=E|Z-E(Z)|^3\). Hence, \(E|W^{mse}_{n}-E(W^{mse}_{n})|^3 \rightarrow E|X-E(X)|^3\) almost surely by (B11). Therefore, the first term on the right-hand side of (B12) converges to 0 almost surely. Since \(\text {var}(W^{mse}_{n}) \rightarrow m^2\text {var}(Z)=\text {var}(X)\) almost surely as \(n \rightarrow \infty \), it follows that the second term converges to 0 almost surely. Thus, the proof is completed. \(\square \)
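Theorem 6 is what licenses the practical version of the method, in which the RPs of the standardized member of the family are rescaled by the maximum likelihood estimates \({\hat{m}}_n\) and \({\hat{b}}_n\) and resampled in place of the empirical distribution. The sketch below is a minimal illustration for a normal population, comparing percentile intervals for the mean under RP resampling and under the nonparametric bootstrap; the normal model, the use of n RPs, the values of B and the seed, and the percentile-type interval are our own illustrative assumptions rather than the exact settings of the paper’s numerical studies.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(7)

def normal_rps(k, iters=2000):
    # Lloyd-type fixed-point sketch for the k MSE representative points of N(0,1)
    xi = norm.ppf((np.arange(k) + 0.5) / k)
    for _ in range(iters):
        b = np.concatenate(([-np.inf], (xi[:-1] + xi[1:]) / 2, [np.inf]))
        p = norm.cdf(b[1:]) - norm.cdf(b[:-1])
        xi = (norm.pdf(b[:-1]) - norm.pdf(b[1:])) / p
    return xi, p

x = rng.normal(loc=2.0, scale=3.0, size=30)      # observed sample; a normal model is assumed for F
n, B = len(x), 5000
b_hat, m_hat = x.mean(), x.std(ddof=0)           # MLEs of the location b and the scale m

xi, p = normal_rps(n)                            # RPs of the standard member Z ~ N(0,1)
w = m_hat * xi + b_hat                           # support of the fitted RP distribution F_hat_mse,n

means_rp = rng.choice(w, size=(B, n), p=p).mean(axis=1)   # RP-based resampling
means_bt = rng.choice(x, size=(B, n)).mean(axis=1)        # nonparametric bootstrap

for name, m in (("RP resampling", means_rp), ("bootstrap", means_bt)):
    lo, hi = np.percentile(m, [2.5, 97.5])
    print(f"{name:14s} 95% percentile interval for the mean: ({lo:.2f}, {hi:.2f})")
```

Only \({\hat{m}}_n\) and \({\hat{b}}_n\) are re-estimated from the data; the standard RPs \(\xi _{i}^{(n)}\) and their weights \(p_i\) are deterministic and can be precomputed once for each n.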

1.7 Proof of Theorem 7

Lemma 8

(Serfling 1980, p. 193) Given a kernel function \(h\left( x_{1}, \ldots , x_{s}\right) \), denote \(U_n\) as the corresponding U-statistic for estimation of \(\theta =E_{F}\left[ h\left( X_{1}, \ldots , X_{s}\right) \right] \) on the basis of a sample \(X_{1}, \ldots , X_{n}\) of size \(n \ge s\). Let \(\zeta _{1}={\text {Var}}_{F}\left\{ E_{F}\left[ h\left( X_{1}, \ldots , X_{s}\right) \mid X_{1}=x_{1}\right] \right\} \). If \(v=E(|h |^{3})<\infty \) and \(\zeta _{1}>0\), then

$$\begin{aligned} \sup _{-\infty<t<\infty }\bigg |{\text {pr}}\bigg (\frac{n^{1 / 2}(U_{n}-\theta )}{s \zeta _{1}^{1 / 2}} \le t\bigg )-\Phi (t)\bigg |\le C v(s^{2} \zeta _{1})^{-3 / 2} n^{-1 / 2}, \end{aligned}$$

where C is an absolute constant.

Proof

With the notation of the proof of Theorem 6, let \(\mu _4=m^{4} E[(Z-E(Z))^4]\) and \(\mu _{4,n}^*={\hat{m}}_{n}^{4} \sum _{i=1}^{n}p_{i}(\xi _{i}^{(n)}-E(M_n^{mse}))^{4}\) be the fourth central moments of F and \({\hat{F}}_{mse,n}\), respectively. Denote by \(\sigma ^2\) and \(\sigma _n^{*2}\) the variances of F and \({\hat{F}}_{mse,n}\), respectively. Thus,

$$\begin{aligned} \mu _{4,n}^{*}-\sigma _n^{*4}&={\hat{m}}_{n}^{4}\sum _{i=1}^{n}p_{i}\bigl (\xi _{i}^{(n)}-E(M_n^{mse})\bigr )^{4}-\biggl ({\hat{m}}_n^2\sum _{i=1}^{n}p_i\bigl (\xi _{i}^{(n)}-E(M^{mse}_{n})\bigr )^2\biggr )^2\\&={\hat{m}}_{n}^{4}\Biggl [\sum _{i=1}^{n}p_{i}\bigl (\xi _{i}^{(n)}-E(M_n^{mse})\bigr )^{4}-\biggl (\sum _{i=1}^{n}p_i\bigl (\xi _{i}^{(n)}-E(M^{mse}_{n})\bigr )^2\biggr )^2\Biggr ]. \end{aligned}$$

Since \(M_n^{mse}\) is discrete, it follows that \(\sum _{i=1}^{n}p_{i}(\xi _{i}^{(n)}-E(M_n^{mse}))^{4}-\left( \sum _{i=1}^{n}p_i(\xi _{i}^{(n)}-E(M^{mse}_{n}))^2\right) ^2>0\) for all \(n>2\) due to Lemma 7.

If \(E_F(|X |^{6+\epsilon })<\infty \) with some \(\epsilon >0\), then \(E_J(|Z|^{6+\epsilon })<\infty \) as well. According to Theorem 2, we have

$$\begin{aligned} \lim _{n\rightarrow \infty } \sum _{i=1}^{n} p_{i}(\xi _{i}^{(n)}-E(M_n^{mse}))^{4}=E[(Z-E(Z))^4]. \end{aligned}$$

Since \({\hat{m}}_n\) is a maximum likelihood estimator of m, \({\hat{m}}_n \rightarrow m\) almost surely, and hence \(\mu _{4,n}^{*} \rightarrow \mu _4\) almost surely. Since \(\text {var}(W^{mse}_{n}) \rightarrow m^2\text {var}(Z)=\text {var}(X)\) almost surely as \(n \rightarrow \infty \), it follows that

$$\begin{aligned} \lim _{n \rightarrow \infty } (\mu _{4,n}^{*} - \sigma _n^{*4})=\mu _{4}-\sigma ^{4} \ \text {almost surely}. \end{aligned}$$
(B13)

Similarly to the proof of Theorem 5, because F is continuous in the setting of this theorem, we have \(0<\mu _{4}-\sigma ^{4}<\infty \). Then, according to Lemma 6, we obtain

$$\begin{aligned} n^{1/2}(S^2-\sigma ^2) \rightarrow N(0,\mu _4-\sigma ^4) \text { in distribution as } n \rightarrow \infty . \end{aligned}$$
(B14)

By using the triangle inequality, we obtain

$$\begin{aligned} \begin{aligned}&\rho _{\infty }\bigl \{{\text {pr}}\bigl (n^{1 / 2}\bigl (S^{2}-\sigma ^{2}\bigr ) \le t\bigr ), {\text {pr}}^{*}\bigl (n^{1 / 2}\bigl ({\hat{S}}^{* 2}-\sigma ^{* 2}\bigr ) \le t\bigr )\bigr \} \\&\quad \le \rho _{\infty }\biggl \{{\text {pr}}\biggl ( \frac{n^{1 / 2}(S^{2}-\sigma ^{2})}{(\mu _{4}-\sigma ^{4})^{1 / 2}} \le t\biggr ) , \Phi (t) \biggr \}\\&\qquad + \rho _{\infty }\biggl \{{\text {pr}}^{*}\biggl (\frac{n^{1 / 2}({\hat{S}}^{* 2}-\sigma ^{* 2})}{(\mu _{4,n}^{*}-\sigma _n^{* 4})^{1 / 2}} \le t\biggr ) , \Phi (t) \biggr \} \\&\qquad + \rho _{\infty }\biggl \{\Phi \bigg (\frac{t}{(\mu _{4}-\sigma ^{4})^{1 / 2}}\bigg ), \Phi \bigg (\frac{t}{(\mu _{4,n}^{*}-\sigma _n^{* 4})^{1 / 2}}\bigg )\biggr \}. \end{aligned} \end{aligned}$$
(B15)

From (B13), the third term on the right-hand side of (B15) converges to 0 almost surely as \(n \rightarrow \infty \). From (B14), the first term on the right-hand side of (B15) converges to 0 as \(n \rightarrow \infty \).

For the sample variance of a random variable X, the kernel function \(h(x_1,x_2)=(x_1-x_2)^2/2\) has \(s=2\) arguments, and since \(E[h(X_1,X_2)\mid X_1=x_1]=[(x_1-\mu )^2+\sigma ^2]/2\), we obtain \(\zeta _{1}=\text {var}[(X_1-\mu )^2]/4=(\mu _4-\sigma ^4)/4\). By Lemma 8,

$$\begin{aligned} \rho _{\infty }\bigg \lbrace {\text {pr}}^{*}\bigg ( \frac{n^{1/2}({\hat{S}}^{*2}-\sigma ^{*2}) }{(\mu _{4,n}^*-\sigma _n^{*4})^{1/2}} \le t\bigg ), \Phi (t)\bigg \rbrace \le C v^{*}(\mu _{4,n}^{*}-\sigma _n^{* 4})^{-3 / 2} n^{-1 / 2}, \end{aligned}$$

where C is an absolute constant and

$$\begin{aligned} v^{*}=E_{{\hat{F}}_{mse,n}}( |h(W_{1}, W_{2})|^{3}) =\frac{1}{8}{\hat{m}}_{n}^{6}E_{J_{mse,n}}\big [ (M_1 -M_2)^6\big ] \end{aligned}$$

with \(W_1, W_2 \overset{iid}{\sim }{\hat{F}}_{mse,n}\) and \(M_1, M_2 \overset{iid}{\sim }J_{mse,n}\).

If \(E_J(|Z|^{6+\epsilon })<\infty \) with some \(\epsilon >0\), then \(E_{J_{mse,n}}[(M_1)^6]\) is bounded uniformly in n; indeed, Lemma 1 (or Theorem 2) gives \(E_{J_{mse,n}}[(M_1)^6]\le E_J(Z^6)< \infty \). Applying the Minkowski inequality, we have

$$\begin{aligned} E[(M_1-M_2)^6]^{1/6} \le E[(M_1)^6]^{1/6}+ E[(M_2)^6]^{1/6}<\infty . \end{aligned}$$

Since \({\hat{m}}_n\) is a maximum likelihood estimator of m, it follows that \({\hat{m}}_n\rightarrow m\) almost surely as \(n\rightarrow \infty \). Therefore, the second term on the right-hand side of (B15) converges to 0 almost surely as \(n \rightarrow \infty \). \(\square \)

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article

Cite this article

Xu, LH., Li, Y. & Fang, KT. The resampling method via representative points. Stat Papers (2024). https://doi.org/10.1007/s00362-024-01536-2
