Jackknife variance estimation for general two-sample statistics and applications to common mean estimators under ordered variances

Abstract

We study the jackknife variance estimator for a general class of two-sample statistics. As a concrete application, we consider samples with a common mean but possibly different, ordered variances, as they arise in fields such as interlaboratory experiments, field studies, or the analysis of sensor data. Estimators for the common mean under ordered variances typically employ random weights, which depend on the sample means and the unbiased variance estimators. These estimators take different forms depending on whether the sample estimates agree with the order constraints, which complicates even basic analyses such as estimating their variance. We propose to use the jackknife, whose consistency is established for general smooth two-sample statistics induced by continuously Gâteaux- or Fréchet-differentiable functionals and, more generally, for asymptotically linear two-sample statistics, allowing us to study a large class of common mean estimators. Furthermore, it is shown that the common mean estimators under consideration satisfy a central limit theorem (CLT). We investigate the accuracy of the resulting confidence intervals by simulations and illustrate the approach by analyzing several data sets.
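As a concrete illustration of the setting, the following minimal Python sketch computes a common mean estimate with random Graybill–Deal-type weights and its delete-one jackknife variance. The particular weight function, the fallback weight used when the empirical variances violate the assumed order, and the convention of summing the two within-sample jackknife contributions are illustrative assumptions, not necessarily the exact definitions studied in the paper.

import numpy as np

def common_mean(x1, x2):
    # Weighted common mean with a Graybill-Deal-type random weight when the
    # sample variances respect the assumed order sigma_1^2 <= sigma_2^2,
    # and a simple (illustrative) fallback weight otherwise.
    m1, m2 = x1.mean(), x2.mean()
    v1, v2 = x1.var(ddof=1), x2.var(ddof=1)   # unbiased variance estimators
    n1, n2 = len(x1), len(x2)
    if v1 <= v2:
        w = (n1 / v1) / (n1 / v1 + n2 / v2)   # Graybill-Deal weight
    else:
        w = n1 / (n1 + n2)                    # fallback if the order is violated
    return w * m1 + (1.0 - w) * m2

def jackknife_variance(x1, x2, estimator=common_mean):
    # Delete-one jackknife: leave out one observation at a time from each
    # sample and sum the two within-sample jackknife contributions.
    n1, n2 = len(x1), len(x2)
    rep1 = np.array([estimator(np.delete(x1, i), x2) for i in range(n1)])
    rep2 = np.array([estimator(x1, np.delete(x2, j)) for j in range(n2)])
    return ((n1 - 1) / n1 * np.sum((rep1 - rep1.mean()) ** 2)
            + (n2 - 1) / n2 * np.sum((rep2 - rep2.mean()) ** 2))

# Example: common mean 0 and ordered variances sigma_1^2 < sigma_2^2.
rng = np.random.default_rng(1)
x1 = rng.normal(0.0, 1.0, size=50)
x2 = rng.normal(0.0, 2.0, size=70)
print(common_mean(x1, x2), np.sqrt(jackknife_variance(x1, x2)))

The square root of the jackknife variance can be combined with a normal quantile to form the confidence intervals whose accuracy is investigated in the paper.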



Acknowledgements

This work was supported by JSPS Kakenhi grants #JP26330047 and #JP8K11196. Parts of this paper were written during research visits of the first author to Mejiro University, Tokyo; he is grateful for the warm hospitality. Both authors thank Hideo Suzuki, Keio University, Yokohama, for invitations to his research seminar, and Nobuo Shinozaki, Takahisa Iida, Shun Matsuura, and the seminar participants for comments and discussion. The authors gratefully acknowledge the support of Prof. Takenori Takahashi, Mejiro University and Keio University Graduate School, and Akira Ogawa, Mejiro University, for providing and discussing the chip manufacturing data. They would also like to thank the anonymous referees for their helpful comments.

Author information

Corresponding author

Correspondence to Ansgar Steland.


Appendices

Appendix A: Miscellaneous results

The following result establishes the asymptotic linearity of the sample variance and provides several higher-order convergence rates of the remainder term under moment conditions on the random variables.

Lemma A.1

Let \(\xi _1, \dots , \xi _n\) be i.i.d. random variables with \(E |\xi _1|^{12} < \infty\). Then, the sample variance \(S_n^2 = \frac{1}{n}\sum _{i=1}^n (\xi _i - \overline{\xi })^2\) is asymptotically linear:

$$\begin{aligned} S_n^2 - \sigma ^2 = \frac{1}{n} \sum _{i=1}^n ({\widetilde{ \xi }}_i^2 - \sigma ^2) - R_n, \end{aligned}$$
(32)

where \({\widetilde{ \xi }}_i = \xi _i - E( \xi _i )\), \(1 \le i \le n\), and the remainder term \(R_n = ( \frac{1}{n} \sum _{i=1}^n {\widetilde{ \xi }}_i )^2\) satisfies the following:

$$\begin{aligned} E ( n R_n^6 ) = O(n^{-5}), \quad E( n R_n^4 ) = O( n^{-3} ), \quad E( n R_n^2 ) = O(n^{-1}), \end{aligned}$$
(33)

and

$$\begin{aligned} E( R_n - R_{n-1} )^2 = O( n^{-3} ), \ E( R_n - R_{n-1} )^4 = O(n^{-6} ), \end{aligned}$$
(34)

as \(n \rightarrow \infty\). The assertions also hold for the unbiased variance estimator \({\widetilde{ S }}_n^2 = \frac{n}{n-1} S_n^2\).

Proof of Lemma A.1

A direct calculation shows that

$$\begin{aligned} S_n^2 - \sigma ^2&= \frac{1}{n} \sum _{i=1}^n ( {\widetilde{ \xi }}_i^2 - \sigma ^2 ) - ( \overline{{\widetilde{ \xi }}} )^2, \end{aligned}$$

with \(R_n = ( \overline{{\widetilde{ \xi }}} )^2\). Using the fact that \(E( \sum _{i=1}^n {\widetilde{ \xi }}_i )^r = O( n^{r/2} )\), if \(E | \xi _i |^r < \infty\), the estimates in (33) follow easily, that is:

$$\begin{aligned} E( n R_n^2 )= & {} n n^{-4} E\left( \sum _{i=1}^n {\widetilde{ \xi }}_i \right) ^{4} = O(n^{-1}), \\ E( n R_n^4 )= & {} O( n^{-3} ) \end{aligned}$$

and

$$\begin{aligned} E( n R_n^6 ) = n n^{-12} E\left( \sum _{i=1}^n {\widetilde{ \xi }}_i \right) ^{12} = O( n^{-5} ). \end{aligned}$$
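For completeness, the moment fact used here can be verified directly; for instance, for \(r = 4\) (the higher-order cases are analogous but lengthier), independence and \(E( {\widetilde{ \xi }}_i ) = 0\) imply that all terms of the expansion containing some \({\widetilde{ \xi }}_i\) to the first power have zero expectation, so that

$$\begin{aligned} E\left( \sum _{i=1}^n {\widetilde{ \xi }}_i \right) ^{4} = n E( {\widetilde{ \xi }}_1^4 ) + 3 n(n-1) \left( E {\widetilde{ \xi }}_1^2 \right) ^2 = O( n^{2} ) = O( n^{4/2} ). \end{aligned}$$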

To show (34), we observe that

$$\begin{aligned} R_n - R_{n-1}&= \left( \frac{1}{n} \sum _{j=1}^n {\widetilde{ \xi }}_j \right) ^2 - \left( \frac{1}{n-1} \sum _{j=1}^{n-1} {\widetilde{ \xi }}_j \right) ^2 \\&=\left( \frac{1}{n^2} - \frac{1}{(n-1)^2} \right) \sum _{j,j'<n} {\widetilde{ \xi }}_j {\widetilde{ \xi }}_{j'} + \frac{2}{n^2} \sum _{j=1}^n {\widetilde{ \xi }}_n {\widetilde{ \xi }}_j. \end{aligned}$$

By virtue of the \(C_r\)-inequality:

$$\begin{aligned} E( R_n - R_{n-1} )^r \le 2^{r-1} \left\{ E \left( \sum _{j,j'=1}^{n-1} \frac{-2n+1}{n^2(n-1)^2} {\widetilde{ \xi }}_j {\widetilde{ \xi }}_{j'} \right) ^r + E \left( \frac{2}{n^2} \sum _{j=1}^n {\widetilde{ \xi }}_n {\widetilde{ \xi }}_j \right) ^r \right\} , \end{aligned}$$

for \(r = 2, 4\). Observe that

$$\begin{aligned} E \left( \frac{2}{n^2} \sum _{j=1}^n {\widetilde{ \xi }}_n {\widetilde{ \xi }}_j \right) ^2&\le \frac{4}{n^4} \sqrt{ E {\widetilde{ \xi }}_n^4 } \sqrt{ E \left( \sum _{j=1}^n {\widetilde{ \xi }}_j \right) ^4 } = O( n^{-3} ) \end{aligned}$$

and

$$\begin{aligned} E \left( \sum _{j,j'=1}^{n-1} \frac{-2n+1}{n^2(n-1)^2} {\widetilde{ \xi }}_j {\widetilde{ \xi }}_{j'} \right) ^2&= \left( \frac{-2n+1}{n^2(n-1)^2} \right) ^2 E \left( \sum _{j=1}^{n-1} {\widetilde{ \xi }}_j \right) ^4 = O( n^{-4} ), \end{aligned}$$

such that

$$\begin{aligned} E( R_n - R_{n-1})^2 = O( n^{-3} ). \end{aligned}$$

Similarly:

$$\begin{aligned} E \left( \frac{2}{n^2} \sum _{j=1}^n {\widetilde{ \xi }}_n {\widetilde{ \xi }}_j \right) ^4&\le \frac{2^4}{n^8} \sqrt{ E {\widetilde{ \xi }}_n^8 } \sqrt{ E \left( \sum _{j=1}^n {\widetilde{ \xi }}_j \right) ^8 } = O( n^{-8} ) O( n^2 ) = O(n^{-6}) \end{aligned}$$

and

$$\begin{aligned} E \left( \sum _{j,j'=1}^{n-1} \frac{-2n+1}{n^2(n-1)^2} {\widetilde{ \xi }}_j {\widetilde{ \xi }}_{j'} \right) ^4&= \left( \frac{-2n+1}{n^2(n-1)^2} \right) ^4 E \left( \sum _{j=1}^{n-1} {\widetilde{ \xi }}_j \right) ^8 \\&= O( n^{-12} ) O( n^4 ) = O(n^{-8}) \end{aligned}$$

leading to

$$\begin{aligned} E(R_n - R_{n-1})^4 = O(n^{-6} ). \end{aligned}$$

The additional arguments to treat \({\widetilde{ S }}_n^2\) are straightforward and omitted. \(\square\)
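Concerning the omitted step, note that \({\widetilde{ S }}_n^2 = \frac{n}{n-1} S_n^2\) yields the identity

$$\begin{aligned} {\widetilde{ S }}_n^2 - \sigma ^2 = ( S_n^2 - \sigma ^2 ) + \frac{ S_n^2 }{ n-1 }, \end{aligned}$$

so that the remainder term only changes by the additional term \(S_n^2/(n-1)\); under \(E|\xi _1|^{12} < \infty\), the moments of this term and of its successive differences are easily seen to satisfy the same bounds (33) and (34).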

Lemma A.2

Suppose that \(X_{ij}\), \(j = 1, \dots , n_i\), \(i = 1, 2\), are i.i.d. within each sample with common mean \(\mu\), finite eighth moments, and strictly ordered variances \(\sigma _1^2 < \sigma _2^2\). Then

$$\begin{aligned} P( S_1^2 > S_2^2 ) \le C n^{-2}, \end{aligned}$$

for some constant C.

Proof

By Lemma A.1, we have the following:

$$\begin{aligned} S_{i}^2 - \sigma _i^2 = L_{ni} - R_{ni}, \end{aligned}$$

with \(L_{ni} = \frac{1}{n_i} \sum _{j=1}^{n_i} [ ( X_{ij}- \mu )^2 - \sigma _i^2 ]\) and \(R_{ni} = ( \frac{1}{n_i} \sum _{j=1}^{n_i} (X_{ij}-\mu ) )^2\), for \(i = 1, 2\). For \(r = 1, 2\), we have the following:

$$\begin{aligned} E | L_{ni} |^{2r} = O( n_i^{-r} ) \qquad \text {and} \qquad E | R_{ni} |^{2r} = O(n_i^{-2r}). \end{aligned}$$
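For instance, for \(r = 2\), the standard moment bound for centered i.i.d. sums gives, using the finite eighth moments,

$$\begin{aligned} E | L_{ni} |^{4} = \frac{1}{n_i^4} E \left( \sum _{j=1}^{n_i} [ ( X_{ij}- \mu )^2 - \sigma _i^2 ] \right) ^4 = O( n_i^{-2} ), \qquad E | R_{ni} |^{4} = \frac{1}{n_i^8} E \left( \sum _{j=1}^{n_i} ( X_{ij}- \mu ) \right) ^8 = O( n_i^{-4} ). \end{aligned}$$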

Hence

$$\begin{aligned} \Vert S_1^2 - S_2^2 - (\sigma _1^2-\sigma _2^2) \Vert _{L_4} \le \Vert S_1^2 - \sigma _1^2 \Vert _{L_4} + \Vert S_2^2 - \sigma _2^2 \Vert _{L_4} = O( n_1^{-1/2} ) + O( n_2^{-1/2} ) = O( n^{-1/2} ). \end{aligned}$$

Now, we may conclude that:

$$\begin{aligned} P( S_1^2 > S_2^2 )&\le P( |S_1^2 - S_2^2 - (\sigma _1^2 - \sigma _2^2) | \ge \sigma _2^2 - \sigma _1^2 ) \\&\le \frac{ E | S_1^2 - \sigma _1^2 - (S_2^2 - \sigma _2^2) |^4 }{ | \sigma _2^2 - \sigma _1^2 |^4 } \\&\le \frac{ ( \Vert S_1^2 - \sigma _1^2 \Vert _{L_4} + \Vert S_2^2 - \sigma _2^2 \Vert _{L_4} )^4 }{ | \sigma _2^2 - \sigma _1^2 |^4 } \\&= O( n^{-2} ). \end{aligned}$$

\(\square\)

Lemma A.3

If \(\xi _1, \dots , \xi _{n_1}\) are i.i.d. with finite second moment and \(\max _{1 \le i \le n_1} E| {\widehat{ \xi }}_i - \xi _i | = o(1)\), then

$$\begin{aligned} E \left| \frac{1}{n_1} \sum _{j=1}^{n_1} {\widehat{ \xi }}_j^2 - \frac{1}{n_1} \sum _{j=1}^{n_1} \xi _j^2 \right| = o(1), \end{aligned}$$

and

$$\begin{aligned} E \left| \left( \frac{1}{n_1} \sum _{j=1}^{n_1} {\widehat{ \xi }}_j \right) ^2 - \left( \frac{1}{n_1} \sum _{j=1}^{n_1} \xi _j \right) ^2 \right| = o(1), \end{aligned}$$

as \(n_1 \rightarrow \infty\).

Proof

We have

$$\begin{aligned} E \left| \frac{1}{n_1} \sum _{j=1}^{n_1} {\widehat{ \xi }}_j^2 - \frac{1}{n_1} \sum _{j=1}^{n_1} \xi _j^2 \right| \le \frac{1}{n_1} \sum _{j=1}^{n_1} E|{\widehat{ \xi }}_j^2 - \xi _j^2| = o(1), \end{aligned}$$

as \(n_1 \rightarrow \infty\), which shows the first assertion. Similarly:

$$\begin{aligned} E | {\widehat{ \xi }}_i {\widehat{ \xi }}_j - \xi _i \xi _j | = E| (\xi _i + A_n )(\xi _j + A_n) - \xi _i \xi _j | \le E|\xi _i A_n| + E |\xi _j A_n| + E( A_n^2 ) = o(1), \end{aligned}$$

such that the squared sample moments converge in \(L_1\), as well, since

$$\begin{aligned} E \left| \left( \frac{1}{n_1} \sum _{j=1}^{n_1} {\widehat{ \xi }}_j \right) ^2 - \left( \frac{1}{n_1} \sum _{j=1}^{n_1} \xi _j \right) ^2 \right| \le \frac{1}{n_1^2} \sum _{i,j=1}^{n_1} E | {\widehat{ \xi }}_i {\widehat{ \xi }}_j - \xi _i \xi _j | = o(1), \end{aligned}$$

as \(n_1 \rightarrow \infty\), follows. \(\square\)

Lemma A.4

Suppose that \(Z_1, Z_2, \dots\) are i.i.d. with \(E( Z_1^2 )< \infty\) and \(\xi _1, \xi _2, \dots\) are random variables with \(\max _{1 \le i \le n} | Z_i - \xi _i | = o(1)\), as \(n \rightarrow \infty\), a.s. Then, \(\frac{1}{n} \sum _{i=1}^n \xi _i \rightarrow E(Z_1)\), a.s., and \(\frac{1}{n} \sum _{i=1}^n \xi _i^2 \rightarrow E( Z_1^2 )\), a.s.

Proof

First notice that, by the strong law of large numbers:

$$\begin{aligned} \left| \frac{1}{n} \sum _{i=1}^n \xi _i - E(Z_1) \right|&= \left| \frac{1}{n} \sum _{i=1}^n ( Z_i - E(Z_1) )+ \frac{1}{n} \sum _{i=1}^n (\xi _i - Z_i) \right| \\&\le \left| \frac{1}{n} \sum _{i=1}^n ( Z_i - E(Z_1) ) \right| + \max _{1 \le i \le n} | Z_i - \xi _i |, \\&= o(1), \end{aligned}$$

as \(n \rightarrow \infty\), a.s. For the sample moments of order 2, we have the estimate:

$$\begin{aligned} \left| \frac{1}{n} \sum _{i=1}^n Z_i^2 - \frac{1}{n} \sum _{i=1}^n \xi _i^2 \right|&\le \left| \frac{1}{n} \sum _{i=1}^n | Z_i - \xi _i | (|Z_i| + |\xi _i|) \right| \\&\le \max _{1 \le i \le n} | Z_i - \xi _i | \left( \frac{1}{n} \sum _{i=1}^n | Z_i | + \frac{1}{n} \sum _{i=1}^n | \xi _i | \right) \\&= \max _{1 \le i \le n} | Z_i - \xi _i | ( 2 E | Z_1 | + o(1) ) \\&= o(1), \end{aligned}$$

as \(n \rightarrow \infty\). \(\square\)

Appendix B: Proof of Theorem 4.1

In the sequel, \(\Vert \cdot \Vert _2\) denotes the vector-2 norm, whereas \(\Vert \cdot \Vert _{L_p}\) stands for the \(L_p\)-norm of a random variable, i.e., \(\Vert Z \Vert _{L_p} = \left( E |Z|^p \right) ^{1/p}\).

As discussed in Remark 3.3, we let:

$$\begin{aligned} {\widehat{ \theta }}_N = (N/n, {\widetilde{ S }}_1^2, {\widetilde{ S }}_2^2, \overline{X}_1, \overline{X}_2 )', \qquad \theta = (\lambda _1, \sigma _1^2, \sigma _2^2, \mu _1, \mu _2 )'. \end{aligned}$$
(35)

Also notice that \(\mu _1 = \mu _2 = \mu\). We have the following:

$$\begin{aligned} \varphi ( {\widehat{ \theta }}_N ) := {\widehat{ \mu }}_N = \gamma _N \overline{X}_1 + (1- \gamma _N) \overline{X}_2 = \phi ( {\widehat{ \theta }}_N ) + \psi ( {\widehat{ \theta }}_N ), \end{aligned}$$
(36)

where

$$\begin{aligned} \phi ( {\widehat{ \theta }}_N )&= \gamma ^\le ( {\widehat{ \theta }}_N) \overline{X}_1 {\mathbf {1}}_{\{ {\widetilde{ S }}_1^2 \le {\widetilde{ S }}_2^2 \}} + \gamma ^>( {\widehat{ \theta }}_N) \overline{X}_1 {\mathbf {1}}_{\{ {\widetilde{ S }}_1^2> {\widetilde{ S }}_2^2 \}}, \\ \psi ( {\widehat{ \theta }}_N )&= [1-\gamma ^\le ({\widehat{ \theta }}_N) ] \overline{X}_2 {\mathbf {1}}_{\{ {\widetilde{ S }}_1^2 \le {\widetilde{ S }}_2^2 \}} + [1-\gamma ^>({\widehat{ \theta }}_N) ] \overline{X}_2 {\mathbf {1}}_{\{ {\widetilde{ S }}_1^2 > {\widetilde{ S }}_2^2 \}}. \end{aligned}$$

Observe that

$$\begin{aligned} \phi ( \theta )&= \gamma ^{\le }( \theta ) \mu _1 {\mathbf {1}}_{ \{ \sigma _1^2 \le \sigma _2^2 \} } + \gamma ^{>}( \theta ) \mu _1 {\mathbf {1}}_{ \{ \sigma _1^2> \sigma _2^2 \} }, \\ \psi ( \theta )&= [1-\gamma ^{\le }(\theta )]\mu _2 {\mathbf {1}}_{\{\sigma _1^2\le \sigma _2^2\}} + [1-\gamma ^{>}(\theta )] \mu _2 {\mathbf {1}}_{\{\sigma _1^2>\sigma _2^2\}}, \end{aligned}$$

such that, especially, the function \(\varphi\) satisfies the following:

$$\begin{aligned} \varphi ( \theta )&= \gamma ^\le (\theta ) \mu _1 {\mathbf {1}}_{\{ \sigma _1^2 \le \sigma _2^2 \}} + \gamma ^>(\theta ) \mu _1 {\mathbf {1}}_{\{ \sigma _1^2> \sigma _2^2 \}} \\&\quad + [1-\gamma ^\le (\theta ) ] \mu _2 {\mathbf {1}}_{\{ \sigma _1^2 \le \sigma _2^2 \}} + [1-\gamma ^>(\theta ) ] \mu _2 {\mathbf {1}}_{\{ \sigma _1^2 > \sigma _2^2 \}}. \end{aligned}$$

Observe that

$$\begin{aligned} \varphi ( \theta ) = \mu , \qquad \hbox { if}\ \mu _1 = \mu _2 = \mu . \end{aligned}$$

We show the result in detail for \(\theta\) with \(\sigma _1^2 < \sigma _2^2\); the other case (\(\sigma _1^2 > \sigma _2^2\)) is treated analogously and we only indicate some essential steps. Furthermore, we only discuss \(\phi ( {\widehat{ \theta }}_N)\), since \(\psi ( {\widehat{ \theta }}_N)\) can be handled similarly. We have to consider the following:

$$\begin{aligned} \phi ({\widehat{ \theta }}_N ) - \phi ( \theta )&= \gamma ^\le ( {\widehat{ \theta }}_N ) \overline{X}_1 {\mathbf {1}}_{\{ {\widetilde{ S }}_1^2 \le {\widetilde{ S }}_2^2\}} + \gamma ^>( {\widehat{ \theta }}_N ) \overline{X}_1 {\mathbf {1}}_{\{ {\widetilde{ S }}_1^2 > {\widetilde{ S }}_2^2 \}} \\&\quad - \gamma ^\le ( \theta ) \mu _1, \end{aligned}$$

which can be rewritten as follows:

$$\begin{aligned} \phi ( {\widehat{ \theta }}_N ) - \phi ( \theta )&= [ \gamma ^\le ( {\widehat{ \theta }}_N ) \overline{X}_1 - \gamma ^\le ( \theta ) \mu _1 ] {\mathbf {1}}_{\{ {\widetilde{ S }}_1^2 \le {\widetilde{ S }}_2^2\}} \\&\quad + [ \gamma ^>( {\widehat{ \theta }}_N ) \overline{X}_1 - \gamma ^>( \theta ) \mu _2 ] {\mathbf {1}}_{\{ {\widetilde{ S }}_1^2> {\widetilde{ S }}_2^2 \}} \\&\quad - \gamma ^\le ( \theta ) \mu _1 {\mathbf {1}}_{\{ {\widetilde{ S }}_1^2> {\widetilde{ S }}_2^2\}} + \gamma ^>( \theta ) \mu _2 {\mathbf {1}}_{\{ {\widetilde{ S }}_1^2 > {\widetilde{ S }}_2^2 \}}. \end{aligned}$$

The first term is the leading one, whereas the last two terms are of the order \(1/N^2\) in mean square, since

$$\begin{aligned} E ( \gamma ^\le ( \theta ) \mu _1 {\mathbf {1}}_{\{ {\widetilde{ S }}_1^2> {\widetilde{ S }}_2^2 \}} )^2 \le \Vert \gamma ^\le \Vert _\infty ^2 \mu _1^2 P( {\widetilde{ S }}_1^2 > {\widetilde{ S }}_2^2 ), \end{aligned}$$

and for \(\sigma _1^2 < \sigma _2^2\), we have the following:

$$\begin{aligned} P( {\widetilde{ S }}_1^2 \ge {\widetilde{ S }}_2^2 ) = O\left( \frac{1}{N^2} \right) \end{aligned}$$
(37)

by Lemma A.2. By symmetry, we also have the following:

$$\begin{aligned} P( {\widetilde{ S }}_1^2 < {\widetilde{ S }}_2^2 ) = O(1/N^2), \qquad \hbox { if}\ \sigma _1^2 > \sigma _2^2 . \end{aligned}$$
(38)

By our assumptions on \(\gamma ^\le\), we have for the leading term of \(\phi ( {\widehat{ \theta }}_N ) - \phi ( \theta )\):

$$\begin{aligned} \gamma ^\le ( {\widehat{ \theta }}_N ) \overline{X}_1 - \gamma ^\le ( \theta )\mu _1&= \nabla [\gamma ^\le ( \theta ) \mu _1] ({\widehat{ \theta }}_N - \theta ) + \frac{1}{2} ({\widehat{ \theta }}_N - \theta )' \Delta [ \gamma ^\le (\theta ) \mu _1]( {\widehat{ \theta }}_N^* ) ( {\widehat{ \theta }}_N - \theta ), \end{aligned}$$

which can be written as follows:

$$\begin{aligned} \sum _{i=1}^5 \frac{ \partial [\gamma ^\le (\theta )\mu _1] }{ \partial \theta _i } ({\widehat{ \theta }}_{Ni} - \theta _i) + \frac{1}{2} \sum _{i,j=1}^5 \partial _{ij} [\gamma ^\le (\theta ) \mu _1]( {\widehat{ \theta }}_N^* ) ( {\widehat{ \theta }}_{Ni} - \theta _i )( {\widehat{ \theta }}_{Nj} - \theta _j ), \end{aligned}$$
(39)

for some \({\widehat{ \theta }}_N^*\) between \({\widehat{ \theta }}_N\) and \(\theta\), where \({\widehat{ \theta }}_N = ( {\widehat{ \theta }}_{Ni} )_{i=1}^5\) and \(\theta = (\theta _i)_{i=1}^5\). \(\nabla [\gamma ^\le ( \theta ) \mu _1]\) denotes the gradient of the function \(\bar{\theta } \mapsto \gamma ^\le (\bar{\theta }) \bar{\theta }_4\), \(\bar{\theta } =(\bar{\theta }_1, \dots , \bar{\theta }_5)' \in \Theta\), evaluated at (the true value) \(\theta\), and \(\partial _{ij} [\gamma ^\le \mu _1]( {\widehat{ \theta }}_N^* )\) stands for the second partial derivative with respect to \(\theta _i\) and \(\theta _j\) evaluated at a point \({\widehat{ \theta }}_N^*\). Observe that the linear term in the above expansion is asymptotically linear, i.e., it can be written as follows:

$$\begin{aligned} \nabla [ \gamma ^{\le }( \theta ) \mu _1 ]( {\widehat{ \theta }}_N - \theta ) = \frac{1}{N} \sum _{j=1}^N \left\{ \sum _{i=1}^5 \frac{\partial [\gamma ^\le (\theta )\mu _1] }{ \partial \theta _i} \xi _{ij} \right\} + R_N^{\gamma ^\le }, \quad N E( R_N^{\gamma ^\le } )^2 = o(1), \end{aligned}$$

for appropriate random variables \(\xi _{ij}\), since the coordinates are asymptotically linear, that is:

$$\begin{aligned} {\widehat{ \theta }}_{Ni} - \theta _i = \frac{1}{N} \sum _{j=1}^N \xi _{ij} + R_{Ni}, \quad \text {with} \quad N E(R_{Ni}^2) = o(1). \end{aligned}$$

This follows from the facts that \(\overline{X}_i\), \(i = 1, 2\), are linear statistics and \({\widetilde{ S }}_i^2 - \sigma _i^2\), \(i = 1, 2\), are asymptotically linear by virtue of Lemma A.1. Thus, it suffices to estimate the second-order terms of the Taylor expansion. First, observe that
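For orientation, the product rule makes this gradient explicit: with \(e_4\) denoting the fourth standard basis vector of \({\mathbb {R}}^5\) (the \(\mu _1\)-coordinate),

$$\begin{aligned} \nabla [ \gamma ^{\le }( \theta ) \mu _1 ] = \mu _1 \nabla \gamma ^{\le }( \theta ) + \gamma ^{\le }( \theta ) e_4', \end{aligned}$$

which is consistent with the coordinate-wise derivatives computed at the end of this proof.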

$$\begin{aligned} N E( {\widehat{ \theta }}_{Ni} - \theta _i )^4 = N E \left| \frac{1}{N} \sum _{j=1}^N \xi _{ij} + R_{Ni} \right| ^4 = O(1/N), \end{aligned}$$
(40)

for \(i = 1, \dots , 5\), such that

$$\begin{aligned} E \Vert {\widehat{ \theta }}_N - \theta \Vert _2^4 = O(1/N^2), \end{aligned}$$
(41)

where \(\Vert \cdot \Vert _2\) denotes the vector-2 norm. For the arithmetic means (40) follows, because the remainder term vanishes and \(E(\sum _{j=1}^N \xi _{ij} )^r = O(N^{r/2})\) if \(E|\xi _{ij}|^r < \infty\), and for the estimators \({\widetilde{ S }}_i^2\) because of Lemma A.1. The quadratic terms can be estimated as follows:

$$\begin{aligned}&N E \left( \partial _{ij} [\gamma ^\le (\theta ) \mu _1] ( {\widehat{ \theta }}_N^* ) ( {\widehat{ \theta }}_{Ni} - \theta _i )( {\widehat{ \theta }}_{Nj} - \theta _j ) \right) ^2 \\&\quad \le \Vert \partial _{ij} [\gamma ^\le (\theta ) \mu _1]\Vert _\infty ^2 N E\left( | {\widehat{ \theta }}_{Ni} - \theta _i |^2 | {\widehat{ \theta }}_{Nj} - \theta _j |^2 \right) \\&\quad = O\left( \sqrt{ N E | {\widehat{ \theta }}_{Ni} - \theta _i |^4 } \sqrt{ N E | {\widehat{ \theta }}_{Nj} - \theta _j |^4 } \right) \\&\quad = O( 1/N ). \end{aligned}$$

Here, \(\Vert x \Vert _\infty = \max _{1\le i \le \ell } | x_i |\) denotes the \(l_\infty\)-norm of a vector \(x = (x_1, \dots , x_\ell )' \in {\mathbb {R}}^{\ell }\). Therefore, we obtain the estimate:

$$\begin{aligned} N E \left( \frac{1}{2} \sum _{i,j=1}^5 \partial _{ij} [\gamma ^\le (\theta ) \mu _1]( {\widehat{ \theta }}_N^* ) ( {\widehat{ \theta }}_{Ni} - \theta _i )( {\widehat{ \theta }}_{Nj} - \theta _j ) \right) ^2 = O( 1/ N ) \end{aligned}$$

for the quadratic term of the above Taylor expansion. Switching back to the definition (35) of \({\widehat{ \theta }}_N\) and \(\theta\), the above arguments show that, for ordered variances, \(\sigma _1^2 < \sigma _2^2\):

$$\begin{aligned} \phi ( {\widehat{ \theta }}_N ) - \phi ( \theta ) = \nabla [\gamma ^\le ( \theta ) \mu _1] ({\widehat{ \theta }}_N - \theta ) {\mathbf {1}}_{\{ {\widetilde{ S }}_1^2 \le {\widetilde{ S }}_2^2 \}} + \nabla [\gamma ^>( \theta ) \mu _2] ({\widehat{ \theta }}_N - \theta ) {\mathbf {1}}_{\{ {\widetilde{ S }}_1^2 > {\widetilde{ S }}_2^2 \}} + R_N^\phi , \end{aligned}$$

for some remainder \(R_N^\phi\) with \(N E( R_N^\phi )^2 = O(1/N)\). The treatment of \(\psi ( {\widehat{ \theta }}_N ) - \psi (\theta )\) is similar and leads to the following:

$$\begin{aligned} \psi ( {\widehat{ \theta }}_N ) - \psi ( \theta ) = \nabla [(1-\gamma ^\le ( \theta )) \mu _2] ({\widehat{ \theta }}_N - \theta ) {\mathbf {1}}_{\{ {\widetilde{ S }}_1^2 \le {\widetilde{ S }}_2^2 \}} + \nabla [(1-\gamma ^>( \theta )) \mu _1] ({\widehat{ \theta }}_N - \theta ) {\mathbf {1}}_{\{ {\widetilde{ S }}_1^2 > {\widetilde{ S }}_2^2 \}} + R_N^\psi , \end{aligned}$$

for some remainder term \(R_N^\psi\) with \(N E( R_N^\psi )^2 = O(1/N)\). Putting things together and collecting terms lead to the following:

$$\begin{aligned} {\widehat{ \mu }}_N(\gamma ) - \mu&= \left[ \left( \nabla [ \gamma ^\le (\theta ) \mu _1] + \nabla [(1-\gamma ^\le (\theta )) \mu _2 ] \right) ({\widehat{ \theta }}_N - \theta ) {\mathbf {1}}_{\{ {\widetilde{ S }}_1^2 \le {\widetilde{ S }}_2^2 \}} \right. \\&\left. \quad + \left( \nabla [\gamma ^>(\theta ) \mu _2] + \nabla [(1-\gamma ^>(\theta )) \mu _1] \right) ({\widehat{ \theta }}_N - \theta ) {\mathbf {1}}_{\{ {\widetilde{ S }}_1^2 > {\widetilde{ S }}_2^2 \}} \right] + R_N^\gamma , \end{aligned}$$

for a remainder \(R_N^\gamma\) with \(N E( R_N^\gamma )^2 = O(1/N)\). Replacing the indicators \({\mathbf {1}}_{\{ {\widetilde{ S }}_1^2 \le {\widetilde{ S }}_2^2 \}}\) and \({\mathbf {1}}_{\{ {\widetilde{ S }}_1^2 > {\widetilde{ S }}_2^2 \}}\) by \({\mathbf {1}}_{\{ \sigma _1^2 \le \sigma _2^2 \}} (=1)\) and \({\mathbf {1}}_{\{ \sigma _1^2 > \sigma _2^2 \}} (=0)\), respectively, adds additional correction terms \(R_N^{(1)}\) and \(R_N^{(2)}\) with \(N E( R_N^{(i)} )^2 = o(1)\), \(i = 1, 2\). This can be seen as follows: we have, for instance:

$$\begin{aligned} \nabla [\gamma ^\le ( \theta )\mu _1] {\mathbf {1}}_{\{ {\widetilde{ S }}_1^2 \le {\widetilde{ S }}_2^2 \}} ( {\widehat{ \theta }}_N - \theta ) = \nabla [\gamma ^\le ( \theta ) \mu _1] {\mathbf {1}}_{\{ \sigma _1^2 \le \sigma _2^2 \}} ( {\widehat{ \theta }}_N - \theta ) + U_N, \end{aligned}$$

where

$$\begin{aligned} U_N = \nabla [\gamma ^\le ( \theta ) \mu _1] ( {\mathbf {1}}_{\{ {\widetilde{ S }}_1^2 \le {\widetilde{ S }}_2^2 \}} - {\mathbf {1}}_{\{ \sigma _1^2 \le \sigma _2^2 \}} ) ( {\widehat{ \theta }}_N - \theta ). \end{aligned}$$

Observe that, for \(\sigma _1^2 < \sigma _2^2\), we have \(J = | {\mathbf {1}}_{ \{ {\widetilde{ S }}_1^2 \le {\widetilde{ S }}_2^2 \}} - {\mathbf {1}}_{ \{ \sigma _1^2 \le \sigma _2^2 \} } | = {\mathbf {1}}_{ \{ {\widetilde{ S }}_1^2 > {\widetilde{ S }}_2^2 \}}\). If we put \(C = \max _j \sup _\theta \left| \frac{ \partial [\gamma ^\le (\theta ) \mu _1 ] }{ \partial \theta _j } \right| < \infty\), we may estimate the following:

$$\begin{aligned} E(U_N^2)&= E \left( \nabla [ \gamma ^\le ( \theta ) \mu _1 ] ({\widehat{ \theta }}_N - \theta ) {\mathbf {1}}_{ \{ {\widetilde{ S }}_1^2> {\widetilde{ S }}_2^2 \}} \right) ^2 \\&\le E \left( \sum _{j=1}^5 \left| \frac{ \partial [\gamma ^\le (\theta ) \mu _1 ] }{ \partial \theta _j } \right| | {\widehat{ \theta }}_{Nj} - \theta _j | {\mathbf {1}}_{ \{ {\widetilde{ S }}_1^2> {\widetilde{ S }}_2^2 \}} \right) ^2 \\&\le 25 C^2 E \left( \max _j | {\widehat{ \theta }}_{Nj} - \theta _j |^2 {\mathbf {1}}_{ \{ {\widetilde{ S }}_1^2> {\widetilde{ S }}_2^2 \}} \right) \\&\le 25 C^2 \sqrt{ E \sum _{j=1}^5 | {\widehat{ \theta }}_{Nj} - \theta _j |^4 } \sqrt{ P( {\widetilde{ S }}_1^2 > {\widetilde{ S }}_2^2 ) }. \end{aligned}$$

Using

$$\begin{aligned} E \sum _{j=1}^5 | {\widehat{ \theta }}_{Nj} - \theta _j |^4 = O(1/N^2) \end{aligned}$$

and \(P( {\widetilde{ S }}_1^2 > {\widetilde{ S }}_2^2 ) = O( 1/N^2 )\) (see (41) and (37), respectively), we therefore arrive at

$$\begin{aligned} N E( U_N )^2 = O( 1/N ). \end{aligned}$$

If, on the contrary, \(\sigma _1^2 > \sigma _2^2\), we observe that \(J = {\mathbf {1}}_{ \{ {\widetilde{ S }}_1^2 \le {\widetilde{ S }}_2^2 \}}\), and similar arguments combined with (38) imply the following:

$$\begin{aligned} N E( U_N )^2 = O( 1/N ). \end{aligned}$$

Putting things together, we, therefore, arrive at assertion (i):

$$\begin{aligned} {\widehat{ \mu }}_N(\gamma ) - \mu&= \left[ (\nabla [ \gamma ^\le (\theta ) \mu ] + \nabla [(1-\gamma ^\le (\theta )) \mu ]) {\mathbf {1}}_{\{ \sigma _1^2 < \sigma _2^2 \}} \right. \\&\left. \quad + (\nabla [\gamma ^>(\theta ) \mu ] + \nabla [(1-\gamma ^>(\theta )) \mu ]) {\mathbf {1}}_{\{ \sigma _1^2 > \sigma _2^2 \}} \right] ( {\widehat{ \theta }}_N - \theta ) + R_N^\gamma , \end{aligned}$$

since \(\mu _1 = \mu _2 = \mu\) by assumption.

It remains to discuss the form of the leading term when \(\mu = \mu _1 = \mu _2\) by calculating the partial derivatives. Observe that

$$\begin{aligned} \frac{ \partial [ \gamma ^\le (\theta ) \mu _1 + (1-\gamma ^\le (\theta )) \mu _2 ] }{ \partial \sigma _i^2 } = \frac{ \partial \gamma ^\le ( \theta ) }{ \partial \sigma _i^2 } \mu _1 - \frac{ \partial \gamma ^\le ( \theta ) }{ \partial \sigma _i^2 } \mu _2, \end{aligned}$$

and, thus, vanishes if \(\mu _1 = \mu _2\). Hence, the asymptotically linear statistic governing \({\widehat{ \mu }}_N(\gamma ) - \mu\) does not depend on the sample variances. Furthermore:

$$\begin{aligned} \frac{ \partial [\gamma ^\le (\theta ) \mu _1 + (1-\gamma ^\le (\theta )) \mu _2 ] }{ \partial \mu _1 } = \frac{ \partial \gamma ^\le (\theta )}{ \partial \mu _1 } \mu _1 + \gamma ^\le ( \theta ) - \frac{ \partial \gamma ^\le (\theta ) }{ \partial \mu _1 } \mu _2 \end{aligned}$$

and

$$\begin{aligned} \frac{ \partial [\gamma ^\le (\theta ) \mu _1 + (1-\gamma ^\le (\theta )) \mu _2 ] }{ \partial \mu _2 } = \frac{ \partial \gamma ^\le (\theta )}{ \partial \mu _2 } \mu _1 - \frac{ \partial \gamma ^\le (\theta ) }{ \partial \mu _2 } \mu _2 - \gamma ^\le (\theta ) +1. \end{aligned}$$

Hence, under the common mean constraint \(\mu _1 = \mu _2\), those two partial derivatives are given by \(\gamma ^\le (\theta )\) and \(1-\gamma ^\le (\theta )\), respectively. Thus, we may conclude that:

$$\begin{aligned} {\widehat{ \mu }}_N(\gamma ) - \mu = \gamma (\theta ) ( \overline{X}_1 - \mu ) + (1-\gamma (\theta )) ( \overline{X}_2 - \mu ) + R_N, \end{aligned}$$

with \(N E( R_N^2 ) = O(1/N)\), which verifies (ii).
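In particular, if the two samples are independent and \(n_i/N \rightarrow \lambda _i \in (0,1)\), \(i = 1, 2\), assertion (ii) combined with the classical central limit theorem for the sample means yields, as a worked consequence for orientation (the CLT itself is established in the main text),

$$\begin{aligned} \sqrt{N} ( {\widehat{ \mu }}_N(\gamma ) - \mu ) \rightarrow N\left( 0, \gamma (\theta )^2 \frac{\sigma _1^2}{\lambda _1} + (1-\gamma (\theta ))^2 \frac{\sigma _2^2}{\lambda _2} \right) \end{aligned}$$

in distribution, as \(N \rightarrow \infty\), since \(\sqrt{N} R_N \rightarrow 0\) in mean square.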

(iii) Assertion (ii) shows that \({\widehat{ \mu }}_N(\gamma )\) is an asymptotically linear two-sample statistic with kernels \(h_1(x) = \gamma (\theta ) (x-\mu _1)\) and \(h_2(x) = (1-\gamma (\theta )) (x-\mu _2)\), which satisfy that \(\int | h_i(x) |^4 \, {\text {d}} F_i(x) < \infty\), \(i = 1, 2\), and a remainder term \(R_N\) with \(N E(R_N^2) = o(1)\), as \(N \rightarrow \infty\), such that (8) holds. It remains to verify (9). Our starting point is the Taylor expansion (39), where we have to study the second sum. First observe that replacing \(\partial _{ij}[\gamma ^{\le }(\theta )\mu _1]({\widehat{ \theta }}_N^*)\) by \(\partial _{ij}[\gamma ^{\le }(\theta )\mu _1](\theta )\) gives error terms \(F_{N,ij}\) satisfying the following:

$$\begin{aligned} E( F_{N,ij}^2 )&= E[\{ \partial _{ij}[\gamma ^{\le }(\theta )\mu _1]({\widehat{ \theta }}_N^*) - \partial _{ij}[\gamma ^{\le }(\theta )\mu _1](\theta ) \} ({\widehat{ \theta }}_{Ni} - \theta _i )({\widehat{ \theta }}_{Nj}-\theta _j)]^2 \\&\le \sqrt{ E \left( \partial _{ij}[\gamma ^{\le }(\theta ) \mu _1 ] \left. \right| _{z=\theta }^{z={\widehat{ \theta }}_N^*} \right) ^4 } \sqrt{ E( {\widehat{ \theta }}_{Ni} - \theta _i )^4 ( {\widehat{ \theta }}_{Nj} - \theta _j )^4 }. \end{aligned}$$

The first factor is \(o(1)\) by the dominated convergence theorem, since the integrand is continuous, bounded, and converges to 0, as \(N \rightarrow \infty\), a.s. Furthermore:

$$\begin{aligned} E( ( {\widehat{ \theta }}_{Ni} - \theta _i )^4 ( {\widehat{ \theta }}_{Nj} - \theta _j )^4 ) \le \sqrt{ E( {\widehat{ \theta }}_{Ni} - \theta _i )^8 } \sqrt{ E( {\widehat{ \theta }}_{Nj} - \theta _j )^8 } = O(N^{-4}), \end{aligned}$$

such that

$$\begin{aligned} N^2 E( F_{N,ij}^2 ) = o(1) N^2 O(N^{-2}) = o(1), \end{aligned}$$
(42)

as \(N \rightarrow \infty\), follows. In what follows, denote by \({\widetilde{ R }}_N\) the second term in (39) with \({\widehat{ \theta }}_N^*\) replaced by \(\theta\), that is:

$$\begin{aligned} {\widetilde{ R }}_N = \frac{1}{2} \sum _{i,j=1}^5 \partial _{ij} [\gamma ^\le (\theta ) \mu _1]( \theta ) ( {\widehat{ \theta }}_{Ni} - \theta _i )( {\widehat{ \theta }}_{Nj} - \theta _j ), \end{aligned}$$

such that \(F_N = \frac{1}{2} \sum _{i,j=1}^5 F_{N,ij}\) satisfies

$$\begin{aligned} R_N = {\widetilde{ R }}_N + F_N. \end{aligned}$$

Clearly, (42) implies that

$$\begin{aligned} N^2 E( F_{N}^2 ) = o(1). \end{aligned}$$
(43)

To show

$$\begin{aligned} N^2 E( R_N - R_{N-1})^2 = o(1), \end{aligned}$$

as \(N \rightarrow \infty\), by virtue of (43), it suffices to show that

$$\begin{aligned} N^2 E({\widetilde{ R }}_N - {\widetilde{ R }}_{N-1})^2 = o(1), \end{aligned}$$
(44)

as \(N \rightarrow \infty\). Indeed, since we already know that \(N^2 E(R_N^2) = o(1)\), we have the following:

$$\begin{aligned} N^2 E({\widetilde{ R }}_N^2)&= N^2 E(R_N^2) + N^2 E( F_N^2 ) - 2 N^2 E( R_N F_N ) \end{aligned}$$

with \(N^2 E( F_N^2 ) = o(1)\) and

$$\begin{aligned} | N^2 E (R_N F_N) | \le \sqrt{ N^2 E( R_N^2 ) } \sqrt{ N^2 E( F_N^2 ) } = o(1), \end{aligned}$$

as \(N \rightarrow \infty\). Hence, \(N^2 E( {\widetilde{ R }}_N^2 ) = o(1)\), as \(N \rightarrow \infty\), as well. Due to the high rate of convergence of \(F_N\) in (43), it follows that \(N^2 E( F_N - F_{N-1} )^2 = o(1)\), as \(N \rightarrow \infty\), since

$$\begin{aligned} N^2 E( F_N - F_{N-1} )^2&\le N^2 \left( \sqrt{ E( F_N^2 ) } + \sqrt{ E( F_{N-1}^2 ) } \right) ^2 \\&= N^2 E( F_N^2 ) + N^2 E( F_{N-1}^2 ) + 2 \sqrt{ N^2 E( F_N^2 ) } \sqrt{ N^2 E( F_{N-1}^2 ) } \\&= o(1), \end{aligned}$$

as \(N \rightarrow \infty\). Now, it follows that (44) implies \(N^2E(R_N - R_{N-1})^2 = o(1)\), as \(N \rightarrow \infty\), since, expanding the square:

$$\begin{aligned} N^2 E( R_N - R_{N-1} )^2&= N^2 E( {\widetilde{ R }}_N - {\widetilde{ R }}_{N-1} + (F_N - F_{N-1}) )^2 \\&= N^2 E( {\widetilde{ R }}_N - {\widetilde{ R }}_{N-1} )^2 + N^2 E( F_N - F_{N-1} )^2 \\&\quad + 2 N^2 E( {\widetilde{ R }}_N - {\widetilde{ R }}_{N-1} )( F_N - F_{N-1} ), \end{aligned}$$

where

$$\begin{aligned} | N^2 E( {\widetilde{ R }}_N - {\widetilde{ R }}_{N-1} )( F_N - F_{N-1} ) |&\le N^2 \sqrt{ E( {\widetilde{ R }}_N - {\widetilde{ R }}_{N-1} )^2 } \sqrt{ E( F_N - F_{N-1} )^2 } \\&\le \sqrt{ N^2 E( {\widetilde{ R }}_N - {\widetilde{ R }}_{N-1} )^2 } \sqrt{ N^2 E( F_N - F_{N-1})^2 } \\&= o(1), \end{aligned}$$

as \(N \rightarrow \infty\). This verifies that it is sufficient to show (44). As the sum is finite and the second-order derivatives \(\partial _{ij}[ \gamma ^{\le }( \theta ) \mu _1 ]( \theta )\) do not depend on N and are bounded, (44) follows, if we establish, for each pair \(i, j \in \{ 1, \dots , 5 \}\):

$$\begin{aligned} N^2 E[({\widehat{ \theta }}_{Ni} - \theta _i ) ({\widehat{ \theta }}_{Nj} - \theta _j ) - ({\widehat{ \theta }}_{N-1,i} - \theta _i ) ({\widehat{ \theta }}_{N-1,j} - \theta _j ) ]^2 = o(1), \end{aligned}$$
(45)

as \(N \rightarrow \infty\). We shall make use of the decomposition \({\widehat{ \theta }}_{Ni} - \theta _i = L_{Ni} + R_{Ni}\) into a linear statistic and an additional remainder term (if \({\widehat{ \theta }}_{Ni}\) is not a sample mean), for \(i = 1, \dots , 5\). (45) follows, if we prove that

$$\begin{aligned} N^2 E( L_{Ni} L_{Nj} - L_{N-1,i} L_{N-1,j} )^2 =&o(1), \end{aligned}$$
(46)
$$\begin{aligned} N^2 E( L_{Ni} R_{Nj} - L_{N-1,i} R_{N-1,j} )^2 =&o(1), \end{aligned}$$
(47)
$$\begin{aligned} N^2 E( R_{Ni} R_{Nj} - R_{N-1,i} R_{N-1,j} )^2 =&o(1), \end{aligned}$$
(48)

as \(N \rightarrow \infty\), for \(i, j = 1, \dots , 5\). Those estimates will follow from the following estimates for \(L_{Ni}\) and \(R_{Ni}\):

$$\begin{aligned} E( L_{Ni}^4 )&= O(1/N^2), \end{aligned}$$
(49)
$$\begin{aligned} E( L_{Ni} - L_{N-1,i} )^4&= O(1/N^4),\end{aligned}$$
(50)
$$\begin{aligned} E( R_{Ni}^4 )&= O(1/N^4), \end{aligned}$$
(51)
$$\begin{aligned} E(R_{Ni} - R_{N-1,i})^2&= O(1/N^3), \end{aligned}$$
(52)
$$\begin{aligned} E( R_{Ni} - R_{N-1,i} )^4&= O( 1/N^6 ), \end{aligned}$$
(53)

as \(N \rightarrow \infty\), for \(i, j = 1, \dots , 5\). (50) holds, because for a linear statistic, say, \(L_N = N^{-1} \sum _{i=1}^N \xi _i\), with i.i.d. mean zero summands \(\xi _1, \dots , \xi _N\) with finite fourth moment, the formula

$$\begin{aligned} L_{N} - L_{N-1} = \left( \frac{1}{N} - \frac{1}{N-1} \right) \sum _{i<N} \xi _i + \frac{\xi _N}{N} \end{aligned}$$

leads to the estimate

$$\begin{aligned} \Vert L_{N} - L_{N-1} \Vert _{L_4} \le \frac{1}{N(N-1)} \left\| \sum _{i<N} \xi _i \right\| _{L_4} + \frac{ ( E( \xi _1^4 ) )^{1/4} }{ N } = O( 1/N ), \end{aligned}$$

since \(E( \sum _{i<N} \xi _i )^4 = O(N^2)\). For the sample means, the remainder term vanishes, such that (51)–(53) hold trivially, and for the sample variances (51)–(53) are shown in Lemma A.1. Now, (46)–(48) can be shown as follows. For each pair \(i, j \in \{ 1, \dots , 5 \}\), the identity

$$\begin{aligned} L_{Ni} R_{Nj} - L_{N-1,i} R_{N-1,j} = (L_{Ni} - L_{N-1,i}) R_{Nj} + L_{N-1,i}(R_{Nj} - R_{N-1,j} ) \end{aligned}$$

yields

$$\begin{aligned} \Vert L_{Ni} R_{Nj} - L_{N-1,i} R_{N-1,j} \Vert _{L_2}&\le \Vert L_{Ni} - L_{N-1,i} \Vert _{L_4} \Vert R_{Nj} \Vert _{L_4} \\& \quad + \Vert L_{N-1,i} \Vert _{L_4} \Vert R_{Nj} - R_{N-1,j} \Vert _{L_4} \\&= O(1/N) O(1/N) + O(1/\sqrt{N}) O( 1 / N^{3/2} ) \\&= O(1 / N^2 ), \end{aligned}$$

such that

$$\begin{aligned} E( L_{Ni}R_{Nj} - L_{N-1,i} R_{N-1,j} )^2 = O(1/N^4), \end{aligned}$$

which verifies (47). Similarly, we have the following:

$$\begin{aligned} L_{Ni} L_{Nj} - L_{N-1,i} L_{N-1,j} = (L_{Ni} - L_{N-1,i}) L_{Nj} + L_{N-1,i}( L_{Nj} - L_{N-1,j} ), \end{aligned}$$

where

$$\begin{aligned} E( (L_{Ni} - L_{N-1,i})^2 L_{Nj}^2 )&\le \sqrt{ E( L_{Ni} - L_{N-1,i} )^4 } \sqrt{ E( L_{Nj}^4 ) } \\&= O(1/N^2) O(1/N) \\&= O(1/N^3), \end{aligned}$$

which leads to (46). Finally, (48) follows analogously from (51) and (53), since \(\Vert R_{Ni} R_{Nj} - R_{N-1,i} R_{N-1,j} \Vert _{L_2} \le \Vert R_{Ni} - R_{N-1,i} \Vert _{L_4} \Vert R_{Nj} \Vert _{L_4} + \Vert R_{N-1,i} \Vert _{L_4} \Vert R_{Nj} - R_{N-1,j} \Vert _{L_4} = O( 1/N^{5/2} )\). Combining those results shows that the remainder term, \(R_N\), also satisfies the condition (9). Therefore, consistency and asymptotic unbiasedness of the jackknife variance estimator follow from Theorem 2.1. The proof is complete. \(\square\)


Cite this article

Steland, A., Chang, Y. Jackknife variance estimation for general two-sample statistics and applications to common mean estimators under ordered variances. Jpn J Stat Data Sci 2, 173–217 (2019). https://doi.org/10.1007/s42081-019-00034-2


Keywords

  • Common mean
  • Data science
  • Central limit theorem
  • Gâteaux derivative
  • Fréchet differentiability
  • Graybill–Deal estimator
  • Jackknife
  • Order constraint
  • Resampling