1 Introduction

Testing for normality is a topic of interest that has generated and is still generating a vast literature. Some recent contributions are Ebner et al. (2022), Henze and Jiménez-Gamero (2019), Henze et al. (2019), Henze and Koch (2020), and Jelito and Pitera (2021); see the paper by Ebner and Henze (2020) for a review on normality tests. Most papers on this issue deal with testing normality for a sample, and the properties of the proposed procedures are stated as the sample size increases. This paper studies the problem of simultaneously testing normality of k univariate samples, where k can increase with the sample sizes. Moreover, k will be allowed to be even larger than the sample sizes. Specifically, we will consider the following general setting:

$$\begin{aligned}&\text {Let } \textbf{X}_1=\{X_{1,1}, \ldots , X_{1, n_1}\}, \ldots , \textbf{X}_k=\{X_{k,1}, \ldots , X_{k, n_k}\} \text { be } k \text { independent samples}\\&\text {with sizes } n_1, \ldots , n_k, \text { which may be different, coming from } X_1, \ldots , X_k \in {\mathbb {R}},\\&\text {with continuous distribution functions } F_1, \ldots , F_k, \text { respectively, and } E(X_j^2)<\infty ,\ 1\le j \le k . \end{aligned}$$
(1)

The elements of each sample in (1) are assumed to be independent. In this setting, we deal with the problem of testing

$$\begin{aligned} H_0: \, F_1, \ldots , F_k \in {\mathcal {N}}, \end{aligned}$$

against general alternatives,

$$\begin{aligned} H_1: \, F_j \notin {\mathcal {N}}, \text{ for } \text{ some } 1 \le j \le k, \end{aligned}$$

where \({\mathcal {N}}\) is the set of univariate normal populations, \({\mathcal {N}}=\{ N(\mu , \sigma ^2), \, \mu \in {\mathbb {R}}, \, \sigma >0\}\) and, as said before, k is allowed to be large (the precise meaning of “large” will be stated in the following sections).

The main motivation for testing normality comes from the fact that, under this distributional assumption, many statistical procedures become simpler and more efficient than their non-parametric counterparts, mainly due to the good properties of the normal law. Nevertheless, the efficiency of those procedures may decrease, or even disappear, if the normality assumption fails. For example, if one can assume that the populations are normal, then the classical k-sample problem becomes that of testing the equality of variances and the equality of means, for which more tests can be found in the statistical literature than for testing the equality of the k distributions when k is large (Zhan and Hart 2014; Jiménez-Gamero et al. 2022). As another instance, consider the problem of testing the equality of the means of a large number k of univariate normal populations that may have different variances. Park and Park (2012) proposed two tests for this problem, whose associated statistics, conveniently normalized, are asymptotically normal; here, asymptotic means as \(k\rightarrow \infty \). The assumption that all populations are normally distributed is crucial for deriving the asymptotic distribution of those test statistics. In fact, the simulations in Jiménez-Gamero and Franco-Pereira (2021) show that, when the data meet the normality assumption, these tests can be more powerful than nonparametric competitors but, when data come from non-normal populations, the empirical type I errors of the tests in Park and Park (2012) can be far from the nominal value.

The problem of simultaneously testing goodness-of-fit for k populations has been studied in Gaigall (2021) by using test statistics based on comparing the empirical distribution function of each sample with a parametric estimator derived under the null hypothesis. The asymptotic properties studied in Gaigall (2021) are for fixed k and increasing sample sizes. Jiménez-Gamero et al. (2005) studied the problem of testing normality of the errors in multivariate, homoscedastic linear models. The test statistic in Jiménez-Gamero et al. (2005) is based on comparing the empirical characteristic function (ECF) of the studentized residuals with the characteristic function (CF) of a standard normal law. The asymptotic properties studied in Jiménez-Gamero et al. (2005) allow k to increase with the sample sizes in such a way that \(k^2/n=o(1)\), where n is the sample size. Specifically, they show that the asymptotic null distribution of the considered test statistic coincides with that derived for independent, identically distributed (iid) data in Baringhaus and Henze (1988) and Henze and Wagner (1997). The normality test in Baringhaus and Henze (1988) and Henze and Wagner (1997) is usually called the BHEP test, since it was first proposed for the univariate case by Epps and Pulley (1983), and then extended to the multivariate case by Baringhaus and Henze (1988); moreover, due to its nice properties, it has been extended in several directions: to testing normality of the errors in homoscedastic linear models in Jiménez-Gamero et al. (2005), as explained before; to testing normality of the errors in nonparametric regression in Hušková and Meintanis (2010) and Rivas-Martínez and Jiménez-Gamero (2018); to testing normality of the innovations in GARCH models in Jiménez-Gamero (2014), Klar et al. (2012); and to testing Gaussianity of random elements taking values in a Hilbert space in Henze and Jiménez-Gamero (2021), just to cite a few.

In this paper we first study the test in Jiménez-Gamero et al. (2005) for testing \(H_0\), without assuming that the populations are homoscedastic. Not assuming homoscedasticity greatly complicates the theoretical derivations since, instead of estimating one variance from the pooled data, we must now deal with k variance estimators. With this aim, it will be assumed that the sample sizes are comparable in the following sense:

$$\begin{aligned} n_i=c_im, \quad 0<c_0 \le c_i \le C_0<\infty , \quad \forall i, \quad \text{ for } \text{ some } \text{ fixed } \text{ constants } \, c_0\, \text{ and }\, C_0 . \end{aligned}$$
(2)

It is shown that the asymptotic null distribution of the test statistic also coincides with that derived for iid data, whenever \(k/m=o(1)\). Notice that if the sample sizes satisfy (2), then the condition \(k/m\rightarrow 0 \) is equivalent to \(k^2/n\rightarrow 0 \).

Since the practical calculation of the BHEP test statistic involves \(O(n^2)\) sums (see Section 2), its computation can be rather time-consuming for large k. So, for the case \(k/m \rightarrow \ell \in (0, \infty ]\), we explore other strategies for testing \(H_0\). First, inspired by the random projection procedure in Cuesta-Albertos et al. (2006), we test \(H_0\) using not all data sets but a “small” number \(k_0\) of samples (small in the sense that \(k_0/m=o(1)\)) randomly selected from the k population samples. Second, we consider a test statistic which combines the BHEP test statistics calculated in each sample.

As said before, this paper studies BHEP-based statistics for testing \(H_0\). Other test statistics could be considered; the main reason for our choice is the good properties enjoyed by this test. This is why Section 2 starts by reviewing its definition and some properties. This section also derives new properties that will be used in Section 5. Specifically, it is shown that the first two moments of the null distribution of the BHEP test statistic converge to those of the asymptotic null distribution, and sufficient conditions are given for such convergence to hold under alternatives. Section 3 studies the test that compares the ECF of the studentized data with the CF of the standard normal law which, in our view, is the natural extension of the BHEP test statistic to the setting in (1). Section 4 studies the test in the previous section when it is calculated on a subset of randomly selected samples. Section 5 studies a test whose test statistic is based on the sum of the BHEP test statistics calculated in each sample. The properties investigated in Sects. 3–5 are asymptotic. In order to assess the finite sample performance of the proposals, a simulation study was carried out, whose results are reported in Sect. 6. Section 7 summarizes the paper and comments on extensions and further research. All proofs are deferred to the last section.

Throughout the paper we will make use of the following standard notation: \(\textrm{i}=\sqrt{-1}\) is the imaginary unit; for any complex number \(x=a+ \textrm{i}b \in {\mathbb {C}}\), with \(a, b \in {\mathbb {R}}\), \(\Re x=a\) denotes the real part and \(\Im x=b\) denotes the imaginary part; all random variables and random elements will be defined on a sufficiently rich probability space \((\Omega ,{{{\mathcal {A}}}},P)\); the symbols E and V denote expectation and variance, respectively; \(P_0\), \(E_0\) and \(V_0\) denote probability, expectation and variance under the null hypothesis, respectively; \(\overset{{\mathcal {D}}}{\rightarrow }\) means convergence in distribution of random vectors and random elements.

2 The BHEP test

2.1 The test

This section revisits the BHEP test for univariate data. Let \(X_1, \ldots , X_n \) (\(n\ge 2\)) be a sample from a random variable X with continuous distribution function F and \(E(X^2)<\infty \). For testing the hypothesis \(H_{0,1}: F \in {\mathcal {N}}\), the rationale of the BHEP test is as follows: write \({\overline{X}} = n^{-1}\sum _{j=1}^n X_j\) and \(S^2 = n^{-1}\sum _{j=1}^n (X_j-{\overline{X}})^2\) for the sample mean and the sample variance, respectively, and let \( Y_{j} = (X_j - {\overline{X}})/S\), \(1 \le j \le n\), be the so-called scaled residuals of \(X_1,\ldots ,X_n\), which provide an empirical standardization of \(X_1,\ldots ,X_n\). Notice that, under the assumptions made, \(P(S>0)=1\), and thus \(Y_{1}, \ldots , Y_{n}\) are well defined. Since, under \(H_{0,1}\) and for large n, the distribution of the scaled residuals should be close to the standard normal distribution, it is tempting to compare the ECF of \(Y_{1},\ldots ,Y_{n}\),

$$\begin{aligned} \varphi _n (t) = \frac{1}{n}\sum _{j=1}^n \exp (\textrm{i}t Y_{j}), \quad t \in {\mathbb {R}}, \end{aligned}$$

with \(\varphi _0(t)=\exp (-t^2/2)\), which is the CF of the standard normal distribution. The BHEP test rejects \(H_{0,1}\) for large values of the weighted \(L^2\)-statistic

$$\begin{aligned} {\mathcal {T}}_{n,\beta } = \int \left| \varphi _n(t)-\varphi _0(t)\right| ^2w_{\beta }(t)\, \textrm{d}t, \end{aligned}$$
(3)

where an unspecified integral stands for an integral over the whole real line, \(w_{\beta }(t)\) is the probability density function of the normal distribution N\((0,\beta ^2)\), and \(\beta >0\) is a parameter that must be fixed by the user. The test statistic \({\mathcal {T}}_{n,\beta }\) may be written as

$$\begin{aligned} \begin{array}{rcl} \displaystyle {\mathcal {T}}_{n,\beta } &{} = &{} \displaystyle \frac{1}{n^2} \sum _{j,k=1}^n \exp \left( - \frac{\beta ^2}{2} (Y_{j}- Y_{k})^2 \right) \\ &{} &{} \displaystyle - \frac{2}{n(1+\beta ^2)^{1/2}} \sum _{j=1}^n \exp \left( - \frac{\beta ^2}{2(1+\beta ^2)} Y_{j}^2 \right) + \frac{1}{(1+2\beta ^2)^{1/2}}, \end{array} \end{aligned}$$
(4)

which is a useful expression for the practical computation of \({\mathcal {T}}_{n,\beta } \). Notice that the computation of \({\mathcal {T}}_{n,\beta } \) involves a double sum, so the number of required calculations is of order \(O(n^2)\).
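
For illustration, the following R sketch evaluates (4) directly from a sample; bhep_stat is a hypothetical helper name, and the code is a minimal implementation of the formula above, not a routine used elsewhere in the paper.

```r
# Minimal sketch of the BHEP statistic in (4); 'bhep_stat' is a hypothetical name.
bhep_stat <- function(x, beta = 1) {
  n <- length(x)
  s <- sqrt(mean((x - mean(x))^2))   # sample standard deviation (denominator n)
  y <- (x - mean(x)) / s             # scaled residuals Y_1, ..., Y_n
  d2 <- outer(y, y, "-")^2           # matrix of (Y_j - Y_k)^2
  sum(exp(-beta^2 / 2 * d2)) / n^2 -
    2 / (n * sqrt(1 + beta^2)) * sum(exp(-beta^2 * y^2 / (2 * (1 + beta^2)))) +
    1 / sqrt(1 + 2 * beta^2)
}

# Example: T_{n,1} for a standard normal sample of size 50
set.seed(1)
bhep_stat(rnorm(50), beta = 1)
```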

Representation (4) also shows that \({\mathcal {T}}_{n,\beta }\) is a function of the products \(Y_{j}Y_{k} = (X_j-{\overline{X}})(X_k-{\overline{X}})/S^2\), \(1 \le j,k \le n\), and thus it is invariant with respect to affine transformations of \(X_1,\ldots ,X_n\). This property implies that the null distribution of \({\mathcal {T}}_{n,\beta }\) only depends on the sample size n, and on the value of \(\beta \).

Critical points for several sample sizes, \(\beta =1\) and the usual values for the probability of type I error (level) can be found in Baringhaus and Henze (1988) and Henze (1990). The function cv.quan of the package mnt (Butsch and Ebner 2020) of the R language (R Core Team 2020) can be used to calculate critical points of the null distribution of \(n{\mathcal {T}}_{n,\beta }\) for any sample size, any value of \(\beta \) and any level. The critical points can also be approximated by those of the asymptotic null distribution of \(n{\mathcal {T}}_{n,\beta }\). Under the null hypothesis, \(n{\mathcal {T}}_{n,\beta }\) is asymptotically (as \(n\rightarrow \infty \)) distributed as \(W_\beta = \sum _{\ell =1}^\infty \lambda _{\beta ,\ell } Z_\ell ^2\), where \(\lambda _{\beta ,1}, \lambda _{\beta ,2}, \ldots \) is the descending sequence of positive eigenvalues of a certain integral operator, and \(Z_1,Z_2,\ldots \) are independent standard normal random variables. Since those eigenvalues, and hence the cumulants of \(W_\beta \), can be estimated (see, for example, Ebner and Henze 2021, 2022; Meintanis et al. 2022), one could approximate the asymptotic critical values by using the Pearson system of distributions (with the help of the package PearsonDS (Becker and Klößner 2022) of the R language (R Core Team 2020)). This idea was proposed by Henze (1990), who (exactly) calculated the first four cumulants of \(W_\beta \) with \(\beta =1\).
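
Since, by affine invariance, the null distribution of \(n{\mathcal {T}}_{n,\beta }\) can be simulated from standard normal samples, critical points can also be approximated by plain Monte Carlo. A minimal sketch, reusing the hypothetical bhep_stat helper above (this is not the cv.quan routine of the mnt package):

```r
# Monte Carlo approximation of the alpha upper percentile of n * T_{n,beta}
# under H_{0,1}; by affine invariance it suffices to simulate N(0,1) samples.
# A sketch only (not the cv.quan routine of the 'mnt' package).
bhep_critical <- function(n, beta = 1, alpha = 0.05, B = 10000) {
  stats <- replicate(B, n * bhep_stat(rnorm(n), beta))
  unname(quantile(stats, probs = 1 - alpha))
}

bhep_critical(n = 20, beta = 1, alpha = 0.05)
```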

The BHEP test is consistent against any fixed alternative, and it is able to detect continuous alternatives converging to the null at the rate \(n^{-1/2}\); see Henze and Wagner (1997). Ebner and Henze (2021) and Meintanis et al. (2022) have obtained approximate Bahadur efficiencies of the BHEP test, showing that it outperforms tests based on the empirical distribution function for certain alternatives close to normality. All these properties are for \(n\rightarrow \infty \).

2.2 The mean and the variance of the BHEP test statistic

Henze and Wagner (1997) (see also Henze (1990) for \(\beta =1\)) have (exactly) calculated the mean and the second and third centred moments of \(W_\beta \). Specifically, for \(\beta =1\), the mean and the variance of the asymptotic null distribution of \(n{\mathcal {T}}_{n,1}\) are \(\mu _0=1-\sqrt{3}/2 \approx 0.13397\) and \(\sigma ^2_0=2/\sqrt{5}+5/6-155/(64\sqrt{2}) \approx 0.015236\), respectively. So, it is tempting to approximate \(\mu _{0,n}=E_0(n{\mathcal {T}}_{n,1})\) by means of \(\mu _0\) and \(\tau ^2_{0,n}=V_0(n{\mathcal {T}}_{n,1})\) by \(\sigma ^2_0\), where \(E_0\) and \(V_0\) are both understood under \(H_{0,1}\). The next proposition shows that those approximations are asymptotically valid, which implies that \(\{(n{\mathcal {T}}_{n,\beta })^2\}\) is uniformly integrable.

Proposition 1

Let \(X_1, \ldots , X_n\) be a random sample from \(X\sim N(0, 1)\). Let \({\mathcal {T}}_{n,\beta }\) be as defined in (3). Let \(\mu _{0,\beta }\) and \(\sigma _{0,\beta }^2\) denote the mean and the variance of the asymptotic distribution (as \(n\rightarrow \infty \)) of \(n{\mathcal {T}}_{n,\beta }\), respectively. Then, \(E_0(n{\mathcal {T}}_{n,\beta }) \rightarrow \mu _{0,\beta }\) and \(V_0(n{\mathcal {T}}_{n,\beta }) \rightarrow \sigma ^2_{0,\beta }\), as \(n \rightarrow \infty \).

As in the first paragraph of this subsection, when \(\beta =1\) we denote \(\mu _{0,\beta }\) and \(\sigma ^2_{0,\beta }\) by \(\mu _{0}\) and \(\sigma ^2_{0}\), respectively. We have numerically checked the approximations \(\mu _{0,n}\approx \mu _0\) and \(\tau _{0,n}^2\approx \sigma _0^2\) for \(\beta =1\) and finite sample sizes. For each n, the true values of \(\mu _{0,n}\) and \(\tau _{0,n}^2\) were calculated by simulation, based on 100,000 samples of size n from a standard normal law: \(n{\mathcal {T}}_{n,1}\) was computed for each sample, and then the sample mean and the sample variance of these 100,000 values were used to approximate \(\mu _{0,n}\) and \(\tau _{0,n}^2\), respectively. Figure 1 displays \(\mu _{0,n}\), together with the line \(y=\mu _0\) in red, and \(\tau _{0,n}^2\), together with the line \(y=\sigma _0^2\) in red, for \(5 \le n \le 100\). Looking at this figure one can see that the approximation for the mean, \(\mu _{0,n}\approx \mu _0\), is almost exact for \(n \ge 20\), and the approximation for the variance, \(\tau _{0,n}^2\approx \sigma _0^2\), works really well for \(n \ge 50\).

Fig. 1 True values for \(\mu _{0,n}\) (left panel) and \(\tau _{0,n}^2\) (right panel), \(5 \le n \le 100\). The line \(y=\mu _0\) (left panel) and the line \(y=\sigma _0^2\) (right panel) are in red

Now assume that \(E(X)=0\), \(V(X)=1\), and that the CF of X is \(\varphi _X \ne \varphi _0\), which is tantamount to \(\Delta _{X, \beta }=\int | \varphi _X(t)-\varphi _0(t)|^2w_{\beta }(t)\, \textrm{d}t>0\). In this setting, if \(E(X^2)<\infty \), the proof of Theorem 3.1 in Ebner and Henze (2021) shows that \(\mu _{n,\beta }=E({\mathcal {T}}_{n,\beta }) \rightarrow \Delta _{X,\beta }\) and \(V({\mathcal {T}}_{n,\beta }) \rightarrow 0\), as \(n \rightarrow \infty \). So, we can approximate \(\mu _{n, \beta }\approx \Delta _{X, \beta }\). Example 1 in Baringhaus et al. (2017) shows that if \(E(X^4)<\infty \), then \(\sqrt{n}({\mathcal {T}}_{n,\beta }-\Delta _{X,\beta })\overset{{\mathcal {D}}}{\rightarrow } N(0,\sigma ^2_{X, \beta })\), as \(n \rightarrow \infty \). The expression of \(\sigma ^2_{X, \beta }>0\) is given in Baringhaus et al. (2017). If, in addition, \(\{\big (\sqrt{n}({\mathcal {T}}_{n,\beta }-\Delta _{X,\beta })\big )^2\}\) is uniformly integrable, then we can approximate \(\tau ^2_{n,\beta }=V(\sqrt{n}{\mathcal {T}}_{n,\beta }) \approx \sigma ^2_{X, \beta }\). Similar steps to those given in the proof of Proposition 1 show that a sufficient condition for the uniform integrability of \(\{\big (\sqrt{n}({\mathcal {T}}_{n,\beta }-\Delta _{X,\beta })\big )^2\}\) is that

$$\begin{aligned} E\{|\sqrt{n}(S-1)|^{4+\delta }\}<\infty \quad \text{ and } \quad E\{|\sqrt{n}{\bar{X}}|^{4+\delta }\}<\infty , \end{aligned}$$
(5)

for some \(\delta >0\).

3 The BHEP test for \(H_0\)

The tests in Baringhaus and Henze (1988) and Epps and Pulley (1983) chose \(\beta =1\). In order to simplify notation, in our developments we will also choose \(\beta =1\), although all results remain true for arbitrary (but fixed) \(\beta \).

Let \(Y_{j,r}=(X_{j,r}- {\overline{X}}_j)/S_j\), \(1 \le r \le n_j\), \(1 \le j \le k\), where \({\overline{X}}_j\) and \(S_j^2\) stand for the sample mean and the sample variance of the sample from \(X_j\), respectively, \(1 \le j \le k\). As in the one-sample case, under \(H_{0}\), the distribution of the scaled residuals should be close to the standard normal distribution. So we consider as test statistic

$$\begin{aligned} T_{k,n}= \int \left| \varphi _n(t)-\varphi _0(t)\right| ^2w(t)\, \textrm{d}t, \end{aligned}$$

where now \(\varphi _n\) is the ECF of \(Y_{1,1}, \ldots , Y_{1,n_1}, \ldots , Y_{k,1}, \ldots , Y_{k,n_k}\),

$$\begin{aligned} \varphi _n(t)=\frac{1}{n} \sum _{j=1}^k\sum _{r=1}^{n_j} \exp (\textrm{i}t Y_{j,r}), \quad t \in {\mathbb {R}}, \end{aligned}$$

and \(w(t)= w_{1}(t)\) is the probability density function of the standard normal law. Notice that, to be more precise, the proposed test statistic should be denoted by \(T_{n_1, \ldots , n_k}\) but, to simplify notation, we just write \(T_{k,n}\), where \(n=\sum _{i=1}^kn_i\) is the total sample size. With this notation, the one-sample test statistic is \(T_{1,n}\).
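
Because \(T_{k,n}\) is the statistic (3) with \(\beta =1\) evaluated at the ECF of the pooled scaled residuals, representation (4) applies verbatim to those pooled residuals. A minimal R sketch based on this observation (Tkn is a hypothetical name):

```r
# Sketch of T_{k,n}: standardize within each sample, pool the scaled residuals
# and evaluate formula (4) with beta = 1 at the pooled residuals.
# 'Tkn' is a hypothetical name; 'samples' is a list of k numeric vectors.
Tkn <- function(samples) {
  y <- unlist(lapply(samples, function(x) {
    (x - mean(x)) / sqrt(mean((x - mean(x))^2))
  }))
  n <- length(y)                      # total sample size
  d2 <- outer(y, y, "-")^2            # O(n^2) memory and time, as noted above
  sum(exp(-d2 / 2)) / n^2 -
    2 / (n * sqrt(2)) * sum(exp(-y^2 / 4)) +
    1 / sqrt(3)
}

# Example: k = 5 samples of size 20 under H_0
set.seed(2)
Tkn(replicate(5, rnorm(20), simplify = FALSE))
```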

Since we are assuming that \(E(X_j^2)<\infty \), \(1\le j \le k\), we can write \(X_{j,r}=\mu _j+\sigma _jW_{j,r}\), where \(\mu _j=E(X_j)\), \(\sigma _j^2=V(X_j)\), \(E(W_{j,r})=0\), \(V(W_{j,r})=1\), \(1 \le r \le n_j\), \(1 \le j \le k\); moreover, the scaled residuals \(Y_{j,r}\) calculated from \(X_{j,1}, \ldots , X_{j,n_j}\) coincide with those calculated from \(W_{j,1}, \ldots , W_{j,n_j}\). Because of this reason, we can assume that \(E(X_j)=0\) and \(V(X_j)=1\), \(1 \le j \le k\). Accordingly, from now on, instead of (1), it will be assumed that the setting is as follows:

$$\begin{aligned}&\text {Let } \textbf{X}_1=\{X_{1,1}, \ldots , X_{1, n_1}\}, \ldots , \textbf{X}_k=\{X_{k,1}, \ldots , X_{k, n_k}\} \text { be } k \text { independent samples}\\&\text {with sizes } n_1, \ldots , n_k, \text { which may be different, coming from } X_1, \ldots , X_k,\\&\text {with continuous distribution functions } F_1, \ldots , F_k, \text { respectively, and}\\&E(X_j)=0 \text { and } V(X_j)=1,\ 1\le j \le k . \end{aligned}$$
(6)

Another consequence of assuming that \(E(X_j)=0\) and \(V(X_j)=1\), \(1 \le j \le k\), is that the null distribution of \(T_{k,n}\) only depends on the sample sizes \(n_1, \ldots , n_k\).

As in the one-sample case, rejection of the null hypothesis \(H_0\) is for large values of \(T_{k,n}\), say \(T_{k,n}>t_{k,n,\alpha }\) where \(t_{k,n,\alpha }\) is the \(\alpha \) upper percentile of the null distribution of \(T_{k,n}\). So, to test \(H_0\) we must calculate upper percentiles of the null distribution of \(T_{k,n}\). Although the critical points can be calculated by simulation, from a practical point of view it would be nice if they could be approximated in some fashion. The next result is useful in that sense.

Before stating it, we introduce some notation. Since \(w(t)=w(-t)\) we can write \(T_{k,n}= \int Z_{k,n}^2(t) w(t)\, \textrm{d}t,\) with

$$\begin{aligned} Z_{k,n}(t)= \frac{1}{n}\sum _{j,r}\cos (tY_{j,r})+ \frac{1}{n}\sum _{j,r}\sin (tY_{j,r})-\varphi _0(t), \quad t \in {\mathbb {R}}. \end{aligned}$$
(7)

Let \(L^2_w\) denote the separable Hilbert space of (equivalence classes of) measurable functions \(f:{\mathbb {R}} \mapsto {\mathbb {C}}\) satisfying \(\int |f(t)|^2 w(t)\textrm{d}t<\infty \). The scalar product and the resulting norm in \(L^2_w\) will be denoted by \( \langle f, g \rangle _w=\int f(t) \overline{g(t)} w(t)\textrm{d}t\) and \(\Vert f \Vert _w^2=\int |f(t)|^2 w(t)\textrm{d}t\), respectively. With this notation, \(T_{k,n}=\Vert Z_{k,n}\Vert ^2_w\).

Theorem 1

Suppose that (6) holds, that \(H_0\) is true, that the sample sizes satisfy (2), and that \(k/m\rightarrow 0 \), as \(m \rightarrow \infty \). Then \(nT_{k,n}\overset{{\mathcal {D}}}{\rightarrow } \Vert Z\Vert _w^2\), as \(m \rightarrow \infty \), where Z is a centred Gaussian random element of \(L^2_w\) having covariance kernel

$$\begin{aligned} C(t,s)=\exp \{-0.5(t-s)^2\}-(1+st+0.5s^2 t^2)\exp \{-0.5(t^2+s^2)\}, \quad t,s\in {\mathbb {R}}.\nonumber \\ \end{aligned}$$
(8)

Theorem 1 says that, under certain assumptions, the asymptotic null distribution of \(T_{k,n}\) coincides with that of the one-sample test statistic, \(T_{1,n}\) (see Henze and Wagner 1997). Therefore, at least for large n, we can approximate the percentiles of \(T_{k,n}\) either by those of \(T_{1,n}\) or by those of the asymptotic null distribution of \(T_{1,n}\). In both cases, as commented in Sect. 2, the percentiles can be calculated using packages of the R language.

Next we study the behavior of the test under alternatives. To this end, we first state the following result that gives the a.s. behavior of \(T_{k,n}\).

Theorem 2

Suppose that (6) holds, that \(X_1, \ldots , X_k\) have CFs \(\varphi _1, \ldots , \varphi _k\), respectively, that the sample sizes satisfy (2), and that \(k/m \rightarrow 0\), as \(m \rightarrow \infty \). Then

$$\begin{aligned} T_{k,n}- \Vert \varphi _{0,k}-\varphi _0\Vert _w^2 \overset{a.s.}{\rightarrow }\ 0,\end{aligned}$$

as \(m \rightarrow \infty \), where \(\varphi _{0,k}=(1/n)\sum _{j=1}^kn_j \varphi _j\).

Notice that if \(X_1, \ldots , X_k\) all have the same CF, say \(\varphi \), then \(\varphi _{0,k}=\varphi \) and, from Theorem 2, it follows that

$$\begin{aligned} T_{k,n} \overset{a.s.}{\rightarrow }\ \Vert \varphi -\varphi _0\Vert _w^2, \end{aligned}$$
(9)

as \(m\rightarrow \infty \). In particular, under \(H_0\) we have that \(T_{k,n} \overset{a.s.}{\rightarrow }\ 0.\)

Next we show that the power of the test goes to 1 for alternatives such that \(\Vert \varphi _{0,k}-\varphi _0\Vert _w^2>0\). With this aim, it will be assumed w.l.o.g. that

$$\begin{aligned}&X_1, \ldots , { X}_r \ \text {have alternative distributions with CFs}\ \varphi _1, \ldots , \varphi _r ,\\&\text {while the other}\ k-r \ \text {populations obey}\ H_0 , \ \text {for some}\ 1\le r \le k . \end{aligned}$$
(10)

In the above setting, r is allowed to vary with m, \(r=r_m\), but such dependence on m will be suppressed in the notation. Let

$$\begin{aligned} f_r=\frac{1}{n}\sum _{i=1}^rn_i, \quad \varphi _{0,r}=\frac{1}{n f_r} \sum _{i=1}^rn_i\varphi _i. \end{aligned}$$

We have that

$$\begin{aligned} \varphi _{0,k}-\varphi _0=f_r(\varphi _{0,r}-\varphi _0). \end{aligned}$$
(11)

Assume that the CFs of the alternative distributions satisfy

$$\begin{aligned} \inf _{\overset{\alpha _1+\cdots +\alpha _r=1,}{ \alpha _1,\ldots ,\alpha _r \ge 0}} \Vert \sum _{i=1}^r \alpha _i \varphi _i-\varphi _0\Vert _w \ge \tau >0, \quad \forall r \ge 1, \end{aligned}$$
(12)

and that the sample sizes satisfy (2), then from (11) it follows that

$$\begin{aligned} \frac{r}{k} M_1 \le \Vert \varphi _{0,k}-\varphi _0 \Vert _w \le \frac{r}{k} M_2, \end{aligned}$$
(13)

where \(M_1\) and \(M_2\) are two positive constants (depending on \(\tau \), \(c_0\) and \(C_0\)).
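
Indeed, since \(|\varphi _{0,r}|\le 1\), \(|\varphi _0|\le 1\) and \(\int w(t)\,\textrm{d}t=1\), (12) gives \(\tau \le \Vert \varphi _{0,r}-\varphi _0\Vert _w\le 2\), while (2) implies \(c_0r/(C_0k) \le f_r \le C_0r/(c_0k)\); hence, for instance, one may take

$$\begin{aligned} M_1=\tau \,\frac{c_0}{C_0}, \qquad M_2=2\,\frac{C_0}{c_0}. \end{aligned}$$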

As a consequence of Theorem 1, Theorem 2 and (13), we have the following result.

Corollary 1

Suppose that (6), (10) and (12) hold, that the sample sizes satisfy (2), and that \(k/m \rightarrow 0\) and \(r/k \rightarrow p \in (0,1]\), as \(m \rightarrow \infty \). Then the power of the test that rejects \(H_0\) when \(T_{k,n} \ge t_{k,n,\alpha }\) goes to 1, as \(m \rightarrow \infty \).

The result in Corollary 1 remains true if \(t_{k, n, \alpha }\) is replaced with a consistent estimator.

Remark 1

Assumption (12) may not be satisfied under alternatives. To see this fact, let us recall that if X has CF \(\varphi _X=\Re \varphi _X+ \textrm{i}\Im \varphi _X\) then \(-X\) has CF \({\varphi }_{-X}= \Re \varphi _X- \textrm{i}\Im \varphi _X\). On the other hand, if X is a continuous random variable with probability density function (pdf)

$$\begin{aligned} 2\phi (x)\pi (x), \end{aligned}$$
(14)

where \(\phi (x)\) is the pdf of a standard normal law and \(\pi \) is a skewing function (i.e. a function satisfying \(0\le \pi (x) \le 1\) and \(\pi (-x) =1-\pi (x)\)), then the real part of the CF of X is equal to \(\varphi _0\) (see e.g. Jiménez-Gamero et al. 2016), and therefore \(0.5 \varphi _X+0.5\varphi _{-X}=\varphi _0\). An example of a continuous law having a pdf of the form (14) is the skew normal law, for which \(\pi (x)=\Phi (\lambda x)\), where \(\Phi \) denotes the cumulative distribution function of a standard normal law and \(\lambda \in {\mathbb {R}}\) is a constant. Thus, if \(X_1=X\), \(X_2=-X\) (say) and the pdf of X satisfies (14), then

$$\begin{aligned} \inf _{\overset{\alpha _1+\cdots +\alpha _r=1,}{ \alpha _1,\ldots ,\alpha _r \ge 0}} \Vert \sum _{i=1}^r \alpha _i \varphi _i-\varphi _0\Vert _w=0, \quad \forall r \ge 2. \end{aligned}$$

The same problem arises in the one-sample case if the assumption that the data are identically distributed is dropped. If, as in the one-sample case, we assume that \(X_1, \ldots , X_r\) have the same alternative distribution, then (12) is satisfied.
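
As a quick numerical illustration of the skewing-function fact used in this remark, the following R sketch draws from a skew normal law via the stochastic representation \(X=\delta |Z_0|+\sqrt{1-\delta ^2}Z_1\), with \(\delta =\lambda /\sqrt{1+\lambda ^2}\) (an assumption of this sketch, not stated in the paper), and checks that the real part of its ECF is close to \(\varphi _0\):

```r
# Numerical check: for the skew normal law (skewing function Phi(lambda * x)),
# the real part of the CF equals exp(-t^2/2). The stochastic representation
# below is an assumption of this sketch.
set.seed(3)
lambda <- 3
delta <- lambda / sqrt(1 + lambda^2)
x <- delta * abs(rnorm(1e6)) + sqrt(1 - delta^2) * rnorm(1e6)
t0 <- 1.5
c(empirical = mean(cos(t0 * x)), normal_cf = exp(-t0^2 / 2))
```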

Remark 2

Theorem 1 states that, under certain assumptions on k and the sample sizes, the asymptotic null distribution of \(nT_{k,n}\) coincides with that of the BHEP test statistic. On the other hand, if \(X_1, \ldots , X_k\) all have the same CF, say \(\varphi \), then we saw that (9) holds. As a consequence of these two facts, in this setting, the Bahadur efficiencies computed in Ebner and Henze (2021) and Meintanis et al. (2022) for the BHEP test also apply to the test proposed in this section.

Remark 3

As explained before, the main motivation for considering \(T_{k,n}\) is that it can be seen as the natural extension of the BHEP test statistic. Nevertheless, other test statistics can be used for testing \(H_0\). For example, if we denote by \({\mathcal {T}}_i\) the BHEP test statistic calculated on the sample from \(X_i\), then, following the approach in Gaigall (2021), other possible choices are \(\sum _{i=1}^k{\mathcal {T}}_i\) or \((1/n)\sum _{i=1}^kn_i{\mathcal {T}}_i\). As with the proposal studied above, the null distribution of these two test statistics only depends on \(n_1, \ldots , n_k\). We will come back to a test statistic of this type in Sect. 5.

The results in this section allow k to increase with the sample size, but at a lower rate. A key result in the proof of Theorem 1 is Lemma 2 in Sect. 8. If \(k/m\rightarrow \ell >0\), then it can be checked that the results in Lemma 2 are no longer true. Moreover, since in such a case even the practical calculation of \(T_{k,n}\) can be very time-consuming, Sects. 4 and 5 explore other strategies to build a test of \(H_0\).

4 Random selection

Because, as observed before, for large k the calculation of \(T_{k,n}\) can be very time-consuming, here we study a more efficient way (from a computational point of view) of testing \(H_0\), which consists in randomly selecting a subset of samples, and then applying the test studied in the previous section to the selected data. Specifically, the method proceeds as follows: for some (fixed) \(k_0<k\) (the precise order of \(k_0\) will be specified later), select randomly (without replacement) \(I_1, \ldots , I_{k_0}\) from \(1, \ldots , k\) and then apply the test in Sect. 3 to the samples \(\textbf{X}_{I_1},\ldots , \textbf{X}_{I_{k_0}}\). Let \(n_0=n_{I_1}+ \ldots +n_{I_{k_0}}\) be the total size of the selected data. Notice that when not all sample sizes are equal (unbalanced samples), \(n_0\) is a random quantity. Let \(T_{k_0, n_0}=T_{k_0, n_0}(\textbf{X}_{I_1}, \ldots , \textbf{X}_{I_{k_0}})\) denote the test statistic calculated on the selected samples. Then, \(H_0\) is rejected if \(T_{k_0, n_0} \ge t_{k_0, n_0, \alpha }\), where \( t_{k_0, n_0, \alpha }\) is the \(\alpha \) upper percentile of the null distribution of \(T_{k_0, n_0}\) for \(n_0\) fixed (non-random) and equal to its observed value.
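
A sketch of one such selection, with hypothetical helper names and a Monte Carlo approximation of \(t_{k_0, n_0, \alpha }\) for the observed (fixed) sizes of the selected samples; it reuses the Tkn sketch of Sect. 3 and is not the authors' implementation:

```r
# Random-selection test: choose k0 of the k samples without replacement,
# compute T_{k0,n0} on them, and compare with a Monte Carlo approximation of
# t_{k0,n0,alpha} for the observed sizes of the selected samples.
# Hypothetical helper names; assumes the Tkn sketch of Sect. 3.
random_selection_test <- function(samples, k0, alpha = 0.05, B = 2000) {
  idx <- sample(length(samples), k0)          # I_1, ..., I_{k0}
  sel <- samples[idx]
  sizes <- sapply(sel, length)
  null_stats <- replicate(B, Tkn(lapply(sizes, rnorm)))   # null distribution
  crit <- unname(quantile(null_stats, 1 - alpha))
  stat <- Tkn(sel)
  list(statistic = stat, critical = crit, reject = stat >= crit)
}
```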

To study properties of this procedure we first introduce some notation. For each \(\textbf{i}=(i_1, \ldots , i_{k_0})\), with \(1 \le i_1< \cdots < i_{k_0}\le k\), let

$$\begin{aligned} a(\textbf{i})&=P(T_{k_0, n_0}(\textbf{X}_{i_1}, \ldots , \textbf{X}_{i_{k_0}}) \ge t_{k_0, n_0, \alpha }\, | \, I_1=i_1, \ldots , I_{k_0}=i_{k_0} ), \nonumber \\ \text{ and } \quad a_0(\textbf{i})&=P_0(T_{k_0, n_0}(\textbf{X}_{i_1}, \ldots , \textbf{X}_{i_{k_0}}) \ge t_{k_0, n_0, \alpha }\, | \, I_1=i_1, \ldots , I_{k_0}=i_{k_0} ). \end{aligned}$$
(15)

By construction and the definition of \( t_{k_0, n_0, \alpha }\), \(a_0(\textbf{i})=\alpha \), \(\forall \textbf{i}\), and thus the test has level \(\alpha \) because

$$\begin{aligned} P_0(T_{k_0, n_0} \ge t_{k_0, n_0, \alpha })=\frac{1}{\left( {\begin{array}{c}k\\ k_0\end{array}}\right) } \sum _{1 \le i_1< \cdots < i_{k_0}\le k} a_0(\textbf{i}) =\alpha . \end{aligned}$$

Now we study the power.

Theorem 3

Suppose that (6), (10) and (12) hold, that the sample sizes satisfy (2), and that \(k_0 \rightarrow \infty \), \(k_0/m \rightarrow 0\), \(k_0/k \rightarrow \rho _0 \in [0,1)\), and \(r/k \rightarrow p \in (0,1]\), as \(m \rightarrow \infty \). Then the power of the test that rejects \(H_0\) when \(T_{k_0, n_0} \ge t_{k_0, n_0, \alpha }\) goes to 1, as \(m \rightarrow \infty \).

The result in Theorem 3 remains true if \(t_{k_0, n_0, \alpha }\) is replaced with a consistent estimator.

The consistency result in Theorem 3 is very similar to that in Corollary 1, in the sense that, besides some assumptions on \(k_0\), both tests are consistent under the same assumptions, namely, (12) and \(\frac{r}{k} \rightarrow p \in (0,1]\). The comments in Remarks 1 and 2 also apply here.

Notice that two different random selections of \(I_1, \ldots , I_{k_0}\) from \(1, \ldots , k\) could lead to opposite conclusions. To avoid this inconvenience, we follow Cuesta-Albertos and Febrero-Bande (2010), who deal with tests based on random projections for functional data. These authors proposed to take several random projections, calculate the p-value for each projection, and then apply some correction, such as the procedure in Benjamini and Yekutieli (2001), which controls the false discovery rate. The same approach can be applied here.
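
A sketch of this multiple-selection variant, combining the Monte Carlo p-values of several selections through the Benjamini-Yekutieli correction (available in base R as p.adjust with method "BY"); helper names are hypothetical and the Tkn sketch of Sect. 3 is assumed:

```r
# Several random selections combined through the Benjamini-Yekutieli correction:
# one Monte Carlo p-value per selection; H_0 is rejected if some BY-adjusted
# p-value is below alpha. Hypothetical helper names; assumes the Tkn sketch.
selection_pvalue <- function(samples, k0, B = 2000) {
  sel <- samples[sample(length(samples), k0)]
  null_stats <- replicate(B, Tkn(lapply(sapply(sel, length), rnorm)))
  mean(null_stats >= Tkn(sel))                # Monte Carlo p-value
}

multi_selection_test <- function(samples, k0, n_sel = 30, alpha = 0.05) {
  p <- replicate(n_sel, selection_pvalue(samples, k0))
  any(p.adjust(p, method = "BY") <= alpha)    # reject H_0?
}
```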

5 Sum of BHEP test statistics

As observed in Remark 3, we could consider test statistics based on the sum of test statistics calculated on each sample. Here we study a test whose test statistic is of that type. Recall from Sect. 2.2 the definition of \(\mu _{0,n}\) and \(\tau _{0,n}^2\), and that \(0<\mu _{0,n} \rightarrow \mu _0\) and \(0<\tau _{0,n}^2 \rightarrow \sigma _0^2\), as \(n\rightarrow \infty \).

Let \({\mathfrak {T}}_i=n_i {\mathcal {T}}_i\), where \({\mathcal {T}}_i={\mathcal {T}}_{n_i,1}\) (see (3)), \(1 \le i \le k\), and let

$$\begin{aligned} {\mathbb {T}}_{0,k}=\frac{\sum _{i=1}^k({\mathfrak {T}}_i-\mu _{0,n_i})}{\sqrt{\sum _{i=1}^k\tau _{0,n_i}^2}}. \end{aligned}$$

As seen in Sect. 2.2, if \(F_i \notin {\mathcal {N}}\) then \(E({\mathfrak {T}}_i)=n_i \Delta _{X_i,n_i}\), with \(\Delta _{X_i,n_i}\rightarrow \Delta _{X_i}=\Vert \varphi _i-\varphi _0\Vert _w^2>0\), as \(n_i\rightarrow \infty \), where \(\varphi _i\) is the CF of \(X_i\). Hence, for large enough \(n_i\), \(E({\mathfrak {T}}_i)\) is bigger than \(E_0({\mathfrak {T}}_i)=\mu _{0,n_i}\). Thus, it seems reasonable to reject \(H_0\) for large values of \({\mathbb {T}}_{0,k}\).

To test \(H_0\) we must calculate upper percentiles of the null distribution of \({\mathbb {T}}_{0,k}\), which depends on \(n_1, \ldots , n_k\). Although the critical points can be calculated by simulation, from a practical point of view it would be nice if they could be approximated in some fashion, at least for large k, since \(n_1, \ldots , n_k\) can take many values. The next result shows that, under \(H_0\), \({\mathbb {T}}_{0,k}\) converges in law to a standard normal law, as \(k \rightarrow \infty \), no matter how large (or small) the sample sizes \(n_1, \ldots , n_k\) are; it is only assumed that \(n_i\ge 3\). If \(n_i=2\), then the two scaled residuals take the values \(-1\) and 1 for any possible values of \(X_{i,1}, X_{i,2}\) (whenever they are different, which happens with probability 1 because \(X_i\) is assumed to be continuous), and thus \({\mathfrak {T}}_i\) is a degenerate random variable.

Theorem 4

Suppose that (6) holds, that \(n_i \ge 3\), and that \(H_0\) is true. Then \({\mathbb {T}}_{0,k}\overset{{\mathcal {D}}}{\rightarrow } Z\sim N(0,1)\), as \(k \rightarrow \infty \).

From Theorem 4, the test that rejects \(H_0\) when \( {\mathbb {T}}_{0,k} \ge z_{1-\alpha }, \) for some \(\alpha \in (0,1)\), where \(\Phi (z_{1-\alpha })=1-\alpha \), has (asymptotic) level \(\alpha \), where here asymptotic means for large k.
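
The resulting procedure is easy to sketch in R: the null means \(\mu _{0,n_i}\) and variances \(\tau ^2_{0,n_i}\) are approximated by Monte Carlo, the standardized sum \({\mathbb {T}}_{0,k}\) is formed, and it is compared with \(z_{1-\alpha }\). Helper names are hypothetical and the bhep_stat sketch of Sect. 2 is assumed:

```r
# Sketch of the test based on T_{0,k}: approximate mu_{0,n_i} and tau^2_{0,n_i}
# by Monte Carlo, form the standardized sum and compare with z_{1-alpha}.
# Hypothetical helper names; assumes the bhep_stat sketch of Sect. 2.
null_moments <- function(n, B = 10000) {
  stats <- replicate(B, n * bhep_stat(rnorm(n), beta = 1))
  c(mean = mean(stats), var = var(stats))
}

T0k_test <- function(samples, alpha = 0.05, B = 10000) {
  stats <- sapply(samples, function(x) length(x) * bhep_stat(x))
  mom <- sapply(samples, function(x) null_moments(length(x), B))
  # in practice, compute the null moments only once per distinct sample size
  T0k <- sum(stats - mom["mean", ]) / sqrt(sum(mom["var", ]))
  list(statistic = T0k, reject = T0k >= qnorm(1 - alpha))
}
```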

Now we study the power of this test. As in the previous sections, we will suppose that (10) holds. The r alternative distributions will be assumed to satisfy the following assumption.

Assumption 1

(a) \(\{ \big (\sqrt{n_i}({\mathcal {T}}_i-\Delta _{X_i,n_i}) \big )^2\}\) is uniformly integrable, \(1 \le i \le r\).

(b) There exist \(0< \varsigma _1< \varsigma _2<\infty \) such that \(\displaystyle \varsigma _1 \le \inf _{1 \le i \le r} \tau _{X_i,n_i}^2 \le \sup _{1 \le i \le r} \tau _{X_i,n_i}^2 \le \varsigma _2\), \(\forall r\), where \(\tau _{X_i,n_i}^2 =V(\sqrt{n_i}{\mathcal {T}}_i)\), \(1 \le i \le r\).

(c) There exists \(\eta >0\) such that \(\frac{1}{r}\sum _{i=1}^r( \Delta _{X_i,n_i}-\mu _{0,n_i}/n_i) \ge \eta \), \(\forall r\).

Recall from Sect. 2.2 that, for alternative distributions with \(E(X_i^4)<\infty \), \(\sqrt{n_i}({\mathcal {T}}_i-\Delta _{X_i})\) converges in law to a zero-mean normal distribution. Thus, it makes sense to consider Assumption 1 (a) and (b). Moreover, (5) (for each i) was seen to be a sufficient condition for Assumption 1 (a) to hold. Since \(\Delta _{X_i,n_i}\) is positive and close to \(\Delta _{X_i}\), Assumption 1 (c) says that \(n_i\) must be large enough so that \(E({\mathbb {T}}_{0,k})>0\). Table 1 below displays the values of \(\mu _{0,n}\) and \(n\Delta _{X,n}\) for some alternative distributions and some small values of n, calculated by simulation based on 100,000 samples in each case. The considered alternative distributions are:

  • The beta distribution with parameters (2,2) (b(2, 2)), whose pdf is \(f(x)=6x(1-x)\) if \(x \in [0,1]\),

  • the Laplace distribution (Lap) with pdf \(f(x)=0.5\exp (-|x|)\), \(x \in {\mathbb {R}}\),

  • the uniform distribution (unif) with pdf \(f(x)=1\) if \(x \in [0,1]\),

  • the logistic distribution (log) with pdf \(f(x)=\exp (-x)/\{1+ \exp (-x)\}^2\), \(x \in {\mathbb {R}}\),

  • the Student t-distribution with \(\nu \) degrees of freedom (\(t_\nu \)), whose pdf is given by \(f(x)=\frac{\Gamma \{(\nu +1)/2\}}{\sqrt{\nu \pi }\Gamma (\nu /2)}\left( 1+x^2/\nu \right) ^{-(\nu +1)/2}\), \(x \in {\mathbb {R}}\), where \(\Gamma \) is the gamma function,

  • a scale mixture of two normal populations (SMN): \(pN(0,\sigma ^2)+(1-p)N(0,1)\), with \(p=0.2\) and \(\sigma =3\),

  • the negative exponential distribution (exp) with pdf \(f(x)=\exp (-x)\), \(x \in [0, \infty )\), and

  • the chi-squared distribution with \(\nu \) degrees of freedom (\(\chi ^2_\nu \)), whose pdf is given by \(f(x)=\frac{1}{2^{\nu /2} \Gamma (\nu /2)}x^{\nu /2-1}\exp (-x/2)\), \(x \in [0, \infty )\).

Looking at Table 1 we see that Assumption 1(c) is not restrictive at all: it suffices to take \(n_i \ge 5\) for all considered alternatives.

Table 1 Values of \(\mu _{0,n}\) and \(n\Delta _{X,n}\), for \(3 \le n \le 10\) and some alternative distributions for X

To derive the asymptotic null distribution of \({\mathbb {T}}_{0,k}\) no assumption was made on the sample sizes \(n_1, \ldots , n_k\) (except that all of them are greater than or equal to 3). To study the power, we will assume that the sample sizes are comparable, in the sense that they satisfy (2). Nevertheless, in contrast to the results for the power in the previous sections (see Corollary 1 and Theorem 3), here m is not assumed to increase; it only must be large enough so that Assumption 1(c) holds true.

Theorem 5

Suppose that (6), (10) and Assumption 1 hold, that the sample sizes satisfy (2), and that \(r/k \rightarrow p \in (0,1]\), as \(k\rightarrow \infty \). Then the power of the test that rejects \(H_0\) when \({\mathbb {T}}_{0,k} \ge z_{1-\alpha }\) goes to 1, as \(k \rightarrow \infty \).

6 Simulation results

This section presents the results of several simulation experiments designed to study the finite sample performance of the three tests studied in this paper. We first study the accuracy of the approximations given to the null distribution of the proposed test statistics, and then their powers, which are also compared with those of the following procedures for testing \(H_0\):

  • Compute the BHEP test for testing \(H_{0i}: X_i \in {\mathcal {N}}\), \(1 \le i \le k\). Then one can apply either the Bonferroni method, which controls the family-wise error rate, or the Benjamini-Hochberg method (see Benjamini and Yekutieli 2001), which controls the false discovery rate when the k tests are independent. Both procedures agree in rejecting \(H_0\) if \(\min _{1 \le i \le k}p_i \le \alpha /k\), where \(p_1, \ldots , p_k\) are the p-values obtained when testing \(H_{01}, \ldots , H_{0k}\), respectively. The results for this procedure are headed in the tables by BH (and we will also refer to that test as the test BH).

  • The tests in Gaigall (2021).

6.1 Simulations for the level

We first consider the test in Sect. 3, which rejects \(H_0\) when \(T_{k,n}>t_{k,n,\alpha }\). By construction, if one uses that critical region, then the test has exactly level \(\alpha \), so checking its actual level is of no interest. Recall that Theorem 1 states that, under certain assumptions, the asymptotic null distribution of \(T_{k,n}\) coincides with that of the one-sample test statistic, \(T_{1,n}\). Therefore, we could approximate the percentiles of \(T_{k,n}\) by those of \(T_{1,n}\). Here we study that approximation by simulation, and hence we consider the test that rejects \(H_0\) when \(T_{k,n}>t_{1,\alpha }\), where \(t_{1, \alpha }\) is the \(\alpha \) upper percentile of the null distribution of \(T_{1,n}\). The results for such test are headed in the tables by \(T_{k,n}^{as}\) (and we will also refer to that test as the test \(T_{k,n}^{as}\)).

The critical point \(t_{k,n,\alpha }\) can also be approximated by the \(\alpha \) upper percentile of the asymptotic null distribution of \(T_{k,n}\), say \(t_{\alpha }\). As explained at the end of Sect. 2.1, \(t_{\alpha }\) can be estimated by using the Pearson system of distributions, as proposed in Henze (1990). The results for such test are headed in the tables by \(T_{k,n}^{Pe}\) (and we will also refer to that test as the test \(T_{k,n}^{Pe}\)).

We also consider the test that rejects \(H_0\) when \({\mathbb {T}}_{0,k} \ge z_{1-\alpha }\). The results for such test are headed in the tables by \({\mathbb {T}}_{0,k}\) (and we will also refer to that test as the test \({\mathbb {T}}_{0,k}\)).

The test statistics in Gaigall (2021) are sums over the samples of the Kolmogorov-Smirnov and the Cramér-von Mises test statistics, and so it is expected that, conveniently normalized (subtracting the mean and dividing by the square root of their variances, as we did in Sect. 5 to obtain \({\mathbb {T}}_{0,k}\)), those statistics are asymptotically normal as \(k\rightarrow \infty \). Notice that the null distribution of the Kolmogorov-Smirnov and the Cramér-von Mises test statistics in each sample does not depend on the values of the population mean and variance, but only on the sample size. We calculated, by simulation, the means and the variances of these statistics in a sample, and considered the test that rejects \(H_0\) when \(KS \ge z_{1-\alpha }\) and the test that rejects \(H_0\) when \(CM \ge z_{1-\alpha }\), where KS and CM are the Kolmogorov-Smirnov and the Cramér-von Mises analogues of \({\mathbb {T}}_{0,k}\), respectively. The results for such tests are headed in the tables by KS and CM (and we will also refer to those tests as the test KS and the test CM), respectively.

In each case we did the following experiment: k random samples with sizes \(n_i=m\), \(1\le i \le k\), were generated from a standard normal law, and the tests KS, CM, \(T_{k,n}^{as}\), \(T_{k,n}^{Pe}\), \({\mathbb {T}}_{0,k}\) and BH were applied with \(\alpha =0.05\). The experiment was repeated 10,000 times (all simulations for the level in this paper are based on 10,000 samples). Table 2 displays the proportion of times that \(H_0\) was rejected for \(k=2,3,5,10,20\) and \(n_i=5, 10, 15, \ldots , 45\). Looking at Table 2 we conclude that: (a) for the test \(T_{k,n}^{as}\): as expected, the approximation works when k is small in relation to \(n_i\); the distortion of the level may be important if this relation is not met; (b) the same can be concluded for the test \(T_{k,n}^{Pe}\), whose performance is quite close to that of \(T_{k,n}^{as}\); (c) for the test \({\mathbb {T}}_{0,k}\): the approximation works better for larger values of k (as expected); nevertheless, the empirical levels are not far from the nominal level even for \(k=2\), and its behavior does not seem to be influenced by the sample sizes; (d) the same can be concluded for the tests KS and CM; (e) for the test BH: the levels are reasonably close to the nominal level.

The above experiment was repeated for larger values of k (specifically, for \(k=100, 200\)) and \(n_i=5, 10, 15, 20\), but instead of the test \(T_{k,n}^{as}\) (and \(T_{k,n}^{Pe}\)) we considered the random selection test studied in Sect. 4 with \(k_0=10,20\), headed in the tables by RP (and we will also refer to that test as the test RP). We tried that test with one and with more than one random selection. In view of the results in Table 2 for the test \(T_{k,n}^{as}\) (and \(T_{k,n}^{Pe}\)), and since the sample sizes considered are not very large compared to \(k_0\), we used the exact critical values of the null distribution of \(T_{k_0,n_0}\); when several random selections are taken into account (5, 10, 20 and 30), we proceeded as explained before for the test BH. The results obtained are displayed in Table 3. In all cases, the empirical levels match the nominal value quite closely.

Table 2 Empirical levels for small to moderate values of k at the nominal level \(\alpha =5.0\%\)
Table 3 Empirical levels for large values of k at the nominal level \(\alpha =5.0\%\)

6.2 Simulations for the power

As said at the beginning of Sect. 3, in our developments we chose \(\beta =1\), although all results remain true for arbitrary (but fixed) \(\beta \). It is well known that, for finite sample sizes, the power of the BHEP test strongly depends on the value of the parameter \(\beta \) and on the alternative. Tenreiro (2009) observed that for short-tailed alternatives high power is obtained for large (but not too large) values of \(\beta \), while for long-tailed (symmetric or asymmetric) alternatives a small value of \(\beta \) should be chosen. Since the tests studied in this paper are all based on the BHEP test, it is expected that they inherit its characteristics. So, in the power study, we tried the proposed tests for several values of \(\beta \). As in Ebner and Henze (2021), we considered \(\beta \in \{0.25, 0.5, 0.75, 1,2,3,5,10\}\).

To examine the power we repeated the experiments in the previous section, but now r samples were taken from an alternative distribution and the other \(k-r\) samples were generated from a standard normal distribution; r is taken so that the percentage of alternative distributions equals 20%, 40%, 60% and 80%. All simulations for the power in this paper are based on 2,000 samples. For the test that rejects \(H_0\) when \(T_{k,n}>t_{k,n,\alpha }\) (the dependence on \(\beta \) is skipped to simplify notation), we calculated the exact critical values (headed in the tables by \(T_{k,n}^{ex}\); we will also refer to that test as the test \(T_{k,n}^{ex}\)). The columns headed as \(RP_{1}\) and \(RP_{2}\) display the results for the random selection method with \(k_0=10\) and \(k_0=20\), respectively, both with 30 random selections because, in most cases, that number of selections gave the highest power (among 1, 3, 5, 10, 20 and 30 random selections).

As for the alternatives, we considered several short-tailed and some long-tailed distributions. The picture in each case is very similar and agrees with the observations in Tenreiro (2009). Due to space limitations, we next just summarize the results of the experiments. All tables and a detailed description of the numerical results can be found in the Supplementary Material.

Looking at the tables for the power in the Supplementary Material, in general, it can be concluded that: (a) the power of all tests increases with the percentage of alternative distributions, with the sample sizes and with k; (b) the test BH gives the poorest results; (c) there is no test giving the highest power for all alternatives; (d) among the BHEP-based tests, we again see that there is no one giving the highest power for all alternatives; nevertheless, \({\mathbb {T}}_{0,k}\) gives powers (for adequate choices of \(\beta \)) that are either optimal or reasonably close to the optimal; (e) KS is less powerful than CM; their powers are smaller than those of \(T_{k,n}^{ex}\) and \({\mathbb {T}}_{0,k}\) (for adequate choices of \(\beta \)), for small k, and of \(RP_2\) and \({\mathbb {T}}_{0,k}\) (for adequate choices of \(\beta \)), for larger k.

From a computational point of view, among BHEP-based tests, the test \({\mathbb {T}}_{0,k}\) is the best choice, since the number of required computations for the calculation of its test statistic is of order \(O(km^2)\), and its application does not involve the calculation of critical points.

6.3 Further simulation results

In the above experiments all samples have the same size \(n_1=n_2=\ldots =n_k:=m\), so the data can be also seen as

$$\begin{aligned} Y_1=\left( \begin{array}{c}X_{1,1}\\ X_{2,1}\\ \vdots \\ X_{k,1} \end{array}\right) , Y_2=\left( \begin{array}{c}X_{1,2}\\ X_{2,2}\\ \vdots \\ X_{k,2} \end{array}\right) , \ldots , Y_m=\left( \begin{array}{c}X_{1,m}\\ X_{2,m}\\ \vdots \\ X_{k,m} \end{array}\right) \text{ which } \text{ are } \text{ iid } \text{ from } Y=\left( \begin{array}{c}X_{1}\\ X_{2}\\ \vdots \\ X_{k} \end{array}\right) \in {\mathbb {R}}^k. \end{aligned}$$

An anonymous referee asked us to apply the BHEP test to the Y-data. With this aim we need \(m \ge k+1\), so we only considered the cases \(k=10\) with \(n_i=15,20,25,30 \) and \(k=20\) with \(n_i=25,30 \). Tenreiro (2009) performed an extensive simulation study on the power of the BHEP test for a wide range of data dimensions. He recommends using \(\beta _k=\sqrt{2}/(1.376 + 0.075k)\). We repeated the power simulation study for the Y-data using \(\beta _k\) and the critical points of the BHEP test for each value of k (which now becomes the dimension) and m (the sample size). The obtained results are displayed in the Supplementary Material. Comparing them with those yielded by the tests proposed in this paper, CM and KS, applied to the X-data, we see that when the BHEP test is applied to the Y-data the power is really poor.

The above Y-data description assumes that the components of Y are independent. To numerically study the effect on the level of the tests considered in Sect. 6.1 when the independence assumption is dropped, the following simulation experiment was carried out: we generated data from \(Y\sim N_k(0, \Sigma _\rho )\), where \(\Sigma _\rho \) is the equicorrelation matrix, and then applied the tests in Table 2 to the associated X-data. The obtained results are displayed in the Supplementary Material. Looking at them we see that as the dependence between the components of Y becomes stronger, the empirical levels move further away from the nominal value 0.05. As a consequence, the case of Y-data with correlated components requires the development of new procedures that take into account such dependence.

7 Concluding remarks and further research

This paper proposes and studies three procedures for simultaneously testing that k independent samples come from normal populations, which can have different means and variances. All of them are based on the BHEP test and allow k to increase. The first test, based on the test statistic \(T_{k,n}\), can be seen as the direct extension of the BHEP test. Its null distribution only depends on k and the sample sizes, so exact critical points can be calculated by simulation; when the sample sizes are large relative to k, one can use the critical points of the BHEP test. If k is very large, the practical calculation of the test statistic \(T_{k,n}\) is very time-consuming, so one can randomly select \(k_0\) samples (one or more times) and apply the previous test to the selected samples. One can also calculate the BHEP test statistic in each sample and then sum the obtained values, which, conveniently centred and scaled (\({\mathbb {T}}_{0,k}\)), converges in law to a standard normal law when the null hypothesis is true. The normal approximation works reasonably well even for not too large k, so its practical use does not require the calculation of critical points. All tests are consistent against alternatives where the fraction of samples not obeying the null goes to a positive constant. The test based on \({\mathbb {T}}_{0,k}\) is, from a computational point of view, the best choice.

This paper has focused on BHEP-based procedures for simultaneously testing that k independent samples come from normal populations. Other normality tests could be used to build procedures similar to those developed in Sects. 3–5. Moreover, parallel approaches could be used for simultaneously testing that k independent samples come from any location-scale family.

As observed in Remark 2, in certain specific settings, the Bahadur efficiency calculations made in Ebner and Henze (2021) and Meintanis et al. (2022) for the BHEP test also apply to the tests proposed in Sects. 3 and 4. It would be interesting to study Bahadur efficiencies in more general settings, and also for the test in Sect. 5. Those calculations could help to determine optimal values of \(\beta \) and \(k_0\).

Finally, in simulations we saw that the tests studied in this paper are not valid for dependent data. New procedures that take into account such dependence should be developed.

8 Proofs

Throughout this section, M denotes a generic positive constant that may take different values at different appearances.

8.1 Auxiliary results

Lemma 1

Suppose that (6) holds and that \(H_0\) is true. Then

(a) \(E\{n_j(1-S_j)^2\}=0.5+O(1/n_j)\),

(b) \(E\{(1-1/S_j)^2\}=1/n_j+O(1/n_j^2)\),

(c) \(E\{(1-1/S_j)^4\}=O(1/n_j^2)\),

as \(n_j\rightarrow \infty \), \(1 \le j \le k\).

Proof

(a) Under \(H_0\), \(n_jS_j^2\sim \chi ^2_{n_j-1}\), thus \(E(n_jS_j^2)=n_j-1\) and \(E(n_jS_j)=\sqrt{n_j}\sqrt{2}\Gamma (n_j/2)/\Gamma ((n_j-1)/2)\), where \(\Gamma \) stands for the gamma function. Using Legendre duplication formula (see display 5.5.5 of Olver et al. 2010) and Stirling’s formula (see display 5.11.3 of Olver et al. 2010), one gets that, for large \(n_j\),

$$\begin{aligned} E(n_jS_j)= \sqrt{n_j(n_j-1)}\left( 1-\frac{1}{4n_j}+O\left( \frac{1}{n_j^2}\right) \right) . \end{aligned}$$

Thus

$$\begin{aligned} E\{n_j(1-S_j)^2\}=2n_j-1-2\sqrt{n_j(n_j-1)}\left( 1-\frac{1}{4n_j}+O\left( \frac{1}{n_j^2}\right) \right) . \end{aligned}$$

Finally, taking into account that \(2n_j-1-2\sqrt{n_j(n_j-1)}=2\left( 1+\sqrt{1-1/n_j}\right) ^{-1}-1\), \(\sqrt{n_j(n_j-1)}/n_j=\sqrt{1-1/n_j}\) and \(\sqrt{1-1/n_j}=1-1/(2n_j)+O(1/n_j^2)\), the result follows.

(b, c) The proof is similar to that of part (a). \(\square \)

Remark 4

The result in Lemma 1 (a) can be also derived from pages 421–422 of Johnson et al. (1994).

Let \(\Delta _{j,r}=Y_{j,r}-X_{j,r}=X_{j,r}(1/S_j-1)-{\overline{X}}_j/S_j\), \(1 \le r \le n_j\), \(1 \le j \le k\).

Lemma 2

Suppose that (6) holds, that \(H_0\) is true, that the sample sizes satisfy (2), and that \(k/m \rightarrow 0\), as \(m \rightarrow \infty \). Then,

(a) \(\frac{1}{\sqrt{n}}\sum _{j,r} \Delta _{j,r}^2=o_P(1)\),

(b) \(\Vert W_n\Vert _w=o_P(1)\), for

(b.1) \(W_n(t)=t\frac{1}{\sqrt{n}}\sum _{j=1}^k ({\overline{X}}_j/S_j)\sum _{r=1}^{n_j} \sin (tX_{j,r})\), \(t \in {\mathbb {R}}\),

(b.2) \(W_n(t)=t\frac{1}{\sqrt{n}}\sum _{j=1}^k ({\overline{X}}_j/S_j)\sum _{r=1}^{n_j} \left\{ \cos (tX_{j,r})-\varphi _0(t)\right\} \), \(t \in {\mathbb {R}}\),

(b.3) \(W_n(t)=t\frac{1}{\sqrt{n}}\sum _{j=1}^k (1-1/S_j)\sum _{r=1}^{n_j} X_{j,r}\cos (tX_{j,r})\), \(t \in {\mathbb {R}}\),

(b.4) \(W_n(t)=t\frac{1}{\sqrt{n}}\sum _{j=1}^k (1-1/S_j)\sum _{r=1}^{n_j} \left\{ X_{j,r}\sin (tX_{j,r})+\varphi _0'(t) \right\} \), \(t \in {\mathbb {R}}\),

(b.5) \(W_n(t)=t \varphi _0(t)\frac{1}{\sqrt{n}}\sum _{j=1}^k n_j {\overline{X}}_j(1-1/S_j)\), \(t \in {\mathbb {R}}\),

(b.6) \(W_n(t)=t \varphi _0'(t)\frac{1}{\sqrt{n}}\sum _{j=1}^k \sum _{r=1}^{n_j}\{(1/S_j-1)+0.5(X_{j,r}^2-1)\}\), \(t \in {\mathbb {R}}\),

as \(m\rightarrow \infty \), where \(\varphi _0'(t)=\frac{\textrm{d}}{\textrm{d}t}\varphi _0(t)\).

Proof

(a) We first observe that

$$\begin{aligned} \sum _{r=1}^{n_j} \Delta _{j,r}^2=n_j\{(1-S_j)^2+{\overline{X}}_j^2\}, \quad 1 \le j \le k. \end{aligned}$$

From (2),

$$\begin{aligned} n=\sum _{i=1}^{k}n_i=m\sum _{i=1}^{k}c_i={\bar{c}} k m,\quad \text{ with } \quad 0<c_0 \le {\bar{c}}=(1/k)\sum _{i=1}^kc_i \le C_0<\infty . \end{aligned}$$
(16)

Taking into account (16), the result in Lemma 1(a), and that \(E(n_j {\overline{X}}_j^2)=1\), \( 1 \le j \le k\), one gets

$$\begin{aligned} 0\le E\left( \frac{1}{\sqrt{n}}\sum _{j,r} \Delta _{j,r}^2 \right) \le M \sqrt{\frac{k}{m}} . \end{aligned}$$

Since we are assuming that \(k/m \rightarrow 0\), the result follows.

(b.1) We have that \(E\{W_n(t)^2\} =t^2 (1/n) \sum _{j,v=1}^k E_{jv}\), where

$$\begin{aligned} E_{jv}=E\left( \frac{{\overline{X}}_j}{S_j} \frac{{\overline{X}}_v}{S_v} \sum _{r=1}^{n_j} \sin (tX_{j,r}) \sum _{s=1}^{n_v} \sin (tX_{v,s})\right) . \end{aligned}$$

If \(j\ne v\), then

$$\begin{aligned} E_{jv}= & {} E\left( \frac{{\overline{X}}_j}{S_j} \sum _{r=1}^{n_j} \sin (tX_{j,r}) \right) E\left( \frac{{\overline{X}}_v}{S_v} \sum _{s=1}^{n_v} \sin (tX_{v,s}) \right) \\\le & {} E^{1/2}\left( \frac{{\overline{X}}_j^2}{S_j^2}\right) E^{1/2}\left( \frac{{\overline{X}}_v^2}{S_v^2}\right) E^{1/2}\left( \left\{ \sum _{r=1}^{n_j} \sin (tX_{j,r}) \right\} ^2\right) E^{1/2}\left( \left\{ \sum _{s=1}^{n_v} \sin (tX_{v,s}) \right\} ^2\right) . \end{aligned}$$

Since \( \sqrt{n_j-1}\,{\overline{X}}_j/S_j \sim t_{n_j-1}\), one gets that \(E\left( {\overline{X}}_j^2/S_j^2\right) =1/(n_j-3)\). We also have that \(E\left( \left\{ \sum _{r=1}^{n_j} \sin (tX_{j,r}) \right\} ^2\right) =n_j E \{\sin ^2(tX_{j}) \}=0.5n_j\{1-\varphi _0(2t)\} \le M n_j\). Thus, \(E_{jv} \le M\), \(1\le j \ne v \le k.\) If \(j= v\), then

$$\begin{aligned} E_{jj} \le E^{1/2}\left( \frac{{\overline{X}}_j^4}{S_j^4}\right) E^{1/2}\left( \left\{ \sum _{r=1}^{n_j} \sin (tX_{j,r}) \right\} ^4\right) . \end{aligned}$$

Since

$$\begin{aligned} E\left( \left\{ \sum _{r=1}^{n_j} \sin (tX_{j,r}) \right\} ^4\right) = n_j E\{\sin ^4(tX_{j})\}+3n_j(n_j-1)\left[ E\{\sin ^2(tX_{j})\}\right] ^2 \le 3n_j^2, \end{aligned}$$

and, for \(n_j\ge 6\),

$$\begin{aligned} E\left( \frac{{\overline{X}}_j^4}{S_j^4}\right) =\frac{1}{(n_j-1)^2}\, \frac{3(n_j-1)^2}{(n_j-3)(n_j-5)}=\frac{3}{(n_j-3)(n_j-5)} \le M \frac{1}{n_j^2}, \end{aligned}$$

we get that \(E_{jj} \le M\), \(1\le j \le k\), whenever \(n_j\ge 6\). Therefore, \(E\{W_n(t)^2\} \le t^2 M k^2/n\), whenever \(n_j\ge 6\), \(1\le j \le k\). From (16), it follows that \(k^2/n \le M k/m\). Since \(\int t^2w(t) \textrm{d}t=1\) and \(k/m \rightarrow 0\), as \(m\rightarrow \infty \), it follows that \(\Vert W_n\Vert _w=o_P(1)\), as \(m\rightarrow \infty \).
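As an aside, the two moments of \({\overline{X}}_j/S_j\) used above can be checked by simulation, exploiting that \(\sqrt{n_j-1}\,{\overline{X}}_j/S_j \sim t_{n_j-1}\). The following Python sketch compares Monte Carlo averages with \(1/(n_j-3)\) and \(3/\{(n_j-3)(n_j-5)\}\); the sample sizes and the number of replicates are arbitrary choices.

```python
import numpy as np

# Monte Carlo check of E(Xbar^2/S^2)=1/(n-3) and E(Xbar^4/S^4)=3/((n-3)(n-5)),
# using that sqrt(n-1)*Xbar/S follows a t distribution with n-1 degrees of freedom.
rng = np.random.default_rng(1)
R = 400_000
for n in (12, 30, 100):   # n-1 > 8, so the simulated fourth moment has finite variance
    ratio = rng.standard_t(n - 1, size=R) / np.sqrt(n - 1)   # simulated Xbar/S
    print(f"n={n:3d}   E(Xbar^2/S^2): {np.mean(ratio ** 2):.5f} vs {1 / (n - 3):.5f}   "
          f"E(Xbar^4/S^4): {np.mean(ratio ** 4):.5f} vs {3 / ((n - 3) * (n - 5)):.5f}")
```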

(b.2) The proof is similar to that of (b.1).

(b.3, b.4) The proof is similar to that of (b.1) using Lemma 1 (b) and (c).

(b.5) We have that \(E\{W_n(t)^2\} =t^2 \varphi _0(t)^2(1/n) \sum _{j,v=1}^k E_{jv}\), where

$$\begin{aligned} E_{jv}=n_j n_vE\left\{ {\overline{X}}_j(1-1/S_j) {\overline{X}}_v(1-1/S_v)\right\} . \end{aligned}$$

If \(j\ne v\), then

$$\begin{aligned} E_{jv}=n_j n_vE({\overline{X}}_j)E\{(1-1/S_j) \} E({\overline{X}}_v) E\{(1-1/S_v)\}=0. \end{aligned}$$

If \(j= v\), then

$$\begin{aligned} E_{jj}=n_j^2E({\overline{X}}_j^2)E\{(1-1/S_j)^2 \}. \end{aligned}$$

Taking into account that \(E(n_j{\overline{X}}_j^2)=1\) and Lemma 1(b), we get that, for large m, \(E_{jj} \le M\).

Therefore, for large m, \(E\{W_n(t)^2\} \le M t^2 \varphi _0(t)^2 k/n\). Since \(\int t^2w(t) \varphi _0(t)^2 \textrm{d}t<\infty \), from (16), it follows that \(\Vert W_n\Vert _w=o_P(1)\), as \(m\rightarrow \infty \).

(b.6) We can write \(W_n(t)= t \varphi _0'(t)C_n\), where \(C_n=(1/\sqrt{n})\sum _{j=1}^k T_j\) and \(T_j=n_j(1/S_j-1)+0.5\sum _{r=1}^{n_j}(X_{j,r}^2-1)\). Since \(\int t^2w(t) \varphi _0'(t)^2 \textrm{d}t<\infty \), to show the result it suffices to see that \(C_n=o_P(1)\), as \(m\rightarrow \infty \). Note that neither \((1/\sqrt{n})\sum _{j=1}^k n_j(1/S_j-1)\) nor \((1/\sqrt{n})\sum _{j,r}(X_{j,r}^2-1)\) is negligible on its own, so the two summands of \(T_j\) must be handled jointly. Since \(\sum _{r=1}^{n_j}(X_{j,r}^2-1)=n_j(S_j^2-1)+n_j{\overline{X}}_j^2\) and

$$\begin{aligned} \frac{1}{S_j}-1+\frac{1}{2}(S_j^2-1)=\frac{(S_j-1)^2(S_j+2)}{2S_j}, \end{aligned}$$

we can write

$$\begin{aligned} T_j=n_j\frac{(S_j-1)^2(S_j+2)}{2S_j}+\frac{1}{2}n_j{\overline{X}}_j^2. \end{aligned}$$

Now, \((S_j-1)^2\le (S_j^2-1)^2\), \((S_j+2)^2/S_j^2 \le 2(1+4/S_j^2)\) and, because \(n_jS_j^2\sim \chi ^2_{n_j-1}\), routine calculations give \(E\{n_j^2(S_j^2-1)^4\}\le M\), \(E\{n_j^4(S_j^2-1)^8\}\le M\) and \(E(1/S_j^4)=n_j^2/\{(n_j-3)(n_j-5)\}\le M\), whenever \(n_j\ge 6\); moreover, \(E(n_j^2{\overline{X}}_j^4)=3\). Thus, by the Cauchy–Schwarz inequality, \(E(T_j^2)\le M\), \(1\le j \le k\), for large m. Therefore,

$$\begin{aligned} E(C_n^2)=\frac{1}{n}\sum _{j,v=1}^k E(T_jT_v)\le \frac{1}{n}\sum _{j,v=1}^k E^{1/2}(T_j^2)\, E^{1/2}(T_v^2)\le M\frac{k^2}{n}. \end{aligned}$$

From (16), \(k^2/n \le M k/m\rightarrow 0\), as \(m\rightarrow \infty \), and hence \(C_n=o_P(1)\), which proves the result. \(\square \)
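The need to handle the two summands of \(T_j\) jointly can also be seen numerically: in the following Python sketch (the values of \(k\), \(m\) and the number of replicates are arbitrary choices), the pieces \(A_n=(1/\sqrt{n})\sum _j n_j(1/S_j-1)\) and \(B_n=(1/\sqrt{n})\sum _{j,r}(X_{j,r}^2-1)\) have standard deviations of order one, whereas both the mean and the standard deviation of \(C_n=A_n+0.5B_n\) are small.

```python
import numpy as np

# Numerical illustration of the cancellation used in the proof of Lemma 2 (b.6).
rng = np.random.default_rng(2)

def one_replicate(k, m):
    A = B = 0.0
    for _ in range(k):                     # equal sample sizes n_j = m (c_j = 1)
        x = rng.standard_normal(m)
        S = np.sqrt(np.mean((x - x.mean()) ** 2))
        A += m * (1 / S - 1)
        B += np.sum(x ** 2 - 1)
    n = k * m
    return A / np.sqrt(n), B / np.sqrt(n)

reps = np.array([one_replicate(k=10, m=1000) for _ in range(500)])
A, B = reps[:, 0], reps[:, 1]
C = A + 0.5 * B
for name, v in (("A_n", A), ("B_n", B), ("A_n+0.5*B_n", C)):
    print(f"{name:12s}  mean = {v.mean():6.3f}   sd = {v.std():.3f}")
```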

Lemma 3

Suppose that (6) holds, that the sample sizes satisfy (2), and that \(k/m \rightarrow 0\), as \(m \rightarrow \infty \). Then \((1/n)\sum _{j,r}|\Delta _{j,r}|\overset{a.s.}{\rightarrow }\ 0,\) as \(m \rightarrow \infty \).

Proof

To stress the dependence on \(\omega \in \Omega \) we write \(\Delta _{j,r}(\omega )\). By the strong law of large numbers (SLLN) and the fact that the intersection of a countable collection of sets of probability one has probability one, there is a measurable subset \(\Omega _0\) of \(\Omega \) such that \(P(\Omega _0) =1\), and for each \(\omega \in \Omega _0\) we have \(a_{j}=a_{j}(\omega )=(1/n_j)\sum _{r=1}^{n_j}|\Delta _{j,r}(\omega )| \rightarrow 0\), as \(m \rightarrow \infty \), \(1\le j\le k\). Using (16) we get that

$$\begin{aligned} \frac{1}{n}\sum _{j,r}|\Delta _{j,r}|=\frac{1}{n}\sum _{j}n_j a_j=\frac{1}{{\bar{c}}}\frac{1}{k} \sum _{j=1}^k c_ja_j \le \frac{C_0}{c_0} \frac{1}{k} \sum _{j=1}^k a_j , \end{aligned}$$

and the result follows. \(\square \)

8.2 Proof of main results

Proof of Proposition 1

We have that \({\mathcal {T}}_{n,\beta }=\int Z_n^2(t)w_\beta (t) \textrm{d}t\), with \(Z_n=Z_{1,n}\), where \(Z_{k,n}\) is as defined in (7). By Taylor expansion we can write

$$\begin{aligned} \cos (tY_{j})= & {} \cos (tX_{j})-t \sin (t{\tilde{X}}_{j})\Delta _{j},\\ \sin (tY_{j})= & {} \sin (tX_{j})+t \cos (t\breve{X}_{j})\Delta _{j}, \end{aligned}$$

where \(\Delta _{j}=X_{j}(1/S-1)-{\overline{X}}/S\), \({\tilde{X}}_{j}\) and \(\breve{X}_{j}\) both lie between \(Y_{j}\) and \(X_{j}\), \(1\le j \le n\). Thus, \(Z_n(t)= {\mathcal {Z}}_{n,1}(t)+{\mathcal {Z}}_{n,2}(t)+2t{\mathcal {Z}}_{n,3}(t)\), \(t \in {\mathbb {R}}\), with

$$\begin{aligned} {\mathcal {Z}}_{n,1}(t)= & {} \frac{1}{n}\sum _{j=1}^n\{ \cos (tX_j)-\varphi _0(t)\},\\ {\mathcal {Z}}_{n,2}(t)= & {} \frac{1}{n}\sum _{j=1}^n \sin (tX_j),\\ {\mathcal {Z}}_{n,3}(t)= & {} \frac{1}{2n}\sum _{j=1}^n \{\cos (t{\tilde{X}}_j)-\sin (t\breve{X}_j)\} \Delta _{j}. \end{aligned}$$

Let \({\mathcal {T}}_{n,i,\beta }=n\int {\mathcal {Z}}_{n,i}^2(t)w_\beta (t) \textrm{d}t\), \(1\le i \le 3\). Since \(n{\mathcal {T}}_{n,\beta }\) converges in law to its limit distribution, from Lemma 1.4.A and Theorem 1.4.A in Serfling (2009), a sufficient condition for the convergence of the first two moments of \(n{\mathcal {T}}_{n,\beta }\) to those of the limit distribution is that \(\sup _{n}E\{(n{\mathcal {T}}_{n,\beta })^r\}<\infty \) for some \(r >2\). With this aim, we will see that \(E\{|n{\mathcal {T}}_{n,i,\beta }|^3\}=E\{(n{\mathcal {T}}_{n,i,\beta })^3\}\) is bounded, uniformly in \(n\), \(1\le i \le 3\).

Notice that \({\mathcal {Z}}_{n,1}(t)\) is the average of independent elements with zero mean and hence routine calculations show that \(E\{(n{\mathcal {T}}_{n,1,\beta })^3\}\) is bounded, uniformly in \(n\). The same argument can be used to see that \(E\{(n{\mathcal {T}}_{n,2,\beta })^3\}\) is bounded, uniformly in \(n\). Finally, since

$$\begin{aligned} |{\mathcal {Z}}_{n,3}(t)| \le \left( \frac{1}{n}\sum _{j=1}^n \Delta _{j}^2 \right) ^{1/2}=\left\{ (S-1)^2+{\overline{X}}^2\right\} ^{1/2} \le |S-1|+|{\overline{X}}|, \end{aligned}$$

to show that \(E\{(n{\mathcal {T}}_{n,3,\beta })^3\}\) is bounded, uniformly in \(n\), it suffices to see that \(E[\{\sqrt{n}(S-1)\}^6]\) and \(E\{(\sqrt{n}{\bar{X}})^6\}\) are bounded, uniformly in \(n\). As \(\sqrt{n}{\bar{X}} \sim N(0,1)\), we have \(E\{(\sqrt{n}{\bar{X}})^6\}=15\). On the other hand, \(\sqrt{n}S\) has a chi distribution with \(n-1\) degrees of freedom (see Chap. 18 of Johnson et al. 1994). Deutler (1984) calculated the first six cumulants of the chi distribution; using those results, it is easily seen that \(E[\{\sqrt{n}(S-1)\}^6]\) is bounded, uniformly in \(n\). The proof is concluded. \(\square \)
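The two sixth moments invoked at the end of the proof can be checked by simulation. The following Python sketch (with arbitrary values of \(n\) and of the number of replicates) illustrates that \(E\{(\sqrt{n}{\bar{X}})^6\}=15\) and that \(E[\{\sqrt{n}(S-1)\}^6]\) remains bounded as \(n\) grows.

```python
import numpy as np

# Monte Carlo check of the sixth moments used in the proof of Proposition 1.
rng = np.random.default_rng(3)
R = 300_000
print("E{(sqrt(n)*Xbar)^6} =", round(np.mean(rng.standard_normal(R) ** 6), 3))  # equals 15
for n in (10, 100, 1000):
    S = np.sqrt(rng.chisquare(n - 1, size=R) / n)    # sqrt(n)*S has a chi_{n-1} distribution
    print("n =", n, "  E[(sqrt(n)*(S-1))^6] =", round(np.mean((np.sqrt(n) * (S - 1)) ** 6), 3))
```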

Proof of Theorem 1

By Taylor expansion we can write

$$\begin{aligned} \cos (tY_{j,r})= & {} \cos (tX_{j,r})-t \sin (tX_{j,r})\Delta _{j,r}-0.5 t^2 \sin (t{\tilde{X}}_{j,r})\Delta _{j,r}^2,\\ \sin (tY_{j,r})= & {} \sin (tX_{j,r})+t \cos (tX_{j,r})\Delta _{j,r}-0.5 t^2 \cos (t\breve{X}_{j,r})\Delta _{j,r}^2, \end{aligned}$$

where \({\tilde{X}}_{j,r}\) and \(\breve{X}_{j,r}\) both lie between \(Y_{j,r}\) and \(X_{j,r}\). Thus,

$$\begin{aligned}{} & {} \sqrt{n}Z_{k,n}(t) =\frac{1}{\sqrt{n}}\sum _{j,r}\big \{\cos (tX_{j,r})+ \sin (tX_{j,r})-\varphi _0(t)-t \sin (tX_{j,r})\Delta _{j,r}\\{} & {} \quad +t \cos (tX_{j,r})\Delta _{j,r}\big \} +r_{1}(t), \end{aligned}$$

where \(|r_{1}(t)|\le t^2 (1/\sqrt{n})\sum _{j,r}\Delta _{j,r}^2\). Since \(\int t^2w(t) \textrm{d}t<\infty \), from Lemma 2 (a), it follows that \(\Vert r_1\Vert _w=o_P(1)\), as \(m \rightarrow \infty \).

We have that

$$\begin{aligned} -t\frac{1}{\sqrt{n}}\sum _{j,r} \sin (tX_{j,r})\Delta _{j,r}=r_2(t)-r_3(t)-r_4(t)+r_5(t), \end{aligned}$$

where \(r_2\) is the process in Lemma 2 (b.1), \(r_3\) is the process in Lemma 2 (b.4), \(r_4\) is the process in Lemma 2 (b.6), and

$$\begin{aligned} r_5(t)= t^2\varphi _0(t)0.5\frac{1}{\sqrt{n}}\sum _{j,r}(X_{j,r}^2-1). \end{aligned}$$

We also have that

$$\begin{aligned} t\frac{1}{\sqrt{n}}\sum _{j,r} \cos (tX_{j,r})\Delta _{j,r}=-r_6(t)-r_7(t)+r_8(t)+r_9(t), \end{aligned}$$

where \(r_6\) is the process in Lemma 2 (b.2), \(r_7\) is the process in Lemma 2 (b.3), \(r_8\) is the process in Lemma 2 (b.5), and

$$\begin{aligned} r_9(t)= -t\varphi _0(t)\frac{1}{\sqrt{n}}\sum _{j,r}X_{j,r}. \end{aligned}$$

Summarizing,

$$\begin{aligned} \sqrt{n}Z_{k,n}(t)= W_{k,n}(t)+r_{10}(t), \quad t \in {\mathbb {R}}, \end{aligned}$$

with \(\Vert r_{10}\Vert _w=o_P(1)\), as \(m \rightarrow \infty \), and

$$\begin{aligned}{} & {} W_{k,n}(t)\\ {}{} & {} \quad = \frac{1}{\sqrt{n}}\sum _{j,r}\left\{ \cos (tX_{j,r})+\sin (tX_{j,r})-\varphi _0(t) +\frac{1}{2}t^2\varphi _0(t)(X_{j,r}^2-1)-t\varphi _0(t)X_{j,r}\right\} , \quad t \in {\mathbb {R}}. \end{aligned}$$

Notice that \(\{\cos (tX_{j,r})+\sin (tX_{j,r})-\varphi _0(t)+\frac{1}{2}t^2\varphi _0(t)(X_{j,r}^2-1)-t\varphi _0(t)X_{j,r}, \, t \in {\mathbb {R}}, 1 \le r \le n_j, \, 1\le j \le k\}\) are n iid random elements taking values in \(L^2_w\) with covariance kernel C(ts) in (8). By the central limit theorem in Hilbert spaces (see Theorem 2.7 in Bosq 2000), \(W_{k,n}\overset{{\mathcal {D}}}{\rightarrow } Z\) in \(L^2_w\), as \(m \rightarrow \infty \). Finally, the assertion follows by applying the continuous mapping theorem. \(\square \)
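The approximation \(\sqrt{n}Z_{k,n}\approx W_{k,n}\) established above can be illustrated numerically. In the following Python sketch, \(Z_{k,n}(t)\) is computed from the studentized observations \(Y_{j,r}=(X_{j,r}-{\overline{X}}_j)/S_j\), as can be read off from the expansions above, and compared with \(W_{k,n}(t)\) for simulated standard normal data; the values of \(k\), \(m\) and \(t\) are arbitrary choices, and the two quantities get closer as \(k/m\) decreases.

```python
import numpy as np

# Compare sqrt(n)*Z_{k,n}(t) with the leading term W_{k,n}(t) for simulated N(0,1) data.
rng = np.random.default_rng(4)

k, m, t = 10, 200, 1.0
p0 = np.exp(-t ** 2 / 2)                    # phi_0(t)
samples = [rng.standard_normal(m) for _ in range(k)]
n = k * m

Z = W = 0.0
for x in samples:
    S = np.sqrt(np.mean((x - x.mean()) ** 2))
    y = (x - x.mean()) / S                  # studentized observations Y_{j,r}
    Z += np.sum(np.cos(t * y) + np.sin(t * y) - p0)
    W += np.sum(np.cos(t * x) + np.sin(t * x) - p0
                + 0.5 * t ** 2 * p0 * (x ** 2 - 1) - t * p0 * x)
print("sqrt(n)*Z_{k,n}(t) =", round(Z / np.sqrt(n), 4),
      "   W_{k,n}(t) =", round(W / np.sqrt(n), 4))
```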

Proof of Theorem 2

By Taylor expansion, proceeding as in the proof of Proposition 1, we can write

$$\begin{aligned} Z_{k,n}(t)= \frac{1}{n}\sum _{j,r}\left\{ \cos (tX_{j,r})+\sin (tX_{j,r})-\varphi _0(t)\right\} +r_{1n}(t), \end{aligned}$$

where \(Z_{k,n}\) is as defined in (7) and \(|r_{1n}(t)| \le 2|t|\frac{1}{n}\sum _{j,r}|\Delta _{j,r}|\). Since \(\int t^2 w(t)\textrm{d}t=1\), from Lemma 3, one gets that \(\Vert r_{1n}\Vert _w \overset{a.s.}{\rightarrow }\ 0\).

Let \(R_{0k}(t)=\Re \varphi _{0k}(t)\) and \(W_{n,1}(t)=\frac{1}{n}\sum _{j,r}\left\{ \cos (tX_{j,r})-R_{0k}(t) \right\} \). The reasoning in the proof of Lemma 3 can be used to prove (using now the SLLN in Hilbert spaces, see Theorem 2.4 in Bosq (2000)) that \(W_{n,1} \overset{a.s.}{\rightarrow }\ 0\) in \(L^2_w\). Analogously, it can be seen that \(W_{n,2} \overset{a.s.}{\rightarrow }\ 0\) in \(L^2_w\), with \(W_{n,2}(t)=\frac{1}{n}\sum _{j,r}\left\{ \sin (tX_{j,r})-I_{0k}(t)\right\} \) and \(I_{0k}(t)=\Im \varphi _{0k}(t)\).

Summarizing,

$$\begin{aligned} Z_{k,n}(t)= R_{0k}(t)+I_{0k}(t)-\varphi _0(t)+r_{2n}(t), \quad t \in {\mathbb {R}}, \end{aligned}$$

with \(\Vert r_{2n}\Vert _w\overset{a.s.}{\rightarrow }\ 0\). Recalling that \(T_{k,n}=\Vert Z_{k,n}\Vert _w^2\) and noticing that \(\Vert R_{0k}+I_{0k}-\varphi _0\Vert _w=\Vert \varphi _{0k}-\varphi _0\Vert _w<\infty \), we get that

$$\begin{aligned} \left| T_{k,n}-\Vert \varphi _{0k}-\varphi _0\Vert _w^2\right|= & {} \left| \Vert r_{2n}\Vert _w^2+2\langle R_{0k}+I_{0k}-\varphi _0, r_{2n}\rangle _w \right| \\ {}\le & {} \Vert r_{2n}\Vert _w^2+2\Vert r_{2n}\Vert _w \Vert \varphi _{0k}-\varphi _0\Vert _w\overset{a.s.}{\rightarrow }\ 0, \end{aligned}$$

which proves the result. \(\square \)
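Theorem 2 can be illustrated with a small simulation. The sketch below computes \(T_{k,n}=\Vert Z_{k,n}\Vert _w^2\) by numerical integration, with \(Z_{k,n}\) evaluated from the studentized observations as in the proof above; the weight \(w\) is taken to be the \(N(0,1)\) density (an assumption made only for this sketch, consistent with \(\int t^2w(t)\textrm{d}t=1\)), and the choices of \(k\), \(m\) and of the alternative distribution are arbitrary. Under normal data the statistic is close to zero, while under a non-normal alternative it approaches a positive constant.

```python
import numpy as np

# Numerical illustration of Theorem 2: T_{k,n} -> 0 under H_0 and -> a positive limit otherwise.
rng = np.random.default_rng(5)
t = np.linspace(-8, 8, 1601)
dt = t[1] - t[0]
w = np.exp(-t ** 2 / 2) / np.sqrt(2 * np.pi)   # weight: N(0,1) density (assumed for this sketch)
phi0 = np.exp(-t ** 2 / 2)

def T_kn(samples):
    Z, n = np.zeros_like(t), 0
    for x in samples:
        S = np.sqrt(np.mean((x - x.mean()) ** 2))
        y = (x - x.mean()) / S                  # studentized observations
        ty = np.outer(t, y)
        Z += np.sum(np.cos(ty) + np.sin(ty), axis=1)
        n += len(x)
    Z = Z / n - phi0
    return np.sum(Z ** 2 * w) * dt              # approximates ||Z_{k,n}||_w^2

k, m = 20, 200
print("T_{k,n}, all samples N(0,1)      :",
      round(T_kn([rng.standard_normal(m) for _ in range(k)]), 5))
print("T_{k,n}, all samples exponential :",
      round(T_kn([rng.exponential(size=m) for _ in range(k)]), 5))
```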

Proof of Theorem 3

We have that

$$\begin{aligned} \text{ power }:=P(T_{k_0, n_0} \ge t_{k_0, n_0, \alpha }) = \Sigma _0+\Sigma _1+\cdots +\Sigma _{k_0}, \end{aligned}$$

where

$$\begin{aligned} \Sigma _j=\frac{1}{\left( {\begin{array}{c}k\\ k_0\end{array}}\right) } \sum _{\textbf{i}\in {\mathfrak {I}}_j} a(\textbf{i}), \end{aligned}$$

\({\mathfrak {I}}_0=\{\textbf{i}=(i_1, \ldots , i_{k_0}) : r< i_{1}<\cdots < i_{k_0} \le k \}\), \({\mathfrak {I}}_j=\{\textbf{i}=(i_1, \ldots , i_{k_0}) : 1\le i_1<\cdots< i_j \le r< i_{j+1}<\cdots < i_{k_0} \le k \}\), \(1\le j \le k_0\), and \(a(\textbf{i})\) is as defined in (15). If, for some \(j\), \({\mathfrak {I}}_j\) is empty, then we define \(\Sigma _j=0\).

Suppose that \({\mathfrak {I}}_j\) is not empty. Let \(\textbf{i}\in {\mathfrak {I}}_j\), \(n_0=n_{i_1}+\cdots +n_{i_{k_0}}\), \(f_{\textbf{i}}=(1/n_0)\sum _{v=1}^{j}n_{i_v}\),

$$\begin{aligned} {\widetilde{\varphi }}_{\textbf{i}}=\frac{1}{n_0}\sum _{v=1}^{j}n_{i_v}\varphi _{i_v}+(1-f_{\textbf{i}})\varphi _0, \quad {\varphi }_{\textbf{i}}=\frac{1}{n_0f_{\textbf{i}}}\sum _{v=1}^{j}n_{i_v}\varphi _{i_v}. \end{aligned}$$

Then, \({\widetilde{\varphi }}_{\textbf{i}}-\varphi _0=f_{\textbf{i}}({\varphi }_{\textbf{i}}-\varphi _0),\) and thus, if the sample sizes satisfy (2) and \(\varphi _1, \ldots , \varphi _r\) satisfy (12), we obtain

$$\begin{aligned} \frac{j}{k_0} M_1 \le \Vert {\widetilde{\varphi }}_{\textbf{i}}-\varphi _0\Vert _w \le \frac{j}{k_0} M_2, \quad \forall \textbf{i}\in {\mathfrak {I}}_j, \quad 1\le j \le \min \{r,k_0\}, \end{aligned}$$

where \(M_1\) and \(M_2\) are two positive constants (depending on \(\tau \), \(c_0\) and \(C_0\)).

Let \(0<p_0<p\). We have that

$$\begin{aligned} \text{ power }=\sum _{u:\, u \le p_0 k_0} \Sigma _u+ \sum _{u:\, u> p_0 k_0} \Sigma _u:=\Sigma _{\le p_0}+\Sigma _{ > p_0 } \, . \end{aligned}$$

Let \(H\sim H(k,r,k_0)\), where \(H(k,r,k_0)\) stands for the hypergeometric distribution: from a population with \(k\) units, \(r\) of type A and \(k-r\) of type B, a sample of size \(k_0\) is drawn without replacement, and \(H\) is the number of type A units in the sample. Let \({\mathbb {I}}(p_0)=\{\textbf{i}\in {\mathfrak {I}}_u:\, u > p_0 k_0\}\) and \(N_{p_0}=\text{ card }\{{\mathbb {I}}(p_0)\}\). We can write

$$\begin{aligned} \Sigma _{> p_0 }=\sum _{u:\, u > p_0 k_0} P(H=u)+\frac{1}{\left( {\begin{array}{c}k\\ k_0\end{array}}\right) }\sum _{\textbf{i}\in {\mathbb {I}}(p_0)}\left( a(\textbf{i})-1\right) :=W_1+W_2. \end{aligned}$$

We can write \(W_1=1-P(H \le k_0p_0)\). Since (see Hoeffding 1963) \(P(H \le k_0p_0)\le \exp \{-2(r/k-p_0)^2k_0\}\) and \(r/k\rightarrow p>p_0\), it follows that \(W_1\) converges to 1, as \(m\rightarrow \infty \). On the other hand, from Corollary 1, for each \(\textbf{i}\in {\mathfrak {I}}_u\), with \(u > p_0 k_0\), we have that \(a(\textbf{i}) \rightarrow 1\) and thus

$$\begin{aligned} {\bar{a}}=\frac{1}{N_{p_0}}\sum _{ \textbf{i}\in {\mathbb {I}}(p_0) } a(\textbf{i}) \rightarrow 1. \end{aligned}$$

Since \(W_2=({\bar{a}}-1) N_{p_0}/\left( {\begin{array}{c}k\\ k_0\end{array}}\right) \) and \(0\le N_{p_0}/\left( {\begin{array}{c}k\\ k_0\end{array}}\right) \le 1\), it follows that \(W_2\rightarrow 0.\) Hence \(\Sigma _{ > p_0 }=W_1+W_2\rightarrow 1\) and, since \(\text{ power } \ge \Sigma _{ > p_0 }\) and \(\text{ power }\le 1\), it follows that \(\text{ power }\rightarrow 1\). \(\square \)
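The Hoeffding-type bound for the hypergeometric tail used above can be checked numerically; in the following Python sketch the values of \(p\), \(p_0\), \(k\) and \(k_0\) are illustrative choices.

```python
import numpy as np
from scipy.stats import hypergeom

# Check of P(H <= k0*p0) <= exp{-2*(r/k - p0)^2 * k0}, with H ~ hypergeometric(k, r, k0).
p, p0 = 0.3, 0.15
for k in (100, 400, 1600):
    r, k0 = int(p * k), k // 2
    tail = hypergeom.cdf(np.floor(k0 * p0), k, r, k0)   # scipy parametrization: cdf(x, M=k, n=r, N=k0)
    bound = np.exp(-2 * (r / k - p0) ** 2 * k0)
    print(f"k={k:5d}   P(H <= k0*p0) = {tail:.2e}   bound = {bound:.2e}")
```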

Proof of Theorem 4

\({\mathbb {T}}_{0,k}\) is a sum of independent zero-mean random variables. Thus, it suffices to see that the Lindeberg condition below is met,

$$\begin{aligned} h(\varepsilon )= & {} \frac{\sum _{i=1}^kE_0\left\{ ({\mathfrak {T}}_i-\mu _{0,n_i})^2\textbf{1}[({\mathfrak {T}}_i-\mu _{0,n_i})^2>\varepsilon \sum _{i=1}^k\tau _{0,n_i}^2]\right\} }{\sum _{i=1}^k\tau _{0,n_i}^2}\\ {}\rightarrow & {} 0, \quad \forall \varepsilon >0, \quad \text{ as } k \rightarrow \infty , \end{aligned}$$

where \(\textbf{1}(\cdot )\) stands for the indicator function. From Proposition 1 it follows that \(\sum _{i=1}^k\tau _{0,n_i}^2 \ge k\tau _0\), \(\forall k\), for some \(\tau _0>0\). As a consequence, we have that,

$$\begin{aligned} 0<h(\varepsilon ) \le H(\varepsilon )=\frac{1}{k\tau _0}\sum _{i=1}^k \mu (n_i,k,\varepsilon ), \end{aligned}$$

with \(\mu (n_i,k,\varepsilon )= E_0\left\{ ({\mathfrak {T}}_i-\mu _{0,n_i})^2\textbf{1}[({\mathfrak {T}}_i-\mu _{0,n_i})^2>\varepsilon \tau _0 k]\right\} \). From Proposition 1, it also follows that \(\displaystyle \lim _{k \rightarrow \infty } \sup _n \mu (n,k,\varepsilon )=0.\) This fact implies that \(H(\varepsilon )\rightarrow 0\) as \(k \rightarrow \infty \), and thus the result is proven. \(\square \)

Proof of Theorem 5

We first prove that

$$\begin{aligned} {\mathbb {T}}_{r,k}=\frac{\sum _{i=1}^k({\mathfrak {T}}_i-\mu _{n_i})}{\sqrt{\sum _{i=1}^k\tau _{n_i}^2}} \overset{{\mathcal {D}}}{\rightarrow } Z\sim N(0,1), \quad \text{ as } k \rightarrow \infty , \end{aligned}$$
(17)

where

$$\begin{aligned} \mu _{n_i}=\left\{ \begin{array}{ll} n_i \Delta _{X_i,n_i} \quad &{} \text{ if } 1 \le i \le r,\\ \mu _{0,n_i} &{} \text{ if } r+1 \le i \le k, \end{array} \right. \qquad \tau _{n_i}^2=\left\{ \begin{array}{ll} n_i \tau _{X_i,n_i}^2 \quad &{} \text{ if } 1 \le i \le r,\\ \tau _{0,n_i}^2 &{} \text{ if } r+1 \le i \le k. \end{array} \right. \end{aligned}$$

With this aim, we prove that the Lindeberg condition below is met,

$$\begin{aligned} h(\varepsilon )=\frac{ \sum _{i=1}^kE\left\{ ({\mathfrak {T}}_i-\mu _{n_i})^2\textbf{1}[({\mathfrak {T}}_i-\mu _{n_i})^2>\varepsilon \sum _{i=1}^k\tau _{n_i}^2]\right\} }{\sum _{i=1}^k\tau _{n_i}^2} \rightarrow 0, \quad \forall \varepsilon >0, \quad \text{ as } k \rightarrow \infty . \end{aligned}$$

We can write \( h(\varepsilon )= h_1(\varepsilon )+ h_2(\varepsilon )\), with

$$\begin{aligned} h_1(\varepsilon )= & {} \frac{ \sum _{i=1}^rE\left\{ ({\mathfrak {T}}_i-\mu _{n_i})^2\textbf{1}[({\mathfrak {T}}_i-\mu _{n_i})^2>\varepsilon \sum _{i=1}^k\tau _{n_i}^2]\right\} }{\sum _{i=1}^k\tau _{n_i}^2}, \end{aligned}$$

and \(h_2(\varepsilon )=h(\varepsilon )-h_1(\varepsilon )\).

From Proposition 1 and Assumption 1(b), there exist \(0<\varsigma _1 \le \varsigma _2<\infty \) such that

$$\begin{aligned} \varsigma _1 \le \min \{\inf _n \tau _{0,n}^2, \inf _{i} \tau _{X_i,n_i}^2 \} \le \max \{\sup _n \tau _{0,n}^2, \sup _{i} \tau _{X_i,n_i}^2 \} \le \varsigma _2. \end{aligned}$$
(18)

As a consequence, \(\sum _{i=1}^k\tau _{n_i}^2 \ge k\varsigma _1\). Now proceeding as in the proof of Theorem 4, we have that \(h_2(\varepsilon )\rightarrow 0\), \(\forall \varepsilon >0\).

Taking into account that the sample sizes satisfy (2), we can write

$$\begin{aligned} \sum _{i=1}^k\tau _{n_i}^2 \ge \sum _{i=1}^r\tau _{n_i}^2 = \sum _{i=1}^r n_i \tau _{X_i,n_i}^2 \ge m r c_0 \varsigma _1, \end{aligned}$$

and hence, using that \(n_i \le C_0m\), \(1\le i \le r\),

$$\begin{aligned} h_1(\varepsilon ) \le H_1(\varepsilon )= \frac{ C_0\sum _{i=1}^rE\left\{ \big (\sqrt{n_i}({\mathcal {T}}_i-\Delta _{X_i,n_i}) \big )^2\textbf{1}[ \big (\sqrt{n_i}({\mathcal {T}}_i-\Delta _{X_i,n_i}) \big )^2>\varepsilon r \varsigma _1 c_0/C_0 ]\right\} }{r c_0 \varsigma _1}. \end{aligned}$$

Since \(r/k \rightarrow p \in (0,1]\), as \(k\rightarrow \infty \), we have that \(r\rightarrow \infty \), as \(k \rightarrow \infty \). From Assumption 1(a), it follows that

$$\begin{aligned} \displaystyle \lim _{r\rightarrow \infty } E\left\{ \big (\sqrt{n_i}({\mathcal {T}}_i-\Delta _{X_i,n_i}) \big )^2\textbf{1}[ \big (\sqrt{n_i}({\mathcal {T}}_i-\Delta _{X_i,n_i}) \big )^2>\varepsilon r \varsigma _1 c_0/C_0 ]\right\} =0, \quad 1 \le i \le r. \end{aligned}$$

As a consequence, we have that \(h_1(\varepsilon )\rightarrow 0\), as \(k\rightarrow \infty \), \(\forall \varepsilon >0\). Therefore, (17) has been proven.

Now we can study the power. We have that

$$\begin{aligned} \text{ power }=P({\mathbb {T}}_{0,k} \ge z_{1-\alpha }) = P \left( {\mathbb {T}}_{r,k}\ge z_{1-\alpha } \frac{\sqrt{\sum _{i=1}^k\tau _{0,n_i}^2}}{\sqrt{\sum _{i=1}^k\tau _{n_i}^2}} -\frac{\sum _{i=1}^r(n_i \Delta _{X_i,n_i}-\mu _{0,n_i})}{\sqrt{\sum _{i=1}^k\tau _{n_i}^2}}\right) . \end{aligned}$$

From (17), we can write

$$\begin{aligned} \text{ power } \approx \Phi \left( \frac{\sum _{i=1}^r(n_i \Delta _{X_i,n_i}-\mu _{0,n_i})}{\sqrt{\sum _{i=1}^k\tau _{n_i}^2}} - z_{1-\alpha } \frac{\sqrt{\sum _{i=1}^k\tau _{0,n_i}^2}}{\sqrt{\sum _{i=1}^k\tau _{n_i}^2}} \right) . \end{aligned}$$

From (18) and using that the sample sizes satisfy (2), we can write

$$\begin{aligned} \frac{1}{r} \sum _{i=1}^k\tau _{n_i}^2=\frac{1}{r} \sum _{i=1}^r n_i \tau _{X_i,n_i}^2 + \frac{1}{r} \sum _{i=r+1}^k \tau _{0,n_i}^2 \le C_0 m \varsigma _2 \frac{k}{r}. \end{aligned}$$

Let \(\varepsilon >0\) be such that \(p-\varepsilon >0\). Since \(r/k \rightarrow p\), as \(k \rightarrow \infty \), we have that \(p-\varepsilon \le r/k\), \(\forall k \ge k(\varepsilon )\), for some large enough \(k(\varepsilon )\). Thus, \(\forall k \ge k(\varepsilon )\),

$$\begin{aligned} \frac{1}{r} \sum _{i=1}^k\tau _{n_i}^2 \le C_0 m \varsigma _2 /(p-\varepsilon ), \end{aligned}$$

and hence, using Assumption 1(c), we have that

$$\begin{aligned} \frac{\sum _{i=1}^r(n_i \Delta _{X_i,n_i}-\mu _{0,n_i})}{\sqrt{\sum _{i=1}^k\tau _{n_i}^2}} \ge \sqrt{r}\frac{\eta c_0 \sqrt{m}}{\sqrt{ C_0 \varsigma _2 /(p-\varepsilon )}}, \end{aligned}$$
(19)

\(\forall k \ge k(\varepsilon )\). On the other hand, from (18),

$$\begin{aligned} \frac{\sum _{i=1}^k\tau _{0,n_i}^2}{\sum _{i=1}^k\tau _{n_i}^2}= \frac{\sum _{i=1}^k\tau _{0,n_i}^2}{\sum _{i=1}^r n_i \tau _{X_i,n_i}^2+ \sum _{i=r+1}^k\tau _{0,n_i}^2} \le \frac{\varsigma _2}{\varsigma _1}<\infty . \end{aligned}$$
(20)

From (19) and (20),

$$\begin{aligned} \frac{\sum _{i=1}^r(n_i \Delta _{X_i,n_i}-\mu _{0,n_i})}{\sqrt{\sum _{i=1}^k\tau _{n_i}^2}} - z_{1-\alpha } \frac{\sqrt{\sum _{i=1}^k\tau _{0,n_i}^2}}{\sqrt{\sum _{i=1}^k\tau _{n_i}^2}} \ge \sqrt{r}\frac{\eta c_0 \sqrt{m}}{\sqrt{ C_0 \varsigma _2 /(p-\varepsilon )}}-\frac{\varsigma _2}{\varsigma _1}|z_{1-\alpha }|, \end{aligned}$$

\(\forall k \ge k(\varepsilon )\), and the right-hand side of the above inequality goes to \(\infty \) as \(k\rightarrow \infty \), implying that power\(\rightarrow 1\), as \(k\rightarrow \infty \). \(\square \)
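To see the behaviour of this lower bound quantitatively, the following sketch evaluates the argument of \(\Phi \) in the power approximation for purely hypothetical values of the constants (the values of \(\eta \), \(c_0\), \(C_0\), \(\varsigma _1\), \(\varsigma _2\), \(p\), \(\varepsilon \), \(m\) and \(\alpha \) below are illustrative choices, not taken from the paper): as \(k\), and hence \(r\), grows, the argument diverges and the approximate power tends to 1.

```python
import numpy as np
from scipy.stats import norm

# Illustrative evaluation of the lower bound on the argument of Phi in the power approximation.
eta, c0, C0, s1, s2, p, eps, m, alpha = 0.5, 1.0, 2.0, 1.0, 1.0, 0.5, 0.1, 100, 0.05
z = norm.ppf(1 - alpha)
for k in (10, 40, 160, 640):
    r = int(p * k)
    arg = np.sqrt(r) * eta * c0 * np.sqrt(m) / np.sqrt(C0 * s2 / (p - eps)) - (s2 / s1) * z
    print(f"k={k:4d}   lower bound for the argument of Phi = {arg:7.2f}   Phi = {norm.cdf(arg):.6f}")
```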