Large-Scale Simultaneous Testing Using Kernel Density Estimation

Abstract

A century ago, when Student’s t-statistic was introduced, no one could have imagined its range of application in the modern era. It is now used in highly multiple hypothesis testing, feature selection and ranking, high-dimensional signal detection, and related problems. Student’s t-statistic is constructed from the empirical distribution function (EDF). An alternative to the EDF is the kernel density estimate (KDE), a smoothed version of the EDF. The novelty of this work lies in an alternative to Student’s t-test that uses the KDE technique, and in an exploration of the usefulness of the KDE-based t-test for large-scale simultaneous hypothesis testing. An optimal bandwidth parameter for the KDE approach is derived by minimizing the asymptotic error between the true p-value and its normal-approximation estimate. If the KDE-based approach is used for large-scale simultaneous testing, a natural question is when the method fails to control the error rate. We show that the suggested KDE-based method controls the false discovery rate (FDR) provided the total number of tests diverges at a smaller order of magnitude than N^{3/2}, where N is the total sample size. We compare our method with several possible alternatives with respect to FDR, and show in simulations that it produces a lower proportion of false discoveries than its competitors; that is, it controls the false discovery rate better than its competitors. These empirical studies show that the proposed method can be successfully applied in practice. Its usefulness is further illustrated through a gene expression data example.
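To make the procedure concrete, here is a minimal sketch (not the authors’ implementation) of the pipeline the abstract describes: a two-sample t-type statistic whose variance estimate carries an added squared-bandwidth term, two-sided p-values from the normal approximation, and Benjamini–Hochberg screening. The bandwidth below is a placeholder; the paper derives an optimal \(h_{i}\) by minimizing the asymptotic p-value error, and that formula is not restated here.

```python
import numpy as np
from scipy.stats import norm

def kde_t_statistics(X, Y, h):
    """Two-sample t-type statistics with a KDE-style squared-bandwidth term
    added to the variance estimate; one statistic per row (feature).
    X: (d, n) array, Y: (d, m) array, h: (d,) bandwidths."""
    n, m = X.shape[1], Y.shape[1]
    num = X.mean(axis=1) - Y.mean(axis=1)
    var = (X.var(axis=1, ddof=1) / n + Y.var(axis=1, ddof=1) / m
           + h**2 * (1.0 / n + 1.0 / m))
    return num / np.sqrt(var)

def bh_reject(pvals, alpha=0.05):
    """Benjamini-Hochberg: reject the j smallest p-values, where j is the
    largest index with p_(j) <= alpha * j / d."""
    d = len(pvals)
    order = np.argsort(pvals)
    hits = np.nonzero(np.sort(pvals) <= alpha * np.arange(1, d + 1) / d)[0]
    reject = np.zeros(d, dtype=bool)
    if hits.size:
        reject[order[: hits[-1] + 1]] = True
    return reject

# Toy run: d features, the first 50 carry a mean shift.
rng = np.random.default_rng(0)
d, n, m = 1000, 25, 25
X = rng.normal(0.0, 1.0, size=(d, n))
X[:50] += 1.0
Y = rng.normal(0.0, 1.0, size=(d, m))
h = 0.5 * (n + m) ** (-0.3) * np.ones(d)   # placeholder bandwidth, not the paper's optimum
T = kde_t_statistics(X, Y, h)
pvals = 2.0 * (1.0 - norm.cdf(np.abs(T)))  # normal-approximation two-sided p-values
print("rejections:", int(bh_reject(pvals).sum()))
```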

References

  • Benjamini, Y. and Hochberg, Y. (1995). Controlling the false discovery rate: a practical and powerful approach to multiple testing. J. R. Statist. Soc. Ser. B 57, 289–300.

  • Candes, E. and Barber, R.F. (2018). Lecture notes, Stats 300C, Stanford University. https://statweb.stanford.edu/~candes/stats300c/Lectures/Lecture7.pdf.

  • Fan, J., Hall, P. and Yao, Q. (2007). To how many simultaneous hypothesis tests can normal, Student’s t or bootstrap calibration be applied? J. Am. Statist. Assoc. 102, 1282–1288.

  • Ghosh, S. and Polansky, A.M. (2014). Smoothed and iterated bootstrap confidence regions for parameter vectors. J. Multivar. Anal. 132, 172–182.

  • Hall, P. (1992). The Bootstrap and Edgeworth Expansion. Springer, New York.

  • Hall, P., Jing, B.Y. and Lahiri, S.N. (1998). On the sampling window method for long-range dependent data. Statist. Sin. 8, 1189–1204.

  • Hochberg, Y. (1988). A sharper Bonferroni procedure for multiple tests of significance. Biometrika 75, 800–802.

  • Hommel, G. (1988). A stagewise rejective multiple test procedure based on a modified Bonferroni test. Biometrika 75, 383–386.

  • Karimi, S. and Farrokhnia, M. (2014). Leukemia and small round blue-cell tumor cancer detection using microarray gene expression data set: combining data dimension reduction and variable selection technique. Chemom. Intell. Lab. Syst. 139, 6–14.

  • Khan, J., Wei, J.S., Ringner, M., Saal, L.H., Ladanyi, M., Westermann, F., Berthold, F., Schwab, M., Antonescu, C.R. and Peterson, C. (2001). Classification and diagnostic prediction of cancers using gene expression profiling and artificial neural networks. Nat. Med. 7, 673–679.

  • Liu, W. and Shao, Q. (2014). Phase transition and regularized bootstrap in large-scale t-tests with false discovery rate control. Ann. Statist. 42, 2003–2025.

  • Murie, C., Woody, O., Lee, A. and Nadon, R. (2009). Comparison of small n statistical tests of differential expression applied to microarrays. BMC Bioinformatics 10, 45.

  • Polansky, A.M. (2001). Bandwidth selection for the smoothed bootstrap percentile method. Comput. Stat. Data Anal. 36, 333–349.

  • Polansky, A.M. (2011). Introduction to Statistical Limit Theory. Chapman and Hall/CRC, Boca Raton.

  • Polansky, A.M. and Schucany, W.R. (1997). Kernel smoothing to improve bootstrap confidence intervals. J. R. Statist. Soc. Ser. B 59, 821–838.

  • Smyth, G. (2004). Linear models and empirical Bayes methods for assessing differential expression in microarray experiments. Statist. Appl. Genet. Mol. Biol. 3, Article 3.

  • Storey, J. (2002). A direct approach to false discovery rates. J. R. Statist. Soc. Ser. B 64, 479–498.

  • Tusher, V., Tibshirani, R. and Chu, G. (2001). Significance analysis of microarrays applied to the ionizing radiation response. Proc. Natl. Acad. Sci. USA 98, 5116–5121.

  • Wand, M.P. and Jones, M.C. (1995). Kernel Smoothing. Chapman and Hall, London.

  • Westfall, P.H. and Young, S.S. (1993). Resampling-Based Multiple Testing: Examples and Methods for p-Value Adjustment. Wiley, New York.


Acknowledgments

We would like to thank the Editor, Associate Editor and two anonymous referees for their careful reading and constructive suggestions which improved the readability of the paper.

Author information

Corresponding author

Correspondence to Santu Ghosh.

Appendix A: Proofs

The proofs of Lemmas 1 and 5 are straightforward; they follow from the fact that a kernel density estimator is a convolution of the empirical distribution and the kernel function with respect to counting measure, and are therefore omitted.
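For concreteness, the identity in question is standard (we state it here only as a reading aid): if \(\hat{F}_{n}\) denotes the EDF of a sample \(X_{1},\ldots,X_{n}\) and K is a kernel with bandwidth h > 0, then

$$ \hat{f}_{h}(x)=\frac{1}{nh}\sum\limits_{j=1}^{n}K\left( \frac{x-X_{j}}{h}\right)=\int \frac{1}{h}K\left( \frac{x-y}{h}\right) d\hat{F}_{n}(y), $$

so the KDE is exactly the convolution of the scaled kernel with the empirical distribution.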

Proof of Lemma 2.

To prove Lemma 2, it is sufficient to show that

$$ \limsup_{N\to\infty}\sup_{1\leq i <\infty} N^{k}h_{i}< \infty. $$

Suppose, to the contrary, that

$$ \limsup_{N\to\infty}(\sup_{i\geq 1} N^{k}h_{i})=\infty. $$

Then by the definition of \(\limsup \), we have

$$ \inf_{N\geq 1}\sup_{j\geq N}\sup_{i\geq 1}j^{k}h_{i}=\infty. $$

The above line implies that for all N ≥ 1,

$$ \sup_{j\geq N}\sup_{i\geq 1}j^{k}h_{i}=\infty, $$

and now, by switching the order of suprema for a doubly indexed sequence, we have

$$ \sup_{i\geq 1}\sup_{j\geq 1} j^{k}h_{i}=\infty. $$

Again, \(\sup _{i\geq 1}\sup _{j\geq 1} j^{k}h_{i}=\infty \) implies that there is at least one i such that

$$ \sup_{j\geq 1} j^{k}h_{i}=\infty. $$

But Assumption (ii) implies that for each i, the sequence \(\{j^{k}h_{i}\}_{j=1}^{\infty }\) is bounded, and this contradicts the fact that

$$ \sup_{j\geq 1} j^{k}h_{i}=\infty. $$

Thus we must have

$$ \limsup_{N\to\infty}(\sup_{i\geq 1} N^{k}h_{i})<\infty, $$

and consequently Lemma 2 is established. □

The proofs of Theorems 3 and 6 depend on the Edgeworth expansion of the distribution of

$$ T_{i}=\frac{\bar{X}_{i}-\bar{Y}_{i}}{\sqrt{S^{2}_{x,i}/n+S^{2}_{y,i}/m}}, $$

i = 1,…,d. The following result considers the Edgeworth expansion of the distribution of Ti.

Proposition 9.

Under the assumptions (i), (iii), and (iv), the distribution of Ti has the following Edgeworth expansion:

$$ P(T_{i}\leq x)={\Phi}(x)+N^{-1/2}q_{1,i}(x)\phi(x)+N^{-1}q_{2,i}(x)\phi(x)+O(N^{-3/2}), $$

which holds uniformly in i, where q1,i(x) and q2,i(x) are given by

$$ \begin{array}{@{}rcl@{}} q_{1,i}(x)&=&\frac{1}{6}\eta_{1,i}^{-3/2} \eta_{2,i}(2x^2+1),\\ q_{2,i}(x)&=& x\left[\frac{1}{12}\eta_{1,i}^{-2}\eta_{3,i}(x^2-3)-\frac{1}{18}\eta^{-3}_{1,i}\eta^{2}_{2,i}(x^4+2x^2-3)\right.\\&&\quad\left.-\frac{1}{4}\eta^{-2}_{1,i}\{\eta_{4,i} (x^{2}+3)+2\frac{\sigma^{2}_{x,i} \sigma^{2}_{y,i}}{{r^{2}_{x}}{r^{2}_{y}}}\}\right], \end{array} $$

where

$$ \eta_{1,i}=\frac{\sigma^{2}_{x,i}}{r_{x}}+\frac{\sigma^{2}_{y,i}}{r_{y}}, \quad \eta_{2,i}=\frac{\gamma_{x,i}}{{r^{2}_{x}}}-\frac{\gamma_{y,i}}{{r^{2}_{y}}}, \quad \eta_{3,i}=\frac{\kappa_{x,i}}{{r^{3}_{x}}}+\frac{\kappa_{y,i}}{{r^{3}_{y}}}, \quad\text{and}\quad \eta_{4,i}=\frac{\sigma^{4}_{x,i}}{{r^{3}_{x}}}+\frac{\sigma^{4}_{y,i}}{{r^{3}_{y}}}, $$

and

$$ \begin{array}{@{}rcl@{}} \gamma_{x,i}&=&E(X_{i}-\mu_{x,i})^{3}, \quad \gamma_{y,i}=E(Y_{i}-\mu_{y,i})^{3}, \quad \kappa_{x,i}=E(X_{i}-\mu_{x,i})^{4}-3\sigma^{4}_{x,i},\\ &&\text{and}\quad \kappa_{y,i}=E(Y_{i}-\mu_{y,i})^{4}-3\sigma^{4}_{y,i}. \end{array} $$

In these expressions, γx,i and γy,i denote the skewness of the i th components of X and Y, and κx,i and κy,i denote the kurtosis of the i th components of X and Y. The proof of Proposition 9 is available in the accompanying supplement.
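To illustrate Proposition 9, the following Monte Carlo sketch (ours, under assumed distributions; the sample sizes and the exponential/normal populations are illustrative choices, not the paper’s) compares the empirical distribution of Ti with the plain normal approximation and with the one-term Edgeworth correction \({\Phi}(x)+N^{-1/2}q_{1,i}(x)\phi(x)\):

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
n, m = 30, 30
N = n + m
rx, ry = n / N, m / N
reps = 100_000

# X ~ Exp(1): sigma_x^2 = 1, gamma_x = E(X - mu_x)^3 = 2.
# Y ~ N(1, 1): sigma_y^2 = 1, gamma_y = 0. Means are equal, so H0 holds.
X = rng.exponential(1.0, size=(reps, n))
Y = rng.normal(1.0, 1.0, size=(reps, m))
T = (X.mean(1) - Y.mean(1)) / np.sqrt(X.var(1, ddof=1) / n + Y.var(1, ddof=1) / m)

eta1 = 1.0 / rx + 1.0 / ry          # sigma_x^2 / r_x + sigma_y^2 / r_y
eta2 = 2.0 / rx**2 - 0.0 / ry**2    # gamma_x / r_x^2 - gamma_y / r_y^2
q1 = lambda x: eta1 ** (-1.5) * eta2 * (2 * x**2 + 1) / 6

for x in (-2.0, -1.0, 0.0, 1.0, 2.0):
    emp = (T <= x).mean()
    edg = norm.cdf(x) + q1(x) * norm.pdf(x) / np.sqrt(N)
    print(f"x={x:+.1f}  empirical={emp:.4f}  "
          f"normal={norm.cdf(x):.4f}  edgeworth={edg:.4f}")
```

The skewness correction pushes the approximation in the right direction: with a right-skewed X sample, \(P(T_{i}\leq 0)\) exceeds 1/2, which the plain normal approximation misses.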

Proof of Theorem 3.

Let \(\tilde {\sigma }_{i}=\sqrt {\frac {S^{2}_{x,i}}{r_{x}}+\frac {S^{2}_{y,i}}{r_{y}}+{h^{2}_{i}}(r^{-1}_{x}+r^{-1}_{y})}\); then we can express \(\tilde {T}_{i}\) as

$$ \tilde{T}_{i}=T_{i} \frac{\hat{\sigma}_{i}}{\tilde{\sigma}_{i}}, $$

where

$$ T_{i}=\frac{\sqrt{N}(\bar{X}_{i}-\bar{Y}_{i})}{\sqrt{\frac{S^{2}_{x,i}}{r_{x}}+\frac{S^{2}_{y,i}}{r_{y}}}}, $$

and \(\hat {\sigma }_{i}=\sqrt {\frac {S^{2}_{x,i}}{r_{x}}+\frac {S^{2}_{y,i}}{r_{y}}}\). Let \(\tilde {t}_{i}\) denote the observed value of the two-sample t statistic, \(\tilde {T}_{i}\), based on the kernel density estimator technique. Then the p-value corresponding to H0i is given by

$$ \begin{array}{@{}rcl@{}} P(|\tilde{T}_{i}|\geq |\tilde{t_{i}}|)= 1-\{P(\tilde{T}_{i}\leq |\tilde{t_{i}}|)-P(\tilde{T}_{i}\leq -|\tilde{t_{i}}|)\}. \end{array} $$
(A.1)

Let \(\beta _{i}=P(\tilde {T}_{i}\leq |\tilde {t_{i}}|)\). Then \(|\tilde {t_{i}}|\) is the \(\beta_{i}\)-th quantile of the distribution of \(\tilde {T}_{i}\). Under assumptions (i) and (iii), \(P\{\tilde {T}_{i}\leq x\}\) has an Edgeworth expansion similar to that of \(P\{T_{i}\leq x\}\), with q1,i(x) and q2,i(x) in Proposition 9 replaced by \(\tilde {q}_{1,i}(x)\) and \(\tilde {q}_{2,i}(x)\). Using techniques similar to those of Polansky and Schucany (1997), we can show that

$$ \begin{array}{@{}rcl@{}} \tilde{q}_{1,i}(x)&=& q_{1,i}(x)+O(N^{-2k}), \\ \tilde{q}_{2,i}(x)&=& q_{2,i}(x)+O(N^{-2k}), \end{array} $$
(A.2)

hold uniformly in i, since \(\sup _{i}h_{i}=O(N^{-k})\). Under the assumptions (i), (ii), and (iii), we can obtain the Cornish-Fisher expansion for \(|\tilde {t_{i}}|\) as

$$ |\tilde{t_{i}}|=z_{\beta_{i}}-N^{-1/2}\tilde{q}_{1,i}(z_{\beta_{i}})+N^{-1}\tilde{q}_{21,i}(z_{\beta_{i}})+O(N^{-3/2}), $$

uniformly in i, since \(\sup _{i}h_{i}=O(N^{-k})\), where the function \(\tilde {q}_{21,i}(\cdot )\) is a function of \(\tilde {q}_{1,i}(\cdot )\) and \(\tilde {q}_{2,i}(\cdot )\); an excellent review of Cornish-Fisher expansions can be found in Hall (1992). Equation A.2 implies that

$$ |\tilde{t_{i}}| = z_{\beta_{i}}-N^{-1/2}q_{1,i}(z_{\beta_{i}})+N^{-1}q_{21,i}(z_{\beta_{i}})+O(N^{-\min\{2k+1/2,3/2\}}), $$
(A.3)

holds uniformly in i. Considering \(P\{\tilde {T}_{i}\leq |\tilde {t_{i}}|\}\), we have that

$$ \begin{array}{@{}rcl@{}} P\{\tilde{T}_{i}\leq |\tilde{t_{i}}|\} &=& P\{T_{i}\leq |\tilde{t}_{i}| \frac{\tilde{\sigma}_{i}}{\hat{\sigma}_{i}}\}=P\left\{T_{i}\leq |\tilde{t}_{i}| \left( 1+\frac{{h^{2}_{i}}(r^{-1}_{x}+r^{-1}_{y})}{\frac{S^{2}_{x,i}}{r_{x}}+\frac{S^{2}_{y,i}}{r_{y}}}\right)^{1/2}\right\}\\&=& P\left\{\!T_{i} \leq |\tilde{t}_{i}| \left( 1 + \frac{1}{2}\frac{{h^{2}_{i}}(r^{-1}_{x}+r^{-1}_{y})}{\frac{\sigma^{2}_{x,i}}{r_{x}}+\frac{\sigma^{2}_{y,i}}{r_{y}}} + O_{p}(N^{-(2k+1/2)}) \right) \right\} \end{array} $$
(A.4)

From Eqs. A.3 and A.4 we have

$$ \begin{array}{@{}rcl@{}} P\{\tilde{T}_{i}\leq |\tilde{t_{i}}|\} &=&P\left\{T_{i}\leq |\tilde{t}_{i}|+ \frac{1}{2}z_{\beta_{i}}\frac{{h^{2}_{i}}(r^{-1}_{x}+r^{-1}_{y})}{\frac{\sigma^{2}_{x,i}}{r_{x}}+\frac{\sigma^{2}_{y,i}}{r_{y}}}+O_{p}(N^{-(2k+1/2)})\right\}\\ &=&P\left\{T_{i}\leq |\tilde{t}_{i}|+ \frac{1}{2}z_{\beta_{i}}\frac{{h^{2}_{i}}(r^{-1}_{x}+r^{-1}_{y})}{\frac{\sigma^{2}_{x,i}}{r_{x}}+\frac{\sigma^{2}_{y,i}}{r_{y}}}\right\}+O(N^{-(2k+1/2)}), \end{array} $$
(A.5)

holds uniformly in i, where the last line follows from an application of the Delta method of Section 2.7 of Hall (1992). Combining Proposition 9 and Eq. A.5, and applying a Taylor series expansion, we obtain

$$ \begin{array}{@{}rcl@{}} &&P\left\{T_{i}\leq |\tilde{t}_{i}|+ \frac{1}{2}z_{\beta_{i}}\frac{{h^{2}_{i}}(r^{-1}_{x}+r^{-1}_{y})}{\frac{\sigma^{2}_{x,i}}{r_{x}}+\frac{\sigma^{2}_{y,i}}{r_{y}}}\right\}\\&&={\Phi}\left( |\tilde{t}_{i}|+ \frac{1}{2}z_{\beta_{i}}\frac{{h^{2}_{i}}(r^{-1}_{x}+r^{-1}_{y})}{\frac{\sigma^{2}_{x,i}}{r_{x}}+\frac{\sigma^{2}_{y,i}}{r_{y}}}\right)\\ &&+N^{-1/2}q_{1,i}\left( |\tilde{t}_{i}|+ \frac{1}{2}z_{\beta_{i}}\frac{{h^{2}_{i}}(r^{-1}_{x}+r^{-1}_{y})}{\frac{\sigma^{2}_{x,i}}{r_{x}}+\frac{\sigma^{2}_{y,i}}{r_{y}}}\right)\\ &&\times\phi\left( |\tilde{t}_{i}|+ \frac{1}{2}z_{\beta_{i}}\frac{{h^{2}_{i}}(r^{-1}_{x}+r^{-1}_{y})}{\frac{\sigma^{2}_{x,i}}{r_{x}}+\frac{\sigma^{2}_{y,i}}{r_{y}}}\right)\\ &&+N^{-1}q_{2,i}\left( |\tilde{t}_{i}|+ \frac{1}{2}z_{\beta_{i}}\frac{{h^{2}_{i}}(r^{-1}_{x}+r^{-1}_{y})}{\frac{\sigma^{2}_{x,i}}{r_{x}}+\frac{\sigma^{2}_{y,i}}{r_{y}}}\right)\\ &&\times\phi\left( |\tilde{t}_{i}|+ \frac{1}{2}z_{\beta_{i}}\frac{{h^{2}_{i}}(r^{-1}_{x}+r^{-1}_{y})}{\frac{\sigma^{2}_{x,i}}{r_{x}}+\frac{\sigma^{2}_{y,i}}{r_{y}}}\right)+O(N^{-3/2})\\&&={\Phi}(|\tilde{t}_{i}|)+\frac{1}{2}z_{\beta_{i}}\frac{{h^{2}_{i}}(r^{-1}_{x}+r^{-1}_{y})}{\frac{\sigma^{2}_{x,i}}{r_{x}}+\frac{\sigma^{2}_{y,i}}{r_{y}}}\phi(z_{\beta_{i}})\\ &&+N^{-1/2}q_{1,i}(z_{\beta_{i}})\phi(z_{\beta_{i}})+N^{-1}q_{2,i}(z_{\beta_{i}})\phi(z_{\beta_{i}})\\ &&+O(N^{-\min\{2k+1/2,3/2\}}), \end{array} $$
(A.6)

holds uniformly in i. Equations A.5 and A.6 imply that

$$ \begin{array}{@{}rcl@{}} P\{\tilde{T}_{i}\leq |\tilde{t_{i}}|\}&=&{\Phi}(|\tilde{t}_{i}|)+\frac{1}{2}\frac{{h^{2}_{i}}(r^{-1}_{x}+r^{-1}_{y})}{\frac{\sigma^{2}_{x,i}}{r_{x}}+\frac{\sigma^{2}_{y,i}}{r_{y}}}\phi(z_{\beta_{i}})z_{\beta_{i}}\\ &&+N^{-1/2}q_{1,i}(z_{\beta_{i}})\phi(z_{\beta_{i}})\\ &&+N^{-1}q_{2,i}(z_{\beta_{i}})\phi(z_{\beta_{i}})\\ &&+O(N^{-\min\{2k+1/2,3/2\}}), \end{array} $$
(A.7)

holds uniformly in i. Similarly we have

$$ \begin{array}{@{}rcl@{}} P\{\tilde{T}_{i}\leq -|\tilde{t_{i}}|\}&=&{\Phi}(-|\tilde{t}_{i}|)-\frac{1}{2}\frac{{h^{2}_{i}}(r^{-1}_{x}+r^{-1}_{y})}{\frac{\sigma^{2}_{x,i}}{r_{x}}+\frac{\sigma^{2}_{y,i}}{r_{y}}}\phi(-z_{\beta_{i}})z_{\beta_{i}}\\ &&+N^{-1/2}q_{1,i}(-z_{\beta_{i}})\phi(-z_{\beta_{i}})\\ &&+N^{-1}q_{2,i}(-z_{\beta_{i}})\phi(-z_{\beta_{i}})\\ &&+O(N^{-\min\{2k+1/2,3/2\}}), \end{array} $$
(A.8)

holds uniformly in i. Finally, based on Eqs. A.1, A.7, and A.8, we can conclude that

$$ \begin{array}{@{}rcl@{}} P(|\tilde{T}_{i}|\geq |\tilde{t_{i}}|)&=&1-\left\{\Phi(|\tilde{t}_{i}|)-{\Phi}(-|\tilde{t}_{i}|)+z_{\beta_{i}}\frac{{h^{2}_{i}}(r^{-1}_{x}+r^{-1}_{y})}{\frac{\sigma^{2}_{x,i}}{r_{x}}+\frac{\sigma^{2}_{y,i}}{r_{y}}}\phi(z_{\beta_{i}})\right.\\ &&\left.+2N^{-1}q_{2,i}(z_{\beta_{i}})\phi(z_{\beta_{i}})\right.\\ &&\left.+O(N^{-\min\{2k+1/2,3/2\}})\right\}\\ &&=P(|Z|>|\tilde{t}_{i}|)-z_{\beta_{i}}\frac{{h^{2}_{i}}(r^{-1}_{x}+r^{-1}_{y})}{\frac{\sigma^{2}_{x,i}}{r_{x}}+\frac{\sigma^{2}_{y,i}}{r_{y}}}\phi(z_{\beta_{i}})\\ &&-2N^{-1}q_{2,i}(z_{\beta_{i}})\phi(z_{\beta_{i}})+O(N^{-\min\{2k+1/2,3/2\}}), \end{array} $$

holds uniformly in i. □

Proof of Lemma 4.

To prove this lemma, we first establish the following result.

Proposition 10.

Let \(\mu _{xi,l}=\mathrm {E}(X_{i}-\mu _{xi})^{l}<\infty \) and \(\mu _{yi,l}=\mathrm {E}(Y_{i}-\mu _{yi})^{l}<\infty \). Let

$$ m_{xi,l}=n^{-1}\sum\limits_{j=1}^{n}(X_{j,i}-\bar{X}_{i})^{l} \quad\text{and}\quad m_{yi,l}=m^{-1}\sum\limits_{j=1}^{m}(Y_{j,i}-\bar{Y}_{i})^{l}. $$

Then

$$ \sup_{i\geq 1}|m_{xi,l}-\mu_{xi,l}|=O_{p}(N^{-1/2}) \quad\text{and}\quad \sup_{i\geq 1}|m_{yi,l}-\mu_{yi,l}|=O_{p}(N^{-1/2}). $$

Proof.

Define

$$ p_{i,n}=P\{|m_{xi,l}-\mu_{xi,l}|>cn^{-1/2}\} \quad\text{and}\quad \delta_{i,n}=1-p_{i,n}. $$

Then for every i and \(u\geq n\),

$$ \inf_{v\geq n}\delta_{i,v}\leq \delta_{i,u}. $$

Consequently, for every r and \(u\geq n\),

$$ \sum\limits_{i=1}^{r}\inf_{v\geq n}\delta_{i,v}\leq \sum\limits_{i=1}^{r}\delta_{i,u}, $$

which indicates that for every r and n

$$ \sum\limits_{i=1}^{r}\inf_{v\geq n}\delta_{i,v}\leq \inf_{u\geq n} \sum\limits_{i=1}^{r}\delta_{i,u}. $$

The above inequality implies that

$$ \sum\limits_{i=1}^{r}\inf_{v\geq n}(1-p_{i,v})\leq \inf_{u\geq n} \sum\limits_{i=1}^{r}(1-p_{i,u}) \implies \sum\limits_{i=1}^{r}\sup_{v\geq n}p_{i,v}\geq \sup_{u\geq n} \sum\limits_{i=1}^{r}p_{i,u} $$

Thus for every r,

$$ \sup_{u\geq n} \sum\limits_{i=1}^{r}p_{i,u}\leq \sum\limits_{i=1}^{\infty}\sup_{v\geq n}p_{i,v}\implies \lim_{n\to\infty}\sup_{u\geq n} \sum\limits_{i=1}^{r}p_{i,u}\leq \sum\limits_{i=1}^{\infty}\lim_{n\to\infty }\sup_{v\geq n}p_{i,v} $$

and consequently

$$ \lim_{n\to\infty}\sup_{u\geq n} \sum\limits_{i=1}^{\infty}p_{i,u}\leq \sum\limits_{i=1}^{\infty}\lim_{n\to\infty }\sup_{v\geq n}p_{i,v}. $$

Thus, we have

$$ \begin{array}{@{}rcl@{}} &&\limsup_{n\to\infty}\sum\limits_{i=1}^{\infty}P\{|m_{xi,l}-\mu_{xi,l}|>cn^{-1/2}\}\\&&\leq \sum\limits_{i=1}^{\infty} \limsup_{n\to\infty }P\{|m_{xi,l}-\mu_{xi,l}|>cn^{-1/2}\} \end{array} $$
(A.9)

Now, Eq. A.9 implies that

$$ \begin{array}{@{}rcl@{}} &&\lim_{c\to\infty} \limsup_{n\to\infty} P\{\sup_{i}|m_{xi,l}-\mu_{xi,l}|>cn^{-1/2}\}\\ &&\leq \sum\limits_{i=1}^{\infty} \lim_{c\to\infty} \limsup_{n\to\infty} P\{|m_{xi,l}-\mu_{xi,l}|>cn^{-1/2}\}. \end{array} $$

Thus, the last inequality and the fact that \(|m_{xi,l}-\mu_{xi,l}|=O_{p}(n^{-1/2})\) for each i imply that

$$ \sup_{i}|m_{xi,l}-\mu_{xi,l}|=O_{p}(n^{-1/2}), $$

and since \(N/n=O(1)\),

$$ \sup_{i}|m_{xi,l}-\mu_{xi,l}|=O_{p}(N^{-1/2}). $$

Similarly we have

$$ \sup_{i}|m_{yi,l}-\mu_{yi,l}|=O_{p}(N^{-1/2}). $$

This completes the proof of Proposition 10; we now return to the proof of Lemma 4.

Proposition 10 implies that \(S^{2}_{x,i}=\sigma ^{2}_{x,i}+O_{p}(N^{-1/2})\), \(\hat {\gamma }_{x,i}=\gamma _{x,i}+O_{p}(N^{-1/2})\), \(\hat {\kappa }_{x,i}=\kappa _{x,i}+O_{p}(N^{-1/2})\) hold uniformly in i, and similarly, \(S^{2}_{y,i}=\sigma ^{2}_{y,i}+O_{p}(N^{-1/2})\), \(\hat {\gamma }_{y,i}=\gamma _{y,i}+O_{p}(N^{-1/2})\), \(\hat {\kappa }_{y,i}=\kappa _{y,i}+O_{p}(N^{-1/2})\) hold uniformly in i.

Again, it can be shown using the technique in Polansky (2011, p. 42) that, for any \(\beta_{i} \in [\epsilon,1 - \epsilon]\) with \(\epsilon > 0\), \(z_{\hat {\beta _{i}}}=z_{\beta _{i}}+O(N^{-1})\).

Using the above asymptotic relations, it can be checked easily that

$$ \hat{L}(z_{\hat{\beta_{i}}})=L(z_{\beta_{i}})+O_{p}(N^{-1/2}), \quad\text{and hence}\quad \hat{h}^{2}_{i,opt}=h^{2}_{i,opt}+O_{p}(N^{-3/2}) $$

hold uniformly in i. Now,

$$ \begin{array}{@{}rcl@{}} \tilde{T}_{i}&=&\frac{\bar{X}_{i}-\bar{Y}_{i}}{\sqrt{\frac{S^{2}_{x,i}}{n}+\frac {S^{2}_{y,i}}{m}+\hat{h}^{2}_{i,opt}(\frac{1}{n}+\frac{1}{m})}}\\&=&\frac{\bar{X}_{i}-\bar{Y}_{i}}{\sqrt{\frac{S^{2}_{x,i}}{n}+\frac {S^{2}_{y,i}}{m}+h^{2}_{i,opt}(\frac{1}{n}+\frac{1}{m})}}\sqrt{\frac{\frac{S^{2}_{x,i}}{n}+\frac {S^{2}_{y,i}}{m}+h^{2}_{i,opt}(\frac{1}{n}+\frac{1}{m})}{\frac{S^{2}_{x,i}}{n}+\frac {S^{2}_{y,i}}{m}+\hat{h}^{2}_{i,opt}(\frac{1}{n}+\frac{1}{m})}}\\ &=&\frac{\bar{X}_{i}-\bar{Y}_{i}}{\sqrt{\frac{S^{2}_{x,i}}{n}+\frac {S^{2}_{y,i}}{m}+h^{2}_{i,opt}(\frac{1}{n}+\frac{1}{m})}}\sqrt{1+\frac{[h^{2}_{i,opt}-\hat{h}^{2}_{i,opt}](\frac{1}{n}+\frac{1}{m})}{\frac{S^{2}_{x,i}}{n}+\frac {S^{2}_{y,i}}{m}+\hat{h}^{2}_{i,opt}(\frac{1}{n}+\frac{1}{ m})}}\\ &=&\frac{\bar{X}_{i}-\bar{Y}_{i}}{\sqrt{\frac{S^{2}_{x,i}}{n}+\frac {S^{2}_{y,i}}{m}+h^{2}_{i,opt}(\frac{1}{n}+\frac{1}{m})}}+O_{p}(N^{-3/2}), \end{array} $$
(A.10)

uniformly in i because \(\hat {h}^{2}_{i,opt}=h^{2}_{i,opt}+O_{p}(N^{-3/2})\) and \(\frac {S^{2}_{x,i}}{n}+\frac {S^{2}_{y,i}}{m}+\hat {h}^{2}_{i,opt}(\frac {1}{n}+\frac {1}{m})=\sigma ^{2}_{x,i}+\sigma ^{2}_{y,i}+O_{p}(N^{-1/2})\) are true uniformly in i. Equation A.10 implies that

$$ \begin{array}{@{}rcl@{}} P\{|\tilde{T}_{i}|\geq |\tilde{t}_i|\}&=& P\left\{\left|\frac{\bar{X}_{i}-\bar{Y}_{i}}{\sqrt{\frac{S^{2}_{x,i}}{n}+\frac {S^{2}_{y,i}}{m}+h^{2}_{i,opt}(\frac{1}{n}+\frac{1}{m})}}+O_{p}(N^{-3/2})\right|\geq |\tilde{t}_i|\right\}\\ &=&P\left\{\left|\frac{\bar{X}_{i}-\bar{Y}_{i}}{\sqrt{\frac{S^{2}_{x,i}}{n}+\frac {S^{2}_{y,i}}{m}+h^{2}_{i,opt}(\frac{1}{n}+\frac{1}{m})}}\right|\geq |\tilde{t}_i|\right\}+O(N^{-3/2})\\ &=&2-2{\Phi}(|\tilde{t}_{i}|)+O(N^{-3/2}), \end{array} $$

uniformly in i, where the last two lines follow from an application of the Delta method of Section 2.7 of Hall (1992). □

Proof of Theorem 6.

Following the steps of the proof of Theorem 3, we can prove Theorem 6; hence, the proof is omitted. □

Proof of Lemma 7.

Following the steps of the proof of Lemma 4, we can prove Lemma 7; hence, the proof is omitted. □

Proof of Theorem 8.

Let Pi denote the true p-value of \(\tilde {T}_{i}\) or \(\tilde {T}^{R}_{i}\), depending on the sign of \(\hat {h}^{2}_{i,\text {opt}}\). Let \(\hat {P}_{i}\) denote \(2(1-{\Phi }(|\tilde {T}_{i}|))\) or \(2(1-{\Phi }(|\tilde {T}^{R}_{i}|))\), again depending on the sign of \(\hat {h}^{2}_{i,\text {opt}}\). For each true null hypothesis H0i, define the indicator \(V_{i}=\mathbb {1}\{H_{0i} \text { is rejected}\}\). Let \({\mathscr{H}}_{0}\) denote the set of all true null hypotheses. Then \(V={\sum }_{i\in {\mathscr{H}}_{0}} V_{i}\) denotes the number of wrongly rejected null hypotheses and, with R denoting the total number of rejections, the FDR is defined as

$$ \text{FDR}=\sum\limits_{i\in \mathcal{H}_{0}}\mathrm{E} \left( \frac{V_{i}}{\max\{R,1\}}\right). $$

Part of this argument rests on ideas from Candes and Barber (2018). We can write \(\frac {V_{i}}{\max \{R,1\}}\) as follows, by summing over all possible values of the number of rejections:

$$ \frac{V_{i}}{\max\{R,1\}}=\sum\limits_{j=1}^{d}\frac{V_{i}}{j}\mathbb{1}\{R=j\}. $$

Note that, when there are j rejections, H0i is rejected if and only if \(\hat {P}_{i} \leq \alpha j/d\), and thus

$$ \frac{V_{i}}{\max\{R,1\}}=\sum\limits_{j=1}^{d}\frac{1}{j}\mathbb{1}\{\hat{P}_{i}\leq \alpha j/d\}\mathbb{1}\{R=j\}. $$

Suppose that H0i is rejected, i.e. \(\hat {P}_{i} \leq \alpha j/d\). If we set the value of \(\hat {P}_{i}\) to 0, then the new number of rejections is exactly j, because we are only reordering the first j p-values, all of which remain below the threshold αj/d. We denote this new number of rejections by \(R(\hat {P}_{i}=0)\), and thus \(\mathbb {1}\{\hat {P}_{i}\leq \alpha j/d\}\mathbb {1}\{R=j\}=\mathbb {1}\{\hat {P}_{i}\leq \alpha j/d\}\mathbb {1}\{R(\hat {P}_{i}=0)=j\}\).
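The zeroing observation above is easy to check numerically. The following standalone sketch (our illustration, not part of the paper) applies the Benjamini–Hochberg rule to random p-value vectors and verifies that setting any rejected p-value to 0 leaves the number of rejections unchanged:

```python
import numpy as np

def bh_count(p, alpha=0.1):
    """Number of Benjamini-Hochberg rejections: the largest j with
    p_(j) <= alpha * j / d, or 0 if no such j exists."""
    d = len(p)
    hits = np.nonzero(np.sort(p) <= alpha * np.arange(1, d + 1) / d)[0]
    return 0 if hits.size == 0 else hits[-1] + 1

rng = np.random.default_rng(1)
alpha, d = 0.1, 50
for _ in range(1000):
    p = rng.uniform(size=d)
    p[:5] *= 0.01                                # plant a few small p-values
    R = bh_count(p, alpha)
    for i in np.nonzero(p <= alpha * R / d)[0]:  # indices rejected by BH
        q = p.copy()
        q[i] = 0.0
        assert bh_count(q, alpha) == R           # R(P_i = 0) = R, as claimed
```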

Set \(\hat {P}=(\hat {P}_{1},\ldots ,\hat {P}_{d})^{\top }\) and \(\hat {P}_{(-i)}=\hat {P}\setminus \{\hat {P}_{i}\}\); then, from the above observations, we have

$$ \begin{array}{@{}rcl@{}} \mathrm{E}\left( \frac{V_{i}}{\max\{R,1\}}\right)&=&\sum\limits_{j=1}^{d}\frac{1}{j}\mathrm{E}\left[\mathbb{1}\{\hat{P}_{i}\leq \alpha j/d\}\mathbb{1}\{R(\hat{P}_{i}=0)=j\}\right]\\ &=&\sum\limits_{j=1}^{d}\frac{1}{j}\mathrm{E}\left[\mathrm{E}\left\{\mathbb{1}\{\hat{P}_{i}\leq \alpha j/d\}\mathbb{1}\{R(\hat{P}_{i}=0)=j\}\mid \hat{P}_{(-i)}\right\}\right]\\ &=&\sum\limits_{j=1}^{d}\frac{1}{j}\mathrm{E}\left[\mathbb{1}\{R(\hat{P}_{i}=0)=j\}\right]\mathrm{P}\{\hat{P}_{i}\leq \alpha j/d\}, \end{array} $$
(A.11)

where the last two lines follow from the facts that (i) \(\mathbb {1}\{R(\hat {P}_{i}=0)=j\}\) is non-random when conditioned on \(\hat {P}_{(-i)}\) and \(\hat {P}_{i}=0\), and (ii) the components of \(\hat {P}\) are independent. We now center attention upon \(\mathrm {P}\{\hat {P}_{i}\leq \alpha j/d\}\). Lemmas 4 and 7 indicate that \(|P_{i}-\hat {P}_{i}|=O_{p}(N^{-3/2})\) uniformly in i and, using arguments similar to the Delta method of Hall (1992, Section 2.7), we have

$$ \mathrm{P}\{\hat{P}_{i}\leq \alpha j/d\}=\mathrm{P}\{P_{i}\leq \alpha j/d\}+O(N^{-3/2})= \alpha j/d+O(N^{-3/2}), $$

uniformly in i, where the expression in the last equality follows from the fact that \(P_{i}\sim \text {Uniform}(0,1)\), and

$$ \mathrm{E}\left( \frac{V_{i}}{\max\{R,1\}}\right)=\sum\limits_{j=1}^{d}\frac{1}{j}\mathrm{E}\left[\mathbb{1}\{R(\hat{P}_{i}=0)=j\}\right]\left( \frac{\alpha j}{d}+O(N^{-3/2})\right), $$
(A.12)

uniformly in i. The expressions in Eqs. A.11 and A.12 imply that

$$ \mathrm{E}\left( \frac{V_{i}}{\max\{R,1\}}\right)=\frac{\alpha}{d}+O(N^{-3/2}), $$
(A.13)

uniformly in i, where the last equality is obtained because

$$ \sum\limits_{j=1}^{d}\mathbb{1}\{R(\hat{P}_{i}=0)=j\}=1 $$

and

$$ \sum\limits_{j=1}^{d}\frac{1}{j}\mathbb{1}\{R(\hat{P}_{i}=0)=j\}\leq 1. $$
Equation A.13 implies that

$$ \begin{array}{@{}rcl@{}} \sum\limits_{i\in \mathcal{H}_{0}}\mathrm{E} \left( \frac{V_i}{\max\{R,1\}}\right)&=&\sum\limits_{i\in \mathcal{H}_{0}}\left( \frac{\alpha}{d}+O(N^{-3/2})\right)\\ &=&\frac{d_{0}}{d}\alpha+d_{0} O(N^{-3/2}), \end{array} $$

where the last equality holds because the \(O(N^{-3/2})\) term is uniform in i, and \(d_{0}=|{\mathscr{H}}_{0}|\) is the number of true null hypotheses. Thus,

$$ \frac{\text{FDR}}{\frac{d_{0}}{d}}=\alpha+O(d N^{-3/2}), $$

and \(\frac {\text {FDR}}{(d_{0}/d)\alpha }\to 1\) because \(d=o(N^{3/2})\). □
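As an empirical sanity check on this conclusion, the following simulation sketch (ours; exact normal p-values stand in for the \(\hat{P}_{i}\), whose approximation error is only \(O_{p}(N^{-3/2})\)) shows the empirical FDR of the procedure settling near \((d_{0}/d)\alpha\):

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(2)
d, d0, alpha, reps = 400, 300, 0.1, 2000    # d0 true nulls among d tests
fdp = []
for _ in range(reps):
    z = rng.normal(size=d)
    z[d0:] += 3.0                            # indices >= d0 are non-null
    p = 2 * (1 - norm.cdf(np.abs(z)))
    hits = np.nonzero(np.sort(p) <= alpha * np.arange(1, d + 1) / d)[0]
    R = 0 if hits.size == 0 else hits[-1] + 1
    rejected = np.argsort(p)[:R]
    V = (rejected < d0).sum()                # falsely rejected true nulls
    fdp.append(V / max(R, 1))
print(f"empirical FDR = {np.mean(fdp):.4f}  vs  (d0/d)*alpha = {d0/d*alpha:.4f}")
```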


Cite this article

Ghosh, S., Polansky, A.M. Large-Scale Simultaneous Testing Using Kernel Density Estimation. Sankhya A 84, 808–843 (2022). https://doi.org/10.1007/s13171-020-00220-5
