Abstract
A century ago, when Student’s t-statistic was introduced, no one imagined its range of applications in the modern era. It now appears in large-scale multiple hypothesis testing, feature selection and ranking, high-dimensional signal detection, and more. Student’s t-statistic is constructed from the empirical distribution function (EDF). An alternative to the EDF is the kernel density estimate (KDE), a smoothed version of the EDF. The novelty of this work consists of an alternative to Student’s t-test that uses the KDE technique, and an exploration of the usefulness of the KDE-based t-test in the context of large-scale simultaneous hypothesis testing. An optimal bandwidth parameter for the KDE approach is derived by minimizing the asymptotic error between the true p-value and its estimate based on the normal approximation. If the KDE-based approach is used for large-scale simultaneous testing, it is natural to ask when the method fails to manage the error rate. We show that the suggested KDE-based method can control the false discovery rate (FDR) provided the total number of tests diverges at a smaller order of magnitude than N^{3/2}, where N is the total sample size. We compare our method to several alternatives with respect to FDR, and show in simulations that it produces a lower proportion of false discoveries than its competitors, that is, it controls the false discovery rate better. These empirical studies demonstrate that the proposed method can be applied successfully in practice. Its usefulness is further illustrated through a gene expression data example.
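As a rough illustration (not the authors' code), the following is a minimal sketch of a KDE-smoothed two-sample t-statistic, assuming the bandwidth-inflated variance \(S^{2}_{x}/n + S^{2}_{y}/m + h^{2}(1/n + 1/m)\) that appears in the appendix, together with Benjamini–Hochberg screening of the resulting p-values; all function names are illustrative:

```python
import numpy as np

def smoothed_t_stat(x, y, h):
    """KDE-smoothed two-sample t-statistic.

    The bandwidth h inflates the variance estimate by h^2 * (1/n + 1/m),
    matching the variance added by convolving each sample with a kernel
    of scale h.  The bandwidth-selection rule itself is not reproduced.
    """
    n, m = len(x), len(y)
    num = x.mean() - y.mean()
    den = np.sqrt(x.var(ddof=1) / n + y.var(ddof=1) / m
                  + h ** 2 * (1.0 / n + 1.0 / m))
    return num / den

def benjamini_hochberg(pvals, alpha=0.05):
    """Boolean rejection mask for the BH step-up procedure at level alpha."""
    p = np.asarray(pvals, dtype=float)
    d = len(p)
    order = np.argsort(p)
    # Compare sorted p-values to the BH thresholds alpha * j / d.
    below = p[order] <= alpha * np.arange(1, d + 1) / d
    k = below.nonzero()[0].max() + 1 if below.any() else 0
    reject = np.zeros(d, dtype=bool)
    reject[order[:k]] = True
    return reject
```

With h = 0 the statistic reduces to the usual Welch-type two-sample statistic; the paper's contribution is the data-driven choice of h minimizing the p-value approximation error, which is not reproduced here.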
References
Benjamini, Y. and Hochberg, Y. (1995). Controlling the false discovery rate: a practical and powerful approach to multiple testing. J. R. Statist. Soc. Ser. B 57, 289–300.
Candes, E. and Barber, R.F. (2018). Stats 300C lecture notes, Stanford University. https://statweb.stanford.edu/~candes/stats300c/Lectures/Lecture7.pdf.
Fan, J., Hall, P. and Yao, Q. (2007). To how many simultaneous hypothesis tests can normal, Student’s t or bootstrap calibration be applied? J. Am. Statist. Assoc. 102, 1282–1288.
Ghosh, S. and Polansky, A.M. (2014). Smoothed and iterated bootstrap confidence regions for parameter vectors. J. Multivar. Anal. 132, 172–182.
Hall, P. (1992). The Bootstrap and Edgeworth Expansion. Springer, New York.
Hall, P., Jing, B.Y. and Lahiri, S.N. (1998). On the sampling window method for long-range dependent data. Statist. Sin. 8, 1189–1204.
Hochberg, Y. (1988). A sharper Bonferroni procedure for multiple tests of significance. Biometrika 75, 800–802.
Hommel, G. (1988). A stagewise rejective multiple test procedure based on a modified Bonferroni test. Biometrika 75, 383–386.
Karimi, S. and Farrokhnia, M. (2014). Leukemia and small round blue-cell tumor cancer detection using microarray gene expression data set: combining data dimension reduction and variable selection technique. Chemom. Intell. Lab. Syst. 139, 6–14.
Khan, J., Wei, J.S., Ringner, M., Saal, L.H., Ladanyi, M., Westermann, F., Berthold, F., Schwab, M., Antonescu, C.R. and Peterson, C. (2001). Classification and diagnostic prediction of cancers using gene expression profiling and artificial neural networks. Nat. Med. 7, 673–679.
Liu, W. and Shao, Q. (2014). Phase transition and regularized bootstrap in large-scale t-tests with false discovery rate control. Ann. Statist. 42, 2003–2025.
Murie, C., Woody, O., Lee, A. and Nadon, R. (2009). Comparison of small n statistical tests of differential expression applied to microarrays. BMC Bioinformatics 10, 45.
Polansky, A.M. (2001). Bandwidth selection for the smoothed bootstrap percentile method. Comput. Stat. Data Anal. 36, 333–349.
Polansky, A.M. (2011). Introduction to Statistical Limit Theory. Chapman and Hall/CRC, Boca Raton.
Polansky, A.M. and Schucany, W.R. (1997). Kernel smoothing to improve bootstrap confidence intervals. J. R. Statist. Soc. Ser. B 59, 821–838.
Smyth, G. (2004). Linear models and empirical Bayes methods for assessing differential expression in microarray experiments. Statist. Appl. Genet. Mol. Biol. 3, Article 3.
Storey, J. (2002). A direct approach to false discovery rates. J. R. Statist. Soc. Ser. B 64, 479–498.
Tusher, V., Tibshirani, R. and Chu, G. (2001). Significance analysis of microarrays applied to the ionizing radiation response. Proc. Natl. Acad. Sci. USA 98, 5116–5121.
Wand, M.P. and Jones, M.C. (1995). Kernel Smoothing. Chapman and Hall, London.
Westfall, P.H. and Young, S.S. (1993). Resampling Based Multiple Testing: Examples and Methods for p-value Adjustments. Wiley, New York.
Acknowledgments
We would like to thank the Editor, Associate Editor and two anonymous referees for their careful reading and constructive suggestions which improved the readability of the paper.
Appendix A: Proofs
The proofs of Lemmas 1 and 5 are straightforward: they follow from the fact that a kernel density estimator is the convolution of the empirical distribution and the kernel function with respect to counting measure, and are therefore omitted.
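Concretely, with \(F_{n}\) the EDF of \(X_{1},\ldots,X_{n}\) and \(K_{h}(u)=h^{-1}K(u/h)\), the identity underlying Lemmas 1 and 5 can be written as

```latex
\hat{f}_{h}(x)
  = \frac{1}{n}\sum_{j=1}^{n} K_{h}(x - X_{j})
  = \int_{-\infty}^{\infty} K_{h}(x - t)\, \mathrm{d}F_{n}(t),
```

so the KDE is the convolution of the empirical measure \(\mathrm{d}F_{n}\) with the scaled kernel; equivalently, it is the density of \(X^{*}+h\varepsilon\), where \(X^{*}\sim F_{n}\) and \(\varepsilon\) has density \(K\).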
Proof of Lemma 2.
To prove Lemma 2 it suffices to show that
Suppose that
Then by the definition of \(\limsup \), we have
The above line implies that for all N ≥ 1,
and now, switching the order of suprema for a doubly indexed sequence, one obtains
Again, \(\sup _{i\geq 1}\sup _{j\geq 1} j^{k}h_{i}=\infty \) implies that there are infinitely many i such that
But Assumption (ii) implies that for each i, the sequence \(\{j^{k}h_{i}\}_{j=1}^{\infty }\) is bounded, and this contradicts the fact that
Thus we must have
and consequently Lemma 2 is established. □
The proofs of Theorems 3 and 6 depend on the Edgeworth expansion of the distribution of
\(i = 1,\ldots,d\). The following result gives the Edgeworth expansion of the distribution of \(T_{i}\).
Proposition 9.
Under assumptions (i), (iii), and (iv), the distribution of \(T_{i}\) admits an Edgeworth expansion that
holds uniformly in i, where the polynomials \(q_{1,i}(x)\) and \(q_{2,i}(x)\) are given by
where
and
In these expressions, \(\gamma _{x,i}\) and \(\gamma _{y,i}\) denote the skewness of the ith component of X and of Y, respectively, and \(\kappa _{x,i}\) and \(\kappa _{y,i}\) the corresponding kurtoses. The proof of Proposition 9 is given in the supplementary material.
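Schematically (the exact polynomials are those displayed above), the expansion in Proposition 9 has the standard two-term form of Hall (1992):

```latex
P\{T_{i}\leq x\}
  = \Phi(x) + N^{-1/2}q_{1,i}(x)\phi(x) + N^{-1}q_{2,i}(x)\phi(x) + o(N^{-1}),
```

uniformly in i, where \(\Phi\) and \(\phi\) are the standard normal distribution function and density, \(q_{1,i}\) is an even polynomial whose coefficients involve the skewnesses \(\gamma_{x,i},\gamma_{y,i}\), and \(q_{2,i}\) is an odd polynomial whose coefficients involve the kurtoses \(\kappa_{x,i},\kappa_{y,i}\) as well.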
Proof of Theorem 3.
Let \(\tilde {\sigma }_{i}=\sqrt {\frac {S^{2}_{x,i}}{r_{x}}+\frac {S^{2}_{y,i}}{r_{y}}+{h^{2}_{i}}(r^{-1}_{x}+r^{-1}_{y})}\), then we can express \(\tilde {T}_{i}\) as
where
and \(\hat {\sigma }_{i}=\sqrt {\frac {S^{2}_{x,i}}{r_{x}}+\frac {S^{2}_{y,i}}{r_{y}}}\). Let \(\tilde {t}_{i}\) denote the observed value of the two-sample t statistic, \(\tilde {T}_{i}\), based on the kernel density estimator technique. Then the p-value corresponding to H0i is given by
Let \(\beta _{i}=P(\tilde {T}_{i}\leq |\tilde {t}_{i}|)\). Then \(|\tilde {t}_{i}|\) is the \(\beta_{i}\)th quantile of the distribution of \(\tilde {T}_{i}\). Under assumptions (i) and (iii), \(P\{\tilde {T}_{i}\leq x\}\) has an Edgeworth expansion similar to that of \(P\{T_{i}\leq x\}\), with \(q_{1,i}(x)\) and \(q_{2,i}(x)\) in Proposition 9 replaced by \(\tilde {q}_{1,i}(x)\) and \(\tilde {q}_{2,i}(x)\). Using techniques similar to those of Polansky and Schucany (1997), we can show that
holds uniformly in i, since \(\sup _{i}h_{i}=O(N^{-k})\). Under the assumptions (i), (ii), and (iii), we can obtain the Cornish-Fisher expansion for \(|\tilde {t_{i}}|\) as
uniformly in i, since \(\sup _{i}h_{i}=O(N^{-k})\). An excellent review of Cornish-Fisher expansions can be found in Hall (1992). Here the function \(\tilde {q}_{21,i}(\cdot )\) is a function of \(\tilde {q}_{1,i}(\cdot )\) and \(\tilde {q}_{2,i}(\cdot )\). Equation A.2 implies that
holds uniformly in i. Considering \(P\{\tilde {T}_{i}\leq |\tilde {t_{i}}|\}\), we have that
holds uniformly in i, where the last line follows from an application of the delta method of Section 2.7 of Hall (1992). Combining Proposition 9 and Eq. A.5 with a Taylor series expansion, we obtain
holds uniformly in i. Equations A.5 and A.6 imply that
holds uniformly in i. Similarly we have
holds uniformly in i. Finally, based on Eqs. A.1, A.7, and A.8 we can conclude that
holds uniformly in i. □
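For reference, the Cornish-Fisher inversion used in Eq. A.2 has, schematically, the following one-term form (Hall, 1992, Ch. 2): if

```latex
P\{T \leq x\} = \Phi(x) + N^{-1/2}q_{1}(x)\phi(x) + O(N^{-1})
```

uniformly in x, then the \(\beta\)th quantile \(t_{\beta}\) of T satisfies

```latex
t_{\beta} = z_{\beta} - N^{-1/2}q_{1}(z_{\beta}) + O(N^{-1}),
```

where \(z_{\beta}=\Phi^{-1}(\beta)\); the higher-order function \(\tilde{q}_{21,i}\) arises from carrying this inversion to order \(N^{-1}\).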
Proof of Lemma 4.
To prove this lemma, we use the following result.
Proposition 10.
Let \(\mu _{xi,l}=\mathrm {E}(X_{i}-\mu _{xi})^{l}<\infty \) and \(\mu _{yi,l}=\mathrm {E}(Y_{i}-\mu _{yi})^{l}<\infty \). Let
Then
Proof.
Define
Then for every i and u ≥ n
Consequently, for every r and u ≥ n
which indicates that for every r and n
The above inequality implies that
Thus for every r,
and consequently
Thus, we have
Now, Eq. A.9 implies that
Thus, the last inequality and \(|m_{xi,l}-\mu _{xi,l}|=O_{p}(n^{-1/2})\) imply that
and since \(n/N=O(1)\), it follows that
Similarly we have
□
Proposition 10 implies that \(S^{2}_{x,i}=\sigma ^{2}_{x,i}+O_{p}(N^{-1/2})\), \(\hat {\gamma }_{x,i}=\gamma _{x,i}+O_{p}(N^{-1/2})\), \(\hat {\kappa }_{x,i}=\kappa _{x,i}+O_{p}(N^{-1/2})\) hold uniformly in i, and similarly, \(S^{2}_{y,i}=\sigma ^{2}_{y,i}+O_{p}(N^{-1/2})\), \(\hat {\gamma }_{y,i}=\gamma _{y,i}+O_{p}(N^{-1/2})\), \(\hat {\kappa }_{y,i}=\kappa _{y,i}+O_{p}(N^{-1/2})\) hold uniformly in i.
Again, it can be shown using the technique in Polansky (2011, p. 42) that for any \(\beta _{i}\in [\epsilon ,1-\epsilon ]\) for some \(\epsilon >0\), \(z_{\hat {\beta }_{i}}=z_{\beta _{i}}+O(N^{-1})\).
Using the above asymptotic relations, it can be checked easily that
hold uniformly in i. Now,
uniformly in i because \(\hat {h}^{2}_{i,opt}=h^{2}_{i,opt}+O_{p}(N^{-3/2})\) and \(\frac {S^{2}_{x,i}}{n}+\frac {S^{2}_{y,i}}{m}+\hat {h}^{2}_{i,opt}(\frac {1}{n}+\frac {1}{m})=\sigma ^{2}_{x,i}+\sigma ^{2}_{y,i}+O_{p}(N^{-1/2})\) are true uniformly in i. Equation A.10 implies that
uniformly in i, where the last two lines follow from an application of the delta method of Section 2.7 of Hall (1992).
Proof of Theorem 6.
Following the steps of the proof of Theorem 3, we can prove Theorem 6; hence the proof is omitted. □
Proof of Lemma 7.
Following the steps of the proof of Lemma 4, we can prove Lemma 7; hence the proof is omitted. □
Proof of Theorem 8.
Let \(P_{i}\) denote the true p-value of \(\tilde {T}_{i}\) or \(\tilde {T}^{R}_{i}\), depending on the sign of \(\hat {h}^{2}_{i,\text {opt}}\). Let \(\hat {P}_{i}\) denote \(2(1-{\Phi }(|\tilde {T}_{i}|))\) or \(2(1-{\Phi }(|\tilde {T}^{R}_{i}|))\), again depending on the sign of \(\hat {h}^{2}_{i,\text {opt}}\). For each true null hypothesis \(H_{0i}\), define \(V_{i}\) to be 1 if \(H_{0i}\) is rejected and 0 otherwise. Let \({\mathscr{H}}_{0}\) denote the set of all true null hypotheses. Then \(V={\sum }_{i\in {\mathscr{H}}_{0}} V_{i}\) is the number of wrongly rejected null hypotheses, and the FDR is defined as
Part of this argument rests on ideas from Candes and Barber (2018). We can write \(\frac {V_{i}}{\max \limits \{R,1\}}\) as follows, by summing over all possible values of the number of rejections:
Note that when there are j rejections, \(H_{0i}\) is rejected if and only if \(\hat {p}_{i} \leq \alpha j/d\), and thus
Suppose that \(H_{0i}\) is rejected, i.e. \(\hat {p}_{i} \leq \alpha j/d\). If we set the value of \(\hat {P}_{i}\) to 0, then the number of rejections is still exactly j, because we are only reordering the first j p-values, all of which remain below the threshold \(\alpha j/d\). We denote this new number of rejections by \(R(\hat {p}_{i}=0)\), and thus .
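In symbols, the decomposition described above is, schematically,

```latex
\frac{V_{i}}{\max\{R,1\}}
  = \sum_{j=1}^{d} \frac{1}{j}\,
    \mathbf{1}\{\hat{P}_{i} \leq \alpha j/d,\; R = j\}
  = \sum_{j=1}^{d} \frac{1}{j}\,
    \mathbf{1}\{\hat{P}_{i} \leq \alpha j/d\}\,
    \mathbf{1}\{R(\hat{P}_{i}=0) = j\},
```

where the second factor in each summand depends on \(\hat{P}_{(-i)}\) only; this is what permits conditioning on \(\hat{P}_{(-i)}\) in the next display.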
Set \(\hat {P}=(\hat {P}_{1},\ldots ,\hat {P}_{d})^{\top }\) and \(\hat {P}_{(-i)}=\hat {P}\setminus \{\hat {P}_{i}\}\), then from the above observation we have
where the last two lines follow from the facts that (i)  is non-random when conditioned on \(\hat {P}_{(-i)}\) and \(\hat {P}_{i}=0\), and (ii) the components of \(\hat {P}\) are independent. We now focus attention on . Lemmas 4 and 7 indicate that \(|P_{i}-\hat {P}_{i}|=O_{p}(N^{-3/2})\) uniformly in i, and using arguments similar to the delta method of Hall (1992, Section 2.7), we have
uniformly in i, where the expression in the last equality follows from the fact that \(P_{i}\sim \text {Uniform}(0,1)\), and
uniformly in i. The expressions in Eqs. A.11 and A.12 imply that
uniformly in i, where the last equality is obtained because
and
Equation A.13 implies that
where the last line holds because the \(O(N^{-3/2})\) term is uniform in i, and \(d_{0}={\sum }_{i\in {\mathscr{H}}_{0}}1\) is the number of true null hypotheses. Thus,
and \(\frac {\text {FDR}}{\alpha d_{0}/d}\to 1\) because \(d=o(N^{3/2})\). □
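As a quick numerical check of the limiting value \(\text{FDR} \approx \alpha d_{0}/d\), the following sketch assumes independent tests with exactly uniform null p-values; the Beta(0.1, 10) alternative distribution and the function names are arbitrary illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(0)

def bh_reject(p, alpha):
    """Benjamini-Hochberg rejection mask at level alpha."""
    d = len(p)
    order = np.argsort(p)
    below = p[order] <= alpha * np.arange(1, d + 1) / d
    k = below.nonzero()[0].max() + 1 if below.any() else 0
    reject = np.zeros(d, dtype=bool)
    reject[order[:k]] = True
    return reject

def empirical_fdr(d=100, d0=80, alpha=0.2, reps=500):
    """Average false discovery proportion over `reps` replicates.

    The first d0 hypotheses are true nulls with Uniform(0,1) p-values;
    the remaining d - d0 get stochastically small Beta(0.1, 10) p-values.
    """
    fdp = np.empty(reps)
    for r in range(reps):
        p = np.concatenate([rng.uniform(size=d0),
                            rng.beta(0.1, 10.0, size=d - d0)])
        rej = bh_reject(p, alpha)
        v = rej[:d0].sum()          # false rejections among true nulls
        fdp[r] = v / max(rej.sum(), 1)
    return fdp.mean()
```

With independent uniform null p-values, the average false discovery proportion returned by `empirical_fdr` concentrates near \(\alpha d_{0}/d\), matching the limit in the theorem.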
Cite this article
Ghosh, S., Polansky, A.M. Large-Scale Simultaneous Testing Using Kernel Density Estimation. Sankhya A 84, 808–843 (2022). https://doi.org/10.1007/s13171-020-00220-5