
New multiple testing method under no dependency assumption, with application to multiple comparisons problem

Regular Article · Statistical Papers

Abstract

Traditional multiple hypothesis testing focuses mainly on constructing stepwise procedures that control some error rate, such as the familywise error rate (FWER) or the false discovery rate (FDR). Most of these procedures, however, are derived under independence, and when the tests are correlated the dependence may increase or decrease the chance of false rejections. In this paper a quite different testing method is proposed: rather than targeting a specific error rate, it attends to the overall performance of the whole collection of hypotheses and exploits the structure among them. Since the main purpose of multiple testing is to single out the false null hypotheses and report a rejection set, we follow the principle of simple hypothesis testing and base the final testing result on an estimate of the set of all true null hypotheses. Our method can be applied under any form of dependence, provided that a reasonable \(p\)-value can be obtained for each intersection hypothesis. We illustrate the new procedures with an application to multiple comparisons problems. Theoretical results establish the consistency of our method and characterize its FWER behavior. Simulation results suggest that, in dependent cases, our procedures have better overall performance than some existing procedures, especially in the total number of type I and type II errors.
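As a rough illustration of this set-estimation idea, here is a minimal Python sketch of a step-down search over candidate true-null sets, reconstructed from the event decompositions in the Appendix; the names all_sets_down and p_value are our own illustrative assumptions, with p_value standing in for whatever valid intersection \(p\)-value the problem supplies.

```python
# A minimal sketch (not the authors' code) of the set-estimation idea: rather
# than accepting or rejecting hypotheses one at a time, search for the subset
# J0 that best plays the role of the true-null index set I0, driven by the
# intersection p-values p(J). The step-down search below is reconstructed from
# the event decompositions in the Appendix; p_value stands in for whatever
# valid p-value is available for each intersection hypothesis H_J.
from itertools import combinations

def all_sets_down(indices, p_value, alpha):
    """Step down from the full index set: at each size d, pick the candidate J
    with the largest intersection p-value, and return it as the estimate J0 of
    the true nulls as soon as that p-value exceeds alpha."""
    for d in range(len(indices), 0, -1):
        best = max(combinations(indices, d), key=p_value)
        if p_value(best) > alpha:
            return set(best)
    return set()   # every null hypothesis is rejected
```

All hypotheses outside the returned set are rejected. The paper's condition (C0) governs how \(\alpha =\alpha (n)\) may shrink with \(n\); the Appendix analyzes this All-sets-down search alongside the All-sets-up, Nested-up, and Nested-down variants.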



Acknowledgments

The authors would like to thank the editor and the referee for their useful comments, which led to a considerable improvement of the paper. This work was supported by the National Natural Science Foundation of China (Grant Nos. 11471030 and 11471035).

Author information


Corresponding author

Correspondence to Xingzhong Xu.

Appendix

Proof of Lemma 1

(i) Since \(p(J)\sim U(0,1)\) for every \(J \subseteq I_0\), the result follows immediately.

(ii) For any \(J\nsubseteq I_0\):

Case 1: \(\sigma ^2\) is known.

Denote

$$\begin{aligned} Y_n=\frac{n}{\sigma ^2}{\bar{\mathbf {X}}}^T{L_J}^T\left( L_J{L_J}^T\right) ^{-1}L_J\bar{\mathbf {X}}, \quad \text {and}\quad \Sigma =\left( L_J{L_J}^T\right) ^{-1}=RR^T, \end{aligned}$$

where \(R\) is positive definite. Then

$$\begin{aligned} Y_n&= \frac{n}{\sigma ^2}\left( L_J\bar{\mathbf {X}}-L_J\varvec{\theta }\right) ^T\Sigma \left( L_J\bar{\mathbf {X}}-L_J\varvec{\theta }\right) +\frac{2n}{\sigma ^2}\left( L_J\varvec{\theta }\right) ^T\Sigma \left( L_J\bar{\mathbf {X}}-L_J\varvec{\theta }\right) \\&+\frac{n}{\sigma ^2}\left( L_J\varvec{\theta }\right) ^T\Sigma \left( L_J\varvec{\theta }\right) \\&\triangleq Y_{1n}+Y_{2n}+Y_{3n}. \end{aligned}$$

Notice that

$$\begin{aligned} Y_{1n}\sim \chi ^2_{r_J}, \frac{\sqrt{n}}{\sigma }R^T\left( L_J\bar{\mathbf {X}}-L_J\varvec{\theta }\right) \sim N(\mathbf {0},\mathbf {I}_{r_J}), \end{aligned}$$

then \(Y_{1n}=O_p(1)\) and \(Y_{2n}=O_p(\sqrt{n})\). Additionally, \(Y_{3n}\asymp n\), so \(Y_n\asymp _p n\); that is, \(Y_n\) is of the same order as \(n\) in probability.

Since \(\chi ^{-2}_m(1-\alpha (n))=o(n)\) and \(r_J\le m\), the monotonicity of chi-square quantiles in the degrees of freedom gives \(\chi ^{-2}_{r_J}(1-\alpha (n))=o(n)\). So

$$\begin{aligned} \lim _{n\rightarrow \infty }\mathrm{Pr}\left( Y_n\ge \chi ^{-2}_{r_J}(1-\alpha (n)) \right) = 1, \quad \text {i.e.}\quad \lim _{n\rightarrow \infty }\mathrm{Pr}\left( p(L_J)\le \alpha \right) = 1. \end{aligned}$$

Case 2: \(\sigma ^2\) is unknown.

As before, let

$$\begin{aligned} W_n=\frac{n{\bar{\mathbf {X}}}^T{L_J}^T\left( L_J{L_J}^T\right) ^{-1}L_J\bar{\mathbf {X}}}{r_JS^2}=\frac{Y_n}{r_J}/\frac{S^2}{\sigma ^2}. \end{aligned}$$

Since \(S^2\) converges to \(\sigma ^2\) in probability, the rest of the argument proceeds exactly as in Case 1. \(\square \)
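As a quick numerical check of Lemma 1(ii) in Case 1, the following sketch uses an assumed example (three normal means with \(\varvec{\theta }=(0,0,1)\), \(\sigma ^2\) known, and two pairwise contrasts, one true and one false) that is not taken from the paper: the statistic \(Y_n\) grows linearly in \(n\) whenever \(H_J\) is false, so the intersection \(p\)-value collapses to zero.

```python
# A sketch checking Lemma 1(ii), case 1 (sigma^2 known), on an assumed example:
# Y_n = (n / sigma^2) Xbar^T L_J^T (L_J L_J^T)^{-1} L_J Xbar should grow like n
# whenever H_J is false, so p(J) = Pr(chi^2_{r_J} >= Y_n) tends to 0.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
theta, sigma = np.array([0.0, 0.0, 1.0]), 1.0   # the third mean differs
L_J = np.array([[1.0, -1.0,  0.0],              # theta[0] - theta[1] = 0 (true)
                [1.0,  0.0, -1.0]])             # theta[0] - theta[2] = 0 (false)
for n in (50, 200, 800):
    xbar = theta + rng.normal(scale=sigma / np.sqrt(n), size=3)  # Xbar ~ N(theta, sigma^2/n I)
    v = L_J @ xbar
    Y_n = n / sigma**2 * v @ np.linalg.solve(L_J @ L_J.T, v)
    p_J = stats.chi2.sf(Y_n, df=L_J.shape[0])   # r_J = rank(L_J) = 2
    print(n, round(Y_n, 1), p_J)                # Y_n grows like n, p_J -> 0
```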

Proof of Lemma 2

For any \(\alpha \) satisfying (C0), from Lemma 1, we have

$$\begin{aligned}&\mathrm{Pr}\left( p(K)>p(J)\right) \\&\quad =\mathrm{Pr}\left( p(K)> p(J),p(K)\le \alpha \right) +\mathrm{Pr}\left( p(K)> p(J),p(K)>\alpha \right) \\&\quad \le \mathrm{Pr}\left( p(J)\le \alpha \right) +\mathrm{Pr}\left( p(K)>\alpha \right) \\&\quad \rightarrow 0, \text {as}\,\, n\rightarrow \infty . \end{aligned}$$

\(\square \)

Proof of Theorem 1

(i) For All-sets-down procedure,

$$\begin{aligned} \{J_0\ne I_0\}=\left\{ \bigcup _{|J|>|I_0|}\left\{ p(J)>\alpha \right\} \right\} \cup \left\{ \bigcup _{|J|=|I_0|,J\ne I_0}\left\{ p(I_0)<p(J)\right\} \right\} \cup \{p(I_0)\le \alpha \}. \end{aligned}$$

From Lemma 1 and Lemma 2,

$$\begin{aligned} \mathrm{Pr}\left( \bigcup _{|J|>|I_0|}\left\{ p(J)>\alpha \right\} \right)&\le \sum _{|J|>|I_0|}\mathrm{Pr}\left( p(J)>\alpha \right) \rightarrow 0,\\&\text {as } n\rightarrow \infty ,\\ \mathrm{Pr}\left( \bigcup _{|J|=|I_0|,J\ne I_0}\left\{ p(I_0)<p(J)\right\} \right)&\le \sum _{|J|=|I_0|,J\ne I_0}\mathrm{Pr}\left( p(I_0)<p(J)\right) \rightarrow 0,\\&\text {as } n\rightarrow \infty , \end{aligned}$$

then

$$\begin{aligned} \lim _{n\rightarrow \infty }\mathrm{Pr}\left( J_0=I_0\right) =1. \end{aligned}$$

(ii) For All-sets-up procedure,

$$\begin{aligned}&\left\{ J_0\ne I_0\right\} \\&\quad =\left\{ \bigcup _{d=1}^{|I_0|-1}\bigcap _{|J|=d}\{p(J)\le \alpha \}\right\} \cup \left\{ \bigcup _{|J|=|I_0|,J\ne I_0}\{p(J)>p(I_0)\}\right\} \cup \{p(I_0)\le \alpha \}\\&\quad \cup \left\{ \bigcup _{|J|=|I_0|+1}\{p(J)>\alpha \}\right\} . \end{aligned}$$

Since

$$\begin{aligned}&\mathrm{Pr}\left( \bigcup _{d=1}^{|I_0|-1}\bigcap _{|J|=d}\{p(J)\le \alpha \}\right) \le \sum _{d=1}^{|I_0|-1}\mathrm{Pr}\left( \bigcap _{|J|=d,J\subseteq I_0}\{p(J)\le \alpha \}\right) \rightarrow 0, \text {as } n\rightarrow \infty ,\\&\mathrm{Pr}\left( \bigcup _{|J|=|I_0|,J\ne I_0}\{p(J)>p(I_0)\}\right) \le \sum _{|J|=|I_0|,J\ne I_0}\mathrm{Pr}\left( p(J)>p(I_0)\right) \rightarrow 0, \text {as } n\rightarrow \infty ,\\&\mathrm{Pr}\left( \bigcup _{|J|=|I_0|+1}\{p(J)>\alpha \}\right) \le \sum _{|J|=|I_0|+1}\mathrm{Pr}\left( p(J)>\alpha \right) \rightarrow 0, \text {as } n\rightarrow \infty , \end{aligned}$$

and \( \lim \limits _{n\rightarrow \infty }\mathrm{Pr}(p(I_0)\le \alpha )=0\), we immediately get

$$\begin{aligned} \lim _{n\rightarrow \infty }\mathrm{Pr}(J_0=I_0)=1. \end{aligned}$$

(iii) For Nested-up procedure, let

$$\begin{aligned} D_n=\{\text {all true null hypotheses are accepted and no false null hypothesis is accepted}\}. \end{aligned}$$

From Lemma 1 and Lemma 2, we have \(\lim \limits _{n\rightarrow \infty }\mathrm{Pr}(D_n)=1\). Therefore, with probability tending to one, at every step that still contains at least one true null hypothesis, it is a true null that gets accepted. Since

$$\begin{aligned} \{J_0=I_0\}=D_n\cap \left\{ \bigcap _{|J|=|I_0|+1, J\supset I_0}\{p(J)\le \alpha \}\right\} , \end{aligned}$$

and \(\lim \limits _{n\rightarrow \infty }\mathrm{Pr}(p(J)\le \alpha )=1\) for every \(J\supset I_0\), we have

$$\begin{aligned} \lim _{n\rightarrow \infty }\mathrm{Pr}\left( J_0=I_0\right) =1. \end{aligned}$$

\(\square \)
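To see Theorem 1 at work numerically, here is a Monte Carlo sketch of the All-sets-down procedure (as reconstructed after the Abstract) under the same assumed three-means setup; the helpers p_value and estimate_I0 are our own names, and \(\alpha (n)=n^{-1/2}\) is one convenient choice with \(\chi ^{-2}(1-\alpha (n))=O(\log n)=o(n)\), in the spirit of condition (C0).

```python
# Monte Carlo sketch of Theorem 1(i) on an assumed three-means example:
# the All-sets-down estimate J0 should coincide with the true-null set I0
# with frequency approaching 1 as n grows. Helper names are illustrative.
import numpy as np
from itertools import combinations
from scipy import stats

rng = np.random.default_rng(1)
theta, sigma = np.array([0.0, 0.0, 1.0]), 1.0
pairs = [(0, 1), (0, 2), (1, 2)]   # H_ab: theta[a] = theta[b]
I0 = {(0, 1)}                      # the only true pairwise null

def p_value(J, xbar, n):
    # chi-square intersection p-value from case 1 of Lemma 1 (sigma^2 known);
    # pinv covers the rank-deficient set of all three pairwise contrasts
    L = np.array([[float(i == a) - float(i == b) for i in range(3)]
                  for (a, b) in J])
    v = L @ xbar
    Y = n / sigma**2 * v @ np.linalg.pinv(L @ L.T) @ v
    return stats.chi2.sf(Y, df=np.linalg.matrix_rank(L))

def estimate_I0(xbar, n, alpha):
    # All-sets-down: step down in set size, stopping at the first level whose
    # largest intersection p-value exceeds alpha
    for d in range(len(pairs), 0, -1):
        best = max(combinations(pairs, d), key=lambda J: p_value(J, xbar, n))
        if p_value(best, xbar, n) > alpha:
            return set(best)
    return set()

for n in (50, 400, 1600):
    alpha_n = n ** -0.5   # alpha(n) -> 0 slowly, so the chi-square quantile is o(n)
    hits = sum(estimate_I0(theta + rng.normal(scale=sigma / np.sqrt(n), size=3),
                           n, alpha_n) == I0
               for _ in range(500))
    print(n, hits / 500)  # frequency of {J0 = I0}, rising toward 1
```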

Proof of Proposition 1

When all the null hypotheses are true, for All-sets-down and Nested-down procedures,

$$\begin{aligned} \mathrm{FWER}=\mathrm{Pr}(p(I)\le \alpha )=\alpha . \end{aligned}$$

\(\square \)

Proof of Proposition 2

Suppose there exists at least one false null hypothesis (otherwise Proposition 1 applies). Let \(V_n\) denote the number of false rejections, so that \(\mathrm{FWER}=\mathrm{Pr}(V_n\ge 1)\). For the All-sets-down procedure,

$$\begin{aligned} \{V_n\ge 1\}=E_n\cup F_n\cup G_n, \end{aligned}$$

where

$$\begin{aligned} E_n&= \bigcup _{d>|I_0|}\left\{ \left\{ \bigcap _{|J|=d+1}\{p(J)\le \alpha \}\right\} \right. \\&\left. \cap \left\{ \bigcup _{|K|=d,I_0\nsubseteq K} \left\{ p(K)>\alpha ,p(K)=\max _{|W|=|K|}p(W)\right\} \right\} \right\} ,\\ F_n&= \left\{ \bigcap _{|J|>|I_0|}\{p(J)\le \alpha \}\right\} \cap \left\{ p(I_0)<\max _{K\ne I_0,|K|=|I_0|}p(K)\right\} ,\\ G_n&= \left\{ \bigcap _{|J|>|I_0|}\{p(J)\le \alpha \}\right\} \cap \left\{ p(I_0)=\max _{|K|=|I_0|}p(K)\right\} \cap \left\{ p(I_0)\le \alpha \right\} . \end{aligned}$$

Since \(H_K\) is false for every \(K\) with \(|K|>|I_0|\), for any fixed \(\alpha \in (0,1)\),

$$\begin{aligned} \lim _{n\rightarrow \infty }\mathrm{Pr}(p(K)>\alpha )= 0, \quad \text {and}\quad \lim _{n\rightarrow \infty }\mathrm{Pr}(E_n)= 0. \end{aligned}$$

Additionally, from Lemma 2,

$$\begin{aligned} \mathrm{Pr}\left( p(I_0)<\max _{K\ne I_0,|K|=|I_0|}p(K)\right) \rightarrow 0, \end{aligned}$$

so \(\lim _{n\rightarrow \infty }\mathrm{Pr}(F_n)=0.\)

Furthermore, since \(\mathrm{Pr}\left( p(I_0)\le \alpha \right) =\alpha \), and

$$\begin{aligned} \lim _{n\rightarrow \infty }\mathrm{Pr} \left( \bigcap _{|J|>|I_0|}\{p(J)\le \alpha \}\right) = 1, \lim _{n\rightarrow \infty }\mathrm{Pr}\left( p(I_0)=\max _{|K|=|I_0|}p(K)\right) = 1, \end{aligned}$$

we get \(\lim _{n\rightarrow \infty }\mathrm{Pr}(G_n)=\alpha .\)

It is easy to see that

$$\begin{aligned} \mathrm{Pr}(G_n)\le \mathrm{Pr}(V_n\ge 1)\le \mathrm{Pr}(E_n)+\mathrm{Pr}(F_n)+\mathrm{Pr}(G_n). \end{aligned}$$

Since

$$\begin{aligned} \lim _{n\rightarrow \infty }\mathrm{Pr}(E_n)=0, \lim _{n\rightarrow \infty }\mathrm{Pr}(F_n)= 0, \lim _{n\rightarrow \infty }\mathrm{Pr}(G_n)=\alpha , \end{aligned}$$

we have

$$\begin{aligned} \lim _{n\rightarrow \infty }\mathrm{Pr}(V_n\ge 1)= \alpha . \end{aligned}$$

\(\square \)
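A companion Monte Carlo check of Proposition 2, reusing p_value and estimate_I0 from the sketch above but now holding \(\alpha \) fixed: with a single true null in that assumed setup, a familywise error occurs exactly when \((0,1)\) drops out of \(J_0\), and its frequency should tend to \(\alpha \).

```python
# Monte Carlo check of Proposition 2 (reusing theta, sigma, rng, p_value and
# estimate_I0 from the consistency sketch above, with alpha held fixed): the
# frequency of a false rejection, i.e. (0, 1) missing from J0, tends to alpha.
def fwer(n, reps=2000, alpha=0.05):
    miss = 0
    for _ in range(reps):
        xbar = theta + rng.normal(scale=sigma / np.sqrt(n), size=3)
        miss += (0, 1) not in estimate_I0(xbar, n, alpha)
    return miss / reps

print(fwer(n=1000))   # close to alpha = 0.05 for large n
```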

About this article

Cite this article

Wang, L., Xu, X. & A, Y. New multiple testing method under no dependency assumption, with application to multiple comparisons problem. Stat Papers 57, 161–183 (2016). https://doi.org/10.1007/s00362-014-0650-2

