Abstract
A new multiple testing procedure, called the \(\mathrm{FDR}_{L}\) procedure, was proposed by Zhang et al. (Ann Stat 39:613–642, 2011) for detecting the presence of spatial signals in large-scale 2D and 3D imaging data. In contrast to the conventional multiple testing procedure, the \(\mathrm{FDR}_{L}\) procedure replaces each p-value with a locally aggregated median filter of p-values. This paper examines the performance of another commonly used filter, the mean filter, in the \(\mathrm{FDR}_{L}\) procedure. It is demonstrated that when the p-values are independent and uniformly distributed under the true null hypotheses, (i) in terms of estimating the resulting false discovery rate, the mean filter alleviates the “lack of identification phenomenon” of the \(\mathrm{FDR}_{L}\) procedure better than the median filter does; (ii) in terms of signal detection, the median filter enjoys the “edge-preserving property”, which lends support to its better performance than the mean filter in detecting sparse signals.
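To make the two filters concrete, the following minimal Python sketch applies a mean filter and a median filter to a toy 2D map of p-values. The 3×3 window clipped at the image border, the helper names `neighborhood` and `filter_pvalues`, and the toy values are all assumptions made for illustration; the neighborhood definition used by Zhang et al. (2011) may differ.

```python
from statistics import mean, median


def neighborhood(grid, i, j, radius=1):
    """Values in the square window around (i, j), clipped at the border.

    The window shape is an illustrative assumption, not the paper's definition.
    """
    rows, cols = len(grid), len(grid[0])
    return [
        grid[a][b]
        for a in range(max(0, i - radius), min(rows, i + radius + 1))
        for b in range(max(0, j - radius), min(cols, j + radius + 1))
    ]


def filter_pvalues(grid, aggregate, radius=1):
    """Replace each p-value by an aggregate (mean or median) of its window."""
    return [
        [aggregate(neighborhood(grid, i, j, radius)) for j in range(len(grid[0]))]
        for i in range(len(grid))
    ]


# Toy map: the left three columns mimic signal (small p-values),
# the right three columns mimic null locations (larger p-values).
p = [[0.01, 0.02, 0.01, 0.60, 0.70, 0.80] for _ in range(5)]

p_mean = filter_pvalues(p, mean)
p_median = filter_pvalues(p, median)

# Location (2, 2) lies on the signal side of the edge.
print(p_median[2][2], round(p_mean[2][2], 4))
```

At the signal-side edge location (row 2, column 2), the median-filtered value stays at 0.02, whereas the mean-filtered value rises to about 0.21 because it averages in the neighboring null column. This only illustrates the edge-preserving behaviour in one toy configuration.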
References
Arias-Castro E, Donoho DL (2009) Does median filtering truly preserve edges better than linear filtering? Ann Stat 37:1172–1206
Benjamini Y, Hochberg Y (1995) Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc B 57:289–300
Brown BM (1983) Statistical uses of the spatial median. J R Stat Soc B 45:25–30
Efron B (2010) Large-scale inference. Empirical Bayes methods for estimation, testing, and prediction. Cambridge University Press, Cambridge
Feller W (1968) An introduction to probability theory and its applications, Vol. I, 3rd edn. Wiley, New York
Ruiz S (1996) An algebraic identity leading to Wilson’s theorem. Math Gaz 80:579–582
Sadooghi-Alvandi S, Nematollahi A, Habibi R (2009) On the distribution of the sum of independent uniform random variables. Stat Pap 50:171–175
Storey JD, Taylor JE, Siegmund D (2004) Strong control, conservative point estimation and simultaneous conservative consistency of false discovery rates: a unified approach. J R Stat Soc B 66:187–205
Uspensky JV (1937) Introduction to mathematical probability. McGraw-Hill, New York
van der Vaart AW (1998) Asymptotic statistics. Cambridge University Press, Cambridge
Zhang CM, Yu T (2008) Semiparametric detection of significant activation for brain fMRI. Ann Stat 36:1693–1725
Zhang CM, Fan J, Yu T (2011) Multiple testing via FDR L for large-scale imaging data. Ann Stat 39:613–642
Acknowledgements
The author thanks the Associate Editor and referee for insightful comments.
The research is supported by the US National Science Foundation grants DMS-1106586, DMS-1308872 and Wisconsin Alumni Research Foundation.
Appendix
Throughout the proof, since \(\alpha_{\infty}^{\mathrm{FDR}}\) has been treated in the Appendix of Zhang et al. (2011), derivations will be confined to \(\alpha_{\infty}^{\mathrm{FDR}_{L}}\) of the mean filter.
Condition A
- A0. The neighborhood size \(k\) is an integer not depending on \(n\).
- A1. \(\lim_{n\to\infty} n_{0}/n=\pi_{0}\) exists and \(\pi_{0}<1\).
Proof of Theorem 1
Note that for the original p-values,
Also, for the “mean filtered” \(p^{*}(\upsilon)\)-values, for any \(\upsilon \in\mathcal{V}_{j}\), where \(j=0,1\), we obtain
Thus, for \(p^{*}(\upsilon)\)-values under true \(H_{0}(\upsilon)\), \(G_{0}^{*}(t)\) is the C.D.F. corresponding to the p.d.f. in (3.7); for \(p^{*}(\upsilon)\)-values under true \(H_{1}(\upsilon)\), \(G_{1}^{*}(t)\) is the C.D.F. corresponding to the p.d.f. in (3.7) with \(t\) replaced by \(G_{1}(t)\).
Part I. For the \(\mathrm{FDR}_{L}\) procedure, by (3.2),
Applying L’Hospital’s rule and the fact that \(\lim_{t\to0+} G_{1}(t)=0\),
where \(x = F_{0}^{-1}(1-t)\). Note that \(\widehat{\mathrm{FDR}_{L}}^{\infty}(t)\) is a decreasing function of \(G_{1}^{*}(t)/G_{0}^{*}(t)\). Using \(\lim_{t\to0+} G_{1}^{*}(t)=0\), \(\lim_{t\to0+} G_{0}^{*}(t)=0\), and (3.3),
which together with (A.3) shows \(\lim_{t\to0+} G_{1}^{*}(t)/G_{0}^{*}(t) = \lim_{t\to0+} \{G_{1}(t)/t\}^{k} = \infty\). Thus, \(\sup_{0 < t \le1} G_{1}^{*}(t)/G_{0}^{*}(t)=\infty\), that is, \(\alpha_{\infty}^{\mathrm{FDR}_{L}} = 0\) for the \(\mathrm{FDR}_{L}\) procedure.
Part II. Following \(\widehat{\mathrm{FDR}_{L}}^{\infty}(t)\), we conclude that \(\alpha_{\infty}^{\mathrm{FDR}_{L}} \ne0\) if
We first verify (A.5) for the \(\mathrm{FDR}_{L}\) procedure. Assume (A.5) fails, i.e., \(\sup_{0<t \le 1} G_{1}^{*}(t) / G_{0}^{*}(t)=\infty\). Note that for any \(\delta>0\), the function \(G_{1}^{*}(t)/G_{0}^{*}(t)\), for \(t\in[\delta,1]\), is continuous and bounded above; thus, \(\sup_{0<t \le1}G_{1}^{*}(t)/G_{0}^{*}(t)=\infty\) only if there exists a sequence \(t_{1}>t_{2}>\cdots>0\) such that \(\lim_{m\to\infty} t_{m}=0\) and \(\lim_{m\to\infty} G_{1}^{*}(t_{m})/G_{0}^{*}(t_{m}) = \infty\). For each \(m\), recall that both \(G_{1}^{*}(t)\) and \(G_{0}^{*}(t)\) are continuous on \([0,t_{m}]\) and differentiable on \((0,t_{m})\). Applying Cauchy’s mean-value theorem, there exists \(\xi_{m}\in(0,t_{m})\) such that \(G_{1}^{*}(t_{m})/G_{0}^{*}(t_{m}) = \{G_{1}^{*}(t_{m})-G_{1}^{*}(0)\}/{\{G_{0}^{*}(t_{m})-G_{0}^{*}(0)\}} = \frac{d G_{1}^{*}(t)/dt}{d G_{0}^{*}(t)/dt}|_{t=\xi_{m}}\). Since \(\lim_{m\to\infty} G_{1}^{*}(t_{m})/G_{0}^{*}(t_{m}) = \infty\), it follows that \(\limsup_{t\to0+}\frac{d G_{1}^{*}(t)/dt}{d G_{0}^{*}(t)/dt} = \infty\), which combined with (A.4) implies
On the other hand, the condition \(\limsup_{x\to x_{0}-} {f_{1}(x)}/{f_{0}(x)} < \infty\) indicates that
where \(x = F_{0}^{-1}(1-t)\). Clearly, (A.7) contradicts (A.6). The proof is completed. □
Proof of Theorem 2
We first show Lemma 1.
Lemma 1
Let \(B(t)\) be the C.D.F. of the p.d.f. corresponding to (3.7) with \(k\ge3\). Then I. for \(t\in(0,0.5)\), \(B(t)/t\) is a strictly increasing function and \(B(t)<t\); II. for \(t\in(0.5,1)\), \(B(t)>t\); III. for \(t_{1}\in(0,0.5]\) and \(t_{2}\in[t_{1},1]\), \(B(t_{1})/t_{1}\le B(t_{2})/t_{2}\).
Proof
Since (3.7) is symmetric with respect to 0.5, we deduce that \(B(t)=1-B(1-t)\) and \(B'(t)=B'(1-t)\), i.e., the p.d.f. \(B'(t)\) is symmetric with respect to 0.5. It follows that \(B''(t)=-B''(1-t)\), namely, \(B''(t)\) is antisymmetric with respect to 0.5. More precisely, it is easy to see from (3.7) that
Hence from (3.4) and (A.8), the possible roots of \(B''(t)\) are at \(\{0,0.5,1\}\). For positive \(t\) close to 0, \(B''(t)\) is a positive polynomial function of degree \(k-2\).
To show part I, define \(\mathrm{F}_{1}(t)=B(t)/t\). Then \(\mathrm{F}_{1}'(t)=\{B'(t)t-B(t)\}/t^{2}\), where \(d\{B'(t)t-B(t)\}/dt=B''(t)t\). For \(t\in(0,0.5)\), (A.8) and the above analysis indicate \(B''(t)>0\), i.e., \(B'(t)t-B(t)\) is strictly increasing, implying \(B'(t)t-B(t)>B'(0)\cdot0-B(0)=0\). Hence for \(t\in(0,0.5)\), \(B(t)/t\) is strictly increasing and therefore \(B(t)/t<B(0.5)/0.5=1\).
For part II, define \(\mathrm{F}_{2}(t)=B(t)-t\). Then \(\mathrm{F}_{2}''(t)=B''(t)\). By (A.8), \(B''(t)<0\) for \(t\in(0.5,1)\), thus \(\mathrm{F}_{2}(t)\) is strictly concave on this interval, giving \(\mathrm{F}_{2}(t)>\min\{\mathrm{F}_{2}(0.5),\mathrm{F}_{2}(1)\}=0\).
Last, we show part III. For \(t_{2}\in[t_{1},0.5]\), part I indicates that \(B(t_{1})/t_{1}\le B(t_{2})/t_{2}\); for \(t_{2}\in[0.5,1]\), part II indicates that \(B(t_{2})/t_{2}\ge1\) which, combined with \(B(t_{1})/t_{1}\le1\) from part I, yields \(B(t_{1})/t_{1}\le B(t_{2})/t_{2}\). □
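Lemma 1 can also be checked numerically on a grid. The sketch below assumes, as the Appendix indicates for \(G_{0}^{*}\), that \(B(t)\) is the C.D.F. of the mean of \(k\) i.i.d. Unif(0,1) variables, and evaluates it with the classical closed form for the sum of uniforms (the Irwin-Hall distribution). The function name `mean_uniform_cdf`, the choice \(k=5\) and the grid spacing are illustrative assumptions; a grid check is of course no substitute for the proof.

```python
from math import comb, factorial, floor


def mean_uniform_cdf(t, k):
    """P(mean of k i.i.d. Unif(0,1) <= t), from the Irwin-Hall C.D.F. of the sum."""
    if t <= 0:
        return 0.0
    if t >= 1:
        return 1.0
    x = k * t  # threshold on the scale of the sum
    total = sum((-1) ** j * comb(k, j) * (x - j) ** k for j in range(floor(x) + 1))
    return total / factorial(k)


k = 5  # illustrative neighborhood size (the lemma assumes k >= 3)
grid = [i / 1000 for i in range(1, 1000)]
ratios = [mean_uniform_cdf(t, k) / t for t in grid]

# Part I: on (0, 0.5), B(t)/t is strictly increasing and B(t) < t.
below_half = [r for t, r in zip(grid, ratios) if t < 0.5]
part_one = all(r < 1 for r in below_half) and all(
    a < b for a, b in zip(below_half, below_half[1:])
)

# Part II: on (0.5, 1), B(t) > t.
part_two = all(mean_uniform_cdf(t, k) > t for t in grid if t > 0.5)

# Part III: for t1 in (0, 0.5] and t2 in [t1, 1), B(t1)/t1 <= B(t2)/t2.
part_three = all(
    ratios[i] <= min(ratios[i:]) + 1e-12
    for i, t in enumerate(grid)
    if t <= 0.5
)

print(part_one, part_two, part_three)
```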
We now prove Theorem 2. It suffices to show that
To verify (A.9), it suffices to show that \(G_{1}(\lambda) \le G_{1}^{*}(\lambda)\) and \(\lambda\ge G_{0}^{*}(\lambda)\). Following (A.2), for \(0\le t\le1\),
Applying (A.11), (A.1), \(1-F_{0}(F_{1}^{-1}(0.5))\le\lambda\) and part II of Lemma 1 yields \(G_{1}(\lambda) \le G_{1}^{*}(\lambda)\); applying \(\lambda\le0.5\) and part I of Lemma 1 implies \(\lambda \ge G_{0}^{*}(\lambda)\). This shows (A.9).
To verify (A.10), let \(M=\sup_{0<t\le1} G_{1}(t)/t\). Since \(G_{1}(1)/1=1\), we have \(M\ge1\), which will be discussed in two cases. Case 1: if \(M=1\), then
Case 2: if \(M>1\), then there exist \(t_{0}\in[0,1]\) and \(t_{n}\in(0,1)\) such that \(\lim_{n\to\infty} t_{n}=t_{0}\), and
Thus, there exists \(N_{1}\) such that for all \(n>N_{1}\),
Cases of \(t_{0}=1\), \(t_{0}=0\) and \(t_{0}\in(0,1)\) will be discussed separately. First, if \(t_{0}=1\), then \(M=\lim_{n\to\infty}\{G_{1}(t_{n})/t_{n}\}=\lim_{n\to\infty} G_{1}(t_{n})\le1\), which contradicts (A.13). Thus \(t_{0}<1\). Second, if \(t_{0}=0\), then there exists \(N_{2}\) such that \(t_{n}<0.5\) for all \(n>N_{2}\). Thus for all \(n>N\equiv\max\{N_{1},N_{2}\}\), applying (A.11), (A.14) and part III of Lemma 1, we have
This together with (A.13) shows
Third, for \(t_{0}\in(0,1)\), since both \(F_{0}\) and \(F_{1}\) are differentiable and \(f_{0}\) is supported in a single interval, \(G_{1}(t)/t=\{1-F_{1}(F_{0}^{-1}(1-t))\}/t\) is differentiable in \((0,1)\). Thus,
and \({d\{G_{1}(t)/t\}}/{dt}|_{t=t_{0}} = 0\). Notice
If \(t_{0}>0.5\), then \(F_{0}^{-1}(1-t_{0})\le F_{0}^{-1}(0.5)\). By (A.3) and the assumption on \(f_{0}\) and \(f_{1}\), \({dG_{1}(t)}/{dt}|_{t=t_{0}} = f_{1}(F_{0}^{-1}(1-t_{0}))/f_{0}(F_{0}^{-1}(1-t_{0})) \le1\), which contradicts (A.17). Thus, \(0<t_{0}\le0.5\). This together with (A.11), (A.16), and part III of Lemma 1 gives
This together with (A.16) shows,
Combining (A.12), (A.15) and (A.18) completes the proof.
Calculation of \(\alpha_{\infty}^{\mathrm{FDR}_{L}}\) of the mean filter in Table 3 of Sect. 4.1
From (A.1) and the conditions given in Sect. 4.1,
Now we compute \(\alpha_{\infty}^{\mathrm{FDR}_{L}}\) of the mean filter. Recall that the distribution \(G_{0}^{*}(t)\) is that of \(\overline{\mathrm{U}}_{k}=\sum_{i=1}^{k} \mathrm{U}_{i}/k\), where \(\{\mathrm{U}_{i}\}_{i=1}^{k} \stackrel {\text{i.i.d.}}{\sim} \mathrm{Unif}(0,1)\), and the distribution \(G^{*}(t)\) is also that of \(\overline{\mathrm{U}}_{k}\). Similarly, by (A.2), the distribution \(G_{1}^{*}(t)\) is that of \(\overline{\mathrm{U}}_{k}/e^{C}\).
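The distribution of \(\overline{\mathrm{U}}_{k}\) can be evaluated in closed form, which is convenient when computing \(G_{0}^{*}(t)\). The following sketch uses the classical formula for the C.D.F. of a sum of i.i.d. uniforms (the Irwin-Hall distribution, see e.g. Feller 1968) and compares it with a seeded Monte Carlo estimate. The function names, the values \(k=5\), \(t=0.3\) and the number of draws are illustrative assumptions; whether this matches the exact form of (3.5) in the main text is not verified here.

```python
import random
from math import comb, factorial, floor


def mean_uniform_cdf(t, k):
    """P(mean of k i.i.d. Unif(0,1) <= t), from the Irwin-Hall C.D.F. of the sum."""
    if t <= 0:
        return 0.0
    if t >= 1:
        return 1.0
    x = k * t  # threshold on the scale of the sum
    total = sum((-1) ** j * comb(k, j) * (x - j) ** k for j in range(floor(x) + 1))
    return total / factorial(k)


def simulate_cdf(t, k, draws=100_000, seed=0):
    """Monte Carlo estimate of the same probability, with a fixed seed."""
    rng = random.Random(seed)
    hits = sum(
        sum(rng.random() for _ in range(k)) / k <= t for _ in range(draws)
    )
    return hits / draws


k, t = 5, 0.3  # illustrative values
exact = mean_uniform_cdf(t, k)
approx = simulate_cdf(t, k)
print(f"closed form: {exact:.5f}, simulation: {approx:.5f}")
```

For \(k=5\) and \(t=0.3\) the closed form reduces to \((1.5^{5}-5\cdot0.5^{5})/5! = 7.4375/120\), and the simulated value should agree up to Monte Carlo error.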
Hence, \(\widehat{\mathrm{FDR}_{L}}^{\infty}(t) = [{\pi_{0}+\pi_{1}\{1-G_{1}^{*}(\lambda)\}/\{1- G_{0}^{*}(\lambda)\}}] /\{ \pi_{0}+\pi_{1} G_{1}^{*}(t)/G_{0}^{*}(t)\}\), yielding
Note that
Note that \({1}/{\mathrm{P}(\overline{ \mathrm{U}}_{k} \le t)}\) is decreasing in \(t\). By a graphical approach for \(k=5\), \({\mathrm{P}(\overline{ \mathrm{U}}_{k} \le te^{C})}/{\mathrm {P}(\overline {\mathrm{U}}_{k} \le t)}\) is also decreasing in \(t\). Thus, \(\sup_{t\in(0, 1]}G_{1}^{*}(t)/G_{0}^{*}(t) = \lim_{t\to0+} G_{1}^{*}(t)/G_{0}^{*}(t) = \lim_{t\to0+} \{G_{1}(t)/t\}^{k} = e^{Ck}\) from (A.4) and (A.19). So \(\alpha_{\infty}^{\mathrm{FDR}_{L}} = [{\pi_{0}+\pi_{1}\{ 1-G_{1}^{*}(\lambda)\}/\{1- G_{0}^{*}(\lambda)\}}] / (\pi_{0}+\pi_{1} e^{Ck}) \), where \(G_{0}^{*}(\cdot)\) can be calculated from (3.5). This completes the derivation.
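The graphical argument and the final formula can be reproduced numerically. The sketch below takes \(G_{0}^{*}\) and \(G_{1}^{*}\) to be the distributions of \(\overline{\mathrm{U}}_{k}\) and \(\overline{\mathrm{U}}_{k}/e^{C}\), as stated above, checks on a grid that \(G_{1}^{*}(t)/G_{0}^{*}(t)\) is non-increasing with value \(e^{Ck}\) near zero, and then evaluates the expression for \(\alpha_{\infty}^{\mathrm{FDR}_{L}}\). The values of \(C\), \(\pi_{0}\) and \(\lambda\) are hypothetical placeholders, not the settings of Table 3, and the function names are illustrative.

```python
from math import comb, exp, factorial, floor


def mean_uniform_cdf(t, k):
    """P(mean of k i.i.d. Unif(0,1) <= t), from the Irwin-Hall C.D.F. of the sum."""
    if t <= 0:
        return 0.0
    if t >= 1:
        return 1.0
    x = k * t
    total = sum((-1) ** j * comb(k, j) * (x - j) ** k for j in range(floor(x) + 1))
    return total / factorial(k)


def g1_star(t, k, c):
    """C.D.F. of (mean of k uniforms) / e^c, i.e. P(mean <= t * e^c)."""
    return mean_uniform_cdf(t * exp(c), k)


def alpha_mean_filter(pi0, lam, k, c):
    """Evaluate [pi0 + pi1 (1 - G1*(lam)) / (1 - G0*(lam))] / (pi0 + pi1 e^{ck})."""
    pi1 = 1.0 - pi0
    numerator = pi0 + pi1 * (1.0 - g1_star(lam, k, c)) / (1.0 - mean_uniform_cdf(lam, k))
    return numerator / (pi0 + pi1 * exp(c * k))


k = 5                              # as in the graphical check in the text
c, pi0, lam = 1.0, 0.8, 0.1        # hypothetical values, not those of Table 3

grid = [i / 1000 for i in range(1, 1001)]
ratios = [g1_star(t, k, c) / mean_uniform_cdf(t, k) for t in grid]

# Non-increasing on the grid, up to floating-point noise.
is_nonincreasing = all(a >= b - 1e-9 for a, b in zip(ratios, ratios[1:]))
alpha = alpha_mean_filter(pi0, lam, k, c)

print(is_nonincreasing, ratios[0], exp(c * k), alpha)
```

With these placeholder values the ratio is flat at \(e^{Ck}\) for small \(t\) (where \(kte^{C}\le1\)) and decreases afterwards, so the supremum is attained in the limit \(t\to0+\), consistent with the derivation.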
Cite this article
Zhang, C. Assessing mean and median filters in multiple testing for large-scale imaging data. TEST 23, 51–71 (2014). https://doi.org/10.1007/s11749-013-0341-7
DOI: https://doi.org/10.1007/s11749-013-0341-7