Abstract
Based on two independent samples X_1, ..., X_m and X_{m+1}, ..., X_n drawn from multivariate distributions with unknown Lebesgue densities p and q respectively, we propose an exact multiple test to identify simultaneously regions of significant deviation between p and q. The construction is built from randomized nearest-neighbor statistics. It does not require any preliminary information about the multivariate densities, such as compact support, strict positivity, or smoothness and shape properties. The properly adjusted multiple testing procedure is shown to be sharp-optimal for typical arrangements of the observation values, which appear with probability close to one. The proof relies on a new coupling Bernstein-type exponential inequality, reflecting the non-subgaussian tail behavior of a combinatorial process. To investigate the power of the proposed method, a reparametrized minimax set-up is introduced, reducing the composite hypothesis “p = q” to a simple one with the multivariate mixed density (m/n)p + (1 − m/n)q as an infinite-dimensional nuisance parameter. Within this framework, the test is shown to be spatially and sharply asymptotically adaptive with respect to uniform loss on isotropic Hölder classes. The exact minimax risk asymptotics are obtained in terms of solutions of the optimal recovery problem.
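To make the idea of a finite-sample test built from nearest-neighbor label counts concrete, the following is a minimal sketch and not the procedure of the article. It assumes a single fixed neighborhood size k, a plain permutation calibration of the maximum standardized local count, and Euclidean distance; the article's randomization, its multiscale adjustment, and its optimality properties are not reproduced. All function names are hypothetical.

```python
import math
import random


def knn_indices(points, k):
    """For each pooled point, indices of its k nearest pooled points (itself included)."""
    out = []
    for c in points:
        order = sorted(range(len(points)), key=lambda j: math.dist(c, points[j]))
        out.append(order[:k])
    return out


def max_local_stat(labels, neigh, m, n):
    """Largest standardized deviation of a local first-sample count from its null mean."""
    k = len(neigh[0])
    mean = k * m / n
    # Hypergeometric variance of the count when labels are exchangeable.
    var = k * (m / n) * (1 - m / n) * (n - k) / (n - 1)
    worst = max(abs(sum(labels[j] for j in idx) - mean) for idx in neigh)
    return worst / math.sqrt(var)


def permutation_max_test(x, y, k=10, n_perm=499, seed=0):
    """Illustrative permutation test of p = q (assumption: not the article's calibration)."""
    rng = random.Random(seed)
    points = list(x) + list(y)
    m, n = len(x), len(points)
    labels = [1] * m + [0] * (n - m)  # 1 marks the first sample
    neigh = knn_indices(points, k)    # depends on pooled locations only
    observed = max_local_stat(labels, neigh, m, n)
    exceed = 0
    for _ in range(n_perm):
        perm = labels[:]
        rng.shuffle(perm)             # labels are exchangeable under p = q
        if max_local_stat(perm, neigh, m, n) >= observed:
            exceed += 1
    # The "+1" correction keeps the p-value valid in finite samples.
    return observed, (1 + exceed) / (1 + n_perm)
```

Because the neighborhoods are computed from the pooled sample alone, relabeling leaves them unchanged, so the permutation distribution of the maximum is available without any assumption on p and q. Neighborhoods whose standardized count exceeds the permutation quantile of the maximum could then be reported as regions of deviation, under the same caveats.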
Cite this article
Rohde, A. Optimal calibration for multiple testing against local inhomogeneity in higher dimension. Probab. Theory Relat. Fields 149, 515–559 (2011). https://doi.org/10.1007/s00440-010-0263-1
DOI: https://doi.org/10.1007/s00440-010-0263-1
Keywords
- Combinatorial process
- Exponential concentration bound
- Coupling
- Decoupling inequality
- Exact multiple test
- Nearest-neighbors
- Optimal recovery
- Sharp asymptotic adaptivity