Abstract
Tukey’s halfspace depth has attracted much interest in data analysis because it is a natural way of measuring depth relative to a cloud of points or, more generally, to a probability measure. Given an i.i.d. sample, we investigate the concentration of the upper level sets of the Tukey depth relative to that sample around their population version. We show that, under some mild assumptions on the underlying probability measure, concentration occurs at a parametric rate, and we deduce moment inequalities at that same rate. From a computational perspective, we study the concentration of a discretized version of the empirical upper level sets.
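To make the objects in the abstract concrete, here is a minimal sketch of an empirical Tukey depth evaluated over a finite set of directions, together with a membership check for an upper level set. It is illustrative only: the function names `approx_tukey_depth`, `unit_directions_2d` and `in_upper_level_set` are hypothetical, and taking the minimum over a finite grid of planar directions is an assumption made for this example, not necessarily the discretization studied in the article. Because the minimum runs over a subset of all directions, the value returned can only overestimate the exact empirical depth.

```python
import numpy as np


def unit_directions_2d(m):
    """Return m equally spaced unit vectors in the plane (hypothetical helper)."""
    angles = 2.0 * np.pi * np.arange(m) / m
    return np.column_stack([np.cos(angles), np.sin(angles)])


def approx_tukey_depth(x, sample, directions):
    """Approximate the empirical Tukey depth of x.

    For each direction u, compute the fraction of sample points y with
    <u, y - x> >= 0, then take the minimum over the given directions.
    """
    x = np.asarray(x, dtype=float)
    sample = np.asarray(sample, dtype=float)
    # Row i holds the projections of the centered sample onto direction i.
    proj = directions @ (sample - x).T
    # Fraction of the sample in each closed halfspace through x.
    frac = (proj >= 0).mean(axis=1)
    return float(frac.min())


def in_upper_level_set(x, sample, directions, alpha):
    """Check whether x has approximate depth at least alpha."""
    return approx_tukey_depth(x, sample, directions) >= alpha
```

For instance, with the four corners of a square as the sample and the four axis directions, the center of the square has approximate depth 1/2, while a point far outside the square has depth 0 and therefore belongs to no upper level set with a positive level.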
Cite this article
Brunel, VE. Concentration of the empirical level sets of Tukey’s halfspace depth. Probab. Theory Relat. Fields 173, 1165–1196 (2019). https://doi.org/10.1007/s00440-018-0850-0
DOI: https://doi.org/10.1007/s00440-018-0850-0