Data depth for the uniform distribution

Silva, Pedro C.; Cerdeira, J. Orestes; Martins, M. João; Monteiro-Henriques, T.

doi:10.1007/s10651-013-0242-7

Published: 24 March 2013

Volume 21, pages 27–39, (2014)

Environmental and Ecological Statistics

Pedro C. Silva¹,
J. Orestes Cerdeira¹,²,
M. João Martins¹ &
…
T. Monteiro-Henriques¹,³


Abstract

Given a set $X$ of $k$ points and a point $z$ in the $n$-dimensional Euclidean space, the Tukey depth of $z$ with respect to $X$ is defined as $m/k$, where $m$ is the minimum integer such that $z$ is not in the convex hull of some set of $k-m$ points of $X$. If $z$ belongs to the closed region $B$ delimited by an ellipsoid, define the continuous depth of $z$ with respect to $B$ as the quotient $V(z)/\text{Vol}(B)$, where $V(z)$ is the minimum volume of the intersection of $B$ with the halfspaces defined by any hyperplane passing through $z$, and $\text{Vol}(B)$ is the volume of $B$. We consider $z$ a random variable and prove that, if $z$ is uniformly distributed in $B$, the continuous depth of $z$ with respect to $B$ has expected value $1/2^{n+1}$. This result implies that if $z$ and $X$ are uniformly distributed in $B$, the expected value of the Tukey depth of $z$ with respect to $X$ converges to $1/2^{n+1}$ as the number of points $k$ goes to infinity. These findings have applications in ecology, namely within niche theory, where it is useful to explore and characterize the distribution of points inside a species' niche.
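
As an illustration only (it is not part of the article), the one-dimensional case can be simulated directly. For $n=1$ the region $B$ is an interval, here assumed to be $[-1,1]$, and the Tukey depth of $z$ reduces to the smaller of the two counts of sample points on either side of $z$, divided by $k$. The helper names `tukey_depth_1d` and `mean_depth_1d`, the sample sizes, and the seed are assumptions made for this sketch. The sample mean should lie close to $1/2^{2}=1/4$.

```python
import random


def tukey_depth_1d(z, xs):
    """Tukey depth of z with respect to the sample xs on the real line (n = 1).

    Removing every point on one side of z (ties included) takes z out of the
    convex hull, so m is the smaller of the two one-sided counts.
    """
    below = sum(1 for x in xs if x <= z)
    above = sum(1 for x in xs if x >= z)
    return min(below, above) / len(xs)


def mean_depth_1d(k, trials, rng):
    """Average Tukey depth of a uniform z over repeated uniform samples of size k."""
    total = 0.0
    for _ in range(trials):
        xs = [rng.uniform(-1.0, 1.0) for _ in range(k)]
        z = rng.uniform(-1.0, 1.0)
        total += tukey_depth_1d(z, xs)
    return total / trials


rng = random.Random(0)  # fixed seed so the run is reproducible
estimate = mean_depth_1d(k=200, trials=2000, rng=rng)
print(estimate)  # should be close to 0.25 for large k
```

For finite $k$ the mean sits slightly below $1/4$; the gap shrinks as $k$ grows.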

References

Abramowitz M, Stegun IA (eds) (1972) Handbook of mathematical functions with formulas, graphs, and mathematical tables. Dover Publications, New York. ISBN: 978-0-486-61272-0
Anthos (2011) Sistema de información de las plantas de España. Real Jardín Botánico, CSIC-Fundación Biodiversidad. Downloaded on November 16, 2011
Battista T, Gattone SA (2004) Multivariate bootstrap confidence regions for abundance vector using data depth. Environ Ecol Stat 11:355–365
Cerdeira JO, Martins MJ, Silva PC (2012) A combinatorial approach to assess the separability of clusters. J Classif 29:7–22
Donoho DL, Gasko M (1992) Breakdown properties of location estimates based on halfspace depth and projected outlyingness. Ann Stat 20:1803–1827
Fukuda K, Rosta V (2005) Data depth and maximum feasible subsystems. In: Avis D, Hertz A, Marcotte O (eds) Graph theory and combinatorial optimization. Springer, Berlin
Hijmans RJ, Cameron SE, Parra JL, Jones PG, Jarvis A, Richardson K (2006) Worldclim. Accessed May 2012
Hijmans RJ, van Etten J (2012) Raster: geographic analysis and modeling with raster data. http://CRAN.R-project.org/package=raster, R package version 1.9-70
Hutchinson GE (1957) Concluding remarks. Cold Spring Harb Symp Quant Biol 22:415–427 (reprinted in 1991: Classics in theoretical biology. Bull Math Biol 53:193–213)
Johnson DS, Preparata FP (1978) The densest hemisphere problem. Theor Comput Sci 6:93–107
Li J, Ban J, Santiago LS (2011) Nonparametric tests for homogeneity of species assemblages: a data depth approach. Biometrics 67:1481–1488
Maechler M, Rousseeuw P, Struyf A, Hubert M, Hornik K (2012) Cluster: cluster analysis basics and extensions. http://CRAN.R-project.org/package=cluster, R package version 1.14.2
Massé JC, Plante JF (2009) Depth: depth functions tools for multivariate analysis. http://CRAN.R-project.org/package=depth, R package version 1.0-1
Rousseeuw PJ, Ruts I (1996) Algorithm AS 307: bivariate location depth. Appl Stat (JRSS-C) 45:516–526
Rousseeuw PJ, Ruts I (1999) The depth function of a population distribution. Metrika 49:213–244
Rousseeuw PJ, Struyf A (1998) Computing location depth and regression depth in higher dimensions. Stat Comput 8:193–203
Tukey JW (1975) Mathematics and picturing of data. Proc Int Congr Math Vancouver 23:523–531
Zuo Y, Serfling R (2000a) General notions of statistical depth functions. Ann Stat 28:461–482
Zuo Y, Serfling R (2000b) Structural properties and convergence results for contours of sample statistical depth functions. Ann Stat 28:483–499

Author information

Authors and Affiliations

Instituto Superior de Agronomia, CEF (Centro de Estudos Florestais), Universidade Técnica de Lisboa, Tapada da Ajuda, 1349-017, Lisbon, Portugal
Pedro C. Silva, J. Orestes Cerdeira, M. João Martins & T. Monteiro-Henriques
Departamento de Matemática, Faculdade de Ciências e Tecnologia, Universidade Nova de Lisboa, Quinta da Torre, 2829-516, Caparica, Portugal
J. Orestes Cerdeira
CBAA (Centro de Botânica Aplicada à Agricultura), Universidade Técnica de Lisboa, Tapada da Ajuda, 1349-017, Lisbon, Portugal
T. Monteiro-Henriques

Corresponding author

Correspondence to Pedro C. Silva.

Additional information

Handling Editor: Ashis SenGupta.

The authors acknowledge an anonymous referee for his comments and suggestions.

This work was supported by the Portuguese Foundation for Science and Technology (FCT) through the projects PEst-OE/AGR/UI0239/2011, CEF (Centro de Estudos Florestais) under FEDER/POCI, and PTDC/AAC-AMB/113394/2009.

Appendix

1.1 Proof of Theorem 1

Denote by $V_n$ the volume of the unit closed $n$-ball $B_n$. It is well known that $V_n=(2\pi /n)\, V_{n-2}$, with $V_1=2$ and $V_2=\pi $.

Given $R\in \mathbb{R }^+$ and a continuous function $f:[0,R] \rightarrow \mathbb{R }$ set

$$\begin{aligned} I_n[f,R]:=\int \!\!\!\int \limits _{B_n(R)} \!\! \cdots \!\!\! \int f(\Vert x\Vert ) \, dx_1dx_2\cdots dx_n, \end{aligned}$$

(1)

where $B_n(R)$ denotes the $n$-ball centred at the origin of radius $R$. Using hyperspherical coordinates we obtain by straightforward computations,

$$\begin{aligned} I_n[f,R]= n \,V_n \int \limits _0^R f(\rho ) \,\rho ^{n-1} \,d\rho . \end{aligned}$$

(2)
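
As a quick sanity check of (2), added here as an illustration rather than as part of the original argument, taking $f\equiv 1$ recovers the volume of the ball of radius $R$:

$$\begin{aligned} I_n[1,R]= n \,V_n \int \limits _0^R \rho ^{n-1} \,d\rho = V_n\, R^n. \end{aligned}$$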

The volume of the hyperspherical cap orthogonal to $z=(0,\ldots ,0,z_n)$ is

$$\begin{aligned} V(z)= \int \!\!\!\!\!\!\!\!\!\!\!\int \limits _{B_{n-1} \left( \sqrt{1-z_n^2}\right) }\!\!\!\!\!\!\!\!\!\!\!\cdots \!\!\int \left( \sqrt{1-\Vert y\Vert ^2}-\Vert z\Vert \right) \,dy_1\, dy_2\,\cdots \,dy_{n-1}. \end{aligned}$$

Set $\Theta _n=\int \!\!\!\int \limits _{B_n(1)}\!\!\cdots \!\!\int V(z)\,dz_1\, dz_2\,\cdots \,dz_n$. If $z$ is uniformly distributed in $B_n \equiv B_n(1)$, the expected value of the continuous depth of $z$ with respect to $B_n$ is

$$\begin{aligned} \mathrm{E}\!\left[ d_z^c(B_n)\right] =\frac{1}{V_n} \int \!\!\!\!\int \limits _{B_n(1)}\!\!\!\!\cdots \!\!\int \frac{V(z)}{V_n}\,dz_1\, dz_2\,\cdots \,dz_n = \frac{\Theta _n}{V_n^2}, \end{aligned}$$

and we have to show that

$$\begin{aligned} \Theta _n=\frac{V_n^2}{2^{n+1}},\quad n\ge 1. \end{aligned}$$

(3)
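
For a numerical illustration of (3), the case $n=2$ can be checked by simulation. The sketch below assumes the standard closed form $V(z)=\arccos r - r\sqrt{1-r^2}$, with $r=\Vert z\Vert $, for the area of the circular segment cut from the unit disk by the chord orthogonal to $z$; this formula is not stated in the text. The function names, sample size, and seed are hypothetical choices. The estimate should be near $1/2^{3}=0.125$.

```python
import math
import random


def cap_area_unit_disk(r):
    """Area of the circular segment of the unit disk at distance r from the centre."""
    return math.acos(r) - r * math.sqrt(1.0 - r * r)


def expected_depth_disk(samples, seed=0):
    """Monte Carlo estimate of E[V(z) / V_2] for z uniform in the unit disk."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(samples):
        # The norm of a uniform point in the unit disk is sqrt(U), U uniform on [0, 1).
        r = math.sqrt(rng.random())
        total += cap_area_unit_disk(r) / math.pi  # V_2 = pi
    return total / samples


estimate = expected_depth_disk(200_000)
print(estimate)  # should be close to 1/8
```

Only the norm of $z$ is sampled, because the continuous depth of a point in a ball depends on $z$ through $\Vert z\Vert $ alone.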

For $n=1$, $V(z)=1-|z|$ and we obtain

$$\begin{aligned} \Theta _1=2 \int \limits _0^1 (1-z) \,dz=\frac{V_1^2}{2^2}. \end{aligned}$$

For $n\ge 2$, we have by (2),

$$\begin{aligned} \Theta _n&= I_n\left[ I_{n-1}\left[ \sqrt{1-\rho ^2} - r, \sqrt{1-r^2}\right] ,1\right] \\&= (n-1)n \, V_{n-1} V_n \int \limits _0^1\int \limits _0^{\sqrt{1-r^2}} \left( \sqrt{1-\rho ^2}- r\right) \,\rho ^{n-2} \, d\rho \,\, r^{n-1}\, dr, \end{aligned}$$

where $\rho =\Vert y\Vert $ and $r=\Vert z\Vert $.

Given nonnegative integers $k,\ell $, set $\Omega _{k,\ell }=\int \limits _0^1 (1- \rho ^2)^{\frac{k}{2}} \,\rho ^\ell \,d\rho $. Changing the order of integration and simplifying, we obtain

$$\begin{aligned} \Theta _n=\frac{n-1}{n+1}\,V_{n-1}\, V_n \, \Omega _{n+1,n-2}. \end{aligned}$$
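
The simplification can be spelled out as follows; this is a filled-in step, with $s=\sqrt{1-\rho ^2}$ introduced here only as shorthand. The region $0\le r\le 1$, $0\le \rho \le \sqrt{1-r^2}$ is the same as $0\le \rho \le 1$, $0\le r\le \sqrt{1-\rho ^2}$, and the inner integral in $r$ is

$$\begin{aligned} \int \limits _0^{s} (s-r)\, r^{n-1}\, dr = \frac{s^{n+1}}{n}-\frac{s^{n+1}}{n+1} = \frac{(1-\rho ^2)^{\frac{n+1}{2}}}{n(n+1)}. \end{aligned}$$

Multiplying by $(n-1)n\,V_{n-1}V_n\,\rho ^{n-2}$ and integrating over $\rho \in [0,1]$ leaves the factor $(n-1)/(n+1)$ times $\Omega _{n+1,n-2}$.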

In particular, we get

$$\begin{aligned} \Theta _2&= \frac{1}{3} V_1 \,V_{2}\, \Omega _{3,0}=\frac{2\pi }{3}\, \int \limits _0^1 (1- \rho ^2)^{3/2}\, d\rho =\frac{\pi ^2}{8}=\frac{V_2^2}{2^3},\\ \Theta _3&= \frac{2}{4}\, V_2 \,V_{3}\, \Omega _{4,1}= \frac{2\pi ^2}{3}\, \int \limits _0^1 (1- \rho ^2)^{2}\,\rho \,d\rho =\frac{\pi ^2}{9}=\frac{V_3^2}{2^4}. \end{aligned}$$

Let $\Gamma [z]$ denote the usual gamma function. Using the relation $\Gamma [z+1]=z\Gamma [z]$ along with the identity

$$\begin{aligned} \Omega _{k,\ell }=\frac{1}{2}\frac{\Gamma \left[ \frac{k+2}{2}\right] \,\Gamma \left[ \frac{\ell +1}{2}\right] }{\Gamma \left[ \frac{k+\ell +3}{2}\right] },\qquad k,\ell \ge 0, \end{aligned}$$

(see, for instance, Abramowitz and Stegun 1972, Section 6.2) we get, for every $n\ge 4$,

$$\begin{aligned} \Omega _{n+1,n-2}= \Omega _{n-1,n-4}\frac{(n+1)(n-3)}{4\, (n-1)n}. \end{aligned}$$

(4)
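
In expanded form (an added intermediate step), the gamma-function expression for $\Omega _{k,\ell }$ gives

$$\begin{aligned} \Omega _{n+1,n-2}=\frac{1}{2}\frac{\Gamma \left[ \frac{n+3}{2}\right] \,\Gamma \left[ \frac{n-1}{2}\right] }{\Gamma [n+1]},\qquad \Omega _{n-1,n-4}=\frac{1}{2}\frac{\Gamma \left[ \frac{n+1}{2}\right] \,\Gamma \left[ \frac{n-3}{2}\right] }{\Gamma [n-1]}, \end{aligned}$$

and applying $\Gamma [z+1]=z\Gamma [z]$ to each factor yields

$$\begin{aligned} \frac{\Omega _{n+1,n-2}}{\Omega _{n-1,n-4}}=\frac{\frac{n+1}{2}\cdot \frac{n-3}{2}}{n\,(n-1)}=\frac{(n+1)(n-3)}{4\,(n-1)n}. \end{aligned}$$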

By (4) we derive, for $n\ge 4$,

$$\begin{aligned} \frac{\Theta _n}{\Theta _{n-2}}&= \frac{(n-1)/(n+1)}{(n-3)/(n-1)}\,\, \frac{V_n}{V_{n-2}} \,\, \frac{V_{n-1}}{V_{n-3}}\,\, \frac{\Omega _{n+1,n-2}}{\Omega _{n-1,n-4}}\\&= \frac{(n-1)^2}{(n+1)(n-3)} \,\, \frac{2\pi }{n} \,\, \frac{2\pi }{n-1} \,\, \frac{(n+1)(n-3)}{4\, n(n-1)}\\&= \frac{\pi ^2}{n^2}. \end{aligned}$$

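The proof of (3) now follows by induction on $n$ in steps of two, starting from the values of $\Theta _2$ and $\Theta _3$ computed above and recalling that $V_{n-2}=n V_n/(2\pi )$: if $\Theta _{n-2}=V_{n-2}^2/2^{n-1}$, then $\Theta _n=(\pi ^2/n^2)\,V_{n-2}^2/2^{n-1}=V_n^2/2^{n+1}$.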
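
The identity (3) can also be checked numerically from the formulas in this appendix. The following is a minimal sketch, assuming the standard closed form $V_n=\pi ^{n/2}/\Gamma [n/2+1]$ for the volume of the unit ball (not stated in the text); the function names are hypothetical. The case $n=1$ is handled separately because the expression for $\Theta _n$ in terms of $\Omega _{n+1,n-2}$ applies only for $n\ge 2$.

```python
import math


def ball_volume(n):
    """Volume V_n of the unit n-ball (assumed closed form pi^(n/2) / Gamma(n/2 + 1))."""
    return math.pi ** (n / 2) / math.gamma(n / 2 + 1)


def omega(k, l):
    """Omega_{k,l} = integral over [0, 1] of (1 - rho^2)^(k/2) * rho^l, via gamma functions."""
    return 0.5 * math.gamma((k + 2) / 2) * math.gamma((l + 1) / 2) / math.gamma((k + l + 3) / 2)


def theta(n):
    """Theta_n: the integral of V(z) over the unit n-ball."""
    if n == 1:
        return 1.0  # 2 * integral of (1 - z) over [0, 1]
    return (n - 1) / (n + 1) * ball_volume(n - 1) * ball_volume(n) * omega(n + 1, n - 2)


for n in range(1, 11):
    target = ball_volume(n) ** 2 / 2 ** (n + 1)
    print(n, theta(n), target)  # the two values should agree up to rounding
```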

About this article

Cite this article

Silva, P.C., Cerdeira, J.O., Martins, M.J. et al. Data depth for the uniform distribution. Environ Ecol Stat 21, 27–39 (2014). https://doi.org/10.1007/s10651-013-0242-7

Received: 15 August 2012
Revised: 20 December 2012
Published: 24 March 2013
Issue Date: March 2014
DOI: https://doi.org/10.1007/s10651-013-0242-7
