Abstract
We introduce and discuss a multivariate version of the classical median that is based on an equipartition property with respect to quarter spaces. These arise as pairwise intersections of the half-spaces associated with the coordinate hyperplanes of an orthogonal basis. We obtain results on existence, equivariance, and asymptotic normality.
1 Introduction
The classical one-dimensional median associated with a probability distribution P on the Borel subsets of the real line \(\mathbb {R}\) is any value \(\theta \) that satisfies the inequalities

$$\begin{aligned} P\bigl ((-\infty ,\theta ]\bigr )\ge \frac{1}{2} \quad \text {and}\quad P\bigl ([\theta ,\infty )\bigr )\ge \frac{1}{2}. \end{aligned}$$(1)
Under certain conditions, for example if P has a positive density, we have equalities in (1) and the median is unique. In general, the set of solutions is the bounded closed interval \(\left[ \hbox {Med}_-(P),\hbox {Med}_+(P)\right] \) with

$$\begin{aligned} \hbox {Med}_-(P) :=\min \bigl \{x\in \mathbb {R}:\, P\bigl ((-\infty ,x]\bigr )\ge 1/2\bigr \}, \quad \hbox {Med}_+(P) :=\max \bigl \{x\in \mathbb {R}:\, P\bigl ([x,\infty )\bigr )\ge 1/2\bigr \} \end{aligned}$$(2)
as the smallest and the largest median of P. Often, ‘the’ median of P is defined as the corresponding midpoint,
For a data set \(x_1,\ldots ,x_n\) of real numbers a (sample) median is any (distributional) median associated with the empirical distribution \(P_n:=n^{-1} \sum _{k=1}^n \delta _{x_k}\), where \(\delta _x\) denotes unit mass at x. With \(x_{(1)}\le \cdots \le x_{(n)}\) as the values obtained by arranging \(x_1,\ldots ,x_n\) in increasing order,

$$\begin{aligned} x_{((n+1)/2)} \end{aligned}$$
is the unique sample median if n is odd, and we get \(\bigl [x_{(n/2)},x_{(n/2+1)}\bigr ]\) as the set of sample medians if n is even.
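For illustration, the sample median interval can be read off the order statistics directly; a minimal sketch (function names are ours):

```python
def sample_median_set(xs):
    """Return (smallest, largest) sample median of a nonempty data set,
    i.e. the endpoints of the interval of medians of the empirical
    distribution."""
    s = sorted(xs)
    n = len(s)
    if n % 2 == 1:                       # odd n: the middle order statistic
        m = s[n // 2]
        return (m, m)
    return (s[n // 2 - 1], s[n // 2])    # even n: the interval endpoints

def sample_median(xs):
    """Midpoint convention for 'the' sample median."""
    lo, hi = sample_median_set(xs)
    return (lo + hi) / 2
```

For odd n the two endpoints coincide; for even n the midpoint convention averages the two central order statistics.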
Historically, the first attempt to generalize this to dimensions \(d>1\) was to use coordinates: If \(x_1,\ldots ,x_n\) is a subset of \(\mathbb {R}^d\) for some \(d\ge 2\) then a coordinatewise or marginal median of the empirical distribution \(P_n:=n^{-1} \sum _{k=1}^n \delta _{x_k}\) is any vector whose ith coordinate is a (one-dimensional) median of the respective ith coordinates of the data vectors. This is the sample version; for a probability measure P on the Borel subsets \(\mathcal {B}^d\) of \(\mathbb {R}^d\) the push-forwards \(P^{\pi _i}\) of P under the coordinate projections \(\pi _i\), \(i=1,\ldots ,d\), would be used. Formally, the marginal medians of P are elements of the d-dimensional interval \(\bigl [\hbox {MMed}_-(P),\, \hbox {MMed}_+(P)\bigr ]\) where \(\hbox {MMed}_\pm (P)\) have \(\hbox {Med}_\pm (P^{\pi _i})\) as their ith component, \(i=1,\ldots ,d\), and

$$\begin{aligned} \hbox {MMed}(P) :=\frac{1}{2}\bigl (\hbox {MMed}_-(P)+\hbox {MMed}_+(P)\bigr ) \end{aligned}$$(3)
is the midpoint marginal median. It is well known that such component-wise extensions of the one-dimensional median are not equivariant with respect to orthogonal transformations and thus depend on the coordinate system chosen for the representation of the data. One way to repair this is by ‘averaging out’ (symmetrization), see Grübel (1996).
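In code, the midpoint marginal median is simply the one-dimensional midpoint median applied per coordinate; a minimal sketch (names ours):

```python
def marginal_median(points):
    """Coordinatewise (midpoint) marginal median of a list of d-dimensional
    points: the one-dimensional midpoint median is applied to each
    coordinate separately."""
    d = len(points[0])
    out = []
    for i in range(d):
        s = sorted(p[i] for p in points)
        n = len(s)
        if n % 2 == 1:
            out.append(s[n // 2])
        else:
            out.append((s[n // 2 - 1] + s[n // 2]) / 2)
    return out
```

The lack of orthogonal equivariance noted in the text is easy to observe with this function: rotating the data and computing the marginal median generally does not rotate the result.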
Other generalizations of the one-dimensional median to higher dimensions start with some characterizing property of the classical version and then use a multidimensional analogue. For example, it is known that in the one-dimensional case any median \(\theta \) minimizes the function \(x\mapsto \int |y-x|\, P(dy)\), which leads to the spatial or \(L^1\)-median of the data set, defined as a minimizer of \(x\mapsto \sum _{k=1}^n\Vert x-x_k\Vert \), where \(\Vert \, \cdot \,\Vert \) denotes Euclidean distance. As this distance is invariant under orthogonal transformation, the spatial median does not depend on the chosen coordinates.
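The spatial median has no closed form, but the minimization can be carried out by Weiszfeld's classical fixed-point iteration; a rough sketch (names ours; coincidence of the iterate with a data point is handled crudely by a distance floor):

```python
import math

def spatial_median(points, iters=1000, eps=1e-9):
    """Approximate the spatial (L1) median, i.e. a minimizer of the sum of
    Euclidean distances to the data points, via Weiszfeld's iteration."""
    d = len(points[0])
    # start at the coordinatewise mean
    x = [sum(p[i] for p in points) / len(points) for i in range(d)]
    for _ in range(iters):
        wsum = 0.0
        num = [0.0] * d
        for p in points:
            dist = max(math.dist(p, x), eps)   # floor avoids division by zero
            w = 1.0 / dist
            wsum += w
            for i in range(d):
                num[i] += w * p[i]
        x_new = [num[i] / wsum for i in range(d)]
        if math.dist(x, x_new) < 1e-12:
            return x_new
        x = x_new
    return x
```

For a point configuration that is symmetric about the origin, the iteration returns (approximately) the origin, in line with the equivariance of the spatial median under orthogonal transformations.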
Closer to the above equipartition property are the various concepts related to data depth, where for each \(x\in \mathbb {R}^d\) and each affine hyperplane H with \(x\in H\) the number of data points to one side of H is the basic ingredient: A centerpoint median is defined as any such x with the property that the minimum number is at least a fraction \(n/(d+1)\) of the data, and Tukey’s median is a point that maximizes these data depth values; see e.g. Donoho and Gasko (1992). This whole area has attracted the attention of many researchers, often with emphasis on robustness. The literature on multivariate medians up to about 1990 is covered in the review of Small (1990), which includes an interesting discussion of the history of the concept. For more recent reviews of this subject we refer to the treatises of Oja (2013) and Dhar and Chaudhuri (2011).
In the present paper, following an idea of the late Dietrich Morgenstern, we again start with the marginal median as in the work of Grübel (1996), but then use a variant of the equipartition property instead of symmetrization. In dimension \(d=2\), for example, a coordinate system specifies four quarter spaces via the pairwise intersections of the half-spaces underlying the marginal median, and a quarter median of P may be defined as any vector \(\theta \in \mathbb {R}^2\) such that for some coordinate system centered at \(\theta \) all four of the respective quarter spaces have probability at least 1/4. Figure 1 shows a simple example that Professor Morgenstern would have liked: The data are the locations of the cities in Germany with a population of at least 100,000 (in 2017). We have \(n=76\) such cities in total, so we can aim for a partition with 19 cities in each open quarter space. One resulting quarter median lies near the Freudenberg exit of the A45 (Sauerlandlinie), with the North–South axis tilted by about 39\(^{\circ }\) counterclockwise; see the lines in Fig. 1. The data (longitude/latitude) for drawing the German border are extracted from the file \(\mathtt {gadm36\_DEU\_0\_sp.rds}\) available at https://gadm.org/data.html.
Investigation of this concept leads to a number of interesting questions, starting with the obvious ones concerning existence and uniqueness. Next, what are the equivariance properties of the quarter median? Further, in contrast to the spatial median a quarter median comes with a system of associated hyperplanes, which may be regarded as a part of the estimator. In the two-dimensional case these are just two orthogonal lines and can be parametrized by a counterclockwise rotation of the horizontal axis, thus providing a connection to a real interval. In higher dimensions, however, specification of a topological notion of nearness of systems of coordinate hyperplanes is less straightforward, but it is needed in connection with consistency of the estimators. For asymptotic normality even a local linearization is required. Finally, there is the algorithmic issue: How do we find a quarter median for a given data set?
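In the plane, the algorithmic question admits a brute-force answer for small data sets: sweep the rotation angle, and for each candidate center check the quarter-space counts. The following sketch (all names ours) verifies a candidate pair of center \(\theta \) and angle \(\alpha \); a search may then sweep \(\theta \) over candidate marginal medians in the rotated frame and \(\alpha \) over a grid:

```python
import math

def quarter_counts(points, theta, alpha):
    """Fractions of points in the four closed quarter spaces centered at
    theta, for the coordinate system rotated counterclockwise by alpha."""
    b1 = (math.cos(alpha), math.sin(alpha))
    b2 = (-math.sin(alpha), math.cos(alpha))
    n = len(points)
    fracs = []
    for s1 in (1, -1):
        for s2 in (1, -1):
            c = sum(1 for p in points
                    if s1 * (b1[0]*(p[0]-theta[0]) + b1[1]*(p[1]-theta[1])) >= 0
                    and s2 * (b2[0]*(p[0]-theta[0]) + b2[1]*(p[1]-theta[1])) >= 0)
            fracs.append(c / n)
    return fracs

def is_quarter_median(points, theta, alpha):
    """Check that each closed half-space has mass >= 1/2 and each closed
    quarter space has mass >= 1/4 (a sketch; exact ties are not smoothed)."""
    b1 = (math.cos(alpha), math.sin(alpha))
    b2 = (-math.sin(alpha), math.cos(alpha))
    n = len(points)
    for b in (b1, b2):
        proj = [b[0]*(p[0]-theta[0]) + b[1]*(p[1]-theta[1]) for p in points]
        if sum(v >= 0 for v in proj) < n/2 or sum(v <= 0 for v in proj) < n/2:
            return False
    return all(f >= 1/4 - 1e-12 for f in quarter_counts(points, theta, alpha))
```

A grid sweep of this kind is how one may reproduce, approximately, the tilt of about 39 degrees for the city data of Fig. 1; it is not guaranteed to hit an exact solution for every data set.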
Section 2 contains our main results, on existence, equivariance, and large sample behavior. Elliptical distributions are treated in some detail and we compare the quarter median with other estimators for several specific families. We also consider algorithmic aspects together with the problem of measurable selection from the respective solution sets. Section 3 presents numerical comparisons with existing multivariate medians. Proofs are collected in Sect. 4.
2 Results
We first assemble some formal definitions and then give our results in separate subsections.
2.1 Preliminaries and basic definitions
We regard vectors \(x\in \mathbb {R}^d\) as column vectors and write \(\langle x, y\rangle \) for the inner product of \(x,y\in \mathbb {R}^d\). The transpose of the vector x is \(x^{\text {t}}\), so that \(\langle x, y\rangle =x^{\text {t}}y\). Similarly, \(A^{\text {t}}\) denotes the transpose of a matrix A. We write \(\Vert \cdot \Vert \) for the Euclidean norm on \(\mathbb {R}^d\), \(\mathbb {S}_{d-1}\) is the unit sphere in \(\mathbb {R}^d\), \(\mathbb {O}(d)\) is the group of orthogonal \(d\times d\)-matrices, and \(\mathbb {SO}(d)\) denotes the group of rotations, consisting of those elements of \(\mathbb {O}(d)\) that have determinant 1.
If \((\varOmega ,\mathcal {A},P)\) is a probability space, if \((\varOmega ',\mathcal {A}')\) is a measurable space, and if the mapping \(S:\varOmega \rightarrow \varOmega '\) is \((\mathcal {A},\mathcal {A}')\)-measurable then we write \(P^S\) for the push-forward of P under S, i.e. \(P^S(A')=P(S^{-1}(A'))\) for all \(A'\in \mathcal {A}'\). If \((\varOmega ,\mathcal {A})=(\mathbb {R}^d,\mathcal {B}^d)\) and \(b\in \mathbb {S}_{d-1}\) or \(U\in \mathbb {O}(d)\) then \(P^{b}\) and \(P^U\) respectively refer to the push-forwards of P under the mappings \(x\mapsto \langle b, x\rangle \) and \(x\mapsto Ux\).
For \(a\in \mathbb {R}^d\) and \(b\in \mathbb {S}_{d-1}\) we formally define the associated half-spaces by

$$\begin{aligned} H_-(a;b) :=\bigl \{x\in \mathbb {R}^d:\, \langle b,x-a\rangle \le 0\bigr \}, \ H_+(a;b) :=\bigl \{x\in \mathbb {R}^d:\, \langle b,x-a\rangle \ge 0\bigr \}. \end{aligned}$$
Let \(e_1,\ldots ,e_d\) be the vectors of the canonical basis of \(\mathbb {R}^d\). Then a marginal median for a probability measure P on \((\mathbb {R}^d,\mathcal {B}^d)\) is any vector \(\theta \in \mathbb {R}^d\) with the property that

$$\begin{aligned} P\bigl (H_-(\theta ;e_i)\bigr )\ge \frac{1}{2} \quad \text {and}\quad P\bigl (H_+(\theta ;e_i)\bigr )\ge \frac{1}{2} \qquad \text {for } i=1,\ldots ,d. \end{aligned}$$(4)
For two orthogonal unit vectors \(b, b'\in \mathbb {S}_{d-1}\) and \(a\in \mathbb {R}^d\) the quarter spaces are the intersections of these half-spaces,

$$\begin{aligned} V_{\pm \pm }(a;b,b') :=H_\pm (a;b)\cap H_\pm (a;b'), \end{aligned}$$
with the obvious notational convention for the four possible combinations of plus and minus signs. We then define a quarter median of a probability measure P on \((\mathbb {R}^d,\mathcal {B}^d)\) as any vector \(\theta \in \mathbb {R}^d\) with the property that, for some orthonormal basis \(B=\{b_1,\ldots ,b_d\}\) of \(\mathbb {R}^d\),

$$\begin{aligned} P\bigl (H_-(\theta ;b_i)\bigr )\ge \frac{1}{2} \quad \text {and}\quad P\bigl (H_+(\theta ;b_i)\bigr )\ge \frac{1}{2} \qquad \text {for } i=1,\ldots ,d, \end{aligned}$$(5)
and

$$\begin{aligned} P\bigl (V_{\pm \pm }(\theta ;b_i,b_j)\bigr )\ge \frac{1}{4}, \qquad 1\le i<j\le d, \end{aligned}$$(6)
for all combinations of plus and minus signs. If (5) and (6) hold then we call the pair \((\theta ,U)\in \mathbb {R}^d\times \mathbb {O}(d)\), where U has rows \(b_1^{\text {t}},\ldots ,b_d^{\text {t}}\), a solution of the quarter median problem for the probability measure P. Again, the data versions of these notions refer to the empirical distribution \(P_n\).
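For an empirical distribution, conditions (5) and (6) reduce to counting. The following sketch (function name ours) checks a candidate pair \((\theta ,U)\) in general dimension d, with U passed as a list of its orthonormal rows:

```python
def solves_quarter_median(points, theta, U, tol=1e-12):
    """Check whether (theta, U) solves the empirical quarter median problem:
    every closed half-space H(theta; b_i) carries mass >= 1/2 and every
    closed quarter space V(theta; b_i, b_j) carries mass >= 1/4."""
    n, d = len(points), len(points[0])
    # coordinates of the centered points in the basis b_1, ..., b_d
    coords = [[sum(b[k] * (p[k] - theta[k]) for k in range(d)) for b in U]
              for p in points]
    for i in range(d):
        if sum(c[i] >= -tol for c in coords) < n / 2:   # H_+
            return False
        if sum(c[i] <= tol for c in coords) < n / 2:    # H_-
            return False
    for i in range(d):
        for j in range(i + 1, d):
            for si in (1, -1):
                for sj in (1, -1):
                    if sum(si * c[i] >= -tol and sj * c[j] >= -tol
                           for c in coords) < n / 4:
                        return False
    return True
```

Because the half-spaces and quarter spaces are closed, points lying exactly on a hyperplane are counted on both sides, matching the inequalities in (5) and (6).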
If P is absolutely continuous then hyperplanes have zero probability and (5) follows from (6). In general, this is not the case as the following simple example shows: Let \(d=2\) and
Then (6) holds with \(b_1=e_1\), \(b_2=e_2\) and \(\theta = (1,1/2)^{\text {t}}\), but (5) is not true. Further, with this choice of \(b_1,b_2\) it is easily verified that the unique marginal median \(\theta =(1/2,1/2)^{\text {t}}\) is a quarter median. This example also shows that the quarter median may not be unique. Further, if hyperplanes have probability zero, then the inequalities in (5) and (6) can be replaced by equalities, and with \((\theta ,U)\) as solution of (6) we then speak of a solution of the equipartition problem for P.
2.2 Existence and uniqueness
The existence of a median, and thus of a marginal median, follows immediately from the monotonicity of distribution functions. For the quarter median, counting the number of variables and the number of constraints gives a first impression. For simplicity, we temporarily assume that hyperplanes have probability zero, so that we have an equipartition problem. If \(d=2\), there are then three unknowns, two for the location parameter and one for the angle of rotation, see also Remark 1 (b) below, and one of the four constraints is redundant, so that the number of unknowns is the same as the number of equations. For \(d>2\) we first note that the conditions in (5) and (6) are not independent. Indeed, for a given basis \(\{b_1,\ldots ,b_d\}\) of \(\mathbb {R}^d\) it is enough to have probability 1/2 for \(H_+(\theta ,b_i)\), \(i=1,\ldots ,d\), and probability 1/4 for \(V_{++}(\theta ;b_i,b_j)\) for \(1\le i < j\le d\), so that \(d + d(d-1)/2\) conditions remain. On the other hand, we have \(d(d-1)/2\) parameters for the (special) orthogonal group, and there are d components of the location vector, hence the number of unknowns is again the same as the number of independent constraints. This, of course, is just a heuristic argument, as the respective conditions are nonlinear.
Using a general tool from algebraic topology, the homotopy invariance of the Brouwer degree of continuous mappings, Makeev (2007) solved the equipartition problem under the additional condition that P is absolutely continuous. A smoothing argument leads to the extension needed here.
Theorem 1
A solution \((\theta ,U)\) of the quarter median problem exists for every probability measure P on \((\mathbb {R}^d,\mathcal {B}^d)\).
This result raises an obvious question: Is there a stronger equipartition in dimension \(d>2\), such as an ‘octomedian’ if \(d=3\), with each of the octants receiving the same probability? This is certainly possible for specific distributions, such as the multivariate standard normal. However, if \(d=3\) then we have three unknowns for the location vector, and three for the rotation, which can be parametrized by the Euler angles, and the condition that the octants all have probability 1/8 would similarly lead to seven constraints, which is one too many. Obviously, the difference would be even greater for dimension \(d>3\) if all \(2^d\) intersections of coordinate half-spaces were to have the same probability. According to Makeev (2007, p. 554) this dimension counting argument already implies that a solution does not exist for a generic distribution.
The notion of uniqueness requires some attention. We say that a distribution P has a unique quarter median if \(\theta _1=\theta _2\) for any two solutions \((\theta _1,U_1)\) and \((\theta _2,U_2)\) of the quarter median problem. Obviously, the full pair \((\theta ,U)\) will never be unique: Any linear hyperplane, i.e. a subspace H of \(\mathbb {R}^d\) of dimension \(d-1\), may be specified as the orthogonal complement of the one-dimensional space (line) generated by a unit vector \(b\in \mathbb {S}_{d-1}\) via \(H(b):=\{x\in \mathbb {R}^d:\, b^{\text {t}}x=0\}\), but obviously \(-b\) would lead to the same hyperplane. This simple observation already implies that we may restrict U to be an element of \(\mathbb {SO}(d)\) when searching for a quarter median. Further, the conditions (5) and (6) are invariant under permutations of the rows of U. Conversely, given a set of coordinate hyperplanes, a corresponding basis is a set with elements that are unique up to a factor \(-\,1\). Putting this together we write \(U\sim V\) for \(U,V\in \mathbb {O}(d)\) if U and V lead to the same set of linear hyperplanes; we then say that the solution of the quarter median problem is unique if for any two solutions \((\theta _1,U_1)\) and \((\theta _2,U_2)\) we have \(\theta _1=\theta _2\) and \(U_1\sim U_2\).
The following formal approach will turn out to be useful. Let \(G_1\) be the subgroup of \(\mathbb {O}(d)\) that consists of the diagonal matrices with diagonal entries from \(\{-\,1,+\,1\}\), i.e. the subgroup that represents the compositions of reflections at the coordinate axes, and let \(G_2\) be the subgroup of permutation matrices, which corresponds to coordinate permutations. We write G for the subgroup generated by \(G_1\) and \(G_2\). This is the system of all \(d\times d\)-matrices A with entries \(a_{ij}\in \{-1,0,1\}\) and such that for some permutation \(\pi \) of \(\{1,\ldots ,d\}\) we have \(a_{ij}\not =0\) if and only if \(j=\pi (i)\). The above transition from equality to equivalence can then be regarded as a transition from the group \(\mathbb {O}(d)\) to the factor group

$$\begin{aligned} \mathbb {H}(d) :=\mathbb {O}(d)/G \end{aligned}$$(7)
with the equivalence classes \([U] :=\{V\in \mathbb {O}(d):\, V\sim U\}\), \(U\in \mathbb {O}(d)\), as elements. If we think of a solution of the quarter median problem as a pair \((\theta ,H)\in \mathbb {R}^d\times \mathbb {H}(d)\), then uniqueness means that two solutions are equal.
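For illustration, the relation \(U\sim V\) can be tested numerically: U and V generate the same set of coordinate hyperplanes exactly when \(VU^{\text {t}}\) is a signed permutation matrix, i.e. an element of G. A minimal sketch (function names ours):

```python
def is_signed_permutation(A, tol=1e-10):
    """True if the square matrix A (given as a list of rows) has exactly one
    entry of absolute value 1 in every row and column, and zeros elsewhere."""
    d = len(A)
    for row in A:
        if sum(abs(abs(x) - 1.0) < tol for x in row) != 1:
            return False
        if any(abs(x) > tol and abs(abs(x) - 1.0) > tol for x in row):
            return False
    for j in range(d):
        if sum(abs(abs(A[i][j]) - 1.0) < tol for i in range(d)) != 1:
            return False
    return True

def equivalent(U, V):
    """U ~ V iff V U^t lies in G, i.e. the rows of V are, up to sign and
    order, the rows of U, so that both give the same coordinate hyperplanes."""
    d = len(U)
    VUt = [[sum(V[i][k] * U[j][k] for k in range(d)) for j in range(d)]
           for i in range(d)]
    return is_signed_permutation(VUt)
```

For example, the identity and the rotation by 90 degrees are equivalent (both generate the canonical coordinate axes), while a rotation by 45 degrees is not.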
Remark 1
(a)
The \(\theta \)-part of a solution \((\theta ,H)\) for the quarter median problem may be unique while the H-part is not. For example, if P has a density that depends on \(x\in \mathbb {R}^d\) only through \(\Vert x\Vert \) then 0 is the unique quarter median, but all \(H\in \mathbb {H}(d)\) lead to a solution (0, H).
(b)
In dimension \(d=2\) there is an (almost) canonical choice of an element \(U\in \mathbb {O}(d)\) from an equivalence class \(H\in \mathbb {H}(d)\) corresponding to a set of hyperplanes, here just two orthogonal lines \(L_1,L_2\) meeting at the point \(0\in \mathbb {R}^2\). With a suitable half-open interval I of length \(\pi /2\), such as \(I=[-\pi /4,\pi /4)\), there is exactly one \(\alpha \in I\) such that the lines are represented by
$$\begin{aligned} U_\alpha :=\begin{pmatrix} \cos (\alpha )&{}\quad -\sin (\alpha )\\ \sin (\alpha ) &{}\quad \cos (\alpha )\end{pmatrix}, \end{aligned}$$(8)which is the counterclockwise rotation of the canonical coordinate lines by the angle \(\alpha \), in the sense that \(L_i=\{x\in \mathbb {R}^2:\, b_i^{\text {t}}x=0\}\), where \(b_1^{\text {t}},b_2^{\text {t}}\) are the rows of \(U_\alpha \). The corresponding topology would then be that of the quotient space \(\mathbb {R}/I\).
(c)
While we work with the space \(\mathbb {R}^d\) of column vectors throughout, the notions of quarter median and quarter median problem make sense for an arbitrary finite-dimensional Euclidean space \((E,\langle \cdot ,\cdot \rangle )\). If \((\theta ,U)\) solves the quarter median problem for a distribution P on \((\mathbb {R}^d,\mathcal {B}^d)\), then \(\{b_2,\ldots ,b_d\}\) is an orthonormal basis for the linear space \(E=H(b_1)\), which has dimension \(d-1\). Here \(b_i^{\text {t}}\) denotes the ith row of U, \(i=1,\ldots ,d\); equivalently, \(b_i\) is the ith column of \(U^{\text {t}}\). If \(d\ge 3,\) this can be used to obtain a projectivity or consistency property that relates \((\theta ,U)\) to a solution \((\theta _E,U_E)\) of the quarter median problem for the push-forward of P under the orthogonal projection \(\pi _E:\mathbb {R}^d\rightarrow E\). In fact, let \(\mathbb {S}_{d-1}^E=\{y\in E:\Vert y\Vert =1\}\) be the unit sphere in E. For \(c\in E\) and orthogonal unit vectors \(d,d'\in \mathbb {S}_{d-1}^E\) the associated half-spaces and quarter spaces are given by
$$\begin{aligned} H_-^E(c;d) := \bigl \{y\in E:\, \langle d,y-c\rangle \le 0\bigr \}, \ H_+^E(c;d) := \bigl \{y\in E:\, \langle d,y-c\rangle \ge 0\bigr \}, \end{aligned}$$and
$$\begin{aligned} V_{\pm \pm }^E(c;d,d') := H_\pm ^E(c;d)\cap H_\pm ^E(c;d'). \end{aligned}$$Using that a vector \(x\in \mathbb {R}^d\) can uniquely be written as \(x=\sum _{i=1}^d\langle x, b_i \rangle \,b_i\) we see that the orthogonal projection of x in E is \(x_E:=\pi _E(x)=\sum _{i=2}^d\langle x, b_i \rangle \,b_i\). It follows from \(\pi _E^{-1}\left( H_{\pm }^E(\theta _E;b_i)\right) = H_{\pm }(\theta ;b_i)\) for \(2\le i\le d\), and \(\pi _E^{-1}\left( V_{\pm \pm }^E(\theta _E;b_i,b_j)\right) = V_{\pm \pm }(\theta ;b_i,b_j)\) for \(2\le i<j\le d\), that \(P^{\pi _E}\left( H_{\pm }^E(\theta _E;b_i)\right) =P\left( H_{\pm }(\theta ;b_i)\right) \) for \(2\le i\le d\), and that \(P^{\pi _E}\left( V_{\pm \pm }^E(\theta _E;b_i,b_j)\right) =P\left( V_{\pm \pm }(\theta ;b_i,b_j)\right) \) for \(2\le i<j\le d\).
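The canonical angle representation in part (b) of the remark can be made algorithmic: the rotation angle of a \(2\times 2\) rotation matrix is reduced modulo \(\pi /2\) into the half-open interval I. A minimal sketch (names ours):

```python
import math

def canonical_angle(U, I_lo=-math.pi / 4):
    """Given a 2x2 rotation matrix U (list of rows), return the unique angle
    alpha in [I_lo, I_lo + pi/2) such that the rotation U_alpha generates the
    same pair of orthogonal lines as U."""
    alpha = math.atan2(U[1][0], U[0][0])    # rotation angle of U in (-pi, pi]
    half_pi = math.pi / 2
    # reduce modulo pi/2 into the half-open interval [I_lo, I_lo + pi/2)
    return (alpha - I_lo) % half_pi + I_lo
```

For instance, a rotation by \(\pi /3\) and a rotation by \(-\pi /6\) generate the same pair of lines, and both are mapped to the representative \(-\pi /6\in [-\pi /4,\pi /4)\).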
2.3 Equivariance
In the context of equivariance properties of location estimators for d-dimensional data we start with a family (often a group) \(\mathcal {S}\) of measurable transformations \(S:\mathbb {R}^d\rightarrow \mathbb {R}^d\) and we regard the location parameter \(\theta \in \mathbb {R}^d\) as a function \(\theta =T(P)\) of P. Recall that \(P^S\) is the push-forward of P under the transformation S. Then T is said to be \(\mathcal {S}\)-equivariant if

$$\begin{aligned} T\bigl (P^S\bigr ) = S\bigl (T(P)\bigr ) \qquad \text {for all } S\in \mathcal {S}. \end{aligned}$$
In cases where the parameter is not unique this might better be expressed as \(S(\theta )\) being a parameter of this type for \(P^S\) if \(\theta \) is for P.
For example, with \(\mathcal {S}\) being one of the groups \(G_1\) and \(G_2\) that were used in connection with (7), it is easy to check that

$$\begin{aligned} \hbox {MMed}\bigl (P^A\bigr ) = A\, \hbox {MMed}(P) \qquad \text {for all } A\in \mathcal {S}, \end{aligned}$$
i.e. the specific marginal median introduced in (3) is equivariant with respect to reflections and coordinate permutations. It is easily seen that we may even take \(\mathcal {S}=G\). This specific marginal median is also equivariant with respect to separate rescalings, where the corresponding group is the set of all regular diagonal matrices.
The general idea is of course that statistical inference should respect the structure of the data space. Equivariance of a location estimator with respect to shifts and orthogonal linear transformations means that the estimator interacts in this way with the full isometry group of Euclidean space.
At various stages the following function will be important,

$$\begin{aligned} \varPsi (U,P) :=U^{\text {t}}\, \hbox {MMed}\bigl (P^U\bigr ), \end{aligned}$$
with \(U\in \mathbb {O}(d)\) and P a probability measure on \((\mathbb {R}^d,\mathcal {B}^d)\); \(\varPsi (U,P)\) is the specific marginal median in the coordinate system associated with U, transformed back to canonical coordinates. By definition, a quarter median is always a marginal median in a specific coordinate system. Below, marginal medians in

$$\begin{aligned} \mathcal {M}\bigl (P^U\bigr ) :=\bigl [\hbox {MMed}_-\bigl (P^U\bigr ),\, \hbox {MMed}_+\bigl (P^U\bigr )\bigr ] \end{aligned}$$(11)
will be of particular interest.
Proposition 1
Let P be a probability distribution on \((\mathbb {R}^d,\mathcal {B}^d)\), and let \(\mathcal {S}\) be the set of Euclidean motions.
(a)
If \(\theta \) is a quarter median for the distribution P then, for all \(S\in \mathcal {S}\), \(S(\theta )\) is a quarter median for \(P^S\).
(b)
If \(U,V\in \mathbb {O}(d)\) are such that \(U\sim V\) then \(\varPsi (U,P)=\varPsi (V,P)\).
(c)
If \((\theta ,U)\) solves the quarter median problem for P, then there is a marginal median \(\eta \in \mathcal {M}(P^U)\) such that \((\tilde{\theta },U)\) with \(\tilde{\theta }=U^{\text {t}}\eta \) is also a solution.
(d)
If \((\theta ,U)\) solves the quarter median problem for P and \(S\in \mathcal {S}\), \(S(x)=Ax+b\), then \((A\theta + b, U\!A^{\text {t}})\) solves the quarter median problem for \(P^S\). Moreover, if \(P^b\) has a unique median for all \(b \in \mathbb {S}_{d-1}\),
$$\begin{aligned} \varPsi \bigl (U\!A^{\text {t}},P^S\bigr )\, = \, S\bigl (\varPsi (U,P)\bigr ). \end{aligned}$$(12)

(e) Suppose that \(P^b\) has a unique median for all \(b \in \mathbb {S}_{d-1}\). Then, the function \(\varPsi (\,\cdot \,,P):\mathbb {O}(d)\rightarrow \mathbb {R}^d\) is continuous.
Part (a) is the equivariance on the set level. Part (b) shows that \(\varPsi (\cdot ,P)\) may be regarded as a function on \(\mathbb {H}(d)\). Part (c) is of importance when developing an algorithm for searching for a solution of empirical quarter median problems: For each \(U\in \mathbb {O}(d)\) the set of potential \(\theta \)-values in (11) consists, separately for each coordinate, of three values only. It is tempting to think that this could be reduced further to the respective midpoint, but the uniform distribution on the six points \((-2,-2)^{\text {t}},(-1,3)^{\text {t}},(1,-1)^{\text {t}},(2,2)^{\text {t}},(3,4)^{\text {t}},(4,0)^{\text {t}}\) in the plane provides a counterexample: The pair \(\bigl ((2,2)^{\text {t}},U\bigr )\) with \(U=\mathrm{diag}(1,1)\) solves the quarter median problem for this distribution, but the midpoint of the associated rectangle \([1,2]\times [0,2]\) of marginal medians would lead to \(\bigl ((1.5,1)^{\text {t}},U\bigr )\), which is not a solution. Part (d) records the Euclidean motion equivariance of the quarter median for distributions P with the property that \(P^b\) has a unique median for all \(b\in \mathbb {S}_{d-1}\), a condition that will be used repeatedly below.
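The six-point counterexample can be verified by direct counting; a self-contained sketch (names ours):

```python
def qm_ok(points, theta):
    """Check the half-space and quarter-space conditions for the canonical
    basis (U = I_2): closed halves need mass >= 1/2, closed quarters >= 1/4."""
    n = len(points)
    xs = [p[0] - theta[0] for p in points]
    ys = [p[1] - theta[1] for p in points]
    halves = [sum(v >= 0 for v in xs), sum(v <= 0 for v in xs),
              sum(v >= 0 for v in ys), sum(v <= 0 for v in ys)]
    quarters = [sum(sx * x >= 0 and sy * y >= 0 for x, y in zip(xs, ys))
                for sx in (1, -1) for sy in (1, -1)]
    return min(halves) >= n / 2 and min(quarters) >= n / 4

pts = [(-2, -2), (-1, 3), (1, -1), (2, 2), (3, 4), (4, 0)]
```

Here `qm_ok(pts, (2, 2))` is True, while `qm_ok(pts, (1.5, 1))` fails: the closed quarter \(\{x\ge 1.5,\, y\le 1\}\) contains only the point \((4,0)^{\text {t}}\), i.e. mass 1/6 < 1/4.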
Similar to the spatial median and the orthomedian introduced by Grübel (1996), the quarter median is not affine equivariant. In particular, it is not equivariant with respect to separate rescalings of the coordinates, in contrast to the special marginal median, but for distributions P with the property that \(P^b\) has a unique median for all \(b\in \mathbb {S}_{d-1}\), equivariance does hold (and is used in the proofs below) for simultaneous rescalings, where the group \(\mathcal {S}\) consists of all positive scalar multiples of the identity matrix.
Any quarter median is a marginal median in a suitably chosen coordinate system, which we may take to be generated by an orthogonal matrix with determinant 1. Grübel (1996) introduced the orthomedian of P as the integral of the function \(\varPsi (\cdot ,P)\) over \(\mathbb {SO}(d)\) with respect to the Haar measure with total mass 1. Thus, the orthomedian may be seen as the expected value of this function with U chosen uniformly at random, and its orthogonal equivariance is then an immediate consequence of the fact that the Haar measure is invariant under the group operations. In contrast, the quarter median picks a marginal median from the range of the set-valued function

$$\begin{aligned} U \mapsto \bigl \{U^{\text {t}}\eta :\, \eta \in \mathcal {M}\bigl (P^U\bigr )\bigr \} \end{aligned}$$
via the partition constraint (6). Figure 2 shows the range of the function \(U\mapsto \varPsi (U,P_n)\) for the empirical distribution \(P_n\) associated with the city data in Fig. 1, where \(n=76\). The parts colored red correspond to possible quarter medians.
2.4 Asymptotics
Throughout this section we assume that P is a probability measure on \((\mathbb {R}^d,\mathcal {B}^d)\) with the property that

$$\begin{aligned} P^b \text { has a unique median for every } b\in \mathbb {S}_{d-1}, \end{aligned}$$(13)
and that \(X_1,X_2,\ldots \) are independent copies of a random vector X with distribution P. We further assume that, for each \(n\in \mathbb {N}\), the pair \((\theta _n,U_n)\in \mathbb {R}^d\times \mathbb {O}(d)\) with \(\theta _n=\theta _n(X_1,\ldots ,X_n)\), \(U_n=U_n(X_1,\ldots ,X_n)\), is a solution of the quarter median problem for the (random) empirical distribution \(P_n\) associated with \(X_1,\ldots ,X_n\). We also assume that \(\theta _n\) and \(U_n\) are measurable with respect to the respective Borel \(\sigma \)-fields on \((\mathbb {R}^d)^n\). A discussion of the associated selection problem and an extension of the results to more general random elements is given in Sect. 2.6.
We deal with consistency first and then consider asymptotic normality. Recall that uniqueness and convergence refer not to the U-matrices themselves, but to the associated equivalence classes \([U]\in \mathbb {H}(d)\), hence we have to specify what convergence of the U-part means. On \(\mathbb {O}(d)\) we use the topology induced by \(\mathbb {R}^{d\times d}\) where a sequence of matrices converges if all entries converge individually. It is well known that this makes \(\mathbb {O}(d)\) a compact and separable Hausdorff space. We use the quotient topology on \(\mathbb {H}(d)\), which is the finest topology with the property that the mapping \(\mathbb {O}(d)\rightarrow \mathbb {H}(d)\), \(U\mapsto [U]\), is continuous. As we have \(\mathbb {H}(d)=\mathbb {O}(d)/G\) with the finite group \(G\subset \mathbb {O}(d)\), the space \(\mathbb {H}(d)\) with this topology is again a compact and separable Hausdorff space.
Theorem 2
Suppose that (13) holds.
(a)
If the quarter median for P is unique and given by \(\theta \in \mathbb {R}^d\), then \(\theta _n\) converges almost surely to \(\theta \) as \(n\rightarrow \infty \).
(b)
If the solution of the quarter median problem for P is unique and given by \((\theta ,H)\in \mathbb {R}^d\times \mathbb {H}(d)\), then \((\theta _n,[U_n])\) converges almost surely to \((\theta ,H)\) as \(n\rightarrow \infty \).
In connection with distributional asymptotics we require (13) and also that P is absolutely continuous. Let \(M_a=M(a)\) be the (unique) median of \(P^a\), \(a\in \mathbb {S}_{d-1}\). As in the paper of Grübel (1996) we further assume that \(P^a\) has a density \(f_a\) such that

$$\begin{aligned} f_a(M_a) > 0 \quad \text {for all } a\in \mathbb {S}_{d-1} \end{aligned}$$(14)

and that the function

$$\begin{aligned} \mathbb {S}_{d-1}\times \mathbb {R}\ni (a,t) \mapsto f_a(t) \quad \text {is continuous,} \end{aligned}$$(15)
and, in addition, that

$$\begin{aligned} M_a = \langle a,\theta \rangle \quad \text {for all } a\in \mathbb {S}_{d-1} \text { and some } \theta \in \mathbb {R}^d. \end{aligned}$$(16)
For example, (16) is satisfied if P is symmetric about \(\theta \) in the sense that \((X_1-\theta )\) and \(-(X_1-\theta )\) have the same distribution. Of course, if (13) and (16) hold then the quarter median for P is unique and given by \(\theta \).
For \(U\in \mathbb {O}(d)\) with rows \(b_1^{\text {t}},\ldots ,b_d^{\text {t}}\,\) let

$$\begin{aligned} \varDelta (P,U) :=\mathrm{diag}\Bigl (\bigl (2 f_{b_1}(M_{b_1})\bigr )^{-2},\ldots ,\bigl (2 f_{b_d}(M_{b_d})\bigr )^{-2}\Bigr ). \end{aligned}$$(17)
Of course, \(f_a\) and \(M_a\) also depend on P. Finally, we write \(X_n\rightarrow _{\text {\tiny d}}Q\) for convergence in distribution of a sequence \((X_n)_{n\in \mathbb {N}}\) of random variables to a random variable with distribution Q, and \(N_d(\mu ,\varSigma )\) for the d-dimensional normal distribution with mean vector \(\mu \) and covariance matrix \(\varSigma \).
Theorem 3
Suppose that the conditions (13)–(16) hold and that the solution of the quarter median problem for the absolutely continuous probability measure P is unique and given by \((\theta ,H)\). Then, in the above setup, and with \(U\in H\),

$$\begin{aligned} \sqrt{n}\,(\theta _n-\theta ) \; \rightarrow _{\text {\tiny d}}\; N_d\bigl (0,\varSigma (P,U)\bigr ) \quad \text {as } n\rightarrow \infty , \end{aligned}$$
where \(\,\varSigma (P,U):= U^{\text {t}}\varDelta (P,U)U\) and \(\varDelta (P,U)\) is given by (17).
As part of the proof we will show that \(\varSigma (P,U)\) does not depend on the choice of the element U of H.
2.5 Elliptical distributions
We now consider a special family of distributions in more detail. Let \(h:\mathbb {R}_+\rightarrow \mathbb {R}_+\) be a function with the property

$$\begin{aligned} 0< c_d(h) :=\int _{\mathbb {R}^d} h\bigl (\Vert x\Vert ^2\bigr )\, dx < \infty . \end{aligned}$$(18)
Then, for each \(\mu \in \mathbb {R}^d\) and each positive definite matrix \(\varSigma \in \mathbb {R}^{d\times d}\),

$$\begin{aligned} f(x) :=\frac{1}{c_d(h)\sqrt{\det \varSigma }}\; h\bigl ((x-\mu )^{\text {t}}\varSigma ^{-1}(x-\mu )\bigr ), \quad x\in \mathbb {R}^d, \end{aligned}$$
is the density of a probability measure P on \((\mathbb {R}^d,\mathcal {B}^d)\), the elliptical distribution with location vector \(\mu \) and dispersion matrix \(\varSigma \). We abbreviate this to \(P=\mathrm{Ell}_d(\mu ,\varSigma ;h)\). If \(\mu =0\) and \(\varSigma =I_d\) we write \(\mathrm{Sym}_d(h)\) for \(\mathrm{Ell}_d(0,I_d;h)\) and speak of a spherically symmetric distribution with the density \(h\big (\Vert x \Vert ^2\big )/c_d(h),\,x\in \mathbb {R}^d\). A transformation to spherical coordinates leads to

$$\begin{aligned} c_d(h) = \frac{\pi ^{d/2}}{\varGamma (d/2)} \int _0^\infty t^{d/2-1}\, h(t)\, dt. \end{aligned}$$(19)
The function h is said to be the density generator of the elliptical distributions \(\mathrm{Ell}_d(\mu ,\varSigma ;h).\) For \(h(t)=\exp (-t/2)\) the associated constant is \(c_d(h)= (2\pi )^{d/2}\), and we obtain the multivariate normal distributions \(N_d(\mu ,\varSigma )\).
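As a numerical illustration, the normal constant can be recovered from the spherical-coordinates identity \(c_d(h) = \pi ^{d/2}\varGamma (d/2)^{-1}\int _0^\infty t^{d/2-1}h(t)\,dt\) (the standard reduction, stated here as our assumption rather than quoted from the paper) by quadrature:

```python
import math

def c_d(h, d, upper=60.0, steps=100000):
    """Trapezoidal-rule evaluation of
        c_d(h) = pi^(d/2) / Gamma(d/2) * int_0^inf t^(d/2-1) h(t) dt,
    the spherical-coordinates form of the normalizing constant (this
    identity is our assumption here)."""
    dt = upper / steps
    total = 0.0
    for k in range(steps + 1):
        t = k * dt
        weight = 0.5 if k in (0, steps) else 1.0
        # t^(d/2-1) is singular at t = 0 for d < 2; skip that endpoint then
        integrand = 0.0 if (t == 0.0 and d < 2) else t ** (d / 2 - 1) * h(t)
        total += weight * integrand
    return math.pi ** (d / 2) / math.gamma(d / 2) * total * dt
```

For the Gaussian generator `h = lambda t: math.exp(-t / 2)` this reproduces \((2\pi )^{d/2}\) up to the integration error.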
For later purposes we list some properties of elliptical distributions. Clearly, if \(P=\mathrm{Ell}_d(\mu ,\varSigma ;h)\) and if \(T:\mathbb {R}^d\rightarrow \mathbb {R}^d\), \(T(x) = Ax + b\) with \(A\in \mathbb {R}^{d\times d}\) regular, \(b\in \mathbb {R}^d\), then \(P^T=\mathrm{Ell}_d(\mu +b, A \varSigma A^{\text {t}}; h)\). Further, for \(d\ge 2\) the univariate marginal densities of \(P=\mathrm{Sym}_d(h)\) coincide and are given by

$$\begin{aligned} f_1(t) = \frac{g\bigl (t^2\bigr )}{c_d(h)}, \quad t\in \mathbb {R}, \end{aligned}$$
where

$$\begin{aligned} g(u) :=\int _{\mathbb {R}^{d-1}} h\bigl (u+\Vert y\Vert ^2\bigr )\, dy, \quad u\ge 0. \end{aligned}$$
The function g is nonincreasing and continuous on \((0,\infty )\). Hence the univariate marginal distributions are unimodal with mode 0 and their density is continuous on \(\mathbb {R}{\setminus } \{0\}\); see e.g. Fang et al. (1990, p. 37).
In what follows we deal with generators h that satisfy the condition
Then g and \(g_1\) are bounded and continuous. Additionally, with

$$\begin{aligned} c_k(h) :=\int _{\mathbb {R}^k} h\bigl (\Vert y\Vert ^2\bigr )\, dy \end{aligned}$$
for \(k=1,\ldots ,d-1\), we have \(0<c_k(h)<\infty \); especially, \(g(0)=c_{d-1}(h).\) Finally, the transformation law mentioned above can be generalized to \(k\times d\)-matrices with rank \(k<d\) at the cost of changing the function h. In particular, the first marginal distribution associated with \(\mathrm{Ell}_d(0,\varSigma ;h)\) is \(\mathrm{Ell}_1(0, \sigma _{11}; g)\) if \(\varSigma =(\sigma _{ij})_{i,j=1}^d\), and has a density that is bounded, continuous and strictly positive at 0. More generally, if \(P=\mathrm{Ell}_d(\mu ,\varSigma ;h)\) then the density \(f_a\) of \(P^a\) for \(a\in \mathbb {S}_{d-1}\) is given by

$$\begin{aligned} f_a(t) = \frac{1}{c_d(h)\,\sigma _a}\; g\Bigl (\frac{(t-a^{\text {t}}\mu )^2}{\sigma _a^2}\Bigr ), \quad t\in \mathbb {R}, \quad \text {with } \sigma _a^2 :=a^{\text {t}}\varSigma a. \end{aligned}$$
\(M_a=a^{\text {t}}\mu \) is the unique median of \(P^a\), \(f_a(M_a)\) is positive for all \(a\in \mathbb {S}_{d-1},\) and the function \((a,t)\rightarrow f_a(t)\) is continuous. Hence the assumptions (13)–(16) of Theorem 3 are satisfied.
As \(\varSigma \) is positive definite and symmetric the principal axes theorem applies and gives the representation

$$\begin{aligned} \varSigma = U^{\text {t}}\, \mathrm{diag}(\lambda _1,\ldots ,\lambda _d)\, U \end{aligned}$$(23)
with some \(U\in \mathbb {O}(d)\) and \(\lambda _1\ge \lambda _2\ge \cdots \ge \lambda _d>0\), where the \(\lambda _i\)’s are the eigenvalues of \(\varSigma \). The distribution \(P=\mathrm{Ell}_d(\mu ,\varSigma ;h)\) is said to be strictly elliptical if no two of these eigenvalues coincide. The next theorem deals with the uniqueness of (the solution of) the quarter median (problem) for elliptical distributions. Note that condition (21) is not imposed there.
Theorem 4
If \(P=\mathrm{Ell}_d(\mu ,\varSigma ;h)\), then the quarter median of P is unique and given by \(\mu \). Moreover, if P is strictly elliptical, then the solution of the quarter median problem is unique and given by \((\mu ,[U])\) with U as in (23).
For samples from strictly elliptical distributions with density generators satisfying (21) we obtain asymptotic normality for a sequence of associated quarter medians \(\theta _n=\theta _n(X_1,\ldots ,X_n)\). Part (a) of the following result is essentially a corollary to Theorem 3. In the symmetric case we only have weak uniqueness, hence we need to consider this situation separately. Intermediate cases, where only some of the eigenvalues coincide, can be treated similarly, but require a certain amount of bookkeeping.
Theorem 5
Let \(X_n\), \(n\in \mathbb {N}\), be independent random vectors with distribution \(P=\mathrm{Ell}_d(\mu ,\varSigma ; h)\), with h satisfying (21). Let \(c_{d-1}(h),c_d(h)\) be as in (19) and let
(a) Suppose that \((\theta _n,U_n)_{n\in \mathbb {N}}\) is a sequence of random variables with values in \(\mathbb {R}^d\times \mathbb {O}(d)\) such that, for all \(n\in \mathbb {N}\), \((\theta _n,U_n)\) solves the quarter median problem for \(P_n\). Then, if P is strictly elliptical,
$$\begin{aligned} \sqrt{n}(\theta _n-\mu ) \; \rightarrow _{\text {\tiny d}}\; N_d\big (0,\sigma _{{{\textsc {QMed}}}}^2\,\varSigma \big )\quad \text {as } n\rightarrow \infty . \end{aligned}$$(24)
(b) Suppose that \(\varSigma =\lambda I_d\) for some \(\lambda >0\). Let \((U_n)_{n\in \mathbb {N}}\) be an arbitrary sequence of elements of \(\mathbb {O}(d)\) and suppose that, for all \(n\in \mathbb {N}\), \(\theta _n\) is a marginal median of \(P_n^{U_n}\). Then
$$\begin{aligned} \sqrt{n}(\theta _n-\mu ) \; \rightarrow _{\text {\tiny d}}\; N_d\big (0,\sigma _{{{\textsc {QMed}}}}^2\,\varSigma \big )\quad \text {as } n\rightarrow \infty . \end{aligned}$$(25)
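Part (b) lends itself to a quick Monte Carlo illustration. The following sketch (ours, not part of the paper) treats the spherical standard normal case \(\varSigma =I_2\), where one may take \(U_n=I\), so that the coordinatewise sample median serves as quarter median; by Example 1 (a) below, the limiting covariance matrix is then \((\pi /2)I_2\).

```python
import numpy as np

rng = np.random.default_rng(0)
n, reps = 400, 4000

# Spherical case Sigma = I_2: any U in O(2) works, so take U_n = I and use
# the coordinatewise (marginal) median of each sample as the quarter median.
meds = np.array([np.median(rng.standard_normal((n, 2)), axis=0)
                 for _ in range(reps)])

# Empirical covariance of sqrt(n) * theta_n; it should be close to the
# theoretical limit (pi/2) * I_2 for the standard normal.
emp_cov = np.cov(np.sqrt(n) * meds, rowvar=False)
print(np.round(emp_cov, 2))
```

The diagonal entries should hover near \(\pi /2\approx 1.57\), the off-diagonal entries near 0, with deviations of Monte Carlo order.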
Remarkably, the covariance matrix of the limiting normal distribution is always a fixed multiple of the dispersion matrix \(\varSigma \). This is in contrast to other familiar estimators of \(\mu \), such as the empirical spatial median \({\textsc {SMed}}(X_1,\dots ,X_n)\). In fact, with reference to Bai et al. (1990), where the asymptotics of the spatial median are considered in an even more general context, Somorčík (2006) states that in the d-dimensional case, under some weak conditions,
where the asymptotic covariance matrix is given by
with
and
In the special (spherical) symmetric case \(\varSigma =I_d\) the identity (27) simplifies to
where
It was shown by Grübel (1996) that in this symmetric case the empirical orthomedian has the same asymptotic covariance matrix as the spatial median. In what follows, we compare the covariances of the limit distributions for several estimators of \(\mu \), in particular the empirical quarter median \({{\textsc {QMed}}}(X_1,\dots ,X_n)\), the sample mean \(\frac{1}{n}\sum _{j=1}^nX_j\), and the maximum likelihood estimator \(\hbox {ML}(X_1,\dots ,X_n)\), for some special distribution families, with our main focus on strictly elliptical ones. In all cases, the limit distribution of the standardized estimators is a centered multivariate normal, and by asymptotic covariance we mean the respective covariance matrix.
Example 1
We work out the details for some specific distribution families in the case of dimension \(d=2\).
(a) For normal distributions \(P=N_2(\mu ,\varSigma )\) Theorem 5 (a) leads to
The maximum likelihood estimator of \(\mu \) is the sample mean; its (asymptotic) covariance is simply \(\varSigma \) itself, hence the asymptotic efficiency of the quarter median is about \(64\%\). Of course, the quarter median is more robust with respect to gross outliers in the data. In the symmetric case \(\varSigma =I_2\) both orthomedian and spatial median have asymptotic covariance \(\tfrac{4}{\pi }I_2\).
For non-symmetric two-dimensional normal distributions with \(\varSigma =\mathrm{diag}(1,\lambda )\), \(0<\lambda <1\), Brown (1983) and Grübel (1996) obtained expressions for the asymptotic covariance matrices of the spatial median and the orthomedian. Both are of diagonal form, with the two diagonal entries given by expressions involving infinite series and double integrals respectively. These can be used numerically, see Brown (1983, Table 1) and Grübel (1996, Table 1). For the second value, referring to the shorter axis, we have the simple explicit formula \(\lambda \pi /2\) for the quarter median, which is smaller than the value for the spatial median if \(\lambda <0.038\), and for the orthomedian if \(\lambda <0.1\).
(b) Taking \(h(t)=\exp (-\tfrac{1}{2}\sqrt{t}),\,t\ge 0,\) we obtain the bivariate doubly exponential distributions \(\mathrm{MDE}_2(\mu ,\varSigma )\), with densities given by
The asymptotic covariance matrix for the sample mean of independent two-dimensional random vectors \(X_1,\dots ,X_n\) with the density \(f(\cdot ;\mu ,\varSigma )\) is given by \(\mathrm{cov}(X_1)=12\,\varSigma \); see, e.g. (Gómez et al. 1998, Proposition 3.2 (iii)). Using
we obtain \(\sigma ^2_{{{\textsc {QMed}}}}=\pi ^2\), so that \(\pi ^2\varSigma \) is the asymptotic covariance matrix for the sample quarter median. For a given fixed symmetric positive definite \(\varSigma \) the Fisher information matrix can be calculated to be \(\frac{1}{8}\varSigma ^{-1}\) (see, e.g. Mitchell 1989, formulas (3.2), (3.3)). So, by the general asymptotic theory of maximum likelihood estimation \(8\varSigma \) is the asymptotic covariance of the maximum likelihood estimator \(\hbox {ML}(X_1,\dots ,X_n).\) In the symmetric case \(\varSigma =I_d\), it follows from (26) and (28) that
which means that the asymptotic covariances of \({\textsc {SMed}}(X_1,\dots ,X_n)\) and \(\hbox {ML}(X_1,\dots ,X_n)\) coincide in this case. This is not surprising, because \({\textsc {SMed}}(X_1,\dots ,X_n)\) is the maximum likelihood estimator of \(\mu \) in this symmetric case.
(c) With \(h(t)=(1+t)^{-3/2}\) we obtain a class of bivariate Cauchy distributions \(\mathrm{MC}_2(\mu ,\varSigma )\) with densities
The associated marginal distributions are univariate Cauchy distributions. For this distribution family the sample mean is not even consistent, hence we do not take it into consideration as an estimator for \(\mu \). Again we use (Mitchell 1989, formulas (3.2),(3.3)) to see that for given fixed positive definite \(\varSigma \) the Fisher information matrix is \(\frac{3}{5}\varSigma ^{-1}.\) Thus, by the general asymptotic theory of maximum likelihood estimators,
Using
we arrive at the value \(\tfrac{\pi ^2}{4}\varSigma \) for the asymptotic covariance matrix of the sample quarter median. In the symmetric case \(\varSigma =I_d\), using (26) and (28), it follows that
(d) With \(h(t)=(1-t)^2\) for \(0\le t \le 1\) and 0 elsewhere we have a special class of symmetric bivariate Pearson type II distributions \(\mathrm{SMPII}_2(\mu ,\varSigma )\) with densities
The asymptotic covariance matrix for the sample mean of independent two-dimensional random vectors \(X_1,\dots ,X_n\) with the density \(f(\cdot ;\mu ,\varSigma )\) is \(\mathrm{cov}(X_1)=\frac{1}{8}\varSigma \); see Fang et al. (1990, p. 89). The centered bivariate limit normal distribution of the maximum likelihood estimator has the covariance matrix \(\frac{1}{12}\varSigma \). Using
it follows that \(\frac{25}{1024}\pi ^2\varSigma \) is the asymptotic covariance matrix for the sample quarter median. In the symmetric case \(\varSigma =I_d\) we obtain
(e) Taking \(h(t)=\frac{\exp (-t)}{\left( 1+\exp (-t)\right) ^2},\,t\ge 0,\) we obtain the symmetric bivariate logistic distributions \(\mathrm{SML}_2(\mu ,\varSigma )\), with densities given by
For \(X\sim \mathrm{SML}_2(0,I_2)\) the distribution of \(R^2=\Vert X\Vert ^2\) is the univariate half-logistic distribution with density \(2\frac{\exp (-t)}{\left( 1+\exp (-t)\right) ^2},\,t\ge 0\). Then,
and \(\mathrm{cov}(X)=m_{\mathrm{SML}_2}I_2.\) From this we deduce that the asymptotic covariance matrix for the sample mean of independent two-dimensional random vectors \(X_1,\dots ,X_n\) with the above density \(f(\cdot ;\mu ,\varSigma )\) is \(\mathrm{cov}(X_1)=m_{\mathrm{SML}_2}\varSigma \). With
the Fisher information matrix is \(k_{\mathrm{SML}_2}\varSigma ^{-1}\), and the centered bivariate limit normal distribution of the maximum likelihood estimator has the covariance matrix \(k_{\mathrm{SML}_2}^{-1}\varSigma \). Further, with
we obtain that \(\frac{c_2(h)^2}{(2c_1(h))^2}\varSigma \, \approx \,1.35901156\varSigma \) is the covariance matrix of the centered limiting normal distribution of the quarter median. In the symmetric case \(\varSigma =I_d\) we get
where \(\frac{1}{2c_1(h)^2}\approx 1.10157328\). Numerical values for the ratios \(k_{\mathrm{SML}_2}^{-1}/m_{\mathrm{SML}_2}\) and \(k_{\mathrm{SML}_2}^{-1}/(\frac{c_2(h)}{2c_1(h)})^2\) are shown in the last line of Table 1.
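A numerical cross-check of the moment constant in part (e): for a spherical random vector \(X\) in dimension two one has \(\mathrm{cov}(X)=\tfrac{1}{2}E(R^2)\,I_2\), so \(m_{\mathrm{SML}_2}=\tfrac{1}{2}E(R^2)\) with \(R^2\) following the half-logistic density stated above. The sketch below (our illustration, not from the paper; the identification of \(m_{\mathrm{SML}_2}\) with \(\tfrac{1}{2}E(R^2)\) rests on sphericity) evaluates this integral by a simple Riemann sum.

```python
import numpy as np

# Density of R^2 = ||X||^2 for X ~ SML_2(0, I_2), as stated in the text:
# f(t) = 2 exp(-t) / (1 + exp(-t))^2,  t >= 0  (univariate half-logistic).
t = np.linspace(0.0, 60.0, 600_001)
dt = t[1] - t[0]
f = 2.0 * np.exp(-t) / (1.0 + np.exp(-t)) ** 2

mass = np.sum(f) * dt            # sanity check: should be close to 1
m_sml2 = np.sum(t * f) * dt / 2  # = E[R^2]/2, the factor in cov(X) = m * I_2
print(round(mass, 4), round(m_sml2, 4))
```

A series expansion of the integrand gives \(E(R^2)=2\log 2\), so the computed value should be close to \(\log 2\approx 0.6931\).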
We summarize some special results presented in Example 1 for the respective strictly elliptical cases in Table 1, where the relative asymptotic efficiencies \(\mathrm{eff}(\mathrm{Mean}, \mathrm{ML})\), and \(\mathrm{eff}({{\textsc {QMed}}},\mathrm{ML})\) of the mean and the quarter median with respect to the maximum likelihood estimator are shown; these are the ratios \(\sigma _\mathrm{ML}^2/\sigma _\mathrm{Mean}^2\) and \(\sigma _\mathrm{ML}^2/\sigma _{{{\textsc {QMed}}}}^2\).
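The relative efficiencies in Table 1 can be reproduced directly from the scalar factors collected in Example 1. The sketch below (ours; it only restates numbers derived in the text, and the logistic case is omitted because its constants are given there in numerical form) computes the ratios \(\sigma _\mathrm{ML}^2/\sigma _\mathrm{Mean}^2\) and \(\sigma _\mathrm{ML}^2/\sigma _{{{\textsc {QMed}}}}^2\).

```python
import numpy as np

pi = np.pi
# Scalar factors c in the asymptotic covariance c * Sigma, as derived in
# Example 1 (a)-(d); the sample mean is not consistent in the Cauchy case.
cases = {
    #               ML       Mean    QMed
    "Normal":      (1.0,     1.0,    pi / 2),
    "DoublyExp":   (8.0,     12.0,   pi ** 2),
    "Cauchy":      (5 / 3,   None,   pi ** 2 / 4),
    "PearsonII":   (1 / 12,  1 / 8,  25 * pi ** 2 / 1024),
}
for name, (ml, mean, qmed) in cases.items():
    eff_mean = ml / mean if mean is not None else float("nan")
    eff_qmed = ml / qmed
    print(f"{name:10s} eff(Mean,ML) = {eff_mean:.4f}  eff(QMed,ML) = {eff_qmed:.4f}")
```

For the normal case this recovers \(2/\pi \approx 0.64\), the value quoted in Example 1 (a).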
In contrast to the other multivariate versions of the median that appear in the above example the quarter median goes beyond providing a location estimate as it comes with an equipartition basis. In the elliptical case \(P=\mathrm{Ell}_d(\mu ,\varSigma ;h)\) the basis elements are the unit vectors proportional to the eigenvectors of \(\varSigma \) and thus contain information about the dispersion parameter. In Theorem 3 the asymptotic behavior of \((U_n)_{n\in \mathbb {N}}\) is important. In the strictly elliptical case we have consistency and we may assume that convergence takes place in \(\mathbb {SO}(d)\); see also the beginning of the proof of Theorem 3. We aim at a second order statement about the distributional asymptotics of this sequence.
In the following theorem we only consider \(d=2\), but see the ensuing remarks. We can then use the parametrization of \(\mathbb {SO}(2)\) given in Remark 1 (b) by an angle from a half-open interval I of length \(\pi /2\). We assume that \(U=U_\alpha \) with I chosen such that \(\alpha \) is in its interior. Then consistency implies that \(U_n=U_{\alpha _n}\) with I-valued random variables \(\alpha _n\) that converge to \(\alpha \) with probability 1. It then makes sense to consider the distributional asymptotics of \(\sqrt{n}(\alpha _n-\alpha )\) as \(n\rightarrow \infty \).
Theorem 6
Let \(P=\mathrm{Ell}_2(\mu ,\varSigma ;h)\) and let \(X_n, \theta _n, U_n\), \(n\in \mathbb {N}\), be as in Theorem 5 (a). Let \(\lambda _1\) and \(\lambda _2\) be the eigenvalues of \(\,\varSigma \), with \(\lambda _1>\lambda _2\). Then, with \(\alpha \) and \(\alpha _n\) as defined above,
where
Moreover, \(\alpha _n\) and \(\theta _n\) are asymptotically independent.
In particular, the asymptotic variance in (34) is large if the eigenvalues are close to each other. This is to be expected as \(\mathrm{Ell}_2(\mu ,\varSigma ;h)\) is then close to a symmetric distribution, where U is not unique.
In dimensions higher than two a possible approach could be based on the representation of the Lie group \(\mathbb {SO}(d)\) by its Lie algebra \(\mathfrak {so}(d)\), which consists of the skew-symmetric matrices \(A\in \mathbb {R}^{d\times d}\), where \(U\in \mathbb {SO}(d)\) is written as the matrix exponential \(U=\exp (A)\) of some \(A\in \mathfrak {so}(d)\); see Muirhead (1982, Theorem A9.7) for a proof of the latter assertion, and Hall (2004) for an elementary introduction to Lie groups, Lie algebras, and representations. This leads to an asymptotic distribution Q for \((U^{\text {t}}U_n)^{\sqrt{n}}\) as \(n\rightarrow \infty \), where Q is a probability measure on (the Borel subsets of) \(\mathbb {SO}(d)\), and it avoids any ambiguities caused by parametrization. In fact, Q is the distribution of the random orthogonal matrix \(\exp (A)\), where A is a random skew-symmetric matrix with jointly normal subdiagonal entries. If \(d=2\), as in the above theorem, then
and A is then specified by its one subdiagonal entry, which has a central normal distribution with variance given by (34).
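For \(d=2\) the correspondence between \(\mathfrak {so}(2)\) and \(\mathbb {SO}(2)\) can be made concrete: the matrix exponential of the skew-symmetric matrix with subdiagonal entry \(\beta \) is the rotation by the angle \(\beta \). A short numerical check (ours, using a truncated power series for the exponential):

```python
import numpy as np

def expm_series(A, terms=40):
    """Matrix exponential via its power series, truncated; adequate for
    matrices of small norm such as the ones considered here."""
    out = np.eye(A.shape[0])
    term = np.eye(A.shape[0])
    for k in range(1, terms):
        term = term @ A / k
        out = out + term
    return out

beta = 0.3
A = np.array([[0.0, -beta],
              [beta, 0.0]])        # skew-symmetric; one free subdiagonal entry

U = expm_series(A)
R = np.array([[np.cos(beta), -np.sin(beta)],
              [np.sin(beta),  np.cos(beta)]])   # rotation by the angle beta

print(np.allclose(U, R), np.allclose(U @ U.T, np.eye(2)))
```

Since \(A^2=-\beta ^2 I_2\), the series collapses to \(\exp (A)=\cos (\beta )\,I_2+\beta ^{-1}\sin (\beta )\,A\), which is exactly the rotation matrix.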
2.6 Measurability, selection and algorithms
Let S be a topological space with Borel \(\sigma \)-field \(\mathcal {B}(S)\). We write \(C_b(S)\) for the set of bounded continuous functions \(f:S\rightarrow \mathbb {R}\). Let \(Z,Z_1,Z_2,\ldots \) be S-valued random variables, i.e. \(\mathcal {B}(S)\)-measurable functions on some probability space (which may depend on the respective variable). In its classical form, convergence in distribution \(Z_n\rightarrow _{\text {\tiny d}}Z\) of \(Z_n\) to Z as \(n\rightarrow \infty \) means that
This is the notion that we used in Sect. 2.4. An extension of this concept, due to Hoffmann–Jørgensen, can be applied if the \(Z_n\)’s are not measurable with respect to the full Borel \(\sigma \)-field on S; roughly, the expectations \(Ef(Z_n)\) in (35) are then replaced by outer expectations \(E^\star f(Z_n)\). Similarly, almost sure convergence now refers to outer probabilities. The research monographs of Dudley (1999) and van der Vaart and Wellner (1996) give an in-depth treatment of this circle of ideas, together with a variety of applications. This extension appears in our proof of Theorem 3 in connection with empirical processes; it can also be used if the estimators \((\theta _n,U_n)\) are not measurable. We refer the reader to the paper of Kuelbs and Zinn (2013), where this is worked out in detail for the related situation of quantile processes.
While this avoids the problem of choosing an estimator from the respective solution set in a measurable way, it is of independent interest whether such a selection is possible. For the classical one-dimensional median and the marginal median this can obviously be done by choosing the midpoint of the respective (component) interval, but no such simple rule seems to exist for the quarter median.
We think of estimators as functions of the random variables \(X_i\), \(i\in \mathbb {N}\), and it is then enough to establish measurability for these functions. To be precise, for a given \(n\in \mathbb {N}\) and an n-tuple \((x_1,\dots ,x_n)\in (\mathbb {R}^d)^n\) we denote by \(\big (\theta _n(x_1,\dots ,x_n),U_n(x_1,\dots ,x_n)\big )\in \mathbb {R}^d\times \mathbb {SO}(d)\) a solution of the quarter median problem for the empirical distribution \(P_{n;x_1,\dots ,x_n}=n^{-1}\sum _{k=1}^n\delta _{x_k}\).
Proposition 2
There exists a function \(\tau :(\mathbb {R}^d)^n\rightarrow \mathbb {R}^d\times \mathbb {SO}(d)\) that is measurable with respect to the Borel \(\sigma \)-algebra on \((\mathbb {R}^d)^n\) and the Borel \(\sigma \)-algebra on \(\mathbb {R}^d\times \mathbb {SO}(d)\) such that, for each \((x_1,\dots ,x_n)\in (\mathbb {R}^d)^n\),
is a solution of the quarter median problem for \(P_{n;x_1,\dots ,x_n}\).
Moreover, \(\tau \) may be chosen to be permutation invariant in the sense that
for all permutations \(\pi \) of \(\{1,\ldots ,n\}\).
Permutation invariance means that the estimates depend on the data only through the respective empirical distribution.
Selecting a solution in a measurable way from the respective set of all solutions may be seen as a corollary to establishing an algorithm that returns an estimate for every data input \(x_1,\ldots ,x_n\). Indeed, without such an algorithm a statistical procedure would seem to be of limited use. In dimension two, and using the parametrization in Remark 1 (b), we obtain a real function on the interval \([0,\pi /2)\) by counting the number of data points in the upper right quadrant specified by \(U_\alpha \) and \(\hbox {MMed}_m(P^{U_\alpha })\); see Fig. 3 for a plot of this function for the city data in Fig. 1, where the counts refer to the open quarter space. By construction, all half-spaces contain at least half of the data points, so each angle that leads to a count of n/4 (if n is divisible by 4) would give a quarter median. For absolutely continuous distributions this approach, together with the intermediate value theorem, leads to a proof of Theorem 1 in dimension 2, and this may be used numerically via bisection. Such a bisection argument also works for functions that are piecewise constant and have jumps of size \(\pm 1\) only, provided a solution exists.
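The counting function just described is easy to set up. In the sketch below (ours; the exact parametrization \(U_\alpha \) of Remark 1 (b) is not reproduced in this excerpt, so the rotation used is an assumption, and \(\hbox {MMed}_m\) is taken to be the componentwise midpoint median) we evaluate it on a grid of angles for simulated data:

```python
import numpy as np

def quadrant_count(x, alpha):
    """Number of data points in the open upper-right quadrant determined by
    the rotation U_alpha and the marginal median of the rotated data."""
    c, s = np.cos(alpha), np.sin(alpha)
    U = np.array([[c, s], [-s, c]])   # assumed parametrization of SO(2)
    y = x @ U.T                       # rotated data, one point per row
    m = np.median(y, axis=0)          # midpoint marginal median of P_n^{U_alpha}
    return int(np.sum((y[:, 0] > m[0]) & (y[:, 1] > m[1])))

rng = np.random.default_rng(1)
x = rng.standard_normal((100, 2))     # n = 100, divisible by 4
grid = np.linspace(0.0, np.pi / 2, 200, endpoint=False)
counts = [quadrant_count(x, a) for a in grid]

# Any angle with count n/4 = 25 yields a quarter median candidate; a bisection
# search can replace the grid scan, as described in the text.
print(min(counts), max(counts))
```

The count jumps by \(\pm 1\) as data points cross the quadrant boundary, which is what makes the bisection argument work.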
In what follows, still considering the case \(d=2\), we present a different algorithm. Let \(x_1,\dots ,x_n\) be \(n\ge 2\) pairwise distinct data points in \(\mathbb {R}^2\). For \(1\le i<j\le n\) let \(b_{ij}=\frac{x_j-x_i}{\Vert x_j-x_i\Vert }\in \mathbb {S}_1\) be the direction of the line through \(x_i\) and \(x_j,\) and let \(b_{ij}'\in \mathbb {S}_1\) be orthogonal to \(b_{ij}.\) With the empirical distribution \(P_n=n^{-1}\sum _{k=1}^n \delta _{x_k}\) we associate the finite set \(\mathscr {L}(P_n)\) of pairs \((\theta _{ij},U_{ij})\), with \(U_{ij}\in \mathbb {O}(2)\) as the matrix with row vectors \(b_{ij}^{\text {t}}\) and \({b_{ij}'}^{\text {t}}\), and \(\theta _{ij}=U_{ij}^{\text {t}}\eta \) with some \(\eta \in \mathcal {M}\bigl (P_n^{U_{ij}}\bigr )\).
Theorem 7
There is an element \((\theta _{ij},U_{ij}) \in \mathscr {L}(P_n)\) that solves the quarter median problem for \(P_n\).
Hence, checking successively the conditions (5) and (6) for all \(\left( \theta _{ij},U_{ij}\right) \in \mathscr {L}(P_n)\), \(1\le i<j\le n\), provides a solution of the quarter median problem for \(P_n\). This procedure, however, may lead to an estimator that is not permutation invariant in the sense of (36), meaning that it would not be a function of the empirical distribution \(P_n\). For our introductory data example, see Fig. 1, this happens for both of the above algorithms: Fig. 3 shows that different solutions appear depending on whether the angles are scanned clockwise or counterclockwise. Similarly, the algorithm based on Theorem 7 could lead to the (i, j)-pair corresponding to the cities Berlin and Frankfurt am Main, or München and Dortmund, for example.
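A direct implementation of this procedure is straightforward for moderate n. The following sketch (ours; the condition labels (5) and (6) are not reproduced in this excerpt and are rendered here as "every closed quarter space carries at least a quarter of the points, every closed half-space at least half") scans the finite candidate set \(\mathscr {L}(P_n)\); an odd sample size keeps the marginal median unique:

```python
import numpy as np

def is_quarter_median(x, theta, U, tol=1e-9):
    """Check the quarter median conditions for the empirical distribution:
    each closed half-space holds >= n/2 points, each closed quarter space >= n/4."""
    n = len(x)
    z = (x - theta) @ U.T              # coordinates w.r.t. the rows of U
    for i in range(2):
        for s in (1, -1):
            if np.sum(s * z[:, i] >= -tol) < n / 2:
                return False
    for si in (1, -1):
        for sj in (1, -1):
            if np.sum((si * z[:, 0] >= -tol) & (sj * z[:, 1] >= -tol)) < n / 4:
                return False
    return True

def quarter_median_scan(x):
    """Scan the candidate set L(P_n) of Theorem 7 (pairwise distinct points)."""
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            b = x[j] - x[i]
            b = b / np.linalg.norm(b)
            U = np.array([b, [-b[1], b[0]]])   # rows b_ij^t and b_ij'^t
            eta = np.median(x @ U.T, axis=0)   # marginal median of P_n^{U_ij}
            theta = U.T @ eta
            if is_quarter_median(x, theta, U):
                return theta, U
    return None

rng = np.random.default_rng(2)
x = rng.standard_normal((41, 2))       # odd n: the marginal median is unique
theta, U = quarter_median_scan(x)
print(theta)
```

As noted above, the pair returned depends on the scanning order, so this estimator need not be permutation invariant.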
3 Numerical comparison
We present a small simulation study on the performance of the quarter median (\({{\textsc {QMed}}}\)) as a location estimator, comparing it with that of three other procedures: the spatial median (\({\textsc {SMed}}\)); Oja's simplicial median (\(\textsc {OMed}\)), see Oja (1983); and Tukey's half-space median (\({\textsc {TMed}}\)), see Tukey (1975). For a detailed description of these classical estimators we refer to the survey papers of Small (1990) and Oja (2013). A Monte Carlo study on the performance of the three (and some other bivariate location) estimators is given by Massé and Plante (2003).
Samples are drawn from the strictly elliptical distributions \(N_2(\mu ,\varSigma )\) (bivariate normal), \(\mathrm{MDE}_2(\mu ,\varSigma )\) (bivariate doubly exponential), \(\mathrm{SMPII}_2(\mu ,\varSigma )\) (symmetric bivariate Pearson type II), and \(\mathrm{SML}_2(\mu ,\varSigma )\) (symmetric bivariate logistic) introduced in Example 1. The bivariate Cauchy distributions \(\mathrm{MC}_2(\mu ,\varSigma )\) also discussed there are not taken into consideration for the simulation study, as the Oja median does not exist for this distribution family. In fact, for a given distribution P on \(\mathbb {R}^2\), independent \(X_j\sim P\), \(j=1,2\), and \(\theta \in \mathbb {R}^2\), let \(\varDelta (X_1,X_2,\theta )\) be the area of the triangle in \(\mathbb {R}^2\) with vertices \(X_1,X_2,\theta .\) If the expectation \(\gamma _P(\theta ):=E\left( \varDelta (X_1,X_2,\theta )\right) \) is finite for all \(\theta \in \mathbb {R}^2\), the Oja median is a point \(\theta _\mathrm{Oja}(P)\in \mathbb {R}^2\) that minimizes the function \(\theta \mapsto \gamma _P(\theta )\). Obviously, \(\gamma _P\equiv \infty \) for \(P=\mathrm{MC}_2(0,\varSigma )\), while for the other elliptical distributions P under consideration \(\theta _\mathrm{Oja}(P)=\mu .\)
Without loss of generality we take \(\mu =0\). The dispersion matrix chosen is \(\varSigma =\begin{pmatrix} 1 &{} 0 \\ 0 &{} \lambda \end{pmatrix}\), with \(\lambda \in \{0.01,0.1,0.5,0.9\}\). Repeatedly, with \(r=10{,}000\) replications, samples of size \(n=100\) are drawn from the underlying distribution. Let \({\textsc {SMed}}_{n,i},\textsc {OMed}_{n,i},{\textsc {TMed}}_{n,i},{{\textsc {QMed}}}_{n,i}\) be the observed values of the corresponding estimators obtained in the ith repetition. Table 2 shows the numerical values of the components \(\hat{m}_1,\hat{m}_2\) of the empirical means and the numerical values of the eigenvalues \(\hat{l}_1,\hat{l}_2\) of the empirical covariance matrices of \(\sqrt{n}{\textsc {SMed}}_{n,i},\sqrt{n}\textsc {OMed}_{n,i},\sqrt{n}{\textsc {TMed}}_{n,i},\sqrt{n}{{\textsc {QMed}}}_{n,i}\), \(i=1,\ldots ,r.\) The simulations are conducted by using the statistical software environment R; the calculation of the estimators \({\textsc {SMed}}\), \(\textsc {OMed}\), and \({\textsc {TMed}}\) is easily done by applying the corresponding functions med(...,method="Spatial"), med(...,method="CR20",...) and TukeyMedian(...) provided with the additional R software packages depth and TukeyRegion. Note that, in the latter package, the Tukey median is defined to be the barycenter of the Tukey region.
Roughly, for the bivariate distributions considered the Oja median seems to show the best overall performance; it is followed by the Tukey median, the spatial median and the quarter median (in this order). But note the exceptional case \(\lambda =0.01,\) with the smaller values of \(\hat{l}_2\) and \(\hat{l}_1\hat{l}_2\) for \({{\textsc {QMed}}}\) compared to those of \({\textsc {SMed}}\). Note also that, as we already pointed out above, the quarter median contains information about the dispersion parameter and so goes beyond providing a location estimate.
For samples taken from bivariate normal distributions with dispersion matrix \(\varSigma =\begin{pmatrix} 1 &{} 0 \\ 0 &{} \lambda \end{pmatrix}\) the limit distributions of \({\textsc {SMed}}\), \(\textsc {OMed}\), and \({{\textsc {QMed}}}\) as \(n\rightarrow \infty \) are the centered bivariate normal distributions with the diagonal covariance matrices \(\varSigma _{{\textsc {SMed}}}=\begin{pmatrix} l_{1,{\textsc {SMed}}} &{} 0 \\ 0 &{} l_{2,{\textsc {SMed}}} \end{pmatrix}\), see Brown (1983), where also expressions for the diagonal elements \(l_{1,{\textsc {SMed}}},l_{2,{\textsc {SMed}}}\) and their numerical values for some special values of \(\lambda \) are given, \(\varSigma _{\textsc {OMed}}=\begin{pmatrix} 4/\pi &{} 0 \\ 0 &{} 4\lambda /\pi \end{pmatrix}\), see Oja and Niinimaa (1985), and \(\varSigma _{{{\textsc {QMed}}}}=\begin{pmatrix} \pi /2 &{} 0 \\ 0 &{} \pi \lambda /2 \end{pmatrix}\). The limit distribution of \({\textsc {TMed}}\) is the distribution of \(\arg \sup _{t\in \mathbb {R}^2} \inf _{u\in \mathbb {S}_{1}} \left( Z(u) - \frac{1}{2\pi \lambda ^{1/2}} u^{\text {t}}t\right) \), where \(\left( Z(u),u\in \mathbb {S}_{1}\right) \) is a centered Gaussian process with the covariance function \(\mathrm{cov}\left( Z(u),Z(v)\right) =P\left( V_{++}(0;u,v)\right) -1/4,\,u,v\in \mathbb {S}_{1};\) see Nolan (1999). To the best of our knowledge, nothing is known about the type of this distribution. In this respect, see for example also the interesting results on the asymptotic behaviour of the empirical Tukey depth process given by Massé (2004).
Table 3 shows the values of \(\hat{l}_1,\hat{l}_2\) and, for comparison, that of the diagonal elements \(l_1,l_2\) of \(\varSigma _{{\textsc {SMed}}},\varSigma _{\textsc {OMed}}\), and \(\varSigma _{{{\textsc {QMed}}}}\).
4 Proofs
4.1 Proof of Theorem 1
Using Makeev’s result, we only need to remove the smoothness assumption.
For an arbitrary probability measure P on \((\mathbb {R}^d,\mathcal {B}^d)\) let \(P_\epsilon =P * N_d(0,\epsilon I_d)\) be the convolution of P with a centered d-dimensional normal distribution with independent components that all have variance \(\epsilon >0\). Then \(P_\epsilon \) is absolutely continuous with an everywhere positive density, so that a solution \((\theta _\epsilon , U_\epsilon )\) of the equipartition problem exists. Let \(|\!|\!|\cdot |\!|\!|\) be a matrix norm on \(\mathbb {R}^{d\times d}\) that is compatible with the Euclidean norm \(\Vert \cdot \Vert \) on \(\mathbb {R}^d\). From \(\theta _\epsilon =U_\epsilon ^{\text {t}}U_\epsilon \theta _\epsilon \) it then follows that
where \(s=\sup _{U\in \mathbb {SO}(d)}|\!|\!|U^{\text {t}}|\!|\!|<\infty .\) Let \(t=\limsup _{\epsilon \rightarrow 0} \Vert U_\epsilon \theta _\epsilon \Vert \). Then there exists a sequence \((\epsilon _n)_{n\in \mathbb {N}}\) with \(\epsilon _n\rightarrow 0\) such that \(\Vert U_{\epsilon _n}\theta _{\epsilon _n}\Vert \rightarrow t \) and \(U_{\epsilon _n}\rightarrow U \in \mathbb {O}(d)\) as \(n\rightarrow \infty \). As a consequence, \(P_{\epsilon _n}^{U_{\epsilon _n}}\rightarrow P^U\) weakly. Due to the fact that \(U_{\epsilon _n}\theta _{\epsilon _n}\) is a marginal median of \(P_{\epsilon _n}^{U_{\epsilon _n}}\) we deduce from this that the sequence \(\left( \Vert U_{\epsilon _n}\theta _{\epsilon _n}\Vert \right) _{n=1}^\infty \) is bounded; hence, by (37), the sequence \((\theta _{\epsilon _n})_{n=1}^\infty \) is also bounded. Therefore, \(\theta _{\epsilon _{n'}}\rightarrow \theta \in \mathbb {R}^d\) and \(U_{\epsilon _{n'}}\rightarrow U \in \mathbb {O}(d)\) along some subsequence \((\epsilon _{n'})\) of \((\epsilon _{n}).\) To see that \((\theta ,U)\) solves the quarter median problem for P we use that \(P_{\epsilon _{n'}}^{U_{\epsilon _{n'}}}\rightarrow P^U\) weakly and argue as follows: With \(b_{\epsilon _{n'},1}^{\text {t}},\dots ,b_{\epsilon _{n'},d}^{\text {t}}\) and \(b_1^{\text {t}},\dots ,b_d^{\text {t}}\) as the row vectors of \(U_{\epsilon _{n'}}\) and U, respectively, we have that, for each \(1\le i<j\le d\) and each \(\delta >0\),
Let \(\delta \downarrow 0\) to obtain \(P\big (V_{+-}(\theta ;b_i,b_j)\big )\ge \frac{1}{4}.\) The other inequalities stated in (5) and (6) can be verified in the same way.
4.2 Proof of Proposition 1
(a) Let \(\theta \) be a quarter median for P with \(b_1,\ldots ,b_d\) an associated orthonormal basis of \(\mathbb {R}^d\) such that the constraints (5) and (6) are satisfied. Further, let \(S(x)=Ax+c\) with \(A\in \mathbb {O}(d)\), \(c\in \mathbb {R}^d\), be a Euclidean motion. Then \(b_1',\ldots ,b_d'\) with \(b_i':=Ab_i\), \(1\le i\le d\), is again an orthonormal basis for \(\mathbb {R}^d\), and it is easily checked that
In view of the definition of push-forwards this implies that \(S(\theta )\) is a quarter median for \(P^S\), with \(b_1',\ldots ,b_d'\) an associated equipartition basis.
(b) If \(V=AU\) with \(A\in G\) then
where we have used (9) and \(G\subset \mathbb {O}(d)\).
(c) Let \(b_1^{\text {t}},\ldots ,b_d^{\text {t}}\) be the row vectors of U. With
it holds that \(\theta =U^{\text {t}}\theta _U=\sum _{i=1}^d\theta _{U,i}b_i\) and that, by definition, \(\theta _U\) is a marginal median of \(P^U.\) We can now choose an element \(\tilde{\theta }_U=(\tilde{\theta }_{U,1},\ldots ,\tilde{\theta }_{U,d})^{\text {t}}\in \mathcal {M}\left( P^U\right) \) such that
for \(1\le i\le d\), and
for \(1\le i<j\le d\). Then putting \(\tilde{\theta }=U^{\text {t}}\tilde{\theta }_U=\sum _{i=1}^d \tilde{\theta }_{U,i} b_i\) we have
for \(1\le i\le d\), and
for \(1\le i<j\le d\). Thus, \((\tilde{\theta },U)\) solves the quarter median problem for P.
(d) The first statement may be regarded as an equivariance property of solution pairs for the quarter median problem; its proof proceeds as in (a). For the second statement we note that \(A\in \mathbb {O}(d)\) and then calculate, using the shift equivariance of the marginal median,
(e) Let \(U\in \mathbb {O}(d)\) and \((U_n)_{n=1}^\infty \) be a sequence of elements in \(\mathbb {O}(d)\) converging to U. Due to \(P^{U_n}\rightarrow P^U\) weakly and the fact that \(P^{b}\) has a unique median for all \(b\in \mathbb {S}_{d-1},\) it follows that \(\hbox {MMed}_m(P^{U_n})\rightarrow \, \hbox {MMed}_m(P^U).\) Then, from
and the compactness of \(\mathbb {O}(d)\) we deduce that \( U_n^{\text {t}}\,\hbox {MMed}_m(P^{U_n})\rightarrow U^{\text {t}}\,\hbox {MMed}_m(P^U).\)
4.3 Proof of Theorem 2
We will make use of the fact that if \(m_n,n\in \mathbb {N},\) are medians of univariate distributions that converge weakly as \(n\rightarrow \infty \) to a univariate distribution with unique median m, then \(\lim _{n\rightarrow \infty } m_n=m\). Further, we only prove the second part of the theorem as the arguments for (a) are similar.
Let \((\varOmega ,\mathcal {A},\mathbb {P})\) be a probability space on which the random vectors \(X_1,X_2,\ldots \) are defined, so that \(\mathbb {P}^{X_j}=P\) for all \(j\in \mathbb {N}\). Then there is a \(\mathbb {P}\)-null set \(N\in \mathcal {A}\) such that for each \(\omega \in N^c\)
Fix \(\omega \in N^c\); until further notice we omit the argument \(\omega \) below. By definition, the quarter median \(\theta _n\) of \(P_n\) can be written as \(\theta _n=U_n^{\text {t}}\eta _n\) with some \(U_n\in \mathbb {O}(d)\) and some \(\eta _n\in [\hbox {MMed}_-(P_n^{U_n}),\,\hbox {MMed}_+(P_n^{U_n})]\) (both may depend on \(\omega \in N^c\)). Let \((U_{n'})\) be a subsequence of \((U_n)\). As \(\mathbb {O}(d)\) is compact a subsubsequence \((U_{n''})\) converges to some \(V\in \mathbb {O}(d)\), and we have weak convergence \(P_{n''}^{U_{n''}}\rightarrow P^V\) by the extended continuous mapping theorem; see Billingsley (1968, Theorem 5.5). As the distribution \(P^b\) has a unique median for all \(b\in \mathbb {S}_{d-1}\), the marginal median \(\eta _{n''}\) of \(P_{n''}^{U_{n''}}\) converges to the marginal median \(\eta \) of \(P^V\). Now fix i, j with \(1\le i < j\le d\). Let \(a_{n''},b_{n''}\), a, b be the corresponding rows in \(U_{n''}\) and V respectively. Using the same arguments as in the proof of Theorem 1 we then obtain for each \(\epsilon >0\), and with m instead of \(n''\),
where in the first line we have used that \((\theta _m,U_m)\) solves the quarter median problem for \(P_m\). Letting \(\epsilon \downarrow 0\) we get \(P\left( V_{-+}(\eta ; a,b) \right) \, \ge \, 1/4\). More generally, and using the same argument, we obtain \(P\bigl (V_{\pm \pm }(\eta ;a,b)\bigr )\ge 1/4\) as well as \(P\bigl (H_{\pm }(\eta ;a)\bigr )\ge 1/2\) and \(P\bigl (H_{\pm }(\eta ;b)\bigr )\ge 1/2\). Recall that some of the quantities depend on the \(\omega \) chosen above so that, for example,
In summary, we have shown that every limit point \((\eta (\omega ),V(\omega ))\) of the sequence \((\theta _n(\omega ),U_n(\omega ))_{n\in \mathbb {N}}\) is a solution of the quarter median problem for P. Hence, if the solution is unique and given by \((\theta ,H)\in \mathbb {R}^d\times \mathbb {H}\), then, on a set of probability 1, \(\eta =\theta \) and \([V]=H\).
4.4 Proof of Theorem 3
Invariance with respect to shifts means that we may assume \(\theta =0\). As already noticed in Sect. 4.3, the quarter median \(\theta _n\) can be written as \(\theta _n=U_n^{\text {t}}\eta _n\) with \(\eta _n\in [\hbox {MMed}_-(P_n^{U_n}),\,\hbox {MMed}_+(P_n^{U_n})]\). Theorem 2 implies that \(([U_n])_{n\in \mathbb {N}}\) converges almost surely to H with respect to the quotient topology on \(\mathbb {H}(d)\). The mapping \(p:\mathbb {O}(d)\rightarrow \mathbb {H}(d)\), \(U\mapsto [U]\), associates to each \(H\in \mathbb {H}(d)\) a finite fiber \(p^{-1}(\{H\})\), and \(\mathbb {O}(d)\) is a covering space for \(\mathbb {H}(d)\). General results from algebraic topology, see e.g. tom Dieck (1991, Section II.6), imply that p may locally be inverted to obtain homeomorphisms to open subsets of the individual sheets of the covering space. In particular, we may assume that the matrices \(U_n\) converge in \(\mathbb {O}(d)\) to some \(U\in H\) almost surely as \(n\rightarrow \infty \).
For each \(a\in \mathbb {S}_{d-1}\) and \(n\in \mathbb {N}\) let \(M_n^\pm (a):=\hbox {Med}_\pm \bigl (P_n^a\bigr )\), and let M(a) be the uniquely determined median of \(P^a\). We define two stochastic processes \(Y_n^\pm =\bigl (Y_n^\pm (a),a\in \mathbb {S}_{d-1}\bigr )\) with index set \(\mathbb {S}_{d-1}\) by
and will use results presented by Grübel (1996) and adapt arguments used by Kuelbs and Zinn (2013) to prove that
Here \(Y=(Y_a,a\in \mathbb {S}_{d-1})\) is a centered Gaussian process with continuous paths and covariance function
with
We regard the \(Y_n^\pm \) as processes with paths in \(\ell _\infty (\mathbb {S}_{d-1})\), the space of real-valued bounded functions on \(\mathbb {S}_{d-1}\) endowed with the supremum norm, and the symbol ‘\(\rightarrow _{\text {\tiny d}}\)’ refers to weak or distributional convergence in the sense of Hoffmann-Jørgensen; see e.g. van der Vaart and Wellner (1996) and Dudley (1999) for details.
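For a finite sample the directional medians \(M_n^\pm (a)\) are simply the smallest and largest sample medians of the projected data \(a^{\text {t}}x_1,\ldots ,a^{\text {t}}x_n\); a minimal sketch (function name ours):

```python
import numpy as np

def directional_medians(a, X):
    """Smallest and largest sample median (M_n^-(a), M_n^+(a)) of the
    projected sample a^t x_1, ..., a^t x_n."""
    z = np.sort(np.asarray(X, dtype=float) @ a)
    n = z.size
    if n % 2:                         # n odd: the median is unique
        return z[n // 2], z[n // 2]
    return z[n // 2 - 1], z[n // 2]   # n even: endpoints of the median interval
```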
For the proof of (38) we start with a functional central limit theorem for the empirical processes and then use an almost sure representation in order to be able to work with individual paths in an \(\epsilon \)–\(\delta \) style.
Let \(F(a,\cdot ):=F_a(\cdot )\) be the distribution function of \(a^{\text {t}}X\) and let
be the (random) distribution function associated with \(P_n^a\), \(a\in \mathbb {S}_{d-1}\). Here \(1(\,\cdot \,)\) denotes the indicator function of its (logical) argument. We introduce the empirical process \(Z_n=\left( \sqrt{n}\left( F_n(a,t)-F(a,t)\right) ,(a,t)\in \mathbb {S}_{d-1}\times \mathbb {R}\right) \) as a process with paths in \(\ell _\infty (\mathbb {S}_{d-1}\times \mathbb {R})\). We obtain a semimetric d on \(\mathbb {S}_{d-1}\times \mathbb {R}\) via
\((a,s),(b,t)\in \mathbb {S}_{d-1}\times \mathbb {R}\). As noticed in Grübel (1996, Proof of Theorem 1), we then have \(Z_n\rightarrow _{\text {\tiny d}}Z_0\), where \(Z_0=(Z_0(a,t),(a,t)\in \mathbb {S}_{d-1}\times \mathbb {R})\) is a centered Gaussian process with covariance function
\((a,s),(b,t)\in \mathbb {S}_{d-1}\times \mathbb {R}\), and sample paths that are bounded and continuous with respect to d. The assumptions on P ensure that the covariance function is continuous with respect to the usual topology on \(\mathbb {S}_{d-1}\times \mathbb {R}\). This implies that the process \(Z_0\) has continuous sample paths.
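The empirical quantities entering \(Z_n\) are elementary to compute; a small sketch (names ours, with the model distribution function F passed in as a callable):

```python
import numpy as np

def F_n(a, t, X):
    """Distribution function of P_n^a at t: the fraction of the
    projections a^t x_k that do not exceed t."""
    return np.mean(np.asarray(X, dtype=float) @ a <= t)

def Z_n(a, t, X, F):
    """Empirical process sqrt(n) (F_n(a, t) - F(a, t)), where F(a, .)
    is the model distribution function of a^t X."""
    return np.sqrt(len(X)) * (F_n(a, t, X) - F(a, t))
```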
For the remainder of the proof we abbreviate ‘almost surely’ to ‘a.s.’, and convergence refers to \(n\rightarrow \infty \) unless specified otherwise.
By the Skorokhod–Dudley–Wichura representation theorem (Dudley 1999, Theorem 3.5.1) there exists a probability space \((\tilde{\varOmega },\tilde{\mathcal {A}},\tilde{\mathbb {P}})\) together with perfect and measurable maps \(g_n:\tilde{\varOmega }\rightarrow \varOmega \), \(n\in \mathbb {N}_0\), such that \(\tilde{\mathbb {P}}^{g_n}=\mathbb {P}\) for all \(n\in \mathbb {N}_0\), and such that, with \(\tilde{Z}_n:= Z_n\circ g_n\),
As \(F_n\) is a deterministic function of \(Z_n\) it follows that \(\tilde{F}_n:= F_n\circ g_n\) has the same distribution under \(\tilde{\mathbb {P}}\) as \(F_n\) under \(\mathbb {P}\), \(n\in \mathbb {N}\). This extends to the associated one-dimensional projections and their medians, and to the \(Y_n\)-processes, where \(\tilde{M}_n^\pm (a):=\hbox {Med}_\pm \bigl (\tilde{F}_n(a,\cdot )\bigr )\) and \(\tilde{Y}_n^\pm :=Y_n^\pm \circ g_n\), \(n\in \mathbb {N}\).
Arguing as in Grübel (1996, Proof of Theorem 1) we obtain that
for all \(a\in \mathbb {S}_{d-1}\), and that
For the integral of the marginal medians needed by Grübel (1996) this was enough, but here we need the stronger statement
For the proof of (43) we adopt some arguments given in Kuelbs and Zinn (2013, Proof of Proposition 1). First we note that \(F\bigl (a,M(a)\bigr )=1/2\) and then write
where \(r(a,h)=f_a\bigl (M(a)+\zeta _{a,h}h\bigr )-f_a\bigl (M(a)\bigr )\) with some \(\zeta _{a,h}\in [0,1]\). Using compactness of \(\mathbb {S}_{d-1}\) we see that the assumption (14) implies
Further, (42) yields
in view of the definition of the Y-processes. Taken together this gives, \(\tilde{\mathbb {P}}\)-a.s.,
and using (42) again we get
From (41) and the continuity of \(a\mapsto \tilde{Z}_0\left( a,M(a)\right) \) it follows that, again \(\tilde{\mathbb {P}}\)-a.s.,
Absolute continuity of P implies that samples from P are in general position with probability 1, meaning that at most d points are on a hyperplane. Hence at most d of the variables \(a^{\text {t}}X_1,\ldots ,a^{\text {t}}X_n\) coincide, and thus, given that the range of \(\tilde{F}_n=F_n\circ g_n\) is contained in the range of \(F_n\),
with probability 1. Using this and (45) we get
From
which holds by (44), we obtain that, \(\tilde{\mathbb {P}}\)-a.s.,
Thus,
which finishes the proof of (43) as \(a\mapsto f_a(M(a))\) is bounded away from 0 on \(\mathbb {S}_{d-1}\).
Returning to the un-tilded variables we see that this establishes the functional limit theorems in (38).
We note in passing that \(\sup _{a\in \mathbb {S}_{d-1}}\bigl |\tilde{Y}_n^+(a)-\tilde{Y}_n^-(a)\bigr |\,\rightarrow \, 0\) \(\tilde{\mathbb {P}}\)-a.s., which implies
This shows that the volume of the interval \([\hbox {MMed}_-(P_n^{U_n}),\,\hbox {MMed}_+(P_n^{U_n})]\) shrinks fast enough so that the choice of a vector from this interval is irrelevant for the distributional asymptotics.
Now let \(b_n(i)^{\text {t}}\) and \(b(i)^{\text {t}}\) be the ith row of \(U_n\), \(n\in \mathbb {N}\), and U respectively, for \(i=1,\dots ,d\). In view of \(U_n\rightarrow U\) in \(\mathbb {O}(d)\) we have \(b_n(i)\rightarrow b(i)\) \(\mathbb {P}\)-a.s. for \(i=1,\ldots ,d\). Let \(W_n:=\sqrt{n}\eta _n = U_n\sqrt{n}(\theta _n-\theta )\) and
Note that \(W_n\) is an element of the d-dimensional interval \([W_n^-,W_n^+]\) with probability 1. As the paths of the limit process are continuous the functional limit theorems (38) now imply \(W_n^\pm \rightarrow _{\text {\tiny d}}W\), hence
with \(W\sim N_d(0,\varXi )\), where the entries of \(\varXi \) can be calculated from (39) and (40). Indeed, as \(M\equiv 0\) and as (0, U) solves the quarter median problem for P, we obtain \(K(b_i,b_i)=1/4\) and \(K(b_i,b_j)=0\) if \(i\not = j\), so that \(\varXi =\varDelta (P,U)\) with \(\varDelta (P,U)\) the diagonal matrix given in (17).
We now apply \(U_n^{\text {t}}\) on the left hand side and \(U^{\text {t}}\) on the right hand side of (47), and use Billingsley (1968, Theorem 5.5) again to obtain the claimed asymptotic normality of \(\sqrt{n}(\theta _n-\theta )\), together with the mean and covariance matrix of the limit distribution.
It remains to show that \(\varSigma (P,U)\) does not depend on the choice of U from the set H. Indeed, any other choice is of the form QU, where the orthogonal matrix Q permutes rows or multiplies some rows by \(-1\); this permutes the diagonal entries of \(\varDelta (P,U)\) accordingly, i.e. \(\varDelta (P,QU)=Q\varDelta (P,U)Q^{\text {t}}\), so that \((QU)^{\text {t}}\varDelta (P,QU)\,QU=U^{\text {t}}\varDelta (P,U)\,U\).
For the proof, assumption (16) is important. Without such an assumption we would need to decompose \(\sqrt{n}\bigl (M_n^\pm (b_n(i)) - M(b(i))\bigr )\) into the two terms \(\sqrt{n}\bigl (M_n^\pm (b_n(i)) - M(b_n(i))\bigr )\) and \(\sqrt{n}\bigl (M(b_n(i)) - M(b(i))\bigr )\). The first could still be analyzed with the above arguments, but for the second term the local behaviour of the function \(a\mapsto M(a)\) and the distributional asymptotics of the estimates \(U_n\) would become important; see also Theorem 6.
Further, (46) and the ensuing comment show that the original estimate \(\theta _n\) need not be permutation invariant. This may seem surprising as the proof relies on empirical process theory, which deals with the data through their empirical distribution. Here it turned out to be enough that the estimate can be squeezed between two other estimates that are such functions of the empirical distribution.
4.5 Proof of Theorem 4
The transformation law for d-dimensional elliptical distributions, together with the equivariance from Proposition 1 (a), means that we may assume that \(\mu =0\) and that \(\varSigma =\mathrm{diag}(\lambda _1,\ldots ,\lambda _d)\). Recall that \(e_1,\ldots ,e_d\) is the canonical basis of \(\mathbb {R}^d\). For a subset I of \(\{1,\ldots ,d\}\) let
The assumptions imply that \(P^T=P\) for all reflections \(T:\mathbb {R}^d\rightarrow \mathbb {R}^d\) about the hyperplanes \(H(e_i)\), hence \(P(A_I)\) does not depend on I; also, \(\sum _{I\subset \{1,\ldots ,d\}}P(A_I)=1\). A quarter space such as \(\{x\in \mathbb {R}^d:\, x^{\text {t}}e_i\ge 0, \, x^{\text {t}}e_j\ge 0\}\) with \(i\not =j\) can be written as the union of \(2^{d-2}\) such sets \(A_I\), where the pairwise intersections are all P-null sets, hence each quarter space has probability \(2^{d-2}2^{-d}=1/4\). This shows that \((0,I_d)\) solves the quarter median problem for P.
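The counting step can be verified by brute force: each of the \(2^d\) orthants \(A_I\) carries mass \(2^{-d}\), and a quarter space collects exactly \(2^{d-2}\) of them. A small enumeration sketch (ours):

```python
from itertools import combinations

def orthants_in_quarter_space(d, i, j):
    """Count the sign patterns I (the set of coordinates taken
    nonnegative) whose orthant A_I lies in the quarter space
    {x_i >= 0, x_j >= 0}: i and j must belong to I, while the
    remaining d - 2 coordinates may be chosen freely."""
    others = [k for k in range(d) if k not in (i, j)]
    return sum(1 for r in range(len(others) + 1)
               for _ in combinations(others, r))
```

With each orthant of probability \(2^{-d}\), the quarter space has probability \(2^{d-2}\cdot 2^{-d}=1/4\) for every \(d\ge 2\).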
In order to see that the quarter median is unique it is enough to show that \({\hbox {MMed}}(P^U)=0\) for all \(U\in \mathbb {O}(d)\) under the above assumptions on P. This follows from the fact that \(\mathrm{Ell}_d(0,\varSigma ;h)\) is invariant under the reflection \(x\mapsto -x\), \(x\in \mathbb {R}^d\).
It remains to show that the solution of the quarter median problem is unique in the strictly elliptical case. To prove this, we recall that \(\mu =0\) and that \(\varSigma =\mathrm{diag}(\lambda _1,\ldots ,\lambda _d)\). Assume that \((0,[U])\) with \(U\in \mathbb {O}(d)\) is a solution of the quarter median problem. Let \(b_1^{\text {t}},\dots ,b_d^{\text {t}}\) be the row vectors of U. Then \(P^U=\mathrm{Ell}_d(0,\tilde{\varSigma };h)\), where \(\tilde{\varSigma }=\left( \tilde{\sigma }_{ij}\right) _{1\le i,j\le d}=U\varSigma U^{\text {t}}\). With \(U_{ij}\) as the \(2\times d\) matrix with the rows \(b_i^{\text {t}},b_j^{\text {t}}\), \(1\le i<j\le d\), we see that \(P_{ij}:=P^{U_{ij}}\) is the elliptical distribution \(\mathrm{Ell}_2(0,\tilde{\varSigma }_{ij};h_2)\) with dispersion matrix
and density generator \(h_2\) which is equal to h if \(d=2\) and given by
if \(d\ge 3\); see, e.g. Fang et al. (1990, Section 2.2.3). According to Grübel (1996, p. 1466) the probability assigned by \(P_{ij}\) to the left lower quadrant is given by
which is equal to \(\frac{1}{4}\) if and only if \(\tilde{\sigma }_{ij}=0\), i.e. if and only if \(\tilde{\varSigma }_{ij}\) is a diagonal matrix with positive diagonal elements. Therefore, \(\tilde{\varSigma }=U\varSigma U^t\) is a diagonal matrix, \(\tilde{\varSigma }=\mathrm{diag}(\tilde{\lambda }_1,\dots ,\tilde{\lambda }_d)\) say, or equivalently, \(\varSigma =U^{\text {t}}\tilde{\varSigma }U=U^{\text {t}}\mathrm{diag}(\tilde{\lambda }_1,\dots ,\tilde{\lambda }_d)U\). It follows from this that the set of eigenvalues \(\bigl \{\tilde{\lambda }_1,\ldots ,\tilde{\lambda }_d\bigr \}\) of \(\tilde{\varSigma }\) and the set of eigenvalues \(\bigl \{\lambda _1,\ldots ,\lambda _d \bigr \}\) of \(\varSigma \) coincide and, as a consequence, that there is a permutation matrix \(\Pi \) such that \(\Pi ^{\text {t}}\varSigma \Pi =\tilde{\varSigma }\), which in turn implies \(U^{\text {t}}\Pi ^{\text {t}}\varSigma U\Pi = \varSigma .\) Thus, putting \(U^{\text {t}}\Pi ^{\text {t}}=:Q=(q_{ij})\) we have \(Q\varSigma =\varSigma Q^{\text {t}},\) i.e. \(q_{ij}\lambda _j=\lambda _iq_{ij}.\) As the \(\lambda _j\) are pairwise distinct this gives \(q_{ij}=0\) for \(i\not =j\). Thus Q is a diagonal matrix with the entries \(\pm 1\) in the diagonal so that \(U\sim I_d.\) Taken together this shows that \((0,[I_d])\) is the unique solution of the quarter median problem.
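The concluding step, namely that \(Q\varSigma =\varSigma Q^{\text {t}}\) with pairwise distinct \(\lambda _j\) forces Q to be diagonal with entries \(\pm 1\), can also be illustrated numerically: an orthogonal matrix that keeps a diagonal \(\varSigma \) with distinct entries diagonal must be a signed permutation matrix. The helper below (ours) tests the latter property:

```python
import numpy as np

def is_signed_permutation(U, tol=1e-10):
    """True if every row and column of U has exactly one entry of
    modulus 1 and zeros otherwise, i.e. U is a permutation matrix
    times a diagonal matrix with entries +-1."""
    A = np.abs(np.asarray(U, dtype=float))
    return (np.allclose(A.sum(axis=0), 1.0, atol=tol)
            and np.allclose(A.sum(axis=1), 1.0, atol=tol)
            and np.allclose(A.max(axis=1), 1.0, atol=tol))
```

A genuine rotation applied to \(\mathrm{diag}(1,2,3)\) produces nonzero off-diagonal entries, while a signed permutation does not.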
4.6 Proof of Theorem 5
(a) We first assume that \(\mu =0\) and that \(\varSigma =\mathrm{diag}(\lambda _1,\ldots ,\lambda _d)\). Using Theorem 3 we only need to evaluate the diagonal elements of \(\varDelta (P,U)\), and because of Theorem 4 we may take U to be the identity matrix. In particular, \(b_i=e_i\) for \(i=1,\ldots ,d\). Then \(f_{e_i}\) is the ith marginal density of P, and the calculations preceding Theorem 4 lead to
Putting this together we see that, for such P, the asymptotic covariance matrix is indeed the specified multiple of \(\varSigma \).
We now use the equivariance properties of the quarter median to ‘bootstrap’ this to general strictly elliptical distributions. Hence suppose that \(P=\mathrm{Ell}_d(\mu ,\varSigma ;h)\) and that \(\lambda _1> \lambda _2> \cdots > \lambda _d\) are the eigenvalues of \(\varSigma \). Then, for some \(U\in \mathbb {O}(d)\), \(\varSigma =U^{\text {t}}\mathrm{diag}(\lambda _1,\ldots ,\lambda _d)U\). If \(X_i\), \(i\in \mathbb {N}\), are independent with distribution P then the random variables \(Y_i:= U(X_i-\mu )\), \(i\in \mathbb {N}\), are independent with distribution \(\mathrm{Ell}_d(0,\mathrm{diag}(\lambda _1,\ldots ,\lambda _d);h)\). As \(U(\theta _n - \mu )\) is a quarter median of \(Y_1,\dots ,Y_n\) the first part of the proof leads to
with \(\varXi =\sigma _{{{\textsc {QMed}}}}^2\,\mathrm{diag}(\lambda _1,\ldots ,\lambda _d)\). Using the well-known properties of weak convergence and the behavior of multivariate normal distributions under affine transformations we finally obtain (24).
(b) Suppose that the eigenvalues of \(\varSigma \) coincide and let \((U_n)_{n\in \mathbb {N}}\) be as in the statement of the second part of the theorem. Then, for any subsequence \((U_{n'})\) there exists a subsubsequence \((U_{n''})\) such that \(U_{n''}\rightarrow U_0\) for some \(U_0\in \mathbb {O}(d)\) as \(n''\rightarrow \infty \). With m for \(n''\) and \(b_m(i)^{\text {t}}\), \(b(i)^{\text {t}}\) the ith row of \(U_m\) respectively \(U_0\), and using the same arguments as for (47), we obtain
The symmetry of P implies that the limit distribution does not depend on \(U_0\).
4.7 Proof of Theorem 6
As the result is somewhat tangential to the topic of multivariate location estimation we only provide the main ideas together with some arguments specific to the quarter median application.
The general approach consists in writing the parameter \(\theta =\theta (P)\) as the solution of an equation \(\varPsi _P(\theta )=z\) with a suitably chosen function \(\varPsi _P\) and known z. Let \(\varPsi _n\) be the empirical version of \(\varPsi _P\) and suppose that \(\varPsi _n(\theta _n)\) is sufficiently close to z for large \(n\in \mathbb {N}\). Then, under certain conditions, properties of the estimator sequence \((\theta _n)_{n\in \mathbb {N}}\) can be obtained by localizing \(\varPsi _P\) near \(\theta \) and then using the delta method. An excellent general exposition of this approach can be found in the textbook of van der Vaart (1998); see also van der Vaart (2002). Specifically, van der Vaart (1998, Theorem 5.9) together with van der Vaart (2002, Theorem 6.17) may serve as a guide towards filling in the details. Ultimately, this would also lead to an alternative proof for Theorem 5 in the case \(d=2\) that does not make use of the results presented by Grübel (1996).
For \(a,b>0\) let
be the boundary of the two-dimensional centered ellipse with main half-axes parallel to the unit vectors \(e_1\) and \(e_2\) and of length a and b respectively. On this set we consider the push-forward \(Q(a,b):=\mathrm{unif}(\mathbb {S}_{1})^{T_{a,b}}\) of the uniform distribution on \(\mathbb {S}_{1}=E(1,1)\) under the mapping \(T_{a,b}:\mathbb {R}^2\rightarrow \mathbb {R}^2\), \((x,y)^{\text {t}}\mapsto (a x,b y)^{\text {t}}\). If \(a\not = b\) then the solution of the quarter median problem for Q(a, b) is unique and given by \(\mu =(0,0)^{\text {t}}\) and \(U=\mathrm{diag}(1,1)=U_\alpha \) with \(\alpha =0\).
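A Monte Carlo sketch (ours, with a fixed seed) illustrates this for \(Q(2,1)\): the axis-aligned quarter spaces at the origin receive mass 1/4 each, whereas a quarter space rotated by \(\pi /4\) does not:

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_Q(a, b, n):
    """Draw n points from Q(a,b), the image of unif(S_1) under
    T_{a,b}(x, y) = (a x, b y)."""
    phi = rng.uniform(0.0, 2.0 * np.pi, size=n)
    return np.column_stack((a * np.cos(phi), b * np.sin(phi)))

X = sample_Q(2.0, 1.0, 100_000)
# axis-aligned quarter space {x >= 0, y >= 0}: mass 1/4 by symmetry
p_axis = np.mean((X[:, 0] >= 0) & (X[:, 1] >= 0))
# quarter space rotated by pi/4: its mass differs from 1/4 when a != b
p_rot = np.mean((X[:, 0] + X[:, 1] >= 0) & (X[:, 1] - X[:, 0] >= 0))
```

The rotated quarter space corresponds to the arc \(\{\varphi :\sin \varphi \ge 2|\cos \varphi |\}\), whose probability is well below 1/4.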
Let P be a probability measure on the Borel subsets of \(\mathbb {R}^2\). We introduce the function \(\varPsi _P:\mathbb {R}\times \mathbb {R}\times [-\pi /4,\pi /4)\rightarrow \mathbb {R}^3\), where the components \(\varPsi _{P,i}\) are the respective probabilities of the upper right, upper left and lower left quarter spaces associated with the argument. Formally, and with \(b_\alpha ,b_\alpha '\) the columns of \(U_\alpha ^{\text {t}}\),
and \(\,\varPsi _{P,2}(x,y,\alpha ) = P\bigl (V_{-+}(\cdots )\bigr )\), \(\,\varPsi _{P,3}(x,y,\alpha ) = P\bigl (V_{--}(\cdots )\bigr )\).
Suppose that \(P=Q(a,b)\) with \(a\not =b\). Then elementary geometric considerations lead to the following matrix of partial derivatives of \(\,\varPsi _{P}\) at \((x,y,\alpha )=(0,0,0)\),
For later use we note that
In order to obtain the derivative for more general P we use some structural properties of symmetric and elliptical distributions. Note that Q(r, r) is the uniform distribution on the boundary of the sphere with radius r. If \(X\sim \mathrm{Sym}_2(h)\) then \(\Vert X\Vert \) has density
and given \(\Vert X\Vert =r\), X is uniformly distributed on \(\{x\in \mathbb {R}^2:\, \Vert x\Vert =r\}\). Taken together this implies the mixture representation
Elliptical distributions are affine transformations of symmetric distributions. In particular,
so that
Inserting quarter spaces we obtain a relation between \(\varPsi _P\) and the \(\varPsi \)-functions corresponding to the measures Q(a, b), \(a,b>0\). This in turn can be used to relate the derivative D(P) of \(\varPsi _P\) at (0, 0, 0) to the derivatives obtained earlier for Q(a, b), and we arrive at
and thus
If \((x,y,\alpha )\) solves the quarter median problem for an absolutely continuous P, then \(\varPsi _P(x,y,\alpha )=(1/4,1/4,1/4)\). For a sample of size n from P the random vector of observation counts in the quarter spaces associated with the solution of the quarter median problem for P has a multinomial distribution with parameters n and (1/4, 1/4, 1/4, 1/4). The multivariate central limit theorem shows that the standardized counts in the three quarter spaces corresponding to the components of \(\varPsi \) are asymptotically normal with mean vector 0 and covariance matrix
We may now apply the delta method to obtain asymptotic normality for the triplet consisting of the quarter median coordinates and the rotation angle for a sequence of independent random variables with distribution \(P\,=\,\mathrm{Ell}_2\bigl ((0,0)^{\text {t}}, \mathrm{diag}(a^2,b^2);h\bigr )\). The limiting normal distribution is centered and has covariance matrix
It remains to extend this to \(P=\mathrm{Ell}_2(\mu ,\varSigma ;h)\). We use essentially the same argument as at the end of Sect. 4.6 (a). By assumption and Theorem 4, we have that \(\varSigma =U_\alpha ^{\text {t}}\mathrm{diag}(\lambda _1,\lambda _2)U_\alpha \), where \(U_\alpha \) is the matrix representing the rotation by the angle \(\alpha \) in the interior of the half-open interval I, see (8), and that \((\mu ,U_\alpha )\) is the unique solution of the quarter median problem for P. Theorem 2 implies that \(\alpha _n-\alpha \in (-\pi /4,+\pi /4)\) with probability 1 for n sufficiently large. Let \(Y_i:= U_\alpha ^{\text {t}}(X_i-\mu )\), \(i\in \mathbb {N}\). Then, with probability 1 for n large enough, \(\left( U_\alpha ^{\text {t}}({{\textsc {QMed}}}(X_1,\ldots ,X_n)-\mu ),U_{\alpha _n(X_1,\dots ,X_n)-\alpha }\right) \) is a solution of the quarter median problem for \(Y_1,\dots ,Y_n\). Noting that \(\lambda _1=a^2\) and \(\lambda _2=b^2\) we may now apply the above asymptotic normality result to the Y-sequence to obtain
Consequently,
which is the statement of the theorem.
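The multinomial covariance used in the delta method step above is explicit: the three retained standardized counts have variance \(\tfrac{1}{4}\cdot \tfrac{3}{4}=\tfrac{3}{16}\) and pairwise covariance \(-\tfrac{1}{16}\). A short numerical check (ours):

```python
import numpy as np

p = np.full(4, 0.25)
# analytic covariance of the standardized counts, three cells kept:
# diagonal p_i (1 - p_i), off-diagonal -p_i p_j
C = np.diag(p[:3]) - np.outer(p[:3], p[:3])

# simulation check against repeated multinomial draws
rng = np.random.default_rng(1)
n, reps = 500, 20_000
counts = rng.multinomial(n, p, size=reps)[:, :3]
Z = (counts - n * p[:3]) / np.sqrt(n)      # standardized counts
C_hat = np.cov(Z, rowvar=False)
```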
4.8 Proof of Proposition 2
We denote by \(\mathcal C\) the set of non-empty closed subsets of \(\mathbb {R}^d\times \mathbb {SO}(d).\) For \(x_1,\ldots ,x_n\in \mathbb {R}^d\) let \(\varGamma (x_1,\dots ,x_n)\) be the (by Theorem 1 non-empty) set of solutions \((\theta ,U)\in \mathbb {R}^d\times \mathbb {SO}(d)\) of the quarter median problem for \(P_{n;x_1,\dots ,x_n}\). As permutations of \(x_1,\ldots ,x_n\) do not change \(P_{n;x_1,\dots ,x_n}\) we may regard \(\varGamma \) as a function on \(E:=(\mathbb {R}^d)^n/\sim \), where \((x_1,\ldots ,x_n),(y_1,\ldots ,y_n)\in (\mathbb {R}^d)^n\) are equivalent if \(y_i=x_{\pi (i)}\) for \(i=1,\ldots ,n\) for some permutation \(\pi \) of the set \(\{1,\ldots ,n\}\). We aim to prove that the function \(\varGamma \) has values in \(\mathcal C\) and that it is measurable in the sense that the set
is a Borel set for each \(C\in \mathcal C\). Then, by the measurable selection theorem of Kuratowski and Ryll-Nardzewski (see, e.g. Rockafellar 1976, 1.C Corollary) there exists a measurable function \(\tau :E\rightarrow \mathbb {R}^d\times \mathbb {SO}(d)\) such that \(\tau (z)\in \varGamma (z)\) for all \(z\in E\). In fact, we will prove that the set A(C) is closed whenever \(C\in \mathcal C\).
Arguing as in the first part of the proof of Theorem 1 in Sect. 4.1 we see that if \((\theta _\ell ,U_\ell )_{\ell \in \mathbb {N}}\) is a sequence of elements in \(\varGamma (z)\) converging to \((\theta ,U)\in \mathbb {R}^d\times \mathbb {SO}(d)\) as \(\ell \rightarrow \infty \), then \((\theta ,U)\in \varGamma (z)\). This shows that the function \(\varGamma \) has values in \(\mathcal C\).
Now, more generally, fix \(C\in \mathcal C\), let A(C) be as in (50), and let \((z_\ell )_{\ell \in \mathbb {N}}\) be a sequence of elements of A(C) that converges to some element \(z\in E\) as \(\ell \rightarrow \infty \). As pointed out in connection with the transition from \(\mathbb {O}(d)\) to \(\mathbb {H}(d)\) in Sect. 4.4 we can find \((x_1^{(\ell )},\dots ,x_n^{(\ell )})\in z_\ell \), \(\ell \in \mathbb {N}\), and \((x_1,\dots ,x_n)\in z\) such that \((x_1^{(\ell )},\dots ,x_n^{(\ell )})\) converges to \((x_1,\dots ,x_n)\) in the space \((\mathbb {R}^d)^n\). Then for each \(\ell \in \mathbb {N}\) there exists an associated element \((\theta _\ell ,U_\ell )\) of \(\varGamma (x_1^{(\ell )},\dots ,x_n^{(\ell )})\). Arguing as in Sect. 4.1 again, we see that along some subsequence \((\ell ')\) it holds that \((\theta _{\ell '},U_{\ell '})\rightarrow (\theta ,U)\in \mathbb {R}^d\times \mathbb {SO}(d)\) and
With \(b_1^{\text {t}},\dots ,b_d^{\text {t}}\) as the row vectors of U this gives \(P_{n;x_1,\dots ,x_n}\big (V_{\pm \pm }(\theta ;b_i,b_j)\big )\ge \frac{1}{4}\). Consequently, \((\theta ,U)\in \varGamma (z)\); also, \(z\in A(C)\) as C is closed.
4.9 Proof of Theorem 7
Let \(\theta \) be a quarter median of \(P_n\), with associated orthogonal directions \(b,b'\in \mathbb {S}_{1}\). Let \(L_{\theta ,b}=\{\theta + tb:t\in \mathbb {R}\}\) and \(L_{\theta ,b'}=\{\theta + tb':t\in \mathbb {R}\}\) be the axes of the rectangular coordinate system with origin \(\theta \) and directions \(b,b'\). We consider two cases.
(i) \(\theta \in \{x_1,\dots ,x_n\}\). We may assume without loss of generality that \(\theta =x_1,\) and define
There are orthogonal directions \(b_*,b_*'\in \mathbb {S}_{1}\) obtained from \(b,b'\) by the same clockwise or counterclockwise rotation by the angle \(\psi \), where \(b_*\) or \(b_*'\) is an element of \(\{-b_{12},b_{12},\ldots ,-b_{1n},b_{1n}\}\). By construction, the number of data points \(x_j\) lying in \(V_{\pm \pm }(\theta ;b_{*},b_{*}')\) is greater than or equal to the number of data points lying in \(V_{\pm \pm }(\theta ;b,b')\); similarly, the number of data points lying in \(H_{\pm }(\theta ;b_*)\) and in \(H_{\pm }(\theta ;b_*')\) is greater than or equal to the number lying in \(H_{\pm }(\theta ;b)\) and in \(H_{\pm }(\theta ;b')\), respectively. It follows that \((\theta ,U_*)\), with \(U_*\) the matrix with row vectors \(b_*,b_*'\), solves the quarter median problem for \(P_n\). Thus, by Proposition 3 (c), there is an \(\eta \in \mathcal {M}\bigl (P_n^{U_*}\bigr )\) such that \((\theta _*,U_*)\) with \(\theta _*=U_*^{\text {t}}\eta \) also solves the quarter median problem for \(P_n\). Note that \(\eta = \alpha b_* +\beta b_*'\), where \(\alpha \) is a median of \(P_n^{b_*}\) and \(\beta \) is a median of \(P_n^{b_*'}\). Because \(b_*\) or \(b_*'\) is an element of \(\{-b_{12},b_{12},\ldots ,-b_{1n},b_{1n}\}\), and
it follows that \((\theta _*,U_*)\in \mathscr {L}(P_n).\)
(ii) \(\theta \notin \{x_1,\dots ,x_n\}\). Let \(\tilde{x}\in \{x_1,\dots ,x_n\}\) be a data point with the shortest distance to \(L_{\theta ,b}\), i.e. \(\inf \{\Vert \tilde{x}-y\Vert :y\in L_{\theta ,b}\}=\min \bigl \{\inf \{\Vert x_j-y\Vert :y\in L_{\theta ,b}\}:1\le j\le n\bigr \}\), and let \(\tilde{\tilde{x}}\in \{x_1,\dots ,x_n\}\) be a data point with the shortest distance to \(L_{\theta ,b'}\). With \(\tilde{\theta }\) as the intersection point of the line parallel to \(L_{\theta ,b}\) through \(\tilde{x}\) and the line parallel to \(L_{\theta ,b'}\) through \(\tilde{\tilde{x}}\), we have that \((\tilde{\theta },U)\), with U the matrix with row vectors \(b,b'\), is a solution of the quarter median problem for \(P_n.\) If \(\tilde{x}=\tilde{\tilde{x}}\) then \(\tilde{\theta }=\tilde{x}=\tilde{\tilde{x}}\) and we arrive at case (i). So we can (and do) assume that there are two points, \(x_1,x_2\) say, such that \(x_1\in L_{\theta ,b}\) and \(x_2\in L_{\theta ,b'}.\) Now we put \(B:=\{b_{1j}:2\le j\le n\}\cup \{b_{2j}:3\le j\le n\}\) and define
There are orthogonal directions \(b_*,b_*'\in \mathbb {S}_{1}\) obtained from \(b,b'\) by the same clockwise or counterclockwise rotation by the angle \(\psi \), where \(b_*\) or \(b_*'\) is an element of \(B\cup (-B)\). With \(L_1=\{x_1 + tb_*:t\in \mathbb {R}\}\) and \(L_2=\{x_2 + tb_*':t\in \mathbb {R}\}\) we obtain the axes of a new rectangular coordinate system. The origin \(\theta \) of the coordinate system with the axes \(L_{\theta ,b}\) and \(L_{\theta ,b'}\) and the origin \(\theta _*\) of the new system are both located on the Thales circle over the line segment connecting \(x_1\) and \(x_2\). By construction, the number of data points \(x_j\) lying in \(V_{\pm \pm }(\theta _*;b_*,b_*')\) is greater than or equal to the number lying in \(V_{\pm \pm }(\theta ;b,b')\); further, the number of data points lying in \(H_{\pm }(\theta _*;b_*)\) and in \(H_{\pm }(\theta _*;b_*')\) is greater than or equal to the number lying in \(H_{\pm }(\theta ;b)\) and in \(H_{\pm }(\theta ;b')\), respectively. Consequently, \((\theta _*,U_*)\), with \(U_*\) the matrix with row vectors \(b_*,b_*'\), solves the quarter median problem for \(P_n\). Again by Proposition 3 (c), there is an \(\eta \in \mathcal {M}\bigl (P_n^{U_*}\bigr )\) such that \((\theta _{**},U_*)\), with \(\theta _{**}=U_*^{\text {t}}\eta \), also solves the quarter median problem for \(P_n\). Here \(b_*\) or \(b_*'\) is an element of \(B\cup (-B)\), so that, again by (51), we have \(\bigl (\theta _{**},U_*\bigr ) \in \mathscr {L}(P_n)\).
References
Bai ZD, Chen XR, Miao BQ, Rao CR (1990) Asymptotic theory of least distances estimate in multivariate linear models. Statistics 21:503–519
Baringhaus L, Grübel R, Henze N (2008) Nachruf auf Dietrich Morgenstern. Jahresber Deutsch Math Ver 110:101–113
Billingsley P (1968) Convergence of probability measures. Wiley, New York
Brown BM (1983) Statistical uses of the spatial median. J R Stat Soc Ser B 45:25–30
Dhar S, Chaudhuri P (2011) On the statistical efficiency of robust estimators of multivariate location. Stat Methodol 8:113–128
Donoho DL, Gasko M (1992) Breakdown properties of location estimates based on halfspace depth and projected outlyingness. Ann Stat 20:1803–1827
Dudley R (1999) Uniform central limit theorems. Cambridge University Press, Cambridge
Fang KT, Kotz S, Ng KW (1990) Symmetric multivariate and related distributions. Chapman and Hall, London
Gómez E, Gómez-Villegas MA, Marín JM (1998) A multivariate generalization of the power exponential family of distributions. Commun Stat Theory Methods 27:589–600
Grübel R (1996) Orthogonalization of multivariate location estimators: the orthomedian. Ann Stat 24:1457–1473
Hall BC (2004) Lie groups, Lie algebras, and representations. Springer, New York
Kuelbs J, Zinn J (2013) Empirical quantile CLTs for time-dependent data. In: High dimensional probability. Progr Probab 66. vol VI. Birkhäuser/Springer, Basel, pp 167–194
Makeev VV (2007) Equipartition of a continuous mass distribution. J Math Sci 140:551–557
Massé JC (2004) Asymptotics for the Tukey depth process, with an application to a multivariate trimmed mean. Bernoulli 10:397–419
Massé JC, Plante JF (2003) A Monte Carlo study of the accuracy and robustness of ten bivariate location estimators. Comput Stat Data Anal 42:1–26
Mitchell AFS (1989) The information matrix, skewness tensor and α-connections for the general multivariate elliptic distribution. Ann Inst Stat Math 41:289–304
Muirhead RJ (1982) Aspects of multivariate statistical theory. Wiley, New York
Nolan D (1999) On min–max majority and deepest points. Stat Probab Lett 43:325–333
Oja H (1983) Descriptive statistics for multivariate distributions. Stat Probab Lett 1:327–333
Oja H (2013) Multivariate median. In: Becker C, Fried R, Kuhnt S (eds) Robustness and complex data structures. Springer, Heidelberg, pp 3–15
Oja H, Niinimaa A (1985) Asymptotic properties of the generalized median in the case of multivariate normality. J R Stat Soc Ser B 47:372–377
Rockafellar RT (1976) Integral functionals, normal integrands and measurable selections. In: Nonlinear operators and the calculus of variations (Summer School, Univ Libre Bruxelles, Brussels, 1975). Lecture Notes in Mathematics 543, Springer, Berlin, pp 157–207
Small CG (1990) A survey of multidimensional medians. Int Stat Rev 58:263–277
Somorčík J (2006) Tests using spatial median. Aust J Stat 35:331–338
tom Dieck T (1991) Topologie. Walter de Gruyter & Co., Berlin
Tukey JW (1975) Mathematics and the picturing of data. In: Proceedings of the international congress of mathematicians, Vancouver, 1974, Vol. 2, pp 523–531
van der Vaart AW (1998) Asymptotic statistics. Cambridge University Press, Cambridge
van der Vaart AW (2002) Semiparametric statistics. In: Bolthausen E, Perkins E, van der Vaart A (eds) Lectures on probability theory and statistics. Lecture Notes in Mathematics 1781. Springer, Berlin
van der Vaart AW, Wellner JA (1996) Weak convergence and empirical processes. Springer, New York
Acknowledgements
Dietrich Morgenstern (1924–2007) was the founding father of the Institut für Mathematische Stochastik at the then Technische Universität Hannover, which is now the Leibniz Universität Hannover, and of two similar institutes in Freiburg and Münster; see the obituary of Baringhaus et al. (2008). Discussions with him were the starting point for the present paper. The authors also thank the Editor and the Reviewer for useful suggestions.
Open Access
This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
Funding
Open Access funding enabled and organized by Projekt DEAL.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Cite this article
Baringhaus, L., Grübel, R. The quarter median. Metrika 85, 419–458 (2022). https://doi.org/10.1007/s00184-021-00836-z
Keywords
- Asymptotic normality
- Consistency
- Estimation of location
- Euclidean motion equivariance
- Equipartition
- Multivariate median