Abstract
Characterizations of finite sequences \(\beta _{1}<\cdots <\beta _{n}\) representing expected values of order statistics from a random sample of size n are given. As a by-product, a characterization of binomial mixtures, when the mixing random variable is supported in the open interval (0, 1), is presented; this enables the exact description of the convex hull of the open binomial curve, as well as the open moment curve.
1 Introduction
In the present note we consider the following problem: Given n real numbers
$$\begin{aligned} \beta _{1}<\cdots <\beta _{n}, \end{aligned}$$
under what conditions on the \(\beta \)’s is there an integrable random variable (r.v.) X such that
$$\begin{aligned} \textrm{IE}X_{j:n}=\beta _{j}, \ \ \ j=1,\ldots ,n? \end{aligned}$$
[Here, \(X_{1:n}\le \cdots \le X_{n:n}\) are the order statistics of independent, identically distributed r.v.’s \(X_1,\ldots ,X_n\), each distributed as X.] Notice that the number n is held fixed; the question for infinite sequences is closely connected to the Hausdorff (1921) moment problem, and its answer is well-known from the works of Huang (1998); Kadane (1971, 1974); Kolodynski (2000); Papadatos (2017). Some related results for the finite case can be found in Mallows (1973).
The outline of the paper is as follows. In Section 2, we establish (Theorem 1) a one-to-one correspondence between the parent distribution of a sample with given expected order statistics \(\beta _{1}<\cdots <\beta _{n}\) and a random variable whose moments are determined by \(\beta _{1}<\cdots <\beta _{n}\). Since this characterization is rather hard to check directly, in Section 3 we provide (Theorem 3) explicit conditions on \(\beta _{1}<\cdots <\beta _{n}\) under which the \(\beta \)’s are expected order statistics.
2 A characterization of finite sequences of expected order statistics via binomial moments
Without loss of generality we may consider the numbers
$$\begin{aligned} \widetilde{\beta }_{i}:=\frac{\beta _{i}-c}{\lambda }, \ \ \ i=1,\ldots ,n, \ \ \ \hbox {with } c\in \textbf{R} \hbox { and } \lambda >0, \end{aligned}$$
instead of \(\beta _{i}\). Clearly these numbers will be the expected order statistics (=EOS) from \((X-c)/\lambda \) if and only if the \(\beta \)’s are the EOS from X.
First, we seek a necessary condition. Assume that X is a non-degenerate random variable with distribution function (d.f.) F and \(\textrm{IE}|X|<\infty \). Let \(X_1,\ldots ,X_n\) be independent, identically distributed (i.i.d.) random variables with d.f. F, and denote by \(X_{1:n}\le \cdots \le X_{n:n}\) the corresponding order statistics. It is known that
$$\begin{aligned} \mu _{k}=\begin{pmatrix}n\\ k \end{pmatrix}^{-1}\sum _{j=k}^{n}\begin{pmatrix}j-1\\ k-1 \end{pmatrix}\mu _{j:n}, \ \ \ k=1,\ldots ,n, \end{aligned}$$(1)
where \(\mu _{j:n}:=\textrm{IE}X_{j:n}\), \(j=1,\ldots ,n\), and \(\mu _{k}:=\textrm{IE}X_{k:k}\); this follows by a trivial application of Newton’s formula to the expression \( \mu _k = k \int _{0}^1 u^{k-1} F^{-1}(u) [u+(1-u)]^{n-k} du, \) where \(F^{-1}(u):=\inf \{x:F(x)\ge u\}\), \(0<u<1\), is the left-continuous inverse of F. From (1) with \(k=1,2\),
$$\begin{aligned} \mu _{1}=\frac{1}{n}\sum _{j=1}^{n}\mu _{j:n}, \ \ \ \ \mu _{2}=\frac{2}{n(n-1)}\sum _{j=2}^{n}(j-1)\mu _{j:n}. \end{aligned}$$(2)
On the other hand, it is well-known (see Jones and Balakrishnan (2002)) that
$$\begin{aligned} \mu _{j+1:n}-\mu _{j:n}=\begin{pmatrix}n\\ j \end{pmatrix}\int _{\alpha }^{\omega }F(x)^{j}\big (1-F(x)\big )^{n-j}\,dx, \ \ \ j=1,\ldots ,n-1, \end{aligned}$$(3)
where \(\alpha <\omega \) are the endpoints of the support of X; actually this formula goes back to Pearson (1902). Notice that \(-\infty \le \alpha <\omega \le \infty \), \(\alpha <\omega \) because F is non-degenerate, and the integral in (3) is finite since X is integrable. From (1) and (3) (applied to \(n=2\)),
$$\begin{aligned} \mu _{2}-\mu _{1}=\int _{\alpha }^{\omega }F(x)\big (1-F(x)\big )\,dx>0, \end{aligned}$$(4)
while (2) yields
$$\begin{aligned} \mu _{2}-\mu _{1}=\frac{1}{n(n-1)}\sum _{i=1}^{n-1}i(n-i)(\mu _{i+1:n}-\mu _{i:n}). \end{aligned}$$
Choosing \(c=n^{-1}\sum _{j=1}^n \mu _{j:n}\), \(\lambda =\big (n(n-1)\big )^{-1}\sum _{i=1}^{n-1} i(n-i)(\mu _{i+1:n}-\mu _{i:n})>0\), the numbers \(\widetilde{\mu }_{j:n}=(\mu _{j:n}-c)/\lambda \) and \(\widetilde{\mu }_{j}=(\mu _{j}-c)/\lambda \) are the EOS and the expected maxima, respectively, from \(\widetilde{X}=(X-c)/\lambda \), whose mean is 0 and whose Gini mean difference is 2. Therefore,
$$\begin{aligned} \int _{\widetilde{\alpha }}^{\widetilde{\omega }}\widetilde{F}(y)\big (1-\widetilde{F}(y)\big )\,dy=\widetilde{\mu }_{2}-\widetilde{\mu }_{1}=1, \end{aligned}$$
where \(\widetilde{F}\) is the d.f. of \(\widetilde{X}\) and \(\widetilde{\alpha }<\widetilde{\omega }\) are the endpoints of its support; thus, we may (and do) assume in what follows that X itself satisfies \(\int _{\alpha }^{\omega }F(1-F)=1\). Since \(F(y)(1-F(y))>0\) for \(y\in (\alpha ,\omega )\), and zero outside \([\alpha ,\omega )\), it follows that \(f_Y(y):=F(y)(1-F(y))\) defines a Lebesgue density of a random variable, say Y, supported in the (finite or infinite) interval \((\alpha ,\omega )\). By (3),
$$\begin{aligned} \mu _{j+1:n}-\mu _{j:n}=\begin{pmatrix}n\\ j \end{pmatrix}\textrm{IE}\big \{T^{j-1}(1-T)^{n-j-1}\big \}, \ \ \ j=1,\ldots ,n-1, \end{aligned}$$
where \(T:=F(Y)\) is a random variable taking values in the interval (0, 1) w.p. 1, because, by definition, \(\Pr (\alpha<Y<\omega )=1\). Hence,
$$\begin{aligned} \frac{(j+1)(n-j-1)(\mu _{j+2:n}-\mu _{j+1:n})}{\sum _{i=1}^{n-1}i(n-i)(\mu _{i+1:n}-\mu _{i:n})}=\textrm{IE}\left\{ \begin{pmatrix}n-2\\ j \end{pmatrix} T^{j} (1-T)^{n-2-j}\right\} , \ \ \ j=0,\ldots ,n-2, \end{aligned}$$
and we have shown the following
Proposition 1
If \(X_1,\ldots ,X_n\) are i.i.d. integrable non-degenerate r.v.’s, then there exists an r.v. T, with \(\Pr (0<T<1)=1\), such that
$$\begin{aligned} \frac{(j+1)(n-j-1)(\mu _{j+2:n}-\mu _{j+1:n})}{\sum _{i=1}^{n-1}i(n-i)(\mu _{i+1:n}-\mu _{i:n})}=\textrm{IE}\left\{ \begin{pmatrix}n-2\\ j \end{pmatrix} T^{j} (1-T)^{n-2-j}\right\} , \ \ \ j=0,\ldots ,n-2. \end{aligned}$$(5)
It is of interest to observe that the binomial moments of T appear on the r.h.s. of (5). Clearly, the r.v. T in this representation need not be unique; any other r.v. \(T'\) with \(\Pr (0<T'<1)=1\), possessing the same moments as T up to order \(n-2\), will fulfill the same relationship.
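As a numerical illustration (ours, not part of the paper), the following Python sketch checks (5) for a standard exponential parent, for which \(\mu _{j:n}=\sum _{i=n-j+1}^{n}1/i\); the choice \(T\sim \) Beta(2, 1) anticipates Remark 3 below, which pairs the exponential location-scale family with Beta(2, 1):

```python
import numpy as np
from math import comb

# Check (5) for X ~ Exp(1): exact EOS on the left-hand side, Monte Carlo
# binomial moments of T ~ Beta(2,1) on the right-hand side.
rng = np.random.default_rng(0)
n = 6
mu = [sum(1 / i for i in range(n - j + 1, n + 1)) for j in range(1, n + 1)]
denom = sum(i * (n - i) * (mu[i] - mu[i - 1]) for i in range(1, n))
T = rng.beta(2, 1, 400_000)
for j in range(n - 1):
    lhs = (j + 1) * (n - j - 1) * (mu[j + 1] - mu[j]) / denom
    rhs = comb(n - 2, j) * np.mean(T**j * (1 - T) ** (n - 2 - j))
    print(round(lhs, 4), round(rhs, 4))   # the two columns agree (up to MC error)
```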
Remark 1
For any integrable non-degenerate r.v. X with d.f. F we may define the r.v. T as in the proof of Proposition 1, that is, \(T = F(Y)\) where Y has density \(f_Y(y) = F(y)(1-F(y))/\lambda \) with \(\lambda = \int _{\alpha }^{\omega } F (1-F)\). It can be shown, using Lemma 4.1 in Papadatos (2001), that the d.f. of T is specified by
$$\begin{aligned} F_{T}(t)=\frac{1}{\lambda }\int _{(0,t]}u(1-u)\,dF^{-1}(u), \ \ \ 0<t<1. \end{aligned}$$(6)
Notice that \(\lambda = \int _{0}^1 (2t-1)F^{-1}(t) dt\) and, hence, the function \(F^{-1}\) determines uniquely the d.f. of T. Moreover, (6) shows that the entire location-scale family of X, \(\{c+\lambda X: \ c\in \textbf{R}, \ \lambda > 0\}\), is mapped to a single r.v. \(T\in (0,1)\). Provided that X has (finite or infinite) interval support, non-vanishing density f and differentiable inverse d.f. \(F^{-1}\), we conclude from (6) that a density of T is given by
$$\begin{aligned} f_{T}(t)=\frac{t(1-t)}{\lambda \, f\big (F^{-1}(t)\big )}, \ \ \ 0<t<1. \end{aligned}$$(7)
Next, we proceed to verify that the preceding procedure can be inverted, showing sufficiency of (5). To this end, we shall make use of the following lemma, which is of independent interest. A detailed proof is postponed to the appendix.
Lemma 1
Let T be an r.v. with d.f. \(F_T\) such that \(\Pr (0<T<1)=1\). Then, there exists a unique, non-degenerate, integrable r.v. X, satisfying
$$\begin{aligned} \textrm{IE}X=0 \ \ \hbox { and } \ \ \textrm{IE}X_{k+2:k+2}-\textrm{IE}X_{k+1:k+1}=\textrm{IE}T^{k}, \ \ \ k=0,1,2,\ldots , \end{aligned}$$(8)
where \(X_{k:k}=\max \{X_1,\ldots ,X_k\}\) with \(X_1,X_2,\ldots \) being i.i.d. copies of X. The inverse distribution function of X is given by
\(0<t<1\), where \(F_T(t-)=\Pr (T<t)\), \(\int _{1/2}^t du=-\int _{t}^{1/2} du\) for \(t<1/2\),
and I denotes an indicator function.
Remark 2
Any r.v. \(T\in (0,1)\) can be viewed as the expected order statistics generator of its corresponding r.v. X with inverse d.f. \(F_0^{-1}\) as in (9). This is so because the map \(T\rightarrow X\) (i.e., \(F_T\rightarrow F_0\equiv F_X\)), defined implicitly by Lemma 1, is one-to-one and onto from the space \(\mathcal{T}=\{T: \Pr (0<T<1)=1\}\) to \(\mathcal{H}=\{X: \textrm{IE}X=0, \ \textrm{IE}X_{2:2}=1\}\), where identically distributed r.v.’s are considered as equal. Its inverse is given by Remark 1 (with \(\lambda =1\), since \(X\in \mathcal{H}\)). In view of (13), below, it is the suitable (and unique) transformation that quantifies the characterization of Hoeffding (1953), stating that the sequence of expected order statistics characterizes the corresponding distribution. It also provides an explicit connection of the (infinite) sequence of expected order statistics to the Hausdorff (1921) moment problem; see Kadane (1971, 1974); Huang (1998); Kolodynski (2000); Papadatos (2017).
Remark 3
Suppose that the r.v. T of Lemma 1 is absolutely continuous with density \(f_T\). Assume also that the corresponding r.v. X (with \(\textrm{IE}X=0\), \(\textrm{IE}X_{2:2}=1\), inverse d.f. \(F_0^{-1}\) as in (9)) is absolutely continuous, admitting a non-vanishing density \(f_0\) in the (finite or infinite) interval support of X, and that \(F_0^{-1}\) is differentiable. Then (see Remark 1),
$$\begin{aligned} f_{T}(t)=\frac{t(1-t)}{f_{0}\big (F_{0}^{-1}(t)\big )}, \ \ \ 0<t<1. \end{aligned}$$
For example, if T is Beta(2, 2) then X is uniform in \((-3,3)\); if T is Beta(2, 1) then \(X=2\mathcal{E}-2\) where \(\mathcal{E}\) is standard exponential; if T is Beta(1, 2) then \(X=2-2\mathcal{E}\); if T is standard uniform then X is standard logistic with density \(f_0(x)=e^{-x}/(1+e^{-x})^2\), \(x\in \textbf{R}\); if T is degenerate with \(\Pr (T=\rho )=1\) then (9) shows that X is a two-valued r.v. with \(\Pr (X=-1/\rho )=\rho \), \(\Pr (X=1/(1-\rho ))=1-\rho \).
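These correspondences are easy to probe numerically. A minimal Monte Carlo sketch (ours), for the pairing “T standard uniform \(\leftrightarrow \) X standard logistic”, checks relation (8), which here reads \(\textrm{IE}X_{k+2:k+2}-\textrm{IE}X_{k+1:k+1}=\textrm{IE}T^{k}=1/(k+1)\):

```python
import numpy as np

rng = np.random.default_rng(1)
U = rng.random((500_000, 8))
X = np.log(U / (1 - U))       # standard logistic samples: F_0^{-1}(u) = log(u/(1-u))
mu = [np.max(X[:, :k], axis=1).mean() for k in range(1, 9)]   # mu_k = IE X_{k:k}
for k in range(6):
    print(round(mu[k + 1] - mu[k], 3), round(1 / (k + 1), 3))  # approximately equal
```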
The characterization for finite n reads as follows.
Theorem 1
Given n real numbers \(\beta _{1}<\cdots <\beta _{n}\), the following are equivalent.
(i) The \(\beta \)’s are EOS, that is, there exist i.i.d. integrable non-degenerate r.v.’s \(X_1,\ldots ,X_n\) such that \(\textrm{IE}X_{j:n}=\beta _{j}\), \(j=1,\ldots ,n\).
(ii) There exists an r.v. T, with \(\Pr (0<T<1)=1\), such that
$$\begin{aligned} \frac{(j+1)(n-j-1)(\beta _{j+2}-\beta _{j+1})}{\sum _{i=1}^{n-1}i(n-i)(\beta _{i+1}-\beta _{i})}=\textrm{IE}\left\{ \begin{pmatrix}n-2\\ j \end{pmatrix} T^{j} (1-T)^{n-2-j}\right\} , \ \ \ j=0,\ldots ,n-2. \end{aligned}$$(11)
(iii) There exists an r.v. T, with \(\Pr (0<T<1)=1\), such that
$$\begin{aligned} \frac{n-1}{\begin{pmatrix}n-1\\ k+1\end{pmatrix}\sum _{i=1}^{n-1}i(n-i)(\beta _{i+1}-\beta _i)} \sum _{j=k+1}^{n-1}(n-j)\begin{pmatrix}j\\ k+1\end{pmatrix}(\beta _{j+1}-\beta _j)=\textrm{IE}T^k, \ \ \ k=0,\ldots ,n-2. \end{aligned}$$(12)
Proof
The equivalence of (11) and (12) follows by a straightforward computation, while the implication (i)\(\Rightarrow \)(ii) is proved in Proposition 1. In order to verify (ii)\(\Rightarrow \)(i), assume that (11) is satisfied for some T with \(\Pr (0<T<1)=1\), and consider the r.v. X as defined in Lemma 1. Let \(\mu _{j:n}=\textrm{IE}X_{j:n}\) and \(\mu _k=\textrm{IE}X_{k:k}\). Then,
$$\begin{aligned} \mu _{j:n}=\sum _{k=j}^{n}(-1)^{k-j}\begin{pmatrix}k-1\\ j-1 \end{pmatrix}\begin{pmatrix}n\\ k \end{pmatrix}\mu _{k}, \ \ \ j=1,\ldots ,n; \end{aligned}$$(13)
see Mallows (1973); Arnold et al. (1992); David and Nagaraja (2003). It follows that
$$\begin{aligned} \mu _{j+2:n}-\mu _{j+1:n}=\sum _{k=j+1}^{n}(-1)^{k-j}\begin{pmatrix}k\\ j+1 \end{pmatrix}\begin{pmatrix}n\\ k \end{pmatrix}\mu _{k}, \ \ \ j=0,\ldots ,n-2. \end{aligned}$$
By a trivial application of the binomial theorem to \((1-T)^{n-2-j}\), and since \(\textrm{IE}T^k=\mu _{k+2}-\mu _{k+1}\), see (8), we obtain
$$\begin{aligned} \textrm{IE}\left\{ \begin{pmatrix}n-2\\ j \end{pmatrix} T^{j}(1-T)^{n-2-j}\right\} =\begin{pmatrix}n-2\\ j \end{pmatrix}\sum _{i=0}^{n-2-j}(-1)^{i}\begin{pmatrix}n-2-j\\ i \end{pmatrix}(\mu _{j+i+2}-\mu _{j+i+1}). \end{aligned}$$
Hence, for \(j=0,\ldots ,n-2\),
$$\begin{aligned} (j+1)(n-j-1)(\mu _{j+2:n}-\mu _{j+1:n})=n(n-1)\,\textrm{IE}\left\{ \begin{pmatrix}n-2\\ j \end{pmatrix} T^{j}(1-T)^{n-2-j}\right\} , \end{aligned}$$
and (11) implies that for some \(\lambda >0\),
$$\begin{aligned} \beta _{j+2}-\beta _{j+1}=\lambda \,(\mu _{j+2:n}-\mu _{j+1:n}), \ \ \ j=0,\ldots ,n-2. \end{aligned}$$
It follows by induction on j that \(\beta _{j}=\beta _{1}+\lambda (\mu _{j:n}-\mu _{1:n})\), and therefore, \((\beta _{j}-c)/\lambda =\mu _{j:n}\), with \(c=\beta _{1}-\lambda \mu _{1:n}\). Hence, the numbers \(\big ((\beta _{j}-c)/\lambda \big )_{j=1}^n\) are expected order statistics, and thus, the same is true for \(\beta \)’s.
Remark 4
The r.h.s. of (11) corresponds to a Binomial Mixture (of a particular form, since \(\Pr (T=0)=\Pr (T=1)=0\)). The necessary and sufficient condition (12) is always satisfied for \(n=2\) and \(n=3\). To see this, it suffices to check that if \(n=2\), then (12) holds whenever \(\Pr (T=c)=1\) with an arbitrary \(c\in (0,1)\), while if \(n=3\), then (12) is fulfilled when \(\Pr (T=(\beta _3-\beta _2)/(\beta _3 -\beta _1))=1\); indeed, for \(n=3\), (12) reduces to the single requirement \(\textrm{IE}T=2(\beta _3-\beta _2)/\lambda =(\beta _3-\beta _2)/(\beta _3-\beta _1)\). Hence, the true problem begins at \(n=4\).
3 Explicit characterization of sequences of expected order statistics by solving the truncated moment problem for finite open intervals
In this section, we obtain a precise characterization by invoking results from the truncated moment problem for finite intervals. The existing results are limited to compact intervals and are not applicable to our case, since, according to the characterization of Theorem 1, a suitable T lies in the open interval (0, 1) w.p. 1.
Definition 1
Given \(n\ge 4\) numbers \(\beta _1<\cdots <\beta _n\), let \({\varvec{\beta }}=(\beta _1,\ldots ,\beta _n)\) and define the vector \((\nu _k)_{k=0}^{n-2}={\varvec{\nu }}={\varvec{\nu }}({\varvec{\beta }})\) by
$$\begin{aligned} \nu _{k}:=\frac{n-1}{\begin{pmatrix}n-1\\ k+1\end{pmatrix}\lambda }\sum _{j=k+1}^{n-1}(n-j)\begin{pmatrix}j\\ k+1\end{pmatrix}(\beta _{j+1}-\beta _{j}), \ \ \ k=0,\ldots ,n-2, \end{aligned}$$(14)
where \(\lambda =\lambda ({\varvec{\beta }}):=\sum _{i=1}^{n-1}i(n-i)(\beta _{i+1}-\beta _i)>0\).
It is easily checked that \(1=\nu _0>\nu _1>\cdots>\nu _{n-2}>0\), and that the vector \(\varvec{\nu }\) is invariant under location-scale transformations on the \(\beta \)’s.
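For concreteness, here is a small Python sketch (ours; the function name `nu_vector` is not from the paper) of the map \({\varvec{\beta }}\mapsto {\varvec{\nu }}\) of Definition 1, as displayed in (14):

```python
from math import comb

def nu_vector(beta):
    """Map beta_1 < ... < beta_n to (nu_0, ..., nu_{n-2}); see (14)."""
    n = len(beta)
    d = [beta[i + 1] - beta[i] for i in range(n - 1)]              # spacings
    lam = sum((i + 1) * (n - i - 1) * d[i] for i in range(n - 1))  # lambda(beta)
    return [(n - 1) * sum((n - j) * comb(j, k + 1) * d[j - 1]
                          for j in range(k + 1, n))
            / (comb(n - 1, k + 1) * lam)
            for k in range(n - 1)]

# EOS of an Exp(1) sample of size 3 are (1/3, 5/6, 11/6); the associated T is
# Beta(2,1), so nu_1 should equal IE T = 2/3 (cf. Remarks 3 and 4).
print(nu_vector([1/3, 5/6, 11/6]))   # -> [1.0, 0.666...]
```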
According to Theorem 1, the \(\beta \)’s are EOS if and only if the \(\nu \)’s fulfill the truncated moment problem in the interval (0, 1). However, for the truncated moment problem, well-known results exist for a compact interval [a, b]; see, e.g., Theorem IV.1.1 of Karlin and Studden (1966) or Theorems 10.1, 10.2 in Schmüdgen (2017). In order to obtain the corresponding necessary and sufficient conditions for open intervals, we shall make use of the following
Theorem 2
(Richter-Tchakaloff Theorem; see Schmüdgen (2017), Theorem 1.24). Let \((\mathcal {X},\mathcal {F},\mu )\) be a measure space and V be a finite dimensional linear subspace of the space \(L^1_{\textbf{R}}(\mathcal {X},\mathcal {F},\mu )\) of real-valued \(\mu \)-integrable functions on \(\mathcal {X}\). Define the linear functional \(L_{\mu }\) by \(L_{\mu }(f):=\int f d\mu \), \(f\in V\). Then, there exists a measure \(\mu _0\) in \((\mathcal {X},\mathcal {F})\), supported on \(k\le \dim V\) points of \(\mathcal X\), such that \(L_{\mu _0}\equiv L_{\mu }\) on V, that is, \(\int f d\mu _0=\int f d\mu \) for all \(f\in V\).
A symmetric \(n\times n\) matrix A with real entries is positive definite (denoted by \(A \succ 0\)) if \({\varvec{x}}^T A{\varvec{x}}>0\) for all \({\varvec{x}}\in \textbf{R}^n\setminus \{{\varvec{0}}\}\), where \({\varvec{x}}^T\) denotes the transpose of a column vector \({\varvec{x}}\in \textbf{R}^n\). Similarly, A is positive semi-definite (or nonnegative definite) if \({\varvec{x}}^T A{\varvec{x}}\ge 0\) for all \({\varvec{x}}\in \textbf{R}^n\), and this is denoted by \(A\succeq 0\).
Definition 2
(Hankel matrices). Let \(n\in \{4,5,\ldots \}\), \(0\le \varepsilon < 1/2\), and consider the numbers \(\nu _k\) as in Definition 1.
(i) Case \(n=2m+2\): We define
$$\begin{aligned} A_0(\varepsilon ):=\Big (\nu _{i+j}\Big )_{i,j=0}^m, \ \ \ B_0(\varepsilon ):=\Big (\nu _{i+j+1}-\nu _{i+j+2} -\varepsilon (1-\varepsilon ) \nu _{i+j}\Big )_{i,j=0}^{m-1}, \end{aligned}$$(15)
and \(A_0:=A_0(0)\), \(B_0:=B_0(0)\).
(ii) Case \(n=2m+3\): We define
$$\begin{aligned} A_1(\varepsilon ):=\Big ( \nu _{i+j+1}-\varepsilon \nu _{i+j}\Big )_{i,j=0}^m, \ \ \ B_1(\varepsilon ):=\Big ((1-\varepsilon )\nu _{i+j}-\nu _{i+j+1}\Big )_{i,j=0}^{m}, \end{aligned}$$(16)
and \(A_1:=A_1(0)\), \(B_1:=B_1(0)\).
The notation \(A_0(\varepsilon )\) is used for convenience, although \(A_0(\varepsilon )\) does not depend on \(\varepsilon \). Notice that the matrices \(A_0(\varepsilon ), A_1(\varepsilon ), B_1(\varepsilon )\) are of order \(m+1\), while \(B_0(\varepsilon )\) is of order m. The following theorem contains our main result; compare with Mallows (1973).
Theorem 3
Let \(n\in \{4,5,\ldots \}\), \(\beta _1<\cdots <\beta _n\), and \((\nu _0,\ldots ,\nu _{n-2})\) as in Definition 1.
(i) If \(n=2m+2\), then the \(\beta \)’s are EOS if and only if \(A_0(\varepsilon )\succeq 0\) and \(B_0(\varepsilon )\succeq 0\) for some \(\varepsilon \in (0,1/2)\), where \(A_0(\varepsilon )\) and \(B_0(\varepsilon )\) are given by Definition 2(i).
(ii) If \(n=2m+3\), then the \(\beta \)’s are EOS if and only if \(A_1(\varepsilon )\succeq 0\) and \(B_1(\varepsilon )\succeq 0\) for some \(\varepsilon \in (0,1/2)\), where \(A_1(\varepsilon )\) and \(B_1(\varepsilon )\) are given by Definition 2(ii).
(iii) If \(n=2m+2\), the condition \(A_0\succ 0\) and \(B_0\succ 0\) is sufficient, but not necessary, for the \(\beta \)’s to be EOS. Similarly, if \(n=2m+3\), the condition \(A_1\succ 0\) and \(B_1\succ 0\) is sufficient, but not necessary, for the \(\beta \)’s to be EOS.
Note that if either \(A_0\) or \(B_0\) (\(A_1\) or \(B_1\), respectively) is not nonnegative definite, then \(\beta _1<\cdots < \beta _n\) are not the expectations of order statistics. This suggests the following verification procedure. We first analyze \(A_0\) and \(B_0\) (\(A_1\) and \(B_1\), respectively). If both are positive definite, then the \(\beta \)’s are EOS. If either of them is not nonnegative definite, then the \(\beta \)’s are not EOS. Otherwise we use Theorem 3(i) and (ii) for a more precise analysis.
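The following Python sketch (ours) implements this verification procedure using (14)–(16); note that the \(\varepsilon \)-sweep only samples a finite grid, so the boundary branch is a numerical approximation of “for some \(\varepsilon \in (0,1/2)\)”:

```python
import numpy as np

def hankel_pair(nu, eps=0.0):
    """A_i(eps), B_i(eps) of Definition 2 for nu = (nu_0, ..., nu_{n-2})."""
    n = len(nu) + 2
    if n % 2 == 0:                      # n = 2m + 2
        m = (n - 2) // 2
        A = np.array([[nu[i + j] for j in range(m + 1)] for i in range(m + 1)])
        B = np.array([[nu[i + j + 1] - nu[i + j + 2] - eps * (1 - eps) * nu[i + j]
                       for j in range(m)] for i in range(m)])
    else:                               # n = 2m + 3
        m = (n - 3) // 2
        A = np.array([[nu[i + j + 1] - eps * nu[i + j]
                       for j in range(m + 1)] for i in range(m + 1)])
        B = np.array([[(1 - eps) * nu[i + j] - nu[i + j + 1]
                       for j in range(m + 1)] for i in range(m + 1)])
    return A, B

def is_eos(beta, eps_grid=np.linspace(1e-6, 0.499, 2000), tol=-1e-12):
    """Three-step test: definite -> EOS; not PSD -> not EOS; else sweep eps."""
    nu = nu_vector(beta)                # from the sketch after Definition 1
    A, B = hankel_pair(nu)
    a, b = np.linalg.eigvalsh(A).min(), np.linalg.eigvalsh(B).min()
    if a > 0 and b > 0:
        return True
    if a < tol or b < tol:
        return False
    return any(np.linalg.eigvalsh(hankel_pair(nu, e)[0]).min() >= tol and
               np.linalg.eigvalsh(hankel_pair(nu, e)[1]).min() >= tol
               for e in eps_grid)
```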
Proof of Theorem 3 (i) and (ii): According to Theorems 10.1, 10.2 in Schmüdgen (2017), or Theorem IV.1.1 of Karlin and Studden (1966), the condition \(A_i(\varepsilon )\succeq 0\) and \(B_i(\varepsilon )\succeq 0\) (\(i=0\) or 1) is necessary and sufficient for \((\nu _k)_{k=0}^{n-2}\) to be a truncated moment sequence in the interval \([\varepsilon ,1-\varepsilon ]\).
Assume first that \(A_i(\varepsilon )\succeq 0\) and \(B_i(\varepsilon )\succeq 0\) for some \(\varepsilon \in (0,1/2)\). Since \(\nu _0=1\), any solution (=representing measure) \(\mu \) will be a probability measure. Equivalently, the r.v. T with d.f. \(F_T(x)=\mu \big ((-\infty ,x]\big )\) takes values in \([\varepsilon ,1-\varepsilon ]\subseteq (0,1)\) and satisfies \(\textrm{IE}T^k=\nu _k\), \(k=0,\ldots ,n-2\). From Theorem 1(iii) it follows that the \(\beta \)’s are EOS.
To prove necessity, assume that the \(\beta \)’s are EOS. From Theorem 1(iii) we can find an r.v. T with \(\textrm{IE}T^k=\nu _k\), \(k=0,\ldots ,n-2\), and \(\Pr (0<T<1)=1\). Let \(\mu _T\) be the probability measure of T and consider the probability space \((\mathcal {X},\mathcal {F},\mu ):= ((0,1),\mathcal {B},\mu _T)\), where \(\mathcal {B}\) is the Borel \(\sigma \)-field on (0, 1). Let V be the space of real polynomials \(f:(0,1)\rightarrow \textbf{R}\) of degree \(\le n-2\); obviously, V is a linear subspace of \(L^1_{\textbf{R}}(\mathcal {X},\mathcal {F},\mu _T)\) of dimension \(n-1\) (finite). Consider also the Riesz functional \(L_{\mu _T}:V\rightarrow \textbf{R}\) defined by \(L_{\mu _T}(f):=\int f d \mu _T=\sum _{k=0}^{n-2}a_k \nu _k\) for \(f(x)=\sum _{k=0}^{n-2}a_k x^k \in V\). From the Richter-Tchakaloff Theorem (see Theorem 2, above), there exists a measure \(\mu _0\), supported on at most \(n-1\) points of \(\mathcal {X}=(0,1)\), such that \(L_{\mu _0}\equiv L_{\mu _T}\) on V; in particular, \(\nu _k=\int _{(0,1)}x^k d {\mu _T}(x)=\int _{(0,1)}x^k d {\mu _0}(x)\), \(k=0,\ldots ,n-2\). Thus, \(\mu _0\) is a probability measure (\(\nu _0=1\)) supported on a finite number of points in (0, 1), possessing the same moments as \(\mu _T\) up to order \(n-2\). This means that \(\mu _0\) solves the truncated moment problem for \((\nu _k)_{k=0}^{n-2}\) in the interval \([t_1,t_{2}]\), where \(t_1\in (0,1)\) is the minimum supporting point of \(\mu _0\) and \(t_{2}\in (0,1)\) the maximum one. Choose \(\varepsilon >0\) such that \(\varepsilon <\min \{t_1,1-t_{2}\}\). Then, the sequence \((\nu _k)_{k=0}^{n-2}\) is the moment sequence of \(\mu _0\), supported in the interval \([\varepsilon ,1-\varepsilon ]\), and Theorems 10.1, 10.2 in Schmüdgen (2017) imply that \(A_i(\varepsilon )\succeq 0\) and \(B_i(\varepsilon )\succeq 0\) (\(i=0\) or 1).
(iii) First we prove sufficiency. Denote by \(\lambda _{\min }(M)\) (resp. \(\lambda _{\max }(M)\)) the smallest (resp. the largest) eigenvalue of a real symmetric matrix M. For the case \(n=2m+2\), the matrix \(A_0(\varepsilon )\) is independent of \(\varepsilon \), hence, \(A_0(\varepsilon )=A_0\succ 0\) by hypothesis. Moreover, \(B_0(\varepsilon )= B_0-\varepsilon (1-\varepsilon )M_0\) for some real symmetric matrix \(M_0\); see (15). Since \(\lambda _{\min }(B_0)>0\) by assumption, it follows that for any \({\varvec{x}}=(x_0,\ldots ,x_{m-1})^T\in \textbf{R}^{m}\), \({\varvec{x}}^T B_0(\varepsilon ) {\varvec{x}}={\varvec{x}}^T B_0 {\varvec{x}} -\varepsilon (1-\varepsilon ) {\varvec{x}}^T M_0 {\varvec{x}}\ge \big [\lambda _{\min }(B_0)-\varepsilon (1-\varepsilon )\lambda _{\max }(M_0)\big ]{\varvec{x}}^T {\varvec{x}}\ge 0\), if \(\varepsilon >0\) is sufficiently small. Hence, the sufficient condition (i), namely, \(A_0(\varepsilon )\succeq 0\) and \(B_0(\varepsilon )\succeq 0\) for some small \(\varepsilon >0\), is satisfied. Similarly, when \(n=2m+3\) we have \(A_1(\varepsilon )=A_1-\varepsilon M_1\) and \(B_1(\varepsilon )=B_1-\varepsilon M_1\) for some real symmetric matrix \(M_1\); see (16). From \(\lambda _{\min }(A_1)>0\), \(\lambda _{\min }(B_1)>0\), it follows that for any \({\varvec{x}}\in \textbf{R}^{m+1}\), \({\varvec{x}}^T A_1(\varepsilon ){\varvec{x}}={\varvec{x}}^T A_1 {\varvec{x}} -\varepsilon {\varvec{x}}^T M_1 {\varvec{x}}\ge \big [\lambda _{\min }(A_1) -\varepsilon \lambda _{\max }(M_1)\big ]{\varvec{x}}^T {\varvec{x}}\ge 0\), and \({\varvec{x}}^T B_1(\varepsilon ){\varvec{x}}\ge \big [\lambda _{\min }(B_1)-\varepsilon \lambda _{\max }(M_1)\big ]{\varvec{x}}^T {\varvec{x}}\ge 0\), provided \(\varepsilon >0\) is sufficiently small. Hence, the sufficient condition (ii), \(A_1(\varepsilon )\succeq 0\) and \(B_1(\varepsilon )\succeq 0\) for some small \(\varepsilon >0\), is satisfied. Therefore, in both cases, the condition (iii) is sufficient for the \(\beta \)’s to represent EOS.
Finally, we show that the condition (iii), namely \(A_i\succ 0\) and \(B_i\succ 0\) (\(i=0\) or 1), is not necessary. To this end, consider the sequence \(\beta _j:=\sum _{k=n+1-j}^n \begin{pmatrix}n \\ k\end{pmatrix}\), \(j=1,\ldots ,n\). Then, \(\beta _{j+1}-\beta _j=\begin{pmatrix}n\\ j\end{pmatrix}\) (\(j=1,\ldots ,n-1\)) and a straightforward computation yields \(\nu _k=2^{-k}\), \(k=0,\ldots ,n-2\); see (14). Suppose first that \(n=2m+2\) and let \({\varvec{x}}^T=(x_0,\ldots ,x_m)\in \textbf{R}^{m+1}\). Then, \({\varvec{x}}^T A_0 \varvec{x}=\big (\sum _{k=0}^m x_k/2^{k}\big )^2\), and since \(m\ge 1\), the matrix \(A_0\) is singular (hence, not positive definite). Similarly, for \({\varvec{x}}^T=(x_0,\ldots ,x_{m-1})\in \textbf{R}^{m}\), \({\varvec{x}}^T B_0 \varvec{x}=(1/4)\big (\sum _{k=0}^{m-1} x_k/2^{k}\big )^2\), which is positive definite if and only if \(m=1\) (\(n=4\)). On the other hand, \(A_0(\varepsilon )=A_0\succeq 0\) for all \(\varepsilon \in (0,1/2)\), while \({\varvec{x}}^T B_0(\varepsilon ) \varvec{x}=\big (1/4-\varepsilon (1-\varepsilon )\big ) \big (\sum _{k=0}^{m-1} x_k/2^{k}\big )^2\ge 0\) for small enough \(\varepsilon >0\). According to characterization (i), the given \(\beta \)’s are EOS, although the numbers \(\nu _k({\varvec{\beta }})\) (\(k=0,\ldots ,n-2\)) do not satisfy the condition \(A_0\succ 0\) and \(B_0\succ 0\).
Next, suppose that \(n=2m+3\). Then, \(A_1=B_1\) and it follows that \({\varvec{x}}^T A_1 \varvec{x}={\varvec{x}}^T B_1 \varvec{x}=(1/2) \big (\sum _{k=0}^{m} x_k/2^{k}\big )^2\), showing that \(A_1\) (and \(B_1\)) is singular and positive semi-definite. On the other hand, \({\varvec{x}}^T A_1(\varepsilon ) \varvec{x}={\varvec{x}}^T B_1(\varepsilon ) \varvec{x} =(1/2-\varepsilon ) \big (\sum _{k=0}^{m} x_k/2^{k}\big )^2\ge 0\), and (ii) shows that the \(\beta \)’s are EOS. Hence, although the numbers \(\nu _k({\varvec{\beta }})\) (\(k=0,\ldots ,n-2\)) do not satisfy the condition \(A_1\succ 0\) and \(B_1\succ 0\), the corresponding \(\beta _j\) are EOS.
It can be checked that the given \(\beta \)’s are the EOS from the two-valued r.v. X with \(\Pr (X=0)=\Pr (X=2^n)=1/2\).\(\square \)
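This example is convenient for testing: with the sketches above (assumptions as stated there), one may confirm both the moment sequence \(\nu _k=2^{-k}\) and the conclusion that the \(\beta \)’s are EOS via the \(\varepsilon \)-sweep branch:

```python
from math import comb

n = 7                                    # n = 2m + 3 with m = 2
beta = [sum(comb(n, k) for k in range(n + 1 - j, n + 1)) for j in range(1, n + 1)]
print(nu_vector(beta))   # -> [1.0, 0.5, 0.25, 0.125, 0.0625, 0.03125]
print(is_eos(beta))      # -> True (A_1, B_1 singular, yet A_1(eps), B_1(eps) >= 0)
```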
Remark 5
(a) Assume that for \(i=0\) or 1, \(A_i\succeq 0\), \(B_i\succeq 0\), and either \(\det A_i=0\) or \(\det B_i=0\) (or both). Then, the measure \(\mu =\mu _0\) is [0, 1]-determinate from its moments \((\nu _k)_{k=0}^{n-2}\); see Theorem 10.7 in Schmüdgen (2017). Hence, if this is the case, we can find \(\varepsilon \in (0,1/2)\) such that \(A_i(\varepsilon )\succeq 0\) and \(B_i(\varepsilon )\succeq 0\) if and only if the support of (the unique) \(\mu _0\) does not contain any of the endpoints 0 and 1.
(b) The finite supporting set of the discrete measure \(\mu _0\), constructed in the proof of Theorem 3, can be chosen to contain at most \(k\le n/2\) (rather than \(k\le n-1\)) points.
(c) Theorem 3 makes it possible, at least in principle, to calculate sharp upper and lower bounds on distribution functions in terms of expectations of order statistics (see Mallows, 1973).
Example 1
(The case \(n=4\)). Assume we are given \(\beta _1<\beta _2<\beta _3<\beta _4\). Since for \(n=4\) we have \(m=1\), the matrices \(A_0\), \(B_0\) are given by (see (14), (15)),
$$\begin{aligned} A_0=A_0(\varepsilon )=\begin{pmatrix} 1 &{} \nu _1\\ \nu _1 &{} \nu _2 \end{pmatrix}, \ \ \ B_0=\big (\nu _1-\nu _2\big ), \ \ \ \hbox {with } \nu _1=\frac{2(\beta _3-\beta _2)+3(\beta _4-\beta _3)}{\lambda }, \ \ \nu _2=\frac{3(\beta _4-\beta _3)}{\lambda }, \end{aligned}$$
with \(\lambda \) as in Definition 1. Hence, \(B_0\) is (trivially) positive definite, and by Sylvester’s criterion, \(A_0=A_0(\varepsilon )\) is positive semi-definite if and only if
$$\begin{aligned} \big (2(\beta _3-\beta _2)+3(\beta _4-\beta _3)\big )^2\le 3\lambda (\beta _4-\beta _3). \end{aligned}$$(17)
According to Theorem 3(i), the \(\beta \)’s are EOS if and only if (17) is satisfied. Based on (17) we immediately deduce that, e.g., the numbers \((0,\ 2,\ 5,\ 7)\) are EOS, while the numbers \((0,\ 2,\ 11,\ 13)\) are not. For the first set of numbers, the r.v. T in (11) or (12) is uniquely determined (in fact, \(T\equiv 1/2\)), because \({\varvec{\nu }}({\varvec{\beta }})=(1,1/2,1/4)=(1,\textrm{IE}T,\textrm{IE}T^2)\), showing that \(\text {Var}\,T=0\); see Remark 5(a) and (14) of Definition 1. Consequently, from Lemma 1 we conclude that the corresponding r.v. X, assuming the given expected order statistics, is also unique, namely, \(\Pr (X=-1/2)=\Pr (X=15/2)=1/2\); see Remarks 3, 2.
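Both deductions can be reproduced with the earlier sketches (ours):

```python
print(is_eos([0, 2, 5, 7]))    # -> True: equality in (17), T degenerate at 1/2
print(is_eos([0, 2, 11, 13]))  # -> False: nu_1^2 = 1/4 > 1/8 = nu_2
```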
Example 2
(The case \(n=5\)). It can be checked that for \(n=5\), the \(2\times 2\) matrices \(A_1\), \(B_1\), see (16), (14), are positive semi-definite if and only if \((\beta _3-\beta _2)(\beta _5-\beta _4) \ge \frac{1}{2}(\beta _4-\beta _3)^2\) and \((\beta _2-\beta _1)(\beta _4-\beta _3) \ge \frac{1}{2}(\beta _3-\beta _2)^2\). Moreover, if both inequalities are strict (case \((+,+)\)) then \(A_1\succ 0\), \(B_1\succ 0\), and Theorem 3(iii) shows that the \(\beta \)’s are EOS. If, however, one (or both) of the inequalities reduces to an equality, one has to check the condition (ii) of Theorem 3 in detail. For instance, if both matrices are singular (case (0, 0)), then \(A_1(\varepsilon )\succeq 0\) and \(B_1(\varepsilon )\succeq 0\) for \(0<\varepsilon <\min \{\beta _3-\beta _2,\beta _4-\beta _3\}/(\beta _4-\beta _2)\), and the \(\beta \)’s are again EOS. As an example of the (0, 0)-case consider the numbers \((0,\ 1,\ 5,\ 13,\ 21)\), representing the EOS from the (uniquely defined) r.v. X with \(\Pr (X=-1/10)=2/3\), \(\Pr (X=121/5)=1/3\). However, both cases \((0,+)\) (e.g., \((0,\ 9,\ 11,\ 13,\ 14)\)) and \((+,0)\) (e.g., \((0,\ 1,\ 3,\ 5,\ 14)\)) imply that the \(\beta \)’s are not EOS. To see this, assume that \(2 (\beta _3-\beta _2)(\beta _5-\beta _4) = (\beta _4-\beta _3)^2\) and \(2(\beta _2-\beta _1)(\beta _4-\beta _3)>(\beta _3-\beta _2)^2\); this is the case \((0,+)\). Then, \(B_1\succ 0\) (hence, \(B_1(\varepsilon )\succeq 0\) for small \(\varepsilon >0\)) and \(A_1\succeq 0\) with \(\det A_1=0\). It can be verified that for \({\varvec{x}}^T=(x_0,x_1):=(\beta _4-\beta _3,-(\beta _4-\beta _2))\), \({\varvec{x}}^T A_1(\varepsilon ){\varvec{x}}=-\varepsilon \Delta \), where \(\Delta >0\) depends only on the \(\beta \)’s, and thus, according to Theorem 3(ii), the \(\beta \)’s cannot be EOS. By the same reasoning, this is also true for the \((+,0)\)-case. Therefore, the complete characterization for \(n=5\) says that for the \(\beta \)’s to be EOS it is necessary and sufficient that either \(2 (\beta _2-\beta _1)(\beta _4-\beta _3) = (\beta _3-\beta _2)^2\) and \(2 (\beta _3-\beta _2)(\beta _5-\beta _4)= (\beta _4-\beta _3)^2\), or \(2 (\beta _2-\beta _1)(\beta _4-\beta _3) > (\beta _3-\beta _2)^2\) and \(2 (\beta _3-\beta _2)(\beta _5-\beta _4)> (\beta _4-\beta _3)^2\); that is, either both matrices \(A_1\), \(B_1\) are positive definite, or both are positive semi-definite and singular. We do not know if the situation is similar for odd values of \(n\ge 7\).
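The four sign patterns can again be checked numerically with the earlier sketches (ours; the last sequence is our own \((+,+)\) specimen, not from the paper):

```python
for beta in ([0, 1, 5, 13, 21],    # case (0,0): EOS
             [0, 9, 11, 13, 14],   # case (0,+): not EOS
             [0, 1, 3, 5, 14],     # case (+,0): not EOS
             [0, 1, 3, 6, 14]):    # case (+,+): EOS
    print(beta, is_eos(beta))
```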
Our final result characterizes the binomial mixtures for which the mixing distribution is supported in the open interval (0, 1) (cf. Wood, 1992, 1999). The proof, being an immediate application of Theorems 1, 3, is omitted.
Theorem 4
Let \({\varvec{p}}=(p_0,\ldots ,p_n)\) (\(n\ge 2\)) be a probability vector (\(p_i\ge 0\), \(\sum _{i=0}^n p_i=1\)) and \({\varvec{u}}={\varvec{u}}({\varvec{p}}) =(u_0,\ldots ,u_n)\), where
$$\begin{aligned} u_{k}:=\begin{pmatrix}n\\ k \end{pmatrix}^{-1}\sum _{j=k}^{n}\begin{pmatrix}j\\ k \end{pmatrix}p_j, \ \ \ k=0,\ldots ,n. \end{aligned}$$
If \(n=2m\) set
$$\begin{aligned} A(\varepsilon ):=\Big (u_{i+j}\Big )_{i,j=0}^m, \ \ \ B(\varepsilon ):=\Big (u_{i+j+1}-u_{i+j+2}-\varepsilon (1-\varepsilon )u_{i+j}\Big )_{i,j=0}^{m-1}, \end{aligned}$$
and if \(n=2m+1\) set
$$\begin{aligned} A(\varepsilon ):=\Big (u_{i+j+1}-\varepsilon u_{i+j}\Big )_{i,j=0}^m, \ \ \ B(\varepsilon ):=\Big ((1-\varepsilon )u_{i+j}-u_{i+j+1}\Big )_{i,j=0}^{m}. \end{aligned}$$
Then, the following are equivalent.
(i) \(A(\varepsilon )\succeq 0\) and \(B(\varepsilon )\succeq 0\) for some \(\varepsilon \) with \(0<\varepsilon <1/2\).
(ii) \({\varvec{p}}\in \text {Conv}[B_0]\), where
$$ B_0=\left\{ \left( \begin{pmatrix}n\\ j\end{pmatrix}p^j(1-p)^{n-j}\right) _{j=0}^n, \ 0<p<1 \right\} $$
is the open binomial probability curve (without its endpoints) and \(\text {Conv}[X]\) denotes the convex hull of \(X\subseteq \textbf{R}^{n+1}\).
(iii) There exists an r.v. V with \(\Pr (0<V<1)=1\) such that
$$ p_j=\textrm{IE}\left\{ \begin{pmatrix}n\\ j\end{pmatrix} V^j (1-V)^{n-j}\right\} , \ \ \ j=0,1,\ldots ,n. $$
(iv) \({\varvec{u}}\in \text {Conv}[M_0]\), where \(M_0=\left\{ (1,t,t^2,\ldots ,t^n), \ \ 0<t<1 \right\} \) is the open moment curve (without its endpoints).
(v) There exists an r.v. V with \(\Pr (0<V<1)=1\) such that
$$ u_k=\textrm{IE}V^k, \ \ \ k=0,1,\ldots ,n. $$
Let \({\varvec{x}}(t)=(1,t,t^2,\ldots ,t^n)\), \(0\le t\le 1\). A simple application of Theorem 4 shows that for \(n\ge 3\) (in contrast to the case \(n=2\)) and any \(t_0\in (0,1)\), the line segment \((1-\lambda ){\varvec{x}(0)}+\lambda {\varvec{x}(t_0)}\), \(0\le \lambda <1\), lies outside \(\text {Conv}[M_0]\). Hence, given \(\lambda _0,t_0\in (0,1)\), it follows from Farkas’ Lemma (see, e.g., Bertsimas and Tsitsiklis (1997), Theorem 4.6) that for any m, and any given collection \(\{t_0,\ldots ,t_m\}\subseteq (0,1)\), we can find a polynomial p with \(\deg (p)\le n\), such that \((1-\lambda _0)p(0)+\lambda _0 p(t_0)<0\) and \(p(t_i)\ge 0\), \(i=0,1,\ldots ,m\).
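A final numerical sketch (ours; the helper `u_vector` and the Beta(2, 3) mixing choice are illustrative assumptions) checks the transform \({\varvec{p}}\mapsto {\varvec{u}}\) of Theorem 4 on a beta-binomial probability vector, for which (iii) and (v) hold by construction:

```python
import numpy as np
from math import comb, gamma

def u_vector(p):
    """u_k = C(n,k)^{-1} sum_{j>=k} C(j,k) p_j, as in Theorem 4."""
    n = len(p) - 1
    return [sum(comb(j, k) * p[j] for j in range(k, n + 1)) / comb(n, k)
            for k in range(n + 1)]

def beta_fn(x, y):
    return gamma(x) * gamma(y) / gamma(x + y)

n, a, b = 6, 2.0, 3.0
# p_j = IE{ C(n,j) V^j (1-V)^{n-j} } for V ~ Beta(a,b): beta-binomial weights.
p = [comb(n, j) * beta_fn(a + j, b + n - j) / beta_fn(a, b) for j in range(n + 1)]
moments = [beta_fn(a + k, b) / beta_fn(a, b) for k in range(n + 1)]  # IE V^k
print(np.allclose(u_vector(p), moments))   # -> True
```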
References
Arnold, B.C., Balakrishnan, N. and Nagaraja, H.N. (1992). A first course in order statistics. John Wiley & Sons, New York.
Bertsimas, D. and Tsitsiklis, J.N. (1997). Introduction to linear optimization. Athena Scientific, Belmont, Massachusetts.
David, H.A. and Nagaraja, H.N. (2003). Order statistics (3rd ed.), John Wiley & Sons, Hoboken, New Jersey.
Hausdorff, F. (1921). Summationsmethoden und Momentfolgen. I. Math. Zeitschrift, 9(1), 74–109.
Hoeffding, W. (1953). On the distribution of the expected values of the order statistics. Ann. Math. Statist., 24(1), 93–100.
Huang, J.S. (1998). Sequences of expectations of maximum-order statistics. Statist. Probab. Lett., 38, 117–123.
Jones, M.C. and Balakrishnan, N. (2002). How are moments and moments of spacings related to distribution functions? J. Stat. Plann. Inference (C.R. Rao 80th birthday felicitation volume, Part I), 103, 377–390.
Kadane, J.B. (1971). A moment problem for order statistics. Ann. Math. Statist., 42, 745–751.
Kadane, J.B. (1974). A characterization of triangular arrays which are expectations of order statistics. J. Appl. Probab., 11, 413–416.
Karlin, S. and Studden, W. (1966). Tchebycheff systems with applications in analysis and statistics. Interscience, New York.
Kolodynski, S. (2000). A note on the sequence of expected extremes. Statist. Probab. Lett., 47, 295–300.
Mallows, C.L. (1973). Bounds on distribution functions in terms of expectations of order-statistics. Ann. Probab., 1, 297–303.
Papadatos, N. (2001). Distribution and expectation bounds on order statistics from possibly dependent variates. Statist. Probab. Lett., 54, 21–31.
Papadatos, N. (2017). On sequences of expected maxima and expected ranges. J. Appl. Probab., 54, 1144–1166.
Pearson, K. (1902). Note on Francis Galton’s problem. Biometrika, 1, 390–399.
Schmüdgen, K. (2017). The moment problem. Graduate Texts in Mathematics, 277, Springer.
Wood, G.R. (1992). Binomial mixtures and finite exchangeability. Ann. Probab., 20(3), 1167–1173.
Wood, G.R. (1999). Binomial mixtures: geometric estimation of the mixing distribution. Ann. Statist., 27(5), 1706–1721.
Funding
Open access funding provided by HEAL-Link Greece.
Appendix: A detailed proof of Lemma 1
Provided that (8) holds true, the uniqueness of X follows from the classical result of Hoeffding (1953), since the sequence \(\textrm{IE}X_{k:k}\) determines the triangular array \(\textrm{IE}X_{j:n}\), and vice-versa; see (13), (1). We proceed to verify that \(F_0^{-1}\) is indeed the distribution inverse of an integrable r.v. X satisfying (8). Left continuity of \(F_0^{-1}\) follows automatically from its definition. Moreover, using the obvious equality \(\int \frac{2u-1}{u^2(1-u)^2} du = \frac{1}{u(1-u)}+C\) for \(C\in \textbf{R},\) the function \(g(t):=F_0^{-1}(t)+c_T\) can be written as
showing that \(g(t)\ge 0\) for \(t\ge 1/2\) and \(g(t)\le 0\) for \(t\le 1/2\). For \(t_1<t_2\) with \(1/2\le t_1<t_2<1\) we have
Similarly, for \(t_1<t_2\) with \(0<t_1<t_2\le 1/2\),
Therefore, g, and hence \(F_0^{-1}\), is nondecreasing. In order to verify that \(F_0^{-1}\in L^1\), we shall calculate the integrals
In the following we shall repeatedly make use of the fact that for a nonnegative r.v. Y, \(\textrm{IE}Y=\int _{0}^{\infty }\Pr (Y>t)dt\) or \(\textrm{IE}Y=\int _{0}^{\infty }\Pr (Y\ge t)dt\), followed by subsequent applications of Tonelli’s theorem. Using (18) and Tonelli’s theorem we have
on noting that \(F_T(t)=F_T(t-)\) a.e. Considering the nonnegative r.v. \(Y=h(T)=T I(T\le u)\) it is easily seen that \(\Pr (Y>t)=F_T(u)-F_T(t)\) for \(t<u\), and the probability is zero for \(t\ge u\). Hence, \(\textrm{IE}Y=\int _0^u \big [ F_T(u)-F_T(t)\big ] dt\), and also, \(\textrm{IE}h(T)= \int _{(0,u]} t dF_T(t)\). Since these expectations are equal, we obtain
$$\begin{aligned} \int _{(0,u]} t\, dF_T(t)=u F_T(u)-\int _{0}^{u} F_T(t)\, dt. \end{aligned}$$
Substituting this equality into the double integral in \(J_1\) and interchanging once again the order of integration (since the integrand is nonnegative), we obtain
In order to evaluate the exact value of \(J_1\), it remains to express the integral \(\int _{0}^{1/2} F_T(t) dt\) in terms of integrals w.r.t. \(dF_T\).
For \(u\in (0,1)\), consider the nonnegative r.v. \(Y=h(T)=T I(T<u)\), for which \(\Pr (Y>t)=F_{T}(u-)-F_{T}(t)\) for \(t<u\), and zero otherwise. Then, \(\textrm{IE}Y=\int _{0}^{u} \big [F_{T}(u-)-F_{T}(t)\big ] dt =u F_T(u-)-\int _{0}^u F_T(t) dt\), and \(\textrm{IE}h(T)=\int _{(0,u)}t d F_T(t)\); thus,
$$\begin{aligned} \int _{(0,u)} t\, d F_T(t)=u F_T(u-)-\int _{0}^{u} F_T(t)\, dt, \ \ \ 0<u<1. \end{aligned}$$
Setting \(u=1/2\) we find
$$\begin{aligned} \int _{0}^{1/2} F_T(t)\, dt=\frac{1}{2} F_T\Big (\frac{1}{2}-\Big )-\int _{(0,1/2)} t\, dF_T(t), \end{aligned}$$
and finally, since \(\int _{(0,1/2]} \big (\frac{1}{1-t}-4t\big ) dF_T(t)= \int _{(0,1/2)} \big (\frac{1}{1-t}-4t\big ) dF_T(t)\) (because the integrand vanishes for \(t=1/2\)), we conclude that
Using (18) we rewrite \(J_2\) as
by Tonelli’s theorem. Substituting
in the inner integral (noting that both integrals represent the expectation of \(Y=(1-T)I(T>u)\)), and changing the order of integration, we arrive at
For \(u\in (0,1)\) consider the r.v. \(Y=(1-T)I(T\ge u)\), so that, \(\Pr (Y\ge y)=F_T(1-y)-F_T(u-)\) for \(y\le 1-u\), and zero otherwise. Then,
$$\begin{aligned} \textrm{IE}Y=\int _{0}^{1-u}\big [F_T(1-y)-F_T(u-)\big ]\, dy=\int _{u}^{1} F_T(t)\, dt-(1-u)F_T(u-), \end{aligned}$$
and this expectation is also equal to \(\int _{[u,1)}(1-t) d F_T(t)\). Hence,
$$\begin{aligned} \int _{[u,1)}(1-t)\, d F_T(t)=\int _{u}^{1} F_T(t)\, dt-(1-u)F_T(u-). \end{aligned}$$
Substituting \(u=1/2\) we obtain
and since \(\int _{(1/2,1)} \big (\frac{1}{t}-4(1-t)\big ) d F_T(t) =\int _{[1/2,1)} \big (\frac{1}{t}-4(1-t)\big ) d F_T(t)\), we conclude that
The preceding argument not only shows that \(F_0^{-1}\in L^{1}\), but also proves that
with \(c_T\) as in (10), and therefore, the expectation of the r.v. X with inverse d.f. \(F_0^{-1}=g-c_T\) is zero. Next, set \(\mu _k=\textrm{IE}X_{k:k}=k\int _0^1 t^{k-1}F_0^{-1}(t)dt\), \(k=1,2,\ldots \), and let \(R=F_T(\frac{1}{2}-)\). In view of (18), write \(\mu _k+c_T=\int _0^1 k t^{k-1} g(t) dt=I_2-I_1\) where
noting that the integrands may differ from the original ones (suggested by (18)) on sets of measure zero. Changing the order of integration, according to Tonelli’s theorem, we see that
Consider the expectation of \(Y=T^k I(T\le u)\). Since \(\Pr (Y>y)=F_T(u)-F_T(y^{1/k})\) for \(y<u^k\), and zero otherwise, we obtain
$$\begin{aligned} \int _{(0,u]} t^k\, dF_T(t)=u^k F_T(u)-k\int _{0}^{u} t^{k-1} F_T(t)\, dt. \end{aligned}$$
Thus,
Next, set \(Y=T^k I(T<u)\) (for \(0<u<1\)) with \(\textrm{IE}Y= \int _{(0,u)} t^k d F_T(t)\), and observe that \(\Pr (Y>y)=F_T(u-)-F_T(y^{1/k})\) for \(y<u^k\) (and zero otherwise), to conclude the identity
$$\begin{aligned} \int _{(0,u)} t^k\, d F_T(t)=u^k F_T(u-)-k\int _{0}^{u} t^{k-1} F_T(t)\, dt, \ \ \ 0<u<1. \end{aligned}$$
Applying this with \(u=1/2\) we obtain an explicit simple formula for \(I_1\):
Finally, in order to calculate \(I_2\), consider the auxiliary variable \(Y=(1-T^k)I(T>u)\) (with \(0<u<1\)), for which \(\Pr (Y\ge y)=F_T((1-y)^{1/k})-F_T(u)\) for \(y<1-u^k\), and zero otherwise. The alternative expressions for its expectation yield
$$\begin{aligned} \int _{(u,1)}(1-t^k)\, dF_T(t)=k\int _{u}^{1} t^{k-1} F_T(t)\, dt-(1-u^k)F_T(u). \end{aligned}$$
Hence,
and applying once again Tonelli’s theorem, we get
If we set \(Y=(1-T^k)I(T\ge u)\), we see that \(\Pr (Y\ge y)=F_T((1-y)^{1/k})-F_T(u-)\) for \(y\le 1-u^k\), and zero otherwise, obtaining
$$\begin{aligned} \int _{[u,1)}(1-t^k)\, dF_T(t)=k\int _{u}^{1} t^{k-1} F_T(t)\, dt-(1-u^k)F_T(u-). \end{aligned}$$
Applying this identity with \(u=1/2\) we conclude that
By the preceding calculations,
(observe that the r.h.s. equals \(c_T\) for \(k=1\), showing once again that \(\mu _1=0\)). Therefore, for \(k=0,1,\ldots \),
$$\begin{aligned} \mu _{k+2}-\mu _{k+1}=\textrm{IE}T^{k}, \end{aligned}$$
and the proof is complete.
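As an independent numerical cross-check of Lemma 1 (ours, not part of the proof): for degenerate \(T\equiv \rho \), Remark 3 gives the two-valued r.v. X with \(\Pr (X=-1/\rho )=\rho \), \(\Pr (X=1/(1-\rho ))=1-\rho \), and (8) can be verified in closed form:

```python
rho = 0.3

def mu(k):
    """IE max(X_1,...,X_k): the max is -1/rho iff all k copies equal -1/rho."""
    return (-1 / rho) * rho**k + (1 / (1 - rho)) * (1 - rho**k)

print(mu(1))                              # -> 0 (up to rounding): IE X = 0
for k in range(5):
    print(mu(k + 2) - mu(k + 1), rho**k)  # both sides of (8): equal
```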