Optimal evaluations for the bias of trimmed means of $k$th record values

We provide sharp upper and lower mean-variance bounds on the expectations of trimmed means of $k$th record values from a general family of distributions. We also improve these bounds in the case of non-trimmed means for parent distributions with decreasing density or decreasing failure rate. They can be viewed as bounds on the bias of approximating the expectation of the parent population by the mean or trimmed mean of record values. The results are illustrated with numerical examples.


Introduction
Let $\{X_n,\ n \ge 1\}$ be a sequence of i.i.d. random variables with continuous common cumulative distribution function $F$ with finite mean $\mu = \mathbb{E}X_1$. Let $\{Y^{(k)}_n,\ n \ge 1\}$ denote the sequence of $k$th record values of $\{X_n,\ n \ge 1\}$, defined by Dziubdziela and Kopociński (1976) as $Y^{(k)}_n = X_{U_k(n):U_k(n)+k-1}$, where $U_k(1) = 1$ and
$$U_k(n+1) = \min\bigl\{\, j > U_k(n) : X_{j:j+k-1} > X_{U_k(n):U_k(n)+k-1} \,\bigr\}, \qquad n \ge 1.$$
For $1 \le r \le n$ let
$$T^{(k)}_{r,n} = \frac{1}{n-r+1} \sum_{i=r}^{n} Y^{(k)}_i$$
denote the trimmed mean of $k$th record values; in particular, for $n \ge 2$,
$$S^{(k)}_n = T^{(k)}_{1,n} = \frac{1}{n} \sum_{i=1}^{n} Y^{(k)}_i$$
denotes the mean of the first $n$ $k$th record values. Assume that we are given not the observations $X_1, X_2, \ldots$, but only the $k$th record values, and we want to estimate the mean $\mu$ of the parent distribution $F$. To this aim we propose the use of $T^{(k)}_{r,n}$ and $S^{(k)}_n$. The approximation of $\mu$ by $T^{(k)}_{r,n}$ makes sense for $n \le k$, since then there is a chance that some records $Y^{(k)}_i$, $r \le i \le n$, have expected values less than $\mu$. The other values of $n$ and $k$ (i.e. $n > k$) are studied for completeness.
In order to study the quality of the approximation of $\mu$ by $T^{(k)}_{r,n}$, we evaluate the bias of $T^{(k)}_{r,n}$ expressed in standard deviation units. In other words, we seek sharp lower and upper bounds on
$$\frac{\mathbb{E}T^{(k)}_{r,n} - \mu}{\sigma}$$
for general $F$ with finite $\sigma$. The mean $S^{(k)}_n$ seems to be the most reasonable estimator of $\mu$ among $T^{(k)}_{r,n}$, $1 \le r \le n$, but since, as we show, left trimming improves the lower bounds, it also makes sense to study $T^{(k)}_{r,n}$ with $r > 1$. The first idea is to note that, by the obvious inequalities $\mathbb{E}Y^{(k)}_r \le \mathbb{E}T^{(k)}_{r,n} \le \mathbb{E}Y^{(k)}_n$, it may suffice to apply the upper bounds on $\mathbb{E}Y^{(k)}_n$ of Raqab (1997) or Raqab and Rychlik (2002), and the lower bounds on $\mathbb{E}Y^{(k)}_r$ of Goroncy and Rychlik (2011). But the bounds obtained this way are not optimal, since equality above holds iff $Y^{(k)}_r = Y^{(k)}_n$, which happens with probability 0. Numerical computations indicate that the smallest upper error is committed if $r = 1$. Therefore we improve the upper bounds on $(\mathbb{E}S^{(k)}_n - \mu)/\sigma$ for $F$ coming from the restricted families of distributions with decreasing density (DD) and with decreasing failure rate (DFR), defined by convex ordering of distributions.
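As an informal illustration of the quantities involved, the following Python sketch extracts $k$th record values from a sample according to the definition above and Monte-Carlo-estimates the standardized bias $(\mathbb{E}T^{(k)}_{r,n} - \mu)/\sigma$ for a standard exponential parent. The function names are ours, not from the paper, and the simulation is only a sanity check, not a substitute for the analytical bounds derived below.

```python
import random
import statistics

def kth_record_values(xs, k):
    """k-th record values of xs (Dziubdziela-Kopocinski): the successive
    strictly increasing values of X_{j:j+k-1}, the j-th smallest
    observation among the first j+k-1 ones, starting from j = 1."""
    records = []
    for j in range(1, len(xs) - k + 2):
        m = sorted(xs[:j + k - 1])[j - 1]  # j-th order statistic of X_1..X_{j+k-1}
        if not records or m > records[-1]:
            records.append(m)
    return records

def trimmed_mean_of_records(xs, k, r, n):
    """T^(k)_{r,n}: mean of Y^(k)_r, ..., Y^(k)_n (1-indexed), or None
    if fewer than n records occur within the sample xs."""
    recs = kth_record_values(xs, k)
    return statistics.mean(recs[r - 1:n]) if len(recs) >= n else None

if __name__ == "__main__":
    random.seed(0)
    k, r, n = 3, 1, 3                      # n <= k: records tend to sit below mu
    mu, sigma = 1.0, 1.0                   # standard exponential parent
    biases = []
    while len(biases) < 500:
        t = trimmed_mean_of_records(
            [random.expovariate(1.0) for _ in range(200)], k, r, n)
        if t is not None:
            biases.append((t - mu) / sigma)
    # E Y^(k)_i = i/k for Exp(1), so the standardized bias is about -1/3 here
    print("estimated standardized bias:", statistics.mean(biases))
```

For $k = 1$ the routine reproduces ordinary (upper) record values, since $X_{j:j}$ is the maximum of the first $j$ observations.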
For two distribution functions $F$ and $G$ we write $F \prec_c G$ if and only if $F^{-1}G$ is a convex function on the support of $G$. Then we write $F \in \mathrm{DD}$ if $F \prec_c U$, where $U$ is the cdf of the uniform distribution on $[0,1]$, and $F \in \mathrm{DFR}$ if $F \prec_c V$, where $V(x) = 1 - e^{-x}$, $x \ge 0$, is the cdf of the standard exponential distribution.
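These membership criteria are easy to check numerically for concrete distributions: $F \in \mathrm{DD}$ iff $F^{-1}$ is convex on $(0,1)$ (since $U$ is the identity cdf there), and $F \in \mathrm{DFR}$ iff $x \mapsto F^{-1}(1 - e^{-x})$ is convex. A minimal grid-based sketch (the helper name is ours):

```python
import math

def is_convex_on_grid(vals, tol=1e-9):
    """Convexity of equally spaced samples, tested via second differences."""
    return all(vals[i - 1] - 2 * vals[i] + vals[i + 1] >= -tol
               for i in range(1, len(vals) - 1))

ps = [i / 2000 for i in range(1, 2000)]              # grid in (0, 1)
xs = [i / 100 for i in range(1, 500)]                # grid in (0, 5)

# Exp(1): F^{-1}(p) = -log(1-p) is convex on (0,1), so the exponential is DD
assert is_convex_on_grid([-math.log(1 - p) for p in ps])

# Pareto F(x) = 1 - (1+x)^{-a}: F^{-1}(V(x)) = e^{x/a} - 1 is convex, so it is DFR
a = 2.0
assert is_convex_on_grid([math.exp(x / a) - 1 for x in xs])

# U(0,1): F^{-1}(V(x)) = 1 - e^{-x} is strictly concave, so the uniform is not DFR
assert not is_convex_on_grid([1 - math.exp(-x) for x in xs])
```

The three checks agree with the textbook facts that the exponential density is decreasing, the Pareto failure rate $a/(1+x)$ is decreasing, and the uniform failure rate is increasing.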
The bounds are derived by applications of the projection method. For its full explanation and numerous examples the reader is referred to the monograph of Rychlik (2001). The bounds valid in the general class of distributions are obtained by Moriguti's approach of greatest convex minorants, and the bounds in the restricted classes by making use of a slight generalization of the results of Danielak (2003). For an exhaustive review of results concerning bounds on expectations of $k$th record values and their functions, valid in general and restricted families of distributions, we refer to Section 1 of Goroncy and Rychlik (2011).
In Sect. 2 we recall the necessary results on the projection method. The key step in the application of these results is to determine the shapes of the functions to be projected. This is done in Sect. 3 with the aid of the variation diminishing property of the densities of $k$th record values from the uniform distribution. This property was stated and proved by Gajek and Okolewski (2003), but here we give a new simple proof based on Descartes' rule of signs. Section 4 contains the main results on bounds on $\mathbb{E}T^{(k)}_{r,n}$ for general distributions. In Sect. 5 we focus on restricted families of distributions. First, in Sect. 5.1, we need to generalize the results of Danielak (2003), where the projected functions $h$ satisfy $h(0) = 0$, while in our case we have $h(0) > 0$. The solution to this new problem is similar, but it cannot be simply inferred from the results of Danielak (2003). Then in Sects. 5.2 and 5.3 we present the main results on bounds on $\mathbb{E}S^{(k)}_n$ for DD and DFR distributions. Finally, in Sect. 6 we present numerical computations of the bounds obtained in this paper.

Auxiliary results on the projection method
Using the representation (valid only for continuous $F$)
$$\mathbb{E}T^{(k)}_{r,n} = \int_0^1 F^{-1}(x)\, g^{(k)}_{r,n}(x)\, dx, \qquad g^{(k)}_{r,n} = \frac{1}{n-r+1}\sum_{i=r}^n f^{(k)}_i,$$
where $f^{(k)}_i$ denotes the density of the $i$th $k$th record value from the uniform $U(0,1)$ distribution, we easily get
$$\mathbb{E}T^{(k)}_{r,n} - \mu = \int_0^1 \bigl(F^{-1}(x) - \mu\bigr)\, g^{(k)}_{r,n}(x)\, dx.$$
Since $g^{(k)}_{r,n}$ integrates to 1 and $F^{-1} - \mu$ integrates to 0, we get
$$\mathbb{E}T^{(k)}_{r,n} - \mu = \int_0^1 \bigl(F^{-1}(x) - \mu\bigr)\bigl(g^{(k)}_{r,n}(x) - 1\bigr)\, dx.$$
By the projection method and the Schwarz inequality,
$$\mathbb{E}T^{(k)}_{r,n} - \mu \le \int_0^1 \bigl(F^{-1}(x) - \mu\bigr)\bigl(\bar g^{(k)}_{r,n}(x) - 1\bigr)\, dx \le \bigl\Vert \bar g^{(k)}_{r,n} - 1 \bigr\Vert\, \sigma,$$
where $\bar g^{(k)}_{r,n}$ denotes the projection of $g^{(k)}_{r,n}$ onto the convex cone $C$ of nondecreasing functions on $[0,1]$. The bound is attained if
$$\frac{F^{-1}(x) - \mu}{\sigma} = \frac{\bar g^{(k)}_{r,n}(x) - 1}{\Vert \bar g^{(k)}_{r,n} - 1 \Vert}. \tag{1}$$
To derive lower bounds, Danielak and Rychlik (2003) used the symmetry of distributions of order statistics, but this symmetry no longer holds for record values. So the lower bound is derived analogously to the upper one, noting that
$$\mu - \mathbb{E}T^{(k)}_{r,n} = \int_0^1 \bigl(F^{-1}(x) - \mu\bigr)\bigl(1 - g^{(k)}_{r,n}(x)\bigr)\, dx \le \bigl\Vert \widetilde g^{(k)}_{r,n} + 1 \bigr\Vert\, \sigma,$$
where $\widetilde g^{(k)}_{r,n}$ denotes the projection of $-g^{(k)}_{r,n}$ onto $C$. The equality is attained if
$$\frac{F^{-1}(x) - \mu}{\sigma} = \frac{\widetilde g^{(k)}_{r,n}(x) + 1}{\Vert \widetilde g^{(k)}_{r,n} + 1 \Vert}. \tag{2}$$
This gives
$$-\bigl\Vert \widetilde g^{(k)}_{r,n} + 1 \bigr\Vert\, \sigma \le \mathbb{E}T^{(k)}_{r,n} - \mu \le \bigl\Vert \bar g^{(k)}_{r,n} - 1 \bigr\Vert\, \sigma. \tag{3}$$
Therefore it suffices to determine the projections of $g^{(k)}_{r,n}$ and $-g^{(k)}_{r,n}$ onto $C$. By Theorem 1 of Moriguti (1953), for any function $g : [0,1] \to \mathbb{R}$ its projection $\bar g$ onto $C$ is the right-hand derivative of the greatest convex minorant of the antiderivative of $g$. In our case $g$ is either $g^{(k)}_{r,n}$ or $-g^{(k)}_{r,n}$, so to determine the convexity regions of their antiderivatives we need to determine the monotonicity properties of $g^{(k)}_{r,n}$. This is done in Sect. 3, and the projections of $g^{(k)}_{r,n}$ and $-g^{(k)}_{r,n}$ are determined in Sect. 4.
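Moriguti's construction has a convenient discrete counterpart: on an equal-width grid, the right derivative of the greatest convex minorant of the antiderivative is exactly the least-squares nondecreasing (isotonic) approximation, computable by the pool-adjacent-violators algorithm. The following sketch is our own illustration, not code from the paper:

```python
import math

def moriguti_projection(g):
    """L2 projection of a step function (values g on an equal-width grid of
    [0,1]) onto nondecreasing functions, via pool-adjacent-violators; this is
    the discrete counterpart of taking the right derivative of the greatest
    convex minorant of the antiderivative of g (Moriguti 1953)."""
    blocks = []                              # (sum, count) of pooled values
    for v in g:
        s, c = v, 1
        while blocks and blocks[-1][0] / blocks[-1][1] >= s / c:
            ps, pc = blocks.pop()            # pool with the previous block
            s, c = s + ps, c + pc
        blocks.append((s, c))
    out = []
    for s, c in blocks:
        out.extend([s / c] * c)
    return out

# demo: the density g(x) = 4(-log(1-x))(1-x) of the 2nd 2nd-record value from
# U(0,1) increases and then decreases, so the projection flattens its right end
N = 10000
g = [4 * (-math.log(1 - (i + 0.5) / N)) * (1 - (i + 0.5) / N) for i in range(N)]
proj = moriguti_projection(g)
assert all(a <= b + 1e-12 for a, b in zip(proj, proj[1:]))   # nondecreasing
assert abs(sum(proj) - sum(g)) < 1e-6                        # mean preserved
print("||proj - 1|| =", math.sqrt(sum((p - 1) ** 2 for p in proj) / N))
```

The printed norm is a grid approximation of the upper bound $\Vert \bar g^{(k)}_{r,n} - 1 \Vert$ for $k = r = n = 2$; the exact values are derived analytically in Sect. 4.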
In Sect. 5 we apply the projection method in a more subtle way. We consider the Hilbert space $L^2_W$ of real functions on $[a,d)$ square integrable with weight $w = W'$, where $W : [a,d) \to \mathbb{R}$ is a fixed absolutely continuous distribution function. Writing $\hat h^{(k)}_n = g^{(k)}_{1,n} \circ W$, we get by the projection method and the Schwarz inequality
$$\mathbb{E}S^{(k)}_n - \mu \le \bigl\Vert P_W \hat h^{(k)}_n - 1 \bigr\Vert_W\, \sigma, \tag{4}$$
where $P_W \hat h^{(k)}_n$ is the projection of $\hat h^{(k)}_n$ onto the convex cone $C_W$. This bound is sharp, and the equality in (4) is attained if
$$\frac{F^{-1}(W(x)) - \mu}{\sigma} = \frac{P_W \hat h^{(k)}_n(x) - 1}{\Vert P_W \hat h^{(k)}_n - 1 \Vert_W}. \tag{5}$$
Since this time we approximate the functions $\hat h^{(k)}_n$ by convex functions, we need to study their convexity properties. This is also done in Sect. 3, and the projection $P_W \hat h^{(k)}_n$ is determined in Sect. 5.1.

Shapes of projected functions
To determine the exact shapes of the functions $g^{(k)}_{r,n}$ and $h^{(k)}_n$ it suffices to study the sign changes of their first and second derivatives. Since $g^{(k)}_{r,n}$ and $h^{(k)}_n$ are linear combinations of the densities $f^{(k)}_i$, to obtain the first and second derivatives of $g^{(k)}_{r,n}$ we use identities, easily verified by direct computations, which express these derivatives again as linear combinations of the $f^{(k)}_i$. They hold true for $n = 1$ and $n = 2$ as well if we adopt the convention $f^{(k)}_j \equiv 0$ for $j \le 0$. The sign changes of such combinations can be studied by the variation diminishing property (VDP) of $f^{(k)}_1, \ldots, f^{(k)}_n$ due to Gajek and Okolewski (2003).

Theorem 1
The number of zeros of any linear combination $\sum_{i=1}^n a_i f^{(k)}_i$ in $(0,1)$ does not exceed the number of sign changes in the sequence $a_1, \ldots, a_n$ of its coefficients. Moreover, the first and the last sign of the combination are the same as the signs of the first and the last nonzero coefficients, respectively.
This theorem was proved in Gajek and Okolewski (2003) using the notion of total positivity, but here we give a new, simpler proof of its first part based on Descartes' rule of signs (see Wang 2004; Komornik 2006, for its elementary proofs).
Proof Since $f^{(k)}_i(x) = \frac{k^i}{(i-1)!}[-\ln(1-x)]^{i-1}(1-x)^{k-1}$, the substitution $t = -\ln(1-x)$ shows that the number of zeros of $\sum_{i=1}^n a_i f^{(k)}_i$ in $(0,1)$ is the same as the number of zeros of the polynomial $P(t) = \sum_{i=1}^n b_i t^{i-1}$, with $b_i = a_i k^i/(i-1)!$, in $\mathbb{R}_+$. By Descartes' rule of signs this number does not exceed the number of sign changes in $b_1, \ldots, b_n$, which in turn is equal to the number of sign changes of $a_1, \ldots, a_n$.

Now, carefully analyzing the signs of the coefficients in the expansions of the first and second derivatives, we can determine the shapes of the functions $g^{(k)}_{r,n}$ and $h^{(k)}_n$, which will be of special interest to us.
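Theorem 1 is easy to illustrate numerically: with $f^{(k)}_i$ as in the proof above, sign changes of a combination counted on a fine grid never exceed the sign changes of its coefficients, and the boundary signs agree with the first and last nonzero coefficients. A quick check (helper names are ours):

```python
import math

def f_kn(x, k, n):
    """Density of the n-th k-th record value from U(0,1)."""
    return (k**n * (-math.log(1 - x))**(n - 1) * (1 - x)**(k - 1)
            / math.factorial(n - 1))

def sign_changes(seq, eps=1e-12):
    signs = [s for s in (((v > eps) - (v < -eps)) for v in seq) if s != 0]
    return sum(1 for a, b in zip(signs, signs[1:]) if a != b)

k, coeffs = 2, [1.0, -3.0, 1.5]           # coefficient signs: +, -, + (two changes)
grid = [i / 2000 for i in range(1, 2000)]
vals = [sum(a * f_kn(x, k, i + 1) for i, a in enumerate(coeffs)) for x in grid]

print("sign changes of combination:", sign_changes(vals))
print("sign changes of coefficients:", sign_changes(coeffs))
```

For this particular choice of coefficients the combination reduces, after the substitution $t = -\ln(1-x)$, to $(1-x)\,(6t^2 - 12t + 2)$, whose two positive roots confirm that the VDP bound is attained here.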
The functions $h^{(k)}_n$, $n \ge 2$, are increasing from $k/n$ and then decreasing to $0$; moreover, $h^{(k)}_2$ is concave–convex. Now we consider the case when $r \ge 2$.

General distributions
The bounds expressed in $\sigma$ units are well known and their derivation is rather standard; therefore we only sketch the most crucial details. If $k = 1$, then by Theorems 2 and 3 the functions $g^{(1)}_{r,n}$, $1 \le r < n$, are increasing; therefore $-g^{(1)}_{r,n}$ is decreasing, and it is obvious that its projection onto $C$ is the constant $-1$. The resulting lower bound is trivial, as it can be derived from the obvious inequalities $Y^{(1)}_1 \le \cdots \le Y^{(1)}_n$ together with $\mathbb{E}Y^{(1)}_1 = \mu$.

If $k \ge 2$, then by Theorems 2 and 3 the functions $g^{(k)}_{r,n}$ are nonnegative, increasing from $0$ and then decreasing to $0$, with the exception of $r = 1$, when $g^{(k)}_{1,n}$ starts from $k/n$. Therefore, denoting by $\theta$ the maximum point of $g^{(k)}_{r,n}$, the antiderivative $G^{(k)}_{r,n}$ is convex on $(0,\theta)$ and concave on $(\theta,1)$. If $r = 1$ and $n \le k$, then $G^{(k)}_{1,n}(x) \ge x$ on $[0,1]$, its greatest convex minorant is the identity, and the projection is the constant $1$, so the upper bound reduces to the trivial $\mathbb{E}S^{(k)}_n \le \mu$. In all the remaining cases, i.e. $k \ge 2$ and either $r = 1$, $n > k$, or $r \ge 2$, there exists a unique $\alpha_* = \alpha_*(k,r,n) \in (0,\theta)$ defined by the equation
$$g^{(k)}_{r,n}(\alpha_*) = \frac{1 - G^{(k)}_{r,n}(\alpha_*)}{1 - \alpha_*},$$
such that the projection of $g^{(k)}_{r,n}$ onto $C$ equals $g^{(k)}_{r,n}$ on $(0,\alpha_*)$ and the constant $g^{(k)}_{r,n}(\alpha_*)$ on $(\alpha_*,1)$. On the other hand, for $k \ge 2$ and $1 \le r < n$ there exists a unique $\alpha^* \in (\theta,1)$ defined by
$$g^{(k)}_{r,n}(\alpha^*) = \frac{G^{(k)}_{r,n}(\alpha^*)}{\alpha^*},$$
such that the projection of $-g^{(k)}_{r,n}$ onto $C$ equals the constant $-g^{(k)}_{r,n}(\alpha^*)$ on $(0,\alpha^*)$ and $-g^{(k)}_{r,n}$ on $(\alpha^*,1)$. Plugging these functions into (3) and calculating the respective norms, we conclude the proof of the following theorem.

Theorem 4 For any continuous cdf $F$ with finite mean $\mu$ and variance $\sigma^2$,
The bounds are attained in the limit by distributions satisfying (1) for the upper bounds, and (2) for the lower ones. These distributions have forms analogous to those in Raqab and Rychlik (2002) for the upper bounds and in Goroncy and Rychlik (2011) for the lower ones, and therefore their specification is omitted.

Remark 2
The above theorem can easily be generalized from $p = 2$ to arbitrary $p \in [1,\infty]$ to provide $p$-norm bounds on the bias of trimmed means of $k$th record values.
We close this section with numerical values of the upper and lower bounds for $1 \le r \le n \le 10$. In Tables 1, 2 and 3 we present the values of the upper bounds $B^{(1)}_{r,n}$, $B^{(2)}_{r,n}$ and $B^{(3)}_{r,n}$. Table 4 contains exemplary values of the lower bounds $B^{(3)}_{r,n}$. The values in bold correspond to $r = n$, i.e. to single $k$th record values. They were obtained by Nagaraja (1978) for $k = 1$ (Table 1), by Raqab (1997) for $k \ge 2$ (Tables 2 and 3), and by Goroncy and Rychlik (2011) (Table 4).
In Table 1 we observe that for ordinary record values (i.e. for $k = 1$) the approximation of $\mu$ by trimmed means is very poor (even for $r = 1$), but as Tables 2 and 3 show, it improves significantly for $k = 2$ or $k = 3$. Also, in general the upper bounds become smaller as $k$ increases, and they become larger as either $r$ or $n$ increases. The lower bounds become larger as $k$ increases, and they decrease as $r$ or $n$ increases. These relations reflect elementary inequalities satisfied by the trimmed means $T^{(k)}_{r,n}$.

In the following we treat only the cases $W = U$ and $W = V$ in the notation of Sect. 2, although the analysis might be generalized to the case of the generalized Pareto distribution $W_\alpha$ with arbitrary $\alpha > -1/2$ (see Bieniek 2008a). First we need to solve another projection problem.

Projection problem
The projection $P_W h$ can be determined for $h \in L^2_W$ satisfying the following set of conditions, referred to as (A) below. The solution follows that of the problem of Danielak (2003), who considered the case $h(a) = 0$. First we describe the shape of the projection, and then we determine its exact parameters.
for some $\alpha \ge 0$ and $a \le y \le d$. Then for every $g \in C_W$ there exists $g^* \in C^*_W$ such that $\Vert h - g \Vert \ge \Vert h - g^* \Vert$.

Proof
Assume that $h(a) > 0$, and let $\theta \in (c,d)$ be the unique point on the decrease interval of $h$ such that $h(\theta) = h(a)$. If there is no such point, then the results of Danielak (2003) are applicable after a simple shift. It suffices to consider only functions $g$ for which $0 < g(a) < h(a)$, and two cases. In case (i) the functions $g$ and $h$ cross each other at a point $\delta \in (\theta,d)$; then the constant function equal to $h(\delta)$ is closer to $h$ than $g$ in the $L^2_W$ norm. In case (ii) $g$ and $h$ cross at a point of $(b,\theta)$; let $l_\delta$ be the straight line passing through that crossing point and tangent to $h$ at some $\delta \in (a,b)$. Then one checks that the resulting $g^*$ is closer to $h$ than $g$.
The proof in the remaining cases follows exactly the proof of Lemma 3 of Danielak (2003).
The rest of the solution of this projection problem is the same as that of Danielak (2003). Namely, for $h$ satisfying (A) we define the relevant auxiliary quantities, and then the shape of $P_W h$ depends on the behavior of the functions $K_W$ and $L_W$, defined for $y \in [a,b]$.

Proposition 1 Let
Remark 3 If the projection is a linear function of the form (8), then $\bar\beta \ge 0$. For if $\bar\beta < 0$, then $\bar\alpha > 0$, and (8) would be negative in a neighbourhood of $a$. But then the function $[\bar\alpha(x-a)+\bar\beta]_+$ would be a better approximation of $h$ than (8), which is a contradiction.
The following simplified version of Lemma 4 of Gajek and Rychlik (1998) is very useful in the study of the function $L_W$.

Lemma 2 If $K_+ = (a,v)$ and $L_W$ has a finite number of zeros, then $L_W$ is either positive, or negative, or it changes sign from $-$ to $+$ in $K_+$.

Distributions with decreasing density
Now we consider the case $W = U$. Then $h^{(k)}_n$ satisfies (A) for all $n, k \ge 2$. First we calculate the functions $\alpha^*(y)$, $K_U$ and $L_U$ and the parameters $\bar\alpha$ and $\bar\beta$. We make use of identities which can be checked by direct computations; they are also special cases of Eqs. (12) and (13) of Bieniek (2008b). After detailed computations this gives expressions with coefficients $\alpha_i$ and $\beta_i$, $1 \le i \le n$. To calculate the values of $\bar\alpha$ and $\bar\beta$ it suffices to know the value of one integral, for which we have used (12). We also need the following analytical lemma, which will be helpful in the statement and the proof of the next theorem. The proof of the lemma is given in the "Appendix".
In our notation, the statement of Lemma 3 can be rephrased in terms of the quantities introduced above. Now we may state our main result on bounds on expectations of $S^{(k)}_n$ for distributions with decreasing density. Note that for $k = 1$, by Theorem 2(a), the upper bounds on $\mathbb{E}S^{(1)}_n$ given in Theorem 4 are attained by distributions with decreasing density, and hence are optimal in the DD family as well. Therefore in the rest of this subsection we assume that $k \ge 2$.
Theorem 5 Fix any $F \in \mathrm{DD}$ with finite mean $\mu$ and variance $\sigma^2$, and $n, k \ge 2$.
If $\frac nk + 2\bigl(\frac{k}{k+1}\bigr)^n \le 2$, then $\mathbb{E}S^{(k)}_n \le \mu$. Otherwise, i.e. if
$$\frac nk + 2\Bigl(\frac{k}{k+1}\Bigr)^n > 2 \quad\text{and}\quad \frac nk + \Bigl(\frac{k}{k+1}\Bigr)^n > \frac{11}{6} \quad\text{and}\quad \frac nk + 3\Bigl(\frac{k}{k+1}\Bigr)^n > \frac52$$
and $L_U(v) > 0$, the bound (16) holds, where $y_*$ is the unique solution to $L_U(y) = 0$ in $(0,v)$. The bound is attained in the limit by the distributions determined by (5).

Proof First we note that the sequence $\alpha_1, \ldots, \alpha_{n-1}$ is strictly decreasing, with $\alpha_{n-1} < 0$ for $k = 2, 3, \ldots$, while $\alpha_n \ge 0$ and $K_U(b) < 0$. If $\alpha_1 \le 0$, then by Theorem 1 the function $K_U$ is $-+$ ($-$ if $k = 2$) on $(0,1)$, so it is negative on $(0,b)$, $K_+ = \emptyset$, and $P_U h^{(k)}_n$ is a linear function. If $\alpha_1 > 0$, then $K_U$ is either $+-+$ or $+$ ($+-$ if $k = 2$) on $(0,1)$. But $K_U(b) < 0$, so $K_U$ cannot be positive; hence in both cases it is $+-$ on $(0,b)$, and $K_+ = (0,v)$, where $v$ is the smallest solution to $K_U(v) = 0$ on $(0,1)$, so to determine $P_U h^{(k)}_n$ we need to study the behaviour of the function $L_U$. Now $\beta_n = 0$ if $k = 2$ and $\beta_n < 0$ for $k \ge 3$, while $\beta_{n-1} > 0$ for $k \ge 2$, and routine calculations show that the sequence $\beta_1, \ldots, \beta_n$ is either decreasing or first increasing and then decreasing. Therefore $L_U$ is either $-+-$ or $-+$ or $+-$ or $+$ or $-$ on $(0,1)$, depending on the value of $k$ and the sign of $\beta_1$.
Namely, if $\beta_1 \ge 0$, then by Lemma 2 the function $L_U$ has to be positive on $(0,v)$, again $K_+ = \emptyset$, and $P_U h^{(k)}_n$ is a strictly increasing linear function. If $\beta_1 < 0$, then, using Lemma 2 again, $L_U$ is either $-+$ or $-$ on $(0,v)$, and the shape of $P_U h^{(k)}_n$ depends on whether $L_U(v) \le 0$ or $L_U(v) > 0$. In the former case $L_U$ is negative on $(0,v)$, again $K_+ = \emptyset$, and the projection is linear. In the latter case $L_U$ has a unique zero $y_*$ in $(0,v)$, and the projection is of the form described above. If (18) holds, then the projection is the linear function of the form (8); if additionally $\bar\alpha \le 0$, then the projection is constant, equal to 1. We now analyze when (17) or (18) holds, depending on $n$ and $k$. First we assume that $\frac nk + 2(\frac{k}{k+1})^n \le 2$, i.e. $\bar\alpha \le 0$. If $n \le k$, then Lemma 3(a) implies that $\alpha_1 \le 0$, and if $n > k$, then Lemma 3(b) implies that $\beta_1 \ge 0$. In both cases (18) holds, and since $\bar\alpha \le 0$ we get that $P_U \hat h^{(k)}_n = 1$. If we assume that $\frac nk + 2(\frac{k}{k+1})^n > 2$, i.e. $\bar\alpha > 0$, then (18) is equivalent to (13), and (17) is equivalent to (15). Now the detailed statements of the theorem follow after calculation of the norms of the projections, and the forms of the distributions for which the equalities hold follow from (5).
We conclude this subsection with a short discussion on the conditions of Theorem 5.
Remark 4 1. If $n \le k$, then by Lemma 3(a) we get $\frac nk + 2(\frac{k}{k+1})^n \le 2$, so by Theorem 5 we have $\mathbb{E}S^{(k)}_n \le \mu$. This agrees with the thesis of Theorem 4 on bounds for general distributions, since the bounds in the DD family cannot be greater than the general bounds. 2. If instead of the second inequality in (15) we assume (19), then obviously $\beta_1 < 0$, and by Lemma 3(c) it follows that $\bar\alpha > 0$, so $n \ge k$, and by Lemma 3(d) we get $\alpha_1 > 0$. Moreover, otherwise the projection would be a linear function with $\bar\beta < 0$, which is impossible by Remark 3. So (19) suffices to claim that (15) holds; this is the case, for instance, for $n = 11, 12, 13$ if $k = 5$. Moreover, numerical calculations show that if $k = 3$ and $n = 6$, then $\alpha_1 > 0$, $\beta_1 < 0$ and $L_U(v) < 0$, while if $k = 3$ and $n = 7$, then $\alpha_1 > 0$, $\beta_1 < 0$ and $L_U(v) > 0$, so both possibilities do occur. This is contrary to the order statistics case, where the respective $\beta_1 < 0$ excludes the case of a linear projection.
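The first threshold of Theorem 5 is easy to explore numerically. The sketch below (the helper name is ours) checks that $n \le k$ always forces $\frac nk + 2(\frac{k}{k+1})^n \le 2$, consistently with Remark 4, item 1, and locates, for $k = 3$, the smallest $n$ for which the condition starts to hold:

```python
def alpha_bar_positive(n, k):
    """First condition of Theorem 5: n/k + 2*(k/(k+1))**n > 2."""
    return n / k + 2 * (k / (k + 1)) ** n > 2

# n <= k always puts us in the trivial case E S^(k)_n <= mu (Remark 4, item 1)
assert all(not alpha_bar_positive(n, k)
           for k in range(2, 21) for n in range(2, k + 1))

# smallest n for which the condition starts to hold, for k = 3
n_min = next(n for n in range(2, 100) if alpha_bar_positive(n, 3))
print("k = 3: condition holds from n =", n_min)  # n = 5
```

Note that this checks only the first inequality of the theorem; the remaining conditions involve $L_U$, for which no closed-form shortcut is given here.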

Distributions with decreasing failure rate
Now we consider the case $W = V$ and the corresponding functions $\hat h^{(k)}_n$. Analyzing the sign changes of the coefficients of the above expansions, we prove the following lemma.
Lemma 4 (a) Let $k = 1$. The function $\hat h^{(1)}_2$ is linear and increasing, and the functions $\hat h^{(1)}_n$, $n \ge 2$, are increasing from $k/n$ and then decreasing to $0$. Therefore the functions $\hat h^{(k)}_n$ satisfy the conditions (A) for all $k \ge 2$ and $n \ge 2$. Next we need to calculate the functions $\alpha^*$, $K_V$, $L_V$ and the parameters $\bar\alpha$ and $\bar\beta$. We use identities (see, e.g., Bieniek 2008a, "Appendix") which easily imply the required expansions. To compute $\bar\alpha$ and $\bar\beta$ it suffices to calculate the value of one integral.

Now we may present the main result of this subsection, on bounds on $\mathbb{E}S^{(k)}_n$ valid for DFR distributions. In the DFR case the conditions on $n$ and $k$ have a more explicit form. We exclude the case $k = 1$ for the same reason as in the DD case: the upper bounds on $\mathbb{E}S^{(1)}_n$ for general distributions are attained by DFR distributions and hence are optimal in the DFR family as well.
Theorem 6 Fix any $F \in \mathrm{DFR}$ with finite mean $\mu$ and variance $\sigma^2$, and $n, k \ge 2$.
If $2 \le n \le 2k-1$, then $\mathbb{E}S^{(k)}_n \le \mu$. If $n > 2k-1$ and (21) fails, where $v$ is the smallest positive solution to $K_V(v) = 0$, then the bound (20) holds, and it is attained by an exponential distribution of an appropriate form. Otherwise the bound (22) holds, where $y_*$ is the unique solution to $L_V(y) = 0$ in $(0,v)$; it is attained in the limit by the distributions determined by (5).

Proof The proof is similar to the proof of Theorem 5, so we only describe the most essential steps here. First note that the sequence $\gamma_1, \ldots, \gamma_{n-1}$ is decreasing, with $\gamma_{n-1} = \frac{3}{2k^2} - 1 < 0$ and $\gamma_n > 0$ (for $k \ge 2$). Therefore, if $\gamma_1 \le 0$, then $K_V$ is $-+$ on $(0,\infty)$, and since $K_V(b) < 0$, the function $K_V$ is negative on $(0,b)$, so $K_+ = \emptyset$ and the projection is linear.
We can check that for $k \ge 1$ we have $0 < n_{\delta_1} < n_\alpha < n_\gamma < n_{\delta_2}$. Therefore for $n \le n_{\delta_1}$ we have $\gamma_1 \le 0$, so the projection is linear, and since $\bar\alpha \le 0$, it is constant, equal to 1. For $n_{\delta_1} < n \le n_\alpha$ we have $\delta_1 \ge 0$, so the projection is linear, and since $\bar\alpha \le 0$, it is again constant. For $n_\alpha < n \le n_{\delta_2}$ we have $\delta_1 \ge 0$ but $\bar\alpha > 0$, so the projection is a strictly increasing linear function. Finally, for $n > n_{\delta_2}$ we have $\gamma_1 > 0$ and $\delta_1 < 0$, and verification of the sign of $L_V(v)$ is needed.
Remark 5 If $n > 4k-1$, then condition (21) holds. Indeed, if $L_V(v)$ were negative, then the projection $P_V \hat h^{(k)}_n$ would be a linear function with $\bar\beta < 0$, which is impossible by Remark 3. Also, similarly to the DD case, we can find values of $n \le 4k-1$ for which $\gamma_1 > 0$ and $\delta_1 < 0$ but $L_V(v) \le 0$, so that $P_V \hat h^{(k)}_n$ is linear.

Numerical results
The results of the previous sections allow numerical implementation. In Table 5 we compare the lower and upper general bounds $-B^{(k)}_{1,n}$ and $B^{(k)}_{1,n}$ with the bounds $C_U(n,k)$ and $C_V(n,k)$ in the restricted families, for $k = 2$ and $k = 3$ and $1 \le n \le 10$. The values of $B^{(3)}_{1,n}$ are more accurate values of the numbers in the first row of Table 4. The values in bold are obtained from the simple formulae (14) and (20), and the remaining positive values of $C_U(n,k)$ and $C_V(n,k)$ are obtained from the more involved formulae (16) and (22).
Obviously the bounds obtained in the DFR case are tighter than in the DD case, since the class DFR is narrower than DD. We also see that the gain obtained by restricting attention to DD or DFR distributions is smaller than that obtained by considering $k$th record values with larger values of $k$.