1 Introduction, Preliminaries and the Data Analysis

Garlic (Allium sativum) is a year-round crop with many medicinal properties, grown in moderate climates. The garlic plant cannot withstand extreme temperatures. Exposure of dormant cloves or young plants to temperatures of around 20 \(^{\circ }\)C or lower for a period of time hastens subsequent bulbing. In dry weather, with the increase in evaporation rate during the Indian summer, plant growth may be substantially affected. The maximum summer temperature can be as high as 47 \(^{\circ }\)C in Jharkhand, India.

In the first study to assess crop yield, one hundred garlic clove seedlings were planted in an experimental plot at the Indian Statistical Institute Giridih farm in Jharkhand on 12 February 2014, in the winter season. The topsoil of the plot had eroded; the plot is part of a barren tract with sandy soil, which was mixed with ‘dhoincha’ (Sesbania bispinosa) plant compost manure to make survival of plants easier in the infertile land. There were ten rows with ten plants in each row; the plant to plant distance was 15 cm and the distance between rows was 30 cm. A small amount of vermicompost manure was also provided in the experimental plot. Out of 100 plantings, 85 resulted in healthy garlic plants having positive yields on maturity. The remaining 15 plants gave no yield (there was a typo in the number of healthy plants mentioned in an earlier report (Dasgupta 2015)).

A crop like garlic can be worthwhile to cultivate in the harsh environment of Giridih, Jharkhand, if adequate fertilizers (e.g., DAP, organic manure) are administered and additional care, such as regular irrigation and loosening of the soil near the plants, is taken.

In a follow-up study undertaken in the subsequent year, growth is seen to improve as a result of advancing the cultivation period, i.e., planting the seedlings in early winter. At the same time, the other concerns, such as land fertility and plant care, were also attended to in order to increase yield.

Garlic bulbs are usually divided into numerous fleshy sections called cloves. The numbers of cloves produced in the different years are given in Table 1. Since the yield depends on the number of cloves in each bulb, it is of interest to study the distribution of the number of cloves over bulbs.

In Fig. 1 we plot the observed cumulative frequency distribution \(nF_n(x)\) of the number of cloves in \(\log \)-\(\log \) scale. An approximately linear relationship suggests the following model for the c.d.f.

$$\begin{aligned} F(x)=(x/\theta )^{\alpha }, \alpha >0, x=1,2,3,\ldots , \theta \end{aligned}$$
(1)

For the observed data in the year 2013–2014, slope and intercept of the least square linear fit are 0.06912 and 4.34809 respectively, with correlation coefficient \(r=0.9847.\)

One may take \(\hat{\theta }=x_{(n)}=4,\) maximum of the observations, and \(\hat{\alpha }= 0.06912,\) slope of the least square fitted line.

One may estimate the value of \(\theta \) from the intercept of the regression line as well. The value \(\tilde{\theta }=3.93\) is close to the m.l.e. \(\hat{\theta }=x_{(n)}=4.\)
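The link between the intercept and \(\tilde{\theta }\) can be made explicit. Under model (1), \(nF_n(x)\approx n(x/\theta )^{\alpha },\) so \(\log \{nF_n(x)\}=\alpha \log x+(\log n-\alpha \log \theta )\) and the intercept can be inverted for \(\theta .\) The following is a minimal sketch, assuming natural logarithms and \(n=85\) bulbs for 2013–2014; the function names and the noise-free illustrative counts are hypothetical and not taken from the original analysis.

```python
import math

def loglog_least_squares(xs, cum_counts):
    """Ordinary least squares of log(cumulative count) on log(x)."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(c) for c in cum_counts]
    mx, my = sum(lx) / len(lx), sum(ly) / len(ly)
    sxx = sum((a - mx) ** 2 for a in lx)
    sxy = sum((a - mx) * (b - my) for a, b in zip(lx, ly))
    slope = sxy / sxx
    return slope, my - slope * mx

def theta_from_intercept(n, slope, intercept):
    """Invert intercept = log(n) - slope * log(theta) for theta."""
    return math.exp((math.log(n) - intercept) / slope)

# Slope and intercept reported in the text for 2013-2014 (n = 85 bulbs).
theta_tilde = theta_from_intercept(85, 0.06912, 4.34809)

# Hypothetical, noise-free cumulative counts generated from model (1)
# with theta = 4 and alpha = 0.5; the fit should recover alpha and theta.
xs = [1, 2, 3, 4]
counts = [85 * (x / 4) ** 0.5 for x in xs]
slope, intercept = loglog_least_squares(xs, counts)
```

With the reported slope and intercept, `theta_tilde` agrees with the quoted 3.93 to two decimals, which suggests (but does not prove) that this is the inversion used.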

The continuous version of the distribution in (1) may be written as

$$\begin{aligned} G(y)=(y/\theta )^{\alpha }, \alpha >0, y\in (0,\theta ] \end{aligned}$$
(2)

A scaled version of the variable with proper shaping is uniform over the range (0, 1). As such we may term such a distribution as extended uniform distribution.
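The uniformity claim can be checked in one line. Taking the "shaping" to be the power transform \(U=(Y/\theta )^{\alpha }\) (an interpretation of the phrase above; the symbol U is introduced here only for illustration), for \(u\in (0,1)\) we have

$$\begin{aligned} P\{(Y/\theta )^{\alpha }\le u\}=P(Y\le \theta u^{1/\alpha })=G(\theta u^{1/\alpha })=u. \end{aligned}$$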

The above model resembles power law, but has a positive exponent \(\alpha ;\) support of the variable has an unknown upper bound.

The maximum likelihood estimates of the parameters in (2) based on n iid observations \(y_i, i=1, 2, 3, \ldots , n\) are \(\hat{\theta }=y_{(n)},\) the maximum observation; and \(\hat{\alpha }=[-\frac{1}{n}\sum _{i=1}^n\log (\frac{y_i}{\hat{\theta }})]^{-1}\).
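These estimates are easy to try out on simulated data. The sketch below draws from (2) by inverse transform, \(y=\theta u^{1/\alpha }\) with u uniform, and then applies the two formulas with the sample maximum plugged in for \(\theta .\) The sample size, seed, parameter values and function names are illustrative assumptions.

```python
import math
import random

def sample_extended_uniform(n, theta, alpha, rng):
    """Inverse-transform sampling: G(y) = (y/theta)^alpha gives y = theta * U^(1/alpha)."""
    # 1 - rng.random() lies in (0, 1], which avoids y = 0 and log(0) below.
    return [theta * (1.0 - rng.random()) ** (1.0 / alpha) for _ in range(n)]

def mle_extended_uniform(ys):
    """Return (theta_hat, alpha_hat) for model (2)."""
    theta_hat = max(ys)
    # The largest observation contributes log(1) = 0 to the sum.
    mean_neg_log = -sum(math.log(y / theta_hat) for y in ys) / len(ys)
    return theta_hat, 1.0 / mean_neg_log

rng = random.Random(0)  # fixed seed, for reproducibility only
ys = sample_extended_uniform(20000, theta=4.0, alpha=0.5, rng=rng)
theta_hat, alpha_hat = mle_extended_uniform(ys)
```

In such a run `theta_hat` falls just below the true end point, which is the underestimation that the one-sided estimators of Sect. 3 are meant to address.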

Table 1 Garlic data for 2013–2014 and 2014–2015

A discretized version, the nearest integer \(z=\left\lfloor y+\frac{1}{2}\right\rfloor \) of the continuous variable y within the range \((0, \theta ],\) with c.d.f. G given in (2), is of interest. This distribution may be a candidate model to explain the ‘number of cloves per garlic bulb’.
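The probability mass function of the rounded variable follows from the continuous c.d.f., since \(z=\left\lfloor y+\frac{1}{2}\right\rfloor \) holds exactly when \(z-\frac{1}{2}\le y<z+\frac{1}{2}.\) The sketch below is one way to tabulate it. Treating the c.d.f. as 0 below the support and 1 above it, and keeping the class \(z=0,\) are assumptions made here; the original analysis does not state how these boundary classes were handled.

```python
import math

def cdf(y, theta, alpha):
    """C.d.f. (2), extended by 0 below the support and 1 above it."""
    if y <= 0:
        return 0.0
    return min(y / theta, 1.0) ** alpha

def discretized_pmf(theta, alpha):
    """P(Z = z) for Z = floor(Y + 1/2), i.e. z - 1/2 <= Y < z + 1/2."""
    z_max = math.floor(theta + 0.5)
    return {z: cdf(z + 0.5, theta, alpha) - cdf(z - 0.5, theta, alpha)
            for z in range(z_max + 1)}

# Illustrative parameter values (theta = 4 with a small alpha).
pmf = discretized_pmf(theta=4, alpha=0.0362)
```

Multiplying these probabilities by the number of bulbs gives expected class frequencies of the kind needed for a chi-square comparison.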

Growth of underground garlic-bulb is a continuous process over time. The number of cloves is a discrete variable depending on the continuous development process of a garlic-bulb as this grows in weight and size over lifetime. The innermost cloves grown near the main stem are relatively new; these gradually expand towards outer periphery over time.

Fig. 1 Model fit for garlic cloves (2013–2014)

Fig. 2 Model fit for garlic cloves (2014–2015)

Fig. 3 Empirical CDF of garlic yield and the model (2013–2014)

Assuming that the observations z are discretized values of y,  the estimates in terms of y values are \(\hat{\theta }=y_{(n)}=4,\) and \(\hat{\alpha }= 0.0362.\)

Model fit may be ascertained by the chi-square goodness of fit.

The chi-square values are 1.1876 and 5.8719 in the two cases above, viz., the regression based estimates and the m.l.e., respectively; with \(1 \;d.f.\) the p-values are 0.2758 and 0.0154, respectively.

Thus the first model with \(\hat{\theta }=x_{(n)}=4,\) maximum of the observations, and \(\hat{\alpha }= 0.06912,\) i.e., slope of the regression line; seems plausible.

Another set of 100 garlic seed cloves was planted on 5 December 2014 in comparatively fertile land near the riverside; see the yield data for the year 2014–2015. The crop produced a maximum of 19 cloves in a bulb, from a total of 98 healthy surviving plants. In Fig. 2, slope and intercept of the least square linear fit are 0.4005 and 3.4382 respectively, with correlation coefficient \(r=0.9956.\) One may take \(\hat{\theta }=x_{(n)}=19,\) maximum of the observations; and \(\hat{\alpha }= 0.4005,\) slope of the least square regression line. Estimated value of \(\theta \) from the intercept of least square regression line is \(\tilde{\theta }=17.52,\) which is again close to the m.l.e. \(\hat{\theta }=x_{(n)}=19.\)

As before, assuming that the observations z on the number of cloves in the yield of the year 2014–2015 are discretized values of y,  the approximate maximum likelihood estimates in terms of y values are \(\hat{\theta }=y_{(n)}=19;\) and \(\hat{\alpha }= 0.108123.\)

The values of chi-squares in these two cases, merging last several classes with no. of cloves \(\ge 10\) in a single class are 7.70469 (corresponding to \(\hat{\alpha }= 0.4005,\) obtained from slope of the least square regression line), and 120.324 (corresponding to m.l.e. of \(\alpha \) in continuous version of variable) respectively; with \(10-2-1=7 \;d.f.\) p-value of significance is 0.3594 and \(<0.0001\) respectively.
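The p-values quoted for both years can be reproduced from the chi-square statistics alone. Both tests use an odd number of degrees of freedom (1 and 7), for which the upper tail probability has a closed form in terms of the complementary error function. The sketch below uses only the Python standard library; the function name is hypothetical, and a general-purpose statistics library would normally be used instead.

```python
import math

def chi2_sf_odd_df(x, df):
    """Upper tail P(chi-square_df > x), closed form valid for odd df only."""
    if df % 2 == 0 or df < 1:
        raise ValueError("this closed form covers odd degrees of freedom only")
    tail = math.erfc(math.sqrt(x / 2.0))
    term = math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0)
    # Adds sqrt(2x/pi) * exp(-x/2) * (1 + x/3 + x^2/15 + ...), (df-1)/2 terms.
    for j in range(1, (df - 1) // 2 + 1):
        tail += term
        term *= x / (2 * j + 1)
    return tail

# Statistics reported in the text: 1 d.f. for 2013-2014, 7 d.f. for 2014-2015.
p_2013 = [chi2_sf_odd_df(1.1876, 1), chi2_sf_odd_df(5.8719, 1)]
p_2014 = [chi2_sf_odd_df(7.70469, 7), chi2_sf_odd_df(120.324, 7)]
```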

In this case the model providing \(\hat{\theta }=x_{(n)}=19,\) maximum of the observations; and \(\hat{\alpha }= 0.4005\) obtained from slope of the least square regression line; seems plausible.

The new model ‘extended uniform distribution’ proposed for the bulb crop garlic in the first data set, is therefore validated for the data set of subsequent year. The results from the model are close to those obtained from the growth experiment.

A similar model may be postulated for the weight of garlic bulbs. The weight of a bulb consisting of cloves may be taken as proportional to the number of cloves in it to a first approximation, as the former is approximately equal to the weight of a typical clove multiplied by the number of cloves. Model fit from maximum likelihood consideration provides \(\alpha =0.5659\) for the year 2013–2014 and \(\alpha =0.7267\) for the year 2014–2015, with \(\hat{\theta }\) as the maximum weight of bulb in that data set; \(\hat{\theta }= 0.87\) g, 5.85 g, respectively for the year 2013–2014, and 2014–2015.

Yet other estimates of \(\theta \) are available from the intercept of the regression line as \(\tilde{\theta }= 0.63\) g, 22.39 g, respectively for the year 2013–2014, and 2014–2015. Observe that in all the cases mentioned above, \(\alpha \in (0, 1).\)

Figure 3 shows the theoretical and empirical c.d.f. (a slightly modified smooth curve drawn by joining the mid-points of the jumps, instead of the traditional step function; this modification does not change the computed distances much, as the steps are of magnitude 1 / n,  but it smooths the zig-zag look a little, especially for convergence to a continuous c.d.f.) of the garlic weight for the year 2013–2014. The maximum vertical distance between the two curves is 0.3713, and the value of the Kolmogorov-Smirnov (KS) statistic is \(\sqrt{85}\times 0.3713=3.42;\) this is significant even at \(0.5\%\) level.

Figure 4 shows the comparison of model with empirical c.d.f of garlic weight for the year 2014–2015. The value of the KS statistic is \(\sqrt{98}\times 0.2569=2.54\). Although the second value is lower than the first one, the second value is greater than \(0.5\%\) level KS value 1.73. Garlic weight data for two years indicate bad fit to the model; in spite of very good fit to number of garlic cloves. We shall come back to this point later.

In Fig. 5 we plot empirical c.d.f. for the garlic yield of two consecutive years. The value of nonparametric two sample KS statistic is \(\sqrt{\frac{85\times 98}{85+98}}\times 0.8327 =5.618,\) which is highly significant. Thus the productions of garlic are markedly distinct for two years.
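The KS figures above can be recomputed from the reported maximum distances and sample sizes, and the \(0.5\%\) critical value can be checked against the limiting Kolmogorov distribution \(P(K>x)=2\sum _{k\ge 1}(-1)^{k-1}e^{-2k^2x^2}.\) The sketch below is a plain standard-library calculation with hypothetical function names. The limiting distribution strictly applies to a fully specified null hypothesis, so with estimated parameters the significance levels are best read as nominal.

```python
import math

def kolmogorov_sf(x, terms=100):
    """Asymptotic P(sqrt(n) * D_n > x) from the Kolmogorov series."""
    return 2.0 * sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * x * x)
                     for k in range(1, terms + 1))

def ks_one_sample(n, d):
    """Scaled one-sample statistic sqrt(n) * D."""
    return math.sqrt(n) * d

def ks_two_sample(n1, n2, d):
    """Scaled two-sample statistic sqrt(n1 * n2 / (n1 + n2)) * D."""
    return math.sqrt(n1 * n2 / (n1 + n2)) * d

stat_2013 = ks_one_sample(85, 0.3713)       # garlic weight, 2013-2014
stat_2014 = ks_one_sample(98, 0.2569)       # garlic weight, 2014-2015
stat_two = ks_two_sample(85, 98, 0.8327)    # comparison of the two seasons
tail_at_173 = kolmogorov_sf(1.73)           # tail probability at the quoted cutoff
```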

Figures 6 and 7 explore model fit to empirical c.d.f. in \(\log \)-\(\log \) scale. Although the curve in the middle is close to the straight line representing the model, deviation from the model is prominent towards both extremes in two data sets.

Fig. 4 Empirical CDF of garlic yield and the model (2014–2015)

Fig. 5 Empirical CDF of garlic yield for two seasons

Fig. 6 Model fit for empirical CDF (2013–2014)

Fig. 7 Model fit for empirical CDF (2014–2015)

These features of the figures indicate that the implicit model fitting assumption, viz., that the weight of a garlic bulb equals the number of cloves multiplied by the weight of a typical clove, may be a good approximation for the middle segment of the weight data, while departure from the model shows up at the data points representing extreme weights.

Growth curve of garlic with lowess regression (\(f=2/3\)) for the year 2014–2015 is shown in Fig. 8.

Spline regression in SPlus with smooth.spline and spar\(=0.001\) provides Fig. 9 as the growth curve of garlic. The curve is relatively smooth compared to previous curve.

Fig. 8 Growth curve of bulb crop for the year 2014–2015

Fig. 9 Growth curve of bulb crop for the year 2014–2015 (spline)

In Sect. 2 we prove some characterization theorems based on the linear relationship of conditional quantiles. The results have implications in parameter estimation. In Sect. 3, we discuss strong convergence of one-sided estimators to a parameter from above/below.

2 Characterization of Extended Uniform Distribution

We first prove the following.

Theorem 1

Let X be a random variable with support \((0, \theta ), \theta >0,\) and distribution function F. Denote \(c=c(p)\) to be the unrestricted p-th quantile of \(X(>0),\) and consider p in a (small) dense neighborhood \(A_0\) of 1 (e.g., \(p\in A_0=(1-\epsilon ,1)\cap Q, \epsilon >0,\) small and Q is the set of rational numbers). Then the p-th quantile of the distribution, \(p\in A_0,\) under the restriction \(X<x_0(<\theta )\) is \(cx_0/\theta \) iff F is the extended uniform distribution function (2).

Proof

Consider the distribution function of scaled variable with \(\theta =1.\)

$$\begin{aligned} F(x)= x^{\alpha },\; 0<x<1,\; \alpha >0 \end{aligned}$$
(3)

The p-th quantile of the distribution is at \(p^{1/\alpha }.\) Denote \( g(x)=\log {F}(x)=\alpha \log x \downarrow -\infty , \; x\downarrow 0.\) The c.d.f. of the variable, given that \(x<x_0 (<1),\) then turns out to be \(F(x)/F(x_0),\) and one may write \(P(X<x|X<x_0)=\frac{F(x)}{F(x_0)}=(\frac{x}{x_0})^{\alpha }.\) Equating this to p,  we obtain the new p-th quantile of the random variable bounded above by the threshold \(x_0\) as \(cx_0,\) where \(c=p^{1/\alpha }\) is the p-th quantile of the unrestricted random variable \(X(<1).\)

Assume that the property of constant multiple factor of restricted and unrestricted quantiles holds for a dense set of quantiles corresponding to \(p\in (0, 1),\) p rational. Suppose that the new p-th quantile of the random variable X under the restriction \(x<x_0\) is at \(cx_0,\) where c is independent of \(x_0.\) Indeed c is the p-th quantile of the original unrestricted random variable, as seen by taking \(x_0 \uparrow 1.\)

Next, write

$$\begin{aligned} e^{g(cx_0)-g(x_0)}=\frac{F(cx_0)}{F(x_0)}=p \end{aligned}$$
(4)

This provides,

$$\begin{aligned} g(cx_0)-g(x_0)=-k \end{aligned}$$
(5)

where, \( k=-\log p (>0).\)

Thus \(g(c^2)=g(c)-k=-2k,\; g(c^3)\!=\!-3k, \ldots , g(c^m)\!=\!-mk.\) Hence, \(g(x)= \log F(x)= \alpha \log x;\) where \(\alpha =-k/(\log c)\) at the points \(x=c, c^2, \ldots , c^m, \ldots ; c\in (0,1).\)

This specifies the distribution function F to be extended uniform in a dense set \(x=c, c^2, \ldots , c^m, \ldots ,\) of (0, 1). For an arbitrary real number \(z\in (0,1),\) there exist integer m and \(c=p^{1/\alpha };\; p\in Q \cap (0, 1)\) such that \(c^m\) is arbitrarily close to the number z,  where Q is the set of all rational numbers. Next, from right continuity of the distribution function, the form of F is extended uniform at z,  where \(z\in (0,1)\) is arbitrary.

Finally, a dense choice of p in a small neighborhood of 1, e.g., \(p\in A_0=(1-\epsilon , 1)\cap Q, \epsilon >0\) is small, suffices for the Theorem to hold; as the resultant sequence \(\{c^m: m=1,2,3,\ldots \}\) still spans a dense support of the variable.

For the general case let the supremum of possible value of X be \(\theta (>0)\). The distribution function F with maximum value \(\theta \) is then

$$\begin{aligned} F(x)= (x/\theta )^{\alpha },\; x\in (0,\theta ),\; \alpha >0 \end{aligned}$$
(6)

One may then consider the transformed random variable \(X/\theta \in (0,1).\) Proceeding as before, the characterization of Theorem 1 holds.
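The ‘if’ direction of Theorem 1 lends itself to a quick numerical check: for the extended uniform c.d.f. the restricted p-th quantile divided by \(x_0\) should equal \(c/\theta \) whatever the threshold \(x_0,\) while for another c.d.f. on the same support the ratio should drift with \(x_0.\) The sketch below finds the quantiles by bisection. The parameter values, the thresholds, and the sine-shaped comparison c.d.f. are illustrative choices made here, not part of the original argument.

```python
import math

def quantile(cdf, p, lo, hi, iters=200):
    """Bisection for inf{x in (lo, hi): cdf(x) >= p}; cdf continuous, increasing."""
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if cdf(mid) >= p:
            hi = mid
        else:
            lo = mid
    return hi

def restricted_quantile(cdf, p, x0):
    """p-th quantile of X given X < x0, using the c.d.f. F(x) / F(x0)."""
    return quantile(lambda x: cdf(x) / cdf(x0), p, 0.0, x0)

theta, alpha, p = 19.0, 0.4005, 0.95    # illustrative values
thresholds = (2.0, 7.0, 15.0)

def ext_unif(x):
    return (x / theta) ** alpha

c = quantile(ext_unif, p, 0.0, theta)   # unrestricted p-th quantile

# Extended uniform: the ratio should equal c / theta for every threshold.
ratios = [restricted_quantile(ext_unif, p, x0) / x0 for x0 in thresholds]

def other(x):
    # A different c.d.f. on (0, theta), used only for contrast.
    return math.sin(math.pi * x / (2.0 * theta))

other_ratios = [restricted_quantile(other, p, x0) / x0 for x0 in thresholds]
```

A constant list `ratios` alongside a varying `other_ratios` is what the characterization predicts; a check at a few thresholds is of course no substitute for the proof.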

Characterization theorems for discrete random variables

Consider a random variable X with range either \(\mathbf {N_0},\) the set of nonnegative integers; or the set of positive integers \(\mathbf {N_1}=\mathbf {N_0}-\{0\}.\) Let the cumulative distribution function of X be denoted by \(F(x)=P(X \le x);\) it is enough to define F at integer values. For \(p\in (0,1)\) the p-th quantile of F is defined as \(F^{-1}(p)=\inf \{x: F(x)\ge p\}.\)

The following theorem is the counterpart of Theorem 1 stated for discrete random variables.

Theorem 2

Let X be a random variable with support \(\mathbf {N_1} \cap [0,\theta ],\) where \(\theta \) is an arbitrary positive integer, \(F(x)=P(X\le x)\) be the distribution function. Let the p-th quantile of the distribution under the restriction \(X\le x_0(\in \mathbf {N_1}),\) \( x_0\le \theta \) be \(cx_0;\) where \(c \in \mathbf {N_1}\) is the unrestricted p-th quantile of X. The above property holds for all p of the form \(p=p_i=\sum _{j=1}^iP(X=j),\; i=1, 2, 3,\ldots , \theta \) iff \(F(x)=(\frac{x}{\theta })^{\alpha }\) for some \(\alpha >0,\) where \(x\in \mathbf {N_1}.\)

Proof

Proof of Theorem 2 follows similar lines as that of Theorem 1. One way implication of the Theorem is easy to see. Consider the ‘only if’ part.

Steps similar to (4)–(5) hold. The variable X has support \(\mathbf {N_1}\cap [0,\theta ].\) This set is same as the set \(\{c, c^2, \ldots , c^m, \ldots \}\cap [0,\theta ],\) where \(c=c(p)\) is the p-th quantile of X,  and p of the form \(p=p_i=\sum _{j=1}^iP(X=j),\; i=1, 2, 3,\ldots , \theta .\) The p-th quantile is then an integer, as the jumps of F occur at integer points. For example when \(F(x)=(x/\theta )^{\alpha },\) the p-th quantile \(c=c(p)\) is obtained as the solution i of the equation \(p=p_i=\sum _{j=1}^iP(X=j)=(i/\theta )^{\alpha }.\)

Over the set \(\mathbf {N_1} \cap [0,\theta ],\) the characterization for \(g(x)= \log F(x)=\alpha \log (x/\theta )\) is seen to hold in the same way as in Theorem 1.

3 One Sided Estimation for Upper End Point

Conventional estimators of a parameter usually fluctuate around the unknown value of the parameter. An estimator \(T_n\) of the unknown parameter \(\theta \in R\) is said to converge to \(\theta \) from above (below), if \(T_n \ge \theta \) \((T_n \le \theta )\) for all sufficiently large sample size n and \(T_n \rightarrow \theta \) a.s., as \(n \rightarrow \infty \).

One-sided convergence from above is denoted as \(T_n \rightarrow _+ \theta \) a.s., and one-sided convergence from below is denoted as \(T_n \rightarrow _ -\theta \) a.s.

With an application of Marcinkiewicz-Zygmund strong law of large numbers (MZSLLN), estimation problem for the mean \(\theta = E_{F_\theta }(X)\) from above/below has been addressed by Gilat and Hill (1992). Observe that \(\overline{X}_n\) is the natural estimator for \(\theta = EX\). But \(\overline{X}_n\) fluctuates above and below \(\theta \) although \(\overline{X}_n \rightarrow \theta \) a.s., as \(n \rightarrow \infty \).

Estimation problems of this kind are considered in Dasgupta (2007) when \(\theta \) is a finite end point of the distribution function. Application of Borel-Cantelli lemma and properties of extreme order statistics are some of the tools used to obtain the results.

One-sided convergence may be useful while estimating the unknown variance of a random variable, for which the estimator should be non-negative. Level of flood water is another example. In such cases, one may like to estimate the parameter from above. As for some other examples, consider estimating the strength of a dam or bridge. One may like to estimate the unknown strength conservatively from below, to have a protection from probable disaster.

The maximum observations in the two sets of garlic clove data for the years 2013–2014 and 2014–2015 are \(X_{(n)}=4, 19\) respectively, with \(n=85, 98\) over the two production seasons. The underlying models proposed are (1)–(2). Large garlic bulbs with many cloves have a market value. We wish to estimate \(\theta ,\) the maximum number of cloves, both from above and below.

The following result is stated in Galambos (1978).

Result. Let F be continuous, then \(P[X_{(n)} \le F^{-1}(1 - \frac{\delta \log \log n}{n})\; \text{ i.o. }] = 0, \delta > 1.\) The above relationship may be inverted to conclude, \(X_{(n)} > F^{-1}(1 - \frac{\delta \log \log n}{n}) = \theta - \beta _n \; \text{(say) } \; \text{ a.s., } \text{ as } \; n \rightarrow \infty ;\)

i.e., \(X_{(n)} + \beta _n \rightarrow _+ \theta \) a.s., as \(n \rightarrow \infty \).

For the form of F given in (2) we have \(\beta _n= \frac{\delta \log \log n}{\alpha n},\) providing the amount of perturbation to be added to \(X_{(n)}\) for upper convergence to \(\theta ,\) a.s.
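The form of \(\beta _n\) follows from inverting (2). Writing \(\eta _n=\frac{\delta \log \log n}{n}\) (a shorthand introduced here), a first-order expansion gives

$$\begin{aligned} F^{-1}(1-\eta _n)=\theta (1-\eta _n)^{1/\alpha }, \qquad \theta -F^{-1}(1-\eta _n)=\theta \{1-(1-\eta _n)^{1/\alpha }\}=\frac{\theta \eta _n}{\alpha }+O(\eta _n^2). \end{aligned}$$

For the variable scaled so that \(\theta =1,\) as in the proof of Theorem 1, the leading term is the \(\beta _n\) given above; on the original scale the same expansion carries a factor \(\theta .\)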

Application of the Borel-Cantelli lemma provides the following sharper result on upper and lower convergence to the end point \(\theta ,\) in place of Proposition 1 and Proposition 3 of Dasgupta (2007).

Theorem 3

Let \(X_1, \ldots ,X_n\) be iid random variables with distribution \(F = F_\theta ,\) where \(\theta = \sup \{ x : F(x)< 1 \} < \infty \) and the functional form of F be known near the right tail of the distribution.

(i) Let \(\alpha _n = \theta - F^{-1}(1 - \epsilon _n)= F^{-1}(1) - F^{-1}(1 - \epsilon _n)\), where \(F^{-1}(a) = \inf \{ x : F(x) \ge a \}\) and \(\epsilon _n = \frac{\log \{n (\log n)^\delta \}}{n} \rightarrow 0,\) as \(n \rightarrow \infty \), \(\delta > 1\). Then,

$$\begin{aligned} \hat{\theta }_+ =X_{(n)}+ \alpha _n \rightarrow _+ \theta \; a.s., \; { as} \; n \rightarrow \infty , \end{aligned}$$
(7)

(ii) Let \(\alpha _n^* = \theta - F^{-1}(1 - \epsilon _n^*)= F^{-1}(1)-F^{-1}(1 - \epsilon _n^*)\), and \(\epsilon _n^* = n^{-2}(\log n)^{-\delta } \rightarrow 0,\) as \(n \rightarrow \infty \), where \(\delta > 1\). Then,

$$\begin{aligned} \hat{\theta }_- = X_{(n)}+ \alpha _n^* \rightarrow _- \theta \; a.s.,\; { as}\; n \rightarrow \infty , \end{aligned}$$
(8)

Thus,

$$\begin{aligned} P_{F_{\theta }}( X_{(n)}+ \alpha _n^*< \theta < X_{(n)}+ \alpha _n) = 1 \end{aligned}$$
(9)

for all sufficiently large n.

Proof

Consider,

$$\begin{aligned} P(X_{(n)} \le d_n)= & {} F^n (d_n)< e^{-n \{1-F(d_n)\}}\\< & {} 1/\{n (\log n)^\delta \} \; \text{ if, } \; 1 - F(d_n)> \frac{\log \{n (\log n)^\delta \}}{n}, \; \delta > 1. \end{aligned}$$

i.e., if \(F(d_n) < 1 - \frac{\log \{n (\log n)^\delta \}}{n},\) i.e., if \(d_n < F^{-1} (1 - \frac{\log \{n (\log n)^\delta \}}{n}).\)

In such a situation, \(\sum _n P(X_{(n)} \le d_n) \le \sum _n 1/\{n (\log n)^\delta \} < \infty \) and therefore by Borel-Cantelli lemma one gets \(P(X_{(n)} \le d_n \; \text {i.o.}) = 0\).

Hence, \(X_{(n)}> d_n > F^{-1}(1 - \frac{\log \{n (\log n)^\delta \}}{n})\) a.s., as \(n \rightarrow \infty \).

If the functional form of F is known (at least near the right tail) then one can invert the above relation to obtain,

$$X_{(n)} > \theta - \alpha _n \; \; \text{ a.s., } \text{ as } \; n \rightarrow \infty ,$$

where \(\alpha _n \rightarrow 0\), as \(n \rightarrow \infty \), since \(\frac{\log \{n (\log n)^\delta \}}{n} \rightarrow 0\).

Thus \(X_{(n)} + \alpha _n \rightarrow _+ \theta \) a.s., as \(n \rightarrow \infty \).

For (ii), note that, \(\alpha _n^* \rightarrow 0,\) as \(\epsilon _n^* = n^{-2}(\log n)^{-\delta } \rightarrow 0;\; n \rightarrow \infty \). Next write, \(\sum _n P(X_{(n)}+ \alpha _n^*> \theta ) = \sum _n [1- F^n(\theta - \alpha _n^*)] \le \sum _n n^{-1}(\log n)^{-\delta } < \infty ,\;\;\delta > 1,\) if \( F(\theta - \alpha _n^*) > ( 1- n^{-1}(\log n)^{-\delta })^{1/n} \simeq 1 - n^{-2}(\log n)^{-\delta },\) i.e., if \(\alpha _n^* \le \theta - F^{-1}(1 - n^{-2}(\log n)^{-\delta }).\) Now use the Borel-Cantelli lemma to claim (8). Result (9) then follows from (7) and (8).

For the model (2), \(\alpha _n=\frac{\log \{n (\log n)^\delta \}}{\alpha n}\) and \(\alpha _n^*=\{\alpha n^2 (\log n)^{\delta }\}^{-1}.\)

Point estimate of upper end point

Usual estimator \(X_{(n)}\) underestimates \(\theta ,\) the upper end point of non-regular model (2) with discontinuous likelihood function. Non-regularity is caused by dependence of the boundary on unknowns. Asymptotic analysis, as presented in Ibragimov and Hasminskii (1981) covers a number of such models. These frequently arise in real life problems including econometrics, see e.g., Chernozhukov and Hong (2004) on auction models and equilibrium job-search models with a jump of density at start. Hall and Wang (2005) considered estimation problem with empirical prior distribution based on two extreme order statistics to estimate the lower end point of a distribution.

Our approach is based on one-sided convergence. The results proved in the previous section state that the type of distribution and the shape parameter \(\alpha \) remain the same under upper censoring of the data, say up to \(X_{(n-1)}, X_{(n-2)}\) etc., thus providing more than one estimate of \(\alpha \) that can be combined by standard methods into a pooled estimate of \(\alpha .\) This is required in computing \(\alpha _n\) and \(\alpha _n^*.\)

For the model (2), one may then consider the midpoint of the interval in (9) viz., \(\tilde{\theta }=X_{(n)}+\frac{\log \{n (\log n)^\delta \}}{2\alpha n}+\{2\alpha n^2 (\log n)^{\delta }\}^{-1}\) as a point estimate of \(\theta .\)
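A minimal sketch of the interval in (9) and its midpoint, using the closed forms of \(\alpha _n\) and \(\alpha _n^*\) stated for model (2), is given below. The choice \(\delta =1.5,\) the use of the regression slope 0.4005 as the value of \(\alpha ,\) and the function name are assumptions made for illustration; the text only requires \(\delta >1.\)

```python
import math

def one_sided_estimates(x_max, n, alpha, delta=1.5):
    """Lower, upper and midpoint estimates of theta from (7)-(9), model (2)."""
    if delta <= 1 or n < 2:
        raise ValueError("need delta > 1 and n >= 2")
    log_n = math.log(n)
    # Perturbation giving convergence from above, as in (7).
    alpha_n = math.log(n * log_n ** delta) / (alpha * n)
    # Perturbation giving convergence from below, as in (8).
    alpha_n_star = 1.0 / (alpha * n ** 2 * log_n ** delta)
    lower = x_max + alpha_n_star
    upper = x_max + alpha_n
    return lower, upper, (lower + upper) / 2.0

# 2014-2015 clove data: maximum 19 cloves among n = 98 bulbs.
lower, upper, midpoint = one_sided_estimates(x_max=19, n=98, alpha=0.4005)
```

The midpoint depends on \(\delta \) mainly through the upper perturbation, so it may be worth reporting the estimate for a few values of \(\delta \) rather than a single one.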

The estimator is the simple average of a positively biased and a negatively biased estimate of the parameter \(\theta .\) Other weighted averages of these two estimates may also be considered. The estimator always lies above \(X_{(n)},\) the m.l.e. of \(\theta .\)