A new estimator for mean under stratified random sampling

Shahzad, Usman; Hanif, Muhammad; Koyuncu, Nursel

doi:10.1007/s40096-018-0255-3

Original Paper
Open access
Published: 23 July 2018

Mathematical Sciences, Volume 12, pages 163–169 (2018)

Abstract

In this paper, we have proposed an estimator of finite population mean in stratified random sampling. The expressions for the bias and mean square error of the proposed estimator are obtained up to the first order of approximation. It is found that the proposed estimator is more efficient than the traditional mean, ratio, exponential, regression, Shabbir and Gupta (in Commun Stat Theory Method 40:199–212, 2011) and Khan et al. (in Pak J Stat 31:353–362, 2015) estimators. We have utilized four natural and four artificial data sets under stratified random sampling scheme for assessing the performance of all the estimators considered here.

Introduction

Stratification is a design tool used in modern surveys to improve the precision of estimates. In a stratified design, the whole population is divided into a number of strata so that units are homogeneous within each stratum, and samples are selected within each stratum, mostly through simple random sampling. Numerous authors have suggested estimators that utilize known population parameters of an auxiliary variable. Sisodia and Dwivedi [11] presented a ratio estimator utilizing the coefficient of variation of an auxiliary variable. Singh and Kakran [9] proposed another ratio estimator utilizing the known coefficient of kurtosis of an auxiliary variable. Upadhyaya and Singh [12] likewise examined a ratio-type estimator using a linear combination of the coefficient of variation and the coefficient of kurtosis of an auxiliary variable. Kadilar and Cingi [2] utilized the stratified forms of these estimators in order to improve their efficiency.

The fundamental goal of this paper is to propose an improved estimator of the finite population mean that utilizes data on an auxiliary variable in stratified random sampling. The expressions for the bias and mean square error (MSE) of the proposed estimator are derived up to the first order of approximation. On the basis of theoretical and numerical comparisons, we demonstrate that the proposed estimator is more efficient than the existing estimators considered.

The rest of the paper is organized as follows. The "Some existing estimators" section reviews estimators from the literature and gives some preliminary results that are useful for obtaining the properties of the proposed and existing estimators. The "Proposed estimator" section introduces an improved estimator under the stratified random sampling scheme. The "Numerical illustration" section is devoted to the efficiency comparison, using both real data sets and a simulation study. The "Conclusion" section summarizes the numerical findings and highlights the contribution of the paper.

Some existing estimators

Let $U=\left\{ U^{'}_{1},U^{'}_{2},\ldots , U^{'}_{N}\right\} $ be a finite population of N units. Let Y be the study variable and X an auxiliary variable, taking values $y_{hi}$ and $x_{hi}$ on the ith unit $(i=1,2,\ldots ,N_{h})$ of the hth stratum $(h=1,2,\ldots ,L)$, which consists of $N_{h}$ units such that $\sum _{h=1}^{L}N_{h}=N$. Let $n_{h}$ be the size of the sample drawn from the hth stratum by simple random sampling without replacement, such that $\sum _{h=1}^{L}n_{h}=n$. Let $\bar{y}_{st}=\sum _{h=1}^{L}W_{h}\bar{y}_{h}$, where $\bar{y}_{h}=\frac{1}{n_{h}}\sum _{i=1}^{n_{h}}y_{hi}$, and $\bar{x}_{st}=\sum _{h=1}^{L}W_{h}\bar{x}_{h}$, where $\bar{x}_{h}=\frac{1}{n_{h}}\sum _{i=1}^{n_{h}}x_{hi}$ and $W_{h}=\frac{N_{h}}{N}$ is the stratum weight. The population means $\bar{Y}$ and $\bar{X}$ and the stratum means $\bar{Y}_{h}$ and $\bar{X}_{h}$ are defined in a similar way.

To find the MSE of the proposed and existing estimators, let us define

$$\begin{aligned} e_{0st}=\frac{\bar{y}_{st}-\bar{Y}}{\bar{Y}},\quad e_{1st} =\frac{\bar{x}_{st}-\bar{X}}{\bar{X}}. \end{aligned}$$

The expectations of the e terms are given below:

$$\begin{aligned} E(e_{0st})= & {} 0,\,E(e_{1st})=0, \\ E(e_{0st}^{2})= & {} \sum _{h=1}^{L}W_{h}^{2}f^{'}_{h} \frac{S^{2}_{yh}}{\bar{Y}^{2}}=V_{2.0},\,E(e_{1st}^{2}) =\sum _{h=1}^{L}W_{h}^{2}f^{'}_{h}\frac{S^{2}_{xh}}{\bar{X}^{2}}=V_{0.2},\\ E(e_{0st}e_{1st})= & {} \sum _{h=1}^{L}W_{h}^{2}f^{'}_{h} \frac{S_{yxh}}{\bar{X}\bar{Y}}=V_{1.1} \end{aligned}$$

where

$$\begin{aligned} f_{h}=\frac{n_{h}}{N_{h}}, \,f^{'}_{h} =\left( \frac{1-f_{h}}{n_{h}}\right) , \end{aligned}$$

and

$$\begin{aligned} V_{a.b}= \sum _{h=1}^{L}W_{h}^{a+b}f^{'}_{h}\frac{1}{N_{h}-1}\sum _{i=1}^{N_{h}} \frac{(y_{hi}-\bar{Y}_{h})^{a} (x_{hi}-\bar{X}_{h})^{b} }{\bar{Y}^{a} \bar{X}^{b}}. \end{aligned}$$

The variance of the sample mean in stratified random sampling without replacement is given by:

$$\begin{aligned} \mathrm{Var}(\bar{y}_{st})=\bar{Y}^{2}V_{2.0}. \end{aligned}$$
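As a minimal sketch of how these quantities can be evaluated from stratum-level summaries, the Python snippet below computes $\bar{Y}$, $\bar{X}$, $V_{2.0}$, $V_{0.2}$ and $\mathrm{Var}(\bar{y}_{st})$. The function name `v_terms` and its argument layout are illustrative assumptions rather than anything defined in the paper, and the population means are taken as $\sum_h W_h\bar{Y}_h$ and $\sum_h W_h\bar{X}_h$. The inputs are the Population 1 summaries listed in the "Real data sets" section; stratum covariances are not reported there, so $V_{1.1}$ is left as `None` in this call.

```python
def v_terms(W, N_h, n_h, Y_bar_h, X_bar_h, S_y, S_x, S_yx=None):
    """Return (Y_bar, X_bar, V20, V02, V11) from stratum-level summaries.

    V11 is None when the stratum covariances S_yx are not supplied.
    """
    # f'_h = (1 - f_h) / n_h with f_h = n_h / N_h
    f_prime = [(1 - n / N) / n for n, N in zip(n_h, N_h)]
    # Population means as weighted stratum means
    Y_bar = sum(w * m for w, m in zip(W, Y_bar_h))
    X_bar = sum(w * m for w, m in zip(W, X_bar_h))
    V20 = sum(w**2 * f * s**2 for w, f, s in zip(W, f_prime, S_y)) / Y_bar**2
    V02 = sum(w**2 * f * s**2 for w, f, s in zip(W, f_prime, S_x)) / X_bar**2
    V11 = None
    if S_yx is not None:
        V11 = sum(w**2 * f * c for w, f, c in zip(W, f_prime, S_yx)) / (X_bar * Y_bar)
    return Y_bar, X_bar, V20, V02, V11


# Population 1 summaries as reported in the "Real data sets" section
Y_bar, X_bar, V20, V02, V11 = v_terms(
    W=[0.2916, 0.5000, 0.2083],
    N_h=[7, 12, 5],
    n_h=[3, 5, 2],
    Y_bar_h=[17.4285, 20.4166, 17.8000],
    X_bar_h=[15.2857, 17.2500, 20.6000],
    S_y=[4.1975, 4.0778, 3.2710],
    S_x=[4.5721, 5.4958, 3.6469],
)
var_y_st = Y_bar**2 * V20  # Var(ybar_st) = Ybar^2 * V_{2.0}
print(f"Y_bar={Y_bar:.4f}, X_bar={X_bar:.4f}, Var(ybar_st)={var_y_st:.4f}")
```

Passing a list of stratum covariances as `S_yx` would additionally return $V_{1.1}$.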

The stratified version of the classical ratio estimator of the mean is given by

$$\begin{aligned} \hat{\bar{y}}_{\mathrm{Rst}}=\frac{\bar{y}_{st}}{\bar{x}_{st}}\bar{X}. \end{aligned}$$

(1)

The bias and MSE of the classical ratio estimator given in (1), up to the first order of approximation, are given below:

$$\begin{aligned} \mathrm{Bias}(\hat{\bar{y}}_{\mathrm{Rst}})= & {} \bar{Y} \left[ V_{0.2}-V_{1.1}\right] \nonumber \\ \mathrm{MSE}(\hat{\bar{y}}_{\mathrm{Rst}})= & {} \bar{Y}^{2}\left[ V_{2.0}+V_{0.2} -2V_{1.1}\right] . \end{aligned}$$

(2)

The stratified version of the Bahl and Tuteja [1] estimator is

$$\begin{aligned} \hat{\bar{y}}_{BTst}=\bar{y}_{st}\mathrm{exp} \left[ \frac{\bar{X}-\bar{x}_{st}}{\bar{X} +\bar{x}_{st}}\right] . \end{aligned}$$

(3)

The bias and MSE of $\hat{\bar{y}}_{BTst}$ up to the first order of approximation are given below:

$$\begin{aligned} \mathrm{Bias}(\hat{\bar{y}}_{BTst})= & {} \bar{Y}\left[ \frac{3}{8}V_{0.2} -\frac{1}{2}V_{1.1}\right] \nonumber \\ \mathrm{MSE}(\hat{\bar{y}}_{BTst})= & {} \bar{Y}^{2}\left[ V_{2.0} +\frac{1}{4}V_{0.2}-V_{1.1}\right] . \end{aligned}$$

(4)

The traditional regression estimator is given by

$$\begin{aligned} \hat{\bar{y}}_{\mathrm{Reg}(st)}=\bar{y}_{st}+b_{st}(\bar{X}-\bar{x}_{st}). \end{aligned}$$

The MSE of $\hat{\bar{y}}_{\mathrm{Reg}(st)}$ is

$$\begin{aligned} \mathrm{MSE}(\hat{\bar{y}}_{\mathrm{Reg}(st)})=\bar{Y}^{2}V_{2.0}(1-\rho ^{2}_{st}), \end{aligned}$$

(5)

where $\rho _{st}=\frac{V_{1.1}}{\sqrt{V_{2.0}}\sqrt{V_{0.2}}}$ is the combined correlation between study and auxiliary variate.
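The first-order MSE expressions reviewed so far depend only on $\bar{Y}$, $V_{2.0}$, $V_{0.2}$ and $V_{1.1}$, so they can be compared directly once these are known. The sketch below is illustrative only: the function name `first_order_mse` and the numerical inputs are hypothetical and are not taken from the paper's data sets.

```python
from math import sqrt


def first_order_mse(Y_bar, V20, V02, V11):
    """First-order MSEs of the mean, ratio, exponential-ratio and regression estimators."""
    rho_st = V11 / (sqrt(V20) * sqrt(V02))  # combined correlation
    return {
        "mean": Y_bar**2 * V20,                            # Var(ybar_st)
        "ratio": Y_bar**2 * (V20 + V02 - 2 * V11),         # Eq. (2)
        "exp_ratio": Y_bar**2 * (V20 + V02 / 4 - V11),     # Eq. (4)
        "regression": Y_bar**2 * V20 * (1 - rho_st**2),    # Eq. (5)
    }


# Hypothetical inputs, chosen only to exercise the formulas
mse = first_order_mse(Y_bar=20.0, V20=0.004, V02=0.005, V11=0.003)
for name, value in mse.items():
    print(f"{name:>10s}: {value:.4f}")
```

Algebraically, the ratio and exponential-ratio MSEs exceed the regression MSE by $\bar{Y}^{2}(\sqrt{V_{0.2}}-V_{1.1}/\sqrt{V_{0.2}})^{2}$ and $\bar{Y}^{2}(\sqrt{V_{0.2}}/2-V_{1.1}/\sqrt{V_{0.2}})^{2}$, respectively, so the regression estimator is never worse than either to this order.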

Shabbir and Gupta [8] introduced an exponential ratio estimator as:

$$\begin{aligned} \hat{\bar{y}}_{SGst}=\left[ w_{1}^{SG}\bar{y}_{st}+w_{2}^{SG}(\bar{X} -\bar{x}_{st})\right] \mathrm{exp} \left( \frac{\bar{A}-\bar{a}_{st}}{\bar{A} +\bar{a}_{st}}\right) , \end{aligned}$$

(6)

where $\bar{a}_{st}=\bar{x}_{st}+N\bar{X}$ and $\bar{A}=\bar{X}+N\bar{X}.$

The optimum values of $w_{1}^{SG}$ and $w_{2}^{SG}$ are

$w_{1}^{SG(\mathrm{opt})}=\frac{B_{SG}D_{SG} -\frac{C_{SG}E_{SG}}{2}}{A_{SG}B_{SG} -C_{SG}^{2}}$ and $w_{2}^{SG(\mathrm{opt})} =\frac{-C_{SG}D_{SG}+\frac{A_{SG}E_{SG}}{2}}{A_{SG}B_{SG}-C_{SG}^{2}}.$

The minimum MSE of $\hat{\bar{y}}_{SGst}$ is

$$\begin{aligned} \mathrm{MSE}_{\mathrm{min}}(\hat{\bar{y}}_{SGst})=\bar{Y}^{2}-\frac{B_{SG}D_{SG}^{2} +\frac{A_{SG}E_{SG}^{2}}{4}-C_{SG}D_{SG}E_{SG}}{A_{SG}B_{SG} -C_{SG}^{2}}, \end{aligned}$$

(7)

where

$$\begin{aligned} A_{SG}= & {} \bar{Y}^{2}\left[ 1+V_{2.0}+\frac{V_{0.2}}{(1+N)^{2}} -\frac{2V_{1.1}}{(1+N)}\right] ,\\ B_{SG}= & {} \bar{X}^{2}V_{0.2}, \,\,C_{SG}=\bar{X}\bar{Y} \left[ \frac{V_{0.2}}{(1+N)}-V_{1.1}\right] ,\\ D_{SG}= & {} \bar{Y}^{2}\left[ 1+\frac{3V_{0.2}}{8(1+N)^{2}} -\frac{V_{1.1}}{2(1+N)}\right] \quad \text{and}\quad E_{SG}=\bar{X} \bar{Y}\frac{V_{0.2}}{(1+N)}. \end{aligned}$$

Khan et al. [3] proposed the following ratio estimator in simple random sampling:

$$\begin{aligned} \hat{\bar{y}}_{Kal}=\left[ w_{1}^{Kal}\bar{y}+w_{2}^{Kal} (\bar{X}-\bar{x})\right] \mathrm{exp}\left( \frac{\bar{A^{'}} -\bar{a^{'}}}{\bar{A^{'}}+\bar{a^{'}}}\right) , \end{aligned}$$

(8)

In stratified random sampling, it will be of the form:

$$\begin{aligned} \hat{\bar{y}}_{Kal(st)}=\left[ w_{1}^{Kal}\bar{y}_{st} +w_{2}^{Kal}(\bar{X}-\bar{x}_{st})\right] \mathrm{exp} \left( \frac{\bar{A^{'}}-\bar{a^{'}}_{st}}{\bar{A^{'}} +\bar{a^{'}}_{st}}\right) , \end{aligned}$$

(9)

where

$\bar{A^{'}}=\bar{X}c_{ki}$ and $\bar{a^{'}}_{st}=\bar{x}_{st}+\bar{X}(c_{ki}-1)$ for $i=1, 2, 3$, with

$c_{k1}=\rho _{st}+1$, $c_{k2}=\frac{\rho _{st}+1}{2}$, $c_{k3}=\frac{\rho _{st}+1}{3}.$

The optimum values of $w_{1}^{Kal}$ and $w_{2}^{Kal}$ are

$w_{1}^{Kal(\mathrm{opt})}=\frac{B_{Kal}D_{Kal} -\frac{C_{Kal}E_{Kal}}{2}}{A_{Kal}B_{Kal}-C_{Kal}^{2}}$ and $w_{2}^{Kal(\mathrm{opt})}=\frac{-C_{Kal}D_{Kal} +\frac{A_{Kal}E_{Kal}}{2}}{A_{Kal}B_{Kal}-C_{Kal}^{2}}.$

The minimum MSE of $\hat{\bar{y}}_{Kal(st)}$ is

$$\begin{aligned} \mathrm{MSE}_{\mathrm{min}}(\hat{\bar{y}}_{Kal(st)})=\bar{Y}^{2} -\frac{B_{Kal}D_{Kal}^{2}+\frac{A_{Kal}E_{Kal}^{2}}{4} -C_{Kal}D_{Kal}E_{Kal}}{A_{Kal}B_{Kal}-C_{Kal}^{2}}, \end{aligned}$$

(10)

where

$$\begin{aligned} A_{Kal}= & {} \bar{Y}^{2}\left[ 1+V_{2.0}+\frac{V_{0.2}}{c_{ki}^{2}} -\frac{2V_{1.1}}{c_{ki}}\right] ,\\ B_{Kal}= & {} \bar{X}^{2}V_{0.2}, \,\,C_{Kal}=\bar{X}\bar{Y} \left[ \frac{V_{0.2}}{c_{ki}}-V_{1.1}\right] ,\\ D_{Kal}= & {} \bar{Y}^{2}\left[ 1+\frac{3V_{0.2}}{8c_{ki}^{2}} -\frac{V_{1.1}}{2c_{ki}}\right] \quad \text{and}\quad E_{Kal}=\bar{X} \bar{Y}\frac{V_{0.2}}{c_{ki}}. \end{aligned}$$
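The Shabbir and Gupta [8] and Khan et al. [3] constants share one algebraic form: replacing $c_{ki}$ by $1+N$ in $A_{Kal},\ldots ,E_{Kal}$ gives $A_{SG},\ldots ,E_{SG}$. The following sketch exploits this with a single hypothetical helper, `two_weight_min_mse`, parameterized by a constant `c`. The helper name and the numerical inputs are assumptions made for illustration.

```python
from math import sqrt


def two_weight_min_mse(Y_bar, X_bar, V20, V02, V11, c):
    """Optimum weights and minimum MSE for estimators of the form (6) or (9).

    c = 1 + N gives the Shabbir-Gupta constants; c = c_ki gives the Khan et al. ones.
    """
    A = Y_bar**2 * (1 + V20 + V02 / c**2 - 2 * V11 / c)
    B = X_bar**2 * V02
    C = X_bar * Y_bar * (V02 / c - V11)
    D = Y_bar**2 * (1 + 3 * V02 / (8 * c**2) - V11 / (2 * c))
    E = X_bar * Y_bar * V02 / c
    det = A * B - C**2
    w1 = (B * D - C * E / 2) / det
    w2 = (-C * D + A * E / 2) / det
    mse_min = Y_bar**2 - (B * D**2 + A * E**2 / 4 - C * D * E) / det
    return {"A": A, "B": B, "C": C, "D": D, "E": E,
            "det": det, "w1": w1, "w2": w2, "mse_min": mse_min}


# Hypothetical inputs, chosen only to exercise the formulas
Y_bar, X_bar, V20, V02, V11 = 20.0, 15.0, 0.004, 0.005, 0.003
rho_st = V11 / sqrt(V20 * V02)
res = two_weight_min_mse(Y_bar, X_bar, V20, V02, V11, c=rho_st + 1)  # c_k1
print(f"w1={res['w1']:.4f}, w2={res['w2']:.4f}, min MSE={res['mse_min']:.4f}")
```

Calling the helper with `c = 1 + N` instead evaluates the Shabbir and Gupta minimum MSE in (7).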

Proposed estimator

In this section, an improved estimator of finite population mean in stratified random sampling is proposed. The properties of the proposed estimator are studied up to the first order of approximation. Development of the proposed estimator is given step-by-step below.

Rao [7] introduced the following estimator in simple random sampling

$$\begin{aligned} \hat{\bar{y}}_{\mathrm{Rao}}= \left[ w_{1}\left( \bar{X}-\bar{x}\right) +w_{2}\bar{y}\right] \end{aligned}$$

(11)

and its stratified version can be written as

$$\begin{aligned} \hat{\bar{y}}_{\mathrm{Rao}(st)}= \left[ w_{1}\left( \bar{X} -\bar{x}_{st}\right) +w_{2}\bar{y}_{st}\right] . \end{aligned}$$

(12)

The average of the ratio-type and product-type exponential estimators of Bahl and Tuteja [1] is

$$\begin{aligned} \hat{\bar{y}}_{ABT}= \frac{1}{2}\left\{ \bar{y}_{st}\mathrm{exp} \left( \frac{\bar{X}-\bar{x}_{st}}{\bar{X}+\bar{x}_{st}}\right) +\bar{y}_{st}\mathrm{exp}\left( \frac{\bar{x}_{st}-\bar{X}}{\bar{X} +\bar{x}_{st}}\right) \right\} , \end{aligned}$$

(13)

Now, by adding (12) and (13), one can propose the following estimator

$$\begin{aligned} \hat{\bar{y}}_{AA}= \left[ \frac{1}{2}\left\{ \bar{y}_{st}\mathrm{exp} \left( \frac{\bar{X}-\bar{x}_{st}}{\bar{X}+\bar{x}_{st}}\right) +\bar{y}_{st}\mathrm{exp}\left( \frac{\bar{x}_{st}-\bar{X}}{\bar{X} +\bar{x}_{st}}\right) \right\} +w_{1}\left( \bar{X} -\bar{x}_{st}\right) +w_{2}\bar{y}_{st}\right] , \end{aligned}$$

Hence, taking motivation from $\hat{\bar{y}}_{AA}$ and $\hat{\bar{y}}_{BTst}$, we propose the following estimator

$$\begin{aligned} \hat{\bar{y}}_{Nst}= & {} \left[ \frac{1}{2}\left\{ \bar{y}_{st}\mathrm{exp} \left( \frac{\bar{X}-\bar{x}_{st}}{\bar{X}+\bar{x}_{st}}\right) +\bar{y}_{st}\mathrm{exp}\left( \frac{\bar{x}_{st}-\bar{X}}{\bar{X} +\bar{x}_{st}}\right) \right\} \right. \nonumber \\&\left. +\,w_{1}\left( \bar{X} -\bar{x}_{st}\right) +w_{2}\bar{y}_{st}\right] \mathrm{exp} \left[ \frac{\bar{X}^{''}-\bar{x}_{st}^{''}}{\bar{X}^{''} +\bar{x}_{st}^{''}}\right] , \end{aligned}$$

(14)

where $\bar{X}^{''}=\bar{X}k$ and $\bar{x}_{st}^{''}=\bar{x}_{st}+\bar{X}(k-1)$ (see [3]), and $k = \frac{\rho _{st}+1}{4}$ is a suitably chosen constant.
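To make the structure of (14) concrete, the sketch below evaluates the estimator for given sample means and weights. The function name `proposed_estimate` and all numerical inputs are hypothetical; in practice $w_{1}$ and $w_{2}$ would be set to their optimum values rather than chosen by hand.

```python
from math import exp


def proposed_estimate(y_st, x_st, X_bar, rho_st, w1, w2):
    """Evaluate the estimator in (14) for given sample means and weights."""
    k = (rho_st + 1) / 4
    X2 = X_bar * k                    # X-bar double prime
    x2 = x_st + X_bar * (k - 1)       # x-bar_st double prime
    u = (X_bar - x_st) / (X_bar + x_st)
    # Average of the ratio-type and product-type exponential estimators, Eq. (13)
    avg_bt = 0.5 * (y_st * exp(u) + y_st * exp(-u))
    return (avg_bt + w1 * (X_bar - x_st) + w2 * y_st) * exp((X2 - x2) / (X2 + x2))


# Hypothetical sample means and weights, for illustration only
est = proposed_estimate(y_st=19.2, x_st=17.0, X_bar=17.4, rho_st=0.29, w1=0.2, w2=0.0)
print(f"estimate = {est:.4f}")
```

When $\bar{x}_{st}=\bar{X}$ both exponential factors equal one and the estimate reduces to $(1+w_{2})\bar{y}_{st}$. The denominator $\bar{X}^{''}+\bar{x}_{st}^{''}=\bar{x}_{st}+\bar{X}(2k-1)$ must be non-zero for the expression to be defined.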

The bias, MSE and minimum MSE of $\hat{\bar{y}}_{Nst}$ are given by

$$\begin{aligned} \mathrm{Bias}(\hat{\bar{y}}_{Nst})= & {} \bar{Y}\left\{ \frac{V_{0.2}}{8} \left( 1+\frac{3}{k^{2}}\right) -\frac{V_{1.1}}{2k}\right\} +\frac{\bar{X}V_{0.2}}{2k}w_{1}+\bar{Y}\left\{ 1+\frac{3V_{0.2}}{8k^{2}} -\frac{V_{1.1}}{2k}\right\} w_{2},\\ \mathrm{MSE}(\hat{\bar{y}}_{Nst})= & {} \bar{Y}^{2}L+w^{2}_{1}\lambda _{A} +w^{2}_{2}\lambda _{B}+2w_{1}w_{2}\lambda _{C}-2w_{1}\lambda _{D} -w_{2}\lambda _{E}, \\ \mathrm{MSE}_{\mathrm{min}}(\hat{\bar{y}}_{Nst})= & {} \left[ \bar{Y}^{2}L -\frac{\lambda _{B}\lambda _{D}^{2}+\frac{\lambda _{A}\lambda _{E}^{2}}{4} -\lambda _{C}\lambda _{D}\lambda _{E}}{\lambda _{A}\lambda _{B} -\lambda _{C}^{2}}\right] . \end{aligned}$$

The detailed derivations of the above expressions, together with the definitions of $L$ and $\lambda _{A},\ldots ,\lambda _{E}$, are provided in the Appendix.

Numerical illustration

Real data sets

To illustrate the theoretical results, the following real data sets are considered.

Population 1

[10, p. 219]

$X = \hbox {Number of milch cows}$ in the year 1990 and $Y=\hbox {Number of milch cows}$ in the year 1993.

$N_{1st}=7$	$N_{2st}=12$	$N_{3st}=5$
$n_{1st}=3$	$n_{2st}=5$	$n_{3st}=2$
$W_{1st}=0.2916$	$W_{2st}=0.5000$	$W_{3st}=0.2083$
$\bar{X}_{1st}=15.2857$	$\bar{X}_{2st}=17.2500$	$\bar{X}_{3st}=20.6000$
$\bar{Y}_{1st}=17.4285$	$\bar{Y}_{2st}=20.4166$	$\bar{Y}_{3st}=17.8000$
$S_{x1st}=4.5721$	$S_{x2st}=5.4958$	$S_{x3st}=3.6469$
$S_{y1st}=4.1975$	$S_{y2st}=4.0778$	$S_{y3st}=3.2710$
$\rho _{st}=0.29$	$n=10$	$N=24$

Population 2

[6, p. 228]

$X = \hbox {Number of workers in a factory}$ and $Y=\hbox {Output}$ of the factories in an area.

$N_{1st}=25$	$N_{2st}=23$	$N_{3st}=16$	$N_{4st}=16$
$n_{1st}=14$	$n_{2st}=13$	$n_{3st}=9$	$n_{4st}=9$
$W_{1st}=0.3125$	$W_{2st}=0.2875$	$W_{3st}=0.2000$	$W_{4st}=0.2000$
$\bar{X}_{1st}=71.00$	$\bar{X}_{2st}=140.69$	$\bar{X}_{3st}=362.93$	$\bar{X}_{4st}=749.50$
$\bar{Y}_{1st}=3156.64$	$\bar{Y}_{2st}=4766.21$	$\bar{Y}_{3st}=6334.18$	$\bar{Y}_{4st}=7795.31$
$S_{x1st}=14.6116$	$S_{x2st}=28.0364$	$S_{x3st}=91.3823$	$S_{x4st}=174.46$
$S_{y1st}=740.01$	$S_{y2st}=515.69$	$S_{y3st}=501.39$	$S_{y4st}=653.09$
$n=45$	$N=80$	$\rho _{st}=0.67$

The population is divided into four strata on the basis of the auxiliary variate (X). The stratum boundaries are $x < 100.0$, $100.0 \le x < 200.0$, $200.0 \le x < 500.0$ and $x \ge 500.0$, respectively. Proportional allocation is used to select the sample from each stratum, with a total sample size of $n=45$.

Population 3

[2]

$X = \hbox {Number of apple trees}$ in $N=854$ towns in Turkey in 1999 and $Y = \hbox {Level of apple production}$.

$N_{1st}=106$	$N_{2st}=106$	$N_{3st}=94$	$N_{4st}=171$	$N_{5st}=204$	$N_{6st}=173$
$n_{1st}=9$	$n_{2st}=17$	$n_{3st}=38$	$n_{4st}=67$	$n_{5st}=7$	$n_{6st}=2$
$W_{1st}=0.1241$	$W_{2st}=0.1241$	$W_{3st}=0.1101$	$W_{4st}=0.2002$	$W_{5st}=0.2386$	$W_{6st}=0.2025$
$\bar{X}_{1st}=243.76$	$\bar{X}_{2st}=274.22$	$\bar{X}_{3st}=724.10$	$\bar{X}_{4st}=773.65$	$\bar{X}_{5st}=264.42$	$\bar{X}_{6st}=98.44$
$\bar{Y}_{1st}=15.37$	$\bar{Y}_{2st}=22.13$	$\bar{Y}_{3st}=93.84$	$\bar{Y}_{4st}=55.88$	$\bar{Y}_{5st}=9.67$	$\bar{Y}_{6st}=4.04$
$C_{x1st}=2.02$	$C_{x2st}=2.10$	$C_{x3st}=2.22$	$C_{x4st}=3.84$	$C_{x5st}=1.72$	$C_{x6st}=1.91$
$C_{y1st}=4.18$	$C_{y2st}=5.22$	$C_{y3st}=3.19$	$C_{y4st}=5.13$	$C_{y5st}=2.47$	$C_{y6st}=2.31$
$\rho _{1st}=0.82$	$\rho _{2st}=0.86$	$\rho _{3st}=0.90$	$\rho _{4st}=0.99$	$\rho _{5st}=0.71$	$\rho _{6st}=0.89$
$\rho _{st}=0.82$	$n=140$	$N=854$

Population 4

[4]

$X=\hbox {Number of students}$ in primary and secondary schools, with $N=923$, at six different districts in Turkey in the year 2007, and $Y=\hbox {Number of teachers}.$

$N_{1st}=127$	$N_{2st}=117$	$N_{3st}=103$	$N_{4st}=170$	$N_{5st}=205$	$N_{6st}=201$
$n_{1st}=31$	$n_{2st}=21$	$n_{3st}=29$	$n_{4st}=38$	$n_{5st}=22$	$n_{6st}=39$
$W_{1st}=0.1375$	$W_{2st}=0.1267$	$W_{3st}=0.1115$	$W_{4st}=0.1841$	$W_{5st}=0.2221$	$W_{6st}=0.2177$
$\bar{X}_{1st}=20804.59$	$\bar{X}_{2st}=9211.79$	$\bar{X}_{3st}=14309.30$	$\bar{X}_{4st}=9478.85$	$\bar{X}_{5st}=5569.95$	$\bar{X}_{6st}=12997.59$
$\bar{Y}_{1st}=703.74$	$\bar{Y}_{2st}=413$	$\bar{Y}_{3st}=573.17$	$\bar{Y}_{4st}=424.66$	$\bar{Y}_{5st}=267.03$	$\bar{Y}_{6st}=393.84$
$C_{x1st}=1.465$	$C_{x2st}=1.648$	$C_{x3st}=1.925$	$C_{x4st}=1.922$	$C_{x5st}=1.526$	$C_{x6st}=1.777$
$C_{y1st}=1.256$	$C_{y2st}=1.562$	$C_{y3st}=1.803$	$C_{y4st}=1.909$	$C_{y5st}=1.512$	$C_{y6st}=1.807$
$\rho _{1st}=0.936$	$\rho _{2st}=0.996$	$\rho _{3st}=0.994$	$\rho _{4st}=0.983$	$\rho _{5st}=0.989$	$\rho _{6st}=0.965$
$\rho _{st}=0.95$	$n=180$	$N=923$

The MSEs and the constants they depend on are calculated from the above data sets. The proposed estimator is then compared with all of the reviewed estimators through the MSE and the percent relative efficiency (PRE). Table 1 reports the bias, MSE and PRE for populations 1, 2, 3 and 4.
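The text does not spell out its PRE formula. A common convention, assumed here, is $\mathrm{PRE}=100\times \mathrm{MSE}(\bar{y}_{st})/\mathrm{MSE}(\cdot )$, so that the stratified mean scores 100 and larger values indicate higher efficiency. The helper name and the MSE values below are hypothetical and are not taken from Table 1.

```python
def percent_relative_efficiency(mse_by_estimator, baseline="mean"):
    """PRE of each estimator relative to the baseline (assumed: the stratified mean)."""
    base = mse_by_estimator[baseline]
    return {name: 100.0 * base / value for name, value in mse_by_estimator.items()}


# Hypothetical MSE values, for illustration only
mse_values = {"mean": 1.6, "ratio": 1.2, "regression": 0.8}
pre = percent_relative_efficiency(mse_values)
for name, value in pre.items():
    print(f"{name:>10s}: PRE = {value:.1f}")
```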

Table 1 Bias, MSE and PRE based on real data sets

Simulation study

For assessing the performance of the proposed estimator, we use four different artificial populations in which $x^{'}_{hi}$ and $y^{'}_{hi}$ are generated from different distributions, as given in Table 2. To induce different levels of correlation between the study and auxiliary variables, the transformations given in Table 3 are applied. Each population contains three strata of five units each. We select all possible samples of sizes $n_{h} = 2, 3, 4$ from the three strata, respectively, which gives $^{5}C_{2}\,^{5}C_{3}\,^{5}C_{4}=10\times 10\times 5=500$ samples.
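The complete-enumeration design described above can be sketched as follows. The stratum values are made up for illustration, and taking the empirical MSE as the average squared error over all 500 samples is our assumption about how such a quantity would be computed. The sketch uses the stratified mean because its exact variance is known and serves as a check.

```python
from itertools import combinations, product
from statistics import mean, variance

# Hypothetical artificial population: three strata with five y-values each
strata_y = [
    [12.0, 15.0, 11.0, 14.0, 13.0],
    [22.0, 25.0, 21.0, 27.0, 24.0],
    [35.0, 31.0, 38.0, 33.0, 36.0],
]
n_h = [2, 3, 4]
N = sum(len(s) for s in strata_y)
W = [len(s) / N for s in strata_y]

# Every stratified sample is one combination from each stratum
all_samples = list(product(*(combinations(s, n) for s, n in zip(strata_y, n_h))))

Y_bar = sum(sum(s) for s in strata_y) / N
estimates = [sum(w * mean(part) for w, part in zip(W, sample)) for sample in all_samples]
empirical_mse = mean((est - Y_bar) ** 2 for est in estimates)

# Var(ybar_st) = sum_h W_h^2 * (1/n_h - 1/N_h) * S_yh^2
theoretical_var = sum(
    w**2 * (1 / n - 1 / len(s)) * variance(s)
    for w, n, s in zip(W, n_h, strata_y)
)
print(len(all_samples), empirical_mse, theoretical_var)
```

Because every possible sample is enumerated, the empirical MSE of $\bar{y}_{st}$ coincides with its design variance up to rounding. The same loop could evaluate any other estimator by replacing the expression inside `estimates`.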

Table 2 Parameters and distributions of study and auxiliary variables

Table 3 Characteristics of strata

The correlation between the study and auxiliary variables is set to 0.50, 0.70 and 0.90 in the three strata, respectively. For more details about these artificial stratified populations, see Koyuncu and Kadilar (2014). The MSE and PRE for artificial populations 1, 2, 3 and 4 are given in Table 4.

Table 4 Bias, MSE and PRE based on simulation

Conclusion

We have developed an improved estimator of the population mean in stratified random sampling. The MSE of the proposed estimator has been derived and compared with those of several existing estimators. A numerical illustration has been carried out using four real data sets. From Table 1, we can see that the proposed estimator has smaller bias and the minimum MSE for all real data sets. We have also conducted a simulation study to assess the efficiency of the proposed estimator on different artificial data sets, calculating both theoretical and empirical bias and MSE values for all estimators. From Table 4, we conclude that the proposed estimator is more efficient than the existing estimators. Overall, the numerical illustration indicates that the new estimator is more efficient than the classical mean, ratio, exponential and regression estimators and the estimators of Shabbir and Gupta [8] and Khan et al. [3]. We suggest the use of the proposed estimator for a more efficient estimation of the finite population mean in stratified random sampling.

References

[1] Bahl, S., Tuteja, R.K.: Ratio and product type exponential estimator. J. Inf. Optim. Sci. 12, 159–163 (1991)
[2] Kadilar, C., Cingi, H.: Ratio estimators in stratified sampling. Biom. J. 45, 218–225 (2003)
[3] Khan, S.A., Ali, H., Manzoor, S., Alamgir: A class of transformed efficient ratio estimators of finite population mean. Pak. J. Stat. 31, 353–362 (2015)
[4] Koyuncu, N., Kadilar, C.: Ratio and product estimators in stratified random sampling. J. Stat. Plann. Inference 139, 2552–2558 (2009)
[5] Koyuncu, N., Kadilar, C.: Calibration weighting in stratified random sampling. Commun. Stat. Simul. Comput. 45(7), 2267–2275 (2016)
[6] Murthy, M.N.: Sampling Theory and Methods. Statistical Publishing Society, Calcutta (1967)
[7] Rao, T.: On certain methods of improving ratio and regression estimators. Commun. Stat. Theory Methods 20, 3325–3340 (1991)
[8] Shabbir, J., Gupta, S.: On estimating finite population mean in simple and stratified random sampling. Commun. Stat. Theory Method 40, 199–212 (2011)
[9] Singh, H.P., Kakran, M.S.: A modified ratio estimator using known coefficient of kurtosis of an auxiliary character. Unpublished (1993)
[10] Singh, R., Mangat, N.S.: Elements of Survey Sampling. Kluwer Academic Publishers, London (1996)
[11] Sisodia, B.V.S., Dwivedi, V.K.: A modified ratio estimator using coefficient of variation of auxiliary variable. J. Indian Soc. Agric. Stat. 33, 13–18 (1981)
[12] Upadhyaya, L.N., Singh, H.P.: Use of transformed auxiliary variable in estimating the finite population mean. Biom. J. 41, 627–636 (1999)

Author information

Authors and Affiliations

Department of Mathematics and Statistics, PMAS-Arid Agriculture University, Rawalpindi, Pakistan
Usman Shahzad & Muhammad Hanif
Department of Statistics, Hacettepe University, Ankara, Turkey
Nursel Koyuncu

Corresponding author

Correspondence to Usman Shahzad.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

We can write (14) with the e terms as:

$$\begin{aligned} \hat{\bar{y}}_{Nst}=\left[ \bar{Y} \left\{ 1+\frac{1}{8}e_{1st}^{2}+e_{0st}\right\} -w_{1}\bar{X}e_{1st}+w_{2}\bar{Y}+w_{2} \bar{Y}e_{0st}\right] \left[ 1-\frac{e_{1st}}{2k} +\frac{3e_{1st}^{2}}{8k^{2}}+\cdots \right] . \end{aligned}$$

(15)

It implies that

$$\begin{aligned} \hat{\bar{y}}_{Nst}-\bar{Y}= & {} \frac{\bar{Y}}{8}e_{1st}^{2} +\bar{Y}e_{0st}-w_{1}\bar{X}e_{1st}+w_{2}\bar{Y}+w_{2}\bar{Y}e_{0st} -\bar{Y}\frac{e_{1st}}{2k}\nonumber \\&-\bar{Y}\frac{e_{0st}e_{1st}}{2k} +w_{1}\bar{X}\frac{e_{1st}^{2}}{2k}\nonumber \\&-w_{2}\bar{Y}\frac{e_{1st}}{2k}-w_{2}\bar{Y}\frac{e_{0st}e_{1st}}{2k} +\bar{Y}\frac{3e_{1st}^{2}}{8k^{2}}+w_{2}\bar{Y} \frac{3e_{1st}^{2}}{8k^{2}}. \end{aligned}$$

(16)

Now taking expectation we obtain the bias of $\hat{\bar{y}}_{Nst}$ as,

$$\begin{aligned} \mathrm{Bias}(\hat{\bar{y}}_{Nst})=\bar{Y}\left\{ \frac{V_{0.2}}{8} \left( 1+\frac{3}{k^{2}}\right) -\frac{V_{1.1}}{2k}\right\} +\frac{\bar{X}V_{0.2}}{2k}w_{1} +\bar{Y} \left\{ 1+\frac{3V_{0.2}}{8k^{2}}-\frac{V_{1.1}}{2k}\right\} w_{2}. \end{aligned}$$

We obtain $\mathrm{MSE}(\hat{\bar{y}}_{Nst})$ by squaring both sides of (16), ignoring terms of higher order, and taking expectations:

$$\begin{aligned} \mathrm{MSE}(\hat{\bar{y}}_{Nst})=\bar{Y}^{2}L+w^{2}_{1}\lambda _{A}+w^{2}_{2} \lambda _{B}+2w_{1}w_{2}\lambda _{C}-2w_{1}\lambda _{D}-w_{2}\lambda _{E}, \end{aligned}$$

(17)

where

$$\begin{aligned} L= & {} V_{2.0}+\frac{V_{0.2}}{4k^{2}} -\frac{V_{1.1}}{k},\\ \lambda _{A}= & {} \bar{X}^{2}V_{0.2},\\ \lambda _{B}= & {} \bar{Y}^{2}\left\{ 1+V_{2.0} +\frac{V_{0.2}}{k^{2}}-\frac{2V_{1.1}}{k}\right\} ,\\ \lambda _{C}= & {} \bar{X}\bar{Y}\left\{ \frac{V_{0.2}}{k}-V_{1.1}\right\} ,\\ \lambda _{D}= & {} \bar{X}\bar{Y}\left\{ -\frac{V_{0.2}}{2k}+V_{1.1}\right\} ,\\ \lambda _{E}= & {} \bar{Y}^{2}\left\{ \frac{3V_{1.1}}{k}-2V_{2.0} -\frac{V_{0.2}}{4}-\frac{5V_{0.2}}{4k^{2}}\right\} . \end{aligned}$$
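The algebra between (16) and the MSE expression is not spelled out. One way to organize it, offered here as a sketch with our own grouping and the hypothetical labels $T_{0}$, $T_{1}$, $T_{2}$, is to split the right-hand side of (16) by order in the e terms:

$$\begin{aligned} \hat{\bar{y}}_{Nst}-\bar{Y}= & {} T_{0}+T_{1}+T_{2},\qquad T_{0}=w_{2}\bar{Y},\\ T_{1}= & {} (1+w_{2})\bar{Y}\left( e_{0st}-\frac{e_{1st}}{2k}\right) -w_{1}\bar{X}e_{1st},\\ E\left[ (\hat{\bar{y}}_{Nst}-\bar{Y})^{2}\right] \approx & {} T_{0}^{2}+E(T_{1}^{2})+2T_{0}E(T_{2}), \end{aligned}$$

where $T_{2}$ collects the second-order terms of (16), $E(T_{1})=0$, and the products $T_{1}T_{2}$ and $T_{2}^{2}$ are of third order or higher. For instance, the coefficient of $w_{2}^{2}$ collects $\bar{Y}^{2}$ from $T_{0}^{2}$, $\bar{Y}^{2}\left( V_{2.0}+\frac{V_{0.2}}{4k^{2}}-\frac{V_{1.1}}{k}\right)$ from $E(T_{1}^{2})$ and $2\bar{Y}^{2}\left( \frac{3V_{0.2}}{8k^{2}}-\frac{V_{1.1}}{2k}\right)$ from $2T_{0}E(T_{2})$, which gives

$$\begin{aligned} \bar{Y}^{2}\left\{ 1+V_{2.0}+\frac{V_{0.2}}{k^{2}}-\frac{2V_{1.1}}{k}\right\} =\lambda _{B}. \end{aligned}$$

The remaining coefficients follow in the same way.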

Partially differentiating (17) with respect to $w_{1}$ and $w_{2}$ and equating to zero, we obtain the following equations

$$\begin{aligned} w_{1}\lambda _{A}+w_{2}\lambda _{C}= & {} \lambda _{D}, \end{aligned}$$

(18)

$$\begin{aligned} w_{1}\lambda _{C}+w_{2}\lambda _{B}= & {} \frac{\lambda _{E}}{2}. \end{aligned}$$

(19)

Solving this system by the matrix inversion method, we get the optimum values of $w_{1}$ and $w_{2}$, i.e.,

$$\begin{aligned} w^{\mathrm{opt}}_{1}=\left[ \frac{\lambda _{B}\lambda _{D}-\frac{\lambda _{C} \lambda _{E}}{2}}{\lambda _{A}\lambda _{B}-\lambda _{C}^{2}}\right] , \end{aligned}$$

(20)

and

$$\begin{aligned} w^{\mathrm{opt}}_{2}=\left[ \frac{-\lambda _{C}\lambda _{D}+\frac{\lambda _{A} \lambda _{E}}{2}}{\lambda _{A}\lambda _{B}-\lambda _{C}^{2}}\right] . \end{aligned}$$

(21)

Substituting $w^{\mathrm{opt}}_{1}$ and $w^{\mathrm{opt}}_{2}$ into $\mathrm{MSE}(\hat{\bar{y}}_{Nst})$, we get the minimum MSE of $\hat{\bar{y}}_{Nst}$, i.e.,

$$\begin{aligned} \mathrm{MSE}_{\mathrm{min}}(\hat{\bar{y}}_{Nst})=\left[ \bar{Y}^{2}L -\frac{\lambda _{B}\lambda _{D}^{2}+\frac{\lambda _{A} \lambda _{E}^{2}}{4}-\lambda _{C}\lambda _{D} \lambda _{E}}{\lambda _{A}\lambda _{B}-\lambda _{C}^{2}}\right] . \end{aligned}$$

(22)
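The closed-form optimum weights and minimum MSE can be evaluated numerically as in the sketch below. The function name `proposed_min_mse` and the inputs are hypothetical. $L$ is taken here as $V_{2.0}+V_{0.2}/(4k^{2})-V_{1.1}/k$, on the assumption that the sampling fractions $f^{'}_{h}$ are already contained in the $V$ terms.

```python
from math import sqrt


def proposed_min_mse(Y_bar, X_bar, V20, V02, V11):
    """Optimum (w1, w2) and first-order minimum MSE of the proposed estimator."""
    rho_st = V11 / sqrt(V20 * V02)
    k = (rho_st + 1) / 4
    # Assumption: L carries no extra f' factor, since the V terms already include f'_h
    L = V20 + V02 / (4 * k**2) - V11 / k
    lam_A = X_bar**2 * V02
    lam_B = Y_bar**2 * (1 + V20 + V02 / k**2 - 2 * V11 / k)
    lam_C = X_bar * Y_bar * (V02 / k - V11)
    lam_D = X_bar * Y_bar * (V11 - V02 / (2 * k))
    lam_E = Y_bar**2 * (3 * V11 / k - 2 * V20 - V02 / 4 - 5 * V02 / (4 * k**2))
    det = lam_A * lam_B - lam_C**2
    w1 = (lam_B * lam_D - lam_C * lam_E / 2) / det
    w2 = (-lam_C * lam_D + lam_A * lam_E / 2) / det
    mse_min = Y_bar**2 * L - (
        lam_B * lam_D**2 + lam_A * lam_E**2 / 4 - lam_C * lam_D * lam_E
    ) / det
    return {"k": k, "L": L, "lam_A": lam_A, "lam_B": lam_B, "lam_C": lam_C,
            "lam_D": lam_D, "lam_E": lam_E, "w1": w1, "w2": w2, "mse_min": mse_min}


# Hypothetical inputs, chosen only to exercise the formulas
res = proposed_min_mse(Y_bar=20.0, X_bar=15.0, V20=0.004, V02=0.005, V11=0.003)
print(f"w1={res['w1']:.4f}, w2={res['w2']:.4f}, min MSE={res['mse_min']:.4f}")
```

A useful check on these expressions: fixing $w_{2}=0$ and optimizing over $w_{1}$ alone gives $\bar{Y}^{2}L-\lambda _{D}^{2}/\lambda _{A}=\bar{Y}^{2}V_{2.0}(1-\rho _{st}^{2})$, the regression MSE in (5). The first-order minimum MSE of the proposed estimator therefore cannot exceed the regression MSE whenever $\lambda _{A}\lambda _{B}-\lambda _{C}^{2}>0$.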

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

About this article

Cite this article

Shahzad, U., Hanif, M. & Koyuncu, N. A new estimator for mean under stratified random sampling. Math Sci 12, 163–169 (2018). https://doi.org/10.1007/s40096-018-0255-3

Received: 01 March 2018
Accepted: 14 July 2018
Published: 23 July 2018
Issue Date: September 2018
DOI: https://doi.org/10.1007/s40096-018-0255-3
