Introduction

Stratification is a design tool used in modern surveys to improve the precision of estimates. In a stratified design, the whole population is divided into a number of strata so as to achieve homogeneity within each stratum, and samples are selected within each stratum, mostly through simple random sampling. Numerous authors have suggested estimators that utilize known population parameters of an auxiliary variable. Sisodia and Dwivedi [11] presented a ratio estimator utilizing the coefficient of variation of an auxiliary variable. Singh and Kakran [9] proposed another ratio estimator utilizing the known coefficient of kurtosis of an auxiliary variable. Upadhyaya and Singh [12] likewise examined a ratio-type estimator using a linear combination of the coefficient of variation and the coefficient of kurtosis of an auxiliary variable. Kadilar and Cingi [2] used stratified forms of these estimators in order to improve their efficiency.

The fundamental goal of this paper is to propose an improved estimator of the finite population mean utilizing data on an auxiliary variable in stratified random sampling. The expressions for the bias and mean square error (MSE) of the proposed estimator are derived up to the first order of approximation. On the basis of theoretical and numerical comparisons, we demonstrate that the proposed estimator is more efficient than existing estimators.

The rest of the paper is organized as follows: "Some existing estimators" section reviews estimators from the literature and gives some preliminary results needed to obtain the properties of the proposed and existing estimators. "Proposed estimator" section introduces an improved estimator under the stratified random sampling scheme. "Numerical illustration" section is devoted to the efficiency comparison, based on real data sets and a simulation study. "Conclusion" section summarizes the contribution of the paper.

Some existing estimators

Let \(U=\left\{ U^{'}_{1},U^{'}_{2},\ldots , U^{'}_{N}\right\} \) be a finite population of N units. Let Y be the study variable and X the auxiliary variable, taking values \(y_{hi}\) and \(x_{hi}\) for the ith unit \((i=1,2,\ldots ,N_{h})\) of the hth stratum \((h=1,2,\ldots ,L)\), which consists of \(N_{h}\) units such that \(\sum _{h=1}^{L}N_{h}=N\). Let \(n_{h}\) be the size of the sample drawn from the hth stratum by simple random sampling without replacement, such that \(\sum _{h=1}^{L}n_{h}=n\). Let \(\bar{y}_{st}=\sum _{h=1}^{L}W_{h}\bar{y}_{h}\), where \(\bar{y}_{h}=\frac{1}{n_{h}}\sum _{i=1}^{n_{h}}y_{hi}\), and \(\bar{x}_{st}=\sum _{h=1}^{L}W_{h}\bar{x}_{h}\), where \(\bar{x}_{h}=\frac{1}{n_{h}}\sum _{i=1}^{n_{h}}x_{hi}\) and \(W_{h}=\frac{N_{h}}{N}\) is the stratum weight. The corresponding population means \(\bar{Y}\) and \(\bar{X}\), and the stratum means \(\bar{Y}_{h}\) and \(\bar{X}_{h}\), are defined in a similar way.

To find the MSE of the proposed and existing estimators, let us define

$$\begin{aligned} e_{0st}=\frac{\bar{y}_{st}-\bar{Y}}{\bar{Y}},\quad e_{1st} =\frac{\bar{x}_{st}-\bar{X}}{\bar{X}}. \end{aligned}$$

The expectations of e terms are given below

$$\begin{aligned} E(e_{0st})= & {} 0,\,E(e_{1st})=0, \\ E(e_{0st}^{2})= & {} \sum _{h=1}^{L}W_{h}^{2}f^{'}_{h} \frac{S^{2}_{yh}}{\bar{Y}^{2}}=V_{2.0},\,E(e_{1st}^{2}) =\sum _{h=1}^{L}W_{h}^{2}f^{'}_{h}\frac{S^{2}_{xh}}{\bar{X}^{2}}=V_{0.2},\\ E(e_{0st}e_{1st})= & {} \sum _{h=1}^{L}W_{h}^{2}f^{'}_{h} \frac{S_{yxh}}{\bar{X}\bar{Y}}=V_{1.1} \end{aligned}$$

where

$$\begin{aligned} f_{h}=\frac{n_{h}}{N_{h}}, \,f^{'}_{h} =\left( \frac{1-f_{h}}{n_{h}}\right) , \end{aligned}$$

and

$$\begin{aligned} V_{a.b}= \sum _{h=1}^{L}W_{h}^{a+b}f^{'}_{h}\sum _{i=1}^{N_{h}} \frac{(y_{hi}-\bar{Y}_{h})^{a} (x_{hi}-\bar{X}_{h})^{b} }{\bar{Y}^{a} \bar{X}^{b}}. \end{aligned}$$

The variance of the sample mean in stratified random sampling without replacement is given by:

$$\begin{aligned} \mathrm{Var}(\bar{y}_{st})=\bar{Y}^{2}V_{2.0}. \end{aligned}$$
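
As a rough numerical illustration of these quantities, the following Python sketch evaluates \(f^{'}_{h}\), \(V_{2.0}\), \(V_{0.2}\) and \(\mathrm{Var}(\bar{y}_{st})\) from the stratum summaries of Population 1 reported in the "Numerical illustration" section. The variable names are our own, and the population means are taken as the weighted stratum means \(\bar{Y}=\sum _{h}W_{h}\bar{Y}_{h}\) and \(\bar{X}=\sum _{h}W_{h}\bar{X}_{h}\).

```python
# Stratum summaries of Population 1 (copied from the "Numerical illustration" section).
N_h = [7, 12, 5]                       # stratum sizes
n_h = [3, 5, 2]                        # stratum sample sizes
Xbar_h = [15.2857, 17.2500, 20.6000]   # stratum means of X
Ybar_h = [17.4285, 20.4166, 17.8000]   # stratum means of Y
S_xh = [4.5721, 5.4958, 3.6469]        # stratum standard deviations of X
S_yh = [4.1975, 4.0778, 3.2710]        # stratum standard deviations of Y

N = sum(N_h)
W_h = [Nh / N for Nh in N_h]           # stratum weights W_h = N_h / N

# f'_h = (1 - f_h) / n_h with f_h = n_h / N_h
fp_h = [(1 - nh / Nh) / nh for nh, Nh in zip(n_h, N_h)]

# Population means as weighted stratum means
Ybar = sum(W * m for W, m in zip(W_h, Ybar_h))
Xbar = sum(W * m for W, m in zip(W_h, Xbar_h))

# V_{2.0} and V_{0.2} as defined through E(e_0st^2) and E(e_1st^2)
V20 = sum(W**2 * fp * S**2 for W, fp, S in zip(W_h, fp_h, S_yh)) / Ybar**2
V02 = sum(W**2 * fp * S**2 for W, fp, S in zip(W_h, fp_h, S_xh)) / Xbar**2

# Variance of the stratified sample mean
var_yst = Ybar**2 * V20

print(f"Ybar={Ybar:.4f}, Xbar={Xbar:.4f}")
print(f"V20={V20:.6f}, V02={V02:.6f}, Var(ybar_st)={var_yst:.4f}")
```

The stratum covariances \(S_{yxh}\) are not listed for this population, so \(V_{1.1}\) is left out of the sketch.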

The stratified version of classical ratio estimator for mean is given by

$$\begin{aligned} \hat{\bar{y}}_{\mathrm{Rst}}=\frac{\bar{y}_{st}}{\bar{x}_{st}}\bar{X}. \end{aligned}$$
(1)

The bias and MSE of classical ratio estimator given in (1) up to the first order of approximation are given below:

$$\begin{aligned} \mathrm{Bias}(\hat{\bar{y}}_{\mathrm{Rst}})= & {} \bar{Y} \left[ V_{0.2}-V_{1.1}\right] \nonumber \\ \mathrm{MSE}(\hat{\bar{y}}_{\mathrm{Rst}})= & {} \bar{Y}^{2}\left[ V_{2.0}+V_{0.2} -2V_{1.1}\right] . \end{aligned}$$
(2)

The stratified version of Bahl and Tuteja [1] estimator is

$$\begin{aligned} \hat{\bar{y}}_{BTst}=\bar{y}_{st}\mathrm{exp} \left[ \frac{\bar{X}-\bar{x}_{st}}{\bar{X} +\bar{x}_{st}}\right] . \end{aligned}$$
(3)

The bias and MSE of \(\hat{\bar{y}}_{BTst}\) up to the first order of approximation are given below:

$$\begin{aligned} \mathrm{Bias}(\hat{\bar{y}}_{BTst})= & {} \bar{Y}\left[ \frac{3}{8}V_{0.2} -\frac{1}{2}V_{1.1}\right] \nonumber \\ \mathrm{MSE}(\hat{\bar{y}}_{BTst})= & {} \bar{Y}^{2}\left[ V_{2.0} +\frac{1}{4}V_{0.2}-V_{1.1}\right] . \end{aligned}$$
(4)

The traditional regression estimator is given by

$$\begin{aligned} \hat{\bar{y}}_{\mathrm{Reg}(st)}=\bar{y}_{st}+b_{st}(\bar{X}-\bar{x}_{st}). \end{aligned}$$

The MSE of \(\hat{\bar{y}}_{\mathrm{Reg}(st)}\) is,

$$\begin{aligned} \mathrm{MSE}(\hat{\bar{y}}_{\mathrm{Reg}(st)})=\bar{Y}^{2}V_{2.0}(1-\rho ^{2}_{st}), \end{aligned}$$
(5)

where \(\rho _{st}=\frac{V_{1.1}}{\sqrt{V_{2.0}}\sqrt{V_{0.2}}}\) is the combined correlation between study and auxiliary variate.
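
The first-order MSE expressions (2), (4) and (5) can be coded directly. The Python sketch below does so; the function names and the input values of \(\bar{Y}\), \(V_{2.0}\), \(V_{0.2}\) and \(V_{1.1}\) are hypothetical and chosen only for illustration.

```python
import math

def mse_ratio(Ybar, V20, V02, V11):
    """First-order MSE of the stratified ratio estimator, expression (2)."""
    return Ybar**2 * (V20 + V02 - 2 * V11)

def mse_exp_ratio(Ybar, V20, V02, V11):
    """First-order MSE of the stratified Bahl and Tuteja estimator, expression (4)."""
    return Ybar**2 * (V20 + V02 / 4 - V11)

def mse_regression(Ybar, V20, V02, V11):
    """MSE of the regression estimator, expression (5)."""
    rho_st = V11 / (math.sqrt(V20) * math.sqrt(V02))
    return Ybar**2 * V20 * (1 - rho_st**2)

# Hypothetical inputs (not taken from any of the data sets in the paper)
Ybar, V20, V02, V11 = 19.0, 0.0025, 0.0046, 0.0010

for name, fn in [("ratio", mse_ratio),
                 ("exponential ratio", mse_exp_ratio),
                 ("regression", mse_regression)]:
    print(f"MSE({name}) = {fn(Ybar, V20, V02, V11):.5f}")
```

Since \(V_{2.0}(1-\rho ^{2}_{st})=V_{2.0}-V_{1.1}^{2}/V_{0.2}\) is the minimum of \(V_{2.0}+\theta ^{2}V_{0.2}-2\theta V_{1.1}\) over \(\theta \), the value returned by `mse_regression` never exceeds the other two.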

Shabbir and Gupta [8] introduced an exponential ratio estimator as:

$$\begin{aligned} \hat{\bar{y}}_{SGst}=\left[ w_{1}^{SG}\bar{y}_{st}+w_{2}^{SG}(\bar{X} -\bar{x}_{st})\right] \mathrm{exp} \left( \frac{\bar{A}-\bar{a}_{st}}{\bar{A} +\bar{a}_{st}}\right) , \end{aligned}$$
(6)

where \(\bar{a}_{st}=\bar{x}_{st}+N\bar{X}\), \(\bar{A}=\bar{X}+N\bar{X}.\)

The optimum values of \(w_{1}^{SG}\) and \(w_{2}^{SG}\) are

\(w_{1}^{SG(\mathrm{opt})}=\frac{B_{SG}D_{SG} -\frac{C_{SG}E_{SG}}{2}}{A_{SG}B_{SG} -C_{SG}^{2}}\) and \(w_{2}^{SG(\mathrm{opt})} =\frac{-C_{SG}D_{SG}+\frac{A_{SG}E_{SG}}{2}}{A_{SG}B_{SG}-C_{SG}^{2}}.\)

The minimum MSE of \(\hat{\bar{y}}_{SGst}\) is

$$\begin{aligned} \mathrm{MSE}_{\mathrm{min}}(\hat{\bar{y}}_{SGst})=\bar{Y}^{2}-\frac{B_{SG}D_{SG}^{2} +\frac{A_{SG}E_{SG}^{2}}{4}-C_{SG}D_{SG}E_{SG}}{A_{SG}B_{SG} -C_{SG}^{2}}, \end{aligned}$$
(7)

where

$$\begin{aligned} A_{SG} &= {} \bar{Y}^{2}\left[ 1+V_{2.0}+\frac{V_{0.2}}{(1+N)^{2}} -\frac{2V_{1.1}}{(1+N)}\right] ,\\ B_{SG}&= {} \bar{X}^{2}V_{0.2}, \,\,C_{SG}=\bar{X}\bar{Y} \left[ \frac{V_{0.2}}{(1+N)}-V_{1.1}\right] ,\\ D_{SG}&= {} \bar{Y}^{2}\left[ 1+\frac{3V_{0.2}}{8(1+N)^{2}} -\frac{V_{1.1}}{2(1+N)}\right] \,\,\hbox {and}\,\,E_{SG}=\bar{X} \bar{Y}\frac{V_{0.2}}{(1+N)}. \end{aligned}$$

Khan et al. [3] proposed the following ratio estimator in simple random sampling:

$$\begin{aligned} \hat{\bar{y}}_{Kal}=\left[ w_{1}^{Kal}\bar{y}+w_{2}^{Kal} (\bar{X}-\bar{x})\right] \mathrm{exp}\left( \frac{\bar{A^{'}} -\bar{a^{'}}}{\bar{A^{'}}+\bar{a^{'}}}\right) , \end{aligned}$$
(8)

In stratified random sampling, it will be of the form:

$$\begin{aligned} \hat{\bar{y}}_{Kal(st)}=\left[ w_{1}^{Kal}\bar{y}_{st} +w_{2}^{Kal}(\bar{X}-\bar{x}_{st})\right] \mathrm{exp} \left( \frac{\bar{A^{'}}-\bar{a^{'}}_{st}}{\bar{A^{'}} +\bar{a^{'}}_{st}}\right) , \end{aligned}$$
(9)

where

\(\bar{A^{'}}=\bar{X}c_{ki}\), \(\bar{a^{'}}_{st}=\bar{x}_{st}+\bar{X}(c_{ki}-1),\) for \(i=1, 2, 3,\)

\(c_{k1}=\rho _{st}+1\), \(c_{k2}=\frac{\rho _{st}+1}{2}\), \(c_{k3}=\frac{\rho _{st}+1}{3}.\)

The optimum values of \(w_{1}^{Kal}\) and \(w_{2}^{Kal}\) are

\(w_{1}^{Kal(\mathrm{opt})}=\frac{B_{Kal}D_{Kal} -\frac{C_{Kal}E_{Kal}}{2}}{A_{Kal}B_{Kal}-C_{Kal}^{2}}\) and \(w_{2}^{Kal(\mathrm{opt})}=\frac{-C_{Kal}D_{Kal} +\frac{A_{Kal}E_{Kal}}{2}}{A_{Kal}B_{Kal}-C_{Kal}^{2}}.\)

The minimum MSE of \(\hat{\bar{y}}_{Kal(st)}\) is

$$\begin{aligned} \mathrm{MSE}_{\mathrm{min}}(\hat{\bar{y}}_{Kal(st)})=\bar{Y}^{2} -\frac{B_{Kal}D_{Kal}^{2}+\frac{A_{Kal}E_{Kal}^{2}}{4} -C_{Kal}D_{Kal}E_{Kal}}{A_{Kal}B_{Kal}-C_{Kal}^{2}}, \end{aligned}$$
(10)

where

$$\begin{aligned} A_{Kal}= & {} \bar{Y}^{2}\left[ 1+V_{2.0}+\frac{V_{0.2}}{c_{ki}^{2}} -\frac{2V_{1.1}}{c_{ki}}\right] ,\\ B_{Kal}= & {} \bar{X}^{2}V_{0.2}, \,\,C_{Kal}=\bar{X}\bar{Y} \left[ \frac{V_{0.2}}{c_{ki}}-V_{1.1}\right] ,\\ D_{Kal}= & {} \bar{Y}^{2}\left[ 1+\frac{3V_{0.2}}{8c_{ki}^{2}} -\frac{V_{1.1}}{2c_{ki}}\right] \,\,\hbox {and}\,\,E_{Kal}=\bar{X} \bar{Y}\frac{V_{0.2}}{c_{ki}}. \end{aligned}$$
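
The constants in (7) and (10) share one form: replacing \(c_{ki}\) by \(1+N\) in \(A_{Kal},\ldots ,E_{Kal}\) gives \(A_{SG},\ldots ,E_{SG}\). The Python sketch below uses this to compute the optimum weights and the minimum MSE for both estimators from a single pair of functions. The function names and input values are hypothetical. The quadratic in `mse_at` is an assumption on our part, written by analogy with the MSE expression of the proposed estimator, and is used only to check the closed forms numerically.

```python
import math

def class_constants(Ybar, Xbar, V20, V02, V11, c):
    """Constants A, B, C, D, E; c = 1 + N gives (7), c = c_ki gives (10)."""
    A = Ybar**2 * (1 + V20 + V02 / c**2 - 2 * V11 / c)
    B = Xbar**2 * V02
    C = Xbar * Ybar * (V02 / c - V11)
    D = Ybar**2 * (1 + 3 * V02 / (8 * c**2) - V11 / (2 * c))
    E = Xbar * Ybar * V02 / c
    return A, B, C, D, E

def optimum(Ybar, A, B, C, D, E):
    """Optimum weights and minimum MSE in the closed forms given in the text."""
    det = A * B - C**2
    w1 = (B * D - C * E / 2) / det
    w2 = (-C * D + A * E / 2) / det
    mse_min = Ybar**2 - (B * D**2 + A * E**2 / 4 - C * D * E) / det
    return w1, w2, mse_min

def mse_at(Ybar, A, B, C, D, E, w1, w2):
    """Assumed quadratic MSE in (w1, w2); not stated explicitly in the text."""
    return (Ybar**2 + w1**2 * A + w2**2 * B + 2 * w1 * w2 * C
            - 2 * w1 * D - w2 * E)

# Hypothetical inputs (for illustration only)
Ybar, Xbar, N = 19.0, 17.375, 24
V20, V02, V11 = 0.0025, 0.0046, 0.0010
rho_st = V11 / math.sqrt(V20 * V02)

choices = {
    "SG": 1 + N,
    "Kal, c_k1": rho_st + 1,
    "Kal, c_k2": (rho_st + 1) / 2,
    "Kal, c_k3": (rho_st + 1) / 3,
}

results = {}
for name, c in choices.items():
    consts = class_constants(Ybar, Xbar, V20, V02, V11, c)
    w1, w2, mse_min = optimum(Ybar, *consts)
    results[name] = (consts, w1, w2, mse_min)
    print(f"{name}: w1={w1:.5f}, w2={w2:.5f}, min MSE={mse_min:.5f}")
```

Evaluating `mse_at` at the returned weights reproduces `mse_min` up to rounding error, which is a quick consistency check on the closed-form expressions.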

Proposed estimator

In this section, an improved estimator of finite population mean in stratified random sampling is proposed. The properties of the proposed estimator are studied up to the first order of approximation. Development of the proposed estimator is given step-by-step below.

Rao [7] introduced the following estimator in simple random sampling

$$\begin{aligned} \hat{\bar{y}}_{\mathrm{Rao}}= \left[ w_{1}\left( \bar{X}-\bar{x}\right) +w_{2}\bar{y}\right] \end{aligned}$$
(11)

and its stratified version can be written as

$$\begin{aligned} \hat{\bar{y}}_{\mathrm{Rao}(st)}= \left[ w_{1}\left( \bar{X} -\bar{x}_{st}\right) +w_{2}\bar{y}_{st}\right] . \end{aligned}$$
(12)

As we know, the average of the Bahl and Tuteja [1] exponential ratio-type and product-type estimators is

$$\begin{aligned} \hat{\bar{y}}_{ABT}= \frac{1}{2}\left\{ \bar{y}_{st}\mathrm{exp} \left( \frac{\bar{X}-\bar{x}_{st}}{\bar{X}+\bar{x}_{st}}\right) +\bar{y}_{st}\mathrm{exp}\left( \frac{\bar{x}_{st}-\bar{X}}{\bar{X} +\bar{x}_{st}}\right) \right\} , \end{aligned}$$
(13)

Now, by adding (12) and (13), one can propose the following estimator

$$\begin{aligned} \hat{\bar{y}}_{AA}= \left[ \frac{1}{2}\left\{ \bar{y}_{st}\mathrm{exp} \left( \frac{\bar{X}-\bar{x}_{st}}{\bar{X}+\bar{x}_{st}}\right) +\bar{y}_{st}\mathrm{exp}\left( \frac{\bar{x}_{st}-\bar{X}}{\bar{X} +\bar{x}_{st}}\right) \right\} +w_{1}\left( \bar{X} -\bar{x}_{st}\right) +w_{2}\bar{y}_{st}\right] , \end{aligned}$$

Hence, taking motivation from \(\hat{\bar{y}}_{AA}\) and \(\hat{\bar{y}}_{BTst}\), we propose the following estimator

$$\begin{aligned} \hat{\bar{y}}_{Nst}= & {} \left[ \frac{1}{2}\left\{ \bar{y}_{st}\mathrm{exp} \left( \frac{\bar{X}-\bar{x}_{st}}{\bar{X}+\bar{x}_{st}}\right) +\bar{y}_{st}\mathrm{exp}\left( \frac{\bar{x}_{st}-\bar{X}}{\bar{X} +\bar{x}_{st}}\right) \right\} \right. \nonumber \\&\left. +\,w_{1}\left( \bar{X} -\bar{x}_{st}\right) +w_{2}\bar{y}_{st}\right] \mathrm{exp} \left[ \frac{\bar{X}^{''}-\bar{x}_{st}^{''}}{\bar{X}^{''} +\bar{x}_{st}^{''}}\right] , \end{aligned}$$
(14)

where \(\bar{X}^{''}=\bar{X}k\) and \(\bar{x}_{st}^{''}=\bar{x}_{st}+\bar{X}(k-1)\) (see [3]). Further, \(k = \frac{\rho _{st}+1}{4}\) is a suitably chosen constant.

The bias, MSE and minimum MSE of \(\hat{\bar{y}}_{Nst}\) are given by,

$$\begin{aligned} \mathrm{Bias}(\hat{\bar{y}}_{Nst})= & {} \bar{Y}\left\{ \frac{V_{0.2}}{8} \left( 1+\frac{3}{k^{2}}\right) -\frac{V_{1.1}}{2k}\right\} +\frac{\bar{X}V_{0.2}}{2k}w_{1}+\bar{Y}\left\{ \frac{3V_{0.2}}{8k^{2}} -\frac{V_{1.1}}{2k}w_{2}\right\} ,\\ \mathrm{MSE}(\hat{\bar{y}}_{Nst})= & {} \bar{Y}^{2}L+w^{2}_{1}\lambda _{A} +w^{2}_{2}\lambda _{B}+2w_{1}w_{2}\lambda _{C}-2w_{1}\lambda _{D} -w_{2}\lambda _{E}, \\ \mathrm{MSE}_{\mathrm{min}}(\hat{\bar{y}}_{Nst})= & {} \left[ \bar{Y}^{2}L -\frac{\lambda _{B}\lambda _{D}^{2}+\frac{\lambda _{A}\lambda _{E}^{2}}{4} -\lambda _{C}\lambda _{D}\lambda _{E}}{\lambda _{A}\lambda _{B} -\lambda _{C}^{2}}\right] . \end{aligned}$$

The detailed proofs of the above expressions are provided in the Appendix.
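
In outline, and only as a sketch under the assumption that \(\lambda _{A}>0\) and \(\lambda _{A}\lambda _{B}-\lambda _{C}^{2}>0\) (so that the stationary point is a minimum), the minimum MSE follows from the MSE expression by setting its partial derivatives with respect to \(w_{1}\) and \(w_{2}\) equal to zero, which gives the normal equations

$$\begin{aligned} w_{1}\lambda _{A}+w_{2}\lambda _{C}= & {} \lambda _{D},\\ w_{1}\lambda _{C}+w_{2}\lambda _{B}= & {} \frac{\lambda _{E}}{2}, \end{aligned}$$

whose solution is

$$\begin{aligned} w_{1}^{(\mathrm{opt})}=\frac{\lambda _{B}\lambda _{D}-\frac{\lambda _{C}\lambda _{E}}{2}}{\lambda _{A}\lambda _{B}-\lambda _{C}^{2}},\quad w_{2}^{(\mathrm{opt})}=\frac{-\lambda _{C}\lambda _{D}+\frac{\lambda _{A}\lambda _{E}}{2}}{\lambda _{A}\lambda _{B}-\lambda _{C}^{2}}. \end{aligned}$$

At these values the MSE equals \(\bar{Y}^{2}L-w_{1}^{(\mathrm{opt})}\lambda _{D}-\frac{1}{2}w_{2}^{(\mathrm{opt})}\lambda _{E}\), which reduces to the expression for \(\mathrm{MSE}_{\mathrm{min}}(\hat{\bar{y}}_{Nst})\) given above. The symbols \(w_{1}^{(\mathrm{opt})}\) and \(w_{2}^{(\mathrm{opt})}\) are our notation, chosen to mirror the optimum weights of the reviewed estimators.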

Numerical illustration

Real data sets

To investigate the theoretical results, the following real data sets are considered:

Population 1

[10, p. 219]

\(X = \hbox {Number of milk cows}\) in the year 1990 and \(Y=\hbox {Number of milk cows}\) in the year 1993.

\(N_{1st}=7\)

\(N_{2st}=12\)

\(N_{3st}=5\)

\(n_{1st}=3\)

\(n_{2st}=5\)

\(n_{3st}=2\)

\(W_{1st}=0.2916\)

\(W_{2st}=0.5000\)

\(W_{3st}=0.2083\)

\(\bar{X}_{1st}=15.2857\)

\(\bar{X}_{2st}=17.2500\)

\(\bar{X}_{3st}=20.6000\)

\(\bar{Y}_{1st}=17.4285\)

\(\bar{Y}_{2st}=20.4166\)

\(\bar{Y}_{3st}=17.8000\)

\(S_{x1st}=4.5721\)

\(S_{x2st}=5.4958\)

\(S_{x3st}=3.6469\)

\(S_{y1st}=4.1975\)

\(S_{y2st}=4.0778\)

\(S_{y3st}=3.2710\)

\(\rho _{st}=0.29\)

\(n=10\)

\(N=24\)

Population 2

[6, p. 228]

\(X = \hbox {Number of workers working in a factory}\) and \(Y=\hbox {Yield}\) or output for factories in an area.

\(N_{1st}=25\)

\(N_{2st}=23\)

\(N_{3st}=16\)

\(N_{4st}=16\)

\(n_{1st}=14\)

\(n_{2st}=13\)

\(n_{3st}=9\)

\(n_{4st}=9\)

\(W_{1st}=0.3125\)

\(W_{2st}=0.2875\)

\(W_{3st}=0.2000\)

\(W_{4st}=0.2000\)

\(\bar{X}_{1st}=71.00\)

\(\bar{X}_{2st}=140.69\)

\(\bar{X}_{3st}=362.93\)

\(\bar{X}_{4st}=749.50\)

\(\bar{Y}_{1st}=3156.64\)

\(\bar{Y}_{2st}=4766.21\)

\(\bar{Y}_{3st}=6334.18\)

\(\bar{Y}_{4st}=7795.31\)

\(S_{x1st}=14.6116\)

\(S_{x2st}=28.0364\)

\(S_{x3st}=91.3823\)

\(S_{x4st}=174.46\)

\(S_{y1st}=740.01\)

\(S_{y2st}=515.69\)

\(S_{y3st}=501.39\)

\(S_{y4st}=653.09\)

\(n=45\)

\(N=80\)

\(\rho _{st}=0.67\)

The population is divided into four strata on the basis of the auxiliary variate (X). The stratum boundaries are \(x < 100.0\), \(100.0 \le x < 200.0\), \(200.0 \le x < 500.0\) and \(x \ge 500.0\), respectively. Proportional allocation is used to select the sample from each stratum, with \(n=45\).
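
The stratum sample sizes listed for Population 2 can be reproduced with a short Python sketch of proportional allocation, \(n_{h}\approx nN_{h}/N\). The function names are ours. The stratum rule reads the two middle intervals as \(100 \le x < 200\) and \(200 \le x < 500\), and simple rounding is an assumption that happens to return a total of \(n\) here but need not do so in general.

```python
def stratum_of(x):
    """Stratum index (1..4) for a value of the auxiliary variable X."""
    if x < 100.0:
        return 1
    if x < 200.0:
        return 2
    if x < 500.0:
        return 3
    return 4

def proportional_allocation(N_h, n):
    """Round n * N_h / N for each stratum (may need adjusting to sum to n)."""
    N = sum(N_h)
    return [round(n * Nh / N) for Nh in N_h]

N_h = [25, 23, 16, 16]   # stratum sizes of Population 2
n = 45                   # total sample size

n_h = proportional_allocation(N_h, n)
print(n_h, sum(n_h))
```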

Population 3

([2])

\(X = \hbox {Number of apple trees}\) in \(N=854\) towns in Turkey in 1999 and \(Y = \hbox {Level of apple production}\).

\(N_{1st}=106\)

\(N_{2st}=106\)

\(N_{3st}=94\)

\(N_{4st}=171\)

\(N_{5st}=204\)

\(N_{6st}=173\)

\(n_{1st}=9\)

\(n_{2st}=17\)

\(n_{3st}=38\)

\(n_{4st}=67\)

\(n_{5st}=7\)

\(n_{6st}=2\)

\(W_{1st}=0.1241\)

\(W_{2st}=0.1241\)

\(W_{3st}=0.1101\)

\(W_{4st}=0.2002\)

\(W_{5st}=0.2386\)

\(W_{6st}=0.2025\)

\(\bar{X}_{1st}=243.76\)

\(\bar{X}_{2st}=274.22\)

\(\bar{X}_{3st}=724.10\)

\(\bar{X}_{4st}=773.65\)

\(\bar{X}_{5st}=264.42\)

\(\bar{X}_{6st}=98.44\)

\(\bar{Y}_{1st}=15.37\)

\(\bar{Y}_{2st}=22.13\)

\(\bar{Y}_{3st}=93.84\)

\(\bar{Y}_{4st}=55.88\)

\(\bar{Y}_{5st}=9.67\)

\(\bar{Y}_{6st}=4.04\)

\(C_{x1st}=2.02\)

\(C_{x2st}=2.10\)

\(C_{x3st}=2.22\)

\(C_{x4st}=3.84\)

\(C_{x5st}=1.72\)

\(C_{x6st}=1.91\)

\(C_{y1st}=4.18\)

\(C_{y2st}=5.22\)

\(C_{y3st}=3.19\)

\(C_{y4st}=5.13\)

\(C_{y5st}=2.47\)

\(C_{y6st}=2.31\)

\(\rho _{1st}=0.82\)

\(\rho _{2st}=0.86\)

\(\rho _{3st}=0.90\)

\(\rho _{4st}=0.99\)

\(\rho _{5st}=0.71\)

\(\rho _{6st}=0.89\)

\(\rho _{st}=0.82\)

\(n=140\)

Population 4

([4])

\(X=\hbox {Number of students}\) in primary and secondary schools, with \(N=923\), in six different districts of Turkey in the year 2007, and \(Y=\hbox {Number of teachers}.\)

\(N_{1st}=127\)

\(N_{2st}=117\)

\(N_{3st}=103\)

\(N_{4st}=170\)

\(N_{5st}=205\)

\(N_{6st}=201\)

\(n_{1st}=31\)

\(n_{2st}=21\)

\(n_{3st}=29\)

\(n_{4st}=38\)

\(n_{5st}=22\)

\(n_{6st}=39\)

\(W_{1st}=0.1375\)

\(W_{2st}=0.1267\)

\(W_{3st}=0.1115\)

\(W_{4st}=0.1841\)

\(W_{5st}=0.2221\)

\(W_{6st}=0.2177\)

\(\bar{X}_{1st}=20804.59\)

\(\bar{X}_{2st}=9211.79\)

\(\bar{X}_{3st}=14309.30\)

\(\bar{X}_{4st}=9478.85\)

\(\bar{X}_{5st}=5569.95\)

\(\bar{X}_{6st}=12997.59\)

\(\bar{Y}_{1st}=703.74\)

\(\bar{Y}_{2st}=413\)

\(\bar{Y}_{3st}=573.17\)

\(\bar{Y}_{4st}=424.66\)

\(\bar{Y}_{5st}=267.03\)

\(\bar{Y}_{6st}=393.84\)

\(C_{x1st}=1.465\)

\(C_{x2st}=1.648\)

\(C_{x3st}=1.925\)

\(C_{x4st}=1.922\)

\(C_{x5st}=1.526\)

\(C_{x6st}=1.777\)

\(C_{y1st}=1.256\)

\(C_{y2st}=1.562\)

\(C_{y3st}=1.803\)

\(C_{y4st}=1.909\)

\(C_{y5st}=1.512\)

\(C_{y6st}=1.807\)

\(\rho _{1st}=0.936\)

\(\rho _{2st}=0.996\)

\(\rho _{3st}=0.994\)

\(\rho _{4st}=0.983\)

\(\rho _{5st}=0.989\)

\(\rho _{6st}=0.965\)

\(\rho _{st}=0.95\)

\(n=180\)

The MSEs and their respective constants are calculated using the above-mentioned data sets. The improved estimator is then compared with all of the reviewed estimators through MSE and percent relative efficiency (PRE). Table 1 reports the results for populations 1, 2, 3 and 4.
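
The PRE is not defined explicitly in this section. A minimal Python sketch, assuming the usual definition in which the stratified sample mean \(\bar{y}_{st}\) is the reference estimator, is given below; the function name and the MSE values are hypothetical.

```python
def pre(mse_reference, mse_estimator):
    """Percent relative efficiency of an estimator with respect to a reference."""
    return 100.0 * mse_reference / mse_estimator

# Hypothetical MSE values (not taken from Table 1)
mse_values = {"sample mean": 0.90, "regression": 0.75, "proposed": 0.60}

for name, mse in mse_values.items():
    print(f"PRE({name}) = {pre(mse_values['sample mean'], mse):.2f}")
```

Under this definition a PRE above 100 means that the estimator has a smaller MSE than the reference.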

Table 1 Bias, MSE and PRE based on real data sets

Simulation study

To assess the performance of the proposed estimator, we use four different artificial populations where \(x^{'}_{hi}\) and \(y^{'}_{hi}\) are drawn from different distributions, as given in Table 2. To induce different levels of correlation between the study and auxiliary variables, some transformations are applied, as given in Table 3. Each population contains three strata of five units each. We select all possible samples of \(n_{h} = 2, 3, 4\) units from the three strata, respectively, which gives \(^{5}C_{2}\,^{5}C_{3}\,^{5}C_{4}=500\) samples.
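
The enumeration of all possible stratified samples can be sketched in Python as follows. The \(y\) values are hypothetical (they are not the artificial populations of Table 2), and only the stratified sample mean \(\bar{y}_{st}\) is evaluated. Because every one of the 500 samples is equally likely under simple random sampling without replacement within strata, the empirical MSE over all samples coincides with the theoretical variance \(\sum _{h}W_{h}^{2}f^{'}_{h}S^{2}_{yh}\).

```python
from itertools import combinations, product
from statistics import mean, variance

# Hypothetical study-variable values: three strata of five units each
y = [
    [12.0, 15.0, 11.0, 14.0, 13.0],
    [22.0, 25.0, 21.0, 27.0, 24.0],
    [35.0, 31.0, 38.0, 33.0, 36.0],
]
n_h = [2, 3, 4]                      # sample sizes per stratum
N_h = [len(s) for s in y]
N = sum(N_h)
W_h = [Nh / N for Nh in N_h]
Ybar = sum(W * mean(s) for W, s in zip(W_h, y))

# All SRSWOR samples within each stratum, then every combination across strata
per_stratum = [list(combinations(s, k)) for s, k in zip(y, n_h)]
all_samples = list(product(*per_stratum))

# Stratified sample mean for every possible sample
estimates = [sum(W * mean(part) for W, part in zip(W_h, sample))
             for sample in all_samples]

empirical_mse = mean((e - Ybar) ** 2 for e in estimates)
theoretical_var = sum(W**2 * (1 - k / Nh) / k * variance(s)
                      for W, k, Nh, s in zip(W_h, n_h, N_h, y))

print(len(all_samples), empirical_mse, theoretical_var)
```

The same loop can be reused for any other estimator by replacing the expression inside `estimates`.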

Table 2 Parameters and distributions of study and auxiliary variables
Table 3 Characteristics of strata

The correlation between the study and auxiliary variables is taken as 0.50, 0.70 and 0.90 in the three strata, respectively. For more details about these artificial stratified populations, see Koyuncu and Kadilar (2014). The MSE and PRE based on artificial populations 1, 2, 3 and 4 are given in Table 4.

Table 4 Bias, MSE and PRE based on simulation

Conclusion

We have developed an improved estimator for the population mean in stratified random sampling. The MSE of the improved estimator has been derived and compared with those of some existing estimators. A numerical illustration has also been carried out using four real data sets. From Table 1, we can see that the proposed estimator is less biased and has the minimum MSE for all real data sets. We have also conducted a simulation study to assess the efficiency of the proposed estimator for different artificial data sets, calculating both theoretical and empirical bias and MSE values for all estimators. From Table 4, we can conclude that the proposed estimator is more efficient than the existing estimators. The numerical illustration thus shows that the new estimator is more efficient than the classical mean, ratio, exponential and regression estimators and the estimators of Shabbir and Gupta [8] and Khan et al. [3]. We suggest the use of the proposed estimator for a more efficient estimation of the finite population mean in stratified random sampling.