A new estimator for mean under stratified random sampling

Abstract
In this paper, we propose an estimator of the finite population mean in stratified random sampling. Expressions for the bias and mean square error of the proposed estimator are obtained up to the first order of approximation. The proposed estimator is found to be more efficient than the traditional mean, ratio, exponential and regression estimators, as well as the estimators of Shabbir and Gupta (Commun Stat Theory Method 40:199–212, 2011) and Khan et al. (Pak J Stat 31:353–362, 2015). Four natural and four artificial data sets under a stratified random sampling scheme are used to assess the performance of all the estimators considered.


Introduction
Stratification is a design tool used in modern surveys to improve the precision of estimates. In a stratified design, the whole population is divided into a number of strata so as to achieve homogeneity within each stratum, and samples are selected within each stratum, mostly through simple random sampling. Numerous authors have suggested estimators that utilize known population parameters of an auxiliary variable. Sisodia and Dwivedi [11] presented a ratio estimator utilizing the coefficient of variation of an auxiliary variable. Singh and Kakran [9] proposed another ratio estimator utilizing the known coefficient of kurtosis of an auxiliary variable. Upadhyaya and Singh [12] likewise examined a ratio-type estimator using a linear combination of the coefficient of variation and the coefficient of kurtosis of an auxiliary variable. Kadilar and Cingi [2] utilized the stratified forms of these estimators in order to improve their efficiency.
The main goal of this paper is to propose an improved estimator of the finite population mean that uses data on an auxiliary variable in stratified random sampling. Expressions for the bias and mean square error (MSE) of the proposed estimator are derived up to the first order of approximation. On the basis of theoretical and numerical comparisons, we demonstrate that the proposed estimator is more efficient than the existing estimators.
The rest of the paper is organized as follows. The "Some existing estimators" section reviews estimators from the literature and gives the preliminary results needed to obtain the properties of the proposed and existing estimators. The "Proposed estimator" section introduces an improved estimator under the stratified random sampling scheme. The "Numerical illustration" section is devoted to the efficiency comparison on real and artificial data sets, and the "Conclusion" section summarizes the contribution of the paper.

Some existing estimators
Consider a finite population of N units. Let Y be the study variable and X the auxiliary variable, taking values $y_{hi}$ and $x_{hi}$ on the $i$th unit ($i = 1, 2, \ldots, N_h$) of the $h$th stratum, which consists of $N_h$ units, such that $\sum_{h=1}^{L} N_h = N$. Let $n_h$ be the size of the sample drawn from the $h$th stratum by simple random sampling without replacement, such that $\sum_{h=1}^{L} n_h = n$.
The expressions for Y are defined in a similar way.
To find the MSE of the proposed and existing estimators, let us define:
The expectations of the e terms are given below:
The variance of the sample mean in stratified random sampling without replacement is given by:
The stratified version of the classical ratio estimator for the mean is given by:
The bias and MSE of the classical ratio estimator given in (1), up to the first order of approximation, are given below:
The stratified version of the Bahl and Tuteja [1] estimator is:
The bias and MSE of $y_{BTst}$, up to the first order of approximation, are given below:
The traditional regression estimator is given by:
The MSE of $y_{Reg(st)}$ is:
where $\rho_{st}$ is the combined correlation between the study and auxiliary variates.
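As a rough illustration of the estimators reviewed above, the following Python sketch implements commonly used textbook forms of the stratified mean, its variance under simple random sampling without replacement, and the combined ratio, exponential ratio and regression estimators. The function names, the argument layout and the exact forms used here are assumptions made for illustration; they are not taken from the displayed equations of this paper and may differ from them in notation or detail.

```python
import math


def stratified_mean(samples, N_h):
    """Weighted mean sum_h W_h * (sample mean in stratum h), W_h = N_h / N."""
    N = sum(N_h)
    return sum((Nh / N) * (sum(s) / len(s)) for s, Nh in zip(samples, N_h))


def var_stratified_mean(S2_h, n_h, N_h):
    """Textbook variance of the stratified mean under SRSWOR within strata:
    sum_h W_h^2 * (1/n_h - 1/N_h) * S_h^2 (assumed form, S_h^2 = stratum variance)."""
    N = sum(N_h)
    return sum((Nh / N) ** 2 * (1 / nh - 1 / Nh) * s2
               for s2, nh, Nh in zip(S2_h, n_h, N_h))


def combined_ratio(y_st, x_st, X_bar):
    """Combined ratio estimator (assumed form): y_st * X_bar / x_st."""
    return y_st * X_bar / x_st


def exponential_ratio(y_st, x_st, X_bar):
    """Exponential ratio-type estimator in the style of Bahl and Tuteja
    (assumed form): y_st * exp((X_bar - x_st) / (X_bar + x_st))."""
    return y_st * math.exp((X_bar - x_st) / (X_bar + x_st))


def combined_regression(y_st, x_st, X_bar, b):
    """Combined regression estimator (assumed form): y_st + b * (X_bar - x_st)."""
    return y_st + b * (X_bar - x_st)


# Hypothetical two-stratum sample, used only to show the call pattern.
y_samples = [[2, 4], [10, 12, 14]]
x_samples = [[1, 3], [5, 7, 9]]
N_h = [4, 6]
y_st = stratified_mean(y_samples, N_h)
x_st = stratified_mean(x_samples, N_h)
ratio_estimate = combined_ratio(y_st, x_st, X_bar=5.5)
```

When the stratified sample mean of X equals the known population mean, all three auxiliary-information estimators in this sketch reduce to the plain stratified mean, which is a quick sanity check on any implementation.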
Shabbir and Gupta [8] introduced an exponential ratio estimator as:
where
The minimum MSE of $y_{SG}$ is:
Khan et al. [3] proposed the following ratio estimator in simple random sampling:
In stratified random sampling, it takes the form:
The optimum values of $w_{Kal1}$ and $w_{Kal2}$ are:
The minimum MSE of $y_{Kalst}$ is:

Proposed estimator
In this section, an improved estimator of the finite population mean in stratified random sampling is proposed. The properties of the proposed estimator are studied up to the first order of approximation. The development of the proposed estimator is given step by step below.
Rao [7] introduced the following estimator in simple random sampling:
and its stratified version can be written as:
As we know, the average of the Bahl and Tuteja [1] estimators is:
Now, by adding (12) and (13), one can propose the following estimator:
Hence, taking motivation from $y_{AA}$ and $y_{BTst}$, we propose the following estimator:
where $w_1$ and $w_2$ are suitably chosen constants.
The bias, MSE and minimum MSE of $y_{Nst}$ are given by:
The detailed proofs of the above expressions are provided in the Appendix.

Real data sets
To investigate the theoretical results, the following real data sets are considered.
Population 1 [10, p. 219]: X = number of milk cows in 1990 and Y = number of milk cows in 1993.
Population 2 [6, p. 228]: X = number of workers in a factory and Y = output of factories in an area.
The units are grouped into four strata on the basis of the auxiliary variate X. The stratum boundaries are $x < 100.0$, $100.0 \le x < 200.0$, $200.0 \le x < 500.0$ and $x \ge 500.0$, respectively. Proportional allocation is used to select the sample from each stratum, with a total sample size of n = 45.
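The stratum construction and proportional allocation described above can be sketched in Python as follows. The cut points 100, 200 and 500 and the total sample size of 45 come from the text; the function names, the stratum sizes in the example and the largest-remainder rounding rule are assumptions made for illustration, since the text does not state how fractional allocations are rounded.

```python
from bisect import bisect_right

# Stratum boundaries on the auxiliary variate X, as stated in the text.
CUTS = [100.0, 200.0, 500.0]


def stratum_of(x):
    """Return the stratum index of a unit with auxiliary value x:
    0: x < 100, 1: 100 <= x < 200, 2: 200 <= x < 500, 3: x >= 500."""
    return bisect_right(CUTS, x)


def proportional_allocation(N_h, n):
    """Allocate n sample units so that n_h is proportional to N_h.

    Fractional parts are resolved with the largest-remainder rule
    (an assumption of this sketch) so that the n_h sum exactly to n.
    """
    N = sum(N_h)
    exact = [n * Nh / N for Nh in N_h]
    alloc = [int(e) for e in exact]  # floor of each exact share
    leftover = n - sum(alloc)
    # Give the remaining units to the strata with the largest fractional parts.
    order = sorted(range(len(N_h)), key=lambda h: exact[h] - alloc[h], reverse=True)
    for h in order[:leftover]:
        alloc[h] += 1
    return alloc


# Hypothetical stratum sizes, used only to show the call pattern.
n_h = proportional_allocation([60, 25, 10, 5], 45)
```

This sketch does not guard against an allocation exceeding the stratum size, which cannot happen under proportional allocation when n is at most N but should be checked if a different rounding rule is used.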

Population 3 [2]: X = number of apple trees in N = 854 towns of Turkey in 1999 and Y = level of apple production.
The MSEs and their respective constants are calculated using the above-mentioned data sets. The improved estimator is then compared with all of the reviewed estimators through the MSE and the percent relative efficiency (PRE). Table 1 reports the MSE and PRE for populations 1, 2, 3 and 4, respectively.
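The PRE comparison can be sketched in Python as below. The sketch assumes the usual convention that the PRE of an estimator is 100 times the MSE of a reference estimator (here taken to be the stratified sample mean) divided by the MSE of the estimator in question; the text does not state the reference explicitly. The function names and the MSE values in the example are hypothetical and are not results from Table 1.

```python
def percent_relative_efficiency(mse_reference, mse_estimator):
    """PRE = 100 * MSE(reference) / MSE(estimator); values above 100
    indicate that the estimator is more efficient than the reference."""
    return 100.0 * mse_reference / mse_estimator


def pre_table(mse_by_estimator, reference):
    """Compute the PRE of every estimator relative to the named reference."""
    ref = mse_by_estimator[reference]
    return {name: percent_relative_efficiency(ref, mse)
            for name, mse in mse_by_estimator.items()}


# Hypothetical MSE values, used only to show the call pattern.
example = pre_table({"mean": 4.0, "ratio": 2.0, "regression": 1.6}, reference="mean")
```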

Simulation study
To assess the performance of the proposed estimator, we use four different artificial populations in which $x'_{hi}$ and $y'_{hi}$ are drawn from different distributions, as given in Table 2. To induce different levels of correlation between the study and auxiliary variables, the transformations given in Table 3 are applied. Each population contains three strata of five units each. We select all possible samples of $n_h = 2, 3, 4$ units from the respective strata, which gives $\binom{5}{2}\binom{5}{3}\binom{5}{4} = 10 \times 10 \times 5 = 500$ samples. The degree of linear relationship between the study and auxiliary variables is taken as 0.50, 0.70 and 0.90 for the three strata, respectively. For more details about these artificial stratified populations, see Koyuncu and Kadilar (2014). The MSE and PRE for artificial populations 1, 2, 3 and 4 are given in Table 4.
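The complete enumeration of samples used in the simulation can be sketched in Python as follows. The design (three strata of five units, with 2, 3 and 4 units drawn from them) is taken from the text; the function names and the population values in the example are hypothetical, and the estimator shown is the plain stratified mean rather than the proposed estimator.

```python
from itertools import combinations, product
from math import comb


def all_stratified_samples(strata, n_h):
    """Enumerate every stratified sample: one SRSWOR subset of size n_h[h]
    from each stratum, combined across strata."""
    per_stratum = [list(combinations(units, k)) for units, k in zip(strata, n_h)]
    return list(product(*per_stratum))


def empirical_bias_mse(estimator, strata, n_h, true_value):
    """Empirical bias and MSE of an estimator over all possible samples."""
    estimates = [estimator(s) for s in all_stratified_samples(strata, n_h)]
    bias = sum(estimates) / len(estimates) - true_value
    mse = sum((e - true_value) ** 2 for e in estimates) / len(estimates)
    return bias, mse


def equal_weight_stratified_mean(sample):
    """Stratified mean when all strata have the same size (W_h = 1/L)."""
    return sum(sum(s) / len(s) for s in sample) / len(sample)


# Hypothetical study-variable values: three strata of five units each.
strata_y = [[1, 2, 3, 4, 5], [6, 7, 8, 9, 10], [11, 12, 13, 14, 15]]
n_h = (2, 3, 4)
samples = all_stratified_samples(strata_y, n_h)
assert len(samples) == comb(5, 2) * comb(5, 3) * comb(5, 4)  # 10 * 10 * 5 = 500
bias, mse = empirical_bias_mse(equal_weight_stratified_mean, strata_y, n_h, true_value=8.0)
```

Because every possible sample is visited exactly once, the empirical bias and MSE obtained this way are exact design-based quantities rather than Monte Carlo approximations.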

Conclusion
We have developed an improved estimator for the population mean in stratified random sampling. The MSE of the improved estimator has been derived and compared with those of some existing estimators. A numerical illustration has also been carried out using four real data sets. From Table 1, we can see that the proposed estimator is less biased and has the minimum MSE for all real data sets. We have also conducted a simulation study to examine the efficiency of the proposed estimator on different artificial data sets, calculating both theoretical and empirical bias and MSE values for all estimators. From Table 4, we can conclude that the proposed estimator is more efficient than the existing estimators. The numerical illustration thus shows that the new estimator is more efficient than the classical mean, ratio, exponential and regression estimators and the estimators of [8] and [3]. We suggest the use of the proposed estimator for more efficient estimation of the finite population mean in stratified random sampling.

Appendix
Partially differentiating (17) with respect to $w_1$ and $w_2$ and equating to zero, we have the following equations:
Solving these by the matrix inversion method, we get the optimum values $w_1^{opt}$ and $w_2^{opt}$ of $w_1$ and $w_2$:
Substituting $w_1^{opt}$ and $w_2^{opt}$ into $\mathrm{MSE}(y_{Nst})$, we get the minimum MSE of $y_{Nst}$: