Abstract
This article proposes new chain imputation methods that use two auxiliary variables under the missing completely at random (MCAR) mechanism. The proposed generalized classes of chain imputation methods are examined for optimality in terms of mean squared error (MSE). They can be regarded as efficient extensions of the work of Singh and Horn (Metrika 51:267–276, 2000), Singh and Deo (Stat Pap 44:555–579, 2003), Singh (Stat A J Theor Appl Stat 43(5):499–511, 2009), Kadilar and Cingi (Commun Stat Theory Methods 37:2226–2236, 2008) and Diana and Perri (Commun Stat Theory Methods 39:3245–3251, 2010). The performance of the proposed methods is investigated relative to conventional chain-type imputation methods. Theoretical results are derived and a comparative study is conducted; the results are encouraging and show an improvement over the methods discussed.
References
Bhushan, S., Pandey, A.P.: Optimality of the ratio type estimation methods in presence of missing data. Commun. Stat. Theory Methods 47(11), 2576–2589 (2018)
Bhushan, S., Pandey, A.P.: Optimal imputation of the missing data for estimation of population mean. J. Stat. Manag. Syst. 19(6), 755–769 (2016)
Diana, G., Perri, P.F.: Improved estimators of the population mean for missing data. Commun. Stat. Theory Methods 39, 3245–3251 (2010)
Gupta, S., Shabbir, J.: On improvement in estimating the population mean in simple random sampling. J. Appl. Stat. 35(5), 559–566 (2008)
Heitjan, D.F., Basu, S.: Distinguishing missing at random and missing completely at random. Am. Stat. 50, 207–213 (1996)
Kalton, G., Kasprzyk, D., Santos, R.: Issues of nonresponse and imputation in the survey of income and program participation. In: Krewski, D., Platek, R., Rao, J.N.K. (eds.) Current Topics in Survey Sampling, pp. 455–480. Academic Press, New York (1981)
Kalton, G., Kasprzyk, D.: Imputing for missing survey responses. In: Proceedings of the Section on Survey Research Methods, American Statistical Association, pp. 22–31 (1982)
Kadilar, C., Cingi, H.: Ratio estimators in stratified random sampling. Biom. J. 45(2), 218–225 (2003)
Kadilar, C., Cingi, H.: Estimators for the population mean in the case of missing data. Commun. Stat. Theory Methods 37, 2226–2236 (2008)
Koyuncu, N., Kadilar, C.: On improvement in estimating population mean in stratified random sampling. J. Appl. Stat. 37(6), 999–1013 (2010)
Lee, H., Rancourt, E., Sarndal, C.E.: Experiments with variance estimation from survey data with imputed values. J. Off. Stat. 10, 231–243 (1994)
Lee, H., Rancourt, E., Sarndal, C.E.: Variance estimation in the presence of imputed data for the generalized estimation system. In: Proceedings of the Section on Survey Research Methods, American Statistical Association (1995)
Mohamed, C.: Improved Imputation Methods in Survey Sampling. Texas A & M University-Kingsville, Texas (2015)
Mohamed, C., Sedory, S.A., Singh, S.: Improved mean methods of imputation. Stat. Optim. Inf. Comput. 6, 526–535 (2018)
Murthy, M.N.: Sampling: Theory and Methods. Statistical Publishing Society, Calcutta (1977)
Rubin, D.B.: Inference and missing data. Biometrika 63(3), 581–592 (1976)
Reddy, V.N.: A study on the use of prior knowledge on certain population parameters in estimation. Sankhya C 40, 29–37 (1978)
Rueda, M., Gonzalez, S.: Missing data and auxiliary information in surveys. Comput. Stat. 19, 551–567 (2004)
Rueda, M., Gonzalez, S., Arcos, A.: Indirect methods of imputation of missing data based on available units. Appl. Math. Comput. 164, 249–261 (2005)
Searls, D.T.: The utilization of a known coefficient of variation in the estimation procedure. J. Am. Stat. Assoc. 59, 1225–1226 (1964)
Sampath, S.: On optimal choice of unknowns in ratio type estimators. J. Indian Soc. Agric. Stat. 41, 166–172 (1989)
Singh, S., Horn, S.: Compromised imputation in survey sampling. Metrika 51, 267–276 (2000)
Singh, S., Deo, B.: Imputation by power transformation. Stat. Pap. 44, 555–579 (2003)
Singh, S.: A new method of imputation in survey sampling. Stat. A J. Theor. Appl. Stat. 43(5), 499–511 (2009)
Singh, S., Valdes, S.R.: Optimal method of imputation in survey sampling. Appl. Math. Sci. 3(35), 1727–1737 (2009)
Toutenburg, H., Srivastava, V.K.: Estimation of ratio of population means in survey sampling when some observations are missing. Metrika 48, 177–187 (1998)
Toutenburg, H., Srivastava, V.K.: Amputation versus imputation of missing values through ratio method in sample surveys. Stat. Pap. 49, 237–247 (2008)
Walsh, J.E.: Generalization of ratio estimator for population total. Sankhya A 32, 99–106 (1970)
Acknowledgements
The authors are deeply grateful to the learned reviewer and to the editor, Dr. Niansheng Tang, for their rigorous review and support, which significantly improved the revised manuscript.
Appendix
1.1 Notations and minMSE of the proposed imputation
Let \(\epsilon _{0}=\frac{{\bar{y}}_{r}}{{\bar{Y}}}-1\), \(\epsilon _{1}=\frac{{\bar{x}}_{r}}{{\bar{X}}}-1\), and \(\epsilon _{2}=\frac{{\bar{x}}_{n}}{{\bar{X}}}-1\).
Then \(E\left( \epsilon _{0}\right) =E\left( \epsilon _{1}\right) =E\left( \epsilon _{2}\right) =0\),
\(E\left( \epsilon _{0}^{2}\right) =f_r C_{y}^{2}\), \(E\left( \epsilon _{1}^{2}\right) =f_r C_{x}^{2}\), \(E\left( \epsilon _{2}^{2}\right) =f_n C_{x}^{2}\),
\(E\left( \epsilon _{0}\epsilon _{1}\right) =f_r \rho _{yx}C_{y}C_{x}\), \(E\left( \epsilon _{0}\epsilon _{2}\right) =f_n \rho _{yx}C_{y}C_{x}\), \(E\left( \epsilon _{1}\epsilon _{2}\right) =f_n C_{x}^{2}\).
Outline of Derivation of Theorem 3.1. The MSE of \(T_{j}\) (\(j=2,3\)) is given by \({\text {MSE}}(T_{j})={\overline{Y}}^{2} \left[ 1+\gamma _{j}^{2}A_{j} -2\gamma _{j}B_{j} \right] . \)
Minimizing this expression with respect to \(\gamma _{j}\) gives the optimum value \(\gamma _{jopt}=\dfrac{B_{j}}{A_{j}}\) \(\left( j=2,3\right) \). Substituting this optimum value in \({\text {MSE}}\left( T_{j}\right) \), we get the minimum MSE
\({\text {minMSE}}\left( T_{j}\right) ={\overline{Y}}^{2}\left( 1-\dfrac{B_{j}^{2}}{A_{j}} \right) , \)
where
where \(\theta _{1}=\theta _{1opt}=\rho _{yx}\dfrac{C_{y}}{C_{x}}, \theta _{2}=\theta _{2opt}=\rho _{yz}\dfrac{C_{y}}{C_{z}}, k_{1}=k_{1opt}=\left( 1-\rho _{yx}\dfrac{C_{y}}{C_{x}}\right) \) and \(k_{2}=k_{2opt}=\rho _{yz}\dfrac{ C_{y}}{C_{z}}\) are the values of the constants used in this study; these are the optimum values when \(\gamma =1\).
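The minimization step in the outline of Theorem 3.1 can be checked numerically. The sketch below is only an illustration: it treats \(A_{j}\) and \(B_{j}\) as given numbers (the values used are hypothetical, not taken from the article), works with the relative MSE \({\text {MSE}}(T_{j})/{\overline{Y}}^{2}\), and assumes \(A_{j}>0\) so that the quadratic in \(\gamma _{j}\) has a minimum. The function names are introduced here for illustration.

```python
def mse_ratio(gamma, A, B):
    """Relative MSE: MSE(T_j) / Ybar^2 = 1 + gamma^2 * A - 2 * gamma * B."""
    return 1.0 + gamma**2 * A - 2.0 * gamma * B


def optimum(A, B):
    """Return (gamma_opt, minimum relative MSE) for the quadratic above.

    Setting the derivative 2*gamma*A - 2*B to zero gives gamma_opt = B / A,
    and substituting back gives 1 - B^2 / A. Requires A > 0.
    """
    if A <= 0:
        raise ValueError("A must be positive for a minimum to exist")
    gamma_opt = B / A
    return gamma_opt, 1.0 - B**2 / A


# Hypothetical values of A_j and B_j, chosen only for illustration.
A, B = 1.08, 1.02
gamma_opt, min_rel_mse = optimum(A, B)

# The closed form agrees with direct evaluation at gamma_opt ...
assert abs(mse_ratio(gamma_opt, A, B) - min_rel_mse) < 1e-12
# ... and nearby values of gamma do no better.
assert all(
    mse_ratio(gamma_opt + d, A, B) >= min_rel_mse
    for d in (-0.1, -0.01, 0.01, 0.1)
)
```

Multiplying the second returned value by \({\overline{Y}}^{2}\) recovers the minimum MSE in the form stated above.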
Outline of Derivation of Theorem 4.1. The minimum MSE of \(T_{1}\) is given by
\({\text {MSE}}\left( T_{1}\right) =\dfrac{{\bar{Y}}^{2}\left[ S_{y}^{2}\left\{ f_r -f_{rn} \rho _{yx}^{2}-f_n \rho _{yz}^{2}\right\} \right] }{\left[ {\bar{Y}}^{2}+S_{y}^{2}\left\{ f_r -f_{rn} \rho _{yx}^{2}-f_n \rho _{yz}^{2}\right\} \right] }=\dfrac{{\bar{Y}}^{2} \left[ {\text {MSE}}\left( t_{1}\right) \right] }{\left[ {\bar{Y}}^{2}+{\text {MSE}}\left( t_{1}\right) \right] }.\)
The optimum values of scalars involved are given below:
\(\gamma _{1}=\dfrac{1}{\left[ 1+C_{y}^{2}\left\{ f_r -f_{rn} \rho _{yx}^{2}-f_n \rho _{yz}^{2}\right\} \right] }\), \(\delta _{1}=-\gamma _{1} \beta _{1}\) and \(\delta _{2}=-\gamma _{1} \beta _{2}\),
where \(\beta _{1}=\rho _{yx}\dfrac{S_{y}}{S_{x}}\) and \(\beta _{2} =\rho _{yz}\dfrac{ S_{y}}{S_{z}}\).
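As a rough numerical companion to the outline of Theorem 4.1, the sketch below evaluates \({\text {MSE}}(t_{1})\), \({\text {MSE}}(T_{1})\) and the optimum scalars from the expressions above. It makes two assumptions that should be checked against the main text: the coefficient of variation is taken as \(C_{y}=S_{y}/{\bar{Y}}\), and the factors \(f_r\), \(f_n\) and \(f_{rn}\) are passed in as numbers rather than computed from sample sizes. The function name and all input values are hypothetical.

```python
def t1_optimum(Ybar, Sy, Sx, Sz, rho_yx, rho_yz, f_r, f_n, f_rn):
    """Evaluate the Theorem 4.1 expressions for given population parameters.

    Assumes C_y = Sy / Ybar; f_r, f_n and f_rn are supplied by the caller.
    """
    # Common bracket {f_r - f_rn * rho_yx^2 - f_n * rho_yz^2}.
    bracket = f_r - f_rn * rho_yx**2 - f_n * rho_yz**2

    mse_t1 = Sy**2 * bracket                           # MSE(t_1)
    mse_T1 = Ybar**2 * mse_t1 / (Ybar**2 + mse_t1)     # MSE(T_1)

    Cy2 = (Sy / Ybar) ** 2                             # C_y^2 (assumed definition)
    gamma1 = 1.0 / (1.0 + Cy2 * bracket)

    beta1 = rho_yx * Sy / Sx
    beta2 = rho_yz * Sy / Sz

    return {
        "mse_t1": mse_t1,
        "mse_T1": mse_T1,
        "gamma1": gamma1,
        "delta1": -gamma1 * beta1,
        "delta2": -gamma1 * beta2,
    }


# Hypothetical inputs, for illustration only.
res = t1_optimum(Ybar=50.0, Sy=10.0, Sx=4.0, Sz=5.0,
                 rho_yx=0.8, rho_yz=0.6,
                 f_r=0.02, f_n=0.008, f_rn=0.012)

# With C_y = Sy / Ybar, MSE(T_1) equals gamma_1 * MSE(t_1).
assert abs(res["mse_T1"] - res["gamma1"] * res["mse_t1"]) < 1e-12
# When MSE(t_1) is non-negative, T_1 is never worse than t_1.
assert res["mse_T1"] <= res["mse_t1"]
```

Under the assumed definition of \(C_{y}\), the first check reflects that \(\gamma _{1}={\bar{Y}}^{2}/\left[ {\bar{Y}}^{2}+{\text {MSE}}\left( t_{1}\right) \right] \), so the gain of \(T_{1}\) over \(t_{1}\) comes from this shrinkage factor.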
Cite this article
Bhushan, S., Pandey, A.P. New Chain Imputation Methods for Estimating Population Mean in the Presence of Missing Data Using Two Auxiliary Variables. Commun. Math. Stat. 11, 325–340 (2023). https://doi.org/10.1007/s40304-021-00251-w
DOI: https://doi.org/10.1007/s40304-021-00251-w