New Chain Imputation Methods for Estimating Population Mean in the Presence of Missing Data Using Two Auxiliary Variables

Communications in Mathematics and Statistics

Abstract

This article proposes new chain imputation methods that use two auxiliary variables under the missing completely at random (MCAR) assumption. The proposed generalized classes of chain imputation methods are examined for optimality in terms of mean squared error (MSE). The proposed imputation methods can be considered an efficient extension of the work of Singh and Horn (Metrika 51:267–276, 2000), Singh and Deo (Stat Pap 44:555–579, 2003), Singh (Stat A J Theor Appl Stat 43(5):499–511, 2009), Kadilar and Cingi (Commun Stat Theory Methods 37:2226–2236, 2008) and Diana and Perri (Commun Stat Theory Methods 39:3245–3251, 2010). The performance of the proposed chain imputation methods is investigated relative to the conventional chain-type imputation methods. Theoretical results are derived and a comparative study is conducted; the results are encouraging and show an improvement over the works discussed.

References

  1. Bhushan, S., Pandey, A.P.: Optimality of the ratio type estimation methods in presence of missing data. Commun. Stat. Theory Methods 47(11), 2576–2589 (2018)

  2. Bhushan, S., Pandey, A.P.: Optimal imputation of the missing data for estimation of population mean. J. Stat. Manag. Syst. 19(6), 755–769 (2016)

  3. Diana, G., Perri, P.F.: Improved estimators of the population mean for missing data. Commun. Stat. Theory Methods 39, 3245–3251 (2010)

  4. Gupta, S., Shabbir, J.: On improvement in estimating the population mean in simple random sampling. J. Appl. Stat. 35(5), 559–566 (2008)

  5. Heitjan, D.F., Basu, S.: Distinguishing missing at random and missing completely at random. Am. Stat. 50, 207–213 (1996)

  6. Kalton, G., Kasprzyk, D., Santos, R.: Issues of nonresponse and imputation in the survey of income and program participation. In: Krewski, D., Platek, R., Rao, J.N.K. (eds.) Current Topics in Survey Sampling, pp. 455–480. Academic Press, New York (1981)

  7. Kalton, G., Kasprzyk, D.: Imputing for missing survey responses. In: Proceedings of the Section on Survey Research Methods, American Statistical Association, pp. 22–31 (1982)

  8. Kadilar, C., Cingi, H.: Ratio estimators in stratified random sampling. Biom. J. 45(2), 218–225 (2003)

  9. Kadilar, C., Cingi, H.: Estimators for the population mean in the case of missing data. Commun. Stat. Theory Methods 37, 2226–2236 (2008)

  10. Koyuncu, N., Kadilar, C.: On improvement in estimating population mean in stratified random sampling. J. Appl. Stat. 37(6), 999–1013 (2010)

  11. Lee, H., Rancourt, E., Sarndal, C.E.: Experiments with variance estimation from survey data with imputed values. J. Off. Stat. 10, 231–243 (1994)

  12. Lee, H., Rancourt, E., Sarndal, C.E.: Variance estimation in the presence of imputed data for the generalized estimation system. In: Proceedings of the Section on Survey Research Methods, American Statistical Association (1995)

  13. Mohamed, C.: Improved Imputation Methods in Survey Sampling. Texas A&M University-Kingsville, Texas (2015)

  14. Mohamed, C., Sedory, S.A., Singh, S.: Improved mean methods of imputation. Stat. Optim. Inf. Comput. 6, 526–535 (2018)

  15. Murthy, M.N.: Sampling: Theory and Methods. Statistical Publishing Society, Calcutta (1977)

  16. Rubin, D.B.: Inference and missing data. Biometrika 63(3), 581–592 (1976)

  17. Reddy, V.N.: A study on the use of prior knowledge on certain population parameters in estimation. Sankhya C 40, 29–37 (1978)

  18. Rueda, M., Gonzalez, S.: Missing data and auxiliary information in surveys. Comput. Stat. 19, 551–567 (2004)

  19. Rueda, M., Gonzalez, S., Arcos, A.: Indirect methods of imputation of missing data based on available units. Appl. Math. Comput. 164, 249–261 (2005)

  20. Searls, D.T.: The utilization of a known coefficient of variation in the estimation procedure. J. Am. Stat. Assoc. 59, 1225–1226 (1964)

  21. Sampath, S.: On optimal choice of unknowns in ratio type estimators. J. Indian Soc. Agric. Stat. 41, 166–172 (1989)

  22. Singh, S., Horn, S.: Compromised imputation in survey sampling. Metrika 51, 267–276 (2000)

  23. Singh, S., Deo, B.: Imputation by power transformation. Stat. Pap. 44, 555–579 (2003)

  24. Singh, S.: A new method of imputation in survey sampling. Stat. A J. Theor. Appl. Stat. 43(5), 499–511 (2009)

  25. Singh, S., Valdes, S.R.: Optimal method of imputation in survey sampling. Appl. Math. Sci. 3(35), 1727–1737 (2009)

  26. Toutenburg, H., Srivastava, V.K.: Estimation of ratio of population means in survey sampling when some observations are missing. Metrika 48, 177–187 (1998)

  27. Toutenburg, H., Srivastava, V.K.: Amputation versus imputation of missing values through ratio method in sample surveys. Stat. Pap. 49, 237–247 (2008)

  28. Walsh, J.E.: Generalization of ratio estimator for population total. Sankhya A 32, 99–106 (1970)

Acknowledgements

The authors are deeply grateful to the learned reviewer and to the editor, Dr. Niansheng Tang, for their rigorous review and support, which significantly improved the revised manuscript.

Appendix

1.1 Notations and minMSE of the proposed imputation

Let \(\epsilon _{0}=\frac{{\bar{y}}_{r}}{{\bar{Y}}}-1\), \(\epsilon _{1}=\frac{{\bar{x}}_{r}}{{\bar{X}}}-1\), and \(\epsilon _{2}=\frac{{\bar{x}}_{n}}{{\bar{X}}}-1\).

Then \(E\left( \epsilon _{0}\right) =E\left( \epsilon _{1}\right) =E\left( \epsilon _{2}\right) =0\),

\(E\left( \epsilon _{0}^{2}\right) =f_r C_{y}^{2}\), \(E\left( \epsilon _{1}^{2}\right) =f_r C_{x}^{2}\), \(E\left( \epsilon _{2}^{2}\right) =f_n C_{x}^{2}\),

\(E\left( \epsilon _{0}\epsilon _{1}\right) =f_r \rho _{yx}C_{y}C_{x}\), \(E\left( \epsilon _{0}\epsilon _{2}\right) =f_n \rho _{yx}C_{y}C_{x}\), and \(E\left( \epsilon _{1}\epsilon _{2}\right) =f_n C_{x}^{2}\).

Outline of Derivation of Theorem 3.1. The MSE of \(T_{j}\) (\(j=2,3\)) is given by \({\text {MSE}}(T_{j})={\overline{Y}}^{2} \left[ 1+\gamma _{j}^{2}A_{j} -2\gamma _{j}B_{j} \right] . \)

Setting the derivative of \({\text {MSE}}(T_{j})\) with respect to \(\gamma _{j}\), namely \(2{\overline{Y}}^{2}\left( \gamma _{j}A_{j}-B_{j}\right) \), equal to zero gives the optimum value of the scalar, \(\gamma _{jopt}=\dfrac{B_{j}}{A_{j}}\ ;\left( j=2,3\right) \). Substituting \(\gamma _{jopt}\) in \({\text {MSE}}\left( T_{j}\right) \), we get the minimum MSE

\({\text {minMSE}}\left( T_{j}\right) ={\overline{Y}}^{2}\left( 1-\dfrac{B_{j}^{2}}{A_{j}} \right) , \)

where

$$\begin{aligned} A_{2}&=1+f_r C_{y}^{2}+2\theta _{1}^{2}f_{rn} C_{x}^{2}+2\theta _{2}^{2}f_n C_{z}^{2}+\theta _{1}f_{rn} \left( C_{x}^{2}-4\rho _{yx}C_{y}C_{x}\right) \\ &\quad +\theta _{2}f_n \left( C_{z}^{2}-4\rho _{yz}C_{y}C_{z}\right) ,\\ B_{2}&=1+\frac{\theta _{1}^{2}}{2}f_{rn} C_{x}^{2}+\frac{\theta _{2}^{2}}{2}f_n C_{z}^{2}+\frac{\theta _{1}}{2}f_{rn} \left( C_{x}^{2}-2\rho _{yx}C_{y}C_{x}\right) \\ &\quad +\frac{\theta _{2}}{2}f_n \left( C_{z}^{2}-2\rho _{yz}C_{y}C_{z}\right) ,\\ A_{3}&=1+f_r C_{y}^{2}+3k_{1}^{2}f_{rn} C_{x}^{2}+3k_{2}^{2}f_n C_{z}^{2}-2k_{1}f_{rn} \left( 3C_{x}^{2}-2\rho _{yx}C_{y}C_{x}\right) \\ &\quad +f_{rn} \left( 3C_{x}^{2}-4\rho _{yx}C_{y}C_{x}\right) -4k_{2}f_n \rho _{yz}C_{y}C_{z},\\ B_{3}&=1+k_{1}^{2}f_{rn} C_{x}^{2}+k_{2}^{2}f_n C_{z}^{2}-k_{1}f_{rn} \left( 2C_{x}^{2}-\rho _{yx}C_{y}C_{x}\right) \\ &\quad +f_{rn} \left( C_{x}^{2}-\rho _{yx}C_{y}C_{x}\right) -k_{2}f_n \rho _{yz}C_{y}C_{z}, \end{aligned}$$

where \(\theta _{1}=\theta _{1opt}=\rho _{yx}\dfrac{C_{y}}{C_{x}}\), \(\theta _{2}=\theta _{2opt}=\rho _{yz}\dfrac{C_{y}}{C_{z}}\), \(k_{1}=k_{1opt}=\left( 1-\rho _{yx}\dfrac{C_{y}}{C_{x}}\right) \) and \(k_{2}=k_{2opt}=\rho _{yz}\dfrac{ C_{y}}{C_{z}}\) are used as the values of the constants in this study; they are the optimum values when \(\gamma =1\).
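As a rough numerical illustration of the expressions above for \(j=2\), the following Python sketch evaluates \(A_{2}\), \(B_{2}\), \(\gamma _{2opt}\) and the resulting minimum MSE. Every numeric input is hypothetical and chosen only for illustration. The factors \(f_r\), \(f_n\) and \(f_{rn}\) are passed in as plain numbers because this appendix does not define them. The function names are likewise assumptions of this sketch, not part of the paper.

```python
# Sketch: evaluate A_2, B_2, the optimum gamma_2 and the minimum MSE of T_2.
# All numeric inputs are hypothetical and used only for illustration.

def a2_b2(theta1, theta2, f_r, f_n, f_rn, C_y, C_x, C_z, rho_yx, rho_yz):
    """Return (A_2, B_2) as written in the appendix."""
    A2 = (1 + f_r * C_y**2
          + 2 * theta1**2 * f_rn * C_x**2
          + 2 * theta2**2 * f_n * C_z**2
          + theta1 * f_rn * (C_x**2 - 4 * rho_yx * C_y * C_x)
          + theta2 * f_n * (C_z**2 - 4 * rho_yz * C_y * C_z))
    B2 = (1 + 0.5 * theta1**2 * f_rn * C_x**2
          + 0.5 * theta2**2 * f_n * C_z**2
          + 0.5 * theta1 * f_rn * (C_x**2 - 2 * rho_yx * C_y * C_x)
          + 0.5 * theta2 * f_n * (C_z**2 - 2 * rho_yz * C_y * C_z))
    return A2, B2

def mse(gamma, A, B, Y_bar):
    """MSE(T_j) = Ybar^2 * [1 + gamma^2 * A_j - 2 * gamma * B_j]."""
    return Y_bar**2 * (1 + gamma**2 * A - 2 * gamma * B)

def min_mse(A, B, Y_bar):
    """Minimum MSE, Ybar^2 * (1 - B_j^2 / A_j), attained at gamma = B_j / A_j."""
    return Y_bar**2 * (1 - B**2 / A)

# Hypothetical population summaries (assumptions, not data from the paper).
Y_bar, C_y, C_x, C_z = 50.0, 0.4, 0.5, 0.6
rho_yx, rho_yz = 0.8, 0.7
f_r, f_n, f_rn = 0.02, 0.01, 0.01

# Values of theta_1 and theta_2 as stated in the appendix.
theta1 = rho_yx * C_y / C_x
theta2 = rho_yz * C_y / C_z

A2, B2 = a2_b2(theta1, theta2, f_r, f_n, f_rn, C_y, C_x, C_z, rho_yx, rho_yz)
gamma_opt = B2 / A2

print("gamma_2opt          :", gamma_opt)
print("MSE at gamma = 1    :", mse(1.0, A2, B2, Y_bar))
print("MSE at gamma_2opt   :", mse(gamma_opt, A2, B2, Y_bar))
print("closed-form minimum :", min_mse(A2, B2, Y_bar))
```

Because the MSE is a quadratic in \(\gamma _{j}\) with positive leading coefficient \(A_{j}\), the value at \(\gamma _{jopt}\) can never exceed the value at \(\gamma _{j}=1\); the sketch simply makes that gap visible for one set of made-up inputs.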

Outline of Derivation of Theorem 4.1. The MSE of \(T_{1}\) is given by

\({\text {MSE}}\left( T_{1}\right) =\dfrac{{\bar{Y}}^{2}\left[ S_{y}^{2}\left\{ f_r -f_{rn} \rho _{yx}^{2}-f_n \rho _{yz}^{2}\right\} \right] }{\left[ {\bar{Y}}^{2}+S_{y}^{2}\left\{ f_r -f_{rn} \rho _{yx}^{2}-f_n \rho _{yz}^{2}\right\} \right] }=\dfrac{{\bar{Y}}^{2} \left[ {\text {MSE}}\left( t_{1}\right) \right] }{\left[ {\bar{Y}}^{2}+{\text {MSE}}\left( t_{1}\right) \right] }.\)

The optimum values of scalars involved are given below:

\(\gamma _{1}=\dfrac{1}{\left[ 1+C_{y}^{2}\left\{ f_r -f_{rn} \rho _{yx}^{2}-f_n \rho _{yz}^{2}\right\} \right] }\), \(\delta _{1}=-\gamma _{1} \beta _{1}\) and \(\delta _{2}=-\gamma _{1} \beta _{2}\),

where \(\beta _{1}=\rho _{yx}\dfrac{S_{y}}{S_{x}}\) and \(\beta _{2} =\rho _{yz}\dfrac{ S_{y}}{S_{z}}\).
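The relation between \({\text {MSE}}\left( T_{1}\right) \), \({\text {MSE}}\left( t_{1}\right) \) and \(\gamma _{1}\) can be checked numerically. The Python sketch below is only an illustration under stated assumptions: the inputs are hypothetical, \(C_{y}\) is taken to be the usual coefficient of variation \(S_{y}/{\bar{Y}}\), \(f_r\), \(f_n\) and \(f_{rn}\) are supplied as plain numbers, and the function names are invented for this example.

```python
# Sketch: check MSE(T_1) = Ybar^2 * MSE(t_1) / (Ybar^2 + MSE(t_1)) = gamma_1 * MSE(t_1).
# All numeric inputs are hypothetical and used only for illustration.

def bracket(f_r, f_n, f_rn, rho_yx, rho_yz):
    """Common factor { f_r - f_rn * rho_yx^2 - f_n * rho_yz^2 }."""
    return f_r - f_rn * rho_yx**2 - f_n * rho_yz**2

def mse_t1(S_y, f_r, f_n, f_rn, rho_yx, rho_yz):
    """MSE(t_1) = S_y^2 * { f_r - f_rn * rho_yx^2 - f_n * rho_yz^2 }."""
    return S_y**2 * bracket(f_r, f_n, f_rn, rho_yx, rho_yz)

def mse_T1(Y_bar, S_y, f_r, f_n, f_rn, rho_yx, rho_yz):
    """MSE(T_1) as the ratio given in the appendix."""
    m = mse_t1(S_y, f_r, f_n, f_rn, rho_yx, rho_yz)
    return Y_bar**2 * m / (Y_bar**2 + m)

def optimum_scalars(Y_bar, S_y, S_x, S_z, f_r, f_n, f_rn, rho_yx, rho_yz):
    """Return (gamma_1, delta_1, delta_2) using the stated optimum values."""
    C_y = S_y / Y_bar  # assumption: C_y is the coefficient of variation of y
    gamma1 = 1.0 / (1.0 + C_y**2 * bracket(f_r, f_n, f_rn, rho_yx, rho_yz))
    beta1 = rho_yx * S_y / S_x
    beta2 = rho_yz * S_y / S_z
    return gamma1, -gamma1 * beta1, -gamma1 * beta2

# Hypothetical population summaries (assumptions, not data from the paper).
Y_bar, S_y, S_x, S_z = 50.0, 20.0, 15.0, 12.0
rho_yx, rho_yz = 0.8, 0.7
f_r, f_n, f_rn = 0.02, 0.01, 0.01

m_t1 = mse_t1(S_y, f_r, f_n, f_rn, rho_yx, rho_yz)
m_T1 = mse_T1(Y_bar, S_y, f_r, f_n, f_rn, rho_yx, rho_yz)
gamma1, delta1, delta2 = optimum_scalars(
    Y_bar, S_y, S_x, S_z, f_r, f_n, f_rn, rho_yx, rho_yz)

print("MSE(t_1)           :", m_t1)
print("MSE(T_1)           :", m_T1)
print("gamma_1 * MSE(t_1) :", gamma1 * m_t1)
print("delta_1, delta_2   :", delta1, delta2)
```

Since \(0<\gamma _{1}<1\) whenever \({\text {MSE}}\left( t_{1}\right) >0\), the sketch also shows \({\text {MSE}}\left( T_{1}\right) \) falling below \({\text {MSE}}\left( t_{1}\right) \) for these made-up inputs.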

Cite this article

Bhushan, S., Pandey, A.P. New Chain Imputation Methods for Estimating Population Mean in the Presence of Missing Data Using Two Auxiliary Variables. Commun. Math. Stat. 11, 325–340 (2023). https://doi.org/10.1007/s40304-021-00251-w

  • DOI: https://doi.org/10.1007/s40304-021-00251-w
