New Chain Imputation Methods for Estimating Population Mean in the Presence of Missing Data Using Two Auxiliary Variables

Bhushan, Shashi; Pandey, Abhay Pratap

doi:10.1007/s40304-021-00251-w

Published: 21 January 2022

Volume 11, pages 325–340 (2023)

Communications in Mathematics and Statistics

Abstract

This article proposes new chain imputation methods that use two auxiliary variables under the missing completely at random (MCAR) mechanism. The proposed generalized classes of chain imputation methods are examined for optimality in terms of mean squared error (MSE). They can be regarded as efficient extensions of the work of Singh and Horn (Metrika 51:267–276, 2000), Singh and Deo (Stat Pap 44:555–579, 2003), Singh (Stat A J Theor Appl Stat 43(5):499–511, 2009), Kadilar and Cingi (Commun Stat Theory Methods 37:2226–2236, 2008) and Diana and Perri (Commun Stat Theory Methods 39:3245–3251, 2010). The performance of the proposed methods is investigated relative to conventional chain-type imputation methods. Theoretical results are derived and a comparative study is conducted; the results are encouraging and show an improvement over the work discussed.

References

Bhushan, S., Pandey, A.P.: Optimality of the ratio type estimation methods in presence of missing data. Commun. Stat. Theory Methods 47(11), 2576–2589 (2018)
Bhushan, S., Pandey, A.P.: Optimal imputation of the missing data for estimation of population mean. J. Stat. Manag. Syst. 19(6), 755–769 (2016)
Diana, G., Perri, P.F.: Improved estimators of the population mean for missing data. Commun. Stat. Theory Methods 39, 3245–3251 (2010)
Gupta, S., Shabbir, J.: On improvement in estimating the population mean in simple random sampling. J. Appl. Stat. 35(5), 559–566 (2008)
Heitjan, D.F., Basu, S.: Distinguishing missing at random and missing completely at random. Am. Stat. 50, 207–213 (1996)
Kalton, G., Kasprzyk, D., Santos, R.: Issues of nonresponse and imputation in the survey of income and program participation. In: Krewski, D., Platek, R., Rao, J.N.K. (eds.) Current Topics in Survey Sampling, pp. 455–480. Academic Press, New York (1981)
Kalton, G., Kasprzyk, D.: Imputing for missing survey responses. In: Proceedings of the Section on Survey Research Methods, American Statistical Association, pp. 22–31 (1982)
Kadilar, C., Cingi, H.: Ratio estimators in stratified random sampling. Biom. J. 45(2), 218–225 (2003)
Kadilar, C., Cingi, H.: Estimators for the population mean in the case of missing data. Commun. Stat. Theory Methods 37, 2226–2236 (2008)
Koyuncu, N., Kadilar, C.: On improvement in estimating population mean in stratified random sampling. J. Appl. Stat. 37(6), 999–1013 (2010)
Lee, H., Rancourt, E., Sarndal, C.E.: Experiments with variance estimation from survey data with imputed values. J. Off. Stat. 10, 231–243 (1994)
Lee, H., Rancourt, E., Sarndal, C.E.: Variance estimation in the presence of imputed data for the generalized estimation system. In: Proceedings of the Section on Survey Research Methods, American Statistical Association (1995)
Mohamed, C.: Improved Imputation Methods in Survey Sampling. Texas A & M University-Kingsville, Texas (2015)
Mohamed, C., Sedory, S.A., Singh, S.: Improved mean methods of imputation. Stat. Optim. Inf. Comput. 6, 526–535 (2018)
Murthy, M.N.: Sampling: Theory and Methods. Statistical Publishing Society, Calcutta (1977)
Rubin, D.B.: Inference and missing data. Biometrika 63(3), 581–592 (1976)
Reddy, V.N.: A study on the use of prior knowledge on certain population parameters in estimation. Sankhya C 40, 29–37 (1978)
Rueda, M., Gonzalez, S.: Missing data and auxiliary information in surveys. Comput. Stat. 19, 551–567 (2004)
Rueda, M., Gonzalez, S., Arcos, A.: Indirect methods of imputation of missing data based on available units. Appl. Math. Comput. 164, 249–261 (2005)
Searls, D.T.: The utilization of a known coefficient of variation in the estimation procedure. J. Am. Stat. Assoc. 59, 1225–1226 (1964)
Sampath, S.: On optimal choice of unknowns in ratio type estimators. J. Indian Soc. Agric. Stat. 41, 166–172 (1989)
Singh, S., Horn, S.: Compromised imputation in survey sampling. Metrika 51, 267–276 (2000)
Singh, S., Deo, B.: Imputation by power transformation. Stat. Pap. 44, 555–579 (2003)
Singh, S.: A new method of imputation in survey sampling. Stat. A J. Theor. Appl. Stat. 43(5), 499–511 (2009)
Singh, S., Valdes, S.R.: Optimal method of imputation in survey sampling. Appl. Math. Sci. 3(35), 1727–1737 (2009)
Toutenburg, H., Srivastava, V.K.: Estimation of ratio of population means in survey sampling when some observations are missing. Metrika 48, 177–187 (1998)
Toutenburg, H., Srivastava, V.K.: Amputation versus imputation of missing values through ratio method in sample surveys. Stat. Pap. 49, 237–247 (2008)
Walsh, J.E.: Generalization of ratio estimator for population total. Sankhya A 32, 99–106 (1970)

Acknowledgements

The authors are deeply grateful to the learned reviewer and to the editor, Dr. Niansheng Tang, for the rigorous review and for their support, which significantly improved the revised manuscript.

Author information

Authors and Affiliations

Department of Mathematics and Statistics, Dr. Shakuntala Misra National Rehabilitation University, Lucknow, India
Shashi Bhushan
Department of Statistics, Ramanujan College, University of Delhi, New Delhi, India
Abhay Pratap Pandey

Appendix

1.1 Notations and minMSE of the proposed imputation

Let $\epsilon _{0}=\frac{{\bar{y}}_{r}}{{\bar{Y}}}-1$, $\epsilon _{1}=\frac{{\bar{x}}_{r}}{{\bar{X}}}-1$, and $\epsilon _{2}=\frac{{\bar{x}}_{n}}{{\bar{X}}}-1$.

$E\left( \epsilon _{0}\right) =E\left( \epsilon _{1}\right) =E\left( \epsilon _{2}\right) =0$

$E\left( \epsilon _{0}^{2}\right) =f_r C_{y}^{2}$, $E\left( \epsilon _{1}^{2}\right) =f_r C_{x}^{2}$, $E\left( \epsilon _{2}^{2}\right) =f_n C_{x}^{2}$,

$E\left( \epsilon _{0}\epsilon _{1}\right) =f_r \rho _{yx}C_{y}C_{x}$, $E\left( \epsilon _{0}\epsilon _{2}\right) =f_n \rho _{yx}C_{y}C_{x}$, $E\left( \epsilon _{1}\epsilon _{2}\right) =f_n C_{x}^{2}$.
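The moment identities above can be checked exactly on a toy population. The sketch below is illustrative only and rests on assumptions that this appendix does not state: a first-phase simple random sample of size $n$ drawn without replacement from $N$ units, responders forming a simple random subsample of size $r$ (one way to model MCAR), the conventional definitions $f_r = 1/r - 1/N$ and $f_n = 1/n - 1/N$, and variances with divisor $N-1$. The function names and the data are hypothetical.

```python
from itertools import combinations


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def enumerated_moments(y, x, n, r):
    """Exact second moments of (e0, e1, e2), obtained by enumerating every
    first-phase sample of size n and every responding subset of size r."""
    N = len(y)
    Y_bar, X_bar = mean(y), mean(x)
    totals = {"e0e0": 0.0, "e1e1": 0.0, "e2e2": 0.0,
              "e0e1": 0.0, "e0e2": 0.0, "e1e2": 0.0}
    count = 0
    for s_n in combinations(range(N), n):
        e2 = mean(x[i] for i in s_n) / X_bar - 1      # uses x-bar_n
        for s_r in combinations(s_n, r):
            e0 = mean(y[i] for i in s_r) / Y_bar - 1  # uses y-bar_r
            e1 = mean(x[i] for i in s_r) / X_bar - 1  # uses x-bar_r
            totals["e0e0"] += e0 * e0
            totals["e1e1"] += e1 * e1
            totals["e2e2"] += e2 * e2
            totals["e0e1"] += e0 * e1
            totals["e0e2"] += e0 * e2
            totals["e1e2"] += e1 * e2
            count += 1
    return {key: total / count for key, total in totals.items()}


def theoretical_moments(y, x, n, r):
    """Right-hand sides of the appendix identities, with the assumed
    definitions f_r = 1/r - 1/N and f_n = 1/n - 1/N."""
    N = len(y)
    Y_bar, X_bar = mean(y), mean(x)
    S_yy = sum((a - Y_bar) ** 2 for a in y) / (N - 1)
    S_xx = sum((b - X_bar) ** 2 for b in x) / (N - 1)
    S_yx = sum((a - Y_bar) * (b - X_bar) for a, b in zip(y, x)) / (N - 1)
    f_r = 1 / r - 1 / N
    f_n = 1 / n - 1 / N
    Cy2 = S_yy / Y_bar ** 2
    Cx2 = S_xx / X_bar ** 2
    Cyx = S_yx / (Y_bar * X_bar)  # equals rho_yx * C_y * C_x
    return {"e0e0": f_r * Cy2, "e1e1": f_r * Cx2, "e2e2": f_n * Cx2,
            "e0e1": f_r * Cyx, "e0e2": f_n * Cyx, "e1e2": f_n * Cx2}


# Hypothetical toy population of N = 6 units.
y = [3, 5, 8, 10, 12, 16]
x = [1, 2, 4, 5, 7, 8]
exact = enumerated_moments(y, x, n=4, r=2)
theory = theoretical_moments(y, x, n=4, r=2)
for key in exact:
    print(key, exact[key], theory[key])
```

Enumeration is used instead of simulation so that the comparison is exact up to floating-point rounding; it is only feasible for very small $N$.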

Outline of Derivation of Theorem 3.1. The MSE of $T_{j}$ ($j=2,3$) is given by ${\text {MSE}}(T_{j})={\overline{Y}}^{2} \left[ 1+\gamma _{j}^{2}A_{j} -2\gamma _{j}B_{j} \right] . $

Setting the derivative of ${\text {MSE}}\left( T_{j}\right) $ with respect to $\gamma _{j}$ equal to zero gives the optimum value of the scalar, $\gamma _{jopt}=\dfrac{B_{j}}{A_{j}}$ $\left( j=2,3\right) $. Substituting this optimum value of $\gamma _{j}$ into ${\text {MSE}}\left( T_{j}\right) $, we get the minimum MSE

$\min {\text {MSE}}\left( T_{j}\right) ={\overline{Y}}^{2}\left( 1-\dfrac{B_{j}^{2}}{A_{j}} \right) , $

where

$$\begin{aligned}
A_{2} &= 1+f_r C_{y}^{2}+2\theta_{1}^{2}f_{rn}C_{x}^{2}+2\theta_{2}^{2}f_n C_{z}^{2}+\theta_{1}f_{rn}\left(C_{x}^{2}-4\rho_{yx}C_{y}C_{x}\right) \\
&\quad +\theta_{2}f_n\left(C_{z}^{2}-4\rho_{yz}C_{y}C_{z}\right),\\
B_{2} &= 1+\frac{\theta_{1}^{2}}{2}f_{rn}C_{x}^{2}+\frac{\theta_{2}^{2}}{2}f_n C_{z}^{2}+\frac{\theta_{1}}{2}f_{rn}\left(C_{x}^{2}-2\rho_{yx}C_{y}C_{x}\right) \\
&\quad +\frac{\theta_{2}}{2}f_n\left(C_{z}^{2}-2\rho_{yz}C_{y}C_{z}\right),\\
A_{3} &= 1+f_r C_{y}^{2}+3k_{1}^{2}f_{rn}C_{x}^{2}+3k_{2}^{2}f_n C_{z}^{2}-2k_{1}f_{rn}\left(3C_{x}^{2}-2\rho_{yx}C_{y}C_{x}\right) \\
&\quad +f_{rn}\left(3C_{x}^{2}-4\rho_{yx}C_{y}C_{x}\right)-4k_{2}f_n\rho_{yz}C_{y}C_{z},\\
B_{3} &= 1+k_{1}^{2}f_{rn}C_{x}^{2}+k_{2}^{2}f_n C_{z}^{2}-k_{1}f_{rn}\left(2C_{x}^{2}-\rho_{yx}C_{y}C_{x}\right) \\
&\quad +f_{rn}\left(C_{x}^{2}-\rho_{yx}C_{y}C_{x}\right)-k_{2}f_n\rho_{yz}C_{y}C_{z},
\end{aligned}$$

where $\theta _{1}=\theta _{1opt}=\rho _{yx}\dfrac{C_{y}}{C_{x}}$, $\theta _{2}=\theta _{2opt}=\rho _{yz}\dfrac{C_{y}}{C_{z}}$, $k_{1}=k_{1opt}=\left( 1-\rho _{yx}\dfrac{C_{y}}{C_{x}}\right) $ and $k_{2}=k_{2opt}=\rho _{yz}\dfrac{ C_{y}}{C_{z}}$ are the values of the constants used in this study; these are the optimum values when $\gamma =1$.
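The quantities in the outline of Theorem 3.1 are straightforward to evaluate numerically. The following is a minimal sketch that transcribes the expressions for $A_{j}$ and $B_{j}$ given above and confirms that $\gamma _{j}=B_{j}/A_{j}$ minimizes ${\overline{Y}}^{2}\left[ 1+\gamma _{j}^{2}A_{j}-2\gamma _{j}B_{j}\right]$. The function names and the parameter values are hypothetical, and the sampling fractions are computed with the conventional definitions $f_r=1/r-1/N$, $f_n=1/n-1/N$, $f_{rn}=1/r-1/n$, which are assumed here because this appendix does not restate them.

```python
def coefficients(j, f_r, f_n, f_rn, Cy, Cx, Cz, rho_yx, rho_yz):
    """Return (A_j, B_j) for j = 2 or 3, transcribed from the appendix."""
    cyx = rho_yx * Cy * Cx  # rho_yx * C_y * C_x
    cyz = rho_yz * Cy * Cz  # rho_yz * C_y * C_z
    if j == 2:
        t1 = rho_yx * Cy / Cx  # theta_1opt
        t2 = rho_yz * Cy / Cz  # theta_2opt
        A = (1 + f_r * Cy**2 + 2 * t1**2 * f_rn * Cx**2
             + 2 * t2**2 * f_n * Cz**2
             + t1 * f_rn * (Cx**2 - 4 * cyx)
             + t2 * f_n * (Cz**2 - 4 * cyz))
        B = (1 + t1**2 / 2 * f_rn * Cx**2 + t2**2 / 2 * f_n * Cz**2
             + t1 / 2 * f_rn * (Cx**2 - 2 * cyx)
             + t2 / 2 * f_n * (Cz**2 - 2 * cyz))
    elif j == 3:
        k1 = 1 - rho_yx * Cy / Cx  # k_1opt
        k2 = rho_yz * Cy / Cz      # k_2opt
        A = (1 + f_r * Cy**2 + 3 * k1**2 * f_rn * Cx**2
             + 3 * k2**2 * f_n * Cz**2
             - 2 * k1 * f_rn * (3 * Cx**2 - 2 * cyx)
             + f_rn * (3 * Cx**2 - 4 * cyx)
             - 4 * k2 * f_n * cyz)
        B = (1 + k1**2 * f_rn * Cx**2 + k2**2 * f_n * Cz**2
             - k1 * f_rn * (2 * Cx**2 - cyx)
             + f_rn * (Cx**2 - cyx)
             - k2 * f_n * cyz)
    else:
        raise ValueError("j must be 2 or 3")
    return A, B


def mse(gamma, A, B, Y_bar):
    """MSE(T_j) as a function of the scalar gamma_j."""
    return Y_bar**2 * (1 + gamma**2 * A - 2 * gamma * B)


def min_mse(A, B, Y_bar):
    """Minimum MSE, attained at gamma_j = B_j / A_j."""
    return Y_bar**2 * (1 - B**2 / A)


# Hypothetical design: N = 1000, n = 200, r = 150 (assumed definitions of f).
N, n, r = 1000, 200, 150
f_r, f_n, f_rn = 1 / r - 1 / N, 1 / n - 1 / N, 1 / r - 1 / n
for j in (2, 3):
    A, B = coefficients(j, f_r, f_n, f_rn, Cy=0.5, Cx=0.4, Cz=0.6,
                        rho_yx=0.8, rho_yz=0.6)
    gamma_opt = B / A
    print(j, gamma_opt, mse(gamma_opt, A, B, 50.0), min_mse(A, B, 50.0))
```

Because the MSE is a quadratic in $\gamma _{j}$ with leading coefficient ${\overline{Y}}^{2}A_{j}$, the stationary point is a minimum whenever $A_{j}>0$, which holds for the values used here.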

Outline of Derivation of Theorem 4.1. The minimum MSE of $T_{1}$, attained at the optimum values of the scalars listed below, is given by

${\text {MSE}}\left( T_{1}\right) =\dfrac{{\bar{Y}}^{2}\left[ S_{y}^{2}\left\{ f_r -f_{rn} \rho _{yx}^{2}-f_n \rho _{yz}^{2}\right\} \right] }{\left[ {\bar{Y}}^{2}+S_{y}^{2}\left\{ f_r -f_{rn} \rho _{yx}^{2}-f_n \rho _{yz}^{2}\right\} \right] }=\dfrac{{\bar{Y}}^{2} \left[ {\text {MSE}}\left( t_{1}\right) \right] }{\left[ {\bar{Y}}^{2}+{\text {MSE}}\left( t_{1}\right) \right] }.$

The optimum values of scalars involved are given below:

$\gamma _{1}=\dfrac{1}{\left[ 1+C_{y}^{2}\left\{ f_r -f_{rn} \rho _{yx}^{2}-f_n \rho _{yz}^{2}\right\} \right] }$, $\delta _{1}=-\gamma _{1} \beta _{1}$ and $\delta _{2}=-\gamma _{1} \beta _{2}$,

where $\beta _{1}=\rho _{yx}\dfrac{S_{y}}{S_{x}}$ and $\beta _{2} =\rho _{yz}\dfrac{ S_{y}}{S_{z}}$.
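As a rough numerical illustration of the outline of Theorem 4.1, the sketch below evaluates ${\text {MSE}}\left( t_{1}\right) $, ${\text {MSE}}\left( T_{1}\right) $ and the scalars $\gamma _{1}$, $\delta _{1}$, $\delta _{2}$ from the expressions above. It assumes the usual coefficient of variation $C_{y}=S_{y}/{\bar{Y}}$, and the function name and all input values are hypothetical. Under that assumption the two displayed formulas imply ${\text {MSE}}\left( T_{1}\right) ={\bar{Y}}^{2}\left( 1-\gamma _{1}\right) $, which the code checks.

```python
def theorem_4_1(Y_bar, S_y, S_x, S_z, rho_yx, rho_yz, f_r, f_n, f_rn):
    """Evaluate the Theorem 4.1 quantities from the appendix formulas."""
    bracket = f_r - f_rn * rho_yx**2 - f_n * rho_yz**2
    mse_t1 = S_y**2 * bracket
    mse_T1 = Y_bar**2 * mse_t1 / (Y_bar**2 + mse_t1)
    C_y = S_y / Y_bar  # assumed definition of the coefficient of variation
    gamma_1 = 1 / (1 + C_y**2 * bracket)
    beta_1 = rho_yx * S_y / S_x
    beta_2 = rho_yz * S_y / S_z
    return {
        "mse_t1": mse_t1,
        "mse_T1": mse_T1,
        "gamma_1": gamma_1,
        "delta_1": -gamma_1 * beta_1,
        "delta_2": -gamma_1 * beta_2,
    }


# Hypothetical inputs: N = 1000, n = 200, r = 150 with the conventional
# (assumed) definitions f_r = 1/r - 1/N, f_n = 1/n - 1/N, f_rn = 1/r - 1/n.
result = theorem_4_1(Y_bar=50.0, S_y=25.0, S_x=10.0, S_z=12.0,
                     rho_yx=0.8, rho_yz=0.6,
                     f_r=1 / 150 - 1 / 1000,
                     f_n=1 / 200 - 1 / 1000,
                     f_rn=1 / 150 - 1 / 200)
for name, value in result.items():
    print(name, value)
```

Since ${\text {MSE}}\left( T_{1}\right) $ has the form ${\bar{Y}}^{2}M/\left( {\bar{Y}}^{2}+M\right) $ with $M={\text {MSE}}\left( t_{1}\right) $, it is strictly smaller than ${\text {MSE}}\left( t_{1}\right) $ whenever $M>0$.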

About this article

Cite this article

Bhushan, S., Pandey, A.P. New Chain Imputation Methods for Estimating Population Mean in the Presence of Missing Data Using Two Auxiliary Variables. Commun. Math. Stat. 11, 325–340 (2023). https://doi.org/10.1007/s40304-021-00251-w

Received: 20 May 2020
Revised: 22 August 2020
Accepted: 08 June 2021
Published: 21 January 2022
Issue Date: June 2023
DOI: https://doi.org/10.1007/s40304-021-00251-w

Mathematics Subject Classification

62D05
