Abstract
Statistical inference with nonresponse is challenging, especially when the response mechanism is nonignorable. In this case, the validity of inference depends on correct specification of the response model, which cannot be tested from the observed data. To guard against misspecification, we propose semiparametric Bayesian estimation in which the outcome model is parametric but the response model is semiparametric, in that we do not assume any parametric form for the nonresponse variable. We adopt penalized spline methods to estimate the unknown function. We also consider a fully nonparametric approach to modeling the response mechanism using radial basis function methods. Using Pólya–gamma data augmentation, we develop an efficient posterior computation algorithm via Gibbs sampling in which most full conditional distributions take familiar forms. The performance of the proposed method is demonstrated in simulation studies and an application to longitudinal data.
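To make the role of Pólya–gamma data augmentation concrete, the following is a minimal sketch of a Gibbs sampler for an ordinary Bayesian logistic regression, following the general scheme of Polson, Scott and Windle (2013). It is not the semiparametric response model proposed in the article: the function names, the Gaussian prior with variance `prior_var`, and the simulated data are all illustrative assumptions, and the Pólya–gamma draws come from a truncated sum-of-gammas series, which is only an approximation to an exact sampler.

```python
import numpy as np

def sample_pg_approx(b, c, rng, n_terms=200):
    """Approximate draws from PG(b, c) using a truncated sum-of-gammas series.

    Assumption: truncating the infinite series at n_terms is accurate enough
    for illustration; an exact sampler would be used in practice.
    """
    c = np.asarray(c, dtype=float)
    k = np.arange(1, n_terms + 1)
    # Denominators (k - 1/2)^2 + c^2 / (4 pi^2), one row per observation.
    denom = (k - 0.5) ** 2 + (c[:, None] ** 2) / (4.0 * np.pi ** 2)
    g = rng.gamma(shape=b, scale=1.0, size=denom.shape)
    return (g / denom).sum(axis=1) / (2.0 * np.pi ** 2)

def gibbs_logistic(X, y, n_iter=500, burn_in=100, prior_var=100.0, seed=0):
    """Gibbs sampler for logistic regression with a N(0, prior_var * I) prior.

    Both full conditionals are of familiar form: Polya-gamma for the latent
    variables and multivariate normal for the coefficients.
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape
    beta = np.zeros(p)
    kappa = y - 0.5                      # kappa_i = y_i - 1/2
    prior_prec = np.eye(p) / prior_var   # prior precision (hypothetical choice)
    draws = np.empty((n_iter - burn_in, p))
    for it in range(n_iter):
        # Step 1: omega_i | beta ~ PG(1, x_i' beta)
        omega = sample_pg_approx(1.0, X @ beta, rng)
        # Step 2: beta | omega, y ~ N(mean, cov)
        prec = X.T @ (omega[:, None] * X) + prior_prec
        cov = np.linalg.inv(prec)
        cov = 0.5 * (cov + cov.T)        # enforce symmetry numerically
        mean = cov @ (X.T @ kappa)
        beta = rng.multivariate_normal(mean, cov)
        if it >= burn_in:
            draws[it - burn_in] = beta
    return draws

# Usage on simulated data (all values below are illustrative).
data_rng = np.random.default_rng(42)
n = 400
X = np.column_stack([np.ones(n), data_rng.normal(size=n)])
true_beta = np.array([-0.5, 1.0])
prob = 1.0 / (1.0 + np.exp(-(X @ true_beta)))
y = (data_rng.random(n) < prob).astype(float)

draws = gibbs_logistic(X, y)
post_mean = draws.mean(axis=0)
print("posterior mean:", post_mean)
```

In a semiparametric variant, one plausible extension would be to append penalized spline basis columns to `X` and place a shrinkage prior on their coefficients, so that the same two-step update still applies; that extension is an assumption here and is not shown.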
Acknowledgements
This work is supported by the Japan Society for the Promotion of Science (KAKENHI) grant numbers 18K12757 and 19K14592.
Cite this article
Sugasawa, S., Morikawa, K. & Takahata, K. Bayesian semiparametric modeling of response mechanism for nonignorable missing data. TEST 31, 101–117 (2022). https://doi.org/10.1007/s11749-021-00774-y
DOI: https://doi.org/10.1007/s11749-021-00774-y
Keywords
- Longitudinal data
- Markov chain Monte Carlo
- Multiple imputation
- Pólya–gamma distribution
- Penalized spline