Abstract
The problem of nonignorable nonresponse data is ubiquitous in medical and social science studies. Analyses that rely only on the missing-at-random assumption may lead to biased results. Various debiasing methods have been studied extensively in the literature, particularly doubly robust (DR) estimators. We propose DR augmented-estimating-equations (AEE) estimators of the mean response that enjoy the double-robustness property under correct specification of the log odds ratio model. An advantage of the DR AEE estimators is that they can efficiently exploit the completely observed covariates to improve the estimation efficiency of existing DR estimators with nonignorable nonresponse data. We also propose a model selection criterion that consistently selects the correct parametric specification of the log odds ratio model from a group of candidate models. Moreover, the correctness of the required working models can be assessed via straightforward goodness-of-fit tests. Simulation results indicate that the DR AEE estimators are very robust to misspecification of the baseline outcome density model or the baseline response model and dominate their competitors in terms of mean-square error. The analysis of a real dataset illustrates the flexibility and usefulness of the proposed methods.
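For readers unfamiliar with the doubly robust construction, the following is a minimal numerical sketch of the generic augmented inverse-probability-weighting (AIPW) form of a DR mean estimate, not the proposed AEE estimators; the function name `dr_mean` and the inputs `pi_hat` (fitted response probabilities) and `m_hat` (fitted outcome predictions) are illustrative, and the toy simulation uses a simple covariate-driven response model rather than the paper's nonignorable setting.

```python
import numpy as np

def dr_mean(y, r, pi_hat, m_hat):
    """AIPW-type doubly robust estimate of E[y].

    y:      outcomes (values for nonrespondents are never used)
    r:      response indicators (1 = observed)
    pi_hat: estimated response probabilities
    m_hat:  estimated outcome-model predictions of E[y | covariates]
    """
    y_obs = np.where(r == 1, y, 0.0)  # zero out unobserved outcomes
    # IPW term corrected by the augmentation term built from the outcome model
    return np.mean(r * y_obs / pi_hat - (r / pi_hat - 1.0) * m_hat)

# Toy check: with the true response probabilities and a deliberately wrong
# outcome model (m_hat = 0), the estimate should still be close to E[y] = 2.
rng = np.random.default_rng(0)
n = 200_000
x = rng.normal(size=n)
y = 2.0 + x + rng.normal(size=n)
pi = 1.0 / (1.0 + np.exp(-(0.5 + x)))
r = rng.binomial(1, pi)
mu_hat = dr_mean(y, r, pi, m_hat=np.zeros(n))
```

The estimate remains consistent because either a correct propensity model or a correct outcome model makes the expectation of the augmentation error vanish.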
References
Choi JY, Lee MJ (2017) Regression discontinuity: review with extensions. Stat Pap 58:1217–1246
D’Haultfoeuille X (2010) A new instrumental method for dealing with endogenous selection. J Econom 154:1–15
Fang F, Shao J (2016) Model selection with nonignorable nonresponse. Biometrika 103:861–874
Guan Z, Qin J (2017) Empirical likelihood method for non-ignorable missing data problems. Lifetime Data Anal 23:113–135
Hall AR (2005) Generalized method of moments. Oxford University Press, Oxford
Han P (2014) Multiply robust estimation in regression analysis with missing data. J Am Stat Assoc 109(507):1159–1173
Kang JD, Schafer JL (2007) Demystifying double robustness: a comparison of alternative strategies for estimating a population mean from incomplete data. Stat Sci 22:523–539
Kim JK, Yu CL (2011) A semiparametric estimation of mean functionals with nonignorable missing data. J Am Stat Assoc 106(493):157–165
Little RJA, Rubin DB (2002) Statistical inference with missing data, 2nd edn. Wiley series in probability and statistics. Wiley, New York
Miao W, Tchetgen EJ (2016) On varieties of doubly robust estimators under missingness not at random with a shadow variable. Biometrika 103(2):475–482
Miao W, Tchetgen EJ, Geng Z (2015) Identification and doubly robust estimation of data missing not at random with an ancillary variable. http://biostats.bepress.com/harvardbiostat/paper189
Morikawa K, Kim JK (2016) Semiparametric adaptive estimation with nonignorable nonresponse data. arXiv:1612.09207 [stat.ME]
Robins JM, Rotnitzky A, Zhao LP (1994) Estimation of regression coefficients when some regressors are not always observed. J Am Stat Assoc 89:846–866
Robins J, Sued M, Lei-Gomez Q, Rotnitzky A (2007) Comment: performance of double-robust estimators when “inverse probability” weights are highly variable. Stat Sci 22:544–559
Shao J, Wang L (2016) Semiparametric inverse propensity weighting for nonignorable missing data. Biometrika 103(1):175–187
Tang G, Little RJA, Raghunathan TE (2003) Analysis of multivariate missing data with nonignorable nonresponse. Biometrika 90(4):747–764
Tang N, Zhao P, Zhu H (2014) Empirical likelihood for estimating equations with nonignorably missing data. Stat Sin 24(2):723–747
Tang N, Zhao P, Qu A, Jiang D (2017) Semiparametric estimating equations inference with nonignorable nonresponse. Stat Sin. https://doi.org/10.5705/ss.202015.0052
Tsiatis AA (2006) Semiparametric theory and missing data. Springer Series in Statistics. Springer, New York
Wang S, Shao J, Kim JK (2014) An instrumental variable approach for identification and estimation with nonignorable nonresponse. Stat Sin 24(3):1097–1116
Zhao J, Shao J (2015) Semiparametric pseudo-likelihoods in generalized linear models with nonignorable missing data. J Am Stat Assoc 110(512):1577–1590
Acknowledgements
We are grateful to the two reviewers and the editor for a number of constructive and helpful comments and suggestions that have clearly improved our manuscript. Xiaohui Yuan was partly supported by the NSFC (No. 11571051, 11671054, 11701043).
Appendix
To establish all the large sample properties in this paper, we require the following conditions:
C1. \(t_i=(w_i^\textsf {T}, y_i^\textsf {T},r_i^\textsf {T})^\textsf {T}\), \(i=1,\ldots ,n\), are independent and identically distributed, where \(w_i=(x_i^\textsf {T},z_i^\textsf {T})^\textsf {T}\).
C2. (a) ; (b) ; (c) \(0<E[\text{ OR }(y|x)|r=1,x]<+\infty \), \(0<E[\text{ OR }(y|x)|r=1,w]<+\infty \), and \(\text{ OR }(y=0|x)=0\), where \(\text{ OR }(y|x)=\log \left\{ \frac{\text{ pr }(r=0|y,x)\text{ pr }(r=1|y=0,x)}{\text{ pr }(r=0|y=0,x)\text{ pr }(r=1|y,x)}\right\} \); (d) \(E(y^2)<+\infty \).
C3. Either \(f (y |r =1, w)\) or \(f (y |r =0, w)\) follows the location-scale model
$$\begin{aligned} f(y|w,r)=\frac{1}{\sigma _r(w)}f_r\left\{ \frac{y-\mu _r(w)}{\sigma _r(w)}\right\} , r=0,1, \end{aligned}$$with unrestricted functions \(\mu _r\) and \(\sigma _r\) and density functions \(f_r\), and the corresponding density function \(f_{r=1}\) or \(f_{r=0}\) satisfies the following conditions: (a) the characteristic function \(\varphi (t)\) of the density function f(v) satisfies \(0<|\varphi (t)|<C\exp (-\delta |t|)\) for \(t\in R\) and some constants \(C, \delta >0\); (b) conditional on x, \(\mu (z,x)\) and \(\sigma (z,x)\) are continuously differentiable and integrable with respect to z; f(v) is continuously differentiable, and \(\int _{-\infty }^{+\infty }|v\times \partial f (v)/\partial v|^2dv\) is finite; (c) there exists a linear one-to-one mapping \(M :f \{(v-a)/b\}\mapsto c(t, a, b)\) and some value \(-\infty \le t_0\le +\infty \) such that \(\lim _{t\rightarrow t_0}c(t, a, b)/c(t, a', b')\) equals either zero or infinity for any \(a, a' \in \mathbb {R}\), \(b, b'>0\) with \((a, b)\ne (a',b')\).
C4. The response probability \(\pi (x, y; \alpha ^*,\gamma ^0)\) is bounded away from zero. That is, \(\pi (x, y; \alpha ^*,\gamma ^0) > c_0\) for some \(c_0 > 0\) and all (x, y).
C5. Define \(\vartheta =(\mu ^\textsf {T},\theta ^\textsf {T})^\textsf {T}\), \(\vartheta ^*=(\mu ^{0},\theta ^{*\textsf {T}})^\textsf {T}\) and \(b(t,\vartheta )=(b_1(t; \vartheta ),\ldots ,b_s(t; \vartheta ))^\textsf {T}=(d_1(t;\theta )-\mu ,h^\textsf {T}(t;\theta ))^\textsf {T}\), where \(\{b_1(t; \vartheta ),\ldots ,b_s(t; \vartheta )\}\) is a set of functionally independent estimating functions. There exists a unique \(\vartheta ^*\) such that \(E\{b(t; \vartheta ^*)\}=0\), the matrix \(E\{\partial b(t; \vartheta ^*)/\partial \vartheta ^\textsf {T}\}\) is of full rank, the matrix \(E\{b^{\otimes 2}(t; \vartheta ^*)\}\) is positive definite, and \(\partial b(t; \vartheta )/\partial \vartheta ^\textsf {T}\) and \(\partial ^2 b(t; \vartheta )/\partial \vartheta \partial \vartheta ^\textsf {T}\) are continuous in a neighborhood of \(\vartheta ^*\). Furthermore, \(\Vert \partial ^2 b(t; \vartheta )/\partial \vartheta \partial \vartheta ^\textsf {T}\Vert \), \(\Vert b(t; \vartheta )\Vert ^3\), and \(\Vert \partial b(t; \vartheta )/\partial \vartheta ^\textsf {T}\Vert \) are bounded by some integrable function in this neighborhood.
Most of the above conditions concern the missing data mechanism and the identification of the joint distribution of (w, y, r); the remaining conditions concern the moment restrictions. Conditions C1–C3 guarantee the identifiability of the joint distribution of (w, y, r). Condition C4 implies that the outcome y cannot be missing with probability 1 anywhere in the domain of (x, y). Condition C5 imposes regularity conditions on the moment restriction \(E\{b(t,\vartheta ^*)\}=0\).
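To make the log odds ratio function in condition C2 concrete, the following sketch computes \(\text{OR}(y|x)\) under a hypothetical logistic response model \(\text{logit}\,\text{pr}(r=1|y,x)=\alpha_0+\alpha_1 x+\gamma y\) (the parameter names are illustrative). The baseline part \(\alpha_0+\alpha_1 x\) cancels in the definition, leaving \(\text{OR}(y|x)=-\gamma y\), which depends on the nonignorability parameter only and satisfies \(\text{OR}(y=0|x)=0\).

```python
import math

def log_odds_ratio(y, x, alpha0, alpha1, gamma):
    """OR(y|x) = log{ pr(r=0|y,x) pr(r=1|y=0,x) / [ pr(r=0|y=0,x) pr(r=1|y,x) ] }
    computed under an assumed logistic response model."""
    def p1(y_, x_):  # pr(r=1 | y_, x_)
        return 1.0 / (1.0 + math.exp(-(alpha0 + alpha1 * x_ + gamma * y_)))
    p, p0 = p1(y, x), p1(0.0, x)
    return math.log(((1.0 - p) * p0) / ((1.0 - p0) * p))

# The baseline terms cancel, so the result is -gamma*y regardless of x:
or_at_x5 = log_odds_ratio(2.0, 5.0, 0.3, -0.7, 0.4)    # analytically -0.8
or_at_xm1 = log_odds_ratio(2.0, -1.0, 0.3, -0.7, 0.4)  # analytically -0.8
or_at_y0 = log_odds_ratio(0.0, 5.0, 0.3, -0.7, 0.4)    # analytically 0
```

This cancellation is exactly why the odds ratio model isolates the nonignorable component of the response mechanism from its baseline part.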
Proof of Theorem 1
The properties (1) and (3) are obvious. We only prove property (2). By direct calculations, we have
Moreover,
Similarly, \(E[\{\pi ^{-1}(w, y)- 1\}|w]=\text{ pr }(r=0|w)E[\pi ^{-1}(w, y)|r=0,w]\). Thus,
where \(N_*(w)=E\{\pi ^{-1}(w, y)y|r=0,w\}/E\{\pi ^{-1}(w, y)|r=0,w\}\). The desired result follows by letting \(N(w)=N_*(w)\). \(\square \)
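The quantity \(N_*(w)\) can be evaluated from respondents alone once the odds ratio is known. As a hedged Monte Carlo sketch (covariates suppressed for simplicity, with an assumed known odds-ratio parameter `gamma`), the following checks numerically that the nonrespondent mean \(E(y|r=0)\) matches the odds-ratio-weighted respondent mean \(E(r\varsigma y)/E(r\varsigma)\), the identity underlying \(N_*(w)\):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500_000
y = rng.normal(loc=1.0, scale=1.0, size=n)

# Nonignorable response: probability depends on the (possibly missing) y itself.
gamma = 0.5
pi = 1.0 / (1.0 + np.exp(-(1.0 + gamma * y)))  # logit pr(r=1|y) = 1 + gamma*y
r = rng.binomial(1, pi)

# varsigma(y) = exp{OR(y)} = exp(-gamma*y) under this logistic response model
varsigma = np.exp(-gamma * y)

# Identity: E{y | r=0} = E{r varsigma y} / E{r varsigma}
lhs = y[r == 0].mean()                                   # from simulated nonrespondents
rhs = (r * varsigma * y).mean() / (r * varsigma).mean()  # from respondents only
```

In practice the left side is unobservable, which is precisely why the respondent-based right side, available once the odds ratio model is specified, is the useful representation.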
Lemma A.1
Under regularity condition C2, we have
Proof of Lemma A.1
On one hand, from (5), we have
On the other hand, from (5), it follows that
Thus,
\(\square \)
Lemma A.2
Under regularity conditions C1–C5, suppose that the odds ratio model \(\varsigma (x,y;\gamma )=\exp \{\text{ OR }(y|x;\gamma )\}\) is correct, that the equation \(E\{h(t;\theta )\}=0\) has a unique solution \(\theta ^*\), and that \(({\hat{\alpha }},{\hat{\gamma }},{\hat{\beta }})\) solves Eq. (7). Then, for any square integrable vector function \(\varphi (w, y)\) and scalar function \(\kappa (x)\):
(i) if \(\text{ pr }(r=1|y=0,x;\alpha )\) is correct, then
$$\begin{aligned} n^{-1}\sum _{i=1}^n\{1-\pi ^{-1}(x_i,y_i;{\hat{\alpha }}, {\hat{\gamma }})r_i\}\varphi (w_i, y_i) \end{aligned}$$converges to zero in probability;
(ii) if \(f(y,z|r=1,x;\beta )\) is correct, then
$$\begin{aligned} n^{-1}\sum _{i=1}^n[r_i\varsigma (x_i,y_i;{\hat{\gamma }}) \kappa (x_i)\{\varphi (y_i, w_i)-\varphi _\bullet (x_i;{\hat{\gamma }},{\hat{\beta }})\}] \end{aligned}$$converges to zero in probability, where \(\varphi _\bullet (x;\gamma ,\beta )=E\{\varphi (w, y) |r =0, x; \beta \}\);
(iii) if either \(\text{ pr }(r=1|y=0,x;\alpha )\) or \(f(z,y|r=1,x;\beta )\) is correct, then
$$\begin{aligned} n^{-1}\sum _{i=1}^n[\{r_i\pi ^{-1}(x_i,y_i;{\hat{\alpha }},{\hat{\gamma }}) -1\}\{\varphi (w_i, y_i )-\varphi _\bullet (x_i;{\hat{\gamma }},{\hat{\beta }})\}] \end{aligned}$$converges to zero in probability.
Proof of Lemma A.2
See the proof of Lemma A1 in Miao and Tchetgen (2016). \(\square \)
Lemma A.3
Under regularity conditions C1–C5, suppose that the odds ratio model \(\varsigma (x,y;\gamma )=\exp \{\text{ OR }(y|x;\gamma )\}\) is correct, that the equation \(E\{h(t;\theta )\}=0\) has a unique solution \(\theta ^*\), and that \(({\hat{\alpha }},{\hat{\gamma }},{\hat{\beta }})\) solves Eq. (7). Then, for any square integrable vector function \(\psi (w, y)\) and scalar function \(\kappa (w)\):
(i) if \(\text{ pr }(r=1|y=0,x;\alpha )\) is correct, then
$$\begin{aligned} n^{-1}\sum _{i=1}^n\{1-\pi ^{-1}(x_i,y_i;{\hat{\alpha }},{\hat{\gamma }})r_i\}\psi (w_i, y_i) \end{aligned}$$converges to zero in probability;
(ii) if \(f(y,z|r=1,x;\beta )\) is correct, then
$$\begin{aligned} n^{-1}\sum _{i=1}^n[r_i\varsigma (x_i,y_i;{\hat{\gamma }}) \kappa (w_i)\{\psi (w_i, y_i)-\psi _\bullet (w_i;{\hat{\gamma }},{\hat{\beta }})\}] \end{aligned}$$converges to zero in probability, where \(\psi _\bullet (w;\gamma ,\beta )=E\{\psi (w, y) |r =0, w; \beta \}\);
(iii) if either \(\text{ pr }(r=1|y=0,x;\alpha )\) or \(f(y,z|r=1,x;\beta )\) is correct, then
$$\begin{aligned} n^{-1}\sum _{i=1}^n[\{r_i\pi ^{-1}(x_i,y_i;{\hat{\alpha }}, {\hat{\gamma }})-1\}\{\psi (w_i, y_i )-\psi _\bullet (w_i;{\hat{\gamma }},{\hat{\beta }})\}] \end{aligned}$$converges to zero in probability.
Proof of Lemma A.3
The proof is very similar to that of Lemma A1 of Miao and Tchetgen (2016). We use starred characters \((\alpha ^*,\gamma ^*,\beta ^*)\) to denote probability limits of \(({\hat{\alpha }},{\hat{\gamma }},{\hat{\beta }})\), and use \((\alpha ^0,\gamma ^0,\beta ^0)\) to denote true values of the nuisance parameters.
(i) If \(\text{ pr }(r=1|y=0,x;\alpha )\) is correct, then by the proof of (i) of Lemma A.2, \(({\hat{\alpha }},{\hat{\gamma }},{\hat{\beta }})\) must converge to \((\alpha ^0,\gamma ^0,\beta ^*)\). Thus, for any square integrable function \(\psi (w, y)\),
$$\begin{aligned} n^{-1}\sum _{i=1}^n\{1-\pi ^{-1}(x_i,y_i;{\hat{\alpha }}, {\hat{\gamma }})r_i\}\psi (w_i, y_i) \end{aligned}$$converges to \(E[\{r\pi ^{-1}(x,y;\alpha ^0,\gamma ^0)-1\}\psi (w, y)]\), which equals zero.
(ii) For any square integrable function \(\psi (w, y)\), from Lemma A.1, we can prove that
$$\begin{aligned} \psi _\bullet (w;\gamma ^0,\beta ^0)= & {} E\{\psi (w, y) |r =0, w; \beta ^0\}\\= & {} \frac{E\{r\varsigma (x,y;\gamma ^0)\psi (w, y)|w; \beta ^0\}}{E\{r\varsigma (x,y;\gamma ^0)|w;\beta ^0\}}. \end{aligned}$$Thus,
$$\begin{aligned} E\{r\varsigma (x,y;\gamma ^0)\psi (w, y)|w;\beta ^0\}=E\{r\varsigma (x,y;\gamma ^0)|w;\beta ^0\} \psi _\bullet (w;\gamma ^0,\beta ^0), \end{aligned}$$and
$$\begin{aligned} E\{r\varsigma (x,y;\gamma ^0)\{\psi (w, y)-\psi _\bullet (w;\gamma ^0,\beta ^0)\}|w;\beta ^0\}=0. \end{aligned}$$Then for any square integrable scalar function \(\kappa (w)\),
$$\begin{aligned} E\{r\varsigma (x,y;\gamma ^0)\kappa (w)\{\psi (w, y)-\psi _\bullet (w;\gamma ^0,\beta ^0)\}|w;\beta ^0\}=0, \end{aligned}$$and by the law of iterated expectation
$$\begin{aligned} E[r\varsigma (x,y;\gamma ^0)\kappa (w)\{\psi (w, y)-\psi _\bullet (w;\gamma ^0,\beta ^0)\}]=0. \end{aligned}$$Moreover, by Lemma A.1, it follows that
$$\begin{aligned} E[(1-r)\{\psi (w, y)-\psi _\bullet (w;\gamma ^0,\beta ^0)\}]=0. \end{aligned}$$From Eq. (4), \(\{\pi ^{-1}(x,y;\alpha ^*,\gamma ^0)-1\}r=r\varsigma (x,y; \gamma ^0)v(x,\alpha ^*)\) with \(v(x,\alpha ^*)=\text{ pr }(r = 0 | y = 0, x; \alpha ^*)/\text{ pr }(r = 1| y = 0, x; \alpha ^*)\). Thus,
$$\begin{aligned} E[\{\pi ^{-1}(x,y;\alpha ^*,\gamma ^0)r-1\}\{\psi (w, y)-\psi _\bullet (w;\gamma ^0,\beta ^0)\}]=0. \end{aligned}$$By the proof of (ii) of Lemma A.2, if \(f(z,y|r=1,x;\beta )=f(y|r=1,x;\zeta )f(z|x,y;\eta )\) is correct, then \(({\hat{\alpha }},{\hat{\gamma }},{\hat{\beta }})\) must converge to \((\alpha ^*,\gamma ^0,\beta ^0)\). Thus, for any square integrable function \(\psi (w, y)\),
$$\begin{aligned}&\lim _{n\rightarrow \infty }n^{-1}\sum _{i=1}^n[\{\pi ^{-1}(x_i,y_i; {\hat{\alpha }},{\hat{\gamma }})r_i-1\}\{\psi (w_i, y_i)-\psi _\bullet (w_i;{\hat{\gamma }},{\hat{\beta }})\}]\\= & {} E[\{\pi ^{-1}(x,y;\alpha ^*,\gamma ^0)r-1\}\{\psi (w, y)-\psi _\bullet (w;\gamma ^0,\beta ^0)\}]=0. \end{aligned}$$
(iii)
If \(\text{ pr }(r=1|y=0,x;\alpha )\) is correct, the result is implied by (i). If \(f(y,z|r=1,x;\beta )\) is correct, the result is implied by (ii). \(\square \)
Proof of Theorem 2
The proof is very similar to that of Theorem 1 of Miao and Tchetgen (2016).
(1) Double robustness of \({\tilde{\mu }}_1\). If either of the baseline models is correct, then from (iii) of Lemma A.3, \(n^{-1}\sum _{i=1}^n[\{\pi ^{-1}(x_i,y_i;{\hat{\alpha }},{\hat{\gamma }})r_i-1\}\{y_i-N_0(w_i; {\hat{\gamma }},{\hat{\beta }})\}]\) converges to zero; therefore
$$\begin{aligned} n^{-1}\sum _{i=1}^n[\pi ^{-1}(x_i,y_i;{\hat{\alpha }},{\hat{\gamma }}) r_i\{y_i-N_0(w_i; {\hat{\gamma }},{\hat{\beta }})\}+N_0(w_i; {\hat{\gamma }},{\hat{\beta }})] \end{aligned}$$converges to the true outcome mean \(\mu ^0\).
(2) Double robustness of \({\tilde{\mu }}_2\). From (i) of Lemma A.3, if the baseline propensity score model is correct,
$$\begin{aligned}&n^{-1}\sum _{i=1}^n\{\pi _{ext}^{-1}(x_i,y_i;{\hat{\alpha }}, {\hat{\gamma }},\lambda =0)r_i-1\}\{N_0(w_i;{\hat{\gamma }}, {\hat{\beta }})-{\tilde{\mu }}_{reg}\} \\= & {} n^{-1}\sum _{i=1}^n\{\pi ^{-1}(x_i, y_i; {\hat{\alpha }}, {\hat{\gamma }})r_i-1\}\{N_0(w_i;{\hat{\gamma }},{\hat{\beta }}) -{\tilde{\mu }}_{reg}\} \end{aligned}$$converges to zero, i.e., \(\lambda =0\) is a solution of the probability limit of Eq. (19). Thus the solution \({\tilde{\lambda }}\) of Eq. (19) converges to zero, and
$$\begin{aligned}&\lim _{n\rightarrow +\infty }n^{-1}\sum _{i=1}^n\pi _{ext}^{-1} (x_i,y_i;{\hat{\alpha }},{\hat{\gamma }},{\tilde{\lambda }})r_i=1,\\&\lim _{n\rightarrow +\infty }n^{-1}\sum _{i=1}^n\pi _{ext}^{-1} (x_i,y_i;{\hat{\alpha }},{\hat{\gamma }},{\tilde{\lambda }})r_iy_i\\&\quad = \lim _{n\rightarrow +\infty }n^{-1}\sum _{i=1}^n\pi ^{-1}(x_i, y_i; {\hat{\alpha }}, {\hat{\gamma }})r_iy_i=\mu ^0. \end{aligned}$$If the baseline outcome model is correct, \(n^{-1}\sum _{i=1}^n[(1-r_i)\{y_i-N_0(w_i;{\hat{\gamma }},{\hat{\beta }})\}]\) converges to zero; \({\tilde{\mu }}_{reg}=n^{-1}\sum _{i=1}^n\{(1- r_i)N_0(w;{\hat{\gamma }},{\hat{\beta }}) + r_i y_i\}\) converges to \(\mu ^0\); and \(n^{-1}\sum _{i=1}^n(y_i-{\tilde{\mu }}_{reg})\) converges to zero. By definition of the extended propensity score \(\{\pi _{ext}^{-1}(x,y;{\hat{\alpha }},{\hat{\gamma }},{\tilde{\lambda }})-1\}r=r\varsigma (x,y;{\hat{\gamma }})v(x,{\hat{\alpha }},{\tilde{\lambda }})\) with \(v(x,{\hat{\alpha }},{\tilde{\lambda }})=\text{ pr }_{ext}(r = 0 | y = 0, x;{\hat{\alpha }}, {\tilde{\lambda }})/\text{ pr }_{ext}(r = 1| y = 0, x;{\hat{\alpha }}, {\tilde{\lambda }})\). From (ii) of Lemma A.3, \(n^{-1}\sum _{i=1}^n[r_i\{\pi _{ext}^{-1}(x_i,y_i;{\hat{\alpha }}, {\hat{\gamma }},{\tilde{\lambda }})-1\}\{y_i-N_0(w_i;{\hat{\gamma }}, {\hat{\beta }})\}]\) converges to zero. Thus,
$$\begin{aligned} n^{-1}\sum _{i=1}^n[\{\pi _{ext}^{-1}(x_i,y_i;{\hat{\alpha }}, {\hat{\gamma }},{\tilde{\lambda }})r_i-1\}\{y_i-N_0(w_i;{\hat{\gamma }}, {\hat{\beta }})\}] \end{aligned}$$must converge to zero, and
$$\begin{aligned} {\tilde{\mu }}_2= & {} \left\{ \sum _{i=1}^n\pi _{ext}^{-1}(x_i,y_i;{\hat{\alpha }} ,{\hat{\gamma }},{\tilde{\lambda }})r_i\right\} ^{-1}\\&\sum _{i=1}^n [\{\pi _{ext}^{-1}(x_i,y_i;{\hat{\alpha }},{\hat{\gamma }}, {\tilde{\lambda }})r_i-1\}\{y_i-N_0(w_i;{\hat{\gamma }},{\hat{\beta }})\}]\\&+\left\{ \sum _{i=1}^n\pi _{ext}^{-1}(x_i,y_i;{\hat{\alpha }} ,{\hat{\gamma }},{\tilde{\lambda }})r_i\right\} ^{-1} \\&\sum _{i=1}^n [\{\pi _{ext}^{-1}(x_i,y_i;{\hat{\alpha }},{\hat{\gamma }}, {\tilde{\lambda }})r_i-1\}\{N_0(w_i;{\hat{\gamma }},{\hat{\beta }}) -{\tilde{\mu }}_{reg}\}]\\&+\left\{ \sum _{i=1}^n\pi _{ext}^{-1}(x_i,y_i;{\hat{\alpha }} ,{\hat{\gamma }},{\tilde{\lambda }})r_i\right\} ^{-1}\sum _{i=1}^n (y_i-{\tilde{\mu }}_{reg})+{\tilde{\mu }}_{reg} \end{aligned}$$converges to \(\mu ^0\) in probability.
(3) Double robustness of \({\tilde{\mu }}_3\). If the baseline propensity score model is correct, from (i) of Lemma A.3,
$$\begin{aligned} n^{-1}\sum _{i=1}^n\{\pi ^{-1}(x_i, y_i; {\hat{\alpha }}, {\hat{\gamma }})r_i-1\}\{y_i-N_{0ext}(w_i;{\hat{\gamma }},{\hat{\beta }}, {\tilde{\rho }})\} \end{aligned}$$converges to zero. By Eq. (21), \(n^{-1}\sum _{i=1}^n(1-r_i)\{y_i-N_{0ext}(w_i;{\hat{\gamma }},{\hat{\beta }}, {\tilde{\rho }})\}\) converges to zero. Thus,
$$\begin{aligned} {\tilde{\mu }}_3=n^{-1}\sum _{i=1}^n\{r_iy_i+(1-r_i)N_{0ext} (w_i;{\hat{\gamma }},{\hat{\beta }}, {\tilde{\rho }})\} \end{aligned}$$converges to \(\mu ^0\). If the baseline outcome model is correct, \(n^{-1}\sum _{i=1}^n(1-r_i)\{y_i-N_0(w_i; {\hat{\gamma }},{\hat{\beta }})\}\) converges to zero. Since
$$\begin{aligned} \{\pi ^{-1}(x,y;{\hat{\alpha }},{\hat{\gamma }})-1\}r=r \varsigma (x,y;{\hat{\gamma }})v(x,{\hat{\alpha }}) \end{aligned}$$with \(v(x,{\hat{\alpha }})=\text{ pr }(r = 0 | y = 0, x; {\hat{\alpha }})/\text{ pr }(r = 1| y = 0, x; {\hat{\alpha }})\), from (ii) of Lemma A.3,
$$\begin{aligned}&n^{-1}\sum _{i=1}^n\{\pi ^{-1}(x_i, y_i; {\hat{\alpha }}, {\hat{\gamma }})-1\}r_i\{y_i-N_{0ext}(w_i;{\hat{\gamma }},{\hat{\beta }}, \rho =0)\} \\= & {} n^{-1}\sum _{i=1}^n\{\pi ^{-1}(x_i, y_i; {\hat{\alpha }}, {\hat{\gamma }})-1\}r_i\{y_i-N_0(w_i; {\hat{\gamma }},{\hat{\beta }})\} \end{aligned}$$converges to zero. That is, \(\rho =0\) is a solution of the probability limit of Eq. (21). Thus the solution \({\tilde{\rho }}\) of Eq. (21) converges to zero, and
$$\begin{aligned}&\lim _{n\rightarrow \infty }n^{-1}\sum _{i=1}^n\{r_iy_i+(1-r_i) N_{0ext}(w_i;{\hat{\gamma }},{\hat{\beta }}, {\tilde{\rho }})\} \\= & {} \lim _{n\rightarrow \infty }n^{-1}\sum _{i=1}^n\{r_iy_i+(1-r_i)N_{0}(w_i; {\hat{\gamma }},{\hat{\beta }})\}=\mu ^0. \end{aligned}$$
\(\square \)
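The double-robustness argument for \({\tilde{\mu }}_1\) can be checked numerically. The sketch below is a simplified toy (covariates suppressed, odds-ratio parameter `gamma` treated as known and correct, names `mu1`, `pi_wrong`, `n0` illustrative): case (a) uses the true propensity with a deliberately wrong outcome model \(N_0=0\); case (b) uses a propensity with a wrong baseline intercept but the correct odds-ratio part, together with \(N_0\) recovered from respondents via the identity in the proof of Lemma A.3(ii). Both should recover the true mean.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 500_000
y = rng.normal(loc=1.0, scale=1.0, size=n)          # true mean is 1.0

gamma = 0.5                                          # correct odds-ratio parameter
pi = 1.0 / (1.0 + np.exp(-(1.0 + gamma * y)))        # true response probability
r = rng.binomial(1, pi)

def mu1(pi_model, n0):
    """mu~_1 = n^{-1} sum[ pi^{-1} r (y - N0) + N0 ]."""
    return np.mean(r * (y - n0) / pi_model + n0)

# (a) correct propensity baseline, wrong outcome model N0 = 0
mu_a = mu1(pi, 0.0)

# (b) wrong propensity baseline (intercept 2.0 instead of 1.0), correct
#     odds-ratio part; N0 = E{y|r=0} recovered from respondents via
#     E{r e^{-gamma y} y} / E{r e^{-gamma y}}
pi_wrong = 1.0 / (1.0 + np.exp(-(2.0 + gamma * y)))
varsigma = np.exp(-gamma * y)
n0 = (r * varsigma * y).mean() / (r * varsigma).mean()
mu_b = mu1(pi_wrong, n0)
```

Misspecifying both baseline models simultaneously would break the estimator, which is consistent with the theorem requiring at least one of them to be correct.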
Proof of Theorem 3
Using (7) and a Taylor expansion, it follows that
Using (A.1) and direct calculations, we have
By the central limit theorem, we establish the asymptotic normality of \({\tilde{\mu }}_1\).\(\square \)
Proof of Theorem 4
Let \(\mu _{reg}(t,\theta )=(1- r)N_0(w;\gamma ,\beta ) + r y\) and \(\mu _{reg}^*=E\{\mu _{reg}(t,\theta ^*)\}\). By a Taylor expansion, we get
where \(\mu _{reg,\theta }=E\{\partial \mu _{reg}(t,\theta ^*)/\partial \theta ^\textsf {T}\}\). Let
and \(\lambda ^*\) be the unique solution to the equation \(E\{g_{ht}(t,\lambda ,\theta ^*,\mu _{reg}^*)\}=0\). Repeated applications of Taylor expansions yield
where \(g_{ht,\lambda }=E\{\partial g_{ht}(t,\lambda ^*,\theta ^*,\mu _{reg}^*)/\partial \lambda \}, g_{ht,\theta }=E\{\partial g_{ht}(t,\lambda ^*,\theta ^*,\mu _{reg}^*)/\partial \theta ^\textsf {T}\}\) and
Using (A.1), we can write
where
Let
By a Taylor expansion, we get that
where \(\delta _{\theta }=E\{\partial \delta (t,\theta ^*,\lambda ^*)/\partial \theta ^\textsf {T}\}\) and \(\delta _{\lambda }=E\{\partial \delta (t,\theta ^*,\lambda ^*)/\partial \lambda \}\). Combining previous results, we have
where
\(d_{2,\theta }=E\{\partial d_2(t,\theta ^*,\lambda ^*,\delta ^*)/\partial \theta ^\textsf {T}\}\), \(d_{2,\lambda }=E\{\partial d_2(t,\theta ^*,\lambda ^*, \delta ^*)/\partial \lambda \}\) and \(d_{2,\delta }=E\{\partial d_2(t,\theta ^*, \lambda ^*,\delta ^*)/\partial \delta \}\). By the central limit theorem, we establish the asymptotic normality of \({\tilde{\mu }}_2\). \(\square \)
Proof of Theorem 5
Let \(g_{reg}(t,\rho ,\theta )=\{\pi ^{-1}(x, y; \alpha , \gamma )-1\}r\{y-N_{0ext}(w; \gamma ,\beta ,\rho )\}\) and \(\rho ^*\) be the unique solution to the equation \(E\{g_{reg}(t,\rho ,\theta ^*)\}=0\). A Taylor expansion reveals that
where \(g_{reg,\rho }=E\{\partial g_{reg}(t_i,\rho ^*,\theta ^*)/\partial \rho \}\) and \(g_{reg,\theta }=E\{\partial g_{reg}(t_i,\rho ^*,\theta ^*)/\partial \theta ^\textsf {T}\}\). Using (A.1), it follows that
Let \(d_3(t,\theta ,\rho )=ry+(1-r)N_{0ext}(w; \gamma ,\beta ,\rho )\). By direct calculations, we have
where
\(d_{3,\theta }=E\{\partial d_3(t,\theta ^*,\rho ^*)/\partial \theta ^\textsf {T}\}\) and \(d_{3,\rho }=E\{\partial d_3(t,\theta ^*,\rho ^*)/\partial \rho \}\). By the central limit theorem, we establish the asymptotic normality of \({\tilde{\mu }}_3\). \(\square \)
Proof of Theorem 6
The proof is simple and the details are omitted. \(\square \)
Liu, T., Yuan, X. Doubly robust augmented-estimating-equations estimation with nonignorable nonresponse data. Stat Papers 61, 2241–2270 (2020). https://doi.org/10.1007/s00362-018-1046-5