Doubly robust augmented-estimating-equations estimation with nonignorable nonresponse data


Abstract

The problem of nonignorable nonresponse data is ubiquitous in medical and social science studies. Analyses that rely solely on the missing-at-random assumption may lead to biased results. Various debiasing methods have been studied extensively in the literature, particularly doubly robust (DR) estimators. We propose DR augmented-estimating-equations (AEE) estimators of the mean response that enjoy the double-robustness property under correct specification of the log odds ratio model. An advantage of DR AEE estimators is that they can efficiently use the completely observed covariates to improve the estimation efficiency of existing DR estimators with nonignorable nonresponse data. We propose a model selection criterion that consistently selects the correct parametric form of the log odds ratio model from a group of candidate models. Moreover, the correctness of the required working models can be evaluated via straightforward goodness-of-fit tests. Simulation results indicate that the DR AEE estimators are very robust to misspecification of the baseline outcome density model or the baseline response model and dominate other competitors in terms of mean squared error. The analysis of a real dataset illustrates the flexibility and usefulness of the proposed methods.


References

  • Choi JY, Lee MJ (2017) Regression discontinuity: review with extensions. Stat Pap 58:1217–1246

  • D’Haultfoeuille X (2010) A new instrumental method for dealing with endogenous selection. J Econom 154:1–15

  • Fang F, Shao J (2016) Model selection with nonignorable nonresponse. Biometrika 103:861–874

  • Guan Z, Qin J (2017) Empirical likelihood method for non-ignorable missing data problems. Lifetime Data Anal 23:113–135

  • Hall AR (2005) Generalized method of moments. Oxford University Press, Oxford

  • Han P (2014) Multiply robust estimation in regression analysis with missing data. J Am Stat Assoc 109(507):1159–1173

  • Kang JD, Schafer JL (2007) Demystifying double robustness: a comparison of alternative strategies for estimating a population mean from incomplete data. Stat Sci 22:523–539

  • Kim JK, Yu CL (2011) A semiparametric estimation of mean functionals with nonignorable missing data. J Am Stat Assoc 106(493):157–165

  • Little RJA, Rubin DB (2002) Statistical inference with missing data, 2nd edn. Wiley series in probability and statistics. Wiley, New York

  • Miao W, Tchetgen EJ (2016) On varieties of doubly robust estimators under missingness not at random with a shadow variable. Biometrika 103(2):475–482

  • Miao W, Tchetgen EJ, Geng Z (2015) Identification and doubly robust estimation of data missing not at random with an ancillary variable. http://biostats.bepress.com/harvardbiostat/paper189

  • Morikawa K, Kim JK (2016) Semiparametric adaptive estimation with nonignorable nonresponse data. arXiv:1612.09207 [stat.ME]

  • Robins JM, Rotnitzky A, Zhao LP (1994) Estimation of regression coefficients when some regressors are not always observed. J Am Stat Assoc 89:846–866

  • Robins J, Sued M, Lei-Gomez Q, Rotnitzky A (2007) Comment: performance of double-robust estimators when “inverse probability” weights are highly variable. Stat Sci 22:544–559

  • Shao J, Wang L (2016) Semiparametric inverse propensity weighting for nonignorable missing data. Biometrika 103(1):175–187

  • Tang G, Little RJA, Raghunathan TE (2003) Analysis of multivariate missing data with nonignorable nonresponse. Biometrika 90(4):747–764

  • Tang N, Zhao P, Zhu H (2014) Empirical likelihood for estimating equations with nonignorably missing data. Stat Sin 24(2):723–747

  • Tang N, Zhao P, Qu A, Jiang D (2017) Semiparametric estimating equations inference with nonignorable nonresponse. Stat Sin. https://doi.org/10.5705/ss.202015.0052

  • Tsiatis AA (2006) Semiparametric theory and missing data. Springer Series in Statistics. Springer, New York

  • Wang S, Shao J, Kim JK (2014) An instrumental variable approach for identification and estimation with nonignorable nonresponse. Stat Sin 24(3):1097–1116

  • Zhao J, Shao J (2015) Semiparametric pseudo-likelihoods in generalized linear models with nonignorable missing data. J Am Stat Assoc 110(512):1577–1590

Acknowledgements

We are grateful to the two reviewers and the editor for a number of constructive and helpful comments and suggestions that have clearly improved our manuscript. Xiaohui Yuan was partly supported by the NSFC (No. 11571051, 11671054, 11701043).

Author information

Correspondence to Tianqing Liu or Xiaohui Yuan.

Appendix

To establish all the large sample properties in this paper, we require the following conditions:

  1. C1

    \(t_i=(w_i^\textsf {T}, y_i^\textsf {T},r_i^\textsf {T})^\textsf {T}\), \(i=1,\ldots ,n\), are independent and identically distributed, where \(w_i=(x_i^\textsf {T},z_i^\textsf {T})^\textsf {T}\).

  2. C2

    (a) ; (b) ; (c) \(0<E[\text{ OR }(y|x)|r=1,x]<+\infty \), \(0<E[\text{ OR }(y|x)|r=1,w]<+\infty \), and \(\text{ OR }(y=0|x)=0\), where \(\text{ OR }(y|x)=\log \left\{ \frac{\text{ pr }(r=0|y,x)\text{ pr }(r=1|y=0,x)}{\text{ pr }(r=0|y=0,x)\text{ pr }(r=1|y,x)}\right\} \); (d) \(E(y^2)<+\infty \).

  3. C3

    Either \(f (y |r =1, w)\) or \(f (y |r =0, w)\) follows the location-scale model

    $$\begin{aligned} f(y|w,r)=\frac{1}{\sigma _r(w)}f_r\left\{ \frac{y-\mu _r(w)}{\sigma _r(w)}\right\} , r=0,1, \end{aligned}$$

with unrestricted functions \(\mu _r\) and \(\sigma _r\), and density functions \(f_r\), and the corresponding density function \(f_{r=1}\) or \(f_{r=0}\) satisfies the following conditions: (a) the characteristic function \(\varphi (t)\) of the density function f(v) satisfies \(0<|\varphi (t)|<C\exp (-\delta |t|)\) for \(t\in R\) and some constants \(C, \delta >0\); (b) conditional on x, \(\mu (z,x)\) and \(\sigma (z,x)\) are continuously differentiable and integrable with respect to z; f(v) is continuously differentiable, and \(\int _{-\infty }^{+\infty }|v\times \partial f (v)/\partial v|^2dv\) is finite; (c) there exists a linear one-to-one mapping \(M :f \{(v-a)/b\}\mapsto c(t, a, b)\) and some value \(-\infty \le t_0\le +\infty \) such that \(\lim _{t\rightarrow t_0}c(t, a, b)/c(t, a', b')\) either equals zero or infinity for any \(a, a' \in \mathbb {R}\), \(b, b'>0\) with \((a, b)\ne (a',b')\).

  4. C4

The response probability \(\pi (x, y; \alpha ^*,\gamma ^0)\) is bounded below. That is, \(\pi (x, y; \alpha ^*,\gamma ^0) > c_0\) for some \(c_0 > 0\) and all \((x, y)\).

  5. C5

Define \(\vartheta =(\mu ^\textsf {T},\theta ^\textsf {T})^\textsf {T}\), \(\vartheta ^*=(\mu ^{0},\theta ^{*\textsf {T}})^\textsf {T}\) and \(b(t,\vartheta )=(b_1(t; \vartheta ),\ldots ,b_s(t; \vartheta ))^\textsf {T}=(d_1(t;\theta )-\mu ,h^\textsf {T}(t;\theta ))^\textsf {T}\), where \(\{b_1(t; \vartheta ),\ldots ,b_s(t; \vartheta )\}\) is a set of functionally independent estimating functions. There exists a unique \(\vartheta ^*\) such that \(E\{b(t; \vartheta ^*)\}=0\), the matrix \(E\{\partial b(t; \vartheta ^*)/\partial \vartheta ^\textsf {T}\}\) is of full rank and the matrix \(E\{b^{\otimes 2}(t; \vartheta ^*)\}\) is positive definite, and \(\partial b(t; \vartheta )/\partial \vartheta ^\textsf {T}\) and \(\partial ^2 b(t; \vartheta )/\partial \vartheta \partial \vartheta ^\textsf {T}\) are continuous in a neighborhood of \(\vartheta ^*\). Furthermore, \(\Vert \partial ^2 b(t; \vartheta )/\partial \vartheta \partial \vartheta ^\textsf {T}\Vert \), \(\Vert b(t; \vartheta )\Vert ^3\), and \(\Vert \partial b(t; \vartheta )/\partial \vartheta ^\textsf {T}\Vert \) are bounded by some integrable function in this neighborhood.

Most of the above conditions are assumed for the missing data mechanism and the identification of the joint distribution of \((w, y, r)\). Additional conditions are on the moment restrictions. Conditions C1–C3 guarantee the identifiability of the joint distribution of \((w, y, r)\). C4 implies that the outcome y cannot be missing with probability 1 anywhere in the domain of \((x, y)\). Condition C5 imposes some conditions on the moment restriction \(E\{b(t,\vartheta ^*)\}=0\).
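For intuition, consider an illustrative special case (an expository assumption, not a model prescribed in the paper): a logistic response model with \(\text{logit}\,\text{ pr }(r=1|y,x)=\alpha _0+\alpha _1x+\gamma y\). Substituting into the definition of the log odds ratio in C2 gives

$$\begin{aligned} \text{ OR }(y|x)=\log \left\{ \frac{\text{ pr }(r=0|y,x)\text{ pr }(r=1|y=0,x)}{\text{ pr }(r=0|y=0,x)\text{ pr }(r=1|y,x)}\right\} =-\gamma y, \end{aligned}$$

which satisfies \(\text{ OR }(y=0|x)=0\) and does not involve the baseline parameters \((\alpha _0,\alpha _1)\). This separation of the log odds ratio from the baseline response model is what permits the baseline working models to be misspecified while the odds ratio model remains correctly specified.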

Proof of Theorem 1

The properties (1) and (3) are obvious. We only prove property (2). By direct calculations, we have

$$\begin{aligned} \text{ var }({\hat{\mu }}_N)= & {} \text{ var }({\hat{\mu }}_M)+\text{ var }[\{1-\pi ^{-1}(w, y)r\}\{N(w)-M(w)\}]\\&+\,2\text{ cov }[{\hat{\mu }}_M,\{1-\pi ^{-1}(w, y)r\}\{N(w)-M(w)\}]\\= & {} \text{ var }({\hat{\mu }}_M)+E[\{1-\pi ^{-1}(w, y)r\}^2\{N(w)-M(w)\}^2]\\&+\,2E[\pi ^{-1}(w,y)ry\{1-\pi ^{-1}(w, y)\}\{N(w)-M(w)\}]\\&+\,2E[\{1-\pi ^{-1}(w, y)r\}^2M(w)\{N(w)-M(w)\}]\\= & {} \text{ var }({\hat{\mu }}_M)+E[\{\pi ^{-1}(w, y)- 1\}\{N^2(w)-M^2(w)\}]\\&+\,2E[y\{1-\pi ^{-1}(w, y)\}\{N(w)-M(w)\}]. \end{aligned}$$

Moreover,

$$\begin{aligned}&E[\{\pi ^{-1}(w, y)- 1\}y|w]=E[\pi ^{-1}(w, y)(1-r)y|w]\\= & {} \text{ pr }(r=1|w)E[\pi ^{-1}(w, y)(1-r)y|r=1,w]\\&+\,\text{ pr }(r=0|w)E[\pi ^{-1}(w, y)(1-r)y|r=0,w]\\= & {} \text{ pr }(r=0|w)E[\pi ^{-1}(w, y)y|r=0,w]. \end{aligned}$$

Similarly, \(E[\{\pi ^{-1}(w, y)- 1\}|w]=\text{ pr }(r=0|w)E[\pi ^{-1}(w, y)|r=0,w]\). Thus,

$$\begin{aligned}&E[\{\pi ^{-1}(w, y)- 1\}\{N^2(w)-M^2(w)\}]\\&+\,2E[y\{1-\pi ^{-1}(w, y)\} \{N(w)-M(w)\}]\\= & {} E(\text{ pr }(r=0|w)\{N(w)-M(w)\}\\&\times \,[\{N(w)+M(w)\}E\{\pi ^{-1}(w, y)|r=0,w\}-2E\{\pi ^{-1}(w, y) y|r=0,w\}])\\= & {} E(\text{ pr }(r=0|w)E\{\pi ^{-1}(w, y)|r=0,w\}\{N(w)-M(w)\}[\{N(w)+M(w)\}\\&-2N_*(w)]), \end{aligned}$$

where \(N_*(w)=E\{\pi ^{-1}(w, y)y|r=0,w\}/E\{\pi ^{-1}(w, y)|r=0,w\}\). The desired result follows by letting \(N(w)=N_*(w)\). \(\square \)
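To make the comparison in Theorem 1 concrete, the following minimal Python sketch evaluates the augmented inverse-probability-weighted mean for a generic augmentation function \(N(w)\). The algebraic form is read off from the decomposition in the proof; the input arrays are hypothetical fitted quantities, not functions defined in the paper.

```python
import numpy as np

# A minimal sketch (not the paper's code) of the augmented IPW mean
#   hat{mu}_N = n^{-1} sum_i [ pi^{-1}(w_i, y_i) r_i y_i
#                              + {1 - pi^{-1}(w_i, y_i) r_i} N(w_i) ].
def mu_hat_N(y, r, pi_inv, N_w):
    # y: outcomes, filled with any finite value (e.g. 0) where r == 0,
    #    since those entries are multiplied by r and drop out;
    # r: response indicators in {0, 1};
    # pi_inv: fitted values of pi^{-1}(w_i, y_i), also arbitrary where r == 0;
    # N_w: augmentation values N(w_i).
    return np.mean(pi_inv * r * y + (1.0 - pi_inv * r) * N_w)
```

By property (2), the variance-minimizing choice of augmentation is \(N(w)=N_*(w)=E\{\pi ^{-1}(w, y)y|r=0,w\}/E\{\pi ^{-1}(w, y)|r=0,w\}\).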

Lemma A.1

Under the regularity condition C2, we have

$$\begin{aligned} f(y|r =0,w)=\frac{\exp \{\text{ OR }(y|x)\}}{E[\exp \{\text{ OR }(y|x)\}|r=1,w]}f(y|r =1,w). \end{aligned}$$

Proof of Lemma A.1

On one hand, from (5), we have

$$\begin{aligned} f(z|r =0,x)= & {} \frac{\int \exp \{\text{ OR }(y|x)\}f(z,y|r =1,x)dy}{E[\exp \{\text{ OR }(y|x)\}|r=1,x]}\\= & {} \frac{\int \exp \{\text{ OR }(y|x)\}f(y|r =1,w)dy f(z|r =1,x)}{E[\exp \{\text{ OR }(y|x)\}|r=1,x]}\\= & {} \frac{E[\exp \{\text{ OR }(y|x)\}|r=1,w] f(z|r =1,x)}{E[\exp \{\text{ OR }(y|x)\}|r=1,x]}. \end{aligned}$$

On the other hand, from (5), it follows that

$$\begin{aligned}&f(y|r =0,w)f(z|r =0,x)\\&\quad =\frac{\exp \{\text{ OR }(y|x)\}}{E[\exp \{\text{ OR }(y|x)\}|r=1,x]}f(y|r =1,w)f(z|r =1,x). \end{aligned}$$

Thus,

$$\begin{aligned} f(y|r =0,w)= & {} \frac{f(z|r =1,x)}{f(z|r =0,x)}\frac{\exp \{\text{ OR }(y|x)\}}{E[\exp \{\text{ OR }(y|x)\}|r=1,x]}f(y|r =1,w)\\= & {} \frac{E[\exp \{\text{ OR }(y|x)\}|r=1,x]}{E[\exp \{\text{ OR } (y|x)\}|r=1,w]}\frac{\exp \{\text{ OR }(y|x)\}}{E[\exp \{\text{ OR } (y|x)\}|r=1,x]}\\&f(y|r =1,w)\\= & {} \frac{\exp \{\text{ OR }(y|x)\}}{E[\exp \{\text{ OR }(y|x)\}|r=1,w]}f(y|r =1,w). \end{aligned}$$

\(\square \)
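Lemma A.1 states that \(f(y|r=0,w)\) is an exponential tilting of \(f(y|r=1,w)\), which is easy to check numerically. The sketch below assumes, purely for illustration, a standard normal baseline density and \(\text{ OR }(y|x)=\gamma y\); completing the square shows that the tilted density is then \(N(\gamma ,1)\).

```python
import numpy as np
from scipy import stats

# Hypothetical setup for a numerical check of Lemma A.1:
# f(y | r = 1, w) = N(0, 1) and OR(y | x) = gamma * y.
gamma = 0.5
y = np.linspace(-8.0, 8.0, 4001)
f1 = stats.norm.pdf(y)                    # f(y | r = 1, w)
tilt = np.exp(gamma * y)                  # exp{OR(y | x)}
# Lemma A.1: f(y | r = 0, w) = tilt * f1 / E[tilt | r = 1, w]
f0 = tilt * f1 / np.trapz(tilt * f1, y)
# Completing the square: the tilted density equals N(gamma, 1).
assert np.allclose(f0, stats.norm.pdf(y, loc=gamma), atol=1e-4)
```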

Lemma A.2

Under regularity conditions C1–C5, suppose that the odds ratio model \(\varsigma (x,y;\gamma )=\exp \{\text{ OR }(y|x;\gamma )\}\) is correct, that the equation \(E\{h(t;\theta )\}=0\) has a unique solution \(\theta ^*\), and that \(({\hat{\alpha }},{\hat{\gamma }},{\hat{\beta }})\) solves Eq. (7). Then, for any square integrable vector function \(\varphi (w, y)\) and scalar function \(\kappa (x)\):

  1. (i)

    if \(\text{ pr }(r=1|y=0,x;\alpha )\) is correct, then

    $$\begin{aligned} n^{-1}\sum _{i=1}^n\{1-\pi ^{-1}(x_i,y_i;{\hat{\alpha }}, {\hat{\gamma }})r_i\}\varphi (w_i, y_i) \end{aligned}$$

    converges to zero in probability;

  2. (ii)

    if \(f(y,z|r=1,x;\beta )\) is correct, then

$$\begin{aligned} n^{-1}\sum _{i=1}^n[r_i\varsigma (x_i,y_i;{\hat{\gamma }}) \kappa (x_i)\{\varphi (w_i, y_i)-\varphi _\bullet (x_i;{\hat{\gamma }},{\hat{\beta }})\}] \end{aligned}$$

    converges to zero in probability, where \(\varphi _\bullet (x;\gamma ,\beta )=E\{\varphi (w, y) |r =0, x; \beta \}\);

  3. (iii)

if either \(\text{ pr }(r=1|y=0,x;\alpha )\) or \(f(y,z|r=1,x;\beta )\) is correct, then

    $$\begin{aligned} n^{-1}\sum _{i=1}^n[\{r_i\pi ^{-1}(x_i,y_i;{\hat{\alpha }},{\hat{\gamma }}) -1\}\{\varphi (w_i, y_i )-\varphi _\bullet (x_i;{\hat{\gamma }},{\hat{\beta }})\}] \end{aligned}$$

    converges to zero in probability.

Proof of Lemma A.2

See the proof of Lemma A1 in Miao and Tchetgen (2016). \(\square \)

Lemma A.3

Under regularity conditions C1–C5, suppose that the odds ratio model \(\varsigma (x,y;\gamma )=\exp \{\text{ OR }(y|x;\gamma )\}\) is correct, that the equation \(E\{h(t;\theta )\}=0\) has a unique solution \(\theta ^*\), and that \(({\hat{\alpha }},{\hat{\gamma }},{\hat{\beta }})\) solves Eq. (7). Then, for any square integrable vector function \(\psi (w, y)\) and scalar function \(\kappa (w)\):

  1. (i)

    if \(\text{ pr }(r=1|y=0,x;\alpha )\) is correct, then

    $$\begin{aligned} n^{-1}\sum _{i=1}^n\{1-\pi ^{-1}(x_i,y_i;{\hat{\alpha }},{\hat{\gamma }})r_i\}\psi (w_i, y_i) \end{aligned}$$

    converges to zero in probability;

  2. (ii)

    if \(f(y,z|r=1,x;\beta )\) is correct, then

    $$\begin{aligned} n^{-1}\sum _{i=1}^n[r_i\varsigma (x_i,y_i;{\hat{\gamma }}) \kappa (w_i)\{\psi (w_i, y_i)-\psi _\bullet (w_i;{\hat{\gamma }},{\hat{\beta }})\}] \end{aligned}$$

    converges to zero in probability, where \(\psi _\bullet (w;\gamma ,\beta )=E\{\psi (w, y) |r =0, w; \beta \}\);

  3. (iii)

    if either \(\text{ pr }(r=1|y=0,x;\alpha )\) or \(f(y,z|r=1,x;\beta )\) is correct, then

    $$\begin{aligned} n^{-1}\sum _{i=1}^n[\{r_i\pi ^{-1}(x_i,y_i;{\hat{\alpha }}, {\hat{\gamma }})-1\}\{\psi (w_i, y_i )-\psi _\bullet (w_i;{\hat{\gamma }},{\hat{\beta }})\}] \end{aligned}$$

    converges to zero in probability.

Proof of Lemma A.3

The proof is very similar to that of Lemma A1 of Miao and Tchetgen (2016). We use starred characters \((\alpha ^*,\gamma ^*,\beta ^*)\) to denote probability limits of \(({\hat{\alpha }},{\hat{\gamma }},{\hat{\beta }})\), and use \((\alpha ^0,\gamma ^0,\beta ^0)\) to denote true values of the nuisance parameters.

  1. (i)

    If \(\text{ pr }(r=1|y=0,x;\alpha )\) is correct, by the proof of (i) of Lemma A.2, \(({\hat{\alpha }},{\hat{\gamma }},{\hat{\beta }})\) must converge to \((\alpha ^0,\gamma ^0,\beta ^*)\). Thus, for any square integrable function \(\psi (w, y)\),

    $$\begin{aligned} n^{-1}\sum _{i=1}^n\{1-\pi ^{-1}(x_i,y_i;{\hat{\alpha }}, {\hat{\gamma }})r_i\}\psi (w_i, y_i) \end{aligned}$$

    converges to \(E[\{r\pi ^{-1}(x,y;\alpha ^0,\gamma ^0)-1\}\psi (w, y)]\), which equals zero.

  2. (ii)

    For any square integrable function \(\psi (w, y)\), from Lemma A.1, we can prove that

    $$\begin{aligned} \psi _\bullet (w;\gamma ^0,\beta ^0)= & {} E\{\psi (w, y) |r =0, w; \beta ^0\}\\= & {} \frac{E\{r\varsigma (x,y;\gamma ^0)\psi (w, y)|w; \beta ^0\}}{E\{r\varsigma (x,y;\gamma ^0)|w;\beta ^0\}}. \end{aligned}$$

    Thus,

    $$\begin{aligned} E\{r\varsigma (x,y;\gamma ^0)\psi (w, y)|w;\beta ^0\}=E\{r\varsigma (x,y;\gamma ^0)|w;\beta ^0\} \psi _\bullet (w;\gamma ^0,\beta ^0), \end{aligned}$$

    and

    $$\begin{aligned} E\{r\varsigma (x,y;\gamma ^0)\{\psi (w, y)-\psi _\bullet (w;\gamma ^0,\beta ^0)\}|w;\beta ^0\}=0. \end{aligned}$$

Then for any square integrable scalar function \(\kappa (w)\),

    $$\begin{aligned} E\{r\varsigma (x,y;\gamma ^0)\kappa (w)\{\psi (w, y)-\psi _\bullet (w;\gamma ^0,\beta ^0)\}|w;\beta ^0\}=0, \end{aligned}$$

    and by the law of iterated expectation

$$\begin{aligned} E[r\varsigma (x,y;\gamma ^0)\kappa (w)\{\psi (w, y)-\psi _\bullet (w;\gamma ^0,\beta ^0)\}]=0. \end{aligned}$$

    Moreover, by Lemma A.1, it follows that

    $$\begin{aligned} E[(1-r)\{\psi (w, y)-\psi _\bullet (w;\gamma ^0,\beta ^0)\}]=0. \end{aligned}$$

    From Eq. (4), \(\{\pi ^{-1}(x,y;\alpha ^*,\gamma ^0)-1\}r=r\varsigma (x,y; \gamma ^0)v(x,\alpha ^*)\) with \(v(x,\alpha ^*)=\text{ pr }(r = 0 | y = 0, x; \alpha ^*)/\text{ pr }(r = 1| y = 0, x; \alpha ^*)\). Thus,

    $$\begin{aligned} E[\{\pi ^{-1}(x,y;\alpha ^*,\gamma ^0)r-1\}\{\psi (w, y)-\psi _\bullet (w;\gamma ^0,\beta ^0)\}]=0. \end{aligned}$$

By the proof of (ii) of Lemma A.2, if \(f(y,z|r=1,x;\beta )=f(y|r=1,x;\zeta )f(z|x,y;\eta )\) is correct, then \(({\hat{\alpha }},{\hat{\gamma }},{\hat{\beta }})\) must converge to \((\alpha ^*,\gamma ^0,\beta ^0)\). Thus, for any square integrable function \(\psi (w, y)\),

    $$\begin{aligned}&\lim _{n\rightarrow \infty }n^{-1}\sum _{i=1}^n[\{\pi ^{-1}(x_i,y_i; {\hat{\alpha }},{\hat{\gamma }})r_i-1\}\{\psi (w_i, y_i)-\psi _\bullet (w_i;{\hat{\gamma }},{\hat{\beta }})\}]\\= & {} E[\{\pi ^{-1}(x,y;\alpha ^*,\gamma ^0)r-1\}\{\psi (w, y)-\psi _\bullet (w;\gamma ^0,\beta ^0)\}]=0. \end{aligned}$$
  3. (iii)

    If \(\text{ pr }(r=1|y=0,x;\alpha )\) is correct, the result is implied by (i). If \(f(y,z|r=1,x;\beta )\) is correct, the result is implied by (ii). \(\square \)
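The identity from Eq. (4) used in part (ii), \(\{\pi ^{-1}(x,y;\alpha ,\gamma )-1\}r=r\varsigma (x,y;\gamma )v(x,\alpha )\), also shows how the response probability is assembled from the two working models. A minimal sketch, assuming a logistic baseline response model and a linear-in-\(y\) log odds ratio (both illustrative parametrizations, not forms fixed by the paper):

```python
import numpy as np

def expit(u):
    return 1.0 / (1.0 + np.exp(-u))

# Illustrative working models (assumptions for this sketch only):
#   pr(r = 1 | y = 0, x; alpha) = expit(alpha[0] + alpha[1] * x),
#   OR(y | x; gamma) = gamma * y, so varsigma(x, y; gamma) = exp(gamma * y).
def response_prob(x, y, alpha, gamma):
    """pi(x, y; alpha, gamma), obtained by inverting the identity
    pi^{-1} - 1 = varsigma * v with v = pr(r=0|y=0,x) / pr(r=1|y=0,x)."""
    p0 = expit(alpha[0] + alpha[1] * x)  # baseline response probability
    v = (1.0 - p0) / p0                  # baseline odds of nonresponse
    varsigma = np.exp(gamma * y)         # odds ratio factor
    return 1.0 / (1.0 + varsigma * v)
```

At \(y=0\) this reduces to the baseline probability, consistent with \(\text{ OR }(y=0|x)=0\) in condition C2.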

Proof of Theorem 2

The proof is very similar to that of Theorem 1 of Miao and Tchetgen (2016).

  1. (1)

Double robustness of \({\tilde{\mu }}_1\). If either of the baseline models is correct, from (iii) of Lemma A.3, \(n^{-1}\sum _{i=1}^n[\{\pi ^{-1}(x_i,y_i;{\hat{\alpha }},{\hat{\gamma }})r_i-1\}\{y_i-N_0(w_i; {\hat{\gamma }},{\hat{\beta }})\}]\) converges to zero in probability; therefore

    $$\begin{aligned} n^{-1}\sum _{i=1}^n[\pi ^{-1}(x_i,y_i;{\hat{\alpha }},{\hat{\gamma }}) r_i\{y_i-N_0(w_i; {\hat{\gamma }},{\hat{\beta }})\}+N_0(w_i; {\hat{\gamma }},{\hat{\beta }})] \end{aligned}$$

    converges to the true outcome mean \(\mu ^0\).

  2. (2)

    Double robustness of \({\tilde{\mu }}_2\). From (i) of Lemma A.3, if the baseline propensity score model is correct,

    $$\begin{aligned}&n^{-1}\sum _{i=1}^n\{\pi _{ext}^{-1}(x_i,y_i;{\hat{\alpha }}, {\hat{\gamma }},\lambda =0)r_i-1\}\{N_0(w_i;{\hat{\gamma }}, {\hat{\beta }})-{\tilde{\mu }}_{reg}\} \\= & {} n^{-1}\sum _{i=1}^n\{\pi ^{-1}(x_i, y_i; {\hat{\alpha }}, {\hat{\gamma }})r_i-1\}\{N_0(w_i;{\hat{\gamma }},{\hat{\beta }}) -{\tilde{\mu }}_{reg}\} \end{aligned}$$

converges to zero, i.e., \(\lambda =0\) is a solution of the probability limit of Eq. (19). Thus, the solution \({\tilde{\lambda }}\) of Eq. (19) converges to zero, and

    $$\begin{aligned}&\lim _{n\rightarrow +\infty }n^{-1}\sum _{i=1}^n\pi _{ext}^{-1} (x_i,y_i;{\hat{\alpha }},{\hat{\gamma }},{\tilde{\lambda }})r_i=1,\\&\lim _{n\rightarrow +\infty }n^{-1}\sum _{i=1}^n\pi _{ext}^{-1} (x_i,y_i;{\hat{\alpha }},{\hat{\gamma }},{\tilde{\lambda }})r_iy_i\\&\quad = \lim _{n\rightarrow +\infty }n^{-1}\sum _{i=1}^n\pi ^{-1}(x_i, y_i; {\hat{\alpha }}, {\hat{\gamma }})r_iy_i=\mu ^0. \end{aligned}$$

If the baseline outcome model is correct, \(n^{-1}\sum _{i=1}^n[(1-r_i)\{y_i-N_0(w_i;{\hat{\gamma }},{\hat{\beta }})\}]\) converges to zero; \({\tilde{\mu }}_{reg}=n^{-1}\sum _{i=1}^n\{(1- r_i)N_0(w_i;{\hat{\gamma }},{\hat{\beta }}) + r_i y_i\}\) converges to \(\mu ^0\); and \(n^{-1}\sum _{i=1}^n(y_i-{\tilde{\mu }}_{reg})\) converges to zero. By definition of the extended propensity score, \(\{\pi _{ext}^{-1}(x,y;{\hat{\alpha }},{\hat{\gamma }},{\tilde{\lambda }})-1\}r=r\varsigma (x,y;{\hat{\gamma }})v(x,{\hat{\alpha }},{\tilde{\lambda }})\) with \(v(x,{\hat{\alpha }},{\tilde{\lambda }})=\text{ pr }_{ext}(r = 0 | y = 0, x;{\hat{\alpha }}, {\tilde{\lambda }})/\text{ pr }_{ext}(r = 1| y = 0, x;{\hat{\alpha }}, {\tilde{\lambda }})\). From (ii) of Lemma A.3, \(n^{-1}\sum _{i=1}^n[r_i\{\pi _{ext}^{-1}(x_i,y_i;{\hat{\alpha }}, {\hat{\gamma }},{\tilde{\lambda }})-1\}\{y_i-N_0(w_i;{\hat{\gamma }}, {\hat{\beta }})\}]\) converges to zero. Thus,

    $$\begin{aligned} n^{-1}\sum _{i=1}^n[\{\pi _{ext}^{-1}(x_i,y_i;{\hat{\alpha }}, {\hat{\gamma }},{\tilde{\lambda }})r_i-1\}\{y_i-N_0(w_i;{\hat{\gamma }}, {\hat{\beta }})\}] \end{aligned}$$

    must converge to zero, and

    $$\begin{aligned} {\tilde{\mu }}_2= & {} \left\{ \sum _{i=1}^n\pi _{ext}^{-1}(x_i,y_i;{\hat{\alpha }} ,{\hat{\gamma }},{\tilde{\lambda }})r_i\right\} ^{-1}\\&\sum _{i=1}^n [\{\pi _{ext}^{-1}(x_i,y_i;{\hat{\alpha }},{\hat{\gamma }}, {\tilde{\lambda }})r_i-1\}\{y_i-N_0(w_i;{\hat{\gamma }},{\hat{\beta }})\}]\\&+\left\{ \sum _{i=1}^n\pi _{ext}^{-1}(x_i,y_i;{\hat{\alpha }} ,{\hat{\gamma }},{\tilde{\lambda }})r_i\right\} ^{-1} \\&\sum _{i=1}^n [\{\pi _{ext}^{-1}(x_i,y_i;{\hat{\alpha }},{\hat{\gamma }}, {\tilde{\lambda }})r_i-1\}\{N_0(w_i;{\hat{\gamma }},{\hat{\beta }}) -{\tilde{\mu }}_{reg}\}]\\&+\left\{ \sum _{i=1}^n\pi _{ext}^{-1}(x_i,y_i;{\hat{\alpha }} ,{\hat{\gamma }},{\tilde{\lambda }})r_i\right\} ^{-1}\sum _{i=1}^n (y_i-{\tilde{\mu }}_{reg})+{\tilde{\mu }}_{reg} \end{aligned}$$

    converges to \(\mu ^0\) in probability.

  3. (3)

    Double robustness of \({\tilde{\mu }}_3\). If the baseline propensity score model is correct, from (i) of Lemma A.3,

    $$\begin{aligned} n^{-1}\sum _{i=1}^n\{\pi ^{-1}(x_i, y_i; {\hat{\alpha }}, {\hat{\gamma }})r_i-1\}\{y_i-N_{0ext}(w_i;{\hat{\gamma }},{\hat{\beta }}, {\tilde{\rho }})\} \end{aligned}$$

converges to zero. By Eq. (21), \(n^{-1}\sum _{i=1}^n(1-r_i)\{y_i-N_{0ext}(w_i;{\hat{\gamma }},{\hat{\beta }}, {\tilde{\rho }})\}\) converges to zero. Thus,

    $$\begin{aligned} {\tilde{\mu }}_3=n^{-1}\sum _{i=1}^n\{r_iy_i+(1-r_i)N_{0ext} (w_i;{\hat{\gamma }},{\hat{\beta }}, {\tilde{\rho }})\} \end{aligned}$$

    converges to \(\mu ^0\). If the baseline outcome model is correct, \(n^{-1}\sum _{i=1}^n(1-r_i)\{y_i-N_0(w_i; {\hat{\gamma }},{\hat{\beta }})\}\) converges to zero. Since

    $$\begin{aligned} \{\pi ^{-1}(x,y;{\hat{\alpha }},{\hat{\gamma }})-1\}r=r \varsigma (x,y;{\hat{\gamma }})v(x,{\hat{\alpha }}) \end{aligned}$$

    with \(v(x,{\hat{\alpha }})=\text{ pr }(r = 0 | y = 0, x; {\hat{\alpha }})/\text{ pr }(r = 1| y = 0, x; {\hat{\alpha }})\), from (ii) of Lemma A.3,

    $$\begin{aligned}&n^{-1}\sum _{i=1}^n\{\pi ^{-1}(x_i, y_i; {\hat{\alpha }}, {\hat{\gamma }})-1\}r_i\{y_i-N_{0ext}(w_i;{\hat{\gamma }},{\hat{\beta }}, \rho =0)\} \\= & {} n^{-1}\sum _{i=1}^n\{\pi ^{-1}(x_i, y_i; {\hat{\alpha }}, {\hat{\gamma }})-1\}r_i\{y_i-N_0(w_i; {\hat{\gamma }},{\hat{\beta }})\} \end{aligned}$$

converges to zero. That is, \(\rho =0\) is a solution of the probability limit of Eq. (21). Thus, the solution \({\tilde{\rho }}\) of Eq. (21) converges to zero, and

    $$\begin{aligned}&\lim _{n\rightarrow \infty }n^{-1}\sum _{i=1}^n\{r_iy_i+(1-r_i) N_{0ext}(w_i;{\hat{\gamma }},{\hat{\beta }}, {\tilde{\rho }})\} \\= & {} \lim _{n\rightarrow \infty }n^{-1}\sum _{i=1}^n\{r_iy_i+(1-r_i)N_{0}(w_i; {\hat{\gamma }},{\hat{\beta }})\}=\mu ^0. \end{aligned}$$

\(\square \)
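The displays above also yield simple plug-in forms for the three estimators whose double robustness was just established; in particular, summing the three pieces in the decomposition of \({\tilde{\mu }}_2\) shows that the \({\tilde{\mu }}_{reg}\) terms cancel, leaving a normalized (Hájek-type) weighted mean under the extended propensity score. A minimal Python sketch, assuming the fitted arrays pi_inv, pi_ext_inv, N0 and N0ext have already been computed from the working models (hypothetical inputs):

```python
import numpy as np

# tilde{mu}_1: augmented IPW form from part (1) of the proof.
def mu_tilde_1(y, r, pi_inv, N0):
    return np.mean(pi_inv * r * (y - N0) + N0)

# tilde{mu}_2: normalized weighted mean under the extended propensity
# score, obtained by collapsing its decomposition in part (2).
def mu_tilde_2(y, r, pi_ext_inv):
    w = pi_ext_inv * r
    return np.sum(w * y) / np.sum(w)

# tilde{mu}_3: regression-imputation form from part (3) of the proof.
# In all three functions, missing y entries (r == 0) may be filled
# arbitrarily, since they are always multiplied by r.
def mu_tilde_3(y, r, N0ext):
    return np.mean(r * y + (1 - r) * N0ext)
```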

Proof of Theorem 3

Using (7) and a Taylor expansion, it follows that

$$\begin{aligned} {\hat{\theta }}-\theta ^*=-h_\theta ^{-1}n^{-1}\sum _{i=1}^nh (t_i;\theta ^*)+o_p(n^{-1/2}). \end{aligned}$$
(A.1)

Using (A.1) and direct calculations, we have

$$\begin{aligned} {\tilde{\mu }}_1-\mu ^0= & {} n^{-1}\sum _{i=1}^nd_1(t_i;{\hat{\theta }}) -\mu ^0\nonumber \\= & {} n^{-1}\sum _{i=1}^nd_1(t_i;\theta ^*)-\mu ^0+n^{-1}\sum _{i=1}^n \frac{\partial d_1(t_i;\theta ^*)}{\partial \theta ^\textsf {T}} ({\hat{\theta }}-\theta ^*)+ o_p(n^{-1/2})\nonumber \\= & {} n^{-1}\sum _{i=1}^n\{d_1(t_i;\theta ^*)-\mu ^0\}-d_{1,\theta } h_{\theta }^{-1}n^{-1}\sum _{i=1}^nh(t_i;\theta ^*)+ o_p(n^{-1/2})\nonumber \\= & {} n^{-1}\sum _{i=1}^n\{d_1(t_i;\theta ^*)-\mu ^0-d_{1,\theta } h_{\theta }^{-1}h(t_i;\theta ^*)\}+ o_p(n^{-1/2}). \end{aligned}$$

By the central limit theorem, we establish the asymptotic normality of \({\tilde{\mu }}_1\).\(\square \)
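Spelling out the final step: the summands in the last display are i.i.d. with mean zero whenever \({\tilde{\mu }}_1\) is consistent, so with \(h_{\theta }=E\{\partial h(t;\theta ^*)/\partial \theta ^\textsf {T}\}\) and \(d_{1,\theta }=E\{\partial d_1(t;\theta ^*)/\partial \theta ^\textsf {T}\}\),

$$\begin{aligned} \sqrt{n}({\tilde{\mu }}_1-\mu ^0)\rightarrow N(0,\Sigma _1),\quad \Sigma _1=E[\{d_1(t;\theta ^*)-\mu ^0-d_{1,\theta }h_{\theta }^{-1}h(t;\theta ^*)\}^2], \end{aligned}$$

in distribution, and \(\Sigma _1\) can be estimated by the sample second moment of the plug-in summands evaluated at \(({\hat{\theta }},{\tilde{\mu }}_1)\).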

Proof of Theorem 4

Let \(\mu _{reg}(t,\theta )=(1- r)N_0(w;\gamma ,\beta ) + r y\) and \(\mu _{reg}^*=E\{\mu _{reg}(t,\theta ^*)\}\). By a Taylor expansion, we get

$$\begin{aligned} {\tilde{\mu }}_{reg}-\mu _{reg}^*= & {} n^{-1}\sum _{i=1}^n\{\mu _{reg} (t_i,\theta ^*)-\mu _{reg}^*\}+\mu _{reg,\theta }({\hat{\theta }} -\theta ^*)+o_p(n^{-1/2}), \end{aligned}$$

where \(\mu _{reg,\theta }=E\{\partial \mu _{reg}(t,\theta ^*)/\partial \theta ^\textsf {T}\}\). Let

$$\begin{aligned} g_{ht}(t,\lambda ,\theta ,\mu )=\{\pi _{ext}^{-1}(x,y;\alpha , \gamma ,\lambda )r-1\}\{N_0(w;\gamma ,\beta )-\mu \} \end{aligned}$$

and \(\lambda ^*\) be the unique solution to the equation \(E\{g_{ht}(t,\lambda ,\theta ^*,\mu _{reg}^*)\}=0\). Repeated applications of Taylor expansions yield

$$\begin{aligned} 0= & {} n^{-1}\sum _{i=1}^ng_{ht}(t_i,{\tilde{\lambda }},{\hat{\theta }},{\tilde{\mu }}_{reg})\\= & {} n^{-1}\sum _{i=1}^n g_{ht}(t_i,\lambda ^*,\theta ^*,\mu _{reg}^*) +g_{ht,\lambda }({\tilde{\lambda }}-\lambda ^*)+g_{ht,\theta }({\hat{\theta }} -\theta ^*)\\&+\,g_{ht,\mu }({\tilde{\mu }}_{reg}-\mu _{reg}^*)+o_p(n^{-1/2})\\= & {} n^{-1}\sum _{i=1}^n g_{ht}(t_i,\lambda ^*,\theta ^*,\mu _{reg}^*)+g_{ht,\lambda } ({\tilde{\lambda }}-\lambda ^*)+g_{ht,\theta }({\hat{\theta }}-\theta ^*)\\&+\,g_{ht,\mu }n^{-1}\sum _{i=1}^n\{\mu _{reg}(t_i,\theta ^*) -\mu _{reg}^*\}+g_{ht,\mu }\mu _{reg,\theta }({\hat{\theta }} -\theta ^*)+o_p(n^{-1/2}), \end{aligned}$$

where \(g_{ht,\lambda }=E\{\partial g_{ht}(t,\lambda ^*,\theta ^*,\mu _{reg}^*)/\partial \lambda \}, g_{ht,\theta }=E\{\partial g_{ht}(t,\lambda ^*,\theta ^*,\mu _{reg}^*)/\partial \theta ^\textsf {T}\}\) and

$$\begin{aligned} g_{ht,\mu }=E\{\partial g_{ht}(t,\lambda ^*,\theta ^*,\mu _{reg}^*)/\partial \mu \}. \end{aligned}$$

Using (A.1), we can write

$$\begin{aligned} {\tilde{\lambda }}-\lambda ^*= & {} -g_{ht,\lambda }^{-1}n^{-1}\sum _{i=1}^n[g_{ht}(t_i,\lambda ^*,\theta ^*, \mu _{reg}^*)+g_{ht,\mu }\{\mu _{reg}(t_i,\theta ^*)-\mu _{reg}^*\}]\\&-\,g_{ht,\lambda }^{-1}\{g_{ht,\theta }+g_{ht,\mu }\mu _{reg,\theta }\} (\hat{\theta }-\theta ^*)+o_p(n^{-1/2})\\= & {} n^{-1}\sum _{i=1}^n\varphi _{\lambda i}+o_p(n^{-1/2}), \end{aligned}$$

where

$$\begin{aligned} \varphi _{\lambda i}= & {} -g_{ht,\lambda }^{-1}[g_{ht}(t_i,\lambda ^*,\theta ^*,\mu _{reg}^*)+g_{ht,\mu } \{\mu _{reg}(t_i,\theta ^*)-\mu _{reg}^*\}\\&-\{g_{ht,\theta } +g_{ht,\mu }\mu _{reg,\theta }\} h_{\theta }^{-1}h(t_i;\theta ^*)]. \end{aligned}$$

Let

$$\begin{aligned} \delta (t,\theta ,\lambda )= & {} \pi _{ext}^{-1}(x,y;\alpha , \gamma ,\lambda )r,\quad \hat{\delta }=n^{-1}\sum _{i=1}^n\{\pi _{ext}^{-1}(x_i,y_i;{\hat{\alpha }}, {\hat{\gamma }},{\tilde{\lambda }})r_i\},\\ \delta ^*= & {} E\{\pi _{ext}^{-1}(x,y;\alpha ^*, \gamma ^*,\lambda ^*)r\},\quad \ d_2(t,\theta ,\lambda ,\delta )=\delta ^{-1}\pi _{ext}^{-1}(x,y;\alpha , \gamma ,\lambda )ry. \end{aligned}$$

By a Taylor expansion, we get that

$$\begin{aligned} \hat{\delta }-\delta ^*= & {} n^{-1}\sum _{i=1}^n\{\delta (t_i,\theta ^*, \lambda ^*)-\delta ^*\}+\delta _{\theta }(\hat{\theta }-\theta ^*) +\delta _{\lambda }({\tilde{\lambda }}-\lambda ^*)+o_p(n^{-1/2}), \end{aligned}$$

where \(\delta _{\theta }=E\{\partial \delta (t,\theta ^*,\lambda ^*)/\partial \theta ^\textsf {T}\}\) and \(\delta _{\lambda }=E\{\partial \delta (t,\theta ^*,\lambda ^*)/\partial \lambda \}\). Combining previous results, we have

$$\begin{aligned} {\tilde{\mu }}_2-\mu ^0= & {} n^{-1}\sum _{i=1}^nd_2(t_i,\theta ^*,\lambda ^*,\delta ^*)-\mu ^0+d_{2,\theta }(\hat{\theta }-\theta ^*)+d_{2,\lambda }({\tilde{\lambda }}-\lambda ^*)\\&+\,d_{2,\delta }(\hat{\delta }-\delta ^*)+o_p(n^{-1/2})\\= & {} n^{-1}\sum _{i=1}^n[d_2(t_i,\theta ^*,\lambda ^*,\delta ^*)+d_{2,\delta }\{\delta (t_i,\theta ^*,\lambda ^*)-\delta ^*\}]-\mu ^0\\&+\,(d_{2,\theta }+d_{2,\delta }\delta _{\theta })(\hat{\theta }-\theta ^*)+(d_{2,\lambda }+d_{2,\delta }\delta _{\lambda })({\tilde{\lambda }}-\lambda ^*)+o_p(n^{-1/2})\\= & {} n^{-1}\sum _{i=1}^n(d_2(t_i,\theta ^*,\lambda ^*,\delta ^*)-\mu ^0+d_{2,\delta }\{\delta (t_i,\theta ^*,\lambda ^*)-\delta ^*\}\\&-\,(d_{2,\lambda }+d_{2,\delta }\delta _{\lambda })g_{ht,\lambda }^{-1}[g_{ht}(t_i,\lambda ^*,\theta ^*,\mu _{reg}^*)+g_{ht,\mu }\{\mu _{reg}(t_i,\theta ^*)-\mu _{reg}^*\}])\\&+\,[(d_{2,\theta }+d_{2,\delta }\delta _{\theta })-(d_{2,\lambda }+d_{2,\delta }\delta _{\lambda })g_{ht,\lambda }^{-1}\{g_{ht,\theta }+g_{ht,\mu }\mu _{reg,\theta }\}](\hat{\theta }-\theta ^*)\\&+\,o_p(n^{-1/2})\\= & {} n^{-1}\sum _{i=1}^n\psi _{2i}+o_p(n^{-1/2}), \end{aligned}$$

where

$$\begin{aligned} \psi _{2i}= & {} d_2(t_i,\theta ^*,\lambda ^*,\delta ^*)-\mu ^0 +d_{2,\delta }\{\delta (t_i,\theta ^*,\lambda ^*)-\delta ^*\}\\&-\,(d_{2,\lambda }+d_{2,\delta }\delta _{\lambda })g_{ht, \lambda }^{-1}[g_{ht}(t_i,\lambda ^*,\theta ^*,\mu _{reg}^*)+g_{ht,\mu } \{\mu _{reg}(t_i,\theta ^*)-\mu _{reg}^*\}]\\&-\,[(d_{2,\theta }+d_{2,\delta }\delta _{\theta })-(d_{2, \lambda }+d_{2,\delta }\delta _{\lambda })g_{ht,\lambda }^{-1} \{g_{ht,\theta }+g_{ht,\mu }\mu _{reg,\theta }\}]h_{\theta }^{-1}h(t_i;\theta ^*), \end{aligned}$$

\(d_{2,\theta }=E\{\partial d_2(t,\theta ^*,\lambda ^*,\delta ^*)/\partial \theta ^\textsf {T}\}\), \(d_{2,\lambda }=E\{\partial d_2(t,\theta ^*,\lambda ^*, \delta ^*)/\partial \lambda \}\) and \(d_{2,\delta }=E\{\partial d_2(t,\theta ^*, \lambda ^*,\delta ^*)/\partial \delta \}\). By the central limit theorem, we establish the asymptotic normality of \({\tilde{\mu }}_2\). \(\square \)

Proof of Theorem 5

Let \(g_{reg}(t,\rho ,\theta )=\{\pi ^{-1}(x, y; \alpha , \gamma )-1\}r\{y-N_{0ext}(w; \gamma ,\beta ,\rho )\}\) and \(\rho ^*\) be the unique solution to the equation \(E\{g_{reg}(t,\rho ,\theta ^*)\}=0\). A Taylor expansion reveals that

$$\begin{aligned} 0= & {} n^{-1}\sum _{i=1}^ng_{reg}(t_i,{\tilde{\rho }},\hat{\theta })\\= & {} n^{-1}\sum _{i=1}^ng_{reg}(t_i,\rho ^*,\theta ^*)+g_{reg,\rho } ({\tilde{\rho }}-\rho ^*)+g_{reg,\theta }(\hat{\theta }-\theta ^*)+o_p(n^{-1/2}), \end{aligned}$$

where \(g_{reg,\rho }=E\{\partial g_{reg}(t,\rho ^*,\theta ^*)/\partial \rho \}\) and \(g_{reg,\theta }=E\{\partial g_{reg}(t,\rho ^*,\theta ^*)/\partial \theta ^\textsf {T}\}\). Using (A.1), it follows that

$$\begin{aligned} {\tilde{\rho }}-\rho ^*= & {} -g_{reg,\rho }^{-1}n^{-1}\sum _{i=1}^ng_{reg} (t_i,\rho ^*,\theta ^*)-g_{reg,\rho }^{-1}g_{reg,\theta }(\hat{\theta } -\theta ^*)+o_p(n^{-1/2})\\= & {} -g_{reg,\rho }^{-1}n^{-1}\sum _{i=1}^n\{g_{reg}(t_i,\rho ^*,\theta ^*) -g_{reg,\theta }h_{\theta }^{-1}h(t_i;\theta ^*)\}+o_p(n^{-1/2}). \end{aligned}$$

Let \(d_3(t,\theta ,\rho )=ry+(1-r)N_{0ext}(w; \gamma ,\beta ,\rho )\). By direct calculations, we have

$$\begin{aligned} {\tilde{\mu }}_3-\mu ^0= & {} n^{-1}\sum _{i=1}^nd_3(t_i,\theta ^*,\rho ^*) -\mu ^0+d_{3,\theta }(\hat{\theta }-\theta ^*)+d_{3,\rho }(\tilde{\rho }-\rho ^*)+o_p(n^{-1/2})\\= & {} n^{-1}\sum _{i=1}^n\{d_3(t_i,\theta ^*,\rho ^*)-\mu ^0-d_{3,\rho } g_{reg,\rho }^{-1}g_{reg}(t_i,\rho ^*,\theta ^*)\}\\&+\,(d_{3,\theta }-d_{3,\rho }g_{reg,\rho }^{-1}g_{reg,\theta }) (\hat{\theta }-\theta ^*)+o_p(n^{-1/2})\\= & {} n^{-1}\sum _{i=1}^n\psi _{3i}+o_p(n^{-1/2}), \end{aligned}$$

where

$$\begin{aligned} \psi _{3i}= & {} d_3(t_i,\theta ^*,\rho ^*)-\mu ^0-d_{3,\rho } g_{reg,\rho }^{-1}g_{reg}(t_i,\rho ^*,\theta ^*)\\&-(d_{3,\theta } -d_{3,\rho }g_{reg,\rho }^{-1}g_{reg,\theta })h_{\theta }^{-1}h(t_i;\theta ^*), \end{aligned}$$

\(d_{3,\theta }=E\{\partial d_3(t,\theta ^*,\rho ^*)/\partial \theta ^\textsf {T}\}\) and \(d_{3,\rho }=E\{\partial d_3(t,\theta ^*,\rho ^*)/\partial \rho \}\). By the central limit theorem, we establish the asymptotic normality of \({\tilde{\mu }}_3\). \(\square \)
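As in the proof of Theorem 3, the final step here and in the proof of Theorem 4 can be spelled out: \(\psi _{2i}\) and \(\psi _{3i}\) are i.i.d. influence terms with mean zero whenever the corresponding estimator is consistent, so

$$\begin{aligned} \sqrt{n}({\tilde{\mu }}_k-\mu ^0)\rightarrow N(0,E\{\psi _{ki}^2\}),\quad k=2,3, \end{aligned}$$

in distribution, and \(n^{-1}\sum _{i=1}^n{\hat{\psi }}_{ki}^2\), with \({\hat{\psi }}_{ki}\) the plug-in version of \(\psi _{ki}\), gives a natural estimate of the asymptotic variance.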

Proof of Theorem 6

The proof is simple and the details are omitted. \(\square \)


Cite this article

Liu, T., Yuan, X. Doubly robust augmented-estimating-equations estimation with nonignorable nonresponse data. Stat Papers 61, 2241–2270 (2020). https://doi.org/10.1007/s00362-018-1046-5
