Abstract
In the presence of longitudinal data with nonignorable dropouts, we propose improved estimators for generalized partial linear models that accommodate both within-subject correlations and nonignorable missing data. To address the identifiability problem, an instrumental covariate, which is related to the response variable but unrelated to the propensity given the response variable and other covariates, is used to construct sufficient instrumental estimating equations. The nonparametric function is then approximated by B-spline basis functions, and bias-corrected generalized estimating equations are constructed based on inverse probability weighting. To incorporate the within-subject correlations under an informative working correlation structure, we borrow the ideas of quadratic inference functions and the hybrid-GEE to construct improved empirical likelihood procedures. Under some regularity conditions, we establish the asymptotic normality of the proposed estimators of the parametric components and the convergence rate of the estimators of the nonparametric functions. The finite-sample performance of the proposed estimators is studied through simulations, and an application to an HIV CD4 data set is also presented.
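As a toy illustration of the inverse probability weighting idea behind the bias-corrected estimating equations, the sketch below simulates a nonignorable response mechanism in which the probability of observing \(y\) depends on \(y\) itself. This is a minimal hedged sketch, not the paper's GEE procedure: the propensity is taken as known here, whereas in the paper it is unknown and identified through the instrumental covariate.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
y = rng.normal(loc=1.0, scale=1.0, size=n)   # true mean is 1

# Nonignorable mechanism: the observation probability depends on y itself
pi = 1.0 / (1.0 + np.exp(-(0.5 + 0.8 * y)))
r = rng.binomial(1, pi)                      # r = 1 if y is observed

naive = y[r == 1].mean()                     # biased upward: large y observed more often
ipw = np.sum(r * y / pi) / n                 # Horvitz-Thompson type correction
```

Weighting each observed response by the inverse of its observation probability removes the selection bias in the complete-case mean, which is the mechanism the bias-corrected generalized estimating equations exploit.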
References
Bai Y, Fung WK, Zhu Z (2010) Weighted empirical likelihood for generalized linear models with longitudinal data. J Stat Plan Inference 140:3446–3456
Boente G, He X, Zhou J (2006) Robust estimates in generalized partially linear models. Ann Stat 34:2856–2878
Boente G, Rodriguez D (2010) Robust inference in generalized partially linear models. Comput Stat Data Anal 54:2942–2966
Chen B, Zhou X (2013) Generalized partially linear models for incomplete longitudinal data in the presence of population-level information. Biometrics 69:386–395
Chen X, Christensen T (2015) Optimal uniform convergence rates and asymptotic normality for series estimators under weak dependence and weak conditions. J Econom 188:447–465
Cho H, Qu A (2015) Efficient estimation for longitudinal data by combining large-dimensional moment conditions. Electron J Stat 9:1315–1334
Diggle P, Kenward MG (1994) Informative drop-out in longitudinal data analysis (with discussion). J R Stat Soc Ser C (Appl Stat) 43:49–93
Fang F, Shao J (2016) Model selection with nonignorable nonresponse. Biometrika 103:861–874
Fitzmaurice GM, Molenberghs G, Lipsitz SR (1995) Regression models for longitudinal binary responses with informative drop-outs. J R Stat Soc Ser B (Stat Methodol) 57:691–704
Fu L, Wang Y (2012) Quantile regression for longitudinal data with a working correlation model. Comput Stat Data Anal 56:2526–2538
Hansen LP (1982) Large sample properties of generalized method of moments estimators. Econometrica 50:1029–1054
He X, Shi P (1996) Bivariate tensor-product B-splines in a partly linear model. J Multivar Anal 58:162–181
He X, Zhu Z, Fung WK (2002) Estimation in a semiparametric model for longitudinal data with unspecified dependence structure. Biometrika 89:579–590
Holland A (2017) Penalized spline estimation in the partially linear model. J Multivar Anal 153:211–235
Huang J, Liu L, Liu N (2007) Estimation of large covariance matrices of longitudinal data with basis function approximations. J Comput Graph Stat 16:189–209
Kim JK, Yu CL (2011) A semiparametric estimation of mean functionals with nonignorable missing data. J Am Stat Assoc 106:157–165
Koenker R, Bassett Jr G (1978) Regression quantiles. Econometrica 46:33–55
Kott PS, Chang T (2010) Using calibration weighting to adjust for nonignorable unit nonresponse. J Am Stat Assoc 105:1265–1275
Leng C, Zhang W (2014) Smoothing combined estimating equations in quantile regression for longitudinal data. Stat Comput 24:123–136
Leng C, Zhang W, Pan J (2010) Semiparametric mean-covariance regression analysis for longitudinal data. J Am Stat Assoc 105:181–193
Leung D, Wang Y, Zhu M (2009) Efficient parameter estimation in longitudinal data analysis using a hybrid-GEE method. Biostatistics 10:436–445
Li D, Pan J (2013) Empirical likelihood for generalized linear models with longitudinal data. J Multivar Anal 114:63–73
Liang H, Qin Y, Zhang X, Ruppert D (2009) Empirical likelihood-based inferences for generalized partially linear models. Scand J Stat 36:433–443
Liang K, Zeger S (1986) Longitudinal data analysis using generalized linear models. Biometrika 73:13–22
Lin H, Qin G, Zhang J, Fung WK (2018) Doubly robust estimation of partially linear models for longitudinal data with dropouts and measurement error in covariates. Statistics 52:84–98
Little RJA, Rubin DB (2002) Statistical analysis with missing data, 2nd edn. Wiley, New York
Lv J, Guo C, Yang H, Li Y (2017) A moving average Cholesky factor model in covariance modeling for composite quantile regression with longitudinal data. Comput Stat Data Anal 112:129–144
Molenberghs G, Kenward M (2007) Missing data in clinical studies. Wiley, London
Qin G, Zhu Z, Fung WK (2016) Robust estimation of generalized partially linear model for longitudinal data with dropouts. Ann Inst Stat Math 68:977–1000
Qin J, Lawless J (1994) Empirical likelihood and general estimating equations. Ann Stat 22:300–325
Qu A, Lindsay BG, Li B (2000) Improving generalised estimating equations using quadratic inference functions. Biometrika 87:823–836
Robins JM, Rotnitzky A, Zhao L (1994) Estimation of regression coefficients when some regressors are not always observed. J Am Stat Assoc 89:846–866
Schumaker LL (1981) Spline functions: basic theory. Cambridge University Press, Cambridge
Shao J, Wang L (2016) Semiparametric inverse propensity weighting for nonignorable missing data. Biometrika 103:175–187
Tang G, Little RJA, Raghunathan TE (2003) Analysis of multivariate missing data with nonignorable nonresponse. Biometrika 90:747–764
Wang L, Qi C, Shao J (2019) Model-assisted regression estimators for longitudinal data with nonignorable dropout. Int Stat Rev 87:S121–S138
Wang S, Shao J, Kim JK (2014) An instrumental variable approach for identification and estimation with nonignorable nonresponse. Stat Sin 24:1097–1116
Wolberg G, Alfy I (2002) An energy-minimization framework for monotonic cubic spline interpolation. J Comput Appl Math 143:145–188
Zhang W, Leng C (2011) A moving average Cholesky factor model in covariance modelling for longitudinal data. Biometrika 99:141–150
Zhang W, Leng C, Tang CY (2015) A joint modelling approach for longitudinal studies. J R Stat Soc Ser B (Stat Methodol) 77:219–238
Zhao J, Shao J (2015) Semiparametric pseudo-likelihoods in generalized linear models with nonignorable missing data. J Am Stat Assoc 110:1577–1590
Zhou J, Qu A (2012) Informative estimation and selection of correlation structure for longitudinal data. J Am Stat Assoc 107:701–710
Zhu Z, Fung W, He X (2008) On the asymptotics of marginal regression splines with longitudinal data. Biometrika 95:907–917
Acknowledgements
The authors sincerely thank Professor Maria Kateri, an Associate Editor and two reviewers for their insightful comments that greatly improved this paper. This paper was supported by the National Natural Science Foundation of China under Grant Nos. 11871287, 11771144, 11801359, the Natural Science Foundation of Tianjin under Grant No. 18JCYBJC41100, Fundamental Research Funds for the Central Universities and the Key Laboratory for Medical Data Analysis and Statistical Research of Tianjin.
Ethics declarations
Conflict of interest
On behalf of all authors, the corresponding author states that there is no conflict of interest.
Supplementary Information
Appendix
(C1)
The covariate vectors are fixed and the first four moments of \(y_{ij}\) exist. Also, for each i, \({m_i}\) is a bounded sequence of positive integers. The random error \({\varvec{\varepsilon }}_i=\varvec{y}_i-{{\varvec{\mu }}}_i\) satisfies that \(E({\varvec{\varepsilon }}_i{\varvec{\varepsilon }}_i^T)=\varvec{V}_i\), \(\mathop {sup}\nolimits _{i}\Vert \varvec{V}_i\Vert <\infty \) and there exists a positive constant \(\delta _1\) such that \(\mathop {sup}\nolimits _{i}E\Vert {\varvec{\varepsilon }}_i\Vert ^{2+\delta _1}<\infty \), where \(\Vert \varvec{V}_i\Vert \) and \(\Vert {\varvec{\varepsilon }}_i\Vert \) denote the modulus of the largest singular value of matrix \(\varvec{V}_i\) and vector \({\varvec{\varepsilon }}_i\), respectively.
(C2)
All marginal variances \(\varvec{A}_i\) are nonsingular and \(\mathop {sup}\nolimits _i\Vert \varvec{A}_i\Vert <\infty \). The link function \(h(\cdot )\) has a bounded second derivative and \(E\{\big (h(\cdot )\big )^{2+\delta _2}\}<\infty \) for some \(\delta _2>2\).
(C3)
The function \(f(\cdot )\) is r times continuously differentiable on (0, 1) with \(r \ge 2\). The inner knots \(\{a_i, i=1,\ldots , k_n\}\) satisfy
$$\begin{aligned} \mathop {\max }\limits _{1 \le i \le {k_n}}|\kappa _{i+1}-\kappa _i|=O(k_n^{-1}),~~~~~~\frac{{\mathop {\max }\nolimits _{1 \le i \le {k_n}} \kappa _i}}{{\mathop {\min }\nolimits _{1 \le i \le {k_n}} \kappa _i }} \le C_0, \end{aligned}$$where \(\kappa _i=a_i-a_{i-1}\) and \(C_0\) is a positive constant.
(C4)
The joint distribution function \(Q_{jl}(t,s)\) of any pair of \(t_{ij}\) and \(t_{il}\), the marginal distribution function \(Q_j(t)\) of \(t_{ij}\) have positive continuous density functions \(q_{jl}(t,s)\) and \(q_j(t)\) on \([0,1]\times [0, 1]\) and [0, 1], respectively.
(C5)
The probability function \(\pi _{ij}({{\varvec{\vartheta }}}_j)\) satisfies (a) it is twice differentiable with respect to \({{\varvec{\vartheta }}}_j\); (b) \(0<C_1<\pi _{ij}({{\varvec{\vartheta }}}_j)<1\) for a positive constant \(C_1\); (c) \(\partial \pi _{ij}({{\varvec{\vartheta }}}_j)/\partial {{\varvec{\vartheta }}}_j\) is uniformly bounded.
(C6)
There is a unique \({\varvec{\theta }}_0 \in {\mathcal {B}}\) satisfying \(E({\hat{{\varvec{g}}}}_i({\varvec{\theta }}))=0\) and \(E(\hat{{\varvec{\varrho }}}_i({\varvec{\theta }}))=0\), where \({\mathcal {B}}\) is the parameter space.
(C7)
Assume that \(n^{-1}{\sum \nolimits _{i = 1}^n {\varvec{g}}_{0i}({{\varvec{\theta }}}_0,{\varvec{\gamma }}_0){\varvec{g}}^T_{0i}({{\varvec{\theta }}}_0,{\varvec{\gamma }}_0)} \) and \(n^{-1}\sum \nolimits _{i = 1}^n {\varvec{\varrho }}_{0i}({{\varvec{\theta }}}_0,{\varvec{\gamma }}_0){\varvec{\varrho }}^T_{0i}({{\varvec{\theta }}}_0,{\varvec{\gamma }}_0) \) converge almost surely to positive definite matrices \({\varvec{\Lambda }}_g\) and \(\varvec{\Lambda }_{{\varvec{\varrho }}}\), respectively. Further, assume that \(n^{-1}{\sum \nolimits _{i = 1}^n cov({\hat{{\varvec{g}}}}_{0i}({{\varvec{\theta }}}_0,\varvec{{\hat{\gamma }}})) }\) and \(n^{-1}\sum \nolimits _{i = 1}^n cov(\hat{{\varvec{\varrho }}}_{0i}({{\varvec{\theta }}}_0,\varvec{{\hat{\gamma }}})) \) converge almost surely to positive definite matrices \({\varvec{\Sigma }}_g\) and \(\varvec{\Sigma }_{{\varvec{\varrho }}}\), respectively.
(C8)
Assume that \({\partial ^2 {\varvec{g}}_{0i}({{\varvec{\theta }}},{\varvec{\gamma }}_0)}/{\partial {{\varvec{\theta }}}\partial {{\varvec{\theta }}}^T}\) and \({\partial ^2 {\varvec{\varrho }}_{0i}({{\varvec{\theta }}},{\varvec{\gamma }}_0)}/{\partial {{\varvec{\theta }}}\partial {{\varvec{\theta }}}^T}\) are continuous in a neighborhood of \({{\varvec{\theta }}}_0\), and that \(\Vert {\partial ^2 {\varvec{g}}_{0i}({{\varvec{\theta }}},{\varvec{\gamma }}_0)}/{\partial {{\varvec{\theta }}}\partial {{\varvec{\theta }}}^T}\Vert \) and \(\Vert {\partial ^2 {\varvec{\varrho }}_{0i}({{\varvec{\theta }}},{\varvec{\gamma }}_0)}/{\partial {{\varvec{\theta }}}\partial {{\varvec{\theta }}}^T}\Vert \) can be bounded by some integrable functions in that neighborhood of \({{\varvec{\theta }}}_0\).
Further, suppose that \({\varvec{\xi }}_{j}^0\) is the unique solution to \(E\{{\varvec{s}}_{j}(\varvec{y}_i, \varvec{v}_i, {\varvec{r}}_i, {{\varvec{\xi }}}_{j})\}=0\) and the proposed model holds, \({{\varvec{\Omega }}}_{j} =E\{{\varvec{s}}_{j}(\varvec{y}_i, \varvec{v}_i, {\varvec{r}}_i, {{\varvec{\xi }}}_{j}^0){\varvec{s}}_{j}(\varvec{y}_i, \varvec{v}_i, {\varvec{r}}_i, {{\varvec{\xi }}}_{j}^0)^T\}\) is positive definite and the matrix \({\varvec{\Upsilon }}_{j} =E[\partial {\varvec{s}}_{j}(\varvec{y}_i, \varvec{v}_i, {\varvec{r}}_i, {{\varvec{\xi }}}_{j}^0)/\partial {{\varvec{\xi }}}_{j}]\) is of full rank. As \(n \rightarrow \infty \), \(\sqrt{n}(\hat{{{\varvec{\xi }}}}_{j}-{{\varvec{\xi }}}_{j}^0) {\mathop {\longrightarrow }\limits ^{{{d}}}}N(0, ({\varvec{\Upsilon }}_{j}^T{{\varvec{\Omega }}}_{j}{\varvec{\Upsilon }}_{j})^{-1})\) and \(\sqrt{n}(\hat{{\varvec{\gamma }}}-{\varvec{\gamma }}_0) {\mathop {\longrightarrow }\limits ^{{{d}}}}N(0,{\varvec{\Sigma }})\) in distribution. In the following we mainly treat the case of \(\hat{{{\varvec{g}}}}_{i}({{\varvec{\theta }}})\); the proof for \(\hat{{{\varvec{\varrho }}}}_{i}({{\varvec{\theta }}})\) follows from similar arguments.
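The asymptotic-normality statements in (C8) rest on sandwich-type variances built from a derivative matrix \({\varvec{\Upsilon }}\) and a second-moment matrix \({\varvec{\Omega }}\) of the estimating function. The sketch below is a hedged toy (an ordinary least-squares estimating equation with simulated heavy-tailed errors, not the paper's estimator) showing how the two building blocks combine into standard errors:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 5000
x = rng.normal(size=n)
y = 2.0 + 1.0 * x + rng.standard_t(df=5, size=n)   # heavy-tailed errors

X = np.c_[np.ones(n), x]
theta = np.linalg.lstsq(X, y, rcond=None)[0]       # root of the estimating equation
s = X * (y - X @ theta)[:, None]                   # estimating functions s_i(theta)

Upsilon = -(X.T @ X) / n                           # sample analogue of E[ds_i/dtheta]
Omega = s.T @ s / n                                # sample analogue of E[s_i s_i']
avar = np.linalg.inv(Upsilon) @ Omega @ np.linalg.inv(Upsilon).T
se = np.sqrt(np.diag(avar) / n)                    # sandwich standard errors
```

The same template, with the paper's \({\varvec{s}}_{j}\) in place of the least-squares score, produces the limiting covariance for \(\hat{{{\varvec{\xi }}}}_{j}\).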
Lemma 1
Assume the regularity conditions in Theorem 1 hold. As \(n \rightarrow \infty \),
Proof of Lemma 1
Consider the kth component of \({\hat{{\varvec{g}}}}_i({{\varvec{\theta }}}_0,\hat{{\varvec{\gamma }}})\),
where \({\varvec{\eta }}_{0i}=\varvec{x}_i{{\varvec{\beta }}}_0+\varvec{m}_i{{\varvec{\alpha }}}_0\), \({{\varvec{\mu }}}_{0i}=h(\varvec{x}_i{{\varvec{\beta }}}_0+\varvec{m}_i{{\varvec{\alpha }}}_0)\) and \(\varvec{H}'( {{{\varvec{\eta }}_{0i}}} )\,\mathrm{{= diag}}( {h'( {{{\varvec{\eta }}_{0i1}}} ),\dots , h'( {{{\varvec{\eta }} _{0im}}} )} )\), \(h'(t)={{dh( t )} / {dt}}\). Note that
Applying the Taylor expansion to the first two terms in (11) around \(( {{{\varvec{\beta }} _0},{{\varvec{\alpha }} _0}})\), we have
where \(\varvec{w}_{i}\) is between \({\varvec{\eta }}_{0i}\) and \({\varvec{x}_i}{{\varvec{\beta }} _0} + {f_0}( {{\varvec{t}_i}})\) and \( {\varvec{R}}( {{\varvec{t}_i}} ) = {( {{R}( {{t_{i1}}} ),\dots ,{R}( {{t_{im}}} )} )^T}\), \({\varvec{R}^*}( {{\varvec{t}_i}} ) = {( {{R^2}( {{t_{i1}}} ),\dots ,{R^2}( {{t_{im}}} )} )^T}\), \({R}( {{t_{ij}}} ) = {f_0}( {{t_{ij}}} ) - {\varvec{m} ^T}( {{t_{ij}}} ){{\varvec{\alpha }} _0},\) \( \varvec{H}''( {{\varvec{w}_i}} )\mathrm{{= diag}}( h''( {{\varvec{w} _{i1}}} ),\dots , h''( {{\varvec{w} _{im}}} ) ),\) \(h''(t)={{d^2h( t )} / {dt^2}}. \)
From conditions (C3)–(C4) and Corollary 6.21 in Schumaker (1981), we obtain \(\Vert {{R}( {{t_{ij}}} )} \Vert = O_p( {k_n^{- r}})\) and \(\Vert {\varvec{m}( {{t_{ij}}})}\Vert = O_p( 1 )\). By substituting (12) into (11), we have
Under conditions (C3)–(C4) and through a simple calculation, we derive that \(\varvec{I}_2=O_p(n^{-1}n^{1/2}{k_n^{-r}})=o_p(n^{-1/2})\) and \(\varvec{I}_3=o_p(n^{-1/2})\). Thus,
where \({\hat{{\varvec{g}}}}_{0ik}({{\varvec{\theta }}}_0,\hat{{\varvec{\gamma }}})={\dot{{\varvec{\mu }} _i}^T\varvec{A}_i^{- 1/2}{\varvec{B}_k}\varvec{A}_i^{- 1/2}{{\hat{\varvec{W}}}_i}} {{\varvec{\varepsilon }} _i}\) and \({\hat{{\varvec{g}}}}_{0i}({{\varvec{\theta }}}_0,\hat{{\varvec{\gamma }}})=({\hat{{\varvec{g}}}}^T_{0i1},\ldots ,{\hat{{\varvec{g}}}}^T_{0is})^T\). Obviously, it leads to \(E({\hat{{\varvec{g}}}}_{0ik}({{\varvec{\theta }}}_0,\hat{{\varvec{\gamma }}}))=0\). Applying the Taylor expansion to \({\hat{{\varvec{g}}}}_{0ik}({{\varvec{\theta }}}_0,\hat{{\varvec{\gamma }}})\) around \({\varvec{\gamma }}_0\), it leads to
where \(\varvec{I}_{n1}= {{\varvec{g}}_{0ik}( {{{\varvec{\theta }} _0},{{\varvec{\gamma }} _0}})}=\dot{{\varvec{\mu }} _i}^T\varvec{A}_i^{- 1/2}{\varvec{B}_k}\varvec{A}_i^{- 1/2}{\varvec{W}}_i {\varvec{\varepsilon }}_i\) and \(\varvec{I}_{n2}\) is the rest part of the above equation. Under (C1)–(C2) and (C5), it can be seen that
where \(\varvec{T}_i\) is a symmetric matrix with its \((j, j')\) element \((j\le j')\) as \(\varepsilon _{ij}\varepsilon _{ij'}/\pi _{ij}\). Therefore,
and
where \(\varvec{N}_{lk}=\dot{{\varvec{\mu }} _i}^T\varvec{A}_i^{- 1/2}{\varvec{B}_l}\varvec{A}_i^{- 1/2}\varvec{T}_i\varvec{A}_i^{- 1/2}{\varvec{B}_k}\varvec{A}_i^{- 1/2}\dot{{\varvec{\mu }} _i}\) and \(\varvec{U}_{lk}=E\big [ {\partial {{\varvec{g}}_{0il}}\big ( {{{\varvec{\theta }} _0},{{\varvec{\gamma }} _0}} \big )} /{\partial {\varvec{\gamma }} }\big ] {\varvec{\Sigma }} {E^T}\big [ {{{\partial {{\varvec{g}}_{0ik}}\big ( {{{\varvec{\theta }} _0},{{\varvec{\gamma }} _0}} \big )} / {\partial {{\varvec{\gamma }}}}}} \big ]\) for \(l, k=1,\ldots ,s\). Under the conditions (C6)–(C7), it can be verified that
in probability and then
in distribution. Moreover, we can obtain \(n^{-1}\sum \nolimits _{i = 1}^n{\hat{{\varvec{g}}}}_i({\varvec{\theta }}_0,\hat{{\varvec{\gamma }}})=O_p(n^{-1/2})\). Noting that \(\sqrt{n}(\varvec{{\hat{\gamma }}}-{\varvec{\gamma }}_0)=O_p(1)\) and \({\partial {\varvec{g}}_{0ik}( {{\varvec{\theta }}}_0,{{\varvec{\gamma }}_0} ) \big / \partial {{\varvec{\gamma }}}}=O_p(1)\) from (C5) and (C8), we can obtain
Hence
According to (3),
which leads to \( {\mathop {\max }\nolimits _{1 \le i \le n} \Vert {{{{\hat{{\varvec{g}}}}}_{i}}} ( {{{\varvec{\theta }} _0,\varvec{{\hat{\gamma }}}}} ) \Vert } =o_p(n^{1/2})\). The proof of Lemma 1 is completed. \(\square \)
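The spline approximation rate \(\Vert R(t_{ij})\Vert = O_p(k_n^{-r})\) invoked in the proof above can be observed numerically. The toy below uses piecewise-linear interpolation (the \(r=2\) case) rather than the higher-order B-splines of the paper; the sup-norm error shrinks at rate \(k_n^{-2}\) as the number of interior knots grows:

```python
import numpy as np

def interp_sup_error(f, k_n, n_grid=4001):
    # piecewise-linear spline through k_n equally spaced interior knots on [0, 1]
    knots = np.linspace(0.0, 1.0, k_n + 2)          # includes both endpoints
    x = np.linspace(0.0, 1.0, n_grid)
    return np.max(np.abs(np.interp(x, knots, f(knots)) - f(x)))

f = lambda t: np.sin(2 * np.pi * t)                 # smooth test function
errs = [interp_sup_error(f, k_n) for k_n in (4, 9, 19)]
# the error decays like k_n^{-r} with r = 2 for this linear spline
```

Equally spaced knots also satisfy condition (C3) trivially, since all gaps \(\kappa _i\) are identical, so the gap ratio is 1.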
Lemma 2
Assume the regularity conditions in Theorem 1 hold. We have
where \({\hat{{\varvec{g}}}}({{\varvec{\theta }}}_0)=n^{-1}\sum \nolimits _{i = 1}^n{\hat{{\varvec{g}}}}_i({{\varvec{\theta }}}_0)\) and \(\varvec{S}_n( {{{{\varvec{\theta }}}_0}} ) = {n^{- 1}}\sum \nolimits _{i = 1}^n {{{{\hat{{\varvec{g}}}}}_i}( {{{{\varvec{\theta }}}_0}})} {{{{\hat{{\varvec{g}}}}}_i}( {{{{\varvec{\theta }}}_0}})}^T\).
Proof of Lemma 2
The Lagrange multiplier method leads to the empirical log-likelihood ratio function for \({{\varvec{\theta }}}_0\) as
where vector \({{\varvec{\lambda }}}\) is the solution to the equation
Write \({{\varvec{\lambda }}} = \rho {\varvec{{\varvec{u}}}}\), where \( \rho \ge 0\) and \({\varvec{u}}\in {R^p}\) is a unit vector with \(\big \Vert {\varvec{u}}\big \Vert = 1\). Substituting \({1 \big / {\big ( {1 + {{\varvec{\lambda }} ^T}{{{\hat{{\varvec{g}}}} }_{i}}\big ( {{{{\varvec{\theta }}}}} \big )} \big )}} = 1 - {{{{\varvec{\lambda }} ^T}{{{\hat{{\varvec{g}}}} }_{i}}\big ( {{{{\varvec{\theta }}}}} \big )} \big / {\big ( {1 + {{\varvec{\lambda }} ^T}{{{\hat{{\varvec{g}}}} }_{i}}\big ( {{{{\varvec{\theta }}}}} \big )} \big )}}\) and \({{\varvec{\lambda }}} = \rho {\varvec{{\varvec{u}}}}\) into the above equation, we have
where \(\varvec{S}_n( {{{{\varvec{\theta }}}_0}} ) = {n^{- 1}}\sum \nolimits _{i = 1}^n {{{{\hat{{\varvec{g}}}}}_i}( {{{{\varvec{\theta }}}_0}})} {{{{\hat{{\varvec{g}}}}}_i}( {{{{\varvec{\theta }}}_0}})}^T={\varvec{\Lambda }}_g(1+o_p(1))\) and the inequality follows from the positivity of \((1+\rho \varvec{u}^T{\hat{\varvec{g}}_i({{\varvec{\theta }}}_0)})\). Therefore, it can be shown that
Noting that \(\max \nolimits _{1\le i\le n}|| \hat{{{\varvec{g}}}}_{i}({{{\varvec{\theta }}}_0}) | |=o_p(n^{1/2})\) and according to the conclusion provided in the proof of Lemma 1, we have
which leads to \( \rho =O_{p}(n^{-1/2}), \) i.e., \(|| {{\varvec{\lambda }}}||=O_{p}(n^{-1/2}).\) Naturally,
Expanding the equation \(D({\varvec{\lambda }})\), we get
where \(\xi _i\in (0, {{\varvec{\lambda }}}^{T}\hat{{{\varvec{g}}}}_{i}({{{\varvec{\theta }}}_0})).\) Using the fact that \(\max \nolimits _{1\le i\le n}|| {{\varvec{\lambda }}}^{T}\hat{{{\varvec{g}}}}_{i}({{{\varvec{\theta }}}_0}) | |=o_{p}(1), \) we obtain \(|\xi _i| =o_{p}(1).\) Since
thus
where \(| |{\varvec{\zeta }}| |=o_{p}(n^{-1/2}).\) A Taylor expansion of \(\hat{R}_Q({{\varvec{\theta }}}_0)\) yields
Similarly,
therefore,
Substituting (13) into (14), it holds that
A simple calculation shows that
Therefore,
Since \({\hat{{{\varvec{\theta }}}}}_Q=\mathop {\arg \min }\nolimits _{{{\varvec{\theta }}}\in {\mathcal {B}}} {\hat{R}_Q({{\varvec{\theta }}})}\), we have two properties of \({\hat{R}}_Q({{\varvec{\theta }}}_0)\) as follows:
where
Since \(n^{-1}\sum \nolimits _{i = 1}^n{\hat{{\varvec{g}}}}_i({\varvec{\theta }}_0,\hat{{\varvec{\gamma }}})=O_p(n^{-1/2})\), together with conditions (C5)–(C8), we have \(R_{n1}=O_p(n^{-1})\) and \(R_{n2}=o_p(1)\). The proof of Lemma 2 is completed. \(\square \)
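The multiplier \({{\varvec{\lambda }}}\) appearing throughout the proof of Lemma 2 is the root of \(n^{-1}\sum _i \hat{{\varvec{g}}}_i({{\varvec{\theta }}}_0)/(1+{{\varvec{\lambda }}}^T\hat{{\varvec{g}}}_i({{\varvec{\theta }}}_0))=0\). A minimal numeric sketch of solving this dual equation by damped Newton iteration, with hypothetical simulated data standing in for \(\hat{{\varvec{g}}}_i({{\varvec{\theta }}}_0)\):

```python
import numpy as np

def el_lambda(g, iters=50):
    """Solve (1/n) * sum_i g_i / (1 + lam' g_i) = 0 for the EL multiplier lam."""
    n, p = g.shape
    lam = np.zeros(p)
    for _ in range(iters):
        denom = 1.0 + g @ lam
        grad = (g / denom[:, None]).mean(axis=0)
        hess = -(g[:, :, None] * g[:, None, :]
                 / (denom ** 2)[:, None, None]).mean(axis=0)
        step = np.linalg.solve(hess, grad)
        while np.any(1.0 + g @ (lam - step) <= 1e-8):
            step *= 0.5                     # damp to keep every 1 + lam'g_i > 0
        lam = lam - step
    return lam

rng = np.random.default_rng(1)
g = (rng.normal(size=500) - 0.1)[:, None]   # estimating function for H0: mean = 0.1
lam = el_lambda(g)
w = 1.0 / (g.shape[0] * (1.0 + g @ lam))    # implied empirical likelihood weights
```

At the solution the implied weights sum to one and satisfy the moment constraint exactly, which is the identity used when expanding \(D({\varvec{\lambda }})\) in the proof.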
Proof of Theorem 1
Similar to the proof of Lemma 1 in Qin and Lawless (1994), under conditions (C3)–(C4) and (C6)–(C7), we have \(\hat{{{\varvec{\theta }}}}_Q {\mathop {\longrightarrow }\limits ^{{{p}}}}{{\varvec{\theta }}}_0\). This implies \(\Vert {\hat{{{\varvec{\beta }}}}}_Q-{{\varvec{\beta }}}_0\Vert =O_p(n^{-1/2})\) and \(\Vert {\hat{{{\varvec{\alpha }}}}}_Q-{{\varvec{\alpha }}}_0\Vert =O_p(n^{-1/2})\). Under (C3)–(C4),
Since \(\Vert R(t_{ij})\Vert =O_p(k_n^{-r})=O_p(n^{-r/(2r+1)})\), we obtain
Proof of Theorem 2
Denote \(\varvec{L}_{1n}({{\varvec{\beta }}},{{\varvec{\alpha }}})\) and \(\varvec{L}_{2n}({{\varvec{\beta }}},{{\varvec{\alpha }}})\) as the first derivatives of \({\hat{R}}_Q({{\varvec{\theta }}})\) with respect to \({{\varvec{\beta }}}\) and \({{\varvec{\alpha }}}\), respectively. Obviously, \(\varvec{L}_{1n}({\hat{{{\varvec{\beta }}}}}_Q,{\hat{{{\varvec{\alpha }}}}}_Q)=0\) and \(\varvec{L}_{2n}({\hat{{{\varvec{\beta }}}}}_Q,{\hat{{{\varvec{\alpha }}}}}_Q)=0\). Applying a Taylor expansion to \(\varvec{L}_{1n}\) around \(({{\varvec{\beta }}}_0,{{\varvec{\alpha }}}_0)\), it follows that
where \(({\tilde{{{\varvec{\beta }}}}}_0^T,{\tilde{{{\varvec{\alpha }}}}}_0^T)^T\) is between \(({{\varvec{\beta }}}_0^T,{{\varvec{\alpha }}}_0^T)^T\) and \(({\hat{{{\varvec{\beta }}}}}_Q^T,{\hat{{{\varvec{\alpha }}}}}_Q^T)^T\). Since
we have
where \(\varvec{S}^{-1}_{n,k_1k_2}\) represents the \((k_1,k_2)\) part of \(\varvec{S}_n^{-1}\). Let \(\tilde{\varvec{x}}_{l_1}=\varvec{H}'({\varvec{\eta }}_{0i})\varvec{x}_{l_1}\), \(\tilde{\varvec{R}}(\varvec{t}_{l_2})=\varvec{H}'({\varvec{\eta }}_{0{l_2}})\varvec{R}(\varvec{t}_{l_2})\) and for simplicity,
and
Similar to the proof of Lemma 1, we have \(\hat{{\varvec{\tau }}}_{l_1l_2}={\varvec{\tau }}_{l_1l_2}(1+o_p(1))\). Therefore, we have
where \(\tilde{\varvec{m}}_{l_2}=\varvec{H}'({\varvec{\eta }}_{0l_2})\varvec{m}_{l_2}\). Thus,
Similarly,
From the Eq. (16), we have
where \(\varvec{V}_g=n^{-2}{\sum \nolimits _{l_1,l_2= 1}^n\tilde{\varvec{m}}_{l_1}^T\hat{{\varvec{\tau }}}_{l_1l_2}\tilde{\varvec{m}}_{l_2}}\), \(\varvec{K}_g=n^{-2}\sum \nolimits _{l_1,l_2 = 1}^n\tilde{\varvec{m}}_{l_1}^T\hat{{\varvec{\tau }}}_{l_1l_2}\tilde{\varvec{x}}_{l_2}\) and \(\varvec{P}_g=n^{-2}\sum \nolimits _{l_1,l_2= 1}^n\tilde{\varvec{m}}_{l_1}^T \hat{{\varvec{\tau }}}_{l_1l_2} \{\tilde{\varvec{R}}(\varvec{t}_{l_2})+{\varvec{\varepsilon }}_{l_2}\}\). By the law of large numbers, it follows that \(\varvec{V}_g=n^{-2}{\sum \nolimits _{l_1,l_2= 1}^n\tilde{\varvec{m}}_{l_1}^T\hat{{\varvec{\tau }}}_{l_1l_2}\tilde{\varvec{m}}_{l_2}} {\mathop {\longrightarrow }\limits ^{{{p}}}}E(\tilde{\varvec{m}}^T{\varvec{\tau }}\tilde{\varvec{m}})\), \(\varvec{K}_g=n^{-2}\sum \nolimits _{l_1,l_2 = 1}^n\tilde{\varvec{m}}_{l_1}^T\hat{{\varvec{\tau }}}_{l_1l_2}\tilde{\varvec{x}}_{l_2} {\mathop {\longrightarrow }\limits ^{{{p}}}}E(\tilde{\varvec{m}}^T{\varvec{\tau }}\tilde{\varvec{x}})\) with \(\tilde{\varvec{x}}=(\tilde{\varvec{x}}_1^T,\ldots ,\tilde{\varvec{x}}_n^T)^T\), \(\tilde{\varvec{m}}=(\tilde{\varvec{m}}_1^T,\ldots ,\tilde{\varvec{m}}_n^T)^T\) and
By substituting (17) into the Eq. (15), we obtain
By a simple calculation, we can obtain
By defining \(\varvec{x}_i^*=\tilde{\varvec{x}}^T_i-\varvec{K}_g^T\varvec{V}_g^{-1}\tilde{\varvec{m}}_i^T\) and substituting it into (18), it leads to
Note that
where
Using the similar arguments in the proof of Lemma 1, we have
where \(\varvec{B}_{g} = E\big ( {{\varvec{z}_{gl_2}}( {{{{\varvec{\beta }}}_0},{{\varvec{\gamma }} _0}} )\varvec{z}_{gl_2}^T( {{{{\varvec{\beta }}}_0},{{\varvec{\gamma }} _0}} )} \big )=E\big \{[\tilde{\varvec{x}}^T{\varvec{\tau }}-\tilde{\varvec{x}}^T{\varvec{\tau }}\tilde{\varvec{m}}[\tilde{\varvec{m}}^T{\varvec{\tau }}\tilde{\varvec{m}}]^{-1}\tilde{\varvec{m}}^T{\varvec{\tau }}] {{\varvec{\varepsilon }}}\big \}^{\otimes 2}\) and \({\varvec{\Psi }}_{g}=\varvec{B}_{g}+E( {{{{\varvec{z}_{gl_2}}( {{{{\varvec{\beta }}}_0},{{\varvec{\gamma }} _0}} )} / {\partial {{\varvec{\gamma }}} }}} ){{\varvec{\Sigma }}}E( {{{{{{\varvec{z}_{gl_2}}( {{{{\varvec{\beta }}}_0},{{\varvec{\gamma }} _0}} )} / {\partial {{\varvec{\gamma }}} }}}^T}} )\). Then we have \(\varvec{A}_1{\mathop {\longrightarrow }\limits ^{d}}N(0,{\varvec{\Psi }}_g)\) in distribution. Since
then \(\varvec{A}_2=0\). Expanding \({\varvec{x}}_{l_1}^*\) in \(\varvec{A}_3\), we have
From the definition of \(\varvec{K}_g\), it is easy to verify that
which implies
Recalling that \(\Vert {\varvec{m}( {{t_{ij}}})}\Vert = O_p( 1 )\) and \(\Vert {{R}( {{t_{ij}}} )} \Vert =o_p(1)\), we have \(\varvec{A}_{31}=o_p(1)\). Similarly, we obtain \(\varvec{A}_{32}=o_p(1)\) and then \(\varvec{A}_{3}=o_p(1)\). By the law of large numbers, we have
in probability, where \({\varvec{\Gamma }}_g=E\big \{\tilde{\varvec{x}}^T{\varvec{\tau }}\tilde{\varvec{x}}-\tilde{\varvec{x}}^T{\varvec{\tau }}\tilde{\varvec{m}}[\tilde{\varvec{m}}^T{\varvec{\tau }}\tilde{\varvec{m}}]^{-1}\tilde{\varvec{m}}^T{\varvec{\tau }}\tilde{\varvec{x}}\big \}\). Hence,
in distribution and the proof of Theorem 2 is completed. \(\square \)
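The projection step \(\varvec{x}_i^*=\tilde{\varvec{x}}^T_i-\varvec{K}_g^T\varvec{V}_g^{-1}\tilde{\varvec{m}}_i^T\) in the proof of Theorem 2 is, under identity working weights, the familiar partialling-out of the spline basis from the parametric covariates. A hedged sketch with simulated data (working independence, a random matrix standing in for the B-spline basis) showing that partialling out \(m\) reproduces the coefficient of \(x\) from the full least-squares fit:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 1000
x = rng.normal(size=(n, 2))                 # parametric covariates
m = rng.normal(size=(n, 3))                 # stand-in for the B-spline basis
y = x @ np.array([1.5, -0.7]) + m @ np.array([0.5, 0.2, -0.3]) \
    + rng.normal(scale=0.1, size=n)

# full least-squares fit on (x, m): keep the coefficients of x
full = np.linalg.lstsq(np.c_[x, m], y, rcond=None)[0][:2]

# partial out the span of m from both x and y, then regress
P = m @ np.linalg.solve(m.T @ m, m.T)       # projection onto the columns of m
x_star, y_star = x - P @ x, y - P @ y
partial = np.linalg.lstsq(x_star, y_star, rcond=None)[0]
```

This is the Frisch–Waugh–Lovell identity; in the theorem the same algebra, with the weights \(\hat{{\varvec{\tau }}}_{l_1l_2}\), isolates the asymptotic behavior of \({\hat{{{\varvec{\beta }}}}}_Q\) from the nonparametric component.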
Shao, Y., Wang, L. Generalized partial linear models with nonignorable dropouts. Metrika 85, 223–252 (2022). https://doi.org/10.1007/s00184-021-00828-z