Abstract
In the presence of longitudinal data with nonignorable dropouts, we propose improved estimators for generalized partial linear models that accommodate both within-subject correlations and nonignorable missing data. To address the identifiability problem, an instrumental covariate, which is related to the response variable but unrelated to the propensity given the response variable and other covariates, is used to construct sufficient instrumental estimating equations. The nonparametric function is then approximated by B-spline basis functions, and bias-corrected generalized estimating equations are constructed based on inverse probability weighting. To incorporate the within-subject correlations under an informative working correlation structure, we borrow the ideas of quadratic inference functions and the hybrid-GEE to construct improved empirical likelihood procedures. Under some regularity conditions, we establish the asymptotic normality of the proposed estimators of the parametric components and the convergence rate of the estimators of the nonparametric functions. The finite-sample performance of the proposed estimators is studied through simulations, and an application to an HIV CD4 data set is also presented.
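As a toy illustration of the inverse probability weighting idea behind the bias-corrected estimating equations, the sketch below simulates a nonignorable response mechanism in which the probability of observing \(y\) depends on \(y\) itself. This is a minimal hedged sketch, not the paper's GEE procedure: the propensity is taken as known here, whereas in the paper it is unknown and identified through the instrumental covariate.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
y = rng.normal(loc=1.0, scale=1.0, size=n)   # true mean is 1

# Nonignorable mechanism: the observation probability depends on y itself
pi = 1.0 / (1.0 + np.exp(-(0.5 + 0.8 * y)))
r = rng.binomial(1, pi)                      # r = 1 if y is observed

naive = y[r == 1].mean()                     # biased upward: large y observed more often
ipw = np.sum(r * y / pi) / n                 # Horvitz-Thompson type correction
```

Weighting each observed response by the inverse of its observation probability removes the selection bias in the complete-case mean, which is the mechanism the bias-corrected generalized estimating equations exploit.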
References
Bai Y, Fung WK, Zhu Z (2010) Weighted empirical likelihood for generalized linear models with longitudinal data. J Stat Plan Inference 140:3446–3456
Boente G, He X, Zhou J (2006) Robust estimates in generalized partially linear models. Ann Stat 34:2856–2878
Boente G, Rodriguez D (2010) Robust inference in generalized partially linear models. Comput Stat Data Anal 54:2942–2966
Chen B, Zhou X (2013) Generalized partially linear models for incomplete longitudinal data in the presence of population-level information. Biometrics 69:386–395
Chen X, Christensen T (2015) Optimal uniform convergence rates and asymptotic normality for series estimators under weak dependence and weak conditions. J Econom 188:447–465
Cho H, Qu A (2015) Efficient estimation for longitudinal data by combining large-dimensional moment conditions. Electron J Stat 9:1315–1334
Diggle P, Kenward MG (1994) Informative drop-out in longitudinal data analysis (with discussion). J R Stat Soc Ser C (Appl Stat) 43:49–93
Fang F, Shao J (2016) Model selection with nonignorable nonresponse. Biometrika 103:861–874
Fitzmaurice GM, Molenberghs G, Lipsitz SR (1995) Regression models for longitudinal binary responses with informative drop-outs. J R Stat Soc Ser B (Stat Methodol) 57:691–704
Fu L, Wang Y (2012) Quantile regression for longitudinal data with a working correlation model. Comput Stat Data Anal 56:2526–2538
Hansen LP (1982) Large sample properties of generalized method of moments estimators. Econometrica 50:1029–1054
He X, Shi P (1996) Bivariate tensor-product B-splines in a partly linear model. J Multivar Anal 58:162–181
He X, Zhu Z, Fung WK (2002) Estimation in a semiparametric model for longitudinal data with unspecified dependence structure. Biometrika 89:579–590
Holland A (2017) Penalized spline estimation in the partially linear model. J Multivar Anal 153:211–235
Huang J, Liu L, Liu N (2007) Estimation of large covariance matrices of longitudinal data with basis function approximations. J Comput Graph Stat 16:189–209
Kim JK, Yu CL (2011) A semiparametric estimation of mean functionals with nonignorable missing data. J Am Stat Assoc 106:157–165
Koenker R, Bassett Jr G (1978) Regression quantiles. Econometrica 46:33–55
Kott PS, Chang T (2010) Using calibration weighting to adjust for nonignorable unit nonresponse. J Am Stat Assoc 105:1265–1275
Leng C, Zhang W (2014) Smoothing combined estimating equations in quantile regression for longitudinal data. Stat Comput 24:123–136
Leng C, Zhang W, Pan J (2010) Semiparametric mean-covariance regression analysis for longitudinal data. J Am Stat Assoc 105:181–193
Leung D, Wang Y, Zhu M (2009) Efficient parameter estimation in longitudinal data analysis using a hybrid-GEE method. Biostatistics 10:436–445
Li D, Pan J (2013) Empirical likelihood for generalized linear models with longitudinal data. J Multivar Anal 114:63–73
Liang H, Qin Y, Zhang X, Ruppert D (2009) Empirical likelihood-based inferences for generalized partially linear models. Scand J Stat 36:433–443
Liang K, Zeger S (1986) Longitudinal data analysis using generalized linear models. Biometrika 73:13–22
Lin H, Qin G, Zhang J, Fung WK (2018) Doubly robust estimation of partially linear models for longitudinal data with dropouts and measurement error in covariates. Statistics 52:84–98
Little RJA, Rubin DB (2002) Statistical analysis with missing data, 2nd edn. Wiley, New York
Lv J, Guo C, Yang H, Li Y (2017) A moving average Cholesky factor model in covariance modeling for composite quantile regression with longitudinal data. Comput Stat Data Anal 112:129–144
Molenberghs G, Kenward M (2007) Missing data in clinical studies. Wiley, London
Qin G, Zhu Z, Fung WK (2016) Robust estimation of generalized partially linear model for longitudinal data with dropouts. Ann Inst Stat Math 68:977–1000
Qin J, Lawless J (1994) Empirical likelihood and general estimating equations. Ann Stat 22:300–325
Qu A, Lindsay BG, Li B (2000) Improving generalised estimating equations using quadratic inference functions. Biometrika 87:823–836
Robins JM, Rotnitzky A, Zhao L (1994) Estimation of regression coefficients when some regressors are not always observed. J Am Stat Assoc 89:846–866
Schumaker LL (1981) Spline functions: basic theory. Cambridge University Press, Cambridge
Shao J, Wang L (2016) Semiparametric inverse propensity weighting for nonignorable missing data. Biometrika 103:175–187
Tang G, Little RJA, Raghunathan TE (2003) Analysis of multivariate missing data with nonignorable nonresponse. Biometrika 90:747–764
Wang L, Qi C, Shao J (2019) Model-assisted regression estimators for longitudinal data with nonignorable dropout. Int Stat Rev 87:S121–S138
Wang S, Shao J, Kim JK (2014) An instrumental variable approach for identification and estimation with nonignorable nonresponse. Stat Sin 24:1097–1116
Wolberg G, Alfy I (2002) An energy-minimization framework for monotonic cubic spline interpolation. J Comput Appl Math 143:145–188
Zhang W, Leng C (2011) A moving average Cholesky factor model in covariance modelling for longitudinal data. Biometrika 99:141–150
Zhang W, Leng C, Tang CY (2015) A joint modelling approach for longitudinal studies. J R Stat Soc Ser B (Stat Methodol) 77:219–238
Zhao J, Shao J (2015) Semiparametric pseudo-likelihoods in generalized linear models with nonignorable missing data. J Am Stat Assoc 110:1577–1590
Zhou J, Qu A (2012) Informative estimation and selection of correlation structure for longitudinal data. J Am Stat Assoc 107:701–710
Zhu Z, Fung W, He X (2008) On the asymptotics of marginal regression splines with longitudinal data. Biometrika 95:907–917
Acknowledgements
The authors sincerely thank Professor Maria Kateri, an Associate Editor and two reviewers for their insightful comments that greatly improved this paper. This paper was supported by the National Natural Science Foundation of China under Grant Nos. 11871287, 11771144, 11801359, the Natural Science Foundation of Tianjin under Grant No. 18JCYBJC41100, Fundamental Research Funds for the Central Universities and the Key Laboratory for Medical Data Analysis and Statistical Research of Tianjin.
Ethics declarations
Conflict of interest
On behalf of all authors, the corresponding author states that there is no conflict of interest.
Supplementary Information
Appendix
(C1)
The covariate vectors are fixed and the first four moments of \(y_{ij}\) exist. Also, for each i, \({m_i}\) is a bounded sequence of positive integers. The random error \({\varvec{\varepsilon }}_i=\varvec{y}_i-{{\varvec{\mu }}}_i\) satisfies that \(E({\varvec{\varepsilon }}_i{\varvec{\varepsilon }}_i^T)=\varvec{V}_i\), \(\mathop {sup}\nolimits _{i}\Vert \varvec{V}_i\Vert <\infty \) and there exists a positive constant \(\delta _1\) such that \(\mathop {sup}\nolimits _{i}E\Vert {\varvec{\varepsilon }}_i\Vert ^{2+\delta _1}<\infty \), where \(\Vert \varvec{V}_i\Vert \) and \(\Vert {\varvec{\varepsilon }}_i\Vert \) denote the modulus of the largest singular value of matrix \(\varvec{V}_i\) and vector \({\varvec{\varepsilon }}_i\), respectively.
(C2)
All marginal variances \(\varvec{A}_i\) are nonsingular and \(\mathop {sup}\nolimits _i\Vert \varvec{A}_i\Vert <\infty \). The link function \(h(\cdot )\) has a bounded second derivative and \(E\{\big (h(\cdot )\big )^{2+\delta _2}\}<\infty \) for some \(\delta _2>2\).
(C3)
The function \(f(\cdot )\) is r times continuously differentiable on (0, 1) with \(r \ge 2\). The inner knots \(\{a_i, i=1,\ldots , k_n\}\) satisfy
$$\begin{aligned} \mathop {\max }\limits _{1 \le i \le {k_n}}|\kappa _{i+1}-\kappa _i|=O(k_n^{-1}),~~~~~~\frac{{\mathop {\max }\nolimits _{1 \le i \le {k_n}} \kappa _i}}{{\mathop {\min }\nolimits _{1 \le i \le {k_n}} \kappa _i }} \le C_0, \end{aligned}$$where \(\kappa _i=a_i-a_{i-1}\) and \(C_0\) is a positive constant.
(C4)
The joint distribution function \(Q_{jl}(t,s)\) of any pair of \(t_{ij}\) and \(t_{il}\), the marginal distribution function \(Q_j(t)\) of \(t_{ij}\) have positive continuous density functions \(q_{jl}(t,s)\) and \(q_j(t)\) on \([0,1]\times [0, 1]\) and [0, 1], respectively.
(C5)
The probability function \(\pi _{ij}({{\varvec{\vartheta }}}_j)\) satisfies (a) it is twice differentiable with respect to \({{\varvec{\vartheta }}}_j\); (b) \(0<C_1<\pi _{ij}({{\varvec{\vartheta }}}_j)<1\) for a positive constant \(C_1\); (c) \(\partial \pi _{ij}({{\varvec{\vartheta }}}_j)/\partial {{\varvec{\vartheta }}}_j\) is uniformly bounded.
(C6)
There is a unique \({\varvec{\theta }}_0 \in {\mathcal {B}}\) satisfying \(E({\hat{{\varvec{g}}}}_i({\varvec{\theta }}))=0\) and \(E(\hat{{\varvec{\varrho }}}_i({\varvec{\theta }}))=0\), where \({\mathcal {B}}\) is the parameter space.
(C7)
Assume that \(n^{-1}{\sum \nolimits _{i = 1}^n {\varvec{g}}_{0i}({{\varvec{\theta }}}_0,{\varvec{\gamma }}_0){\varvec{g}}^T_{0i}({{\varvec{\theta }}}_0,{\varvec{\gamma }}_0)} \) and \(n^{-1}\sum \nolimits _{i = 1}^n {\varvec{\varrho }}_{0i}({{\varvec{\theta }}}_0,{\varvec{\gamma }}_0){\varvec{\varrho }}^T_{0i}({{\varvec{\theta }}}_0,{\varvec{\gamma }}_0) \) converge almost surely to positive definite matrices \({\varvec{\Lambda }}_g\) and \(\varvec{\Lambda }_{{\varvec{\varrho }}}\), respectively. Further, assume that \(n^{-1}{\sum \nolimits _{i = 1}^n cov({\hat{{\varvec{g}}}}_{0i}({{\varvec{\theta }}}_0,\varvec{{\hat{\gamma }}})) }\) and \(n^{-1}\sum \nolimits _{i = 1}^n cov(\hat{{\varvec{\varrho }}}_{0i}({{\varvec{\theta }}}_0,\varvec{{\hat{\gamma }}})) \) converge almost surely to positive definite matrices \({\varvec{\Sigma }}_g\) and \(\varvec{\Sigma }_{{\varvec{\varrho }}}\), respectively.
(C8)
Assume that \({\partial ^2 {\varvec{g}}_{0i}({{\varvec{\theta }}},{\varvec{\gamma }}_0)}/{\partial {{\varvec{\theta }}}\partial {{\varvec{\theta }}}^T}\) and \({\partial ^2 {\varvec{\varrho }}_{0i}({{\varvec{\theta }}},{\varvec{\gamma }}_0)}/{\partial {{\varvec{\theta }}}\partial {{\varvec{\theta }}}^T}\) are continuous in a neighborhood of \({{\varvec{\theta }}}_0\), and that \(\Vert {\partial ^2 {\varvec{g}}_{0i}({{\varvec{\theta }}},{\varvec{\gamma }}_0)}/{\partial {{\varvec{\theta }}}\partial {{\varvec{\theta }}}^T}\Vert \) and \(\Vert {\partial ^2 {\varvec{\varrho }}_{0i}({{\varvec{\theta }}},{\varvec{\gamma }}_0)}/{\partial {{\varvec{\theta }}}\partial {{\varvec{\theta }}}^T}\Vert \) can be bounded by some integrable functions in that neighborhood of \({{\varvec{\theta }}}_0\).
Further, suppose that \({\varvec{\xi }}_{j}^0\) is the unique solution to \(E\{{\varvec{s}}_{j}(\varvec{y}_i, \varvec{v}_i, {\varvec{r}}_i, {{\varvec{\xi }}}_{j})\}=0\) and the proposed model holds, \({{\varvec{\Omega }}}_{j} =E\{{\varvec{s}}_{j}(\varvec{y}_i, \varvec{v}_i, {\varvec{r}}_i, {{\varvec{\xi }}}_{j}^0){\varvec{s}}_{j}(\varvec{y}_i, \varvec{v}_i, {\varvec{r}}_i, {{\varvec{\xi }}}_{j}^0)^T\}\) is positive definite and the matrix \({\varvec{\Upsilon }}_{j} =E[\partial {\varvec{s}}_{j}(\varvec{y}_i, \varvec{v}_i, {\varvec{r}}_i, {{\varvec{\xi }}}_{j}^0)/\partial {{\varvec{\xi }}}_{j}]\) is of full rank. As \(n \rightarrow \infty \), \(\sqrt{n}(\hat{{{\varvec{\xi }}}}_{j}-{{\varvec{\xi }}}_{j}^0) {\mathop {\longrightarrow }\limits ^{{{d}}}}N(0, ({\varvec{\Upsilon }}_{j}^T{{\varvec{\Omega }}}_{j}{\varvec{\Upsilon }}_{j})^{-1})\) and \(\sqrt{n}(\hat{{\varvec{\gamma }}}-{\varvec{\gamma }}_0) {\mathop {\longrightarrow }\limits ^{{{d}}}}N(0,{\varvec{\Sigma }})\) in distribution. In the following we mainly treat the case of \(\hat{{{\varvec{g}}}}_{i}({{\varvec{\theta }}})\); the proof for \(\hat{{{\varvec{\varrho }}}}_{i}({{\varvec{\theta }}})\) follows from similar arguments.
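The asymptotic-normality statements in (C8) rest on sandwich-type variances built from a derivative matrix \({\varvec{\Upsilon }}\) and a second-moment matrix \({\varvec{\Omega }}\) of the estimating function. The sketch below is a hedged toy (an ordinary least-squares estimating equation with simulated heavy-tailed errors, not the paper's estimator) showing how the two building blocks combine into standard errors:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 5000
x = rng.normal(size=n)
y = 2.0 + 1.0 * x + rng.standard_t(df=5, size=n)   # heavy-tailed errors

X = np.c_[np.ones(n), x]
theta = np.linalg.lstsq(X, y, rcond=None)[0]       # root of the estimating equation
s = X * (y - X @ theta)[:, None]                   # estimating functions s_i(theta)

Upsilon = -(X.T @ X) / n                           # sample analogue of E[ds_i/dtheta]
Omega = s.T @ s / n                                # sample analogue of E[s_i s_i']
avar = np.linalg.inv(Upsilon) @ Omega @ np.linalg.inv(Upsilon).T
se = np.sqrt(np.diag(avar) / n)                    # sandwich standard errors
```

The same template, with the paper's \({\varvec{s}}_{j}\) in place of the least-squares score, produces the limiting covariance for \(\hat{{{\varvec{\xi }}}}_{j}\).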
Lemma 1
Assume the regularity conditions in Theorem 1 hold. As \(n \rightarrow \infty \),
Proof of Lemma 1
Consider the kth component of \({\hat{{\varvec{g}}}}_i({{\varvec{\theta }}}_0,\hat{{\varvec{\gamma }}})\),
where \({\varvec{\eta }}_{0i}=\varvec{x}_i{{\varvec{\beta }}}_0+\varvec{m}_i{{\varvec{\alpha }}}_0\), \({{\varvec{\mu }}}_{0i}=h(\varvec{x}_i{{\varvec{\beta }}}_0+\varvec{m}_i{{\varvec{\alpha }}}_0)\) and \(\varvec{H}'( {{{\varvec{\eta }}_{0i}}} )\,\mathrm{{= diag}}( {h'( {{{\varvec{\eta }}_{0i1}}} ),\dots , h'( {{{\varvec{\eta }} _{0im}}} )} )\), \(h'(t)={{dh( t )} / {dt}}\). Note that
Applying the Taylor expansion to the first two terms in (11) around \(( {{{\varvec{\beta }} _0},{{\varvec{\alpha }} _0}})\), we have
where \(\varvec{w}_{i}\) is between \({\varvec{\eta }}_{0i}\) and \({\varvec{x}_i}{{\varvec{\beta }} _0} + {f_0}( {{\varvec{t}_i}})\) and \( {\varvec{R}}( {{\varvec{t}_i}} ) = {( {{R}( {{t_{i1}}} ),\dots ,{R}( {{t_{im}}} )} )^T}\), \({\varvec{R}^*}( {{\varvec{t}_i}} ) = {( {{R^2}( {{t_{i1}}} ),\dots ,{R^2}( {{t_{im}}} )} )^T}\), \({R}( {{t_{ij}}} ) = {f_0}( {{t_{ij}}} ) - {\varvec{m} ^T}( {{t_{ij}}} ){{\varvec{\alpha }} _0},\) \( \varvec{H}''( {{\varvec{w}_i}} )\mathrm{{= diag}}( h''( {{\varvec{w} _{i1}}} ),\dots , h''( {{\varvec{w} _{im}}} ) ),\) \(h''(t)={{d^2h( t )} / {dt^2}}. \)
From conditions (C3)–(C4) and Corollary 6.21 in Schumaker (1981), we obtain \(\Vert {{R}( {{t_{ij}}} )} \Vert = O_p( {k_n^{- r}})\) and \(\Vert {\varvec{m}( {{t_{ij}}})}\Vert = O_p( 1 )\). By substituting (12) into (11), we have
Under conditions (C3)–(C4) and through a simple calculation, we derive that \(\varvec{I}_2=O_p(n^{-1}n^{1/2}{k_n^{-r}})=o_p(n^{-1/2})\) and \(\varvec{I}_3=o_p(n^{-1/2})\). Thus,
where \({\hat{{\varvec{g}}}}_{0ik}({{\varvec{\theta }}}_0,\hat{{\varvec{\gamma }}})={\dot{{\varvec{\mu }} _i}^T\varvec{A}_i^{- 1/2}{\varvec{B}_k}\varvec{A}_i^{- 1/2}{{\hat{\varvec{W}}}_i}} {{\varvec{\varepsilon }} _i}\) and \({\hat{{\varvec{g}}}}_{0i}({{\varvec{\theta }}}_0,\hat{{\varvec{\gamma }}})=({\hat{{\varvec{g}}}}^T_{0i1},\ldots ,{\hat{{\varvec{g}}}}^T_{0is})^T\). Obviously, it leads to \(E({\hat{{\varvec{g}}}}_{0ik}({{\varvec{\theta }}}_0,\hat{{\varvec{\gamma }}}))=0\). Applying the Taylor expansion to \({\hat{{\varvec{g}}}}_{0ik}({{\varvec{\theta }}}_0,\hat{{\varvec{\gamma }}})\) around \({\varvec{\gamma }}_0\), it leads to
where \(\varvec{I}_{n1}= {{\varvec{g}}_{0ik}( {{{\varvec{\theta }} _0},{{\varvec{\gamma }} _0}})}=\dot{{\varvec{\mu }} _i}^T\varvec{A}_i^{- 1/2}{\varvec{B}_k}\varvec{A}_i^{- 1/2}{\varvec{W}}_i {\varvec{\varepsilon }}_i\) and \(\varvec{I}_{n2}\) is the rest part of the above equation. Under (C1)–(C2) and (C5), it can be seen that
where \(\varvec{T}_i\) is a symmetric matrix with its \((j, j')\) element \((j\le j')\) as \(\varepsilon _{ij}\varepsilon _{ij'}/\pi _{ij}\). Therefore,
and
where \(\varvec{N}_{lk}=\dot{{\varvec{\mu }} _i}^T\varvec{A}_i^{- 1/2}{\varvec{B}_l}\varvec{A}_i^{- 1/2}\varvec{T}_i\varvec{A}_i^{- 1/2}{\varvec{B}_k}\varvec{A}_i^{- 1/2}\dot{{\varvec{\mu }} _i}\) and \(\varvec{U}_{lk}=E\big [ {\partial {{\varvec{g}}_{0il}}\big ( {{{\varvec{\theta }} _0},{{\varvec{\gamma }} _0}} \big )} /{\partial {\varvec{\gamma }} }\big ] {\varvec{\Sigma }} {E^T}\big [ {{{\partial {{\varvec{g}}_{0ik}}\big ( {{{\varvec{\theta }} _0},{{\varvec{\gamma }} _0}} \big )} / {\partial {{\varvec{\gamma }}}}}} \big ]\) for \(l, k=1,\ldots ,s\). Under the conditions (C6)–(C7), it can be verified that
in probability and then
in distribution. Moreover, we can obtain \(n^{-1}\sum \nolimits _{i = 1}^n{\hat{{\varvec{g}}}}_i({\varvec{\theta }}_0,\hat{{\varvec{\gamma }}})=O_p(n^{-1/2})\). Noting that \(\sqrt{n}(\varvec{{\hat{\gamma }}}-{\varvec{\gamma }}_0)=O_p(1)\) and \({\partial {\varvec{g}}_{0ik}( {{\varvec{\theta }}}_0,{{\varvec{\gamma }}_0} ) \big / \partial {{\varvec{\gamma }}}}=O_p(1)\) from (C5) and (C8), we can obtain
Hence
According to (3),
which leads to \( {\mathop {\max }\nolimits _{1 \le i \le n} \Vert {{{{\hat{{\varvec{g}}}}}_{i}}} ( {{{\varvec{\theta }} _0,\varvec{{\hat{\gamma }}}}} ) \Vert } =o_p(n^{1/2})\). The proof of Lemma 1 is completed. \(\square \)
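The spline approximation rate \(\Vert R(t_{ij})\Vert = O_p(k_n^{-r})\) invoked in the proof above can be observed numerically. The toy below uses piecewise-linear interpolation (the \(r=2\) case) rather than the higher-order B-splines of the paper; the sup-norm error shrinks at rate \(k_n^{-2}\) as the number of interior knots grows:

```python
import numpy as np

def interp_sup_error(f, k_n, n_grid=4001):
    # piecewise-linear spline through k_n equally spaced interior knots on [0, 1]
    knots = np.linspace(0.0, 1.0, k_n + 2)          # includes both endpoints
    x = np.linspace(0.0, 1.0, n_grid)
    return np.max(np.abs(np.interp(x, knots, f(knots)) - f(x)))

f = lambda t: np.sin(2 * np.pi * t)                 # smooth test function
errs = [interp_sup_error(f, k_n) for k_n in (4, 9, 19)]
# the error decays like k_n^{-r} with r = 2 for this linear spline
```

Equally spaced knots also satisfy condition (C3) trivially, since all gaps \(\kappa _i\) are identical, so the gap ratio is 1.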
Lemma 2
Assume the regularity conditions in Theorem 1 hold. We have
where \({\hat{{\varvec{g}}}}({{\varvec{\theta }}}_0)=n^{-1}\sum \nolimits _{i = 1}^n{\hat{{\varvec{g}}}}_i({{\varvec{\theta }}}_0)\) and \(\varvec{S}_n( {{{{\varvec{\theta }}}_0}} ) = {n^{- 1}}\sum \nolimits _{i = 1}^n {{{{\hat{{\varvec{g}}}}}_i}( {{{{\varvec{\theta }}}_0}})} {{{{\hat{{\varvec{g}}}}}_i}( {{{{\varvec{\theta }}}_0}})}^T\).
Proof of Lemma 2
The Lagrange multiplier method leads to the empirical log-likelihood ratio function for \({{\varvec{\theta }}}_0\) as
where vector \({{\varvec{\lambda }}}\) is the solution to the equation
Write \({{\varvec{\lambda }}} = \rho {\varvec{{\varvec{u}}}}\), where \( \rho \ge 0\) and \({\varvec{u}}\in {R^p}\) is a unit vector with \(\big \Vert {\varvec{u}}\big \Vert = 1\). Substituting \({1 \big / {\big ( {1 + {{\varvec{\lambda }} ^T}{{{\hat{{\varvec{g}}}} }_{i}}\big ( {{{{\varvec{\theta }}}}} \big )} \big )}} = 1 - {{{{\varvec{\lambda }} ^T}{{{\hat{{\varvec{g}}}} }_{i}}\big ( {{{{\varvec{\theta }}}}} \big )} \big / {\big ( {1 + {{\varvec{\lambda }} ^T}{{{\hat{{\varvec{g}}}} }_{i}}\big ( {{{{\varvec{\theta }}}}} \big )} \big )}}\) and \({{\varvec{\lambda }}} = \rho {\varvec{{\varvec{u}}}}\) into the above equation, we have
where \(\varvec{S}_n( {{{{\varvec{\theta }}}_0}} ) = {n^{- 1}}\sum \nolimits _{i = 1}^n {{{{\hat{{\varvec{g}}}}}_i}( {{{{\varvec{\theta }}}_0}})} {{{{\hat{{\varvec{g}}}}}_i}( {{{{\varvec{\theta }}}_0}})}^T={\varvec{\Lambda }}_g(1+o_p(1))\) and the inequality follows from the positivity of \((1+\rho \varvec{u}^T{\hat{\varvec{g}}_i({{\varvec{\theta }}}_0)})\). Therefore, it can be shown that
Noting that \(\max \nolimits _{1\le i\le n}|| \hat{{{\varvec{g}}}}_{i}({{{\varvec{\theta }}}_0}) | |=o_p(n^{1/2})\) and according to the conclusion provided in the proof of Lemma 1, we have
which leads to \( \rho =O_{p}(n^{-1/2}), \) i.e., \(|| {{\varvec{\lambda }}}||=O_{p}(n^{-1/2}).\) Naturally,
Expanding the equation \(D({\varvec{\lambda }})\), we get
where \(\xi _i\in (0, {{\varvec{\lambda }}}^{T}\hat{{{\varvec{g}}}}_{i}({{{\varvec{\theta }}}_0})).\) Using the fact that \(\max \nolimits _{1\le i\le n}|| {{\varvec{\lambda }}}^{T}\hat{{{\varvec{g}}}}_{i}({{{\varvec{\theta }}}_0}) | |=o_{p}(1), \) we obtain \(|\xi _i| =o_{p}(1).\) Since
thus
where \(| |{\varvec{\zeta }}| |=o_{p}(n^{-1/2}).\) A Taylor expansion of \(\hat{R}_Q({{\varvec{\theta }}}_0)\) yields
Similarly,
therefore,
Substituting (13) into (14), it holds that
A simple calculation shows that
Therefore,
Since \({\hat{{{\varvec{\theta }}}}}_Q=\mathop {\arg \min }\nolimits _{{{\varvec{\theta }}}\in {\mathcal {B}}} {\hat{R}_Q({{\varvec{\theta }}})}\), we have two properties of \({\hat{R}}_Q({{\varvec{\theta }}}_0)\) as follows:
where
Since \(n^{-1}\sum \nolimits _{i = 1}^n{\hat{{\varvec{g}}}}_i({\varvec{\theta }}_0,\hat{{\varvec{\gamma }}})=O_p(n^{-1/2})\), together with conditions (C5)–(C8), we have \(R_{n1}=O_p(n^{-1})\) and \(R_{n2}=o_p(1)\). The proof of Lemma 2 is completed. \(\square \)
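The multiplier \({{\varvec{\lambda }}}\) appearing throughout the proof of Lemma 2 is the root of \(n^{-1}\sum _i \hat{{\varvec{g}}}_i({{\varvec{\theta }}}_0)/(1+{{\varvec{\lambda }}}^T\hat{{\varvec{g}}}_i({{\varvec{\theta }}}_0))=0\). A minimal numeric sketch of solving this dual equation by damped Newton iteration, with hypothetical simulated data standing in for \(\hat{{\varvec{g}}}_i({{\varvec{\theta }}}_0)\):

```python
import numpy as np

def el_lambda(g, iters=50):
    """Solve (1/n) * sum_i g_i / (1 + lam' g_i) = 0 for the EL multiplier lam."""
    n, p = g.shape
    lam = np.zeros(p)
    for _ in range(iters):
        denom = 1.0 + g @ lam
        grad = (g / denom[:, None]).mean(axis=0)
        hess = -(g[:, :, None] * g[:, None, :]
                 / (denom ** 2)[:, None, None]).mean(axis=0)
        step = np.linalg.solve(hess, grad)
        while np.any(1.0 + g @ (lam - step) <= 1e-8):
            step *= 0.5                     # damp to keep every 1 + lam'g_i > 0
        lam = lam - step
    return lam

rng = np.random.default_rng(1)
g = (rng.normal(size=500) - 0.1)[:, None]   # estimating function for H0: mean = 0.1
lam = el_lambda(g)
w = 1.0 / (g.shape[0] * (1.0 + g @ lam))    # implied empirical likelihood weights
```

At the solution the implied weights sum to one and satisfy the moment constraint exactly, which is the identity used when expanding \(D({\varvec{\lambda }})\) in the proof.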
Proof of Theorem 1
Similar to the proof of Lemma 1 in Qin and Lawless (1994), under conditions (C3)–(C4) and (C6)–(C7), we have \(\hat{{{\varvec{\theta }}}}_Q {\mathop {\longrightarrow }\limits ^{{{p}}}}{{\varvec{\theta }}}_0\). This implies \(\Vert {\hat{{{\varvec{\beta }}}}}_Q-{{\varvec{\beta }}}_0\Vert =O_p(n^{-1/2})\) and \(\Vert {\hat{{{\varvec{\alpha }}}}}_Q-{{\varvec{\alpha }}}_0\Vert =O_p(n^{-1/2})\). Under (C3)–(C4),
Since \(\Vert R(t_{ij})\Vert =O_p(k_n^{-r})=O_p(n^{-r/(2r+1)})\), we obtain
Proof of Theorem 2
Denote \(\varvec{L}_{1n}({{\varvec{\beta }}},{{\varvec{\alpha }}})\) and \(\varvec{L}_{2n}({{\varvec{\beta }}},{{\varvec{\alpha }}})\) as the first derivatives of \({\hat{R}}_Q({{\varvec{\theta }}})\) with respect to \({{\varvec{\beta }}}\) and \({{\varvec{\alpha }}}\), respectively. Obviously, \(\varvec{L}_{1n}({\hat{{{\varvec{\beta }}}}}_Q,{\hat{{{\varvec{\alpha }}}}}_Q)=0\) and \(\varvec{L}_{2n}({\hat{{{\varvec{\beta }}}}}_Q,{\hat{{{\varvec{\alpha }}}}}_Q)=0\). Applying a Taylor expansion to \(\varvec{L}_{1n}\) around \(({{\varvec{\beta }}}_0,{{\varvec{\alpha }}}_0)\), it follows that
where \(({\tilde{{{\varvec{\beta }}}}}_0^T,{\tilde{{{\varvec{\alpha }}}}}_0^T)^T\) is between \(({{\varvec{\beta }}}_0^T,{{\varvec{\alpha }}}_0^T)^T\) and \(({\hat{{{\varvec{\beta }}}}}_Q^T,{\hat{{{\varvec{\alpha }}}}}_Q^T)^T\). Since
we have
where \(\varvec{S}^{-1}_{n,k_1k_2}\) represents the \((k_1,k_2)\) part of \(\varvec{S}_n^{-1}\). Let \(\tilde{\varvec{x}}_{l_1}=\varvec{H}'({\varvec{\eta }}_{0i})\varvec{x}_{l_1}\), \(\tilde{\varvec{R}}(\varvec{t}_{l_2})=\varvec{H}'({\varvec{\eta }}_{0{l_2}})\varvec{R}(\varvec{t}_{l_2})\) and for simplicity,
and
Similar to the proof of Lemma 1, we have \(\hat{{\varvec{\tau }}}_{l_1l_2}={\varvec{\tau }}_{l_1l_2}(1+o_p(1))\). Therefore, we have
where \(\tilde{\varvec{m}}_{l_2}=\varvec{H}'({\varvec{\eta }}_{0l_2})\varvec{m}_{l_2}\). Thus,
Similarly,
From the Eq. (16), we have
where \(\varvec{V}_g=n^{-2}{\sum \nolimits _{l_1,l_2= 1}^n\tilde{\varvec{m}}_{l_1}^T\hat{{\varvec{\tau }}}_{l_1l_2}\tilde{\varvec{m}}_{l_2}}\), \(\varvec{K}_g=n^{-2}\sum \nolimits _{l_1,l_2 = 1}^n\tilde{\varvec{m}}_{l_1}^T\hat{{\varvec{\tau }}}_{l_1l_2}\tilde{\varvec{x}}_{l_2}\) and \(\varvec{P}_g=n^{-2}\sum \nolimits _{l_1,l_2= 1}^n\tilde{\varvec{m}}_{l_1}^T \hat{{\varvec{\tau }}}_{l_1l_2} \{\tilde{\varvec{R}}(\varvec{t}_{l_2})+{\varvec{\varepsilon }}_{l_2}\}\). By the law of large numbers, it follows that \(\varvec{V}_g=n^{-2}{\sum \nolimits _{l_1,l_2= 1}^n\tilde{\varvec{m}}_{l_1}^T\hat{{\varvec{\tau }}}_{l_1l_2}\tilde{\varvec{m}}_{l_2}} {\mathop {\longrightarrow }\limits ^{{{p}}}}E(\tilde{\varvec{m}}^T{\varvec{\tau }}\tilde{\varvec{m}})\), \(\varvec{K}_g=n^{-2}\sum \nolimits _{l_1,l_2 = 1}^n\tilde{\varvec{m}}_{l_1}^T\hat{{\varvec{\tau }}}_{l_1l_2}\tilde{\varvec{x}}_{l_2} {\mathop {\longrightarrow }\limits ^{{{p}}}}E(\tilde{\varvec{m}}^T{\varvec{\tau }}\tilde{\varvec{x}})\) with \(\tilde{\varvec{x}}=(\tilde{\varvec{x}}_1^T,\ldots ,\tilde{\varvec{x}}_n^T)^T\), \(\tilde{\varvec{m}}=(\tilde{\varvec{m}}_1^T,\ldots ,\tilde{\varvec{m}}_n^T)^T\) and
By substituting (17) into the Eq. (15), we obtain
By a simple calculation, we can obtain
By defining \(\varvec{x}_i^*=\tilde{\varvec{x}}^T_i-\varvec{K}_g^T\varvec{V}_g^{-1}\tilde{\varvec{m}}_i^T\) and substituting it into (18), it leads to
Note that
where
Using the similar arguments in the proof of Lemma 1, we have
where \(\varvec{B}_{g} = E\big ( {{\varvec{z}_{gl_2}}( {{{{\varvec{\beta }}}_0},{{\varvec{\gamma }} _0}} )\varvec{z}_{gl_2}^T( {{{{\varvec{\beta }}}_0},{{\varvec{\gamma }} _0}} )} \big )=E\big \{[\tilde{\varvec{x}}^T{\varvec{\tau }}-\tilde{\varvec{x}}^T{\varvec{\tau }}\tilde{\varvec{m}}[\tilde{\varvec{m}}^T{\varvec{\tau }}\tilde{\varvec{m}}]^{-1}\tilde{\varvec{m}}^T{\varvec{\tau }}] {{\varvec{\varepsilon }}}\big \}^{\otimes 2}\) and \({\varvec{\Psi }}_{g}=\varvec{B}_{g}+E( {{{{\varvec{z}_{gl_2}}( {{{{\varvec{\beta }}}_0},{{\varvec{\gamma }} _0}} )} / {\partial {{\varvec{\gamma }}} }}} ){{\varvec{\Sigma }}}E( {{{{{{\varvec{z}_{gl_2}}( {{{{\varvec{\beta }}}_0},{{\varvec{\gamma }} _0}} )} / {\partial {{\varvec{\gamma }}} }}}^T}} )\). Then we have \(\varvec{A}_1{\mathop {\longrightarrow }\limits ^{d}}N(0,{\varvec{\Psi }}_g)\) in distribution. Since
then \(\varvec{A}_2=0\). Expanding \({\varvec{x}}_{l_1}^*\) in \(\varvec{A}_3\), we have
From the definition of \(\varvec{K}_g\), it is easy to verify that
which implies
Recalling that \(\Vert {\varvec{m}( {{t_{ij}}})}\Vert = O_p( 1 )\) and \(\Vert {{R}( {{t_{ij}}} )} \Vert =o_p(1)\), we have \(\varvec{A}_{31}=o_p(1)\). Similarly, we obtain \(\varvec{A}_{32}=o_p(1)\) and then \(\varvec{A}_{3}=o_p(1)\). By the law of large numbers, we have
in probability, where \({\varvec{\Gamma }}_g=E\big \{\tilde{\varvec{x}}^T{\varvec{\tau }}\tilde{\varvec{x}}-\tilde{\varvec{x}}^T{\varvec{\tau }}\tilde{\varvec{m}}[\tilde{\varvec{m}}^T{\varvec{\tau }}\tilde{\varvec{m}}]^{-1}\tilde{\varvec{m}}^T{\varvec{\tau }}\tilde{\varvec{x}}\big \}\). Hence,
in distribution and the proof of Theorem 2 is completed. \(\square \)
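The projection step \(\varvec{x}_i^*=\tilde{\varvec{x}}^T_i-\varvec{K}_g^T\varvec{V}_g^{-1}\tilde{\varvec{m}}_i^T\) in the proof of Theorem 2 is, under identity working weights, the familiar partialling-out of the spline basis from the parametric covariates. A hedged sketch with simulated data (working independence, a random matrix standing in for the B-spline basis) showing that partialling out \(m\) reproduces the coefficient of \(x\) from the full least-squares fit:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 1000
x = rng.normal(size=(n, 2))                 # parametric covariates
m = rng.normal(size=(n, 3))                 # stand-in for the B-spline basis
y = x @ np.array([1.5, -0.7]) + m @ np.array([0.5, 0.2, -0.3]) \
    + rng.normal(scale=0.1, size=n)

# full least-squares fit on (x, m): keep the coefficients of x
full = np.linalg.lstsq(np.c_[x, m], y, rcond=None)[0][:2]

# partial out the span of m from both x and y, then regress
P = m @ np.linalg.solve(m.T @ m, m.T)       # projection onto the columns of m
x_star, y_star = x - P @ x, y - P @ y
partial = np.linalg.lstsq(x_star, y_star, rcond=None)[0]
```

This is the Frisch–Waugh–Lovell identity; in the theorem the same algebra, with the weights \(\hat{{\varvec{\tau }}}_{l_1l_2}\), isolates the asymptotic behavior of \({\hat{{{\varvec{\beta }}}}}_Q\) from the nonparametric component.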
Shao, Y., Wang, L. Generalized partial linear models with nonignorable dropouts. Metrika 85, 223–252 (2022). https://doi.org/10.1007/s00184-021-00828-z