Generalized partial linear models with nonignorable dropouts

Published in Metrika.

Abstract

In the presence of longitudinal data with nonignorable dropouts, we propose improved estimators for generalized partial linear models that accommodate both within-subject correlations and nonignorable missing data. To address the identifiability problem, an instrumental covariate, which is related to the response variable but unrelated to the propensity given the response variable and the other covariates, is used to construct sufficient instrumental estimating equations. The nonparametric function is then approximated by B-spline basis functions, and bias-corrected generalized estimating equations are constructed via inverse probability weighting. To incorporate the within-subject correlations under an informative working correlation structure, we borrow the ideas of the quadratic inference function and hybrid-GEE to construct improved empirical likelihood procedures. Under some regularity conditions, we establish the asymptotic normality of the proposed estimators of the parametric components and the convergence rate of the estimators of the nonparametric function. The finite-sample performance of the proposed estimators is studied through simulations, and an application to an HIV-CD4 data set is also presented.
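As a toy illustration of the inverse probability weighting idea behind the bias-corrected estimating equations, the sketch below estimates a mean when the response is missing with probability depending on the response itself, i.e. a nonignorable mechanism. The propensity is taken as known here to isolate the weighting step (in the paper it must be identified through an instrumental covariate and estimated); all names and values are illustrative and not taken from the paper.

```python
import numpy as np

# Toy nonignorable-dropout example (not the paper's estimator):
# estimate E[y] when the chance of observing y depends on y itself.
rng = np.random.default_rng(0)
n = 2000
y = 1.0 + rng.normal(size=n)                 # true mean is 1

# Propensity depends on the (possibly unobserved) response; taken
# as known here to isolate the inverse-probability-weighting step.
pi = 1.0 / (1.0 + np.exp(-y))
r = rng.binomial(1, pi)                      # r = 1: response observed

mu_cc = y[r == 1].mean()                     # complete-case mean: biased upward
mu_ipw = np.sum(r / pi * y) / np.sum(r / pi) # IPW (Hajek) mean: bias-corrected
print(mu_cc, mu_ipw)
```

Because large responses are more likely to be observed, the complete-case mean overshoots the true value, while the weighted mean removes most of that bias.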

Fig. 1

Fig. 2

References

  • Bai Y, Fung WK, Zhu Z (2010) Weighted empirical likelihood for generalized linear models with longitudinal data. J Stat Plan Inference 140:3446–3456

  • Boente G, He X, Zhou J (2006) Robust estimates in generalized partially linear models. Ann Stat 34:2856–2878

  • Boente G, Rodriguez D (2010) Robust inference in generalized partially linear models. Comput Stat Data Anal 54:2942–2966

  • Chen B, Zhou X (2013) Generalized partially linear models for incomplete longitudinal data in the presence of population-level information. Biometrics 69:386–395

  • Chen X, Christensen T (2015) Optimal uniform convergence rates and asymptotic normality for series estimators under weak dependence and weak conditions. J Econom 188:447–465

  • Cho H, Qu A (2015) Efficient estimation for longitudinal data by combining large-dimensional moment conditions. Electron J Stat 9:1315–1334

  • Diggle P, Kenward MG (1994) Informative drop-out in longitudinal data analysis (with discussion). J R Stat Soc Ser C (Appl Stat) 43:49–93

  • Fang F, Shao J (2016) Model selection with nonignorable nonresponse. Biometrika 103:861–874

  • Fitzmaurice GM, Molenberghs G, Lipsitz SR (1995) Regression models for longitudinal binary responses with informative drop-outs. J R Stat Soc Ser B (Stat Methodol) 57:691–704

  • Fu L, Wang Y (2012) Quantile regression for longitudinal data with a working correlation model. Comput Stat Data Anal 56:2526–2538

  • Hansen LP (1982) Large sample properties of generalized method of moments estimators. Econometrica 50:1029–1054

  • He X, Shi P (1996) Bivariate tensor-product B-splines in a partly linear model. J Multivar Anal 58:162–181

  • He X, Zhu Z, Fung WK (2002) Estimation in a semiparametric model for longitudinal data with unspecified dependence structure. Biometrika 89:579–590

  • Holland A (2017) Penalized spline estimation in the partially linear model. J Multivar Anal 153:211–235

  • Huang J, Liu L, Liu N (2007) Estimation of large covariance matrices of longitudinal data with basis function approximations. J Comput Graph Stat 16:189–209

  • Kim JK, Yu CL (2011) A semiparametric estimation of mean functionals with nonignorable missing data. J Am Stat Assoc 106:157–165

  • Koenker R, Bassett G Jr (1978) Regression quantiles. Econometrica 46:33–50

  • Kott PS, Chang T (2010) Using calibration weighting to adjust for nonignorable unit nonresponse. J Am Stat Assoc 105:1265–1275

  • Leng C, Zhang W (2014) Smoothing combined estimating equations in quantile regression for longitudinal data. Stat Comput 24:123–136

  • Leng C, Zhang W, Pan J (2010) Semiparametric mean-covariance regression analysis for longitudinal data. J Am Stat Assoc 105:181–193

  • Leung D, Wang Y, Zhu M (2009) Efficient parameter estimation in longitudinal data analysis using a hybrid-GEE method. Biostatistics 10:436–445

  • Li D, Pan J (2013) Empirical likelihood for generalized linear models with longitudinal data. J Multivar Anal 114:63–73

  • Liang H, Qin Y, Zhang X, Ruppert D (2009) Empirical likelihood-based inferences for generalized partially linear models. Scand J Stat 36:433–443

  • Liang K, Zeger S (1986) Longitudinal data analysis using generalized linear models. Biometrika 73:13–22

  • Lin H, Qin G, Zhang J, Fung WK (2018) Doubly robust estimation of partially linear models for longitudinal data with dropouts and measurement error in covariates. Statistics 52:84–98

  • Little RJA, Rubin DB (2002) Statistical analysis with missing data, 2nd edn. Wiley, New York

  • Lv J, Guo C, Yang H, Li Y (2017) A moving average Cholesky factor model in covariance modeling for composite quantile regression with longitudinal data. Comput Stat Data Anal 112:129–144

  • Molenberghs G, Kenward M (2007) Missing data in clinical studies. Wiley, London

  • Qin G, Zhu Z, Fung WK (2016) Robust estimation of generalized partially linear model for longitudinal data with dropouts. Ann Inst Stat Math 68:977–1000

  • Qin J, Lawless J (1994) Empirical likelihood and general estimating equations. Ann Stat 22:300–325

  • Qu A, Lindsay BG, Li B (2000) Improving generalised estimating equations using quadratic inference functions. Biometrika 87:823–836

  • Robins JM, Rotnitzky A, Zhao L (1994) Estimation of regression coefficients when some regressors are not always observed. J Am Stat Assoc 89:846–866

  • Schumaker LL (1981) Spline functions: basic theory. Cambridge University Press, Cambridge

  • Shao J, Wang L (2016) Semiparametric inverse propensity weighting for nonignorable missing data. Biometrika 103:175–187

  • Tang G, Little RJA, Raghunathan TE (2003) Analysis of multivariate missing data with nonignorable nonresponse. Biometrika 90:747–764

  • Wang L, Qi C, Shao J (2019) Model-assisted regression estimators for longitudinal data with nonignorable dropout. Int Stat Rev 87:S121–S138

  • Wang S, Shao J, Kim JK (2014) An instrumental variable approach for identification and estimation with nonignorable nonresponse. Stat Sin 24:1097–1116

  • Wolberg G, Alfy I (2002) An energy-minimization framework for monotonic cubic spline interpolation. J Comput Appl Math 143:145–188

  • Zhang W, Leng C (2011) A moving average Cholesky factor model in covariance modelling for longitudinal data. Biometrika 99:141–150

  • Zhang W, Leng C, Tang CY (2015) A joint modelling approach for longitudinal studies. J R Stat Soc Ser B (Stat Methodol) 77:219–238

  • Zhao J, Shao J (2015) Semiparametric pseudo-likelihoods in generalized linear models with nonignorable missing data. J Am Stat Assoc 110:1577–1590

  • Zhou J, Qu A (2012) Informative estimation and selection of correlation structure for longitudinal data. J Am Stat Assoc 107:701–710

  • Zhu Z, Fung W, He X (2008) On the asymptotics of marginal regression splines with longitudinal data. Biometrika 95:907–917


Acknowledgements

The authors sincerely thank Professor Maria Kateri, an Associate Editor, and two reviewers for their insightful comments, which greatly improved this paper. This work was supported by the National Natural Science Foundation of China under Grant Nos. 11871287, 11771144 and 11801359, the Natural Science Foundation of Tianjin under Grant No. 18JCYBJC41100, the Fundamental Research Funds for the Central Universities, and the Key Laboratory for Medical Data Analysis and Statistical Research of Tianjin.

Author information


Corresponding author

Correspondence to Lei Wang.

Ethics declarations

Conflict of interest

On behalf of all authors, the corresponding author states that there is no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 205 KB)

Appendix


  1. (C1)

The covariate vectors are fixed and the first four moments of \(y_{ij}\) exist. The cluster sizes \({m_i}\) form a bounded sequence of positive integers. The random error \({\varvec{\varepsilon }}_i=\varvec{y}_i-{{\varvec{\mu }}}_i\) satisfies \(E({\varvec{\varepsilon }}_i{\varvec{\varepsilon }}_i^T)=\varvec{V}_i\) with \(\sup \nolimits _{i}\Vert \varvec{V}_i\Vert <\infty \), and there exists a positive constant \(\delta _1\) such that \(\sup \nolimits _{i}E\Vert {\varvec{\varepsilon }}_i\Vert ^{2+\delta _1}<\infty \), where \(\Vert \varvec{V}_i\Vert \) and \(\Vert {\varvec{\varepsilon }}_i\Vert \) denote the largest singular value of the matrix \(\varvec{V}_i\) and the Euclidean norm of the vector \({\varvec{\varepsilon }}_i\), respectively.

  2. (C2)

All marginal variance matrices \(\varvec{A}_i\) are nonsingular and \(\mathop {sup}\nolimits _i\Vert \varvec{A}_i\Vert <\infty \). The link function \(h(\cdot )\) has a bounded second derivative and \(E\{\big (h(\cdot )\big )^{2+\delta _2}\}<\infty \) for some \(\delta _2>2\).

  3. (C3)

    The function \(f(\cdot )\) is r times continuously differentiable on (0, 1) with \(r \ge 2\). The inner knots \(\{a_i, i=1,\ldots , k_n\}\) satisfy

    $$\begin{aligned} \mathop {\max }\limits _{1 \le i \le {k_n}}|\kappa _{i+1}-\kappa _i|=O(k_n^{-1}),~~~~~~\frac{{\mathop {\max }\nolimits _{1 \le i \le {k_n}} \kappa _i}}{{\mathop {\min }\nolimits _{1 \le i \le {k_n}} \kappa _i }} \le C_0, \end{aligned}$$

    where \(\kappa _i=a_i-a_{i-1}\) and \(C_0\) is a positive constant.

  4. (C4)

    The joint distribution function \(Q_{jl}(t,s)\) of any pair of \(t_{ij}\) and \(t_{il}\), the marginal distribution function \(Q_j(t)\) of \(t_{ij}\) have positive continuous density functions \(q_{jl}(t,s)\) and \(q_j(t)\) on \([0,1]\times [0, 1]\) and [0, 1], respectively.

  5. (C5)

    The probability function \(\pi _{ij}({{\varvec{\vartheta }}}_j)\) satisfies (a) it is twice differentiable with respect to \({{\varvec{\vartheta }}}_j\); (b) \(0<C_1<\pi _{ij}({{\varvec{\vartheta }}}_j)<1\) for a positive constant \(C_1\); (c) \(\partial \pi _{ij}({{\varvec{\vartheta }}}_j)/\partial {{\varvec{\vartheta }}}_j\) is uniformly bounded.

  6. (C6)

    There is a unique \({\varvec{\theta }}_0 \in {\mathcal {B}}\) satisfying \(E({\hat{{\varvec{g}}}}_i({\varvec{\theta }}))=0\) and \(E(\hat{{\varvec{\varrho }}}_i({\varvec{\theta }}))=0\), where \({\mathcal {B}}\) is the parameter space.

  7. (C7)

Assume that \(n^{-1}{\sum \nolimits _{i = 1}^n {\varvec{g}}_{0i}({{\varvec{\theta }}}_0,{\varvec{\gamma }}_0){\varvec{g}}^T_{0i}({{\varvec{\theta }}}_0,{\varvec{\gamma }}_0)} \) and \(n^{-1}\sum \nolimits _{i = 1}^n {\varvec{\varrho }}_{0i}({{\varvec{\theta }}}_0,{\varvec{\gamma }}_0){\varvec{\varrho }}^T_{0i}({{\varvec{\theta }}}_0,{\varvec{\gamma }}_0) \) converge almost surely to positive definite matrices \({\varvec{\Lambda }}_g\) and \(\varvec{\Lambda }_{{\varvec{\varrho }}}\), respectively. Further, assume that \(n^{-1}{\sum \nolimits _{i = 1}^n cov({\hat{{\varvec{g}}}}_{0i}({{\varvec{\theta }}}_0,\varvec{{\hat{\gamma }}})) }\) and \(n^{-1}\sum \nolimits _{i = 1}^n cov(\hat{{\varvec{\varrho }}}_{0i}({{\varvec{\theta }}}_0,\varvec{{\hat{\gamma }}})) \) converge almost surely to positive definite matrices \({\varvec{\Sigma }}_g\) and \(\varvec{\Sigma }_{{\varvec{\varrho }}}\), respectively.

  8. (C8)

Assume that \({\partial ^2 {\varvec{g}}_{0i}({{\varvec{\theta }}},{\varvec{\gamma }}_0)}/{\partial {{\varvec{\theta }}}\partial {{\varvec{\theta }}}^T}\) and \({\partial ^2 {\varvec{\varrho }}_{0i}({{\varvec{\theta }}},{\varvec{\gamma }}_0)}/{\partial {{\varvec{\theta }}}\partial {{\varvec{\theta }}}^T}\) are continuous in a neighborhood of \({{\varvec{\theta }}}_0\), and that \(\Vert {\partial ^2 {\varvec{g}}_{0i}({{\varvec{\theta }}},{\varvec{\gamma }}_0)}/{\partial {{\varvec{\theta }}}\partial {{\varvec{\theta }}}^T}\Vert \) and \(\Vert {\partial ^2 {\varvec{\varrho }}_{0i}({{\varvec{\theta }}},{\varvec{\gamma }}_0)}/{\partial {{\varvec{\theta }}}\partial {{\varvec{\theta }}}^T}\Vert \) can be bounded by some integrable functions in this neighborhood.

Further, suppose that the proposed model holds, that \({\varvec{\xi }}_{j}^0\) is the unique solution to \(E\{{\varvec{s}}_{j}(\varvec{y}_i, \varvec{v}_i, {\varvec{r}}_i, {{\varvec{\xi }}}_{j})\}=0\), that \({{\varvec{\Omega }}}_{j} =E\{{\varvec{s}}_{j}(\varvec{y}_i, \varvec{v}_i, {\varvec{r}}_i, {{\varvec{\xi }}}_{j}^0){\varvec{s}}_{j}(\varvec{y}_i, \varvec{v}_i, {\varvec{r}}_i, {{\varvec{\xi }}}_{j}^0)^T\}\) is positive definite, and that the matrix \({\varvec{\Upsilon }}_{j} =E[\partial {\varvec{s}}_{j}(\varvec{y}_i, \varvec{v}_i, {\varvec{r}}_i, {{\varvec{\xi }}}_{j}^0)/\partial {{\varvec{\xi }}}_{j}]\) is of full rank. As \(n \rightarrow \infty \), \(\sqrt{n}(\hat{{{\varvec{\xi }}}}_{j}-{{\varvec{\xi }}}_{j}^0) {\mathop {\longrightarrow }\limits ^{{{d}}}}N(0, ({\varvec{\Upsilon }}_{j}^T{{\varvec{\Omega }}}_{j}{\varvec{\Upsilon }}_{j})^{-1})\) and \(\sqrt{n}(\hat{{\varvec{\gamma }}}-{\varvec{\gamma }}_0) {\mathop {\longrightarrow }\limits ^{{{d}}}}N(0,{\varvec{\Sigma }})\). In the following we mainly treat the case of \(\hat{{{\varvec{g}}}}_{i}({{\varvec{\theta }}})\); the proof for \(\hat{{{\varvec{\varrho }}}}_{i}({{\varvec{\theta }}})\) follows by similar arguments.
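Condition (C3) is what drives the spline approximation bound \(\Vert R(t_{ij})\Vert = O_p(k_n^{-r})\) used below. As a small numerical sketch (illustrative only; SciPy's least-squares spline fitter stands in for the paper's estimating-equation fit), the sup-norm error of a B-spline approximation to a smooth function shrinks as the number of equally spaced interior knots grows:

```python
import numpy as np
from scipy.interpolate import make_lsq_spline

# Least-squares B-spline approximation of a smooth f0 on [0, 1]:
# equally spaced interior knots satisfy max|kappa_i| = O(k_n^{-1}),
# as required by condition (C3).
def f0(t):
    return np.sin(2 * np.pi * t)

t_obs = np.linspace(0.0, 1.0, 400)
y = f0(t_obs)

def spline_sup_error(k_n, degree=3):
    # k_n equally spaced interior knots, clamped boundary knots.
    interior = np.linspace(0.0, 1.0, k_n + 2)[1:-1]
    knots = np.concatenate(([0.0] * (degree + 1), interior, [1.0] * (degree + 1)))
    spl = make_lsq_spline(t_obs, y, knots, k=degree)
    grid = np.linspace(0.0, 1.0, 2000)
    return float(np.max(np.abs(f0(grid) - spl(grid))))

errors = [spline_sup_error(k) for k in (4, 8, 16)]
print(errors)  # sup-norm error shrinks with each knot refinement
```

Doubling the number of interior knots roughly divides the error by \(2^{4}\) for cubic splines, consistent with the \(O(k_n^{-r})\) rate for \(r=4\) continuous derivatives.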

Lemma 1

Assume the regularity conditions in Theorem 1 hold. As \(n \rightarrow \infty \),

$$\begin{aligned}&(1)~~ \frac{1}{\sqrt{n}}\sum \limits _{i=1}^n \hat{{{\varvec{g}}}}_{i}({\varvec{\theta }}_0){\mathop {\longrightarrow }\limits ^{{{d}}}}N(0,\varvec{\Sigma }_{\varvec{g}}),~~~ (2)~~\frac{1}{n}\sum \limits _{i=1}^n{\hat{{\varvec{g}}}}_i({\varvec{\theta }}_0)=O_p(n^{-1/2}),~~\\&(3)~~ \frac{1}{n}\sum \limits _{i=1}^n \hat{{{\varvec{g}}}}_{i}({{\varvec{\theta }}}_0)\hat{{{\varvec{g}}}}_{i}({{\varvec{\theta }}}_0)^{T}{\mathop {\longrightarrow }\limits ^{p}}\varvec{\Lambda }_{{\varvec{g}}},~~ ~~(4)~~ \max _{i}\Vert \hat{{{\varvec{g}}}}_{i}({{\varvec{\theta }}}_0)\Vert =o_{p}(n^{1/2}). \end{aligned}$$

Proof of Lemma 1

Consider the kth component of \({\hat{{\varvec{g}}}}_i({{\varvec{\theta }}}_0,\hat{{\varvec{\gamma }}})\),

$$\begin{aligned} \frac{1}{n}\sum \limits _{i=1}^n {{{\hat{{\varvec{g}}}}}_{ik}}( {{{\varvec{\theta }}}_0,\hat{{\varvec{\gamma }}} } )&=\frac{1}{n}\sum \limits _{i=1}^n \dot{{\varvec{\mu }}} _i^T \varvec{A}_i^{- 1/2}{\varvec{B}_k}\varvec{A}_i^{- 1/2}{{\hat{\varvec{W}}}_i}( {{\varvec{y}_i} - {{\varvec{\mu }} _{0i}}})\\&=\frac{1}{n}\sum \limits _{i=1}^n \varvec{d}_i^T \varvec{H}'( {{{\varvec{\eta }} _{0i}}} )\varvec{A}_i^{- 1/2}{\varvec{B}_k}\varvec{A}_i^{- 1/2}{{\hat{\varvec{W}}}_i}( {{\varvec{y}_i} - {{\varvec{\mu }} _{0i}}}), \end{aligned}$$

where \({\varvec{\eta }}_{0i}=\varvec{x}_i{{\varvec{\beta }}}_0+\varvec{m}_i{{\varvec{\alpha }}}_0\), \({{\varvec{\mu }}}_{0i}=h(\varvec{x}_i{{\varvec{\beta }}}_0+\varvec{m}_i{{\varvec{\alpha }}}_0)\) and \(\varvec{H}'( {{{\varvec{\eta }}_{0i}}} )\,\mathrm{{= diag}}( {h'( {{{\varvec{\eta }}_{0i1}}} ),\dots , h'( {{{\varvec{\eta }} _{0im}}} )} )\), \(h'(t)={{dh( t )} / {dt}}\). Note that

$$\begin{aligned} {\varvec{y}_i} - {{\varvec{\mu }} _{0i}} =h(\varvec{x}_i{{\varvec{\beta }}}_0+f_0(\varvec{t}_i))-h(\varvec{x}_i{{\varvec{\beta }}}_0+\varvec{m}_i{{\varvec{\alpha }}}_0)+ {{\varvec{\varepsilon }} _i}. \end{aligned}$$
(11)

Applying the Taylor expansion to the first two terms in (11) around \(( {{{\varvec{\beta }} _0},{{\varvec{\alpha }} _0}})\), we have

$$\begin{aligned} \begin{aligned} {\varvec{y}_i} - {{\varvec{\mu }} _{0i}}=&{{\varvec{\varepsilon }} _i} +\varvec{H}'( {{{\varvec{\eta }}_{0i}}}){\varvec{R}}( {{\varvec{t}_i}}) + \frac{1}{2}\varvec{H}''( {{\varvec{w}_{i}}} ){\varvec{R}^*}( {{\varvec{t}_i}} ), \end{aligned} \end{aligned}$$
(12)

where \(\varvec{w}_{i}\) is between \({\varvec{\eta }}_{0i}\) and \({\varvec{x}_i}{{\varvec{\beta }} _0} + {f_0}( {{\varvec{t}_i}})\) and \( {\varvec{R}}( {{\varvec{t}_i}} ) = {( {{R}( {{t_{i1}}} ),\dots ,{R}( {{t_{im}}} )} )^T}\), \({\varvec{R}^*}( {{\varvec{t}_i}} ) = {( {{R^2}( {{t_{i1}}} ),\dots ,{R^2}( {{t_{im}}} )} )^T}\), \({R}( {{t_{ij}}} ) = {f_0}( {{t_{ij}}} ) - {\varvec{m} ^T}( {{t_{ij}}} ){{\varvec{\alpha }} _0},\) \( \varvec{H}''( {{\varvec{w}_i}} )\mathrm{{= diag}}( h''( {{\varvec{w} _{i1}}} ),\dots , h''( {{\varvec{w} _{im}}} ) ),\) \(h''(t)={{d^2h( t )} / {dt^2}}. \)

From conditions (C3)–(C4) and Corollary 6.21 in Schumaker (1981), we obtain \(\Vert {{R}( {{t_{ij}}} )} \Vert = O_p( {k_n^{- r}})\) and \(\Vert {\varvec{m}( {{t_{ij}}})}\Vert = O_p( 1 )\). Substituting (12) into the expression for \(n^{-1}\sum \nolimits _{i=1}^n {{{\hat{{\varvec{g}}}}}_{ik}}( {{{\varvec{\theta }}}_0,\hat{{\varvec{\gamma }}} } )\), we have

$$\begin{aligned} \frac{1}{n}\sum \limits _{i = 1}^n {{{\hat{{\varvec{g}}}}}_{ik}}( {{{\varvec{\theta }}_0},\hat{{\varvec{\gamma }}} } )&= \frac{1}{n}\sum \limits _{i = 1}^n {\dot{{\varvec{\mu }} _i}^T\varvec{A}_i^{- 1/2}{\varvec{B}_k}\varvec{A}_i^{- 1/2}{{\hat{\varvec{W}}}_i}} {{\varvec{\varepsilon }} _i}\\&\quad + \frac{1}{n}\sum \limits _{i = 1}^n {\dot{{\varvec{\mu }}} _i^T\varvec{A}_i^{- 1/2}{\varvec{B}_k}\varvec{A}_i^{- 1/2}{{\hat{\varvec{W}}}_i}} \varvec{H}'( {{{\varvec{\eta }} _{0i}}} ){\varvec{R}}( {{\varvec{t}_i}} )\\&\quad + \frac{1}{{2n}}\sum \limits _{i = 1}^n {\dot{{\varvec{\mu }}} _i^T\varvec{A}_i^{- 1/2}{\varvec{B}_k}\varvec{A}_i^{- 1/2}{{\hat{\varvec{W}}}_i}} \varvec{H}''( {{\varvec{w} _{i}}} ){\varvec{R}^*}( {{\varvec{t}_i}} )\\&=: \varvec{I}_1+\varvec{I}_2+\varvec{I}_3. \end{aligned}$$

Under conditions (C3)–(C4) and through a simple calculation, we derive that \(\varvec{I}_2=O_p(n^{-1}n^{1/2}{k_n^{-r}})=o_p(n^{-1/2})\) and \(\varvec{I}_3=o_p(n^{-1/2})\). Thus,

$$\begin{aligned} \frac{1}{n}\sum \limits _{i = 1}^n{\hat{{\varvec{g}}}}_i({{\varvec{\theta }}}_0,\hat{{\varvec{\gamma }}})=\frac{1}{n}\sum \limits _{i = 1}^n {\hat{{\varvec{g}}}}_{0i}({{\varvec{\theta }}}_0,\hat{{\varvec{\gamma }}})+o_p(n^{-1/2}), \end{aligned}$$

where \({\hat{{\varvec{g}}}}_{0ik}({{\varvec{\theta }}}_0,\hat{{\varvec{\gamma }}})={\dot{{\varvec{\mu }} _i}^T\varvec{A}_i^{- 1/2}{\varvec{B}_k}\varvec{A}_i^{- 1/2}{{\hat{\varvec{W}}}_i}} {{\varvec{\varepsilon }} _i}\) and \({\hat{{\varvec{g}}}}_{0i}({{\varvec{\theta }}}_0,\hat{{\varvec{\gamma }}})=({\hat{{\varvec{g}}}}^T_{0i1},\ldots ,{\hat{{\varvec{g}}}}^T_{0is})^T\). It follows immediately that \(E({\hat{{\varvec{g}}}}_{0ik}({{\varvec{\theta }}}_0,\hat{{\varvec{\gamma }}}))=0\). A Taylor expansion of \({\hat{{\varvec{g}}}}_{0ik}({{\varvec{\theta }}}_0,\hat{{\varvec{\gamma }}})\) around \({\varvec{\gamma }}_0\) then gives

$$\begin{aligned} {{\hat{{\varvec{g}}}}_{0ik}( {{\varvec{\theta }}}_0,\hat{{\varvec{\gamma }}})}&= {{\varvec{g}}_{0ik}({{\varvec{\theta }}}_0,{{\varvec{\gamma }}_0} ) + {\frac{{\partial {\varvec{g}}_{0ik}( {{\varvec{\theta }}}_0,{{\varvec{\gamma }}_0} )}}{{\partial {{\varvec{\gamma }}} }}( {\hat{{\varvec{\gamma }}} - {\varvec{\gamma }}_0 } )} } + {o_p}( 1 )\\&= :{\varvec{I}_{n1}} + {\varvec{I}_{n2}}, \end{aligned}$$

where \(\varvec{I}_{n1}= {{\varvec{g}}_{0ik}( {{{\varvec{\theta }} _0},{{\varvec{\gamma }} _0}})}=\dot{{\varvec{\mu }} _i}^T\varvec{A}_i^{- 1/2}{\varvec{B}_k}\varvec{A}_i^{- 1/2}{\varvec{W}}_i {\varvec{\varepsilon }}_i\) and \(\varvec{I}_{n2}\) denotes the remaining term. Under (C1)–(C2) and (C5), it can be seen that

$$\begin{aligned} E(\varvec{I}_{n1}\varvec{I}_{n1}^T)&=E[{\dot{{\varvec{\mu }} _i}^T\varvec{A}_i^{- 1/2}{\varvec{B}_k}\varvec{A}_i^{- 1/2}\varvec{W}_i{\varvec{\varepsilon }}_i{\varvec{\varepsilon }}_i^T\varvec{W}_i\varvec{A}_i^{- 1/2}{\varvec{B}_k}\varvec{A}_i^{- 1/2}\dot{{\varvec{\mu }} _i}}]\\&=E[E[{\dot{{\varvec{\mu }} _i}^T\varvec{A}_i^{- 1/2}{\varvec{B}_k}\varvec{A}_i^{- 1/2}\varvec{W}_i{\varvec{\varepsilon }}_i{\varvec{\varepsilon }}_i^T\varvec{W}_i\varvec{A}_i^{- 1/2}{\varvec{B}_k}\varvec{A}_i^{- 1/2}\dot{{\varvec{\mu }} _i}}|{\varvec{x}}_i]]\\&={\dot{{\varvec{\mu }} _i}^T\varvec{A}_i^{- 1/2}{\varvec{B}_k}\varvec{A}_i^{- 1/2}\varvec{T}_i\varvec{A}_i^{- 1/2}{\varvec{B}_k}\varvec{A}_i^{- 1/2}\dot{{\varvec{\mu }} _i}},\\ E(\varvec{I}_{n2}\varvec{I}_{n2}^T)&=E\big [ {{{\partial {{\varvec{g}}_{0ik}}\big ( {{{{\varvec{\theta }}}_0},{{\varvec{\gamma }} _0}} \big )} \big / {\partial {\varvec{\gamma }} }}} \big ]{\varvec{\Sigma }} {E^T}\big [ {{{\partial {{\varvec{g}}_{0ik}}\big ( {{{{\varvec{\theta }}}_0},{{\varvec{\gamma }} _0}} \big )} \big / {\partial {{\varvec{\gamma }}}}}} \big ],\\ E(\varvec{I}_{n1} \varvec{I}_{n2})&=O_p(n^{-1/2}),~~~~~~~Cov(\varvec{I}_{n1},\varvec{I}_{n2})=o_p(1), \end{aligned}$$

where \(\varvec{T}_i\) is a symmetric matrix with its \((j, j')\) element \((j\le j')\) as \(\varepsilon _{ij}\varepsilon _{ij'}/\pi _{ij}\). Therefore,

$$\begin{aligned} Cov({\hat{{\varvec{g}}}}_{0ik}({\varvec{\theta }}_0,\hat{{\varvec{\gamma }}}))&={\dot{{\varvec{\mu }} _i}^T\varvec{A}_i^{- 1/2}{\varvec{B}_k}\varvec{A}_i^{- 1/2}\varvec{T}_i\varvec{A}_i^{- 1/2}{\varvec{B}_k}\varvec{A}_i^{- 1/2}\dot{{\varvec{\mu }} _i}}\\&\quad +E\big [ {{{\partial {{\varvec{g}}_{0ik}}\big ( {{{{\varvec{\theta }}}_0},{{\varvec{\gamma }} _0}} \big )} \big / {\partial {\varvec{\gamma }} }}} \big ]{\varvec{\Sigma }} {E^T}\big [ {{{\partial {{\varvec{g}}_{0ik}}\big ( {{{{\varvec{\theta }}}_0},{{\varvec{\gamma }} _0}} \big )} \big / {\partial {{\varvec{\gamma }}}}}} \big ], \end{aligned}$$

and

$$\begin{aligned}Cov({\hat{{\varvec{g}}}}_{0i}({\varvec{\theta }}_0,\hat{{\varvec{\gamma }}}))={\left( \begin{array}{ccc} \varvec{N}_{11} &{} \ldots &{} \varvec{N}_{1s} \\ \vdots &{} \ddots &{} \vdots \\ \varvec{N}_{s1} &{} \ldots &{} \varvec{N}_{ss} \\ \end{array} \right) }+ {\left( \begin{array}{ccc} \varvec{U}_{11} &{} \ldots &{} \varvec{U}_{1s} \\ \vdots &{} \ddots &{} \vdots \\ \varvec{U}_{s1} &{} \ldots &{} \varvec{U}_{ss} \\ \end{array} \right) },\end{aligned}$$

where \(\varvec{N}_{lk}=\dot{{\varvec{\mu }} _i}^T\varvec{A}_i^{- 1/2}{\varvec{B}_l}\varvec{A}_i^{- 1/2}\varvec{T}_i\varvec{A}_i^{- 1/2}{\varvec{B}_k}\varvec{A}_i^{- 1/2}\dot{{\varvec{\mu }} _i}\) and \(\varvec{U}_{lk}=E\big [ {\partial {{\varvec{g}}_{0il}}\big ( {{{\varvec{\theta }} _0},{{\varvec{\gamma }} _0}} \big )} /{\partial {\varvec{\gamma }} }\big ] {\varvec{\Sigma }} {E^T}\big [ {{{\partial {{\varvec{g}}_{0ik}}\big ( {{{\varvec{\theta }} _0},{{\varvec{\gamma }} _0}} \big )} / {\partial {{\varvec{\gamma }}}}}} \big ]\) for \(l, k=1,\ldots ,s\). Under the conditions (C6)–(C7), it can be verified that

$$\begin{aligned} n^{-1}\sum \limits _{i = 1}^nCov({\hat{{\varvec{g}}}}_{0i}({\varvec{\theta }}_0,\hat{{\varvec{\gamma }}})){\mathop {\longrightarrow }\limits ^{{{p}}}}{\varvec{\Sigma }}_g, \end{aligned}$$

and consequently

$$\begin{aligned} \frac{1}{{\sqrt{n} }}\sum \limits _{i = 1}^n {{{{\hat{{\varvec{g}}}}}_i}} ( {{{\varvec{\theta }} _0,\varvec{{\hat{\gamma }}}}} ) = \frac{1}{{\sqrt{n} }}\sum \limits _{i = 1}^n {{{{\hat{{\varvec{g}}}}}_{0i}}}( {{{\varvec{\theta }} _0,\varvec{{\hat{\gamma }}}}} ) + o_p(1){\mathop {\longrightarrow }\limits ^{{{d}}}}N(0,{\varvec{\Sigma }}_g), \end{aligned}$$

Moreover, \(n^{-1}\sum \nolimits _{i = 1}^n{\hat{{\varvec{g}}}}_i({\varvec{\theta }}_0,\hat{{\varvec{\gamma }}})=O_p(n^{-1/2})\). Noting that \(\sqrt{n}(\varvec{{\hat{\gamma }}}-{\varvec{\gamma }}_0)=O_p(1)\) and \({\partial {\varvec{g}}_{0ik}( {{\varvec{\theta }}}_0,{{\varvec{\gamma }}_0} ) \big / \partial {{\varvec{\gamma }}}}=O_p(1)\) from (C5) and (C8), it follows that

$$\begin{aligned}\begin{aligned} \frac{1}{n}\sum \limits _{i = 1}^n {{{{\hat{{\varvec{g}}}}}_{il}}} ( {{{\varvec{\theta }} _0,\varvec{{\hat{\gamma }}}}} ){{{{\hat{{\varvec{g}}}}}_{ik}^T}} ( {{{\varvec{\theta }} _0,\varvec{{\hat{\gamma }}}}} ) =\frac{1}{n}\sum \limits _{i = 1}^n {{{{\varvec{g}}}_{0il}}} ( {{{\varvec{\theta }} _0,\varvec{\gamma _0}}} ){{{{\varvec{g}}}_{0ik}^T}} ( {{{\varvec{\theta }} _0,\varvec{\gamma _0}}} )+o_p(1). \end{aligned}\end{aligned}$$

Hence

$$\begin{aligned} \begin{aligned} \frac{1}{n}\sum \limits _{i = 1}^n {{{{\hat{{\varvec{g}}}}}_{i}}} ( {{{\varvec{\theta }} _0,\varvec{{\hat{\gamma }}}}} ){{{{\hat{{\varvec{g}}}}}_{i}^T}} ( {{{\varvec{\theta }} _0,\varvec{{\hat{\gamma }}}}} )&=\frac{1}{n}\sum \limits _{i = 1}^n {{{{\varvec{g}}}_{0i}}} ( {{{\varvec{\theta }} _0,\varvec{\gamma _0}}} ){{{{\varvec{g}}}_{0i}^T}} ( {{{\varvec{\theta }} _0,{{\varvec{\gamma }}_0}}} )+o_p(1)\\&=\frac{1}{n}\sum \limits _{i=1}^n{\left( \begin{array}{ccc} \varvec{N}_{11} &{} \ldots &{} \varvec{N}_{1s} \\ \vdots &{} \ddots &{} \vdots \\ \varvec{N}_{s1} &{} \ldots &{} \varvec{N}_{ss} \\ \end{array} \right) }+o_p(1){\mathop {\longrightarrow }\limits ^{{{p}}}}{\varvec{\Lambda }}_{g}. \end{aligned}\end{aligned}$$

According to part (3),

$$\begin{aligned}\frac{{{n^{- 1}}{{( {\mathop {\max }\limits _{1 \le i \le n} \Vert {{{{\hat{{\varvec{g}}}}}_{i}}} ( {{{\varvec{\theta }} _0,\varvec{{\hat{\gamma }}}}} ) \Vert } )}^2}}}{{{n^{- 1}}\sum \limits _{i = 1}^n {{{{{\hat{{\varvec{g}}}}}_{i}}} ( {{{\varvec{\theta }} _0,\varvec{{\hat{\gamma }}}}} ){{{{\hat{{\varvec{g}}}}}^T_{i}}} ( {{{\varvec{\theta }} _0,\varvec{{\hat{\gamma }}}}} )} }} \rightarrow 0,\end{aligned}$$

which leads to \( {\mathop {\max }\nolimits _{1 \le i \le n} \Vert {{{{\hat{{\varvec{g}}}}}_{i}}} ( {{{\varvec{\theta }} _0,\varvec{{\hat{\gamma }}}}} ) \Vert } =o_p(n^{1/2})\). This completes the proof of Lemma 1. \(\square \)

Lemma 2

Assume the regularity conditions in Theorem 1 hold. We have

$$\begin{aligned} n^{-1}\hat{R}_Q({{\varvec{\theta }}}_0)= {\hat{{\varvec{g}}}}({{\varvec{\theta }}}_0)^T\varvec{S}^{-1}_n({{\varvec{\theta }}}_0){\hat{{\varvec{g}}}}({{\varvec{\theta }}}_0)+o_p(1), \end{aligned}$$

where \({\hat{{\varvec{g}}}}({{\varvec{\theta }}}_0)=n^{-1}\sum \nolimits _{i = 1}^n{\hat{{\varvec{g}}}}_i({{\varvec{\theta }}}_0)\) and \(\varvec{S}_n( {{{{\varvec{\theta }}}_0}} ) = {n^{- 1}}\sum \nolimits _{i = 1}^n {{{{\hat{{\varvec{g}}}}}_i}( {{{{\varvec{\theta }}}_0}})} {{{{\hat{{\varvec{g}}}}}_i}( {{{{\varvec{\theta }}}_0}})}^T\).

Proof of Lemma 2

By the Lagrange multiplier method, the empirical log-likelihood ratio function for \({{\varvec{\theta }}}_0\) is

$$\begin{aligned} \hat{R}_Q({{\varvec{\theta }}}_0)=2\sum \limits _{i=1}^n \log \left\{ 1+{{\varvec{\lambda }}}^{T}({{\varvec{\theta }}}_0)\hat{{{\varvec{g}}}}_{i}({{\varvec{\theta }}}_0)\right\} , \end{aligned}$$

where vector \({{\varvec{\lambda }}}\) is the solution to the equation

$$\begin{aligned} D({\varvec{\lambda }}):=\frac{1}{n}\sum \limits _{i = 1}^n {\frac{{{{{\hat{{\varvec{g}}}}}_i}\big ({{\varvec{\theta }}}_0 \big )}}{{1 + {{\varvec{\lambda }} ^T}{{{\hat{{\varvec{g}}}}}_i}\big ( {{\varvec{\theta }}}_0 \big )}}} = 0. \end{aligned}$$
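The multiplier \({{\varvec{\lambda }}}\) has no closed form; in computations it is obtained by solving \(D({\varvec{\lambda }})=0\) numerically, for instance by a damped Newton iteration. A minimal standalone sketch on generic simulated moment vectors (illustrative only; `el_lambda` and the \(g_i\) here are not from the paper):

```python
import numpy as np

# Solve the Lagrange multiplier equation of empirical likelihood:
# D(lambda) = n^{-1} sum_i g_i / (1 + lambda^T g_i) = 0,
# via damped Newton steps that keep every EL weight positive.
rng = np.random.default_rng(1)
g = rng.normal(size=(500, 3)) + 0.05      # moment vectors with small mean

def el_lambda(g, tol=1e-10, max_iter=50):
    n, p = g.shape
    lam = np.zeros(p)
    for _ in range(max_iter):
        denom = 1.0 + g @ lam             # must stay strictly positive
        D = (g / denom[:, None]).mean(axis=0)
        if np.linalg.norm(D) < tol:
            break
        # Jacobian of D: -mean(g_i g_i^T / denom_i^2)
        J = -(g[:, :, None] * g[:, None, :]
              / (denom ** 2)[:, None, None]).mean(axis=0)
        step = np.linalg.solve(J, -D)
        t = 1.0                           # damp to preserve positivity
        while np.any(1.0 + g @ (lam + t * step) <= 0):
            t /= 2.0
        lam = lam + t * step
    return lam

lam = el_lambda(g)
resid = (g / (1.0 + g @ lam)[:, None]).mean(axis=0)
print(np.linalg.norm(resid))              # near zero: D(lambda) = 0 holds
```

The damping step keeps every denominator \(1+{{\varvec{\lambda }}}^T \hat{{\varvec{g}}}_i\) positive, mirroring the positivity used in the inequality below.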

Write \({{\varvec{\lambda }}} = \rho {\varvec{u}}\), where \( \rho \ge 0\) and \({\varvec{u}}\in {R^p}\) is a unit vector with \(\big \Vert {\varvec{u}}\big \Vert = 1\). Substituting \({1 \big / {\big ( {1 + {{\varvec{\lambda }} ^T}{{{\hat{{\varvec{g}}}} }_{i}}\big ( {{{{\varvec{\theta }}}}} \big )} \big )}} = 1 - {{{{\varvec{\lambda }} ^T}{{{\hat{{\varvec{g}}}} }_{i}}\big ( {{{{\varvec{\theta }}}}} \big )} \big / {\big ( {1 + {{\varvec{\lambda }} ^T}{{{\hat{{\varvec{g}}}} }_{i}}\big ( {{{{\varvec{\theta }}}}} \big )} \big )}}\) and \({{\varvec{\lambda }}} = \rho {\varvec{u}}\) into the above equation, we have

$$\begin{aligned} 0&= \frac{1}{n}\sum \limits _{i = 1}^n {\frac{{{{{\hat{{\varvec{g}}}}}_i}\big ( {{{{\varvec{\theta }}}_0}} \big )}}{{1 + {{\varvec{\lambda }} ^T}{{{\hat{{\varvec{g}}}}}_i}\big ( {{{{\varvec{\theta }}}_0}} \big )}}} = \frac{1}{n}\sum \limits _{i = 1}^n {{\varvec{u}^T}{{{\hat{{\varvec{g}}}}}_i}({{{\varvec{\theta }}}_0}) - \rho } \frac{1}{n}\sum \limits _{i = 1}^n {\frac{{{{({{\varvec{u}}^T}{{{\hat{{\varvec{g}}}}}_i}({{{\varvec{\theta }}}_0}))}^2}}}{{1 + \rho {{\varvec{u}}^T}{{{\hat{{\varvec{g}}}}}_i}({{{\varvec{\theta }}}_0})}}} \\&\le {{\varvec{u}}^T}\frac{\mathrm{{1}}}{n}\sum \limits _{i = 1}^n {{{{\hat{{\varvec{g}}}}}_i}\big ( {{{{\varvec{\theta }}}_0}} \big )} - \frac{\rho }{{1 + \rho \mathop {\max }\limits _{1 \le i \le n} \big \Vert {{{{\hat{{\varvec{g}}}}}_i}({{{\varvec{\theta }}}_0})} \big \Vert }}{{\varvec{u}}^T}\varvec{S}_n\big ( {{{{\varvec{\theta }}}_0}} \big ){\varvec{u}}, \end{aligned}$$

where \(\varvec{S}_n( {{{{\varvec{\theta }}}_0}} ) = {n^{- 1}}\sum \nolimits _{i = 1}^n {{{{\hat{{\varvec{g}}}}}_i}( {{{{\varvec{\theta }}}_0}})} {{{{\hat{{\varvec{g}}}}}_i}( {{{{\varvec{\theta }}}_0}})}^T={\varvec{\Lambda }}_g(1+o_p(1))\) and the inequality follows from the positivity of \((1+\rho \varvec{u}^T{\hat{\varvec{g}}_i({{\varvec{\theta }}}_0)})\). Therefore, it can be shown that

$$\begin{aligned} \rho {{\varvec{u}}^T}\varvec{S}_n\big ( {{{{\varvec{\theta }}}_0 }} \big ){\varvec{u}}\le \{{1+\rho \mathop {\max }\limits _{1 \le i \le n} ||{{{\hat{{\varvec{g}}}} }_{i}}({{{\varvec{\theta }}}_0 })||}\} \Vert {{\varvec{u}}^T}\frac{\mathrm{{1}}}{n}\sum \limits _{i = 1}^n {{{{\hat{{\varvec{g}}}} }_{i}}\big ( {{{{\varvec{\theta }}}_0 }} \big )}\Vert . \end{aligned}$$

Noting that \(\max \nolimits _{1\le i\le n}|| \hat{{{\varvec{g}}}}_{i}({{{\varvec{\theta }}}_0}) | |=o_p(n^{1/2})\) and invoking the conclusions established in the proof of Lemma 1, we have

$$\begin{aligned} \rho [\varvec{u}^{T} \varvec{\Lambda }_{{\varvec{g}}} \varvec{u}+o_{p}(1)]\le O_p(n^{-1/2})\{1+\rho o_p(n^{1/2})\}, \end{aligned}$$

which leads to \( \rho =O_{p}(n^{-1/2}), \) i.e., \(|| {{\varvec{\lambda }}}||=O_{p}(n^{-1/2}).\) Consequently,

$$\begin{aligned}\max \limits _{1\le i\le n}| |{{\varvec{\lambda }}}^{T}\hat{{{\varvec{g}}}}_{i}({{{\varvec{\theta }}}_0}) | |\le || {{\varvec{\lambda }}}|| \max \limits _{1\le i\le n}|| \hat{{{\varvec{g}}}}_{i}({{{\varvec{\theta }}}_0}) | |=O_{p}(n^{-1/2}) o_p(n^{1/2})=o_{p}(1).\end{aligned}$$

Expanding the equation \(D({\varvec{\lambda }})=0\) with a Lagrange-form remainder, we get

$$\begin{aligned} 0&= \frac{1}{n}\sum \limits _{i=1}^n \hat{{{\varvec{g}}}}_{i}({{{\varvec{\theta }}}_0})\Big \{1-{{\varvec{\lambda }}}^{T}\hat{{{\varvec{g}}}}_{i}({{{\varvec{\theta }}}_0}) +\frac{[{{\varvec{\lambda }}}^{T}\hat{{{\varvec{g}}}}_{i}({{{\varvec{\theta }}}_0})]^2 }{(1+\xi _i)^3}\Big \} \\&= \frac{1}{n}\sum \limits _{i=1}^n \hat{{{\varvec{g}}}}_{i}({{{\varvec{\theta }}}_0}) - \frac{1}{n}\sum \limits _{i=1}^n \hat{{{\varvec{g}}}}_{i}({{{\varvec{\theta }}}_0})\hat{{{\varvec{g}}}}_{i}({{{\varvec{\theta }}}_0})^{T}{{\varvec{\lambda }}}+ \frac{1}{n}\sum \limits _{i=1}^n \hat{{{\varvec{g}}}}_{i}({{{\varvec{\theta }}}_0})\frac{[{{\varvec{\lambda }}}^{T}\hat{{{\varvec{g}}}}_{i}({{{\varvec{\theta }}}_0})]^2 }{(1+\xi _i)^3}, \end{aligned}$$

where \(\xi _i\) lies between 0 and \({{\varvec{\lambda }}}^{T}\hat{{{\varvec{g}}}}_{i}({{{\varvec{\theta }}}_0})\). Since \(\max \nolimits _{1\le i\le n}\Vert {{\varvec{\lambda }}}^{T}\hat{{{\varvec{g}}}}_{i}({{{\varvec{\theta }}}_0}) \Vert =o_{p}(1)\), we have \(\max \nolimits _{1\le i\le n}|\xi _i| =o_{p}(1).\) Since

$$\begin{aligned} \Big \Vert \frac{1}{n}\sum \limits _{i=1}^n \hat{{{\varvec{g}}}}_{i}({{{\varvec{\theta }}}_0})\frac{[{{\varvec{\lambda }}}^{T}\hat{{{\varvec{g}}}}_{i}({{{\varvec{\theta }}}_0})]^2 }{(1+\xi _i)^3} \Big \Vert&\le \frac{\max \limits _{1\le i\le n}\Vert \hat{{{\varvec{g}}}}_{i}({{{\varvec{\theta }}}_0}) \Vert }{(1-\max \limits _{1\le i\le n}|\xi _i|)^3 } \Big \Vert {{\varvec{\lambda }}}^{T} \frac{1}{n}\sum \limits _{i=1}^n \hat{{{\varvec{g}}}}_{i}({{{\varvec{\theta }}}_0})\hat{{{\varvec{g}}}}_{i}({{{\varvec{\theta }}}_0})^{T} {{\varvec{\lambda }}}\Big \Vert \\&= o_p(n^{1/2})O_{p}(n^{-1}) \\&= o_{p}(n^{-1/2}), \end{aligned}$$

it follows that

$$\begin{aligned} {{\varvec{\lambda }}}&= \Big [\frac{1}{n}\sum \limits _{i=1}^n \hat{{{\varvec{g}}}}_{i}({{{\varvec{\theta }}}_0})\hat{{{\varvec{g}}}}_{i}({{{\varvec{\theta }}}_0})^{T}\Big ]^{-1}\Big [\frac{1}{n}\sum \limits _{i=1}^n \hat{{{\varvec{g}}}}_{i}({{{\varvec{\theta }}}_0})\Big ]+ {\varvec{\zeta }}, \end{aligned}$$
(13)

where \(| |{\varvec{\zeta }}| |=o_{p}(n^{-1/2}).\) A Taylor expansion of \(\hat{R}_Q({{\varvec{\theta }}}_0)\) yields

$$\begin{aligned} \hat{R}_Q({{\varvec{\theta }}}_0)&= 2 \sum \limits _{i=1}^n \Big \{ {{\varvec{\lambda }}}^{T}\hat{{{\varvec{g}}}}_{i}({{{\varvec{\theta }}}_0})-\frac{1}{2}[{{\varvec{\lambda }}}^{T}\hat{{{\varvec{g}}}}_{i}({{{\varvec{\theta }}}_0})]^2 +\frac{1}{3}\frac{[{{\varvec{\lambda }}}^{T}\hat{{{\varvec{g}}}}_{i}({{{\varvec{\theta }}}_0})]^3}{(1+\xi _i)^3} \Big \}\\&= 2 {{\varvec{\lambda }}}^{T} \sum \limits _{i=1}^n\hat{{{\varvec{g}}}}_{i}({{{\varvec{\theta }}}_0})-\sum \limits _{i=1}^n {{\varvec{\lambda }}}^{T}\hat{{{\varvec{g}}}}_{i}({{{\varvec{\theta }}}_0})\hat{{{\varvec{g}}}}_{i}({{{\varvec{\theta }}}_0})^{T}{{\varvec{\lambda }}}+\frac{2}{3}\sum \limits _{i=1}^n\frac{[{{\varvec{\lambda }}}^{T}\hat{{{\varvec{g}}}}_{i}({{{\varvec{\theta }}}_0})]^3}{(1+\xi _i)^3}. \end{aligned}$$

Similarly,

$$\begin{aligned} \Big \Vert \sum \limits _{i=1}^n \frac{[{{\varvec{\lambda }}}^{T}\hat{{{\varvec{g}}}}_{i}({{{\varvec{\theta }}}_0})]^3 }{(1+\xi _i)^3} \Big \Vert&\le \frac{\max \limits _{1\le i\le n}\Vert {{\varvec{\lambda }}}^{T}\hat{{{\varvec{g}}}}_{i}({{{\varvec{\theta }}}_0}) \Vert }{(1-\max \limits _{1\le i\le n}|\xi _i|)^3 } \Big \Vert {{\varvec{\lambda }}}^{T} \sum \limits _{i=1}^n \hat{{{\varvec{g}}}}_{i}({{{\varvec{\theta }}}_0})\hat{{{\varvec{g}}}}_{i}({{{\varvec{\theta }}}_0})^{T} {{\varvec{\lambda }}}\Big \Vert \\&= o_{p}(1)\,nO_{p}(n^{-1}) \\&= o_{p}(1), \end{aligned}$$

therefore,

$$\begin{aligned} \hat{R}_Q({{\varvec{\theta }}}_0)&= 2 {{\varvec{\lambda }}}^{T} \sum \limits _{i=1}^n\hat{{{\varvec{g}}}}_{i}({{{\varvec{\theta }}}_0})-\sum \limits _{i=1}^n {{\varvec{\lambda }}}^{T}\hat{{{\varvec{g}}}}_{i}({{{\varvec{\theta }}}_0})\hat{{{\varvec{g}}}}_{i}({{{\varvec{\theta }}}_0})^{T}{{\varvec{\lambda }}}+o_{p}(1). \end{aligned}$$
(14)

Substituting (13) into (14), we obtain

$$\begin{aligned} \hat{R}_Q({{\varvec{\theta }}}_0)&=n\Big [\frac{1}{n}\sum \limits _{i=1}^n \hat{{{\varvec{g}}}}_{i}({{{\varvec{\theta }}}_0})^{T}\Big ]\Big [\frac{1}{n}\sum \limits _{i=1}^n \hat{{{\varvec{g}}}}_{i}({{{\varvec{\theta }}}_0})\hat{{{\varvec{g}}}}_{i}({{{\varvec{\theta }}}_0})^{T}\Big ]^{-1}\Big [\frac{1}{n}\sum \limits _{i=1}^n \hat{{{\varvec{g}}}}_{i}({{{\varvec{\theta }}}_0})\Big ]\\&\quad -n{\varvec{\zeta }}^{T}\frac{1}{n}\sum \limits _{i=1}^n \hat{{{\varvec{g}}}}_{i}({{{\varvec{\theta }}}_0})\hat{{{\varvec{g}}}}_{i}({{{\varvec{\theta }}}_0})^{T}{\varvec{\zeta }}+o_{p}(1). \end{aligned}$$

A simple calculation shows that

$$\begin{aligned} n{\varvec{\zeta }}^{T}\frac{1}{n}\sum \limits _{i=1}^n \hat{{{\varvec{g}}}}_{i}({{{\varvec{\theta }}}_0})\hat{{{\varvec{g}}}}_{i}({{{\varvec{\theta }}}_0})^{T}{\varvec{\zeta }}&=no_{p}(n^{-1/2})o_{p}(n^{-1/2})=o_{p}(1). \end{aligned}$$

Therefore,

$$\begin{aligned} \hat{R}_Q({{\varvec{\theta }}}_0)&=\Big [\frac{1}{\sqrt{n}}\sum \limits _{i=1}^n \hat{{{\varvec{g}}}}_{i}({{{\varvec{\theta }}}_0})^{T}\Big ]\Big [\frac{1}{n}\sum \limits _{i=1}^n \hat{{{\varvec{g}}}}_{i}({{{\varvec{\theta }}}_0})\hat{{{\varvec{g}}}}_{i}({{{\varvec{\theta }}}_0})^{T}\Big ]^{-1}\Big [\frac{1}{\sqrt{n}}\sum \limits _{i=1}^n \hat{{{\varvec{g}}}}_{i}({{{\varvec{\theta }}}_0})\Big ]+o_{p}(1). \end{aligned}$$

Since \({\hat{{{\varvec{\theta }}}}}_Q=\mathop {\arg \min }\nolimits _{{{\varvec{\theta }}}\in {\mathcal {B}}} {\hat{R}_Q({{\varvec{\theta }}})}\), the first two derivatives of \({\hat{R}}_Q\) at \({{\varvec{\theta }}}_0\) can be written as follows:

$$\begin{aligned} \frac{1}{n}\hat{R}'_Q({{\varvec{\theta }}}_0)&=2{\hat{{\varvec{g}}}}'({{\varvec{\theta }}}_0)^T\varvec{S}^{-1}_n({{\varvec{\theta }}}_0){\hat{{\varvec{g}}}}({{\varvec{\theta }}}_0)-R_{n1}\\ \frac{1}{n}\hat{R}''_Q({{\varvec{\theta }}}_0)&=2{{\hat{{\varvec{g}}}}'}({{\varvec{\theta }}}_0)^T\varvec{S}^{-1}_n({{\varvec{\theta }}}_0){\hat{{\varvec{g}}}}'({{\varvec{\theta }}}_0)+R_{n2}, \end{aligned}$$

where

$$\begin{aligned} R_{n1}&={\hat{{\varvec{g}}}}^{T}({{\varvec{\theta }}}_0)\varvec{S}^{-1}_n({{\varvec{\theta }}}_0)\varvec{S}'_n({{\varvec{\theta }}}_0)\varvec{S}^{-1}_n({{\varvec{\theta }}}_0){\hat{{\varvec{g}}}}({{\varvec{\theta }}}_0),\\ R_{n2}&=2{\hat{{\varvec{g}}}}''({{\varvec{\theta }}}_0)^T\varvec{S}^{-1}_n({{\varvec{\theta }}}_0){\hat{{\varvec{g}}}}({{\varvec{\theta }}}_0)-4{\hat{{\varvec{g}}}}'({{\varvec{\theta }}}_0)^{T}\varvec{S}^{-1}_n({{\varvec{\theta }}}_0)\varvec{S}'_n({{\varvec{\theta }}}_0)\varvec{S}^{-1}_n({{\varvec{\theta }}}_0){\hat{{\varvec{g}}}}({{\varvec{\theta }}}_0)\\&\quad +2{\hat{{\varvec{g}}}}^{T}({{\varvec{\theta }}}_0)\varvec{S}^{-1}_n({{\varvec{\theta }}}_0)\varvec{S}'_n({{\varvec{\theta }}}_0)\varvec{S}^{-1}_n({{\varvec{\theta }}}_0)\varvec{S}'_n({{\varvec{\theta }}}_0)\varvec{S}^{-1}_n({{\varvec{\theta }}}_0){\hat{{\varvec{g}}}}({{\varvec{\theta }}}_0)\\&\quad -{\hat{{\varvec{g}}}}^{T}({{\varvec{\theta }}}_0)\varvec{S}^{-1}_n({{\varvec{\theta }}}_0)\varvec{S}''_n({{\varvec{\theta }}}_0)\varvec{S}^{-1}_n({{\varvec{\theta }}}_0){\hat{{\varvec{g}}}}({{\varvec{\theta }}}_0). \end{aligned}$$

Since \(n^{-1}\sum \nolimits _{i = 1}^n{\hat{{\varvec{g}}}}_i({\varvec{\theta }}_0,\hat{{\varvec{\gamma }}})=O_p(n^{-1/2})\), conditions (C5)–(C8) yield \(R_{n1}=O_p(n^{-1})\) and \(R_{n2}=o_p(1)\). The proof of Lemma 2 is completed. \(\square \)
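As a numerical sanity check on the two expansions in this lemma, namely (13), \({\varvec{\lambda }}\approx \varvec{S}_n^{-1}\bar{\varvec{g}}\) up to an \(o_p(n^{-1/2})\) remainder, and the reduction of \(\hat{R}_Q({\varvec{\theta }}_0)\) to a quadratic form, the sketch below solves the Lagrange-multiplier equation by Newton iteration on simulated, mean-zero stand-ins \(g_i\). These are not the paper's \(\hat{\varvec{g}}_i\); the sample size and dimension are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n, s = 2000, 2
g = rng.normal(size=(n, s))          # simulated stand-in for ghat_i(theta_0)

def el_lambda(g, iters=30):
    """Newton iterations for 0 = (1/n) sum_i g_i / (1 + lam' g_i)."""
    lam = np.zeros(g.shape[1])
    for _ in range(iters):
        d = 1.0 + g @ lam
        grad = (g / d[:, None]).mean(axis=0)           # the equation D(lam)
        hess = -(g.T * (1.0 / d**2)) @ g / len(g)      # its Jacobian in lam
        lam -= np.linalg.solve(hess, grad)
    return lam

lam = el_lambda(g)
gbar = g.mean(axis=0)
Sn = g.T @ g / n
lam0 = np.linalg.solve(Sn, gbar)     # leading term of (13)
zeta = lam - lam0                    # remainder; an order smaller than lam0

lr = 2.0 * np.log1p(g @ lam).sum()   # empirical log-likelihood ratio at theta_0
quad = n * gbar @ np.linalg.solve(Sn, gbar)   # the quadratic form it reduces to
```

With n this large, `zeta` is tiny relative to `lam0`, and `lr` and `quad` agree up to the \(o_p(1)\) remainder, exactly as the lemma asserts.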

Proof of Theorem 1

Similar to the proof of Lemma 1 in Qin and Lawless (1994), under conditions (C3)–(C4) and (C6)–(C7), we have \(\hat{{{\varvec{\theta }}}}_Q {\mathop {\longrightarrow }\limits ^{{{p}}}}{{\varvec{\theta }}}_0\); moreover, \(\Vert {\hat{{{\varvec{\beta }}}}}_Q-{{\varvec{\beta }}}_0\Vert =O_p(n^{-1/2})\) and \(\Vert {\hat{{{\varvec{\alpha }}}}}_Q-{{\varvec{\alpha }}}_0\Vert =O_p(n^{-1/2})\). Under (C3)–(C4),

$$\begin{aligned}&\lim _{n\rightarrow \infty }\frac{1}{n}\sum \limits _{i=1}^n({\hat{f}}_Q(\varvec{t}_i)-f_0(\varvec{t}_i))^T({\hat{f}}_Q(\varvec{t}_i)-f_0(\varvec{t}_i))=\int _0^1 {({\hat{f}}_Q(t)-f_0( t))^2 dt}\\&\quad =\int _0^1\big (\varvec{m}^T(t)({\hat{{{\varvec{\alpha }}}}}_Q-{{\varvec{\alpha }}}_0)+R(t)\big )^2dt\\&\quad \le 2\int _0^1\big (\varvec{m}^T(t)({\hat{{{\varvec{\alpha }}}}}_Q-{{\varvec{\alpha }}}_0)\big )^2dt+2\int _0^1 R(t)^2 dt\\&\quad = 2 ({\hat{{{\varvec{\alpha }}}}}_Q-{{\varvec{\alpha }}}_0)^T\int _0^1 \varvec{m}(t)\varvec{m}^T(t) dt ({\hat{{{\varvec{\alpha }}}}}_Q-{{\varvec{\alpha }}}_0)+ 2\int _0^1 R(t)^2 dt. \end{aligned}$$

Since \(\Vert R(t_{ij})\Vert =O_p(k_n^{-r})=O_p(n^{-r/(2r+1)})\), we obtain

$$\begin{aligned} \int _0^1 {({\hat{f}}_Q(t)-f_0( t))^2 dt}=O_p(n^{-2r/(2r+1)}). \end{aligned}$$

The proof of Theorem 1 is completed. \(\square \)
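The rate in Theorem 1 is driven by the spline approximation bound \(\Vert R\Vert =O(k_n^{-r})\). That \(k^{-r}\) scaling is easy to see numerically: with a piecewise-linear (order-2, so \(r=2\)) basis and a twice-differentiable target, the \(L_2\) projection error should fall by a factor of roughly \(2^2=4\) each time the number of knots doubles. The target function and grid below are illustrative choices, not taken from the paper.

```python
import numpy as np

def hat_design(t, k):
    """Design matrix of piecewise-linear B-splines (hat functions) on k equal cells."""
    knots = np.linspace(0.0, 1.0, k + 1)
    h = 1.0 / k
    return np.maximum(0.0, 1.0 - np.abs(t[:, None] - knots[None, :]) / h)

t = np.linspace(0.0, 1.0, 5001)
f0 = np.sin(2.0 * np.pi * t)          # illustrative smooth target on [0, 1]

def l2_err(k):
    B = hat_design(t, k)
    alpha, *_ = np.linalg.lstsq(B, f0, rcond=None)   # L2 projection on the grid
    return np.sqrt(np.mean((B @ alpha - f0) ** 2))

errs = {k: l2_err(k) for k in (8, 16, 32)}            # errs[2k] ~ errs[k] / 4
```

Doubling the knot count cuts the \(L_2\) error by about 4, matching the \(O(k_n^{-r})\) bound with \(r=2\) for this basis.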

Proof of Theorem 2

Let \(\varvec{L}_{1n}({{\varvec{\beta }}},{{\varvec{\alpha }}})\) and \(\varvec{L}_{2n}({{\varvec{\beta }}},{{\varvec{\alpha }}})\) denote the first derivatives of \({\hat{R}}_Q({{\varvec{\theta }}})\) with respect to \({{\varvec{\beta }}}\) and \({{\varvec{\alpha }}}\), respectively. Obviously, \(\varvec{L}_{1n}({\hat{{{\varvec{\beta }}}}}_Q,{\hat{{{\varvec{\alpha }}}}}_Q)=0\) and \(\varvec{L}_{2n}({\hat{{{\varvec{\beta }}}}}_Q,{\hat{{{\varvec{\alpha }}}}}_Q)=0\). Applying a Taylor expansion to \(\varvec{L}_{1n}\) around \(({{\varvec{\beta }}}_0,{{\varvec{\alpha }}}_0)\), it follows that

$$\begin{aligned} \varvec{L}_{1n}({\hat{{{\varvec{\beta }}}}}_Q,{\hat{{{\varvec{\alpha }}}}}_Q)&=\varvec{L}_{1n}({{\varvec{\beta }}}_0,{{\varvec{\alpha }}}_0)+\frac{{\partial {\varvec{L}}_{1n}({{\varvec{\beta }}}_0,{{\varvec{\alpha }}}_0)}}{{\partial {{\varvec{\beta }}}}}({\hat{{{\varvec{\beta }}}}}_Q-{{\varvec{\beta }}}_0)+ \frac{{\partial {\varvec{L}}_{1n}({{\varvec{\beta }}}_0,{{\varvec{\alpha }}}_0)}}{{\partial {{\varvec{\alpha }}}}}({\hat{{{\varvec{\alpha }}}}}_Q-{{\varvec{\alpha }}}_0)\\&\quad +\frac{1}{2}({\hat{{{\varvec{\theta }}}}}_Q-{{\varvec{\theta }}}_0)^T\frac{{\partial ^2 {\varvec{L}}_{1n}({\tilde{{{\varvec{\beta }}}}}_0,{\tilde{{{\varvec{\alpha }}}}}_0)}}{{\partial {{\varvec{\theta }}}^2 }}({\hat{{{\varvec{\theta }}}}}_Q-{{\varvec{\theta }}}_0), \end{aligned}$$

where \(({\tilde{{{\varvec{\beta }}}}}_0^T,{\tilde{{{\varvec{\alpha }}}}}_0^T)^T\) is between \(({{\varvec{\beta }}}_0^T,{{\varvec{\alpha }}}_0^T)^T\) and \(({\hat{{{\varvec{\beta }}}}}_Q^T,{\hat{{{\varvec{\alpha }}}}}_Q^T)^T\). Since

$$\begin{aligned}n^{-1}\hat{R}'_Q({{\varvec{\theta }}}_0)=2{\hat{{\varvec{g}}}}'({{\varvec{\theta }}}_0)^T\varvec{S}^{-1}_n({{\varvec{\theta }}}_0){\hat{{\varvec{g}}}}({{\varvec{\theta }}}_0)+O_p(n^{-1}),\end{aligned}$$

we have

$$\begin{aligned} \frac{1}{n}\varvec{L}_{1n}({{\varvec{\beta }}}_0,{{\varvec{\alpha }}}_0)=-\frac{2}{n^2}&\sum \limits _{l_1,l_2 = 1}^n\sum \limits _{k_1,k_2 = 1}^s \{\varvec{x}_{l_1}^T\varvec{H}'({\varvec{\eta }}_{0l_1})\varvec{A}_{l_1}^{-1/2} \varvec{B}_{k_1}\varvec{A}_{l_1}^{-1/2}\hat{\varvec{W}}_{l_1}\varvec{H}'({\varvec{\eta }}_{0l_1})\varvec{d}_{l_1}\\&\varvec{S}^{-1}_{n,k_1k_2}\varvec{d}_{l_2}^T\varvec{H}'({\varvec{\eta }}_{0l_2})\varvec{A}_{l_2}^{-1/2}\varvec{B}_{k_2}\varvec{A}_{l_2}^{-1/2}\hat{\varvec{W}}_{l_2}(\varvec{y}_{l_2}-{{\varvec{\mu }}}_{0l_2})\} +o_p(n^{-1/2})\\ =-\frac{2}{n^2}&\sum \limits _{l_1,l_2= 1}^n\tilde{\varvec{x}}_{l_1}^T\hat{{\varvec{\tau }}}_{l_1l_2}\{\tilde{\varvec{R}}(\varvec{t}_{l_2})+{\varvec{\varepsilon }}_{l_2}\} +o_p(n^{-1/2}), \end{aligned}$$

where \(\varvec{S}^{-1}_{n,k_1k_2}\) represents the \((k_1,k_2)\) block of \(\varvec{S}_n^{-1}\). Here \(\tilde{\varvec{x}}_{l_1}=\varvec{H}'({\varvec{\eta }}_{0l_1})\varvec{x}_{l_1}\), \(\tilde{\varvec{R}}(\varvec{t}_{l_2})=\varvec{H}'({\varvec{\eta }}_{0{l_2}})\varvec{R}(\varvec{t}_{l_2})\) and, for brevity,

$$\begin{aligned}\hat{{\varvec{\tau }}}_{l_1l_2}=\sum \limits _{k_1,k_2 = 1}^s \varvec{A}_{l_1}^{-1/2} \varvec{B}_{k_1}\varvec{A}_{l_1}^{-1/2}\hat{\varvec{W}}_{l_1}\varvec{H}'({\varvec{\eta }}_{0l_1})\varvec{d}_{l_1} \varvec{S}^{-1}_{n,k_1k_2}\varvec{d}_{l_2}^T\varvec{H}'({\varvec{\eta }}_{0l_2})\varvec{A}_{l_2}^{-1/2}\varvec{B}_{k_2}\varvec{A}_{l_2}^{-1/2}\hat{\varvec{W}}_{l_2}, \end{aligned}$$

and

$$\begin{aligned} {{\varvec{\tau }}_{l_1l_2}}=\sum \limits _{k_1,k_2 = 1}^s \varvec{A}_{l_1}^{-1/2} \varvec{B}_{k_1}\varvec{A}_{l_1}^{-1/2}{\varvec{W}}_{l_1}\varvec{H}'({\varvec{\eta }}_{0l_1})\varvec{d}_{l_1} {\varvec{\Lambda }}^{-1}_{g,k_1k_2}\varvec{d}_{l_2}^T\varvec{H}'({\varvec{\eta }}_{0l_2})\varvec{A}_{l_2}^{-1/2}\varvec{B}_{k_2}\varvec{A}_{l_2}^{-1/2}{\varvec{W}}_{l_2}. \end{aligned}$$

As in the proof of Lemma 1, \(\hat{{\varvec{\tau }}}_{l_1l_2}={\varvec{\tau }}_{l_1l_2}(1+o_p(1))\). Therefore, we have

$$\begin{aligned}&\frac{1}{n}\frac{{\partial {\varvec{L}}_{1n}({{\varvec{\beta }}}_0,{{\varvec{\alpha }}}_0)}}{{\partial {{\varvec{\beta }}}}}=-\frac{2}{n^2}\sum \limits _{l_1,l_2 = 1}^n\tilde{\varvec{x}}_{l_1}^T\hat{{\varvec{\tau }}}_{l_1l_2}\tilde{\varvec{x}}_{l_2}+o_p(n^{-1/2}),\\&\frac{1}{n}\frac{{\partial {\varvec{L}}_{1n}({{\varvec{\beta }}}_0,{{\varvec{\alpha }}}_0)}}{{\partial {{\varvec{\alpha }}}}}=-\frac{2}{n^2}\sum \limits _{l_1,l_2= 1}^n\tilde{\varvec{x}}_{l_1}^T\hat{{\varvec{\tau }}}_{l_1l_2}\tilde{\varvec{m}}_{l_2}+o_p(n^{-1/2}), \end{aligned}$$

where \(\tilde{\varvec{m}}_{l_2}=\varvec{H}'({\varvec{\eta }}_{0l_2})\varvec{m}_{l_2}\). Thus,

$$\begin{aligned} \begin{aligned} \frac{1}{n}\varvec{L}_{1n}({\hat{{{\varvec{\beta }}}}}_Q,{\hat{{{\varvec{\alpha }}}}}_Q)&=-\frac{2}{n^2}\sum \limits _{l_1,l_2 = 1}^n\tilde{\varvec{x}}_{l_1}^T\hat{{\varvec{\tau }}}_{l_1l_2}\{\tilde{\varvec{x}}_{l_2}({\hat{{{\varvec{\beta }}}}}_Q-{{\varvec{\beta }}}_0)+\tilde{\varvec{m}}_{l_2}({\hat{{{\varvec{\alpha }}}}}_Q-{{\varvec{\alpha }}}_0)\}\\&\quad -\frac{2}{n^2}\sum \limits _{l_1,l_2 = 1}^n\tilde{\varvec{x}}_{l_1}^T\hat{{\varvec{\tau }}}_{l_1l_2}\{\tilde{\varvec{R}}(\varvec{t}_{l_2})+{\varvec{\varepsilon }}_{l_2}\}+o_p(n^{-1/2}). \end{aligned} \end{aligned}$$
(15)

Similarly,

$$\begin{aligned} \begin{aligned} \frac{1}{n}\varvec{L}_{2n}({\hat{{{\varvec{\beta }}}}}_Q,{\hat{{{\varvec{\alpha }}}}}_Q)&=-\frac{2}{n^2}\sum \limits _{l_1,l_2 = 1}^n\tilde{\varvec{m}}_{l_1}^T\hat{{\varvec{\tau }}}_{l_1l_2}\{\tilde{\varvec{x}}_{l_2}({\hat{{{\varvec{\beta }}}}}_Q-{{\varvec{\beta }}}_0)+\tilde{\varvec{m}}_{l_2}({\hat{{{\varvec{\alpha }}}}}_Q-{{\varvec{\alpha }}}_0)\}\\&\quad -\frac{2}{n^2}\sum \limits _{l_1,l_2 = 1}^n\tilde{\varvec{m}}_{l_1}^T\hat{{\varvec{\tau }}}_{l_1l_2}\{\tilde{\varvec{R}}(\varvec{t}_{l_2})+{\varvec{\varepsilon }}_{l_2}\}+o_p(n^{-1/2}). \end{aligned} \end{aligned}$$
(16)

From Eq. (16), we have

$$\begin{aligned} {\hat{{{\varvec{\alpha }}}}}_Q-{{\varvec{\alpha }}}_0=-\varvec{V}_g^{-1}\big \{\varvec{K}_g({\hat{{{\varvec{\beta }}}}}_Q-{{\varvec{\beta }}}_0)+\varvec{P}_g\big \}, \end{aligned}$$
(17)

where \(\varvec{V}_g=n^{-2}{\sum \nolimits _{l_1,l_2= 1}^n\tilde{\varvec{m}}_{l_1}^T\hat{{\varvec{\tau }}}_{l_1l_2}\tilde{\varvec{m}}_{l_2}}\), \(\varvec{K}_g=n^{-2}\sum \nolimits _{l_1,l_2 = 1}^n\tilde{\varvec{m}}_{l_1}^T\hat{{\varvec{\tau }}}_{l_1l_2}\tilde{\varvec{x}}_{l_2}\) and \(\varvec{P}_g=n^{-2}\sum \nolimits _{l_1,l_2= 1}^n\tilde{\varvec{m}}_{l_1}^T \hat{{\varvec{\tau }}}_{l_1l_2} \{\tilde{\varvec{R}}(\varvec{t}_{l_2})+{\varvec{\varepsilon }}_{l_2}\}\). By the law of large numbers, \(\varvec{V}_g {\mathop {\longrightarrow }\limits ^{{{p}}}}E(\tilde{\varvec{m}}^T{\varvec{\tau }}\tilde{\varvec{m}})\) and \(\varvec{K}_g {\mathop {\longrightarrow }\limits ^{{{p}}}}E(\tilde{\varvec{m}}^T{\varvec{\tau }}\tilde{\varvec{x}})\), with \(\tilde{\varvec{x}}=(\tilde{\varvec{x}}_1^T,\ldots ,\tilde{\varvec{x}}_n^T)^T\), \(\tilde{\varvec{m}}=(\tilde{\varvec{m}}_1^T,\ldots ,\tilde{\varvec{m}}_n^T)^T\) and

$$\begin{aligned}{\varvec{\tau }}=\left( \begin{array}{ccc} {\varvec{\tau }}_{11} &{} \ldots &{} {\varvec{\tau }}_{1n} \\ \vdots &{} \ddots &{} \vdots \\ {\varvec{\tau }}_{n1} &{} \ldots &{} {\varvec{\tau }}_{nn} \\ \end{array} \right) .\end{aligned}$$

By substituting (17) into the Eq. (15), we obtain

$$\begin{aligned} \begin{aligned}&\frac{1}{n^2}\sum \limits _{l_1,l_2 = 1}^n\tilde{\varvec{x}}_{l_1}^T\hat{{\varvec{\tau }}}_{l_1l_2}\{\tilde{\varvec{x}}_{l_2}-\tilde{\varvec{m}}_{l_2}\varvec{V}_g^{-1}\varvec{K}_g\}({\hat{{{\varvec{\beta }}}}}_Q-{{\varvec{\beta }}}_0)+o_p(n^{-1/2})\\&\quad =\frac{1}{n^2}\sum \limits _{l_1,l_2 = 1}^n\tilde{\varvec{x}}_{l_1}^T\hat{{\varvec{\tau }}}_{l_1l_2}\{\tilde{\varvec{R}}(\varvec{t}_{l_2})+{\varvec{\varepsilon }}_{l_2}-\tilde{\varvec{m}}_{l_2}\varvec{V}_g^{-1}\varvec{P}_g\}. \end{aligned} \end{aligned}$$
(18)

By a simple calculation, we can obtain

$$\begin{aligned}n^{-2}\sum \limits _{l_1,l_2= 1}^n\varvec{K}_g^T \varvec{V}_g^{-1}\tilde{\varvec{m}}_{l_1}^T\hat{{\varvec{\tau }}}_{l_1l_2}\{\tilde{\varvec{x}}_{l_2}-\tilde{\varvec{m}}_{l_2}\varvec{V}_g^{-1}\varvec{K}_g\}=0,\end{aligned}$$
$$\begin{aligned}n^{-2}\sum \limits _{l_1,l_2= 1}^n\varvec{K}_g^T \varvec{V}_g^{-1}\tilde{\varvec{m}}_{l_1}^T\hat{{\varvec{\tau }}}_{l_1l_2}\{{\varvec{\varepsilon }}_{l_2}+\tilde{\varvec{R}}(\varvec{t}_{l_2})-\tilde{\varvec{m}}_{l_2}\varvec{V}_g^{-1}\varvec{P}_g\}=0.\end{aligned}$$
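These two identities are a weighted Frisch–Waugh orthogonality: partialling the spline columns \(\tilde{\varvec{m}}\) out of \(\tilde{\varvec{x}}\) under the weighting \({\varvec{\tau }}\) leaves the parametric coefficient unchanged. The toy check below uses simulated stand-ins (the design, basis, weight matrix and coefficients are not the paper's quantities):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500
x = rng.normal(size=(n, 2))                    # stand-in for the parametric design
m = rng.normal(size=(n, 3))                    # stand-in for the spline basis
tau = np.diag(rng.uniform(0.5, 2.0, size=n))   # stand-in positive-definite weight
y = x @ np.array([1.0, -0.5]) + m @ np.array([0.3, 0.2, -0.1]) + rng.normal(size=n)

# joint weighted least squares on (x, m): keep only the x-block of the coefficients
Z = np.hstack([x, m])
coef_joint = np.linalg.solve(Z.T @ tau @ Z, Z.T @ tau @ y)[:2]

# partial m out of x under tau, then regress y on the residualized x alone
proj = m @ np.linalg.solve(m.T @ tau @ m, m.T @ tau)   # tau-projection onto span(m)
x_star = x - proj @ x                                  # analogue of x_i^*
coef_fw = np.linalg.solve(x_star.T @ tau @ x_star, x_star.T @ tau @ y)
```

The exact equality of `coef_joint` and `coef_fw` mirrors how the residualized design \(\varvec{x}_i^*\) removes the \(\tilde{\varvec{m}}\)-direction before the limiting matrix \({\varvec{\Gamma }}_g\) is formed.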

Defining \(\varvec{x}_i^*=\tilde{\varvec{x}}^T_i-\varvec{K}_g^T\varvec{V}_g^{-1}\tilde{\varvec{m}}_i^T\) and substituting it into (18) leads to

$$\begin{aligned}&\big \{\frac{1}{n^2}\sum \limits _{l_1,l_2 = 1}^n\varvec{x}_{l_1}^*\hat{{\varvec{\tau }}}_{l_1l_2}\varvec{x}_{l_2}^{*T} +o_p(1)\big \}\sqrt{n}({\hat{{{\varvec{\beta }}}}}_Q-{{\varvec{\beta }}}_0)\\&\quad =\frac{1}{n^{{3}/{2}}}\sum \limits _{l_1,l_2 = 1}^n \varvec{x}_{l_1}^{*}\hat{{\varvec{\tau }}}_{l_1l_2}{\varvec{\varepsilon }}_{l_2} -\frac{1}{n^{3/2}}\sum \limits _{l_1,l_2= 1}^n \varvec{x}_{l_{1}}^{*} \hat{{\varvec{\tau }}}_{l_{1}l_{2}} \tilde{\varvec{m}}_{l_{2}}\varvec{V}_g^{-1}\varvec{P}_g +\frac{1}{n^{3/2}}\sum \limits _{l_1,l_2 = 1}^n \varvec{x}_{l_1}^{*}\hat{{\varvec{\tau }}}_{l_1l_2}\tilde{\varvec{R}}(\varvec{t}_{l_2})\\&\quad =:\varvec{A}_1+\varvec{A}_2+\varvec{A}_3. \end{aligned}$$

Note that

$$\begin{aligned} \begin{aligned} \varvec{A}_1&=\frac{1}{n^{3/2}}\sum \limits _{l_1,l_2 = 1}^n\varvec{x}_{l_1}^*\hat{{\varvec{\tau }}}_{l_1l_2}{\varvec{\varepsilon }}_{l_2}=\frac{1}{\sqrt{n}}\sum \limits _{l_2= 1}^n\big \{\frac{1}{n}\sum \limits _{l_1 = 1}^n\varvec{x}_{l_1}^*\hat{{\varvec{\tau }}}_{l_1l_2}{\varvec{\varepsilon }}_{l_2}\big \} =\frac{1}{\sqrt{n}}\sum \limits _{l_2= 1}^n \hat{{\varvec{z}}}_{gl_2}({{\varvec{\beta }}}_0,\hat{{\varvec{\gamma }}}), \end{aligned}\end{aligned}$$

where

$$\begin{aligned} \hat{\varvec{z}}_{gl_2}({{\varvec{\beta }}}_0,\hat{{\varvec{\gamma }}})&=\frac{1}{n}\sum \limits _{l_1 = 1}^n\sum \limits _{k_1,k_2= 1}^s \varvec{x}_{l_1}^*\varvec{A}_{l_1}^{-1/2} \varvec{B}_{k_1}\varvec{A}_{l_1}^{-1/2}\hat{\varvec{W}}_{l_1}\varvec{H}'({\varvec{\eta }}_{0l_1})\varvec{d}_{l_1} \varvec{S}^{-1}_{n,k_1k_2}\\&\quad \times \varvec{d}_{l_2}^T\varvec{H}'({\varvec{\eta }}_{0l_2})\varvec{A}_{l_2}^{-1/2}\varvec{B}_{k_2}\varvec{A}_{l_2}^{-1/2}\hat{\varvec{W}}_{l_2}{\varvec{\varepsilon }}_{l_2},\\ {\varvec{z}}_{gl_2}({{\varvec{\beta }}}_0,{{\varvec{\gamma }}_0})&=\frac{1}{n}\sum \limits _{l_1 = 1}^n\sum \limits _{k_1,k_2= 1}^s \varvec{x}_{l_1}^*\varvec{A}_{l_1}^{-1/2} \varvec{B}_{k_1}\varvec{A}_{l_1}^{-1/2}{\varvec{W}}_{l_1}\varvec{H}'({\varvec{\eta }}_{0l_1})\varvec{d}_{l_1} {\varvec{\Lambda }}^{-1}_{g,k_1k_2}\\&\quad \times \varvec{d}_{l_2}^T\varvec{H}'({\varvec{\eta }}_{0l_2})\varvec{A}_{l_2}^{-1/2}\varvec{B}_{k_2}\varvec{A}_{l_2}^{-1/2}{\varvec{W}}_{l_2}{\varvec{\varepsilon }}_{l_2}. \end{aligned}$$

Using the similar arguments in the proof of Lemma 1, we have

$$\begin{aligned} \frac{1}{n}\sum \limits _{l_2 = 1}^n {\hat{\varvec{z}}_{gl_2}( {{{{\varvec{\beta }}}_0},\hat{{\varvec{\gamma }}} } )\hat{\varvec{z}}_{gl_2}^T} ( {{{{\varvec{\beta }}}_0},\hat{{\varvec{\gamma }}} } ) {\mathop {\longrightarrow }\limits ^{{{p}}}}{\varvec{B}_{g}}, ~~~\frac{1}{{\sqrt{n} }}\sum \limits _{l_2= 1}^n {\hat{\varvec{z}}_{gl_2}( {{{{\varvec{\beta }}}_0},\hat{{\varvec{\gamma }}} } ) {\mathop {\longrightarrow }\limits ^{d}}} N( {0,{{\varvec{\Psi }}_{g}}} ), \end{aligned}$$

where \(\varvec{B}_{g} = E\big ( {{\varvec{z}_{gl_2}}( {{{{\varvec{\beta }}}_0},{{\varvec{\gamma }} _0}} )\varvec{z}_{gl_2}^T( {{{{\varvec{\beta }}}_0},{{\varvec{\gamma }} _0}} )} \big )=E\big \{[\tilde{\varvec{x}}^T{\varvec{\tau }}-\tilde{\varvec{x}}^T{\varvec{\tau }}\tilde{\varvec{m}}[\tilde{\varvec{m}}^T{\varvec{\tau }}\tilde{\varvec{m}}]^{-1}\tilde{\varvec{m}}^T{\varvec{\tau }}] {{\varvec{\varepsilon }}}\big \}^{\otimes 2}\) and \({\varvec{\Psi }}_{g}=\varvec{B}_{g}+E( {\partial {\varvec{z}_{gl_2}}( {{{{\varvec{\beta }}}_0},{{\varvec{\gamma }} _0}} )/\partial {{\varvec{\gamma }}} } ){{\varvec{\Sigma }}}E( {\partial {\varvec{z}_{gl_2}}( {{{{\varvec{\beta }}}_0},{{\varvec{\gamma }} _0}} )/\partial {{\varvec{\gamma }}} } )^T\). Then we have \(\varvec{A}_1{\mathop {\longrightarrow }\limits ^{d}}N(0,{\varvec{\Psi }}_g)\). Since

$$\begin{aligned}\frac{1}{n^{3/2}}\sum \limits _{l_1,l_2= 1}^n\varvec{x}_{l_1}^*\hat{{\varvec{\tau }}}_{l_1l_2}\tilde{\varvec{m}}_{l_2}=0,\end{aligned}$$

it follows that \(\varvec{A}_2=0\). Expanding \({\varvec{x}}_{l_1}^*\) in \(\varvec{A}_3\), we have

$$\begin{aligned} \begin{aligned} \varvec{A}_3&=\frac{1}{n^{3/2}}\sum \limits _{l_1,l_2=1}^n(\tilde{\varvec{x}}^T_{l_1}-\varvec{K}^T_g\varvec{V}_g^{-1}\tilde{\varvec{m}}_{l_1}^T)\hat{{\varvec{\tau }}}_{l_1l_2}\tilde{\varvec{R}}(\varvec{t}_{l_2})\\&=\frac{1}{n^{3/2}}\sum \limits _{l_1,l_2=1}^n\big [\tilde{\varvec{x}}^T_{l_1}-\{E^T(\tilde{\varvec{m}}^T{\varvec{\tau }}\tilde{\varvec{x}})\}\{E^{-1}(\tilde{\varvec{m}}^T{\varvec{\tau }}\tilde{\varvec{m}})\}\tilde{\varvec{m}}_{l_1}^T\big ]\hat{{\varvec{\tau }}}_{l_1l_2}\tilde{\varvec{R}}(\varvec{t}_{l_2})\\&\quad -\frac{1}{n^{3/2}}\sum \limits _{l_1,l_2=1}^n\big [\varvec{K}^T_g\varvec{V}_g^{-1}-\{E^T(\tilde{\varvec{m}}^T{\varvec{\tau }}\tilde{\varvec{x}})\}\{E^{-1}(\tilde{\varvec{m}}^T{\varvec{\tau }}\tilde{\varvec{m}})\}\big ]\tilde{\varvec{m}}_{l_1}^T\hat{{\varvec{\tau }}}_{l_1l_2}\tilde{\varvec{R}}(\varvec{t}_{l_2})\\&=:\varvec{A}_{31}+\varvec{A}_{32}. \end{aligned} \end{aligned}$$

From the definition of \(\varvec{K}_g\), it is easy to verify that

$$\begin{aligned}E\big [\big \{\tilde{\varvec{x}}^T_{l_1}-E^T(\tilde{\varvec{m}}^T{\varvec{\tau }}\tilde{\varvec{x}})E^{-1}(\tilde{\varvec{m}}^T{\varvec{\tau }}\tilde{\varvec{m}})\tilde{\varvec{m}}_{l_1}^T\big \}{\varvec{\tau }}_{l_1l_2}\tilde{\varvec{m}}_{l_2}\big ]=0,\end{aligned}$$

which implies

$$\begin{aligned}\frac{1}{n^2}\sum \limits _{l_1,l_2=1}^n \big \{\tilde{\varvec{x}}^T_{l_1}-E^T(\tilde{\varvec{m}}^T{\varvec{\tau }}\tilde{\varvec{x}})E^{-1}(\tilde{\varvec{m}}^T{\varvec{\tau }}\tilde{\varvec{m}})\tilde{\varvec{m}}_{l_1}^T\big \}\hat{{\varvec{\tau }}}_{l_1l_2}\tilde{\varvec{m}}_{l_2}=O_p(1).\end{aligned}$$

Recalling that \(\Vert {\varvec{m}( {{t_{ij}}})}\Vert = O_p( 1 )\) and \(\Vert {{R}( {{t_{ij}}} )} \Vert =o_p(1)\), we have \(\varvec{A}_{31}=o_p(1)\). Similarly, \(\varvec{A}_{32}=o_p(1)\), and hence \(\varvec{A}_{3}=o_p(1)\). By the law of large numbers, we have

$$\begin{aligned}\frac{1}{n^2}\sum \limits _{l_1,l_2 = 1}^n\varvec{x}_{l_1}^*\hat{{\varvec{\tau }}}_{l_1l_2}\varvec{x}_{l_2}^{*T}{\mathop {\longrightarrow }\limits ^{{{p}}}}{\varvec{\Gamma }}_g,\end{aligned}$$

where \({\varvec{\Gamma }}_g=E\big \{\tilde{\varvec{x}}^T{\varvec{\tau }}\tilde{\varvec{x}}-\tilde{\varvec{x}}^T{\varvec{\tau }}\tilde{\varvec{m}}[\tilde{\varvec{m}}^T{\varvec{\tau }}\tilde{\varvec{m}}]^{-1}\tilde{\varvec{m}}^T{\varvec{\tau }}\tilde{\varvec{x}}\big \}\). Hence,

$$\begin{aligned}\sqrt{n}({\hat{{{\varvec{\beta }}}}}_Q-{{\varvec{\beta }}}_0){\mathop {\longrightarrow }\limits ^{d}}N(0,{\varvec{\Gamma }}_g^{-1}{\varvec{\Psi }}_g{\varvec{\Gamma }}_g^{-1}),\end{aligned}$$

and the proof of Theorem 2 is completed. \(\square \)
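The limiting covariance \({\varvec{\Gamma }}_g^{-1}{\varvec{\Psi }}_g{\varvec{\Gamma }}_g^{-1}\) has the usual sandwich structure for estimating equations. Its logic can be illustrated by Monte Carlo in a deliberately simple setting, a scalar heteroscedastic regression rather than the paper's weighted GEE with estimated propensities: the Monte Carlo variance of \(\sqrt{n}(\hat{\beta }-\beta _0)\) should match the average plug-in sandwich estimate. All constants below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)
n, reps, beta0 = 400, 800, 1.5
est, sand = [], []
for _ in range(reps):
    x = rng.normal(size=n)
    eps = (0.5 + np.abs(x)) * rng.normal(size=n)    # heteroscedastic errors
    y = beta0 * x + eps
    bhat = (x @ y) / (x @ x)          # solves the estimating equation sum x_i(y_i - x_i b) = 0
    r = y - bhat * x
    gamma = (x @ x) / n               # plug-in Gamma: E[x^2]
    psi = np.mean(x**2 * r**2)        # plug-in Psi: E[x^2 eps^2]
    est.append(np.sqrt(n) * (bhat - beta0))
    sand.append(psi / gamma**2)       # sandwich Gamma^{-1} Psi Gamma^{-1}

ratio = np.var(est) / np.mean(sand)   # should be close to 1
```

A naive variance that ignores heteroscedasticity would be off here; the sandwich form remains valid, which is the same robustness the theorem's limiting covariance provides.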


Cite this article

Shao, Y., Wang, L. Generalized partial linear models with nonignorable dropouts. Metrika 85, 223–252 (2022). https://doi.org/10.1007/s00184-021-00828-z
