Model averaging for right censored data with measurement error

Abstract

This paper studies model averaging estimation for linear regression models in which the responses are right censored and the covariates are measured with error. A novel weighted Mallows-type criterion is proposed for this setting by introducing multiple candidate models, and the model averaging weight vector is selected by minimizing the proposed criterion. Under some regularity conditions, the selected weight vector is shown to be asymptotically optimal in the sense that it achieves the lowest squared loss asymptotically. Simulation results show that the proposed method outperforms existing related methods, and a real data example illustrates its practical performance.
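
The criterion itself and the candidate-model fits are defined in the main text and are not reproduced in this excerpt. As a rough illustration of the weight-selection step only, the following minimal sketch minimizes a generic Mallows-type criterion over the weight simplex; the function name `select_weights`, the generic penalty term, and the use of `scipy.optimize.minimize` are illustrative assumptions rather than the authors' implementation.

```python
import numpy as np
from scipy.optimize import minimize

def select_weights(z, fits, penalties):
    """Select model averaging weights by minimizing a Mallows-type criterion.

    z         : (n,) working response vector (in the paper's setting, synthetic
                responses built from the censored data; hypothetical input here)
    fits      : (n, M) array; column m holds the fitted values of candidate model m
    penalties : (M,) array; penalties[m] stands in for the trace-type penalty of model m
    """
    fits = np.asarray(fits, dtype=float)
    penalties = np.asarray(penalties, dtype=float)
    n, M = fits.shape

    def criterion(w):
        mu_hat = fits @ w                                  # averaged fit mu_hat(w)
        return np.sum((z - mu_hat) ** 2) + 2.0 * penalties @ w

    # weights restricted to the simplex: w_m >= 0 and sum_m w_m = 1
    constraints = ({'type': 'eq', 'fun': lambda w: np.sum(w) - 1.0},)
    bounds = [(0.0, 1.0)] * M
    w0 = np.full(M, 1.0 / M)                               # start from equal weights
    result = minimize(criterion, w0, method='SLSQP', bounds=bounds,
                      constraints=constraints)
    return result.x
```

Because such a criterion is quadratic in the weights, the same problem could also be handed to a quadratic programming solver; the SLSQP call above is used only to keep the sketch self-contained.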

References

  • Ando T, Li KC (2014) A model-averaging approach for high-dimensional regression. J Am Stat Assoc 109(505):254–265

  • Ando T, Li KC (2017) A weight-relaxed model averaging approach for high-dimensional generalized linear models. Ann Stat 45(6):2654–2679

  • Carroll RJ, Ruppert D, Stefanski LA, Crainiceanu CM (2006) Measurement error in nonlinear models: a modern perspective. Chapman and Hall-CRC, Boca Raton

  • Chen L, Yi GY (2020) Model selection and model averaging for analysis of truncated and censored data with measurement error. Electron J Stat 14(2):4054–4109

  • Claeskens G, Hjort NL (2003) The focused information criterion. J Am Stat Assoc 98(464):900–916

  • Claeskens G, Hjort NL (2008) Model selection and model averaging. Cambridge University Press, Cambridge

  • Dong Q, Liu B, Zhao H (2023) Weighted least squares model averaging for accelerated failure time models. Comput Stat Data Anal 184:107743

  • Du J, Zhang Z, Xie T (2017) Focused information criterion and model averaging in censored quantile regression. Metrika 80(5):547–570

  • Han P, Kong L, Zhao J, Zhou X (2019) A general framework for quantile estimation with incomplete data. J R Stat Soc Ser B Stat Methodol 81(2):305–333

  • Hansen BE (2007) Least squares model averaging. Econometrica 75(4):1175–1189

  • Hansen BE, Racine JS (2012) Jackknife model averaging. J Econom 167(1):38–46

  • Hoeting J, Madigan D, Raftery A, Volinsky C (1999) Bayesian model averaging: a tutorial. Stat Sci 14(4):382–417

  • Kaplan EL, Meier P (1958) Nonparametric estimation from incomplete observations. J Am Stat Assoc 53(282):457–481

  • Li KC (1987) Asymptotic optimality for \(C_p, C_L\), cross-validation and generalized cross-validation: discrete index set. Ann Stat 15(3):958–975

  • Li G, Wang Q (2003) Empirical likelihood regression analysis for right censored data. Stat Sin 13(1):51–68

  • Li M, Wang X (2023) Semiparametric model averaging method for survival probability predictions of patients. Comput Stat Data Anal 185:107759

  • Li J, Yu T, Lv J, Lee M-LT (2021) Semiparametric model averaging prediction for lifetime data via hazards regression. J R Stat Soc Ser C Appl Stat 70(5):1187–1209

  • Liang H, Li R (2009) Variable selection for partially linear models with measurement errors. J Am Stat Assoc 104(485):234–248

  • Liang H, Wang S, Carroll RJ (2007) Partially linear models with missing response variables and error-prone covariates. Biometrika 94(1):185–198

  • Liang Z, Chen X, Zhou Y (2022) Mallows model averaging estimation for linear regression model with right censored data. Acta Math Appl Sin Engl Ser 38(1):5–23

  • Liao J, Zou G (2020) Corrected Mallows criterion for model averaging. Comput Stat Data Anal 144:106902

  • Liao J, Zong X, Zhang X, Zou G (2019) Model averaging based on leave-subject-out cross-validation for vector autoregressions. J Econom 209(1):35–60

  • Liu Q, Okui R (2013) Heteroscedasticity-robust \(C_p\) model averaging. Econom J 16(3):463–472

  • Longford NT (2005) Model selection and efficiency-is ‘Which model...?’ the right question? J R Stat Soc Ser A Stat Soc 168(3):469–472

  • Raftery AE, Zheng Y (2003) Discussion: performance of Bayesian model averaging. J Am Stat Assoc 98(464):931–938

  • Su M, Wang R, Wang Q (2022) A two-stage optimal subsampling estimation for missing data problems with large-scale data. Comput Stat Data Anal 173:107505

  • Sun Z, Sun L, Lu X, Zhu J, Li Y (2017) Frequentist model averaging estimation for the censored partial linear quantile regression model. J Stat Plan Inference 189:1–15

  • Tang ML, Tang NS, Zhao PY, Zhu H (2018) Efficient robust estimation for linear models with missing response at random. Scand J Stat 45(2):366–381

  • Wan ATK, Zhang X, Zou G (2010) Least squares model averaging by Mallows criterion. J Econom 156(2):277–283

  • Wang H, Zou G, Wan ATK (2012) Model averaging for varying-coefficient partially linear measurement error models. Electron J Stat 6:1017–1039

  • Wen C (2012) Cox regression for mixed case interval-censored data with covariate errors. Lifetime Data Anal 18(3):321–338

  • Yan X, Wang H, Wang W, Xie J, Ren Y, Wang X (2021) Optimal model averaging forecasting in high-dimensional survival analysis. Int J Forecast 37(3):1147–1155

  • Zhang X, Liu C-A (2023) Model averaging prediction by K-fold cross-validation. J Econom 235(1):280–301

  • Zhang T, Wang L (2020) Smoothed empirical likelihood inference and variable selection for quantile regression with nonignorable missing response. Comput Stat Data Anal 144:106888

  • Zhang X, Zou G, Carroll RJ (2015) Model averaging based on Kullback–Leibler distance. Stat Sin 25:1583–1598

  • Zhang X, Yu D, Zou G, Liang H (2016) Optimal model averaging estimation for generalized linear models and generalized linear mixed-effects models. J Am Stat Assoc 111(516):1775–1790

  • Zhang X, Wang H, Ma Y, Carroll RJ (2017) Linear model selection when covariates contain errors. J Am Stat Assoc 112(520):1553–1561

  • Zhang X, Ma Y, Carroll RJ (2019) MALMEM: model averaging in linear measurement error models. J R Stat Soc Ser B Stat Methodol 81(4):763–779

  • Zhou M (1992) Asymptotic normality of the ‘synthetic data’ regression estimator for censored survival data. Ann Stat 20:1002–1021

  • Zhu R, Wan ATK, Zhang X, Zou G (2019) A Mallows-type model averaging estimator for the varying-coefficient partially linear model. J Am Stat Assoc 114(526):882–892

Acknowledgements

The authors are grateful to the editor, associate editor and three referees for their comments and suggestions that helped improve the manuscript greatly. This work was supported by the Institute of Digital Finance, Hangzhou City University.

Author information

Corresponding author

Correspondence to Caiya Zhang.

Ethics declarations

Conflict of interest

The authors report there are no competing interests to declare.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix 1. Regularity conditions

In Appendix 1, some additional notations and regularity conditions are listed. Let \(J(t)=1-\{1-F(t)\}\{1-G(t)\}\), where \(F(\cdot )\) is the cumulative distribution function of \(\mathbf{{Y}}\). Let \(\mathbf{{Q}}(\mathbf{{w}})\) and \(\hat{\mathbf{{Q}}}(\mathbf{{w}})\) be two \(n\times n\) diagonal matrices whose diagonal elements are \(p_{ii}(\mathbf{{w}})\) and \({\hat{p}}_{ii}(\mathbf{{w}})\), respectively, for \(i=1,\ldots ,n\). Let \(\lambda _{\max }(\cdot )\) and \(\lambda _{\min }(\cdot )\) denote the largest and the smallest singular values of a given matrix, respectively. Denote \(\xi _G=\inf _{\mathbf{{w}}\in W_n}R_G(\mathbf{{w}})\).

  1. (C.1) \(1-G(\tau _J-)>0\) with \(\tau _J=\inf \{t:J(t)=1\}\).

  2. (C.2) \(\max _{1\le i\le n}E(\epsilon ^{4}_i|\mathbf{{X}}_i)<\infty\), a.s., and \(\Vert \varvec{ {\upmu }}\Vert ^2=O(n)\).

  3. (C.3) \(c_1\le n^{-1}\lambda _{\min }(\mathbf{{X}}^\top _{(m)}\mathbf{{X}}_{(m)})\le n^{-1}\lambda _{\max }(\mathbf{{X}}^\top _{(m)}\mathbf{{X}}_{(m)})<c_2\) uniformly in m, a.s., and \(\lambda _{\max }(\varvec{ {\Sigma }}_T)<c_2\), where \(c_1\) and \(c_2\) are two positive constants.

  4. (C.4) \(n^{1/2}k_M/\xi _G=o(1)\) and \(k^2_M/n=o(1)\), a.s., where \(k_M<p_n\) is the number of regressors in the largest candidate model.

  5. (C.5) \(\max _{1\le m\le M}\max _{1\le i\le n}p_{(m),ii}=O(n^{-1/2})\), a.s., where \(p_{(m),ii}\) is the ith diagonal element of \(\mathbf{{P}}_{(m)}\).

  6. (C.6) \(\Vert \hat{\varvec{ {\Sigma }}}_{T,(m)}-{\varvec{ {\Sigma }}}_{T,(m)}\Vert =O_p(n^{-1/2}k_M)\) uniformly in m.

Remark 1

Condition (C.1) is commonly used in right censored data analysis; similar conditions can be found in Sun et al. (2017), Li and Wang (2003) and Liang et al. (2022). Condition (C.2) is a standard condition for linear regression models with measurement error and is similar to Condition 1 of Zhang et al. (2019). Condition (C.3) is a standard condition on the covariates of the model and the covariance of the measurement error; it is the same as Condition (C.3) in Zhang et al. (2017), and a similar condition appears as Condition (C.5) of Liao and Zou (2020). Condition (C.4) places a constraint on the number of regressors in the largest candidate model: it allows \(k_M\) to increase with n, but not arbitrarily fast. Condition (C.4) also requires that \(\xi _G\) increases faster than \(n^{1/2}k_M\), which implies that no candidate model with a finite number of regressors has zero bias. This requirement is weaker than Condition (C.1) of Zhang et al. (2017), which requires the minimum of the risk to tend to infinity faster than \(n^{1/2}p^2_n\). Other similar conditions are Condition (C.6) of Zhang et al. (2016) and Condition (C.9) in Liao and Zou (2020). Condition (C.6) requires that the estimator \(\hat{\varvec{ {\Sigma }}}_{T,(m)}\) is consistent; a similar condition is Condition (C.4) in Zhang et al. (2017).

Appendix 2. Proofs of main results

Appendix 2 gives the detailed proofs of the theorems in this paper. We first give a short proof of Lemma 1 and then present some additional lemmas that assist the proofs of the theorems. In the following, C denotes a generic positive constant that may differ from line to line. The norm of a matrix is the spectral (Euclidean) norm, that is, the largest singular value of the matrix.

Proof of Lemma 1

By the definition of \(C_{G}(\mathbf{{w}})\) in (6), we have

$$\begin{aligned} E\{C_{G}(\mathbf{{w}})\mid \tilde{\mathbf{{X}}}\} & = E[\Vert \mathbf{{Z}}_G-\hat{\varvec{ {\upmu }}}_G(\mathbf{{w}})\Vert ^2+2tr\{\mathbf{{P}}(\mathbf{{w}})\varvec{ {\Omega }}_G\}\mid \tilde{\mathbf{{X}}}]\\ & = E\{\Vert \varvec{ {\upmu }}-\hat{\varvec{ {\upmu }}}_G(\mathbf{{w}})\Vert ^2\mid \tilde{\mathbf{{X}}}\}+E(\varvec{ {\upepsilon }}^\top _{G}\varvec{ {\upepsilon }}_{G}\mid \tilde{\mathbf{{X}}})+2E[\varvec{ {\upepsilon }}^\top _{G}\{\mathbf{{I}}_n-\mathbf{{P}}(\mathbf{{w}})\}\varvec{ {\upmu }}\mid \tilde{\mathbf{{X}}}]\\&-2E[\{\varvec{ {\upepsilon }}^\top _{G}\mathbf{{P}}(\mathbf{{w}})\varvec{ {\upepsilon }}_{G}\}-tr\{\mathbf{{P}}(\mathbf{{w}})\varvec{ {\Omega }}_G\}\mid \tilde{\mathbf{{X}}}]\\ & = R_G(\mathbf{{w}})+tr(\varvec{ {\Omega }}_G), \end{aligned}$$

where \(\mathbf{{I}}_n\) is the \(n\times n\) identity matrix. \(\square\)
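
As a sanity check of this identity (not part of the original proof), the following sketch numerically verifies that \(E\{C_G(\mathbf{{w}})\}\approx R_G(\mathbf{{w}})+tr(\varvec{ {\Omega }}_G)\) for a fixed smoother matrix playing the role of \(\mathbf{{P}}(\mathbf{{w}})\) and independent heteroscedastic Gaussian errors; all specific matrices and dimensions below are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50
mu = np.sin(np.linspace(0.0, 3.0, n))               # true mean vector (arbitrary choice)
omega = np.diag(0.5 + rng.uniform(size=n))          # Omega_G: heteroscedastic error variances
X = np.column_stack([np.ones(n), np.linspace(0.0, 3.0, n)])
P = X @ np.linalg.solve(X.T @ X, X.T)               # a fixed smoother playing the role of P(w)

penalty = 2.0 * np.trace(P @ omega)                 # the 2 tr{P(w) Omega_G} correction term
c_vals, loss_vals = [], []
for _ in range(20000):
    eps = rng.normal(size=n) * np.sqrt(np.diag(omega))   # eps_G with covariance Omega_G
    z = mu + eps                                         # Z_G = mu + eps_G
    mu_hat = P @ z
    c_vals.append(np.sum((z - mu_hat) ** 2) + penalty)   # C_G(w)
    loss_vals.append(np.sum((mu - mu_hat) ** 2))         # L_G(w)

# Lemma 1: E{C_G(w)} = R_G(w) + tr(Omega_G), where R_G(w) = E{L_G(w)}
print(np.mean(c_vals), np.mean(loss_vals) + np.trace(omega))   # the two averages nearly agree
```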

Lemma 2

Provided that Conditions (C.1)–(C.2) in Appendix 1 hold, as \(n\rightarrow \infty\), we have

$$\begin{aligned} \Vert \mathbf{{Z}}_{{\hat{G}}_n}-\mathbf{{Z}}_G\Vert ^2=O_p(1). \end{aligned}$$
(13)

Proof of Lemma 2

According to the definitions of \(\mathbf{{Z}}_{{\hat{G}}_n}\) and \(\mathbf{{Z}}_G\), by Condition (C.2), we have

$$\begin{aligned} \Vert \mathbf{{Z}}_{{\hat{G}}_n}-\mathbf{{Z}}_G\Vert ^2 & = \sum _{i=1}^n\bigg \{\frac{1}{1-{\hat{G}}_n(Z_i-)}-\frac{1}{1-G(Z_i-)}\bigg \}^2\delta ^2_iZ^2_i \nonumber \\ & \le n\bigg \{\max _{1\le i\le n}\bigg |\frac{1}{1-{\hat{G}}_n(Z_i-)}-\frac{1}{1-G(Z_i-)}\bigg |\bigg \}^2O_p(1). \end{aligned}$$
(14)

Through straightforward but tedious calculations, we can obtain that

$$\begin{aligned}&\max _{1\le i\le n}\bigg |\frac{1}{1-{\hat{G}}_n(Z_i-)}-\frac{1}{1-G(Z_i-)}\bigg |\\ & \le \qquad \max _{1\le i\le n}\bigg [\bigg \{\frac{1}{1-G(Z_i-)}\bigg |\frac{{\hat{G}}_n(Z_i-)-G(Z_i-)}{1-G(Z_i-)}\bigg |\bigg \}\bigg \{1+\bigg |\frac{{\hat{G}}_n(Z_i-)-G(Z_i-)}{1-G(Z_i-)}\bigg |\\&\quad +\bigg |\frac{{\hat{G}}_n(Z_i-)-G(Z_i-)}{1-G(Z_i-)}\bigg |\bigg |\frac{{\hat{G}}_n(Z_i-)-G(Z_i-)}{1-{\hat{G}}_n(Z_i-)}\bigg |\bigg \}\bigg ]. \end{aligned}$$

Then, combining Condition (C.1) with the following results of Zhou (1992),

$$\begin{aligned} \sup _z\bigg |\frac{{\hat{G}}_n(z)-G(z)}{1-G(z)}\bigg |=O_p(n^{-1/2}),\quad \sup _z\bigg |\frac{{\hat{G}}_n(z)-G(z)}{1-{\hat{G}}_n(z)}\bigg |=O_p(1), \end{aligned}$$

we have

$$\begin{aligned}&\max _{1\le i\le n}\bigg |\frac{1}{1-{\hat{G}}_n(Z_i-)}-\frac{1}{1-G(Z_i-)}\bigg |=O_p(n^{-1/2}). \end{aligned}$$
(15)

By (14) and (15), it can be obtained that \(\Vert \mathbf{{Z}}_{{\hat{G}}_n}-\mathbf{{Z}}_G\Vert ^2=O_p(1)\). Thus, we complete the proof of Lemma 2. \(\square\)
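
The vectors \(\mathbf{{Z}}_G\) and \(\mathbf{{Z}}_{{\hat{G}}_n}\) are the synthetic responses with components \(\delta _iZ_i/\{1-G(Z_i-)\}\) and \(\delta _iZ_i/\{1-{\hat{G}}_n(Z_i-)\}\) used throughout the proofs. The following is a minimal sketch of how \({\hat{G}}_n\) and the synthetic responses could be computed, assuming \({\hat{G}}_n\) is the Kaplan–Meier estimator of the censoring distribution (with censored observations treated as its "events") and assuming no ties among the observed times; the function names are hypothetical and tie handling is omitted.

```python
import numpy as np

def censoring_km_left(z, delta):
    """Kaplan-Meier estimate of the censoring distribution, returned as G_hat(Z_i-).

    z     : (n,) observed times Z_i = min(Y_i, C_i)
    delta : (n,) failure indicators (1 = failure observed, 0 = censored)
    Censored observations act as the 'events' of the censoring time C; distinct
    observed times are assumed (no tie handling).
    """
    z = np.asarray(z, dtype=float)
    delta = np.asarray(delta, dtype=int)
    n = len(z)
    surv = 1.0                      # running value of 1 - G_hat(t)
    at_risk = n
    g_left = np.empty(n)
    for i in np.argsort(z):
        g_left[i] = 1.0 - surv      # left limit G_hat(Z_i-), the value just before Z_i
        if delta[i] == 0:           # a censoring 'event' occurs at Z_i
            surv *= 1.0 - 1.0 / at_risk
        at_risk -= 1
    return g_left

def synthetic_responses(z, delta):
    """Form the synthetic responses Z_{G_hat, i} = delta_i * Z_i / {1 - G_hat(Z_i-)}."""
    z = np.asarray(z, dtype=float)
    delta = np.asarray(delta, dtype=float)
    return delta * z / (1.0 - censoring_km_left(z, delta))
```

For example, `synthetic_responses([2.1, 3.5, 1.2, 4.0, 2.8], [1, 0, 1, 1, 0])` sets the two censored observations to zero and inflates the largest uncensored observation, which is how the transformation compensates for censoring on average.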

Lemma 3

Provided that Conditions (C.1)–(C.5) in Appendix 1 hold, as \(n\rightarrow \infty\), we have

$$\begin{aligned} \sup _{\mathbf{{w}} \in W_n}\bigg |\frac{L_G(\mathbf{{w}})}{R_G(\mathbf{{w}})}-1\bigg |=o_p(1). \end{aligned}$$
(16)

Proof of Lemma 3

By the definition of \(R_G(\mathbf{{w}})\) in (7), it follows that

$$\begin{aligned} R_G(\mathbf{{w}})=E\{L_G(\mathbf{{w}})\mid \tilde{\mathbf{{X}}}\}=\Vert \{\mathbf{{I}}_n-\mathbf{{P}}(\mathbf{{w}})\}\varvec{ {\upmu }}\Vert ^2+tr\{\mathbf{{P}}^\top (\mathbf{{w}})\varvec{ {\Omega }}_G\mathbf{{P}}(\mathbf{{w}})\}, \end{aligned}$$

where \(\varvec{ {\Omega }}_G=diag(\sigma ^2_{G,1},\ldots ,\sigma ^2_{G,n})\). Then, we have

$$\begin{aligned}&L_G(\mathbf{{w}})-R_G(\mathbf{{w}})\\&\quad =\Vert \mathbf{{P}}(\mathbf{{w}})\varvec{ {\upepsilon }}_{G}\Vert ^2-tr\{\mathbf{{P}}^\top (\mathbf{{w}})\varvec{ {\Omega }}_G\mathbf{{P}}(\mathbf{{w}})\}-2[\{\mathbf{{I}}_n-\mathbf{{P}}(\mathbf{{w}})\}\varvec{ {\upmu }}]^\top \{\mathbf{{P}}(\mathbf{{w}})\varvec{ {\upepsilon }}_{G}\}. \end{aligned}$$

Thus, to prove (16), it is equivalent to show

$$\begin{aligned}&\sup _{\mathbf{{w}} \in W_n}\frac{|\Vert \mathbf{{P}}(\mathbf{{w}})\varvec{ {\upepsilon }}_{G}\Vert ^2-tr\{\mathbf{{P}}^\top (\mathbf{{w}})\varvec{ {\Omega }}_G\mathbf{{P}}(\mathbf{{w}})\}|}{R_G(\mathbf{{w}})}=o_p(1), \end{aligned}$$
(17)
$$\begin{aligned}&\sup _{\mathbf{{w}} \in W_n}\frac{|[\{\mathbf{{I}}_n-\mathbf{{P}}(\mathbf{{w}})\}\varvec{ {\upmu }}]^\top \{\mathbf{{P}}(\mathbf{{w}}) \varvec{ {\upepsilon }}_{G}\}|}{R_G(\mathbf{{w}})}=o_p(1). \end{aligned}$$
(18)

Similar to Zhang et al. (2017), under Conditions (C.2) and (C.3), by Markov inequality, we have

$$\begin{aligned} \text {pr}\bigg (\frac{\Vert \mathbf{{T}}^\top _{(m)}\mathbf{{X}}_{(m),m}\Vert }{\sqrt{nk_M}}>C_n\bigg )\le \frac{E(\mathbf{{X}}^\top _{(m),m}\mathbf{{T}}_{(m)}\mathbf{{T}}^\top _{(m)}\mathbf{{X}}_{(m),m})}{nk_MC^2_n}\le \frac{c_2n\cdot tr({\varvec{ {\Sigma }}}_{T,(m)})}{nk_MC^2_n}\rightarrow 0 \end{aligned}$$

uniformly in m, as \(C_n\rightarrow \infty\), where \(\mathbf{{X}}_{(m),m}\) is the mth column of \(\mathbf{{X}}_{(m)}\). This implies \(\Vert \mathbf{{T}}^\top _{(m)}\mathbf{{X}}_{(m),m}\Vert =O_p(n^{1/2}k^{1/2}_M)\) uniformly in m. Similarly, we can prove that \(\Vert \mathbf{{T}}^\top _{(m)}\mathbf{{X}}_{(m)}\Vert =O_p(n^{1/2}k_M)\) and \(\Vert \mathbf{{T}}^\top _{(m)}\mathbf{{T}}_{(m)}-n{\varvec{ {\Sigma }}}_{T,(m)}\Vert =O_p(n^{1/2}k_M)\) uniformly in m. Then, under Conditions (C.2) and (C.3), following steps similar to those for (21) in Zhang et al. (2017), we can obtain that, uniformly in m,

$$\begin{aligned}&\lambda _{\max }(\tilde{\mathbf{{X}}}^\top _{(m)}\tilde{\mathbf{{X}}}_{(m)}-n\varvec{ {\Sigma }}_{T,(m)})\nonumber \\&\quad =\lambda _{\max }(\mathbf{{X}}^\top _{(m)}\mathbf{{X}}_{(m)}+\mathbf{{T}}^\top _{(m)}\mathbf{{X}}_{(m)}+\mathbf{{X}}^\top _{(m)}\mathbf{{T}}_{(m)}+\mathbf{{T}}^\top _{(m)}\mathbf{{T}}_{(m)}-n\varvec{ {\Sigma }}_{T,(m)})\nonumber \\&\quad \le \lambda _{\max }(\mathbf{{X}}^\top _{(m)}\mathbf{{X}}_{(m)})+2\Vert \mathbf{{X}}^\top _{(m)}\mathbf{{T}}_{(m)}\Vert +\Vert \mathbf{{T}}^\top _{(m)}\mathbf{{T}}_{(m)}-n{\varvec{ {\Sigma }}}_{T,(m)}\Vert \nonumber \\&\quad \le c_2n+O_p(n^{1/2}k_M), \end{aligned}$$
(19)

and

$$\begin{aligned}&\lambda _{\min }(\tilde{\mathbf{{X}}}^\top _{(m)}\tilde{\mathbf{{X}}}_{(m)}-n\varvec{ {\Sigma }}_{T,(m)})\nonumber \\&\quad =\lambda _{\min }(\mathbf{{X}}^\top _{(m)}\mathbf{{X}}_{(m)}+\mathbf{{T}}^\top _{(m)}\mathbf{{X}}_{(m)}+\mathbf{{X}}^\top _{(m)}\mathbf{{T}}_{(m)}+\mathbf{{T}}^\top _{(m)}\mathbf{{T}}_{(m)}-n\varvec{ {\Sigma }}_{T,(m)})\nonumber \\&\quad \ge \lambda _{\min }(\mathbf{{X}}^\top _{(m)}\mathbf{{X}}_{(m)})+\lambda _{\min }(\mathbf{{T}}^\top _{(m)}\mathbf{{X}}_{(m)}+\mathbf{{X}}^\top _{(m)}\mathbf{{T}}_{(m)}+\mathbf{{T}}^\top _{(m)}\mathbf{{T}}_{(m)}-n{\varvec{ {\Sigma }}}_{T,(m)})\nonumber \\&\quad \ge \lambda _{\min }(\mathbf{{X}}^\top _{(m)}\mathbf{{X}}_{(m)})-\Vert \mathbf{{T}}^\top _{(m)}\mathbf{{T}}_{(m)}-n{\varvec{ {\Sigma }}}_{T,(m)}\Vert -2\Vert \mathbf{{T}}^\top _{(m)}\mathbf{{X}}_{(m)}\Vert \nonumber \\&\quad \ge c_1n+O_p(n^{1/2}k_M). \end{aligned}$$
(20)

By (19) and (20), under Conditions (C.3) and (C.4), we can derive that

$$\begin{aligned} c_1+o_p(1)<\frac{\lambda _{\min }(\tilde{\mathbf{{X}}}^\top _{(m)}\tilde{\mathbf{{X}}}_{(m)}-n\varvec{ {\Sigma }}_{T,(m)})}{n}\le \frac{\lambda _{\max }(\tilde{\mathbf{{X}}}^\top _{(m)}\tilde{\mathbf{{X}}}_{(m)}-n\varvec{ {\Sigma }}_{T,(m)})}{n}<c_2+o_p(1), \end{aligned}$$

which further implies that

$$\begin{aligned} \lambda _{\max }\{(\tilde{\mathbf{{X}}}^\top _{(m)}\tilde{\mathbf{{X}}}_{(m)}-n\varvec{ {\Sigma }}_{T,(m)})^{-1}\}=O_p(n^{-1}). \end{aligned}$$
(21)

Similar to (19), we can obtain that \(\lambda _{\max }(\tilde{\mathbf{{X}}}^\top _{(m)}\tilde{\mathbf{{X}}}_{(m)})\le 2nc_2+O_p(n^{1/2}k_M)\). Combining this with (21), under Condition (C.4), we have

$$\begin{aligned} \lambda _{\max }(\mathbf{{P}}_{(m)})&=\lambda _{\max }\{\tilde{\mathbf{{X}}}_{(m)}(\tilde{\mathbf{{X}}}^\top _{(m)}\tilde{\mathbf{{X}}}_{(m)}-n{\varvec{ {\Sigma }}}_{T,(m)})^{-1}\tilde{\mathbf{{X}}}^\top _{(m)}\}\nonumber \\&\le \lambda _{\max }(\tilde{\mathbf{{X}}}^\top _{(m)}\tilde{\mathbf{{X}}}_{(m)})\lambda _{\max }\{(\tilde{\mathbf{{X}}}^\top _{(m)}\tilde{\mathbf{{X}}}_{(m)}-n\varvec{ {\Sigma }}_{T,(m)})^{-1}\}\nonumber \\&\le \{2nc_2+O_p(n^{1/2}k_M)\}O_p(n^{-1})\nonumber \\&=O_p(1). \end{aligned}$$
(22)

By (22), it can be obtained that \(\lambda _{\max }\{\mathbf{{P}}(\mathbf{{w}})\}=O_p(1)\). Then, we have

$$\begin{aligned} tr\{\mathbf{{P}}^\top (\mathbf{{w}})\mathbf{{P}}(\mathbf{{w}})\mathbf{{P}}^\top (\mathbf{{w}})\mathbf{{P}}(\mathbf{{w}})\}\le \lambda ^2_{\max }\{\mathbf{{P}}(\mathbf{{w}})\}tr\{\mathbf{{P}}^2(\mathbf{{w}})\}\le CR_G(\mathbf{{w}}), \end{aligned}$$
(23)

and

$$\begin{aligned} \Vert \mathbf{{P}}^\top (\mathbf{{w}})\{\mathbf{{I}}_n-\mathbf{{P}}(\mathbf{{w}})\}\varvec{ {\upmu }}\Vert ^2\le \lambda ^2_{\max }\{\mathbf{{P}}(\mathbf{{w}})\}\Vert \{\mathbf{{I}}_n-\mathbf{{P}}(\mathbf{{w}})\}\varvec{ {\upmu }}\Vert ^2\le CR_G(\mathbf{{w}}). \end{aligned}$$
(24)

In addition, by Conditions (C.1) and (C.2), using \(C_r\) inequality, we can prove

$$\begin{aligned} \max _{1\le i\le n}E(\epsilon ^{4}_{G,i}|\mathbf{{X}}_i)<\infty . \end{aligned}$$
(25)

Combining (23)–(25), under Condition (C.4), and using the same techniques as in the proof of Theorem 1 in Wan et al. (2010), we can prove (17) and (18). Therefore, Lemma 3 is valid. \(\square\)

Lemma 4

Provided that Conditions (C.2), (C.3), (C.4) and (C.6) in Appendix 1 hold, as \(n\rightarrow \infty\), we have

$$\begin{aligned} \sup _{\mathbf{{w}} \in W_n}\Vert \hat{\mathbf{{P}}}(\mathbf{{w}})-{\mathbf{{P}}}(\mathbf{{w}})\Vert =O_p(n^{-1/2}k_M). \end{aligned}$$
(26)

Proof of Lemma 4

Let \(\hat{\mathbf{{A}}}_{(m)}=n^{-1}(\tilde{\mathbf{{X}}}^\top _{(m)}\tilde{\mathbf{{X}}}_{(m)}-n\hat{\varvec{ {\Sigma }}}_{T,(m)})\) and \({\mathbf{{A}}}_{(m)}=n^{-1}(\tilde{\mathbf{{X}}}^\top _{(m)}\tilde{\mathbf{{X}}}_{(m)}-n{\varvec{ {\Sigma }}_{T,(m)}})\). Note that

$$\begin{aligned} \hat{\mathbf{{A}}}^{-1}_{(m)}-{\mathbf{{A}}}^{-1}_{(m)}=\{{\mathbf{{A}}}^{-1}_{(m)}+(\hat{\mathbf{{A}}}^{-1}_{(m)}-{\mathbf{{A}}}^{-1}_{(m)})\}({\mathbf{{A}}}_{(m)}-\hat{\mathbf{{A}}}_{(m)}){\mathbf{{A}}}^{-1}_{(m)}. \end{aligned}$$

Then, we have

$$\begin{aligned} \Vert \hat{\mathbf{{A}}}^{-1}_{(m)}-{\mathbf{{A}}}^{-1}_{(m)}\Vert \le \{\Vert {\mathbf{{A}}}^{-1}_{(m)}\Vert +\Vert \hat{\mathbf{{A}}}^{-1}_{(m)}-{\mathbf{{A}}}^{-1}_{(m)}\Vert \}\Vert {\mathbf{{A}}}_{(m)}-\hat{\mathbf{{A}}}_{(m)}\Vert \Vert {\mathbf{{A}}}^{-1}_{(m)}\Vert . \end{aligned}$$
(27)

By (21), we can easily verify that

$$\begin{aligned} \Vert \mathbf{{A}}^{-1}_{(m)}\Vert =nO_p(n^{-1})=O_p(1), \end{aligned}$$

that is, \(\sup _m\Vert \mathbf{{A}}^{-1}_{(m)}\Vert \le C<\infty\) uniformly in m. By Condition (C.6), we have \(\Vert {\mathbf{{A}}}_{(m)}-\hat{\mathbf{{A}}}_{(m)}\Vert =n^{-1}\Vert {n(\varvec{ {\Sigma }}_{T,(m)}}-\hat{\varvec{ {\Sigma }}}_{T,(m)})\Vert =O_p(n^{-1/2}k_M)\) uniformly in m. Further, by Condition (C.4), we know that \(\sup _m\Vert \mathbf{{A}}_{(m)}-\hat{\mathbf{{A}}}_{(m)}\Vert \rightarrow 0\) in probability, which implies that \(C\Vert \mathbf{{A}}_{(m)}-\hat{\mathbf{{A}}}_{(m)}\Vert < 1\) in probability. Then, by (27), we have

$$\begin{aligned} \sup _m \Vert \hat{\mathbf{{A}}}^{-1}_{(m)}-{\mathbf{{A}}}^{-1}_{(m)}\Vert \le \sup _m\frac{C\Vert \mathbf{{A}}_{(m)}-\hat{\mathbf{{A}}}_{(m)}\Vert }{1-C\Vert \mathbf{{A}}_{(m)}-\hat{\mathbf{{A}}}_{(m)}\Vert }=O_p(n^{-1/2}k_M). \end{aligned}$$
(28)

By (19) and (28), under Condition (C.4), we then have

$$\begin{aligned}&\sup _{\mathbf{{w}} \in W_n}\Vert \hat{\mathbf{{P}}}(\mathbf{{w}})-{\mathbf{{P}}}(\mathbf{{w}})\Vert \\&\quad =\sup _{\mathbf{{w}} \in W_n}\bigg \Vert \sum _{m=1}^{M}w_m(n^{-1}\tilde{\mathbf{{X}}}_{(m)}\hat{\mathbf{{A}}}^{-1}_{(m)}\tilde{\mathbf{{X}}}^\top _{(m)}-n^{-1}\tilde{\mathbf{{X}}}_{(m)}{\mathbf{{A}}}^{-1}_{(m)}\tilde{\mathbf{{X}}}^\top _{(m)})\bigg \Vert \\&\quad \le \sup _{\mathbf{{w}} \in W_n}\sum _{m=1}^{M}w_m\bigg \Vert n^{-1}\tilde{\mathbf{{X}}}_{(m)}\hat{\mathbf{{A}}}^{-1}_{(m)}\tilde{\mathbf{{X}}}^\top _{(m)}-n^{-1}\tilde{\mathbf{{X}}}_{(m)}{\mathbf{{A}}}^{-1}_{(m)}\tilde{\mathbf{{X}}}^\top _{(m)}\bigg \Vert \\&\quad \le \sup _{\mathbf{{w}} \in W_n}\sum _{m=1}^{M}w_m\{ \Vert \hat{\mathbf{{A}}}^{-1}_{(m)}-{\mathbf{{A}}}^{-1}_{(m)}\Vert \Vert n^{-1}\tilde{\mathbf{{X}}}_{(m)}\tilde{\mathbf{{X}}}^\top _{(m)}\Vert \}\\&\quad \le \sup _{\mathbf{{w}} \in W_n}\sum _{m=1}^{M}w_m\{ \sup _m\Vert \hat{\mathbf{{A}}}^{-1}_{(m)}-{\mathbf{{A}}}^{-1}_{(m)}\Vert \lambda _{\max }(n^{-1}\tilde{\mathbf{{X}}}_{(m)}\tilde{\mathbf{{X}}}^\top _{(m)})\} \\&\quad =O_p(n^{-1/2}k_M). \end{aligned}$$

Thus, we complete the proof of Lemma 4. \(\square\)
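
The key algebraic step in the proof of Lemma 4 is the exact identity \(\hat{\mathbf{{A}}}^{-1}_{(m)}-{\mathbf{{A}}}^{-1}_{(m)}=\{{\mathbf{{A}}}^{-1}_{(m)}+(\hat{\mathbf{{A}}}^{-1}_{(m)}-{\mathbf{{A}}}^{-1}_{(m)})\}({\mathbf{{A}}}_{(m)}-\hat{\mathbf{{A}}}_{(m)}){\mathbf{{A}}}^{-1}_{(m)}\), which leads to (27) and, once \(C\Vert \mathbf{{A}}_{(m)}-\hat{\mathbf{{A}}}_{(m)}\Vert <1\), to the bound in (28). The short sketch below (not part of the proof) checks this identity numerically; the matrices used are arbitrary illustrative examples.

```python
import numpy as np

rng = np.random.default_rng(1)
k = 4
A = np.eye(k) + 0.1 * rng.normal(size=(k, k))      # plays the role of A_(m)
A_hat = A + 0.01 * rng.normal(size=(k, k))         # its perturbed estimate A_hat_(m)

A_inv = np.linalg.inv(A)
A_hat_inv = np.linalg.inv(A_hat)

lhs = A_hat_inv - A_inv
rhs = (A_inv + (A_hat_inv - A_inv)) @ (A - A_hat) @ A_inv   # identity preceding (27)
print(np.max(np.abs(lhs - rhs)))                            # ~1e-16: exact up to rounding
```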

Proof of Theorem 1

Under Conditions (C.1)–(C.5) listed in Appendix 1, the idea of proving Theorem 1 is similar to that of the proof of Theorem 2, and the steps are relatively simpler than those in the proof of Theorem 2. Thus we omit the details here. \(\square\)

Proof of Theorem 2

The proposed criterion (9) can be decomposed as

$$\begin{aligned} C_{\hat{G}_n}(\mathbf{{w}}) & = \Vert \varvec{ {\upmu }}-\hat{\varvec{ {\upmu }}}_{{\hat{G}}_n}(\mathbf{{w}})\Vert ^2+2(\mathbf{{Z}}_{{\hat{G}}_n}-\mathbf{{Z}}_G)^\top \{\varvec{ {\upmu }}-\hat{\varvec{ {\upmu }}}_{{\hat{G}}_n}(\mathbf{{w}})\}+2\varvec{ {\upepsilon }}^\top _{G}\{\varvec{ {\upmu }}-\hat{\varvec{ {\upmu }}}_{{\hat{G}}_n}(\mathbf{{w}})\}\nonumber \\&+2\hat{\varvec{ {\upepsilon }}}^\top _{{\hat{G}}_n}\hat{\mathbf{{Q}}}(\mathbf{{w}})\hat{\varvec{ {\upepsilon }}}_{{\hat{G}}_n}+\Vert \mathbf{{Z}}_{{\hat{G}}_n}-\varvec{ {\upmu }}\Vert ^2, \end{aligned}$$
(29)

where \(\hat{\mathbf{{Q}}}(\mathbf{{w}})\) is defined in Appendix 1. Obviously, the last term in (29) is unrelated to \(\mathbf{{w}}\). Accordingly, following Li (1987), Theorem 2 is valid if the following three formulas hold,

$$\begin{aligned}&\sup _{\mathbf{{w}} \in W_n}\bigg |\frac{L_{{\hat{G}}_n}(\mathbf{{w}})}{R_G(\mathbf{{w}})}-1\bigg |=o_p(1), \end{aligned}$$
(30)
$$\begin{aligned}&\sup _{\mathbf{{w}} \in W_n}\frac{|(\mathbf{{Z}}_{{\hat{G}}_n}-\mathbf{{Z}}_G)^\top \{\varvec{ {\upmu }}-\hat{\varvec{ {\upmu }}}_{{\hat{G}}_n}(\mathbf{{w}})\}|}{R_G(\mathbf{{w}})}=o_p(1), \end{aligned}$$
(31)
$$\begin{aligned}&\sup _{\mathbf{{w}} \in W_n}\frac{|\varvec{ {\upepsilon }}^\top _{G}\{\varvec{ {\upmu }}-\hat{\varvec{ {\upmu }}}_{{\hat{G}}_n}(\mathbf{{w}})\}+\hat{\varvec{ {\upepsilon }}}^\top _{{\hat{G}}_n}\hat{\mathbf{{Q}}}(\mathbf{{w}})\hat{\varvec{ {\upepsilon }}}_{{\hat{G}}_n}|}{R_G(\mathbf{{w}})}=o_p(1), \end{aligned}$$
(32)

where \(L_{{\hat{G}}_n}(\mathbf{{w}})=\Vert \varvec{ {\upmu }}-\hat{\varvec{ {\upmu }}}_{{\hat{G}}_n}(\mathbf{{w}})\Vert ^2\). We first consider (30). By the Cauchy–Schwarz inequality, we have

$$\begin{aligned}&\sup _{\mathbf{{w}} \in W_n}\bigg |\frac{L_{{\hat{G}}_n}(\mathbf{{w}})}{R_G(\mathbf{{w}})}-1\bigg |\\&\quad =\sup _{\mathbf{{w}} \in W_n}\bigg |\frac{L_{{\hat{G}}_n}(\mathbf{{w}})-L_G(\mathbf{{w}})}{R_G(\mathbf{{w}})}+\frac{L_G(\mathbf{{w}})}{R_G(\mathbf{{w}})}-1\bigg |\\&\quad \le \sup _{\mathbf{{w}} \in W_n}\bigg |\frac{L_G(\mathbf{{w}})}{R_G(\mathbf{{w}})}-1\bigg |+\sup _{\mathbf{{w}} \in W_n}\frac{\Vert \hat{\varvec{ {\upmu }}}_G(\mathbf{{w}})-\hat{\varvec{ {\upmu }}}_{{\hat{G}}_n}(\mathbf{{w}})\Vert ^2}{R_G(\mathbf{{w}})}\\ &\qquad +\sup _{\mathbf{{w}} \in W_n}\frac{2\{L_G(\mathbf{{w}})\}^{1/2}\Vert \hat{\varvec{ {\upmu }}}_G(\mathbf{{w}})-\hat{\varvec{ {\upmu }}}_{{\hat{G}}_n}(\mathbf{{w}})\Vert }{R_G(\mathbf{{w}})}\\&\quad \equiv J_1+J_2+J_3. \end{aligned}$$

By Lemma 3, we know that \(J_1\) is \(o_p(1)\). Note that

$$\begin{aligned}&\Vert \hat{\varvec{ {\upmu }}}_G(\mathbf{{w}})-\hat{\varvec{ {\upmu }}}_{{\hat{G}}_n}(\mathbf{{w}})\Vert ^2\\&\quad =\Vert \hat{\mathbf{{P}}}(\mathbf{{w}})\mathbf{{Z}}_{{\hat{G}}_n}-{\mathbf{{P}}}(\mathbf{{w}})\mathbf{{Z}}_G\Vert ^2\\&\quad =\Vert \{\hat{\mathbf{{P}}}(\mathbf{{w}})-{\mathbf{{P}}}(\mathbf{{w}})\}(\mathbf{{Z}}_{{\hat{G}}_n}-\mathbf{{Z}}_G)+\{\hat{\mathbf{{P}}}(\mathbf{{w}})-{\mathbf{{P}}}(\mathbf{{w}})\}\mathbf{{Z}}_G+{\mathbf{{P}}}(\mathbf{{w}})(\mathbf{{Z}}_{{\hat{G}}_n}-\mathbf{{Z}}_G)\Vert ^2\\&\quad \le 3[\Vert \{\hat{\mathbf{{P}}}(\mathbf{{w}})-{\mathbf{{P}}}(\mathbf{{w}})\}(\mathbf{{Z}}_{{\hat{G}}_n}-\mathbf{{Z}}_G)\Vert ^2+\Vert \{\hat{\mathbf{{P}}}(\mathbf{{w}})-{\mathbf{{P}}}(\mathbf{{w}})\}\mathbf{{Z}}_G\Vert ^2+\Vert {\mathbf{{P}}}(\mathbf{{w}})(\mathbf{{Z}}_{{\hat{G}}_n}-\mathbf{{Z}}_G)\Vert ^2]. \end{aligned}$$

By Conditions (C.1) and (C.2), it is clear that \(\Vert \mathbf{{Z}}_G\Vert ^2=O_p(n)\). Then, by Lemmas 2 and 4, under Condition (C.4), we have

$$\begin{aligned} J_2&\le 3\sup _{\mathbf{{w}} \in W_n}\frac{\Vert \{\hat{\mathbf{{P}}}(\mathbf{{w}})-{\mathbf{{P}}}(\mathbf{{w}})\}(\mathbf{{Z}}_{{\hat{G}}_n}-\mathbf{{Z}}_G)\Vert ^2}{R_G(\mathbf{{w}})}+3\sup _{\mathbf{{w}} \in W_n}\frac{\Vert \{\hat{\mathbf{{P}}}(\mathbf{{w}})-{\mathbf{{P}}}(\mathbf{{w}})\}\mathbf{{Z}}_G\Vert ^2}{R_G(\mathbf{{w}})} \\&\quad +3\sup _{\mathbf{{w}} \in W_n}\frac{\Vert {\mathbf{{P}}}(\mathbf{{w}})(\mathbf{{Z}}_{{\hat{G}}_n}-\mathbf{{Z}}_G)\Vert ^2}{R_G(\mathbf{{w}})}\\&\le \sup _{\mathbf{{w}} \in W_n}3\Vert \hat{\mathbf{{P}}}(\mathbf{{w}})-{\mathbf{{P}}}(\mathbf{{w}})\Vert ^2\Vert \mathbf{{Z}}_{{\hat{G}}_n}-\mathbf{{Z}}_G\Vert ^2\xi ^{-1}_G+3\sup _{\mathbf{{w}} \in W_n}\Vert \hat{\mathbf{{P}}}(\mathbf{{w}})-{\mathbf{{P}}}(\mathbf{{w}})\Vert ^2\Vert \mathbf{{Z}}_G\Vert ^2\xi ^{-1}_G \\&\quad +3\lambda _{\max }\{{\mathbf{{P}}}(\mathbf{{w}})\}\Vert \mathbf{{Z}}_{{\hat{G}}_n}-\mathbf{{Z}}_G\Vert ^2\xi ^{-1}_G\\ & \le C\xi ^{-1}_G\{O_p(n^{-1}k^2_M)O_p(1)+O_p(n^{-1}k^2_M)O_p(n)+O_p(1)\}\\&= o_p(1). \end{aligned}$$

As for \(J_3\), by the results for \(J_1\) and \(J_2\) and the Cauchy–Schwarz inequality, we have

$$\begin{aligned} J_3\le \sqrt{\sup _{\mathbf{{w}} \in W_n}\frac{L_{{\hat{G}}_n}(\mathbf{{w}})}{R_G(\mathbf{{w}})}}\sqrt{\sup _{\mathbf{{w}} \in W_n}\frac{\Vert \hat{\varvec{ {\upmu }}}_G(\mathbf{{w}})-\hat{\varvec{ {\upmu }}}_{{\hat{G}}_n}(\mathbf{{w}})\Vert ^2}{R_G(\mathbf{{w}})}}=o_p(1). \end{aligned}$$

The above indicates that (30) is true.

Next, we consider (31). By Lemma 2 and (30), applying the Cauchy–Schwarz inequality once again, we have

$$\begin{aligned}&\sup _{\mathbf{{w}} \in W_n}\frac{|(\mathbf{{Z}}_{{\hat{G}}_n}-\mathbf{{Z}}_G)^\top \{\varvec{ {\upmu }}-\hat{\varvec{ {\upmu }}}_{{\hat{G}}_n}(\mathbf{{w}})\}|}{R_G(\mathbf{{w}})}\\&\quad \le \sqrt{\sup _{\mathbf{{w}} \in W_n}\frac{\Vert \mathbf{{Z}}_{{\hat{G}}_n}-\mathbf{{Z}}_G\Vert ^2}{R_G(\mathbf{{w}})}}\sqrt{\sup _{\mathbf{{w}} \in W_n}\frac{\Vert \varvec{ {\upmu }}-\hat{\varvec{ {\upmu }}}_{{\hat{G}}_n}(\mathbf{{w}})\Vert ^2}{R_G(\mathbf{{w}})}}\\&\quad =o_p(1). \end{aligned}$$

Finally, we consider (32). According to the definition of \(\hat{\varvec{ {\upepsilon }}}_{{\hat{G}}_n}\) in (10), by Lemmas 2 and 4, it can be obtained that

$$\begin{aligned}&\Vert \hat{\varvec{ {\upepsilon }}}_G-\hat{\varvec{ {\upepsilon }}}_{{\hat{G}}_n}\Vert \nonumber \\&\quad =\frac{1}{\sqrt{n-k_M}}\Vert (\mathbf{{Z}}_G-\mathbf{{P}}_{(M)}\mathbf{{Z}}_G)-(\mathbf{{Z}}_{{\hat{G}}_n}-\hat{\mathbf{{P}}}_{(M)}\mathbf{{Z}}_{{\hat{G}}_n})\Vert \nonumber \\&\quad =\frac{1}{\sqrt{n-k_M}}\Vert \mathbf{{Z}}_G-\mathbf{{Z}}_{{\hat{G}}_n}+(\hat{\mathbf{{P}}}_{(M)}-\mathbf{{P}}_{(M)})(\mathbf{{Z}}_{{\hat{G}}_n}-\mathbf{{Z}}_G)\nonumber \\&\qquad +(\hat{\mathbf{{P}}}_{(M)}-\mathbf{{P}}_{(M)})\mathbf{{Z}}_G+\mathbf{{P}}_{(M)}(\mathbf{{Z}}_{{\hat{G}}_n}-\mathbf{{Z}}_G)\Vert \nonumber \\&\quad \le \frac{1}{\sqrt{n-k_M}}\{\Vert \mathbf{{Z}}_G-\mathbf{{Z}}_{{\hat{G}}_n}\Vert +\Vert \hat{\mathbf{{P}}}_{(M)}-\mathbf{{P}}_{(M)}\Vert \Vert \mathbf{{Z}}_G-\mathbf{{Z}}_{{\hat{G}}_n}\Vert \nonumber \\&\qquad +\Vert \hat{\mathbf{{P}}}_{(M)}-\mathbf{{P}}_{(M)}\Vert \Vert \mathbf{{Z}}_G\Vert +\lambda _{\max }(\mathbf{{P}}_{(M)})\Vert \mathbf{{Z}}_G-\mathbf{{Z}}_{{\hat{G}}_n}\Vert \}\nonumber \\&\quad =\frac{1}{\sqrt{n-k_M}}\{O_p(1)+O_p(n^{-1/2}k_M)O_p(1)+O_p(n^{-1/2}k_M)O_p(n^{1/2})+O_p(1)O_p(1)\}\nonumber \\&\quad =O_p(k_M/\sqrt{n-k_M}). \end{aligned}$$
(33)

Through some straightforward calculations, we can obtain that

$$\begin{aligned}&|\varvec{ {\upepsilon }}^\top _{G}\{\varvec{ {\upmu }}-\hat{\varvec{ {\upmu }}}_{{\hat{G}}_n}(\mathbf{{w}})\}+\hat{\varvec{ {\upepsilon }}}^\top _{{\hat{G}}_n}\hat{\mathbf{{Q}}}(\mathbf{{w}})\hat{\varvec{ {\upepsilon }}}_{{\hat{G}}_n}|\\&\quad \le |\varvec{ {\upepsilon }}^\top _{G}\{\mathbf{{I}}_n-\mathbf{{P}}(\mathbf{{w}})\}\varvec{ {\upmu }}|+|\varvec{ {\upepsilon }}^\top _{G}\{\hat{\varvec{ {\upmu }}}_G(\mathbf{{w}})-\hat{\varvec{ {\upmu }}}_{{\hat{G}}_n}(\mathbf{{w}})\}|+|\varvec{ {\upepsilon }}^\top _{G}\mathbf{{P}}(\mathbf{{w}})\varvec{ {\upepsilon }}_{G}-tr\{\varvec{ {\Omega }}_G\mathbf{{P}}(\mathbf{{w}})\}|\\&\qquad +|(\hat{\varvec{ {\upepsilon }}}_{{\hat{G}}_n}-\hat{\varvec{ {\upepsilon }}}_{G})^\top \{\hat{\mathbf{{Q}}}(\mathbf{{w}})-{\mathbf{{Q}}}(\mathbf{{w}})\}(\hat{\varvec{ {\upepsilon }}}_{{\hat{G}}_n}-\hat{\varvec{ {\upepsilon }}}_{G})|+|(\hat{\varvec{ {\upepsilon }}}_{{\hat{G}}_n}-\hat{\varvec{ {\upepsilon }}}_{G})^\top {\mathbf{{Q}}}(\mathbf{{w}})(\hat{\varvec{ {\upepsilon }}}_{{\hat{G}}_n}-\hat{\varvec{ {\upepsilon }}}_{G})|\\&\qquad +2|(\hat{\varvec{ {\upepsilon }}}_{{\hat{G}}_n}-\hat{\varvec{ {\upepsilon }}}_{G})^\top \{\hat{\mathbf{{Q}}}(\mathbf{{w}})-{\mathbf{{Q}}}(\mathbf{{w}})\}\hat{\varvec{ {\upepsilon }}}_{G}|+2|(\hat{\varvec{ {\upepsilon }}}_{{\hat{G}}_n}-\hat{\varvec{ {\upepsilon }}}_{G})^\top {\mathbf{{Q}}}(\mathbf{{w}})\hat{\varvec{ {\upepsilon }}}_{G}|\\&\qquad +|\hat{\varvec{ {\upepsilon }}}^\top _{G}\{\hat{\mathbf{{Q}}}(\mathbf{{w}})-{\mathbf{{Q}}}(\mathbf{{w}})\}\hat{\varvec{ {\upepsilon }}}_{G}|+|\hat{\varvec{ {\upepsilon }}}^\top _{G}{\mathbf{{Q}}}(\mathbf{{w}})\hat{\varvec{ {\upepsilon }}}_{G}-tr\{\varvec{ {\Omega }}_G\mathbf{{P}}(\mathbf{{w}})\}|\\&\quad \equiv \sum _{i=1}^{9}\Lambda _i, \end{aligned}$$

where \(\varvec{ {\Omega }}_G=diag(\sigma ^2_{G,1},\ldots ,\sigma ^2_{G,n})\), \(\mathbf{{Q}}(\mathbf{{w}})\) and \(\hat{\mathbf{{Q}}}(\mathbf{{w}})\) are given in Appendix 1. Thus, to prove (32), it is equivalent to prove

$$\begin{aligned} \sup _{\mathbf{{w}} \in W_n}\frac{\Lambda _i}{R_G(\mathbf{{w}})}=o_p(1), \quad i=1,\ldots ,9. \end{aligned}$$
(34)

By (25) and Markov inequality, it is easy to show that

$$\begin{aligned} \Vert \varvec{ {\upepsilon }}_{G}\Vert =O_p(n^{1/2}). \end{aligned}$$
(35)

By (35), under Conditions (C.2) and (C.5), using arguments similar to the proof of Theorem 2.2 in Liu and Okui (2013), we know that \(\sup _{\mathbf{{w}} \in W_n}\Lambda _i/R_G(\mathbf{{w}})=o_p(1)\) holds for \(i=1,3,9\). By (22), (35), Lemmas 2–4 and the bound for \(J_2\), under Condition (C.4), we have

$$\begin{aligned}&\sup _{\mathbf{{w}} \in W_n}\frac{\Lambda _2}{R_G(\mathbf{{w}})} \\&\quad \le \sup _{\mathbf{{w}} \in W_n} \frac{\Vert \varvec{ {\upepsilon }}_{G}\Vert \Vert \{\hat{\varvec{ {\upmu }}}_G(\mathbf{{w}})-\hat{\varvec{ {\upmu }}}_{{\hat{G}}_n}(\mathbf{{w}})\}\Vert }{R_G(\mathbf{{w}})}\\&\quad \le C\sup _{\mathbf{{w}} \in W_n}\frac{\Vert \varvec{ {\upepsilon }}_{G}\Vert \Vert \{\hat{\mathbf{{P}}}(\mathbf{{w}})-{\mathbf{{P}}}(\mathbf{{w}})\}(\mathbf{{Z}}_{{\hat{G}}_n}-\mathbf{{Z}}_G)\Vert }{R_G(\mathbf{{w}})}+C\sup _{\mathbf{{w}}\in W_n}\frac{\Vert \varvec{ {\upepsilon }}_{G}\Vert \Vert \{\hat{\mathbf{{P}}}(\mathbf{{w}})-{\mathbf{{P}}}(\mathbf{{w}})\}\mathbf{{Z}}_G\Vert }{R_G(\mathbf{{w}})} \\&\qquad +C\sup _{\mathbf{{w}}\in W_n}\frac{\Vert \varvec{ {\upepsilon }}_{G}\Vert \Vert {\mathbf{{P}}}(\mathbf{{w}})(\mathbf{{Z}}_{{\hat{G}}_n}-\mathbf{{Z}}_G)\Vert }{R_G(\mathbf{{w}})} \\&\quad \le C\xi ^{-1}_G\{O_p(n^{1/2})O_p(n^{-1/2}k_M)+O_p(n^{1/2})O_p(n^{-1/2}k_M)O_p(n^{1/2})+O_p(n^{1/2})\}\\&\quad =o_p(1). \end{aligned}$$

Denote by \(\tilde{p}=\sup _{\mathbf{{w}} \in W_n}\max _{1\le i\le n}p_{ii}(\mathbf{{w}})\). By Condition (C.5), it is easy to verify that \(\tilde{p}=O_p(n^{-1/2})\). Similar to Lemma 4, we can verify that

$$\begin{aligned} \quad \sup _{\mathbf{{w}} \in W_n}\Vert \hat{\mathbf{{Q}}}(\mathbf{{w}})-{\mathbf{{Q}}}(\mathbf{{w}})\Vert =O_p(n^{-1/2}k_M). \end{aligned}$$
(36)

Then, for \(\Lambda _4\), combining (33) and (36), under Condition (C.4), we have

$$\begin{aligned} \sup _{\mathbf{{w}} \in W_n}\frac{\Lambda _4}{R_G(\mathbf{{w}})}&\le \sup _{\mathbf{{w}} \in W_n}\frac{\Vert \hat{\varvec{ {\upepsilon }}}_{{\hat{G}}_n}-\hat{\varvec{ {\upepsilon }}}_G\Vert ^2\sup _{\mathbf{{w}} \in W_n}\Vert \hat{\mathbf{{Q}}}(\mathbf{{w}})-{\mathbf{{Q}}}(\mathbf{{w}})\Vert }{R_G(\mathbf{{w}})}\\&\le \xi ^{-1}_GO_p\{k^2_M/(n-k_M)\}O_p(n^{-1/2}k_M)\\&=o_p(1). \end{aligned}$$

For \(\Lambda _5\), by (33) and (36), under Conditions (C.4) and (C.5), we have

$$\begin{aligned} \sup _{\mathbf{{w}} \in W_n}\frac{\Lambda _5}{R_G(\mathbf{{w}})}&\le \sup _{\mathbf{{w}} \in W_n}\frac{\Vert \hat{\varvec{ {\upepsilon }}}_{{\hat{G}}_n}-\hat{\varvec{ {\upepsilon }}}_G\Vert ^2\tilde{p}}{R_G(\mathbf{{w}})}\\&\le \xi ^{-1}_GO_p\{k^2_M/(n-k_M)\}O_p(n^{-1/2})\\&=o_p(1). \end{aligned}$$

As for \(\Lambda _6\), similarly, we have

$$\begin{aligned} \sup _{\mathbf{{w}} \in W_n}\frac{\Lambda _6}{R_G(\mathbf{{w}})}&=\sup _{\mathbf{{w}} \in W_n}\frac{2|(\hat{\varvec{ {\upepsilon }}}_{{\hat{G}}_n}-\hat{\varvec{ {\upepsilon }}}_{G})^\top \{\hat{\mathbf{{Q}}}(\mathbf{{w}})-{\mathbf{{Q}}}(\mathbf{{w}})\}(\mathbf{{I}}_n-\mathbf{{P}}_{(M)})\mathbf{{Z}}_G|}{R_G(\mathbf{{w}})}\\&\le C\sup _{\mathbf{{w}} \in W_n}\frac{\Vert \hat{\varvec{ {\upepsilon }}}_{{\hat{G}}_n}-\hat{\varvec{ {\upepsilon }}}_G\Vert \sup _{\mathbf{{w}} \in W_n}\Vert \hat{\mathbf{{Q}}}(\mathbf{{w}})-{\mathbf{{Q}}}(\mathbf{{w}})\Vert \{1+\lambda _{\max }(\mathbf{{P}}_{(M)})\}\Vert \mathbf{{Z}}_G\Vert }{R_G(\mathbf{{w}})}\\&\le C\xi ^{-1}_GO_p(k_M/\sqrt{n-k_M})O_p(n^{-1/2}k_M)O_p(n^{1/2})\{1+O_p(1)\}\\&\le CO_p\bigg (\frac{\sqrt{n}k_M}{\xi _G}\frac{k_M}{\sqrt{n}}\frac{1}{\sqrt{n-k_M}}\bigg )\\&= o_p(1). \end{aligned}$$

Similar to the proofs for \(\Lambda _4\), \(\Lambda _5\) and \(\Lambda _6\), we can show that \(\sup _{\mathbf{{w}} \in W_n}\Lambda _i/R_G(\mathbf{{w}})=o_p(1)\) for \(i=7,8\). We have thus proved that (32) is true.

Thus, we complete the proof of Theorem 2. \(\square\)

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Liang, Z., Zhang, C. & Xu, L. Model averaging for right censored data with measurement error. Lifetime Data Anal 30, 501–527 (2024). https://doi.org/10.1007/s10985-024-09620-3
