Model averaging for right censored data with measurement error

Liang, Zhongqi; Zhang, Caiya; Xu, Linjun

doi:10.1007/s10985-024-09620-3

Model averaging for right censored data with measurement error

Published: 13 March 2024

Volume 30, pages 501–527, (2024)
Cite this article

Lifetime Data Analysis Aims and scope Submit manuscript

145 Accesses
Explore all metrics

Abstract

This paper studies a novel model averaging estimation issue for linear regression models when the responses are right censored and the covariates are measured with error. A novel weighted Mallows-type criterion is proposed for the considered issue by introducing multiple candidate models. The weight vector for model averaging is selected by minimizing the proposed criterion. Under some regularity conditions, the asymptotic optimality of the selected weight vector is established in terms of its ability to achieve the lowest squared loss asymptotically. Simulation results show that the proposed method is superior to the other existing related methods. A real data example is provided to supplement the actual performance.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Mallows Model Averaging Estimation for Linear Regression Model with Right Censored Data

Article 01 January 2022

Model averaging for linear models with responses missing at random

Article 01 July 2020

Jackknife model averaging for linear regression models with missing responses

Article 19 February 2024

References

Ando T, Li KC (2014) A model-averaging approach for high-dimensional regression. J Am Stat Assoc 109(505):254–265
Article MathSciNet CAS Google Scholar
Ando T, Li KC (2017) A weight-relaxed model averaging approach for high-dimensional generalized linear models. Ann Stat 45(6):2654–2679
Article MathSciNet Google Scholar
Carroll RJ, Ruppert D, Stefanski LA, Crainiceanu CM (2006) Measurement error in nonlinear models: a modern perspective. Chapman and Hall-CRC, Boca Raton
Book Google Scholar
Chen L, Yi GY (2020) Model selection and model averaging for analysis of truncated and censored data with measurement error. Electron J Stat 14(2):4054–4109
Article MathSciNet Google Scholar
Claeskens G, Hjort NL (2003) The focused information criterion. J Am Stat Assoc 98(464):900–916
Article MathSciNet Google Scholar
Claeskens G, Hjort NL (2008) Model selection and model averaging. Cambridge University Press, Cambridge
Google Scholar
Dong Q, Liu B, Zhao H (2023) Weighted least squares model averaging for accelerated failure time models. Comput Stat Data Anal 184:107743
Article MathSciNet Google Scholar
Du J, Zhang Z, Xie T (2017) Focused information criterion and model averaging in censored quantile regression. Metrika 80(5):547–570
Article MathSciNet Google Scholar
Han P, Kong L, Zhao J, Zhou X (2019) A general framework for quantile estimation with incomplete data. J R Stat Soc Ser B Stat Methodol 81(2):305–333
Article MathSciNet Google Scholar
Hansen BE (2007) Least squares model averaging. Econometrica 75(4):1175–1189
Article MathSciNet Google Scholar
Hansen BE, Racine JS (2012) Jackknife model averaging. J Econom 167(1):38–46
Article MathSciNet Google Scholar
Hoeting J, Madigan D, Raftery A, Volinsky C (1999) Bayesian model averaging: a tutorial. Stat Sci 14(4):382–417
MathSciNet Google Scholar
Kaplan EL, Meyer P (1957) Nonparametric estimation from incomplete observations. J Am Stat Assoc 53(282):457–481
Article MathSciNet Google Scholar
Li KC (1987) Asymptotic optimality for $C_p, C_L$, cross-validation and generalized cross-validation: discrete index set. Ann Stat 15(3):958–975
Article MathSciNet Google Scholar
Li G, Wang Q (2003) Empirical likelihood regression analysis for right censored data. Stat Sin 13(1):51–68
MathSciNet CAS Google Scholar
Li M, Wang X (2023) Semiparametric model averaging method for survival probability predictions of patients. Comput Stat Data Anal 185:107759
Article MathSciNet Google Scholar
Li J, Yu T, Lv J, Lee M-LT (2021) Semiparametric model averaging prediction for lifetime data via hazards regression. J R Stat Soc Ser C Appl Stat 70(5):1187–1209
Article MathSciNet Google Scholar
Liang H, Li R (2009) Variable selection for partially linear models with measurement errors. J Am Stat Assoc 104(485):234–248
Article MathSciNet CAS PubMed PubMed Central Google Scholar
Liang H, Wang S, Carroll RJ (2007) Partially linear models with missing response variables and error-prone covariates. Biometrika 94(1):185–198
Article MathSciNet PubMed Google Scholar
Liang Z, Chen X, Zhou Y (2022) Mallows model averaging estimation for linear regression model with right censored data. Acta Math Appl Sin Engl Ser 38(1):5–23
Article MathSciNet Google Scholar
Liao J, Zou G (2020) Corrected mallows criterion for model averaging. Comput Stat Data Anal 144:106902
Article MathSciNet Google Scholar
Liao J, Zong X, Zhang X, Zou G (2019) Model averaging based on leave-subject-out cross-validation for vector autoregressions. J Econom 209(1):35–60
Article MathSciNet Google Scholar
Liu Q, Okui R (2013) Heteroscedasticity-robust $C_p$ model averaging. Econom J 16(3):463–472
Article MathSciNet Google Scholar
Longford NT (2005) Model selection and efficiency-is ‘Which model...?’ the right question? J R Stat Soc Ser A Stat Soc 168(3):469–472
Article MathSciNet Google Scholar
Raftery AE, Zheng Y (2003) Discussion: performance of Bayesian model averaging. J Am Stat Assoc 98(464):931–938
Article Google Scholar
Su M, Wang R, Wang Q (2022) A two-stage optimal subsampling estimation for missing data problems with large-scale data. Comput Stat Data Anal 173:107505
Article MathSciNet Google Scholar
Sun Z, Sun L, Lu X, Zhu J, Li Y (2017) Frequentist model averaging estimation for the censored partial linear quantile regression model. J Stat Plan Inference 189:1–15
Article MathSciNet CAS Google Scholar
Tang ML, Tang NS, Zhao PY, Zhu H (2018) Efficient robust estimation for linear models with missing response at random. Scand J Stat 45(2):366–381
Article MathSciNet Google Scholar
Wan ATK, Zhang X, Zou G (2010) Least squares model averaging by mallows criterion. J Econom 156(2):277–283
Article MathSciNet Google Scholar
Wang H, Zou G, Wan ATK (2012) Model averaging for varying-coefficient partially linear measurement error models. Electron J Stat 6:1017–1039
Article MathSciNet Google Scholar
Wen C (2012) Cox regression for mixed case interval-censored data with covariate errors. Lifetime Data Anal 18(3):321–338
Article MathSciNet PubMed Google Scholar
Yan X, Wang H, Wang W, Xie J, Ren Y, Wang X (2021) Optimal model averaging forecasting in high-dimensional survival analysis. Int J Forecast 37(3):1147–1155
Article Google Scholar
Zhang X, Liu C-A (2023) Model averaging prediction by K-fold cross-validation. J Econom 235(1):280–301
Article MathSciNet Google Scholar
Zhang T, Wang L (2020) Smoothed empirical likelihood inference and variable selection for quantile regression with nonignorable missing response. Comput Stat Data Anal 144:106888
Article MathSciNet Google Scholar
Zhang X, Zou G, Carroll RJ (2015) Model averaging based on Kullback–Leibler distance. Stat Sin 25:1583–1598
MathSciNet PubMed PubMed Central Google Scholar
Zhang X, Yu D, Zou G, Liang H (2016) Optimal model averaging estimation for generalized linear models and generalized linear mixed-effects models. J Am Stat Assoc 111(516):1775–1790
Article MathSciNet CAS Google Scholar
Zhang X, Wang H, Ma Y, Carroll RJ (2017) Linear model selection when covariates contain errors. J Am Stat Assoc 112(520):1553–1561
Article MathSciNet CAS PubMed PubMed Central Google Scholar
Zhang X, Ma Y, Carroll RJ (2019) MALMEM: model averaging in linear measurement error models. J R Stat Soc Ser B Stat Methodol 81(4):763–779
Article MathSciNet Google Scholar
Zhou M (1992) Asymptotic normality of the ‘synthetic data’ regression estimator for censored survival data. Ann Stat 20:1002–1021
Article MathSciNet Google Scholar
Zhu R, Wan ATK, Zhang X, Zou G (2019) A mallows-type model averaging estimator for the varying-coefficient partially linear model. J Am Stat Assoc 114(526):882–892
Article MathSciNet CAS Google Scholar

Download references

Acknowledgements

The authors are grateful to the editor, associate editor and three referees for their comments and suggestions that helped improve the manuscript greatly. This paper is supported by Institute of Digital Finance, Hangzhou City University.

Author information

Authors and Affiliations

Institute of Digital Finance, Hangzhou City University, Hangzhou, 310015, China
Zhongqi Liang & Caiya Zhang
School of Mathematical Sciences, Zhejiang University, Hangzhou, 310058, China
Zhongqi Liang
School of Statistics and Mathematics, Zhejiang Gongshang University, Hangzhou, 310018, China
Linjun Xu

Authors

Zhongqi Liang
View author publications
You can also search for this author in PubMed Google Scholar
Caiya Zhang
View author publications
You can also search for this author in PubMed Google Scholar
Linjun Xu
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Caiya Zhang.

Ethics declarations

Conflict of interest

The authors report there are no competing interests to declare.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix 1. Regularity conditions

In Appendix 1, some addition notations and regularity conditions would be listed. Let $J(t)=1-\{1-F(t)\}\{1-G(t)\}$ with $F(\cdot )$ is the cumulative distribution function of $\mathbf{{Y}}$. Denote by $\mathbf{{Q}}(\mathbf{{w}})$ and $\hat{\mathbf{{Q}}}(\mathbf{{w}})$ are two n order diagonal matrices, and their diagonal elements are $p_{ii}(\mathbf{{w}})$ and ${\hat{p}}_{ii}(\mathbf{{w}})$, respectively, for $i=1,\ldots ,n$. Let $\lambda _{\max }(\cdot )$ and $\lambda _{\min }(\cdot )$ be the largest and the smallest singular values of the given matrix, respectively. Denote by $\xi _G=\inf _{\mathbf{{w}}\in W_n}R_G(\mathbf{{w}})$.

(C.1)
$1-G(\tau _J-)>0$ with $\tau _J=\inf \{t:J(t)=1\}$.
(C.2)
$\max _{1\le i\le n}E(\epsilon ^{4}_i|\mathbf{{X}}_i)<\infty$, a.s. and $\Vert \varvec{ {\upmu }}\Vert ^2=O(n)$.
(C.3)
$c_1\le n^{-1}\lambda _{\min }(\mathbf{{X}}^\top _{(m)}\mathbf{{X}}_{(m)})\le n^{-1}\lambda _{\max }(\mathbf{{X}}^\top _{(m)}\mathbf{{X}}_{(m)})<c_2$ uniformly in m, a.s., and $\lambda _{\max }(\varvec{ {\Sigma }}_T)<c_2$, where $c_1$ and $c_2$ are two positive constants.
(C.4)
$n^{1/2}k_M/\xi _G=o(1)$ and $k^2_M/n=o(1)$, a.s., where $k_M<p_n$ is the number of regressors in the largest candidate model.
(C.5)
$\max _{1\le m\le M}\max _{1\le i\le n}p_{(m),ii}=O(n^{-1/2})$, a.s., where $p_{(m),ii}$ is the ith diagonal element of $\mathbf{{P}}_{(m)}$.
(C.6)
$\Vert \hat{\varvec{ {\Sigma }}}_{T,(m)}-{\varvec{ {\Sigma }}}_{T,(m)}\Vert =O_p(n^{-1/2}k_M)$ uniformly in m.

Remark 1

Condition (C.1) is a commonly used condition in right censored data analysis. Similar conditions can be seen Sun et al. (2017), Li and Wang (2003) and Liang et al. (2022). Condition (C.2) is just a standard condition for the linear regression models with measurement error, which is similar to Condition 1 of Zhang et al. (2019). Condition (C.3) is a standard condition for the covariates of the model and the covariance of the measurement error, which is the same as Condition (C.3) in Zhang et al. (2017). Similar condition can also be seen in Condition (C.5) of Liao and Zou (2020). Condition (C.4) places a constraint on the dimension of the regressors in the largest candidate model. It allows $k_M$ to increase with n, but it is not arbitrary and is limited by Condition (C.4). Condition (C.4) also requires that $\xi _G$ increases faster than $n^{1/2}k_M$, which implies that there is no finite candidate model whose bias is 0. This condition is weaker than Condition (C.1) of Zhang et al. (2017), because Zhang et al. (2017) requires that the minimum of the risk tends to infinity faster than $n^{1/2}p^2_n$. Other similar conditions can be seen Condition (C.6) of Zhang et al. (2016) and Condition (C.9) in Liao and Zou (2020). Condition (C.6) requires that the estimator $\hat{\varvec{ {\Sigma }}}_{T,(m)}$ is consistent, similar condition can be seen Condition (C.4) in Zhang et al. (2017).

Appendix 2. Proofs of main results

Appendix 2 gives the detailed proofs of the theorems appearing in this paper. We first give a short proof of Lemma 1. Next, we give some additional lemmas to assist the proofs of the theorems. In the following, C represents a general positive constant that can be freely varied under different contexts. The norm of a matrix is the Euclidean norm, which is the largest singular value of the matrix.

Proof of Lemma 1

By the definition of $C_{G}(\mathbf{{w}})$ in (6), we have

$$\begin{aligned} E\{C_{G}(\mathbf{{w}})\mid \tilde{\mathbf{{X}}}\} & = E[\Vert \mathbf{{Z}}_G-\hat{\varvec{ {\upmu }}}_G(\mathbf{{w}})\Vert ^2+2tr\{\mathbf{{P}}(\mathbf{{w}})\varvec{ {\Omega }}_G\}\mid \tilde{\mathbf{{X}}}]\\ & = E\{\Vert \varvec{ {\upmu }}-\hat{\varvec{ {\upmu }}}_G(\mathbf{{w}})\Vert ^2\mid \tilde{\mathbf{{X}}}\}+E(\varvec{ {\upepsilon }}^\top _{G}\varvec{ {\upepsilon }}_{G}\mid \tilde{\mathbf{{X}}})+2E[\varvec{ {\upepsilon }}^\top _{G}\{\mathbf{{I}}_n-\mathbf{{P}}(\mathbf{{w}})\}\varvec{ {\upmu }}\mid \tilde{\mathbf{{X}}}]\\&-2E[\{\varvec{ {\upepsilon }}^\top _{G}\mathbf{{P}}(\mathbf{{w}})\varvec{ {\upepsilon }}_{G}\}-tr\{\mathbf{{P}}(\mathbf{{w}})\varvec{ {\Omega }}_G\}\mid \tilde{\mathbf{{X}}}]\\ & = R_G(\mathbf{{w}})+tr(\varvec{ {\Omega }}_G), \end{aligned}$$

where $\mathbf{{I}}_n$ is an n order identity matrix. $\square$

Lemma 2

Provided that Conditions (C.1)–(C.2) in Appendix 1 hold, as $n\rightarrow \infty$, we have

$$\begin{aligned} \Vert \mathbf{{Z}}_{{\hat{G}}_n}-\mathbf{{Z}}_G\Vert ^2=O_p(1). \end{aligned}$$

(13)

Proof of Lemma 2

According to the definitions of $\mathbf{{Z}}_{{\hat{G}}_n}$ and $\mathbf{{Z}}_G$, by Condition (C.2), we have

$$\begin{aligned} \Vert \mathbf{{Z}}_{{\hat{G}}_n}-\mathbf{{Z}}_G\Vert ^2 & = \sum _{i=1}^n\bigg \{\frac{1}{1-{\hat{G}}_n(Z_i-)}-\frac{1}{1-G(Z_i-)}\bigg \}^2\delta ^2_iZ^2_i \nonumber \\ & \le n\bigg \{\max _{1\le i\le n}\bigg |\frac{1}{1-{\hat{G}}_n(Z_i-)}-\frac{1}{1-G(Z_i-)}\bigg |\bigg \}^2O_p(1). \end{aligned}$$

(14)

Through straightforward but trivial calculations, we can obtain that

$$\begin{aligned}&\max _{1\le i\le n}\bigg |\frac{1}{1-{\hat{G}}_n(Z_i-)}-\frac{1}{1-G(Z_i-)}\bigg |\\ & \le \qquad \max _{1\le i\le n}\bigg [\bigg \{\frac{1}{1-G(Z_i-)}\bigg |\frac{{\hat{G}}_n(Z_i-)-G(Z_i-)}{1-G(Z_i-)}\bigg |\bigg \}\bigg \{1+\bigg |\frac{{\hat{G}}_n(Z_i-)-G(Z_i-)}{1-G(Z_i-)}\bigg |\\&\quad +\bigg |\frac{{\hat{G}}_n(Z_i-)-G(Z_i-)}{1-G(Z_i-)}\bigg |\bigg |\frac{{\hat{G}}_n(Z_i-)-G(Z_i-)}{1-{\hat{G}}_n(Z_i-)}\bigg |\bigg \}\bigg ]. \end{aligned}$$

Then, combine with Condition (C.1) and the results in Zhou (1992)

$$\begin{aligned} \sup _z\bigg |\frac{{\hat{G}}_n(z)-G(z)}{1-G(z)}\bigg |=O_p(n^{-1/2}),\quad \sup _z\bigg |\frac{{\hat{G}}_n(z)-G(z)}{1-{\hat{G}}_n(z)}\bigg |=O_p(1), \end{aligned}$$

we have

$$\begin{aligned}&\max _{1\le i\le n}\bigg |\frac{1}{1-{\hat{G}}_n(Z_i-)}-\frac{1}{1-G(Z_i-)}\bigg |=O_p(n^{-1/2}). \end{aligned}$$

(15)

By (14) and (15), it can be obtained that $\Vert \mathbf{{Z}}_{{\hat{G}}_n}-\mathbf{{Z}}_G\Vert ^2=O_p(1)$. Thus, we complete the proof of Lemma 2. $\square$

Lemma 3

Provided that Conditions (C.1)–(C.5) in Appendix 1 hold, as $n\rightarrow \infty$, we have

$$\begin{aligned} \sup _{\mathbf{{w}} \in W_n}\bigg |\frac{L_G(\mathbf{{w}})}{R_G(\mathbf{{w}})}-1\bigg |=o_p(1). \end{aligned}$$

(16)

Proof of Lemma 3

By the definition of $R_G(\mathbf{{w}})$ in (7), it follows that

$$\begin{aligned} R_G(\mathbf{{w}})=E\{L_G(\mathbf{{w}})\mid \tilde{\mathbf{{X}}}\}=\Vert \{\mathbf{{I}}_n-\mathbf{{P}}(\mathbf{{w}})\}\varvec{ {\upmu }}\Vert ^2+tr\{\mathbf{{P}}^\top (\mathbf{{w}})\varvec{ {\Omega }}_G\mathbf{{P}}(\mathbf{{w}})\}, \end{aligned}$$

where $\varvec{ {\Omega }}_G=diag(\sigma ^2_{G,1},\ldots ,\sigma ^2_{G,n})$. Then, we have

$$\begin{aligned}&L_G(\mathbf{{w}})-R_G(\mathbf{{w}})\\&\quad =\Vert \mathbf{{P}}(\mathbf{{w}})\varvec{ {\upepsilon }}_{G}\Vert ^2-tr\{\mathbf{{P}}^\top (\mathbf{{w}})\varvec{ {\Omega }}_G\mathbf{{P}}(\mathbf{{w}})\}-2[\{\mathbf{{I}}_n-\mathbf{{P}}(\mathbf{{w}})\}\varvec{ {\upmu }}]^\top \{\mathbf{{P}}(\mathbf{{w}})\varvec{ {\upepsilon }}_{G}\}. \end{aligned}$$

Thus, to prove (16), it is equivalent to show

$$\begin{aligned}&\sup _{\mathbf{{w}} \in W_n}\frac{|\Vert \mathbf{{P}}(\mathbf{{w}})\varvec{ {\upepsilon }}_{G}\Vert ^2-tr\{\mathbf{{P}}^\top (\mathbf{{w}})\varvec{ {\Omega }}_G\mathbf{{P}}(\mathbf{{w}})\}|}{R_G(\mathbf{{w}})}=o_p(1), \end{aligned}$$

(17)

$$\begin{aligned}&\sup _{\mathbf{{w}} \in W_n}\frac{|[\{\mathbf{{I}}_n-\mathbf{{P}}(\mathbf{{w}})\}\varvec{ {\upmu }}]^\top \{\mathbf{{P}}(\mathbf{{w}}) \varvec{ {\upepsilon }}_{G}\}|}{R_G(\mathbf{{w}})}=o_p(1). \end{aligned}$$

(18)

Similar to Zhang et al. (2017), under Conditions (C.2) and (C.3), by Markov inequality, we have

$$\begin{aligned} \text {pr}\bigg (\frac{\Vert \mathbf{{T}}^\top _{(m)}\mathbf{{X}}_{(m),m}\Vert }{\sqrt{nk_M}}>C_n\bigg )\le \frac{E(\mathbf{{X}}^\top _{(m),m}\mathbf{{T}}_{(m)}\mathbf{{T}}^\top _{(m)}\mathbf{{X}}_{(m),m})}{nk_MC^2_n}\le \frac{c_2n\cdot tr({\varvec{ {\Sigma }}}_{T,(m)})}{nk_MC^2_n}\rightarrow 0 \end{aligned}$$

uniformly in m, as $C_n\rightarrow \infty$, where $\mathbf{{X}}_{(m),m}$ is the mth column of $\mathbf{{X}}_{(m)}$. This implies $\Vert \mathbf{{T}}^\top _{(m)}\mathbf{{X}}_{(m),m}\Vert =O_p(n^{1/2}k^{1/2}_M)$ uniformly in m. Similarly, we can prove that $\Vert \mathbf{{T}}^\top _{(m)}\mathbf{{X}}_{(m)}\Vert =O_p(n^{1/2}k_M)$ and $\Vert \mathbf{{T}}^\top _{(m)}\mathbf{{T}}_{(m)}-n{\varvec{ {\Sigma }}}_{T,(m)}\Vert =O_p(n^{1/2}k_M)$ uniformly in m. Then, under Conditions (C.2) and (C.3), with the similar proving steps of (21) in Zhang et al. (2017), we can obtain that uniformly in m,

$$\begin{aligned}&\lambda _{\max }(\tilde{\mathbf{{X}}}^\top _{(m)}\tilde{\mathbf{{X}}}_{(m)}-n\varvec{ {\Sigma }}_{T,(m)})\nonumber \\&\quad =\lambda _{\max }(\mathbf{{X}}^\top _{(m)}\mathbf{{X}}_{(m)}+\mathbf{{T}}^\top _{(m)}\mathbf{{X}}_{(m)}+\mathbf{{X}}^\top _{(m)}\mathbf{{T}}_{(m)}+\mathbf{{T}}^\top _{(m)}\mathbf{{T}}_{(m)}-n\varvec{ {\Sigma }}_{T,(m)})\nonumber \\&\quad \le \lambda _{\max }(\mathbf{{X}}^\top _{(m)}\mathbf{{X}}_{(m)})+2\Vert \mathbf{{X}}^\top _{(m)}\mathbf{{T}}_{(m)}\Vert +\Vert \mathbf{{T}}^\top _{(m)}\mathbf{{T}}_{(m)}-n{\varvec{ {\Sigma }}}_{T,(m)}\Vert \nonumber \\&\quad \le c_2n+O_p(n^{1/2}k_M), \end{aligned}$$

(19)

and

$$\begin{aligned}&\lambda _{\min }(\tilde{\mathbf{{X}}}^\top _{(m)}\tilde{\mathbf{{X}}}_{(m)}-n\varvec{ {\Sigma }}_{T,(m)})\nonumber \\&\quad =\lambda _{\min }(\mathbf{{X}}^\top _{(m)}\mathbf{{X}}_{(m)}+\mathbf{{T}}^\top _{(m)}\mathbf{{X}}_{(m)}+\mathbf{{X}}^\top _{(m)}\mathbf{{T}}_{(m)}+\mathbf{{T}}^\top _{(m)}\mathbf{{T}}_{(m)}-n\varvec{ {\Sigma }}_{T,(m)})\nonumber \\&\quad \ge \lambda _{\min }(\mathbf{{X}}^\top _{(m)}\mathbf{{X}}_{(m)})+\lambda _{\min }(\mathbf{{T}}^\top _{(m)}\mathbf{{X}}_{(m)}+\mathbf{{X}}^\top _{(m)}\mathbf{{T}}_{(m)}+\mathbf{{T}}^\top _{(m)}\mathbf{{T}}_{(m)}-n{\varvec{ {\Sigma }}}_{T,(m)})\nonumber \\&\quad \ge \lambda _{\min }(\mathbf{{X}}^\top _{(m)}\mathbf{{X}}_{(m)})-\Vert \mathbf{{T}}^\top _{(m)}\mathbf{{T}}_{(m)}-n{\varvec{ {\Sigma }}}_{T,(m)}\Vert -2\Vert \mathbf{{T}}^\top _{(m)}\mathbf{{X}}_{(m)}\Vert \nonumber \\&\quad \ge c_1n+O_p(n^{1/2}k_M). \end{aligned}$$

(20)

By (19) and (20), under Conditions (C.3) and (C.4), we can derive that

$$\begin{aligned} c_1+o_p(1)<\frac{\lambda _{\min }(\tilde{\mathbf{{X}}}^\top _{(m)}\tilde{\mathbf{{X}}}_{(m)}-n\varvec{ {\Sigma }}_{T,(m)})}{n}\le \frac{\lambda _{\max }(\tilde{\mathbf{{X}}}^\top _{(m)}\tilde{\mathbf{{X}}}_{(m)}-n\varvec{ {\Sigma }}_{T,(m)})}{n}<c_2+o_p(1), \end{aligned}$$

which indirectly indicates that

$$\begin{aligned} \lambda _{\max }\{(\tilde{\mathbf{{X}}}^\top _{(m)}\tilde{\mathbf{{X}}}_{(m)}-n\varvec{ {\Sigma }}_{T,(m)})^{-1}\}=O_p(n^{-1}). \end{aligned}$$

(21)

Similar to (19), we can obtain that $\lambda _{\max }(\tilde{\mathbf{{X}}}^\top _{(m)}\tilde{\mathbf{{X}}}_{(m)})\le 2nc_2+O_p(n^{1/2}k_M)$. This together with (21), under Condition (C.4), we have

$$\begin{aligned} \lambda _{\max }(\mathbf{{P}}_{(m)})&=\lambda _{\max }\{\tilde{\mathbf{{X}}}_{(m)}(\tilde{\mathbf{{X}}}^\top _{(m)}\tilde{\mathbf{{X}}}_{(m)}-n{\varvec{ {\Sigma }}}_{T,(m)})^{-1}\tilde{\mathbf{{X}}}^\top _{(m)}\}\nonumber \\&\le \lambda _{\max }(\tilde{\mathbf{{X}}}^\top _{(m)}\tilde{\mathbf{{X}}}_{(m)})\lambda _{\max }\{(\tilde{\mathbf{{X}}}^\top _{(m)}\tilde{\mathbf{{X}}}_{(m)}-n\varvec{ {\Sigma }}_{T,(m)})^{-1}\}\nonumber \\&\le \{2nc_2+O_p(n^{1/2}k_M)\}O_p(n^{-1})\nonumber \\&=O_p(1). \end{aligned}$$

(22)

By (22), it can be obtained that $\lambda _{\max }\{\mathbf{{P}}(\mathbf{{w}})\}=O_p(1)$. Then, we have

$$\begin{aligned} tr\{\mathbf{{P}}^\top (\mathbf{{w}})\mathbf{{P}}(\mathbf{{w}})\mathbf{{P}}^\top (\mathbf{{w}})\mathbf{{P}}(\mathbf{{w}})\}\le \lambda ^2_{\max }\{\mathbf{{P}}(\mathbf{{w}})\}tr\{\mathbf{{P}}^2(\mathbf{{w}})\}\le CR_G(\mathbf{{w}}), \end{aligned}$$

(23)

and

$$\begin{aligned} \Vert \mathbf{{P}}^\top (\mathbf{{w}})\{\mathbf{{I}}_n-\mathbf{{P}}(\mathbf{{w}})\}\varvec{ {\upmu }}\Vert ^2\le \lambda ^2_{\max }\{\mathbf{{P}}(\mathbf{{w}})\}\Vert \{\mathbf{{I}}_n-\mathbf{{P}}(\mathbf{{w}})\}\varvec{ {\upmu }}\Vert ^2\le CR_G(\mathbf{{w}}). \end{aligned}$$

(24)

In addition, by Conditions (C.1) and (C.2), using $C_r$ inequality, we can prove

$$\begin{aligned} \max _{1\le i\le n}E(\epsilon ^{4}_{G,i}|\mathbf{{X}}_i)<\infty . \end{aligned}$$

(25)

Together with (23)–(25), under Condition (C.4), using the same techniques as those for the proof of Theorem 1 in Wan et al. (2010), we can prove (17) and (18). Therefore, Lemma 3 is valid. $\square$

Lemma 4

Provided that Conditions (C.2), (C.3), (C.4) and (C.6) in Appendix 1 hold, as $n\rightarrow \infty$, we have

$$\begin{aligned} \sup _{\mathbf{{w}} \in W_n}\Vert \hat{\mathbf{{P}}}(\mathbf{{w}})-{\mathbf{{P}}}(\mathbf{{w}})\Vert =O_p(n^{-1/2}k_M). \end{aligned}$$

(26)

Proof of Lemma 4

Let $\hat{\mathbf{{A}}}_{(m)}=n^{-1}(\tilde{\mathbf{{X}}}^\top _{(m)}\tilde{\mathbf{{X}}}_{(m)}-n\hat{\varvec{ {\Sigma }}}_{T,(m)})$ and ${\mathbf{{A}}}_{(m)}=n^{-1}(\tilde{\mathbf{{X}}}^\top _{(m)}\tilde{\mathbf{{X}}}_{(m)}-n{\varvec{ {\Sigma }}_{T,(m)}})$. Note that

$$\begin{aligned} \hat{\mathbf{{A}}}^{-1}_{(m)}-{\mathbf{{A}}}^{-1}_{(m)}=\{{\mathbf{{A}}}^{-1}_{(m)}+(\hat{\mathbf{{A}}}^{-1}_{(m)}-{\mathbf{{A}}}^{-1}_{(m)})\}({\mathbf{{A}}}_{(m)}-\hat{\mathbf{{A}}}_{(m)}){\mathbf{{A}}}^{-1}_{(m)}. \end{aligned}$$

Then, we have

$$\begin{aligned} \Vert \hat{\mathbf{{A}}}^{-1}_{(m)}-{\mathbf{{A}}}^{-1}_{(m)}\Vert \le \{\Vert {\mathbf{{A}}}^{-1}_{(m)}\Vert +\Vert \hat{\mathbf{{A}}}^{-1}_{(m)}-{\mathbf{{A}}}^{-1}_{(m)}\Vert \}\Vert {\mathbf{{A}}}_{(m)}-\hat{\mathbf{{A}}}_{(m)}\Vert \Vert {\mathbf{{A}}}^{-1}_{(m)}\Vert . \end{aligned}$$

(27)

By (21), we can easily verify that

$$\begin{aligned} \Vert \mathbf{{A}}^{-1}_{(m)}\Vert =nO_p(n^{-1})=O_p(1), \end{aligned}$$

that is, $\sup _m\Vert \mathbf{{A}}^{-1}_{(m)}\Vert \le C<\infty$ uniformly in m. By Condition (C.6), we have $\Vert {\mathbf{{A}}}_{(m)}-\hat{\mathbf{{A}}}_{(m)}\Vert =n^{-1}\Vert {n(\varvec{ {\Sigma }}_{T,(m)}}-\hat{\varvec{ {\Sigma }}}_{T,(m)})\Vert =O_p(n^{-1/2}k_M)$ uniformly in m. Further, by Condition (C.4), we know that $\sup _m\Vert \mathbf{{A}}_{(m)}-\hat{\mathbf{{A}}}_{(m)}\Vert \rightarrow 0$ in probability, which implies that $C\Vert \mathbf{{A}}_{(m)}-\hat{\mathbf{{A}}}_{(m)}\Vert < 1$ in probability. Then, by (27), we have

$$\begin{aligned} \sup _m \Vert \hat{\mathbf{{A}}}^{-1}_{(m)}-\hat{\mathbf{{A}}}^{-1}_{(m)}\Vert \le \sup _m\frac{C\Vert \mathbf{{A}}_{(m)}-\hat{\mathbf{{A}}}_{(m)}\Vert }{1-C\Vert \mathbf{{A}}_{(m)}-\hat{\mathbf{{A}}}_{(m)}\Vert }=O_p(n^{-1/2}k_M). \end{aligned}$$

(28)

By (19) and (28), under Condition (C.4), we then have

$$\begin{aligned}&\sup _{\mathbf{{w}} \in W_n}\Vert \hat{\mathbf{{P}}}(\mathbf{{w}})-{\mathbf{{P}}}(\mathbf{{w}})\Vert \\&\quad =\sup _{\mathbf{{w}} \in W_n}\bigg \Vert \sum _{m=1}^{M}w_m(n^{-1}\tilde{\mathbf{{X}}}_{(m)}\hat{\mathbf{{A}}}^{-1}_{(m)}\tilde{\mathbf{{X}}}^\top _{(m)}-n^{-1}\tilde{\mathbf{{X}}}_{(m)}{\mathbf{{A}}}^{-1}_{(m)}\tilde{\mathbf{{X}}}^\top _{(m)})\bigg \Vert \\&\quad \le \sup _{\mathbf{{w}} \in W_n}\sum _{m=1}^{M}w_m\bigg \Vert n^{-1}\tilde{\mathbf{{X}}}_{(m)}\hat{\mathbf{{A}}}^{-1}_{(m)}\tilde{\mathbf{{X}}}^\top _{(m)}-n^{-1}\tilde{\mathbf{{X}}}_{(m)}{\mathbf{{A}}}^{-1}_{(m)}\tilde{\mathbf{{X}}}^\top _{(m)}\bigg \Vert \\&\quad \le \sup _{\mathbf{{w}} \in W_n}\sum _{m=1}^{M}w_m\{ \Vert \hat{\mathbf{{A}}}^{-1}_{(m)}-{\mathbf{{A}}}^{-1}_{(m)}\Vert \Vert n^{-1}\tilde{\mathbf{{X}}}_{(m)}\tilde{\mathbf{{X}}}^\top _{(m)}\Vert \}\\&\quad \le \sup _{\mathbf{{w}} \in W_n}\sum _{m=1}^{M}w_m\{ \sup _m\Vert \hat{\mathbf{{A}}}^{-1}_{(m)}-{\mathbf{{A}}}^{-1}_{(m)}\Vert \lambda _{\max }(n^{-1}\tilde{\mathbf{{X}}}_{(m)}\tilde{\mathbf{{X}}}^\top _{(m)})\} \\&\quad =O_p(n^{-1/2}k_M). \end{aligned}$$

Thus, we complete the proof of Lemma 4. $\square$

Proof of Theorem 1

Under Conditions (C.1)–(C.5) listed in Appendix 1, the idea of proving Theorem 1 is similar to that of the proof of Theorem 1. And the steps in the proof of Theorem 1 are relatively simpler than the proof of Theorem 2. Thus we omit it here. $\square$

Proof of Theorem 2

The proposed criterion of (9) can be recognized as

$$\begin{aligned} C_{\hat{G}_n}(\mathbf{{w}}) & = \Vert \varvec{ {\upmu }}-\hat{\varvec{ {\upmu }}}_{{\hat{G}}_n}(\mathbf{{w}})\Vert ^2+2(\mathbf{{Z}}_{{\hat{G}}_n}-\mathbf{{Z}}_G)^\top \{\varvec{ {\upmu }}-\hat{\varvec{ {\upmu }}}_{{\hat{G}}_n}(\mathbf{{w}})\}+2\varvec{ {\upepsilon }}^\top _{G}\{\varvec{ {\upmu }}-\hat{\varvec{ {\upmu }}}_{{\hat{G}}_n}(\mathbf{{w}})\}\nonumber \\&+2\hat{\varvec{ {\upepsilon }}}^\top _{{\hat{G}}_n}\hat{\mathbf{{Q}}}(\mathbf{{w}})\hat{\varvec{ {\upepsilon }}}_{{\hat{G}}_n}+\Vert \mathbf{{Z}}_{{\hat{G}}_n}-\varvec{ {\upmu }}\Vert , \end{aligned}$$

(29)

where $\hat{\mathbf{{Q}}}(\mathbf{{w}})$ is defined in Appendix 1. Obviously, the last term in (29) is unrelated to $\mathbf{{w}}$. Accordingly, following Li (1987), Theorem 2 is valid if the following three formulas hold,

$$\begin{aligned}&\sup _{\mathbf{{w}} \in W_n}\bigg |\frac{L_{{\hat{G}}_n}(\mathbf{{w}})}{R_G(\mathbf{{w}})}-1\bigg |=o_p(1), \end{aligned}$$

(30)

$$\begin{aligned}&\sup _{\mathbf{{w}} \in W_n}\frac{|(\mathbf{{Z}}_{{\hat{G}}_n}-\mathbf{{Z}}_G)^\top \{\varvec{ {\upmu }}-\hat{\varvec{ {\upmu }}}_{{\hat{G}}_n}(\mathbf{{w}})\}|}{R_G(\mathbf{{w}})}=o_p(1), \end{aligned}$$

(31)

$$\begin{aligned}&\sup _{\mathbf{{w}} \in W_n}\frac{|\varvec{ {\upepsilon }}^\top _{G}\{\varvec{ {\upmu }}-\hat{\varvec{ {\upmu }}}_{{\hat{G}}_n}(\mathbf{{w}})\}+\hat{\varvec{ {\upepsilon }}}^\top _{{\hat{G}}_n}\hat{\mathbf{{Q}}}(\mathbf{{w}})\hat{\varvec{ {\upepsilon }}}_{{\hat{G}}_n}|}{R_G(\mathbf{{w}})}=o_p(1), \end{aligned}$$

(32)

where $L_{{\hat{G}}_n}(\mathbf{{w}})=\Vert \varvec{ {\upmu }}-\hat{\varvec{ {\upmu }}}_{{\hat{G}}_n}(\mathbf{{w}})\Vert ^2$. We first consider (30). By Cauchy–Schwartz inequality, we have

$$\begin{aligned}&\sup _{\mathbf{{w}} \in W_n}\bigg |\frac{L_{{\hat{G}}_n}(\mathbf{{w}})}{R_G(\mathbf{{w}})}-1\bigg |\\&\quad =\sup _{\mathbf{{w}} \in W_n}\bigg |\frac{L_{{\hat{G}}_n}(\mathbf{{w}})-L_G(\mathbf{{w}})}{R_G(\mathbf{{w}})}+\frac{L_G(\mathbf{{w}})}{R_G(\mathbf{{w}})}-1\bigg |\\&\quad \le \sup _{\mathbf{{w}} \in W_n}\bigg |\frac{L_G(\mathbf{{w}})}{R_G(\mathbf{{w}})}-1\bigg |+\sup _{\mathbf{{w}} \in W_n}\frac{\Vert \hat{\varvec{ {\upmu }}}_G(\mathbf{{w}})-\hat{\varvec{ {\upmu }}}_{{\hat{G}}_n}(\mathbf{{w}})\Vert ^2}{R_G(\mathbf{{w}})}\\ &\qquad +\sup _{\mathbf{{w}} \in W_n}\frac{2\{L_G(\mathbf{{w}})\}^{1/2}\Vert \hat{\varvec{ {\upmu }}}_G(\mathbf{{w}})-\hat{\varvec{ {\upmu }}}_{{\hat{G}}_n}(\mathbf{{w}})\Vert }{R_G(\mathbf{{w}})}\\&\quad \equiv J_1+J_2+J_3. \end{aligned}$$

By Lemma 3, we know that $J_1$ is $o_p(1)$. Note that

$$\begin{aligned}&\Vert \hat{\varvec{ {\upmu }}}_G(\mathbf{{w}})-\hat{\varvec{ {\upmu }}}_{{\hat{G}}_n}(\mathbf{{w}})\Vert ^2\\&\quad =\Vert \hat{\mathbf{{P}}}(\mathbf{{w}})\mathbf{{Z}}_{{\hat{G}}_n}-{\mathbf{{P}}}(\mathbf{{w}})\mathbf{{Z}}_G\Vert ^2\\&\quad =\Vert \{\hat{\mathbf{{P}}}(\mathbf{{w}})-{\mathbf{{P}}}(\mathbf{{w}})\}(\mathbf{{Z}}_{{\hat{G}}_n}-\mathbf{{Z}}_G)+\{\hat{\mathbf{{P}}}(\mathbf{{w}})-{\mathbf{{P}}}(\mathbf{{w}})\}\mathbf{{Z}}_G+{\mathbf{{P}}}(\mathbf{{w}})(\mathbf{{Z}}_{{\hat{G}}_n}-\mathbf{{Z}}_G)\Vert ^2\\&\quad \le 3[\Vert \{\hat{\mathbf{{P}}}(\mathbf{{w}})-{\mathbf{{P}}}(\mathbf{{w}})\}(\mathbf{{Z}}_{{\hat{G}}_n}-\mathbf{{Z}}_G)\Vert ^2+\Vert \{\hat{\mathbf{{P}}}(\mathbf{{w}})-{\mathbf{{P}}}(\mathbf{{w}})\}\mathbf{{Z}}_G\Vert ^2+\Vert {\mathbf{{P}}}(\mathbf{{w}})(\mathbf{{Z}}_{{\hat{G}}_n}-\mathbf{{Z}}_G)\Vert ^2]. \end{aligned}$$

By Conditions (C.1) and (C.2), it is clear that $\Vert \mathbf{{Z}}_G\Vert ^2=O_p(n)$. Then, by Lemmas 2 and 4, under Condition (C.4), we have

$$\begin{aligned} J_2&\le 3\sup _{\mathbf{{w}} \in W_n}\frac{\Vert \{\hat{\mathbf{{P}}}(\mathbf{{w}})-{\mathbf{{P}}}(\mathbf{{w}})\}(\mathbf{{Z}}_{{\hat{G}}_n}-\mathbf{{Z}}_G)\Vert ^2}{R_G(\mathbf{{w}})}+3\sup _{\mathbf{{w}} \in W_n}\frac{\Vert \{\hat{\mathbf{{P}}}(\mathbf{{w}})-{\mathbf{{P}}}(\mathbf{{w}})\}\mathbf{{Z}}_G\Vert ^2}{R_G(\mathbf{{w}})} \\&\quad +3\sup _{\mathbf{{w}} \in W_n}\frac{\Vert {\mathbf{{P}}}(\mathbf{{w}})(\mathbf{{Z}}_{{\hat{G}}_n}-\mathbf{{Z}}_G)\Vert ^2}{R_G(\mathbf{{w}})}\\&\le \sup _{\mathbf{{w}} \in W_n}3\Vert \hat{\mathbf{{P}}}(\mathbf{{w}})-{\mathbf{{P}}}(\mathbf{{w}})\Vert ^2\Vert \mathbf{{Z}}_{{\hat{G}}_n}-\mathbf{{Z}}_G\Vert ^2\xi ^{-1}_G+3\sup _{\mathbf{{w}} \in W_n}\Vert \hat{\mathbf{{P}}}(\mathbf{{w}})-{\mathbf{{P}}}(\mathbf{{w}})\Vert ^2\Vert \mathbf{{Z}}_G\Vert ^2\xi ^{-1}_G \\&\quad +3\lambda _{\max }\{{\mathbf{{P}}}(\mathbf{{w}})\}\Vert \mathbf{{Z}}_{{\hat{G}}_n}-\mathbf{{Z}}_G\Vert ^2\xi ^{-1}_G\\ & \le C\xi ^{-1}_G\{O_p(n^{-1}k^2_M)O_p(1)+O_p(n^{-1}k^2_M)O_p(n)+O_p(1)\}\\&= o_p(1). \end{aligned}$$

As for $J_3$, according to the results of $J_1$ and $J_2$, by Cauchy–Schwartz inequality, we have

$$\begin{aligned} J_3\le \sqrt{\sup _{\mathbf{{w}} \in W_n}\frac{L_{{\hat{G}}_n}(\mathbf{{w}})}{R_G(\mathbf{{w}})}}\sqrt{\sup _{\mathbf{{w}} \in W_n}\frac{\Vert \hat{\varvec{ {\upmu }}}_G(\mathbf{{w}})-\hat{\varvec{ {\upmu }}}_{{\hat{G}}_n}(\mathbf{{w}})\Vert ^2}{R_G(\mathbf{{w}})}}=o_p(1). \end{aligned}$$

The above indicates that (30) is true.

Next, we turn to consider (31). By Lemma 2 and (30), utilizing Cauchy–Schwartz inequality once again, we have

$$\begin{aligned}&\sup _{\mathbf{{w}} \in W_n}\frac{|(\mathbf{{Z}}_{{\hat{G}}_n}-\mathbf{{Z}}_G)^\top \{\varvec{ {\upmu }}-\hat{\varvec{ {\upmu }}}_{{\hat{G}}_n}(\mathbf{{w}})\}|}{R_G(\mathbf{{w}})}\\&\quad \le \sqrt{\sup _{\mathbf{{w}} \in W_n}\frac{\Vert \mathbf{{Z}}_{{\hat{G}}_n}-\mathbf{{Z}}_G\Vert ^2}{R_G(\mathbf{{w}})}}\sqrt{\sup _{\mathbf{{w}} \in W_n}\frac{\Vert \varvec{ {\upmu }}-\hat{\varvec{ {\upmu }}}_{{\hat{G}}_n}(\mathbf{{w}})\Vert ^2}{R_G(\mathbf{{w}})}}\\&\quad =o_p(1). \end{aligned}$$

Finally, we consider (32). According to the definition of $\hat{\varvec{ {\upepsilon }}}_{{\hat{G}}_n}$ in (10), by Lemmas 2 and 4, it can be obtained that

$$\begin{aligned}&\Vert \hat{\varvec{ {\upepsilon }}}_G-\hat{\varvec{ {\upepsilon }}}_{{\hat{G}}_n}\Vert \nonumber \\&\quad =\frac{1}{\sqrt{n-k_M}}\Vert (\mathbf{{Z}}_G-\mathbf{{P}}_{(M)}\mathbf{{Z}}_G)-(\mathbf{{Z}}_{{\hat{G}}_n}-\hat{\mathbf{{P}}}_{(M)}\mathbf{{Z}}_{{\hat{G}}_n})\Vert \nonumber \\&\quad =\frac{1}{\sqrt{n-k_M}}\Vert \mathbf{{Z}}_G-\mathbf{{Z}}_{{\hat{G}}_n}+(\hat{\mathbf{{P}}}_{(M)}-\mathbf{{P}}_{(M)})(\mathbf{{Z}}_{{\hat{G}}_n}-\mathbf{{Z}}_G)\nonumber \\&\qquad +(\hat{\mathbf{{P}}}_{(M)}-\mathbf{{P}}_{(M)})\mathbf{{Z}}_G+\mathbf{{P}}_{(M)}(\mathbf{{Z}}_{{\hat{G}}_n}-\mathbf{{Z}}_G)\Vert \nonumber \\&\quad \le \frac{1}{\sqrt{n-k_M}}\{\Vert \mathbf{{Z}}_G-\mathbf{{Z}}_{{\hat{G}}_n}\Vert +\Vert \hat{\mathbf{{P}}}_{(M)}-\mathbf{{P}}_{(M)}\Vert \Vert \mathbf{{Z}}_G-\mathbf{{Z}}_{{\hat{G}}_n}\Vert \nonumber \\&\qquad +\Vert \hat{\mathbf{{P}}}_{(M)}-\mathbf{{P}}_{(M)}\Vert \Vert \mathbf{{Z}}_G\Vert +\lambda _{\max }(\mathbf{{P}}_{(M)})\Vert \mathbf{{Z}}_G-\mathbf{{Z}}_{{\hat{G}}_n}\Vert \}\nonumber \\&\quad =\frac{1}{\sqrt{n-k_M}}\{O_p(1)+O_p(n^{-1/2}k_M)O_p(1)+O_p(n^{-1/2}k_M)O_p(n^{1/2})+O_p(1)O_p(1)\}\nonumber \\&\quad =O_p(k_M/\sqrt{n-k_M}). \end{aligned}$$

(33)

Through some trivial calculations, we can obtain that

$$\begin{aligned}&|\varvec{ {\upepsilon }}^\top _{G}\{\varvec{ {\upmu }}-\hat{\varvec{ {\upmu }}}_{{\hat{G}}_n}(\mathbf{{w}})\}+\hat{\varvec{ {\upepsilon }}}^\top _{{\hat{G}}_n}\hat{\mathbf{{Q}}}(\mathbf{{w}})\hat{\varvec{ {\upepsilon }}}_{{\hat{G}}_n}|\\&\quad \le |\varvec{ {\upepsilon }}^\top _{G}\{\mathbf{{I}}_n-\mathbf{{P}}(\mathbf{{w}})\}\varvec{ {\upmu }}|+|\varvec{ {\upepsilon }}^\top _{G}\{\hat{\varvec{ {\upmu }}}_G(\mathbf{{w}})-\hat{\varvec{ {\upmu }}}_{{\hat{G}}_n}(\mathbf{{w}})\}|+|\varvec{ {\upepsilon }}^\top _{G}\mathbf{{P}}(\mathbf{{w}})\varvec{ {\upepsilon }}_{G}-tr\{\varvec{ {\Omega }}_G\mathbf{{P}}(\mathbf{{w}})\}|\\&\qquad +|(\hat{\varvec{ {\upepsilon }}}_{{\hat{G}}_n}-\hat{\varvec{ {\upepsilon }}}_{G})^\top \{\hat{\mathbf{{Q}}}(\mathbf{{w}})-{\mathbf{{Q}}}(\mathbf{{w}})\}(\hat{\varvec{ {\upepsilon }}}_{{\hat{G}}_n}-\hat{\varvec{ {\upepsilon }}}_{G})|+|(\hat{\varvec{ {\upepsilon }}}_{{\hat{G}}_n}-\hat{\varvec{ {\upepsilon }}}_{G})^\top {\mathbf{{Q}}}(\mathbf{{w}})(\hat{\varvec{ {\upepsilon }}}_{{\hat{G}}_n}-\hat{\varvec{ {\upepsilon }}}_{G})|\\&\qquad +2|(\hat{\varvec{ {\upepsilon }}}_{{\hat{G}}_n}-\hat{\varvec{ {\upepsilon }}}_{G})^\top \{\hat{\mathbf{{Q}}}(\mathbf{{w}})-{\mathbf{{Q}}}(\mathbf{{w}})\}\hat{\varvec{ {\upepsilon }}}_{G}|+2|(\hat{\varvec{ {\upepsilon }}}_{{\hat{G}}_n}-\hat{\varvec{ {\upepsilon }}}_{G})^\top {\mathbf{{Q}}}(\mathbf{{w}})\hat{\varvec{ {\upepsilon }}}_{G}|\\&\qquad +|\hat{\varvec{ {\upepsilon }}}^\top _{G}\{\hat{\mathbf{{Q}}}(\mathbf{{w}})-{\mathbf{{Q}}}(\mathbf{{w}})\}\hat{\varvec{ {\upepsilon }}}_{G}|+|\hat{\varvec{ {\upepsilon }}}^\top _{G}{\mathbf{{Q}}}(\mathbf{{w}})\hat{\varvec{ {\upepsilon }}}_{G}-tr\{\varvec{ {\Omega }}_G\mathbf{{P}}(\mathbf{{w}})\}|\\&\quad \equiv \sum _{i=1}^{9}\Lambda _i, \end{aligned}$$

where $\varvec{ {\Omega }}_G=diag(\sigma ^2_{G,1},\ldots ,\sigma ^2_{G,n})$, $\mathbf{{Q}}(\mathbf{{w}})$ and $\hat{\mathbf{{Q}}}(\mathbf{{w}})$ are given in Appendix 1. Thus, to prove (32), it is equivalent to prove

$$\begin{aligned} \sup _{\mathbf{{w}} \in W_n}\frac{\Lambda _i}{R_G(\mathbf{{w}})}=o_p(1), \quad i=1,\ldots ,9. \end{aligned}$$

(34)

By (25) and Markov inequality, it is easy to show that

$$\begin{aligned} \Vert \varvec{ {\upepsilon }}_{G}\Vert =O_p(n^{1/2}). \end{aligned}$$

(35)

By (35), under Conditions (C.2) and (C.5), with the similar proving skills of Theorem 2.2 in Liu and Okui (2013), we know that $\sup _{\mathbf{{w}} \in W_n}\Lambda _i/R_G(\mathbf{{w}})=o_p(1)$ are true for $i=1,3,9$. By (22), (35), Lemmas 2, 4 and $J_2$, under Condition (C.4), we have

$$\begin{aligned}&\sup _{\mathbf{{w}} \in W_n}\frac{\Lambda _2}{R_G(\mathbf{{w}})} \\&\quad \le \sup _{\mathbf{{w}} \in W_n} \frac{\Vert \varvec{ {\upepsilon }}_{G}\Vert \Vert \{\hat{\varvec{ {\upmu }}}_G(\mathbf{{w}})-\hat{\varvec{ {\upmu }}}_{{\hat{G}}_n}(\mathbf{{w}})\}\Vert }{R_G(\mathbf{{w}})}\\&\quad \le C\sup _{\mathbf{{w}} \in W_n}\frac{\Vert \varvec{ {\upepsilon }}_{G}\Vert \Vert \{\hat{\mathbf{{P}}}(\mathbf{{w}})-{\mathbf{{P}}}(\mathbf{{w}})\}(\mathbf{{Z}}_{{\hat{G}}_n}-\mathbf{{Z}}_G)\Vert }{R_G(\mathbf{{w}})}+C\sup _{\mathbf{{w}}\in W_n}\frac{\Vert \varvec{ {\upepsilon }}_{G}\Vert \Vert \{\hat{\mathbf{{P}}}(\mathbf{{w}})-{\mathbf{{P}}}(\mathbf{{w}})\}\mathbf{{Z}}_G\Vert }{R_G(\mathbf{{w}})} \\&\qquad +C\sup _{\mathbf{{w}}\in W_n}\frac{\Vert \varvec{ {\upepsilon }}_{G}\Vert \Vert {\mathbf{{P}}}(\mathbf{{w}})(\mathbf{{Z}}_{{\hat{G}}_n}-\mathbf{{Z}}_G)\Vert }{R_G(\mathbf{{w}})} \\&\quad \le C\xi ^{-1}_G\{O_p(n^{1/2})O_p(n^{-1/2}k_M)+O_p(n^{1/2})O_p(n^{-1/2}k_M)O_p(n^{1/2})+O_p(n^{1/2})\}\\&\quad =o_p(1). \end{aligned}$$

Denote by $\tilde{p}=\sup _{\mathbf{{w}} \in W_n}\max _{1\le i\le n}p_{ii}(\mathbf{{w}})$. By Condition (C.5), it is easy to verify that $\tilde{p}=O_p(n^{-1/2})$. Similar to Lemma 4, we can verify that

$$\begin{aligned} \quad \sup _{\mathbf{{w}} \in W_n}\Vert \hat{\mathbf{{Q}}}(\mathbf{{w}})-{\mathbf{{Q}}}(\mathbf{{w}})\Vert =O_p(n^{-1/2}k_M). \end{aligned}$$

(36)

Then, for $\Lambda _4$, combine with (33) and (36), under Condition (C.4), we have

$$\begin{aligned} \sup _{\mathbf{{w}} \in W_n}\frac{\Lambda _4}{R_G(\mathbf{{w}})}&\le \sup _{\mathbf{{w}} \in W_n}\frac{\Vert \hat{\varvec{ {\upepsilon }}}_{{\hat{G}}_n}-\hat{\varvec{ {\upepsilon }}}_G\Vert ^2\sup _{\mathbf{{w}} \in W_n}\Vert \hat{\mathbf{{Q}}}(\mathbf{{w}})-{\mathbf{{Q}}}(\mathbf{{w}})\Vert }{R_G(\mathbf{{w}})}\\&\le \xi ^{-1}_GO_p\{k^2_M/(n-k_M)\}O_p(n^{-1/2}k_M)\\&=o_p(1). \end{aligned}$$

For $\Lambda _5$, by (33) and (36), under Conditions (C.4) and (C.5), we have

$$\begin{aligned} \sup _{\mathbf{{w}} \in W_n}\frac{\Lambda _5}{R_G(\mathbf{{w}})}&\le \sup _{\mathbf{{w}} \in W_n}\frac{\Vert \hat{\varvec{ {\upepsilon }}}_{{\hat{G}}_n}-\hat{\varvec{ {\upepsilon }}}_G\Vert ^2\tilde{p}}{R_G(\mathbf{{w}})}\\&\le \xi ^{-1}_GO_p\{k^2_M/(n-k_M)\}O_p(n^{-1/2})\\&=o_p(1). \end{aligned}$$

As for $\Lambda _6$, similarly, we have

$$\begin{aligned} \sup _{\mathbf{{w}} \in W_n}\frac{\Lambda _6}{R_G(\mathbf{{w}})}&=\sup _{\mathbf{{w}} \in W_n}\frac{2|(\hat{\varvec{ {\upepsilon }}}_{{\hat{G}}_n}-\hat{\varvec{ {\upepsilon }}}_{G})^\top \{\hat{\mathbf{{Q}}}(\mathbf{{w}})-{\mathbf{{Q}}}(\mathbf{{w}})\}(\mathbf{{I}}_n-\mathbf{{P}}_{(M)})\mathbf{{Z}}_G|}{R_G(\mathbf{{w}})}\\&\le C\sup _{\mathbf{{w}} \in W_n}\frac{\Vert \hat{\varvec{ {\upepsilon }}}_{{\hat{G}}_n}-\hat{\varvec{ {\upepsilon }}}_G\Vert \sup _{\mathbf{{w}} \in W_n}\Vert \hat{\mathbf{{Q}}}(\mathbf{{w}})-{\mathbf{{Q}}}(\mathbf{{w}})\Vert \{1+\lambda _{\max }(\mathbf{{P}}_{(M)})\}\Vert \mathbf{{Z}}_G\Vert }{R_G(\mathbf{{w}})}\\&\le C\xi ^{-1}_GO_p(k_M/\sqrt{n-k_M})O_p(n^{-1/2}k_M)O_p(n^{1/2})\{1+O_p(1)\}\\&\le CO_p\bigg (\frac{\sqrt{n}k_M}{\xi _G}\frac{k_M}{\sqrt{n}}\frac{1}{\sqrt{n-k_M}}\bigg )\\&= o_p(1). \end{aligned}$$

Similar to the proving of $\Lambda _4$, $\Lambda _5$ and $\Lambda _6$, we can prove that $\sup _{\mathbf{{w}} \in W_n}\Lambda _i/R_G(\mathbf{{w}})=o_p(1)$ for $i=7,8$. Until now, we have proved that (32) is true.

Thus, we complete the proof of Theorem 2. $\square$

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

Reprints and permissions

About this article

Cite this article

Liang, Z., Zhang, C. & Xu, L. Model averaging for right censored data with measurement error. Lifetime Data Anal 30, 501–527 (2024). https://doi.org/10.1007/s10985-024-09620-3

Download citation

Received: 13 February 2023
Accepted: 28 January 2024
Published: 13 March 2024
Issue Date: April 2024
DOI: https://doi.org/10.1007/s10985-024-09620-3

Keywords

Mathematics Subject Classification

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Model averaging for right censored data with measurement error

Abstract

Access this article

Similar content being viewed by others

Mallows Model Averaging Estimation for Linear Regression Model with Right Censored Data

Model averaging for linear models with responses missing at random

Jackknife model averaging for linear regression models with missing responses

References

Acknowledgements

Author information

Authors and Affiliations

Corresponding author

Ethics declarations

Conflict of interest

Additional information

Publisher's Note

Appendices

Appendix 1. Regularity conditions

Remark 1

Appendix 2. Proofs of main results

Proof of Lemma 1

Lemma 2

Proof of Lemma 2

Lemma 3

Proof of Lemma 3

Lemma 4

Proof of Lemma 4

Proof of Theorem 1

Proof of Theorem 2

Rights and permissions

About this article

Cite this article

Keywords

Mathematics Subject Classification

Navigation

Model averaging for right censored data with measurement error

Abstract

Access this article

Similar content being viewed by others

Mallows Model Averaging Estimation for Linear Regression Model with Right Censored Data

Model averaging for linear models with responses missing at random

Jackknife model averaging for linear regression models with missing responses

References

Acknowledgements

Author information

Authors and Affiliations

Corresponding author

Ethics declarations

Conflict of interest

Additional information

Publisher's Note

Appendices

Appendix 1. Regularity conditions

Remark 1

Appendix 2. Proofs of main results

Proof of Lemma 1

Lemma 2

Proof of Lemma 2

Lemma 3

Proof of Lemma 3

Lemma 4

Proof of Lemma 4

Proof of Theorem 1

Proof of Theorem 2

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Mathematics Subject Classification

Search

Navigation