Abstract
Measurement error data are encountered in a broad spectrum of scientific fields, including engineering, economics, the biomedical sciences and epidemiology. Simply ignoring the measurement errors results in biased estimators. Combining local kernel smoothing with the SCAD penalty, this paper proposes a bias-corrected penalized method to capture the underlying structure of varying coefficient models with measurement errors. We show that, under a proper choice of tuning parameters and some regularity conditions, the proposed method consistently removes all the unimportant variables and separates the constant effects from the varying effects. A corresponding algorithm based on the local quadratic approximation is developed to compute the estimates. Simulation studies are conducted to assess the finite sample performance of the proposed method.
References
Ahmad I, Leelahanon S, Li Q (2005) Efficient estimation of a semiparametric partially linear varying coefficient model. Ann Stat 33:258–283
Cai Z, Fan J, Li R (2000) Efficient estimation and inferences for varying-coefficient models. J Am Stat Assoc 95:941–956
Fan J, Huang T (2005) Profile likelihood inferences on semiparametric varying-coefficient partially linear models. Bernoulli 11:1031–1057
Fan J, Li R (2001) Variable selection via nonconcave penalized likelihood and its oracle properties. J Am Stat Assoc 96:1348–1360
Hu T, Xia Y (2012) Adaptive semi-varying coefficient model selection. Stat Sin 22:575–599
Hu X, Wang Z, Zhao Z (2009) Empirical likelihood for semiparametric varying-coefficient partially linear errors-in-variables models. Stat Probab Lett 79:1044–1052
Kai B, Li R, Zou H (2011) New efficient estimation and variable selection methods for semiparametric varying-coefficient partially linear models. Ann Stat 39:305–332
Li L, Greene T (2008) Varying coefficients model with measurement error. Biometrics 64:519–526
Li Q, Huang CJ, Li D, Fu TT (2002) Semiparametric smooth coefficient models. J Bus Econ Stat 20:412–422
Li R, Liang H (2008) Variable selection in semiparametric regression modeling. Ann Stat 36:261–286
Li X, You J, Zhou Y (2011) Statistical inference for varying-coefficient models with error-prone covariates. J Stat Comput Simul 81:1755–1771
Liang H, Härdle W, Carroll RJ (1999) Estimation in a semiparametric partially linear errors-in-variables model. Ann Stat 27:1519–1535
Ma X, Zhang J (2016) A new variable selection approach for varying coefficient models. Metrika 79:59–72
Tang Q (2015) Robust estimation for spatial semiparametric varying coefficient partially linear regression. Stat Pap 56:1137–1161
Tang Y, Wang H, Zhu Z, Song X (2012) A unified variable selection approach for varying coefficient models. Stat Sin 22:601–628
Wang D, Kulasekera K (2012) Parametric component detection and variable selection in varying-coefficient partially linear models. J Multivar Anal 112:117–129
Wang K, Lin L (2017) Robust and efficient estimator for simultaneous model structure identification and variable selection in generalized partial linear varying coefficient models with longitudinal data. Stat Pap. https://doi.org/10.1007/s00362-017-0890-z
Wang M, Song L (2013) Identification for semiparametric varying coefficient partially linear models. Stat Probab Lett 83:1311–1320
Wang H, Xia Y (2009) Shrinkage estimation of the varying coefficient model. J Am Stat Assoc 104:747–757
Xia Y, Zhang W, Tong H (2004) Efficient estimation for semivarying-coefficient models. Biometrika 91:661–681
You J, Chen G (2006) Estimation of a semiparametric varying-coefficient partially linear errors-in-variables model. J Multivar Anal 97:324–341
You J, Zhou Y, Chen G (2006) Corrected local polynomial estimation in varying-coefficient models with measurement errors. Can J Stat 34:391–410
Zhang W, Lee S, Song X (2002) Local polynomial fitting in semivarying coefficient model. J Multivar Anal 82:166–188
Zhao P, Xue L (2009) Variable selection for semiparametric varying coefficient partially linear models. Stat Probab Lett 79:2148–2157
Zhao P, Xue L (2010) Variable selection for semiparametric varying coefficient partially linear errors-in-variables models. J Multivar Anal 101:1872–1883
Zhao P, Xue L (2011) Variable selection for varying coefficient models with measurement errors. Metrika 74:231–245
Zhao W, Zhang R, Liu J et al (2014) Robust and efficient variable selection for semiparametric partially linear varying coefficient model based on modal regression. Ann Inst Stat Math 66:165–191
Zhou Y, Liang H (2009) Statistical inference for semiparametric varying-coefficient partially linear models with error-prone linear covariates. Ann Stat 37:427–458
Acknowledgements
Mingqiu Wang’s research was supported by the National Natural Science Foundation of China (Grant Nos. 11401340, 11771250). Peixin Zhao’s research was supported by the Chongqing Research Program of Basic Theory and Advanced Technology (No. cstc2016jcyjA0151) and the Fifth Batch of Excellent Talent Support Program of Chongqing Colleges and Universities.
Author information
Contributions
All authors read and approved the final manuscript.
Ethics declarations
Conflict of interest
The authors declare that they have no competing interests.
Appendix
Proof of Theorem 3.1
The proof of Theorem 3.1 is similar to that of Theorem 3.2 and thus is not given in detail. \(\square \)
Proof of Theorem 3.2
For any matrix \(A=(a_{ij}),\) \(\Vert A\Vert ^2=\sum _{i,j}a_{ij}^2.\) Denote \( \varvec{M}=(m_{ij})\in R^{n\times p}\) with rows \( \varvec{m}_1^\top ,\ldots , \varvec{m}_n^\top \) and columns \( \varvec{m}_{(1)},\ldots , \varvec{m}_{(p)}.\) Let \(\alpha _n=(nh)^{-1/2}.\) It suffices to show that for any given \(\epsilon >0,\) there exists a large constant C such that
Based on the definition of \(Q_{\lambda _1,\lambda _2}(\cdot ),\) we have
Let \(\widehat{\Sigma }(U_j)=n^{-1}\sum _{i=1}^n\mathbf {Z}_i\mathbf {Z}_i^{\top }K_h(U_j-U_i),\) \(\Omega (U_j)=n^{-1}\sum _{i=1}^n\Sigma \otimes K_h(U_j-U_i)\) and \(\hat{ \varvec{e}}_j=\sum _{i=1}^n\mathbf {Z}_iK_h(U_j-U_i)(Y_i-\mathbf {Z}_i^\top \varvec{\phi }_0(U_j)).\) Then we have
where \(\lambda _j^{\min }\) denotes the smallest eigenvalue of \(\widehat{\Sigma }(U_j)-\Omega (U_j),\) \(\lambda ^{\min }=\min \{\lambda _j^{\min },\,j=1,\ldots ,n\}\) and \(T_{1j}=\hat{ \varvec{e}}_j+\Omega (U_j)\varvec{\phi }_0(U_j).\)
Using formula (A1) in You et al. (2006), we have \(\Pr (\lambda ^{\min }\rightarrow \lambda ^{\min }_0)\rightarrow 1,\) where \(\lambda ^{\min }_0=\inf _{u\in [0,1]}\lambda _{\min }(f(u)\Gamma (u))\) and \(\lambda _{\min }(A)\) denotes the smallest eigenvalue of a positive definite matrix A. By conditions (C1)–(C2), it is easy to see that \(\lambda ^{\min }_0>0.\)
We now decompose \(T_{1j}\) as
Based on the proof in Wang and Xia (2009), we have \(\Vert T_{21}\Vert ^2=O_P(nh^{-1})\) and \(\Vert T_{23}\Vert ^2=O_P(nh^{-1}).\)
For \(T_{22},\)
By a Taylor expansion, we have
where \(\varvec{\phi }'_0(\cdot )=(\phi '_{01}(\cdot ),\ldots ,\phi '_{0p}(\cdot ))^{\top },\) \(\varvec{\phi }''_0(\cdot )=(\phi ''_{01}(\cdot ),\ldots ,\phi ''_{0p}(\cdot ))^{\top }\) and \(U^*\) lies between \(U_1\) and \(U_2.\)
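For concreteness, and consistently with the definitions just given, the second-order expansion referred to here takes the standard form (a sketch; the original displayed equation is not reproduced in this version):

\[
\varvec{\phi }_0(U_1)=\varvec{\phi }_0(U_2)+\varvec{\phi }'_0(U_2)(U_1-U_2)+\tfrac{1}{2}\,\varvec{\phi }''_0(U^*)(U_1-U_2)^2.
\]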
Based on some simple calculations,
Let \(\widetilde{\Gamma }(u)=\Gamma (u)f(u)\); then, by a Taylor expansion,
where c is a constant. Consequently, we have
Similarly, \(T_{222}=O(h).\) As a result, \(\Vert T_{22}\Vert ^2=O_P(nh)=O_P(nh^{-1}),\) since \(h\rightarrow 0\) implies \(nh=o(nh^{-1}).\)
For \(T_{24},\)
Let \(\theta (u)=E(\varepsilon _2^2|U_2=u)\); then we have
It follows that \(E\Vert T_{24}\Vert ^2=O(nh^{-1}).\)
For \(T_{25},\) using the Cauchy–Schwarz inequality, we obtain
Denote \(\Sigma =(\sigma _{kl})\); then
Similar to the proof of \(T_{24},\) we have \(E\Vert T_{25}\Vert ^2=O(nh^{-1}).\)
For \(T_{26},\)
Similar to the proof of (6.2), we have
It follows that \( \Vert T_{26}\Vert ^2=O(nh^{-1}).\)
Now, we consider \(T_2.\)
Note that
For \(k\notin S_z,\) \(n^{-1}\Vert \tilde{\varvec{a}}_k\Vert ^2>0\) with probability tending to 1, and \(\lambda _{1}/\sqrt{n}\rightarrow 0\); thus for any \(\xi >0,\)
which implies that \(\rho '_{\lambda _{1}}(\Vert \tilde{\varvec{a}}_k\Vert )=o_P(n^{3/2}\alpha _n).\) So we have \(T_2=o_P(n^2\alpha _n^2)C.\) Similarly, for \(k\in S_v,\) \(n^{-1}\Vert \tilde{\varvec{b}}_k\Vert ^2>0\) with probability tending to 1, and \(\lambda _{2}/\sqrt{n}\rightarrow 0,\) so we have \(T_3=o_P(n^2\alpha _n^2)C.\)
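Recall that the SCAD penalty of Fan and Li (2001), on which this bound relies, has first derivative

\[
\rho '_{\lambda }(\theta )=\lambda \left\{ I(\theta \le \lambda )+\frac{(a\lambda -\theta )_{+}}{(a-1)\lambda }\,I(\theta >\lambda )\right\} ,\qquad \theta \ge 0,\ a>2,
\]

so that \(\rho '_{\lambda }(\theta )=0\) whenever \(\theta >a\lambda .\) Since \(n^{-1}\Vert \tilde{\varvec{a}}_k\Vert ^2>0\) means \(\Vert \tilde{\varvec{a}}_k\Vert \) grows at rate \(\sqrt{n}\) while \(\lambda _{1}=o(\sqrt{n}),\) these derivative terms eventually vanish with probability tending to 1.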
By choosing a large C, (6.1) is positive with probability close to 1. This completes the proof of Theorem 3.2. \(\square \)
Proof of Theorem 3.3
We first prove part (1). We only need to prove that \(\Pr (\Vert \hat{\varvec{a}}_k\Vert =0)\rightarrow 1\) for \(k=p,\) where \(\hat{\varvec{a}}_k\) is the kth column of \(\hat{\varvec{\Phi }}.\) The proof for \(p_0+p_1<k<p\) is similar. If \(\Vert \hat{\varvec{a}}_p\Vert \ne 0,\) then
where \(J_1\) is an \(n \times 1\) vector with its lth component given by
and
where \(\Sigma _{p\cdot }^{\top }\) is the pth row of \(\Sigma .\) From Theorem 3.1, we have \(n^{-1}\Vert \tilde{\varvec{a}}_p\Vert ^2=O_P(n^{-4/5})\) and \(n^{-1}\Vert \tilde{\varvec{b}}_p\Vert ^2=O_P(n^{-4/5}).\) According to the definition of the SCAD penalty and \(n^{1/10}/\max (\lambda _{1},\,\lambda _{2})\rightarrow 0,\) we obtain
In fact, for any \(\eta > 0,\)
Similarly,
By standard arguments of kernel smoothing, we have \(\Vert J_1\Vert =O_P(nh^{-1/2}).\) Consequently, with probability tending to 1, the normal equation (6.4) cannot hold, which implies \(\Pr (\Vert \hat{\varvec{a}}_p\Vert =0)\rightarrow 1.\)
Now we prove part (2). Similarly, we only need to prove that \(\Pr (\Vert \hat{\varvec{b}}_k\Vert =0)\rightarrow 1\) for \(k=p_0+p_1.\) The proof for \(p_0<k<p_0+p_1\) is similar. If \(\Vert \hat{\varvec{b}}_{p_0+p_1}\Vert \ne 0,\) then
where \(J_1^*\) is an \(n \times 1\) vector with its lth component given by
and
By Theorem 3.1, we have \(n^{-1}\Vert \tilde{\varvec{a}}_{p_0+p_1}\Vert ^2>0\) with probability tending to 1 and \(n^{-1}\Vert \tilde{\varvec{b}}_{p_0+p_1}\Vert ^2=O_P(n^{-4/5}).\) From (6.3), we have
Based on the definition of the SCAD penalty, together with \(n^{1/10}/\max (\lambda _{1},\,\lambda _{2})\rightarrow 0,\) we can obtain
By standard arguments of kernel smoothing, we have \(\Vert J_1^*\Vert =O_P(nh^{-1/2}).\) Consequently, with probability tending to 1, the normal equation (6.5) cannot hold, which implies \(\Pr (\Vert \hat{\varvec{b}}_{p_0+p_1}\Vert =0)\rightarrow 1.\) \(\square \)
Cite this article
Wang, M., Zhao, P. & Kang, X. Structure identification for varying coefficient models with measurement errors based on kernel smoothing. Stat Papers 61, 1841–1857 (2020). https://doi.org/10.1007/s00362-018-1009-x