Abstract
In this paper, we propose a robust two-stage estimation and variable selection procedure for the varying-coefficient partially nonlinear model based on modal regression. In the first stage, each coefficient function is approximated by B-spline basis functions, and a QR decomposition is employed to remove the nonparametric component from the original model. For the resulting parametric model, an estimation and variable selection procedure for the parameters is proposed based on modal regression. In the second stage, a similar procedure is developed for the coefficient functions. The proposed procedure is not only flexible and easy to implement but also robust and efficient. Under mild conditions, asymptotic properties of the resulting estimators are established. Moreover, bandwidth selection and the estimation algorithm for the proposed method are discussed. Finally, simulation studies and a real-data example are presented to evaluate the finite-sample performance of the proposed estimation and variable selection procedure.
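The modal regression that underlies the proposed procedure replaces the least-squares criterion with a kernel-smoothed objective, \(\max_{\beta}\, n^{-1}\sum_i \phi_h(Y_i - x_i^T\beta)\), which targets the conditional mode and is therefore robust to skewed or heavy-tailed errors. Below is a minimal sketch of the MEM (modal expectation-maximization) iteration of Yao and Li (2014) for a linear model, not the authors' implementation; the function name, toy data, and bandwidth choice are illustrative assumptions.

```python
import numpy as np

def modal_linear_regression(X, y, h=0.5, n_iter=200, tol=1e-8):
    """Modal linear regression via the MEM algorithm (Yao & Li, 2014).

    Maximizes (1/n) * sum_i phi_h(y_i - x_i' beta), with phi_h a Gaussian
    kernel of bandwidth h, by alternating kernel weights (E-step) and
    weighted least squares (M-step).
    """
    beta = np.linalg.lstsq(X, y, rcond=None)[0]       # least-squares start
    for _ in range(n_iter):
        resid = y - X @ beta
        w = np.exp(-0.5 * (resid / h) ** 2)           # E-step: Gaussian kernel weights
        XtW = X.T * w                                 # weight each observation
        beta_new = np.linalg.solve(XtW @ X, XtW @ y)  # M-step: weighted LS
        if np.max(np.abs(beta_new - beta)) < tol:
            beta = beta_new
            break
        beta = beta_new
    return beta

# Toy model y = 1 + 2*z + eps with right-skewed errors: the error mode (0)
# differs from the error mean (1), so the OLS and modal fits separate.
rng = np.random.default_rng(0)
n = 500
z = rng.uniform(-2.0, 2.0, n)
X = np.column_stack([np.ones(n), z])
y = 1.0 + 2.0 * z + rng.exponential(1.0, n)

beta_modal = modal_linear_regression(X, y, h=0.5)
beta_ols = np.linalg.lstsq(X, y, rcond=None)[0]
# beta_modal[0] falls well below beta_ols[0], nearer the conditional mode.
```

Because each observation is reweighted by the kernel density of its residual, outlying observations receive exponentially small weights, which is the source of the robustness claimed in the paper; with a large bandwidth \(h\), the weights flatten and the estimator approaches ordinary least squares.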
References
Dai, S., & Huang, Z. S. (2019). Estimation for varying coefficient partially nonlinear models with distorted measurement errors. Journal of the Korean Statistical Society, 48, 117–133.
Fan, J. Q., & Li, R. (2001). Variable selection via nonconcave penalized likelihood and its oracle properties. Journal of the American Statistical Association, 96, 1348–1360.
Huang, J. T., & Zhao, P. X. (2017). QR decomposition based orthogonality estimation for partially linear models with longitudinal data. Journal of Computational and Applied Mathematics, 321, 406–415.
Jiang, Y. L., Ji, Q. H., & Xie, B. J. (2017). Robust estimation for the varying coefficient partially nonlinear models. Journal of Computational and Applied Mathematics, 326, 31–43.
Li, T. Z., & Mei, C. L. (2013). Estimation and inference for varying coefficient partially nonlinear models. Journal of Statistical Planning and Inference, 143, 2023–2037.
Li, J., Ray, S., & Lindsay, B. (2007). A nonparametric statistical approach to clustering via mode identification. Journal of Machine Learning Research, 8, 1687–1723.
Lv, Z. K., Zhu, H. M., & Yu, K. M. (2014). Robust variable selection for nonlinear models with diverging number of parameters. Statistics and Probability Letters, 91, 90–97.
Schumaker, L. L. (1981). Spline Functions: Basic Theory. New York: Wiley.
Tang, X. R., Zhao, P. X., Yang, Y. P., & Yang, W. M. (2020). Adjusted empirical likelihood inferences for varying coefficient partially nonlinear models with endogenous covariates. Communications in Statistics - Theory and Methods. https://doi.org/10.1080/03610926.2020.1747078
Wang, K. N., Li, S. M., Sun, X. F., & Lin, L. (2019). Modal regression statistical inference for longitudinal data semivarying coefficient models: Generalized estimating equations, empirical likelihood and variable selection. Computational Statistics and Data Analysis, 133, 257–276.
Wang, X. L., Zhao, P. X., & Du, H. Y. (2021). Statistical inferences for varying coefficient partially nonlinear model with missing covariates. Communications in Statistics - Theory and Methods, 50, 2599–2618.
Wu, P., & Zhu, L. X. (2010). An orthogonality-based estimation of moments for linear mixed models. Scandinavian Journal of Statistics, 37, 253–263.
Xiao, Y. T., & Chen, Z. S. (2018). Bias-corrected estimations in varying-coefficient partially nonlinear models with measurement error in the nonparametric part. Journal of Applied Statistics, 45, 586–603.
Xia, Y. F., Qu, Y. R., & Sun, N. L. (2019). Variable selection for semiparametric varying coefficient partially linear model based on modal regression with missing data. Communications in Statistics - Theory and Methods, 48, 5121–5137.
Yang, J., Lu, F., Tian, G. L., Lu, X. W., & Yang, H. (2019). Robust variable selection of varying coefficient partially nonlinear model based on quantile regression. Statistics and Its Interface, 12, 397–413.
Yang, J., Lu, F., & Yang, H. (2018). Quantile regression for robust inference on varying coefficient partially nonlinear models. Journal of the Korean Statistical Society, 47, 172–184.
Yang, J., & Yang, H. (2016). Smooth-threshold estimating equations for varying coefficient partially nonlinear models based on orthogonality-projection method. Journal of Computational and Applied Mathematics, 302, 24–37.
Yao, W., & Li, L. (2014). A new regression model: modal linear regression. Scandinavian Journal of Statistics, 41, 656–671.
Yao, W., Lindsay, B., & Li, R. (2012). Local modal regression. Journal of Nonparametric Statistics, 24, 647–663.
Zhou, Z. Y., & Lin, Z. Y. (2018). Varying coefficient partially nonlinear models with nonstationary regressors. Journal of Statistical Planning and Inference, 194, 47–64.
Zhao, P. X., & Yang, Y. P. (2019). A new orthogonality-based estimation for varying-coefficient partially linear models. Journal of the Korean Statistical Society, 48, 29–39.
Zhang, R. Q., Zhao, W. H., & Liu, J. C. (2013). Robust estimation and variable selection for semiparametric partially linear varying coefficient model based on modal regression. Journal of Nonparametric Statistics, 25, 523–544.
Zhao, W. H., Zhang, R. Q., Liu, J. C., & Lv, Y. Z. (2014). Robust and efficient variable selection for semiparametric partially linear varying coefficient model based on modal regression. Annals of the Institute of Statistical Mathematics, 66, 165–191.
Zhou, X. S., Zhao, P. X., & Wang, X. L. (2017). Empirical likelihood inferences for varying coefficient partially nonlinear models. Journal of Applied Statistics, 44, 474–492.
Funding
This work is supported by the National Natural Science Foundation of China (No.11801438).
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendix A. Proofs of theorems
Proof of Theorem 1
Let \(\delta _{n} =n^{-r/(2r+1)} + a_{n_1}\) and \({\mathbf {v}}=(v_1,\ldots ,v_q)^T\). Define \({\varvec{\beta }}={\varvec{\beta }}_0+\delta _{n}{\mathbf {v}}\).
We first show that, for any given \(\varepsilon > 0\), there exists a sufficiently large constant C such that
$$P\Big \{\sup _{||{\mathbf {v}}||=C}L({\varvec{\beta }}_0+\delta _{n}{\mathbf {v}})< L({\varvec{\beta }}_0)\Big \}\ge 1-\varepsilon ,\qquad (\mathrm{A.1})$$
where \(L({\varvec{\beta }})\) is defined in (2.8). This implies that, with probability at least \(1-\varepsilon\), there exists a local maximizer \(\hat{{\varvec{\beta }}}\) in the ball \(\{{\varvec{\beta }}_0+\delta _{n}{\mathbf {v}}:||{\mathbf {v}}||\le C\}\) such that \(||\hat{{\varvec{\beta }}}-{\varvec{\beta }}_0||=O_{p}(\delta _{n})\). Let \(\pi ({\varvec{\beta }})=L({\varvec{\beta }})- L({\varvec{\beta }}_0)\); by Taylor expansion, we have
Taylor expanding \(g(\mathbf{Z },{\varvec{\beta }})\) around \({\varvec{\beta }}_0\) gives
$$g(\mathbf{Z },{\varvec{\beta }}_0+\delta _{n}{\mathbf {v}})=g(\mathbf{Z },{\varvec{\beta }}_0)+\delta _{n}g{'}(\mathbf{Z },{\varvec{\beta }}_0){\mathbf {v}}+O_{p}(\delta _{n}^{2}).\qquad (\mathrm{A.3})$$
Then, for \({J}_1\), using Taylor expansion and (A.3), we obtain that
where \(\xi _{i}\) lies between \(\varepsilon _{i}\) and \(\varepsilon _{i}-\delta _{n}\mathbf{Q }_{2i}^{T}g{'}(\mathbf{Z },{\varvec{\beta }}_0){\mathbf {v}}\).
By calculating the mean and variance of \({J}_{11}\), we have \({J}_{11}=O_{p}(n\delta _{n}^{2}||{\mathbf {v}}||)\). Similarly, we also have \({J}_{13}=O_{p}(n\delta _{n}^{3}||{\mathbf {v}}||^{3})\).
By Assumptions A5 and A7, we have \({J}_{12}=O_{p}(n\delta _{n}^{2}||{\mathbf {v}}||^{2}).\) Hence, by choosing a sufficiently large C, \({J}_{12}\) dominates both \({J}_{11}\) and \({J}_{13}\) uniformly in \(||{\mathbf {v}}||=C.\)
We next consider \({J}_{2}\). Invoking \(p_{\lambda }(0)=0\) and \(p_\lambda (\beta )>0\) for any \(\beta \ne 0\), together with a Taylor expansion argument, we obtain that
$${J}_{2}\le \sqrt{s_1}\,n\delta _{n}a_{n_1}||{\mathbf {v}}||+n\delta _{n}^{2}b_{n_1}||{\mathbf {v}}||^{2}=O_{p}(n\delta _{n}^{2}||{\mathbf {v}}||)+n\delta _{n}^{2}b_{n_1}||{\mathbf {v}}||^{2}.$$
Then, by \(b_{n_1}\rightarrow 0\), \({J}_{2}\) is also dominated by \({J}_{12}\) uniformly in \(||{\mathbf {v}}||=C\). Hence, by choosing a sufficiently large C, (A.1) holds. This completes the proof of Theorem 1. \(\square\)
Proof of Theorem 2
It is sufficient to show that, for any \({\varvec{\beta }}\) satisfying \(||{\varvec{\beta }} -{\varvec{\beta }}_{0}|| = O_{p}(n^{-r/(2r+1)})\) and some given \(\varepsilon = C n^{-r/(2r+1)}\), when \(n\rightarrow \infty\), with probability tending to one we have
$$\frac{\partial L({\varvec{\beta }})}{\partial \beta _{j}}<0,\quad \text{ for } 0<\beta _{j}<\varepsilon ,\ j=s_1+1,\ldots ,q,\qquad (\mathrm{A.7})$$
and
$$\frac{\partial L({\varvec{\beta }})}{\partial \beta _{j}}>0,\quad \text{ for } -\varepsilon<\beta _{j}<0,\ j=s_1+1,\ldots ,q.\qquad (\mathrm{A.8})$$
By arguments similar to the proofs of (A.3) and (A.4) in Theorem 1, we can obtain that
where \(\xi _{i}\) is between \(\varepsilon _{i}\) and \(\varepsilon _{i}-\delta _{n}\mathbf{Q }_{2i}^{T}g{'}(\mathbf{Z },{\varvec{\beta }}_0){\mathbf {v}}\).
By Assumption A9 and \(\lambda _j n^{r/(2r+1)}>\lambda _{min} n^{r/(2r+1)}\longrightarrow \infty\), the sign of the derivative is determined by that of \({\beta }_j\). Then (A.7) and (A.8) hold, which imply that \({\hat{\beta }}_j=0,\ j=s_1+1,\ldots ,q\), with probability tending to one. \(\square\)
Proof of Theorem 3
Let \(\delta _{n} =n^{-r/(2r+1)} + a_{n_2}\) and \({\mathbf {v}}=({\mathbf {v}}_1^T,\ldots ,{\mathbf {v}}_p^T)^T\) be a pL-dimensional vector. Define \({\varvec{\gamma }}={\varvec{\gamma }}_0+\delta _{n}{\mathbf {v}}\), where \({\varvec{\gamma }}_0=({\varvec{\gamma }}_{01}^T,\ldots ,{\varvec{\gamma }}_{0p}^T)^T\) is the best approximation of \({\varvec{\alpha }}(u)\) in the B-spline space.
We first show that, for any given \(\varepsilon > 0\), there exists a sufficiently large constant C such that
$$P\Big \{\sup _{||{\mathbf {v}}||=C}L({\varvec{\gamma }}_0+\delta _{n}{\mathbf {v}})< L({\varvec{\gamma }}_0)\Big \}\ge 1-\varepsilon ,\qquad (\mathrm{A.10})$$
where \(L({\varvec{\gamma }})\) is defined in (2.12).
Let \(\pi ({\varvec{\gamma }})=L({\varvec{\gamma }})- L({\varvec{\gamma }}_0)\). By a Taylor expansion and simple calculation, we have
where \(\mathbf{R }(u)=(R_1(u),\ldots ,R_p(u))^T\) with \(R_j(u)=\alpha _j(u)-{\varvec{B}}(u)^T {\varvec{\gamma }}_{0j},\ j=1,\ldots ,p\), and \(\xi _i\) is between \(\varepsilon _i+\mathbf{X }_i^T \mathbf{R }(U_i)\) and \(\varepsilon _i+\mathbf{X }_i^T \mathbf{R }(U_i)-\delta _n \mathbf{W }_i^T {\mathbf {v}}\).
By Assumptions A1 and A2 and Corollary 6.21 in Schumaker (1981), we have
Then, by Taylor expansion, we have
where \(\xi _i^*\) is between \(\varepsilon _i\) and \(\varepsilon _i+\mathbf{X }_i^T \mathbf{R }(U_i)\).
By Assumptions A6 and A8 and some calculations, we get
For \(I_2\), we can prove that
Therefore, by choosing a sufficiently large C, \(I_2\) dominates \(I_1\) uniformly in \(|| {\mathbf {v}}||=C\). Similarly to \(I_2\), we can prove that
Since \(a_{n_2}\longrightarrow 0\), we have \(\delta _n \longrightarrow 0\); then \(I_3=o_p(I_2)\) follows from the fact that \(\delta _n|| {\mathbf {v}}||\longrightarrow 0\) with \(||{\mathbf {v}}||=C\). Therefore, \(I_3\) is also dominated by \(I_2\) uniformly in \(|| {\mathbf {v}}||=C\).
For \(I_4\), invoking \(p_\lambda (0)=0\) and a Taylor expansion argument, we obtain that
Then, since \(b_{n_2}\longrightarrow 0\), it is easy to show that \(I_4\) is also dominated by \(I_2\) uniformly in \(||{\mathbf {v}}||=C\). Hence, by choosing a sufficiently large C, (A.10) holds. Therefore, there exists a local maximizer such that
Note that
Thus, invoking \(||H||=O(1)\) and (A.17), we have
Moreover, we can get that
Invoking (A.19), (A.20) and (A.18),
This completes the proof of Theorem 3. \(\square\)
Proof of Theorem 4
It is sufficient to show that, for any \({\varvec{\gamma }}\) satisfying \(||{\varvec{\gamma }} -{\varvec{\gamma }}_{0}|| = O_{p}(n^{-r/(2r+1)})\) and some given small \(\varepsilon = C n^{-r/(2r+1)}\), when \(n\rightarrow \infty\), with probability tending to one,
and
By a proof similar to that of (A.11) in Theorem 3, we can show that
where \(W_{ik}\) denotes the kth element of \(\mathbf{W }_i\), and \(\eta _i\) is between \(Y_i-\mathbf{W }_i^T{\varvec{\gamma }}\) and \(\varepsilon _i+\mathbf{X }_i^T \mathbf{R }(U_i)\).
By Assumption A9 and \(\lambda _k n^{r/(2r+1)}>\lambda _{min} n^{r/(2r+1)}\longrightarrow \infty\), the sign of the derivative is completely determined by that of \(||{\varvec{\gamma }}_k||_\mathbf{H }\). Then (A.22) and (A.23) hold, which imply that \(\hat{{\varvec{\gamma }}}_j=\mathbf{0},\ j=s_2+1,\ldots ,p\), with probability tending to one. Invoking \(\mathrm {sup}_u||\mathbf{B }(u)||=O(1)\), the result of Theorem 4 follows from \({\hat{\alpha }}_j(u)=\mathbf{B }(u)^T\hat{{\varvec{\gamma }}}_j\).
\(\square\)
Cite this article
Xiao, Y., Liang, L. Robust estimation and variable selection for varying-coefficient partially nonlinear models based on modal regression. J. Korean Stat. Soc. 51, 692–715 (2022). https://doi.org/10.1007/s42952-021-00158-w