Abstract
In this paper, we consider variable selection for a class of semiparametric spatial autoregressive models based on the exponential squared loss (ESL). Using an orthogonal projection technique, we propose a novel orthogonality-based variable selection procedure that performs model selection and parameter estimation simultaneously and identifies the significance of spatial effects. Under appropriate conditions, we show that the proposed procedure is consistent and that the resulting estimator possesses the oracle property. Simulation studies and an analysis of the Boston housing price data illustrate the finite-sample performance of the proposed method.
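The robustness of the ESL criterion comes from the way each observation enters the objective through the factor \(\exp(-r_i^2/\gamma)\): observations with small residuals receive weight close to one, while gross outliers are effectively ignored. A minimal Python sketch of this weighting (the function names are ours for illustration, not the paper's; in practice the tuning parameter \(\gamma\) is chosen data-adaptively):

```python
import math

def esl_objective(residuals, gamma):
    """Exponential squared loss (ESL) criterion: sum of exp(-r_i^2 / gamma).
    Maximizing this is equivalent to minimizing sum of 1 - exp(-r_i^2 / gamma);
    large residuals contribute almost nothing, which yields robustness."""
    return sum(math.exp(-r * r / gamma) for r in residuals)

def esl_weight(r, gamma):
    """Implicit weight an observation receives in an ESL fit,
    proportional to exp(-r^2 / gamma)."""
    return math.exp(-r * r / gamma)

# A small residual is almost fully weighted; a gross outlier is ignored.
gamma = 1.0
print(esl_weight(0.1, gamma))   # ≈ 0.990
print(esl_weight(10.0, gamma))  # ≈ 3.7e-44, essentially zero
```

A smaller \(\gamma\) downweights moderate residuals more aggressively (more robust, less efficient under clean data), while \(\gamma \to \infty\) recovers behavior close to ordinary least squares.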
References
Basile, R. (2009). Productivity polarization across regions in Europe: The role of nonlinearities and spatial dependence. International Regional Science Review, 32(1), 92–115.
Case, A. C. (1991). Spatial patterns in household demand. Econometrica, 59(4), 953–965.
Cheng, S., Chen, J., Liu, X. (2019). GMM estimation of partially linear single-index spatial autoregressive model. Spatial Statistics, 31, 100354.
Du, J., Sun, X., Cao, R., et al. (2018). Statistical inference for partially linear additive spatial autoregressive models. Spatial Statistics, 25, 52–67.
Fan, J., Li, R. (2001). Variable selection via nonconcave penalized likelihood and its oracle properties. Journal of the American Statistical Association, 96(456), 1348–1360.
Harrison, D., Rubinfeld, D. L. (1978). Hedonic housing prices and the demand for clean air. Journal of Environmental Economics and Management, 5(1), 81–102.
Jiang, Y., Ji, Q., Xie, B. (2017). Robust estimation for the varying coefficient partially nonlinear models. Journal of Computational and Applied Mathematics, 326, 31–43.
Jiang, Y., Tian, G. L., Fei, Y. (2019). A robust and efficient estimation method for partially nonlinear models via a new MM algorithm. Statistical Papers, 60(6), 2063–2085.
Kelejian, H. H., Prucha, I. R. (1998). A generalized spatial two-stage least squares procedure for estimating a spatial autoregressive model with autoregressive disturbances. The Journal of Real Estate Finance and Economics, 17(1), 99–121.
Kelejian, H. H., Prucha, I. R. (1999). A generalized moments estimator for the autoregressive parameter in a spatial model. International Economic Review, 40(2), 509–533.
Koenker, R., Bassett, G., Jr. (1978). Regression quantiles. Econometrica, 46(1), 33–50.
Kong, E., Xia, Y. (2012). A single-index quantile regression model and its estimation. Econometric Theory, 28(4), 730–768.
Lee, L. F. (2004). Asymptotic distributions of quasi-maximum likelihood estimators for spatial autoregressive models. Econometrica, 72(6), 1899–1925.
Li, T., Guo, Y. (2020). Penalized profile quasi-maximum likelihood method of partially linear spatial autoregressive model. Journal of Statistical Computation and Simulation, 90(15), 2705–2740.
Li, T., Yin, Q., Peng, J. (2020). Variable selection of partially linear varying coefficient spatial autoregressive model. Journal of Statistical Computation and Simulation, 90(15), 2681–2704.
Luo, G., Wu, M. (2021). Variable selection for semiparametric varying-coefficient spatial autoregressive models with a diverging number of parameters. Communications in Statistics-Theory and Methods, 50(9), 2062–2079.
Ord, K. (1975). Estimation methods for models of spatial interaction. Journal of the American Statistical Association, 70(349), 120–126.
Schumaker, L. (1981). Spline functions: Basic theory. New York: Wiley.
Song, Y., Liang, X., Zhu, Y., et al. (2021). Robust variable selection with exponential squared loss for the spatial autoregressive model. Computational Statistics and Data Analysis, 155, 107094.
Su, L., Jin, S. (2010). Profile quasi-maximum likelihood estimation of partially linear spatial autoregressive models. Journal of Econometrics, 157(1), 18–33.
Su, L., Yang, Z. (2007). Instrumental variable quantile estimation of spatial autoregressive models. In Development economics working papers 22476, East Asian Bureau of Economic Research. https://ideas.repec.org/p/eab/develo/22476.html.
Wang, H., Li, G., Jiang, G. (2007). Robust regression shrinkage and consistent variable selection through the LAD-Lasso. Journal of Business & Economic Statistics, 25(3), 347–355.
Wang, K., Lin, L. (2016). Robust structure identification and variable selection in partial linear varying coefficient models. Journal of Statistical Planning and Inference, 174, 153–168.
Wang, X., Jiang, Y., Huang, M., et al. (2013). Robust variable selection with exponential squared loss. Journal of the American Statistical Association, 108(502), 632–643.
Zhao, P., Gan, H., Cheng, S., et al. (2021). Orthogonality based penalized GMM estimation for variable selection in partially linear spatial autoregressive models. Communications in Statistics-Theory and Methods, 52, 1676–1691.
Zou, H. (2006). The adaptive lasso and its oracle properties. Journal of the American Statistical Association, 101(476), 1418–1429.
Zou, H., Yuan, M. (2008). Composite quantile regression and the oracle model selection theory. The Annals of Statistics, 36(3), 1108–1126.
Acknowledgements
This research is supported by Natural Science Foundation of Shandong Province of China projects (ZR2021MA077 and ZR2021MA048).
Appendix
Proof of Theorem 1
Let \(\xi = n^{-1/2} + a_n\). Similar to Fan and Li (2001), we first prove that for any given \(\epsilon > 0\), there exists a constant C such that
where \({{\textbf {u}}}\) is a \((p+1)\)-dimensional vector with \(\Vert {{\textbf {u}}}\Vert = C\) for a sufficiently large constant C. This implies that, with probability at least \(1 - \epsilon\), there exists a local maximum in the ball \(\{\theta _0 + \xi {{\textbf {u}}} : \Vert {{\textbf {u}}}\Vert \le C \}\). Hence there exists a local maximizer \({\hat{\theta }}_n\) such that \(\Vert {\hat{\theta }}_n - \theta _0\Vert = O_p(\xi )\). Let
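In the form standard for this argument (cf. Fan and Li 2001), the inequality referred to as (17) can be sketched in the paper's notation as:

```latex
P\left\{ \sup_{\Vert \mathbf{u} \Vert = C}
\ell(\theta_0 + \xi \mathbf{u}) < \ell(\theta_0) \right\} \ge 1 - \epsilon .
```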
Since \(p_{\lambda _j}(0)=0\) for \(j = 1,\ldots , p+1\) and \(\gamma _n - \gamma _0 = o_p(1)\), by Taylor's expansion we have
Note that \(n^{-1/2} D(\theta _0, \gamma _0)=O_p(1)\). Therefore, the order of the first term on the right side of Eq. (18) is equal to \(O_p(n^{1/2} \xi ) = O_p(n \xi ^2)\). By choosing a sufficiently large C, the second term dominates the first term uniformly in \(\Vert {{\textbf {u}}}\Vert = C\). Since \(b_n=o_p(1)\), the third term is also dominated by the second term of (18). Therefore, (17) holds by choosing a sufficiently large C. The proof of Theorem 1 is completed. \(\square\)
Proof of Theorem 2(a)
We now prove the sparsity. We will show that, with probability tending to 1, for any \(\theta _1\) satisfying \(\theta _1-\theta _{01} = O_p(n^{-1/2})\), some small \(\epsilon _n = Cn^{-1/2}\), and \(j=s+1,\ldots , p+1\), we have \(\partial \ell (\theta )/ \partial \theta _j >0\) for \(0<\theta _j<\epsilon _n\), and \(\partial \ell (\theta )/ \partial \theta _j <0\) for \(-\epsilon _n<\theta _j<0\). Let
By Taylor’s expansion, we have
where \(\theta ^*\) lies between \(\theta\) and \(\theta _0\). Here we assume \(\left| (n-L)^{-1}\frac{ \partial ^3 Q_n(\theta ,\gamma _n)}{\partial \theta _j \partial \theta _l \partial \theta _k} \right| \le M_{jlk}\), where \(E(M_{jlk}) < \infty\). Note that
and
Since \(b_n = o_p(1)\) and \(\sqrt{n}a_n = o_p(1)\), we obtain \(\theta - \theta _0 = O_p(n^{-1/2})\). By \(\sqrt{n}(\gamma _n-\gamma _0) = o_p(1)\), we have
Since \(\frac{1}{\min _{s+1 \le j \le p+1} \sqrt{n}\lambda _j } =o_p(1)\) and \(\lim \inf _{n\rightarrow \infty } \lim \inf _{t\rightarrow 0^+} \lambda ^{-1} p_{\lambda }'(|t|) >0\) with probability 1, the sign of the derivative is completely determined by that of \(\theta _j\). This completes the proof of Theorem 2(a).
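The sign argument can be sketched in the standard penalized-likelihood form (cf. Fan and Li 2001): for \(j = s+1, \ldots, p+1\),

```latex
\frac{\partial \ell(\theta)}{\partial \theta_j}
= n \lambda_j \left\{ -\lambda_j^{-1} p_{\lambda_j}'(|\theta_j|)\,
\operatorname{sign}(\theta_j)
+ O_p\!\left( \frac{1}{\sqrt{n}\,\lambda_j} \right) \right\},
```

so that when \(\sqrt{n}\lambda_j \to \infty\) the penalty term dominates the stochastic remainder, and the sign of the derivative is determined by \(\operatorname{sign}(\theta_j)\).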
Proof of Theorem 2(b)
It can be shown easily that there exists a \({\hat{\theta }}_{n1}\) in Theorem 1 that is a \(\sqrt{n}\)-consistent local maximizer of \(\ell \{(\theta _1,0)\}\), satisfying
Note that \({\hat{\theta }}_{n1}\) is a consistent estimator,
The above equation can be rewritten as follows
and
Since \(\sqrt{n}(\gamma _n- \gamma _0)=o_p(1)\), invoking Slutsky’s lemma and the Lindeberg–Feller central limit theorem, we have
where \(\varSigma _1 = {\text {diag}}\{p_{\lambda _{1}}''(|\theta _{01}|) ,\ldots , p_{\lambda _{s}}''(|\theta _{0s}|) \}\), \(\varSigma _2 = \textrm{Cov}(\exp (-r^2/\gamma _0)\frac{2r}{\gamma _0} {\tilde{Z}}_{i1})\), \(\varDelta = (p_{\lambda _{1}}'(|\theta _{01}|)\textrm{sign}(\theta _{01}),\ldots , p_{\lambda _{s}}'(|\theta _{0s}|)\textrm{sign}(\theta _{0s}) )^T\), and \(I_1(\theta _{01},\gamma _0) = \frac{2}{\gamma _0} E[\exp (-r^2/\gamma _0)( \frac{2r^2}{\gamma _0}-1) ] \times (E{\tilde{Z}}_{i1}{\tilde{Z}}_{i1}^T)\). Then the proof of Theorem 2(b) is completed. \(\square\)
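With \(\varSigma_1\), \(\varSigma_2\), \(\varDelta\), and \(I_1(\theta_{01},\gamma_0)\) as defined above, the limiting distribution takes the familiar sandwich form; a sketch of the standard statement (cf. Fan and Li 2001) in the paper's notation is

```latex
\sqrt{n}\left( \hat{\theta}_{n1} - \theta_{01}
+ \{ I_1(\theta_{01},\gamma_0) + \varSigma_1 \}^{-1} \varDelta \right)
\xrightarrow{d}
N\!\left( 0,\;
\{ I_1(\theta_{01},\gamma_0) + \varSigma_1 \}^{-1}
\varSigma_2\,
\{ I_1(\theta_{01},\gamma_0) + \varSigma_1 \}^{-1} \right).
```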
Proof of Theorem 3
Let \(R(U_i) = g(U_i)-B(U_i)^T \eta\) and \(R(U) = (R(U_1), \ldots , R(U_n))^T\). For ease of exposition, set \(Z = (WY, X)\) and \(g(U) =(g(U_1), \ldots , g(U_n))^T\). Similar to Zhao et al. (2021), a simple calculation gives that
By \(\Vert R(u)\Vert =O(n^{-v/(2v+1)})\), we have
Note that \(E\{B(U_i)\epsilon _i|X_i,U_i\}=0\); then, by the central limit theorem, we have \(n^{-1/2}\sum _{i=1}^n B(U_i)\epsilon _i = O_p(1)\). Therefore, we have
By Theorem 1, we have \(\sqrt{n}({\hat{\theta }}_n - \theta _0) = O_p(1)\). Similar to the above proof, we can obtain \(\Vert R_3\Vert = O_p(n^{-1/2})\). Hence, we have
Therefore,
Note that \(\Vert \int _0^1B(u)B(u)^Tdu\Vert = O(1)\), and thus invoking (22) gives
From \(\Vert R(u)\Vert =O(n^{-v/(2v+1)})\), we have
As a result, \(\Vert {\hat{g}}(u) - g(u) \Vert ^2 = O_p(n^{-2v/(2v+1)})\). This completes the proof of Theorem 3. \(\square\)
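The rate follows from the usual decomposition of the spline estimator into an estimation-error term and an approximation-bias term; with \(\hat{\eta}\) denoting the estimated spline coefficient vector, a sketch is

```latex
\Vert \hat{g}(u) - g(u) \Vert^2
\le 2\, \Vert B(u)^T (\hat{\eta} - \eta) \Vert^2
   + 2\, \Vert R(u) \Vert^2
 = O_p\!\left( n^{-2v/(2v+1)} \right),
```

since \(\Vert R(u)\Vert^2 = O(n^{-2v/(2v+1)})\) by the spline approximation order, and the estimation-error term is of no larger order.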
Wang, X., Shao, J., Wu, J. et al. Robust variable selection with exponential squared loss for partially linear spatial autoregressive models. Ann Inst Stat Math 75, 949–977 (2023). https://doi.org/10.1007/s10463-023-00870-w