Abstract
Spatial dependence among cross-sectional units has received growing attention in econometrics, yet in statistical modeling analysts rarely have a priori knowledge of how the response variable depends on the independent variables. This paper proposes an automatic structure identification and variable selection procedure for the semiparametric spatial autoregressive model, based on the generalized method of moments and smooth-threshold estimating equations. The method is easy to implement and does not require solving any convex optimization problem. Model identification consistency is established theoretically, in the sense that the proposed method automatically separates the linear and zero components from the varying ones with probability approaching one. Computational details and tuning parameter selection are discussed. Monte Carlo simulations are conducted to demonstrate the finite-sample performance of the proposed procedure. Two empirical applications, to Boston housing price data and New York leukemia data, are also considered.
References
Ai, C., Zhang, Y.: Estimation of partially specified spatial panel data models with fixed-effects. Economet. Rev. 36, 6–22 (2017)
Carroll, R.J., Fan, J., Gijbels, I., Wand, M.P.: Generalized partially linear single-index models. J. Am. Stat. Assoc. 92, 477–489 (1997)
Chen, Y., Wang, Q., Yao, W.: Adaptive estimation for varying coefficient models. J. Multivar. Anal. 137, 17–31 (2015)
Fan, J., Huang, T.: Profile likelihood inferences on semiparametric varying-coefficient partially linear models. Bernoulli 11, 1031–1057 (2005)
Fan, J., Li, R.: Variable selection via nonconcave penalized likelihood and its oracle properties. J. Am. Stat. Assoc. 96, 1348–1360 (2001)
Geyer, C.J.: On the asymptotics of constrained M-estimation. Ann. Stat. 22, 1993–2010 (1994)
Gilley, O.W., Pace, R.K.: On the Harrison and Rubinfeld data. J. Environ. Econ. Manag. 31, 403–405 (1996)
Harrison, D., Rubinfeld, D.L.: Hedonic housing prices and the demand for clean air. J. Environ. Econ. Manag. 5, 81–102 (1978)
Hu, T., Xia, Y.: Adaptive semi-varying coefficient model selection. Stat. Sin. 22, 575–599 (2012)
Huang, J., Wu, C., Zhou, L.: Varying-coefficient models and basis function approximation for the analysis of repeated measurements. Biometrika 89, 111–128 (2002)
Huang, J., Wei, F., Ma, S.: Semiparametric regression pursuit. Stat. Sin. 22, 1403–1426 (2012)
Jencks, C., Mayer, S.: The social consequences of growing up in a poor neighborhood. Inner-city poverty in the United States. National Academy, Washington (1990)
Jeong, H., Lee, L.F.: Spatial dynamic models with intertemporal optimization: specification and estimation. J. Econom. 218, 82–104 (2020)
Jiang, J., Zhou, H., Jiang, X., Peng, J.: Generalized likelihood ratio tests for the structure of semiparametric additive models. Can. J. Stat. 35, 381–398 (2007)
Kai, B., Li, R., Zou, H.: New efficient estimation and variable selection methods for semiparametric varying-coefficient partially linear models. Ann. Stat. 39, 305–332 (2011)
Kelejian, H.H., Prucha, I.R.: A generalized moments estimator for the autoregressive parameter in a spatial model. Int. Econ. Rev. 40, 509–533 (1999)
Kelejian, H.H., Prucha, I.R.: Specification and estimation of spatial autoregressive models with autoregressive and heteroskedastic disturbances. J. Econom. 157, 53–67 (2010)
Lee, L.F.: Asymptotic distribution of quasi-maximum likelihood estimators for spatial autoregressive models. Econometrica 72, 1899–1925 (2004)
Lee, L.F.: GMM and 2SLS estimation of mixed regressive, spatial autoregressive models. J. Econom. 137, 489–514 (2007)
Lee, L.F., Liu, X.: Efficient GMM estimation of high order spatial autoregressive models with autoregressive disturbances. Econom. Theor. 26, 187–230 (2010)
Li, R., Liang, H.: Variable selection in semiparametric regression model. Ann. Stat. 36, 261–286 (2008)
Lian, H.: Semiparametric estimation of additive quantile regression models by two-fold penalty. J. Bus. Econ. Stat. 30, 337–350 (2012)
Lian, H., Chen, X., Yang, J.Y.: Identification of partially linear structure in additive models with an application to gene expression prediction from sequences. Biometrics 68, 437–445 (2012)
Lian, H., Liang, H., Ruppert, D.: Separation of covariates into nonparametric and parametric parts in high-dimensional partially linear additive models. Stat. Sin. 25, 591–607 (2015)
Lin, X., Lee, L.F.: GMM estimation of spatial autoregressive models with unknown heteroskedasticity. J. Econom. 157, 34–52 (2010)
Liu, X., Lee, L.F., Bollinger, C.R.: An efficient GMM estimator of spatial autoregressive models. J. Econom. 159, 303–319 (2010)
Mack, Y., Silverman, B.: Weak and strong uniform consistency of kernel regression estimates. Probab. Theory Relat. Fields 61, 405–415 (1982)
Noh, H., Van Keilegom, I.: Efficient model selection in semivarying coefficient models. Electron. J. Stat. 6, 2519–2534 (2012)
Olejnik, J., Olejnik, A.: QML estimation with non-summable weight matrices. J. Geogr. Syst. 22, 469–495 (2020)
Ord, J.K.: Estimation methods for models of spatial interaction. J. Am. Stat. Assoc. 70, 120–126 (1975)
Pace, R.K., Gilley, O.W.: Using the spatial configuration of the data to improve estimation. J. Real Estate Finance Econ. 14, 333–340 (1997)
Paelinck, J.H., Klaassen, L.H.: Spatial Econometrics. Gower Press, Aldershot (1979)
Smirnov, O., Anselin, L.: Fast maximum likelihood estimation of very large spatial autoregressive models: a characteristic polynomial approach. Comput. Stat. Data Anal. 35, 301–319 (2001)
Su, L.: Semiparametric GMM estimation of spatial autoregressive models. J. Econom. 167, 543–560 (2012)
Su, L., Jin, S.: Profile quasi-maximum likelihood estimation of spatial autoregressive models. J. Econom. 157, 18–33 (2010)
Su, L., Yang, Z.: Instrumental variable quantile estimation of spatial autoregressive models. Working paper, Singapore Management University (2011)
Sun, Y., Wu, Y.: Estimation and testing for a partially linear single-index spatial regression model. Spat. Econ. Anal. 13, 473–489 (2018)
Sun, Y., Yan, H., Zhang, W., Lu, Z.: A semiparametric spatial dynamic model. Ann. Stat. 42, 700–727 (2014)
Sun, Y., Zhang, Y., Huang, J.Z.: Estimation of a semiparametric varying-coefficient mixed regressive spatial autoregressive model. Econom. Stat. 9, 140–155 (2019)
Tibshirani, R.: Regression shrinkage and selection via the Lasso. J. R. Stat. Soc. Ser. B (Methodology) 58, 267–288 (1996)
Ueki, M.: A note on automatic variable selection using smooth-threshold estimating equations. Biometrika 96, 1005–1011 (2009)
Waller, L.A., Gotway, C.A.: Applied Spatial Statistics for Public Health Data. John Wiley, Hoboken (2004)
Wakefield, J.: Disease mapping and spatial regression with count data. Biostatistics 8, 158–183 (2007)
Wang, D., Kulasekera, K.B.: Parametric component detection and variable selection in varying-coefficient partially linear models. J. Multivar. Anal. 112, 117–129 (2012)
Wang, H., Leng, C.: Unified lasso estimation via least squares approximation. J. Am. Stat. Assoc. 102, 1039–1048 (2007)
Wang, H., Xia, Y.: Shrinkage estimation of the varying coefficient model. J. Am. Stat. Assoc. 104, 747–757 (2009)
Wang, H.J., Zhu, Z., Zhou, J.: Quantile regression in partially linear varying coefficient models. Ann. Stat. 37, 3841–3866 (2009)
Wei, H., Sun, Y.: Heteroskedasticity-robust semi-parametric GMM estimation of a spatial model with space-varying coefficients. Spat. Econ. Anal. 12, 113–128 (2017)
Xia, Y., Zhang, W., Tong, H.: Efficient estimation for semivarying-coefficient models. Biometrika 91, 661–681 (2004)
Zhang, H.H., Cheng, G., Liu, Y.: Linear or nonlinear? Automatic structure discovery for partially linear models. J. Am. Stat. Assoc. 106, 1099–1112 (2011)
Zou, H.: The adaptive LASSO and its oracle properties. J. Am. Stat. Assoc. 101, 1418–1429 (2006)
Acknowledgements
The authors thank the Editor, Associate Editor and referees for their valuable comments, which greatly improved the quality of the paper. This research was supported by the Natural Science Foundation of Hunan Province (Grant 2022JJ30368), the National Natural Science Foundation of China (Grants 11801168, 11801169, 12071124) and the Discovery Grants (RG/PIN261567-2013) from the Natural Sciences and Engineering Research Council of Canada.
Ethics declarations
Ethical standards
The authors declare that this article does not contain any experiments that violate the current laws of China.
Conflict of interest
The authors declare that they have no conflict of interest.
Appendix
The following regularity conditions are required for the technical proofs.
- (C1): All diagonal elements of the weight matrix W are zero. The matrix \(I_n-\rho W\) is nonsingular for all \(\rho \) in its compact support \(\left( -1/|\lambda _{min}(W)|, 1/\lambda _{max}(W) \right) \), where \(\lambda _{min}(W)\) and \(\lambda _{max}(W)\), respectively, denote the minimum and maximum eigenvalues of W. Besides, the matrices W and \((I_n-\rho _0 W)^{-1}\) are uniformly bounded in absolute value in both row and column sums.
- (C2): The elements of the matrix \(H_n\) are uniformly bounded, and the regressors \(x_i\), \(i=1,\ldots ,n\), are nonstochastic with bounded support.
- (C3): The density function \(f_u(\cdot )\) of the random variable u is positive and has a continuous second derivative. The matrix \(\varGamma (u)\) is nonsingular and Lipschitz continuous.
- (C4): The functions \(\beta _{0l}(\cdot )\), \(l=1,\ldots ,q\), are rth continuously differentiable over the interval (0, 1) with \(r\ge 2\).
- (C5): The kernel function \(K(\cdot )\) is a symmetric density function with compact support \([-1,1]\) and a bounded first-order derivative.
- (C6): The random errors \(\varepsilon _i\)’s are independent, and \(E (|\varepsilon _i|^{2+\varpi })<\infty \) for some \(\varpi >0\).
- (C7): At least one nonconstant regressor in X must have a significant effect on the response variable, or \(\beta _0(u)\ne 0_q\) over at least one nonempty interval.
Conditions (C1) and (C2) are frequently postulated in the spatial econometrics literature, including [16, 19, 34, 39]. In particular, the uniform boundedness of W and \((I_n-\rho W)^{-1}\) in condition (C1) limits the spatial correlation among spatial units to a manageable degree, which plays an important role in the asymptotic properties of the estimators. When W is row-normalized, condition (C1) is satisfied with \(\rho \in (-1,1)\); see [18] for further discussion. Conditions (C3)–(C5) are regularity conditions commonly assumed in local polynomial regression for the VCPLM, as in [4, 15] and the references therein. Condition (C6) is a necessary condition for estimation consistency. Condition (C7) rules out a pure spatial autoregressive model, so that valid instrumental variables can be generated and the proposed local linear GMM estimators are consistent; see [19] and the references therein for a deeper discussion of this issue.
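The checks in condition (C1) can be illustrated numerically for a fixed sample size. The following Python sketch is only an illustration under stated assumptions: the four-unit chain adjacency matrix, the value \(\rho = 0.5\) and the helper names `row_normalize`, `rho_interval` and `check_c1` are hypothetical and do not come from the paper. It row-normalizes W, computes the interval \(\left( -1/|\lambda _{min}(W)|, 1/\lambda _{max}(W) \right) \), and reports the largest absolute row and column sums of \((I_n-\rho W)^{-1}\).

```python
import numpy as np

def row_normalize(W):
    """Scale each row of W to sum to one (all-zero rows are left as zeros)."""
    W = np.asarray(W, dtype=float)
    row_sums = W.sum(axis=1, keepdims=True)
    return np.divide(W, row_sums, out=np.zeros_like(W), where=row_sums != 0)

def rho_interval(W):
    """Return (-1/|lambda_min(W)|, 1/lambda_max(W)) as in condition (C1).

    Assumption: W has real eigenvalues (true for a row-normalized
    symmetric adjacency matrix); only the real parts are used.
    """
    eig = np.linalg.eigvals(np.asarray(W, dtype=float)).real
    return -1.0 / abs(eig.min()), 1.0 / eig.max()

def check_c1(W, rho):
    """Finite-sample checks suggested by condition (C1) for one value of rho."""
    W = np.asarray(W, dtype=float)
    n = W.shape[0]
    A = np.eye(n) - rho * W
    report = {
        "zero_diagonal": bool(np.allclose(np.diag(W), 0.0)),
        "nonsingular": bool(np.linalg.matrix_rank(A) == n),
    }
    if report["nonsingular"]:
        A_inv = np.linalg.inv(A)
        # Largest absolute row and column sums of (I_n - rho W)^{-1}.
        report["max_abs_row_sum_inv"] = float(np.abs(A_inv).sum(axis=1).max())
        report["max_abs_col_sum_inv"] = float(np.abs(A_inv).sum(axis=0).max())
    return report

# Hypothetical example: four units on a line, neighbours share an edge.
adjacency = np.array([[0, 1, 0, 0],
                      [1, 0, 1, 0],
                      [0, 1, 0, 1],
                      [0, 0, 1, 0]])
W = row_normalize(adjacency)
lo, hi = rho_interval(W)
report = check_c1(W, rho=0.5)
print(lo, hi, report)
```

Uniform boundedness in (C1) is a statement about the whole sequence of matrices as n grows, so a check at a single n such as this one can only flag obvious violations; it cannot verify the condition.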
To proceed with the proofs of our theoretical properties, we first quote the following lemma which will be frequently used in the sequel.
Lemma 6.1
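Let \((X_1,Y_1),\ldots ,(X_n,Y_n)\) be i.i.d. random vectors, where the \(Y_i\)'s are scalar random variables, and let f denote the joint density of (X, Y). Let K be a bounded positive function with bounded support satisfying a Lipschitz condition. Further assume that \(\sup _{\textbf{x}} \int |y|^r f(\textbf{x},y)dy < \infty \) and \(E|Y|^r < \infty \). Then,
\[ \sup _{\textbf{x}} \left| \frac{1}{n}\sum _{i=1}^{n} \left\{ K_h(X_i-\textbf{x})Y_i - E\left[ K_h(X_i-\textbf{x})Y_i \right] \right\} \right| = O_p\left( \left\{ \frac{\log (1/h)}{nh} \right\} ^{1/2} \right) , \]
provided that \(n^{2\varepsilon -1}h \rightarrow \infty \) for some \(\varepsilon < 1-r^{-1}\), where \(K_h(\cdot )=K(\cdot /h)/h\).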
For the proof of Lemma 6.1, see [27].
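The uniform deviation of a kernel-weighted average from its mean, which is the quantity controlled by the Mack and Silverman result [27], can be examined in a small simulation. The Python sketch below is an illustration only. The Epanechnikov kernel, the uniform design on (0, 1), the regression function \(m(x)=x\), the noise level and all function names are assumptions made for this example and are not taken from the paper. It compares the largest deviation over an interior grid with the order \(\{\log (1/h)/(nh)\}^{1/2}\).

```python
import numpy as np

def epanechnikov(t):
    """Symmetric, Lipschitz density supported on [-1, 1]."""
    t = np.asarray(t, dtype=float)
    return np.where(np.abs(t) <= 1.0, 0.75 * (1.0 - t**2), 0.0)

def kernel_average(x_grid, X, Y, h):
    """n^{-1} sum_i K_h(X_i - x) Y_i for each x in x_grid, with K_h(t) = K(t/h)/h."""
    scaled = (X[None, :] - x_grid[:, None]) / h
    return (epanechnikov(scaled) / h * Y[None, :]).mean(axis=1)

def sup_deviation(n, h, seed=0, sigma=0.5):
    """Largest |kernel average - its expectation| over a grid in [h, 1 - h]."""
    rng = np.random.default_rng(seed)
    X = rng.uniform(0.0, 1.0, size=n)
    Y = X + sigma * rng.standard_normal(n)   # assumed model: m(x) = x plus noise
    x_grid = np.linspace(h, 1.0 - h, 201)
    # For x in [h, 1 - h] and X ~ U(0, 1), E[K_h(X - x) Y] equals x, because
    # K is symmetric, integrates to one, and m is linear.
    expected = x_grid
    return float(np.max(np.abs(kernel_average(x_grid, X, Y, h) - expected)))

def reference_rate(n, h):
    """The order {log(1/h) / (n h)}^{1/2} used for comparison."""
    return float(np.sqrt(np.log(1.0 / h) / (n * h)))

for n in (500, 2000, 8000):
    h = n ** (-1.0 / 5.0)                    # assumed bandwidth h = n^{-1/5}
    print(n, sup_deviation(n, h), reference_rate(n, h))
```

A single seeded run like this can only suggest that the deviation is of the same order as the reference rate; it is not evidence for the lemma, which is an asymptotic statement in probability.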
Proof of Theorem 2.2
The asymptotic normality of \(\hat{\beta }(u;\hat{\rho })\) can be directly obtained from [48], so we focus on the part of \(\hat{\beta }^{\prime }(u;\hat{\rho })\). Adopting the notations in Sect. 2, it follows from Eq. (2.1) with h replaced by \(h_1\) that
where \({\textbf {0}}_p\) denotes the \(p\times p\) zero matrix. Thus, we have
where
Next, we mainly consider \(R_{n1}(u)\) since the properties of \(R_{n2}(u)\) and \(R_{n3}(u)\) can be similarly derived from [48]. Note that
Each element of the above matrix is in the form of a kernel regression. Hence, by Lemma 6.1, we have
where \(c_n= \{\log (1/h_1)/ nh_1 \}^{1/2}+h_1^2\).
In addition, since the functions \(\beta _{0}(\cdot )\) are presumed to be second-order continuously differentiable, it follows from Taylor expansion that
where \(g_0(u)=\left( \beta _0(u)^T,\beta ^{\prime }_0(u)^T \right) ^T\). By direct calculation and Lemma 6.1, it follows that
Therefore, applying (6.2) and (6.3) to \(R_{n1}(u)\) yields
where the last equality holds due to the bandwidth assumption \(h_1=O(n^{-1/5})\).
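As a quick check of the orders involved, the following worked calculation evaluates \(c_n\) under the illustrative assumption that \(h_1 = c\,n^{-1/5}\) exactly for some constant \(c>0\); the constant c is introduced here only for this example.

```latex
% Illustration only: assumes h_1 = c n^{-1/5} exactly, with a constant c > 0.
\begin{align*}
n h_1 &= c\, n^{4/5}, \qquad \log(1/h_1) = \tfrac{1}{5}\log n - \log c, \\
\left\{ \frac{\log(1/h_1)}{n h_1} \right\}^{1/2}
  &= O\!\left( n^{-2/5} (\log n)^{1/2} \right), \qquad h_1^2 = c^2 n^{-2/5}, \\
c_n &= \left\{ \frac{\log(1/h_1)}{n h_1} \right\}^{1/2} + h_1^2
  = O\!\left( n^{-2/5} (\log n)^{1/2} \right).
\end{align*}
```

Under this choice both terms of \(c_n\) are of order \(n^{-2/5}\) up to a logarithmic factor, which is in line with the \(n^{2/5}\) convergence rate of \(\hat{\beta }(u)\) used in the proof of Theorem 3.1.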
For the terms \(R_{n2}(u)\) and \(R_{n3}(u)\), following the same lines as the proof of Theorem 2 in [48], with the 2-dimensional index variable \({\textbf {s}}\) replaced by the scalar u and \((I_p,{\textbf {0}}_{2p})\) in \(\varDelta _{n2}({\textbf {s}}),\varDelta _{n3}({\textbf {s}})\) replaced by \(({\textbf {0}}_{p},I_p)\), we obtain
Consequently, combining results (6.1), (6.4) and (6.5) yields
This completes the proof. \(\square \)
Proof of Theorem 3.1
By Theorem 2.2 and the assumption \(h_1=O(n^{-1/5})\), \(\hat{\beta }(u)\) and \(\hat{\beta }^{\prime }(u)\) are consistent estimators of \(\beta _0(u)\) and \(\beta _0^{\prime }(u)\) with convergence rates \(n^{2/5}\) and \(n^{1/5}\), respectively. This fact plays a crucial role throughout the proof. Let \(\eta >0\) be an arbitrary positive number. The proof proceeds in the following steps.
Step 1: If \(k \in {\mathcal {A}}_{v}\), we have \(\Vert n^{-1/2}{\hat{a}}_k\Vert >0\) and \(\Vert n^{-1/2}{\hat{b}}_k\Vert >0\). Thus,
by condition \(n^{2/5}\lambda _1 \rightarrow 0\), and
by condition \(n^{1/5}\lambda _2 \rightarrow 0\). This implies,
Accordingly, we have
Step 2: If \(k \in {\mathcal {A}}_{v}^c\), then either \(k \in {\mathcal {A}}_{c}\) or \(k \in {\mathcal {A}}_{z}\).
Step 2.1: If \(k \in {\mathcal {A}}_{c}\), then \(\Vert n^{-1/2}{\hat{a}}_k\Vert >0\) and \(\Vert n^{-1/2}{\hat{b}}_k\Vert =O_p(n^{-1/5})\). By Step 1, we have \(P \left( \hat{\delta }_{1k} < 1 ~ \text{ for } \text{ all }~ k\in {\mathcal {A}}_{c} \right) \rightarrow 1\). Moreover, by condition \(n^{(1+\gamma _2)/5}\lambda _2 \rightarrow \infty \),
which is equivalent to \(P \left( \hat{\delta }_{2k} = 1 ~ \text{ for } \text{ all }~ k\in {\mathcal {A}}_{c} \right) \rightarrow 1\). Consequently, we obtain
Step 2.2: If \(k \in {\mathcal {A}}_{z}\), then \(\Vert n^{-1/2}{\hat{a}}_k\Vert =O_p(n^{-2/5})\) and \(\Vert n^{-1/2}{\hat{b}}_k\Vert =O_p(n^{-1/5})\). By Step 2.1, we have \(P \left( \hat{\delta }_{2k} = 1 ~ \text{ for } \text{ all }~ k\in {\mathcal {A}}_{z} \right) \rightarrow 1\). Besides, by condition \(n^{2(1+\gamma _1)/5}\lambda _1 \rightarrow \infty \),
which is equivalent to \(P \left( \hat{\delta }_{1k} = 1 ~ \text{ for } \text{ all }~ k\in {\mathcal {A}}_{z} \right) \rightarrow 1\). As a result,
Combining expressions (6.6), (6.7) and (6.8), it follows that
Step 3: If \(k \in {\mathcal {A}}_{c}\), from Step 2.1, we have \(P \left( \hat{\delta }_{1k} < 1,~\hat{\delta }_{2k} = 1, ~ \text{ for } \text{ all }~ k\in {\mathcal {A}}_{c} \right) \rightarrow 1\). If \(k \in {\mathcal {A}}_{c}^c\), then \(k \in {\mathcal {A}}_{v}\) or \(k \in {\mathcal {A}}_{z}\). When \(k \in {\mathcal {A}}_{v}\), it follows from Step 1 that \(P \left( \hat{\delta }_{1k}< 1, ~\hat{\delta }_{2k}<1, ~ \text{ for } \text{ all }~ k\in {\mathcal {A}}_{v} \right) \rightarrow 1\), while \(P \left( \hat{\delta }_{1k} = 1,~\hat{\delta }_{2k} = 1, ~ \text{ for } \text{ all }\right. \left. k\in {\mathcal {A}}_{z} \right) \rightarrow 1\) by Step 2.2 when \(k \in {\mathcal {A}}_{z}\). Therefore, we have
Step 4: If \(k \in {\mathcal {A}}_{z}\), from Step 2.2, we have \(P \left( \hat{\delta }_{1k} = 1,~\hat{\delta }_{2k} = 1, ~ \text{ for } \text{ all }~ k\in {\mathcal {A}}_{z} \right) \rightarrow 1\). If \(k \in {\mathcal {A}}_{z}^c\), then \(k \in {\mathcal {A}}_{v}\) or \(k \in {\mathcal {A}}_{c}\). If \(k \in {\mathcal {A}}_{v}\), \(P \left( \hat{\delta }_{1k}< 1, ~\hat{\delta }_{2k}<1, ~ \text{ for } \text{ all }~ k\in {\mathcal {A}}_{v} \right) \rightarrow 1\) holds from Step 1. If \(k \in {\mathcal {A}}_{c}\), we have \(P \left( \hat{\delta }_{1k} < 1,~\hat{\delta }_{2k} = 1, ~ \text{ for } \text{ all }~ k\in {\mathcal {A}}_{c} \right) \rightarrow 1\) from Step 2.1. Hence,
Note that the sets \({\mathcal {A}}_{v}\), \({\mathcal {A}}_{c}\) and \({\mathcal {A}}_{z}\) constitute a partition of the covariate indexes. Namely, \({\mathcal {A}}_{v}\cup {\mathcal {A}}_{c}\cup {\mathcal {A}}_{z}=\{1,2,\ldots ,p\}\) and the intersection of any two sets is empty. This together with results (6.9), (6.10) and (6.11) implies
This completes the proof. \(\square \)
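The classification logic used in Steps 1 to 4 can be sketched in Python. This is a minimal illustration under assumptions, not the authors' implementation. The weight form \(\hat{\delta } = \min \{1, \lambda /\Vert \cdot \Vert ^{1+\gamma }\}\) is assumed here, following the smooth-threshold construction of [45]. The tuning parameters \(\lambda _1 = n^{-(2+\gamma _1)/5}\) and \(\lambda _2 = n^{-(1+\gamma _2/2)/5}\) are one hypothetical choice that satisfies the rate conditions quoted in the proof and are not the selector discussed in the paper. The norms passed in stand for \(\Vert n^{-1/2}{\hat{a}}_k\Vert \) and \(\Vert n^{-1/2}{\hat{b}}_k\Vert \), and the numerical values and function names are made up for the example.

```python
import numpy as np

def smooth_threshold(norm, lam, gamma):
    """Assumed weight: delta = min(1, lam / norm^(1 + gamma)); delta = 1 if norm is 0."""
    norm = float(norm)
    if norm == 0.0:
        return 1.0
    return min(1.0, lam / norm ** (1.0 + gamma))

def example_lambdas(n, gamma1=1.0, gamma2=1.0):
    """Hypothetical tuning parameters meeting the rate conditions in the proof:
    n^{2/5} lam1 -> 0,  n^{2(1+gamma1)/5} lam1 -> infinity,
    n^{1/5} lam2 -> 0,  n^{(1+gamma2)/5} lam2 -> infinity.
    """
    lam1 = n ** (-(2.0 + gamma1) / 5.0)
    lam2 = n ** (-(1.0 + gamma2 / 2.0) / 5.0)
    return lam1, lam2

def classify(a_norms, b_norms, lam1, lam2, gamma1=1.0, gamma2=1.0):
    """Label each coefficient as 'varying', 'constant' or 'zero'.

    delta1 < 1 and delta2 < 1  -> varying   (Step 1)
    delta1 < 1 and delta2 = 1  -> constant  (Step 2.1)
    delta1 = 1 and delta2 = 1  -> zero      (Step 2.2)
    The remaining case (delta1 = 1, delta2 < 1) is not covered by the proof;
    this sketch labels it 'varying' because the derivative is not shrunk.
    """
    labels = []
    for a, b in zip(a_norms, b_norms):
        d1 = smooth_threshold(a, lam1, gamma1)
        d2 = smooth_threshold(b, lam2, gamma2)
        if d2 < 1.0:
            labels.append("varying")
        elif d1 < 1.0:
            labels.append("constant")
        else:
            labels.append("zero")
    return labels

# Made-up norms for three coefficients with n = 1000 observations.
lam1, lam2 = example_lambdas(1000)
labels = classify([1.2, 0.8, 0.05], [0.9, 0.1, 0.1], lam1, lam2)
print(labels)
```

Because the weights equal one exactly when the threshold is reached, the three groups can be read off directly from \((\hat{\delta }_{1k}, \hat{\delta }_{2k})\) without a separate testing step, which is the separation that the steps of the proof establish with probability tending to one.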
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Lu, F., Yang, J. & Lu, X. Automatic Structure Identification of Semiparametric Spatial Autoregressive Model Based on Smooth-Threshold Estimating Equation. Commun. Math. Stat. (2023). https://doi.org/10.1007/s40304-023-00362-6
DOI: https://doi.org/10.1007/s40304-023-00362-6
Keywords
- Semiparametric spatial autoregressive model
- Generalized method of moments
- Automatic structure recovery
- Smooth-threshold estimating equation
- Consistency