Variable selection of higher-order partially linear spatial autoregressive model with a diverging number of parameters

Li, Tizheng; Kang, Xiaojuan

doi:10.1007/s00362-021-01241-4

Regular Article
Published: 27 May 2021

Volume 63, pages 243–285, (2022)

Statistical Papers

Tizheng Li¹ &
Xiaojuan Kang²

Abstract

In this paper, we consider the problem of variable selection in the higher-order partially linear spatial autoregressive model with a diverging number of parameters. By combining the series approximation method, the two-stage least squares method and a class of non-convex penalty functions, we propose a variable selection method that simultaneously selects the significant spatial lags of the response variable and the significant explanatory variables in the parametric component, and estimates the corresponding nonzero parameters. Unlike existing variable selection methods for spatial autoregressive models, the proposed method selects significant explanatory variables and spatial lags of the response variable at the same time. Under appropriate conditions, we establish the rate of convergence of the penalized estimator of the parameter vector in the parametric component and the uniform rate of convergence of the series estimator of the nonparametric component, and show that the proposed method enjoys the oracle property: it estimates the zero parameters as exactly zero with probability approaching one, and estimates the nonzero parameters as efficiently as if the true model were known in advance. Simulation studies show that the proposed method has satisfactory finite-sample properties. In particular, when the sample size is moderate, it works well even when the explanatory variables in the parametric component are strongly correlated. An application to the Boston house price data serves as a practical illustration.

Acknowledgements

The authors are grateful to the editor Christine H. Müller and reviewers for their constructive comments and suggestions, which led to an improved version of this paper. This research was supported by the Natural Science Foundation of Shaanxi Province [Grant Number 2021JM349], the National Statistical Science Project [Grant Number 2019LY36] and the National Natural Science Foundation of China [Grant Number 11972273].

Author information

Authors and Affiliations

School of Science, Xi’an University of Architecture and Technology, Xi’an, 710055, China
Tizheng Li
School of Economics and Management, Xi’an University of Technology, Xi’an, 710048, China
Xiaojuan Kang

Corresponding author

Correspondence to Tizheng Li.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix: Proofs

In this appendix, we give the technical proofs of Theorems 1–3. In our proofs, we will frequently use the following three facts.

Fact 1. If the row and column sums of $n{\times }n$ matrices ${\mathbf{A}}_{n1}$ and ${\mathbf{A}}_{n2}$ are uniformly bounded in absolute value, then the row and column sums of ${\mathbf{A}}_{n1}{\mathbf{A}}_{n2}$ and ${\mathbf{A}}_{n2}{\mathbf{A}}_{n1}$ are also uniformly bounded in absolute value.

Fact 2. The largest eigenvalue of an idempotent matrix is at most one.

Fact 3. For any $n{\times }n$ matrix ${\mathbf{A}}_{n}$, its spectral radius is bounded by $\mathrm{{max}}_{1{\le }i{\le }n} \sum _{j=1}^{n}|a_{n,ij}|$, where $a_{n,ij}$ is the (i, j)th element of ${\mathbf{A}}_{n}$.
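The three facts can be checked numerically on small examples. The sketch below is only an illustration: the matrices `A1`, `A2`, `B` and the vector `h` are hypothetical choices, not taken from the paper, and the spectral radius is estimated by power iteration, which is adequate here only because the test matrices are symmetric.

```python
import math

def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def max_abs_row_sum(A):
    return max(sum(abs(x) for x in row) for row in A)

def max_abs_col_sum(A):
    return max(sum(abs(x) for x in col) for col in zip(*A))

def spectral_radius_symmetric(A, iters=200):
    """Power-iteration estimate of the spectral radius of a symmetric matrix."""
    v = [float(i + 1) for i in range(len(A))]
    rho = 0.0
    for _ in range(iters):
        w = [sum(a * x for a, x in zip(row, v)) for row in A]
        norm_v = math.sqrt(sum(x * x for x in v))
        norm_w = math.sqrt(sum(x * x for x in w))
        rho = norm_w / norm_v
        v = [x / norm_w for x in w]
    return rho

# Fact 1: row and column sums of a product are bounded by the products of the bounds.
A1 = [[0.0, 0.5, 0.5], [0.5, 0.0, 0.5], [0.5, 0.5, 0.0]]  # hypothetical weight matrix
A2 = [[0.0, 1.0, 0.0], [0.5, 0.0, 0.5], [0.0, 1.0, 0.0]]  # hypothetical weight matrix
prod = matmul(A1, A2)
fact1_ok = (
    max_abs_row_sum(prod) <= max_abs_row_sum(A1) * max_abs_row_sum(A2) + 1e-12
    and max_abs_col_sum(prod) <= max_abs_col_sum(A1) * max_abs_col_sum(A2) + 1e-12
)

# Fact 2: a projection h h^T / (h^T h) is idempotent and its largest eigenvalue is one.
h = [1.0, 2.0, 2.0]
hh = sum(x * x for x in h)
P = [[hi * hj / hh for hj in h] for hi in h]
PP = matmul(P, P)
idempotent = all(abs(PP[i][j] - P[i][j]) < 1e-12 for i in range(3) for j in range(3))
rho_P = spectral_radius_symmetric(P)

# Fact 3: the spectral radius does not exceed the largest absolute row sum.
B = [[0.0, 0.5, 0.0], [0.5, 0.0, 0.5], [0.0, 0.5, 0.0]]
rho_B = spectral_radius_symmetric(B)
fact3_ok = rho_B <= max_abs_row_sum(B) + 1e-9

print(fact1_ok, idempotent, round(rho_P, 6), round(rho_B, 6), fact3_ok)
```

For `B` the bound in Fact 3 is not tight: the largest absolute row sum is one, while the estimated spectral radius is strictly smaller.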

Proof of Theorem 1

Let ${\alpha }_{n}={\sqrt{p_{n}/n}}+K^{-{\delta }}$ and ${\varvec{\theta }}={\varvec{\theta }}_{0}+{\alpha }_{n}{\mathbf{u}}$. It suffices to prove that for any given ${\eta }>0$, there exists a sufficiently large positive constant C such that

$$\begin{aligned} P\left( \inf _{\Vert {\mathbf{u}}\Vert =C}Q({\varvec{\theta }}_{0}+{\alpha }_{n}{\mathbf{u}})> Q({\varvec{\theta }}_{0})\right) {\ge }1-{\eta }. \end{aligned}$$

(14)

This implies that with probability at least $1-{\eta }$, there exists a local minimizer $\widehat{\varvec{\theta }}$ in the ball $\{{\varvec{\theta }}_{0}+{\alpha }_{n}{\mathbf{u}}:{\Vert {\mathbf{u}}\Vert {\le }C}\}$ such that $\Vert \widehat{\varvec{\theta }}-{\varvec{\theta }}_{0}\Vert = \mathrm{{O}}_{P}({\alpha }_{n})$.

Let

$$\begin{aligned} {C}_{n1}=\Vert \widetilde{\mathbf{Y}}_{n}-{\mathbf{M}}_{n}\widetilde{\mathbf{D}}_{n}({\varvec{\theta }}_{0}+{\alpha }_{n}{\mathbf{u}})\Vert ^{2} -\Vert \widetilde{\mathbf{Y}}_{n}-{\mathbf{M}}_{n}\widetilde{\mathbf{D}}_{n}{\varvec{\theta }}_{0}\Vert ^{2} \end{aligned}$$

and

$$\begin{aligned} {C}_{n2}=n\sum _{j=1}^{t+l}\left[ p_{{\lambda }_{j}}(|{\theta }_{j0}+{\alpha }_{n}u_{j}|)- p_{{\lambda }_{j}}(|{\theta }_{j0}|)\right] . \end{aligned}$$

It follows from the assumptions about the penalty function that $p_{{\lambda }_{j}}(0)=0$ and $p_{{\lambda }_{j}}(\cdot )$ is increasing on $[0,{\infty })$. Thus, we have

$$\begin{aligned}&Q({\varvec{\theta }}_{0}+{\alpha }_{n}{\mathbf{u}})-Q({\varvec{\theta }}_{0})\\&\quad ={\frac{1}{2}} {C}_{n1}+n\sum _{j=1}^{p_{n}+r}\left[ p_{{\lambda }_{j}}(|{\theta }_{j0}+{\alpha }_{n}u_{j}|)- p_{{\lambda }_{j}}(|{\theta }_{j0}|)\right] \\&\quad ={\frac{1}{2}} {C}_{n1}+n\sum _{j=1}^{t+l}\left[ p_{{\lambda }_{j}}(|{\theta }_{j0}+{\alpha }_{n}u_{j}|)- p_{{\lambda }_{j}}(|{\theta }_{j0}|)\right] +n\sum _{j=t+l+1}^{p_{n}+r}p_{{\lambda }_{j}}({\alpha }_{n}|u_{j}|)\\&\quad {\ge }\,{\frac{1}{2}} {C}_{n1}+n\sum _{j=1}^{t+l}\left[ p_{{\lambda }_{j}}(|{\theta }_{j0}+{\alpha }_{n}u_{j}|)- p_{{\lambda }_{j}}(|{\theta }_{j0}|)\right] \\&\quad ={\frac{1}{2}}{C}_{n1}+{C}_{n2}. \end{aligned}$$

For ${C}_{n1}$, we have

$$\begin{aligned} {C}_{n1}= & {} \Vert \widetilde{\mathbf{Y}}_{n}-{\mathbf{M}}_{n}\widetilde{\mathbf{D}}_{n}({\varvec{\theta }}_{0}+{\alpha }_{n}{\mathbf{u}})\Vert ^{2} -\Vert \widetilde{\mathbf{Y}}_{n}-{\mathbf{M}}_{n}\widetilde{\mathbf{D}}_{n}{\varvec{\theta }}_{0}\Vert ^{2}\\= & {} \Vert (\widetilde{\mathbf{Y}}_{n}-{\mathbf{M}}_{n}\widetilde{\mathbf{D}}_{n}{\varvec{\theta }}_{0})-{\alpha }_{n}{\mathbf{M}}_{n}\widetilde{\mathbf{D}}_{n}{\mathbf{u}}\Vert ^{2} -\Vert \widetilde{\mathbf{Y}}_{n}-{\mathbf{M}}_{n}\widetilde{\mathbf{D}}_{n}{\varvec{\theta }}_{0}\Vert ^{2}\\= & {} -2{\alpha }_{n}(\widetilde{\mathbf{Y}}_{n}- {\mathbf{M}}_{n}\widetilde{\mathbf{D}}_{n}{\varvec{\theta }}_{0})^{{\mathrm{T}}}{\mathbf{M}}_{n} \widetilde{\mathbf{D}}_{n}{\mathbf{u}}+{\alpha }_{n}^{2}{\mathbf{u}}^{{\mathrm{T}}}\widetilde{\mathbf{D}}_{n}^{{\mathrm{T}}}{\mathbf{M}}_{n} \widetilde{\mathbf{D}}_{n}{\mathbf{u}}\\&{\overset{\triangle }{=}}&-2{\alpha }_{n}{D}_{n1}+{\alpha }_{n}^{2}{D}_{n2}. \end{aligned}$$

Let ${\mathbf{V}}_{n}={\mathbf{m}}_{0}({\mathbf{Z}}_{n})-{\mathbf{P}}_{n}{\varvec{\nu }}_{0}$ and ${\overline{\varvec{\varepsilon }}}_{n}=({\mathbf{G}}_{n1}{\varvec{\varepsilon }}_{n}, \ldots ,{\mathbf{G}}_{nr}{\varvec{\varepsilon }}_{n},{\mathbf{0}}_{n{\times }p_{n}})$. Then it is easy to show that ${\mathbf{D}}_{n}={\overline{\mathbf{D}}}_{n}+{\overline{\varvec{\varepsilon }}}_{n}$. Thus, ${D}_{n1}$ can be decomposed into

$$\begin{aligned} {D}_{n1}= & {} (\widetilde{\mathbf{Y}}_{n}- {\mathbf{M}}_{n}\widetilde{\mathbf{D}}_{n}{\varvec{\theta }}_{0})^{{\mathrm{T}}}{\mathbf{M}}_{n} \widetilde{\mathbf{D}}_{n}{\mathbf{u}}\\= & {} [\widetilde{\varvec{\varepsilon }}_{n}+({\mathbf{I}}_{n}-{\varvec{\varPi }}_{n}){\mathbf{V}}_{n} +({\mathbf{I}}_{n}-{\mathbf{M}}_{n})\widetilde{\mathbf{D}}_{n}{\varvec{\theta }}_{0}]^{{\mathrm{T}}}{\mathbf{M}}_{n} \widetilde{\mathbf{D}}_{n}{\mathbf{u}}\\= & {} \widetilde{\varvec{\varepsilon }}_{n}^{{\mathrm{T}}}{\mathbf{M}}_{n}\widetilde{\mathbf{D}}_{n}{\mathbf{u}} +{\mathbf{V}}_{n}^{{\mathrm{T}}}({\mathbf{I}}_{n}-{\varvec{\varPi }}_{n}){\mathbf{M}}_{n}\widetilde{\mathbf{D}}_{n}{\mathbf{u}}\\= & {} {\varvec{\varepsilon }}_{n}^{{\mathrm{T}}}({\mathbf{I}}_{n}-{\varvec{\varPi }}_{n}){\mathbf{M}}_{n}({\mathbf{I}}_{n}-{\varvec{\varPi }}_{n}){\mathbf{D}}_{n}{\mathbf{u}} +{\mathbf{V}}_{n}^{{\mathrm{T}}}({\mathbf{I}}_{n}-{\varvec{\varPi }}_{n}){\mathbf{M}}_{n}({\mathbf{I}}_{n}-{\varvec{\varPi }}_{n}){\mathbf{D}}_{n}{\mathbf{u}}\\= & {} {D}_{n11}+{D}_{n12}+{D}_{n13}+{D}_{n14}, \end{aligned}$$

where ${D}_{n11}={\varvec{\varepsilon }}_{n}^{{\mathrm{T}}}({\mathbf{I}}_{n}-{\varvec{\varPi }}_{n}){\mathbf{M}}_{n}({\mathbf{I}}_{n}-{\varvec{\varPi }}_{n})\overline{\mathbf{D}}_{n}{\mathbf{u}}$, ${D}_{n12}={\varvec{\varepsilon }}_{n}^{{\mathrm{T}}}({\mathbf{I}}_{n}-{\varvec{\varPi }}_{n}){\mathbf{M}}_{n}({\mathbf{I}}_{n}-{\varvec{\varPi }}_{n})\overline{\varvec{\varepsilon }}_{n}{\mathbf{u}}$, ${D}_{n13}={\mathbf{V}}_{n}^{{\mathrm{T}}}({\mathbf{I}}_{n}-{\varvec{\varPi }}_{n}){\mathbf{M}}_{n}({\mathbf{I}}_{n}-{\varvec{\varPi }}_{n})\overline{\mathbf{D}}_{n}{\mathbf{u}}$ and ${D}_{n14}={\mathbf{V}}_{n}^{{\mathrm{T}}}({\mathbf{I}}_{n}-{\varvec{\varPi }}_{n}){\mathbf{M}}_{n}({\mathbf{I}}_{n}-{\varvec{\varPi }}_{n})\overline{\varvec{\varepsilon }}_{n}{\mathbf{u}}$.

By Assumption 2 and Fact 2, we have

$$\begin{aligned} \mathrm{{E}}(\Vert {\mathbf{M}}_{n}({\mathbf{I}}_{n}-{\varvec{\varPi }}_{n}){\varvec{\varepsilon }}_{n}\Vert ^{2})&=\mathrm{{E}}({\varvec{\varepsilon }}_{n}^{{\mathrm{T}}}({\mathbf{I}}_{n}-{\varvec{\varPi }}_{n}){\mathbf{M}}_{n} ({\mathbf{I}}_{n}-{\varvec{\varPi }}_{n}){\varvec{\varepsilon }}_{n})\\&={\sigma }_{0}^{2}\mathrm{{tr}}(({\mathbf{I}}_{n}-{\varvec{\varPi }}_{n}){\mathbf{M}}_{n} ({\mathbf{I}}_{n}-{\varvec{\varPi }}_{n}))\\&{\le }\,{\sigma }_{0}^{2}\mathrm{{tr}}({\mathbf{M}}_{n})\\&={\sigma }_{0}^{2}\mathrm{{tr}}({\mathbf{H}}_{n}({\mathbf{H}}_{n}^{{\mathrm{T}}}{\mathbf{H}}_{n})^{-1}{\mathbf{H}}_{n}^{{\mathrm{T}}})\\&={\sigma }_{0}^{2}\mathrm{{tr}}({\mathbf{I}}_{p_{n}+r})\\&=\mathrm{{O}}(p_{n}). \end{aligned}$$

This together with Markov’s inequality implies that

$$\begin{aligned} \Vert {\mathbf{M}}_{n}({\mathbf{I}}_{n}-{\varvec{\varPi }}_{n}){\varvec{\varepsilon }}_{n}\Vert ^{2} =\mathrm{{O}}_{P}(p_{n}). \end{aligned}$$

(15)
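The trace step behind (15) can be illustrated on a toy example. The sketch below assumes, as the display above indicates, that ${\mathbf{M}}_{n}$ is the projection ${\mathbf{H}}_{n}({\mathbf{H}}_{n}^{{\mathrm{T}}}{\mathbf{H}}_{n})^{-1}{\mathbf{H}}_{n}^{{\mathrm{T}}}$, and additionally assumes that ${\varvec{\varPi }}_{n}$ is the analogous projection onto a series basis; the matrices `H` and `B` are hypothetical two-column examples, not the paper's data.

```python
def transpose(A):
    return [list(col) for col in zip(*A)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def inverse_2x2(S):
    (a, b), (c, d) = S
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def projection(X):
    """Orthogonal projection X (X^T X)^{-1} X^T for an n x 2 matrix of full column rank."""
    Xt = transpose(X)
    return matmul(matmul(X, inverse_2x2(matmul(Xt, X))), Xt)

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

n = 4
H = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]    # hypothetical instrument matrix
B = [[1.0, 1.0], [1.0, -1.0], [1.0, 1.0], [1.0, -1.0]]  # hypothetical series basis

M = projection(H)    # plays the role of M_n
Pi = projection(B)   # plays the role of Pi_n (assumed form)
I_minus_Pi = [[(1.0 if i == j else 0.0) - Pi[i][j] for j in range(n)] for i in range(n)]

# tr(M) equals the number of columns of H, and the sandwiched trace cannot exceed it.
tr_M = trace(M)
tr_sandwich = trace(matmul(matmul(I_minus_Pi, M), I_minus_Pi))
print(tr_M, tr_sandwich)
```

In this toy case the trace of the projection equals the number of columns of `H`, which mirrors the step $\mathrm{tr}({\mathbf{M}}_{n})=\mathrm{tr}({\mathbf{I}}_{p_{n}+r})$ in the display.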

It follows from Assumption 3.4 and Fact 2 that

$$\begin{aligned} \Vert {\mathbf{M}}_{n}({\mathbf{I}}_{n}-{\varvec{\varPi }}_{n})\overline{\mathbf{D}}_{n}{\mathbf{u}}\Vert ^{2}&={\mathbf{u}}^{{\mathrm{T}}}\overline{\mathbf{D}}_{n}^{{\mathrm{T}}}({\mathbf{I}}_{n}-{\varvec{\varPi }}_{n}) {\mathbf{M}}_{n}({\mathbf{I}}_{n}-{\varvec{\varPi }}_{n})\overline{\mathbf{D}}_{n}{\mathbf{u}}\nonumber \\&{\le }\,\,{\mathbf{u}}^{{\mathrm{T}}}\overline{\mathbf{D}}_{n}^{{\mathrm{T}}}({\mathbf{I}}_{n}-{\varvec{\varPi }}_{n}) \overline{\mathbf{D}}_{n}{\mathbf{u}}\nonumber \\&{\le }\,\,{\mathbf{u}}^{{\mathrm{T}}}\overline{\mathbf{D}}_{n}^{{\mathrm{T}}} \overline{\mathbf{D}}_{n}{\mathbf{u}}\nonumber \\&{\le }\,\,n{\cdot }{\eta }_{\max }(n^{-1}\overline{\mathbf{D}}_{n}^{{\mathrm{T}}} \overline{\mathbf{D}}_{n}){\mathbf{u}}^{{\mathrm{T}}}{\mathbf{u}}\nonumber \\&=\mathrm{{O}}(n\Vert {\mathbf{u}}\Vert ^{2}). \end{aligned}$$

(16)

By (15), (16) and the Cauchy–Schwarz inequality, we obtain ${D}_{n11}=\mathrm{{O}}_{P}({\sqrt{np_{n}}}\Vert {\mathbf{u}}\Vert )$.

For $j=1,\ldots ,r$, it follows from Assumption 1.3 and Fact 1 that the row sums of ${\mathbf{G}}_{nj}{\mathbf{G}}_{nj}^{{\mathrm{T}}}$ are uniformly bounded in absolute value. Hence, we obtain ${\eta }_{\max }({\mathbf{G}}_{nj}{\mathbf{G}}_{nj}^{{\mathrm{T}}})=\mathrm{{O}}(1)$ by Fact 3. Using this result and a similar proof to that of (15), we have

$$\begin{aligned} \Vert {\mathbf{M}}_{n}({\mathbf{I}}_{n}-{\varvec{\varPi }}_{n}){\mathbf{G}}_{nj}{\varvec{\varepsilon }}_{n}\Vert ^{2} =\mathrm{{O}}_{P}(p_{n}),\,j=1,\ldots ,r. \end{aligned}$$

This together with (15) and the Cauchy–Schwarz inequality yields

$$\begin{aligned} {\varvec{\varepsilon }}_{n}^{{\mathrm{T}}}({\mathbf{I}}_{n}-{\varvec{\varPi }}_{n}){\mathbf{M}}_{n}({\mathbf{I}}_{n}-{\varvec{\varPi }}_{n}) {\mathbf{G}}_{nj}{\varvec{\varepsilon }}_{n}=\mathrm{{O}}_{P}(p_{n}),\,j=1,\ldots ,r. \end{aligned}$$

(17)

It follows from (17) and the Cauchy–Schwarz inequality that

$$\begin{aligned} D_{n12}{\le }\Vert {\mathbf{u}}\Vert \left\{ \sum _{j=1}^{r}[{\varvec{\varepsilon }}_{n}^{{\mathrm{T}}}({\mathbf{I}}_{n}-{\varvec{\varPi }}_{n}){\mathbf{M}}_{n}({\mathbf{I}}_{n}-{\varvec{\varPi }}_{n}) {\mathbf{G}}_{nj}{\varvec{\varepsilon }}_{n}]^{2}\right\} ^{1/2} =\mathrm{{O}}_{P}(p_{n}\Vert {\mathbf{u}}\Vert ). \end{aligned}$$

By Fact 2 and Assumption 4.1, we obtain

$$\begin{aligned} \Vert {\mathbf{M}}_{n}({\mathbf{I}}_{n}-{\varvec{\varPi }}_{n}){\mathbf{V}}_{n}\Vert ^{2}= & {} {\mathbf{V}}_{n}^{{\mathrm{T}}}({\mathbf{I}}_{n}-{\varvec{\varPi }}_{n}){\mathbf{M}}_{n} ({\mathbf{I}}_{n}-{\varvec{\varPi }}_{n}){\mathbf{V}}_{n}\nonumber \\\le & {} {\mathbf{V}}_{n}^{{\mathrm{T}}}{\mathbf{V}}_{n}\nonumber \\= & {} \sum _{i=1}^{n}|m_{0}({\mathbf{z}}_{n,i})-{\mathbf{p}}^{K}({\mathbf{z}}_{n,i}) ^{{\mathrm{T}}}{\varvec{\nu }}_{0}|^{2}\nonumber \\\le & {} n{\cdot }\max _{1{\le }i{\le }n}|m_{0}({\mathbf{z}}_{n,i})-{\mathbf{p}}^{K}({\mathbf{z}}_{n,i}) ^{{\mathrm{T}}}{\varvec{\nu }}_{0}|^{2}\nonumber \\\le & {} n{\cdot }\left( \max _{1{\le }i{\le }n}|m_{0}({\mathbf{z}}_{n,i})-{\mathbf{p}}^{K}({\mathbf{z}}_{n,i}) ^{{\mathrm{T}}}{\varvec{\nu }}_{0}|\right) ^{2}\nonumber \\\le & {} n{\cdot }\left( {\sup }_{{\mathbf{z}}{\in }{{\mathcal {Z}}}}|m_{0}({\mathbf{z}})-{\mathbf{p}}^{K}({\mathbf{z}}) ^{{\mathrm{T}}}{\varvec{\nu }}_{0}|\right) ^{2}\nonumber \\= & {} \mathrm{{O}}(nK^{-2{\delta }}). \end{aligned}$$

(18)

Using (16), (18) and the Cauchy–Schwarz inequality, we get

$$\begin{aligned} D_{n13}= & {} {\mathbf{V}}_{n}^{{\mathrm{T}}}({\mathbf{I}}_{n}-{\varvec{\varPi }}_{n}) {\mathbf{M}}_{n}({\mathbf{I}}_{n}-{\varvec{\varPi }}_{n})\overline{\mathbf{D}}_{n}{\mathbf{u}}\\&{\le }&\left\{ \Vert {\mathbf{M}}_{n}({\mathbf{I}}_{n}-{\varvec{\varPi }}_{n}){\mathbf{V}}_{n}\Vert ^{2} {\cdot }\Vert {\mathbf{M}}_{n}({\mathbf{I}}_{n}-{\varvec{\varPi }}_{n})\overline{\mathbf{D}}_{n}{\mathbf{u}}\Vert ^{2}\right\} ^{\frac{1}{2}}\\= & {} \mathrm{{O}}_{P}(nK^{-{\delta }}\Vert {\mathbf{u}}\Vert ). \end{aligned}$$

Similarly, we can prove $D_{n14}=\mathrm{{O}}_{P}({\sqrt{np_{n}}}K^{-{\delta }}\Vert {\mathbf{u}}\Vert )$ by using (17), (18) and the Cauchy–Schwarz inequality. Combining the orders of $D_{n11}$, $D_{n12}$, $D_{n13}$ and $D_{n14}$, we obtain

$$\begin{aligned} D_{n1}=\mathrm{{O}}_{P}(({\sqrt{np_{n}}}+nK^{-{\delta }})\Vert {\mathbf{u}}\Vert ). \end{aligned}$$

For $D_{n2}$, we have

$$\begin{aligned} D_{n2}= & {} {\mathbf{u}}^{{\mathrm{T}}}\widetilde{\mathbf{D}}_{n}^{{\mathrm{T}}}{\mathbf{M}}_{n} \widetilde{\mathbf{D}}_{n}{\mathbf{u}}\\= & {} {\mathbf{u}}^{{\mathrm{T}}}(\overline{\mathbf{D}}_{n}+\overline{\varvec{\varepsilon }}_{n})^{{\mathrm{T}}} ({\mathbf{I}}_{n}-{\varvec{\varPi }}_{n}){\mathbf{M}}_{n}({\mathbf{I}}_{n}-{\varvec{\varPi }}_{n}) (\overline{\mathbf{D}}_{n}+\overline{\varvec{\varepsilon }}_{n}){\mathbf{u}}\\= & {} D_{n21}+D_{n22}+2D_{n23}, \end{aligned}$$

where $D_{n21}={\mathbf{u}}^{{\mathrm{T}}}\overline{\mathbf{D}}_{n}^{{\mathrm{T}}}({\mathbf{I}}_{n}-{\varvec{\varPi }}_{n}) {\mathbf{M}}_{n}({\mathbf{I}}_{n}-{\varvec{\varPi }}_{n})\overline{\mathbf{D}}_{n}{\mathbf{u}}$, $D_{n22}={\mathbf{u}}^{{\mathrm{T}}}\overline{\varvec{\varepsilon }}_{n}^{{\mathrm{T}}}({\mathbf{I}}_{n}-{\varvec{\varPi }}_{n}) {\mathbf{M}}_{n}({\mathbf{I}}_{n}-{\varvec{\varPi }}_{n})\overline{\varvec{\varepsilon }}_{n}{\mathbf{u}}$ and $D_{n23}={\mathbf{u}}^{{\mathrm{T}}}\overline{\mathbf{D}}_{n}^{{\mathrm{T}}}({\mathbf{I}}_{n}-{\varvec{\varPi }}_{n}) {\mathbf{M}}_{n}({\mathbf{I}}_{n}-{\varvec{\varPi }}_{n})\overline{\varvec{\varepsilon }}_{n}{\mathbf{u}}$. Similar to the proof of $D_{n1}$, we can obtain $D_{n21}=\mathrm{{O}}(n\Vert {\mathbf{u}}\Vert ^{2})$, $D_{n22}=\mathrm{{O}}_{P}(p_{n}\Vert {\mathbf{u}}\Vert ^{2})$ and $D_{n23}=\mathrm{{O}}_{P}({\sqrt{np_{n}}}\Vert {\mathbf{u}}\Vert ^{2})$. Thus, we have $D_{n2}=\mathrm{{O}}_{P}(n\Vert {\mathbf{u}}\Vert ^{2})$.

Next, we consider ${C}_{n2}$. Let $a_{n}=\max \{|p_{{\lambda }_{j}}^{\prime }(|{\theta }_{j0}|)|, {\theta }_{j0}{\ne }0\}$ and $b_{n}=\max \{|p_{{\lambda }_{j}}^{\prime \prime }(|{\theta }_{j0}|)|, {\theta }_{j0}{\ne }0\}$. Then, it follows from ${\lambda }_{{\mathrm{max}}}{\rightarrow }0$ as $n{\rightarrow }{\infty }$ and Condition (6) on the penalty function that $a_{n}=o(1)$ and $b_{n}=o(1)$. By a Taylor expansion of the penalty function and the Cauchy–Schwarz inequality, we have

$$\begin{aligned} {C}_{n2}= & {} n\sum _{j=1}^{t+l}\left\{ p_{{\lambda }_{j}}(|{\theta }_{j0}+{\alpha }_{n}u_{j}|)- p_{{\lambda }_{j}}(|{\theta }_{j0}|)\right\} \\= & {} n{\alpha }_{n}{\sum _{j=1}^{t+l}p_{{\lambda }_{j}}^{\prime }(|{\theta }_{j0}|)\mathrm{{sgn}}({\theta }_{j0})u_{j}}+ {\frac{n{\alpha }_{n}^{2}}{2}}{\sum _{j=1}^{t+l}p_{{\lambda }_{j}}^{\prime \prime }(|{\theta }_{j0}|)u_{j}^{2}[1+\mathrm{{o}}(1)]}\\&{\ge }&-\,n{\alpha }_{n}{\sum _{j=1}^{t+l}|p_{{\lambda }_{j}}^{\prime }(|{\theta }_{j0}|)||u_{j}|} -{\frac{n{\alpha }_{n}^{2}}{2}}{\sum _{j=1}^{t+l}|p_{{\lambda }_{j}}^{\prime \prime }(|{\theta }_{j0}|)|u_{j}^{2}[1+\mathrm{{o}}(1)]}\\&{\ge }&-\,n{\alpha }_{n}{a_{n}}{\sum _{j=1}^{t+l}|u_{j}|} -{\frac{n{\alpha }_{n}^{2}}{2}}{b_{n}}{\sum _{j=1}^{t+l}u_{j}^{2}[1+\mathrm{{o}}(1)]}\\&{\ge }&-\,n{\alpha }_{n}{a_{n}}{\sqrt{t+l}}\Vert {\mathbf{u}}\Vert -n{\alpha }_{n}^{2}{\Vert {\mathbf{u}}\Vert }^{2}{b_{n}} \end{aligned}$$

By comparing the orders of ${\alpha }_{n}{D}_{n1}$, ${\alpha }_{n}^{2}{D}_{n2}$ and ${C}_{n2}$ and noting that ${a_{n}}=\mathrm{{o}}(1)$ and ${b_{n}}=\mathrm{{o}}(1)$, we can conclude that ${\alpha }_{n}^{2}{D}_{n2}$ dominates both ${\alpha }_{n}{D}_{n1}$ and ${C}_{n2}$ provided C is sufficiently large. Thus, (14) holds for sufficiently large C. This completes the proof of Theorem 1. $\square $
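One way to read this comparison is the following sketch, written on the sphere $\Vert {\mathbf{u}}\Vert =C$. It assumes in addition that $D_{n2}$ is bounded below by a positive multiple of $n\Vert {\mathbf{u}}\Vert ^{2}$ with probability approaching one, which the domination argument needs but which is not derived in the lines above.

$$\begin{aligned} {\sqrt{np_{n}}}+nK^{-{\delta }}&=n\left( {\sqrt{p_{n}/n}}+K^{-{\delta }}\right) =n{\alpha }_{n},\\ {\alpha }_{n}D_{n1}&=\mathrm{{O}}_{P}(n{\alpha }_{n}^{2}C),\\ {\alpha }_{n}^{2}D_{n2}&=\mathrm{{O}}_{P}(n{\alpha }_{n}^{2}C^{2}),\\ C_{n2}&{\ge }-n{\alpha }_{n}a_{n}{\sqrt{t+l}}\,C-n{\alpha }_{n}^{2}b_{n}C^{2}. \end{aligned}$$

Since $b_{n}=\mathrm{{o}}(1)$, the last term in the bound for $C_{n2}$ is of smaller order than $n{\alpha }_{n}^{2}C^{2}$. The term involving $a_{n}$ is negligible when $a_{n}{\sqrt{t+l}}=\mathrm{{O}}({\alpha }_{n})$, in particular when $a_{n}=0$ for all large $n$. After division by $n{\alpha }_{n}^{2}$, the quadratic term grows like $C^{2}$ while the linear term grows like $C$, so a sufficiently large $C$ makes the quadratic term dominate.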

Proof of Theorem 2

We first prove part (a). It is sufficient to prove that, for any ${\varvec{\theta }}$ satisfying $\Vert {\varvec{\theta }}-{\varvec{\theta }}_{0}\Vert = \mathrm{{O}}_{P}({\sqrt{p_{n}/n}}+K^{-{\delta }})$ and some small ${{\delta }}_{n}=C({\sqrt{p_{n}/n}}+K^{-{\delta }})$, with probability tending to 1 as $n \rightarrow \infty $,

$$\begin{aligned} \frac{\partial {Q({\varvec{\theta }})}}{\partial {{\theta }_{j}}}<0,\,\,\mathrm{{for}} \,\,{-{{\delta }}_{n}<{\theta }_{j}<0},\,\,j=t+l+1,\ldots ,p_{n}+r \end{aligned}$$

(19)

and

$$\begin{aligned} \frac{\partial {Q({\varvec{\theta }})}}{\partial {{\theta }_{j}}}>0,\,\,\mathrm{{for}} \,\,{0<{\theta }_{j}<{{\delta }}_{n}},\,\,j=t+l+1,\ldots ,p_{n}+r. \end{aligned}$$

(20)

Indeed, (19) and (20) imply that the minimum of $Q({\varvec{\theta }})$ is attained at ${\theta }_{j}=0$, $j=t+l+1,\ldots ,p_{n}+r$.

For $j=t+l+1,\ldots ,p_{n}+r$, we have

$$\begin{aligned} \frac{\partial {Q({\varvec{\theta }})}}{\partial {{\theta }_{j}}}= & {} -\sum _{i=1}^{n}({\mathbf{M}}_{n}\widetilde{\mathbf{D}}_{n})_{n,ij} [\widetilde{{Y}}_{n,i}-({\mathbf{M}}_{n}\widetilde{\mathbf{D}}_{n})_{n,i{\cdot }}{\varvec{\theta }}] +np_{{\lambda }_{j}}^{\prime }(|{\theta }_{j}|)\mathrm{{sgn}}({\theta }_{j})\\= & {} -\sum _{i=1}^{n}({\mathbf{M}}_{n}\widetilde{\mathbf{D}}_{n})_{n,ij} [\widetilde{\mathbf{D}}_{n,i{\cdot }}{\varvec{\theta }}_{0}+ ({\mathbf{I}}_{n}-{\varvec{\varPi }}_{n})_{n,i{\cdot }}{\mathbf{V}}_{n} +\widetilde{\varvec{\varepsilon }}_{n,i}-({\mathbf{M}}_{n}\widetilde{\mathbf{D}}_{n})_{n,i{\cdot }} {\varvec{\theta }}_{0}]\\&-\sum _{i=1}^{n}({\mathbf{M}}_{n}\widetilde{\mathbf{D}}_{n})_{n,ij}({\mathbf{M}}_{n}\widetilde{\mathbf{D}}_{n})_{n,i{\cdot }} ({\varvec{\theta }}_{0}-{\varvec{\theta }}) +np_{{\lambda }_{j}}^{\prime }(|{\theta }_{j}|)\mathrm{{sgn}}({\theta }_{j})\\= & {} -\,(\widetilde{\mathbf{D}}_{n}^{{\mathrm{T}}})_{n,j{\cdot }}{\mathbf{M}}_{n}\widetilde{\varvec{\varepsilon }}_{n} -(\widetilde{\mathbf{D}}_{n}^{{\mathrm{T}}})_{n,j{\cdot }}{\mathbf{M}}_{n}({\mathbf{I}}_{n}-{\varvec{\varPi }}_{n}){\mathbf{V}}_{n} +(\widetilde{\mathbf{D}}_{n}^{{\mathrm{T}}})_{n,j{\cdot }}{\mathbf{M}}_{n}\widetilde{\mathbf{D}}_{n} ({\varvec{\theta }}-{\varvec{\theta }}_{0})\\&+\; np_{{\lambda }_{j}}^{\prime }(|{\theta }_{j}|)\mathrm{{sgn}}({\theta }_{j}). \end{aligned}$$

By using the same arguments as those used in the proof of Theorem 1 and noting that ${\varvec{\theta }}-{\varvec{\theta }}_{0}= \mathrm{{O}}_{P}({\sqrt{p_{n}/n}}+K^{-{\delta }})$, we can conclude that

$$\begin{aligned} \widetilde{\mathbf{D}}_{n}^{{\mathrm{T}}}{\mathbf{M}}_{n}\widetilde{\varvec{\varepsilon }}_{n}= & {} {\mathbf{D}}_{n}^{{\mathrm{T}}}({\mathbf{I}}_{n}-{\varvec{\varPi }}_{n}){\mathbf{M}}_{n}({\mathbf{I}}_{n}-{\varvec{\varPi }}_{n}){\varvec{\varepsilon }}_{n}\\= & {} \overline{\mathbf{D}}_{n}^{{\mathrm{T}}}({\mathbf{I}}_{n}-{\varvec{\varPi }}_{n}){\mathbf{M}}_{n}({\mathbf{I}}_{n}-{\varvec{\varPi }}_{n}){\varvec{\varepsilon }}_{n}\\&+\;\overline{\varvec{\varepsilon }}_{n}^{{\mathrm{T}}}({\mathbf{I}}_{n}-{\varvec{\varPi }}_{n}){\mathbf{M}}_{n}({\mathbf{I}}_{n}-{\varvec{\varPi }}_{n}){\varvec{\varepsilon }}_{n}\\= & {} \mathrm{{O}}_{P}({\sqrt{np_{n}}})+\mathrm{{O}}_{P}(p_{n})\\= & {} \mathrm{{O}}_{P}({\sqrt{np_{n}}}),\\ \widetilde{\mathbf{D}}_{n}^{{\mathrm{T}}}{\mathbf{M}}_{n}({\mathbf{I}}_{n}-{\varvec{\varPi }}_{n}){\mathbf{V}}_{n}= & {} {\mathbf{D}}_{n}^{{\mathrm{T}}}({\mathbf{I}}_{n}-{\varvec{\varPi }}_{n}){\mathbf{M}}_{n}({\mathbf{I}}_{n}-{\varvec{\varPi }}_{n}){\mathbf{V}}_{n}\\= & {} \overline{\mathbf{D}}_{n}^{{\mathrm{T}}}({\mathbf{I}}_{n}-{\varvec{\varPi }}_{n}){\mathbf{M}}_{n}({\mathbf{I}}_{n}-{\varvec{\varPi }}_{n}){\mathbf{V}}_{n}\\&+\;\overline{\varvec{\varepsilon }}_{n}^{{\mathrm{T}}}({\mathbf{I}}_{n}-{\varvec{\varPi }}_{n}){\mathbf{M}}_{n}({\mathbf{I}}_{n}-{\varvec{\varPi }}_{n}){\mathbf{V}}_{n}\\= & {} \mathrm{{O}}_{P}(nK^{-{\delta }})+\mathrm{{O}}_{P}({\sqrt{np_{n}}}K^{-{\delta }})\\= & {} \mathrm{{O}}_{P}(nK^{-{\delta }}) \end{aligned}$$

and

$$\begin{aligned} \widetilde{\mathbf{D}}_{n}^{{\mathrm{T}}}{\mathbf{M}}_{n}\widetilde{\mathbf{D}}_{n} ({\varvec{\theta }}-{\varvec{\theta }}_{0}) =\mathrm{{O}}_{P}(n({\sqrt{p_{n}/n}}+K^{-{\delta }})). \end{aligned}$$

Combining the above results, we get

$$\begin{aligned} \frac{\partial {Q({\varvec{\theta }})}}{\partial {{\theta }_{j}}} =n{\lambda }_{j} \left\{ {\lambda }_{j}^{-1}p_{{\lambda }_{j}}^{\prime }(|{\theta }_{j}|)\mathrm{{sgn}}({\theta }_{j}) +\mathrm{{O}}_{P}({\lambda }_{j}^{-1}({\sqrt{p_{n}/n}}+K^{-{\delta }}))\right\} . \end{aligned}$$

Since $p_{{\lambda }_{j}}^{\prime }(0+)={\lambda }_{j}$, we have ${\lambda }_{j}^{-1}p_{{\lambda }_{j}}^{\prime }(|{\theta }_{j}|) {\ge }\mathop {\lim \inf }_{n \rightarrow \infty }\mathop {\lim \inf }_{{\theta }_{j} \rightarrow 0}{\lambda }_{j}^{-1}p_{{\lambda }_{j}}^{\prime }(|{\theta }_{j}|)=1$. This together with ${\lambda }_{j}\left( {\sqrt{p_{n}/n}}+K^{-{\delta }}\right) ^{-1} {\ge }{\lambda }_{{\mathrm{min}}}\left( {\sqrt{p_{n}/n}}+K^{-{\delta }}\right) ^{-1}{\rightarrow }{\infty }$ as $n{\rightarrow }{\infty }$ implies that the sign of $\frac{\partial {Q({\varvec{\theta }})}}{\partial {{\theta }_{j}}}$ is completely determined by that of ${\theta }_{j}$. As a result, (19) and (20) hold.
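The sign argument can be made concrete for one member of the penalty class. The sketch below uses the standard form of the SCAD derivative with the conventional default $a=3.7$, which is an assumption here since the paper's penalty conditions are not reproduced in this appendix. The `remainder` argument is a hypothetical stand-in for the $\mathrm{{O}}_{P}$ term inside the braces, which tends to zero.

```python
def scad_derivative(theta, lam, a=3.7):
    """Derivative p'_lam(theta) of the SCAD penalty for theta >= 0 (standard form, assumed)."""
    if theta <= lam:
        return lam
    return max(a * lam - theta, 0.0) / (a - 1.0)

def sign(x):
    return (x > 0) - (x < 0)

def gradient_sign(theta_j, lam, remainder):
    """Sign of lam^{-1} p'(|theta_j|) sgn(theta_j) + remainder, mimicking the braces above."""
    ratio = scad_derivative(abs(theta_j), lam) / lam
    return sign(ratio * sign(theta_j) + remainder)

lam = 0.2
small_thetas = [-0.05, -0.01, 0.01, 0.05]  # hypothetical values inside (-lam, lam)

# Near zero the ratio lam^{-1} p'(|theta|) equals one, so a remainder smaller
# than one in absolute value cannot change the sign.
ratios = [scad_derivative(abs(t), lam) / lam for t in small_thetas]
signs = [gradient_sign(t, lam, remainder=0.3) for t in small_thetas]
print(ratios, signs)
```

With any remainder of absolute value below one, the computed sign agrees with the sign of ${\theta }_{j}$, which is the content of (19) and (20).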

Next, we prove part (b). Since ${\widehat{\varvec{\theta }}}=(({\widehat{\varvec{\theta }}}^{*})^{{\mathrm{T}}},{\mathbf{0}})^{{\mathrm{T}}}$ minimizes $Q({\varvec{\theta }})$, ${\widehat{\varvec{\theta }}}$ must satisfy the following system of equations

$$\begin{aligned} \frac{\partial {Q({\widehat{\varvec{\theta }}})}}{\partial {{\theta }_{j}}}=0,\,\,j=1,\ldots ,t+l. \end{aligned}$$

That is

$$\begin{aligned} -\sum _{i=1}^{n}({\mathbf{M}}_{n}^{*}\widetilde{\mathbf{D}}_{n}^{*})_{n,ij} [\widetilde{{Y}}_{n,i}-({\mathbf{M}}_{n}^{*}\widetilde{\mathbf{D}}_{n}^{*})_{n,i{\cdot }} {\widehat{\varvec{\theta }}}^{*}] +np_{{\lambda }_{j}}^{\prime }(|{\widehat{\theta }}_{j}^{*}|)\mathrm{{sgn}} ({\widehat{\theta }}_{j}^{*})=0,\,\,j=1,\ldots ,t+l. \qquad \end{aligned}$$

(21)

By straightforward derivation, we have

$$\begin{aligned}&\sum _{i=1}^{n}({\mathbf{M}}_{n}^{*}\widetilde{\mathbf{D}}_{n}^{*})_{n,ij} [\widetilde{{Y}}_{n,i}-({\mathbf{M}}_{n}^{*}\widetilde{\mathbf{D}}_{n}^{*})_{n,i{\cdot }} {\widehat{\varvec{\theta }}}^{*}]\nonumber \\&\quad =\quad ((\widetilde{\mathbf{D}}_{n}^{*})^{{\mathrm{T}}})_{n,j{\cdot }}{\mathbf{M}}_{n}^{*}\widetilde{\varvec{\varepsilon }}_{n} +((\widetilde{\mathbf{D}}_{n}^{*})^{{\mathrm{T}}})_{n,j{\cdot }}{\mathbf{M}}_{n}^{*}({\mathbf{I}}_{n}-{\varvec{\varPi }}_{n}){\mathbf{V}}_{n}\nonumber \\&\qquad +((\widetilde{\mathbf{D}}_{n}^{*})^{{\mathrm{T}}})_{n,j{\cdot }}{\mathbf{M}}_{n}^{*}\widetilde{\mathbf{D}}_{n}^{*} ({\varvec{\theta }}_{0}^{*}-\widehat{\varvec{\theta }}^{*}),\,\,j=1,\ldots ,t+l. \end{aligned}$$

(22)

where $\widetilde{\mathbf{D}}_{n}^{*}=({\mathbf{I}}_{n}-{\varvec{\varPi }}_{n}){\mathbf{D}}_{n}^{*}$. By applying Taylor expansion to $p_{{\lambda }_{j}}^{\prime }(|{\widehat{\theta }}_{j}^{*}|)$, we obtain

$$\begin{aligned} p_{{\lambda }_{j}}^{\prime }(|{\widehat{\theta }}_{j}^{*}|) =p_{{\lambda }_{j}}^{\prime }(|{\theta }_{j0}^{*}|)+ [p_{{\lambda }_{j}}^{\prime \prime }(|{\theta }_{j0}^{*}|)+\mathrm{{o}}_{P}(1)] ({\widehat{\theta }}_{j}^{*}-{\theta }_{j0}^{*}),\,\,\quad j=1,\ldots ,t+l. \end{aligned}$$

For both SCAD and MCP penalty functions, as ${\lambda }_{\max }{\rightarrow }0$, $p_{{\lambda }_{j}}^{\prime }(|{\theta }_{j0}^{*}|)=0$ and $p_{{\lambda }_{j}}^{\prime \prime }(|{\theta }_{j0}^{*}|)=0$. Thus, we have

$$\begin{aligned} p_{{\lambda }_{j}}^{\prime }(|{\widehat{\theta }}_{j}^{*}|) =\mathrm{{o}}_{P}(1)({\widehat{\theta }}_{j}^{*}-{\theta }_{j0}^{*}),\,\,j=1,\ldots ,t+l. \end{aligned}$$
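The vanishing of both derivative terms can be checked directly from the standard closed forms of the two penalties. The following Python sketch is for illustration only: it assumes the usual SCAD derivative with $a=3.7$ and the usual MCP derivative with concavity parameter $a=3$; these constants, the function names and the numerical values of ${\lambda }$ and ${\theta }$ are assumptions and are not taken from the paper. Once $|{\theta }_{j0}^{*}|>a{\lambda }_{j}$, which eventually holds because ${\lambda }_{\max }{\rightarrow }0$, the derivative is identically zero in a neighbourhood of ${\theta }_{j0}^{*}$, so its slope is zero as well.

```python
import numpy as np

def scad_derivative(theta, lam, a=3.7):
    """p'_lam(|theta|) for the SCAD penalty (standard closed form, a > 2)."""
    t = np.abs(np.asarray(theta, dtype=float))
    inner = lam * (t <= lam)                                    # equals lam on [0, lam]
    outer = np.maximum(a * lam - t, 0.0) / (a - 1.0) * (t > lam)  # decays to 0 at a*lam
    return inner + outer

def mcp_derivative(theta, lam, a=3.0):
    """p'_lam(|theta|) for the MCP penalty (standard closed form, a > 1)."""
    t = np.abs(np.asarray(theta, dtype=float))
    return np.maximum(lam - t / a, 0.0)

lam = 0.1          # hypothetical tuning parameter
theta0 = 1.0       # hypothetical nonzero true coefficient, |theta0| > a * lam
h = 1e-4           # step for a central finite difference of p'

for name, deriv in [("SCAD", scad_derivative), ("MCP", mcp_derivative)]:
    first = float(deriv(theta0, lam))
    second = float((deriv(theta0 + h, lam) - deriv(theta0 - h, lam)) / (2 * h))
    near_zero = float(deriv(1e-12, lam))   # approximates p'_lam(0+)
    print(name, first, second, near_zero)
```

The last printed column approximates $p_{{\lambda }}^{\prime }(0+)$, which equals ${\lambda }$ for both penalties.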

This together with (21) and (22) yields

$$\begin{aligned}&(\widetilde{\mathbf{D}}_{n}^{*})^{{\mathrm{T}}}{\mathbf{M}}_{n}^{*}\widetilde{\mathbf{D}}_{n}^{*} (\widehat{\varvec{\theta }}^{*}-{\varvec{\theta }}_{0}^{*}) -(\widetilde{\mathbf{D}}_{n}^{*})^{{\mathrm{T}}}{\mathbf{M}}_{n}^{*}\widetilde{\varvec{\varepsilon }}_{n}\nonumber \\&\quad -(\widetilde{\mathbf{D}}_{n}^{*})^{{\mathrm{T}}}{\mathbf{M}}_{n}^{*}({\mathbf{I}}_{n}-{\varvec{\varPi }}_{n}){\mathbf{V}}_{n} +\mathrm{{o}}_{P}(n)(\widehat{\varvec{\theta }}^{*}-{\varvec{\theta }}_{0}^{*}) ={\mathbf{0}}. \end{aligned}$$

(23)

It follows from (23) that

$$\begin{aligned} {\mathbf{0}}= & {} \left[ n^{-1}(\widetilde{\mathbf{D}}_{n}^{*})^{{\mathrm{T}}}{\mathbf{M}}_{n}^{*}\widetilde{\mathbf{D}}_{n}^{*} +\mathrm{{o}}_{P}(1)\right] {\sqrt{n}}(\widehat{\varvec{\theta }}^{*}-{\varvec{\theta }}_{0}^{*}) -\frac{1}{\sqrt{n}}(\widetilde{\mathbf{D}}_{n}^{*})^{{\mathrm{T}}}{\mathbf{M}}_{n}^{*}\widetilde{\varvec{\varepsilon }}_{n}\nonumber \\&-\frac{1}{\sqrt{n}}(\widetilde{\mathbf{D}}_{n}^{*})^{{\mathrm{T}}}{\mathbf{M}}_{n}^{*}({\mathbf{I}}_{n}-{\varvec{\varPi }}_{n}){\mathbf{V}}_{n}\nonumber \\&{\overset{\triangle }{=}}&\left[ C_{n3}+\mathrm{{o}}_{P}(1)\right] {\sqrt{n}}(\widehat{\varvec{\theta }}^{*}-{\varvec{\theta }}_{0}^{*}) -C_{n4}-C_{n5}. \end{aligned}$$

(24)

Note that ${\mathbf{M}}_{n}^{*}$ is an $n{\times }(t+l)$ idempotent matrix. Thus, by using the same arguments as those in the proof of Theorem 1, we can obtain

$$\begin{aligned}&\Vert {\mathbf{M}}_{n}^{*}({\mathbf{I}}_{n}-{\varvec{\varPi }}_{n}){\varvec{\varepsilon }}_{n}\Vert ^{2} =\mathrm{{O}}_{P}(t+l)=\mathrm{{O}}_{P}(1), \\&{\varvec{\varepsilon }}_{n}^{{\mathrm{T}}}{\mathbf{G}}_{ni}^{{\mathrm{T}}}({\mathbf{I}}_{n}-{\varvec{\varPi }}_{n}){\mathbf{M}}_{n}^{*} ({\mathbf{I}}_{n}-{\varvec{\varPi }}_{n}){\mathbf{G}}_{nj}{\varvec{\varepsilon }}_{n} =\mathrm{{O}}_{P}(t+l)=\mathrm{{O}}_{P}(1),\,i,j=1,\ldots ,t, \\&\Vert {\mathbf{M}}_{n}^{*}({\mathbf{I}}_{n}-{\varvec{\varPi }}_{n})\overline{\mathbf{D}}_{n}^{*}\Vert ^{2} =\mathrm{{O}}(n) \end{aligned}$$

and

$$\begin{aligned} \Vert {\mathbf{M}}_{n}^{*}({\mathbf{I}}_{n}-{\varvec{\varPi }}_{n}){\mathbf{V}}_{n}\Vert ^{2} =\mathrm{{O}}(nK^{-2{\delta }}). \end{aligned}$$

Let ${\overline{\varvec{\varepsilon }}}_{n}^{*}=({\mathbf{G}}_{n1}{\varvec{\varepsilon }}_{n}, \ldots ,{\mathbf{G}}_{nt}{\varvec{\varepsilon }}_{n},{\mathbf{0}}_{n{\times }l})$; then ${\mathbf{D}}_{n}^{*}=\overline{\mathbf{D}}_{n}^{*}+\overline{\varvec{\varepsilon }}_{n}^{*}$. Combining this with the Cauchy–Schwarz inequality and the four results above, we have

$$\begin{aligned} C_{n3}= & {} n^{-1}(\widetilde{\mathbf{D}}_{n}^{*})^{{\mathrm{T}}}{\mathbf{M}}_{n}^{*}\widetilde{\mathbf{D}}_{n}^{*}\nonumber \\= & {} n^{-1}{\overline{\mathbf{D}}_{n}^{*}}^{{\mathrm{T}}}({\mathbf{I}}_{n}-{\varvec{\varPi }}_{n}){\mathbf{M}}_{n}^{*}({\mathbf{I}}_{n}-{\varvec{\varPi }}_{n})\overline{\mathbf{D}}_{n}^{*} +n^{-1}{\overline{\varvec{\varepsilon }}_{n}^{*}}^{{\mathrm{T}}}({\mathbf{I}}_{n}-{\varvec{\varPi }}_{n}){\mathbf{M}}_{n}^{*}({\mathbf{I}}_{n}-{\varvec{\varPi }}_{n})\overline{\varvec{\varepsilon }}_{n}^{*}\nonumber \\&+\,n^{-1}{\overline{\mathbf{D}}_{n}^{*}}^{{\mathrm{T}}}({\mathbf{I}}_{n}-{\varvec{\varPi }}_{n}){\mathbf{M}}_{n}^{*}({\mathbf{I}}_{n}-{\varvec{\varPi }}_{n})\overline{\varvec{\varepsilon }}_{n}^{*} +n^{-1}{\overline{\varvec{\varepsilon }}_{n}^{*}}^{{\mathrm{T}}}({\mathbf{I}}_{n}-{\varvec{\varPi }}_{n}){\mathbf{M}}_{n}^{*}({\mathbf{I}}_{n}-{\varvec{\varPi }}_{n})\overline{\mathbf{D}}_{n}^{*}\nonumber \\= & {} {\varvec{\varSigma }}_{n,1}+\mathrm{{O}}_{P}(n^{-1})+\mathrm{{O}}_{P}(n^{-1/2})\nonumber \\= & {} {\varvec{\varSigma }}_{n,1}+\mathrm{{o}}_{P}(1), \end{aligned}$$

(25)

$$\begin{aligned} C_{n4}= & {} \frac{1}{\sqrt{n}}(\widetilde{\mathbf{D}}_{n}^{*})^{{\mathrm{T}}}{\mathbf{M}}_{n}^{*}\widetilde{\varvec{\varepsilon }}_{n}\nonumber \\= & {} \frac{1}{\sqrt{n}}(\overline{\mathbf{D}}_{n}^{*})^{{\mathrm{T}}}({\mathbf{I}}_{n}-{\varvec{\varPi }}_{n}){\mathbf{M}}_{n}^{*} ({\mathbf{I}}_{n}-{\varvec{\varPi }}_{n}){\varvec{\varepsilon }}_{n} +\frac{1}{\sqrt{n}}{\overline{\varvec{\varepsilon }}_{n}^{*}}^{{\mathrm{T}}}({\mathbf{I}}_{n}-{\varvec{\varPi }}_{n}){\mathbf{M}}_{n}^{*} ({\mathbf{I}}_{n}-{\varvec{\varPi }}_{n}){\varvec{\varepsilon }}_{n}\nonumber \\= & {} \frac{1}{\sqrt{n}}(\overline{\mathbf{D}}_{n}^{*})^{{\mathrm{T}}}({\mathbf{I}}_{n}-{\varvec{\varPi }}_{n}){\mathbf{M}}_{n}^{*} ({\mathbf{I}}_{n}-{\varvec{\varPi }}_{n}){\varvec{\varepsilon }}_{n}+\mathrm{{O}}_{P}(n^{-1/2})\nonumber \\= & {} \frac{1}{\sqrt{n}}(\overline{\mathbf{D}}_{n}^{*})^{{\mathrm{T}}}({\mathbf{I}}_{n}-{\varvec{\varPi }}_{n}){\mathbf{M}}_{n}^{*} ({\mathbf{I}}_{n}-{\varvec{\varPi }}_{n}){\varvec{\varepsilon }}_{n}+\mathrm{{o}}_{P}(1), \end{aligned}$$

(26)

and

$$\begin{aligned} C_{n5}= & {} \frac{1}{\sqrt{n}}(\widetilde{\mathbf{D}}_{n}^{*})^{{\mathrm{T}}}{\mathbf{M}}_{n}^{*}({\mathbf{I}}_{n}-{\varvec{\varPi }}_{n}){\mathbf{V}}_{n}\nonumber \\= & {} \frac{1}{\sqrt{n}}(\overline{\mathbf{D}}_{n}^{*})^{{\mathrm{T}}}({\mathbf{I}}_{n}-{\varvec{\varPi }}_{n}){\mathbf{M}}_{n}^{*} ({\mathbf{I}}_{n}-{\varvec{\varPi }}_{n}){\mathbf{V}}_{n}+\frac{1}{\sqrt{n}}{\overline{\varvec{\varepsilon }}_{n}^{*}}^{{\mathrm{T}}}({\mathbf{I}}_{n}-{\varvec{\varPi }}_{n}){\mathbf{M}}_{n}^{*} ({\mathbf{I}}_{n}-{\varvec{\varPi }}_{n}){\mathbf{V}}_{n}\nonumber \\= & {} \mathrm{{O}}_{P}({\sqrt{n}}K^{-\delta })+\mathrm{{O}}_{P}(K^{-\delta })\nonumber \\= & {} \mathrm{{O}}_{P}({\sqrt{n}}K^{-\delta })\nonumber \\= & {} \mathrm{{o}}_{P}(1). \end{aligned}$$

(27)

Combining (24)–(27), we obtain

$$\begin{aligned} \left[ {\varvec{\varSigma }}_{n,1}^{*}+\mathrm{{o}}_{P}(1)\right] {\sqrt{n}}(\widehat{\varvec{\theta }}^{*}-{\varvec{\theta }}_{0}^{*}) =\frac{1}{\sqrt{n}}(\overline{\mathbf{D}}_{n}^{*})^{{\mathrm{T}}}({\mathbf{I}}_{n}-{\varvec{\varPi }}_{n}){\mathbf{M}}_{n}^{*} ({\mathbf{I}}_{n}-{\varvec{\varPi }}_{n}){\varvec{\varepsilon }}_{n}+\mathrm{{o}}_{P}(1). \end{aligned}$$

By the central limit theorem and Slutsky's lemma, we have

$$\begin{aligned} {\sqrt{n}}(\widehat{\varvec{\theta }}^{*}-{\varvec{\theta }}_{0}^{*}) \overset{\text {D}}{\longrightarrow }N({\mathbf{0}},{\varvec{\varSigma }}). \end{aligned}$$

$\square $

Proof of Theorem 3

First, we prove part (a). By the definition of ${\widehat{\varvec{\nu }}}$, we have

$$\begin{aligned} {\widehat{\varvec{\nu }}}-{\varvec{\nu }}_{0}= & {} ({\mathbf{P}}_{n}^{{\mathrm{T}}}{\mathbf{P}}_{n})^{-1} {\mathbf{P}}_{n}^{{\mathrm{T}}}({\mathbf{Y}}_{n}-{\mathbf{D}}_{n}\widehat{\varvec{\theta }}) -{\varvec{\nu }}_{0}\\= & {} ({\mathbf{P}}_{n}^{{\mathrm{T}}}{\mathbf{P}}_{n})^{-1} {\mathbf{P}}_{n}^{{\mathrm{T}}}({\mathbf{D}}_{n}{\varvec{\theta }}_{0} +{\mathbf{P}}_{n}{\varvec{\nu }}_{0}+{\mathbf{V}}_{n}+{\varvec{\varepsilon }}_{n}-{\mathbf{D}}_{n}\widehat{\varvec{\theta }}) -{\varvec{\nu }}_{0}\\= & {} C_{n6}+C_{n7}+C_{n8}, \end{aligned}$$

where $C_{n6}=({\mathbf{P}}_{n}^{{\mathrm{T}}}{\mathbf{P}}_{n})^{-1}{\mathbf{P}}_{n}^{{\mathrm{T}}}{\mathbf{D}}_{n} ({\varvec{\theta }}_{0}-\widehat{\varvec{\theta }})$, $C_{n7}=({\mathbf{P}}_{n}^{{\mathrm{T}}}{\mathbf{P}}_{n})^{-1}{\mathbf{P}}_{n}^{{\mathrm{T}}}{\mathbf{V}}_{n}$ and $C_{n8}=({\mathbf{P}}_{n}^{{\mathrm{T}}}{\mathbf{P}}_{n})^{-1}{\mathbf{P}}_{n}^{{\mathrm{T}}}{\varvec{\varepsilon }}_{n}$.

By Assumption 3.4, Fact 2 and ${\mathbf{D}}_{n}=\overline{\mathbf{D}}_{n}+\overline{\varvec{\varepsilon }}_{n}$, we obtain

$$\begin{aligned} \Vert C_{n6}\Vert ^{2}= & {} (\widehat{\varvec{\theta }}-{\varvec{\theta }}_{0})^{{\mathrm{T}}} {\mathbf{D}}_{n}^{{\mathrm{T}}}{\mathbf{P}}_{n}({\mathbf{P}}_{n}^{{\mathrm{T}}}{\mathbf{P}}_{n})^{-1} ({\mathbf{P}}_{n}^{{\mathrm{T}}}{\mathbf{P}}_{n})^{-1}{\mathbf{P}}_{n}^{{\mathrm{T}}}{\mathbf{D}}_{n} (\widehat{\varvec{\theta }}-{\varvec{\theta }}_{0})\\&{\le }&n^{-1}{\eta }_{\min }^{-1}(n^{-1}{\mathbf{P}}_{n}^{{\mathrm{T}}}{\mathbf{P}}_{n}) (\widehat{\varvec{\theta }}-{\varvec{\theta }}_{0})^{{\mathrm{T}}} {\mathbf{D}}_{n}^{{\mathrm{T}}}{\varvec{\varPi }}_{n}{\mathbf{D}}_{n} (\widehat{\varvec{\theta }}-{\varvec{\theta }}_{0})\\&{\le }&{{\underline{c}}}_{P}^{-1}(\widehat{\varvec{\theta }}-{\varvec{\theta }}_{0})^{{\mathrm{T}}} (n^{-1}{\mathbf{D}}_{n}^{{\mathrm{T}}}{\mathbf{D}}_{n}) (\widehat{\varvec{\theta }}-{\varvec{\theta }}_{0})\\= & {} C_{n61}+C_{n62}+C_{n63}, \end{aligned}$$

where $C_{n61}={{\underline{c}}}_{P}^{-1}(\widehat{\varvec{\theta }}-{\varvec{\theta }}_{0})^{{\mathrm{T}}} (n^{-1}\overline{\mathbf{D}}_{n}^{{\mathrm{T}}}\overline{\mathbf{D}}_{n}) (\widehat{\varvec{\theta }}-{\varvec{\theta }}_{0})$, $C_{n62}={{\underline{c}}}_{P}^{-1}(\widehat{\varvec{\theta }}-{\varvec{\theta }}_{0})^{{\mathrm{T}}} (n^{-1}\overline{\varvec{\varepsilon }}_{n}^{{\mathrm{T}}}\overline{\varvec{\varepsilon }}_{n}) (\widehat{\varvec{\theta }}-{\varvec{\theta }}_{0})$ and $C_{n63}=2{{\underline{c}}}_{P}^{-1}(\widehat{\varvec{\theta }}-{\varvec{\theta }}_{0})^{{\mathrm{T}}} (n^{-1}\overline{\mathbf{D}}_{n}^{{\mathrm{T}}}\overline{\varvec{\varepsilon }}_{n}) (\widehat{\varvec{\theta }}-{\varvec{\theta }}_{0})$.

It follows from Theorem 1 that $\Vert \widehat{\varvec{\theta }}-{\varvec{\theta }}_{0}\Vert ^{2}= \mathrm{{O}}_{P}({{p_{n}/n}}+K^{-2{\delta }})$. This together with Assumption 3.4 yields

$$\begin{aligned} C_{n61}{\le }{{\underline{c}}}_{P}^{-1}{\eta }_{\max }(n^{-1}\overline{\mathbf{D}}_{n}^{{\mathrm{T}}} \overline{\mathbf{D}}_{n})\Vert \widehat{\varvec{\theta }}-{\varvec{\theta }}_{0}\Vert ^{2}=\mathrm{{O}}_{P}({{p_{n}/n}}+K^{-2{\delta }}). \end{aligned}$$

For $i,j=1,\ldots ,r$, by Assumption 1.3 and Facts 1 and 3, we have

$$\begin{aligned} \mathrm{{E}}(n^{-1}{\varvec{\varepsilon }}_{n}^{{\mathrm{T}}}{\mathbf{G}}_{ni}^{{\mathrm{T}}}{\mathbf{G}}_{nj}{\varvec{\varepsilon }}_{n}) =n^{-1}{\sigma }_{0}^{2}\mathrm{{tr}}({\mathbf{G}}_{ni}^{{\mathrm{T}}}{\mathbf{G}}_{nj}) {\le }{\sigma }_{0}^{2}{\eta }_{\max }({\mathbf{G}}_{ni}^{{\mathrm{T}}}{\mathbf{G}}_{nj}) =\mathrm{{O}}(1). \end{aligned}$$

This implies that $n^{-1}{\varvec{\varepsilon }}_{n}^{{\mathrm{T}}}{\mathbf{G}}_{ni}^{{\mathrm{T}}}{\mathbf{G}}_{nj}{\varvec{\varepsilon }}_{n} =\mathrm{{O}}_{P}(1)$ ($i,j=1,\ldots ,r$). Thus, we get

$$\begin{aligned} C_{n62}={{\underline{c}}}_{P}^{-1}({\varvec{\theta }}_{0}-\widehat{\varvec{\theta }})^{{\mathrm{T}}}(n^{-1}\overline{\varvec{\varepsilon }}_{n}^{{\mathrm{T}}}\overline{\varvec{\varepsilon }}_{n})({\varvec{\theta }}_{0}-\widehat{\varvec{\theta }})=\mathrm{{O}}_{P}({{p_{n}/n}}+K^{-2{\delta }}). \end{aligned}$$

By the Cauchy–Schwarz inequality and the orders of $C_{n61}$ and $C_{n62}$, we can show that

$$\begin{aligned} C_{n63}= & {} 2{{\underline{c}}}_{P}^{-1}({\varvec{\theta }}_{0}-\widehat{\varvec{\theta }})^{{\mathrm{T}}} (n^{-1}\overline{\mathbf{D}}_{n}^{{\mathrm{T}}}\overline{\varvec{\varepsilon }}_{n}) ({\varvec{\theta }}_{0}-\widehat{\varvec{\theta }})\\&{\le }&2\left| \left[ {{\underline{c}}}_{P}^{-1/2}n^{-1/2}({\varvec{\theta }}_{0}-\widehat{\varvec{\theta }})^{{\mathrm{T}}} \overline{\mathbf{D}}_{n}^{{\mathrm{T}}}\right] {\cdot }\left[ {{\underline{c}}}_{P}^{-1/2}n^{-1/2}\overline{\varvec{\varepsilon }}_{n} ({\varvec{\theta }}_{0}-\widehat{\varvec{\theta }})\right] \right| \\\le & {} 2(C_{n61}C_{n62})^{1/2}\\= & {} \mathrm{{O}}_{P}({{p_{n}/n}}+K^{-2{\delta }}). \end{aligned}$$

Combining the orders of $C_{n61}$, $C_{n62}$ and $C_{n63}$, we obtain

$$\begin{aligned} \Vert C_{n6}\Vert =\mathrm{{O}}_{P}({\sqrt{p_{n}/n}}+K^{-{\delta }}). \end{aligned}$$
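The passage from the three component orders to the order of $\Vert C_{n6}\Vert $ can be spelled out as follows. This is a worked restatement of the bounds already derived for $C_{n61}$, $C_{n62}$ and $C_{n63}$, using only the fact that $C_{n61}$ and $C_{n62}$ are nonnegative quadratic forms:

$$\begin{aligned} \Vert C_{n6}\Vert ^{2}&{\le }C_{n61}+C_{n62}+C_{n63} {\le }C_{n61}+C_{n62}+2(C_{n61}C_{n62})^{1/2} =\left( C_{n61}^{1/2}+C_{n62}^{1/2}\right) ^{2}\\&=\mathrm{{O}}_{P}({{p_{n}/n}}+K^{-2{\delta }}). \end{aligned}$$

Taking square roots and using ${\sqrt{a+b}}{\le }{\sqrt{a}}+{\sqrt{b}}$ for $a,b{\ge }0$ gives the stated order ${\sqrt{p_{n}/n}}+K^{-{\delta }}$.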

From the proof of (18), we have $\Vert {\mathbf{V}}_{n}\Vert ^{2}=\mathrm{{O}}(nK^{-2{\delta }})$. This together with Assumption 4.4 and Fact 2 yields

$$\begin{aligned} \Vert C_{n7}\Vert ^{2}= & {} {\mathbf{V}}_{n}^{{\mathrm{T}}}{\mathbf{P}}_{n}({\mathbf{P}}_{n}^{{\mathrm{T}}}{\mathbf{P}}_{n})^{-1} ({\mathbf{P}}_{n}^{{\mathrm{T}}}{\mathbf{P}}_{n})^{-1}{\mathbf{P}}_{n}^{{\mathrm{T}}}{\mathbf{V}}_{n}\\\le & {} n^{-1}{\eta }_{\min }^{-1}(n^{-1}{\mathbf{P}}_{n}^{{\mathrm{T}}}{\mathbf{P}}_{n}){\mathbf{V}}_{n}^{{\mathrm{T}}} {\varvec{\varPi }}_{n}{\mathbf{V}}_{n}\\\le & {} n^{-1}{{\underline{c}}}_{P}^{-1}\Vert {\mathbf{V}}_{n}\Vert ^{2}\\= & {} \mathrm{{O}}(K^{-2{\delta }}). \end{aligned}$$

This means that $\Vert C_{n7}\Vert =\mathrm{{O}}(K^{-{\delta }})$.

Under Assumption 4.4, we have

$$\begin{aligned} \mathrm{{E}}(\Vert C_{n8}\Vert ^{2})= & {} \mathrm{{E}}({\varvec{\varepsilon }}_{n}^{{\mathrm{T}}}{\mathbf{P}}_{n}({\mathbf{P}}_{n}^{{\mathrm{T}}}{\mathbf{P}}_{n})^{-1} ({\mathbf{P}}_{n}^{{\mathrm{T}}}{\mathbf{P}}_{n})^{-1}{\mathbf{P}}_{n}^{{\mathrm{T}}}{\varvec{\varepsilon }}_{n})\\\le & {} n^{-1}{\eta }_{\min }^{-1}(n^{-1}{\mathbf{P}}_{n}^{{\mathrm{T}}}{\mathbf{P}}_{n}) \mathrm{{E}}({\varvec{\varepsilon }}_{n}^{{\mathrm{T}}} {\mathbf{P}}_{n}({\mathbf{P}}_{n}^{{\mathrm{T}}}{\mathbf{P}}_{n})^{-1}{\mathbf{P}}_{n}^{{\mathrm{T}}}{\varvec{\varepsilon }}_{n})\\= & {} n^{-1}{{\underline{c}}}_{P}^{-1}{\sigma }_{0}^{2} \mathrm{{tr}}({\mathbf{P}}_{n}({\mathbf{P}}_{n}^{{\mathrm{T}}}{\mathbf{P}}_{n})^{-1}{\mathbf{P}}_{n}^{{\mathrm{T}}})\\= & {} \mathrm{{O}}(K/n). \end{aligned}$$

This implies that $\Vert C_{n8}\Vert =\mathrm{{O}}_{P}({\sqrt{K/n}})$.
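The $\mathrm{{O}}(K/n)$ order rests on the fact that the projection matrix ${\mathbf{P}}_{n}({\mathbf{P}}_{n}^{{\mathrm{T}}}{\mathbf{P}}_{n})^{-1}{\mathbf{P}}_{n}^{{\mathrm{T}}}$ is idempotent with trace $K$, so that the expected quadratic form in ${\varvec{\varepsilon }}_{n}$ equals ${\sigma }_{0}^{2}K$. A minimal numerical sketch of this fact is given below. It is not part of the original proof: the Gaussian random matrix standing in for ${\mathbf{P}}_{n}$, the values of $n$, $K$ and ${\sigma }_{0}^{2}$, and the number of Monte Carlo draws are all assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n, K = 200, 6                       # hypothetical sample size and number of basis terms
P = rng.normal(size=(n, K))         # stand-in for the n x K basis matrix P_n

# Projection onto the column space of P: P (P^T P)^{-1} P^T
Pi = P @ np.linalg.solve(P.T @ P, P.T)

trace_Pi = np.trace(Pi)                          # should equal K up to rounding
idempotent_gap = np.max(np.abs(Pi @ Pi - Pi))    # should be numerically zero

# Monte Carlo estimate of E(eps^T Pi eps) with i.i.d. N(0, sigma2) errors
sigma2 = 1.0
draws = rng.normal(scale=np.sqrt(sigma2), size=(5000, n))
mean_quad = np.mean(np.einsum("ri,ij,rj->r", draws, Pi, draws))

print(trace_Pi, idempotent_gap, mean_quad)
```

Dividing the expected quadratic form by $n$ and by the lower eigenvalue bound ${{\underline{c}}}_{P}$ reproduces the $K/n$ rate used above.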

Combining the orders of $C_{n6}$, $C_{n7}$ and $C_{n8}$ with the triangle inequality, we obtain

$$\begin{aligned} \Vert {\widehat{\varvec{\nu }}}-{\varvec{\nu }}_{0}\Vert {\le }\Vert C_{n6}\Vert +\Vert C_{n7}\Vert +\Vert C_{n8}\Vert =\mathrm{{O}}_{P}({\sqrt{p_{n}/n}}+{\sqrt{K/n}}+K^{-{\delta }}). \end{aligned}$$

This together with Assumptions 4.1 and 4.3 yields

$$\begin{aligned} {\sup }_{{\mathbf{z}}{\in }{{\mathcal {Z}}}}|{{\widehat{m}}}({\mathbf{z}})-{{m}}_{0}({\mathbf{z}})|= & {} {\sup }_{{\mathbf{z}}{\in }{{\mathcal {Z}}}}|{\mathbf{p}}^{K}({\mathbf{z}})^{{\mathrm{T}}}{\widehat{\varvec{\nu }}} -{{m}}_{0}({\mathbf{z}})|\\= & {} {\sup }_{{\mathbf{z}}{\in }{{\mathcal {Z}}}}|{\mathbf{p}}^{K}({\mathbf{z}})^{{\mathrm{T}}}({\widehat{\varvec{\nu }}}- {{\varvec{\nu }}}_{0})+{\mathbf{p}}^{K}({\mathbf{z}})^{{\mathrm{T}}}{{\varvec{\nu }}}_{0} -{{m}}_{0}({\mathbf{z}})|\\\le & {} {\sup }_{{\mathbf{z}}{\in }{{\mathcal {Z}}}}|{\mathbf{p}}^{K}({\mathbf{z}})^{{\mathrm{T}}}({\widehat{\varvec{\nu }}}- {{\varvec{\nu }}}_{0})| +{\sup }_{{\mathbf{z}}{\in }{{\mathcal {Z}}}}|{{m}}_{0}({\mathbf{z}})-{\mathbf{p}}^{K}({\mathbf{z}})^{{\mathrm{T}}}{{\varvec{\nu }}}_{0}|\\\le & {} \Vert {\widehat{\varvec{\nu }}}- {{\varvec{\nu }}}_{0}\Vert {\cdot }{\sup }_{{\mathbf{z}}{\in }{{\mathcal {Z}}}}\Vert {\mathbf{p}}^{K}({\mathbf{z}})\Vert + {\sup }_{{\mathbf{z}}{\in }{{\mathcal {Z}}}}|m_{0}({\mathbf{z}})-{\mathbf{p}}^{K}({\mathbf{z}}) ^{{\mathrm{T}}}{\varvec{\nu }}_{0}|\\\le & {} {\zeta }(K)\Vert {\widehat{\varvec{\nu }}}- {{\varvec{\nu }}}_{0}\Vert +\mathrm{{O}}(K^{-{\delta }})\\= & {} \mathrm{{O}}_{P}({\zeta }(K)({\sqrt{p_{n}/n}}+{\sqrt{K/n}}+K^{-{\delta }})). \end{aligned}$$

Thus, we complete the proof of part (b). $\square $
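The key inequality in the last display, $|{\mathbf{p}}^{K}({\mathbf{z}})^{{\mathrm{T}}}({\widehat{\varvec{\nu }}}-{\varvec{\nu }}_{0})|{\le }{\zeta }(K)\Vert {\widehat{\varvec{\nu }}}-{\varvec{\nu }}_{0}\Vert $ uniformly in ${\mathbf{z}}$, can be illustrated numerically. The sketch below is only an illustration under stated assumptions: it uses a scalar $z$ on $[0,1]$, a power-series basis $(1,z,\ldots ,z^{K-1})$, a finite grid in place of the supremum, and arbitrary coefficient vectors. None of these choices, nor the function name `power_basis`, comes from the paper. For this particular basis the supremum of $\Vert {\mathbf{p}}^{K}(z)\Vert $ is attained at $z=1$ and equals ${\sqrt{K}}$.

```python
import numpy as np

def power_basis(z, K):
    """Rows are p^K(z)^T = (1, z, ..., z^(K-1)) for each point in z (1-D array)."""
    z = np.asarray(z, dtype=float)
    return np.vander(z, N=K, increasing=True)

K = 5                                    # hypothetical number of basis terms
grid = np.linspace(0.0, 1.0, 1001)       # grid approximation of the support [0, 1]
B = power_basis(grid, K)

# zeta(K): largest Euclidean norm of the basis vector over the grid
zeta_K = np.max(np.linalg.norm(B, axis=1))

rng = np.random.default_rng(1)
nu0 = rng.normal(size=K)                     # hypothetical true coefficients
nu_hat = nu0 + 0.01 * rng.normal(size=K)     # hypothetical estimate

lhs = np.max(np.abs(B @ (nu_hat - nu0)))         # sup_z |p^K(z)^T (nu_hat - nu0)|
rhs = zeta_K * np.linalg.norm(nu_hat - nu0)      # zeta(K) * ||nu_hat - nu0||

print(zeta_K, lhs, rhs, lhs <= rhs)
```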

About this article

Cite this article

Li, T., Kang, X. Variable selection of higher-order partially linear spatial autoregressive model with a diverging number of parameters. Stat Papers 63, 243–285 (2022). https://doi.org/10.1007/s00362-021-01241-4

Received: 02 June 2020
Revised: 12 May 2021
Published: 27 May 2021
Issue Date: February 2022
DOI: https://doi.org/10.1007/s00362-021-01241-4
