Abstract
We consider variable selection in fully nonparametric regression models via the nonconcave penalized least squares method with a B-spline-based single-index approximation. Under some regularity conditions, we show that the resulting estimators with the SCAD and hard thresholding penalties enjoy \(\sqrt{n}\)-consistency and the oracle property. Simulation studies and a real data example illustrate the performance of the proposed variable selection procedure.
References
Antoniadis A (1997) Wavelets in Statistics: A review (with discussion). J Ital Stat Soc 6:97–144
Antoniadis A, Fryzlewicz P, Letué F (2010) The Dantzig selector in Cox's proportional hazards model. Scand J Stat 37(4):531–552
Carroll R, Fan J, Gijbels I, Wand M (1997) Generalized partially linear single-index models. J Am Stat Assoc 92:477–489
Ciuperca G (2014) Model selection by LASSO methods in a change-point model. Stat Pap 55:349–374
Candes E, Tao T (2007) The Dantzig selector: statistical estimation when \(p\) is much larger than \(n\). Ann Stat 35(6):2313–2351
de Boor C (1978) A practical guide to splines. Springer, New York
Fan J (1997) Comment on “Wavelets in statistics: a review” by A. Antoniadis. J Ital Stat Soc 6(2):131–138
Fan J, Li R (2001) Variable selection via nonconcave penalized likelihood and its oracle properties. J Am Stat Assoc 96(456):1348–1360
Fan J, Li R (2002) Variable selection for Cox’s proportional hazards model and frailty model. Ann Stat 30(1):74–99
Fan J, Lv J (2010) A selective overview of variable selection in high dimensional feature space. Stat Sin 20(1):101–148
Härdle W, Hall P, Ichimura H (1993) Optimal smoothing in single-index models. Ann Stat 21:157–178
Hall P (1989) On projection pursuit regression. Ann Stat 17:573–588
Horowitz J, Härdle W (1996) Direct semiparametric estimation of single-index models with discrete covariates. J Am Stat Assoc 91:1632–1640
Hristache M, Juditsky A, Spokoiny V (2001) Direct estimation of the index coefficients in a single-index model. Ann Stat 29:595–623
Ichimura H (1993) Semiparametric least squares (SLS) and weighted SLS estimation of single-index models. J Econom 58:71–120
Klein R, Spady R (1993) An efficient semiparametric estimator for binary response models. Econometrica 61:387–421
Knight K, Fu W (2000) Asymptotics for lasso-type estimators. Ann Stat 28(5):1356–1378
Kong E, Xia Y (2007) Variable selection for the single-index model. Biometrika 94:217–229
Li K (1991) Sliced inverse regression for dimension reduction. J Am Stat Assoc 86:316–342
Lu W, Zhang H (2007) Variable selection for proportional odds model. Stat Med 26(20):3771–3781
Neykov N, Filzmoser P, Neytchev P (2014) Ultrahigh dimensional variable selection through the penalized maximum trimmed likelihood estimator. Stat Pap 55:187–208
Peng H, Huang T (2011) Penalized least squares for single index models. J Stat Plan Inference 141:1362–1379
Penrose K, Nelson A, Fisher A (1985) Generalized body composition prediction equation for men using simple measurement techniques. Med Sci Sports Exerc 17:189
Powell J, Stock J, Stoker T (1989) Semiparametric estimation of index coefficients. Econometrica 57:1403–1430
Tibshirani R (1996) Regression shrinkage and selection via the lasso. J R Stat Soc Ser B (Methodological) 58(1):267–288
Tibshirani R (1997) The lasso method for variable selection in the Cox model. Stat Med 16(4):385–395
Wang L, Yang L (2009) Spline estimation of single index models. Stat Sin 19:765–783
Wang H (2009) Bayesian estimation and variable selection for single index models. Comput Stat Data Anal 53:2617–2627
Wang H, Li R, Tsai C (2007) Tuning parameter selectors for the smoothly clipped absolute deviation method. Biometrika 94(3):553–568
Xia Y, Tong H, Li WK, Zhu L (2002) An adaptive estimation of dimension reduction space (with discussion). J R Stat Soc Ser B 64:363–410
Xia Y, Li WK, Tong H, Zhang D (2004) A goodness-of-fit test for single-index models. Stat Sin 14:1–39
Xia Y, Li W (1999) On single-index coefficient regression models. J Am Stat Assoc 94:1275–1285
Xu D, Zhang Z, Wu L (2014) Variable selection in high-dimensional double generalized linear models. Stat Pap 55:327–347
Zeng P, He T, Zhu Y (2012) A lasso-type approach for estimation and variable selection in single index models. J Comput Graph Stat 21:92–109
Zhang H, Lu W (2007) Adaptive lasso for Cox's proportional hazards model. Biometrika 94(3):691–703
Zhang H, Lu W, Wang H (2010) On sparse estimation for semiparametric linear transformation models. J Multivar Anal 101(7):1594–1606
Zhu L, Qian L, Lin J (2011) Variable selection in a class of single-index models. Ann Inst Stat Math 63:1277–1293
Zou H (2006) The adaptive lasso and its oracle properties. J Am Stat Assoc 101(476):1418–1429
Acknowledgments
We are grateful to the editor, associate editor, and referees for their helpful comments, which led to the revised version of this paper. This work is partially supported by the National Natural Science Foundation of China (11201190, 11571148, 11271195, 11171112), the Postdoctoral Science Foundation of China (2014M550432), the Humanities and Social Fund of the Ministry of Education in China (12YJC910004), the Postdoctoral Initial Foundation in Guangzhou (gzhubsh2013004), the Specialized Research Fund for the Doctoral Program of Higher Education (20124410110002), a project funded by the Priority Academic Program Development (PAPD) of Jiangsu Higher Education Institutions, and the “Qinglan” Project in Jiangsu.
Appendix
In this section, we prove Theorems 1 and 2 under Assumptions (A1)–(A6) of Wang and Yang (2009).
Proof of Theorem 1
Let \(\alpha _n=n^{-1/2}+a_n\). It suffices to show that for any given \(\varepsilon \in (0,1)\), there exists a large constant \(C\) such that
Since \(p_{\lambda _n}(0)=0\) and \(p_{\lambda _n}(\theta )>0\) for \(\theta >0\), we have
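The argument above uses only that the penalty vanishes at zero and is positive elsewhere. As an illustration (not part of the paper's code), the SCAD penalty of Fan and Li (2001) and the hard thresholding penalty of Fan (1997) can be sketched in Python; the function names and the choice \(a=3.7\) (the value Fan and Li recommend) are ours:

```python
import numpy as np

def scad_penalty(theta, lam, a=3.7):
    """SCAD penalty of Fan and Li (2001), evaluated at |theta|."""
    t = np.abs(theta)
    p1 = lam * t                                      # |theta| <= lam
    p2 = -(t**2 - 2*a*lam*t + lam**2) / (2*(a - 1))   # lam < |theta| <= a*lam
    p3 = (a + 1) * lam**2 / 2                         # |theta| > a*lam
    return np.where(t <= lam, p1, np.where(t <= a*lam, p2, p3))

def hard_penalty(theta, lam):
    """Hard thresholding penalty: lam^2 - (|theta| - lam)^2 I(|theta| < lam)."""
    t = np.abs(theta)
    return lam**2 - (t - lam)**2 * (t < lam)

# both penalties vanish at 0 and are strictly positive away from 0
lam = 0.5
assert scad_penalty(0.0, lam) == 0 and hard_penalty(0.0, lam) == 0
assert np.all(scad_penalty(np.linspace(0.01, 3, 50), lam) > 0)
assert np.all(hard_penalty(np.linspace(0.01, 3, 50), lam) > 0)
```

Both penalties are also constant for large \(|\theta|\), which is what keeps large coefficients nearly unbiased.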
By Theorems 1 and 2 of Wang and Yang (2009), for any \(\beta ^{(1)}\in \{\beta ^{(1)}: \beta ^{(1)}=\beta _0^{(1)}+\alpha _n\varvec{u},\ ||\varvec{u}||= C\}\), we have
Note that \(H^*(\beta _0^{(1)})\) is a positive definite matrix. The first term in the last equality of (14) is of order \(C^2\alpha _n^2\), while the second is of order \(C\alpha _n^2\). Therefore, for sufficiently large \(C\), the second term is dominated by the first. On the other hand, by Taylor expansion, the second term of (13) is bounded by
If \(b_n\rightarrow 0\), the second term of (13) is dominated by the first term of (14). Thus, for sufficiently large \(C\), (12) holds, which means that with probability at least \(1-\varepsilon\) there exists a local minimizer in the ball \(\{\beta ^{(1)}:\beta ^{(1)}=\beta _0^{(1)}+\alpha _n\varvec{u},\ ||\varvec{u}||\le C\}\). Therefore, there exists a local minimizer \(\hat{\beta }_n^{(1)}\) such that \(||\hat{\beta }_n^{(1)}-\beta _0^{(1)}||=O_P(n^{-1/2}+a_n)\). \(\square \)
Proof of Theorem 2
(i) It suffices to prove that
for any given \(\beta _1^{(1)}\) satisfying \(||\beta _1^{(1)}-\beta _{10}^{(1)}||=O_P(n^{-1/2})\) and any constant C.
Let \(S_j^*(\beta ^{(1)})\) denote the \(j\)th element of \(S^*(\beta ^{(1)})\). By a Taylor expansion of \(S_j^*(\beta ^{(1)})\) around \(\beta _0^{(1)}\), for \(||\beta ^{(1)}-\beta _0^{(1)}||=O_P(n^{-1/2})\) we have
From (A.32) and Theorem 2 of Wang and Yang (2009), we obtain
where the \(l_{ji}\) are defined in Theorem 2 of Wang and Yang (2009). So for \(||\beta ^{(1)}-\beta _0^{(1)}||=O_P(n^{-1/2})\), from (16) we have
Therefore, for \(||\beta ^{(1)}-\beta _0^{(1)}||=O_P(n^{-1/2})\) and \(j=s+1,s+2,\cdots ,p-1\), we have that
Since \(\liminf _{n\rightarrow \infty }\liminf _{\theta \rightarrow 0+}\dot{p}_{\lambda _n}(\theta )/\lambda _n=c>0,\) \(\frac{1}{\sqrt{n}\lambda _n}\rightarrow 0\) and \(|\text {sign}(\beta _j)|=1\) for any \(\beta _j\ne 0\),
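The condition \(\liminf_{\theta\rightarrow 0+}\dot{p}_{\lambda}(\theta)/\lambda=c>0\) can be checked directly for the SCAD penalty, whose derivative (Fan and Li 2001) equals \(\lambda\) on \((0,\lambda]\), so that \(c=1\). A minimal numeric sketch, with the helper name our own:

```python
import numpy as np

def scad_derivative(theta, lam, a=3.7):
    """First derivative of the SCAD penalty for theta > 0 (Fan and Li 2001):
    lam * { I(theta <= lam) + (a*lam - theta)_+ / ((a-1)*lam) * I(theta > lam) }."""
    t = np.abs(theta)
    return lam * ((t <= lam) + np.maximum(a*lam - t, 0) / ((a - 1)*lam) * (t > lam))

lam = 0.1
for theta in [1e-2, 1e-4, 1e-6]:
    # near the origin the ratio p'_lam(theta)/lam is exactly 1, so c = 1 > 0
    assert abs(float(scad_derivative(theta, lam)) / lam - 1.0) < 1e-12

# far from the origin the derivative vanishes, leaving large coefficients unpenalized
assert float(scad_derivative(1.0, lam)) == 0.0
```

This is the property that drives the sparsity part of the proof: the penalty's slope at small \(|\beta_j|\) stays of order \(\lambda_n\), which dominates the \(O_P(n^{-1/2})\) score term when \(\sqrt{n}\lambda_n\rightarrow\infty\).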
and so the second term in the square brackets of the last equation in (17) is dominated by the first term when \(n\) is large enough. Hence the derivative and \(\beta _j\) have the same sign, and therefore (15) holds.
(ii) From \(a_n=O(n^{-1/2})\) and Theorem 1, there exists a local \(\sqrt{n}\)-consistent minimizer, \(\hat{\beta }_{1n}^{(1)}\), of \(Q((\beta _1^{(1)'},\varvec{0}')')\) satisfying
Set \(\hat{\beta }_n^{(1)}=(\hat{\beta }_{1n}^{(1)'},\varvec{0}')'\) and let \(S_1^*(\beta ^{(1)})\) be the vector consisting of the first \(s\) components of \(S^*(\beta ^{(1)})\); then
where \(\beta ^{(1)*}=(\beta _1^{(1)*'},\beta _2^{(1)*'})'\) lies on the line segment between \(\hat{\beta }_n^{(1)}\) and \(\beta _0^{(1)}\). From Theorem 1 above and Theorems 1 and 2 of Wang and Yang (2009), (9) holds. This completes the proof. \(\square \)
Cite this article
Li, J., Li, Y. & Zhang, R. B spline variable selection for the single index models. Stat Papers 58, 691–706 (2017). https://doi.org/10.1007/s00362-015-0721-z