M-estimators for single-index model using B-spline

Abstract

The single-index model is an important tool in multivariate nonparametric regression. This paper deals with M-estimators for the single-index model. Unlike existing M-estimation approaches for the single-index model, the unknown link function is approximated by a B-spline, and the M-estimators of the parameter and of the nonparametric component are obtained in one step. The proposed M-estimator of the unknown function is shown to attain the optimal global rate of convergence for nonparametric regression established by Stone (Ann Stat 8:1348–1360, 1980; Ann Stat 10:1040–1053, 1982), and the M-estimator of the parameter is \(\sqrt{n}\)-consistent and asymptotically normal. A small-sample simulation study shows that the M-estimators proposed in this paper are robust. An application to real data illustrates the estimator’s usefulness.
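
As a rough illustration of the procedure just described, the sketch below (not the authors' code) represents the link function by a B-spline basis in the index \(x^\tau \beta \) and minimizes an M-type criterion jointly over the index parameter and the spline coefficients. The Huber loss, cubic splines, equally spaced interior knots, and the identification \(\beta _1=1\) are illustrative assumptions, not necessarily the paper's exact choices.

```python
# Minimal sketch of one-step B-spline M-estimation for y = g(x^T beta) + e.
# Illustrative assumptions: Huber loss, cubic splines, equally spaced interior
# knots, and the identification beta_1 = 1.
import numpy as np
from scipy.interpolate import BSpline
from scipy.optimize import minimize

def huber(r, c=1.345):
    # Huber rho: quadratic near zero, linear in the tails.
    a = np.abs(r)
    return np.where(a <= c, 0.5 * r ** 2, c * (a - 0.5 * c))

def bspline_basis(u, degree=3, n_inner=5):
    # Clamped knot vector on the observed range of the index values u.
    inner = np.linspace(u.min(), u.max(), n_inner + 2)[1:-1]
    t = np.r_[[u.min()] * (degree + 1), inner, [u.max()] * (degree + 1)]
    n_basis = len(t) - degree - 1
    return BSpline(t, np.eye(n_basis), degree)(u)   # n x n_basis design matrix

def fit_sim(x, y, degree=3, n_inner=5):
    n, p = x.shape
    n_basis = degree + 1 + n_inner

    def objective(par):
        beta = np.r_[1.0, par[:p - 1]]               # beta_1 = 1 for identifiability
        delta = par[p - 1:]
        B = bspline_basis(x @ beta, degree, n_inner)  # knots re-placed on current index range
        return np.sum(huber(y - B @ delta))           # one-step criterion in (beta, delta)

    start = np.zeros(p - 1 + n_basis)
    res = minimize(objective, start, method="Powell")
    return np.r_[1.0, res.x[:p - 1]], res.x[p - 1:]

# Toy usage: rng = np.random.default_rng(0); x = rng.normal(size=(200, 3))
# y = np.sin(x @ np.array([1.0, 0.5, -0.5])) + 0.3 * rng.standard_t(3, 200)
# beta_hat, delta_hat = fit_sim(x, y)
```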


References

  • Carroll RJ, Fan J, Gijbels I, Wand MP (1997) Generalized partially linear single-index models. J Am Stat Assoc 92:477–489

  • Cheng G, Huang JZ (2010) Bootstrap consistency for general semiparametric M-estimation. Ann Stat 38:2884–2915

  • Cox DD (1983) Asymptotics for M-type smoothing splines. Ann Stat 11:530–551

  • Delecroix M, Hristache M, Patilea V (2006) On semiparametric M-estimation in single-index regression. J Stat Plan Infer 136:730–769

  • Eggleston HG (1958) Convexity. Cambridge tracts in mathematics and mathematical physics, vol 47. Cambridge University Press, Cambridge

  • Elmi A, Ratcliffe SJ, Parrey S, Guo WS (2011) A B-spline based semiparametric nonlinear mixed effects model. J Comput Graph Stat 20:492–509

  • Fan J, Hu T, Truong Y (1994) Robust nonparametric function estimation. Scand J Stat 21:433–446

  • Gao J, Liang H (1997) Statistical inference in single-index and partially nonlinear models. Ann Inst Stat Math 49:493–517

  • Hampel FR, Ronchetti EM, Rousseeuw PJ, Stahel WA (1986) Robust statistics: the approach based on influence functions. Wiley, New York

  • Härdle W, Stoker TM (1989) Investigating smooth multiple regression by the method of average derivatives. J Am Stat Assoc 84:986–995

  • Härdle W, Hall P, Ichimura H (1993) Optimal smoothing in single-index models. Ann Stat 21:157–178

  • He X, Shi P (1994) Convergence rate of B-spline estimators of nonparametric conditional quantile functions. J Nonparametr Stat 3:299–308

  • He X, Shi P (1996) Bivariate tensor-product B-splines in a partly linear model. J Multivar Anal 58:162–181

  • He X, Zhu ZY, Fung WK (2002) Estimation in a semiparametric model for longitudinal data with unspecified dependence structure. Biometrika 89:579–590

  • He X, Fung WK, Zhu ZY (2005) Robust estimation in generalized partial linear models for clustered data. J Am Stat Assoc 100:1176–1184

  • Hristache M, Juditsky A, Spokoiny V (2001) Direct estimation of the index coefficient in a single-index model. Ann Stat 29:595–623

  • Huang JZ (2003) Local asymptotics for polynomial spline regression. Ann Stat 31:1600–1635

  • Huang JZ, Liu L (2006) Polynomial spline estimation and inference of proportional hazards regression models with flexible relative risk form. Biometrics 62:793–802

  • Huang J, Wu C, Zhou L (2002) Varying-coefficient models and basis function approximations for the analysis of repeated measurements. Biometrika 89:111–128

  • Huber PJ (1964) Robust estimation of a location parameter. Ann Math Stat 35:73–101

  • Huber PJ (1981) Robust statistics. Wiley, New York

  • Ichimura H (1993) Semiparametric least squares (SLS) and weighted SLS estimation of single-index models. J Econom 58:71–120

  • Jiang CR, Wang JL (2011) Functional single index models for longitudinal data. Ann Stat 39:362–388

  • Kozek AS (2003) On M-estimators and normal quantiles. Ann Stat 31:1170–1185

  • Li JB, Zhang RQ (2011) Partially varying coefficient single-index proportional hazards regression models. Comput Stat Data Anal 55:389–400

  • Liang H, Liu X, Li RZ, Tsai CL (2010) Estimation and testing for partially linear single-index models. Ann Stat 38:3811–3836

  • Lin X, Wang N, Welsh AH, Carroll RJ (2004) Equivalent kernel of smoothing splines in nonparametric regression for clustered/longitudinal data. Biometrika 91:177–193

  • Naik P, Tsai CL (2000) Partial least squares estimator for single-index models. J R Stat Soc Ser B 62:763–771

  • Powell JL, Stock JH, Stoker TM (1989) Semiparametric estimation of index coefficients. Econometrica 57:1403–1430

  • Rice J (1986) Convergence rates for partially splined models. Stat Probab Lett 4:203–208

  • Schumaker LL (1981) Spline functions. Wiley, New York

  • Shi P, Li G (1995) Optimal global rates of convergence of B-spline M-estimators for nonparametric regression. Stat Sinica 5:303–318

  • Silverman BW (1984) Spline smoothing: the equivalent variable kernel method. Ann Stat 12:898–916

  • Speckman P (1988) Kernel smoothing in partial linear models. J R Stat Soc Ser B 50:413–436

  • Stoker TM (1986) Consistent estimation of scaled coefficients. Econometrica 54:1461–1481

  • Stone C (1980) Optimal rates of convergence for nonparametric estimators. Ann Stat 8:1348–1360

  • Stone C (1982) Optimal global rates of convergence for nonparametric regression. Ann Stat 10:1040–1053

  • Wang N (2003) Marginal nonparametric kernel regression accounting for within-subject correlation. Biometrika 90:43–52

  • Wang L, Yang LJ (2009) Spline estimation of single-index models. Stat Sinica 19:765–783

  • Welsh AH (1996) Robust estimation of smooth regression and spread functions and their derivatives. Stat Sinica 6:347–366

  • Welsh AH, Lin X, Carroll RJ (2002) Marginal nonparametric regression: locality and efficiency of spline and kernel methods. J Am Stat Assoc 97:482–493

  • Wu WB (2007) M-estimation of linear models with dependent errors. Ann Stat 35:495–521

  • Wu TZ, Yu KM, Yu Y (2010) Single-index quantile regression. J Multivar Anal 101:1607–1621

  • Xia YC, Härdle W (2006) Semi-parametric estimation of partially linear single-index models. J Multivar Anal 97:1162–1184

  • Xia YC, Tong H, Li WK, Zhu LX (2002) An adaptive estimation of dimension reduction space. J R Stat Soc Ser B 64:363–410

  • Yu Y, Ruppert D (2002) Penalized spline estimation for partially linear single-index models. J Am Stat Assoc 97:1042–1054


Author information


Correspondence to Zhongyi Zhu.

Appendix

Appendix

In this section, we prove the results of Theorems 1 and 2. Let

$$\begin{aligned} v_i(\theta _0)&= B^{\prime }(x_i^\tau \beta _0)^\tau \delta _0 x_i,\quad i=1,\ldots , n,\\ V(\theta _0)&= (v_1(\theta _0), \ldots , v_n(\theta _0))^\tau ,\\ Z_n&= (B(x_1^\tau \beta _0),\ldots ,B(x_n^\tau \beta _0))^\tau ,\quad P=Z_n(Z_n^\tau Z_n)^{-1}Z_n^\tau , \end{aligned}$$

and let \(H_n^2=NZ_n^\tau Z_n\), \(z_i=H_n^{-1}B(x_i^\tau \beta _0)\), and let \(\lambda _n\) denote the smallest eigenvalue of \((N/n)H_n^2\).
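
As a computational aside (not part of the paper), the quantities just defined can be formed directly once a B-spline basis and its derivative are available. In the sketch below the knot vector `t`, the spline degree, and the values \((\beta _0, \delta _0)\) are treated as given; the final line is a sample version of the limit appearing in Lemma 6.1 below.

```python
# Hypothetical helper assembling Z_n, P, H_n^2, V(theta_0) from a given knot vector.
import numpy as np
from scipy.interpolate import BSpline

def design_quantities(x, beta0, delta0, t, degree=3):
    n = x.shape[0]
    N = len(t) - degree - 1
    u = x @ beta0
    spl = BSpline(t, np.eye(N), degree)
    Zn = spl(u)                                   # rows: B(x_i^tau beta_0)^tau
    dB = spl.derivative()(u)                      # rows: B'(x_i^tau beta_0)^tau
    P = Zn @ np.linalg.solve(Zn.T @ Zn, Zn.T)     # Z_n (Z_n^tau Z_n)^{-1} Z_n^tau
    Hn2 = N * Zn.T @ Zn                           # H_n^2 = N Z_n^tau Z_n
    V = (dB @ delta0)[:, None] * x                # row i: v_i(theta_0)^tau
    Sigma_hat = V.T @ (np.eye(n) - P) @ V / n     # sample version of the limit in Lemma 6.1
    return Zn, P, Hn2, V, Sigma_hat
```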

Lemma 6.1

If Assumptions 1–2 are satisfied, then

$$\begin{aligned} \lim _{n\rightarrow \infty }\frac{1}{n} V(\theta _0)^\tau (I-P)V(\theta _0)=\Sigma , \end{aligned}$$
(6.1)

where \(I\) is an identity matrix.

Proof

For the proof of Lemma 6.1, we refer to Gao and Liang (1997). \(\square \)

Lemma 6.2

Under Assumption 1, there exists a constant \(c\) such that

$$\begin{aligned} \sup _{x\in X}|g(\beta _0^\tau x)-B(\beta _0^\tau x)^\tau \delta _0|\le ck_n^{-r}, \end{aligned}$$
(6.2)

where \(k_n\) is the number of knots, which is of the same order as \(N\).

This lemma quantifies the approximation power of B-splines. Its proof follows readily from Corollary 6.21 in Schumaker (1981).
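
For intuition only (this is not part of the proof), the bound (6.2) can be illustrated numerically: the sup-norm error of a least-squares spline approximation to a smooth test function shrinks at roughly \(k_n^{-(\text{degree}+1)}\) as the number of interior knots grows. The test function and knot layout below are arbitrary choices.

```python
# Numerical illustration of the B-spline approximation rate in Lemma 6.2.
import numpy as np
from scipy.interpolate import BSpline

def sup_error(g, k_n, degree=3, grid=2001):
    u = np.linspace(0.0, 1.0, grid)
    inner = np.linspace(0.0, 1.0, k_n + 2)[1:-1]
    t = np.r_[[0.0] * (degree + 1), inner, [1.0] * (degree + 1)]
    B = BSpline(t, np.eye(len(t) - degree - 1), degree)(u)
    delta, *_ = np.linalg.lstsq(B, g(u), rcond=None)   # least-squares spline coefficients
    return np.max(np.abs(g(u) - B @ delta))

g = lambda u: np.sin(2 * np.pi * u)        # illustrative smooth link function
for k in (4, 8, 16, 32):
    print(k, sup_error(g, k))              # error drops by roughly 2**(-4) per doubling of k
```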

Lemma 6.3

Assume \(\lim _{n\rightarrow \infty } n^{\gamma -1}k_n^2=0\) for some \(\gamma \ge 0\). Then, with probability one,

$$\begin{aligned} \liminf _n\lambda _n&> 0,\end{aligned}$$
(6.3)
$$\begin{aligned} \max _{1\le i\le n}|z_i|^2&\le 2(l+3)N/(n\lambda _n). \end{aligned}$$
(6.4)

A complete proof can be found in Shi and Li (1995).
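
As a numerical sanity check (not taken from Shi and Li 1995), one can simulate index values, form \(H_n^2=NZ_n^\tau Z_n\), and verify that the smallest eigenvalue of \((N/n)H_n^2\) stays bounded away from zero while \(n\max _i|z_i|^2\) remains of the order indicated by (6.4). The uniform design and knot layout are illustrative assumptions.

```python
# Sanity check of (6.3)-(6.4) on a simulated design.
import numpy as np
from scipy.linalg import sqrtm
from scipy.interpolate import BSpline

rng = np.random.default_rng(0)
n, degree, n_inner = 500, 3, 8
u = rng.uniform(0.0, 1.0, n)                      # stand-in for x_i^tau beta_0
inner = np.linspace(0.0, 1.0, n_inner + 2)[1:-1]
t = np.r_[[0.0] * (degree + 1), inner, [1.0] * (degree + 1)]
N = len(t) - degree - 1
Zn = BSpline(t, np.eye(N), degree)(u)

Hn2 = N * Zn.T @ Zn                               # H_n^2 = N Z_n^tau Z_n
lam_n = np.linalg.eigvalsh((N / n) * Hn2).min()   # should stay bounded away from 0, cf. (6.3)
z = Zn @ np.linalg.inv(sqrtm(Hn2).real)           # rows: z_i = H_n^{-1} B(x_i^tau beta_0)
print(lam_n, n * (z ** 2).sum(axis=1).max())      # second quantity stays O(N / lambda_n), cf. (6.4)
```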

Proof of Theorem 1

By Lemma 6.2, there exists a constant \(M\) such that

$$\begin{aligned} g(x_i^\tau \beta _0)=B(x_i^\tau \beta _0)^\tau \delta _0-R_{ni}, \sup _{1\le i\le n}|R_{ni}|\le M k_n^{-r}. \end{aligned}$$
(6.5)

Thus,

$$\begin{aligned} \frac{1}{n}\sum _{i=1}^n(\hat{g}(x_i^\tau \beta _0)- g(x_i^\tau \beta _0))^2\le \frac{2}{n}\sum _{i=1}^n (B(x_i^\tau \beta _0)^\tau (\hat{\delta }_0-\delta _0))^2+2M^2k_n^{-2r}.\quad \end{aligned}$$
(6.6)

Then, in order to prove that (2.5) holds, it suffices to show that

$$\begin{aligned} \sum _{i=1}^n(B(x_i^\tau \beta _0)^\tau (\hat{\delta }_0-\delta _0))^2=O_p(k_n). \end{aligned}$$
(6.7)

Denote \(V^*(\theta _0)=(I-P)V(\theta _0)= (v_1^*(\theta _0),\ldots ,v_n^*(\theta _0))^\tau , S_n=V^*(\theta _0)^\tau V^*(\theta _0)\).

Let

$$\begin{aligned} \begin{aligned} \theta (\beta , \delta )&= \left( \begin{array}{c} \theta _1\\ \theta _2 \end{array}\right) =\left( \begin{array}{c} S_n^{\frac{1}{2}}(\beta -\beta _0)\\ H_n(\delta -\delta _0)N^{-\frac{1}{2}}+H_n^{-1}N^{\frac{1}{2}}Z_n^\tau V(\theta _0)(\beta -\beta _0) \end{array}\right) ,\\ \hat{\theta }&=\left( \begin{array}{c} \hat{\theta }_1\\ \hat{\theta }_2 \end{array}\right) =\theta (\hat{\beta }, \hat{\delta }). \end{aligned} \end{aligned}$$
(6.8)

In order to establish (6.7), we first show that \(||\hat{\theta }||=O_p(k_n^{1/2})\). Let \(\tilde{v_i}(\theta _0)=S_n^{-\frac{1}{2}}v_i^*(\theta _0)\) and \(\tilde{B}(x_i^\tau \beta _0)=H_n^{-1}B(x_i^\tau \beta _0)N^{1/2}\).

Then, we have that

$$\begin{aligned} \sum _{i=1}^n\rho (y_i-B(x_i^\tau \beta )^\tau \delta )= \sum _{i=1}^n\rho (e_i-R_{ni}-\tilde{v_i}(\theta _0)^\tau \theta _1 -\tilde{B}(x_i^\tau \beta _0)^\tau \theta _2+R^*(\theta )),\nonumber \\ \end{aligned}$$
(6.9)

which is minimized at \(\hat{\theta }\), where \(R^*(\theta )=B^{\prime }(\beta _0^\tau x_i)^\tau \delta _0 x_i^\tau S_n^{-1/2}\theta _1-B^{\prime }(x_i^\tau \beta ^*)^\tau \delta x_i^\tau S_n^{-1/2}\theta _1\) and \(\beta ^*\) lies between \(\beta \) and \(\beta _0\).

According to the properties of B-splines, there exist constant vectors \(s_j\), \(j=1, \ldots , N\), such that for any \(h_j(x_i^\tau \beta _0)\),

$$\begin{aligned} h_j(x_i^\tau \beta _0)=B(x_i^\tau \beta _0)^\tau s_j+\tilde{R}_{nij}, \end{aligned}$$
(6.10)

where \(\tilde{R}_{nij}\) is the remainder from approximating \(h_j(x_i^\tau \beta _0)\) by the B-spline. According to Lemma 6.2, \(g^{\prime }(x_i^\tau \beta _0)x_i=B^{\prime }(x_i^\tau \beta _0)^\tau \delta _0 x_i+\tilde{R}_{ni}\) with \(\sup _{1\le i\le n}|\tilde{R}_{ni}|\le k_n^{-(r-1)}\). By Assumption 3, we know that there exist matrices \(G\) and \(W_n\) such that

$$\begin{aligned} V^*(\theta _0)=(I-P)(Z_nG+W_n), \end{aligned}$$

where \(W_n=\tilde{R}_n+U_n, \tilde{R}_n=(\tilde{R}_{nij})_{n\times N}, U_n=(u_{ij})_{n\times N}, G=(s_1,\ldots ,s_N)\). Hence, any column of \(V^*(\theta _0)\) is of the order \(O_p(n^{1/2})\).

Therefore, according to (6.8) and Lemmas 6.1, 6.2 and 6.3,

$$\begin{aligned}&|\delta -\delta _0|=|N^{1/2}H_n^{-1}\theta _2-NH_n^{-2}Z_n^\tau V(\theta _0)S_n^{-1/2}\theta _1|\nonumber \\&\quad \le |N^{1/2}H_n^{-1}\theta _2|+|NH_n^{-2}(z_1,\ldots , z_n)V(\theta _0)S_n^{-1/2}\theta _1|\nonumber \\&\quad \le L\lambda _n^{-1/2}(N^{1/2}n^{-1/2}+Nn^{-1/2}). \end{aligned}$$
(6.11)

Thus, when \(|\theta |\le L\), \(\delta =\delta _0+O_p(Nn^{-1/2})\). Because the derivative of \(g(\cdot )\) is continuous and bounded, by Lemma 6.2 we have

$$\begin{aligned}&|B^{\prime }(\beta _0^\tau x_i)^\tau \delta _0-B^{\prime }(\beta ^{*\tau } x_i)^\tau \delta |=|B^{\prime }(\beta _0^\tau x_i)^\tau \delta _0-B^{\prime }(\beta ^{*\tau } x_i)^\tau \delta _0\nonumber \\&\qquad +O_p(Nn^{-1/2})| \le |B^{\prime }(\beta _0^\tau x_i)^\tau \delta _0-B^{\prime }(\beta ^{*\tau } x_i)^\tau \delta _0|+O_p(Nn^{-1/2})\nonumber \\&\quad =|g^{\prime }(\beta _0^\tau x_i)+\tilde{R}_{ni}-g^{\prime }(\beta ^{*\tau } x_i)-\tilde{R}_{ni}^*|+O_p(Nn^{-1/2})=o_p(1). \end{aligned}$$
(6.12)

As the support of \(X\) is convex, when \(|\theta |\le L\) we have

$$\begin{aligned} |R_i^*(\theta )|&= |B^{\prime }(\beta _0^\tau x_i)^\tau \delta _0x_i^{\tau }S_n^{-1/2} \theta _1-B^{\prime } (\beta _0^{*\tau }x_i)^\tau \delta x_i^\tau S_n^{-1/2}\theta _1|\nonumber \\&= |B^{\prime }(\beta _0^\tau x_i)^\tau \delta _0-B^{\prime }(\beta _0^{*\tau }x_i)^\tau \delta ||x_i^\tau S_n^{-1/2}\theta _1|=O_p(n^{-1/2}). \end{aligned}$$
(6.13)

Therefore

$$\begin{aligned} \sup _i\sup _{||\theta ||\le L} \max \left\{ k_n^{1/2}|\tilde{v_i} (\theta _0)^\tau \theta _1|,k_n^{1/2}|\tilde{B}(x_i^\tau \beta _0)^\tau \theta _2|,|R_{ni}|, |R^*(\theta )|\right\} =o_p(1).\nonumber \\ \end{aligned}$$
(6.14)

According to Assumptions 4–5, by arguments similar to those for Lemma 6.4 below, we have, for any \(L>0\), that

$$\begin{aligned}&\sup _{||\theta ||\le L}k_n^{-1}\left| \sum _{i=1}^n[\rho (e_i-k_n^{1/2} \tilde{v_i}(\theta _0)^\tau \theta _1-k_n^{1/2}\tilde{B}(x_i^\tau \beta _0)^\tau \theta _2-R_{ni}+R^*(\theta ))\right. \nonumber \\&\quad -\rho (e_i-R_{ni})-E\{\rho (e_i-k_n^{1/2}\tilde{v_i} (\theta _0)^\tau \theta _1-k_n^{1/2}\tilde{B}(x_i^\tau \beta _0)^\tau \theta _2-R_{ni}+R^*(\theta ))\nonumber \\&\quad \left. -\rho (e_i-R_{ni})\}+(k_n^{1/2}\tilde{v_i} (\theta )^\tau \theta _1+k_n^{1/2}\tilde{B}(x_i^\tau \beta _0)^\tau \theta _2+R^*(\theta ))\psi (e_i)]\right| =o_p(1).\nonumber \\ \end{aligned}$$
(6.15)

Note that \(|k_n^{-1/2}\sum _{i=1}^n\tilde{v_i}(\theta _0)R_{ni}b| =||k_n^{-1/2}S_n^{-\frac{1}{2}}(W_n^\tau (I-P)+G^\tau Z_n^\tau (I-P))r_n||=o_p(1),\) where \(r_n\) is the vector with components \(R_{ni}b\), and that \(R^*(\theta )=O_p(n^{-1/2})=o_p(R_{ni})\) for any \(L>0\).

By Assumptions 4–5 and the fact that \(\sum _{i=1}^n\tilde{v_i} (\theta _0)\tilde{B}(x_i^\tau \beta _0)^\tau =0\), we have

$$\begin{aligned}&k_{n}^{-1}\sum _{i=1}^nE[\rho (e_i-(k_n^{1/2} \tilde{v_i} (\theta _0)^\tau \theta _1+k_n^{1/2}\tilde{B}(x_i^\tau \beta _0)^\tau \theta _2+R^*(\theta ))-R_{ni})\nonumber \\&\qquad -\rho (e_i-R_{ni})]= k_n^{-1}\sum _{i=1}^nE\int \limits _{-R_{ni}}^{-(k_n^{1/2} \tilde{v_i}(\theta _0)^\tau \theta _1+k_n^{1/2}\tilde{B}(x_i^\tau \beta _0)^\tau \theta _2+R_{ni}+R^*(\theta ))}\psi (e_i+s)ds\nonumber \\&\quad =k_n^{-1}\sum _{i=1}^n[k_n\{(\tilde{v_i}(\theta _0)^\tau \theta _1)^2+(\tilde{B}(x_i^\tau \beta _0)^\tau \theta _2)^2\} b/2+(k_n^{1/2}\tilde{v_i}(\theta _0)^\tau \theta _1+k_n^{1/2}\tilde{B}(x_i^\tau \beta _0)^\tau \theta _2\nonumber \\&\qquad +R^*(\theta ))R_{ni}b]\{1+o(1)\}=\frac{1}{2}\left( b\theta _1^\tau \theta _1+\theta _2^\tau \sum _{i=1}^n\tilde{B}(x_i^\tau \beta _0) \tilde{B}(x_i^\tau \beta _0)^\tau \theta _2\right) +o(||\theta ||^2). \nonumber \\ \end{aligned}$$
(6.16)

By the Tchebychev inequality and Assumption 4, we obtain

$$\begin{aligned} P\left( \Bigg |\sum _{i=1}^n\tilde{v_i}(\theta _0)^\tau \theta _1\psi (e_i)\Bigg |\ge Lk_n^{1/2}\right)&< \frac{E(\sum _{i=1}^n\tilde{v_i}(\theta _0)^\tau \theta _1\psi (e_i))^2}{L^2k_n} \\&= \frac{||\theta _1||^2E\psi (e_1)^2}{L^2k_n}\rightarrow 0. \end{aligned}$$

Hence, \(\sup _{||\theta ||=L} k_n^{-1/2}|\sum _{i=1}^n \tilde{v_i}(\theta _0)^\tau \theta _1\psi (e_i)|=O_p(L)\).

Similarly, \(\sup _{||\theta ||=L} k_n^{-1/2}|\sum _{i=1}^n \tilde{B}(x_i^\tau \beta _0)^\tau \theta _2\psi (e_i)|=O_p(L)\).

Thus, it follows from (6.13) that, for sufficiently large \(L\),

$$\begin{aligned}&\inf _{||\theta ||=L}k_n^{-1}\sum _{i=1}^n[E\{\rho (e_i-k_n^{1/2} \tilde{v_i}(\theta _0)^\tau \theta _1-k_n^{1/2}\tilde{B}(x_i^\tau \beta _0)^\tau \theta _2-R_{ni}+R^*(\theta ))\nonumber \\&\quad -\rho (e_i-R_{ni})\}-(k_n^{1/2}\tilde{v_i}(\theta )^\tau \theta _1+k_n^{1/2}\tilde{B}(x_i^\tau \beta _0)^\tau \theta _2+R^*(\theta ))\psi (e_i)]>0,\nonumber \\ \end{aligned}$$
(6.17)

with probability tending to 1 as \(n\rightarrow \infty \). Combining (6.12) and (6.14) yields

$$\begin{aligned} \inf _{||\theta ||\!=\!L}\sum _{i=1}^n\rho (e_i\!-\!k_n^{1/2} \tilde{v_i} (\theta _0)^\tau \theta _1-k_n^{1/2} \tilde{B} (x_i^\tau \beta _0)^\tau \theta _2\!-\! R_{ni}+ R^*(\theta ))\!>\! \sum _{i=1}^n\rho (e_i-R_{ni}), \end{aligned}$$

which implies, by the convexity of \(\rho \), that

$$\begin{aligned} \inf _{||\theta ||\ge Lk_n^{1/2}}\sum _{i=1}^n\rho (e_i- \tilde{v_i}(\theta _0)^\tau \theta _1-\tilde{B}(x_i^\tau \beta _0)^\tau \theta _2-R_{ni}+R^*(\theta ))>\sum _{i=1}^n\rho (e_i-R_{ni}). \end{aligned}$$

We then conclude that \(||\hat{\theta }||=O_p(k_n^{1/2})\). By (6.8), we have

$$\begin{aligned} |S_n^{1/2}(\hat{\beta }_0-\beta _0)|^2=(\hat{\beta }_0-\beta _0)^{\tau }S_n (\hat{\beta }_0-\beta _0)=O_p(k_n). \end{aligned}$$

Thus

$$\begin{aligned}&|\hat{\beta }_0-\beta _0|=O_p(n^{-r/(2r+1)}),\\&\sum _{i=1}^n(B(x_i^\tau \beta _0)^\tau (\hat{\delta }_0-\delta _0))^2 = O_p(k_n). \end{aligned}$$

Hence,

$$\begin{aligned}&\sum _{i=1}^n(B(x_i^\tau \hat{\beta }_0)^\tau (\hat{\delta }_0-\delta _0))^2 \le 2\sum _{i=1}^n\{[B(\hat{\beta }_0^\tau x_i)-B(\beta _0^\tau x_i)]^\tau (\hat{\delta }_0-\delta _0)\}^2\\&\qquad +2\sum _{i=1}^n(B(\beta _0^\tau x_i)^\tau (\hat{\delta }_0-\delta _0))^2\\&\quad =2\sum _{i=1}^n[B^{\prime }(\tilde{\beta }^\tau x_i)^\tau (\hat{\delta }_0-\delta _0) x_i^\tau (\hat{\beta }_0-\beta _0)]^2+O_p(k_n)=O_p(k_n). \end{aligned}$$

As the derivative of \(g(\cdot )\) is bounded and the support of \(X\) is compact, we have

$$\begin{aligned}&\frac{1}{n}\sum _{i=1}^n(\hat{g}(\hat{\beta }_0^\tau x_i)-g(\beta _0^\tau x_i))^2=\frac{1}{n}\sum _{i=1}^n [\hat{g}(\hat{\beta }_0^\tau x_i)-g(\hat{\beta }_0^\tau x_i)+g(\hat{\beta }_0^\tau x_i)\nonumber \\&\qquad -g(\beta _0^\tau x_i)]^2 \le \frac{2}{n}\sum _{i=1}^n[\hat{g}(\hat{\beta }_0^\tau x_i)-g(\hat{\beta }_0^\tau x_i)]^2+\frac{2}{n}\sum _{i=1}^n[ g(\hat{\beta }_0^\tau x_i)-g(\beta _0^\tau x_i)]^2\nonumber \\&\quad =\frac{2}{n}\sum _{i=1}^n[\hat{g}(\hat{\beta }_0^\tau x_i)-g(\hat{\beta }_0^\tau x_i)]^2+\frac{2}{n}\sum _{i=1}^n g^{\prime }(\beta _0^{*\tau }x_i)^2|x_i|^2|\hat{\beta }_0-\beta _0|^2=O_p (k_n^{-2r}).\nonumber \\ \end{aligned}$$
(6.18)

This completes the proof of Theorem 1. \(\square \)

To obtain the asymptotic normality of \(\hat{\beta }_0\), we shall make use of several asymptotic linearization results which are similar to Lemmas 6.3 and 6.4 of He and Shi (1996). The proofs are similar to the arguments in He and Shi (1996).

Lemma 6.4

Under the assumptions of Theorem 2, we have, for any \(L>0\) and \(M_0>0\),

$$\begin{aligned}&\sup _{|\theta _1|\le M_0, |\theta _2|\le L k_n^{1/2}}\left| \sum _{i=1}^n[\rho (e_i-\tilde{v_i} (\theta _0)^\tau \theta _1-\tilde{B}(x_i^\tau \beta _0)^\tau \theta _2-R_{ni})\right. \nonumber \\&\quad -\rho (e_i-\tilde{B}(x_i^\tau \beta _0) ^\tau \theta _2-R_{ni})+\tilde{v_i}(\theta _0)\theta _1\psi (e_i) -E_e(\rho (e_i-\tilde{v_i}(\theta _0)^\tau \theta _1\nonumber \\&\quad \left. -\tilde{B}(x_i^\tau \beta _0)^\tau \theta _2-R_{ni})-\rho (e_i-\tilde{B}(x_i^\tau \beta _0)^\tau \theta _2-R_{ni}))]\right| =o_p(1),\qquad \end{aligned}$$
(6.19)

where \(E_e\) is the expectation with respect to \(e\).

Proof

Because \(f_e(\cdot )\) is continuous at 0, there exist two positive numbers \(\omega _1,\omega _2\) such that \(|f_e(t)|<\omega _1\) whenever \(|t|\le \omega _2\). By (6.13), (6.14) and Lemma 6.2, we have

$$\begin{aligned} d_n&= k_n^{1/2}|\tilde{v}_i(\theta _0)^\tau \theta _1+\tilde{B}(\beta _0^\tau x_i)^\tau \theta _2-R_{ni}+R_i^*(\theta )|\le k_n^{1/2}|\tilde{v}_i(\theta _0)^\tau \theta _1|\\&+k_n^{1/2}[|\tilde{B}(\beta _0^\tau x_i)^\tau \theta _2|+|R_{ni}|+|R_i^*(\theta )|]=O_p(n^{-1/2}k_n)=O_p(1). \end{aligned}$$

In order to prove (6.19), it suffices to show that, given \(\varepsilon >0\),

$$\begin{aligned}&P\biggl \{\sup _{||\theta _1||\le 1,||\theta _2||\le L}\left| \sum _{i=1}^n [\rho (e_i-M\tilde{v}_i(\theta _0)^\tau \theta _1-Lk_n^{1/2} \tilde{B}(\beta _0^\tau x_i)^\tau \theta _2-R_{ni}\right. \nonumber \\&\quad +R_i^*(\theta ))-\rho (e_i+R_i^*(\theta )-Lk_n^{1/2}\tilde{B}(\beta _0^\tau x_i)^\tau \theta _2-R_{ni})+M\tilde{v}_i(\theta _0)^\tau \theta _1\psi (e_i)\nonumber \\&\quad -E_e(\rho (e_i-M\tilde{v}_i(\theta _0)^\tau \theta _1-Lk_n^{1/2}\tilde{B}(\beta _0^\tau x_i)^\tau \theta _2-R_{ni}+R_i^*(\theta ))\nonumber \\&\quad \left. -\rho (e_i+R_i^*(\theta )-Lk_n^{1/2}\tilde{B}(\beta _0^\tau x_i)^\tau \theta _2-R_{ni}))]\right| \ge \varepsilon , d_n\le \omega _2\biggr \}\rightarrow 0.\qquad \quad \qquad \end{aligned}$$
(6.20)

Write \(\Gamma =\{\theta _1:|\theta _1|\le 1, \theta _1\in R^{p}\}\) as a union of \(K_n\) disjoint parts \(\Gamma _1,\ldots ,\Gamma _{K_n}\) such that the diameter of each part does not exceed \(q_0=k_n\varepsilon /(8b_2M\omega _2n)\); then \(K_n\le (2\sqrt{p}/q_0+1)^p\). Choose \(\theta _{1j}\in \Gamma _j\) for each \(j=1,\ldots , K_n\). By Assumptions 5–6 and Lemma 6.3, we have

$$\begin{aligned}&\min _{1\le j\le K_n}\Bigg |\sum _{i=1}^n[\rho (e_i-M\tilde{v}_i(\theta _0)^\tau \theta _1-Lk_n^{1/2}\tilde{B}(\beta _0^\tau x_i)^\tau \theta _2-R_{ni}+R_i^*(\theta ))\nonumber \\&\qquad \!-\!\rho (e_i\!-\!Lk_n^{1/2}\tilde{B}(\beta _0^\tau x_i)^\tau \theta _2\!-\!R_{ni}\!+\!sR_i^*(\theta ))\!+\!M\tilde{v}_i(\theta _0)^\tau \theta _1\psi (e_i)\!-\!\rho (e_i-M\tilde{v}_i(\theta _0)^\tau \theta _{1j}\nonumber \\&\qquad -Lk_n^{1/2}\tilde{B}(\beta _0^\tau x_i)^\tau \theta _2-R_{ni}+R_i^*(\theta ))+ \rho (e_i-Lk_n^{1/2}\tilde{B}(\beta _0^\tau x_i)^\tau \theta _2-R_{ni}+R_i^*(\theta ))\nonumber \\&\qquad -M\tilde{v}_i(\theta _0)^\tau \theta _{1j}\psi (e_i)- E(\rho (e_i-M\tilde{v}_i(\theta _0)^\tau \theta _1-Lk_n^{1/2} \tilde{B}(\beta _0^\tau x_i)^\tau \theta _2-R_{ni}+R_i^*(\theta )) \nonumber \\&\qquad -\rho (e_i-Lk_n^{1/2}\tilde{B}(\beta _0^\tau x_i)^\tau \theta _2-R_{ni}+R_i^*(\theta )))+E(\rho (e_i-M\tilde{v}_i(\theta _0)^\tau \theta _{1j}\nonumber \\&\qquad -Lk_n^{1/2}\tilde{B}(\beta _0^\tau x_i)^\tau \theta _2-R_{ni}+R_i^*(\theta ))- \rho (e_i-Lk_n^{1/2}\tilde{B}(\beta _0^\tau x_i)^\tau \theta _2\nonumber \\&\qquad -R_{ni}+R_i^*(\theta )))]\Bigg |I(d_n\le \omega _2) \le n(b_2\omega _2M\max _{1\le i\le n}|\tilde{v}_i(\theta _0)|\min _{1\le j\le K_n}|\theta _1-\theta _{1j}|\nonumber \\&\qquad +bM\omega _2\max _{1\le i\le n}|\tilde{v}_i(\theta _0)|\min _{1\le j\le K_n}|\theta _1-\theta _{1j}|\max _{1\le i\le n}(M|\tilde{v}_i(\theta _0)|+Lk_n^{1/2}|\tilde{B}(\beta _0^\tau x_i)||\theta _2| \nonumber \\&\qquad +|R_{ni}|+|R_i^*(\theta )|) \le 2b_2nM\omega _2\max _{1\le i\le n}|\tilde{v}_i(\theta _0)|\min _{1\le j\le K_n}|\theta _1-\theta _{1j}|\nonumber \\&\quad \le 2b_2nM\omega _2\max _{1\le i\le n}|\tilde{v}_i(\theta _0)|k_n\varepsilon /8b_2M\omega _2n<\varepsilon /4, \end{aligned}$$
(6.21)

where \(I(\cdot )\) is the indicator function. According to Assumption 6 and the mean value theorem, we have

$$\begin{aligned}&\sup _{\theta _1\in \Gamma }|\rho (e_i-M\tilde{v}_i(\theta _0)^\tau \theta _1-Lk_n^{1/2}\tilde{B}(\beta _0^\tau x_i)^\tau \theta _2-R_{ni}+R_i^*(\theta ))\nonumber \\&\qquad -\rho (e_i-Lk_n^{1/2}\tilde{B}(\beta _0^\tau x_i)^\tau \theta _2-R_{ni}+R_i^*(\theta ))+\tilde{v}_i(\theta _0)^\tau \theta _1\psi (e_i)|I(d_n\le \omega _2) \nonumber \\&\quad =\sup _{\theta _1\in \Gamma }|\psi (e_i-M\tilde{v}_i(\theta _0)^\tau \theta _1^*-Lk_n^{1/2}\tilde{B}(\beta _0^\tau x_i)^\tau \theta _2-R_{ni}+R_i^*(\theta ))\tilde{v}_i(\theta _0)^\tau \theta _1M\nonumber \\&\qquad -\tilde{v}_i(\theta _0)^\tau \theta _1\psi (e_i)|I(d_n\le \omega _2)\le b_2\max _{1\le i\le n}|\tilde{v}_i(\theta _0)|M\omega _2\le b_2\omega _2Mk_n^{1/2}n^{-1/2}. \nonumber \\ \end{aligned}$$
(6.22)

It follows from Assumption 6 again that

$$\begin{aligned}&|\rho (e_i-M\tilde{v}_i(\theta _0)^\tau \theta _1-Lk_n^{1/2}\tilde{B}(\beta _0^\tau x_i)^\tau \theta _2-R_{ni}+R_i^*(\theta ))\nonumber \\&\qquad -\rho (e_i-Lk_n^{1/2}\tilde{B}(\beta _0^\tau x_i)^\tau \theta _2-R_{ni}+R_i^*(\theta ))+M\tilde{v}_i(\theta _0)^\tau \theta _1\psi (e_i)|\nonumber \\&\quad \le b_2|\tilde{v}_i(\theta _0)^\tau \theta _1|MI(|e_i|\le |\tilde{v}_i(\theta _0)|M+Lk_n^{1/2}|\tilde{B}(\beta _0^\tau x_i)^\tau \theta _2|+|R_{ni}|+|R_i^*(\theta )|).\nonumber \\ \end{aligned}$$
(6.23)

Therefore, given \(X^{*}=(X_1,\ldots ,X_n)\), by (6.23), we have

$$\begin{aligned}&\sum _{i=1}^n Var[\rho (e_i-M\tilde{v}_i(\theta _0)^\tau \theta _1-Lk_n^{1/2}\tilde{B}(\beta _0^\tau x_i)^\tau \theta _2-R_{ni}+R_i^*(\theta ))\nonumber \\&\qquad \!-\!\rho (e_i\!-\!Lk_n^{1/2}\tilde{B}(\beta _0^\tau x_i)^\tau \theta _2-R_{ni}\!+\!R_i^*(\theta ))\!+\!M\tilde{v}_i(\theta _0)^\tau \theta _1\psi (e_i)|X^*] \nonumber \\&\quad \le b_2^2M^2\omega _2^2k_nn^{-1}. \end{aligned}$$
(6.24)

By (6.21), (6.22), (6.23), (6.24) and the Bernstein inequality, for any given \(\varepsilon >0\), we have

$$\begin{aligned}&P\left\{ \sup _{|\theta _1|\le 1}\left| \sum _{i=1}^n [\rho (e_i-M\tilde{v}_i(\theta _0)^\tau \theta _1-Lk_n^{1/2}\tilde{B}(\beta _0^\tau x_i)^\tau \theta _2-R_{ni}+R_i^*(\theta ))\right. \right. \nonumber \\&\qquad -\rho (e_i-Lk_n^{1/2}\tilde{B}(\beta _0^\tau x_i)^\tau \theta _2-R_{ni}+R_i^*(\theta ))+M\tilde{v}_i(\theta _0)^\tau \theta _1\psi (e_i)\nonumber \\&\qquad -E_{e_i}(\rho (e_i-M\tilde{v}_i(\theta _0)^\tau \theta _1-Lk_n^{1/2}\tilde{B}(\beta _0^\tau x_i)^\tau \theta _2-R_{ni}+R_i^*(\theta ))\nonumber \\&\qquad -\left. \rho (e_i-Lk_n^{1/2}\tilde{B}(\beta _0^\tau x_i)^\tau \theta _2-R_{ni}+R_i^*(\theta )))]\Bigg |\ge \varepsilon , d_n\le \omega _2|X^*\right\} \nonumber \\&\quad \le \sum _{j=1}^{K_n}P\left\{ \Bigg |\sum _{i=1}^n[\rho (e_i-M\tilde{v}_i(\theta _0)^\tau \theta _1-Lk_n^{1/2}\tilde{B}(\beta _0^\tau x_i)^\tau \theta _2-R_{ni}+R_i^*(\theta ))\right. \nonumber \\&\qquad -\rho (e_i-Lk_n^{1/2}\tilde{B}(\beta _0^\tau x_i)^\tau \theta _2-R_{ni}+R_i^*(\theta ))+M\tilde{v}_i(\theta _0)^\tau \theta _1\psi (e_i)\nonumber \\&\qquad -E_{e_i}(\rho (e_i-M\tilde{v}_i(\theta _0)^\tau \theta _1-Lk_n^{1/2}\tilde{B}(\beta _0^\tau x_i)^\tau \theta _2-R_{ni}+R_i^*(\theta ))\nonumber \\&\qquad \left. -\rho (e_i-Lk_n^{1/2}\tilde{B}(\beta _0^\tau x_i)^\tau \theta _2-R_{ni}+R_i^*(\theta )))]\Bigg |\ge \varepsilon /2, d_n\le \omega _2|X^*\right\} \nonumber \\&\quad \le 2K_n \exp \left\{ -n(\varepsilon /2n)^2/2\left( \frac{b_2^2M^2 \omega _1\omega _2k_n^{1/2}n^{-1/2}}{n}+\frac{3M^3b_1b_2\omega _2k_n n^{-1/2}}{n}\right) \right\} \nonumber \\&\quad =2\exp \left\{ -\gamma \frac{n^{1/2}}{k_n^{1/2}}\left( 1-\frac{k_n^{1/2}}{n^{1/2}\gamma }p\ln \left( 2\sqrt{p}/q_{0}+1\right) \right) \right\} , \end{aligned}$$
(6.25)

where \(\gamma =\varepsilon ^2/(8b_2^2M^2\omega _1\omega _2)\). As \(k_n=n^{1/(2r+1)}\), (6.19) holds. This proves Lemma 6.4. \(\square \)

Lemma 6.5

If the assumptions of Theorem 2 hold, then

$$\begin{aligned}&\sum _{i=1}^nE_e(\rho (e_i-\tilde{v_i}(\theta _0)^\tau \theta _1- \tilde{B}(x_i^\tau \beta _0)^\tau \theta _2-R_{ni})\\&\quad -\rho (e_i- \tilde{B}(x_i^\tau \beta _0)^\tau \theta _2-R_{ni})) =\frac{b\theta _1^\tau \theta _1}{2}+r_n(\theta _1,\theta _2), \end{aligned}$$

where \(\sup _{|\theta _1|\le M_0,|\theta _2|\le Lk_n^{1/2}}|r_n(\theta _1,\theta _2)|=o_p(1)\).

The proof of Lemma 6.5 is similar to that of (6.16) and is therefore omitted.

Proof of Theorem 2

Let \(\hat{\theta }^\tau =(\hat{\theta }_1^\tau , \hat{\theta }_2^\tau )\) be as in the proof of Theorem 1, and let \(\tilde{\theta }_1=b^{-1}\sum _{i=1}^n\tilde{v}_i(\theta _0)\psi (e_i)\). By Lemmas 6.4 and 6.5 and the triangle inequality, for any \(L>0\) and \(\varepsilon >0\), we obtain

$$\begin{aligned}&\sup _{|\theta _1-\tilde{\theta }_1|=\varepsilon ,\theta _1\in R^p}I(|\tilde{\theta }_1|\le L,|\theta _2|\le Lk_n^{1/2}) \left| \sum _{i=1}^n(\rho (e_i-\tilde{v}_i(\theta _0)^\tau \theta _1\!-\!\tilde{B}(\beta _0^\tau x_i)^\tau \hat{\theta }_2\right. \nonumber \\&\qquad \left. \!-\!R_{ni} \!+\!R_i^*(\theta ))\!-\!\rho (e_i\!-\!\tilde{v}_i(\theta _0)^\tau \tilde{\theta }_1-\tilde{B}(\beta _0^\tau x_i)^\tau \hat{\theta }_2-R_{ni}+R_i^*(\theta )))\!-\!\frac{b\varepsilon ^2}{2} \right| \nonumber \\&\quad \!\le \! 2\sup _{|\theta _1|\!\le \! L\!+\!\varepsilon ,|\theta _2|\le Lk_n^{1/2}}\left| \sum _{i=1}^n(\rho (e_i-\tilde{v}_i(\theta _0)^\tau \theta _1-\tilde{B}(\beta _0^\tau x_i)^\tau \theta _2-R_{ni} +R_i^*(\theta ))\right. \nonumber \\&\quad \quad -\rho (e_i-\tilde{v}_i(\theta _0)^\tau \tilde{\theta }_1-\tilde{B}(\beta _0^\tau x_i)^\tau \theta _2-R_{ni}+R_i^*(\theta ))+\tilde{v}_i(\theta _0)^\tau \theta _1\psi (e_i))\nonumber \\&\quad \quad -\frac{b\theta _1^\tau \theta _1}{2}\Bigg | =o_p(1). \end{aligned}$$
(6.26)

Notice that, as \(L\rightarrow \infty \), \(P\{|\tilde{\theta }_1|\le L\}\rightarrow 1\) and \(P\{|\hat{\theta }_2|\le Lk_n^{1/2}\}\rightarrow 1\). Then

$$\begin{aligned}&\lim _{n\rightarrow \infty }P\left\{ \sup _{|\theta -\tilde{\theta }|=\varepsilon , \theta \in R^p}\left| \sum _{i=1}^n(\rho (e_i-\tilde{v}_i(\theta _0)^\tau \theta _1-\tilde{B}(\beta _0^\tau x_i)^\tau \hat{\theta }_2-R_{ni} +R_i^*(\theta ))\right. \right. \\&\left. \left. \quad -\rho (e_i-\tilde{v}_i(\theta _0)^\tau \tilde{\theta }_1-\tilde{B}(\beta _0^\tau x_i)^\tau \hat{\theta }_2-R_{ni}+R_i^*(\theta )))- \frac{b\varepsilon ^2}{2} \right| \ge \varepsilon \right\} =0. \end{aligned}$$

Therefore, when \(|\theta _1-\tilde{\theta }_1|=\varepsilon ,\theta _1\in R^p\), we obtain

$$\begin{aligned}&\sum _{i=1}^n\rho (e_i-\tilde{v}_i(\theta _0)^\tau \theta _1-\tilde{B}(\beta _0^\tau x_i)^\tau \hat{\theta }_2-R_{ni} +R_i^*(\theta ))\\&\quad =\sum _{i=1}^n\rho (e_i-\tilde{v}_i(\theta _0)^\tau \tilde{\theta }_1 -\tilde{B}(\beta _0^\tau x_i)^\tau \hat{\theta }_2-R_{ni}+R_i^*(\theta )) +\frac{b\varepsilon ^2}{2}+o_p(1). \end{aligned}$$

It follows from the above formula that

$$\begin{aligned}&\lim _{n\rightarrow \infty }P\left\{ \inf _{|\theta -\tilde{\theta }|=\varepsilon , \theta \in R^p}\sum _{i=1}^n\rho (e_i-\tilde{v}_i(\theta _0)^\tau \theta _1-\tilde{B}(\beta _0^\tau x_i)^\tau \hat{\theta }_2-R_{ni}\right. \nonumber \\&\left. \quad +R_i^*(\theta )) >\sum _{i=1}^n\rho (e_i-\tilde{v}_i(\theta _0)^\tau \tilde{\theta }_1-\tilde{B}(\beta _0^\tau x_i)^\tau \hat{\theta }_2-R_{ni}+R_i^*(\theta ))\right\} =1.\qquad \qquad \quad \end{aligned}$$
(6.27)

According to Corollary 25 of Eggleston (1958), we have

$$\begin{aligned}&\lim _{n\rightarrow \infty }P\left\{ \inf _{|\theta -\tilde{\theta }|\ge \varepsilon ,\theta \in R^p}\sum _{i=1}^n\rho (e_i-\tilde{v}_i(\theta _0)^\tau \theta _1-\tilde{B}(\beta _0^\tau x_i)^\tau \hat{\theta }_2-R_{ni}\right. \nonumber \\&\left. \quad +R_i^*(\theta )) >\sum _{i=1}^n\rho (e_i-\tilde{v}_i(\theta _0)^\tau \tilde{\theta }_1-\tilde{B}(\beta _0^\tau x_i)^\tau \hat{\theta }_2-R_{ni}+R_i^*(\theta ))\right\} =1.\nonumber \\ \end{aligned}$$
(6.28)

By the definition of \(\hat{\theta }\), we obtain

$$\begin{aligned} \hat{\theta }_1=\tilde{\theta }_1+o_p(1)= \frac{1}{b}\sum _{i=1}^n\tilde{v_i}(\theta _0)\psi (e_i)+o_p(1). \end{aligned}$$
(6.29)

According to Lemma 6.1, the central limit theorem and Slutsky’s theorem, we have

$$\begin{aligned} \sqrt{n}(\hat{\beta }_0-\beta _0)\rightarrow N(0,\sigma ^2\Sigma ^{-1}). \end{aligned}$$
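
To spell out this last step (a sketch under the notation above; reading \(\sigma ^2\) as \(E\psi ^2(e_1)/b^2\) is our interpretation and is not stated explicitly in this excerpt): since \(\hat{\theta }_1=S_n^{1/2}(\hat{\beta }_0-\beta _0)\) and \(\tilde{v}_i(\theta _0)=S_n^{-\frac{1}{2}}v_i^*(\theta _0)\), (6.29) gives

$$\begin{aligned} \sqrt{n}(\hat{\beta }_0-\beta _0)=\sqrt{n}\,S_n^{-1/2}\hat{\theta }_1 =\frac{1}{b}\left( \frac{1}{n}S_n\right) ^{-1}\frac{1}{\sqrt{n}}\sum _{i=1}^n v_i^*(\theta _0)\psi (e_i)+o_p(1). \end{aligned}$$

Because \(n^{-1}S_n=n^{-1}V(\theta _0)^\tau (I-P)V(\theta _0)\rightarrow \Sigma \) by Lemma 6.1, the central limit theorem gives \(n^{-1/2}\sum _{i=1}^n v_i^*(\theta _0)\psi (e_i)\rightarrow N(0, E\psi ^2(e_1)\Sigma )\), and Slutsky's theorem then yields the displayed normal limit with \(\sigma ^2=E\psi ^2(e_1)/b^2\).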

Cite this article

Zou, Q., Zhu, Z. M-estimators for single-index model using B-spline. Metrika 77, 225–246 (2014). https://doi.org/10.1007/s00184-013-0434-z