A robust and efficient estimation method for partially nonlinear models via a new MM algorithm


Abstract

When the observed data set contains outliers, it is well known that the classical least squares method is not robust. To overcome this difficulty, Wang et al. (J Am Stat Assoc 108(502):632–643, 2013) proposed a robust variable selection method based on the exponential squared loss (ESL) function with a tuning parameter. Although the ESL function has since been applied to many important statistical models, to date no work has studied the partially nonlinear model with the ESL function in the presence of outliers. To fill this gap, in this paper we propose a robust and efficient estimation method for the partially nonlinear model based on the ESL function. Under certain conditions, we show that the proposed estimators achieve the best convergence rates, and we establish their asymptotic normality. In addition, we develop a new minorization–maximization (MM) algorithm to compute the estimates for both the nonparametric and parametric parts, present a procedure for deriving initial values, and provide a data-driven approach for selecting the tuning parameters. Numerical simulations and a real data analysis illustrate that, in the presence of outliers, the proposed ESL method is more robust and efficient for partially nonlinear models than the existing linear approximation method and the composite quantile regression method.
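
To make the ESL objective concrete, the following sketch (Python; the function names and the tangent-line minorization are illustrative assumptions, not necessarily the authors' exact MM update) evaluates the exponential squared loss \(\phi _{\gamma }(t)=\exp (-t^2/\gamma )\) and performs one MM step for a generic parametric regression function. Because \(\exp (\cdot )\) is convex, \(\phi _{\gamma }(r)\ge \phi _{\gamma }(r_0)[1-(r^2-r_0^2)/\gamma ]\), so each step reduces to a weighted (nonlinear) least squares fit whose weights \(\exp (-r_0^2/\gamma )\) automatically downweight outliers.

```python
import numpy as np
from scipy.optimize import minimize

def esl(r, gamma):
    """Exponential squared loss phi_gamma(r) = exp(-r^2 / gamma); larger is better."""
    return np.exp(-r**2 / gamma)

def mm_step(beta_old, y, x, g, gamma):
    """One MM step (illustrative sketch, not necessarily the paper's update):
    the tangent-line minorizer of exp gives a weighted least-squares surrogate
    with weights w_i = exp(-r_i^2 / gamma) computed at the current residuals."""
    w = esl(y - g(x, beta_old), gamma)              # small weights for outliers
    surrogate = lambda beta: np.sum(w * (y - g(x, beta))**2)
    return minimize(surrogate, beta_old, method="BFGS").x

# Toy usage with a hypothetical nonlinear mean function g(x; beta) = exp(beta1*x)/(beta2 + x).
rng = np.random.default_rng(0)
x = rng.uniform(1.0, 3.0, size=200)
g = lambda x, b: np.exp(b[0] * x) / (b[1] + x)
y = g(x, np.array([0.5, 1.0])) + rng.normal(scale=0.1, size=x.size)
y[:10] += 5.0                                       # a few gross outliers
beta = np.array([0.3, 0.8])                         # crude initial value
for _ in range(30):
    beta = mm_step(beta, y, x, g, gamma=1.0)
print(beta)                                         # roughly (0.5, 1.0) despite the outliers
```

Each MM step does not decrease the ESL objective, which is what keeps the iteration stable even with a crude initial value.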


References

  • Becker MP, Yang I, Lange K (1997) EM algorithms without missing data. Stat Methods Med Res 6:38–54

  • Huang TM, Chen H (2008) Estimating the parametric component of nonlinear partial spline model. J Multivar Anal 99(8):1665–1680

  • Huet S, Bouvier A, Poursat M-A, Jolivet E (2004) Statistical tools for nonlinear regression: a practical guide with S-plus and R examples. Springer, New York

  • Jiang Y, Li H (2014) Penalized weighted composite quantile regression in the linear regression model with heavy-tailed autocorrelated errors. J Korean Stat Soc 43:531–543

  • Jiang Y (2015) Robust estimation in partially linear regression models. J Appl Stat 42(11):2497–2508

  • Jiang Y (2016) An exponential-squared estimator in the autoregressive model with heavy-tailed errors. Stat Interface 9(2):233–238

  • Kai B, Li R, Zou H (2011) New efficient estimation and variable selection methods for semiparametric varying-coefficient partially linear models. Ann Stat 39(1):305–332

  • Lange K, Hunter DR, Yang I (2000) Optimization transfer using surrogate objective functions (with discussion). J Comput Graph Stat 9(1):1–20

  • Li R, Nie L (2007) A new estimation procedure for a partially nonlinear model via a mixed-effects approach. Can J Stat 35(3):399–411

  • Li R, Nie L (2008) Efficient statistical inference procedures for partially nonlinear models and their applications. Biometrics 64(3):904–911

  • Li R, Liang H (2008) Variable selection in semiparametric regression modeling. Ann Stat 36(1):261–286

  • Liu JC, Zhang RQ, Zhao WH, Lv YZ (2013) A robust and efficient estimation method for single index models. J Multivar Anal 122:226–238

  • Lv J, Yang H, Guo CH (2015a) An efficient and robust variable selection method for longitudinal generalized linear models. Comput Stat Data Anal 82:74–88

  • Lv J, Yang H, Guo CH (2015b) Robust smooth-threshold estimating equations for generalized varying-coefficient partially linear models based on exponential score function. J Comput Appl Math 280:125–140

  • Mack YP, Silverman BW (1982) Weak and strong uniform consistency of kernel regression estimates. Probab Theory Relat Fields 61(3):405–415

  • Ruppert D, Sheather SJ, Wand MP (1995) An effective bandwidth selector for local least squares regression. J Am Stat Assoc 90(432):1257–1270

  • Ruppert D, Wand MP, Carroll RJ (2003) Semiparametric regression. Cambridge University Press, New York

  • Song LX, Zhao Y, Wang XG (2010) Sieve least squares estimation for partially nonlinear models. Stat Probab Lett 80(17–18):1271–1283

  • Song WX, Yao W, Xing YR (2014) Robust mixture regression model fitting by Laplace distribution. Comput Stat Data Anal 71:128–137

  • Tang LJ, Zhou ZG, Wu CC (2012) Efficient estimation and variable selection for infinite variance autoregressive models. J Appl Math Comput 40:399–413

  • Wang X, Jiang Y, Huang M, Zhang H (2013) Robust variable selection with exponential squared loss. J Am Stat Assoc 108(502):632–643

  • Yao W, Li L (2014) A new regression model: modal linear regression. Scand J Stat 41(3):656–671

  • Yao W, Lindsay BG, Li R (2012) Local modal regression. J Nonparametric Stat 24(3):647–663

  • Yatchew A (1997) An elementary estimator of the partial linear model. Econ Lett 57(2):135–143

  • Yu C, Chen K, Yao W (2015) Outlier detection and robust mixture modeling using nonconvex penalized likelihood. J Stat Plan Inference 164:27–38

  • Zhang RQ, Zhao WH, Liu JC (2013) Robust estimation and variable selection for semiparametric partially linear varying coefficient model based on modal regression. J Nonparametric Stat 25(2):523–544


Acknowledgements

Jiang’s research is partially supported by the National Natural Science Foundation of China (No. 11301221) and the Fundamental Research Funds for the Central Universities (No. 11615455). Part of this work was done while the first author was visiting the Department of Statistics and Actuarial Science of HKU. Fei’s work is supported in part by the National Natural Science Foundation of China (No. 11561071).

Author information

Corresponding author

Correspondence to Yu Fei.

Appendix

For convenience, we introduce the following notation:

$$\begin{aligned} \begin{array}{lllllllll} {\mathbf {z}}_i^* &=& (1,(T_i-t)/h_1,g'({\mathbf {x}}_i; \varvec{\beta }_0)^{\!\top \!})^{\!\top \!}, & {\mathbf {z}}_i &=& (1, (T_i-t)/h_1)^{\!\top \!}, & \tau _n &=& 1/\sqrt{nh_1}, \\ \varvec{\theta }&=& (a,b,\varvec{\beta }^{\!\top \!})^{\!\top \!}, & \varvec{\theta }_1 &=& (a,b)^{\!\top \!}, & \varvec{\theta }_2 &=& \varvec{\beta }, \\ \varvec{\theta }_0 &=& (a_0,b_0, \varvec{\beta }_0^{\!\top \!})^{\!\top \!}, & \varvec{\theta }_{10}&=& (a_0, b_0)^{\!\top \!}, & \varvec{\theta }_{20} &=& \varvec{\beta }_0,\\ r_i &=& m(T_i)-a_0-b_0(T_i-t), &a_0 &=& m(t), & b_0 &=& m'(t), \\ \mathbf{H}&=& \text{ diag }(1,h,\mathbf{1}_d^{\!\top \!}), & \mathbf{H}_1 &=& \text{ diag }(1,h_1,\mathbf{1}_d^{\!\top \!}), & K_{i,h} &=& K((T_i-t)/h)/h, \\ {\varvec{\upalpha }}&=& \mathbf{H}_1 \varvec{\theta }, & {\varvec{\upalpha }}_0 &=& \mathbf{H}_1\varvec{\theta }_0, & \tilde{{\varvec{\upalpha }}} &=& \mathbf{H}_1\tilde{\varvec{\theta }}. \end{array} \end{aligned}$$

Before proving Theorem 1, we first present the following two lemmas.

Lemma 1

Assume that Conditions (C1)–(C2) and (C5)–(C7) hold. Then we have

$$\begin{aligned} \frac{1}{n} \sum _{i=1}^n K_{i,h}\phi ''_{\gamma } (\varepsilon _i) \left( \frac{T_i-t}{h} \right) ^j =F(t,\gamma )f_{T}(t)\mu _j+o_p(1) \end{aligned}$$
(6.1)

and

$$\begin{aligned} \frac{1}{n}\sum _{i=1}^{n}K_{i,h}\phi ''_{\gamma }(\varepsilon _i)r_i\left( \frac{T_i-t}{h}\right) ^j= \frac{1}{2}h^2F(t,\gamma )f_{T}(t)m''(t)\mu _{j+2}+o_p(h^2), \end{aligned}$$
(6.2)

for \(j=0,1\).

The proof of Lemma 1 is similar to that of Lemma 1 in Yao et al. (2012). Therefore, we omit it here.
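
For the reader's convenience, here is a sketch of the leading-term calculation behind (6.1), assuming (as in the local modal regression arguments of Yao et al. 2012) that \(F(t,\gamma )=E[\phi ''_{\gamma }(\varepsilon )\mid T=t]\) and \(\mu _j=\int u^jK(u)\,\text{ d }u\). Conditioning on \(T_i\) and substituting \(u=(s-t)/h\),

$$\begin{aligned} E\left[ K_{i,h}\phi ''_{\gamma }(\varepsilon _i)\left( \frac{T_i-t}{h}\right) ^j\right] =\int K(u)\,u^j\,F(t+hu,\gamma )\,f_{T}(t+hu)\,\text{ d }u =F(t,\gamma )f_{T}(t)\mu _j+o(1), \end{aligned}$$

while the variance of the sample average in (6.1) is of order \((nh)^{-1}\); the argument for (6.2) is analogous after a Taylor expansion of \(m\) around \(t\) gives \(r_i=\frac{1}{2}m''(t)(T_i-t)^2+o((T_i-t)^2)\).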

Lemma 2

Under Conditions (C1)–(C7), with probability approaching 1, there exists a consistent local maximizer of (2.2), denoted by \(\tilde{\varvec{\theta }}\), such that

$$\begin{aligned} \Vert \tilde{{\varvec{\upalpha }}}-{\varvec{\upalpha }}_{0}\Vert =O_p(c_n), \end{aligned}$$

where \(c_n \,\hat{=}\,(nh_1)^{-1/2}+h_1^2\).

Proof of Lemma 2

Recall that \(\ell _n(\varvec{\theta }) = \ell _n(a, b, \varvec{\beta })\) is defined by (2.2). It suffices to show that for any given \(\delta >0\), there exists a large constant \(C>0\) such that

$$\begin{aligned} \Pr \left\{ \sup _{\Vert \mathbf{v}\Vert =C} \ell _n({\varvec{\upalpha }}_0+c_n\mathbf{v}) <\ell _n({\varvec{\upalpha }}_0)\right\} \ge 1-\delta , \end{aligned}$$
(6.3)

for any \((d+2)\)-dimensional vector \(\mathbf{v}\) satisfying \(\Vert \mathbf{v}\Vert =C\). Note that

$$\begin{aligned} \ell _n({\varvec{\upalpha }}_0+c_n\mathbf{v})-\ell _n({\varvec{\upalpha }}_0)= & {} \sum _{i=1}^n \phi _{\gamma _1}\left( Y_i-g({\mathbf {x}}_i; \varvec{\beta }_{0}+c_n\mathbf{v}_2)-{\mathbf {z}}_i^{\!\top \!} \left[ \binom{a_0}{h_1b_0} +c_n\mathbf{v}_1\right] \right) K_{i,h_1} \\&-\sum _{i=1}^{n}\phi _{\gamma _1}\left( Y_i-g({\mathbf {x}}_i; \varvec{\beta }_{0})-{\mathbf {z}}_i^{\!\top \!} \binom{a_0}{h_1b_0} \right) K_{i,h_1}. \end{aligned}$$

By applying the first-order Taylor expansion and noting Conditions (C3)–(C4), we obtain

$$\begin{aligned} g({\mathbf {x}}_i; \varvec{\beta }_0+c_n\mathbf{v}_2)=g({\mathbf {x}}_i; \varvec{\beta }_0)+c_ng'({\mathbf {x}}_i; \varvec{\beta }_0)^{\!\top \!}\mathbf{v}_2[1+o_p(1)], \end{aligned}$$
(6.4)

so that

$$\begin{aligned}&\ell _n({\varvec{\upalpha }}_0+c_n\mathbf{v})-\ell _n({\varvec{\upalpha }}_0) \\&\quad =\sum _{i=1}^{n} \left[ \phi _{\gamma _1} (\varepsilon _i+r_i-c_n{\mathbf {z}}_i^{*\!\top \!}\mathbf{v}) -\phi _{\gamma _1} (\varepsilon _i+r_i)\right] K_{i,h_1} \\&\quad =\sum _{i=1}^{n} \left[ -\phi _{\gamma _1}'(\varepsilon _i+r_i) c_n{\mathbf {z}}_i^{*\!\top \!}\mathbf{v}+ \frac{1}{2} \phi _{\gamma _1}''(\varepsilon _i+r_i) (c_n{\mathbf {z}}_i^{*\!\top \!}\mathbf{v})^2-\frac{1}{6} \phi _{\gamma _1}'''(\varepsilon _i^*) (c_n{\mathbf {z}}_i^{*\!\top \!}\mathbf{v})^3 \right] K_{i,h_1} \\&\quad \triangleq I_1+I_2+I_3, \end{aligned}$$

where \(\varepsilon _i^*\) is a point between \(\varepsilon _i+r_i\) and \(\varepsilon _i+r_i-c_n{\mathbf {z}}_i^{*\!\top \!}\mathbf{v}\).

Under regularity conditions (C3)–(C4), the mean and variance of \(I_1\) can be directly calculated as

$$\begin{aligned} E(I_1)=O(nCc_nh_1^2) \quad \text{ and } \quad \text{ Var }(I_1)=O(n^2C^2c_n^2(nh_1)^{-1}). \end{aligned}$$

Therefore, we have

$$\begin{aligned} I_1=E(I_1)+O_p \Big (\sqrt{\text{ Var }(I_1)} \, \Big )=O(nCc_nh_1^2)+nCc_nO_p((nh_1)^{-1/2})=O_p(nCc_n^2). \end{aligned}$$

Similarly, we can obtain \(I_3=O_p(nc_n^3)\).

By Lemma 1, it follows that

$$\begin{aligned} I_2=nc_n^2f_T(t)\mathbf{v}^{\!\top \!}\varvec{\Sigma }_1(t)\mathbf{v}[1+o_p(1)], \end{aligned}$$

where \(\varvec{\Sigma }_1(t)=F(t,\gamma _1)E[{\mathbf {A}}({\mathbf {x}})|T=t]\). By Condition (C1), we have \(F(t,\gamma _1)<0\), so \(\varvec{\Sigma }_1(t)\) is a negative definite matrix. Since \(\Vert \mathbf{v}\Vert =C\), we can choose C sufficiently large so that \(I_2\) dominates both \(I_1\) and \(I_3\) with probability at least \(1-\delta \), which yields (6.3). The proof of Lemma 2 is completed. \(\square \)

Proof of Theorem 1

Note that

$$\begin{aligned} \tilde{\varvec{\theta }} = \arg \, \max _{a, b, \varvec{\beta }} \sum _{i=1}^{n}\exp \left\{ -\left[ Y_i -a-b(T_i-t) -g({\mathbf {x}}_i; \varvec{\beta })\right] ^2/ \gamma _1\right\} K_{i,h_1}. \end{aligned}$$

From (6.4) and the Taylor expansion, we know that \(\tilde{{\varvec{\upalpha }}}\) satisfies

$$\begin{aligned} \mathbf{0}=\sum _{i=1}^{n}{\mathbf {z}}_i^*K_{i,h_1}\phi _{\gamma _1}'(\varepsilon _i+\tilde{r}_i) =\sum _{i=1}^{n}{\mathbf {z}}_i^*K_{i,h_1}[\phi _{\gamma _1}'(\varepsilon _i)+\phi _{\gamma _1}''(\varepsilon _i)\tilde{r}_i +\frac{1}{2}\phi _{\gamma _1}'''(\varepsilon _i^*)\tilde{r}_i^2], \end{aligned}$$
(6.5)

where \(\varepsilon _i^*\) lies between \(\varepsilon _i\) and \(\varepsilon _i+\tilde{r}_i\), and

$$\begin{aligned} \tilde{r}_i=r_i-(\tilde{a}-a_0)-(\tilde{b}-b_0)(T_i-t)-g'({\mathbf {x}}_i; \varvec{\beta }_{0})^{\!\top \!} (\tilde{\varvec{\beta }}-\varvec{\beta }_{0})=r_i-{\mathbf {z}}_{i}^{*\!\top \!}(\tilde{{\varvec{\upalpha }}}-{\varvec{\upalpha }}_{0}). \end{aligned}$$

Therefore, the second term on the right-hand side of (6.5) is

$$\begin{aligned}&\sum _{i=1}^{n}{\mathbf {z}}_i^*K_{i,h_1}\phi _{\gamma _1}''(\varepsilon _i)\tilde{r}_i = \sum _{i=1}^{n}{\mathbf {z}}_i^*K_{i,h_1}\phi _{\gamma _1}''(\varepsilon _i)r_i\\&\quad -\sum _{i=1}^{n}K_{i,h_1}\phi _{\gamma _1}''(\varepsilon _i){\mathbf {z}}_i^*{\mathbf {z}}_{i}^{*\!\top \!}(\tilde{{\varvec{\upalpha }}}-{\varvec{\upalpha }}_{0}) \triangleq I_4+I_5. \end{aligned}$$

From Lemma 1, we have

$$\begin{aligned} I_4= & {} \sum _{i=1}^n {\mathbf {z}}_i^*K_{i,h_1} \phi _{\gamma _1}'' (\varepsilon _i) r_i =\frac{1}{2}nh_1^2 \mu _2f_T(t){\varvec{\upsigma }}_3(t) m''(t)+o_p(nh_1^2) \quad \text{ and } \quad \\ I_5= & {} -\sum _{i=1}^n K_{i,h_1} \phi _{\gamma _1}''(\varepsilon _i) {\mathbf {z}}_i^* {\mathbf {z}}_{i}^{*\!\top \!} (\tilde{{\varvec{\upalpha }}}-{\varvec{\upalpha }}_{0}) =-nf_T(t) \varvec{\Sigma }_1(t)(\tilde{{\varvec{\upalpha }}}-{\varvec{\upalpha }}_{0})[1+o_p(1)]. \end{aligned}$$

By applying Lemma 2, we have \(\Vert \tilde{{\varvec{\upalpha }}}-{\varvec{\upalpha }}_{0}\Vert =O_p((nh_1)^{-1/2}+h_1^2)\). Thus,

$$\begin{aligned} \sup _{i:\; |T_i-t|/h_1\le 1}|\tilde{r}_i|\le & {} \sup _{i: \; |T_i-t|/h_1 \le 1} \Big [ |r_i|+|{\mathbf {z}}_{i}^{*\!\top \!}(\tilde{{\varvec{\upalpha }}}-{\varvec{\upalpha }}_{0})| \Big ] \\= & {} O_p(h_1^2+\Vert \tilde{{\varvec{\upalpha }}}-{\varvec{\upalpha }}_{0}\Vert )=O_p(\Vert \tilde{{\varvec{\upalpha }}}-{\varvec{\upalpha }}_{0}\Vert )=o_p(1), {\mathrm{and}} \\ \sup _{i:\; |T_i-t|/h_1 \le 1} |\tilde{r}_i^2|= & {} o_p(1) O_p(\Vert \tilde{{\varvec{\upalpha }}} -{\varvec{\upalpha }}_{0}\Vert ^2)=o_p(\Vert \tilde{{\varvec{\upalpha }}}-{\varvec{\upalpha }}_{0}\Vert ). \end{aligned}$$

Since

$$\begin{aligned} E\left[ \sum _{i=1}^{n}K_{i,h_1}\phi _{\gamma _1}'''(\varepsilon _i^*)\tilde{r}_i^2(T_i-t)^j/h_1^j\right]= & {} O_p(n\Vert \tilde{{\varvec{\upalpha }}} -{\varvec{\upalpha }}_{0}\Vert ^2) = o_p(I_5) \ \,{\mathrm{and}} \\ \text{ Var } \left[ \sum _{i=1}^{n}K_{i,h_1} \phi _{\gamma _1}'''(\varepsilon _i^*) \tilde{r}_i^2(T_i-t)^j/h_1^j\right]= & {} O_p(n\Vert \tilde{{\varvec{\upalpha }}}-{\varvec{\upalpha }}_{0}\Vert )\\= & {} O_p(n^2\Vert \tilde{{\varvec{\upalpha }}}-{\varvec{\upalpha }}_{0}\Vert ^2(nh_1)^{-1}), \end{aligned}$$

we have

$$\begin{aligned} \sum _{i=1}^n K_{i,h_1}\phi _{\gamma _1}'''(\varepsilon _i^*)\tilde{r}_i^2{\mathbf {z}}_i^*=o_p(I_5) +O_p \Big (\sqrt{n^2 \Vert \tilde{{\varvec{\upalpha }}}-{\varvec{\upalpha }}_{0}\Vert ^2(nh_1)^{-1}} \, \Big ) = o_p(I_5), \end{aligned}$$

so that

$$\begin{aligned} \tilde{{\varvec{\upalpha }}}-{\varvec{\upalpha }}_{0}=\frac{1}{nf_T(t)}\varvec{\Sigma }_1(t)^{-1}\mathbf {w}_n[1+o_p(1)]+\frac{1}{2}h_1^2\mu _2 \varvec{\Sigma }_1(t)^{-1}{\varvec{\upsigma }}_3(t)m''(t)[1+o_p(1)], \end{aligned}$$

where \(\mathbf {w}_n=\sum _{i=1}^{n}{\mathbf {z}}_i^*K_{i,h_1}\phi _{\gamma _1}'(\varepsilon _i)\). By Condition (C2), we have \(E(\mathbf {w}_n)= \mathbf{0}\) and

$$\begin{aligned} \text{ Var }(\mathbf {w}_n)=nh_1^{-1}f_T(t)\varvec{\Sigma }_2(t)[1+o_p(1)]. \end{aligned}$$
(6.6)

Let \(\mathbf {w}_n^* \,\hat{=}\,\sqrt{h_1/n}\, \mathbf {w}_n\) and \(\zeta _i \,\hat{=}\,\sqrt{h_1/n}\,{\mathbf {d}}^{\!\top \!}{\mathbf {z}}_i^*K_{i,h_1}\phi _{\gamma _1}'(\varepsilon _i)\), where \({\mathbf {d}}\) is a unit vector satisfying \(\Vert {\mathbf {d}}\Vert =1\). Then, \({\mathbf {d}}^{\!\top \!}\mathbf {w}_n^*=\sum _{i=1}^{n}\zeta _i\). By (6.6), we have

$$\begin{aligned} \text{ Var }({\mathbf {d}}^{\!\top \!}\mathbf {w}_n^*)=f_T(t){\mathbf {d}}^{\!\top \!}\varvec{\Sigma }_2(t){\mathbf {d}}[1+o_p(1)]. \end{aligned}$$

Since \(\phi _{\gamma _1}(\cdot )\) is bounded and \(K(\cdot )\) has compact support, a direct calculation gives \(nE|\zeta _1|^3=O((nh_1)^{-1/2})\rightarrow 0\), so Lyapunov's condition is satisfied. By the Lyapunov central limit theorem, we obtain

$$\begin{aligned} \left[ {\mathrm{Var}}({\mathbf {d}}^{\!\top \!}\mathbf {w}_n^*)\right] ^{-1/2} \left[ {\mathbf {d}}^{\!\top \!}\mathbf {w}_n^*-{\mathbf {d}}^{\!\top \!} E(\mathbf {w}_n^*)\right] \xrightarrow {\mathrm{D}}N(0,1). \end{aligned}$$

Therefore

$$\begin{aligned} \mathbf {w}_n^* \xrightarrow {\mathrm{D}} N(\mathbf{0}, \; f_T(t)\varvec{\Sigma }_2(t)). \end{aligned}$$

The proof is completed. \(\square \)
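
As a computational aside, the kernel-weighted ESL criterion displayed at the beginning of this proof can be maximized directly with a general-purpose optimizer. The following sketch (Python; the names are hypothetical and a direct optimization replaces the paper's MM iteration) returns \((\tilde{a},\tilde{b},\tilde{\varvec{\beta }}^{\top })^{\top }\) at a single point t, so that \(\tilde{m}(t)=\tilde{a}\).

```python
import numpy as np
from scipy.optimize import minimize

def local_esl_fit(t, T, X, y, g, theta0, gamma1, h1):
    """Maximize sum_i exp(-[y_i - a - b*(T_i - t) - g(x_i; beta)]^2 / gamma1) * K_{i,h1}
    over theta = (a, b, beta); K is taken to be the Epanechnikov kernel here.
    Illustrative sketch only; theta0 is an initial value for (a, b, beta)."""
    u = (T - t) / h1
    kern = np.where(np.abs(u) <= 1.0, 0.75 * (1.0 - u**2), 0.0) / h1   # K_{i,h1}
    def neg_obj(theta):
        a, b, beta = theta[0], theta[1], theta[2:]
        r = y - a - b * (T - t) - g(X, beta)
        return -np.sum(np.exp(-r**2 / gamma1) * kern)
    return minimize(neg_obj, theta0, method="Nelder-Mead").x
```

Repeating this over a grid of t values yields the nonparametric fit \(\tilde{m}(\cdot )\) that is plugged into the second stage.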

Lemma 3

Let \((X_1,U_1)^{\top }, \ldots , (X_n, U_n)^{\top }\) be i.i.d. random samples from the population random vector \((X, U)^{\top }\) with joint density \(p(x,u)\). Assume that \(E|U|^s<\infty \) and \(\sup _{x}\int |u|^sp(x,u)\,\text{ d }u<\infty \), where \(s \ge 2\). Let \(K(\cdot )>0\) be a bounded function with bounded support that satisfies a Lipschitz condition. Then,

$$\begin{aligned} \sup _{x \in [0,1]} \left| \frac{1}{n} \sum _{i=1}^{n} \Big \{ K_h(X_i-x)U_i-E[K_h(X_i-x)U_i ] \Big \}\right| =O_p \left( \left[ \frac{\log (1/h)}{nh}\right] ^{1/2}\right) \end{aligned}$$

provided that \(n^{2t-1}h\rightarrow \infty \) for some \(t<1-s^{-1}\).

The proof of Lemma 3 can be found in Mack and Silverman (1982).

Lemma 4

Under the conditions in Theorem 1, we have

$$\begin{aligned} \tilde{{\varvec{\upalpha }}}-{\varvec{\upalpha }}_{0}=\frac{1}{nf_T(t)}\varvec{\Sigma }_1(t)^{-1}\mathbf {w}_n+O_p(\tau _nh_1^2+\tau _n^2\log ^{1/2}(1/h_1)). \end{aligned}$$

Proof of Lemma 4

Let \(\tilde{{\varvec{\uplambda }}}=\sqrt{nh_1}(\tilde{{\varvec{\upalpha }}}-{\varvec{\upalpha }}_{0})\). Then, \(\tilde{{\varvec{\uplambda }}}\) is the maximizer of

$$\begin{aligned} U_n(a,b,\varvec{\beta })= & {} h_1 \sum _{i=1}^{n} \Big [\phi _{\gamma _1}(Y_i-g({\mathbf {x}}_i,\varvec{\beta })-a-b(T_i-t)/h_1) \\&- \; \phi _{\gamma _1}(Y_i-g({\mathbf {x}}_i; \varvec{\beta }_0)-a_0-b_0(T_i-t)/h_1) \Big ]K_{i,h_1}. \end{aligned}$$

Using the Taylor expansion, we have

$$\begin{aligned} U_n(a,b,\varvec{\beta })=\tilde{{\varvec{\uplambda }}}^{\!\top \!}\mathbf {w}_n^*+\frac{1}{2} \tilde{{\varvec{\uplambda }}}^{\!\top \!}{\varvec{\Delta }}_n\tilde{{\varvec{\uplambda }}}+O_p(\tau _n\Vert \tilde{{\varvec{\uplambda }}}\Vert ^2), \end{aligned}$$
(6.7)

where \({\varvec{\Delta }}_n = \frac{1}{n}\sum _{i=1}^{n} {\mathbf {z}}_i^{*}{\mathbf {z}}_i^{*\!\top \!} K_{i,h_1}\phi _{\gamma _1}''(\varepsilon _i)\). By Lemma 3, we obtain

$$\begin{aligned} {\varvec{\Delta }}_n=E[{\varvec{\Delta }}_n]+O_p(\tau _n\log ^{1/2}(1/h_1)). \end{aligned}$$

By the regularity conditions, we have \(E[{\varvec{\Delta }}_n]=f_T(t)\varvec{\Sigma }_1(t)+O_p(h_1^2)\). Therefore,

$$\begin{aligned} {\varvec{\Delta }}_n=f_T(t)\varvec{\Sigma }_1(t)+O_p(h_1^2)+O_p(\tau _n\log ^{1/2}(1/h_1)). \end{aligned}$$

According to (6.7), we have

$$\begin{aligned} \tilde{{\varvec{\uplambda }}}=\frac{1}{f_T(t)}\varvec{\Sigma }_1(t)^{-1}\mathbf {w}_n^*+O_p(h_1^2+\tau _n\log ^{1/2}(1/h_1)). \end{aligned}$$

Since \(\tilde{{\varvec{\uplambda }}}=\sqrt{nh_1}(\tilde{{\varvec{\upalpha }}}-{\varvec{\upalpha }}_{0})\), we obtain

$$\begin{aligned} \tilde{{\varvec{\upalpha }}}-{\varvec{\upalpha }}_{0}=\frac{1}{nf_T(t)} \varvec{\Sigma }_1(t)^{-1}\mathbf {w}_n+O_p(\tau _nh_1^2+\tau _n^2\log ^{1/2}(1/h_1)) \end{aligned}$$

holds uniformly in \(t\in {\mathbb {T}}\). The proof is completed. \(\square \)

Lemma 5

Assume that Conditions (C1)–(C7) hold, and that \(nh_1^4\rightarrow 0\) and \(nh_1^2/[\log (1/h_1)] \rightarrow \infty \) as \(n\rightarrow \infty \). Then, with probability approaching 1, there exists a local maximizer \({\varvec{\hat{\beta }}}_n\) of (2.3) such that

$$\begin{aligned} \Vert {\varvec{\hat{\beta }}}_n-\varvec{\beta }_0\Vert =O_p \left( \frac{1}{\sqrt{n}} \right) . \end{aligned}$$

Proof of Lemma 5

Let \(\chi _i=\tilde{m}(T_i)-m(T_i)\) and \(R(\varvec{\beta })=\frac{1}{n}\sum _{i=1}^{n}\phi _{\gamma _2}(Y_i-\tilde{m}(T_i)-g({\mathbf {x}}_i; \varvec{\beta }))\). Using the Taylor expansion and (6.4), we have

$$\begin{aligned}&R \left( \varvec{\beta }_0+ \varvec{\mathrm{e}}/\sqrt{n} \right) -R(\varvec{\beta }_0) \\&\quad = \frac{1}{n}\sum _{i=1}^{n} \left[ \phi _{\gamma _2}\left( Y_i-\tilde{m}(T_i)- g({\mathbf {x}}_i; \varvec{\beta }_0+ \varvec{\mathrm{e}}/\sqrt{n})\right) -\phi _{\gamma _2}(Y_i-\tilde{m}(T_i)-g({\mathbf {x}}_i; \varvec{\beta }_0))\right] \\&\quad = \frac{1}{n}\sum _{i=1}^{n}\left[ \phi _{\gamma _2}(\varepsilon _i-\chi _i - \varvec{\mathrm{e}}^{\!\top \!} g'({\mathbf {x}}_i; \varvec{\beta }_0)\Big /\sqrt{n})-\phi _{\gamma _2}(\varepsilon _i-\chi _i)\right] \\&\quad = - \; \frac{1}{n} \sum _{i=1}^{n} \phi _{\gamma _2}'(\varepsilon _i-\chi _i) \varvec{\mathrm{e}}^{\!\top \!} g'({\mathbf {x}}_i; \varvec{\beta }_0)\Big /\sqrt{n} + \frac{1}{n}\sum _{i=1}^{n}\frac{1}{2n}\phi _{\gamma _2}''(\varepsilon _i-\chi _i) [\varvec{\mathrm{e}}^{\!\top \!} g'({\mathbf {x}}_i; \varvec{\beta }_0)]^2 \\&\quad \quad - \; \frac{1}{n} \sum _{i=1}^{n} \frac{1}{6n^{3/2}}\phi _{\gamma _2}'''(\varepsilon _i^*) [\varvec{\mathrm{e}}^{\!\top \!} g'({\mathbf {x}}_i; \varvec{\beta }_0)]^3 \\&\quad \triangleq I_6+I_7+I_8, \end{aligned}$$

where \(\Vert \varvec{\mathrm{e}}\Vert =C\) for a large constant C, and \(\varepsilon _i^*\) lies between \(\varepsilon _i-\chi _i- \varvec{\mathrm{e}}^{\!\top \!} g'({\mathbf {x}}_i; \varvec{\beta }_0)/ \sqrt{n}\) and \(\varepsilon _i-\chi _i\). By Lemma 4, we have

$$\begin{aligned} \sup _{t\in {\mathbb {T}}} \left\| \tilde{m}(t) -m(t)-\frac{1}{nf_T(t)} {\mathbf {q}}^{\!\top \!} \varvec{\Sigma }_1(t)^{-1}\mathbf {w}_n\right\| =O_p(\tau _nh_1^2+\tau _n^2\log ^{1/2}(1/h_1)). \end{aligned}$$
(6.8)

By (6.8) and the conditions \(nh_1^4\rightarrow 0\) and \(nh_1^2/[\log (1/h_1)]\rightarrow \infty \) as \(n\rightarrow \infty \), we have \(E(I_6)=O(Ch_1^2/\sqrt{n})\) and \(\text{ Var }(I_6)=O(C^2/n^2)\). Therefore, \(I_6=O_p(n^{-1})\). Similarly, \(I_8=O_p(n^{-3/2})\). For \(I_7\), we have

$$\begin{aligned} I_7=\frac{1}{2n}\varvec{\mathrm{e}}^{\!\top \!}\varvec{\Sigma }_4\varvec{\mathrm{e}}[1+o_p(1)], \end{aligned}$$

where \(\varvec{\Sigma }_4=E[F(t,\gamma _2)g'({\mathbf {x}};\varvec{\beta }_0)g'({\mathbf {x}};\varvec{\beta }_0)^{\!\top \!}]\). Noting \(\Vert \varvec{\mathrm{e}}\Vert =C\), we can choose a sufficiently large C such that \(I_7\) dominates both \(I_6\) and \(I_8\) with a probability of at least \(1-\delta \). Since \(F(t,\gamma _2)<0\), we have

$$\begin{aligned} \Pr \left\{ \sup _{\Vert \varvec{\mathrm{e}}\Vert =C}R(\varvec{\beta }_0+ \varvec{\mathrm{e}}/\sqrt{n} ) < R(\varvec{\beta }_0)\right\} \ge 1-\delta . \end{aligned}$$

The proof is completed. \(\square \)
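
For reference, the second-stage criterion \(R(\varvec{\beta })\) used in this proof is easy to compute once the first-stage fits \(\tilde{m}(T_i)\) are available; a minimal sketch (Python; hypothetical names, with a generic optimizer in place of the paper's MM iteration):

```python
import numpy as np
from scipy.optimize import minimize

def second_stage_fit(y, X, m_tilde, g, beta0, gamma2):
    """Maximize R(beta) = (1/n) * sum_i exp(-[y_i - m_tilde_i - g(x_i; beta)]^2 / gamma2),
    where m_tilde holds the first-stage fits m~(T_i).  Illustrative sketch only."""
    def neg_R(beta):
        r = y - m_tilde - g(X, beta)
        return -np.mean(np.exp(-r**2 / gamma2))
    return minimize(neg_R, beta0, method="BFGS").x
```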

Proof of Theorem 2

Let \(\tilde{\varphi }_i = \tilde{m}(T_i)-m(T_i)+g'({\mathbf {x}}_i;\varvec{\beta }_0)^{\!\top \!}({\varvec{\hat{\beta }}}_n-\varvec{\beta }_0)\). Then, \({\varvec{\hat{\beta }}}_n\) satisfies the following equation

$$\begin{aligned} \mathbf{0}= & {} \sum _{i=1}^n g'({\mathbf {x}}_i;\varvec{\beta }_0)\phi _{\gamma _2}'(\varepsilon _i-\tilde{\varphi }_i) \\= & {} \sum _{i=1}^n g'({\mathbf {x}}_i;\varvec{\beta }_0) \left[ \phi _{\gamma _2}' (\varepsilon _i)-\phi _{\gamma _2}'' (\varepsilon _i)\tilde{\varphi }_i+\frac{1}{2} \phi _{\gamma _2}'''(\varepsilon _i^*) \tilde{\varphi }_i^2\right] \triangleq I_9+I_{10}+I_{11}, \end{aligned}$$

where

$$\begin{aligned} I_{9}= & {} \sum _{i=1}^n g'({\mathbf {x}}_i;\varvec{\beta }_0)\phi _{\gamma _2}' (\varepsilon _i), \\ I_{10}= & {} -\sum _{i=1}^{n}g'({\mathbf {x}}_i;\varvec{\beta }_0)\phi _{\gamma _2}''(\varepsilon _i)\tilde{\varphi }_i \\= & {} -\sum _{i=1}^{n}g'({\mathbf {x}}_i;\varvec{\beta }_0)\phi _{\gamma _2}''(\varepsilon _i) [\tilde{m}(T_i)-m(T_i) ] \\&- \sum _{i=1}^{n} \phi _{\gamma _2}''(\varepsilon _i) g'({\mathbf {x}}_i;\varvec{\beta }_0)g'({\mathbf {x}}_i;\varvec{\beta }_0)^{\!\top \!}({\varvec{\hat{\beta }}}_n-\varvec{\beta }_0) \\= & {} J_1+J_2, \\ I_{11}= & {} \frac{1}{2}\sum _{i=1}^n g'({\mathbf {x}}_i;\varvec{\beta }_0) \phi _{\gamma _2}'''(\varepsilon _i^*) \tilde{\varphi }_i^2. \end{aligned}$$

By (6.8), we can write \(J_1\) as

$$\begin{aligned} J_1= & {} -\sum _{i=1}^n g'({\mathbf {x}}_i;\varvec{\beta }_0)\phi _{\gamma _2}''(\varepsilon _i)[\tilde{m}(T_i)-m(T_i)] \\= & {} -\sum _{i=1}^n g'({\mathbf {x}}_i;\varvec{\beta }_0) \phi _{\gamma _2}''(\varepsilon _i) \frac{1}{nf_T(T_i)} {\mathbf {q}}^{\!\top \!}\varvec{\Sigma }_1(T_i)^{-1}\mathbf {w}_n+O_p(\tau _nh_1^2+\tau _n^2\log ^{1/2}(1/h_1)) \\= & {} -\frac{1}{n} \sum _{i=1}^n g'({\mathbf {x}}_i;\varvec{\beta }_0) \phi _{\gamma _2}''(\varepsilon _i) \frac{1}{f_T(T_i)}{\mathbf {q}}^{\!\top \!}\varvec{\Sigma }_1(T_i)^{-1} \sum _{j=1}^n {\mathbf {z}}_j^*K_{h_1}(T_j-T_i) \phi _{\gamma _1}'(\varepsilon _j) \\&+ \; O_p(\tau _nh_1^2+\tau _n^2\log ^{1/2}(1/h_1)). \end{aligned}$$

Since \(nh_1^4\rightarrow 0\) and \(nh_1^2/[\log (1/h_1)] \rightarrow \infty \) as \(n\rightarrow \infty \), we have

$$\begin{aligned} \tau _nh_1^2+\tau _n^2\log ^{1/2}(1/h_1)=O(n^{1/2}h_1^2)=o(1), \end{aligned}$$

so that

$$\begin{aligned} J_1 = -\frac{1}{n} \sum _{i=1}^{n} g'({\mathbf {x}}_i;\varvec{\beta }_0)\phi _{\gamma _2}''(\varepsilon _i) \frac{1}{f_T(T_i)}{\mathbf {q}}^{\!\top \!}\varvec{\Sigma }_1(T_i)^{-1}\sum _{j=1}^{n} {\mathbf {z}}_j^*K_{h_1}(T_j-T_i)\phi _{\gamma _1}'(\varepsilon _j) + o_p(1). \end{aligned}$$

By calculating the second moment, it can be shown that \(J_1-J_3\xrightarrow {\mathrm{P}} 0\), where \(J_3=-\sum _{j=1}^{n} {\varvec{\kappa }}(T_j) \) with

$$\begin{aligned} {\varvec{\kappa }}(T_j) =\phi _{\gamma _1}'(\varepsilon _j) {\varvec{\upsigma }}_6(T_j){\mathbf {q}}^{\!\top \!}\varvec{\Sigma }_1^{-1}(T_j)(1,0,g'({\mathbf {x}}_j;\varvec{\beta }_0)^{\!\top \!})^{\!\top \!}. \end{aligned}$$

On the other hand, \(J_2=-n\varvec{\Sigma }_4({\varvec{\hat{\beta }}}_n-\varvec{\beta }_0)\). Since \(|\tilde{\varphi }_i|=O_p(\Vert {\varvec{\hat{\beta }}}_n-\varvec{\beta }_0\Vert )=o_p(1)\) and \(|\tilde{\varphi }_i|^2=o_p(1)O_p(\Vert {\varvec{\hat{\beta }}}_n-\varvec{\beta }_0\Vert )=o_p(\Vert {\varvec{\hat{\beta }}}_n-\varvec{\beta }_0\Vert )\), we have

$$\begin{aligned} I_{11}=\frac{1}{2}\sum _{i=1}^{n}g'({\mathbf {x}}_i;\varvec{\beta }_0)\phi _{\gamma _2}'''(\varepsilon _i^*)\tilde{\varphi }_i^2=o_p(J_2). \end{aligned}$$

Therefore,

$$\begin{aligned} \sqrt{n}({\varvec{\hat{\beta }}}_n-\varvec{\beta }_0)=\frac{1}{\sqrt{n}}\varvec{\Sigma }_4^{-1}\sum _{i=1}^{n} \Big [ g'({\mathbf {x}}_i; \varvec{\beta }_0)\phi _{\gamma _2}'(\varepsilon _i)-{\varvec{\kappa }}(T_i)\Big ] + o_p(1). \end{aligned}$$
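
Since the summands in the last display are i.i.d. with mean zero under the stated conditions, the central limit theorem yields (with the covariance written generically here; the paper's Theorem 2 presumably expresses it through its own \(\varvec{\Sigma }\) notation)

$$\begin{aligned} \sqrt{n}({\varvec{\hat{\beta }}}_n-\varvec{\beta }_0)\xrightarrow {\mathrm{D}} N\Big ( \mathbf{0},\; \varvec{\Sigma }_4^{-1}\,{\mathrm{Cov}}\big [ g'({\mathbf {x}};\varvec{\beta }_0)\phi _{\gamma _2}'(\varepsilon )-{\varvec{\kappa }}(T)\big ] \,\varvec{\Sigma }_4^{-1}\Big ). \end{aligned}$$
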

The proof is completed by Slutsky’s Theorem and the Central Limit Theorem. \(\square \)

Proof of Theorem 3

According to Theorem 2, we have \(\Vert {\varvec{\hat{\beta }}}_n-\varvec{\beta }_0\Vert =O_p(1/\sqrt{n})\). Following the ideas in the proof of Theorem 1, we can easily obtain the result of Theorem 3.


Cite this article

Jiang, Y., Tian, GL. & Fei, Y. A robust and efficient estimation method for partially nonlinear models via a new MM algorithm. Stat Papers 60, 2063–2085 (2019). https://doi.org/10.1007/s00362-017-0909-5

