Abstract
When the observed data contain outliers, the classical least squares method is well known to be non-robust. To overcome this difficulty, Wang et al. (J Am Stat Assoc 108(502):632–643, 2013) proposed a robust variable selection method based on the exponential squared loss (ESL) function with a tuning parameter. Although the ESL function has since been applied to many important statistical models, to date no work has studied the partially nonlinear model with the ESL function in the presence of outliers. To fill this gap, we propose a robust and efficient estimation method for the partially nonlinear model based on the ESL function. Under certain conditions, we show that the proposed estimators achieve the best convergence rates, and we establish their asymptotic normality. In addition, we develop a new minorization–maximization (MM) algorithm to compute the estimates of both the nonparametric and parametric parts, and we present a procedure for deriving initial values. Finally, we provide a data-driven approach for selecting the tuning parameters. Numerical simulations and a real data analysis illustrate that, in the presence of outliers, the proposed ESL method is more robust and efficient for partially nonlinear models than the existing linear approximation method and the composite quantile regression method.
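As a rough illustration of the two ingredients named in the abstract (this is not the paper's exact algorithm, which also estimates the nonparametric part and selects the tuning parameter data-adaptively), the sketch below fits a plain linear model by maximizing the ESL objective \(\sum_i \exp(-r_i^2/\gamma)\) with an MM iteration: since \(\exp(-u)\) is convex in \(u=r^2/\gamma\), its tangent at the current residuals minorizes the objective, and maximizing the surrogate reduces to a weighted least squares step with weights \(w_i=\exp(-r_i^2/\gamma)\). All function names, the simulated data, and the choice \(\gamma=2\) are our own.

```python
import numpy as np

def esl_mm_linear(X, y, gamma=2.0, n_iter=200, tol=1e-10):
    """Maximize sum_i exp(-r_i^2/gamma) for a linear model via MM.

    Because exp(-u) is convex in u = r^2/gamma, its tangent at the
    current residuals minorizes the objective; maximizing the surrogate
    is a weighted least squares step with w_i = exp(-r_i^2/gamma).
    """
    beta = np.linalg.lstsq(X, y, rcond=None)[0]   # least squares start
    for _ in range(n_iter):
        r = y - X @ beta
        w = np.exp(-r**2 / gamma)                 # gross outliers get ~0 weight
        Xw = X * w[:, None]
        beta_new = np.linalg.solve(X.T @ Xw, Xw.T @ y)
        if np.max(np.abs(beta_new - beta)) < tol:
            beta = beta_new
            break
        beta = beta_new
    return beta

# Toy data: y = 1 + 2x + noise, with 10% gross outliers.
rng = np.random.default_rng(0)
n = 200
x = rng.uniform(-2.0, 2.0, n)
X = np.column_stack([np.ones(n), x])
y = 1.0 + 2.0 * x + rng.normal(0.0, 0.3, n)
y[:20] += 15.0                                    # contamination

ols = np.linalg.lstsq(X, y, rcond=None)[0]        # pulled toward the outliers
rob = esl_mm_linear(X, y, gamma=2.0)              # essentially ignores them
```

The MM ascent property guarantees that each weighted least squares step does not decrease the ESL objective, which is the mechanism exploited by the algorithm proposed in the paper.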
References
Becker MP, Yang I, Lange K (1997) EM algorithms without missing data. Stat Methods Med Res 6:38–54
Huang TM, Chen H (2008) Estimating the parametric component of nonlinear partial spline model. J Multivar Anal 99(8):1665–1680
Huet S, Bouvier A, Poursat M-A, Jolivet E (2004) Statistical tools for nonlinear regression: a practical guide with S-plus and R examples. Springer, New York
Jiang Y, Li H (2014) Penalized weighted composite quantile regression in the linear regression model with heavy-tailed autocorrelated errors. J Korean Stat Soc 43:531–543
Jiang Y (2015) Robust estimation in partially linear regression models. J Appl Stat 42(11):2497–2508
Jiang Y (2016) An exponential-squared estimator in the autoregressive model with heavy-tailed errors. Stat Interface 9(2):233–238
Kai B, Li R, Zou H (2011) New efficient estimation and variable selection methods for semiparametric varying-coefficient partially linear models. Ann Stat 39(1):305–332
Lange K, Hunter DR, Yang I (2000) Optimization transfer using surrogate objective functions (with discussion). J Comput Graph Stat 9(1):1–20
Li R, Nie L (2007) A new estimation procedure for a partially nonlinear model via a mixed-effects approach. Can J Stat 35(3):399–411
Li R, Nie L (2008) Efficient statistical inference procedures for partially nonlinear models and their applications. Biometrics 64(3):904–911
Li R, Liang H (2008) Variable selection in semiparametric regression modeling. Ann Stat 36(1):261–286
Liu JC, Zhang RQ, Zhao WH, Lv YZ (2013) A robust and efficient estimation method for single index models. J Multivar Anal 122:226–238
Lv J, Yang H, Guo CH (2015a) An efficient and robust variable selection method for longitudinal generalized linear models. Comput Stat Data Anal 82:74–88
Lv J, Yang H, Guo CH (2015b) Robust smooth-threshold estimating equations for generalized varying-coefficient partially linear models based on exponential score function. J Comput Appl Math 280:125–140
Mack YP, Silverman BW (1982) Weak and strong uniform consistency of kernel regression estimates. Probab Theory Relat Fields 61(3):405–415
Ruppert D, Sheather SJ, Wand MP (1995) An effective bandwidth selector for local least squares regression. J Am Stat Assoc 90(432):1257–1270
Ruppert D, Wand MP, Carroll RJ (2003) Semiparametric regression. Cambridge University Press, New York
Song LX, Zhao Y, Wang XG (2010) Sieve least squares estimation for partially nonlinear models. Stat Probab Lett 80(17–18):1271–1283
Song WX, Yao W, Xing YR (2014) Robust mixture regression model fitting by Laplace distribution. Comput Stat Data Anal 71:128–137
Tang LJ, Zhou ZG, Wu CC (2012) Efficient estimation and variable selection for infinite variance autoregressive models. J Appl Math Comput 40:399–413
Wang X, Jiang Y, Huang M, Zhang H (2013) Robust variable selection with exponential squared loss. J Am Stat Assoc 108(502):632–643
Yao W, Li L (2014) A new regression model: modal linear regression. Scand J Stat 41(3):656–671
Yao W, Lindsay BG, Li R (2012) Local modal regression. J Nonparametric Stat 24(3):647–663
Yatchew A (1997) An elementary estimator of the partial linear model. Econ Lett 57(2):135–143
Yu C, Chen K, Yao W (2015) Outlier detection and robust mixture modeling using nonconvex penalized likelihood. J Stat Plan Inference 164:27–38
Zhang RQ, Zhao WH, Liu JC (2013) Robust estimation and variable selection for semiparametric partially linear varying coefficient model based on modal regression. J Nonparametric Stat 25(2):523–544
Acknowledgements
Jiang’s research is partially supported by the National Natural Science Foundation of China (No. 11301221) and the Fundamental Research Funds for the Central Universities (No. 11615455). Partial work was done when the first author visited the Department of Statistics and Actuarial Science of HKU. Fei’s work is supported in part by the National Natural Science Foundation of China (No. 11561071).
Appendix
For convenience, we define the following notation:
Before we prove Theorem 1, we first prove the following two lemmas.
Lemma 1
Assume that Conditions (C1)–(C2) and (C5)–(C7) hold. Then we have
and
for \(j=0,1\).
The proof of Lemma 1 is similar to that of Lemma 1 in Yao et al. (2012). Therefore, we omit it here.
Lemma 2
Under Conditions (C1)–(C7), with probability approaching 1, there exists a consistent local maximizer of (2.2), denoted by \(\tilde{\varvec{\theta }}\), such that
where \(c_n \,\hat{=}\,(nh_1)^{-1/2}+h_1^2\).
Proof of Lemma 2
Recall that \(\ell _n(\varvec{\theta }) = \ell _n(a, b, \varvec{\beta })\) is defined by (2.2). It suffices to show that for any given \(\delta >0\), there exists a large constant \(C>0\) such that
for any \((d+2)\)-dimensional vector \(\mathbf{v}\) satisfying \(\Vert \mathbf{v}\Vert =C\). Note that
By applying the first-order Taylor expansion and noting Conditions (C3)–(C4), we obtain
so that
where \(\varepsilon _i^*\) is a point between \(\varepsilon _i+r_i\) and \(\varepsilon _i+r_i-c_n{\mathbf {z}}_i^{*\!\top \!}\mathbf{v}\).
Under regularity conditions (C3)–(C4), the mean and variance of \(I_1\) can be directly calculated as
Therefore, we have
Similarly, we can obtain \(I_3=O_p(nc_n^3)\).
By Lemma 1, it follows that
where \(\varvec{\Sigma }_1(t)=F(t,\gamma _1)E[{\mathbf {A}}({\mathbf {x}})|T=t]\). Noting that \(\Vert \mathbf{v}\Vert =C\), we can choose \(C\) sufficiently large so that \(I_2\) dominates both \(I_1\) and \(I_3\) with probability at least \(1-\delta \). By Condition (C1), we have \(F(t,\gamma _1)<0\), so \(\varvec{\Sigma }_1(t)\) is negative definite. The proof of Lemma 2 is completed. \(\square \)
Proof of Theorem 1
Note that
From (6.4) and the Taylor expansion, we know that \(\tilde{{\varvec{\upalpha }}}\) satisfies
where \(\varepsilon _i^*\) lies between \(\varepsilon _i\) and \(\varepsilon _i+\tilde{r}_i\),
Therefore, the second term on the right-hand side of (6.5) is
From Lemma 1, we have
By applying Lemma 2, we have \(\Vert \tilde{{\varvec{\upalpha }}}-{\varvec{\upalpha }}_{0}\Vert =O_p((nh_1)^{-1/2}+h_1^2)\). Thus,
Since
we have
so that
where \(\mathbf {w}_n=\sum _{i=1}^{n}{\mathbf {z}}_i^*K_{i,h_1}\phi _{\gamma _1}'(\varepsilon _i)\). By Condition (C2), we have \(E(\mathbf {w}_n)= \mathbf{0}\) and
Let \(\mathbf {w}_n^* \,\hat{=}\,\sqrt{h_1/n}\, \mathbf {w}_n\) and \(\zeta _i \,\hat{=}\,\sqrt{h_1/n}\,{\mathbf {d}}^{\!\top \!}{\mathbf {z}}_i^*K_{i,h_1}\phi _{\gamma _1}'(\varepsilon _i)\), where \({\mathbf {d}}\) is a unit vector satisfying \(\Vert {\mathbf {d}}\Vert =1\). Then, \({\mathbf {d}}^{\!\top \!}\mathbf {w}_n^*=\sum _{i=1}^{n}\zeta _i\). By (6.6), we have
Since \(\phi _{\gamma _1}(\cdot )\) is bounded and \(K(\cdot )\) has compact support, direct calculation gives \(nE|\zeta _1|^3=O((nh_1)^{-1/2})\rightarrow 0\). By Lyapunov's condition, we obtain
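For completeness, the Lyapunov condition invoked here (with third absolute moments, i.e. \(\delta =1\)) takes the following standard form; since the \(\zeta _i\) are i.i.d. with mean zero,

```latex
\frac{\sum _{i=1}^{n}E|\zeta _i|^{3}}
     {\Big (\sum _{i=1}^{n}\mathrm {Var}(\zeta _i)\Big )^{3/2}}
=\frac{nE|\zeta _1|^{3}}{\big (n\,\mathrm {Var}(\zeta _1)\big )^{3/2}}
\rightarrow 0,
```

because the denominator converges to a positive constant by (6.6), while the numerator satisfies \(nE|\zeta _1|^{3}=O((nh_1)^{-1/2})\rightarrow 0\); the central limit theorem for triangular arrays then yields the asymptotic normality of \({\mathbf {d}}^{\!\top \!}\mathbf {w}_n^*\).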
Therefore
The proof is completed. \(\square \)
Lemma 3
Let \((X_1,U_1)^{\!\top \!}, \ldots , (X_n, U_n)^{\!\top \!}\) be an i.i.d. random sample from the population random vector \((X, U)^{\!\top \!}\) with joint density \(p(x, u)\). Assume that \(E|U|^s<\infty \) and \(\sup _{x}\int |u|^sp(x,u)\,\mathrm {d}u<\infty \) for some \(s \ge 2\). Let \(K(\cdot )>0\) be a bounded function with bounded support, satisfying the Lipschitz condition. Then,
provided that \(n^{2t-1}h\rightarrow \infty \) for some \(t<1-s^{-1}\).
The proof of Lemma 3 can be found in Mack and Silverman (1982).
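To make Lemma 3 concrete, the following sketch (illustrative only; the function names, constants, and bandwidth choice are our own) checks numerically that the sup-norm deviation of a kernel-weighted average \(n^{-1}\sum _i K_h(X_i-x)U_i\) from its limit shrinks as \(n\) grows. It uses the Epanechnikov kernel, which is bounded, compactly supported, and Lipschitz, as the lemma requires.

```python
import numpy as np

def kernel_average(x_grid, X, U, h):
    """n^{-1} sum_i K_h(X_i - x) U_i with the Epanechnikov kernel."""
    t = (X[None, :] - x_grid[:, None]) / h
    K = np.where(np.abs(t) <= 1.0, 0.75 * (1.0 - t**2), 0.0) / h
    return K @ U / len(X)

def sup_deviation(n, h, rng):
    """Sup over an interior grid of |kernel average - its limit|.

    With X ~ U(0,1) (so f_X = 1) and E[U|X=x] = 2x, the limit of the
    kernel average at an interior point x is f_X(x) E[U|X=x] = 2x.
    """
    X = rng.uniform(0.0, 1.0, n)
    U = 2.0 * X + rng.normal(0.0, 0.5, n)
    grid = np.linspace(0.1, 0.9, 50)   # interior grid avoids boundary bias
    est = kernel_average(grid, X, U, h)
    return np.max(np.abs(est - 2.0 * grid))

rng = np.random.default_rng(1)
d_small = sup_deviation(500, 0.3 * 500 ** (-0.2), rng)
d_large = sup_deviation(20000, 0.3 * 20000 ** (-0.2), rng)
```

With the bandwidth shrinking at the rate \(h \propto n^{-1/5}\), the larger sample gives a visibly smaller uniform deviation, in line with the \(O_p\) rate asserted by the lemma.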
Lemma 4
Under the conditions in Theorem 1, we have
Proof of Lemma 4
Let \(\tilde{{\varvec{\uplambda }}}=\sqrt{nh_1}(\tilde{{\varvec{\upalpha }}}-{\varvec{\upalpha }}_{0})\). Then, \(\tilde{{\varvec{\uplambda }}}\) is the maximizer of
Using the Taylor expansion, we have
where \({\varvec{\Delta }}_n = \frac{1}{n}\sum _{i=1}^{n} \tau _n^2{\mathbf {z}}_i^{*}{\mathbf {z}}_i^{*\!\top \!} K_{i,h_1}\phi _{\gamma _1}''(\varepsilon _i)\). By Lemma 3, we obtain
By the regularity conditions, we have \(E[{\varvec{\Delta }}_n]=f_T(t)\varvec{\Sigma }_1(t)+O(h_1^2)\). Therefore,
According to (6.7), we have
Since \(\tilde{{\varvec{\uplambda }}}=\sqrt{nh_1}(\tilde{{\varvec{\upalpha }}}-{\varvec{\upalpha }}_{0})\), we obtain
holds uniformly in \(t\in {\mathbb {T}}\). The proof is completed. \(\square \)
Lemma 5
Let Conditions (C1)–(C7) hold, and assume \(nh_1^4\rightarrow 0\) and \(nh_1^2/[\log (1/h_1)] \rightarrow \infty \) as \(n\rightarrow \infty \). Then, with probability approaching 1, there exists a local maximizer \({\varvec{\hat{\beta }}}_n\) of (2.3) such that
Proof of Lemma 5
Let \(\chi _i=\tilde{m}(T_i)-m(T_i)\) and \(R(\varvec{\beta })=\frac{1}{n}\sum _{i=1}^{n}\phi _{\gamma _2}(Y_i-\tilde{m}(T_i)-g({\mathbf {x}}_i; \varvec{\beta }))\). Using the Taylor expansion and (6.4), we have
where \(\Vert \varvec{\mathrm{e}}\Vert =C\) for a large constant \(C\), and \(\varepsilon _i^*\) lies between \(\varepsilon _i-\chi _i- \varvec{\mathrm{e}}^{\!\top \!} g'({\mathbf {x}}_i; \varvec{\beta }_0)/ \sqrt{n}\) and \(\varepsilon _i-\chi _i\). By Lemma 4, we have
By (6.8), together with \(nh_1^4\rightarrow 0\) and \(nh_1^2/[\log (1/h_1)]\rightarrow \infty \) as \(n\rightarrow \infty \), we have \(E(I_6)=O(Ch_1^2/\sqrt{n})\) and \(\text{ Var }(I_6)=O(C^2/n^2)\). Therefore, \(I_6=O_p(n^{-1})\). Similarly, \(I_8=O_p(n^{-3/2})\). For \(I_7\), we have
where \(\varvec{\Sigma }_4=E[F(t,\gamma _2)g'({\mathbf {x}};\varvec{\beta }_0)g'({\mathbf {x}};\varvec{\beta }_0)^{\!\top \!}]\). Noting that \(\Vert \varvec{\mathrm{e}}\Vert =C\), we can choose \(C\) sufficiently large so that \(I_7\) dominates both \(I_6\) and \(I_8\) with probability at least \(1-\delta \). Since \(F(t,\gamma _2)<0\), we have
The proof is completed. \(\square \)
Proof of Theorem 2
Let \(\tilde{\varphi }_i = \tilde{m}(T_i)-m(T_i)+g'({\mathbf {x}};\varvec{\beta }_0)^{\!\top \!}({\varvec{\hat{\beta }}}_n-\varvec{\beta }_0)\). Then, \({\varvec{\hat{\beta }}}_n\) satisfies the following equation
where
By (6.8), we can write \(J_1\) as
Since \(nh_1^4\rightarrow 0\) and \(nh_1^2/[\log (1/h_1)] \rightarrow \infty \) as \(n\rightarrow \infty \), we have
so that
By calculating the second moment, it can be shown that \(J_1-J_3\xrightarrow {\mathrm{P}} 0\), where \(J_3=-\sum _{j=1}^{n} {\varvec{\kappa }}(T_j)\) with
On the other hand, \(J_2=-n\varvec{\Sigma }_4({\varvec{\hat{\beta }}}_n-\varvec{\beta }_0)\). Since \(|\tilde{\varphi }_i|=O_p(\Vert {\varvec{\hat{\beta }}}_n-\varvec{\beta }_0\Vert )=o_p(1)\) and \(|\tilde{\varphi }_i|^2=o_p(1)O_p(\Vert {\varvec{\hat{\beta }}}_n-\varvec{\beta }_0\Vert )=o_p(\Vert {\varvec{\hat{\beta }}}_n-\varvec{\beta }_0\Vert )\), we obtain
Therefore,
The proof is completed by Slutsky’s Theorem and the Central Limit Theorem. \(\square \)
Proof of Theorem 3
According to Theorem 2, we have \(\Vert {\varvec{\hat{\beta }}}_n-\varvec{\beta }_0\Vert =O_p(1/\sqrt{n})\). Following the ideas in the proof of Theorem 1, we can readily obtain the result of Theorem 3. \(\square \)
Cite this article
Jiang, Y., Tian, GL. & Fei, Y. A robust and efficient estimation method for partially nonlinear models via a new MM algorithm. Stat Papers 60, 2063–2085 (2019). https://doi.org/10.1007/s00362-017-0909-5