Two-stage estimation and simultaneous confidence band in partially nonlinear additive model

Li, Rui; Zhang, Yuanyuan

doi:10.1007/s00184-021-00808-3

Published: 02 February 2021

Metrika, volume 84, pages 1109–1140 (2021)

Abstract

In this paper, we focus on estimation and inference in the partially nonlinear additive model, on which, to the best of our knowledge, little research has been conducted. By integrating spline approximation and local smoothing, we propose a two-stage estimating approach in which the profile nonlinear least squares method is used to estimate the parameters and additive functions. Under some regularity conditions, we establish the asymptotic normality of the parametric estimators and achieve the optimal nonparametric convergence rate for the fitted functions. Furthermore, a spline-backfitted local linear estimator is proposed for the additive functions, and the corresponding asymptotic distribution is also established. To make inference on the nonparametric functions as a whole, we construct theoretical simultaneous confidence bands and further propose an empirical bootstrap-based confidence band to ease the heavy computational burden in implementation. Finally, both Monte Carlo simulation and real data analysis show the good performance of the proposed methods.


References

Bates DM, Watts DG (1988) Nonlinear regression analysis and its applications. Wiley, New York
Biedermann S, Dette H, Woods DC (2011) Optimal design for additive partially nonlinear models. Biometrika 98(2):449–458
Bowman AW (1984) An alternative method of cross-validation for the smoothing of density estimates. Biometrika 71(2):353–360
Breiman L, Friedman JH (1985) Estimating optimal transformations for multiple regression and correlation. J Am Stat Assoc 80(391):580–598
Cai Z, Xu X (2008) Nonparametric quantile estimations for dynamic smooth coefficient models. J Am Stat Assoc 103(484):1595–1608
Claeskens G, Van Keilegom I (2003) Bootstrap confidence bands for regression curves and their derivatives. Ann Stat 31(6):1852–1884
Currie DJ (1982) Estimating Michaelis-Menten parameters: bias, variance and experimental design. Biometrics 38(4):907–919
De Boor C (2001) A practical guide to splines. Appl Math Sci
Donthi R, Prasad SV, Mahaboob B, Praveen JP, Venkateswarlu B (2019) Estimation methods of nonlinear regression models. In: AIP conference proceedings, 2177(1), 020081. AIP Publishing
Fan J, Gijbels I (1996) Local polynomial modelling and its applications. Chapman & Hall, London
Fan J, Härdle W, Mammen E et al (1998) Direct estimation of low-dimensional components in additive models. Ann Stat 26(3):943–971
Fan J, Zhang W (2000) Simultaneous confidence bands and hypothesis testing in varying-coefficient models. Scand J Stat 27(4):715–731
Härdle W, Liang H, Gao J (2012) Partially linear models. Springer Science & Business Media
Härdle W, Sperlich S, Spokoiny V (2001) Structural tests in additive regression. J Am Stat Assoc 96(456):1333–1347
Harrison D Jr, Rubinfeld DL (1978) Hedonic housing prices and the demand for clean air. J Environ Econ Manage 5(1):81–102
Hart JD, Wehrly TE (1993) Consistency of cross-validation when the data are curves. Stochast Process Appl 45(2):351–361
Huang L-S, Yu C-H (2019) Classical backfitting for smooth-backfitting additive models. J Comput Graph Stat 28(2):386–400
Imhof LA et al (2001) Maximin designs for exponential growth models and heteroscedastic polynomial models. Ann Stat 29(2):561–576
Jiang Y, Tian G-L, Fei Y (2019) A robust and efficient estimation method for partially nonlinear models via a new MM algorithm. Stat Pap 60(6):2063–2085
Kong E, Xia Y (2012) A single-index quantile regression model and its estimation. Econom Theory 28(4):730–768
Li G, Peng H, Tong T (2013) Simultaneous confidence band for nonparametric fixed effects panel data models. Econ Lett 119(3):229–232
Li Q (2000) Efficient estimation of additive partially linear models. Int Econ Rev 41(4):1073–1092
Li Q, Racine JS (2007) Nonparametric econometrics: theory and practice. Princeton University Press, Princeton
Li R, Nie L (2007) A new estimation procedure for a partially nonlinear model via a mixed-effects approach. Can J Stat 35(3):399–411
Li R, Nie L (2008) Efficient statistical inference procedures for partially nonlinear models and their applications. Biometrics 64(3):904–911
Li Y, Ruppert D (2008) On the asymptotics of penalized splines. Biometrika 95(2):415–436
Liang H, Thurston SW, Ruppert D, Apanasovich T, Hauser R (2008) Additive partial linear models with measurement errors. Biometrika 95(3):667–678
Liu X, Wang L, Liang H (2011) Estimation and variable selection for semiparametric additive partial linear models. Statistica Sinica 21(3):1225–1248
Ma S, Lian H, Liang H, Carroll R (2017) SiAM: a hybrid of single index models and additive models. Electron J Stat 11(1):2397–2423
Ma S, Yang L (2011) Spline-backfitted kernel smoothing of partially linear additive model. J Stat Plan Inf 141(1):204–219
Mammen E, Linton O, Nielsen JP (1999) The existence and asymptotic properties of a backfitting projection algorithm under weak conditions. Ann Stat 27(5):1443–1490
Manzan S, Zerom D (2005) Kernel estimation of a partially linear additive model. Stat Probab Lett 72(4):313–322
Nielsen JP, Sperlich S (2005) Smooth backfitting in practice. J Roy Stat Soc B 67(1):43–61
Riazoshams H, Midi H, Ghilagaber G (2018) Robust nonlinear regression: with applications using R. Wiley, Hoboken
Rice JA, Silverman BW (1991) Estimating the mean and covariance structure nonparametrically when the data are curves. J Roy Stat Soc B 53(1):233–243
Rudemo M (1982) Empirical choice of histograms and kernel density estimators. Scand J Stat 9(2):65–78
Schumaker LL (1981) Spline functions: basic theory. Wiley, New York
Seber GA, Wild CJ (2003) Nonlinear regression. Wiley-Interscience, Hoboken
Severini TA, Wong WH et al (1992) Profile likelihood and conditionally parametric models. Ann Stat 20(4):1768–1802
Silverman BW (1986) Density estimation for statistics and data analysis. Chapman and Hall, London
Song L, Zhao Y, Wang X (2010) Sieve least squares estimation for partially nonlinear models. Stat Probab Lett 80(17–18):1271–1283
Stone CJ et al (1984) An asymptotically optimal window selection rule for kernel density estimates. Ann Stat 12(4):1285–1297
Su L, Ullah A (2006) Profile likelihood estimation of partially linear panel data models with fixed effects. Econ Lett 92(1):75–81
Tjøstheim D, Auestad BH (1994) Nonparametric identification of nonlinear time series: projections. J Am Stat Assoc 89(428):1398–1409
Wang J, Yang L (2009) Efficient and fast spline-backfitted kernel smoothing of additive models. Ann Inst Stat Math 61(3):663–690
Wang Z, Xue L, Liu J (2019) Checking nonparametric component for partially nonlinear model with missing response. Stat Probab Lett 149:1–8
Wu TZ, Yu K, Yu Y (2010) Single-index quantile regression. J Multivar Anal 101(7):1607–1621
Xiao Y, Tian Z, Li F (2014) Empirical likelihood-based inference for parameter and nonparametric function in partially nonlinear models. J Korean Stat Soc 43(3):367–379
Xie H, Huang J et al (2009) SCAD-penalized regression in high-dimensional partially linear models. Ann Stat 37(2):673–696
Yang L, Park BU, Xue L, Härdle W (2006) Estimation and testing for varying coefficients in additive models with marginal integration. J Am Stat Assoc 101(475):1212–1227
Yang L, Sperlich S, Härdle W (2003) Derivative estimation and testing in generalized additive models. J Stat Plan Inf 115(2):521–542
Yu K, Lu Z (2004) Local linear additive quantile regression. Scand J Stat 31(3):333–346
Zhang HH, Cheng G, Liu Y (2011) Linear or nonlinear? Automatic structure discovery for partially linear models. J Am Stat Assoc 106(495):1099–1112
Zhang Y, Lian H, Yu Y (2017) Estimation and variable selection for quantile partially linear single-index models. J Multivar Anal 162:215–234
Zhou S, Shen X, Wolfe D (1998) Local asymptotics for regression splines and confidence regions. Ann Stat 26(5):1760–1782
Zhou X, Zhao P, Liu Z (2016) Estimation and inference for additive partially nonlinear models. J Korean Stat Soc 45(4):491–504


Acknowledgements

Li’s research was supported by a grant from the National Social Science Fund of China (No. 17BTJ025) and the Open Research Fund of Key Laboratory of Advanced Theory and Application in Statistics and Data Science (East China Normal University), Ministry of Education (No. KLATASDS1802).

Author information

Authors and Affiliations

School of Statistics and Information, Shanghai University of International Business and Economics, Shanghai, China
Rui Li & Yuanyuan Zhang
Key Laboratory of Advanced Theory and Application in Statistics and Data Science, (East China Normal University), Ministry of Education, Shanghai, China
Rui Li

Corresponding author

Correspondence to Rui Li.

Ethics declarations

Conflict of interest

The authors state that there is no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

To start with, we review some properties of B-spline functions. Let $\mathbf{B}(u)=(B_1(u),\ldots ,B_L(u))^\top $ be the B-spline basis over [0, 1]; then $B_l(u)\ge 0$ and $\sum _{l=1}^{L}B_l(u)=\sqrt{L}$ for each $u\in [0,1]$. Moreover, there exist constants $0<C_1<C_2$ such that, for any vector $\varvec{\theta }=(\theta _1,\ldots ,\theta _L)^\top $,

$$\begin{aligned} C_1\Vert \varvec{\theta }\Vert ^2\le \int \left\{ \sum _{l=1}^{L}\theta _l B_l(u)\right\} ^2du \le C_2\Vert \varvec{\theta }\Vert ^2. \end{aligned}$$
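The two pointwise properties above can be checked numerically. The following is a minimal sketch, assuming that $B_l(u)$ is the standard (partition-of-unity) B-spline on a clamped knot vector multiplied by $\sqrt{L}$, which is one scaling consistent with $\sum_l B_l(u)=\sqrt{L}$; the function name `scaled_bspline_basis` and the knot placement are illustrative choices, not taken from the paper.

```python
import math

def scaled_bspline_basis(u, interior_knots, degree=3):
    """Evaluate sqrt(L)-scaled B-spline basis functions at u in [0, 1].

    Assumption: clamped knot vector on [0, 1]; L = len(interior_knots) + degree + 1.
    """
    t = [0.0] * (degree + 1) + list(interior_knots) + [1.0] * (degree + 1)
    n_basis = len(t) - degree - 1
    # Degree-0 basis: indicators of half-open knot intervals.
    b = [1.0 if t[j] <= u < t[j + 1] else 0.0 for j in range(len(t) - 1)]
    if u == 1.0:
        # Close the last non-degenerate interval on the right.
        b[n_basis - 1] = 1.0
    # Cox-de Boor recursion, raising the degree one step at a time.
    for d in range(1, degree + 1):
        nb = []
        for j in range(len(t) - 1 - d):
            left_den = t[j + d] - t[j]
            right_den = t[j + d + 1] - t[j + 1]
            left = (u - t[j]) / left_den * b[j] if left_den > 0 else 0.0
            right = (t[j + d + 1] - u) / right_den * b[j + 1] if right_den > 0 else 0.0
            nb.append(left + right)
        b = nb
    scale = math.sqrt(n_basis)
    return [scale * v for v in b]

knots = [0.25, 0.5, 0.75]          # illustrative interior knots
values = scaled_bspline_basis(0.4, knots)
print(len(values), min(values) >= 0.0, sum(values))
```

With three interior knots and cubic splines, $L=7$, so the printed sum should agree with $\sqrt{7}$ up to rounding.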

Lemma 1

If a function $\phi (u)$ defined on the support $\mathcal {U}$ has r-order continuous derivatives with $r\ge 2$, there exists $\varvec{\gamma }=(\gamma _1,\ldots ,\gamma _L)^\top $ such that

$$\begin{aligned} \sup _{u\in \mathcal {U}}|\phi (u)-\mathbf{B}^\top (u)\varvec{\gamma }|=O(L^{-r}). \end{aligned}$$

Proof

Lemma 1 follows directly from Theorem XII.1 in De Boor (2001).

$\square $

Lemma 2

If conditions (B1)–(B2) are satisfied, then

$$\begin{aligned} \frac{1}{n}\sum _{i=1}^{n}K_h(u_{ik}-u_0) \begin{pmatrix} 1 &{} u_{ik}-u_0\\ (u_{ik}-u_0)/h^2 &{} (u_{ik}-u_0)^2/h^2 \end{pmatrix} =\mathbf{Q}+o_p(1), \end{aligned}$$

where

$$\begin{aligned} \mathbf{Q}=\begin{pmatrix} f(u_0) &{} 0\\ \kappa _2f'(u_0)&{}\kappa _2f(u_0) \end{pmatrix}, \end{aligned}$$

moreover, it follows that

$$\begin{aligned} \mathbf{Q}^{-1}= \begin{pmatrix} 1/f(u_0) &{} 0\\ -f'(u_0)/f^2(u_0)&{}1/(\kappa _2f(u_0)) \end{pmatrix}. \end{aligned}$$

Proof

Exercise 2.7 in Li and Racine (2007) establishes this result, so we omit the proof. $\square $
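As a quick check of the stated inverse, write $f=f(u_0)$ and $f'=f'(u_0)$ for brevity and assume $f>0$ and $\kappa _2>0$ so that all entries are well defined. Multiplying out gives

$$\begin{aligned} \mathbf{Q}\mathbf{Q}^{-1}= \begin{pmatrix} f & 0\\ \kappa _2f' & \kappa _2f \end{pmatrix} \begin{pmatrix} 1/f & 0\\ -f'/f^2 & 1/(\kappa _2f) \end{pmatrix} = \begin{pmatrix} 1 & 0\\ \kappa _2f'/f-\kappa _2f'/f & 1 \end{pmatrix} =\mathbf{I}_{2\times 2}. \end{aligned}$$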

Lemma 3

Suppose the conditions (B1)–(B2) hold and denote

$$\begin{aligned} m(u_{ik},u_0)=(1,0)(\mathbf{D}^{\top }(u_0)\mathbf{W}(u_0)\mathbf{D}(u_0))^{-1}\mathbf{D}_i(u_0)K_h(u_{ik}-u_0) \end{aligned}$$

with $\mathbf{D}_i(u_0)$ being the ith column of $\mathbf{D}^{\top }(u_0)$, we get

$(1)\, m(u_{ik},u_0)=n^{-1}K_h(u_{ik}-u_0)f^{-1}(u_0)\{1+o_p(1)\};$

$(2)\, \lim _{n\rightarrow \infty }P_n\{\underset{u_0\in [0,1]}{\sup }\underset{1\le i\le n}{\max }| m(u_{ik},u_0)|\le C(nh)^{-1}\}=1$.

Proof

The conclusions follow from Lemma 4.1 in Su and Ullah (2006). $\square $

Proof of Theorem 1

Let $\varvec{\gamma }_{0}=(\varvec{\gamma }^\top _{01},\ldots ,\varvec{\gamma }^\top _{0p})^{\top }$ be the true spline coefficient of $\alpha _{0k}(\cdot )$ for $k=1,2,\ldots ,p$, and denote $R_{k}(u_{k})=\alpha _{0k}(u_{k})-\mathbf{B}^{\top }(u_{k})\varvec{\gamma }_{0k}$,

$$\begin{aligned} \mathbf{R}(u)=(R_1(u_1),R_{2}(u_2),\ldots ,R_{p}(u_p))^{\top }\, \mathrm {and} \, \mathbf{R}_{p\times n}=(\mathbf{R}(u_1),\mathbf{R}(u_{2}),\ldots ,\mathbf{R}(u_n)). \end{aligned}$$

Under assumptions (A3)–(A5) and by Corollary 6.21 in Schumaker (1981), we get $\Vert \mathbf{B}(u)\Vert =O(\sqrt{K})$ and $\Vert R_{k}(u)\Vert =O(L^{-r})$, with r defined in condition (A4).

To prove the $\sqrt{n}$-consistency of ${{\widehat{\varvec{\beta }}}}$, it suffices to show that for any $\zeta >0$, there exists a sufficiently large constant $C>0$ such that

$$\begin{aligned} P\{ \underset{\Vert \mathbf{v}\Vert =C}{\inf }Q(\varvec{\beta }_0+n^{-1/2}\mathbf{v})>Q(\varvec{\beta }_0) \}\ge 1-\zeta , \end{aligned}$$

(6.13)

where $\mathbf{v}$ is an m-dimensional constant vector. Let $Q'(\cdot )$ and $Q''(\cdot )$ be the first- and second-order derivatives of $Q(\cdot )$, respectively. Taking the Taylor expansion of $Q(\cdot )$ at $\varvec{\beta }_0$, we get

$$\begin{aligned} Q(\varvec{\beta }_0+n^{-1/2}\mathbf{v})-Q(\varvec{\beta }_0) =n^{-1/2}Q'(\varvec{\beta }_0)^{\top }\mathbf{v}+\frac{1}{2n}\mathbf{v}^{\top }Q''(\varvec{\beta }^{*})\mathbf{v}+o(n^{-1}) \end{aligned}$$

(6.14)

where $\varvec{\beta }^{*}$ lies between $\varvec{\beta }_0$ and $\varvec{\beta }_0+n^{-1/2}\mathbf{v}$. Let $E_{\mathcal {A}}\mathbf{g}'(\mathbf{X}, \varvec{\beta }_0)$ be the projection of $\mathbf{g}'(\mathbf{X}, \varvec{\beta }_0)$ onto the function class $\mathcal {A}$; then there exists a vector $\varvec{\gamma }^*$ such that $E_{\mathcal {A}}\mathbf{g}'(\mathbf{X}, \varvec{\beta }_0)-\mathbf{Z}\varvec{\gamma }^*=O(L^{-r})$. Moreover, $\mathbf{g}'(\mathbf{X}, \varvec{\beta }_0)-E_{\mathcal {A}}\mathbf{g}'(\mathbf{X}, \varvec{\beta }_0)$ and $E_{\mathcal {A}}\mathbf{g}'(\mathbf{X}, \varvec{\beta }_0)$ are orthogonal, and $\Vert [\mathbf{g}'(\mathbf{X}, \varvec{\beta }_0)]^{\top }(\mathbf{I}-\mathbf{M}_\mathbf{z})\Vert =\Vert [\mathbf{g}'(\mathbf{X}, \varvec{\beta }_0)-E_\mathcal {A}\mathbf{g}'(\mathbf{X}, \varvec{\beta }_0)]^{\top }(\mathbf{I}-\mathbf{M}_\mathbf{z})\Vert =O_p(K/\sqrt{n}+L^{-r})$. Thus, some calculations based on the Cauchy-Schwarz inequality show that

$$\begin{aligned} \begin{aligned} Q'(\varvec{\beta }_0)&=-2[\mathbf{g}'(\mathbf{X}, \varvec{\beta }_0)-E_{\mathcal {A}}\mathbf{g}'(\mathbf{X}, \varvec{\beta }_0)]^{\top }\mathbf{R}^{\top }\mathbf{1}_{p} -2[\mathbf{g}'(\mathbf{X}, \varvec{\beta }_0)-E_{\mathcal {A}}\mathbf{g}'(\mathbf{X}, \varvec{\beta }_0)]^{\top }{\varvec{\varepsilon }}\\&\quad +O_p(L^{-r}+K/\sqrt{n}) \end{aligned} \end{aligned}$$

(6.15)

in which $(\mathbf{I}-\mathbf{M}_\mathbf{z})\mathbf{Z}=\mathbf{P}\mathbf{Z}=\mathbf{0}$ with $\mathbf{M}_\mathbf{z}=\mathbf{Z}(\mathbf{Z}^{\top }\mathbf{Z})^{-1}\mathbf{Z}^{\top }$. Since $\mathrm {E}(\varepsilon )=0$ and $\mathrm {var}(\varepsilon )=\sigma ^{2}$, we have

$$\begin{aligned}&\mathrm {E}\left[n^{-1/2}\sum _{i=1}^{n} \{g'(\mathbf{x}_i, \varvec{\beta })-E_{\mathcal {A}}g'(\mathbf{x}_i, \varvec{\beta })\}\varepsilon _{i}|\mathbf{X},\mathbf{U}\right]=\mathbf{0}, \quad \mathrm {and} \\&\mathrm {var}\left[n^{-1/2}\sum _{i=1}^{n} \{g'(\mathbf{x}_i, \varvec{\beta })-E_{\mathcal {A}}g'(\mathbf{x}_i, \varvec{\beta })\}\varepsilon _{i}|\mathbf{X},\mathbf{U}\right] =\sigma ^2 \mathrm {E}[\{g'(\mathbf{x}_1, \varvec{\beta }) \\&\quad -E_{\mathcal {A}}g'(\mathbf{x}_1, \varvec{\beta })\}^{\otimes 2}|\mathbf{X},\mathbf{U}], \end{aligned}$$

which, combined with the Lindeberg-Lévy central limit theorem, leads to

$$\begin{aligned} n^{-1/2}\sum _{i=1}^{n}\{g'(\mathbf{x}_i, \varvec{\beta })-E_{\mathcal {A}}g'(\mathbf{x}_i, \varvec{\beta })\}\varepsilon _{i} \rightarrow _d N(0,\sigma ^{2}{\varvec{\Sigma }}_{\varvec{\beta }} ) \end{aligned}$$

(6.16)

where ${\varvec{\Sigma }}_{\varvec{\beta }}=\mathrm {E}[\{g'(\mathbf{x}_1, \varvec{\beta })-E_{\mathcal {A}}g'(\mathbf{x}_1, \varvec{\beta })\}^{\otimes 2}|\mathbf{X},\mathbf{U}]$. Consequently,

$$\begin{aligned} n^{-1/2}Q'(\varvec{\beta }_0)^\top \mathbf{v}&=-2n^{-1/2}\mathbf{1}_{p}^\top \mathbf{R}[\mathbf{g}'(\mathbf{X}, \varvec{\beta }_0)-E_{\mathcal {A}}\mathbf{g}'(\mathbf{X}, \varvec{\beta }_0)]\mathbf{v}\\&\quad -2n^{-1/2}{\varvec{\varepsilon }}^{\top }[\mathbf{g}'(\mathbf{X}, \varvec{\beta }_0)-E_{\mathcal {A}}\mathbf{g}'(\mathbf{X}, \varvec{\beta }_0)]\mathbf{v}+O_p(L^{-r}/\sqrt{n}+K/n) \end{aligned}$$

which, combined with assumptions (A1)–(A2), leads to

$$\begin{aligned} n^{-1/2}Q'(\varvec{\beta }_0)^{\top }\mathbf{v}=O_{p}(\Vert \mathbf{v}\Vert ). \end{aligned}$$

(6.17)

Similarly, some calculations with $\mathbf{g}(\mathbf{X},\varvec{\beta }_0)-\mathbf{g}(\mathbf{X}, \varvec{\beta }^{*})=O_{p}(n^{-1/2}\Vert \mathbf{v}\Vert )$ show that

$$\begin{aligned} \begin{aligned} Q''(\varvec{\beta }^{*})&=2[\mathbf{g}'(\mathbf{X}, \varvec{\beta }^{*})-E_{\mathcal {A}}\mathbf{g}'(\mathbf{X}, \varvec{\beta }^{*})]^{\top } [\mathbf{g}'(\mathbf{X}, \varvec{\beta }^{*})-E_{\mathcal {A}}\mathbf{g}'(\mathbf{X}, \varvec{\beta }^{*})]\\&\quad -2[\mathbf{g}''(\mathbf{X}, \varvec{\beta }^{*})-E_{\mathcal {A}}\mathbf{g}''(\mathbf{X}, \varvec{\beta }^{*})]^{\top } (\mathbf{R}^{\top }\mathbf{1}_{p} +{\varvec{\varepsilon }})+O_p(L^{-r}+K/\sqrt{n}), \end{aligned} \end{aligned}$$

which, combined with (6.16) and the idempotence of $\mathbf{I}-\mathbf{M}_\mathbf{z}$, results in

$$\begin{aligned} \begin{aligned} \frac{1}{2n}\mathbf{v}^{\top }Q''(\varvec{\beta }^{*})\mathbf{v}&=O_{p}(\Vert \mathbf{v}\Vert ^{2}). \end{aligned} \end{aligned}$$

(6.18)

Therefore, the combination of (6.14), (6.17) and (6.18) indicates that

$$\begin{aligned} P(Q(\varvec{\beta }_0+n^{-1/2}\mathbf{v})-Q(\varvec{\beta }_0)>0)\rightarrow 1 \end{aligned}$$

since $n^{-1/2}Q'(\varvec{\beta }_0)^{\top }\mathbf{v}$ is dominated by $\frac{1}{2n}\mathbf{v}^{\top }Q''(\varvec{\beta }^{*})\mathbf{v}$ uniformly in $\Vert \mathbf{v}\Vert =C$ for a sufficiently large C. Thus, (6.13) holds, i.e., with probability approaching 1, there exists a local minimizer ${{\widehat{\varvec{\beta }}}}$ such that $\Vert {{\widehat{\varvec{\beta }}}}-\varvec{\beta }_0 \Vert =O_p(n^{-1/2})$.
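To make the dominance step explicit, here is a sketch under two assumptions that this excerpt does not state outright: ${\varvec{\Sigma }}_{\varvec{\beta }}$ is positive definite with smallest eigenvalue $\lambda _{\min }>0$, and $\frac{1}{2n}Q''(\varvec{\beta }^{*})={\varvec{\Sigma }}_{\varvec{\beta }}(1+o_p(1))$ holds at the intermediate point $\varvec{\beta }^{*}$. Then, on $\Vert \mathbf{v}\Vert =C$, the expansion (6.14) gives

$$\begin{aligned} Q(\varvec{\beta }_0+n^{-1/2}\mathbf{v})-Q(\varvec{\beta }_0)\ge -\Vert n^{-1/2}Q'(\varvec{\beta }_0)\Vert \, C+\lambda _{\min }C^2\{1+o_p(1)\}+o(n^{-1}). \end{aligned}$$

Since $\Vert n^{-1/2}Q'(\varvec{\beta }_0)\Vert =O_p(1)$ by (6.17), for any $\zeta >0$ one can choose $M$ with $P(\Vert n^{-1/2}Q'(\varvec{\beta }_0)\Vert \le M)\ge 1-\zeta /2$ for all large n. Taking $C>2M/\lambda _{\min }$ then makes the right-hand side positive with probability at least $1-\zeta $ for all large n, which is the statement (6.13).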

Now we prove the asymptotic normality of ${{\widehat{\varvec{\beta }}}}$. Note that ${{\widehat{\varvec{\beta }}}}$ is the solution to $Q'({{\widehat{\varvec{\gamma }}}}(\varvec{\beta }),\varvec{\beta })\equiv Q'(\varvec{\beta }) =0$. Taking the Taylor expansion of $Q'(\varvec{\beta })$ at $\varvec{\beta }_0$, we get

$$\begin{aligned} Q'({{\widehat{\varvec{\beta }}}})=Q'(\varvec{\beta }_0) +Q''({\widetilde{\varvec{\beta }}})^{\top }({{\widehat{\varvec{\beta }}}}-\varvec{\beta }_0)+o(({{\widehat{\varvec{\beta }}}}-\varvec{\beta }_0)^{2}) \end{aligned}$$

(6.19)

where ${\widetilde{\varvec{\beta }}}$ lies between $\varvec{\beta }_0$ and ${{\widehat{\varvec{\beta }}}}$. Under assumptions (A1)–(A4), and by arguments similar to those for (6.18), we get

$$\begin{aligned} \frac{1}{2n}Q''({\widetilde{\varvec{\beta }}})={\varvec{\Sigma }}_{\varvec{\beta }} (1+o_{p}(1)). \end{aligned}$$

Applying (6.15) and $\Vert {{\widehat{\varvec{\beta }}}}-\varvec{\beta }_0 \Vert =O_p(n^{-1/2})$, we rewrite (6.19) as

$$\begin{aligned}&\frac{2}{n}[\mathbf{g}'(\mathbf{X}, \varvec{\beta }_0)-E_{\mathcal {A}}\mathbf{g}'(\mathbf{X}, \varvec{\beta }_0)]^{\top }\mathbf{R}^{\top }\mathbf{1}_p +\frac{2}{n}[\mathbf{g}'(\mathbf{X}, \varvec{\beta }_0)-E_{\mathcal {A}}\mathbf{g}'(\mathbf{X}, \varvec{\beta }_0)]^{\top }{\varvec{\varepsilon }}\\&\quad =2{\varvec{\Sigma }}_{\varvec{\beta }}({{\widehat{\varvec{\beta }}}}-\varvec{\beta }_0)[1+o_p(1)]+o(n^{-1}), \end{aligned}$$

and further

$$\begin{aligned}&\frac{1}{\sqrt{n}}[\mathbf{g}'(\mathbf{X}, \varvec{\beta }_0)-E_{\mathcal {A}}\mathbf{g}'(\mathbf{X}, \varvec{\beta }_0)]^{\top }{\varvec{\varepsilon }}\\&\quad ={\varvec{\Sigma }}_{\varvec{\beta }}\sqrt{n}({{\widehat{\varvec{\beta }}}}-\varvec{\beta }_0)[1+o_p(1)]. \end{aligned}$$

Hence, Slutsky's theorem leads to

$$\begin{aligned} \sqrt{n}({{\widehat{\varvec{\beta }}}}-\varvec{\beta }_0) \rightarrow _d N(0,\sigma ^{2}{\varvec{\Sigma }}_{\varvec{\beta }}^{-1}) \end{aligned}$$

and the proof of Theorem 1 is completed. $\square $

Proof of Theorem 2

Firstly, we prove the consistency of ${\widehat{\varvec{\gamma }}}$. Note that

$$\begin{aligned} {\widehat{\varvec{\gamma }}}-\varvec{\gamma }_{0}&=(\mathbf{Z}^{\top }\mathbf{Z})^{-1}\mathbf{Z}^{\top }(\mathbf{g}(\mathbf{X},\varvec{\beta }_0)-\mathbf{g}(\mathbf{X},{{\widehat{\varvec{\beta }}}})) +(\mathbf{Z}^{\top }\mathbf{Z})^{-1}\mathbf{Z}^{\top }\mathbf{R}^{\top }\mathbf{1}_p+(\mathbf{Z}^{\top }\mathbf{Z})^{-1}\mathbf{Z}^{\top }{\varvec{\varepsilon }}\\&{\mathop {=}\limits ^\mathrm{def}}J_1+J_2+J_3. \end{aligned}$$

Simple calculations show that $J_1=O_p(\frac{\sqrt{L}}{n})=o_p(n^{-1/2})$, $J_2=O_p(n^{-1}L^{\frac{1}{2}-r})=o_p(n^{-1})$ and $J_3=O_p(\sqrt{L/n})$. Thus $J_3$ is the dominant term, which leads to $\Vert {\widehat{\varvec{\gamma }}} -\varvec{\gamma }_0 \Vert =O_p(\sqrt{L/n})$. Moreover,

$$\begin{aligned} \Vert {{\widehat{\alpha }}}_{k}(\cdot )-\alpha _{0k}(\cdot ) \Vert ^{2}&\le 2\int _{0}^{1}( \mathbf{B}^{\top }(u_{k}){\widehat{\varvec{\gamma }}}_{k}-\mathbf{B}^{\top }(u_{k})\varvec{\gamma }_{0k} )^{2}du_{k}+2\int _{0}^{1}R_{k}(u_{k})^{2}du_{k}\\&=O_p(L/n)+O_p(L^{-2r}). \end{aligned}$$

This completes the proof. $\square $
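The bound $O_p(L/n)+O_p(L^{-2r})$ trades a variance-type term against a squared approximation error. As a worked illustration, assume the number of basis functions grows as $L\asymp n^{1/(2r+1)}$; this is the standard balancing choice, and the excerpt does not state the paper's exact condition on L. Then

$$\begin{aligned} \frac{L}{n}\asymp n^{\frac{1}{2r+1}-1}=n^{-\frac{2r}{2r+1}}, \qquad L^{-2r}\asymp n^{-\frac{2r}{2r+1}}, \qquad \mathrm {so} \quad \Vert {{\widehat{\alpha }}}_{k}(\cdot )-\alpha _{0k}(\cdot ) \Vert ^{2}=O_p\left( n^{-\frac{2r}{2r+1}}\right) . \end{aligned}$$

For $r=2$, the smallest smoothness allowed in Lemma 1, this gives the familiar rate $n^{-4/5}$ for the squared error.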

Proof of Theorem 3

Note that

$$\begin{aligned} \mathbf{D}^{\top }(u_0)\mathbf{W}(u_0)\mathbf{D}(u_0) =\frac{1}{n}\sum _{i=1}^{n}K_h(u_{ik}-u_0) \begin{pmatrix} 1\\ u_{ik}-u_0 \end{pmatrix} \begin{pmatrix} 1&u_{ik}-u_0 \end{pmatrix} \end{aligned}$$

which may be singular in some cases and hence non-invertible. A common way to solve this issue is to insert an identity matrix $\mathbf{I}_{2\times 2}=\mathbf{G}_n^{-1}\mathbf{G}_n$ with $\mathbf{G}_n=\mathrm {diag}(1, h^{-2})$, so that

$$\begin{aligned} \begin{aligned}&(\breve{\alpha }_k(u_0),\breve{\alpha }'_k(u_0))^\top \\&\quad = \left[ \frac{1}{n}\sum _{i=1}^{n}K_h(u_{ik}-u_0) \begin{pmatrix} 1\\ (u_{ik}-u_0)/h^2 \end{pmatrix} \right. \\&\left. \qquad \begin{pmatrix} 1&u_{ik}-u_0 \end{pmatrix} \right] ^{-1} \frac{1}{n}\sum _{i=1}^{n}K_h(u_{ik}-u_0) \begin{pmatrix} 1\\ (u_{ik}-u_0)/h^2 \end{pmatrix}\\&\qquad \times \left[ g(\mathbf{x}_i,\varvec{\beta })+\alpha _k(u_{ik}) +\underset{k'\ne k}{\sum } \alpha _{k'}(u_{ik'})+\varepsilon _i-g(\mathbf{x}_i,{{\widehat{\varvec{\beta }}}}) -\underset{k'\ne k}{\sum } {{\widehat{\alpha }}}_{k'}(u_{ik'}) \right]. \end{aligned} \end{aligned}$$

Taking the Taylor expansion leads to

$$\begin{aligned} \alpha _k(u_{ik})&=\begin{pmatrix} 1&u_{ik}-u_0\end{pmatrix}\begin{pmatrix} \alpha _k(u_0)\\ \alpha _k'(u_0)\end{pmatrix}+\frac{1}{2}\alpha _k''(u_0)(u_{ik}-u_0)^2+R_m(u_{ik},u_0) \end{aligned}$$

where $R_m(u_{ik},u_0)$ is the remainder, consequently,

$$\begin{aligned} \begin{aligned}&(\breve{\alpha }_k(u_0),\breve{\alpha }'_k(u_0))^\top -(\alpha _k(u_0),\alpha '_k(u_0))^\top \\&\quad =\left[\frac{1}{n}\sum _{i=1}^{n}K_h(u_{ik}-u_0) \begin{pmatrix} 1 &{} u_{ik}-u_0\\ (u_{ik}-u_0)/h^2 &{} (u_{ik}-u_0)^2/h^2 \end{pmatrix} \right]^{-1}\\&\qquad \times \left\lbrace \frac{1}{n}\sum _{i=1}^{n}K_h(u_{ik}-u_0) \begin{pmatrix} 1\\ (u_{ik}-u_0)/h^2 \end{pmatrix} \right. \\&\quad \qquad \left. \left[ \frac{1}{2}\alpha _k''(u_0)(u_{ik}-u_0)^2+R_m(u_{ik},u_0)+g(\mathbf{x}_i,\varvec{\beta })-g(\mathbf{x}_i,{{\widehat{\varvec{\beta }}}}) \right. \right.\\&\left.\left. \qquad +\underset{k'\ne k}{\sum }( \alpha _{k'}(u_{ik'})-{{\widehat{\alpha }}}_{k'}(u_{ik'}) )+\varepsilon _i \right] \right\rbrace \\&{\mathop {=}\limits ^\mathrm{def}}[A_0]^{-1}( A_1+A_2+A_3+A_4 )+(s.o) \end{aligned} \end{aligned}$$

where

$$\begin{aligned} A_0= & {} \frac{1}{n}\sum _{i=1}^{n}K_h(u_{ik}-u_0) \begin{pmatrix} 1 &{} u_{ik}-u_0\\ (u_{ik}-u_0)/h^2 &{} (u_{ik}-u_0)^2/h^2 \end{pmatrix}, \\ A_1= & {} \frac{1}{2n}\sum _{i=1}^{n}K_h(u_{ik}-u_0) \begin{pmatrix} 1\\ (u_{ik}-u_0)/h^2 \end{pmatrix} \alpha _k''(u_0)(u_{ik}-u_0)^2, \\ A_2= & {} \frac{1}{n}\sum _{i=1}^{n}K_h(u_{ik}-u_0) \begin{pmatrix} 1\\ (u_{ik}-u_0)/h^2 \end{pmatrix} (g(\mathbf{x}_i,\varvec{\beta })-g(\mathbf{x}_i,{{\widehat{\varvec{\beta }}}})) ,\\ A_3= & {} \frac{1}{n}\sum _{i=1}^{n}K_h(u_{ik}-u_0) \begin{pmatrix} 1\\ (u_{ik}-u_0)/h^2 \end{pmatrix} \underset{k'\ne k}{\sum } ( \alpha _{k'}(u_{ik'})-{{\widehat{\alpha }}}_{k'}(u_{ik'})), \\ A_4= & {} \frac{1}{n}\sum _{i=1}^{n}K_h(u_{ik}-u_0) \begin{pmatrix}1\\ (u_{ik}-u_0)/h^2\end{pmatrix}\varepsilon _i, \end{aligned}$$

and the term (s.o) is

$$\begin{aligned}{}[ A_0 ]^{-1}\frac{1}{n}\sum _{i=1}^{n}K_h(u_{ik}-u_0) \begin{pmatrix} 1\\ (u_{ik}-u_0)/h^2 \end{pmatrix} R_m(u_{ik},u_0), \end{aligned}$$

whose order is much smaller than that of $[ A_0 ]^{-1}A_1$. Thus

$$\begin{aligned} \sqrt{nh}\mathbf{H}_n\left[ \begin{pmatrix} \breve{\alpha }_k(u_0)\\ \breve{\alpha }'_k(u_0) \end{pmatrix} - \begin{pmatrix} \alpha _k(u_0)\\ \alpha _k'(u_0) \end{pmatrix} \right] =\sqrt{nh}\mathbf{H}_n[ A_0]^{-1}\{ A_1+A_2+A_3+A_4 \}+(s.o). \end{aligned}$$

For ease of notation, write

$$\begin{aligned} \mathbf{R}=\mathrm {diag}(\mathbf{Q}^{-1})= \begin{pmatrix}1/f(u_0)&{} 0\\ 0 &{} 1/(\kappa _2f(u_0)) \end{pmatrix} \quad \mathrm {and} \quad \mathbf{V}=\begin{pmatrix} \zeta _0 \sigma ^2f(u_0) &{} 0\\ 0 &{} \zeta _2\sigma ^2f(u_0) \end{pmatrix}. \end{aligned}$$

Combining Lemma 2 with Lemmas 2.1–2.3 in Li and Racine (2007), we get

$$\begin{aligned} \sqrt{nh}\mathbf{H}_n[ A_0 ]^{-1}\{ A_1+A_2+A_3+A_4 \}= & {} \sqrt{nh}\mathbf{H}_n\mathbf{Q}^{-1} \{ A_1+A_2+A_3+A_4 \}+o_p(1) \\ \sqrt{nh}\mathbf{H}_n\mathbf{Q}^{-1}\{ A_1+A_4 \}= & {} \mathbf{R}\sqrt{nh}\mathbf{H}_n\{ A_1+A_4 \}+o_p(1) \\ \sqrt{nh}\mathbf{H}_nA_1= & {} \begin{pmatrix} \sqrt{nh}(\kappa _2/2)f(u_0)h^2\alpha ''_k(u_0)\\ 0 \end{pmatrix} \\&+o_p(1) \, \mathrm { and} \, \sqrt{nh}\mathbf{H}_n A_4 \rightarrow _d N(0,\mathbf{V}) \end{aligned}$$

based on $g(\mathbf{x}_i,\varvec{\beta })-g(\mathbf{x}_i,{{\widehat{\varvec{\beta }}}})=g'(\mathbf{x}_i,{{\widehat{\varvec{\beta }}}})(\varvec{\beta }-{{\widehat{\varvec{\beta }}}}) +o_p((\varvec{\beta }-{{\widehat{\varvec{\beta }}}})^2)$ and $\Vert {{\widehat{\varvec{\beta }}}}-\varvec{\beta }\Vert =O_p(n^{-1/2})$. Then, $\sqrt{nh}\mathbf{H}_n\mathbf{Q}^{-1}A_2=o_p(1)$ and $\sqrt{nh}\mathbf{H}_n\mathbf{Q}^{-1}A_3=o_p(1)$. The result follows directly, and when $\mathbf{D}^{\top }(u_0)\mathbf{W}(u_0)\mathbf{D}(u_0)$ is nonsingular, the proof can be derived similarly. $\square $
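The local linear step analysed in this proof can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes an Epanechnikov kernel (the excerpt does not name the kernel), and the input `y` stands for the pseudo-response $y_i-g(\mathbf{x}_i,{{\widehat{\varvec{\beta }}}})-\sum _{k'\ne k}{{\widehat{\alpha }}}_{k'}(u_{ik'})$ from the first stage. The function names are hypothetical.

```python
def epanechnikov(t):
    """Epanechnikov kernel on [-1, 1] (an assumed choice of K)."""
    return 0.75 * (1.0 - t * t) if abs(t) <= 1.0 else 0.0

def local_linear(u, y, u0, h, kernel=epanechnikov):
    """Kernel-weighted least squares fit of y on (1, u - u0) near u0.

    Returns (level, slope), the estimates of alpha_k(u0) and alpha_k'(u0).
    """
    s0 = s1 = s2 = t0 = t1 = 0.0
    for ui, yi in zip(u, y):
        d = ui - u0
        w = kernel(d / h) / h          # K_h(d) = K(d / h) / h
        s0 += w
        s1 += w * d
        s2 += w * d * d
        t0 += w * yi
        t1 += w * d * yi
    # Solve the 2x2 normal equations [[s0, s1], [s1, s2]] (a, b)^T = (t0, t1)^T.
    det = s0 * s2 - s1 * s1
    if det <= 0.0:
        raise ValueError("weighted design matrix is singular; increase h")
    level = (s2 * t0 - s1 * t1) / det
    slope = (s0 * t1 - s1 * t0) / det
    return level, slope

# Usage on noiseless linear data, which a local linear fit reproduces exactly.
grid = [i / 50 for i in range(51)]
resp = [2.0 + 3.0 * ui for ui in grid]
print(local_linear(grid, resp, u0=0.5, h=0.2))
```

The explicit singularity check mirrors the remark in the proof that $\mathbf{D}^{\top }(u_0)\mathbf{W}(u_0)\mathbf{D}(u_0)$ may fail to be invertible, for instance when too few observations fall inside the kernel window.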

Proof of Theorem 4

For ease of notation, let $\Vert g\Vert _\infty =\sup _{x\in [0,1]}|g(x)|$ for a function g(x), and, for a matrix $\mathbf{A}(x)=(a_{ij}(x))_{p\times p}$, let $\Vert \mathbf{A}\Vert _\infty =( \sum _{i=1}^{p}\sum _{j=1}^{p}\Vert a_{ij} \Vert _\infty ^2 )^{1/2}$. A calculation similar to that in the proof of Theorem 3 leads to

$$\begin{aligned} \breve{\alpha }_k(u_0)-\alpha _k(u_0)-b(u_0)&=(\mathbf{D}^{\top }(u_0)\mathbf{W}(u_0)\mathbf{D}(u_0))^{-1}\mathbf{D}^{\top }(u_0)\mathbf{W}(u_0){\varvec{\varepsilon }}+o_p(1)\\&{\mathop {=}\limits ^\mathrm{def}}I_1(u_0)+o_p(1). \end{aligned}$$

The use of Lemma 3 results in

$$\begin{aligned} n\mathbf{H}_n(\mathbf{D}^{\top }(u_0)\mathbf{W}(u_0)\mathbf{D}(u_0))^{-1}\mathbf{H}_n=f^{-1}(u_0){\varvec{\Omega }}^{-1} +O_p(h+(\log n/nh)^{1/2}) \end{aligned}$$

(6.20)

with ${\varvec{\Omega }}=\begin{pmatrix}1&{}0\\ 0&{}\kappa _2\end{pmatrix}$ and

$$\begin{aligned} \Vert \frac{1}{n} \mathbf{H}_n^{-1}\mathbf{D}^{\top }(u_0)\mathbf{W}(u_0){\varvec{\varepsilon }}\Vert _{\infty } =O_p((\log n/nh)^{1/2}), \end{aligned}$$

(6.21)

consequently,

$$\begin{aligned}&\Vert I_1(u_0)-\frac{1}{nf(u_0)}(1,0){\varvec{\Omega }}^{-1}\mathbf{H}_n^{-1}\mathbf{D}^{\top }(u_0)\mathbf{W}(u_0){\varvec{\varepsilon }}\Vert _\infty \\&\quad =O_p(h(\log n/nh)^{1/2}+(\log n/nh)). \end{aligned}$$

Furthermore, we denote

$$\begin{aligned} I_2(u_0)&=\frac{1}{nf(u_0)}(1,0){\varvec{\Omega }}^{-1}\mathbf{H}_n^{-1}\mathbf{D}^{\top }(u_0)\mathbf{W}(u_0){\varvec{\varepsilon }}=\frac{1}{nf(u_0)}\sum _{i=1}^{n}K_h(u_{ik}-u_0)\varepsilon _i \end{aligned}$$

and apply Theorem 1 and Lemma 1 in Fan and Zhang (2000) to obtain, for $h=n^{-\rho }$ with $1/5\le \rho \le 1/3$,

$$\begin{aligned} \lim _{n\rightarrow \infty }P\{ (-2\log h)^{1/2}( \Vert ({nh}/{\sigma ^2_{\alpha }})^{1/2}I_2(u_0) \Vert _\infty -d_n )<z \}=\exp (-2\exp (-z)). \end{aligned}$$

Thus, the proof is completed. $\square $
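The limiting law $\exp (-2\exp (-z))$ can be inverted in closed form, which is how a band of asymptotic level $1-\alpha $ is typically calibrated. The sketch below is illustrative only: the centering constant $d_n$ is defined in the main text rather than in this excerpt, so it is taken as an input, as is the variance $\sigma ^2_{\alpha }$; the function names are hypothetical, and the bias correction $b(u_0)$ is left out.

```python
import math

def gumbel_type_quantile(alpha):
    """Solve exp(-2 * exp(-z)) = 1 - alpha for z."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie in (0, 1)")
    return -math.log(-0.5 * math.log(1.0 - alpha))

def band_half_width(alpha, n, h, sigma2_alpha, d_n):
    """Half-width (d_n + z / sqrt(-2 log h)) * sqrt(sigma2_alpha / (n h)).

    Requires 0 < h < 1 so that -2 * log(h) is positive.
    """
    if not 0.0 < h < 1.0:
        raise ValueError("h must lie in (0, 1)")
    z = gumbel_type_quantile(alpha)
    multiplier = d_n + z / math.sqrt(-2.0 * math.log(h))
    return multiplier * math.sqrt(sigma2_alpha / (n * h))

z = gumbel_type_quantile(0.05)
print(z, math.exp(-2.0 * math.exp(-z)))  # second value recovers 1 - alpha
```

The half-width follows from rearranging the event inside the limit: $\Vert (nh/\sigma ^2_{\alpha })^{1/2}I_2(u_0)\Vert _\infty <d_n+z(-2\log h)^{-1/2}$ holds with probability tending to $\exp (-2\exp (-z))$.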

Proof of Theorem 5

It suffices to show the convergence rates of the estimated bias and variance of $\breve{\alpha }_k(u_0)$. First, (6.20) and the proof of Theorem 4 yield

$$\begin{aligned} \Vert {\widehat{bias}}(\breve{\alpha }_k(u_0)|\mathcal {D})-b(u_0) \Vert _\infty =O_p(h^2(\sqrt{\log n/nh_{*}^5}))=O_p(h^2(n^{-1/7}\log ^{1/2}n)) \end{aligned}$$

(6.22)

with $h_{*}=O(n^{-1/7})$. Besides, Lemma 3 and arguments similar to those for (6.21) lead to

$$\begin{aligned} \Vert \frac{h}{n}\mathbf{H}_n^{-1}(\mathbf{D}^{\top }(u_0)\mathbf{W}(u_0)\mathbf{W}(u_0)\mathbf{D}(u_0))\mathbf{H}_n^{-1} -f(u_0){\varvec{\Lambda }}\Vert _\infty =o_p(1) \end{aligned}$$

with ${\varvec{\Lambda }}=\begin{pmatrix}\kappa _0&{}0\\ 0&{}\kappa _2 \end{pmatrix}$. Moreover, $\Vert {\widehat{\sigma }}^2-\sigma ^2 \Vert _\infty =o_p(1)$ is easy to derive. Thus, the results above together with Theorem 2 indicate that, for each $u_0 \in [0,1]$,

$$\begin{aligned} \Vert nh\widehat{\mathrm {var}}\{ \breve{\alpha }_k(u_0)|\mathcal {D} \} -\sigma ^2_{\alpha } \Vert _\infty =o_p(1). \end{aligned}$$

(6.23)

Finally, the combination of (6.22), (6.23) and Theorem 4 shows the conclusion. $\square $

About this article

Cite this article

Li, R., Zhang, Y. Two-stage estimation and simultaneous confidence band in partially nonlinear additive model. Metrika 84, 1109–1140 (2021). https://doi.org/10.1007/s00184-021-00808-3

Received: 18 July 2020
Accepted: 11 January 2021
Published: 02 February 2021
Issue Date: November 2021
DOI: https://doi.org/10.1007/s00184-021-00808-3
