Abstract
In this paper, we study estimation and inference in the partially nonlinear additive model, on which, to the best of our knowledge, little research has been conducted. By integrating spline approximation and local smoothing, we propose a two-stage estimation approach in which the profile nonlinear least squares method is used to estimate the parameters and the additive functions. Under some regularity conditions, we establish the asymptotic normality of the parametric estimators and show that the fitted functions achieve the optimal nonparametric convergence rate. Furthermore, a spline-backfitted local linear estimator is proposed for the additive functions, and its asymptotic distribution is also established. To make inference on the nonparametric functions as a whole, we construct theoretical simultaneous confidence bands and, because of their heavy computational burden in implementation, further propose an empirical bootstrap-based confidence band. Finally, both Monte Carlo simulations and a real data analysis show the good performance of the proposed methods.
References
Bates DM, Watts DG (1988) Nonlinear regression analysis and its applications. Wiley, New York
Biedermann S, Dette H, Woods DC (2011) Optimal design for additive partially nonlinear models. Biometrika 98(2):449–458
Bowman AW (1984) An alternative method of cross-validation for the smoothing of density estimates. Biometrika 71(2):353–360
Breiman L, Friedman JH (1985) Estimating optimal transformations for multiple regression and correlation. J Am Stat Assoc 80(391):580–598
Cai Z, Xu X (2008) Nonparametric quantile estimations for dynamic smooth coefficient models. J Am Stat Assoc 103(484):1595–1608
Claeskens G, Van Keilegom I (2003) Bootstrap confidence bands for regression curves and their derivatives. Ann Stat 31(6):1852–1884
Currie DJ (1982) Estimating Michaelis-Menten parameters: bias, variance and experimental design. Biometrics 38(4):907–919
De Boor C (2001) A practical guide to splines. Appl Math Sci
Donthi R, Prasad SV, Mahaboob B, Praveen JP, Venkateswarlu B (2019) Estimation methods of nonlinear regression models. In: AIP conference proceedings, 2177(1), 020081. AIP Publishing
Fan J, Gijbels I (1996) Local polynomial modelling and its applications. Chapman & Hall, London
Fan J, Härdle W, Mammen E et al (1998) Direct estimation of low-dimensional components in additive models. Ann Stat 26(3):943–971
Fan J, Zhang W (2000) Simultaneous confidence bands and hypothesis testing in varying-coefficient models. Scand J Stat 27(4):715–731
Härdle W, Liang H, Gao J (2012) Partially linear models. Springer Science & Business Media
Härdle W, Sperlich S, Spokoiny V (2001) Structural tests in additive regression. J Am Stat Assoc 96(456):1333–1347
Harrison D Jr, Rubinfeld DL (1978) Hedonic housing prices and the demand for clean air. J Environ Econ Manage 5(1):81–102
Hart JD, Wehrly TE (1993) Consistency of cross-validation when the data are curves. Stochast Process Appl 45(2):351–361
Huang L-S, Yu C-H (2019) Classical backfitting for smooth-backfitting additive models. J Comput Graph Stat 28(2):386–400
Imhof LA et al (2001) Maximin designs for exponential growth models and heteroscedastic polynomial models. Ann Stat 29(2):561–576
Jiang Y, Tian G-L, Fei Y (2019) A robust and efficient estimation method for partially nonlinear models via a new MM algorithm. Stat Pap 60(6):2063–2085
Kong E, Xia Y (2012) A single-index quantile regression model and its estimation. Econom Theory 28(4):730–768
Li G, Peng H, Tong T (2013) Simultaneous confidence band for nonparametric fixed effects panel data models. Econ Lett 119(3):229–232
Li Q (2000) Efficient estimation of additive partially linear models. Int Econ Rev 41(4):1073–1092
Li Q, Racine JS (2007) Nonparametric econometrics: theory and practice. Princeton University Press, Princeton
Li R, Nie L (2007) A new estimation procedure for a partially nonlinear model via a mixed-effects approach. Can J Stat 35(3):399–411
Li R, Nie L (2008) Efficient statistical inference procedures for partially nonlinear models and their applications. Biometrics 64(3):904–911
Li Y, Ruppert D (2008) On the asymptotics of penalized splines. Biometrika 95(2):415–436
Liang H, Thurston SW, Ruppert D, Apanasovich T, Hauser R (2008) Additive partial linear models with measurement errors. Biometrika 95(3):667–678
Liu X, Wang L, Liang H (2011) Estimation and variable selection for semiparametric additive partial linear models. Statistica Sinica 21(3):1225–1248
Ma S, Lian H, Liang H, Carroll R (2017) SiAM: a hybrid of single index models and additive models. Electron J Stat 11(1):2397–2423
Ma S, Yang L (2011) Spline-backfitted kernel smoothing of partially linear additive model. J Stat Plan Inf 141(1):204–219
Mammen E, Linton O, Nielsen JP (1999) The existence and asymptotic properties of a backfitting projection algorithm under weak conditions. Ann Stat 27(5):1443–1490
Manzan S, Zerom D (2005) Kernel estimation of a partially linear additive model. Stat Probab Lett 72(4):313–322
Nielsen JP, Sperlich S (2005) Smooth backfitting in practice. J Roy Stat Soc B 67(1):43–61
Riazoshams H, Midi H, Ghilagaber G (2018) Robust nonlinear regression: with applications using R. Wiley, Hoboken
Rice JA, Silverman BW (1991) Estimating the mean and covariance structure nonparametrically when the data are curves. J Roy Stat Soc B 53(1):233–243
Rudemo M (1982) Empirical choice of histograms and kernel density estimators. Scand J Stat 9(2):65–78
Schumaker LL (1981) Spline functions: basic theory. Wiley, New York
Seber GA, Wild CJ (2003) Nonlinear regression. Wiley-Interscience, Hoboken
Severini TA, Wong WH et al (1992) Profile likelihood and conditionally parametric models. Ann Stat 20(4):1768–1802
Silverman BW (1986) Density estimation for statistics and data analysis. Chapman and Hall, London
Song L, Zhao Y, Wang X (2010) Sieve least squares estimation for partially nonlinear models. Stat Probab Lett 80(17–18):1271–1283
Stone CJ et al (1984) An asymptotically optimal window selection rule for kernel density estimates. Ann Stat 12(4):1285–1297
Su L, Ullah A (2006) Profile likelihood estimation of partially linear panel data models with fixed effects. Econ Lett 92(1):75–81
Tjøstheim D, Auestad BH (1994) Nonparametric identification of nonlinear time series: projections. J Am Stat Assoc 89(428):1398–1409
Wang J, Yang L (2009) Efficient and fast spline-backfitted kernel smoothing of additive models. Ann Inst Stat Math 61(3):663–690
Wang Z, Xue L, Liu J (2019) Checking nonparametric component for partially nonlinear model with missing response. Stat Probab Lett 149:1–8
Wu TZ, Yu K, Yu Y (2010) Single-index quantile regression. J Multivar Anal 101(7):1607–1621
Xiao Y, Tian Z, Li F (2014) Empirical likelihood-based inference for parameter and nonparametric function in partially nonlinear models. J Korean Stat Soc 43(3):367–379
Xie H, Huang J et al (2009) SCAD-penalized regression in high-dimensional partially linear models. Ann Stat 37(2):673–696
Yang L, Park BU, Xue L, Härdle W (2006) Estimation and testing for varying coefficients in additive models with marginal integration. J Am Stat Assoc 101(475):1212–1227
Yang L, Sperlich S, Härdle W (2003) Derivative estimation and testing in generalized additive models. J Stat Plan Inf 115(2):521–542
Yu K, Lu Z (2004) Local linear additive quantile regression. Scand J Stat 31(3):333–346
Zhang HH, Cheng G, Liu Y (2011) Linear or nonlinear? Automatic structure discovery for partially linear models. J Am Stat Assoc 106(495):1099–1112
Zhang Y, Lian H, Yu Y (2017) Estimation and variable selection for quantile partially linear single-index models. J Multivar Anal 162:215–234
Zhou S, Shen X, Wolfe D (1998) Local asymptotics for regression splines and confidence regions. Ann Stat 26(5):1760–1782
Zhou X, Zhao P, Liu Z (2016) Estimation and inference for additive partially nonlinear models. J Korean Stat Soc 45(4):491–504
Acknowledgements
Li’s research was supported by a grant from the National Social Science Fund of China (No. 17BTJ025) and by the Open Research Fund of the Key Laboratory of Advanced Theory and Application in Statistics and Data Science (East China Normal University), Ministry of Education (No. KLATASDS1802).
Ethics declarations
Conflict of interest
The authors state that there is no conflict of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendix
To start with, we review some properties of B-spline functions. Let \(\mathbf{B}(u)=(B_1(u),\ldots ,B_L(u))^\top \) be the B-spline basis over [0, 1]; then \(B_l(u)\ge 0\) and \(\sum _{l=1}^{L}B_l(u)=\sqrt{L}\) for each \(u\in [0,1]\). Moreover, for any vector \(\varvec{\theta }=(\theta _1,\ldots ,\theta _L)^\top \) and constants \(0<C_1<C_2\),
Lemma 1
If a function \(\phi (u)\) defined on the support \(\mathcal {U}\) has r-order continuous derivatives with \(r\ge 2\), there exists \(\varvec{\gamma }=(\gamma _1,\ldots ,\gamma _L)^\top \) such that
Proof
Lemma 1 follows directly from Theorem XII.1 in De Boor (2001).
\(\square \)
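As a numerical illustration of the B-spline properties reviewed above and of the approximation rate \(O(L^{-r})\) used in the proof of Theorem 1, the following Python sketch evaluates a B-spline basis by the Cox-de Boor recursion and fits a smooth function by least squares for increasing L. Several choices here are assumptions made only for illustration and are not taken from the paper: the cubic degree, the equally spaced knots, the test function \(\sin (2\pi u)\), the reading of the normalization \(\sum _{l}B_l(u)=\sqrt{L}\) as the standard basis multiplied by \(\sqrt{L}\), and all helper names.

```python
import math

def bspline_basis(u, n_basis, degree=3, scaled=True):
    """Evaluate n_basis B-splines at u in [0, 1] via the Cox-de Boor recursion."""
    n_interior = n_basis - degree - 1
    interior = [(j + 1) / (n_interior + 1) for j in range(n_interior)]
    knots = [0.0] * (degree + 1) + interior + [1.0] * (degree + 1)
    # Degree-0 splines are indicators of the knot intervals.
    vals = [1.0 if knots[j] <= u < knots[j + 1] else 0.0
            for j in range(len(knots) - 1)]
    if u >= 1.0:
        vals[n_basis - 1] = 1.0  # close the last non-empty interval at u = 1
    for d in range(1, degree + 1):
        new = []
        for j in range(len(vals) - 1):
            ld = knots[j + d] - knots[j]
            rd = knots[j + d + 1] - knots[j + 1]
            left = (u - knots[j]) / ld * vals[j] if ld > 0 else 0.0
            right = (knots[j + d + 1] - u) / rd * vals[j + 1] if rd > 0 else 0.0
            new.append(left + right)
        vals = new
    # Assumed normalization: standard basis times sqrt(L), so the sum is sqrt(L).
    scale = math.sqrt(n_basis) if scaled else 1.0
    return [scale * v for v in vals]

def residual_after_projection(columns, y, tol=1e-10):
    """Least-squares residual of y on the given columns (modified Gram-Schmidt)."""
    basis = []
    for col in columns:
        v = list(col)
        for q in basis:
            c = sum(a * b for a, b in zip(q, v))
            v = [a - c * b for a, b in zip(v, q)]
        norm = math.sqrt(sum(a * a for a in v))
        if norm > tol:
            basis.append([a / norm for a in v])
    r = list(y)
    for q in basis:
        c = sum(a * b for a, b in zip(q, r))
        r = [a - c * b for a, b in zip(r, q)]
    return r

def sup_error(n_basis, f, n_grid=201):
    """Largest absolute residual of the spline fit to f on an equally spaced grid."""
    grid = [i / (n_grid - 1) for i in range(n_grid)]
    rows = [bspline_basis(u, n_basis) for u in grid]
    columns = [[row[l] for row in rows] for l in range(n_basis)]
    resid = residual_after_projection(columns, [f(u) for u in grid])
    return max(abs(r) for r in resid)

def smooth_target(u):
    return math.sin(2 * math.pi * u)  # hypothetical smooth additive component

values = bspline_basis(0.37, 8)
errors = [sup_error(L, smooth_target) for L in (7, 11, 19)]
print(sum(values), errors)
```

The printed sum should agree with \(\sqrt{8}\) up to rounding, and the fitting errors should shrink quickly as L grows, in line with a rate that improves with the smoothness order r.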
Lemma 2
If the conditions (B1)–(B2) are satisfied, it holds
where
moreover, it follows that
Proof
This result is shown in Exercise 2.7 of Li and Racine (2007), so we omit the proof. \(\square \)
Lemma 3
Suppose the conditions (B1)–(B2) hold and denote
with \(\mathbf{D}_i(u_0)\) being the ith column of \(\mathbf{D}^{\top }(u_0)\), we get
\((1)\, m(u_{ik},u_0)=n^{-1}K_h(u_{ik}-u_0)f^{-1}(u_0)\{1+o_p(1)\};\)
\((2)\, \lim _{n\rightarrow \infty }P_n\{\underset{u_0\in [0,1]}{\sup }\underset{1\le i\le n}{\max }| m(u_{ik},u_0)|\le C(nh)^{-1}\}=1\).
Proof
The conclusions follow from Lemma 4.1 in Su and Ullah (2006). \(\square \)
Proof of Theorem 1
Let \(\varvec{\gamma }_{0}=(\varvec{\gamma }^\top _{01},\ldots ,\varvec{\gamma }^\top _{0p})^{\top }\) be the true spline coefficient of \(\alpha _{0k}(\cdot )\) for \(k=1,2,\ldots ,p\), and denote \(R_{k}(u_{k})=\alpha _{0k}(u_{k})-\mathbf{B}^{\top }(u_{k})\varvec{\gamma }_{0k}\),
By assumptions (A3)–(A5) and Corollary 6.21 in Schumaker (1981), we get \(\Vert \mathbf{B}(u)\Vert =O(\sqrt{K})\) and \(\Vert R_{k}(u)\Vert =O(L^{-r})\), with r defined in condition (A4).
To prove the \(\sqrt{n}\)-consistency of \({{\widehat{\varvec{\beta }}}}\), it suffices to show that for any \(\zeta >0\), there exists a sufficiently large constant \(C>0\) such that
where \(\mathbf{v}\) is an m-dimensional constant vector. Let \(Q'(\cdot )\) and \(Q''(\cdot )\) be the first- and second-order derivatives of \(Q(\cdot )\), respectively. Taking the Taylor expansion of \(Q(\cdot )\) at \(\varvec{\beta }_0\), we get
where \(\varvec{\beta }^{*}\) lies between \(\varvec{\beta }_0\) and \(\varvec{\beta }_0+n^{-1/2}\mathbf{v}\). Let \(E_{\mathcal {A}}\mathbf{g}'(\mathbf{X}, \varvec{\beta }_0)\) be the projection of \(\mathbf{g}'(\mathbf{X}, \varvec{\beta }_0)\) onto the function class \(\mathcal {A}\); then there exists a vector \(\varvec{\gamma }^*\) such that \(E_{\mathcal {A}}\mathbf{g}'(\mathbf{X}, \varvec{\beta }_0)-\mathbf{Z}\varvec{\gamma }^*=O(L^{-r})\). Moreover, \(\mathbf{g}'(\mathbf{X}, \varvec{\beta }_0)-E_{\mathcal {A}}\mathbf{g}'(\mathbf{X}, \varvec{\beta }_0)\) and \(E_{\mathcal {A}}\mathbf{g}'(\mathbf{X}, \varvec{\beta }_0)\) are orthogonal, and \(\Vert [\mathbf{g}'(\mathbf{X}, \varvec{\beta }_0)]^{\top }(\mathbf{I}-\mathbf{M}_\mathbf{z})\Vert =\Vert [\mathbf{g}'(\mathbf{X}, \varvec{\beta }_0)-E_\mathcal {A}\mathbf{g}'(\mathbf{X}, \varvec{\beta }_0)]^{\top }(\mathbf{I}-\mathbf{M}_\mathbf{z})\Vert =O_p(K/\sqrt{n}+L^{-r})\). Thus, some calculations based on the Cauchy-Schwarz inequality show that
in which \((\mathbf{I}-\mathbf{M}_\mathbf{z})\mathbf{Z}=\mathbf{P}\mathbf{Z}=\mathbf{0}\) with \(\mathbf{M}_\mathbf{z}=\mathbf{Z}(\mathbf{Z}^{\top }\mathbf{Z})^{-1}\mathbf{Z}^{\top }\). Since \(\mathrm {E}(\varepsilon )=0\) and \(\mathrm {var}(\varepsilon )=\sigma ^{2}\), we have
which, together with the Lindeberg-Lévy central limit theorem, leads to
where \({\varvec{\Sigma }}_{\varvec{\beta }}=\mathrm {E}[\{g'(\mathbf{x}_1, \varvec{\beta })-E_{\mathcal {A}}g'(\mathbf{x}_1, \varvec{\beta })\}^{\otimes 2}|\mathbf{X},\mathbf{U}]\). Consequently,
which, combined with assumptions (A1)–(A2), leads to
Similarly, some calculations with \(\mathbf{g}(\mathbf{X},\varvec{\beta }_0)-\mathbf{g}(\mathbf{X}, \varvec{\beta }^{*})=O_{p}(n^{-1/2}\Vert \mathbf{v}\Vert )\) show that
which, together with (6.16) and the idempotence of \(\mathbf{I}-\mathbf{M}_\mathbf{z}\), results in
Therefore, the combination of (6.14), (6.17) and (6.18) indicates that
on the basis of the fact that \(n^{-1/2}Q'(\varvec{\beta }_0)^{\top }\mathbf{v}\) is dominated by \(\frac{1}{2n}\mathbf{v}^{\top }Q''(\varvec{\beta }^{*})\mathbf{v}\) uniformly in \(\Vert \mathbf{v}\Vert =C\) for a sufficiently large C. Thus, (6.13) holds, i.e., with probability approaching 1, there exists a local minimizer \({{\widehat{\varvec{\beta }}}}\) such that \(\Vert {{\widehat{\varvec{\beta }}}}-\varvec{\beta }_0 \Vert =O_p(n^{-1/2})\).
Now we prove the asymptotic normality of \({{\widehat{\varvec{\beta }}}}\). Note that \({{\widehat{\varvec{\beta }}}}\) is the solution to \(Q'({{\widehat{\varvec{\gamma }}}}(\varvec{\beta }),\varvec{\beta })\equiv Q'(\varvec{\beta }) =0\). Taking the Taylor expansion of \(Q'(\varvec{\beta })\) at \(\varvec{\beta }_0\), we get
where \({\widetilde{\varvec{\beta }}}\) lies between \(\varvec{\beta }_0\) and \({{\widehat{\varvec{\beta }}}}\). By assumptions (A1)–(A4) and arguments similar to those leading to (6.18), we get
Applying (6.15) and \(\Vert {{\widehat{\varvec{\beta }}}}-\varvec{\beta }_0 \Vert =O_p(n^{-1/2})\), we rewrite (6.19) as
and further
Hence, Slutsky's theorem leads to
and the proof of Theorem 1 is completed. \(\square \)
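To make the profiling step behind Theorem 1 concrete, the sketch below minimizes \(\Vert (\mathbf{I}-\mathbf{M}_\mathbf{z})\{\mathbf{Y}-\mathbf{g}(\mathbf{X},\varvec{\beta })\}\Vert ^2\) over a scalar \(\varvec{\beta }\), where \(\mathbf{Z}\) stacks the B-spline bases of two additive components. This objective is our reading of the profile nonlinear least squares criterion, since the displayed definition of \(Q(\cdot )\) is not reproduced here. The simulated model, the choice \(g(x,\beta )=\exp (\beta x)\), the golden-section search on [0, 2], the use of Gram-Schmidt orthogonalization in place of an explicit \((\mathbf{Z}^{\top }\mathbf{Z})^{-1}\), and all function names are illustrative assumptions.

```python
import math
import random

def bspline_basis(u, n_basis, degree=3):
    """B-spline basis at u in [0, 1] (Cox-de Boor, equally spaced knots)."""
    n_interior = n_basis - degree - 1
    interior = [(j + 1) / (n_interior + 1) for j in range(n_interior)]
    knots = [0.0] * (degree + 1) + interior + [1.0] * (degree + 1)
    vals = [1.0 if knots[j] <= u < knots[j + 1] else 0.0
            for j in range(len(knots) - 1)]
    if u >= 1.0:
        vals[n_basis - 1] = 1.0
    for d in range(1, degree + 1):
        new = []
        for j in range(len(vals) - 1):
            ld = knots[j + d] - knots[j]
            rd = knots[j + d + 1] - knots[j + 1]
            left = (u - knots[j]) / ld * vals[j] if ld > 0 else 0.0
            right = (knots[j + d + 1] - u) / rd * vals[j + 1] if rd > 0 else 0.0
            new.append(left + right)
        vals = new
    return vals

def orthonormalize(columns, tol=1e-10):
    """Modified Gram-Schmidt on the columns of Z; dependent columns are dropped."""
    basis = []
    for col in columns:
        v = list(col)
        for q in basis:
            c = sum(a * b for a, b in zip(q, v))
            v = [a - c * b for a, b in zip(v, q)]
        norm = math.sqrt(sum(a * a for a in v))
        if norm > tol:
            basis.append([a / norm for a in v])
    return basis

def annihilate(basis, y):
    """Return (I - M_z) y, the part of y orthogonal to the column space of Z."""
    r = list(y)
    for q in basis:
        c = sum(a * b for a, b in zip(q, r))
        r = [a - c * b for a, b in zip(r, q)]
    return r

def g(x, beta):
    return math.exp(beta * x)  # hypothetical nonlinear parametric part

def profile_objective(beta, x, y, basis):
    """Squared norm of (I - M_z){Y - g(X, beta)}."""
    partial = [yi - g(xi, beta) for xi, yi in zip(x, y)]
    return sum(v * v for v in annihilate(basis, partial))

def golden_section(f, lo, hi, tol=1e-6):
    """Minimize a unimodal function on [lo, hi]."""
    inv_phi = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    c, d = b - inv_phi * (b - a), a + inv_phi * (b - a)
    fc, fd = f(c), f(d)
    while b - a > tol:
        if fc < fd:
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = f(c)
        else:
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = f(d)
    return (a + b) / 2

random.seed(0)
n, L, beta0 = 200, 8, 1.0
x = [random.random() for _ in range(n)]
u1 = [random.random() for _ in range(n)]
u2 = [random.random() for _ in range(n)]
y = [g(xi, beta0) + math.sin(2 * math.pi * a) + (b - 0.5) ** 2 + random.gauss(0.0, 0.1)
     for xi, a, b in zip(x, u1, u2)]
columns = []
for u in (u1, u2):
    rows = [bspline_basis(v, L) for v in u]
    columns += [[row[l] for row in rows] for l in range(L)]
basis = orthonormalize(columns)
beta_hat = golden_section(lambda b: profile_objective(b, x, y, basis), 0.0, 2.0)
print(beta_hat)
```

Applying `annihilate(basis, columns[0])` returns a vector that is zero up to rounding, which mirrors the identity \((\mathbf{I}-\mathbf{M}_\mathbf{z})\mathbf{Z}=\mathbf{0}\) used in the proof. A vector-valued \(\varvec{\beta }\) would require a multivariate optimizer such as Gauss-Newton instead of the one-dimensional search.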
Proof of Theorem 2
Firstly, we prove the consistency of \({\widehat{\varvec{\gamma }}}\). Note that
Simple calculations show that \(J_1=O_p(\frac{\sqrt{L}}{n})=o_p(n^{-1/2})\), \(J_2=O_p(n^{-1}L^{\frac{1}{2}-r})=o_p(n^{-1})\) and \(J_3=O_p(\sqrt{L/n})\); thus \(J_3\) is the dominant term, which leads to \(\Vert {\widehat{\varvec{\gamma }}} -\varvec{\gamma }_0 \Vert =O_p(\sqrt{L/n})\). Moreover,
This completes the proof. \(\square \)
Proof of Theorem 3
Note that
which may be singular in some cases and hence not invertible. A common way to solve this issue is to insert an identity matrix \(\mathbf{I}_{2\times 2}=\mathbf{G}_n^{-1}\mathbf{G}_n\) with \(\mathbf{G}_n=\mathrm {diag}(1, h^{-2})\); then
Taking the Taylor expansion leads to
where \(R_m(u_{ik},u_0)\) is the remainder, consequently,
where
and the term (s.o) is
whose order is much smaller than that of \([ A_0 ]^{-1}A_1\), thus
For ease of notation, write
Combining Lemma 2 with Lemmas 2.1–2.3 in Li and Racine (2007), we get
based on \(g(\mathbf{x}_i,\varvec{\beta })-g(\mathbf{x}_i,{{\widehat{\varvec{\beta }}}})=g'(\mathbf{x}_i,{{\widehat{\varvec{\beta }}}})(\varvec{\beta }-{{\widehat{\varvec{\beta }}}}) +o_p((\varvec{\beta }-{{\widehat{\varvec{\beta }}}})^2)\) and \(\Vert {{\widehat{\varvec{\beta }}}}-\varvec{\beta }\Vert =O_p(n^{-1/2})\). Then, \(\sqrt{nh}\mathbf{H}_nQ^{-1}A_2=o_p(1)\) and \(\sqrt{nh}\mathbf{H}_nQ^{-1}A_3=o_p(1)\). The result follows directly; when \(\mathbf{D}^{\top }(u_0)\mathbf{W}(u_0)\mathbf{D}(u_0)\) is nonsingular, the proof can be derived similarly. \(\square \)
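The local linear step analysed in Lemma 3 and Theorem 3 can be sketched in a few lines of Python. We assume the usual local linear form, in which \(\mathbf{D}(u_0)\) has rows \((1, u_{ik}-u_0)\) and \(\mathbf{W}(u_0)\) is the diagonal matrix of kernel weights \(K_h(u_{ik}-u_0)\), so that the estimate is the first entry of \(\{\mathbf{D}^{\top }\mathbf{W}\mathbf{D}\}^{-1}\mathbf{D}^{\top }\mathbf{W}\) applied to the pseudo-responses. The Epanechnikov kernel, the bandwidth h = 0.1, the simulated pseudo-responses (which stand in for the responses with the fitted parametric part and the other spline-fitted components removed) and the function names are assumptions for illustration only.

```python
import math
import random

def epanechnikov(t):
    """Epanechnikov kernel, supported on [-1, 1]."""
    return 0.75 * (1.0 - t * t) if abs(t) <= 1.0 else 0.0

def local_linear_weights(u, u0, h):
    """Weights m_i(u0) such that the local linear fit at u0 equals sum_i m_i * y_i."""
    k = [epanechnikov((ui - u0) / h) / h for ui in u]
    s0 = sum(k)
    s1 = sum(ki * (ui - u0) for ki, ui in zip(k, u))
    s2 = sum(ki * (ui - u0) ** 2 for ki, ui in zip(k, u))
    det = s0 * s2 - s1 * s1  # determinant of the 2 x 2 matrix D'WD
    if s0 == 0.0 or det <= 1e-12 * s0 * s2:
        raise ValueError("D'WD is (numerically) singular at u0; enlarge the bandwidth")
    return [ki * (s2 - s1 * (ui - u0)) / det for ki, ui in zip(k, u)]

def local_linear(u, y, u0, h):
    """Local linear estimate of the regression function at u0."""
    weights = local_linear_weights(u, u0, h)
    return sum(wi * yi for wi, yi in zip(weights, y))

random.seed(1)
n, h = 400, 0.1
u = [random.random() for _ in range(n)]
# Hypothetical pseudo-responses: one additive component plus noise.
pseudo = [math.sin(2 * math.pi * ui) + random.gauss(0.0, 0.1) for ui in u]
alpha_hat = local_linear(u, pseudo, 0.25, h)
print(alpha_hat)
```

The weights sum to one and reproduce linear functions exactly, which is the finite-sample counterpart of the equivalent-kernel representation in Lemma 3. The explicit singularity check corresponds to the case discussed at the start of the proof, where \(\mathbf{D}^{\top }(u_0)\mathbf{W}(u_0)\mathbf{D}(u_0)\) may fail to be invertible.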
Proof of Theorem 4
Let \(\Vert g\Vert _\infty =\sup _{x\in [0,1]}|g(x)|\) for a function g(x), and, for a matrix \(\mathbf{A}(x)=(a_{ij}(x))_p\), let \(\Vert \mathbf{A}\Vert _\infty =( \sum _{i=1}^{p}\sum _{j=1}^{p}\Vert a_{ij} \Vert _\infty ^2 )^{1/2}\) for ease of notation. Calculations similar to those in the proof of Theorem 3 lead to
The use of Lemma 3 results in
with \({\varvec{\Omega }}=\begin{pmatrix}1&{}0\\ 0&{}\kappa _2\end{pmatrix}\) and
consequently,
Furthermore, we denote
and apply Theorem 1 and Lemma 1 in Fan and Zhang (2000); for \(h=n^{-\rho }\) with \(1/5\le \rho \le 1/3\),
Thus, the proof is completed. \(\square \)
Proof of Theorem 5
It suffices to show the convergence rates of the estimated bias and variance of \(\breve{\alpha }_k(u_0)\). First, the result (6.20) and the proof of Theorem 4 yield
with \(h_{*}=O(n^{-1/7})\). Besides, Lemma 3 and arguments similar to those for (6.21) lead to
with \({\varvec{\Lambda }}=\begin{pmatrix}\kappa _0&{}0\\ 0&{}\kappa _2 \end{pmatrix}\). Moreover, \(\Vert {\widehat{\sigma }}^2-\sigma ^2 \Vert _\infty =o_p(1)\) is easy to derive. Thus, the results above together with Theorem 2 indicate that, for each \(u_0 \in [0,1]\),
Finally, the combination of (6.22), (6.23) and Theorem 4 shows the conclusion. \(\square \)
Cite this article
Li, R., Zhang, Y. Two-stage estimation and simultaneous confidence band in partially nonlinear additive model. Metrika 84, 1109–1140 (2021). https://doi.org/10.1007/s00184-021-00808-3
DOI: https://doi.org/10.1007/s00184-021-00808-3