Estimation and inference in functional single-index models

  • Published in: Annals of the Institute of Statistical Mathematics

Abstract

We propose a functional single-index model (FSiM) to study the link between a scalar response variable and multiple functional predictors, in which the mean of the response is related to the linear predictors via an unknown link function. The FSiM serves as a good tool for dimension reduction in regression with multiple predictors and it is more flexible than functional linear models. Assuming that the functional predictors are observed at discrete points, we use B-spline basis functions to estimate the slope functions and the link function based on the least-squares criterion, and propose an iterative estimating procedure. Moreover, we provide uniform convergence rates of the proposed spline estimators in the FSiM, and construct asymptotic simultaneous confidence bands for the slope functions for inference. Our proposed method is illustrated by simulation studies and by an analysis of a diffusion tensor imaging data application.


References

  • Bunea, F., Ivanescu, A.E., Wegkamp, M.H. (2011). Adaptive inference for the mean of a Gaussian process in functional data. Journal of the Royal Statistical Society Series B, 531–558.

  • Cai, T., Hall, P. (2006). Prediction in functional linear regression. Annals of Statistics, 34, 2159–2179.

  • Cardot, H., Ferraty, F., Sarda, P. (2003). Spline estimators for the functional linear model. Statistica Sinica, 13, 571–591.

  • Carroll, R. J., Fan, J., Gijbels, I., Wand, M. P. (1997). Generalized partially linear single-index models. Journal of the American Statistical Association, 92, 477–489.

  • Chen, D., Hall, P., Müller, H. G. (2011). Single and multiple index functional regression models with nonparametric link. Annals of Statistics, 39, 1720–1747.

  • Claeskens, G., Krivobokova, T., Opsomer, J. D. (2009). Asymptotic properties of penalized spline estimators. Biometrika, 96, 529–544.

  • Crainiceanu, C. M., Staicu, A.-M., Ray, S., Punjabi, N. (2012). Bootstrap-based inference on the difference in the means of two correlated functional processes. Statistics in Medicine, 31, 3223–3240.

  • Csőrgő, M., Révész, P. (1981). Strong approximations in probability and statistics. New York: Academic Press.

  • de Boor, C. (2001). A practical guide to splines. New York: Springer.

  • Demko, S. (1986). Spectral bounds for \( \Vert A^{-1}\Vert _{\infty }\). Journal of Approximation Theory, 48, 207–212.

  • DeVore, R. A., Lorentz, G. G. (1993). Constructive approximation. Berlin: Springer-Verlag.

  • Gertheiss, J., Maity, A., Staicu, A.-M. (2013). Variable selection in generalized functional linear models. Stat, 2, 86–101.

  • Goldsmith, J., Bobb, J., Crainiceanu, C. M., Caffo, B., Reich, D. (2011a). Penalized functional regression. Journal of Computational and Graphical Statistics, 20, 830–851.

  • Goldsmith, J., Crainiceanu, C. M., Caffo, B., Reich, D. (2011b). Penalized functional regression analysis of white-matter tract profiles in multiple sclerosis. NeuroImage, 57, 431–439.

  • Goldsmith, J., Crainiceanu, C., Caffo, B., Reich, D. (2012). Longitudinal penalized functional regression for cognitive outcomes on neuronal tract measurements. Journal of the Royal Statistical Society Series C, 61, 453–469.

  • Hall, P., Horowitz, J. L. (2007). Methodology and convergence rates for functional linear regression. Annals of Statistics, 35, 70–91.

  • Huang, J. (2003). Local asymptotics for polynomial spline regression. Annals of Statistics, 31, 1600–1635.

  • Huang, J. Z., Wu, C. O., Zhou, L. (2004). Polynomial spline estimation and inference for varying coefficient models with longitudinal data. Statistica Sinica, 14, 763–788.

  • Ichimura, H. (1993). Semiparametric least squares (SLS) and weighted SLS estimation of single index models. Journal of Econometrics, 58, 71–120.

  • Jiang, C. R., Wang, J. L. (2011). Functional single index models for longitudinal data. Annals of Statistics, 39, 362–388.

  • Li, Y., Hsing, T. (2007). On rates of convergence in functional linear regression. Journal of Multivariate Analysis, 98, 1782–1804.

  • Ma, S., Yang, L., Carroll, R. (2012). Simultaneous confidence band for sparse longitudinal regression. Statistica Sinica, 22, 95–122.

  • McLean, M. W., Hooker, G., Staicu, A.-M., Scheipl, F., Ruppert, D. (2014). Functional generalized additive models. Journal of Computational and Graphical Statistics, 23, 249–269.

  • Müller, H. G., Stadtmüller, U. (2005). Generalized functional linear models. Annals of Statistics, 33, 774–805.

  • Ramsay, J., Silverman, B. W. (2005). Functional data analysis. Springer series in statistics (2nd ed.). New York: Springer.

  • Wu, Y. (2010). The partially monotone tensor spline estimation of joint distribution function with bivariate current status data. Technical report, University of Iowa.

  • Xia, Y., Tong, H., Li, W. K., Zhu, L. (2002). An adaptive estimation of dimension reduction space (with discussion). Journal of the Royal Statistical Society Series B, 64, 363–410.

  • Xue, L., Qu, A., Zhou, J. (2010). Consistent model selection for marginal generalized additive model for correlated data. Journal of the American Statistical Association, 105, 1518–1530.

  • Xue, L., Yang, L. (2006). Additive coefficient modeling via polynomial spline. Statistica Sinica, 16, 1423–1446.

  • Yao, F., Müller, H. G., Wang, J. L. (2005). Functional linear regression analysis for longitudinal data. The Annals of Statistics, 33, 2873–2903.

  • Zhou, S., Shen, X., Wolfe, D. A. (1998). Local asymptotics for regression splines and confidence regions. The Annals of Statistics, 26, 1760–1782.

Acknowledgments

The author thanks the Editor, the Associate Editor and the two referees for their insightful comments and suggestions, which led to substantial improvements in the paper.

Author information

Corresponding author

Correspondence to Shujie Ma.

Additional information

Ma’s research was partially supported by NSF grant DMS 1306972.

Appendix

For any positive numbers \(a_{n}\) and \(b_{n}\), let \(a_{n}\sim b_{n}\) denote that \(\lim _{n \rightarrow \infty }a_{n}/b_{n}=1\). For any vector \(\zeta = ( \zeta _{1},\ldots ,\zeta _{s} ) ^{\mathrm{T}}\in R^{s}\), denote its \(L_{r}\) norm as \( \Vert \zeta \Vert _{r}= ( \vert \zeta _{1} \vert ^{r}+\cdots +\vert \zeta _{s} \vert ^{r} ) ^{1/r}\). For any symmetric matrix \(\mathbf{A} _{s\times s}\), denote its \(L_{r}\) norm as \( \Vert \mathbf{A} \Vert _{r}=\max _{\zeta \in R^{s},\zeta \ne 0} \Vert \mathbf{A}\zeta \Vert _{r} \Vert \zeta \Vert _{r}^{-1}\). For any matrix \(\mathbf{A}= ( A_{ij} ) _{i=1,j=1}^{s,t}\), denote \( \Vert \mathbf{A} \Vert _{\infty }=\max _{1\le i\le s}\sum \nolimits _{j=1}^{t} \vert A_{ij} \vert \). The estimator \( \widehat{\dot{g}} ( u;\delta _{n} ) \) can be rewritten as \( \widehat{\dot{g}} ( u;\delta _{n} ) =\mathbf{B} _{1}^{q-1} ( u ) ^{\mathrm{T}}\mathbf{D}_{1}\widehat{\lambda } ( \delta _{n} ) \), where

$$\begin{aligned} \mathbf{B}_{1}^{q-1} ( u ) = ( B_{r,1}^{q-1} ( u ) :2\le r\le J_{n,1} ) ^{\mathrm {T}} \end{aligned}$$

are B-spline functions with order \(q-1\), and

$$\begin{aligned} \mathbf{D}_{1}= ( q-1 ) \left( \begin{array}{ccccc} \frac{-1}{\tau _{q+1}-\tau _{2}} &{} \frac{1}{\tau _{q+1}-\tau _{2}} &{} 0 &{} \cdots &{} 0 \\ 0 &{} \frac{-1}{\tau _{q+2}-\tau _{3}} &{} \frac{1}{\tau _{q+2}-\tau _{3}} &{} \cdots &{} 0 \\ \vdots &{} \vdots &{} \ddots &{} \ddots &{} \vdots \\ 0 &{} 0 &{} \cdots &{} \frac{-1}{\tau _{J_{n,1}+q-1}-\tau _{J_{n,1}}} &{} \frac{1}{ \tau _{J_{n,1}+q-1}-\tau _{J_{n,1}}} \end{array} \right) _{ ( J_{n,1}-1 ) \times {J_{n,1}}}. \end{aligned}$$
(13)
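The role of \(\mathbf{D}_{1}\) in (13) can be checked numerically. The following is a minimal sketch, not the paper's implementation: it assumes equally spaced interior knots on \([0,1]\) with \(q\)-fold boundary knots, and the helper names (`clamped_knots`, `bspline_basis`, `derivative_matrix`) are hypothetical. It builds \(\mathbf{D}_{1}\), evaluates the derivative of a spline as \(\mathbf{B}_{1}^{q-1}(u)^{\mathrm{T}}\mathbf{D}_{1}\lambda \), and computes \(\Vert \mathbf{D}_{1}\Vert _{\infty }\) as a maximum absolute row sum.

```python
def clamped_knots(n_basis, order):
    """Knots tau_1..tau_{J+q} on [0, 1]: q-fold boundary knots, equally spaced interior knots."""
    n_interior = n_basis - order
    interior = [(i + 1) / (n_interior + 1) for i in range(n_interior)]
    return [0.0] * order + interior + [1.0] * order


def bspline_basis(knots, order, u):
    """All B-spline basis functions of the given order at u (Cox-de Boor recursion)."""
    vals = [1.0 if knots[r] <= u < knots[r + 1] else 0.0
            for r in range(len(knots) - 1)]
    for k in range(2, order + 1):
        new = []
        for r in range(len(knots) - k):
            d1 = knots[r + k - 1] - knots[r]
            d2 = knots[r + k] - knots[r + 1]
            left = (u - knots[r]) / d1 * vals[r] if d1 > 0 else 0.0
            right = (knots[r + k] - u) / d2 * vals[r + 1] if d2 > 0 else 0.0
            new.append(left + right)
        vals = new
    return vals


def derivative_matrix(knots, order):
    """The (J-1) x J matrix D_1 of (13); row r holds -c_r, c_r in columns r, r+1."""
    n_basis = len(knots) - order
    rows = []
    for r in range(n_basis - 1):
        c = (order - 1) / (knots[order + r] - knots[r + 1])
        row = [0.0] * n_basis
        row[r], row[r + 1] = -c, c
        rows.append(row)
    return rows


def inf_norm(matrix):
    """Matrix infinity norm: maximum absolute row sum."""
    return max(sum(abs(x) for x in row) for row in matrix)


def spline_value(knots, order, coef, u):
    return sum(c * b for c, b in zip(coef, bspline_basis(knots, order, u)))


def spline_derivative(knots, order, coef, u):
    """Derivative as B^{q-1}(u)^T D_1 coef, using basis functions r = 2..J of order q-1."""
    d_coef = [sum(a * c for a, c in zip(row, coef))
              for row in derivative_matrix(knots, order)]
    lower = bspline_basis(knots, order - 1, u)[1:-1]
    return sum(c * b for c, b in zip(d_coef, lower))


# Cubic splines (q = 4) with J = 8 basis functions and arbitrary coefficients.
knots = clamped_knots(8, 4)
coef = [0.3, -1.2, 0.8, 2.0, -0.5, 1.1, 0.0, 0.7]
d1_norm = inf_norm(derivative_matrix(knots, 4))
```

Under these knot assumptions the largest row sum comes from the boundary rows and grows linearly in the number of basis functions, which agrees with the bound \(\Vert \mathbf{D}_{1}\Vert _{\infty }=O ( J_{n,1} ) \) used in the proof of Proposition 1.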

Proof of Proposition 1

By (6), \(\widehat{\lambda } ( \delta _{n} ) \) can be decomposed as \(\widehat{\lambda } ( \delta _{n} ) =\widehat{\lambda }_{\varepsilon } ( \delta _{n} ) +\widehat{\lambda }_{g} ( \delta _{n} ) \), where

$$\begin{aligned} \widehat{\lambda }_{\varepsilon } ( \delta _{n} )&= \{ \mathcal {B} ( \delta _{n} ) ^{\mathrm{T}}\mathcal { B} ( \delta _{n} ) \} ^{-1}\mathcal {B} ( \delta _{n} ) ^{\mathrm{T}}\varepsilon _{n}, \\ \widehat{\lambda }_{g} ( \delta _{n} )&= \{ \mathcal {B} ( \delta _{n} ) ^{\mathrm{T}}\mathcal {B} ( \delta _{n} ) \} ^{-1}\mathcal {B} ( \delta _{n} ) ^{\mathrm{T}}\mathbf{g}_{n}, \end{aligned}$$

in which \(\varepsilon _{n}= ( \varepsilon _{1},\ldots ,\varepsilon _{n} ) ^{\mathrm{T}}\) and \(\mathbf{g}_{n}= \{ g ( \int \nolimits _{\mathcal {T}}\beta ( t ) ^{\mathrm{T}}\mathbf{X}_{i} ( t ) \mathrm{d}t ) ,1\le i\le n \} ^{\mathrm{T}}\). Correspondingly, \(\widehat{g} ( u;\delta _{n} ) \) is decomposed into \(\widehat{g} ( u;\delta _{n} ) =\widehat{g} _{\varepsilon } ( u;\delta _{n} ) +\widehat{g}_{g} ( u; \delta _{n} ) \), where \(\widehat{g}_{\varepsilon } ( u; \delta _{n} ) =\mathbf{B}_{1} ( u ) ^{\mathrm{T}} \widehat{\lambda }_{\varepsilon } ( \delta _{n} ) \) and \(\widehat{g}_{g} ( u;\delta _{n} ) =\mathbf{B} _{1} ( u ) ^{\mathrm{T}}\widehat{\lambda }_{g} ( \delta _{n} ) \). Thus,

$$\begin{aligned} \widehat{g}_{g} ( u;\delta _{n} ) -g_{n} ( u )&= \mathbf{B}_{1} ( u ) ^{\mathrm{T}} ( \widehat{\lambda } _{g} ( \delta _{n} ) -\lambda _{n} ) \\&=\mathbf{B}_{1} ( u ) ^{\mathrm{T}} \{ \mathcal {B} ( \delta _{n} ) ^{\mathrm{T}}\mathcal {B} ( \delta _{n} ) \} ^{-1}\mathcal {B} ( \delta _{n} ) ^{\mathrm{T}} \{ \mathbf{g}_{n}-\mathcal {B} ( \delta _{n} ) \lambda _{n} \} \\&=\varPsi _{1} ( u ) +\varPsi _{2} ( u ) +\varPsi _{3} ( u ), \end{aligned}$$

where

$$\begin{aligned} \begin{aligned} \varPsi _{1} ( u )&=\mathbf{B}_{1} ( u ) ^{\mathrm{T}} \{ \mathcal {B} ( \delta _{n} ) ^{\mathrm{T}}\mathcal {B} ( \delta _{n} ) \} ^{-1}\mathcal {B} ( \delta _{n} ) ^{\mathrm{T}} \bigg [ \bigg \{ g \bigg ( \int \nolimits _{\mathcal {T}} \beta ( t ) ^{\mathrm{T}}\mathbf{X}_{i} ( t ) \mathrm{d}t \bigg ) \\&\qquad - g \bigg ( \sum \limits _{j=1}^{m_{i}} ( t_{i,j+1}-t_{ij} ) \beta ( t_{ij} ) ^{\mathrm{T}} \mathbf{X}_{i} ( t_{ij} )\bigg ),1\le i\le n \bigg \} ^{\mathrm{T}} \bigg ], \\ \varPsi _{2} ( u )&=\mathbf{B}_{1} ( u ) ^{\mathrm{T}} \{ \mathcal {B} ( \delta _{n} ) ^{\mathrm{T}}\mathcal {B} ( \delta _{n} ) \} ^{-1}\mathcal {B} ( \delta _{n} ) ^{\mathrm{T}} \\&\qquad \times \bigg \{ g \bigg ( \sum \limits _{j=1}^{m_{i}} ( t_{i,j+1}-t_{ij} ) \beta ( t_{ij} ) ^{\mathrm{T}}\mathbf{X}_{i} ( t_{ij} ) \bigg ) -g ( \varPhi _{i}^{\mathrm{T}}\delta _{n} ),1\le i\le n \bigg \} ^{\mathrm{T}},\\ \varPsi _{3} ( u )&=\mathbf{B}_{1} ( u ) ^{\mathrm{T}} \{ \mathcal {B} ( \delta _{n} ) ^{\mathrm{T}}\mathcal {B} ( \delta _{n} ) \} ^{-1}\mathcal {B} ( \delta _{n} ) ^{\mathrm{T}} \\&\qquad \times [ \{ g ( \varPhi _{i}^{\mathrm{T}}\delta _{n} ) ,1\le i\le n \} ^{\mathrm{T}}-\mathcal {B} ( \delta _{n} ) \lambda _{n} ]. \end{aligned} \end{aligned}$$
(14)

By Conditions (C3) and (C4), we have that for all \(1\le i\le n\), there exists a constant \(0<C<\infty \) such that

$$\begin{aligned} \bigg \vert g \bigg ( \int \nolimits _{\mathcal {T}}\beta ( t ) ^{\mathrm{T}}\mathbf{X}_{i} ( t ) \mathrm{d}t \bigg ) -g \bigg ( \sum \limits _{j=1}^{m_{i}} ( t_{i,j+1}-t_{ij} ) \beta ( t_{ij} ) ^{\mathrm{T}}\mathbf{X}_{i} ( t_{ij} ) \bigg ) \bigg \vert \le Cm_{\min }^{-1}, \end{aligned}$$
(15)

and by (8) there exist constants \(0<C^{\prime },C^{\prime \prime }<\infty \) such that

$$\begin{aligned}&\bigg \vert g \bigg ( \sum \limits _{j=1}^{m_{i}} ( t_{i,j+1}-t_{ij} ) \beta ( t_{ij} ) ^{\mathrm{T}} \mathbf{X}_{i} ( t_{ij} ) \bigg ) -g ( \varPhi _{i}^{\mathrm{T}} \delta _{n} ) \bigg \vert \nonumber \\&\quad \le C^{\prime } \bigg \vert \sum \limits _{j=1}^{m_{i}} ( t_{i,j+1}-t_{ij} ) \sum \limits _{k=1}^{p} \{ \beta _{k} ( t_{ij} ) -\mathbf{B}_{2} ( t_{ij} ) ^{\mathrm{T}}\widetilde{ \delta }_{k,n} \} X_{ik} ( t_{ij} ) \bigg \vert \nonumber \\&\quad \le C^{\prime \prime } ( a_{n}+J_{n,2}^{-\alpha } ). \end{aligned}$$
(16)
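The discretization bound (15) can be illustrated with a small numerical sketch. The choices below are assumptions made only for illustration and do not come from the paper: a single predictor with \(\beta (t)=t\), \(X(t)=e^{t}\) on \(\mathcal {T}=[0,1]\) (so the index integral equals 1 exactly), a uniform observation grid, and the link \(g=\sin \), which is Lipschitz with constant 1.

```python
import math


def left_riemann(f, m):
    """Sum_{j=1}^{m} (t_{j+1} - t_j) f(t_j) on the uniform grid t_j = (j - 1) / m."""
    grid = [j / m for j in range(m + 1)]
    return sum((grid[j + 1] - grid[j]) * f(grid[j]) for j in range(m))


def integrand(t):
    # beta(t) * X(t) with the illustrative choices beta(t) = t and X(t) = exp(t)
    return t * math.exp(t)


EXACT_INDEX = 1.0  # integral of t * exp(t) over [0, 1]


def link(u):
    # Illustrative Lipschitz link function (Lipschitz constant 1)
    return math.sin(u)


def index_error(m):
    """Error of the Riemann sum for the index integral."""
    return abs(EXACT_INDEX - left_riemann(integrand, m))


def link_error(m):
    """Error after applying the link, the quantity bounded in (15)."""
    return abs(link(EXACT_INDEX) - link(left_riemann(integrand, m)))


for m in (10, 100, 1000):
    print(m, index_error(m), link_error(m))
```

In this example both errors shrink roughly in proportion to \(1/m\), which matches the \(Cm_{\min }^{-1}\) form of the bound: the Lipschitz link cannot amplify the Riemann-sum error by more than a constant factor.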

Moreover, by (8) for all \(1\le i\le n\), there exists a constant \(0<C^{\prime \prime \prime }<\infty \) such that

$$\begin{aligned} \vert g ( \varPhi _{i}^{\mathrm{T}}\delta _{n} ) -\mathbf{B }_{1} ( \varPhi _{i}^{\mathrm{T}}\delta _{n} ) \lambda _{n} \vert \le C^{\prime \prime \prime }J_{n,1}^{-\alpha }. \end{aligned}$$
(17)

By Theorem 5.4.2 of DeVore and Lorentz (1993) and Bernstein’s inequality in de Boor (2001), for large enough \(n\) there are constants \(0<c_{1}<C_{1}<\infty \) such that

$$\begin{aligned} c_{1}J_{n,1}^{-1}\le \Vert E \{ n^{-1}\mathcal {B} ( \delta _{n} ) ^{\mathrm{T}}\mathcal {B} ( \delta _{n} ) \} \Vert _{2}\le C_{1}J_{n,1}^{-1}, \end{aligned}$$

and, for \(J_{n,1}\log ( n ) /n=o ( 1 ) \), with probability approaching \(1\),

$$\begin{aligned} c_{1}J_{n,1}^{-1}\le \Vert \{ n^{-1}\mathcal {B} ( \delta _{n} ) ^{\mathrm{T}}\mathcal {B} ( \delta _{n} ) \} \Vert _{2}\le C_{1}J_{n,1}^{-1}, \end{aligned}$$

and thus

$$\begin{aligned} C_{1}^{-1}J_{n,1}\le \Vert \{ n^{-1}\mathcal {B} ( \delta _{n} ) ^{\mathrm{T}}\mathcal {B} ( \delta _{n} ) \} ^{-1} \Vert _{2}\le c_{1}^{-1}J_{n,1}. \end{aligned}$$
(18)

By the above result and Demko (1986), it can be proved that with probability approaching \(1\) and for large enough \(n\),

$$\begin{aligned} \Vert \{ n^{-1}\mathcal {B} ( \delta _{n} ) ^{\mathrm{T}}\mathcal {B} ( \delta _{n} ) \} ^{-1} \Vert _{\infty }\le C_{2}J_{n,1}, \end{aligned}$$
(19)

for some constant \(0<C_{2}<\infty \). Therefore, by (14), (15), (16), (17) and (19), we have

$$\begin{aligned}&\sup \nolimits _{u\in \mathcal {I}} \vert \varPsi _{1} ( u ) \vert \\&\quad \le \sup \nolimits _{u\in \mathcal {I}} \Vert \mathbf{B}_{1} ( u ) \Vert _{\infty } \Vert \{ n^{-1}\mathcal {B} ( \delta _{n} ) ^{\mathrm{T}}\mathcal {B} ( \delta _{n} ) \} ^{-1} \Vert _{\infty } \Vert n^{-1}\mathcal {B} ( \delta _{n} ) ^{\mathrm{T}}\mathbf{1}_{n} \Vert _{\infty }O ( m_{\min }^{-1} ) \\&\quad =O_{p} ( J_{n,1} ) O_{p} ( J_{n,1}^{-1} ) O ( m_{\min }^{-1} ) =O_{p} ( m_{\min }^{-1} ), \\&\sup \nolimits _{u\in \mathcal {I}} \vert \varPsi _{2} ( u ) \vert \\&\quad \le \sup \nolimits _{u\in \mathcal {I}} \Vert \mathbf{B}_{1} ( u ) \Vert _{\infty } \Vert \{ n^{-1}\mathcal {B} ( \delta _{n} ) ^{\mathrm{T}}\mathcal {B} ( \delta _{n} ) \} ^{-1} \Vert _{\infty } \Vert n^{-1}\mathcal {B} ( \delta _{n} ) ^{\mathrm{T}}\mathbf{1}_{n} \Vert _{\infty }O ( a_{n}+J_{n,2}^{-\alpha } ) \\&\quad =O_{p} ( J_{n,1} ) O_{p} ( J_{n,1}^{-1} ) O ( a_{n}+J_{n,2}^{-\alpha } ) =O_{p} ( a_{n}+J_{n,2}^{-\alpha } ), \\&\sup \nolimits _{u\in \mathcal {I}} \vert \varPsi _{3} ( u ) \vert \\&\quad \le \sup \nolimits _{u\in \mathcal {I}} \Vert \mathbf{B}_{1} ( u ) \Vert _{\infty } \Vert \{ n^{-1}\mathcal {B} ( \delta _{n} ) ^{\mathrm{T}}\mathcal {B} ( \delta _{n} ) \} ^{-1} \Vert _{\infty } \Vert n^{-1}\mathcal {B} ( \delta _{n} ) ^{\mathrm{T}}\mathbf{1}_{n} \Vert _{\infty }O ( J_{n,1}^{-\alpha } ) \\&\quad =O_{p} ( J_{n,1} ) O_{p} ( J_{n,1}^{-1} ) O ( J_{n,1}^{-\alpha } ) =O_{p} ( J_{n,1}^{-\alpha } ). \end{aligned}$$

Thus, we have

$$\begin{aligned} \sup \nolimits _{u\in \mathcal {I}} \vert \widehat{g}_{g} ( u;\delta _{n} ) -g_{n} ( u ) \vert =O_{p} ( a_{n}+J_{n,1}^{-\alpha }+J_{n,2}^{-\alpha }+m_{\min }^{-1} ). \end{aligned}$$

Let \(\mathbb {X}= ( \mathbf{X}_{1},\ldots ,\mathbf{X}_{n} ) \). By Condition (C5) and (18), for every \(u\in \mathcal {I}\), \(E \{ \widehat{g}_{\varepsilon } ( u;\delta _{n} ) \vert \mathbb {X} \} =0\), and

$$\begin{aligned} E [ \{ \widehat{g}_{\varepsilon } ( u;\delta _{n} ) \} ^{2} \vert \mathbb {X} ] \asymp \mathbf{B}_{1} ( u ) ^{\mathrm{T}} \{ \mathcal {B} ( \delta _{n} ) ^{\mathrm{T}}\mathcal {B} ( \delta _{n} ) \} ^{-1}\mathbf{B}_{1} ( u ) \asymp J_{n,1}n^{-1}. \end{aligned}$$

Thus, it can be proved by Bernstein’s inequality in de Boor (2001) that \( \sup \nolimits _{u\in \mathcal {I}} \vert \widehat{g}_{\varepsilon } ( u;\delta _{n} ) \vert =O_{p} ( \sqrt{\log ( n ) J_{n,1}n^{-1}} ) \). Therefore, we have

$$\begin{aligned} \sup \nolimits _{u\in \mathcal {I}} \vert \widehat{g} ( u;\delta _{n} ) -g_{n} ( u ) \vert =O_{p}\left( a_{n}+J_{n,1}^{-\alpha }+J_{n,2}^{-\alpha }+m_{\min }^{-1}+\sqrt{\log ( n ) J_{n,1}n^{-1}}\right) . \end{aligned}$$

Result (i) follows from the above result and (8). It is easy to prove that \( \Vert \mathbf{D}_{1} \Vert _{\infty }=O ( J_{n,1} ) \), where \(\mathbf{D}_{1}\) is defined in (13). Following reasoning similar to that in the proof for \(\widehat{g} ( u; \delta _{n} ) \), the result in (ii) can be proved. \(\square \)
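The decomposition \(\widehat{\lambda }=\widehat{\lambda }_{\varepsilon }+\widehat{\lambda }_{g}\) used in this proof is a consequence of the linearity of least squares in the response. The sketch below illustrates it under strong simplifying assumptions that are not the paper's setting: order-1 (piecewise constant) B-splines, so that \(\mathcal {B}^{\mathrm{T}}\mathcal {B}\) is diagonal and the coefficients are bin means, a known index \(U_{i}\) on a regular grid, and an illustrative link \(g(u)=\sin (2\pi u)\). The function name `fit_piecewise_constant` is hypothetical.

```python
import math
import random


def fit_piecewise_constant(u, y, n_bins):
    """Least squares on order-1 B-splines (bin indicators): coefficients are bin means."""
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for ui, yi in zip(u, y):
        r = min(int(ui * n_bins), n_bins - 1)  # bin containing ui in [0, 1)
        sums[r] += yi
        counts[r] += 1
    return [s / c for s, c in zip(sums, counts)]


rng = random.Random(0)
n, n_bins = 200, 10
u = [(i + 0.5) / n for i in range(n)]                  # index values, every bin non-empty
g_vals = [math.sin(2 * math.pi * ui) for ui in u]      # signal part g(U_i)
eps = [rng.gauss(0.0, 0.5) for _ in range(n)]          # noise part
y = [gv + e for gv, e in zip(g_vals, eps)]

lam = fit_piecewise_constant(u, y, n_bins)             # fit to Y
lam_g = fit_piecewise_constant(u, g_vals, n_bins)      # fit to the signal alone
lam_eps = fit_piecewise_constant(u, eps, n_bins)       # fit to the noise alone

# The fit to Y equals the sum of the two separate fits, up to rounding.
max_gap = max(abs(a - (b + c)) for a, b, c in zip(lam, lam_g, lam_eps))
print(max_gap)
```

This separation is what lets the proof bound the approximation part (through \(\varPsi _{1},\varPsi _{2},\varPsi _{3}\)) and the stochastic part (through Bernstein's inequality) independently before adding the two rates.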

Lemma 1

Under Condition (C3), there exists \( \widetilde{\delta }_{1,n}^{0}=(\widetilde{\delta } _{r1,n}^{0}:1\le r\le J_{n,2})^{\mathrm{T}}\in R^{J_{n,2}}\) with \(\widetilde{ \delta }_{11,n}^{0}\le \cdots \le \widetilde{\delta }_{J_{n,2}1,n}^{0}\) such that \(\sup _{t\in \mathcal {T}} \vert \beta _{1} ( t ) - \widetilde{\beta }_{1,n} ( t ) \vert =O(J_{n,2}^{-\alpha })\), where \(\widetilde{\beta }_{1,n} ( t ) =\mathbf{B}_{2} ( t ) ^{\mathrm{T}}\widetilde{\delta }_{1,n}^{0}\).

Proof

By choosing \(\epsilon _{1}<\cdots <\epsilon _{J_{n,2}}\) , we define

$$\begin{aligned} \widetilde{\beta }_{1,n}^{0} ( t ) =\sum \limits _{r=1}^{J_{n,2}}\beta _{1} ( \epsilon _{r} ) B_{r,2} ( t ), \end{aligned}$$

which is a monotone nondecreasing function of \(t\). By the fact that for \(t\) in \( [ \upsilon _{r_{1}},\upsilon _{r_{1}+1} ) \), \(\sum \nolimits _{r= \upsilon _{r_{1}}+1-q}^{\upsilon _{r_{1}}}B_{r,2}(t)=1\), we have

$$\begin{aligned} \beta _{1} ( \widetilde{t} ) =\sum \limits _{r=\upsilon _{r_{1}}+1-q}^{\upsilon _{r_{1}}}\beta _{1} ( \widetilde{t} ) B_{r,2}(\widetilde{t}) \end{aligned}$$

for \(\widetilde{t}\in [ \upsilon _{r_{1}},\upsilon _{r_{1}+1} ) \), and thus

$$\begin{aligned} \vert \beta _{1} ( \widetilde{t} ) -\widetilde{\beta } _{1,n}^{0} ( \widetilde{t} ) \vert&\le \sum \limits _{r=\upsilon _{r_{1}}+1-q}^{\upsilon _{r_{1}}} \vert \beta _{1} ( \widetilde{t} ) -\beta _{1} ( \epsilon _{r} ) \vert B_{r,2} ( \widetilde{t} ) \\&\le q\max _{\upsilon _{r_{1}}+1-q\le r\le \upsilon _{r_{1}}} \vert \beta _{1} ( \widetilde{t} ) -\beta _{1} ( \epsilon _{r} ) \vert . \end{aligned}$$

Let \(h=\max _{q\le r\le J_{n,2}}(\upsilon _{r+1}-\upsilon _{r})\). Define

$$\begin{aligned} \omega (\beta _{1};h)=\max \{ \vert \beta _{1}(t_{1})-\beta _{1}(t_{2}) \vert : \vert t_{1}-t_{2} \vert \le h \}. \end{aligned}$$

Then, \(\omega (\beta _{1};h)\) is a monotone and subadditive function of \(h\), that is, \(\omega (\beta _{1};h_{1})\le \omega (\beta _{1};h_{1}+h_{2})\le \omega (\beta _{1};h_{1})+\omega (\beta _{1};h_{2})\) for \(h_{1}>0\) and \( h_{2}>0\). See Lemma 2.19 of Wu (2010) for the detailed proof. We choose \( \epsilon _{r}=\upsilon _{1}+(r-1)(\upsilon _{q+1}-\upsilon _{q})/q\) for \( r=1,\ldots ,q\) and \(\epsilon _{r}=\upsilon _{r}\) for \(r=q+1,\ldots ,J_{n,2}\) to guarantee that \(\epsilon _{r+1}-\epsilon _{r}>0\). Then, we have \( \vert \epsilon _{r}-\upsilon _{r} \vert \le h\) for \(r=1,\ldots ,J_{n,2}\). Moreover, for \(\widetilde{t}\in [ \upsilon _{r_{1}},\upsilon _{r_{1}+1} ) \) and \(r_{1}-q\le r\le r_{1}\), \( \vert \widetilde{t} -\epsilon _{r} \vert \le (q+1)h\). Therefore, we have

$$\begin{aligned} \sup _{t} \vert \beta _{1}(t)-\widetilde{\beta }_{1,n}^{0} ( t ) \vert \le q\omega (\beta _{1};(q+1)h)\le (q+1)q\omega (\beta _{1};h). \end{aligned}$$

The last step follows from the subadditivity of \(\omega (\beta _{1};h)\). Let

$$\begin{aligned} G_{q}= \{ \mathbf{B}_{2} ( t ) ^{\mathrm{T}}\delta _{1}, \delta _{1}\in \mathbf{R}^{J_{n,2}},\delta _{11}\le \cdots \le \delta _{J_{n,2}1} \}. \end{aligned}$$

Denote \(d(\beta _{1},G_{q})\) as the distance of \(\beta _{1}\) from \(G_{q}\). Following the reasoning as given in Lemma 2.19 of Wu (2010), it can be shown that for any \(g\in G_{q}\), we have

$$\begin{aligned} d(\beta _{1},G_{q})\le ch \Vert \partial (\beta _{1}-g)/\partial t \Vert _{\infty }, \end{aligned}$$

for some constant \(0<c<\infty \), and thus

$$\begin{aligned} d(\beta _{1},G_{q})\le chd(\partial \beta _{1}/\partial t,G_{q-1}), \end{aligned}$$

where \(G_{q-1}= \{ \partial g/\partial t,g\in G_{q} \} \). Proceeding in this way, we can derive

$$\begin{aligned} d(\beta _{1},G_{q})\le ch^{\alpha } \Vert \partial ^{\alpha }g/\partial t^{\alpha } \Vert _{\infty }. \end{aligned}$$

Thus, the result in Lemma 1 follows from the above result and Condition (C3).\(\square \)
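The first step of this proof, the construction of a monotone spline with coefficients \(\beta _{1}(\epsilon _{r})\), can be sketched numerically. This is an illustration under assumptions that are not taken from the paper: cubic splines (\(q=4\)), equally spaced interior knots on \([0,1]\) with \(q\)-fold boundary knots, and the nondecreasing test function \(\beta _{1}(t)=\arctan \{4(t-0.5)\}\). The helper names are hypothetical; the points \(\epsilon _{r}\) follow the choice stated in the proof.

```python
import math


def clamped_knots(n_basis, order):
    """Knots v_1..v_{J+q} on [0, 1]: q-fold boundary knots, equally spaced interior knots."""
    n_interior = n_basis - order
    interior = [(i + 1) / (n_interior + 1) for i in range(n_interior)]
    return [0.0] * order + interior + [1.0] * order


def bspline_basis(knots, order, t):
    """All B-spline basis functions of the given order at t (Cox-de Boor recursion)."""
    vals = [1.0 if knots[r] <= t < knots[r + 1] else 0.0
            for r in range(len(knots) - 1)]
    for k in range(2, order + 1):
        new = []
        for r in range(len(knots) - k):
            d1 = knots[r + k - 1] - knots[r]
            d2 = knots[r + k] - knots[r + 1]
            left = (t - knots[r]) / d1 * vals[r] if d1 > 0 else 0.0
            right = (knots[r + k] - t) / d2 * vals[r + 1] if d2 > 0 else 0.0
            new.append(left + right)
        vals = new
    return vals


def quasi_interpolant(beta, n_basis, order):
    """Spline sum_r beta(eps_r) B_r(t) with the strictly increasing eps_r from the proof."""
    knots = clamped_knots(n_basis, order)
    eps = [knots[0] + r * (knots[order] - knots[order - 1]) / order if r < order
           else knots[r] for r in range(n_basis)]
    coef = [beta(e) for e in eps]  # nondecreasing because beta is nondecreasing

    def beta_tilde(t):
        return sum(c * b for c, b in zip(coef, bspline_basis(knots, order, t)))

    return beta_tilde, eps


def beta1(t):
    # Illustrative nondecreasing slope function
    return math.atan(4.0 * (t - 0.5))


GRID = [k / 400 for k in range(400)]  # evaluation points in [0, 1)


def sup_error(n_basis, order=4):
    beta_tilde, _ = quasi_interpolant(beta1, n_basis, order)
    return max(abs(beta1(t) - beta_tilde(t)) for t in GRID)


for n_basis in (10, 20, 40):
    print(n_basis, sup_error(n_basis))
```

Because the coefficients are nondecreasing, the resulting spline is nondecreasing, and its sup-norm error shrinks as the knot spacing \(h\) decreases. This simple construction only delivers the first-order bound \(q(q+1)\omega (\beta _{1};h)\); the higher-order rate \(h^{\alpha }\) in the lemma comes from the subsequent distance argument, which the sketch does not reproduce.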

Lemma 2

Let \(\widehat{\delta }\) be the minimizer of \( \widetilde{L}_{n} ( \delta ) \) given in (7) subject to \(\delta _{1,1}\le \cdots \le \delta _{J_{n,2},1}\), satisfying \( \Vert \widehat{\delta }-\widetilde{\delta }_{n}^{0}\Vert _{\infty }\le a_{n}\) with probability approaching \(1\), where \(\widetilde{ \delta }_{n}^{0}= ( \widetilde{\delta }_{1,n}^{0\mathrm{T }},\ldots ,\widetilde{\delta }_{p,n}^{0\mathrm{T}} ) ^{\mathrm{T}}\). Then, under the assumptions in Theorem 1, we have

$$\begin{aligned} \Vert \widehat{\delta }_{n}-\widetilde{\delta } _{n}^{0}\Vert _{\infty }=O_{p}\{ ( \log n) ^{1/2}J_{n,2}^{1/2}n^{-1/2}\}. \end{aligned}$$
(20)

Proof

Let \(\widehat{\delta }_{n}\) be the minimizer of \(\widetilde{L} _{n}( \delta ) \) with \(\Vert \widehat{\delta }_{n}-\widetilde{\delta }_{n}^{0}\Vert _{\infty }\le a_{n}\). By Taylor’s expansion, we have

$$\begin{aligned} \widehat{\delta }_{n}-\widetilde{\delta }_{n}^{0}=\{ \partial ^{2}\widetilde{L}_{n}( \widetilde{\delta }_{n}^{0}) /\partial \delta \partial \delta ^{\mathrm{T}}\} ^{-1}\{ -\partial \widetilde{L}_{n}( \widetilde{\delta }_{n}^{0}) /\partial \delta \} \{ 1+o_{p}( 1) \}. \end{aligned}$$

Moreover,

$$\begin{aligned}&-\partial \widetilde{L}_{n}( \widetilde{\delta }_{n}^{0})/\partial \delta \\&\quad =\sum \limits _{i=1}^{n}\{ Y_{i}-\widehat{g}( \varPhi _{i}^{\mathrm{T}}\widetilde{\delta }_{n}^{0},\widehat{\delta }) \} [ \widehat{\dot{g}}( \varPhi _{i}^{\mathrm{T}} \widetilde{\delta }_{n}^{0},\,\widehat{\delta }) \varPhi _{i}+\{ \partial \widehat{\lambda }( \widehat{\delta }) ^{\mathrm{T}}/\partial \delta \} \mathbf{B}_{1}( \varPhi _{i}^{\mathrm{T}}\widetilde{\delta }_{n}^{0})] \\&\quad =\varTheta _{1}+\varTheta _{2}, \end{aligned}$$

where

$$\begin{aligned} \varTheta _{1}&= \sum \limits _{i=1}^{n}\{ Y_{i}-g( U_{i}) \} \varPsi _{i}, \\ \varTheta _{2}&= \sum \limits _{i=1}^{n}\{ g( U_{i}) -\widehat{ g}( \varPhi _{i}^{\mathrm{T}}\widetilde{\delta }_{n}^{0},\widehat{ \delta }) \} \varPsi _{i}, \end{aligned}$$

and

$$\begin{aligned} \varPsi _{i}=\widehat{\dot{g}}( \varPhi _{i}^{\mathrm{T}}\widetilde{\delta }_{n}^{0},\widehat{\delta }) \varPhi _{i}+\{ \partial \widehat{\lambda }( \widehat{\delta } ) ^{\mathrm{T}}/\partial \delta \} \mathbf{B}_{1}( \varPhi _{i}^{\mathrm{T}}\widetilde{\delta }_{n}^{0}). \end{aligned}$$

By Bernstein’s inequality in de Boor (2001), it can be proved that \(\Vert \varTheta _{1}\Vert _{\infty }=O_{p}( ( \log n) ^{1/2}n^{1/2}J_{n,2}^{-1/2}) \). Next, we will show that \(\Vert \varTheta _{2}\Vert _{\infty }=o_{p}( n^{1/2}J_{n,2}^{-1/2}) \). By Proposition 1 and the assumption in Theorem 1, we have

$$\begin{aligned} \vert g( U_{i}) -\widehat{g}( \varPhi _{i}^{\mathrm{T}} \widetilde{\delta }_{n}^{0},\widehat{\delta }) \vert =O_{p}\left( a_{n}+J_{n,1}^{-\alpha }+m_{\min }^{-1}+\sqrt{\log ( n) J_{n,1}n^{-1}}\right) =o_{p}( 1). \end{aligned}$$

By the law of large numbers, we have \(\sum \nolimits _{i=1}^{n}\Vert \varPsi _{i}\Vert _{\infty }=O_{p}( n^{1/2}J_{n,2}^{-1/2}) \). Therefore, \(\Vert \varTheta _{2}\Vert _{\infty }=o_{p}( n^{1/2}J_{n,2}^{-1/2}) \). Thus, we have \(\Vert -\partial \widetilde{L}_{n}( \widetilde{\delta }_{n}^{0}) /\partial \delta \Vert _{\infty }=O_{p}( ( \log n) ^{1/2}n^{1/2}J_{n,2}^{-1/2}) \). Moreover,

$$\begin{aligned} \partial ^{2}\widetilde{L}_{n}( \widetilde{\delta }_{n}^{0}) /\partial \delta \partial \delta ^{\mathrm{T}}=\left( \sum \limits _{i=1}^{n}\varPsi _{i}\varPsi _{i}^{\mathrm{T}}\right) ( 1+o_{p}( 1) ) \asymp nJ_{n,2}^{-1}. \end{aligned}$$

Therefore, we have \( \Vert \widehat{\delta }_{n}-\widetilde{ \delta }_{n}^{0} \Vert _{\infty }=O_{p} ( ( \log n ) ^{1/2}n^{-1/2}J_{n,2}^{1/2} ) \). Since \(\widetilde{\delta } _{11,n}^{0}\le \cdots \le \widetilde{\delta }_{J_{n,2}1,n}^{0}\), then with probability approaching \(1\), \(\widehat{\delta }_{n}=\widehat{ \delta }\). \(\Box \)
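As a rough check on the orders of magnitude, the rate in (20) can be recovered by multiplying the two bounds obtained in this proof. This is a heuristic sketch only: it assumes that the inverse of the second-derivative matrix has \(L_{\infty }\) norm of order \(n^{-1}J_{n,2}\), which is an assumption made here for illustration and is not stated explicitly above.

$$\begin{aligned} \Vert \widehat{\delta }_{n}-\widetilde{\delta }_{n}^{0}\Vert _{\infty }&\lesssim \underbrace{n^{-1}J_{n,2}}_{\text {inverse second derivative}}\times \underbrace{( \log n) ^{1/2}n^{1/2}J_{n,2}^{-1/2}}_{\text {first derivative}} \\&= ( \log n) ^{1/2}n^{-1/2}J_{n,2}^{1/2}. \end{aligned}$$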

Lemma 3

Under the assumptions in Theorem 1,

$$\begin{aligned} \left\| -\partial \widetilde{L}_{n} ( \widetilde{\delta } _{n}^{0} ) /\partial \delta -2\sum \limits _{i=1}^{n} \{ Y_{i}-g ( U_{i} ) \} \dot{g} ( U_{i} ) \{ \varPhi _{i}-E ( \varPhi _{i} \vert U_{i} ) \} \right\| _{\infty }=o_{p} ( n^{1/2}J_{n,2}^{-1/2} ). \end{aligned}$$

Proof

By (6), we have \(\widehat{\lambda } ( \delta ) = \{ \mathcal {B} ( \delta ) ^{\mathrm{T} }\mathcal {B} ( \delta ) \} ^{-1}\mathcal {B} ( \delta ) ^{\mathrm{T}}\mathbf{Y}_{n}\), where \(\mathbf{Y} _{n}= ( Y_{1},\ldots ,Y_{n} ) ^{\mathrm{T}}\). Thus,

$$\begin{aligned}&\mathbf{B}_{1} ( \varPhi _{i}^{\mathrm{T}}\delta ) ^{\mathrm{T} } \{ \partial \widehat{\lambda } ( \delta ) /\partial \delta ^{\mathrm{T}} \} \nonumber \\&\quad =\mathbf{B}_{1} ( \varPhi _{i}^{\mathrm{T}}\delta ) ^{\mathrm{T }} \{ \partial ( \widehat{\lambda } ( \delta ) -\lambda _{n}^{0} ) /\partial \delta ^{\mathrm{T }} \} \nonumber \\&\quad =\mathbf{B}_{1} ( \varPhi _{i}^{\mathrm{T}}\delta ) ^{\mathrm{T }}\partial [ \{ \mathcal {B} ( \delta ) ^{\mathrm{T }}\mathcal {B} ( \delta ) \} ^{-1}\mathcal {B} ( \delta ) ^{\mathrm{T}} ( \mathbf{Y}_{n}-\mathcal {B} ( \delta ) \lambda _{n}^{0} ) ] /\partial \delta ^{\mathrm{T}} \nonumber \\&\quad =\varOmega _{1} ( \delta ) +\varOmega _{2} ( \delta ) , \end{aligned}$$
(21)

where

$$\begin{aligned} \varOmega _{1} ( \delta )&= -\mathbf{B}_{1} ( \varPhi _{i}^{\mathrm{T}}\delta ) ^{\mathrm{T}} \{ \mathcal {B} ( \delta ) ^{\mathrm{T}}\mathcal {B} ( \delta ) \} ^{-1}\mathcal {B} ( \delta ) ^{\mathrm{T}} \{ \dot{g}_{n} ( \varPhi _{i}^{\mathrm{T}}\delta ) \varPhi _{i},1\le i\le n \} ^{\mathrm{T}}, \\ \varOmega _{2} ( \delta )&= \mathbf{B}_{1} ( \varPhi _{i}^{\mathrm{T}}\delta ) ^{\mathrm{T}} [ \partial [ \{ \mathcal {B} ( \delta ) ^{\mathrm{T}}\mathcal {B} ( \delta ) \} ^{-1}\mathcal {B} ( \delta ) ^{\mathrm{T}} ] /\partial \delta ^{\mathrm{T}} ] \{ \mathbf{Y}_{n}-\mathcal {B} ( \delta ) \lambda _{n}^{0} \}. \end{aligned}$$

Let

$$\begin{aligned} \widehat{\varOmega }_{1} ( \delta ) =-\mathbf{B}_{1} ( \varPhi _{i}^{\mathrm{T}}\delta ) ^{\mathrm{T}} \{ \mathcal {B} ( \delta ) ^{\mathrm{T}}\mathcal {B} ( \delta ) \} ^{-1}\mathcal {B} ( \delta ) ^{\mathrm{T} } \{ \dot{g} ( U_{i} ) \varPhi _{i},1\le i\le n \} ^{\mathrm{T }}. \end{aligned}$$

Following reasoning similar to that in the proof of Proposition 1, it can be shown that

$$\begin{aligned} \Vert \varOmega _{1} ( \widetilde{\delta }_{n}^{0} ) ^{\mathrm{T}}-\widehat{\varOmega }_{1} ( \widetilde{\delta } _{n}^{0} ) ^{\mathrm{T}} \Vert _{\infty }&= O_{p} ( J_{n,1}^{1-\alpha }+J_{n,2}^{-\alpha }+m_{\min }^{-1} ) , \nonumber \\ \Vert \mathbf{Y}_{n}-\mathcal {B} ( \delta ) \lambda _{n}^{0} \Vert _{\infty }&= O_{p} ( J_{n,1}^{-\alpha }+J_{n,2}^{-\alpha }+m_{\min }^{-1}+ ( \log n ) ^{1/2}J_{n,1}^{1/2}n^{-1/2} ).\qquad \quad \end{aligned}$$
(22)

Denote

$$\begin{aligned} \mathbf{A} ( \delta ) =\{A_{1} ( \delta ) ,\ldots ,A_{J_{n,1}} ( \delta ) \}^{\mathrm{T} }= \{ \mathcal {B} ( \delta ) ^{\mathrm{T}}\mathcal {B} ( \delta ) \} ^{-1}\mathcal {B} ( \delta ) ^{\mathrm{T}}\mathbf{1}_{n}. \end{aligned}$$

By (19) and Bernstein’s inequality in de Boor (2001), we have

$$\begin{aligned} \sup _{1\le s\le J_{n,1}} \vert A_{s} ( \widetilde{\delta } _{n}^{0} ) \vert&\le \Vert \{ n^{-1}\mathcal {B} ( \widetilde{\delta }_{n}^{0} ) ^{\mathrm{T}}\mathcal {B} ( \widetilde{\delta }_{n}^{0} ) \} ^{-1} \Vert _{\infty } \Vert n^{-1}\mathcal {B} ( \widetilde{\delta } _{n}^{0} ) ^{\mathrm{T}}\mathbf{1}_{n} \Vert _{\infty } \\&= O_{p} ( J_{n,1} ) O_{p} ( J_{n,1}^{-1} ) =O_{p} ( 1 ) , \end{aligned}$$

and thus with probability approaching \(1\), \(\sup _{1\le s\le J_{n,1}} \vert \dot{A}_{s} ( \widetilde{\delta } _{n}^{0} ) \vert \le C\) for some constant \(0<C<\infty \) by the fact that \(B_{s,1} ( u ) \) and \(\dot{B}_{s,1} ( u ) \) are functions with values bounded between \(0\) and \(1\). Hence, with probability approaching \(1\),

$$\begin{aligned} \sup _{1\le s\le J_{n,1}} \Vert \partial A_{s} ( \widetilde{\delta }_{n}^{0} ) /\partial \delta \Vert _{\infty }\le \sup _{1\le s\le J_{n,1}} \Vert \dot{A}_{s} ( \widetilde{ \delta }_{n}^{0} ) \Vert _{\infty }\sup _{1\le i\le n} \Vert \varPhi _{i} \Vert _{\infty }\le C^{\prime } \end{aligned}$$

for some constant \(0<C^{\prime }<\infty \). By B-spline properties, we have \( \sum \nolimits _{1\le s\le J_{n,1}} \vert B_{s,1} ( \varPhi _{i}^{\mathrm{T}}\widetilde{\delta }_{n}^{0} ) \vert =O(1)\). Therefore,

$$\begin{aligned} \Vert \varOmega _{2} ( \widetilde{\delta }_{n}^{0} ) ^{\mathrm{T}} \Vert _{\infty }&\le \sum \limits _{1\le s\le J_{n,1}} \vert B_{s,1} ( \varPhi _{i}^{\mathrm{T}}\widetilde{\delta }_{n}^{0} ) \vert \sup _{1\le s\le J_{n,1}} \Vert \partial A_{s} ( \widetilde{\delta }_{n}^{0} ) /\partial \delta \Vert _{\infty } \Vert \mathbf{Y}_{n}-\mathcal {B} ( \delta ) \lambda _{n}^{0} \Vert _{\infty } \nonumber \\&= O(1)O_{p}(1)O_{p} ( J_{n,1}^{-\alpha }+J_{n,2}^{-\alpha }+m_{\min }^{-1}+ ( \log n ) ^{1/2}J_{n,1}^{1/2}n^{-1/2} ) \nonumber \\&= O_{p} ( J_{n,1}^{-\alpha }+J_{n,2}^{-\alpha }+m_{\min }^{-1}+ ( \log n ) ^{1/2}J_{n,1}^{1/2}n^{-1/2} ). \end{aligned}$$
(23)

Moreover, by Condition (C3), for every \(t_{ij}\in \mathcal {T}\), there exists \(\zeta _{n,k} ( t_{ij} ) \in R^{J_{n,1}}\) such that \( \vert E \{ X_{i,k} ( t_{ij} ) \vert U_{i} \} -\mathbf{B}_{1} ( U_{i} ) ^{\mathrm{T}}\zeta _{n,k} ( t_{ij} ) \vert =O ( J_{n,1}^{-1} ) \), and thus for every \(s\) and \(k\),

$$\begin{aligned}&\left| E ( \dot{g} ( U_{i} ) \varPhi _{i,sk} \vert U_{i} ) -\dot{g} ( U_{i} ) \sum \limits _{j=1}^{m_{i}} ( t_{i,j+1}-t_{ij} ) B_{s,2} ( t_{ij} ) \mathbf{B}_{1} ( U_{i} ) ^{\mathrm{T}}\zeta _{n,k} ( t_{ij} ) \right| \nonumber \\&\quad =\sum \limits _{j=1}^{m_{i}} ( t_{i,j+1}-t_{ij} ) B_{s,2} ( t_{ij} ) \dot{g} ( U_{i} ) O ( J_{n,1}^{-1} ) \nonumber \\&\quad = \left( \int B_{s,2} ( t ) \mathrm{d}t \right) \dot{g} ( U_{i} ) O ( J_{n,1}^{-1}+m_{\min }^{-1} ) \nonumber \\&\quad =O \{ J_{n,2}^{-1} ( J_{n,1}^{-1}+m_{\min }^{-1} ) \}. \end{aligned}$$
(24)

Let

$$\begin{aligned} \widetilde{\varOmega }_{1}&= \{ \widetilde{\varOmega }_{1,sk} \} \\&= -\mathbf{B}_{1} ( U_{i} ) ^{\mathrm{T}} ( \mathcal {B}^{\mathrm{T} }\mathcal {B} ) ^{-1}\mathcal {B}^{\mathrm{T}} \{ \dot{g} ( U_{i} ) \varPhi _{i},1\le i\le n \} ^{\mathrm{T}}, \end{aligned}$$

where \(\mathcal {B}= [ \{ \mathbf{B}_{1} ( U_{1} ) ,\ldots , \mathbf{B}_{1} ( U_{n} ) \} ^{\mathrm{T}} ] _{n\times J_{n,1}}\). By (24) and Bernstein’s inequality, we have

$$\begin{aligned}&\sup _{s,k} \left| -\widetilde{\varOmega }_{1,sk}-\dot{g} ( U_{i} ) \sum \limits _{j=1}^{m_{i}} ( t_{i,j+1}-t_{ij} ) B_{s,2} ( t_{ij} ) \mathbf{B}_{1} ( U_{i} ) ^{\mathrm{T}} \zeta _{n,k} ( t_{ij} ) \right| \\&\quad =\sup _{s,k} \vert \mathbf{B}_{1} ( U_{i} ) ^{\mathrm{T}} ( \mathcal {B}^{\mathrm{T}}\mathcal {B} ) ^{-1}\mathcal {B}^{\mathrm{T}} [ \dot{g} ( U_{i} ) \varPhi _{i,sk}-E ( \dot{g} ( U_{i} ) \varPhi _{i,sk} \vert U_{i} ) \\&\quad \quad +O ( J_{n,2}^{-1}J_{n,1}^{-1}+J_{n,2}^{-1}m_{\min }^{-1} ) ,1\le i\le n ] \vert \\&\quad = ( J_{n,2}^{-1}+m_{\min }^{-1} ) O_{p} ( ( \log n ) ^{1/2}J_{n,1}^{1/2}n^{-1/2} ) +O ( J_{n,2}^{-1}J_{n,1}^{-1}+J_{n,2}^{-1}m_{\min }^{-1} ) \\&\quad =O_{p} ( ( \log n ) ^{1/2}J_{n,2}^{-1}J_{n,1}^{1/2}n^{-1/2}\!+\! ( \log n ) ^{1/2}m_{\min }^{-1}J_{n,1}^{1/2}n^{-1/2}\!+\! J_{n,2}^{-1}J_{n,1}^{-1}\!+\!J_{n,2}^{-1}m_{\min }^{-1} ) , \end{aligned}$$

and thus

$$\begin{aligned}&\sup _{s,k} \vert -\widetilde{\varOmega }_{1,sk}-E ( \dot{g} ( U_{i} ) \varPhi _{i,sk} \vert U_{i} ) \vert \\&\quad =O_{p} ( ( \log n ) ^{1/2}J_{n,2}^{-1}J_{n,1}^{1/2}n^{-1/2}\!+\! ( \log n ) ^{1/2}m_{\min }^{-1}J_{n,1}^{1/2}n^{-1/2}\!+\! J_{n,2}^{-1}J_{n,1}^{-1}\!+\! J_{n,2}^{-1}m_{\min }^{-1} ). \end{aligned}$$

Furthermore, it can be proved that \( \Vert \widehat{\varOmega }_{1} ( \widetilde{\delta }_{n}^{0} ) ^{\mathrm{T}}-\widetilde{\varOmega } _{1}^{\mathrm{T}} \Vert _{\infty }=O ( J_{n,2}^{-\alpha }+m_{\min }^{-1} ) \). Therefore, we have

$$\begin{aligned}&\Vert \widehat{\varOmega }_{1} ( \widetilde{\delta } _{n}^{0} ) ^{\mathrm{T}}+E ( \dot{g} ( U_{i} ) \varPhi _{i} \vert U_{i} ) \Vert _{\infty } \nonumber \\&\quad =O_{p} ( ( \log n ) ^{1/2}J_{n,2}^{-1}J_{n,1}^{1/2}n^{-1/2}+J_{n,2}^{-1}J_{n,1}^{-1}+m_{\min }^{-1}+J_{n,2}^{-\alpha } ). \end{aligned}$$
(25)

By (21), (22), (23) and (25), we have

$$\begin{aligned}&\Vert \{ \widehat{\lambda } ( \widetilde{\delta }_{n}^{0} ) ^{\mathrm{T}}/\partial \delta \} \mathbf{B}_{1} ( \varPhi _{i}^{\mathrm{T}}\widetilde{\delta } _{n}^{0} ) +E ( \dot{g} ( U_{i} ) \varPhi _{i} \vert U_{i} ) \Vert _{\infty } \\&\quad =O_{p} ( J_{n,2}^{-1}J_{n,1}^{-1}+m_{\min }^{-1}+J_{n,2}^{-\alpha }+J_{n,1}^{1-\alpha }+ ( \log n ) ^{1/2}J_{n,1}^{1/2}n^{-1/2} ). \end{aligned}$$

Let

$$\begin{aligned} \varDelta _{i}= [ \widehat{\dot{g}} ( \varPhi _{i}^{\mathrm{T}}\widetilde{ \delta }_{n}^{0},\widetilde{\delta }_{n}^{0} ) \varPhi _{i}+ \{ \widehat{\lambda } ( \widetilde{\delta } _{n}^{0} ) ^{\mathrm{T}}/\partial \delta \} \mathbf{B} _{1} ( \varPhi _{i}^{\mathrm{T}}\widetilde{\delta }_{n}^{0} ) ] - [ \dot{g} ( U_{i} ) \varPhi _{i}-E ( \dot{g} ( U_{i} ) \varPhi _{i} \vert U_{i} ) ]. \end{aligned}$$

By the above result and Proposition 1, we have

$$\begin{aligned}&\Vert \varDelta _{i} \Vert _{\infty } \nonumber \\&\quad =O_{p} \{ ( J_{n,1}^{1-\alpha }+J_{n,1}J_{n,2}^{-\alpha }+J_{n,1}m_{\min }^{-1}+ ( \log n ) ^{1/2}J_{n,1}^{3/2}n^{-1/2} ) ( J_{n,2}^{-1}+m_{\min }^{-1} ) \} \nonumber \\&\qquad +O_{p} ( J_{n,2}^{-1}J_{n,1}^{-1}+m_{\min }^{-1}+J_{n,2}^{-\alpha }+J_{n,1}^{1-\alpha }+ ( \log n ) ^{1/2}J_{n,1}^{1/2}n^{-1/2} ) \nonumber \\&\quad =O_{p} ( J_{n,2}^{-1}J_{n,1}^{-1}+m_{\min }^{-1}+J_{n,2}^{-\alpha }+J_{n,1}^{1-\alpha }+ ( \log n ) ^{1/2}J_{n,1}^{1/2}n^{-1/2} \nonumber \\&\qquad +J_{n,1}J_{n,2}^{-\alpha -1}+J_{n,1}J_{n,2}^{-1}m_{\min }^{-1}+ ( \log n )^{1/2}J_{n,1}^{3/2}J_{n,2}^{-1}n^{-1/2} ). \end{aligned}$$
(26)
$$\begin{aligned}&\partial \widetilde{L}_{n} ( \widetilde{\delta } _{n}^{0} ) /\partial \delta \\&\quad =-2\sum \limits _{i=1}^{n} \{ Y_{i}-\widehat{g} ( \varPhi _{i}^{\mathrm{T}}\delta _{n}^{0},\delta _{n}^{0} ) \} \\&\quad \quad \times \, [ \widehat{\dot{g}} ( \varPhi _{i}^{\mathrm{T}}\delta _{n}^{0},\delta _{n}^{0} ) \varPhi _{i}+ \{ \widehat{\lambda } ( \delta _{n}^{0} ) ^{\mathrm{T}}/\partial \delta _{n} \} \mathbf{B}_{1} ( \varPhi _{i}^{\mathrm{T}} \delta _{n}^{0} ) ] ( 1+o_{p} ( 1 ) ) \\&\quad =-2 ( \varTheta _{1}+\varTheta _{2}+\varTheta _{3}+\varTheta _{4} ) ( 1+o_{p} ( 1 ) ), \end{aligned}$$

where

$$\begin{aligned} \varTheta _{1}&= \sum \limits _{i=1}^{n} \{ Y_{i}-g ( U_{i} ) \} \dot{g} ( U_{i} ) \{ \varPhi _{i}-E ( \varPhi _{i} \vert U_{i} ) \} , \nonumber \\ \varTheta _{2}&= \sum \limits _{i=1}^{n} \{ g ( U_{i} ) -\widehat{ g} ( \varPhi _{i}^{\mathrm{T}}\delta _{n}^{0},\delta _{n}^{0} ) \} \dot{g} ( U_{i} ) \{ \varPhi _{i}-E ( \varPhi _{i} \vert U_{i} ) \} , \nonumber \\ \varTheta _{3}&= \sum \limits _{i=1}^{n} \{ Y_{i}-g ( U_{i} ) \} \varDelta _{i}, \nonumber \\ \varTheta _{4}&= \sum \limits _{i=1}^{n} \{ g ( U_{i} ) -\widehat{ g} ( \varPhi _{i}^{\mathrm{T}}\delta _{n}^{0},\delta _{n}^{0} ) \} \varDelta _{i}. \end{aligned}$$
(27)

In the following, we will prove that \( \Vert \varTheta _{i} \Vert _{\infty }=o_{p} ( n^{1/2}J_{n,2}^{-1/2} ) \) for \(i=2,3,4\). By (8), we have \( \vert \widehat{g} ( \varPhi _{i}^{\mathrm{T}} \delta _{n}^{0},\delta _{n}^{0} ) -\widehat{g} ( U_{i} ) \vert =O ( J_{n,2}^{-\alpha }+m_{\min }^{-1} ) \). Moreover, we have \( \Vert \varPhi _{i} \Vert _{\infty }=O ( J_{n,2}^{-1}+m_{\min }^{-1} ) \). Thus,

$$\begin{aligned} \Vert \varTheta _{2}-\widetilde{\varTheta }_{2} \Vert _{\infty }=O \{ n ( J_{n,2}^{-\alpha }+m_{\min }^{-1} ) ( J_{n,2}^{-1}+m_{\min }^{-1} ) \} , \end{aligned}$$

where \(\widetilde{\varTheta }_{2}=\widetilde{\varTheta }_{12}+\widetilde{\varTheta } _{22}\),

$$\begin{aligned} \widetilde{\varTheta }_{12}&= \sum \limits _{i=1}^{n} \{ g ( U_{i} ) -\widehat{g}_{g} ( U_{i} ) \} \dot{g} ( U_{i} ) \{ \varPhi _{i}-E ( \varPhi _{i} \vert U_{i} ) \} , \\ \widetilde{\varTheta }_{22}&= -\sum \limits _{i=1}^{n}\widehat{g}_{e} ( U_{i} ) \dot{g} ( U_{i} ) \{ \varPhi _{i}-E ( \varPhi _{i} \vert U_{i} ) \} , \end{aligned}$$

in which \(\widehat{g}_{g} ( U_{i} ) =\mathbf{B}_{1} ( U_{i} ) ^{\mathrm{T}} ( \mathcal {B}^{\mathrm{T}}\mathcal {B} ) ^{-1} \mathcal {B}^{\mathrm{T}}\mathbf{g}_{n}\) and \(\widehat{g}_{e} ( U_{i} ) =\mathbf{B}_{1} ( U_{i} ) ^{\mathrm{T}} ( \mathcal {B} ^{\mathrm{T}}\mathcal {B} ) ^{-1}\mathcal {B}^{\mathrm{T}}\varepsilon _{n}\). By the law of large numbers and \( \vert g ( U_{i} ) - \widehat{g}_{g} ( U_{i} ) \vert =O_{p} ( J_{n,1}^{-\alpha }+J_{n,1}^{1/2}/n^{1/2} ) \), we have \( \Vert \widetilde{\varTheta } _{12} \Vert _{\infty }=o_{p} ( n^{1/2}J_{n,2}^{-1/2} ) \). Moreover,

$$\begin{aligned} \Vert \widetilde{\varTheta }_{22} \Vert _{\infty }&\le \Vert \sum \limits _{i=1}^{n}\mathbf{B}_{1} ( U_{i} ) \dot{g} ( U_{i} ) \{ \varPhi _{i}-E ( \varPhi _{i} \vert U_{i} ) \} \Vert _{\infty } \Vert ( \mathcal {B}^{\mathrm{T }}\mathcal {B} ) ^{-1}\mathcal {B}^{\mathrm{T}}\varepsilon _{n} \Vert _{_{\infty }} \\&= O_{p} ( ( \log n ) ^{1/2}J_{n,1}^{-1/2}n^{1/2} ) O_{p} ( ( \log n ) ^{1/2}J_{n,1}^{1/2}n^{-1/2} ) =O_{p} ( \log n ). \end{aligned}$$
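
The two rates in the bound for \(\widetilde{\varTheta }_{22}\) multiply to \(\log n\) because the powers of \(J_{n,1}\) and \(n\) cancel. Written out, and compared with the target order under the assumption \(J_{n,2}\ll n^{1/3} ( \log n ) ^{-1}\) (the rate condition invoked later for \(\varTheta _{4}\)):

$$\begin{aligned} ( \log n ) ^{1/2}J_{n,1}^{-1/2}n^{1/2}\times ( \log n ) ^{1/2}J_{n,1}^{1/2}n^{-1/2}&=\log n, \\ \frac{\log n}{n^{1/2}J_{n,2}^{-1/2}}= ( \log n ) J_{n,2}^{1/2}n^{-1/2}&\ll ( \log n ) ^{1/2}n^{-1/3}\rightarrow 0, \end{aligned}$$

so that \( \Vert \widetilde{\varTheta }_{22} \Vert _{\infty }=o_{p} ( n^{1/2}J_{n,2}^{-1/2} ) \) under that assumption.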

Therefore, for \(n^{1/2}J_{n,2}^{-\alpha -1/2}=o ( 1 ) \), \( n^{1/2}m_{\min }^{-1}J_{n,2}^{-1/2}=o ( 1 ) \) and \(n^{1/2}m_{\min }^{-2}=o ( 1 ) \), we have \( \Vert \varTheta _{2} \Vert _{\infty }=o_{p} ( n^{1/2}J_{n,2}^{-1/2} ) \). Similarly, it can be proved that \( \Vert \varTheta _{3} \Vert _{\infty }=o_{p} ( n^{1/2}J_{n,2}^{-1/2} ) \). By Proposition 1 and (26), for \(n^{1/ ( 2\alpha +1 ) }\ll J_{n,2}\ll n^{1/3} ( \log n ) ^{-1}\), \(n^{1/ ( 2\alpha +3 ) }\ll J_{n,1}\ll J_{n,2}\ll J_{n,1}^{2}\), and \(n^{1/2}m_{\min }^{-1}J_{n,2}^{-1/2}=o ( 1 ) \), we have

$$\begin{aligned} \Vert \varTheta _{4} \Vert _{\infty }&= n\times O_{p} ( J_{n,1}^{-\alpha }+J_{n,2}^{-\alpha }+m_{\min }^{-1}+ ( \log n ) ^{1/2}J_{n,1}^{1/2}n^{-1/2} ) \\&\times O_{p} ( J_{n,2}^{-1}J_{n,1}^{-1}+m_{\min }^{-1}+J_{n,2}^{-\alpha }+J_{n,1}^{1-\alpha }+ ( \log n ) ^{1/2}J_{n,1}^{1/2}n^{-1/2} \\&+ J_{n,1}J_{n,2}^{-\alpha -1}+J_{n,1}J_{n,2}^{-1}m_{\min }^{-1}+ ( \log n ) ^{1/2}J_{n,1}^{3/2}J_{n,2}^{-1}n^{-1/2} ) \\&= o_{p} ( n^{1/2}J_{n,2}^{-1/2} ). \end{aligned}$$

Proof of Theorem 1

By (20), we have

$$\begin{aligned} \sup _{t\in \mathcal {T}} \vert \widehat{\beta }_{k} ( t ) -\beta _{k,n} ( t ) \vert \asymp \Vert \widehat{\delta }-\widetilde{\delta }_{n}^{0} \Vert _{\infty }=O_{p} \{ ( \log n ) ^{1/2}J_{n,2}^{1/2}n^{-1/2} \} , \end{aligned}$$

and by (8) and \(n^{1/2}J_{n,2}^{-\alpha -1/2}=o ( 1 )\),

$$\begin{aligned} \sup _{t\in \mathcal {T}} \vert \widehat{\beta }_{k} ( t ) -\beta _{k} ( t ) \vert \le \sup _{t\in \mathcal {T}} \vert \widehat{\beta }_{k} ( t ) -\beta _{k,n} ( t ) \vert +\sup _{t\in \mathcal {T}} \vert \beta _{k,n} ( t ) -\beta _{k} ( t ) \vert \\ =O_{p} \{ ( \log n ) ^{1/2}J_{n,2}^{1/2}n^{-1/2}+J_{n,2}^{-\alpha } \} =O_{p} \{ ( \log n ) ^{1/2}J_{n,2}^{1/2}n^{-1/2} \}. \end{aligned}$$
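
The final equality absorbs the approximation error \(J_{n,2}^{-\alpha }\) into the stochastic term. A one-line check of this step, using only the stated condition \(n^{1/2}J_{n,2}^{-\alpha -1/2}=o ( 1 ) \):

$$\begin{aligned} \frac{J_{n,2}^{-\alpha }}{ ( \log n ) ^{1/2}J_{n,2}^{1/2}n^{-1/2}}= ( \log n ) ^{-1/2}n^{1/2}J_{n,2}^{-\alpha -1/2}=o ( 1 ). \end{aligned}$$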

Therefore, result (i) in Theorem 1 is proved. By (27), \(\partial \widetilde{L}_{n} ( \widetilde{\delta }_{n}^{0} ) /\partial \delta _{n}=-2 ( \varPi _{1}+\varTheta _{3}+\varTheta _{4} ) ( 1+o_{p} ( 1 ) ) \), where

$$\begin{aligned} \varPi _{1}=\sum \limits _{i=1}^{n} \{ Y_{i}-\widehat{g} ( \varPhi _{i}^{\mathrm{T}}\delta _{n}^{0},\delta _{n}^{0} ) \} \dot{g} ( U_{i} ) \{ \varPhi _{i}-E ( \varPhi _{i} \vert U_{i} ) \}. \end{aligned}$$

By (26),

$$\begin{aligned} \partial \varPi _{1}/\partial \delta ^{\mathrm{T}}=-\sum \nolimits _{i=1}^{n}\dot{g} ( U_{i} ) ^{2} \{ \varPhi _{i}-E ( \varPhi _{i} \vert U_{i} ) \} ^{\otimes 2}+o_{p} ( nJ_{n,2}^{-1} ). \end{aligned}$$

Therefore,

$$\begin{aligned} \partial \widetilde{L}_{n} ( \widetilde{\delta }_{n}^{0} ) /\partial \delta \partial \delta ^{\mathrm{T} }=2\sum \limits _{i=1}^{n}\dot{g} ( U_{i} ) ^{2} \{ \varPhi _{i}-E ( \varPhi _{i} \vert U_{i} ) \} ^{\otimes 2}+o_{p} ( nJ_{n,2}^{-1} ) . \end{aligned}$$

By Taylor expansion, Lemma 1, Bernstein’s inequality in Boor (2001) and the above result, we have

$$\begin{aligned} \widehat{\delta }-\widetilde{\delta }_{n}^{0}&= - \{ \partial L_{n} ( \widetilde{\delta }_{n}^{0} ) /\partial \delta \partial \delta ^{\mathrm{T}} \} ^{-1} \{ \partial \widetilde{L}_{n} ( \widetilde{\delta }_{n}^{0} ) /\partial \delta \} \{ 1+o_{p} ( 1 ) \} \nonumber \\&= \left\{ \sum \limits _{i=1}^{n}E ( \varPsi _{i}^{\otimes 2} ) \right\} ^{-1}\sum \limits _{i=1}^{n}\varepsilon _{i}\varPsi _{i}+o_{p} ( J_{n,2}^{1/2}n^{-1/2} ). \end{aligned}$$
(28)

Result (ii) follows from the Lindeberg–Feller central limit theorem and Slutsky’s theorem.\(\square \)

Proof of Theorem 2

By (20), Proposition 1 and the conditions in Theorem 2, we have \(\sup _{u\in \mathcal {I}} \vert \widehat{g} ( u;\widehat{\delta } ) -g ( u ) \vert =O_{p} \{ ( \log n ) ^{1/2}J_{n,2}^{1/2}n^{-1/2}+J_{n,1}^{-\alpha } \}\). \(\square \)

Proof of Theorem 3

Let \(\varXi _{i}=E ( \varPsi _{i}^{\otimes 2} ) \) and \(\varPi _{i}=E ( \sigma ^{2} ( U_{i} ) \varPsi _{i}^{\otimes 2} ) \). Let \(\mathbf{Z }_{1},\ldots ,\mathbf{Z}_{n}\) be independent random variables from \(\hbox {MVN} ( \mathbf{0},\mathbf{I}_{pJ_{n,2}\times pJ_{n,2}} ) \), where \( \mathbf{Z}_{i}= \{ Z_{i,sk} \} \). Define

$$\begin{aligned} \eta _{k} ( t )&= \sigma _{n,k}^{-1} ( t ) \mathbf{B} _{2} ( t ) ^{\mathrm{T}}\varvec{\Lambda }_{k} \left\{ n^{-1}\sum \limits _{i=1}^{n}\varXi _{i} \right\} ^{-1}n^{-1}\sum \limits _{i=1}^{n}\varepsilon _{i}\varPsi _{i}, \\ \eta _{k}^{0} ( t )&= \sigma _{n,k}^{-1} ( t ) \mathbf{B} _{2} ( t ) ^{\mathrm{T}}\varvec{\Lambda }_{k} \left\{ n^{-1}\sum \limits _{i=1}^{n}\varXi _{i}\right\} ^{-1}n^{-1}\sum \limits _{i=1}^{n}\varPi _{i}^{1/2}\mathbf{Z}_{i}. \end{aligned}$$

By the fact that \(\widehat{\beta }_{k}-\beta _{k,n}=\sigma _{n,k}^{-1} ( t ) \mathbf{B}_{2} ( t ) ^{\mathrm{T}}\varvec{\Lambda } _{k} ( \widehat{\delta }-\widetilde{\delta } _{n}^{0} ) \), (8) and (28), we have

$$\begin{aligned} \sup \nolimits _{t\in \mathcal {T}} \vert \widehat{\beta }_{k} ( t ) -\beta _{k} ( t ) -\eta _{k} ( t ) \vert =o_{p} ( J_{n,2}^{1/2}n^{-1/2} ). \end{aligned}$$
(29)

It is apparent that \(\eta _{k}^{0} ( t ) \) is a Gaussian process with \(E \{ \eta _{k}^{0} ( t ) \} \equiv 0\), Var\( \{ \eta _{k}^{0} ( t ) \} \equiv 1\), and covariance matrix given in (9). Therefore, we have

$$\begin{aligned} P \{ \sup \nolimits _{t\in \mathcal {T}} \vert \eta _{k}^{0} ( t ) \vert \le Q_{k} ( \alpha ) \} =1-\alpha . \end{aligned}$$
(30)
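
Relation (30) characterizes \(Q_{k} ( \alpha ) \) as the upper quantile of the supremum of a standardized Gaussian process. The following is a minimal Monte Carlo sketch of how such a quantile could be approximated on a grid. It is not the procedure of the paper: the function name `sup_quantile`, the toy cosine loadings, and the grid are all hypothetical stand-ins for \(\sigma _{n,k}^{-1} ( t ) \mathbf{B}_{2} ( t ) ^{\mathrm{T}}\varvec{\Lambda }_{k} \{ \cdot \} \).

```python
import numpy as np

def sup_quantile(loadings, alpha=0.05, n_draws=5000, seed=0):
    """Monte Carlo estimate of Q with P(sup_t |eta0(t)| <= Q) = 1 - alpha.

    loadings: (n_grid, d) array; row t holds a coefficient vector c(t), and
    eta0(t) = c(t)^T Z / ||c(t)|| with Z ~ N(0, I_d). (Hypothetical toy setup.)
    """
    rng = np.random.default_rng(seed)
    # Standardize each row so that Var{eta0(t)} = 1 at every grid point.
    unit = loadings / np.linalg.norm(loadings, axis=1, keepdims=True)
    z = rng.standard_normal((n_draws, loadings.shape[1]))
    eta = z @ unit.T                     # (n_draws, n_grid) simulated paths
    sup_abs = np.abs(eta).max(axis=1)    # supremum over the grid, per path
    return np.quantile(sup_abs, 1 - alpha), eta

# Toy loadings: a small cosine basis evaluated on a grid over [0, 1].
grid = np.linspace(0.0, 1.0, 101)
loadings = np.stack([np.cos(np.pi * j * grid) for j in range(6)], axis=1)
q, eta = sup_quantile(loadings)
```

Because the supremum dominates the process at any single point, the simulated quantile `q` is never smaller than the pointwise quantile, which is why a simultaneous band is wider than a pointwise one.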

Next, we will prove that \(\sup \nolimits _{t\in \mathcal {T}} \vert \eta _{k} ( t ) -\eta _{k}^{0} ( t ) \vert =o_{p} ( 1 ) \). Let \(\mathbf{e}_{i}= \{ e_{i,sk} \} =\varPi _{i}^{-1/2}\varepsilon _{i}\varPsi _{i}\). Denote \(\varPi _{i}^{1/2}= \{ \xi _{i,s^{\prime }k^{\prime },sk} \} \). There exists a constant \( 0<C<\infty \), such that \(\sup \vert \xi _{i,s^{\prime }k^{\prime },sk} \vert \le CJ_{n,2}^{-1/2}\). Then, \(E ( \mathbf{e}_{i} ) = \mathbf{0}\) and Var\( ( \mathbf{e}_{i} ) =\mathbf{I}_{pJ_{n,2}\times pJ_{n,2}}\). There exist \(s,s^{\prime },k,k^{\prime }\) such that

$$\begin{aligned} \left\| n^{-1}\sum \limits _{i=1}^{n}\varPi _{i}^{1/2} ( \mathbf{e}_{i}- \mathbf{Z}_{i} ) \right\| _{\infty }\le n^{-1}pJ_{n,2} \left| \sum \limits _{i=1}^{n}\xi _{i,s^{\prime }k^{\prime },s,k} ( e_{i,sk}-Z_{i,sk} ) \right| . \end{aligned}$$

For notational simplicity, let \(\xi _{i}=\xi _{i,s^{\prime }k^{\prime },s,k}\). Order all \(\xi _{i}\), \(1\le i\le n\), from the largest to the smallest such that \(\xi _{ ( 1 ) }\ge \xi _{ ( 2 ) }\ge \cdots \ge \xi _{ ( n ) }\). Moreover, \(Z_{i,sk}\) can be written as \( Z_{i,sk}=W ( i ) -W ( i-1 ) \), where \( \{ W ( s ) ,0\le s<\infty \} \) is a Wiener process that is a Borel function of \(Z_{i,sk}\). Let \(S_{i}=\sum \nolimits _{i^{\prime }=1}^{i}e_{i^{\prime },sk}\) and \(S_{0}=0\). Define \(M_{n}=\max _{1\le s\le n} \vert S_{s}-W(s) \vert \). By Theorem 2.6.2 in Csörgő and Révész (1981), we have \(M_{n}=O_{p} ( \log n ) \). Then,

$$\begin{aligned}&\left\| n^{-1}\sum \limits _{i=1}^{n}\varPi _{i}^{1/2} ( \mathbf{e} _{i}-\mathbf{Z}_{i} ) \right\| _{\infty } \\&\quad \le n^{-1}pJ_{n,2} \left\{ \vert \xi _{n} ( S_{n}-W ( n ) ) \vert + \left| \sum \limits _{i=1}^{n-1} ( \xi _{i}-\xi _{i+1} ) ( S_{i}-W ( i ) ) \right| \right\} \\&\quad \le n^{-1}pJ_{n,2}M_{n} \left( CJ_{n,2}^{-1/2}+\sum \limits _{i=1}^{n-1} \vert \xi _{i}-\xi _{i+1} \vert \right) \\&\quad =n^{-1}pJ_{n,2}M_{n} ( CJ_{n,2}^{-1/2}+ \vert \xi _{1}-\xi _{n} \vert ) \\&\quad \le 3Cn^{-1}pJ_{n,2}^{1/2}M_{n}=O_{p} ( J_{n,2}^{1/2}n^{-1}\log n ). \end{aligned}$$
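
The first two inequalities in this display are a summation-by-parts (Abel) identity followed by a telescoping bound for a monotone sequence. A small numerical sketch of both steps is given here; the arrays `xi`, `d` and `D` are hypothetical stand-ins for the ordered \(\xi _{i}\), the differences \(e_{i,sk}-Z_{i,sk}\), and the partial sums \(S_{i}-W ( i ) \), drawn at random purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200
xi = np.sort(rng.uniform(0.0, 1.0, n))[::-1]   # xi_(1) >= ... >= xi_(n)
d = rng.standard_normal(n)                     # stands in for e_i - Z_i
D = np.cumsum(d)                               # stands in for S_i - W(i)

# Abel summation: sum_i xi_i d_i = xi_n D_n + sum_{i<n} (xi_i - xi_{i+1}) D_i
lhs = np.sum(xi * d)
rhs = xi[-1] * D[-1] + np.sum((xi[:-1] - xi[1:]) * D[:-1])

# For a monotone sequence the absolute increments telescope to |xi_1 - xi_n|.
total_variation = np.sum(np.abs(xi[:-1] - xi[1:]))

# Resulting bound: |sum_i xi_i d_i| <= max_i |D_i| * (|xi_n| + |xi_1 - xi_n|)
bound = np.abs(D).max() * (abs(xi[-1]) + xi[0] - xi[-1])
```

With \( \vert \xi _{i} \vert \le CJ_{n,2}^{-1/2}\), the bracket is at most \(CJ_{n,2}^{-1/2}+2CJ_{n,2}^{-1/2}\), which is where the constant \(3C\) in the last line comes from.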

Therefore,

$$\begin{aligned}&\sup \nolimits _{t\in \mathcal {T}} \vert \eta _{k} ( t ) -\eta _{k}^{0} ( t ) \vert \nonumber \\&\quad \le \sup \nolimits _{t\in \mathcal {T}} \left\{ \vert \sigma _{n,k}^{-1} ( t ) \vert \sum \limits _{r=1}^{J_{n,2}} \vert B_{r,2} ( t ) \vert \right\} \left\| \varvec{\Lambda }_{k} \right\| _{\infty } \nonumber \\&\qquad \times \left\| \left\{ n^{-1}\sum \limits _{i=1}^{n}\varXi _{i} \right\} ^{-1} \right\| _{\infty } \left\| n^{-1}\sum \limits _{i=1}^{n}\varPi _{i}^{1/2} ( \mathbf{e}_{i}-\mathbf{Z}_{i} ) \right\| _{\infty } \nonumber \\&\quad =O_{p} ( n^{1/2}J_{n,2}^{-1/2} ) O_{p} ( J_{n,2} ) O_{p} ( J_{n,2}^{1/2}n^{-1}\log n ) \nonumber \\&\quad =O_{p} ( J_{n,2}n^{-1/2}\log n ) =o_{p} ( 1 ). \end{aligned}$$
(31)
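
The order in (31) follows by multiplying the three rates. As a worked check, with the final step assuming \(J_{n,2}\ll n^{1/3} ( \log n ) ^{-1}\) (the condition used in the preceding proof; any \(J_{n,2}=o ( n^{1/2}/\log n ) \) would suffice for this step):

$$\begin{aligned} n^{1/2}J_{n,2}^{-1/2}\times J_{n,2}\times J_{n,2}^{1/2}n^{-1}\log n&=J_{n,2}n^{-1/2}\log n \\&\ll n^{1/3} ( \log n ) ^{-1}n^{-1/2}\log n=n^{-1/6}\rightarrow 0. \end{aligned}$$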

Thus, Theorem 3 follows from (29), (30) and (31).\(\square \)

Cite this article

Ma, S. Estimation and inference in functional single-index models. Ann Inst Stat Math 68, 181–208 (2016). https://doi.org/10.1007/s10463-014-0488-3
