Abstract
We study a functional linear semiparametric model that extends both partially functional linear models and semiparametric models. We consider the case in which a response is related to a functional predictor and several scalar covariates, and the functional predictor is observed with noise at a set of discrete points. We propose a new estimation procedure that combines functional principal component analysis and B-spline methods to estimate the unknown parameters and functions in the model. We derive the asymptotic distribution of the estimators of the slope parameters, establish the global convergence rate of the estimator of the unknown slope function, and establish the convergence rate of the mean squared prediction error for a predictor. Simulation studies are conducted to investigate the finite-sample performance of the proposed estimators, and a real data example based on real estate data illustrates the proposed methodology.
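The two-stage procedure described above (functional principal component analysis for the functional predictor, a spline basis for the nonparametric component, then joint least squares) can be sketched numerically. The Python snippet below is an illustrative simulation, not the authors' implementation: the slope function, the nonparametric function, the sample sizes, and the truncation level are all hypothetical choices, and a low-degree polynomial basis stands in for B-splines for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 400, 101                      # number of curves, grid points on [0, 1]
t = np.linspace(0, 1, p)
dt = t[1] - t[0]

# True model components (illustrative choices, not from the paper)
a = lambda s: np.sin(2 * np.pi * s)  # slope function a(t)
f = lambda u: np.cos(np.pi * u)      # nonparametric component f(u)
beta = 1.5                           # scalar slope

# Functional predictors from a three-term Karhunen-Loeve expansion
phi = np.vstack([np.sqrt(2) * np.sin((j + 1) * np.pi * t) for j in range(3)])
lam = np.array([1.0, 0.5, 0.25])
scores = rng.normal(size=(n, 3)) * np.sqrt(lam)
X = scores @ phi                     # n x p discretized curves
Z = rng.normal(size=n)
U = rng.uniform(-1, 1, size=n)
Y = X @ a(t) * dt + beta * Z + f(U) + 0.1 * rng.normal(size=n)

# Stage 1: FPCA on the centered discretized curves
Xc = X - X.mean(axis=0)
K_hat = Xc.T @ Xc / n                # estimated covariance surface
w, v = np.linalg.eigh(K_hat * dt)    # discretized eigenproblem
order = np.argsort(w)[::-1][:3]
phi_hat = v[:, order].T / np.sqrt(dt)  # L2-normalized eigenfunctions
xi_hat = Xc @ phi_hat.T * dt           # estimated PC scores

# Stage 2: least squares on scores, Z, and a basis for f(U)
# (a cubic polynomial basis stands in for the paper's B-splines)
B = np.vander(U, 4)
D = np.column_stack([xi_hat, Z[:, None], B])
coef, *_ = np.linalg.lstsq(D, Y, rcond=None)
beta_hat = coef[3]
a_hat = coef[:3] @ phi_hat           # estimated slope function on the grid
print(beta_hat)
```

With these choices the recovered `beta_hat` lands close to the true value 1.5; the slope function is recovered up to the sign ambiguity inherent in estimated eigenfunctions.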
References
Cai T, Hall P (2006) Prediction in functional linear regression. Ann Stat 34:2159–2179
Cardot H, Ferraty F, Sarda P (2003) Spline estimators for the functional linear model. Stat Sin 13:571–591
Cardot H, Mas A, Sarda P (2007) CLT in functional linear models. Probab Theory Relat Fields 138:325–361
Carroll R, Fan J, Gijbels I, Wand M (1997) Generalized partially linear single-index models. J Am Stat Assoc 92:477–489
Chen D, Hall P, Müller H (2011) Single and multiple index functional regression models with nonparametric link. Ann Stat 39:1720–1747
Chen K, Jin Z (2006) Partial linear regression models for clustered data. J Am Stat Assoc 101:195–204
Chen K, Müller H-G (2012) Conditional quantile analysis when covariates are functions, with application to growth data. J R Stat Soc B 74:67–89
de Boor C (1978) A practical guide to splines. Springer, New York
Gao J, Lu Z, Tjøstheim D (2006) Estimation in semiparametric spatial regression. Ann Stat 34:1395–1435
Górecki T, Krzyśko M, Waszak Ł, Wołyński W (2018) Selected statistical methods of data analysis for multivariate functional data. Stat Pap 59:153–182
Hall P, Horowitz JL (2007) Methodology and convergence rates for functional linear regression. Ann Stat 35:70–91
Hall P, Müller H, Wang J (2006) Properties of principal component methods for functional and longitudinal data analysis. Ann Stat 34:1493–1517
Hsing T, Eubank R (2015) Theoretical foundations of functional data analysis, with an introduction to linear operators. Wiley, New York
Kato K (2012) Estimation in functional linear quantile regression. Ann Stat 40:3108–3136
Li Y, Hsing T (2010) Uniform convergence rates for nonparametric regression and principal component analysis in functional/longitudinal data. Ann Stat 38:3321–3351
Ramsay JO, Silverman BW (2005) Functional data analysis. Springer, New York
Reiss P, Ogden R (2010) Functional generalized linear models with images as predictors. Biometrics 66:61–69
Schumaker LL (1981) Spline functions: basic theory. Wiley, New York
Shin H (2009) Partial functional linear regression. J Stat Plan Inference 139:3405–3418
Shin H, Lee MH (2012) On prediction rate in partial functional linear regression. J Multivar Anal 103:93–106
Stone C (1985) Additive regression and other nonparametric models. Ann Stat 13:689–705
Tang Q (2013) B-spline estimation for semiparametric varying-coefficient partially linear regression with spatial data. J Nonparametr Stat 25:361–378
Tang Q (2015) Estimation for semi-functional linear regression. Statistics 49:1262–1278
Tang Q, Cheng L (2014) Partial functional linear quantile regression. Sci China Math 57(12):2589–2608
Wang G, Zhou J, Wu W, Chen M (2017) Robust functional sliced inverse regression. Stat Pap 58:227–245
Yao F, Müller H, Wang J (2005) Functional data analysis for sparse longitudinal data. J Am Stat Assoc 100:577–590
Yao F, Sue-Chee S, Wang F (2017) Regularized partially functional quantile regression. J Multivar Anal 156:39–56
Zhang J, Chen J (2007) Statistical inferences for functional data. Ann Stat 35:1052–1079
Zhou S, Shen X, Wolfe DA (1998) Local asymptotics for regression splines and confidence regions. Ann Stat 26:1760–1782
Acknowledgements
We wish to thank the Editor and two reviewers for their helpful comments and suggestions that led to substantial improvements in this paper. This work was supported by National Social Science Foundation of China (16BTJ019).
Electronic supplementary material
Appendix: Proofs of theorems
Let \(C>0\) denote a generic constant whose value may change from line to line. For a matrix \(\pmb {A}=(a_{ij})\), set \(\Vert \pmb {A}\Vert _{\infty }=\max _{i}\sum _{j}|a_{ij}|\) and \(|\pmb {A}|_{\infty }=\max _{i,j}|a_{ij}|\). For a vector \(\pmb {v}=(v_{1},\ldots ,v_{k})^{T}\), set \(\Vert \pmb {v}\Vert _{\infty }=\sum _{j=1}^{k}|v_{j}|\) and \(|\pmb {v}|_{\infty }=\max _{1\le j\le k}|v_{j}|\).
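For concreteness, the four norms just defined can be evaluated directly on small arrays; the NumPy example below uses arbitrary illustrative values.

```python
import numpy as np

A = np.array([[1.0, -2.0], [3.0, 0.5]])
v = np.array([1.0, -4.0, 2.0])

norm_A_inf = np.max(np.sum(np.abs(A), axis=1))  # max absolute row sum
abs_A_inf = np.max(np.abs(A))                   # largest entry in magnitude
norm_v_inf = np.sum(np.abs(v))                  # sum of absolute entries
abs_v_inf = np.max(np.abs(v))                   # largest entry in magnitude

print(norm_A_inf, abs_A_inf, norm_v_inf, abs_v_inf)  # 3.5 3.0 7.0 4.0
```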
Denote \(A_{l}=\sum _{j=1}^{\infty }a_{j}\xi _{lj}\), \({\tilde{A}}_{i}=A_{i}-\frac{1}{n}\sum _{l=1}^{n}A_{l}{\tilde{\zeta }}_{li}\), \(F_{i}=f(U_{i})\), \({\tilde{F}}_{i}=F_{i}-\frac{1}{n}\sum _{l=1}^{n}F_{l}{\tilde{\zeta }}_{li}\), \({\tilde{\varepsilon }}_{i}=\varepsilon _{i}-\frac{1}{n}\sum _{l=1}^{n}\varepsilon _{l}{\tilde{\zeta }}_{li}\) and \(\tilde{\pmb {A}}=({\tilde{A}}_{1},\ldots ,{\tilde{A}}_{n})^{T}\), \(\tilde{\pmb {F}}=({\tilde{F}}_{1},\ldots ,{\tilde{F}}_{n})^{T}\), \(\tilde{\pmb {\varepsilon }}=({\tilde{\varepsilon }}_{1},\ldots ,{\tilde{\varepsilon }}_{n})^{T}\).
We first state the following Lemmas A.1–A.8; their proofs are given in the supplementary material.
Lemma A.1
Let \(\Delta (s,t)={\hat{K}}(s,t)-K(s,t)\) and \(|\Vert \Delta \Vert |=(\int _{{\mathcal {T}}}\int _{{\mathcal {T}}}\Delta ^{2}(s,t)dsdt)^{1/2}\). Suppose that Assumptions 1–3 and 6 hold. Then
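The integrated norm \(|\Vert \Delta \Vert |\) of the covariance estimation error can be approximated by a Riemann sum on the observation grid. The following sketch simulates curves with a known two-term covariance and computes the discretized norm; all numerical choices are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 200, 61
t = np.linspace(0, 1, p)
dt = t[1] - t[0]

# Curves from a two-term expansion, so K(s, t) is known in closed form
phi1 = np.sqrt(2) * np.sin(np.pi * t)
phi2 = np.sqrt(2) * np.sin(2 * np.pi * t)
X = rng.normal(size=(n, 1)) * phi1 + np.sqrt(0.5) * rng.normal(size=(n, 1)) * phi2

K_true = np.outer(phi1, phi1) + 0.5 * np.outer(phi2, phi2)
Xc = X - X.mean(axis=0)
K_hat = Xc.T @ Xc / n                       # empirical covariance surface

# Discretized version of the double integral defining |||Delta|||
Delta = K_hat - K_true
hs_norm = np.sqrt(np.sum(Delta ** 2) * dt * dt)
print(hs_norm)
```

Rerunning with larger `n` shows the norm shrinking at roughly the root-n rate familiar from covariance estimation.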
Lemma A.2
Suppose that Assumptions 1–3 and 6 hold. Then
Lemma A.3
Suppose that Assumptions 1–6 hold. Then
where \(o_{p}(h_{0}^{2})\) holds uniformly for \(1\le k,k^{\prime }\le K_{n}\).
Lemma A.4
Under Assumptions 1–4, it holds that
Lemma A.5
Under Assumptions 1–3, it holds that
Lemma A.6
Under Assumptions 1–4 and 6, it holds that
Lemma A.7
Under Assumptions 1–4 and 6, it holds that
Lemma A.8
Define \({\check{a}}_{j}=\frac{1}{{\hat{\lambda }}_{j} }E[(Y-Z^{T}\beta _{0}-f(U))\xi _{j}]\). Under the assumptions of Theorem 3.2, it holds that
Proof of Theorem 3.1
Under Assumption 5, according to Corollary 6.21 of Schumaker (1981, p. 227), there exists a spline function \(f_{0}(u)=\sum _{k=1}^{K_{n}}b_{0k}B_{k}(u)\) and a constant \(C>0\) such that
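The spline approximation property invoked here (Corollary 6.21 of Schumaker 1981) can be observed numerically: for a smooth target, the least-squares cubic spline error shrinks rapidly as the knot spacing decreases. A small SciPy illustration, with an arbitrary target function and knot counts:

```python
import numpy as np
from scipy.interpolate import make_lsq_spline

x = np.linspace(0, 1, 400)
y = np.sin(2 * np.pi * x)            # smooth target, no noise

def sup_error(n_interior):
    # Cubic least-squares spline with equally spaced interior knots
    interior = np.linspace(0, 1, n_interior + 2)[1:-1]
    knots = np.r_[[0.0] * 4, interior, [1.0] * 4]
    spl = make_lsq_spline(x, y, knots, k=3)
    return np.max(np.abs(spl(x) - y))

errs = [sup_error(K) for K in (4, 8, 16)]
print(errs)
```

Halving the knot spacing cuts the sup-norm error by roughly a factor of \(2^{4}\), consistent with the \(O(h^{\rho '})\) approximation rate for sufficiently smooth functions.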
where \({\bar{f}}(u)=f(u)-f_{0}(u)\). Denote \({\bar{F}}_{i}={\bar{f}}(U_{i})\), \(\tilde{{\bar{F}}}_{i}={\bar{F}}_{i}-\frac{1}{n}\sum _{l=1}^{n}{\bar{F}}_{l}{\tilde{\xi }}_{li}\), \(\tilde{\bar{\pmb {F}}}=(\tilde{{\bar{F}}}_{1},\ldots ,\tilde{{\bar{F}}}_{n})^{T}\). By (2.14), we have
where \(\pmb {I}_{n}\) is the \(n\times n\) identity matrix. By arguments similar to those used in the proof of Lemma 1 of Tang (2013) and using Lemmas A.2 and A.3, we have
Similar to the proof of Lemma A.7, we have that \(n^{-1/2}|\sum _{i=1}^{n}{\tilde{A}}_{i}{\tilde{B}}_{k}(U_{i})|=o_{p}(h_{0})\) uniformly for \(1\le k\le K_{n}\). Hence by arguments similar to those used in the proof of Lemma 1 of Tang (2013), we obtain that
Using Lemma A.7 and (6.4), we deduce that
By Lemma A.2, we get that \(\sum _{i=1}^{n}{\tilde{Z}}_{ik}^{2}=O_{p}(n)\). Using (6.1) and the assumption that \(nh_{0}^{2\rho '}\rightarrow 0\), we have
By arguments similar to those used to prove Lemmas A.5 and A.6 and using (6.1), we deduce that \(n^{-\frac{1}{2}}\sum _{i=1}^{n}\left( \frac{1}{n}\sum _{l=1}^{n}{\bar{F}}_{l}{\tilde{\xi }}_{li}\right) {\tilde{Z}}_{ik}=o_{p}(1)\) and
Hence \(n^{-\frac{1}{2}}\sum _{i=1}^{n}{\tilde{Z}}_{ik}\tilde{{\bar{F}}}_{i}=o_{p}(1)\). By arguments similar to those used to prove (6.5), we further get that
We decompose \(\sum _{i=1}^{n}{\tilde{Z}}_{ik}\varepsilon _{i}\) into three terms as
Similar to the proof of Lemma A.6, we have \(\sum _{i=1}^{n}\varepsilon _{i} \frac{1}{n}\sum _{l=1}^{n}Z_{lk}({\tilde{\zeta }}_{li}-\vec {\zeta }_{li})=o_{p}(n)\). Since
\(\sum _{i=1}^{n}\varepsilon _{i}\sum _{j=1}^{m}\frac{\xi _{ij}}{\lambda _{j} }(\frac{1}{n}\sum _{l=1}^{n}Z_{lk}\xi _{lj}-E(Z_{lk}\xi _{j}))=o_{p}(n)\) and \(\sum _{i=1}^{n}\varepsilon _{i}\sum _{j=m+1}^{\infty }\mu _{kj}\xi _{ij} =o_{p}(n)\), then it follows by (6.6) that
By arguments similar to those used in the proof of Lemma 2 of Tang (2013), we deduce that
where \(\pmb {B}^{T}=(\pmb {B}_{1},\ldots ,\pmb {B}_{n})\) with \(\pmb {B}_{i}=(B_{1}(U_{i}), \ldots ,B_{K_{n}}(U_{i}))^{T}\) and \(\pmb {\varepsilon }=(\varepsilon _{1},\ldots ,\varepsilon _{n})^{T}\). Now (3.1) follows from (6.2), (6.3), (6.4), (6.6)–(6.9) and the central limit theorem. The proof of Theorem 3.1 is finished.
Proof of Theorem 3.2
Note that
and
Assumption 4 implies that \(m\sum _{j=1}^{m}a_{j}^{2}\Vert {\hat{\phi }}_{j}-\phi _{j}\Vert ^{2}=O_{p}(mn^{-1}\sum _{j=1}^{m}a_{j}^{2}j^{2}\log j)=o_{p}(m/n)\) and \(\sum _{j=m+1}^{\infty }a_{j}^{2}=O(m^{-2\gamma +1})\). Now (3.4) follows from Lemma A.8, (5.10) and (5.11). The proof of Theorem 3.2 is finished.
Proof of Theorem 3.3
By Assumption 7 and Lemma A.3, all the eigenvalues of \((\frac{K^{*}_{n}}{n}\tilde{\pmb {B}}^{*T}\tilde{\pmb {B}}^{*})^{-1}\) are bounded away from zero and infinity, except possibly on an event whose probability tends to zero. Similar to (6.1), there exists a spline function \(f^{*}(u)=\sum _{k=1}^{K^{*} _{n}}b_{0k}^{*}B_{k}^{*}(u)\) such that
Let \(\pmb {b}^{*}_{0}=(b^{*}_{01},\ldots ,b^{*}_{0K^{*}_{n}})^{T}\). Using the properties of B-splines (de Boor 1978), we obtain
By arguments similar to those used to prove (6.4)–(6.7) and using Theorem 3.1, we conclude that \(\Vert \hat{\pmb {b}}-\pmb {b}^{*} _{0}\Vert ^{2}=O_{p}(n^{-1}{K^{*}_{n}}^{2})\). Now (3.5) follows from (5.13) and the fact that \(h=O({K^{*}_{n}}^{-1})\). This completes the proof of Theorem 3.3.
Proof of Theorem 3.4
Observe that
where \(\Vert {\hat{a}}-a\Vert _{K}^{2}=\int _{{\mathcal {T}}}\int _{{\mathcal {T}} }K(s,t)[{\hat{a}}(s)-a(s)][{\hat{a}}(t)-a(t)]dsdt\). Similar to the proof of Lemma A.1, we obtain that
Under the assumptions of Theorem 3.4, using arguments similar to those used in the proof of Theorem 2 of Tang (2015), we deduce that
Write
Using Theorem 3.3, we obtain \( E([{\hat{f}}(U_{n+1})-f^{*}(U_{n+1})]^{2}|{\mathcal {S}}) =O_{p}(n^{-2\rho '/(2\rho '+1)})\). Using (6.12), we obtain \( E([f^{*}(U_{n+1})-f(U_{n+1})]^{2}|{\mathcal {S}})=O_{p}(h^{2\rho '})\). Hence, \(E([{\hat{f}}(U_{n+1})-f(U_{n+1})]^{2}|{\mathcal {S}})=O_{p}(n^{-2\rho '/(2\rho '+1)})\). Now (3.7) follows from (5.14)–(5.16), Assumption 6 and Theorem 3.1. This completes the proof of Theorem 3.4.
Qingguo, T., Minjie, B. Estimation for functional linear semiparametric model. Stat Papers 62, 2799–2823 (2021). https://doi.org/10.1007/s00362-020-01215-y