
On some stable linear functional regression estimators based on random projections


Abstract

In this work, we develop two stable estimators for solving linear functional regression (LFR) problems. Such a problem is a well-known example of an ill-posed stochastic inverse problem, so special attention must be paid to stability when designing an estimator for its solution. Our proposed estimators combine a stable least-squares technique with a random projection of the slope function \(\beta _0(\cdot )\in L^2(J),\) where J is a compact interval. Moreover, these estimators achieve fairly good convergence rates at a reasonable computational cost, since the involved random projections are generally performed over a fairly small dimensional subspace of \(L^2(J).\) More precisely, the first estimator is given as the least-squares solution of a regularized minimization problem over a finite dimensional subspace of \(L^2(J).\) In particular, we give an upper bound for the empirical risk error as well as the convergence rate of this estimator. The second proposed stable LFR estimator combines the least-squares technique with a dyadic decomposition of the i.i.d. samples of the stochastic process associated with the LFR model. In particular, we provide an \(L^2\)-risk error bound for this second LFR estimator. Finally, we provide numerical simulations on synthetic as well as real data that illustrate the results of this work. These results indicate that our proposed estimators are competitive with some existing and popular LFR estimators.
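
To fix ideas, here is a minimal end-to-end numerical sketch (our illustration, not the authors' code) of the first estimator: the functional data are expanded in a small orthonormal basis and the slope coefficients are obtained from ridge-regularized normal equations. The cosine basis, the decay profiles of \(\xi _k\) and \(\alpha _j\), and all parameter values are assumptions made only for this example.

```python
import numpy as np

rng = np.random.default_rng(0)
n, N, K, m = 200, 8, 20, 512            # samples, projection dim, model dim, grid
t = np.linspace(0.0, 1.0, m)            # compact interval J = [0, 1]
dt = t[1] - t[0]

# Orthonormal cosine basis phi_k on J; process X_i = sum_k xi_k Z_{i,k} phi_k.
Phi = np.array([np.sqrt(2.0) * np.cos((k + 1) * np.pi * t) for k in range(K)])
xi = 1.0 / (1.0 + np.arange(K))
Z = rng.uniform(-np.sqrt(3.0), np.sqrt(3.0), size=(n, K))   # centred, unit variance
X = (Z * xi) @ Phi

# Slope beta_0 with decaying coefficients; responses Y_i = <X_i, beta_0> + eps_i.
alpha = 1.0 / (1.0 + np.arange(K)) ** 2
beta0 = alpha @ Phi
Y = X @ beta0 * dt + 0.1 * rng.standard_normal(n)

# Design F[i, j] ~ <X_i, phi_j> for j < N, then ridge-regularized normal equations.
F = X @ Phi[:N].T * dt
lam = 1e-2
c_hat = np.linalg.solve(F.T @ F + lam * np.eye(N), F.T @ Y)
beta_hat = c_hat @ Phi[:N]              # estimate of beta_0 on the grid
```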


References

  • Amini AA (2021) Spectrally-truncated kernel ridge regression and its free lunch. Electron J Stat 15:3743–3761

  • Ben Saber A, Karoui A (2023) A distribution free truncated kernel ridge regression estimator and related spectral analyses. Preprint at arXiv:2301.07172

  • Bouka S, Dabo-Niang S, Nkiet GM (2023) On estimation and prediction in spatial functional linear regression model. Lith Math J 63:13–30

  • Cai T, Hall P (2006) Prediction in functional linear regression. Ann Stat 34:2159–2179

  • Cai T, Yuan M (2012) Minimax and adaptive prediction for functional linear regression. J Am Stat Assoc 107:1201–1216

  • Cardot H, Ferraty F, Sarda P (2003) Spline estimators for the functional linear model. Stat Sin 13:571–591

  • Casella G (1985) Condition numbers and minimax ridge regression estimators. J Am Stat Assoc 80:753–758

  • Chen D, Hall P, Müller H-G (2011) Single and multiple index functional regression models with nonparametric link. Ann Stat 39:1720–1747

  • Chen C, Guo S, Qiao X (2022) Functional linear regression: dependence and error contamination. J Bus Econ Stat 40(1):444–457

  • Cohen A, Davenport MA, Leviatan D (2013) On the stability and accuracy of least squares approximations. Found Comput Math 13(5):819–834

  • Crambes C, Kneip A, Sarda P (2009) Smoothing splines estimators for functional linear regression. Ann Stat 37:35–72

  • Du P, Wang X (2014) Penalized likelihood functional regression. Stat Sin 24(2):1017–1041

  • Escabias M, Aguilera AM, Valderrama MJ (2005) Modeling environmental data by functional principal component logistic regression. Environmetrics 16:95–107

  • Ferraty F (2014) Regression on functional data: methodological approach with application to near-infrared spectrometry. J Soc Fr Stat 155:100–120

  • Ferraty F, Vieu P (2006) Nonparametric functional data analysis: theory and practice. Springer, New York

  • Hall P, Horowitz JL (2007) Methodology and convergence rates for functional linear regression. Ann Stat 35(1):70–91

  • Hoerl AE, Kennard RW (1970) Ridge regression: biased estimation for nonorthogonal problems. Technometrics 12:55–67

  • Horn RA, Johnson CR (2013) Matrix analysis, 2nd edn. Cambridge University Press, Cambridge

  • James GM, Wang J, Zhu J (2009) Functional linear regression that's interpretable. Ann Stat 37:2083–2108

  • Maronna RA, Yohai VJ (2011) Robust functional linear regression based on splines. Comput Stat Data Anal 65:46–55

  • Masselot P, Dabo-Niang S, Chebana F, Ouarda T (2016) Streamflow forecasting using functional regression. J Hydrol 538:754–766

  • Mishra P, Verkleij T, Klont R (2021) Improved prediction of minced pork meat chemical properties with near-infrared spectroscopy by a fusion of scatter-correction techniques. Infrared Phys Technol 113:103643

  • Morris JS (2015) Functional regression. Ann Rev Stat Appl 2(1):321–359

  • Olver FW, Lozier DW, Boisvert RF, Clark CW (2010) NIST handbook of mathematical functions, 1st edn. Cambridge University Press, New York

  • Panaretos VM, Tavakoli S (2013) Cramér–Karhunen–Loève representation and harmonic principal component analysis of functional time series. Stoch Process Appl 123:2779–2807

  • Ramsay JO, Ramsey JB (2002) Functional data analysis of the dynamics of the monthly index of nondurable goods production. J Econom 107(1–2):327–344

  • Ramsay JO, Silverman BW (2005) Functional data analysis. Springer, New York

  • Ratcliffe SJ, Leader LR, Heller GZ (2002) Functional data analysis with application to periodically stimulated foetal heart rate data. I: functional regression. Stat Med 21:1103–1114

  • Shenk JS, Westerhaus MO (1991) Population definition, sample selection, and calibration procedures for near infrared reflectance spectroscopy. Crop Sci 31:469–474

  • Shin H, Hsing T (2012) Linear prediction in functional data analysis. Stoch Process Appl 122:3680–3700

  • Shin H, Lee S (2016) An RKHS approach to robust functional linear regression. Stat Sin 26:255–272

  • Tropp JA (2019) Matrix concentration and computational linear algebra. Caltech CMS Lecture Notes 2019-01

  • Vinod HD (1978) A survey of ridge regression and related techniques for improvements over ordinary least squares. Rev Econ Stat 60:121–131

  • Wang JL, Chiou JM, Müller HG (2016) Functional data analysis. Ann Rev Stat Appl 3(1):257–295

  • Wang D, Zhao Z, Yu Y, Willett R (2022) Functional linear regression with mixed predictors. J Mach Learn Res 23:1–94

  • Wesley IJ, Uthayakumaran S, Anderssen RS, Cornish GB, Bekes F, Osborne BG, Skerritt JH (1999) A curve-fitting approach to the near infrared reflectance measurement of wheat flour proteins which influence dough quality. J Near Infrared Spectrosc 7:229–240

  • Wu Y, Fan JQ, Müller HG (2010) Varying-coefficient functional linear regression. Bernoulli 16(3):730–758

  • Yuan M, Cai TT (2010) A reproducing kernel Hilbert space approach to functional linear regression. Ann Stat 38(6):3412–3444


Acknowledgements

The authors would like to thank very much the anonymous Referees for their valuable suggestions and comments that have greatly improved the first version of this work.

Author information


Corresponding author

Correspondence to Abderrazek Karoui.

Ethics declarations

Conflict of interest

No conflict of interest was reported by the authors.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

Proof of Theorem 1: From Eqs. (15) and (16), it is easy to see that

$$\begin{aligned} \Big [<X_i(\cdot ),{{\widehat{\beta }}}_{N,\lambda }(\cdot )>\Big ]_{1\le i\le n}^T= \Big ( F_{n,N} G_{N,\lambda }^{-1} F_{n,N}^T \Big )\cdot \Big [ <X_i(\cdot ),\beta _0(\cdot )>+\varepsilon _i\Big ]_{1\le i\le n}^T, \end{aligned}$$
(61)

where \(G_{N,\lambda }=F_{n,N}^T F_{n,N}+\lambda I_N.\) Consequently, we have

$$\begin{aligned} \Big [<X_i(\cdot ),{{\widehat{\beta }}}_{N,\lambda }(\cdot )>-<X_i(\cdot ),\beta _0(\cdot )>\Big ]_{1\le i\le n}^T&= \Big ( F_{n,N} G_{N,\lambda }^{-1} F_{n,N}^T -I_n \Big )\Big [<X_i(\cdot ),\beta _0(\cdot )>\Big ]_{1\le i\le n}^T\\&\quad +\Big ( F_{n,N} G_{N,\lambda }^{-1} F_{n,N}^T\Big )\Big [\varepsilon _i\Big ]_{1\le i\le n}^T. \end{aligned}$$

Let \(\pmb \varepsilon =\big (\varepsilon _1,\cdots ,\varepsilon _n\big )^T\) and let \(\beta _{0,N}\) be the projection of \(\beta _0(\cdot )\) over \(S_N=\text{ Span }\{\varphi _j, \, 1\le j\le N\}\), so that

$$\begin{aligned} \beta _{0,N}(s)=\sum _{j=1}^N\alpha _j\varphi _j(s), \quad s \in J, \;\alpha _j=<\beta _0(\cdot ), \varphi _j(\cdot )>. \end{aligned}$$
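
For concreteness, the truncated projection \(\beta _{0,N}\) can be computed numerically as in the following minimal sketch; the cosine basis and the Riemann-sum quadrature are our assumptions for illustration, not the paper's choices.

```python
import numpy as np

t = np.linspace(0.0, 1.0, 513)
dt = t[1] - t[0]
N = 8
Phi = np.array([np.sqrt(2.0) * np.cos((j + 1) * np.pi * t) for j in range(N)])
beta0 = np.sin(2.0 * np.pi * t) + 0.3 * t        # an arbitrary slope function

alpha = Phi @ beta0 * dt       # alpha_j = <beta_0, phi_j> by Riemann sum
beta0_N = alpha @ Phi          # projection of beta_0 over S_N, on the grid
```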

Then, by writing \(<X_i(\cdot ),\beta _0(\cdot )>=<X_i(\cdot ),\beta _{0,N}(\cdot )>+<X_i(\cdot ),\beta _0(\cdot )-\beta _{0,N}(\cdot )>\), one gets

$$\begin{aligned} \Big \Vert \Big [<X_i(\cdot ),{{\widehat{\beta }}}_{N,\lambda }(\cdot )>- <X_i(\cdot ),\beta _0(\cdot )>\Big ]^T_{1\le i\le n}\Big \Vert _n&\le \Big \Vert (F_{n,N} G_{N,\lambda }^{-1} F_{n,N}^T -I_n )F_{n,N}\pmb \alpha \Big \Vert _n\nonumber \\&\quad +\Big \Vert F_{n,N} G_{N,\lambda }^{-1} F_{n,N}^T\pmb \varepsilon \Big \Vert _n \nonumber \\&\quad +\Big \Vert (F_{n,N} G_{N,\lambda }^{-1} F_{n,N}^T -I_n)\pmb {\Delta _N}\Big \Vert _n, \end{aligned}$$
(62)

where \(\pmb \alpha =(\alpha _1,\cdots ,\alpha _N)^T\) and \(\pmb {\Delta _N}=\Big [<X_i(\cdot ),\beta _0(\cdot )-\beta _{0,N}(\cdot )>\Big ]_{1\le i\le n}^T.\) Next, the singular value decomposition of \(F_{n,N}\) is given by \(F_{n,N}=U\Sigma _{n,N}V^T.\) Here, U and V are orthogonal matrices and \(\Sigma _{n,N}\) is an \(n\times N\) rectangular diagonal matrix. Consequently, we have

$$\begin{aligned} F_{n,N}^TF_{n,N}=\big (U\Sigma _{n,N}V^T\big )^T \big (U\Sigma _{n,N}V^T\big )=V\Sigma _{n,N}^T\Sigma _{n,N}V^T. \end{aligned}$$

Let \(D_N=\Sigma _{n,N}^T\Sigma _{n,N}.\) Then,

$$\begin{aligned} (F_{n,N}^TF_{n,N}+\lambda I_N)^{-1}=\big (V(D_N+\lambda I_N\big )V^T)^{-1}=V(D_N+\lambda I_N)^{-1}V^T. \end{aligned}$$
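
This identity is easy to check numerically; the following short numpy sketch (our own check, with an arbitrary random matrix standing in for \(F_{n,N}\)) verifies it:

```python
import numpy as np

rng = np.random.default_rng(1)
n, N, lam = 50, 8, 0.1
F = rng.standard_normal((n, N))                     # stand-in for F_{n,N}

U, s, Vt = np.linalg.svd(F, full_matrices=False)    # F = U diag(s) V^T
lhs = np.linalg.inv(F.T @ F + lam * np.eye(N))
rhs = Vt.T @ np.diag(1.0 / (s ** 2 + lam)) @ Vt     # V (D_N + lam I_N)^{-1} V^T
assert np.allclose(lhs, rhs)
```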

Hence, we have

$$\begin{aligned} \Big \Vert \Big (F_{n,N} G_{N,\lambda }^{-1} F_{n,N}^T -I_n \Big )F_{n,N}\pmb \alpha \Big \Vert _n&=\frac{1}{\sqrt{n}}\Big \Vert \Big (U\Sigma _{n,N}V^TV(D_N+\lambda I_N)^{-1}V^TV\Sigma _{n,N}^TU^T-I_n\Big )U\Sigma _{n,N}V^T\pmb \alpha \Big \Vert _{\ell _2}\\&\le \frac{1}{\sqrt{n}}\Big \Vert \Big (\Sigma _{n,N}(D_N+\lambda I_N)^{-1}\Sigma _{n,N}^T-I_n\Big )\Sigma _{n,N}\Big \Vert _2\Big \Vert \pmb \alpha \Big \Vert _{\ell _2}\\&\le \Big \Vert \Big (\Sigma _{n,N}(D_N+\lambda I_N)^{-1}\Sigma _{n,N}^T-I_n\Big )\Sigma _{n,N}\Big \Vert _2\Big \Vert \pmb \alpha \Big \Vert _n \end{aligned}$$

with \(\big \Vert \pmb \alpha \big \Vert _n\le 1.\) Let \({\displaystyle \phi _{n}^\lambda =\Sigma _{n,N}(D_N+\lambda I_N)^{-1}\Sigma _{n,N}^T-I_n,}\) and recall that for an \(n\times N\) matrix A, \(\Vert A\Vert _2=\sigma _1(A)\) is the largest singular value of A. Then, we have

$$\begin{aligned} \Big \Vert \big (\Sigma _{n,N}(D_N+\lambda I_N)^{-1}\Sigma _{n,N}^T-I_n\big )\Sigma _{n,N}\Big \Vert _2=\big |\sigma _1\big (\phi _{n}^\lambda \Sigma _{n,N}\big )\big |. \end{aligned}$$
(63)

Note that \({\displaystyle \Sigma _{n,N}=\big [s_{i,j}\big ]_{\underset{1\le j\le N}{1\le i\le n}}, \quad s_{i,j}=\mu _j\delta _{i,j},}\) where \(\mu _j=\sqrt{\lambda _{j}(F_{n,N}^T F_{n,N})}\) denote the singular values of \(F_{n,N}.\) Consequently, we have

$$\begin{aligned} \phi _{n}^\lambda \Sigma _{n,N}=\big [\gamma _{i,j}\big ]_{\underset{1\le j\le N}{1\le i\le n}},\quad \gamma _{i,j}=\Big (\frac{{\mu }_j^2}{{\mu }_j^2+\lambda }-1\Big )\mu _j\delta _{i,j}=\frac{-\lambda \mu _j}{{\mu }_j^2+\lambda }\delta _{i,j}. \end{aligned}$$

Since \({\displaystyle |\sigma _1\big (\phi _{n}^\lambda \Sigma _{n,N}\big )|=\underset{1\le j\le N}{\max }\Big (\frac{\lambda }{{\mu }_j^2+\lambda }\Big )\mu _j}\) and since \({\displaystyle \underset{x\ge 0}{\sup }\frac{\lambda x}{x^2+\lambda }=\frac{\sqrt{\lambda }}{2}}\) (the map \(x\mapsto \frac{\lambda x}{x^2+\lambda }\) attains its maximum at \(x=\sqrt{\lambda }\)), we get \({\displaystyle \underset{1\le j\le N}{\max }\Big (\frac{\lambda }{{\mu }_j^2+\lambda }\Big )\mu _j\le \frac{\sqrt{\lambda }}{2}.}\) That is,

$$\begin{aligned} {\mathbb {E}} \Big [ \Big \Vert (F_{n,N} G_{N,\lambda }^{-1} F_{n,N}^T -I_n )F_{n,N}\pmb \alpha \Big \Vert ^2_n\Big ]\le \frac{\lambda }{4}. \end{aligned}$$
(64)

In the same way, one gets

$$\begin{aligned} \Big \Vert (F_{n,N} G_{N,\lambda }^{-1} F_{n,N}^T -I_n)\pmb {\Delta _N}\Big \Vert _n&\le \frac{1}{\sqrt{n}}\Big \Vert \Sigma _{n,N}(D_N+\lambda I_N)^{-1}\Sigma _{n,N}^T-I_n\Big \Vert _2\Big \Vert \pmb {\Delta _N}\Big \Vert _{\ell _2}\\&\le \Big \Vert \phi ^\lambda _n\Big \Vert _2\Big \Vert \pmb {\Delta _N}\Big \Vert _n, \end{aligned}$$

where

$$\begin{aligned} \Big \Vert \phi ^\lambda _n\Big \Vert _2=|\sigma _1(\phi ^\lambda _n)|=\underset{1\le j\le N}{\max }\Big |\frac{{\mu }_j^2}{{\mu }_j^2+\lambda }-1\Big |=\underset{1\le j\le N}{\max }\frac{\lambda }{{\mu }_j^2+\lambda }\le 1. \end{aligned}$$

On the other hand, we have, for \(1\le i\le n,\)

$$\begin{aligned} <X_i(\cdot ),\beta _0(\cdot )-\beta _{0,N}(\cdot )>&=\int _JX_i(s)\cdot \big (\beta _0-\beta _{0,N}\big )(s)ds\\&=\int _J\Big (\sum _{k=1}^\infty \xi _k Z_{i,k} \varphi _k(s)\Big )\Big (\sum _{j\ge N+1}\alpha _j\varphi _j(s) \Big )ds\\&=\sum _{j\ge N+1} \xi _jZ_{i,j}\alpha _j. \end{aligned}$$

Moreover, since

$$\begin{aligned} \Big \Vert \pmb {\Delta _N}\Big \Vert ^2_n=\frac{1}{n}\sum _{i=1}^n\big (\sum _{j\ge N+1} \xi _jZ_{i,j}\alpha _j\big )^2, \end{aligned}$$

then

$$\begin{aligned} {\mathbb {E}}\Big [\Big \Vert \Big (F_{n,N} G_{N,\lambda }^{-1} F_{n,N}^T -I_n\Big )\pmb {\Delta _N}\Big \Vert ^2_n\Big ] \le {\mathbb {E}} \Big [\Big (\sum _{j\ge N+1} \xi _jZ_{i,j}\alpha _j\Big )^2\Big ]. \end{aligned}$$
(65)

Next, let \({\displaystyle B_n= U\Sigma _{n,N}(D_N+\lambda I_N)^{-1}\Sigma _{n,N}^T U^T}\) and let \(\Vert B_n\Vert _F\) be its Frobenius norm, given by

$$\begin{aligned} \Vert B_n\Vert ^2_F=\sum _{i,j=1}^n (b_{ij})^2= \text{ Tr } (B_n^T B_n)=\sum _{i=1}^n \sigma _i^2(B_n). \end{aligned}$$

Since \(B_n\) has rank at most N, we get

$$\begin{aligned} \Vert B_n\Vert _F^2= \sum _{j=1}^N \Big (\frac{\mu _j^2}{\mu _j^2+\lambda }\Big )^2. \end{aligned}$$

Also, by using the fact that the \(\varepsilon _i\) are i.i.d. and independent of the \(Z_i\), with \({\mathbb {E}}_{\varepsilon }(\varepsilon _i)=0\) and \({\mathbb {E}}_{\varepsilon }(\varepsilon ^2_i)=\sigma ^2,\) one gets

$$\begin{aligned} \frac{1}{n}{\mathbb {E}}_{Z\times \varepsilon } \left( \big \Vert B_n \pmb {\varepsilon }\big \Vert ^2_{2}\right)= & {} \frac{1}{n} {\mathbb {E}}_Z{\mathbb {E}}_{\varepsilon } \sum _{i=1}^n \left( \sum _{j=1}^n b_{ij} \varepsilon _j\right) ^2 \nonumber \\= & {} \frac{1}{n} {\mathbb {E}}_Z \left[ \sum _{i,j=1}^n b^2_{ij} {\mathbb {E}}_{\varepsilon }(\varepsilon _j^2)\right] =\frac{\sigma ^2}{n}{\mathbb {E}}_Z \left[ \sum _{j=1}^N \Big (\frac{\mu _j^2}{\mu _j^2+\lambda }\Big )^2\right] . \end{aligned}$$
(66)
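
The following Monte Carlo sketch (our own check, for a fixed design so that the outer expectation \({\mathbb {E}}_Z\) is dropped; all sizes and the noise level are arbitrary assumptions) illustrates the identity in Eq. (66):

```python
import numpy as np

rng = np.random.default_rng(2)
n, N, lam, sigma = 60, 6, 0.05, 0.3
F = rng.standard_normal((n, N))                     # fixed (conditioned-on) design

U, s, Vt = np.linalg.svd(F, full_matrices=False)
B = U @ np.diag(s ** 2 / (s ** 2 + lam)) @ U.T      # B_n = F (F^T F + lam I)^{-1} F^T

eps = sigma * rng.standard_normal((50_000, n))      # i.i.d. noise vectors
mc = np.mean(np.sum((eps @ B) ** 2, axis=1)) / n    # B is symmetric, so eps @ B works
exact = sigma ** 2 / n * np.sum((s ** 2 / (s ** 2 + lam)) ** 2)
print(mc, exact)                                    # agree up to Monte Carlo error
```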

Finally, by combining Eqs. (62), (64)–(66) together with the fact that \((a+b)^2\le 2(a^2+b^2)\), one gets the desired result Eq. (20).

Proof of Theorem 2: Since the i.i.d. random variables \(Z_{i,j}\) are centred with variance \(\sigma _Z^2\), we have

$$\begin{aligned} {\mathbb {E}}(G_k)=\begin{bmatrix}\sigma _Z^2\xi _{2^{k-1}+1}^2 & & \\ & \ddots & \\ & & \sigma _Z^2\xi _{2^k}^2 \end{bmatrix}. \end{aligned}$$

Hence, we have

$$\begin{aligned} \mu _{\min }=\lambda _{\min }({\mathbb {E}}(G_k))=\sigma _Z^2\displaystyle \min _{\begin{array}{c} j\in I_k \end{array}}{\xi _j^2}, \quad \mu _{\max }=\lambda _{\max }({\mathbb {E}}(G_k))=\sigma _Z^2\displaystyle \max _{\begin{array}{c} j\in I_k \end{array}}{\xi _j^2}. \end{aligned}$$
(67)

Moreover, the \(2^{k-1}\)-dimensional random matrix \(G_k\) can be written in the following form

$$\begin{aligned} G_k=\sum _{i=1}^{n} {\textbf{H}}_{i,k}, \quad {\textbf{H}}_{i,k}=\frac{1}{n}\left[ \xi _j\xi _l Z_{i,j}Z_{i,l} \right] _{j,l\in I_k}. \end{aligned}$$

Note that each matrix \({\textbf{H}}_{i,k}\) is positive semi-definite. This follows from the fact that for any \(\pmb x\in {\mathbb {R}}^{2^{k-1}}\), we have

$$\begin{aligned} \pmb {x}^T {\textbf{H}}_{i,k}\pmb {x}=\pmb {x}^T B_{i,k}^T B_{i,k}\pmb {x}\ge 0, \quad B_{i,k}=\frac{1}{\sqrt{n}}[\xi _jZ_{i,j}]_{j\in I_k}. \end{aligned}$$

By using the Gershgorin circle theorem (see Horn and Johnson 2013), one gets

$$\begin{aligned} \lambda _{\max }({\textbf{H}}_{i,k})\le & {} \frac{1}{n}\max _{\begin{array}{c} j\in I_k \end{array}}{|\xi _jZ_{i,j}|}\displaystyle \sum _{l=2^{k-1}+1}^{2^k}|\xi _l Z_{i,l}|\\\le & {} \frac{M^2}{n}\max _{\begin{array}{c} j\in I_k \end{array}}{|\xi _j|}\displaystyle \sum _{l=2^{k-1}+1}^{2^k}|\xi _l|\le \frac{M_{\pmb {\xi }}}{n}. \end{aligned}$$
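
As a quick numerical illustration of this Gershgorin step (our own sketch, with an assumed block size and bounded \(Z_{i,j}\)'s; not part of the proof):

```python
import numpy as np

rng = np.random.default_rng(3)
n, p = 100, 8                              # p = 2^{k-1}, the size of the block I_k
xi = 1.0 / (1.0 + np.arange(p))            # xi_j over the block (assumed decay)
Z = rng.uniform(-1.0, 1.0, size=p)         # one sample row with |Z_{i,j}| <= M = 1

b = xi * Z
H = np.outer(b, b) / n                     # H_{i,k} = (1/n)[xi_j xi_l Z_j Z_l]
lam_max = np.linalg.eigvalsh(H)[-1]        # = ||b||^2 / n for a rank-one matrix
gersh = np.max(np.abs(b)) * np.sum(np.abs(b)) / n
assert lam_max <= gersh + 1e-12            # Gershgorin row-sum bound holds
```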

Hence, we have

$$\begin{aligned} 0 \preccurlyeq {\textbf{H}}_{i,k} \preccurlyeq \frac{M_{\pmb {\xi }}}{n}\, I_{2^{k-1}},\qquad \forall \, 1\le i\le n, \end{aligned}$$
(68)

with probability at least \((1-\delta _N).\) To conclude the proof of Eq. (38), it suffices to combine Eqs. (36), (37), (67) and (68), and to use the fact that \((1-\eta _1) (1-\eta _2) (1-\eta _3) > 1-\eta _1-\eta _2-\eta _3,\) where

$$\begin{aligned} 0&<\eta _1=2^{k-1}\exp \left( -n \, \delta ^2\frac{ \mu _{\min }}{3 M_{\pmb {\xi }} }\right)<1,\\ 0&<\eta _2=2^{k-1}\exp \left( -n \, \delta ^2\frac{ \mu _{\max }}{2 M_{\pmb {\xi }} }\right)<\eta _1,\qquad 0<\eta _3=\delta _N<1. \end{aligned}$$
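
To illustrate the concentration behaviour exploited in this proof, the following sketch (our own, with illustrative sizes and uniform \(Z_{i,j}\) of unit variance) compares the empirical \(\lambda _{\min }(G_k)\) with \(\sigma _Z^2\min _{j\in I_k}\xi _j^2\) from Eq. (67):

```python
import numpy as np

rng = np.random.default_rng(5)
n, p = 2000, 4
xi = np.array([0.5, 0.4, 0.3, 0.2])        # xi_j over one block (assumed values)
lam_mins = []
for _ in range(200):
    Z = rng.uniform(-np.sqrt(3.0), np.sqrt(3.0), size=(n, p))  # sigma_Z^2 = 1
    F_k = (xi * Z) / np.sqrt(n)
    lam_mins.append(np.linalg.eigvalsh(F_k.T @ F_k)[0])        # lambda_min(G_k)
print(np.mean(lam_mins), xi.min() ** 2)    # concentrates near sigma_Z^2 min xi_j^2
```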

Proof of Theorem 3: Let \({\textbf{c}}^k\) and \(\widehat{{\textbf{c}}}_{k}\) be the expansion coefficient vectors of \(\beta _0^k\) and \({{\widehat{\beta }}}_{n,N}^k,\) the orthogonal projections of \(\beta _0\) and \({{\widehat{\beta }}}_{n,N}\) over \(\text{ Span }\{\varphi _j(\cdot ),\, j\in I_k\}.\) That is

$$\begin{aligned} {{\textbf{c}}^k}&=G_k^{-1}\Big (F_{n,k}^T\frac{1}{\sqrt{n}}\big [Y_i^k-\varepsilon _i^k\big ]^T_{1\le i\le n}\Big )=G_k^{-1} F_{n,k}^T \frac{1}{\sqrt{n}}\big [\widetilde{Y}_i^k\big ]^T_{1\le i\le n},\\ \widehat{{\textbf{c}}}_{k}&=G_k^{-1} F_{n,k}^T\frac{1}{\sqrt{n}}\big [Y_i^k\big ]^T_{1\le i\le n}, \end{aligned}$$
(69)

where,

$$\begin{aligned} F_{n,k}=\frac{1}{\sqrt{n}}\big [\xi _jZ_{i,j}\big ]_{1\le i\le n,\, j\in I_k}, \quad \varepsilon _i^k=\frac{\Vert {\textbf{Y}}^k\Vert _{\ell _1}}{\Vert {\textbf{Y}}\Vert _{\ell _1}}\varepsilon _i,\quad 1\le i\le n,\quad 1 \le k\le K_N. \end{aligned}$$
(70)
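
For intuition, a hedged sketch of the resulting block-wise least-squares solves is given below (our illustration only: the dyadic index blocks are taken 0-based, the weights \(\gamma _k\) of Eq. (70) are omitted, and all data are synthetic assumptions):

```python
import numpy as np

rng = np.random.default_rng(4)
n, N = 400, 16                             # N = 2^4 coefficients, dyadic-friendly
xi = 1.0 / (1.0 + np.arange(N))
Z = rng.uniform(-np.sqrt(3.0), np.sqrt(3.0), size=(n, N))
alpha = 1.0 / (1.0 + np.arange(N)) ** 2    # true coefficients of beta_0
Y = (Z * xi) @ alpha + 0.05 * rng.standard_normal(n)

# 0-based dyadic blocks {0}, {1}, {2,3}, {4..7}, {8..15}.
blocks = [range(0, 1)] + [range(2 ** (k - 1), 2 ** k) for k in range(1, 5)]
c_hat = np.zeros(N)
for I_k in blocks:
    idx = np.array(I_k)
    F_k = (xi[idx] * Z[:, idx]) / np.sqrt(n)   # F_{n,k} = (1/sqrt(n))[xi_j Z_{i,j}]
    G_k = F_k.T @ F_k
    c_hat[idx] = np.linalg.solve(G_k, F_k.T @ Y / np.sqrt(n))
```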

In a similar manner, let \(\widetilde{{\textbf{c}}}_{k}\) be the expansion coefficient vector of \({\widetilde{\beta }}_{N,{ M_1}}^k\), the projection of \({\widetilde{\beta }}_{N,{ M_1}}\) over \(\text{ Span }\{\varphi _j(\cdot ),\, j\in I_k\}.\) For each \(1\le k\le K_N,\) let \(\Omega _{+,k}\) and \(\Omega _{-,k}\) be the sets of all possible draws \((X_1(\cdot ),\cdots , X_n(\cdot ))\) for which \(\lambda _{\min }(G_k)\ge \eta _k\) and \(0<\lambda _{\min }(G_k)\le \eta _k\), respectively. Then, we have

$$\begin{aligned} {\mathbb {E}}\Big [\Vert \widetilde{{\textbf{c}}}_{k}-{\textbf{c}}^k\Vert _{\ell _2}^2\Big ]\le \int _{\Omega _{+,k}}\int _J|({\widetilde{\beta }}_{N,{ M_1}}^k-\beta _0^k)(x)|^2\,dx\, d\pmb \rho +\int _{\Omega _{-,k}}\int _J|({\widetilde{\beta }}_{N,{ M_1}}^k-\beta _0^k)(x)|^2\,dx\, d\pmb \rho , \end{aligned}$$
(71)

where

$$\begin{aligned} \int _{\Omega _{-,k}}\int _J|({\widetilde{\beta }}_{N,{ M_1}}^k-\beta _0^k)(x)|^2dx d\pmb \rho \le \frac{4M_1^2}{n^r} \end{aligned}$$
(72)

and

$$\begin{aligned} \int _{\Omega _{+,k}}\int _J|({\widetilde{\beta }}_{N,{ M_1}}^k-\beta _0^k)(x)|^2dx d\pmb \rho \le \int _{\Omega _{+,k}}\int _J|({\widehat{\beta }}_{n,N}^k-\beta _0^k)(x)|^2dx d\pmb \rho . \end{aligned}$$
(73)

By using Parseval’s equality, we have from Eq. (69)

$$\begin{aligned} \int _J|({\widehat{\beta }}_{n,N}^k-\beta _0^k)(x)|^2dx=\Vert \widehat{{\textbf{c}}}_{k}-{\textbf{c}}^k\Vert _{\ell _2}^2\le \frac{1}{n}\Vert G_k^{-1}\Vert _2^2\Vert F_{n,k}^T\big (\mathbf {\widetilde{Y}}^k-{\textbf{Y}}^k\big )\Vert _{\ell _2}^2. \end{aligned}$$

So that on \(\Omega _{+,k},\) we obtain

$$\begin{aligned} \int _J|({\widehat{\beta }}_{n,N}^k-\beta _0^k)(x)|^2dx\le \frac{1}{n \eta _k^2}\Vert F_{n,k}^T\big (\mathbf {\widetilde{Y}}^k-{\textbf{Y}}^k\big )\Vert _{\ell _2}^2. \end{aligned}$$

But

$$\begin{aligned} F_{n,k}^T\big (\mathbf {\widetilde{Y}}^k-{\textbf{Y}}^k\big )=-\frac{1}{\sqrt{n}}\Bigg [\displaystyle \sum _{l=1}^{n}\varepsilon _l^k\xi _iZ_{l,i}\Bigg ]^T_{i\in I_k}=-\frac{\gamma _k}{\sqrt{n}}\Bigg [\displaystyle \sum _{l=1}^{n}\varepsilon _l\xi _iZ_{l,i}\Bigg ]^T_{i\in I_k}=\frac{\gamma _k}{\sqrt{n}}\Big [a_i\Big ]^T_{i\in I_k}. \end{aligned}$$

By using the hypotheses on the noises \(\varepsilon _i,\) it is easy to see that

$$\begin{aligned} {\mathbb {E}}\big [a_i^2\big ]={\mathbb {E}}\Bigg [ \sum _{l,j=1}^{n}\varepsilon _l\varepsilon _j\xi _l\xi _iZ_{l,i}Z_{j,i}\Bigg ]=\sigma ^2\sigma _Z^2\sum _{j=1}^{n}\xi _j^2. \end{aligned}$$

Consequently, we have

$$\begin{aligned} \int _J|({\widehat{\beta }}_{n,N}^k-\beta _0^k)(x)|^2dx\le \frac{|I_k|\gamma _k^2}{n^2\eta _k^2}\sigma ^2\sigma _Z^2\Vert \xi \Vert _{\ell _2}^2. \end{aligned}$$

The previous inequality, together with Eqs. (71)–(73) lead to

$$\begin{aligned} {\mathbb {E}}\Big [\Vert \widetilde{{\textbf{c}}}_{k}-{\textbf{c}}^k\Vert _{\ell _2}^2\Big ]\le \frac{|I_k|\gamma _k^2}{n^2\eta _k^2}\sigma ^2\sigma _Z^2\Vert \xi \Vert _{\ell _2}^2+\frac{4M_1^2}{n^r}. \end{aligned}$$
(74)

Now, since

$$\begin{aligned} \Vert {\widetilde{\beta }}_{N,{ M_1}}(\cdot )-\beta _0(\cdot )\Vert _{2}\le \Vert {\widetilde{\beta }}_{N,{ M_1}}(\cdot )-\beta _{0,N}(\cdot )\Vert _{2}+ \Vert \beta _{0,N}(\cdot )-\beta _0(\cdot )\Vert _{2}, \end{aligned}$$

then

$$\begin{aligned} {\mathbb {E}}\Big [\Vert {\widetilde{\beta }}_{N,{ M_1}}(\cdot )-\beta _0(\cdot )\Vert _{2}^2\Big ]\le {\mathbb {E}}\Big [2\Vert {\widetilde{\beta }}_{N,{ M_1}}(\cdot )-\beta _{0,N}(\cdot )\Vert _{2}^2\Big ]+{\mathbb {E}}\Big [2\Vert \beta _0(\cdot )-\beta _{0,N}(\cdot )\Vert _{2}^2\Big ]. \end{aligned}$$
(75)

Moreover, we have

$$\begin{aligned} \sum _{k=1}^{K_N} {\mathbb {E}}\Big [\Vert \widetilde{{\textbf{c}}}_{k}-{\textbf{c}}^k\Vert _{\ell _2}^2\Big ]= {\mathbb {E}}\Big [\sum _{j=1}^{N}|\widetilde{c}_j-c_j|^2\Big ]={\mathbb {E}}\Big [\Vert {\widetilde{\beta }}_{N,{ M_1}}(\cdot )-\beta _{0,N}(\cdot )\Vert ^2_2\Big ]. \end{aligned}$$
(76)

Also, since \(\Vert \beta _0(\cdot )-\beta _{0,N}(\cdot )\Vert _{2}\) is deterministic, we have

$$\begin{aligned} {\mathbb {E}}\Big [2\Vert \beta _0(\cdot )-\beta _{0,N}(\cdot )\Vert _{2}^2\Big ]=2\Vert \beta _0(\cdot )-\beta _{0,N}(\cdot )\Vert _{2}^2. \end{aligned}$$

Finally, by using the previous equality together with Eqs. (74)–(76), one gets the desired \(L^2\)-risk error bound for the estimator \({\widetilde{\beta }}_{N,M_1}\).

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Ben Saber, A., Karoui, A. On some stable linear functional regression estimators based on random projections. Stat Papers (2024). https://doi.org/10.1007/s00362-024-01554-0



  • DOI: https://doi.org/10.1007/s00362-024-01554-0
