On the effect of noisy measurements of the regressor in functional linear models

Abstract

We consider the estimation of the slope function in functional linear regression, where a scalar response Y is modelled in dependence on a random function X, and where Y and only a panel Z_1,…,Z_L of noisy measurements of X are observable. Assuming an i.i.d. sample of (Y,Z_1,…,Z_L) of size n, we propose an estimator of the slope which is based on a dimension reduction technique and additional thresholding. We derive, in terms of both the sample size n and the panel size L, a lower bound of a maximal weighted risk over a certain ellipsoid of slope functions and a certain class of covariance operators associated with the regressor X. It is shown that the proposed estimator attains this lower bound up to a constant and hence it is minimax-optimal. The results are illustrated considering different configurations, which cover in particular the estimation of the slope as well as its derivatives.

Notes

  1. The limit ‘‘∞’’ is admitted, with \(\lim_{n\to\infty} a_{n}=\infty :\Leftrightarrow \forall\, K>0\,\exists\, n_{o}\in \mathbb {N}\,\forall\, n\geq n_{o}:a_{n}\geq K\).

References

  • Bereswill M (2009) Minimax-optimal estimation in functional linear model with noisy regressor. Master’s thesis, Ruprecht-Karls-Universität Heidelberg

  • Bosq D (2000) Linear processes in function spaces. Lecture notes in statistics, vol 149. Springer, Berlin

  • Cardot H, Johannes J (2010) Thresholding projection estimators in functional linear models. J Multivar Anal 101(2):395–408

  • Cardot H, Ferraty F, Sarda P (1999) Functional linear model. Stat Probab Lett 45(1):11–22

  • Cardot H, Ferraty F, Sarda P (2003) Spline estimators for the functional linear model. Stat Sin 13:571–591

  • Cardot H, Mas A, Sarda P (2007) CLT in functional linear regression models. Probab Theory Relat Fields 138(3–4):325–361

  • Crambes C, Kneip A, Sarda P (2009) Smoothing splines estimators for functional linear regression. Ann Stat 37(1):35–72

  • Dahlhaus R, Polonik W (2006) Nonparametric quasi-maximum likelihood estimation for Gaussian locally stationary processes. Ann Stat 34(6):2790–2824

  • Davidson KR, Szarek SJ (2001) Local operator theory, random matrices and Banach spaces. In: Johnson WB, Lindenstrauss J (eds) Handbook on the geometry of Banach spaces, vol 1. North-Holland/Elsevier, Amsterdam, pp 317–366

  • Efromovich S, Koltchinskii V (2001) On inverse problems with unknown operators. IEEE Trans Inf Theory 47(7):2876–2894

  • Engl HW, Hanke M, Neubauer A (2000) Regularization of inverse problems. Kluwer Academic, Dordrecht

  • Ferraty F, Vieu P (2006) Nonparametric functional data analysis: methods, theory, applications and implementations. Springer, London

  • Forni M, Reichlin L (1998) Let’s get real: a factor analytical approach to disaggregated business cycle dynamics. Rev Econ Stud 65:453–473

  • Frank I, Friedman J (1993) A statistical view of some chemometrics regression tools. Technometrics 35:109–148

  • Hall P, Horowitz JL (2007) Methodology and convergence rates for functional linear regression. Ann Stat 35(1):70–91

  • Heinz E (1951) Beiträge zur Störungstheorie der Spektralzerlegung. Math Ann 123:415–438

  • Hoffmann M, Reiß M (2008) Nonlinear estimation for linear inverse problems with error in the operator. Ann Stat 36(1):310–336

  • Johannes J, Schenk R (2013) On rate optimal local estimation in functional linear regression. Electron J Stat 7:191–216

  • Korostolev AP, Tsybakov AB (1993) Minimax theory for image reconstruction. Lecture notes in statistics, vol 82. Springer, Berlin

  • Marx BD, Eilers PH (1999) Generalized linear regression on sampled signals and curves: a p-spline approach. Technometrics 41:1–13

  • Meister A (2011) Asymptotic equivalence of functional linear regression and a white noise inverse problem. Ann Stat 39(3):1471–1495

  • Müller H-G, Stadtmüller U (2005) Generalized functional linear models. Ann Stat 33:774–805

  • Natterer F (1984) Error bounds for Tikhonov regularization in Hilbert scales. Appl Anal 18:29–37

  • Neubauer A (1988) When do Sobolev spaces form a Hilbert scale? Proc Am Math Soc 103(2):557–562

  • Preda C, Saporta G (2005) PLS regression on a stochastic process. Comput Stat Data Anal 48:149–158

  • Ramsay J, Silverman B (2005) Functional data analysis, 2nd edn. Springer, New York

  • Yao F, Müller H-G, Wang J-L (2005) Functional linear regression analysis for longitudinal data. Ann Stat 33(6):2873–2903

Acknowledgements

We are grateful to two referees and the Associate Editor for constructive criticism.

Author information

Corresponding author

Correspondence to Jan Johannes.

Additional information

Support from the IAP Research Network P7/06 of the Belgian State (Belgian Science Policy) is gratefully acknowledged.

Appendix A: Proofs

We begin by defining and recalling notation to be used in the proofs of this section. Given m≥1, \(\mathbb {H}_{m}\) denotes the subspace of \(\mathbb {H}\) spanned by the functions {ψ_1,…,ψ_m}. Π_m and \(\varPi_{m}^{\perp}\) denote the orthogonal projections on \(\mathbb {H}_{m}\) and its orthogonal complement \(\mathbb {H}_{m}^{\perp}\), respectively. If K is an operator mapping \(\mathbb {H}\) to itself and if we restrict Π_m K Π_m to an operator from \(\mathbb {H}_{m}\) to itself, then it can be represented by a matrix \([K]_{{\underline {m}}}\) with generic entries \(\langle \psi _{j},K\psi _{l}\rangle_{\mathbb {H}}=:[K]_{j,l}\) for 1≤j,l≤m. The spectral norm of \([K]_{{\underline {m}}} \) is denoted by \(\lVert[K]_{{\underline {m}}}\rVert_{s}\) and the inverse matrix of \([K]_{{\underline {m}}}\) by \([K]_{{\underline {m}}}^{-1}\). We denote by \(\operatorname{\lVert\cdot\rVert}\) the Euclidean norm, by \([\mathop{\nabla}\nolimits _{\omega }]_{{\underline {m}}}\) the m-dimensional diagonal matrix with entries (ω_1,…,ω_m) and by \([{\rm Id}]_{{\underline {m}}}\) the m-dimensional identity matrix.

Consider the Galerkin solution \(\beta ^{m}\in \mathbb {H}_{m}\) and \(h\in \mathbb {H}_{m}\); then the random variables \(\langle \beta - \beta ^{m},X\rangle_{\mathbb {H}}\) and \(\langle X,h\rangle_{\mathbb {H}}\) are jointly normally distributed and independent because \(\mathbb{E}[\langle \beta - \beta ^{m},X\rangle_{\mathbb {H}}\langle X,h\rangle_{\mathbb {H}}]= \langle \varGamma (\beta - \beta ^{m}),h\rangle_{\mathbb {H}}=[h]_{{\underline {m}}}^{t} [\varGamma (\beta -\beta ^{m})]_{{\underline {m}}}=[h]_{{\underline {m}}}^{t} ([g]_{{\underline {m}}}- [\varGamma ]_{{\underline {m}}}[ \beta ^{m}]_{{\underline {m}}})=0\).

Recall that \(\widehat {X}_{1}^{(i)}:=\frac{2}{L}\sum_{\ell=1}^{L/2} Z_{\ell}^{(i)}\) and \(\widehat {X}_{2}^{(i)}:=\frac{2}{L}\sum_{\ell=L/2+1}^{L} Z_{\ell}^{(i)}\), where \([\widehat {X}_{1}^{(i)}]_{{\underline {m}}}\) and \([\widehat {X}_{2}^{(i)}]_{{\underline {m}}}\) are jointly normally distributed with marginal mean vector zero, marginal covariance matrix \([\varSigma]_{{\underline {m}}}:= [\varGamma ]_{{\underline {m}}}+2\varsigma ^{2}L^{-1} [{\rm Id}]_{{\underline {m}}}\) and cross-covariance matrix \([\varGamma ]_{{\underline {m}}}\). Moreover, it follows that \(U^{(i)}:= Y^{(i)} -\langle \beta ^{m},\widehat {X}_{2}^{(i)}\rangle_{\mathbb {H}}\) and \([\widehat {X}_{1}^{(i)}]_{{\underline {m}}}\) are independent, normally distributed with mean zero, and, respectively, variance \(\rho^{2}_{m}:=\sigma ^{2}+ \langle \varGamma (\beta -\beta ^{m}),(\beta - \beta ^{m})\rangle_{\mathbb {H}}+ \frac{2\varsigma ^{2}}{L}\lVert \beta ^{m}\rVert_{\mathbb {H}}^{2}\) and covariance matrix \([\varSigma]_{{\underline {m}}}= [\varGamma ]_{{\underline {m}}}+2\varsigma ^{2}L^{-1} [{\rm Id}]_{{\underline {m}}}\). Note that \(\lVert[\varSigma]_{{\underline {m}}}^{-1}\rVert_{s}=(\lVert [\varGamma ]_{m}^{-1}\rVert _{s}^{-1}+2\varsigma ^{2}L^{-1})^{-1}\) and \(\lVert[\varSigma]_{{\underline {m}}}^{1/2} [\varGamma ]_{{\underline {m}}}^{-1}[\varSigma]_{{\underline {m}}}^{1/2}\rVert_{s}=(1+2\varsigma ^{2}L^{-1}\lVert [\varGamma ]_{m}^{-1}\rVert_{s})\).
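To fix ideas, the following minimal Python sketch (all parameters, the eigenvalue sequence and the slope coefficients are illustrative choices, not those of the paper, and the additional thresholding step of the estimator is omitted) simulates the model, forms the split panel means \(\widehat {X}_{1}^{(i)}\), \(\widehat {X}_{2}^{(i)}\), anticipates the empirical matrices \([\widehat {\varGamma }]_{{\underline {m}}}\) and \([\widehat {g}]_{{\underline {m}}}\) introduced in the next paragraph, and numerically checks the two spectral-norm identities just stated.

```python
import numpy as np

rng = np.random.default_rng(0)
n, L, m = 500, 10, 5                      # sample size, panel size, dimension (illustrative)
gamma = 1.0 / np.arange(1, m + 1) ** 2    # illustrative eigenvalues of Gamma on H_m
beta_m = 1.0 / np.arange(1, m + 1)        # illustrative Galerkin coefficients [beta^m]_m
sigma, varsigma = 0.5, 0.3                # noise levels sigma (response) and varsigma (panel)

# coefficients [X^(i)]_m of the regressor in the basis psi_1, ..., psi_m
X = rng.normal(size=(n, m)) * np.sqrt(gamma)
# panel Z_l^(i) = X^(i) + varsigma * white noise, l = 1, ..., L
Z = X[:, None, :] + varsigma * rng.normal(size=(n, L, m))
Y = X @ beta_m + sigma * rng.normal(size=n)

# sample splitting: average the first and the second half of the panel separately
X1 = Z[:, : L // 2, :].mean(axis=1)       # [X_hat_1^(i)]_m, covariance Sigma = Gamma + 2*varsigma^2/L * Id
X2 = Z[:, L // 2 :, :].mean(axis=1)       # [X_hat_2^(i)]_m

Gamma_hat = X1.T @ X2 / n                 # [Gamma_hat]_m, unbiased for [Gamma]_m
g_hat = X1.T @ Y / n                      # [g_hat]_m, unbiased for [g]_m
print(np.round(np.linalg.solve(Gamma_hat, g_hat), 2))   # close to beta_m for large n

# numerical check of the two spectral-norm identities stated above
Gam = np.diag(gamma)
Sig = Gam + 2 * varsigma ** 2 / L * np.eye(m)
spec = lambda M: np.linalg.norm(M, 2)     # spectral norm
# ||Sigma^{-1}||_s = (||Gamma^{-1}||_s^{-1} + 2*varsigma^2/L)^{-1}
print(spec(np.linalg.inv(Sig)), (1 / spec(np.linalg.inv(Gam)) + 2 * varsigma ** 2 / L) ** -1)
# ||Sigma^{1/2} Gamma^{-1} Sigma^{1/2}||_s = 1 + 2*varsigma^2/L * ||Gamma^{-1}||_s
# (Sig is diagonal here, so its elementwise square root is its matrix square root)
print(spec(np.sqrt(Sig) @ np.linalg.inv(Gam) @ np.sqrt(Sig)),
      1 + 2 * varsigma ** 2 / L * spec(np.linalg.inv(Gam)))
```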
Moreover, \([\widehat {\varGamma }]_{{\underline {m}}}=\frac {1}{n}\sum _{i=1}^{n} [\widehat {X}^{(i)}_{1}]_{{\underline {m}}}[\widehat {X}^{(i)}_{2}]_{{\underline {m}}}^{t}\) and \([\widehat {g}]_{{\underline {m}}}= \frac{1}{n}\sum_{i=1}^{n} Y^{(i)}[\widehat {X}^{(i)}_{1}]_{{\underline {m}}}\) satisfy \(\mathbb{E} [\widehat {\varGamma }]_{{\underline {m}}}= [\varGamma ]_{{\underline {m}}}\) and \(\mathbb{E}[\widehat {g}]_{{\underline {m}}}=[g]_{{\underline {m}}}\). Define the random matrix \([\varXi]_{{\underline {m}}}\) and the random vector \([W]_{{\underline {m}}}\) respectively by

where \(\mathbb{E}[\varXi]_{{\underline {m}}}= 0\) and \(\mathbb{E}[W]_{{\underline {m}}}=0\). Moreover, we define the events

Observe that \(\mho_{n,L} \subset\Omega_{n,L}\) for all L≥2 and \(n \geq2 \lVert [\varGamma ]_{m}^{-1}\rVert_{s}\). Indeed, on the event \(\mho _{n,L}\), i.e., \(\lVert[\varXi]_{{\underline {m}}}\rVert_{s}\lVert[\varSigma]_{{\underline {m}}}^{1/2}[\varGamma ]_{{\underline {m}}}^{-1}[\varSigma]_{{\underline {m}}}^{1/2}\rVert_{s}\leq1/2\), the identity

(A.1)

implies by the usual Neumann series argument that

$$\big\lVert [\widehat {\varGamma }]_{\underline {m}}^{-1}\big\rVert_{s} \leq2 \big\lVert[\varSigma]_{{\underline {m}}}^{1/2} [\varGamma ]_{\underline {m}}^{-1}[\varSigma]_{{\underline {m}}}^{1/2}\big\rVert_{s}\big\lVert[\varSigma]_{{\underline {m}}}^{-1}\big\rVert_{s}. $$

Moreover, we have

$$\big\lVert[\varSigma]_{{\underline {m}}}^{1/2} [\varGamma ]_{\underline {m}}^{-1}[\varSigma]_{{\underline {m}}}^{1/2}\big\rVert_{s}\big\lVert[\varSigma]_{{\underline {m}}}^{-1}\big\rVert _{s}=\big\lVert [\varGamma ]_m^{-1}\big\rVert_{s}. $$

Thus, if \(n \geqslant2\lVert [\varGamma ]_{m}^{-1}\rVert_{s}\), then \(\mho_{n,L} \subset\Omega_{n,L}\) for all L≥2. These results will be used below without further reference. The technical Lemmas A.2–A.6, which are used in the following proofs, are gathered at the end of this section. Furthermore, we denote by C universal numerical constants and by C(⋅) constants depending only on their arguments. In both cases, the values of the constants may change from line to line.
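For the reader's convenience, here is a minimal sketch of the Neumann-series step behind the last two displays; it is written under the assumption (the definition is not reproduced above) that \([\varXi]_{{\underline {m}}}=[\varSigma]_{{\underline {m}}}^{-1/2}([\widehat {\varGamma }]_{{\underline {m}}}-[\varGamma ]_{{\underline {m}}})[\varSigma]_{{\underline {m}}}^{-1/2}\), so that the identity (A.1) can be read as a factorisation of \([\widehat {\varGamma }]_{{\underline {m}}}\):

$$[\widehat {\varGamma }]_{\underline {m}}=[\varSigma]_{\underline {m}}^{1/2}\bigl(M_{\underline {m}}+[\varXi]_{\underline {m}}\bigr)[\varSigma]_{\underline {m}}^{1/2},\qquad M_{\underline {m}}:=[\varSigma]_{\underline {m}}^{-1/2}[\varGamma ]_{\underline {m}}[\varSigma]_{\underline {m}}^{-1/2}. $$

On the event \(\mho_{n,L}\) one has \(\lVert M_{\underline {m}}^{-1}[\varXi]_{\underline {m}}\rVert_{s}\leq\lVert[\varSigma]_{{\underline {m}}}^{1/2}[\varGamma ]_{{\underline {m}}}^{-1}[\varSigma]_{{\underline {m}}}^{1/2}\rVert_{s}\lVert[\varXi]_{{\underline {m}}}\rVert_{s}\leq1/2\), so the Neumann series gives \(\lVert({\rm Id}+M_{\underline {m}}^{-1}[\varXi]_{\underline {m}})^{-1}\rVert_{s}\leq2\) and hence

$$\big\lVert [\widehat {\varGamma }]_{\underline {m}}^{-1}\big\rVert_{s}=\big\lVert[\varSigma]_{\underline {m}}^{-1/2}\bigl({\rm Id}+M_{\underline {m}}^{-1}[\varXi]_{\underline {m}}\bigr)^{-1}M_{\underline {m}}^{-1}[\varSigma]_{\underline {m}}^{-1/2}\big\rVert_{s}\leq2\big\lVert[\varSigma]_{{\underline {m}}}^{1/2}[\varGamma ]_{\underline {m}}^{-1}[\varSigma]_{{\underline {m}}}^{1/2}\big\rVert_{s}\big\lVert[\varSigma]_{{\underline {m}}}^{-1}\big\rVert_{s}. $$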

1.1 A.1 Proof of the consistency result

Proof of Proposition 3.1

We use the identity

and obtain

(A.2)

where \(\lVert \beta ^{m}-\beta \rVert_{\omega }^{2}=o(1)\) as m→∞ due to condition (3.1); we bound the remaining two terms on the right-hand side separately. Consider the last right-hand side term. From (3.2) it follows, on the one hand, that \(n \geqslant2 \lVert [\varGamma ]_{{\underline {m}}}^{-1}\rVert_{s}\) for all sufficiently large n (and hence \(\Omega_{n,L}^{c}\subset\mho_{n,L}^{c}\)) and, on the other hand, by employing (A.24) in Lemma A.6, that \(\lVert \beta \rVert_{\omega }^{2}P(\Omega_{n,L}^{c})=o(1)\) as n→∞ for all \(\beta \in \mathcal {F}_{\omega }\). Regarding the first right-hand side term in (A.2), we have

By using , the identity (A.1) and

it follows that

By employing Lemma A.6 and (A.21)–(A.23) we conclude that

(A.3)
(A.4)

Keeping in mind condition (3.2) we deduce from (A.24) in Lemma A.6 that \(n m^{2}(P(\mho _{n,L}^{c}))^{1/2}=O(1)\) which in turn implies and completes the proof. □

Proof of Corollary 3.2

First, we prove that \(\varGamma \in \mathcal {G}_{\gamma }^{d}\) implies (3.1). On the one hand, we have \(\lVert\varPi _{m}^{\perp} \beta \rVert_{\omega }=o(1)\) as m→∞ by Lebesgue’s dominated convergence theorem. On the other hand, from the identity \([\varPi_{m} \beta - \beta ^{m}]_{{\underline {m}}} = -[\varGamma ]_{{\underline {m}}}^{-1}[\varGamma \varPi_{m}^{\perp} \beta ]_{{\underline {m}}}\) we conclude \(\lVert\varPi_{m} \beta - \beta ^{m}\rVert_{\omega }^{2} \leq2(1+d^{2})\lVert\varPi_{m}^{\perp} \beta \rVert_{\omega }^{2}\) for all \(\varGamma \in \mathcal {G}_{\gamma }^{d}\). By combining the two results, we obtain the assertion. It remains to show that (3.2) can be substituted by (3.4). If \(\varGamma \in \mathcal {G}_{\gamma }^{d}\) then

due to (A.12) and (A.13) in Lemma A.2. Taking into account these bounds the condition (3.4) implies (3.2), which proves the result. □

1.2 A.2 Proof of the lower bound

Proof of Theorem 3.3

Consider i.i.d. standard normally distributed random variables {U_j}_{j≥1} and ϵ. Let \(X:=\sum_{j\geq1}\gamma _{j}^{1/2}\,U_{j}\,\psi _{j}\), which is a centred Gaussian random function with associated covariance operator Γ belonging to \(\mathcal {G}_{\gamma }^{d}\) and having eigenfunctions given by the basis {ψ_j}_{j≥1}. Then \([X]_{j}=\langle X,\psi _{j}\rangle_{\mathbb {H}}\), j≥1, are independent and normally distributed random variables with mean zero and variance γ_j. Let \(Z_{\ell}=X+\varsigma \,\dot{B}_{\ell}\), ℓ=1,…,L, be a panel of noisy observations of X, where \(\dot{B}_{1},\dotsc, \dot{B}_{L}\) are independent Gaussian white noises, i.e., \([\dot{B}_{\ell}]_{j}=\langle \dot{B}_{\ell},\psi_{j}\rangle _{\mathbb {H}}\), j≥1, ℓ=1,…,L, are i.i.d. standard normally distributed random variables, which are independent of ϵ and X. Consequently, \(\frac{1}{L}\sum_{\ell =1}^{L}[Z_{\ell}]_{j}\), j≥1, are independent and normally distributed with mean zero and variance γ_j+ς²L^{-1}. Consider \(\theta\in\{-1,1\}^{{m^{*}}}\) where \({m^{*}}:={m^{*}_{n,L}}\) is defined in (3.6). Let u be an m^*-dimensional vector with coefficients u_j to be chosen below such that

$$ \sum_{j=1}^{{m^*}}b_ju_j^2\leq r\quad\mbox{and}\quad\frac {2n}{\sigma ^2} \frac{\gamma _j^2 u_j^2}{(\gamma _j+\varsigma ^2\,L^{-1})}\leq1,\quad j=1,\dotsc,{m^*}. $$
(A.5)

Then for each θ the slope function \(\beta ^{\theta}=\sum _{j=1}^{{m^{*}}}\theta_{j}u_{j}\psi _{j}\) belongs to \(\mathcal {F}_{b}^{r}\). Moreover, let \(Y=\langle \beta ^{\theta},X\rangle_{\mathbb {H}}+\sigma \,\epsilon \); then (Y,Z_1,…,Z_L) obeys model (1.1a)–(1.1b). Consider an i.i.d. sample \(\{(Y^{(i)},Z_{1}^{(i)},\dotsc ,Z_{L}^{(i)})\}_{i=1}^{n}\) from (Y,Z_1,…,Z_L) of size n and denote its joint distribution by \(P_{\theta}^{n}\). Furthermore, for j=1,…,m^* and each θ we introduce θ^{(j)} by \(\theta^{(j)}_{k}=\theta_{k}\) for k≠j and \(\theta ^{(j)}_{j}=-\theta_{j}\). Since, in the case of \(P_{\theta}^{n}\), the conditional distribution of Y^{(i)} given \(Z_{1}^{(i)},\dotsc ,Z_{L}^{(i)}\) is Gaussian with conditional mean

$$\mu_u^\theta:=\sum _{j=1}^{{m^*}}\frac{\theta_j\gamma _j u_j}{\gamma _j+\varsigma ^2\,L^{-1}}\frac{1}{L}\sum_{\ell=1}^L\bigl[Z_\ell ^{(i)}\bigr]_j $$

and conditional variance

$$\sigma^2_u:=\sigma ^2+ \sum _{j=1}^{{m^*}}\frac{\varsigma ^2\,L^{-1}}{\gamma _j+\varsigma ^2\,L^{-1}}u_j^2 \geq \sigma ^2 $$

it is easily seen that the log-likelihood of \(P_{\theta^{(j)}}^{n}\) w.r.t. \(P_{\theta}^{n}\) is

and its expectation w.r.t. \(P_{\theta}^{n}\) satisfies \(\mathbb{E}_{P_{\theta }^{n}}[\log(dP_{\theta^{(j)}}^{n}/dP_{\theta}^{n})]\geq-\frac{2n}{\sigma ^{2}} \frac{\gamma _{j}^{2} u_{j}^{2}}{(\gamma _{j}+\varsigma ^{2}\,L^{-1})}\) because \(\sigma^{2}_{u}\geq \sigma ^{2}\). In terms of Kullback–Leibler divergence this means \(KL(P_{\theta^{(j)}}^{n},P_{\theta}^{n}) \leq\frac{2n}{\sigma ^{2}} \frac {\gamma _{j}^{2} u_{j}^{2}}{(\gamma _{j}+\varsigma ^{2}\,L^{-1})}\). Since the Hellinger distance \(H(P_{\theta^{(j)}}^{n},P_{\theta}^{n})\) satisfies \(H^{2}(P_{\theta^{(j)}}^{n},P_{\theta}^{n}) \leqslant KL(P_{\theta ^{(j)}}^{n},P_{\theta}^{n})\), from (A.5) it follows that

$$ H^2(P_{\theta^{(j)}}^n,P_{\theta}^n) \leqslant\frac{2n}{\sigma ^2} \frac{\gamma _j^2 u_j^2}{(\gamma _j+\varsigma ^2\,L^{-1})}\leqslant1,\quad j=1,\dotsc,{m^*}. $$
(A.6)
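As a side remark (a sketch only, since the log-likelihood display itself is not reproduced above), the conditional mean and the Kullback–Leibler bound follow from standard Gaussian conditioning. Since \(([X]_{j},\frac{1}{L}\sum_{\ell=1}^{L}[Z_{\ell}]_{j})\) is centred Gaussian, where \(\frac{1}{L}\sum_{\ell=1}^{L}[Z_{\ell}]_{j}\) has variance \(\gamma _{j}+\varsigma ^{2}L^{-1}\) and covariance \(\gamma _{j}\) with \([X]_{j}\), we have \(\mathbb{E}[[X]_{j}\mid Z_{1},\dotsc,Z_{L}]=\frac{\gamma _{j}}{\gamma _{j}+\varsigma ^{2}L^{-1}}\,\frac{1}{L}\sum_{\ell=1}^{L}[Z_{\ell}]_{j}\), which yields \(\mu_{u}^{\theta}\). The measures \(P_{\theta}^{n}\) and \(P_{\theta^{(j)}}^{n}\) share the distribution of the panel and the conditional variance \(\sigma^{2}_{u}\), and their conditional means differ only through the sign of θ_j, so that

$$KL(P_{\theta^{(j)}}^n,P_{\theta}^n)=\frac{n}{2\sigma^2_u}\,\mathbb{E}\biggl[\biggl(\frac{2 u_j\gamma _j}{\gamma _j+\varsigma ^2L^{-1}}\,\frac{1}{L}\sum_{\ell=1}^L\bigl[Z_\ell\bigr]_j\biggr)^2\biggr]=\frac{2n\,\gamma _j^2u_j^2}{\sigma^2_u\,(\gamma _j+\varsigma ^2L^{-1})}\leq\frac{2n}{\sigma ^2}\,\frac{\gamma _j^2u_j^2}{\gamma _j+\varsigma ^2L^{-1}}, $$

in accordance with the bound used above.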

Consider the Hellinger affinity \(\rho(P_{\theta^{(j)}}^{n},P_{\theta }^{n})= \int\sqrt{dP_{\theta^{(j)}}^{n}dP_{\theta}^{n}}\), then we obtain for any estimator \(\widetilde {\beta }\) of β that

(A.7)

Due to the identity \(\rho(P_{\theta^{(j)}}^{n},P_{\theta}^{n})=1-\frac {1}{2}H^{2}(P_{\theta^{(j)}}^{n},P_{\theta}^{n})\) combining (A.6) with (A.7) yields

$$ \bigl\{\mathbb{E}_{P_{\theta^{(j)}}^n}\big|[\widetilde {\beta }-\beta ^{\theta ^{(j)}}]_j\big|^2+ \mathbb{E}_{P_{\theta}^n}\big|[\widetilde {\beta }-\beta ^\theta]_j\big|^2\bigr\} \geqslant\frac {1}{2} u_j^2,\quad j=1,\dotsc, {m^*}. $$

From this we conclude for each estimator \(\widetilde {\beta }\) that

(A.8)

We will obtain the claimed result of the theorem by evaluating (A.8) for two special choices of the vector u satisfying (A.5), which we will construct in the following. Define \(\zeta:=\Delta\min (r,\frac{\sigma ^{2}}{2})\) with Δ given by (3.8). We distinguish in the following the two cases: (i) \(\sum_{j=1}^{{m^{*}}}\frac{\omega _{j}}{\gamma _{j}}\geq\sum_{j=1}^{{m^{*}}}\frac{\varsigma ^{2}\,\omega _{j}}{L\, \gamma _{j}^{2}}\), and (ii) \(\sum_{j=1}^{{m^{*}}}\frac{\omega _{j}}{\gamma _{j}}< \sum _{j=1}^{{m^{*}}}\frac{\varsigma ^{2}\,\omega _{j}}{L\, \gamma _{j}^{2}}\). Consider first (i). Given \(\alpha:= {R^{*}_{n,L}}(\sum_{j=1}^{{m^{*}}}\frac{\omega _{j}}{n\gamma _{j}})^{-1}\leq\Delta^{-1}\) (by employing (3.8)), let u be the vector with coefficients \(u_{j}=(\zeta\, \alpha\, n^{-1})^{1/2} \gamma _{j}^{-1/2}\), which satisfies the condition (A.5). Indeed, since b/ω is monotonically increasing and by using successively the definitions of α, Δ and ζ, it follows that \(\sum_{j=1}^{{m^{*}}}b_{j}u_{j}^{2}\leq\zeta\, \frac{b_{{m^{*}}}}{\omega _{{m^{*}}}}\alpha\sum_{j=1}^{{m^{*}}}\frac{\omega _{j}}{n\gamma _{j}}= \zeta\, \frac{b_{{m^{*}}}}{\omega _{{m^{*}}}} {R^{*}_{n,L}}\leq \zeta\,\Delta^{-1}\leq r\) and \(\frac{2n}{\sigma ^{2}} \frac{\gamma _{j}^{2} u_{j}^{2}}{(\gamma _{j}+\varsigma ^{2}\,L^{-1})}\leq\frac{2n}{\sigma ^{2}} \gamma _{j} u_{j}^{2}= \frac{2}{\sigma ^{2}} \zeta\alpha\leq \frac{2}{\sigma ^{2}} \zeta\Delta^{-1}\leq1\) for j=1,…,m^*. Consequently, by evaluating (A.8) we obtain in case (i) the result of the theorem:

$$ \sup_{\beta \in \mathcal {F}_{b}^r}\sup_{\varGamma \in \mathcal {G}_{\gamma }^d} \mathbb{E}\big\lVert \widetilde {\beta }-\beta \big\rVert^2_\omega \geq\frac{1}{4} \sum _{j=1}^{{m^*}} \omega _j u_j^2= \frac{1}{4} \zeta\, \alpha\,\sum_{j=1}^{{m^*}} \frac{\omega _j}{n\gamma _j} = \frac{\Delta}{4}\, \min\biggl(r,\frac{\sigma ^2}{2}\biggr)\, {R^*_{n,L}}. $$
(A.9)

On the other hand, in case (ii) let \(\alpha:= {R^{*}_{n,L}}(\sum _{j=1}^{{m^{*}}}\frac{\varsigma ^{2}\omega _{j}}{Ln\gamma _{j}})^{-1}\leq\Delta ^{-1}\) and u be the vector with coefficients \(u_{j}=(\zeta\, \alpha\, n^{-1} \, \varsigma ^{2}\,L^{-1})^{1/2} \gamma _{j}^{-1}\) satisfying (A.5), because \(\sum_{j=1}^{{m^{*}}}b_{j}u_{j}^{2}\leq\zeta\, \frac{b_{{m^{*}}}}{\omega _{{m^{*}}}}\alpha\sum_{j=1}^{{m^{*}}}\frac{\varsigma ^{2}\omega _{j}}{nL\gamma _{j}}= \zeta\, \frac{b_{{m^{*}}}}{\omega _{{m^{*}}}} {R^{*}_{n,L}}\leq r\) and \(\frac{2n}{\sigma ^{2}} \frac{\gamma _{j}^{2} u_{j}^{2}}{(\gamma _{j}+\varsigma ^{2}\, L^{-1})}\leq\frac{2n}{\sigma ^{2}} \frac{\gamma _{j}^{2} u_{j}^{2}}{\varsigma ^{2}\,L^{-1}} = \frac{2}{\sigma ^{2}} \zeta\alpha\leq1\) for j=1,…,m^*. From (A.8) it follows that

$$ \sup_{\beta \in \mathcal {F}_{b}^r}\sup_{\varGamma \in \mathcal {G}_{\gamma }^d} \mathbb{E}\big\lVert \widetilde {\beta }-\beta \big\rVert^2_\omega \geq\frac{1}{4} \sum _{j=1}^{{m^*}} \omega _j u_j^2= \frac{1}{4} \zeta\, \alpha\,\sum_{j=1}^{{m^*}} \frac{\varsigma ^2\omega _j}{nL\gamma _j} = \frac{\Delta}{4}\, \min\biggl (r,\frac{\sigma ^2}{2}\biggr) {R^*_{n,L}}. $$
(A.10)

This proves the claimed result in case (ii) and completes the proof. □

1.3 A.3 Proof of the upper bound

The following technical lemma is used in the proof of Theorem 3.5.

Lemma A.1

If the assumptions of Theorem 3.5 hold true, then there exists a constant \(K:=K(\varsigma ,\mathcal {F}_{b}^{r}, \mathcal {G}_{\gamma }^{d})\) depending on ς and the classes \(\mathcal {F}_{b}^{r}\) and \(\mathcal {G}_{\gamma }^{d}\) only such that (i) \(n^{2}({m^{*}_{n,L}})^{4}P(\mho ^{c}_{n,L})\leq K\) and (ii) \(nP(\Omega^{c}_{n,L})\leq K\) for all n≥1,L≥2.

Proof

We observe that \(n^{-1}b_{{m^{*}_{n,L}}}\omega _{{m^{*}_{n,L}}}^{-1} \max(\omega _{{m^{*}_{n,L}}}\gamma _{{m^{*}_{n,L}}}^{-1}, \varsigma ^{2}L^{-1}\omega _{{m^{*}_{n,L}}}\gamma _{{m^{*}_{n,L}}}^{-2})\leq \Delta^{-1}\) for all n,L≥1 by exploiting condition (3.8) and that \(\lVert[\varSigma]_{{\underline {m}}}^{1/2} [\varGamma ]_{{\underline {m}}}^{-1}[\varSigma]_{{\underline {m}}}^{1/2}\rVert_{s}\) \(\leq C(d) \max(1, \varsigma ^{2}L^{-1}\gamma _{m}^{-1})\) and \(\lVert [\varGamma ]_{{\underline {m}}}^{-1}\rVert_{s}\leq C(d) \gamma _{m}^{-1}\) for all \(\varGamma \in \mathcal {G}_{\gamma }^{d}\) and m≥1 due to (A.12) in Lemma A.2. Combining the estimates we obtain \(\lVert[\varSigma]_{{\underline {m}^{*}_{n,L}}}^{1/2} [\varGamma ]_{{\underline {m}^{*}_{n,L}}}^{-1}[\varSigma]_{{\underline {m}^{*}_{n,L}}}^{1/2}\rVert_{s}^{2} \leq C(d) \gamma _{{m^{*}_{n,L}}}^{2} b_{{m^{*}_{n,L}}}^{-2} n^{2} \Delta ^{-2}\) and \(\lVert [\varGamma ]_{{\underline {m}}}^{-1}\rVert_{s}\leq C(d) b_{{m^{*}_{n,L}}}^{-1} n \Delta^{-1}\) for all \(\varGamma \in \mathcal {G}_{\gamma }^{d}\) and n,L≥1. Consider each bound separately. On the one hand, from the first bound and the first condition in (3.9) follows

$$\sup_{L\geq1}(\log {m^*_{n,L}})^2({m^*_{n,L}})^2\big\lVert[\varSigma]_{{\underline {m}^*_{n,L}}}^{1/2}[\varGamma ]_{{\underline {m}^*_{n,L}}}^{-1}[\varSigma]_{{\underline {m}^*_{n,L}}}^{1/2}\big\rVert _{s}^2=o(n/\log n) $$

which in turn, by employing (A.24) in Lemma A.6, implies \(n^{2}({m^{*}_{n,L}})^{4}P(\mho_{n,L}^{c})\leq K\) for some constant \(K:=K(\varsigma , \mathcal {F}_{b}^{r}, \mathcal {G}_{\gamma }^{d})\) depending on ς and the classes \(\mathcal {F}_{b}^{r}\) and \(\mathcal {G}_{\gamma }^{d}\) only. On the other hand, combining the second bound with \(b_{{m^{*}_{n,L}}}^{-1}=o(1)\) as n→∞ due to Assumption 2.1, we conclude \(2\lVert [\varGamma ]_{{\underline {m}^{*}_{n,L}}}^{-1}\rVert_{s} =o(n)\). Therefore, there exists an integer n_o such that for all n≥n_o we have \(\mho_{n,L}\subset\Omega_{n,L}\). We distinguish in the following the cases n<n_o and n≥n_o. Consider first n≥n_o; from (i) we obtain \(nP(\Omega^{c}_{n,L})\leq nP(\mho _{n,L}^{c})\leqslant K \). On the other hand, if n<n_o then trivially \(P(\Omega^{c}_{n,L})\leq n^{-1}n_{o}\). Since n_o depends on ς and the classes \(\mathcal {F}_{b}^{r}\) and \(\mathcal {G}_{\gamma }^{d}\) only, we obtain the assertion (ii) by combining both cases, which completes the proof. □

Proof of Theorem 3.5

Let \(K:=K(\varsigma ,\mathcal {F}_{b}^{r}, \mathcal {G}_{\gamma }^{d})\) denote a constant depending on ς and the classes \(\mathcal {F}_{b}^{r}\) and \(\mathcal {G}_{\gamma }^{d}\) only, which changes from line to line. Consider the decomposition (A.2), where \(\lVert \beta - \beta ^{m}\rVert_{\omega}^{2} \leq C(d) \, r\, \omega _{m}b_{m}^{-1}\) for all m≥1 by employing (A.15) in Lemma A.2 together with Assumption 2.1, i.e., γω^{-1} is monotonically nonincreasing, and \(n\lVert \beta \rVert_{\omega }^{2}P(\Omega_{n,L}^{c})\leq K(\varsigma , \mathcal {F}_{b}^{r}, \mathcal {G}_{\gamma }^{d}) \) due to Lemma A.1; we obtain

From the last estimate and the definition of \({R^{*}_{n,L}}\) given in (3.6) we conclude

(A.11)

On the other hand, combining (A.3), \(\lVert [\varSigma]_{{\underline {m}}}\rVert_{s}=\lVert[\varGamma ]_{{\underline {m}}}\rVert_{s}+\varsigma ^{2}L^{-1}\leq d+\varsigma ^{2}L^{-1}\), \(\rho_{m}^{2}\leq \sigma ^{2} + C(d) r(1 + 2\varsigma ^{2}L^{-1})\) for all m≥1 ((A.14) and (A.16) in Lemma A.2), \(\mathbb{E}\lVert X\rVert_{\mathbb {H}}^{2}\leq d\sum_{j\geq1}\gamma _{j}\) and Lemma A.1 we have

Moreover, by employing

$$\big\lVert[\mathop{\nabla}\nolimits _\omega ]_{{\underline {m}}}^{1/2}[\varGamma ]_{{\underline {m}}}^{-1}[\varSigma]_{{\underline {m}}}[\varGamma ]_{{\underline {m}}}^{-1}[\mathop{\nabla}\nolimits _\omega ]_{{\underline {m}}}^{1/2}\big\rVert_{s} \leq C(d) \max\biggl( \max\limits_{1\leq j\leq m}\frac{\omega _j}{\gamma _j}, \max\limits_{1\leq j\leq m}\frac{\varsigma ^2\omega _j}{L\gamma _j^{2}}\biggr) $$

for all m≥1 the condition (3.9) implies

Combination of the last bound and (A.11) implies the assertion of the theorem. □

Proof of Proposition 3.6

Under the stated conditions it is easy to verify that the assumptions of Theorem 3.5 are satisfied. The result follows by applying Theorem 3.5 and we omit the details. □

1.4 A.4 Technical assertions

The following Lemmas A.2–A.6 gather technical results used in the proofs of Proposition 3.1 and Theorem 3.5. The proof of the next lemma can be found in Johannes and Schenk (2013).

Lemma A.2

Let \(\varGamma \in \mathcal {G}_{\gamma }^{d}\), where the sequence γ satisfies Assumption 2.1. Then we have

(A.12)
(A.13)
(A.14)

Let in addition \(\beta \in \mathcal {F}_{b}^{r}\) with sequence b satisfying Assumption 2.1. If β^m denotes a Galerkin solution of g=Γβ, then for each strictly positive sequence w:=(w_j)_{j≥1} such that w/b is nonincreasing, we obtain for all \(m\in \mathbb {N}\),

(A.15)
(A.16)

The following bound for the spectral norm of a standard Wishart-distributed random matrix is due to Davidson and Szarek (2001, Theorem 2.13). Recall that given independent and standard normally distributed m-dimensional random vectors Z_1,…,Z_n, the random matrix \(W=\sum_{i=1}^{n}Z_{i}Z_{i}^{t}\) follows a standard Wishart distribution with parameters (n,m).

Lemma A.3

Let W be a standard Wishart-distributed random matrix with parameters (n,m). For all t>0 we have \(P(\lVert W\rVert_{s}\geq n(1+\sqrt{m/n}+t)^{2})\leq\exp(-nt^{2}/2)\).
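A quick Monte Carlo sanity check of this tail bound (purely illustrative parameter values; not part of the original argument):

```python
import numpy as np

rng = np.random.default_rng(1)
n, m, t, reps = 50, 10, 0.5, 2000        # illustrative values only

threshold = n * (1 + np.sqrt(m / n) + t) ** 2
exceed = 0
for _ in range(reps):
    Z = rng.normal(size=(n, m))          # rows play the role of Z_1, ..., Z_n
    W = Z.T @ Z                          # standard Wishart with parameters (n, m)
    exceed += np.linalg.norm(W, 2) >= threshold   # spectral norm of W

print(exceed / reps, "<=", np.exp(-n * t ** 2 / 2))   # empirical frequency vs. the bound
```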

Moreover, from Theorem 2.4 in Davidson and Szarek (2001) we derive the following bound for the spectral norm of a random matrix with i.i.d. standard normally distributed components.

Lemma A.4

Let W be an (m×m)-dimensional random matrix with i.i.d. standard normally distributed components. There exist numerical constants C>2, c_0∈(0,1) and β_0≥1 such that for all \(m\in \mathbb {N}\) we have \(\mathbb{E}\lVert m^{-1/2}W\rVert_{s}^{8}\leq C\) and \(P(\lVert m^{-1/2}W\rVert_{s}>t)\leq \exp(-c_{0} t^{2})\) for all t≥β_0.
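A short simulation illustrating the moment bound (purely illustrative; the constants C and c_0 are not specified in the lemma, so the sketch only displays that the eighth moment stays bounded as m grows):

```python
import numpy as np

rng = np.random.default_rng(4)
for m in (5, 20, 80):                                   # illustrative dimensions
    norms = np.array([np.linalg.norm(rng.normal(size=(m, m)) / np.sqrt(m), 2)
                      for _ in range(500)])             # spectral norm of m^{-1/2} W
    print(m, (norms ** 8).mean())                       # empirical E ||m^{-1/2} W||_s^8
```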

Let us further state elementary inequalities for Gaussian random variables.

Lemma A.5

Let {U_i, V_{ij}, 1≤i≤n, 1≤j≤m} be independent and standard normally distributed random variables. Then we have for all η>0 that

(A.17)
(A.18)

and for all c≥1 and a_1,…,a_m≥0 we have

(A.19)
(A.20)

Proof of Lemma A.5

Define \(W:=\sum_{i=1}^{n} U_{i}^{2}\) and \(Z_{j}:=(\sum_{i=1}^{n}U_{i}^{2})^{-1/2}\sum_{i=1}^{n} U_{i}V_{ij}\). Obviously, W has a \(\chi^{2}_{n}\) distribution with n degrees of freedom and Z_1,…,Z_m given U_1,…,U_n are independent and standard normally distributed, which we use below without further reference. From the estimate (A.17), which is given in Dahlhaus and Polonik (2006, Proposition A.1), it follows that

which implies (A.18). It remains to prove (A.19) and (A.20) which can be realised as follows (keep in mind that \(\mathbb{E}[W]=n\) and \(\mathbb{E}[Z_{j}^{2}\big|U_{1},\dotsc,U_{n}]=1\)):

$$ \mathbb{E}\sum_{j=1}^ma_j\left \lvert \sum_{i=1}^n U_iV_{ij}\right \rvert ^2 = \mathbb{E}W \sum_{j=1}^ma_j \mathbb{E}\bigl[Z_j^2|U_1,\dotsc,U_n\bigr] = n\sum_{j=1}^ma_j. $$
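A small simulation illustrating the identity just displayed (the values of n, m and the weights a_j are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(2)
n, m, reps = 20, 4, 50_000                 # arbitrary illustrative values
a = np.array([0.5, 1.0, 2.0, 3.5])         # nonnegative weights a_1, ..., a_m

U = rng.normal(size=(reps, n))             # U_i
V = rng.normal(size=(reps, n, m))          # V_ij
S = np.einsum('ri,rij->rj', U, V)          # sum_i U_i * V_ij, one row per replication
lhs = (a * S ** 2).sum(axis=1).mean()      # Monte Carlo estimate of E sum_j a_j |sum_i U_i V_ij|^2
print(lhs, "vs", n * a.sum())              # the identity gives n * sum_j a_j = 140
```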

Finally, by applying \(\mathbb{E}[Z_{j}^{8}|U_{1},\dotsc,U_{n}]=105\) and \(\mathbb{E}W^{4}=n(n+2)(n+4)(n+6)\) we obtain \(\mathbb{E}[W^{4}Z_{j}^{8}]\leq (11n)^{4}\) and hence

which shows (A.20) and completes the proof. □

Lemma A.6

For all n,m≥1 we have

(A.21)
(A.22)

Furthermore, there exist numerical constants C, c_0>0 such that for all n,m≥1,

(A.23)
(A.24)
(A.25)

Proof of Lemma A.6

Let (λ_j, e_j)_{1≤j≤m} denote an eigenvalue decomposition of \([\varSigma]_{{\underline {m}}}\).

Proof of (A.21) and (A.22). Define \(U_{i}:= \rho_{m}^{-1}U^{(i)}= \rho_{m}^{-1}(Y^{(i)}-\langle \beta ^{m},\widehat {X}^{(i)}_{2}\rangle_{\mathbb {H}})\) and \(V_{ij}:=(\lambda_{j}^{-1/2}e_{j}^{t}[\widehat {X}^{(i)}_{1}]_{{\underline {m}}})\), 1≤i≤n, 1≤j≤m, where U_1,…,U_n, V_{11},…,V_{nm} are independent and standard normally distributed random variables. Taking into account \(\sum_{j=1}^{m}\lambda_{j}=\mathop{\rm tr}\nolimits ([\varSigma]_{{\underline {m}}})=\mathop{\rm tr}\nolimits ([\varGamma ]_{{\underline {m}}}) + 2\varsigma ^{2}L^{-1}m \leq \mathbb{E}\lVert X\rVert_{\mathbb {H}}^{2} + 2\varsigma ^{2}L^{-1}m\) and the identities \(n^{2}\rho _{m}^{-2}\lVert [\varSigma]_{{\underline {m}}}^{-1/2}[W]_{{\underline {m}}}\rVert ^{2}=\sum_{j=1}^{m} (\sum_{i=1}^{n}U_{i}V_{ij})^{2}\) and \(n^{8}\rho_{m}^{-8}\lVert [W]_{{\underline {m}}}\rVert^{8} = (\sum_{j=1}^{m} \lambda_{j}(\sum_{i=1}^{n}U_{i}V_{ij})^{2})^{4}\), the assertions (A.21) and (A.22) follow, respectively, from (A.19) and (A.20) in Lemma A.5 (with a_j=1 and a_j=λ_j, respectively).

Proof of (A.23) and (A.24). Define the random matrix \(A_{m}:=\sum_{i=1}^{n}[\varSigma]_{{\underline {m}}}^{-1/2}[\widehat {X}^{(i)}_{2}]_{{\underline {m}}} [\widehat {X}^{(i)}_{2}]_{{\underline {m}}}^{t}[\varSigma]_{{\underline {m}}}^{-1/2}=C_{nm}C_{nm}^{t}\) with \(C_{nm}:=([\varSigma]_{{\underline {m}}}^{-1/2}[\widehat {X}^{(1)}_{2}]_{{\underline {m}}},\dotsc,[\varSigma]_{{\underline {m}}}^{-1/2}[\widehat {X}^{(n)}_{2}]_{{\underline {m}}})\) and for i=1,…,n the random vector \([U_{i}]_{{\underline {m}}}:=[\varSigma]_{{\underline {m}}}^{-1/2}[\widehat {X}^{(i)}_{1}]_{{\underline {m}}}-([\varSigma]_{{\underline {m}}}^{-1/2} [\varGamma ]_{{\underline {m}}} [\varSigma]_{{\underline {m}}}^{-1/2})[\varSigma ]_{{\underline {m}}}^{-1/2}[\widehat {X}^{(i)}_{2}]_{{\underline {m}}}\). Observe that the conditional distribution of \([U_{i}]_{{\underline {m}}}\) given \([\varSigma]_{{\underline {m}}}^{-1/2}[\widehat {X}^{(i)}_{2}]_{{\underline {m}}}\) is Gaussian with mean zero and covariance matrix \(\varSigma_{U}:=([{\rm Id}]_{{\underline {m}}}- [\varSigma]_{{\underline {m}}}^{-1/2} [\varGamma ]_{{\underline {m}}}[\varSigma]_{{\underline {m}}}^{-1} [\varGamma ]_{{\underline {m}}}[\varSigma]_{{\underline {m}}}^{-1/2})\), and given \([\varSigma]_{{\underline {m}}}^{-1/2}[\widehat {X}^{(1)}_{2}]_{{\underline {m}}},\dotsc,[\varSigma ]_{{\underline {m}}}^{-1/2}[\widehat {X}^{(n)}_{2}]_{{\underline {m}}}\) the components of the (m×m)-dimensional matrix

$$B_m:= \varSigma_U^{-1/2}\Biggl\{\sum_{i=1}^n [U_i]_{\underline {m}}\bigl[\widehat {X}^{(i)}_2\bigr]_{\underline {m}}^t[\varSigma]_{\underline {m}}^{-1/2}\Biggr\} A_m^{-1/2} $$

are i.i.d. standard normally distributed. By employing this notation it is easily seen that

$$[\varXi]_{\underline {m}}=n^{-1}\varSigma_U^{1/2} B_m A_m^{1/2} + [\varSigma ]_{\underline {m}}^{-1/2} [\varGamma ]_{\underline {m}}[\varSigma]_{\underline {m}}^{-1/2}\bigl(n^{-1}A_m -[{\rm Id}]_{\underline {m}}\bigr) $$

For all 1≤j,l≤m let δ_{jl}=1 if j=l and zero otherwise. It is easily verified that \(\lVert n^{-1}A_{m} -[{\rm Id}]_{{\underline {m}}}\rVert_{s}^{2}\leq\sum_{j=1}^{m}\sum _{l=1}^{m}|n^{-1}\sum_{i=1}^{n}(V_{ij}V_{il}-\delta_{jl})|^{2}\) with \(V_{ij}:=(\lambda_{j}^{-1/2}e_{j}^{t}[\widehat {X}^{(i)}_{2}]_{{\underline {m}}})\), 1≤i≤n, 1≤j≤m. Moreover, for j≠l we have \(\mathbb{E}|\sum_{i=1}^{n} V_{ij}V_{il}|^{8} \leq(11n)^{4}\) by employing (A.20) in Lemma A.5 (take m=1 and a_1=1), while \(\mathbb{E}|\sum_{i=1}^{n}(V_{ij}^{2}-1)|^{8}=n^{4}\, 256(105/16+595/(2n)+ 1827/n^{2}+2520/n^{3})\leq(34n)^{4}\). From these estimates we get \(m^{-8}\mathbb{E}\lVert n^{-1}A_{m} -[{\rm Id}]_{{\underline {m}}}\rVert_{s}^{8}\leq C n^{-4}\) for all m≥1, which implies \(\mathbb{E}\lVert n^{-1/2}A_{m}^{1/2}\rVert_{s}^{8}\leq C (m^{4}n^{-2}+1)\). Combining the last bound and \(\mathbb{E}[\lVert m^{-1/2} B_{m}\rVert_{s}^{8}\big|C_{nm}]\leq C\) due to Lemma A.4 we obtain

$$ \mathbb{E}\big\lVert [\varXi]_{\underline {m}}\big\rVert_{s}^8\leq C \bigl\{n^{-4} \lVert \varSigma_U\rVert_{s}^4 m^4\bigl(1+m^4n^{-2}\bigr)+ \big\lVert [\varSigma]_{\underline {m}}^{-1/2} [\varGamma ]_{\underline {m}}[\varSigma]_{\underline {m}}^{-1/2}\big\rVert_{s}^8 m^8n^{-4}\bigr\}. $$
(A.26)

If \(\{v_{j}\}_{j=1}^{m}\) denote the eigenvalues of \([\varGamma ]_{{\underline {m}}}\) in a decreasing order then it follows that \(\{v_{j}+2\varsigma ^{2}L^{-1}\}_{j=1}^{m}\) are the eigenvalues of \([\varSigma]_{{\underline {m}}}\) and, hence \(\{v_{j}(v_{j}+2\varsigma ^{2}L^{-1})^{-1}\}_{j=1}^{m}\) are the eigenvalues of \([\varSigma]_{{\underline {m}}}^{-1/2} [\varGamma ]_{{\underline {m}}}[\varSigma]_{{\underline {m}}}^{-1/2}\) which implies \(\lVert [\varSigma]_{{\underline {m}}}^{-1/2} [\varGamma ]_{{\underline {m}}}[\varSigma]_{{\underline {m}}}^{-1/2}\rVert_{s} \leq1\), \(\lVert [\varSigma]_{{\underline {m}}}^{1/2}[\varGamma ]_{{\underline {m}}}^{-1}[\varSigma]_{{\underline {m}}}^{1/2}\rVert_{s}=1+2\varsigma ^{2}L^{-1}\lVert [\varGamma ]_{{\underline {m}}}^{-1}\rVert_{s}\) and analogously \(\lVert\varSigma_{U}\rVert_{s}\leq1\). Combining the last estimates and (A.26) we obtain the assertion (A.23). Moreover, we have

$$\big\lVert [\varXi]_{\underline {m}}\big\rVert_{s}\leq\big\lVert n^{-1/2}B_m\big\rVert _{s}\big\lVert n^{-1/2} A_m^{1/2}\big\rVert_{s} + \big\lVert n^{-1}A_m -[{\rm Id}]_{\underline {m}}\big\rVert_{s} $$

and hence

(A.27)

Since \(n\lVert n^{-1}A_{m} -[{\rm Id}]_{{\underline {m}}}\rVert_{s}\leq m\max_{1\leq j,l\leq m}|\sum_{i=1}^{n}(V_{ij}V_{il}-\delta_{jl})|\), we obtain due to (A.17) and (A.18) in Lemma A.5 that

Moreover, for all ηm/2 the last bound simplifies to

$$P\bigl(\big\lVert n^{-1}A_m -[{\rm Id}]_{\underline {m}}\big\rVert_{s}\geq\eta\bigr)\leq m^2\max{\bigg \lbrace 1+\frac{2m}{\eta n^{1/2}}, 2\bigg\rbrace} \exp\bigg({-}\frac {1}{12}\frac{n\eta ^2}{m^2}\bigg). $$

It is easily seen that the last bound, together with Lemmas A.3 and A.4 and the decomposition (A.27), implies (A.24), which completes the proof. □

Cite this article

Bereswill, M., Johannes, J. On the effect of noisy measurements of the regressor in functional linear models. TEST 22, 488–513 (2013). https://doi.org/10.1007/s11749-013-0325-7
