Abstract
We propose a new test to validate the assumption of homoscedasticity in a functional linear model. We consider a minimum distance measure of heteroscedasticity in functional data, which is zero in the case where the variance is constant and positive otherwise. We derive an explicit form of the measure, propose an estimator for the quantity, and show that an appropriately standardized version of the estimator is asymptotically normally distributed under both the null (homoscedasticity) and alternative hypotheses. We extend this result for residuals from functional linear models and develop a bootstrap diagnostic test for the presence of heteroscedasticity under the postulated model. Moreover, our approach also allows testing for “relevant” deviations from the homoscedastic variance structure and constructing confidence intervals for the proposed measure. We investigate the performance of our method using extensive numerical simulations and a data example.
References
Aue A, Rice G, Sönmez O (2020) Structural break analysis for spectrum and trace of covariance operators. Environmetrics 31(1):e2617
Bagchi P, Characiejus V, Dette H (2018) A simple test for white noise in functional time series. J Time Ser Anal 39(1):54–74
Berger JO, Delampady M (1987) Testing precise hypotheses. Stat Sci 2:317–335
Carapeto M, Holt W (2003) Testing for heteroscedasticity in regression models. J Appl Stat 30(1):13–20
Chow SC, Liu PJ (1993) Design and analysis of bioavailability and bioequivalence studies. Comput Stat Data Anal 16(2):246
Cook RD, Weisberg S (1983) Diagnostics for heteroscedasticity in regression. Biometrika 70(1):1–10
Microsoft Corporation, Weston S (2019) doParallel: foreach parallel adaptor for the ’parallel’ package. https://CRAN.R-project.org/package=doParallel. R package version 1.0.15
Cremers H, Kadelka D (1986) On weak convergence of integral functionals of stochastic processes with applications to processes taking paths in \(L_p^E\). Stochast Process Appl 21(2):305–317
Cuesta-Albertos J, Febrero-Bande M (2010) A simple multiway ANOVA for functional data. TEST 19(3):537–557
Dette H, Kokot K (2020) Detecting relevant differences in the covariance operators of functional time series–a sup-norm approach. arXiv preprint arXiv:2006.07291
Dette H, Munk A (1998) Testing heteroscedasticity in nonparametric regression. J R Stat Soc Ser B (Stat Methodol) 60(4):693–708. https://doi.org/10.1111/1467-9868.00149
Gabrys R, Horváth L, Kokoszka P (2010) Tests for error correlation in the functional linear model. J Am Stat Assoc 105(491):1113–1125. https://doi.org/10.1198/jasa.2010.tm09794
García-Portugués E, González-Manteiga W, Febrero-Bande M (2014) A goodness-of-fit test for the functional linear model with scalar response. J Comput Graph Stat 23(3):761–778
Gaujoux R (2020) doRNG: generic reproducible parallel backend for ’foreach’ loops. https://CRAN.R-project.org/package=doRNG. R package version 1.8.2
Grenander U (1950) Stochastic processes and statistical inference. Ark Mat 1(3):195–277
Hörmann S, Kokoszka P (2010) Weakly dependent functional data. Ann Stat 38(3):1845–1884. https://doi.org/10.1214/09-aos768
Jarusková D (2013) Testing for a change in covariance operator. J Stat Plann Inference 143(9):1500–1511
Koenker R, Bassett G Jr (1982) Robust tests for heteroscedasticity based on regression quantiles. Econometrica 50(1):43–61
Lehmann EL, Casella G (1998) Theory of point estimation. Springer, New York
Long JS, Ervin LH (2000) Using heteroscedasticity consistent standard errors in the linear regression model. Am Stat 54(3):217–224
Mcbride GB (1999) Equivalence tests can enhance environmental science and management. Aust N Z J Stat 41(1):19–29
Microsoft, Weston S (2020) foreach: provides foreach looping construct. https://CRAN.R-project.org/package=foreach. R package version 1.5.0
Müller HG, Stadtmüller U (1987) Estimation of heteroscedasticity in regression analysis. Ann Stat 15(2):610–625
Orey S (1958) A central limit theorem for \(m\) -dependent random variables. Duke Math J 25(4):543–546. https://doi.org/10.1215/s0012-7094-58-02548-1
Ramsay J (1982) When the data are functions. Psychometrika 47(4):379–396
Ramsay JO, Dalzell C (1991) Some tools for functional data analysis. J R Stat Soc Ser B (Methodol) 53(3):539–561
Ramsay JO, Graves S, Hooker G (2020) fda: functional data analysis. https://CRAN.R-project.org/package=fda. R package version 5.1.4
Rao CR (1958) Some statistical methods for comparison of growth curves. Biometrics 14(1):1–17
Rice G, Wirjanto T, Zhao Y (2020) Tests for conditional heteroscedasticity of functional data. J Time Ser Anal 41(6):733–758
Stein ML (1999) Interpolation of spatial data. Springer, Berlin
Stoehr C, Aston JA, Kirch C (2020) Detecting changes in the covariance structure of functional time series with application to fMRI data. Econ Stat
Zhang X (2016) White noise testing and model diagnostic checking for functional time series. J Econom 194(1):76–95
A Technical details
A.1 Proof of Theorem 2
First, consider the sequence of random elements \((T_n) \in L^2([0,1]^2)\) defined as
We show that this sequence of random processes converges in distribution to a zero-mean Gaussian process \({\mathcal {G}}\) in \(L^2([0,1]^2)\) with covariance kernel \(\kappa \) as \(n \rightarrow \infty \).
To show this, we will use Theorem 2 from Cremers and Kadelka (1986): we show that the finite-dimensional distributions of \(T_n\) converge to the finite-dimensional distributions of \({\mathcal {G}}\) as \(n \rightarrow \infty \) and that Condition (4.3) from Cremers and Kadelka (1986) is satisfied. By Remark 3 from Cremers and Kadelka (1986), it is enough to show that there exists an integrable function \(f:[0,1]^2 \mapsto [0,\infty )\) such that
$$\begin{aligned} E\{T_n^2(u,v)\} \le f(u,v) \end{aligned}$$
for all \(n \in {\mathbb {N}}\) and for all \(0 \le u,v \le 1\).
Equation (A.2) follows from our assumption \(E\{X_j^4(u)\} \le K(u)\) for some integrable function K, and (A.3) is a direct consequence of the finite-dimensional distributional convergence stated in (1).
To show (1), it is enough to show that
in distribution as \(n \rightarrow \infty \) for all \(d \ge 1\) and almost every \(0 \le u_1,\dots ,u_d, v_1,\dots ,v_d \le 1\).
To this end, we first show that the vector
converges in distribution to a multivariate normal random variable as \(n \rightarrow \infty \). An application of the multivariate delta method then proves (A.4).
Without loss of generality, we will prove the convergence of \(I_n\) for \(d=1\). The general case can be established similarly at the cost of additional notation. First,
Therefore, it is enough to show that the first term converges in distribution to a multivariate normal random variable. To prove the last assertion, we use the Cramér–Wold device and show that for any \(a \in {\mathbb {R}}^2\), the linear combination \(a'{\tilde{I}}_n\) converges in distribution to a normal random variable as \(n \rightarrow \infty \). Now,
where \(Y_j = a_1R_j(u_1) R_j(v_1) R_{j+2}(u_1) R_{j+2}(v_1) + a_2R_j(u_1) R_j(v_1).\) The variables \(Y_j - E(Y_j)\) are m-dependent with \(m=4\). We can use the m-dependent central limit theorem (see the last Corollary from Orey 1958) to establish the claimed convergence. The justification for the application of the last Corollary is presented in the supplement.
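For the reader's convenience, the m-dependent central limit theorem invoked here can be sketched as follows; this is the classical stationary form (of which the corollary in Orey 1958 is a generalization to non-stationary sequences), stated here only for orientation.

```latex
% m-dependent CLT, stationary form (sketch):
% if (Z_j) is strictly stationary and m-dependent with
% E(Z_1) = 0 and E(Z_1^2) < \infty, then
\[
  n^{-1/2} \sum_{j=1}^{n} Z_j \;\xrightarrow{d}\; N(0, \sigma^2),
  \qquad
  \sigma^2 \;=\; \operatorname{Var}(Z_1)
    \;+\; 2 \sum_{k=1}^{m} \operatorname{Cov}(Z_1, Z_{1+k}).
\]
% In the proof above one takes Z_j = Y_j - E(Y_j) with m = 4.
```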
Finally,
where \(g_d: {\mathbb {R}}^{2d} \mapsto {\mathbb {R}}^d\) is defined as \(g_d(x_1,\dots ,x_d,y_1,\dots ,y_d) = (x_1-y_1^2,\dots ,x_d-y_d^2)\). Thus, an application of the multivariate delta method (Lehmann and Casella 1998) proves (A.4).
To obtain the covariance kernel, say \(\kappa \{(u_1,v_1),(u_2,v_2)\}\), of the Gaussian process \({\mathcal {G}}\), we consider \(I_n(u_1,v_1,u_2,v_2)\). The covariance kernel is essentially the off-diagonal entry of the limiting covariance matrix of \(\{{\mathcal {G}}(u_1,v_1),{\mathcal {G}}(u_2,v_2)\}\). This can be calculated by first computing the asymptotic covariance matrix of \(I_n(u_1,v_1,u_2,v_2)\), which is a \(4 \times 4\) matrix, and then applying the delta method with \(g_2: {\mathbb {R}}^4 \rightarrow {\mathbb {R}}^2\), where \(g_2(x_1,x_2,x_3,x_4) = (x_1 - x_3^2, x_2 - x_4^2)\); see Theorem 8.22, page 61, of Lehmann and Casella (1998). The details of this calculation are given in the supplement.
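To make the delta-method step concrete, write \(\theta \) for the limit in probability of \(I_n(u_1,v_1,u_2,v_2)\) and \(\Sigma \) for its \(4 \times 4\) asymptotic covariance matrix (this notation is introduced here for illustration only); then

```latex
\[
  \sqrt{n}\,\bigl\{ g_2(I_n) - g_2(\theta) \bigr\}
  \;\xrightarrow{d}\; N\bigl(0,\; J \Sigma J' \bigr),
  \qquad
  J \;=\;
  \begin{pmatrix}
    1 & 0 & -2\theta_3 & 0 \\
    0 & 1 & 0        & -2\theta_4
  \end{pmatrix},
\]
% J is the Jacobian of g_2(x_1,x_2,x_3,x_4) = (x_1 - x_3^2, x_2 - x_4^2)
% evaluated at theta; the off-diagonal entry of the 2x2 matrix J*Sigma*J'
% is the kernel value kappa{(u_1,v_1),(u_2,v_2)}.
```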
Finally, \(n^{1/2}({\widehat{M}}_n - M_0^2) = F(T_n)\), where \(F: L^2([0,1]^2) \rightarrow {\mathbb {R}}\) is a continuous map defined by
An application of the continuous mapping theorem gives
in distribution as \(n \rightarrow \infty \), which in turn implies
in distribution as \(n \rightarrow \infty \).
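For completeness, the continuous mapping step used above is the following standard fact:

```latex
% Continuous mapping theorem in L^2([0,1]^2):
% if T_n converges in distribution to G in L^2([0,1]^2) and
% F : L^2([0,1]^2) -> R is continuous, then
\[
  F(T_n) \;\xrightarrow{d}\; F({\mathcal G})
  \quad \text{as } n \rightarrow \infty .
\]
```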
A.2 Proof of Lemma 2
Let \({\widetilde{A}}_n\) and \({\widetilde{B}}_n\) be the versions of \(A_n\) and \(B_n\) defined in (4.1); that is, we define
and
We have to show
where
and \(T_n\) is defined as in (A.1). In particular, we will show that
We will prove (A.5); the proof of (A.6) is similar. To this end, we first write
and note that by our assumption \(\sup _{j}\sup _{u \in [0,1]}\vert r_{j,n}(u)\vert = o_p(n^{-1/4})\). We write
Then,
Finally,
Variance calculations similar to the calculation of \(\nu ^2\) show
Similar bounds can be derived for the other three terms. Noting that \(\sup _j\sup _{u} \vert r_{j+2,n}(u) \vert = o_p(1)\), we conclude that for any \(0 \le u,v \le 1,\)
Asymptotic tightness of \(n^{1/2}\Big \{{\widetilde{A}}_n(u,v) - A_n(u,v)\Big \} \) follows by Condition (4.3) of Cremers and Kadelka (1986), as in the proof of Theorem 2, which in turn proves (A.5) by the continuous mapping theorem.
Cameron, J., Bagchi, P. A test for heteroscedasticity in functional linear models. TEST 31, 519–542 (2022). https://doi.org/10.1007/s11749-021-00786-8