Abstract
The use of quadratic forms of the empirical process for the two-sample problem in the context of functional data is considered. The proposed family of statistics is shown to converge to a Chi-squared limit under metric entropy conditions for smooth functional data. The applicability of the proposed methodology is evaluated in simulations and real data examples.
References
Alvarez-Esteban P, Euán C, Ortega J (2016a) Time series clustering using the total variation distance with applications in oceanography. Environmetrics 27:355–369
Alvarez-Esteban P, Euán C, Ortega J (2016b) Statistical analysis of stationary intervals for random waves. In: Proceedings of 26th international offshore and polar engineering conference (ISOPE), vol 3, pp 305–311
Benko M, Härdle W, Kneip A (2009) Common functional principal components. Ann Stat 37:1–34
Borgman LE (1972) Statistical models for ocean waves and wave forces. In: Chow VT (ed) Advances in hydroscience, vol 8. Academic Press, New York
Bosq D (2000) Linear processes in function spaces. Lecture Notes in Statistics, vol 149. Springer, New York
Brodtkorb PA, Johannesson P, Lindgren G, Rychlik I, Rydén E, Sjö E (2000) WAFO—a Matlab toolbox for analysis of random waves and loads. In: Proceedings of 10th international offshore and polar engineering conference (ISOPE), vol III, Seattle, USA, pp 343–350
Cuevas A (2013) A partial overview of the theory of statistics with functional data. J Stat Plann Inference 147:1–23
Dudley RM (1987) Universal Donsker classes and metric entropy. Ann Probab 15:1306–1326
Ermakov MS (1998) Asymptotic minimaxity of chi-square tests. Theory Probab Appl 42(4):589–610
Ferraty F (ed) (2011) Recent advances in functional data analysis and related topics. Physica Verlag, Berlin
Ferraty F, Vieu P (2006) Nonparametric functional data analysis: theory and practice. Springer, New York
Fremdt S, Horváth L, Kokoszka P, Steinebach JG (2014) Functional data analysis with increasing number of projections. J Multivar Anal 124:313–332
Fremdt S, Steinebach JG, Horváth L, Kokoszka P (2013) Testing the equality of covariance operators in functional samples. Scand J Stat 40:138–152
Good P (2005) Permutation, parametric and bootstrap tests of hypothesis. Springer, New York
Gorrostieta C, Ortega J, Quiroz AJ, Smith GH (2014) Characterization of storm wave asymmetries with functional data analysis. Environ Ecol Stat 21(2):263–283
Götze F, Tikhomirov A (2002) Asymptotic distribution of quadratic forms and applications. J Theor Probab 15(2):423–475
Hall P, Van Keilegom I (2007) Two sample tests in functional data analysis starting from discrete data. Stat Sin 17:1511–1531
Horváth L, Kokoszka P (2009) Two sample inference in functional linear models. Can J Stat 37:571–591
Horváth L, Kokoszka P (2012) Inference for functional data with applications. Springer, New York
Horváth L, Kokoszka P, Reeder R (2013) Estimation of the mean of functional time series and a two-sample problem. J R Stat Soc Ser B 75:103–122
Horváth L, Rice G (2015a) Testing equality of means when the observations are from functional time series. J Time Ser Anal 36:84–108
Horváth L, Rice G (2015b) An introduction to functional data analysis and a principal component approach for testing the equality of mean curves. Revista Matemática Complutense 28:505–548
Longuet-Higgins M (1956) Statistical properties of a moving wave form. Proc Camb Philos Soc 52(2):234–245
Longuet-Higgins M (1957) The statistical analysis of a random moving surface. Philos Trans R Soc Lond Ser A 249(966):321–387
Mas A (2007) Testing for the mean of random curves: a penalization approach. Stat Inference Stoch Process 10:147–163
Mikosch T (1991) Functional limit theorems for random quadratic forms. Stoch Process Appl 37:81–98
Muñoz Maldonado Y, Staniswalis JG, Irwin LN, Byers D (2002) A similarity analysis of curves. Can J Stat 30:373–381
Ochi MK (1998) Ocean waves: the stochastic approach. Cambridge ocean technology series. Cambridge University Press, Cambridge
Paparoditis E, Sapatinas Th (2014) Bootstrap-based testing for functional data. arXiv:1409.4317v1 [math.ST]
Peña J (2012) Propuestas para el problema de dos muestras con datos funcionales. Tesis de maestría. Universidad de Los Andes, Colombia
Pierson WJ Jr (1955) Wind-generated gravity waves. Adv Geophys 2:93–178
Pollard D (1982) A central limit theorem for empirical processes. J Aust Math Soc Ser A 33:235–248
Pollard D (1984) Convergence of stochastic processes. Springer, New York
Pomann G-M, Staicu A-M, Ghosh S (2016) A two-sample distribution-free test for functional data with application to a diffusion tensor imaging study of multiple sclerosis. J R Stat Soc Ser C 65:395–414
Ramsay JO, Silverman BW (2002) Applied functional data analysis. Springer, New York
Ramsay JO, Silverman BW (2005) Functional data analysis, 2nd edn. Springer, New York
Torsethaugen K (1993) A two-peak wave spectrum model. In: Proceedings of the 18th international conference on ocean, offshore and arctic engineering (OMAE), vol II, pp 175–180
Torsethaugen K, Haver S (2004) Simplified double peak spectral model for ocean waves. In: Proceedings of the 14th international offshore and polar engineering conference, pp 23–28
van der Vaart A (1996) New Donsker classes. Ann Probab 24:2128–2140
van der Vaart A (1998) Asymptotic statistics. Cambridge University Press, Cambridge
van der Vaart AW, Wellner JA (1996) Weak convergence and empirical processes. Springer series in statistics. Springer, New York
Zhang X, Shao X (2015) Two sample inference for the second-order property of temporally dependent functional data. Bernoulli 21:909–929
Acknowledgements
The software WAFO (Brodtkorb et al. 2000) developed by the Wafo group at Lund University of Technology, Sweden, available at http://www.maths.lth.se/matstat/wafo was used for the calculation of all Fourier spectra and associated spectral characteristics as well as for the simulation of Gaussian random waves. The data for station 106 were furnished by the Coastal Data Information Program (CDIP), Integrative Oceanographic Division, operated by the Scripps Institution of Oceanography, under the sponsorship of the US Army Corps of Engineers and the California Department of Boating and Waterways (http://cdip.ucsd.edu/). This work was partially supported by CONACYT, Mexico, Proyectos 169175 Análisis Estadístico de Olas Marinas, Fase II and 234057 Análisis Espectral, Datos Funcionales y Aplicaciones. It was finished while J.O. was visiting, on sabbatical leave from CIMAT and with support from CONACYT, México, the Departamento de Estadística e I.O., Universidad de Valladolid. Their hospitality and support are gratefully acknowledged.
Appendix: Proof of results
Proof of Proposition 1:
For each random function X on J, and \(L_g\in \mathcal{H}\), by the Cauchy–Schwarz inequality, we have
by hypothesis. Next, let \(g^*_1,g^*_2,\dots ,g^*_l\) be a minimal set of functions such that, for every \(g\in \mathscr {G}\), there exists \(j\le l\) for which \(\Vert g-g^*_j\Vert _{2,J}\le \epsilon \). Let \(Q^*\) be a probability measure on \(\mathcal{X}\). Then,
again by the Cauchy–Schwarz inequality, and independently of the particular \({Q}^*\). It follows that, for an appropriate choice of a positive constant \(\gamma \),
and the result follows. \(\square \)
Proof of Proposition 2:
Under the null hypothesis of equality of distributions, the covariance matrices at the limiting vector of functions, \(C({X,\mathbf {g}_\infty })\) and \(C({Y,\mathbf {g}_\infty })\), are the same. We are writing \(\mathbf {g}_\infty \), as in the statement of Proposition 1, for the vector of the \(g_{j,\infty }\), \(j\le k\). Now, by Pollard’s Uniform Entropy Condition that holds for \(\mathcal{H}\), the Donsker property holds for the inner product class \(\mathcal{H}\). This means that the empirical processes \(\nu _X(L_g)\) and \(\nu _Y(L_g)\), both indexed in \(\mathscr {G}\), converge uniformly to a limiting Gaussian process and, by Dudley’s asymptotic equicontinuity condition and the assumed convergence of the functions in the vector \(\tilde{\mathbf {g}}\),
Let \(\tilde{C}({X,\mathbf {g}_\infty })\) be the sample covariance of the vectors \((L_{g_{1,\infty }}(X_i),\dots ,\) \(L_{g_{k,\infty }}(X_i))\), \(i\le m\), and define similarly \(\tilde{C}({Y,\mathbf {g}_\infty })\) for the Y sample. It is clear that, under the null hypothesis, both \(\tilde{C}({X,\mathbf {g}_\infty })\) and \(\tilde{C}({Y,\mathbf {g}_\infty })\) are consistent estimators of \(C({X,\mathbf {g}_\infty })\). Using the independence of the processes \(\nu _X(L_g)\) and \(\nu _Y(L_g)\), it follows that
is a consistent estimator of the covariance matrix of the vector
Since \(L_{\mathbf {g}_\infty }\) is a fixed set of functionals, from the usual k-dimensional Central Limit Theorem and Slutsky’s theorem, it follows that
In view of (22), the same limit is obtained if we replace \(\varphi \) by \(\gamma \) in (24). Thus, by Slutsky’s theorem again, it only remains to show that \(\tilde{C}({X,\tilde{\mathbf {g}}})\) converges pointwise, in probability, to \(C({X,\mathbf {g}_\infty })\). But using inequalities (20) and (21), it is easy to see that the covariance matrix \(C(X,\tilde{\mathbf {g}})\) is a continuous function of the vector \(\tilde{\mathbf {g}}\), with respect to the norm of \(\mathcal{L}^2(J)\). Thus, by the triangle inequality, it suffices to have a uniform law of large numbers for the class
and for the class \(\mathcal{H}\) as well. Now, let \(Q^*\) be a probability law on \(\mathcal{X}\) and \(g,g',f,f'\) functions in \(\mathscr {G}\). Then, using Proposition 1, we get
for the constant C in that Proposition. It follows that,
and since the covering number \(N_2({\epsilon }/{2C},\mathcal{H})\) satisfies Pollard’s uniform entropy condition (16), the same will hold for \(\sup _{Q^*}N_1(\epsilon ,\mathcal{H}^{(2)},Q^*)\) (squaring the covering number does not affect the entropy condition), and this is more than enough for a Uniform Law of Large Numbers for \(\mathcal{H}^{(2)}\). The argument for \(\mathcal{H}\) is simpler and omitted, and the proof of Proposition 2 is complete. \(\square \)
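As an informal illustration of the kind of statistic Proposition 2 concerns, the following Python sketch computes a quadratic form of the difference of sample means of the projections \(L_g(X)=\int _J g(t)X(t)\,dt\), normalized by an estimate of its covariance matrix. It is only a sketch under stated assumptions, not the authors' implementation: the projection functions are taken as fixed (rather than data-driven), the curves are observed on a common uniform grid, the integral is approximated by a Riemann sum, and all function names (`project`, `mean_and_cov`, `solve`, `quadratic_form_statistic`, `simulate_sample`) as well as the toy sine/cosine curves are hypothetical choices made for this example.

```python
import math
import random


def project(curve, g, dt):
    """Riemann-sum approximation of L_g(x) = integral of g(t) x(t) dt."""
    return sum(gi * xi for gi, xi in zip(g, curve)) * dt


def mean_and_cov(rows):
    """Sample mean vector and unbiased sample covariance matrix."""
    n, k = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(k)]
    cov = [[sum((r[a] - mean[a]) * (r[b] - mean[b]) for r in rows) / (n - 1)
            for b in range(k)] for a in range(k)]
    return mean, cov


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    k = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("covariance estimate is numerically singular")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, k):
            f = M[r][col] / M[col][col]
            for c in range(col, k + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * k
    for r in range(k - 1, -1, -1):
        s = sum(M[r][c] * x[c] for c in range(r + 1, k))
        x[r] = (M[r][k] - s) / M[r][r]
    return x


def quadratic_form_statistic(xs, ys, gs, dt):
    """Quadratic form d' V^{-1} d, with d the difference of projected means.

    V = Cx/m + Cy/n estimates the covariance of d (an assumption of this
    sketch); with k fixed projections the statistic is compared with a
    chi-squared distribution with k degrees of freedom.
    """
    m, n, k = len(xs), len(ys), len(gs)
    px = [[project(x, g, dt) for g in gs] for x in xs]
    py = [[project(y, g, dt) for g in gs] for y in ys]
    mx, cx = mean_and_cov(px)
    my, cy = mean_and_cov(py)
    d = [mx[j] - my[j] for j in range(k)]
    V = [[cx[a][b] / m + cy[a][b] / n for b in range(k)] for a in range(k)]
    w = solve(V, d)
    return sum(di * wi for di, wi in zip(d, w))


def simulate_sample(size, shift, grid, rng):
    """Hypothetical toy curves: random sine/cosine mix with a mean shift."""
    curves = []
    for _ in range(size):
        a, b = rng.gauss(0, 1) + shift, rng.gauss(0, 1)
        curves.append([a * math.sin(2 * math.pi * t)
                       + b * math.cos(2 * math.pi * t) for t in grid])
    return curves
```

With k = 2 projection functions, for instance a sine and a cosine on a grid of [0, 1), the value returned by `quadratic_form_statistic` would be compared with a quantile of the Chi-squared distribution with 2 degrees of freedom (about 5.99 at level 0.05). Writing the normalization as `Cx/m + Cy/n` is algebraically the same as scaling a weighted combination of the two sample covariances by mn/(m+n).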
Bárcenas, R., Ortega, J. & Quiroz, A.J. Quadratic forms of the empirical processes for the two-sample problem for functional data. TEST 26, 503–526 (2017). https://doi.org/10.1007/s11749-017-0522-x
DOI: https://doi.org/10.1007/s11749-017-0522-x