Testing marginal homogeneity in Hilbert spaces with applications to stock market returns

The paper considers a paired data framework and discusses the question of marginal homogeneity of bivariate high-dimensional or functional data. The related testing problem can be embedded into a more general setting for paired random variables taking values in a general Hilbert space. To address this problem, a Cramér-von-Mises type test statistic is applied and a bootstrap procedure is suggested to obtain critical values and finally a consistent test. The desired properties of a bootstrap test can be derived, namely asymptotic exactness under the null hypothesis and consistency under alternatives. Simulations show the quality of the test in the finite sample case. A possible application is the comparison of two possibly dependent stock market returns on the basis of functional data. The approach is demonstrated on the basis of historical data for different stock market indices.


Introduction
Due to the availability of high frequency data, statistical observations can be described and modeled by random functions, so-called stochastic processes. Since classical methods are designed for vector-valued observations rather than for stochastic processes, they usually cannot be applied in this situation. The field of functional data analysis tries to close this gap.
One popular solution to tackle this problem is to project the random functions to the real line and then apply one of the classical methods. For example, Cuesta-Albertos et al. (2006) and Cuesta-Albertos et al. (2007) applied the Kolmogorov-Smirnov goodness-of-fit test to randomly projected square integrable functions. Cuevas and Fraiman (2009) extended this idea to more general spaces. Ditzhaus and Gaigall (2018) did the same by discussing observations with values in a general Hilbert space. In this way, stochastic processes as well as high-dimensional data can be discussed simultaneously. More interaction between these two fields is desirable, as stated by Goia and Vieu (2016) and Cuevas (2014) in the functional data community as well as by Ahmed (2017) from the high-dimensional side. Extending the idea of goodness-of-fit, Bugni et al. (2009) studied the testing problem whether the underlying distribution belongs to a pre-specified parametric family. Hall and Van Keilegom (2002) discussed pre-processing the functional data, which in practice are usually available only at finitely many time points, in the context of two-sample testing. Recently, the two-sample testing problem for multivariate functional data was discussed by Jiang, Meintanis and Zhu (2017) on the basis of empirical characteristic functions.
In this paper, we also use the projection idea mentioned above, but we address a paired-sample testing problem allowing for dependency. This approach and the test we develop are completely new. We suggest a procedure for testing marginal homogeneity in separable Hilbert spaces, where we consider not just a few random projections but all projections from a sufficiently large projection space. The advantage of our approach is that no additional randomness has an influence on the result of the test. With a view to the consistency of the testing procedure, we apply a test statistic of Cramér-von-Mises type. See Anderson and Darling (1952) and Rosenblatt (1952) for Cramér-von-Mises tests in the usual cases of real-valued random variables and random vectors with real components, and Gaigall (2019) for the application of a Cramér-von-Mises test for the null hypothesis of marginal homogeneity for bivariate distributions on the Cartesian square of the real line. However, the demand for consistency has a price: the distribution of the test statistic under the null hypothesis is unknown and so related quantiles are not available in practice. In contrast to the unpaired setting, exchangeability is not given in general under the null hypothesis of marginal homogeneity. This is already known for bivariate distributions on the Cartesian square of the real line, see Gaigall (2019). For that reason, a permutation test, such as those used by Hall and Tajvidi (2002) and Bugni and Horowitz (2018) for the unpaired two-sample setting under functional data, or in Bugni and Horowitz (2018) in the more general situation of one control group against several treatment groups, is not an option in our situation. To solve this problem, we offer a bootstrap procedure to determine critical values. We can show that the bootstrap test keeps the nominal level asymptotically under the null hypothesis. Moreover, we prove the consistency of our approach with respect to the bootstrap procedure under any alternative.
While other applications are also possible, for instance to high-dimensional data, we especially focus on functional data and stock market returns. In the field of empirical finance, statistical inference of stock market prices is a widespread topic. Here, usual statistical procedures are typically applied to stock market log returns. One popular question is that of a suitable distributional model that matches the observations. For example, Göncü et al. (2016) apply different goodness-of-fit tests to data sets consisting of daily stock market index returns for several emerging and developed markets, and Gray and French (2008) consider the distribution of log stock index returns of the S&P 500 and deduce that the returns do not follow a normal distribution but that other distributional models provide a better fit. This topic is also treated on the basis of high frequency data, see Malevergne, Pisarenko, and Sornette (2005), where 5 min returns of the Nasdaq Composite index and 1 min returns of the S&P 500 are considered. Besides the topic of model selection for stock market returns, another interesting topic is the comparison of different stock market returns, as is done in Midesia et al. (2016), where annual pooled data of 100 conventional and Islamic stock returns are analyzed. In this context, dependencies between different stock prices are evident and have already been detected, see Min (2015), where the major 8 companies of the Korean stock market are investigated, and should be taken into account.
The paper is structured as follows. We first introduce the model and our general null hypothesis of marginal homogeneity in the paired sample setting in Section 2. We introduce a Cramér-von-Mises type test for the aforementioned testing problem and derive its asymptotic behavior with the help of the theory of U-statistics. The resulting asymptotic law under the null hypothesis can be transferred to a bootstrap counterpart of the test statistic. In addition to these theoretical findings, we study the small sample performance of the two resampling tests in a numerical simulation study presented in Section 3. Finally, the application to stock market indices is outlined in Section 4. We demonstrate the application of the test to the historical values of the stock market indices Nikkei Stock Average from Japan, Dow Jones Industrial Average from the US, and Standard & Poor's 500 from the US. The test confirms the intuition that the indices of the same country are much more comparable than indices of different countries. Note that all proofs are given in the Appendix.

Testing marginal homogeneity in Hilbert spaces
Let $H$ be a Hilbert space, i.e., a complete real inner product space, where the inner product is denoted by $\langle \cdot, \cdot \rangle$. We suppose that $H$ is separable with countable orthonormal basis $O = \{e_i;\ i \in I\}$, where $e_i$ is the $i$-th basis element and the index set $I$ is given by the natural numbers $I = \mathbb{N}$ or the subset $I = \{1, \dots, |I|\} \subset \mathbb{N}$. Now, let paired observations $X_j = (X_{j,1}, X_{j,2})$, $j = 1, \dots, n$, be given, that is, random variables with values in $H \times H$. We suppose that $X_1, \dots, X_n$ are independent and identically distributed, and that the distribution $P^{X_1}$ of $X_1$ is unknown. For technical reasons, we suppose that for all $i \in I$ the joint distribution of $\langle X_{1,1}, e_i \rangle$ and $\langle X_{1,2}, e_i \rangle$ is absolutely continuous with density $f_i$, where the set $\{(r, s) \in \mathbb{R}^2;\ f_i(r, s) > 0\}$ is open and convex, compare with Gaigall (2019). While we allow any dependence structure between $X_{j,1}$ and $X_{j,2}$, we would like to test the null hypothesis of marginal homogeneity $H: P^{X_{j,1}} = P^{X_{j,2}}$ versus $K: P^{X_{j,1}} \neq P^{X_{j,2}}$.
As postulated in the introduction, we first project the processes $X_{j,i}$ to the real line and then apply a Cramér-von-Mises type test. Projection is done via the inner product, i.e., we consider $\langle X_{j,i}, x \rangle$ for $x \in H$. We consider all projections $x$ from a sufficiently large projection space $h \subset H$. In fact, as explained in Ditzhaus and Gaigall (2018), the distributions of $X_{1,1}$ and $X_{1,2}$ coincide if and only if $\langle X_{1,1}, x \rangle$ and $\langle X_{1,2}, x \rangle$ have the same distribution for all $x \in h$. This motivates the following test statistic:
\[ CvM_n = \int D_n(x)\, P(dx), \tag{2.1} \]
where $P$ is a suitable probability measure on the projection space $h$ and $D_n(x)$ is the usual two-sample Cramér-von-Mises distance when applying the projection $x \in h$. Let
\[ F_{n,i}(x, r) = \frac{1}{n} \sum_{j=1}^{n} I\{\langle x, X_{j,i} \rangle \le r\}, \quad r \in \mathbb{R},\ i = 1, 2, \]
be the empirical distribution function of the real-valued random variables $\langle x, X_{1,i} \rangle, \dots, \langle x, X_{n,i} \rangle$. Then the related two-sample Cramér-von-Mises distance is given by
\[ D_n(x) = n \int \big( F_{n,1}(x, r) - F_{n,2}(x, r) \big)^2\, \bar{F}_n(x, dr), \]
where $\bar{F}_n = (F_{n,1} + F_{n,2})/2$. The probability measure $P$ can be chosen arbitrarily in advance as long as some regularity assumptions are fulfilled. While more general measures $P$ may be considered, we focus here on the following specific proposal. It is based on two probability measures $\nu_1$ and $\nu_2$ on the index set $I$ such that $\nu_j(\{i\}) > 0$ for all $i \in I$. In the case of functional data, we can choose Poisson distributions shifted by 1, for instance.
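For a single projection $x$, the distance $D_n(x)$ reduces to a finite sum, because $\bar{F}_n$ places mass $1/(2n)$ on each of the $2n$ pooled projected values. The following minimal Python sketch illustrates this under the assumption that the normalisation is $D_n(x) = n \int (F_{n,1} - F_{n,2})^2\, d\bar{F}_n$; the function names `ecdf` and `cvm_distance` are hypothetical and not taken from the paper.

```python
import numpy as np

def ecdf(sample, points):
    """Empirical distribution function of `sample`, evaluated at `points`."""
    s = np.sort(np.asarray(sample, dtype=float))
    return np.searchsorted(s, points, side="right") / s.size

def cvm_distance(p1, p2):
    """D_n(x) for projected samples p1[j] = <x, X_{j,1}> and p2[j] = <x, X_{j,2}>.

    Assumed normalisation: D_n = n * integral of (F_{n,1} - F_{n,2})^2 w.r.t.
    F_bar_n = (F_{n,1} + F_{n,2}) / 2, which puts mass 1/(2n) on each pooled point.
    """
    p1 = np.asarray(p1, dtype=float)
    p2 = np.asarray(p2, dtype=float)
    n = p1.size
    pooled = np.concatenate([p1, p2])          # support of F_bar_n
    diff = ecdf(p1, pooled) - ecdf(p2, pooled)
    return n * np.sum(diff ** 2) / (2 * n)
```

Averaging `cvm_distance` over many projections drawn from $P$ would then give a Monte Carlo approximation of $CvM_n$.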
In what follows, we specify the probability measure P by determining the procedure to generate a realization of P. This procedure is also useful to obtain the concrete value of the test statistic by Monte-Carlo simulation in applications.
Step 1. Generate a realization $k \in I$ of the distribution $\nu_1$.
Step 2. Independently of Step 1, generate $i_1, \dots, i_k \in I$ by $k$-times sampling without replacement from the distribution $\nu_2$.
Step 3. Independently of Steps 1 and 2, generate a realization $(m_1, \dots, m_k)$ of the uniform distribution on the unit sphere in $\mathbb{R}^k$.
Step 4. Set $x = \sum_{j=1}^{k} m_j e_{i_j}$.
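Steps 1 to 4 can be sketched in a few lines of Python. The sketch assumes that both $\nu_1$ and $\nu_2$ are Poisson distributions with parameter 1 shifted by 1, and that the uniform distribution in Step 3 lives on the unit sphere $\{m \in \mathbb{R}^k : \|m\| = 1\}$. The function name `draw_projection` and the dictionary representation of $x$ are illustrative choices, not part of the paper.

```python
import numpy as np

def draw_projection(rng):
    """One draw of x = sum_j m_j e_{i_j}, returned as {basis index: coefficient}."""
    # Step 1: k from nu_1, here 1 + Poisson(1) (an assumed, illustrative choice).
    k = 1 + int(rng.poisson(1.0))
    # Step 2: k distinct indices from nu_2 (again 1 + Poisson(1)). Discarding
    # repeated draws is equivalent to successive sampling without replacement.
    indices = []
    while len(indices) < k:
        i = 1 + int(rng.poisson(1.0))
        if i not in indices:
            indices.append(i)
    # Step 3: uniform point on the unit sphere in R^k (normalised Gaussian vector).
    g = rng.standard_normal(k)
    m = g / np.linalg.norm(g)
    # Step 4: coefficients of x with respect to e_{i_1}, ..., e_{i_k}.
    return dict(zip(indices, (float(c) for c in m)))
```

With this representation, the projection $\langle X, x \rangle$ is the sum of the stored coefficients multiplied by the basis coefficients $\langle X, e_i \rangle$ of the observation.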

Asymptotic theory of the test statistic
For our asymptotic approach, we let $n \to \infty$. It is well known that the Cramér-von-Mises distance $D_n$ is connected to von Mises' type functionals, also known as V-statistics, which are closely related to U-statistics. For a deeper introduction to these kinds of statistics, we refer the reader to Koroljuk and Borovskich (1994) and Serfling (2001). Our statistic $CvM_n$ can also be rewritten as a certain V-statistic and, thus, the same theory can be applied to obtain the following result.
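Theorem 1 Let $\tau_1, \tau_2, \dots$ be a sequence of independent standard normal distributed random variables. Under the null hypothesis $H$,
\[ CvM_n \xrightarrow{d} Z = \sum_{i=1}^{\infty} \lambda_i \tau_i^2 \quad \text{as } n \to \infty, \tag{2.2} \]
where $(\lambda_i)_{i \in \mathbb{N}}$ is a sequence of non-negative numbers with $\sum_{i=1}^{\infty} \lambda_i < \infty$ and $\lambda_i > 0$ for at least one $i \in \mathbb{N}$, implying that the distribution function of $Z$ is continuous and strictly increasing on the non-negative half-line.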
Theorem 2 Under the alternative $K$, our statistic $CvM_n$ diverges, i.e., $CvM_n \xrightarrow{p} \infty$ as $n \to \infty$.
In general, the test statistic $CvM_n$ is not distribution-free under the null hypothesis, i.e., its distribution depends on the unknown distribution of $X_1$. As can be seen in the proofs, the same applies to $Z$. Given a significance level $\alpha \in (0, 1)$, neither a $(1 - \alpha)$-quantile $c_{n,1-\alpha}$ of $CvM_n$ nor the $(1 - \alpha)$-quantile $c_{1-\alpha}$ of $Z$ is available as critical value in applications. To resolve this problem, we propose to estimate the quantiles via bootstrapping in the spirit of Efron (1979) and follow the idea in Gaigall (2019), where the usual two-sample Cramér-von-Mises distance is applied to bivariate random vectors with values in $\mathbb{R}^2$. Note that under the null hypothesis $H$ the expectations $E[F_{n,1}(x, y)] = E[F_{n,2}(x, y)]$, $(x, y) \in H \times \mathbb{R}$, coincide and, thus, we can rewrite our test statistic into
\[ CvM_n = n \int\!\!\int \Big( F_{n,1}(x, y) - E[F_{n,1}(x, y)] - \big\{ F_{n,2}(x, y) - E[F_{n,2}(x, y)] \big\} \Big)^2\, \bar{F}_n(x, dy)\, P(dx). \]
Denote by $X^*_{jn} = (X^*_{jn,1}, X^*_{jn,2})$, $j = 1, \dots, n$, a bootstrap sample from the original observations $X_j$, $j = 1, \dots, n$, obtained by $n$-times sampling with replacement. Let $F^*_{n,i}$ and $\bar{F}^*_n$ be the bootstrap counterparts of $F_{n,i}$ and $\bar{F}_n$. Clearly, the conditional expectation of $F^*_{n,i}(x, y)$ given the data $(X_j)_{j=1,\dots,n}$ equals $F_{n,i}(x, y)$. Consequently, the bootstrap counterpart of our test statistic is
\[ CvM^*_n = n \int\!\!\int \Big( F^*_{n,1}(x, y) - F_{n,1}(x, y) - \big\{ F^*_{n,2}(x, y) - F_{n,2}(x, y) \big\} \Big)^2\, \bar{F}^*_n(x, dy)\, P(dx). \]
Let $c^*_{n,1-\alpha}$ be a $(1 - \alpha)$-quantile of $CvM^*_n$ given the original observations $X_1, \dots, X_n$. In applications, concrete values of $c^*_{n,1-\alpha}$ are obtained by Monte-Carlo simulation. In the proofs, we show that the bootstrap statistic asymptotically mimics the limiting null distribution under the null hypothesis $H$, implying that $c^*_{n,1-\alpha}$ is an appropriate estimator for the unknown quantile $c_{n,1-\alpha}$ or $c_{1-\alpha}$, while $CvM^*_n$ and $c^*_{n,1-\alpha}$ remain asymptotically finite under general alternatives. This results in an asymptotically exact and consistent bootstrap test, which rejects $H$ whenever $CvM_n > c^*_{n,1-\alpha}$.
[Table fragment (empirical sizes in %; entries r, α = 5%, α = 10%): 0, 5.2, 9.9; 0.25, 5.1, 9.5; 0.5, 2.9, 7.7; row label B_{j,1}(t) + 2t(1 − t), B_{j,2}(t) + 2t(1 − t); 0, 5.8, 10.7; 0.25, 5.2, 11.2; 0.5, 3.7, 8.1]
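The bootstrap procedure can be sketched in Python as follows. The sketch works on pre-computed projections and makes several assumptions that are not taken from the paper: the integral over $P$ is replaced by an average over $M$ fixed projections, the normalisation of the statistic is $n \int (\cdot)^2\, d\bar{F}_n$, the bootstrap statistic is centred at the original empirical distribution functions, and the p-value is the share of bootstrap values that are at least as large as the observed statistic. All function names are hypothetical.

```python
import numpy as np

def _ecdf(sample, points):
    """Empirical distribution function of `sample`, evaluated at `points`."""
    s = np.sort(sample)
    return np.searchsorted(s, points, side="right") / s.size

def cvm_statistic(P1, P2):
    """Monte Carlo approximation of CvM_n.

    P1, P2 have shape (n, M); column m holds <x_m, X_{j,i}> for j = 1..n.
    """
    n, M = P1.shape
    total = 0.0
    for m in range(M):
        pooled = np.concatenate([P1[:, m], P2[:, m]])
        d = _ecdf(P1[:, m], pooled) - _ecdf(P2[:, m], pooled)
        total += n * np.mean(d ** 2)       # mean over 2n points = integral w.r.t. F_bar_n
    return total / M

def cvm_bootstrap_statistic(P1, P2, idx):
    """Bootstrap counterpart for the resampled pair indices `idx`."""
    n, M = P1.shape
    total = 0.0
    for m in range(M):
        a, b = P1[:, m], P2[:, m]
        a_star, b_star = a[idx], b[idx]    # pairs are resampled jointly
        pooled = np.concatenate([a_star, b_star])
        d = (_ecdf(a_star, pooled) - _ecdf(a, pooled)) \
            - (_ecdf(b_star, pooled) - _ecdf(b, pooled))
        total += n * np.mean(d ** 2)
    return total / M

def bootstrap_p_value(P1, P2, n_boot=1000, seed=0):
    """Observed statistic and bootstrap p-value."""
    rng = np.random.default_rng(seed)
    n = P1.shape[0]
    observed = cvm_statistic(P1, P2)
    boot = np.array([
        cvm_bootstrap_statistic(P1, P2, rng.integers(0, n, size=n))
        for _ in range(n_boot)
    ])
    return observed, float(np.mean(boot >= observed))
```

Resampling the index vector once per bootstrap iteration keeps each pair $(X_{j,1}, X_{j,2})$ together, so the dependence within pairs is preserved in the bootstrap sample.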

Simulations
Remembering that our test is suitable for random variables $X_{j,i}$, $i = 1, 2$, $j = 1, \dots, n$, with values in a general separable Hilbert space, we consider the separable Hilbert space $L^2[0, 1]$ of all measurable and square integrable real-valued functions on $[0, 1]$. This space is endowed with the usual inner product $\langle \cdot, \cdot \rangle$, and the normalized Legendre polynomials form a corresponding orthonormal basis $O = \{e_i;\ i \in I\}$, $I = \mathbb{N}$. We obtain our test statistic (2.1) by Monte-Carlo simulation based on 500 replications following Steps 1-4 from Section 2. In Step 1 and Step 2, we choose a Poisson distribution with parameter 1 shifted by 1, i.e., the distribution of $N + 1$ for $N \sim \mathrm{Pois}(1)$. In our simulations, the stochastic processes $X_{j,i} = (X_{j,i}(t);\ t \in [0, 1])$ have the form [...] Tables 1 and 2, respectively. The simulations are conducted for parameters $r \in \{0, 0.25, 0.5\}$, $a_i \in \{1, 1.5, 2, 2.5\}$, and $b_i \in \{0, 0.5, 1, 1.5, 2\}$, the sample size $n = 20$, and significance levels $\alpha \in \{5\%, 10\%\}$. The empirical sizes are in almost all cases in a reasonable range around the nominal level $\alpha$. A systematic exception to this observation are the sizes of the bootstrap approach under the strong dependence setting ($r = 0.5$). In this case, the bootstrap decisions are rather conservative, with corresponding empirical sizes from 2.6% to 4.3% with an average of 3.5% for $\alpha = 5\%$, as well as values from 6.4% up to 8.4% and an average of 7.6% for $\alpha = 10\%$. With regard to the data example discussed in the upcoming section, we primarily studied the sample size $n = 20$ here. To show that the power values grow for increasing sample sizes $n \in \{20, 30, \dots, 70\}$, we conducted additional simulations for two specific alternatives, $X_{j,1}(t) = B_{j,1}(t)$ and $X_{j,2}(t) = 1.5 B_{j,2}(t)$ as well as $X_{j,1}(t) = B_{j,1}(t)$ and $X_{j,2}(t) = B_{j,2}(t) + t(1 - t)$, $t \in [0, 1]$, $j = 1, \dots, n$, under moderate ($r = 0.25$) dependency, see Figure 1 for the results.
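To make the role of the Legendre basis concrete, the following Python sketch computes basis coefficients $\langle X, e_i \rangle$ and projections $\langle X, x \rangle$ for a function on $[0, 1]$. It assumes the indexing $e_i(t) = \sqrt{2i - 1}\, P_{i-1}(2t - 1)$, $i = 1, 2, \dots$, with $P_k$ the classical Legendre polynomial of degree $k$, and approximates the integrals by Gauss-Legendre quadrature. The function names and the dictionary representation of $x$ are illustrative assumptions.

```python
import numpy as np
from numpy.polynomial import legendre as leg

def legendre_basis(i, t):
    """i-th normalised Legendre polynomial on [0, 1], i = 1, 2, ...

    Assumed indexing: e_i(t) = sqrt(2i - 1) * P_{i-1}(2t - 1).
    """
    coef = np.zeros(i)
    coef[i - 1] = 1.0                      # selects P_{i-1} in the Legendre series
    return np.sqrt(2 * i - 1) * leg.legval(2 * np.asarray(t) - 1, coef)

def basis_coefficient(func, i, n_nodes=64):
    """<func, e_i> = integral over [0, 1] of func(t) e_i(t), by Gauss-Legendre quadrature."""
    nodes, weights = leg.leggauss(n_nodes) # nodes and weights on [-1, 1]
    t = (nodes + 1) / 2                    # map the nodes to [0, 1]
    return 0.5 * np.sum(weights * func(t) * legendre_basis(i, t))

def project(func, x):
    """<func, x> for x given as {basis index: coefficient}."""
    return sum(c * basis_coefficient(func, i) for i, c in x.items())
```

For paths that are only observed on a grid, `func` could be an interpolant of the observed values; the quadrature is then only as accurate as that interpolation.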

Applications to stock market returns
In a possible application, the observations are obtained from stock market returns. Concretely, we consider two stock price processes and a time period $[0, T_0]$, where $T_0 \in (0, \infty)$ is a time horizon. The stock price processes are denoted by $Y_i = (Y_i(t);\ t \in [0, T_0])$, $i = 1, 2$, where $Y_i$ is a random variable which takes values in the space of all measurable and square integrable real-valued functions on $[0, T_0]$. Additional structural assumptions on the underlying stochastic process for the stock prices are required. In the classical models for stock prices, i.e., the exponential Lévy model, Black-Scholes model, and Merton model, the structural assumptions mentioned are independence and stationarity of the increments of a Lévy process. Seasonality effects, that is, specific trends during certain periods of the stock price processes, can disturb the stationarity assumption. Figure 2 shows the mean monthly indices (open) of the Nikkei Stock Average (Nikkei 225), Dow Jones Industrial Average (DJIA), and Standard & Poor's 500 (S&P 500) from 01/01/1999 to 01/01/2019, where seasonality effects are clearly seen. We tackle this problem by splitting up the time horizon into $n$ equally sized time intervals to obtain $n$ observations for each stock price process. To be more specific, let $T_0 = nT$ for $T \in (0, \infty)$; then we consider the time periods $[0, T], \dots, [(n - 1)T, nT]$, and our observations are the log-returns during these periods [...] A corresponding orthonormal basis is given by normalized Legendre polynomials. The well-known models for stock prices mentioned imply an independent and identically distributed structure of the increments $X_{j,i}$. In models with time-dependent or stochastic volatility (volatility clustering, for instance), this structure may be violated. In fact, the theory of U-statistics is well-developed and also covers cases where the independent and identically distributed data structure is disturbed, see Chapters 2.3 and 2.4 of Lee (1990).
We point out that our results are derived by application of the theory of U-statistics in the independent and identically distributed data case. Under suitable regularity conditions, it should be possible to extend and modify the approach to more general situations.
In what follows, we demonstrate the application of our test to the values (open) of the stock market indices Nikkei Stock Average of Japan, Dow Jones Industrial Average of the US, and Standard & Poor's 500 of the US for the time period 01/01/1999 to 01/01/2019. To demonstrate how the test works in applications, we limit ourselves to the monthly values and a linear interpolation. The resulting time series are presented in Figure 3 and can be seen as square integrable functions on the interval $[0, T_0]$ for $T_0 = 20$ (years). To cover the seasonality effects indicated by Figure 2, we split the time horizon of 20 years into 20 subintervals each representing one year, i.e., $T = 1$ and $n = 20$. We apply our method to do pairwise comparisons of the indices, where the test statistic is again approximated by 500 random projections following Steps 1-4 and the shifted Poisson distribution is used in Step 1 and Step 2 as in Section 3. The resulting p-values for the bootstrap approach are displayed in Table 3 for 5000 resampling iterations. Since DJIA and S&P 500 both reflect the US market, it is not surprising that the test leads to a very high p-value and, thus, does not reject the null hypothesis. Comparisons of each of these US indices with the Japanese Nikkei 225 lead to p-values around the typically used 5% benchmark. This is in line with the first graphical impression given by Figure 3.
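The splitting of a price series into yearly log-return curves can be sketched in Python as follows. The sketch assumes that prices are observed on an equidistant grid covering $[0, T_0]$ including both end points, and that the log-return curve of period $j$ is $X_j(t) = \log Y((j - 1)T + t) - \log Y((j - 1)T)$, $t \in [0, T]$. This form of the log-return and the function name `log_return_curves` are assumptions made for illustration.

```python
import numpy as np

def log_return_curves(prices, n_periods):
    """Split one price series into n_periods log-return curves.

    `prices` must hold n_periods * steps + 1 values on an equidistant grid,
    e.g. 241 monthly values for 20 years with 12 steps per year.
    Row j of the result is log Y((j-1)T + t) - log Y((j-1)T) on the grid of period j.
    """
    prices = np.asarray(prices, dtype=float)
    steps, rem = divmod(prices.size - 1, n_periods)
    if rem != 0 or steps == 0:
        raise ValueError("series length must be n_periods * steps + 1")
    log_p = np.log(prices)
    curves = np.empty((n_periods, steps + 1))
    for j in range(n_periods):
        segment = log_p[j * steps:(j + 1) * steps + 1]
        curves[j] = segment - segment[0]   # log-return relative to the period start
    return curves
```

Values between grid points could then be obtained by linear interpolation, for instance with `np.interp`, before computing projections of the curves.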

A Proofs
We prove Theorem 1 in a more general way. Instead of $CvM_n$ we consider [...]
Theorem 4 Let $\tau_1, \tau_2, \dots$ be a sequence of independent standard normal distributed random variables. Under the null hypothesis $H$ as well as under any alternative, we have
\[ S_n \xrightarrow{d} Z = \sum_{i=1}^{\infty} \lambda_i \tau_i^2 \quad \text{as } n \to \infty, \]
where $(\lambda_i)_{i \in \mathbb{N}}$ is a sequence of non-negative numbers with $\sum_{i=1}^{\infty} \lambda_i < \infty$ and $\lambda_i > 0$ for at least one $i \in \mathbb{N}$, implying that the distribution function of $Z$ is continuous and strictly increasing on the non-negative half-line.
Proof Let $x_j = (x_{j,1}, x_{j,2}) \in H^2$, $j \in \mathbb{N}$. We introduce the asymmetric kernel $f$ given by [...] as well as its symmetric version $\varphi$ defined by [...] Clearly, [...] It is easy to check that for every $x_3 \in H^2$ we have [...] is not constant with probability one. This can easily be verified and also follows from our considerations below. Moreover, we can deduce from (A.3) and the independence of the random variables that [...] In all, $\varphi$ is degenerate of order 1, see Arcones and Giné (1992) for a detailed definition. Hence, we can deduce from Theorem 3.5 of Arcones and Giné (1992) that $S_n - E[S_n]$ converges in distribution to $Z$ as $n \to \infty$ if and only if [...] The new kernel $\tilde{\varphi}$ is a projection of $\varphi$ to a corresponding function space; details are carried out in Arcones and Giné (1992). By (A.3) and (A.4) we can simplify it as follows [...] Thus, we can rewrite $V_n$ as [...] By Lemma 1, see below, $f$ is a degenerate and bounded Mercer kernel. Due to the degeneracy of the kernel, the map $g(\cdot) \mapsto E[f(X_1, \cdot)\, g(X_1)]$ defines a Hilbert-Schmidt operator in the space of all square integrable functions on $H^2$ with respect to $P^{X_1}$, see also Section 4.3 in Koroljuk and Borovskich (1994). In this space, there exists an orthonormal basis of (centered) eigenfunctions $(\phi_i)_{i \in \mathbb{N}}$ of this integral operator with corresponding non-negative eigenvalues $(\lambda_i)_{i \in \mathbb{N}}$. In detail, [...] for all $i, j \in \mathbb{N}$ and $x \in H^2$. Since $f$ is a bounded Mercer kernel, we obtain, in analogy to the argumentation of Leucht and Neumann (2013) in their proof of Theorem 2.1, from an extension of the Theorem of Mercer (1909) by Sun (2005) that [...] for all $x, y$ in the support of $P^{X_1}$, where the sum converges absolutely and uniformly on every compact subset of the Cartesian square of the support of $P^{X_1}$. In particular, we obtain [...] Regarding (A.6) we have [...] From the orthogonality of $(\phi_i)_{i \in \mathbb{N}}$ and the multivariate central limit theorem we can conclude that for each fixed $k \in \mathbb{N}$ [...] where $\tau_1, \tau_2, \dots$ are independent and standard normal distributed.
Combining this, (A.7) and a standard truncation argument, compare to Theorems 4.3.1 and 4.3.2 of Koroljuk and Borovskich (1994) as well as the corresponding proofs, yields [...] Consequently, (2.2) follows. It remains to show that $\lambda_i > 0$ for at least one $i \in \mathbb{N}$. Let us suppose for a moment that $\lambda_i = 0$ for all $i \in \mathbb{N}$. Then, it follows from (2.2) that $CvM_n \xrightarrow{P} 0$ as $n \to \infty$. Moreover, our assumptions on the joint distribution of $\langle X_{1,1}, e_i \rangle$ and $\langle X_{1,2}, e_i \rangle$, $i \in I$, ensure that Assumption 1 and Assumption 2 in Gaigall (2019) are satisfied, and it follows from Theorem 1 and Theorem 3 in Gaigall (2019) that $D_n(e_1) \xrightarrow{d} S$ as $n \to \infty$, where $S$ is a real-valued random variable and not constantly zero. In all, from $CvM_n \ge D_n(e_1)\, \nu_1(\{1\}) \xrightarrow{d} S \cdot \nu_1(\{1\})$ as $n \to \infty$ together with $\nu_1(\{1\}) > 0$ we obtain a contradiction. This completes the proof.
Lemma 1 The kernel $f$ is a degenerate and bounded Mercer kernel, i.e., it is continuous, symmetric and positive semidefinite.
Proof It is easy to see that $f$ is bounded and symmetric. The degeneracy follows immediately from (A.3). For arbitrary $k \in \mathbb{N}$ let $c_1, \dots, c_k \in \mathbb{R}$. Then $\sum_{i,j=1}^{k}$ [...] Hence, $f$ is positive semidefinite. For the continuity proof, let $(x_{1n})_{n \in \mathbb{N}}$ and $(x_{2n})_{n \in \mathbb{N}}$ be sequences in $H^2$ such that $\lim_{n \to \infty} x_{jn} = x_j \in H^2$, $j = 1, 2$. By Lemma 3.1 of Ditzhaus and Gaigall (2018) [...] for $P \times P^{X_{1,\ell}}$-almost all $(x, x_{3,\ell})$ and every $y \in H$. This and the continuity of the inner product imply for $P$-almost all $x$ and every $m, \ell \in \{1, 2\}$ that $\lim_{n \to \infty} I\{\langle x, x_{jn,m} \rangle \le \langle x, X_{3,\ell} \rangle\} = I\{\langle x, x_{j,m} \rangle \le \langle x, X_{3,\ell} \rangle\}$ with probability one.
Proof of Theorem 1 Since $F_1 = F_2$ and, thus, $S_n = CvM_n$ under the null hypothesis, the statement follows immediately from Theorem 4.
Proof of Theorem 2 First, observe that [...] $\bar{F}_n(x, dy)\, P(dx)$.
By Theorem 4, $n^{-1} S_n$ converges in probability to 0. By the Cauchy-Schwarz inequality, the absolute value of the second summand is bounded from above by $2 \sqrt{n^{-1} S_n}$. In particular, the second summand vanishes in probability as well. The third summand can be rewritten as $\frac{1}{2n}$ [...] for an appropriate function $g$. By the strong law this sum converges almost surely to [...] where $Q_k$ is the distribution introduced at the end of the proof of Theorem 4. In analogy to the argumentation of Ditzhaus and Gaigall (2018) in the proof of their Theorem 3.2, we can conclude that each summand from (A.10) is strictly positive. Finally, we obtain [...]
Proof of Theorem 3 From now on, we suppose that the data $X_1, \dots, X_n$ are fixed and we operate on the conditional space. In particular, we can treat $F_{n,i}$, $i = 1, 2$, as a non-random function, which converges, without loss of generality, pointwise to $F_i$. First, we remark that the distribution of the bootstrap sample depends on the sample size. Moreover, the distribution of $X^*_{in}$ converges weakly to the distribution of $X_i$. Thus, by Theorem 1.10.4 of Van der Vaart and Wellner (1996) we can assume without loss of generality that $X^*_{in}$ converges to $X'_i$ for all $i \in \mathbb{N}$ with probability one, where $X'_i$ has the same distribution as $X_i$, and that $X'_r$ is independent of $X^*_{1n}, \dots, X^*_{(r-1)n}, X^*_{(r+1)n}, \dots$ for all $r \in \mathbb{N}$. Now, define [...] Then we have [...] where $f$ is defined in (A.2). By Theorem 4, $S'_n$ converges in distribution to $Z$. Combining this and [...] under the null hypothesis, where the proof of (A.11) is given later, yields conditional convergence [...] To sum up, we need to verify (A.11) for the statement under $H$ and (A.12) for the statement under $K$. For this purpose, we divide the corresponding sum in (A.11) into the following six sums
\[ I_{n,p} = n^{-4} \sum_{i_1, \dots, i_6 = 1}^{n} E[\kappa_{n,i_1,i_2,i_3}\, \kappa_{n,i_4,i_5,i_6}]\; I\{|\{i_1, \dots, i_6\}| = p\}, \quad p = 1, \dots, 6. \]
Thus, (A.13) follows again from (A.14), (A.15), the continuity of the inner product and the convergence of $X^*_{1n}$ and $X^*_{2n}$.