1 Introduction

Due to the availability of high-frequency data, statistical observations can often be described and modeled by random functions, so-called stochastic processes. Since classical methods are designed for vector-valued observations rather than for stochastic processes, they usually cannot be applied in this situation. The field of functional data analysis tries to close this gap. One popular way to tackle this problem is to project the random functions onto the real line and then apply one of the classical methods. For example, Cuesta-Albertos et al. (2006, 2007) applied the Kolmogorov–Smirnov goodness-of-fit test to randomly projected square-integrable functions. Cuevas and Fraiman (2009) extended this idea to more general spaces, and Ditzhaus and Gaigall (2018) did the same for observations with values in a general Hilbert space. In this way, stochastic processes as well as high-dimensional data can be treated simultaneously. More interaction between these two fields is desirable, as stated by Goia and Vieu (2016) and Cuevas (2014) from the functional data community as well as by Ahmed (2017) from the high-dimensional side. Extending the idea of goodness-of-fit testing, Bugni et al. (2009) studied the problem of testing whether the underlying distribution belongs to a pre-specified parametric family. Hall and Van Keilegom (2002) discussed pre-processing of functional data, which in practice is usually available only at finitely many time points, in the context of two-sample testing. Recently, the two-sample testing problem for multivariate functional data was discussed by Jiang et al. (2017) based on empirical characteristic functions.

In this paper, we also use the projection idea mentioned above, but we address a paired-sample testing problem allowing for dependence. This approach and our test are completely new. We suggest a procedure for testing marginal homogeneity in separable Hilbert spaces, where we consider not just a few random projections but all projections from a sufficiently large projection space. The advantage of our approach is that no additional randomness influences the result of the test. With a view to the consistency of the testing procedure, we apply a test statistic of Cramér–von-Mises type. See Anderson and Darling (1952) and Rosenblatt (1952) for Cramér–von-Mises tests in the usual cases of real-valued random variables and random vectors with real components, and Gaigall (2019) for the application of a Cramér–von-Mises test for the null hypothesis of marginal homogeneity for bivariate distributions on the Cartesian square of the real line. However, the demand for consistency has a price: the distribution of the test statistic under the null hypothesis is unknown, and so the related quantiles are not available in practice. In contrast with the unpaired setting, exchangeability is not given in general under the null hypothesis of marginal homogeneity. This is already known for bivariate distributions on the Cartesian square of the real line, see Gaigall (2019). For that reason, a permutation test, as used by Hall and Tajvidi (2002) for the unpaired two-sample setting under functional data, or by Bugni and Horowitz (2018) in the more general situation of one control group against several treatment groups, is not an option in our situation. To solve this problem, we offer a bootstrap procedure to determine critical values. We show that the bootstrap test keeps the nominal level asymptotically under the null hypothesis. Moreover, we prove the consistency of the bootstrap procedure under any alternative.

While other applications are also possible, for instance to high-dimensional data, we especially focus on functional data and stock market returns. In the field of empirical finance, statistical inference on stock market prices is a widespread topic. Here, usual statistical procedures are typically applied to stock market log-returns. One popular question is that of a suitable distributional model that matches the observations. For example, Göncü et al. (2016) apply different goodness-of-fit tests to data sets consisting of daily stock market index returns for several emerging and developed markets, and Gray and French (2008) consider the distribution of log stock index returns of the S&P 500 and deduce that the distributions do not follow a normal distribution but are better captured by other distributional models. This topic is also treated based on high-frequency data, see Malevergne et al. (2005), where 5-min returns of the Nasdaq Composite index and 1-min returns of the S&P 500 are considered. Besides the topic of model selection for stock market returns, another interesting topic is the comparison of different stock market returns, as is done in Midesia et al. (2016), where annual pooled data of 100 conventional and Islamic stock returns are analyzed. In this context, the dependence of the different stock prices is obvious and already detected, see Min (2015), where the 8 major companies of the Korean stock market are investigated, and should be taken into account.

The paper is structured as follows. We first introduce the model and our general null hypothesis of marginal homogeneity in the paired-sample setting in Sect. 2. We introduce a Cramér–von-Mises type test for the aforementioned testing problem and derive its asymptotic behavior with the help of the theory of U-statistics. The resulting asymptotic law under the null hypothesis can be transferred to a bootstrap counterpart of the test statistic. In addition to these theoretical findings, we study the small-sample performance of the bootstrap test in a numerical simulation study presented in Sect. 3. Finally, the application to stock market indices is outlined in Sect. 4. We demonstrate the application of the test to the historical values of the stock market indices Nikkei Stock Average from Japan, Dow Jones Industrial Average from the US, and Standard & Poor’s 500 from the US. The test confirms the intuition that indices of the same country are much more comparable than indices of different countries. All proofs are given in the Appendix.

2 Testing marginal homogeneity in Hilbert spaces

Let H be a Hilbert space, i.e., a complete real inner product space, where the inner product is denoted by \(\langle \cdot ,\cdot \rangle \). We suppose that H is separable with countable orthonormal basis \(O=\lbrace e_i;i\in I\rbrace \), where \(e_i\) is the i-th basis element and the index set I is given by the natural numbers \(I={\mathbb {N}}\) or a subset \(I=\{1,\dots ,|I|\}\subset {\mathbb {N}}\). Now, let paired observations be given

$$\begin{aligned} X_{j}= (X_{j,1},X_{j,2} ),~j=1,\dots ,n, \end{aligned}$$

that are random variables with values in \(H\times H\). We suppose that \(X_1,\ldots ,X_n\) are independent and identically distributed, and we suppose that the distribution \(P^{X_1}\) of \(X_1\) is unknown. For technical reasons, we suppose that for all \(i\in I\) the joint distribution of \(\langle X_{1,1},e_i\rangle \) and \(\langle X_{1,2},e_i\rangle \) is absolutely continuous with density \(f_i\), where the set \(\{(r,s)\in {\mathbb {R}}^2;f_i(r,s)>0\}\) is open and convex, compare with Gaigall (2019). While we allow any dependence structure between \(X_{j,1}\) and \(X_{j,2}\), we wish to test the null hypothesis of marginal homogeneity

$$\begin{aligned} {\mathcal {H}}: P^{X_{j,1}} = P^{X_{j,2}} \quad \text {versus}\quad {\mathcal {K}}: P^{X_{j,1}} \ne P^{X_{j,2}}. \end{aligned}$$

Our main application is the functional data case, where our observations are measurable and square-integrable real-valued functions \(X_{j,k} (t)\), \(t\in [0,T]\), \(k=1,2\), \(j=1,\dots ,n\), on the interval [0, T], and we consider the specific space \(H=L^2[0,T]\) containing all measurable and square-integrable real-valued functions on the interval [0, T] of length \(T\in (0,\infty )\), equipped with the usual inner product \(\langle f, g \rangle = \int _0^T f(x)g(x) \,\mathrm { d }x\), \(f, g\in H\). A corresponding orthonormal basis is given by the normalized Legendre polynomials. Another possible application is the high-dimensional case. Introducing the dimension \(d=|I|\), considering the index set \(I=\{1,\dots ,d\}\subset {\mathbb {N}}\), and setting \({\mathbb {R}}^{I}=\{f;f:I\rightarrow {\mathbb {R}}\}\), we consider random vectors \(X_{1,k}\), \(k=1,2\), seen as random variables with values in the separable Hilbert space \(H=\lbrace x;x\in {\mathbb {R}}^{I},\sum _{i\in I}x(i)^2r_{i}<\infty \rbrace \) with inner product \(\langle x,x'\rangle =\sum _{i\in I}x(i)x'(i)r_{i}\), \((x,x')\in H^2\), where the elements \(e_{i}={\delta _i}/{\sqrt{r_{i}}}\), \(i\in I\), define a countable orthonormal basis \(O=\lbrace e_{i};i\in I\rbrace \) and \(\eta =\sum _{i\in I}r_{i}\delta _i\) is a measure on the index set I with weights \(r_{i}\in (0,\infty )\), \(i\in I\). This point of view enables an extension to the infinite-dimensional case \(d\in {\mathbb {N}}\cup \{\infty \}\), where \(I={\mathbb {N}}\) in the infinite-dimensional case \(d=\infty \) and \(X_{1,k}=(X_{1,k}(i))_{i\in I}\), \(k=1,2\), are now random sequences.
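As an illustration of this weighted sequence-space construction, the inner product and the basis elements can be coded directly for finite d; a minimal sketch (the function names are our own, and indices are 0-based here in contrast to the 1-based index set I):

```python
import math

def weighted_inner(x, y, r):
    """<x, y> = sum_i x(i) y(i) r_i on the weighted sequence space,
    for coordinate lists x, y and positive weights r of equal length d."""
    return sum(a * b * w for a, b, w in zip(x, y, r))

def basis_element(i, r):
    """e_i = delta_i / sqrt(r_i): zero in every coordinate except
    1/sqrt(r_i) at position i."""
    e = [0.0] * len(r)
    e[i] = 1.0 / math.sqrt(r[i])
    return e
```

With these definitions, \(\langle e_i, e_i\rangle = 1\) and \(\langle e_i, e_j\rangle = 0\) for \(i\ne j\), confirming that the \(e_i\) form an orthonormal system.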

As postulated in the introduction, we first project the processes \(X_{j,i}\) onto the real line and then apply a Cramér–von-Mises type test. With a view to the consistency of the testing procedure, we choose a Cramér–von-Mises type test statistic. We note that a Kolmogorov–Smirnov type test statistic, or a test statistic obtained by adding a suitable weight function, can be applied analogously. For the investigation of the asymptotic properties of our test, we use that the Cramér–von-Mises distance is connected to von Mises type functionals, also known as V-statistics. For that reason, the asymptotic properties of a test based on another test statistic have to be treated separately. In our approach, projection is done via the inner product, i.e., we consider \(\langle X_{j,i},x\rangle \) for \(x\in H\). We consider all projections x from a sufficiently large projection space \(h\subset H\). In fact, as explained in Ditzhaus and Gaigall (2018), the distributions of \(X_{1,1}\) and \(X_{1,2}\) coincide if and only if \(\langle X_{1,1}, x\rangle \) and \(\langle X_{1,2}, x \rangle \) have the same distribution for all projections \(x\in h\), where

$$\begin{aligned} h = \left\{ \sum _{j=1}^k m_{j} e_{i_j}; k\in I,i_1,\ldots , i_k\in I, i_1< \cdots < i_k, \sum _{j=1}^k m_{j}^2 = 1 \right\} . \end{aligned}$$

This motivates the following test statistic:

$$\begin{aligned} \mathrm{{\text {CvM}}}_n=\int D_n(x){\mathcal {P}}(\mathrm d x), \end{aligned}$$

where \({\mathcal {P}}\) is a suitable probability measure on the projection space h and \(D_n(x)\) is the usual two-sample Cramér–von-Mises distance when applying the projection \(x\in h\). Let

$$\begin{aligned} \begin{aligned} F_{n,i }(x,r)=\frac{1}{n}\sum _{j=1}^n \mathrm{{\text {I}}}_{\langle x,X_{j,i}\rangle \le r},~(x,r)\in H\times {\mathbb {R}}, ~i=1,2, \end{aligned} \end{aligned}$$

be the empirical distribution function of the real-valued random variables \( \langle x,X_{1,i}\rangle ,\dots ,\langle x,X_{n,i}\rangle \). Then, the related two-sample Cramér–von-Mises distance is given by

$$\begin{aligned} \begin{aligned} D_n(x)= n \int [ F_{n,1}(x,r)- F_{n,2}(x,r)]^2{\bar{F}}_n(x,\mathrm d r), \end{aligned} \end{aligned}$$

where \({\bar{F}}_n = (F_{n,1}+ F_{n,2})/2\). While more general measures \({\mathcal {P}}\) may be considered (compare to Ditzhaus and Gaigall 2018), we focus here on the following specific proposal. It is based on two probability measures \(\nu _1\) and \(\nu _2\) on the index set I such that \(\nu _j(\{i\})>0\) for all \(i\in I\) and \(j=1,2\); these can be chosen arbitrarily in advance. In the case of an infinite-dimensional Hilbert space with orthonormal basis \(O=\{e_i;i\in {\mathbb {N}}\}\), it is possible to choose Poisson distributions shifted by 1. For Hilbert spaces of finite dimension \(d<\infty \) with orthonormal basis \(O=\{e_i;i=1,\ldots ,d\}\), it is possible to choose the uniform distribution on \(\{1,\ldots ,d\}\) for \(\nu _1\) and \(\nu _2\). Given these two probability measures, we generate a realization of \({\mathcal {P}}\) as follows:

Step 1. Generate a realization \(k\in I\) of the distribution \(\nu _1\).

Step 2. Independently of Step 1, generate \(i_1,\dots ,i_k\in I\) by k-times sampling without replacement from the distribution \(\nu _2\).

Step 3. Independently of Steps 1 and 2, generate a realization \((m_1,\dots ,m_k)\) of the uniform distribution on the unit sphere in \({\mathbb {R}}^{k}\).

Step 4. Set \(x=\sum _{j=1}^k m_je_{i_j}\).
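Steps 1–4 are straightforward to implement. The following minimal Python sketch assumes the choice \(\nu _1=\nu _2=\) Pois(1) shifted by 1 mentioned above; the function names are illustrative:

```python
import math
import random

def shifted_poisson(lam=1.0, rng=random):
    """Draw from Pois(lam) shifted by 1 via inverse-transform sampling."""
    u, p, k = rng.random(), math.exp(-lam), 0
    s = p
    while u > s:
        k += 1
        p *= lam / k
        s += p
    return k + 1

def sample_projection(rng=random):
    """One realization of P via Steps 1-4, returned as {basis index: m_j}."""
    k = shifted_poisson(rng=rng)                 # Step 1: k ~ nu_1
    idx = set()
    while len(idx) < k:                          # Step 2: k distinct indices,
        idx.add(shifted_poisson(rng=rng))        # sampling without replacement
    g = [rng.gauss(0.0, 1.0) for _ in range(k)]  # Step 3: uniform on the unit
    norm = math.sqrt(sum(v * v for v in g))      # sphere via normalized
    m = [v / norm for v in g]                    # Gaussian coordinates
    return dict(zip(sorted(idx), m))             # Step 4: x = sum_j m_j e_{i_j}
```

The returned dictionary represents \(x=\sum _{j} m_{j} e_{i_j}\) by its nonzero basis coefficients; by construction, \(\sum _j m_j^2=1\), so \(x\in h\).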

This step-by-step procedure determines \({\mathcal {P}}\) uniquely and is considered throughout the remainder of the paper:

Assumption 1

Consider \({\mathcal {P}}\) given by Steps 1–4.

As stated above, the concrete value of the test statistic can be obtained by Monte Carlo simulation. A possible implementation of the procedure is analogous to the implementation discussed in Section 2.2 in Ditzhaus and Gaigall (2018). In particular, a finite number of realizations from the probability measure \({\mathcal {P}}\) can be used for the approximation of the integral with respect to \({\mathcal {P}}\) in applications. In this regard, the practical implementation of the procedure is related to the proposal considered in Cuesta-Albertos et al. (2007), where the authors treat a one-sample goodness-of-fit problem for functional data. In detail, the maximum over Kolmogorov–Smirnov type test statistics for a finite and fixed number of projections is considered there; the result is a randomized test. In contrast, we treat a marginal homogeneity problem on the basis of a paired sample and consider the mean over Cramér–von-Mises type test statistics, where our theoretical results, stated next, are available for the theoretical criterion, i.e., if the number of projections tends to infinity and the mean is replaced by the expectation. In fact, by increasing the number of random projections it is possible to reduce the randomness in the testing procedure and to eliminate it in the limit.
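To make the Monte Carlo approximation concrete: for a fixed projection x, the distance \(D_n(x)\) is a finite sum over the pooled projected sample, and \(\mathrm{CvM}_n\) is approximated by the mean over the drawn projections. A minimal sketch, assuming the projected samples are already available as equal-length lists:

```python
def cvm_distance(u, v):
    """Two-sample Cramér–von-Mises distance D_n(x) for projected samples
    u_j = <x, X_{j,1}> and v_j = <x, X_{j,2}>, both of length n.  The
    integral over the pooled empirical distribution is a sum over the
    pooled points, each with mass 1/(2n), so D_n = n * sum(...) / (2n)."""
    n = len(u)
    ecdf = lambda sample, r: sum(1 for s in sample if s <= r) / n
    return sum((ecdf(u, r) - ecdf(v, r)) ** 2 for r in u + v) / 2.0

def cvm_statistic(projected_pairs):
    """Monte Carlo approximation of CvM_n = int D_n(x) P(dx): the mean of
    D_n over random projections, given as a list of (u, v) pairs."""
    return sum(cvm_distance(u, v) for u, v in projected_pairs) / len(projected_pairs)
```

For identical projected samples the distance vanishes, while well-separated samples push it toward its maximum, which is the behavior the test statistic aggregates over projections.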

2.1 Asymptotic theory of the test statistic

For our asymptotic approach, we let \(n\rightarrow \infty \). It is well known that the Cramér–von-Mises distance \(D_n\) is connected to von Mises type functionals, also known as V-statistics, which are closely related to U-statistics. For a deeper introduction to these kinds of statistics, we refer the reader to Koroljuk and Borovskich (1994) and Serfling (2001). Our statistic \(\mathrm{{\text {CvM}}}_n\) can also be rewritten as a certain V-statistic, and thus, the same theory can be applied to obtain the following result.

Theorem 1

Let \(\tau _1,\tau _2,\ldots \) be a sequence of independent, standard normal random variables. Under the null hypothesis \({\mathcal {H}}\) and Assumption 1,

$$\begin{aligned} \mathrm{{\text {CvM}}}_n \overset{ d}{\rightarrow } 3\sum _{i=1}^\infty \lambda _i \tau _i^2 = Z, \end{aligned}$$

where \((\lambda _i)_{i\in {\mathbb {N}}}\) is a sequence of non-negative numbers with \(\sum _{i=1}^\infty \lambda _i< \infty \) and \(\lambda _{i}>0\) for at least one \(i\in {\mathbb {N}}\), implying that the distribution function of Z is continuous and strictly increasing on the non-negative half-line.

In the proofs, more information about \((\lambda _i)_{i\in {\mathbb {N}}}\) is provided. In short, they are eigenvalues of a Hilbert–Schmidt operator corresponding to the kernel function of our V-statistic.

Theorem 2

Under the alternative \({\mathcal {K}}\) and Assumption 1, our statistic \(\mathrm{{\text {CvM}}}_n\) diverges, i.e.,

$$\begin{aligned} \mathrm{{\text {CvM}}}_n\overset{p}{\rightarrow }\infty \text { as }n\rightarrow \infty . \end{aligned}$$

In general, the test statistic \(\mathrm{{\text {CvM}}}_n \) is not distribution-free under the null hypothesis, i.e., its distribution depends on the unknown distribution of \(X_1\). As can be seen in the proofs, the same applies to Z. Given that \(\alpha \in (0,1)\) is the significance level, neither a \((1-\alpha )\)-quantile \(c_{n,1-\alpha }\) of \(\mathrm{{\text {CvM}}}_n \) nor the \((1-\alpha )\)-quantile \(c_{1-\alpha }\) of Z is available as a critical value in applications. To resolve this problem, we propose the estimation of the quantiles via bootstrapping in the spirit of Efron (1979) and follow the idea in Gaigall (2019), where the usual two-sample Cramér–von-Mises distance is applied to bivariate random vectors with values in \({\mathbb {R}}^2\). Note that under the null hypothesis \({\mathcal {H}}\) the expectations \(\mathrm {E}[ F_{n,1}(x,y)]\) and \(\mathrm {E}[ F_{n,2}(x,y)]\), \((x,y)\in H\times {\mathbb {R}}\), coincide, and thus, we can rewrite our test statistic as

$$\begin{aligned} \begin{aligned} \mathrm{{\text {CvM}}}_n&=n\int \int \{ F_{n,1}(x,y)-\mathrm {E}[ F_{n,1}(x,y)]\\&\quad +\mathrm {E}[ F_{n,2}(x,y)]- F_{n,2}(x,y)\}^2{\bar{F}}_n(x,\mathrm d y){\mathcal {P}}(\mathrm d x). \end{aligned} \end{aligned}$$

Denote by \(X_{jn}^*= (X_{jn,1}^*,X_{jn,2}^* ),~j=1,\dots ,n\), a bootstrap sample from the original observations \(X_j\), \(j=1,\dots ,n\), obtained by n-times sampling with replacement. Let \(F_{n,i}^*\), \({\bar{F}}_{n}^*\) be the bootstrap counterparts of \(F_{n,i}\) and \({\bar{F}}_n\). Clearly, the conditional expectation \(\mathrm {E}[ F_{n,i}^*(x,y)\vert (X_j)_j]\) given the data \((X_j)_{j=1,\ldots ,n}\) equals \(F_{n,i}(x,y)\). Consequently, the bootstrap counterpart of our test statistic is

$$\begin{aligned} \mathrm{{\text {CvM}}}_{n}^*= & {} n\int \int \Big ( F_{n,1}^*(x,y)- F_{n,1}(x,y)\\&\qquad + F_{n,2}(x,y)- F_{n,2}^*(x,y)\Big )^2{\bar{F}}_{n}^*(x,\mathrm d y){\mathcal {P}}(\mathrm d x). \end{aligned}$$

Let \(c_{n,1-\alpha }^*\) be a \((1-\alpha )\)-quantile of \(\mathrm{{\text {CvM}}}_n^*\) given the original observations \(X_1,\ldots ,X_n\). In applications, concrete values of \(c_{n,1-\alpha }^*\) are obtained by Monte Carlo simulation. In the proofs, we show that the bootstrap statistic asymptotically mimics the limiting null distribution under the null hypothesis \({\mathcal {H}}\), implying that \(c_{n,1-\alpha }^*\) is an appropriate estimator for the unknown quantile \(c_{n,1-\alpha }\) or \(c_{1-\alpha }\), while \(\mathrm{{\text {CvM}}}_n^*\) and \(c_{n,1-\alpha }^*\) remain asymptotically finite under general alternatives. This results in an asymptotically exact and consistent bootstrap test \(\varphi _n^*=\mathrm{{\text {I}}}_{\mathrm{{\text {CvM}}}_n>c_{n,1-\alpha }^*}\).
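The resampling scheme can be sketched as follows: the pairs \(X_j\) are resampled with replacement, preserving the within-pair dependence, and the empirical \((1-\alpha )\)-quantile of the bootstrap replications serves as the critical value. In this hedged sketch, `stat` stands for an implementation of the centered statistic \(\mathrm{CvM}_n^*\):

```python
import random

def bootstrap_quantile(pairs, stat, alpha=0.05, B=999, rng=random):
    """Estimate c*_{n,1-alpha}: resample the pairs X_j = (X_{j,1}, X_{j,2})
    with replacement and recompute stat(bootstrap_sample, original_sample)
    B times; return the empirical (1-alpha)-quantile of the replications."""
    n = len(pairs)
    values = sorted(
        stat([pairs[rng.randrange(n)] for _ in range(n)], pairs)
        for _ in range(B)
    )
    # empirical (1-alpha)-quantile of the B bootstrap replications
    return values[min(B - 1, int((1 - alpha) * (B + 1)) - 1)]
```

The test then rejects \({\mathcal {H}}\) if \(\mathrm{CvM}_n\) exceeds the returned quantile.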

Theorem 3

Suppose that Assumption 1 holds. Then, as \(n \rightarrow \infty \), we have

$$\begin{aligned} \mathrm {E}[\varphi _n^*]=P(\mathrm{{\text {CvM}}}_n>c_{n,1-\alpha }^*) \rightarrow \alpha \mathrm{{\text {I}}}_{{\mathcal {H}}}+ \mathrm{{\text {I}}}_{{\mathcal {K}}}, \end{aligned}$$

where \(\mathrm{{\text {I}}}_{\cdot }\) denotes the indicator function.

3 Simulations

Recalling that our test is suitable for random variables \(X_{j,i}\), \(i=1,2\), \(j=1,\dots ,n\), with values in a general separable Hilbert space, we consider the separable Hilbert space H consisting of all measurable and square-integrable functions on the unit interval [0, 1]. This space is endowed with the usual inner product \(\langle \cdot ,\cdot \rangle \), and the normalized Legendre polynomials form a corresponding orthonormal basis \(O=\lbrace e_i;i\in I\rbrace \), \(I={\mathbb {N}}\). We obtain our test statistic \(\mathrm{{\text {CvM}}}_n\) by Monte Carlo simulation based on 500 replications following Steps 1–4 from Sect. 2. In Step 1 and Step 2, we choose a standard Poisson distribution shifted by 1, i.e., the distribution of \(N+1\) for \(N\sim \text {Pois}(1)\). In our simulations, the stochastic processes \(X_{j,i}=(X_{j,i}(t);t\in [0,1])\) have the form

$$\begin{aligned} X_{j,i}(t)=a_iB_{j,i}(t)+b_it(t-1),~t\in [0,1],~i=1,2,~j=1,\dots ,n \end{aligned}$$

for parameters \(a_i\in {\mathbb {R}}{\setminus } \{0\}\) and \(b_i\in {\mathbb {R}}\) and independent bivariate Brownian bridges \(B_j=(B_{j,1}, B_{j,2})\) on [0, 1], \(j=1,\dots ,n\), with covariance structure

$$\begin{aligned} \mathrm{Cov}(B_{j,1}(s),B_{j,2}(t))=r(\min (s,t)-st),~s,t\in [0,1],~j=1,\dots ,n \end{aligned}$$

for a dependence parameter \(r\in [0,1]\). Each simulation is based on 5000 simulation runs. To obtain the critical values in the bootstrap procedure, we use Monte Carlo simulation based on 999 replications. Empirical sizes of the bootstrap test are displayed in Table 1. The simulations are conducted for parameters \(r\in \{0,0.25,0.5\}\), \(a_i\in \{1,1.5,2,2.5\}\), and \(b_i\in \{0,0.5,1,1.5,2\}\), sample sizes \(n\in \{20,50\}\), and significance levels \(\alpha \in \{5 \%, 10 \%\}\). The empirical sizes are in almost all cases in a reasonable range around the nominal level \(\alpha \). A systematic exception from this observation are the sizes of the bootstrap approach under the strong dependence setting \((r=0.5)\) and the smaller sample size setup (\(n=20\)). In this case, the bootstrap decisions are rather conservative, with corresponding empirical sizes from 3.4 to \(3.9\%\) with an average of \(3.6\%\) for \(\alpha = 5\%\), as well as values from 7.4 up to \(8.6\%\) with an average of \(7.4\%\) for \(\alpha = 10\%\). However, increasing the sample size to \(n=50\) improves the type-I error rate control; now we observe on average a value of \(4.5\%\) for \(\alpha =5\%\) and of \(9.0\%\) for \(\alpha =10\%\), respectively, under the strong dependence setting.
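The dependent Brownian bridges with the covariance structure above can be simulated, for instance, by mixing two independent bridges: with \(B_2 = rB_1+\sqrt{1-r^2}\,B_{\mathrm{ind}}\), the cross-covariance \(r(\min (s,t)-st)\) follows. A minimal grid-based sketch (the construction is our own; the paper does not prescribe a simulation method):

```python
import math
import random

def correlated_bridges(r, m=100, rng=random):
    """One pair (B_1, B_2) of Brownian bridges on the grid t = 0, 1/m, ..., 1
    with Cov(B_1(s), B_2(t)) = r(min(s,t) - st), via B_2 = r*B_1 +
    sqrt(1-r^2)*B_ind for an independent bridge B_ind."""
    dt = 1.0 / m
    def bridge():
        # Brownian motion from Gaussian increments, pinned to 0 at t = 1
        w, path = 0.0, [0.0]
        for _ in range(m):
            w += rng.gauss(0.0, math.sqrt(dt))
            path.append(w)
        return [x - (i * dt) * path[-1] for i, x in enumerate(path)]
    b1, ind = bridge(), bridge()
    b2 = [r * x + math.sqrt(1.0 - r * r) * y for x, y in zip(b1, ind)]
    return b1, b2
```

The simulated processes \(X_{j,i}\) are then obtained by scaling with \(a_i\) and adding the trend \(b_it(t-1)\) on the grid.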

Table 1 Empirical sizes

To complement this study of empirical sizes, we conduct additional simulations, now based on 1000 runs, under different alternative settings, where we vary the parameters \(a_i\in \{1,1.5,2,2.5\}\) and \(b_i\in \{0,0.5,1,1.5,2\}\) for the second component \(X_{j,2}\) while keeping them fixed as \((a_i,b_i)=(1,0)\) for the first one. In Table 2, the results for the small sample size setting (\(n=20\)) are displayed. Here, the power values increase for growing parameters \(a_i\) and \(b_i\), respectively, and, in most cases, for a growing dependence parameter r. A possible reason is that an increasing dependence parameter r reduces the differences between the first and the second components of the bivariate Brownian bridges \(B_j=(B_{j,1}, B_{j,2})\) on [0, 1], \(j=1,\dots ,n\), such that the deterministic factors \(a_2\in \{1,1.5,2,2.5\}\) or the deterministic additive terms \(b_2t(t-1)\), \(t\in [0,1]\), \(b_2\in \{0,0.5,1,1.5,2\}\), in the second components \(X_{j,2}\), \(j=1,\dots ,n\), cause clear differences between the components \(X_{j,1}\) and \(X_{j,2}\), \(j=1,\dots ,n\), of the bivariate data. For two specific alternatives, \((a_i,b_i)=(1.5,0),(1,1)\), under moderate \((r=0.25)\) dependence, the power behavior is, moreover, investigated under growing sample sizes \(n\in \{20,30,\ldots ,70\}\), see Fig. 1. The resulting power curves have a clear positive slope, indicating a reasonable power behavior for increasing sample sizes.

Table 2 Empirical power values for \(n=20\)
Fig. 1 Power values for increasing sample sizes under moderate (\(r=0.25\)) dependence

Fig. 2 Mean monthly values from 01/01/1999 to 01/01/2019

Fig. 3 Monthly values from 01/01/1999 to 01/01/2019

4 Applications to stock market returns

In a possible application, the observations are obtained from stock market returns. To be concrete, we consider two stock price processes and a time period \([0,T_0]\), where \(T_0\in (0,\infty )\) is a time horizon. The stock price processes are denoted by

$$\begin{aligned} \begin{aligned} Y_{i}=(Y_{i}(t);t\in [0,T_0]),~ i=1,2, \end{aligned} \end{aligned}$$

where \(Y_{i}\) is a random variable that takes values in the space of all measurable and square-integrable real-valued functions on \([0,T_0]\). Additional structural assumptions on the underlying stochastic process for the stock prices are required. In the classical models for stock prices, e.g., the exponential Lévy model, the Black–Scholes model, and the Merton model, the structural assumptions mentioned are independence and stationarity of the increments of a Lévy process. Seasonality effects, i.e., specific trends during certain periods of the stock price processes, can disturb the stationarity assumption. Figure 2 shows the mean monthly indices (open) of the Nikkei Stock Average (Nikkei 225), Dow Jones Industrial Average (DJIA), and Standard & Poor’s 500 (S&P 500) from 01/01/1999 to 01/01/2019, where seasonality effects are clearly visible. We tackle this problem by splitting up the time horizon into n equal-sized time intervals to obtain n observations for each stock price process. To be more specific, let \(T_0=nT\) for \(T\in (0,\infty )\); then we consider the time periods \([0,T],\dots ,[(n-1)T,nT]\), and our observations are the log-returns during these periods

$$\begin{aligned} X_{j,i} (t)=\log \frac{Y_{i}(t+(j-1)T)}{ Y_{i}((j-1)T)},~t\in [0,T],~i=1,2,~j=1,\dots ,n, \end{aligned}$$

which are themselves measurable and square-integrable real-valued functions on [0, T]. In particular, we are in the setting of the specific space \(H=L^2[0,T]\) containing all measurable and square-integrable real-valued functions on the interval [0, T] of length \(T\in (0,\infty )\), equipped with the usual inner product \(\langle f, g \rangle = \int _0^T f(x)g(x) \,\mathrm { d }x\), \(f, g\in H\). A corresponding orthonormal basis is given by the normalized Legendre polynomials. The well-known models for stock prices mentioned above imply an independent and identically distributed structure of the increments \(X_{j,i}\). In models with time-dependent or stochastic volatility (volatility clustering, for instance), this structure may be violated. In fact, the theory of U-statistics is well developed and also covers cases where the independent and identically distributed data structure is disturbed, see Chapters 2.3 and 2.4 of Lee (1990). We point out that our results are derived by application of the theory of U-statistics in the independent and identically distributed case, but it should be possible to extend and modify the approach to more general situations, under suitable regularity conditions.
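The construction of the yearly log-return curves from a discretized price path can be sketched as follows (a minimal sketch with an illustrative function name; the path is assumed to be sampled on an equidistant grid of n*m+1 points over \([0,T_0]\)):

```python
import math

def log_return_curves(prices, n):
    """Split a discretized price path Y over [0, T_0] into n log-return
    curves X_j(t) = log(Y(t + (j-1)T) / Y((j-1)T)) on [0, T], assuming
    len(prices) == n*m + 1 grid points."""
    m = (len(prices) - 1) // n
    curves = []
    for j in range(n):
        base = prices[j * m]
        curves.append([math.log(prices[j * m + k] / base) for k in range(m + 1)])
    return curves
```

Applied to both index series, the resulting curve pairs \((X_{j,1},X_{j,2})\), \(j=1,\dots ,n\), form the paired sample the test operates on.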

In what follows, we demonstrate the application of our test to the values (open) of the stock market indices Nikkei Stock Average of Japan, Dow Jones Industrial Average of the US, and Standard & Poor’s 500 of the US for the time period 01/01/1999 to 01/01/2019. To demonstrate how the test works in applications, we consider different frequencies of the data, namely monthly, weekly, and daily values, with linear interpolation. The resulting time series (for the monthly values) are presented in Fig. 3 and can be seen as square-integrable functions on the interval \([0,T_0]\) for \(T_0=20\) (years). To cover the seasonality effects indicated in Fig. 2, we split the time horizon of 20 years into 20 subintervals, each representing one year, i.e., \(T=1\) and \(n=20\). We apply our method to pairwise comparisons of the indices, where the test statistic is again approximated by 500 random projections following Steps 1–4 and the shifted Poisson distribution is used in Step 1 and Step 2 as in Sect. 3. The resulting p-values for the bootstrap approach, based on 5000 resampling iterations, are displayed in Table 3. Since the DJIA and the S&P 500 both reflect the US market, it is not surprising that the test leads to a very high p-value and, thus, does not reject the null hypothesis. Comparisons of each of these US indices with the Japanese Nikkei 225 lead to much lower p-values, which are, however, still above the typically used \(5\%\) benchmark. The results are in line with the first graphical impression given by Fig. 3.

Table 3 Empirical p-values of the test in percent