Novel specification tests for synchronous additive concurrent model formulation based on martingale difference divergence

This paper presents new specification tests for a general synchronous additive concurrent model formulation. As a novelty, our proposal does not require a preliminary model or error structure estimation. No tuning parameters are involved either. We develop a suitable test statistic using the martingale difference divergence coefficient. As a result, this statistic measures the departure from the conditional mean independence in the concurrent model framework, considering the information of all observed time instants. In particular, global as well as partial dependence tests are introduced. Then, we allow one to quantify the effect of a group of covariates or to apply covariates selection one by one. We obtain its asymptotic distribution under the null and propose a bootstrap algorithm to compute the p-values in practice. Through simulations, we illustrate our method, and its performance is compared to existing competitors. In addition, we use this in the analysis of three real datasets related to gait data, flu activity, and casual bike rentals.


Introduction
A general concurrent model is a regression model where the response Y = (Y 1 , . . . , Y q ) ∈ R q , for q ≥ 1, and p ≥ 1 covariates X = (X 1 , . . . , X p ) ∈ R p are all functions of the same argument t ∈ D, and the influence is concurrent, simultaneous or point-wise in the sense that X is assumed to only influence Y (t) through its value X (t) = X 1 (t), . . . , X p (t) ∈ R p at time t by the relation where m(·) is an unknown function collecting the E Y (t)| X (t) information and ε(t) is the error of the model. This last is a process which is assumed to have mean zero, independent of X and with covariance function (s, t) = C [ε(s), ε(t)], being C[·, ·] the covariance operator. The concurrent model displayed in (1) is in the middle of longitudinal and functional data. This classification depends on the number of observed time instants in the t domain D. When this number is dense enough, the sample data can be treated as curves, translating into a functional data framework. Otherwise, if time instants are spaced respective to the t domain and not dense, a longitudinal framework will be more suitable. Determining the inflection point between both situations is still an open problem. For a discussion on this topic, we refer the reader to the work of Wang et al. (2017).
There are plenty of contexts where the (1) formulation arises both in functional or longitudinal framework form. The functional concurrent model can be employed in any situation where data can be monitored, like in health, environmental or financial issues among others. Some examples can be seen in works such as the ones of Xue and Zhu (2007) or Jiang and Wang (2011) for the longitudinal data context. They perform epidemiology studies of AIDS datasets. Other real data examples in medicine can be found in Goldsmith and Schwartz (2017) or Wang et al. (2017). Goldsmith and Schwartz (2017) perform a blood pressure study to detect masked hypertension. For their part, authors in Wang et al. (2017) use the concurrent model in a data study of flu prevalence in the USA. Furthermore, they model Alzheimer's disease progression using brain neuroimaging data. More examples of health and nutrition are displayed in Kim et al. (2018) and Ghosal and Maity (2022a). They perform studies related to gait deficiency, dietary calcium absorption, and the relation between child mortality and financial power in different countries. Examples in the environmental field are collected in works such as Zhang et al. (2011) or Ospína-Galindez et al. (2019). These studies are based on describing forest nitrogen cycling and modeling the rainfall ground, respectively. A completely different example is the work of Ghosal and Maity (2022b), where casual bike rentals in Washington, D.C., are concurrently explained using meteorological variables. This extensive list of examples reveals that the concurrent model is a very transversal and wide-employed tool nowadays.
An inconvenience of the concurrent model general formulation, displayed in (1), is that the m(·) structure is quite difficult to be estimated in practice. For this reason, it is common to consider some assumptions about its form. In the literature, it is quite common to assume linearity, which translates into taking m(t, X (t)) = β(t)X (t) in (1), and work under this premise. However, this assumption can be quite restrictive in practice. Thus, more general structures are needed to model real examples properly. This last results in a gain in flexibility but adds complexity to the estimation process. Maity (2017) discusses the effort made for estimating different concurrent model structures. This paper highlights that more information is needed to correctly estimate the function m(·). In conclusion, it is crucial to guarantee that there exists useful information on the covariates X to model the behavior of Y as a preliminary step. Therefore, covariates selection algorithms for the concurrent model are of interest to avoid irrelevant covariates and simplify the estimation process.
As a result, the first step to assure the veracity of the model structure displayed in (1) is to verify if all p covariates {X 1 (t), . . . , X p (t)} contribute to the correct explanation of Y (t), or some can be excluded from the model formulation. For this purpose, taking D ⊆ {1, . . . , p}, a dependence test can be performed by means of testing where X D (t) denotes the subset of X (t) considering only the covariates with index in D, D \ N is the domain of t minus a null set N ⊂ D and V ⊂ D is a positive measure set.
Quoting Zhang et al. (2018), the above problem is very challenging in practice without assuming any structure of m(·). This drawback is due to the vast class of alternatives targeted, related to growing dimension and nonlinear dependence. To solve this inconvenience, the authors propose testing the nullity of the main effects first, keeping a type of hierarchical order. Then, it is tested if additive and separate effects first enter the model before considering interactive structures. This results in the new test displayed in (2). (t)] almost surely ∀t ∈ D \ N and ∀ j ∈ D H a : P E Y (t)| X j (t) = E [Y (t)] > 0 ∀t ∈ V and some j ∈ D Then, rejection of the null hypothesis of (2) automatically implies the rejection of the hypothesis. It is important to highlight that the reciprocal is not always true. In this way, the model (1) only makes sense if it is possible to reject the H 0 hypothesis of (2). Otherwise, the covariates do not supply relevant information to explain Y . It is notorious that formulation (2) collects a wide range of dependence structures between X and Y in terms of additive regression models, where m (t, X (t)) = F 1 (t, X 1 (t)) + · · · + F p t, X p (t) . Moreover, it is no need to know the real form of m(·) to determine whether the effect of X is significant.
To the best of our knowledge, there is no literature on significance tests for the additive concurrent model that avoids previous model estimation or extra tuning parameters. We refer to Wang et al. (2017) and Ghosal and Maity (2022a) for these in the linear formulation. They both propose effect tests over the β(t) function making use of the empirical likelihood. Thus, once the model parameters are estimated in the linear framework, the authors provide tools to test if all p covariates are relevant or, on the contrary, if some can be excluded from the model. Nevertheless, a suitable effects estimation involves several tuning parameters and the linearity hypothesis. These are necessary to guarantee the adequate performance of the cited procedures. In terms of the β(t) structure estimation, different approaches arise. For example, Wang et al. (2017) propose using a local linear estimator, which depends on a proper bandwidth selection. In contrast, Ghosal and Maity (2022a) employs an expansion into a finite number of elements of a functional basis. This expansion requires the number of considered terms selection. In addition, this last procedure needs to estimate the error model structure. This process translates into an additional functional basis representation and estimation of extra parameters. All this translates into difficulties in the estimation process, even if the linear hypothesis can be accepted. Currently, Kim et al. (2018) developed a new significance test in a more general framework to alleviate the linear hypothesis assumption: additive effects are considered in (1). This work employs F-test techniques over a functional basis representation of the additive effects to detect relevant covariates. Again, this technique depends on an adequate preliminary estimation of the model effects to be able to select relevant covariates by applying significance tests. However, the correct selection of the number of basis functions for each considered covariate/effect representation is still an open problem. These quantities play the role of tuning parameters. Furthermore, a proper error variance estimation is needed to standardize the covariates as an initial step. As this structure is unknown in practice, Kim et al. (2018) assumes that this can be decomposed as a sum of two terms. The first one is a zero-mean smooth stochastic process, and the second term is a zero-mean white noise measurement error with variance σ 2 , resulting in the autocovariance function (s, t) = (s, t) + σ 2 I{s = t}. Nevertheless, this assumption can be restrictive in practice. In consequence, significance tests without any assumption in the model structure and no necessity of a preliminary estimation step are desirable.
Other procedures for covariates selection with a different methodology are the Bayesian selectors and the penalization techniques used in the concurrent model estimation process. We can highlight the works of Goldsmith and Schwartz (2017) or Ghosal et al. (2020) in the linear formulation and the one of Ghosal and Maity (2022b) for general additive effects. While Goldsmith and Schwartz (2017) uses the spike-andslab regression covariates selection procedure, Ghosal et al. (2020) and Goldsmith and Schwartz (2017) implement penalizations based on LASSO (Tibshirani 1996), SCAD (Fan and Li 2001), MCP (Zhang 2010) or its grouped versions (Yuan and Lin 2006), respectively. As a result, the selection of covariates is implemented together with estimation. Nevertheless, some tuning parameters are needed in all these methodologies: it is necessary to determine the number of basis functions to represent the effects in all of them, jointly with prior parameters, in case of the spike-and-slab regression, or the amount of penalization otherwise. As a result, the estimation of tuning parameters applies in these approaches as well.
In this paper, we deal with this concern by bridging a gap for significance tests without previous model estimation. The new proposal for specification testing can assess the usefulness of a vector X for modeling the expectation of the Y vector in a pretty general formulation. Besides, this approach avoids extra tuning parameters estimation, as well as the need to model the error structure. For this aim, we propose a novel statistic for the concurrent model based on the martingale difference divergence ideas of Shao and Zhang (2014) to perform (2). As a result, this approach tests the effect of the covariates in the explanation of Y no matter the underlying form of m(·) while assuming additive effects.
It is important to notice that one can consider D = {1, . . . , p} to perform (2), which translates in testing if all p covariates are relevant, or only a subset D ⊂ {1, . . . , p} with cardinality 1 ≤ d < p. In this last case, one tests if only a bunch of covariates are relevant, excluding the rest from the model. A special case is to consider D = { j} for some j = 1, . . . , p. This approach allows to implement covariates screening with no need to estimate the regressor function. Thus, it is possible to test the effect of every covariate. This results in j = 1, . . . , p partial tests of the form Thus, one can test if a small subset of {1, . . . , p} is suitable to fit the model or if all covariates need to be considered. As a result, it is possible to avoid noisy covariates entering the model and reduce the problem dimension.
The rest of the paper is organized as follows. In Sect. 2, the martingale difference divergence coefficient is introduced along with some remarkable properties. We present the new specification tests in Sect. 3. A theoretical justification for their proper behavior is given and a bootstrap scheme is proposed to calibrate these in practice.
A simulation study to test their performance is carried out in Sect. 4, jointly with a comparison involving Ghosal and Maity (2022a) and Kim et al. (2018) competitors. Next, the proposed tests are applied to three real datasets in Sect. 5. Eventually, some discussion arises in Sect. 6.

Martingale difference divergence (MDD)
The martingale difference divergence (MDD) was introduced by Shao and Zhang (2014). This coefficient is a natural extension of the distance covariance (Székely et al. 2007;Szekely and Rizzo 2017). The MDD measures the departure from conditional mean independence between a vector response variable Y ∈ R q and a vector predictor X ∈ R p . This coefficient was introduced in Shao and Zhang (2014) for the scalar response case, taking q = 1, and was later generalized in Park et al. (2015) for values of q ≥ 1. Hence, this idea can be used to screen out numerical variables that do not contribute to the conditional mean explanation of Y . This translates into the test Therefore, a coefficient measuring the difference between the conditional mean and the unconditional one is needed to perform (4). For this aim, following similar ideas and argumentation of the distance covariance measure of Székely et al. (2007), it emerges the MDD coefficient.
Then, for Y ∈ R q and X ∈ R p , the MDD of Y given X is the nonnegative number and ψ X (s) = E[e i<s,X > ]. i = √ −1 is the imaginary unit, < ·, · > the inner product, |z| q = √ z H z the complex norm of z ∈ C q , where z H denotes the conjugate transpose and · p is the Euclidean norm of the R p space.
It can be seen in Theorem 1 of Shao and Zhang (2014) and Proposition 3.1 of Park et al. (2015) p ] < ∞ respectively, and being (X , Y ) and (X , Y ) independent copies of (X , Y ), an alternative way to the definition (5) is where L(y, y ) = y − y 2 q /2 and J (x, x ) = x − x p . A proof for first expression of Eq. (6), considering q = 1, follows from Theorem 1 of Shao and Zhang (2014). Similar arguments can be employed for general q ≥ 1 considering the q-norm instead. The second formulation results from expanding and canceling terms. Some guidelines can be found in the proof of Theorem 1 of Shao and Zhang (2014) as well.
Since, in general, M D D(Y | X ) = M D D(X | Y ), this is named divergence instead of distance. The MDD equals 0 if and only if it is verified the H 0 hypothesis of (4) and otherwise M D D > 0. Therefore, the test (4) can be rewritten as the new one displayed in (7).
Next, an unbiased estimator of MDD introduced in Zhang et al. (2018) is presented. Then, given n independent observations (X n , Y n ) = {(X i , Y i ), i = 1, . . . , n} from the joint distribution of (X , Y ), with X i = (X i1 , . . . , X i p ) ∈ R p and defined, A and B respectively, given by As a result, an unbiased estimator for MDD is defined as A proof that M D D 2 n (Y n | X n ) is an unbiased estimator for M D D 2 (Y | X ) can be found in Section 1.1 of the supplementary material of Zhang et al. (2018) for the q = 1 case. An extension for the case considering q ≥ 1 is displayed in the proof of Proposition 3.4 given in Park et al. (2015).
An important characteristic of the M D D 2 n (Y n | X n ) unbiased estimator defined in (8) is that this is a U-statistic of order four. In fact, with some calculation, it can be proved that with symmetric kernel function . . , n and the summation is over all permutation of the 4-tuples of indices (i, k, l, r ). A justification for this selection of kernel function can be found in Section 1.1 of the supplementary material of Zhang et al. (2018). There, it is also proved the unbiasedness of the MDD estimator displayed in (8) for the q = 1 case. It is straightforward to generalize these results to q ≥ 1 just considering the R q metric in the B il definition. Then, the φ(·) kernel is obtained again.
Then, given the (9) formulation with kernel φ(·), one can directly notice that M D D 2 n (Y n | X n ) is a U-statistic of order four by proper definition (see for example Lee (1990)). Then, theoretical results of U-statistics can be employed to obtain its asymptotic distribution and to perform the specification test displayed in (4). Specifically, an example of this last is collected in Theorem 2.1 for the q = 1 case in Zhang et al. (2018). Next, this property is used in the next section to derive the asymptotic distribution of new statistics devoted to testing specification in the concurrent model.

Significance tests based on MDD
Once we have a tool to measure conditional mean independence between Y = (Y 1 , . . . , Y q ) ∈ R q and a vector X = (X 1 , . . . , X p ) ∈ R p , this approach is adapted to the concurrent model case. For this aim, we use the ideas presented in the work of Zhang et al. (2018) for the vectorial framework.
Henceforth, we assume a situation where all points of the curves are observed synchronously. However, the preliminary assumption that all trajectories are fully observed can be quite restrictive in practice. Section 3.2 it is showed how to adapt this requirement to contexts where some points are missed, adjusting the procedure to more realistic situations. Thus, a total of {t u } T u=1 ∈ D time instants are considered and there are n u observed samples, each of them of the form {Y i u (t u ), X i u (t u )} n u i u =1 . As mentioned before, assuming all curves observed at the same time instants translates into n u = n for all u = 1, . . . , T . Then, we have a sample of the form (Y n (t), A graphic example of our current situation considering q = 1 and p = 2 covariates for a concurrent model with a structure similar to (1) is displayed in Fig. 1.
In this way, we want to include all the information provided by the observed time instants {t u } T u=1 ∈ D in a new statistic. Besides, as mentioned above, we can be interested in testing dependence not only considering all covariates but a subset D ⊂ {1, . . . , p}. As a result, an integrated dependence test is applied over the complete trajectory, considering the information provided by D. Rewriting (2), this gives place to the test Example of a sample of five curves measured at same time instants {t u } T u=1 ∈ D considering p = 2 covariates (X 1 (t) and X 2 (t)) to explain Y (t). Filled points simulate a total of n u = 3 observed points at each instant t u In order to implement the new test introduced in (10) a proper estimator of D M D D 2 (Y (t)| X j (t) )dt for every j ∈ D is needed. For this purpose, we propose an integrated statistic based on and this remains a measurable and symmetric function. Then, similar to (9) argumentation, it is easy to see that it is possible to write which keeps the structure of a U-statistic of order 4. It can be proved that . See Section 1 of the Online Supplementary Material.

Theorem 1 Under the assumption of H 0 and verifying
for V[·] the variance operator and dcov(·, ·) the distance covariance, it is guarantee Theorem 1 guarantees the asymptotic convergence of the T D statistic displayed in (11) to a normal distribution under some assumptions. Proof of this result is collected in Section 3 of the Supplementary Material, which makes use of the Hoeffding decomposition for U-statistics carried out in Section 2 of the same document.
One drawback is that the asymptotic convergence of the T D statistic can be very slow in practice. To solve this issue we approximate the p-value using a wild bootstrap scheme. Its scheme is collected in Algorithm 2. The proof of the consistency related to the proposed wild bootstrap procedure and that of the variance estimator for the concurrent model case is omitted due to extension. However, the proof results from plugging the integrated version in that of Zhang et al. (2018), introduced in Section 1.6 of their supplementary material.

Obtain the bootstrap statistic numerator
).

Obtain the bootstrap p-value as
Moreover, the test is guaranteed to be powerful under local alternatives. A characterization of local alternatives is given in Section 1.7 of the supplementary material of Zhang et al. (2018). This result can be proved simply by plugging in the corresponding integrated versions in Theorem 2.4 of Zhang et al. (2018). This proof is omitted because of the extension.
In terms of D, a particular case is to consider all covariates, D = {1, . . . , p}. First of all, one must check if, at least, some covariates supply relevant information to model Y .
Considering D the set of all covariates indices, we can verify this premise performing (10). In case of not having evidence to reject the null hypothesis of conditional mean independence, it does not make sense to model Y with the available information. Otherwise, if one discards the conditional mean independence in this initial step, one can be interested in searching for an efficient subset of covariates to reduce the problem dimension.
Then, for a subset D ⊂ {1, . . . , p} with cardinality d, 1 ≤ d < p, it is possible to test if these d covariates play a role in terms of the concurrent regression model by means of (10). If not, it is possible to discard them and reduce the problem dimensionality to p − d. In case we are interested in covariates screening one by one, which corresponds with the case where D = { j}, we can apply the j = 1, . . . , p tests displayed in (3). This results in p consecutive partial tests for j = 1, . . . , p One drawback of carrying out p consecutive tests is that the initial prefixed significance level is violated if this is not modified considering the total number of tests to be performed. As a result, the significance level has to be adequately corrected. Some techniques, such as the classic but conservative Bonferroni's correction, or the false discovery rate alternative (see Benjamini and Yekutieli (2001), and Cuesta-Albertos et al. (2017)) can be easily applied to avoid this inconvenience.

Derivation of S 2
In this section, we prove that the estimator of the variance considered in (12) for (t) )dt correctly estimates this quantity.
As mentioned above, M D D 2 n (Y n (t)| X nj (t) ) is a U-statistic of order four. This result implies that using the Hoeffding decomposition, this quantity can be expressed as Calculation about Hoeffding decomposition for our framework is collected in Section 2 of the Online Supplementary Material.

If we define the theoretical test statistic
considering S the true integrated version of the variance, we can see that is the leading term and D n,2 = n 2 1/2 j∈D (R n ) j is the remainder one. Under the H 0 assumption of (2) it is verified that Since the contribution from the term D n,2 is asymptotically negligible, we may set S 2 = V D n,1 , and then construct the variance estimator displayed in Eq. (12).

Some missing points in curves trajectories
Until now, we have worked under the assumption that complete curve trajectories were observed. In contrast, in this part, some missing points are allowed. Then, for each time point t u there are 1 ≤ n u ≤ n observed samples of the form {Y i u (t u ), X i u (t u )} n u i u =1 . A graphic example for the case considering q = 1 and p = 2 covariates is displayed in the first row of Fig. 2. In this example, we have n = 5 curves and a different number of observations. For instance, there are n 1 = 4 points for t 1 .
In this context, our proposed method cannot be applied directly. This is because it is not verified n u = n for all u = 1, . . . , T . However, we can solve this problem by estimating the missing curve values when is possible. This option translates into a recovering of the whole curve trajectories on the grid {t u } T u=1 ∈ D, verifying now that n u = n for all u = 1, . . . , T .
A simple but efficient idea is to recover the complete trajectory of the curves using some interpolating method with enough flexibility. For example, making use of cubic spline interpolation ideas for each of the 1, . . . , n curves. Results of this recovery for our example are displayed in the second row of Fig. 2. In this case, the spline function of the stats library of the R software (R Core Team 2019) has been employed.

Fig. 2
First row: sample of five curves measured at different time instants {t u } T u=1 ∈ D considering p = 2 covariates (X 1 (t) and X 2 (t)) to explain Y (t). Second row: same example adding the recovered points by means of splines interpolation. Filled dots (•) represent the n u observed points at each instant t u and asterisks ( * ) the recovered ones In addition, other approaches for recovering the missing points are also available. Next, we propose one based on functional basis representation following the guidelines of Kim et al. (2018), Ghosal et al. (2020), and Ghosal and Maity (2022b). If it is possible to assume that the total number of time observations T u=1 t u is dense in D, then the eigenvalues and eigenfunctions corresponding to the original curves can be estimated using functional principal component analysis (see Yao et al. (2005)). We refer to Yao et al. (2005) for more details about the procedure. As a result, one can get the estimated trajectoryX i j (·) of the true curves X i j (·) for i = 1, . . . , n and j = 1, . . . , p, given byX i j (t) =μ j (t) + Q q=1ζ iq jˆ q j (t). Here, Q denotes the number of considered eigenfunctions, which can be chosen using a predetermined percentage of explained variance criterion. Consequently, it is possible to recover the value of X 1 (·), . . . , X p (·) on all grid {t u } T u=1 ∈ D. In the same way, the values of Y 1 (·), . . . , Y q (·) can also be recovered. Thus, it is possible to work again in the context of synchronously measured data. This procedure is implemented in the fpca.sc function belonging to the library refund of R (see Goldsmith et al. (2021)). For our proposed naive example, we have obtained similar results to the splines interpolation methodology displayed in Fig. 2. As a result, these are omitted.

Simulation studies
In this section, we consider two simulated concurrent model scenarios to assess the performance in the practice of the new significance tests introduced above. We distinguish between linear (Scenario A) and nonlinear (Scenario B) formulation of the model (1). For the sake of simplicity, we consider only the case where the data are measured at the same instants of time. For this aim, a Monte Carlo study with M = 2000 replicas in each case is performed using the R software (R Core Team 2019). Besides, we compare the performance of our test with two competitors. These are the procedure introduced in Ghosal and Maity (2022a), developed in the linear framework, and the method of Kim et al. (2018) for the additive formulation. Henceforth, we refer to them by FLCM and ANFCM, respectively. We refer to Section A of the Appendix for more details about competitors' implementation.

10
. We assume that a total number of T = 25 equispaced instants are observed in D = [0, 1] ({t u } 25 u=1 ) and there are n = 20, 40, 60, 80, 100 curves available for each of them. An example of these functions is displayed in Fig. 3. We remark that we have not included intercept in our linear formulation because this can be done without loss of generality just centering both Y (t) and X (t) = (X 1 (t), X 2 (t)) ∈ R 2 for all t ∈ D. Right: real Y (t) structure jointly with partial effects corresponding to X 1 (t) (F 1 (t, X 1 (t))) and X 2 (t) (F 2 (t, X 2 (t))) • Scenario B (Nonlinear model): A nonlinear structure of (1) is assumed for this scenario. Again, we take t ∈ D = [0, 1] and consider q = 1 and p = 2 covariates to explain the model. Then, this model has the expression being F 1 (t, X 1 (t)) = exp((24t + 1)X 1 (t)/20) − 2 and F 2 (t, X 2 (t)) = −1.2 log(X 2 (t) 2 ) sin(2π t), with X 1 (t) and X 2 (t) equally defined as in the linear case (Scenario A) and using the same observed discretization time points. Now, the errors ε 1 (t), ε 2 (t) and ε(t) are assumed to be random Gaussian processes with exponential variogram (s, t) = 0.02 exp − 24|s−t|

10
. An example of this scenario is displayed in Fig. 4.
In all tests, we make use of the wild bootstrap techniques introduced above in Sect. 3 to approximate the p-values. We have employed B = 1000 resamples on each case. Besides, as we mentioned before, sample test size and power are obtained by Monte Carlo techniques. In order to know if the p-values under the null take an adequate value, the 95% confidence intervals of the significance levels are obtained by making Here, α is the expected level and M is the number of Monte Carlo simulated samples. As a result, we consider that a p-value is acceptable for levels α = 0.01, 0.05, 0.1 if this is within the values collected in Table  1 for the Monte Carlo replicates. We highlight the values out of these scales in bold for simulation results. real partial effects corresponding to X 1 (t) (β 1 (t)) and X 2 (t) (β 2 (t)). Right: simulated regression model components β 1 (t)X 1 (t) and β 2 (t)X 2 (t)

Results for scenario A (linear model)
We start analyzing the performance of the global mean dependence test in the linear model formulation, using Scenario A introduced above in Sect. 4. For this purpose, we consider three different scenarios. In the first one, the null hypothesis of mean independence is verified by simulating under the assumption that β 1 (t) = β 2 (t) = 0. Next, the remaining two cases are simulated under the alternative hypothesis. This claims that information provided by X (t) = (X 1 (t), X 2 (t)) is useful in some way: only the X 2 (t) covariate is relevant (fixing β 1 (t) = 0) or both covariates X 1 (t) and X 2 (t) support relevant information to correctly explain Y (t).
Obtained results are collected in Table 2 for n = 20, 40, 60, 80, 100. In view of the results, it is appreciated as the empirical sizes approximate the significance levels under H 0 (H 0 : β 1 (t) = β 2 (t) = 0) as n increases. Moreover, the empirical distribution of the p-values seems to be a U [0, 1] as it is appreciated in Fig. 8 of Section B of the Appendix. In contrast, simulating under the alternative hypothesis, H a : β 1 (t) = 0, β 2 (t) = 0 and H a : β 1 (t) = 0, β 2 (t) = 0 scenarios, the test power tends to one as the sample size increases. As a result, we can claim that the test is well-calibrated and has power.
Once we have rejected the null hypothesis that all covariates are irrelevant in practice, we can detect which of them play a role in terms of data explanation. For this aim, partial tests can be carried out, testing if every covariate is irrelevant, Again, we consider different scenarios. First of all, it is assumed that X (t) is not significant taking β 1 (t) = β 2 (t) = 0. Then, we move to the situation where only X 2 (t) is relevant. Finally, we consider the model including both X 1 (t) and X 2 (t) effects to explain Y (t). Results of these scenarios are displayed in Table 3. Here, we appreciate as the empirical sizes tend to the significance levels simulating under the null hypothesis that both covariates have not got a relevant effect on the response, separately. Besides, we see as in case of having β 1 (t) = 0 and β 2 (t) = 0, these tests help us to select relevant information, X 2 (t), and discard noisy one, X 1 (t). Otherwise, when both covariates are relevant, the partial tests clearly reject the H 0 j hypothesis of null effect, tending their powers to the unit as sample size increases.

Results for scenario B (nonlinear model)
In this section, we analyze the performance of the MDD global mean independence test in a more difficult framework: a nonlinear effects formulation. For this purpose, Scenario B introduced in Sect. 4 is employed. Again, three different situations of dependence are considered, following the same arguments of Sect. 4.1. As a result, we simulate under the no effect case (H 0 : F 1 (t, X 1 (t)) = F 2 (t, X 2 (t)) = 0), which corresponds with independence, and two dependence frameworks: where only one covariate is relevant (H a : F 1 (t, X 1 (t)) = 0, F 2 (t, X 2 (t)) = 0) or both of them are (H a : F 1 (t, X 1 (t)) = 0, F 2 (t, X 2 (t)) = 0). Results of the M = 2000 Monte Carlo simulations for the MDD-test taking n = 20, 40, 60, 80, 100 are displayed in Table 4. We appreciate simulating under the null hypothesis H 0 that the p-values tend to stabilize around the significance levels. Figure 9, collected in Section B of the Appendix, shows as these seem to follow a uniform distribution in [0, 1]. So, we can conclude that our test is well calibrated even for nonlinear approaches. Concerning the power, when the independence assumption is violated, the p-values tend to 1 as the sample size increases. Two examples of this phenomenon are displayed in Table 4 simulating the different alternative hypotheses. Summing up, our proposal is also a well-calibrated and powerful test in a nonlinear framework.

Y (t)| X 1 (t) ] = E[Y (t)] and H 02 : E[Y (t)| X 2 (t) ] = E[Y (t)], and using wild bootstrap approximation with B = 1000 resamples in Scenario B
Model Next, our interest focuses on partial tests to apply covariates selection in this nonlinear scenario. Again, we consider the three different dependence scenarios introduced above. However, we test the independence for each covariate separately. This results in applying a total of j = 1, . . . , p tests. In this way, we expect the test in a situation as F 1 (t, X 1 (t)) = 0, F 2 (t, X 2 (t)) = 0 to be capable of detecting relevant covariates (X 2 (t)), rejecting its corresponding H 0 j hypothesis, and excluding noisy ones from the model otherwise (X 1 (t)). Results for partial tests are collected in Table 5. One can see as these tests allow us to determine which covariates play a relevant role in each scenario, being those with p-values higher than the significance levels and tending to 1 as sample size increases. Conversely, those verifying that its associated p-values are less or equal to significance levels are assumed irrelevant.

Comparison with FLCM and ANFCM algorithms
Next, our novel procedure is compared with existing competitors in the literature. For this aim, we have considered the FLCM algorithm of Ghosal and Maity (2022a) for the linear framework and the ANFCM procedure of Kim et al. (2018) for a more flexible model, assuming additive effects. Both have displayed excellent results in practice considering a proper selection of the tuning parameters. We refer the reader to Appendix A for more details.
In the simulation scenarios introduced in Sect. 4, we consider a dependence structure where all instants relate between them. This structure emulates a real functional dataset. Nevertheless, this does not apply in the simulation scenarios of Ghosal and Maity (2022a) and Kim et al. (2018). Conversely, they consider independent errors. As a result, to perform a fair competition, we start analyzing the behavior of our MDDbased tests in their simulation scenarios. Specifically, we compare the performance of our proposal with the results of FLCM in Scenario A of Ghosal and Maity (2022a). Next, we implement a comparison with the ANFCM procedure. For this purpose, we consider Scenario (B) of Kim et al. (2018), taking the error E 3 . In this last case, we implement a modification to perform Algorithm 1. In particular, we only consider the second covariate associated with the nonlinear effect. In both borrowed scenarios, we simulate under the dense assumption being {t u } 81 u=1 a total of m = 81 equidistant time points in [0, 1]. We keep the authors' parameters selection and perform a Monte  Ghosal and Maity (2022a) or Kim et al. (2018), respectively. We remind the structure of the scenarios and explain implementation issues in Section A of the Appendix. Results of the comparison between FLCM and MDD effect tests for scenario A of Ghosal and Maity (2022a) are collected in Table 6. We appreciate that simulating under the null (d = 0), one value of the FLCM algorithm is out of the 95% confidence interval. In contrast, the MDD procedure does not suffer from this issue. Moreover, paying attention to the p-values distributions under the null, which are displayed in Fig. 10 (see Section B of the Appendix), one can see the FLCM p-values do not follow a uniform distribution. In contrast, the MDD-based test corrects this phenomenon. As a result, it seems that our test provides a better calibration than the FLCM approach. Regarding the power, levels for both algorithms tend to 1 as sample size increases, and their values are higher for the d = 7 scenario than for the d = 3 one, as would be expected. Now, the FLCM algorithm outperforms the MDD results in all scenarios. However, our procedure is still quite competitive even considering that the data are simulated under the linear assumption, giving an advantage to the FLCM procedure.
Next, we compare the performance of the MDD with the ANFCM approach in an additive framework. Table 7 collects the simulation results for both procedures. We can see as both methodologies are well calibrated under the null (d = 0) for all levels, except for the 1%, where their values are out of the 95% confidence interval for n = 60. Nevertheless, taking greater values of n, as n = 100, solves this issue. Moreover, simulating under H 0 , the p-values follow a uniform distribution. This is illustrated in Fig. 11 displayed in Section B of the Appendix. If we simulate under the alternative hypotheses (d = 3 and d = 7), we see that these quantities tend to 1 as the sample size increases. In addition, as the covariate effect becomes more noticeable, going from d = 3 to d = 7, the power of ANFCM and MDD procedures increases. Again, the power of the ANFCM algorithm is always higher than the MDD one. At this point, we should notice that the ANFCM algorithm takes advantage of the fact that an additive structure with an intercept function is assumed. In contrast, our MDD test does not consider any model structure, not even the inclusion of intercept in the  model. As a result, our competitor has to measure all possible forms of departure from conditional mean independence. It is relevant to notice that, in both previous scenarios, covariates are related to the response employing trigonometric functions when it corresponds. Then, modeling the effects takes advantage of the B-spline basis representation. In addition, the errors are assumed to be time-independent between them in the FLCM and ANFCM scenarios. These considerations are a clear advantage for the FLCM and the ANFCM algorithms compared to our procedure. Thus, to test the FLCM and ANFCM performance in a functional context with time-correlated errors and when the model structure does not depend on only trigonometric functions, we apply these to the simulation scenarios introduced in Sect. 4. For this purpose, a partial approach is considered, testing the effect of the covariates separately using the FLCM procedure in Scenario A and the ANFCM one in Scenario B. To compare our results with theirs, we simulate now M = 2000 Monte Carlo replications and use B = 1000 bootstraps resamples for ANFCM. Again, we follow the authors' recommendation and use Q = 7 basis terms in both procedures. 1 We refer to Section A of the Appendix for a summary of the simulation parameters selection.
Results of partial FLCM tests in scenario A are displayed in Table 8. It can be seen how, regardless of the size of the sample used, the test is always poorly calibrated. In fact, all obtained p-values are out of the 95% confidence intervals. These results contrast with the MDD ones displayed in Table 3, where the test is well calibrated. This phenomenon may be because, as mentioned above, it is considered a different Table 9 Empirical sizes and powers of the ANFCM effect test considering H 01 : F 1 (t, X 1 (t)) = 0 and H 02 : F 2 (t, X 2 (t)) = 0 and using B = 1000 bootstrap resamples in Scenario B Model dependence structure more related to a functional nature. In terms of power, there is not a clear winner. Our test is more powerful in the H a : β 1 (t) = 0, β 2 (t) = 0 scenario in test H 02 , but FLCM is a bit more powerful in the last scenario for H 02 . However, this difference is small, and considering that the FLCM is not well calibrated, it makes sense to conclude that the MDD-based procedure outperforms this. Next, the performance of the ANFCM algorithm is tested by simulating under Scenario B of Sect. 4. Results are collected in Table 9. Again, comparing the ANFCM results with the ones of the MDD test (Table 5), we see that the MDD test is well calibrated even for small values as n = 20 (except for a couple of cases). This fact contrasts with the results of the ANFCM procedure. In this last, most of the values are out of the 95% confidence intervals. Moreover, the MDD test has more power than ANFCM in almost all cases. As a particularity, the ANFCM algorithm is not able to detect the relevance of X 2 (t) in the H a : F 1 (·) = 0, F 2 (·) = 0 scenario. Instead, the percentage of rejections is around the significance values and does not provide significant evidence to reject the null hypothesis H 02 of independence. Thus, we can conclude that the MDD outperforms the ANFCM procedure.
In summary, we have proved that the MDD algorithm performs pretty well in scenarios where the FLCM and the ANFCM procedures have an advantage, considering uncorrelated errors and trigonometric functions. Moreover, our test outperforms these when we move on to a more functional context, as in scenarios A and B introduced in Sect. 4. In these scenarios, we consider related errors and other types of relations different from trigonometric functions.

Real data analysis
In this section, we test the performance of the proposed algorithms in three real datasets. Firstly, the well-known gait dataset of Olshen et al. (1989) is considered. This dataset is an example of a linear effects model and has already been studied in the concurrent model framework in works as the one of Ghosal and Maity (2022a) or Kim et al. (2018). Next, a google flu database from the USA, borrowed from Wang et al. (2017), is studied. In this work, Wang et al. (2017) assume a linear formulation to model the data. Eventually, an example of a model with nonlinear effects and some missing points is studied. For this purpose, the bike sharing dataset of Fanaee-T and Gama (2014) is analyzed. Obtained results are compared with the ones of Ghosal and Maity (2022b) in this concurrent model framework.

Gait data
Here, we analyze the performance of the new dependence test in a well-known dataset from the functional data context. These data are the gait database (Olshen et al. 1989;Ramsay and Silverman 2005), in which the objective is to understand how the joints in the hip and the knee interact during a gait cycle in children. This problem has already been studied in the concurrent model context using a different methodology (see Ghosal and Maity (2022a), or Kim et al. (2018)). As a consequence, we compare our results with theirs.
The data consist of longitudinal measurements of hip and knee angles taken on 39 children with gait deficiency. These are measured as they walk through a single gait cycle. These data can be found in the fda library (Ramsay et al. 2020) of the R software (R Core Team 2019). The hip and knee angles are measured at 20 evaluation points {t u } 20 u=1 in [0, 1]. These values correspond to the completed percentage of a single gait cycle. Following previous studies, we have considered as response Y (t) the knee angle and as explanatory covariate X (t) the hip angle. Data are displayed in Fig. 5.
Applying our dependence test, we obtain a p-value close to 0. Thus, we have strong enough evidence to reject the independence hypothesis to the usual significance levels. This conclusion translates into a dependency between knee and hip angle in one cycle of gait data in children with poor gait. This result agrees with the ones of Kim et al. (2018) or Ghosal and Maity (2022a), among others, in the concurrent model framework. They obtain p-values less than 0.004 and 0.001, respectively. Summing up, the hip angle measured at a specific time point in a gait cycle has an effect on the knee angle at the same time point in children.

Google flu data from USA
Google flu data are used in Wang et al. (2017) to model the relationship between flu activity and temperature fluctuation in the USA. For this purpose, influenza-like illness (ILI) cases per 100000 doctor visits are considered in the 2013-2014 flu season (July 2013-June 2014). This information is got from the Google flu trend Website. Moreover, daily maximum and minimum temperature averaged over weather stations within each continental state is obtained by means of the US historical climatology network. The daily temperature variation (MDTV) is considered the explanatory covariate, being the difference between the daily maximum and daily minimum. The temperature fluctuation is aggregated to the same resolution as the flu activity data by taking the MDTV each week. Only 42 states are considered due to missed records. We refer to Wang et al. (2017) for more details. The original dates from July 1st, 2013, to June 30th, 2014, were numbered by integers from 1 to 365. Then, time t is rescaled to the [0, 1] interval by dividing the numbers by 365. Besides, we consider regional effects by dividing the data into four sets in terms of midwest, northeast, south, or west region to study them separately. Following Wang et al. (2017), the ILI percentage and MDTV are standardized at each time point t by dividing the variables by their root mean squares. Data of study are shown in Fig. 6 separating this by the considered regions.
Therefore, we want to test if the MDTV has relevant information in the flu tendency modeling of the four considered regions. For this aim, we can apply a global test for each one separately. Results of dependence tests are displayed in Table 10. In view of all p-values being higher than 0.1, we can conclude that we do not have enough evidence to reject the null hypothesis of mean conditional independence for levels as 10%. As a result, the MDTV does not play a relevant role in the ILI modeling, no matter the US region. We can argue that perhaps the regional effect is unimportant,  and we should consider the data as a whole. For this purpose, we implement a global test considering all the states, obtaining a p-value close to 0. This result highlights that there is strong evidence to reject the conditional mean independence between MDTV and ILI. As a result, MDTV provides notable information to explain the ILI behavior, but this is equal in the four considered regions, so a distinction does not make sense.
Our results agree with the ones of Wang et al. (2017). First, they reject the location effect for the linear model formulation. Secondly, they claim that one can avoid the MDTV covariate from the linear model for a 10% significance level but not for the 5% (p-value=0.052). Thus, they have moderately significant evidence that the MDTV plays a role in the ILI explanation, at least in the linear context. It is important to remark that differences may be because they assume linearity in their regression model. Furthermore, a first preprocessing step is applied in their case to remove spatial correlations.

Bike sharing data
Next, a bike-sharing dataset of the Washington, D.C., program is analyzed. This is introduced in Fanaee-T and Gama (2014). The data are obtained daily by the Capital bike-share system in Washington, D.C., from 1 January 2011 to 31 December 2012. The aim is to explain the number of casual rentals in terms of meteorological covariates. As a result, this dataset contains information on casual bike rentals in the cited period along with other meteorological variables such as temperature in Celsius (temp), the feels-like temperature in Celsius (atemp), relative humidity in percentage (humidity), and wind speed in Km/h (windspeed) on an hourly basis. In particular, only the data corresponding with Saturdays are considered because of the dynamic changes between working and weekend days. This selection results in a total of 105 Saturdays barring some exceptions (8 missings). All covariates are normalized by formula (t − t min )/(t max − t min ) in case of temp and atemp, and these are divided by the maximum for the humidity and windspeed case. In order to correct the skewness of the hourly bike rentals distribution (Y (t)), a log transformation is applied considering as response variable Y (t) = log(Y (t) + 1). These are showed in Fig. 7.
First, the missing data are recovered employing splines interpolation as described in Sect. 3.2. Then, once we have a total of n = 105 data points at each time instant, the global significance MDD-based test is performed. We obtain a p-value close to 0, which rejects the null hypothesis of independence for usual significant levels as the 5% or the 1%.
Next, we perform partial tests to detect if any of the four considered covariates (temp, atemp, humidity, and windspeed) can be excluded from the model. We obtain pvalues of 0, 0, 0.007, and 0.001 for temperature (temp), feels-like temperature (atemp), relative humidity (humidity), and wind speed (windspeed), respectively. Thus, we can claim that all of these affect the number of casual rentals at significance levels as the 1%. This last result agrees with other studies, like the one of Ghosal and Maity (2022b). In this study, different covariates are selected by the distinct considered penalizations.
In an overview of their results, each covariate is selected at least two times over the five considered procedures. As a result, all covariates seem to play a relevant role separately.

Discussion
We propose novel significance tests for the additive functional concurrent model, which collects a wide range of different structures between functional covariates and response. As a result, the relevance of a subset of covariates to model the response in a regression setting is tested, including global and partial tests to apply covariates screening. This approach allows one to detect irrelevant variables and reduce the problem dimensionality, facilitating the subsequent estimation procedure. For this aim, we construct test statistics based on MDD insights and taking into consideration all observed time instants. This process results in general significance tests able to determine the covariates' relevance over the complete trajectory. In contrast with existing methodology in literature for significance tests in the concurrent model, as the FLCM (Ghosal and Maity 2022a) or the ANFCM (Kim et al. 2018) procedures among others, our approach has the novel property that there is no need of a preliminary estimation of the model structure. Besides, this new procedure allows multivariate responses Y (t) ∈ R q for q ≥ 1 and t ∈ D. Furthermore, no tuning parameters are involved in contrast with previous methodologies. Instead, it is only needed to compute a U-statistic version of the MDD to be able to apply the tests. Using the theory of U-statistics, good properties of this estimator are guaranteed in practice, as its unbiasedness. In addition, its asymptotic distribution is obtained both under the null and local alternative hypotheses. Eventually, bootstrap procedures are implemented to obtain its p-values in practice.
The new tests proposed have displayed good performance in linear formulations as well as in nonlinear structures. This is appreciated by means of the results of scenarios A and B considered in the simulation study of Sect. 4. These procedures are well calibrated under the null hypothesis of no effect, tending to the significance level as the sample size increases. Moreover, they have power under alternatives, which one can deduce from observing that p-values tend to the unit as sample size increases when associated covariates are relevant. Besides, these procedures seem to perform well in real datasets too. We display an example of this result in Sect. 5, where we analyze three real datasets. Other authors have already studied these, so we compare our outcomes with existing literature, obtaining similar results when these are comparable. As a result, the MDD-based test is a pretty transversal tool to detect additive effects in the concurrent model framework without the need for previous assumptions or model structure estimation. Moreover, notice that all these ideas could be extended to conditional quantile dependence testing in the concurrent model framework. For this purpose, a similar development would be enough, following the guidelines and adapting the ideas of Section 3 in Zhang et al. (2018).
In terms of performance comparison with existing literature, the MDD-based test methodology is put together with Ghosal and Maity (2022a) (FLCM) and Kim et al. (2018) (ANFCM) algorithms in the linear and additive model framework, respectively. Based on the obtained results, it is possible to claim that the new procedure is quite competitive. Even when the FLCM and ANFCM procedures have the advantage of being implemented assuming the correct model structure and an optimal number of the basis components, the new procedure results are comparable to theirs. These results arise in Sect. 4.3. In contrast, our procedure outperforms their results by simulating a more functional scenario and avoiding only trigonometric expressions in the model. Besides, another disadvantage of the competitors is that m(t, X (t)) is unknown in practice, so a misguided assumption of the model structure could lead to poor results. In addition, as discussed in Ghosal and Maity (2022a) and Kim et al. (2018), a suitable selection of the number of the basis components is problematic in practice. This issue is still an open problem. This quantity plays the role of tuning parameter, so an appropriate value is needed to guarantee a proper adjustment. In contrast, our proposal has the novelty that this does not require previous estimation or tuning parameters selection. Our approach bridges a gap and solves the problems mentioned above.
One limitation of the present form of our test is that this only admits the study of numerical covariates. This restriction is quite common for the concurrent model framework. Some examples are the works of Ghosal and Maity (2022a) or Kim et al. (2018). If one wants to be able to include categorical variables, as in other works such as in Wang et al. (2017), a different metric is needed to correctly define the U-statistic of the MDD test. Some solutions for this problem have already been proposed for the distance covariance approach in the presence of noncontinuous variables. Similar ideas could be translated to the MDD context to solve this issue. An option is to extend the ideas proposed in Lyons (2013) for general metric spaces to this case. We leave this topic for future research.
Another drawback is related to the disposal of the observed time instants. It is necessary to monitor the same number of curves at each instant of time to be able to construct our proposed statistic. This restriction translates into synchronous observa-tions with n t = n points of the observed curves for all t ∈ D. When the number of missed points is small, we can impute these using interpolation techniques. An example is given in Sect. 3.2. However, in a sparse context where one observes each curve in a different number of time points, and these measures may not agree (asynchronous pattern), it is not possible preprocessing the data to obtain our starting point. Therefore, we need a new methodology based on different dependence measures. This problem is a pretty interesting area of study for future work. Concerning the observed time points, an additional drawback is the statistics computational time, being of the order of O(n(n − 1)(n − 2)(n − 3)T ) operations. Then, this procedure is quite competitive for "moderate" values of n and T . However, for large values of these quantities, especially those related to n, the statistic has a high computational cost. Consequently, simplification techniques in the number of required operations are of interest to make the procedure more tractable.
Eventually, we remark that our tests only collect additive effects. This phenomenon is due to the statistics structure displayed in (11). Although this formulation embraces a wide variety of different structures, this does not consider some complex relations. An example is the detection of possible interactions without a prespecified definition of a new variable collecting this information. Nevertheless, we think that our ideas can be extended to the general concurrent model formulation by resorting to projection techniques. Another interesting idea, pointed out by one of the referees, is to directly consider the integral of the vectorial version of the MDD coefficient and adapt the theoretical results of Székely et al. (2007). This expansion would translate into studying the convergence of the integrated version of an infinity family of empirical processes. Nevertheless, this procedure is not straightforward in this context. Both approaches are entirely new lines for future research that would need further study.

Appendix A: Simulation details for considered competitors
In this appendix, we remind the scenario A structure of the FLCM algorithm implemented in Ghosal and Maity (2022a) in Section A.1. Besides, the form of scenario (B) for the ANFCM testing performance introduced in Kim et al. (2018) is reviewed in Section A.2. Moreover, we explain how these algorithms are implemented in the simulation study of Sect. 4.3.
Code for simulations and real datasets analysis is available in the public GitHub repository https://github.com/LauraFreiG/Covariates_selection.git. In particular, this is summarized in the folder "Synchronous FCM", inside the folder "Functional Concurrent Model".
We consider the dense design, taking a total of T = 81 equidistant time points in [0, 1], being t 1 = 0 and t 81 = 1. A Monte Carlo study is carried out using M = 1000 replicates to measure calibration and power, and p-values are calculated using B = 100000 samples generated under the null hypothesis of no effect (H 0 : β 1 (t) = 0 for all t). Following the authors' guidelines, the number of basis components considered is Q = 7. To measure calibration and power we consider d = 0 and d = 3, 7, respectively. Besides, we take n = 60, 100 to compare their results with the MDDbased test ones. To implement this algorithm, we have used the public code which can be found in 10.1016/j.ecosta.2021. 05.003. In particular, we generate the data and use the FLCM.test1 function of the test.R script to implement the test.
Here, the dense design scenario is considered with T = 81 equidistant time points in [0, 1], being t 1 = 0 and t 81 = 1. To study its calibration and power behavior a Monte Carlo study is carried out. We employ a total of M = 1000 replicates to study both. In this case, p-values are calculated by means of B = 200 bootstrap samples in all cases. Besides, following Kim et al. (2018) parameters selection, the number of the basis components taken is Q = 7. In order to measure calibration and power we test with d = 0 and d = 3, 7, we simulate under null and alternative hypotheses, respectively. Besides, we take n = 60, 100 to compare their results with the MDD-based test ones. We have found the code available in https://www4.stat.ncsu.edu/~maity/software.html and we borrowed it to reproduce the ANFCM simulations. Specifically, we make use of the anova.datagen function of the datagenALL.R 2 script to generate the data and apply test.anova function of the test.R script to implement the algorithm, using now list ( (1-q)*(a0%*%t(ones)+a1%*%t(phi1)+a2%*%t(phi2)) for the expression X.list[[q]] = (a0%*%t(ones)+a1%*%t(phi1)+a2%*%t(phi2))/sqrt(2^(q-1)) to correct a typo. Besides, it is needed to change F.anova2 = function(x1,x2,t,d)2*t+t^2+x1*sin(pi*t)/4+d*2*cos(x2*t) by F.anova2 = function(x2,t,d)2*t+t^2+d*2*cos(x2*t) as well as Publisher's Note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.