Let \(\varvec{Y}=\{Y_j\}_{j=1}^p\) be a set of p variables, and let \(F_{1},\ldots , F_{p}\) and F be their continuous marginal and joint distribution functions, respectively. In this context, Sklar (1959) states that there is a unique copula function \(C\!: [0,1]^p \rightarrow [0,1]\) such that \(F(\varvec{y})=F(y_{1},\ldots , y_{p})=C(F_{1}(y_{1}),\ldots , F_{p}(y_{p}))\) for all \(\varvec{y}=(y_1,\ldots , y_p)\in \varvec{\mathbb {R}}^p\). That is, copulas are joint distribution functions whose marginals are standard uniform. Patton (2006) extends this result to conditional copulas and states that, given a covariate Z, there is a unique copula \(C_z:[0,1]^p\rightarrow [0,1]\) such that \(F_z(\varvec{y})=C_z(F_{1z}(y_{1}),\ldots , F_{pz}(y_{p}))\), where \(F_{jz}(y)=P(Y_j\le y|Z=z)\) for any y in the support of \(Y_j\), \(j=1,\ldots ,p\). By inverting Sklar’s theorem, \(C_z\) can be expressed in terms of the joint and marginal distribution functions as \(C_z(\varvec{u})=F_z(F_{1z}^{-1}(u_{1}),\ldots , F_{pz}^{-1}(u_p))\), where \(\varvec{u}=(u_1,\ldots , u_p)\in [0,1]^p\) and \(F_{jz}^{-1}(u)=\inf \{y: F_{jz}(y)\ge u\}\) is the z-conditional quantile function of \(Y_j\).
To estimate conditional copulas, Gijbels et al. (2011) propose a nonparametric estimator in a bivariate context. We use the natural extension to the multivariate conditional copula estimator,
$$\begin{aligned} \hat{C}_{z,h_n}(\varvec{u})= & {} \sum _{i=1}^n w_i(z, h_n)I \left\{ Y_{1i}\le \hat{F}_{1z,h_n}^{-1}(u_1),\ldots , Y_{pi}\le \hat{F}_{pz,h_n}^{-1}(u_p)\right\} , \end{aligned}$$
(1)
where \(\{w_i(z, h_n)\}\) is a sequence of weights depending on \((z-Z_i)/h_n\), \(h_n\) is the bandwidth and \(I\{\cdot \}\) is the indicator function. We consider Nadaraya–Watson weights, \(w_i(z, h_n) = k((z-Z_i)/h_n)/\sum _j k((z-Z_j)/h_n)\), where k is a kernel function. The nonparametric estimator of the j-th conditional marginal is \(\hat{F}_{jz,h_n}(y)=\sum _{i=1}^n w_i(z, h_n)I\{Y_{ji}\le y\}\). It is noteworthy that the bandwidth in this case does not have the usual smoothing effect as in regression. In fact, as the bandwidth \(h_n\) increases, the weights become nearly uniform and the copula estimator \(\hat{C}_{z,h_n}\) tends to the empirical copula \(\hat{C}_{z}(\varvec{u})=n^{-1}\sum _{i=1}^n I\{Y_{1i}\le \hat{F}_{1z}^{-1}(u_1),\ldots , Y_{pi}\le \hat{F}_{pz}^{-1}(u_p)\}\).
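As an illustration, estimator (1) can be sketched in a few lines of NumPy. This is a minimal sketch under stated assumptions, not the authors' implementation: the Epanechnikov kernel is an arbitrary choice of second-order kernel on \([-1,1]\), and the function names `epanechnikov`, `nw_weights`, `weighted_quantile` and `conditional_copula` are hypothetical.

```python
import numpy as np

def epanechnikov(t):
    """Second-order kernel with compact support [-1, 1] (an assumed choice)."""
    t = np.asarray(t, dtype=float)
    return np.where(np.abs(t) <= 1.0, 0.75 * (1.0 - t**2), 0.0)

def nw_weights(z, Z, h, kernel=epanechnikov):
    """Nadaraya-Watson weights w_i(z, h); they are nonnegative and sum to one."""
    k = kernel((z - np.asarray(Z, dtype=float)) / h)
    total = k.sum()
    if total <= 0.0:
        raise ValueError("no observations inside the kernel window around z")
    return k / total

def weighted_quantile(y, w, u):
    """Generalised inverse inf{y : F_hat(y) >= u} of the weighted ECDF, u in (0, 1]."""
    order = np.argsort(y)
    cum = np.cumsum(w[order])
    # Small tolerance guards against rounding in the cumulative sum.
    idx = np.searchsorted(cum, u - 1e-12, side="left")
    return y[order][min(idx, len(y) - 1)]

def conditional_copula(u, Y, Z, z, h):
    """Evaluate the estimator in (1) at u; Y has shape (n, p), u has length p."""
    Y = np.asarray(Y, dtype=float)
    w = nw_weights(z, Z, h)
    # Conditional marginal quantiles, one per coordinate.
    q = np.array([weighted_quantile(Y[:, j], w, u[j]) for j in range(Y.shape[1])])
    # Observations lying below all p quantiles simultaneously.
    inside = np.all(Y <= q, axis=1)
    return float(np.sum(w * inside))
```

With a very large bandwidth, `nw_weights` returns weights close to \(1/n\), which is the mechanism behind the convergence to the empirical copula noted above.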
To quantify the degree of dependence, we estimate Kendall’s tau coefficient as a measure of the ordinal association between measured quantities. The multivariate Kendall’s tau is defined as in Joe (1990), \(\tau =(2^{p-1}-1)^{-1}\left( 2^p \int _{\mathbf {I}^p} C(\varvec{u})dC(\varvec{u})-1\right) ,\) where \(\mathbf {I}^p=[0,1]^p\). The multivariate Kendall’s tau accounts for common comovements beyond pairwise effects and quantifies simultaneous concordance. Thus, more variables imply more conditions to be met at the same time, and so fewer concordances are expected. The distortion that the number of variables can produce is mitigated by the p-dependent correction factor included in the definition of tau. Note that Kendall’s tau is a measure of dependence that depends only on the copula and not on the marginals. We also note the advantage of the multivariate Kendall’s tau over the pairwise average as an overall dependence measure, since it accounts for the full multivariate distribution and not only for bivariate effects. As the multivariate nonparametric estimator of \(\tau \), we consider an extended version of the empirical bivariate Kendall’s tau (Deheuvels 1980),
$$\begin{aligned} \hat{\tau }=\frac{1}{2^{p-1}-1}\Big (\frac{2^p}{n(n-1)}\sum _{i=1}^n \sum _{j=1}^n I\{\mathbf {Y}_{i}\!<\!\mathbf {Y}_{j}\}-1\Big ), \end{aligned}$$
where \(\varvec{Y}_{i}=(Y_{1i},\ldots ,Y_{pi})\) and \(I\{\mathbf {Y}_{i}<\mathbf {Y}_{j}\}=I\{Y_{1i}<Y_{1j},\ldots ,Y_{pi}<Y_{pj}\}\).
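The empirical multivariate Kendall’s tau above can be computed directly from its definition. The following is a minimal sketch, assuming continuous data without ties; the function name `multivariate_kendall_tau` is hypothetical, and the pairwise comparison uses \(O(n^2p)\) memory, so it is only suitable for moderate sample sizes.

```python
import numpy as np

def multivariate_kendall_tau(Y):
    """Empirical multivariate Kendall's tau for an (n, p) data array, p >= 2."""
    Y = np.asarray(Y, dtype=float)
    n, p = Y.shape
    # less[i, j] is True when Y_i < Y_j holds in every one of the p coordinates.
    less = np.all(Y[:, None, :] < Y[None, :, :], axis=2)
    concordant = less.sum()
    # Rescale so that comonotone data give tau = 1.
    return (2.0**p * concordant / (n * (n - 1)) - 1.0) / (2.0**(p - 1) - 1.0)
```

For \(p=2\) and no ties, the double sum counts each concordant pair once, so the expression reduces to \(4N_c/(n(n-1))-1\), the usual bivariate Kendall’s tau with \(N_c\) concordant pairs.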
The multivariate Kendall’s tau proposed by Joe (1990) can be extended to conditional copulas as
$$\begin{aligned} \tau _{z}=\frac{1}{(2^{p-1}-1)}\left( 2^p \int _{\mathbf {I}^p} C_z(\varvec{u})dC_z(\varvec{u})-1\right) . \end{aligned}$$
(2)
The proposed nonparametric estimator is
$$\begin{aligned} \hat{\tau }_{z,h_n}=\frac{1}{2^{p-1}-1}\left( \frac{2^p}{1-\sum _{i=1}^n w_i(z, h_n)^2}\sum _{i,j=1}^n w_i(z, h_n)w_j(z, h_n)I\{\mathbf {Y}_{i}<\mathbf {Y}_{j}\}-1\right) , \end{aligned}$$
(3)
where the weights are based on the recommendations given by Gijbels et al. (2011). Note that, as for the copula estimator, the conditional Kendall’s tau estimator (3) tends to the unconditional empirical Kendall’s tau as the bandwidth increases. This estimator generalizes the bivariate estimator in Gijbels et al. (2011). The asymptotic normality of the conditional Kendall’s tau estimator is established by Veraverbeke et al. (2011) for the bivariate case. The next proposition extends the consistency and asymptotic normality results to the multivariate conditional Kendall’s tau estimator in (3) under the usual set of assumptions:
- A1. \((\varvec{Y}_{i},Z_{i})\), \(i=1,\ldots ,n\) are i.i.d. tuples.
- A2. The conditional joint distribution \(F_z(\cdot )=F(\cdot |z)\) and the density of the covariable Z, f(z), have continuous first and second order derivatives with respect to z, all denoted with the respective primes.
- A3. The kernel is a bounded symmetric second-order kernel with compact support \(\Omega =[-1,1]\) such that \(\int _{\Omega }k(\eta )d\eta =1\). Moreover, \(c_k=\int _{\Omega } k(\eta )\eta ^2d\eta \) and \(d_k=\int _{\Omega } k(\eta )^2d\eta \) are nonzero quantities.
- A4. \(h_n\rightarrow 0\) and \(nh_n\rightarrow \infty \) as \(n\rightarrow \infty \).
Proposition 1
Under assumptions A1 to A4, the conditional Kendall’s tau estimator \(\hat{\tau }_{z,h_n}\) defined in (3) is a consistent estimator of \(\tau _z\) defined in (2), where the asymptotic bias is
$$\begin{aligned} Bias(\hat{\tau }_{z,h_n})=\frac{2^{p-1}h_n^2c_k}{(2^{p-1}-1)f(z)}\int _{\mathbb {R}^p}\Big (F_z(\varvec{y})\,g\left( f_z(\varvec{y})\right) +f_z(\varvec{y})\,g\left( F_z(\varvec{y})\right) \Big ) d\varvec{y}+o(h_n^2), \end{aligned}$$
with \(g(r_{z}(\varvec{y}))=r_{z}(\varvec{y})f''(z)+2r'_{z}(\varvec{y})f'(z)+r''_{z}(\varvec{y})f(z)\).
Moreover, if \(C_z^L\) is the limiting distribution of \(\left( nh_n\right) ^{1/2}\big (\hat{C}_{z,h_n}(\varvec{u})-C_z(\varvec{u})\big )\) and \(\varphi _z\) is a Gaussian variable given by
$$\begin{aligned} \varphi _z=2^p(2^{p-1}\!-\!1)^{-1}\left( \int _{I^p}C_z(\varvec{u})dC_z^L(\varvec{u})+\int _{I^p}C_z^L(\varvec{u})dC_z(\varvec{u})\right) , \end{aligned}$$
the asymptotic variance of \(\hat{\tau }_{z,h_n}\) is given by the variance of \(\varphi _z\), \(\sigma ^2(\varphi _z)\). Additionally, assuming that \(h_n=o(n^{-1/5})\) and \(\int _{\Omega }k(\eta )^{\zeta }d\eta \ne 0\) for \(\zeta >2\),
$$\begin{aligned} \left( nh_n\right) ^{1/2}\left( \hat{\tau }_{z,h_n}-\tau _z\right) \xrightarrow {d}\varphi _z. \end{aligned}$$
The limiting distribution of the multivariate estimator (3) follows from the asymptotic normality of the conditional copula estimator \(\hat{C}_{z,h_n}(\varvec{u})\), given that Kendall’s tau can be written as a functional of the copula and that this functional is Hadamard differentiable (tangentially to the set of continuous functions on \([0,1]^p\)). The details are given in Appendix A.
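For concreteness, estimator (3) can be sketched as follows. This is an illustrative sketch only: it assumes Nadaraya–Watson weights with an Epanechnikov kernel (one admissible kernel under A3) and data without ties, and the function name `conditional_kendall_tau` is hypothetical.

```python
import numpy as np

def conditional_kendall_tau(Y, Z, z, h):
    """Conditional multivariate Kendall's tau as in (3); Y has shape (n, p)."""
    Y = np.asarray(Y, dtype=float)
    Z = np.asarray(Z, dtype=float)
    p = Y.shape[1]
    # Nadaraya-Watson weights with an Epanechnikov kernel (assumed choice).
    t = (z - Z) / h
    k = np.where(np.abs(t) <= 1.0, 0.75 * (1.0 - t**2), 0.0)
    keep = k > 0.0
    if keep.sum() < 2:
        raise ValueError("need at least two observations in the kernel window")
    w = k[keep] / k[keep].sum()
    Yk = Y[keep]
    # less[i, j] = 1 when Y_i < Y_j componentwise, among retained observations.
    less = np.all(Yk[:, None, :] < Yk[None, :, :], axis=2).astype(float)
    weighted = w @ less @ w            # sum_{i,j} w_i w_j I{Y_i < Y_j}
    norm = 1.0 - np.sum(w**2)          # removes the i = j terms
    return (2.0**p * weighted / norm - 1.0) / (2.0**(p - 1) - 1.0)
```

With equal weights \(w_i=1/n\), the normalisation \(1-\sum _i w_i^2=(n-1)/n\) turns the weighted double sum into the unconditional estimator, which matches the large-bandwidth behaviour described above.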
Bandwidth selection
Classic proposals for selecting the smoothing parameter are based on the rule of thumb, cross-validation or plug-in methods. Smoothing parameter selection for distribution functions has been studied by Altman and Leger (1995), Sarda (1993) and Bowman et al. (1998). Derumigny and Fermanian (2019) propose a cross-validation bandwidth selection procedure for the conditional Kendall’s tau. Here, we propose a plug-in pointwise bandwidth selection method for the nonparametric conditional Kendall’s tau by minimizing the overall mean squared error (MSE) of the conditional tau.
The bias and variance needed to compute the MSE are estimated via the jackknife method, based on Quenouille (1956) for the bias and Tukey (1958) for the variance. The procedure is an iterative process closely related to the bootstrap resampling method proposed by Efron (1979). The jackknife is a linear approximation of the bootstrap (Abdi and Williams 2010) that entails lower computational costs and is more suitable for small data samples (Oyeyemi 2008; Efron 1982). The main steps for selecting the bandwidth for the conditional Kendall’s tau are summarized in Algorithm 1. We consider \(h_0=0.9An^{-1/5}\) (Silverman 1986) as the initial bandwidth for variance estimation, where \(A=\min (\gamma (Z)/1.34,\ \sigma (Z))\), and \(\gamma (Z)\) and \(\sigma (Z)\) are the interquartile range and the standard deviation of the covariable Z, respectively. The initial bandwidth for the bias is taken as proposed in Gijbels et al. (2011).
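The two ingredients named above can be sketched generically. The code below is not Algorithm 1; it only illustrates the rule-of-thumb bandwidth \(h_0\) and the classical delete-one jackknife estimates of bias and variance for an arbitrary statistic. The function names `silverman_bandwidth` and `jackknife_bias_variance` are hypothetical, and how the jackknife is combined with the kernel estimator is left to the algorithm in the paper.

```python
import numpy as np

def silverman_bandwidth(Z):
    """Rule-of-thumb bandwidth h0 = 0.9 * A * n^(-1/5), A = min(IQR / 1.34, sd)."""
    Z = np.asarray(Z, dtype=float)
    n = Z.size
    q75, q25 = np.percentile(Z, [75, 25])
    A = min((q75 - q25) / 1.34, np.std(Z, ddof=1))
    return 0.9 * A * n ** (-1.0 / 5.0)

def jackknife_bias_variance(stat, data):
    """Delete-one jackknife estimates of the bias and variance of stat(data)."""
    data = np.asarray(data)
    n = data.shape[0]
    full = stat(data)
    # Leave-one-out replicates of the statistic (rows are observations).
    loo = np.array([stat(np.delete(data, i, axis=0)) for i in range(n)])
    mean_loo = loo.mean()
    bias = (n - 1) * (mean_loo - full)                 # Quenouille-type bias
    var = (n - 1) / n * np.sum((loo - mean_loo) ** 2)  # Tukey-type variance
    return bias, var
```

A pointwise MSE estimate for a candidate bandwidth would then be `bias**2 + var`, evaluated with `stat` set to the conditional tau estimator at that bandwidth.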
Testing for restrictions in conditional dependence
In this section, we propose a test for linear restrictions that applies to any null hypothesis that can be expressed as
$$\begin{aligned} H_0: \varvec{R}\,\varvec{\tau _z}=\varvec{r}, \end{aligned}$$
(4)
where \(\varvec{\tau _z}=(\tau _{z_1},\ldots ,\tau _{z_m})'\) is an m-dimensional column vector of conditional Kendall’s taus and \(z_1,\ldots ,z_m\) are m deterministic conditioning values in the range of the covariable Z. The values \(z_1,\ldots ,z_m\) are chosen to be sufficiently spaced so that the subsamples used in the nonparametric estimator of the conditional Kendall’s tau for each \(\{\tau _{z_\ell }\}_{\ell =1}^m\) do not overlap. \(\varvec{R}\) is a \(q\times m\) matrix of rank \(q\le m\) and \(\varvec{r}\) is a q-dimensional column vector, where q is the number of restrictions to be tested. Both \(\varvec{R}\) and \(\varvec{r}\) are deterministic. The alternative is \(H_a: \varvec{R}\,\varvec{\tau _z}\ne \varvec{r}\). The test statistic is
$$\begin{aligned} \mathcal{J}_{n}=nh_n(\varvec{R}\,\varvec{\hat{\tau }_{_{z,h_n}}}-\varvec{r})'(\varvec{R}\varvec{V}_{\!\hat{\tau }_{_{z,h_n}}}\varvec{R}')^{-1}(\varvec{R}\,\varvec{\hat{\tau }_{_{z,h_n}}}-\varvec{r}), \end{aligned}$$
(5)
where \(\varvec{V}_{\!\hat{\tau }_{_{z,h_n}}}\!\!\) is the covariance matrix of \(\varvec{\hat{\tau }_{_{z,h_n}}}\), and \(h_n\) is the bandwidth. The following proposition establishes the asymptotic distribution of the test statistic \(\mathcal{J}_{n}\) in (5) and the asymptotic local power for local alternatives of type \(H_a(\xi _n): \varvec{R}\,\varvec{\tau _z}=\varvec{r}+\xi _n\, \varvec{\varsigma }\), where \(\varvec{\varsigma }\) is a \(q\times 1\) nonzero deterministic column vector and \(\xi _n\rightarrow 0\) as \(n\rightarrow \infty \).
Proposition 2
Consider the same assumptions as in Proposition 1 and a set of conditioning values \(\varvec{z}=(z_1,\ldots ,z_m)\), \(m<n\), sufficiently spaced such that the subsamples used in the estimation for each \(z_\ell \in \varvec{z}\) are disjoint, which ensures independence. Under the null hypothesis, the \(\mathcal{J}_{n}\) statistic asymptotically has a \(\chi ^2\) distribution with q degrees of freedom.
Under local alternatives \(H_a(\xi _n)\) with \(\xi _n=(nh_n)^{-1/2}\), the \(\mathcal{J}_{n}\) statistic is asymptotically distributed as a noncentral \(\chi ^2\) distribution with q degrees of freedom and noncentrality parameter \(\delta _n=\varvec{\varsigma }'(\varvec{R}\varvec{V}_{\!\hat{\tau }_{_{z,h_n}}}\varvec{R}')^{-1}\varvec{\varsigma }\).
The limiting distributions in Proposition 2 can be obtained from the joint asymptotic normality of the multivariate Kendall’s tau estimators conditional on different points and Slutsky’s theorem. An outline of the proof is given in Appendix A.
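Evaluating the statistic in (5) is a small linear-algebra exercise once the estimates are available. The sketch below follows (5) literally; the function name `wald_statistic` is hypothetical, and the caller is assumed to supply \(\varvec{V}_{\!\hat{\tau }_{_{z,h_n}}}\) on the scale that (5) requires.

```python
import numpy as np

def wald_statistic(tau_hat, V, R, r, n, h):
    """Quadratic form n*h*(R tau - r)' (R V R')^{-1} (R tau - r) as in (5).

    Returns the statistic and the degrees of freedom q (the number of rows of R).
    """
    tau_hat = np.asarray(tau_hat, dtype=float)
    V = np.asarray(V, dtype=float)
    R = np.atleast_2d(np.asarray(R, dtype=float))
    r = np.asarray(r, dtype=float)
    d = R @ tau_hat - r
    M = R @ V @ R.T
    # Solve the linear system instead of forming the explicit inverse.
    J = n * h * float(d @ np.linalg.solve(M, d))
    return J, R.shape[0]
```

Under the null hypothesis, the returned value would be compared with a \(\chi ^2\) quantile with q degrees of freedom; if SciPy is available, `scipy.stats.chi2.sf(J, q)` gives the corresponding asymptotic p-value.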
The main steps for the practical implementation of the above test are presented in Algorithm 2. Without loss of generality, we consider a single bandwidth value, although the procedure can be generalized to local bandwidth values. Note that in practice, \(\varvec{V}_{\!\hat{\tau }_{_{z,h_n}}}\!\!\) must be consistently estimated. An alternative is considered in Step 2. In related papers, Gijbels et al. (2017) and Lemyre and Quessy (2017) propose different resampling procedures to test for covariate effects. We propose resampling procedures adapted to each hypothesis considered, which are detailed in each case.
The null hypothesis in expression (4) covers many possible situations. In particular, it makes it possible to test for conditionally constant dependence. Alternative tests to determine whether there are covariate effects can be found in Lemyre and Quessy (2017) for conditional distributions, and in Gijbels et al. (2017) and Derumigny and Fermanian (2017) for conditional copulas. Specifically, Gijbels et al. (2017) review some existing procedures purely based on conditional copula structures and introduce some nonparametric proposals using conditional Kendall’s tau.
In this particular case, \(q=m-1\), \(\varvec{R}\) is a \((m-1)\times m\) matrix with ones on the main diagonal and \(-1\) values on the first superdiagonal, and \(\varvec{r}=\varvec{0}_{(m-1)\times 1}\). Due to the complexity of the asymptotic variance–covariance matrix, we use a permutation procedure to estimate \(\varvec{\hat{V}}_{\!\hat{\tau }_{_{z,h_n}}}\) under the null hypothesis: keep Z fixed and obtain permuted \(p\)-tuples \(\{(Y_{1i}^{b},\ldots ,Y_{pi}^{b})\}_{i=1}^n\) from \(\{(Y_{1i},\ldots ,Y_{pi})\}_{i=1}^n\) for a large number of permutations B. Then, with the permuted samples, estimate \(\{\hat{\tau }_{_{z_\ell ,h_n}}^{b}\}_{b=1}^B\) for each \(\ell =1,\ldots ,m\) and compute the sample variance of the set of estimated conditional Kendall’s taus.
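The restriction matrix for the constancy test and the permutation step can be sketched as follows. This is a sketch under assumptions: `estimator` is a hypothetical callable with signature `estimator(Y, Z, z, h)` returning a conditional tau estimate, the function names and the defaults `B=200` and `seed=0` are illustrative, and only the per-point variances are returned.

```python
import numpy as np

def constancy_restrictions(m):
    """R with ones on the main diagonal and -1 on the first superdiagonal; r = 0."""
    R = np.eye(m - 1, m) - np.eye(m - 1, m, k=1)
    r = np.zeros(m - 1)
    return R, r

def permutation_variances(estimator, Y, Z, z_grid, h, B=200, seed=0):
    """Sample variance of the permuted estimates at each conditioning value."""
    rng = np.random.default_rng(seed)
    Y = np.asarray(Y)
    n = Y.shape[0]
    draws = np.empty((B, len(z_grid)))
    for b in range(B):
        # Permute whole rows: each p-tuple stays intact while Z is kept fixed,
        # which breaks the link between Y and Z as under the null hypothesis.
        Yb = Y[rng.permutation(n)]
        draws[b] = [estimator(Yb, Z, z, h) for z in z_grid]
    return draws.var(axis=0, ddof=1)
```

Each row of `R` encodes one restriction \(\tau _{z_\ell }-\tau _{z_{\ell +1}}=0\), so `R @ tau` vanishes exactly when all m conditional taus coincide.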
The statistic \(\mathcal{J}_{n}\) can also be used to test linear restrictions across different waves. Let \(s_1\) and \(s_2\) be two independent samples. Then, \(\varvec{\tau _z}\) is a \(2m\times 1\) stacked vector accounting for the conditional dependence in the two samples, \(\varvec{\tau _z}=(\tau _{z_1}^{s_1},\ldots ,\tau _{z_m}^{s_1},\tau _{z_1}^{s_2},\ldots ,\tau _{z_m}^{s_2})'= ({\varvec{\tau _z}^{s_1\ '}}, {\varvec{\tau _z}^{s_2\ '}})'\), \(\varvec{R}=(\varvec{I}_m, -\varvec{I}_m)\), where \(\varvec{I}_m\) is the \(m\times m\) identity matrix, and \(\varvec{r}=\varvec{0}_{m\times 1}\). The estimated variance–covariance matrix \(\varvec{\hat{V}}_{\!\hat{\tau }_{_{z,h_n}}}\) is now a block matrix with \(\varvec{\hat{V}}^{s_1}_{\!\hat{\tau }_{_{z,h_n}}}\) and \(\varvec{\hat{V}}^{s_2}_{\!\hat{\tau }_{_{z,h_n}}}\) as diagonal blocks and \(\varvec{\widehat{C}ov}(\varvec{\hat{\tau }}^{s_1}_{z,h_n}, \varvec{\hat{\tau }}^{s_2}_{z,h_n})\) as off-diagonal blocks. The permutation procedure used to estimate \(\varvec{\hat{V}}_{\!\hat{\tau }_{_{z,h_n}}}\) is not directly applicable in this context and has to be replaced by an appropriate resampling procedure. For each sample \(s=s_1,s_2\), a bootstrap procedure is implemented: draw bootstrap \((p+1)\)-tuples \(\{(Y_{s,1i}^{b},\ldots ,Y_{s,pi}^{b},Z_{s,i}^b)\}_{i=1}^n\) from \(\{(Y_{s,1i},\ldots ,Y_{s,pi},Z_{s,i})\}_{i=1}^n\) a sufficiently large number of times B. Estimate \(\hat{\tau }_{_{z_{\ell },h_n}}^{s,b}\) for each bootstrapped sample and calculate the variances \(\hat{\sigma }^2(\hat{\tau }_{_{z_\ell ,h_n}})=(2B)^{-1} \sum _{s,b} (\hat{\tau }_{_{z_\ell ,h_n}}^{s,b}-\overline{\hat{\tau }}_{_{z_\ell ,h_n}})^2\). Then, set \(\varvec{\hat{V}}_{\!\hat{\tau }_{_{z,h_n}}}\!\!\) to be a diagonal matrix of size \(2m\times 2m\) built from the estimated values \(\{\hat{\sigma }^2(\hat{\tau }_{_{z_\ell ,h_n}})\}_{\ell =1}^m\).
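The two-sample setting can be sketched along the same lines. The code below is illustrative and rests on several assumptions that the text does not spell out: `estimator` is a hypothetical callable `estimator(Y, Z, z, h)`, the pooled variances are placed on the diagonal once for each sample, \(\varvec{r}\) is taken with m entries so that it conforms with \(\varvec{R}\,\varvec{\tau _z}\), and the function names and defaults are not from the paper.

```python
import numpy as np

def two_sample_restrictions(m):
    """R = (I_m, -I_m) and r = 0, testing equality of the taus across samples."""
    R = np.hstack([np.eye(m), -np.eye(m)])
    r = np.zeros(m)
    return R, r

def pooled_bootstrap_covariance(estimator, samples, z_grid, h, B=200, seed=0):
    """Diagonal 2m x 2m matrix from bootstrap variances pooled over both samples.

    samples is a pair ((Y1, Z1), (Y2, Z2)); rows of Y and entries of Z are
    resampled jointly so that each (p+1)-tuple stays intact.
    """
    rng = np.random.default_rng(seed)
    m = len(z_grid)
    draws = np.empty((2, B, m))
    for s, (Y, Z) in enumerate(samples):
        Y = np.asarray(Y)
        Z = np.asarray(Z)
        n = Y.shape[0]
        for b in range(B):
            idx = rng.integers(0, n, size=n)  # sampling with replacement
            draws[s, b] = [estimator(Y[idx], Z[idx], z, h) for z in z_grid]
    pooled = draws.reshape(2 * B, m)
    # (2B)^{-1} * sum over s and b of squared deviations from the pooled mean.
    sigma2 = np.mean((pooled - pooled.mean(axis=0)) ** 2, axis=0)
    # Assumption: the same m pooled variances fill both diagonal blocks.
    return np.diag(np.concatenate([sigma2, sigma2]))
```

With this `R`, the product `R @ tau` is the vector of differences \(\tau _{z_\ell }^{s_1}-\tau _{z_\ell }^{s_2}\), so the null hypothesis states that the conditional dependence is the same in both samples at every conditioning value.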
These two applications of the \(\mathcal{J}_{n}\) statistic are implemented in the simulation study presented in the next section.