Abstract
Vine copulas are a flexible class of dependence models consisting of bivariate building blocks and have proven to be particularly useful in high dimensions. Classical model distance measures require multivariate integration and thus suffer from the curse of dimensionality. In this paper, we provide numerically tractable methods to measure the distance between two vine copulas even in high dimensions. For this purpose, we consecutively develop three new distance measures based on the Kullback–Leibler distance, using the result that it can be expressed as the sum over expectations of KL distances between univariate conditional densities, which can be easily obtained for vine copulas. To reduce numerical calculations, we approximate these expectations on adequately designed grids, outperforming Monte Carlo integration with respect to computational time. For the sake of interpretability, we provide a baseline calibration for the proposed distance measures. We further develop similar substitutes for the Jeffreys distance, a symmetrized version of the Kullback–Leibler distance. In numerous examples and applications, we illustrate the strengths and weaknesses of the developed distance measures.
Notes
This includes, for example, C- and D-vines (Aas et al. 2009) having the same diagonal.
Since most copulas have an infinite value at the boundary of the unit cube, we usually restrict ourselves to \([\varepsilon ,1-\varepsilon ]^d\) for a small \(\varepsilon >0\).
All numerical calculations in this paper were performed on a Linux computer (8-way Opteron) with 32 cores (each with 2.6 GHz and 3.9 GB of memory).
References
Aas, K., Czado, C., Frigessi, A., Bakken, H.: Pair-copula constructions of multiple dependence. Insur. Math. Econ. 44, 182–198 (2009)
Acar, E.F., Genest, C., Nešlehová, J.: Beyond simplified pair-copula constructions. J. Multivar. Anal. 110, 74–90 (2012)
Bedford, T., Cooke, R.M.: Vines: a new graphical model for dependent random variables. Ann. Stat. 30(4), 1031–1068 (2002)
Brechmann, E.C., Czado, C.: Risk management with high-dimensional vine copulas: an analysis of the Euro Stoxx 50. Stat. Risk Model. 30(4), 307–342 (2013)
Caflisch, R.E.: Monte Carlo and quasi-Monte Carlo methods. Acta Numer. 7, 1–49 (1998)
Cooke, R.M., Joe, H., Chang, B.: Vine regression. Resources for the Future Discussion Paper, pp. 15–52 (2015)
Cover, T.M., Thomas, J.A.: Elements of Information Theory. Wiley, Hoboken (2012)
Dißmann, J., Brechmann, E.C., Czado, C., Kurowicka, D.: Selecting and estimating regular vine copulae and application to financial returns. Comput. Stat. Data Anal. 59, 52–69 (2013)
Do, M.N.: Fast approximation of Kullback–Leibler distance for dependence trees and hidden Markov models. IEEE Signal Process. Lett. 10(4), 115–118 (2003)
Haff, I.H., Aas, K., Frigessi, A.: On the simplified pair-copula construction—simply useful or too simplistic? J. Multivar. Anal. 101(5), 1296–1310 (2010)
Hershey, J.R., Olsen, P.A.: Approximating the Kullback Leibler divergence between Gaussian mixture models. In: IEEE International Conference on Acoustics, Speech and Signal Processing, 2007. ICASSP 2007, vol. 4, pp. IV–317. IEEE (2007)
Jeffreys, H.: An invariant form for the prior probability in estimation problems. In: Proceedings of the Royal Society of London A: Mathematical, Physical and Engineering Sciences, vol. 186, pp. 453–461. The Royal Society (1946)
Joe, H.: Multivariate Models and Multivariate Dependence Concepts. CRC Press, Boca Raton (1997)
Joe, H.: Generating random correlation matrices based on partial correlations. J. Multivar. Anal. 97(10), 2177–2189 (2006)
Joe, H.: Dependence Modeling with Copulas. CRC Press, Boca Raton (2014)
Killiches, M., Czado, C.: Block-maxima of vines. In: Dey, D., Yan, J. (eds.) Extreme Value Modelling and Risk Analysis: Methods and Applications, pp. 109–130. CRC Press, Boca Raton (2015)
Killiches, M., Kraus, D., Czado, C.: Examination and visualisation of the simplifying assumption for vine copulas in three dimensions. Aust. N. Z. J. Stat. (2016). doi:10.1111/anzs.12182
Kraus, D., Czado, C.: D-vine copula based quantile regression. Comput. Stat. Data Anal. 110C, 1–18 (2017)
Kullback, S., Leibler, R.A.: On information and sufficiency. Ann. Math. Stat. 22(1), 79–86 (1951)
Loaiza Maya, R.A., Gómez-González, J.E., Melo Velandia, L.F.: Latin American exchange rate dependencies: a regular vine copula approach. Contemp. Econ. Policy 33(3), 535–549 (2015)
McKay, M.D., Beckman, R.J., Conover, W.J.: Comparison of three methods for selecting values of input variables in the analysis of output from a computer code. Technometrics 21, 239–245 (1979)
Morales-Nápoles, O.: Counting vines. In: Kurowicka, D., Joe, H. (eds.) Dependence Modeling: Vine Copula Handbook. World Scientific Publishing Co, Singapore (2011)
Nagler, T., Czado, C.: Evading the curse of dimensionality in nonparametric density estimation with simplified vine copulas. J. Multivar. Anal. 151, 69–89 (2016)
Nelsen, R.: An Introduction to Copulas, 2nd edn. Springer Science+Business Media, New York (2006)
Panagiotelis, A., Czado, C., Joe, H.: Pair copula constructions for multivariate discrete data. J. Am. Stat. Assoc. 107(499), 1063–1072 (2012)
R Core Team: R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna (2017)
Rosenblatt, M.: Remarks on a multivariate transformation. Ann. Math. Stat. 23(3), 470–472 (1952)
Schepsmeier, U.: Efficient information based goodness-of-fit tests for vine copula models with fixed margins. J. Multivar. Anal. 138, 34–52 (2015)
Schepsmeier, U., Stoeber, J., Brechmann, E.C., Graeler, B., Nagler, T., Erhardt, T.: VineCopula: statistical inference of vine copulas. R package version 2.1.1 (2017)
Sklar, A.: Fonctions de répartition à n dimensions et leurs marges. Publ. Inst. Stat. Univ. Paris 8, 229–231 (1959)
Stöber, J., Czado, C.: Pair copula constructions. In: Mai, J.-F., Scherer, M. (eds.) Simulating Copulas: Stochastic Models, Sampling Algorithms, and Applications. World Scientific, Singapore (2012)
Stöber, J., Joe, H., Czado, C.: Simplified pair copula constructions—limitations and extensions. J. Multivar. Anal. 119, 101–118 (2013)
Tsamardinos, I., Brown, L.E., Aliferis, C.F.: The max–min hill-climbing Bayesian network structure learning algorithm. Mach. Learn. 65(1), 31–78 (2006)
Acknowledgements
The authors would like to thank the editor and an anonymous referee for their constructive comments and suggestions, which helped to improve the quality of the paper. The first author acknowledges financial support by a research stipend of the Technische Universität München. The third author is supported by the German Research Foundation (DFG Grant CZ 86/4-1). Numerical calculations were performed on a Linux cluster supported by DFG Grant INST 95/919-1 FUGG.
Appendices
Appendix 1: Proof of Proposition 1
From Eq. (2.3) we know that the vine copula density can be written as a product over the pair-copula expressions corresponding to the matrix entries. In Property 2.8 (ii), Dißmann et al. (2013) state that deleting the first row and column from a d-dimensional structure matrix yields a \((d-1)\)-dimensional trimmed structure matrix. Due to Property 2 from Definition 1, the entry \(m_{1,1}=1\) does not appear in the remaining matrix. Hence, we obtain the density \(c_{2:d}\) by taking the product over all pair-copula expressions corresponding to the entries in the trimmed matrix. Iterating this argument shows that the entries of the matrix \(M_k:=(m_{i,j})_{i,j=k+1}^d\), obtained by deleting the first k rows and columns from M, represent the density \(c_{(k+1):d}\). In general, we have
$$c_{j|(j+1):d}\left( u_j\,|\,{\mathbf {u}}_{(j+1):d}\right) =\frac{c_{j:d}\left( {\mathbf {u}}_{j:d}\right) }{c_{(j+1):d}\left( {\mathbf {u}}_{(j+1):d}\right) }.$$
The numerator and denominator can be obtained as the product over all pair-copula expressions corresponding to the entries of \(M_{j-1}\) and \(M_j\). Thus, \(c_{j|(j+1):d}\) is simply the product over the expressions corresponding to the entries from the first column of \(M_{j-1}\). This proves Eq. (2.4).
Appendix 2: Proof of Proposition 2
We will prove an even more general version of Proposition 2 that holds for arbitrary densities f and g:
$$\mathrm {KL}(f,g)=\sum _{j=1}^{d}{\mathbb {E}}_{{\mathbf {X}}_{(j+1):d}}\left[ \mathrm {KL}\left( f_{j|(j+1):d}\left( \cdot \,|\,{\mathbf {X}}_{(j+1):d}\right) ,\, g_{j|(j+1):d}\left( \cdot \,|\,{\mathbf {X}}_{(j+1):d}\right) \right) \right] ,$$
where \({\mathbf {X}}_{(j+1):d}\sim f_{(j+1):d}\) and \({(d+1)\!:\!d}:=\emptyset \). Proposition 2 then follows directly from this statement.
Recall that, using recursive conditioning, the density f can be decomposed as
$$f({\mathbf {x}})=\prod _{j=1}^{d} f_{j|(j+1):d}\left( x_j\,|\,{\mathbf {x}}_{(j+1):d}\right) ,$$
where conditioning on the empty set \({(d+1)\!:\!d}=\emptyset \) means taking the marginal density \(f_d\).
Thus, the Kullback–Leibler distance between f and g can be written in the following way:
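The decomposition proved here can be sanity-checked in a setting where every term is available in closed form. The following sketch (Python, standard library only; the bivariate Gaussian setting and all function names are our illustrative choice, not part of the paper) compares the direct Kullback–Leibler distance between two zero-mean bivariate Gaussian densities with the sum of the marginal KL term and the expected conditional KL term:

```python
import math

def kl_gauss2d(rho_f, rho_g):
    """Closed-form KL(f, g) for two zero-mean bivariate normal densities
    with unit variances and correlations rho_f and rho_g."""
    det_f, det_g = 1.0 - rho_f**2, 1.0 - rho_g**2
    trace = (2.0 - 2.0 * rho_f * rho_g) / det_g  # tr(Sigma_g^{-1} Sigma_f)
    return 0.5 * (trace - 2.0 + math.log(det_g / det_f))

def kl_gauss2d_chain(rho_f, rho_g):
    """The same distance via the decomposition: KL of the X2-margins plus
    the expectation (over X2 ~ f_2) of the KL between the conditional
    densities of X1 given X2."""
    # Both X2-margins are standard normal, so the marginal KL term is 0.
    var_f, var_g = 1.0 - rho_f**2, 1.0 - rho_g**2
    # X1 | X2 = x is N(rho * x, 1 - rho^2) under either model; taking the
    # expectation of the univariate Gaussian KL over x ~ N(0, 1) replaces
    # the squared mean difference by (rho_f - rho_g)^2.
    return (0.5 * math.log(var_g / var_f)
            + (var_f + (rho_f - rho_g)**2) / (2.0 * var_g)
            - 0.5)
```

Both routes agree exactly here because all conditional expectations have closed forms; for vine copulas the conditional densities are equally tractable, but the outer expectation must be approximated, which is what motivates the grid methods developed in the paper.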
Appendix 3: Number of vines with the same diagonal
Proposition 4
Let \(\varvec{\sigma }=\left( \sigma _1,\ldots ,\sigma _d\right) '\) be a permutation of \({1\!:\!d}\). Then, there exist \(2^{\left( {\begin{array}{c}d-2\\ 2\end{array}}\right) +d-2}\) different vine decompositions whose structure matrix has the diagonal \(\varvec{\sigma }\).
Proof
The number of vine decompositions whose structure matrix has the same diagonal \(\varvec{\sigma }\) can be calculated as the quotient of the number of valid structure matrices and the number of possible diagonals. Morales-Nápoles (2011) shows that there are \(\frac{d!}{2}\cdot 2^{\left( {\begin{array}{c}d-2\\ 2\end{array}}\right) }\) different vine decompositions. In each of the \(d-1\) steps of the algorithm for encoding a vine decomposition in a structure matrix (see Stöber and Czado 2012) we have two possible choices, so there are \(2^{d-1}\) structure matrices representing the same vine decomposition. Hence, there are in total \(\frac{d!}{2}\cdot 2^{\left( {\begin{array}{c}d-2\\ 2\end{array}}\right) }\cdot 2^{d-1}\) valid structure matrices. Further, there are d! different diagonals. Thus, for a fixed diagonal \(\varvec{\sigma }\) there exist
$$\frac{\frac{d!}{2}\cdot 2^{\left( {\begin{array}{c}d-2\\ 2\end{array}}\right) }\cdot 2^{d-1}}{d!}=2^{\left( {\begin{array}{c}d-2\\ 2\end{array}}\right) +d-2}$$
vine decompositions with this diagonal.
\(\square \)
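To make the counting argument concrete, here is a small helper (Python; the function names are ours) that reproduces the quotient from the proof:

```python
from math import comb, factorial

def vines_per_diagonal(d):
    """Number of vine decompositions whose structure matrix has a fixed
    diagonal: 2^(binom(d-2, 2) + d - 2), as stated in Proposition 4."""
    return 2 ** (comb(d - 2, 2) + d - 2)

def valid_structure_matrices(d):
    """d!/2 * 2^binom(d-2, 2) vine decompositions, each of which is
    encoded by 2^(d-1) different structure matrices."""
    return (factorial(d) // 2) * 2 ** comb(d - 2, 2) * 2 ** (d - 1)
```

Dividing `valid_structure_matrices(d)` by the `d!` possible diagonals recovers `vines_per_diagonal(d)`; already for d = 5 each diagonal admits 64 vine decompositions.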
Appendix 4: Proof of Proposition 3
Let \(\varepsilon >0\) and \(n\in {\mathbb {N}}\). To simplify notation, for \(j=1,\ldots ,d-1\) we define
Then, by definition
Since \(W_j\) is a discretization of \([\varepsilon ,1-\varepsilon ]^{d-j}\) with mesh going to zero for \(n\rightarrow \infty \), we have
Substituting \({\mathbf {w}}_{(j+1):d}=T^{-1}_{c_{(j+1):d}^f}\left( {\mathbf {u}}_{(j+1):d}\right) \) yields
since
with (upper triangular) Jacobian matrix
such that \(\mathop {}\!\mathrm {d}{\mathbf {w}}_{(j+1):d}= \det (J) \mathop {}\!\mathrm {d}{\mathbf {u}}_{(j+1):d}=c_{(j+1):d}^f\left( {\mathbf {u}}_{(j+1):d}\right) \mathop {}\!\mathrm {d}{\mathbf {u}}_{(j+1):d}\). Since we are only interested in the determinant of J, which is upper triangular with zeros below the diagonal, the determinant equals the product of the diagonal entries and the values above the diagonal are irrelevant here. Finally, using the fact that
we obtain
Appendix 5: Regarding Remark 2
1.1 Limit of the dKL
Let \(\varepsilon >0\) and \(n\in {\mathbb {N}}\). Again, for \(j=1,\ldots ,d-1\) we define
The contribution of \({\mathscr {D}}^u_{j,k}, j=1,\ldots ,d-1\), \(k{=}1,\ldots ,2^{d-j-1}\), to the dKL is given by
where \(\varvec{\omega }(t)={\mathbf {r}}+t {\mathbf {v}}({\mathbf {r}})\) with \({\mathbf {v}}(\cdot )\) as defined in Definition 4, \({\mathbf {r}}\in \left\{ 0,1\right\} ^{d-j}\) being a corner point of \({\mathscr {D}}^w_{j,k}\) and \(t_i=\varepsilon +(i-1)\frac{1-2\varepsilon }{n-1}\) for \(i=1,\ldots ,n\). Letting \(n\rightarrow \infty \) yields
Now, we further let \(\varepsilon \rightarrow 0\) and use the fact that \(\left\| {\dot{\varvec{\omega }}}(t)\right\| =\sqrt{d-j}\) to obtain
where we used the substitution \({\mathbf {u}}_{(j+1):d}:=T^{-1}_{c_{(j+1):d}^f}({\mathbf {w}}_{(j+1):d})\), \(\mathop {}\!\mathrm {d}{\mathbf {w}}_{(j+1):d}=c_{(j+1):d}^f({\mathbf {u}}_{(j+1):d})\mathop {}\!\mathrm {d}{\mathbf {u}}_{(j+1):d}\) (cf. Appendix 4) in the last line.
1.2 Tail transformation
In our empirical applications of the dKL, we have noticed that different vines tend to differ most in the tails of the distribution. Therefore, we increase the concentration of evaluation points in the tails of the diagonal by transforming the points \(t_i\), \(i=1,\ldots ,n\), via a suitable function \(\varPsi \). Hence, by substituting \(t=\varPsi (s)\) in Eq. (4.1) we obtain
We use its discrete pendant
where \(s_i=\varPsi ^{-1}(\varepsilon )+(i-1)\frac{\varPsi ^{-1}(1-\varepsilon )-\varPsi ^{-1}(\varepsilon )}{n-1}\) for \(i=1,\ldots ,n\). Regarding the choice of \(\varPsi \), all results in this paper are obtained using
with shape parameter \(a>0\), where \(\varPhi \) is the standard normal distribution function. Figure 5 shows the graph of \(\varPsi _a\) for different values of a. We see that larger values of a imply more points being transformed into the tails. Having tested different values for a, we found that \(a=4\) yields the best overall results. Therefore, we consistently use \(a=4\).
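Since the exact form of \(\varPsi _a\) is not reproduced here, the following sketch (Python, standard library only) illustrates the mechanism with \(\varPsi =\varPhi \), the standard normal distribution function, as a hypothetical stand-in: pushing an equidistant grid through a function that flattens out in the tails concentrates the evaluation points near 0 and 1.

```python
import math

def std_normal_cdf(x):
    """Phi(x) via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def std_normal_quantile(p):
    """Phi^{-1}(p) by bisection (accurate enough for grid construction)."""
    lo, hi = -10.0, 10.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if std_normal_cdf(mid) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def tail_grid(n, eps, psi=std_normal_cdf, psi_inv=std_normal_quantile):
    """Evaluation points t_i = Psi(s_i), where s_1, ..., s_n are
    equidistant on [Psi^{-1}(eps), Psi^{-1}(1 - eps)].  Psi = Phi is a
    hypothetical stand-in for the paper's parameterized Psi_a."""
    lo, hi = psi_inv(eps), psi_inv(1.0 - eps)
    return [psi(lo + i * (hi - lo) / (n - 1)) for i in range(n)]
```

Compared with an equidistant grid on \([\varepsilon ,1-\varepsilon ]\), the spacing of `tail_grid` shrinks toward the boundary; this is the effect that the shape parameter a amplifies in \(\varPsi _a\).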
Appendix 6: Finding the diagonal with the highest weight
1.1 Procedure 1: Finding a starting value
The idea behind the following heuristic is that a diagonal has a higher weight if its points have high probability implied by the copula density. Hence, the diagonal should reflect the dependence structure of the variables. The unconditional dependence in a vine captures most of the total dependence and is easy to interpret. For example, if \(U_i\) and \(U_j\) are positively dependent (i.e., \(\tau _{i,j}>0\)) and \(U_j\) and \(U_k\) are negatively dependent (i.e., \(\tau _{j,k}<0\)), then it seems plausible that \(U_i\) and \(U_k\) are negatively dependent. This concept can be extended to arbitrary dimensions.
1. Take each variable to be a node in an empty graph.
2. Consider the last row of the structure matrix, which encodes the unconditional pair-copulas. Connect two nodes by an edge if the dependence of the corresponding variables is described by one of those copulas.
3. Assign a "+" to node 1.
4. As long as not all nodes have been assigned a sign, repeat:
   (a) For each node that has been assigned a sign in the previous step, consider its neighborhood.
   (b) If the root node has a "+," assign to the neighbor node the sign of the Kendall's \(\tau \) of the pair-copula connecting the root and neighbor node; otherwise, assign the opposite sign.
5. The resulting direction vector \({\mathbf {v}}\in \left\{ -1,1\right\} ^d\) has entries \(v_i\) which are 1 or \(-1\) if node i has been assigned a "\(+\)" or a "−," respectively.
Note that if we had assigned a "−" to node 1 in step 3, we would have ended up with \(-{\mathbf {v}}\) instead of \({\mathbf {v}}\), implying the same diagonal.
To illustrate the procedure from above, we consider a nine-dimensional example: Let \({\mathscr {R}}\) be a vine copula with density c, where the following (unconditional) pair-copulas are specified:
Now, we take an empty graph with nodes 1 to 9 and add an edge (i, j) if \(c_{i,j}\) is specified in Table 9. The result is a tree on the nodes 1 to 9 (see Fig. 6). We assign a "\(+\)" to node 1 and, since there are still nodes without a sign, consider its neighborhood \(\left\{ 2,3\right\} \). Since \(\tau _{1,2}<0\) and the root node 1 has been assigned a "\(+\)", node 2 gets a "−". Node 3 is assigned a "\(+\)". Next, we repeat this procedure for the neighborhoods of nodes 2 and 3. Iterating in this way until all nodes have been assigned a "\(+\)" or a "−", we obtain the assignment shown in Fig. 6. The resulting direction vector is given by \({\mathbf {v}}=(1,-1,1,1,-1,-1,-1,1,-1)'\).
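The sign propagation of Procedure 1 amounts to a breadth-first sweep over the tree of unconditional pairs. A minimal sketch (Python; the edge encoding and the five-dimensional test tree below are hypothetical, not the example from Table 9 and Fig. 6):

```python
from collections import deque

def assign_signs(d, tau_signs):
    """Direction vector v in {-1, 1}^d from the signs of the Kendall's
    taus of the unconditional pair-copulas.  `tau_signs` maps an edge
    (i, j) of the first tree to sign(tau_{i,j}) in {-1, +1}."""
    adj = {i: [] for i in range(1, d + 1)}
    for (i, j), s in tau_signs.items():
        adj[i].append((j, s))
        adj[j].append((i, s))
    signs = {1: 1}                 # step 3: node 1 gets a "+"
    queue = deque([1])
    while queue:                   # step 4: propagate outwards
        root = queue.popleft()
        for neighbor, s in adj[root]:
            if neighbor not in signs:
                # same sign as tau if the root is "+", else the opposite
                signs[neighbor] = signs[root] * s
                queue.append(neighbor)
    return [signs[i] for i in range(1, d + 1)]
```

For a hypothetical tree with edges (1, 2), (1, 3), (3, 4), (2, 5) and tau signs −, +, −, +, the sweep returns \({\mathbf {v}}=(1,-1,1,-1,-1)'\); starting with a "−" at node 1 would flip all entries and yield the same diagonal.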
1.2 Procedure 2: Local search for better candidates
Having found a diagonal through Procedure 1 (Appendix "Procedure 1: Finding a starting value"), we additionally perform the following steps to check whether there is a diagonal with even higher weight in the "neighborhood" of \({\mathbf {v}}\).
1. Consider a candidate diagonal vector \({\mathbf {v}}\in \left\{ 1,-1\right\} ^d\) with corresponding weight \(\lambda _c^{(0)}\).
2. For \(j=1,\ldots ,d\), calculate the weight \(\lambda _c^{(j)}\) corresponding to \({\mathbf {v}}_j\in \left\{ 1,-1\right\} ^d\), where \({\mathbf {v}}_j\) is equal to \({\mathbf {v}}\) with the sign of the jth entry reversed.
3. If \(\max _i \lambda _c^{(i)}>\lambda _c^{(0)}\), take \({\mathbf {v}}:={\mathbf {v}}_k\) with \(k={{\mathrm{arg max}}}_i \lambda _c^{(i)}\) as the new candidate for the (local) maximum.
4. Repeat steps 1–3 until a (local) maximum is found, i.e., until \(\max _i \lambda _c^{(i)}\le \lambda _c^{(0)}\).
Although there is no guarantee that this procedure finds the global maximum of the diagonal weights, it always terminates in a local maximum. Starting from the plausible choice of \({\mathbf {v}}\) produced by Procedure 1, it is highly likely that we end up with the "right" diagonal.
In step 2, the weights of numerous diagonals have to be calculated. For a fast determination of these weights, it is reasonable to approximate the integral in Eq. (3.13) by
where \(0<t_1<t_2<\ldots<t_n<1\) is an equidistant discretization of [0, 1].
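The flip-one-entry search of steps 1–4 is a standard greedy hill climb over \(\{-1,1\}^d\). A sketch (Python; `weight` stands in for the grid-approximated diagonal weight \(\lambda _c\), and the toy weight in the test is our own construction):

```python
def local_search(v, weight):
    """Steps 1-4: repeatedly move to the best one-flip neighbour until
    no flip increases the weight; returns a local maximum."""
    best = weight(v)                                    # lambda_c^(0)
    while True:
        # step 2: weights of all d one-flip neighbours
        neighbours = [v[:j] + [-v[j]] + v[j + 1:] for j in range(len(v))]
        weights = [weight(u) for u in neighbours]
        w_max = max(weights)
        if w_max <= best:                               # step 4: stop
            return v, best
        # step 3: the argmax neighbour becomes the new candidate
        v, best = neighbours[weights.index(w_max)], w_max
```

With a toy weight such as \(|{\mathbf {v}}'{\mathbf {t}}|\) for a hidden target \({\mathbf {t}}\), the search climbs to \(\pm {\mathbf {t}}\) within a few sweeps; with the diagonal weight it behaves analogously but, as noted above, offers no global-optimality guarantee.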
Cite this article
Killiches, M., Kraus, D. & Czado, C. Model distances for vine copulas in high dimensions. Stat Comput 28, 323–341 (2018). https://doi.org/10.1007/s11222-017-9733-y