Abstract
The present work addresses the question of how sampling algorithms for commonly applied copula models can be adapted to account for quasi-random numbers. Besides sampling methods such as the conditional distribution method (based on a one-to-one transformation), it is also shown that typically faster sampling methods (based on stochastic representations) can be used to improve upon classical Monte Carlo methods when pseudo-random number generators are replaced by quasi-random number generators. This opens the door to quasi-random numbers for models well beyond independent margins or the multivariate normal distribution. Detailed examples (in the context of finance and insurance), illustrations and simulations are given, and software has been developed and provided in the R packages copula and qrng.
References
Aistleitner, C., Dick, J.: Functions of bounded variation, signed measures, and a general Koksma-Hlawka inequality. Acta Arith. 167, 143–171 (2015)
Caflisch, R.: Monte Carlo and quasi-Monte Carlo methods. Acta Numer. 7, 1–49 (1998)
Cambanis, S., Huang, S., Simons, G.: On the theory of elliptically contoured distributions. J. Multivar. Anal. 11(3), 368–385 (1981)
Constantine, G., Savits, T.: A multivariate Faà di Bruno formula with applications. Trans. Am. Math. Soc. 348(2), 503–520 (1996)
Cranley, R., Patterson, T.N.L.: Randomization of number theoretic methods for multiple integration. SIAM J. Numer. Anal. 13(6), 904–914 (1976)
Dick, J., Pillichshammer, F.: Digital Nets and Sequences: Discrepancy Theory and Quasi-Monte Carlo Integration. Cambridge University Press, Cambridge (2010)
Embrechts, P., McNeil, A.J., Straumann, D.: Correlation and dependence in risk management: properties and pitfalls. Risk Management: Value at Risk and Beyond. Cambridge University Press, Cambridge (2002)
Embrechts, P., Lindskog, F., McNeil, A.J.: Modelling dependence with copulas and applications to risk management. In: Rachev, S. (ed.) Handbook of Heavy Tailed Distributions in Finance, pp. 329–384. Elsevier, Boston (2003)
Fang, K.T., Kotz, S., Ng, K.W.: Symmetric Multivariate and Related Distributions. Chapman & Hall, Boca Raton (1990)
Faure, H.: Discrépance des suites associées à un système de numération (en dimension \(s\)). Acta Arith. 41, 337–351 (1982)
Faure, H., Lemieux, C.: Generalized Halton sequence in 2008: a comparative study. ACM Trans. Model. Comput. Simul. 19, 15 (2009)
Glasserman, P.: Monte Carlo Methods in Financial Engineering. Springer, New York (2004)
Halton, J.H.: On the efficiency of certain quasi-random sequences of points in evaluating multi-dimensional integrals. Numer. Math. 2, 84–90 (1960)
Hartinger, J., Kainhofer, R., Tichy, R.: Quasi-Monte Carlo algorithms for unbounded, weighted integration problems. J. Complex. 20, 654–668 (2004)
Hlawka, E.: Über die Diskrepanz mehrdimensionaler Folgen mod 1. Math. Z. 77, 273–284 (1961)
Hlawka, E., Mück, R.: Über eine Transformation von gleichverteilten Folgen II. Computing 9(2), 127–138 (1972)
Hofert, M.: Sampling nested Archimedean copulas with applications to CDO pricing. PhD thesis, University of Ulm (2010)
Hofert, M., Mächler, M., McNeil, A.J.: Likelihood inference for Archimedean copulas in high dimensions under known margins. J. Multivar. Anal. 110, 133–150 (2012)
Hofert, M., Mächler, M., McNeil, A.J.: Archimedean copulas in high dimensions: estimators and numerical challenges motivated by financial applications. Journal de la Société Française de Statistique 154(1), 25–63 (2013)
Hong, H.S., Hickernell, F.J.: Algorithm 823: implementing scrambled digital sequences. ACM Trans. Math. Softw. 29, 95–109 (2003)
Jaworski, P., Durante, F., Härdle, W.K., Rychlik, T.: Copula Theory and Its Applications. Lecture Notes in Statistics—Proceedings. Springer, Heidelberg (2010)
Joe, H.: Dependence Modeling with Copulas. Chapman & Hall, Boca Raton (2014)
Kurowicka, D., Cooke, R.M.: Sampling algorithms for generating joint uniform distributions using the vine-copula method. Comput. Stat. Data Anal. 51, 2889–2906 (2007)
Lemieux, C.: Monte Carlo and Quasi-Monte Carlo Sampling. Springer Series in Statistics. Springer, New York (2009)
Marshall, A.W., Olkin, I.: Families of multivariate distributions. J. Am. Stat. Assoc. 83(403), 834–841 (1988)
Matoušek, J.: On the \({L_2}\)-discrepancy for anchored boxes. J. Complex. 14, 527–556 (1998)
McNeil, A., Frey, R., Embrechts, P.: Quantitative Risk Management: Concepts, Techniques, Tools. Princeton University Press, Princeton (2005)
McNeil, A.J., Nešlehová, J.: Multivariate Archimedean copulas, \(d\)-monotone functions and \(l_{1}\)-norm symmetric distributions. Ann. Stat. 37(5b), 3059–3097 (2009)
Morokoff, W.: Generating quasi-random paths for stochastic processes. SIAM Rev. 40(4), 765–788 (1998)
Morokoff, W., Caflisch, R.: Quasi-random sequences and their discrepancies. SIAM J. Sci. Comput. 15(6), 1251–1279 (1994)
Nelsen, R.: An Introduction to Copulas, 2nd edn. Springer, New York (2006)
Niederreiter, H.: Point sets and sequences with small discrepancy. Monatshefte für Mathematik 104, 273–337 (1987)
Niederreiter, H.: Random number generation and quasi-Monte Carlo methods. Society for Industrial and Applied Mathematics, Philadelphia (1992)
Nolan, J.: Stable distributions—models for heavy tailed data. http://academic2.american.edu/jpnolan/stable/chap1 (2014)
Owen, A.B.: Randomly permuted \((t, m, s)\)-nets and \((t, s)\)-sequences. In: Niederreiter, H., Shiue, P.J.S. (eds.) Monte Carlo and Quasi-Monte Carlo Methods in Scientific Computing. Lecture Notes in Statistics, vol. 106, pp. 299–317. Springer, New York (1995)
Owen, A.B.: Monte Carlo variance of scrambled equidistribution quadrature. SIAM J. Numer. Anal. 34(5), 1884–1910 (1997a)
Owen, A.B.: Scrambled net variance for integrals of smooth functions. Ann. Stat. 25(4), 1541–1562 (1997b)
Owen, A.B.: Variance and discrepancy with alternative scramblings. ACM Trans. Model. Comput. Simul. 13, 363–378 (2003)
Owen, A.B.: Multidimensional variation for quasi-Monte Carlo. In: Fan, J., Li, G. (eds.) International Conference on Statistics in honour of Professor Kai-Tai Fang’s 65th birthday, pp. 49–74. World Scientific Publications, Hackensack, NJ (2005)
Pillards, T., Cools, R.: Using Box-Muller with low discrepancy points. In: ICCSA 2006, Lecture Notes in Computer Science, vol. 3984, pp. 780–788. Springer, Berlin (2006)
Schmitz, V.: Copulas and stochastic processes. PhD thesis, Institute of Statistics, Aachen University (2003)
Sobol’, I.M.: On the distribution of points in a cube and the approximate evaluation of integrals. USSR Comput. Math. Math. Phys. 7, 86–112 (1967)
Sobol’, I.M.: Calculation of improper integrals using uniformly distributed sequences. Sov. Math. Dokl. 14, 734–738 (1973)
Tasche, D.: Capital allocation to business units and sub-portfolios: the Euler principle. In: Resti, A. (ed.) Pillar II in the New Basel Accord: The Challenge of Economic Capital, pp. 423–453. Risk Books, London (2008)
Wu, F., Valdez, E.A., Sherris, M.: Simulating exchangeable multivariate Archimedean copulas and its applications. Commun. Stat. 36(5), 1019–1034 (2006)
Acknowledgments
We thank the Associate Editor and the two anonymous reviewers for their helpful comments, which helped us improve this paper. The first author wishes to thank SCOR for their financial support. The second and third authors acknowledge the support of NSERC through grants #5010 and #238959, respectively.
Appendix
For all the randomization schemes mentioned in Sect. 2, in addition to having a simple method to estimate the variance of the corresponding RQMC estimator, results giving exact expressions for this variance are also known and typically rely on a well-chosen series expansion of the function \(\varPsi \) of interest. The following result recalls this variance expression in the case of a randomly digitally shifted net; see Lemieux (2009) for a detailed proof. This result is used in the proof of Proposition 6 in Sect. 4.1.
Theorem 3
(Variance for randomly digitally shifted nets) Let \(\tilde{P}_n = \{\tilde{\varvec{v}}_1,\ldots ,\tilde{\varvec{v}}_n\}\) be a randomly digitally shifted net in base b with corresponding RQMC estimator \(\widehat{\mu }_n\) given by
\[ \widehat{\mu }_n = \frac{1}{n} \sum _{i=1}^n \varPsi (\tilde{\varvec{v}}_i), \]
and assume \(\mathrm{Var}(\varPsi (\varvec{U})) < \infty \) for \(\varvec{U} \sim U[0,1)^d\). Then we have that
\[ \mathrm{Var}(\widehat{\mu }_n) = \sum _{\varvec{0} \ne \varvec{h} \in \mathscr {L}_d^*} |\widehat{\varPsi }(\varvec{h})|^2, \]
where \(\widehat{\varPsi }(\varvec{h})\) is the Walsh coefficient of \(\varPsi \) at \(\varvec{h}\), given by
\[ \widehat{\varPsi }(\varvec{h}) = \int _{[0,1)^d} \varPsi (\varvec{u}) e^{-2\pi \mathrm{i} \langle \varvec{h}, \varvec{u} \rangle _b} \, \mathrm{d}\varvec{u}, \]
where \(\langle \varvec{h}, \varvec{u} \rangle _b = \frac{1}{b} \sum _{j=1}^d \sum _{l=0}^{\infty } h_{j,l}u_{j,l+1}\) with \(h_{j,l}\) and \(u_{j,l}\) obtained from the base b expansion of \(h_j\) and \(u_j\), respectively, and \(\mathscr {L}_d^*=\{\varvec{h} \in \mathbb {Z}^d: \langle \varvec{h}, \varvec{v}_i \rangle _b \in \mathbb {Z}, \forall i=1,\ldots ,n \}\) is the dual net of the deterministic net \(P_n = \{\varvec{v}_i,i=1,\ldots ,n\}\) that has been shifted to get \(\tilde{P}_n\).
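The objects in Theorem 3 can be illustrated with a minimal Python sketch. It is not the authors' implementation (their software is in the R packages copula and qrng). Everything specific here is an assumption made for illustration: the two-dimensional Hammersley point set as the base-2 net, the truncation to 32 binary digits, the test function \(\varPsi (\varvec{u}) = u_1u_2\), and all function names. The sketch applies independent random digital shifts (digit-wise addition modulo 2) to the net and estimates the variance of \(\widehat{\mu }_n\) empirically from the replications, rather than through the Walsh-coefficient expression.

```python
import random
import statistics

BITS = 32  # number of binary digits kept per coordinate (illustrative choice)


def radical_inverse_bits(i, bits=BITS):
    """Base-2 radical inverse of i, returned as an integer holding `bits` digits."""
    r = 0
    for k in range(bits):
        # digit k of i becomes digit k+1 after the binary point
        r = (r << 1) | ((i >> k) & 1)
    return r


def hammersley_net(m, bits=BITS):
    """2-d Hammersley point set with n = 2**m points, stored as integer digit strings.

    The points (i/n, radical inverse of i) form a (0, m, 2)-net in base 2.
    """
    n = 1 << m
    return [(i << (bits - m), radical_inverse_bits(i, bits)) for i in range(n)]


def digital_shift(points, rng, bits=BITS):
    """Apply one random digital shift in base 2 to all points.

    One random digit string is drawn per coordinate and added digit-wise
    modulo 2 (XOR) to that coordinate of every point.
    """
    shifts = [rng.getrandbits(bits) for _ in range(len(points[0]))]
    scale = float(1 << bits)
    return [tuple((x ^ s) / scale for x, s in zip(p, shifts)) for p in points]


def rqmc_estimates(psi, m, replications, seed=1):
    """Return one RQMC estimate of E[psi(U)] per independent digital shift."""
    rng = random.Random(seed)
    net = hammersley_net(m)
    estimates = []
    for _ in range(replications):
        shifted = digital_shift(net, rng)
        estimates.append(sum(psi(u) for u in shifted) / len(shifted))
    return estimates


if __name__ == "__main__":
    est = rqmc_estimates(lambda u: u[0] * u[1], m=10, replications=20)
    print("mean of RQMC estimates:", statistics.mean(est))
    print("sample variance of the estimator across shifts:", statistics.variance(est))
```

The sample variance across shifts is the simple variance estimate mentioned at the start of this appendix; the theorem gives the exact counterpart in terms of Walsh coefficients on the dual net.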
1.1 Proofs
Proof (Proof of Proposition 4)
We start by providing more details on the expression (21), which is given by:
where \(\varvec{\beta } \in \mathbb {N}_0^d, |\varvec{\beta }| = \sum _{j=1}^d \beta _j\), and the set \( p_s(\varvec{\beta },\varvec{\alpha })\) includes pairs \((\varvec{k},\varvec{\gamma })\) such that \(\varvec{k}\) is an s-dimensional vector \(\varvec{k}=(k_1,\ldots ,k_s)\) where each \(k_j\in \{1,\ldots ,d\}\), and \(\varvec{\gamma }\) is an sl-dimensional vector \(\varvec{\gamma }=(\varvec{\gamma }_1,\ldots ,\varvec{\gamma }_s)\) where each \(\varvec{\gamma }_j\) is an l-dimensional vector whose entries are either 0 or 1, and \(\sum _{j=1}^s {\gamma }_{j,i} = 1\) for \(i\in \{1,\ldots ,l\}\). Finally, the \(c_{\varvec{\gamma }}\) are constants, which are defined in detail in Constantine and Savits (1996), along with further information on the precise definition of \(p_s(\varvec{k},\varvec{\gamma })\). As mentioned in Sect. 4.1, a sufficient condition to show that \( \Vert \varPsi \circ \varPhi _C \Vert _{d,1} < \infty \) is to establish that all products of the form (22) are in \(L_1\), which we recall is given by
for \(s\in \{1,\ldots ,l\}\) and \((\varvec{k},\varvec{\gamma }) \in p_s(\varvec{\beta },\varvec{\alpha })\).
Recall also that for the MO algorithm, \(\phi _{C,l}\) is a function of \(v_1\) and \(v_{l+1}\) only, for \(l=1,\ldots ,d\). Hence the only partial derivatives of \(\phi _{C,l}\) that are nonzero are those with respect to variables in \(\{v_1,v_{l+1}\}\).
Since we assume that (23) holds, we only need to show that the product in (22) is in \(L_1\) under the conditions stated in the proposition. To this end, we first show that this is the case if the following bounds hold for the mixed partial derivatives of \(\phi _C\):
for all \(l \le d+1\),
We have three cases to consider.
Case 1 \(1 \notin I\). Then the product in (22) is given by
where we assumed w.l.o.g. that \(I = \{2,\ldots ,l+1\}, s=l\) and \(k_j=j+1\) for \(j\in \{1,\ldots ,s\}\). Since each term in the product depends on a distinct variable, the product is in \(L_1\) if (26) holds.
Case 2 \(1\in I\) and j such that \(\gamma _{j,1}=1\) has \(k_j+1 \notin I\). This case can be analyzed w.l.o.g. by assuming I is of the form \(I=\{1,\ldots ,r,r+2,\ldots ,l+1\}\) for some \(r\ge 1\). In that case, the products in (22) are of the form
and are thus in \(L_1\) as long as (28) holds.
Case 3 \(1\in I\) and j such that \(\gamma _{j,1}=1\) has \(k_j+1 \in I\). In this case, we can assume w.l.o.g. that \(I=\{1,\ldots ,l\}\) and therefore the products in (22) are of the form
and are thus in \(L_1\) as long as (27) holds.
The last part of the proof is to show that (26), (27), and (28) hold. First we study the partial derivatives involved in these expressions and find they are given by:
where \(x_1 = F^{-1}(v_1)\) and \(\frac{\partial x_1}{\partial v_1}= 1/f(x_1)\), where f is the pdf corresponding to F, which exists since we assumed F was continuous. Now, the partial derivatives with respect to either \(v_1\) or \(v_2\) are clearly non-negative for all \(v_1\) and \(v_2\). Hence it is easy to see that (26) and (28) hold, because we can remove the absolute value inside the integrals. These integrals then amount to taking differences/sums of \(\phi _{C,r}(\cdot ,\cdot )\) at different values over its domain, which yields a finite value since \(\phi _{C,r}(\cdot ,\cdot )\) always takes values in [0, 1].
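The sign statements above can be checked with a short chain-rule computation. This is a sketch under an assumption inferred from the surrounding text rather than copied from the displayed equations: the first component of the MO transformation is taken to be \(\phi _{C,1}(v_1,v_2) = \psi (t)\) with \(t = -\log (v_2)/x_1\) and \(x_1 = F^{-1}(v_1) > 0\), so that \(\partial t/\partial x_1 = -t/x_1\) and \(\partial t/\partial v_2 = -1/(v_2 x_1)\). Then
\[ \frac{\partial \phi _{C,1}}{\partial v_1} = -\frac{t\,\psi '(t)}{x_1 f(x_1)}, \qquad \frac{\partial \phi _{C,1}}{\partial v_2} = -\frac{\psi '(t)}{v_2 x_1}, \qquad \frac{\partial ^2 \phi _{C,1}}{\partial v_1 \partial v_2} = \frac{\psi '(t)+t\psi ''(t)}{v_2 x_1^2 f(x_1)}. \]
Since \(\psi \) is decreasing (\(\psi ' \le 0\)) and \(t \ge 0\), the two first-order derivatives are non-negative, and the sign of the mixed derivative is the sign of \(\psi '(t)+t\psi ''(t)\).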
As for the mixed partial derivative with respect to \(v_1\) and \(v_2\), our assumption on \(\psi '(t)+t\psi ''(t)\) implies we have at most one sign change over the domain of the integral. If there is no sign change, the argument used in the previous paragraph to handle (26) and (28) can be used to show (27) is bounded. If there is one sign change, then we let \(t^*\) be such that
\[ \psi '(t^*)+t^*\psi ''(t^*) = 0. \]
Then let q(v) be a function such that \(-\log q(v)/F^{-1}(v) = t^*\). For instance, one can verify that for the Clayton copula, \(q(v)=e^{-\theta F^{-1}(v)}\). When integrating the absolute value of the mixed partial derivative \(\partial ^2 \phi _{C,1}(v_1,v_2)/\partial v_1 \partial v_2\), we get
In most cases \(F^{-1}(1)\) is not finite, and thus we cannot prove that \(\varPsi \circ \phi _C\) has bounded variation. However, we can still obtain the upper bound on the error given in the result by using a technique initially developed by Sobol’ (1973) to handle improper integrals, and later used by Hartinger et al. (2004) to deal with unbounded integration problems taken w.r.t. a measure that is not necessarily uniform (as studied in Sect. 4.2). To apply their approach more easily, we make a small change: rather than generating V as \(F^{-1}(v_1)\), we use \(F^{-1}(1-v_1)\), so that in our study of the variation above (via the integral of the absolute value of the mixed partial derivatives), the boundedness condition fails at \(v_1=0\) instead of \(v_1=1\). Following the approach in Hartinger et al. (2004) (see their Equation (24)) and taking \(\varvec{c}=(1/pn,0,\ldots ,0)\), the integration error satisfies
where \(V_{[\varvec{c},\varvec{1}]}(\varPsi \circ \phi _C)\) denotes the variation of \(\varPsi \circ \phi _C\) over \([\varvec{c},\varvec{1}]\) and
since we assumed \(|\varPsi (\varvec{u})|\) was bounded. As for \(V_{[\varvec{c},\varvec{1}]}(\varPsi \circ \phi _C)\), we can infer from the steps that led to (29) that it is bounded by a constant times \(\log F^{-1}(1-1/pn) \le a \log n + \log c\) by assumption. Therefore there exists a constant \(K^{(d)}\) such that \(V_{[\varvec{c},\varvec{1}]}(\varPsi \circ \phi _C) \le K^{(d)} \log n\). \(\square \)
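The Clayton example mentioned in the proof can be verified directly. The following is a sketch assuming the usual Clayton generator \(\psi (t) = (1+t)^{-1/\theta }\) with \(\theta > 0\) (this parametrization is an assumption, as the text does not state it) and taking \(t^*\) to be the root of \(\psi '(t)+t\psi ''(t)\). We have
\[ \psi '(t) = -\frac{1}{\theta }(1+t)^{-1/\theta -1}, \qquad \psi ''(t) = \frac{1}{\theta }\Bigl (\frac{1}{\theta }+1\Bigr )(1+t)^{-1/\theta -2}, \]
so that
\[ \psi '(t)+t\psi ''(t) = \frac{1}{\theta }(1+t)^{-1/\theta -2}\Bigl (\frac{t}{\theta }-1\Bigr ). \]
This expression is negative for \(t < \theta \) and positive for \(t > \theta \), so there is exactly one sign change, at \(t^* = \theta \). Solving \(-\log q(v)/F^{-1}(v) = \theta \) for \(q(v)\) gives \(q(v) = e^{-\theta F^{-1}(v)}\), in agreement with the expression stated in the proof.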
Proof (Proof of Proposition 5)
Let \(p_l\) be such that \(P(V=l)=p_l\), for \(l \ge 1\). Let \(P_l = \sum _{k=1}^l p_k\) for \(l \ge 1\) and \(P_0=0\). We also let \(\phi _C^l(v_2,\ldots ,v_{d+1})\) \(= \phi _C(P_{l-1},v_2,\ldots ,v_{d+1})\) for \(l \ge 1\) (transformation \(\phi _C\) when \(v_1\) generates the value l for V). Consider a given value of n and low-discrepancy point set \(P_n\). If we use inversion to generate V, then we have that the subset \(P_n^l = \{\varvec{v}_i: P_{l-1} < v_{i,1} \le P_l\}\) will be used to produce copula samples with \(V=l\). Let \(\tilde{n}_l = |P_n^l|\) and \(n_l = np_l\). It is clear that if l becomes too large, then \(\tilde{n}_l\) will eventually be 0. Let L(n) be the largest value of l such that \(\tilde{n}_{l} >0\), and let \(\tilde{p}_l = \tilde{n}_l/n\). Then we can write
where A(n, d), B(n, d), and C(n, d) are bounds such that
First, by definition of \(D^*(P_n)\) we have \(|\tilde{n}_l - n_l| \le 2nD^*(P_n)\) and thus \(|\tilde{p}_l-p_l| \le 2D^*(P_n)\). Hence we can take \(C(n,d) = 2{\text {E}}(|\varPsi (\varvec{U})|)D^*(P_n)\). Similarly, we can show that \(\sum _{l=L(n)+1}^{\infty } p_l \le D^*(P_n)\) and can therefore take \(B(n,d) = {\text {E}}(|\varPsi (\varvec{U})|) D^*(P_n)\). The analysis of the expression to be bounded by A(n, d) is more complicated. First, we note that under the assumption we have on \(\varPsi \) and its partial derivatives, we need to show that the product in (22) is in \(L_1\), but where each \(\phi _{C,k_j}\) is replaced by \(\phi _{C,k_j}^l\) for a given l. Since \(\phi _{C,k_j}^l\) is solely a function of \(v_{k_j+1}\), the only relevant products to consider are of the form
in which each term is of the form \( -\psi ^{'} \left( \frac{-\log v_{k_j+1}}{l}\right) \frac{1}{l v_{k_j+1}} \) which is non-negative for any \(v_{k_j+1}\). Using a similar reasoning to the one used in the proof of Proposition 4 (to conclude that (26) and (28) hold), it is easy to see that (30) is in \(L_1\).
What remains to be done is to analyze the discrepancy of \(P_n^l\). That is, here we are looking for a bound on \(\sup _{\varvec{z} \in \mathscr {J}^*} |E(\varvec{z};P_n^l)|\), where we recall that \( \mathscr {J}^*\) is the set of intervals of \([0,1)^d\) of the form \(\varvec{z} = \prod _{j=1}^d [0,z_j)\), where \(0 < z_j \le 1\). So consider a given \(\varvec{z} \in [0,1)^d\). Then \(E(\varvec{z};P_n^l) = A(\varvec{z};P_n^l)/\tilde{n}_l - \lambda (\varvec{z})\). Let \(\varvec{z}_1 = (P_l,\varvec{z})\) and \(\varvec{z}_2 = (P_{l-1},\varvec{z})\), which are both in \([0,1)^{d+1}\). Note that \(A(\varvec{z}_1;P_n) - A(\varvec{z}_2;P_n) = A(\varvec{z};P_n^l)\). By definition of \(D^*(P_n)\), it is not hard to see that
and therefore
Using the fact that \(|\tilde{n}_l-n_l| \le 2nD^*(P_n)\), after some further simplifications we get that
Hence we can take \(A(n,d) = 4D^*(P_n)\frac{n}{\tilde{n}_l}\) and then get \(\sum _{l=1}^{L(n)} \tilde{p}_l A(n,d) \le 4 L(n)D^*(P_n)\). To show that the overall bound for the integration error is of the form \((\log n)D^*(P_n)\) times a constant, we simply need to show that \(L(n) \in O(\log n)\). But this follows from our assumptions on \(P_n\) and F, since by definition, L(n) is the largest integer such that \(1-F(L(n)) > 1/pn\) but we also have \(1-F(L(n)) \le cq^{L(n)}\), hence
\[ \frac{1}{pn} < 1-F(L(n)) \le cq^{L(n)}, \]
and thus \(L(n) \le (\log n + \log p + \log c)/\log (1/q)\), as required. \(\square \)
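The setting of this proof (a discrete V generated by inversion from the first coordinate, and the subsets \(P_n^l\) with sizes \(\tilde{n}_l\)) can be sketched in Python. The choices below are assumptions made for illustration and are not taken from the paper: the Ali-Mikhail-Haq family, for which V is geometric with \(P(V=l) = (1-\theta )\theta ^{l-1}\), \(1-F(l) = \theta ^l\) and \(\psi (t) = (1-\theta )/(e^t-\theta )\); a three-dimensional Halton sequence with bases 2, 3, 5 as the low-discrepancy point set; and all function names. The authors' own software is in the R packages copula and qrng.

```python
import math


def halton(i, base):
    """Radical inverse of the integer i >= 1 in the given base (a value in (0, 1))."""
    f, r = 1.0, 0.0
    while i > 0:
        f /= base
        r += f * (i % base)
        i //= base
    return r


def halton_points(n, bases=(2, 3, 5)):
    """First n points of the Halton sequence (index starting at 1), one base per coordinate."""
    return [tuple(halton(i, b) for b in bases) for i in range(1, n + 1)]


def geometric_quantile(v, theta):
    """Inversion for V with P(V = l) = (1 - theta) * theta**(l - 1), l >= 1.

    Returns the smallest l with F(l) = 1 - theta**l >= v, for 0 <= v < 1.
    """
    return max(1, math.ceil(math.log1p(-v) / math.log(theta)))


def psi_amh(t, theta):
    """Ali-Mikhail-Haq generator (assumed family), 0 < theta < 1."""
    return (1.0 - theta) / (math.exp(t) - theta)


def mo_transform(v, theta):
    """Map v = (v_1, ..., v_{d+1}) to (u_1, ..., u_d): v_1 yields V, then u_j = psi(-log(v_{j+1}) / V)."""
    big_v = geometric_quantile(v[0], theta)
    return tuple(psi_amh(-math.log(vj) / big_v, theta) for vj in v[1:])


def counts_by_level(points, theta):
    """Number of points whose first coordinate produces V = l, for each observed l."""
    counts = {}
    for v in points:
        level = geometric_quantile(v[0], theta)
        counts[level] = counts.get(level, 0) + 1
    return counts


if __name__ == "__main__":
    n, theta = 1024, 0.3
    pts = halton_points(n)
    sample = [mo_transform(v, theta) for v in pts]
    counts = counts_by_level(pts, theta)
    print("first copula point:", sample[0])
    print("points per value of V:", dict(sorted(counts.items())))
    print("largest value of V produced, L(n):", max(counts))
```

In this sketch the largest value of V produced by the point set plays the role of L(n), and the per-level counts correspond to \(\tilde{n}_l\); with a geometric tail, L(n) grows only logarithmically in n, which is the property the proof relies on.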
Cambou, M., Hofert, M. & Lemieux, C. Quasi-random numbers for copula models. Stat Comput 27, 1307–1329 (2017). https://doi.org/10.1007/s11222-016-9688-4
DOI: https://doi.org/10.1007/s11222-016-9688-4
Keywords
- Quasi-random numbers
- Copulas
- Conditional distribution method
- Marshall–Olkin algorithm
- Tail events
- Risk measures