Copula directed acyclic graphs

Statistics and Computing

Abstract

A new methodology for selecting a Bayesian network for continuous data outside the widely used class of multivariate normal distributions is developed. The ‘copula DAGs’ combine directed acyclic graphs and their associated probability models with copula C/D-vines. Bivariate copula densities introduce flexibility in the joint distributions of pairs of nodes in the network. An information criterion is studied for graph selection tailored to the joint modeling of data based on graphs and copulas. Examples and simulation studies show the flexibility and properties of the method.

References

  • Aas, K., Czado, C., Frigessi, A., Bakken, H.: Pair-copula constructions of multiple dependence. Insur. Math. Econ. 44(2), 182–198 (2009)

  • Akaike, H.: Information theory and an extension of the maximum likelihood principle. In: Petrov, B., Csáki, F. (eds.) Second International Symposium on Information Theory, pp. 267–281. Akadémiai Kiadó, Budapest (1973)

  • Barber, D.: Bayesian Reasoning and Machine Learning. Cambridge University Press, Cambridge (2012)

  • Bauer, A., Czado, C., Klein, T.: Pair-copula constructions for non-Gaussian DAG models. Can. J. Stat. 40(1), 86–109 (2012)

  • Bedford, T., Cooke, R.M.: Probability density decomposition for conditionally dependent random variables modeled by vines. Ann. Math. Artif. Intell. 32(1–4), 245–268 (2001)

  • Bedford, T., Cooke, R.M.: Vines—a new graphical model for dependent random variables. Ann. Stat. 30(4), 1031–1068 (2002)

  • Brechmann, E., Czado, C.: Risk management with high-dimensional vine copulas: an analysis of the Euro Stoxx 50. Stat. Risk Model. 30(4), 307–342 (2013)

  • Brechmann, E., Schepsmeier, U.: Modeling dependence with C- and D-vine copulas: the R package CDVine. J. Stat. Softw. 52(3), 1–27 (2013)

  • Brechmann, E.C., Czado, C., Aas, K.: Truncated regular vines in high dimensions with applications to financial data. Can. J. Stat. 40(1), 68–85 (2012)

  • Chickering, D.: Optimal structure identification with greedy search. J. Mach. Learn. Res. 3, 507–554 (2002)

  • Clarke, K.: Nonparametric model discrimination in international relations. J. Confl. Resolut. 47(1), 72–93 (2003)

  • R Core Team: R: a language and environment for statistical computing. R Foundation for Statistical Computing, Vienna (2014)

  • Cox, D., Wermuth, N.: Multivariate Dependencies: Models, Analysis and Interpretation. Chapman & Hall/CRC, London (1996)

  • Czado, C.: Pair-copula constructions of multivariate copulas. In: Jaworski, P., Durante, F., Härdle, W., Rychlik, W. (eds.) Copula Theory and its Applications, pp. 93–109. Springer, Berlin (2010)

  • Czado, C., Gärtner, F., Min, A.: Analysis of Australian electricity loads using joint Bayesian inference of D-vines with autoregressive margins. In: Kurowicka, D., Joe, H. (eds.) Vine Copula Handbook, pp. 265–280. World Scientific Publishing, Singapore (2011)

  • Czado, C., Schepsmeier, U., Min, A.: Maximum likelihood estimation of mixed C-vines with application to exchange rates. Stat. Model. 12(3), 229–255 (2012)

  • Dißmann, J., Brechmann, E., Czado, C., Kurowicka, D.: Selecting and estimating regular vine copulae and application to financial returns. Comput. Stat. Data Anal. 59, 52–69 (2013)

  • Drton, M., Perlman, M.: A SINful approach to Gaussian graphical model selection. J. Stat. Plan. Inference 138(4), 1179–1200 (2008)

  • Elidan, G.: Copula Bayesian networks. In: Lafferty, J., Williams, C.K.I., Shawe-Taylor, J., Zemel, R., Culotta, A. (eds.) Proceedings of Advances in Neural Information Processing Systems 23 (NIPS 2010), pp. 559–567 (2010)

  • Elidan, G.: Lightning-speed structure learning of nonlinear continuous networks. J. Mach. Learn. Res. Proc. Track 22, 355–363 (2012)

  • Geiger, D., Verma, T., Pearl, J.: Identifying independence in Bayesian networks. Networks 20(5), 507–534 (1990)

  • Genest, C., Favre, A.: Everything you always wanted to know about copula modeling but were afraid to ask. J. Hydrol. Eng. 12(4), 347–368 (2007)

  • Gijbels, I., Veraverbeke, N., Omelka, M.: Conditional copulas, association measures and their applications. Comput. Stat. Data Anal. 55(5), 1919–1932 (2011)

  • Hanea, A.M.: Non-parametric Bayesian belief nets versus vines. In: Kurowicka, D., Joe, H. (eds.) Vine Copula Handbook, Dependence Modeling, pp. 281–303. World Scientific Publishing, Singapore (2011)

  • Hanea, A.M., Kurowicka, D., Cooke, R.M., Ababei, D.A.: Mining and visualising ordinal data with non-parametric continuous BBNs. Comput. Stat. Data Anal. 54(3), 668–687 (2010)

  • Harris, N., Drton, M.: PC algorithm for nonparanormal graphical models. J. Mach. Learn. Res. 14, 3365–3383 (2013)

  • Heckerman, D., Geiger, D.: Learning Bayesian networks: a unification for discrete and Gaussian domains. In: Proceedings of Eleventh Conference on Uncertainty in Artificial Intelligence, Morgan Kaufmann, pp. 274–284 (1995)

  • Hobæk Haff, I.: Parameter estimation for pair-copula constructions. Bernoulli 19(2), 462–491 (2013)

  • Hofert, M., Kojadinovic, I., Maechler, M., Yan, J.: copula: Multivariate dependence with copulas. R package version 0.999-10 (2014)

  • Jalali, A., Ravikumar, P., Vasuki, V., Sanghavi, S.: On learning discrete graphical models using group-sparse regularization. In: Proceedings of the 13th International Conference on Artificial Intelligence and Statistics (2010)

  • Joe, H.: Families of \(m\)-variate distributions with given margins and \(m(m-1)/2\) bivariate dependence parameters. In: Rüschendorf, L., Schweizer, B., Taylor, M. (eds.) Distributions with Fixed Marginals and Related Topics, Lecture Notes-Monograph Series, vol. 28, Institute of Mathematical Statistics, pp. 120–141 (1996)

  • Kalisch, M., Bühlmann, P.: High-dimensional directed acyclic graphs with the PC-algorithm. J. Mach. Learn. Res. 8, 613–636 (2007)

  • Kalisch, M., Mächler, M., Colombo, D., Maathuis, M.H., Bühlmann, P.: Causal inference using graphical models with the R package pcalg. J. Stat. Softw. 47(11), 1–26 (2012)

  • Koller, D., Friedman, N.: Probabilistic Graphical Models: Principles and Techniques. MIT Press, Cambridge (2009)

  • Kurowicka, D., Cooke, R.: The vine copula method for representing high dimensional dependent distributions: applications to continuous belief nets. In: Yücesan, E., Chen, C.H., Snowdon, J.L., Charnes, J.M. (eds.) The Winter Simulation Conference, IEEE Press, Piscataway, pp. 270–278 (2002)

  • Kurowicka, D., Cooke, R.: Uncertainty Analysis with High Dimensional Dependence Modelling. Wiley, Chichester (2006)

  • Lauritzen, S.: Graphical Models. Oxford University Press, Oxford (1996)

  • Lee, J., Hastie, T.: Learning the structure of mixed graphical models. J. Comput. Graph. Stat. 24(1), 230–253 (2012)

  • Lichman, M.: UCI machine learning repository. University of California, School of Information and Computer Sciences, Irvine. http://archive.ics.uci.edu/ml (2013)

  • Liu, H., Lafferty, J., Wasserman, L.: The nonparanormal: semiparametric estimation of high dimensional undirected graphs. J. Mach. Learn. Res. 10, 2295–2328 (2009)

  • Loh, P.L., Wainwright, M.J.: Structure estimation for discrete graphical models: generalized covariance matrices and their inverses. Ann. Stat. 41(6), 3022–3049 (2013)

  • Lucas, P.J.: Biomedical applications of Bayesian networks. In: Lucas, P.J.F., Gámez, J., Salmerón Cerdan, A. (eds.) Advances in Probabilistic Graphical Models, Studies in Fuzziness and Soft Computing, pp. 333–358. Springer, Berlin (2007)

  • Madsen, A.L., Kjærulff, U.B.: Applications of HUGIN to diagnosis and control of autonomous vehicles. In: Lucas, P.J.F., Gámez, J., Salmerón Cerdan, A. (eds.) Advances in Probabilistic Graphical Models, Studies in Fuzziness and Soft Computing, vol. 214, pp. 313–332. Springer, Berlin (2007)

  • Mari, D., Kotz, S.: Correlation and Dependence. Imperial College Press, London (2001)

  • Min, A., Czado, C.: Bayesian model selection for multivariate copulas using pair-copula constructions. J. Financ. Econ. 8(4), 511–546 (2010)

  • Min, A., Czado, C.: Bayesian model selection for D-vine pair-copula constructions. Can. J. Stat. 39(2), 239–258 (2011)

  • Morales Nápoles, O.: Bayesian belief nets and vines in aviation safety and other applications. PhD Thesis, Technische Universiteit Delft (2010)

  • Nelsen, R.B.: An Introduction to Copulas. Springer, Berlin (2006)

  • Okhrin, O., Ristig, A.: Hierarchical Archimedean copulae: the HAC package. J. Stat. Softw. 58(4), 1–20 (2014)

  • Peshkin, L., Pfefer, A., Savova, V.: Bayesian nets in syntactic categorization of novel words. In: Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology, Association for Computational Linguistics, vol. 2, pp. 79–81 (2003)

  • Schepsmeier, U., Stoeber, J., Brechmann, E.C., Graeler, B.: VineCopula: statistical inference of vine copulas. R package version 1.3 (2014)

  • Schwarz, G.: Estimating the dimension of a model. Ann. Stat. 6(2), 461–464 (1978)

  • Scutari, M.: Learning Bayesian networks with the bnlearn R package. J. Stat. Softw. 35(3), 1–22 (2010)

  • Sin, C., White, H.: Information criteria for selecting possibly misspecified parametric models. J. Econ. 71(1–2), 207–225 (1996)

  • Sklar, A.: Fonctions de répartition à n dimensions et leurs marges. Publ. Inst. Stat. Univ. Paris 8, 229–231 (1959)

  • Smith, M., Min, A., Almeida, C., Czado, C.: Modeling longitudinal data using a pair-copula construction decomposition of serial dependence. J. Am. Stat. Assoc. 105, 1467–1479 (2010)

  • Spirtes, P., Glymour, C., Scheines, R.: Causation, Prediction and Search, 2nd edn. MIT Press, Cambridge (2000)

  • Vuong, Q.H.: Likelihood ratio tests for model selection and non-nested hypotheses. Econometrica 57(2), 307–333 (1989)

  • Wainwright, M.J., Jordan, M.I.: Graphical models, exponential families, and variational inference. Found. Trends Mach. Learn. 1(1–2), 1–305 (2008)

  • Yang, E., Ravikumar, P.K., Allen, G.I., Liu, Z.: Graphical models via generalized linear models. In: Bartlett, P., Pereira, F., Burges, C., Bottou, L., Weinberger, K. (eds.) Proceedings of Advances in Neural Information Processing Systems (NIPS 2012), pp. 1367–1375 (2012)

Acknowledgments

We wish to thank the reviewers for their comments. We thank A. Hanea and D. Ababei for providing the software of their procedure. We acknowledge the support of the Fund for Scientific Research Flanders, KU Leuven grant GOA/12/14 and of the IAP Research Network P7/06 of the Belgian Science Policy. The computational resources and services used in this work were provided by the VSC (Flemish Supercomputer Center), funded by the Hercules Foundation and the Flemish Government—Department EWI.

Author information

Corresponding author

Correspondence to Gerda Claeskens.

Appendix: Technical details

Assumptions of Proposition 4.1 adapted from Sin and White (1996). For every node l in the graph, let \(q_{lk}(\cdot ,\varvec{\theta })=\log CV_{l,k}(\varvec{\theta }_{CV_l})-\log DV_{l,k}(\varvec{\theta }_{DV_l})\), define \(\tilde{q}_{lk}(\cdot ,\varvec{\theta })=\log DV_{l,k}(\varvec{\theta }_{DV_l})\) and \(\text {log-Lik}(\cdot ,\varvec{\theta };\text {node}_l)\equiv Q_{ln}(\cdot ,\varvec{\theta })=\sum _{k=1}^{n}q_{lk}(\cdot ,\varvec{\theta })\) with \(\varvec{\theta }=(\varvec{\theta }_{CV_l},\varvec{\theta }_{DV_l})\) and \(k=1,2,\ldots , n\). For ease of exposition we state general conditions that need to be satisfied by \(q_{lk}(\cdot ,\varvec{\theta }), \ \tilde{q}_{lk}(\cdot ,\varvec{\theta }), \ Q_{ln}(\cdot ,\varvec{\theta })\) and \(\varvec{\theta }\) for every model m.

Let \((\Omega ,\mathcal {F},P)\) be a complete probability space and \(\Theta \) be a compact subset of \(\mathbb {R}^{d}\) with \(d\in \mathbb {N}\). For all \(n\in \mathbb {N}\) let \(Q_{ln}:\Omega \times \Theta \rightarrow \mathbb {R}\) be such that:

  (i) \(\forall \varvec{\theta }\in \Theta , \ Q_{ln}(\cdot ,\varvec{\theta })\) is \(\mathcal {F}\)-measurable.

  (ii) \(\forall \omega \in A\in \mathcal {F}\) with \(P(A)=1, \ Q_{ln}(\omega ,\cdot )\) is continuously differentiable on \(\Theta \).

  (iii) The expectation \(E(Q_{ln}(\cdot ,\varvec{\theta }))\) exists and defines a function which is continuously differentiable on \(\Theta \) and \(\bigtriangledown E(Q_{ln}(\cdot ,\varvec{\theta }))=E(\bigtriangledown Q_{ln}(\cdot ,\varvec{\theta }))\) where \(\bigtriangledown \) is the gradient operator.

  (iv) The least false parameter defined by \(\varvec{\theta }_{0n}=\arg \sup _{\varvec{\theta }\in \Theta }\frac{1}{n}E(Q_{ln}(\cdot ,\varvec{\theta }))\) is interior to \(\Theta \) uniformly (in n).

  (v) Given \(\epsilon >0\) there exists \(N_0(\epsilon )<\infty \) and \(\delta (\epsilon )>0\) such that \(\inf \{\min \{K_n^{*}(\varvec{\theta }):\varvec{\theta }\in \mathcal {N}_n^{*}(\epsilon )^{c}\},n>N_0(\epsilon )\} \equiv \delta (\epsilon )\), where \(K_n^{*}(\varvec{\theta })\equiv n^{-1}E(Q_{ln}(\cdot ,\varvec{\theta }_{0n}))-n^{-1}E(Q_{ln}(\cdot ,\varvec{\theta })), \ \mathcal {N}_n^{*}(\epsilon )^{c}\) is the compact complement of \(\mathcal {N}_n^{*}(\epsilon ) \equiv \mathcal {S}^{*}_n(\epsilon )\cap \Theta \) in \(\Theta \) and \(\mathcal {S}^{*}_n(\epsilon )\) is an open sphere centered at \(\varvec{\theta }_{0n}\) with fixed radius \(\epsilon \).

  (vi) For P-almost all \(\omega , \ q_{lk}(\omega ,\cdot )\) is twice continuously differentiable as a function of \(\varvec{\theta }\), for \(k=1,2,\ldots \)

  (vii) \(q_{lk}\) and \(\tilde{q}_{lk}\) satisfy a uniform weak law of large numbers (UWLLN) on \(\Theta \).

  (viii) Each element of \(\bigtriangledown q_{lk}(\cdot ,\varvec{\theta }_{0n})\) satisfies a central limit theorem.

  (ix) \(\exists \epsilon , \alpha >0\) such that for P-almost all \(\omega \) and for all n sufficiently large and for all \(\varvec{\theta } \in \mathcal {N}_n^{*}(\epsilon ), \det (n^{-1}\bigtriangledown ^{2}Q_{ln}(\omega ,\varvec{\theta }))\ge \alpha \), with \(\mathcal {N}^{*}_n(\epsilon )\) as in Assumption v.

  (x) For all n sufficiently large and for all \(\varvec{\theta } \in \mathcal {N}_n^{*}(\epsilon ), E[n^{-1}\bigtriangledown ^2Q_{ln}(\cdot ,\varvec{\theta })]\) is \(\varvec{O}(1)\).

  (xi) Each element of \(\bigtriangledown ^2 q_{ln}\) satisfies a UWLLN on \(\mathcal {N}^{*}_n(\epsilon )\).

We assume that the copula densities are such that the above conditions are satisfied. These are basic assumptions that guarantee that \(\hat{\varvec{\theta }}_n-\varvec{\theta }_{0n}=\varvec{O}_p(n^{-1/2})\) and \(Q_n(\cdot ,\hat{\varvec{\theta }}_n)-Q_n(\cdot ,\varvec{\theta }_{0n})=\varvec{O}_p(1)\). The asymptotic normality of \(\sqrt{n}(\hat{\varvec{\theta }}_n-\varvec{\theta }_{0n})\) for the models we consider has been shown in Hobæk Haff (2013).

1.1 Penalty conditions in Lemma 4 for the penalty in cDAG-IC

Proof

Define \({\varDelta }\widehat{\hbox {pen}}_{\mathrm{cDAG}} = \widehat{\hbox {pen}}_{\mathrm{cDAG}}^1(n,\hat{\varvec{\theta }}^1)- \widehat{\hbox {pen}}_{\mathrm{cDAG}}^2(n,\hat{\varvec{\theta }}^2)\). For (i) it holds that

$$\begin{aligned} \begin{aligned} {\varDelta }\widehat{\hbox {pen}}_{\mathrm{cDAG}}/n&=\left( \frac{E\log DV^{1}_{l}}{|pa^{1}(l)|}-\frac{E\log DV^{2}_{l}}{|pa^{2}(l)|}\right) \frac{1}{\log n}\\&\quad +o_{P}(1) = o_P(1). \end{aligned} \end{aligned}$$

The first equality holds due to Assumption vii.

For (ii) and (iii) it follows that

$$\begin{aligned} \begin{aligned}&\frac{{\varDelta }\widehat{\hbox {pen}}_{\mathrm{cDAG}}}{\sqrt{n}}=\Big (\frac{E\log DV^{1}_{l}}{|pa^{1}(l)|}-\frac{E\log DV^{2}_{l}}{|pa^{2}(l)|}\Big )\frac{\sqrt{n}}{\log n}+ o_P(\sqrt{n}),\\&{\varDelta }\widehat{\hbox {pen}}_{\mathrm{cDAG}}=\Big (\frac{E\log DV^{1}_{l}}{|pa^{1}(l)|}-\frac{E\log DV^{2}_{l}}{|pa^{2}(l)|}\Big )\frac{n}{\log n}+ o_P(n). \end{aligned} \end{aligned}$$

By the assumed positiveness of the penalty difference, the conditions hold. \(\square \)

Definition of ‘d-separation’ between \(\mathcal {X}\) and \(\mathcal {Y}\) by \(\mathcal {Z}\) (Barber 2012). For every node \(x \in \mathcal {X}\) and \(y \in \mathcal {Y}\), check every path \(\mathcal {U}\) between x and y (that is, every sequence of distinct nodes that starts in x and ends in y in which consecutive nodes are joined by an arrow, irrespective of its direction). A path \(\mathcal {U}\) is blocked if there is a node w on \(\mathcal {U}\) such that either: (i) w is a collider on \(\mathcal {U}\) (both arrows of \(\mathcal {U}\) that meet at w point into w) and neither w nor any of its descendants is in \(\mathcal {Z}\), or (ii) w is not a collider on \(\mathcal {U}\) and w is in \(\mathcal {Z}\). If all such paths are blocked, then the sets of nodes \(\mathcal {X}\) and \(\mathcal {Y}\) are d-separated by \(\mathcal {Z}\). If the sets of nodes \(\mathcal {X}\) and \(\mathcal {Y}\) are d-separated by \(\mathcal {Z}\), they are independent conditional on \(\mathcal {Z}\).
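
The d-separation check can be sketched in a few lines of Python. This is only an illustration, not the authors' implementation. It assumes that the DAG is given as a collection of (parent, child) pairs, that \(\mathcal {X}\), \(\mathcal {Y}\) and \(\mathcal {Z}\) are pairwise disjoint, and that paths are read as in Barber (2012), that is, in the skeleton of the graph with arrow directions ignored. All function names (`descendants`, `simple_paths`, `is_blocked`, `d_separated`) are hypothetical. The sketch enumerates every path, so it is only practical for small graphs.

```python
from itertools import product


def descendants(children, node):
    """Return all nodes reachable from `node` by following the arrows."""
    seen, stack = set(), [node]
    while stack:
        for child in children.get(stack.pop(), ()):
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


def simple_paths(neighbors, start, goal, path=None):
    """Yield every path without repeated nodes from start to goal (directions ignored)."""
    path = [start] if path is None else path
    if start == goal:
        yield list(path)
        return
    for nxt in neighbors.get(start, ()):
        if nxt not in path:
            path.append(nxt)
            yield from simple_paths(neighbors, nxt, goal, path)
            path.pop()


def is_blocked(path, edges, children, Z):
    """Apply conditions (i) and (ii) to every interior node w of the path."""
    for prev, w, nxt in zip(path, path[1:], path[2:]):
        collider = (prev, w) in edges and (nxt, w) in edges
        if collider:
            # (i) collider with neither w nor any descendant of w in Z
            if w not in Z and not (descendants(children, w) & Z):
                return True
        elif w in Z:
            # (ii) non-collider that belongs to Z
            return True
    return False


def d_separated(edges, X, Y, Z):
    """True if every path between a node of X and a node of Y is blocked by Z."""
    edges, Z = set(edges), set(Z)
    children, neighbors = {}, {}
    for parent, child in edges:
        children.setdefault(parent, set()).add(child)
        neighbors.setdefault(parent, set()).add(child)
        neighbors.setdefault(child, set()).add(parent)
    return all(
        is_blocked(p, edges, children, Z)
        for x, y in product(X, Y)
        for p in simple_paths(neighbors, x, y)
    )


# Toy graph (an assumption for illustration): a -> c <- b, c -> d
toy = [("a", "c"), ("b", "c"), ("c", "d")]
print(d_separated(toy, {"a"}, {"b"}, set()))   # True: the collider c is not conditioned on
print(d_separated(toy, {"a"}, {"b"}, {"d"}))   # False: d is a descendant of the collider c
```

Enumerating all paths mirrors the wording of the definition directly. For larger graphs a reachability-based algorithm would be the usual choice instead.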

About this article

Cite this article

Pircalabelu, E., Claeskens, G. & Gijbels, I. Copula directed acyclic graphs. Stat Comput 27, 55–78 (2017). https://doi.org/10.1007/s11222-015-9599-9

  • DOI: https://doi.org/10.1007/s11222-015-9599-9
