Exact estimation of multiple directed acyclic graphs

Oates, Chris J.; Smith, Jim Q.; Mukherjee, Sach; Cussens, James

doi:10.1007/s11222-015-9570-9

Published: 19 June 2015

Statistics and Computing, volume 26, pages 797–811 (2016)

Abstract

This paper considers structure learning for multiple related directed acyclic graph (DAG) models. Building on recent developments in exact estimation of DAGs using integer linear programming (ILP), we present an ILP approach for joint estimation over multiple DAGs. Unlike previous work, we do not require that the vertices in each DAG share a common ordering. Furthermore, we allow for (potentially unknown) dependency structure between the DAGs. Results are presented on both simulated data and fMRI data obtained from multiple subjects.

References

Achterberg, T.: SCIP: solving constraint integer programs. Math Program Comput 1(1), 1–41 (2009)
Bartlett, M., Cussens, J.: Advances in Bayesian network learning using integer programming. In: Proceedings of the 29th Conference on Uncertainty in Artificial Intelligence, pp. 182–191 (2013)
Berg, J., Järvisalo, M., Malone, B.: Learning optimal bounded treewidth Bayesian networks via maximum satisfiability. In: Proceedings of the 17th International Conference on Artificial Intelligence and Statistics 33, pp. 86–95 (2014)
Chickering, D.M.: Optimal structure identification with greedy search. J. Mach. Learn. Res. 3, 507–554 (2003)
Costa, L., Smith, J.Q., Nichols, T., Cussens, J., Duff, E.P., Makin, T.R.: Searching multiregression dynamic models of resting-state fMRI networks using integer programming. Bayesian Anal., to appear (2015)
Cowell, R.G.: Efficient maximum likelihood pedigree reconstruction. Theor. Popul. Biol. 76, 285–291 (2009)
Cussens, J.: Maximum likelihood pedigree reconstruction using integer programming. In: Proceedings of the Workshop on Constraint Based Methods for Bioinformatics (WCB-10), Edinburgh (2010)
Cussens, J.: Bayesian network learning with cutting planes. In: Proceedings of the 27th Conference on Uncertainty in Artificial Intelligence, pp. 153–160 (2011)
Danaher, P., Wang, P., Witten, D.M.: The joint graphical lasso for inverse covariance estimation across multiple classes. J. R. Stat. Soc. B 76(2), 373–397 (2014)
De Campos, C.P., Ji, Q.: Efficient structure learning of Bayesian networks using constraints. J. Mach. Learn. Res. 12, 663–689 (2011)
Ellis, B., Wong, W.H.: Learning causal Bayesian network structures from experimental data. J. Am. Stat. Assoc. 103(482), 778–789 (2008)
Friedman, N., Koller, D.: Being Bayesian about network structure: a Bayesian approach to structure discovery in Bayesian networks. Mach. Learn. 50(1–2), 95–126 (2003)
Friedman, J., Hastie, T., Tibshirani, R.: Sparse inverse covariance estimation with the graphical lasso. Biostatistics 9(3), 432–441 (2008)
Friston, K.J.: Functional and effective connectivity: a review. Brain Connect. 1(1), 13–36 (2011)
He, Y., Jia, J., Yu, B.: Reversible MCMC on Markov equivalence classes of sparse directed acyclic graphs. Ann. Stat. 41(4), 1742–1779 (2013)
Heckerman, D., Geiger, D., Chickering, D.M.: Learning Bayesian networks: the combination of knowledge and statistical data. Mach. Learn. 20(3), 197–243 (1995)
Hill, S., Lu, Y., Molina, J., Heiser, L.M., Spellman, P.T., Speed, T.P., Gray, J.W., Mills, G.B., Mukherjee, S.: Bayesian inference of signaling network topology in a cancer cell line. Bioinformatics 28(21), 2804–2810 (2012)
Jaakkola, T., Sontag, D., Globerson, A., Meila, M.: Learning Bayesian network structure using LP relaxations. In: Proceedings of the 13th International Conference on Artificial Intelligence and Statistics, pp. 358–365 (2010)
Lee, S.Y.: Structural Equation Modeling: A Bayesian Approach. Wiley, New York (2007)
Li, J., Wang, Z.J., Palmer, S.J., McKeown, M.J.: Dynamic Bayesian network modeling of fMRI: a comparison of group-analysis methods. Neuroimage 41(2), 398–407 (2008)
Loh, P.-L., Wainwright, M.J.: Structure estimation for discrete graphical models: generalized covariance matrices and their inverses. Ann. Stat. 41(6), 3022–3049 (2013)
Luis, R., Sucar, L.E., Morales, E.F.: Inductive transfer for learning Bayesian networks. Mach. Learn. 79(1–2), 227–255 (2010)
Mahajan, A.: Presolving mixed-integer linear programs. Wiley Encyclopedia of Operations Research and Management Science (2010)
Malone, B., Kangas, K., Järvisalo, M., Koivisto, M., Myllymäki, P.: Predicting the hardness of learning Bayesian networks. In: Proceedings of the 28th AAAI Conference on Artificial Intelligence (2014)
Mechelli, A., Penny, W.D., Price, C.J., Gitelman, D.R., Friston, K.J.: Effective connectivity and intersubject variability: using a multisubject network to test differences and commonalities. Neuroimage 17(3), 1459–1469 (2002)
Meinshausen, N., Bühlmann, P.: High-dimensional graphs and variable selection with the lasso. Ann. Stat. 34(3), 1436–1462 (2006)
Nemhauser, G.L., Wolsey, L.A.: Integer and Combinatorial Optimization. Wiley, New York (1988)
Niculescu-Mizil, A., Caruana, R.: Inductive transfer for Bayesian network structure learning. In: Proceedings of the 11th International Conference on Artificial Intelligence and Statistics, pp. 339–346 (2007)
Nie, S., Mauá, D.D., de Campos, C.P., Ji, Q.: Advances in learning Bayesian networks of bounded treewidth. Adv. Neur. In. 27, 2285–2293 (2014)
Oates, C.J., Mukherjee, S.: Joint structure learning of multiple non-exchangeable networks. In: Proceedings of the 17th International Conference on Artificial Intelligence and Statistics, pp. 687–695 (2014)
Oates, C.J., Korkola, J., Gray, J.W., Mukherjee, S.: Joint estimation of multiple networks from time course data. Ann. Appl. Stat. 8(3), 1892–1919 (2014a)
Oates, C.J., Carneiro da Costa, L., Nichols, T.: Towards a multi-subject analysis of neural connectivity. Neural Comput. 27, 151–170 (2015)
Oyen, D., Lane, T.: Leveraging domain knowledge in multitask Bayesian network structure learning. In: Proceedings of the 26th AAAI Conference on Artificial Intelligence (2012)
Oyen, D., Lane, T.: Bayesian discovery of multiple Bayesian networks via transfer learning. In: Proceedings of the 13th IEEE International Conference on Data Mining, pp. 577–586 (2013)
Pan, S.J., Yang, Q.: A survey on transfer learning. IEEE T. Knowl. Data En. 22(10), 1345–1359 (2010)
Parviainen, P., Farahani, H.S., Lagergren, J.: Learning bounded tree-width Bayesian networks using integer linear programming. In: Proceedings of the 17th International Conference on Artificial Intelligence and Statistics 33, pp. 751–759 (2014)
Penfold, C.A., Buchanan-Wollaston, V., Denby, K.J., Wild, D.L.: Nonparametric Bayesian inference for perturbed and orthologous gene regulatory networks. Bioinformatics 28(12), i233–i241 (2012)
Peters, J., Mooij, J.M., Janzing, D., Schölkopf, B.: Identifiability of causal graphs using functional models. In: Proceedings of the 27th Conference on Uncertainty in Artificial Intelligence, pp. 589–598 (2011)
Peters, J., Bühlmann, P.: Identifiability of Gaussian structural equation models with equal error variances. Biometrika 101, 219–228 (2014)
Queen, C.M., Smith, J.Q.: Multiregression dynamic models. J. R. Stat. Soc. B 55(4), 849–870 (1993)
Sheehan, N.A., Bartlett, M., Cussens, J.: Improved maximum likelihood reconstruction of complex multi-generational pedigrees. Theor. Popul. Biol. 97, 11–19 (2014)
Silander, T., Myllymäki, P.: A simple approach to finding the globally optimal Bayesian network structure. In: Proceedings of the 22nd Conference on Artificial Intelligence, pp. 445–452 (2006)
Studený, M., Vomlel, J., Hemmecke, R.: A geometric view on learning Bayesian network structures. Int. J. Approx. Reason. 51(5), 578–586 (2010)
Studený, M., Haws, D.: On polyhedral approximations of polytopes for learning Bayesian networks. J. Algebraic Stat. 4(1), 59–92 (2013)
Sugihara, G., Kaminaga, T., Sugishita, M.: Interindividual uniformity and variety of the “Writing center”: a functional MRI study. Neuroimage 32(4), 1837–1849 (2006)
Thiesson, B., Meek, C., Chickering, D. M., Heckerman, D.: Learning mixtures of Bayesian networks. In: Proceedings of the 14th Conference on Uncertainty in Artificial Intelligence, pp. 504–513 (1998)
Tsamardinos, I., Brown, L.E., Aliferis, C.F.: The max-min hill-climbing Bayesian network structure learning algorithm. Mach. Learn. 65(1), 31–78 (2006)
Van Essen, D.C., Smith, S.M., Barch, D.M., Behrens, T.E., Yacoub, E., Ugurbil, K.: The WU-Minn human connectome project: an overview. Neuroimage 80, 62–79 (2013)
Werhli, A.V., Husmeier, D.: Gene regulatory network reconstruction by Bayesian integration of prior knowledge and/or different experimental conditions. J. Bioinform. Comput. Biol. 6(3), 543–572 (2008)
Wolsey, L.A.: Integer Programming. Wiley, New York (1998)
Yajima, M., Telesca, D., Ji, Y., Müller, P.: Detecting differential patterns of interaction in molecular pathways. Biostatistics, kxu054 (2014)
Yuan, C., Malone, B.: Learning optimal Bayesian networks: a shortest path perspective. J. Artif. Intell. Res. 48, 23–65 (2013)

Acknowledgments

The authors are grateful to Dr. Ricardo Silva and two anonymous reviewers, whose feedback helped to improve the paper. CJO was supported by the Centre for Research in Statistical Methodology (CRiSM) EPSRC EP/D002060/1. JC was supported by the Medical Research Council (Project Grant G1002312). SM was supported by the UK Medical Research Council and is a recipient of a Royal Society Wolfson Research Merit Award. The authors are grateful to Lilia Carneiro da Costa and Tom Nichols who collaborated in the analysis of fMRI data and to Mark Bartlett who provided technical support with GOBNILP. The authors also thank Diane Oyen and several other colleagues who provided feedback on an earlier draft.

Author information

Chris J. Oates
Present address: School of Mathematical and Physical Sciences, University of Technology Sydney, Broadway, Sydney, P.O. Box 123, NSW, 2007, Australia
Sach Mukherjee
Present address: German Center for Neurodegenerative Diseases (DZNE), 53175, Bonn, Germany

Authors and Affiliations

Department of Statistics, University of Warwick, Coventry, CV4 7AL, UK
Chris J. Oates & Jim Q. Smith
MRC Biostatistics Unit and CRUK Cambridge Institute, University of Cambridge, Cambridge, CB2 0SR, UK
Sach Mukherjee
Department of Computer Science and York Centre for Complex Systems Analysis, University of York, York, YO10 5GE, UK
James Cussens

Corresponding author

Correspondence to Chris J. Oates.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 357 KB)

Appendix: Multiregression dynamical models

MDMs are a generalisation of BNs that model time series data and that, unlike BNs, are fully identifiable (i.e. the score equivalence classes are singletons). The MDM is defined on a multivariate time series and aims to identify the conditional independence structure among the variables over time (Queen and Smith 1993). In the MDM that we consider, a multivariate model for the observable series $\mathbf{Y}_{1:P}^{(k)}(n)$, for subject $k$ at time $n$, is characterised by a contemporaneous DAG $G^{(k)}$, with information shared across time only through evolution of the model parameters $\boldsymbol{\theta}_{G_i^{(k)}}^{(k)}(n)$. Formally, this model is described by the following observation equations and system equations:

$$Y_i^{(k)}(n) = \mathbf{Y}_{G_i^{(k)}}^{(k)}(n)^T \boldsymbol{\theta}_i^{(k)}(n) + \epsilon_i^{(k)}(n) \tag{14}$$

$$\boldsymbol{\theta}^{(k)}(n) = \boldsymbol{\varGamma}^{(k)}(n) \boldsymbol{\theta}^{(k)}(n-1) + \mathbf{w}^{(k)}(n) \tag{15}$$

where $\boldsymbol{\theta}^{(k)}(n)^T = (\boldsymbol{\theta}_1^{(k)}(n)^T, \ldots, \boldsymbol{\theta}_P^{(k)}(n)^T)$ is the concatenated parameter vector. Here the disturbance terms $\epsilon_i^{(k)}(n) \sim N(0, V_i^{(k)}(n))$ and $\mathbf{w}^{(k)}(n) \sim N(\mathbf{0}, \mathbf{W}^{(k)}(n))$ are both normally distributed and the hyper-parameters $V_i^{(k)}(n)$, $\boldsymbol{\varGamma}^{(k)}(n)$, $\mathbf{W}^{(k)}(n)$ must be specified. The equations of the MDM can be viewed as a collection of nested univariate linear models, allowing the parameters to be estimated using Kalman filter recurrences over time (full details in the supplementary text).
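To make the Kalman filter recurrences concrete, the following is a minimal sketch for a single node $i$ of a single subject, written in plain Python. It is an illustration under simplifying assumptions that are not taken from the paper: the evolution matrix is the identity (random-walk coefficients), the variances $V$ and $W$ are known and constant over time, and the function name `kalman_node_loglik` is hypothetical. The accumulated one-step-ahead forecast log-density is returned because it could plausibly serve as a local score for a candidate parent set; the authors' actual specification is given in their supplementary text.

```python
import math


def kalman_node_loglik(y, F, m0, C0, V, W):
    """Kalman filter for one node: y[n] = F[n]^T theta[n] + eps[n].

    y  : list of T observations of the child series
    F  : list of T regressor vectors (parent values at the same time), length p each
    m0 : prior mean of the coefficients (length p)
    C0 : prior covariance of the coefficients (p x p, list of lists)
    V  : observation variance (assumed known and constant)
    W  : evolution covariance (p x p; assumed known and constant)
    Assumes Gamma = identity, i.e. theta[n] = theta[n-1] + w[n].
    Returns (log-likelihood, final posterior mean, final posterior covariance).
    """
    p = len(m0)
    m = list(m0)
    C = [row[:] for row in C0]
    loglik = 0.0
    for y_n, F_n in zip(y, F):
        # Prediction step: with Gamma = I the prior mean is m and R = C + W.
        R = [[C[i][j] + W[i][j] for j in range(p)] for i in range(p)]
        # One-step-ahead forecast mean f and variance Q.
        f = sum(F_n[i] * m[i] for i in range(p))
        RF = [sum(R[i][j] * F_n[j] for j in range(p)) for i in range(p)]
        Q = sum(F_n[i] * RF[i] for i in range(p)) + V
        e = y_n - f
        # Gaussian log-density of the forecast error.
        loglik += -0.5 * (math.log(2.0 * math.pi * Q) + e * e / Q)
        # Update step: adaptive gain A = R F / Q.
        A = [RF[i] / Q for i in range(p)]
        m = [m[i] + A[i] * e for i in range(p)]
        C = [[R[i][j] - A[i] * A[j] * Q for j in range(p)] for i in range(p)]
    return loglik, m, C


# Toy usage: the child series depends on x, not on z.
T = 50
x = [math.sin(0.3 * n) for n in range(T)]
z = [math.cos(0.3 * n) for n in range(T)]
y = [2.0 * x_n for x_n in x]
prior = dict(m0=[0.0], C0=[[100.0]], V=0.01, W=[[0.001]])
score_true, _, _ = kalman_node_loglik(y, [[v] for v in x], **prior)
score_wrong, _, _ = kalman_node_loglik(y, [[v] for v in z], **prior)
print(score_true, score_wrong)
```

In this toy example the parent set containing `x` receives a higher log-likelihood than the one containing `z`, which is the kind of comparison a score-based search over parent sets would rely on.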

About this article

Cite this article

Oates, C.J., Smith, J.Q., Mukherjee, S. et al. Exact estimation of multiple directed acyclic graphs. Stat Comput 26, 797–811 (2016). https://doi.org/10.1007/s11222-015-9570-9

Received: 12 November 2014
Accepted: 16 April 2015
Published: 19 June 2015
Issue Date: July 2016
DOI: https://doi.org/10.1007/s11222-015-9570-9
