Abstract
We present a new Bayesian approach for undirected Gaussian graphical model determination. We provide some graph theory results for local updates that facilitate a fast exploration of the graph space. Specifically, we show how to locally update, after either edge deletion or inclusion, the perfect sequence of cliques and the perfect elimination order of the nodes associated to an oriented, directed acyclic version of a decomposable graph. Building upon the decomposable graphical models framework, we propose a more flexible methodology that extends to the class of nondecomposable graphs. Posterior probabilities of edge inclusion are interpreted as a natural measure of edge selection uncertainty. When applied to a protein expression data set, the model leads to fast estimation of the protein interaction network.
References
Andersson, S., Madigan, D., Perlman, M.: A characterization of Markov equivalence classes for acyclic graphs. Ann. Stat. 25, 505–541 (1997)
Armstrong, H., Carter, C., Wong, K., Kohn, R.: Bayesian covariance matrix estimation using a mixture of decomposable graphical models. Stat. Comput. 19, 303–316 (2009)
Atay-Kayis, A., Massam, H.: The marginal likelihood for decomposable and non-decomposable graphical Gaussian models. Biometrika 92, 317–335 (2005)
Clyde, M., George, E.: Model uncertainty. Stat. Sci. 19(1), 81–94 (2004)
Danaher, P., Wang, P., Witten, D.: The joint graphical lasso for inverse covariance estimation across multiple classes. J. R. Stat. Soc. 76(2), 373–397 (2014)
Dawid, A.P., Lauritzen, S.: Hyper Markov laws in the statistical analysis of decomposable graphical models. Ann. Stat. 3, 1272–1317 (1993)
Dempster, A.P.: Covariance selection. Biometrics 28, 157–175 (1972)
Dobra, A., Jones, B., Hans, C., Nevins, J., West, M.: Sparse graphical models for exploring gene expression data. J. Multivar. Anal. 90, 196–212 (2004)
Dobra, A., Lenkoski, A., Rodriguez, A.: Bayesian inference for general Gaussian graphical models with application to multivariate lattice data. J. Am. Stat. Assoc. 106, 1418–1433 (2012)
Frydenberg, M., Lauritzen, S.: Decomposition of maximum likelihood in mixed graphical interaction models. Biometrika 76(3), 539–555 (1989)
Geiger, D., Heckerman, D.: Parameter priors for directed acyclic graphical models and the characterization of several probability distributions. Ann. Stat. 5, 1412–1440 (2002)
George, E., McCulloch, R.: Variable selection via Gibbs sampling. J. Am. Stat. Assoc. 88, 881–889 (1993)
Giudici, P., Green, P.: Decomposable graphical Gaussian model determination. Biometrika 86(4), 785–801 (1999)
Green, P., Thomas, A.: Sampling decomposable graphs using a Markov chain on junction trees. Biometrika 100(1), 91–110 (2013)
Grone, R., Johnson, C.R., Sà, E.M., Wolkowicz, H.: Positive definite completion of partial Hermitian matrices. Linear Algebra Appl. 58, 109–124 (1984)
Ibarra, L.: Fully dynamic algorithms for chordal graphs and split graphs. ACM Trans. Algorithms 40(1), 40 (2008)
Jones, B., Carvalho, C., Dobra, A., Hans, C., Carter, C., West, M.: Experiments in stochastic computation for high-dimensional graphical models. Stat. Sci. 20(4), 388–400 (2005)
Kornblau, S., Tibes, R., Qiu, Y., Chen, W., Kantarjian, H., Andreeff, M., Coombes, K., Mills, G.: Functional proteomic profiling of AML predicts response and survival. Blood 113, 154–164 (2009)
Lauritzen, S.: Graphical Models. Clarendon Press, Oxford (1996)
Ozawa, Y., Williams, A., Estes, M., Matsushita, N., Boschelli, F., Jove, R., List, A.: Src family kinases promote AML cell survival through activation of signal transducers and activators of transcription (STAT). Leuk. Res. 32(6), 893–903 (2008)
Paulsen, V., Power, S., Smith, R.: Schur products and matrix completions. J. Funct. Anal. 85, 151–178 (1989)
Roverato, A.: Cholesky decomposition of a hyper-inverse Wishart matrix. Biometrika 87, 99–112 (2000)
Roverato, A.: Hyper-inverse Wishart distribution for non-decomposable graphs and its application to Bayesian inference for Gaussian graphical models. Scand. J. Stat. 29, 391–411 (2002)
Scott, J., Carvalho, C.: Feature-inclusion stochastic search for Gaussian graphical models. J. Comput. Graphical Stat. 17, 790–808 (2008)
Tarjan, R., Yannakakis, M.: Simple linear-time algorithms to test chordality of graphs, test acyclicity of hypergraphs, and selectively reduce acyclic hypergraphs. SIAM J. Comput. 13, 566–579 (1984)
Thomas, A., Green, P.: Enumerating the decomposable neighbors of a decomposable graph under a simple perturbation scheme. Comput. Stat. Data Anal. 53, 1232–1238 (2009)
Wang, H.: Bayesian graphical lasso models and efficient posterior computation. Bayesian Anal. 7(4), 867–886 (2012)
Wang, H., Li, Z.: Efficient Gaussian graphical model determination under G-Wishart prior distributions. Electron. J. Stat. 6, 168–198 (2012)
Wermuth, N.: Linear recursive equations, covariance selection, and path analysis. J. Am. Stat. Assoc. 75(372), 963–972 (1980)
Wu, X., Senechal, K., Neshat, M., Whang, Y., Sawyers, C.: The PTEN/MMAC1 tumor suppressor phosphatase functions as a negative regulator of the phosphoinositide 3-kinase/Akt pathway. PNAS 95(15), 587–591 (1998)
Acknowledgments
The first author was partially supported by NCI Grant P30 CA016672.
Appendix
Proof
(Proposition 1) After removing an edge \(e = \{a,b\}\), the two subsets \(C_{a}\) and \(C_{b}\) are created; we assume without loss of generality that \(a\) comes before \(b\) in the perfect numbering of the nodes. The running intersection property implies that \(C_{a}\), \(C_{b}\), or both can be subsets only of cliques directly connected to \(C^*\) in the junction tree, and thus only of cliques in \(N_a, N_b\) as defined by Eq. (4). We have the following exhaustive cases.
(a) If neither \(C_{a}\) nor \(C_{b}\) is a subset of any other clique, the junction tree can be updated by first replacing \(C^*\) with the nodes \(C_a\) and \(C_b\) and connecting \(C_{a}\) to \(C_{b}\). Then, the existing edges \(\{C, C^*\}\) for \(C \in N_a\) are replaced with edges \(\{C, C_a\}\), and the existing edges \(\{C, C^*\}\) for \(C \in N_b\) are replaced with edges \(\{C, C_b\}\). If there are other edges \(\{C, C^*\}\) with \(C \in N_{\,\overline{ab}}\), then these can be replaced arbitrarily with edges \(\{C, C_a\}\) or \(\{C, C_b\}\). It is then straightforward to verify that the running intersection property is retained and that the perfect sequence of the cliques obtained by applying Proposition 2.29 of Lauritzen (1996) coincides with that from Proposition 1.
(b) If \(C_{a} \subset C\), where \(C\) precedes \(C_a\) in the sequence, then in the previous procedure the edge \(\{C, C_a\}\) must be contracted and \(C_a\) must be replaced by \(C\). Thus, \(C\) will be connected to \(C_{b}\) and to all the cliques in \(N_a\), and \(C_{b}\) will be connected to all the cliques in \(N_b\). In this case, the perfect sequence of the cliques derived from Lauritzen (1996, Proposition 2.29) coincides with that from Proposition 1.
(c) If \(C_{a} \subset C'\) and \(C'\) follows \(C_a\) in the perfect sequence, the junction tree can be updated by connecting \(C_{a}\) to \(C_{b}\) and to all the other cliques in \(N_a\), and connecting \(C_{b}\) to all the cliques in \(N_b\). In this case, the perfect sequence of the cliques derived from Lauritzen (1996, Proposition 2.29) coincides with that from Proposition 1.
(d) If \(C_{a} \subset C\) and \(C_{b} \subset C'\), where \(C\) precedes \(C_a\) and \(C'\) follows \(C_a\) in the perfect sequence, the junction tree can be updated by connecting \(C\) to \(C_{b}\) and to all the other cliques in \(N_a\), and connecting \(C_{b}\) to all the other cliques in \(N_b\). In this case, the perfect sequence of the cliques derived from Lauritzen (1996, Proposition 2.29) coincides with that from Proposition 1.
(e) When \(C_{a}\) and \(C_{b}\) are subsets of cliques \(C'\) and \(C''\), respectively, both following \(C_a\) in the perfect sequence, the junction tree can be updated by connecting \(C_{a}\) to \(C_b\) and to all the other cliques in \(N_a\), and connecting \(C_b\) to all the other cliques in \(N_b\). Then the perfect sequence of the cliques derived from Lauritzen (1996, Proposition 2.29) coincides with that from Proposition 1. \(\square \)
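The junction tree update described in cases (a) to (e) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: it assumes \(C_a = C^* \setminus \{b\}\) and \(C_b = C^* \setminus \{a\}\), that \(N_a\) and \(N_b\) are the neighbours of \(C^*\) containing \(a\) and \(b\) respectively, and that \(\{a,b\}\) lies in exactly one clique. The names `delete_edge` and `running_intersection_ok` and the dictionary/edge-set representation are hypothetical, and the bookkeeping of the perfect sequence (which clique precedes which) is omitted.

```python
def running_intersection_ok(cliques, tree):
    """True if, for every graph node, the cliques containing it form a connected subtree."""
    for v in set().union(*cliques.values()):
        holders = {k for k, c in cliques.items() if v in c}
        start = next(iter(holders))
        seen, stack = {start}, [start]
        while stack:
            k = stack.pop()
            for e in tree:
                if k in e:
                    (other,) = e - {k}
                    if other in holders and other not in seen:
                        seen.add(other)
                        stack.append(other)
        if seen != holders:
            return False
    return True


def delete_edge(cliques, tree, star, a, b):
    """Update a junction tree after deleting graph edge {a, b}.

    cliques: dict mapping a clique id to a frozenset of graph nodes.
    tree:    set of frozenset({id1, id2}) junction tree edges.
    star:    id of the (assumed unique) clique C* containing both a and b.
    """
    c_star = cliques[star]
    if a not in c_star or b not in c_star:
        raise ValueError("C* must contain both endpoints")
    id_a, id_b = (star, a), (star, b)            # hypothetical ids for C_a, C_b
    new_cliques = {k: c for k, c in cliques.items() if k != star}
    new_cliques[id_a] = c_star - {b}             # assumed definition of C_a
    new_cliques[id_b] = c_star - {a}             # assumed definition of C_b
    new_tree = {e for e in tree if star not in e}
    new_tree.add(frozenset({id_a, id_b}))        # connect C_a to C_b
    for e in tree:
        if star in e:
            (n,) = e - {star}
            if a in cliques[n] and b in cliques[n]:
                raise ValueError("edge {a, b} lies in more than one clique")
            # N_b goes to C_b; N_a and the neighbours with neither a nor b go to C_a
            target = id_b if b in cliques[n] else id_a
            new_tree.add(frozenset({n, target}))
    # Contract C_a or C_b into a neighbouring clique that contains it, if any.
    for x in (id_a, id_b):
        hosts = [n for e in new_tree if x in e for n in e - {x}
                 if new_cliques[x] <= new_cliques[n]]
        if hosts:
            h = hosts[0]
            rewired = set()
            for e in new_tree:
                if x in e:
                    (n,) = e - {x}
                    if n != h:
                        rewired.add(frozenset({n, h}))
                else:
                    rewired.add(e)
            new_tree = rewired
            del new_cliques[x]
    return new_cliques, new_tree
```

For instance, with cliques `{1,2,3}` and `{1,3,4}` joined by one tree edge, deleting the graph edge `{2,3}` leaves the cliques `{1,2}` and `{1,3,4}`, since \(C_b = \{1,3\}\) is absorbed by its neighbour; `running_intersection_ok` can then be used to confirm that the result is still a junction tree.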
Proof
(Proposition 2) At the end of Algorithm 3, we have three possible cases:
Case | Are \(C_a, C_b\) in \(G'\)? | Edge in \(J\) | Edge in \(J'\) |
---|---|---|---|
(a) | Yes, yes | \(\{C_a, C_b\}\) | \(\{C_a, C^*\}\) and \(\{C^*, C_b\}\) |
(b) | No, yes or yes, no | \(\{C_a, C_b\}\) | \(\{C_a, C^*\}\) or \(\{C^*, C_b\}\) |
(c) | No, no | \(\{C_a, C_b\}\) | none |
In case (a), where \(C_a\) and \(C_b\) are both maximal cliques in \(G'\), the new junction tree \(J'\) has a new node \(C^*\) on the directed path between \(C_a\) and \(C_b\). Therefore, the perfect sequence of the cliques is given in Eq. (5). In case (b), where only one of \(C_a\) or \(C_b\) is a maximal clique in \(G'\), the sequence (5) is reduced by removing whichever of \(C_a\) or \(C_b\) is no longer maximal. Finally, in case (c), both \(C_a\) and \(C_b\) are removed.\(\square \)
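The three cases in the table can be told apart with a short set computation. The sketch below is illustrative only and rests on an assumption not stated in this appendix: that the new clique created by adding the edge \(\{a,b\}\) is \(C^* = (C_a \cap C_b) \cup \{a,b\}\), with \(C_a\) and \(C_b\) adjacent in \(J\). The function name `add_edge_case` is hypothetical.

```python
def add_edge_case(c_a, c_b, a, b):
    """Classify the local junction tree change after adding graph edge {a, b}.

    c_a, c_b: frozensets, adjacent cliques in J, with a in c_a only and b in c_b only.
    Returns (case, path), where path lists the cliques that replace the
    tree edge {C_a, C_b}: C* plus whichever of C_a, C_b stays maximal.
    """
    if a not in c_a or b not in c_b or a in c_b or b in c_a:
        raise ValueError("expected a in C_a only and b in C_b only")
    c_star = (c_a & c_b) | {a, b}          # assumed form of the new clique C*
    keep_a = not c_a <= c_star             # C_a stays maximal unless C* contains it
    keep_b = not c_b <= c_star
    if keep_a and keep_b:
        case = "a"                         # edges {C_a, C*} and {C*, C_b}
    elif keep_a or keep_b:
        case = "b"                         # only one of the two edges remains
    else:
        case = "c"                         # C* replaces both cliques
    path = ([c_a] if keep_a else []) + [c_star] + ([c_b] if keep_b else [])
    return case, path
```

With \(C_a = \{1,2,3\}\), \(C_b = \{2,3,4\}\), \(a = 1\) and \(b = 4\), both cliques are absorbed by \(C^* = \{1,2,3,4\}\), which is case (c).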
Proof
(Proposition 3) In the junction tree \(T\), all the cliques in \(\mathcal {C}_t\) and \(\mathcal {C}_{d}\) are separated from all the remaining cliques by \(C_b'\). The cliques in \(\mathcal {C}_L\) come first in the new perfect sequence, since they belong to a subtree of \(T\) that remains unchanged after the update. By definition, \(C_a'\) and \(C_b'\) can be the next two cliques in the sequence. The set of cliques in \(\mathcal {C}_t\) is then a subpath between \(C_b'\) and one of the leaves of \(T\) and must be in reverse order with respect to that in (6); see Fig. 2. Next, by the definition of descendants, the cliques in \(\mathcal {C}_d\) can immediately follow \(\mathcal {C}_t\) in the same ordering as that of (6). Clearly, the descendants \(\mathrm {de}(C_{t_i})\) of each clique \(C_{t_i} \in \mathcal {C}_t\) must be ordered as in (6), since the subtree that connects them remains unchanged after updating \(J\), whereas the relative order of the sets \(\mathrm {de}(C_{t_i})\) for different \(i\) is arbitrary, since they are separated by \(\mathcal {C}_t\). Finally, the cliques in \(\mathcal {C}_R\) also belong to a subtree of \(T\) that remains unchanged after the tree update, and they are separated, in \(T\), from the cliques in \(\mathcal {C}_t\) and \(\mathcal {C}_d\) by \(C_a'\).\(\square \)
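A reordered sequence such as the one constructed in this proof can be checked numerically against the running intersection property. The following sketch uses the standard definition of a perfect sequence (each clique's intersection with the union of its predecessors is contained in a single earlier clique); the function name `is_perfect_sequence` and the toy cliques are illustrative assumptions, not part of the paper.

```python
def is_perfect_sequence(seq):
    """Check the running intersection property for an ordered list of cliques.

    seq: list of sets (or frozensets) of graph nodes.
    For every j > 0, C_j intersected with the union of C_0..C_{j-1}
    must be a subset of some single earlier clique.
    """
    seen = set()
    for j, clique in enumerate(seq):
        separator = clique & seen
        if j > 0 and not any(separator <= earlier for earlier in seq[:j]):
            return False
        seen |= clique
    return True


# Toy example: the cliques of the path graph 1 - 2 - 3 - 4.
c1, c2, c3 = frozenset({1, 2}), frozenset({2, 3}), frozenset({3, 4})
forward = [c1, c2, c3]
reversed_path = [c3, c2, c1]   # a subpath listed in reverse order is still perfect
shuffled = [c1, c3, c2]        # {2, 3} is not contained in any single earlier clique
```

Here both `forward` and `reversed_path` pass the check while `shuffled` fails, which mirrors the role of reversing the subpath \(\mathcal {C}_t\) in the argument above.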
Cite this article
Stingo, F., Marchetti, G.M. Efficient local updates for undirected graphical models. Stat Comput 25, 159–171 (2015). https://doi.org/10.1007/s11222-014-9541-6
DOI: https://doi.org/10.1007/s11222-014-9541-6