Inferring Two-Level Hierarchical Gaussian Graphical Models to Discover Shared and Context-Specific Conditional Dependencies from High-Dimensional Heterogeneous Data

  • Original Research
  • Published in: SN Computer Science

Abstract

Gaussian graphical models (GGMs) express conditional dependencies among the variables of Gaussian-distributed high-dimensional data. However, real-life datasets exhibit heterogeneity, which is better captured by mixtures of GGMs, where each component captures its own conditional dependencies (context-specific dependencies) alongside dependencies common to all components (shared dependencies). Existing methods to discover shared and context-specific graphical structures include the joint and grouped graphical Lasso, and the EM algorithm with various penalized likelihood scoring functions. However, these methods detect graphical structures with high false discovery rates and do not detect the two types of dependencies (context-specific and shared) together. In this paper, we develop a method to discover shared conditional dependencies along with context-specific graphical models via a two-level hierarchical Gaussian graphical model. We assume that the graphical models corresponding to shared and context-specific dependencies are decomposable, which leads to an efficient greedy algorithm that selects edges by minimizing a score based on minimum message length (MML). The MML-based score yields a lower false discovery rate and hence more effective structure discovery. We present extensive empirical results on synthetic and real-life datasets and show that our method predicts context-specific dependencies among random variables more accurately than previous works. Hence, our method can be considered state of the art for discovering both shared and context-specific conditional dependencies from high-dimensional heterogeneous Gaussian data.

Notes

  1. A clique is a subset of vertices of an undirected graph such that every two distinct vertices in the clique are adjacent [42]. A maximal clique is a clique that cannot be extended by including one more adjacent vertex, that is, a clique which does not exist exclusively within the vertex set of a larger clique [42].
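
The clique and maximal-clique definitions above can be checked mechanically; a minimal sketch (the `is_clique`/`is_maximal_clique` helpers and the toy graph are ours, not from the paper):

```python
from itertools import combinations

def is_clique(edges, vertices):
    """True iff every two distinct vertices in `vertices` are adjacent."""
    adj = {frozenset(e) for e in edges}
    return all(frozenset(pair) in adj for pair in combinations(vertices, 2))

def is_maximal_clique(edges, vertices, all_vertices):
    """A maximal clique cannot be extended by any further vertex."""
    if not is_clique(edges, vertices):
        return False
    rest = set(all_vertices) - set(vertices)
    return not any(is_clique(edges, list(vertices) + [v]) for v in rest)

# Triangle {1,2,3} plus a pendant edge (3,4):
edges = [(1, 2), (2, 3), (1, 3), (3, 4)]
print(is_clique(edges, [1, 2, 3]))                     # True
print(is_maximal_clique(edges, [1, 2], [1, 2, 3, 4]))  # False: extends to {1,2,3}
print(is_maximal_clique(edges, [1, 2, 3], [1, 2, 3, 4]))  # True
```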

  2. In graph theory, the term “null graph” refers to a graph without any edges, aka the “empty graph” [42].

  3. In the real world, heterogeneous GGM data exhibit a relatively small number of components compared with the number of datapoints, i.e., \(K \ll n\). Therefore, \(-\log{K!}\) does not significantly affect the total number of bits required to encode the clustering and its contents.
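
As an illustration of the claim that the \(\log{K!}\) term is negligible when \(K \ll n\) (the numbers below are our own toy values, not from the paper):

```python
import math

def log2_factorial(k):
    """log2(k!) via the log-gamma function, avoiding overflow for large k."""
    return math.lgamma(k + 1) / math.log(2)

# A handful of components versus thousands of datapoints (K << n):
K, n = 5, 10_000
bits_for_k = log2_factorial(K)     # log2(5!) = log2(120), about 6.9 bits
bits_per_label = math.log2(K)      # cost of one component label, ~2.3 bits
# The K! term is a vanishing fraction of the total clustering message length.
print(bits_for_k, n * bits_per_label)
```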

  4. The graphical structure at the top level contains the shared edges; the context-specific graphical structures are placed at the lower level. This is why the model is called a two-level hierarchical Gaussian graphical model.

  5. \(\mathrm{{FPR}} = \frac{{\text {FP}}}{{\text {TP}}+{\text {FP}}}\), where TP is the number of predicted edges present in the gold standard and FP is the number of predicted edges not present in the gold standard.

  6. \(\mathrm{{FNR}} = \frac{{\text {FN}}}{{\text {TN}}+{\text {FN}}}\), where TN is the number of predicted conditional independences present in the gold standard and FN is the number of predicted conditional independences not present in the gold standard.

  7. error = FNR+FPR.
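
Under the definitions in notes 5–7, the scores can be computed directly from predicted and gold-standard edge sets; a minimal sketch (the `edge_error` helper and the toy graphs are ours, not from the paper):

```python
from itertools import combinations

def edge_error(predicted, gold, vertices):
    """FPR, FNR and their sum as defined in notes 5-7 (edges as frozensets)."""
    all_pairs = {frozenset(p) for p in combinations(vertices, 2)}
    tp = len(predicted & gold)       # predicted edges present in gold standard
    fp = len(predicted - gold)       # predicted edges not in gold standard
    # Predicted conditional independences are the non-edges of the prediction.
    pred_absent = all_pairs - predicted
    tn = len(pred_absent - gold)     # predicted non-edges also absent in gold
    fn = len(pred_absent & gold)     # predicted non-edges present in gold
    fpr = fp / (tp + fp) if tp + fp else 0.0
    fnr = fn / (tn + fn) if tn + fn else 0.0
    return fpr, fnr, fpr + fnr

gold = {frozenset(e) for e in [(1, 2), (2, 3), (3, 4)]}
pred = {frozenset(e) for e in [(1, 2), (2, 3), (1, 4)]}
print(edge_error(pred, gold, [1, 2, 3, 4]))  # fpr=1/3, fnr=1/3, error=2/3
```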

  8. Except for PaGIAM–tGDM and PaGIAM–sContchordalysis-MML, the baselines used in the synthetic data experiments do not perform well. For this reason, we do not use those baselines for the real-world data.

References

  1. Akaike H. Information theory and an extension of the maximum likelihood principle. In: Second international symposium on information theory; 1973. p. 267–281.

  2. Allison L. Encoding General Graphs. 2017. http://www.allisons.org/ll/MML/Structured/Graph/. Accessed 1 Apr 2020.

  3. Armstrong H, et al. Bayesian covariance matrix estimation using a mixture of decomposable graphical models. Stat Comput. 2009;19:303–16.

  4. Barabási AL, Albert R. Statistical mechanics of complex networks. Rev Mod Phys. 2002;74(1):47–97.

  5. Breheny P, Huang J. Penalized methods for bi-level variable selection. Stat Interface. 2009;2(3):369–80.

  6. Brennan C, et al. The somatic genomic landscape of glioblastoma. Cell. 2013;155(2):462–77.

  7. Clauset A, et al. Power-law distributions in empirical data. SIAM Rev. 2007;51:661–703.

  8. Danaher P, et al. The Joint Graphical Lasso for inverse covariance estimation across multiple classes. J R Stat Soc. 2014;76(2):373–97.

  9. Dempster A, et al. Maximum likelihood from incomplete data via the EM algorithm. J R Stat Soc. 1977;39(1):1–39.

  10. Deshpande A, et al. Efficient stepwise selection in decomposable models. In: Proceedings of the seventeenth conference on uncertainty in artificial intelligence; 2001. p. 128–135.

  11. Dowe D, et al. MML estimation of the parameters of the spherical Fisher distribution. Algorithmic Learn Theory. 1996;1160:213–27.

  12. Dwyer P. Some applications of matrix derivatives in multivariate analysis. J Am Stat Assoc. 1967;62:607–25.

  13. Friedman J, et al. Sparse inverse covariance estimation with the graphical lasso. Biostatistics. 2008;9:432–41.

  14. Friedman N. The Bayesian structural EM algorithm. In: Proceedings of the Fourteenth conference on Uncertainty in artificial intelligence (UAI); 1998. p. 129–138.

  15. Gao C, et al. Estimation of multiple networks in Gaussian mixture models. Electron J Stat. 2016;10:1133–54.

  16. Giraud C. Introduction to high-dimensional statistics. Boca Raton: Chapman and Hall/CRCs; 2014.

  17. Gauvain JL, Lee CH. Maximum a posteriori estimation for multivariate Gaussian mixture observations of Markov chains. IEEE Trans Speech Audio Process. 1994;2(2):291–8.

  18. Guo J, et al. Joint estimation of multiple graphical models. Biometrika. 2011;98(1):1–15.

  19. Hao B, et al. Simultaneous clustering and estimation of heterogeneous graphical model. J Mach Learn Res. 2018;18(217):1–58.

  20. Kumar M, Koller D. Learning a small mixture of trees. In: Advances in neural information processing systems; 2009. p. 1051–1059.

  21. Lauritzen S. Graphical models. Oxford: Clarendon Press; 1996.

  22. Li Z, et al. Bayesian Joint Spike-and-Slab Graphical Lasso. In: Proceedings of the 36th international conference on machine learning, vol. 97; 2019. p. 3877–3885.

  23. Ma J, Michailidis G. Joint structural estimation of multiple graphical models. J Mach Learn Res. 2016;17:1–48.

  24. Maretic H, Frossard P. Graph Laplacian mixture model. arXiv:1810.10053. 2018.

  25. McLendon R, et al. Comprehensive genomic characterization defines human glioblastoma genes and core pathways. Nature. 2008;455(7216):1061–8.

  26. Meilă M, Jordan MI. Learning with mixtures of trees. J Mach Learn Res. 2000;1:1–48.

  27. Mirzaa G, et al. De novo CCND2 mutations leading to stabilization of cyclin D2 cause megalencephaly–polymicrogyria–polydactyly–hydrocephalus syndrome. Nat Genet. 2014;46(5):510–4.

  28. Mukherjee C, Rodriguez A. GPU-powered shotgun stochastic search for Dirichlet process mixtures of Gaussian graphical models. J Comput Graph Stat. 2016;25(3):762–88.

  29. Narita Y, et al. Mutant epidermal growth factor receptor signalling down-regulates p27 through activation of the phosphatidylinositol 3-kinase/AKT pathway in glioblastomas. Cancer Res. 2002;62(22):6764–9.

  30. Oliver J, et al. Unsupervised learning using MML. In: Proceedings of the 13th international conference machine learning; 1996. p. 364–372.

  31. Peterson C, et al. Bayesian inference of multiple gaussian graphical models. J Am Stat Assoc. 2015;110(509):159–74.

  32. Petitjean F, Webb G. Scaling log-linear analysis to datasets with thousands of variables. In: SIAM international conference on data mining; 2015. p. 469–477.

  33. Petitjean F, et al. A statistically efficient and scalable method for log-linear analysis of high-dimensional data. In: Proceedings of IEEE international conference on data mining (ICDM); 2014. p. 110–119.

  34. Pittman J, et al. Integrated modeling of clinical and gene expression information for personalized prediction of disease outcomes. Proc Natl Acad Sci USA. 2004;101:8431–6.

  35. Pujana MA, et al. Network modeling links breast cancer susceptibility and centrosome dysfunction. Nat Genet. 2007;39:1338–49.

  36. Rahman M, Haffari G. A statistically efficient and scalable method for exploratory analysis of high-dimensional data. SN Comput Sci. 2020;1(2):1–17.

  37. Rodriguez A, et al. Sparse covariance estimation in heterogeneous samples. Electron J Stat. 2011;5:981–1014.

  38. Schwarz G. Estimating the dimension of a model. Ann Stat. 1978;6:461–4.

  39. Verhaak R, et al. Integrated genomic analysis identifies clinically relevant subtypes of glioblastoma characterized by abnormalities in PDGFRA, IDH1, EGFR and NF1. Cancer Cell. 2010;17(1):98–110.

  40. Wallace C, Boulton D. An information measure for classification. Comput J. 1968;11:185–94.

  41. Wallace C, Dowe D. MML clustering of multi-state, Poisson, von Mises circular and Gaussian distributions. J Stat Comput. 2000;10:173–83.

  42. West DB. Introduction to graph theory. London: Pearson; 2001.

Acknowledgements

We are thankful to Monash University for the financial support towards this research. We are also thankful to Dr. Francois Petitjean for his valuable advice on the development of the two-level HGGM.

Funding

This study was not funded by any external funding source.

Author information

Corresponding author

Correspondence to Mohammad S. Rahman.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Rahman, M.S., Nicholson, A.E. & Haffari, G. Inferring Two-Level Hierarchical Gaussian Graphical Models to Discover Shared and Context-Specific Conditional Dependencies from High-Dimensional Heterogeneous Data. SN COMPUT. SCI. 1, 218 (2020). https://doi.org/10.1007/s42979-020-00224-w
