
Models of Random Sparse Eigenmatrices and Bayesian Analysis of Multivariate Structure

  • Andrew Cron
  • Mike West
Conference paper
Part of the Abel Symposia book series (ABEL, volume 11)

Abstract

We discuss probabilistic models of random covariance structures defined by distributions over sparse eigenmatrices. The decomposition of orthogonal matrices in terms of Givens rotations defines a natural, interpretable framework for defining distributions on sparsity structure of random eigenmatrices. We explore theoretical aspects and implications for conditional independence structures arising in multivariate Gaussian models, and discuss connections with sparse PCA, factor analysis and Gaussian graphical models. Methodology includes model-based exploratory data analysis and Bayesian analysis via reversible jump Markov chain Monte Carlo. A simulation study examines the ability to identify sparse multivariate structures compared to the benchmark graphical modelling approach. Extensions to multivariate normal mixture models with additional measurement errors move into the framework of latent structure analysis of broad practical interest. We explore the implications and utility of the new models with summaries of a detailed applied study of a 20-dimensional breast cancer genomics data set.
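The Givens rotation construction mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `givens` and `eigenmatrix`, the rotation ordering, and the angle and eigenvalue values are all assumptions made for illustration, and the paper's own parametrization and priors are not given here. The sketch builds an orthogonal eigenmatrix as a product of plane rotations, treats an absent rotation as a zero angle, and shows how zero angles can carry through to zeros in the covariance and precision matrices.

```python
import numpy as np

def givens(p, i, j, theta):
    """Return the p x p Givens rotation acting on the (i, j) coordinate plane."""
    G = np.eye(p)
    c, s = np.cos(theta), np.sin(theta)
    G[i, i] = c
    G[j, j] = c
    G[i, j] = -s
    G[j, i] = s
    return G

def eigenmatrix(p, angles):
    """Multiply Givens rotations together; `angles` maps (i, j) -> theta.

    Index pairs missing from `angles` are treated as zero angles,
    so they contribute an identity factor (illustrative convention).
    """
    E = np.eye(p)
    for (i, j), theta in angles.items():
        E = E @ givens(p, i, j, theta)
    return E

p = 4
angles = {(0, 1): 0.7, (2, 3): -0.4}   # hypothetical nonzero angles
d = np.array([4.0, 2.0, 1.0, 0.5])     # hypothetical eigenvalues

E = eigenmatrix(p, angles)             # sparse orthogonal eigenmatrix
Sigma = E @ np.diag(d) @ E.T           # covariance matrix E D E'
Omega = np.linalg.inv(Sigma)           # precision matrix

print(np.round(E, 3))
print(np.round(Omega, 3))
```

In this example only the pairs (0, 1) and (2, 3) are rotated, so the eigenmatrix is block diagonal and the off-diagonal blocks of both the covariance and the precision matrix vanish. In a Gaussian model that corresponds to variables 0 and 1 being independent of variables 2 and 3.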

Keywords

Markov Chain Monte Carlo; Precision Matrix; Reversible Jump Markov Chain Monte Carlo; Gaussian Graphical Modelling; Markov Chain Monte Carlo Analysis
These keywords were added by machine and not by the authors.

Notes

Acknowledgements

This work was completed while the first author was a Ph.D. student in the Department of Statistical Science at Duke University. The research was partly supported by grants from the National Science Foundation [DMS-1106516] and the National Institutes of Health [1RC1-AI086032]. Any opinions, findings and conclusions or recommendations expressed in this work are those of the authors and do not necessarily reflect the views of the NSF or NIH.

Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  1. 84.51°, Cincinnati, USA
  2. Department of Statistical Science, Duke University, Durham, USA
