Interdisciplinary Bayesian Statistics pp 185-195
Factor Analysis with Mixture Modeling to Evaluate Coherent Patterns in Microarray Data
Abstract
Computational advances over recent decades have made it possible to fit complex models to large data sets. The development of simulation-based methods, such as Markov chain Monte Carlo (MCMC), has increased interest in the Bayesian framework as an alternative for working with factor models. Many studies have applied factor analysis to explore gene expression data, with results that often outperform traditional methods for estimating and identifying patterns and metagene groups related to the underlying biology. In this chapter, we present a sparse latent factor model (SLFM) that uses a mixture prior (sparsity prior) to evaluate the significance of each factor loading; when a loading is significant, the effect of the corresponding factor is detected through patterns displayed across the samples. The SLFM is applied to simulated and real microarray data. The real data sets contain gene expression measurements for different types of cancer, including breast, brain, ovarian, and lung tumors. The proposed model indicates how strong the observed expression pattern is, which allows the evidence for the presence or absence of gene activity to be measured. Finally, we compare the SLFM with two simpler gene detection methods available in the literature. The results suggest that the SLFM outperforms these traditional methods.
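To make the role of the mixture prior concrete, the following is a minimal sketch and not the chapter's implementation. It assumes a single factor, treats the factor scores and both variances as known, and places a spike-and-slab prior on each loading: a point mass at zero with probability 1 - rho and a N(0, tau2) slab with probability rho. Under those assumptions the posterior probability that a loading is nonzero has a closed form, so no MCMC is needed here. The function name `inclusion_probability` and the parameters `sigma2`, `tau2`, and `rho` are hypothetical names introduced for this illustration.

```python
import numpy as np


def inclusion_probability(X, lam, sigma2=1.0, tau2=1.0, rho=0.5):
    """Posterior probability that each gene's loading is nonzero.

    Assumed toy model (not taken from the chapter):
        X[i, :] = alpha[i] * lam + noise,   noise ~ N(0, sigma2)
        alpha[i] ~ (1 - rho) * delta_0 + rho * N(0, tau2)
    with the factor scores `lam` treated as known.
    """
    X = np.asarray(X, dtype=float)
    lam = np.asarray(lam, dtype=float)
    s = lam @ lam                          # sum of squared factor scores
    b = X @ lam                            # per-gene projection onto the factor
    v = 1.0 / (s / sigma2 + 1.0 / tau2)    # slab posterior variance of a loading
    m = v * b / sigma2                     # slab posterior mean of a loading
    # Log Bayes factor of the slab component against the spike at zero.
    log_bf = 0.5 * np.log(v / tau2) + m ** 2 / (2.0 * v)
    logit = np.log(rho / (1.0 - rho)) + log_bf
    return 1.0 / (1.0 + np.exp(-logit))


# Simulated example: 20 genes, 50 samples, only the first 10 genes load
# on the factor.
rng = np.random.default_rng(0)
n_samples, n_genes = 50, 20
lam = rng.normal(size=n_samples)
alpha = np.concatenate([np.full(10, 2.0), np.zeros(10)])
X = np.outer(alpha, lam) + rng.normal(size=(n_genes, n_samples))

probs = inclusion_probability(X, lam)
print(np.round(probs, 3))
```

Genes with a true nonzero loading should receive probabilities close to 1, while the others should mostly stay low. In a full analysis the factor scores, variances, and mixture weight would themselves be unknown and sampled, which is where MCMC comes in.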
Keywords
Markov Chain Monte Carlo; Markov Chain Monte Carlo Method; Copy Number Alteration; Real Data Analysis; Full Conditional Distribution