Factor Analysis with Mixture Modeling to Evaluate Coherent Patterns in Microarray Data

  • Joao Daniel Nunes Duarte
  • Vinicius Diniz Mayrink
Conference paper
Part of the Springer Proceedings in Mathematics & Statistics book series (PROMS, volume 118)

Abstract

Computational advances over the last decades have made it feasible to fit complex models to large data sets. The development of simulation-based methods, such as Markov chain Monte Carlo (MCMC), has increased interest in the Bayesian framework as an alternative approach to factor models. Many studies have applied factor analysis to explore gene expression data, with results that often outperform traditional methods for estimating and identifying patterns and metagene groups related to the underlying biology. In this chapter, we present a sparse latent factor model (SLFM) that uses a mixture prior (sparsity prior) to evaluate the significance of each factor loading; when a loading is significant, the effect of the corresponding factor is detected through patterns displayed across the samples. The SLFM is applied to simulated and real microarray data. The real data sets contain gene expression measurements for different types of cancer, including breast, brain, ovarian, and lung tumors. The proposed model indicates how strong the observed expression pattern is, which allows the evidence for the presence or absence of gene activity to be measured. Finally, we compare the SLFM with two simpler gene detection methods available in the literature. The results suggest that the SLFM outperforms these traditional methods.
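
To illustrate how a mixture (sparsity) prior can quantify the evidence that a factor loading is nonzero, the following is a minimal sketch and not the authors' implementation. It assumes a one-factor model for a single gene, x_j = alpha * lambda_j + eps_j with eps_j ~ N(0, sigma2), and a prior on the loading alpha that mixes a point mass at zero (weight 1 - q) with a N(0, tau2) component (weight q). The function name and the parameters q, tau2, and sigma2 are hypothetical, and the factor scores lambda_j are treated as known, as they would be within one conditional step of an MCMC sampler.

```python
import math
import random


def inclusion_probability(x_row, factor_scores, q=0.1, tau2=1.0, sigma2=1.0):
    """Conditional posterior probability that one gene's loading is nonzero.

    Assumed model (illustrative only):
        x_j = alpha * lambda_j + eps_j,  eps_j ~ N(0, sigma2)
        alpha ~ (1 - q) * delta_0 + q * N(0, tau2)
    """
    # Sufficient statistics of the regression of x on the factor scores.
    s = sum(l * l for l in factor_scores)
    b = sum(l * x for l, x in zip(factor_scores, x_row)) / sigma2

    # Conditional variance of alpha under the normal (slab) component.
    v = 1.0 / (s / sigma2 + 1.0 / tau2)

    # Log Bayes factor of the slab against the point mass at zero,
    # obtained by integrating alpha out of the normal likelihood.
    log_bf = 0.5 * math.log(v / tau2) + 0.5 * b * b * v
    log_odds = math.log(q / (1.0 - q)) + log_bf

    # Numerically stable logistic transform of the posterior log odds.
    if log_odds >= 0:
        return 1.0 / (1.0 + math.exp(-log_odds))
    e = math.exp(log_odds)
    return e / (1.0 + e)


if __name__ == "__main__":
    rng = random.Random(0)
    scores = [rng.gauss(0, 1) for _ in range(20)]          # hypothetical factor scores
    active = [1.5 * l + rng.gauss(0, 1) for l in scores]   # gene following the pattern
    silent = [rng.gauss(0, 1) for _ in scores]             # gene with pure noise
    print("active gene:", inclusion_probability(active, scores))
    print("silent gene:", inclusion_probability(silent, scores))
```

A probability near 1 would be read as strong evidence that the gene follows the factor pattern across samples, and a value below the prior weight q as evidence against it. In a full sampler this step would alternate with updates of the loadings, the factor scores, and the variances.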

Keywords

Markov Chain Monte Carlo; Markov Chain Monte Carlo Method; Copy Number Alteration; Real Data Analysis; Full Conditional Distribution
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Joao Daniel Nunes Duarte (1)
  • Vinicius Diniz Mayrink (1)
  1. Departamento de Estatistica, ICEx, UFMG, Belo Horizonte, Brazil
