Pattern Identification in Time-Course Gene Expression Data with the CoGAPS Matrix Factorization

  • Elana J. Fertig
  • Genevieve Stein-O’Brien
  • Andrew Jaffe
  • Carlo Colantuoni
Part of the Methods in Molecular Biology book series (MIMB, volume 1101)


Patterns in time-course gene expression data can represent the biological processes that are active over the measured time period. However, the orthogonality constraint in standard pattern-finding algorithms, notably principal components analysis (PCA), confounds expression changes resulting from simultaneous, non-orthogonal biological processes. We have previously shown that Markov chain Monte Carlo nonnegative matrix factorization algorithms are particularly adept at distinguishing such concurrent patterns. One such matrix factorization is implemented in the software package CoGAPS. We describe the application of this software, along with several technical considerations, to the identification of age-related patterns in a public prefrontal cortex gene expression dataset.
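To make the idea of a nonnegative factorization of time-course data concrete, the following is a minimal sketch in Python. It is not the CoGAPS algorithm: in place of Markov chain Monte Carlo sampling it uses the simpler multiplicative-update scheme of Lee and Seung, which minimizes squared error. The synthetic data, the two temporal patterns, the gene weights, and all function names (`nmf`, `matmul`, `transpose`, `frobenius_error`) are assumptions made for illustration only.

```python
import math
import random


def matmul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    cols = list(zip(*Y))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in X]


def transpose(X):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*X)]


def nmf(D, k, n_iter=1000, seed=0, eps=1e-9):
    """Approximate D (genes x times) by A (genes x k) times P (k x times).

    All entries of A and P stay nonnegative. Lee-Seung multiplicative
    updates are used here; this is NOT the MCMC sampler used by CoGAPS.
    """
    rng = random.Random(seed)
    n, m = len(D), len(D[0])
    A = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    P = [[rng.random() + 0.1 for _ in range(m)] for _ in range(k)]
    for _ in range(n_iter):
        # P <- P * (A^T D) / (A^T A P), elementwise
        At = transpose(A)
        num = matmul(At, D)
        den = matmul(matmul(At, A), P)
        P = [[P[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(k)]
        # A <- A * (D P^T) / (A P P^T), elementwise
        Pt = transpose(P)
        num = matmul(D, Pt)
        den = matmul(A, matmul(P, Pt))
        A = [[A[i][j] * num[i][j] / (den[i][j] + eps) for j in range(k)]
             for i in range(n)]
    return A, P


def frobenius_error(D, A, P):
    """Frobenius norm of the residual D - A P."""
    R = matmul(A, P)
    return math.sqrt(sum((d - r) ** 2
                         for d_row, r_row in zip(D, R)
                         for d, r in zip(d_row, r_row)))


# Hypothetical data: two overlapping, non-orthogonal temporal patterns.
times = range(10)
rising = [t / 9 for t in times]                            # steady increase
transient = [math.exp(-((t - 4) ** 2) / 4) for t in times]  # mid-course peak
P_true = [rising, transient]

# Hypothetical gene weights: each gene mixes the two patterns differently.
A_true = [[1.0, 0.0], [0.8, 0.2], [0.5, 0.5],
          [0.2, 0.8], [0.0, 1.0], [0.6, 0.9]]
D = matmul(A_true, P_true)

A, P = nmf(D, k=2)
norm_D = math.sqrt(sum(v * v for row in D for v in row))
rel_err = frobenius_error(D, A, P) / norm_D
print("relative reconstruction error:", rel_err)
```

In this sketch the rows of `P` play the role of temporal patterns and the rows of `A` give each gene's weight on those patterns. Because neither factor is forced to be orthogonal, two processes that are active at the same time can in principle be recovered as separate rows of `P`, which is the property the abstract contrasts with PCA. A point estimate like this one carries no uncertainty information, unlike an MCMC-based factorization.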

Key words

Markov chain Monte Carlo; Gene expression; Nonnegative matrix factorization



Copyright information

© Springer Science+Business Media, LLC 2014

Authors and Affiliations

  • Elana J. Fertig (1)
  • Genevieve Stein-O’Brien (2, 3)
  • Andrew Jaffe (3)
  • Carlo Colantuoni (3)

  1. Oncology Biostatistics and Bioinformatics, Johns Hopkins University, Baltimore, USA
  2. Institute of Genetic Medicine, Human Genetics Graduate Program, Johns Hopkins University School of Medicine, Baltimore, USA
  3. Lieber Institute for Brain Development, Baltimore, USA
