EPIC: A Tool to Estimate the Proportions of Different Cell Types from Bulk Gene Expression Data

  • Julien Racle
  • David Gfeller
Part of the Methods in Molecular Biology book series (MIMB, volume 2120)


Gene expression profiling is now routinely performed on clinically relevant samples (e.g., tumor specimens). Such measurements are often obtained from bulk samples containing a mixture of cell types. Knowing the proportions of these cell types is crucial because they are key determinants of disease evolution and response to treatment. Moreover, heterogeneity in cell type proportions across samples is an important confounding factor in downstream analyses.

Many tools have been developed to estimate the proportion of the different cell types from bulk gene expression data. Here, we provide guidelines and examples on how to use these tools, with a special focus on our recent computational method EPIC (Estimating the Proportions of Immune and Cancer cells). EPIC includes RNA-seq-based gene expression reference profiles from immune cells and other nonmalignant cell types found in tumors. EPIC can additionally manage user-defined gene expression reference profiles. Some unique features of EPIC include the ability to account for an uncharacterized cell type, the introduction of a renormalization step to account for different mRNA content in each cell type, and the use of single-cell RNA-seq data to derive biologically relevant reference gene expression profiles. EPIC is available as a web application and as an R package.
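The two features highlighted above (an uncharacterized cell type and a renormalization by mRNA content) can be illustrated with a minimal sketch. This is not EPIC's implementation: it assumes a plain, unweighted least-squares fit solved by projected gradient descent, and the function names, the toy reference matrix, and the mRNA-per-cell values are all hypothetical and chosen only for illustration. The underlying assumptions are that the signature genes are not expressed by the uncharacterized cells, so the fitted proportions may sum to less than 1, and that the remainder is attributed to the uncharacterized cell type.

```python
def project_nonneg_sum_le_one(v):
    """Euclidean projection onto {p : p >= 0, sum(p) <= 1}."""
    clipped = [max(x, 0.0) for x in v]
    if sum(clipped) <= 1.0:
        return clipped
    # Constraint sum(p) <= 1 is active: project onto the simplex instead.
    ordered = sorted(v, reverse=True)
    cumulative, theta = 0.0, 0.0
    for k, value in enumerate(ordered, start=1):
        cumulative += value
        candidate = (cumulative - 1.0) / k
        if value - candidate > 0:
            theta = candidate
    return [max(x - theta, 0.0) for x in v]


def estimate_mrna_proportions(reference, bulk, n_iter=2000):
    """Fit bulk ~ reference @ p with p >= 0 and sum(p) <= 1.

    reference[g][c] is the expression of signature gene g in cell type c,
    bulk[g] is the expression of gene g in the bulk sample.
    """
    n_genes, n_types = len(reference), len(reference[0])
    # Step size from a simple upper bound on the gradient's Lipschitz constant.
    step = 1.0 / sum(x * x for row in reference for x in row)
    p = [0.0] * n_types
    for _ in range(n_iter):
        residual = [
            sum(reference[g][c] * p[c] for c in range(n_types)) - bulk[g]
            for g in range(n_genes)
        ]
        gradient = [
            sum(reference[g][c] * residual[g] for g in range(n_genes))
            for c in range(n_types)
        ]
        p = project_nonneg_sum_le_one(
            [p[c] - step * gradient[c] for c in range(n_types)]
        )
    return p


def renormalize_to_cell_fractions(mrna_props, mrna_per_cell, mrna_per_other_cell):
    """Convert mRNA proportions to cell fractions.

    The last entry of the result is the uncharacterized cell type.
    """
    other = max(1.0 - sum(mrna_props), 0.0)
    scaled = [p / r for p, r in zip(mrna_props, mrna_per_cell)]
    scaled.append(other / mrna_per_other_cell)
    total = sum(scaled)
    return [s / total for s in scaled]


# Toy data (hypothetical): 3 signature genes, 2 reference cell types.
reference = [[10.0, 0.0], [0.0, 8.0], [1.0, 1.0]]
bulk = [3.0, 1.6, 0.5]  # generated from mRNA proportions 0.3 and 0.2

mrna_props = estimate_mrna_proportions(reference, bulk)
cell_fractions = renormalize_to_cell_fractions(mrna_props, [0.5, 1.0], 1.0)
print("mRNA proportions:", [round(p, 3) for p in mrna_props])
print("cell fractions (last = uncharacterized):",
      [round(f, 3) for f in cell_fractions])
```

In this toy example the first cell type is assumed to contain half as much mRNA per cell as the others, so its cell fraction ends up larger than its mRNA proportion. This is the kind of correction the renormalization step is meant to provide.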

Key words

Gene expression analysis, Cell fraction predictions, Computational biology, RNA-seq deconvolution, Immunoinformatics, Tumor immune microenvironment



Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2020

Authors and Affiliations

  1. Department of Oncology UNIL CHUV, Ludwig Institute for Cancer Research, University of Lausanne, Lausanne, Switzerland
  2. Swiss Institute of Bioinformatics (SIB), Lausanne, Switzerland
