
A Bayesian Framework for Estimating Cell Type Composition from DNA Methylation Without the Need for Methylation Reference

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 10229)

Abstract

Genome-wide measurements of DNA methylation levels from a target tissue across a population have become ubiquitous over the last few years, as methylation status is thought to hold great potential for better understanding the role of epigenetics. Different cell types are known to have different methylation profiles. Therefore, in the common scenario where methylation levels are collected from heterogeneous sources such as blood, the measured signal is a mixture of cell-type-specific signals, weighted by the cell type composition of each sample. Knowledge of the cell type proportions is important for statistical analysis, and it may provide novel biological insights and contribute to our understanding of disease biology. Since high-resolution cell counting is costly and often logistically impractical to obtain in large studies, targeted methods that are inexpensive and practical for estimating cell proportions are needed. Although a supervised approach has been shown to provide reasonable estimates of cell proportions, it relies on scarce reference methylation data from sorted cells, which are not available for most tissues and are not necessarily appropriate for every target population. Here, we introduce BayesCCE, a Bayesian semi-supervised method that leverages prior knowledge on the cell type composition distribution in the studied tissue. As we demonstrate, such prior information is substantially easier to obtain than appropriate reference methylation levels from sorted cells. Using real and simulated data, we show that our proposed method is able to construct a set of components, each corresponding to a single cell type, and together providing up to 50% improvement in correlation when compared with existing reference-free methods.
We further make a design suggestion for future data collection efforts by showing that results can be improved using cell count measurements for a small subset of individuals in the study sample or by incorporating external data of individuals with measured cell counts. Our approach provides a new opportunity to investigate cell compositions in genomic studies of tissues for which this was not previously possible.
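
The setting described in the abstract can be illustrated with a small simulation. This is only a sketch and not the BayesCCE algorithm: it assumes a linear mixture model and a Dirichlet distribution over cell type proportions, neither of which is specified in the abstract, and all function names and parameter values are hypothetical. It shows how bulk methylation arises from per-individual proportions, how a prior over proportions might be estimated from measured cell counts, and how estimated components could be scored against true proportions by per-cell-type correlation.

```python
import numpy as np


def simulate_mixture(n_samples, n_sites, alpha, noise_sd=0.01, seed=0):
    """Simulate bulk methylation as a noisy mixture of cell-type profiles.

    Assumption (not from the paper): proportions follow Dirichlet(alpha)
    and the bulk signal is a linear combination of cell-type profiles.
    """
    rng = np.random.default_rng(seed)
    alpha = np.asarray(alpha, dtype=float)
    k = alpha.size
    # One row of cell type proportions per individual; each row sums to 1.
    proportions = rng.dirichlet(alpha, size=n_samples)
    # Hypothetical cell-type-specific methylation levels in [0, 1].
    profiles = rng.uniform(0.0, 1.0, size=(k, n_sites))
    # Observed signal: proportions-weighted average of the profiles plus noise.
    bulk = proportions @ profiles
    bulk += rng.normal(0.0, noise_sd, size=bulk.shape)
    return np.clip(bulk, 0.0, 1.0), proportions, profiles


def fit_dirichlet_moments(proportions):
    """Method-of-moments estimate of Dirichlet parameters from proportions.

    Uses Var(x_i) = m_i * (1 - m_i) / (s + 1), where s is the sum of the
    Dirichlet parameters and m_i is the mean of component i.
    """
    mean = proportions.mean(axis=0)
    var = proportions.var(axis=0)
    precision = np.mean(mean * (1.0 - mean) / var) - 1.0
    return mean * precision


def per_cell_type_correlation(estimated, truth):
    """Pearson correlation between estimated and true proportions, per cell type."""
    return np.array([
        np.corrcoef(estimated[:, j], truth[:, j])[0, 1]
        for j in range(truth.shape[1])
    ])


if __name__ == "__main__":
    bulk, props, _ = simulate_mixture(500, 100, alpha=[10.0, 5.0, 2.0])
    print("bulk shape:", bulk.shape)
    print("estimated prior parameters:", fit_dirichlet_moments(props))
```

In this sketch, `fit_dirichlet_moments` stands in for the step of learning prior information from individuals with measured cell counts, and `per_cell_type_correlation` mirrors the kind of correlation-based comparison mentioned in the abstract.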

Acknowledgments

We would like to thank Lana Martin for feedback on the manuscript. This research was partially supported by the Edmond J. Safra Center for Bioinformatics at Tel Aviv University. E.H., E.R., L.S. and R.S. were supported in part by the Israel Science Foundation (Grant 1425/13), E.H., L.S. and R.S. by the United States Israel Binational Science Foundation grant 2012304. E.R. and L.S. were supported by Len Blavatnik and the Blavatnik Research Foundation. R.S. was supported by the Colton Family Foundation. E.E. was supported by National Science Foundation grants 1065276, 1302448, 1320589 and 1331176, and National Institutes of Health grants R01-GM083198, R01-ES021801, R01-MH101782, R01-ES022282 and U54EB020403.

Author information

Corresponding authors

Correspondence to Elior Rahmani or Eran Halperin.

Appendix

The supplementary materials can be found at: https://github.com/cozygene/BayesCCE/blob/master/BayesCCE-SI.pdf.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Rahmani, E., Schweiger, R., Shenhav, L., Eskin, E., Halperin, E. (2017). A Bayesian Framework for Estimating Cell Type Composition from DNA Methylation Without the Need for Methylation Reference. In: Sahinalp, S. (eds) Research in Computational Molecular Biology. RECOMB 2017. Lecture Notes in Computer Science, vol 10229. Springer, Cham. https://doi.org/10.1007/978-3-319-56970-3_13

  • DOI: https://doi.org/10.1007/978-3-319-56970-3_13

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-56969-7

  • Online ISBN: 978-3-319-56970-3

  • eBook Packages: Computer Science (R0)
