A General Strategy for Inter-sample Variability Assessment and Normalisation

Computational and Statistical Epigenomics

Part of the book series: Translational Bioinformatics (TRBIO, volume 7)

Abstract

The sources of inter-sample variability in omic studies are not only biological but often also technical. Assessment of the relative magnitude of biological and technical sources of variation is therefore of paramount importance, especially in the context of epigenome-wide association studies (EWAS) where biological signals are quantitative and may be of a relatively small magnitude. This chapter introduces the reader to a general strategy for determining the number and nature of the sources of variation in an omic data set. It further presents guidelines for inter-sample normalisation. Techniques and tools are illustrated throughout with examples from cancer epigenome and EWAS studies.

Acknowledgements

AET is supported by the Chinese Academy of Sciences and the Max Planck Society.

Copyright information

© 2015 Springer Science+Business Media Dordrecht

About this chapter

Cite this chapter

Yang, Z., Teschendorff, A.E. (2015). A General Strategy for Inter-sample Variability Assessment and Normalisation. In: Teschendorff, A. (eds) Computational and Statistical Epigenomics. Translational Bioinformatics, vol 7. Springer, Dordrecht. https://doi.org/10.1007/978-94-017-9927-0_3
