Nucleotide Composition Based Measurement Bias in High Throughput Gene Expression Studies

  • Conference paper
Man–Machine Interactions 4

Part of the book series: Advances in Intelligent Systems and Computing ((AISC,volume 391))

Abstract

High-throughput gene expression profiling methods suffer from various sources of measurement bias inherent to the experimental procedures used. Most commonly used data standardization methods, which are designed to reduce sample-to-sample variability of technical origin, do not account for probe- or transcript-specific effects. However, the efficiency of RNA isolation, cDNA synthesis and amplification depends on the percentage of GC nucleotides in the transcript sequences, and is therefore a strong source of bias in the analysis of gene expression data. This work analyzes how, and to what extent, the GC-content bias of oligonucleotide microarray probes affects the measurement data. We propose a mechanism that explains this phenomenon, describe the implications of GC-content bias for the detection of differentially expressed genes (DEGs), and propose a new data standardization method that counteracts the described effects by combining sample-specific background intensity estimation with LOESS regression.
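
The idea of removing a GC-dependent trend with LOESS regression can be illustrated with a minimal sketch. It is not the method proposed in the paper, whose details are not given in the abstract: the sample-specific background estimation step is omitted, and the function names (`gc_fraction`, `loess_fit`, `gc_detrend`), the tricube local-linear fit and the 50% span are all assumptions made for illustration. The sketch fits log-intensity as a smooth function of probe GC content within one sample and subtracts the fitted trend.

```python
import statistics


def gc_fraction(seq):
    """Fraction of G and C bases in a nucleotide sequence."""
    if not seq:
        raise ValueError("empty sequence")
    return sum(base in "GC" for base in seq.upper()) / len(seq)


def loess_fit(x, y, frac=0.5):
    """Local linear fit with tricube weights, evaluated at every x.

    A plain, unoptimized LOESS variant (no robustness iterations);
    `frac` is the share of points used in each local neighbourhood.
    """
    n = len(x)
    k = min(n, max(2, round(frac * n)))
    fitted = []
    for x0 in x:
        # Bandwidth: distance to the k-th nearest point (guard against 0).
        dists = sorted(abs(xi - x0) for xi in x)
        h = dists[k - 1] or 1e-12
        w = [(1 - min(abs(xi - x0) / h, 1.0) ** 3) ** 3 for xi in x]
        sw = sum(w)
        mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
        my = sum(wi * yi for wi, yi in zip(w, y)) / sw
        sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
        sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
        # Fall back to the weighted mean when all weighted x coincide.
        slope = sxy / sxx if sxx > 0 else 0.0
        fitted.append(my + slope * (x0 - mx))
    return fitted


def gc_detrend(probe_seqs, log_intensities, frac=0.5):
    """Subtract the GC-content trend from one sample's log-intensities.

    The sample median is added back so the overall scale is preserved.
    """
    gc = [gc_fraction(s) for s in probe_seqs]
    trend = loess_fit(gc, log_intensities, frac)
    centre = statistics.median(log_intensities)
    return [y - t + centre for y, t in zip(log_intensities, trend)]
```

Applied to one sample at a time, `gc_detrend` returns values in which any smooth dependence on probe GC content has been flattened. For real arrays with hundreds of thousands of probes, an optimized LOESS implementation would be preferable to this quadratic-time loop.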




Acknowledgments

This work was supported by the Polish National Centre for Research and Development grant number POIG.02.03.01-00-040/13. Calculations were carried out using the computer cluster Ziemowit (http://ziemowit.hpc.polsl.pl) funded by the Silesian BIO-FARMA project No. POIG.02.01.00-00-166/08 in the Computational Biology and Bioinformatics Laboratory of the Biotechnology Centre in the Silesian University of Technology.

Author information

Corresponding author

Correspondence to Roman Jaksik.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Jaksik, R., Bensz, W., Smieja, J. (2016). Nucleotide Composition Based Measurement Bias in High Throughput Gene Expression Studies. In: Gruca, A., Brachman, A., Kozielski, S., Czachórski, T. (eds) Man–Machine Interactions 4. Advances in Intelligent Systems and Computing, vol 391. Springer, Cham. https://doi.org/10.1007/978-3-319-23437-3_17

  • DOI: https://doi.org/10.1007/978-3-319-23437-3_17

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-23436-6

  • Online ISBN: 978-3-319-23437-3

  • eBook Packages: Engineering, Engineering (R0)
