Science China Life Sciences, Volume 56, Issue 2, pp 134–142

mRNA enrichment protocols determine the quantification characteristics of external RNA spike-in controls in RNA-Seq studies

  • Tao Qing
  • Ying Yu
  • TingTing Du
  • LeMing Shi (corresponding author)
Open Access
Research Paper Special Topic


RNA-Seq promises to be used in clinical settings as a gene-expression profiling tool; however, questions about its variability and biases remain and need to be addressed. RNA controls with known concentrations and sequence identities, originally developed by the External RNA Control Consortium (ERCC) for microarray and qPCR platforms, have recently been proposed for RNA-Seq platforms, but have been evaluated with only a limited number of samples. In this study, we report our analysis of RNA-Seq data from 92 ERCC controls spiked into a diverse collection of 447 RNA samples from eight ongoing studies involving five species (human, rat, mouse, chicken, and Schistosoma japonicum) and two mRNA enrichment protocols, i.e., poly(A) and RiboZero. The entire collection of datasets consisted of 15,650,143,175 short sequence reads, 131,603,796 (i.e., 0.84%) of which were mapped to the 92 ERCC references. The overall ERCC mapping ratio of 0.84% is close to the expected value of 1.0% when assuming a 2.0% mRNA fraction in total RNA, but the ratio differed by 2.8-fold across studies and by 4.3-fold among samples from the same study with one tissue type. This level of fluctuation may prevent the ERCC controls from being used for cross-sample normalization in RNA-Seq. Furthermore, we observed striking transcript-specific quantification biases between poly(A) and RiboZero. For example, ERCC-00116 showed a 7.3-fold under-enrichment in poly(A) compared to RiboZero. Extra care is needed in the integrative analysis of multiple datasets, and technical artifacts of protocol differences should not be taken as true biological findings.
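
The overall mapping ratio quoted above can be rechecked from the two read counts given in the abstract. The following is a minimal sketch, not the authors' pipeline: the function names are hypothetical, and the fold difference is assumed here to be the larger value divided by the smaller one, since the abstract does not state how its fold values were computed.

```python
# Read counts as reported in the abstract.
TOTAL_READS = 15_650_143_175   # all short sequence reads in the collection
ERCC_READS = 131_603_796       # reads mapped to the 92 ERCC references
EXPECTED_RATIO_PCT = 1.0       # expected ERCC share quoted in the abstract


def mapping_ratio_pct(ercc_reads: int, total_reads: int) -> float:
    """Return the percentage of reads that mapped to the ERCC references."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return 100.0 * ercc_reads / total_reads


def fold_difference(a: float, b: float) -> float:
    """Return a symmetric fold difference (>= 1) between two positive values.

    Assumption: fold is defined as max/min; the abstract does not specify this.
    """
    if a <= 0 or b <= 0:
        raise ValueError("both values must be positive")
    return max(a, b) / min(a, b)


observed = mapping_ratio_pct(ERCC_READS, TOTAL_READS)
fold_vs_expected = fold_difference(observed, EXPECTED_RATIO_PCT)

print(f"Observed ERCC mapping ratio: {observed:.2f}%")
print(f"Fold difference from expected ratio: {fold_vs_expected:.2f}")
```

The same two helpers could be applied per sample or per study to reproduce spreads such as the 2.8-fold and 4.3-fold ranges, provided the per-sample read counts are available; those counts are not given in the abstract.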


RNA-Seq; External RNA Control Consortium (ERCC); MAQC/SEQC; mRNA enrichment protocol; quality control; reproducibility; quantification bias; poly(A) versus RiboZero



Copyright information

© The Author(s) 2013

Authors and Affiliations

  1. Center for Pharmacogenomics, School of Pharmacy, Fudan University, Shanghai, China
