Inference of Allele-Specific Expression from RNA-seq Data

  • Paul K. Korir
  • Cathal Seoighe
Part of the Methods in Molecular Biology book series (MIMB, volume 1112)


The differential abundance of transcripts from alternative alleles of a gene, for example in a hybrid plant or an outbred natural population, can provide information about the nature of interindividual or interstrain variation in gene expression. Allele-specific expression (ASE) can result from epigenetic phenomena, such as imprinting (when the overexpressed allele is inherited consistently from one parent) or allele-specific chromatin modifications. Alternatively, DNA sequence variants in the promoter or within the transcribed region of a gene can affect the rate of transcription or the rate of decay of the transcript, respectively. The existence of this allelic variation and the insights it provides into the nature of gene regulation are of significant interest. With the recent widespread availability of sequencing-based transcriptomics, the power to detect ASE has increased; however, inference of ASE from transcriptome sequencing data is subject to several caveats and potential biases, and the results need to be interpreted with care.

Key words

Allele-specific expression; RNA-seq; ASE; High-throughput sequencing


Copyright information

© Springer Science+Business Media, New York 2014

Authors and Affiliations

  • Paul K. Korir (1)
  • Cathal Seoighe (1)
  1. School of Mathematics, Statistics and Applied Mathematics, National University of Ireland, Galway (NUI Galway), Ireland
