Abstract
The differential abundance of transcripts from alternative alleles of a gene, for example in a hybrid plant or an outbred natural population, can provide information about the nature of interindividual or interstrain variation in gene expression. Allele-specific expression (ASE) can result from epigenetic phenomena, such as imprinting (when the overexpressed allele is inherited consistently from one parent) or allele-specific chromatin modifications. Alternatively, DNA sequence variants in the promoter or within the transcribed region of a gene can affect the rate of transcription or the rate of decay of the transcript, respectively. The existence of this allelic variation and the insights it provides into the nature of gene regulation are of significant interest. With the recent widespread availability of sequencing-based transcriptomics, the power to detect ASE has increased; however, inference of ASE from transcriptome sequencing data is subject to several caveats and potential biases, and the results need to be interpreted with care.
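As a rough illustration of what inferring ASE from read counts can look like, the sketch below applies an exact two-sided binomial test to the reads covering a single heterozygous site. This is one commonly used approach and is not necessarily the procedure followed in this chapter. The function name `ase_binomial_p`, its arguments, and the default null proportion of 0.5 are assumptions made for the example; where reads carrying the reference allele map more readily, a null proportion other than 0.5 may be more appropriate.

```python
from math import exp, lgamma, log


def ase_binomial_p(ref_reads, alt_reads, p_null=0.5):
    """Exact two-sided binomial p-value for allelic imbalance at one site.

    ref_reads / alt_reads: read counts supporting each allele (hypothetical inputs).
    p_null: expected reference-allele fraction with no ASE (assumed 0.5 here;
            must lie strictly between 0 and 1).
    """
    n = ref_reads + alt_reads
    if n == 0:
        return 1.0  # no reads, so no evidence either way

    def pmf(k):
        # Binomial probability computed in log space to avoid overflow at high coverage
        log_coeff = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
        return exp(log_coeff + k * log(p_null) + (n - k) * log(1.0 - p_null))

    observed = pmf(ref_reads)
    # Sum the probabilities of all outcomes no more likely than the observed one;
    # the small relative tolerance guards against floating-point ties.
    total = sum(q for q in (pmf(k) for k in range(n + 1)) if q <= observed * (1 + 1e-7))
    return min(1.0, total)


if __name__ == "__main__":
    # Balanced counts give no evidence of imbalance; skewed counts give a small p-value.
    print(ase_binomial_p(5, 5))
    print(ase_binomial_p(10, 0))
```

A test like this treats each read as an independent draw and ignores overdispersion, mapping bias, and multiple testing across sites, which is part of why results need to be interpreted with care.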
© 2014 Springer Science+Business Media, New York
About this protocol
Cite this protocol
Korir, P.K., Seoighe, C. (2014). Inference of Allele-Specific Expression from RNA-seq Data. In: Spillane, C., McKeown, P. (eds) Plant Epigenetics and Epigenomics. Methods in Molecular Biology, vol 1112. Humana Press, Totowa, NJ. https://doi.org/10.1007/978-1-62703-773-0_4
DOI: https://doi.org/10.1007/978-1-62703-773-0_4
Publisher Name: Humana Press, Totowa, NJ
Print ISBN: 978-1-62703-772-3
Online ISBN: 978-1-62703-773-0
eBook Packages: Springer Protocols