RNA-Seq and Expression Arrays: Selection Guidelines for Genome-Wide Expression Profiling

  • Jessica Minnier
  • Nathan D. Pennock
  • Qiuchen Guo
  • Pepper Schedin
  • Christina A. Harrington
Protocol
Part of the Methods in Molecular Biology book series (MIMB, volume 1783)

Abstract

The development of genome-wide gene expression profiling technologies over the past two decades has produced great opportunity for researchers to explore the transcriptome and to better understand biological systems and their perturbation. In this chapter we provide an overview of microarray and massively parallel sequencing technologies and their application to gene expression analysis. We discuss factors that impact expression data generation and analysis and that should be considered in the application of these technology platforms. We further present the results of a simple illustration study to highlight performance similarities and differences in expression profiling of protein-coding mRNAs with each platform. Based on technical and analytical differences between the two platforms, reports in the literature comparing arrays and RNA-Seq for gene expression, and our own example study and experience, we provide recommendations for platform selection for gene expression studies.

Key words

Massively parallel sequencing; Microarray; RNA-Seq; Expression array; Expression profiling; Differential expression

Notes

Acknowledgments

The authors thank Julja Burchard for enthusiastic and thoughtful discussions on the design and content of the chapter. We thank Dr. Robert Searles for manuscript review and expert advice on RNA-Seq methods. We thank Caitlin Harrington-Smith for creative assistance with figures and Amy Carlos and Kristina Vartanian for excellent technical assistance. This work was supported in part by the OHSU Knight Cancer Institute (NIH NCI Cancer Center Support Grant P30 CA069533-17) and NIH/NCI R01CA169175 (to P. Schedin).

Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  • Jessica Minnier (1)
  • Nathan D. Pennock (2)
  • Qiuchen Guo (2)
  • Pepper Schedin (3, 4)
  • Christina A. Harrington (5)

  1. School of Public Health, Oregon Health and Science University, Portland, USA
  2. Department of Cell, Developmental and Cancer Biology, Oregon Health and Science University, Portland, USA
  3. Department of Cell, Developmental and Cancer Biology, Knight Cancer Institute, Oregon Health and Science University, Portland, USA
  4. Young Women’s Breast Cancer Translational Program, University of Colorado Anschutz Medical Campus, Aurora, USA
  5. Integrated Genomics Laboratory, Department of Molecular and Medical Genetics, Oregon Health and Science University, Portland, USA