Abstract

mRNA-Seq analysis with next-generation sequencing (NGS) provides large-scale gene expression profiles. Analyzing expression profiles across such large-scale data makes it possible to discover novel genes with specific expression patterns. This chapter introduces methods for mRNA-Seq analysis with NGS and for large-scale expression analysis. Conventional statistical methods, such as hierarchical clustering (HCL), have become difficult to apply to large-scale expression data because most of them require long calculation times and substantial computer resources. The main focus here is therefore on large-scale expression analysis methods that run quickly and efficiently without large computer resources. Sequence analysis methods for mining single nucleotide polymorphisms (SNPs) with Genome-Seq are also presented.

Abbreviations

CA: Correspondence analysis

HCL: Hierarchical clustering

NGS: Next-generation sequencing

SNP: Single nucleotide polymorphism

Author information

Corresponding author

Correspondence to Kentaro Yano, Ph.D.

Copyright information

© 2015 Springer International Publishing Switzerland

About this chapter

Cite this chapter

Kobayashi, M., Ohyanagi, H., Yano, K. (2015). Expression Analysis and Genome Annotations with RNA Sequencing. In: Sablok, G., Kumar, S., Ueno, S., Kuo, J., Varotto, C. (eds) Advances in the Understanding of Biological Sciences Using Next Generation Sequencing (NGS) Approaches. Springer, Cham. https://doi.org/10.1007/978-3-319-17157-9_1
