Molecular Genetics and Genomics, Volume 289, Issue 3, pp 427–438

Towards an improved apple reference transcriptome using RNA-seq

  • Yang Bai
  • Laura Dougherty
  • Kenong Xu
Original Paper


The reference genome of apple (Malus × domestica) has been available since 2010. Although it was a milestone in apple genomics, the reference genome is difficult to use as a reference in RNA-seq (RNA sequencing) analysis, a technology widely used in transcriptomic studies. A major limitation appears to be the low proportion of RNA-seq reads that can be mapped to the reference transcriptome. To improve the reference transcriptome, we obtained 14 strand-specific RNA-seq data sets, totaling 168.5 million reads, from fruit of Golden Delicious (GD, the source of the reference genome) at various growth and developmental stages. Using a combination of genome-guided and de novo assembly, we expanded the apple reference transcriptome to a collection of 71,178 genes or transcripts, comprising the 53,654 originally predicted genes (those with IDs prefixed with MDP) and 17,524 novel transcripts. Of these novel transcripts, 8,144 were identified from reads mapped directly to the reference genome, while the remaining 9,380 were extracted from de novo assemblies of reads that could not initially be mapped to the reference genome. When evaluated with reads from Golden Delicious and other genotypes used in this and other studies, the improved reference transcriptome allowed 62.5 ± 9.3 % to 82.3 ± 2.7 % of reads to be mapped, a marked increase over the 37.4 ± 7.7 % to 46.6 ± 7.1 % achieved with the original reference transcriptome. The improved reference transcriptome therefore represents a step towards a complete reference transcriptome in apple.
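The counts quoted above can be cross-checked with a few lines of arithmetic. The following is only an illustrative sketch, not part of the authors' pipeline: the variable names and the `share` helper are assumptions introduced here, and the only inputs are the figures stated in the abstract.

```python
# Figures taken directly from the abstract; all names are illustrative.
originally_predicted = 53_654   # genes with MDP-prefixed IDs
novel_genome_guided = 8_144     # novel transcripts from reads mapped to the genome
novel_de_novo = 9_380           # novel transcripts from de novo assemblies
total_reads_million = 168.5     # reads across all data sets, in millions
n_datasets = 14                 # strand-specific RNA-seq data sets


def share(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


novel_total = novel_genome_guided + novel_de_novo
improved_total = originally_predicted + novel_total

# Average sequencing depth per data set (millions of reads).
mean_reads_million = total_reads_million / n_datasets

print("novel transcripts:", novel_total)
print("improved transcriptome size:", improved_total)
print("novel share of improved transcriptome (%):", share(novel_total, improved_total))
print("mean reads per data set (millions):", round(mean_reads_million, 2))
```

The two sums reproduce the 17,524 novel transcripts and the 71,178-entry collection reported above; the percentage and the per-data-set average are derived values that the abstract does not state.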


Keywords: Malus × domestica; Transcriptome coverage; RNA sequencing; Transcript discovery

Supplementary material

438_2014_819_MOESM1_ESM.docx (23 kb)
Supplementary material 1 (DOCX 22 kb)



Copyright information

© Springer-Verlag Berlin Heidelberg 2014

Authors and Affiliations

  1. Department of Horticulture, Cornell University, New York State Agricultural Experiment Station, Geneva, USA
