An annotated transcriptome of highly inbred Thuja plicata (Cupressaceae) and its utility for gene discovery of terpenoid biosynthesis and conifer defense

  • Tal J. Shalev
  • Macaire M. S. Yuen
  • Andreas Gesell
  • Agnes Yuen
  • John H. Russell
  • Jörg Bohlmann
Original Article
Part of the following topical collections:
  1. Gene Expression


Western redcedar (Thuja plicata; Cupressaceae; WRC) is an ecologically and economically important conifer species of the Pacific Northwest. Regeneration of WRC forests is affected by ungulate browsing, which removes current growth and hampers development of young trees. Monoterpenes make WRC foliage less palatable and can deter browsing. Genomic resources are required to advance knowledge of terpene accumulation and breeding of WRC for herbivore resistance. Unlike most conifers, WRC readily selfs to produce genotypes of reduced heterozygosity. We used seedlings of eight different fifth-generation selfed lines for monoterpene analysis and transcriptome sequencing. Trinity, Velvet/Oases, TransABySS, and SOAPdenovoTrans were used to generate independent transcriptome assemblies for each line. Sequence redundancy was reduced using the EvidentialGene pipeline. The best assembly, as determined by metrics of completeness, contiguity, and accuracy, was used to produce a WRC reference gene set of 28,279 sequences, of which 77% were annotated with significant BLASTp hits and 89% with significant InterProScan hits. An orthology-based approach was used to annotate gene families. Manually curated annotation identified 33 putative full-length terpene synthases (TPS). A maximum likelihood phylogeny revealed that WRC TPS cluster apart from those of Pinaceae within the gymnosperm TPS-d clade. Use of selfed lines enabled the development and annotation of a reduced-redundancy gene set for a gymnosperm of the Cupressaceae family. This gene set serves as a foundation for future functional characterization of WRC TPS and other defense genes and as a resource for the annotation of protein coding sequences in the WRC genome.


Western redcedar, Thuja plicata, Cupressaceae, Selfed lines, Conifer genomics, Terpenes, Conifer defense



We thank Dr. Carol Ritland and Ms. Karen Reid for excellent project management support, Dr. Timothy J. Sexton for technical assistance, and the McGill University and Génome Québec Innovation Centre for sequencing services. The research was supported with funds from the Natural Sciences and Engineering Research Council of Canada (NSERC Discovery Grant) and funds to JB and JHR from Genome British Columbia, Genome Canada, and the British Columbia Ministry of Forests, Lands, Natural Resource Operations and Rural Development (MFLNRORD) for the CEDaR User Partnership Project (UPP-002, Genome BC) and the CEDaR Applied Genomics Partnership Project (184CED-GAPP, Genome Canada and Genome BC). TJS is supported by an NSERC Postgraduate Doctoral fellowship.

Data archiving statement

The sequence data supporting this work can be found at the NCBI BioProject Database under BioProject ID PRJNA399722. In addition, sequences of the gene lists described in this paper and their annotations are also available in Files S1–S5 and File S9.

Supplementary material

11295_2018_1248_Fig4_ESM.jpg (53 kb)
Figure S1

Representative four-month old WRC seedling used for RNA isolation and sequencing. (JPEG 52 kb)

11295_2018_1248_MOESM1_ESM.eps (621 kb)
High resolution image (EPS 620 kb)
11295_2018_1248_Fig5_ESM.jpg (209 kb)
Figure S2

Pipeline for the de novo assembly and redundancy reduction of the WRC gene set, carried out for each inbred S5 line separately. (JPEG 208 kb)

11295_2018_1248_MOESM2_ESM.eps (2.9 mb)
High resolution image (EPS 2924 kb)
11295_2018_1248_Fig6_ESM.jpg (938 kb)
Figure S3

Monoterpene profiles of foliar samples for 12 different monoterpenes across the eight S5 lines. (JPEG 937 kb)

11295_2018_1248_MOESM3_ESM.eps (3.9 mb)
High resolution image (EPS 4043 kb)
11295_2018_1248_Fig7_ESM.jpg (439 kb)
Figure S4

Results of the BUSCO gene set completeness assessment. The reduced-redundancy gene set for WRC S5 Line 4 was found to be the most complete, with the lowest number of missing orthologs. (JPEG 438 kb)

11295_2018_1248_MOESM4_ESM.eps (849 kb)
High resolution image (EPS 848 kb)
11295_2018_1248_MOESM5_ESM.docx (14 kb)
Table S1 Summary for transcriptome assemblies for WRC S5 lines. (DOCX 14 kb)
11295_2018_1248_MOESM6_ESM.docx (16 kb)
Table S2 Results of the Conditional Reciprocal Best BLAST (CRBB) analysis. (DOCX 15 kb)
11295_2018_1248_MOESM7_ESM.docx (14 kb)
Table S3 Results of the BLASTp analysis of transcriptome assemblies against the longest predicted proteins (n = 1000) in the P. glauca and A. thaliana reference gene sets. (DOCX 13 kb)
11295_2018_1248_MOESM8_ESM.txt (184 kb)
File S1 Sequences of 241 plant terpene synthase (TPS) used in construction of a maximum-likelihood phylogeny of plant TPS. (TXT 184 kb)
11295_2018_1248_MOESM9_ESM.txt (101 kb)
File S2 Sequences of 126 gymnosperm and a single P. patens TPS used in construction of a maximum-likelihood phylogeny of gymnosperm TPS. (TXT 101 kb)
11295_2018_1248_MOESM10_ESM.txt (12.6 mb)
File S3 Sequence data for the core WRC gene set. Gene set containing the 28,279 core, reduced-redundancy protein sequences for predicted ORFs as produced by the EvidentialGene pipeline. (TXT 12858 kb)
11295_2018_1248_MOESM11_ESM.txt (18.4 mb)
File S4 Sequence data for the alternate WRC gene set. Gene set containing 40,691 additional putative protein-coding sequences, which may be potential gene isoforms or paralogs. (TXT 18875 kb)
11295_2018_1248_MOESM12_ESM.xlsx (5.2 mb)
File S5 Summary of significant BLASTp and InterProScan hits for the main reduced-redundancy gene set of Line 4. BLAST columns are as described in the BLAST Command Line Applications User Manual. The pipeline for BLASTing and filtering hits is described in the Methods section. GO names are separated by Biological Process (P), Molecular Function (F) and Cellular Component (C). The InterPro ID column lists all InterPro domains found for the queried sequence. Top PFAM hit describes the hit with the highest score against the PFAM database for each sequence, using an e-value cut-off of 1e-5. (XLSX 5321 kb)
11295_2018_1248_MOESM13_ESM.xlsx (15 kb)
File S6 Statistical summary of orthogroup analysis for all sequences assigned to orthogroups. Of the 498,235 protein coding sequences from 16 different plant species submitted for orthogroup analysis, 391,179 were successfully assigned to 19,660 orthogroups. The majority of orthogroups (11,616) had an average of less than one gene per species; the largest orthogroup (3201 genes) had an average of 151–200 genes per species. A large number of orthogroups (5614) had members from only two species; however, a similarly large number (3835) had members from all 16 species. (XLSX 15 kb)
11295_2018_1248_MOESM14_ESM.xlsx (22 kb)
File S7 Statistical summary of orthogroup analysis results for each species. The species with the lowest proportion of genes assigned to orthogroups was P. patens, with only 57.5% of sequences assigned; the highest was P. glauca with 92.4%. Of our WRC gene set, 90.4% was successfully assigned to orthogroups; 0.1% of WRC sequences were in species-specific orthogroups. (XLSX 21 kb)
11295_2018_1248_MOESM15_ESM.xlsx (1.6 mb)
File S8 Summary of orthogroup composition and function. The number of orthogroup members from each species, together with the total number of genes in each orthogroup and the top five PFAM hits for each orthogroup. The largest orthogroup, with 3201 genes, consisted mainly of pentatricopeptide-repeat containing protein-coding genes, a large protein family in plants with little functional redundancy (Lurin et al. 2004). (XLSX 1665 kb)
11295_2018_1248_MOESM16_ESM.txt (27 kb)
File S9 Sequence data for 33 putative full-length TPS genes from the WRC gene set. Putative TPS were identified using BLASTp, InterProScan and orthogroup analysis, and after removal of partial ORFs and proteins less than 400 aa long were reduced to a set of 33 putative full-length TPS. (TXT 26 kb)
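
The length and completeness filter described for File S9 can be illustrated with a minimal sketch. The 400 aa cut-off comes from the description above; everything else is an assumption made for illustration. In particular, the function names, the example headers, and the heuristic that a "full-length" protein starts with Met and ends with a stop symbol (*) are hypothetical and are not taken from the authors' pipeline, which may have judged ORF completeness differently.

```python
from typing import Dict, Iterable, Iterator, Tuple

MIN_LENGTH_AA = 400  # length cut-off stated in the File S9 description


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def looks_full_length(seq: str) -> bool:
    """Hypothetical proxy for 'not a partial ORF': starts with Met, ends with a stop (*)."""
    return seq.startswith("M") and seq.endswith("*")


def filter_candidates(lines: Iterable[str], min_length: int = MIN_LENGTH_AA) -> Dict[str, str]:
    """Keep candidates that pass the completeness proxy and are at least min_length aa long."""
    kept = {}
    for header, seq in read_fasta(lines):
        if not looks_full_length(seq):
            continue
        protein = seq.rstrip("*")  # drop the stop symbol before measuring length
        if len(protein) >= min_length:
            kept[header] = protein
    return kept


# Made-up demonstration records (not real WRC sequences).
demo = [
    ">tps_a hypothetical", "M" + "A" * 449 + "*",  # 450 aa, complete: kept
    ">tps_b hypothetical", "M" + "A" * 99 + "*",   # 100 aa, complete: too short
    ">tps_c hypothetical", "A" * 500,              # long, but no start/stop: dropped
]
kept = filter_candidates(demo)
print(sorted(kept))
```

With real data, the lines of a protein FASTA file would be passed in place of the demo list; a file handle works because the parser only iterates over lines.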


  1. Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ (1990) Basic local alignment search tool. J Mol Biol 215:403–410
  2. Baillie R, Drayton M, Pembleton L, Kaur S, Culvenor R, Smith K, Spangenberg G, Forster J, Cogan N (2017) Generation and characterisation of a reference transcriptome for Phalaris (Phalaris aquatica L.). Agronomy 7:14
  3. Birol I, Raymond A, Jackman SD, Pleasance S, Coope R, Taylor GA, Yuen MMS, Keeling CI, Brand D, Vandervalk BP, Kirk H, Pandoh P, Moore RA, Zhao Y, Mungall AJ, Jaquish B, Yanchuk A, Ritland C, Boyle B, Bousquet J, Ritland K, MacKay J, Bohlmann J, Jones SJM (2013) Assembling the 20 Gb white spruce (Picea glauca) genome from whole-genome shotgun sequencing data. Bioinformatics 29:1492–1497
  4. Blande D, Halimaa P, Tervahauta AI, Aarts MGM, Kärenlampi SO (2017) De novo transcriptome assemblies of four accessions of the metal hyperaccumulator plant Noccaea caerulescens. Sci Data 4:1–9
  5. Bohlmann J, Keeling CI (2008) Terpenoid biomaterials. Plant J 54:656–669
  6. Bohlmann J, Meyer-Gauen G, Croteau R (1998) Plant terpenoid synthases: molecular biology and phylogenetic analysis. Proc Natl Acad Sci U S A 95:4126–4133
  7. Bolger AM, Lohse M, Usadel B (2014) Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics 30:2114–2120
  8. Camacho C, Coulouris G, Avagyan V, Ma N, Papadopoulos J, Bealer K, Madden TL (2009) BLAST+: architecture and applications. BMC Bioinformatics 10:421
  9. Chen F, Tholl D, Bohlmann J, Pichersky E (2011) The family of terpene synthases in plants: a mid-size family of genes for specialized metabolism that is highly diversified throughout the kingdom. Plant J 66:212–229
  10. De La Torre AR, Birol I, Bousquet J et al (2014) Insights into conifer giga-genomes. Plant Physiol 166:1724–1732
  11. Debell JD, Morrell JJ, Gartner BL (1999) Within-stem variation in tropolone content and decay resistance of second-growth Western redcedar. For Sci 45:101–107
  12. Duan J, Xia C, Zhao G, Jia J, Kong X (2012) Optimizing de novo common wheat transcriptome assembly using short-read RNA-Seq data. BMC Genomics 13:392
  13. Elbeltagy A, Nishioka K, Suzuki H, Sato T, Sato YI, Morisaki H, Mitsui H, Minamisawa K (2000) Isolation and characterization of endophytic bacteria from wild and traditionally cultivated rice varieties. Soil Sci Plant Nutr 46:617–629
  14. Emms DM, Kelly S (2015) OrthoFinder: solving fundamental biases in whole genome comparisons dramatically improves orthogroup inference accuracy. Genome Biol 16:157
  15. Foster AJ, Hall DE, Mortimer L, Abercromby S, Gries R, Gries G, Bohlmann J, Russell J, Mattsson J (2013) Identification of genes in Thuja plicata foliar terpenoid defenses. Plant Physiol 161:1993–2004
  16. Fu L, Niu B, Zhu Z, Wu S, Li W (2012) CD-HIT: accelerated for clustering the next-generation sequencing data. Bioinformatics 28:3150–3152
  17. Geddy R, Brown GG (2007) Genes encoding pentatricopeptide repeat (PPR) proteins are not conserved in location in plant genomes and may be subject to diversifying selection. BMC Genomics 8:130
  18. Gesell A, Blaukopf M, Madilao L, Yuen MMS, Withers SG, Mattsson J, Russell JH, Bohlmann J (2015) The gymnosperm cytochrome P450 CYP750B1 catalyzes stereospecific monoterpene hydroxylation of (+)-sabinene in thujone biosynthesis in western redcedar. Plant Physiol 168:94–106
  19. Gilbert D (2013) Gene-omes built from mRNA seq not genome DNA. In: 7th Annual Arthropod Genomics Symposium. Notre Dame
  20. Gonzalez JS (2004) Growth, properties and uses of western red cedar. Forintek Canada Corp Spec Publ No SP-37R 37
  21. Gordon SP, Tseng E, Salamov A, Zhang J, Meng X, Zhao Z, Kang D, Underwood J, Grigoriev IV, Figueroa M, Schilling JS, Chen F, Wang Z (2015) Widespread polycistronic transcripts in fungi revealed by single-molecule mRNA sequencing. PLoS One 10:e0132628
  22. Haas BJ, Papanicolaou A, Yassour M, Grabherr M, Blood PD, Bowden J, Couger MB, Eccles D, Li B, Lieber M, MacManes MD, Ott M, Orvis J, Pochet N, Strozzi F, Weeks N, Westerman R, William T, Dewey CN, Henschel R, LeDuc RD, Friedman N, Regev A (2013) De novo transcript sequence reconstruction from RNA-seq using the Trinity platform for reference generation and analysis. Nat Protoc 8:1494–1512
  23. Hall DE, Zerbe P, Jancsik S, Quesada AL, Dullat H, Madilao LL, Yuen M, Bohlmann J (2013) Evolution of conifer diterpene synthases: diterpene resin acid biosynthesis in lodgepole pine and jack pine involves monofunctional and bifunctional diterpene synthases. Plant Physiol 161:600–616
  24. Hebda RJ, Mathewes RW (1984) Holocene history of cedar and native Indian cultures of the North American Pacific Coast. Science 225:711–713
  25. Hu X-G, Liu H, Jin Y, Sun YQ, Li Y, Zhao W, el-Kassaby YA, Wang XR, Mao JF (2016) De novo transcriptome assembly and characterization for the widespread and stress-tolerant conifer Platycladus orientalis. PLoS One 11:e0148985
  26. Jones P, Binns D, Chang HY, Fraser M, Li W, McAnulla C, McWilliam H, Maslen J, Mitchell A, Nuka G, Pesseat S, Quinn AF, Sangrador-Vegas A, Scheremetjew M, Yong SY, Lopez R, Hunter S (2014) InterProScan 5: genome-scale protein function classification. Bioinformatics 30:1236–1240
  27. Katoh K, Standley DM (2013) MAFFT multiple sequence alignment software version 7: improvements in performance and usability. Mol Biol Evol 30:772–780
  28. Keeling CI, Bohlmann J (2006a) Diterpene resin acids in conifers. Phytochemistry 67:2415–2423
  29. Keeling CI, Bohlmann J (2006b) Genes, enzymes and chemicals of terpenoid diversity in the constitutive and induced defence of conifers against insects and pathogens. New Phytol 170:657–675
  30. Keeling CI, Weisshaar S, Ralph SG, Jancsik S, Hamberger B, Dullat HK, Bohlmann J (2011) Transcriptome mining, functional characterization, and phylogeny of a large terpene synthase gene family in spruce (Picea spp.). BMC Plant Biol 11:43
  31. Kotera E, Tasaka M, Shikanai T (2005) A pentatricopeptide repeat protein is essential for RNA editing in chloroplasts. Nature 433:326–330
  32. Lamesch P, Berardini TZ, Li D, Swarbreck D, Wilks C, Sasidharan R, Muller R, Dreher K, Alexander DL, Garcia-Hernandez M, Karthikeyan AS, Lee CH, Nelson WD, Ploetz L, Singh S, Wensel A, Huala E (2012) The Arabidopsis Information Resource (TAIR): improved gene annotation and new tools. Nucleic Acids Res 40:D1202–D1210
  33. Letunic I, Bork P (2016) Interactive tree of life (iTOL) v3: an online tool for the display and annotation of phylogenetic and other trees. Nucleic Acids Res
  34. Lurin C, Andrés C, Aubourg S, Bellaoui M, Bitton F, Bruyère C, Caboche M, Debast C, Gualberto J, Hoffmann B, Lecharny A, le Ret M, Martin-Magniette ML, Mireau H, Peeters N, Renou JP, Szurek B, Taconnat L, Small I (2004) Genome-wide analysis of Arabidopsis pentatricopeptide repeat proteins reveals their essential role in organelle biogenesis. Plant Cell 16:2089–2103
  35. Martin DM, Fäldt J, Bohlmann J (2004) Functional characterization of nine Norway spruce TPS genes and evolution of gymnosperm terpene synthases of the TPS-d subfamily. Plant Physiol 135:1908–1927
  36. Martin JA, Wang Z (2011) Next-generation transcriptome assembly. Nat Rev Genet 12:671–682
  37. Miller JR, Koren S, Sutton G (2010) Assembly algorithms for next-generation sequencing data. Genomics 95:315–327
  38. Morris PI, Stirling R (2012) Western red cedar extractives associated with durability in ground contact. Wood Sci Technol 46:991–1002
  39. Nakasugi K, Crowhurst R, Bally J, Waterhouse P (2014) Combining transcriptome assemblies from multiple de novo assemblers in the allo-tetraploid plant Nicotiana benthamiana. PLoS One 9:e91776
  40. Neale DB, Wegrzyn JL, Stevens KA, Zimin AV, Puiu D, Crepeau MW, Cardeno C, Koriabine M, Holtz-Morris AE, Liechty JD, Martínez-García PJ, Vasquez-Gross HA, Lin BY, Zieve JJ, Dougherty WM, Fuentes-Soriano S, Wu LS, Gilbert D, Marçais G, Roberts M, Holt C, Yandell M, Davis JM, Smith KE, Dean JFD, Lorenz W, Whetten RW, Sederoff R, Wheeler N, McGuire PE, Main D, Loopstra CA, Mockaitis K, deJong PJ, Yorke JA, Salzberg SL, Langley CH (2014) Decoding the massive genome of loblolly pine using haploid DNA and novel assembly strategies. Genome Biol 15:R59
  41. Nelson DR (2009) The cytochrome p450 homepage. Hum Genomics 4:59–65
  42. O’Connell LM, Ritland K (2005) Post-pollination mechanisms promoting outcrossing in a self-fertile conifer, Thuja plicata (Cupressaceae). Can J Bot 83:335–342
  43. Okamoto S, Yu F, Harada H, Okajima T, Hattan JI, Misawa N, Utsumi R (2011) A short-chain dehydrogenase involved in terpene metabolism from Zingiber zerumbet. FEBS J 278:2892–2900
  44. Orsini L, Gilbert D, Podicheti R, Jansen M, Brown JB, Solari OS, Spanier KI, Colbourne JK, Rush D, Decaestecker E, Asselman J, de Schamphelaere KAC, Ebert D, Haag CR, Kvist J, Laforsch C, Petrusek A, Beckerman AP, Little TJ, Chaturvedi A, Pfrender ME, de Meester L, Frilander MJ (2016) Daphnia magna transcriptome by RNA-Seq across 12 environmental stressors. Sci Data 3:160030
  45. Proost S, Van BM, Vaneechoutte D et al (2015) PLAZA 3.0: an access point for plant comparative genomics. Nucleic Acids Res 43:D974–D981
  46. Rigault P, Boyle B, Lepage P, Cooke JEK, Bousquet J, MacKay JJ (2011) A white spruce gene catalog for conifer genome analyses. Plant Physiol 157:14–28
  47. Ringer KL, Davis EM, Croteau R (2005) Monoterpene metabolism. Cloning, expression, and characterization of (-)-isopiperitenol/(-)-carveol dehydrogenase of peppermint and spearmint. Plant Physiol 137:863–872
  48. Robertson G, Schein J, Chiu R, Corbett R, Field M, Jackman SD, Mungall K, Lee S, Okada HM, Qian JQ, Griffith M, Raymond A, Thiessen N, Cezard T, Butterfield YS, Newsome R, Chan SK, She R, Varhol R, Kamoh B, Prabhu AL, Tam A, Zhao YJ, Moore RA, Hirst M, Marra MA, Jones SJM, Hoodless PA, Birol I (2010) De novo assembly and analysis of RNA-seq data. Nat Methods 7:909–912
  49. Russell JH, Burdon RD, Yanchuk AD (2003) Inbreeding depression and variance structures for height and adaptation in self- and outcross Thuja plicata families in varying environments. For Genet 10:171–184
  50. Russell JH, Ferguson DC (2008) Preliminary results from five generations of a western redcedar (Thuja plicata) selection study with self-mating. Tree Genet Genomes 4:509–518
  51. Russell JH, Yanchuk AD (2012) Breeding for growth improvement and resistance to multiple pests in Thuja plicata. Gen Tech Rep 240:40–44
  52. Schuler MA, Werck-Reichhart D (2003) Functional genomics of P450s. Annu Rev Plant Biol 54:629–667
  53. Schulz MH, Zerbino DR, Vingron M, Birney E (2012) Oases: robust de novo RNA-seq assembly across the dynamic range of expression levels. Bioinformatics 28:1086–1092
  54. Simão FA, Waterhouse RM, Ioannidis P, Kriventseva EV, Zdobnov EM (2015) BUSCO: assessing genome assembly and annotation completeness with single-copy orthologs. Bioinformatics 31:3210–3212
  55. Smith-Unna R, Boursnell C, Patro R, Hibberd JM, Kelly S (2016) TransRate: reference-free quality assessment of de novo transcriptome assemblies. Genome Res 26:1134–1144
  56. Stamatakis A (2014) RAxML version 8: a tool for phylogenetic analysis and post-analysis of large phylogenies. Bioinformatics 30:1312–1313
  57. Stevens KA, Wegrzyn JL, Zimin A, Puiu D, Crepeau M, Cardeno C, Paul R, Gonzalez-Ibeas D, Koriabine M, Holtz-Morris AE, Martínez-García PJ, Sezen UU, Marçais G, Jermstad K, McGuire PE, Loopstra CA, Davis JM, Eckert A, de Jong P, Yorke JA, Salzberg SL, Neale DB, Langley CH (2016) Sequence of the sugar pine megagenome. Genetics 204:1613–1626
  58. Tatusov RL, Koonin EV, Lipman DJ (1997) A genomic perspective on protein families. Science 278:631–637
  59. Velasco R, Zharkikh A, Affourtit J, Dhingra A, Cestaro A, Kalyanaraman A, Fontana P, Bhatnagar SK, Troggio M, Pruss D, Salvi S, Pindo M, Baldi P, Castelletti S, Cavaiuolo M, Coppola G, Costa F, Cova V, Dal Ri A, Goremykin V, Komjanc M, Longhi S, Magnago P, Malacarne G, Malnoy M, Micheletti D, Moretto M, Perazzolli M, Si-Ammour A, Vezzulli S, Zini E, Eldredge G, Fitzgerald LM, Gutin N, Lanchbury J, Macalma T, Mitchell JT, Reid J, Wardell B, Kodira C, Chen Z, Desany B, Niazi F, Palmer M, Koepke T, Jiwan D, Schaeffer S, Krishnan V, Wu C, Chu VT, King ST, Vick J, Tao Q, Mraz A, Stormo A, Stormo K, Bogden R, Ederle D, Stella A, Vecchietti A, Kater MM, Masiero S, Lasserre P, Lespinasse Y, Allan AC, Bus V, Chagné D, Crowhurst RN, Gleave AP, Lavezzo E, Fawcett JA, Proost S, Rouzé P, Sterck L, Toppo S, Lazzari B, Hellens RP, Durel CE, Gutin A, Bumgarner RE, Gardiner SE, Skolnick M, Egholm M, van de Peer Y, Salamini F, Viola R (2010) The genome of the domesticated apple (Malus × domestica Borkh.). Nat Genet 42:833–839
  60. Visser EA, Wegrzyn JL, Steenkamp ET, Myburg AA, Naidoo S (2015) Combined de novo and genome guided assembly and annotation of the Pinus patula juvenile shoot transcriptome. BMC Genomics 16:1057
  61. Vourc’h G, De Garine-Wichatitsky M, Labbé A et al (2002) Monoterpene effect on feeding choice by deer. J Chem Ecol 28:2411–2427
  62. Warren RL, Keeling CI, Saint YMM et al (2015) Improved white spruce (Picea glauca) genome assemblies and annotation of large gene families of conifer terpenoid and phenolic defense metabolism. Plant J 83:189–212
  63. Wegrzyn JL, Liechty JD, Stevens KA, Wu LS, Loopstra CA, Vasquez-Gross HA, Dougherty WM, Lin BY, Zieve JJ, Martinez-Garcia PJ, Holt C, Yandell M, Zimin AV, Yorke JA, Crepeau MW, Puiu D, Salzberg SL, de Jong PJ, Mockaitis K, Main D, Langley CH, Neale DB (2014) Unique features of the loblolly pine (Pinus taeda L.) megagenome revealed through sequence annotation. Genetics 196:891–909
  64. Xie Y, Wu G, Tang J, Luo R, Patterson J, Liu S, Huang W, He G, Gu S, Li S, Zhou X, Lam TW, Li Y, Xu X, Wong GKS, Wang J (2014) SOAPdenovo-Trans: de novo transcriptome assembly with short RNA-Seq reads. Bioinformatics 30:1660–1666
  65. Yagi Y, Tachikawa M, Noguchi H, Satoh S, Obokata J, Nakamura T (2013) Pentatricopeptide repeat proteins involved in plant organellar RNA editing. RNA Biol 10:1419–1425
  66. Zerbe P, Hamberger B, Yuen MMS, Chiang A, Sandhu HK, Madilao LL, Nguyen A, Hamberger B, Bach SS, Bohlmann J (2013) Gene discovery of modular diterpene metabolism in nonmodel systems. Plant Physiol 162:1073–1091
  67. Zerbino DR, Birney E (2008) Velvet: algorithms for de novo short read assembly using de Bruijn graphs. Genome Res 18:821–829
  68. Zharkikh A, Troggio M, Pruss D, Cestaro A, Eldredge G, Pindo M, Mitchell JT, Vezzulli S, Bhatnagar S, Fontana P, Viola R, Gutin A, Salamini F, Skolnick M, Velasco R (2008) Sequencing and assembly of highly heterozygous genome of Vitis vinifera L. cv Pinot Noir: problems and solutions. J Biotechnol 136:38–43
  69. Zimin AV, Stevens KA, Crepeau MW, Puiu D, Wegrzyn JL, Yorke JA, Langley CH, Neale DB, Salzberg SL (2017) An improved assembly of the loblolly pine mega-genome using long-read single-molecule sequencing. Gigascience 6:1–4
  70. Zulak KG, Bohlmann J (2010) Terpenoid biosynthesis and specialized vascular cells of conifer defense. J Integr Plant Biol 52:86–97

Copyright information

© Springer-Verlag GmbH Germany, part of Springer Nature 2018

Authors and Affiliations

  • Tal J. Shalev (1)
  • Macaire M. S. Yuen (1)
  • Andreas Gesell (1)
  • Agnes Yuen (1)
  • John H. Russell (2)
  • Jörg Bohlmann (1, corresponding author)

  1. Michael Smith Laboratories, University of British Columbia, Vancouver, Canada
  2. British Columbia Ministry of Forests, Lands, Natural Resource Operations and Rural Development, Cowichan Lake Research Station, Mesachie Lake, Canada
