An annotated transcriptome of highly inbred Thuja plicata (Cupressaceae) and its utility for gene discovery of terpenoid biosynthesis and conifer defense

Shalev, Tal J.; Yuen, Macaire M. S.; Gesell, Andreas; Yuen, Agnes; Russell, John H.; Bohlmann, Jörg

doi:10.1007/s11295-018-1248-y

Original Article
Published: 21 April 2018

Volume 14, article number 35 (2018)

Tree Genetics & Genomes

Abstract

Western redcedar (Thuja plicata; Cupressaceae; WRC) is an ecologically and economically important conifer species of the Pacific Northwest. Regeneration of WRC forests is affected by ungulate browsing, which removes current growth and hampers development of young trees. Monoterpenes make WRC foliage less palatable and can deter browsing. Genomic resources are required to advance knowledge of terpene accumulation and breeding of WRC for herbivore resistance. Unlike most conifers, WRC readily selfs to produce genotypes of reduced heterozygosity. We used seedlings of eight different fifth-generation selfed lines for monoterpene analysis and transcriptome sequencing. Trinity, Velvet/Oases, TransABySS, and SOAPdenovoTrans were used to generate independent transcriptome assemblies for each line. Sequence redundancy was reduced using the EvidentialGene pipeline. The best assembly, as determined by metrics of completeness, contiguity, and accuracy, was used to produce a WRC reference gene set of 28,279 sequences, of which 77% were annotated with significant BLASTp hits and 89% with significant InterProScan hits. An orthology-based approach was used to annotate gene families. Manually curated annotation identified 33 putative full-length terpene synthases (TPS). A maximum likelihood phylogeny revealed that WRC TPS cluster apart from those of Pinaceae within the gymnosperm TPS-d clade. Use of selfed lines enabled the development and annotation of a reduced-redundancy gene set for a gymnosperm of the Cupressaceae family. This gene set serves as a foundation for future functional characterization of WRC TPS and other defense genes and as a resource for the annotation of protein coding sequences in the WRC genome.

References

Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ (1990) Basic local alignment search tool. J Mol Biol 215:403–410
Baillie R, Drayton M, Pembleton L, Kaur S, Culvenor R, Smith K, Spangenberg G, Forster J, Cogan N (2017) Generation and characterisation of a reference transcriptome for Phalaris (Phalaris aquatica L.). Agronomy 7:14. https://doi.org/10.3390/agronomy7010014
Birol I, Raymond A, Jackman SD, Pleasance S, Coope R, Taylor GA, Yuen MMS, Keeling CI, Brand D, Vandervalk BP, Kirk H, Pandoh P, Moore RA, Zhao Y, Mungall AJ, Jaquish B, Yanchuk A, Ritland C, Boyle B, Bousquet J, Ritland K, MacKay J, Bohlmann J, Jones SJM (2013) Assembling the 20 Gb white spruce (Picea glauca) genome from whole-genome shotgun sequencing data. Bioinformatics 29:1492–1497. https://doi.org/10.1093/bioinformatics/btt178
Blande D, Halimaa P, Tervahauta AI, Aarts MGM, Kärenlampi SO (2017) De novo transcriptome assemblies of four accessions of the metal hyperaccumulator plant Noccaea caerulescens. Sci Data 4:1–9. https://doi.org/10.1038/sdata.2016.131
Bohlmann J, Keeling CI (2008) Terpenoid biomaterials. Plant J 54:656–669
Bohlmann J, Meyer-Gauen G, Croteau R (1998) Plant terpenoid synthases: molecular biology and phylogenetic analysis. Proc Natl Acad Sci U S A 95:4126–4133. https://doi.org/10.1073/pnas.95.8.4126
Bolger AM, Lohse M, Usadel B (2014) Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics 30:2114–2120. https://doi.org/10.1093/bioinformatics/btu170
Camacho C, Coulouris G, Avagyan V, Ma N, Papadopoulos J, Bealer K, Madden TL (2009) BLAST+: architecture and applications. BMC Bioinformatics 10:421. https://doi.org/10.1186/1471-2105-10-421
Chen F, Tholl D, Bohlmann J, Pichersky E (2011) The family of terpene synthases in plants: a mid-size family of genes for specialized metabolism that is highly diversified throughout the kingdom. Plant J 66:212–229. https://doi.org/10.1111/j.1365-313X.2011.04520.x
De La Torre AR, Birol I, Bousquet J et al (2014) Insights into conifer giga-genomes. Plant Physiol 166:1724–1732. https://doi.org/10.1104/pp.114.248708
Debell JD, Morrell JJ, Gartner BL (1999) Within-stem variation in tropolone content and decay resistance of second-growth Western redcedar. For Sci 45:101–107
Duan J, Xia C, Zhao G, Jia J, Kong X (2012) Optimizing de novo common wheat transcriptome assembly using short-read RNA-Seq data. BMC Genomics 13:392. https://doi.org/10.1186/1471-2164-13-392
Elbeltagy A, Nishioka K, Suzuki H, Sato T, Sato YI, Morisaki H, Mitsui H, Minamisawa K (2000) Isolation and characterization of endophytic bacteria from wild and traditionally cultivated rice varieties. Soil Sci Plant Nutr 46:617–629. https://doi.org/10.1080/00380768.2000.10409127
Emms DM, Kelly S (2015) OrthoFinder: solving fundamental biases in whole genome comparisons dramatically improves orthogroup inference accuracy. Genome Biol 16:157. https://doi.org/10.1186/s13059-015-0721-2
Foster AJ, Hall DE, Mortimer L, Abercromby S, Gries R, Gries G, Bohlmann J, Russell J, Mattsson J (2013) Identification of genes in Thuja plicata foliar terpenoid defenses. Plant Physiol 161:1993–2004. https://doi.org/10.1104/pp.112.206383
Fu L, Niu B, Zhu Z, Wu S, Li W (2012) CD-HIT: accelerated for clustering the next-generation sequencing data. Bioinformatics 28:3150–3152. https://doi.org/10.1093/bioinformatics/bts565
Geddy R, Brown GG (2007) Genes encoding pentatricopeptide repeat (PPR) proteins are not conserved in location in plant genomes and may be subject to diversifying selection. BMC Genomics 8:130. https://doi.org/10.1186/1471-2164-8-130
Gesell A, Blaukopf M, Madilao L, Yuen MMS, Withers SG, Mattsson J, Russell JH, Bohlmann J (2015) The gymnosperm cytochrome P450 CYP750B1 catalyzes stereospecific monoterpene hydroxylation of (+)-sabinene in thujone biosynthesis in western redcedar. Plant Physiol 168:94–106. https://doi.org/10.1104/pp.15.00315
Gilbert D (2013) Gene-omes built from mRNA seq not genome DNA. In: 7th Annual Arthropod Genomics Symposium. Notre Dame
Gonzalez JS (2004) Growth, properties and uses of western red cedar. Forintek Canada Corp Spec Publ No SP-37R 37
Gordon SP, Tseng E, Salamov A, Zhang J, Meng X, Zhao Z, Kang D, Underwood J, Grigoriev IV, Figueroa M, Schilling JS, Chen F, Wang Z (2015) Widespread polycistronic transcripts in fungi revealed by single-molecule mRNA sequencing. PLoS One 10:e0132628. https://doi.org/10.1371/journal.pone.0132628
Haas BJ, Papanicolaou A, Yassour M, Grabherr M, Blood PD, Bowden J, Couger MB, Eccles D, Li B, Lieber M, MacManes MD, Ott M, Orvis J, Pochet N, Strozzi F, Weeks N, Westerman R, William T, Dewey CN, Henschel R, LeDuc RD, Friedman N, Regev A (2013) De novo transcript sequence reconstruction from RNA-seq using the Trinity platform for reference generation and analysis. Nat Protoc 8:1494–1512. https://doi.org/10.1038/nprot.2013.084
Hall DE, Zerbe P, Jancsik S, Quesada AL, Dullat H, Madilao LL, Yuen M, Bohlmann J (2013) Evolution of conifer diterpene synthases: diterpene resin acid biosynthesis in lodgepole pine and jack pine involves monofunctional and bifunctional diterpene synthases. Plant Physiol 161:600–616. https://doi.org/10.1104/pp.112.208546
Hebda RJ, Mathewes RW (1984) Holocene history of cedar and native Indian cultures of the North American Pacific Coast. Science 225:711–713. https://doi.org/10.1126/science.225.4663.711
Hu X-G, Liu H, Jin Y, Sun YQ, Li Y, Zhao W, el-Kassaby YA, Wang XR, Mao JF (2016) De novo transcriptome assembly and characterization for the widespread and stress-tolerant conifer Platycladus orientalis. PLoS One 11:e0148985. https://doi.org/10.1371/journal.pone.0148985
Jones P, Binns D, Chang HY, Fraser M, Li W, McAnulla C, McWilliam H, Maslen J, Mitchell A, Nuka G, Pesseat S, Quinn AF, Sangrador-Vegas A, Scheremetjew M, Yong SY, Lopez R, Hunter S (2014) InterProScan 5: genome-scale protein function classification. Bioinformatics 30:1236–1240. https://doi.org/10.1093/bioinformatics/btu031
Katoh K, Standley DM (2013) MAFFT multiple sequence alignment software version 7: improvements in performance and usability. Mol Biol Evol 30:772–780. https://doi.org/10.1093/molbev/mst010
Keeling CI, Bohlmann J (2006a) Diterpene resin acids in conifers. Phytochemistry 67:2415–2423
Keeling CI, Bohlmann J (2006b) Genes, enzymes and chemicals of terpenoid diversity in the constitutive and induced defence of conifers against insects and pathogens. New Phytol 170:657–675
Keeling CI, Weisshaar S, Ralph SG, Jancsik S, Hamberger B, Dullat HK, Bohlmann J (2011) Transcriptome mining, functional characterization, and phylogeny of a large terpene synthase gene family in spruce (Picea spp.). BMC Plant Biol 11:43. https://doi.org/10.1186/1471-2229-11-43
Kotera E, Tasaka M, Shikanai T (2005) A pentatricopeptide repeat protein is essential for RNA editing in chloroplasts. Nature 433:326–330. https://doi.org/10.1038/nature03229
Lamesch P, Berardini TZ, Li D, Swarbreck D, Wilks C, Sasidharan R, Muller R, Dreher K, Alexander DL, Garcia-Hernandez M, Karthikeyan AS, Lee CH, Nelson WD, Ploetz L, Singh S, Wensel A, Huala E (2012) The Arabidopsis Information Resource (TAIR): improved gene annotation and new tools. Nucleic Acids Res 40:D1202–D1210. https://doi.org/10.1093/nar/gkr1090
Letunic I, Bork P (2016) Interactive tree of life (iTOL) v3: an online tool for the display and annotation of phylogenetic and other trees. Nucleic Acids Res. https://doi.org/10.1093/nar/gkw290
Lurin C, Andrés C, Aubourg S, Bellaoui M, Bitton F, Bruyère C, Caboche M, Debast C, Gualberto J, Hoffmann B, Lecharny A, le Ret M, Martin-Magniette ML, Mireau H, Peeters N, Renou JP, Szurek B, Taconnat L, Small I (2004) Genome-wide analysis of Arabidopsis pentatricopeptide repeat proteins reveals their essential role in organelle biogenesis. Plant Cell 16:2089–2103. https://doi.org/10.1105/tpc.104.022236
Martin DM, Fäldt J, Bohlmann J (2004) Functional characterization of nine Norway spruce TPS genes and evolution of gymnosperm terpene synthases of the TPS-d subfamily. Plant Physiol 135:1908–1927. https://doi.org/10.1104/pp.104.042028
Martin JA, Wang Z (2011) Next-generation transcriptome assembly. Nat Rev Genet 12:671–682. https://doi.org/10.1038/nrg3068
Miller JR, Koren S, Sutton G (2010) Assembly algorithms for next-generation sequencing data. Genomics 95:315–327
Morris PI, Stirling R (2012) Western red cedar extractives associated with durability in ground contact. Wood Sci Technol 46:991–1002. https://doi.org/10.1007/s00226-011-0459-2
Nakasugi K, Crowhurst R, Bally J, Waterhouse P (2014) Combining transcriptome assemblies from multiple de novo assemblers in the allo-tetraploid plant Nicotiana benthamiana. PLoS One 9:e91776. https://doi.org/10.1371/journal.pone.0091776
Neale DB, Wegrzyn JL, Stevens KA, Zimin AV, Puiu D, Crepeau MW, Cardeno C, Koriabine M, Holtz-Morris AE, Liechty JD, Martínez-García PJ, Vasquez-Gross HA, Lin BY, Zieve JJ, Dougherty WM, Fuentes-Soriano S, Wu LS, Gilbert D, Marçais G, Roberts M, Holt C, Yandell M, Davis JM, Smith KE, Dean JFD, Lorenz W, Whetten RW, Sederoff R, Wheeler N, McGuire PE, Main D, Loopstra CA, Mockaitis K, deJong PJ, Yorke JA, Salzberg SL, Langley CH (2014) Decoding the massive genome of loblolly pine using haploid DNA and novel assembly strategies. Genome Biol 15:R59. https://doi.org/10.1186/gb-2014-15-3-r59
Nelson DR (2009) The cytochrome p450 homepage. Hum Genomics 4:59–65. https://doi.org/10.1186/1479-7364-4-1-59
O’Connell LM, Ritland K (2005) Post-pollination mechanisms promoting outcrossing in a self-fertile conifer, Thuja plicata (Cupressaceae). Can J Bot 83:335–342. https://doi.org/10.1139/b05-007
Okamoto S, Yu F, Harada H, Okajima T, Hattan JI, Misawa N, Utsumi R (2011) A short-chain dehydrogenase involved in terpene metabolism from Zingiber zerumbet. FEBS J 278:2892–2900. https://doi.org/10.1111/j.1742-4658.2011.08211.x
Orsini L, Gilbert D, Podicheti R, Jansen M, Brown JB, Solari OS, Spanier KI, Colbourne JK, Rush D, Decaestecker E, Asselman J, de Schamphelaere KAC, Ebert D, Haag CR, Kvist J, Laforsch C, Petrusek A, Beckerman AP, Little TJ, Chaturvedi A, Pfrender ME, de Meester L, Frilander MJ (2016) Daphnia magna transcriptome by RNA-Seq across 12 environmental stressors. Sci Data 3:160030. https://doi.org/10.1038/sdata.2016.30
Proost S, Van BM, Vaneechoutte D et al (2015) PLAZA 3.0: an access point for plant comparative genomics. Nucleic Acids Res 43:D974–D981. https://doi.org/10.1093/nar/gku986
Rigault P, Boyle B, Lepage P, Cooke JEK, Bousquet J, MacKay JJ (2011) A white spruce gene catalog for conifer genome analyses. Plant Physiol 157:14–28. https://doi.org/10.1104/pp.111.179663
Ringer KL, Davis EM, Croteau R (2005) Monoterpene metabolism. Cloning, expression, and characterization of (-)-isopiperitenol/(-)-carveol dehydrogenase of peppermint and spearmint. Plant Physiol 137:863–872. https://doi.org/10.1104/pp.104.053298
Robertson G, Schein J, Chiu R, Corbett R, Field M, Jackman SD, Mungall K, Lee S, Okada HM, Qian JQ, Griffith M, Raymond A, Thiessen N, Cezard T, Butterfield YS, Newsome R, Chan SK, She R, Varhol R, Kamoh B, Prabhu AL, Tam A, Zhao YJ, Moore RA, Hirst M, Marra MA, Jones SJM, Hoodless PA, Birol I (2010) De novo assembly and analysis of RNA-seq data. Nat Methods 7:909–912. https://doi.org/10.1038/nmeth.1517
Russell JH, Burdon RD, Yanchuk AD (2003) Inbreeding depression and variance structures for height and adaptation in self- and outcross Thuja plicata families in varying environments. For Genet 10:171–184
Russell JH, Ferguson DC (2008) Preliminary results from five generations of a western redcedar (Thuja plicata) selection study with self-mating. Tree Genet Genomes 4:509–518. https://doi.org/10.1007/s11295-007-0127-8
Russell JH, Yanchuk AD (2012) Breeding for growth improvement and resistance to multiple pests in Thuja plicata. Gen Tech Rep 240:40–44
Schuler MA, Werck-Reichhart D (2003) Functional genomics of P450s. Annu Rev Plant Biol 54:629–667. https://doi.org/10.1146/annurev.arplant.54.031902.134840
Schulz MH, Zerbino DR, Vingron M, Birney E (2012) Oases: robust de novo RNA-seq assembly across the dynamic range of expression levels. Bioinformatics 28:1086–1092. https://doi.org/10.1093/bioinformatics/bts094
Simão FA, Waterhouse RM, Ioannidis P, Kriventseva EV, Zdobnov EM (2015) BUSCO: assessing genome assembly and annotation completeness with single-copy orthologs. Bioinformatics 31:3210–3212. https://doi.org/10.1093/bioinformatics/btv351
Smith-Unna R, Boursnell C, Patro R, Hibberd JM, Kelly S (2016) TransRate: reference-free quality assessment of de novo transcriptome assemblies. Genome Res 26:1134–1144. https://doi.org/10.1101/gr.196469.115
Stamatakis A (2014) RAxML version 8: a tool for phylogenetic analysis and post-analysis of large phylogenies. Bioinformatics 30:1312–1313. https://doi.org/10.1093/bioinformatics/btu033
Stevens KA, Wegrzyn JL, Zimin A, Puiu D, Crepeau M, Cardeno C, Paul R, Gonzalez-Ibeas D, Koriabine M, Holtz-Morris AE, Martínez-García PJ, Sezen UU, Marçais G, Jermstad K, McGuire PE, Loopstra CA, Davis JM, Eckert A, de Jong P, Yorke JA, Salzberg SL, Neale DB, Langley CH (2016) Sequence of the sugar pine megagenome. Genetics 204:1613–1626. https://doi.org/10.1534/genetics.116.193227
Tatusov RL, Koonin EV, Lipman DJ (1997) A genomic perspective on protein families. Science 278:631–637. https://doi.org/10.1126/science.278.5338.631
Velasco R, Zharkikh A, Affourtit J, Dhingra A, Cestaro A, Kalyanaraman A, Fontana P, Bhatnagar SK, Troggio M, Pruss D, Salvi S, Pindo M, Baldi P, Castelletti S, Cavaiuolo M, Coppola G, Costa F, Cova V, Dal Ri A, Goremykin V, Komjanc M, Longhi S, Magnago P, Malacarne G, Malnoy M, Micheletti D, Moretto M, Perazzolli M, Si-Ammour A, Vezzulli S, Zini E, Eldredge G, Fitzgerald LM, Gutin N, Lanchbury J, Macalma T, Mitchell JT, Reid J, Wardell B, Kodira C, Chen Z, Desany B, Niazi F, Palmer M, Koepke T, Jiwan D, Schaeffer S, Krishnan V, Wu C, Chu VT, King ST, Vick J, Tao Q, Mraz A, Stormo A, Stormo K, Bogden R, Ederle D, Stella A, Vecchietti A, Kater MM, Masiero S, Lasserre P, Lespinasse Y, Allan AC, Bus V, Chagné D, Crowhurst RN, Gleave AP, Lavezzo E, Fawcett JA, Proost S, Rouzé P, Sterck L, Toppo S, Lazzari B, Hellens RP, Durel CE, Gutin A, Bumgarner RE, Gardiner SE, Skolnick M, Egholm M, van de Peer Y, Salamini F, Viola R (2010) The genome of the domesticated apple (Malus × domestica Borkh.). Nat Genet 42:833–839. https://doi.org/10.1038/ng.654
Visser EA, Wegrzyn JL, Steenkamp ET, Myburg AA, Naidoo S (2015) Combined de novo and genome guided assembly and annotation of the Pinus patula juvenile shoot transcriptome. BMC Genomics 16:1057. https://doi.org/10.1186/s12864-015-2277-7
Vourc’h G, De Garine-Wichatitsky M, Labbé A et al (2002) Monoterpene effect on feeding choice by deer. J Chem Ecol 28:2411–2427. https://doi.org/10.1023/A:1021423816695
Warren RL, Keeling CI, Saint YMM et al (2015) Improved white spruce (Picea glauca) genome assemblies and annotation of large gene families of conifer terpenoid and phenolic defense metabolism. Plant J 83:189–212. https://doi.org/10.1111/tpj.12886
Wegrzyn JL, Liechty JD, Stevens KA, Wu LS, Loopstra CA, Vasquez-Gross HA, Dougherty WM, Lin BY, Zieve JJ, Martinez-Garcia PJ, Holt C, Yandell M, Zimin AV, Yorke JA, Crepeau MW, Puiu D, Salzberg SL, de Jong PJ, Mockaitis K, Main D, Langley CH, Neale DB (2014) Unique features of the loblolly pine (Pinus taeda L.) megagenome revealed through sequence annotation. Genetics 196:891–909. https://doi.org/10.1534/genetics.113.159996
Xie Y, Wu G, Tang J, Luo R, Patterson J, Liu S, Huang W, He G, Gu S, Li S, Zhou X, Lam TW, Li Y, Xu X, Wong GKS, Wang J (2014) SOAPdenovo-Trans: de novo transcriptome assembly with short RNA-Seq reads. Bioinformatics 30:1660–1666. https://doi.org/10.1093/bioinformatics/btu077
Yagi Y, Tachikawa M, Noguchi H, Satoh S, Obokata J, Nakamura T (2013) Pentatricopeptide repeat proteins involved in plant organellar RNA editing. RNA Biol 10:1419–1425. https://doi.org/10.4161/rna.24908
Zerbe P, Hamberger B, Yuen MMS, Chiang A, Sandhu HK, Madilao LL, Nguyen A, Hamberger B, Bach SS, Bohlmann J (2013) Gene discovery of modular diterpene metabolism in nonmodel systems. Plant Physiol 162:1073–1091. https://doi.org/10.1104/pp.113.218347
Zerbino DR, Birney E (2008) Velvet: algorithms for de novo short read assembly using de Bruijn graphs. Genome Res 18:821–829. https://doi.org/10.1101/gr.074492.107
Zharkikh A, Troggio M, Pruss D, Cestaro A, Eldredge G, Pindo M, Mitchell JT, Vezzulli S, Bhatnagar S, Fontana P, Viola R, Gutin A, Salamini F, Skolnick M, Velasco R (2008) Sequencing and assembly of highly heterozygous genome of Vitis vinifera L. cv Pinot Noir: problems and solutions. J Biotechnol 136:38–43. https://doi.org/10.1016/j.jbiotec.2008.04.013
Zimin AV, Stevens KA, Crepeau MW, Puiu D, Wegrzyn JL, Yorke JA, Langley CH, Neale DB, Salzberg SL (2017) An improved assembly of the loblolly pine mega-genome using long-read single-molecule sequencing. Gigascience 6:1–4. https://doi.org/10.1093/GIGASCIENCE/GIW016
Zulak KG, Bohlmann J (2010) Terpenoid biosynthesis and specialized vascular cells of conifer defense. J Integr Plant Biol 52:86–97


Acknowledgements

We thank Dr. Carol Ritland and Ms. Karen Reid for excellent project management support, Dr. Timothy J. Sexton for technical assistance, and the McGill University and Génome Québec Innovation Centre for sequencing services. The research was supported with funds from the Natural Sciences and Engineering Research Council of Canada (NSERC Discovery Grant) and funds to JB and JHR from Genome British Columbia, Genome Canada, and the British Columbia Ministry of Forests, Lands, Natural Resource Operations and Rural Development (MFLNRORD) for the CEDaR User Partnership Project (UPP-002, Genome BC) and the CEDaR Applied Genomics Partnership Project (184CED-GAPP, Genome Canada and Genome BC). TJS is supported by an NSERC Postgraduate Doctoral fellowship.

Data archiving statement

The sequence data supporting this work can be found at the NCBI BioProject Database under BioProject ID PRJNA399722. In addition, sequences of the gene lists described in this paper and their annotations are also available in Files S1–S5 and File S9.

Author information

Authors and Affiliations

Michael Smith Laboratories, University of British Columbia, 2185 East Mall, Vancouver, British Columbia, V6T 1Z4, Canada
Tal J. Shalev, Macaire M. S. Yuen, Andreas Gesell, Agnes Yuen & Jörg Bohlmann
British Columbia Ministry of Forests, Lands, Natural Resource Operations and Rural Development, Cowichan Lake Research Station, Mesachie Lake, British Columbia, V0R 2N0, Canada
John H. Russell

Corresponding author

Correspondence to Jörg Bohlmann.

Additional information

Communicated by C. Dardick

Electronic supplementary material

Figure S1

Representative four-month-old WRC seedling used for RNA isolation and sequencing. (JPEG 52 kb)

High resolution image (EPS 620 kb)

Figure S2

Pipeline for the de novo assembly and redundancy reduction of the WRC gene set, carried out for each inbred S5 line separately. (JPEG 208 kb)

High resolution image (EPS 2924 kb)

Figure S3

Monoterpene profiles of foliar samples for 12 different monoterpenes across the eight S5 lines. (JPEG 937 kb)

High resolution image (EPS 4043 kb)

Figure S4

Results of the BUSCO gene set completeness assessment. The reduced-redundancy gene set for WRC S5 Line 4 was found to be the most complete, with the lowest number of missing orthologs. (JPEG 438 kb)

High resolution image (EPS 848 kb)

Table S1

Summary for transcriptome assemblies for WRC S5 lines. (DOCX 14 kb)

Table S2

Results of the Conditional Reciprocal Best BLAST (CRBB) analysis. (DOCX 15 kb)

Table S3

Results of the BLASTp analysis of transcriptome assemblies against the longest predicted proteins (n = 1000) in the P. glauca and A. thaliana reference gene sets. (DOCX 13 kb)

File S1

Sequences of 241 plant terpene synthases (TPS) used in construction of a maximum-likelihood phylogeny of plant TPS. (TXT 184 kb)

File S2

Sequences of 126 gymnosperm and a single P. patens TPS used in construction of a maximum-likelihood phylogeny of gymnosperm TPS. (TXT 101 kb)

File S3

Sequence data for the core WRC gene set. Gene set containing the 28,279 core, reduced-redundancy protein sequences for predicted ORFs as produced by the EvidentialGene pipeline. (TXT 12858 kb)

File S4

Sequence data for the alternate WRC gene set. Gene set containing 40,691 additional putative protein-coding sequences, which may be potential gene isoforms or paralogs. (TXT 18875 kb)

File S5

Summary of significant BLASTp and InterProScan hits for the main reduced-redundancy gene set of Line 4. BLAST columns are as described in the BLAST Command Line Applications User Manual (https://www.ncbi.nlm.nih.gov/books/NBK279690/). The pipeline for BLASTing and filtering hits is described in the Methods section. GO names are separated by Biological Process (P), Molecular Function (F) and Cellular Component (C). The InterPro ID column lists all InterPro domains found for the queried sequence. Top PFAM hit describes the hit with the highest score against the PFAM database for each sequence, using an e-value cut-off of 1e-5. (XLSX 5321 kb)

File S6

Statistical summary of orthogroup analysis for all sequences assigned to orthogroups. Of the 498,235 protein-coding sequences from 16 different plant species submitted for orthogroup analysis, 391,179 were successfully assigned to 19,660 orthogroups. The majority of orthogroups (11,616) had an average of fewer than one gene per species; the largest orthogroup (3201 genes) had an average of 151–200 genes per species. A large number of orthogroups (5614) had members from only two species; however, a similarly large number (3835) had members from all 16 species. (XLSX 15 kb)
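The counts quoted for File S6 can be turned into proportions with a few lines of arithmetic. The following Python sketch is illustrative only: the variable and function names are assumptions made here, and the derived percentages are computed from the quoted counts rather than reported in the file description.

```python
# Counts quoted in the File S6 description.
submitted_sequences = 498_235   # protein-coding sequences from 16 species
assigned_sequences = 391_179    # sequences placed in an orthogroup
orthogroup_count = 19_660       # orthogroups recovered
all_species_groups = 3_835      # orthogroups with members from all 16 species


def percent(part: int, whole: int) -> float:
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


# Derived values (not stated in the source; computed here for illustration).
assigned_pct = percent(assigned_sequences, submitted_sequences)
mean_group_size = assigned_sequences / orthogroup_count
all_species_pct = percent(all_species_groups, orthogroup_count)

print(f"Assigned to orthogroups: {assigned_pct:.1f}%")
print(f"Mean genes per orthogroup: {mean_group_size:.1f}")
print(f"Orthogroups spanning all 16 species: {all_species_pct:.1f}%")
```

Under these counts, roughly four in five submitted sequences were assigned to an orthogroup, and the mean orthogroup holds about 20 genes.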

File S7

Statistical summary of orthogroup analysis results for each species. The species with the lowest proportion of genes assigned to orthogroups was P. patens, with only 57.5% of sequences assigned; the highest was P. glauca with 92.4%. Of our WRC gene set, 90.4% was successfully assigned to orthogroups; 0.1% of WRC sequences were in species-specific orthogroups. (XLSX 21 kb)

File S8

Summary of orthogroup composition and function. The number of orthogroup members from each species, together with the total number of genes in each orthogroup and the top five PFAM hits for each orthogroup. The largest orthogroup, with 3201 genes, consisted mainly of pentatricopeptide repeat-containing protein-coding genes, a large protein family in plants with little functional redundancy (Lurin et al. 2004). (XLSX 1665 kb)

File S9

Sequence data for 33 putative full-length TPS genes from the WRC gene set. Putative TPS were identified using BLASTp, InterProScan and orthogroup analysis, and after removal of partial ORFs and proteins less than 400 aa long were reduced to a set of 33 putative full-length TPS. (TXT 26 kb)
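The length filter mentioned for File S9 (removal of proteins shorter than 400 aa) can be sketched in plain Python. This is a minimal illustration, not the authors' pipeline: the function names, the toy sequences, and the rule that a complete ORF starts with methionine are all assumptions introduced here; only the 400 aa threshold comes from the description above.

```python
from typing import Dict, Iterable

MIN_LENGTH_AA = 400  # threshold stated in the File S9 description


def parse_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Parse FASTA-formatted lines into {identifier: sequence}."""
    records: Dict[str, str] = {}
    name = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            parts = line[1:].split()
            name = parts[0] if parts else ""
            records[name] = ""
        elif name is not None:
            records[name] += line
    return records


def looks_full_length(seq: str) -> bool:
    """Assumed completeness rule: starts with Met and is at least 400 aa."""
    protein = seq.rstrip("*")  # drop a trailing stop symbol, if present
    return protein.startswith("M") and len(protein) >= MIN_LENGTH_AA


def filter_full_length(records: Dict[str, str]) -> Dict[str, str]:
    """Keep only records that pass the assumed full-length rule."""
    return {name: seq for name, seq in records.items() if looks_full_length(seq)}


# Hypothetical toy input; these are not real WRC sequences.
example = [
    ">tps_a hypothetical",
    "M" + "A" * 449,   # 450 aa, starts with M: kept
    ">tps_b hypothetical",
    "M" + "A" * 199,   # 200 aa: too short
    ">tps_c hypothetical",
    "A" * 500,         # long enough but lacks a start Met: treated as partial
]
kept = filter_full_length(parse_fasta(example))
print(sorted(kept))
```

In practice the completeness call would come from the ORF predictor's own annotation rather than from a start-residue check; the check here only stands in for that step.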

About this article

Cite this article

Shalev, T.J., Yuen, M.M.S., Gesell, A. et al. An annotated transcriptome of highly inbred Thuja plicata (Cupressaceae) and its utility for gene discovery of terpenoid biosynthesis and conifer defense. Tree Genetics & Genomes 14, 35 (2018). https://doi.org/10.1007/s11295-018-1248-y

Received: 05 February 2018
Revised: 28 March 2018
Accepted: 02 April 2018
Published: 21 April 2018
DOI: https://doi.org/10.1007/s11295-018-1248-y
