Sequencing of Sitka spruce (Picea sitchensis) cDNA libraries constructed from autumn buds and foliage reveals autumn-specific spruce transcripts
- Cite this article as:
- Reid, K.E., Holliday, J.A., Yuen, M. et al. Tree Genetics & Genomes (2013) 9: 683. doi:10.1007/s11295-012-0584-6
Substantial efforts have been invested in recent years to characterize the expressed genome of major North American spruce species, namely Sitka spruce (Picea sitchensis), white spruce (Picea glauca) and interior spruce (Picea engelmannii × glauca). To date, more than 550,000 spruce expressed sequence tags (ESTs) have been publicly deposited, most of which were derived from various tissues collected during active primary growth. Here we report EST sequencing of dormant foliage and bud tissue collected from Sitka spruce. Both normalized and standard libraries were employed, with tissue collected at two autumn time points. A total of 30,681 ESTs were generated and then assembled into 9,400 putative unique transcripts, or unigenes, with an average length of 779 bp. These autumn-specific Sitka spruce ESTs were combined with autumn-specific white spruce ESTs and compared with all spruce ESTs currently available. In total, 12,307 ESTs were unique to the autumn libraries, which assembled into 11,121 unigenes. Functional categorization suggests a role for some of these genes in bud dormancy and adaptation to freezing stress. Our results show that dormant tissue harbours a large number of transcripts not found in the same tissue during the growing season, and this sequence resource will therefore support ongoing studies of adaptive traits in spruce.
Keywords: Expressed sequence tag; EST; Picea; Cold hardiness; Bud dormancy
Conifer trees are ecologically dominant in many temperate and boreal regions worldwide and form the basis for industrial forestry in these areas. Genome and transcriptome sequencing of these species facilitates diverse, economically and ecologically important goals, such as dissecting molecular mechanisms of resistance to insect pests, elucidating the genetic and molecular architecture of complex adaptive traits, and applying these tools to genome-enabled breeding. Conifer genomes are very large (on the order of 10–30 Gb) (Ahuja and Neale 2005), which, until recently, has precluded whole-genome shotgun sequencing and assembly. However, extensive expressed genome sequencing has been undertaken for several conifer genera, most notably pine (Pinus) and the dominant genus in boreal regions of North America and Europe, Picea (spruce).
Adaptation of forest trees of the temperate and boreal zones requires close synchronization of growth and dormancy cycles of the trees to their local climate (Aitken et al. 2008; Howe et al. 2003). Freezing temperatures in winter necessitate the acquisition of cold hardiness each autumn, which must be maintained until conditions return in the spring that are consistently favourable enough for active growth without risk of freezing injury (Weiser 1970). Cold acclimation is a remarkable physiological transition, whereby trees that may be killed during active growth by ambient temperatures only slightly below freezing develop cold hardiness that in some cases enables survival following submersion in liquid nitrogen (Sakai 1960). Autumn cold acclimation in perennials preserves cellular integrity when plants are faced with the myriad stresses imposed by freezing temperatures, including chemical stress due to dehydration of cellular macromolecules, mechanical stress on cellular membranes due to freeze–thaw cycles and oxidative stress (Smallwood and Bowles 2002).
The genomic complexities of the transition to dormancy and acquisition of cold hardiness in conifers are beginning to be unravelled. Transcript profiling in Sitka spruce needles showed that ~10 % of microarray features were up- or down-regulated at least 2-fold during cold acclimation (Holliday et al. 2008). Many of these transcripts appear to facilitate remodelling of primary metabolism associated with freezing acclimation (carbohydrates, lipids and amino acids), accumulation of cryoprotective proteins, scavenging of reactive oxygen species and related signalling (Smallwood and Bowles 2002). However, as Holliday et al. (2008) measured gene expression in needle tissue, whereas perception of and signalling related to dormancy occur in vegetative buds (Rohde and Bhalerao 2007), there is a need for more detailed analyses of this tissue. El Kayal et al. (2011) specifically focused on expression changes associated with the transition to short days in vegetative buds of white spruce (Picea glauca) and found that among the 10,400 array elements, about 45 % were differentially expressed. Similar to the results from Sitka spruce (Holliday et al. 2008), a wide array of biological processes was represented among the differentially expressed genes in white spruce, reinforcing the substantial metabolic activity of dormant tissues in conifers. Careful phenotyping of bud development by El Kayal et al. allowed the authors to identify possible regulators of the progression to dormancy. Asante et al. (2011) took a more targeted approach, using sequencing of suppression-subtractive hybridization libraries (long-day versus short-day samples) to identify a set of candidate genes for more detailed study.
[Table 1. Spruce ESTs publicly available in NCBI's GenBank as of June 2012. Columns: ESTs >400 bp; ESTs >400 bp from late season tissue. Rows include P. engelmannii × glauca. Table body not available.]
With diminishing returns from deeper EST sequencing of available growing-season libraries, dormant tissues represent a relatively unexplored piece of the spruce transcriptional landscape. Here we report sequencing and sequence analysis of both standard and normalized cDNA libraries that were generated from Sitka spruce tissue collected at two time points in the fall and from two tissue types: excised apical buds and needles. For completeness of the sequence analysis, the sequences of autumn Sitka spruce libraries were combined with those of the autumn white spruce libraries (Rigault et al. 2011) and compared with all other available sequence data for North American spruce species to assess the level of uniqueness of the spruce autumn transcriptome. The results of this study reflect the unique character of the transcriptome of dormant trees and provide a resource for studies of cold acclimation and bud dormancy in conifers.
Materials and methods
Sitka spruce (P. sitchensis), ramets of genotype FB3-425, originating from the east side of Vancouver Island near Fanny Bay (49.49° N, 124.82° W), were grown outside on the University of British Columbia campus. At harvest, trees were 5 years old. Needles were collected at two autumn time points: one in late summer (August 2006) following bud set and the second approximately 6 to 8 weeks after bud set in autumn (October 2006), as the trees entered dormancy for the winter season. Tissue was frozen immediately in liquid nitrogen and then stored at −80 °C until RNA isolation. Tissue for bud-derived libraries was harvested from trees of the same clonal line one year later in October 2007, approximately six to eight weeks following bud set. Young buds were cut from the trees, transported to the laboratory where the bud scales were carefully removed, and the primordial buds were frozen in liquid nitrogen and stored at −80 °C until RNA isolation.
RNA isolation and cDNA library construction
Isolation of high-quality RNA was performed using a protocol adapted from Kolosova et al. (2004). Larger isolations were performed in 50 ml conical tubes for the needle-derived libraries, and smaller isolations were performed in 2 ml microcentrifuge tubes for the bud-derived libraries. Briefly, tissue was ground to a fine powder under liquid nitrogen. Then, extraction buffer (0.4 M Tris · HCl, 55 mM lithium dodecyl sulphate, 0.3 M LiCl, 10 mM Na2 EDTA, 24 mM sodium deoxycholate, 1 % Tergitol NP-40, 1 mM aurintricarboxylic acid, 10 mM DTT, 5 mM thiourea, 2 % polyvinyl polypyrrolidone) was added and samples were snap frozen. Samples were thawed at 37 °C and spun for 10 min, and then 0.03 volume of 3.3 M Na acetate and 0.1 volume of 100 % ethanol were added to remove excess polysaccharides. Supernatant was transferred, 0.1 volume of 3.3 M NaOAc and 0.6 volume of 100 % isopropanol were added, and then samples were spun. Pellets were resuspended in 2 ml of TE, 2 ml of 5 M NaCl and 1 ml of 10 % cetyltrimethylammonium bromide (or 400, 400 and 200 μl, respectively, for microcentrifuge tube isolations) and incubated at 65 °C for 5 min. Mixtures were extracted twice in an equal volume of chloroform/isoamyl alcohol (24:1), and then 0.3 volume of 8 M LiCl was added to the aqueous phase and incubated overnight at 4 °C. RNA was pelleted, resuspended in TE and then reprecipitated in one volume of isopropanol and 0.1 volume of 3.3 M NaOAc at −80 °C (subsequently, this final step was eliminated). The final pellet was washed in 70 % EtOH and dissolved in DEPC-treated water. RNA was quantified with a NanoDrop ND-1000 spectrophotometer (NanoDrop Technologies, Wilmington, DE, USA) and integrity was evaluated on a 1 % agarose gel. All libraries were constructed from purified mRNA, using an RNeasy cleanup kit (Qiagen, Valencia, CA, USA) following the manufacturer’s instructions.
Needle libraries were constructed using the pBluescript II XR cDNA library construction kit (Stratagene, La Jolla, CA, USA). Bud libraries were constructed with an optimized Creator SMART cDNA library construction kit (Clontech, Mountain View, CA, USA) for standard libraries, and normalized libraries incorporated the Trimmer-Direct cDNA normalization kit (Evrogen, Moscow, Russia) in the protocol. Four bud libraries were constructed: a standard and a normalized cDNA library produced by Evrogen, and a standard and a normalized library made in-house using the same method.
Sequencing and EST assemblies
In total, 28,800 cDNA clones were sequenced: 6,144 using both M13 forward (5′) and M13 reverse (3′) primers, and 22,656 in the reverse (3′) orientation only. Sequencing reactions were performed and run at the Michael Smith Genome Sciences Centre, Vancouver, BC, Canada on 3730XL DNA Analyzers (Applied Biosystems, Foster City, CA, USA). DNA sequencing chromatograms were used to filter for high-quality regions using the Phred software (Ewing and Green 1998; Ewing et al. 1998). Subsequently, sequences were vector-trimmed using Cross_Match software in the Phrap package (http://www.phrap.org/) and filtered to remove short sequences (<100 bp) and sequences representing bacterial, fungal or yeast contamination from the dataset. A total of 30,681 high-quality EST reads were obtained and submitted to NCBI with accession numbers ES667072 to ES671893, FD740103 to FD748148, GH280265 to GH291091, and GT120725 to GT127710.
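The length-filtering step can be illustrated with a minimal Python sketch. It is not the pipeline used in the study: the function name `filter_short_reads`, the dictionary input and the toy sequences are hypothetical, and the sketch assumes the filter discards trimmed reads shorter than 100 bp. It also tallies the number of reads attempted from the clone counts given above.

```python
MIN_LENGTH_BP = 100  # assumed threshold: trimmed reads shorter than this are dropped


def filter_short_reads(reads, min_length=MIN_LENGTH_BP):
    """Return only reads whose trimmed sequence is at least min_length bases.

    `reads` maps a read identifier to its quality- and vector-trimmed sequence.
    """
    return {name: seq for name, seq in reads.items() if len(seq) >= min_length}


# Hypothetical toy reads (not real data).
toy_reads = {
    "read_a": "ACGT" * 30,  # 120 bp, kept
    "read_b": "ACGT" * 10,  # 40 bp, removed
    "read_c": "A" * 100,    # exactly 100 bp, kept
}
kept = filter_short_reads(toy_reads)

# Reads attempted: clones sequenced from both ends contribute two reads each.
reads_attempted = 6144 * 2 + 22656
```

Under this reading of the clone counts, 34,944 reads were attempted, so the 30,681 high-quality ESTs correspond to roughly 88 % of attempted reads.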
To identify unique transcripts from the libraries presented in this study, an EST assembly was produced for all sequences derived from autumn tissues using CAP3 (Huang and Madan 1999). Assembly parameters were as follows: a sequence overlap window of at least 40 bp and an identity of at least 95 %. To further concentrate our efforts on transcripts specific to the autumn period, an assembly of all known P. sitchensis and P. glauca ESTs greater than 400 bp in length (for a total of 478,361 reads) was produced using the same CAP3 parameters. These sequences include those of Pavy et al. (2005), Ralph et al. (2008), Rigault et al. (2011) and many other reads publicly available at NCBI's GenBank, most of which were derived from a variety of actively growing tissues (Table 1). Based on the complete assembly, we extracted the reads that were uniquely derived from cDNA libraries with tissue collected in the autumn (see Supplemental Table S1).
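The extraction of autumn-only unigenes from the complete assembly can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the authors' code: the function `autumn_only_unigenes`, the mapping structures and the toy read names are hypothetical, and `growing_lib` stands in for any growing-season library. A unigene (contig or singleton) is retained only if every one of its member reads comes from an autumn library.

```python
def autumn_only_unigenes(unigene_members, read_library, autumn_libraries):
    """Return unigenes whose member reads all come from autumn libraries.

    unigene_members: maps unigene id -> list of member read ids
    read_library:    maps read id -> id of the cDNA library it came from
    autumn_libraries: iterable of library ids built from autumn tissue
    """
    autumn = set(autumn_libraries)
    selected = {}
    for unigene, reads in unigene_members.items():
        # Keep the unigene only if it is non-empty and purely autumn-derived.
        if reads and all(read_library[r] in autumn for r in reads):
            selected[unigene] = list(reads)
    return selected


# Hypothetical toy assembly (not real data).
unigene_members = {
    "contig1": ["r1", "r2"],   # both reads from autumn libraries
    "contig2": ["r3", "r4"],   # mixed: one autumn read, one growing-season read
    "singleton1": ["r5"],      # autumn read
    "singleton2": ["r6"],      # growing-season read
}
read_library = {
    "r1": "WS042", "r2": "WS044",
    "r3": "WS045", "r4": "growing_lib",
    "r5": "WS041", "r6": "growing_lib",
}
autumn_libraries = ["WS041", "WS042", "WS044", "WS045"]

autumn_specific = autumn_only_unigenes(unigene_members, read_library, autumn_libraries)
```

In the toy data only `contig1` and `singleton1` are retained; `contig2` is excluded because one of its reads comes from a growing-season library, which mirrors the requirement that autumn-specific unigenes contain no reads from other libraries.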
Dormancy-specific contigs and singletons identified from the above analysis were categorized based on the Gene Ontology (GO) (Ashburner et al. 2000) using the BiNGO plugin for Cytoscape (Maere et al. 2005; Shannon et al. 2003). First, putative orthologues of spruce genes were identified by searching a custom Basic Local Alignment Search Tool (BLAST) (Altschul et al. 1997) database composed of all the Arabidopsis sequences annotated by The Arabidopsis Information Resource (TAIR) (Lamesch et al. 2012); BLAST executables (http://www.ncbi.nlm.nih.gov) were integrated to automate this process. Only unigenes with E-values less than 1E−25 were considered for this analysis. Over-represented GOSlim terms among dormancy unigenes were identified using the hypergeometric test implemented in BiNGO. TAIR accessions identified within the total unigene set for spruce described above were used as the reference set for this test. A false discovery rate (FDR) of less than 0.1 was considered statistically significant.
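The over-representation test can be illustrated with a short, self-contained Python sketch. It is not BiNGO's implementation: the function names and toy numbers are hypothetical, and the Benjamini-Hochberg procedure is assumed for the FDR correction because the text does not state which correction was applied. Here N is the size of the reference set, K the number of reference genes annotated to a GO term, n the size of the autumn set and k the number of autumn genes annotated to the term.

```python
from math import comb


def hypergeom_upper_tail(k, n, K, N):
    """P(X >= k) when n genes are drawn without replacement from N, K of which carry the term."""
    upper = min(n, K)
    total = comb(N, n)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / total


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone adjusted values.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical toy numbers: 20 reference genes, 5 with the term,
# 4 genes in the test set, 3 of which carry the term.
p_term = hypergeom_upper_tail(k=3, n=4, K=5, N=20)

# Hypothetical raw p-values for four GO terms, judged at the FDR < 0.1 threshold.
toy_pvalues = [0.01, 0.04, 0.03, 0.5]
adjusted = benjamini_hochberg(toy_pvalues)
significant = [q < 0.1 for q in adjusted]
```

The upper tail is used because the question is whether a term occurs among the autumn unigenes more often than expected from the reference set; correcting across all tested terms keeps the expected share of false positives among reported terms below the chosen threshold.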
Results and discussion
Sequencing and assembly of Sitka spruce autumn ESTs
[Table. Sitka spruce cDNA libraries constructed using autumn tissues. Columns: High quality ESTs (5′/3′); Average trimmed length (bp). Rows: Needle tissue collected in late summer, WS041; Needle tissue collected in late summer, WS043; Needle tissue collected in autumn, WS042; Dormant buds, WS044; Dormant buds, WS045; Dormant buds, WS046; Dormant buds, WS047. Table body not available.]
[Table. Assembly of Sitka spruce needle and bud ESTs. Rows: Number of sequences; Number of 5′ sequences; Number of 3′ sequences; Number of unique sequences; Number of contigs; Average contig length (bp); Average number of sequences/contig; Number of singletons. Table body not available.]
Genes putatively unique to the spruce autumn transcriptome
Many of the ESTs sequenced in the autumn tissues had already been identified in earlier sequencing of tissues collected during the growing season, but the extent to which these collections captured all or most of the transcripts present in late season tissue is unknown. We therefore sought to identify transcripts from autumn cDNA libraries that are not represented within previous spruce sequence resources. To improve the detection of autumn-specific transcripts in spruce, we compiled EST sequences from Sitka and white spruce extracted from NCBI’s EST public repository. We included only Sitka and white spruce in this analysis due to their close evolutionary relationship and because they were the only two species for which autumn ESTs were available. This resulted in a total of 478,361 sequences (after filtering for >400 bp reads) from a variety of tissues and growing conditions. Of these, 60,681 were derived from autumn libraries. By extracting singletons and contigs from this build that were composed only of EST reads generated from the autumn libraries, we were able to identify a total of 12,307 reads as being present only in the autumn data sets (Supplemental File 1). As such, approximately 20 % of the reads in the autumn libraries were not represented among the complete set of North American spruce ESTs. The assembly of these sequences produced 567 contigs and 10,554 singletons for a total of 11,121 unigenes.
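The proportions quoted above follow directly from the reported counts, as the short Python check below shows. The variable names are illustrative, and the number of reads within contigs is derived on the assumption that every autumn-only read belongs to exactly one contig or singleton.

```python
autumn_reads_total = 60681  # reads >400 bp derived from autumn libraries
autumn_only_reads = 12307   # reads found only in autumn data sets
contigs = 567
singletons = 10554

# Share of autumn reads with no match among growing-season reads.
fraction_autumn_only = autumn_only_reads / autumn_reads_total

# Unigenes are contigs plus singletons.
unigenes = contigs + singletons

# Assumption: each autumn-only read sits in exactly one contig or singleton.
reads_in_contigs = autumn_only_reads - singletons
mean_reads_per_contig = reads_in_contigs / contigs
```

The fraction evaluates to about 20.3 %, consistent with the "approximately 20 %" stated above, and the 567 contigs would hold 1,753 reads, or about three reads per contig, which indicates that most autumn-specific transcripts were sampled only once.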
Functional classification of autumn unigenes
With respect to the biological processes ontology, 602 unigenes were annotated to the GO term ‘response to stress’. Many genes in this category have been shown to be induced by short days in temperate and boreal tree species (Joosen et al. 2006; Holliday et al. 2008; Ruttink et al. 2007; Schrader et al. 2004; El Kayal et al. 2011) and reflect the downstream end of the cold hardiness transcriptome: Genes in this category do the work of protecting the cell from the severe dehydrative and oxidative stress brought on by freezing temperatures. Included among these were two dehydrin-like transcripts, five possible antifreeze unigenes and oxidative stress-related genes (peroxidases, glutathione transferases and catalases). Dehydrins are a subfamily of group 2 late embryogenesis abundant proteins that are strongly upregulated in response to short days (Holliday et al. 2008; Ruttink et al. 2007; Asante et al. 2011) as well as a variety of stresses including cold and drought (Street et al. 2006; Joosen et al. 2006). Antifreeze proteins are typically secreted to the apoplasm where they inhibit ice crystal formation and propagation (Griffith and Yaish 2004). As the GO ‘extracellular regions’ cellular component category was also overrepresented and included several pathogenesis-related unigenes, it is likely that the autumn unigene set is enriched for apoplastic antifreeze proteins. Finally, a diverse set of autumn unigenes related to oxidative stress were identified, which likely counteract the redox imbalance resulting from decreased photosynthetic efficiency, but may also have a role in signalling (Suzuki and Mittler 2006).
In addition to the production of the specific cryoprotective proteins, acquisition of tolerance to freezing temperatures in winter involves extensive metabolic remodelling. Several relevant GO terms were overrepresented, including ‘lipid metabolic process’ and ‘carbohydrate metabolic process’. While biophysical studies suggest increased membrane fluidity during winter through membrane desaturation (Martz et al. 2006; Uemura et al. 2006; Uemura and Steponkus 1994; Oquist et al. 2001), microarray studies have not found extensive evidence for differential expression of the relevant enzymes of lipid metabolism. Our results reveal autumn-specific fatty acid and sterol desaturases, as well as fatty acid synthesis and modifying enzymes. Particularly enriched were unigenes involved in long chain fatty acid synthesis—for example 3-ketoacyl-CoA synthase and long chain acyl-CoA synthetase families. The latter may be involved in cutin biosynthesis, the deposition of which could regulate water loss during freezing temperatures and concomitant extracellular ice formation. This suggests that while strong upregulation of the lipid biosynthesis machinery may not be a feature of cold acclimation, membrane remodelling is likely nonetheless important. Among carbohydrate metabolism genes were several putative galactinol and raffinose synthases, which are frequently associated with cold acclimation. A galactinol synthase-like gene, as well as a putative raffinose synthase, was upregulated during cold acclimation in Sitka spruce (Holliday et al. 2008), and metabolic profiling during the same period found that the metabolic products of these genes, namely galactinol and raffinose, increased in abundance during autumn (Dauwe et al. 2012). Interestingly, the temporal dynamics of raffinose accumulation closely tracked the cold acclimation process measured in three phenotypically divergent populations from across the latitudinal and climatic range of the species. 
This suggests that not only is raffinose important to cold hardiness but its accumulation may partially explain adaptive genetic differentiation among populations. Numerous other genes involved in carbohydrate metabolism were also identified. These include galactosyltransferases involved in protein modification in the endomembrane system, as well as three xyloglucan endotransglucosylases (XTHs, but also known as XETs). XTHs add xyloglucan moieties to the hemicellulose matrix of the cell wall, providing increased flexibility during cell expansion. One previously identified XTH was upregulated during autumn in Sitka spruce (Holliday et al. 2008), and SNP variation within one of these has been associated with adaptive variation in budset timing and cold hardiness (Holliday et al. 2010). Little is known about cell wall alterations during cold acclimation, but circumstantial evidence from studies of drought, a stress with similar biophysical effects (i.e. protoplast dehydration), suggests that enhanced flexibility of the cell wall may be important (Sasidharan et al. 2011; Cho et al. 2006). In addition to the XTH-like unigenes, we identified one expansin, a member of a gene family that also promotes cell wall expansion, possibly by modifying the carbohydrate matrix (Sasidharan et al. 2011). El Kayal et al. (2011) reported 65 cell wall-related genes as upregulated in buds through the middle of their autumn time course, including at least one expansin and XTH. However, as El Kayal et al. pointed out, cell wall processes related to cold hardiness in buds may be confounded with the formation of bud scales.
Three partially overlapping GO terms relevant to dormancy and cold acclimation-related signalling were overrepresented, namely ‘response to abiotic stimulus’ with 492 unigenes, ‘response to external stimulus’ with 127 unigenes and ‘response to extracellular stimulus’ with 59 unigenes. Among unigenes assigned to these terms were three genes involved in flowering time in Arabidopsis: SUPPRESSOR OF OVEREXPRESSION OF CONSTANS 1 (SOC1), which is required for flowering in Arabidopsis (but has no known role in flowering or dormancy in trees) (Lee et al. 2000); CENTER CITY, which affects flowering time in Arabidopsis by enhancing expression of FLOWERING LOCUS T (FT) (Imura et al. 2012); and FVE, which interacts with FLOWERING LOCUS C to affect vernalization (Ausin et al. 2004). These are of interest because the pathway governing photoperiodic flowering in annuals has been co-opted by woody perennials for signal transduction related to bud dormancy. Specifically, the Arabidopsis flowering-time pathway mediated by the CONSTANS (CO)/FT regulatory module also regulates seasonal dormancy in trees through the photoreceptor PHYTOCHROME A (Howe et al. 1996; Olsen et al. 1997b; Bohlenius et al. 2006). El Kayal et al. (2011) found that a SOC1-like gene was differentially expressed in late season white spruce vegetative buds and also more highly expressed in buds relative to other tissues, which may explain why this clone was among our Sitka spruce autumn unigenes.
Phytohormone-mediated signalling is a key feature of cold acclimation, and several genes involved in hormone metabolism and signalling were among the autumn unigenes, with most falling in the abiotic/external stimulus GO categories noted above. Phytohormones may play regulatory roles in cold signalling, particularly abscisic acid (ABA) and possibly auxin (Penfield 2008). ABA is a primary phytohormone involved in cold hardiness, the application of which can induce cold acclimation-related gene expression in the absence of the usual environmental cues (short days, chilling temperatures) (Smallwood and Bowles 2002). Several related genes were present in the unigene set, including a transcript similar to ABSCISIC ACID INSENSITIVE 1 (ABI1), a negative regulator of ABA signalling for which mis-expression alters normal vegetative bud development (Rohde et al. 2002), as well as a transcript similar to ABSCISIC ACID DEFICIENT 4 (ABA4), which is required for an intermediate step in ABA biosynthesis. The role of auxin in dormancy is not entirely clear and may differ between angiosperm and gymnosperm trees. Whereas auxin levels decrease in willow and birch under short days (Olsen et al. 1997a; Li et al. 2003), they increase in spruce (El Kayal et al. 2011). El Kayal et al. identified several auxin-related genes that were upregulated in white spruce buds under short days, including a PIN-like auxin carrier. Our results paralleled this, with the identification of an auxin influx transporter-like gene, AUX1, as well as a homolog of AXR1, which is involved in targeting of AUX1, but not the PIN auxin carriers, to the endoplasmic reticulum (Dharmasiri et al. 2006).
A diverse array of additional gene families possibly involved in abiotic stress-related signal transduction were also among the autumn unigene set. These include unigenes similar to calcium binding proteins in Arabidopsis—annexins, calreticulins, calmodulins, a calcium/calmodulin-regulated receptor-like kinase, the CBL-interacting protein kinase, SALT OVERLY SENSITIVE 2 and SYNAPTOTAGMIN 1. The latter was recently shown to be involved in calcium-dependent resealing of the plasma membrane following disruption by extracellular ice formation (Yamazaki et al. 2008). Finally, one DICER-like gene was identified. In addition to its role in posttranscriptional gene silencing (Ramachandran and Chen 2008), DICER mediates transcriptional gene silencing (Chan 2008). The role that epigenetics may play in adaptive variation along environmental clines is of interest in light of phenotypic studies in Norway spruce (Picea abies), which show that the maternal environment during either somatic or zygotic embryogenesis has profound adaptive consequences for the offspring.
A small number of transcription factors were identified within the overrepresented GO terms ‘response to abiotic stimulus’ and ‘nucleotide binding’ with 515 unigenes; however, no GO terms specific to transcription factors were overrepresented. Of particular interest were three unigenes similar to upstream elements in the Arabidopsis cold response pathway, namely INDUCER OF CBF EXPRESSION 1 (ICE1), INDUCER OF CBF EXPRESSION 2 (ICE2) and DEHYDRATION-RESPONSIVE ELEMENT BINDING PROTEIN 2 (DREB2). Both ICE1 and ICE2 are basic helix-loop-helix transcription factors that in Arabidopsis activate members of the DREB/C-repeat binding factor (CBF) subfamily of AP2 transcription factors in response to chilling temperatures. In turn, the CBFs activate downstream stress response genes (so-called COR genes). It is difficult to differentiate similar members of the CBF/DREB subfamily of AP2 transcription factors across such deep evolutionary time (i.e. between the conifers and angiosperms). While DREB2 is more commonly associated with drought stress (Liu et al. 1998), overexpression does provide some freezing tolerance, suggesting overlapping functions among the subfamily (Dai et al. 2007).
The autumn period is one of extensive transcriptional remodelling in temperate and boreal tree species. Although broad EST sequencing has been undertaken in North American spruce species, only a small fraction of these reads were derived from autumn tissue. We have shown that incorporating bud and needle tissue from this period results in a substantial increase in gene discovery rates, possibly due to the upregulation of genes specifically expressed during the transition to dormancy. Indeed, many of these genes have annotations that suggest they are involved in the transition to dormancy and associated cold acclimation. Incorporation of these sequences into existing assemblies for spruce will facilitate assembly of population genetic re-sequencing data that make use of next generation sequencing technologies and thus allow for a more complete picture of the genomic underpinnings of adaptation to climate and to annual climatic cycles.
We would like to thank Tristan Gillan for technical assistance with plant maintenance. This research was funded by Genome British Columbia and Genome Canada supporting the Treenomix project (grant to JB and SA), the SMarTForests project (grant to JB) and the AdapTree project (grant to SA) and by a University of British Columbia Graduate Fellowship and NSERC Postgraduate Scholarship to JH. JB has been supported, in part, by the Distinguished University Scholars program of the University of British Columbia.
Data archiving statement
High-quality EST reads were submitted to NCBI under accession numbers ES667072 to ES671893, FD740103 to FD748148, GH280265 to GH291091 and GT120725 to GT127710. The resulting contig builds described here have been submitted to the Transcriptome Shotgun Assembly at DDBJ/EMBL/GenBank under the accession GACG00000000.