Introduction

Heartwood formation is a unique feature of woody plants. Heartwood is important both for the life of a tree and for wood utilization. Its roles are considered to be optimizing sapwood volume, recycling nutrients, and providing durable mechanical support [1, 2]. In wood utilization, heartwood has a positive or negative effect on wood properties owing to the presence of extractives [3]. To use trees more efficiently as wood sources, it is crucial to understand the mechanism of heartwood formation so that the process can eventually be regulated. Heartwood formation has been studied from anatomical, chemical and biochemical perspectives [3–5]. However, why and how heartwood is formed remains largely elusive. It has been proposed that heartwood formation is an actively regulated physiological process [3, 5–7], and it is regarded as the final stage of wood differentiation [8]. The early stage is mature xylem formation, which includes cell division in the vascular cambium, cell expansion, cell wall thickening and lignification [1]. For this stage, expressed genes have been collected and changes in gene expression investigated on a large scale for several tree species belonging to Pinus [9–12], Populus [13–16], Eucalyptus [17–20] and Acacia [21]. In contrast, there have been only a few large-scale gene analyses of heartwood formation. Studies have been conducted on the broad-leaved trees Robinia pseudoacacia [22, 23] and Juglans nigra [24, 25], but there have been no reports on conifers. The sugi tree (Cryptomeria japonica D. Don) has sharply defined heartwood, and usually also a readily recognized transition zone (TZ) between sapwood and heartwood [26]. Thus, C. japonica is a good material for studying the events that occur during heartwood formation. To obtain clues to the mechanism of heartwood formation at the molecular level, we collected expressed sequence tags (ESTs) from the TZ in November, when heartwood formation is considered to be in progress. The ESTs were assembled and the resulting sequences were functionally categorized. Furthermore, the expression of genes selected as potentially involved in heartwood formation was quantified in various organs.

Materials and methods

Plant materials

A 19-year-old C. japonica tree that grew in the nursery of the Forestry and Forest Products Research Institute (Tsukuba) was felled on November 28, 2003. A trunk was excised from a height of 1.2 m above ground. A disk 60-mm thick was cut from the bottom end of the log. The transition zone (2-year annual rings) was isolated from the disk, cut into small pieces, frozen in liquid nitrogen, and stored at −80°C. To examine the gene expression, the following organs were collected from one individual: TZ and sapwood (June 5 and November 26, 2008), inner bark (June 25, 2008), leaf buds (May 8, 2008), needles (June 25, 2008), pollen (March 21, 2008), male and female strobili (February 8, 2008), and young cones (June 25, 2008).

Construction of a cDNA library

Total RNA was prepared from the TZ collected in November according to the method previously reported [27]. cDNA was synthesized from 1.4 μg of total RNA using the Creator SMART cDNA library construction kit (Clontech, California, USA). The cDNAs were cloned into the pDNR-LIB vector (Clontech) and transformed into Escherichia coli strain DH5α.

Sequencing and data analysis

Sequences of cDNA inserts were determined with an ABI 3730 DNA analyzer (Applied Biosystems, California, USA). Raw sequence data were processed by the ABI base caller with quality values. Low-quality sequences (quality score <20 at 750 bp) were discarded. Vector and adaptor sequences were trimmed. Trimmed sequences (≥100 bp) were assembled using Sequencher 4.1.2 (Gene Codes, Michigan, USA) with the parameters minimum overlap = 40 and minimum match = 95%. Sequences of ribosomal RNA (rRNA) and of chloroplast or mitochondrial DNA were identified by BLASTN searches against Populus rRNA sequences (accession numbers AF174629, AF206999, AF479118, AJ006440), the Arabidopsis mitochondrial genome (NC001284) and the Populus chloroplast genome (http://genome.ornl.gov/poplar_chloroplast/), and were removed. The remaining sequences were searched locally with the BLASTX program against the eukaryotic orthologous groups protein databases [28] (KOG, ftp://ftp.ncbi.nih.gov/pub/COG/KOG/) and against the Universal Protein Resource [29] release 13.3 (UniProt, http://www.uniprot.org/). The KOG comprises three databases containing orthologous proteins from at least three out of seven eukaryotic species, proteins from two species, and species-specific proteins. Sequences with an expectation (E) value of <10⁻⁵ were considered to have significant homology and were classified following the KOG functional classification. The EST sequences have been submitted to the DNA Data Bank of Japan (DDBJ) under accession numbers DC882454–DC883482.
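The significance filter described above can be illustrated with a minimal sketch. It assumes that the BLASTX hits were exported in the common 12-column tab-separated layout (query, subject, identity, ..., E value, bit score), which the text does not state. The function name and the sequence and KOG identifiers in the example are hypothetical.

```python
import csv
import io

E_VALUE_CUTOFF = 1e-5  # significance threshold stated in the text

def best_significant_hits(tabular_text, cutoff=E_VALUE_CUTOFF):
    """Return {query: (subject, evalue)} keeping, for each query,
    the hit with the lowest E value among hits below the cutoff."""
    best = {}
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        # Assumed layout: column 1 = query, column 2 = subject, column 11 = E value
        query, subject, evalue = row[0], row[1], float(row[10])
        if evalue >= cutoff:
            continue  # not significant under the stated threshold
        if query not in best or evalue < best[query][1]:
            best[query] = (subject, evalue)
    return best

# Hypothetical example rows (identifiers and scores are made up for illustration)
EXAMPLE = "\n".join([
    "Cluster0001\tKOG0001\t62.0\t150\t57\t0\t1\t450\t10\t159\t3e-40\t160",
    "Cluster0001\tKOG0002\t40.0\t90\t54\t0\t1\t270\t5\t94\t2e-06\t52",
    "Singleton0007\tKOG0003\t30.0\t60\t42\t0\t1\t180\t1\t60\t0.004\t35",
])

hits = best_significant_hits(EXAMPLE)
```

In this toy input, only the first query retains a hit; the second query is left unannotated because its single hit has an E value above the cutoff.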

Gene expression analysis

Total RNA was isolated from the TZ and sapwood according to the method previously reported [27]. From the inner bark, leaf buds, needles, male strobili, female strobili and young cones, RNA was prepared as described by Futamura et al. [30]. From pollen, RNA was extracted according to Sone et al. [31]. The RNA was treated with RQ1 RNase-free DNase (Promega, Wisconsin, USA) to remove contaminating DNA before reverse transcription. The quality of the RNA was evaluated with the Agilent 2100 bioanalyzer (Agilent Technologies, California, USA). First-strand cDNA was synthesized from 100 ng of RNA using an AffinityScript QPCR cDNA synthesis kit (Stratagene, California, USA) with a mixture of oligo(dT) and random primers according to the manufacturer’s instructions. The resultant cDNA solution was diluted 50-fold with water and used as a template. Real-time quantitative polymerase chain reaction (qPCR) was performed with an Mx3000P (Stratagene) using Brilliant II SYBR green QPCR master mix (Stratagene). The PCR mixture was composed of 5 μl of the diluted first-strand cDNA (equivalent to 0.5 ng of total RNA), 10 μl of 2× Brilliant II SYBR green QPCR master mix and 0.2 μM gene-specific primers in a total volume of 20 μl. The primer sequences are listed in Table 1. Amplification was performed with an initial polymerase activation step at 95°C for 10 min followed by 40 cycles of denaturation at 95°C for 30 s, annealing at 58°C for 1 min, and extension at 72°C for 30 s. To confirm the specific amplification of target genes, melting curves were obtained after the amplification. Correct amplification was further verified by agarose gel electrophoresis and by DNA sequencing of the PCR products. The relative quantity of target mRNA was normalized using the gene for a translation initiation factor [27] (BJ936692) as an internal standard. The mean values and SD were calculated from triplicate measurements.
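The template arithmetic and the normalization step can be sketched as follows. Two points are assumptions rather than details given in the text: the reverse-transcription reaction volume (20 μl is assumed here, which reproduces the stated 0.5 ng equivalent), and the quantification model (the comparative 2^(−ΔΔCt) calculation with an amplification efficiency of 2 is shown purely as one common way to normalize to an internal standard). The function names and the Ct values in the usage lines are hypothetical.

```python
def template_rna_equivalent_ng(rna_input_ng, rt_volume_ul, dilution_factor, template_volume_ul):
    """Amount of starting total RNA represented by the cDNA template in one PCR."""
    concentration = rna_input_ng / rt_volume_ul      # ng RNA equivalent per ul of cDNA
    diluted = concentration / dilution_factor        # after dilution with water
    return diluted * template_volume_ul

def relative_expression(ct_target, ct_reference, ct_target_calibrator, ct_reference_calibrator):
    """Comparative Ct calculation (assumed model, efficiency of 2).

    The target is normalized to the reference gene in the sample and in the
    calibrator, and the sample is expressed relative to the calibrator.
    """
    delta_sample = ct_target - ct_reference
    delta_calibrator = ct_target_calibrator - ct_reference_calibrator
    return 2 ** -(delta_sample - delta_calibrator)

# 100 ng RNA, assumed 20 ul reaction, 50-fold dilution, 5 ul template per PCR
template_ng = template_rna_equivalent_ng(100, 20, 50, 5)

# Hypothetical Ct values: the target crosses threshold one cycle earlier in the sample
fold_change = relative_expression(24.0, 20.0, 25.0, 20.0)
```

With these inputs, a one-cycle difference after normalization corresponds to a twofold difference in relative transcript level.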

Table 1 Sequences of primers used for gene expression analysis

Results

Collection of ESTs from the cDNA library of the TZ in November

In C. japonica, heartwood formation is considered to start in late summer and continue into winter on the basis of a cytological study [32]. In several previous studies, samples were collected in November to examine enzyme activities or expressed genes in relation to heartwood formation [8, 22, 23, 33–35]. Therefore, we constructed a cDNA library using RNA prepared from the TZ collected in November. Over 1000 clones were sequenced and a total of 1029 ESTs were generated with an average length of 575 bp. After sequence assembly, 641 ESTs were found to be singletons, while the other 388 ESTs were assembled into 103 clusters containing from 2 to 44 ESTs.

Functional classification of ESTs

Sequences originating from chloroplast or mitochondrial genomes and those of rDNA were removed from the 744 unique sequences (singletons and clusters). The remaining 676 sequences were searched against the KOG using the BLASTX program. The sequences that showed significant similarity (an E value of <10⁻⁵) to those in the database were annotated and assigned KOG functional classes. Annotation was given to 284 sequences. The sequences with an E value of ≥10⁻⁵, those assigned to “function unknown” or “unnamed proteins”, and those that were not assigned were further searched against UniProt. As a result, seven additional sequences were annotated. Finally, 291 sequences were annotated to known sequences with putative functions, 120 were similar to known genes whose functions were unknown, and 265 had no similarity to sequences in the databases. The 291 annotated and 120 unknown-function sequences were categorized according to the KOG functional classification and grouped into 22 categories (Fig. 1).
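The bookkeeping behind these counts can be checked with a short calculation. The sketch below uses only figures reported in the text; the variable and function names are hypothetical.

```python
# Counts as reported in the text
TOTAL_ESTS = 1029
SINGLETONS = 641
CLUSTERS = 103
SEARCHED = 676            # unique sequences left after organelle and rRNA removal
KOG_ANNOTATED = 284
UNIPROT_ANNOTATED = 7
UNKNOWN_FUNCTION = 120
NO_SIMILARITY = 265

def summarize():
    """Derive the intermediate totals implied by the reported counts."""
    unique = SINGLETONS + CLUSTERS
    annotated = KOG_ANNOTATED + UNIPROT_ANNOTATED
    return {
        "ests_in_clusters": TOTAL_ESTS - SINGLETONS,
        "unique_sequences": unique,
        "removed_organelle_or_rrna": unique - SEARCHED,
        "annotated": annotated,
        "classified": annotated + UNKNOWN_FUNCTION,
        "accounted_for": annotated + UNKNOWN_FUNCTION + NO_SIMILARITY,
        "percent_no_similarity": round(100 * NO_SIMILARITY / SEARCHED, 1),
    }

summary = summarize()
```

The derived totals agree with the text: 744 unique sequences, 68 removed as organelle or rRNA sequences, 411 classified sequences, and all 676 searched sequences accounted for.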

Fig. 1

Functional classification of unique transcripts in the transition zone in November. A total of 411 non-redundant sequences were assigned to the KOG functional categories. Designations of functional categories: A, RNA processing and modification; B, chromatin structure and dynamics; C, energy production and conversion; D, cell cycle control, cell division, chromosome partitioning; E, amino acid transport and metabolism; F, nucleotide transport and metabolism; G, carbohydrate transport and metabolism; H, coenzyme transport and metabolism; I, lipid transport and metabolism; J, translation, ribosomal structure and biogenesis; K, transcription; L, replication, recombination and repair; M, cell wall/membrane/envelope biogenesis; O, posttranslational modification, protein turnover, chaperones; P, inorganic ion transport and metabolism; Q, secondary metabolite biosynthesis, transport and catabolism; T, signal transduction mechanisms; U, intracellular trafficking, secretion and vesicular transport; V, defense mechanisms; Z, cytoskeleton; R, general function prediction only; S, function unknown

Sequences with annotation are listed in Table 2. The number of predicted proteins was smaller than the number of sequences because sequences annotated as the same protein were represented by the sequence with the lowest E value. “Posttranslational modification, protein turnover, chaperones (category symbol O)” was the largest category with putative function, and included 52 sequences encoding 33 different proteins (Table 2). A sequence encoding cytosolic thioredoxin, which reduces the disulfide bonds of target proteins and contributes to cell redox homeostasis, was the most abundant in this category. The second largest category was “translation, ribosomal structure and biogenesis (J)”, which consisted of 50 sequences encoding 42 independent proteins. Forty of the 50 sequences encoded various kinds of ribosomal proteins. Excluding “general function prediction only (R)”, the next largest category was “intracellular trafficking, secretion, and vesicular transport (U)”, which contained 22 sequences. This was followed by “signal transduction mechanisms (T)”, “lipid transport and metabolism (I)”, “transcription (K)”, “RNA processing and modification (A)”, and “carbohydrate transport and metabolism (G)”. The remaining categories contained fewer than 10 sequences.

Table 2 Functional classification of genes expressed in TZ of C. japonica in November

Clustering of ESTs revealed highly expressed transcripts. Excluding rRNA encoded by nuclear, chloroplast or mitochondrial genomes, clusters containing five or more ESTs are listed in Table 3. The most abundant transcript encodes an oleosin, a protein that constitutes the surface layer of oil bodies, which are subcellular particles [36]. Clusters encoding a lipid transfer protein (LTP), a function-unknown protein, a dormancy-associated protein, Bet v 1 allergen and a cytosolic thioredoxin each included seven ESTs. Dehydrins were encoded by Cluster0014, containing six ESTs, and Cluster0019, containing five ESTs.

Table 3 Abundant transcripts found in the transition zone of C. japonica in November

Expression patterns of selected genes in various organs

It has been reported that the activities of several enzymes increase in the TZ chiefly in the dormant season [33–35, 37]. Glycolysis and the TCA cycle are fundamental metabolic pathways that produce energy and provide materials for secondary metabolites. Based on these findings, we selected genes for enzymes expected to be associated with heartwood formation from the ESTs collected in the present study, namely genes for sucrose-phosphate synthase, invertase (β-fructofuranosidase), fructose-bisphosphate aldolase, glyceraldehyde-3-phosphate dehydrogenase, phosphoglycerate mutase, a pyruvate dehydrogenase E1 β subunit, glucose-6-phosphate dehydrogenase, GDSL-motif lipase, cysteine protease, methionine adenosyltransferase (S-adenosylmethionine synthetase), glutathione transferase and 3-hydroxy-3-methylglutaryl-CoA reductase (HMGR). In addition, eight abundant transcripts, each represented by more than five ESTs, were included in the examination. These genes encode an oleosin, an LTP, a function-unknown protein encoded by Cluster0010, a dormancy-associated protein, Bet v 1 allergen, a cytosolic thioredoxin, a dehydrin 4 encoded by Cluster0014, and a splicing factor 3B subunit (Table 3). To estimate whether these enzymes and proteins participate in heartwood formation, the expression levels of the genes in the TZ in the dormant season (November) were compared with those in the TZ in early summer (June) and in several other organs.

Among the 20 selected genes, the expression levels of nine genes were significantly higher in the TZ in November than in June when the mean values were compared by Student’s t test at P < 0.05. The genes were those encoding invertase, glyceraldehyde-3-phosphate dehydrogenase, phosphoglycerate mutase, methionine adenosyltransferase, glutathione transferase, the LTP, Bet v 1 allergen, the dehydrin 4 and the function-unknown protein (Fig. 2a–i). Conversely, the expression of genes for fructose-bisphosphate aldolase, the pyruvate dehydrogenase E1 β subunit, sucrose-phosphate synthase, cysteine protease, the oleosin and the dormancy-associated protein was lower in November (P < 0.05) (Fig. 2j–o). The expression levels of genes for glucose-6-phosphate dehydrogenase, GDSL-motif lipase, HMGR, the cytosolic thioredoxin and the splicing factor 3B subunit remained unchanged between November and June (Fig. 2p–t).
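The comparison of means can be sketched as a pooled-variance Student’s t test on two sets of triplicate values. The text does not say whether the test was one-sided or two-sided; a two-sided test is assumed here, using the tabulated critical value for 4 degrees of freedom. The function names and the expression values in the usage lines are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

# Two-sided critical t value at P = 0.05 for 4 degrees of freedom (3 + 3 - 2)
T_CRITICAL_DF4 = 2.776

def student_t(a, b):
    """Pooled-variance (equal-variance) two-sample t statistic."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled * (1 / na + 1 / nb))

def differs_at_p05(a, b):
    """True if two triplicate sets differ at P < 0.05 (two-sided, df = 4)."""
    if len(a) != 3 or len(b) != 3:
        raise ValueError("critical value is hard-coded for triplicates")
    return abs(student_t(a, b)) > T_CRITICAL_DF4

# Hypothetical relative expression values for one gene
november = [2.1, 1.9, 2.0]
june = [1.0, 1.1, 0.9]

t_value = student_t(november, june)
significant = differs_at_p05(november, june)
```

For other replicate numbers, the critical value would need to be replaced by the one for the corresponding degrees of freedom.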

Fig. 2

Expression patterns of selected genes in the various organs of C. japonica. The values were normalized to the transcript levels of a gene for the translation initiation factor. White bars represent the relative gene expression levels in the TZ in November. Gray bars show those in the other organs. Error bars indicate the SD of three technical replicates. Expression levels are expressed as a ratio relative to the level in TZ6, except for the Bet v 1 allergen gene because its expression level in TZ6 was below the limit of detection. TZ6, transition zone (June); TZ11, transition zone (November); SW6, sapwood (June); SW11, sapwood (November); Bark, inner bark (June); Bud, leaf buds (May); Needle, needles (May); Male, male strobili (February); Female, female strobili (February); Corn, young cones (June); Pollen, pollen (March)

Discussion

To improve understanding of the heartwood formation process, we collected ESTs from the TZ of C. japonica in the dormant season, when heartwood formation is considered to be underway. The ESTs were assembled into non-redundant sequences and annotated according to the KOG functional classification. Sequences in the two most abundant categories, “posttranslational modification, protein turnover, chaperones (O)” and “translation, ribosomal structure and biogenesis (J)”, accounted for a quarter of the classified sequences (Fig. 1). In ESTs previously collected from different organs of woody plants and functionally classified using the KOG [30, 38, 39], categories “O” and “J” ranked in the top four in terms of abundance. Thus, it appears that the translational regulation of protein synthesis is maintained in the TZ in November as in other organs. Meanwhile, sequences that did not match known sequences in the KOG and UniProt databases accounted for 39% of all unique sequences. This proportion is higher than that for C. japonica male strobili (30%) [30], suggesting that genes specific to the TZ may be expressed.

We examined the expression levels of the 20 selected genes, focusing on the differences between June and November in the TZ (Fig. 2). Nine genes were up-regulated in the TZ in November compared with June. The expression level of an invertase gene was approximately twice as high in November as in June (Fig. 2a). Among the three types of invertase (vacuolar, cell wall bound, and neutral), which have been distinguished based on their subcellular localization, pH optima, etc. [40], the deduced invertase in this study was assumed to be a vacuolar type. Invertases hydrolyze sucrose into glucose and fructose, and vacuolar types are involved in supplying these hexoses, not only as nutrients but also for osmoregulation [40]. The resulting hexoses in the TZ may function as osmoregulators in addition to entering glycolysis and/or the pentose phosphate cycle. Increased activity of a neutral invertase has been reported in the TZ of Robinia pseudoacacia in the winter season [34]. Two genes for enzymes in the glycolytic pathway, glyceraldehyde-3-phosphate dehydrogenase and phosphoglycerate mutase, were also up-regulated in November (Fig. 2b, c). The expression of these genes may contribute to the production of energy for heartwood formation and to the supply of material for the biosynthesis of extractives through the catabolism of sugars. In contrast, the expression level of a gene for fructose-bisphosphate aldolase, which catalyzes an earlier step in the glycolytic pathway, was lower in the TZ in November than in June (Fig. 2j).

Methionine adenosyltransferase is responsible for the synthesis of S-adenosylmethionine (SAM). Secondary metabolites synthesized during heartwood formation often possess methyl groups in their structures. In methylation reactions, SAM is utilized as a primary methyl donor. However, the major heartwood extractives of C. japonica, norlignans and diterpenes, are rarely methylated [41]. SAM is also the precursor for the synthesis of ethylene. In Pinus radiata, ethylene was produced in the TZ, and its quantity was larger than that from sapwood during the winter period [42]. In addition, increased ethylene production was reported for the TZ of Eucalyptus tereticornis [43] and Juglans nigra [44] early in the dormant season. Thus, increased demands for SAM might lead to the expression of a gene for methionine adenosyltransferase in the TZ in November (Fig. 2d).

A gene for a tau-class glutathione transferase (GST) was up-regulated in November (Fig. 2e). The tau-class GSTs are specific to plants and are involved in several biological processes, such as the detoxification of xenobiotics, reduction of oxidative stress, and regulation of flavonoid biosynthesis and trafficking [45–47]. A tau-class GST from barley (Hordeum vulgare) was induced in leaves during senescence and in response to low temperature [48]. In maize, a tau-class GST serves as a carrier protein for the vacuolar sequestration of anthocyanins in kernels [49, 50]. Thus, it is possible that the GST in the TZ functions to eliminate oxidative stress and/or transport secondary metabolites during heartwood formation.

Lipid transfer proteins have been isolated from a variety of plants. LTPs are capable of transferring lipids between membranes in vitro [51]. The physiological roles of LTPs are obscure, but they appear to be involved in antimicrobial defense, cuticle biosynthesis, cell wall loosening, anther development, etc., depending on the isoform [51]. The expression of a gene for an LTP increased in the TZ in November compared with June (Fig. 2f). The LTP examined in this study shows no amino acid sequence similarity to an LTP isolated from C. japonica pollen [52], but shares high sequence similarity (60% identity) with an LTP isolated from Cycas revoluta seeds, which was reported to possess weak antimicrobial activity [53]. The role of LTPs in the TZ is still unclear. The significantly higher level of gene expression in the sapwood in November implies an important role of the LTP in that tissue.

Bet v 1 is the major pollen allergen of Betula verrucosa and belongs to the plant pathogenesis-related protein 10 (PR-10) family. PR-10 proteins have been found in many plant species, but their physiological roles are still obscure [54]. It has been reported that Bet v 1 is able to bind several biological molecules such as fatty acids, flavonoids and cytokinins [55], and Bet v 1 appears to function as a carrier of brassinosteroids [56]. The Bet v 1-like protein found in this study has significant amino acid sequence similarity (53–55% identity) to PR-10 proteins from Pinus monticola needles, which accumulated not only in response to wounding but also during winter [57]. It could be speculated that the Bet v 1-like protein plays a functional role in the defense or developmental regulation of the TZ.

Dehydrins are known to be produced in many plant species in response to environmental stresses [58, 59]. Dehydrin proteins or transcripts have been found in buds, leaves, inner bark, fruits and xylem [59]. Dehydrins are generally thought to act as chaperones or cryoprotectants. The expression of a dehydrin 4 gene increased in the TZ in November compared with June (Fig. 2h). Although the functions of dehydrins in the TZ remain unclear, they may play a role in adaptation to winter cold and/or lower moisture content.

The predicted protein encoded by Cluster0010 showed little similarity to proteins with estimated functions and was therefore referred to as a function-unknown protein. The protein has a low degree of similarity to dehydrins and possesses the lysine-rich 15-amino-acid sequence (the K-segment) that is characteristic of dehydrins [58]. The expression pattern of this gene in the various organs resembled that of the dehydrin 4 encoded by Cluster0014 (Fig. 2h), except that the expression level in both the TZ and sapwood in June was extremely low (Fig. 2i). From these findings, we assume that Cluster0010 encodes a dehydrin-related protein and that the protein could have functions similar to those of the dehydrins mentioned above, especially in the TZ in the winter season.

In the xylem of C. japonica, secondary metabolites called norlignans are produced especially during heartwood formation [60]. Previously, we isolated ESTs for enzymes, which are presumed to be involved in biosynthesis of norlignans, from the drying sapwood of C. japonica [27]. These ESTs, however, were not found in the present study. One possible reason is that the number of ESTs collected from the TZ was insufficient for including transcripts from the candidate genes for norlignan biosynthesis. The cDNA library in this study was not normalized, while the previous library was constructed by suppression subtractive hybridization, and thus it contained predominantly expressed genes in the drying sapwood accumulating a norlignan. Another possibility is that the genes isolated in the previous study are actually not involved in the biosynthesis of norlignans. This seems unlikely because very few sequences were contained in the “secondary metabolite biosynthesis, transport and catabolism (Q)” category (Fig. 1; Table 2).

Heartwood formation has been considered a form of programmed cell death (PCD) [7, 8]. Proteases are key factors in PCD [61]. PCD in animal cells is known to be executed by caspases, which belong to a class of specific cysteine proteases. In plants, functionally similar proteases, including papain-type cysteine proteases, are involved in some cases of PCD [62, 63]. We found an EST predicted to encode a papain-type cysteine protease. The predicted protease showed significant sequence similarity to those isolated from leaves and petals undergoing PCD during senescence [64–67], suggesting that the protease may be involved in PCD in the TZ, as heartwood formation is thought to be a senescence process. The expression level of the corresponding gene in the TZ was about 5 times higher in June than in November (Fig. 2m). Taking into consideration this expression pattern and the lack of putative norlignan biosynthetic genes, the gene expression associated with heartwood formation might be initiated before the onset of the cytological changes observed in late summer [32], and the expression of genes for enzymes related to heartwood formation might have ceased by November.

To further understand the mechanism of heartwood formation at a molecular level, the annual changes in gene expression in the TZ must be investigated with simultaneous observation of cytological changes in parenchyma cells, and analysis of extractives in the TZ.

Conclusions

We collected 1029 ESTs from a cDNA library constructed from the TZ of C. japonica in November. The ESTs were assembled into 744 unique sequences. Putative functions were assigned to 291 nuclear-encoded sequences, which were grouped into 21 functional categories. We also revealed that the expression levels of nine genes, encoding two glycolytic enzymes, invertase, methionine adenosyltransferase, glutathione transferase, an LTP, Bet v 1 allergen, a dehydrin and a function-unknown protein, were higher in November than in June in the TZ. These genes may play roles in maintaining TZ function and/or forming heartwood. This study has provided the first large-scale EST information from the TZ of a conifer, which will be useful for understanding the physiological processes in the TZ at the molecular level.