Background

The transition to flowering is arguably the most critical switch in a plant’s life cycle [1]. It is triggered by a combination of environmental cues and endogenous signals [2]. The appropriate timing of the transition is crucial for reproductive success, and hence is a major production issue for ornamental plant cultivators [3]. In the model plant Arabidopsis thaliana, floral transition is controlled by an intricate regulatory network comprising six distinct pathways, namely the photoperiod, the autonomous, the vernalization, the gibberellin (GA), the ambient temperature and the age pathways [2, 4]. The outputs of the various pathways are integrated by the products of the floral integrator genes FLOWERING LOCUS T (FT) and SUPPRESSOR OF OVEREXPRESSION OF CONSTANS (SOC1), which promote flowering by inducing the expression of the floral meristem identity genes LEAFY (LFY) and APETALA1(AP1) [46]. The gene CO (encoding a BBX transcription factor) is a key component of the photoperiod pathway; it promotes flowering by up-regulating FT [79]. The autonomous and vernalization pathways activate flowering by down-regulating the floral repressor gene FLC, which encodes a MADS-box transcription factor acting to repress the floral integrators [1, 10]. An increase in GA synthesis promotes flowering, especially in plants experiencing short days (SD); early flowering results from the activation of SOC1 [4, 11]. Under an otherwise non-inductive photoperiod, the flowering of A. thaliana is also accelerated by a rise in the ambient temperature [12]. As the plant ages, SQUAMOSA PROMOTER BINDING LIKE (SPL) transcription factors promote flowering by inducing the expression of the floral integrators LFY, FRUITFULL (FUL) and SOC1 [4]. In addition to photoperiod and ambient temperature, flowering time can be affected by the content of trehalose-6-phosphate (T6P), acting through the sugar signaling pathway [13, 14].

Chrysanthemum (Chrysanthemum morifolium) is a popular ornamental plant worldwide [15, 16]. Most Chinese commercial cultivars will only flower if exposed to SD, so achieving year-round production requires flowering to be induced by controlling the daylength, which is a costly and energy-consuming measure [17, 18]. Even with precise control over photoperiod, premature flowering can be induced by higher than optimal ambient temperature, which is often experienced during the summer months [19]. The Japanese cultivar ‘Yuuka’ flowers naturally over the period July-September, filling a gap in the chrysanthemum cut flower market, its flowering is accelerated under SD comparing with the natural conditions [20, 21].

Some transcriptomic analysis relevant to flowering time regulation has been attempted in chrysanthemum and its close relatives. Some features of the chrysanthemum transcriptome related to photoperiod-responsive floral development have been explored using cDNA-AFLP technique [18]. In the cultivar ‘Fenditan’, a number of genes involved in the photoperiod pathway and flower organ determination have been identified using Illumina sequencing technology [22]. Meanwhile, an analysis of Chrysanthemum lavandulifolium (Fisch. ex Trautv) succeeded in identifying 211 flowering-related genes, along with a large number of genes which were differentially transcribed between immature plants and those on the cusp of flowering. Since the constitutive expression in C. morifolium of CsFTL3, a FT-like gene harbored by Chrysanthemum seticuspe, resulted in acceleration in flowering under non-inductive conditions, this gene has been considered to be an important regulator of the photoperiod flowering pathway [19]. The product of CsAFT, a gene encoding an anti-florigenic FT/TFL1 family protein, acts as a systemic floral inhibitor, determining flowering time under SD [23]. The suppression of CmBBX24 (encoding a BBX transcription factor) causes plants to flower earlier than expected [24]. Other than these few examples, the molecular basis of the response to photoperiod in summer-flowering chrysanthemums has not been deduced in any detail.

Here, a comparison between the RNA-Seq acquired transcriptomes of ‘Yuuka’ plants exposed to contrasting photoperiods was used to characterize the transcriptome associated with floral transition, with a view to clarifying the molecular basis of flowering time, which would be helpful to regulate flowering in chrysanthemum to achieve year-round production and the identified important regulatory genes will be useful for flowering-related transgenic breeding.

Results

Transcriptome sequencing and read assembly

To generate a broad survey of genes involved in floral transition induced by different photoperiod, six mRNA libraries were constructed respectively from six different samples of leaves and shoots apical meristem, and denoted CK, S1, S2, L1, L2 and L3 (Fig. 1). The total number of raw reads obtained from the six libraries (CK, S1, S2, L1, L2 and L3) was, respectively, 55,342,446, 58,118,768, 56,587,092, 54,554,772, 55,128,750 and 54,538,450. After filtering, approximately 316 × 106 clean reads (28.42 × 109 nt) were generated. The raw sequence data has been deposited in the NCBI Sequence Read Archive database under accession number SRP076366. The average Q20 and GC percentages were, respectively, 97.72 and 42.86 % (Table 1). After assembly, 115,300 non-redundant Unigenes (57,718 distinct clusters and 57,582 distinct singletons) were recognized; these were of average length 1030 nt and associated with an N50 (genome splicing quality) of 1588 nt (Additional file 1: Table S1).

Fig. 1
figure 1

The source of the mRNA prepared from ‘Yuuka’ plants grown under either SD or LD. CK: plants sampled prior to their exposure to a differential photoperiod; S1, S2: plants grown under SD; L1-L3: plants grown under LD

Table 1 Summary of sequencing data generated

Gene annotation and functional classification

To acquire expression and functional annotation of the assembled unigenes, the assembled Unigene sequences were aligned against the NR, Swiss-Prot, KEGG and COG protein databases and the NT nucleotide database. The number of hits with the NR database was 67,964 (Table 2), and the details of E-value distribution, similarity distribution and species distribution form the result of NR annotation were presented in Fig. 2; while the equivalent numbers for the Swiss-Prot, KEGG, COG and NT databases were, respectively, 46,591, 26,000, 42,512 and 50,337. When subjected to KEGG pathway analysis, 127 pathways were identified, among which the most frequently occurring 30 are detailed in Table 3: the major ones identified were ‘metabolism’, ‘secondary metabolites’, ‘plant-pathogen interaction’, ‘hormone signal transduction’ and ‘spliceosome’. Among the 25 COG categories, the cluster for ‘general function prediction only’ (8974) represented the largest group, followed by ‘transcription’ (4760) and ‘replication, recombination and repair’ (4639). The ‘RNA processing and modification’ (398), ‘Extracellular structure’ (13) and ‘Nuclear structure’ (9) represented the smallest groups (Fig. 3). On the basis of gene ontology (GO), ~50,000 unigenes could be assigned a GO category (Fig. 4). The terms ‘cellular process’, ‘cell’ and ‘catalytic activity’ were dominant in each of these categories.

Table 2 Annotation of unigene sequences
Fig. 2
figure 2

BLAST outputs of Unigene sequences against the NR database. a The distribution of E-values; b the distribution of similarity scores; c the mix of species contributing sequences to the analysis

Table 3 KEGG annotation of the set of Unigenes
Fig. 3
figure 3

Clusters of orthologous groups (COG) classification in Chrysanthemum morifolium ‘Yuuka’. These 25,994 sequences have a COG classification within the 25 categories

Fig. 4
figure 4

GO classification of the unigenes

Identification of TFs and other genes involved in flowering time control

According to unigene annotation, 2000 of the sequences were identified as encoding a member of one of 45 TF families (Additional file 2: Table S2), among which the most frequently occurring were WRKY, followed by G2-like, C3H and MYB. With respect to genes involved in the determination of flowering time, 46 homologs of A. thaliana genes were uncovered (Additional file 3: Table S3). Among those associated with the photoperiod pathway were three homologs of ELF3 (EARLY FLOWERING 3), one of ELF4, four of PIF3, three of FKF1, six of LHY (LATE ELONGATED HYPOCOTYL), six of PRRs (PSEUDO-RESPONSE REGULATORs), one of CCA1 (CIRCADIAN CLOCK ASSOCIATED 1), 17 of GI and 25 of CO. Other flowering time pathways were also well represented, as for example homologs of GA20ox, EMF2, GAI (GIBBERELLIN-INSENSITIVE), GID1 (GA INSENSITIVE DWARF1) and SPY in GA pathway, SVP (SHORT VEGETATIVE PHASE), FCA, FVE, FY, FPA and HOS1 in ambient temperature pathway; LD in autonomous pathway; SPLs (SQUAMOSA PROMOTER BINDING-LIKEs) and JMJ14 in age pathway; FLC (FLOWERING LOCUS C), VIN1 (VERNALIZATION1) and VIN3 in vernalization pathway. A number of flowering integrators were also identified. The details of 46 homologs of flowering time genes were presented in (Additional file 3: Table S3).

The effect of photoperiod on the transcription of floral initiation genes

Genes involved in the determination of photoperiod-induced flowering time in ‘Yuuka’ were identified by initially calculating the transcript abundance of all Unigenes of the above six libraries successively, based on the number of reads per kilobase per million mapped reads (FPKM) method [25]. Subsequently, each Unigene of the five pairwise contrasts CK vs S1 (CKS1), CK vs S2 (CKS2), CK vs L1 (CKL1), CK vs L2 (CKL2) and CK vs L3 (CKL3) was scanned for significant differential transcription, adopting a false discovery rate (FDR) threshold of 0.001 and a |log2ratio| threshold of 1. The resulting set of differentially transcribed genes (DTGs) is given in Additional file 4: Table S4, Additional file 5: Table S5, Additional file 6: Table S6, Additional file 7: Table S7, Additional file 8: Table S8, Additional file 9: Table S9, Additional file 10: Table S10. A greater number of genes were down-regulated in these contrasts than were up-regulated (Fig. 5). More DTGs were identified in CK vs S1 than in CK vs L1, indicating that the response to SD in terms of floral induction and development was more substantial than that to LD; A total of 37 and 38 Unigenes involved the photoperiod pathway with differential transcript abundance were also identified in contrasts L1 vs S1and L2 vs S2 (Additional file 11: Table S11). The details of the key DTGs identified are presented in Tables 4, 5 and Additional file 12: Table S12.

Fig. 5
figure 5

DTGs identified in pairwise comparisons between libraries

Table 4 Transcriptional behavior of genes involved in the photoperiod pathway
Table 5 Transcriptional behavior of genes involved in the GA, T6P and sucrose signaling pathways

DTGs involved in the response to floral induction

Many of the DTGs were associated with one of the photoperiod, GA, T6P or sugar signaling pathways (Tables 4 and 5). Unigene11703_All, 38196_All and 29301_All, which are homologs of A. thaliana GI (a key component of the photoperiod pathway) [26, 27], were all up-regulated under both LD and SD, while the abundance of a fourth GI homolog’s (Unigene18727_All) transcript was increased only under LD. Three CO-like homologs (Unigene11436_All, 34062_All and 11502_All) were up-regulated under both LD and SD, CL7046. Contig5_All was only up-regulated under SD and CL11101.Contig1_All under LD, while two additional (Unigene14733_All and CL201.Contig1_All) were down-regulated under both photoperiods. The transcript abundance of one GA20ox homolog (CL9282.Contig2_All) was enhanced under both LD and SD. EMF1 (EMBRYONIC FLOWER 1) and SPY (SPINDLY) were represented by, respectively, CL6401.Contig2_All and CL4303.Contig1_All; the transcript of both of these was less abundant under SD than in CK, while an EMF2 homolog (CL1282.Contig10_All) was induced by LD. Meanwhile, under LD, Unigene28531_All, a PHYB (PHYTOCHROME B) homolog, along with a sucrose synthase homolog (CL2855.Contig4_All) and seven T6P synthase homologs (Unigene10839_All, 35642_All, 10840_All, CL6845.Contig1_All, 24962_All, CL1264.Contig2_All and CL2780.Contig4_All) were all up-regulated. The implication was that the photoperiod and the GA pathways both represent the primary route by which flowering in ‘Yuuka’ is controlled under SD, while under LD, it is the T6P and sucrose signaling pathways which accelerate flowering and the photoperiod pathway which blocks it. No DTG related to other known flowering pathways including ambient temperature, aging, autonomous and vernalization pathway was found. A proposed model for the regulation of floral induction in ‘Yuuka’ was presented in Fig. 6.

Fig. 6
figure 6

A proposed model for the regulation of floral induction in summer-flowering ‘Yuuka’, taking into account the ambient photoperiod

The differential transcription of TF genes

Most of the key regulatory genes involved in floral induction encode a TF. Members of the bHLH, MYB, C3H, BBX, MADS, GATA and WRKY TF families were among the DTGs associated with floral induction in ‘Yuuka’ (Additional file 12: Table S12). Of the bHLH DTGs, 19 encoded Phytochrome-Interacting Factors (PIFs) proteins – two of which (CL10741.Contig1_All and Unigene8480_All) were only up-regulated under SD, while LD induced three other members of this family (CL3244.Contig1_All, Unigene 22647_All and 23299_All) and repressed one (CL11069.Contig3_All); the other 13 were all induced under both photoperiods. With respect to the BBX family, the transcript abundance of CL11101.Contig1_All was increased under LD, and both CL7046.Contig5_All and Unigene11562_All showed increased transcript abundance only under SD. Of the MYB DTGs, three were down- and one was up-regulated by both photoperiod treatments; in four, transcript abundance was reduced under SD, and in another two, it was increased under LD. Certain members of the MADS family (Unigene15098_All and 22350_All) were expressed during the early stages of floral induction, while other members (Unigene12477_All, 36671_All, 56025_All, 8889_All, 33409_All and CL4112.Contig3_All) were expressed throughout floral induction and differentiation.

Quantitative real time PCR (qRT-PCR) validation of differential transcription

To further verify the genes transcript profiles obtained from Illumina RNA-Seq results, a selection of 8 DTGs for their key roles in flowering time control was used for validation by qRT-PCR: the chosen genes were homologs of FLAVINBINDING, KELCH REPEAT, F-BOX 1(FKF1), COL9, COL16, MYB, GATA, GA20ox, APETALA 1(AP1) and AGAMOUS-LIKE 8 (AGL8). The outcome in each case was consistent with the RNA-Seq assay (Fig. 7).

Fig. 7
figure 7

qRT-PCR validation of eight DTGs selected from the RNA-Seq experiment. (a, b) S1 samples; (c, d) L3 samples; (eh) S2 samples

Discussion

The transcriptome of summer-flowering chrysanthemum

The output of the RNA-Seq-acquired transcriptome of ‘Yuuka’ was a set of some 316 million clean reads, which covered about 28.42 × 109 nt of sequence; the sequences resolved into >115,000 Unigenes, more than 60 % of which could be assigned a putative function. An equivalent transcriptome acquisition program in C. morifolium realized around 91,000 unigenes sequences, of which only ~47 % could be assigned a putative functions [22], while in C. lavandulifolium, nearly 109,000 unigenes were assembled, about 53 % of which were annotated [28]. So far, the total number of Dendranthema spp. unigene sequences deposited in the NCBI EST database is a mere 7371, which means that the present data have provided a >1500 % increase in the representation of the transcriptome of this economically valuable ornamental species.

Regulatory networks controlling the switch to reproductive growth in ‘Yuuka’

Though the differential transcripts represent both developmental as well as condition-dependent differences at the present could not be distinguished, those genes involved in the known flowering pathway and their othologs’ reported to function in flowering time, were further chosed to analyze in detail. Based on the framework of floral induction established in A. thaliana, six regulatory pathways were implicated in the response of ‘Yuuka’ to variation in the photoperiod. In many plants, the photoperiod is the single most important environmental cue affecting the flowering time [6]. The light receptors required for its sensing are the phytochromes (PHYs) and cryptochromes (CRYs), along with phototropin (PHOT) [29]. The transcriptome of ‘Yuuka’ harbored homologs of PHYA, PHYB, PHYC, PHYE, CRY1 and CRY2, but none related to PHOT. Other genes related to the interaction of the photoperiod and flowering time were also among the DTGs, notably ELFs, PIFs, FKF1, LHY, PRRs, CCA1, GI and CO. One FT homolog (Unigene23925_All) was up-regulated under LD, but under SD the otholog of CsFTL3 was not, perhaps because ‘Yuuka’ may be a quantitative rather than a qualitative SD plant. In C. lavandulifolium, genes related to the photoperiod, vernalization, GA and autonomous pathways have all been implicated in the determination of flowering time [28], but this species flowers in the fall (rather than in the summer as ‘Yuuka’ does), so there are probably major differences in the floral switch mechanisms utilized by these members of the chrysanthemum family. In A. thaliana, the gene TPS1 (encoding a T6P synthase 1) is an important controller of the floral transition [13]. Here, three TPS1 homologs (CL1264.Contig2_All, CL2780.Contig4_All and Unigene24962_All) were all up-regulated under LD, implying enhanced activity in the T6P signaling pathway. Sucrose not only acts as a source of energy, but also has a role in signaling through its regulation of the SPLs [3035]. Here, the transcript abundance of the Unigene CL2855.Contig4_All, a homolog of a gene encoding sucrose synthase, was increased under LD. The conclusion was that both the T6P and sugar sensing/signalling pathways are likely important for the regulation of flowering in ‘Yuuka’ under LD.

The regulatory networks underlying the control of flowering time depend on the photoperiod regime

The photoperiod which induces the switch to reproductive growth in summer-flowering chrysanthemum varieties is longer than that required by conventional varieties [36]. By analysing the transcriptome of ‘Yuuka’ plants exposed to a contrasting photoperiods, it was possible to identify which genes responded to this major environmental cue. Both under SD and LD conditions, a number of the DTGs were found to belong to the photoperiod pathway, some of which (for example a homolog of PHYB, which was up-regulated only under LD) responded in a photoperiod-specific manner. The gene CO (syn. BBX1) was one of the first floral induction genes to be isolated in A. thaliana [37, 38], although in the meantime, additional members of the BBX family have been shown to influence flowering time as well (in particular, the products of BBX4, BBX6, BBX7, BBX24 and BBX32), some positively and others negatively [39]. The rice homolog of BBX14 delays flowering under both LD and SD [40]. CO is transcribed in A. thaliana plants grown under SD, but its product is not accumulated, thereby delaying the onset of flowering [41]. Here, seven Unigenes encoding BBX TFs were represented among the DTGs; their transcription profiling was quite diverse. The implication is that CO-like proteins are likely to contribute in various ways to control the flowering time of ‘Yuuka’.

The phytohormone GA acts as a growth regulator and an accelerator of flowering in A. thaliana [42]. Particularly under SD, any disturbance to the GA pathway, or any reduction in tissue GA content delays flowering. The enzyme GA20ox catalyzes several steps in GA synthesis [43]. GID1 is known to be a receptor for GA, since the triple mutant gid1a/gid1c/gid1a is dwarfed and flowers late [44]. SPY acts upstream of both GAI and RGA to negatively regulate the GA response [45]. Here, a GA20ox homolog (CL9282.Contig2_All) was up-regulated in a photoperiod-independent manner, while a SPY (CL4303.Contig1_All) homolog was down-regulated under SD. The implication was that both the photoperiod and GA pathways promote flowering under SD; meantime, under LD, the T6P and sugar signaling pathways promote flowering, while the photoperiod pathway gene PHYB is up-regulated (the product of this gene blocks flowering).

TF genes involved in floral induction

Transcriptional regulation is an important component of plant growth and development [46]. In A. thaliana, a number of MYB proteins are known to regulate the switch to reproductive growth. An increased abundance of MYB30 in the phloem has been shown to accelerate flowering via its regulation of FT [47]. Another MYB protein (EFM) acts to repress FT transcription in the leaf vasculature [48]. The constitutive expression of CmMYB2 in A. thaliana induces a delay in flowering [49]. Here, ten Unigenes encoding MYB proteins were represented among the set of DTGs. MADS TFs are widely implicated in the control of flower development [50]. The A. thaliana gene SVP encodes one such protein, which, during the plant’s vegetative phase, functions as a flowering repressor [51]. Meanwhile, the constitutive expression in tobacco of a Betula platyphylla AP1 homolog accelerates flowering [52]. AGL15 and AGL18, along with SVP and AGL24, are required to block the initiation of flowering in vegetative organs [53]. Here, nine homologs of various MADS TFs were represented among the DTGs. The implication is that in C. morifolium, as also in a number of other species, MADS TFs are important controllers of floral induction.

In A. thaliana, the bHLH protein PIF4 activates FT with the increase of temperature [54]. A number of PIF proteins (PIF1, PIF3, PIF4 and PIF5) are known to act as constitutive repressors of photomorphogenesis in the dark, and their proteolytic degradation is required to abolish the repression [55]. Members of the TF families WRKY, C3H and GATA are also implicated in the machinery underlying floral induction [5660]. Here, 44 WRKY members were detected as DTGs. Thus the members of the WRKY family are likely to be important for floral induction in C. morifolium. With one exception, all the GATA homologs were up-regulated, while 16 of the C3H homologs were down-regulated under both SD and LD. The different expression pattern of C3Hs showed its different roles in floral induction. Given that these TFs likely regulate floral induction (either directly or indirectly) differentially depending on the photoperiod, characterization of the DTGs encoding TFs might shed light on the regulation of flowering time.

Conclusions

The present analysis of the ‘Yuuka’ transcriptome has shown that under SD, flowering might be promoted largely though the activity of the photoperiod and GA pathways, while under LD, the T6P and sugar signaling pathways might promote flowering and PHYB might block it. The DTGs identified here as being involved in the process of floral transition provide potential targets for the manipulation of flowering time in chrysanthemum, while their actual functions need to be fully elucidated in future.

Methods

Plant materials and growing conditions

C. morifolium cultivar ‘Yuuka’ cuttings were obtained from Jiangsu Junma Park Technology Co. Ltd (Zhangjiagang, China). Uniformly-sized cuttings were planted into a 2:1 mixture of peat and perlite at the end of December and grown in a greenhouse under natural light, with additional lighting between 22:00 and 02:00 provided by high pressure sodium lamps. Rooted cuttings were transplanted into garden soil and maintained in a greenhouse running at 5–7 °C, ventilated when the temperature increased above 10 °C, under the same lighting conditions as described above. By April, the plants had reached a height of around 60 cm, and they were exposed to 1 week at 15 °C in advance to induce the plants to enter the photo-sensitive phase. Some plants were then subjected to either a SD (11.5 h photoperiod) or to LD (15.5 h photoperiod) for a further 60 days. Visible flower buds first appeared on the SD plants after 24 days, and on the LD ones after 44 days. The fourth fully expanded leaves and stem apical meristem were sampled on days 3, 5, 7 and 10 after the initiation of the photoperiod treatment and bulked to form sample S1, while S2 was a bulk of material harvested on days 16 and 20; similarly, the L1 sample was a bulk of material harvested on days 3, 5, 7 and 10, L2 on days 16 and 20, and L3 on days 28, 32 and 36. The CK sample was harvested immediately prior to the imposition of the photoperiod treatments. Three plants were sampled on each occasion and maintained as three biological replicates. Immediately after harvest, the tissue was snap-frozen in liquid nitrogen and stored at −80 °C until required.

RNA extraction, cDNA library construction and Illumina sequencing

RNA was extracted using a Total RNA Isolation System (Takara, Kyoto, Japan) following the manufacturer’s protocol. Its quality and quantity were assessed with either a 2100 Bioanalyzer RNA Nano chip device (Agilent, Santa Clara, CA, USA) or a Nanodrop ND-430 1000 spectrophotometer (Agilent, Santa Clara, CA, USA), and only samples delivering a λ260/280 ratio of 1.8–2.1, a λ260/230 ratio of 2.0–2.5 and an RNA integrity number >8.0 were retained. A 30 μg pool of RNA formed by combining 10 μg from each biological replicate was subjected to RNase-free DNase I (Takara, Dalian, China) treatment to remove contaminating DNA, and mixed with oligo (dT) coated magnetic beads to separate the poly A fraction. Fragmentation buffer was added to break the mRNA into the short fragments required as template for the synthesis of the first cDNA strand, which was achieved by random hexamer priming. The second cDNA strand was synthesized by a reaction driven by DNA polymerase I (Takara, Dalian, China). The resulting dsDNA fragments were purified using a QiaQuick PCR purification kit (Qiagen, Valencia, CA, USA) and resuspended in EB buffer for end repair and the addition of an adenine. Sequencing adapters were ligated onto the dsDNAs. Suitable fragments were selected for PCR amplification following agarose gel electrophoretic separation. The resulting libraries were then submitted for sequencing by a HiSeqTM 2000 device (Illumina, San Diego, CA, USA) at the Beijing Genomics Institute (Shenzhen, China; http://www.genomics.cn/index.php).

Transcriptome assembly and and gene annotation

The raw sequence data acquired from each of the libraries was stored in fastq format. Reads including either adapter sequence and/or more than five unknown nucleotides were removed, as were those having a quality value <10. The remaining clean reads were taken forward for bioinformatic analysis. A transcriptome assembly of the filtered sequences was carried out using the Trinity program (−−seqType fq --min_contig_length 100; −-min_glue 3 --group_pairs_distance 250; −-path_reinforcement_distance 85 --min_kmer_cov 3) [61]. Trinity includes three independent software modules: Inchworm, Chrysalis, and Butterfly. In brief, the Inchworm module was applied to assemble the sequences into the unique sequences of transcripts, and a full length transcript for the predominant isoform was generated; the Inchworm-derived contigs were then clustered and a complete de Bruijn graph for each cluster constructed. The Chrysalis module was then applied to partition the full read set among these disjoint graphs, and finally, the Butterfly module was used to process the individual graphs in parallel to derive full length transcripts for alternatively spliced isoforms, and to identify transcripts produced by paralogous genes. The resulting outputs were termed “Unigenes”. When multiple samples from the same species were sequenced, the Unigene set from each sample was subjected to further process of sequence splicing and redundancy removing with sequence clustering software to acquire non-redundant Unigenes as long as possible. The Unigenes were divided into clusters and singletons. Each cluster gathered several Unigene sequences sharing a similarity of >70 % and was prefixed by “CL”, while the singlets were prefixed by “Unigene”.

For Unigene annotation, a BLASTX alignment between Unigenes sequences and various protein databases (an E-value < 1E-5), namely the NCBI non-redundant protein databases (NR), Swiss-Prot, Kyoto Encyclopedia of Genes and Genomes (KEGG) and COG (Cluster of Orthologous Groups) was performed, as well as a BLASTN alignment (E-value <1E-5) of the sequences against the NCBI non-redundant nucleotide database (NT). Where conflicting results arose, the priority order was NR, Swiss-Prot, KEGG and COG. For the NR annotation, GO (gene ontology) annotation was achieved using the Blast2GO program [62], and GO functional classifications were acquired using the WEGO software [63].

Identification and functional assignment of DTGs

The transcript abundance of a given Unigene was estimated using the FPKM (fragments per Kbp per million reads) method [25], designed to eliminate any bias generated by variation in sequence length and sequencing quality. Following Audic et al. (1997), an algorithm was developed to identify differential transcript abundance between pairs of samples [64]. FDR (false discovery rate) control, a statistical method, was applied to correct for p-value in multiple tests for calculating the expression between two samples [65]. The smaller FDR and the larger ratio represented the larger difference of the expression level between the two samples. In the present analysis, an FDR threshold of less than 0.001 and a |log2ratio| threshold of at least 1 were applied [64]. Here, DTGs were identified in pairwise contrasts CK vs S1 (CKS1), CK vs S2 (CKS2), CK vs L1 (CKL1), CK vs L2 (CKL2) and CK vs L3 (CKL3). Then, DTGs identified in this way were subjected to GO functional analysis and KEGG Pathway analysis on the base of a hypergeometric test.

qRT-PCR analysis

To validate the ability of RNA-Seq to detect differential transcription, a subset of the DTGs was tested using qRT-PCR. Primer5 software was used to design the appropriate primers (Table 6), and the reactions were performed in a Mastercycler®ep realplex 2 S device (Eppendorf, Hamburg, Germany) using a SYBR Premix Ex Taq™ Kit (Takara, Dalian, China), following the manufacturers’ protocols. Each sample was represented by three biological replicates. The reference gene was chrysanthemum EF-1α (GenBank ID, KF305681). Each 25 μL reaction contained 10 ng cDNA, 0.2 μM of each primer and 10 μl SYBR Green PCR master mix. The reactions were exposed to an initial denaturation (95 °C/2 min), followed by 40 cycles of 95 °C/15 s, 60 °C/15 s, 72 °C/15 s. After amplication, a melting curve analysis was conducted to verify the specificity of the reaction. Relative transcript abundances were calculated using the 2-ΔΔC T method [66].

Table 6 Primers used for qRT-PCR