Analysis of the sucrose synthase gene family in tobacco: structure, phylogeny, and expression patterns

Sucrose synthase (Sus) has been well characterized as the key enzyme participating in sucrose metabolism, and the gene family encoding different Sus isozymes has been cloned and characterized in several plant species. However, scant information about this gene family is available to date in tobacco. Here, we identified 14, 6, and 7 Sus genes in the genomes of Nicotiana tabacum, N. sylvestris, and N. tomentosiformis, respectively. These tobacco Sus family members shared high levels of similarity in their nucleotide and amino acid sequences. Phylogenetic analysis revealed distinct evolutionary paths for the tobacco Sus genes. Sus1–4, Sus5, and Sus6–7 originated from three Sus precursors, respectively, which were generated by duplication before the split of monocots and eudicots. There were two additional duplications, before and after the differentiation of the Solanaceae, which separately gave rise to Sus3/4 and Sus1/2. Gene exon/intron structure analysis showed that the tobacco Sus genes contain varying numbers of conserved introns, resulting from intron loss under different selection pressures during the course of evolution. The expression patterns of the NtSus genes differed from each other in various tobacco tissues. Transcripts of Ntab0259170 and Ntab0259180 were detected in leaves at all tested developmental stages, suggesting that these two genes play a predominant role in sucrose metabolism during leaf development. Expression of Ntab0288750 and Ntab0234340 was conspicuously induced by low temperature and virus treatment, respectively, indicating that these two isozymes are important in meeting the increased glycolytic demand that occurs under environmental stress. Our results provide an evolutionary and an empirical molecular genetic foundation of the Sus gene family in tobacco, and will be beneficial for further investigations of Sus gene functions in the processes of tobacco leaf development and tobacco resistance to environmental stresses.


Introduction
Sucrose is essential for the plant life cycle. It is mainly produced by photosynthesis in leaves and is exported to sink tissues, where it serves as a carbon and energy source for growth processes and for the synthesis of storage reserves (Lunn and Furbank 1999; Chen et al. 2012). Under low temperature or drought stress, plant cells can accumulate sucrose to stabilize membranes and proteins. Further, sucrose is thought to supply energy to ramp up metabolism when such stress ceases (Yang et al. 2001; Strand et al. 2003). It has also been shown that sucrose acts as a signal in plants to modulate the expression level of genes encoding enzymes, storage proteins, and transporters (Ciereszko et al. 2001; Stitt et al. 2002; Vaughn et al. 2002; Zourelidou et al. 2002). Moreover, sucrose participates in the regulation of several developmental processes, such as cell division (Gaudin et al. 2000), flowering induction (Ohto et al. 2001), vascular tissue differentiation (Uggla et al. 2001), seed development (Iraqi and Tremblay 2001), and the accumulation of storage products (Rook et al. 2001). Thus, the study of sucrose metabolism is central to understanding myriad aspects of plant physiology.
Sucrose synthase (Sus) and invertase (Inv) are the two key enzymes that cleave sucrose before its transfer to sink organs. Inv catalyzes the hydrolysis of sucrose into fructose and glucose. Sus catalyzes the conversion of UDP (uridine diphosphate) and sucrose into UDP-glucose and fructose (Schmalstig and Hitz 1987; Kleczkowski et al. 2011). Both Sus and Inv supply energy for phloem loading (Martin et al. 1993; Coleman et al. 2009), and Sus also participates in distributing carbon resources into various pathways that are necessary for the metabolic and storage physiology of the plant cell (Haigler et al. 2001; Ruan et al. 2008). For instance, previous studies have demonstrated that Sus cleavage activity is closely related to the strength of various starch-storing sinks, such as potato tubers, pea embryos, and maize kernels (Zrenner et al. 1995; Chourey et al. 1998; Barratt et al. 2001). Moreover, low temperature and drought stress are known to induce the expression of Sus genes in Hevea brasiliensis (para rubber tree), suggesting a positive role for Sus in stress resistance (Xiao et al. 2014).
Previous studies have demonstrated that Sus isoforms are encoded by a small multi-gene family. The number of members in the Sus gene family differs among the plant species examined to date. For instance, the maize and pea genomes contain three Sus genes (Barratt et al. 2001; Duncan et al. 2006), while there are six distinct Sus genes in Arabidopsis, rice, and Lotus japonicus (Baud et al. 2004; Horst et al. 2007; Hirose et al. 2008). Seven Sus genes have been identified in poplar. Diploid cotton genomes (Gossypium arboreum L. and G. raimondii Ulbr.) each contain eight Sus genes, while the tetraploid cotton genome (G. hirsutum L.) contains fifteen members, representing the largest Sus family observed to date (Zou et al. 2013). It is also known that the functions and expression patterns of Sus genes diverge across developmental stages in the same plant. For example, the pea genes Sus1, Sus2, and Sus3 are expressed predominantly in developing seeds, leaves, and flowers, respectively. Furthermore, Sus2 and Sus3 are not able to compensate for the activity of Sus1 in root nodules or seeds (Barratt et al. 2001). Sus1 is expressed ubiquitously in maize and functions mainly in starch synthesis, while the expression of maize Sh1 is abundant in developing endosperm tissue and may promote cell wall synthesis (Duncan et al. 2006). Arabidopsis Sus genes can be divided into three groups with distinct but partially overlapping expression profiles; the functions of the Arabidopsis Sus genes are also distinct, according to studies of loss-of-function mutants (Bieniawska et al. 2007). Tissue-specific and development-dependent expression patterns of Sus genes have also been observed in many other plant species including L. japonicus, rice, and citrus (Wang et al. 1999; Komatsu et al. 2002; Horst et al. 2007; Hirose et al. 2008). The differential expression patterns of Sus genes might indicate specialized functions.
Although the Sus gene family has been extensively studied in a few plant species such as Arabidopsis, rice, cotton, and poplar, the Sus genes in tobacco have not.
Tobacco is an economically important plant throughout the world. Natural tobacco leaves contain about 20 % sugar by weight, including glucose, fructose, and sucrose (Davis and Mark 1999; Talhout et al. 2006). Sucrose, fructose, glucose, and invert sugar (a mixture of fructose and glucose) are frequently used as cigarette additives; these are thought to function as humectants, and to improve both casing and flavor (Davis and Mark 1999; Seeman et al. 2003; Talhout et al. 2006). Up to 13 % (w/total w) of sugars and sweeteners are added to tobacco during the manufacturing process (Davis and Mark 1999; Fowles and Bates 2000; Smith et al. 2002; Seeman et al. 2003). Sugars can improve the smoking experience by neutralizing the throat impact and harsh taste of smoke, but sugars are also known to generate acetaldehyde, which is addictive to rodents (Talhout et al. 2006). In addition to being hydrolyzed into monosaccharide moieties by Inv, sucrose can also be used to produce UDP-glucose. It has been demonstrated that an 88-kD Sus polypeptide can catalyze the synthesis of UDP-glucose in the inner layer of tobacco pollen tubes. UDP-glucose is required as a metabolic precursor for the biosynthesis of both cellulose and callose. Both of these polymers are necessary components of the inner layer of tobacco pollen tubes (Diana et al. 2008). As the functions of the other tobacco Sus genes are unknown, it is necessary to identify and characterize the Sus gene family in tobacco to explore their functions and evolutionary relationships.
In the present study, we identified 27 Sus genes in an allotetraploid (Nicotiana tabacum) and two diploid (N. sylvestris and N. tomentosiformis) tobacco species. We then focused on the locations of each member of the tobacco Sus gene family in their respective genomes, their evolutionary relationships, their intron/exon organization, their tissue- and development-dependent expression patterns, and their potential roles in responses to environmental stresses. Our results provide a foundation for further investigations into the specific functions of each tobacco Sus gene, particularly during leaf development and maturation.

Plant materials and growth conditions
Nicotiana tabacum L. (Honghua Dajinyuan) was used in the analysis of the expression profiles of the Sus genes. Tobacco seeds maintained in our laboratory were germinated and grown in pots under normal conditions (16 h light; 28°C day, 23°C night) until flowering, and then plant roots, stems, leaves, buds, sepals, stamens, pistils, and seeds were collected for RNA extraction. Tobacco seedlings with 9-11 true leaves were transplanted to open fields for continued growth. Leaves from different developmental stages were harvested for further analysis of the expression of Sus genes.
For the stress treatment experiments, tobacco seeds were soaked, sterilized, and germinated on 1/2 MS medium in darkness. Seedlings were then transplanted to vermiculite for continued growth to the six-leaf stage. Uniform seedlings were chosen, washed with deionized water, and transferred to a 1/3 concentration of Hoagland solution for 1 week prior to use in the stress experiments. For the drought treatment, seedlings were cultivated in a solution containing 20 % (w/v) PEG6000 for 2 days. For the low-temperature treatment, seedlings were kept at 0°C in an illuminated incubator for 24 h. For the virus treatment, tobacco mosaic virus was inoculated onto six leaves and the plants were maintained for 10 days. After treatment, plant materials were collected, immediately frozen in liquid nitrogen, and stored at -80°C prior to RNA extraction.

Phylogenetic and gene structure analyses
The Solanaceae Sus sequences were obtained from http://solgenomics.net/. The Sus sequences of other plant species were collected by searching the NCBI GenBank database using 'sucrose synthase' as a query keyword. DNAMAN (version 6.0) and Clustal X (version 1.83) were used to perform the multiple alignments of the Sus nucleotide and deduced amino acid sequences, respectively, with default gap penalties. The phylogenetic tree of the Sus amino acid sequences was constructed with MEGA 5.0 using the neighbor-joining algorithm.
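
For readers unfamiliar with the neighbor-joining algorithm, the following minimal Python sketch illustrates the principle on a toy alignment. It is not the MEGA implementation: the p-distance measure, the sequence names, and the toy sequences are assumptions made for illustration only, and no bootstrap support is computed.

```python
from itertools import combinations


def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences.

    Columns containing a gap ('-') in either sequence are skipped.
    The p-distance is an assumed, simple distance measure.
    """
    cols = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    return sum(x != y for x, y in cols) / len(cols)


def distance_matrix(aligned):
    """Build a symmetric nested dict of pairwise p-distances."""
    names = list(aligned)
    return {a: {b: p_distance(aligned[a], aligned[b]) for b in names if b != a}
            for a in names}


def neighbor_joining(dist):
    """Return an unrooted tree in Newick format from a distance matrix.

    dist maps each taxon name to a dict of distances to every other taxon.
    """
    d = {a: dict(row) for a, row in dist.items()}  # work on a copy
    while len(d) > 2:
        n = len(d)
        # Net divergence of each node from all other nodes.
        r = {a: sum(d[a][b] for b in d if b != a) for a in d}
        # Join the pair that minimizes the Q criterion.
        i, j = min(combinations(d, 2),
                   key=lambda p: (n - 2) * d[p[0]][p[1]] - r[p[0]] - r[p[1]])
        li = 0.5 * d[i][j] + (r[i] - r[j]) / (2 * (n - 2))
        lj = d[i][j] - li
        u = f"({i}:{li:.4f},{j}:{lj:.4f})"  # new internal node
        new_row = {k: 0.5 * (d[i][k] + d[j][k] - d[i][j])
                   for k in d if k not in (i, j)}
        del d[i], d[j]
        for k in d:
            del d[k][i], d[k][j]
            d[k][u] = new_row[k]
        d[u] = new_row
    a, b = d
    return f"({a}:{d[a][b]:.4f},{b});"


if __name__ == "__main__":
    # Hypothetical aligned peptide fragments, not real Sus sequences.
    toy = {
        "seqA": "MANAERMITR",
        "seqB": "MANAERVITR",
        "seqC": "MGNPEQVLSR",
        "seqD": "MGNPEQVLTR",
    }
    print(neighbor_joining(distance_matrix(toy)))
```

In practice, the alignment would come from Clustal X and the distance model and bootstrap settings would be chosen within MEGA; the sketch only shows how pairwise distances are progressively merged into a tree.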

RNA extraction and cDNA preparation
Total RNA was extracted using a SuperPure Plantpoly RNA Kit (Gene Answer). RNase-free DNase I (Gene Answer) was used in the extraction process to remove DNA contamination. Both the concentration and the quality of the RNA samples were evaluated with a Nanodrop 2000 instrument (Thermo). 1 µg of total RNA was used to synthesize first-strand cDNA using Reverse Transcriptase M-MLV (TAKARA) with random primers. After reverse transcription, the cDNA samples were quantified with the Nanodrop 2000 instrument and then diluted to 100 ng/µl.

RT-PCR and RT-qPCR
The gene-specific primers used in the RT-PCR and the RT-qPCR experiments are listed in Suppl. Table S1. RT-PCR was performed using TAKARA Taq polymerase in a heated-lid thermal cycler (Biometra). The PCR cycling program was as follows: 95°C for 5 min, 26-30 cycles of 30 s at 94°C, 30 s at 55°C or 60°C, and 30 s at 72°C. RT-qPCR amplification reactions were performed using an iCycler iQ thermal cycler (Bio-Rad) and a SYBR Green kit (Bio-Rad). The PCR program was as follows: 95°C for 5 min, 40 cycles of 30 s at 94°C, 30 s at 60°C, signal acquisition, and then a final melting curve of 65-95°C. The expression levels of the NtSus genes in leaves from different developmental stages were normalized to the expression level of the Ls25 gene at each corresponding stage.
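
Normalization of a target gene to a reference gene can be sketched as follows. The text does not state which quantification model was applied, so the comparative Ct (2^-ΔΔCt) calculation shown here, the assumption of 100 % amplification efficiency, and all Ct values are illustrative assumptions rather than the procedure actually used.

```python
def relative_expression(ct_target, ct_reference):
    """Expression of a target gene relative to a reference gene.

    Uses 2 ** -(Ct_target - Ct_reference), which assumes that both
    amplicons double in every cycle (100 % efficiency).
    """
    return 2.0 ** -(ct_target - ct_reference)


def fold_change(ct_target_treated, ct_ref_treated,
                ct_target_control, ct_ref_control):
    """Fold change of a treated sample over its control (2 ** -ddCt)."""
    treated = relative_expression(ct_target_treated, ct_ref_treated)
    control = relative_expression(ct_target_control, ct_ref_control)
    return treated / control


if __name__ == "__main__":
    # Hypothetical Ct values for one target gene and the reference gene.
    print(relative_expression(ct_target=25.0, ct_reference=20.0))
    print(fold_change(22.0, 20.0, 24.0, 20.0))
```

Normalizing within each stage or treatment before comparing samples removes differences in cDNA input, which is why the reference gene is measured in every sample.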

Identification of Sus genes in tobacco
We first searched the N. tabacum genome in the China tobacco genome database v 2.0 (data not shown) using 'sucrose synthase' as the query keyword, and thus obtained 16 putative Sus genes. We then performed several additional BLAST searches of the N. tabacum database using the amino acid sequences of Arabidopsis Sus proteins as the query sequences. We combined all the search records and aligned their amino acid sequences. We excluded short and/or low-identity sequences, and finally retained 14 NtSus genes. As shown in Table 1, the genomic DNA size of these 14 NtSus genes varied from 3.5 to 15 kb, but the cDNA sizes were all quite similar. The putative polypeptides of these NtSus genes contained between 642 and 909 amino acids (molecular weights ranging from 72.71 to 103.27 kDa) with estimated isoelectric points between 5.7 and 7.85. These predicted molecular features of the NtSus enzymes were similar to those of previously characterized Sus isozymes from other plant species. We also identified and characterized 6 and 7 Sus genes from the N. sylvestris and N. tomentosiformis genomes (Table 1), respectively. Identities between the amino acid sequences of the 14 NtSus proteins were calculated with the DNAMAN algorithm. As shown in Table 2, the NtSus sequences could be divided into seven pairs that each had high levels of similarity (higher than 86.15 %), a result suggesting that each Sus gene is represented by two putative paralogs in the tetraploid tobacco genome. Alignment of the coding sequences of the 14 NtSus genes supported this hypothesis. Moreover, we performed intra- and inter-species sequence alignments of the amino acid sequences and coding sequences of the six NsylSus and the seven NtomSus genes, and found that six pairs shared much higher sequence identities (82.72-99.01 % at the amino acid level, 82.26-97.99 % at the nucleotide level) in the inter-species alignment (Suppl. Table S2).
We named the seven pairs of Sus genes as Sus1-7 in N. tabacum, and identified their putative orthologous genes in N. sylvestris and N. tomentosiformis by comparing the identities of the Sus sequences (Table 3). Both of the diploid genomes contained one ortholog for each Sus gene, with the exception of Sus4 in N. sylvestris. These homologous relationships were also confirmed by subsequent phylogenetic analysis (Suppl. Fig. S1).
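
The grouping of sequences into high-identity pairs can be illustrated with a short Python sketch. The identity measure (identical residues over gap-free aligned columns) and the reciprocal-best-match rule are assumptions chosen for illustration and may differ from the DNAMAN calculation; the names and sequences are hypothetical.

```python
def percent_identity(a, b):
    """Percent identical residues over aligned columns without gaps."""
    cols = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    return 100.0 * sum(x == y for x, y in cols) / len(cols)


def reciprocal_best_pairs(aligned):
    """Pair sequences that are each other's most similar partner.

    aligned maps a sequence name to its row in a multiple alignment.
    """
    best = {}
    for name, seq in aligned.items():
        others = [o for o in aligned if o != name]
        best[name] = max(others,
                         key=lambda o: percent_identity(seq, aligned[o]))
    return sorted({tuple(sorted((a, b)))
                   for a, b in best.items() if best[b] == a})


if __name__ == "__main__":
    # Hypothetical aligned fragments standing in for two gene pairs.
    toy = {
        "a1": "MANAERMITR",
        "a2": "MANAERVITR",
        "b1": "MGNPEQVLSR",
        "b2": "MGNPEQVLTR",
    }
    for first, second in reciprocal_best_pairs(toy):
        identity = percent_identity(toy[first], toy[second])
        print(first, second, f"{identity:.2f} %")
```

Requiring the best match to be reciprocal avoids pairing a sequence with a partner that is itself more similar to a third sequence.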

Location of Sus genes in tobacco genomes
We obtained physical maps for each chromosome of the three tobacco species from the China tobacco genome database v 2.0. These maps included the lengths of the chromosomes, gene numbers, and the start/end sites of the Sus genes. We subsequently drew three simple maps that showed the distribution of the Sus genes among the chromosomes in the three tobacco species. As shown in Fig. 1, in N. tabacum, 14 NtSus genes were located on 11 chromosomes, three of which possessed two genes, while the other chromosomes contained only one gene. According to this physical map and the start/end point of each gene on the chromosome obtained from the database, the Ntab0259170/Ntab0259180 and the Ntab0298870/Ntab0298880 pairs were found to be closely linked. As Ntab0259170 and Ntab0298870 are two Sus2 paralogs, while Ntab0259180 and Ntab0298880 are Sus3 paralogs (Table 3), it was clear that NtSus2 and NtSus3 were closely linked to each other in the N. tabacum genome. Moreover, in N. tomentosiformis and N. sylvestris, Sus2 (Ntom0289900, Nsyl0289930) and Sus3 (Ntom0289890, Nsyl0289940) were also linked (Suppl. Fig. S2 and S3), suggesting that this linkage relationship has been conserved during the evolution of these genomes. The two Sus6 genes (Ntab0385170 and Ntab0594750) were on the same chromosome, but were not tightly linked; their locations may have resulted from chromosomal exchange during the formation of the tetraploid genome.
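
Close physical linkage can be screened from gene coordinates, as in the minimal Python sketch below. The gene names, coordinates, and the 50-kb gap threshold are hypothetical; the text does not define a numerical cut-off for close linkage.

```python
from itertools import combinations


def linked_pairs(genes, max_gap=50_000):
    """Find gene pairs on the same chromosome separated by a small gap.

    genes maps a gene name to (chromosome, start, end) in bp.
    Returns (gene1, gene2, gap_bp) tuples; overlapping genes get gap 0.
    max_gap is an assumed, arbitrary threshold.
    """
    pairs = []
    items = sorted(genes.items())
    for (n1, (c1, s1, e1)), (n2, (c2, s2, e2)) in combinations(items, 2):
        if c1 != c2:
            continue  # genes on different chromosomes cannot be linked
        gap = max(s1, s2) - min(e1, e2)  # negative if the genes overlap
        if gap <= max_gap:
            pairs.append((n1, n2, max(gap, 0)))
    return pairs


if __name__ == "__main__":
    # Hypothetical coordinates, not taken from the tobacco genome database.
    toy_genes = {
        "g1": ("chr1", 1_000, 5_000),
        "g2": ("chr1", 9_000, 14_000),
        "g3": ("chr1", 2_000_000, 2_006_000),
        "g4": ("chr2", 1_000, 4_000),
    }
    print(linked_pairs(toy_genes))
```

Two genes on the same chromosome but far apart, like g1 and g3 in this toy data set, are reported as unlinked, which mirrors the distinction drawn above between the Sus2/Sus3 pairs and the two Sus6 genes.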

Gene structure and conserved motifs in the tobacco Sus gene family
To better understand the genesis of Sus family genes in tobacco, we analyzed the intron/exon arrangement of each tobacco Sus gene. To correctly distinguish intron and exon fragments of NtSus genes, we aligned the genomic and corresponding cDNA sequences with DNAMAN software. As shown in Fig. 2, the coding regions of the NtSus genes were interrupted by introns of varying sizes. The lengths of these introns varied from 70 to 600 bp, with the exception that two introns in NtSus5 were longer than 1 kb. Moreover, the numbers of introns also differed among the seven NtSus genes. For instance, there were 12 introns in NtSus1 and NtSus2, 10 in NtSus3 and NtSus4, and 14 in NtSus5 and NtSus7. For NtSus6, the Ntab0385170 gene contained 13 introns, while the Ntab0597450 gene contained 12. In the diploid tobacco genomes, most Sus genes contained the same number of introns as their corresponding orthologs in N. tabacum, except for NsylSus2 and NtomSus4, which each contained one fewer intron than NtSus2 and NtSus4, respectively (Suppl. Fig. S4). We carefully examined the intron positions relative to the conserved exons, and identified 16 putative positions for introns, among which 14 introns (excepting introns 13 and 16) were found in most of the NtSus sequences (Fig. 3). The lack of one or more introns, which mainly occurred in the 5th, 9th, 10th, 12th, and 13th intron positions, led to the formation of larger exons, including exons of 339, 358, and 567 bp in length in some of the NtSus genes (Figs. 2, 3). In addition to our analysis of the nucleotide sequences, we also analyzed the amino acid sequences of the NtSus family. There was a conserved serine residue in the N-terminal regions of all of the NtSus sequences (Suppl. Fig. S1). It has been demonstrated in maize that a Ser/Thr protein kinase can phosphorylate this Ser residue (Huber et al. 1996; Hardin and Huber 2004).
Moreover, among the NtSus sequences, we also found two conserved domains that are considered to be characteristic of Sus proteins: a sucrose synthase domain and a glucosyl-transferase domain. We used MEME to predict the putative motifs in the NtSus sequences, and identified a total of ten distinct motifs. The length, conserved sequence, and predicted molecular function of each motif are listed in Table 4. Most of the motifs were predicted to be involved in sucrose metabolic processes, with the exceptions of motifs 6 and 10, the biological significance of which remains to be determined. As expected, the orthologous genes of the three tobacco species contained the same motifs and the same motif arrangement (Suppl. Fig. S5), indicating that these likely have similar functions.
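
Once exon boundaries have been located by aligning a cDNA to its genomic sequence, intron number and length follow directly from the exon coordinates. The Python sketch below illustrates this step; the coordinate convention (1-based, inclusive) and the example coordinates are assumptions, and the alignment itself (performed here with DNAMAN) is not reproduced.

```python
def introns_from_exons(exons):
    """Derive introns from exon coordinates on a genomic sequence.

    exons is a list of (start, end) pairs, 1-based and inclusive.
    Returns a list of (start, end, length) tuples, one per intron.
    """
    ordered = sorted(exons)
    introns = []
    for (s1, e1), (s2, e2) in zip(ordered, ordered[1:]):
        if s2 <= e1:
            raise ValueError(f"exons overlap: {(s1, e1)} and {(s2, e2)}")
        if s2 > e1 + 1:  # directly adjacent exons leave no intron
            introns.append((e1 + 1, s2 - 1, s2 - e1 - 1))
    return introns


if __name__ == "__main__":
    # Hypothetical exon coordinates for a three-exon gene.
    toy_exons = [(1, 100), (201, 350), (421, 600)]
    found = introns_from_exons(toy_exons)
    print(len(found), "introns with lengths", [length for _, _, length in found])
```

Loss of an intron appears in such data as two neighboring exons merging into a single longer exon, which is the pattern described above for the larger exons of some NtSus genes.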

Phylogenetic analysis of tobacco Sus genes
To better understand the evolutionary relationships among the Sus genes of tobacco and other plant species, 71 amino acid sequences from 11 species were aligned using ClustalX. An unrooted tree was constructed based on the alignment using the Neighbor-Joining method implemented in MEGA-5. As shown in Fig. 4, plant Sus genes could be divided into four sub-families. Based on several previous analyses with which our results were consistent, we here designated these sub-families as follows: Sus I eudicot group, Sus I monocot group, Sus II group, and Sus III group. Most of the tobacco Sus genes belonged to the Sus I eudicot group, together with Sus genes from Arabidopsis (AtSus1 and 4), Gossypium arboreum (GaSus1, 3, 4, and 5), potato (St2, 3, 4, and 5), tomato (Soly2, 3, and 4), pepper (CaSus4 and 5), and coffee (CoSus1). There was one tobacco Sus gene (Sus5) in the Sus II group and two tobacco genes (Sus6, Sus7) in the Sus III group; no tobacco genes occurred in the Sus I monocot group (Fig. 4). Although the tobacco Sus paralogs shared high sequence similarities, our phylogenetic analysis revealed that diversification has occurred within this family, likely indicating discrete evolutionary histories and diverse biological roles for the members of this gene family in tobacco.
Since sucrose affects cell division and vascular tissue differentiation in plant leaves, we further performed quantitative real-time RT-PCR to detect the relative expression levels of each NtSus gene at different developmental stages of tobacco leaves (Fig. 6). Transcripts of only three NtSus genes were detected in the leaves of seedlings and leaves of plants at the resetting stage, vigorous stage, bud stage, or flowering stage, whereas six genes were expressed in leaves at the topping and mature stages (Fig. 6g). In detail, Ntab0259170, Ntab0298870, and Ntab0259180 were ubiquitously expressed at all leaf growth stages tested. Transcription levels of Ntab0259170 and Ntab0259180 slightly increased during the course of leaf development, and reached maximal expression levels at the topping stage (Fig. 6h), indicating that these two genes encode isozymes that catalyze key aspects of sucrose metabolism in leaves in late developmental stages.
To determine whether the NtSus genes were involved in stress resistance, we measured their transcript levels under drought, low temperature, and virus treatments. As shown in Fig. 7, under drought stress, the expression levels of all of the NtSus genes were similar to those of the control, except for Ntab0820630 and Ntab0298870, for which the transcript levels were upregulated approximately twofold. Moreover, the expression level of Ntab0288750 was sixfold higher under low-temperature treatment, while the expression levels of the other NtSus genes were only slightly increased (less than twofold) under this treatment. Similarly, upon virus inoculation, the transcript level of Ntab0234340 increased much more dramatically than did the levels of the other NtSus genes. Taken together, our results show that different environmental stresses could induce the transcription of different NtSus genes, indicating diverse functions of the NtSus genes in plant responses to stress.

Discussion
Comparative genome approaches have been used to analyze many Sus gene families in various plant species, including Arabidopsis, rice, maize, poplar, and cotton. In the present study, benefiting from sequencing efforts of the whole tobacco genome conducted by the China Tobacco Gene Research Center (manuscript under review), we identified 14, 6, and 7 Sus genes from N. tabacum, N. sylvestris, and N. tomentosiformis, respectively. These tobacco Sus family members shared high levels of similarity in both their nucleotide and amino acid sequences; we named these as Sus1 to Sus7, according to distinct molecular signatures. We further analyzed their molecular structures, evolutionary relationships, and expression patterns in various tobacco plant materials. This study therefore provides a foundation for understanding the putative functions of these genes in various growth and developmental processes in tobacco.

Evolutionary conservation and divergence among tobacco Sus genes
The Sus isozymes in plants examined to date are encoded by a small, multi-gene family. Comprehensive analysis of this multi-gene family, including consideration of its exon/intron gene structures, phylogeny, and conserved motifs, allows researchers to generalize and predict the possible genetic and evolutionary relationships among uncharacterized members of this gene family, as well as to predict their possible functions. Previous studies of the molecular structures and phylogenetic relationships of plant Sus sequences divided them into three major groups, namely Sus1, SusA, and New Group (Horst et al. 2007; Hirose et al. 2008). This classification was corroborated in subsequent studies, and these three groups were renamed as the Sus I, Sus II, and Sus III groups, respectively (Zhang et al. 2011; Chen et al. 2012; Zou et al. 2013). Later, the Sus I group was divided into a monocot subgroup and a eudicot subgroup, as these two subgroups were obviously separate from each other in phylogenetic trees. The Sus II and Sus III groups were then considered as mix group 1 (monocot and eudicot group 1) and mix group 2, according to the categories of group members. In our study, we performed phylogenetic analysis of the Sus homologs from three tobacco species and those of eight other plant species, and found that the tobacco Sus family had at least one gene in three separate groups. The four tobacco Sus genes in the eudicot group were divided into two subgroups, among which tobacco Sus1 and Sus2 clustered together, while the Sus3 and Sus4 genes were clearly separated from Sus1, Sus2, and the Sus homologs of Arabidopsis and cotton. This result suggests that a gene duplication event that generated tobacco Sus3 and Sus4 occurred after the separation of monocot and eudicot species, but before the divergence of Solanaceae/Arabidopsis/Gossypium.
Moreover, the generation of the tobacco Sus1 and Sus2 genes probably took place after the separation of Solanaceae/Arabidopsis/Gossypium, but before the divergence of Solanaceae. In the Sus II and Sus III groups, the tobacco Sus5, Sus6, and Sus7 genes all closely clustered together with Solanaceae Sus genes (Fig. 4), indicating that these three genes were also generated before the divergence of Solanaceae. Thus, tobacco Sus3 and Sus4 were older than the other tobacco Sus genes. Further, we analyzed an additional phylogenetic tree of the Solanaceae Sus genes (Suppl. Fig. S6), in which the Sus genes were apparently divided into five groups. NtSus3 and NtSus4 belonged to group II, which also contained genes from pepper (CaSus4) and tomato (SolySus4). The phylogenetic tree also showed that the formation of NtSus4 occurred prior to that of NtSus3, CaSus4, and SolySus4. It can therefore be concluded that the oldest gene in the tobacco Sus gene family is NtSus4, and that this gene was lost from N. sylvestris during evolution.

Since gene exon structures are typically highly conserved among homologous genes in duplicated gene families (Frugoli et al. 1998), analysis of exon/intron structures can provide clues to reveal the evolutionary history of gene families (Lecharny et al. 2003). Previous studies have shown that there are 14 conserved introns in most of the Sus genes in the three major Sus groups (Sus I, Sus II, and Sus III), leading to the speculation that the divergence of the three progenitors predated the segregation of monocot and eudicot species (Tang et al. 2008). Moreover, the genes of the Sus I monocot group contain more introns than the genes of the Sus I eudicot group; this led to the supposition that intron loss events took place at least twice in the evolution of the eudicot Sus genes under selection pressure (Chen et al. 2012). The exon/intron structures of the Sus genes in the Sus II group are similar in monocots, eudicots, and the putative ancestral Sus genes, indicating a relatively slower evolutionary rate for genes in this group. The Sus genes in the Sus III group share a remarkable feature: they have either an additional exon or a longer exon in their 3′ regions (Chen et al. 2012). In the present study, we evaluated the exon/intron structures of the tobacco Sus genes, and identified 14 conserved introns (Figs. 2, 3). Tobacco Sus genes in the Sus I eudicot group contained fewer conserved introns than genes in the other two groups. This was particularly obvious for the Sus4 gene, for which there are 10 introns in N. tabacum and 9 in N. tomentosiformis, suggesting that this gene might be under high selection pressure. In contrast, the NtSus5 genes of the Sus II group likely had slower evolutionary rates, as they contained the 14 conserved introns, consistent with the earlier inference (Chen et al. 2012).
Moreover, we observed the previously noted feature in the 3′ regions of the tobacco Sus6 and Sus7 genes, which further supports the idea that there might have been ectopic recombination between the progenitor of the Sus II group and a sequence with at least two introns prior to the divergence of monocots and eudicots.
We propose an evolutionary history of the tobacco Sus genes based on our phylogenetic analysis and our analysis of exon/intron structures. Before the split of monocots and eudicots, duplication of the ancestral gene gave rise to the three progenitors of the three Sus groups with 14 conserved introns. Two of the three precursors underwent independent evolution and finally retained one single gene in tobacco (Sus5) in the Sus II group and two genes (Sus6 and Sus7) in the Sus III group. After the divergence of monocots and eudicots, duplication of the Sus precursor generated Sus3 and Sus4, whereas Sus1 and Sus2 were produced after differentiation within the Solanaceae (Fig. 4). Although different in evolutionary time and trajectory, tobacco Sus2 and Sus3 are closely linked to each other in the genomes of both the diploid and tetraploid tobacco species. Intriguingly, the expression pattern of NtSus2 (Ntab0259170) was similar to that of NtSus3 (Ntab0259180) in various developmental stages of tobacco leaves (Fig. 6). These findings may prove useful in tobacco breeding efforts to improve leaf development. Moreover, each Sus gene in N. tabacum (tetraploid tobacco) contained two paralogs, of which one shared high similarity with orthologs in N. sylvestris or N. tomentosiformis, further corroborating the idea that tetraploid tobacco was formed by a cross between these two diploid species, followed by chromosome reduplication.

Divergence in NtSus gene expression patterns
To differentiate new organs or adapt to various environments, plants have to make evolutionary changes in protein property and/or in the expression patterns of particular genes (Gu et al. 2003;Flagel and Wendel 2009). Analysis of gene expression patterns can be used to some extent to predict the molecular functions of genes involved in different physiological processes. To date, although the expression patterns of Sus genes have been characterized in detail in several plant species, such as Arabidopsis, rice, cotton, poplar, and rubber tree (Baud et al. 2004;Hirose et al. 2008;Chen et al. 2012;Zou et al. 2013;An et al. 2014;Xiao et al. 2014), there have been no detailed analyses of the expression patterns of Sus genes in tobacco.
Sus isozymes have been shown to participate in the regulation of sink strength in plants (Fu and Park 1995; Zrenner et al. 1995; Chourey et al. 1998; Tang and Sturm 1999; Barratt et al. 2001). In the present study, we first measured the transcription levels of NtSus genes in different tissues, and found that no two NtSus genes shared identical expression patterns (Fig. 5). No NtSus gene was expressed exclusively in a single tissue, and no tissue expressed only a single NtSus gene. These results suggest that the functions of NtSus genes are diversified and yet partially overlap. Our results further demonstrated that the expression levels of most of the NtSus genes were higher in sink tissues, such as roots, buds, flowers (including sepals, stamens, and pistils), and seeds, than in source tissues (Fig. 5). Most NtSus genes were not expressed, or were expressed at low levels, throughout leaf development, whereas transcripts of Ntab0259170, Ntab0259180, and Ntab0298870 could always be detected (Fig. 6). Therefore, these three genes might be key regulators of sucrose metabolism in leaves, which are the most economically valuable parts of tobacco plants. We compared the transcription levels of Ntab0259170 and Ntab0259180 in leaves at different developmental stages (Fig. 6h). The low expression levels of these two NtSus genes in young leaves were consistent with the idea that Sus genes are mainly expressed in sink tissues, rather than in source tissues (Turner and Turner 1975). However, after topping (a typical agronomic practice in tobacco production), the transcription levels of Ntab0259170 and Ntab0259180 were significantly up-regulated in leaves, as was the expression of Ntab0452620 and Ntab0820630, which were hardly detected in previous stages (Fig. 6f). The increased expression of these genes might be caused by a 'role conversion' of leaves in tobacco plants after topping (i.e., from source tissue to sink tissue). In mature leaves, photosynthetic capacity and metabolic activity are known to decrease (Gan and Amasino 1997), possibly resulting in the relatively lower expression levels of Ntab0259170 and Ntab0259180.
Sucrose synthases are thought to participate in plant resistance to various environmental stresses. For instance, transcription of the AtSus1 gene can be induced by cold or mannitol treatment in Arabidopsis, and AtSus3 is used as a molecular genetic marker of dehydration (Baud et al. 2004). Similarly, both low temperature and drought stress can conspicuously induce the expression of two barley Sus genes (HvSs1 and HvSs3) and one rubber tree Sus gene (HbSus5) (Barrero et al. 2011; Xiao et al. 2014). The higher expression levels of Sus genes may help to meet the increased glycolytic demand that occurs under abiotic stresses (Kleines et al. 1999). The expression patterns of the NtSus genes differed from each other in the three stress experiments (Fig. 7). Specifically, the transcription levels of Ntab0288750 and Ntab0234340 were significantly up-regulated (over sixfold) under the low temperature and virus treatments, respectively, suggesting that these genes might encode enzymes that are key to supplying glycolysis under cold and virus infection conditions. Two genes (Ntab0820630 and Ntab0298870) showed increased transcription levels (around twofold) under drought treatment, while the levels of the other NtSus genes were similar to the control. The biological relevance of this result remains to be confirmed, as treatment with PEG2000 does not fully simulate drought conditions (Verslues et al. 2006).
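For readers who wish to reproduce fold-change estimates of the kind reported above (over sixfold, around twofold), relative expression from RT-qPCR data is commonly calculated with the 2^-ΔΔCt method. The sketch below is illustrative only: it assumes that this method, a single reference gene, and roughly 100% primer efficiency apply, none of which is stated in this section, and the function name and Ct values are hypothetical.

```python
def fold_change(ct_target_treated, ct_ref_treated,
                ct_target_control, ct_ref_control):
    """Relative expression (treated vs. control) by the 2^-ddCt method.

    Assumes ~100% amplification efficiency and one reference gene;
    these are assumptions of this sketch, not details from the study.
    """
    # Normalise the target gene to the reference gene in each condition.
    delta_ct_treated = ct_target_treated - ct_ref_treated
    delta_ct_control = ct_target_control - ct_ref_control
    # A lower Ct means more transcript, hence the negative exponent.
    delta_delta_ct = delta_ct_treated - delta_ct_control
    return 2.0 ** (-delta_delta_ct)


# Hypothetical Ct values: the target crosses threshold 3 cycles earlier
# after treatment, while the reference gene is unchanged.
print(fold_change(22.0, 18.0, 25.0, 18.0))  # 8.0
```

Under these assumptions, a gene whose normalised Ct drops by about 2.6 cycles after treatment would be reported as roughly sixfold up-regulated, and a drop of one cycle corresponds to a twofold increase.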
The sequencing of the whole tobacco genome conducted by the China Tobacco Gene Research Center provides the plant biology community with a wealth of new information for functional genomics. Benefitting from this sequencing database (manuscript under review), we identified and characterized the tobacco Sus gene family, considering the physical structures, evolutionary histories, and expression patterns of these genes in different tobacco plant materials. Our findings will be helpful for efforts to further understand the functions of these important enzymes in various growth and developmental processes in tobacco.
Author contribution
ZW and JY conceived and designed the experiments. ZW conducted the experiments and prepared the initial draft of the manuscript. PW, MW, FL, ZL, JZ and XX performed RT-PCR and RT-qPCR. JY, AC, YX and PC performed part of the bioinformatics analysis. FL and JY provided suggestions on the writing and revised the manuscript. All authors read and approved the manuscript.