Introduction

Sugarcane (Saccharum spp. hybrid) is a C4 perennial plant belonging to the family Poaceae. It is grown commercially in 106 countries within tropical and subtropical regions, which are known for their hot and humid environments and highly fertile lands [1]. Sugarcane is important as a crop because it produces sucrose in large volumes (tons) with high product value (dollars) [2, 3]. Globally, 80% of sugar is obtained from sugarcane, which has a high capacity to store sugar in the mature internodes, with a potential of up to 0.7 MPa [4]. Sucrose synthesized in green sugarcane leaves via photosynthesis is transported through the phloem to other plant tissues, where it is used or stored [5,6,7,8]. Sucrose is moved into parenchyma cells for accumulation, where it is cleaved and resynthesized [9]. Its metabolism is catalyzed by several key enzymes, such as sucrose synthase (SuSy), sucrose phosphate synthase (SPS), soluble acid invertase (SAI), neutral invertase (NI), and cell wall invertase (CWIN) [10].

Sucrose synthase (SuSy) is a crucial enzyme in sucrose metabolism and plant growth. It cleaves sucrose into fructose and a nucleotide-bound glucose moiety, either uridine 5′-diphosphate glucose (UDPG) or adenosine diphosphate glucose (ADPG) [11]. SuSy is active in immature parts of sugarcane stems [12, 13], and its activity is negatively associated with sucrose levels and positively associated with hexose levels [14]. In general, hexoses and their derivatives feed into many metabolic pathways, such as energy production, and plants with enhanced SuSy activity grow better, with increased xylem area and xylem cell-wall width [15]. For example, downregulation of cucumber sucrose synthase 4 (CsSuSy4) suppressed the growth and development of flowers and fruit because of the low availability of hexose, starch, and cellulose [16].

Sucrose phosphate synthase (SPS5) is an important gene that contributes actively to the synthesis of sucrose from uridine diphosphate-glucose (UDPG) and fructose-6-phosphate (F6P) in different plant species [17, 18]. In addition to sucrose synthesis, SPS is associated with several vital agronomic characteristics, mainly plant height and yield [19, 20].

Cell wall invertase (CWIN1) is a key enzyme in sucrose metabolism. It catalyzes the irreversible hydrolysis of sucrose into hexoses (glucose and fructose), which are subsequently imported into the cells of growing tissues with the help of sugar transporters, so CWIN appears critical for the proper metabolism, development, and differentiation of plant cells. From a global commercial point of view, sucrose content is an essential trait of commercial sugarcane varieties and has received considerable attention from researchers [21].

The improvement of sucrose yield in sugarcane has remained an ultimate goal in recent years [3], and breeders are applying advanced genomic approaches to incorporate the diversity of alleles into the breeding programs via gene mining from wild relatives [22].

Omics approaches such as genomics, transcriptomics, proteomics, and metabolomics are widely applied in molecular investigations. Analysis of their output is based on logical approaches, bioinformatics, computational scrutiny, and several interdisciplinary biological concepts [23,24,25]. The consistency and predictability of transgenic technologies play a vital role in rapidly producing crops with higher nutritional value and strong resistance to biotic and abiotic challenges, such as insects, fungal pathogens, herbicides, salinity, drought, and cold stresses [26]. Recent omics technologies have contributed significantly to comprehensive insights into the sugarcane genome and to the development of commercial varieties with target traits [27].

Transcriptomic studies of sugarcane have emerged as a potent tool for the functional characterization of unknown genes [28, 29]. Transcriptomics reduces the complexity of the data because only the genes active in the target cells or tissues at the time and position of sampling are considered. Transcriptomic methods have been used to compare similar tissues at different developmental stages in various sugarcane varieties grown under different conditions [24]. Transcriptomic analysis has also been conducted to explore the molecular mechanism behind the regulation of sucrose content in sugarcane [30].

This study was designed to compare a high-sugar mutant clone, GXB9, with its low-sugar parental clone, B9, at the mature stage using a transcriptomic approach. B9 was introduced from Brazil into China in 1999 and shows high yield and good morphological and physiological traits but low sugar content [31]. On 15 October 2013, a high-sugar clone was detected in a population of the low-sugar parental clone B9 and was named Guixuan B9 (GXB9) [32]. From 2013 onward, sugar content was closely monitored in both clones under the same field conditions at different locations, following the methods of [33, 34], using a refractometer (ATAGO, Co. Ltd., China) and a Polartronic M 202 TOUCH instrument (589 + 882 nm: SCHMIDT + HAENSCH GmbH & Co., Berlin, Germany). GXB9 consistently produced higher sugar content than B9.

Furthermore, a simple sequence repeat (SSR) marker-based examination showed genetic variation between the high and low sucrose clones, confirming that the GXB9 clone is a mutant [31, 35]. Based on the phenotypic observations, the difference in sugar content, and the SSR analysis, a non-parametric transcriptomic study was planned to compare the high sugar mutant GXB9 with the low sugar clone B9 and to find the genes associated with sucrose metabolism and accumulation. To our knowledge, this is the first comparative transcriptomic study of the high sucrose mutant GXB9 and its low sucrose parental clone B9.

Results

Sequencing and assembly analysis

RNA from each sample was extracted and sequenced on the Illumina HiSeq 2000 high-throughput paired-end sequencing platform. After quality assessment and data filtration, high-quality reads were selected for de novo assembly. Using Trinity software, 241,184 transcripts were assembled with an average length of 701 bp and an N50 length of 1226 bp. These transcripts represented 100,262 unigenes, with a mean length of 1227 bp and an N50 of 2388 bp. Of the unigenes, 44,392 (44.28%) were between 200 and 500 bp in length, 15,290 (15.25%) between 500 and 1000 bp, 18,205 (18.16%) between 1000 and 2000 bp, and 22,375 (22.32%) between 2000 and 3000 bp (Table 1).

Table 1 Sugarcane transcriptome summary of assembled transcripts and unigenes
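
The assembly statistics reported above (mean length and N50) can be illustrated with a minimal sketch. It assumes the usual definition of N50 as the length of the sequence at which the cumulative sum of lengths, sorted from longest to shortest, first reaches half of the total assembled bases. The function names and the toy lengths are hypothetical and are not taken from the Trinity output of this study.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths (in bp)."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("at least one sequence length is required")
    half_total = sum(ordered) / 2
    running_total = 0
    for length in ordered:
        running_total += length
        # N50 is the length at which the cumulative sum first reaches
        # half of the total assembled bases.
        if running_total >= half_total:
            return length


def mean_length(lengths):
    """Return the arithmetic mean of the sequence lengths."""
    return sum(lengths) / len(lengths)


# Hypothetical toy assembly, not data from this study.
toy_lengths = [100, 200, 300, 400, 500]
print("N50:", n50(toy_lengths), "mean:", mean_length(toy_lengths))
```

Because N50 is weighted toward long sequences, it can be much larger than the mean length, as is the case for the unigene set described here.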

Functional annotations analysis

Of the 100,262 unigenes, 69,637 (69.46%) were annotated in at least one public database (Table 2). Specifically, 22,595 (22.54%) unigenes were homologous to entries in the COG database, 48,718 (48.59%) in GO, 24,170 (24.10%) in KEGG, 35,352 (35.26%) in KOG, 44,559 (44.44%) in Pfam, 36,921 (36.82%) in Swiss-Prot, 62,430 (62.27%) in eggNOG, and 67,396 (67.22%) in NR. The remaining 30,625 (30.54%) unigenes did not match any database, suggesting that they may be novel genes, although some of them may represent non-coding RNAs.

Table 2 Functional annotation of assembled unigenes
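
The relationship between the per-database counts and the overall annotated total can be sketched as follows. The sketch assumes that the overall figure is the union of unigenes with a hit in at least one database, which is why it exceeds every single-database count without being their sum. The function name, database labels, and unigene identifiers are hypothetical toy values.

```python
def summarize_annotation(hits_by_database, total_unigenes):
    """Count annotated unigenes per database and across all databases."""
    per_database = {
        name: (len(ids), round(100 * len(ids) / total_unigenes, 2))
        for name, ids in hits_by_database.items()
    }
    # A unigene counts as annotated if it has a hit in at least one database.
    annotated = set().union(*hits_by_database.values())
    unannotated = total_unigenes - len(annotated)
    return {
        "per_database": per_database,
        "annotated": (len(annotated), round(100 * len(annotated) / total_unigenes, 2)),
        "unannotated": (unannotated, round(100 * unannotated / total_unigenes, 2)),
    }


# Hypothetical toy input: five unigenes, three databases.
toy_hits = {
    "NR": {"u1", "u2", "u3"},
    "GO": {"u2", "u3"},
    "KEGG": {"u3", "u4"},
}
summary = summarize_annotation(toy_hits, total_unigenes=5)
print(summary)
```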

NR species homology and analysis

Among the BLASTx top hits in the NR database, 22,970 (34.16%) unigenes matched Sorghum bicolor proteins, followed by Zea mays 6880 (10.23%), Setaria italica 1746 (2.60%), Peniophora sp. 1312 (1.95%), Quercus suber 1230 (1.83%), Panicum hallii 1166 (1.73%), Dothistroma septosporum 1141 (1.70%), Oryza sativa japonica group 947 (1.41%), Saccharum hybrid cultivar R570 870 (1.29%), Dichanthelium oligosanthes 772 (1.15%), and other species 28,207 (41.95%) (Fig. 1).

Fig. 1 Homology of sugarcane unigenes to other species in NR analysis

Functional characterizations of clusters of orthologous groups (COG) analysis

The assembled unigenes were searched against the COG database for functional prediction and classification. A total of 25,543 (25.46%) unigenes were assigned functions and classified into 25 COG categories (Fig. 2). The general function prediction class contained 2980 (13.33%) unigenes and constituted the largest functional group, followed by translation, ribosomal structure and biogenesis 2820 (12.61%), carbohydrate transport and metabolism 2348 (10.5%), posttranslational modification, protein turnover, and chaperones 2213 (9.9%), and amino acid transport and metabolism 1678 (7.5%). The energy production and conversion 1634 (7.31%), lipid transport and metabolism 1634 (7.31%), and signal transduction mechanisms 1624 (7.26%) categories had nearly equal numbers of unigenes. These were followed, in descending order, by secondary metabolites biosynthesis, transport, and catabolism 1361 (6.09%), cell wall/membrane/envelope biogenesis 1018 (4.55%), inorganic ion transport and metabolism 1005 (4.49%), transcription 977 (4.37%), coenzyme transport and metabolism 876 (3.92%), replication, recombination and repair 734 (3.28%), and defense mechanisms 659 (2.95%). The function unknown group had 599 (2.68%) unigenes. The remaining groups were nucleotide transport and metabolism 551 (2.46%), cell cycle control, cell division, chromosome partitioning 261 (1.17%), mobilome, prophages, transposons 167 (0.75%), cell motility 120 (0.54%), intracellular trafficking, secretion and vesicular transport 102 (0.46%), cytoskeleton 81 (0.36%), extracellular structure 46 (0.21%), chromatin structure and dynamics 37 (0.17%), and RNA processing and modification 18 (0.08%). No unigenes were assigned to the nuclear structure group.

Fig. 2 Sub-functional categories of sugarcane unigenes in the COG classification. The Y-axis presents the number of unigenes, and the X-axis displays the sub-functional categories of unigenes

Gene ontology (GO) analysis

The gene ontology (GO) database predicts and describes gene function and is divided into three major categories: cellular component (CC), molecular function (MF), and biological process (BP). The assembled unigenes were searched against the GO database to determine their functions within these three categories. In total, 48,718 (48.59%) unigenes were classified into 52 sub-functional classes containing 285,757 GO terms. The cellular component category accounted for the largest share, with 128,321 (44.90%) terms in 15 sub-functional groups, followed by biological process with 99,064 (34.66%) terms in 22 sub-functional categories and molecular function with 58,372 (20.43%) terms in 15 sub-functional types (Fig. 3).

Fig. 3 GO functional prediction classification of the annotated sugarcane unigenes. The ordinate indicates the number of unigenes in sub-functional classes, and the abscissa denotes functional subcategories: cellular components (CC), molecular function (MF), and biological process (BP), along with sub-functional sets

Among cellular component sub-classes, cell with 27,524 (13.48%) terms and cell part with 27,485 (13.46%) terms were the most represented, followed by organelle 21,434 (10.49%) and membrane 17,751 (8.69%). Within the biological process sub-groups, metabolic process 25,683 (12.57%), cellular process 24,308 (11.91%), and single-organism process 16,456 (8.05%) were the most represented. Among molecular function sub-types, catalytic activity and binding were the most enriched, with 24,361 (11.93%) and 24,031 (11.77%) terms, respectively. The unigenes assigned to sub-functional groups such as signal transducer, transporter activity, binding, developmental process, and signaling might be closely linked to sucrose content, growth, and disease response, and provide useful information for future studies.

Kyoto Encyclopedia of Genes and Genomes pathway (KEGG) analysis

All the assembled unigenes were mapped to the KEGG pathway database to gain a comprehensive understanding of the complex biological and metabolic pathways of sugarcane. In total, 25,338 (25.27%) unigenes were assigned to 130 pathways (Additional file 1: Table S1). The least enriched pathway was anthocyanin 1 (0.004%), while the most enriched pathways were ribosome 2094 (8.26%), followed by carbon metabolism 1193 (4.71%), biosynthesis of amino acids 875 (3.45%), protein processing in endoplasmic reticulum 804 (3.17%), spliceosome 706 (2.79%), oxidative phosphorylation 675 (2.66%), RNA transport 577 (2.28%), glycolysis/gluconeogenesis 548 (2.16%), endocytosis 527 (2.07%), and metabolism 526 (2.06%).

Simple sequence repeats (SSR) discovery and analysis

SSR markers were detected in unigenes longer than 1000 bp using the microsatellite identification tool (MISA software). A total of 40,580 unigene sequences (≥ 1 kb) were subjected to MISA for SSR identification. Finally, 15,476 SSRs were extracted; 564 were present in complex formation, and 2794 sequences contained more than one SSR. The most abundant repeat type was mononucleotide 7322 (47.31%), followed by trinucleotide 4629 (29.91%), dinucleotide 2174 (14.04%), tetranucleotide 209 (1.35%), pentanucleotide 34 (0.21%), and hexanucleotide 22 (0.14%), along with complex repeat SSR 510 (3.30%) and C* 12 (0.08%) (Fig. 4A). Collectively, 95 categories of nucleotide motif repeats were identified among the 15,476 SSR loci. The most frequent repeat motif was A/T 7504 (48.49%), followed by CCG/CGG 1972 (12.74%), AG/CT 1301 (8.41%), AGC/CTG 779 (5.03%), AGG/CCT 655 (4.23%), AC/GT 591 (3.82%), ACG/CGT 420 (2.71%), ACC/GGT 398 (2.57%), and AT/AT 362 (2.33%); all other repeat motifs together accounted for 1494 (9.65%) (Fig. 4B).

Fig. 4 Numbers of SSR types (A) and repeats (B) in sugarcane unigenes. The Y-axis shows the frequency of SSR types and repeats, and the X-axis shows the categories of SSR types and repeats
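
The idea behind perfect SSR detection and the grouping of motifs such as AG/CT can be sketched with a short regular-expression scan. This is not MISA and does not reproduce its compound-SSR handling; the minimum repeat counts in `MIN_REPEATS`, the function names, and the toy sequence are assumptions chosen for illustration only.

```python
import re

# Assumed minimum repeat counts per motif length (illustrative only).
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def is_composite(motif):
    """True if the motif is a repeat of a shorter unit, e.g. 'AA' or 'ATAT'."""
    n = len(motif)
    return any(n % k == 0 and motif == motif[:k] * (n // k) for k in range(1, n))


def canonical_motif(motif):
    """Group a motif with its rotations and reverse complement (CT -> AG)."""
    reverse_complement = motif.translate(COMPLEMENT)[::-1]
    candidates = []
    for unit in (motif, reverse_complement):
        candidates.extend(unit[i:] + unit[:i] for i in range(len(unit)))
    return min(candidates)


def find_ssrs(sequence, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs."""
    sequence = sequence.upper()
    hits = []
    for unit_len, min_count in sorted(min_repeats.items()):
        # Group 2 captures one repeat unit; \2{n,} requires further copies.
        pattern = re.compile(r"(([ACGT]{%d})\2{%d,})" % (unit_len, min_count - 1))
        for match in pattern.finditer(sequence):
            motif = match.group(2)
            if is_composite(motif):
                continue  # e.g. a poly-A run seen through the dinucleotide pattern
            hits.append((match.start(), motif, len(match.group(1)) // unit_len))
    return sorted(hits)


# Hypothetical toy sequence: a poly-A run followed by an AG repeat.
toy_unigene = "GG" + "A" * 12 + "CC" + "AG" * 7 + "TT"
toy_hits = find_ssrs(toy_unigene)
print(toy_hits, [canonical_motif(motif) for _, motif, _ in toy_hits])
```

Grouping by a canonical motif is what allows complementary repeats on opposite strands to be reported together, as in the A/T and CCG/CGG classes listed above.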

Single nucleotide polymorphisms (SNPs) analysis

Overall, 63,000, 2 putative SNP positions were identified in 60,576 unigenes (Additional file 2: Table S2). GO annotation of selected unigenes with unique SNPs in the respective groups showed that several significant categories were linked with the sugarcane genotypes. For example, 6 unigenes were assigned to the sucrose synthase activity category, 90 to the cell wall category, 18 to the sugar proton transporter category, and 13 to the photosynthesis category.

Differential expression genes (DEGs) screening and analysis

Differentially expressed genes (DEGs) were screened using false discovery rates (FDR) obtained with the Benjamini–Hochberg (BH) method; the screening criteria were FDR < 0.01 and an absolute log2 ratio ≥ 2. Comparing the immature and maturing internodes of the high-sugar mutant clone with the same internodes of the low-sugar parental clone yielded a substantial number of DEGs under these criteria (Table 3).

Table 3 Statistics of differentially expressed genes obtained from mutant vs parental clone
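
The screening rule described above (BH-adjusted FDR < 0.01 and absolute log2 ratio ≥ 2) can be sketched as follows. The statistical test that produced the raw p-values is not described in this section, so the sketch starts from p-values and log2 ratios that are assumed to be already available. The gene identifiers, values, and function names are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return BH-adjusted p-values (FDR) in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values stay monotone.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, pvalues[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


def screen_degs(records, fdr_cutoff=0.01, log2_cutoff=2.0):
    """Keep genes with FDR < fdr_cutoff and |log2 ratio| >= log2_cutoff.

    records: list of (gene_id, log2_ratio, raw_pvalue) tuples, where
    log2_ratio is mutant versus parental expression.
    """
    fdrs = benjamini_hochberg([pvalue for _, _, pvalue in records])
    degs = []
    for (gene_id, log2_ratio, _), fdr in zip(records, fdrs):
        if fdr < fdr_cutoff and abs(log2_ratio) >= log2_cutoff:
            direction = "up" if log2_ratio > 0 else "down"
            degs.append((gene_id, direction, fdr))
    return degs


# Hypothetical toy records, not results from this study.
toy_records = [
    ("g1", 3.1, 0.0001),    # passes both criteria, up-regulated
    ("g2", -2.5, 0.0002),   # passes both criteria, down-regulated
    ("g3", 4.0, 0.03),      # large fold change but FDR too high
    ("g4", 0.5, 0.5),       # fails both criteria
    ("g5", 1.2, 0.00001),   # significant but fold change too small
]
toy_degs = screen_degs(toy_records)
print(toy_degs)
```

Applying both thresholds together keeps only genes whose change is both statistically supported and large in magnitude, which is why a gene such as `g5` in the toy data is excluded despite its small p-value.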

Sucrose metabolism-associated DEGs

Five DEGs encoding sucrose synthase (SuSy2: EC: 2.4.1.13) were identified in the immature internodes of the high sucrose mutant clone relative to the low sucrose parental clone; 3 unigenes were up-regulated and 2 were down-regulated. Two DEGs encoding cell wall invertase (CWIN1: EC: 3.2.1.26) were up-regulated in the immature internodes of the high sucrose clone relative to the low sucrose clone. Three DEGs encoding sucrose phosphate synthase 5 (SPS5: EC: 2.4.1.14) were up-regulated in the maturing internodes of the high sucrose mutant clone and down-regulated in the same internodes of the low sucrose parental clone (Additional file 3: Table S3).

Cell wall synthesis

Twelve DEGs encoding cellulose synthase (EC: 2.4.1.12), namely CesA, CesA1, CesA5, CesA6 (2), CesA7, CesA8 (2), CesA10, CesA12, cellulose synthase-like protein E6, and cellulose synthase-like family (CSLF3: EC: 2.4.1.–), together with 6 DEGs encoding pectin esterase (EC: 3.1.1.11), were up-regulated in both the immature and maturing internodes of the high sucrose mutant clone compared with the low sucrose parental clone. In addition, cellulose synthase-like protein D2 was up-regulated in the immature internodes of the high sucrose mutant clone but not in those of the low sucrose parental clone. In the maturing internodes of the high sucrose mutant clone, 3 DEGs involved in cellulose synthase (UDP-forming) synthesis, including cellulose synthase, CesA4, mixed-linked glucan synthase (EC: 2.4.1.–), and secondary cell wall MYB 4, were up-regulated, and 3 DEGs encoding pectin esterase 15 (EC: 3.1.1.11) and pectin esterase inhibitors 8 and 51 were down-regulated compared with the low sucrose parental clone (Additional file 3: Table S3).

Sugar transporter/SWEET DEGs

Nineteen DEGs were associated with sugar transporters/SWEETs. In the immature internodes of the high sucrose mutant clone, 6 DEGs, including sugar transporters SWEET11, SWEET14, SWEET3a, MST1 (2), and sugar-phosphate translocator At2g25520, were up-regulated, and 8 DEGs, including sucrose transport protein SUT1, sugar transporter ERD6-like 16 (2), MST4, SWEET4, SWEET15, MST1, and SWEET13, were down-regulated compared with the low sucrose parental clone. In the maturing internodes of the high sucrose mutant clone, 3 DEGs associated with sugar transporter SWEET3a, sugar transport protein MST5, and sugar proton symporter activity (GO: 0005351) were up-regulated, while 2 DEGs linked with sugar transport proteins MST4 and MST8 were down-regulated compared with the low sucrose parental clone (Additional file 3: Table S3).

Sucrose signaling

Nine DEGs were associated with trehalose-phosphate phosphatase/synthase synthesis in the immature internodes of the high sucrose mutant clone: 8, including TPP9, TPP6 (2), TPP1, TPS9 (3), and TPS6, were down-regulated, and 1, encoding TPS1, was up-regulated. In the maturing internodes of the high sucrose mutant clone, all 9 DEGs encoding probable alpha-trehalose phosphate synthase/phosphatase (UDP-forming: EC: 2.4.1.15, 3.1.3.12), including TPS9 (3), TPS11 (2), TPP6, TPS, TPS6, and TPS1, were down-regulated. Conversely, all the DEGs linked with the trehalose pathways were up-regulated in the immature and maturing internodes of the low sucrose parental clone (Additional file 3: Table S3).

Transcription factor (TF) analysis

A total of 240 DEGs encoding transcription factors (TF) were obtained from the immature and maturing internodes of the high sucrose mutant clone compared with the low sucrose sugarcane parental clone. Among them, 118 DEGs were up-regulated and 122 down-regulated. A heatmap was created to understand further the expression profiles of the identified TF DEGs in sugarcane (Fig. 5; Additional file 4: Table S4).

Fig. 5 Heatmap of the DEGs encoding TFs in the immature and maturing internodes of the high sucrose sugarcane mutant clone compared to the parental clone. C–R indicates control replicates, TR indicates treatment replicates

Protein kinase (PTK) analysis

Protein kinases (EC: 2.7.11.1) are an essential family of enzymes that play an important role in plant signaling. In total, 364 DEGs encoding protein kinases were differentially expressed in the immature and maturing internodes of the high sucrose mutant compared with the low sucrose parental clone; 226 DEGs were up-regulated and 138 were down-regulated. A heatmap was created to show the expression profiles of the identified protein kinase DEGs (Fig. 6; Additional file 5: Table S5).

Fig. 6 Heatmap of the DEGs encoding protein kinase (PTK) in the immature and maturing internodes of the high sugar mutant clone compared to the parental clone. C–R indicates control replicates, TR indicates treatment replicates

Phytohormones signaling analysis

Genes associated with different plant growth hormones were differentially expressed in immature and maturing internodes of sugarcane under study.

Auxin

The analysis identified 25 DEGs linked to auxin in the immature and maturing internodes of the high sucrose mutant clone compared with the low sucrose parental clone. Of these, 16 DEGs were found in the immature internodes: those encoding auxin efflux carrier 9, auxin response factors (15, 1, 8), and auxin-responsive proteins (IAA30, IAA26, SAUR50, 5NG4) were up-regulated, whereas those encoding auxin-responsive proteins (SAUR32, 2: SAUR36), auxin response factors (2: factor 23, 4), and auxin transporter-like protein 2 were down-regulated. The remaining 9 DEGs were found in the maturing internodes: those associated with aldehyde dehydrogenase (EC: 1.2.1.3), auxin-responsive protein IAA26, auxin response factor 8, auxin efflux (2: GO: 0010315), and the auxin-activated signaling pathway (GO: 0009734) were up-regulated, whereas those encoding indole-3-acetic acid-induced protein ARG2, auxin-responsive protein IAA18, and auxin transporter-like protein 2 were down-regulated.

Gibberellic acid

Five gibberellic acid (GA)-associated DEGs, including GA2-oxidase (2: EC: 1.14.11.13), GA-receptor GID1L2, GA controlled protein 5, and the GA responding biological process (GO: 0009739), were up-regulated, and the GA metabolic process (GO: 0009685) was down-regulated in the immature internodes of the high sucrose mutant clone compared with the low sucrose parental clone. In the maturing internodes, GA regulated protein 5 was up-regulated and the GA metabolic process (GO: 0009685) was down-regulated.

Abscisic acid

Compared with the parental clone, the immature internodes of the high sucrose mutant clone had 1 up-regulated DEG involved in the synthesis of ABA8 and 3 down-regulated DEGs associated with ABA-induced proteins, ABA-PYL8, and ABA signaling pathways (GO: 0009738). In the maturing internodes of the high sucrose mutant clone, 2 DEGs linked with ABA-PYL4 were up-regulated, and DEGs linked with ABA-PYL8, ABA stress-ripening protein 1, and ABA (GO: 0009737) signaling pathways were down-regulated compared with the low sucrose parental clone.

Ethylene

In total, 55 DEGs linked with ethylene were obtained from the immature and maturing internodes of the high sucrose mutant clone compared with the low sucrose parental clone. In the immature internodes, 26 DEGs, including those encoding 1-aminocyclopropane-1-carboxylate synthase 1/2/6 (ACS, EC: 4.4.1.14), aminocyclopropanecarboxylate oxidase (5: ACO, EC: 1.14.17.4), ethylene insensitive 3-like 3 protein (2), methylenetetrahydrofolate reductase and 1(2), AP2-like ethylene-responsive transcription factor ANT (2), ethylene-responsive transcription factor (ERF034, ERF043, ERF109), AP2-ERF-At1g79700, ethylene biosynthetic process (6: GO: 0009693), negative regulation of ethylene biosynthetic process (GO: 0010366, GO: 0010105), and response to ethylene (GO: 0009723), were up-regulated, and 17 DEGs involved in ethylene-responsive TF (2: ERF027, ERF014), ethylene biosynthetic/signaling pathway (GO: 0010364; GO: 0009873; GO: 0009723), and ethylene-responsive TF (2: ETR1, AP2-ERF-AIL1, ERF003, ERF026, EILP3, ERF113, PEI2X2, ERF113, EIP2) were down-regulated. Twelve DEGs were expressed in the maturing internodes: 4 DEGs, such as AP2-ERF-PLT1 and ethylene biosynthetic process (GO: 0009693; GO: 0009723), were up-regulated, and 8 DEGs associated with ethylene-responsive TF (4: ERF1, ERF109), ethylene-overproduction protein (2: EOP1), and ethylene biosynthetic process (GO: 0010364) were down-regulated (Additional file 4: Table S4).

Authentication by qRT-PCR

Triplicate biological and technical replicates were used for every sample in the qRT-PCR validation. The qRT-PCR results showed that all the selected genes were up-regulated. Although expression levels varied slightly, the genes generally followed the same trends as in the RNA-Seq results, indicating that the RNA-Seq results were reliable (Fig. 7).

Fig. 7 Validation of selected differentially expressed genes by qRT-PCR. The error bar represents the SE. Triplicate biological and technical replicate approaches were applied to authenticate RNA-Seq data by qRT-PCR

Discussion

Sucrose yield is a highly desirable objective in sugarcane, and sucrose has a critical role in plant growth, development, signal transduction, storage volume, and acclimation to environmental pressures. In the present study, DEGs in the immature and maturing internodes of the high sucrose mutant clone (GXB9) at the maturity stage were compared with those in the same internodes of the low sucrose parental clone (B9). According to the chosen criteria, DEGs linked with sucrose metabolism and accumulation in GXB9 vs B9 were the key analysis targets.

In the current study, SuSy genes were active in immature sugarcane internodes, where developing tissues require hexoses, which explains why immature internodes show no significant sucrose accumulation. Maturing internodes showed no activity of SuSy genes, indicating that sucrose is not broken down by SuSy and that more sucrose is available for accumulation. This behavior of SuSy is therefore considered a reason for the higher sucrose accumulation in the high sucrose mutant clone than in the low sucrose parental clone. SuSy expression in top immature internodal tissues is negatively correlated with sucrose accumulation in sugarcane stalks, so our results agree with [36].

In this study, CWIN1 was up-regulated in the immature internodes of both sugarcane clones, suggesting a role in growth and pleiotropic effects. However, CWIN1 was not expressed in the maturing internodes of the mutant clone but was up-regulated in those of the parental clone, which strengthens the assumption of more sucrose synthesis and accumulation in the high sucrose mutant clone, regulated by sucrose phosphate synthase (SPS). Cell wall invertase (CWIN) cleaves sucrose in sink tissues, particularly in top immature internodes, but its activity decreases as the internode matures; the same pattern was observed in the present results, consistent with [37, 38].

Several SPS investigations in sugarcane have shown that the transcript expression and activity of SPS genes were higher in the mature internodes than in the immature internodes of all sugarcane cultivars studied [14, 39]. The present study found that the SPS5 gene was up-regulated in the maturing internodes of the mutant clone and down-regulated in the same internodes of the low sucrose parental clone. This up-regulation is positively correlated with the high sucrose accumulation in the mutant clone and is consistent with previous reports. Higher SPS activity has been linked with higher sucrose concentration in mature internodes of sugarcane varieties, and lower activity with low sucrose in immature internodes. This contradicts the results reported by [40, 41], where immature internodes had higher SPS activity than mature internodes, but agrees with [39], who recorded higher SPS activity in mature internodes of high sucrose sugarcane than in immature internodes.

Trehalose is a disaccharide made of two glucose molecules, and trehalose-6-phosphate (T6P) is its metabolic precursor [42]. T6P is formed from UDPG and glucose-6-phosphate (G6P) by trehalose-6-phosphate synthase (TPS) and is then converted into trehalose by trehalose-6-phosphate phosphatase (TPP) [43]. Ultimately, trehalose is cleaved into two glucose molecules by trehalase [44]. Depending on the needs of different tissues, T6P functions as a signaling molecule and a negative feedback regulator of sucrose in source leaves [9, 45]. It affects sucrose concentration by influencing sucrose synthesis in leaves and its consumption in sinks for growth and development, including embryo development and leaf senescence [46]. TPS and TPP genes are present in species of all major plant taxa [47,48,49,50,51]. In this study, the down-regulation of trehalose pathway DEGs in the high sucrose mutant clone and their up-regulation in the low sucrose parental clone may have an essential role in the higher sucrose accumulation in the mutant clone.

Cellulose synthase produces cellulose, a polysaccharide that is a major component of the plant cell wall. This enzyme functions in a large synthetic complex found in algae and plants. Cellulose is a linear polymer of glucose derived from the activated sugar donor UDP-glucose, which is made available by sucrose cleavage by sucrose synthase (SuSy) [52,53,54,55,56]. In sugarcane, stored sucrose in mature internodes may not be cleaved to supply glucose, because other sources, including depolymerization of the cell wall, improved photosynthate production, or other metabolic pathways, are available to provide glucose; sucrose content is therefore not affected, particularly in mature internodes of high sucrose sugarcane [57,58,59]. The up-regulation of multiple CesA complex DEGs in immature internodes reflects the abundant requirement for cellulose in actively growing internodes. Some members of the CesA complex (CesA10, 12, 8) were up-regulated in the maturing internodes of the mutant clone, suggesting that cellulose synthesis is still active and drives the internodes to final maturity.

Pectin is also one of the essential components of the plant cell wall. The term pectin is a collective name for a group of associated polysaccharides in plant cell walls that contribute to complex physiological processes such as cell growth and differentiation and control the integrity and stiffness of plant tissue. Pectinesterases catalyze the demethoxylation of pectin and influence the biological structure of the plant cell wall [60]. They also regulate several growth processes, such as fiber and pollen formation, fruit ripening, plant–pathogen interactions, and vegetative reproduction [61, 62]. In this study, many DEGs encoding pectin esterase were expressed in the immature and maturing internodes of the mutant clone; they were notably up-regulated in the immature internodes and down-regulated in the maturing internodes. This suggests that they are associated with sugarcane internode growth and fiber formation during elongation [63].

Sucrose and sugar transporters are essential proteins in plants for translocating sucrose and sugar molecules from source tissues to consuming and storage sinks. Sucrose is transported into various cells throughout the plant with the help of transporters and SWEETs. Overexpression of genes encoding transporters and SWEETs is positively correlated with the unloading of sucrose into the phloem and with sink strength [64,65,66,67,68]. SWEETs have been reported to be involved in sucrose transport across the plasma membrane in plants such as Lotus japonicus, Sorghum bicolor, and Arabidopsis thaliana [69,70,71]. Recently, N-terminally truncated SPS was shown to have significant activity independent of regulation by allosteric effectors [72]. It is therefore suggested that manipulation of the genes associated with SWEETs, transporters, and SPS function could further increase sucrose accumulation in sugarcane stalks. Moreover, N-terminally truncated SPS could be a future research target for developing sugarcane varieties with higher sucrose production [73]. The differential expression of sucrose transporter genes in the present study shows that various transporters are active in sucrose transport during the growth of sugarcane.

Many complex metabolic regulatory mechanisms help plants cope with environmental conditions through physiological changes driven at the molecular level. Under such conditions, TFs bind specific DNA sequences in target gene promoters to activate or repress transcription and thereby regulate gene expression. WRKY TFs have been reported to respond to a wide range of conditions affecting crop plant growth, development, sugar signaling, sucrose metabolism, product quality, and cellulose, lignin, and cell wall synthesis, as well as to stresses such as drought [74], waterlogging [75,76,77,78], and heat [79, 80]. In sugarcane, WRKY TFs, along with other crucial functions, regulate sugar metabolism and photosynthetic processes [81]. The bHLH TFs are thought to be involved in the regulation of ethylene synthesis in sugarcane [82]. The NAC TF family also has multiple roles in sugarcane development, sugar accumulation, stress tolerance [83], and hypersensitive responses to pathogens [84]. MYB TFs form a functionally diverse group that participates in activities such as stress responses, cell morphogenesis, protein organization, DNA binding, protein–protein interaction, and sucrose storage in sugarcane [85,86,87]. AP2/ERF TFs have also been reported to be involved in the sugarcane response to abiotic stresses, such as temperature, drought, and salt [88,89,90]. The basic leucine zipper (bZIP) TFs are sensitive to changes in nutrients, abiotic stress, and sucrose signaling [91], and the MADS-box TFs appear to be linked with plant development processes, oxidative stress response, environmental variation, and salt and drought stresses [92]. C2H2 zinc finger TFs are involved in secondary cell wall synthesis [93]. In the current study, several DEGs encoding the TFs described above were expressed in the immature and maturing internodes of the sugarcane mutant clones compared with the parental clones. This supports the view that different TFs play different roles throughout the growth and development of the sugarcane plant. Further study of individual TFs in sugarcane would be valuable.

Plant protein kinases work with growth regulators and nutrient signaling pathways, influencing cell cycling and proliferation. CDPK, MAPK, CIPKs, and CBLs are typically linked to various stresses and plant sugar signaling [93]. SWR1 and SWI2/SNF2/SnRK2 are involved in plant development and in responses to environmental changes and biotic and abiotic stresses [94]. The interaction between T6P and SnRK1 (the plant ortholog of SNF1) significantly influences the control of plant carbon distribution and consumption [42]. SnRK1 also has a critical role in plant acclimation to various conditions [95,96,97]. Downregulation of CDPK in plants such as the castor oil plant was reported to be linked with high sucrose content [98]. In the present study, DEGs encoding MAPK and the kinase members SnRK1 and SnRK2 were downregulated in the maturing internodes of the mutant clones, which is consistent with high sucrose accumulation in the stalks.

In this study, DEGs associated with auxin-related enzymes were significantly expressed in the immature internodes of GXB9 compared with the B9 clone. Members of the SAUR (Small Auxin Up-regulated RNA) group are highly active in young, growing sugarcane tissues and are thought to support growth and development through auxin-induced acid growth [99]. In immature sugarcane internodes, sucrose inversion is regulated by invertases to maintain the supply of sugar for metabolic purposes, and invertase levels are balanced by auxin [100]. In our study, auxin-associated DEGs were highly expressed in the immature internodes of the high-sugar clone, which suggests an important role in sucrose metabolism and is consistent with the previous studies cited here. Ethylene has wide-ranging roles, including in carbohydrate metabolism, sugar signaling, and increased sucrose accumulation in maturing sugarcane internodes [101]. In the current study, ethylene-associated DEGs, particularly those involved in sucrose and starch metabolism, were significantly upregulated in the maturing internodes of the high-sugar clone, which points to their role in sucrose metabolism.

In one study, foliar application of ABA increased the Brix level in sugarcane by 15.5–20.9% compared with the control [102]. The significant expression of ABA-associated DEGs in the high-sugar sugarcane clone in the current study suggests that ABA contributes to various processes, including growth, development, and sugar signaling, and enhances sugar levels.

Gibberellic acid is a multifunctional phytohormone involved in plant growth, development, stress resistance, seed germination, stem elongation, leaf expansion, and carbohydrate metabolism [103]. GA 2-oxidase has been described as a plant growth regulator [104]. GA3 plays an important role in sucrose accumulation in sugarcane [105]. Several gibberellic acid-associated DEGs were found in the immature and maturing internodes of the high-sugar clone in the current study. This indicates that different members of the gibberellic acid family have distinct roles in different sugarcane tissues, including in sucrose accumulation. However, further study of the individual members of each family is needed for a comprehensive picture.

In the current study, the number of DEGs linked to the hormones IAA, ABA, GA, and ETH was higher in the immature internodes than in the maturing internodes of the mutant clones, which suggests a spatiotemporal role for these hormones in sugarcane. The high number of DEGs for these growth-promoting hormones in immature internodes suggests a significant role in growth and development, as well as participation in sugar signaling and sucrose accumulation. The differential expression of genes associated with the various growth-promoting hormones therefore indicates that they are involved in different physiological and signaling processes at different growth stages of sugarcane, and they should be a focus of future research, particularly with respect to sucrose content.

The species homology results for our sugarcane unigenes were highly similar to previous findings, owing to the significant collinearity in the genic regions between the sorghum and sugarcane genomes [106, 107]. Remarkably, only 870 (1.29%) unigenes showed homology with genes of the Saccharum hybrid cultivar R570, which is broadly consistent with an earlier published report [108]. This low homology may be due to high genetic variation among sugarcane varieties, the absence of a sugarcane reference genome sequence, and the limited public data on sugarcane. The mean length (1227 bp) and N50 length (2388 bp) of the assembled unigenes in our study were higher than those reported for GT35 (460 bp and 640 bp) and S. spontaneum (801 bp and 1337 bp) using similar sequencing technologies [109, 110], which demonstrates the high quality of our sugarcane transcriptomic sequences.
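The mean length and N50 statistics compared above can be computed directly from a list of unigene lengths. The following is a minimal Python sketch, assuming the common definition of N50 (the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembled bases). The function names and the example lengths are hypothetical and are not taken from the study's data.

```python
def mean_length(lengths):
    """Arithmetic mean of sequence lengths (bp)."""
    if not lengths:
        raise ValueError("no sequence lengths supplied")
    return sum(lengths) / len(lengths)


def n50(lengths):
    """N50: walk from the longest sequence down and return the length at
    which the running total first reaches half of all assembled bases."""
    if not lengths:
        raise ValueError("no sequence lengths supplied")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length


# Hypothetical unigene lengths in bp, for illustration only.
example_lengths = [100, 200, 300, 400, 500]
example_mean = mean_length(example_lengths)
example_n50 = n50(example_lengths)
```

Because N50 weights long sequences more heavily than the mean does, an assembly with a few very long unigenes can show a high N50 even when its mean length is modest, which is why both values are usually reported together.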

The gene functional classes obtained in the current study, such as carbohydrate transport and metabolism, energy production and conversion, defense mechanisms, and signal transduction mechanisms, could be used to develop molecular markers for exploring agronomic traits, such as sucrose content, biomass production, and biotic and abiotic stress responses, and for germplasm development. Simple sequence repeat (SSR) markers, also called microsatellites, are vital tools for studying genetic diversity, creating genetic maps, and performing comparative genomics. SNPs are also highly useful molecular markers with a wide range of applications, including phylogenetic analysis, marker-assisted selection (MAS), genetic mapping of quantitative trait loci (QTL), bulked segregant analysis, genomic selection, and genome-wide association studies (GWAS).

Conclusion

The extreme level of polyploidy of the sugarcane genome is the basic reason for the lack of a full reference genome sequence [111]. No single gene has yet been found to control sucrose metabolism and accumulation in sugarcane; instead, a network of genes associated with sucrose synthesis, metabolism, and accumulation functions in various tissues at different stages. The key finding of this study was the upregulation of the sucrose phosphate synthase 5 gene (SPS5) in the high sucrose mutant clone and its downregulation in the low sucrose parental clone, suggesting that SPS5 played a predominant role in enhancing the sucrose accumulation ability of the mutant clones. In addition, the absence of sucrose synthase (SuSy) and cell wall invertase (CWIN), the low expression of cellulose synthase, and the downregulation of trehalose genes in the maturing internodes of the mutant clones also contributed to the higher sucrose accumulation in the stalks of the mutant. Single nucleotide variation also points to several low-effect regulatory single nucleotides in sugarcane, with trait expression being the cumulative result of these variations.

Materials and methods

The Sugarcane Research Institute, Guangxi Academy of Agricultural Sciences, provided the high sucrose mutant clone GXB9 and its low sucrose parental clone B9. The sugarcane setts were planted in late March 2021 at the Dingdang experiment base of the Sugarcane Research Institute, Guangxi Academy of Agricultural Sciences, in Longan County (23°13′N 107°98′E), Nanning, China. Standard sugarcane agronomic practices were followed when the setts were planted in an irrigated field. Compound fertilizer (N–P2O5–K2O: 22:8:12), carbendazim fungicide, and metsulfuron herbicide were applied when the bud setts were planted. For transcriptomic analysis, six healthy sugarcane plants of 11 and 12 months of age were randomly selected from each clone population in the middle of February and March 2021. Samples of immature internodes (5, 6) and maturing internodes (13, 14) were collected in triplicate, immediately frozen in liquid nitrogen, and stored at −80 °C for further analysis.

Transcriptome sequencing library preparation

Total RNA was extracted using the RNA prep Pure Plant Kit (Tiangen Biotech-Beijing Co. Ltd) according to the manufacturer's instructions. A total of 1 μg RNA per sample was used as input material for RNA sample preparation. Sequencing libraries were generated using the NEBNext® Ultra RNA Library Prep Kit for Illumina® (NEB, USA) following the manufacturer's recommendations, and index codes were added. Fragmentation was performed using divalent cations under elevated temperature in NEBNext first-strand synthesis reaction buffer (5×). First-strand cDNA was synthesized using a random hexamer primer and M-MuLV reverse transcriptase. Second-strand cDNA synthesis was performed using DNA polymerase I and RNase H. Remaining overhangs were converted into blunt ends via exonuclease/polymerase activities. After adenylation of the 3′ ends of the DNA fragments, the NEBNext adaptor with a hairpin loop structure was ligated to prepare for hybridization. To select cDNA fragments of preferentially 240 bp in length, the library fragments were purified with the AMPure XP system (Beckman Coulter, Beverly, USA). PCR was performed with Phusion high-fidelity DNA polymerase, universal PCR primers, and index (X) primers. The PCR products were purified (AMPure XP system), and the library quality was assessed on the Agilent Bioanalyzer 2100 (Beverly, USA). The index-coded samples were clustered on a cBot Cluster Generation System using the TruSeq PE Cluster Kit v3-cBot-HS (Illumina) according to the manufacturer's instructions. After cluster generation, the library preparations were sequenced on an Illumina Hiseq 2000 platform, and paired-end reads were generated.

Quality control and assembly

The quality of the raw reads was evaluated with FastQC before transcriptome assembly to ensure that high-quality clean reads were obtained. Raw data (raw reads) in fastq format were first processed through in-house Perl scripts. In this step, clean data (clean reads) were obtained by removing reads containing adapter sequences, reads containing poly-N, and low-quality reads from the raw data. At the same time, the Q20, Q30, GC content, and sequence duplication level of the clean data were calculated. All downstream analyses were based on the high-quality clean data, and the transcriptome was assembled using Trinity software with default parameters (https://github.com/trinityrnaseq/trinityrnaseq/wiki).
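The Q20, Q30, and GC-content summaries mentioned above are simple per-base percentages. The in-house Perl scripts are not available, so the Python sketch below only illustrates how such values are commonly derived. It assumes four-line FASTQ records and Phred+33 quality encoding; the function names and the two example reads are hypothetical.

```python
def parse_fastq(text):
    """Yield (sequence, quality) pairs from FASTQ text with 4 lines per record."""
    lines = text.strip().splitlines()
    if len(lines) % 4 != 0:
        raise ValueError("FASTQ input must contain 4 lines per record")
    for i in range(0, len(lines), 4):
        yield lines[i + 1], lines[i + 3]


def summarize_reads(records, offset=33):
    """Return the percentage of bases with Phred quality >= 20 (Q20) and
    >= 30 (Q30), and the percentage of G or C bases (GC)."""
    bases = gc = q20 = q30 = 0
    for seq, qual in records:
        bases += len(seq)
        gc += sum(1 for base in seq.upper() if base in "GC")
        for char in qual:
            score = ord(char) - offset  # Phred+33 decoding (assumed)
            if score >= 20:
                q20 += 1
            if score >= 30:
                q30 += 1
    if bases == 0:
        raise ValueError("no bases found")
    return {
        "Q20": 100.0 * q20 / bases,
        "Q30": 100.0 * q30 / bases,
        "GC": 100.0 * gc / bases,
    }


# Two hypothetical reads: 'I' encodes Phred 40 and '5' encodes Phred 20.
example_fastq = "@r1\nGGCC\n+\nIIII\n@r2\nATAT\n+\n5555\n"
example_summary = summarize_reads(parse_fastq(example_fastq))
```

In this toy input every base reaches Q20, half of the bases reach Q30, and half are G or C.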

Functional annotation assignment of unigenes

All assembled unigenes were searched with BLAST against the COG, GO, KEGG, KOG, Pfam, eggNOG, NR, and Swiss-Prot databases to assign functional annotations. The BLAST criterion was an E value cutoff of no greater than 1e−5. The assembled unigene sequences were also searched against the Pfam database to predict probable functions using HMMER software with an E value threshold of no greater than 1e−10. KEGG analysis was performed using KOBAS2.0 software.
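The E value cutoff described above can be applied as a simple filter on the search results. The sketch below is illustrative only: it assumes the hits are available in the standard 12-column BLAST tabular layout (with the E value in column 11 and the bit score in column 12), which the study does not specify, and it keeps the highest-scoring hit per unigene. The function name and the example identifiers are hypothetical.

```python
def best_hits(tabular_lines, max_evalue=1e-5):
    """Keep, for each query, the hit with the highest bit score among hits
    whose E value does not exceed max_evalue."""
    best = {}
    for line in tabular_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        query, subject = fields[0], fields[1]
        evalue, bitscore = float(fields[10]), float(fields[11])
        if evalue > max_evalue:
            continue  # fails the E value criterion
        if query not in best or bitscore > best[query][2]:
            best[query] = (subject, evalue, bitscore)
    return best


# Hypothetical hits: unigene1 has two passing hits, unigene2 has none.
example_hits = [
    "unigene1\tprotA\t90.0\t200\t20\t0\t1\t200\t1\t200\t1e-50\t300",
    "unigene1\tprotB\t80.0\t150\t30\t0\t1\t150\t1\t150\t2e-06\t120",
    "unigene2\tprotC\t40.0\t50\t30\t0\t1\t50\t1\t50\t0.5\t25",
]
example_annotation = best_hits(example_hits)
```

The same function could be reused for the stricter Pfam threshold by passing `max_evalue=1e-10`, provided the HMMER results are first converted to a comparable table.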

SSRs, SNPs discovery, and primer design

Simple sequence repeats (SSRs) in unigenes longer than 1 kb were identified using MISA (Microsatellite identification tool) (http://pgrc.ipk-gatersleben.de/misa/misa.html), and primers for each SSR were designed using Primer3 (http://primer3.sourceforge.net/releases.Php). For single nucleotide polymorphism (SNP) identification, Picard-tools v1.41 and samtools v0.1.18 were used to sort the bam alignment results of each sample, remove duplicated reads, and merge the results. GATK2 software was used to perform SNP calling. Raw vcf files were filtered with the GATK (Genome Analysis Toolkit) standard filter method and other parameters (clusterWindowSize: 35; MQ0 ≥ 4 and (MQ0/(1.0*DP)) > 0.1; QUAL < 10; QUAL < 30.0 or QD < 5.0 or HRun > 5), and only SNPs with distance > 5 were retained.
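The idea behind SSR detection can be shown with a short regular-expression search for tandemly repeated motifs. This Python sketch is not MISA and does not reproduce its output format. The minimum repeat counts are hypothetical placeholders, since the thresholds used in the study are not stated, and mononucleotide and compound SSRs are ignored.

```python
import re

# Hypothetical minimum repeat counts per motif length (not the study's settings).
MIN_REPEATS = {2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def is_primitive(unit):
    """True if the motif is not itself a repeat of a shorter motif
    (for example, 'ATAT' is 'AT' twice and is rejected)."""
    n = len(unit)
    return not any(
        n % k == 0 and unit == unit[:k] * (n // k) for k in range(1, n)
    )


def find_ssrs(sequence, min_repeats=MIN_REPEATS):
    """Return (start, motif, repeat_count) tuples for perfect tandem repeats."""
    sequence = sequence.upper()
    hits = []
    for unit_len, min_rep in sorted(min_repeats.items()):
        # A motif of unit_len bases followed by at least min_rep - 1 more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for match in pattern.finditer(sequence):
            unit = match.group(1)
            if is_primitive(unit):
                hits.append((match.start(), unit, len(match.group(0)) // unit_len))
    return hits


# Hypothetical sequence containing (AT)7 and (GAA)5.
example_sequence = "GGG" + "AT" * 7 + "CCGT" + "GAA" * 5 + "T"
example_ssrs = find_ssrs(example_sequence)
```

Start positions are 0-based, and the flanking sequence on either side of each hit is what a primer design tool such as Primer3 would then use.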

Quantitative real-time PCR

The accuracy of the RNA-Seq results was validated by quantitative real-time polymerase chain reaction (qRT-PCR). TRIzol® (Cowin Biosciences, Beijing, China) was used to extract RNA from the samples, and RNA quality was evaluated using a Nanodrop 2000. The TAKARA PrimeScript™ RT kit (Biotechnology, Dalian, China) was used to synthesize cDNA according to the manufacturer’s instructions. SYBR Premix Ex Taq™ II was used for qRT-PCR in a LightCycler® 480 II (Roche Applied Science, Germany). The PCR reaction parameters followed those described previously [35], with some modifications. The relative expression of the selected genes was analyzed using the 2^−ΔΔCt method (Livak and Schmittgen 2002). The primers for the internal reference gene glyceraldehyde-3-phosphate dehydrogenase (GAPDH) and the candidate genes were designed using Primer 5.0 software (Premier, Canada) and synthesized by Tsingke Biotechnology (Nanning, China); they are shown in Table 4.

Table 4 Validated genes and primers sequences used in qRT-PCR
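The relative expression calculation used for qRT-PCR validation can be written out in a few lines. The sketch below assumes the usual form of the method: ΔCt is the target gene Ct minus the reference gene (GAPDH) Ct within one sample, ΔΔCt is the ΔCt of the test sample minus the ΔCt of the control sample, and the fold change is 2 raised to −ΔΔCt. The function name and the Ct values are hypothetical.

```python
def relative_expression(ct_target_test, ct_ref_test, ct_target_control, ct_ref_control):
    """Fold change of a target gene in a test sample relative to a control
    sample, normalized to a reference gene (2^-ddCt)."""
    delta_ct_test = ct_target_test - ct_ref_test
    delta_ct_control = ct_target_control - ct_ref_control
    delta_delta_ct = delta_ct_test - delta_ct_control
    return 2.0 ** (-delta_delta_ct)


# Hypothetical Ct values: the target amplifies two cycles earlier in the
# test sample than in the control, with identical reference gene Ct values.
example_fold_change = relative_expression(
    ct_target_test=22.0, ct_ref_test=18.0,
    ct_target_control=24.0, ct_ref_control=18.0,
)
```

With these values ΔΔCt is −2, which corresponds to a four-fold higher expression in the test sample. The calculation assumes that the target and reference amplify with comparable, near 100% efficiency.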

Unigene expression calculation and statistical analysis

The abundance of unigenes was normalized as fragments per kilobase of exon model per million mapped fragments (FPKM) using RNA-seq by Expectation–Maximization software (RSEM). Unigenes with FPKM > 0 were considered expressed and were classified as common to, or unique to, the sugarcane genotypes. Differentially expressed unigenes (DEGs) between the two contrasting sugarcane genotypes were identified using the DESeq R package (1.10.1). The P values were adjusted using Benjamini and Hochberg's (BH) approach to control the false discovery rate (FDR). Genes with FDR < 0.01 and an absolute log2 ratio ≥ 2 were designated as differentially expressed genes.
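The normalization and thresholding steps described above can be illustrated as follows. This Python sketch is not RSEM or DESeq: it takes fragment counts, P values, and log2 ratios as given, and shows only the standard FPKM formula, the Benjamini and Hochberg adjustment, and the FDR < 0.01 and |log2 ratio| ≥ 2 filter. All function names and example numbers are hypothetical.

```python
def fpkm(fragments, length_bp, total_mapped_fragments):
    """Fragments per kilobase of transcript per million mapped fragments."""
    return fragments * 1e9 / (length_bp * total_mapped_fragments)


def benjamini_hochberg(pvalues):
    """Return BH-adjusted P values in the same order as the input."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def call_degs(genes, log2_ratios, pvalues, fdr_cutoff=0.01, lfc_cutoff=2.0):
    """Select genes with FDR below fdr_cutoff and |log2 ratio| >= lfc_cutoff."""
    fdr = benjamini_hochberg(pvalues)
    return [
        gene
        for gene, lfc, q in zip(genes, log2_ratios, fdr)
        if q < fdr_cutoff and abs(lfc) >= lfc_cutoff
    ]


# Hypothetical values for illustration only.
example_fpkm = fpkm(fragments=500, length_bp=2000, total_mapped_fragments=20_000_000)
example_degs = call_degs(
    genes=["a", "b", "c"],
    log2_ratios=[3.0, -2.5, 1.0],
    pvalues=[0.0001, 0.001, 0.0001],
)
```

Gene "c" passes the FDR filter but is excluded because its absolute log2 ratio is below 2, which shows that both criteria must hold for a unigene to be called a DEG.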