Identification of sesquiterpene synthase genes in the genome of Aquilaria sinensis and characterization of an α-humulene synthase

Sesquiterpenes are the major pharmacodynamic components of agarwood, a precious traditional Chinese medicine obtained from the resinous portions of Aquilaria sinensis trees that form in response to environmental stressors. To characterize the sesquiterpene synthases responsible for sesquiterpene production in A. sinensis, we performed a bioinformatics analysis of the A. sinensis genome, which identified six new terpene synthase genes; in a phylogenetic analysis, 16 sesquiterpene synthase genes were classified as type TPS-a. The expression patterns of eight of the sesquiterpene synthase genes after treatment with various hormones or hydrogen peroxide were analyzed by real-time quantitative PCR. The results suggest that 100 μM methyl jasmonate, ethephon, (±)-abscisic acid or hydrogen peroxide could be effective short-term effectors to increase the expression of sesquiterpene synthase genes, while 1 mM methyl salicylate may have long-term effects on increasing the expression of specific sesquiterpene synthase genes (e.g., As-SesTPS, AsVS, AsTPS12 and AsTPS29). The expression changes in these genes under various conditions reflected their specific roles during abiotic or biotic stresses. Heterologous expression of a novel A. sinensis sesquiterpene synthase gene, AsTPS2, in Escherichia coli yielded an enzyme whose major product was α-humulene, so AsTPS2 was renamed AsHS1. AsHS1 differs from ASS1, As-SesTPS, and AsVS in that it mainly produces α-humulene. Based on the predicted spatial conformation of the AsHS1 model, the small ligand molecule may bind to the free amino acid by hydrogen bonding for the catalytic function of the enzyme, while the substrate farnesyl diphosphate (FPP) probably binds to the free amino acid on one side of the RxR motif. Docking showed that Arg450, Asp453, Asp454, Thr457, and Glu461 from the NSE/DTE motif and Asp307 and Asp311 from the DDxxD motif form polar interactions with two Mg2+ clusters.
The Mg2+-bound DDxxD and NSE/DTE motifs and the free RxR motif are jointly directed into the catalytic pocket of AsHS1. Comparison of the tertiary structural models of AsHS1 and ASS1 showed structural differences at several positions, such as the region surrounding the second catalytic pocket, which may lead to differences in catalytic products. Based on the results, biosynthetic pathways for specific sesquiterpenes such as α-humulene in A. sinensis are proposed. This study provides novel insights into the functions of the sesquiterpene synthases of A. sinensis and enriches knowledge on agarwood formation.


Introduction
Agarwood is a high-grade raw material for the production of essential oils and natural perfumes and has also long been used as a natural digestive, sedative, and anti-emetic medicine. Agarwood is produced from a few species of Aquilaria and Gyrinops (family Thymelaeaceae) from Southeast Asia in response to an environmental stressor such as disease and physical wounding or damage to the trees from lightning strikes, grazing, and pest infestations (Rasool and Mohamed 2016), which causes the heartwood to become resinous (Kumeta and Ito 2016). In China, these species are mostly distributed in tropical and subtropical regions (Yin et al. 2016).
Aquilaria sinensis (Lour.) Spreng is the only certified source of agarwood listed in the China Pharmacopoeia 2020 (China Pharmacopoeia 2020). It is an endangered species and regulated under the Convention on International Trade in Endangered Species of Wild Fauna and Flora (CITES 2004). Because agarwood formation can take more than 10 years, in the last decade, various methods such as wounding, burning-chisel-drilling, chemical induction, and biological inoculation have been developed to induce agarwood production (Liu et al. 2013; Wu et al. 2017; Tan et al. 2019; Yan et al. 2019). However, the quality and quantity of these agarwoods are inconsistent and need improvement.
The main fragrant compounds in agarwood are sesquiterpenes and phenylethyl chromone derivatives, but the sesquiterpenes are more abundant and provide aromatic qualities suitable for perfumes (Ogita et al. 2015; Kristanti et al. 2018). Agarwood contains at least 210 sesquiterpenes with various skeleton types, including eudesmanes, eremophilanes, guaianes, agarospiranes, acoranes, cadinanes, prezizaanes, zizaanes and humulanes (Chen et al. 2012; Gao et al. 2019; Li et al. 2021b). Terpene synthases (TPSs) are the key enzymes for the biosynthesis of terpenes, including sesquiterpenes. The TPS family is a mid-size family that is generally divided into seven clades, with the angiosperm sesquiterpene synthases (STSs) mainly included in the TPS-a clade (Chen et al. 2011). To date, TPS genes have been identified in many economically significant plant species such as Magnolia grandiflora (Lee and Chappell 2008), Gossypium hirsutum, Camellia sinensis, and Chrysanthemum indicum (Zhou et al. 2021). Recently, Nong et al. (2020) revealed the A. sinensis TPS gene loci based on genomic sequencing. Li et al. (2021a) also identified the TPS gene family in A. sinensis based on the genome reported by Ding et al. (2020). In addition, Das et al. (2021) analyzed the genome of A. agallochum and classified the TPS genes. Some STSs from Aquilaria species have also been identified and characterized using transgenic expression and chemical analyses. Different classes of guaiene synthases are present in Aquilaria species; they produce δ-guaiene as the major product and α-guaiene as a minor product, together with other minor products such as germacrene A (e.g., GS-1 in A. microcarpa, Lee et al. 2014), α-humulene (e.g., AcC2-AcC4 in A. crassna, Kumeta and Ito 2010), β-elemene (e.g., ASS1-ASS3 in A. sinensis, Xu et al. 2013; GS-2 in A. microcarpa, Kurosaki et al. 2015) or both β-elemene and α-humulene (e.g., GS-3 and GS-4 from A. microcarpa, Kurosaki et al. 2015).
Other STSs in Aquilaria trees include α-humulene synthases AcHS1-3 from A. crassna (Kumeta and Ito 2016), a nerolidol synthase As-SesTPS from A. sinensis (Ye et al. 2018), a vetispiradiene synthase AsVS from A. sinensis (Ding et al. 2019), and a sesquiterpene synthase AsSS4 from A. sinensis, which yields β-elemene, α-guaiene, α-caryophyllene, δ-guaiene and cyclohexane, 1-ethenyl-1-methyl-2,4-bis(1-methylethenyl) (Liang et al. 2014). Thus, the STSs in Aquilaria trees are very complex and diverse. Here, based on a genome-wide analysis of AsTPS genes in A. sinensis, 16 sesquiterpene synthase genes belonging to type TPS-a were identified. The expression patterns of sesquiterpene synthase genes from A. sinensis were analyzed by real-time quantitative PCR. The sesquiterpene synthase gene AsHS1 was amplified from the transition zone between the white wood and brown agarwood of an A. sinensis tree, and heterologous expression of this gene in E. coli produced a humulene synthase that catalyzes the production of α-humulene from FPP. The three-dimensional structure of this enzyme was predicted and compared with those of other STSs. Based on the results, biosynthetic pathways for several sesquiterpenes in A. sinensis-cultured cells are postulated. The research provides new insights into the catalytic mechanism of sesquiterpene synthase in A. sinensis and will help to improve agarwood induction techniques.

Plant materials and hormone treatments
A. sinensis suspension cells and branches from a 5-year-old A. sinensis tree were used. Calli were generated from fresh, young stem tips of a shoot as described by Liu et al. (2015). Viable, loose calli were used to prepare suspension cells by shaking in liquid Murashige and Skoog (MS) medium containing α-naphthalene acetic acid (NAA) (2.0 mg L⁻¹), 6-benzylaminopurine (6-BA) (1.0 mg L⁻¹) and casein hydrolysate (0.5 g L⁻¹) at 120 rpm and 22 °C. After viable cell suspensions were produced, methyl salicylate (MeSA), methyl jasmonate (MeJA), ethephon (ETh), (±)-abscisic acid (ABA), and hydrogen peroxide (H2O2) were added to a final concentration of 100 μM. Control and short-term treatment samples were then cultured in darkness for 1 or 2 h. To analyze the effects of longer treatments of hormones and oxidative stress, MeJA, ETh, ABA, and H2O2 were added to a cell suspension of A. sinensis to a final concentration of 1 mM, and control and treatment samples were cultured for 24 or 48 h. The calli samples were stored and used for quantitative real-time PCR to quantify sesquiterpene synthase gene expression under the various hormone and H2O2 treatments. For isolating the sesquiterpene synthase gene, transition areas between the white wood and brown wood (agarwood) of branches were cut from a 5-year-old tree in Hainan, which was identified as Aquilaria sinensis (Lour.) Spreng by Prof. Zheng Zhang, and used as material for gene amplification.

Prediction and structural analysis of A. sinensis TPS genes
Potential A. sinensis terpene synthase genes were found by using HMMER3.1b to search the proteins predicted from the A. sinensis genome and annotation data released by Ding et al. (2020), with the PF01397 and PF03936 model data as queries (Bateman et al. 2004). Significant hits (E-values < 10⁻¹⁰) were identified as possible AsTPS genes and used for further analysis. The gene annotation files were analyzed to reveal the structural organization of these genes, and the results were visualized using TBtools. The sequences of ASS1-3, As-SesTPS, and AsVS genes could not be located in the A. sinensis genome provided by Ding et al. (2020), but were located in the draft map of the genome released by Nong et al. (2020). The genomic structures of the three genes were identified using the Splign online tool (Kapustin et al. 2008).
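The hit-filtering step described above can be sketched in Python. This is a minimal illustration rather than the authors' pipeline. It assumes that the hmmsearch results were saved with HMMER's `--tblout` option, that the cutoff is an E-value below 10⁻¹⁰, and that a protein is kept when either Pfam model passes the cutoff. The function name, protein names and score values in the example rows are hypothetical.

```python
def significant_hits(tblout_lines, max_evalue=1e-10):
    """Collect proteins whose full-sequence E-value is below the cutoff.

    In hmmsearch --tblout output, whitespace-separated column 1 is the
    target (protein) name, column 4 the query accession and column 5 the
    full-sequence E-value. Lines starting with '#' are comments.
    """
    hits = {}
    for line in tblout_lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split()
        target, query_acc, evalue = fields[0], fields[3], float(fields[4])
        if evalue < max_evalue:
            hits.setdefault(target, set()).add(query_acc)
    return hits


# Hypothetical rows in --tblout layout (values invented for illustration).
example_rows = [
    "# target name  accession  query name  accession  E-value  score  bias",
    "protA  -  Terpene_synth    PF01397.24  1.2e-45  150.3  0.1",
    "protA  -  Terpene_synth_C  PF03936.19  3.4e-60  198.7  0.0",
    "protB  -  Terpene_synth_C  PF03936.19  3.0e-05   20.1  0.0",
]

candidates = significant_hits(example_rows)
for protein, models in sorted(candidates.items()):
    print(protein, sorted(models))
```

Requiring hits to both domains instead of either one would be a stricter alternative; the text does not state which rule was applied.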

Sequence alignment and phylogenetic tree construction
MUSCLE was used to align the amino acid sequences of the plant TPSs. The alignments, with conserved regions and sites marked, were displayed in BioEdit. A phylogenetic tree was constructed by the maximum likelihood method with the Jones-Taylor-Thornton (JTT) model and 1000 bootstrap replicates in IQ-TREE (Minh et al. 2020a, b).

Sequence character analysis of AsTPS genes
To characterize the sequences, we obtained the theoretical pI and predicted molecular weight (MW) using the tool at the Expasy portal (http://web.expasy.org/compute_pi/). The locations of signal peptides of the AsTPSs were predicted by the online servers TargetP 2.0 and ChloroP 1.1 (Emanuelsson et al. 1999, 2007).

Quantitative reverse transcription PCR
Total RNAs were extracted from the precipitates of A. sinensis calli using EASYspin Plus Plant RNA Kits (Aidlab, China). RNA integrity was checked on a 1% agarose gel, and RNA was quantified using a NanoDrop2000c spectrophotometer (Thermo Scientific, USA). Reverse transcription was carried out using M-MLV Reverse Transcriptase following the manufacturer's instructions (Promega, USA). Real-time PCR (qPCR) was performed using TransStart Top Green qPCR Supermix (Transgene, China) and a Roche LightCycler96 Real-Time System. The relative expression of each candidate gene was calculated using the 2^(−ΔΔCT) method. Three independent technical replicates were used for each experiment, and the AsTUA gene was selected as the internal reference (Gao et al. 2012b). The qPCR primers for analyzing the expression of terpene synthase genes in A. sinensis calli were designed using the online primer design tool Primer3Plus (http://www.bioinformatics.nl/cgi-bin/primer3plus/primer3manager.cgi; Table S1).
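The 2^(−ΔΔCT) calculation named above can be written out as a short Python sketch. It assumes that the technical replicates are averaged before the subtraction and that amplification efficiency is 100% (a doubling per cycle), which is the standard form of the method. The function name and all Ct values are hypothetical and serve only to show the arithmetic.

```python
from statistics import mean


def relative_expression(target_treated, ref_treated, target_control, ref_control):
    """Fold change by the 2^(-ddCt) method.

    Each argument is a list of Ct values from technical replicates.
    dCt  = Ct(target gene) - Ct(reference gene), per condition
    ddCt = dCt(treated) - dCt(control)
    """
    d_ct_treated = mean(target_treated) - mean(ref_treated)
    d_ct_control = mean(target_control) - mean(ref_control)
    dd_ct = d_ct_treated - d_ct_control
    return 2 ** (-dd_ct)


# Invented Ct values: the target gene crosses threshold about 6 cycles
# earlier after treatment, while the reference gene is unchanged.
fold = relative_expression(
    target_treated=[22.1, 21.9, 22.0],
    ref_treated=[18.0, 18.1, 17.9],
    target_control=[28.0, 27.9, 28.1],
    ref_control=[18.0, 18.0, 18.0],
)
print(f"fold change: {fold:.1f}")
```

With the control condition as its own comparison the function returns 1, which corresponds to setting expression before treatment to 1.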

Heterologous expression of an AsSTS gene in Escherichia coli
A prokaryotic expression plasmid was constructed using the EasyGeno Assembly Cloning kit (Tiangen, China). The coding sequence of AsHS1 was amplified by reverse transcription PCR with the primers AsHS1-F: 5'-ATG TCA GCT GCT CAG GTC TCAC-3' and AsHS1-R: 5'-TCA TAT AGT AAT TGG ATG GAC CAG CAA TGA AG-3'. The pET-28a(+) plasmid (Novagen, Madison, WI, USA) was linearized using the restriction enzymes BamHI and HindIII and recombined with the coding sequence of AsHS1. The constructed plasmid was transformed into E. coli BL21(DE3)pLysS cells, and His-tag fused protein was produced after induction with 0.2 mM isopropyl β-D-1-thiogalactopyranoside (IPTG) for 4 h at 18 °C and 180 rpm. The fusion protein was purified using His-Tagged Protein Purification Kits (CoWin Biosciences, Beijing, China). The presence of the expected protein in the bacterial lysates and in the purified product was examined by SDS-PAGE.

Determination of enzyme activity
The reaction took place in a 2 mL vial with a solid-top polypropylene cap. The reaction buffer contained 25 mM Tris-HCl buffer (pH 7.0), 10% glycerol, 5 mM dithiothreitol (DTT), 100 mM MgSO4, and 46 μM farnesyl pyrophosphate (FPP). To initiate the reaction, 100 µL of crude proteins (0.2 mg mL⁻¹) was pipetted into 100 µL of reaction buffer. After incubation of the mixture at 30 °C for 1 h, SPME fibers were inserted into the headspace of the vial and pre-saturated for 30 min. Then, the volatiles were collected after 30 min and transferred to the injection port of the Varian 450 GC Gas Chromatograph-Varian 300 mass spectrometer (GC-MS) (Varian, Palo Alto, CA, USA) equipped with a VF-5 MS quartz capillary column (30 m × 0.25 mm i.d.; film thickness 0.25 μm). The carrier gas was helium, and the column flow rate was 1 mL min⁻¹. The inlet temperature was 250 °C in splitless mode. The ionization mode was EI with an ionization voltage of 70 eV. The ion source temperature was 250 °C, and the scanning mass range was 30-500 amu. The starting temperature was 50 °C and was increased by 10 °C min⁻¹ to 280 °C, then held for 10 min. Sesquiterpenes were identified by comparing their retention times and mass spectra with those of authentic standards.
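As a quick check on the oven program above, the total GC run time follows from the ramp and hold settings. The sketch below assumes a single linear ramp with no initial hold, as the text describes; the function name is hypothetical.

```python
def oven_program_minutes(start_c, final_c, ramp_c_per_min, hold_min):
    """Total run time in minutes of a single-ramp GC oven program."""
    if ramp_c_per_min <= 0:
        raise ValueError("ramp rate must be positive")
    ramp_min = (final_c - start_c) / ramp_c_per_min
    return ramp_min + hold_min


# Settings from the text: 50 to 280 degrees C at 10 degrees C per minute,
# followed by a 10 min hold at the final temperature.
run_time = oven_program_minutes(start_c=50, final_c=280, ramp_c_per_min=10, hold_min=10)
print(f"run time: {run_time} min")
```

For these settings the ramp takes 23 min, so each injection occupies 33 min of instrument time before cool-down.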

Molecular modeling
Three-dimensional protein structural models were predicted using the SWISS-MODEL server (https://swissmodel.expasy.org) (Waterhouse et al. 2018). The crystal structure of the 5-epi-aristolochene synthase M4 mutant from Nicotiana tabacum (SMTL ID: 5ikh.1) (Koo et al. 2016) was used as the template to model AsHS1, AcHS1 and ASS1. For AsHS1, the template gave 97% coverage, a Global Model Quality Estimate (GMQE) of 0.79 and 44.32% sequence identity, and it had even higher homology with AcHS1 and ASS1. The amino acid residues in the protein structure were classified according to their secondary structure using the standardized algorithm Define Secondary Structure of Proteins (DSSP). The possible presence of active catalytic pockets in the proteins was predicted by POCASA 1.1 (https://g6altair.sci.hokudai.ac.jp/g6/service/pocasa/). Docking of the small-molecule substrate ligand FPP into the protein pocket was predicted with AutoDock 4.2.6 (Rizvi et al. 2013). Structural figures were drawn using PyMol (http://www.pymol.org/).

Identification of sesquiterpene genes and structural analysis
A total of 46 gene models encoding TPSs of A. sinensis were obtained by searching the protein data set released by Ding et al. (2020) with HMMER3.1b, using the HMM files of the terpene synthase domains (PF01397 and PF03936) as queries (Bateman et al. 2004; Ding et al. 2020). According to previous reports, TPSs mostly comprise 350-860 amino acids (Bohlmann et al. 1998; Durairaj et al. 2019). We therefore chose the 22 AsTPS members that comprise 350-860 amino acids as candidates for further analysis. The structural organization of the 22 AsTPS genes was analyzed, of which 17 AsTPS genes (AsTPS1n-AsTPS17n) are shown in Fig. S1. The five sequences that were not located included AsVS (MH378283), As-SesTPS (KF135950) and ASS1-ASS3 (JQ712682/3/4). ASS1-ASS3 had highly similar sequences and were putatively located on chromosome 5 (26,125,981-26,122,559) but interrupted by a GT-rich sequence (Fig. S2). The difficulties in locating the coding regions of these genes may be due to imperfections in the genome sequence. The five AsTPS genes were used in a Blastn search against another A. sinensis genome released by Nong et al. and were finally located in the corresponding scaffolds (Figs. S3-S5) (Nong et al. 2020). Alignment of ASS1-ASS3 with the genomic sequence suggests that the ASS1-ASS3 genes originated from the same gene locus but differ in some nucleotides. The nucleotide differences could be caused by sequencing errors or post-transcriptional editing. Li et al. (2021a) identified 26 AsTPS genes in the A. sinensis genome; taking their AsTPS gene family members into account, we obtained a total of 32 putative AsTPS genes, of which six were new. The six new AsTPS genes discovered in the present study were named AsTPS27-32 and submitted to GenBank with the accession numbers AsTPS27 (OM966902), AsTPS28 (OM933589), AsTPS29 (OM933590), AsTPS30 (OM933591), AsTPS31 (OM933592) and AsTPS32 (OM933593; Table S2).
The names AsTPS1, AsTPS15 and AsTPS16 were designated for ASS1-ASS3, As-SesTPS and AsVS, respectively.
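The length criterion used to select candidates (350-860 amino acids) is simple to reproduce. The Python sketch below reads protein records in FASTA format and keeps those inside the range. It is an illustration under stated assumptions: the function names and the toy sequences are hypothetical, the bounds are treated as inclusive, and a trailing `*` stop symbol is not counted.

```python
def read_fasta(text):
    """Parse FASTA-formatted text into a {name: sequence} dictionary."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]  # keep the ID, drop the description
            records[name] = ""
        elif name is not None:
            records[name] += line
    return records


def filter_by_length(records, min_len=350, max_len=860):
    """Keep proteins whose length lies within [min_len, max_len]."""
    kept = {}
    for name, seq in records.items():
        length = len(seq.rstrip("*"))  # ignore a trailing stop symbol
        if min_len <= length <= max_len:
            kept[name] = seq
    return kept


# Toy records built from repeated residues, for illustration only.
toy_fasta = "\n".join([
    ">short_fragment hypothetical", "A" * 349,
    ">lower_bound hypothetical", "A" * 350,
    ">upper_bound hypothetical", "A" * 860 + "*",
    ">too_long hypothetical", "A" * 861,
])

selected = filter_by_length(read_fasta(toy_fasta))
print(sorted(selected))
```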

Phylogenetic analyses of TPSs from A. sinensis and other species
Based on previous reports, plant TPSs can be divided into seven groups (Bohlmann et al. 1998; Chen et al. 2011). The TPS-a and -b groups comprise angiosperm sesquiterpene and monoterpene synthases, and the TPS-c group comprises mostly diterpene synthases involved in primary metabolism, such as copalyl diphosphate synthase. The TPS-d group comprises gymnosperm TPSs. The TPS-g group comprises mostly acyclic TPSs, such as those in grapevine (Martin et al. 2010; Chen et al. 2011). The TPS-e group comprises kaurene synthase and is always combined with TPS-f, which is probably derived from TPS-e and includes some monoterpene, diterpene and sesquiterpene synthases, including linalool synthase in Clarkia breweri and Phalaenopsis bellina (Dudareva et al. 1996; Huang et al. 2021), geranyllinalool synthase in A. thaliana and Solanum lycopersicum (Herde et al. 2008; Falara et al. 2014), and farnesene synthase in kiwifruit (Nieuwenhuizen et al. 2009). In the phylogenetic analysis of the 32 TPSs in A. sinensis, 16 TPSs clustered in group TPS-a, 4 each in TPS-b and TPS-c, 1 in TPS-e, 5 in TPS-f, and 2 in TPS-g (Fig. 1). In the phylogenetic tree, the 16 AsTPS proteins clustered with the Arabidopsis STSs, constituting the largest group, TPS-a. These 16 AsTPS proteins are regarded as putative AsSTSs. Recently, Li et al. (2021a) constructed a phylogenetic tree of A. sinensis TPSs and also classified 16 proteins in the TPS-a group of AsSTSs. In comparison, our tree placed AsTPS9 in group TPS-c rather than in TPS-a. This difference may be due to the tree construction methods, because Li et al. used the NJ method and we used the ML method. Additionally, the newly discovered AsTPS29 was placed in group TPS-a and is regarded as a new sesquiterpene synthase. Based on previous reports, AsSTSs may also be present in other groups, such as TPS-f and TPS-g.
Fig. 1 Phylogenetic relationships of TPSs from Aquilaria sinensis and Arabidopsis thaliana. Fuchsia: AsTPS1-AsTPS32, black: A. thaliana, blue triangles: the six newly discovered TPSs (AsTPS27-AsTPS32). SemBDPS BAL41682.1, a miltiradiene synthase from Selaginella moellendorffii (GenBank BAL41682.1), and PpKS BAF61136.1, an ent-kaurene synthase from Physcomitrium patens (GenBank BAF61136.1) were used as outgroups
According to the sequence alignment, all of the AsTPSs in the TPS-a group except AsTPS7 (15 proteins) had several motifs that are conserved in plant STSs, such as RRx8W, RxR, DDxxD and NSE/DTE (Fig. 2). AsTPS7 is likely a truncated sesquiterpene synthase because it lacks the N-terminal RRx8W motif and the C-terminal NSE/DTE motif. Several conserved motifs are important for the catalytic activity of STSs (Gao et al. 2012a). The diphosphate moiety of FPP has been proposed to be captured by the RxR motif and by divalent metal ions such as Mg2+ or Mn2+, which are in turn bound by the DDxxD and NSE/DTE (RxxDDxx(S,T,G)xxxE) motifs at the entrance of the active site (Starks et al. 1997). Arginine-rich RRx8W motifs were identified in most of the TPS-a and TPS-b proteins but were absent from the other TPS groups in A. sinensis (Fig. S6). This result is consistent with a previous study on Glycine max, which also identified RRx8W motifs in groups TPS-a and TPS-b. However, the function(s) of the RRx8W motif remain unclear.
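The conserved motifs named above can be located in a protein sequence with regular expressions. The following Python sketch is only an illustration of how the consensus patterns translate into a search: the patterns are taken literally from the text (with x as any residue and the NSE/DTE motif written as RxxDDxx(S,T,G)xxxE), the function name is hypothetical, and the toy sequence is artificial rather than a real AsTPS protein.

```python
import re

# Consensus motifs from the text, written as regular expressions.
MOTIFS = {
    "RRx8W": r"RR.{8}W",
    "RxR": r"R.R",
    "DDxxD": r"DD..D",
    "NSE/DTE": r"R..DD..[STG]...E",
}


def find_motifs(sequence):
    """Return 1-based start positions of each motif in the sequence.

    A lookahead is used so that overlapping occurrences are all reported.
    """
    positions = {}
    for name, pattern in MOTIFS.items():
        lookahead = re.compile(f"(?=({pattern}))")
        positions[name] = [m.start() + 1 for m in lookahead.finditer(sequence)]
    return positions


# Artificial sequence assembled so that each motif occurs exactly once.
toy_sequence = (
    "M" + "RR" + "A" * 8 + "W" + "GG" + "RAR" + "GG"
    + "DDAAD" + "GG" + "RAADDAATAAAE"
)

for motif, starts in find_motifs(toy_sequence).items():
    print(motif, starts)
```

A short pattern such as RxR matches by chance in many proteins, so in practice the hits need to be checked against the aligned positions rather than taken at face value.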

Real-time PCR analysis of hormone-induced expression of sesquiterpene genes in A. sinensis
In the qPCR analysis, the expression of eight sesquiterpene synthase genes was measured in A. sinensis suspension cells after short (1 or 2 h) or long (24 or 48 h) treatments with hormones or H2O2. Treatment with 100 μM MeJA (for 1 h and 2 h) upregulated most of the sesquiterpene synthase genes; the expression of AsTPS2, AsTPS3 and AsTPS10 increased more than 50-fold after the 1-h treatment (Fig. 3). In a previous RNA-seq analysis, the expression of eight AsTPS genes encoding STSs (AsTPS1, 2, 3, 5, 11, 13, 15, 23) was induced after stems were wounded (Li et al. 2021a). Based on these results, we suggest that AsTPS2 and AsTPS3 are major sesquiterpene synthase genes involved in the JA-mediated and wound-induced signaling pathways. After 24 h and 48 h treatment with a higher concentration of MeJA (1 mM), the AsTPS genes showed various expression trends. Compared with the shorter treatments with 100 μM hormone or H2O2, longer treatment with a higher concentration (1 mM) of hormone or H2O2 did not have a stronger effect on the expression levels of some AsTPS genes. For example, the expression of AsTPS10 increased more than 50-fold after the 1-h treatment with 100 μM MeJA, but less than fivefold after the 24-h treatment with 1 mM MeJA. After the 2-h treatment with 100 μM H2O2, the expression of AsTPS2 and AsTPS29 increased about 50-fold and that of AsTPS10 increased about 150-fold, but after the 24- and 48-h treatments with 1 mM H2O2, the expression of AsTPS2, AsTPS29 and AsTPS10 did not increase as much. After the short ETh treatments, expression of AsTPS2, -3, and -29 was more than 100-fold greater, whereas the increase was smaller after the longer treatments. After the short ABA treatments, the expression of AsTPS1 (ASS1-3) and AsTPS2 increased nearly 100-fold, but it did not increase very much after the long treatments.
After the longer treatments with 1 mM MeJA, ABA, ETh or H2O2, the effect of the signalling molecule on the expression of some sesquiterpene genes was likely unchanged or weakened. However, after the longer treatment with 1 mM MeSA, the expression of As-SesTPS, AsVS, AsTPS3 and AsTPS12 was significantly higher than after the shorter treatments. The results suggest that 1 mM MeSA may have a longer-lasting effect on inducing the expression of some sesquiterpene genes in A. sinensis.

Determination of the activity of sesquiterpene synthase AsHS1
Since we failed to clone novel sesquiterpene synthase genes from the A. sinensis cell suspension, the transition area between white wood and brown agarwood was cut from the branches of a 5-year-old A. sinensis tree from Hainan and used as another material for gene amplification. A 1,671-bp cDNA sequence encoding AsTPS2 was isolated. The coding sequence of AsTPS2 was cloned into the vector pET-28a(+) and transferred to E. coli BL21(DE3)pLysS cells. The His-tag recombinant protein was purified and incubated with FPP in assay buffer according to the method of Kumeta and Ito (2010). Products of the reaction were then analyzed by GC-MS. The recombinant enzyme generated one major sesquiterpene product, identified as α-humulene by comparison of its retention time and mass spectrum with those of an authentic standard (Fig. 4). Since AsTPS2 produced mainly α-humulene, it was also named AsHS1.

Sequence comparison and structural analysis of sesquiterpene synthases
Generally, terpenoid cyclases fall into two main classes depending on the chemical strategy involved in the initial carbocation formation (Wendt and Schulz 1998). The sesquiterpene synthases belong to the Class I-type terpenoid cyclases and utilize a trinuclear metal cluster to trigger the ionization of an isoprenoid diphosphate substrate to yield an allylic cation and inorganic pyrophosphate. To investigate the structural properties of AsHS1 that affect carbocation formation and product cyclization, homology modelling was performed with the SWISS-MODEL webserver (Waterhouse et al. 2018). Using the structure of the tobacco 5-epi-aristolochene synthase M4 mutant as the template, the predicted structure model of AsHS1 had a high level of confidence. Through calculation and classification, the secondary structure of AsHS1 was found to contain 29 α-helices and two β-folds. The 22 free amino acids at the N-terminus were not shown due to their unpredictable conformation within the template. Based on the predicted spatial conformation of the AsHS1 model, the small ligand molecule may bind to the free amino acid by hydrogen bonding to complete the function of the enzyme, while the substrate farnesyl diphosphate (FPP) probably binds to the free amino acid at one side of the RxR motif (Fig. 5a, b). Generally, the conserved catalytic mechanism of sesquiterpene synthases is to stabilize the binding of FPP through the formation of an enzyme-Mg2+-ligand ternary complex (Singh et al. 2021). Accordingly, we identified the amino acid residues within 3.5 Å of the two Mg2+ clusters that may form polar interactions with them: R450, D453, D454, T457 and E461 from the NSE/DTE motif and D307 and D311 from the DDxxD motif. These amino acids are highly conserved in terpenoid synthases.
Fig. 2 Sequence alignment for the TPS-a group of proteins in Aquilaria sinensis. Some sequences without conserved motifs are omitted
The Mg2+-bound DDxxD and NSE/DTE motifs and the free RxR motif are jointly directed into the catalytic pocket structure (Fig. 5b). To further identify the key positions that determine the activity of AsHS1, the structure of AsHS1 was compared with those of the α-humulene synthase AcHS1 from A. crassna and the δ-guaiene synthase ASS1 from A. sinensis (Xu et al. 2013; Kumeta and Ito 2016). As shown in Fig. 5c, the model of AcHS1 was highly similar to that of AsHS1; their structures in the catalytic cleft were almost identical, which may explain their similar catalytic products. As shown in Fig. 5d, the structures of ASS1 and AsHS1 differ in some respects. Their RRx8W motifs were all localized in the free amino acids before the α1-helix, and these free amino acids are very close to the structure of the Mg2+-binding catalytic pocket. Therefore, we speculate that differences in the N-terminal free amino acids may be the cause of the specific catalytic products.
Fig. 3 Real-time PCR analysis of fold-changes in expression of eight genes encoding Aquilaria sinensis TPS-a group sesquiterpene synthases after treatment with various hormones or H2O2 at 100 μM for 1 and 2 h (Fig. 3a) and 1 mM for 24 and 48 h (Fig. 3b). Expression for each gene before the treatments was arbitrarily set to 1. MeJA, methyl jasmonate; MeSA, methyl salicylate; Eth, ethylene
Fig. 4 GC-MS spectra of sesquiterpene products generated by AsTPS2/AsHS1 from Aquilaria sinensis. a α-Humulene produced by incubation of farnesyl diphosphate (FPP) with AsHS1; b authentic α-humulene; c α-humulene produced by incubation of FPP with H2O; d total ion chromatogram of the product of AsHS1 enzyme with FPP as a substrate with the major product identified as α-humulene; e total ion chromatogram of α-humulene standard. M+ represents the molecular ion of α-humulene
Additionally, AsHS1 and ASS1 are not completely the same at the first α-helix; ASS1 has some free amino acids in the first α-helix, but the first α-helix of AsHS1 is not disrupted. Between the 18th and 19th α-helices, D357 was predicted as a free amino acid in AsHS1, while ASS1 does not have a free amino acid in this position (Fig. 5d). Moreover, two catalytic pockets in the tertiary structures of AsHS1 and ASS1 were predicted by the online tool POCASA 1.1. The first, larger catalytic pocket, where the predicted FPP binding sites are located, is highly conserved, but the second catalytic pocket is less conserved. Some amino acid sequences surrounding this second catalytic pocket of ASS1 and AsHS1 showed different secondary structures, which may also contribute to the product specificity of the sesquiterpene synthase (Fig. S7).
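The 3.5 Å criterion used to list residues around the Mg2+ ions amounts to a distance cutoff between atom coordinates. The Python sketch below shows the idea on invented coordinates. It is not the authors' procedure, which relied on the docked model and molecular graphics software. The function name, the residue labels and every coordinate are hypothetical, and the cutoff is assumed to be inclusive.

```python
import math


def residues_near_ions(atoms, ions, cutoff=3.5):
    """Return residue labels with at least one atom within `cutoff`
    angstroms of any ion.

    atoms: list of (residue_label, (x, y, z)) tuples
    ions:  list of (x, y, z) tuples
    """
    near = set()
    for residue_label, coord in atoms:
        if any(math.dist(coord, ion) <= cutoff for ion in ions):
            near.add(residue_label)
    return sorted(near)


# Invented coordinates in angstroms, for illustration only.
toy_ions = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0)]
toy_atoms = [
    ("ASP-1", (2.0, 0.0, 0.0)),   # 2.0 from the first ion
    ("ASP-2", (5.0, 3.0, 0.0)),   # 3.0 from the second ion
    ("GLU-3", (0.0, 3.6, 0.0)),   # 3.6 from the first ion, outside the cutoff
    ("ASP-1", (9.0, 9.0, 9.0)),   # a distant atom of a residue already counted
]

print(residues_near_ions(toy_atoms, toy_ions))
```

A distance cutoff only flags candidate contacts; whether a contact is a genuine polar interaction also depends on which atoms are involved and on their geometry.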

Discussion
The genome-wide identification of TPS families has been reported for various plant species such as pineapple (Chen (Alquezar et al. 2017), tomato (Zhou and Pichersky 2020), and Cannabis sativa (Booth et al. 2020). In the first comprehensive analysis of the TPS family in A. sinensis, we discovered six new AsTPS genes. We classified the putative AsTPS proteins based on their phylogenetic relationships and revealed 16 AsSTS genes that encode AsTPS proteins belonging to group TPS-a. We also analyzed the expression of eight AsSTS genes and identified an α-humulene synthase using an in vitro catalytic experiment. Our results provide new experimental evidence for the diversity of terpene synthases in A. sinensis. Zhang et al. (2010) proposed that agarwood is induced by a defence reaction in A. sinensis for which H2O2 and jasmonate (JA) are signal transducers (Xu et al. 2016). H2O2 can induce vessel occlusions and stimulate sesquiterpene accumulation in the pruned stems of A. sinensis. H2O2 can also promote programmed cell death and salicylic acid (SA) accumulation during the induction of sesquiterpene production in cultured cell suspensions of A. sinensis (Liu et al. 2015; Siah et al. 2016). JA is a crucial signal transducer for the formation of heat-shock-induced sesquiterpenes in A. sinensis, and exogenous MeJA affects the production of agarwood sesquiterpenes (Sun et al. 2020). Fungal endophytes can also induce the production of agarwood in Aquilaria species, with a quality similar to that of naturally produced agarwood (Chen et al. 2017a; Chhipa et al. 2017; Monggoot et al. 2017; Sen et al. 2017; Subasinghe et al. 2019). Based on these studies, we propose that other hormones could act as secondary signalling transducers for the defence reaction in A. sinensis and initiate the gradual formation of agarwood.
In this study, H2O2, MeJA, MeSA, ABA and ETh served as the respective signalling transducers for reactive oxygen species and for stress from wounding, biotrophs, drought and senescence. In the qPCR analysis, the expression of the eight selected AsSTS genes differed between the short and longer treatments. These differences in expression may lead to the varied sesquiterpene profiles in A. sinensis under abiotic or biotic stress and may also reflect redundant or specific roles of the genes. The greater expression of some AsSTS genes, such as As-SesTPS, AsVS and AsTPS12, after 24 or 48 h of treatment with 1 mM MeSA compared with levels after 1 or 2 h with 100 µM MeSA implies that 1 mM MeSA may have a long-term role in inducing the expression of some sesquiterpene synthases and thus change the profiles of sesquiterpenes, which may lead to differences in the perfumes of agarwood. Further studies on the signalling pathways of MeSA-induced sesquiterpene biosynthesis are needed to reveal the fungus-induced mechanism of agarwood formation.
The sesquiterpenes α-humulene, δ-guaiene and α-guaiene were previously detected in A. sinensis cell suspensions using GC-MS analyses (Lv et al. 2019), and the δ-guaiene synthase AsTPS1 (ASS1-3) has been identified by Xu et al. (2013). In the present study, an α-humulene synthase, AsHS1/AsTPS2, has also been identified as part of the terpene biosynthesis pathways in A. sinensis. Based on the results, we propose putative biosynthesis pathways for δ-guaiene and α-humulene in A. sinensis, which are similar to those in A. crassna proposed by Kumeta and Ito (2010, 2016) (Fig. 6). Since the AsHS1 gene is expressed at the transition area between white wood and brown agarwood, this gene may be related to the mechanism of agarwood formation. AsHS1 (AsTPS2 in Fig. 3) had greater expression after the 1-h treatment with 100 µM ETh compared with the 2-h exposure and after the 24-h treatment with 1 mM MeSA compared with 48 h. Because ETh is involved in senescence and MeSA is related to fungal stress, AsHS1 may be involved in agarwood formation induced by senescence and endophytic fungi (Halim et al. 2008; Li et al. 2013).
The structural analysis of AsHS1 revealed its putative catalytic residues, and structural comparisons of AsHS1 with AcHS1 and ASS1 showed specific amino acid positions in AsHS1 that probably contribute to its enzymatic activity. Humulene is widely distributed in the plant kingdom and has anti-inflammatory and anticancer activities (Fernandes et al. 2007; Rahman et al. 2014; Chia et al. 2016; Yan et al. 2017). It has been used widely in aromatherapy and has considerable potential for medical applications. To date, α-humulene synthases have been identified in several plant species, such as A. crassna (Kumeta and Ito 2016), Zingiber zerumbet (Yu et al. 2008), Humulus lupulus (Wang et al. 2008), Santalum austrocaledonicum (Jones et al. 2011), Picea glauca (Keeling et al. 2011) and Solanum habrochaites (Bleeker et al. 2011). Here, the new humulene synthase AsHS1 produced humulene as the major product when expressed in E. coli. The identification of AsHS1 and its catalytic sites should help in producing high-purity humulene via bioengineering.

Conclusions
Through a bioinformatics analysis, we identified 32 AsTPS genes and 16 AsSTS genes in A. sinensis, together with the conserved motifs in the 16 putative AsSTSs in group TPS-a. Compared with the shorter treatments with 100 μM hormone or H2O2, the longer treatments with 1 mM hormone or H2O2 did not always have a stronger effect on increasing the expression levels of specific AsTPS genes, such as AsTPS10. A novel sesquiterpene synthase gene, AsHS1, was isolated and characterized as encoding an α-humulene synthase. Differences between the predicted tertiary structure of AsHS1 and those of AcHS1 and ASS1 may contribute to the specific products generated by AcHS1 and ASS1. Our results provide novel insight into sesquiterpene biosynthesis and regulatory pathways in A. sinensis, and the putative biosynthetic pathways that we propose for some specific sesquiterpenes, including α-humulene, are a foundation for work to synthesize high-purity sesquiterpenes. The discovery of structural differences between AsHS1 and ASS1 will aid further study of the variations and specificities of STSs in A. sinensis.
Author contributions Jiadong Ran and Yuan Li have contributed equally to this work and share co-first authorship. Zheng Zhang and Yuan Li conceived and designed the study. Jiadong Ran, Xin Wen, Xin Geng, Xupeng Si performed the experiments. Yimian Ma and Jiadong Ran wrote the paper. Yuan Li, Zheng Zhang and Liping Zhang reviewed and edited the manuscript.
Funding The work was supported by the National Natural Science Foundation of China (81773844). The online version is available at http://www.springerlink.com Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will