The Biogenetic Origin of the Biologically Active Naematolin of Hypholoma Species Involves an Unusual Sesquiterpene Synthase

Naematolin is a biologically active sesquiterpene produced by Hypholoma species. Low titres and a complex structure constrain the exploitation of this secondary metabolite. Here, we sequenced the H. fasciculare genome de novo to identify a candidate biosynthetic gene cluster for production of naematolin. Using Aspergillus oryzae as a heterologous host for gene expression, the activity of several sesquiterpene synthases was investigated, highlighting one atypical sesquiterpene synthase apparently capable of catalysing the 1,11 and subsequent 2,10 ring closures, which prime the synthesis of the distinctive structure of caryophyllene derivatives. Co-expression of the cyclase with an FAD oxidase adjacent within the gene cluster generated four oxidised caryophyllene-based sesquiterpenes: 5β,6α,8β-trihydroxycariolan and 5β,8β-dihydroxycariolan, along with two previously unknown caryophyllene derivatives, 2 and 3. This represents the first step towards heterologous production of such basidiomycete-derived caryophyllene-based sesquiterpenes, opening an avenue for potential novel antimicrobials via combinatorial biosynthesis. Electronic supplementary material: The online version of this article (10.1007/s12033-019-00199-x) contains supplementary material, which is available to authorised users.


Introduction
With the evolving resistance to almost all existing antibiotics, there is an urgent need for new classes of antibiotics that are naturally produced [1].
Recently, new classes of antibiotics have emerged based on terpene structures isolated from basidiomycetes [2], but to date only a small number have been analysed in detail. There is therefore an urgent need to identify terpene pathways that yield antimicrobial compounds in other basidiomycetes, both to add to the pool of potentially useful compounds and to provide backbones for further chemical modification, leading to potential applications as clinical antibiotics.
Members of the basidiomycete genus Hypholoma produce structurally diverse terpenes, including sesquiterpenes [2]. Naematolin is a bicyclic sesquiterpene thought to be derived from the caryophyllene scaffold of the 1,11 and 2,10 carbon cyclisation of farnesyl pyrophosphate. This metabolite was first reported in H. fasciculare by Ito and co-workers [3] where its chemical structure was established based on spectroscopic and limited NMR data [4,5]. Early biological investigations uncovered the antitumor and the antiviral properties of naematolin [6], but its production was in low titre and its complex structure precluded chemical synthesis, so the compound has remained comparatively unexploited.
Mining of this potential drug by genetic approaches has been limited in the past by the absence of efficient manipulation tools. However, the paradigm shift in genome sequencing projects provides a unique opportunity to re-discover naematolin isomers using bioinformatic tools. Sequence alignment is a useful tool for predicting the function of core enzymes based on similarities in their conserved motifs. However, using such a tool to link the mature natural product to such genes is still challenging, as many synthase genes have been assigned to several precursors of structurally diverse chemicals [2]. This limitation triggered the need for genetic manipulation techniques to experimentally link predicted biosynthetic gene clusters to their chemical products. Two genetic engineering strategies, gene disruption and heterologous expression, have mainly been used to investigate biosynthetic gene clusters in fungi [7]; of these, heterologous expression has proved to be the best option for gene function characterisation in basidiomycetes. When the entire pathway of a natural product of interest is considered, ascomycete hosts are more often selected, with Aspergillus oryzae being the most commonly used species [7]. Construction of a flexible genetic engineering platform enables optimisation of the yield and chemical structure of the compound of interest, and the potential discovery of novel bioactive molecules. We have recently had success in the analysis of other basidiomycete-derived terpenes, such as the diterpene pleuromutilin from Clitopilus passeckerianus [8,9], where we not only linked the genetic pathway to the chemical synthesis but also used expression in a heterologous host to allow pathway manipulation and analysis.
Pleuromutilin derivatives are now reaching the clinical market as a new class of antibiotics [8,9].
Using similar techniques, we now report the isolation of a candidate gene cluster for naematolin, including heterologous expression of the first two biosynthetic genes (caryophyllene-like sesquiterpene cyclase and FADox tailoring gene), paving the way for further combinatorial biosynthesis of naematolin-based isomers, potentially leading to the generation of new antibiotic classes.

Genome Mining and Computational Analysis
For sequence library preparation and identification of secondary metabolite enzymes, H. fasciculare gDNA was isolated on a large scale using a previously described method [10]. The DNA was quantified and its quality checked using a Nanodrop N1000, and 500 ng was used to prepare a library of size 702 bp using the Illumina TruSeq Nano DNA kit. The resulting data were analysed using HiSeq Control Software 2.258 and assembled using the de novo short-read metagenomic assembler IDBA-UD.
The H. fasciculare genome was further mined and manually inspected using antiSMASH and BLAST searches with previously characterised enzymes, including Coprinopsis cinerea sesquiterpene synthases with different carbon cyclisation patterns [11,12]. The Artemis Comparison Tool (ACT) was also used to compare selected genomic regions of H. fasciculare with those of its related species H. sublateritium. Gene annotation and phylogenetic reconstructions were performed using published methods [13,14].

Sample Preparation for GC-MS Analysis
Spores of A. oryzae were inoculated into 100 mL CMP (Czapek-Dox broth 35 g/L, maltose 20 g/L, peptone 10 g/L) in a 250-mL flask and incubated for 7 days with shaking. When grown, 30 mL of hexane was added to each flask, and the culture was homogenised, mixed for 20 min at room temperature and then filtered. The organic phase of this mixture was collected and dried over anhydrous MgSO4 to give the crude extract, of which 1 mL was then analysed by GC-MS (see Supplementary information).

HPLC-MS Analysis
For analysis by HPLC, cultures were grown as above but extracted into ethyl acetate. After filtration and removal of residual water, the solvent was evaporated and the crude extract resuspended in acetonitrile at 50 mg/mL. Crude purification of naematolin was performed using column chromatography with silica gel (Sephadex LH-20, MCI gel CHP20P), eluting with methanol.
The partially purified fractions from flash columns were further purified by preparative reverse-phase HPLC-MS, collecting novel compounds on the basis of UV/ELSD and Rt using a Waters mass-directed collector connected to a Waters 2767 automated sample injector, equipped with a Waters 2545 pump, a Phenomenex LUNA C18 column (2.6 μm, 100 Å, 4.6 × 100 mm) and a Phenomenex Security Guard precolumn (Luna C5, 300 Å).

NMR and HRMS Analysis
All purified metabolites were characterised using an Agilent VNMRS500 (500 MHz) NMR spectrometer. Each sample was dissolved at 1 mg/mL in either methanol-d4 (CD3OD) or chloroform-d (CDCl3). Chemical shifts were recorded in parts per million (ppm) and coupling constants (J) in Hz. All chemical shifts are reported relative to the solvent: for 1H NMR, CHCl3 = 7.24 (singlet) or CH3OH = 4.78 (singlet); for 13C NMR, the 13C resonance of chloroform = 77.00 (triplet) or methanol = 49.00 (quintet). Compound ionisation patterns were analysed using a Bruker Daltonics microTOF focus with either positive or negative ESI.

Heterologous Expression of H. fasciculare genes in A. oryzae
For RNA extraction from H. fasciculare, the TRIzol method [15] was modified (see RNA extraction, Supplementary information). Full-length cDNA for each desired gene was obtained by RT-PCR and recombined into A. oryzae expression vectors by yeast-based recombination methods [16]. Appropriately constructed plasmids were transformed into A. oryzae [17]. Six independent PCR-positive transformants were analysed chemically for each combination of plasmids.

Naematolin Re-characterisation
Hypholoma fasciculare and H. sublateritium FD-344 SS-4 were both cultured in 100 mL YMG liquid medium [18] and evaluated for naematolin production by LC-MS. A major product with an Rt of 12.25 min corresponded to naematolin. Purification and HRMS indicated a compound with m/z 331.1522 (consistent with C17H24NaO5). IR and NMR data (Figs. S3 to S9 and Table S1, Supplementary information) were also in complete agreement with the literature [3][4][5], confirming naematolin production by both species. The antibacterial properties of naematolin were confirmed by disc diffusion assays, showing some limited activity against Bacillus subtilis and Staphylococcus aureus, but no activity against Pseudomonas aeruginosa or Klebsiella pneumoniae (Table S2).
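The agreement between the observed m/z and the proposed sodiated formula can be checked with a short calculation. The following is a minimal sketch, not part of the original analysis: the isotope masses are standard monoisotopic values, the function name is hypothetical, and a singly charged [M+Na]+ ion is assumed.

```python
# Standard monoisotopic masses (u) of the most abundant isotopes.
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "O": 15.99491462, "Na": 22.98976928}
ELECTRON = 0.00054858  # electron mass (u), removed once per positive charge


def cation_mz(formula, charge=1):
    """Return the m/z of a cation given its elemental composition (hypothetical helper)."""
    mass = sum(MONOISOTOPIC[element] * count for element, count in formula.items())
    return (mass - charge * ELECTRON) / charge


# Composition reported in the text for the sodiated ion (assumed [M+Na]+).
adduct = {"C": 17, "H": 24, "Na": 1, "O": 5}
theoretical = cation_mz(adduct)
observed = 331.1522  # value reported in the text
ppm_error = (observed - theoretical) / theoretical * 1e6

print(f"theoretical m/z = {theoretical:.4f}, error = {ppm_error:.1f} ppm")
```

Under these assumptions the difference between the observed and calculated values is a few parts per million, which is in line with the stated consistency.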

Genome Sequencing and Analysis
The genome of H. fasciculare was sequenced and assembled using the de novo short-read metagenomic assembler IDBA-UD. This assembly afforded a draft genome of 58.84 Mbp with a contig N50 of 49,633 bp (see Table S4, Supplementary information). To assess whether there was synteny between H. fasciculare and the publicly available genome of H. sublateritium, the genomic loci for two housekeeping genes (gpd and β-tubulin) were identified, and open reading frames within ± 20 kb of each locus were aligned using Artemis. Comparisons of all genes within the selected genomic regions showed high sequence similarity (> 80%), and the genes were in the same orientation, highlighting considerable synteny between these species (Figs. S10 and S11, Supplementary information).
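For readers unfamiliar with the contig N50 statistic quoted above, the following minimal sketch shows how it is conventionally computed: the length of the contig at which the cumulative sum of lengths, taken from longest to shortest, first reaches half of the assembly size. The function name and the toy contig lengths are illustrative assumptions, not data from this study.

```python
def n50(contig_lengths):
    """Return the N50 of a collection of contig lengths (in bp)."""
    lengths = sorted(contig_lengths, reverse=True)
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        # The first contig that pushes the cumulative length to half
        # of the assembly size defines the N50.
        if running >= half_total:
            return length
    raise ValueError("no contigs supplied")


# Toy example (hypothetical lengths): total = 300, half = 150.
toy_contigs = [100, 80, 60, 40, 20]
print(n50(toy_contigs))
```

In practice the lengths would be read from the assembler's FASTA output rather than typed in by hand.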
Potential secondary metabolite gene clusters were identified in both fungi by a combination of antiSMASH [19] and BLAST searches for specific classes of enzyme. Together, these identified seventeen putative sesquiterpene synthases (SQS) in H. fasciculare. A maximum likelihood phylogenetic comparison with recently characterised sesquiterpene synthases from two basidiomycetes (Omphalotus olearius and Coprinopsis cinerea) [20,21] placed most H. fasciculare SQS in four different clades, indicating their likely activity in terms of the mode of terpene skeleton cyclisation (Fig. 1). Eight of these proteins were predicted to be 1,11 carbon cyclisation enzymes, of which two (Hfas-94a and Hfas-94b) were 87% identical to Omp-6 and Omp-7, previously characterised as protoilludene synthases [21]. Of the remaining SQS enzymes, Hfas-147 clustered with enzymes catalysing the 1,10 ring closure of 3R-NPP; Hfas-804 and Hfas-266 were placed within the clade responsible for the 1,6 ring closure of 3R/S-NPP; and Hfas-179, Hfas-415, Hfas-10 and Hfas-342 were all present in the clade catalysing the 1,10 ring closure of E,E-FPP. However, two (Hfas-344 and Hfas-85b) were not placed within the usual clades, perhaps indicative of different types of sesquiterpene cyclisation.
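A percent-identity figure such as the 87% quoted above depends on the convention used. The sketch below illustrates one common convention, identical residues divided by aligned columns in which neither sequence has a gap; this is an assumption for illustration, as the text does not state which definition was applied. The function name and the toy alignment are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over ungapped columns of a pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip columns containing a gap in either sequence
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy alignment (hypothetical): 5 ungapped columns, 4 identical.
print(percent_identity("MKT-LV", "MRTALV"))
```

Other conventions divide by the full alignment length or by the shorter sequence length, and would give somewhat different values for the same alignment.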

Chemical Analysis of Transgenic A. oryzae
Each of the SQS predicted to deliver a 1,11 cyclisation pattern (Hfas94a, Hfas94b and Hfas255) was individually cloned into an expression vector for A. oryzae and six independent transformants assessed by GC-MS for production of novel compounds, or enhanced titres of existing compounds. Transformants with Hfas94a and Hfas94b both yielded α-humulene at Rt 11.91 as major compound along with minor traces consistent with β-caryophyllene at Rt 11.41 (Fig. 2).
Expression of Hfas255 delivered no new products. Given that expression of the candidate genes from the 1,11 SQS clade failed to deliver efficient caryophyllene production, the atypical SQS Hfas-344 was also transformed into A. oryzae. This SQS still contains the expected D(D/E)xxD and NSE motifs characteristic of SQS, but had little other sequence similarity to the canonical SQS from H. fasciculare. Successful expression of Hfas344 in A. oryzae led to accumulation of four new metabolite peaks: a major product at 12.13 min and minor products at 14.18, 15.10 and 15.65 min. When compared with the NIST MS spectra database, peaks 1 and 2 were both consistent with caryophyllene isomers (although not at the Rt of β-caryophyllene), product 3 was likely an oxidised sesquiterpene, whilst peak 4 showed no significant matches (Fig. 3 and Figs. S12, S13, S14 and S15, Supplementary information).
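The presence of the two signature motifs in a candidate protein can be screened with simple pattern matching. The following is a minimal sketch only: the aspartate-rich pattern follows the D(D/E)xxD form given in the text, whereas the expanded NSE consensus used here, (N/D)D(L/I/V)x(S/T)xxxE, is an assumption taken from the general terpene synthase literature and not from this article. The toy sequence and function name are hypothetical.

```python
import re

# Motif patterns written as regular expressions ("." matches any residue).
MOTIFS = {
    "aspartate_rich": r"D[DE]..D",     # D(D/E)xxD, as given in the text
    "NSE": r"[ND]D[LIV].[ST]...E",     # assumed consensus, see lead-in
}


def scan_motifs(protein):
    """Return {motif name: [(1-based position, matched residues), ...]}."""
    sequence = protein.upper()
    hits = {}
    for name, pattern in MOTIFS.items():
        # A lookahead lets overlapping occurrences be reported as well.
        lookahead = re.compile(f"(?=({pattern}))")
        hits[name] = [(m.start() + 1, m.group(1)) for m in lookahead.finditer(sequence)]
    return hits


# Toy sequence (hypothetical, not Hfas344).
toy_protein = "MSTAFDDLLDWKRVNDLYSYDKEQAG"
for name, found in scan_motifs(toy_protein).items():
    print(name, found)
```

A motif hit of this kind only indicates that a protein resembles a terpene synthase; as the results above show, it does not predict the cyclisation product.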

Heterologous Expression of Additional Naematolin Biosynthesis Genes
The genomic locus of the candidate naematolin SQS Hfas-344 was aligned with the homologous cluster of H. sublateritium, based around SQS Hsub99. The Hsub99 contig was larger, extending beyond Hfas344, and allowed identification of the presumed adjacent region from H. fasciculare, contig Hfas128 (Fig. 4). This revealed a number of candidate tailoring enzymes, including an FAD oxidoreductase (FADox), an aldo-keto reductase, a zinc-dependent carboxypeptidase, a zinc alcohol dehydrogenase and two cytochrome P450 oxidoreductases (Fig. 4 and Table S5). Whilst the cDNA of some of these genes was difficult to generate, the FADox cDNA was obtained and co-expressed with Hfas344. Following transformation of A. oryzae with the dual expression plasmid, the chemical profile of transformants was investigated by LC-MS of fungal extracts. The caryophyllene-isomer peak disappeared, and six new peaks were observed, four of which were purified by flash column chromatography and preparative HPLC (Fig. 5). Exact mass, IR and 2D NMR analyses were performed to elucidate their structures (Figs. S16 to S47 and Tables S6 and S7). All of these molecules appear to be based on a caryophyllene-like core, and so were consistent with the proposed identification of the cluster as being responsible for naematolin biosynthesis. Two compounds were found to be known, having previously been reported from the birch tree Betula pendula [22,23], namely 5β,6α,8β-trihydroxycariolan (1) and 5β,8β-dihydroxycariolan (4) (Fig. 6a, d, respectively), but are a new report from fungi, whilst 2 and 3 appear to be novel.

Bioactivity Test for Isolated Compounds
Compounds 1-4 from the transgenic A. oryzae were assayed by disc diffusion for antibiotic properties against a panel of microbes. Compounds 1-4 all showed some weak activity against B. subtilis, although less than naematolin, and none had any effect on the other bacteria in the test panel (Tables 1 and 2).

Accession Numbers
The verified sequences of Hfas94a, Hfas94b and Hfas344 can be found on NCBI under MK287936, MK287937 and MK287938, respectively.

Discussion
To date, no caryophyllene synthase has been reported from basidiomycete fungi, which precluded the direct identification of such genes in H. fasciculare; revealing genes encoding this novel biochemical function would therefore provide an important tool for developing potential new antimicrobial compounds. Naematolin is a modified form of a caryophyllene isomer. Therefore, all the terpene synthases predicted within the H. fasciculare genome were investigated, first by asking whether there was a corresponding homologue in the H. sublateritium genome (as both species produce naematolin) and, if so, whether enough candidate tailoring genes were also present at the same locus to deliver production of the mature compound. Our heterologous expression of Hfas94a and Hfas94b confirmed that both enzymes produce humulene as their major product, as predicted for 1,11-cyclising enzymes given their high sequence similarity with the O. olearius enzymes Omp-6 and Omp-7, which produce protoilludene; this highlights the promiscuity of sesquiterpene synthases [21]. This ruled out the SQS from the 1,11-cyclisation clade as being responsible for caryophyllene production. Hfas344 is an atypical SQS that does not cluster into the conventional clades, so its cyclisation pattern could not be predicted; however, its expression in A. oryzae generated several products, the major one most likely being a caryophyllene isomer, further demonstrating the limitations of bioinformatic predictions of gene function.
With regard to conserved motifs, DDxxD and NSE are the most common in SQS enzymes and are apparently responsible for their catalytic activity; however, advanced biochemical investigations of these motifs have suggested the presence of more specific catalytic motifs in such enzymes [22][23][24][25][26]. Sequence alignment of our SQS with chemically characterised SQS supported the presence of other conserved residues in the Hfas344 SQS (Fig. S47, Supplementary information). Biochemical analysis of such residues, however, is a prerequisite for confirming their role in the catalytic activity of such enzymes. Like us, Cox and co-workers could not draw a definite conclusion about the motifs responsible for the catalytic activity of their humulene synthase (AsR6) [27].
Following our successful expression of Hfas344 in A. oryzae, we then co-transformed it with its adjacent FADox biosynthetic gene, allowing the production of two novel caryophyllene oxides (compounds 2 and 3), along with two previously known oxygenated caryophyllene isomers (compounds 1 and 4). The presence of more than one hydroxyl group in the caryophyllene isomers produced suggests that FADox is multifunctional [28]. Such multifunctionality of tailoring enzymes has been demonstrated in the biosynthesis of both aphidicolin and ophiobolin, where one cytochrome P450 oxidoreductase catalyses two [29] and four oxidations [30], respectively. The possibility remains, however, that some of the additional oxidations observed in the production of 1-4 are performed by the host rather than by the transgenes; if this is the case, they are only possible following the prior action of the FADox, as they are not observed when the Hfas344 SQS alone is expressed.

Conclusion
In this work, we aimed to assign the biosynthetic gene cluster of naematolin. Naematolin is a bioactive bicyclic sesquiterpene that naturally derives from a caryophyllene core in Hypholoma species. Its antiviral, antioxidant and antimicrobial activity is well documented. However, its development as a potential drug has been hindered by its low titre and structural complexity. Identifying its biosynthetic gene cluster was a prerequisite for scaling up production and further modifying its chemical structure for the development of novel bioactive molecules. Our starting point was to sequence the whole genome of the native producer Hypholoma and predict all potential terpene synthases. This was followed by sequence analysis and heterologous expression of putative SQS with their related tailoring genes in A. oryzae, allowing the production of caryophyllene and four analogous derivatives, two of which have novel chemical structures. Caryophyllene itself has many medically relevant properties, including insecticidal, apoptosis-stimulating, antileishmanial and antifungal activities [31][32][33][34][35][36], indicating that its biosynthetic gene could help in the development of novel caryophyllene derivatives, especially given that its oxygenated forms showed antimicrobial activity against B. subtilis in this research. This is the first report of a caryophyllene synthase being identified from basidiomycetes and of an FAD tailoring gene involved in two oxidation reactions. Revealing such genes involved in the biosynthesis of sesquiterpene antimicrobials will therefore provide an important tool to enhance the effects of nature-based drugs.