Introduction

The Magnaporthales order encompasses several economically important plant pathogenic fungi (Luo et al. 2015). Gaeumannomyces tritici is the causative agent of take-all disease (TAD), one of the most devastating root diseases of wheat, and can also infect other cereals and grasses, such as triticale, barley, and rye (Keenan et al. 2015). Several studies have reported that this fungus can grow over a wide range of temperatures, from 4 to 30 °C, and, although the disease is more common in wheat grown under moist conditions, it can also occur under conditions of low precipitation, giving the pathogen a cosmopolitan status (Kwak and Weller 2013).

TAD generally occurs when the practice of monoculture prevails (Kwak and Weller 2013). However, abiotic factors, such as soil pH and humidity, can also influence the progression and severity of the disease (Smiley 1973; James Cook 2003). The primary infection occurs when young seedling roots meet fungus-carrying debris, which allows the fungus to invade and destroy the root tissues (Palma-Guerrero et al. 2021). Secondary infections spread through root-to-root contact, resulting in field patches of acutely infected plants (Palma-Guerrero et al. 2021). The cosmopolitan status of G. tritici leads to diminished crop yields and lower grain quality worldwide. In the United Kingdom, approximately 50% of wheat crops are affected by the disease, with average yield losses ranging from 5 to 20%, although severe epidemics can cause yield losses as high as 60% (McMillan et al. 2011; Agriculture and Horticulture Development Board (AHDB)).

Despite the substantial impact and importance of TAD pathogens, a knowledge gap persists regarding the molecular mechanisms involved in disease development, such as virulence determinants (Yang et al. 2015). To help address this gap, the genome of G. tritici has been made available. Additionally, genomes from multiple Magnaporthales species with diverse lifestyles have been published, including Magnaporthiopsis poae, Pyricularia oryzae, Pyricularia pennisetigena, Pyricularia grisea, Magnaporthiopsis incrustans, Magnaporthiopsis rhizophila, Nakataea oryzae, Pseudohalonectria lignicola, Ophioceras dolichostomum, and Falciphora oryzae (Dean et al. 2005; Xu et al. 2015; Zhong et al. 2016; Gómez Luciano et al. 2019). These sequences can be exploited through comparative genomic analysis to understand the evolutionary relationships between species and their adaptation to environmental conditions, and to locate putative virulence determinants involved in host-pathogen interactions.

Secondary metabolites (SMs) are small molecules with a myriad of biological activities. Several pathogenic fungi, including phytopathogens, employ toxic SMs to gain advantages while interacting with their hosts (Osbourn 2010; Gibson et al. 2014; Keller 2015). For example, the phytopathogen Fusarium graminearum, responsible for several diseases in wheat and barley (head blight, crown rot, and seedling blight), produces an unusual non-ribosomal octapeptide, fusaoctaxin A, that facilitates the invasion through a cell-to-cell penetration process (Jia et al. 2019). Pyrenophora tritici-repentis, which causes tan spot of wheat, produces necrosis-inducing toxins called triticones (Rawlinson et al. 2019), and Parastagonospora nodorum, another wheat pathogen, produces phomacins, compounds that disrupt the cytoskeletal rearrangements of the host, affecting positively the infection outcome (Li et al. 2020).

In filamentous fungi, the genes involved in the same SM biosynthetic pathway are usually in spatial proximity, arranged in a cluster-like manner in the genome. These biosynthetic gene clusters (BGCs) are not only common in filamentous fungi but are also found in bacteria (operons) and oomycetes and, more recently, have been discovered in plant species (Osbourn 2010; Nützmann and Osbourn 2014). The physical linkage of SM genes potentially minimizes the number of regulatory steps in the biosynthetic machinery, thereby contributing to physiological optimization (Gacek and Strauss 2012). Moreover, BGCs are frequently organized around backbone genes, such as polyketide synthases (PKS), non-ribosomal peptide synthetases (NRPS) or hybrids (PKS-NRPS). Since the domains of these backbone genes have been extensively studied, bioinformatic tools for the prediction of potential BGCs rely on these domains for accurate prediction (Medema et al. 2011; Wolf et al. 2015). Notably, the greater accessibility of sequencing techniques, combined with BGC prediction tools, has revealed a vast array of unexplored SM gene clusters. Many of these BGCs remain silent under standard laboratory culture conditions. The corresponding compounds are only synthesized during certain stages of the complex fungal life cycle (e.g., miktospiromide A and kitrinomycin A are only produced in a coculture of Penicillium brasilianum and Aspergillus nomius) (Cowled et al. 2023). Importantly, the activation of silent BGCs and the characterization of the compounds produced present significant scientific and commercial opportunities, considering the extensive range of biological activities of SMs (Brakhage 2013; Yaegashi et al. 2014; Rutledge and Challis 2015; Keller 2015).

Secondary metabolites may be valuable virulence determinants for G. tritici. Furthermore, these compounds may also play roles in the environmental persistence of the pathogen and in resistance against mycopathogens. To investigate TAD-related BGCs, we explored the genome of G. tritici strain R3-111a-1. The conservation of the identified BGCs was assessed in ten other species of the Magnaporthales order to identify BGCs unique to G. tritici or conserved among several phytopathogens. Selected gene clusters were explored in depth through comparative genomic analysis and phylogeny, and potential compounds were suggested. Furthermore, to identify BGCs that may play a vital role in host-pathogen interaction, we also analyzed previously published transcriptomic data comparing axenic culture with infected wheat roots.

Methods

Prediction of secondary metabolite biosynthetic gene clusters

All fungal genomes were obtained from the NCBI Genome Database. The BioProject accession numbers are displayed in Online Resource 1. Genome assembly quality and completeness were assessed with BUSCO (Simão et al. 2015). The prediction of putative BGCs was conducted as described previously (Sbaraini et al. 2017), with a few modifications. G. tritici R3-111a-1 BGCs were identified with the antiSMASH 7.0 (using the genome assembly sequence as input) and SMIPS (using the predicted proteins as input) algorithms (Wolf et al. 2015; Blin et al. 2023). Furthermore, the Synthase Domain Parser Tool (Gilchrist and Chooi 2021a) was employed for a more accurate prediction of the domain architecture of backbone proteins.

Conservation of the predicted secondary metabolite biosynthetic gene clusters among species from the Magnaporthales order

The conservation of the predicted BGCs among species from the Magnaporthales order with an annotated genome (M. poae ATCC 64411, P. oryzae 70-15, P. pennisetigena Br36 and P. grisea NI907) was assessed using MultiGeneBlast 1.1.14 (Medema et al. 2013), based primarily on backbone gene conservation (amino acid sequences; e-value < 1 × 10⁻⁵, query coverage ≥ 45%, and identity ≥ 45%). For species without an annotated genome (M. incrustans M35, M. rhizophila M23, N. oryzae M69, P. lignicola M95, O. dolichostomum CBS 114926 and F. oryzae R5-6-1), the analysis was first conducted using BLASTN, based on the potential conservation of the backbone gene CDS region (nucleotide sequences; e-value < 1 × 10⁻⁵, query coverage ≥ 45%, and identity ≥ 45%). Subsequently, the nucleotide fragments (containing the potential backbone genes) were annotated using FGENESH (gene-finding parameters for the closest species available in the algorithm) (Solovyev et al. 2006) and validated with BLASTP (employing the same cut-offs used in the MultiGeneBlast step). The BLASTN/FGENESH/BLASTP procedure was also used as a second line of verification for all BGCs from annotated genomes that did not show conservation in the MultiGeneBlast analyses.
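The three cut-offs above can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that the hits are available as tab-separated lines with the columns query ID, subject ID, percent identity, percent query coverage and e-value (a layout that BLAST+ can produce with a custom tabular output format); the sequence names are hypothetical and this is not the pipeline used in the study.

```python
from typing import Iterable, List, NamedTuple, Set

# Cut-offs quoted in the Methods text.
MAX_EVALUE = 1e-5
MIN_QUERY_COVERAGE = 45.0  # percent
MIN_IDENTITY = 45.0        # percent


class Hit(NamedTuple):
    query: str
    subject: str
    identity: float   # percent identity
    coverage: float   # percent query coverage
    evalue: float


def parse_hits(lines: Iterable[str]) -> List[Hit]:
    """Parse tab-separated hits (assumed column order:
    query, subject, identity, coverage, e-value)."""
    hits = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        query, subject, identity, coverage, evalue = line.split("\t")[:5]
        hits.append(Hit(query, subject, float(identity),
                        float(coverage), float(evalue)))
    return hits


def passes_cutoffs(hit: Hit) -> bool:
    """True only if the hit satisfies all three thresholds."""
    return (hit.evalue < MAX_EVALUE
            and hit.coverage >= MIN_QUERY_COVERAGE
            and hit.identity >= MIN_IDENTITY)


def conserved_backbones(lines: Iterable[str]) -> Set[str]:
    """Query IDs with at least one hit passing every cut-off."""
    return {h.query for h in parse_hits(lines) if passes_cutoffs(h)}


# Hypothetical hits, one per failure mode.
example = [
    "backbone_A\tsubject_1\t61.2\t88\t1e-120",  # passes all cut-offs
    "backbone_B\tsubject_2\t44.9\t90\t1e-80",   # identity too low
    "backbone_C\tsubject_3\t70.0\t30\t1e-50",   # coverage too low
    "backbone_D\tsubject_4\t55.0\t60\t1e-3",    # e-value too high
]
print(sorted(conserved_backbones(example)))
```

A backbone gene is counted as conserved only when a single hit meets all three criteria at once, which is why each of the last three hypothetical hits is rejected.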

Prediction of BGC boundaries and synteny comparison analyses

The cblaster tool was utilized for predicting the boundaries of all identified BGCs (Gilchrist et al. 2021; Blin et al. 2023). Based on the results obtained from cblaster, selected BGCs of interest were further analyzed for synteny comparison using the clinker tool (Gilchrist and Chooi 2021b). Both cblaster and clinker were operated using their default parameters.

Phylogenetic analyses

For phylogenetic analyses, the backbone genes of specific BGCs were chosen (GtPKS1, GtPKS10, and GtPKSNRPS3) and orthologous sequences were selected through routine BLASTP against the non-redundant protein sequences (nr) database. In addition, several fungal genome sequences have been deposited at NCBI as raw or incomplete assemblies. These projects do not have predicted proteins/genes/mRNAs deposited at NCBI and are, therefore, inaccessible through routine BLASTP/BLASTN. Thus, a screening using BLASTN against the WGS database was also performed to incorporate those putative orthologous sequences in the analyses.

When a putative positive match was found, the genome sequences were downloaded, and genes were predicted with FGENESH (gene-finding parameters for the closest species available in the algorithm) (Solovyev et al. 2006). Genes that satisfied the previously fixed cutoffs (e-value < 1 × 10⁻⁵, query coverage ≥ 50%, and identity ≥ 45%) were incorporated in the phylogeny. PRANK v.140603 was employed for sequence alignment (Löytynoja and Goldman 2010). The best-fit evolutionary model was estimated using ProtTest 3.4 (Darriba et al. 2011). Phylogenetic reconstruction (Maximum Likelihood) was conducted using PhyML 3.1 (Guindon et al. 2010) with aLRT SH-like (approximate likelihood ratio test Shimodaira–Hasegawa) branch support estimation (Anisimova and Gascuel 2006; Anisimova et al. 2011).

The phylogeny of PKS and PKS-NRPS genes can help to determine whether the collected entries are true orthologs or artifacts; however, this analysis can be problematic, particularly for ortholog definition. To address this issue, a previously established phylogeny-based approach was employed (Sbaraini et al. 2016). In this phylogeny, we included the KS and AT domains of PKS and PKS-NRPS genes from G. tritici, and the domains of all characterized PKS and PKS-NRPS genes stored in the MIBiG database (i.e., a database of characterized BGCs) (Medema et al. 2015). The amino acid alignment was built using PRANK v.100701, without manual curation. Phylogenetic reconstruction was conducted employing Maximum Likelihood as described above.

Transcriptome and RNA-Seq differential expression analysis

To validate the expression of the identified BGCs and gain insights into their regulation during plant infection, we analyzed previously published sequencing data from G. graminis var. tritici strain Ggt-C2 cells cultured on PDA plates or inoculated in wheat roots with or without the biocontrol agent Bacillus velezensis (Kang et al. 2019). The raw RNA-seq libraries were downloaded from the NCBI SRA database under the BioProject accession numbers PRJNA485739 and PRJNA496308. Low-quality bases (Q < 30) and adapter sequences were removed using Trim Galore! (https://www.bioinformatics.babraham.ac.uk/projects/trim_galore/) and the libraries were mapped to the latest annotation and assembly version of the G. tritici genome using HISAT2 (Kim et al. 2019) with default parameters. Aligned reads were counted with featureCounts and differential expression analysis was carried out using the DESeq2 pipeline (Liao et al. 2013; Love et al. 2014). Genes with an FDR-corrected p-value lower than 0.05 and an absolute log fold-change higher than 1.5 were considered differentially expressed. The parameters were based on those used by Kang and coworkers (2019).
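The decision rule in the last step can be written as a short Python sketch. It assumes that, for each gene, a log fold-change and an FDR-adjusted p-value have already been exported from the differential expression analysis; the function name and the use of None for a missing adjusted p-value are illustrative choices, not part of the original pipeline.

```python
from typing import Optional

# Thresholds quoted in the Methods text.
FDR_CUTOFF = 0.05
LFC_CUTOFF = 1.5


def classify(log_fold_change: float, padj: Optional[float]) -> str:
    """Label a gene 'up', 'down' or 'ns' (not significant).

    A missing adjusted p-value (None) is treated as not significant.
    """
    if padj is None or padj >= FDR_CUTOFF:
        return "ns"
    if log_fold_change > LFC_CUTOFF:
        return "up"
    if log_fold_change < -LFC_CUTOFF:
        return "down"
    return "ns"  # significant, but the effect size is too small


# Hypothetical values for four genes.
for lfc, padj in [(2.3, 0.001), (-1.8, 0.04), (3.0, 0.20), (1.2, 0.001)]:
    print(lfc, padj, classify(lfc, padj))
```

Both conditions must hold at once: a large fold-change with a weak adjusted p-value, or a strong adjusted p-value with a small fold-change, is not called differentially expressed.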

Results

Predictions of secondary metabolite biosynthetic gene clusters

Thirty-five putative BGCs were identified in the G. tritici R3-111a-1 genome using two prediction tools: antiSMASH 7.0 and SMIPS. The 35 BGCs belong to different classes, including 14 PKS (13 type I and 1 type III), 5 NRPS, 3 NRPS-like, 5 terpenes (TERP), 3 PKS-NRPS hybrids, and 5 indoles (IND) (Fig. 1, Online Resource 2). Additionally, among the 13 type I PKS backbone genes identified, 2 are potential non-reducing (NR) PKS, 10 are potential highly-reducing (HR) PKS, and 1 is a potential partially reducing (PR) PKS (Online Resource 3).

Fig. 1

Conservation of the predicted BGCs of G. tritici in ten species of the order Magnaporthales. Conservation was inferred using the backbone gene amino acid sequences (e-value < 1 × 10⁻⁵, query coverage ≥ 45%, and identity ≥ 45%). a) Conservation of PKS BGCs. b) Conservation of terpene and indole BGCs. c) Conservation of NRPS, NRPS-like and PKS-NRPS BGCs. PPE (P. pennisetigena); POR (P. oryzae); PGR (P. grisea); MPO (M. poae); MRH (M. rhizophila); MIN (M. incrustans); NOR (N. oryzae); FOR (F. oryzae); PLI (P. lignicola); ODO (O. dolichostomum). The conservation of the predicted BGCs among species with an annotated genome was assessed using MultiGeneBlast v1.1.14, based primarily on backbone gene conservation (amino acid sequences; e-value < 1 × 10⁻⁵, query coverage ≥ 45%, and identity ≥ 45%). For species without an annotated genome, the analysis was first conducted using BLASTN, based on the potential conservation of the backbone gene CDS region (nucleotide sequences; e-value < 1 × 10⁻⁵, query coverage ≥ 45%, and identity ≥ 45%). Subsequently, the nucleotide fragments (containing the potential backbone genes) were annotated using FGENESH (gene-finding parameters for the closest species available in the algorithm) and validated with BLASTP

Conservation of BGCs in the Magnaporthales order

To understand the distribution of the identified BGCs in closely related species with different infection traits, we assessed the conservation of these gene clusters in ten species of the Magnaporthales order, encompassing three families: Magnaporthaceae (5 evaluated species), Pyriculariaceae (3 evaluated species) and Ophioceraceae (2 evaluated species). The proximity of the selected species to G. tritici was confirmed by phylogeny (Online Resource 4). The majority of BGCs found in G. tritici were conserved in the phytopathogenic N. oryzae M69 (29 orthologous BGCs; ~ 83%) and in the endophytic F. oryzae (27 orthologous BGCs; ~ 77%). Additionally, more than 68.5% of the identified BGCs were conserved in P. grisea NI907, P. oryzae 70-15, P. pennisetigena Br36, M. incrustans M35 and M. rhizophila M23, with 28, 26, 24, 24 and 24 orthologous BGCs, respectively (Fig. 1, Online Resource 2). M. poae ATCC 64411, a phytopathogenic fungus closely related to G. tritici, harbors only 19 orthologous BGCs (~ 54%), showing less conservation than the other evaluated species of the Magnaporthaceae family (Fig. 1, Online Resource 2), as well as species from the more distantly related family Pyriculariaceae. However, the BUSCO analysis indicated that the M. poae ATCC 64411 genome is of low quality, which can potentially affect the BGC identification pipeline (Online Resource 5). Only the non-pathogenic saprotrophic fungi O. dolichostomum CBS 114926 (17 orthologous BGCs) and P. lignicola M95 (16 orthologous BGCs), from the family Ophioceraceae, showed less than 54% of conserved BGCs (Fig. 1, Online Resource 2). Notably, GtPKS1, GtPKS3, and GtTERP4 were found only in G. tritici, while 11 BGCs were conserved in all ten species (Fig. 1, Online Resource 2).
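The percentages quoted above follow directly from the ortholog counts and the total of 35 predicted BGCs. The short Python sketch below reproduces that arithmetic; the counts are taken from the text, while the dictionary layout and function name are illustrative.

```python
TOTAL_BGCS = 35  # BGCs predicted in G. tritici R3-111a-1

# Orthologous BGC counts per species, as reported in the text.
orthologous_bgcs = {
    "N. oryzae M69": 29,
    "P. grisea NI907": 28,
    "F. oryzae R5-6-1": 27,
    "P. oryzae 70-15": 26,
    "P. pennisetigena Br36": 24,
    "M. incrustans M35": 24,
    "M. rhizophila M23": 24,
    "M. poae ATCC 64411": 19,
    "O. dolichostomum CBS 114926": 17,
    "P. lignicola M95": 16,
}


def percent_conserved(count: int, total: int = TOTAL_BGCS) -> float:
    """Share of G. tritici BGCs with an ortholog, in percent."""
    return 100.0 * count / total


# Print species from most to least conserved.
for species, count in sorted(orthologous_bgcs.items(),
                             key=lambda item: -item[1]):
    print(f"{species:<30}{count:>3}  {percent_conserved(count):5.1f}%")
```

For instance, 24 of 35 BGCs corresponds to about 68.6%, which is the basis for the "more than 68.5%" statement, and 19 of 35 corresponds to about 54.3%.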

Comparative genomic analyses and phylogeny of G. tritici BGCs

Considering that no putative BGCs have been functionally characterized in TAD pathogens to date, we employed comparative genomics to determine the potential final products of these BGCs. Furthermore, a phylogenetic analysis was also performed using all PKS/PKS-NRPS backbone enzymes available at MIBiG to support the suggested orthology (Online Resource 6). These comparisons revealed three interesting BGCs that exhibited similarity with gene clusters characterized in other species: GtPKS1, GtPKS10, and GtPKSNRPS3, putatively linked with the biosynthesis of dichlorodiaporthin, ACR-toxin, and trichosetin compounds, respectively. Moreover, the GtPKS9 backbone enzyme (GGTG_00407; XP_009216417.1) is likely implicated in melanin biosynthesis, displaying 61% identity with the characterized PKS involved in Glarea lozoyensis melanin biosynthesis (pks1; AAN59953.1); the GtNRPS2 backbone enzyme (GGTG_08621; XP_009224727.1) is likely implicated in the biosynthesis of the siderophore ferricrocin, displaying 55% identity with SSM1 (i.e., the ferricrocin NRPS characterized in Pyricularia spp.; XP_003719607.1/AAX49357.1); and the GtNRPS5 backbone enzyme (GGTG_02228; XP_009218263.1) is likely implicated in the biosynthesis of coprogen siderophores, displaying 60% identity with SSM2 (i.e., the coprogen NRPS characterized in Pyricularia spp.; XP_030985539.1). Additionally, the GtNRPS1 backbone enzyme (GGTG_13228; XP_009229397.1) shows 55% identity with Abt1, implicated in aureobasidin A biosynthesis. However, Abt1 and the GtNRPS1 backbone enzyme have a different domain organization.

The GtPKS1 backbone enzyme (XP_009229208.1; GGTG_13042) exhibits 56% identity with the Aspergillus oryzae dichlorodiaporthin backbone enzyme (DiaA; BAE62229.1). For product release in A. oryzae, a β-lactamase-like enzyme (DiaB; BAE62228.1) is involved, and this enzyme is also conserved in G. tritici (54% identity; XP_009229207.1). Additionally, a short-chain dehydrogenase/reductase (DiaC; BAE62225.1) and a flavin-dependent monooxygenase (DiaD; BAE62224.1) are also conserved in G. tritici (XP_009229202.1 and XP_009229201.1), showing 66% and 54% identity, respectively (Fig. 2a). While an ortholog for AoiQ (BAE62227.1), which catalyzes the incorporation of the halogens observed in dichlorodiaporthin, was not initially identified in the G. tritici cluster, a reannotation of the region using FGENESH revealed a single gene that was initially predicted as two separate genes, XP_009229205.1 (GGTG_13039) and XP_009229206.1 (GGTG_13040). This newly defined gene shares 63% identity with AoiQ (required for dichlorodiaporthin production, with its absence leading to citreoisocoumarin accumulation), and both genes possess methyltransferase and flavin-dependent halogenase domains (Fig. 2a). Furthermore, the capability of G. tritici to produce dichlorodiaporthin is supported by two pieces of evidence. First, cblaster analysis did not detect any supplementary genes as part of the putative BGC (Online Resource 7). Second, the transcriptomic analysis revealed that the genes XP_009229201.1-XP_009229208.1 (excluding XP_009229204.1) are the only ones within this locus that are upregulated under infection conditions (Online Resource 8). Based on these findings and the existing body of knowledge about dichlorodiaporthin, we constructed a potential biosynthetic pathway for GtPKS1 in G. tritici (Fig. 2b).

Fig. 2

Putative dichlorodiaporthin gene cluster (GtPKS1), conservation and synteny. a) BGC organization and synteny comparison of GtPKS1. Backbone genes are identified with the full protein ID and accessory genes are identified with the last digits that differ from the backbone gene code. Homologous genes are indicated by color. (* indicates a reannotated gene) b) The proposed dichlorodiaporthin biosynthetic pathway

We also explored the distribution of the dichlorodiaporthin PKS employing phylogenetic approaches. The inferred phylogenetic tree revealed that the GtPKS1 backbone enzyme is in close evolutionary proximity with PKSs found in the phytopathogens Colletotrichum sublineola and Grosmannia clavigera, as well as with the backbone enzymes of the saprobe and occasional pathogen Lachnellula occidentalis and of Neurospora sp. (which presents endophytic, phytopathogenic and saprotrophic lifestyles) (Fig. 3).

Fig. 3

Phylogenetic analysis of GtPKS1. The GtPKS1 backbone enzyme (XP_009229208.1; GGTG_13042; highlighted in bold) exhibits 56% identity with the A. oryzae dichlorodiaporthin backbone enzyme (DiaA; BAE62229.1; highlighted in bold), being also in close evolutionary proximity with PKSs found in Lachnellula occidentalis, Neurospora sp., and in the phytopathogens Colletotrichum sublineola and Grosmannia clavigera

The GtPKS10 backbone enzyme (XP_009218470.1; GGTG_02434) shows 53% identity with ACRTS2 (BAN19720.1) (Fig. 4a), an A. alternata PKS responsible for the biosynthesis of ACR-toxin (Izumi et al. 2012). Both enzymes also share a common domain organization. ACRTS1, a putative hydrolase, theoretically involved in the release of the carbon chain from the PKS’ acyl carrier protein domain, is another essential enzyme in ACR-toxin biosynthesis. Notably, ACRTS1 (L8AXV5) has a predicted cytochrome P450 family domain, and a similar domain is found in XP_009218466.1 (GGTG_02430), located in the vicinity of the GtPKS10 backbone gene, although ACRTS1 and XP_009218466.1 are not orthologs. Thus, XP_009218466.1 could potentially release the carbon chain of the G. tritici PKS (Fig. 4). Interestingly, the gene XP_009218466.1 is conserved in more distantly related orthologous gene clusters, such as the gene cluster found in M. anisopliae (Fig. 4a). However, it is challenging to determine whether the entire gene cluster for ACR-toxin is conserved in the TAD pathogen, as other potential enzymes involved in compound biosynthesis have not been investigated. Moreover, XP_009218469.1 (GGTG_02433) is a potential transcription factor (harboring a Zn2Cys6 binuclear cluster DNA-binding domain and a fungal-specific transcription factor domain) which could regulate the expression of the entire cluster. This potential transcription factor is also present in BGCs of several phytopathogens (e.g., P. nodorum, P. lindquistii, P. teres f. teres and N. serpens) (Fig. 4a).

Fig. 4

Putative ACR-toxin gene cluster (GtPKS10), conservation and synteny. a) BGC organization and synteny comparison of GtPKS10. Backbone genes are identified with the full protein ID and accessory genes are identified with the last digits that differ from the backbone gene code. Homologous genes are indicated by color. b) ACR-toxin 2D chemical structure

Notably, the phylogenetic analysis employing ACR-toxin backbone enzymes revealed orthologs in several plant-associated species. Interestingly, the GtPKS10 backbone enzyme is in close evolutionary proximity with orthologous sequences found in Mycosphaerella populi (plant pathogen), dead wood-inhabiting saprobes from the order Xylariales, and plant-associated fungi from the order Hypocreales (entomopathogens, endophytes, and phytopathogens) (Fig. 5). Moreover, orthologs for the GtPKS10 BGC were found in the three evaluated species from the Magnaporthiopsis genus and in P. pennisetigena (Fig. 1a, Online Resource 2).

Fig. 5

Phylogenetic analysis of GtPKS10. The GtPKS10 backbone enzyme (XP_009218470.1; GGTG_02434; highlighted in bold) shows 53% identity with ACRTS2 (BAN19720.1; highlighted in bold), an A. alternata PKS responsible for the biosynthesis of ACR-toxin. GtPKS10 showed similarity with PKSs found in several plant-associated fungi

The predicted backbone enzyme for the GtPKSNRPS3 BGC (XP_009223648.1; GGTG_07560) displays 54% identity with the backbone enzyme EqxS (AGO86662.1), derived from F. heterosporum (Fig. 6a), which is responsible for the biosynthesis of equisetin. As in the lovastatin biosynthesis pathway, a trans-acting enoyl reductase (trans-ER; EqxC; AGO86659.1) is mandatory for equisetin biosynthesis, and a predicted trans-ER was also found conserved in the GtPKSNRPS3 BGC (XP_009223650.1 (GGTG_07562); 53% identity with EqxC). Notably, other genes found in the equisetin BGC, and not explored through knockout mutants, are also conserved in the GtPKSNRPS3 BGC: XP_009223649.1 (GGTG_07561) displays 36% identity with Eqx3 (AGO86663.1; hypothetical protein); XP_009223647.1 (GGTG_07559) displays 49% identity with EqxF (AGO86660.1; transcription factor/regulator); XP_009223646.1 (GGTG_07558) displays 64% identity with EqxG (AGO86666; MFS transporter) (Fig. 6a). However, an ortholog for EqxD (AGO86665.1) was not found in the GtPKSNRPS3 BGC. EqxD is an N-methyltransferase, and knockout mutants for this gene in F. heterosporum led to abolition of equisetin production and accumulation of trichosetin, a phytotoxic metabolite. We therefore predict that G. tritici can potentially produce trichosetin or trichosetin-like compounds. These results are further corroborated by the cblaster analysis, which did not predict additional genes in this gene cluster (i.e., the BGC comprises XP_009223646.1-XP_009223650.1). Drawing on these findings and the existing knowledge about trichosetin (Fisch 2013), we constructed a potential biosynthetic pathway for GtPKSNRPS3 in G. tritici (Fig. 6b).

Fig. 6

Putative trichosetin gene cluster (GtPKSNRPS3), conservation and synteny. a) BGC organization and synteny comparison of GtPKSNRPS3. Backbone genes are identified with the full protein ID code and accessory genes are identified with the last digits that differ from the backbone gene code. Homologous genes are indicated by color. b) The proposed trichosetin biosynthetic pathway

Notably, the phylogenetic analysis employing equisetin/trichosetin backbone enzymes revealed orthologs in several plant-associated species, including plant pathogens such as Alternaria spp., Raffaelea spp., and Penicillium expansum, besides Fusarium spp. (Fig. 7). Among the evaluated species from the Magnaporthales order, the GtPKSNRPS3 backbone gene exhibited orthologs in all three Pyricularia species and in M. incrustans (Fig. 1, Online Resource 2). However, as uncovered by the phylogenetic analysis, equisetin/trichosetin orthologs in the Magnaporthales order do not have a monophyletic origin (Fig. 7).

Fig. 7

Phylogenetic analysis of GtPKSNRPS3. The predicted backbone enzyme for the GtPKSNRPS3 BGC (XP_009223648.1; GGTG_07560; highlighted in bold) displays 54% identity with the backbone enzyme EqxS, derived from F. heterosporum (highlighted in bold), responsible for the biosynthesis of equisetin. GtPKSNRPS3 is in evolutionary proximity with PKSs found in several phytopathogens and plant-associated fungi

Expression profile of predicted BGCs under conditions mimicking wheat root infection

To gain insights into the expression of the predicted BGC backbone genes under wheat root conditions, we explored previously published RNA-seq data (Kang et al. 2019). In this experiment, wheat roots infected with G. tritici (Wr_Gt), and wheat roots infected with G. tritici in the presence of the biocontrol agent Bacillus velezensis (Wr_Gt_Bv), were compared with G. tritici grown on potato dextrose agar plates (Pda_Gt). Thus, three comparative analyses were performed (Wr_Gt x Pda_Gt; Wr_Gt_Bv x Pda_Gt; Wr_Gt x Wr_Gt_Bv). The detailed experimental procedure was previously described by Kang and coworkers (2019). The differential expression analysis showed that, of the 35 backbone genes, three were downregulated in the Wr_Gt x Pda_Gt comparison (GtNRPS2, GtNRPS-like3, and GtPKSNRPS1); six were upregulated in the Wr_Gt x Pda_Gt comparison (in addition to the already described GtPKS1, the backbone genes of GtPKS12, GtPKS13, GtNRPS5, GtNRPS-like1, and GtTERP2); and three were upregulated in the Wr_Gt_Bv x Pda_Gt comparison (GtPKS1, GtPKS8, and GtPKS12) (Fig. 8). Notably, the predicted boundaries of differentially expressed gene clusters were adjusted according to the expression profile results (Online Resource 9).
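The overlap between the two infection comparisons can be made explicit with simple set operations. The Python sketch below only encodes the gene lists reported in the paragraph above; the variable names are illustrative.

```python
# Backbone genes upregulated in Wr_Gt x Pda_Gt (from the text).
up_wr_gt = {"GtPKS1", "GtPKS12", "GtPKS13",
            "GtNRPS5", "GtNRPS-like1", "GtTERP2"}

# Backbone genes downregulated in Wr_Gt x Pda_Gt (from the text).
down_wr_gt = {"GtNRPS2", "GtNRPS-like3", "GtPKSNRPS1"}

# Backbone genes upregulated in Wr_Gt_Bv x Pda_Gt (from the text).
up_wr_gt_bv = {"GtPKS1", "GtPKS8", "GtPKS12"}

# Upregulated on wheat roots with and without B. velezensis.
shared_up = up_wr_gt & up_wr_gt_bv

# Upregulated only in the absence of the biocontrol agent.
only_without_bv = up_wr_gt - up_wr_gt_bv

# Upregulated only in the presence of the biocontrol agent.
only_with_bv = up_wr_gt_bv - up_wr_gt

print("shared:", sorted(shared_up))
print("only without Bv:", sorted(only_without_bv))
print("only with Bv:", sorted(only_with_bv))
```

Read this way, GtPKS1 and GtPKS12 are the two backbone genes reported as upregulated in both infection comparisons, whereas GtPKS8 appears only when B. velezensis is present.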

Fig. 8

Heatmap of backbone genes differentially expressed in at least one infection condition. Differential expression analysis of wheat roots infected with G. tritici compared with G. tritici grown on potato dextrose agar plates (Wr_Gt x Pda_Gt), and wheat roots infected with G. tritici in the presence of the biocontrol agent Bacillus velezensis compared with G. tritici grown on potato dextrose agar plates (Wr_Gt_Bv x Pda_Gt). Genes not differentially expressed are indicated in grey

Furthermore, 29 out of 35 backbone genes were expressed in at least one of the conditions explored (FPKM values ≥ 2). Although not upregulated, the backbone gene of the GtPKS10 BGC (i.e., putatively implicated in ACR-toxin-like compound biosynthesis) had FPKM values ≥ 2 in all three conditions evaluated, while the GtPKSNRPS3 backbone gene (i.e., putatively implicated in trichosetin biosynthesis) had FPKM values ≥ 2 in the Wr_Gt_Bv condition. Moreover, the BGCs implicated in melanin (GtPKS9), ferricrocin (GtNRPS2), and coprogen (GtNRPS5) biosynthesis also had FPKM values ≥ 2 in all three conditions explored. These results support the functionality of the predicted backbone genes/BGCs and suggest a potential activity of the produced metabolites in these conditions.
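For readers unfamiliar with the unit, the expression call above can be sketched in Python using the standard definition of FPKM (fragments per kilobase of transcript per million mapped fragments). The text does not state how the FPKM values were computed, so the formula below is the textbook definition rather than a description of the study's pipeline, and the numbers are hypothetical.

```python
def fpkm(fragment_count: float, gene_length_bp: float,
         total_mapped_fragments: float) -> float:
    """Standard FPKM: count * 1e9 / (length in bp * mapped fragments)."""
    if gene_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and library size must be positive")
    return fragment_count * 1e9 / (gene_length_bp * total_mapped_fragments)


def is_expressed(fpkm_value: float, threshold: float = 2.0) -> bool:
    """Expression call used in the text: FPKM >= 2."""
    return fpkm_value >= threshold


# Hypothetical gene: 100 fragments, 2,000 bp, 10 million mapped fragments.
value = fpkm(100, 2_000, 10_000_000)
print(value, is_expressed(value))
```

Because FPKM normalizes for both gene length and library size, the same threshold can be applied to long backbone genes such as PKSs and to libraries of different depth.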

Discussion

Phytopathogenic fungi have a worldwide impact, being responsible for significant losses in the main cultivated crops. These pathogens have been the focus of intensive research, given their socioeconomic impact (Dean et al. 2012). Comparative genomic studies have proven to be important to understand the evolution of fungal pathogens and highlight potential virulence determinants (Dean et al. 2012), but initiatives exploring G. tritici are still scarce. The availability of the G. tritici genome, along with sequences of several other species of the order Magnaporthales, creates a platform for highlighting putative virulence determinants. In this context, secondary metabolites can be important tools for microorganisms, being produced to bypass the host’s defenses and guarantee the success of these organisms in the environment (Keller 2015). We explored the potential of the secondary metabolism of G. tritici and identified three interesting BGCs: GtPKS1, GtPKS10 and GtPKSNRPS3.

Our analysis showed that orthologs for the GtPKS1 backbone gene were absent in all evaluated Magnaporthales species. Furthermore, this cluster was differentially expressed under infection conditions and is similar to an A. oryzae BGC involved in dichlorodiaporthin biosynthesis. Through heterologous expression and in vitro reactions, the functions of BAE62229.1 (DiaA), BAE62228.1 (DiaB), BAE62225.1 (DiaC), BAE62224.1 (DiaD), and BAE62227.1 (AoiQ) were explored in A. oryzae. These approaches led to the isolation of 8-methyldichlorodiaporthin, dichlorodiaporthin, citreoisocoumarin, and 20 other compounds (Chankhamjon et al. 2016; Liu et al. 2021). The heterologous expression of the complete BGC resulted in the production of 8-methyldichlorodiaporthin and dichlorodiaporthin, both of which have cytotoxic activity (Almeida et al. 2018; Cai et al. 2018). Moreover, dichlorodiaporthin also showed antifungal activity against Colletotrichum musae and Rhizoctonia solani, as well as α-glucosidase inhibitory activity (Li et al. 2016). Several structurally related isocoumarins exhibit phytotoxic activity, as well as a variety of potential biotechnological applications (Saeed 2016; Meepagala et al. 2018; Shabir et al. 2021). Given the predicted capacity of G. tritici to produce dichlorodiaporthin and related compounds, and the upregulation of several genes in the GtPKS1 BGC under infection conditions, we propose that this gene cluster may play a significant role in the G. tritici infection process in wheat.

The GtPKS10 gene cluster was potentially associated with the biosynthesis of a molecule similar to ACR-toxin. In A. alternata, ACR-toxin is a host-selective toxin that affects rough lemon (Citrus jambhiri) mitochondria, uncoupling oxidative phosphorylation and causing leakage of the cofactor NAD+ from the Krebs cycle (Izumi et al. 2012). Furthermore, two genes in the GtPKS10 cluster had their putative function identified: XP_009218466.1 (GGTG_02430), a cytochrome P450 with a potential role in product release; and XP_009218469.1 (GGTG_02433), with a potential role as a transcription factor (Fig. 4). Notably, the GtPKS10 backbone gene XP_009218470.1 (GGTG_02434) was not differentially expressed under infection conditions, suggesting that it may not be involved in the infection process.

The next BGC putatively linked with an important compound was GtPKSNRPS3. This gene cluster was related to eqx, an F. heterosporum BGC involved in the biosynthesis of equisetin, an N-methyl serine-derived acyl tetramic acid with antibiotic, cytotoxic, and HIV inhibitory activities (Vesonder et al. 1979; Singh et al. 1998; Kakule et al. 2013). However, the absence of an eqxD ortholog in G. tritici indicates that the most likely final product is trichosetin, an N-desmethyl precursor of equisetin, which has phytotoxic and antibiotic activity (Fig. 6) (Marfori et al. 2002, 2003). Multiple GtPKSNRPS3 accessory genes were conserved in the F. heterosporum eqx BGC, reinforcing the functionality of this cluster in G. tritici. Moreover, other genes interspersed within the GtPKSNRPS3 BGC may also contribute to compound biosynthesis, potentially acting as transporters or transcription factors/regulators. Notably, the phylogenetic analysis of equisetin/trichosetin backbone enzymes revealed orthologs in several plant-associated species (Fig. 7).

Besides the three BGCs already discussed, other clusters also proved to be interesting targets for future studies, such as GtPKS9, GtNRPS2, and GtNRPS5, which are likely implicated in melanin, ferricrocin, and coprogen biosynthesis, respectively. Melanin has several functions, such as photoprotection and thermoregulation, and in G. tritici it is also an important virulence determinant, being necessary for the development of TAD (Henson et al. 1999; Cordero and Casadevall 2017). The siderophores ferricrocin and coprogen are low-molecular-mass iron chelators involved in the acquisition of iron, an essential element for the development of filamentous fungi (Haas et al. 2008; Khan et al. 2018). Furthermore, siderophores play a crucial role in infection processes across various host-pathogen interactions (Oide et al. 2006; Hof et al. 2007; Haas et al. 2008; Chen et al. 2013). In the phytopathogenic bacterium Erwinia chrysanthemi, the siderophore chrysobactin is a key determinant of the infection outcome (Neema et al. 1993). This siderophore effectively diminishes the iron concentration in colonized tissues, thereby depriving plant cells of essential iron resources (Neema et al. 1993). Ferricrocin is an intracellular siderophore that facilitates iron storage, and Hof and coworkers (2007) demonstrated its involvement in the ability of P. grisea appressoria to penetrate the plant surface: deletion of the gene responsible for its biosynthesis led to decreased virulence in rice. However, in our analysis of the transcriptomic data published by Kang and coworkers (2019), the GtNRPS2 backbone gene was downregulated under infection conditions (Wr_Gt x Pda_Gt and Wr_Gt_Bv x Pda_Gt), although other nearby genes were upregulated. Further studies are therefore needed to elucidate the role of this BGC in TAD.
Moreover, coprogen is an iron(III) hydroxamate siderophore involved in the acquisition of extracellular iron. It is mostly produced under iron deprivation and is also an important virulence factor for several pathogenic fungi (Miethke and Marahiel 2007; Hof et al. 2009; Voß et al. 2020). In the transcriptomic analysis, GtNRPS5 was upregulated under the Wr_Gt x Pda_Gt condition, reinforcing the hypothesis that GtNRPS5 is involved in the infection of wheat by G. tritici. Surprisingly, in the presence of B. velezensis (Wr_Gt_Bv x Pda_Gt), the GtNRPS5 backbone gene was not differentially expressed, which may indicate interference caused by the biocontrol agent. In summary, the presence of BGCs for intra- and extracellular siderophores already identified as virulence determinants in other pathogenic fungi supports the putative involvement of ferricrocin and coprogen in the infection process of G. tritici.

Despite the limitations of the differential expression analysis, this method allows the identification of genes possibly involved in the infection process and, therefore, provides relevant information to be investigated in future work. Two BGCs with no orthologs characterized in other species, GtTERP2 and GtPKS13, were upregulated in the mimicked infection condition and were extensively conserved across the Magnaporthales order, making them of great interest for future investigation.

Conclusions

The availability of the genome of the TAD pathogen G. tritici, in addition to those of several other species belonging to the Magnaporthales order, has enabled comparative analyses. These analyses facilitate the identification of putative BGCs and provide insights into their genome organization, gene expression, and potential roles as determinants of virulence. Although the importance of the GtPKS1, GtPKS9, GtPKS10, GtNRPS2, GtNRPS5, and GtPKSNRPS3 BGCs during G. tritici infection of economically important cereals and grasses requires further confirmation, our findings identify candidate targets for future research. Dichlorodiaporthin, trichosetin, ACR-toxin, and siderophore compounds can play significant roles in fungal-plant interactions in several models, including economically important phytopathogenic fungi.