The smut syndrome as a model for convergent evolution

Convergence, defined as the phenotypic similarity between species of unrelated lineages, is a common phenomenon in organismal biology. Prominent examples of convergent phenotypes are known in animals (e.g., lifestyle and morphology in fish, ichthyosaurs and Cetacea) and plants (e.g., C4 photosynthesis evolving at least 45 times independently; Sage 2004). Molecular phylogenetics has shown that many examples of convergent evolution have been overlooked, as the evolutionary relationships between taxa were reconstructed based on morphologies that were subject to convergent evolution (e.g., Lee and Palci 2015; Legg et al. 2013). The cause of convergence is often attributed solely to selection pressures, and phenotypic convergence has been put forward as the prime example of the role of adaptation: similar selection pressures lead to similar phenotypes. In recent decades, however, a debate has emerged in evolutionary biology about whether genetic, developmental and physiological constraints also have an important impact on phenotypic evolution (Cohen et al. 2012; Gould and Lewontin 1979; Losos 2011). Further difficulties arose with the recognition that what is convergent on one level of biological organisation might not be convergent on the levels below, and that similar morphological traits can be influenced by different genes (Rosenblum et al. 2014). Therefore, some authors have advocated using the term convergence if a similarity on the phenotypic level arises from different molecular backgrounds and parallelism if the underlying molecular mechanisms are the same (Pickersgill 2018; Rosenblum et al. 2014).

Fungi constitute one of the major eukaryotic lineages, but explicit studies on convergent evolution and its mechanisms in this taxon are rare. This is surprising, as there are plenty of phylogenetic studies on fungi that show a plethora of convergent morphological similarities. For instance, lichens from two lineages of Polychidium have evolved a morphologically similar dendroid thallus architecture independently of each other despite living in symbioses with different photobionts (Muggia et al. 2011). In Agaricomycotina (Basidiomycota), specific fruiting body morphologies of mushrooms are polyphyletic traits: gills, pores, stipes and caps evolved several times independently (Oberwinkler 1977, 2012; Varga et al. 2019). The formation of yeasts (i.e., single-celled fungi) is also observed in several fungal lineages, mostly within Dikarya and Mucoromycotina. Using phylogenomics and transcriptomics, Nagy et al. (2014) showed that the potential to form yeasts has arisen in early ancestors of modern-day fungi and potentially is an example of parallelism. Independent fungal lineages that exhibit yeast growth utilise a common family of zinc-finger transcription factors that regulate the switch between filamentous and single-celled growth.

Smut fungi in Microbotryomycetes (Pucciniomycotina) and Ustilaginomycetes (Ustilaginomycotina) have remarkably similar morphologies despite hundreds of millions of years of independent evolution (Begerow et al. 2004, 2014; He et al. 2019; Zhao et al. 2017). In fact, species in Microbotryum (Microbotryomycetes) show such a high degree of similarity in the morphology and life cycle to those in Ustilago (Ustilaginomycetes) that the species were formerly grouped within Ustilago (Deml and Oberwinkler 1982). Only the advent of molecular phylogenetics indicated that the morphologically nearly indistinguishable stages of the life cycles of the two groups evolved convergently (Gottschalk and Blanz 1985; Walker and Doolittle 1983). Since then, these results have been reinforced by numerous molecular and ultrastructural studies (Bauer et al. 1997; Begerow et al. 1997; Zhao et al. 2017). The life cycles of both groups of smuts are dimorphic, featuring a saprotrophic yeast stage and a filamentous parasitic stage that is initiated by mating of the haploid yeast cells (see e.g., Nieuwenhuis et al. 2013; Perlin et al. 2015; Saville et al. 2012; Schäfer et al. 2010 and Snetselaar and Mims 1992, 1993, 1994). The parasitic stage ultimately leads to the so-called smut syndrome, with mostly dark-coloured teliospores often occurring in reproductive organs (e.g., ovules and anthers) or in leaves of their plant hosts. The teliospores germinate usually on a new host individual, where meiotic divisions give rise to the basidium, resulting in saprotrophic sporidia, which proliferate asexually as yeasts (e.g., Schäfer et al. 2010 and Vollmeister et al. 2012).

For Microbotryum and Ustilaginaceae (Ustilaginomycetes), the most extensively studied groups of smut fungi, contributions from molecular biology, genomic and transcriptomic analyses have been added to ecological and experimental observations. Combined, they have great potential to promote insight into the biological processes behind the convergent lifestyles of the two groups. This article summarizes some of the current understanding of the lifestyles in these two smut groups and the underlying mechanisms, as well as the tools available to study them. We complement existing data with analyses on shared orthologous genes and gene function comparisons to provide impulses for the study of convergent evolution and possibilities for the research communities to combine their contributions for a broader understanding of plant parasites.

Toolbox for assessing convergent evolution in smuts

Genomic data availability, sequencing motivation and composition

Several genomes of Ustilaginaceae and Microbotryum are publicly available on the International Nucleotide Sequence Database Collaboration (INSDC) genome databases (e.g., https://www.ncbi.nlm.nih.gov/genome/). In Ustilaginaceae, the genomes of 28 species from several, phylogenetically dispersed parasitic genera, mainly Ustilago s.l. (15 species; McTaggart et al. 2016) and Sporisorium (five species), Anthracocystis (three species) and specimens of some other genera are publicly available (August 2022). For the genetic model organism Ustilago maydis, 38 genome data of several strains and different mating types are available, while for most other species genome data are only available for one or two strains. Many of these are of good assembly quality with limited fragmentation and Anthracocystis panici-leucophaei, Sporisorium graminicola, S. reilianum, S. scitamineum, Ustilago bromivora, U. hordei and U. maydis are provided at chromosome level. Publicly available genomes of Microbotryum species belong to 17 closely related species, with well-assembled, long-read genome sequences of both mating types. In addition, the genomes of several strains and species from the two groups were sequenced using short-read technology which resulted in typically high contig counts. To compare the structure and composition of the genomes of Ustilaginaceae and Microbotryum, we selected published high quality genomes (N50 > 500 kb) of six and nine species, respectively (Table 1). Genomes of anamorphic yeasts in the Ustilaginaceae, mainly from the former genus Pseudozyma, including renamed species, were omitted as no parasitic phase is known for them.
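The assembly-quality criterion used above (N50 > 500 kb) can be illustrated with a minimal sketch. The function names and the contig lengths in the usage note are hypothetical and only serve to show how such a filter could be computed; they are not taken from the analysed genomes.

```python
def n50(contig_lengths):
    """Return the N50 of an assembly.

    N50 is the length L of the shortest contig in the set of longest
    contigs that together cover at least half of the total assembly size.
    """
    if not contig_lengths:
        raise ValueError("no contig lengths given")
    total = sum(contig_lengths)
    running = 0
    # Walk from the longest contig downwards until half the assembly is covered.
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


def passes_quality_filter(contig_lengths, min_n50=500_000):
    """Hypothetical helper: keep assemblies whose N50 exceeds the threshold (in bp)."""
    return n50(contig_lengths) > min_n50
```

For example, `passes_quality_filter([700_000, 600_000, 100_000])` evaluates the N50 of three made-up contigs against the 500 kb threshold.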

Table 1 General genome statistics. Genes for Ustilaginaceae were predicted using AUGUSTUS v 3.4.0 (Stanke et al. 2006) with Ustilago maydis-based gene models, and existing EuGene (Foissac et al. 2008) gene predictions were used for Microbotryum genomes

Genome sequencing in the Ustilaginaceae has primarily been conducted to understand the molecular aspects of plant-parasite interactions and to develop novel genetic tools (Basse and Steinberg 2004; Bösch et al. 2016; Kahmann and Kämper 2004; Kämper et al. 2006; Ökmen et al. 2018; Rabe et al. 2016; Que et al. 2014; Schirawski et al. 2010; Schweizer et al. 2018; Sharma et al. 2019; Steinberg and Pérez-Martín 2008). Understanding secondary metabolite production for biotechnological applications has also been an important motivation for genome sequencing (Konishi et al. 2013; Solano-González et al. 2019). The status of U. maydis as a model organism for genetics with an elaborate manual gene annotation and insights into its infectious mechanisms motivated the sequencing of genomes of related species for comparative genomics approaches to target the same mechanism of infection and their conservation (Benevenuto et al. 2018; Berlemont 2017; Schirawski et al. 2010). In particular, the genome comparison of Ustilago species and S. reilianum f. sp. zeae has highlighted the relevance of comparative approaches including other smut fungi for future studies (Wollenberg and Schirawski 2014).

In Microbotryum, genome sequencing as a tool for understanding the molecular mechanisms of infection has only been the focus of few studies so far (Beckerson et al. 2019; Perlin et al. 2015). The genomes in this group have mainly been sequenced to study evolutionary mechanisms and outcomes, including adaptation (Aguileta et al. 2016), population structure (Büker et al. 2016), mating-type chromosome evolution (Badouin et al. 2015; Branco et al. 2017, 2018; Carpentier et al. 2019, 2022; Duhamel et al. 2022a) and mechanisms against transposable element accumulation in genomes (Horns et al. 2017).

Structures and composition of available genomes

We selected several species of Microbotryum and Ustilaginaceae (Sporisorium and Ustilago s.l.) as representatives in order to compare the genome structure of the two smut fungal groups (Table 1). Genomes of the two groups showed differences in haploid genome size. While Ustilaginaceae genome sizes ranged from 19 to 27 Mb (average: 21 Mb), Microbotryum species generally had larger genomes that ranged from 23 to 40 Mb (average: 30 Mb). Microbotryum genomes exhibited a constant GC content of 56%, whereas Ustilaginaceae genomes showed variations ranging from 51% in U. hordei up to 59% in S. reilianum, potentially reflecting a broad phylogenetic sampling of the species in the Ustilaginaceae.

We used Augustus (Stanke et al. 2006) to predict genes for Ustilaginaceae, as gene models for U. maydis were available for this tool, allowing for the consistent annotation of genomes from different sources. For Microbotryum species, we used the EuGene (Foissac et al. 2008) gene predictions, as gene models for closely related species were not available in Augustus. This made it impossible to obtain high-quality models with current data, and the use of the U. maydis gene models produced only 7888 genes on average in Microbotryum. Gene predictions for haploid Ustilaginaceae genomes resulted in an average of 6553 genes, with a minimum of 6288 genes in U. bromivora and a maximum of 7570 genes in U. hordei. In the selected haploid Microbotryum genomes, EuGene predicted 11,243 genes on average, with 8007 genes in M. scabiosae (Carpentier et al. 2022) and 15,486 genes in M. violaceum s.l. on Silene paradoxa (Table 1; see also Branco et al. 2018). The genomes of Microbotryum species also had an increased gene density (352.6–412.2 genes/Mb) compared to the smaller Ustilaginaceae genomes (240–346.9 genes/Mb). On the other hand, Ustilaginaceae genomes included fewer non-coding areas within genes (0.55–1.27 introns/gene) compared to Microbotryum genes, which contained a larger number of introns (3.2–4 introns/gene). The frequency of introns is known to affect diverse genomic characteristics, such as alternative splicing (Muzafar et al. 2021) and gene expression regulation (Rose 2019). Whether such transcriptional plasticity exists between the two taxa, and how it might influence other organismal traits, is so far unknown. The intron frequency observed in Ustilaginaceae and Microbotryum species is also similar to the intron distribution seen in other species of Ustilaginomycotina and Pucciniomycotina (Lefebvre et al. 2013; Schwessinger et al. 2018; Tkavc et al. 2018; Toome et al. 2014) and could be due to common ancestry in the respective lineages.
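The two summary statistics used in this comparison, gene density (genes/Mb) and introns per gene, follow from simple ratios. The sketch below is an illustration under stated assumptions: the function names are hypothetical, and it assumes that each gene is represented by the exon count of a single transcript, so that a gene with k exons contributes k - 1 introns.

```python
def gene_density(n_genes, genome_size_bp):
    """Genes per megabase for a haploid genome of the given size in base pairs."""
    if genome_size_bp <= 0:
        raise ValueError("genome size must be positive")
    return n_genes / (genome_size_bp / 1_000_000)


def introns_per_gene(exon_counts):
    """Mean number of introns per gene.

    exon_counts holds one exon count per gene (assumption: one transcript
    per gene); a gene with k exons has k - 1 introns.
    """
    if not exon_counts:
        raise ValueError("no genes given")
    return sum(k - 1 for k in exon_counts) / len(exon_counts)
```

With made-up inputs, `gene_density(7000, 20_000_000)` describes a 20 Mb genome carrying 7000 predicted genes, and `introns_per_gene([1, 2, 3])` averages over three genes with zero, one and two introns.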

We reconstructed orthologous groups using OrthoFinder (Emms and Kelly 2015, 2019) to determine similarities and differences in gene composition between the two taxa (Fig. 1). Based on orthologous groups, we considered genes in the following categories: the core genome (i.e., genes occurring in all genomes), genes specific to Ustilaginaceae, genes specific to Microbotryum, unspecific (i.e., appearing in more than one species, but not belonging to any of the other categories), species-specific genes and orphan genes (i.e., genes that were not sorted into orthogroups). Gene numbers present in the core genome did not vary significantly, likely representing general eukaryotic housekeeping genes and genes specific to higher taxa, such as Fungi or Basidiomycota.
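The categories defined above can be made explicit with a short sketch. It assumes, hypothetically, that each orthologous group is available as a mapping from species name to gene count, and that "group-specific" means present in at least two species that all belong to one group; the text does not state whether presence in every species of a group is required. Orphan genes are by definition outside any orthogroup and would have to be counted separately. Function and category names are illustrative.

```python
from collections import defaultdict


def classify_orthogroup(counts, group_a, group_b):
    """Assign one orthogroup to a category from its per-species gene counts."""
    present = {species for species, n in counts.items() if n > 0}
    if not present:
        raise ValueError("orthogroup contains no genes")
    if present == set(group_a) | set(group_b):
        return "core"              # found in every analysed genome
    if len(present) == 1:
        return "species-specific"  # all genes come from a single species
    if present <= set(group_a):
        return "group_a-specific"
    if present <= set(group_b):
        return "group_b-specific"
    return "unspecific"            # spans both groups without being universal


def genes_per_category(orthogroups, group_a, group_b):
    """Sum gene counts per species and category over all orthogroups."""
    totals = defaultdict(lambda: defaultdict(int))
    for counts in orthogroups.values():
        category = classify_orthogroup(counts, group_a, group_b)
        for species, n in counts.items():
            if n > 0:
                totals[species][category] += n
    return totals
```

Summing gene counts rather than orthogroup counts matches the per-genome gene numbers discussed in the text, since a single orthogroup can hold several paralogues from one species.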

Fig. 1

Number of genes in orthologous groups separated by common to all genomes, unspecific (i.e., orthologous groups that occurred in two or more genomes but could not be attributed to other categories), specific for Ustilaginaceae, specific for Microbotryum, species-specific and unassigned (orphan genes). Haploid Microbotryum genomes contained more genes than haploid Ustilaginaceae genomes. Common orthologous groups contained a similar number of genes in all species, while Microbotryum species contained more taxon- and species-specific genes. Orthologous groups were calculated in OrthoFinder v. 2.5.2 (Emms and Kelly 2015, 2019) using the DIAMOND algorithm (Buchfink et al. 2015). Overlap of orthologous groups between the species was calculated in R 3.6.1 (R Core Team 2021) and plotted using the ggplot2 package (Wickham 2016)

Surprisingly, the phylogenetically close relationships between the analysed Microbotryum species did not result in fewer species-specific and orphan genes compared to the species in Ustilaginaceae. Ustilaginaceae contained an average of 131 species-specific and 188 orphan genes, while Microbotryum contained 321 and 372, respectively (Wilcoxon test p = 0.11 (unassigned) and p < 0.05 (specific)). However, when normalised for the gene count per species, the differences were less pronounced (Microbotryum unassigned genes 3.1%, species-specific 2.7% and Ustilaginaceae 2.8% and 1.9%, respectively). Early studies assumed a high proportion of species-specific and orphan genes in individual genomes, ranging between 10 and 20% of the complete gene set (Khalturin et al. 2009), indicating that gene-rich genomes harbour a higher total number of genes not observed in other species. In our data, the average combined species-specific and orphan gene content was close to 5%, which most likely reflects the increased sample size in the specific groups and also the more sensitive detection of orthology used by newer comparative methods (Emms and Kelly 2019). A recent genomic study of Ustilaginomycotina also found the proportion of species-specific genes in some Ustilaginaceae lineages to be within this range, but indicated that higher percentages are common in this group as well (Kijpornyongpan et al. 2018). Potentially, the number of species-specific genes is even lower if several specimens of the same species are included in the comparison. For instance, between populations of M. lychnidis-dioicae and M. silenes-dioicae, the number of species-specific genes in autosomes (all chromosomes but the mating-type chromosomes carrying the mating-type genes, see “Mating”) was determined as only 186 and 51, respectively (Hartmann et al. 2018). As our comparison included all chromosomal DNA, it is unclear whether species-specific genes are enriched on mating-type chromosomes or whether there is high intraspecific variation in gene presence/absence, and further investigations are needed.
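The group comparison reported above relies on a Wilcoxon test. As an illustration only, the sketch below assumes the unpaired (rank-sum, Mann-Whitney) variant, which the text does not specify, and computes an exact two-sided p-value by enumerating every way of splitting the pooled values into two groups of the observed sizes. This is feasible for small samples such as six versus nine genomes (5005 splits). The function names are hypothetical and no values from the study are reproduced.

```python
from itertools import combinations


def u_statistic(x, y):
    """Mann-Whitney U: number of pairs with x > y, counting ties as one half."""
    return sum((a > b) + 0.5 * (a == b) for a in x for b in y)


def exact_rank_sum_p(x, y):
    """Exact two-sided p-value of the rank-sum test by full enumeration.

    Every assignment of the pooled values to a first group of size len(x)
    is treated as equally likely under the null hypothesis.
    """
    pooled = list(x) + list(y)
    n1, n2 = len(x), len(y)
    centre = n1 * n2 / 2  # expected U under the null hypothesis
    observed = abs(u_statistic(x, y) - centre)
    hits = total = 0
    for idx in combinations(range(len(pooled)), n1):
        chosen = set(idx)
        first = [pooled[i] for i in idx]
        second = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        # Count splits at least as extreme as the observed one.
        if abs(u_statistic(first, second) - centre) >= observed - 1e-12:
            hits += 1
    return hits / total
```

Enumeration keeps the sketch free of distributional approximations and handles ties naturally; for larger samples a statistics library would be the more practical choice.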

Group-specific genes (i.e., Microbotryum-specific genes) were mostly responsible for the higher gene number in Microbotryum genomes compared to Ustilaginaceae genomes. The group-specific genes represent 25.9–32.1% of the genes in Ustilaginaceae and 36.1–57.5% of the genes in Microbotryum (Fig. 1). Microbotryum genomes may be affected by more gene duplication than Ustilaginaceae genomes, as it was estimated, for example, that U. maydis genomes did not show signs of extensive gene duplication events (Kämper et al. 2006). Alternatively, longer retention of duplicated genes in the genomes of Microbotryum species is possible. Using OrthoFinder, we inferred between 522 and 3655 species-specific duplication events in Microbotryum genomes, against only 131 to 1644 such events in Ustilaginaceae. In addition to examining gene duplication events within individual species, we also investigated gene duplications that occurred only within each lineage. Notably, the number of such events was substantially higher in Microbotryum (6324) compared to Ustilaginaceae (592). Gene duplications are considered a major factor in the emergence of new genes with potential new functions, as the presence of two gene copies can decrease the selective pressure on one of them and allows for mutations and potential neofunctionalization (Ohno 1970). The generally high number of group-specific genes suggests that the groups use divergent sets of genes for similar lifestyles, including group-specific effectors or CAZymes acting during the infection (Berlemont 2017; Perlin et al. 2015; Toh et al. 2018).

The comparison of the genomic composition between species from the two groups is complicated by large evolutionary distances between species studied within Ustilaginaceae compared to species within Microbotryum. Although adding more genomes for each group will certainly refine this picture, we still detected some noteworthy patterns. The genomes of Ustilaginaceae species contain fewer genes than those of Microbotryum species, but species-specific genes were more common in the latter group. Additionally, the variation in gene content was higher in Microbotryum than in Ustilaginaceae. Not only in Ustilaginaceae, but also in the whole subdivision Ustilaginomycotina, with the constant exception of Malassezia spp., gene numbers mainly range between 6500 and 8000 genes (Kijpornyongpan et al. 2018), whereas gene content within Pucciniomycotina is less constant (see https://mycocosm.jgi.doe.gov/mycocosm/home). Determining whether differences in gene content reflect different ecologies or the divergent evolutionary history of the two groups is an aspect that promises a better understanding of genome evolution in convergent lineages.

Availability and focus of transcriptomic data

Transcriptomic data is helpful during genome annotation, where it is used to identify coding sequences by mapping RNA sequences on the genome (reviewed by Chen et al. 2017). In parasitic fungi another important motivation for sequencing transcriptomes is the understanding of gene expression during the successive steps of infection in host–parasite interactions. With decreasing sequencing costs, transcriptomes have become a means of unravelling the molecular details of plant-parasite interactions (reviewed by Bhadauria et al. 2007). In smut fungi of both groups, main study objectives have been the identification of effector candidate genes and regulator mechanisms of pathogenicity-related genes (Lanver et al. 2018; Perlin et al. 2015). Usually, gene expression patterns are compared to the saprotrophic stage, which enables the identification of pathogenicity-related genes.

Identifying candidate genes involved in the interaction with their host, such as effectors and small secreted proteins, has been the main focus (Beckerson et al. 2019; Lanver et al. 2018; Matei et al. 2018; Skibbe et al. 2010; Toh et al. 2017, 2018; Villajuana-Bonequi et al. 2019; Zuo et al. 2021), but studies also identified enzymes (Lanver et al. 2018; Taniguti et al. 2015; Toh et al. 2017, 2018), nutrient transporters (Doehlemann et al. 2008; Lanver et al. 2018; Taniguti et al. 2015; Toh et al. 2017, 2018) and transcription factors involved in infections (Horst et al. 2012; Lanver et al. 2018; Taniguti et al. 2015; Toh et al. 2017; Villajuana-Bonequi et al. 2019). Additionally, elevation of plant hormone levels triggered by fungal infection, as well as expression of fungal plant-like hormone genes have been identified (Li et al. 2021; Skibbe et al. 2010; Toh et al. 2017; Wang et al. 2017). A further important aspect has been the understanding of gene expression of different fungal strains and formae speciales on different host plants (Poloni and Schirawski 2016), as well as differential gene expression on susceptible and non-susceptible hosts in Ustilaginaceae (Que et al. 2014).

Because the research focus in transcriptomics on both smut fungal lineages is more similar than in genomics, comparative study approaches between the groups are easier to interpret. In examining the infection process at synchronized time points, differences and similarities between the two smut fungal lineages were observed. Differences are visible at early infection stages and could hint at different infection strategies within the two groups. While species in Ustilaginaceae express plant cell wall–degrading enzymes to enter host cells (Lanver et al. 2018; Taniguti et al. 2015), M. lychnidis-dioicae expresses a high number of secretory lipases (Perlin et al. 2015). A general similarity is found throughout the interaction, where both groups express effectors that manipulate host behaviour (e.g., Lanver et al. 2018; Toh et al. 2018; Zuo et al. 2021). Another similarity of possible importance between the two lineages is the expression of nitrogen-uptake related genes. Ustilago maydis expresses nitrogen transporter genes under biotrophic and nitrogen-depleted conditions (Lanver et al. 2018; Horst et al. 2012). Similarly, S. scitamineum upregulates genes coding for ammonium and nitrate transporters in planta (Taniguti et al. 2015). Microbotryum lychnidis-dioicae expresses nitrogen transporter genes already during mating under nitrogen-depleted conditions preceding the infection (Perlin et al. 2015). Interestingly, a study in U. maydis showed that the expression of such nutrient transporters is not related to the availability of nutrients but associated with the parasitic stage itself (Lanver et al. 2018). Questions such as the prerequisites of mating, which have been addressed in Microbotryum and S. scitamineum transcriptomic data (Yan et al. 2016; Yockteng et al. 2007), were studied in U. maydis using non-transcriptomic methods (e.g., Quadbeck-Seeger et al. 2000; Wahl et al. 2010b).

Often transcriptomic studies in these systems are conducted on host and parasite simultaneously. Most transcriptomic data are available from representatives of the Ustilaginaceae and their corresponding host plants, mainly from U. maydis (Lanver et al. 2018; Matei et al. 2018; Skibbe et al. 2010; Villajuana-Bonequi et al. 2019; Zuo et al. 2021), but also from U. esculenta (Wang et al. 2017), U. hordei (Donaldson et al. 2017), S. reilianum (Poloni and Schirawski 2016; Zuo et al. 2021) and S. scitamineum (Que et al. 2014; Taniguti et al. 2015; Yan et al. 2016). In Microbotryum, transcriptomic data has only been obtained for M. lychnidis-dioicae and its host Si. latifolia and M. intermedium (Carpentier et al. 2022; Perlin et al. 2015; Toh et al. 2018; Zemp et al. 2015). During host plant colonisation, tissue-specific expression patterns of the host and the alteration of plant hormone expression caused by the fungal infection may contribute to the characteristic phenotype of the smut syndrome. Infection by U. esculenta of Zizania latifolia results in the upregulation of a large number of plant genes, including genes involved in the synthesis of the plant hormone auxin (Wang et al. 2017). The infection leads to the formation of galls, which is likely mediated by the elevation of plant hormones, including gibberellic acid, abscisic acid and indole-3-acetic acid (Li et al. 2021), some of which have also been detected in galls induced by U. maydis (Morrison et al. 2015). Furthermore, the gene expression of the host Z. latifolia is altered upon U. esculenta infection (Wang et al. 2017), which is similar to the changes of gene expression in galls on adult maize leaves upon infection by U. maydis (Skibbe et al. 2010). In Zea mays, the infection with U. maydis leads to organ-specific plant gene expression in leaves and tassels (Skibbe et al. 2010). Whether the same set of genes in U. maydis is involved in gall formation of ear and tassel remains unknown (Ferris and Walbot 2020). Little is known about the alteration of plant hormone levels by Microbotryum, although gene prediction indicates that the genome of M. lychnidis-dioicae contains genes for enzymes that can use plant hormone precursors as a substrate (Perlin et al. 2015). However, functional observations about the modification of plant hormone levels and whether they play an important role in the interaction of Microbotryum with its hosts are missing. If so, an intriguing question is whether the two groups use similar molecular mechanisms or have evolved unique strategies for manipulating their host plants.

In some Microbotryum species, a fascinating phenomenon involving different plant tissues has been observed. Some of the host plants of the anther smut species in Microbotryum (e.g., M. lychnidis-dioicae) are dioecious, resulting in a large part of the host population (i.e., female plants) potentially being unavailable to the parasite. Overcoming this problem, the parasites have evolved mechanisms to induce male flowers in infected female plants. The infection of the anther smut M. lychnidis-dioicae leads to localised pools of alpha-linolenic acid in the meristem of the flower of its host and possibly influences the development of the inflorescence (Toh et al. 2017) by upregulating plant host genes involved in floral development (Toh et al. 2018). This is in line with changes of Si. latifolia gene expression in the inflorescence upon Microbotryum infection (Zemp et al. 2015). These transcriptomics results are substantiated by previous studies which detected inflorescence specific plant genes using Northern blotting (Ageez et al. 2005) and in situ hybridization with DIG-labelled RNA probes of histological samples (Kawamoto et al. 2019; Kazama et al. 2005). Female Si. latifolia plants are “defeminized” during the infection, as female-flower-associated genes are downregulated and male-flower-associated genes are upregulated, leading to the formation of anther-like, spore-bearing structures (Uchida et al. 2003; Toh et al. 2018; Zemp et al. 2015).

In contrast, most transcriptomic studies in U. maydis focus on gene expression during gall formation and in different gall types (Lanver et al. 2018; Matei et al. 2018; Skibbe et al. 2010; Villajuana-Bonequi et al. 2019), while the effect of the sex of the host on gene expression in different stages of the infection is a major focus in Microbotryum (Perlin et al. 2015; Toh et al. 2018; Zemp et al. 2015), although studying host specificity using transcriptomics and functional genetics has recently gained interest in this taxon (Beckerson et al. 2019). Studying transcriptomes from additional species of the Microbotryum genus could provide insight about the specificity of the transcriptomic changes evoked by the fungus, whether they are highly species-specific or shared by different Microbotryum species, or even shared with some species of Ustilaginaceae smuts.

Preliminary transcriptomic comparisons between Microbotryum and Ustilaginaceae indicate similar patterns governing the interaction process, but differences in detail. For instance, while species in both groups secrete cell wall–modifying enzymes, U. maydis expresses genes for cellulases whereas Microbotryum expresses genes for lipases and hemicellulases. Likewise, although an interaction between the parasite and host hormones is known (Ustilaginaceae) or predicted (Microbotryum), different mechanisms are involved in plant hormone homeostasis.

Genetic manipulation and functional genetics in smut fungi

Mechanistic understanding of processes involved in the parasitic and saprobic phases of smut fungi is highly dependent on successful genetic manipulation to generate strains for overexpressing genes of interest, carrying knock-out constructs, or substituting gene constructs. Using these molecular tools also gives the opportunity to contextualise gene function in the light of evolution. However, the available methodological toolboxes vary between smut fungi. Whereas U. maydis and some of its relatives have a well-equipped molecular tool repertoire, only limited technical possibilities exist for species in Microbotryum so far (Table 2). Advances in molecular methods for Ustilaginaceae are also fuelled by the biotechnological research community, which studies these species for the production of itaconate or heterologous proteins (Becker et al. 2020; Hussnaetter et al. 2021; Ullmann et al. 2021).

Table 2 Available transformation methods for smut fungi

Ustilago and Sporisorium species are mostly transformed using spheroplasts or protoplasts, in which cell walls are digested by enzymes such as Trichoderma lysing enzyme, glucanase, Glucanex®, Novozym®, Driselase™, Yatalase™ or a combination of these (Agisha et al. 2021; Bölker et al. 1995a, b; Eitzen et al. 2021; Ökmen et al. 2021; Plücker et al. 2021; Schulz et al. 1990; Schuster et al. 2016; Yu et al. 2015). The choice of fungal cell wall–degrading enzymes requires re-evaluation when transferring the methods to Microbotryum species, as the cell wall compositions of this lineage are different (Roeijmans et al. 1998). The chitinous cell walls of Ustilaginaceae are mainly composed of glucose, whereas cell walls of Microbotryum are dominated by mannose (Prillinger and Lopandic 2015). Redesigning the experimental setup and applying novel enzymes or individually adapted enzyme mixes might be vital to advance methods of protoplast formation in Microbotryum. As of now, only laborious Agrobacterium-based transformation has been applied successfully in Microbotryum (see Table 2; Toh et al. 2016).

Overexpression of genes of interest, gene replacement and gene-product tagging with fluorescent markers are standard, highly advanced techniques of functional genetics in Ustilago and Sporisorium species (Table 2). In Microbotryum, procedures are still at a very early stage, comprising only the overexpression of resistance markers and of fluorescent proteins for reporter strain generation, without successful protein labelling. The number of species that have been transformed successfully reflects this imbalance in the advancement of molecular methods between the different groups of smut fungi. As of today, the only Microbotryum species successfully transformed is M. lychnidis-dioicae, while there are protocols for at least eight Ustilaginaceae and one additional species from Glomosporiaceae in the Urocystidales (Table 2).

With the rise of the CRISPR-Cas technology in the field of molecular genetics, methods utilising it were developed in fungi as well. In many pathogenic fungi, CRISPR-plasmid based transformation has been applied successfully utilising established transformation methods (reviewed by Schuster and Kahmann 2019). CRISPR plasmid-based transformation has proven to be an efficient molecular tool in U. maydis (Schuster et al. 2016; Wege et al. 2021), U. trichophora (Huck et al. 2019) and S. scitamineum (Lu et al. 2017). In Microbotryum CRISPR-Cas-based gene deletion, application of fluorophore-fusion constructs, or methods for targeted gene replacement via homologous recombination are not established so far.

Homologous recombination (HR, reviewed by van den Bosch et al. 2002) and non-homologous end joining (NHEJ, reviewed by Stinson and Loparo 2021) are independent and competing mechanisms for the repair of DNA double-strand breaks in eukaryotes. Briefly, when double-strand breaks occur, the heterodimer Ku70p/Ku80p binds to the ends of the DNA double strands and mediates recombination via the NHEJ pathway, which impedes homologous recombination events. It is assumed that the dominance of NHEJ over HR is key to low homologous recombination rates in fungi (reviewed by Kück and Hoff 2010). In accordance with these findings, it has been shown for Ascomycota and Basidiomycota that malfunction of NHEJ via deletion or disruption of ku70 and/or ku80 leads to an increase in homologous recombination events (e.g., Choquer et al. 2008; Koh et al. 2014; Pöggeler and Kück 2006). Unfortunately, disadvantageous side effects of DNA repair-deficient ku70 and/or ku80 mutant strains are numerous. Growth reduction and temperature sensitivity have been observed in ku70-deficient strains of Penicillium digitatum (Gandía et al. 2016) and S. cerevisiae (Gravel and Wellinger 2002), whereas ku70 deletion strains of P. chrysogenum are more sensitive to osmotic stress compared to the parental strain (Hoff et al. 2010). Further, telomere defects have been observed in ku70 deletion mutants of S. cerevisiae (Gravel and Wellinger 2002) and the smut fungus U. maydis (de Sena-Tomas et al. 2015). Therefore, gene deletion based on Cas-ribonucleoproteins, which has proven to be a robust and versatile molecular tool in recalcitrant organisms like Chlamydomonas reinhardtii (Ferenczi et al. 2017) and has been transferred to filamentous and pathogenic fungi (Leisen et al. 2020; Pohl et al. 2018; Zou et al. 2021) as well as pathogenic yeasts (Grahl et al. 2017; Wang 2018), might be a promising approach to generate Microbotryum cloning strains with dysfunctional non-homologous end joining and high applicability.

Due to the history of functional genetic research in Ustilaginaceae and Microbotryum, the best available technology for a mechanistic understanding of selected genes varies between both organismal groups. In our opinion, methods of functional genetics in Ustilago provide a blueprint for future development in Microbotryum, though recognising its particular characteristics is vital for a successful method transfer. Reaching a comparable level of molecular tools in both smut lineages would allow assessing putatively different molecular mechanisms which result in the highly similar macroscopic disease symptoms.

Database representation and sequence analysis tools

Function-oriented databases are important tools for similarity-based gene annotation. Because functional annotation is cumbersome and well-curated information is not widely available, these databases naturally focus on a few model species, making functional annotation of new genomes challenging with decreasing similarity and relation to the model species. For Ustilaginaceae, comparisons are achievable, as the major databases like the Kyoto Encyclopaedia of Genes and Genomes (KEGG; Kanehisa et al. 2021), Gene Ontology (GO; Ashburner et al. 2000; Gene Ontology Consortium 2021), or the eggNOG database (Huerta-Cepas et al. 2019) contain data from the model organism U. maydis. The Carbohydrate Active Enzymes database (CAZy database, http://www.cazy.org/; Drula et al. 2022) additionally contains data from U. bromivora, S. reilianum, S. scitamineum and Melanopsichium pennsylvanicum. Functional studies in Microbotryum are scarce and as genome studies are mostly of evolutionary interest, the genus lacks functional annotation that could be incorporated into these databases. InterPro (Blum et al. 2021), Pfam (Mistry et al. 2021) and similar databases incorporate many more species in Ustilaginaceae and Microbotryum alike, as these databases provide annotation based on domain detection rather than few orthologous genes from selected species. Unfortunately, the functional description in these databases is less curated and the available information can differ between entries.

To illustrate the possibilities and drawbacks of database annotation approaches under these very different circumstances, we conducted an eggNOG mapper analysis (Huerta-Cepas et al. 2017) on predicted genes of selected high-quality genomes (Table 1). The eggNOG mapper uses multiple similarity-based methods to assign a seed orthologue from the database to the input gene. A smaller proportion of Microbotryum genes (43.3–65.5%) than Ustilaginaceae genes (90.0–97.9%) was assigned a seed orthologue (Table 1). Seed orthologues assigned to Ustilaginaceae genes were almost exclusively from the subdivision Ustilaginomycotina (Fig. 2A). Additionally, we found a few seed orthologues from Basidiomycota and higher taxa such as Fungi and Eukaryotes assigned to the Ustilaginaceae genes. For the Microbotryum species annotation, the majority of seed orthologues was taken from the subdivision Pucciniomycotina that Microbotryum belongs to, but the proportion was smaller (64.9–78.6%) than the representation of Ustilaginaceae by their superordinate subdivision Ustilaginomycotina (92.5–98.6%). A small proportion of seed orthologues of Microbotryum was represented by Ustilaginomycotina (1.07–1.43%). Interestingly, the Microbotryum genes were much more often annotated with the help of seed orthologues from other taxa that the group does not belong to, such as Agaricomycotina (4.65–7.42%) and even animals (0.41–2.54%), plants (0.63–7.88%), or bacteria (0.29–0.5%). This occurred very rarely in annotations for Ustilaginaceae (Agaricomycotina 0–0.05%, animals 0–0.1% and bacteria 0–0.11%). Most likely, this is i) attributed to a better representation of Ustilaginaceae in the eggNOG database (represented by U. maydis covering 8520 orthogroups; Pucciniomycotina represented by Puccinia triticina and P. graminis covering 5293 orthogroups, accessed August 2022) and ii) an indication of different genome composition of Microbotryum compared to other Pucciniomycotina, but also Ustilaginomycotina.
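The two kinds of percentages reported above (share of predicted genes that received a seed orthologue, and share of seed orthologues per taxonomic group) can be computed with a simple tally. The following is a minimal sketch in Python rather than the analysis actually used here; the function name, the input layout (one taxon label per annotated gene) and the toy values are assumptions for illustration and are not data from Table 1 or Fig. 2A.

```python
from collections import Counter


def seed_orthologue_summary(n_predicted_genes, seed_taxa):
    """Summarise annotation results for one genome.

    n_predicted_genes: total number of predicted genes submitted.
    seed_taxa: one taxonomic label per gene that received a seed
               orthologue (hypothetical input layout; genes without
               a hit are simply absent from this list).
    Returns (percentage of genes with a seed orthologue,
             {taxon: percentage of the assigned seed orthologues}).
    """
    if n_predicted_genes <= 0:
        raise ValueError("n_predicted_genes must be positive")
    n_assigned = len(seed_taxa)
    if n_assigned > n_predicted_genes:
        raise ValueError("more seed orthologues than predicted genes")
    assigned_pct = 100.0 * n_assigned / n_predicted_genes
    counts = Counter(seed_taxa)
    taxon_pct = {}
    if n_assigned:
        taxon_pct = {t: 100.0 * c / n_assigned for t, c in counts.items()}
    return assigned_pct, taxon_pct


# Toy input for illustration only.
toy_taxa = (["Pucciniomycotina"] * 7
            + ["Agaricomycotina"] * 2
            + ["Ustilaginomycotina"])
assigned, by_taxon = seed_orthologue_summary(20, toy_taxa)
print(f"{assigned:.1f}% of genes received a seed orthologue")
for taxon, pct in sorted(by_taxon.items(), key=lambda kv: -kv[1]):
    print(f"{taxon}: {pct:.1f}%")
```

Note that the per-taxon percentages use the number of assigned seed orthologues as the denominator, not the total number of predicted genes, which matches how the two sets of ranges are reported in the text.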

Fig. 2

Annotation of predicted genes in the eggNOG database. A Percentage of seed orthologues from different taxonomic groups used for the functional annotation of the predicted genes. Predicted genes from Microbotryum were mostly annotated using seed orthologues from Pucciniomycotina, but more than 25% of the annotated genes were annotated with the help of data from other taxa. In Ustilaginaceae, genes were almost completely annotated with data from Ustilaginomycotina. B Percentage of seed orthologues that could be annotated using the respective databases. In COG (Galperin et al. 2015, 2019) and Pfam (Mistry et al. 2021), almost all seed orthologues were annotated. More specialised databases, such as CAZy (Drula et al. 2022) or BiGGR (Gavai et al. 2015), supplied annotations for only a small fraction of the genes. Functional annotations were performed using the online version of eggNOG mapper (Huerta-Cepas et al. 2017, 2019, accessed August 2022) with default settings. Plots were created in R 4.0.2 (R Core Team 2021) using the ggplot2 package (Wickham 2016).

There were no notable differences in the annotation coverage of the seed orthologues between the two groups. The COG database (Galperin et al. 2015, 2019) showed the highest annotation success for seed orthologues (up to 95%, Fig. 2B), which could be due to the broader functional groups proposed in this database or because genes can be represented in the COG database without an annotated function. The COG categories “Replication, recombination and repair” (Wilcoxon p = 0.01, 6.2–21.8% Microbotryum, 3.0–12.3% Ustilaginaceae) and “Cell cycle control, cell division, chromosome partitioning” (Wilcoxon p = 0.001, 2.71–5.36% Microbotryum, 2.05–2.35% Ustilaginaceae) were overrepresented in Microbotryum seed orthologues. While many COG categories showed statistically significant differences between the groups, these differences were small and could be influenced by the lack of seed orthologues found for Microbotryum. The number of seed orthologues without a COG category or with “Function unknown” assigned was, however, higher in Ustilaginaceae. This is likely attributable to U. maydis genes being represented in the database independently of an assigned function. For other taxa, genes that remain functionally unassigned might not be represented in the databases at all, which would allow annotations of Microbotryum genes with more distantly related, but functionally annotated genes. The Pfam database covered a similar number of seed orthologues as COG, providing an annotation for up to 94% of seed orthologues. Other broad databases, such as GO, BRITE, KEGG KO and KEGG Pathway (Kanehisa et al. 2021), were able to provide an annotation for 32–64% of seed orthologues, while highly specialised databases such as BiGGR (metabolic annotations, Gavai et al. 2015), TC (membrane transporter database, Saier et al. 2021, via KEGG_TC) and CAZy (CAZyme database, Drula et al. 2022) annotated only a small fraction of genes for the two groups (Fig. 2B).
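The group comparisons above rest on a Wilcoxon test of per-genome category percentages. The text does not state which variant was applied; the sketch below assumes the unpaired rank-sum (Mann-Whitney) form, uses a normal approximation without tie or continuity correction, and is written in Python purely for illustration. The function name and the toy percentages are hypothetical and are not the values underlying the reported p-values.

```python
import math


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank-sum (Mann-Whitney U) test.

    Uses the normal approximation; tied values receive average ranks,
    and no tie or continuity correction is applied.
    Returns (U statistic for sample x, two-sided p-value).
    """
    n1, n2 = len(x), len(y)
    if n1 == 0 or n2 == 0:
        raise ValueError("both samples must be non-empty")
    # Pool the values and remember which sample each came from (0 = x, 1 = y).
    pooled = sorted((v, g) for g, sample in enumerate((x, y)) for v in sample)
    rank_sum_x = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # mean of the 1-based ranks i+1 .. j
        rank_sum_x += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    u = rank_sum_x - n1 * (n1 + 1) / 2.0
    mean_u = n1 * n2 / 2.0
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (u - mean_u) / sd_u
    p = math.erfc(abs(z) / math.sqrt(2.0))  # equals 2 * (1 - Phi(|z|))
    return u, p


# Toy percentages of one COG category per genome, for illustration only.
group_a = [6.2, 10.0, 21.8, 9.5]
group_b = [3.0, 5.0, 4.0, 5.5]
u_stat, p_value = rank_sum_test(group_a, group_b)
print(f"U = {u_stat}, p = {p_value:.3f}")
```

With only a handful of genomes per group, an exact test would be preferable to the normal approximation; the approximation is used here only to keep the sketch short.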

The annotations that can be achieved by database-based approaches such as eggNOG are likely more reliable for Ustilaginaceae, as data are available from a closely related species, which is often not the case for Microbotryum. This makes it difficult to compare these two groups in terms of gene functions, and functional annotation of a Microbotryum species is needed to improve the annotation of further closely related species. Microbotryum lychnidis-dioicae is most likely best suited, as it is already established as a model for molecular host–parasite interaction studies (Beckerson et al. 2019; Toh et al. 2017, 2018; Perlin et al. 2015; Toh and Perlin 2016). In Ustilaginaceae, a much more species-rich group than Microbotryaceae (Vánky 2011[2012]), high-quality genomes from the diverse members of the whole family that are annotated independently from U. maydis are needed to identify new species-specific functions. Such annotations could additionally close gaps in existing gene function predictions and improve the annotation quality by proposing possible alternative functions for genes. In both groups, better functional annotation will allow testing for the involvement of similar or dissimilar physiological and molecular processes in convergent traits between the two groups.

Species delimitation, host specificity and population structure

A key aspect of systematics and taxonomy is the accurate delimitation and identification of species. Delimitation and identification of species are of especially high importance in smut fungi, as some of them (e.g., S. scitamineum, Tilletia horrida and T. indica) are agriculturally important plant pathogens (Carris et al. 2006; Magarey et al. 2010). Additionally, several species are model organisms in different biological disciplines, in which correct taxonomic information improves the interpretation of results. However, smut fungi show limited amounts of distinct morphological traits, which makes differentiation between closely related species difficult. Host association has therefore been used as a trait to differentiate parasite species. Most current smut species descriptions, irrespective of systematic position of the species, are still based on a combination of morphological traits and host association (Begerow et al. 2004; Vánky 2011[2012]). This modus operandi of species description has resulted in most smut fungal species in Ustilaginomycotina (including Ustilaginomycetes and Exobasidiomycetes) and Microbotryomycetes being described as highly host specific (Fig. 3). Molecular approaches have often confirmed that most smut fungal species are restricted to one or a few host species (e.g., Denchev et al. 2021; Le Gac et al. 2007; Lutz et al. 2005, 2008; Kruse et al. 2018; Ziegler et al. 2018).

Fig. 3

Host specificity of smut fungi within the Ustilaginomycetes, Exobasidiomycetes and Microbotryomycetes, respectively. In the main graph, only host associations up to 25 host species per smut fungal species are shown, whereas the inlay shows all associations, but not separated by smut fungal class. Species associations were taken from Vánky (2011[2012]) and plotted using the ggplot2 package (Wickham 2016) within R 3.6.1 (R Core Team 2021)

Some smut fungal lineages are supposed to have low host specificity, but molecular phylogenetics has shown this for only a few species so far (e.g., Bruns et al. 2021; Kemler et al. 2013; Petit et al. 2017). The Dianthus–Microbotryum system has been especially well studied in that respect. Although several Microbotryum species occur on species of Dianthus, these do not correlate very well with host phylogeny and several parasite species infect more than one host species (Petit et al. 2017; Kemler et al. 2013; Le Gac et al. 2007). Vice versa, Dianthus host species can harbour more than one parasite species, even within the same geographic location (Petit et al. 2017). Host species with several parasite species or lineages lead to the interesting phenomenon of multiple infections in the same host plant and this contact likely leads to more frequent hybridisation events. In Dianthus parasites, however, the offspring of such hybridisations is mostly sterile (Petit et al. 2017). It is so far unknown whether potential introgression influences host specificity. Interestingly, coexistence of different Microbotryum species on Silene hosts did not lead to such outcomes, as in the case of M. violaceo-irregulare and M. lagerheimii on Si. vulgaris and M. lagerheimii and M. silenes-inflatae on Si. uniflora (Abbate et al. 2018).

Studies of host specificity on a phylogenetic level focus on the evolutionary outcome of host–parasite interactions that occur on the timescale of ecological processes. These interactions are dynamic and result in different outcomes between separate populations (Thompson 2005). Host specificity of smut fungi therefore might be variable in potentially connected meta-populations, but population divergence and host specificity have not been studied together. There have, however, been several population genetic studies in the two parasite groups. Strong genetic co-structure between populations of M. lychnidis-dioicae and populations of its host Silene latifolia in Europe has been observed, with genetic clusters corresponding to recolonization events from glacial refugia (Feurtey et al. 2016; Gladieux et al. 2013; Taylor and Keller 2007; Vercken et al. 2010). Thereby, higher genetic diversity occurs in the pathogen within a host cluster and greater resistance occurs in the host to its endemic pathogen (Feurtey et al. 2016). Gene presence-absence polymorphisms in European M. silenes-dioicae populations on the host Si. dioica have shown a European East–West gradient (Hartmann et al. 2019). The host plant Si. dioica similarly shows a European East–West gradient (Rautenberg et al. 2010), which could in part reflect the adaptation of M. silenes-dioicae to resistance genes of its host (Hartmann et al. 2019). However, this remains to be tested by functional analysis.

In the Ustilaginaceae, studied species also showed higher genetic diversity within populations than between populations, indicating little gene flow between populations. In U. hordei, U. maydis and U. tritici the few observed migration events were most likely driven by human activity (Jiménez-Becerril et al. 2018; Kashyap et al. 2019, 2020; Valverde et al. 2000). Host tracking has been demonstrated in U. maydis, in which diversification occurred after the domestication of the host in southern Mexico (Munkacsi et al. 2008). Sporisorium scitamineum also followed its host after it was cultivated in new environments, but only a single colonization event happened in the Americas and Africa from the Asian centre of origin (Braithwaite et al. 2004; Raboin et al. 2007). All the strains found in America and Africa belonged to the same lineage, which was nested within a clade of Asian strains. This lineage also showed very low genetic diversity compared to the strains of Asian origin. In contrast to the Ustilaginaceae, little is known about the influence of agriculture and other human activities on the population structure of Microbotryum species. The comparison of M. lychnidis-dioicae and Si. latifolia native populations in Europe and introduced populations in the US has shown that, while the host plant was introduced multiple times from genetically different populations, the fungal pathogen seems to have been introduced twice from the UK (Antonovics et al. 2003; Fontaine et al. 2013; Gladieux et al. 2015; Keller et al. 2009, 2012).

Phylogenetic and population studies in both groups have increasingly used molecular markers. Phylogenetics and systematics are mainly based on single-marker or multigene datasets. The most common markers amongst these are the rDNA regions of the internal transcribed spacer, alone or in combination with other markers such as 26S (e.g., Kemler et al. 2013; Lutz et al. 2005; Stoll et al. 2005; Kruse et al. 2017; Ziegler et al. 2018). Although these markers are still the most common regions for species delimitation and systematics in smut fungi, additional markers have been used, including elongation factor 1 alpha, beta-tubulin and gamma-tubulin (e.g., Freeman et al. 2002; Le Gac et al. 2007; McTaggart et al. 2012). Single or multiple DNA regions are also used to understand the population patterns in both groups (e.g., Le Gac et al. 2007; Kashyap et al. 2019; Kemler et al. 2013); however, microsatellite markers are used most often (e.g., Abbate et al. 2018; Büker et al. 2016; Kashyap et al. 2019, 2020; Raboin et al. 2007; Vercken et al. 2010). Advances in DNA sequencing technologies have also resulted in the use of whole-genome data for phylogenetic inference (e.g., Duhamel et al. 2022a), as well as for population studies (e.g., Gladieux et al. 2013; Hartmann et al. 2019).

Host specificity has been a central topic in parasite-host research and remains so. In smut fungi, host specificity has historically been studied at the species level, but the introduction of molecular tools in particular has led to an increase in specificity studies at the population level. However, this approach has mostly been applied to several Microbotryum species, and such work would be important for populations of Ustilaginaceae species as well. As in Microbotryum, studying populations of Ustilaginaceae on non-cultivated hosts (e.g., Andropogon, Arrhenatherum, Festuca, Holcus, Poa and Themeda) could give hints on the genetic differentiation of natural populations and their natural history. The studies in Microbotryum on Dianthus have provided some interesting results on the evolution and significance of low host specificity. Testing for such patterns in Ustilaginaceae could provide crucial results in order to understand their host tracking of agriculturally important plants. Additionally, as the Ustilaginaceae systems are more amenable to molecular tools, they could provide the means to understand molecular mechanisms in natural parasite populations. Naturally, the development of genetic tools for Microbotryum will also be an important step in further understanding the molecular mechanisms of host specificity.

Life cycle

Saprotrophic growth

The saprotrophic stage in the life cycles of both taxa starts with single, haploid basidiospores budding off the basidial cells. These cells proliferate by budding and are referred to as sporidia or yeasts. Yeasts occur in different shapes from round to elongated and in culture have been observed as pseudohyphae (e.g., Sipiczki 2020). Asexual growth occurs on host plant surfaces, but yeasts of smuts and closely related taxa have also been isolated from a wide variety of substrates that are not host-related, including water, soil, leaf surfaces and flowers of non-hosts (e.g., Golonka and Vilgalys 2013; Inácio et al. 2008; Liou et al. 2009; Statzell-Tallman et al. 2010). Yeasts of M. lychnidis-dioicae quickly proliferate in the flowers of their host as long as nectar is present for nourishment (Schäfer et al. 2010). Smut fungi might retain this additional proliferation stage, as a higher density of yeasts on the host plant is associated with a higher success rate in mating and subsequent infection (Kaltz and Shykoff 1999).

Yeasts of many Ustilaginaceae and Microbotryum species grow well on a broad range of artificial media. Growth assays under laboratory conditions have been performed for several species to assess their capability of metabolising different carbon sources (Kurtzman et al. 2011). These are complemented by more recent studies that annotated carbohydrate-active enzymes (CAZymes), observed their expression during the life cycle (e.g., Berlemont 2017; Lanver et al. 2018; Perlin et al. 2015), or focussed on the biotechnological potential of Ustilaginaceae (e.g., Geiser et al. 2013, 2016; Stoffels et al. 2020). As CAZymes are often upregulated in the parasitic stage (Lanver et al. 2018), most of these studies focus their conclusions on the link between carbohydrate metabolism and parasitism rather than saprotrophic growth.

The number of CAZymes in purely saprotrophic fungi and necrotrophic pathogens is usually much higher than what has been reported for smut fungi (Berlemont 2017; Perlin et al. 2015; Zhao et al. 2014). In particular, the number of plant cell wall–degrading enzymes is reduced in smuts (Fig. 4A), which most likely reduces the chances of triggering plant defences by the degradation of plant cell walls (Kämper et al. 2006; Mendgen and Hahn 2002). As the enzyme repertoire is small, saprotrophic yeasts are also limited in their utilisation of carbohydrate resources. It is likely that the yeasts mainly consume small sugar molecules, as smuts in the parasitic stage manipulate the host to accumulate soluble sugars, including hexose, sucrose and fructose (Doehlemann et al. 2008; Horst et al. 2010a, b; Sosso et al. 2019). Microbotryum species have been shown to proliferate on plant nectar (Schäfer et al. 2010), and interestingly also on melezitose (Deml and Oberwinkler 1983), a sugar produced by insects. The ability to express CAZymes for different plant metabolites could also explain sporidia being found in non-host habitats (Boekhout 2011).

Fig. 4

CAZymes in Microbotryum and Ustilaginaceae. A Number of different CAZyme families per superfamily in each species. Ustilaginaceae show a greater CAZyme diversity given the higher number of CAZyme families in their smaller genomes. Microbotryum species have a higher number of glycosyl transferases in comparison to Ustilaginaceae, but show a lack of glycosyl hydrolases. B Number of CAZymes in families shared by all species, unspecific (i.e., CAZyme families not detected in all genomes but neither specific for Microbotryum nor Ustilaginaceae), specific to Microbotryum, specific to Ustilaginaceae and species-specific families. Ustilaginaceae genomes show a higher number of different CAZyme families in comparison to Microbotryum. C Ustilaginaceae show a significantly higher diversity of CAZyme families than Microbotryum (Wilcoxon p < 0.01). CAZyme prediction was performed using the dbCAN2 standalone version (Zhang et al. 2018). The results were aggregated according to Zhang et al. (2018): CAZymes had to be annotated by at least two of the three databases included in dbCAN2. If an annotation was available from Hotpep (Busk et al. 2017), it was used in the aggregation; if this annotation was not available, the result from the dbCAN database was adopted. Plots were created in R 4.0.2 (R Core Team 2021) using the ggplot2 package (Wickham 2016).
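The aggregation rule described in the caption (keep a gene only if at least two of the three tools report a hit, prefer the Hotpep family, otherwise fall back to the dbCAN result) can be written down compactly. The following Python sketch is an illustration of that rule as worded here, not the script used for Fig. 4; the function and argument names and the toy records are hypothetical, and the layout of real dbCAN2 output is not modelled.

```python
from collections import Counter


def aggregate_cazyme_call(hotpep=None, dbcan_hmm=None, diamond=None):
    """Return the CAZyme family kept for one gene, or None if discarded.

    Each argument is the family reported by one tool (None = no hit).
    Argument names are illustrative assumptions.
    """
    calls = [hotpep, dbcan_hmm, diamond]
    n_hits = sum(c is not None for c in calls)
    if n_hits < 2:
        return None  # fewer than two tools agree that this is a CAZyme
    if hotpep is not None:
        return hotpep  # the Hotpep annotation takes priority when present
    return dbcan_hmm  # otherwise adopt the dbCAN result


# Toy per-gene tool results, for illustration only.
toy_genes = {
    "gene1": {"hotpep": "GH16", "dbcan_hmm": "GH16", "diamond": "GH16"},
    "gene2": {"hotpep": None, "dbcan_hmm": "GT2", "diamond": "GT2"},
    "gene3": {"hotpep": "CE4", "dbcan_hmm": None, "diamond": None},
}
kept = {g: aggregate_cazyme_call(**hits) for g, hits in toy_genes.items()}
family_counts = Counter(f for f in kept.values() if f is not None)
print(kept)
print(family_counts)
```

When Hotpep reports no hit but two tools do, both remaining tools must have reported one, so the fallback value is never empty.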

Microbotryum and Ustilaginaceae smuts use different CAZyme repertoires (Fig. 4A, see also “Infection and parasitic stage”). Ustilaginaceae genomes contain more genes that encode glycosyl hydrolases than Microbotryum. Glycosyl hydrolases contain the main enzymes for plant cell wall degradation in plant-pathogenic fungi (Kubicek et al. 2014) and the higher number of glycosyl hydrolases potentially allows the yeasts to metabolise different carbohydrate sources, allowing them to grow on complex substrates. Yeasts of U. maydis, for instance, grow on artificial media that contain Zea mays tissue as the sole carbon source (Cano-Canchola et al. 2000). Genomic predictions also indicated that, although both smut groups can metabolise the same substrate, the specific molecular pathways are dissimilar. The breakdown of pectin is an example of this. A pectin lyase is utilised by U. maydis to degrade pectin, whereas M. lychnidis-dioicae employs a glycosyl hydrolase pathway known from related rust fungi (Lanver et al. 2018; Perlin et al. 2015).

Carbohydrates also compose the fungal cell walls. Many studies have been able to distinguish different cell wall types in yeast cultures that associate cell wall composition of species with their phylogeny (summarised in Roeijmans et al. 1998). These cell wall types reveal that, while the fungi have very similar lifestyles, Microbotryum and Ustilaginaceae cell walls are composed differently. The Microbotryum and Ustilago types are devoid of xylose, and while the Microbotryum type is defined as mannose dominant with presence of glucose, occasional presence of fucose and rhamnose, the Ustilago type is dominated by glucose with occasional presence of galactose (Roeijmans et al. 1998). During their life cycle, all smuts need to constantly remodel and rebuild their cell walls due to morphological switching and growth. This requires glycosyl transferases, which are highly represented in smuts, as well as other CAZymes to build cell wall carbohydrates (Bowman and Free 2006; Langner et al. 2015; Perlin et al. 2015; Zahiri et al. 2005; Fig. 4A). Such a morphological switch from the yeast stage is initiated during mating. This marks the end of the saprotrophic stage and can be triggered by a lack of nutrients on the substrate and the presence of a compatible mating partner (Schäfer et al. 2010; Snetselaar and Mims 1992).

Studies on the saprotrophic stage are often limited to cultivation screenings of artificial media (e.g., Deml and Oberwinkler 1982; Kurtzman et al. 2011). Availability of different carbon sources influences the growth behaviour of different smut fungal lineages and potentially also infection behaviour (Kijpornyongpan and Aime 2020). Transcriptomics should now be used to understand how much these artificial environments reflect natural conditions of saprobic yeasts growing on (host) plant surfaces, by studying the CAZyme repertoires expressed during this stage. This could also help address the question of yeasts outside of the host environment, as Microbotryum cells often engage in selfing and do not appear to elongate the saprotrophic stage, in contrast to some Ustilaginaceae species, which are more often isolated even in habitats far away from potential hosts (Boekhout 2011).

Mating

Under natural conditions mating is initiated when two sporidia of compatible mating type come into close contact on a suitable host plant. Lack of nutrients, the presence of specific plant compounds, or haptic signals of the plant surface are known to initiate mating in the presence of a suitable partner (Castle 1984; Perlin et al. 2015; Ruddat et al. 1991; Schäfer et al. 2010; Snetselaar and Mims 1992). Ustilago maydis shows directed growth of the conjugation hyphae towards the mating partner and this has been hypothesised also to occur in Microbotryum (Snetselaar et al. 1996; Schäfer et al. 2010). Thereby, U. maydis is able to initiate conjugation over rather large distances (Snetselaar and Mims 1992), while sporidia of Microbotryum need to be in close vicinity with each other (Schäfer et al. 2010). Once a hyphal connection is established between the two mating cells, the fungus enters a dikaryotic stage. In U. maydis, the infectious hypha either forms at the connection of the conjugation hyphae, or the infectious hypha grows out of one of the mating cells (Snetselaar and Mims 1993). In Microbotryum only the latter behaviour has been observed (Schäfer et al. 2010), indicating that one of the nuclei migrates through the conjugation connection into the other mating cell.

In smut fungi, as in other Basidiomycota, sexual reproduction is regulated by two mating loci, the a- and b-locus (reviewed by Coelho et al. 2017). These loci act as two different compatibility checkpoints. The mating partners require different alleles at both loci to successfully establish a dikaryotic stage and infect their host (Bakkeren et al. 2008). The a-locus is composed of one G-protein coupled pheromone receptor coding gene (PRA) and at least one pheromone coding gene (MF). Conjugation is initiated between mating partners secreting pheromones that activate the pheromone receptor of the compatible mating partner, but not their own (Urban et al. 1996a). In U. maydis the subsequent molecular cascades are well studied, and similar mechanisms are assumed for Microbotryum. In U. maydis the activation of the pheromone receptor triggers a cAMP-dependent protein kinase A (PKA) and a mitogen-activated protein kinase (MAPK) pathway, resulting in transcriptional regulation of mating-relevant processes, such as the outgrowth of conjugation hyphae and the arrest of the cell cycle to synchronise the nuclei of the future dikaryon (reviewed by Vollmeister et al. 2012). Surprisingly, the a-locus in Ustilaginaceae can be tri-allelic and such species contain three different pheromone receptor types and pheromones (Kellner et al. 2011; Schirawski et al. 2005). The genomes of tri-allelic species encode two pheromones and one pheromone receptor that is responsive to the respective pheromones of the two other mating types. The tri-allelic a-locus has been inferred as the ancestral state in Ustilaginaceae, as some bi-allelic species carry one functional pheromone gene and a pseudogenized pheromone-coding gene at the a-locus (Urban et al. 1996b). Phylogenetic studies of the pseudogene concluded that it could have been compatible with an extinct third mating type (Urban et al. 1996b). In Microbotryum, the a-locus is bi-allelic (Devier et al. 2009), but, unlike in Ustilaginaceae, individuals can carry the same pheromone gene in tandem repeat copies (Xu et al. 2016). This might enhance the pheromone-receptor response, as more pheromones are being produced to activate the receptor of the partner (Xu et al. 2016).

After mating has been initiated, the b-locus serves as a second compatibility checkpoint between the mating partners (Bakkeren et al. 2008; Coelho et al. 2017). It contains two homeodomain transcription factor subunit coding genes (HD). In U. maydis, their products form heterodimers in the heterozygote dikaryon, which act as transcription factors for genes maintaining the dikaryotic stage. Additionally, they induce biotrophic filamentous growth (Coelho et al. 2017; Kämper et al. 1995). In Ustilaginaceae, the b-locus can be multi-allelic and in U. maydis at least 18 different alleles are known (Puhalla 1970). In Microbotryum, only bi-allelic species are known (Devier et al. 2009). Bi-allelicity of the b-locus is frequent in inbreeding species, while multiple allelism is perceived as a result of outcrossing (Coelho et al. 2017).

The a and b mating-type loci are located on specific chromosomes called mating-type chromosomes, while the other chromosomes are defined as autosomes. The two mating-type loci can be located on two distinct mating-type chromosomes, which is the ancestral state in Basidiomycota (Coelho et al. 2017; Maia et al. 2015). This has been observed in several smut fungal species, independent of their systematic position, such as U. maydis, S. reilianum and M. intermedium (Branco et al. 2017; Kahmann and Kämper 2004; Schirawski et al. 2005). In these cases, meiosis produces four daughter cells of four different mating types. This system is referred to as tetrapolar. Other species in both groups have a single mating-type chromosome that carries the a and b mating-type loci. In this case, meiosis produces two mating types, and the system is referred to as bipolar. In Microbotryum, transitions from tetrapolarity to bipolarity occurred multiple times independently, with at least nine independent events of mating-type locus linkage and four events of linkage of the mating-type loci to their respective centromeres (Badouin et al. 2015; Branco et al. 2017, 2018; Carpentier et al. 2019, 2022; Duhamel et al. 2022a). This latter case is referred to as pseudo-bipolarity because it results in the production of two mating types as in the bipolar system (Carpentier et al. 2019). In Ustilaginaceae, it is known that most species are tetrapolar and a few independent transitions to bipolarity occurred throughout evolution (Bakkeren and Kronstad 1994; Lee et al. 1999; Liang et al. 2019; Rabe et al. 2016; Sun et al. 2019). Although there are convergent transitions towards bipolarity in the two groups, whether similar scenarios of chromosomal rearrangements are involved in the two groups remains unknown. It is, however, known that in both groups, comparable to the sex chromosomes in plants and animals (Kortschak et al. 2009; Marais et al. 2008), the linkage of the a and b mating-type loci has been accompanied by the accumulation of transposable elements in the non-recombining regions (Bakkeren et al. 2006; Branco et al. 2018; Duhamel et al. 2022b).

Although the presence of two mating-type genes (a and b) is ancestral to the two groups and even to Basidiomycota, the number of alleles as well as the organisation of the mating-type chromosomes evolved independently. Such an example of convergence of mating-type locus linkage across Basidiomycota, occurring multiple times in closely related but also more distant species, shows that strong selective pressures can lead to similar phenotypes. The mating-type locus linkage is beneficial in species with only two alleles at each locus because it increases odds of compatibility under selfing, the predominant reproduction mode in Microbotryum (Giraud et al. 2008) and some Ustilaginaceae (e.g., U. bromivora; Rabe et al. 2016). In bipolar species meiosis generates four cells with two different mating types and there will always be two compatible partners from the same basidium. As mating is essential for successful colonisation of a host, the bipolar mating system increases the chance of infection even in cases where only a single teliospore ends up on a host. To date, four different scenarios of chromosomal fission and fusions have been found in Microbotryum, some being shared by several species (Branco et al. 2017, 2018; Carpentier et al. 2022; Duhamel et al. 2022a). These chromosomal rearrangements only involved the two mating-type chromosomes and no autosome, with fissions and fusions occurring at the centromeres of the mating-type chromosomes. More detailed comparisons of the chromosomal rearrangements which have formed the bipolar mating-type chromosomes in the two groups are desirable, as they can improve our understanding of factors governing the dynamics of chromosomal rearrangements in general.
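The benefit of mating-type locus linkage under selfing can be made concrete by enumerating the pairings among the four products of a single meiosis. The Python sketch below is a simplified illustration, assuming two alleles per locus, the rule that partners must differ at both the a- and the b-locus, and, for the tetrapolar case, a tetrad with four different mating types as described above; other tetrad configurations and pseudo-bipolarity are not modelled, and all names are hypothetical.

```python
from itertools import combinations


def compatible(cell1, cell2):
    """Two haploid cells can mate only if they differ at both loci.

    Each cell is an (a-allele, b-allele) tuple.
    """
    return cell1[0] != cell2[0] and cell1[1] != cell2[1]


def compatible_fraction(tetrad):
    """Fraction of all possible pairings within one tetrad that are compatible."""
    pairs = list(combinations(tetrad, 2))
    return sum(compatible(p, q) for p, q in pairs) / len(pairs)


# Tetrapolar: a and b on different chromosomes, four mating types per meiosis.
tetrapolar = [("a1", "b1"), ("a1", "b2"), ("a2", "b1"), ("a2", "b2")]
# Bipolar: a and b linked, so only two mating types segregate.
bipolar = [("a1", "b1"), ("a1", "b1"), ("a2", "b2"), ("a2", "b2")]

print(f"tetrapolar: {compatible_fraction(tetrapolar):.2f}")
print(f"bipolar:    {compatible_fraction(bipolar):.2f}")
```

Under these assumptions, two of the six possible pairings within a tetrapolar tetrad are compatible, compared with four of six in the bipolar tetrad, so linkage doubles the share of compatible encounters among cells derived from a single teliospore.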

Infection and parasitic stage

The biotrophic stage in both groups is initiated after mating through the formation of the dikaryon. The morphological switch results in a penetration hypha that enters the host between epidermal cells (Schäfer et al. 2010; Snetselaar and Mims 1993). Colonisation through the stomata, which is common in rust fungi, has not been observed in either group of smut fungi (Heath 1995; Schäfer et al. 2010; Snetselaar and Mims 1993). In order to penetrate, the hyphae form appressoria with different modes of adhesion: Ustilaginaceae such as U. maydis show a fibrous rim around the appressorium (Snetselaar and Mims 1993), while Microbotryum species are suspected to use adhesive proteins or carbohydrates to establish contact (Schäfer et al. 2010). Both groups likely use lytic enzymes to penetrate the host epidermis (Kämper et al. 2006; Yockteng et al. 2007). Parasitic hyphae of Microbotryum keep proliferating intercellularly in the apoplast (Bauer et al. 1997; Schäfer et al. 2010). Species in the Ustilaginaceae enter the cells of their host shortly after host penetration and grow intracellularly, but with ongoing proliferation and the beginning of sporulation, the fungus also starts growing between host cells (Snetselaar and Mims 1993, 1994). Within the host, fungi of both groups proliferate via septate hyphae and typical basidiomycetous clamp connections between adjacent cells have been observed in several species (e.g., Deml et al. 1981; Liro 1924; Snetselaar and Mims 1994).

Both lineages likely utilise effectors to avoid plant defences during establishment and maintenance of the infection, as well as for gaining nutrients from the host. Effectors are mostly small proteins that are secreted by pathogens into the apoplast or imported into the plant cell to manipulate host defence responses and metabolism in favour of the parasite (reviewed by Stergiopoulos and de Wit 2009). They evolve in order to evade generalised and effector-triggered defence responses of the host plants (Jones and Dangl 2006). It has been suggested that this leads to an arms race between the plant and fungus, in which molecular host defence compounds and effectors evolve in direct competition (Depotter et al. 2021).

The search for effectors has been greatly facilitated by their de novo prediction in genome sequences by algorithms that utilise specific patterns, including sequence lengths, cysteine-richness and export signals (Perlin et al. 2015; Sperschneider and Dodds 2021) or by applying comparative approaches. The latter approach has been especially successful in Ustilaginaceae, as their genomes can be compared to the genome of U. maydis, for which functional studies of specific effectors are available (Schirawski et al. 2010; Schweizer et al. 2018; Sharma et al. 2019). Of particular interest to date have been so-called core effectors, which are conserved throughout a larger group of related species and are assumed to play a central role in the parasitic stage. For instance, the core effector Pep1 is responsible for the establishment of the interaction zone between the fungus and the plant throughout the Ustilaginaceae (Doehlemann et al. 2009; Hemetsberger et al. 2015; Sharma et al. 2019). Other effectors are only found in single species and possibly constitute specific adaptations to the host plant, but the specific functions of these genes have not been determined (reviewed by Figueroa et al. 2021).
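The pattern-based logic of de novo effector prediction can be sketched as a simple filter. The Python example below is a toy illustration only and does not reproduce any of the published prediction tools: the length and cysteine thresholds are hypothetical values chosen for the example, the presence of an export signal is assumed to be supplied by an external signal-peptide predictor, and the two protein sequences are invented.

```python
from dataclasses import dataclass

# Hypothetical thresholds, chosen only for illustration.
MAX_LENGTH = 300          # effectors are mostly small proteins
MIN_CYS_FRACTION = 0.03   # minimum share of cysteine residues

@dataclass
class Protein:
    name: str
    sequence: str
    has_signal_peptide: bool  # assumed output of an external predictor

def cysteine_fraction(sequence):
    """Share of cysteine (C) residues in an amino-acid sequence."""
    if not sequence:
        return 0.0
    return sequence.upper().count("C") / len(sequence)

def is_candidate_effector(protein):
    """Flag proteins that are exported, short and cysteine-rich."""
    return (
        protein.has_signal_peptide
        and len(protein.sequence) <= MAX_LENGTH
        and cysteine_fraction(protein.sequence) >= MIN_CYS_FRACTION
    )

# Invented example proteins, not real gene products.
proteins = [
    Protein("toy_secreted", "MKCLLACSTCAAPCGKC", True),
    Protein("toy_cytosolic", "MKCLLACSTCAAPCGKC", False),
    Protein("toy_cys_poor", "MKALLAASTGAAPAGKA", True),
]

candidates = [p.name for p in proteins if is_candidate_effector(p)]
print(candidates)
```

Candidates produced by such a filter are only putative; as noted above, functional studies or comparative approaches are needed to confirm a role in the parasitic stage.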

In U. maydis and S. reilianum, the effector-coding genes are organised in gene clusters. These probably evolved from tandem gene duplications and the subsequent accumulation of non-synonymous mutations may have resulted in functional changes of effectors (Kämper et al. 2006; Schirawski et al. 2010). This spatial organisation facilitates orchestrated regulation of effector genes in waves throughout the parasitic stage, as different effector functions are needed at different stages of the infection (Depotter et al. 2021; Kämper et al. 2006; Lanver et al. 2018). However, in other species, such as U. hordei, cluster-organised effectors have not been observed (Laurie et al. 2012). In Microbotryum, clustering of effector-coding genes has also not been detected (Perlin et al. 2015).

Effectors evolve at different rates, some showing higher intraspecific variation while others are conserved between populations (Depotter and Doehlemann 2020; Kämper et al. 2006; Kellner et al. 2014). Interestingly, effectors that are involved in the establishment of the parasitic phase are more conserved, while effectors that regulate host-parasite interaction during biotrophic growth show increased inter- and intraspecific variation. This variation includes higher sequence variability within effector genes, as well as presence/absence polymorphisms. The discovery of intraspecific presence/absence polymorphisms has led to the hypothesis that there are accessory effectors, which are non-essential for life cycle completion, but can increase virulence significantly (Depotter and Doehlemann 2020; Depotter et al. 2021; Kämper et al. 2006). Higher conservation of early-stage effectors most likely mirrors ubiquitous, crucial functions at this time point, such as epidermis penetration that involves many mechanisms to overcome highly conserved plant defences (Depotter and Doehlemann 2020; Depotter et al. 2021). The rapid diversification of non-accessory, essential later-stage effectors, on the other hand, is thought to mirror an intense selective pressure by the sophisticated, targeted response of the host (Depotter and Doehlemann 2020; Depotter et al. 2021).

Studies on the molecular interaction of Microbotryum species with their hosts are in their early stages, but transcriptomic studies in M. lychnidis-dioicae show that putative effector-coding genes are also differentially expressed across the infection stages. Most of them are expressed during the infection process, as well as during teliospore formation in the anthers (Toh et al. 2017, 2018). Just like in Ustilaginaceae, the putative effector repertoire of Microbotryum is composed of conserved and species-specific genes. However, species-specific effectors in Microbotryum are rare and fast-evolving orthologous genes seem to be the dominant driver of host specificity (Beckerson et al. 2019).

During biotrophic growth, parasites gain their nutrients from their hosts, whereby CAZymes and sugar transporters play essential roles in the acquisition (Lanver et al. 2018; Perlin et al. 2015; Wahl et al. 2010a). In Ustilaginaceae, most CAZymes are conserved between different species (Kijpornyongpan et al. 2018), hinting at their general role of carbohydrate degradation, but several species-specific CAZyme families seem to occur (Fig. 4B). These will be of special interest for future research, as their occurrence suggests the use of specific carbon sources or lineage-specific carbon metabolism pathways that are important in parasite-host interaction. Species in the Ustilaginaceae contain a significantly (Wilcoxon-test, p < 0.002) greater diversity of CAZyme families than Microbotryum species (Berlemont 2017; Perlin et al. 2015; Fig. 4C). These include, amongst others, cellulases and xylanases, two members of the glycosyl hydrolase family (Fig. 4A) necessary for cell wall degradation. It is tempting to draw conclusions on different interaction strategies in the two groups: intracellular growth of Ustilaginaceae smuts necessitates enzymes that regularly break down plant cell walls, whereas the intercellular growth of Microbotryum does not require such enzymes (Cano-Canchola et al. 2000; Geiser et al. 2013; Perlin et al. 2015). M. lychnidis-dioicae, however, was found to contain a high number of glycosyl transferases, a CAZyme family involved in cell wall remodelling (Klutts et al. 2006; Perlin et al. 2015). Nevertheless, the difference in the number of glycosyl transferases between the two taxa is minor (Fig. 4A) and frequent remodelling of cell walls is likely a common feature in both taxa.
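A comparison of CAZyme family diversity between two groups of species can be sketched as follows. This Python example is an illustration under stated assumptions and not the analysis behind Fig. 4C: it assumes that the test is the two-sample Wilcoxon rank-sum test (the variant is not specified above), it computes an exact two-sided p-value that is valid only for small samples without ties, and the per-species family counts are hypothetical placeholders rather than the published values.

```python
from itertools import combinations

def rank_sum_exact_p(x, y):
    """Exact two-sided Wilcoxon rank-sum p-value.

    Assumes small samples and no tied values. Every possible assignment
    of ranks to the first group is enumerated, and the p-value is the
    share of assignments whose rank sum deviates from its expectation
    at least as much as the observed one.
    """
    pooled = sorted(x + y)
    rank = {value: i + 1 for i, value in enumerate(pooled)}
    observed = sum(rank[value] for value in x)
    n, total = len(x), len(pooled)
    expected = n * (total + 1) / 2
    sums = [sum(c) for c in combinations(range(1, total + 1), n)]
    extreme = sum(
        abs(s - expected) >= abs(observed - expected) for s in sums
    )
    return extreme / len(sums)

# Hypothetical numbers of CAZyme families per species (placeholders only).
ustilaginaceae = [112, 108, 115, 110, 118]
microbotryum = [95, 90, 98, 93]

p_value = rank_sum_exact_p(ustilaginaceae, microbotryum)
print(round(p_value, 4))
```

With samples this small, the lowest p-value the exact test can return is 2/126, so reaching the significance level reported above requires more species per group than are used in this placeholder example.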

Virulence factors, such as effectors, sugar transporters or CAZymes have been studied in both groups. Functional Microbotryum studies have mostly been performed in a single species, M. lychnidis-dioicae (Perlin et al. 2015; Toh and Perlin 2016; Toh et al. 2016, 2017), while in Ustilaginaceae, many species have been assessed, often relying on comparative results with functional genetics studies and predictions from the genome of U. maydis. Studying more Microbotryum species in this context, given that the genomes are already available, would surely provide insight into parasite–host interaction in this group. While studies on Microbotryum species routinely draw the comparison with Ustilaginaceae model species (Perlin et al. 2015; Schäfer et al. 2010; Toh et al. 2017), mechanisms described in Microbotryum are rarely addressed in studies on Ustilaginaceae. Using Microbotryum virulence factors and their evolution to understand Ustilaginaceae parasitism is an issue that deserves more attention and could lead to a variety of new discoveries.

Sporulation, germination and basidial growth

In both taxa, the parasitic stage culminates in sori filled with teliospores. The sorus formation mainly takes place in specific host tissues that can be different depending on the parasite species, e.g., leaves or stamina (Kemler et al. 2020; Vánky 2011[2012]). The model organism U. maydis is an exception: although it is well-known for the formation of galls in the female inflorescences of Zea mays, it can also form hypertrophic growth in other above-ground plant tissues (Christensen 1963). Complicating the issue, depending on the host tissue used for teliospore formation, the fungus expresses genes differentially (Skibbe et al. 2010).

Superficially, the process of teliospore formation from hyphal growth is similar in both groups, but on closer inspection there are characteristic differences. Sporogenous hyphae of species from both groups are completely transformed into spore cells. Species from the Ustilago type produce a hyaline matrix formed by the breakdown of the hyphal cell walls, which disappears subsequently when the durable, dark-coloured spore walls are formed. In Sporisorium species, tight spore balls can remain after the formation of the teliospores from characteristic intercoiled sporogenous hyphae (Piepenbring et al. 1998). In Microbotryum, hyaline sheets develop as the cells become spherical, after being covered by remains of the hyphal cell wall initially (Piepenbring et al. 1998). In both groups, the two nuclei of the dikaryotic hyphae fuse during spore development to form one diploid nucleus (Dangeard 1893; Paravicini 1916) and the cell plasma retracts. In Ustilago, the diploid cells can divide mitotically, most likely to obtain a bigger spore mass for successful dispersal (Snetselaar and Mims 1994). In Microbotryum, yeast-like budding of the diploid cells has been observed (Piepenbring et al. 1998). Spore development results in the characteristic, dark-coloured mass that gives a scorched appearance to infected plant parts.

After dispersal, the teliospores germinate with basidia that form haploid basidiospores. Germination begins with the dissolution of the teliospore cell wall through extension of the innermost of its three layers in U. maydis (Ramberg and McLaughlin 1980), and with the basidium extending through the cell wall in M. violaceum (Ingold 1983). Meiosis in both groups usually takes place directly before or during germination and an elongated hypha forms the basidium (Ingold 1983). During its development, four haploid nuclei are formed and distributed into different compartments of the basidium. The meiotic division creates two nuclei of each of two mating types in bipolar species and usually four nuclei of different mating types in tetrapolar species. The emergence of the haploid sporidia from the basidium again initiates the haploid saprotrophic stage (Ingold 1983, 1988). Underlying molecular mechanisms governing teliospore germination are mainly known for U. maydis, in which transcript analyses using microarrays and RNAseq have been performed (Donaldson et al. 2017; Zahiri et al. 2005). During the resting phase, teliospores are metabolically arrested and the few genes that are upregulated are involved in the onset of meiosis, transcription and translation initiation, protein turnover, protein assembly and signal transduction (Zahiri et al. 2005). It is assumed that stabilised transcripts are stored within the resting teliospore and can undergo processing by the protein products of the upregulated genes promptly upon germination. Additionally, a large number of natural antisense transcripts that are potentially involved in gene regulation were found in resting teliospores (Donaldson et al. 2017). Once teliospores germinate, genes including those for translation, post-translational modification, cell wall synthesis, mitochondrial biosynthesis and metabolic energy production are upregulated. During the extension of the basidia, genes encoding a chitin synthase (Umchs2), as well as genes potentially involved in basidiospore formation (Cmp1), are upregulated (Zahiri et al. 2005).

Successful germination is dependent on the environment, an important but understudied subject. In U. maydis, germination is triggered by availability of sugars that the fungus can utilise for proliferation in the subsequent yeast stage (Caltrider and Gottlieb 1966). In both groups, plant hosts provide important germination cues, whereby the location on the plant can change germination behaviour. The anther smut M. lychnidis-dioicae, for instance, germinates on seedlings as well as flowers (Schäfer et al. 2010). On the petals of the flowers, teliospores underwent meiosis and germinated with three-celled basidia (one nucleus retracted into the teliospore), each cell producing a basidiospore. The nutrient-rich conditions in flower nectar also favoured the proliferation of sporidia. On seedling leaves and their nutrient-deprived surfaces, teliospores formed two-celled basidia that often led to intrabasidial mating and subsequent infection of the host without the intermediate formation of basidiospores. Previous studies on artificial media showed similar behaviour, whereby high nutrients and temperature led to three-celled basidia and low nutrients to two-celled basidia (Hood and Antonovics 1998).

Different basidial types also resulted in different nuclear behaviour. Whereas in the three-celled basidia each nucleus was in an individual basidial compartment, the distal cell of two-celled basidia contained two nuclei (Hood and Antonovics 1998). Both of these distal nuclei were of the same mating type, thereby making it possible to produce a functioning heterokaryon by fusion with the proximal cell. In both groups, such direct fusion of basidial compartments via hyphal bridges (i.e., intrabasidial mating) has been observed (e.g., Bock 1964; Hood and Antonovics 1998; Ingold 1988; Schäfer et al. 2010). Fusion can happen between different cells, depending on when the mating types segregate during meiosis (Hood and Antonovics 1998). If mating types segregate during Meiosis I, fusion occurs between one of the distal and one of the proximal cells, as in M. lychnidis-dioicae (Hood and Antonovics 1998). Mating between the two distal or the two proximal cells occurs when mating types segregate during Meiosis II, as has been observed in S. tricholaene, U. hypodytes, S. scitamineum, U. scrobiculate and U. tritici (Ingold 1983, 1988, 1989, 1994). However, a mixture of both mating behaviours in one species has also been observed (Ingold 1988, 1989, 1994).
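The relationship between the timing of mating-type segregation and the cells that can fuse can be made explicit with a small model. The Python sketch below is a deliberate simplification and not taken from the cited studies: it assumes a linear four-celled basidium in which the two products of Meiosis I occupy the proximal and distal halves, a single mating-type factor with two alleles (labelled a1 and a2 for illustration), and compatibility between any two cells of different mating type.

```python
from itertools import combinations

# Cell positions in a linear four-celled basidium, from the teliospore outwards.
CELLS = ["proximal-1", "proximal-2", "distal-1", "distal-2"]

def mating_types(segregation):
    """Mating type of each cell under the assumed spatial model."""
    if segregation == "MI":
        # Types separate at the first division: each half is uniform.
        return ["a1", "a1", "a2", "a2"]
    if segregation == "MII":
        # Types separate at the second division: each half holds both types.
        return ["a1", "a2", "a1", "a2"]
    raise ValueError("segregation must be 'MI' or 'MII'")

def half(cell):
    """Return 'proximal' or 'distal' for a cell label."""
    return cell.split("-")[0]

def compatible_pairs(segregation):
    """All pairs of basidial cells that differ in mating type."""
    types = mating_types(segregation)
    return [
        (CELLS[i], CELLS[j])
        for i, j in combinations(range(4), 2)
        if types[i] != types[j]
    ]

def within_half_pairs(segregation):
    """Compatible pairs in which both cells lie in the same half."""
    return [p for p in compatible_pairs(segregation) if half(p[0]) == half(p[1])]

for mode in ("MI", "MII"):
    print(mode, compatible_pairs(mode), within_half_pairs(mode))
```

In this model, segregation at Meiosis I allows only fusions between a proximal and a distal cell, whereas segregation at Meiosis II additionally allows fusion between the two proximal or the two distal cells, in line with the observations summarised above.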

Thorough investigations of spore formation and germination revealed that these stages of the life cycle have distinct features in the two groups. Further work is needed to determine whether the superficially similar processes of spore germination result from different or similar molecular pathways. High-throughput methods, including transcriptomics, will certainly improve our understanding of the molecular levels at which these morphologically similar processes are achieved by similar mechanisms.

Conclusion

Smut fungi in Ustilaginaceae and Microbotryaceae are remarkable examples of convergent evolution that demonstrate that a whole lifestyle can evolve multiple times in relatively distant clades. Despite these similarities, both groups are characterised by different genome architectures and functional genome composition. Species-specific genes involved in host-parasite interaction are more abundant in Ustilaginaceae, whereas in Microbotryum, genes involved in the interaction seem to evolve by orthologous gene diversification. Because Ustilaginaceae are more amenable to genetic methods and have a shorter life cycle, functional research has a much longer history in this group than in Microbotryaceae. Due to the molecular model species status of U. maydis, newly developed molecular methods, including transcriptomics and CRISPR-Cas, are already well-established in Ustilaginaceae, but need further development in Microbotryaceae. Microbotryum, on the other hand, has been established as a model system for evolutionary and population biology. Systematic relationships between Microbotryum species are fairly well established and co-structure population genetic studies have shown strong co-evolution between many parasites and their hosts. Recently, a strong focus on mating-type chromosome evolution on a genomic level in Microbotryum anther smuts has furthered understanding of the evolution of non-recombining regions in fungi. This provides interesting comparisons with sex chromosomes of animals and plants, in particular to test the evolutionary causes of stepwise extension of recombination suppression, which is convergent across kingdoms. The implementation of genetic tools in Microbotryum to understand molecular interaction between these parasites and their hosts, as well as increased efforts in evolutionary and population biology in Ustilaginaceae, will provide tremendous steps forward in our understanding of convergent evolution between the two groups and in fungi in general.