Background

Transposable elements (TEs) are selfish genetic and typically repetitive elements in genomes. Autonomous TEs carry genes for self-replication, transposition, and DNA integration, allowing them to multiply and spread in host genomes, often leading to high copy numbers. They are classified into DNA transposons, spreading by excision from one and integration into another genomic location (“cut-and-paste” mechanism), and retrotransposons, which transcribe their sequence into an RNA transcript, which in turn is reverse-transcribed and then integrated as a new copy into the genome, frequently at a distant site [1, 2]. TEs can dramatically influence genome size in eukaryotes [3], destabilize genome integrity [4], and significantly affect genome structure and gene expression patterns [5].

Powdery mildew fungi are ubiquitous phytopathogens of many plant species in temperate climates [6, 7]. The genomes of powdery mildew fungi often consist to more than 50% of TEs, resulting in largely inflated genomes and highly variable genome sizes between species [8]. The barley (Hordeum vulgare) powdery mildew Blumeria hordei possesses a genome of approximately 125 million base pairs (Mbp) in length with a TE content of more than 74% [9, 10]. The TE space of B. hordei is dominated by long interspersed nuclear element (LINE) elements, predominantly Tad1, and long terminal repeat (LTR) elements, especially Ty3/mdg4 (syn. “Gypsy”) and Ty1/Copia [9, 10]. While such a high TE content likely poses a stress on genome integrity in B. hordei, TEs can also be a source of variability and adaptation [4] and contribute to the evolutionary invention of novel virulence factors [11, 12]. However, it is presently unknown how powdery mildew fungi regulate their TEs to prevent uncontrolled spread and, hence, genome disintegration.

Filamentous fungi evolved the repeat-induced point mutation (RIP) mechanism that serves to mutagenize repetitive sequences and helps to disable TEs [13]. However, the RIP mechanism is missing in all powdery mildew fungi sequenced to date [8,9,10, 14], suggesting that it was lost in the course of evolution in a common ancestor of powdery mildew fungi [14]. Instead, we previously found that small RNAs (sRNAs) can originate from B. hordei TEs and that phased small RNAs (phasiRNAs) align to fungal TEs, suggesting frequent RNA interference (RNAi) to control TE activity [15, 16].

Long noncoding RNAs (lncRNAs) are transcripts longer than 200 nucleotides with no apparent coding potential [17, 18]. Like mRNAs, lncRNAs are transcribed by RNA polymerase II and often processed by splicing, 7-methylguanosine capping, and polyadenylation [19]. lncRNAs can be distinguished from sRNAs and microRNAs, which are smaller than 200 nucleotides [20], RNA polymerase III-generated noncoding RNAs, including ribosomal and transfer RNAs, and the RNA polymerase II-derived components of the spliceosome, i.e. small nuclear RNAs (snRNAs) and small nucleolar RNAs (snoRNAs) [21]. Compared to coding genes, lncRNAs frequently exhibit lower expression levels and sequence conservation on average [21]. lncRNAs can be classified as intergenic, intronic, sense, and antisense lncRNAs depending on their relative position to coding genes [22], although the majority of lncRNAs described to date reside in intergenic regions [23,24,25]. lncRNAs engage in biological processes such as transcriptional and post-transcriptional gene regulation, epigenetic regulation, regulation of translation, and regulation of protein activity [19, 26]. A number of lncRNAs have been functionally analyzed in mammals, such as HOTAIR, roX1, roX2, and Meg3, which form triplexes with purine-rich DNA regions to exert regulatory functions [21, 27], and in plants, where lncRNAs are involved in developmental processes and stress responses [28]. In fungi, lncRNAs function in regulating development, stress and starvation responses, and pathogenic traits [29, 30]. For instance, the human-pathogenic fungus Cryptococcus neoformans exhibits dynamic infection stress-related lncRNA expression patterns [31]. The rice blast fungus Magnaporthe oryzae expresses more than 500 lncRNAs specifically during infection [23], suggesting that lncRNAs may be important for plant-pathogenic fungi throughout host colonization. Interestingly, in maize, TEs can give rise to lncRNAs in response to abiotic stresses [32], suggestive of regulatory roles exerted on TE activity by lncRNAs. We hypothesize that lncRNAs could also occur in powdery mildew fungi, be expressed during different infection stages, and fulfill regulatory functions toward controlling TE activity.

In this work, we aimed to discover signs of co-regulation between TEs, coding genes, and lncRNAs in the barley powdery mildew fungus B. hordei. We generated a time-course whole-transcriptome shotgun sequencing (RNA-seq) dataset to measure transcript levels at all relevant stages of the asexual infection cycle of B. hordei, which allowed us to provide manually curated high-quality genome-wide annotations of coding genes, TEs, and lncRNAs. TEs exhibited signs of time point-specific regulation and co-regulation with mRNAs and lncRNAs, and we found evidence for phasiRNAs linked to retrotransposons, suggestive of RNAi-directed regulation of TEs in B. hordei. Importantly, we discovered 823 lncRNAs on the antisense strand of TEs, which may provide precursors for RNAi with TEs.

Results

Stranded total RNA sequencing of B. hordei and barley transcripts

We hypothesize that B. hordei TEs are differentially regulated throughout the pathogenic life cycle of the fungus. We performed stranded RNA-seq at 0, 6, 18, 24, 72, and 120 h post inoculation (hpi) to obtain transcriptomes of all major developmental stages of the asexual pathogenic life cycle of B. hordei and its host, barley (Fig. 1A). The RNA sequencing yielded on average 84,528,021 raw reads/sample and 81,708,016 reads/sample after quality trimming (Supplementary Table 1). FastQC quality control revealed high per-base quality (per-base quality score > 30) across all samples, but also 7.7% sequence duplication on average, which originated from barley chloroplast RNA and had no similarity to the H. vulgare and B. hordei nucleic genomes. Of the remaining reads, 0.2–1.5% and 4.6–24.8% mapped to the reference genome of B. hordei isolate DH14 [10] at the early (0, 6, 18 and 24 hpi) and late (72 and 120 hpi) time points, respectively. The relatively low fungal mapping percentage can be explained by the B. hordei tissue only representing a small fraction of the biological sample material (leaf epidermal peels), especially at early infection stages before 24 hpi. Despite a rather low proportion of fungal reads, RNA-seq analysis of powdery mildew-infected plant samples can retrieve biologically meaningful insights [33]. Further, between 65 and 93% of the reads mapped to the H. vulgare cv. ‘Morex’ reference genome version Morex3 [34], while between 6.9% and 13.2% of the reads could neither be assigned to the host nor the pathogen. Using Pearson correlation coefficient-based hierarchical clustering, principal component (PC) analysis, and non-metric multidimensional scaling (NMDS), we found that the sequencing samples from the same time points were more similar to each other than to samples from the other time points (Fig. 1B-C). However, the principal components PC1 and PC2 did not separate B. hordei time points 6 hpi from 18 hpi, nor 72 hpi from 120 hpi, while hierarchical clustering and NMDS indicated these to be different. Overall, both B. hordei and H. vulgare RNA-seq data clustered according to infection time point, suggesting that the dataset is informative with respect to the respective infection stage.

Fig. 1
figure 1

RNA-seq data obtained from B. hordei-infected barley leaf epidermal peelings exhibits time point-dependent clustering. A We sampled B. hordei isolate K1AC-infected barley abaxial leaf epidermis at time points across the asexual life cycle of the fungus. The pictograms indicate the fungal stage from conidiospore germination (0 hpi) to conidiogenesis, i.e., asexual spore formation (120 hpi). B and C Left panels: Horizontal clustering using the normalized gene expression values of B. hordei isolate K1AC (B) and H. vulgare cv. ‘Margret’ (C), respectively, was calculated with Euclidean distance and Ward.D2 clustering in R. Height indicates the Euclidean distance between samples. Middle panels: Principal components were computed using the R algorithm prcomp. The percentage of sample divergence explained by the principal components PC1 and PC2 are indicated as ratios in the axis labels. Right panels: Non-metric multidimensional scaling (NMDS). Colors: Burgundy, 0 hpi; red, 6 hpi; yellow, 18 hpi; light green, 24 hpi; blue, 72 hpi; purple, 120 hpi

TEs are transcriptionally active in a time point-dependent manner in B. hordei

Since the genome of B. hordei is largely occupied by TEs [9, 10], we generated a manually curated [35] database of non-redundant TEs for B. hordei (Supplementary Fig. 1). In total, we established 353 consensus repeats including a TE consensus database of 344 TE families, of which 116 were LTR-Ty3 (aka LTR-Gypsy), 122 were LTR-Ty1 (aka LTR-Copia), 80 were LINE-Tad1 elements, 3 were unknown LTR-type TEs, 3 were NonLTR retroposons, 2 were SINE (Eg-R1 and Egh24), and 18 were DNA transposons (Table 1). We then re-annotated the TEs in the B. hordei genome with the custom TE consensus database using RepeatMasker. We found that 47,581,730 bp (38.22%) of the B. hordei isolate DH14 genome assembly were occupied by LTR retrotransposons, 29,472,397 bp (23.67%) by LINE retrotransposons, 10,208,381 bp (8.20%) by DNA transposons, 4,028,256 bp (3.24%) by SINE retrotransposons, and 2,013,553 bp (1.62%) by other or unclassified elements (Supplementary Table 2). Overall, we discovered 13,907 full-length (9.4%) and 133,617 truncated TEs in the B. hordei genome; similarly low full-length conservation was observed in the fungus Laccaria bicolor with 6.3% [36]. In total, according to the updated calculation, the B. hordei reference genome contained 93,304,317 bp (74.95%) interspersed repetitive sequence, which is close to previous estimates [8, 10].

Table 1 Summary of B. hordei (isolate DH14) non-redundant consensus repeats

Next, we analyzed the RNA-seq read coverage of fungal consensus TEs in the course of pathogenesis. On average, both DNA and LINE and LTR retrotransposons exhibited high (> 500 transcripts per million (TPM)) levels of read mapping (Fig. 2A-B). Further, using time-resolved analysis of RNA-seq data via TCseq [37] and quantitative reverse transcriptase polymerase chain reaction (qRT-PCR) analysis for selected elements, we found that transposon families can exhibit time point-specific enhanced transcript levels (Fig. 2C-D and Supplementary Fig. 2), suggestive of dynamic regulation of TE expression during plant infection.

Fig. 2
figure 2

TE families exhibit infection stage-specific expression patterns. A and B We mapped the reads from the time-course RNA-seq experiment to our repetitive elements database containing 344 individual non-redundant TE families (Table 1). The violin plots show the Log2 of the average normalized expression (y-axis) of each family expressed as transcripts per million (TPM); the types of repetitive elements (blue, DNA transposons; light green, LINE; dark green, LTR; red, NonLTR retrotransposons; magenta, SINE; grey, low complexity regions, satellites, simple repeats, and unknown repeats) are indicated on the x-axis. C We used TCseq [37] to cluster time-course expression patterns of the 344 consensus TEs. The lines represent single consensus TEs; the color shade denotes the cluster membership by Spearman correlation (the darker the shade of green, the higher the correlation R with the respective expression cluster according to the color scheme). The x-axis denotes the time points of infection (Fig. 1A), i.e., 0 hpi (conidiospore germination), 6 hpi (appressorium formation), 18 hpi (host cell penetration), 24 hpi (haustorium formation), 72 hpi (epiphytic colonization), and 120 hpi (conidiogenesis). The y-axis indicates the relative z-score based on reads per kb of transcript per million mapped reads (RPKM). D We conducted qRT-PCR analysis of selected TE families (for the whole set of tested TE families, see Supplementary Fig. 2). The dot plots display the relative transcript abundance (according to ΔΔCT analysis; y-axis) of the respective TE family indicated on the top-left in B. hordei isolate K1AC at seven time points of host infection (x-axis; see also Fig. 1A). TE transcript levels were normalized to B. hordei GAPDH (BLGH_00691); three independent replicates (n = 3), consisting of three technical replicates each, were performed. The shades of green indicate the replicate each data point belongs to; the black bar shows the median

Non-coding RNAs are a common means to reversibly silence gene expression [20, 38]. Previously, we found that sRNAs often originate from TEs [16] and that phasiRNAs preferentially align to TEs, indicative of RNAi activity on transcripts originating from TEs [15]. We therefore used publicly available sRNA-seq datasets for B. hordei [15, 39] to assess phasing of consensus TEs via the tool unitas [40]. Overall, we detected the accumulation of phasiRNA mapping in 104 of the 344 consensus TEs (Table 1), of which 91 were LTR elements (51 Ty3/mdg4 and 40 Ty1/Copia elements) and 13 were Tad1 LINE elements (Fig. 3A). These 104 elements corresponded to 1,333 full-length and 39,499 truncated TEs in the genome of B. hordei. Of these, 102 exhibited expression levels of more than 250 TPM at a minimum of one time point. We compared these apparently phased TEs with TEs showing time point-specific expression patterns (Fig. 2C) and found 22 to be both phased and undergoing dynamic expression (12 Ty3/mdg4, 7 Ty1/Copia, and 3 Tad1 elements; Fig. 3). Thus, phasing of TE transcripts may contribute to the dynamic regulation of TEs throughout the pathogenic fungal life cycle.

Fig. 3
figure 3

Noncoding RNAs are associated with TEs in B. hordei. A We mapped publicly available sRNA-seq datasets obtained from B. hordei-infected barley plants to the TE consensus database using bowtie [41] and identified candidate phasiRNAs using unitas [40]. The donut chart shows the number of consensus TEs to which phasiRNAs were linked (green portion of inner circle), the number of elements accounting for the different TE families (Table 1; second circle), and the number of consensus TEs whose expression peaks at specific time points (Fig. 2; outer circle). B Example of stacked phasiRNAs mapped to the consensus sequence of TE Ty3/mdg4-17 (6,054 bp in length). The TE self-propagation genes are indicated on top (green arrows); these genes encode RT (reverse transcriptase), RNase H, and DNA integrase. The scale below the black horizontal line indicates the TE length and position in bp. Mapped sRNAs are shown in the windows below the size scale for two examples: derived from infected leaf epidermal cells at 120 hpi [15] and derived from infected total leaf material at 24 hpi [39]. Grey blocks indicate single reads; the black boxes on top of each graph display predicted clusters of phasiRNAs. Colored blocks indicate read mismatches with the reference sequence: blue, C; red, T; orange, G; green, A. Data were visualized using Integrative Genomics Viewer v2.9.4 [42]. C The dot plot shows the time course expression patterns of selected consensus TEs using the RNA-seq dataset generated in this study. The y-axis shows the normalized expression expressed as transcripts per million (TPM); the x-axis indicates the respective time point of the asexual life cycle of B. hordei (Fig. 1A). The selected TE consensus elements are indicated on the top-left of each plot. D Example of an antisense lncRNA occurring in the consensus TE Ty1/Copia-63 (5,317 bp in length). The upper lane indicates annotated transcripts on the TE, where TE self-propagation genes are indicated in green (genes encode RT (reverse transcriptase), gag polyprotein, and DNA integrase) and three detected isoforms of associated lncRNAs in orange. The scale below the black horizontal line indicates the TE length and position in bp. The second panel indicates long reads obtained via ONT transcriptome sequencing at 144 hpi; colored lines indicate mismatches between read and reference sequence. The two lower panels show RNA-seq read mappings to this TE as colored lines. Green, stranded reads aligning to the sense strand; Orange, stranded reads aligning to the antisense strand. Grey lines indicate reads split due to predicted splicing events. Data were visualized using Integrative Genomics Viewer v2.9.4 [42]. E We amplified several TE antisense lncRNAs from B. hordei cDNA using sequence-specific primer pairs. The agarose gel shows PCR-amplified lncRNAs (indicated above each lane), arrows indicate the expected PCR products. Expected PCR amplicon sizes were: BLGHnc_000942-RA (Ty3/mdg4-23 antisense), 2,831 bp; BLGHnc_004556-RA (Ty1/Copia-23 antisense), 2,888 bp; BLGHnc_000243-RA (Ty3/mdg4-1 antisense), 1,150 bp; BLGHnc_003729-RB (Ty3/mdg4-9 antisense), 1,065 bp; BLGHnc_000866-RA (Ty3/mdg4-62 antisense), 2,187 bp; BLGHnc_03526-RA (Ty1/Copia-63 antisense), 1,419 bp; BLGHnc_003513-RB (intergenic), 922 bp; BLGHnc_004496-RA (intergenic), 2,128 bp. Bands corresponding to the expected product size were excised from the gel and their sequence identity confirmed by amplicon sequencing. NPC, no primer control. DNA Ladder, 1 kb plus (Invitrogen-Thermo Fisher, Waltham, MA, USA). F The genomic transcript models of BLGHnc_000942-RA, BLGHnc_004556-RA, BLGHnc_000243-RA, BLGHnc_003729-RB, BLGHnc_000866-RA, BLGHnc_03526-RA, BLGHnc_003513-RB, and BLGHnc_004496-RA. Orange blocks represent exons and grey lines spliced introns

Further, we observed cases of RNA-seq read-mapping to TE loci that exhibited signs of splicing, with splice sites located on the antisense strand of the TE-encoded genes (Fig. 3D). To preclude multi-mapping artifacts because of short-read mapping to highly repetitive loci, we performed long-read transcript sequencing with Oxford Nanopore Technology (ONT) MinION technology at 144 hpi and observed reads corresponding to spliced antisense lncRNA transcripts in TEs (Fig. 3D). These transcripts did not appear to encode any meaningful peptides and thus were TE-derived antisense long noncoding RNAs (lncRNAs). We selected eight TE antisense lncRNAs that exhibited expression levels above 10 TPM at a minimum of one time point, which we successfully amplified, cloned and sequenced (Sanger technology). In two of these (BLGHnc_000866-RA and BLGHnc_004496-RA), we verified the occurrence of the predicted introns (Figs. 3E and F). Collectively, this approach independently validated the existence of, in part intron-containing, TE antisense lncRNAs in B. hordei.

Genome-wide lncRNA annotation in B. hordei

We next identified and manually annotated all putative lncRNAs in the genome of B. hordei using our extensive time-course RNA-seq dataset. We assembled 45,797 transcripts using StringTie [43] and designed a pipeline to identify lncRNAs (> 200 bp and no apparent coding potential) from the assembled transcripts (Fig. 4A). We used a Web Apollo instance [44] to manually inspect and correct coding and noncoding gene models, resulting in 5,127 lncRNA loci across the B. hordei genome, accounting for 8,970 transcripts in total. Of the 5,127 lncRNA loci, 2,401 (46%) were recovered using ONT long read transcript sequencing, including 293 TE antisense lncRNAs. We consider this a satisfactory recovery rate, since we obtained long read transcriptomic data only from the late infection stage at 6 dpi and recovered polyadenylated rather than total transcripts. Due to the extensive transcriptome data in our study, which was not available for coding gene annotation before [10], we further detected 43 new coding genes in the genome of B. hordei, removed 12 models that were located within TEs and encoded predicted transposon replication genes, and corrected another 142 models, which changed the coding gene number from 7,118 to 7,149. Of these, 5,457 (76%) were recovered by ONT long read sequences The most common functional assignments of the 43 newly annotated encoded proteins are predicted candidate secreted effectors, additional Sgk2 kinase-like paralogs, and fungal zinc finger DNA binding domains (Supplementary Table 3).

Fig. 4
figure 4

Genome-wide identification and characterization of B. hordei lncRNAs. A Using the total RNA-seq data mapped to the genome of B. hordei isolate DH14 [10] with HISAT2 [45], we assembled transcripts with StringTie [43]. Next, we filtered out coding genes from the transcriptome by Gffcompare [46], transcripts shorter than 200 bp using Gffread [46], transcripts with coding potential using CPC2 [47], and transcripts accounting for ribosomal RNA, transfer RNA, small nuclear RNA, and small nucleolar RNA via CMscan search against the Rfam database [48]. Then, we used FEELnc [49] for lncRNA annotation and classification, resulting in 17,226 putative lncRNAs in the reference genome of B. hordei. Lastly, we manually inspected the predicted lncRNA models by our Web Apollo [44] instance, yielding in total 5,127 unique lncRNA loci in B. hordei. B Histogram for the transcript length in base pairs (bp; x-axis) against the number of coding genes (mRNAs, purple) and lncRNAs (orange; y-axis). The violin plot above shows the overall distribution of gene lengths; data points represent individual transcripts. C Bar graph of the exon number per transcript (x-axis) against the gene number (y-axis). Purple, mRNAs; orange, lncRNAs. D The violin plot shows the expression in Log2(transcripts per million (TPM)) for mRNAs (purple), lncRNAs in sense orientation of associated genes (blue), lncRNAs in antisense orientation of TEs (orange), intronic lncRNAs (grey), and intergenic lncRNAs (green). The number of transcripts (n) contributing to the respective subset are given on the top. E The box plot shows the transcript levels in Log2(TPM) for lncRNAs (orange) and mRNAs (purple) depending on the transcript exon number (x-axis). F The histogram shows the number of transcripts (x-axis) encoded by lncRNA genes (y-axis). G The stacked bar graph displays the occurrence of alternatively spliced lncRNA transcripts (AS, alternative splicing; orange bar) and the number and type of alternative splicing events in B. hordei. The types of events are illustrated by the drawings, where the red portion of the exon indicates the alternative event and black lines connecting exons splice events, colored in shades of red and orange according to event. From dark red to dark orange, events are shown in this order: retained intron, alternative 3’ splice site, alternative 5’ splice site, skipped exon, alternative last exon, and alternative first exon. H The dot plots show the transcript levels in TPM for alternatively spliced isoforms of the lncRNA BLGHnc_000769 (y-axis) at six time points of host infection (x-axis). Note that isoform BLGHnc_000769-RA was not expressed above background levels and thus omitted from this Figure. The black bar indicates the median of three independent replicates (n = 3)

We found that lncRNAs were shorter than coding mRNAs on average (1,494.8 bp and 2,110.8 bp, respectively) and contained fewer exons (2.3 and 3.0 exons on average; Fig. 4). More than 50% of lncRNAs consisted of two exons, while lncRNAs with more than five exons were rare (< 1.9%). Overall, mRNAs exhibited 15.5-fold higher median expression levels than lncRNAs in our dataset, and there was no significant difference in this respect between intergenic and antisense lncRNAs. In total, 914 intergenic lncRNAs derived from TEs, of which 823 originated from the antisense strand of the respective TE. Overall, lncRNAs, like coding genes and TEs, were ubiquitously distributed throughout the genome and we did not find any lncRNA-rich genome compartments (Supplementary Fig. 3).

Furthermore, we found that 2,378 of the 5,127 lncRNA genes encoded more than one transcript variant, suggesting alternative splicing of a substantial portion of the lncRNAs in B. hordei (Fig. 4F). Two lncRNAs gave rise to eleven separate isoforms, another three lncRNAs contained up to nine isoforms, while most lncRNA genes had two (1,488; 62%) or three (555; 23%) transcript variants (Fig. 4F). The most common alternative splicing event was a retained intron (2,016 events), followed by an alternative 3’ splice site (1,163 events; Fig. 4G). We detected some cases in which alternative lncRNA isoforms showed differential accumulation at different time points during fungal pathogenesis (Fig. 4H), suggesting that alternative splicing of lncRNAs occurs in a developmental stage-specific manner in B. hordei.

Gene expression patterns in B. hordei reflect its pathogenic development

Transcripts are differentially regulated in B. hordei during its pathogenic development (Fig. 1B and C). To gain detailed insights into (co-)expression profiles, we assigned all reads mapping to the B. hordei reference genome to the annotated 7,149 coding genes and 5,127 lncRNAs (Supplementary Table 4 for TPM values). We then clustered lncRNA and mRNA gene expression patterns across all time points using TCseq [37] and identified nine distinct co-expression clusters featuring 2,264 lncRNAs and 2,762 coding genes in total (Fig. 5A). The clusters exhibited distinct expression patterns, with peaks of transcript accumulation at different time points. Seven clusters showed transcript peaks at one or two specific time points. Cluster 4, comprising transcripts whose levels peaked at 18 hpi, contained the highest number of transcripts, namely 336 mRNAs and 344 lncRNAs (Table 2). We further analyzed the expression pattern of all genes encoding putative secreted proteins in B. hordei and found distinct sets of these genes with peaked transcript levels at 0, 6, 18–24, or 72–120 hpi, indicative of “waves” of genes coding for secreted proteins expressed during fungal pathogenesis (Fig. 5B). Clusters 4, 5, 6, and 8 contained an elevated number of putative secreted proteins (> 20% of coding transcripts each), coinciding with peaked transcript accumulation at 18, 24, or 72 hpi (Fig. 5C and Table 2). By contrast, genes encoding Sgk2-like serine/threonine kinases, which are abundant in the genome of B. hordei [50], almost exclusively occurred in cluster 9, which represents genes expressed at 120 hpi (Fig. 5C and Table 2). Since ROPIP1 of B. hordei is an Eg-R1-derived peptide [11], we also assessed the time course expression pattern of the consensus ROPIP1 gene (Supplementary Fig. 4), and found consistently high ROPIP1 expression levels above 1,000 TPM with a peak at around 4,000 TPM at 6 hpi. Furthermore, we analyzed the expression patterns of proteins from the RNAi machinery in B. hordei, which we identified previously [16]. Genes encoding components of the dsRNA synthesis machinery, i.e., RNA-dependent RNA polymerase (BLGH_05945), helicase QDE-3 orthologs, and replication factor A1 (BLGH_03661) were mostly expressed early (0–6 hpi; Supplementary Fig. 5). The DCL-1 ortholog BLGH_05892 showed both early and late expression (0 and 120 hpi) while DCL-2 (BLGH_05549) was only expressed at late time points. AGO1 (BLGH_02442) expression peaked during haustorium formation (18–24 hpi) and the two other AGO-encoding genes (BLGH_02817, BLGH_06130) exhibited high expression of up to 1,226.0 TPM late in the pathogenic cycle (72–120 hpi). The only DNA methylase of B. hordei, DNMT-2 (BLGH_06833), which could be involved in RNAi-mediated DNA methylation, displayed the highest expression in conidia at 0 and 6 hpi.

Fig. 5
figure 5

B. hordei exhibits infection stage-dependent expression patterns of coding genes and lncRNAs. A We used TCseq [37] to cluster time-resolved coding gene and lncRNA expression patterns. The lines represent single transcripts; the color shade denotes the cluster membership by Spearman correlation (the darker the shade of orange, the higher the correlation R with the respective expression cluster according to the color scheme). The x-axis denotes the time points of infection (Fig. 1A), i.e., 0 hpi (conidiospore germination), 6 hpi (appressorium formation), 18 hpi (host cell penetration), 24 hpi (haustorium formation), 72 hpi (epiphytic colonization), and 120 hpi (conidiogenesis). The y-axis indicates the relative z-score based on reads per kb of transcript per million mapped reads (RPKM). B The heat map shows the relative median expression of genes encoding putative secreted proteins across the time points according to the color scheme below the heat map. Purple indicates high and white low expression. The time course cluster number (A) is indicated next to the dendrogram above the heat map. C The stacked bar graph shows the number of lncRNAs (orange) and mRNAs (purple) in each cluster, indicating mRNAs encoding putative secreted proteins (blue) or Sgk2-like kinases (brown). The x-axis indicates the co-expression cluster (A) and is annotated with a dot plot below to highlight the time point represented by each cluster. The y-axis shows the number of transcripts. D We conducted gene ontology (GO) enrichment analysis for the coding genes from each cluster (A) using ShinyGO v0.77 [51] accessed online at http://bioinformatics.sdstate.edu/go/, and summarized GO terms with REVIGO [52]. The dot plot shows the fold enrichment of functional terms compared to the full set of genes of B. hordei (dot size); the fill color denotes the − Log10(FDR-adjusted enrichment p value) according to the color scheme on the right. The summarizing GO descriptions are provided below the GO identifiers (x-axis), the respective co-expression cluster (A) is indicated on the y-axis

Table 2 Transcripts in co-expression clusters of B. hordei

We conducted gene ontology (GO) enrichment analysis for the coding genes found in the nine clusters described above (Fig. 5A). While no enrichment was detected in case of 0 hpi (clusters 1 and 2), we found GO processes related to cell division, cell cycle, microtubule activity processes, oligosaccharide metabolism, response to stimulus, and DNA repair to be enriched at 6 hpi (cluster 3; Fig. 5D; Supplementary Table 5). At 18 and 24 hpi, the process “ribonuclease activity” was enriched, likely reflecting an abundance of RNase-like candidate secreted effector proteins [53, 54] expressed at this time point (cluster 5). During late infection (72 hpi and thereafter; clusters 7 and 8), processes related to protein, nucleotide, and fatty acid biosynthesis and gene expression were over-represented. Finally, at 120 hpi (cluster 9), processes related to protein phosphorylation and phosphorus metabolism were enriched. Protein phosphorylation was likely enriched due to the high number of Sgk2-like serine/threonine kinases ([50]; 52 in total; 19.4% of mRNAs in cluster 9; Table 2) expressed at 120 hpi (Fig. 5C and D).

We conducted the time-course expression and GO enrichment analysis for the plant side as well (Supplementary Table 6 and Supplementary Fig. 6). In the host barley, we found eight co-expression clusters, with peaks at 0, 6, 18, 24, or 120 hpi, and a cluster of 646 genes exhibiting distinct down-regulation at 6 and 18 hpi (cluster 6; Supplementary Fig. 6A). Processes related to stress or defense response, carbohydrate biosynthesis, and cell wall biogenesis were associated with early-responding genes (6 hpi to 24 hpi; clusters 2–5), but these processes were also over-represented in cluster 6, i.e., the set of 646 genes that were down-regulated at 18 hpi. At 72 hpi and 120 hpi, sulfur, nitrogen, organic acid metabolism-related, and catabolism terms were highly represented, in addition to biotic stress response (cluster 7 and 8). Overall, we found that early developmental processes and stress response genes dominated the early infection stage of B. hordei, while processes related to growth and metabolic activity were enriched at later stages (Fig. 5A and D). Effector candidates were expressed in at least four waves (Fig. 5B), while Sgk2-like serine/threonine kinases appeared to be associated mostly with the asexual late infection stages in B. hordei (Fig. 5C and Table 2).

B. hordei lncRNAs may co-regulate transcripts in cis and trans

We noted that lncRNAs, mRNAs, and TEs all exhibited time point-specific expression peaks (Fig. 6A). Thus, we conducted a weighted gene co-expression network analysis (WGCNA) including all B. hordei mRNAs, lncRNAs, and TEs to identify possible co-expression networks and infer putative regulatory networks (Supplementary Fig. 7). We detected 14 co-expression networks consisting of at least ten transcripts each (Fig. 6B). Two of these networks included more than 100 genes encoding putative secreted proteins, one of which corresponded to a network of genes induced at 18–24 hpi (blue, 179 secreted) and the other to a network of genes induced at 72–120 hpi (salmon, 136 secreted). Then, we determined the genomic physical distance between the genes encoding these transcripts and calculated the Pearson’s correlation coefficient (PCC) R value with a cut-off of |R|≥ 0.7 (cis) or |R|≥ 0.9 (trans) to discover positive and negative co-regulation of lncRNAs with mRNAs (Supplementary Table 7), or a cut-off of |R|≥ 0.7 for lncRNA comparison with consensus TEs (Supplementary Table 8). We regarded cis co-regulated lncRNAs and mRNA if their maximal genomic distance was 10 kb, otherwise the respective pair was regarded co-regulated in trans. Overall, positive co-regulation was more common than negative co-regulation, and co-regulation was more common in trans than in cis. In case of lncRNA-mRNA regulons, we detected 6,018 positively and 339 negatively co-regulated trans pairs, and 109 positively and 13 negatively co-regulated cis pairs. For lncRNA-consensus TE correlations, we found 6,169 positive and 334 negative interactions in trans. Further, we detected seven negative correlation cases of TE antisense lncRNAs and their respective TE, such as BLGHnc_000540-RA located antisense of Ty1/Copia-23 and BLGHnc_002596-RA, an antisense lncRNA of Ty3/mdg4-9 (Fig. 6C and D). Meanwhile, six TE antisense lncRNAs were positively correlated with the respective TE, such as BLGHnc_004551-RA and Ty3/mdg4-48 and (Fig. 6E). Our findings suggest that some transposon antisense lncRNAs may directly regulate the TE they derive from, although the majority of these are co-regulated with distal TEs or mRNAs.

Fig. 6
figure 6

Co-expression patterns of TEs and lncRNAs in B. hordei. A We used TCseq [37] to cluster time-resolved coding gene (mRNA), lncRNA, and consensus TE expression patterns. Each line represents a single transcript; the color shade indicates the cluster membership according to Spearman correlation (the darker the color, the higher the correlation R with the respective expression cluster; see color scheme in the bottom right corner). Shades of purple (upper panel), mRNAs; shades of orange (middle panel), lncRNAs; shades of green (bottom panel), consensus TEs. The x-axis indicates the time points of infection: 0 hpi (spore germination), 6 hpi (appressorium formation), 18 hpi (early primary haustorium), 24 hpi (mature primary haustorium), 72 hpi (host colonization), and 120 hpi (conidia formation). The y-axis displays the relative z-score based on reads per kb of transcript per million mapped reads (RPKM). B Co-regulation networks were discovered using WGCNA [55]. The colored circles indicate transcripts (purple, mRNA; orange, lncRNA; green, consensus TE) and lines significant correlation between two transcripts. The WGCNA-assigned cluster colors are indicated on the top-left of each corresponding cluster of transcripts; the number of genes encoding putative secreted proteins in the respective co-expression cluster is indicated. The clusters were arranged to correspond to the time-resolved expression patterns shown above in (A). C-E The dot plots show examples of expression patterns of TE antisense lncRNA and the corresponding TE, as indicated on top of each plot. Expression values are shown as TPM (y-axis) during six time points of host infection (x-axis; Fig. 1A). The black bar indicates the median of three independent replicates (n = 3)

Conclusions

TEs are highly abundant in the genomes of powdery mildew fungi, including B. hordei [8, 9], but their transcript levels have not been measured so far. High-throughput datasets using next-generation sequencing technologies can be powerful tools to understand powdery mildew biology and evolution [56]. Here, we generated a comprehensive stranded RNA-seq dataset that contains information about all fungal and host transcripts based on random rather than oligo-dT priming during cDNA synthesis. Our dataset covers all major stages of B. hordei during the asexual infection cycle on barley, from early conidiospore germination to sporulation (Fig. 1A). Limitations of the dataset are that (1) the reads are derived from only one isolate (B. hordei K1AC; [57]) during the compatible interaction with its host barley, (2) we did not include the sexual reproduction stages, and (3) the short reads preclude precise assignment to individual TE members. We partially overcame the latter limitation by including long read-based transcriptome sequencing via ONT at a late stage of infection to recover many fungal reads and recovered 46% of lncRNA and 76% of mRNA transcripts annotated initially with the help of short reads. A higher transcript recovery rate would require additional long read-based transcriptomes from all infection stages. Despite the limitations, our data allowed us to provide a genome-wide curated annotation of lncRNAs and their alternatively spliced isoforms (Fig. 4) and to measure stage-specific transcript accumulation of coding genes, noncoding RNAs, and TEs simultaneously (Figs. 2, 5 and 6). Altogether, our findings suggest a TE regulatory network involving both (phased) sRNAs (Fig. 3) and lncRNAs (Fig. 6). Despite applying a more stringent significance cut-off for trans-regulated pairs, our analysis recovered > 10-times more trans- than cis-regulated pairs. Examples of trans-co-regulation of lncRNAs with other transcripts in fungi are rare [58]. The per-chance occurrence of co-regulated transcripts increases exponentially when removing spatial constraints; thus, global co-expression networks should be interpreted carefully.

Active TEs can dramatically change the architecture of a genome, including gene disruption or dysregulation, gene duplication, and structural variation [4, 5]. While they threaten genome integrity and thus survival, TE-induced variations also increase adaptability under stressful conditions like host immunity. This high risk-high reward relationship has been referred to as a devil’s bargain between plant pathogens and TEs [59]. Indeed, the cereal powdery mildews B. hordei and B. graminis f.sp. tritici exhibit extensive copy number variation of effector-coding genes, possibly as a consequence of the escape of the recognition of the respective gene products by host resistance proteins [10, 60, 61]. Considering the missing RIP mechanism [8, 9], powdery mildew fungi are likely to possess powerful alternative TE regulating mechanisms to limit the risk of their activity. TEs are often regulated epigenetically or by RNAi [5]. The arbuscular mycorrhiza fungus Rhizophagus irregularis, which is a broad-range mutualist of predominantly vascular plants, harbors a genome of around 150 Mb that consists to 47% of repetitive elements [62]. Interestingly, R. irregularis exhibits transcription of TEs during spore development, while TE expression is regulated by DNA methylation and RNAi [63]. Similarly, we previously found signs of sRNA phasing linked to B. hordei TEs [8, 16]. Here, we confirmed that phasiRNAs are abundant in 104 of 344 consensus TEs (Fig. 3) and found signs for time point-dependent expression of different Dicer and Argonaute genes (Supplementary Fig. 5), suggesting that RNAi is a major mechanism to regulate TE activity in fungi, including B. hordei.

We found TE antisense lncRNAs, in part spliced, in B. hordei (Fig. 3), which showed both positive and negative co-regulation with the TE they derived from (Fig. 6). TEs can give rise to novel lncRNAs in vertebrates [64, 65] and plants [32, 66]. According to the repeat insertion of domains of lncRNAs (RIDL) hypothesis, TEs can insert into and then become functional domains of lncRNAs, thus driving the emergence, evolution, and functional diversity of lncRNAs [67, 68]. TEs can also give rise to lncRNAs that contribute to regulatory networks under stressful conditions in plants [32, 69]. TE-derived lncRNAs respond differently to phosphate stress between two ecotypes in Arabidopsis thaliana [69], while mutation of the abiotic stress-responsive TE-derived lncRNA lincRNA11195 impairs the response to abscisic acid and root development in A. thaliana [70]. Interestingly, lncRNAs associated with TEs in A. thaliana exhibit high variability between accessions and developmental stages, possibly driven by the TE silencing machinery [71]. Likewise, we observed highly variable expression patterns throughout the pathogenic life cycle of B. hordei; it is conceivable that TE-derived lncRNA expression varies significantly between B. hordei isolates or Blumeria species. TE-derived lncRNAs in plants may also regulate gene expression as competing endogenous RNAs acting, for instance, as miRNA sponges, or through interaction with chromatin remodeling complexes [66]. Fungi are also able to activate TEs in response to biotic or abiotic stress [23, 31], like the Septoria leaf blotch pathogen Zymoseptoria tritici inducing expression of distinct sets of TEs under conditions of starvation or during wheat infection [72]. Likewise, the rice bast pathogen M. oryzae exhibits lncRNA expression specifically during host infection [23], which is comparable to our observations of infection stage-specific induction of lncRNAs in B. hordei throughout barley infection. Our data lends support to two scenarios: (1) TE antisense lncRNAs arise due to active TE expression and permit their own down-regulation, for example, by inducing RNAi, and (2) lncRNAs are expressed independently of the activity of the TE they derive from and fulfil distinct regulatory functions.

Appropriate transcriptional programs are key for organisms to respond effectively to environmental cues. Transcriptional programs appear to be responsive to cues from the respective host in broad-spectrum plant pathogens [73] and drive lifestyle transitions in hemi-biotrophic plant pathogens, such as Fusarium and Colletotrichum species [74, 75]. Likewise, obligate biotrophic and host-specialized pathogens, such as B. hordei, are likely responsive to various host-derived signals to activate suitable transcriptional programs throughout its pathogenic life cycle. Similar to previous findings in B. hordei infecting the hypersusceptible non-host mutant plant Arabidopsis thaliana pen2 pad4 sag101 [33], we observed distinct sets of effector candidate-encoding genes to be transcriptionally induced at specific stages of barley infection (Fig. 5B). A previous microarray study also found one class of effectors to be transcribed at pre-penetration but not post-penetration stages in B. hordei [76]. The different pathogenic stages are likely to require different sets of effectors conducting functions appropriate for the respective phase of infection [77, 78]. For instance, early infection stages prior to successful host cell penetration probably require the aggressive suppression of host immunity, while subsequent steps also depend on reprogramming of the host plants’ secretory system [79, 80], metabolism, and source-sink status [81, 82]. Likewise, we observed that stress response genes are induced in B. hordei during early infection, when the fungus faces the strongest resistance in the form of the generation of reactive oxygen species and the biosynthesis and delivery of antimicrobial compounds (Fig. 5D). Indeed, barley showed up-regulation of defense response and cell wall biogenesis-related genes, but also down-regulation of such genes at 18 hpi. Contrary to previous microarray studies [76, 83], we also observed cell cycle, cell division, and microtubule processes at the early infection stage. However, these processes may have been missed in the previous analysis as genome information was lacking and, in contrast to microarrays, which are limited to known probes, RNA-seq recovers all expressed genes in an organism. Meanwhile, genes encoding proteins functioning in sulfur, nitrogen, and organic molecule biosynthesis showed elevated transcript levels at 72 hpi, after successful colonization of the plant by B. hordei (Supplementary Fig. 6). In line with previous observations [76, 83], protein and nucleic acid metabolic processes and protein biosynthesis including translation and ribosome biogenesis were enriched after 72 hpi. At late infection stages, processes related to vegetative growth, gene expression, and metabolism dominated in B. hordei, suggestive of an undermined plant immune system and a supply of metabolites to fuel biosynthetic processes.

It has not escaped our attention that TE-derived lncRNAs might give rise to novel coding genes in the course of evolution. These genes could encode novel effectors that help establish and maintain the biotrophic interaction with the host plant. In one such case, the B. hordei effector ROP-interactive peptide 1 (ROPIP1) originates from a SINE/Eg-R1 TE and transmits to barley host cells in the course of infection, where it interacts with barley RACB and disrupts cortical microtubules [11]. Thus, ROPIP1 promotes the infection success of B. hordei on barley. The large number of TE antisense lncRNAs in B. hordei implies that B. hordei and possibly fungal plant pathogens with TE-enriched genomes in general repurpose TEs for the invention of novel genes to encode effector proteins.

Methods

Plant and pathogen cultivation

Plants of barley (H. vulgare cv. ‘Margret’) were cultivated in SoMi513 soil (Hawita, Vechta, Germany) in a long-day cycle (16 h light period at 23 °C, 8 h dark at 20 °C) at 60–65% relative humidity and a light intensity of 105–120 μmol s−1 m−2. Plants were inoculated with B. hordei strain K1AC at seven days after germination and then transferred to isolated growth chambers with a long day cycle (12 h light at 20 °C, 12 h dark at 19 °C), approximately 60% relative humidity, and 100 μmol s−1 m−2 light intensity.

RNA sampling and RNA sequencing

We inoculated the susceptible barley (H. vulgare) cultivar (cv.) ‘Margret’ with B. hordei isolate K1AC [57] and procured total RNA from leaf epidermal peelings to enrich for fungal RNAs and infected epidermal cells [84] in a time-course experiment. In three independent experimental replicates per time-point, we sampled and performed stranded RNA-seq at 0, 6, 18, 24, 72, and 120 h post inoculation (hpi) to obtain transcriptomes of all major developmental stages of the asexual pathogenic life cycle of B. hordei and its host, barley (Fig. 1A). The epidermal peels were ground to a fine powder in liquid nitrogen using mortar and pestle and RNA isolated via TRIzol (Thermo Scientific, Karlsruhe, Germany) according to the manufacturer’s instructions. Genomic DNA was removed using DNase I (RNase-free, Thermo Scientific). RNA integrity was determined by microcapillary electrophoresis (2100 BioAnalyzier system; Agilent, Santa Clara, CA, USA) and RNA quantity by spectrophotometry (NanoDrop; Thermo Scientific) and spectrofluorimetry (Qubit; Thermo Scientific). We sent 18 high-quality total RNA samples (RIN > 6.0, c[RNA] > 100 ng µL−1, m[RNA] > 2 µg) for stranded RNA sequencing at 150-bp paired-end reads, random oligomer priming to recover total RNA, and plant/animal ribosomal RNA depletion, provided by Novogene Europe, Cambridge Science Park, UK. The raw data (available at NCBI/ENA/DDBJ under project accession PRJNA835302) was inspected using FastQC v0.11.5 (Babraham Bioinformatics, UK) and overrepresented sequences identified by nucleotide BLAST in April 2021 (https://blast.ncbi.nlm.nih.gov/Blast.cgi). Reads were adapter- and quality-trimmed with Trimmomatic v0.36 [85] before further usage.

Whole-transcript sequencing with Oxford Nanopore Technologies

We sampled B. hordei-infected barley abaxial leaf epidermis at 144 hpi and used TRIzol (Thermo Scientific, Karlsruhe, Germany) for RNA isolation according to the manufacturer’s instructions, as above [84]. Then, a polyA-enriched sequencing library was generated using the Nanopore direct RNA sequencing kit (SQK-RNA002; Oxford Nanopore Technologies, Oxford, United Kingdom). The library was sequenced using MinION (ONT) technology with a R9.4.1 flow cell. Base-calling was performed via the MinKNOW app v5.5.10. Long reads were mapped using Minimap2 v2.15 [86] and the mapping quality assessed with pycoQC v2.5.0.3 [87].

Quantitative reverse transcriptase polymerase chain reaction (qRT-PCR)

We collected barley leaf epidermal peelings (see above) at 0, 6, 12, 24, 48, 72, and 96 hpi in three biological replicates. RNA isolation and quality control were done as described above. Complementary DNA (cDNA) was synthesized using the High-Capacity RNA-to-cDNA Kit (Applied Biosystems-Thermo Fisher, Schwerte, Germany) and stored at -20 °C. The cDNA was diluted 1:10 prior to further experiments. We conducted qRT-PCRs with the Takyon No ROX SYBR MasterMix blue dTTP Kit (Eurogentec, Seraing, Belgium) in a LightCycler 480 II (Roche, Basel, Switzerland). The PCR efficiency of all primers used in this study (Supplementary Table 9) was between 1.8 and 2.0 and the annealing temperature was set to 58 °C. Melting curve analysis validated in each case the synthesis of single amplicons. Expression levels of target TEs was evaluated relative to the housekeeping gene GAPDH (BLGH_00691) of B. hordei [88]. We calculated relative transcript abundance with 2−(CT(target) −CT(GAPDH)) according to the ΔΔCT method [89].

Annotation of TEs

We used the EDTA TE annotation pipeline, which integrates homology-based and de novo structure-based TE identification [90], to discover TE elements in the genome of B. hordei isolate DH14 [10]. Next, we renamed unclassified TE elements from the initial EDTA-generated library using RepeatMasker [91] with the Blumeria TE consensus sequences from RepBase v20181026 as database. Then, we manually curated the TE consensus database, as described before [35]. Briefly, we calculated a multiple sequence alignment of TE consensus sequences with MAFFT v7.475 [92] and clustered sequences, removed gaps, and deleted divergent sites using T-COFFEE v11.00.8cbe486 [93] and CIAlign v1.0.18 [94]. We manually adjusted overhangs to define TE boundaries with the help of AliView [95] and TE-Aid v.0-dev [92]. Then, the manually curated TE database, RepBase v20181026, and the SINE sequences for Eg-R1 (X86077.1) and Egh24 (Z21962.1) extracted from GenBank (absent in RepBase) were merged and CD-HIT-EST [96] was used to eliminate duplicated sequences. We then conducted genome-wide re-annotation of TEs in the genome of B. hordei DH14 with RepeatMasker [91] using the manually curated TE consensus library as database.

Detection and annotation of lncRNAs

First, we mapped the trimmed total RNA-seq read data to the B. hordei isolate DH14 genome [10] using HISAT2 [45]. We extracted mapping reads for transcriptome assembly with StringTie [43] and then filtered out coding genes from the transcriptome via Gffcompare [46], transcripts shorter than 200 bp using Gffread [46], transcripts with coding potential using CPC2 [47], and transcripts accounting for ribosomal RNA, transfer RNA, small nuclear RNA, and small nucleolar RNA via CMscan search against the Rfam database [48]. Then, we performed lncRNA annotation and classification with FEELnc [49], predicting 17,226 putative lncRNAs in the B. hordei genome. We used a WebApollo instance [44] to manually inspect the predicted lncRNA models, and to annotate all possible transcript isoforms for each lncRNA gene. For analysis of the distribution of lncRNAs in the genome of B. hordei DH14, we generated density maps of coding genes, lncRNAs, and TEs in 100-kb windows using BEDtools v2.25.0 [97] and visualized these data using the R package Rcircos v1.2.2 [98].

Functional annotation of newly discovered gene models in B. hordei

Peptide sequences deriving from gene models newly discovered in the genome of B. hordei DH14 [10] during annotation of lncRNAs were functionally analyzed for signal peptides using SignalP5.0 [99], transmembrane domains via TMHMM v2 [100], sequence homology using BLASTP (https://blast.ncbi.nlm.nih.gov accessed May 2023), and functional domains using InterProScan v5.62–94.0 [101] and hmmscan via the HMMER web server v2.41.2. accessed May 2023 [102, 103].

Cloning of lncRNA candidates

We amplified lncRNAs from isolated RNA or cDNA from samples used for RNA-seq or qRT-PCR and lncRNA-specific primer pairs (Supplementary Table 7). The PCR products were purified using Phusion High-Fidelity DNA polymerase (New England Biolabs, Frankfurt a.M., Germany) and subsequently ligated into the vector system pCR-Blunt II-TOPO via the Zero Blunt™ TOPO™ PCR cloning kit (Thermo Fisher Scientific). The lncRNA sequences were confirmed by Sanger chain-termination sequencing provided by eurofins Genomics (Ebersberg, Germany).

Time-course transcript expression analysis with RNA-seq data

We mapped the trimmed total RNA-seq data to the genomes of B. hordei isolate DH14 [10] or H. vulgare cv. ‘Morex’ [34] using HISAT2 [45] with ‘--max-intronlen 500’ and parsed output files with SAMtools v1.9 [104] and BEDtools v2.25.0 [97]. Read count tables were generated using ballgown [105]. Normalized expression was calculated as transcripts per million (TPM) and transcripts at TPM < 1 were filtered prior to further analysis using R v4.1.2 [106]. We next used TCseq [37] to cluster transcripts (lncRNAs, mRNAs, and TEs) according to their time-course specific expression patterns. For identifying co-expression connection networks, we further used WGCNA with a soft threshold of 8 [55]. Then, the unsigned adjacency was calculated via Pearson correlation to assign transcripts into putative co-expression modules of at least 10 transcripts. The module-trait correlation was calculated with infection time points as traits. Network visualization was done using Cytoscape v3.9.1 [107]. Functional enrichment of mRNA gene sets was performed using gene ontology (GO) terms with the web tool ShinyGO v0.77 [51]; GO terms were summarized by using REVIGO [52].