This work describes an analysis of the genetic variations of seven P. sojae avirulence genes through whole genome sequencing of 31 isolates in an effort to understand and explain their interaction with Rps genes. Through improved re-phenotyping, evaluation of sequence stability over time, expression analysis, and genome-wide sequence comparisons, we define new variants, copy number variations, and potential new factors of virulence of P. sojae. We further provide evidence that one haplotype of Avr1c from the reference genome is likely associated with a different phenotype. Taken together, our results showed that genomic signatures alone accurately predicted 216 of the 217 (99.5%) phenotype interactions studied, and that those signatures remained stable over time.
In the specific context of the P. sojae-soybean interaction, very little attention has been paid to the accuracy and reproducibility of phenotypic procedures when studying the interaction of avirulence and resistance genes. This situation may lead to erroneous inferences with regard to the nature of avirulence genes or the mechanisms explaining gain of virulence, as highlighted in this study. With 31 isolates interacting with seven different Rps genes from soybean, we had a total of 217 interactions to consider that linked the haplotype with the original phenotyping result from the hypocotyl test. The hypocotyl inoculation method has long been used for characterizing pathotypes of P. sojae isolates but has also encountered some limitations in the past where re-testing gave variable results in terms of virulence profiles, leading to a rate of 10–20% of false positives or negatives [25]. In our study, 26 out of 217 interactions were initially inconsistent with the observed genotype. We re-phenotyped these using a recently described hydroponic assay [26] and found that 23 out of 26 inconsistent interactions had been incorrectly phenotyped. In addition, we highlighted an incorrect phenotype for Avr1c in the reference isolate P6497. Interestingly, most of the incorrect phenotypes were false positives, namely with Avr1a, Avr1b, Avr1k, and Avr6, indicating that the hypocotyl assay, which bypasses the root system, is possibly too stringent. Genetic drift has also been proposed to explain virulence inconsistency of isolates over time [30], but targeted re-sequencing results of all the tested outliers, and of the relevant Avr gene region (Avr1c) for the three remaining outliers (3 out of 26), showed no genetic variation compared to the whole-genome sequences, ruling out the possibility of any change by mutation or contamination within the confines of our experiments (from 2015 to 2017).
Considering that of these three outliers, two are potentially explained by genomic features (distant variants putatively affecting an Avr gene in trans), this means that 216 out of 217 interactions were accurately predicted based on genomic signatures. In previous studies, expression polymorphism based on RT-PCR analysis was considered the next step to explain the gain of virulence mechanisms when the haplotype did not match the phenotype. However, downregulation of transcripts failed to explain all situations. For instance, Na et al. [10] and Shan et al. [31] observed the expression of an avirulence gene for a P. sojae isolate with a phenotype of virulence in the case of Avr1a, Avr1c, and Avr1b, respectively. In these cases, it was hypothesized that other effectors or epistatic effects could be responsible for these incongruent results [10]. While we cannot rule out the possibility of these genetic events, our study rather showed that an incorrect phenotype was the main source of discrepancy between the haplotype of Avr genes and the phenotype of P. sojae isolates. The use of the hydroponic test by Lebreton et al. [26] allowed rectification of these phenotyping inaccuracies and in particular elimination of false positives.
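The counts behind these figures can be retraced with a short piece of bookkeeping. The sketch below uses only the numbers reported in the text; the variable names and the pct helper are illustrative and are not part of the original analysis.

```python
# Bookkeeping of the interaction counts reported in the text.
# All variable names are illustrative; only the numbers come from the study.
n_isolates = 31
n_rps_genes = 7
total = n_isolates * n_rps_genes            # 217 isolate x Rps gene interactions

initially_inconsistent = 26                 # hypocotyl phenotype disagreed with the haplotype
corrected_by_rephenotyping = 23             # hydroponic assay agreed with the haplotype
explained_in_trans = 2                      # outliers with distant candidate variants


def pct(numerator, denominator):
    """Percentage rounded to one decimal place."""
    return round(100 * numerator / denominator, 1)


consistent_initial = total - initially_inconsistent
consistent_rephenotyped = consistent_initial + corrected_by_rephenotyping
consistent_final = consistent_rephenotyped + explained_in_trans
remaining_outliers = total - consistent_final

for label, n in [
    ("hypocotyl test only", consistent_initial),
    ("after hydroponic re-phenotyping", consistent_rephenotyped),
    ("after considering variants acting in trans", consistent_final),
]:
    print(f"{label}: {n}/{total} ({pct(n, total)}%)")
print(f"unexplained interactions: {remaining_outliers}")
```

Under these counts, agreement between haplotype and phenotype goes from 191 of 217 interactions with the hypocotyl test alone, to 214 after re-phenotyping, and to 216 once the two outliers with candidate variants acting in trans are counted as explained.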
For most of the avirulence genes we studied, there were many variants representing the diversity of virulence profiles inherent to the P. sojae isolates. Many of the Avr effectors we observed had been described by other groups [3, 10,11,12,13, 31]. When we compared our data with haplotype analyses from these earlier studies, robust associations could confirm many patterns and resolve discrepancies between both the phenotypes reported previously and the new findings revealed by our analyses.
For Avr1a, we noticed that the complete deletion of the gene was not the sole factor that accounted for virulence of P. sojae to Rps1a. Indeed, while the absence of the gene always conferred virulence, as many as 10 isolates still displayed a phenotype of virulence without a deletion. In an earlier study, Na et al. [10] also observed the presence of Avr1a in virulent isolates and attributed this phenomenon to gene silencing. In this work, we were able to identify new SNPs outside the Avr1a gene region that discriminated between avirulent and virulent isolates. While the functional impact of these SNPs remains unknown, it will be interesting to determine if they indeed lead to silencing of Avr1a [10, 13] or if they affect another gene involved in the virulence to Rps1a. Our data have also further refined the extent of the deletion for Avr1a, showing that it can be as large as 10.8 kb, in which case it also encompassed Avr1c. Another interesting observation was the variation in the number of copies of Avr1a among the isolates. In a previous study, Qutob et al. [13] identified a tandem array of two identical copies of Avr1a and established a link between virulence and deletion of both copies, although a few isolates were virulent in spite of the presence of the gene. Within the population of 31 isolates studied, we found that the copy number could be as high as three in more than 50% of the isolates and included isolates displaying a phenotype of virulence. However, in the latter cases, we identified haplotypes associated with this phenotype of virulence to Rps1a.
With respect to Avr1b, our results identified three distinct haplotypes among the 31 isolates. More importantly, all our tested isolates with haplotype A had an incompatible interaction with differentials carrying Rps1b or Rps1k. This contrasts with data for isolate P6497, which possesses the same haplotype, but has been reported as virulent to Rps1b (and avirulent to Rps1k), based on the hypocotyl or infiltration tests [31], a phenotype confirmed in this study by the hydroponic assay. Given the possibly different genetic background between our isolates and isolate P6497, we could also hypothesize that epistatic interactions leading to differences in gene expression as observed by Shan et al. [31] might be responsible for the different virulence profile of P6497. Table 1 presents a comparative analysis of the phenotypes attributed to the haplotypes found in Shan et al. [31] compared to our data. Because Avr1b and Avr1k are tightly linked [8], and Avr1b can also determine virulence to Rps1k [3], the table presents the phenotype to Rps1b and Rps1k linked to the haplotype. Haplotype I from Shan et al. [31] comprised isolates with different virulence profiles (virulent/avirulent to Rps1b and Rps1k). In our case, all isolates with haplotype A, corresponding to haplotype I, were avirulent to Rps1b and Rps1k following re-phenotyping except for isolate P6497. Incidentally, Shan et al. (2004) also observed a pattern of virulence with P6497 as well as an avirulent isolate with the same haplotype and attributed the differences to a higher expression of Avr1b in the latter isolate, stimulated or stabilized by another elusive gene called Avr1b-2. The other two haplotypes, B and C, revealed from our data correspond to haplotype II and IV from the previous study, and the phenotypes associated with them are identical. The fourth haplotype described by Shan et al. [31] and missing from our isolates, haplotype III, was associated with a rare pattern of virulence to Rps1b and avirulence to Rps1k.
Table 1 Comparison of haplotypes/phenotypes of 31 isolates of Phytophthora sojae evaluated in this study with data from Shan et al. [31]
A surprising feature for Avr1k was the presence of a frameshift mutation leading to an early stop codon in both haplotypes B and C, similar to the one reported by Song et al. [3]. If the truncation of the Avr1k protein makes it unrecognizable by Rps1k, this mutation should lead to a phenotype of virulence, yet isolates with haplotype B were avirulent. This phenomenon can be explained by the fact that the latter isolates share the same haplotype for Avr1b, which is seemingly recognized by Rps1k. Concerning the Avr1b/Avr1k interaction, it would be interesting to further study isolates that only show virulence to Rps1b or Rps1k to see if this pattern has evolved new or unusual haplotypes.
For three of the 31 isolates tested, deletion of Avr1c led to an expected virulence to plants carrying Rps1c. However, as with Avr1b, our data for Avr1c gave contrasting results of virulence when phenotyping the isolates with the haplotype of the reference genome (haplotype A). Re-phenotyping of the reference isolate confirmed a reaction of virulence in association with haplotype A. This suggests that Avr1c, as previously described, does not lead to a reaction of incompatibility with Rps1c, a situation that may explain why the efficacy of Rps1c has been described as unstable in the field [32]. Incidentally, Na et al. [10], who first identified Avr1c, also observed some discrepancy when phenotyping P. sojae isolates containing Avr1c, a situation they attributed mostly to gene silencing. On the basis of that suggestion, we further analyzed those isolates. Of the three remaining outliers following the phenotyping with the hydroponic assay, all isolates were associated with Avr1c and were virulent against soybean lines carrying Rps1c while associated with a haplotype that should confer an avirulent reaction. Expression analysis showed that Avr1c was significantly less expressed in these outliers compared to avirulent isolates presenting the same haplotype, which would explain the phenotypes observed. From a functional point of view, we hypothesized that this lower expression could find its origin in genomic variations. Incidentally, genome-wide sequence comparison revealed the deletion of a gene from the Sin3 family for one of the outliers, and deletion of the putative avirulence gene Avh220 for another. These results offer a potential explanation for the transient expression of the avirulence gene and propose the implication of new genes in the virulence of P. sojae to Rps1c. These findings were only made possible because of the extensive whole genome sequencing analyses. 
Further investigations are needed to confirm that these two genes are interacting with Rps1c, but their nature offers a priori evidence of their implication in virulence. Indeed, the protein encoded by the deleted gene from the Sin3 family is recognized as a regulator of transcription [33]. Computational prediction for Avh220, the second gene found to be deleted in one isolate, suggests it is a putative RXLR effector with a potential role in virulence. The mechanism by which the sole remaining outlier, isolate 45B, succeeds in escaping Rps1c is still unclear. The many unique mutations found for this isolate do not seem to be linked to any factors related to virulence, but the possibility that they lead to an epistatic interaction of one or many genes with the Avr1c gene cannot be completely dismissed. Epigenetic mechanisms might also be involved in the gain of virulence on Rps1c plants for this isolate. Another interesting aspect of Avr1c was the discovery of a new allele (haplotype D) that shared many similarities with Avr1a sequences [10]. It is well known that Avr1a and Avr1c are closely related, but reads from this allele were distinct from those that aligned against Avr1a, which would rule out the possibility of misalignment. Considering that Avr1a and Avr1c are often subjected to deletion, one could speculate that DNA repair is at play, although proof for this process is lacking in P. sojae. Finally, a rare case of heterozygous variants was observed with two isolates (haplotype C). Because this heterozygosity is not encountered all across the gene region for those isolates, we excluded the presence of two different alleles as a result of sexual segregation but attributed it instead to the observed duplication of the Avr1c gene for those two isolates, resulting in the presence of reads from both copies of Avr1c on the same locus following alignment on the reference genome.
A complete deletion of the Avr1d gene was also observed in some isolates but, unlike the case of Avr1a, a constant phenotype of virulence was associated with this deletion. Our data indeed revealed an absence of coverage along a 2.2-kb segment and another upstream deletion of 0.8 kb, separated by a segment of 177 bp, in a region including the Avr1d gene. Previously, a deletion/virulence link for Avr1d was also reported by Na et al. [34], with the distinction that the latter group observed an absence of read coverage along a shorter segment of 1.5 kb in the isolates studied. Over time, it will be interesting to determine if the difference can be explained by an evolving zone of deletion or simply a different variant.
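As an illustration of how such deletions appear in sequencing data, the following minimal sketch scans a per-base read-depth profile for runs without coverage. It is not the pipeline used in this study: the function zero_coverage_runs, its min_len parameter, and the toy depth values are hypothetical, and the toy profile only mimics, at a much smaller scale, two uncovered segments separated by a short covered stretch.

```python
def zero_coverage_runs(depth, min_len=1):
    """Return half-open (start, end) index pairs for runs where depth == 0.

    depth   : sequence of per-base read depths (hypothetical input)
    min_len : shortest run, in bases, that is reported
    """
    runs = []
    start = None
    for i, d in enumerate(depth):
        if d == 0 and start is None:
            start = i                      # a run without coverage begins
        elif d != 0 and start is not None:
            if i - start >= min_len:
                runs.append((start, i))    # the run ended at the previous base
            start = None
    # Close a run that extends to the end of the profile.
    if start is not None and len(depth) - start >= min_len:
        runs.append((start, len(depth)))
    return runs


# Toy profile (invented values): covered flank, 8-base gap,
# 2 covered bases, 22-base gap, covered flank.
toy_depth = [30] * 5 + [0] * 8 + [25] * 2 + [0] * 22 + [28] * 5
runs = zero_coverage_runs(toy_depth, min_len=5)
print(runs)
print([end - start for start, end in runs])
```

Comparing the coordinates of such runs between isolates, or between studies, is one simple way to ask whether two reported deletions share the same boundaries.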
The haplotype analysis for Avr3a has revealed two distinct alleles, and a distinctive phenotyping response separating these two haplotypes, with no outlier. In addition to the discriminant haplotypes, all virulent isolates contained only one copy of the gene while the avirulent isolates contained between two and four copies, in contrast to previous results that reported exclusively four copies in avirulent isolates [13]. The haplotypes were similar to the ones described by Dong et al. [11]. In contrast, two SNPs reported in the earlier study did not appear in any of the isolates tested, although they do not affect the haplotype sequences.
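Copy number is commonly approximated from read depth by comparing coverage over a gene with a baseline taken from regions assumed to be single copy. The sketch below illustrates that general idea only; it is not the procedure used in this study, and the function estimate_copy_number as well as all depth values are hypothetical.

```python
from statistics import median


def estimate_copy_number(gene_depths, baseline_depths):
    """Approximate copy number as mean gene depth over median baseline depth.

    gene_depths     : per-base depths across the gene of interest (hypothetical)
    baseline_depths : per-base depths across regions assumed to be single copy
    """
    baseline = median(baseline_depths)
    if baseline <= 0:
        raise ValueError("baseline depth must be positive")
    gene_mean = sum(gene_depths) / len(gene_depths)
    # round() maps the depth ratio to the nearest whole number of copies.
    return round(gene_mean / baseline)


# Invented depth values for illustration only.
baseline = [38, 40, 41, 40, 42]
print(estimate_copy_number([41, 39, 40], baseline))     # ratio close to 1
print(estimate_copy_number([118, 122, 120], baseline))  # ratio close to 3
```

In practice such ratios are noisy, so an estimate falling between two integers would call for confirmation by an independent method.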
In the case of Avr6, two distinct haplotypes emerged, clearly delineating the interactions of compatibility and incompatibility once the isolates were re-phenotyped. Because of our extensive coverage, we were able to report unique SNPs and a deletion of 15 bp further upstream that represent a clear discriminant zone between virulent and avirulent isolates. SNPs closest to the gene were also reported in P. sojae isolates by Dou et al. [12].