Introduction

Recently, a novel molecular technique called Nucleotide Binding Site (NBS) profiling was developed (Van der Linden et al. 2004). This technique specifically targets resistance genes and their analogs. Resistance (R) genes containing an NBS are numerous in plants and are distributed over all chromosomes (Meyers et al. 2002, 2003; Monosi et al. 2004). The technique is based on amplification of DNA fragments starting from the conserved NBS domain towards an adaptor which is ligated to a restriction fragment. Primers based upon several conserved motifs (P loop, the kinase-2 motif, and the GLPL motif) within the NBS domain can be used as a starting point. As a consequence this technique produces gene targeted markers, while other marker techniques like AFLP produce markers randomly in the genome. NBS profiling generates a reproducible polymorphic multi-locus banding pattern and has already been successfully used to identify and map RGAs in potato, apple and lettuce (Van der Linden et al. 2004; Calenge et al. 2005; Syed et al. 2006).

Nucleotide Binding Site profiling was used recently to assess genetic diversity in accessions of durum wheat that are commonly used in breeding programs (Mantovani et al. 2007). Results were compared to AFLP and SSR analyses. In general, the groups found with all three marker systems were similar. In this paper we describe the use of NBS profiling for phylogeny reconstruction and classification. As a model we used part of the section Petota of the genus Solanum, which also includes the cultivated potato. The potato (Solanum tuberosum L.) is a crop with a large secondary gene pool, which contains important traits that can be exploited in breeding programs. In the last few years the identification and cloning of late blight (Phytophthora infestans) resistance genes from wild relatives of the cultivated potato has been the subject of many studies (Ballvora et al. 2002; Song et al. 2003; Van der Vossen et al. 2003, 2005; Huang et al. 2005; Park et al. 2005). Systematic relationships within the group of tuber-bearing Solanum species are regarded as important criteria to select interesting materials. Based on morphological data, relationships among tuber-bearing Solanum species have been studied extensively resulting in a classification of 227 species in 19 series (Hawkes 1990). Spooner and Hijmans (2001) updated this by reducing the number of tuber-bearing species to 206. Recently, Spooner and Salas (2006) reduced the number to 188 wild and one cultivated species for section Petota, plus three species in section Etuberosa. Apparently, the boundaries between some of the species in this group are not very clear.

Relationships within the tuber-bearing Solanum species have been studied using different molecular markers, resulting in new insights at different levels of potato taxonomy. These studies include Restriction Fragment Length Polymorphism (RFLP) markers of chloroplast DNA (cpDNA) (Hosaka et al. 1984; Sukhotu et al. 2004), RFLPs of the nuclear genome (Debener et al. 1990; Miller and Spooner 1999), Amplified Fragment Length Polymorphisms (AFLPs) (Kardolus et al. 1998), Simple Sequence Repeats (SSRs) (Raker and Spooner 2002), cpDNA SSRs (Bryan et al. 1999), sequence data of 5S ribosomal DNA (rDNA) (Volkov et al. 2001), and the external transcribed spacer (ETS) (Volkov et al. 2003).

To explore the prospects of NBS profiling for systematic research we used previously obtained AFLP data (Kardolus 1998) as reference. As NBS profiling targets resistance genes and their analogs, we address the question whether this affects the outcome of the systematic analysis by comparing results from NBS profiling to results obtained from AFLP. The role of disease resistance in the evolution of species is discussed.

Materials and methods

Plant material

DNA material of one hundred and three accessions (Table 1) from Kardolus (1998), representing 49 wild and cultivated Solanum species, was used for NBS profiling. Origin of the material and number of genotypes per accession are given in Table 1. The three letter codes of a number of accessions indicate species names that since the study of Kardolus (1998) have been referred to the synonymy of other species as listed in the Table. The series abbreviations are according to Hawkes (1990).

Table 1 Accessions used for phylogeny reconstruction

NBS profiling procedure

Nucleotide binding site profiling was performed essentially as described by Van der Linden et al. (2004) with two minor modifications. Firstly, 200 instead of 400 ng of DNA was digested with a restriction enzyme. Secondly, in the study of Van der Linden et al. (2004), the restriction and adaptor ligation reactions were done separately, while in our study, the two were combined into one single reaction of 60 μl, consisting of 200 ng of DNA, 3 μl of adaptor (from a 75 μl reaction mix with 1.25 nmol of each adaptor strand, heated to 90°C and allowed to cool down to room temperature), 1 mM ATP, 10 U restriction enzyme, 5 U T4 ligase and restriction/ligation buffer (10 mM Tris.HAc pH 7.5, 10 mM MgAc, 50 mM KAc, 5 mM DTT, 50 ng/μl BSA). Afterwards, the restriction/ligation product was diluted twofold with MilliQ water and used as template for PCR reactions. The 25 μl PCR reaction mix consisted of 5 μl template DNA, 20 pmol of both adaptor primer and NBS primer, 200 μM dNTPs, 0.4 U HotStarTaq (Qiagen, Germany), and 2.5 μl HotStarTaq PCR buffer. The PCR program consisted of 30 cycles of 30 s at 95°C, 1 min 40 s at 55–60°C annealing, and 2 min at 72°C. Annealing temperature was 55°C for NBS5 and NBS9 primers and 60°C for NBS2 and NBS3 primers. After this, a second PCR reaction was performed using 5 μl of the tenfold diluted first PCR product as template, and the same NBS-specific primer but now 33P radioactively labeled, and cycling conditions similar to the first PCR. Labeled products were separated on 6% polyacrylamide gels. X-ray films were exposed to the gels to visualize individual fragments. The presence or absence of polymorphic fragments was scored on the autoradiograms and transferred into a 1 (present) and 0 (absent) binary matrix for all accessions.
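The final scoring step can be illustrated with a minimal Python sketch. The accession codes, band labels, and the `score_bands` helper are hypothetical and serve only to show how presence/absence calls map onto a 1/0 matrix; they are not data from this study.

```python
def score_bands(observed, accessions, bands):
    """Return {accession: [1/0, ...]} with one score per band, in band order."""
    return {
        acc: [1 if band in observed.get(acc, set()) else 0 for band in bands]
        for acc in accessions
    }

# Hypothetical band calls, as they might be read from an autoradiogram
accessions = ["acc_A", "acc_B", "acc_C"]
bands = ["band_1", "band_2", "band_3", "band_4"]
observed = {
    "acc_A": {"band_1", "band_3"},
    "acc_B": {"band_1", "band_2", "band_3"},
    "acc_C": {"band_4"},
}

# One row per accession, one column per polymorphic band
matrix = score_bands(observed, accessions, bands)
```

Each row of such a matrix is the character vector of one accession, which is the form of input assumed by the distance-based and parsimony analyses.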

In total, four primers (NBS2, NBS3, NBS5 and NBS9) and three enzymes (MseI, RsaI and AluI) were used (Table 2). NBS2, NBS5 and adaptor sequences were the same as described by Van der Linden et al. (2004). NBS3 and NBS9 primer sequences were as follows: NBS3 5′-GTWGTYTTICCYRAICCIGGCATICC-3′ and NBS9 5′-TGTGGAGGRTTACCTCTAGC-3′. Positions of the NBS primers in the NBS domain are shown in Fig. 1.
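Both primer sequences contain ambiguity codes (W, Y, R) and, in NBS3, inosine (I). As a rough illustration of what this means for the primer pool, the Python sketch below counts how many distinct sequences each degenerate primer represents. The `degeneracy` helper is hypothetical, and treating inosine as a single universal base rather than as a mixture is an assumption made here.

```python
# Number of bases represented by each IUPAC nucleotide code
IUPAC_COUNTS = {
    "A": 1, "C": 1, "G": 1, "T": 1,
    "R": 2, "Y": 2, "W": 2, "S": 2, "K": 2, "M": 2,
    "B": 3, "D": 3, "H": 3, "V": 3, "N": 4,
    "I": 1,  # inosine: assumed to be one universal base, not a mixture
}

def degeneracy(primer):
    """Number of distinct sequences encoded by a degenerate primer."""
    n = 1
    for base in primer.upper():
        n *= IUPAC_COUNTS[base]
    return n

NBS3 = "GTWGTYTTICCYRAICCIGGCATICC"
NBS9 = "TGTGGAGGRTTACCTCTAGC"

nbs3_variants = degeneracy(NBS3)  # W, Y, Y, R -> 2 * 2 * 2 * 2
nbs9_variants = degeneracy(NBS9)  # one R -> 2
```

Under these assumptions NBS3 corresponds to 16 sequence variants and NBS9 to 2.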

Table 2 Number of polymorphic NBS markers for different primer/enzyme combinations
Fig. 1 Schematic representation of the nucleotide-binding site (NBS) of disease resistance genes and positions of primers used in this study. P-loop, kinase-2 and GLPL refer to highly conserved motifs in the NBS. Primer positions are indicated by horizontal arrows

Annotation of NBS profiling sequences

To determine the RGA content of NBS profiling markers, 384 bands were randomly chosen and excised from the gel, re-amplified with PCR conditions identical to the first PCR of the NBS profiling protocol and purified with Qiaquick PCR purification spin columns (Qiagen). After inspection on agarose gels, fragments were directly sequenced using the adaptor primer as a sequencing primer with the BigDye Terminator kit and an ABI 3700 automated sequencer from Applied Biosystems (USA). Sequences of poor quality were excluded from further analysis, and the remaining sequences were compared to archived sequences in the NCBI nucleotide databases (11 Jan 2006) using BLASTN and BLASTX (Altschul et al. 1997). Sequences that were significantly similar to known R genes and RGAs (either E value lower than 1e-05 for BLASTN, or E value lower than 1e-03 for BLASTX) were regarded as RGAs.
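The annotation rule above amounts to a simple either/or threshold. A minimal Python sketch is given below; the `is_rga` function and the example E values are hypothetical, and the E values are assumed to refer to the best hit against a known R gene or RGA.

```python
BLASTN_MAX_E = 1e-05  # threshold stated in the text for BLASTN
BLASTX_MAX_E = 1e-03  # threshold stated in the text for BLASTX

def is_rga(blastn_e=None, blastx_e=None):
    """True if either search gives an E value below its threshold.

    Pass None when a search returned no hit to a known R gene or RGA.
    """
    blastn_ok = blastn_e is not None and blastn_e < BLASTN_MAX_E
    blastx_ok = blastx_e is not None and blastx_e < BLASTX_MAX_E
    return blastn_ok or blastx_ok

# Hypothetical bands: (best BLASTN E value, best BLASTX E value)
examples = {
    "band_a": (1e-20, None),  # strong nucleotide hit
    "band_b": (1e-04, 1e-04), # fails BLASTN, passes BLASTX
    "band_c": (1e-04, 0.5),   # fails both
    "band_d": (None, None),   # no hits at all
}
calls = {name: is_rga(*evalues) for name, evalues in examples.items()}
```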

NBS profiling and AFLP data analysis

AFLP data from three primer combinations were available from Kardolus (1998). Two AFLP datasets (from the primer combinations E32M49 and E35M48) and one NBS profiling dataset were used in this study. Neighbor joining (NJ) and parsimony analyses were performed with PAUP* version 4.0 b10 (Swofford 2001). For the parsimony analysis, we used a two-step search strategy. (1) The first heuristic search was conducted with 10,000 random additions, holding ten trees, and saving five trees per search replicate. (2) Resulting trees from the first search were used as starting trees to swap to completion with MULTREES and TBR. A jackknife analysis (10,000 replicates) was performed with the same settings as the heuristic search. In the parsimony analyses, three accessions from the series Etuberosa Juz. were used as outgroups.

Congruence between AFLP and NBS profiling was assessed in three ways: (1) visual qualitative comparisons of the trees obtained, (2) distance matrix-based comparisons, and (3) character-based comparisons.

For the distance matrix-based comparisons, the program SIMQUAL was used to compute similarity matrices using the DICE option, which ignores shared absent bands, and is an appropriate algorithm for dominant markers like AFLP or NBS profiling markers. A pairwise comparison for these matrices with the Mantel test in NTSYS-pc version 2.10j (Rohlf 1992) was made. For the character-based comparisons, the partition homogeneity test (Farris et al. 1995) was performed in PAUP* version 4.0 b10 (Swofford 2001). This test is also called the incongruence length difference (ILD) test. ILD tests were performed with 100 replicates, the heuristic search option, and TBR and MULPARS in effect.
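For readers unfamiliar with the Dice coefficient, a minimal Python sketch follows. It is not the SIMQUAL implementation; the function name and the example profiles are hypothetical, and returning 0.0 when neither accession has any band is a convention chosen here. With a the number of bands shared by two accessions and b and c the numbers of bands present in only one of them, the coefficient is 2a / (2a + b + c).

```python
def dice_similarity(x, y):
    """Dice coefficient for two equal-length 1/0 band profiles."""
    a = sum(1 for i, j in zip(x, y) if i == 1 and j == 1)  # shared presences
    b = sum(1 for i, j in zip(x, y) if i == 1 and j == 0)  # only in x
    c = sum(1 for i, j in zip(x, y) if i == 0 and j == 1)  # only in y
    denom = 2 * a + b + c
    if denom == 0:
        return 0.0  # convention chosen here: no bands in either profile
    return 2 * a / denom

# Hypothetical profiles: a = 2, b = 1, c = 1, so the coefficient is 4/6
profile_1 = [1, 1, 0, 0, 1]
profile_2 = [1, 0, 0, 1, 1]
s = dice_similarity(profile_1, profile_2)

# Appending bands absent from both accessions leaves the value unchanged
s_padded = dice_similarity(profile_1 + [0, 0], profile_2 + [0, 0])
```

The second call illustrates why shared absent bands do not inflate similarity under this coefficient.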

Results

NBS profiling on Solanum species

Nucleotide binding site profiling produced readily scorable banding patterns on a gel; part of such a gel is shown in Fig. 2. For the whole set of 103 accessions, ten primer/enzyme combinations were tested. Each combination produced 23–66 scorable polymorphic bands (Table 2). A dataset of 362 characters was produced.

Fig. 2 Part of an NBS profiling gel obtained with primer NBS9 and restriction enzyme MseI

To obtain information on the RGA content of the bands that make up the NBS profiles, 384 bands were excised, reamplified and sequenced. Of these, 232 produced readable sequences and 155 bands (67%) could be annotated as RGA, indicating that NBS profiles indeed largely consisted of markers derived from R genes and their analogs. The six bands with the highest similarity to known resistance genes are listed in Table 3. Most of the remaining fragments (33%) could not be annotated. Only in a few cases was significant similarity to other known sequences, such as retrotransposons, found. Table 2 also shows the percentage of RGA-derived bands for each primer. The RGA content of the bands obtained with the NBS5 primer was 94%. For NBS9, NBS2 and NBS3, these figures were 57, 63 and 61%, respectively.

Table 3 NBS profiling bands with high similarity to known resistance (R) genes and R gene cluster members

Cladistic results

The NBS profiling dataset was composed of 362 characters, 351 of which were parsimony informative. Sixteen most parsimonious 3,553-step trees were produced, with a consistency index of 0.102 and retention index of 0.464. The strict consensus tree obtained is shown in Fig. 3. Numbers above the branches indicate jackknife values >50%.
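As a reading aid for these indices, the Python sketch below uses the standard ensemble definitions: the consistency index is the minimum possible number of steps divided by the observed tree length, and the retention index is (g - s) / (g - m), with g the maximum possible number of steps, s the tree length and m the minimum. The function names are hypothetical. The calculation assumes that each of the 362 binary characters requires at least one step; under that assumption the reported consistency index for the NBS profiling dataset is reproduced. The maximum number of steps is not given in the text, so the retention index is shown as a definition only.

```python
def consistency_index(min_steps, tree_length):
    """Ensemble consistency index: minimum possible steps / observed steps."""
    return min_steps / tree_length

def retention_index(min_steps, max_steps, tree_length):
    """Ensemble retention index: (g - s) / (g - m)."""
    return (max_steps - tree_length) / (max_steps - min_steps)

n_characters = 362   # NBS profiling characters (from the text)
tree_length = 3553   # steps in the most parsimonious trees (from the text)

# Assumption: every binary character changes state at least once
min_steps = n_characters

ci = consistency_index(min_steps, tree_length)
ci_rounded = round(ci, 3)
```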

Fig. 3 NBS profiling strict consensus tree. Numbers above the branches are jackknife supports based on 10,000 replicates (%). Jackknife support values lower than 50% are not shown. ACA and TUB are abbreviations for series Acaulia and Tuberosa as outlined by Hawkes (1990). Clade ACA includes a group of taxa from series Acaulia together with S. demissum from series Demissa. TUBc indicates the cultivated species from series Tuberosa. BOL includes accessions from series Tuberosa from Bolivia, Argentina and Chile

The three non-tuber-bearing Solanum species (outgroup species: brd2, etb1 and etb2) were basal to the tree. Next came a polytomy consisting of (1) the diploid Mexican species (bst1, bst2, blb, les, pnt1), (2) diploid South American species (mcq1, mcq2, pcs1, pcs2), (3) the representatives of series Circaeifolia (crc and qum), (4) the species S. lignicaule, and (5) all other species. Within the latter group four clades could be distinguished, although only one of them had statistical support: a group of Mexican polyploids including the diploid S. verrucosum; a group consisting of taxa from series Acaulia, including S. demissum from series Demissa, with a number of species of series Megistacroloba as a sister clade; and two groups with mainly species from series Tuberosa, divided into a Bolivian and a Peruvian group, the latter also containing a number of cultivated species. Only the Acaulia group had a high jackknife support of 99%.

The AFLP datasets resulting from the two chosen primer combinations were found to be congruent with each other both for the Mantel test and ILD test (data not shown). For this reason we combined the two AFLP datasets for the comparison with NBS profiling.

The AFLP dataset consisted of 591 characters, among which 539 were informative. Parsimony analysis of AFLP data yielded eight equally parsimonious 1,548-step trees with a consistency index of 0.169 and retention index of 0.542. The strict consensus tree is shown in Fig. 4.

Fig. 4 AFLP strict consensus tree. For details see Fig. 3

The outgroup species were again at the base of the tree, followed by the Mexican diploids, the taxa of series Circaeifolia, and relatively primitive South American species. The remainder of the tree was subdivided into three clades: a group of Peruvian species of series Tuberosa with the accessions of the cultivated species; a group of taxa from series Acaulia, including S. demissum, forming a polytomy with a number of species from series Megistacroloba; and a large polytomy including mainly the Bolivian representatives of series Tuberosa, with the Mexican polyploids nested within.

In both the NBS profiling and AFLP-based strict consensus trees, the more basal branches either had poor jackknife support (lower than 50%) or formed a polytomy. To investigate the influence of polyploid taxa on the poor resolution of the trees, we excluded all polyploids from the analysis. Two new datasets, containing the marker data of 72 accessions from the diploid species, were produced. Parsimony and jackknife analyses were performed in the same way as with the complete dataset. Excluding the polyploid species did not improve the trees; the polytomies and basal nodes with jackknife support lower than 50% were still present.

At the lower (accession) level, common well-supported subclades were found in both the NBS profiling and AFLP strict consensus trees. In total, 17 subclades (containing two or three accessions) with a jackknife support higher than 70% were observed (Table 4).

Table 4 Common groups recognized both from AFLP and NBS profiling tree

Phenetic results

NBS profiling and AFLP phenograms had very high cophenetic correlation coefficients of 0.96 and 0.93, respectively, indicating excellent fits of the similarity matrices to the resulting phenograms. The NJ trees of both NBS profiling and AFLP data basically showed the same groups as the cladograms. Except for the branch leading to the outgroups, many of the interconnecting branches were very short and jackknife support was low. Average similarities for AFLP and NBS profiling were 0.29 and 0.52, respectively.

Congruence between AFLP and NBS profiling derived data

As described above, a visual inspection of the trees produced from NBS profiling and AFLP data showed that they were similar in many aspects, although many details were different. The correlation coefficient between the NBS profiling and AFLP similarity matrices was 0.737, and that between the cophenetic value matrices derived from the NBS profiling and AFLP trees was 0.762. The Mantel test showed that both correlations were significant at α = 0.05 (P = 0.001), indicating that the NBS profiling and AFLP derived similarity matrices, as well as the tree topologies, are congruent.
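The logic of a Mantel test can be sketched in a few lines of Python. This is a generic permutation version, not the NTSYS-pc implementation; the function names, the two small similarity matrices, the number of permutations and the random seed are all hypothetical choices made for illustration. The test correlates the off-diagonal elements of two matrices and compares the observed correlation with correlations obtained after jointly permuting the rows and columns of one matrix.

```python
import random
from math import sqrt

def upper_triangle(m, order=None):
    """Off-diagonal upper-triangle values, optionally after relabeling rows/columns."""
    n = len(m)
    idx = order if order is not None else list(range(n))
    return [m[idx[i]][idx[j]] for i in range(n) for j in range(i + 1, n)]

def pearson(x, y):
    """Pearson correlation of two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def mantel(m1, m2, permutations=999, seed=0):
    """Return (observed r, one-sided permutation P value)."""
    rng = random.Random(seed)
    x = upper_triangle(m1)
    r_obs = pearson(x, upper_triangle(m2))
    n = len(m1)
    hits = 1  # the observed arrangement counts as one permutation
    for _ in range(permutations):
        order = list(range(n))
        rng.shuffle(order)
        if pearson(x, upper_triangle(m2, order)) >= r_obs:
            hits += 1
    return r_obs, hits / (permutations + 1)

# Hypothetical symmetric similarity matrices for four accessions
sim_a = [[1.0, 0.8, 0.3, 0.2], [0.8, 1.0, 0.4, 0.1],
         [0.3, 0.4, 1.0, 0.7], [0.2, 0.1, 0.7, 1.0]]
sim_b = [[1.0, 0.7, 0.2, 0.3], [0.7, 1.0, 0.3, 0.2],
         [0.2, 0.3, 1.0, 0.6], [0.3, 0.2, 0.6, 1.0]]

r, p = mantel(sim_a, sim_b)
```

In a scheme of this kind the smallest attainable P value is 1 / (permutations + 1), so it depends on how many permutations are run.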

In contrast to the Mantel test, the ILD test produced a different result. A P value of 0.01 was observed, showing that NBS profiling and AFLP were incongruent with each other (α = 0.05).

Discussion

NBS profiling in Solanum

When introducing a new tool for systematic analysis, one first needs to compare the results from the new tool with results from established techniques, like AFLP (Vos et al. 1995; Kardolus et al. 1998). Our study aimed at such a comparison. We first investigated whether NBS profiling produces markers derived from resistance genes and their analogs with a high frequency. Table 2 clearly shows that the majority of the bands were derived from resistance genes and their analogs. The primers giving the lower RGA content (NBS2, NBS3 and NBS9) generate fragments outside of the conserved NBS domain. Annotation of the fragments as putative RGAs relies on sequence similarity with other RGAs. Within the NBS domain, sequence conservation is high (hence the high RGA content within the set of fragments generated with NBS5 from within the NBS domain), whereas outside of the NBS domain, sequence conservation between different RGAs is much lower. Therefore, the observed RGA content is most likely an underestimation. This was supported by the observation that some bands could not be annotated during our first analysis in August 2004, while the second analysis, with an updated version of the database (in January 2006), yielded a positive identification for several previously unidentifiable bands. Also, the annotation became more accurate. An example was the band g0739 mq21 obtained with the primer NBS5 in the S. bulbocastanum accession BGRC 8009 (Table 3). The first analysis showed that this band had the highest similarity with the Mi gene of tomato. During the second annotation round, this fragment had the highest similarity (97%) with the late blight resistance gene Rpi-blb2 (Van der Vossen et al. 2005).

Systematic relationships inferred from NBS profiling and AFLP

The topology of the cladogram from our AFLP analysis is comparable to that of Kardolus (1998), which is not surprising since we studied a subset of his material (because not all of the DNA was still available). Kardolus (1998) distinguished more groups in his trees, but in his results, too, the Mexican diploid species and primitive South American species like S. circaeifolium and S. mochiquense are placed in a basal group. Furthermore, a group of representatives of series Acaulia can be recognized, closely related to the Mexican hexaploid species S. demissum, and the species belonging to the series Tuberosa are separated into a Peruvian (including the cultivated material) and a Bolivian group. He only showed bootstrap support values for the groups in his NJ tree, and generally the values for deeper branches are lower than 50%. The pattern from our analyses of the NBS profiling dataset is similar to that of Kardolus (1998).

This general pattern in our AFLP and NBS profiling results supplements the results of the most extensive cpDNA RFLP study of section Petota (Spooner and Castillo 1997). They recovered four clades, with representatives of the Mexican diploids in clade 1 and 2, species like S. mochiquense and S. paucissectum in clade 3, and a large polytomy of the other investigated species in clade 4. Spooner et al. (2005a) performed phylogenetic analyses of AFLP data of 362 individual wild (261) and landrace (98) members of section Petota. The strict consensus tree also has bootstrap values lower than 50% for the deeper branches (except the designated outgroup). In rDNA ETS results (Volkov et al. 2003) only three structural variants were found, with variant A present in the non-tuber-bearing species of series Etuberosa and in the representatives of the Mexican diploid series, variant B in series Circaeifolia, and variant C in all other investigated species. The dendrograms presented show many polytomies, indicating that resolution within the groups is mostly lacking.

When comparing the cladograms from AFLP and NBS profiling data (Figs. 3, 4) visually, there were congruencies at the lower level, with accessions grouping together in the same way in both trees. These groups were supported with jackknife values >70% (Table 4). When looking at the more basal nodes, we found only a few groups (P and ACA) which were well supported by relatively high jackknife values. A number of other groups were present, but they lacked statistical support. Again this was similar for both trees. Two different statistical tests were used to evaluate the congruency between the AFLP and NBS profiling based trees. The outcomes of the two tests differed: whereas congruency was indicated by the Mantel test, the ILD test indicated incongruency. Similar observations were made by Spooner et al. (2005b), who compared AFLPs with other markers for phylogenetic inference in wild tomatoes. In their Mantel test, the comparison of cpDNA/GBSSI (granule-bound starch synthase gene) gave a high matrix correlation coefficient (0.831), but the same comparison failed to pass the ILD test. The suitability of the ILD test has also been questioned by other researchers (Graham et al. 1998; Yoder et al. 2001; Barker and Lutzoni 2002; Darlu and Lecointre 2002). In view of the problems with the ILD test and the significant correlation found with the Mantel test, we conclude that there is congruency between the AFLP and NBS profiling derived tree topologies.

The interesting observation that the overall pairwise similarity based on NBS profiling markers was 25% higher than the overall similarity based on AFLP markers suggests that NBS profiling markers are more conserved than AFLP markers. Targeting more conserved regions renders the marker system more appropriate than AFLP when materials are more diverse, as it will reduce the chance of homoplasy.

The fact that the basal branches either had poor jackknife support (lower than 50%) or formed a polytomy in both the AFLP and NBS profiling tree, probably results from the nature of the studied material. Apparently, Solanum species are more similar to each other than expected on the basis of morphological characters, even when the polyploid species were excluded from the analysis. Extensive hybridization and introgression among species might be the reason for the poor resolution at the more basal nodes. An alternative explanation could be that many Solanum species have evolved in a relatively short period of time after rapid radiation over South America, which would result in species with distinct characters but with no clear sequential branching order (apart from originating from the common ancestor).

The role of disease resistance in speciation

Many NBS profiling markers were shown to be RGA-related (Table 2). NBS profiling preferentially generates markers in resistance genes. These genes are likely to be under selection, which might influence the outcome of the phylogenetic analysis. Disease resistance might play a role in the speciation process, and it may thus be questioned whether NBS profiling can be used for phylogeny reconstruction. Our results demonstrated that systematic relationships inferred from NBS profiling data do not essentially differ from those inferred from AFLP data. This congruence between AFLP and NBS profiling may not be as unexpected as it appears to be. Plants have to deal with many different pathogens during their lifetime and are thus exposed to selective pressures in different directions. This is also evidenced by the large number of NBS-LRR resistance genes present in plants. However, it is possible that a single resistance gene was essential for species survival, or for a speciation event. The R gene(s) that may have been under selection after a period of disease pressure by a pathogen may have spread relatively fast to neighboring species through hybridization. The specific effect of selective pressure on R genes will therefore be detectable only on a very short evolutionary time scale, and is diluted when many markers are analyzed phylogenetically. In addition, the selective advantage of retaining a specific R gene will most likely be reflected by the absence or presence of one or a few markers in NBS profiling. The outcome of the phylogenetic analysis is based on all NBS profiling markers, the majority of which were not affected by selective pressure. A single marker or a few markers will very likely not influence the outcome.

In conclusion, NBS profiling is at least as good as AFLP for phylogeny reconstruction and might even be superior when more diverse material is used, as it will reduce the chance of homoplasy. The observation that higher level groups were poorly supported might be due to the nature of the material and the way the species evolved.