Introduction

The genus Betula contains trees and shrubs from diverse habitats in boreal and temperate climate zones of the Northern Hemisphere. Estimates of the number of species range from 30 to 60 (De Jong 1993; Furlow 1990). The genus is placed within the Betulaceae family of the order Fagales. In Northern Europe, pollen of birch is a major cause of hay fever complaints, as is pollen of the Fagales species hazel and alder (Breiteneder et al. 1989, 1992; Lüttkopf et al. 2002). The major allergen involved is Bet v 1 of which several variants exist that may differ in their allergenicity (Ferreira et al. 1996, 1997; Schenk et al. 2006). Given the socioeconomic impact of hay fever, birch represents a relevant target for the development of allergy prevention strategies. Selection and breeding of potential hypoallergenic birch trees requires knowledge on the genetic background of the available birch species, as the evolution of allergenic proteins is linked to the evolution of the species in which they are found. Therefore, phylogenetic relationships among Betula species may be used to predict allergenicity of birch species.

The taxonomy of Betula is controversial, and various classifications have been proposed. The first monographer who provided an extensive review of the genus was Regel (1865), who divided the genus into subgenera Betulaster and Eubetula. Subgenus Betulaster contains only one section, the Acuminatae. Subgenus Eubetula comprises six sections, namely Costatae, Lentae, Nanae, Albae, Fruticosae, and Dahuricae. Winkler (1904) proposed a slightly different division, lowering the status of the sections to that of subsections, merging the Fruticosae and the Dahuricae with the subsection Albae and merging the Lentae with the Costatae. More recently, De Jong (1993) proposed a division into five subgenera, namely Betulenta, Betulaster, Neurobetula, Betula and Chamaebetula. Subgenus Betulenta is considered the most primitive subgenus, followed by Betulaster and Neurobetula. Neurobetula is considered a very heterogeneous and partly artificial group (De Jong 1993). The subgenera Chamaebetula and Betula are considered to be more derived.

The basic chromosome number of Betula is n = 14, and the species form a series of polyploids with chromosome numbers of 2n = 28, 56, 70, 84, 112, and 140 (Furlow 1990). Polyploidy is a common feature among Betula species, and its presence within at least four of the five recognized subgenera suggests several independent polyploidization events. Hybridization and introgression are common in situations where the natural distributions of birch species overlap, for example among the European birch species B. pendula, B. pubescens, and B. nana (Palme et al. 2004). Moreover, several of the recognized Betula species have a hybrid origin (Nagamitsu et al. 2006). Hybrids generally show a morphology intermediate between the parental species but are not always morphologically distinct as a group (Thórsson et al. 2001). This overlap in morphological features complicates species and hybrid identification. Introgression appears to be bidirectional (Williams and Arnold 2001) but asymmetrical (Palme et al. 2004). Hybridization and introgression are further facilitated by the introduction and distribution of artificially propagated cultivars outside the natural distribution range. The simultaneous occurrence of polyploidization, extensive hybridization, and introgression complicates taxonomical studies in the genus Betula. In addition, several morphological characters are likely to have evolved independently more than once or have experienced parallel evolution (Li et al. 2005).

Given the difficulties with morphological characters in reconstructing species relationships within the genus Betula, alternative markers were explored, e.g., flavonoid composition (Keinänen et al. 1999), nuclear deoxyribonucleic acid (DNA) sequences (Järvinen et al. 2004; Li et al. 2005; Nagamitsu et al. 2006), a microsatellite (Nagamitsu et al. 2006), and chloroplast DNA sequences (Järvinen et al. 2004). These markers have provided useful information on the evolution of the genus, but relationships between species remain largely inconclusive due to their limited variation. For example, the chloroplast matK sequences examined by Järvinen et al. (2004) differentiated only three North American species from the other species. Moreover, trees constructed from different nuclear DNA markers display incongruences, e.g., nuclear ribosomal internal transcribed spacers (ITS) versus microsatellite sequences (Nagamitsu et al. 2006) and ITS versus ADH sequences (Järvinen et al. 2004; Li et al. 2005).

In the present study, we examined the use of AFLPs as an alternative for morphological markers, chloroplast DNA sequences, and nuclear DNA sequences. AFLP is a DNA-fingerprinting technique that generates large numbers of highly reproducible fragment markers with a genome-wide distribution. The technique is relatively fast and cost efficient and requires no prior knowledge of the genome (Jones et al. 1997; McGregor et al. 2000; Russell et al. 1997; Vos et al. 1995). Relative to morphological markers, AFLPs have the advantage that they are not under direct selection pressure, since most of the fragments represent noncoding parts of the genome (Vos et al. 1995). AFLPs are more variable than chloroplast sequences (Koopman et al. 2008). Moreover, AFLPs represent both paternal and maternal lineages because they are almost entirely derived from the nuclear genome (Althoff et al. 2007). Compared to nuclear DNA sequences such as ITS, AFLPs have the advantage that they are more variable and that they are sampled across the entire genome rather than in a specific location (Koopman 2005). However, AFLPs also have drawbacks that potentially may hamper their use as phylogenetic characters (reviewed in Koopman, 2005), most notably a possible lack of homology between fragments across taxa (Althoff et al. 2007). Several studies have shown that homology assignment between AFLP fragments decreases with increasing evolutionary distance between taxa (Althoff et al. 2007; Koopman 2005). Koopman (2005) contrasted AFLP variation with ITS sequence divergence in a large number of taxa and concluded that AFLPs are reliable phylogenetic markers for plant taxa with ITS sequences differing up to 30–35 nucleotides. A GenBank survey for the species in the present study revealed that ITS sequence differences among ingroup species ranged from 0 to 22 nucleotides, which is well within the range defined by Koopman (2005). 
Therefore, it is expected that the AFLP marker variation in our data set is a suitable indicator of Betula relationships. Arens et al. (1998), Cervera et al. (2005), Ziegenhagen et al. (2008), and Smulders et al. (2008) demonstrated in poplar that the AFLP pattern of hybrid offspring contains bands of both parental species. Therefore, the comparison of AFLP patterns of taxa may serve to identify hybrids.

The objectives of the present study were (1) to reconstruct the phylogeny of Betula, while positioning and identifying hybrid taxa and cultivars, and (2) to evaluate the (sub)sections proposed by Regel (1865) and Winkler (1904), and the subgenera proposed by De Jong (1993). The division of De Jong (1993) will be used as a starting point. Species from all sections and subgenera proposed by the abovementioned authors were included, as were several hybrid taxa.

Materials and methods

Plant material

We collected young leaves from 62 Betula accessions in the botanical collections of Applied Plant Research, Unit Nursery Stock (Boskoop, The Netherlands), the Botanical Garden of Wageningen University (Wageningen, The Netherlands), and the Von Gimborn Arboretum (Doorn, The Netherlands). In addition, leaves were collected from ten accessions of B. pendula in a birch seed orchard in Urk (The Netherlands) and 31 cultivated Betula accessions growing as lane trees in Ede (The Netherlands) and Munich (Germany; Table 1). The accessions were originally attributed to 23 species and five interspecific hybrids based on descriptions and names available from the botanical collections. The phenetic analysis revealed that 11 accessions, cultivars mostly, did not group with the expected taxon. Nine suspected misclassified accessions were labeled as hybrids after the evaluation of the ploidy levels (Table 1). The morphology of the remaining two accessions did not match with the taxon suggested by the original label, and in one of these accessions, the ploidy level did not match either. Both accessions were tentatively assigned to the correct species (Table 1). All (sub)sections and subgenera proposed by Winkler (1904) and De Jong (1993) were represented by at least two species, except for subgenus Betulaster that was represented by a single species. Based on the results of Chen et al. (1999) and Li et al. (2005), two Alnus and two Corylus accessions were included as outgroups. Taxonomical names of (sub)sections and subgenera follow De Jong (1993).

Table 1 Plant material

Flow cytometry

Fresh leaf samples were sent to Plant Cytometry Services (Schijndel, The Netherlands) to determine the ploidy level. Ploidy levels were estimated by flow cytometry as described in Koopman (2000). Diploid (B. pendula) and tetraploid (B. pubescens) controls were included.

AFLP genotyping

For DNA extraction, young leaves of approximately 1 cm² were collected, immediately frozen in liquid nitrogen, and subsequently freeze-dried for storage. Total genomic DNA was extracted with the DNeasy 96 Plant Kit (Qiagen, Venlo, The Netherlands) from ground leaf tissue according to the manufacturer’s instructions. The AFLP assay (Vos et al. 1995) was performed after digestion/ligation with the 6-bp cutting enzyme EcoRI and the 4-bp cutting enzyme MseI, followed by a two-step polymerase chain reaction (PCR) amplification protocol (Arens et al. 1998) with the modification of using IRD700 fluorescence-labeled primers instead of ³³P-labeled primers. We used three selective primer combinations (Bonin et al. 2004): EcoRI 5′-GACTGCGTACCAATTCAGT-3′/MseI 5′-GATGAGTCCTGAGTAACTC-3′, EcoRI 5′-GACTGCGTACCAATTCATG-3′/MseI 5′-GATGAGTCCTGAGTAACAC-3′, and EcoRI 8 5′-GACTGCGTACCAATTCATG-3′/MseI 5′-GATGAGTCCTGAGTAACAA-3′. Amplified fragments were separated on 6.5% denaturing polyacrylamide gels and analyzed on a LI-COR 4300 DNA analyzer (LI-COR Biosciences, Lincoln, NE, USA). Three accessions failed to produce a scorable AFLP pattern due to incomplete digestion. The decaploid species B. medwediewii was excluded because it showed an excessive number of bands, which would hamper a reliable analysis.

AFLP data analysis

LI-COR TIFF images were imported into QUANTAR software (Keygene, Wageningen, The Netherlands). Two standard samples were run on each gel to allow automatic positioning of marker bands. Presence (1) or absence (0) of polymorphic AFLP bands was scored for all accessions in the range from 100 to 450 bp. Only intense and well-separated bands were scored. The primer combinations yielded 119, 113, and 89 AFLP markers, respectively (321 in total). Eight duplicate accessions were included as controls.

Several accessions were present in duplicate in our data set. The vast majority of these were identical, but occasionally, one band was scored differently. The calculated Dice similarity was, however, always above the 98.5% limit that was indicated by Arens et al. (1998) to allow for an error in duplicated samples. All accessions displaying more than 98.5% similarity potentially represent clones and were removed from further analyses (Table 1). The phenetic analyses were conducted on a data set containing 87 unique accessions, including hybrids. Similarity matrices based on the Jaccard and Dice coefficients were calculated using NTSYSpc 2.10j (Applied Biostatistics, Setauket, NY, USA). Dendrograms were subsequently constructed using neighbor-joining (NJ) analysis. Cophenetic matrices were calculated from the resulting dendrograms, and the product–moment correlation between cophenetic and similarity matrices was calculated to test the goodness of fit of the cluster analysis. The Mantel test for matrix correspondence was performed with 1,000 permutations. The combination with the best fit (Dice + NJ) was chosen for phenetic analysis. To allow bootstrapping, the analysis was repeated in PAUP 4.0b10 (Swofford 2002) using Nei–Li distances (=1 − Dice similarity) and 1,000 bootstrap replicates.
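As an illustration of the similarity measures named above, the following minimal sketch computes Dice and Jaccard similarities and the Nei–Li distance (1 − Dice) from 0/1 band profiles and flags pairs above the 98.5% limit as potential clones. It is not the NTSYSpc or PAUP implementation; the accession names and band profiles are hypothetical toy data, and missing values are not handled.

```python
from itertools import combinations


def dice(a, b):
    """Dice similarity 2*n11 / (n_a + n_b) for two 0/1 band profiles."""
    shared = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    total = sum(a) + sum(b)
    return 2 * shared / total if total else 1.0


def jaccard(a, b):
    """Jaccard similarity n11 / (n11 + n10 + n01) for two 0/1 band profiles."""
    shared = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    union = sum(1 for x, y in zip(a, b) if x == 1 or y == 1)
    return shared / union if union else 1.0


def nei_li_distance(a, b):
    """Nei-Li distance, defined in the text as 1 - Dice similarity."""
    return 1.0 - dice(a, b)


def potential_clones(profiles, threshold=0.985):
    """Return accession pairs whose Dice similarity exceeds the threshold."""
    return [(p, q) for p, q in combinations(sorted(profiles), 2)
            if dice(profiles[p], profiles[q]) > threshold]


# Hypothetical toy profiles (8 bands each); real profiles would hold 321 bands.
profiles = {
    "acc1": [1, 1, 0, 1, 0, 1, 1, 0],
    "acc2": [1, 1, 0, 1, 0, 1, 1, 0],
    "acc3": [1, 0, 1, 1, 0, 0, 1, 1],
}

print(dice(profiles["acc1"], profiles["acc3"]))     # 3 shared bands, 5 + 5 in total
print(jaccard(profiles["acc1"], profiles["acc3"]))  # 3 shared bands out of 7 in the union
print(potential_clones(profiles))
```

Because Dice counts shared bands twice, it gives more weight to joint presence than Jaccard does, while neither coefficient treats joint absence of a band as evidence of similarity.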

The relationships among species classified in subgenus Betula (cluster IV, see “Results”) were examined in detail with a principal coordinate analysis (PCO). The PCO was based on Dice distances and carried out in NTSYSpc. A further classification of these accessions was made in a Bayesian analysis using STRUCTURE 2.2 (Falush et al. 2003; Pritchard et al. 2000). The objective of this analysis was to test whether species form separate clusters or species groups and whether hybrids are classified within or between these groups. STRUCTURE was developed for studies on populations, in which individual samples are assumed to be able to exchange genetic material. This is clearly not the case for the genus Betula as a whole, but may be realistic for the subgenus Betula, in which hybridization and introgression are common features. Ploidy levels in this subgenus vary from diploid to octoploid. As STRUCTURE does not contain models that can deal with this situation, all accessions were treated as diploid (explained below). Dominant AFLP marker data were entered by coding both alleles as “1” when a band was present and as “0” when a band was absent, while specifying “0” as a recessive allele for all markers. Estimates were obtained under the admixture model using the correlated allele frequencies option. Version 2.2 accommodates genotypic uncertainty in dominant marker data by sampling present bands as homozygotes or heterozygotes according to their posterior probabilities (Falush et al. 2007). This does not fully account for the fact that, given the dominant nature of AFLP markers, higher ploidy levels would contain a higher level of genotypic uncertainty. This may distort the absolute genetic distances, with polyploids ending up genetically more similar to one of the parents, but this will not interfere with the goal of identifying hybrid accessions. 
The number of inferred groups was evaluated at values of K ranging from 1 to 17, in which the maximum of K = 17 corresponds to the number of sampled species in subgenus Betula (excluding hybrids). Three replicate runs were performed for each value of K. A burn-in of 50,000 cycles and data collection for 100,000 cycles was used. The admixture model estimates the proportion of each accession’s genome that descended from each of the K inferred groups.

Phylogenetic analyses were conducted on two data sets. The first set contained a maximum of four accessions per species because otherwise some species would be overrepresented in the data set. Accessions that were initially misclassified were preferentially excluded, followed by accessions that had the most missing values. Hybrid cultivars were also excluded. The second set was a subset of the first set, containing only the diploid species. Phylogenetic signal in the data set was quantified with the g1 statistic (Hillis and Huelsenbeck 1992). Parsimony analyses were conducted in PAUP as heuristic searches with 100,000 random additions (holding one tree at each step), tree–bisection–reconnection (TBR) branch swapping, multrees switched off, and ACCTRAN for character optimization. The initial search was followed by additional branch swapping on the most parsimonious trees (MPTs) with the multrees option switched on. Branch support was assessed by bootstrap analysis comprising 10,000 replicates consisting of ten random addition sequences with TBR branch swapping.
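The g1 statistic is the skewness of the distribution of tree lengths; a strongly left-skewed distribution (negative g1) indicates phylogenetic signal. The sketch below computes g1 as the third central moment divided by the 1.5th power of the second central moment. The tree lengths are hypothetical; in practice they would come from a sample of random trees evaluated in PAUP, and whether PAUP uses exactly this moment estimator is an assumption.

```python
def g1_skewness(lengths):
    """Skewness g1 = m3 / m2**1.5, with m2 and m3 the central moments."""
    n = len(lengths)
    mean = sum(lengths) / n
    m2 = sum((x - mean) ** 2 for x in lengths) / n
    m3 = sum((x - mean) ** 3 for x in lengths) / n
    if m2 == 0:
        raise ValueError("all tree lengths are identical; g1 is undefined")
    return m3 / m2 ** 1.5


# Hypothetical tree lengths: a few short trees form a long left tail.
random_tree_lengths = [700, 790, 800, 805, 810, 812, 815]

g1 = g1_skewness(random_tree_lengths)
critical_value = -0.09  # critical value quoted in the text (p = 0.01)
print(g1 < critical_value)  # True when the distribution is sufficiently left-skewed
```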

Results

AFLP similarities

The three AFLP primer combinations produced 321 variable bands in 99 Betula and four outgroup accessions. Dendrograms obtained by unweighted pair group method with arithmetic mean (UPGMA) and NJ using Jaccard and Dice distances were highly similar (data not shown). Correlations between Dice and Jaccard similarity matrices and cophenetic matrices from the dendrograms were high (0.96–0.98), with a one-tailed probability of 0.001 at 1,000 permutations. This indicates that the dendrograms provided a good fit to the similarity matrices. The highest correlation was found for Dice distances in combination with NJ (0.98). Similarity values between the outgroups (Corylus and Alnus) and ingroup (Betula) ranged from 0.14 to 0.33, while ingroup similarity values all exceeded 0.32. The NJ tree shows a clear structure, although the support values for most branches were quite modest (Fig. 1). Four major clusters were present: (I) the B. schmidtii/B. chichibuensis cluster, (II) the B. nigra cluster, (III) the subgenus Betulenta cluster, and (IV) the subgenus Betula cluster. The latter contained all accessions from subgenus Betula and additional accessions from other subgenera (discussed below). Group IV was by far the largest group and contained several supported subgroups, such as the cluster with both B. costata accessions, the cluster with B. davurica and B. humilis, the cluster with all B. papyrifera accessions, and the cluster with B. utilis subsp. jacquemontii and B. × “Long Trunk.” Mean similarity values among Betula species are provided as electronic supplementary material (Table S1).
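The matrix correlation and its permutation probability can be sketched as follows. This is a minimal Mantel test in plain Python, not the NTSYSpc routine: it correlates the upper triangles of two symmetric matrices and permutes the rows and columns of one of them. The two 5 × 5 matrices are hypothetical, and the (hits + 1)/(permutations + 1) convention for the one-tailed probability is an assumption. Under that convention, the smallest attainable value with 1,000 permutations is 1/1,001 ≈ 0.001, which would be consistent with the probability reported above.

```python
import random
from math import sqrt


def upper_triangle(m):
    """Return the off-diagonal upper-triangle values of a square matrix."""
    n = len(m)
    return [m[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Product-moment correlation between two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def mantel(m1, m2, permutations=1000, seed=0):
    """Return (r, one-tailed p) for two symmetric matrices of equal size."""
    rng = random.Random(seed)
    n = len(m1)
    x = upper_triangle(m1)
    observed = pearson(x, upper_triangle(m2))
    order = list(range(n))
    hits = 0
    for _ in range(permutations):
        rng.shuffle(order)
        # Permute rows and columns of m2 with the same random ordering.
        permuted = [[m2[order[i]][order[j]] for j in range(n)] for i in range(n)]
        if pearson(x, upper_triangle(permuted)) >= observed - 1e-12:
            hits += 1
    return observed, (hits + 1) / (permutations + 1)


# Hypothetical similarity matrix and cophenetic matrix for five accessions.
sim = [
    [1.00, 0.80, 0.75, 0.40, 0.35],
    [0.80, 1.00, 0.70, 0.45, 0.38],
    [0.75, 0.70, 1.00, 0.42, 0.36],
    [0.40, 0.45, 0.42, 1.00, 0.60],
    [0.35, 0.38, 0.36, 0.60, 1.00],
]
coph = [
    [1.00, 0.80, 0.72, 0.40, 0.40],
    [0.80, 1.00, 0.72, 0.40, 0.40],
    [0.72, 0.72, 1.00, 0.40, 0.40],
    [0.40, 0.40, 0.40, 1.00, 0.60],
    [0.40, 0.40, 0.40, 0.60, 1.00],
]

r, p = mantel(sim, coph)
print(round(r, 2), p)
```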

Fig. 1

Neighbor-joining tree of Dice similarities among 22 Betula species and ten interspecific hybrids (shown in red italics) based on 321 AFLP markers. Bootstrap percentages greater than or equal to 50 are shown on the branches. The groups within subgenus Betula as found in the PCO (Fig. 2) are shown on the right

Relationships within the subgenus Betula

To allow a detailed analysis of the relationships within the Betula cluster (IV), we performed a PCO on the accessions within this group. The first three components had eigenvalues of more than 1.0 and explained 29.6% of the variation (16.8, 7.4, and 5.3%, respectively). The first two components of the PCO are plotted in Fig. 2a, in which five groups are distinguished. Group A is represented by a large number of accessions comprising the species B. pendula, B. platyphylla, and B. populifolia. These species are not separated from each other on the first three components. Group B in the PCO plot comprised six species: B. pubescens, B. litwinowii, B. korshinskyi, B. papyrifera, B. pumila, and B. nana. With the exception of B. nana, these species are hardly separated on the first two components. The third component did separate B. papyrifera and B. pubescens and to a lesser extent B. pumila and B. pubescens (not shown). Betula maximowicziana is placed in group C. Although the PCO puts B. maximowicziana close to group D, B. maximowicziana has the most basal position in cluster IV and a relatively low similarity to the other accessions in subgenus Betula (Fig. 1). The species B. costata, B. davurica, B. humilis, and B. ermanii made up group D. These species branch off sequentially at more derived positions in cluster IV (Fig. 1) as do the species in group E, comprising B. grossa, B. utilis, and B. albosinensis.

Fig. 2

a Principal coordinates plot of the Betula accessions in subgenus Betula (group IV; see Fig. 1) for the first two principal components estimated with 234 AFLP markers. b STRUCTURE analysis of the Betula accessions in subgenus Betula (group IV) inferred from AFLP markers. In this figure, each accession is represented by a vertical bar partitioned into K = 3 colored segments (green, blue, and red). At K = 2, the accessions in green were separated from the rest. At K = 3, the blue group was separated from the rest (in red). The corresponding groups (IV-A, B, and C + D) from a are displayed above the bars

Most cultivars of hybrid origin are placed in between the groups A to E. Their hybrid origin was confirmed by comparing the AFLP profile and/or ploidy level. Notable exceptions were the triploid hybrid “Elegans Pendula” and B. × koehnei (a hybrid between B. pendula and B. papyrifera), which were located among the B. pubescens-like accessions in group B. The cultivars “Long Trunk” and “Edinburgh” were positioned within group E, close to B. utilis and B. albosinensis. “Long Trunk” was originally described as a B. pendula cultivar, but it clusters with B. utilis “Doorenbos” in the NJ dendrogram and, based on the AFLP profile, appears to represent either a hybrid between B. utilis and B. pendula or a true B. utilis cultivar. The parental species of the hybrid cultivars could be established for some accessions, although the close relatedness among the species in subgenus Betula and the presence of species with a hybrid origin complicated this analysis. For example, the AFLP profile of B. × “Elegans Pendula” shared most bands with B. pendula (Dice similarity with B. pendula accessions of 0.7 or more). The position in the PCO suggests that the other parent may be B. pubescens or a related species. The largest number of bands not shared with B. pendula was shared with two accessions of B. pubescens and the accession from B. litwinowii.
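The band-sharing reasoning used for “Elegans Pendula” can be sketched as follows: bands present in the hybrid but absent from the presumed first parent are listed, and candidate second parents are ranked by how many of those bands they carry. The function name, accession labels, and profiles are hypothetical illustrations, not the scoring procedure actually used for the AFLP gels.

```python
def rank_second_parents(hybrid, first_parent, candidates):
    """Rank candidates by the number of hybrid bands they explain
    that are absent from the presumed first parent."""
    unexplained = [i for i, (h, a) in enumerate(zip(hybrid, first_parent))
                   if h == 1 and a == 0]
    counts = {name: sum(profile[i] for i in unexplained)
              for name, profile in candidates.items()}
    # Sort by descending count, then by name for a stable order.
    ranking = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return unexplained, ranking


# Hypothetical 0/1 band profiles (8 bands each).
hybrid = [1, 1, 1, 0, 1, 1, 0, 1]
first_parent = [1, 1, 0, 0, 1, 0, 0, 0]
candidates = {
    "candidate_x": [0, 0, 1, 1, 0, 1, 0, 1],
    "candidate_y": [1, 0, 1, 0, 0, 0, 1, 0],
}

unexplained, ranking = rank_second_parents(hybrid, first_parent, candidates)
print(unexplained)  # indices of hybrid bands missing from the first parent
print(ranking)      # candidates ordered by how many of those bands they share
```

With closely related candidate species, several candidates can explain nearly the same bands, which is the complication noted above for subgenus Betula.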

To test the separation into groups and the presence of admixture in hybrid accessions within subgenus Betula, we used a Bayesian population clustering approach implemented in the program STRUCTURE (Falush et al. 2003; Pritchard et al. 2000). The STRUCTURE analysis provided strong support for three species groups, with large and consistent improvements in the probability function [Ln P(D)] for runs with K = 3 relative to K = 2. Values of Ln P(D) improved modestly for K = 4 relative to K = 3, while for K > 4 runs, the variation between replicate runs became very large, so that there was no consistent further improvement. This may be due to the fact that the first two groups identified (marked green and blue in Fig. 2) are rather homogeneous, while the third group includes even single accessions of some species. The average values were: K = 1: −5,836.8, K = 2: −5,175.2, K = 3: −4,944.8, K = 4: −4,873.9. The three species groups that were distinguished were largely in line with the groups IV-A, IV-B, and IV-C + D that were distinguished by the PCO (Fig. 2a,b). Some hybrids were actually clustered within the three distinguished groups. For example, B. × koehnei, B. × “Elegans Pendula,” and B. × obscura fell within group IV-B, while B. × caerulea fell within group IV-C + D. The other hybrids showed clear signs of admixture between groups. Notably, the species B. albosinensis, B. utilis, and B. grossa that were attributed to group IV-E in the PCO plot were not distinguished as a separate group in the cluster analysis and showed admixture between group IV-B and IV-C + D, which is consistent with their position relative to these groups in the PCO plot (Fig. 2a).
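The diminishing improvement in Ln P(D) can be made explicit by differencing the average values reported above. This short sketch only subtracts successive averages; it is not the ΔK procedure or any other formal criterion, and the variable names are illustrative.

```python
# Average Ln P(D) per K, as reported in the text.
ln_pd = {1: -5836.8, 2: -5175.2, 3: -4944.8, 4: -4873.9}

# Gain in Ln P(D) obtained by moving from K - 1 to K groups.
gains = {k: round(ln_pd[k] - ln_pd[k - 1], 1)
         for k in sorted(ln_pd) if k - 1 in ln_pd}

print(gains)  # {2: 661.6, 3: 230.4, 4: 70.9}
```

The gain from K = 3 to K = 4 is less than a third of the gain from K = 2 to K = 3, in line with the choice of three species groups.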

Phylogenetic analysis of AFLP data

Cultivars with a mixed species background (interspecific hybrids, listed in Table 1) were excluded from the phylogenetic analysis, which was performed with 43 accessions. The data set included 297 variable bands, 211 of which were parsimony informative. The g1 statistic for the data set was −0.52. This value is considerably lower than the corresponding critical value of −0.09 (p = 0.01; Hillis and Huelsenbeck 1992), indicating the presence of ample phylogenetic signal. The initial parsimony analysis resulted in 12 MPTs of 721 steps on 11 different islands. The trees had a consistency index (Kluge and Farris 1969) of 0.412 and a retention index (Farris 1989) of 0.585. Additional branch swapping did not yield any extra trees. The strict consensus of the MPTs is shown in Fig. 3.
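The reported consistency index can be checked from the numbers above. The sketch assumes that each of the 297 variable bands is a binary character requiring at least one step, so that the minimum possible tree length is 297; this assumption is ours and is not stated in the text.

```python
def consistency_index(min_steps, observed_steps):
    """Ensemble consistency index: minimum possible steps / observed steps."""
    if observed_steps <= 0:
        raise ValueError("observed_steps must be positive")
    return min_steps / observed_steps


variable_bands = 297  # assumed minimum number of steps (one per binary band)
tree_length = 721     # length of the most parsimonious trees

ci = consistency_index(variable_bands, tree_length)
print(round(ci, 3))  # 0.412
```

The result agrees with the reported value of 0.412, which suggests that the index was calculated over all variable bands rather than over the 211 parsimony-informative bands only.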

Fig. 3

Strict consensus of 12 MPTs based on 297 AFLP markers and 22 Betula species. Bootstrap percentages greater than or equal to 50 are shown on the branches. The subgeneric division as proposed by De Jong (1993) is shown on the left vertical bar; the groups suggested by the AFLP data are shown on the right bar

The consensus tree from the data set without hybrids (Fig. 3) and the NJ tree of the data set including hybrids (Fig. 1) have a similar topology regarding the accessions that are present in both trees. Several groups can be identified when both trees are considered: (I) Betula schmidtii and B. chichibuensis form a distinct and supported cluster in both the NJ (71% bootstrap support) and MP tree (85%). Both are Asian species from subgenus Neurobetula. These species are relatively divergent from the other Betula accessions, with similarities between 0.33 and 0.48. (II) Betula nigra, a North American species, is classified in subgenus Neurobetula and is clearly separated from all other Betula species. It was the most divergent accession in the NJ tree with a similarity of 0.32 to 0.39 relative to the other Betula accessions. (III) Betula lenta and B. alleghaniensis are two closely related North American species that are classified in subgenus Betulenta. B. lenta is diploid, while B. alleghaniensis is hexaploid. The clade is supported in both the NJ (76%) and the MP tree (99%).

B. maximowicziana is an Asian species that is the only representative of subgenus Betulaster included in our study. It has a basal position in group IV in both the NJ and MP tree and is placed in group IV-C (Figs. 1 and 3). B. costata is the next species to branch off in the MPT, while being placed among the other representatives of group IV-D in the NJ tree. The remaining species of group IV-D and IV-E branch off sequentially within group IV. The groups D and E do not form supported groups in the MPT. The species B. ermanii, B. davurica, B. utilis, and B. albosinensis were previously classified in subgenus Neurobetula. B. grossa (subgenus Betulenta) is also placed within this group. In addition, B. humilis (subgenus Chamaebetula) is shown to be closely related to B. davurica. The clade with groups IV-A and IV-B includes all examined species from subgenus Betula, namely B. pendula, B. platyphylla, B. populifolia, B. pubescens, and B. papyrifera, and two species from subgenus Chamaebetula, namely B. nana and B. pumila. B. pumila clusters with B. papyrifera in both the NJ tree and the MPT. Group IV-A from the PCO contains the Betula species B. pendula, B. platyphylla, and B. populifolia and also forms a separate clade in the MPT.

Species with higher ploidy levels may represent natural interspecific hybrids. To exclude the effects of hybridization as much as possible, we repeated the analysis with only the diploid species. In the resulting NJ tree and MPTs (not shown), five groups could be identified. The first three groups (I to III) were similar to the groups discussed above, while the clustering within group IV was slightly different. In Fig. 3, B. maximowicziana is the most basal clade in group IV. In the diploid tree, B. maximowicziana and B. costata clustered together and had a basal position relative to B. nana, B. pendula, B. platyphylla, and B. populifolia. The grouping of B. maximowicziana with B. costata was also suggested by the STRUCTURE analysis.

Discussion

AFLP markers for phylogeny

The taxonomy of the genus Betula is controversial. Although ample morphological variation exists in characters such as leaf shape, bark color, and shape of the catkins, attempts to reconstruct species relationships using morphological characters failed to produce a reliable classification. The occurrence of polyploidization (Nagamitsu et al. 2006), hybridization, and introgression (Palme et al. 2004; Thórsson et al. 2001; Williams and Arnold 2001) and the fact that morphological characters may have evolved independently more than once in Betula (Li et al. 2005) may account for this. Up to now, Betula taxonomy has been studied using morphological characters, flavonoid composition, and nuclear and chloroplast DNA sequences. The main limitations of morphological markers are that they are subject to selection pressure and that hybrid morphology is not always intermediate (Thórsson et al. 2001). The main limitation of the sequence markers is their limited variation. We applied AFLP as an alternative because it generates polymorphic markers at a high frequency, is highly reproducible, samples the entire genome, and yields markers that are not under direct selection pressure (Jones et al. 1997; McGregor et al. 2000; Russell et al. 1997).

More than 200 parsimony informative AFLP markers were generated in a data set of 87 unique Betula accessions. The AFLP data distinguished four subgenera and four groups within the largest subgenus (Betula). However, we could not unambiguously resolve relationships among these groups. Due to the extensive hybridization and introgression within the genus Betula, good support for the relationships is not always to be expected, although bootstrap support was generally higher for the AFLP-derived groups compared to groups identified based on ITS sequences (Li et al. 2005; Nagamitsu et al. 2006), and AFLP was able to provide resolution on clades that were unresolved with ITS data. Apart from differences in support and resolution, congruence between ITS data and AFLP data was high for the genus Betula. This general congruence between AFLP and ITS tree topologies is found across a wide range of taxonomic groups (Koopman 2005). Besides yielding detailed information on Betula relationships, AFLPs provided complementary information on hybridization events. Such events were reflected in the AFLP profiles by bands shared between the hybrid and parental species and for some hybrids in the STRUCTURE analysis. In ITS sequences, the parental information may be lost or misleading (Álvarez and Wendel 2003), even when fragments are cloned before sequencing (Nagamitsu et al. 2006).

Phylogeny of the genus Betula

In summary, the AFLP results indicate the presence of four subgenera in Betula, as opposed to the five subgenera originally recognized by De Jong (1993). Each of the original subgenera is discussed separately below.

Subgenus Betulenta (De Jong 1993), which is synonymous with the Lentae of Regel (1865), was represented by B. lenta, B. alleghaniensis, B. medwediewii, and B. grossa in our study. Winkler (1904) merged this group with subsection Costatae, but our results support the division of De Jong (1993) and show that the Betulenta sensu Winkler (1904) would be paraphyletic. B. lenta and B. alleghaniensis grouped together with good support in both the AFLP NJ and maximum parsimony analysis. The close relationship of B. lenta and B. alleghaniensis (together with B. medwediewii) is supported by nuclear ITS and chloroplast matK sequence data (Järvinen et al. 2004; Li et al. 2005). The decaploid B. medwediewii was not included in our analysis due to an excess of bands in the AFLP profile. Our AFLP results are not in line with data on ITS sequences that indicated a close relationship of B. alleghaniensis with the non-Betulenta species B. costata (Li et al. 2005). Morphologically, B. costata and B. alleghaniensis are very distinct, and we therefore consider the AFLP results to be more reliable. The position of B. grossa in the Betulenta is supported by data on phenolic variation (Keinänen et al. 1999). However, both our AFLP data and data on ITS sequences (Nagamitsu et al. 2006) suggest that B. grossa is not positioned within Betulenta. The position of this species is thus uncertain. B. lenta, B. alleghaniensis, and B. medwediewii are maintained within Betulenta.

Subgenus Betulaster was represented by a single accession of B. maximowicziana. According to the AFLP data, this accession is positioned at the periphery of subgenus Betula in group C. Li et al. (2005) pointed out that B. maximowicziana did not cluster with other representatives of subgenus Betulaster. Given that B. alnoides is the type species of subgenus Betulaster, the status of Betulaster would depend on the position of B. alnoides, which was not included in our study. B. nigra did cluster with B. alnoides according to Li et al. (2005). In our AFLP trees, B. nigra formed a separate and well-supported group. The above therefore suggests that subgenus Betulaster may, in fact, be a distinct group whose position in our AFLP trees is represented by B. nigra. The observation of Li et al. (2005) may be related to the fact that the distribution area of the two Asiatic Betula species is quite different: B. maximowicziana is distributed in the eastern part of Japan while B. alnoides is distributed in southeast Asia (from India to Vietnam and southern China).

In our data set, subgenus Neurobetula was represented by seven species (excluding B. nigra) that separated into two major groups. Group I (B. chichibuensis and B. schmidtii) was well defined and well supported, while the other group (Group IV-D + E in Figs. 1, 2, and 3) was more loosely defined and contained the species B. costata, B. ermanii, B. davurica, B. utilis, and B. albosinensis. Previous studies concluded that subgenus Neurobetula is a heterogeneous and polyphyletic group (De Jong 1993; Li et al. 2005). However, an alternative division was not proposed, because conflicts in morphological markers and low variation among DNA markers hampered an unambiguous conclusion. The position of B. schmidtii as a close relative of B. chichibuensis was in line with studies on ITS sequences (Li et al. 2005; Nagamitsu et al. 2006) and a recent morphological study by Skvortsov (2002). On the other hand, Keinänen et al. (1999) and Järvinen et al. (2004) found that B. schmidtii was closely related to subgenus Betula. Our results indicate that B. costata, B. ermanii, B. davurica, B. utilis, and B. albosinensis should be merged with subgenus Betula. The common existence of hybrids between the above species and species from subgenus Betula also supports placement in a single subgenus. All these species are of Asian origin, and a more extensive sampling from their natural range will be needed to resolve interspecific relationships and to disentangle phylogenetic relationships from geographical components if gene flow (hybridization and introgression) occurred between certain species in overlapping parts of their distribution area.

Subgenus Chamaebetula (De Jong 1993) was represented in our study by B. humilis, B. nana, and B. pumila. According to De Jong (1993), subgenus Chamaebetula is polyphyletic and artificially grouped based on the single morphological character of having a shrubby habitus. Our results indicate that these species should be placed within subgenus Betula. For B. nana, a close relationship to subgenus Betula is supported by sharing of chloroplast haplotypes between B. nana and B. pendula (Maliouchenko et al. 2007; Palme et al. 2004). B. nana, B. pumila, and B. humilis are placed in different groups (IV-B vs. IV-D) within subgenus Betula, confirming the polyphyletic nature of the Chamaebetula. The above suggests that subgenus Chamaebetula is superfluous.

Subgenus Betula was represented in our study by B. pendula, B. platyphylla, B. populifolia, B. pubescens, B. litwinowii, B. korshinskyi, and B. papyrifera. Although not supported by high bootstrap values, the species originally placed within subgenus Betula do consistently group together in both the NJ tree (Fig. 1) and the MPTs (Fig. 3). The PCO analysis distinguished five species groups in the subgenus Betula (Fig. 2a), while the STRUCTURE analysis recognized four groups. The species B. pendula, B. platyphylla, and B. populifolia clustered together in group A and showed hardly any genetic differentiation. In fact, the AFLP data failed to differentiate between these species, suggesting that they are conspecific. Their morphology is also very similar, and Skvortsov (2002) already considered B. platyphylla to be synonymous with B. pendula. Only one B. populifolia accession was included, so further sampling within the natural range of this species will be necessary to confirm its status. Group B consisted of the potentially conspecific B. pubescens, B. litwinowii, and B. korshinskyi, together with B. papyrifera and two species from the Chamaebetula, namely B. nana and B. pumila. B. pubescens, B. litwinowii, and B. korshinskyi do not separate in the PCO, but more extensive sampling is required to establish their status. Group C comprised B. maximowicziana as discussed above, while groups D and E comprised several species that were originally attributed to subgenus Neurobetula. The groups C and D formed a single group in the STRUCTURE analysis. Group E was shown to have an intermediate position between the groups B and D. This group contained only polyploid species, consistent with a potential hybrid origin. In summary, all previously assigned species were retained in subgenus Betula, while subgenus Chamaebetula, part of the species from Neurobetula, and B. maximowicziana were also placed in this subgenus.

Evolution

Although our results indicate that four major taxonomic groups can be recognized within the genus Betula, the relationship between them remains unclear. The most obvious explanation for the lack of support is the occurrence of hybridization and introgression, which would have a homogenizing effect on the relationships between species. Several types of hybrids may occur, and they can be classified as “newly formed (F1) hybrids,” “later generation hybrids,” and “hybrid species” (Vriesendorp and Bakker 2005). Hybrid cultivars are likely to fall within the first two groups, and we excluded the cultivated hybrids from our maximum parsimony analysis. However, naturally occurring hybrid species may also exist and may in fact make up a significant proportion of all Betula species. Their presence in the data set may explain the low bootstrap values within the Betula clade. Species such as B. ermanii, B. humilis, B. utilis, and B. pubescens each have ploidy levels higher than diploid and to some extent contain AFLP bands that can be regarded as diagnostic bands for B. pendula. However, we cannot determine to what extent this reflects shared evolution or shared parental species in a natural hybridization process. An alternative explanation for the lack of support relates to a situation in which the major speciation events took place within a very short time frame. This would result in so-called bush-like clades that are characterized by short stems relative to the length of the branches (Rokas and Carroll 2006). Under these circumstances, homoplasy may limit the phylogenetic resolution by overwhelming the true phylogenetic signal. This could also explain why we cannot determine the relationships between the subgenera. If this is the case, these relationships may remain unresolved.