Background

Polyploidy is one of the most important evolutionary pathways in flowering plants and has significantly contributed to their diversification and radiation [1–4], for instance as the most frequent mode of sympatric speciation [5]. Polyploid lineages often exhibit complex relationships among each other as well as with their lower-ploid ancestors (e.g., [6–8]). This is fostered by the prevalence of gene flow from lower to higher ploidy levels, whereas the opposite case is considered rare [9, 10]. Multiple and recurrent formation of polyploids, (epi)genetic, transcriptomic and genomic changes as well as morphological, geographic and ecological divergence following polyploidisation are considered significant processes in the evolution of polyploids [11–17], and obviously increase taxonomic complexity. On the other hand, polyploids have lower speciation and higher extinction rates than diploids [18]; therefore, the high frequency of polyploids was suggested to be a consequence of their high formation rate rather than of accelerated diversification [19]. From the taxonomic viewpoint, polyploidy, or more generally reticulate evolution, thus clearly falls into the category “taxonomist’s nightmare – evolutionist’s delight” [20].

Due to the prevalence of reticulate evolution spanning three ploidy levels, Knautia L. (Caprifoliaceae, Dipsacoideae) is considered one of the taxonomically most intricate genera in the European flora [21, 22]. Depending on the taxonomic concept, it comprises about 50–55 species distributed in western Eurasia and northwestern-most Africa and is characterised by a lipid-rich elaiosome at the base of the fruits [23]. Its traditional division into three sections was recently supported by molecular phylogenetic analyses of diploid species [24]. Whereas the species-poor, apparently early diverging annual sections Knautia (x = 8; K. degenii, K. orientalis) and Tricheroides (x = 10; K. byzantina, K. integrifolia) are centred in the eastern Mediterranean and comprise only diploids, the species-rich, mainly perennial section Trichera has its maximum diversity in Southern Europe and includes di-, tetra- and hexaploids based on x = 10. Extensive exploration of genome size and ploidy level variation in 381 populations of 54 species of sect. Trichera [25] has shown that di- and tetraploids are distributed across most of the distribution area of Knautia, whereas hexaploids are limited to the Balkan and Iberian Peninsulas and the Alps. Monoploid genome size varies considerably within the ploidy levels, but also within some of the species, and increases significantly towards the limits of the genus’ distribution.

The mostly perennial sect. Trichera possibly evolved from an annual ancestor [24]. Transition from an annual to a perennial life cycle was recently reconstructed also for Delphinium and Lupinus [26, 27], but stands in marked contrast with previous hypotheses [9, 21] suggesting life history evolution to proceed in the opposite way. Shallow diversification within sect. Trichera as well as extremely wide distribution of plastid haplotypes and, to a lesser extent, of ITS ribotypes, spanning almost the entire distribution of the genus, was considered indicative of rapid radiation and recent range expansion. In addition, extensive sharing of plastid haplotypes and ITS ribotypes across taxa indicates recurrent gene flow across species boundaries. Radiation in Knautia was suggested to have taken place 45–4.28 Ma [28], whereas diversification in the heteroploid Centaurea sect. Acrocentron (Asteraceae), Dianthus (Caryophyllaceae), Scorzonera (Asteraceae) and Tragopogon (Asteraceae) took place significantly later and has been associated with the onset of climatic and topographic changes in the Mediterranean region during the Pliocene and Pleistocene [28, 29].

In spite of the restriction to diploid cytotypes, Rešetnik et al. [24] have shown that the shallow phylogenetic structure within Knautia prevents establishing a formal taxonomic framework; instead, informal, genetically defined species groups were suggested. Some traditionally recognised groups (e.g., K. dinarica, K. drymeia and K. montana groups sensu Ehrendorfer [21, 30]) could be maintained, whereas several others (K. arvensis, K. dalmatica, K. fleischmannii, K. longifolia and K. velutina groups) were clearly polyphyletic and their diploid members were rearranged into the Xerophytic, Carinthiaca, Midzorensis, North Arvensis, South Arvensis, Pancicii and SW European Groups. Most of the traditional groups also include polyploid cytotypes of some heteroploid species as well as exclusively polyploid taxa. Additionally, Ehrendorfer [21, 30] recognised some entirely polyploid groups, such as K. fleischmannii, K. subcanescens-K. persicina, K. dipsacifolia (under the synonym K. silvatica) and K. sarajevensis groups, but diploids were recently discovered within some of them [24, 25].

One of the major problems in assessing the evolutionary history of heteroploid genera is the reticulate nature of polyploid speciation processes. In order to reconstruct these processes, the present study is based on three molecular markers, i.e. maternally inherited plastid DNA sequences as well as biparentally inherited nuclear ribosomal internal transcribed spacer (nrITS, in the following for simplicity termed ITS) sequences and amplified fragment length polymorphisms (AFLPs). The last method assesses genetic variation at a large number of anonymous loci mostly from the nuclear genome [31, 32] and was extensively applied to polyploid complexes (e.g., in Hypochaeris, [33]; Rosa, [34]; Veronica, [35]; Leucanthemum, [36]). The main objective of the present study is to elucidate the evolutionary relationships within the intricate sect. Trichera in which diploid and polyploid taxa were suggested to intermingle forming several tightly knit species groups [21, 30]. We asked the following questions. (1) Where and when did the initial diversification in Knautia take place, and how did it proceed further? (2) Did Knautia undergo a similarly recent (Pliocene/Pleistocene), rapid radiation as other genera with similar ecology and overlapping distribution such as Dianthus and Tragopogon [28, 29]? (3) Did polyploids evolve within the previously recognised diploid groups or rather from hybridisation between groups, and did polyploids form more extensively in certain diploid groups than in others? Finally, (4) dependent on the results we either assign the polyploid accessions to the informal species groups previously identified for diploids or propose additional species groups.

Methods

Plant material

Our sampling aimed at taxonomic completeness and inclusion of several populations, at least for the relatively widespread species. The final selection of samples was based on a previous exploration of ploidy level variation in 381 populations of 54 species [25] in order to represent each taxon with all its ploidy levels. As we were not interested in intra-population genetic diversity, we maximised the number of populations at the expense of the number of samples per population. Taxonomy follows Flora Europaea [37] with the exception of K. csikii not mentioned in Flora Europaea [38], the recently described K. slovaca [39], K. serpentinicola and K. pseudolongifolia [40] as well as the Iberian [41] and Turkish taxa [42], for which we followed newer or geographically more comprehensive treatments. In contrast to Frajman et al. [25] we include K. wagneri in K. midzorensis. Herbarium vouchers were revised by F. Ehrendorfer, a taxonomic expert of the group. Five populations could not be assigned to a species, i.e. four hexaploid populations from Velebit in Croatia (K. sp. 1: K102, K103, K105, K500) and one tetraploid population from Serbia (K. sp. 2: K218).

Leaf material of one to five individuals per population and one to 35 populations per species (i.e., roughly proportional to the size of the species’ distribution areas; Fig. 1) was collected and immediately stored in silica gel; geographic coordinates were recorded for each population with a GPS. We aimed at sampling morphologically and ecologically homogeneous populations and avoided possibly hybridogenous individuals. Voucher specimens are deposited at the Institute of Botany, University of Innsbruck, Austria (IB), the Faculty of Science, University of Zagreb, Croatia (ZA), the Faculty of Agriculture, University of Zagreb, Croatia (ZAGR), the Faculty of Biology, University of Belgrade, Serbia (BEOU) or the Natural History Museum Belgrade, Serbia (BEO). Voucher numbers and collecting details are given in Additional file 1: Table S1; further information can be retrieved from the publicly accessible database of the BalkBioDiv project at http://www.uibk.ac.at/botany/balkbiodiv/?Sampling_sites.

Fig. 1

Sampled populations of 51 species of Knautia sect. Trichera. Population identifiers, which correspond to Additional file 1: Table S1, are underlined for tetraploid populations and white with black shading for hexaploid populations. Diploid populations are not highlighted. a distribution of diploid and polyploid taxa outside of the area enlarged in b and c; b distribution of diploid taxa modified from Rešetnik et al. [24]; c distribution of polyploid taxa. Taxa of the exclusively diploid sections Knautia and Tricheroides are not shown

Molecular methods

Total genomic DNA was extracted from similar amounts of dried tissue (ca. 10 mg) with the DNeasy 96 plant mini kit (Qiagen, Hilden, Germany) following the manufacturer’s protocol.

Sequencing of the plastid petN(ycf6)-psbM region and the nuclear ribosomal ITS region was performed as described by Rešetnik et al. [24], using the primers ycf6-F and psbM-R [43] and 17SE and 26SE [44], respectively. AFLP fingerprinting was performed for 258 populations of sect. Trichera and K. integrifolia with one to five individuals per population, using three primer combinations for the selective PCR (fluorescent dye in brackets): EcoRI (6-FAM)-ACA / MseI-CTG, EcoRI (VIC)-ACG / MseI-CTA, and EcoRI (NED)-ACC / MseI-CTC. The AFLP laboratory procedure as well as the scoring approach followed Rešetnik et al. [24]; electropherograms were analysed with Peak Scanner version 1.0 (Applied Biosystems), and automated binning and scoring was performed using RawGeno version 2.0 [45], a package for the software R [46].

The error rate [47] was calculated as the ratio of mismatches (scoring of 0 vs. 1) over phenotypic comparisons in AFLP profiles of 43 replicated individuals. Non-reproducible fragments and fragments present in only one individual were removed from the dataset.
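The ratio described above can be illustrated with a minimal sketch. It assumes that each replicated individual is represented by two equal-length 0/1 profiles (original and replicate); the function name and the toy data are hypothetical and are not taken from the study.

```python
def aflp_error_rate(replicate_pairs):
    """Return mismatches / phenotypic comparisons over all replicate pairs.

    Each pair holds two equal-length sequences of 0/1 scores
    (fragment absent/present) for the same individual.
    """
    mismatches = 0
    comparisons = 0
    for original, replicate in replicate_pairs:
        if len(original) != len(replicate):
            raise ValueError("profiles in a pair must have the same length")
        for a, b in zip(original, replicate):
            comparisons += 1
            if a != b:  # scoring of 0 vs. 1 at the same fragment
                mismatches += 1
    if comparisons == 0:
        raise ValueError("no comparisons available")
    return mismatches / comparisons


# Toy data (hypothetical): two replicated individuals, five fragments each.
pairs = [
    ([1, 0, 1, 1, 0], [1, 0, 1, 0, 0]),  # one mismatch
    ([0, 0, 1, 1, 1], [0, 0, 1, 1, 1]),  # no mismatch
]
rate = aflp_error_rate(pairs)
print(f"{rate:.2%}")  # 1 mismatch in 10 comparisons
```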

Data analysis

Sequence data

Sequences were edited and manually aligned using Geneious Pro 5.3.6 [48]. All sequences were deposited in GenBank. Phylogenetic relationships in ITS and plastid data sets were inferred from maximum parsimony and Bayesian analyses. Maximum parsimony (MP) as well as MP bootstrap (MPB) analyses of both data sets were performed using PAUP 4.0b10 [49]. The most parsimonious trees were searched heuristically with TBR swapping, MulTrees off, and 100,000 replicates of random sequence addition for the ITS dataset and with 1000 replicates and swapping performed on a maximum of 1000 trees (nchuck = 1000) for the plastid dataset. All characters were equally weighted and unordered. The data set was bootstrapped using full heuristics, 1000 replicates, TBR branch swapping, MulTrees option off, and random addition sequence with five replicates. Bassecoia hookeri was used to root the trees and additional outgroup taxa were included, based on previous studies [23, 24].

Bayesian analyses were performed with MrBayes 3.2.1 [50] applying the substitution models proposed by the Akaike information criterion implemented in MrAIC.pl 1.4 [51]. Values for all parameters, such as the shape of the gamma distribution, were estimated during the analyses. The settings for the Metropolis-coupled Markov chain Monte Carlo (MC3) process included four runs with four chains each (three heated ones using the default heating scheme), run simultaneously for 10,000,000 generations each, sampling trees every 1,000th generation using default priors. The PP of the phylogeny and its branches were determined from the combined set of trees, discarding the first 1001 trees of each run as burn-in.
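The sampling and burn-in settings above imply a simple tree count, sketched below. The sketch assumes that the starting tree at generation 0 is also written to the tree file, so that each run holds one tree more than the number of sampling intervals; the function name is hypothetical.

```python
def retained_trees(generations, sample_every, burnin_trees, runs):
    """Return (sampled per run, kept per run, kept in total)."""
    # Assumption: the tree at generation 0 is sampled as well, hence the +1.
    sampled_per_run = generations // sample_every + 1
    kept_per_run = sampled_per_run - burnin_trees
    return sampled_per_run, kept_per_run, kept_per_run * runs


# Settings as given in the text: 10,000,000 generations, every 1,000th
# generation sampled, first 1001 trees discarded, four runs.
sampled, kept, total = retained_trees(10_000_000, 1_000, 1_001, 4)
print(sampled, kept, total)
```

Under this assumption the burn-in corresponds to about 10 % of the sampled trees of each run.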

We constructed an ITS NeighbourNet network [52] of sect. Trichera using SplitsTree 4.12 [53, 54] to display possible conflicts in the data. We applied the Uncorrected_P method to compute the proportion of positions at which two sequences differ. Ambiguous base codes were treated as missing states. Plastid data were analysed using statistical parsimony as implemented in TCS 1.21 [55] with the connection limit set to 95 %; gaps were treated as fifth character state. For this analysis, indels longer than 1 bp were reduced to single base pair columns allowing those structural mutations to be counted as single base pair mutations only.
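The uncorrected P distance with ambiguous codes treated as missing can be sketched as follows. This is an illustration of the principle, not the SplitsTree implementation; it assumes aligned sequences and, as a further assumption, also skips gap positions.

```python
UNAMBIGUOUS = set("ACGT")


def uncorrected_p(seq_a, seq_b):
    """Proportion of differing positions among positions that can be compared."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    different = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        # IUPAC ambiguity codes, N and (by assumption) gaps count as missing.
        if a not in UNAMBIGUOUS or b not in UNAMBIGUOUS:
            continue
        compared += 1
        if a != b:
            different += 1
    if compared == 0:
        return float("nan")  # no comparable positions
    return different / compared


# Toy alignment (hypothetical): the R at position 5 is skipped,
# leaving 8 compared positions with 1 difference.
p = uncorrected_p("ACGTRACGT", "ACGTAACTT")
print(p)
```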

Divergence times were estimated using BEAST ver. 1.8.2 [56] on concatenated ITS and plastid datasets, using fossil calibration within the closely related Valerianaceae as described by Carlson et al. [57]. The dataset of Knautia was pruned to 27 accessions, of which 18 belong to sect. Trichera and were sampled from all major clades resolved in preliminary analyses of the complete datasets. In addition, several outgroup taxa were added. The calibration points were set as described by Carlson et al. [57] using lognormal prior distribution with mean = 4, standard deviation (SD) = 1 and offset 45 in the case of the crown group of Valerianaceae and with mean = 2, SD = 1 and offset 15 in the case of the crown group of Valeriana. The analyses were performed with a Birth-Death speciation prior, GTR + Γ substitution model parameters and an uncorrelated relaxed lognormal clock [58]. Two independent MCMC chains were run for 50,000,000 generations with tree and parameter values saved every 2,000th generation. Tracer 1.6.0 [59] was used to determine the degree of mixing, the shape of the probability density distributions, and 95 % credibility intervals for estimated divergence dates. Both the effective sample sizes and mixing were appropriate. FigTree 1.4.2 [60] was used to display the maximum clade credibility tree after combining the tree files using LogCombiner and summarising the information using TreeAnnotator (both programs available in BEAST package).

A continuous phylogeographic analysis using relaxed random walks [61] was performed on the combined sequence dataset including only Knautia (with the exception of K. cf. degenii K272 exhibiting strongly conflicting positions in plastid and ITS trees; [24]) using BEAST v1.8.2 [62]. Substitution models proposed by MrAIC.pl 1.4 ([51]; HKY + Γ for ITS and GTR + Γ for plastid dataset) with estimated base frequencies were used for phylogeny inference and the trees were linked. A lognormal relaxed clock with a weakly informative prior on the clock rate (exponential with mean 0.001) was applied and a Bayesian skyline coalescent prior [63] with piecewise-linear skyline model was set. The diffusion process was modelled by a lognormal relaxed random walk process. We specified a prior exponential distribution on the standard deviation (SD) of the lognormal distribution with a mean of 5. Geographic coordinates recorded in the field with GPS were used as locality points for each population. We added random jitter with a window size of 1.0 to the tips, because several individuals were sampled from the same location. The prior age of the root was set to 15.88 Ma with a normally distributed standard deviation of 4.5, which corresponds to the median age and 95 % highest posterior densities (HPD) interval of the corresponding node obtained from the dating analysis of the complete dataset. The analysis of the diffusion inference was run for 300 million generations, logging parameters every 10,000 generations. The performance of the analysis was checked in Tracer 1.6.0 [59], which was also used to construct a lineage-through-time (LTT) plot from the combined posterior distribution of sampled tree topologies, in order to display the diversification rate dynamics in the evolutionary history of Knautia. The maximum clade credibility (MCC) tree was produced and annotated with TreeAnnotator (part of the BEAST package) after removing the burn-in and visualised with FigTree 1.4.2 [60]. The diffused MCC tree with annotated diffusion estimates was visualised in SPREAD v.1.0.6 [64] and projected together with polygons representing ancestral areas on a geo-referenced map using ArcGIS 10.3.

Diversification rates were estimated using Magallón and Sanderson’s whole-clade method [65], which does not assume complete taxon sampling. Rates were calculated for both crown and stem groups, at two extremes of the relative extinction rate (ε = 0, no extinction; and ε = 0.9, high rate of extinction; extinction rate being expressed as a fraction of the speciation rate), as implemented in the R package GEIGER [66].
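For orientation, the crown- and stem-group estimators of Magallón and Sanderson can be written out in a few lines. The sketch below is a plain Python transcription of the published formulas and is not the GEIGER code; the function names are hypothetical. The example values (50 species, 7.6 Ma as the upper bound of the age interval of sect. Trichera) are those reported in the Results.

```python
import math


def stem_rate(n, t, eps):
    """Net diversification rate for a stem group of n species and age t."""
    return math.log(n * (1.0 - eps) + eps) / t


def crown_rate(n, t, eps):
    """Net diversification rate for a crown group of n species and age t."""
    root = math.sqrt(n * (n * eps**2 - 8.0 * eps + 2.0 * n * eps + n))
    inner = 0.5 * n * (1.0 - eps**2) + 2.0 * eps + 0.5 * (1.0 - eps) * root
    return (math.log(inner) - math.log(2.0)) / t


# eps is the relative extinction rate (extinction / speciation).
n_species = 50
oldest_age = 7.6  # Ma, upper bound of the 95 % HPD interval
r_no_ext = crown_rate(n_species, oldest_age, 0.0)
r_high_ext = crown_rate(n_species, oldest_age, 0.9)
print(round(r_no_ext, 2), round(r_high_ext, 2))
```

With ε = 0 the crown estimator reduces to (ln n − ln 2)/t, which is a convenient sanity check for any implementation.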

AFLP data

A Neighbor-joining (NJ) analysis based on a matrix of Nei-Li distances [67] and rooted with K. integrifolia from section Tricheroides was conducted and bootstrapped (1000 pseudo-replicates) with TREECON 1.3b [68]. A non-model-based approach, nonhierarchical K-means clustering [69], was chosen because of the presence of three ploidy levels, and performed using a script of Arrigo et al. [70] in R. This approach has recently been successfully applied in the analysis of genetic structure of AFLP datasets in polyploid complexes [70–72]. We performed 50,000 independent runs (i.e., starting from random points) for each assumed value of K (i.e., the number of groups, ranging from 2 to 20). The K-means clustering results were displayed on a NeighbourNet diagram produced with SplitsTree 4.12 [54] from a matrix of uncorrected P-distances; K. integrifolia was not included. Splits with a weight < 0.001 were excluded to aid legibility. In order to simplify the interpretation of the data we also present NeighbourNets of three geographical areas supplemented with bootstrap values (1000 pseudo-replicates); the circumscription of the regions is given in Fig. 2.
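A band-sharing distance of the Nei-Li type can be sketched as below. The sketch assumes the commonly used form d = 1 − 2n_ab/(n_a + n_b), where n_ab is the number of fragments shared by two individuals and n_a, n_b are their total fragment counts; whether TREECON computes exactly this variant is an assumption, and the function name is hypothetical.

```python
def nei_li_distance(profile_a, profile_b):
    """Band-sharing distance between two 0/1 AFLP profiles."""
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must have the same length")
    shared = sum(1 for a, b in zip(profile_a, profile_b) if a == 1 and b == 1)
    total = sum(profile_a) + sum(profile_b)
    if total == 0:
        raise ValueError("distance is undefined when both profiles are empty")
    # Shared absences (0/0) do not contribute to the distance.
    return 1.0 - 2.0 * shared / total


# Toy profiles (hypothetical): 2 shared fragments, 3 fragments each.
d = nei_li_distance([1, 1, 0, 1, 0], [1, 0, 0, 1, 1])
print(round(d, 3))
```

Ignoring shared absences is the main design point of this coefficient: two individuals lacking the same fragment are not treated as similar for that reason alone.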

Fig. 2

Plastid DNA variation in populations of 51 species of Knautia sect. Trichera based on petN(ycf6)-psbM sequences. a statistical parsimony network of the 97 plastid haplotypes encountered; numbering corresponds to Additional file 1: Table S1 (the numbers 1–57 correspond to diploid accessions from Rešetnik et al. [24]); the size of the circles is proportional to the square-root transformed frequency of the respective haplotype; haplotypes not sampled are shown as small black dots. Only haplotypes retrieved from at least two individuals as well as those mentioned in the text are labelled separately. Haplotypes present in diploids are identified by a black outline. b–d geographic distribution of haplotypes. Symbols for the species are as in Fig. 1; their colour filling corresponds to the haplotype groups shown in a. The grey lines in b delimit three areas for which separate NeighbourNets of AFLP relationships complemented with plastid haplotypes are given in Figures S7–S9 within the Additional file 8

Results

The number of terminals, included characters, number and percentage of parsimony informative characters, number and lengths of MP trees, consistency and retention indices for both DNA regions, as well as the model of evolution proposed by MrAIC and used in MrBayes analyses are presented in Table 1.

Table 1 Matrix and phylogenetic analysis statistics for ITS and the plastid marker petN(ycf6)-psbM as well as substitution models proposed by MrAIC and used in the Bayesian analyses

Plastid sequence data

The petN(ycf6)-psbM sequences of sect. Trichera were 1230 (K388 and K468) to 1282 bp (K246) long and the alignment was 1444 bp long. The relationships inferred among the sections of Knautia were congruent with our previous study [24]. Within sect. Trichera several clades with poorly resolved and insufficiently supported relationships were unravelled (Additional file 2: Figure S1). Tetraploids were found in all main clades, whereas hexaploids were more limited. The parsimony haplotype network (Fig. 2a; from here on, the term haplotype is restricted to plastid sequences) exhibited a simple structure: in total, 97 haplotypes were retrieved, of which haplotypes H1–H57 were also present in diploid individuals [24]. The most frequent haplotype, H25, was found in 31.2 % of the samples. It connected to 39 closely related haplotypes differing in only one or two steps, whereas H42 and H43 (both derived from H41) were separated by three steps and H94 by four steps (Red Haplotype Group). The satellites of H25 separated by one step also included the second-most frequent haplotype H47 (Yellow Haplotype Group, 13.8 %), H29 (Orange Haplotype Group) and H65 (Violet Haplotype Group). Haplotype H47 had five satellites separated by one step, while H29 had 13 satellites separated by one to four steps, some of which also connected to haplotypes belonging to the Red Haplotype Group. The Violet Haplotype Group, distributed in Iberia and southeastern Europe, was genetically most heterogeneous. It consisted of four haplotypes (including H64) derived from H65 and three haplotypes (including H21) connecting to an unsampled haplotype, to which H7 was also connected (Blue Haplotype Group). Haplotype H7 was surrounded by 16 haplotypes separated by at most three steps. Finally, H1, giving rise to eight haplotypes separated by one to five steps, was connected to H7 by two steps (Green Haplotype Group).

The geographic distribution of the haplotype groups is illustrated in Fig. 2b–d. Individuals carrying haplotypes of the Red Haplotype Group were distributed throughout most of the sampling area from the Pyrenees to the Caucasus. The second-most widely distributed haplotype group (Orange Haplotype Group) ranged from the eastern Pyrenees over the Alps to the central Balkan Peninsula, but was absent from the Apennines. The Yellow Haplotype Group was widely distributed from the French Massif Central over the Alps to the Apennines and to the central Balkan Peninsula. The Green Haplotype Group was restricted to K. calycina from the Apennines, four accessions of K. baldensis, K. longifolia and K. velutina from the southern Alps and a single accession of K. travnicensis from the Balkans. The Violet Haplotype Group was mostly restricted to the Iberian Peninsula, with occurrences of single accessions in the Massif Central (K. arvernensis), the Western Alps (K. subcanescens), the Southern Carpathians (K. drymeia) and the central Balkan Peninsula (K. dipsacifolia).

ITS

Raw ITS sequences of sect. Trichera were 849 (K229 and K250) to 858 bp (K089 and K109) long and the alignment was 948 bp long. Polymorphisms were detected in most sequences (Additional file 1: Table S1). Among diploid accessions, 68 samples (46 %) showed no polymorphism and the highest number of polymorphisms was ten (in K. midzorensis K271); among tetraploid accessions, 24 samples (25 %) showed no polymorphism and the highest number was 17 (in K. norica K051); among hexaploid accessions, two samples (9.5 %) showed no polymorphism and the highest number was ten (in K. travnicensis K270). The relationships inferred among the sections of Knautia were congruent with our previous study [24]. Within sect. Trichera several clades with poorly resolved and insufficiently supported relationships were inferred (Additional file 3: Figure S2).

The NeighbourNet network of ITS ribotypes (Additional file 4: Figure S3A) revealed a structure similar to that inferred from diploid accessions only ([24]; Additional file 4: Figure S3B). The main difference was that several accessions of K. carinthiaca, K. dinarica, K. illyrica, K. longifolia, K. magnifica, K. midzorensis, K. norica, K. pancicii, K. sp. 2 as well as a few samples of K. arvensis, K. arvernensis, K. csikii, K. dipsacifolia, K. drymeia and K. sarajevensis were positioned along the split between the two main terminal groups identified from diploid accessions only [24]. One major group was genetically fairly homogeneous and included most taxa of the South Arvensis Group, plus a few samples of K. csikii, K. dinarica, K. drymeia, K. sarajevensis and K. slovaca. The second major group comprised most other taxa (also including a few samples of K. arvensis, K. dinarica and K. drymeia) and was genetically highly diverse, with several strongly weighted splits. Many species exhibited unrelated ribotypes (Additional file 5: Figure S4). Whereas tetraploids were positioned all over the NeighbourNet, hexaploids were limited to two lineages (Fig. 3).

Fig. 3

Internal Transcribed Spacer (ITS) variation in Knautia sect. Trichera illustrating dispersion of cytotypes over the network. Relationships are visualised as NeighbourNet diagram based on uncorrected P distances; a fully labelled version is presented in Additional file 4: Figure S3

Divergence time estimation, diversification dynamics and continuous phylogeographic analysis

The overall phylogenetic relationships inferred by BEAST analysis of the concatenated ITS and plastid datasets including several outgroup taxa (Fig. 4) were congruent with previous studies [24, 57] and resulted in poor resolution within Knautia sect. Trichera. In addition, the inferred divergence times in the outgroup were slightly older, but largely within the ranges (95 % highest posterior densities, HPDs) inferred by Carlson et al. [57]. In our analysis the origin of Knautia, i.e. its divergence from Pterocephalidium, was dated to the early Miocene, 21.6 (11.7–35.7) Ma, whereas its diversification started in the mid Miocene at 15.9 (8.2–26.6) Ma with the divergence of sect. Knautia. The split of sections Trichera and Tricheroides might have occurred at 10.5 (5.3–18.0) Ma and the divergence within sect. Trichera in the Pliocene, 4.3 (2.1–7.9) Ma, with the main phase of diversification dated to the Pliocene and Pleistocene. The divergence dates obtained by the BEAST analysis for the dataset pruned to Knautia (not shown) were highly congruent with the analysis of the entire data set and the HPDs were strongly overlapping: the beginning of diversification of Knautia was dated to 16.2 (7.7–23.9) Ma, the split between sections Trichera and Tricheroides to 10.7 (3.9–18.4) Ma and the onset of diversification in sect. Trichera to 4.0 (1.1–7.6) Ma.

Fig. 4

Bayesian consensus chronogram of the concatenated ITS and plastid datasets obtained with BEAST. Numbers above branches are PP values >0.50 (they were omitted within the crown groups of section Trichera), numbers in bold associated with nodes indicate the mean crown group age in millions of years of the clade diversifying at that node and the bars correspond to the 95 % highest posterior densities of the age estimates. Population identifiers correspond to Additional file 1: Table S1. The insert shows a lineage-through-time plot displaying the dynamics of diversification of Knautia. The black line represents the MCC tree from the BEAST analysis and the grey lines represent the interval resulting from all sampled trees after burnin

The continuous phylogeographic analysis (Fig. 5) revealed that the beginning of diversification of Knautia was centred in the Eastern Mediterranean, roughly in the area of the eastern Balkan Peninsula, from where it slowly spread to the neighbouring regions. The diversification of sect. Trichera might have started in central parts of the Balkan Peninsula roughly 4 Ma, from where it significantly expanded its range only in the last 1.5 Ma. Accordingly, most extant lineages originated in the Plio- and Pleistocene, as displayed in the lineage-through-time (LTT) plot (Fig. 4). Based on the estimate of 4.0 (1.1–7.6) Ma for the onset of diversification of K. sect. Trichera and 50 species, we estimated the diversification rate to be approximately 0.42–2.87 species/Myr for the crown group, assuming no extinction (ε = 0), and 0.23–1.54 species/Myr, assuming a high proportion of extinction (ε = 0.9). These, as well as stem-group-based diversification rates, are presented in Table 2.

Fig. 5

Snapshots of estimated ancestral node areas in the Maximum Clade Credibility tree (obtained with BEAST) of combined ITS and plastid datasets of Knautia at different time horizons as visualised using the software SPREAD. The starting point of diversification is indicated with an asterisk in the upper left figure, the 80 % highest posterior density areas for nodes are indicated as grey polygons, and the time scale of diversification is indicated in million years before present in the upper right corner in each panel. Distribution of land in the corresponding periods is indicated by green polygons in the two upper and the left middle panels (from Rögl [81]: Figs. 8 and 12 for the two upper panels, and from Meulenkamp and Sissingh [85]: Fig. 7 for the left middle panel). The coloured lines show the diversification of Knautia sections: black, section Trichera; yellow, section Knautia; red, section Tricheroides. The distribution of K. sect. Trichera is indicated by a dashed line in the right lower panel

Table 2 Diversification rates (birth–death) in species per Myr in Knautia sect. Trichera assuming 50 species, following the method of Magallón and Sanderson [65] and using 95 % highest posterior density intervals of the age estimates from BEAST analysis of a dataset pruned to Knautia. ε, extinction rate

AFLP data

Thirty-two individuals failed to produce reliable AFLP profiles and were excluded, resulting in a final dataset including 350 individuals. A total of 1334 AFLP fragments were scored; 174 bands were found in only one individual and were excluded from further analyses. The average replicate error rate (according to Bonin et al. [47]) was 2.17 %. Analyses of sect. Trichera (i.e., excluding the outgroup) were based on a matrix with 345 individuals and 1149 AFLP fragments. We acknowledge that the number of fragments is high (on average, one fragment was scored every 1.2 bp), which could introduce considerable homoplasy. Empirical tests, however, showed that reducing the number of fragments by applying more conservative scoring strategies yielded considerably worse resolution in terms of tree structure and bootstrap support. This indicates that the increased amount of data was not outweighed by increased homoplasy.

The NJ analysis (Additional file 6: Figure S5) supports the divergence of sect. Tricheroides represented by K. integrifolia from sect. Trichera (bootstrap support, BS, 100). Knautia pancicii (BS 100) was sister to the remaining species with low support (BS 63). Species forming well-supported (BS ≥ 95) branches included K. albanica, K. carinthiaca, K. collina, K. involucrata, K. lebrunii, K. mollis and K. subscaposa. The backbone of the tree was unresolved. Nonhierarchical K-means clustering revealed an optimal separation of the dataset into ten groups that showed good overall congruence with the NeighbourNet diagram (Fig. 6; Additional file 7: Figure S6 presents the NeighbourNet diagram from Fig. 6 complemented with population IDs and species names). Most species groups previously identified in diploid accessions only [24] are still recognizable in the heteroploid data set (Fig. 6). The only supported species groups were the Midzorensis Group (BS 61) and the Montana Group (BS 99). In order to make reading easier and to contrast genetic relationships with taxon-specific average leaf shapes, separate NeighbourNet diagrams are shown for southwestern, central and southeastern Europe in the Additional file 8: Figures S7–S9.

Fig. 6

Amplified Fragment Length Polymorphism (AFLP) variation in 251 populations of 51 species of Knautia sect. Trichera. Relationships are visualised as NeighbourNet diagram based on uncorrected P distances. Dots at the tips of branches indicate ploidy levels: white, diploid; grey, tetraploid; black, hexaploid. The colours of individual branches indicate the ten genetic clusters identified as optimal solution by K-means clustering. Nine groups, whose circumscription was additionally informed by the clustering of diploid accessions [24] and the topology of the NeighbourNet, are indicated by thick black lines. Species assigned to more than one cluster are highlighted with dots, whose colours reflect all clusters a species is assigned to. An enlargeable version of Fig. 6 with labelling of terminal splits is presented as Additional file 7: Figure S6

Discussion

Spatiotemporal diversification of Knautia and radiation of Knautia sect. Trichera

Knautia is sister (Additional file 2: Figure S1 and Additional file 3: Figure S2; [24]) to the monotypic western Mediterranean Pterocephalidium [41], which constitutes the tribe Pterocephalidieae together with the also monotypic montane southeastern African half-shrub Pterothamnus [73]. The split between Pterocephalidium and Knautia likely occurred in the Early Miocene (Fig. 4), and the diversification of Knautia was centred in the Eastern Mediterranean as reconstructed by relaxed random walks (Fig. 5). The Mediterranean region is considered one of the Earth’s 25 biodiversity hot spots [74], hosting ca. 24,000 plant species of which 60 % are endemic [75]. Despite its younger age, the Eastern Mediterranean appears to be more diverse than the Western Mediterranean [76] and is thus often considered a reservoir for plant evolution or a cradle for lineage diversification [77–80]. This, obviously, was also the case in Knautia.

Applying the same calibration points and estimating similar divergence times within the outgroup as Carlson et al. [57], the dating analysis suggests that the onset of diversification of Knautia was in the Middle Miocene ca. 16 Ma (Fig. 4), when the Mediterranean Sea and the Paratethys transgressed and the Eastern Mediterranean was a mosaic of bigger and smaller islands [81]. Roughly 5 Myr later, the divergence between sections Trichera and Tricheroides might have started in approximately the same region. At that time, when the land-sea configuration was already similar to the present [81], Knautia was distributed throughout the central and eastern Balkan Peninsula and westernmost Anatolia (Fig. 5b). The genus persisted in the same area for another 6 Myr, and the diversification of sect. Trichera started in central parts of the Balkan Peninsula roughly 4 Ma. Extensive spread of sect. Trichera out of the Balkans started in the Pleistocene about 1.5 Ma, extending the range westwards along the southern margins of the Alps and eastwards to central Anatolia. All other areas were colonised within the last 1 Myr. The species-poor sections Knautia and Tricheroides remained centred in the Eastern Mediterranean; only K. integrifolia from sect. Tricheroides spread to the Western Mediterranean (e.g., [41, 82]). These sections’ diversity increased to only two species each [24], whereas sect. Trichera underwent rapid diversification in the Pliocene, as displayed in the LTT plot (Fig. 4), which resulted in the ample geographic distribution of haplotypes and ribotype groups that are shared across species boundaries and ploidy levels (Figs. 2 and 3).

Our reconstruction of massive Pliocene and Pleistocene radiation within sect. Trichera is in stark contrast with Bell et al. [28], who suggested that the radiation in Knautia resulting in today’s species diversity took place much earlier (45–4.28 Ma). Their estimate likely reflects the initial, sectional diversification of Knautia, whereas the radiation of K. sect. Trichera is certainly younger. The estimated diversification rate of the crown group of 0.42–2.87 species/Myr (assuming no extinction) overlaps with the rate estimated for Tragopogon (0.84–2.71 species/Myr; [28]), Dianthus (0.66–3.89 species/Myr; [29]) and some other European-centred genera with rapid rates of diversification (reviewed by Valente et al. [29]). Consequently, diversification of sect. Trichera took place at a similar time horizon as in other genera such as Astragalus, Centaurea, Dianthus, Scabiosa, Scorzonera and Tragopogon [28]. As in these genera, it was likely triggered by climatic and topographic changes in the Mediterranean following the Messinian Salinity Crisis in the late Miocene [83–86], when the warm and humid climate of the Miocene shifted to clear seasonality with summer droughts and cold, humid winters [29]. Furthermore, uplift of the southern European mountain systems [85] led to an increased altitudinal differentiation in the vegetation [29, 87]. Subsequently, the climatic oscillations of the Pleistocene likely stimulated the alternation of phases of allopatric divergence with periods of secondary contacts of previously isolated lineages [88–90], thereby reshuffling species distributions and triggering reticulation and polyploidisation (see below). Finally, the spread of grasslands in the course of Holocene anthropogenic deforestation certainly contributed to the range expansion of the nowadays most widespread species, K. arvensis, and triggered secondary contacts with other, previously isolated lineages [24, 91].
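To make the meaning of a crown-group rate "assuming no extinction" concrete, the sketch below uses the widely applied pure-birth estimator r = (ln N − ln 2) / t, where N is the number of extant species and t the crown age. This is an assumption made for illustration: the text does not state which estimator or input values produced the reported range, and the species number and crown age used here are placeholders, not values taken from the analysis. The function name `crown_rate_pure_birth` is hypothetical.

```python
import math


def crown_rate_pure_birth(n_species, crown_age_myr):
    """Net diversification rate (species/Myr) of a crown group without extinction.

    A crown group starts with two lineages, hence the subtraction of ln(2).
    """
    if n_species < 2:
        raise ValueError("a crown group needs at least two extant species")
    if crown_age_myr <= 0:
        raise ValueError("crown age must be positive")
    return (math.log(n_species) - math.log(2)) / crown_age_myr


# Placeholder inputs for illustration only: 50 species, crown age of 4 Myr.
rate = crown_rate_pure_birth(50, 4.0)
print(f"{rate:.2f} species/Myr")  # 0.80 species/Myr
```

Because the rate scales inversely with crown age, the uncertainty of the age estimate translates directly into a range of rates, which is why such estimates are usually reported as intervals.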

Polyploid Knautia mostly evolved within previously recognised diploid groups

Diversification of sect. Trichera was strongly enhanced by polyploidisation, which occurred independently many times, but did not significantly influence the overall genetic pattern inferred for diploids [24]. The inclusion of polyploids in the diploid AFLP framework revealed that, with the exception of hexaploid K. dipsacifolia, all tetraploids and hexaploids are nested within diploid groups (Fig. 6). Tetraploids are observed in almost all evolutionary lineages, whereas the much rarer hexaploids are restricted to a few AFLP groups (Fig. 6). The ITS data also support this pattern, as hexaploids appear in only two lineages revealed by the NeighbourNet, whereas tetraploids are present in all lineages (Fig. 3). Hexaploids are also geographically more limited, with larger distribution areas in the Alps and north of them and with small isolated occurrences on the western Balkan Peninsula and the northeastern Iberian Peninsula (Fig. 1; [25, 30]). In contrast, tetraploids are present throughout the distributional range of the genus except for its extreme east [25].

Of the eleven AFLP groups previously inferred from diploid accessions [24], polyploids originated in all but three: (1) the Pancicii Group comprising only K. pancicii, (2) the Montana Group constituted by K. involucrata and K. montana, two species from Anatolia and the Caucasus at the eastern edge of the genus’ distribution area, and (3) the South Arvensis Group (Fig. 6). The last case is of particular interest, as this group is well covered by our sampling. Ecologically, members of the South Arvensis Group share a preference for dry grasslands with many species of the strongly heteroploid Xerophytic Group (Fig. 6), which precludes inference of ecological causes for the absence of polyploidy in the South Arvensis Group. A possible explanation for the observed pattern might be the relative stability of environmental conditions in the areas south of the Alps throughout the Pleistocene [76, 92, 93], conferring distributional stasis at least on a larger scale [94]. Nevertheless, polyploidisation was extensive in these areas in other groups of sect. Trichera. The shallow genetic structure as well as the scattering of tetraploids throughout most of the phylogeny (Figs. 2 and 3, Additional file 2: Figure S1, Additional file 3: Figure S2) make it difficult to establish a minimum number of polyploidisation events giving rise to tetraploids in sect. Trichera, which appear in six AFLP groups. Hexaploids originated at least four times, i.e. in the heteroploid Longifolia, Xerophytic and SW European Groups as well as in the exclusively hexaploid Alpine Dipsacifolia Group (Fig. 6). Similarly, the parsimony network of plastid haplotypes suggests at least eight and three independent origins for haplotypes retrieved from tetraploids and hexaploids, respectively. We emphasise that these numbers should be viewed with caution and represent minimum estimates.

Some of the previously inferred species groups [24] were strongly inflated by the inclusion of polyploid accessions (Figs. 3 and 6; circumscription and detailed characterisations are provided in Table 3 and Additional file 8). Knautia velutina and a few accessions of K. illyrica and K. purpurea previously included in the Xerophytic Group had to be transferred to the Longifolia Group (Fig. 6). Whereas the Midzorensis and SW European Groups could be maintained, the previously recognised Carinthiaca Group and North Arvensis Group on the one hand and the Drymeia Group and the Dinarica Group on the other hand were not separable upon the inclusion of polyploids. Therefore, they are here united as Carinthiaca & North Arvensis Group and Drymeia & Dinarica Group (Fig. 6). As in our previous study [24] the widespread and morphologically heterogeneous [72] diploid-tetraploid K. arvensis is non-monophyletic and falls into three groups (Fig. 6). The same applies to the tetra-hexaploid K. dipsacifolia, which appears in three different groups, one of which is the newly proposed Alpine Dipsacifolia Group. Further cases of polyphyly concern populations of K. illyrica and K. purpurea, which occur in two different groups. Future research will show if these four species need to be split into several taxonomic entities. As we expected that separating the AFLP data set into three regional groups (southwestern, central and southeastern Europe; for a circumscription see Fig. 2b) would improve readability and resolution, we present regional NeighbourNets in the Additional file 8: Figures S7–S9. Whereas BS support values for species and a few species clusters tended to increase because of the overall reduced variability in the regional data sets, some species groups did not form clusters anymore (e.g., Carinthiaca & North Arvensis Group, South Arvensis Group; Additional file 8: Figure S8). 
Addition of typical leaf shapes drawn from specimens of investigated populations (Additional file 8: Figures S7–S9) shows that leaf shape, which is one of the most important characters in Knautia alongside indumentum composition and quantity [21, 37], varies strongly within most of the species groups. However, this is not surprising as some of the species exhibit highly divergent leaf shapes even within populations (e.g., K. arvensis, K. dinarica, K. nevadensis, K. travnicensis; P. Schönswetter & B. Frajman, field observations).

Table 3 Comparison of the species groups within Knautia sect. Trichera proposed by us with those of Ehrendorfer [21, 30]. Information in square brackets refers to accessions of the same species belonging to different AFLP groups. The addition “p.p.” after a species’ name indicates that the species is included in more than one AFLP group. Distributions of individual species are characterised based on floras as well as on the authors’ field observations

Whether polyploids originated via autopolyploidy or allopolyploidy is unclear due to the weak genetic separation among species. In a separate analysis based on a comprehensive population sampling in K. drymeia, both auto- (i.e. within the same genetic lineage) and allopolyploid (i.e. between genetic lineages) origins of tetraploids were inferred within the same species [95], suggesting that this might also be the case in other polyploid species of Knautia. Recurrent evolution of polyploids is also evident for K. dipsacifolia, which falls into three different AFLP groups (Fig. 6). This is consistent with numerous molecular phylogenetic studies, which have demonstrated the recurrent formation of polyploids in many plant groups (e.g., [96–98]). In a geographical context, AFLP differentiation among species as expressed by bootstrap support tends to be less pronounced in Central Europe than further south (Additional file 8: Figures S7–S9). This also results in a poor fit between AFLP-based relationships and current taxonomy, as accessions of widespread species such as K. arvensis or K. dipsacifolia fail to cluster. Such a pattern likely reflects a more dynamic glacial and postglacial history in Central Europe as compared to a more static scenario in Southern Europe, which also emerged previously in intraspecific phylogeographic studies (e.g., [99–101]). Although populations with mixed ploidy levels are known in Knautia [102], they were not found in our previous study [25] including 381 populations. This suggests that intrapopulational cytotype mixture is rare and likely restricted to primary contact zones, i.e. areas of recent polyploidisation events [103], as shown for the locally endemic K. serpentinicola [40, 104]. Nevertheless, the large geographical distribution of many polyploids indicates their successful and stable long-term establishment.

Conclusions

Altogether, the heteroploid, species-rich K. sect. Trichera is a prime example of rapid diversification mostly taking place during the Pliocene and Pleistocene. Addition of polyploids to a previously established phylogenetic framework for diploids [24] revealed that polyploids have originated mostly within the groups of diploid species rather than between groups. Generally, discrepancies remain between the circumscription of the species groups proposed here on the basis of AFLP data and the formerly recognised groups defined by morphological, karyological and eco-geographical criteria [21]. In addition, several species appear in two or three of the weakly defined genetic groups. As most Knautia species (exceptions being a few forest understory herbs) are light-demanding inhabitants of various types of grassland or forb communities, it appears likely that forest advance during warm stages of the Pleistocene has led to the separation of gene pools [91]. Such separation was terminated by the expansion of grasslands during cold or dry periods, which made secondary contacts possible. Numerous cycles of habitat fragmentation and subsequent reconnections likely promoted interspecific hybridisation (as detected between K. arvensis and K. carinthiaca by Čertner et al. [91]) and eventually polyploidisation, and resulted in the highly complex and heterogeneous genetic constitution of several Knautia species. Extensive haplotype sharing and unresolved phylogenetic relationships suggest that these processes occurred rapidly and extensively within sect. Trichera. Although our taxonomically almost complete phylogeny revealed general patterns of the genus’ evolution, the weak and partly contradictory phylogenetic structure renders it premature to make taxonomic decisions. On the contrary, it appears likely that (i) the dynamic polyploid evolution of sect. Trichera, (ii) the lack of crossing barriers within ploidy levels likely supported by the conserved floral morphology, (iii) the highly variable leaf morphology and (iv) the unstable indumentum composition prevent establishing a well-founded taxonomic framework. All this is in perfect agreement with the section’s reputation as one of the most intricate taxa of the European flora.