Introduction

Edible carrot (Daucus carota L. subsp. sativus Hoffm.) is an important vegetable grown worldwide. It is one of the main sources of dietary pro-vitamin A carotenoids (Simon 1990). Variation in the carotenoid content and composition largely depends on the cultivar, resulting in roots of various shapes and white, yellow, orange, or red color, which can be masked by purple anthocyanins (Baranski et al. 2011). Historical data indicate that edible carrot originated in the Afghanistan region before the tenth century. Those old carrots, known as Eastern carrots, were yellow- or purple-rooted. Their cultivation spread to central and north Asia, and then to Japan (seventeenth century). The Near East is commonly accepted as the secondary centre of diversity of cultivated carrots. In contrast to Eastern carrots, Western carrots are characterized by less pubescent leaves and a lower tendency to early flowering. Yellow and purple carrots were grown in Europe in the Middle Ages, but then they were gradually replaced by white and then orange-rooted forms, which appeared in the early seventeenth century, presumably as a result of selection from yellow carrot and/or hybridization of cultivated carrot and its wild relatives (Rubatzky et al. 1999). Orange-rooted carrots spread from Europe to other continents and became predominant in commercial production worldwide. Carrots of other root colors are more commonly grown in Asia and have only recently been reintroduced to European and American specialty markets (Simon et al. 2008). A long history of carrot selection and the use of diverse parental materials in breeding programs throughout the world have resulted in considerable variation in available cultivars. Additionally, allogamy and easy hybridization with the wild carrot make delimitation of carrot genetic pools difficult when only morphological characters are considered (Grzebelus et al. 2011). This is particularly evident in modern advanced Western-type cultivars bred to develop purple, red, or yellow roots, which combine morphological features characteristic of Asian and European or American carrots.

Microsatellite or simple sequence repeat (SSR) markers have proved useful in the assessment of genetic diversity of populations occurring in natural habitats and large gene bank collections, as well as in revealing relationships between crop plants and their wild relatives (Varshney et al. 2010; Kalia et al. 2011). In Daucus, most molecular techniques used to date have not uncovered a clear population structure (Bradeen et al. 2002), although delimitation between cultivated carrots and wild populations using AFLP markers was achieved for a small number of accessions (Shim and Jørgensen 2000). Identification of SSR loci in carrot was initiated by Niemann (2001) for linkage mapping. Another set of SSR markers was used to study gene dispersal in wild carrot populations (Umehara et al. 2005; Rong et al. 2010). Recent results of Clotault et al. (2010) indicated that SSR markers were helpful in evaluating genetic diversity in cultivated carrot. However, no detailed information on the polymorphism of the SSR loci was provided in the latter report.

In the present study we assess carrot genetic diversity based on polymorphisms at 30 SSR loci in a collection of 88 cultivars and landraces. We report molecular evidence for divergence between the Eastern (Asian) and Western (European) genetic pools.

Materials and methods

Seeds of 88 cultivated carrot (Daucus carota L. subsp. sativus Hoffm.) accessions were obtained from gene bank collections, research institutes, and breeding companies (Table 1). According to donors’ information, the accessions originated from Europe (53 accessions), continental Asia (14), Japan (10), USA (5), and one accession each from Brazil, Australia, and Ethiopia. For three accessions there was no information available regarding their origin.

Table 1 List of accessions and their assignment to clusters identified in this work

The plants were grown in a 3:1 mixture of sand and commercial humus in 19 cm pots. The glasshouse conditions were optimized for carrot growth, i.e. 20–25°C during the day and 10–15°C at night, and about 60% relative humidity. Young leaves of individual plants were freeze-dried. After 105 days of growth, roots of the same plants were harvested and assessed for root shape and for root color, both at the root surface and in cross-section.

DNA was extracted from freeze-dried leaves using the CTAB method. Polymerase chain reactions were carried out in a reaction mixture containing 20 ng DNA, 0.25 mM dNTP, 0.5 μM of each primer, 2 mM MgCl2, 0.5 U TrueStart polymerase, and 1× reaction buffer (Fermentas) in a Mastercycler Gradient (Eppendorf) thermocycler programmed as follows: 1 cycle at 94°C for 5 min; 40 cycles of 94°C for 20 s, 48–65°C (depending on the primer used) for 30 s, and 72°C for 1 min; and a final extension at 72°C for 5 min. Thirty SSR markers previously developed by Niemann (2001), Rong et al. (2010), and Cavagnaro et al. (2011) were analyzed (Supplementary Table 1). The amplified fragments were separated in 6% denaturing polyacrylamide gels and detected after silver staining.

Allele frequencies were used to calculate indices of marker information content and genetic diversity implemented in GenAlEx 4.6 (Peakall and Smouse 2006), CERVUS (Kalinowski et al. 2007), and HP-RARE (Kalinowski 2005). Genetic structure was investigated using a Bayesian clustering approach without information on the accession origin, assuming the admixture model and correlated allele frequencies (STRUCTURE 2.2.3; Pritchard et al. 2000). Seven independent simulations with a burn-in length of 10,000 and a run length of 100,000 were used for each number of clusters K set from 1 to 12. For the most likely number of genetic clusters, the run parameters were increased by a factor of 10. Principal coordinate analysis (PCoA), analysis of molecular variance (AMOVA), and an unbiased estimate of FST with jackknifing over loci were used for the assessment of genetic diversity.
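The jackknife over loci mentioned above can be illustrated with a minimal sketch. The source does not name the FST estimator or the software used for this step, so the sketch assumes a ratio-of-sums estimator (in the style of Weir and Cockerham) in which each locus contributes a between-cluster variance component and a total variance component. The function name, its arguments, and the example values are hypothetical and are not data from this study.

```python
import math
from typing import Sequence, Tuple


def jackknife_ratio(num: Sequence[float], den: Sequence[float]) -> Tuple[float, float]:
    """Ratio-of-sums estimate with a delete-one-locus jackknife standard error.

    num[l] and den[l] are assumed per-locus variance components
    (between-cluster and total); their origin is outside this sketch.
    """
    n = len(num)
    if n < 2 or n != len(den):
        raise ValueError("need at least two loci and equal-length inputs")
    total_num, total_den = sum(num), sum(den)
    estimate = total_num / total_den
    # Recompute the ratio with each locus left out in turn.
    leave_one_out = [
        (total_num - a) / (total_den - d) for a, d in zip(num, den)
    ]
    mean_loo = sum(leave_one_out) / n
    # Standard delete-one jackknife variance: (n - 1) / n * sum of squared deviations.
    se = math.sqrt((n - 1) / n * sum((x - mean_loo) ** 2 for x in leave_one_out))
    return estimate, se


# Hypothetical components for five loci (illustration only).
numerators = [0.020, 0.035, 0.010, 0.050, 0.025]
denominators = [0.30, 0.32, 0.25, 0.40, 0.28]
fst_estimate, fst_se = jackknife_ratio(numerators, denominators)
```

Loci with unusually large components shift the leave-one-out estimates most, so the standard error reflects how strongly the overall estimate depends on individual markers.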

Results and discussion

All 30 SSR loci were polymorphic and there were no duplicates in the collection. In total, 227 alleles were identified with a mean of 7.6 per locus (Supplementary Table 2), similar to the mean obtained for the carrot collection studied by Clotault et al. (2010). Most of the alleles (66%) had frequencies below 0.1 and only 9% occurred with frequencies above 0.4. About half of the alleles (51%) were rare (freq. < 0.05) and were detected in all except one locus (Supplementary Fig. 1). In 12 loci, 19 unique alleles were identified (8.4% of all alleles). The effective number of alleles per locus (3.17), which minimizes the input of alleles with low frequencies, was less than half of the total number of alleles. The markers developed by Cavagnaro et al. (2011) were more discriminating than those reported by Rong et al. (2010), as the mean polymorphic information content (PIC) was higher for the former (0.67 ± 0.03 s.e. and 0.50 ± 0.06 s.e., respectively). The PIC for SSR loci identified by Niemann (2001) was intermediate.
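To show why the effective number of alleles falls well below the allele count when most alleles are rare, the following minimal sketch computes both indices for a single locus. It assumes the commonly used definitions, Ne = 1 / Σp² and the PIC formula of Botstein et al. (1980); the study itself obtained these values from the software listed in Materials and methods. The function names and the example allele frequencies are hypothetical.

```python
from typing import Sequence


def effective_alleles(freqs: Sequence[float]) -> float:
    """Effective number of alleles, Ne = 1 / sum(p_i^2)."""
    return 1.0 / sum(p * p for p in freqs)


def pic(freqs: Sequence[float]) -> float:
    """Polymorphic information content (assumed Botstein et al. 1980 formula)."""
    squares = [p * p for p in freqs]
    homozygosity = sum(squares)
    # Sum over pairs i < j of 2 * p_i^2 * p_j^2, written as
    # (sum p^2)^2 - sum p^4 to avoid a double loop.
    cross_term = homozygosity ** 2 - sum(s * s for s in squares)
    return 1.0 - homozygosity - cross_term


# Hypothetical locus with one common allele and five rarer ones.
example = [0.70, 0.10, 0.08, 0.05, 0.04, 0.03]
ne = effective_alleles(example)
pic_value = pic(example)
```

For this hypothetical locus the effective number of alleles is about 2, although six alleles are present, because the squared frequencies of the rare alleles contribute very little to the sum.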

The observed heterozygosity (Ho = 0.33) was, on average, much lower than the expected heterozygosity (He = 0.63). The latter measure of diversity is similar to He = 0.73 reported earlier for a collection of 47 other cultivars (Clotault et al. 2010). For most loci, Wright’s fixation index F was significantly higher than zero and reached up to 0.84, indicating an excess of alleles in the homozygous state, which could be expected for advanced cultivars and breeding populations. The putative presence of null alleles could also contribute to high F-values, although their presence in the collection was not confirmed.
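The relationship between the two heterozygosity measures and the fixation index can be made concrete with a short sketch. It assumes the usual definition F = 1 − Ho/He and applies it to the collection-wide means quoted above, which is only an approximation: the study reports F per locus, and the mean of per-locus values need not equal the value derived from mean Ho and He. The function name is hypothetical.

```python
def fixation_index(h_obs: float, h_exp: float) -> float:
    """Wright's fixation index, F = 1 - Ho / He (assumed standard definition)."""
    if h_exp <= 0:
        raise ValueError("expected heterozygosity must be positive")
    return 1.0 - h_obs / h_exp


# Mean values quoted in the text; the result is a rough collection-wide
# figure, not the per-locus F reported by the authors.
f_from_means = fixation_index(0.33, 0.63)  # about 0.48
```

A value near 0.48 means that roughly half of the heterozygosity expected under random mating is missing, which is in line with the homozygote excess described in the text.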

A Bayesian approach for clustering accessions was applied to investigate genetic structure in the collection. Two clusters comprising 17 and 61 accessions, respectively, with an assignment probability above 0.6 were identified (Fig. 1). In addition, 10 accessions with probabilities between 0.4 and 0.6, which were ambiguously associated with cluster 1 or 2, were not assigned to either cluster. Cluster 1 included accessions from continental Asia (11 accessions), Japan (2), USA (2), and Ethiopia (1), all classified there with probabilities >0.8, and a single European cultivar, ‘Yellowstone’, classified with a lower probability (0.67). Cluster 2 included accessions from Europe (48), Japan (7), USA (4), continental Asia (1), and Australia (1). Diversity of the complete collection of 88 accessions was also revealed by PCoA (Fig. 2). The first three axes together explained 60% of the total variation, and the first axis alone (26% of the variation) mainly differentiated accessions classified to cluster 1 and cluster 2 by the Bayesian approach. The second and the third axes contributed equally to PCoA. The latter differentiated Japanese accessions from the rest.
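The assignment rule described above can be written as a small function. This is a minimal sketch assuming a two-cluster run in which the two membership probabilities of an accession sum to 1; the function name and the treatment of values exactly at the 0.6 boundary are assumptions, since the text only gives the ranges.

```python
from typing import Optional


def assign_cluster(q1: float, threshold: float = 0.6) -> Optional[int]:
    """Return 1 or 2 for a K = 2 run, or None when membership is ambiguous.

    q1 is the membership probability for cluster 1; the probability for
    cluster 2 is taken as 1 - q1 (an assumption valid only for K = 2).
    """
    if not 0.0 <= q1 <= 1.0:
        raise ValueError("membership probability must lie in [0, 1]")
    q2 = 1.0 - q1
    if q1 > threshold:
        return 1
    if q2 > threshold:
        return 2
    # Probabilities in the 0.4-0.6 band are left unassigned.
    return None


# 'Yellowstone' was reported with a probability of 0.67 for cluster 1.
yellowstone_cluster = assign_cluster(0.67)
```

Under this rule an accession with a probability of 0.5 returns None and is reported as not assigned, matching the "n.a." category in Fig. 1.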

Fig. 1 Structure of the genetic diversity of the 88 carrot accessions based on a Bayesian approach and assuming two gene pools. Letters denote root color: P purple, R red, W white, Y yellow; no letter indicates orange color; squares indicate European accessions; n.a. not assigned (assignment probability 0.4–0.6)

Fig. 2 PCoA of 88 carrot accessions based on polymorphism of 30 SSR loci

The results of both PCoA and Bayesian clustering were highly congruent and revealed that most carrot cultivars could be separated into two genetic pools, although there was no clear delimitation between them. The first pool comprised predominantly landraces originating from continental Asia, all with yellow, red, or purple roots, and two cultivars from Japan with red and yellow roots (Table 1). It also comprised four accessions from other world regions that developed purple roots, typical of Asian carrot. ‘Yellowstone’, the only European cultivar in cluster 1, and the African landrace ‘Long Red’ were also yellow-rooted. Thus, all accessions found in cluster 1 were directly sampled in Asia and/or had a root color characteristic of that region. One of these accessions, ‘Syrian Purple’, was distinguished by its pubescent leaves, typical of the Eastern carrot type, whereas this trait was not observed in the remaining accessions. Thus, this group of accessions can be defined as the Asian gene pool. In contrast, cluster 2 comprised mainly European accessions and only a single breeding material of Asian origin. Most accessions in cluster 2 (92%) developed orange roots typical of the Western carrot type. Additionally, among the 61 accessions belonging to cluster 2, 23 developed roots of oblong or narrow oblong shape, which are common shapes for Western-type cultivars. In contrast, none of the cluster 1 accessions had roots of that shape; they had thicker, shorter roots or narrow obtriangular roots. As the root shape is the most discriminating character among orange cultivars, cluster 2 can thus be considered the Western gene pool.

The presence of American and orange-rooted Japanese cultivars in the Western gene pool identified here by SSR analysis is expected, given that edible carrot spread to North America from Europe. Improved carrot cultivars obtained by American breeders over the following centuries are, in general, the descendants of those old materials. Some European cultivars were also adapted for production in Japan, starting from the eighteenth century. They were extensively used in breeding programs and crossed with Asian carrots, which resulted in the development of distinct orange carrot types known as ‘Kokubu’, ‘Gosun’, and ‘Kuroda’. Thus, many new Japanese cultivars are of Western type (Simon et al. 2008). The Japanese cultivar ‘Kokubu’ was also used as one of the parental components in the creation of the American HCM and Beta III populations (Peterson et al. 1988; Simon et al. 1989). The close proximity of both of these accessions to the group of Japanese accessions in the PCoA scatter plot is thus congruent with information on their pedigree.

The divergence between the two clusters representing the Asian and Western gene pools was moderate (FST = 0.097 ± 0.014 s.e.) but highly significant (P < 0.01). Furthermore, partitioning of variation using AMOVA attributed 8.8% (P < 0.01) of the total genetic diversity to the variation between clusters. The cluster divergence resulted from the presence of private alleles in each cluster, but was also supported by the presence of 23 other alleles occurring with frequencies above 0.9 in one of the two clusters. Allelic richness, a useful parameter for comparing populations differing in size, was higher for cluster 1 than for cluster 2 by 20% (P < 0.01) (Table 2). Private allelic richness, which estimates the presence of unique alleles for a cluster, was over two-fold higher for cluster 1 (P < 0.01), although the number of private alleles was higher in cluster 2. Other parameters, i.e. PIC, Shannon’s index, Ho, and He, were also higher for cluster 1, indicating that the Asian gene pool had higher genetic diversity, which may partially result from the presence of landraces.

Table 2 Comparative parameters of genetic diversity for two clusters