Introduction

Almonds are California’s largest tree nut crop and the state produces over 80% of the world’s supply. Total almond production in California was a record 1.47 billion pounds in 2007/2008, a 24% increase over the previous year (ABC 2008). The consistent high demand for California almonds has been met by an increase in acreage planted over each of the past 10 years. In 2007, the estimated bearing acreage was 615,000 (United States Department of Food and Agriculture 2007). The bulk of California almonds are produced by a small number of elite cultivars; ‘Nonpareil’ alone produces nearly 40% of the California crop with most remaining cultivars being cross-compatible pollinizers for this self-sterile crop species. In the diploid almond, self-sterility is controlled by a single major (S-) locus, where haploid pollen containing an S-allele in common with either self or cross-pollinated pistils will result in failure of pollen growth to fertilization. Consequently, over 30% of the remaining production is from only four cultivars: ‘Carmel’, ‘Butte’, ‘Monterey’ and ‘Padre’ (ABC 2008) which are all fully cross-compatible with ‘Nonpareil’ and, with the exception of the intersterile ‘Butte’ and ‘Monterey’ combination, are inter-compatible with each other (Barckley et al. 2006). With substantial new plantings each year of proven inter-compatible cultivars, correct cultivar identification is critical to the continuing success of the industry. However, cultivar identification using morphological characteristics is difficult because trees are planted before distinguishing traits develop.
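The S-locus rule described above can be sketched in a few lines of Python. This is only an illustration, assuming a simple gametophytic model in which pollen is rejected whenever its single S-allele is also present in the pistil parent. The function names and the three-way labels are our own. The S-genotypes are those cited later in this paper for ‘Nonpareil’, ‘Mission’ and ‘Ne Plus Ultra’.

```python
def compatible_pollen_alleles(pollen_parent, pistil_parent):
    """Return the pollen parent's S-alleles that are absent from the pistil parent."""
    return sorted(set(pollen_parent) - set(pistil_parent))


def classify_cross(pollen_parent, pistil_parent):
    """Label a cross by how many of the pollen parent's S-alleles escape rejection."""
    accepted = compatible_pollen_alleles(pollen_parent, pistil_parent)
    if not accepted:
        return "incompatible"
    if len(accepted) == len(set(pollen_parent)):
        return "fully compatible"
    return "partially compatible"


# S-genotypes as reported in the text (Barckley et al. 2006).
S_GENOTYPES = {
    "Nonpareil": ("S7", "S8"),
    "Mission": ("S1", "S5"),
    "Ne Plus Ultra": ("S1", "S7"),
}

for pollen, pistil in [("Nonpareil", "Nonpareil"),
                       ("Mission", "Nonpareil"),
                       ("Ne Plus Ultra", "Nonpareil")]:
    label = classify_cross(S_GENOTYPES[pollen], S_GENOTYPES[pistil])
    print(f"{pollen} pollen on {pistil}: {label}")
```

Under this model, selfing ‘Nonpareil’ is incompatible, ‘Mission’ pollen on ‘Nonpareil’ is fully compatible, and ‘Ne Plus Ultra’ pollen on ‘Nonpareil’ is only partially compatible because its S7 pollen is rejected.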

Simple sequence repeats (SSRs) are used widely for cultivar identification of other woody, clonally propagated crops such as grape and walnut (Dangl et al. 2001; Dangl et al. 2005). Foundation Plant Services (FPS) is a service department based in the College of Agricultural and Environmental Sciences at the University of California, Davis. Its mission is to produce, test, maintain, and distribute disease-tested propagation material for use by nurseries and growers throughout the US and worldwide. FPS houses and maintains the foundation collections for the California Department of Food and Agriculture’s (CDFA) registration and certification programs for grapevines, deciduous fruit and nut trees, and strawberries. FPS stock qualifies and serves as the primary foundation source material for commercial increase for entire industries. Proper identification of cultivars is therefore a critical aspect of FPS’s mission.

A major benefit of using DNA markers is that trees can be identified at any developmental stage. SSR markers have been developed for Prunus, including almonds (Martínez-Gómez et al. 2003; Mnejja et al. 2005; Wünsch and Hormaza 2002). However, those reports did not describe streamlined protocols and markers specifically screened for efficient almond cultivar identification. In particular, they did not publish specific allelic data to facilitate development of a universal database of almond SSR marker profiles.

Here we describe an SSR marker system that distinguishes among all commercially important almond cultivars presently grown in California. The small study set contains many closely related cultivars, a particular challenge for a DNA marker-based identification system. The profiles published here uniquely identify all almond cultivars represented in the collection presently maintained by FPS. The procedures presented can be expected to distinguish among all other almond cultivars, and represent a practical system for almond cultivar identification at any developmental stage.

Materials and methods

Multiple plants of 21 almond cultivars were selected for study (Table 1). Tree leaf samples were from the almond collection at FPS with additional samples from commercial sources and the UC Davis Wolfskill Experimental Orchard included as checks. Young, non-fully expanded leaves were collected and rapidly dried at room temperature using chemical desiccant (Bautista et al. 2008). DNA extraction, PCR, fragment separation, and sizing of amplified fragments were performed according to Bautista et al. (2008) except for multiplex PCR, in which case 0.15 pmol µl−1 of both forward and reverse primers for each of three primer pairs was used.

Table 1 Almond cultivars used in this study

Results and discussion

An initial set of 14 representative almond cultivars was used to test 53 previously published primer pairs sequenced from several Prunus species for their ability to consistently amplify polymorphic fragments (Table 2). These primer pairs are described in those original publications as sequences flanking SSR loci. Here, we use the term “locus” to designate the portion of DNA amplified by a particular primer pair and refer to amplified fragments as alleles, though we did not re-sequence the amplified fragments in almond.

Table 2 Origin and citations for tested loci

Based on the initial screen of 14 almond cultivars, 29 loci were eliminated from further analysis for various reasons (Table 3). Alleles could not be scored for 18 primer pairs: ten primer pairs failed to amplify fragments, six amplified apparently random fragments, and two amplified multiple loci. An additional 11 loci had alleles that could be scored, but did not provide sufficiently useful information to include in the final data set. Four of these were monomorphic for all 14 cultivars in the screen and three had extremely low polymorphism, typically resulting from the presence of one very high frequency allele. Three loci were difficult to score accurately using automated systems due to poor amplification, the presence of single base pair differences in allele lengths and stuttering of the primary fragment. At one locus, all 14 cultivars in the screen were homozygous, resulting in very low polymorphism and suggesting the presence of high frequency null alleles.
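The screening counts in the paragraph above can be tallied as a quick check. The sketch below only restates numbers given in the text; the category labels are shortened paraphrases of ours.

```python
TOTAL_TESTED = 53  # primer pairs in the initial screen

# Primer pairs whose alleles could not be scored.
unscorable = {
    "failed to amplify": 10,
    "apparently random fragments": 6,
    "amplified multiple loci": 2,
}

# Loci that could be scored but were not informative enough to keep.
scorable_but_dropped = {
    "monomorphic": 4,
    "extremely low polymorphism": 3,
    "difficult to score automatically": 3,
    "all cultivars homozygous": 1,
}

n_unscorable = sum(unscorable.values())
n_dropped = sum(scorable_but_dropped.values())
n_eliminated = n_unscorable + n_dropped
n_retained = TOTAL_TESTED - n_eliminated

print(n_unscorable, n_dropped, n_eliminated, n_retained)
```

The totals come to 18 unscorable primer pairs and 11 uninformative loci, so 29 are eliminated and 24 of the 53 remain for further testing.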

Table 3 Results for failed and less polymorphic loci

Twenty-four loci were selected for further testing: all 12 loci from Table 4 and the 12 loci marked “in data set” from Table 3. These 24 loci reproducibly amplified alleles that behaved as a single Mendelian locus: there were only one or two alleles for a given almond cultivar and these alleles were inherited in a fashion consistent with published pedigrees. The locus BPPCT 038 showed artifacts; however, these were easily distinguished by the analysis software and Mendelian alleles were scored. Reliable polymorphic data at the selected 24 loci were obtained for 18 almond cultivars, including all almond cultivars at FPS (Supplemental data).

Table 4 Suggested loci for almond cultivar identification

Each primer pair was tested under only one standard set of PCR conditions. More primer pairs might have produced useful results under different PCR conditions. However, our goal was to develop a practical forensic “DNA fingerprinting” method to uniquely characterize all almond cultivars and use this method to confirm the identity of each almond tree in the FPS foundation blocks. Adoption of a single protocol for DNA amplification increases productivity and reduces lab errors.

The goal of this study was to develop a method to uniquely identify all current almond cultivars using automated DNA fragment analysis, to use this method to confirm the identity of the almond cultivars at FPS and to elucidate the relationships of the commercially important cultivars grown in California. This study set represents a very narrow germplasm. Such a limited germplasm is a very good sample set for choosing highly polymorphic markers; however, the resulting data set is not the large, diverse database needed to calculate meaningful allele frequencies for probability analysis.

The twelve most informative markers were separated into four groups of three each (Table 4). These groups could be amplified and their fragments analyzed as triplexes, reducing the time and cost of analysis. The first triplex alone is sufficient to uniquely identify all 21 cultivars in the study set (Table 5). We recommend using the first nine markers as a standard profile for almond cultivar identification. Adoption of a standard set of markers for cultivar identification facilitates data sharing and helps correct for variation in data analysis among labs (This et al. 2004). As more profiles for existing almond cultivars are added to this database (Table 5), one would expect more diversity rather than less. Thus, these nine markers, selected for being highly polymorphic in a limited, closely related set of cultivars, can be expected to differentiate among all almond cultivars except those originating as bud-sports.
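Whether a chosen marker set uniquely identifies every cultivar can be checked by grouping cultivars that share an identical multilocus profile. The sketch below is illustrative only: the cultivar names, locus names and allele sizes are invented placeholders, not values from Table 5, and the function name is ours.

```python
from collections import defaultdict


def indistinguishable_groups(profiles, loci):
    """Group cultivars whose genotypes are identical at every locus in `loci`."""
    groups = defaultdict(list)
    for cultivar, genotype in profiles.items():
        # Sort the allele pair so (150, 154) and (154, 150) compare equal.
        key = tuple(tuple(sorted(genotype[locus])) for locus in loci)
        groups[key].append(cultivar)
    return [sorted(names) for names in groups.values() if len(names) > 1]


# Hypothetical allele sizes in base pairs, for illustration only.
profiles = {
    "cv_A": {"locus_1": (150, 154), "locus_2": (200, 200), "locus_3": (120, 126)},
    "cv_B": {"locus_1": (154, 150), "locus_2": (200, 204), "locus_3": (120, 126)},
    "cv_C": {"locus_1": (150, 154), "locus_2": (200, 204), "locus_3": (122, 126)},
}

print(indistinguishable_groups(profiles, ["locus_1"]))
print(indistinguishable_groups(profiles, ["locus_1", "locus_2"]))
print(indistinguishable_groups(profiles, ["locus_1", "locus_2", "locus_3"]))
```

An empty result means the selected loci separate every cultivar in the set. In this toy data one locus separates none of the three cultivars, two loci leave one pair unresolved, and three loci resolve all of them. Bud-sports would be expected to remain grouped with their source cultivar regardless of how many loci are added.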

Table 5 Allele sizes (in base pairs) for almond cultivars addressed in this study

In addition to allowing an unambiguous identification of almond cultivars, the SSR markers reported here can be used to study cultivar pedigrees. A progeny shares one allele at each locus with each of its parents. This study set of 18 almond cultivars is neither large nor diverse enough to calculate probabilities for parentage analysis. However, a consistent result for both parents and a progeny at all 24 SSR loci provides strong, if not quantifiable, evidence to support the relationship, particularly if it confirms previous reports.
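The parentage rule stated above (a progeny carries one allele from each parent at every locus) can be expressed as a simple consistency check. The following is a minimal sketch, not the analysis software used in this study. The locus names and allele sizes are hypothetical, and the optional null-allele handling assumes that an apparent homozygote may in fact carry one non-amplifying allele.

```python
from itertools import product

NULL = None  # placeholder for a non-amplifying (null) allele


def possible_genotypes(observed, allow_null):
    """Return the true genotypes that could underlie an observed allele pair."""
    a, b = observed
    options = [(a, b)]
    if allow_null and a == b:
        options.append((a, NULL))  # apparent homozygote may hide a null allele
    return options


def locus_consistent(child, parent_1, parent_2, allow_null=False):
    """True if the child could carry one allele from each parent at this locus."""
    candidates = product(
        possible_genotypes(child, allow_null),
        possible_genotypes(parent_1, allow_null),
        possible_genotypes(parent_2, allow_null),
    )
    for (a, b), p1, p2 in candidates:
        if (a in p1 and b in p2) or (a in p2 and b in p1):
            return True
    return False


def trio_mismatches(child, parent_1, parent_2, allow_null=False):
    """List the loci at which the proposed parentage is contradicted."""
    return [
        locus for locus in child
        if not locus_consistent(child[locus], parent_1[locus], parent_2[locus], allow_null)
    ]


# Hypothetical profiles (allele sizes in base pairs), for illustration only.
child = {"locus_1": (150, 158), "locus_2": (160, 160)}
parent_1 = {"locus_1": (150, 154), "locus_2": (160, 170)}
parent_2 = {"locus_1": (156, 158), "locus_2": (180, 180)}

print(trio_mismatches(child, parent_1, parent_2))
print(trio_mismatches(child, parent_1, parent_2, allow_null=True))
```

In this toy example the second locus contradicts the proposed parentage when every apparent homozygote is taken at face value, but becomes consistent once a shared null allele is permitted. An empty list indicates consistency across the loci tested; it does not by itself quantify the probability of the relationship.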

The almond cultivars ‘Aldrich’, ‘Butte’, ‘Carmel’, ‘Monterey’, ‘Norman’, ‘Price’ and ‘Thompson’ have previously been reported to be chance seedling selections probably originating from ‘Nonpareil’ × ‘Mission’ crosses (Asai et al. 1996; Brooks and Olmo 1997). This preliminary characterization was based on early cross-compatibility studies (Kester et al. 1994), which showed that most chance-selection cultivars could be grouped into four cross-incompatibility groups (S1S7, S1S8, S5S7, S5S8). These cross-incompatibility genotypes were presumed to result from natural crosses between the dominant cultivar ‘Nonpareil’ (S7S8) and the cultivar ‘Mission’ (S1S5), which was the major pollinizer for ‘Nonpareil’ during the early to mid 20th century (Asai et al. 1996; Wood 1925). However, other potential donors of the S1, S5 or S7 allele have now been identified, including ‘Languedoc’ (S1S5), ‘Ne Plus Ultra’ (S1S7), and ‘Peerless’ (S1S6) (Barckley et al. 2006; Lopez et al. 2006), all of which have been reported to be widely planted in California from the late 19th to mid 20th century (Asai et al. 1996; Wood 1925).
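The four cross-incompatibility groups follow directly from combining one S-allele from each parent. The short sketch below enumerates the expected progeny genotypes using the S-genotypes given in the text; the function name is ours, and the enumeration deliberately ignores pollen rejection, which does not arise when the parents share no S-allele.

```python
from itertools import product


def progeny_s_genotypes(parent_a, parent_b):
    """Enumerate progeny S-genotypes formed from one allele of each parent."""
    return sorted({tuple(sorted(pair)) for pair in product(parent_a, parent_b)})


NONPAREIL = ("S7", "S8")
MISSION = ("S1", "S5")
LANGUEDOC = ("S1", "S5")

print(progeny_s_genotypes(NONPAREIL, MISSION))
print(progeny_s_genotypes(NONPAREIL, MISSION) == progeny_s_genotypes(NONPAREIL, LANGUEDOC))
```

The first call yields S1S7, S1S8, S5S7 and S5S8, the four groups reported by Kester et al. (1994). The second comparison is true, which illustrates why S-allele data alone cannot separate ‘Mission’ from ‘Languedoc’ as the pollen donor and why additional markers are needed.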

The SSR markers used in this study fully support a ‘Nonpareil’ by ‘Mission’ parentage for these chance seedlings. In ‘Carmel,’ we assumed that a null allele for the MA034a locus is inherited from ‘Mission’ (Table 6, Bautista et al. 2008). There is no evidence of contributions by either ‘Ne Plus Ultra’ or ‘Peerless’ (‘Languedoc’ unavailable for analysis). In fact, no evidence of genetic contributions from ‘Ne Plus Ultra’ can be observed in any of the evaluated cultivars despite ‘Ne Plus Ultra’ being a widely planted cultivar originating from the same seedling block as the original ‘Nonpareil’ (Wood 1925).

Table 6 Null allele inherited by three ‘Mission’ progeny

The SSR data do, however, support both ‘Fritz’ and ‘Ruby’ as having the cultivar ‘Peerless’ in their lineage, since both have the unique alleles 142 at BPPCT039 and 156 at ssrPaCITA12 (Table 5) as well as the unique S6 incompatibility allele. Molecular marker data also support ‘Mission’ as the other parent (Table 5), as does the presence of the S1 incompatibility allele (Barckley et al. 2006). It is assumed that the same null allele at MA034a inherited by ‘Carmel’ is also inherited from ‘Mission’ by both ‘Fritz’ and ‘Ruby’ (Table 6).

Similarly, while the recently released cultivar ‘Kochi’ was discovered as a volunteer seedling near a ‘Drake’ almond orchard (Kochi 2004), the SSR data show that it most likely results from a ‘Peerless’ × ‘Nonpareil’ cross. ‘Kochi’ shares one allele at each locus with both ‘Nonpareil’ and ‘Peerless’, including the unique ‘Peerless’ alleles 142 at BPPCT039 and 156 at ssrPaCITA12. ‘Kochi’ also possesses the unique ‘Peerless’ S6-allele (Barckley et al. 2006).

Molecular markers can also support published parentage by analyzing only parent/progeny pairs, which will share one SSR allele at each locus. Though this analysis does not show the direction of descent (which is the parent and which the progeny), it can be used in conjunction with other information to support reported pedigrees. In this study, marker data for ‘Padre’ support earlier reports of ‘Mission’ being the seed parent. The data are also consistent with ‘Nonpareil’ being one parent of ‘Kapareil’, ‘Solano’, ‘Sonora’ and ‘Titan’ (Brooks and Olmo 1997). The seed parent of ‘Titan’ is actually known to be ‘Tardy Nonpareil’, a late-blooming somatic mutant or “bud-sport” of ‘Nonpareil’; differences between somatic mutants are only rarely observed with SSR data (Riaz et al. 2002). S-allele data are also consistent with the reported parentage of ‘Sonora’, ‘Solano’ and ‘Kapareil’. There are no S-allele data for ‘Titan’, which is used primarily as an almond parent in generating almond × peach hybrid rootstocks. Unique molecular marker patterns were also observed for the recent cultivars ‘Sweetheart’ and ‘Winters’, supporting the reported use of novel germplasm to incorporate improved productivity and pest resistance into these cultivars (Gradziel et al. 2007; Martínez-Gómez et al. 2004).
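The pairwise test described at the start of this paragraph reduces to asking whether two profiles share at least one allele at every locus. A minimal sketch follows, using hypothetical locus names and allele sizes; null alleles are not considered here, so an apparent homozygote is taken at face value.

```python
def pair_mismatches(profile_a, profile_b):
    """List loci at which two profiles share no allele (excludes a parent/progeny pair)."""
    return [
        locus for locus in profile_a
        if locus in profile_b and not set(profile_a[locus]) & set(profile_b[locus])
    ]


def candidate_relatives(query, panel):
    """Return names in `panel` that share an allele with `query` at every locus."""
    return sorted(name for name, profile in panel.items()
                  if not pair_mismatches(query, profile))


# Hypothetical profiles (allele sizes in base pairs), for illustration only.
query = {"locus_1": (150, 158), "locus_2": (160, 172)}
panel = {
    "cv_A": {"locus_1": (150, 154), "locus_2": (160, 170)},
    "cv_B": {"locus_1": (152, 156), "locus_2": (160, 172)},
}

print(pair_mismatches(query, panel["cv_B"]))
print(candidate_relatives(query, panel))
```

Because the check is symmetric, it cannot indicate which member of a pair is the parent. A pair that passes at all loci is merely not excluded, and other evidence such as release dates or breeding records is needed to assign direction.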

Conclusion

The goal of this study was to develop a “DNA fingerprinting” method to uniquely identify all almond cultivars and to use this system to confirm the identity of each almond tree in the FPS foundation blocks. Previously published loci were screened with the objective of reducing the time and cost of testing. The system developed has streamlined protocols compatible with automated high-throughput DNA fragment analysis.

The twelve recommended markers form the basis for a practical method to uniquely identify almond cultivars. The loci show Mendelian inheritance, and the profiles are consistent with known parentage and have proven informative in evaluating possible parentage for the many chance seedling selections. The limited database of profiles published here contains all important almond cultivars grown in California. Since these cultivars are readily available worldwide, they provide good reference profiles to facilitate data sharing among different labs.