Introduction

Corylopsis Siebold & Zucc. is a genus of woody ornamental shrubs in the Hamamelidaceae with a complex taxonomic history. Prior revisions have designated from seven to over 30 species, creating confusion for the nursery industry and public gardens (Bean and Aniśko 2014; Hinkley 2009). Many herbarium vouchers and species descriptions of this complex group are incomplete (Morley 1977).

Corylopsis species are widely distributed from Northeast India across China and into Korea, Japan, and Taiwan, with species in the easternmost end of this distribution having the most distinct characters within the genus (Bean and Aniśko 2014; Morley and Chao 1977; Zhang et al. 2003). Morley and Chao (1977) recognize several species native to Japan, including C. pauciflora Siebold & Zucc. (with populations in Taiwan), C. spicata Siebold & Zucc., C. glabrescens Franch. & Sav. (with populations in southern Korea), C. gotoana Makino, and C. gotoana var. coreana (Uyeki) T. Yamaz. (Morley and Chao 1977; USDA 2021). Chinese species of Corylopsis are found in montane forests from Northeast India across China (Morley and Chao 1977; Bean and Aniśko 2014) and include C. henryi Hemsl., C. multiflora Hance, C. himalayana Griff. [C. himalayana var. himalayana and C. himalayana var. griffithii (Hemsl.) B.D. Morley & J.M. Chao], and C. sinensis Hemsl. [C. sinensis var. sinensis and C. sinensis var. calvescens Rehder & E.H. Wilson]. However, the latest treatment in the Flora of China recognizes 18 Chinese species (Zhang et al. 2003).

Corylopsis represents a group of ornamental shrubs with untapped breeding potential. Few improved cultivars have been developed, with most derived from chance nursery seedlings or field collections. However, diversity within Corylopsis offers opportunities for targeted breeding. Corylopsis has few pests and diseases, and most species perform best in moist soil under dappled afternoon shade (Bean and Aniśko 2014). They vary from low, spreading vase-like forms to large upright plants (Bean and Aniśko 2014), with bloom times from late winter to early spring. Species exhibit a range of cold hardiness. Corylopsis yunnanensis Diels is reported as the least cold hardy (Li 1962; Morley and Chao 1977) while some taxa (such as C. gotoana) are reported hardy in USDA Hardiness Zone 5 (Bean and Aniśko 2014; Gapinski 2015). Flower colors vary from yellow-green in C. sinensis (and synonyms) to primrose yellow in species such as C. spicata (Bean and Aniśko 2014). Anther color varies from green to red, with C. spicata ‘Red Eye’ selected for its compact growth and bright red anthers (Hop 2011). Inflorescences vary from 2 to 5 flowers on short racemes in C. pauciflora to chains of 12–20 flowers on long racemes in C. willmottiae Rehder & E.H. Wilson (Bean and Aniśko 2014; Zhang et al. 2003). Flower fragrance is also variable across genotypes, ranging from unpleasant to a strong, sweet primrose fragrance (Bean and Aniśko 2014). Foliage color offers further interest for Corylopsis breeding, as selections with blue-green or golden leaves have been reported (Bean and Aniśko 2014). Corylopsis are also appreciated in the landscape for fall color and spring flush, providing several layers of seasonal interest (Bean and Aniśko 2014; Hop 2011; Morley 1977).

Polyploidy is rare within the Hamamelidaceae, with only one other genus, Fothergilla Murr., reported to have a ploidy series (Ranney et al. 2007; Qi et al. 2015). Some of the variable morphology observed in Corylopsis may be due to the range of ploidy documented throughout the genus, which includes diploid, tetraploid, pentaploid, and hexaploid taxa (Anderson and Sax 1935; Kim et al. 2008; Kono et al. 2008; Mehra and Khosla 1969; Oginuma 1991; Oginuma and Tobe 1991; Santamour 1965). In addition to ploidy, the position of the centromere has been examined to determine phylogenetic relationships in Corylopsis (Kono et al. 2008), and several molecular studies have provided further insight into relationships within the genus (Li et al. 1997; Li 2008; Roh et al. 2007; Yamanaka et al. 2008). To date, however, microsatellites, or simple sequence repeats (SSRs), have not been developed specifically for Corylopsis.

The objectives of this study were to (1) determine genome size and ploidy among Corylopsis taxa and (2) develop novel microsatellite markers to assess genetic diversity of living collections to assist with collection management and facilitate breeding.

Materials and methods

Plant materials

Fresh, expanded leaves were sampled from living collections at botanical gardens and arboreta in the United States (Table 1). Samples represent material that originally came from plant hunting expeditions, botanic gardens, and Index Seminum seed exchanges.

Table 1 Accession number, origin, mean 2C value, and ploidy of Corylopsis accessions used in this study

Flow cytometry

Samples of fresh leaf tissue of approximately 2 cm² were chopped with a razor blade in a polystyrene petri dish containing 500 μl of nuclei extraction buffer (CyStain PI Absolute P, Sysmex Partec GmbH, Görlitz, Germany). Included with each Corylopsis sample was chopped leaf tissue from the internal size standard Pisum sativum L. ‘Minerva Maple’ (2C = 9.56 pg; Johnston et al. 1999). The Corylopsis-Pisum nuclei suspension was passed through a 30-μm gauze filter (Sysmex Partec), and 2 ml of staining solution was added to the filtrate (CyStain PI Absolute P, Sysmex Partec). The stained nuclei suspension was processed using a flow cytometer (Cyflow Ploidy Analyzer, Sysmex Partec) to a minimum of 3000 nuclei per sample. For each accession, three samples were combined to represent a random sample of nuclei from each plant. Terminology describing genome size and ploidy levels follows Greilhuber et al. (2005): Relative 2C genome size = (DNA content of size standard) × (mean fluorescence value of sample)/(mean fluorescence value of standard). The monoploid (1Cx) value was calculated as the 2C value divided by the ploidy.
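The genome size calculation described above can be illustrated with a minimal Python sketch. The function names and the fluorescence peak values in the usage line are hypothetical and are not taken from the study; only the 9.56 pg value for the Pisum standard comes from the text.

```python
# Minimal sketch of the relative genome size formula given in the text.
# Function names and example peak values are illustrative assumptions.

PISUM_2C_PG = 9.56  # 2C DNA content of the internal standard quoted in the text


def relative_2c(sample_mean_fluor, standard_mean_fluor, standard_2c_pg=PISUM_2C_PG):
    """Relative 2C genome size (pg) = standard DNA content x sample peak / standard peak."""
    if standard_mean_fluor <= 0:
        raise ValueError("standard fluorescence must be positive")
    return standard_2c_pg * sample_mean_fluor / standard_mean_fluor


def monoploid_genome_size(two_c_pg, ploidy):
    """Monoploid value: the 2C value divided by the ploidy level."""
    if ploidy <= 0:
        raise ValueError("ploidy must be positive")
    return two_c_pg / ploidy


# Hypothetical peak means: sample at channel 100, standard at channel 500.
example_2c = relative_2c(100.0, 500.0)
example_monoploid = monoploid_genome_size(example_2c, 2)
```

With these made-up peak positions the sample would be estimated at roughly one fifth of the standard's DNA content; real values depend on the instrument gain and the measured peak means.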

Cytology

Early- and late-season cuttings were taken from current year’s growth and rooted using 1000 ppm K-IBA (Sigma-Aldrich, St. Louis, MO) to produce root tips for cytology. Based on initial flow cytometry results, four taxa were chosen to represent each genome size group, corresponding to putative ploidy levels of diploid, tetraploid, pentaploid, and hexaploid. Cuttings were maintained under mist in a polyhouse in Beltsville, MD. Samples included C. platypetala (NA66131), C. gotoana var. coreana (NA56375-02C3), C. willmottiae (NA69317), and the F1 interspecific hybrid C. gotoana ‘March Jewel’ × C. spicata (NA6-10EH). Root tips were digested and chromosomes were stained and spread following protocols of Lattier et al. (2017). Chromosomes were counted using a light microscope (Nikon Eclipse 50i; Tokyo, Japan). A minimum of 15 highly resolved cells were captured per taxon. Photomicrographs were prepared by focus stacking and blending multiple images to produce high-resolution chromosome images (Photoshop CC 2014; Adobe, San Jose, CA).

Microsatellite identification and primer design

Genomic DNA was extracted from C. himalayana (NA78641) using the methods described in the next section and was subjected to shotgun genome sequencing on a Roche 454 high-throughput sequencer at the Ohio State University Plant–Microbe Genomics Facility. The resulting 65,000 reads were then analyzed using QDD-2 software (Meglécz et al. 2010) following the approach of Takayama et al. (2011) to identify potential microsatellites, build consensus sequences, and design optimal primer pairs. All software defaults were maintained. From the more than 1000 optimal primer pairs identified by QDD, we selected 100 for testing on Corylopsis, focusing especially on loci with repeat motifs of three or more nucleotides, as these have been shown to produce better-separated peaks and less ambiguous scoring in an automated sequencing environment than dinucleotide repeat motifs. PCR primers were manufactured by Integrated DNA Technologies (Coralville, IA). The forward primers of the 17 primer pairs used for genetic diversity analysis (Table 2) carried the additional M13(−21) universal sequence (TGTAAAACGACGGCCAGT) on the 5′ end to allow indirect fluorescent labeling of PCR products using a FAM (6-carboxyfluorescein)-labeled M13 primer (Schuelke 2000).
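As an illustration of the locus selection and primer tailing steps, the following Python sketch filters candidate loci by repeat motif length and prepends the M13(−21) sequence to each forward primer. The candidate records, locus names, and primer sequences are invented for the example and do not correspond to loci in Table 2; the actual selection in the study also relied on laboratory screening that this sketch does not model.

```python
# Sketch only: candidate loci and primer sequences below are hypothetical.

M13_TAIL = "TGTAAAACGACGGCCAGT"  # M13(-21) universal sequence quoted in the text


def select_loci(candidates, min_motif_length=3):
    """Keep loci whose repeat motif has at least min_motif_length nucleotides."""
    return [c for c in candidates if len(c["motif"]) >= min_motif_length]


def add_m13_tail(forward_primer):
    """Prepend the M13 tail to the 5' end of a forward primer (written 5'->3')."""
    return M13_TAIL + forward_primer.upper()


candidates = [
    {"locus": "demo1", "motif": "AG", "forward": "ccttgacgatgctaacctga"},
    {"locus": "demo2", "motif": "AAT", "forward": "gatcgttagcctagcatcgg"},
]

selected = select_loci(candidates)
tailed = {c["locus"]: add_m13_tail(c["forward"]) for c in selected}
```

Here only the trinucleotide locus passes the filter, and its forward primer grows by the 18 nucleotides of the tail.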

Table 2 Characteristics of the 17 Corylopsis genomic microsatellite loci used in the diversity analysis of 106 taxa

DNA extraction and microsatellite amplification

DNA was extracted from fresh leaf tissue using a FastPrep homogenizer (Savant Instruments, Farmingdale, NY) and DNeasy extraction and PowerClean Cleanup kits (Qiagen GmbH, Germany). DNA samples were quantified with a spectrophotometer (NanoDrop 1000 Spectrophotometer, Thermo Fisher, Wilmington, DE). Primers were screened for optimal annealing temperature and amplification quality by gradient PCR using two Corylopsis accessions: tetraploid C. glandulifera (AA557-88*B) and diploid C. glabrescens (AA1030-90*B) (Table 1). PCR profiles consisted of denaturation at 94 °C for 5 min followed by 30 cycles of 94 °C for 30 s, optimized annealing temperature (Table 2) for 45 s, and extension at 72 °C for 45 s. Next, eight cycles were run at 94 °C for 30 s, 53 °C for 45 s, and 72 °C for 45 s, followed by a final extension at 72 °C for 10 min. For analysis, 1 µl of PCR product was combined with 10 µl of formamide (Applied Biosystems, Foster City, CA) and 0.2 µl of GeneScan 500 LIZ Size Standard (Applied Biosystems), and samples were analyzed on a 3730xl DNA Analyzer (Applied Biosystems). A minimum of 10 repeated samples (positive controls) were run across each PCR plate. Two samples of nuclease-free water (negative controls) were run with each PCR plate, and any samples that yielded null alleles were run twice. Allele scoring was performed with GeneMarker Version 2.6.3 (SoftGenetics, State College, PA).

Data analysis

Allele scores were converted to a binary matrix with “0” for absence of an allele and “1” for presence of an allele. The R ‘pvclust’ package, version 1.3-2, (Suzuki et al. 2017) was used to construct the dendrogram, with genetic distance based on the Jaccard similarity coefficient (“binary”) and clustering using the ‘unweighted pair group method with arithmetic mean’ (UPGMA) algorithm. Confidence levels for branches of the dendrogram were calculated using approximately unbiased (AU) p-values with multiscale bootstrap resampling based on 20,000 replications, as implemented in the R ‘pvclust’ package (Suzuki et al. 2017). Primer statistics (PIC, Ho, and He) were calculated according to standard algorithms (Botstein et al. 1980; Nei 1987) coded in the R language.
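The conversion of allele scores to a presence/absence matrix and the Jaccard distance between two accessions can be sketched in plain Python as follows. This is an illustrative reimplementation, not the R code used in the study; the accession labels, locus names, and allele sizes are hypothetical, and returning 0.0 for two all-zero profiles is a choice made here for convenience.

```python
# Sketch of the binary coding and Jaccard distance described in the text.
# Accession labels, locus names, and allele sizes are hypothetical.


def to_binary_matrix(allele_calls):
    """Build a 0/1 presence matrix from {accession: set of (locus, allele size)}."""
    columns = sorted(set().union(*allele_calls.values()))
    matrix = {
        accession: [1 if column in calls else 0 for column in columns]
        for accession, calls in allele_calls.items()
    }
    return columns, matrix


def jaccard_distance(row_a, row_b):
    """1 - (alleles shared by both) / (alleles present in at least one)."""
    shared = sum(1 for a, b in zip(row_a, row_b) if a and b)
    either = sum(1 for a, b in zip(row_a, row_b) if a or b)
    if either == 0:
        return 0.0  # convention chosen for this sketch
    return 1.0 - shared / either


calls = {
    "acc1": {("locusA", 100), ("locusA", 104)},
    "acc2": {("locusA", 104), ("locusB", 150)},
}
columns, matrix = to_binary_matrix(calls)
distance = jaccard_distance(matrix["acc1"], matrix["acc2"])
```

A full pairwise matrix of such distances is what an average-linkage (UPGMA) routine would then cluster. Because joint absences are ignored, accessions of different ploidy, which can carry different numbers of alleles per locus, are compared only on the alleles that at least one of them possesses.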

Results

Cytology

Root tip cytology of a single sample of C. platypetala (NA66131) confirmed that it was diploid with 24 chromosomes (Fig. 1A), consistent with its 2C genome size of 1.9 pg (Table 1). Similarly, root tip cytology of a sample of C. gotoana var. coreana (NA56375-02C3) confirmed that it was tetraploid with 48 chromosomes (Fig. 1B), consistent with its 2C genome size of 3.5 pg. An interploid hybrid (NA6-10EH) from the U.S. National Arboretum (USNA) breeding program was included in the study, representing a hybrid between the tetraploid C. gotoana ‘March Jewel’ and hexaploid C. spicata. Root tip chromosome counts confirmed that this selection is indeed pentaploid (2n = 5x = 60), as predicted based on the interploid cross (Fig. 1C). Root tip chromosome counts also confirmed a hexaploid sample, C. willmottiae (NA69317), with 2n = 6x = 72 (Fig. 1D).

Fig. 1

Photomicrographs of Corylopsis metaphase chromosomes viewed at 1000× magnification. Scale bar at 0.01 µm. A C. platypetala (NA66131) 2n = 2x = 24; B C. gotoana var. coreana (NA56375-02C3) 2n = 4x = 48; C F1 hybrid NA6-10EH (C. gotoana ‘March Jewel’ × C. spicata) 2n = 5x = 60; D C. willmottiae (NA69317) 2n = 6x = 72

Flow cytometry

Flow cytometry revealed relative 2C genome sizes ranging from 1.9 to 5.7 pg (Table 1). Based on confirmed chromosome counts of taxa representing the entire ploidy series (2x, 4x, 5x, 6x; Fig. 1), we could confidently assign ploidy to all accessions tested based on the genome size determined by flow cytometry (Table 1). While ploidy differences between species were expected based on previous reports, we found that several species had varying ploidy among accessions, including C. glabrescens (2x, 4x, 5x, 6x), C. glandulifera (2x, 4x, 6x), C. gotoana (2x, 4x, 5x), C. gotoana var. coreana (4x, 6x), C. platypetala (2x, 4x), C. sinensis (2x, 5x, 6x), and C. willmottiae (2x, 4x, 6x).
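One simple way to picture how a measured 2C value maps onto a ploidy level is sketched below in Python. It assumes a constant monoploid genome size taken from the cytologically confirmed diploid (1.9 pg / 2) and picks the closest level in the observed series; this is an illustrative simplification and not necessarily the procedure used in the study, where assignments were calibrated against chromosome counts for each ploidy level.

```python
# Illustrative nearest-level assignment; the constant monoploid size is an assumption.

ASSUMED_MONOPLOID_PG = 1.9 / 2  # from the confirmed diploid (2C = 1.9 pg)
PLOIDY_LEVELS = (2, 4, 5, 6)    # ploidy series reported in the text


def assign_ploidy(two_c_pg, monoploid_pg=ASSUMED_MONOPLOID_PG, levels=PLOIDY_LEVELS):
    """Return the ploidy level whose expected 2C value is closest to the measurement."""
    return min(levels, key=lambda ploidy: abs(two_c_pg - ploidy * monoploid_pg))


# 2C values quoted in the text: confirmed diploid, confirmed tetraploid, upper end of range.
assignments = {value: assign_ploidy(value) for value in (1.9, 3.5, 5.7)}
```

The confirmed tetraploid at 3.5 pg has a monoploid value of about 0.88 pg rather than 0.95 pg, which shows that monoploid genome size is not strictly constant across taxa; fixed expectations of this kind should therefore be checked against chromosome counts.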

Microsatellite analysis

After we screened microsatellite primers developed for other genera in the Hamamelidaceae (Hatmaker et al. 2015) and found inconsistent results in Corylopsis, we developed primers specifically for Corylopsis. Of the original 100 primer pairs tested, we chose 17 that produced consistent, polymorphic, easily scorable amplification products across all species tested. These loci amplified an average of 7.3 alleles (range 3–17) across all species tested (Table 2). These 17 microsatellite primer pairs generated 120 scorable fragments ranging in size from 89 to 226 bp across the 106 Corylopsis accessions tested. Polymorphic Information Content (PIC) ranged from 0.1969 for P54 to 0.8530 for P2; observed heterozygosity (Ho) ranged from 0.0660 for P54 to 0.9340 for P62; and expected heterozygosity (He) ranged from 0.217 for P1 to 0.866 for P2 (Table 2).

UPGMA cluster analysis revealed high levels of genetic diversity among accessions, even among some accessions collected from the same area (Fig. 2). Clustering of accessions roughly followed species designations, although exceptions were common and likely represent misidentification or possibly undetected hybridization. While some clusters were supported by high bootstrap values (> 80%, Fig. 2), some of the lower-order clusters were not robust.

Fig. 2

UPGMA dendrogram showing clustering of 106 Corylopsis accessions based on Jaccard’s distance. Bootstrap values greater than 80% are shown above major nodes. Major clusters are marked with letters A–J and are described in the text. Taxa in this figure are identified either by their species names or, for ease of reading, by their variety name if applicable. For example, C. gotoana var. coreana is designated in Fig. 2 as “coreana”, followed by the unique accession information from Table 1. Accessions in bold type were wild-collected (see Table 1)

Discussion

Ploidy

Our finding of ploidy series within and between species of Corylopsis supports prior studies, but also reveals previously unreported ploidy levels in some species. In C. glabrescens, we found four diploid samples, including the cultivar ‘Longwood Chimes’ (Table 1), consistent with previous reports of diploids in this species by Kono et al. (2008), who investigated somatic chromosomes in meristematic cells of young leaves. Two samples of C. glabrescens (MS1987-116*B and MS1998-185*A) were identified as tetraploids, consistent with a previous report by Santamour (1965) based on cytological studies. Our study also revealed pentaploid and hexaploid C. glabrescens, representing the first report of pentaploids and hexaploids in this species.

Two samples of C. gotoana were diploid with a genome size of 1.9 pg, representing the first report of a diploid. One sample was pentaploid (JC060155), representing the first report of a pentaploid. Most samples were tetraploids. This is consistent with reports of tetraploids in C. gotoana found in Japan and Korea (Kim et al. 2008; Kono et al. 2008). Likewise, most samples of C. gotoana var. coreana were also tetraploid, except for one hexaploid accession (NA56375-005). This is the first reported assessment of ploidy in C. gotoana var. coreana.

One sample of C. glandulifera (AA898-81*A) was found to be diploid with a genome size of 1.9 pg, while all other samples were hexaploid (Table 1). This is the first reported assessment of ploidy in C. glandulifera. The single sample of C. himalayana included in our study was diploid with a genome size of 2.1 pg, consistent with previous cytological studies (Mehra and Khosla 1969). The two samples of C. multiflora were both hexaploid, representing the first assessment of ploidy for this species.

All samples of C. pauciflora were diploids, ranging from 1.9 to 2.1 pg, confirming previous reports based on cytology (Anderson and Sax 1935; Oginuma and Tobe 1991; Santamour 1965). Corylopsis × ‘Winterthur’ was also diploid, which does not support a previous report on its origin as an interploid hybrid of C. spicata (6x) and C. pauciflora (2x) (Bean and Aniśko 2014).

We found one diploid C. platypetala, confirming a report by Santamour (1965) based on cytology. However, two samples of C. platypetala (LWL-2769*A and LW1957-0994*B) were tetraploids, which represents the first evidence of polyploidy within this species.

Two samples of C. sinensis were found to be diploid including one cultivar (‘Spring Purple’). These results confirm a previous report of diploid individuals of C. sinensis (Santamour 1965). However, except for one pentaploid accession (JC150435-17), and two tetraploids in C. sinensis var. calvescens, all the other C. sinensis accessions we tested were hexaploid (Table 1). This represents the first report of polyploids in C. sinensis.

Two samples of C. willmottiae (LW2016-2002*A and NA85065) were diploid with a genome size of 2.0 pg, conflicting with a previous report of tetraploid C. willmottiae based on cytology (Santamour 1965). This represents the first report of diploid individuals in C. willmottiae. One sample of C. willmottiae (NA67192-H) was found to be tetraploid, and four samples were hexaploid (Table 1).

All samples of C. veitchiana and C. spicata that we tested were hexaploid. Similar reports have come from studies of the collections of the Arnold Arboretum (Anderson and Sax 1935; Oginuma 1991) and Morris Arboretum (Santamour 1965). Hexaploid C. spicata have also been confirmed in cultivated material from Kōchi, Shikoku, Japan (Oginuma 1991).

Although several studies exist for chromosome counts in Corylopsis, only one other study has reported on genome size. Using propidium iodide-based flow cytometry, Garcia et al. (2010) reported a genome size for C. coreana of 2C = 3.93 pg (using Pisum L. as an internal standard) and for C. sinensis of 2C = 4.62 pg (using Petunia Juss. as an internal standard). Although genome size and ploidy level were not calibrated using cytology (chromosome counts), it appears that these 2C genome sizes are larger than those calculated in our study. In prior research on other ornamental taxa, a combination of genome size estimates (from flow cytometry) and chromosome counts has revealed substantial species-level variation in 1Cx values (Gillooly and Ranney 2015; Parris et al. 2010; Ranney et al. 2018; Shearer and Ranney 2013). Regardless of the absolute genome size, it is evident from our study that the range of ploidy levels in Corylopsis is greater than previously reported. The ploidy series within species is especially noteworthy as it presents heretofore untapped opportunities for interploid breeding.

Genetic diversity

The primer statistics (PIC, Ho, and He) can be useful in determining the genetic diversity in a population at a given locus. Because we looked at individual accessions, and not plants from one or more populations in Hardy–Weinberg equilibrium, the primer statistics have little meaning from a population genetics standpoint. However, these properties of the markers may be useful to future population studies in Corylopsis, especially for examining diversity or population structure within a species.
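For reference, the three locus statistics can be written out from their commonly used definitions, as in the Python sketch below. It uses the uncorrected form of expected heterozygosity (one minus the sum of squared allele frequencies) and the polymorphic information content formula of Botstein et al. (1980); whether the study applied a sample-size correction, and how allele frequencies were derived for polyploid accessions, is not stated in the text, so the genotypes here are hypothetical diploid calls.

```python
# Sketch of per-locus statistics from standard definitions; genotypes are hypothetical.
from collections import Counter


def allele_frequencies(genotypes):
    """Frequency of each allele, counting every allele copy in every genotype."""
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = sum(counts.values())
    return [count / total for _, count in sorted(counts.items())]


def expected_heterozygosity(freqs):
    """He = 1 - sum(p_i^2), without sample-size correction."""
    return 1.0 - sum(p * p for p in freqs)


def observed_heterozygosity(genotypes):
    """Ho = proportion of individuals carrying more than one distinct allele."""
    heterozygous = sum(1 for genotype in genotypes if len(set(genotype)) > 1)
    return heterozygous / len(genotypes)


def pic(freqs):
    """PIC = 1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2."""
    n = len(freqs)
    cross = sum(
        2.0 * freqs[i] ** 2 * freqs[j] ** 2
        for i in range(n)
        for j in range(i + 1, n)
    )
    return 1.0 - sum(p * p for p in freqs) - cross


genotypes = [(100, 104), (100, 100), (104, 108), (100, 100)]
freqs = allele_frequencies(genotypes)
locus_stats = {
    "He": expected_heterozygosity(freqs),
    "Ho": observed_heterozygosity(genotypes),
    "PIC": pic(freqs),
}
```

PIC is never larger than He for the same frequencies, because it subtracts an additional term for matings that are uninformative even though both parents are heterozygous.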

The SSR-based dendrogram (Fig. 2) provides insight into relationships between species and highlights possible misidentification of accessions, both of which are valuable for collections management and breeding. While some accessions of a species grouped together as expected, many grouped in different clusters, particularly those with different ploidy. It is clear that ploidy played at least as large a role in clustering as species and geographic origin (Fig. 3). While it is possible that some accessions were mislabeled, which would lead to inconsistent clustering, it is also possible that some of the seedling material or accessions from nurseries or gardens may actually be interspecific hybrids resulting from outcrossing with other cultivated accessions. Wild-collected accessions (highlighted in bold in Table 1 and Fig. 2) are often deemed the most “authentic” representatives of the species. However, based on our results, it is clear that there is substantial genetic and ploidy variability within species that cannot be explained by outcrossing in a garden setting. The SSR markers developed in this study may be useful for confirming some cases of potential interspecific hybrids in Corylopsis; otherwise, it may be necessary to use SNPs or direct sequencing to determine outcrossing and hybridization in the genus.

Fig. 3

Location of collecting sites for some of the Corylopsis accessions included in this study (see Table 1)

Among the accessions we studied, the most distantly related taxa based on the UPGMA algorithm occur in Cluster A (Fig. 2), which contains the only sample of C. himalayana, as well as two diploid accessions and one tetraploid accession of C. willmottiae. Additional accessions of hexaploid C. willmottiae grouped in Clusters I and J. The clustering of these species does not appear to be closely related to geographical origin of these accessions, which all originated in the westernmost areas of collection (Fig. 3). The remaining accessions are grouped in Cluster B as a well-supported and distinct group that is resolved into two additional clusters, C and D, both with bootstrap values of 98%. Cluster C contains diploid accessions of C. gotoana, C. glabrescens, C. pauciflora, and C. glandulifera. Interestingly, accessions of tetraploid C. gotoana and hexaploid C. glabrescens did not cluster with the diploid accessions. This could be due to additional allelic diversity in the higher ploidy taxa, or possibly interspecific hybridization, or an allopolyploid origin. Previous karyomorphological studies of the centromere location (Kono et al. 2008) indicate possible divergence between Japanese diploids C. pauciflora and C. glabrescens, although our study did not indicate this.

Cluster D is further resolved into Clusters E and F. Cluster E contains only tetraploid C. gotoana var. coreana with high bootstrap support (96%). Previous work by Roh et al. (2007) using nrRNA-ITS sequence showed a tighter relationship among C. gotoana var. coreana, C. glabrescens, C. gotoana, and C. pauciflora than we saw in our study. This is likely due to the different accessions and different types of markers used. Cluster F contains the remaining accessions, further divided into Clusters G and H. Cluster G also has high bootstrap support and contains primarily tetraploid accessions of C. glandulifera, C. sinensis var. calvescens, C. platypetala, and C. gotoana. Previous molecular studies placed C. sinensis var. calvescens in a group by itself (Roh et al. 2007) and grouped C. gotoana and C. spicata more closely than our data indicate (Yamanaka et al. 2008). Also in contrast to our study, Kono et al. (2008) found similar karyotypes in tetraploid C. gotoana and hexaploid C. spicata, implying possible shared ancestry.

Cluster H is well-resolved (100%), and contains the final clusters, I and J. These clusters contain most of the hexaploid taxa, and are not as well-supported by bootstrap values, at least at the higher clustering levels. All accessions of C. spicata grouped in Cluster F, and most accessions of C. sinensis, hexaploid C. glabrescens, and hexaploid C. veitchiana grouped in Cluster E. A study performed on Hamamelidaceae by Li (2008) using rDNA ITS sequence data grouped C. sinensis (and subspecies) and C. spicata together. While our study placed these two species in distinct clusters, the difference likely reflects the higher order taxa that were used in Li’s study, rather than contrasting results.

We used UPGMA to analyze and visualize the data instead of other methods such as STRUCTURE or Principal Coordinate Analysis (PCoA) because we were looking at individual plants (cultivars or unrelated accessions) and not at populations under selection. The microsatellite markers can be considered random or neutral in terms of selection or fitness, so a phenetic approach (UPGMA) is appropriate. Our primary objective was to examine relationships among accessions from a collections management perspective, rather than to ascertain evolutionary or phylogenetic histories. Indeed, upon visualizing the data in the UPGMA phenogram, we immediately found several accessions that were mislabeled in our collection or that led us to seek additional information on seed source or provenance. It is likely that different clustering algorithms, different plant materials, and different markers would result in different clusters of many accessions and taxa. For example, the study by Roh et al. (2007) based on RAPD markers and ITS sequence data yielded dendrograms that differed from each other and from the dendrogram presented here. We also performed PCoA on the data using the Jaccard distance coefficient, but found that this method was not useful for explaining variation; it took eleven dimensions to account for 50% of the explainable variation (data not shown). We are therefore confident that the methods we used to analyze the data are appropriate for our markers, plant material, and objective of assessing genetic diversity of living collections to assist with collection management and facilitate breeding.

Conclusion

This study determined ploidy and molecular genetic relationships among 106 Corylopsis accessions representing 14 taxa from botanic garden collections in the U.S. In addition to providing valuable ploidy data that were previously missing for many of these accessions, this study also developed a set of robust Corylopsis-specific SSR markers that will be useful across the genus for making decisions about species identity, collection management, or breeding targets.