Plant Molecular Biology Reporter, Volume 32, Issue 3, pp 750–760

Transcriptome versus Genomic Microsatellite Markers: Highly Informative Multiplexes for Genotyping Abies alba Mill. and Congeneric Species

Authors

  • Dragos Postolache
    • Scuola Superiore Sant’Anna
    • Plant Genetics Institute, National Research Council (CNR)
  • Cristina Leonarduzzi
    • Department of Biosciences, University of Parma
  • Andrea Piotti
    • Plant Genetics Institute, National Research Council (CNR)
    • Department of Biosciences, University of Parma
  • Ilaria Spanu
    • Plant Genetics Institute, National Research Council (CNR)
  • Anne Roig
    • INRA, UR629 Ecologie des Forêts Méditerranéennes (URFM)
  • Bruno Fady
    • INRA, UR629 Ecologie des Forêts Méditerranéennes (URFM)
  • Anna Roschanski
    • Faculty of Biology, Conservation Biology, University of Marburg
  • Sascha Liepelt
    • Faculty of Biology, Conservation Biology, University of Marburg
    • Plant Genetics Institute, National Research Council (CNR)
Original Paper

DOI: 10.1007/s11105-013-0688-7

Cite this article as:
Postolache, D., Leonarduzzi, C., Piotti, A. et al. Plant Mol Biol Rep (2014) 32: 750. doi:10.1007/s11105-013-0688-7

Abstract

The availability of high-resolution, cost-effective polymorphic genetic markers displaying Mendelian inheritance is a prerequisite for fine-scale population genetic analyses as well as informed conservation and sustainable management. Silver fir (Abies alba Mill.) is a widespread European species of economic and ecological importance for which genetic markers are needed but difficult to develop, as in most conifer species. In this work, we introduce two sets of new multiplexed transcriptome-derived expressed sequence tag microsatellites (EST-simple sequence repeats (SSRs)) which we compare to a set of multiplexed genomic microsatellites (gSSRs). For both marker types, transferability was tested in 17 congeneric taxa. A total of 16 new EST-SSRs and two new gSSRs were developed. The EST-SSR multiplexes produced easily scorable amplification patterns that allow rapid and cost-effective genotyping at low error rates, and include loci that display very low null allele frequencies. Generally, EST-SSRs displayed lower polymorphism and a lower frequency of null alleles, but higher genetic differentiation among populations, than gSSRs. Preliminary tests revealed that the EST-SSR markers are highly transferable and polymorphic across Abies species. This study also confirmed that SSRs can be successfully developed using next-generation sequencing technology even in large-genome species such as conifers.

Keywords

EST-SSRs, Genomic SSRs, Multiplex PCR, Diversity, Silver fir

Introduction

Silver fir (Abies alba Mill.) is a widespread European conifer. It is a keystone species of many mountain forest ecosystems with high ecological and economic value and is also found at low density associated with other widespread European species such as beech (Fagus sylvatica L.) and spruce (Picea abies Karst.) (Wolf 2003). Despite being tolerant of a relatively broad range of environmental conditions (e.g., it is cold-hardy and shade-tolerant), silver fir is more sensitive than other conifers to changes in temperature, water availability, and air pollution. In particular, showing lower water-use efficiency compared to other fir species from more xeric areas, it is expected to be severely affected by drought under a changing climate (Guehl et al. 1991; Aussenac 2002; Macias et al. 2006; Linares and Camarero 2012). In the last 200 years, its range has significantly decreased due to both environmental changes and human impact through deforestation, overexploitation, silvicultural choices in favor of faster growing conifers, improper management, and air pollution (Wolf 2003). In other parts of its range, particularly at the upper tree limit, its distribution has increased over the same period due to land use changes, i.e., natural recolonization of abandoned agricultural and low productivity lands (Chauchard et al. 2007, 2010). Peripheral A. alba populations, in particular at the southern edge of the distribution, are expected to be the most affected by climate change due to their small population size, fragmented distribution, and bio-geographical position (Piovani et al. 2010; Maiorano et al. 2013).

Its relevance for European forest ecosystems has made A. alba the object of several genetic surveys using terpenes, isozymes, mitochondrial DNA markers, chloroplast, and nuclear microsatellites (simple sequence repeats (SSRs)) (e.g. Konnert and Bergmann 1995; Vendramin et al. 1999; Liepelt et al. 2002; Sagnard et al. 2002; Liepelt et al. 2009; Piovani et al. 2010; Gömöry et al. 2012), providing important information on the species’ postglacial recolonization history and on the spatial distribution of its genetic diversity at different scales. There is now an urgent need for conservation-oriented population genetic studies with increased resolution to assess current dynamics and future evolutionary trajectories of A. alba populations strongly exposed to environmental change and to evaluate current conservation strategies in Europe (Lefèvre et al. 2013). The set of currently available SSRs for A. alba is not suitable for this task due to their limited number and the presence of null alleles (Cremer et al. 2012; Gömöry et al. 2012). For this reason, we developed and characterized in natural populations new SSRs using both transcriptomic and genomic resources.

Transcriptome sequencing using next-generation sequencing is an effective tool for generating genomic resources and identifying polymorphic molecular markers for non-model organisms, particularly for species characterized by large and repetitive genomes such as conifers (Parchman et al. 2010; Roschanski et al. 2013). Developing SSRs from transcriptome sequences (expressed sequence tag (EST)-SSRs hereafter) is labor- and cost-effective compared to the conventional procedure for the identification of genomic SSRs (i.e., screening of cloned libraries by Sanger sequencing; Schoebel et al. 2013). Moreover, EST-SSRs are easily transferable to other species due to the higher level of sequence conservation of transcribed DNA across species (Varshney et al. 2005; Zalapa et al. 2012; Fan et al. 2013). They are expected to be less polymorphic than genomic SSRs (gSSRs hereafter), but also less prone to null alleles, making them ideal for genetic studies in which genotyping errors should be strictly avoided, e.g., fine-scale population genetic studies and parentage studies (Kim et al. 2008; Oddou-Muratorio et al. 2009).

In this work, we introduce two sets of new multiplexed EST-SSRs for high-resolution and cost-effective genetic analyses in A. alba and several congeneric taxa. To do this, we took advantage of available transcriptome data (Roschanski et al. 2013). We also developed two new gSSRs that we multiplexed with previously available gSSRs displaying high amplification quality (Hansen et al. 2005; Cremer et al. 2006). We describe the procedure to identify and optimize the new EST-SSRs and to design multiplex sets, paying particular attention to quality controls (e.g., null-allele detection). We compared the performance of EST-SSRs and gSSRs, and we tested the transferability of EST-SSRs to 17 congeneric taxa from the Mediterranean, Asia, and North America.

Materials and Methods

Plant Material

Plant material was collected from four populations: Northern (Abetone, 44°8′28″N, 10°40′3″E, coded as ABE) and Southern (Sila, 39°7′57″N, 16°38′19″E, SIL) Apennines (Italy), Bulgaria (Bansko, Pirin Mountains, 41°50′35″N, 23°23′7″E, BL) and Romania (Arges, Fagaras Mountains, 45°26′28″N, 24°41′40″E, RH) (Fig. 1). Forty-eight adult trees (pair-wise minimum distance between trees >20 m) were sampled in each population (total N = 192 individuals). Fresh needles were dried in silica gel and then stored at −80 °C until DNA extraction. Abies spp. samples used for transferability tests were collected either in situ, in provenance trials, or in the arboretum of the botanical garden at the University of Marburg (Germany).
Fig. 1

Map of the four sampled populations (see text for details) and results of clustering analysis using Structure (K = 3). For each population, upper bar plots refer to EST-SSRs and lower bar plots to gSSRs. Population code names: ABE (Abetone, Northern Italy), SIL (Sila, Southern Italy), RH (Arges, Romania), and BL (Bansko, Bulgaria). Green-colored areas indicate the distribution range of Abies alba (Wolf 2003)

DNA Isolation

DNA extraction was performed from 50 mg of frozen needles with the DNeasy 96 Plant Kit (QIAGEN) following the manufacturer’s instructions. For disruption of plant material we added a 3-mm diameter tungsten bead to each well of the 96-well plates. Plates were frozen in liquid nitrogen for 30 s before 2 cycles of 1-min disruption at 25 Hz using a Mixer Mill MM300 (Retsch, Germany). DNA quality was estimated on a 1 % agarose gel stained with GelRed (Biotium, USA). DNA concentration was measured using a spectrophotometer NanoDrop ND-1000 (Thermo Scientific, Wilmington, USA).

Multiplex PCR Optimization

SSRs from the transcriptome (EST-SSRs)

EST-SSR discovery was carried out based on the analysis of assembled contigs from a transcriptome of A. alba (accession numbers: JV134525-JV157085; Roschanski et al. 2013) using the Sputnik software (http://espressosoftware.com/sputnik/index.html). Sputnik finds perfect, compound, and imperfect repeats using a recursive algorithm (Duran et al. 2009). The minimum number of repeats was set to 6 for di-SSRs and 5 for tri-, tetra- and penta-SSRs.
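
The repeat thresholds above can be illustrated with a short Python sketch. It is not Sputnik's recursive algorithm: it relies on a regular expression and reports perfect repeats only, so compound and imperfect repeats are missed. The function names and the example sequence are hypothetical.

```python
import re

# Minimum repeat counts from the text: 6 for di-, 5 for tri-, tetra- and penta-SSRs.
MIN_REPEATS = {2: 6, 3: 5, 4: 5, 5: 5}

def is_primitive(motif):
    """False if the motif is itself a repeat of a shorter unit (e.g. AA, ATAT)."""
    n = len(motif)
    return not any(n % d == 0 and motif == motif[:d] * (n // d) for d in range(1, n))

def find_perfect_ssrs(seq):
    """Return sorted (start, motif, n_repeats) tuples for perfect SSRs (0-based start)."""
    seq = seq.upper()
    hits = []
    for size, min_rep in MIN_REPEATS.items():
        # One motif of `size` bases followed by at least (min_rep - 1) identical copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (size, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            if is_primitive(motif):
                hits.append((match.start(), motif, len(match.group(0)) // size))
    return sorted(hits)

# Hypothetical contig fragment with a (CAG)7 and an (AT)9 repeat.
example = "GGC" + "CAG" * 7 + "TTGACC" + "AT" * 9 + "GCTA"
print(find_perfect_ssrs(example))
```

The primitive-motif filter keeps an (AT)n run from being reported a second time as an (ATAT)n tetranucleotide repeat.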

Sequence output from Sputnik was subsequently analyzed using WebSat (http://www.wsmartins.net/websat/) to identify candidates with appropriate flanking regions suitable for primer design. Primers were designed using Primer3 (Rozen and Skaletsky 2000) applying the following parameters: product size between 100 and 500 bp, annealing temperature (Ta) between 57 and 62 °C, optimum GC content = 50 %, maximum self-complementarity = 4.00, maximum 3′ self-complementarity = 2.00, and maximum Poly-X = 4.
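
Two of the design criteria listed above, GC content and the longest mononucleotide run (Poly-X), are easy to check by hand. The sketch below is not Primer3 and does not evaluate melting temperature or self-complementarity; the function names are hypothetical, and the primer is the Aat01 forward primer from Table 1.

```python
def gc_content(primer):
    """Percentage of G and C bases in a primer."""
    primer = primer.upper()
    return 100.0 * sum(base in "GC" for base in primer) / len(primer)

def max_homopolymer_run(primer):
    """Length of the longest run of one repeated base (the Poly-X value)."""
    longest = run = 1
    for previous, current in zip(primer, primer[1:]):
        run = run + 1 if current == previous else 1
        longest = max(longest, run)
    return longest

forward = "CCATGTCTCCGATTTCCAGT"  # Aat01 forward primer (Table 1)
print(gc_content(forward))           # GC percentage
print(max_homopolymer_run(forward))  # must not exceed the Poly-X limit of 4
```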

A total of 67 EST-SSR primer pairs were designed and tested on a set of eight samples (four samples from ABE and four samples from BL) by PCR amplification. The PCR thermal profile was: denaturation at 94 °C for 4 min, followed by 10 cycles at 94 °C for 30 s, 63 °C for 30 s (decreasing 1 °C/cycle), and 72 °C for 30 s, followed by 27 cycles at 94 °C for 30 s, 53 °C for 30 s, and 72 °C for 40 s, with a final 10-min extension step at 72 °C. PCR products were quality-checked on 2 % agarose gels stained with GelRed (Biotium, USA).
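
The touchdown profile described above can be written out cycle by cycle. The sketch assumes that the first touchdown cycle anneals at 63 °C and that each of the following nine cycles is 1 °C lower; the function name is hypothetical.

```python
def touchdown_annealing(start_temp, step, touchdown_cycles, final_temp, final_cycles):
    """Return the annealing temperature (°C) used in each PCR cycle, in order."""
    touchdown = [start_temp - step * i for i in range(touchdown_cycles)]
    return touchdown + [final_temp] * final_cycles

# 10 touchdown cycles from 63 °C, then 27 cycles at 53 °C.
schedule = touchdown_annealing(63, 1, 10, 53, 27)
print(len(schedule), schedule[:10], schedule[10])
```

Under this reading the last touchdown cycle anneals at 54 °C, one degree above the 53 °C used for the remaining 27 cycles.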

The quality and polymorphism of all 67 EST-SSRs were first checked using the M13-tail labeling technique (Schuelke 2000). PCR products were analyzed on an ABI 3500 automatic sequencer (Applied Biosystems, USA) using LIZ-500 as internal size standard. To evaluate EST-SSR polymorphism, 48 samples (24 samples from ABE and 24 samples from BL) were genotyped with 24 EST-SSRs that exhibited high-quality amplification and clear microsatellite peaks at the expected size. Excluding monomorphic loci, 16 EST-SSRs were finally selected and subsequently multiplexed by taking into account size ranges (Table 1). Mendelian segregation and the possible presence of null alleles were tested by progeny tests (Gillet and Hattemer 1989; Tarazi et al. 2010) on six open-pollinated progenies. For each progeny, 6 to 28 offspring were genotyped using simplex PCR and the M13-tail labeling technique, and their multi-locus genotype was compared to that of their seed parent.
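
The mother–offspring comparison can be sketched as a simple compatibility check: at a locus free of null alleles and scoring errors, every offspring must carry at least one allele of its seed parent. This is a simplification of the progeny tests cited above (it does not test segregation ratios), and the allele sizes and function names are hypothetical.

```python
def is_compatible(mother, offspring):
    """True if the offspring genotype shares at least one allele with the seed parent."""
    return bool(set(mother) & set(offspring))

def mismatch_rate(mother, progeny):
    """Return (share of incompatible offspring, list of incompatible genotypes)."""
    mismatches = [genotype for genotype in progeny if not is_compatible(mother, genotype)]
    return len(mismatches) / len(progeny), mismatches

# Hypothetical allele sizes (bp) at one locus.
mother = (109, 115)
progeny = [(109, 112), (115, 115), (112, 118), (109, 115)]
print(mismatch_rate(mother, progeny))
```

A mismatch between an apparently homozygous mother and an apparently homozygous offspring carrying a different allele is the typical signature of a null allele.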
Table 1

Characteristics of multiplexes A, B and C based on 192 A. alba individuals from the four populations sampled

| Type | Multiplex | Locus | Reference | Primer sequences (5′ → 3′) | Motif | Dye | Size (bp) | [C] (μM) | A | HO | HE | FIS | Accession No. |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| EST-SSR | A | Aat01 | This study | F: CCATGTCTCCGATTTCCAGT; R: GGCCTAACGAAAGCAGAATC | (GCG)10 | FAM | 103–127 | 0.20 | 7 | 0.581 | 0.517 | −0.125 | KF304594 |
| EST-SSR | A | Aat02 | This study | F: AGAAGATTTCCCGGCTTTTC; R: ATCCAGACAGCGAACTTTGG | (CAG)7 | VIC | 123–129 | 0.06 | 3 | 0.312 | 0.331 | 0.057 | KF304595 |
| EST-SSR | A | Aat03 | This study | F: TCCCCATGGTTTGGTTAAAA; R: CGAAGAAAATGTTGCGGAAT | (AT)9 | PET | 149–161 | 0.10 | 6 | 0.476 | 0.515 | 0.075 | KF304596 |
| EST-SSR | A | Aat04 | This study | F: CCATGTATGGTGCTCCTCCT; R: CCTTCATTGCAGAAAAGCAA | (CAG)11 | FAM | 158–191 | 0.27 | 9 | 0.423 | 0.404 | −0.046 | KF304597 |
| EST-SSR | A | Aat05 | This study | F: AGCATCCACATTCCGTAACC; R: AGTTGACCGTTGGAGAGCAG | (GCA)7 | VIC | 177–192 | 0.06 | 3 | 0.280 | 0.247 | −0.135 | KF304598 |
| EST-SSR | A | Aat06 | This study | F: TTATGCGGAGCAGTTCTGTG; R: TGTTGCTGGCGTACTGGTAG | (GCA)8 | NED | 196–214 | 0.20 | 5 | 0.115 | 0.113 | −0.019 | KF304599 |
| EST-SSR | A | Aat07 | This study | F: GCTAGCAGAACCCTGGAATG; R: GGTGGGATATTTCCAGCAAG | (AT)11 | PET | 219–241 | 0.10 | 10 | 0.556 | 0.656 | 0.154 | KF304600 |
| EST-SSR | A | Aat08 | This study | F: ACTCCATCACGGTGGTCTTC; R: GCCATTCAGGCTCTCAGTTC | (AT)9 | NED | 302–312 | 0.08 | 3 | 0.171 | 0.163 | −0.048 | KF304601 |
| EST-SSR | B | Aat09 | This study | F: CAGATCCTCCCACATCCAAC; R: TGACACCACAGGAAACCATC | (TCA)8 | NED | 150–156 | 0.05 | 3 | 0.032 | 0.031 | −0.016 | KF304602 |
| EST-SSR | B | Aat10 | This study | F: GAGCACGATGAAGAGGAAGC; R: AAAACCCCCACGCGGTAT | (AT)12 | FAM | 226–250 | 0.25 | 13 | 0.625 | 0.656 | 0.047 | KF304603 |
| EST-SSR | B | Aat11 | This study | F: AGCGTTGATTGGAAGCAGTC; R: GAAGCATGGTGTCGTTGTTG | (AAC)9 | VIC | 255–270 | 0.08 | 5 | 0.561 | 0.535 | −0.048 | KF304604 |
| EST-SSR | B | Aat12 | This study | F: ATCCATATCTCCTGCCTTGC; R: CTTTCCAGGTGATCTGATTGC | (AG)12 | PET | 303–349 | 0.21 | 19 | 0.610 | 0.600 | −0.016 | KF304605 |
| EST-SSR | B | Aat13 | This study | F: ACTCAAAGCCAAGCTGGAGA; R: TGCATAAGACAGCCGAGTCA | (AG)8 | FAM | 326–342 | 0.30 | 4 | 0.163 | 0.180 | 0.093 | KF304606 |
| EST-SSR | B | Aat14 | This study | F: GACTGGGGATCCTGCTGTTA; R: AGAGGAGGCAGCCCATACAT | (TA)9 | VIC | 358–394 | 0.13 | 16 | 0.734 | 0.749 | 0.020 | KF304607 |
| EST-SSR | B | Aat15 | This study | F: AGGAGGAGGTTCAGCATGTC; R: CTTGCTCTCTGACCCAGTTG | (AGA)8 | NED | 361–373 | 0.08 | 4 | 0.133 | 0.132 | −0.006 | KF304608 |
| EST-SSR | B | Aat16 | This study | F: AACCACCGCTGATATTTTGG; R: GGGTTCAAGAAATGGGAATG | (GAA)7 | PET | 427–430 | 0.20 | 2 | 0.269 | 0.288 | 0.067 | KF304609 |
| gSSR | C | SFg6 | Cremer et al. 2006 | F: GTAACAATAAAAGGAAGCTACG; R: TGTGACACATTGGACACC | (AC)9 | VIC | 103–111 | 0.11 | 5 | 0.332 | 0.577 | 0.425 | DQ218456 |
| gSSR | C | SF324 | Cremer et al. 2006 | F: TTTGAACGGAAATCAAATTCC; R: AAGAACGACACCATTCTCAC | (CCG)8 | PET | 105–120 | 0.24 | 5 | 0.296 | 0.473 | 0.374 | DQ218461 |
| gSSR | C | NFF7 | Hansen et al. 2005 | F: CCCAAACTGGAAGATTGGAC; R: ATCGCCATCCATCATCAGA | (GA)33 | VIC | 116–174 | 0.13 | 26 | 0.857 | 0.896 | 0.043 | AY966495 |
| gSSR | C | SFb5 | Cremer et al. 2006 | F: AAAAAGCATCACTTTTCTCG; R: AAGAGGAGGGGAGTTACAAG | (CT)15 | FAM | 138–160 | 0.20 | 10 | 0.373 | 0.713 | 0.477 | DQ218455 |
| gSSR | C | SFb4 | Cremer et al. 2006 | F: GCCTTTGCAACATAATTGG; R: TCACAATTGTTATGTGTGTGG | (GT)16 | NED | 149–205 | 0.30 | 25 | 0.667 | 0.865 | 0.229 | DQ218454 |
| gSSR | C | Aag01 | This study | F: GCTTATTCTCACTGCTCGCC; R: ATGACTTGAAGGTGGATGCC | (CTT)15 | PET | 193–250 | 0.15 | 13 | 0.804 | 0.768 | −0.046 | KF304592 |
| gSSR | C | SF1 | Cremer et al. 2006 | F: TTGACGTGATTAACAATCCA; R: AAGAACGACACCATTCTCAC | (CCG)9 | VIC | 208–229 | 0.17 | 6 | 0.555 | 0.511 | −0.086 | DQ218453 |
| gSSR | C | Aag02 | This study | F: TATTCCTCCACTTGGGTGCT; R: GGTGGAGATCCGTATGCAAT | (GA)13 | FAM | 208–250 | 0.37 | 19 | 0.363 | 0.855 | 0.575 | KF304593 |

[C], final concentration in each primer premix (μM); A, number of alleles; HO and HE, observed and expected heterozygosities; FIS, inbreeding coefficient

The Type-it Microsatellite PCR kit (QIAGEN, Germany) was used to carry out multiplex reactions. The final PCR volume was optimized to 6 μl to reduce costs. The PCR mix for both multiplexes was: 3 μl of Type-it Microsatellite Buffer, 2 μl of primer premix, and 1 μl of DNA (~20 ng/μl). Concentrations and fluorescent dyes for each primer pair in the primer premix are presented in Table 1. Both EST-SSR multiplex sets had the same PCR thermal profile: an initial step at 95 °C for 5 min, followed by 32 cycles at 95 °C for 30 s, 57 °C for 90 s, and 72 °C for 30 s, with a final 30-min extension step at 60 °C. PCR products were run on an ABI 3500 automatic sequencer (Applied Biosystems, USA), with LIZ-500 as internal size standard. The match between simplex and multiplex profiles of all 192 samples was also checked to control for allele amplification competition and possible allelic drop-out. Chromatograms were analyzed using GeneMapper v4.1 (Applied Biosystems, USA).

Using the same PCR conditions as above, we tested the transferability of the newly developed EST-SSRs to ten Mediterranean, three North American, and four Asian Abies species and subspecies, using one to seven individuals per taxon (Table 2).
Table 2

Transferability of 16 A. alba EST-SSRs into 17 congeneric taxa from the Mediterranean (first 10 taxa, sections Abies and Piceaster), Asia (taxa 11 to 14, sections Momi and Balsamea) and America (last 3 taxa, sections Balsamea and Grandis)

| Species | N | Aat01 | Aat02 | Aat03 | Aat04 | Aat05 | Aat06 | Aat07 | Aat08 | Aat09 | Aat10 | Aat11 | Aat12 | Aat13 | Aat14 | Aat15 | Aat16 | Transferability rate |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| A. borisii-regis Mattf. | 2 | + | ++ | ++ | ++ | + | + | ++ | + | + | ++ | ++ | + | + | + | ++ | + | 1 |
| A. cephalonica Loudon | 4 | ++ | ++ | ++ | ++ | + | + | ++ | + | + | ++ | + | ++ | ++ | ++ | + | + | 1 |
| A. nordmanniana (Steven) Spach | 3 | ++ | ++ | + | ++ | + | + | ++ | ++ | ++ | ++ | ++ | ++ | ++ | ++ | ++ | ++ | 1 |
| A. nordmanniana subsp. equi-trojani (Asch. & Sint. ex Boiss.) Coode & Cullen | 3 | ± | + | ++ | + | + | + | ± | + | + | ++ | + | ++ | ++ | ++ | + | + | 1 |
| A. nordmanniana subsp. bornmuelleriana (Mattf.) Coode & Cullen | 7 | ++ | ++ | ++ | ++ | + | + | ++ | + | ++ | ++ | ++ | ++ | ++ | ++ | + | + | 1 |
| A. nebrodensis (Lojac.) Mattei | 3 | ++ | ++ | ++ | ++ | ++ | ++ | ++ | ++ | ++ | ++ | ++ | ++ | ++ | ++ | ++ | ++ | 1 |
| A. cilicica (Antoine & Kotschy) Carrière | 4 | ++ | + | ++ | ++ | ++ | + | ± | ++ | ++ | ± | ++ | ++ | + | ++ | + | + | 1 |
| A. pinsapo Boiss. | 4 | ++ | ++ | ++ | + | + | + | ± | ± | ++ | ++ | ++ | ± | ++ | ++ | + | + | 1 |
| A. pinsapo var. marocana (Trab.) Ceballos & Bolaños | 2 | ++ | ++ | + | + | + | + | ± | + | + | ± | + | ++ | ++ | ± | + | + | 1 |
| A. numidica de Lannoy ex Carrière | 4 | + | ++ | ++ | ++ | ++ | + | ++ | + | + | ++ | + | ++ | + | ++ | + | + | 1 |
| A. recurvata Mast. | 1 | + | ++ | + | + | + | + | + | + | + | + | + | ++ | + | ++ | + | + | 1 |

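| Species | N | Scores at amplified loci | Transferability rate |
|---|---|---|---|
| A. sibirica Ledeb. | 2 | +, +, ++, +, ++, +, ++, ++, ++, ++, +, +, + | 0.81 |
| A. veitchii Lindl. | 3 | ++, +, ++, ++, ++, +, ++, +, +, ±, +, +, + | 0.81 |
| A. koreana E. H. Wilson | 2 | ++, +, +, +, +, +, +, +, +, +, +, + | 0.75 |
| A. lasiocarpa (Hook.) Nutt. | 1 | +, ++, +, +, ++, ++, +, ++, ++, +, +, +, + | 0.81 |
| A. concolor (Gordon & Glend.) Lindl. ex Hildebr. | 4 | ++, ++, ++, ++, +, +, +, +, +, ++, ++, +, +, + | 0.88 |
| A. grandis (Douglas ex D. Don) Lindl. | 1 | +, ++, +, +, +, +, +, ++, +, ++, +, +, + | 0.81 |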

Taxonomic reference: USDA, ARS, National Genetic Resources Program, Germplasm Resources Information Network - (GRIN), except A. borisii-regis (IUCN Red list of threatened taxa)

− no amplification; ± some amplification, but optimization is needed; + successful high-quality amplification; ++ successful high-quality amplification and locus is polymorphic. N number of samples analyzed per taxon

SSRs from genomic DNA

In this panel, we included six gSSRs developed by Hansen et al. (2005) and Cremer et al. (2006), some of which had already been multiplexed (Hansen et al. 2008; Cremer et al. 2012; Gömöry et al. 2012), and two gSSRs developed from an A. alba-enriched library (Malausa et al. 2011). They were selected from two sets of 12 and 6 gSSRs, respectively, according to their quality and polymorphism tested on eight samples (four samples from ABE and four samples from BL). The PCR thermal profile was: denaturation at 94 °C for 4 min, followed by 10 cycles at 94 °C for 30 s, 61 °C for 40 s (decreasing 1 °C/cycle), and 72 °C for 40 s, followed by 29 cycles at 94 °C for 30 s, 51 °C for 40 s, and 72 °C for 45 s, with a final 10-min extension step at 72 °C. The quality check of PCR products followed the same procedure as for EST-SSRs.

All gSSRs were first validated using the M13-tail labeling technique and then multiplexed according to their size range (Table 1). The multiplex reactions, PCR amplification, and sizing of PCR products were carried out as for EST-SSRs above, except that the annealing step in the PCR thermal profile was 59 °C for 60 s.

Genotype Scoring and Data Analyses

All population genetic analyses were carried out on the whole dataset using both the EST-SSR and gSSR multiplexes. To estimate the error rates, the 192 samples screened with the 16 EST-SSRs and eight gSSRs were scored by two readers, and type A and type B errors were estimated. Type A error refers to the case when a heterozygote is mistaken for a homozygote, or vice versa. Type B error refers to a wrongly scored allele. When the two readers disagreed, a final decision was made by joint agreement.
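
The two error types can be made concrete with a small sketch that compares the calls of two readers for the same samples at one locus. It assumes errors are counted per genotype call, which the text does not specify; the function names and allele sizes are hypothetical.

```python
def classify(call_1, call_2):
    """Return None if two genotype calls agree, 'A' for a homozygote/heterozygote
    disagreement, and 'B' for any other disagreement (a differently scored allele)."""
    g1, g2 = tuple(sorted(call_1)), tuple(sorted(call_2))
    if g1 == g2:
        return None
    homozygous_1 = g1[0] == g1[1]
    homozygous_2 = g2[0] == g2[1]
    return "A" if homozygous_1 != homozygous_2 else "B"

def error_rates(reader_1, reader_2):
    """Return (type A rate, type B rate) over paired genotype calls."""
    kinds = [classify(a, b) for a, b in zip(reader_1, reader_2)]
    return kinds.count("A") / len(kinds), kinds.count("B") / len(kinds)

# Hypothetical calls (allele sizes in bp) for four samples at one locus.
reader_1 = [(103, 109), (109, 109), (112, 115), (103, 103)]
reader_2 = [(103, 109), (109, 112), (112, 118), (103, 103)]
print(error_rates(reader_1, reader_2))
```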

The software GenAlEx v6.5 (Peakall and Smouse 2012) was used to assess genetic diversity of the four A. alba populations. For each population, the total number of alleles (A), observed (HO) and expected heterozygosity (HE), and the fixation index (FIS) were calculated at each locus. The program INEst (Chybicki and Burczyk 2009) was used to estimate the frequencies of null alleles in the dataset, running the individual inbreeding model with a Gibbs sampler of 10⁵ iterations. Allelic richness (AR) was computed using the program HP-rare (Kalinowski 2005) in order to make it independent of sample size. Rarefaction was carried out with a common total sample size of 64 genes (32 diploid individuals). The software Genepop v4.2.1 was used to test for genotypic disequilibrium among loci using log likelihood ratio statistics (Rousset 2008) and the default Markov chain parameters.
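
The per-locus statistics named above follow from simple counts. The sketch below computes HO, HE (as 1 minus the sum of squared allele frequencies, without small-sample correction), FIS = 1 − HO/HE, and allelic richness rarefied to g gene copies using the standard rarefaction formula. It is an illustration under these assumptions rather than a reimplementation of GenAlEx or HP-rare; function names and genotypes are hypothetical.

```python
from collections import Counter
from math import comb

def locus_stats(genotypes):
    """Return (HO, HE, FIS) for one locus from a list of diploid genotypes."""
    n = len(genotypes)
    ho = sum(a != b for a, b in genotypes) / n
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    he = 1.0 - sum((count / (2 * n)) ** 2 for count in counts.values())
    fis = 1.0 - ho / he if he > 0 else float("nan")
    return ho, he, fis

def allelic_richness(genotypes, g):
    """Expected number of alleles in a random subsample of g gene copies.
    g must not exceed the number of gene copies sampled."""
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = sum(counts.values())
    # For each allele: probability that it appears at least once among g copies.
    return sum(1.0 - comb(total - count, g) / comb(total, g) for count in counts.values())

genotypes = [(1, 1), (1, 2), (1, 2), (2, 2)]  # hypothetical alleles coded 1 and 2
print(locus_stats(genotypes))
print(allelic_richness(genotypes, 2))
```

In the study, g would be 64 gene copies per population, matching the rarefaction size stated above.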

Single parent and parent pair exclusion probabilities were calculated from allele frequencies according to the formula by Jamieson and Taylor (1997) using FaMoz (Gerber et al. 2003). Differentiation indices (Jost’s D and Weir and Cockerham FST) were estimated using the diveRsity package in R (Keenan et al. 2013) and GenAlEx v6.5 (Peakall and Smouse 2012), respectively.
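
Exclusion probabilities of this kind depend only on allele frequencies. The sketch below uses the closed-form expressions commonly attributed to Jamieson and Taylor (1997) for excluding a single candidate parent when no other parent is known and for excluding a candidate parent pair, and combines loci under the assumption of independence. It is not FaMoz code; function names and frequencies are hypothetical.

```python
def _power_sums(freqs):
    """Sums of allele frequencies raised to the powers 2..6."""
    return {k: sum(p ** k for p in freqs) for k in range(2, 7)}

def single_parent_exclusion(freqs):
    """Probability of excluding a random candidate parent (other parent unknown)."""
    s = _power_sums(freqs)
    return 1 - 4 * s[2] + 2 * s[2] ** 2 + 4 * s[3] - 3 * s[4]

def parent_pair_exclusion(freqs):
    """Probability of excluding a random candidate parent pair."""
    s = _power_sums(freqs)
    return (1 + 4 * s[4] - 4 * s[5] - 3 * s[6]
            - 8 * s[2] ** 2 + 8 * s[2] * s[3] + 2 * s[3] ** 2)

def combined_exclusion(per_locus):
    """Combine per-locus exclusion probabilities, assuming independent loci."""
    not_excluded = 1.0
    for p in per_locus:
        not_excluded *= 1.0 - p
    return 1.0 - not_excluded

freqs = [0.5, 0.5]  # hypothetical locus with two equally frequent alleles
print(single_parent_exclusion(freqs), parent_pair_exclusion(freqs))
print(combined_exclusion([single_parent_exclusion(freqs)] * 6))
```

The multiplicative combination shows why a handful of highly polymorphic loci can push the overall exclusion probability close to 1.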

A Bayesian clustering approach was used to detect population genetic structure using the software Structure v2.3 (Pritchard et al. 2000; Falush et al. 2003). We used the admixture model, in which the fraction of ancestry from each cluster is estimated for each individual, with correlated allele frequencies and the “locprior” option, in which population identity is used as a priori information for clustering. Five independent runs for each K value ranging from 1 to 7 were performed after a burn-in period of 10⁴ steps followed by 5 × 10⁴ Markov chain Monte Carlo replicates. To identify the number of clusters (K) that best explained the data, the rate of change of L(K) (ΔK) between successive K values was calculated following Evanno et al. (2005) using the web application “StructureHarvester” (Earl and von Holdt 2012).
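
The ΔK statistic can be sketched as the absolute second-order difference of the mean L(K) across runs, divided by the standard deviation of L(K) at that K. The exact averaging used by StructureHarvester is assumed here rather than verified, and the log-likelihood values below are invented for illustration only.

```python
from statistics import mean, stdev

def delta_k(ln_prob):
    """ln_prob maps each K to the list of L(K) values from replicate runs.
    Returns {K: ΔK} for every K that has both neighbours K-1 and K+1.
    Raises ZeroDivisionError if all runs at some K give identical L(K)."""
    average = {k: mean(values) for k, values in ln_prob.items()}
    result = {}
    for k in sorted(ln_prob):
        if k - 1 in average and k + 1 in average:
            second_difference = abs(average[k + 1] - 2 * average[k] + average[k - 1])
            result[k] = second_difference / stdev(ln_prob[k])
    return result

# Hypothetical L(K) values from three runs per K (not data from this study).
runs = {
    1: [-5000.0, -5002.0, -4998.0],
    2: [-4700.0, -4702.0, -4698.0],
    3: [-4400.0, -4403.0, -4397.0],
    4: [-4390.0, -4395.0, -4385.0],
    5: [-4385.0, -4395.0, -4375.0],
}
dk = delta_k(runs)
best_k = max(dk, key=dk.get)
print(dk, best_k)
```

ΔK is undefined for the smallest and largest K tested, which is why runs from K = 1 to 7 can only support a choice between K = 2 and K = 6.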

Functional Annotation of EST-SSRs

Accurate functional annotation of silver fir EST-SSRs is a difficult task due to the limited availability of reference genome/gene sequences in public databases for conifer species. In order to maximize successful annotation, assembled contigs containing EST-SSRs were compared against four different databases (ConGenIE (http://congenie.org/); GenBank (Benson et al. 2013); UniProtKB/TrEMBL; and UniProtKB/Swiss-Prot (The UniProt Consortium 2012)) using BLASTx (E value cut-off: <10⁻³) and were searched for protein families.

Results and Discussion

Multiplex PCR Optimization

We detected 2,150 putative EST-SSRs. This relatively low number is in agreement with previous studies where a negative correlation between SSR frequencies and the genome size was found, suggesting that it may be challenging to develop a large number of EST-SSRs for conifers (Ueno et al. 2012). Based on the WebSat analysis, we selected and tested 67 EST-SSRs with single non-interrupted, non-compound motifs and with the highest number of repeats, which are expected to display high polymorphism (Petit et al. 2005).

From the original set of 67 EST-SSRs tested in simplex reactions, 16 were retained, which amplified seven di- and nine tri-nucleotide SSRs. In this selection process, we removed 41 markers because of no or very poor PCR amplification, 5 because of multi-banding patterns, and 5 because of no polymorphism. This rather low success rate (24 %) is comparable to those observed in other conifer species (e.g., Pfeiffer et al. 1997; Pinzauti et al. 2012; Sebastiani et al. 2012; Wagner et al. 2012), which are characterized by large genomes, partly due to large gene families and abundance of pseudo-genes and partly due to a very high content of repetitive DNA such as transposable elements (Kovach et al. 2010). The primer sequences of the 16 selected markers and their main characteristics are reported in Table 1. In addition to high-quality allele binning, the 16 EST-SSRs were selected because progeny tests confirmed their Mendelian segregation and the very low number of mother–offspring mismatches indicated a low frequency of null alleles.

The 16 EST-SSRs were assembled in two 8-plexes (multiplexes A and B). We limited the number of loci included in each multiplex to avoid the possible overlap of alleles from different loci due to their wide size ranges. Electropherograms and marker size ranges for the two multiplexes are shown in Fig. 2. Both multiplexes had high amplification quality and ample polymorphism. The comparison between simplex and multiplex amplification did not reveal allele dropout. In addition, combining already available and newly developed gSSRs, an 8-plex showing high-quality amplification and high polymorphism was also successfully designed for gSSRs (multiplex C, Table 1).
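
The design constraint described above, that loci labeled with the same dye must not overlap in allele size, can be checked directly against the dyes and size ranges listed in Table 1. The sketch below does this for multiplex A; the function name is hypothetical.

```python
from itertools import combinations

# (locus, dye, smallest allele, largest allele) in bp for multiplex A, from Table 1.
MULTIPLEX_A = [
    ("Aat01", "FAM", 103, 127), ("Aat02", "VIC", 123, 129),
    ("Aat03", "PET", 149, 161), ("Aat04", "FAM", 158, 191),
    ("Aat05", "VIC", 177, 192), ("Aat06", "NED", 196, 214),
    ("Aat07", "PET", 219, 241), ("Aat08", "NED", 302, 312),
]

def dye_conflicts(loci):
    """Return pairs of loci that share a dye and have overlapping size ranges."""
    conflicts = []
    for (name_1, dye_1, low_1, high_1), (name_2, dye_2, low_2, high_2) in combinations(loci, 2):
        if dye_1 == dye_2 and low_1 <= high_2 and low_2 <= high_1:
            conflicts.append((name_1, name_2))
    return conflicts

print(dye_conflicts(MULTIPLEX_A))  # an empty list means no same-dye overlap
```

Loci such as Aat01 (103–127 bp) and Aat02 (123–129 bp) overlap in size but can share a multiplex because they carry different dyes.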
Fig. 2

Examples of an individual electropherogram for multiplexes A and B (panels a and c, respectively) and marker ranges for multiplexes A and B (panels b and d, respectively). RFU relative fluorescence units

Genotype Scoring and Analysis

Binning of alleles was consistent across the whole dataset, indicating that loci display allele sizes according to the expected di- and tri-nucleotide repeat variation. Two loci, Aat07 and Aat08, showed intermediate size variants (1-bp variation) for some individuals, which could be clearly distinguished from the other size classes and thus had no impact on binning precision. The possibility to correctly and easily score additional variants can increase the precision of the analysis (Guichoux et al. 2011) and is an indication of the existence of mutation types other than insertion/deletion of SSR motifs within the sequence or the flanking regions (Barthe et al. 2012). At locus Aat12, more than two amplification products (up to four) were detected in single individuals from the two Italian populations. This could be due to a duplication of this locus in some populations, but further analyses are needed to confirm this hypothesis.

Type A error ranged from 0 to 0.53 % and type B error ranged from 0 to 0.26 % across the whole EST-SSR dataset. Mean type A error was 0.40 % for multiplex A and 0.15 % for multiplex B, whereas mean type B error was 0.20 % for multiplex A and 0.11 % for multiplex B. Higher type A and B error rates were observed for the gSSR multiplex (type A error ranged between 0 and 2.6 %, type B error ranged between 0 and 1.6 %), due to the stronger stuttering displayed by some loci, which made the reading less straightforward. The error rates were significantly reduced by adopting well-defined reading rules.

The 16 EST-SSRs selected were all polymorphic and displayed a low to moderate level of diversity. Observed (HO) and expected (HE) heterozygosity and inbreeding coefficient (FIS) per locus are reported in Table 1. HO ranged between 0.115 and 0.581 for multiplex A and between 0.032 and 0.734 for multiplex B. HE was between 0.113 and 0.656 for multiplex A and between 0.031 and 0.749 for multiplex B. The number of alleles per locus ranged from 2 to 19 and the mean number of alleles was 5.75 and 8.25 for multiplex A and B, respectively. Allelic richness varied between 1.75 (Aat08) and 5.81 (Aat07), and between 1.90 (Aat09) and 9.10 (Aat12), for multiplex A and B, respectively (see Fig. 3). By contrast, gSSRs exhibited higher HO and HE values, with maxima of 0.857 and 0.896, respectively, as well as a higher number of alleles per locus (Table 1) and a higher allelic richness (Fig. 3), a result already reported in other studies (e.g., Sullivan et al. 2013). Lower diversity at EST-SSR than at gSSR loci can be explained by the higher degree of conservation of the transcribed regions of the genome.
Fig. 3

Null allele frequencies, allelic richness, and paternity exclusion probabilities for the 16 EST-SSRs and the eight gSSRs analyzed in each population. In the bottom panel, a comparison between Weir and Cockerham FST and Jost’s D is presented for each marker analyzed

Among the 120 possible pairwise combinations of EST-SSR loci, none showed significant linkage disequilibrium at the 0.05 significance level. When the linkage disequilibrium analysis was performed on all SSRs (16 EST-SSRs + 8 gSSRs), only three of the 276 possible combinations (about 1 %) were significant (P < 0.05).

Weir and Cockerham’s FIS values at EST-SSR loci were generally low or slightly negative. This suggests a low frequency of null alleles, which was confirmed by the non-significant null allele frequencies estimated at all loci in the four populations (Fig. 3). When inbreeding coefficients were estimated taking into account null allele frequencies by INEst, we found no FIS significantly different from 0 at the population level. Therefore, the positive and high Weir and Cockerham’s FIS values estimated at some gSSRs (Table 1) are likely to be the consequence of a high frequency (>20 %) of null alleles (Fig. 3). Discarding the most null-allele-prone loci or making adjustments for the presence of null alleles will thus be necessary when using gSSRs for estimating diversity and differentiation parameters (Chapuis and Estoup 2007) or for paternity and parentage analyses (Oddou-Muratorio et al. 2009; Piotti et al. 2012). The higher mutation rate expected for the non-coding portion of the genome, which also affects the annealing sites, is possibly the main reason for the higher frequency of null alleles at gSSR than at EST-SSR loci (Kovach et al. 2010).

The estimates of differentiation among populations were generally slightly higher for EST-SSRs (FST up to 0.243 for locus Aat08) than for gSSRs (FST up to 0.120 for SFg6; on average FST = 0.087 and 0.065 at EST-SSRs and gSSRs, respectively). Lower differentiation estimates for gSSRs are expected due to the lower frequency of the most frequent alleles and the higher within-population genetic diversity (Jakobsson et al. 2013). However, when polymorphism within locus was taken into account, population differentiation measured using Jost’s D was smaller for EST-SSRs than for gSSRs (Fig. 3).
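
The contrast between fixation-type indices and Jost’s D can be illustrated with a minimal sketch. It uses Nei’s GST = (HT − HS)/HT as a simple stand-in for Weir and Cockerham’s FST (which additionally corrects for sample size) and Jost’s D = [(HT − HS)/(1 − HS)] × [n/(n − 1)] with equal population weights. The function name and allele frequencies are hypothetical.

```python
def differentiation(pop_freqs):
    """pop_freqs: one allele-frequency list per population, alleles in the same order.
    Returns (GST, Jost's D), weighting populations equally."""
    n = len(pop_freqs)
    hs = sum(1 - sum(p * p for p in freqs) for freqs in pop_freqs) / n
    mean_freqs = [sum(column) / n for column in zip(*pop_freqs)]
    ht = 1 - sum(p * p for p in mean_freqs)
    gst = (ht - hs) / ht
    jost_d = (ht - hs) / (1 - hs) * n / (n - 1)
    return gst, jost_d

# Low-diversity locus: two alleles, frequencies shifted between populations.
low = [[0.9, 0.1], [0.5, 0.5]]
# High-diversity locus: ten alleles, none shared between the two populations.
high = [[0.2] * 5 + [0.0] * 5, [0.0] * 5 + [0.2] * 5]
print(differentiation(low))
print(differentiation(high))
```

In the second case the populations share no alleles, yet GST stays near 0.11 because within-population heterozygosity is high, whereas D reaches 1. This is the kind of effect that can reverse the ranking of marker types between the two measures.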

The Bayesian clustering approach revealed a clear geographic pattern, with a best grouping at K = 3 (Fig. 1). There was a clear separation between the Italian and Balkan gene pools as well as within the Italian gene pool, reflecting different Quaternary histories (see Cheddadi et al. 2013 for a recent synthesis). Interestingly, gSSRs showed a very similar pattern, although with a slightly higher degree of admixture (Fig. 1).

The exclusion probability was higher for gSSRs than for EST-SSRs (Fig. 3), as a result of a greater number of alleles at genomic loci. However, assignment biases related to null alleles suggest the use of a large enough set of EST-SSRs or a carefully selected set of EST-SSRs and gSSRs not affected by the presence of null alleles for population genetic studies. As an example, in the SIL population, paternity exclusion probabilities >0.999 can be achieved by using only six markers (NFF7, SFb4, Aag01, Aat12, Aat10, Aat14) seemingly not affected by null alleles.

The 16 selected EST-SSRs were also tested for possible outliers using the Bayesian test of Foll and Gaggiotti (2008), setting the parameters as in Soto-Cerda and Cloutier (2013), and using the software BayeScan v2.1 (http://cmpg.unibe.ch/software/bayescan/). BayeScan, which is based on what is recognized as the best approach to avoid detection of false positives (Pérez-Figueroa et al. 2010; Narum and Hess 2011), did not identify any outliers.

Functional Annotation of EST-SSRs

Functional annotation of the contigs containing the 16 A. alba EST-SSRs revealed that seven contigs had homology with known proteins and five contigs had homology with putative proteins, as shown in Online Resource 1.

EST-SSR Transferability

The cross-transferability of the newly developed EST-SSRs was high and reflected the degree of relatedness among taxa (Table 2). In particular, the amplification rate was 100 % for the eight Mediterranean Abies taxa belonging to section Abies (same as A. alba) but also for Abies numidica and Abies pinsapo (section Piceaster) and the Asian fir Abies recurvata (section Momi). It ranged from 75 to 81 % for North American firs (sections Balsamea and Grandis) and Asian firs (section Balsamea). Most of the markers appeared to be polymorphic across the different taxa when sample sizes were large enough for tests to be made. Some amplifications were of poor quality (± in Table 2). This could be due either to low DNA quality or a need for optimization. Although Mendelian segregation analysis and additional amplifications still need to be performed on a larger sample size, our pilot transferability and polymorphism results across the genus Abies are very promising and suggest the usefulness of the EST-SSRs we developed for population genetic studies in this genus.

Conclusion and Perspectives

The two EST-SSR multiplexes designed for A. alba allow fast, cost-effective, and accurate genotyping of a large number of individuals and populations. The time spent on their optimization is largely compensated by the accuracy of allele binning, which allows for rapid and efficient screening of large sample sizes. The two newly developed EST-SSR multiplexes are currently being applied to a range-wide sample of 28 Italian populations (each represented by 50 individuals) to resolve in detail their past population dynamics, which are expected to be more complex than previously hypothesized (Cheddadi et al. 2013). The comparison of the two EST-SSR multiplexes to a control set of gSSRs revealed their lower diversity and lower frequency of null alleles, as expected considering the lower mutation rates assumed in the coding portion of the genome (Kovach et al. 2010). Careful selection of markers from the three multiplexes will facilitate the identification of the best combination to achieve the highest possible exclusion probability in gene flow studies. In general, these newly developed SSRs will be useful for conservation genetic studies and for improving our knowledge of the population dynamics of Abies species.

The EST-SSR markers developed here showed a high transferability rate across the genus Abies, even for phylogenetically distant species. Although additional optimization should be performed and polymorphism should be assessed in a larger sample, these first results look very promising for population genetic studies within the genus Abies.

It should also be stressed that these EST-SSRs, being less prone to homoplasy (due to their putative lower mutation rates) and highly transferable across species, could also be used for phylogenetic analysis, as already shown in the genus Epimedium (Zeng et al. 2010). Moreover, considering the BLASTx results (Online Resource 1), some of the EST-SSRs might be linked to genes involved in controlling important traits, thus providing a potentially powerful tool for genetic mapping.

Acknowledgments

This study was financed by the Italian MIUR project “Biodiversitaly” (RBAP10A2T4) and by the ERA-Net BiodivERsA LinkTree project (EUI2008-03713). We thank Dr. Popescu Flaviu, Daniel Pitar, Ovidiu Iordan and Daniel Suciu (Forest Research and Management Institute, Romania) and Prof. Peter Zhelev (University of Forestry, Sofia, Bulgaria) for their help with the sampling in the Romanian Carpathians and in the Pirin Mountains.

Supplementary material

11105_2013_688_MOESM1_ESM.doc (54 kb)
Table S1 (DOC 54 kb)

Copyright information

© Springer Science+Business Media New York 2013