Characterization of variable EST SSR markers for Norway spruce (Picea abies L.)

Fluch, Silvia; Burg, Agnes; Kopecky, Dieter; Homolka, Andreas; Spiess, Nadine; Vendramin, Giovanni G

doi:10.1186/1756-0500-4-401

Research article
Open access
Published: 12 October 2011

Volume 4, article number 401, (2011)

BMC Research Notes

Silvia Fluch¹, Agnes Burg¹, Dieter Kopecky¹, Andreas Homolka¹, Nadine Spiess¹ & Giovanni G Vendramin²

Abstract

Background

Norway spruce is widely distributed across Europe and is the predominant tree of the Alpine region. Fast growth and the fact that timber can be harvested cost-effectively in relatively young populations define its status as one of the most economically important tree species of Northern Europe. In this study, EST-derived simple sequence repeat (SSR) markers were developed for the assessment of putative functional diversity in Austrian Norway spruce stands.

Results

SSR sequences were identified by analyzing 14,022 publicly available EST sequences. Tri-nucleotide repeat motifs were most abundant in the data set followed by penta- and hexa-nucleotide repeats. Specific primer pairs were designed for sixty loci. Among these, 27 displayed polymorphism in a testing population of 16 P. abies individuals sampled across Austria and in an additional screening population of 96 P. abies individuals from two geographically distinct Austrian populations. Allele numbers per locus ranged from two to 17 with observed heterozygosity ranging from 0.075 to 0.99.

Conclusions

We have characterized variable EST SSR markers for Norway spruce detected in expressed genes. Due to their moderate to high degree of variability in the two tested screening populations, these newly developed SSR markers are well suited for the analysis of stress related functional variation present in Norway spruce populations.

Background

Natural populations of Picea abies L. (Norway spruce) are found from north-western Europe outside permafrost areas down to northern Greece, westwards to the Massif Central (France) and east to the Ural Mountains. Picea abies grows above 400-500 m and ascends to close to 2000 m in the Alps. Allozyme-based studies of genetic variation have shown that genetic differentiation among Picea abies populations is rather low over the species' whole distribution range [1, 2]. Previous studies on the genetic structure of P. abies using organelle markers showed pronounced differentiation between north-east boreal origins and areas in the central European mountains [3, 4], supporting the hypothesis of two distinct main glacial refugia as postulated from pollen data [5].

Initial reports on the occurrence of SSRs in conifers such as Pinus radiata [6], Pinus sylvestris [7] or Picea abies [8] have shown that marker development for such complex genomes is difficult. Primers flanking putative SSR regions frequently amplify several DNA fragments in addition to the expected ones. This can be attributed to the very large size of conifer genomes, with Norway spruce having 39,140 Mb in the diploid genome (2n = 24), as well as to their high proportion of repetitive DNA [9]. These repetitive elements, as well as pseudogenes, frequently produce complex amplification products from multiple loci [10, 11]. Scotti et al. [12] showed that this problem can be overcome by isolating di-nucleotide SSRs from cDNA libraries of conifer species, relying on the fact that expressed genes are less likely to be repetitive in the genome.

Following this line of exploiting publicly available expressed sequence data (ESTs) as a source for marker development [13], as also described by Rungis et al. [14] for conifers, we used Norway spruce EST sequences from NCBI for in silico identification of SSR regions that could serve for marker development.

Results

Screening of 14,022 Norway spruce EST sequences from the NCBI dbEST revealed 158 sequences containing various repeat motifs. Clustering of these sequences produced 92 SSR-containing 'unigenes', which fell into 36 clusters and 56 singletons. In these 92 unigenes, 48 different repeat motifs were identified. Tri-nucleotide repeats (50%), in particular (CGG)_n and (CAA)_n, were the most abundant, followed by penta- (23%) and hexanucleotide repeats (10.4%). Di- and tetranucleotide repeats were the least abundant with 8.3% each. Twenty-seven out of 60 planned primer pairs amplified polymorphic products in the testing population of 16 individuals, while two did not generate a fragment and 31 proved to be monomorphic [see Additional file 1]. Four of the 27 polymorphic SSR regions, which showed more than 2 alleles per locus in the testing population (Pa_16, Pa_24, Pa_34 and Pa_40), were excluded from further analysis.

In a total of 96 individuals, 135 alleles were detected in the 23 remaining loci (Table 1), ranging from 2 to 17 alleles per locus (Table 2) with a mean value of 5.6. The total number of effective alleles was 50.645, with a minimum of 1.123, a maximum of 4.115 and a mean value of 2.202. 87% of the variable regions were based on trinucleotide repeats; the rest were pentanucleotide repeats. Di-, tetra- and hexanucleotide repeats proved not to be polymorphic. Loci Pa_41 and Pa_53 were not polymorphic in the Mayrhofen population.

Values for expected heterozygosity ranged from 0.021 to 0.780 in the Gusswerk population (mean 0.367) and from 0.123 to 0.769 in the Mayrhofen population (mean 0.385). Observed heterozygosity was between 0.021 and 1.000 in the Gusswerk population (mean 0.461) and between 0.042 and 1.000 in the Mayrhofen population (mean 0.515). Differences in observed heterozygosities between the two populations ranged from zero (Pa_44) to 0.894 (Pa_49). The minimum difference of expected heterozygosities between the two populations was found at locus Pa_12 (0.002) and the maximum at locus Pa_49 (0.382). In the Gusswerk population, the observed heterozygosity was significantly higher than the expected value in eleven loci and significantly lower in three loci. In the Mayrhofen population, the observed heterozygosity was significantly higher than expected in thirteen loci and significantly lower in four loci.

Eleven loci showed significant departure from Hardy-Weinberg expectations (HWE, Guo and Thompson's exact test [P < 0.05]) within the 48 investigated individuals from the Gusswerk population. The number of loci with significant departure from HWE was higher in the Mayrhofen population (16 loci). The frequency of null alleles varied from zero to 99.0% in the Gusswerk population and from zero to 85.4% in the Mayrhofen population. In both populations, null alleles were present with a high frequency (above 5%) in more than half of the loci deviating from HWE. Tests for linkage disequilibrium (P < 0.01) revealed no disequilibrium among pairwise compared loci. F_st ranged between 0 and 0.288 with a mean value of 0.033 (after application of the ENA correction described by Chapuis and Estoup [24]; Table 3).

Repeat numbers of all polymorphic di- and pentanucleotide repeats added up to a multiple of three and therefore did not cause a frameshift. All polymorphic SSRs lie within open reading frame (ORF) sequences as detected by GetORF [15]. Fourteen sequences hit functionally annotated genes when compared to the NCBI nucleotide database [see Additional file 2]. The test for outlier loci indicated that Pa_51 (P = 1.000) and Pa_42 (P = 0.981) are under positive selection, while the other loci are presumed to behave neutrally (P < 0.95).
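
The per-locus statistics reported above (number of alleles, effective number of alleles, expected and observed heterozygosity) can be illustrated with a minimal Python sketch. It uses the common textbook definitions, effective alleles as 1/Σp² and expected heterozygosity as 1 − Σp², without a small-sample correction. This is an assumption made for illustration: the function name and the example genotypes are hypothetical, and the exact estimators applied by the software used in the study are not stated in the text.

```python
from collections import Counter


def locus_diversity(genotypes):
    """Return (n_alleles, n_effective, h_exp, h_obs) for one diploid locus.

    genotypes: list of (allele_a, allele_b) tuples, e.g. fragment sizes in bp.
    Assumed definitions (not taken from the article):
      n_effective = 1 / sum(p_i^2), h_exp = 1 - sum(p_i^2) (uncorrected).
    """
    if not genotypes:
        raise ValueError("at least one genotype is required")
    # Count every allele copy across all individuals (two per individual).
    counts = Counter(allele for pair in genotypes for allele in pair)
    total = sum(counts.values())
    sum_p2 = sum((c / total) ** 2 for c in counts.values())
    n_effective = 1.0 / sum_p2
    h_exp = 1.0 - sum_p2
    # Observed heterozygosity: share of individuals carrying two different alleles.
    h_obs = sum(1 for a, b in genotypes if a != b) / len(genotypes)
    return len(counts), n_effective, h_exp, h_obs


# Hypothetical allele calls (fragment sizes in bp) for four individuals.
example = [(150, 150), (150, 153), (153, 153), (150, 153)]
print(locus_diversity(example))  # (2, 2.0, 0.5, 0.5)
```

In the example, two equally frequent alleles give two effective alleles and an expected heterozygosity of 0.5; unequal allele frequencies push the effective number below the observed number, which is why the mean of 2.202 effective alleles is well below the mean of 5.6 observed alleles.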

Table 1 Characteristics of polymorphic SSR loci isolated from Picea abies L

Table 2 Number of observed and effective alleles, observed and expected heterozygosity estimates for 23 polymorphic SSR markers with two Picea abies L. populations

Table 3 Estimation of uncorrected and corrected FST, frequency of null alleles and deviation from Hardy-Weinberg equilibrium

Discussion

Several studies describe the development of SSR markers for Norway spruce [8, 12]. In the present work we used EST mining of sequences available in databases instead of classical approaches like screening of genomic libraries or fragment enrichment and subsequent cloning. The newly developed markers can be used to complete unsaturated maps, and might be useful in marker assisted breeding and population genetics.

The high number of loci deviating from Hardy-Weinberg equilibrium could partly be explained by the presence of null alleles (Table 3) and partly be a result of sampling or of selective pressure on the coding regions; this interpretation is supported by the fact that the loci deviating from HWE are not consistent between the two populations. Null alleles can occur due to mutations in primer binding sites and lead to the overestimation of homozygosity, as shown by Callen et al. [16]. The presence of null alleles as estimated in this study may be further confirmed by synthesis of alternative oligonucleotide primer pairs. The increased appearance of homozygotes in some loci supports the hypothesis of selective pressure, which might be caused by advantages of recessive or dominant homozygotes, known as directional selection. Positive selection was confirmed by outlier detection for two loci which show a high frequency of homozygotes. Both of them are located in coding regions: Pa_42 shows homology to a cysteine protease and Pa_51 is related to a cadmium-transporting ATPase. These genes play important roles in various physiological processes, including the response to biotic and abiotic stress [17, 18]. Therefore, additional data on both populations (e.g. environmental growth conditions) would be of interest. Locus Pa_51 in particular may be an interesting target for diversity studies because departure from HWE was only detected in the Mayrhofen population. Although these two loci show considerable degrees of population differentiation, the low F_st values in the other loci support previous findings [3, 4] that there is no major degree of differentiation among Alpine Norway spruce populations.
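
The link between null alleles and apparent excess homozygosity can be made concrete with a small sketch. It uses the simple moment estimator usually attributed to Brookfield, r = (He − Ho)/(1 + He), which is a swapped-in technique chosen for brevity and is not the method used in this study (the null allele frequencies reported here were estimated with FreeNA). The function name and the example values are hypothetical.

```python
def brookfield_null_frequency(h_exp, h_obs):
    """Rough null-allele frequency from a heterozygote deficit.

    Assumption (not the article's method): r = (He - Ho) / (1 + He),
    clamped at zero when observed heterozygosity exceeds the expectation.
    """
    if not (0.0 <= h_exp <= 1.0 and 0.0 <= h_obs <= 1.0):
        raise ValueError("heterozygosities must lie between 0 and 1")
    return max(0.0, (h_exp - h_obs) / (1.0 + h_exp))


# Hypothetical locus with a heterozygote deficit (He = 0.5, Ho = 0.2).
print(brookfield_null_frequency(0.5, 0.2))
# Hypothetical locus with a heterozygote excess: the estimate is clamped to zero.
print(brookfield_null_frequency(0.4, 0.9))
```

An estimator of this kind only responds to a heterozygote deficit, so it cannot account for loci in which observed heterozygosity is higher than expected.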

The new SSR markers characterized in this study are a good resource for assessing diversity in Alpine spruce populations and might be of further use for applications in forestry. Additionally, 31 monomorphic SSR loci were identified in this study. It is still possible to derive polymorphic data from them when screening Norway spruce populations of different origin, as it has already been shown that flanking regions of monomorphic SSRs show a moderate to high degree of variability [19]. Possible variation within these flanking regions in Norway spruce could serve as an alternative source for diversity studies, applying different techniques to identify variability within these regions.

All variable SSRs are located within ORF sequences and 60% of them gave significant hits to functional genes present in GenBank. A frameshift caused by length variation within an ORF might distort the coding region and thus lead to the loss of the encoded protein. In none of the newly identified EST-SSRs does a frameshift occur because either, as in 50% of the cases, the SSRs are trinucleotide repeats, or the repeat number of polymorphic di- and pentanucleotide repeats varies in such a way that the sum of the base pair variation is a multiple of three.
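
The frame-preservation argument above reduces to a modular check on allele lengths: the reading frame is kept when every allele differs from the others by a multiple of three base pairs. The following is a minimal sketch under that reading; the function name and the allele sizes are hypothetical and are not taken from Table 1.

```python
def preserves_reading_frame(allele_sizes_bp):
    """True if all alleles differ from the shortest one by a multiple of 3 bp."""
    if not allele_sizes_bp:
        raise ValueError("at least one allele size is required")
    shortest = min(allele_sizes_bp)
    return all((size - shortest) % 3 == 0 for size in allele_sizes_bp)


# Trinucleotide repeat: every added or lost repeat unit changes the length by 3 bp.
print(preserves_reading_frame([201, 204, 210]))  # True
# Pentanucleotide repeat varying by three units (15 bp) keeps the frame.
print(preserves_reading_frame([200, 215]))       # True
# Pentanucleotide repeat varying by a single unit (5 bp) would shift the frame.
print(preserves_reading_frame([200, 205]))       # False
```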

Conclusions

We have identified 92 new SSR regions with 48 different repeat motifs in the expressed part of the Norway spruce genome. In two screening populations of 48 individuals each, 23 SSR regions showed a moderate to high degree of polymorphism. Although a high number of SSR markers have already been developed for Norway spruce, most of the available SSRs consist of dinucleotide repeats and do not include our tri- and pentanucleotide repeats. We have demonstrated that, on the one hand, genetic differentiation between distinct stands is low but, on the other hand, selective forces are likely to have been acting on several genes, because deviations from HWE could not be explained by null alleles alone and the test for outliers revealed two loci under positive selection. The results give an insight into the abundant repeat classes and will be of use in the analysis of variation linked to expressed genes in Alpine Norway spruce populations as well as of the evolutionary forces acting at these loci.

Methods

14,022 Norway Spruce EST sequences were extracted from the NCBI dbEST and screened for SSR sequences with the SciRoKo software program [20]. Only perfect SSRs were searched in two iterations, one for mono- and dinucleotides (at least 4 repeats), and another one for tri-, tetra-, penta- and hexanucleotides (at least 3 repeats). SSR motif containing sequences were subjected to clustering and annotation using the EST2uni pipeline [21]. For clustering, the default settings (30 bp minimum overlap with at least 94% identity for pairwise alignment using BLAST, 93% overlap identity cutoff for assembly using CAP3) were applied. The resulting Unigene Set (contigs and singletons) was compared against three protein databases with BLASTX, in particular NCBI NR, Arabidopsis TAIR7, and Uniprot/Swissprot, using an e-value of 1e-10 as cutoff. The descriptions of the BLAST hits obtained with the different BLAST runs were parsed and merged to yield a descriptive annotation for each unigene. The annotations were attributed with modifiers like "Similar to" or "Highly similar to", depending on the e-value of the alignment with the corresponding BLAST hit.
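
The search criteria described above (perfect repeats only, at least 4 repeats for mono- and dinucleotide motifs, at least 3 repeats for tri- to hexanucleotide motifs) can be sketched with regular expressions. This is an illustrative stand-in and not the SciRoKo algorithm: the function names, the primitive-motif filter and the example sequence are assumptions, and the simple left-to-right scan may miss an SSR that directly abuts another low-complexity run.

```python
import re

# Minimum repeat counts per motif length, as stated in the Methods.
MIN_REPEATS = {1: 4, 2: 4, 3: 3, 4: 3, 5: 3, 6: 3}


def is_primitive(motif):
    """True if the motif is not itself a repetition of a shorter unit."""
    n = len(motif)
    return not any(
        n % k == 0 and motif == motif[:k] * (n // k) for k in range(1, n)
    )


def find_perfect_ssrs(seq):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs."""
    seq = seq.upper()
    hits = []
    for k, min_rep in MIN_REPEATS.items():
        # One motif of length k followed by at least (min_rep - 1) exact copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip e.g. 'ACAC', which is a dinucleotide repeat in disguise.
            if is_primitive(motif):
                hits.append((match.start(), motif, len(match.group(0)) // k))
    return sorted(hits)


# Hypothetical EST fragment containing (CGG)3 and (AC)5.
est = "GATC" + "CGG" * 3 + "TTGA" + "AC" * 5 + "GT"
print(find_perfect_ssrs(est))  # [(4, 'CGG', 3), (17, 'AC', 5)]
```

Filtering out non-primitive motifs keeps a dinucleotide run from being reported a second time as a tetra- or hexanucleotide repeat.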

Plant material used: 16 samples from all over Austria (testing population) and two times 48 individuals from two distinct populations (Gusswerk, Styria [N 47° 44', E 15° 18']; Mayrhofen, Tyrol [N 47° 11', E 11° 52']; screening population). DNA was kindly provided by the Federal Research and Training Centre for Forests, Natural Hazards and Landscape (BFW).

60 primer pairs were designed using the Primer3 web interface [22] with default settings. PCR reactions were performed in 25 μl volumes containing 50 ng genomic DNA, 1X PCR Buffer (QIAGEN), 1 mM MgCl₂, 0.2 mM each dNTP, 0.3 μM FAM-labelled forward primer, 0.3 μM reverse primer and 0.625 U HotStar Taq polymerase (QIAGEN). The PCR thermal profile consisted of 15 min initial denaturation at 95 °C, followed by 35 cycles of denaturation at 95 °C for 50 sec, annealing at a primer-specific temperature (Table 1) for 50 sec and extension at 72 °C for 105 sec, and a final extension at 72 °C for 10 min. Fragments amplified in 16 individuals from all over Austria (testing population) and 96 individuals from 2 distinct locations in Austria (screening population) were analyzed on an ABI 3100 Genetic Analyzer. Fragment sizes were extracted using PeakScanner 1.0 (Applied Biosystems, Foster City, California, USA) and allele calls were performed manually in Excel. The testing population was only used to test for polymorphic loci and was not included in further calculations. Number of alleles, effective number of alleles, observed and expected heterozygosities and deviation from Hardy-Weinberg equilibrium were calculated using Genepop 4.0 [23]. Fixation indices and fixation indices excluding null alleles (ENA), as well as estimates of null alleles, were calculated with FreeNA [24], and SSR presence in coding regions was analyzed with the GetORF software using default settings. Tests for outlier loci were performed with LOSITAN [25] using an infinite allele model with 100,000 simulations, a confidence interval of 0.95 and a subsample size of 75.
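
As a worked example of the thermal profile described above, the programmed hold times can be summed in a few lines of Python. The constant and function names are hypothetical, and the total covers hold times only: ramping between temperatures is instrument-dependent and is not included.

```python
# Hold times taken from the thermal profile in the Methods, in seconds.
INITIAL_DENATURATION_S = 15 * 60
CYCLES = 35
CYCLE_STEPS_S = {"denaturation": 50, "annealing": 50, "extension": 105}
FINAL_EXTENSION_S = 10 * 60


def total_hold_time_s():
    """Sum of all programmed hold times, ignoring temperature ramping."""
    per_cycle = sum(CYCLE_STEPS_S.values())
    return INITIAL_DENATURATION_S + CYCLES * per_cycle + FINAL_EXTENSION_S


print(total_hold_time_s())  # 8675
```

The 8675 seconds correspond to roughly 145 minutes, so a run takes about two and a half hours before ramping is added.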

References

1. Lagercrantz U, Ryman N: Genetic structure of Norway spruce (Picea abies): concordance of morphological and allozymic variation. Evolution. 1990, 44: 38-53. 10.2307/2409523.
2. Bergmann F, Ruetz W: Isozyme genetic variation and heterozygosity in random tree samples and selected orchard clones from the same Norway spruce populations. For Ecol Manage. 1991, 46 (1-2): 39-48. 10.1016/0378-1127(91)90243-O.
3. Vendramin GG, Anzidei M, Madaghiele A, Sperisen C, Bucci G: Chloroplast microsatellite analysis reveals the presence of population subdivision in Norway spruce (Picea abies K.). Genome. 2000, 43 (1): 68-78.
4. Sperisen C, Büchler U, Gugerli F, Mátyás G, Geburek T, Vendramin GG: Tandem repeats in plant mitochondrial genomes: application to the analysis of population differentiation in the conifer Norway spruce. Mol Ecol. 2001, 10 (1): 257-263. 10.1046/j.1365-294X.2001.01180.x.
5. Huntley B, Birks HJB: An Atlas of Past and Present Pollen Maps for Europe: 0-13000 years ago. 1983, Cambridge: Cambridge University Press
6. Smith DN, Devey ME: Occurrence and inheritance of microsatellites in Pinus radiata. Genome. 1994, 37: 977-983. 10.1139/g94-138.
7. Kostia S, Varvio SL, Vakkari P, Pulkkinen P: Microsatellite sequences in a conifer, Pinus sylvestris. Genome. 1995, 38: 1244-1248. 10.1139/g95-163.
8. Pfeiffer A, Olivieri AM, Morgante M: Identification and characterization of microsatellites in Norway spruce (Picea abies K.). Genome. 1997, 40 (4): 411-419.
9. Murray BG, Leitch IJ, Bennett MD: Gymnosperm DNA C-values database (release 4.0, Dec. 2010). 2010, [http://www.kew.org/cvalues/]
10. Echt CS, May-Marquardt P, Hseih M, Zahorchak R: Characterization of microsatellite markers in eastern white pine. Genome. 1996, 39: 1102-1108. 10.1139/g96-138.
11. Soranzo N, Provan J, Powell W: Characterisation of microsatellite loci in Pinus sylvestris L. Mol Ecol. 1998, 7: 1260-1261.
12. Scotti I, Magni F, Fink R, Powell W, Binelli G, Hedley PE: Microsatellite repeats are not randomly distributed within Norway spruce (Picea abies K.) expressed sequences. Genome. 2000, 43: 41-46.
13. Varshney RK, Thiel T, Stein N, Langridge P, Graner A: In silico analysis on frequency and distribution of microsatellites in ESTs of some cereal species. Cell Mol Biol Lett. 2002, 7: 537-546.
14. Rungis SD, Bérubé Y, Zhang J, Ralph S, Ritland CE, Ellis BE, Douglas C: Robust simple sequence markers for spruce (Picea spp.) from expressed sequence tags. Theor Appl Gen. 2004, 109: 1283-1294. 10.1007/s00122-004-1742-5.
15. GetORF. [http://embossgui.sourceforge.net/demo/getorf.html]
16. Callen DF, Thompson AD, Shen Y, Phillips HA, Richards RI, Mulley JC, Sutherland GR: Incidence and Origin of "Null" Alleles in the (AC)n Microsatellite Markers. Am J Hum Genet. 1993, 52: 922-927.
17. Solomon M, Belenghi B, Delledonne M, Menachem E, Levine A: The involvement of cysteine proteases and protease inhibitor genes in the regulation of programmed cell death in plants. Plant Cell. 1999, 11: 431-444.
18. Courbot M, Willems G, Motte P, Arvidsson S, Roosens N, Saumitou-Laprade P, Verbruggen N: A major quantitative trait locus for cadmium tolerance in Arabidopsis halleri colocalizes with HMA4, a gene encoding a heavy metal ATPase. Plant Physiol. 2007, 144: 1052-106. 10.1104/pp.106.095133.
19. Chatrou LW, Escribano MB, Viruel MA, Maas JW, Richardson JE, Hormaza JI: Flanking regions of monomorphic microsatellite loci provide a new source of data for plant species-level phylogenetics. Mol Phylogenet Evol. 2009, 53: 726-733. 10.1016/j.ympev.2009.07.024.
20. Kofler RC, Schlötterer C, Lelley T: SciRoKo: A new tool for whole genome microsatellite search and investigation (version 3.4). [http://kofler.or.at/bioinformatics/SciRoKo/index.html]
21. Forment J, Gilabert F, Robles A, Conejero V, Nuez F, Blanca JM: EST2uni: an open, parallel tool for automated EST analysis and database creation, with a data mining web interface and microarray expression data integration. [http://cichlid.umd.edu/est2uni/]
22. Rozen S, Skaletsky HJ: Primer3 on the WWW for general users and for biologist programmers (version 0.0.4). [http://frodo.wi.mit.edu/primer3/]
23. Raymond M, Rousset F: GENEPOP (version 1.2): population genetics software for exact tests and ecumenicism. J Hered. 1995, 86: 248-249.
24. Chapuis M-P, Estoup A: Microsatellite null alleles and estimation of population differentiation. Mol Biol Evol. 2007, 24: 621-631.
25. Beaumont MA, Nichols RA: Evaluating loci for use in the genetic analysis of population structure. P Roy Soc Lond B Bio. 1996, 263: 1619-1626. 10.1098/rspb.1996.0237.

Acknowledgements

The work was supported by the Austrian Federal Ministry of Agriculture, Forestry, Environment and Water Management and conducted in a cooperation project with the Federal Research and Training Centre for Forests, Natural Hazards and Landscape, Vienna, Austria.

Author information

Authors and Affiliations

AIT Austrian Institute of Technology GmbH, Health & Environment Department, Bioresources, 3430, Tulln, Austria
Silvia Fluch, Agnes Burg, Dieter Kopecky, Andreas Homolka & Nadine Spiess
Istituto di Genetica Vegetale, CNR, via Madonna del Piano 10, 50019, Sesto Fiorentino (Firenze), Italy
Giovanni G Vendramin

Corresponding author

Correspondence to Silvia Fluch.

Additional information

Competing interests

The authors declare that they have no competing interests.

Authors' contributions

SF participated in the design of the study, carried out the primer design, SSR data analysis and allele calling, and drafted the manuscript. AB carried out the PCR and the fragment analysis. AH carried out the population genetic analysis and helped to draft the manuscript. NS participated in the analysis of null alleles and F_ST values. DK carried out the computational detection of SSRs. GGV contributed part of the EST sequences and commented on the manuscript. All authors read and approved the final manuscript.

Electronic supplementary material

13104_2011_1139_MOESM1_ESM.XLS

Additional file 1: Monomorphic SSRs_Norway Spruce. A Microsoft Excel table containing a summary of monomorphic SSRs, accession numbers, primer sequences for amplification, repeat type and repeat number. (XLS 26 KB)

13104_2011_1139_MOESM2_ESM.XLS

Additional file 2: SSR Annotations_Norway spruce. A Microsoft Excel table showing SSR hits in GenBank, the corresponding accession number and e-value. (XLS 16 KB)

Rights and permissions

This article is published under license to BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

About this article

Cite this article

Fluch, S., Burg, A., Kopecky, D. et al. Characterization of variable EST SSR markers for Norway spruce (Picea abies L.). BMC Res Notes 4, 401 (2011). https://doi.org/10.1186/1756-0500-4-401

Received: 31 May 2011
Accepted: 12 October 2011
Published: 12 October 2011
DOI: https://doi.org/10.1186/1756-0500-4-401
