Rapid identification of Populus L. species and hybrids can be achieved with relatively little effort through the use of primer extension-based single nucleotide polymorphism (SNP) genotyping assays. We present an optimized set of 36 SNP markers from 28 gene regions that diagnose eight poplar species (Populus angustifolia James, Populus balsamifera L., Populus deltoides Bartram, Populus fremontii Watson, Populus laurifolia Ledeb., Populus maximowiczii Henry, Populus nigra L., and Populus trichocarpa Torr. & Gray). A total of 700 DNA sequences from six Populus species (1–15 individuals per species) were used to construct the array. A set of flanking and probe oligonucleotides was developed and tested. The accuracy of the SNP assay was validated by genotyping 448 putatively “pure” individuals from 14 species of Populus. Overall, the SNP assay had a high success rate (97.6 %) and will prove useful for the identification of all Aigeiros Duby and Tacamahaca Spach. species and their early-generation hybrids within natural populations and breeding programs. Null alleles and intraspecific polymorphisms were detected for a few locus/species combinations in the Aigeiros and Tacamahaca sections. When we attempted to genotype aspens of the section Populus (Populus alba L., Populus grandidentata Michx., Populus tremula L., and Populus tremuloides Michx.), the success rate of the SNP array decreased by 13 %, demonstrating moderate cross-sectional transferability.
Populus L. is a model tree genus, with many of its approximately 30 species being extensively used for both pure and applied research purposes (Ellis et al. 2010). Despite the economic and scientific importance of this genus, the identification of different Populus species and hybrids can be problematic due to intraspecific morphological plasticity (e.g., Bylesjö et al. 2008) and frequent natural hybridization between species, particularly among members of the sections Aigeiros Duby and Tacamahaca Spach. (Eckenwalder 1984; Floate 2004; Mahama et al. 2011). Adding to this complexity, trihybrids (crosses involving three different species) and more complex combinations have been developed and deployed in Populus-breeding programs (e.g., Meirmans et al. 2010; Talbot et al. 2011) and have been detected in nature (Thompson et al. 2010; Talbot et al. 2012; Williams et al., unpublished). Molecular diagnostics, including AFLP (Cervera et al. 2005), DNA sequencing (Hamzeh and Dayanandan 2004) combined with RFLP (Schroeder et al. 2012), microsatellites (Liesebach et al. 2010), and medium-throughput single nucleotide polymorphism (SNP)-based genotyping assays (Hamzeh et al. 2007; Meirmans et al. 2007; Thompson et al. 2010; Talbot et al. 2011) have been used to diagnose poplar species and hybrids with success.
Nevertheless, a single combined genotyping array that is optimized to identify a maximum number of poplar species with the greatest marker success rates would be a valuable, cost-effective, and versatile diagnostic tool. Here, we present an optimized set of 36 SNP markers that can discriminate among eight poplar (Populus) species (Populus angustifolia James, Populus balsamifera L., Populus deltoides Bartram, Populus fremontii Watson, Populus laurifolia Ledeb., Populus maximowiczii Henry, Populus nigra L., and Populus trichocarpa Torr. & Gray) and that could be used to detect their early-generation hybrids (e.g., Thompson et al. 2010; Meirmans et al. 2010; Talbot et al. 2012).
Our strategy was to (1) obtain variable sequence sets from six Populus species commonly used in breeding programs; (2) identify “species-specific” SNPs in these target regions; (3) develop, test, and optimize a genotyping assay composed of these putative diagnostic SNPs; and (4) test the assay on additional Populus species to evaluate its accuracy and broader utility.
Materials and methods
Twenty-nine leaf samples (representing 27 provenances) from six Populus species were selected (1–15 individuals per species, Table 1) for DNA sequencing of 31 gene regions. From these, a total of 700 high-quality bidirectional DNA sequences were used to construct the SNP array. For practical reasons, one set of DNA sequences (forward and reverse) per gene per species was submitted to GenBank and assigned an accession number (171 DNA sequences in total, Supplementary Table S1): 52 new sequences and 119 from previously published studies (Meirmans et al. 2007; Thompson et al. 2010; Talbot et al. 2011). Primer pairs were designed using Primer3 (Rozen and Skaletsky 2000) and tested in silico on the P. trichocarpa genome v1 to avoid simultaneous amplification of paralogous loci, because co-amplified paralogues produce biased, non-Mendelian results. Amplified regions were designed to be approximately 800 bp in length. DNA was extracted from dried leaf material with the Nucleospin 96 Plant II kit (Macherey-Nagel, Bethlehem, PA) following the manufacturer’s protocol for vacuum processing with the following modifications: (a) cell lysis using buffers PL2 and PL3 (PL2 was heated for 2 h at 65 °C instead of 30 min) and (b) elution with an in-house Tris-HCl 0.01 mM pH 8.0 buffer. Gene regions were amplified by PCR using a PTC-100 (MJResearch, Waltham, MA) thermocycler. Reactions contained 1× PCR buffer, 0.13 μM of forward and reverse primers, 0.17 mM of each dNTP, 2.0 mM MgCl2, and 1 U Platinum Taq polymerase (Invitrogen, Burlington, ON). Temperature profiles were as follows: (1) 4 min at 95 °C, (2) 35 cycles of 30 s at 94 °C, 30 s at 58 °C, and 45 s at 72 °C, (3) 5 min at 72 °C. PCR products were visualized by gel electrophoresis and then sequenced at the McGill University and Génome Québec Innovation Centre on ABI 3730XL DNA Analyzer systems (Applied Biosystems, Carlsbad, CA) using their internal protocols.
SeqMan software v8 (DNAStar, Madison, WI) was used to assemble electropherograms for all DNA sequences and to identify nucleotide variations within the 31 gene regions. All potential variations were carefully validated visually and homozygous SNPs that differentiated Populus species were identified. A total of 40 loci were chosen for inclusion in a Sequenom iPLEX MassARRAY genotyping assay. An optimized set of primers for multiplex PCR was designed in invariant flanking regions of our sequences by the McGill University and Génome Québec Innovation Centre according to internal protocols (primer sequences in Supplementary Table S2).
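The screening step for homozygous SNPs that differentiate species can be illustrated with a minimal Python sketch. It is not the SeqMan workflow used here, which relied on visual validation of electropherograms. It assumes the sequences have already been aligned to equal length, that heterozygous calls appear as IUPAC ambiguity codes, and that the function name `diagnostic_sites` and the toy species labels are purely hypothetical.

```python
VALID_BASES = set("ACGT")

def diagnostic_sites(alignment):
    """Return candidate diagnostic sites from an alignment.

    alignment: dict mapping species name -> list of aligned sequences
    (all of equal length). A site is reported when every species is fixed
    for a single unambiguous base in the sample and at least two species
    carry different bases. Positions are 0-based.
    """
    length = len(next(iter(alignment.values()))[0])
    sites = []
    for pos in range(length):
        fixed = {}
        for species, seqs in alignment.items():
            bases = {s[pos].upper() for s in seqs}
            # Skip sites that vary within a species, or that carry gaps
            # or ambiguity codes (e.g. R for a heterozygous A/G call).
            if len(bases) != 1 or not bases <= VALID_BASES:
                break
            fixed[species] = bases.pop()
        else:
            if len(set(fixed.values())) > 1:
                sites.append((pos, fixed))
    return sites

# Toy alignment (hypothetical data, not from this study).
toy = {
    "species_A": ["ACGTA", "ACGTA"],
    "species_B": ["ACTTA", "ACTTR"],
    "species_C": ["ACGTA"],
}
candidates = diagnostic_sites(toy)
```

In this toy case only position 2 qualifies. Position 4 is rejected because one species_B sequence carries an ambiguity code there. With a single sequenced individual per species, a site can look fixed by chance, which is why validation on a larger panel remains necessary.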
The performance of the SNP array was evaluated by genotyping a validation set of 337 Populus samples from ten different species in both the Aigeiros and Tacamahaca sections, plus 111 samples from four species of the section Populus (1–120 samples per species, Supplementary Table S3). A SNP locus was considered to be monomorphic (i.e., fixed) when the minor allele frequency was equal to or lower than 3 %. SNP loci that failed or gave inconsistent results for every individual in the majority of the eight studied species were excluded from further analyses. If a SNP locus did not amplify (absence of signal) for any individual of a species but consistently amplified for the majority of the remaining species, this “technically” failed reaction was denoted as a fixed null allele (Carlson et al. 2006) and scored as homozygous 00 (0 is used to designate a null allele). In contrast, a SNP locus was considered “unreliable” for a species if individuals of this species displayed a puzzling distribution of genotype classes (e.g., TT 00 or AA AG GG 00). In such cases, the expected genotype classes were observed but an unusual number of individuals had failed reactions, suggesting the presence of a null allele. Confirmation of these putative null alleles would require additional evidence (more specimens/sequencing) and was beyond the scope of this project.
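The scoring rules above can be summarized in a short Python sketch for a single locus/species combination. Only the 3 % minor allele frequency threshold comes from the text. The cut-off for an “unusual number” of failed reactions is not quantified in this study, so `max_fail_rate` is a hypothetical parameter, as is the function name `classify_locus`. The sketch also does not check the cross-species condition for a fixed null allele (amplification in the majority of the other species).

```python
from collections import Counter

def classify_locus(genotypes, maf_threshold=0.03, max_fail_rate=0.10):
    """Classify one locus within one species.

    genotypes: list of two-character calls such as "TT", "AG", or "00",
    where "00" marks an absence of signal.
    max_fail_rate is an assumed cut-off, not a value from the study.
    """
    if not genotypes:
        raise ValueError("no genotypes supplied")
    called = [g for g in genotypes if g != "00"]
    if not called:
        # No signal in any individual: candidate fixed null allele,
        # provided the locus amplifies in most other species.
        return "fixed null allele"
    fail_rate = (len(genotypes) - len(called)) / len(genotypes)
    if fail_rate > max_fail_rate:
        # Expected genotype classes plus many failures (e.g. TT 00).
        return "unreliable"
    allele_counts = Counter("".join(called))
    total = sum(allele_counts.values())
    maf = (total - max(allele_counts.values())) / total
    return "fixed" if maf <= maf_threshold else "polymorphic"

# Toy inputs (hypothetical data).
all_null = classify_locus(["00"] * 5)
mixed_fail = classify_locus(["TT"] * 8 + ["00"] * 2)
nearly_fixed = classify_locus(["GG"] * 49 + ["AG"])
segregating = classify_locus(["AA"] * 5 + ["AG"] * 3 + ["GG"] * 2)
```

Under these assumptions the four toy inputs are classified as a fixed null allele, unreliable, fixed (one A among 100 alleles, 1 %), and polymorphic, respectively.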
Results and discussion
The 31 gene regions ultimately targeted were distributed across 18 of the 19 Populus chromosomes, with one to four genes (median = 1.5) and one to three SNPs (median = 2) per chromosome. The physical distance between neighboring genes on the same chromosome ranged from 240,596 to 12,800,671 bp, so the 31 gene regions could be considered as unlinked. Out of the 40 loci tested, four failed or gave inconsistent results for all species in the Aigeiros/Tacamahaca sections, and very few of the locus/species combinations (7 out of the 288; 2.4 %) were deemed unreliable (Supplementary Table S4).
The remaining 36 loci from 28 gene regions (Supplementary Figure S1; Supplementary Table S5) accurately diagnosed all poplar species of the Aigeiros/Tacamahaca sections. Between 0 and 19 fixed differences separated pairs of poplar species (Table 2; locus/species combinations that contained fixed null alleles are shown but are not included in the final counts). P. balsamifera, P. deltoides, and P. nigra, all of which are known to hybridize (Thompson et al. 2010), were differentiated by 12–19 fixed SNPs, while P. fremontii, P. trichocarpa, and P. nigra, all of which are known to hybridize in California and Nevada (Eckenwalder 1982, 1984), were differentiated by 12–18 fixed SNPs, theoretically providing enough resolving power to identify hybrids using model-based Bayesian methods (Vähä and Primmer 2006). However, only four SNPs consistently differentiated P. balsamifera and P. trichocarpa, despite our screening efforts. Interestingly, no SNP was detected that could discriminate between P. deltoides and P. fremontii, except at locus A-025 where a null allele was detected. This lack of genetic differentiation between these two pairs of species is congruent with earlier surveys of the genus (Hamzeh and Dayanandan 2004; Levsen et al. 2012).
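The pairwise counts reported in Table 2 follow a simple rule: a locus contributes to a species pair only when both species are fixed for different, non-null alleles. A minimal Python sketch of that count is given below. The data structure, the function name `pairwise_fixed_differences`, and the toy locus and species labels are assumptions made for illustration and do not reproduce the actual genotype files.

```python
from itertools import combinations

def pairwise_fixed_differences(fixed_alleles):
    """Count fixed differences between every pair of species.

    fixed_alleles: dict species -> dict locus -> allele, where the allele
    is a base ("A", "C", "G", "T"), "0" for a fixed null allele, or None
    when the locus is polymorphic or unreliable in that species.
    Loci with a null allele in either species are skipped, mirroring the
    convention that fixed null alleles are excluded from the counts.
    """
    counts = {}
    for sp1, sp2 in combinations(sorted(fixed_alleles), 2):
        shared_loci = fixed_alleles[sp1].keys() & fixed_alleles[sp2].keys()
        n_diff = 0
        for locus in shared_loci:
            a1 = fixed_alleles[sp1][locus]
            a2 = fixed_alleles[sp2][locus]
            if a1 is None or a2 is None:
                continue  # not fixed in at least one species
            if "0" in (a1, a2):
                continue  # fixed null allele: shown but not counted
            if a1 != a2:
                n_diff += 1
        counts[(sp1, sp2)] = n_diff
    return counts

# Toy data (hypothetical loci and species).
toy = {
    "sp1": {"L1": "A", "L2": "C", "L3": "0"},
    "sp2": {"L1": "G", "L2": "C", "L3": "T"},
    "sp3": {"L1": "G", "L2": None, "L3": "T"},
}
differences = pairwise_fixed_differences(toy)
```

In the toy data, sp2 and sp3 share no counted difference, which is the situation described above for P. deltoides and P. fremontii once the null allele at locus A-025 is set aside.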
Thirty-six loci were selected for this study and were expected to be fixed within species, yet a number of loci showed intraspecific polymorphism within some species (Table 2, File S1). A total of 11 loci were polymorphic within at least one of the Aigeiros or Tacamahaca species (4.5 %, 13/288 of the locus/species combinations; Supplementary Table S4). SNP markers are often developed without consideration for broader polymorphism data (and hence may suffer from ascertainment bias) and are occasionally identified based on sequence data from a single heterozygous individual (Vezzulli et al. 2008), as was done here for P. laurifolia and P. maximowiczii (Table 1). It should be noted that despite sequencing a single individual from each of these two species, 23 individuals of P. laurifolia and 20 individuals of P. maximowiczii consistently differed from the other seven poplar species by 4–19 and 4–14 fixed SNPs, respectively (Table 2, Supplementary Table S4).
For particular locus/species combinations, null alleles were detected (Table 2; Supplementary Table S4, File S1). This type of result has already been observed when working with distantly related species because of the presence of unexpected polymorphisms affecting the amplification/hybridization process (e.g., Ollitrault et al. 2012). Although fixed null alleles were not included in the counts presented in Table 2, they could be used for diagnostic purposes under certain conditions (see below). In fact, since DNA sequences of P. angustifolia and P. fremontii were not used to design primers for the SNP array, it was not surprising that unexpected polymorphisms should affect SNP amplification (Carlson et al. 2006). Indeed, a posteriori examinations of DNA sequences for these two species (one individual per species) revealed mutations at the priming sites that likely hampered amplification of the studied loci, thus resulting in a failed reaction and subsequent absence of signal (= null allele). For instance, in P. angustifolia, such mutations were observed in one of the three priming sites used in the genotyping method at both loci A-024 and B-021 (no DNA sequence could be obtained for locus F-001). In this study, 38 pure P. angustifolia individuals consistently displayed an absence of signal for these three loci whereas they were successfully genotyped at the remaining loci (File S1). These individuals were then considered to have a fixed null allele and were scored as homozygous 00. Null alleles were also consistently observed at loci A-024 and B-021 in more than 160 P. angustifolia individuals (Floate et al., unpublished results). Indeed, they proved to be useful for hybrid detection between P. angustifolia, P. balsamifera, and P. deltoides without the need to design a new SNP array (Supplementary Table S6, File S2).
Transferability of this diagnostic SNP array to the four distantly related aspen species in the section Populus was limited. In fact, 22 out of 144 (15.3 %) locus/species combinations were considered unreliable because they had putative null alleles (Supplementary Table S4), a higher proportion than that observed for the members of the Aigeiros/Tacamahaca sections. Only two loci, A-040 and A-041, could distinguish Populus grandidentata and Populus tremuloides from each other (Supplementary Table S4), whereas Populus alba and Populus tremula could not be distinguished using this SNP array. Three additional loci (A-007, A-024, and A-034) had fixed null alleles that could differentiate P. grandidentata from the three other aspen species. As described previously, DNA sequences of this group of species were not used to design the SNP assay, which presumably resulted in unexpected polymorphisms in the priming sites that led to failed amplifications (Carlson et al. 2006). Since no further genetic survey was conducted on aspens, these results should be interpreted with caution. Confirmation of the priming site polymorphisms and improvements of the array for the aspen species would require additional sequencing and validation. Nonetheless, under certain circumstances (e.g., a regeneration survey in mixed natural forests), the SNP array could discriminate among individuals belonging to different sections (Aigeiros/Tacamahaca versus Populus) of poplar species that are otherwise indistinguishable based on leaf morphology.
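The proportions quoted for the two groups can be recomputed from the counts given in the text. The short Python check below uses only reported numbers (36 loci, eight Aigeiros/Tacamahaca species, four aspen species, 7 and 22 unreliable combinations). The aspen success rate and the difference between the two rates are derived here rather than quoted, and the variable names are arbitrary.

```python
n_loci = 36
n_main_species = 8   # Aigeiros/Tacamahaca species
n_aspen_species = 4  # section Populus

combos_main = n_loci * n_main_species    # 288 locus/species combinations
combos_aspen = n_loci * n_aspen_species  # 144 locus/species combinations

unreliable_main = 7
unreliable_aspen = 22

success_main = 100 * (combos_main - unreliable_main) / combos_main
success_aspen = 100 * (combos_aspen - unreliable_aspen) / combos_aspen
drop = success_main - success_aspen  # in percentage points

print(f"{success_main:.1f}")   # 97.6
print(f"{success_aspen:.1f}")  # 84.7
print(f"{drop:.1f}")           # 12.8
```

The difference of about 12.8 percentage points appears consistent with the roughly 13 % decrease in success rate reported for the aspens.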
We demonstrated that this SNP assay, with its high success rate (97.6 %; 281 out of 288 locus/species combinations), could discriminate most Aigeiros and Tacamahaca species. Polymorphic loci are included in this final count because they can be useful for individual assignment (Vähä and Primmer 2006). Detection of early-generation hybrids among these species (for example, see Supplementary Table S6, File S2) will be possible in both natural populations and breeding programs, with various subsets of SNPs being most informative in different contexts (i.e., depending on the number and identity of the species involved, natural versus artificial hybrid zones, etc.). In particular, this SNP array will be a reliable method to monitor gene escape from commercial plantations of exotic poplars into natural forests, providing a method for plantation managers to demonstrate compliance with regional certification standards (e.g., Forest Stewardship Council Certification).
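To illustrate how diagnostic SNPs can flag early-generation hybrids, the following Python sketch computes a simple hybrid index and the proportion of interspecific heterozygous loci for one individual. This is a deliberately simplified stand-in and not the model-based Bayesian assignment cited above. It assumes two candidate parental species with known fixed alleles, skips failed or null calls, and uses hypothetical names (`hybrid_summary`, loci L1 to L3) and toy genotypes.

```python
def hybrid_summary(genotype, alleles_a, alleles_b):
    """Summarize one individual at loci diagnostic between species A and B.

    genotype: dict locus -> two-character call (e.g. "AG", "00").
    alleles_a, alleles_b: dict locus -> allele fixed in species A or B.
    Returns (hybrid_index, heterozygosity), where hybrid_index is the
    fraction of species-B alleles and heterozygosity is the fraction of
    loci carrying one allele from each species.
    """
    n_loci = n_b_alleles = n_het = 0
    for locus, call in genotype.items():
        a = alleles_a.get(locus)
        b = alleles_b.get(locus)
        if a is None or b is None or a == b:
            continue  # locus is not diagnostic for this species pair
        if len(call) != 2 or set(call) - {a, b}:
            continue  # failed reaction, null allele, or unexpected allele
        n_loci += 1
        n_b_alleles += call.count(b)
        n_het += call[0] != call[1]
    if n_loci == 0:
        raise ValueError("no informative loci for this species pair")
    return n_b_alleles / (2 * n_loci), n_het / n_loci

# Toy parental alleles and individuals (hypothetical data).
alleles_a = {"L1": "A", "L2": "C", "L3": "G"}
alleles_b = {"L1": "G", "L2": "T", "L3": "G"}  # L3 is not diagnostic

f1_like = hybrid_summary({"L1": "AG", "L2": "CT", "L3": "GG"}, alleles_a, alleles_b)
pure_a = hybrid_summary({"L1": "AA", "L2": "CC", "L3": "GG"}, alleles_a, alleles_b)
backcross_like = hybrid_summary({"L1": "AG", "L2": "CC", "L3": "00"}, alleles_a, alleles_b)
```

An F1 is expected to be heterozygous at every diagnostic locus (index 0.5, heterozygosity 1.0), whereas a first-generation backcross tends toward intermediate values. With only a few informative loci, as for P. balsamifera and P. trichocarpa, such summaries are correspondingly imprecise.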
References
Bylesjö M, Segura V, Soolanayakanahally RY, Rae A, Trygg J, Gustafsson P, Jansson S, Street NR (2008) LAMINA: a tool for rapid quantification of leaf size and shape parameters. BMC Plant Biol 8:82. doi:10.1186/1471-2229-8-82
Carlson CS, Smith JD, Stanaway IB, Rieder MJ, Nickerson DA (2006) Direct detection of null alleles in SNP genotyping data. Hum Mol Gen 15:1931–1937. doi:10.1093/hmg/ddl115
Cervera MT, Storme V, Soto A, Ivens B, Van Montagu M, Rajora OP, Boerjan W (2005) Intraspecific and interspecific genetic and phylogenetic relationships in the genus Populus based on AFLP markers. Theor Appl Genet 111:1440–1456. doi:10.1007/s00122-005-0076-2
Eckenwalder JE (1982) Populus x inopina hybr. nov. (Salicaceae), a natural hybrid between the native North American P. fremontii S. Wats. and the introduced Eurasian P. nigra L. Madroño 29:67–78
Eckenwalder JE (1984) Natural intersectional hybridization between North American species of Populus (Salicaceae) in sections Aigeiros and Tacamahaca. II. Taxonomy. Can J Bot 62:325–335. doi:10.1139/b84-051
Ellis B, Jansson S, Strauss SH, Tuskan GA (2010) Why and how Populus became a “model tree”. In: Jansson S, Bhalerao RP, Groover AT (eds) Genetics and genomics of Populus. Springer, New York, pp 3–14. doi:10.1007/978-1-4419-1541-2_1
Floate K (2004) Extent and patterns of hybridization among the three species of Populus that constitute the riparian forest of southern Alberta. Can J Bot 82:253–264. doi:10.1139/B03-135
Hamzeh M, Dayanandan S (2004) Phylogeny of Populus (Salicaceae) based on nucleotide sequences of chloroplast trn T-trn F region and nuclear rDNA. Am J Bot 91:1398–1408. doi:10.3732/ajb.91.9.1398
Hamzeh M, Sawchyn C, Périnet P, Dayanandan S (2007) Asymmetrical natural hybridization between Populus deltoides and P. balsamifera (Salicaceae). Can J Bot 85:1227–1232. doi:10.1139/B07-105
Levsen ND, Tiffin P, Olson MS (2012) Pleistocene speciation in the genus Populus (Salicaceae). Syst Biol 61:401–412. doi:10.1093/sysbio/syr120
Liesebach H, Schneck V, Ewald E (2010) Clonal fingerprinting in the genus Populus L. by nuclear microsatellite loci regarding differences between sections, species, and hybrids. Tree Genet Gen 6:259–269. doi:10.1007/s11295-009-0246-5
Mahama AA, Hall RB, Zalesny RS Jr (2011) Differential interspecific incompatibility among Populus hybrids in sections Aigeiros Duby and Tacamahaca Spach. For Chron 87:790–796. doi:10.5558/tfc2011-096
Meirmans PG, Lamothe M, Périnet P, Isabel N (2007) Species-specific single nucleotide polymorphism markers for detecting hybridization and introgression in poplar. Can J Bot 85:1082–1091. doi:10.1139/B07-069
Meirmans PG, Lamothe M, Gros-Louis M-C, Khasa D, Périnet P, Bousquet J, Isabel N (2010) Complex patterns of hybridization between exotic and native North American poplar species. Am J Bot 97:1688–1697. doi:10.3732/ajb.0900271
Ollitrault P, Terol J, Garcia-Lor A, Bérard A, Chauveau A, Froelicher Y, Belzile C, Morillon R, Navarro L, Brunel D, Talon M (2012) SNP mining in C. clementina BAC end sequences; transferability in the Citrus genus (Rutaceae), phylogenetic inferences and perspectives for genetic mapping. BMC Genomics 13:13. doi:10.1186/1471-2164-13-13
Rozen S, Skaletsky HJ (2000) Primer3 on the WWW for general users and for biologist programmers. In: Krawetz S, Misener S (eds) Bioinformatics methods and protocols: methods in molecular biology. Humana, Totowa, pp 365–386
Schroeder H, Hoeltken AM, Fladung M (2012) Differentiation of Populus species using chloroplast single nucleotide polymorphism (SNP) markers—essential for comprehensible and reliable poplar breeding. Plant Biol 14:374–381. doi:10.1111/j.1438-8677.2011.00502.x
Talbot P, Thompson SL, Schroeder W, Isabel N (2011) An efficient single nucleotide polymorphism assay to diagnose the genomic identity of poplar species and hybrids on the Canadian prairies. Can J For Res 41:1102–1111. doi:10.1139/X11-025
Talbot P, Schroeder WR, Bousquet J, Isabel N (2012) When exotic poplars and native Populus balsamifera L. meet on the Canadian prairies: spontaneous hybridization and establishment of interspecific hybrids. Forest Ecol Manag 285:142–152. doi:10.1016/j.foreco.2012.07.036
Thompson SL, Lamothe M, Meirmans PG, Périnet P, Isabel N (2010) Repeated unidirectional introgression towards Populus balsamifera in contact zones of exotic and native poplars. Mol Ecol 19:132–145. doi:10.1111/j.1365-294X.2009.04442.x
Vähä J-P, Primmer CR (2006) Efficiency of model-based Bayesian methods for detecting hybrid individuals under different hybridization scenarios and with different numbers of loci. Mol Ecol 15:63–72. doi:10.1111/j.1365-294X.2005.02773.x
Vezzulli S, Micheletti D, Riaz S, Pindo M, Viola R, This P, Walker MA, Troggio M, Velasco R (2008) A SNP transferability survey within the genus Vitis. BMC Plant Biol 8:128. doi:10.1186/1471-2229-8-128
Acknowledgments
We thank Rosario García-Gil (Umeå Plant Science Centre) for comments on an earlier version of the manuscript, Amanda Roe (Natural Resources Canada) for a thorough revision of the manuscript, Isabelle Lamarre (Natural Resources Canada) for editing, Joannie Normandin and Roxane Boivin (Natural Resources Canada) for helping with DNA extractions and sequencing, and Serge Payette (Louis-Marie Herbarium—QFA), Christian Lexer (U. Fribourg), Pierre Périnet (Ministère des Ressources naturelles du Québec), Bill Schroeder (Agriculture and Agri-Food Canada), Luke Evans (U. Northern Arizona), Kevin Floate (Agriculture and Agri-Food Canada), Yousry El-Kassaby (UBC), and Marc Villar (INRA d’Orléans) for providing Populus material for sequencing and/or genotyping. The McGill University and Génome Québec Innovation Centre provided sequencing and genotyping support. This work was funded through the Canadian Regulatory System for Biotechnology to NI.
Data archiving statement
We followed standard Tree Genetics and Genomes policy. All GenBank accession numbers for DNA sequences are provided in Supplementary Table S1. All genotype data, i.e., Files S1 and S2, are deposited in the Dryad Repository: doi.org/10.5061/dryad.sn577.
This article is distributed under the terms of the Creative Commons Attribution License which permits any use, distribution, and reproduction in any medium, provided the original author(s) and the source are credited.
Nathalie Isabel and Manuel Lamothe contributed equally to this work.
Communicated by S. Aitken
Cite this article
Isabel, N., Lamothe, M. & Thompson, S.L. A second-generation diagnostic single nucleotide polymorphism (SNP)-based assay, optimized to distinguish among eight poplar (Populus L.) species and their early hybrids. Tree Genetics & Genomes 9, 621–626 (2013). https://doi.org/10.1007/s11295-012-0569-5
Keywords
- Genotyping array
- Marker development
- Clone certification
- Species identification