Tree Genetics & Genomes, Volume 9, Issue 6, pp 1537–1544

Mining conifers’ mega-genome using rapid and efficient multiplexed high-throughput genotyping-by-sequencing (GBS) SNP discovery platform

  • Charles Chen
  • Sharon E. Mitchell
  • Robert J. Elshire
  • Edward S. Buckler
  • Yousry A. El-Kassaby (corresponding author)
Short Communication


Next-generation sequencing (NGS) technologies are revolutionizing both medical and biological research through generation of massive SNP data sets for identifying heritable genome variation underlying key traits, from rare human diseases to important agronomic phenotypes in crop species. We evaluated the performance of genotyping-by-sequencing (GBS), one of the emerging NGS-based platforms, for genotyping two economically important conifer species, lodgepole pine (Pinus contorta) and white spruce (Picea glauca). Both species have very large genomes (>20,000 Mbp), are highly heterozygous, and lack reference sequences. From a small set (six accessions each) of independent replicated DNA samples and a 48-plex read depth, we obtained ~60,000 SNPs per species. After stringent filtering, we obtained 17,765 and 17,845 high-coverage SNPs without missing data for lodgepole pine and white spruce, respectively. Our results demonstrated that GBS is a robust and suitable method for genotyping conifers. The application of GBS to forest tree breeding and genomic selection is discussed.
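The filtering step mentioned above (keeping only high-coverage SNPs with no missing data) can be illustrated with a minimal sketch. The data layout, the `filter_snps` helper, and the `min_depth` threshold are all hypothetical choices made for illustration; the abstract does not state the actual pipeline or the thresholds used.

```python
from typing import Dict, List, Optional

# Hypothetical layout: each SNP id maps to one read depth per sample.
# None marks a sample with no genotype call at that site.
DepthTable = Dict[str, List[Optional[int]]]


def filter_snps(depths: DepthTable, min_depth: int = 5) -> List[str]:
    """Return SNP ids called in every sample with at least min_depth reads.

    min_depth is an assumed threshold, not a value taken from the paper.
    """
    kept = []
    for snp_id, per_sample in depths.items():
        # Drop the site if any sample is missing or too shallow.
        if all(d is not None and d >= min_depth for d in per_sample):
            kept.append(snp_id)
    return kept


# Toy data for six samples, mirroring the six accessions per species.
example = {
    "snp_001": [12, 9, 15, 8, 11, 10],
    "snp_002": [7, None, 9, 6, 8, 12],  # one sample has no call
    "snp_003": [14, 3, 10, 9, 13, 8],   # one sample is below min_depth
}
print(filter_snps(example))
```

Requiring a call in every sample is the strictest possible missing-data rule; a real pipeline might instead tolerate a small fraction of missing calls per site.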


Keywords: Next-generation sequencing; Genotyping-by-sequencing (GBS); SNP diversity; Conifers



This work was funded by the Johnson's Family Forest Biotechnology Endowment, the Natural Sciences and Engineering Research Council of Canada—Discovery, and the IRC grants to YAK.



Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Charles Chen (1)
  • Sharon E. Mitchell (2)
  • Robert J. Elshire (2)
  • Edward S. Buckler (2, 3, 4)
  • Yousry A. El-Kassaby (5, corresponding author)

  1. Centro Internacional de Mejoramiento de Maiz y Trigo (CIMMYT), Texcoco, Mexico
  2. Institute for Genomic Diversity, Cornell University, Ithaca, USA
  3. Department of Plant Breeding and Genetics, Cornell University, Ithaca, USA
  4. U.S. Department of Agriculture-Agricultural Research Service (USDA-ARS), Robert W. Holley Center for Agriculture and Health, Ithaca, USA
  5. Department of Forest and Conservation Sciences, Faculty of Forestry, The University of British Columbia, Vancouver, Canada
