Effects of ascertainment bias and marker number on estimations of barley diversity from high-throughput SNP genotype data

Abstract

The capacity of molecular markers to provide information on genetic structure is influenced by their number and by the way they are chosen. This study evaluates the effects of single nucleotide polymorphism (SNP) number and selection strategy on estimates of germplasm diversity and population structure for two types of barley germplasm, cultivars and landraces. One hundred and sixty-nine barley landraces from Syria and Jordan and 171 European barley cultivars were genotyped with 1536 SNPs. Subsets of 384 and 96 SNPs were selected from the 1536-SNP set on the basis of their ability to detect diversity in landraces or in cultivated barley, together with corresponding randomly chosen subsets. All SNP sets except the landrace-optimised subsets underestimated the diversity present in the landrace germplasm, and all SNP subsets gave similar estimates for the cultivar germplasm. All marker subsets gave qualitatively similar estimates of the population structure in both germplasm sets, but the 96-SNP sets showed much lower data resolution values than the larger SNP sets. From these data we draw three conclusions. First, pre-selecting markers for their diversity in a germplasm set is very worthwhile in terms of the quality of data obtained. Second, a properly chosen 384-SNP subset gives a good combination of power and economy for germplasm characterization, whereas the rather modest gain from using 1536 SNPs does not justify the increased cost and 96 markers give unacceptably low performance. Lastly, we propose a specific 384-SNP subset as a standard genotyping tool for Middle Eastern landrace barley.
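The subset comparison described in the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' procedure: the abstract does not say which diversity statistic or selection rule was used, so the sketch assumes gene diversity 2p(1-p) for biallelic SNPs scored 0/1 in inbred lines and picks the k highest-scoring SNPs. The two panels are simulated from arbitrary allele frequencies, and every function and variable name is hypothetical.

```python
import random


def gene_diversity(column):
    """Gene diversity 2p(1-p) for one biallelic SNP scored 0/1 (assumed statistic)."""
    p = sum(column) / len(column)
    return 2.0 * p * (1.0 - p)


def snp_column(genotypes, j):
    """Allele calls of SNP j across all lines."""
    return [row[j] for row in genotypes]


def mean_diversity(genotypes, snp_idx):
    """Average gene diversity over the selected SNP indices."""
    total = sum(gene_diversity(snp_column(genotypes, j)) for j in snp_idx)
    return total / len(snp_idx)


def top_k_by_diversity(genotypes, k):
    """Indices of the k SNPs with the highest gene diversity in this panel."""
    n_snps = len(genotypes[0])
    scores = [gene_diversity(snp_column(genotypes, j)) for j in range(n_snps)]
    return sorted(range(n_snps), key=lambda j: scores[j], reverse=True)[:k]


def simulate_panel(n_lines, freqs, rng):
    """Simulated 0/1 genotypes; NOT real barley data."""
    return [[1 if rng.random() < p else 0 for p in freqs] for _ in range(n_lines)]


rng = random.Random(0)
n_snps = 1536

# Hypothetical, independently drawn allele frequencies for the two panels.
panel_a = simulate_panel(169, [rng.uniform(0.0, 0.5) for _ in range(n_snps)], rng)
panel_b = simulate_panel(171, [rng.uniform(0.0, 0.5) for _ in range(n_snps)], rng)

all_snps = list(range(n_snps))
optimised_a = top_k_by_diversity(panel_a, 384)   # selected on panel A only
random_384 = rng.sample(all_snps, 384)           # random subset of the same size

for name, subset in [("all 1536", all_snps),
                     ("optimised on A", optimised_a),
                     ("random 384", random_384)]:
    print(name,
          round(mean_diversity(panel_a, subset), 3),
          round(mean_diversity(panel_b, subset), 3))
```

Comparing the two printed columns for each subset shows how a subset chosen for its diversity in one panel shifts the estimate for that panel relative to the other, which is the ascertainment effect the study examines with real genotypes.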



Acknowledgments

We would like to acknowledge Drs S. Grando, M. Baum and S. Ceccarelli at the International Center for Agricultural Research in the Dry Areas (ICARDA) for providing the Syrian-Jordanian landrace collection material. This work was supported by BBSRC Grant BB/E024726/1 (EXBARDIV) under the ERA-PG Programme ‘Structuring Plant Genomic Research in Europe’. SCRI received Grant-in-Aid from the Scottish Government.

Author information

Corresponding author

Correspondence to Joanne R. Russell.

Additional information

Communicated by A. Graner.

Electronic supplementary material

Supplementary Table 1 and Figure 1 (DOC 174 kb)

About this article

Cite this article

Moragues, M., Comadran, J., Waugh, R. et al. Effects of ascertainment bias and marker number on estimations of barley diversity from high-throughput SNP genotype data. Theor Appl Genet 120, 1525–1534 (2010). https://doi.org/10.1007/s00122-010-1273-1

Keywords

  • Single Nucleotide Polymorphism
  • Single Nucleotide Polymorphism Marker
  • Ascertainment Bias
  • Wild Barley
  • Single Nucleotide Polymorphism Genotyping