Evidence for a large-scale population structure of Arabidopsis thaliana from genome-wide single nucleotide polymorphism markers

  • Original Paper
Theoretical and Applied Genetics

Abstract

Population-based methods for the genetic mapping of adaptive traits and the analysis of natural selection require that the population structure and demographic history of a species be taken into account. We characterized geographic patterns of genetic variation in the model plant Arabidopsis thaliana by genotyping 115 genome-wide single nucleotide polymorphism (SNP) markers in 351 accessions from the whole species range using a matrix-assisted laser desorption/ionization time-of-flight (MALDI-TOF) assay, and by sequencing nine unlinked short genomic regions in a subset of 64 accessions. Because of an excess of rare polymorphisms, the observed frequency distribution of SNPs is not consistent with a constant-size neutral model of sequence polymorphism. Differences in genetic diversity between geographic regions provide evidence for significant population structure. Accessions from Central Asia have a low level of polymorphism and an increased level of genome-wide linkage disequilibrium (LD) relative to accessions from the Iberian Peninsula and Central Europe. Cluster analysis with the program structure grouped Eurasian accessions into K = 6 clusters. Accessions from the Iberian Peninsula and from Central Asia constitute distinct populations, whereas Central and Eastern European accessions represent admixed populations in which genomes were reshuffled by historical recombination events. These patterns likely result from a rapid postglacial recolonization of Eurasia from glacial refugial populations. Our analyses suggest that mapping populations for association or LD mapping should be chosen from a regional rather than a species-wide sample, or identified genetically as sets of individuals with similar average genetic distances.
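To make the regional LD comparison concrete, the following is a minimal sketch of how genome-wide LD could be summarized for one set of accessions as the mean squared allelic correlation (r^2) over SNP pairs. It is an illustration under stated assumptions, not necessarily the statistic or procedure used in the paper: the function names, the 0/1 allele coding, and the toy genotypes are hypothetical, and each accession is treated as a single haplotype because A. thaliana accessions are largely homozygous.

```python
from itertools import combinations


def r_squared(snp_a, snp_b):
    """Squared allelic correlation r^2 between two biallelic SNPs.

    Each argument is a list of 0/1 allele codes, one entry per accession.
    Assumption of this sketch: every accession is homozygous and is
    therefore treated as one haplotype.
    """
    n = len(snp_a)
    if n == 0 or n != len(snp_b):
        raise ValueError("SNP vectors must be non-empty and of equal length")
    p_a = sum(snp_a) / n                      # frequency of allele 1 at SNP A
    p_b = sum(snp_b) / n                      # frequency of allele 1 at SNP B
    p_ab = sum(1 for x, y in zip(snp_a, snp_b) if x == 1 and y == 1) / n
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        return None                           # undefined if a SNP is monomorphic
    d = p_ab - p_a * p_b                      # disequilibrium coefficient D
    return d * d / denom


def mean_r_squared(genotypes):
    """Mean r^2 over all SNP pairs; genotypes maps SNP name -> list of 0/1."""
    values = []
    for a, b in combinations(genotypes.values(), 2):
        r2 = r_squared(a, b)
        if r2 is not None:
            values.append(r2)
    return sum(values) / len(values) if values else None


# Hypothetical toy data: four accessions per region, two SNPs each.
region_high_ld = {"snp1": [0, 0, 1, 1], "snp2": [0, 0, 1, 1]}
region_low_ld = {"snp1": [0, 0, 1, 1], "snp2": [0, 1, 0, 1]}

print(mean_r_squared(region_high_ld))
print(mean_r_squared(region_low_ld))
```

Under this summary, a region whose accessions carry the same allele combinations across markers gives a mean r^2 near 1, while a region with reshuffled genomes gives a value near 0. With real data, sample sizes should be comparable between regions, because r^2 is inflated in small samples.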

Acknowledgements

This work was funded by the German Ministry of Science (BMBF) as part of the GABI project (#0312275A) to T. A. and by the Emmy-Noether program of the Deutsche Forschungsgemeinschaft (Schm 1354-2/2) to K. J. S. We are grateful to Henriette Ringys-Beckstein, Maik Zehnsdorf and Melanie Lück for excellent technical assistance. We also thank K. Bachmann, M. Clauss, B. Haubold, M. Koornneef, A. Lawton-Rauh, T. Mitchell-Olds, S. Ramos-Onsins and E. Wheeler for discussions and comments on an earlier version of the manuscript.

Corresponding author

Correspondence to Karl J. Schmid.

Additional information

Communicated by O. Savolainen

About this article

Cite this article

Schmid, K.J., Törjék, O., Meyer, R. et al. Evidence for a large-scale population structure of Arabidopsis thaliana from genome-wide single nucleotide polymorphism markers. Theor Appl Genet 112, 1104–1114 (2006). https://doi.org/10.1007/s00122-006-0212-7

  • DOI: https://doi.org/10.1007/s00122-006-0212-7
