Abstract
Population-based methods for the genetic mapping of adaptive traits and the analysis of natural selection require that the population structure and demographic history of a species be taken into account. We characterized geographic patterns of genetic variation in the model plant Arabidopsis thaliana by genotyping 115 genome-wide single nucleotide polymorphism (SNP) markers in 351 accessions from the whole species range using a matrix-assisted laser desorption/ionization time-of-flight (MALDI-TOF) assay, and by sequencing nine unlinked short genomic regions in a subset of 64 accessions. Because of an excess of rare polymorphisms, the observed frequency distribution of SNPs is not consistent with a constant-size neutral model of sequence polymorphism. Differences in genetic diversity between geographic regions provide evidence of significant population structure. Accessions from Central Asia have a low level of polymorphism and an increased level of genome-wide linkage disequilibrium (LD) relative to accessions from the Iberian Peninsula and Central Europe. Cluster analysis with the program structure grouped Eurasian accessions into K = 6 clusters. Accessions from the Iberian Peninsula and from Central Asia constitute distinct populations, whereas Central and Eastern European accessions represent admixed populations in which genomes were reshuffled by historical recombination events. These patterns likely result from a rapid postglacial recolonization of Eurasia from glacial refugial populations. Our analyses suggest that mapping populations for association or LD mapping should be chosen from a regional rather than a species-wide sample, or identified genetically as sets of individuals with similar average genetic distances.
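To make the comparison of genome-wide LD between regional samples concrete, the sketch below computes the squared allelic correlation r² for each pair of biallelic SNPs and averages it over all pairs. It is an illustration only, not necessarily the LD measure or procedure used in the study. It assumes that each accession can be treated as a single homozygous haplotype (reasonable for a predominantly selfing species) and that alleles are coded 0/1. The function names `r_squared` and `mean_pairwise_r2` and the toy genotypes are hypothetical.

```python
from itertools import combinations
from math import isnan, nan


def r_squared(locus_a, locus_b):
    """Squared allelic correlation between two biallelic loci.

    Each argument is a list of 0/1 alleles, one entry per accession,
    with accessions in the same order in both lists.
    """
    n = len(locus_a)
    if n == 0 or n != len(locus_b):
        raise ValueError("loci must be non-empty and of equal length")
    p_a = sum(locus_a) / n                      # frequency of allele 1 at locus A
    p_b = sum(locus_b) / n                      # frequency of allele 1 at locus B
    p_ab = sum(x & y for x, y in zip(locus_a, locus_b)) / n  # frequency of the 1-1 haplotype
    d = p_ab - p_a * p_b                        # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        return nan                              # undefined if either locus is monomorphic
    return d * d / denom


def mean_pairwise_r2(loci):
    """Average r^2 over all pairs of loci, skipping undefined pairs."""
    values = [r_squared(a, b) for a, b in combinations(loci, 2)]
    values = [v for v in values if not isnan(v)]
    return sum(values) / len(values) if values else nan


# Toy data (hypothetical): three SNPs scored in four accessions.
toy_loci = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],   # perfectly associated with the first SNP
    [0, 1, 0, 1],   # independent of the first two SNPs
]
toy_mean_r2 = mean_pairwise_r2(toy_loci)
```

Applied separately to each regional sample, a higher mean r² between unlinked markers would indicate stronger genome-wide LD. Comparing regions fairly would also require similar sample sizes, because r² is inflated in small samples.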
Acknowledgements
This work was funded by the German Ministry of Science (BMBF) as part of the GABI project (#0312275A) to T. A. and by the Emmy-Noether program of the Deutsche Forschungsgemeinschaft (Schm 1354-2/2) to K. J. S. We are grateful to Henriette Ringys-Beckstein, Maik Zehnsdorf and Melanie Lück for excellent technical assistance. We also thank K. Bachmann, M. Clauss, B. Haubold, M. Koornneef, A. Lawton-Rauh, T. Mitchell-Olds, S. Ramos-Onsins and E. Wheeler for discussions and comments on an earlier version of the manuscript.
Additional information
Communicated by O. Savolainen
About this article
Cite this article
Schmid, K.J., Törjék, O., Meyer, R. et al. Evidence for a large-scale population structure of Arabidopsis thaliana from genome-wide single nucleotide polymorphism markers. Theor Appl Genet 112, 1104–1114 (2006). https://doi.org/10.1007/s00122-006-0212-7
DOI: https://doi.org/10.1007/s00122-006-0212-7