Abstract
Plant genetic diversity has mainly been investigated with neutral markers, but large-scale DNA sequencing projects now enable the identification and analysis of other classes of genetic polymorphisms, such as non-synonymous single nucleotide polymorphisms (nsSNPs) in protein-coding sequences. Genome-wide nsSNP data from Arabidopsis thaliana and rice were analyzed with the SIFT and MAPP programs, which predict whether an nsSNP is tolerated (i.e., effectively neutral) or deleterious for protein function. In both species, about 20% of polymorphic sites with nsSNPs were classified as deleterious; these segregate at lower allele frequencies than tolerated nsSNPs due to purifying selection. Furthermore, A. thaliana accessions from marginal populations show a higher relative proportion of deleterious nsSNPs, which likely reflects differential selection or demographic effects in subpopulations. To evaluate the sensitivity of the predictions, nsSNPs with known functional effects in genes from model and crop plants were analyzed with both programs. The programs correctly classified about 70% of these nsSNPs as tolerated or deleterious (i.e., as having a functional effect). Forward-in-time simulations of bottleneck and domestication models indicated high power to detect demographic effects on nsSNP frequencies in sufficiently large datasets. The results indicate that nsSNPs are useful markers for analyzing genetic diversity in plant genetic resources and breeding populations, and for inferring natural or artificial selection and genetic drift.
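The abstract does not describe how the forward-in-time simulations were implemented. As a rough illustration of the idea only, the following sketch assumes a haploid Wright-Fisher model in which a new mutation with selection coefficient s is followed through a sequence of population sizes, so that a bottleneck is simply a run of small values. The function names, the population sizes, and the selection coefficients are all hypothetical choices made for this example and are not taken from the study.

```python
import random


def selection_shift(freq, s):
    """Expected allele frequency after selection when the allele has fitness 1 - s.

    Assumes 0 <= s < 1 (hypothetical parameterization for this sketch).
    """
    return freq * (1.0 - s) / (1.0 - s * freq)


def wright_fisher_step(freq, pop_size, s, rng):
    """One generation: selection, then binomial sampling of pop_size gene copies."""
    expected = selection_shift(freq, s)
    copies = sum(1 for _ in range(pop_size) if rng.random() < expected)
    return copies / pop_size


def simulate_allele(pop_sizes, s, rng):
    """Follow one new mutation through the generations listed in pop_sizes."""
    freq = 1.0 / pop_sizes[0]  # a new mutation starts as a single copy
    for pop_size in pop_sizes:
        freq = wright_fisher_step(freq, pop_size, s, rng)
        if freq in (0.0, 1.0):  # lost or fixed, nothing left to simulate
            break
    return freq


def segregating_frequencies(pop_sizes, s, replicates, seed=1):
    """Final frequencies of the mutations that are still polymorphic at the end."""
    rng = random.Random(seed)
    finals = (simulate_allele(pop_sizes, s, rng) for _ in range(replicates))
    return [f for f in finals if 0.0 < f < 1.0]


if __name__ == "__main__":
    # Hypothetical demography: 20 generations at N=100, 5 at N=10, 25 at N=100.
    bottleneck = [100] * 20 + [10] * 5 + [100] * 25
    for label, s in (("tolerated (s=0)", 0.0), ("deleterious (s=0.05)", 0.05)):
        freqs = segregating_frequencies(bottleneck, s, replicates=2000)
        mean = sum(freqs) / len(freqs) if freqs else float("nan")
        print(label, "segregating:", len(freqs), "mean frequency:", mean)
```

Under these assumptions, mutations with s > 0 are removed faster than neutral ones, so fewer of them remain polymorphic, which is the qualitative pattern the abstract attributes to purifying selection. A realistic analysis would need diploid genotypes, a distribution of selection coefficients, and a sampling step that mimics the sequenced accessions.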
Acknowledgments
We are grateful to the IPK bioinformatics group for assistance with the computer cluster and to two anonymous reviewers for their comments. This work was supported by core funding from the Institute of Plant Genetics and Crop Plant Research (IPK) Gatersleben, and the Swedish University of Agricultural Sciences (SLU) Uppsala.
Additional information
Communicated by A. Schulman.
Cite this article
Günther, T., Schmid, K.J. Deleterious amino acid polymorphisms in Arabidopsis thaliana and rice. Theor Appl Genet 121, 157–168 (2010). https://doi.org/10.1007/s00122-010-1299-4
DOI: https://doi.org/10.1007/s00122-010-1299-4