Theoretical and Applied Genetics, Volume 121, Issue 1, pp 157–168

Deleterious amino acid polymorphisms in Arabidopsis thaliana and rice

  • Torsten Günther
  • Karl J. Schmid
Original Paper


Plant genetic diversity has mainly been investigated with neutral markers, but large-scale DNA sequencing projects now enable the identification and analysis of different classes of genetic polymorphisms, such as non-synonymous single nucleotide polymorphisms (nsSNPs) in protein coding sequences. Genome-wide nsSNP data from Arabidopsis thaliana and rice were analyzed with the SIFT and MAPP programs, which predict whether nsSNPs are tolerated (i.e., effectively neutral) or deleterious for protein function. In both species, about 20% of polymorphic sites with nsSNPs were classified as deleterious; they segregate at lower allele frequencies than tolerated nsSNPs due to purifying selection. Furthermore, A. thaliana accessions from marginal populations show a higher relative proportion of deleterious nsSNPs, which likely reflects differential selection or demographic effects in subpopulations. To evaluate the sensitivity of the predictions, nsSNPs with known functional effects in genes from model and crop plants were classified with both algorithms. The programs correctly classified about 70% of these nsSNPs as tolerated or as deleterious (i.e., as having a functional effect). Forward-in-time simulations of bottleneck and domestication models indicated a high power to detect demographic effects on nsSNP frequencies in sufficiently large datasets. The results indicate that nsSNPs are useful markers for analyzing genetic diversity in plant genetic resources and breeding populations to infer natural/artificial selection and genetic drift.
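The abstract does not describe the simulation setup, so the following is only a minimal sketch of what a forward-in-time simulation of a bottleneck can look like, not the authors' implementation. It assumes a haploid Wright-Fisher population; the function names, the population sizes in `BOTTLENECK`, the starting frequency, and the selection coefficient are all hypothetical choices made for illustration.

```python
import random


def after_selection(p, s):
    """Frequency of a derived allele after selection (assumed haploid model).

    Carriers have relative fitness 1 - s, with 0 <= s < 1.
    s = 0 corresponds to a tolerated (effectively neutral) allele.
    """
    return p * (1.0 - s) / (1.0 - s * p)


def next_generation(p, s, n, rng):
    """One Wright-Fisher generation: selection, then binomial drift in n offspring."""
    p_sel = after_selection(p, s)
    carriers = sum(1 for _ in range(n) if rng.random() < p_sel)
    return carriers / n


def simulate(p0, s, sizes, rng):
    """Allele-frequency trajectory; sizes[i] is the population size in generation i."""
    trajectory = [p0]
    for n in sizes:
        trajectory.append(next_generation(trajectory[-1], s, n, rng))
    return trajectory


def mean_final_frequency(p0, s, sizes, replicates, seed):
    """Average final frequency over independent replicate populations."""
    rng = random.Random(seed)
    total = sum(simulate(p0, s, sizes, rng)[-1] for _ in range(replicates))
    return total / replicates


# Hypothetical demography: 20 generations at N = 500, a 10-generation
# bottleneck at N = 50, then recovery to N = 500 for 20 generations.
BOTTLENECK = [500] * 20 + [50] * 10 + [500] * 20

if __name__ == "__main__":
    neutral = mean_final_frequency(0.1, 0.0, BOTTLENECK, replicates=100, seed=1)
    deleterious = mean_final_frequency(0.1, 0.05, BOTTLENECK, replicates=100, seed=1)
    print("mean final frequency, neutral:", neutral)
    print("mean final frequency, deleterious:", deleterious)
```

Under these assumptions the selected allele is expected to end at a lower average frequency than the neutral one, which mirrors the pattern reported for deleterious versus tolerated nsSNPs. Changing `BOTTLENECK` shows how strongly drift during the small-population phase can blur that difference.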


Keywords: MAPP Prediction; Artificial Selection; Demographic History; Selection Coefficient; Lower Allele Frequency



We are grateful to the IPK bioinformatics group for assistance with the computer cluster and to two anonymous reviewers for their comments. This work was supported by core funding from the Institute of Plant Genetics and Crop Plant Research (IPK) Gatersleben, and the Swedish University of Agricultural Sciences (SLU) Uppsala.




Copyright information

© Springer-Verlag 2010

Authors and Affiliations

  1. Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), Gatersleben, Germany
  2. Institute of Plant Breeding, Seed Science and Population Genetics, University of Hohenheim, Stuttgart, Germany
  3. Swedish University of Agricultural Sciences (SLU), Uppsala, Sweden
