Journal of Genetics, Volume 93, Issue 2, pp 431–442

Patterns of microsatellite evolution inferred from the Helianthus annuus (Asteraceae) transcriptome



The distribution of microsatellites in exons and their association with gene ontology (GO) terms are explored to elucidate patterns of microsatellite evolution in the common sunflower, Helianthus annuus. The relative position, motif, size and level of impurity were estimated for each microsatellite in the unigene database available from the Compositae Genome Project (CGP), and statistical analyses were performed to determine whether differences in microsatellite distributions and enrichment within certain GO terms were significant. There are more translated than untranslated microsatellites, implying that many bring about structural changes in proteins. However, the greatest density is observed within the UTRs, particularly the 5′UTRs. Further, UTR microsatellites are purer and longer than coding region microsatellites. This suggests either that UTR microsatellites are younger and under more relaxed constraints, or that purifying selection limits impurities and directional selection favours their expansion. GO terms associated with response to various environmental stimuli, including water deprivation and salt stress, were significantly enriched with microsatellites. This may suggest that these genes are more labile in plant genomes, or that selection has favoured the maintenance of microsatellites in these genes over others. This study shows that the distribution of transcribed microsatellites in H. annuus is nonrandom, that coding region microsatellites are under greater constraint than UTR microsatellites, and that these sequences are enriched within genes that regulate plant responses to environmental stress and stimuli.
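To make the idea of per-region microsatellite density concrete, the following is a minimal Python sketch. It is not the authors' pipeline: the minimum repeat thresholds, the function names, the toy transcript and the rule for assigning boundary-spanning repeats are all assumptions made for illustration, and only perfect (impurity-free) repeats are detected.

```python
import re
from collections import Counter

# Assumed minimum repeat counts per motif length (illustrative, not from the paper).
MIN_REPEATS = {1: 10, 2: 6, 3: 4, 4: 3, 5: 3, 6: 3}


def find_perfect_microsatellites(seq):
    """Return sorted (start, end, motif, repeats) tuples, 0-based half-open."""
    seq = seq.upper()
    hits = []
    for k, min_rep in MIN_REPEATS.items():
        # A k-mer followed by at least (min_rep - 1) exact copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are repeats of a shorter unit (e.g. "TT", "GAGA").
            if any(motif == motif[:d] * (k // d) for d in range(1, k) if k % d == 0):
                continue
            hits.append((m.start(), m.end(), motif, (m.end() - m.start()) // k))
    return sorted(hits)


def region_of(start, end, cds_start, cds_end):
    """Assign a repeat to a transcript region; boundary-spanning repeats count as CDS."""
    if end <= cds_start:
        return "5'UTR"
    if start >= cds_end:
        return "3'UTR"
    return "CDS"


def density_per_kb(seq, cds_start, cds_end):
    """Microsatellites per kilobase of each region of a single transcript."""
    lengths = {
        "5'UTR": cds_start,
        "CDS": cds_end - cds_start,
        "3'UTR": len(seq) - cds_end,
    }
    counts = Counter(
        region_of(s, e, cds_start, cds_end)
        for s, e, _, _ in find_perfect_microsatellites(seq)
    )
    return {r: (1000.0 * counts[r] / n if n else 0.0) for r, n in lengths.items()}


# Toy transcript: (GA)8 in the 5'UTR, (CAG)5 in the CDS, (T)12 in the 3'UTR.
toy = "GA" * 8 + "ATG" + "GCC" + "CAG" * 5 + "TAA" + "C" + "T" * 12
hits = find_perfect_microsatellites(toy)
density = density_per_kb(toy, cds_start=16, cds_end=40)
```

Counts alone can favour the coding region simply because it is longer, which is why the sketch normalises by region length. A real analysis would also need to handle imperfect repeats and predicted rather than known coding boundaries.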


Keywords: microsatellite evolution; transcriptome; selection; untranslated regions; Helianthus annuus



The authors would like to acknowledge Loren Rieseberg for useful comments on improving this manuscript. Kristen Sauby, Leah Chinchilla and Christopher Brooks helped during the initial stages of this work. Susan Bridges helped with preliminary data analyses. David Chevalier and Donna Gordon provided suggestions for gene ontology analysis. This work was supported by the National Science Foundation under grants to M. E. Welch (NSF MCB-1158521) and A. D. Perkins (NSF EPS-0903787). The Office of Research and Economic Development, the College of Arts and Sciences, and the Department of Biological Sciences at Mississippi State University also funded this research.

Supplementary material

12041_2014_402_MOESM1_ESM.pdf (368 kb)



Copyright information

© Indian Academy of Sciences 2014

Authors and Affiliations

  1. Department of Biological Sciences, Mississippi State University, 295 Lee Boulevard, USA
  2. Department of Computer Science and Engineering, Mississippi State University, Butler Hall, USA
