Tropical Plant Biology, Volume 1, Issue 3–4, pp 278–292

Genome-Wide Comparative Analyses of Microsatellites in Papaya

  • Jianping Wang
  • Cuixia Chen
  • Jong-Kuk Na
  • Qingyi Yu
  • Shaobin Hou
  • Robert E. Paull
  • Paul H. Moore
  • Maqsudul Alam
  • Ray Ming (email author)


Microsatellites, or simple sequence repeats (SSRs), are highly polymorphic and universally distributed in eukaryotes. SSRs have been used extensively as sequence-tagged markers in genetic studies. Recently, the functional and evolutionary importance of SSRs has received considerable attention. Here we report the mining and characterization of SSRs in the papaya genome. We analyzed SSRs from 277.4 Mb of whole genome shotgun (WGS) sequences, 51.2 Mb of bacterial artificial chromosome (BAC) end sequences (BES), and 13.4 Mb of expressed sequence tag (EST) sequences. The papaya SSR density was one SSR per 0.7 kb of DNA sequence in the WGS sequences, higher than in the BES and EST sequences. SSR abundance decreased sharply as repeat length increased. By motif length, dinucleotide repeats were the most common class I SSRs, whereas hexanucleotide repeats were the most abundant class II SSRs. Tri- and hexanucleotide repeats of both classes were more abundant in EST sequences than in genomic sequences. Among class I SSRs, AT and AAT were the most frequent motifs in BES and WGS sequences, whereas AG and AAG were the most abundant in EST sequences. For SSR marker development, 9,860 primer pairs were surveyed for amplification and polymorphism. The successful amplification rate and the polymorphic rate were 66.6% and 17.6%, respectively. The highest polymorphic rates were achieved by AT, AG, and ATG motifs. This genome-wide analysis of microsatellites revealed their frequency and distribution in the papaya genome, which vary among plant genomes. This complete set of SSR markers throughout the genome will assist diverse genetic studies in papaya and related species.
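The kind of SSR mining summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline. It assumes perfect repeats only, motifs of 2 to 6 bases, a minimum tract length of 12 bp, and the commonly used split into class I (tracts of at least 20 bp) and class II (shorter tracts); the abstract states none of these thresholds, so the constants and the function names `find_ssrs` and `ssr_density_kb` are hypothetical.

```python
import re
from collections import Counter

# Hypothetical thresholds for this sketch; the abstract does not state them.
MIN_TRACT_BP = 12   # shortest repeat tract counted as an SSR
CLASS_I_BP = 20     # tracts at least this long are labelled class I

def is_primitive(motif):
    """True if the motif is not itself a repeat of a shorter unit (e.g. ATAT, AA)."""
    n = len(motif)
    return not any(n % k == 0 and motif == motif[:k] * (n // k) for k in range(1, n))

def find_ssrs(seq, min_unit=2, max_unit=6):
    """Return sorted (start, motif, tract_bp, ssr_class) tuples for perfect SSRs."""
    seq = seq.upper()
    hits = []
    for k in range(min_unit, max_unit + 1):
        min_copies = -(-MIN_TRACT_BP // k)  # ceiling division
        # A k-base motif followed by at least (min_copies - 1) further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_copies - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            if not is_primitive(motif):
                continue  # already reported under its shorter repeat unit
            tract = len(m.group(0))
            ssr_class = "I" if tract >= CLASS_I_BP else "II"
            hits.append((m.start(), motif, tract, ssr_class))
    return sorted(hits)

def ssr_density_kb(seq):
    """Average kb of sequence per SSR, or None if no SSR is found."""
    n = len(find_ssrs(seq))
    return (len(seq) / 1000) / n if n else None

# Toy sequence: one (AT)10 tract and one (AAG)4 tract in unique flanking DNA.
toy = "GCTA" + "AT" * 10 + "CCGTCA" + "AAG" * 4 + "TTGC"
hits = find_ssrs(toy)
motif_counts = Counter(motif for _, motif, _, _ in hits)
```

On the toy sequence, the (AT)10 tract is 20 bp long and is labelled class I, and the (AAG)4 tract is 12 bp long and is labelled class II. The sketch does not merge equivalent motifs (for example AT and TA, or reverse complements), which a real motif-frequency comparison would need to do.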


Bacterial artificial chromosome end sequences (BES); Carica papaya; Expressed sequence tag (EST); Motif; Simple sequence repeats (SSRs); Whole genome shotgun (WGS) sequences



We thank Yinjun Li for technical assistance. This project was supported by a USDA T-STAR grant through the University of Hawaii (to R.P., R.M., P.M., and Q.Y.), a USDA-ARS Cooperative Agreement (CA 58-3020-8-134) with the Hawaii Agriculture Research Center, the University of Hawaii (to M.A.), the U.S. Department of Defense (W81XWH0520013 to M.A.), and startup funds from the University of Illinois at Urbana-Champaign (to R.M.).

Supplementary material

12042_2008_9024_MOESM1_ESM.xls (3 MB)
Supplementary Table 1: Comprehensive information on the 9,860 SSR markers surveyed (DOC 3.04 MB)



Copyright information

© Springer-Verlag 2008

Authors and Affiliations

  • Jianping Wang (1)
  • Cuixia Chen (1)
  • Jong-Kuk Na (1)
  • Qingyi Yu (2)
  • Shaobin Hou (3)
  • Robert E. Paull (4)
  • Paul H. Moore (5)
  • Maqsudul Alam (3)
  • Ray Ming (1), email author

  1. Department of Plant Biology, University of Illinois at Urbana-Champaign, Urbana, USA
  2. Hawaii Agriculture Research Center, Aiea, USA
  3. Advanced Studies in Genomics, Proteomics and Bioinformatics, University of Hawaii, Honolulu, USA
  4. Department of Tropical Plant and Soil Sciences, University of Hawaii, Honolulu, USA
  5. USDA-ARS, Pacific Basin Agricultural Research Center, Hilo, USA
