Plant Molecular Biology

Volume 48, Issue 5–6, pp 501–510

Data mining for simple sequence repeats in expressed sequence tags from barley, maize, rice, sorghum and wheat

  • Ramesh V. Kantety
  • Mauricio La Rota
  • David E. Matthews
  • Mark E. Sorrells


Plant genomics projects involving model species and many agriculturally important crops are resulting in a rapidly increasing database of genomic and expressed DNA sequences. The publicly available collection of expressed sequence tags (ESTs) from several grass species can be used in the analysis of both structural and functional relationships in these genomes. We analyzed over 260 000 EST sequences from five different cereals for their potential use in developing simple sequence repeat (SSR) markers. The frequency of SSR-containing ESTs (SSR-ESTs) in this collection varied from 1.5% for maize to 4.7% for rice. In addition, we identified several ESTs that are related to the SSR-ESTs by BLAST analysis. The SSR-ESTs and the related sequences were clustered within each species in order to reduce the redundancy and to produce a longer consensus sequence. The consensus and singleton sequences from each species were pooled and clustered to identify cross-species matches. Overall a reduction in the redundancy by 85% was observed when the resulting consensus and singleton sequences (3569) were compared to the total number of SSR-EST and related sequences analyzed (24 606). This information can be useful for the development of SSR markers that can amplify across the grass genera for comparative mapping and genetics. Functional analysis may reveal their role in plant metabolism and gene evolution.
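The SSR screening step and the redundancy figure reported above can be illustrated with a minimal sketch. The abstract does not state which repeat criteria or software were used for the search, so the motif lengths and minimum copy numbers below are hypothetical, as are the function names `find_ssrs` and `redundancy_reduction`. Only perfect tandem repeats are detected.

```python
import re

# Hypothetical thresholds (not from the paper): motif length -> minimum tandem copies.
MIN_REPEATS = {2: 6, 3: 4, 4: 3}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return (start, motif, copies) for each perfect tandem repeat found in seq."""
    seq = seq.upper()
    hits = []
    for motif_len, min_copies in sorted(min_repeats.items()):
        # A motif of motif_len bases followed by at least (min_copies - 1) further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_copies - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs built from a shorter unit (e.g. "AA" or "AGAG"),
            # so homopolymer runs and shorter-motif repeats are not reported twice.
            if any(
                motif == motif[:k] * (motif_len // k)
                for k in range(1, motif_len)
                if motif_len % k == 0
            ):
                continue
            copies = len(match.group(0)) // motif_len
            hits.append((match.start(), motif, copies))
    return hits


def redundancy_reduction(n_input, n_output):
    """Fraction by which clustering reduced the number of sequences."""
    return 1.0 - n_output / n_input


if __name__ == "__main__":
    est = "TTT" + "AG" * 8 + "CCC"  # toy sequence, not real EST data
    print(find_ssrs(est))
    # Counts taken from the abstract: 24 606 input sequences, 3569 after clustering.
    print(round(redundancy_reduction(24606, 3569), 3))
```

With the counts given in the abstract, the second function returns roughly 0.855, which agrees with the reported 85% reduction. Clustering itself (assembling SSR-ESTs and BLAST-related sequences into consensus sequences) is not covered by this sketch.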

Keywords: comparative genomics, comparative mapping, EST, EST clustering, SSR





Copyright information

© Kluwer Academic Publishers 2002

Authors and Affiliations

  • Ramesh V. Kantety
    • 1
  • Mauricio La Rota
  • David E. Matthews
    • 2
  • Mark E. Sorrells
    • 1
  1. Department of Plant Breeding, Cornell University, Ithaca, USA
  2. USDA-ARS Center for Agricultural Bioinformatics, Cornell University, Ithaca, USA
