Uniformly repeated DNA sequences known as tandem repeats are among the most interesting features of the genomes of many organisms analyzed so far. Among tandem repeats, microsatellites have attracted many researchers because of their association with several human diseases. The discovery of tandem repeats in expressed sequence tags (ESTs) and in cDNA libraries has contributed new ideas and tools for evolutionary studies. With the advent of new biotechnological tools, the number of ESTs deposited in databases is increasing rapidly. New bioinformatics tools are therefore needed to assist the analysis and interpretation of tandem repeats in ESTs and in other types of DNA. In the present study we report two new utility tools: Organism Miner and Keyword Finder. Organism Miner collects, sorts, and splices DNA data files and provides a statistical overview of them. Keyword Finder analyzes all sequences in the input folder, extracts and collects keywords for a specific organism or for all organisms represented in the DNA sequences, and generates a statistical overview. We are currently generating cotton and pepper cDNA libraries and often use GenBank DNA sequences. In this study we therefore used cDNAs and ESTs of cotton and pepper to demonstrate the use of the two tools. With the help of these utilities we observed that most ESTs are useful for downstream applications such as mining microsatellites specific to an organ, tissue, or developmental stage. The analyses indicated not only that tandem repeats exist in ESTs but also that they are differentially represented in organ- or tissue-specific ESTs within and between species. The utilities and sample data sets are self-extracting files, freely available from ftp://ftp.akdeniz.edu.tr/Araclar/TRA/ or upon request from the corresponding author.
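The kind of per-organism, per-tissue tally described above can be illustrated with a short Python sketch. It is not the authors' implementation of Organism Miner or Keyword Finder. The header layout `accession|organism|tissue`, the function names, and the repeat threshold (a 2 to 6 nt motif in at least four exact copies) are all assumptions made for illustration only.

```python
import re
from collections import Counter

# Assumption: an exact tandem repeat is a 2-6 nt motif present in at least
# four consecutive copies. Note that homopolymer runs (e.g. poly-A tails)
# also match, because "AA" repeated four times satisfies the pattern.
SSR_PATTERN = re.compile(r"([ACGT]{2,6}?)\1{3,}")


def parse_fasta(text):
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line.upper())
    if header is not None:
        yield header, "".join(chunks)


def summarize(text):
    """Return {(organism, tissue): (ESTs with a repeat, total ESTs)}.

    Assumes the hypothetical header layout 'accession|organism|tissue';
    real GenBank records would need their own parsing.
    """
    totals, with_ssr = Counter(), Counter()
    for header, seq in parse_fasta(text):
        _accession, organism, tissue = header.split("|", 2)
        key = (organism, tissue)
        totals[key] += 1
        if SSR_PATTERN.search(seq):
            with_ssr[key] += 1
    return {key: (with_ssr[key], totals[key]) for key in totals}
```

Comparing the resulting fractions across tissues or species gives a rough picture of differential repeat representation; a real analysis would also have to handle redundant ESTs and library size differences.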
Expressed Sequence Tags
Simple Sequence Repeats (SSRs)
This research was funded by the Scientific Research Projects Administration Unit of Akdeniz University and The Scientific and Technological Research Council of Turkey (104O583).
About this article
Cite this article
Ince, A.G., Karaca, M., Bilgen, M. et al. Digital differential display tools for mining microsatellite containing organism, organ and tissue. Plant Cell Tiss Organ Cult 94, 281–290 (2008). https://doi.org/10.1007/s11240-008-9372-2
- EST data mining
- Tandem repeats