Abstract
Public databases contain large datasets of plant expressed sequence tags (ESTs) that can be mined for microsatellite (simple sequence repeat, SSR) markers. Identifying and annotating these markers takes considerable time. Here, we describe an efficient, high-throughput microsatellite mining and analysis pipeline, the standalone EST microsatellite mining and analysis tool (SEMAT). The pipeline bundles sequence trimming, assembly, microsatellite identification, primer selection, and BLAST annotation, for which it uses SeqClean, CAP3, MISA, Primer3, and BLAST, in that order. SEMAT is written in Perl and runs under Ubuntu and Fedora Linux. It is an efficient, time-saving bioinformatics tool for high-throughput EST-SSR analysis and is freely available from http://semat.cpcribioinformatics.in/.
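To make the microsatellite identification step concrete, the following is a minimal Python sketch of how perfect SSRs can be located in a sequence with regular expressions. It is an illustration only and is not SEMAT or MISA code: the function name `find_ssrs` and the minimum repeat counts in `MIN_REPEATS` are hypothetical values chosen for the example, and the sketch ignores mononucleotide and compound repeats.

```python
import re

# Hypothetical thresholds: motif length -> minimum number of repeats.
# Illustrative values only; not confirmed SEMAT or MISA settings.
MIN_REPEATS = {2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def find_ssrs(sequence, min_repeats=MIN_REPEATS):
    """Return sorted (start, end, motif, repeat_count) tuples for perfect SSRs.

    Coordinates are 0-based, with an exclusive end.
    """
    seq = sequence.upper()
    hits = []
    for unit_len, min_count in sorted(min_repeats.items()):
        # One motif of unit_len bases followed by at least (min_count - 1) copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_count - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are repeats of a shorter unit (e.g. "AA", "ATAT").
            is_redundant = any(
                motif == motif[:k] * (unit_len // k)
                for k in range(1, unit_len)
                if unit_len % k == 0
            )
            if is_redundant:
                continue
            count = len(match.group(0)) // unit_len
            hits.append((match.start(), match.end(), motif, count))
    return sorted(hits)


if __name__ == "__main__":
    example = "GGC" + "AT" * 7 + "CCGTA" + "AGC" * 5 + "TT"
    for start, end, motif, count in find_ssrs(example):
        print(f"{motif} x{count} at {start}-{end}")
```

In a full pipeline of the kind described above, coordinates such as these would typically be passed on to primer design so that primers flank each repeat; that step is not shown here.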

References
Altschul SF, Madden TL, Schaffer AA et al (1997) Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res 25:3389–3402
Argout X, Fouet O, Wincker P, Gramacho K et al (2008) Towards the understanding of the cacao transcriptome: production and analysis of an exhaustive dataset of ESTs of Theobroma cacao L. generated from various tissues and under various conditions. BMC Genomics 9:1–19
D’Agostino N, Aversano M, Chiusano ML (2005) ParPEST: a pipeline for EST data analysis based on parallel computing. BMC Bioinforma 6:S9
Dong Q, Kroiss L, Oakley FD et al (2005) Comparative EST analyses in plant systems. Methods Enzymol 395:400–408
Huang X, Madan A (1999) CAP3: a DNA sequence assembly program. Genome Res 9:868–877
Javier F, Francisco G, Antonio R (2008) EST2uni: an open, parallel tool for automated EST analysis and database creation, with data mining web interface and microarray expression data integration. BMC Bioinforma 9:5
Jongeneel CV (2000) Searching the expressed sequence tag (EST) databases: panning for genes. Brief Bioinform 1:76–92
Matukumalli LK, Grefenstette JJ, Sonstegard TS et al (2004) EST-PAGE – managing and analyzing EST data. Bioinformatics 20:286–288
Robinson AJ, Love CG, Batley J, Barker G, Edwards D (2004) Simple sequence repeat marker loci discovery using SSR primer. Bioinformatics 20:1475–1476
Rozen S, Skaletsky H (2000) Primer3 on the WWW for general users and for biologist programmers. Methods Mol Biol 132:365–386
Rudd S (2003) Expressed sequence tags: alternative or complement to whole genome sequences? Trends Plant Sci 8:321–329
Acknowledgments
This work was supported by grants from the Department of Information Technology (DIT), India. We sincerely thank Dr. George V. Thomas, Director, Central Plantation Crops Research Institute, Kasaragod, India, for his guidance and support.
Data archiving statement
The manuscript contains no data that must be submitted to a public database. All results of the cocoa EST analysis and the SEMAT tool have been uploaded to our web server, http://semat.cpcribioinformatics.in/, for open access.
Additional information
Communicated by J. L. Wegrzyn
Cite this article
Asari, N.S., Ramaswamy, M., Subbian, E.A. et al. Standalone EST microsatellite mining and analysis tool (SEMAT): for automated EST-SSR analysis in plants. Tree Genetics & Genomes 10, 1755–1757 (2014). https://doi.org/10.1007/s11295-014-0785-2