Abstract
Many Expressed Sequence Tag (EST) sequencing projects produce thousands of sequences that must be cleaned and annotated. This work presents Full-Lengther, an algorithm that identifies full-length cDNA sequences in EST data. Full-Lengther relies on a BLAST report obtained against a protein database such as UniProt. The BLAST alignments guide the location of protein-coding regions, mainly the start codon. For sequences that are not homologous to any database entry, Full-Lengther includes an ORF prediction algorithm. The algorithm is implemented as a web tool to simplify its use and improve portability. It is accessible worldwide at http://castanea.ac.uma.es/genuma/full-lengther/
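The abstract does not specify how the ORF prediction step works, so the following is only a minimal sketch of one common approach, not Full-Lengther's actual method: scan all six reading frames and keep the longest stretch that runs from an ATG to the first in-frame stop codon. The function names (`reverse_complement`, `longest_orf`) and the choice of the standard stop codons are assumptions made for illustration.

```python
# Hypothetical sketch: longest ATG-to-stop ORF over six reading frames.
# This is NOT Full-Lengther's code; names and rules are illustrative assumptions.
COMPLEMENT = str.maketrans("ACGT", "TGCA")
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code assumed
START_CODON = "ATG"


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def longest_orf(seq):
    """Return (strand, start, end, orf) for the longest ORF, or None.

    start and end are 0-based, end-exclusive coordinates on the strand
    that was scanned; the stop codon is included in the returned ORF.
    """
    seq = seq.upper()
    best = None
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None  # first ATG seen since the last stop in this frame
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == START_CODON:
                    start = i
                elif start is not None and codon in STOP_CODONS:
                    end = i + 3
                    # Keep this ORF only if it is strictly longer than the best so far.
                    if best is None or end - start > best[2] - best[1]:
                        best = (strand, start, end, s[start:end])
                    start = None
    return best
```

Calling `longest_orf("CCATGAAATTTTAGGG")` reports the ORF `ATGAAATTTTAG` on the forward strand. A sketch like this ignores ESTs truncated at the 5' or 3' end, which is exactly the case where homology evidence from a BLAST report would be more informative than a pure ORF scan.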
© 2007 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Lara, A.J., Pérez-Trabado, G., Villalobos, D.P., Díaz-Moreno, S., Cantón, F.R., Claros, M.G. (2007). A Web Tool to Discover Full-Length Sequences — Full-Lengther. In: Corchado, E., Corchado, J.M., Abraham, A. (eds) Innovations in Hybrid Intelligent Systems. Advances in Soft Computing, vol 44. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-74972-1_47
DOI: https://doi.org/10.1007/978-3-540-74972-1_47
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-74971-4
Online ISBN: 978-3-540-74972-1
eBook Packages: Engineering, Engineering (R0)