Abstract
The widely used programs BLAST (in this article, ‘BLAST’ includes both the National Center for Biotechnology Information [NCBI] BLAST® and the Washington University version WU BLAST) and FASTA for similarity searches in nucleotide and protein databases usually produce copious output. When large query sets are used, human inspection of this output rapidly becomes impractical. BioParser is a Perl program for parsing BLAST and FASTA reports. Making extensive use of the BioPerl toolkit, the program filters, stores and returns components of these reports in either ASCII or HTML format. BioParser can also automatically load the parsed information into a local MySQL® database, allowing subsequent filtering of hits and/or alignments with specific attributes. BioParser is therefore a valuable tool for large-scale similarity analyses: it improves access to the information present in BLAST or FASTA reports, facilitates extraction of useful information from large sets of sequence alignments, and allows easy handling and processing of the data.
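The parse, store and filter workflow described above can be illustrated with a minimal sketch. This is not BioParser's code: BioParser is written in Perl on top of BioPerl and loads a MySQL database, whereas the sketch below uses Python with the standard-library `sqlite3` module as a stand-in, and it assumes the input is BLAST tabular output (the 12-column layout produced by NCBI BLAST+ `-outfmt 6`) rather than the full reports BioParser handles. The function names, table name, column names and thresholds are all hypothetical.

```python
import sqlite3

# Hypothetical column names for the 12 fields of BLAST tabular output:
# qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore
COLUMNS = ("query_id", "subject_id", "pct_identity", "aln_length",
           "mismatches", "gap_opens", "q_start", "q_end",
           "s_start", "s_end", "evalue", "bit_score")
FLOAT_COLUMNS = ("pct_identity", "evalue", "bit_score")
INT_COLUMNS = ("aln_length", "mismatches", "gap_opens",
               "q_start", "q_end", "s_start", "s_end")


def parse_tabular(lines):
    """Yield one dict per hit line, skipping blank lines and '#' comments."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) != len(COLUMNS):
            raise ValueError(f"expected {len(COLUMNS)} columns, got {len(fields)}")
        hit = dict(zip(COLUMNS, fields))
        for name in FLOAT_COLUMNS:
            hit[name] = float(hit[name])
        for name in INT_COLUMNS:
            hit[name] = int(hit[name])
        yield hit


def load_hits(conn, hits):
    """Create the (hypothetical) hits table if needed and insert parsed hits."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS hits ("
        "query_id TEXT, subject_id TEXT, pct_identity REAL, aln_length INTEGER, "
        "mismatches INTEGER, gap_opens INTEGER, q_start INTEGER, q_end INTEGER, "
        "s_start INTEGER, s_end INTEGER, evalue REAL, bit_score REAL)"
    )
    conn.executemany(
        "INSERT INTO hits VALUES (:query_id, :subject_id, :pct_identity, "
        ":aln_length, :mismatches, :gap_opens, :q_start, :q_end, "
        ":s_start, :s_end, :evalue, :bit_score)",
        list(hits),
    )
    conn.commit()


def filter_hits(conn, max_evalue=1e-5, min_identity=30.0):
    """Return (query_id, subject_id, evalue) rows passing both thresholds."""
    cur = conn.execute(
        "SELECT query_id, subject_id, evalue FROM hits "
        "WHERE evalue <= ? AND pct_identity >= ? "
        "ORDER BY query_id, evalue",
        (max_evalue, min_identity),
    )
    return cur.fetchall()
```

A typical use would open a connection with `sqlite3.connect(...)`, pass an open tabular file to `parse_tabular`, hand the result to `load_hits`, and then call `filter_hits` with whatever E-value and identity cut-offs suit the analysis. Keeping the hits in a relational table is what makes later filtering by arbitrary attributes a matter of changing the query rather than re-parsing the reports.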
Acknowledgements
We wish to thank Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq), Programa de Apoio à Pesquisa Estratégica em Saúde — Fiocruz (PAPES-Fiocruz), World Health Organization — Special Programme for Research and Training in Tropical Diseases (WHO/TDR), United Nations University — Biotechnology for Latin America and the Caribbean — Bioinformatics Network for Latin-America and Caribbean (UNU-BIOLAC LacBioNet) and Ciencia y Tecnología para el Desarrollo— Red Iberoamericana de Bioinformática (CYTED-RIB) for support.
The authors have no conflicts of interest that are directly relevant to the content of this article.
Additional information
Availability: BioParser is licensed under the Creative Commons Attribution-NonCommercial-NoDerivs 2.0 license terms (http://creativecommons.org/licenses/by-nc-nd/2.0/) and is available upon request. Additional information can be found at the BioParser website (http://www.dbbm.fiocruz.br/BioParser.html).
Cite this article
Catanho, M., Mascarenhas, D., Degrave, W. et al. BioParser. Appl-Bioinformatics 5, 49–53 (2006). https://doi.org/10.2165/00822942-200605010-00007