BioParser

A Tool for Processing of Sequence Similarity Analysis Reports

Applied Bioinformatics

Abstract

BLAST and FASTA, the widely used programs for similarity searches in nucleotide and protein databases, usually produce copious output. (In this article, ‘BLAST’ includes both the National Center for Biotechnology Information [NCBI] BLAST® and the Washington University version, WU BLAST.) When large query sets are used, human inspection of this output rapidly becomes impractical. BioParser is a Perl program for parsing BLAST and FASTA reports. Making extensive use of the BioPerl toolkit, the program filters, stores and returns components of these reports in either ASCII or HTML format. BioParser can also automatically load the parsed information into a local MySQL® database, allowing subsequent filtering of hits and/or alignments with specific attributes. BioParser is therefore a valuable tool for large-scale similarity analyses: it improves access to the information present in BLAST or FASTA reports, facilitates extraction of useful information from large sets of sequence alignments, and allows easy handling and processing of the data.
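The parse, store and filter workflow described in the abstract can be illustrated with a minimal sketch. This is not BioParser's implementation, which is written in Perl on top of BioPerl and loads a MySQL database. As stated assumptions, the sketch uses Python's standard library, reads the 12-column tabular output of NCBI BLAST+ (`-outfmt 6`) rather than a full report, and substitutes an in-memory SQLite database for MySQL. The table name `hits`, the function names, the thresholds and the sample rows are all hypothetical and chosen only for illustration.

```python
import csv
import io
import sqlite3

# Default column order of NCBI BLAST+ tabular output (-outfmt 6).
COLUMNS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
           "qstart", "qend", "sstart", "send", "evalue", "bitscore"]
INT_FIELDS = ("length", "mismatch", "gapopen", "qstart", "qend", "sstart", "send")
FLOAT_FIELDS = ("pident", "evalue", "bitscore")

# Invented sample rows (not real search results), tab-separated.
SAMPLE = (
    "query1\tsubjA\t98.50\t200\t3\t0\t1\t200\t11\t210\t1e-100\t380.0\n"
    "query1\tsubjB\t25.00\t80\t60\t2\t5\t84\t40\t119\t0.5\t30.1\n"
    "query2\tsubjC\t45.20\t150\t80\t1\t10\t159\t1\t150\t2e-20\t95.3\n"
)


def parse_tabular(handle):
    """Yield one dict per hit line, skipping blank and comment lines."""
    for row in csv.reader(handle, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue
        hit = dict(zip(COLUMNS, row))
        for name in INT_FIELDS:
            hit[name] = int(hit[name])
        for name in FLOAT_FIELDS:
            hit[name] = float(hit[name])
        yield hit


def load_hits(conn, hits):
    """Create the (hypothetical) hits table if needed and insert all hits."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS hits ("
        "qseqid TEXT, sseqid TEXT, pident REAL, length INTEGER, "
        "mismatch INTEGER, gapopen INTEGER, qstart INTEGER, qend INTEGER, "
        "sstart INTEGER, send INTEGER, evalue REAL, bitscore REAL)"
    )
    placeholders = ", ".join(":" + name for name in COLUMNS)
    conn.executemany("INSERT INTO hits VALUES (" + placeholders + ")", hits)
    conn.commit()


def filter_hits(conn, max_evalue=1e-5, min_pident=30.0):
    """Return hits at or below an E-value cutoff and at or above an identity cutoff."""
    cursor = conn.execute(
        "SELECT qseqid, sseqid, pident, evalue FROM hits "
        "WHERE evalue <= ? AND pident >= ? ORDER BY qseqid, evalue",
        (max_evalue, min_pident),
    )
    return cursor.fetchall()
```

A possible usage: open `sqlite3.connect(":memory:")`, call `load_hits(conn, parse_tabular(io.StringIO(SAMPLE)))`, then `filter_hits(conn)`. Once the hits sit in a relational table, any further selection by attribute becomes an ordinary SQL query, which is the benefit the abstract attributes to storing parsed reports in a database.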

Acknowledgements

We wish to thank Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq), Programa de Apoio à Pesquisa Estratégica em Saúde — Fiocruz (PAPES-Fiocruz), World Health Organization — Special Programme for Research and Training in Tropical Diseases (WHO/TDR), United Nations University — Biotechnology for Latin America and the Caribbean — Bioinformatics Network for Latin-America and Caribbean (UNU-BIOLAC LacBioNet) and Ciencia y Tecnología para el Desarrollo— Red Iberoamericana de Bioinformática (CYTED-RIB) for support.

The authors have no conflicts of interest that are directly relevant to the content of this article.

Author information

Corresponding author

Correspondence to Antonio Basílio de Miranda.

Additional information

Availability: BioParser is licensed under the Creative Commons Attribution-NonCommercial-NoDerivs 2.0 license terms (http://creativecommons.org/licenses/by-nc-nd/2.0/) and is available upon request. Additional information can be found at the BioParser website (http://www.dbbm.fiocruz.br/BioParser.html).

About this article

Cite this article

Catanho, M., Mascarenhas, D., Degrave, W. et al. BioParser. Appl Bioinformatics 5, 49–53 (2006). https://doi.org/10.2165/00822942-200605010-00007

  • DOI: https://doi.org/10.2165/00822942-200605010-00007
