Abstract
BLAST, one of the most widely used bio-sequence search tools, adopts an index-based approach: it detects matches between two substrings by looking them up in a large table, processing one match per query. In this paper, we propose a systolic array approach that detects string matches without using lookup tables. The pipelined systolic array is implemented as a multi-seeds detection and parallel extension engine that accelerates the first two stages of the NCBI BLAST family of algorithms. Unlike the index-based approach, our implementation consumes little memory and eliminates redundant string extensions by merging multiple adjacent seeds into a single valid seed. Our FPGA implementation outperforms related FPGA BLAST accelerators in both the number of processing elements and clock frequency. Experimental results show speedups of about 17, 48, 14, 71 and 10 over the NCBI BLASTp, TBLASTn, BLASTx, TBLASTx and BLASTn programs, respectively, for 3072-residue queries on an Intel P4 CPU. Furthermore, the idea of multi-seeds detection can also be adopted in other seed-based heuristic search applications.
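The seed-merging idea can be illustrated with a small software model. This is a minimal sequential sketch, not the systolic array hardware design the paper describes. It assumes exact word matching with a hypothetical word length `w`, so it does not model the scored neighborhood words used by the protein programs. The names `detect_seeds` and `merge_adjacent_seeds` are illustrative and do not come from the paper. Seeds are found by direct comparison rather than a lookup table, and seeds that overlap or touch on the same diagonal are merged so that only one extension would be needed for them.

```python
from typing import List, Tuple

# (query_start, subject_start, length); a hypothetical seed representation
Seed = Tuple[int, int, int]


def detect_seeds(query: str, subject: str, w: int) -> List[Seed]:
    """Find every exact w-mer match by direct comparison (no lookup table)."""
    seeds = []
    for i in range(len(query) - w + 1):
        for j in range(len(subject) - w + 1):
            if query[i:i + w] == subject[j:j + w]:
                seeds.append((i, j, w))
    return seeds


def merge_adjacent_seeds(seeds: List[Seed]) -> List[Seed]:
    """Merge seeds that overlap or touch on the same diagonal (j - i)."""
    merged: List[Seed] = []
    # Sort by diagonal first, then by position in the query.
    for i, j, length in sorted(seeds, key=lambda s: (s[1] - s[0], s[0])):
        if merged:
            pi, pj, plen = merged[-1]
            same_diagonal = (pj - pi) == (j - i)
            if same_diagonal and i <= pi + plen:
                # Extend the previous seed instead of keeping a new one.
                merged[-1] = (pi, pj, max(plen, i + length - pi))
                continue
        merged.append((i, j, length))
    return merged


if __name__ == "__main__":
    raw = detect_seeds("ACGTACGG", "TTACGTACCA", w=4)
    print(len(raw), "raw seeds ->", merge_adjacent_seeds(raw))
```

In this toy input, three overlapping seeds on one diagonal collapse into a single longer seed, so a later extension stage would start once from that region rather than three times.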
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Xia, F., Dou, Y., Xu, J. (2008). Families of FPGA-Based Accelerators for BLAST Algorithm with Multi-seeds Detection and Parallel Extension. In: Elloumi, M., Küng, J., Linial, M., Murphy, R.F., Schneider, K., Toma, C. (eds) Bioinformatics Research and Development. BIRD 2008. Communications in Computer and Information Science, vol 13. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-70600-7_4
DOI: https://doi.org/10.1007/978-3-540-70600-7_4
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-70598-7
Online ISBN: 978-3-540-70600-7
eBook Packages: Computer Science (R0)