Abstract
BLAST is an important tool in bioinformatics. It is used to find sequences in a database of annotated sequences that are biologically similar to a given query sequence. For high-throughput processing of large numbers of query sequences, there have been many studies on parallel batch processing of sequence similarity searches using BLAST. As the number of sequences in the database increases at an exponential rate, the search speed of BLAST itself becomes important. Although NCBI has developed a multithreaded parallel BLAST for SMP machines, the speedup is still limited because the SMP architecture restricts the number of processors. In this paper, we present our parallelized BLAST on cluster systems for further speedup. The main strategy is to exploit inter-node parallelism, which can be extracted by logically partitioning the database. To this end, we have designed and implemented a logical database partitioning method, a mechanism for initiating and coordinating BLAST on remote nodes, and a communication protocol for collecting the remote nodes' results. In our performance test on a 2-way, 8-node cluster system, a speedup of roughly 12 times was achieved in the response time of the similarity search for an individual query sequence.
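The partition-and-collect idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the contiguous index-range partitioning, and the ranking of hits by raw score are all assumptions made for illustration, and the efficiency figure assumes that a 2-way, 8-node cluster means 16 processors in total.

```python
import heapq
from typing import List, Tuple

# Hypothetical hit record: (alignment score, sequence identifier).
Hit = Tuple[float, str]


def partition_ranges(num_sequences: int, num_nodes: int) -> List[Tuple[int, int]]:
    """Split sequence indices [0, num_sequences) into contiguous half-open
    ranges, one per node, without copying or splitting the database file."""
    if num_nodes <= 0:
        raise ValueError("num_nodes must be positive")
    base, extra = divmod(num_sequences, num_nodes)
    ranges = []
    start = 0
    for node in range(num_nodes):
        # The first `extra` nodes each take one additional sequence.
        size = base + (1 if node < extra else 0)
        ranges.append((start, start + size))
        start += size
    return ranges


def merge_hits(per_node_hits: List[List[Hit]], top_k: int) -> List[Hit]:
    """Combine the hit lists returned by the nodes and keep the top_k
    highest-scoring hits (assumption: ranking by raw score)."""
    all_hits = (hit for hits in per_node_hits for hit in hits)
    return heapq.nlargest(top_k, all_hits, key=lambda hit: hit[0])


def parallel_efficiency(speedup: float, processors: int) -> float:
    """Fraction of ideal linear speedup that was actually achieved."""
    return speedup / processors


if __name__ == "__main__":
    print(partition_ranges(10, 4))
    print(merge_hits([[(50.0, "a"), (30.0, "b")], [(70.0, "c")]], top_k=2))
    print(parallel_efficiency(12.0, 16))
```

Under the 16-processor assumption, the reported speedup of roughly 12 corresponds to a parallel efficiency of about 0.75. A real merge step would also need to compute alignment statistics against the size of the whole database rather than each partition, since BLAST E-values depend on the size of the search space.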
Copyright information
© 2003 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Kim, HS., Kim, HJ., Han, DS. (2003). Hyper-BLAST: A Parallelized BLAST on Cluster System. In: Sloot, P.M.A., Abramson, D., Bogdanov, A.V., Gorbachev, Y.E., Dongarra, J.J., Zomaya, A.Y. (eds) Computational Science — ICCS 2003. ICCS 2003. Lecture Notes in Computer Science, vol 2659. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44863-2_22
DOI: https://doi.org/10.1007/3-540-44863-2_22
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-40196-4
Online ISBN: 978-3-540-44863-1
eBook Packages: Springer Book Archive