
Parallel divide and conquer bio-sequence comparison based on Smith-Waterman algorithm

Science in China Series F: Information Sciences

Abstract

Tools for pairwise bio-sequence alignment have long played a central role in computational biology, and several alignment algorithms have been developed. The Smith-Waterman algorithm, based on dynamic programming, is considered the most fundamental alignment algorithm in bioinformatics. However, the existing parallel Smith-Waterman algorithm needs a large amount of memory, which limits the length of the sequences it can handle. As biological sequence data expand rapidly, this memory requirement has become a critical problem. To solve it, we develop a new parallel bio-sequence alignment algorithm based on the divide-and-conquer strategy, named the PSW-DC algorithm. The algorithm first partitions the query sequence into several subsequences and distributes them among the processors. Each processor then compares its subsequence with the whole subject sequence in parallel, using the Smith-Waterman algorithm, to produce an interim result. Finally, the optimal alignment between the query sequence and the subject sequence is obtained through a special combination and extension method. The memory required by our algorithm is significantly lower than that of existing ones. We also develop a key technique for combination and extension, named the C&E method, to manipulate the interim results and obtain the final sequence alignment. We implement the new parallel bio-sequence alignment algorithm, PSW-DC, on a cluster parallel system.
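The partition-and-compare stage described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' PSW-DC implementation: the scoring values, the function names (`sw_score`, `partition`, `interim_scores`) and the use of a thread pool in place of cluster processors are all assumptions made for illustration. The C&E method is not sketched, because the abstract does not describe how it works.

```python
from concurrent.futures import ThreadPoolExecutor

# Illustrative scoring values (assumed, not taken from the paper).
MATCH, MISMATCH, GAP = 2, -1, -1


def sw_score(query, subject):
    """Best Smith-Waterman local alignment score with a linear gap penalty.

    Only two rows of the dynamic programming matrix are kept, so memory
    grows with len(subject) rather than len(query) * len(subject).
    """
    prev = [0] * (len(subject) + 1)
    best = 0
    for q in query:
        curr = [0]
        for j, s in enumerate(subject, start=1):
            diag = prev[j - 1] + (MATCH if q == s else MISMATCH)
            cell = max(0, diag, prev[j] + GAP, curr[j - 1] + GAP)
            curr.append(cell)
            best = max(best, cell)
        prev = curr
    return best


def partition(query, parts):
    """Split the query into `parts` contiguous chunks of near-equal length."""
    size, extra = divmod(len(query), parts)
    chunks, start = [], 0
    for i in range(parts):
        end = start + size + (1 if i < extra else 0)
        chunks.append(query[start:end])
        start = end
    return chunks


def interim_scores(query, subject, parts):
    """Score every query chunk against the whole subject sequence.

    Threads stand in for the processors of a cluster here; they only
    illustrate the data distribution, not real parallel speedup.
    """
    chunks = partition(query, parts)
    with ThreadPoolExecutor(max_workers=parts) as pool:
        return list(pool.map(lambda chunk: sw_score(chunk, subject), chunks))
```

An alignment that crosses a chunk boundary is not recovered by these per-chunk scores alone. According to the abstract, combining and extending the interim results into the final alignment is the role of the C&E method.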




Cite this article

Zhang, F., Qiao, X. & Liu, Z. Parallel divide and conquer bio-sequence comparison based on Smith-Waterman algorithm. Sci China Ser F 47, 221–231 (2004). https://doi.org/10.1360/02yf0362


  • DOI: https://doi.org/10.1360/02yf0362
