A Fast Longest Common Subsequence Algorithm for Biosequences Alignment

  • Wei Liu
  • Lin Chen
Part of the International Federation for Information Processing book series (IFIPAICT, volume 258)

Searching for the longest common subsequence (LCS) of biosequences is one of the most important tasks in bioinformatics. This paper presents a fast algorithm for the LCS problem, named FAST_LCS. Starting from the initial identical character pairs, the algorithm follows a successor table to find their successors, obtaining all identical character pairs and their levels. It then traces back from the identical character pair at the largest level to obtain the LCS. For two sequences X and Y with lengths n and m, the memory required by FAST_LCS is max{8*(n+1)+8*(m+1), L}, where L is the number of identical character pairs, and the time complexity of the parallel implementation is O(|LCS(X,Y)|), where |LCS(X,Y)| is the length of the LCS of X and Y. Experimental results on gene sequences from the TIGR database, run on the Shenteng 1800 MPP parallel computer, show that the algorithm obtains exact results and is faster and more efficient than other LCS algorithms.
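The abstract gives no implementation details, so the following is only a minimal serial sketch of the general idea it describes: build a successor table for each sequence, expand identical character pairs level by level, then trace back from a pair at the largest level. The names `successor_table` and `fast_lcs_sketch`, the 1-based position convention, and the absence of any pruning or parallelism are assumptions made for illustration, not the authors' FAST_LCS implementation.

```python
def successor_table(s, alphabet):
    """table[c][i] = smallest 1-based position k > i with s[k-1] == c, else None.

    Index i ranges over 0..len(s); i = 0 means "before the first character".
    (Hypothetical helper; the layout of the paper's table may differ.)
    """
    n = len(s)
    table = {c: [None] * (n + 1) for c in alphabet}
    for c in alphabet:
        nxt = None
        for i in range(n, 0, -1):      # scan right to left
            if s[i - 1] == c:
                nxt = i
            table[c][i - 1] = nxt
    return table


def fast_lcs_sketch(x, y):
    """Return one longest common subsequence of x and y (illustrative only)."""
    alphabet = sorted(set(x) & set(y))
    tx = successor_table(x, alphabet)
    ty = successor_table(y, alphabet)

    def successors(i, j):
        # For each character, the nearest identical pair strictly after (i, j).
        out = []
        for c in alphabet:
            a, b = tx[c][i], ty[c][j]
            if a is not None and b is not None:
                out.append((a, b))
        return out

    # levels[k] maps each identical pair found at level k+1 to a parent pair
    # at level k (None for the initial pairs).
    frontier = {p: None for p in successors(0, 0)}
    levels = []
    while frontier:
        levels.append(frontier)
        nxt = {}
        for (i, j) in frontier:
            for p in successors(i, j):
                nxt.setdefault(p, (i, j))
        frontier = nxt

    if not levels:
        return ""

    # Trace back from any pair at the largest level.
    pair = next(iter(levels[-1]))
    chars = []
    for k in range(len(levels) - 1, -1, -1):
        chars.append(x[pair[0] - 1])
        pair = levels[k][pair]
    return "".join(reversed(chars))
```

In this sketch the number of levels equals the LCS length, and the pairs within one level can be expanded independently of each other, which is consistent with the O(|LCS(X,Y)|) parallel time stated in the abstract. For example, `fast_lcs_sketch("AGGTAB", "GXTXAYB")` returns a common subsequence of length 4.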

Keywords

bioinformatics, longest common subsequence, identical character pair

Copyright information

© IFIP International Federation for Information Processing 2008

Authors and Affiliations

  • Wei Liu (1)
  • Lin Chen (2)
  1. Institute of Information Science and Technology, Nanjing University of Aeronautics and Astronautics, China
  2. Department of Computer Science, Yangzhou University, China
