A Fast Longest Common Subsequence Algorithm for Biosequences Alignment

  • Wei Liu
  • Lin Chen
Conference paper

DOI: 10.1007/978-0-387-77251-6_8

Part of the International Federation for Information Processing book series (IFIPAICT, volume 258)
Cite this paper as:
Liu W., Chen L. (2008) A Fast Longest Common Subsequence Algorithm for Biosequences Alignment. In: Li D. (eds) Computer And Computing Technologies In Agriculture, Volume I. CCTA 2007. The International Federation for Information Processing, vol 258. Springer, Boston, MA

Searching for the longest common subsequence (LCS) of biosequences is one of the most important tasks in bioinformatics. A fast algorithm for the LCS problem, named FAST_LCS, is presented. The algorithm first seeks the successors of the initial identical character pairs according to a successor table to obtain all the identical pairs and their levels. Then, by tracing back from the identical character pair at the largest level, the LCS can be obtained. For two sequences X and Y with lengths n and m, the memory required for FAST_LCS is max{8*(n+1)*8*(m+1), L}, where L is the number of identical character pairs, and the time complexity of the parallel implementation is O(|LCS(X,Y)|), where |LCS(X,Y)| is the length of the LCS of X and Y. Experimental results on gene sequences from the TIGR database, obtained on the Shenteng 1800 MPP parallel computer, show that our algorithm produces exact results and is faster and more efficient than other LCS algorithms.
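The successor-table idea described in the abstract can be illustrated with a minimal sequential sketch. This is not the authors' FAST_LCS implementation: it omits any pruning and parallelism, and the names `build_successor_table`, `successors`, and `lcs_by_levels` are hypothetical, chosen only for this example. It assumes 1-based positions, with 0 meaning "no further occurrence", and it explores identical character pairs level by level, keeping for each pair the parent seen at its highest level so that a trace-back from the last level recovers one LCS.

```python
def build_successor_table(s, alphabet):
    """table[c][i] is the smallest 1-based position j > i with s[j-1] == c, or 0 if none."""
    n = len(s)
    table = {c: [0] * (n + 1) for c in alphabet}
    for c in alphabet:
        nxt = 0
        for i in range(n, 0, -1):      # scan right to left
            if s[i - 1] == c:
                nxt = i
            table[c][i - 1] = nxt
    return table


def successors(pair, tx, ty, alphabet):
    """Yield the identical character pairs that directly follow `pair`."""
    i, j = pair
    for c in alphabet:
        ni, nj = tx[c][i], ty[c][j]
        if ni and nj:                  # character c occurs later in both sequences
            yield (ni, nj)


def lcs_by_levels(x, y):
    """Return one longest common subsequence of x and y (illustrative sketch)."""
    alphabet = sorted(set(x) & set(y))
    tx = build_successor_table(x, alphabet)
    ty = build_successor_table(y, alphabet)

    # Level 1: the initial identical pairs are the successors of the virtual pair (0, 0).
    frontier = set(successors((0, 0), tx, ty, alphabet))
    parent = {p: None for p in frontier}
    last = set()

    # Expand level by level. A pair reached again at a higher level gets its
    # parent overwritten, so `parent` always reflects the pair's highest level.
    while frontier:
        last = frontier
        nxt = set()
        for p in frontier:
            for q in successors(p, tx, ty, alphabet):
                parent[q] = p
                nxt.add(q)
        frontier = nxt

    if not last:
        return ""

    # Trace back from any pair at the largest level.
    node = min(last)
    chars = []
    while node is not None:
        chars.append(x[node[0] - 1])
        node = parent[node]
    return "".join(reversed(chars))
```

For example, `lcs_by_levels("ABCBDAB", "BDCABA")` returns a common subsequence of length 4. In this sketch the number of levels visited equals the LCS length, which is consistent with the O(|LCS(X,Y)|) bound the abstract states for the parallel implementation, where the pairs of one level could in principle be expanded concurrently.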

Keywords

bioinformatics; longest common subsequence; identical character pair

Copyright information

© IFIP International Federation for Information Processing 2008

Authors and Affiliations

  • Wei Liu (1)
  • Lin Chen (2)
  1. Institute of Information Science and Technology, Nanjing University of Aeronautics and Astronautics, China
  2. Department of Computer Science, Yangzhou University, China
