Improved Genetic Algorithm for Multiple Sequence Alignment Using Segment Profiles (GASP)

  • Yanping Lv
  • Shaozi Li
  • Changle Zhou
  • Wenzhong Guo
  • Zhengming Xu
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4093)

Abstract

This paper presents a novel genetic algorithm (GA) for multiple sequence alignment in protein analysis. First, and most significantly, the algorithm uses segment profiles to generate a diversified initial population and to prevent crossover and mutation operations from destroying conserved regions. Because segment profiles contain rich local information, they also speed up convergence. Second, it introduces the norMD function into a genetic algorithm to measure the quality of multiple alignments. Finally, to address premature convergence, an improved progressive method is used to optimize the highest-scoring individual of each new generation. The new algorithm is compared with the ClustalX and T-Coffee programs on several test cases from the BAliBASE benchmark alignment database. The experimental results show that it can yield better performance on data sets with long sequences, regardless of similarity.
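The abstract gives no implementation details, so the following is only a minimal sketch of two ingredients it mentions: a fitness function for an alignment and a mutation that leaves conserved regions untouched. Everything in it is an assumption made for illustration. `sum_of_pairs` is a simple stand-in score and is not norMD, the `protected` set of column indices is a hypothetical substitute for whatever the segment profiles mark as conserved, and `mutate_gap` is not the paper's mutation operator.

```python
import random
from itertools import combinations

GAP = "-"

def sum_of_pairs(alignment, match=1, mismatch=-1, gap=-2):
    """Score an alignment column by column (simple stand-in, NOT norMD)."""
    score = 0
    for column in zip(*alignment):
        for a, b in combinations(column, 2):
            if a == GAP and b == GAP:
                continue  # gap-gap pairs contribute nothing
            elif a == GAP or b == GAP:
                score += gap
            elif a == b:
                score += match
            else:
                score += mismatch
    return score

def mutate_gap(alignment, protected, rng):
    """Swap one gap with a neighbouring residue, skipping protected columns.

    `protected` is a hypothetical set of column indices treated as conserved.
    Swapping a gap with an adjacent residue keeps each sequence's residue order.
    """
    rows = [list(r) for r in alignment]
    candidates = [
        (i, j, k)
        for i, row in enumerate(rows)
        for j, ch in enumerate(row)
        if ch == GAP
        for k in (j - 1, j + 1)
        if 0 <= k < len(row) and row[k] != GAP
        and j not in protected and k not in protected
    ]
    if not candidates:
        return list(alignment)  # nothing can move without touching a protected column
    i, j, k = rng.choice(candidates)
    rows[i][j], rows[i][k] = rows[i][k], rows[i][j]
    return ["".join(r) for r in rows]
```

In a GA loop, a mutated individual would be rescored with the fitness function and kept or discarded by the selection scheme. Passing a seeded `random.Random` instance as `rng` keeps runs reproducible.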

Keywords

Genetic Algorithm, Multiple Sequence Alignment, Reference Alignment, Improve Genetic Algorithm, Progressive Method

Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Yanping Lv (1)
  • Shaozi Li (1)
  • Changle Zhou (1)
  • Wenzhong Guo (2)
  • Zhengming Xu (1)
  1. Intelligent Information Technology Lab., Department of Computer Science, Xiamen University, Xiamen, China
  2. Department of Computer Science, Fuzhou University, Fuzhou, China
