Refining the Progressive Multiple Sequence Alignment Score Using Genetic Algorithms

  • Halit Ergezer
  • Kemal Leblebicioğlu
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3949)


Given a set of N (N > 2) sequences, the multiple sequence alignment (MSA) problem is to align these N sequences, possibly with gaps, so as to obtain the best score under a given scoring criterion between characters. Multiple sequence alignment is one of the basic tools for interpreting the information obtained from bioinformatics studies. Dynamic programming (DP) gives the optimal alignment of two sequences for a given scoring scheme. For multiple sequence alignment, however, DP requires enormous time and space to obtain the optimal alignment, and both requirements grow exponentially with the number of sequences. Apart from DP, there are two basic classes of solutions: progressive methods and iterative methods. In this study, we use an iterative method to refine the alignment score that a progressive method obtains under the given scoring criterion. A genetic algorithm (GA) serves as the iterative method, and the sum-of-pairs (SP) score is the target of optimization. Fifteen operators are defined to refine alignment quality by combining and mutating the alignments in the alignment population. The results show that the novel operators, sliding-window and local-alignment, which have not been used before, increase the score of the progressive alignment by 2%.
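To make the optimization target concrete, the sum-of-pairs score of an alignment can be sketched as follows. This is a minimal illustration, not the authors' implementation: the match, mismatch, and gap values and the function names `pair_score` and `sum_of_pairs` are assumptions chosen for the example, and a gap aligned with a gap is assumed to score zero.

```python
from itertools import combinations


def pair_score(a, b, match=1, mismatch=-1, gap=-2):
    """Score one pair of aligned characters (illustrative values, not from the paper)."""
    if a == "-" and b == "-":
        return 0  # assumption: a gap aligned with a gap costs nothing
    if a == "-" or b == "-":
        return gap
    return match if a == b else mismatch


def sum_of_pairs(alignment):
    """Sum pair_score over every column and every pair of rows in that column."""
    if len({len(row) for row in alignment}) != 1:
        raise ValueError("all rows of an alignment must have the same length")
    total = 0
    for column in zip(*alignment):
        for a, b in combinations(column, 2):
            total += pair_score(a, b)
    return total


# Three sequences already aligned with gaps ("-").
alignment = ["AC-GT", "ACGGT", "A--GT"]
print(sum_of_pairs(alignment))  # prints 2
```

A genetic algorithm of the kind described above would treat a function like `sum_of_pairs` as its fitness function, keeping the alignments in the population whose score improves after an operator is applied.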


Keywords: Dynamic Program, Iterative Method, Multiple Sequence Alignment, Pairwise Alignment, Alignment Score
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Halit Ergezer (1)
  • Kemal Leblebicioğlu (2)
  1. Department of Computer Engineering, Başkent University, Ankara, Turkey
  2. Department of Electrical and Electronics Engineering, Middle East Technical University, Ankara, Turkey
