Abstract
In this paper we motivate the need for new techniques to accelerate pairwise global sequence alignment and propose a tiling bound for this purpose. The bound relies on a problem relaxation in which alignment scores of sequence fragments are combined to bound the distance of any alignment passing through a given point in the edit graph. We prove the correctness of the bound and briefly discuss possible implementation strategies.
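To make concrete the quantity that such a bound targets, the following minimal sketch computes the exact minimum distance of any alignment forced through a point (i, j) of the edit graph. It is not the tiling bound itself, whose construction is given in the paper and not in this abstract. The unit-cost edit distance and the names `edit_distance` and `cost_through` are assumptions made purely for illustration.

```python
def edit_distance(a: str, b: str) -> int:
    """Unit-cost edit distance via the standard dynamic program (assumed scoring)."""
    prev = list(range(len(b) + 1))  # row for the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # aligning a[:i] against the empty prefix of b
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # match or substitute
            ))
        prev = curr
    return prev[-1]


def cost_through(a: str, b: str, i: int, j: int) -> int:
    """Minimum distance of any global alignment passing through point (i, j).

    Such an alignment splits into an alignment of a[:i] with b[:j] followed by
    an alignment of a[i:] with b[j:], so the two optimal costs simply add.
    """
    return edit_distance(a[:i], b[:j]) + edit_distance(a[i:], b[j:])


if __name__ == "__main__":
    a, b = "kitten", "sitting"
    print(edit_distance(a, b))       # global optimum
    print(cost_through(a, b, 6, 0))  # a point far from any optimal path
```

The through-point cost is never smaller than the global optimum, and it equals the optimum exactly at points on an optimal path. A cheap lower bound on this quantity can therefore rule out a point without running the full dynamic program, provided the bound already exceeds a known upper limit on the optimal distance.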
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Horton, P., Frith, M. (2009). A Tiling Bound for Pairwise Global Sequence Alignment. In: Kim, Th., Fang, WC., Lee, C., Arnett, K.P. (eds) Advances in Software Engineering. ASEA 2008. Communications in Computer and Information Science, vol 30. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-10242-4_8
DOI: https://doi.org/10.1007/978-3-642-10242-4_8
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-10241-7
Online ISBN: 978-3-642-10242-4
eBook Packages: Computer Science, Computer Science (R0)