Summary
In this chapter, we present a new multiple sequence alignment algorithm called AntiClustAl. The method builds on the common idea of aligning homologous sequences within classes generated by a clustering algorithm and then continuing the alignment bottom-up along a suitable tree structure. The final result is read at the root of the tree. Multiple sequence alignment within each cluster uses progressive alignment against the 1-median (center) of the cluster. The 1-median of a set S of sequences is the element of S that minimizes the average distance to the other sequences in S. Its exact computation requires quadratic time. The basic idea of our proposed algorithm is to use a simple and natural algorithmic technique based on randomized tournaments, an idea that has been successfully applied to large search problems in general metric spaces. In particular, a clustering data structure called the antipole tree and an approximate linear-time 1-median computation are used. Our algorithm achieves a better running time than ClustalW, a widely used tool for multiple sequence alignment, with equivalent alignment quality. A successful biological application, showing high amino acid conservation during the evolution of Xenopus laevis SOD2, is illustrated.
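To make the contrast between the exact and the approximate 1-median concrete, the following is a minimal Python sketch, not the chapter's implementation. It assumes plain edit distance as the metric and a tournament over random groups of three. The function names, the `group_size` parameter, and the choice of metric are all illustrative assumptions, and the chapter's actual tournament rules may differ.

```python
import random
from typing import Callable, List, Optional

def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via the standard two-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution or match
        prev = curr
    return prev[-1]

def exact_one_median(seqs: List[str],
                     dist: Callable[[str, str], int] = edit_distance) -> str:
    """Element of seqs with the smallest summed distance to all elements.

    Uses a quadratic number of distance evaluations.
    """
    return min(seqs, key=lambda s: sum(dist(s, t) for t in seqs))

def tournament_one_median(seqs: List[str],
                          dist: Callable[[str, str], int] = edit_distance,
                          group_size: int = 3,          # assumed, not from the chapter
                          rng: Optional[random.Random] = None) -> str:
    """Approximate 1-median by a randomized tournament (illustrative sketch).

    Each round shuffles the survivors, splits them into small groups, and
    keeps only the exact 1-median of each group. The number of survivors
    shrinks geometrically, so the total number of distance evaluations is
    linear in len(seqs) for a fixed group_size.
    """
    if group_size < 2:
        raise ValueError("group_size must be at least 2")
    rng = rng or random.Random(0)
    survivors = list(seqs)
    while len(survivors) > group_size:
        rng.shuffle(survivors)
        survivors = [exact_one_median(survivors[k:k + group_size], dist)
                     for k in range(0, len(survivors), group_size)]
    return exact_one_median(survivors, dist)

if __name__ == "__main__":
    cluster = ["ACGT", "ACGA", "ACCT", "TTTT", "ACGT"]
    print(exact_one_median(cluster))
    print(tournament_one_median(cluster))
```

The tournament always returns a member of the input set, but it is only an approximation: the winner can differ from the exact 1-median, and different random seeds may produce different winners.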
Keywords
- Multiple Sequence Alignment
- Base Number
- Pairwise Alignment
- Match Number
- Match Ratio
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.
Copyright information
© 2005 Springer-Verlag London Limited
About this chapter
Cite this chapter
Di Pietro, C. et al. (2005). AntiClustAl: Multiple Sequence Alignment by Antipole Clustering. In: Wu, X., Jain, L., Wang, J.T., Zaki, M.J., Toivonen, H.T., Shasha, D. (eds) Data Mining in Bioinformatics. Advanced Information and Knowledge Processing. Springer, London. https://doi.org/10.1007/1-84628-059-1_3
DOI: https://doi.org/10.1007/1-84628-059-1_3
Publisher Name: Springer, London
Print ISBN: 978-1-85233-671-4
Online ISBN: 978-1-84628-059-7
eBook Packages: Computer Science, Computer Science (R0)
