SANA: an algorithm for sequential and non-sequential protein structure alignment
Protein structure alignment algorithms play an important role in the study of protein structure and function. This paper presents a novel approach to structure alignment. Core regions of two protein structures are first aligned by identifying connected components in a network of neighboring, geometrically compatible aligned fragment pairs. The initial alignments are then refined by a multi-objective optimization method. The algorithm can produce both sequential and non-sequential alignments. Computational experiments on several benchmark datasets, with comparisons against well-known structure alignment algorithms such as DALI, CE and MATT, demonstrate the superior performance of the proposed algorithm. The method obtains accurate and biologically significant alignments in the presence of internal repeats or indels, identifies circular permutations, and reveals conserved functional sites. We also present a ranking criterion for fold similarity, which in numerical experiments is comparable or superior to the Z-score of CE in most cases. The software and supplementary data of computational results are available at http://zhangroup.aporc.org/bioinfo/SANA.
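The first stage described above (connected components in a network of neighboring, geometrically compatible aligned fragment pairs) can be illustrated with a minimal sketch. The abstract does not give the fragment length, the compatibility measure, or the definition of "neighboring", so everything below is an assumption made for illustration: fragments are fixed-length windows of C-alpha coordinates, compatibility is a distance-matrix RMS difference below a tolerance, and two fragment pairs are neighbors when their start positions lie within `max_gap` residues in both structures. The names `cross_drmsd`, `find_afps` and `afp_components` are hypothetical and are not taken from the SANA software. The multi-objective refinement stage is not covered.

```python
import math
from itertools import combinations

def cross_drmsd(a1, a2, b1, b2):
    """RMS difference between the a1-to-a2 distances in structure A and the
    corresponding b1-to-b2 distances in structure B (superposition-free)."""
    total, n = 0.0, 0
    for p, r in zip(a1, b1):          # p in A is paired with r in B
        for q, s in zip(a2, b2):      # q in A is paired with s in B
            total += (math.dist(p, q) - math.dist(r, s)) ** 2
            n += 1
    return math.sqrt(total / n)

def find_afps(ca_a, ca_b, length=4, tol=0.5):
    """List (i, j) start indices of fragment pairs whose internal distance
    matrices agree within tol (assumed compatibility test)."""
    afps = []
    for i in range(len(ca_a) - length + 1):
        for j in range(len(ca_b) - length + 1):
            fa, fb = ca_a[i:i + length], ca_b[j:j + length]
            if cross_drmsd(fa, fa, fb, fb) <= tol:
                afps.append((i, j))
    return afps

def afp_components(ca_a, ca_b, length=4, tol=0.5, max_gap=8):
    """Group AFPs into connected components of the compatibility network,
    largest component first."""
    afps = find_afps(ca_a, ca_b, length, tol)
    parent = list(range(len(afps)))   # union-find forest over AFP indices

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    for u, v in combinations(range(len(afps)), 2):
        (i1, j1), (i2, j2) = afps[u], afps[v]
        # Assumed neighborhood rule: close in both structures. No ordering
        # is required, so non-sequential arrangements can still be linked.
        if abs(i1 - i2) > max_gap or abs(j1 - j2) > max_gap:
            continue
        fa1, fb1 = ca_a[i1:i1 + length], ca_b[j1:j1 + length]
        fa2, fb2 = ca_a[i2:i2 + length], ca_b[j2:j2 + length]
        # Edge if the inter-fragment geometry also agrees in A and B.
        if cross_drmsd(fa1, fa2, fb1, fb2) <= tol:
            parent[find(u)] = find(v)

    groups = {}
    for k, afp in enumerate(afps):
        groups.setdefault(find(k), []).append(afp)
    return sorted(groups.values(), key=len, reverse=True)
```

Calling `afp_components(ca_a, ca_b)` on two lists of `(x, y, z)` tuples returns lists of `(i, j)` fragment start indices; under these assumptions each list would be a candidate core region to pass on to a refinement step. A distance-based test is used here only because it avoids an explicit superposition; the published method may use a different criterion.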
Keywords: Protein structure alignment; Fold comparison; Circular permutation; Functional sites
PDB: Protein data bank
RMSD: Root mean square distance
AFP: Aligned fragment pair
Protein structure alignment based on sequence neighborhood alignment
We are grateful to the anonymous referees for many helpful comments that greatly improved the paper. This work was supported in part by the National Natural Science Foundation of China (NSFC) under Key Research Grant No. 10631070 and Research Grant No. 60503004, and by the JSPS-NSFC collaborative project (No. 10711140116). It was also supported in part by the Chief Scientist Program of the Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, under Grant No. 2009CSP002.