Protein structure alignment algorithms play an important role in the study of protein structure and function. In this paper, we present a novel approach to structure alignment. Core regions of the two protein structures are first aligned by identifying connected components in a network of neighboring, geometrically compatible aligned fragment pairs. These initial alignments are then refined by a multi-objective optimization method. The algorithm can produce both sequential and non-sequential alignments. Computational experiments on several benchmark datasets, with comparisons against well-known structure alignment algorithms such as DALI, CE and MATT, show the superior performance of the proposed algorithm. The method obtains accurate and biologically significant alignments for proteins with internal repeats or indels, identifies circular permutations, and reveals conserved functional sites. We also present a ranking criterion for fold similarity, which in most of our numerical experiments is comparable or superior to the Z-score of CE. The software and supplementary data of computational results are available at http://zhangroup.aporc.org/bioinfo/SANA.
Protein structure alignment; Fold comparison; Circular permutation; Functional sites
PDB: Protein data bank
RMSD: Root mean square distance
AFP: Aligned fragment pair
SANA: Protein structure alignment based on sequence neighborhood alignment
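
The first stage named in the abstract, grouping neighboring and geometrically compatible AFPs into connected components, can be illustrated with a minimal sketch. This is not the SANA implementation: the compatibility test (comparing C-alpha distance matrices), the neighborhood test (start positions within `max_gap` residues in both chains), and the parameter names `frag_len`, `tol` and `max_gap` are all assumptions chosen for illustration. The multi-objective refinement stage is not covered.

```python
import numpy as np

def pairwise_dist(X, Y):
    """Euclidean distances between every row of X and every row of Y."""
    return np.linalg.norm(X[:, None, :] - Y[None, :, :], axis=-1)

def find_afps(A, B, frag_len=4, tol=1.0):
    """Return (i, j) pairs whose fragments A[i:i+frag_len] and B[j:j+frag_len]
    have similar internal distance matrices (assumed AFP criterion)."""
    afps = []
    for i in range(len(A) - frag_len + 1):
        dA = pairwise_dist(A[i:i + frag_len], A[i:i + frag_len])
        for j in range(len(B) - frag_len + 1):
            dB = pairwise_dist(B[j:j + frag_len], B[j:j + frag_len])
            if np.abs(dA - dB).max() <= tol:
                afps.append((i, j))
    return afps

def compatible(A, B, p, q, frag_len, tol, max_gap):
    """Assumed edge rule: two AFPs are linked if they start close together
    in both chains and their inter-fragment distances agree within tol."""
    (i1, j1), (i2, j2) = p, q
    if abs(i1 - i2) > max_gap or abs(j1 - j2) > max_gap:
        return False
    dA = pairwise_dist(A[i1:i1 + frag_len], A[i2:i2 + frag_len])
    dB = pairwise_dist(B[j1:j1 + frag_len], B[j2:j2 + frag_len])
    return np.abs(dA - dB).max() <= tol

def afp_components(A, B, frag_len=4, tol=1.0, max_gap=8):
    """Connected components of the AFP network, largest first."""
    afps = find_afps(A, B, frag_len, tol)
    parent = list(range(len(afps)))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a in range(len(afps)):
        for b in range(a + 1, len(afps)):
            if compatible(A, B, afps[a], afps[b], frag_len, tol, max_gap):
                parent[find(a)] = find(b)

    groups = {}
    for idx, afp in enumerate(afps):
        groups.setdefault(find(idx), []).append(afp)
    return sorted(groups.values(), key=len, reverse=True)
```

Here `A` and `B` are assumed to be arrays of C-alpha coordinates with shape (n, 3) and (m, 3). Because edges depend only on local neighborhood and not on a global residue order, a component may pair fragments out of sequence order, which is the property that makes non-sequential alignments and circular permutations reachable in principle.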