PROMALS3D: Multiple Protein Sequence Alignment Enhanced with Evolutionary and Three-Dimensional Structural Information
Multiple sequence alignment (MSA) is an essential tool with many applications in bioinformatics and computational biology. Accurate MSA construction for divergent proteins remains a difficult computational task. The steadily growing number of protein sequences and structures in public databases can be used to improve alignment quality. PROMALS3D is a tool for protein MSA construction enhanced with additional evolutionary and structural information from database searches. PROMALS3D automatically identifies homologs of the input proteins in sequence and structure databases, derives structure-based constraints from alignments of three-dimensional structures, and combines them with sequence-based constraints from profile–profile alignments in a consistency-based framework to construct high-quality multiple sequence alignments. The PROMALS3D output is a consensus alignment enriched with sequence and structural information about the input proteins and their homologs. The PROMALS3D Web server and package are available at http://prodata.swmed.edu/PROMALS3D.
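The idea of combining sequence-based and structure-based constraints in a consistency-based framework can be illustrated with a toy sketch. This is not the PROMALS3D implementation: the function names (`blend`, `relax`), the blending weight, and the ProbCons-style relaxation through a single third sequence are assumptions chosen for illustration, and the numbers are made up.

```python
# Toy sketch of consistency-based scoring (assumed scheme, not PROMALS3D code).
# A matrix m[i][j] holds a match weight for residue i of one sequence
# and residue j of another.

def matmul(a, b):
    """Plain matrix product of two list-of-lists matrices."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]

def blend(seq_m, struct_m, struct_weight):
    """Mix sequence-based and structure-based weights (hypothetical linear blend)."""
    return [
        [(1 - struct_weight) * s + struct_weight * t for s, t in zip(row_s, row_t)]
        for row_s, row_t in zip(seq_m, struct_m)
    ]

def relax(xy, xz, zy):
    """Re-estimate x-y weights using evidence routed through a third sequence z.

    Assumed ProbCons-style average over three sequences: the direct x-y
    matrix counts twice (z = x and z = y), the path through z counts once.
    """
    via_z = matmul(xz, zy)
    return [
        [(2 * d + v) / 3 for d, v in zip(row_d, row_v)]
        for row_d, row_v in zip(xy, via_z)
    ]

# Illustrative inputs: three sequences x, y, z with two residues each.
seq_xy = [[0.4, 0.1], [0.1, 0.4]]      # from a profile-profile comparison
struct_xy = [[0.9, 0.0], [0.0, 0.9]]   # from a structural alignment
xz = [[0.8, 0.0], [0.0, 0.8]]
zy = [[0.8, 0.0], [0.0, 0.8]]

combined_xy = blend(seq_xy, struct_xy, struct_weight=0.5)
relaxed_xy = relax(combined_xy, xz, zy)
```

In this sketch, residue pairs supported both directly and through the third sequence keep a high weight after relaxation, while pairs with no support through `z` are down-weighted. A progressive aligner would then use such relaxed weights as its scoring function.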
Key words: Multiple sequence alignment; Database searches; Three-dimensional structural alignment; Consistency-based scoring; Probabilistic model of profile–profile alignment
The work is supported in part by the National Institutes of Health (GM094575 to NVG) and the Welch Foundation (I-1505 to NVG).