Background

Multiple alignments are of key importance in transferring annotation from model organism to humans [1]. The importance is reflected by the number of alignment methods that have emerged recently [25]. The development of alignment programs is governed by achieving ever increasing levels of accuracy on several commonly used benchmark sets [6, 7]. The accuracy is usually measured by calculating the number of identically aligned residues divided by the number of aligned residues in a reference alignment. Essentially, this reflects the extent to which an alignment method managed to reconstruct a reference alignment. Misaligned residues in the test alignment are completely ignored. Therefore alignment programs that tend to align more residues, usually global methods, appear to perform well.

It is often more desirable in practice to create alignments in which only reliable regions are aligned and unreliable regions remain unaligned. Misaligned regions can give the impression of conservation where in fact there is none. This is particularly true in multi-domain cases where alignment programs often fail to insert large gaps corresponding to the loss, or gain, of entire domains. In phylogenetic reconstruction such misaligned regions essentially contribute noise that can lead to false tree topologies [810]. Previously, we made the simple observation that regions aligned similarly by several alignment methods are usually more reliable than those aligned differently [11]. Our method Mumsa takes advantage of this observation by using cross-alignment conservation as an indicator of alignment accuracy and overall case difficulty.

Here we present a new algorithm that extracts identically aligned regions from several multiple alignments and creates new alignments out of them. Within these alignments, unreliably aligned residues from the input alignments are disentangled and therefore we term them relaxed alignments.

Algorithm

The input of our method is a set of multiple alignments of the same sequences generated either by different alignment methods or by a few methods employing different parameter settings.

An important concept in our algorithm are pairs-of-aligned residues (POARS) as defined in Lassmann and Sonnhammer (2005). Briefly, within an alignment two residues form a POAR when they occur in different sequences (rows) but within the same column.

The algorithm for the extraction of reliable regions from a set of alignments follows a few simple steps:

  1. 1.

    All input alignment in the set of alignments M are atomized into POARs.

  2. 2.

    The power set of M, P(M) is created. For example if |M| = 3, the following list of sub-sets is created: {},{A},{B},{C},{AB},{AC},{BC},{ABC} where A, B, C are alignments.

  3. 3.

    Each individual POAR is assigned to a single subset of P(M) depending on its occurrence within the input alignments. For example, an aligned pair of residues occurring in alignment A, B and C will be assigned to set {A, B, C} but not to any other set such as {A, B}.

  4. 4.

    Among the subsets of P(M) containing f alignments, select the one containing the maximum number of POARs (X). The parameter f is the stringency criterion for including residues in the final alignment: f = 3 requires all POARs in the final alignment occur in at least three input alignments, while f = 2 only requires POARs to occur in at least two alignments.

  5. 5.

    Build a set Y including all POARs that belong to X or sub-sets of P(M) of which X is a subset itself. If X = {A, B}, include POARs from this set but also from set {A, B, C}.

  6. 6.

    Assemble alignment by :

  7. (a)

    Start with an alignment that contains only single residues in each column(i.e. an alignment where all sequences are completely unaligned)

  8. (b)

    Sequentially merge columns if their residues form a POAR present in y.

  9. (c)

    Sort the alignment columns to preserve the order of the residues in the initial sequences.

Practical improvements

The previous version of Mumsa required that all input alignments to be in the Fasta format and contain the sequences in the same order. Both of these limitations were removed in this version to make it easier for users to use several different alignment methods.

Mumsa can produce output alignments in Fasta, Clustal and Macsim [12] alignment formats. When using the Macsim format, the residues in each sequence obtain a reliability score. For residue A this reliability score is the sum of the occurrences of all POARs, that A is involved in, normalized by the maximum reliability score attainable (the score if residue A would be aligned identically in all input alignments). These reliability scores can be visualized by Kalignvu [13] to aid users in distinguish between reliable and unreliable regions.

Accuracy measurement

Alignment accuracy is measured by comparing a test alignment to a reference alignment. The sum-of-pairs score (SPS) [14] and Q-score [15] reflect how many residues are aligned identically in both test and reference alignment. In a sense this reflect how much of the reference alignment was reconstructed. However, no attention is being paid on the number of misaligned residues in the test alignment. For example a SPS score of 0.5 could either mean that 50 percent of the residues in the test alignment are simply not aligned but also that 50 percent are misaligned. We therefore introduce a per-residue accuracy (PRA) score that also has the number of shared POARs as the nominator but the total number of POARs of the test alignment as the denominator. Essentially, this is equivalent to asking how many residues are aligned correctly in the test alignment.

Benchmark

To test our method we used a selection of alignment methods (Table 1) in combination with the Balibase benchmark set [6]. The strategy is as follows:

Table 1 Alignment methods and parameters used in this study.
  1. 1.

    All 12 alignment programs are run on all test cases in Balibase.

  2. 2.

    Mumsa is run on the resulting alignments with varying stringency parameter f.

  3. 3.

    All alignments are scored using the PRA function.

In addition, we calculated the percentage of aligned residues, or the number of aligned residues divided by the total number of possible aligned residues.

The computational properties of Mumsa were tested by comparing the cumulative running time of alignment methods to the running time of Mumsa using three settings. We used the program ROSE [16] to generate sets of input sequences for the alignment methods. Since embarking on this project several alignment packages have been updated and we chose to adopted slightly more recent versions for this evaluation (Table 2). The CPU times were measured on a 2.0 GHz Xeon processor with 4 GB of RAM running Fedora Linux 6.0.

Table 2 Alignment methods used to measure computational properties of Mumsa.

Results and discussion

The PRA scores for all alignment methods used in this study are surprisingly low (Table 3). For reference set 11 all methods fail to align around half of the residues correctly while in other sets a full quarter of the residues are incorrectly aligned. However, in all cases the percentage of aligned residues for all alignment methods is high.

Table 3 PRA scores of various alignment methods for the Balibase subsets. Average PRA and percentage of aligned residues (in brackets) for several alignment methods on Balibase3. The average scores for all of the 12 alignment methods is given in the middle row (bold).

The relaxed alignments created by Mumsa have a much higher per-residue accuracy than those of the input alignment. Depending on the overall difficulty of the subset of benchmark alignments, the increase in accuracy is dramatic. This is especially true for the first Balibase reference set where the accuracy is almost doubled. It is particularly striking that the stringency cutoff f does not have to be high to give good accuracy gains. Residues occurring in more than 25 percent, here three or more, of input alignments are reliable and lead to good relaxed alignments.

As expected, the alignments generated by Mumsa contain fewer aligned residues than the input alignments. Moreover, the higher the cutoff f the fewer residues are aligned. Another observation is that the difficulty of the alignment cases affects the number of aligned residues in the merged alignments. For example, fewer aligned residues are present in the Mumsa alignments for the set 11, the most difficult one, compared to the other ones. In fact, the Pearson correlation coefficient between the average accuracy of input alignments (a measure of alignment difficulty) and the percentage of aligned residues of the most relaxed alignments is 0.98. This supports the notion that alignment programs usually disagree in difficult cases – more precisely in regions that are difficult to align.

The alignment viewer Kalignvu can be used to display the output alignments of Mumsa (Figure 1). A heat-map color scheme highlights the more reliable regions in red tones and less reliable regions in blue tones.

Figure 1
figure 1

A Mumsa alignment visualized by Kalignvu. A relaxed Mumsa alignment derived from a ClustalW, Poa, Kalign, Probcons and Dialign alignment of the Balibase 3.0 test case BB20007. The parameter f was chosen to be two, requiring that residues in the output alignment appear in at least two input alignments. Each residue is colored according to the average occurrence of the POARs it is involved in. Regions that appear in red are identically aligned in all 5 input alignments while green and blue regions are only aligned identically in fewer and fewer cases. It is clear that all alignment programs find conserved motifs in the sequences but disagree on how the residues in between should be aligned.

The running time of Mumsa is very low in comparison to that required by the input alignments (Figure 2). We ran Mumsa using a stringent, moderate and relaxed parameter setting for f. It is clear that the running time is influenced by the choice of f. Nevertheless, the running time of Mumsa remains two to three orders of magnitude lower than that required by the input alignment methods.

Figure 2
figure 2

Running time of Mumsa in comparison to the alignment methods used as input. The running time in CPU seconds for Mumsa using three settings in comparison to the cumulative running time of the alignment programs used to generate the input alignments. The running times of Mumsa were multiplied by 100 to be visible in the plot. The sequence files were generated by ROSE [16] using an average sequence length of 500 residues and and average evolutionary distance of 250. It is clear that the running time of Mumsa is at least two orders of magnitude lower than that required by the alignment programs.

Conclusion

Much emphasis has been given to alignment methods that focus on aligning a large fractions of residues because this often gives high scores on benchmark sets. We argue that it is equally relevant to increase the ration of correctly aligned residues within generated alignments. For this purpose we have introduced an algorithm to extract identically aligned parts from several multiple alignments and have shown that those are very reliable. Our method operates on pairs of aligned residues rather than columns. In addition to being able to identify blocks of high conservation, our method can therefore also identify correctly aligned regions which only few input sequences share.

In contrast to alignments generated by most popular direct alignment method, the relaxed alignments generated by Mumsa contain more gaps and are therefore less compact. As such they are inherently different from traditional alignments and users will have to decide whether these are suitable to be used directly in their specific studies. However, relaxed alignments can always be used to get an overview of conservation, a sense of how trustworthy alignments are and, more importantly, which regions are reliable. Users may wish to examine Mumsa alignments in a hierarchical manner, starting from the most reliable (high stringency cutoff f) to more compact but less reliable alignments (low f). Due to the good computationally properties of Mumsa little extra computing time is required to perform such an analysis. A direct application of relaxed alignments is in phylogenetics where alignment accuracy is of prime importance. An alignment in which 60 percent of the residues are aligned with an accuracy of more than 95 percent is clearly more desirable than an alignment where 90 percent of the residues are aligned, but incorrectly so in a quarter of cases.

Availability and requirements

The Mumsa program is freely available at http://msa.cgb.ki.se or by request from T. Lassmann.