Externalizing the Multiple Sequence Alignment Problem with Affine Gap Costs

* Final gross prices may vary according to local VAT.

Get Access


Multiple sequence alignment (MSA) is a problem in computational biology with the goal to discover similarities between DNA or protein sequences. One problem in larger instances is that the search exhausts main memory. This paper applies disk-based heuristic search to solve MSA benchmarks. We extend iterative-deepening dynamic programming, a hybrid of dynamic programming and IDA*, for which optimal alignments with respect to similarity metrics and affine gap cost are computed. We achieve considerable savings of main memory with an acceptable time overhead. By scaling buffer sizes, the space-time trade-off can be adapted to existing resources.

The work is supported by DFG in the projects ED-74/3 and ED-74/4.