Externalizing the Multiple Sequence Alignment Problem with Affine Gap Costs
Multiple sequence alignment (MSA) is a problem in computational biology whose goal is to discover similarities between DNA or protein sequences. A key difficulty with larger instances is that the search exhausts main memory. This paper applies disk-based heuristic search to solve MSA benchmarks. We extend iterative-deepening dynamic programming, a hybrid of dynamic programming and IDA*, to compute alignments that are optimal with respect to similarity metrics and affine gap costs. We achieve considerable savings of main memory with an acceptable time overhead. By scaling buffer sizes, the space-time trade-off can be adapted to the available resources.
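To make the notion of affine gap costs concrete, the following minimal sketch scores an already given alignment. It is not the search algorithm described above. It assumes a sum-of-pairs objective and the convention that a gap run of length k costs gap_open + k * gap_extend; the function names and all cost values are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def pair_cost(a, b, mismatch=1, gap_open=3, gap_extend=1):
    """Cost of one pairwise projection of an alignment (lower is better).

    Assumed convention: a gap run of length k costs gap_open + k * gap_extend.
    All default cost values are illustrative, not taken from the paper.
    """
    cost = 0
    prev_gap = None  # row (0 or 1) that held a gap in the previous kept column
    for x, y in zip(a, b):
        if x == "-" and y == "-":
            continue  # column disappears in the pairwise projection
        if x == "-" or y == "-":
            side = 0 if x == "-" else 1
            if prev_gap != side:
                cost += gap_open  # a new gap run starts here
            cost += gap_extend    # every gap position pays the extension cost
            prev_gap = side
        else:
            cost += 0 if x == y else mismatch
            prev_gap = None       # a residue pair ends any open gap run
    return cost


def sum_of_pairs_cost(rows, **costs):
    """Sum the affine-gap cost over all pairs of rows (assumed objective)."""
    return sum(pair_cost(a, b, **costs) for a, b in combinations(rows, 2))


rows = ["AC-GT", "ACGGT", "A--GT"]
print(sum_of_pairs_cost(rows))
```

Because opening a gap is charged once per run, one gap of length two is cheaper than two separate gaps of length one. This dependence on the preceding column is why a search for optimal alignments under affine costs has to track how each state was reached.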