Performance analysis of computational approaches to solve Multiple Sequence Alignment
In the biotechnology field, deploying Multiple Sequence Alignment (MSA), a process that demands high-performance computing, is one of the new challenges to address on the new parallel systems. The aim of MSA is to find similar regions in biological sequences. Furthermore, the goal of MSA applications is to align as many sequences as possible with a level of quality that makes the alignment biologically meaningful. An efficiency study of different MSA implementations based on T-Coffee (one of the most widely used MSA aligners) has been performed in order to find new optimizations that may improve the average execution time on multi-core systems. We found that the current parallel implementations have some performance issues that negatively affect the scalability of the process. Finally, a proposed implementation that uses threads in conjunction with a message-passing library is presented, with the aim of optimizing the execution of MSA on multi-core-based clusters.
Keywords: Multiple sequence alignment, T-Coffee, MPI, Distributed computing, Multi-core
This work was supported by the MEyC-Spain under contract TIN 2008-05913 and Consolider CSD2007-0050, and by the CUR of DIUE of GENCAT and the European Social Fund.