
Parallel Solutions to the k-difference Primer Problem

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10860)

Abstract

This paper presents parallel solutions to the k-difference primer problem, targeting multicore processors and GPUs. The problem consists of finding the shortest substrings of one sequence that have at least k differences from another sequence. The substrings found are candidate regions to contain primers, which biologists use to amplify a DNA sequence in the laboratory. To the authors’ knowledge, these are the first parallel solutions proposed for the k-difference primer problem. We identified two ways of exploiting parallelism while solving the problem: coarse-grained and fine-grained. Several optimizations were applied to the solutions, such as synchronization overhead reduction, tiling, and speculative prefetch, allowing very long sequences to be analyzed in a reduced execution time. In an experimental performance evaluation using real DNA sequences, the best OpenMP (on a quad-core processor) and CUDA solutions produced speedups of up to 5.6 and 72.8, respectively, compared to the best sequential solution. The performance is not affected even when the sequence length and the number of differences k increase. The best sequential, OpenMP, and CUDA solutions achieved throughputs of 0.16, 0.94, and 11.85 billion symbol comparisons per second, respectively, emphasizing the performance gain of the CUDA solution, which reached 100% GPU occupancy.
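The problem statement above can be illustrated with a small sequential sketch. It is not the paper's implementation: it assumes the common textbook formulation in which, for every start position j of a sequence alpha, one looks for the shortest substring of alpha beginning at j whose edit distance to every substring of a second sequence beta is at least k. The function name `shortest_k_difference_lengths` and its parameters are hypothetical, chosen only for this illustration.

```python
def shortest_k_difference_lengths(alpha, beta, k):
    """For each start j in alpha, return the length of the shortest
    substring alpha[j:j+i] whose edit distance to every substring of
    beta is at least k, or None if no such substring exists.

    Assumed formulation (not taken from the paper): edit distance with
    unit-cost insertions, deletions, and substitutions.
    """
    n = len(beta)
    result = []
    for j in range(len(alpha)):
        # Row 0 is all zeros: the empty prefix matches the empty
        # substring of beta ending at any column (free start in beta).
        prev = [0] * (n + 1)
        found = None
        for i in range(1, len(alpha) - j + 1):
            a = alpha[j + i - 1]
            # Column 0: distance from alpha[j:j+i] to the empty string.
            curr = [i] + [0] * n
            for c in range(1, n + 1):
                cost = 0 if a == beta[c - 1] else 1
                curr[c] = min(prev[c] + 1,         # delete a
                              curr[c - 1] + 1,     # insert beta[c-1]
                              prev[c - 1] + cost)  # match or substitute
            # min(curr) is the smallest distance between alpha[j:j+i]
            # and any substring of beta (free end in beta).
            if min(curr) >= k:
                found = i
                break
            prev = curr
        result.append(found)
    return result


if __name__ == "__main__":
    print(shortest_k_difference_lengths("AAC", "AAG", 1))
```

In this sketch the iterations of the outer loop over j do not depend on one another, and each one fills a dynamic-programming table cell by cell. Whether the coarse- and fine-grained solutions in the paper correspond to these two levels is not stated in the abstract.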

Keywords

Inexact matching, High performance computing, Parallelism, Multicore processor, GPU

Copyright information

© Springer International Publishing AG, part of Springer Nature 2018

Authors and Affiliations

  1. School of Computing, Federal University of Mato Grosso do Sul, Campo Grande, Brazil
