hLCS. A Hybrid GPGPU Approach for Solving Multiple Short and Unbalanced LCS Problems

  • Pedro Valero-Lara
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8584)


The “Longest Common Subsequence” is one of the most widely used and well-known methods within the similarity search community, applicable to a wide range of fields. Currently, modern multicore CPU and GPU-based systems offer an impressive cost/performance ratio and are an attractive test platform to accelerate response time and increase the number of problems solved per second. The use of GPUs for carrying out sequences alignment is widely extended for bioinformatics applications. However, we focus on the use of this algorithm applied to other problems which supposes a new and different approach. In particular, the most important difference is found in the pattern of the sequences. While, on one hand, the size of the biological sequences are large and similar, on the other hand, the sequences in other applications are short and unbalanced. Furthermore, this work aims to use one multicore CPU and GPU system for computing multiple problems simultaneously instead of computing only one. The main contribution of this work is a new hybrid approach which combines the two classical parallel techniques for our problem in two different phases. This new implementation is up to 80× and 25× faster, in terms of speedup, over the sequential and multicore counterpart respectively for our particular problem, that is, solving multiple “Longest Common Subsequence” problems on short and unbalanced sequences with a high ratio of problems solved per second.


Longest Common Subsequence parallel algorithms multicore and GPU 


Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.


  1. 1.
    Asanovic, K., Bodik, R., Christopher, B., Gebis, C.J.J., Husbands, P., Keutzer, K., Patterson, D.A., Plishker, W.L., Shalf, J., Williams, S.W., Yelick, K.A.: The landscape of parallel computing research: A view from berkeley. Technical Report UCB/EECS-2006-183, EECS Department, University of California, Berkeley (2006)Google Scholar
  2. 2.
    Geer, D.: Chip markets turn to multicore processors. Computer 38(5), 11–13 (2005)CrossRefGoogle Scholar
  3. 3.
    Xiao, L., Song, H., Zhu, M., Kuang, Y.: A PC cluster based Parallel Algorithm for Longest Common Subsequences Problems. In: The International Conference on Bioinformatics and Biomedical Engineering, ICBBE 2008 (2008)Google Scholar
  4. 4.
    Bonizzoni, P., Vedova, G.D., Mauri, G.: Experimenting an Approximation Algorithm for the LCS. Discrete Applied Mathematics (1998)Google Scholar
  5. 5.
    Dowling, G., May, P.: Aproximate String Matching. Journal ACM Computing Surveys (1980)Google Scholar
  6. 6.
    Navarro, G.: A Guided Tour to Approximate String Matching. Journal ACM Computing Surveys (1999)Google Scholar
  7. 7.
    Ning, K., Ng, H.K., Wai, L.H.: Finding Patterns in Biological Sequences by Longest CommonSubsequences and Shortest Common Subsequences. In: IEEE Symposium BioInformatics and BioEngineering, BIBE (2006)Google Scholar
  8. 8.
    Bentley, J., Mcilroy, D.: Data Compression using Long Common Strings. In: IEEE Data Compression Conference (1999)Google Scholar
  9. 9.
    Heckel, P.: A technique for isolating differences between files. Communications of ACM (1978)Google Scholar
  10. 10.
    Kumar, S., Spafford, E.: A pattern-matching model for intrusion detection. In: National Computer Security Conference (1994)Google Scholar
  11. 11.
    Uribe-Paredes, R., Valero-Lara, P., Arias, E., Sánchez, J.L., Cazorla, D.: A GPU-Based Implementation for Range Queries on Spaghettis Data Structure. In: Murgante, B., Gervasi, O., Iglesias, A., Taniar, D., Apduhan, B.O. (eds.) ICCSA 2011, Part I. LNCS, vol. 6782, pp. 615–629. Springer, Heidelberg (2011)CrossRefGoogle Scholar
  12. 12.
    Uribe-Paredes, R., Valero-Lara, P., Arias, E., Sánchez, J.L., Cazorla, D.: Similarity search implementations for multi-core and many-core processors. In: The 2011 International Conference on High Performance Computing & Simulation (HPCS), pp. 656–663 (2011)Google Scholar
  13. 13.
    Uribe-Paredes, R., Arias, E., Sánchez, J.L., Cazorla, D., Valero-Lara, P.: Improving the Performance for the Range Search on Metric Spaces Using a Multi-GPU Platform. In: Liddle, S.W., Schewe, K.-D., Tjoa, A.M., Zhou, X. (eds.) DEXA 2012, Part II. LNCS, vol. 7447, pp. 442–449. Springer, Heidelberg (2012)CrossRefGoogle Scholar
  14. 14.
    GPGPU. General-purpose computation using graphics hardware,
  15. 15.
    Rost, R.J.: OpenGL Shading Language. Addison-Wesley (2005)Google Scholar
  16. 16.
    Feng, W.C., Manocha, D.: High-performance computing using accelerators. Parallel Computing 33, 645–647 (2007)CrossRefGoogle Scholar
  17. 17.
    Mark, W.R., Glanville, S.R., Akeley, K., Kilgard, M.J.: Cg: a system for programming graphics hardware in a C-like language. In: SIGGRAPH 2003: ACM SIGGRAPH 2003 Papers, pp. 896–907. ACM Press (2003)Google Scholar
  18. 18.
    NVIDIA. NVIDIA Compute Unified Device Architecture (CUDA) Programming Guide,
  19. 19.
    Manavski, S.A., Valle, G.: CUDA compatible GPU cards as efficient hardware accelerators for Smith-Waterman sequence alignment. BMC Bioinformatics (2008)Google Scholar
  20. 20.
    Kloetzli, J., Strege, B., Decker, J., Olano, M.: Parallel Longest Common Subsequence using Graphics Hardware. In: Eurographics Symposium on Parallel Graphics and Visualization (2008)Google Scholar
  21. 21.
    Flavius, E., Alba, S., de Melo, C.M.A.: CUDAlign: Using GPU to Accelerate the Comparison of Megabase Genomic Sequences. In: Principles and Practice of Parallel Programming, PPoPP (2010)Google Scholar
  22. 22.
    Singh, J., Aruni, I.: Accelerating Smith-Waterman on Heterogeneous CPU-GPU Systems. In: International Conference on Bioinformatics and Biomedical Engineering (iCBBE), p. 14 (2011)Google Scholar
  23. 23.
    Ligowski, L., Rudnicki, W.: An efficient implementation of Smith Waterman algorithm on GPU using CUDA, for massively parallel scanning of sequence databases. In: IEEE International Symposium Parallel & Distributed Processing (IPDPS), p. 18 (2009)Google Scholar
  24. 24.
    Liu, Y., Maskell, D.L., Schmidt, B.: CUDASW++: optimizing Smith-Waterman sequence database searches for CUDA-enabled graphics processing units. BMC Res. Notes 2, 73 (2009)CrossRefGoogle Scholar
  25. 25.
    Du, Z., Yin, Z., Bader, D.: A tile-based parallel Viterbi algorithm for biological sequence alignment on GPU with CUDA. In: IEEE International Symposium on Parallel & Distributed Processing Workshops (IPDPSW), pp. 1–8 (2010)Google Scholar
  26. 26.
    Striemer, G.M., Akoglu, A.: Sequence alignment with GPU: Performance and design challenges. In: IEEE International Symposium on Parallel & Distributed Processing (IPDPS), pp. 1–10 (2009)Google Scholar
  27. 27.
  28. 28.

Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Pedro Valero-Lara
    • 1
    • 2
  1. 1.Research Center for Energy, Environment and Technology, CIEMATNumerical Simulation UnitMadridSpain
  2. 2.School of Computer ScienceThe Complutense University of MadridSpain

Personalised recommendations