A Beam Search for the Longest Common Subsequence Problem Guided by a Novel Approximate Expected Length Calculation

  • Conference paper
Machine Learning, Optimization, and Data Science (LOD 2019)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 11943)

Abstract

The longest common subsequence problem (LCS) aims at finding a longest string that appears as a subsequence in each of a given set of input strings. This is a well-known \(\mathcal{NP}\)-hard problem which has been tackled by many heuristic approaches. Among them, the best-performing ones are based on beam search (BS) but differ significantly in various aspects. In this paper we compare the existing BS-based approaches using a common BS framework that makes the differences more explicit. Furthermore, we derive a novel heuristic function to guide BS, which approximates the expected length of an LCS of random strings. In a rigorous experimental evaluation we compare all BS-based methods from the literature and investigate the impact of our new heuristic guidance. Results show in particular that our novel heuristic guidance frequently leads to significantly better solutions. New best solutions are obtained for a wide range of the existing benchmark instances.
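To make the problem statement and the general idea of beam search concrete, the following is a minimal Python sketch, not the framework or the heuristic from the paper. The names `is_common_subsequence`, `beam_search_lcs`, and `beam_width` are hypothetical, and the ranking rule (prefer nodes whose shortest remaining suffix is longest) is a simple placeholder assumed here for illustration; it is not the approximate expected length calculation the abstract refers to. The sketch assumes a non-empty list of input strings.

```python
from typing import Dict, List, Tuple


def is_common_subsequence(candidate: str, strings: List[str]) -> bool:
    """Return True if candidate is a subsequence of every input string."""
    for s in strings:
        it = iter(s)
        # `ch in it` advances the iterator, so characters must appear in order.
        if not all(ch in it for ch in candidate):
            return False
    return True


def beam_search_lcs(strings: List[str], beam_width: int = 10) -> str:
    """Level-wise beam search returning a common subsequence (a heuristic
    solution, not necessarily a longest one)."""
    # Only characters occurring in every string can extend a solution.
    alphabet = sorted(set.intersection(*(set(s) for s in strings)))
    # A node is (partial solution, next unread position in each string).
    beam: List[Tuple[str, Tuple[int, ...]]] = [("", tuple(0 for _ in strings))]
    best = ""
    while beam:
        children: Dict[Tuple[int, ...], str] = {}
        for prefix, pos in beam:
            for ch in alphabet:
                nxt = []
                for s, p in zip(strings, pos):
                    idx = s.find(ch, p)  # leftmost occurrence at or after p
                    if idx < 0:
                        break  # ch cannot extend this node
                    nxt.append(idx + 1)
                else:
                    # Keep one partial solution per position vector.
                    children.setdefault(tuple(nxt), prefix + ch)
        if not children:
            break
        # Placeholder guidance (an assumption of this sketch): the shortest
        # remaining suffix bounds how far a node can still be extended.
        ranked = sorted(
            children.items(),
            key=lambda kv: min(len(s) - p for s, p in zip(strings, kv[0])),
            reverse=True,
        )
        beam = [(prefix, pos) for pos, prefix in ranked[:beam_width]]
        best = beam[0][0]  # all nodes on one level have the same length
    return best
```

With a small `beam_width` the search is fast but may miss a longest solution; with a width large enough to keep every node of a level, it degenerates into an exhaustive level-wise search. Swapping in a different guidance function only requires changing the `key` used for ranking.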

Notes

  1. Note that even instance sets Rat and Virus contain sequences that are close to random strings.

Acknowledgments

We gratefully acknowledge the financial support of this project by the Doctoral Program “Vienna Graduate School on Computational Optimization” funded by the Austrian Science Foundation (FWF) under contract no. W1260-N35. Moreover, Christian Blum acknowledges the support of LOGISTAR, a project from the European Union’s Horizon 2020 research and innovation programme under grant agreement No. 769142.

Author information

Corresponding author

Correspondence to Marko Djukanovic.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Djukanovic, M., Raidl, G.R., Blum, C. (2019). A Beam Search for the Longest Common Subsequence Problem Guided by a Novel Approximate Expected Length Calculation. In: Nicosia, G., Pardalos, P., Umeton, R., Giuffrida, G., Sciacca, V. (eds) Machine Learning, Optimization, and Data Science. LOD 2019. Lecture Notes in Computer Science, vol 11943. Springer, Cham. https://doi.org/10.1007/978-3-030-37599-7_14

  • DOI: https://doi.org/10.1007/978-3-030-37599-7_14

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-37598-0

  • Online ISBN: 978-3-030-37599-7

  • eBook Packages: Computer Science (R0)
