Abstract
The longest common subsequence (LCS) problem aims at finding a longest string that appears as a subsequence in each of a given set of input strings. This is a well-known \(\mathcal{NP}\)-hard problem that has been tackled by many heuristic approaches. Among them, the best-performing ones are based on beam search (BS) but differ significantly in various aspects. In this paper, we compare the existing BS-based approaches within a common BS framework, which makes the differences more explicit. Furthermore, we derive a novel heuristic function to guide BS, which approximates the expected length of an LCS of random strings. In a rigorous experimental evaluation, we compare all BS-based methods from the literature and investigate the impact of our new heuristic guidance. The results show in particular that our novel heuristic guidance frequently leads to significantly better solutions. New best solutions are obtained for a wide range of the existing benchmark instances.
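To make the setting concrete, the following is a minimal Python sketch of a generic beam search for the LCS problem. It is an illustration only and not the framework or the novel heuristic of this paper: the guidance function `upper_bound` is a simple stand-in (the sum, over all characters, of the minimum number of remaining occurrences across the strings), and all names (`beam_search_lcs`, `upper_bound`, `is_subsequence`, `beam_width`) are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple


def is_subsequence(candidate: str, s: str) -> bool:
    """Return True if `candidate` can be obtained from `s` by deleting characters."""
    it = iter(s)
    return all(ch in it for ch in candidate)


def upper_bound(strings: Sequence[str], positions: Tuple[int, ...]) -> int:
    """Stand-in guidance (an assumption, not the paper's heuristic): for each
    character, take the minimum number of remaining occurrences over all
    strings, and sum these minima. This bounds the length of any extension."""
    remaining = [s[p:] for s, p in zip(strings, positions)]
    return sum(min(r.count(c) for r in remaining) for c in set(remaining[0]))


def beam_search_lcs(strings: Sequence[str], beam_width: int = 10) -> str:
    """Heuristically build a common subsequence, keeping `beam_width` nodes per level."""
    if not strings or any(not s for s in strings):
        return ""
    # Only characters present in every string can appear in a common subsequence.
    alphabet = sorted(set.intersection(*(set(s) for s in strings)))
    # A node is (positions, partial solution); positions[i] is the index in
    # strings[i] from which the search continues.
    beam: List[Tuple[Tuple[int, ...], str]] = [(tuple(0 for _ in strings), "")]
    best = ""
    while beam:
        candidates: Dict[Tuple[int, ...], str] = {}
        for positions, partial in beam:
            for c in alphabet:
                nxt = []
                for s, p in zip(strings, positions):
                    idx = s.find(c, p)  # next occurrence of c at or after p
                    if idx == -1:
                        break
                    nxt.append(idx + 1)
                else:
                    # Nodes with identical positions on the same level are
                    # equivalent, so one representative is kept.
                    candidates.setdefault(tuple(nxt), partial + c)
        if not candidates:
            break
        ranked = sorted(candidates.items(),
                        key=lambda kv: upper_bound(strings, kv[0]),
                        reverse=True)
        beam = ranked[:beam_width]
        best = beam[0][1]  # all nodes on a level have the same length
    return best


if __name__ == "__main__":
    inputs = ["ABCBDAB", "BDCABA"]
    result = beam_search_lcs(inputs, beam_width=100)
    print(result, all(is_subsequence(result, s) for s in inputs))
```

With a beam width large enough to keep every node, this sketch explores all levels exhaustively; with a small beam width, it trades solution quality for speed, which is where the choice of guidance function matters.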
Notes
1. Note that even the instance sets Rat and Virus contain sequences that are close to random strings.
Acknowledgments
We gratefully acknowledge the financial support of this project by the Doctoral Program “Vienna Graduate School on Computational Optimization” funded by the Austrian Science Foundation (FWF) under contract no. W1260-N35. Moreover, Christian Blum acknowledges the support of LOGISTAR, a project from the European Union’s Horizon 2020 research and innovation programme under grant agreement No. 769142.
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Djukanovic, M., Raidl, G.R., Blum, C. (2019). A Beam Search for the Longest Common Subsequence Problem Guided by a Novel Approximate Expected Length Calculation. In: Nicosia, G., Pardalos, P., Umeton, R., Giuffrida, G., Sciacca, V. (eds) Machine Learning, Optimization, and Data Science. LOD 2019. Lecture Notes in Computer Science, vol 11943. Springer, Cham. https://doi.org/10.1007/978-3-030-37599-7_14
DOI: https://doi.org/10.1007/978-3-030-37599-7_14
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-37598-0
Online ISBN: 978-3-030-37599-7
eBook Packages: Computer Science (R0)