Abstract
Motif finding is the problem of identifying recurring patterns in sequences. It has been widely studied, and several variants have been proposed. Here, we address the problem of finding common motifs with gaps that are present in all strings of a finite set. We prove that the problem is NP-hard by reducing the multiple longest common subsequence (MLCS) problem to it. We also provide a branch-and-bound algorithm for MLCS and show how it can be extended to find common motifs with gaps once the common factors that occur in all the strings have been identified.
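To make the branch-and-bound idea concrete, the following is a minimal illustrative sketch for MLCS. It is not the algorithm from the paper, whose details are not given in this abstract. The function names `mlcs_branch_and_bound` and `is_subsequence` and the particular bound (for each symbol, the minimum number of remaining occurrences across all strings, summed over symbols) are assumptions chosen for simplicity.

```python
from collections import Counter


def is_subsequence(pattern, text):
    """Return True if pattern can be obtained from text by deleting characters."""
    it = iter(text)
    return all(ch in it for ch in pattern)


def mlcs_branch_and_bound(strings):
    """Return one longest common subsequence of all input strings.

    Illustrative sketch only: exhaustive search with pruning, exponential
    in the worst case (MLCS is NP-hard for an arbitrary number of strings).
    """
    if not strings or any(not s for s in strings):
        return ""
    # Only symbols that occur in every string can appear in a common subsequence.
    alphabet = sorted(set.intersection(*(set(s) for s in strings)))
    best = ""

    def upper_bound(pos):
        # Assumed bound: a symbol can be matched at most as many times as
        # it still occurs in the string where it is rarest.
        counts = [Counter(s[p:]) for s, p in zip(strings, pos)]
        return sum(min(c[a] for c in counts) for a in alphabet)

    def search(pos, prefix):
        nonlocal best
        if len(prefix) > len(best):
            best = prefix
        # Bound step: prune if this branch cannot beat the incumbent.
        if len(prefix) + upper_bound(pos) <= len(best):
            return
        # Branch step: extend the prefix by each symbol, matched at its
        # earliest occurrence at or after the current position in every string.
        for a in alphabet:
            nxt = []
            for s, p in zip(strings, pos):
                i = s.find(a, p)
                if i < 0:
                    break
                nxt.append(i + 1)
            else:
                search(tuple(nxt), prefix + a)

    search(tuple(0 for _ in strings), "")
    return best
```

For example, `mlcs_branch_and_bound(["abcde", "ace", "aXcYe"])` explores the branch `a`, `c`, `e` first and then prunes every other branch, because no remaining branch can exceed length 3. Matching each symbol at its earliest available position is safe because any common subsequence that uses a later occurrence can also be embedded using the earlier one.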
Copyright information
© 2018 Springer International Publishing AG, part of Springer Nature
About this paper
Cite this paper
Sayeed, S.D., Rahman, M.S., Rahman, A. (2018). On Multiple Longest Common Subsequence and Common Motifs with Gaps (Extended Abstract). In: Rahman, M., Sung, WK., Uehara, R. (eds) WALCOM: Algorithms and Computation. WALCOM 2018. Lecture Notes in Computer Science(), vol 10755. Springer, Cham. https://doi.org/10.1007/978-3-319-75172-6_18
DOI: https://doi.org/10.1007/978-3-319-75172-6_18
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-75171-9
Online ISBN: 978-3-319-75172-6
eBook Packages: Computer Science, Computer Science (R0)