Abstract
The parameterised complexity of various consensus string problems (Closest String, Closest Substring, Closest String with Outliers) is investigated in a more general setting, i.e., with both a bound on the maximum Hamming distance and a bound on the sum of Hamming distances between solution and input strings. We completely settle the parameterised complexity of these generalised variants of Closest String and Closest Substring, and partly settle it for Closest String with Outliers; in addition, we answer some open questions from the literature regarding the classical problem variants with only one distance bound. Finally, we investigate the question of polynomial kernels and respective lower bounds.
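To make the two-bound setting concrete, the following is a minimal sketch of the generalised Closest String condition: a candidate string is accepted only if both its maximum Hamming distance and its sum of Hamming distances to the input strings stay within the given bounds. The function names and the parameters `radius_bound` and `sum_bound` are illustrative assumptions, not notation taken from the paper, and the exhaustive search is exponential in the string length; it is meant only to illustrate the problem definition, not any of the algorithms studied.

```python
from itertools import product


def hamming(u, v):
    """Number of positions at which two equal-length strings differ."""
    if len(u) != len(v):
        raise ValueError("Hamming distance needs strings of equal length")
    return sum(a != b for a, b in zip(u, v))


def is_consensus(s, strings, radius_bound, sum_bound):
    """Check both bounds for candidate s (strings must be non-empty).

    radius_bound and sum_bound are hypothetical parameter names for the
    bound on the maximum distance and the bound on the distance sum.
    """
    dists = [hamming(s, t) for t in strings]
    return max(dists) <= radius_bound and sum(dists) <= sum_bound


def brute_force_consensus(strings, alphabet, radius_bound, sum_bound):
    """Try every string over the alphabet; return the first that fits, else None."""
    length = len(strings[0])
    for cand in product(alphabet, repeat=length):
        s = "".join(cand)
        if is_consensus(s, strings, radius_bound, sum_bound):
            return s
    return None


if __name__ == "__main__":
    inputs = ["aab", "abb", "bbb"]
    print(brute_force_consensus(inputs, "ab", radius_bound=1, sum_bound=2))
```

Setting `sum_bound` to a value that can never be exceeded (for example, the number of strings times their length) recovers the classical variant with only a maximum-distance bound, and an analogous choice of `radius_bound` recovers the variant with only a distance-sum bound.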
Notes
Note that we slightly abuse notation with respect to the subset relation: for a multi-set A and a set B, \(A \subseteq B\) means that \(A' \subseteq B\), where \(A'\) is the set obtained from A by deleting duplicates; for multi-sets A, B, \(A \subseteq B\) is defined as usual. Moreover, whenever it is clear from the context that we talk about multi-sets, we also simply use the term set.
For a corresponding table of the already known results for \({{\,\mathrm{(\textsf {r})-\textsc {CloseSubstr}}\,}}\), see, e.g., [29, Table 1].
References
Amir, A., Landau, G.M., Na, J.C., Park, H., Park, K., Sim, J.S.: Efficient algorithms for consensus string problems minimizing both distance sum and radius. Theor. Comput. Sci. 412, 5239–5246 (2011)
Basavaraju, M., Panolan, F., Rai, A., Ramanujan, M.S., Saurabh, S.: On the kernelization complexity of string problems. Theor. Comput. Sci. 730, 21–31 (2018)
Ben-Dor, A., Lancia, G., Ravi, R., Perone, J.: Banishing bias from consensus sequences. In: Proc. 8th Annual Symposium on Combinatorial Pattern Matching, CPM 1997, LNCS, vol. 1264, pp. 247–261 (1997)
Bodlaender, H.L., Thomassé, S., Yeo, A.: Kernel bounds for disjoint cycles and disjoint paths. Theor. Comput. Sci. 412(35), 4570–4578 (2011)
Bodlaender, H.L., Jansen, B.M.P., Kratsch, S.: Kernelization lower bounds by cross-composition. SIAM J. Discrete Math. 28(1), 277–305 (2014)
Boucher, C., Ma, B.: Closest string with outliers. BMC Bioinformatics 12, S55 (2011)
Bulteau, L., Hüffner, F., Komusiewicz, C., Niedermeier, R.: Multivariate algorithmics for NP-hard string problems. Bull. EATCS 114, 31–73 (2014)
Chen, J., Hermelin, D., Sorge, M.: On computing centroids according to the p-norms of Hamming distance vectors. In: 27th Annual European Symposium on Algorithms, ESA 2019, September 9–11, 2019, Munich/Garching, Germany, pp. 28:1–28:16 (2019)
Cygan, M., Fomin, F., Kowalik, L., Lokshtanov, D., Marx, D., Pilipczuk, M., Pilipczuk, M., Saurabh, S.: Parameterized Algorithms. Springer, New York (2015)
Deng, X., Li, G., Li, Z., Ma, B., Wang, L.: Genetic design of drugs without side-effects. SIAM J. Comput. 32(4), 1073–1090 (2003)
Dopazo, J., Rodríguez, A., Sáiz, J., Sobrino, F.: Design of primers for PCR amplification of highly variable genomes. Comput. Appl. Biosci. 9(2), 123–125 (1993)
Downey, R.G., Fellows, M.R.: Parameterized Complexity. Springer, New York (2012)
Downey, R.G., Fellows, M.R.: Fundamentals of Parameterized Complexity. Texts in Computer Science. Springer, London (2013)
Evans, P.A., Smith, A., Wareham, H.T.: The parameterized complexity of p-center approximate substring problems. Technical Report TR01-149, Faculty of Computer Science, University of New Brunswick, Canada (2001)
Evans, P.A., Smith, A.D., Wareham, H.T.: On the complexity of finding common approximate substrings. Theor. Comput. Sci. 306, 407–430 (2003)
Fellows, M.R., Gramm, J., Niedermeier, R.: On the parameterized intractability of motif search problems. Combinatorica 26, 141–167 (2006)
Flum, J., Grohe, M.: Parameterized Complexity Theory. Springer, Berlin (2006)
Fomin, F.V., Lokshtanov, D., Saurabh, S., Zehavi, M.: Kernelization: Theory of Parameterized Preprocessing. Cambridge University Press, Cambridge (2019)
Frances, M., Litman, A.: On covering problems of codes. Theory Comput. Syst. 30, 113–119 (1997)
Gramm, J., Niedermeier, R., Rossmanith, P.: Fixed-parameter algorithms for closest string and related problems. Algorithmica 37, 25–42 (2003)
Lanctot, J.K., Li, M., Ma, B., Wang, S., Zhang, L.: Distinguishing string selection problems. Inf. Comput. 185, 41–55 (2003)
Lenstra, H.W.: Integer programming with a fixed number of variables. Math. Oper. Res. 8(4), 538–548 (1983)
Li, M., Ma, B., Wang, L.: Finding similar regions in many sequences. J. Comput. Syst. Sci. 65(1), 73–96 (2002). https://doi.org/10.1006/jcss.2002.1823
Lucas, K., Busch, M., Mössinger, S., Thompson, J.A.: An improved microcomputer program for finding gene- or gene family-specific oligonucleotides suitable as primers for polymerase chain reactions or as probes. Comput. Appl. Biosci. 7(4), 525–529 (1991)
Marx, D.: Closest substring problems with small distances. SIAM J. Comput. 38, 1382–1410 (2008)
Pavesi, G., Mauri, G., Pesole, G.: An algorithm for finding signals of unknown length in DNA sequences. Bioinformatics 17, S207–S214 (2001)
Pevzner, P., Sze, S.: Combinatorial approaches to finding subtle signals in DNA strings. In: Proceedings of the 8th International Conference on Intelligent Systems for Molecular Biology, ISMB 2000, pp. 269–278 (2000)
Proutski, V., Holmes, E.C.: Primer master: a new program for the design and analysis of PCR primers. Comput. Appl. Biosci. 12(3), 253–255 (1996)
Schmid, M.L.: Finding consensus strings with small length difference between input and solution strings. ACM Trans. Comput. Theory 9(3), 13:1–13:18 (2017)
Tompa, M., Li, N., Bailey, T.L., Church, G.M., Moor, B.D., Eskin, E., Favorov, A.V., Frith, M.C., Fu, Y., Kent, W.J., Makeev, V.J., Mironov, A.A., Noble, W.S., Pavesi, G., Pesole, G., Régnier, M., Simonis, N., Sinha, S., Thijs, G., van Helden, J., Vandenbogaert, M., Weng, Z., Workman, C., Ye, C., Zhu, Z.: Assessing computational tools for the discovery of transcription factor binding sites. Nat. Biotechnol. 23(1), 137–144 (2005)
Acknowledgements
We wish to thank the anonymous referees for valuable feedback that improved the readability of this paper.
About this article
Cite this article
Bulteau, L., Schmid, M.L. Consensus Strings with Small Maximum Distance and Small Distance Sum. Algorithmica 82, 1378–1409 (2020). https://doi.org/10.1007/s00453-019-00647-9
DOI: https://doi.org/10.1007/s00453-019-00647-9