Consensus Strings with Small Maximum Distance and Small Distance Sum

Algorithmica

Abstract

The parameterised complexity of various consensus string problems (Closest String, Closest Substring, Closest String with Outliers) is investigated in a more general setting, i.e., with a bound on the maximum Hamming distance and a bound on the sum of Hamming distances between solution and input strings. We completely settle the parameterised complexity of these generalised variants of Closest String and Closest Substring, and partly settle it for Closest String with Outliers; in addition, we answer some open questions from the literature regarding the classical problem variants with only one distance bound. Finally, we investigate the question of polynomial kernels and respective lower bounds.
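
To make the two distance bounds concrete, the following minimal sketch checks whether a given candidate string respects both a bound on the maximum Hamming distance and a bound on the sum of Hamming distances to a set of equal-length input strings. It only verifies a candidate and is not one of the algorithms from the article; the names `hamming`, `satisfies_bounds`, `d_max` and `d_sum` are illustrative assumptions and do not come from the paper.

```python
from typing import Sequence


def hamming(u: str, v: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    if len(u) != len(v):
        raise ValueError("Hamming distance needs strings of equal length")
    return sum(a != b for a, b in zip(u, v))


def satisfies_bounds(candidate: str, strings: Sequence[str],
                     d_max: int, d_sum: int) -> bool:
    """Check both bounds for one candidate (hypothetical helper).

    d_max bounds the largest distance to any input string,
    d_sum bounds the total distance to all input strings.
    """
    dists = [hamming(candidate, s) for s in strings]
    # max(..., default=0) keeps the check well-defined for an empty input.
    return max(dists, default=0) <= d_max and sum(dists) <= d_sum


inputs = ["aab", "abb", "bbb"]
# "abb" has distances 1, 0, 1: maximum 1 and sum 2.
print(satisfies_bounds("abb", inputs, d_max=1, d_sum=2))
# "aab" has distances 0, 1, 2: the sum is 3, but the maximum 2 exceeds 1.
print(satisfies_bounds("aab", inputs, d_max=1, d_sum=3))
```

The second call shows why the two bounds are not interchangeable: a candidate can meet the sum bound while still lying too far from a single input string.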

Notes

  1. Note that we slightly abuse notation with respect to the subset relation: for a multi-set A and a set B, \(A \subseteq B\) means that \(A' \subseteq B\), where \(A'\) is the set obtained from A by deleting duplicates; for multi-sets A and B, \(A \subseteq B\) is defined as usual. Moreover, whenever it is clear from the context that we talk about multi-sets, we also simply use the term set.

  2. For a corresponding table of the already known results for \((\textsf{r})\text{-}\textsc{CloseSubstr}\), see, e.g., [29, Table 1].

References

  1. Amir, A., Landau, G.M., Na, J.C., Park, H., Park, K., Sim, J.S.: Efficient algorithms for consensus string problems minimizing both distance sum and radius. Theor. Comput. Sci. 412, 5239–5246 (2011)

  2. Basavaraju, M., Panolan, F., Rai, A., Ramanujan, M.S., Saurabh, S.: On the kernelization complexity of string problems. Theor. Comput. Sci. 730, 21–31 (2018)

  3. Ben-Dor, A., Lancia, G., Ravi, R., Perone, J.: Banishing bias from consensus sequences. In: Proc. 8th Annual Symposium on Combinatorial Pattern Matching, CPM 1997, LNCS, vol. 1264, pp. 247–261 (1997)

  4. Bodlaender, H.L., Thomassé, S., Yeo, A.: Kernel bounds for disjoint cycles and disjoint paths. Theor. Comput. Sci. 412(35), 4570–4578 (2011)

  5. Bodlaender, H.L., Jansen, B.M.P., Kratsch, S.: Kernelization lower bounds by cross-composition. SIAM J. Discrete Math. 28(1), 277–305 (2014)

  6. Boucher, C., Ma, B.: Closest string with outliers. BMC Bioinformatics 12, S55 (2011)

  7. Bulteau, L., Hüffner, F., Komusiewicz, C., Niedermeier, R.: Multivariate algorithmics for NP-hard string problems. Bull. EATCS 114, 31–73 (2014)

  8. Chen, J., Hermelin, D., Sorge, M.: On computing centroids according to the p-norms of Hamming distance vectors. In: 27th Annual European Symposium on Algorithms, ESA 2019, September 9–11, 2019, Munich/Garching, Germany, pp. 28:1–28:16 (2019)

  9. Cygan, M., Fomin, F., Kowalik, L., Lokshtanov, D., Marx, D., Pilipczuk, M., Pilipczuk, M., Saurabh, S.: Parameterized Algorithms. Springer, New York (2015)

  10. Deng, X., Li, G., Li, Z., Ma, B., Wang, L.: Genetic design of drugs without side-effects. SIAM J. Comput. 32(4), 1073–1090 (2003)

  11. Dopazo, J., Rodríguez, A., Sáiz, J., Sobrino, F.: Design of primers for PCR amplification of highly variable genomes. Comput. Appl. Biosci. 9(2), 123–125 (1993)

  12. Downey, R.G., Fellows, M.R.: Parameterized Complexity. Springer, New York (2012)

  13. Downey, R.G., Fellows, M.R.: Fundamentals of Parameterized Complexity. Texts in Computer Science. Springer, London (2013)

  14. Evans, P.A., Smith, A., Wareham, H.T.: The parameterized complexity of p-center approximate substring problems. Technical Report TR01-149, Faculty of Computer Science, University of New Brunswick, Canada (2001)

  15. Evans, P.A., Smith, A.D., Wareham, H.T.: On the complexity of finding common approximate substrings. Theor. Comput. Sci. 306, 407–430 (2003)

  16. Fellows, M.R., Gramm, J., Niedermeier, R.: On the parameterized intractability of motif search problems. Combinatorica 26, 141–167 (2006)

  17. Flum, J., Grohe, M.: Parameterized Complexity Theory. Springer, Berlin (2006)

  18. Fomin, F.V., Lokshtanov, D., Saurabh, S., Zehavi, M.: Kernelization: Theory of Parameterized Preprocessing. Cambridge University Press, Cambridge (2019)

  19. Frances, M., Litman, A.: On covering problems of codes. Theory Comput. Syst. 30, 113–119 (1997)

  20. Gramm, J., Niedermeier, R., Rossmanith, P.: Fixed-parameter algorithms for closest string and related problems. Algorithmica 37, 25–42 (2003)

  21. Lanctot, J.K., Li, M., Ma, B., Wang, S., Zhang, L.: Distinguishing string selection problems. Inf. Comput. 185, 41–55 (2003)

  22. Lenstra, H.W.: Integer programming with a fixed number of variables. Math. Oper. Res. 8(4), 538–548 (1983)

  23. Li, M., Ma, B., Wang, L.: Finding similar regions in many sequences. J. Comput. Syst. Sci. 65(1), 73–96 (2002). https://doi.org/10.1006/jcss.2002.1823

  24. Lucas, K., Busch, M., Mössinger, S., Thompson, J.A.: An improved microcomputer program for finding gene- or gene family-specific oligonucleotides suitable as primers for polymerase chain reactions or as probes. Comput. Appl. Biosci. 7(4), 525–529 (1991)

  25. Marx, D.: Closest substring problems with small distances. SIAM J. Comput. 38, 1382–1410 (2008)

  26. Pavesi, G., Mauri, G., Pesole, G.: An algorithm for finding signals of unknown length in DNA sequences. Bioinformatics 17, S207–S214 (2001)

  27. Pevzner, P., Sze, S.: Combinatorial approaches to finding subtle signals in DNA strings. In: Proceedings of the 8th International Conference on Intelligent Systems for Molecular Biology, ISMB 2000, pp. 269–278 (2000)

  28. Proutski, V., Holmes, E.C.: Primer master: a new program for the design and analysis of PCR primers. Comput. Appl. Biosci. 12(3), 253–255 (1996)

  29. Schmid, M.L.: Finding consensus strings with small length difference between input and solution strings. ACM Trans. Comput. Theory 9(3), 13:1–13:18 (2017)

  30. Tompa, M., Li, N., Bailey, T.L., Church, G.M., Moor, B.D., Eskin, E., Favorov, A.V., Frith, M.C., Fu, Y., Kent, W.J., Makeev, V.J., Mironov, A.A., Noble, W.S., Pavesi, G., Pesole, G., Régnier, M., Simonis, N., Sinha, S., Thijs, G., van Helden, J., Vandenbogaert, M., Weng, Z., Workman, C., Ye, C., Zhu, Z.: Assessing computational tools for the discovery of transcription factor binding sites. Nat. Biotechnol. 23(1), 137–144 (2005)


Acknowledgements

We wish to thank the anonymous referees for valuable feedback that improved the readability of this paper.

Author information

Corresponding author

Correspondence to Markus L. Schmid.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Bulteau, L., Schmid, M.L. Consensus Strings with Small Maximum Distance and Small Distance Sum. Algorithmica 82, 1378–1409 (2020). https://doi.org/10.1007/s00453-019-00647-9

  • DOI: https://doi.org/10.1007/s00453-019-00647-9
