Randomized Fixed-Parameter Algorithms for the Closest String Problem

Algorithmica

Abstract

Given a set \(S = \{s_1, s_2, \ldots , s_n\}\) of strings of equal length \(L\) and an integer \(d\), the closest string problem (CSP) requires the computation of a string \(s\) of length \(L\) such that \(d(s, s_i) \le d\) for each \(s_i \in S\), where \(d(s, s_i)\) is the Hamming distance between \(s\) and \(s_i\). The problem is NP-hard and has been extensively studied in the context of approximation algorithms and fixed-parameter algorithms. Fixed-parameter algorithms provide the most practical solutions to its real-life applications in bioinformatics. In this paper we develop the first randomized fixed-parameter algorithms for CSP. Not only are the randomized algorithms much simpler than their deterministic counterparts, their time complexities are also significantly better than the previously best known (deterministic) algorithms.
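The problem statement above can be made concrete with a small sketch. The code below is not the randomized algorithm developed in the paper; it only illustrates the definition of a valid solution (a string within Hamming distance \(d\) of every input string) and pairs it with an exhaustive search that is feasible only for tiny instances. The function names `hamming`, `is_center`, and `brute_force_closest_string`, and the explicit `alphabet` parameter, are assumptions introduced for illustration.

```python
from itertools import product
from typing import Optional, Sequence


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("strings must have equal length")
    return sum(x != y for x, y in zip(a, b))


def is_center(s: str, strings: Sequence[str], d: int) -> bool:
    """True if s is within Hamming distance d of every input string."""
    return all(hamming(s, t) <= d for t in strings)


def brute_force_closest_string(
    strings: Sequence[str], d: int, alphabet: str
) -> Optional[str]:
    """Exhaustive search over all len(alphabet)**L candidates.

    Illustrative baseline only: exponential in the string length L,
    so it is usable on toy inputs and nothing larger.
    """
    length = len(strings[0])
    for candidate in product(alphabet, repeat=length):
        s = "".join(candidate)
        if is_center(s, strings, d):
            return s
    return None


if __name__ == "__main__":
    example = ["ACGT", "ACGA", "TCGT"]
    print(brute_force_closest_string(example, 1, "ACGT"))
```

The exhaustive search makes the cost of the naive approach visible: its running time grows with the alphabet size raised to the power \(L\), whereas fixed-parameter algorithms aim for running times whose exponential part depends only on the parameter \(d\).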

Acknowledgments

We thank the anonymous referees for very helpful comments. Zhi-Zhong Chen was supported in part by the Grant-in-Aid for Scientific Research of the Ministry of Education, Science, Sports and Culture of Japan, under Grant No. 24500023. Bin Ma was supported in part by the Natural Sciences and Engineering Research Council of Canada (RGPIN 238748). Lusheng Wang was supported by a GRF grant from the Hong Kong SAR government (Project No. CityU 123013) and a grant from the National Foundation of China (Project No. 61373048).

Author information

Corresponding author

Correspondence to Zhi-Zhong Chen.

Additional information

A preliminary version of this paper appeared in the Proceedings of the 25th Annual Symposium on Combinatorial Pattern Matching, 2014.

About this article

Cite this article

Chen, ZZ., Ma, B. & Wang, L. Randomized Fixed-Parameter Algorithms for the Closest String Problem. Algorithmica 74, 466–484 (2016). https://doi.org/10.1007/s00453-014-9952-y

  • DOI: https://doi.org/10.1007/s00453-014-9952-y
