Randomized and Parameterized Algorithms for the Closest String Problem

  • Zhi-Zhong Chen
  • Bin Ma
  • Lusheng Wang
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8486)

Abstract

Given a set S = {s 1, s 2, …, s n } of strings of equal length L and an integer d, the closest string problem (CSP) requires the computation of a string s of length L such that d(s, s i ) ≤ d for each s i  ∈ S, where d(s, s i ) is the Hamming distance between s and s i . The problem is NP-hard and has been extensively studied in the context of approximation algorithms and parameterized algorithms. Parameterized algorithms provide the most practical solutions to its real-life applications in bioinformatics. In this paper we develop the first randomized parameterized algorithms for CSP. Not only are the randomized algorithms much simpler than their deterministic counterparts, their expected-time complexities are also significantly better than the previously best known (deterministic) algorithms.

Keywords

Time Complexity Close String Parameterized Algorithm Full Version Input String 
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

References

  1. 1.
    Böcker, S., Jahn, K., Mixtacki, J., Stoye, J.: Computation of median gene clusters. Journal of Computational Biology 16(8), 1085–1099 (2009)CrossRefMathSciNetGoogle Scholar
  2. 2.
    Boucher, C., Brown, D.G.: Detecting motifs in a large data set: Applying probabilistic insights to motif finding. In: Rajasekaran, S. (ed.) BICoB 2009. LNCS, vol. 5462, pp. 139–150. Springer, Heidelberg (2009)CrossRefGoogle Scholar
  3. 3.
    Ben-Dor, A., Lancia, G., Perone, J., Ravi, R.: Banishing bias from consensus sequences. In: Hein, J., Apostolico, A. (eds.) CPM 1997. LNCS, vol. 1264, pp. 247–261. Springer, Heidelberg (1997)CrossRefGoogle Scholar
  4. 4.
    Chen, Z.-Z., Ma, B., Wang, L.: A three-string approach to the closest string problem. Journal of Computer and System Sciences 78, 164–178 (2012)CrossRefMATHMathSciNetGoogle Scholar
  5. 5.
    Chen, Z.-Z., Wang, L.: Fast exact algorithms for the closest string and substring problems with application to the planted (ℓ,d)-motif model. IEEE/ACM Transactions on Computational Biology and Bioinformatics 8(5), 1400–1410 (2011)CrossRefGoogle Scholar
  6. 6.
    Davila, J., Balla, S., Rajasekaran, S.: Space and time efficient algorithms for planted motif search. In: Proc. of the International Conference on Computational Science, pp. 822–829 (2006)Google Scholar
  7. 7.
    Deng, X., Li, G., Li, Z., Ma, B., Wang, L.: Genetic design of drugs without side-effects. SIAM J. Comput. 32(4), 1073–1090 (2003)CrossRefMATHMathSciNetGoogle Scholar
  8. 8.
    Dopazo, J., Rodríguez, A., Sáiz, J.C., Sobrino, F.: Design of primers for PCR amplification of highly variable genomes. CABIOS 9, 123–125 (1993)Google Scholar
  9. 9.
    Evans, P.A., Smith, A.D.: Complexity of approximating closest substring problems. In: Proc. of the 14th International Symposium on Foundations of Complexity Theory, pp. 210–221 (2003)Google Scholar
  10. 10.
    Fellows, M.R., Gramm, J., Niedermeier, R.: On the parameterized intractability of motif search problems. Combinatorica 26(2), 141–167 (2006)CrossRefMATHMathSciNetGoogle Scholar
  11. 11.
    Frances, M., Litman, A.: On covering problems of codes. Theoret. Comput. Sci. 30, 113–119 (1997)MATHMathSciNetGoogle Scholar
  12. 12.
    Gramm, J., Guo, J., Niedermeier, R.: On exact and approximation algorithms for distinguishing substring selection. In: Lingas, A., Nilsson, B.J. (eds.) FCT 2003. LNCS, vol. 2751, pp. 195–209. Springer, Heidelberg (2003)CrossRefGoogle Scholar
  13. 13.
    Gramm, J., Hüffner, F., Niedermeier, R.: Closest strings, primer design, and motif search. In: Florea, L., et al (eds.), Currents in Computational Molecular Biology. Poster Abstracts of RECOMB 2002, pp. 74–75 (2002)Google Scholar
  14. 14.
    Gramm, J., Niedermeier, R., Rossmanith, P.: Fixed-parameter algorithms for closest string and related problems. Algorithmica 37, 25–42 (2003)CrossRefMATHMathSciNetGoogle Scholar
  15. 15.
    Hufsky, F., Kuchenbecker, L., Jahn, K., Stoye, J., Böcker, S.: Swiftly computing center strings. In: Moulton, V., Singh, M. (eds.) WABI 2010. LNCS, vol. 6293, pp. 325–336. Springer, Heidelberg (2010)CrossRefGoogle Scholar
  16. 16.
    Jiao, Y., Xu, J., Li, M.: On the k-closest substring and k-consensus pattern problems. In: Sahinalp, S.C., Muthukrishnan, S.M., Dogrusoz, U. (eds.) CPM 2004. LNCS, vol. 3109, pp. 130–144. Springer, Heidelberg (2004)CrossRefGoogle Scholar
  17. 17.
    Lanctot, K., Li, M., Ma, B., Wang, S., Zhang, L.: Distinguishing string search problems. Inform. and Comput. 185, 41–55 (2003)CrossRefMATHMathSciNetGoogle Scholar
  18. 18.
    Li, M., Ma, B., Wang, L.: On the closest string and substring problems. J. ACM 49(2), 157–171 (2002)CrossRefMathSciNetGoogle Scholar
  19. 19.
    Lucas, K., Busch, M., Mösinger, S., Thompson, J.A.: An improved microcomputer program for finding gene- or gene family-specific oligonucleotides suitable as primers for polymerase chain reactions or as probes. CABIOS 7, 525–529 (1991)Google Scholar
  20. 20.
    Ma, B., Sun, X.: More efficient algorithms for closest string and substring problems. SIAM J. Comput. 39(4), 1432–1443 (2010)CrossRefMathSciNetGoogle Scholar
  21. 21.
    Marx, D.: Closest substring problems with small distances. SIAM J. Comput. 38(4), 1382–1410 (2008)CrossRefMATHMathSciNetGoogle Scholar
  22. 22.
    Marx, D.: Randomized techniques for parameterized algorithms. In: Thilikos, D.M., Woeginger, G.J. (eds.) IPEC 2012. LNCS, vol. 7535, p. 2. Springer, Heidelberg (2012)CrossRefGoogle Scholar
  23. 23.
    Mauch, H., Melzer, M.J., Hu, J.S.: Genetic algorithm approach for the closest string problem. In: Proc. of the 2nd IEEE Computer Society Bioinformatics Conference (CSB), pp. 560–561 (2003)Google Scholar
  24. 24.
    Meneses, C.N., Lu, Z., Oliveira, C.A.S., Pardalos, P.M.: Optimal solutions for the closest-string problem via integer programming. INFORMS J. Comput. (2004)Google Scholar
  25. 25.
    Nicolas, F., Rivals, E.: Complexities of the centre and median string problems. In: Baeza-Yates, R., Chávez, E., Crochemore, M. (eds.) CPM 2003. LNCS, vol. 2676, pp. 315–327. Springer, Heidelberg (2003)CrossRefGoogle Scholar
  26. 26.
    Proutski, V., Holme, E.C.: Primer master: A new program for the design and analysis of PCR primers. CABIOS 12, 253–255 (1996)Google Scholar
  27. 27.
    Stojanovic, N., Berman, P., Gumucio, D., Hardison, R., Miller, W.: A linear-time algorithm for the 1-mismatch problem. In: Rau-Chaplin, A., Dehne, F., Sack, J.-R., Tamassia, R. (eds.) WADS 1997. LNCS, vol. 1272, pp. 126–135. Springer, Heidelberg (1997)CrossRefGoogle Scholar
  28. 28.
    Wang, L., Dong, L.: Randomized algorithms for motif detection. J. Bioinform. Comput. Biol. 3(5), 1039–1052 (2005)CrossRefMathSciNetGoogle Scholar
  29. 29.
    Wang, L., Zhu, B.: Efficient algorithms for the closest string and distinguishing string selection problems. In: Deng, X., Hopcroft, J.E., Xue, J. (eds.) FAW 2009. LNCS, vol. 5598, pp. 261–270. Springer, Heidelberg (2009)CrossRefGoogle Scholar
  30. 30.
    Wang, Y., Chen, W., Li, X., Cheng, B.: Degenerated primer design to amplify the heavy chain variable region from immunoglobulin cDNA. BMC Bioinform. 7(suppl. 4), S9 (2006)Google Scholar
  31. 31.
    Zhao, R., Zhang, N.: A more efficient closest string algorithm. In: Proc. of the 2nd International Conference on Bioinformatics and Computational Biology (2010)Google Scholar

Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Zhi-Zhong Chen
    • 1
  • Bin Ma
    • 2
  • Lusheng Wang
    • 3
  1. 1.Division of Information System DesignTokyo Denki UniversityHatoyamaJapan
  2. 2.School of Computer ScienceUniversity of WaterlooWaterlooCanada
  3. 3.Department of Computer ScienceCity University of Hong KongKowloonHong Kong SAR

Personalised recommendations