Part of the book series: SpringerBriefs in Optimization (BRIEFSOPTI)

Abstract

The increasing amount of genomic data and the ability to synthesize artificial DNA constructs pose a series of challenging problems involving the identification and design of sequences with specific properties. We address the identification of such sequences; many of these problems present challenges at both the biological and the computational level. In this chapter, we introduce the main string selection problems and the theoretical and experimental results for the most important instances.

Notes

  1. Informally, the goal of parameterized complexity is to study how the different parameters of an input instance affect the running time of an algorithm.

  2. ZPP is the Zero-error Probabilistic Polynomial time complexity class. It is defined as the class of languages recognized by a probabilistic Turing machine with polynomially bounded average running time and zero error probability [30].

  3. APX is defined as the class of all NP-optimization problems P such that, for some r ≥ 1, there exists a polynomial-time r-approximate algorithm for P [3].

  4. FPT denotes the class of fixed-parameter tractable problems, which are problems that can be solved in time \(f(k)\,\vert x\vert^{\mathcal{O}(1)}\) for some computable function f, where k is the parameter and \(\vert x\vert\) is the size of the input x.
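
To make the FPT running-time shape in the notes more concrete, the following is a minimal sketch of a bounded search tree for the closest string problem under Hamming distance, with the distance bound d as the parameter. It is an illustration chosen here, not code from the chapter; the function names `hamming` and `closest_string` are hypothetical, and all input strings are assumed to have the same length. The search tree has depth at most d and branching factor at most d + 1, so the exponential part of the running time depends only on d, while the work per node is polynomial in the input size.

```python
def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def closest_string(strings, d):
    """Return a string within Hamming distance d of every input string,
    or None if no such string exists.

    Illustrative bounded-search-tree sketch (names are hypothetical).
    Assumes a non-empty list of equal-length strings.
    """

    def search(candidate, budget):
        # budget = number of positions we may still change in the candidate
        if budget < 0:
            return None
        # Prune: a string this far away cannot be repaired with the budget left.
        for s in strings:
            if hamming(candidate, s) > d + budget:
                return None
        for s in strings:
            if hamming(candidate, s) > d:
                # If a solution exists, it agrees with s on at least one of
                # any d + 1 positions where candidate and s differ.
                diff = [p for p in range(len(s)) if candidate[p] != s[p]]
                for p in diff[: d + 1]:
                    nxt = candidate[:p] + s[p] + candidate[p + 1:]
                    found = search(nxt, budget - 1)
                    if found is not None:
                        return found
                return None
        # Every input string is within distance d of the candidate.
        return candidate

    return search(strings[0], d)


if __name__ == "__main__":
    example = ["AAAA", "TTTT"]
    center = closest_string(example, 2)
    print(center, [hamming(center, s) for s in example])
```

With d fixed to a small value, the number of explored nodes is bounded by a function of d alone, which is the behaviour the FPT definition captures; for large d the same sketch becomes impractical.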

References

  1. Amir, A., Paryenty, H., Roditty, L.: Configurations and minority in the string consensus problem. In: String Processing and Information Retrieval, pp. 42–53. Springer, Berlin (2012)

  2. Andoni, A., Indyk, P., Patrascu, M.: On the optimality of the dimensionality reduction method. In: 47th Annual IEEE Symposium on Foundations of Computer Science, 2006 (FOCS’06), pp. 449–458. IEEE, New York (2006)

  3. Ausiello, G.: Complexity and approximation: Combinatorial optimization problems and their approximability properties. Springer, Berlin (1999)

  4. Babaie, M., Mousavi, S.: A memetic algorithm for closest string problem and farthest string problem. In: 18th Iranian Conference on Electrical Engineering (ICEE), pp. 570–575. IEEE, New York (2010)

  5. Bahredar, F., Javadi, H., Moghadam, R., Erfani, H., Navidi, H.: A meta heuristic solution for closest substring problem using ant colony system. Adv. Stud. Biol. 2(4), 179–189 (2010)

  6. Ben-Dor, A., Lancia, G., Ravi, R., Perone, J.: Banishing bias from consensus sequences. In: Combinatorial Pattern Matching, pp. 247–261. Springer, Berlin (1997)

  7. Booker, L., Goldberg, D., Holland, J.: Classifier systems and genetic algorithms. In: Machine Learning: Paradigms and Methods, pp. 235–282 (1990)

  8. Boucher, C., Ma, B.: Closest string with outliers. BMC Bioinformatics 12(Suppl 1), S55 (2011)

  9. Boucher, C., Landau, G.M., Levy, A., Pritchard, D., Weimann, O.: On approximating string selection problems with outliers. In: Proceedings of the 23rd Annual Conference on Combinatorial Pattern Matching, pp. 427–438. Springer, Berlin (2012)

  10. Calhoun, J., Graham, J., Jiang, H.: On using a graphics processing unit to solve the closest substring problem. In: International Conference on Parallel and Distributed Processing Techniques and Applications (PDPTA) (2011)

  11. Casacuberta, F., de Antonio, M.: A greedy algorithm for computing approximate median strings. In: Proceedings of Spanish Symposium on Pattern Recognition and Image Analysis, pp. 193–198. AERFAI (1997)

  12. Chen, Z.Z., Ma, B., Wang, L.: A three-string approach to the closest string problem. J. Comput. Syst. Sci. 78(1), 164–178 (2012)

  13. Chimani, M., Woste, M., Böcker, S.: A closer look at the closest string and closest substring problem. In: Proceedings of the 13th Workshop on Algorithm Engineering and Experiments (ALENEX), pp. 13–24 (2011)

  14. Della Croce, F., Salassa, F.: Improved LP-based algorithms for the closest string problem. Comput. Oper. Res. 39(3), 746–749 (2012)

  15. Deng, X., Li, G., Li, Z., Ma, B., Wang, L.: A PTAS for distinguishing (sub)string selection. In: Automata, Languages and Programming, pp. 788–788 (2002)

  16. Deng, X., Li, G., Wang, L.: Center and distinguisher for strings with unbounded alphabet. J. Comb. Optim. 6(4), 383–400 (2002)

  17. Deng, X., Li, G., Li, Z., Ma, B., Wang, L.: Genetic design of drugs without side-effects. SIAM J. Comput. 32(4), 1073–1090 (2003)

  18. Dinu, L., Ionescu, R.: A genetic approximation of closest string via rank distance. In: 13th International Symposium on Symbolic and Numeric Algorithms for Scientific Computing (SYNASC), pp. 207–214. IEEE, New York (2011)

  19. Dinu, L., Ionescu, R.: An efficient rank based approach for closest string and closest substring. PloS One 7(6), e37576 (2012)

  20. Dorigo, M.: Optimization, learning and natural algorithms. Ph.D. thesis, Dipartimento di Elettronica, Politecnico di Milano (1992)

  21. Dorigo, M., Caro, G., Gambardella, L.: Ant algorithms for discrete optimization. Artif. Life 5(2), 137–172 (1999)

  22. Evans, P., Smith, A.: Complexity of approximating closest substring problems. In: Fundamentals of Computation Theory, pp. 13–47. Springer, Berlin (2003)

  23. Faro, S., Pappalardo, E.: Ant-CSP: An ant colony optimization algorithm for the closest string problem. In: SOFSEM 2010: Theory and Practice of Computer Science, pp. 370–381. Springer Berlin Heidelberg (2010)

  24. Fellows, M., Gramm, J., Niedermeier, R.: On the parameterized intractability of closest substring and related problems. In: STACS 2002, pp. 262–273. Springer Berlin Heidelberg (2002)

  25. Festa, P.: On some optimization problems in molecular biology. Math. Biosci. 207(2), 219–234 (2007)

  26. Festa, P., Pardalos, P.M.: Efficient solutions for the far from most string problem. Ann. Oper. Res. 196(1), 663–682 (2012)

  27. Frances, M., Litman, A.: On covering problems of codes. Theor. Comput. Syst. 30(2), 113–119 (1997)

  28. Ga̧sieniec, L., Jansson, J., Lingas, A.: Efficient approximation algorithms for the Hamming center problem. In: Proceedings of the Tenth Annual ACM-SIAM Symposium on Discrete Algorithms: Society for Industrial and Applied Mathematics, pp. 905–906 (1999)

  29. Gilkerson, J., Jaromczyk, J.: The genetic algorithm scheme for consensus sequences. In: IEEE Congress on Evolutionary Computation, 2007 (CEC 2007), pp. 3870–3878. IEEE, New York (2007)

  30. Gill, J.: Computational complexity of probabilistic Turing machines. SIAM J. Comput. 6(4), 675–695 (1977)

  31. Goldberg, D., Holland, J.: Genetic algorithms and machine learning. Mach. Learn. 3(2), 95–99 (1988)

  32. Gomes, F., Meneses, C., Pardalos, P., Viana, G.: A parallel multistart algorithm for the closest string problem. Comput. Oper. Res. 35(11), 3636–3643 (2008)

  33. Gramm, J., Niedermeier, R., Rossmanith, P.: Exact solutions for closest string and related problems. Algorithms and Computation, pp. 441–453. Springer Berlin Heidelberg (2001)

  34. Gramm, J., Niedermeier, R., Rossmanith, P.: Fixed-parameter algorithms for closest string and related problems. Algorithmica 37(1), 25–42 (2003)

  35. Gramm, J., Guo, J., Niedermeier, R.: On exact and approximation algorithms for distinguishing substring selection. In: Proceedings of Fundamentals of Computation Theory: 14th International Symposium (FCT 2003), Malmö, 12–15 August 2003, vol. 14, p. 195. Springer, Berlin (2003)

  36. Gramm, J., Guo, J., Niedermeier, R.: Parameterized intractability of distinguishing substring selection. Theor. Comput. Syst. 39(4), 545–560 (2006)

  37. Gusfield, D.: Algorithms on Strings, Trees and Sequences: Computer Science and Computational Biology. Cambridge University Press, Cambridge (1997)

  38. Guyon, I., Schomaker, L., Plamondon, R., Liberman, M., Janet, S.: UNIPEN project of on-line data exchange and recognizer benchmarks. In: Proceedings of the 12th IAPR International Conference on Pattern Recognition, vol. 2-Conference B: Computer Vision & Image Processing, vol. 2, pp. 29–33. IEEE, New York (1994)

  39. de la Higuera, C., Casacuberta, F.: Topology of strings: median string is NP-complete. Theor. Comput. Sci. 230(1), 39–48 (2000)

  40. Holland, J.: Adaptation in Natural and Artificial Systems. MIT, Cambridge (1992)

  41. Jiang, X., Abegglen, K., Bunke, H., Csirik, J.: Dynamic computation of generalised median strings. Pattern Anal. Appl. 6(3), 185–193 (2003)

  42. Jiang, X., Bunke, H., Csirik, J.: Median strings: a review. In: Data Mining in Time Series Databases, pp. 173–192 (2004)

  43. Jiang, X., Wentker, J., Ferrer, M.: Generalized median string computation by means of string embedding in vector spaces. Pattern Recognit. Lett. 33(7), 842–852 (2012)

  44. Juan, A., Vidal, E.: Fast median search in metric spaces. In: Advances in Pattern Recognition, pp. 905–912. Springer Berlin Heidelberg (1998)

  45. Julstrom, B.: A data-based coding of candidate strings in the closest string problem. In: Proceedings of the 11th Annual Conference Companion on Genetic and Evolutionary Computation Conference: Late Breaking Papers, pp. 2053–2058. Association for Computing Machinery (2009)

  46. Keith, J., Adams, P., Bryant, D., Kroese, D., Mitchelson, K., Cochran, D., Lala, G.: A simulated annealing algorithm for finding consensus sequences. Bioinformatics 18(11), 1494–1499 (2002)

  47. Kelsey, T., Kotthoff, L.: The exact closest string problem as a constraint satisfaction problem. Arxiv preprint arXiv:1005.0089 (2010)

  48. Kohonen, T.: Median strings. Pattern Recognit. Lett. 3(5), 309–313 (1985)

  49. Kruskal, J.B.: An overview of sequence comparison: time warps, string edits, and macromolecules. SIAM Rev. 25(2), 201–237 (1983)

  50. Kruzslicz, F.: Improved greedy algorithm for computing approximate median strings. Acta Cybern. 14(2), 331–340 (1999)

  51. Lanctot, J.K., Li, M., Ma, B., Wang, S., Zhang, L.: Distinguishing string selection problems. In: Proceedings of the tenth annual ACM-SIAM symposium on Discrete algorithms, pp. 633–642. Society for Industrial and Applied Mathematics (1999)

  52. Li, M., Ma, B., Wang, L.: Finding similar regions in many strings. In: Proceedings of the Thirty-first Annual ACM Symposium on Theory of computing, pp. 473–482. Association for Computing Machinery (1999)

  53. Li, M., Ma, B., Wang, L.: On the closest string and substring problems. J. ACM 49(2), 157–171 (2002)

  54. Liu, X., He, H., Sýkora, O.: Parallel genetic algorithm and parallel simulated annealing algorithm for the closest string problem. In: Advanced Data Mining and Applications, pp. 591–597. Springer Berlin Heidelberg (2005)

  55. Liu, X., Holger, M., Hao, Z., Wu, G.: A compounded genetic and simulated annealing algorithm for the closest string problem. In: The 2nd International Conference on Bioinformatics and Biomedical Engineering, 2008 (ICBBE 2008), pp. 702–705. IEEE, New York (2008)

  56. Liu, X., Liu, S., Hao, Z., Mauch, H.: Exact algorithm and heuristic for the closest string problem. Comput. Oper. Res. 38(11), 1513–1520 (2011)

  57. Lopresti, D., Zhou, J.: Using consensus sequence voting to correct OCR errors. Comput. Vis. Image Underst. 67(1), 39–47 (1997)

  58. Ma, B.: A polynomial time approximation scheme for the closest substring problem. In: Combinatorial Pattern Matching, pp. 99–107. Springer, Berlin (2000)

  59. Ma, B., Sun, X.: More efficient algorithms for closest string and substring problems. In: Research in Computational Molecular Biology, pp. 396–409. Springer, Berlin (2008)

  60. Martínez-Hinarejos, C.D., Juan, A., Casacuberta, F.: Use of median string for classification. In: Proceedings of 15th International Conference on Pattern Recognition, vol. 2, pp. 903–906. IEEE, New York (2000)

  61. Marx, D.: Closest substring problems with small distances. SIAM J. Comput. 38(4), 1382–1410 (2008)

  62. Mauch, H.: Closest substring problem–results from an evolutionary algorithm. In: Neural Information Processing, pp. 205–211. Springer, Berlin (2004)

  63. Mauch, H., Melzer, M., Hu, J.: Genetic algorithm approach for the closest string problem. In: Proceedings of the 2003 IEEE Bioinformatics Conference 2003 (CSB 2003), pp. 560–561 (2003)

  64. McClure, M., Vasi, T., Fitch, W.: Comparative analysis of multiple protein-sequence alignment methods. Mol. Biol. Evol. 11(4), 571 (1994)

  65. Meneses, C., Lu, Z., Oliveira, C., Pardalos, P., et al.: Optimal solutions for the closest-string problem via integer programming. INFORMS J. Comput. 16(4), 419–429 (2004)

  66. Meneses, C., Pardalos, P., Resende, M., Vazacopoulos, A.: Modeling and solving string selection problems. In: Second International Symposium on Mathematical and Computational Biology, pp. 54–64 (2005)

  67. Meneses, C., Oliveira, C., Pardalos, P.: Optimization techniques for string selection and comparison problems in genomics. IEEE Eng. Med. Biol. Mag. 24(3), 81–87 (2005)

  68. Metropolis, N., Ulam, S.: The Monte Carlo method. J. Am. Stat. Assoc. 44(247), 335–341 (1949)

  69. Metropolis, N., Rosenbluth, A., Rosenbluth, M., Teller, A., Teller, E.: Perspective on “Equation of state calculations by fast computing machines”. J. Chem. Phys. 21, 1087–1092 (1953)

  70. Micó, L., Oncina, J.: An approximate median search algorithm in non-metric spaces. Pattern Recognit. Lett. 22(10), 1145–1151 (2001)

  71. Mousavi, S.R.: A hybridization of constructive beam search with local search for far from most strings problem. Int. J. Comput. Math. Sci. 4(7), 340–348 (2010)

  72. Mousavi, S.R., Babaie, M., Montazerian, M.: An improved heuristic for the far from most strings problem. J. Heuristics 18(2), 239–262 (2012)

  73. Nicolas, F., Rivals, E.: Complexities of the centre and median string problems. In: Combinatorial Pattern Matching, pp. 315–327. Springer, Berlin (2003)

  74. Nicolas, F., Rivals, E.: Hardness results for the center and median string problems under the weighted and unweighted edit distances. J. Discrete Algorithms 3(2), 390–415 (2005)

  75. Mousavi, S.R., Nasr Esfahani, N.: A GRASP algorithm for the closest string problem using a probability-based heuristic. Comput. Oper. Res. 39(2), 238–248 (2012)

  76. Silva, R.M.A., Baleeiro, G., Pires, D., Resende, M., Festa, P., Valentim, F.: GRASP with path-relinking for the farthest substring problem. Technical Report, AT&T Labs Research (2008)

  77. Sim, J.S., Park, K.: The consensus string problem for a metric is NP-complete. J. Discrete Algorithms 1(1), 111–117 (2003)

  78. Smith, A.: Common approximate substrings. Ph.D. thesis, Citeseer (2004)

  79. Stojanovic, N., Berman, P., Gumucio, D., Hardison, R., Miller, W.: A linear-time algorithm for the 1-mismatch problem. In: Algorithms and Data Structures, pp. 126–135. Springer Berlin Heidelberg (1997)

  80. Tanaka, S.: A heuristic algorithm based on Lagrangian relaxation for the closest string problem. Comput. Oper. Res. 39(3), 709–717 (2012)

  81. Wang, J., Huang, M., Chen, J.: A lower bound on approximation algorithms for the closest substring problem. In: Combinatorial Optimization and Applications, pp. 291–300. Springer Berlin Heidelberg (2007)

  82. Wang, J., Chen, J., Huang, M.: An improved lower bound on approximation algorithms for the closest substring problem. Inf. Process. Lett. 107(1), 24–28 (2008)

  83. Wang, L., Zhu, B.: Efficient algorithms for the closest string and distinguishing string selection problems. In: Frontiers in Algorithmics, pp. 261–270. Springer Berlin Heidelberg (2009)


Copyright information

© 2013 Elisa Pappalardo, Panos M. Pardalos, Giovanni Stracquadanio

About this chapter

Cite this chapter

Pappalardo, E., Pardalos, P.M., Stracquadanio, G. (2013). String Selection Problems. In: Optimization Approaches for Solving String Selection Problems. SpringerBriefs in Optimization. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-9053-1_4
