Optimization Letters

, Volume 10, Issue 1, pp 189–205 | Cite as

Computational performance evaluation of two integer linear programming models for the minimum common string partition problem

Original Paper

Abstract

In the minimum common string partition (MCSP) problem two related input strings are given. “Related” refers to the property that both strings consist of the same set of letters appearing the same number of times in each of the two strings. The MCSP seeks a minimum cardinality partitioning of one string into non-overlapping substrings that is also a valid partitioning for the second string. This problem has applications in bioinformatics e.g. in analyzing related DNA or protein sequences. For strings with lengths less than about 1000 letters, a previously published integer linear programming (ILP) formulation yields, when solved with a state-of-the-art solver such as CPLEX, satisfactory results. In this work, we propose a new, alternative ILP model that is compared to the former one. While a polyhedral study shows the linear programming relaxations of the two models to be equally strong, a comprehensive experimental comparison using real-world as well as artificially created benchmark instances indicates substantial computational advantages of the new formulation.

Keywords

Minimum common string partition Bioinformatics Integer linear programming Computational comparison 

References

  1. 1.
    Blum, C., Lozano, J.A., Pinacho Davidson, P.: Iterative probabilistic tree search for the minimum common string partition problem. In: Blesa, M.J., Blum, C., Voss, S. (eds.) Proceedings of HM 20104—9th International Workshop on Hybrid Metaheuristics. Lecture Notes in Computer Science, vol. 8457, pp. 154–154. Springer, Berlin (2014)Google Scholar
  2. 2.
    Blum, C., Lozano, J.A., Pinacho Davidson, P.: Mathematical programming strategies for solving the minimum common string partition problem. Eur. J. Oper. Res. 242(3), 769–777 (2015)CrossRefGoogle Scholar
  3. 3.
    Chen, X., Zheng, J., Fu, Z., Nan, P., Zhong, Y., Lonardi, S., Jiang, T.: Assignment of orthologous genes via genome rearrangement. IEEE/ACM Trans. Comput. Biol. Bioinform. 2(4), 302–315 (2005)CrossRefGoogle Scholar
  4. 4.
    Chrobak, M., Kolman, P., Sgall, J.: The greedy algorithm for the minimum common string partition problem. In: Jansen, K., Khanna, S., Rolim, J.D.P., Ron, D. (eds.) Proceedings of APPROX 2004—7th International Workshop on Approximation Algorithms for Combinatorial Optimization Problems. Lecture Notes in Computer Science, vol. 3122, pp. 84–95. Springer, Berlin (2004)Google Scholar
  5. 5.
    Cormode, G., Muthukrishnan, S.: The string edit distance matching problem with moves. ACM Trans. Algorithms 3(2), 1–19 (2007)MathSciNetCrossRefGoogle Scholar
  6. 6.
    Damaschke, P.: Minimum common string partition parameterized. In: Crandall, K.A., Lagergren, J. (eds.) Proceedings of WABI 2008—8th International Workshop on Algorithms in Bioinformatics. Lecture Notes in Computer Science, vol. 5251, pp. 87–98. Springer, Berlin (2008)Google Scholar
  7. 7.
    Ferdous, S.M., Sohel Rahman, M.: Solving the minimum common string partition problem with the help of ants. In: Tan, Y., Shi, Y., Mo, H. (eds.) Proceedings of ICSI 2013—4th International Conference on Advances in Swarm Intelligence. Lecture Notes in Computer Science, vol. 7928, pp. 306–313. Springer, Berlin (2013)Google Scholar
  8. 8.
    Ferdous, S.M., Sohel Rahman, M.: A MAX–MIN ant colony system for minimum common string partition problem (2014). arXiv:1401.4539
  9. 9.
    Fu, B., Jiang, H., Yang, B., Zhu, B.: Exponential and polynomial time algorithms for the minimum common string partition problem. In: Wang, W., Zhu, X., Du, D.Z. (eds.) Proceedings of COCOA 2011—5th International Conference on Combinatorial Optimization and Applications. Lecture Notes in Computer Science, vol. 6831, pp. 299–310. Springer, Berlin (2011)CrossRefGoogle Scholar
  10. 10.
    Gallardo, J.E.: A multilevel probabilistic beam search algorithm for the shortest common supersequence problem. PLoS One 7(12) (2012)Google Scholar
  11. 11.
    Garey, M.R., Johnson, D.S.: Computers and intractability; a guide to the theory of NP-completeness. W. H. Freeman, San Francisco (1979)Google Scholar
  12. 12.
    Goldstein, A., Kolman, P., Zheng, J.: Minimum common string partition problem: Hardness and approximations. In: Fleischer, R., Trippen, G. (eds.) Proceedings of ISAAC 2004—15th International Symposium on Algorithms and Computation. Lecture Notes in Computer Science, vol. 3341, pp. 484–495. Springer, Berlin (2005)Google Scholar
  13. 13.
    Goldstein, I., Lewenstein, M.: Quick greedy computation for minimum common string partitions. In: Giancarlo, R., Manzini, G. (eds.) Proceedings of CPM 2011—22nd Annual Symposium on Combinatorial Pattern Matching. Lecture Notes in Computer Science, vol. 6661, pp. 273–284. Springer, Berlin (2011)CrossRefGoogle Scholar
  14. 14.
    He, D.: A novel greedy algorithm for the minimum common string partition problem. In: Mandoiu, I., Zelikovsky, A. (eds.) Proceedings of ISBRA 2007—Third International Symposium on Bioinformatics Research and Applications. Lecture Notes in Computer Science, vol. 4463, pp. 441–452. Springer, Berlin (2007)CrossRefGoogle Scholar
  15. 15.
    Hsu, W.J., Du, M.W.: Computing a longest common subsequence for a set of strings. BIT Numer. Math. 24(1), 45–59 (1984). doi:10.1007/BF01934514
  16. 16.
    Jiang, H., Zhu, B., Zhu, D., Zhu, H.: Minimum common string partition revisited. J. Comb. Optim. 23(4), 519–527 (2012)MathSciNetCrossRefMATHGoogle Scholar
  17. 17.
    Kaplan, H., Shafrir, N.: The greedy algorithm for edit distance with moves. Inf. Process. Lett. 97(1), 23–27 (2006)MathSciNetCrossRefMATHGoogle Scholar
  18. 18.
    Kolman, P.: Approximating reversal distance for strings with bounded number of duplicates. In: Jedrzejowicz, J., Szepietowski, A. (eds.) Proceedings of MFCS 2005—30th International Symposium on Mathematical Foundations of Computer Science. Lecture Notes in Computer Science, vol. 3618, pp. 580–590. Springer, Berlin (2005)CrossRefGoogle Scholar
  19. 19.
    Kolman, P., Waleń, T.: Reversal distance for strings with duplicates: linear time approximation using hitting set. In: Erlebach, T., Kaklamanis, C. (eds.) Proceedings of WAOA 2007—4th International Workshop on Approximation and Online Algorithms. Lecture Notes in Computer Science, vol. 4368, pp. 279–289. Springer, Berlin (2007)CrossRefGoogle Scholar
  20. 20.
    Meneses, C., Oliveira, C., Pardalos, P.: Optimization techniques for string selection and comparison problems in genomics. IEEE Eng. Med. Biol. Mag. 24(3), 81–87 (2005)CrossRefGoogle Scholar
  21. 21.
    Mousavi, S., Babaie, M., Montazerian, M.: An improved heuristic for the far from most strings problem. J. Heuristics 18, 239–262 (2012)CrossRefGoogle Scholar
  22. 22.
    Shapira, D., Storer, J.A.: Edit distance with move operations. In: Apostolico, A., Takeda, M. (eds.) Proceedings of CPM 2002—13th Annual Symposium on Combinatorial Pattern Matching. Lecture Notes in Computer Science, vol. 2373, pp. 85–98. Springer, Berlin (2002)CrossRefGoogle Scholar
  23. 23.
    Smith, T., Waterman, M.: Identification of common molecular subsequences. J. Mol. Biol. 147(1), 195–197 (1981)CrossRefGoogle Scholar

Copyright information

© Springer-Verlag Berlin Heidelberg 2015

Authors and Affiliations

  1. 1.Ikerbasque, Basque Foundation for ScienceBilbaoSpain
  2. 2.University of the Basque Country UPV/EHUSan SebastiánSpain
  3. 3.Institute of Computer Graphics and AlgorithmsTU WienViennaAustria

Personalised recommendations