Approximating Weighted Duo-Preservation in Comparative Genomics

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10392)

Abstract

Motivated by comparative genomics, Chen et al. [9] introduced the Maximum Duo-preservation String Mapping (MDSM) problem, in which we are given two strings \(s_1\) and \(s_2\) over the same alphabet and the goal is to find a mapping \(\pi\) between them that maximizes the number of preserved duos. A duo is any pair of consecutive characters in a string; it is preserved by the mapping if its two consecutive characters in \(s_1\) are mapped to the same two consecutive characters in \(s_2\). The MDSM problem is known to be NP-hard, and approximation algorithms exist for it [3, 5], but all of them consider only the “unweighted” version, in which a duo from \(s_1\) is preserved by mapping it to any identical duo in \(s_2\), regardless of their positions in the respective strings. In comparative genomics, however, it is desirable to find mappings that favor preserving duos that are “closer” to each other under some distance measure [18].
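The notion of a preserved duo can be made concrete with a small sketch. It assumes the usual reading of the problem, which the abstract does not spell out: \(s_1\) and \(s_2\) have equal length, and \(\pi\) is a bijection on positions that sends every character to an equal character. The function name `preserved_duos` and the list encoding of \(\pi\) are illustrative choices, not notation from the paper.

```python
def preserved_duos(s1, s2, pi):
    """Count the duos of s1 that the position mapping pi preserves.

    pi[i] is the position in s2 to which position i of s1 is mapped.
    Assumption (not stated in the abstract): pi is a bijection on
    positions and maps each character to an equal character.
    """
    n = len(s1)
    if len(s2) != n or sorted(pi) != list(range(n)):
        raise ValueError("pi must be a bijection between positions of s1 and s2")
    if any(s1[i] != s2[pi[i]] for i in range(n)):
        raise ValueError("pi must map each character to an equal character")
    # The duo (i, i+1) is preserved when its two characters land on
    # consecutive positions of s2, in the same order.
    return sum(1 for i in range(n - 1) if pi[i + 1] == pi[i] + 1)


s1, s2 = "abcab", "ababc"
print(preserved_duos(s1, s2, [2, 3, 4, 0, 1]))  # "abc" and the second "ab" move as blocks
print(preserved_duos(s1, s2, [0, 1, 4, 2, 3]))  # only the two "ab" duos stay together
```

Under these assumptions the first mapping preserves three duos and the second preserves two, so the unweighted objective prefers the first.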

In this paper, we introduce a generalized version of the problem, called the Maximum-Weight Duo-preservation String Mapping (MWDSM) problem, which captures both duo preservation and duo distance: mapping a duo from \(s_1\) to a preserved duo in \(s_2\) carries a weight that indicates the “closeness” of the two duos. The objective of the MWDSM problem is to find a mapping that maximizes the total weight of the preserved duos. We give a polynomial-time 6-approximation algorithm for this problem.
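A minimal sketch of the weighted objective follows. It evaluates a given mapping and is not the 6-approximation algorithm. The abstract does not fix a particular weight, so the function `closeness` below, which decays with the positional offset between the two duos, is purely a hypothetical stand-in, and `preserved_weight` is an illustrative name. As in the unweighted setting, the mapping is assumed to be a character-preserving bijection on positions.

```python
from fractions import Fraction


def closeness(i, j):
    # Hypothetical weight, NOT taken from the paper: duos that start at
    # nearby positions in s1 and s2 are considered "closer".
    return Fraction(1, 1 + abs(i - j))


def preserved_weight(s1, s2, pi, weight=closeness):
    """Total weight of the duos of s1 preserved by the position mapping pi."""
    n = len(s1)
    if len(s2) != n or sorted(pi) != list(range(n)):
        raise ValueError("pi must be a bijection between positions of s1 and s2")
    if any(s1[i] != s2[pi[i]] for i in range(n)):
        raise ValueError("pi must map each character to an equal character")
    # Sum the weight of every preserved duo; the duo starting at i in s1
    # is matched with the duo starting at pi[i] in s2.
    return sum(weight(i, pi[i]) for i in range(n - 1) if pi[i + 1] == pi[i] + 1)


s1, s2 = "abcab", "ababc"
print(preserved_weight(s1, s2, [2, 3, 4, 0, 1]))  # three preserved duos, all shifted
print(preserved_weight(s1, s2, [0, 1, 4, 2, 3]))  # two preserved duos, nearly in place
```

With this illustrative weight, the first mapping preserves three duos for a total of 11/12, while the second preserves only two duos for a total of 3/2, so the weighted objective can prefer a mapping that the unweighted count would rank lower.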

References

  1. Bar-Yehuda, R., Even, S.: A local-ratio theorem for approximating the weighted vertex cover problem. In: Ausiello, G., Lucertini, M. (eds.) Analysis and Design of Algorithms for Combinatorial Problems, vol. 109, pp. 27–45. North-Holland (1985)
  2. Beretta, S., Castelli, M., Dondi, R.: Parameterized tractability of the maximum-duo preservation string mapping problem. Theor. Comput. Sci. 646, 16–25 (2016)
  3. Boria, N., Cabodi, G., Camurati, P., Palena, M., Pasini, P., Quer, S.: A 7/2-approximation algorithm for the maximum duo-preservation string mapping problem. In: Proceedings of the 27th Annual Symposium on Combinatorial Pattern Matching (CPM 2016), Tel Aviv, Israel, pp. 11:1–11:8 (2016)
  4. Boria, N., Kurpisz, A., Leppänen, S., Mastrolilli, M.: Improved approximation for the maximum duo-preservation string mapping problem. In: Brown, D., Morgenstern, B. (eds.) WABI 2014. LNCS, vol. 8701, pp. 14–25. Springer, Heidelberg (2014). doi: 10.1007/978-3-662-44753-6_2
  5. Brubach, B.: Further improvement in approximating the maximum duo-preservation string mapping problem. In: Frith, M., Storm Pedersen, C.N. (eds.) WABI 2016. LNCS, vol. 9838, pp. 52–64. Springer, Cham (2016). doi: 10.1007/978-3-319-43681-4_5
  6. Bulteau, L., Fertin, G., Komusiewicz, C., Rusu, I.: A fixed-parameter algorithm for minimum common string partition with few duplications. In: Darling, A., Stoye, J. (eds.) WABI 2013. LNCS, vol. 8126, pp. 244–258. Springer, Heidelberg (2013). doi: 10.1007/978-3-642-40453-5_19
  7. Bulteau, L., Komusiewicz, C.: Minimum common string partition parameterized by partition size is fixed-parameter tractable. In: Proceedings of the Twenty-Fifth Annual ACM-SIAM Symposium on Discrete Algorithms (SODA 2014), Portland, Oregon, USA, pp. 102–121 (2014)
  8. Chan, T.M., Har-Peled, S.: Approximation algorithms for maximum independent set of pseudo-disks. Discrete Comput. Geometry 48(2), 373–392 (2012)
  9. Chen, W., Chen, Z., Samatova, N.F., Peng, L., Wang, J., Tang, M.: Solving the maximum duo-preservation string mapping problem with linear programming. Theor. Comput. Sci. 530, 1–11 (2014)
  10. Chen, X., Zheng, J., Zheng, F., Nan, P., Zhong, Y., Lonardi, S., Jiang, T.: Assignment of orthologous genes via genome rearrangement. IEEE/ACM Trans. Comput. Biology Bioinform. 2(4), 302–315 (2005)
  11. Chrobak, M., Kolman, P., Sgall, J.: The greedy algorithm for the minimum common string partition problem. In: Jansen, K., Khanna, S., Rolim, J.D.P., Ron, D. (eds.) APPROX/RANDOM 2004. LNCS, vol. 3122, pp. 84–95. Springer, Heidelberg (2004). doi: 10.1007/978-3-540-27821-4_8
  12. Cormode, G., Muthukrishnan, S.: The string edit distance matching problem with moves. ACM Trans. Algorithms 3(1), 2:1–2:19 (2007)
  13. Dudek, B., Gawrychowski, P., Ostropolski-Nalewaja, P.: A family of approximation algorithms for the maximum duo-preservation string mapping problem. CoRR, abs/1702.02405 (2017)
  14. Goldstein, A., Kolman, P., Zheng, J.: Minimum common string partition problem: hardness and approximations. Electr. J. Comb. 12 (2005)
  15. Hardison, R.C.: Comparative genomics. PLoS Biol. 1(2), e58 (2003)
  16. Jiang, H., Zhu, B., Zhu, D., Zhu, H.: Minimum common string partition revisited. J. Comb. Optim. 23(4), 519–527 (2012)
  17. Kolman, P., Walen, T.: Reversal distance for strings with duplicates: linear time approximation using hitting set. Electr. J. Comb. 14(1) (2007)
  18. Mushegian, A.R.: Foundations of Comparative Genomics. Academic Press (AP), Cambridge (2007)
  19. Mustafa, N.H., Ray, S.: Improved results on geometric hitting set problems. Discrete Comput. Geometry 44(4), 883–895 (2010)
  20. Swenson, K.M., Marron, M., Earnest-DeYoung, J.V., Moret, B.M.E.: Approximating the true evolutionary distance between two genomes. ACM J. Experimental Algorithmics 12, 3.5:1–3.5:17 (2008)

Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  1. Cheriton School of Computer Science, University of Waterloo, Waterloo, Canada
