CPM 2007: Combinatorial Pattern Matching, pp. 241–252

# Common Structured Patterns in Linear Graphs: Approximation and Combinatorics

• Guillaume Fertin
• Danny Hermelin
• Romeo Rizzi
• Stéphane Vialette
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4580)

## Abstract

A linear graph is a graph whose vertices are linearly ordered. This linear ordering allows pairs of disjoint edges to be either preceding (<), nesting ($$\sqsubset$$) or crossing ($$\between$$). Given a family of linear graphs and a non-empty subset $$\mathcal{R} \subseteq \{<,\sqsubset,\between\}$$, we are interested in the PBMCSP problem: find a maximum-size edge-disjoint graph, with edge-pairs all comparable by one of the relations in $$\mathcal{R}$$, that occurs as a subgraph in each of the linear graphs of the family. In this paper, we generalize the framework of Davydov and Batzoglou by considering patterns comparable by all possible subsets $$\mathcal{R} \subseteq \{<,\sqsubset,\between\}$$. This is motivated by the fact that many biological applications require considering crossing structures, and by the fact that different combinations of the relations above give rise to different generalizations of natural combinatorial problems. Our results can be summarized as follows. We give tight hardness results for the PBMCSP problem for $$\{<,\between\}$$-structured patterns and $$\{\sqsubset,\between\}$$-structured patterns. Furthermore, we prove that the problem is approximable within ratios: (i) $$2\mathcal{H}(k)$$ for $$\{<,\between\}$$-structured patterns, (ii) $$k^{1/2}$$ for $$\{\sqsubset, \between\}$$-structured patterns, and (iii) $$\mathcal{O}(\sqrt{k \lg k})$$ for $$\{<,\sqsubset,\between\}$$-structured patterns, where $$k$$ is the size of the optimal solution and $$\mathcal{H}(k) = \sum_{i=1}^{k} 1/i$$ is the $$k$$-th harmonic number.
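The abstract names the three relations without defining them formally. As a minimal sketch, assuming the usual reading in which vertices are integers, an edge is a pair of distinct vertices, and two edges are disjoint when they share no endpoint, the relations and the notion of an $$\mathcal{R}$$-structured edge set could be expressed as follows. The names `relation` and `is_structured` and the string labels are hypothetical, chosen only for illustration.

```python
from itertools import combinations

# Illustrative labels for the three relations (not taken from the paper).
PRECEDE, NEST, CROSS = "<", "nest", "cross"


def relation(e, f):
    """Classify two disjoint edges of a linear graph.

    Each edge is a pair of distinct integer vertices. The result does not
    depend on the order in which the two edges are given.
    """
    # Normalise each edge to (left, right) and put the edge that starts
    # first in front, so that a < c below.
    (a, b), (c, d) = sorted((tuple(sorted(e)), tuple(sorted(f))))
    if len({a, b, c, d}) != 4:
        raise ValueError("edges must be disjoint (share no endpoint)")
    if b < c:
        return PRECEDE  # a < b < c < d: one edge lies entirely before the other
    if d < b:
        return NEST     # a < c < d < b: one edge lies inside the other
    return CROSS        # a < c < b < d: the edges interleave


def is_structured(edges, allowed):
    """Return True if every pair of edges is related by a relation in `allowed`."""
    return all(relation(e, f) in allowed for e, f in combinations(edges, 2))
```

For instance, under these assumptions the edge set `[(1, 3), (2, 4), (5, 6)]` is structured for the relation set `{"<", "cross"}` but not for `{"<"}` alone, because `(1, 3)` and `(2, 4)` cross.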

## Keywords

Structure Pattern · Linear Graph · Longest Common Subsequence · Disjoint Edge

These keywords were added by machine and not by the authors.

## References

1. Alon, N.: Private communication (2006)
2. Alonso, L., Schott, R.: On the tree inclusion problem. In: Borzyszkowski, A.M., Sokolowski, S. (eds.) MFCS 1993. LNCS, vol. 711, pp. 211–221. Springer, Heidelberg (1993)
3. Apostolico, A., Guerra, C.: The longest common subsequence problem revisited. Algorithmica 2, 315–336 (1987)
4. Blin, G., Fertin, G., Rizzi, R., Vialette, S.: What makes the arc-preserving subsequence problem hard? In: Sunderam, V.S., van Albada, G.D., Sloot, P.M.A., Dongarra, J.J. (eds.) ICCS 2005. LNCS, vol. 3515, pp. 860–868. Springer, Heidelberg (2005)
5. Blin, G., Fertin, G., Vialette, S.: New results for the 2-interval pattern problem. In: Sahinalp, S.C., Muthukrishnan, S.M., Dogrusoz, U. (eds.) CPM 2004. LNCS, vol. 3109. Springer, Heidelberg (2004)
6. Bose, P., Buss, J.F., Lubiw, A.: Pattern matching for permutations. IPL 65(5), 277–283 (1998)
7. Chang, M.-S., Wang, F.-G.: Efficient algorithms for the maximum weight clique and maximum weight independent set problems on permutation graphs. IPL 43(6), 293–295 (1992)
8. Chen, W.: More efficient algorithm for ordered tree inclusion. J. Algorithms 26(2), 370–385 (1998)
9. Crochemore, M., Hermelin, D., Landau, G.M., Vialette, S.: Approximating the 2-interval pattern problem. In: Brodal, G.S., Leonardi, S. (eds.) ESA 2005. LNCS, vol. 3669, pp. 426–437. Springer, Heidelberg (2005)
10. Davydov, E., Batzoglou, S.: A computational model for RNA multiple structural alignment. In: Sahinalp, S.C., Muthukrishnan, S.M., Dogrusoz, U. (eds.) CPM 2004. LNCS, vol. 3109, pp. 254–269. Springer, Heidelberg (2004)
11. Dilworth, R.P.: A decomposition theorem for partially ordered sets. Annals of Mathematics Series 2 51, 161–166 (1950)
12. Eppstein, D., Galil, Z., Giancarlo, R., Italiano, G.F.: Sparse dynamic programming I: Linear cost functions. J. ACM 39(3), 519–545 (1992)
13. Erdős, P., Szekeres, G.: A combinatorial problem in geometry. Compositio Mathematica 2, 463–470 (1935)
14. Evans, P.A.: Algorithms and complexity for annotated sequence analysis. PhD thesis, University of Alberta (1999)
15. Goldman, D., Istrail, S., Papadimitriou, C.H.: Algorithmic aspects of protein structure similarity. In: Proc. 40th Foundations of Computer Science (FOCS), pp. 512–522 (1999)
16. Gramm, J.: A polynomial-time algorithm for the matching of crossing contact-map patterns. IEEE/ACM Trans. Comp. Biol. and Bioinfo. 1(4), 171–180 (2004)
17. Gramm, J., Guo, J., Niedermeier, R.: Pattern matching for arc-annotated sequences. In: Agrawal, M., Seth, A.K. (eds.) FST TCS 2002: Foundations of Software Technology and Theoretical Computer Science. LNCS, vol. 2556, pp. 182–193. Springer, Heidelberg (2002)
18. Gupta, U.I., Lee, D.T., Leung, J.Y.-T.: Efficient algorithms for interval graph and circular-arc graphs. Networks 12, 459–467 (1982)
19. Hirschberg, D.S.: Algorithms for the longest common subsequence problem. J. ACM 24(4), 664–675 (1977)
20. Hunt, J.W., Szymanski, T.G.: A fast algorithm for computing longest common subsequences. Communications of the ACM 20, 350–353 (1977)
21. Kilpeläinen, P., Mannila, H.: Ordered and unordered tree inclusion. SIAM J. Comp. 24(2), 340–356 (1995)
22. Klein, P.N.: Computing the edit-distance between unrooted ordered trees. In: Bilardi, G., Pietracaprina, A., Italiano, G.F., Pucci, G. (eds.) ESA 1998. LNCS, vol. 1461, pp. 91–102. Springer, Heidelberg (1998)
23. Kostochka, A.: On upper bounds on the chromatic numbers of graphs. Transactions of the Institute of Mathematics (Siberian Branch of the Academy of Sciences in USSR) 10, 204–226 (1988)
24. Kostochka, A., Kratochvil, J.: Covering and coloring polygon-circle graphs. Discrete Mathematics 163, 299–305 (1997)
25. Kubica, M., Rizzi, R., Vialette, S., Waleń, T.: Approximation of RNA multiple structural alignment. In: Lewenstein, M., Valiente, G. (eds.) CPM 2006. LNCS, vol. 4009, pp. 211–222. Springer, Heidelberg (2006)
26. Li, S.C., Li, M.: On the complexity of the crossing contact map pattern matching problem. In: Bücher, P., Moret, B.M.E. (eds.) WABI 2006. LNCS (LNBI), vol. 4175, pp. 231–241. Springer, Heidelberg (2006)
27. Maier, D.: The complexity of some problems on subsequences and supersequences. J. ACM 25(2), 322–336 (1978)
28. Masek, W.J., Paterson, M.S.: A faster algorithm computing string edit distances. J. Comp. and Syst. Sc. 20(1), 18–31 (1980)
29. Shasha, D., Zhang, K.: Simple fast algorithms for the editing distance between trees and related problems. SIAM J. Comp. 18(6), 1245–1262 (1989)
30. Tiskin, A.: Longest common subsequences in permutations and maximum cliques in circle graphs. In: Lewenstein, M., Valiente, G. (eds.) CPM 2006. LNCS, vol. 4009, pp. 270–281. Springer, Heidelberg (2006)
31. Vialette, S.: On the computational complexity of 2-interval pattern matching problems. Theoretical Computer Science 312(2-3), 223–249 (2004)

## Authors and Affiliations

• Guillaume Fertin (1)
• Danny Hermelin (2)
• Romeo Rizzi (3)
• Stéphane Vialette (4)

1. LINA, Univ. Nantes, 2 rue de la Houssinière, Nantes, France
2. Dpt. Computer Science, Univ. Haifa, Mount Carmel, Haifa, Israel
3. DIMI, Univ. Udine, Udine, Italy
4. LRI, UMR 8623, Univ. Paris-Sud, Orsay, France