WORDS 2019: Combinatorics on Words, pp. 1–27

# Matching Patterns with Variables

• Florin Manea
• Markus L. Schmid
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11682)

## Abstract

A pattern $$\alpha$$ (i.e., a string of variables and terminals) matches a word $$w$$ if $$w$$ can be obtained by uniformly replacing the variables of $$\alpha$$ by terminal words. The corresponding matching problem, i.e., deciding whether a given pattern matches a given word, is NP-complete in general, but can be solved in polynomial time for classes of patterns with restricted structure. In this paper we survey a series of recent results on efficient matching for patterns with variables, as well as several extensions of this problem.
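To make the definition of matching concrete, the following is a minimal brute-force sketch in Python; it is not an algorithm from the paper. It assumes that the pattern is given as a sequence of symbols, that the set of variables is passed explicitly, and that terminals are single characters. The names `match_pattern`, `variables` and `allow_empty` are hypothetical and introduced only for illustration. Whether variables may be replaced by the empty word is not fixed by the abstract, so the sketch defaults to non-empty substitutions and exposes the other choice as a flag. The search is exponential in the worst case, in line with the general hardness of the problem.

```python
from typing import Dict, Optional, Sequence, Set


def match_pattern(
    pattern: Sequence[str],
    word: str,
    variables: Set[str],
    allow_empty: bool = False,  # assumption: non-empty substitutions by default
) -> Optional[Dict[str, str]]:
    """Return a substitution turning `pattern` into `word`, or None if none exists.

    Every occurrence of a variable is replaced by the same terminal word
    ("uniform" replacement); symbols not in `variables` are terminals.
    """

    def solve(i: int, pos: int, subst: Dict[str, str]) -> Optional[Dict[str, str]]:
        # All pattern symbols consumed: succeed only if the whole word is consumed.
        if i == len(pattern):
            return dict(subst) if pos == len(word) else None

        sym = pattern[i]

        if sym not in variables:
            # Terminal symbol: it must occur literally at the current position.
            if word.startswith(sym, pos):
                return solve(i + 1, pos + len(sym), subst)
            return None

        if sym in subst:
            # Variable already bound: its image must occur again here.
            image = subst[sym]
            if word.startswith(image, pos):
                return solve(i + 1, pos + len(image), subst)
            return None

        # Unbound variable: try every candidate image starting at `pos`.
        shortest = 0 if allow_empty else 1
        for length in range(shortest, len(word) - pos + 1):
            subst[sym] = word[pos:pos + length]
            result = solve(i + 1, pos + length, subst)
            if result is not None:
                return result
            del subst[sym]  # backtrack
        return None

    return solve(0, 0, {})


if __name__ == "__main__":
    # Pattern x a x over variable x; "abaab" is matched with x -> "ab".
    print(match_pattern(["x", "a", "x"], "abaab", {"x"}))  # {'x': 'ab'}
    # Pattern x x cannot match a word of odd length.
    print(match_pattern(["x", "x"], "aba", {"x"}))  # None
```

Each unbound variable may take any factor of the remaining word as its image, which is where the exponential blow-up comes from; the restricted pattern classes mentioned in the abstract are those for which this search can be avoided.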

## Keywords

• Combinatorial pattern matching
• Patterns with variables
• String structural parameters
• Efficient algorithms
• NP-hardness
