# The Maximum Equality-Free String Factorization Problem: Gaps vs. No Gaps

## Abstract

A factorization of a string *w* is a partition of *w* into substrings \(u_1,\dots ,u_k\) such that \(w=u_1 u_2 \cdots u_k\). Such a partition is called equality-free if no two factors are equal, that is, \(u_i \ne u_j\) for all \(i,j\) with \(i \ne j\). The *maximum equality-free factorization problem* is to decide, for a given string *w* and integer *k*, whether *w* admits an equality-free factorization with *k* factors.
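To make the definition concrete, here is a minimal Python sketch. It is not one of the paper's algorithms: the function names `is_equality_free` and `max_equality_free` are illustrative assumptions, and the search is a plain exponential backtracking that is only practical for short strings.

```python
def is_equality_free(w, factors):
    """Return True if the non-empty `factors` concatenate to `w` and are pairwise distinct."""
    return (
        all(factors)
        and "".join(factors) == w
        and len(set(factors)) == len(factors)
    )


def max_equality_free(w):
    """Exhaustively search for an equality-free factorization of `w`
    with as many factors as possible (exponential time; toy inputs only)."""
    n = len(w)
    best = []

    def extend(start, used, parts):
        nonlocal best
        if start == n:
            # The whole string is covered; keep the longest factor list seen.
            if len(parts) > len(best):
                best = list(parts)
            return
        for end in range(start + 1, n + 1):
            factor = w[start:end]
            if factor not in used:
                used.add(factor)
                parts.append(factor)
                extend(end, used, parts)
                parts.pop()
                used.remove(factor)

    extend(0, set(), [])
    return best


print(is_equality_free("abab", ["a", "b", "ab"]))  # True
print(is_equality_free("abab", ["ab", "ab"]))      # False: repeated factor
print(len(max_equality_free("abab")))              # 3
print(len(max_equality_free("aaaa")))              # 2
```

For `"aaaa"` the optimum is 2 because three pairwise distinct factors over a single letter would need total length at least 1 + 2 + 3 = 6.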

Equality-free factorizations have recently received attention because of their application in DNA self-assembly. Condon et al. (CPM 2012) study a version of the problem and show that it is \(\mathcal {NP}\)-complete to decide whether there exists an equality-free factorization with an upper bound on the length of the factors. At STACS 2015, Fernau et al. show that the maximum equality-free factorization problem with a lower bound on the number of factors is \(\mathcal {NP}\)-complete. Shortly after, Schmid (CiE 2015) presents results concerning the fixed-parameter tractability of these problems.

In this paper we approach equality-free factorizations from a practical point of view, i.e., we wish to obtain good solutions on given instances. To this end, we provide approximation algorithms, heuristics, Integer Programming models, and an improved FPT algorithm, and we conduct experiments to analyze the performance of the proposed algorithms.

Additionally, we study a relaxed version of the problem in which gaps are allowed between factors, and we design a constant-factor approximation algorithm for this case. Surprisingly, extensive experiments lead us to conjecture that the relaxed problem has the same optimum as the original.
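The comparison between the two variants can be explored on tiny inputs with a brute-force sketch such as the one below. It rests on an assumption: the abstract does not define the relaxed problem formally, so "gaps" is read here as allowing characters of *w* to stay uncovered between (or around) the chosen, non-overlapping, pairwise distinct factors. The function name `max_factors` and the flag `allow_gaps` are hypothetical, and the search is exponential.

```python
def max_factors(w, allow_gaps):
    """Maximum number of pairwise distinct, non-overlapping factors of `w`.

    allow_gaps=False: the factors must cover `w` exactly (original problem).
    allow_gaps=True:  characters may be left uncovered (assumed reading of
                      the relaxed problem).
    """
    n = len(w)
    best = 0

    def extend(start, used):
        nonlocal best
        # With gaps, any partial selection is valid (the rest is a trailing gap);
        # without gaps, a selection only counts once the whole string is covered.
        if start == n or allow_gaps:
            best = max(best, len(used))
        if start == n:
            return
        if allow_gaps:
            extend(start + 1, used)  # leave w[start] uncovered
        for end in range(start + 1, n + 1):
            factor = w[start:end]
            if factor not in used:
                used.add(factor)
                extend(end, used)
                used.remove(factor)

    extend(0, set())
    return best


for w in ["abab", "aaaa", "aab"]:
    print(w, max_factors(w, allow_gaps=False), max_factors(w, allow_gaps=True))
# abab 3 3
# aaaa 2 2
# aab 2 2
```

The relaxed optimum can never be smaller than the original one, since every gap-free factorization is also a valid gapped selection; the conjecture concerns the opposite direction. Agreement on a few short strings is only an illustration and says nothing about the general case.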

## Keywords

String factorization, Equality-free, String algorithms, Heuristics

## References

1. Bulteau, L., Hüffner, F., Komusiewicz, C., Niedermeier, R.: Multivariate algorithmics for NP-hard string problems. Bull. EATCS **114**, 295–301 (2014)
2. Clifford, R., Harrow, A.W., Popa, A., Sach, B.: Generalised matching. In: Karlgren, J., Tarhio, J., Hyyrö, H. (eds.) SPIRE 2009. LNCS, vol. 5721, pp. 295–301. Springer, Heidelberg (2009). https://doi.org/10.1007/978-3-642-03784-9_29
3. Condon, A., Maňuch, J., Thachuk, C.: The complexity of string partitioning. J. Discrete Algorithms **32**, 24–43 (2015)
4. Fernau, H., Manea, F., Mercas, R., Schmid, M.L.: Pattern matching with variables: fast algorithms and new hardness results. In: 32nd International Symposium on Theoretical Aspects of Computer Science, 4–7 March 2015, Garching, Germany, pp. 302–315 (2015)
5. Schmid, M.L.: Computing equality-free and repetitive string factorisations. Theor. Comput. Sci. **618**, 42–51 (2016)
6. Spieksma, F.: On the approximability of an interval scheduling problem. J. Sched. **2**(5), 215–227 (1999)
7. Ziv, J., Lempel, A.: Compression of individual sequences via variable-rate coding. IEEE Trans. Inf. Theory **24**, 530–536 (1978)