In search of maximum non-overlapping codes

Non-overlapping codes are block codes that have arisen in diverse contexts of computer science and biology. Applications typically require finding non-overlapping codes with large cardinalities, but the maximum size of non-overlapping codes has been determined only for cases where the codeword length divides the size of the alphabet, and for codes with codewords of length two or three. For all other alphabet sizes and codeword lengths no computationally feasible way to identify non-overlapping codes that attain the maximum size has been found to date. Herein we characterize maximal non-overlapping codes. We formulate the maximum non-overlapping code problem as an integer optimization problem and determine necessary conditions for optimality of a non-overlapping code. Moreover, we solve several instances of the optimization problem to show that the hitherto known constructions do not generate the optimal codes for many alphabet sizes and codeword lengths. We also evaluate the number of distinct maximum non-overlapping codes.


Introduction
Non-overlapping codes, also known as strongly regular codes [10], cross-bifix-free codes [1,6], mutually uncorrelated codes (MU-codes) [15,12], and strong comma-free codes [7], are block codes with the property that no proper prefix of a codeword occurs as a suffix of any, not necessarily distinct, codeword. Over the past sixty years, they have provided solutions to various problems in different fields. They give rise to a special class of finite automata [11], provide codes for frame synchronization [1], and even serve as addresses for DNA storage [15,12]. For most purposes, non-overlapping codes with large cardinalities are favorable, and researchers have therefore developed several constructions of such codes [2,5,3,6,4].
Identifying non-overlapping codes with the largest cardinality, however, appears to be a hard nut to crack, and little progress has been made in this direction. Chee et al. [6] searched through binary codes of lengths up to sixteen to show that known constructions of non-overlapping codes attain the maximum size, but for larger alphabet sizes and code lengths their approach cannot be used due to its double exponential time complexity and exponential space complexity. Later, Blackburn [5] provided formulas for the size of maximum two- and three-letter non-overlapping codes, and for the sizes of maximum q-ary n-letter non-overlapping codes where n divides q. The numbers of maximal two- and three-letter, and of maximum three-letter, non-overlapping codes were recently computed [9].
Fimmel et al. [9] also proposed an algorithm that they believe generates all maximal non-overlapping codes. Once that statement is proved, the algorithm could in theory be used to find the maximum q-ary n-letter non-overlapping codes, but the approach is impractical due to its double exponential time and exponential space complexity. Nevertheless, by studying some equivalence relations we show that the time complexity of their algorithm can be reduced to exponential and its space complexity to polynomial in n. We further prove some necessary conditions for optimality. We employ the results to compute optimal

Example 2. The non-overlapping code {VRT, KRT} is not maximal, because it could be expanded by adding the word RRT. The non-overlapping code {VRT, VVT, RVT, RRT} is maximal over the alphabet {V, R, T}, because no other three-letter word exists that starts with V or R, ends with T, and includes neither prefix VT nor prefix RT.

Definition 4. A non-overlapping code $X \subseteq \Sigma^n$ is maximum if $|X| \ge |Y|$ for all non-overlapping codes $Y \subseteq \Sigma^n$.
We denote this greatest cardinality by $S(q, n)$ and define $N(q, n)$ to be the number of codes that achieve it.
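The definitions above can be checked mechanically for tiny codes. The following is a minimal brute-force sketch, not part of the original text; the function names `is_non_overlapping` and `is_maximal` are illustrative choices, and codewords are assumed to be equal-length Python strings.

```python
from itertools import product


def is_non_overlapping(code):
    """True if no proper prefix of a codeword equals a suffix of any codeword."""
    words = list(code)
    if not words:
        return True
    n = len(words[0])
    prefixes = {w[:k] for w in words for k in range(1, n)}
    suffixes = {w[n - k:] for w in words for k in range(1, n)}
    return prefixes.isdisjoint(suffixes)


def is_maximal(code, alphabet):
    """True if no word over `alphabet` can be added without creating an overlap.

    Assumes `code` is a non-empty non-overlapping code.
    """
    code = set(code)
    n = len(next(iter(code)))
    for letters in product(alphabet, repeat=n):
        w = "".join(letters)
        if w not in code and is_non_overlapping(code | {w}):
            return False
    return True
```

Applied to Example 2, `is_maximal({"VRT", "KRT"}, "VKRT")` is false because RRT can be added, while `is_maximal({"VRT", "VVT", "RVT", "RRT"}, "VRT")` is true. The search visits all $q^n$ words, so it is only meant for very small parameters.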
Remark 1. Every maximum non-overlapping code is maximal.

Non-overlapping codes were first defined by Levenshtein [10], who found that they coincide with a class of codes for which there exists a decoding automaton that correctly decodes a sequence of input letters independently of the choice of the initial state of the automaton [11]. This property ensures that errors in the input word and random transitions between states of the automaton do not affect the decoding of subsequent input words. He proposed Construction 1 (below), which generates non-overlapping codes of not necessarily optimal sizes. He also established a lower bound $S(q, n) \gtrsim \frac{q-1}{qe} \cdot \frac{q^n}{n}$ [10] and an upper bound $S(q, n) \le \left(\frac{n-1}{n}\right)^{n-1} \frac{q^n}{n}$ [11] for the size of a maximum non-overlapping code.

Construction 1. Let $n > 1$ and $q > 1$ be integers and $1 \le k \le n-1$. Denote by $C$ the set of all codewords $s = (s_1, s_2, \ldots, s_n) \in \mathbb{Z}_q^n$ such that $s_1 \neq 0$, $s_{n-k} \neq 0$, $s_{n-k+1} = s_{n-k+2} = \cdots = s_n = 0$, and $(s_1, \ldots, s_{n-k})$ does not contain $k$ consecutive 0's. Then $C$ is a non-overlapping code.
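Construction 1 can be enumerated directly for small parameters. The sketch below is an illustration under one assumption: it reads the conditions as $s_1 \neq 0$, $s_{n-k} \neq 0$ and $s_{n-k+1} = \cdots = s_n = 0$, which appears to be the only reading under which the construction yields non-overlapping codes. The function names are hypothetical.

```python
from itertools import product


def construction_1(q, n, k):
    """Enumerate the codewords of Construction 1 as tuples over {0, ..., q-1}."""
    assert n > 1 and q > 1 and 1 <= k <= n - 1
    zeros = (0,) * k
    code = []
    for head in product(range(q), repeat=n - k):
        # s_1 and s_{n-k} must be non-zero
        if head[0] == 0 or head[-1] == 0:
            continue
        # the first n-k symbols must not contain k consecutive zeros
        if any(head[i:i + k] == zeros for i in range(n - 2 * k + 1)):
            continue
        code.append(head + zeros)
    return code


def is_non_overlapping(code):
    """True if no proper prefix of a codeword equals a suffix of any codeword."""
    if not code:
        return True
    n = len(code[0])
    prefixes = {w[:i] for w in code for i in range(1, n)}
    suffixes = {w[n - i:] for w in code for i in range(1, n)}
    return prefixes.isdisjoint(suffixes)
```

For instance, `construction_1(3, 3, 1)` returns the four words whose first two symbols are non-zero and whose last symbol is 0, and `is_non_overlapping` confirms the property for every parameter choice small enough to enumerate.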
The size of a general code obtained using Blackburn's construction is still an open problem, but Wang and Wang [14] provided a formula for the case

Barcucci et al. [3] defined a set of non-overlapping codes, given in Construction 5 (below), that are generated using colored Motzkin words and showed that they are all maximal. They did not compute the size of the codes.

Definition 5. A sequence in $\mathbb{Z}_{q+2}^n$ that contains the same number of zeros and ones, such that no prefix of the sequence has more zeros than ones, is called a q-colored Motzkin word of length $n$. We denote the set of all such sequences by $M_q(n)$.

Definition 6. An elevated q-colored Motzkin word of length $n$ is a sequence $1\alpha0$, where $\alpha \in M_q(n-2)$. We denote the set of all such words by $\widehat{M}_q(n)$.
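Definitions 5 and 6 translate into a short enumeration. The sketch below is illustrative only (the function names are not from the source). It lists q-colored Motzkin words by tracking the running difference between ones and zeros, then elevates them. Since every non-empty proper prefix of an elevated word has strictly more ones than zeros, the elevated words themselves pass a non-overlap check; this is a sanity check, not a reproduction of Construction 5.

```python
from itertools import product


def motzkin_words(q, n):
    """All q-colored Motzkin words of length n as tuples over {0, ..., q+1}."""
    words = []
    for w in product(range(q + 2), repeat=n):
        balance, valid = 0, True
        for s in w:
            balance += (s == 1) - (s == 0)  # ones minus zeros so far
            if balance < 0:                 # a prefix with more zeros than ones
                valid = False
                break
        if valid and balance == 0:
            words.append(w)
    return words


def elevated_motzkin_words(q, n):
    """All words 1 alpha 0 with alpha a q-colored Motzkin word of length n - 2."""
    return [(1,) + alpha + (0,) for alpha in motzkin_words(q, n - 2)]


def is_non_overlapping(code):
    """True if no proper prefix of a codeword equals a suffix of any codeword."""
    if not code:
        return True
    n = len(code[0])
    prefixes = {w[:i] for w in code for i in range(1, n)}
    suffixes = {w[n - i:] for w in code for i in range(1, n)}
    return prefixes.isdisjoint(suffixes)
```

With a single color (q = 1) the counts for n = 0, ..., 4 are 1, 1, 2, 4, 9, the Motzkin numbers.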
Fimmel, Michel and Strüngmann [7] proposed that three-letter non-overlapping codes emerged naturally. In living cells the ribosome behaves as a decoding automaton that decodes a sequence of codons to a sequence of amino acids using the genetic code as a decoding function. Data suggest that the ancestral genetic code was indeed a non-overlapping code. Inspired by Blackburn's work, they recently showed that the set of maximal two-letter non-overlapping codes over $\Sigma$ is exactly the set of partitions of $\Sigma$ into two non-empty parts and counted that there are $2^q - 2$ such codes [9]. Moreover, they characterized maximal three-letter non-overlapping codes (see Proposition 1 below) and counted that there are $\sum_{m=1}^{q-1} \binom{q}{m} 2^{m(q-m)}$ such codes. They also determined $N(q, 3) = 2\binom{q}{[2q/3]}$.

Definition 7. Let $L$ and $R$ be sets of strings. (i) $(LR)$ denotes the concatenation of sets $L$ and $R$, i.e. the set of all strings of the form $lr$, where $l \in L$ and $r \in R$, (ii) $L^i$ denotes the concatenation of the set $L$ with itself $i$ times, (iii) the Kleene star denotes the smallest superset that is closed under concatenation and includes the empty string, $L^* = \bigcup_{i \ge 0} L^i$, and (iv) the Kleene plus denotes the smallest superset that is closed under concatenation and does not include the empty string, $L^+ = \bigcup_{i > 0} L^i$.

Proposition 1 ([9], Theorem 5.3). The set of maximal three-letter non-overlapping codes is exactly the set of all three-letter codes
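The two counting formulas quoted above can be sanity-checked by exhaustive search for the smallest parameters. The sketch below is an illustration, not the method of [9]: it inspects every subset of $\Sigma^n$, so it is only usable when $q^n$ is about ten or less. All names are hypothetical.

```python
from itertools import product
from math import comb


def non_overlapping(words, n):
    """True if no proper prefix of a word in `words` equals a suffix of one."""
    prefixes = {w[:k] for w in words for k in range(1, n)}
    suffixes = {w[n - k:] for w in words for k in range(1, n)}
    return prefixes.isdisjoint(suffixes)


def count_maximal(q, n):
    """Count maximal q-ary n-letter non-overlapping codes by brute force."""
    universe = list(product(range(q), repeat=n))
    count = 0
    for mask in range(1 << len(universe)):
        code = [w for i, w in enumerate(universe) if mask >> i & 1]
        if not non_overlapping(code, n):
            continue
        rest = [w for i, w in enumerate(universe) if not mask >> i & 1]
        # maximal: adding any further word creates an overlap
        if all(not non_overlapping(code + [w], n) for w in rest):
            count += 1
    return count


def two_letter_formula(q):
    return 2 ** q - 2


def three_letter_formula(q):
    return sum(comb(q, m) * 2 ** (m * (q - m)) for m in range(1, q))
```

For q = 2 the search finds two maximal two-letter codes and four maximal three-letter codes, in agreement with both formulas.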

Constructing maximal non-overlapping codes
Fimmel et al. [9] show that a straightforward generalisation of Proposition 1 does not hold. The underlying construction can, however, be generalized as described in Construction 6 (below). They posit that it is composed of non-overlapping codes only and that it contains all maximal non-overlapping codes.

Construction 6. Let $n \ge 3$. We define $M_{q,n}$ to be the set of all codes $C = \bigcup_{i=1}^{n-1} (L_i R_{n-i})$, where (i) $(L_1, R_1)$ is a partition of $\Sigma$ into two non-empty parts, and (ii) $(L_i, R_i)$ is a partition of $\bigcup_{j=1}^{i-1} (L_j R_{i-j})$ for every $i \in \{2, \ldots, n-1\}$.

Before proving both statements, let us observe that the mapping from the partitions of Construction 6 to the set of codes $C \subseteq \Sigma^n$ is not injective. Two distinct collections of partitions $(L_i, R_i)_{i=1,\ldots,n-1}$ can determine the same code, as demonstrated by Example 3.

Example 3. Set $n = 4$ and take a partition

No proper prefix in $X$ occurs as a proper suffix in $X$.
Proof. Assume, for the sake of contradiction, that the statement does not hold. Let $p$ be the length of the shortest proper prefix in $X$ that occurs as a proper suffix in $X$. Let $i$ be the smallest index such that there exists a word $w \in X_i$ with a prefix $x_1 \cdots x_p$ that occurs as a suffix in $X$. Let $j$ be the smallest index such that $X_j$ contains a word $w'$ that ends in $x_1 \cdots x_p$. Clearly $i, j > 2$, as $X_1$ and $X_2$ have no proper prefixes nor suffixes. Therefore this contradicts the minimality of $p$.
Since for $l > 2$ every one-letter prefix in $X$ is from $L_1$ and every one-letter suffix from $R_1$, $p > 1$ and $k > 2$. Therefore $2k - 1 > 1$ and $2k - 1 < i$. This contradicts the minimality of $i$. So $p = k$ and hence

Proof. The theorem follows directly from Proposition 2.
Proof. The corollary follows directly from Construction 6 and Theorem 3.
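For very small parameters, Construction 6 can be explored exhaustively. The following sketch is an illustration under the reading $C = \bigcup_{i=1}^{n-1} (L_i R_{n-i})$ of the construction; the helper names are hypothetical. It enumerates every collection of partitions, builds the corresponding code, and makes the non-injectivity of the mapping visible by comparing the number of collections with the number of distinct codes.

```python
from itertools import product


def splits(items):
    """All ordered partitions (L, R) of `items`; either part may be empty."""
    items = sorted(items)
    for bits in product((0, 1), repeat=len(items)):
        left = frozenset(w for w, b in zip(items, bits) if b == 0)
        right = frozenset(w for w, b in zip(items, bits) if b == 1)
        yield left, right


def concat(A, B):
    """Concatenation (AB) of two sets of strings."""
    return {a + b for a in A for b in B}


def collections(q, n):
    """Yield dicts (L, R), indexed by length, allowed by Construction 6."""
    sigma = [str(a) for a in range(q)]  # single-character letters, so q <= 10

    def extend(L, R, i):
        if i == n:
            yield L, R
            return
        level = set().union(*(concat(L[j], R[i - j]) for j in range(1, i)))
        for left, right in splits(level):
            yield from extend({**L, i: left}, {**R, i: right}, i + 1)

    for left, right in splits(sigma):
        if left and right:  # (L_1, R_1) must have two non-empty parts
            yield from extend({1: left}, {1: right}, 2)


def code_of(L, R, n):
    """The code determined by a collection of partitions."""
    return frozenset().union(*(concat(L[i], R[n - i]) for i in range(1, n)))


def is_non_overlapping(code, n):
    prefixes = {w[:k] for w in code for k in range(1, n)}
    suffixes = {w[n - k:] for w in code for k in range(1, n)}
    return prefixes.isdisjoint(suffixes)
```

For q = 2 and n = 4 there are eight collections but only six distinct codes: the code {0011}, for example, arises both from $L_2 = \{01\}$, $R_3 = \{011\}$ and from $R_2 = \{01\}$, $L_3 = \{001\}$.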

Size of C ∈ M q,n
Now that we know that every $C \in M_{q,n}$ is non-overlapping, we want to determine the size of $C$. Theorem 6 (below) shows that it depends only on the sizes of the sets in the partitions $(L_i, R_i)_{i=1,\ldots,n-1}$. Before stating and proving the formula, we define a finite sequence of decompositions due to Fimmel et al. [8] and prove some of its properties.
* R encoded with a binary word $p_k$ over the alphabet $\{l, r\}$ such that (i) $p_0(w) \in l\{l, r\}^{n-2}r$ with $p_0(w)_i = l$ if $w_i \in L_1$ and $p_0(w)_i = r$ if $w_i \in R_1$, (ii) if $p_k(w) \in (l\{l, r\}^+ r)$, then $p_{k+1}(w) \in (l\{l, r\}^* r)$ is obtained by replacing each occurrence of $lr$ in $p_k(w)$ by $l$ if $lr$ corresponds to a subword $w'$ of $w$ in $(L_i R_j) \subseteq L_{i+j}$, or by $r$ if it corresponds to a subword $w'$ of $w$ in $(L_i R_j) \subseteq R_{i+j}$ for some positive integers $i$ and $j$.

Proof. Since $x \in L$ there exists some $i$, $0 < i < n$, such that $x \in L_i$. Since $y \in R$ there exists some $j$, $0 < j < n$, such that $y \in R_j$. Therefore $xy$

Proposition 5. The sequence $\{p_k(w)\}_{k \ge 0}$ is (i) well-defined, (ii) finite and its last element is $lr$.
Proof. (i) Every letter in $w$ belongs to $\Sigma = L_1 \cup R_1$. If $lr$ occurs as a proper substring of $p_k(w)$, then by Proposition 4 it corresponds to some $k$-letter substring $w'$ of $w$ such that $w' \in L \cup R$. Therefore $lr$ can be replaced by either $l$ or $r$. If $lr$ occurs at the beginning of $p_k(w)$, then it is replaced by $l$. Otherwise $w' \in R_i$ for some $i < n - 1$ and there is a word in $(L_1 R_i) \subseteq L \cup R$ that ends in $w'$, which contradicts Proposition 2. A symmetric observation guarantees that an $lr$ at the end of $p_k(w)$ is replaced by $r$.
(ii) If $p_k(w) \in (l\{l, r\}^+ r)$, then the length of $p_{k+1}(w)$ is strictly smaller than the length of $p_k(w)$. If $p_k(w) \notin (l\{l, r\}^+ r)$, then $p_{k+1}$ is not defined and $p_k(w)$ is the last element of the sequence. The previous step reveals that then $p_k(w) = lr$.
The sequence $\{p_k(w)\}_{k \ge 0}$ is finite. The last element of the sequence is $lr$.
Proof. If $w$ belongs to $L_l$ (alternatively to $R_l$), then it also belongs to $L_l \cup R_l$. The latter is a non-overlapping code as noted in Corollary 1, so the statement follows from Proposition 5.
Proof. It is sufficient to explain that for every word $w$

Assume, for the sake of contradiction, that there exists a word $w = w$ ). Suppose $lr$ occurs in a decomposition $p_k(w_{i+1} \cdots w_j) \in (l\{l, r\}^+ r)$ and corresponds to a substring $w'$. Then the same $lr$ occurs in the part of the decomposition, but this contradicts the assumption that $i < j$.
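The decomposition sequence can be traced on a concrete word. The sketch below is an illustration, not the authors' implementation: it stores the sets $L_i$ and $R_i$ in dictionaries keyed by word length (a hypothetical representation), labels every letter of $w$, and then repeatedly merges each adjacent pair labelled $lr$ into a single block labelled according to the set containing the merged subword.

```python
def decomposition_sequence(w, L, R):
    """Return [p_0(w), p_1(w), ...] for dicts L, R mapping i to L_i and R_i."""
    # p_0: label each letter by the part of (L_1, R_1) it belongs to
    blocks = [("l" if a in L[1] else "r", a) for a in w]
    sequence = ["".join(label for label, _ in blocks)]
    while len(blocks) > 2:
        merged, i, changed = [], 0, False
        while i < len(blocks):
            is_lr = (i + 1 < len(blocks)
                     and blocks[i][0] == "l" and blocks[i + 1][0] == "r")
            if is_lr:
                piece = blocks[i][1] + blocks[i + 1][1]
                if piece in L.get(len(piece), ()):
                    merged.append(("l", piece))
                elif piece in R.get(len(piece), ()):
                    merged.append(("r", piece))
                else:
                    raise ValueError(f"{piece!r} lies in neither L nor R")
                i += 2
                changed = True
            else:
                merged.append(blocks[i])
                i += 1
        if not changed:  # no lr left to replace
            break
        blocks = merged
        sequence.append("".join(label for label, _ in blocks))
    return sequence
```

With $L_1 = \{0\}$, $R_1 = \{1\}$, $L_2 = \{01\}$, $R_2 = \emptyset$ and $L_3 = \{011\}$, the word 0111 yields the sequence lrrr, lrr, lr, ending in lr as Proposition 5 states.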

Characterization of maximal non-overlapping codes
We will now show that $M_{q,n}$ contains all maximal non-overlapping codes and determine which collections of partitions $(L_i, R_i)_{i=1,\ldots,n-1}$ generate maximal non-overlapping codes.
Proposition 7. Every maximal non-overlapping code is contained in $M_{q,n}$.
Proof. Let $X$ be a maximal q-ary n-letter non-overlapping code. Now construct the sets $(L_i, R_i)$ corresponding to $X$ for $i < n$ as follows and for $1 < i < n$ and there exists a word in $X$ that starts in $x_1 \cdots x_i$ by definition of $L_i$, which contradicts the fact that $X$ is non-overlapping). Define $Z := \bigcup_{j<n} (L_j R_{n-j})$. We will prove that $X \subseteq Z$. Since $X$ is maximal and $Z$ is non-overlapping by Theorem 3, it immediately follows that $X = Z$.
Let $w \in X$. We will show that $\{p_k(w)\}_{k \ge 0}$ is well-defined. Every letter in $w$ belongs to $L_1 \cup R_1 = \Sigma$ since $w \in \Sigma^n$. The first letter of $w$ belongs to $L_1$ by definition of $L_1$, and we explained earlier that the last letter belongs to $R_1$. Decomposition $p_0(w)$ is therefore well-defined. Now suppose $p_k(w) \in (l\{l, r\}^* r)$ for some $k \ge 0$. If $p_k(w) = lr$, then there exists some $i$ such that $w \in (L_i R_{n-i})$ and $w \in Z$. Otherwise $p_k(w) \in (l\{l, r\}^+ r)$ and there exists some $lr$ in $p_k(w)$ that corresponds to a proper substring $w'$ of $w$. The length of $w'$ is at most $n - 1$ and $w' \in L \cup R$, so $lr$ can be replaced by either $l$ or $r$ in $p_{k+1}(w)$. If $p_k(w)$ starts with $lr$ that corresponds to a $k$-letter substring $w_L$ of $w$, then by definition of $L_k$ we have $w_L \in L_k$ and $lr$ is replaced by $l$ in $p_{k+1}(w)$. If $p_k(w)$ ends with $lr$ that corresponds to a $k$-letter substring $w_R$ of $w$, then $w_R \notin L_k$ since $X$ is a non-overlapping code. So $w_R \in R_k$ and $lr$ is replaced by $r$ in $p_{k+1}(w)$. The length of $p_{k+1}(w)$ is strictly smaller than the length of $p_k(w)$, so the sequence $\{p_k(w)\}_{k \ge 0}$ is indeed finite and its last element is $lr$. As shown earlier, this implies that $w \in Z$.

Theorem 8. $C \in M_{q,n}$ is maximal if and only if all the following statements hold. (i) If $L_i$ is non-empty and $R_{n-i} = \emptyset$, then every $x \in L_i$ is a prefix in some $L_j$ such that $i < j < n$ and $R_{n-j}$ is non-empty or $q = 2$, $i = n/2$ and 2 −1 is empty and for every non-empty , then $y$ is a suffix in $L_{n-i}$, and it is neither a prefix in $C$ nor in $L_{n-i}$ by Proposition 2. Since $n - i = n/2$, then $x$ itself is not a prefix in $(L_{n-i} x)$, and we already showed that it is not a prefix in $C$. If $y$ is a suffix of length $k$ in $(L_{n-i} x)$ that is shorter than $x$, then it is not a prefix in $C$ by Proposition 2. If $k \le n - i$ then $y$ is not a prefix in $(L_{n-i} x)$ as it cannot be a prefix in $L_{n-i}$ by Proposition 2.
If $k > n - i$ then $y$ can be a prefix in $(L_{n-i} x)$ only if it has a suffix that is a prefix in $x$, but $x$ is itself non-overlapping. Therefore $C \cup (L_{n-i} x)$ is a non-overlapping code. This contradicts the maximality of $C$.
Suppose there exists $y \in L_{n/2} \setminus \{x\}$. Observe the word $yx$. If $z$ is a suffix of $x$, then it is not a prefix of $y$ nor of any word in $C$ by Proposition 2. If $z$ is a suffix of $yx$ longer than $n/2$, then $z$ is not a prefix of $yx$ nor of any word in $C$, otherwise $y$ is not non-overlapping or it has a suffix that is a prefix of a word in $C$, which contradicts Proposition 2. So $C \cup \{yx\}$ is a non-overlapping code, but this contradicts the maximality of $C$. Now notice that for $j > 2$,

The proof is symmetric to the proof of statement (i).
The proof is symmetric to the proof of statement (iii).

STEP 2: If statements (i) to (iv) hold, then $C$ is maximal.
that satisfies (i) and (ii). Suppose $C$ is not maximal. Then there exists a word $w \in \Sigma^n \setminus C$ such that no proper prefix of $w$ is a suffix in $C$ and no proper suffix of $w$ is a prefix in $C$. Let $\{\tilde{p}_k(w)\}_{k \ge 0}$ be a sequence of decompositions of $w \in \Sigma^n \setminus C$ in $(L \cup R)^+$ encoded with a binary word $\tilde{p}_k$ over the alphabet $\{l, r\}$ such that (a) $\tilde{p}_0(w) \in \{l, r\}^n$ with $\tilde{p}_0(w)_i = l$ if $w_i \in L_1$ and $\tilde{p}_0(w)_i = r$ if $w_i \in R_1$, (b) if $\tilde{p}_k(w)$ contains $lr$ as a proper substring, then obtain $\tilde{p}_{k+1}(w) \in \{l, r\}^+$ by replacing each occurrence of $lr$ in $\tilde{p}_k(w)$ by $l$ if $lr$ corresponds to a subword $w'$ of $w$ in $(L_i R_j) \subseteq L_{i+j}$ or by $r$ if it corresponds to a subword $w'$ of $w$ in $(L_i R_j) \subseteq R_{i+j}$ for some positive integers $i$ and $j$.
Decomposition $\tilde{p}_0(w)$ is well-defined as every letter of $w$ is an element of $L_1 \cup R_1 = \Sigma$. If $\tilde{p}_k(w)$ contains a substring $lr$ that corresponds to a substring $w'$ of $w$, then there exist integers $i$ and $j$ such that $w' \in L_{i+j} \cup R_{i+j}$. The sequence is therefore well-defined. The length of the decompositions is strictly decreasing, so the sequence $\{\tilde{p}_k(w)\}_{k \ge 0}$ is finite. Denote the last element by $\tilde{p}_k(w)$. The sequences in $\{l, r\}^+$ that have no $lr$ as a proper substring are of the forms $lr$, $r^+ l^+$, $l^+$, $r^+$. Decomposition $\tilde{p}_k(w)$ cannot be $lr$, otherwise $w \in C$.
If $n$ is odd or every word in $L_{n/2}$ is a prefix in $C$ and every word in $R_{n/2}$ is a suffix in $C$, it follows from (i) that for every $x \in L$ there exists a word $w' \in C$ that starts in $x$ and from (ii) that for every $x \in R$ there exists a word $w' \in C$ that ends in $x$. Therefore $C \cup \{w\}$ is not non-overlapping if $\tilde{p}_k(w) \in r^+ l^+ \cup r^+ \cup l^+$.
Otherwise $q = 2$ and is a prefix in $C$ by (i) and every word in $R$ of length distinct from $n/2$ is a suffix in $C$ by (ii), $\tilde{p}_k(w) \in r^+ l^+$ implies $\tilde{p}_k(w) = rl$ and $w$

No word can therefore be added to $C$, so $C$ is maximal.
We showed earlier that the mapping from collections $(L_i, R_i)_{i=1,\ldots,n-1}$ to non-overlapping codes is not injective. Nevertheless, we will show that almost every maximal non-overlapping code corresponds to exactly one collection of partitions. Unfortunately, a characterization of maximal non-overlapping codes in terms of partition sizes cannot be given, as demonstrated by Example 4, and we will therefore not use it directly when computing $S(q, n)$.
Moreover, exactly one of the following statements holds.
Proof. Suppose there exists some $i$ such that $L_i \neq \tilde{L}_i$ and $\forall j < i : L_j = \tilde{L}_j$. $L_j \cup R_j = \tilde{L}_j \cup \tilde{R}_j$ for all $j \le i$ by definition, so $R_j = \tilde{R}_j$ for all $j < i$. Furthermore, at least one of the sets $L_i \cap \tilde{R}_i$ and $R_i \cap \tilde{L}_i$ is non-empty. Without loss of generality suppose $L_i \cap \tilde{R}_i \neq \emptyset$ (otherwise $R_i \cap \tilde{L}_i$ is non-empty and a symmetric argument follows). Let $x \in L_i \cap \tilde{R}_i$. From Proposition 2 we know that $x$ is neither a prefix nor a suffix in $C$. If $i \neq n/2$ then Theorem 8 implies that $C$ is not maximal, so $L_j = R_j$ for every $j$, $1 \le j < n$.
If $n$ is odd or $q \ge 3$, then every maximal non-overlapping code corresponds to exactly one collection of partitions $(L_i, R_i)_{i=1,\ldots,n-1}$.
Proof.The second and third case of Proposition 9 cannot hold for an odd n nor for q ≥ 3.

An integer optimization problem
Now that we know that $M_{q,n}$ contains all maximal non-overlapping codes and we have a formula to compute their sizes, we can determine the maximum non-overlapping codes by maximizing the code size subject to the constraints given by Construction 6. An integer optimization problem is formulated as follows (see Proposition 10 below). We call this formulation SQN(q, n) throughout the paper.

S(q
If $(x^*, y^*)$ is an optimal solution to the above optimization problem, then

Proof. The proposition follows directly from the definition of $S(q, n)$, Construction 6, Theorem 3 and Theorem 6 if we denote the size of the set $L_i$ by $x_i$ and the size of the set $R_i$ by $y_i$.
The evaluation of the objective function for all feasible solutions to SQN requires $O(q^{n^2})$ time and $\Theta(n)$ space. If we also want to store all the optimal solutions, the space requirement increases to $\Theta(nm)$, where $m$ denotes the number of optimal solutions. To determine $N(q, n)$ from the optimal solutions of SQN(q, n), we have to evaluate whether any pair of solutions of SQN(q, n) constructs the same maximum code to prevent double counting. Proposition 11 shows that for most parameter values one can compute the value $N(q, n)$ directly.
$N(q, n) = \sum_{(x, y)\ \text{optimal solution of SQN}(q, n)}$ if at least one of the following holds: (i) $q > 2$, (ii) $n$ is odd, (iii) no pair of solutions $(x, y)$, $(\hat{x}, \hat{y})$ of SQN(q, n) satisfies

Proof. Let $(x, y)$ be an optimal solution of SQN. There are $\binom{q}{x_1}$ choices for a partition of $\Sigma$ into $(L_1, R_1)$ such that

No code is double counted, as the non-overlapping codes corresponding to solutions of SQN(q, n) are distinct due to Proposition 9 and Corollary 3.
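For tiny parameters the optimization can be carried out by plain enumeration. The sketch below is an illustration built on an assumption: it takes the problem to be the maximization of $\sum_{i=1}^{n-1} x_i y_{n-i}$ subject to $x_1 + y_1 = q$, $x_1, y_1 \ge 1$ and $x_i + y_i = \sum_{j=1}^{i-1} x_j y_{i-j}$ for $2 \le i \le n-1$, which is what Construction 6 and the size formula $|C| = \sum_j |L_j||R_{n-j}|$ used later in the paper suggest. The function name `solve_sqn` is a placeholder and is not claimed to match the authors' implementation.

```python
def solve_sqn(q, n):
    """Exhaustively maximise sum_i x_i * y_{n-i} over the assumed constraints."""
    best = 0

    def branch(x, y):
        nonlocal best
        i = len(x) + 1  # index of the next pair (x_i, y_i) to choose
        if i == n:
            # objective: x[j] holds x_{j+1}, y[n-2-j] holds y_{n-j-1}
            value = sum(x[j] * y[n - 2 - j] for j in range(n - 1))
            best = max(best, value)
            return
        # number of words of length i available for the partition (L_i, R_i)
        total = sum(x[j] * y[i - 2 - j] for j in range(i - 1))
        for xi in range(total + 1):
            branch(x + [xi], y + [total - xi])

    for x1 in range(1, q):  # both parts of (L_1, R_1) are non-empty
        branch([x1], [q - x1])
    return best
```

Under these assumptions `solve_sqn(3, 3)` returns 4, the size of the maximal code in Example 2, and `solve_sqn(4, 4)` returns 27. The search space grows very quickly with n, which is the motivation for the reductions discussed in the next section.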

Reduction of SQN
The set of feasible solutions of SQN can be reduced. We provide some equivalence transformations on $(x, y)$ that preserve the value of the objective function. These enable us to evaluate only one feasible solution for each equivalence set. Moreover, we will show that every maximum $C \in M_{q,n}$ either satisfies property (1) or can be mapped to a non-overlapping code that satisfies property (1) using a transformation we will define later.
Proposition 12. (x, y) is an optimal solution of SQN if and only if (y, x) is optimal for SQN.
Before providing a formula for expressing the sizes of the sets

(ii) The coefficients $p_{jk}$ satisfy the following relations:

(iii) Condition $c_i$ is defined as follows. We will use its value to determine which part of partition $(L_i, R_i)$ is empty.
Proposition 14. Let $i$ be an integer such that $n > i > n/2$ and let

Proof. We will prove the proposition by induction on $j$.

$(L_k R_{i+j-k})$ can be expressed as a union of those words that end in $R_{i+k}$, words that start in $L_{i+k}$, and words that start and end in substrings of length at most $i$.
2 , these three sets never overlap, so the size of their union is equal to the sum of their sizes. The size of $\bigcup_{k=j}^{i} (L_k R_{i+j-k})$ can be expressed using the coefficients $p_{jj}$ as follows.
To compute the size of the sets we apply the induction hypothesis, then change the order of summation, introduce $k := k - i$, and recognise the formula for $p_{jl}$.
We now combine the results to obtain

be a maximum non-overlapping code satisfying the property that there exists an integer $j$, $n/2 < j \le n - 1$, such that $|L_j||R_j| \neq 0$. Let us denote the largest such integer by $m$. Now rearrange the formula $|C| = \sum_{j=1}^{n-1} |L_j||R_{n-j}|$ in order to apply Proposition 14 to it.
Recall that we set and

The proof revealed that it is easy to determine all maximum non-overlapping codes if we know all solutions that satisfy Property (1). The algorithm to generate them is summarized in the following corollary.
Li Rn−i is a maximum non-overlapping code.

Exact formulas for four-letter codes

Theorem 16. Let $q \ge 3$. The number of distinct maximal four-letter q-ary non-overlapping codes equals

Proof. By Corollary 3 it is sufficient to count the number of distinct partitions $(L_1, R_1)$, $(L_2, R_2)$, $(L_3, R_3)$ that satisfy the requirements of Theorem 8. Sets $L_1$ and $R_1$ cannot be empty by definition. If $L_2$ is empty, then for every $y \in R_2$ there exists $x \in L_1$ such that $xy \in R_3$. $R_3$ is therefore non-empty, but $L_3$ can be empty as every $y \in R_1$ occurs as a suffix in $R_2 = (L_1 R_1)$. Therefore there are maximal non-overlapping codes with empty $L_2$. There are also as many maximal codes when $R_2$ is empty due to a symmetric observation.
Suppose $L_2$ and $R_2$ are both non-empty. If $L_3$ and $R_3$ are non-empty, then the code is clearly maximal. If $L_3$ is empty, then every $x \in R_1$ is a suffix in $(L_2 R_1) \subseteq R_3$, so the code is maximal. If $R_3$ is empty, then every $x \in L_1$ is a prefix in $(L_2 R_1) \subseteq L_3$, so the code is maximal. Therefore there are maximal non-overlapping codes with $|L_2||R_2| \neq 0$.
Blackburn [5] computed S(q, 2) and S(q, 3) by proving that Construction 4 with k = 1 and k = 2 respectively gives us codes of maximum sizes.Using the results from the previous section, we further prove that this construction with k = 3 always yields the optimum for four-letter codes.Before showing the result, we will provide Lemma 17 and Proposition 18 regarding the sizes of codes generated by Construction 4.
Lemma 17. Let $q$, $n$, $m \le n/2$ and $2 \le j \le n$ be positive integers, and $f$ the following function

Then $f(j) \ge f(j + 1)$.
Proof.The coefficient in front of (n−1)q−m n n−j in f (j) is (n−m−1)((n−1)q−m) n + (n−q−m)(n−1)(n−j) ≥ n−j j , so the coefficient in front of and see that the statement follows.
Proposition 18. Let $C$ be the largest q-ary non-overlapping code of length $n$ obtained by Construction 4 with $k = n - 1$. Then the value of the parameter $l$ equals:

Remark 2. Note that we skip the case $n/2 < (n - 1)q \bmod n < n - 1$, as we do not need it throughout our paper. Although one would wish that in that case $l = \lceil (n-1)q/n \rceil$, this does not always hold. If we take for example $n = 11$ and $q = 16$, then the code with parameter $l = 14$ contains 578 509 309 952 codewords, but the code with parameter $l = 15$ contains only 576 650 390 625 codewords.
Proof. The largest q-ary non-overlapping code of length $n$ obtained by Construction 4 with $k = n - 1$ has size $\max_{l \in \{1,\ldots,q-1\}} f(l)$, where $f(l) = (q - l)l^{n-1}$. First we examine the first derivative of $f$, $f'(l) = l^{n-2}((n - 1)q - nl)$, and notice that $f(l)$ is strictly increasing on the interval $\left[1, \frac{(n-1)q}{n}\right]$ and strictly decreasing on the interval $\left[\frac{(n-1)q}{n}, q - 1\right]$. If $n$ divides $q$, the maximum $\max_{l \in \{1,\ldots,q-1\}} f(l)$ is achieved at $l = \frac{(n-1)q}{n}$. Now, let us suppose $q$ is not a multiple of $n$. The maximum $\max_{l \in \{1,\ldots,q-1\}} f(l)$ is achieved either at $l = \left\lfloor \frac{(n-1)q}{n} \right\rfloor$ or at $l = \left\lceil \frac{(n-1)q}{n} \right\rceil = \left\lfloor \frac{(n-1)q}{n} \right\rfloor + 1$. We denote the values of $f$ at these two points by $F = f\left(\left\lfloor \frac{(n-1)q}{n} \right\rfloor\right)$ and $C = f\left(\left\lceil \frac{(n-1)q}{n} \right\rceil\right)$.

Using the binomial theorem we obtain

Now let us introduce $m$, such that $(n - 1)q \equiv m \pmod{n}$, to rewrite the difference

First let us explain that the coefficient $\frac{n-q-m}{n}$ is non-positive. Suppose for contradiction that $q < n - m$.
q > 1. We also see that $\frac{n-q-m}{n} = 0$ if and only if $\left\lceil \frac{(n-1)q}{n} \right\rceil = q$. In this case, $F - C > 0$ and therefore $\max_{i \in \{1,\ldots,q-1\}} f(i) = F$. We immediately see that for $m = n - 1$, $F < C$ since $q > 1$ always holds. To get a lower bound on $F - C$ when $m \le n/2$ we apply Lemma 17 for $j = 3, \ldots, n$ consecutively. We obtain

Now we can compute the sizes of maximum four-letter non-overlapping codes. The proof of Theorem 19 reveals that they can indeed be generated using Construction 4.
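The size function $f(l) = (q - l)l^{n-1}$ from the proof of Proposition 18 is easy to evaluate exactly with integer arithmetic, which makes the example of Remark 2 reproducible. The sketch below is illustrative only; `best_parameter` is a hypothetical helper that scans all admissible values of $l$ rather than using the closed-form rounding rule.

```python
def f(l, q, n):
    """Size of the code from Construction 4 with k = n - 1 and parameter l."""
    return (q - l) * l ** (n - 1)


def best_parameter(q, n):
    """Value of l in {1, ..., q-1} that maximises f (first one in case of ties)."""
    return max(range(1, q), key=lambda l: f(l, q, n))
```

For n = 11 and q = 16 this gives f(14) = 578 509 309 952 and f(15) = 576 650 390 625, so the maximum is attained at l = 14 even though (n - 1)q/n is closer to 15. When n divides q, for example q = 8 and n = 4, the scan returns l = (n - 1)q/n = 6.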

Theorem 19.
$S(q, 4) = \left(q - \left[\frac{3q}{4}\right]\right)\left[\frac{3q}{4}\right]^3$, where $[x]$ is the rounding of $x$ to the closest natural number that rounds half down.
Proof. Let $C$ be a maximum code that satisfies property (1). Without loss of generality we can assume $|R_1| \ge |L_1|$ (otherwise we can switch L's and R's and obtain a code of the same size), meaning that $R_3 = \emptyset$ and

Using the derivative test, we know that for fixed sets $L_1$ and $R_1$ with

by Wang and Wang's modification.
In spite of the reduction of the optimization problem, finding all solutions to SQN still requires years of CPU time for some parameter values.We therefore analysed the sets of optimal solutions to find some common properties.The results lead to the following conjecture.
Conjecture 1. For every $q \ge 2$ and $n \ge 3$ there exists a maximum non-overlapping code $C = \bigcup_{i<n} (L_i R_{n-i}) \in M_{q,n}$ that satisfies $R_i = \emptyset$ for every $i > \frac{n}{q+1} + 1$.

Remark 3. If Conjecture 1 holds, then Blackburn's conjecture holds for $q \ge n$.
We reduced the set of feasible solutions accordingly to find a large non-overlapping code for the parameter values, for which we could not solve the original optimization problem.As shown in Table 3 for all three parameter values a code was obtained that is much larger than the hitherto known ones.In the binary case, n = 30, a code with exactly the same size as the one given by Construction 1 was found.
The solutions are compared with procedure update_maximum. Procedure solve_sqn initializes the optimal solution and starts the branching algorithm.
In order to obtain the results in a reasonable amount of time, our implementation uses multi-threading. A depth of parallelization $d$ is given as a parameter and should be selected depending on the available number of CPU cores and $q$. We divide the branching stage into subproblems as follows. If $i$ is smaller than $d$, then the branching procedure creates a new thread for each feasible value of $x_i$. After the child threads are joined, the parent thread selects the maximum value $s^*$ of its children and the corresponding solution set $C^*$. If $i$ equals $d$, then the branching stage runs sequentially.

Proposition 4.
Let $x \in L$ and $y \in R$. If the length of $xy$ is at most $n - 1$, then $xy \in L \cup R$.

Definition 9.
when property (1) holds, let us first introduce a few more definitions. Let $i$ be an integer such that $n > i > n/2$ and $|L_k||R_k| = 0$ for all $k > i$. (i) Coefficients $\delta_{R,k}$ and $\delta_{L,k}$ for $k > i$ express which of the sets $R_k$ and $L_k$ is non-empty:

and following the same procedure $|R_{i+1}| = \delta_{R,i+1} \sum_{l=1}^{1} \sum_{k=1}^{n} p_{1l} |L_k||R_{i+1-k}|$. Now let us suppose that the statement holds for all positive integers smaller than $j$. The set $\bigcup_{k=1}^{i+j-1}$

p 2 :
$_{jl} |L_m||R_{i+l-m}|$.

Theorem 15. There exists a maximum non-overlapping code $C = \bigcup_{i=1}^{n-1} (L_i R_{n-i})$ satisfying the property $\forall i > n/2 : |L_i||R_i| = 0$. Moreover, when $i > n/2$ the condition $c_i$ from Definition 9 satisfies:

as soon as $c_m \neq 0$, which contradicts the fact that $C$ is maximum. In the other case, when $c_m = 0$, we get $|\tilde{C}| = |C|$. Notice that in this case both $|\tilde{C}|$ and $|C|$ are expressed as the same function in $|L_1|, |R_1|, \ldots, |L_{m-1}|, |R_{m-1}|$, and indeed $|\tilde{L}_j||\tilde{R}_j| = 0$ for all $n/2 < j \le m - 1$ if and only if $|L_j||R_j| = 0$ for all $n/2 < j \le m - 1$. The theorem therefore holds.

Table 2: $S(2, n)$ and $N(2, n)$ for $17 \le n \le 30$. The last column equals the difference between $S(2, n)$ and the size of the largest non-overlapping code given by Construction 1.
* The (exact) value was not determined due to the long expected runtime of the algorithm.