Computing Longest Lyndon Subsequences and Longest Common Lyndon Subsequences

Given a string T of length n whose characters are drawn from an ordered alphabet of size σ, its longest Lyndon subsequence is a maximum-length subsequence of T that is a Lyndon word. We propose algorithms for finding such a subsequence in O(n^3) time with O(n) space, or online in O(n^3) space and time. Our first result can be extended to find the longest common Lyndon subsequence of two strings of length at most n in O(n^4 σ) time using O(n^2) space.


Introduction
A recent theme in the study of combinatorics on words has been the generalization of regularity properties from substrings to subsequences. For example, given a string T over an ordered alphabet, the longest increasing subsequence problem is to find the longest subsequence of increasing symbols in T [11,33]. Several variants of this problem have been proposed [14,28]. These problems generalize to the task of finding such a subsequence that is not only present in one string, but common to two given strings [21,31,34], which can also be viewed as a specialization of the longest common subsequence problem [23,27,35].
More recently, the problems of computing the longest square word that is a subsequence [30], the longest palindrome that is a subsequence [9,25], the lexicographically smallest absent subsequence [29], and longest rollercoasters [6,16,18] have been considered.
Here, we focus on subsequences that are Lyndon words, i.e., strings that are lexicographically smaller than all of their non-empty proper suffixes [32]. Lyndon words are objects of longstanding combinatorial interest (see, e.g., [19]), and they have also proved to be useful algorithmic tools in various contexts (see, e.g., [3]). The longest Lyndon substring of a string is the longest factor of the Lyndon factorization of the string [8], and it can be computed in linear time [13]. The longest Lyndon subsequence of a unary string is just one letter, which is also the only Lyndon subsequence of a unary string. A (naive) solution to find the longest Lyndon subsequence is to enumerate all distinct Lyndon subsequences and pick the longest one. However, the number of distinct Lyndon subsequences can be as large as 2^n, e.g., for a string of increasing numbers T = 1 2 ⋯ n. In fact, there are no bounds known (except when σ = 1) that bring this number in a polynomial relation with the text length n and the alphabet size σ [22], and thus deriving the longest Lyndon subsequence from all distinct Lyndon subsequences can be infeasible. In this article, we focus on the algorithmic aspects of computing this longest Lyndon subsequence in polynomial time without the need to consider all Lyndon subsequences. Specifically, we study the problems of computing: 1. the lexicographically smallest (common) subsequence of each length (in Sect. 3), and 2. the longest Lyndon subsequence (in Sect. 4), with two variations considering online computation (in Sect. 4.3) and the restriction that this subsequence has to be common to two given strings (in Sect. 5).
The first problem serves as an appetizer. Although the notions of Lyndon and lexicographically smallest subsequences share common traits, our solutions to the two problems are mostly independent (except for some tools shared by the online algorithms for both problems).
Compared to an earlier conference version of this paper [4], we describe here an algorithm with significantly improved time complexity for the online setting. Additionally, we have added more illustrations, examples, and the analysis of special cases with simpler algorithmic ideas to ease the understanding of the article. Last but not least (in Sect. 6), we evaluate the implementation of one of our proposed algorithms on commonly studied datasets.

Preliminaries
Let Σ denote a totally ordered set of symbols called the alphabet. An element of Σ* is called a string. The alphabet induces the lexicographic order ≺ on the set of strings Σ*. We denote the empty string with ε. Given a string S ∈ Σ*, we denote its length with |S| and its i-th symbol with S[i] for i ∈ [1..|S|].¹ Further, for integers 1 ≤ i ≤ j ≤ |S|, we write S[i..j] = S[i] ⋯ S[j] to denote the substring of S starting at position i and ending at position j, and S[i..] = S[i..|S|] to denote the suffix of S starting at position i. The empty string is a substring of every string S. A non-empty string is a Lyndon word [32] if it is lexicographically smaller than all its non-empty proper suffixes. Equivalently, a string is a Lyndon word if and only if it is smaller than all its proper cyclic rotations.
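As a quick sanity check of these definitions, the following sketch (with helper names of our own choosing, not part of the paper) tests both characterizations of a Lyndon word directly:

```python
def is_lyndon(s: str) -> bool:
    """A string is a Lyndon word iff it is non-empty and strictly
    smaller than each of its non-empty proper suffixes."""
    return bool(s) and all(s < s[i:] for i in range(1, len(s)))

def is_lyndon_by_rotations(s: str) -> bool:
    """Equivalent characterization: strictly smaller than every
    proper cyclic rotation."""
    return bool(s) and all(s < s[i:] + s[:i] for i in range(1, len(s)))
```

For instance, `ab` and `aab` are Lyndon words, while `aba` is not (its suffix `a` is smaller) and `aa` is not (it equals one of its rotations).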
The algorithms we present in the following assume that the input consists of strings of length at most n whose characters are drawn from an integer alphabet Σ := [1..σ].²

Lexicographically Smallest Subsequence
As a starter, we propose a solution for the following related problem: Maintain, for each length ℓ, the lexicographically smallest length-ℓ subsequence of T as the characters of T arrive online one at a time (in left-to-right order).

Dynamic Programming Approach
The idea is to apply dynamic programming that computes, for all lengths 0 ≤ ℓ ≤ i ≤ n, the lexicographically smallest length-ℓ subsequence of T[1..i], denoted by D[i, ℓ]; see Algorithm 1 for pseudocode and Fig. 1 for an example.

Lemma 1 For all 0 ≤ ℓ ≤ i ≤ n, D[i, ℓ] is the lexicographically smallest subsequence of T[1..i] of length ℓ.

¹ For arbitrary integers p, q, we write [p..q] = {i ∈ ℤ : p ≤ i ≤ q}.
² One can reduce the alphabet to an integer alphabet by sorting the characters of the input string with a comparison-based sorting algorithm taking O(n lg n) time and O(n) space, removing duplicate characters, and finally assigning each distinct character a unique rank within [1..min(σ, n)]. However, such a reduction does not work for online algorithms, and it would constitute a bottleneck for algorithms running in o(n lg n) time.
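The offline alphabet reduction described in the footnote can be sketched as follows (the helper name is ours):

```python
def reduce_alphabet(text):
    """Map each character to its rank in [1..min(sigma, n)]: sort the
    distinct characters and assign ranks in increasing order, as in
    the O(n lg n)-time, comparison-based reduction of the footnote."""
    ranks = {c: r for r, c in enumerate(sorted(set(text)), start=1)}
    return [ranks[c] for c in text]
```

For example, the string `bccadb` over {a, b, c, d} is mapped to the rank sequence 2, 3, 3, 1, 4, 2.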

Algorithm 1:
Computing the lexicographically smallest subsequence

Proof
The proof is done by induction over the prefix length i. We first observe that D[i, 0] = ε (the only length-0 subsequence of any string) and that D[i, i] = T[1..i] (the only length-i subsequence of T[1..i]). In what follows, we show that the claim also holds for 0 < ℓ < i, assuming it holds for prefix length i − 1. By construction, D[i, ℓ] is the lexicographically smaller of D[i − 1, ℓ] and D[i − 1, ℓ − 1] · T[i], and both are length-ℓ subsequences of T[1..i]. Hence, it suffices to prove that one of these two subsequences is the lexicographically smallest one. For a proof by contradiction, suppose that T[1..i] has a length-ℓ subsequence L with L ≺ D[i, ℓ]. If L is a subsequence of T[1..i − 1], then L ≺ D[i − 1, ℓ] contradicts the induction hypothesis. Otherwise, L ends with T[i], and therefore L[1..ℓ − 1] is a length-(ℓ − 1) subsequence of T[1..i − 1] with L[1..ℓ − 1] ≺ D[i − 1, ℓ − 1], again contradicting the induction hypothesis.

Let us analyze the complexity of Algorithm 1. If we stored the subsequences explicitly, the entries of our two-dimensional table D[0..n, 0..n] would occupy O(n^3) space in total. However, in order to reduce the space consumption to O(n^2), we just store a flag that determines whether we built D[i, ℓ] from D[i − 1, ℓ] or from D[i − 1, ℓ − 1] · T[i]. To restore the string represented by D[i, ℓ], we backtrack with the help of the stored flags while reading O(n) cells and characters. In this setting, the initialization of the entries D[i, 0] and D[i, i] costs O(n^2) time. Line 5, where we compute the lexicographical minimum of two subsequences, is executed O(n^2) times. If we perform this computation with naive character comparisons, for which we need to check O(n) characters (which we first need to restore by reading O(n) previous cells), we pay O(n^3) time in total, which is the bottleneck of this algorithm.

Lemma 2
We can compute the lexicographically smallest subsequence of T for each length ℓ ∈ [1..n] online in O(n^3) time with O(n^2) space.
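The dynamic program of Lemma 2 can be sketched as follows. For clarity, this version stores the subsequences explicitly as strings (so it uses more space than the flag-based O(n^2)-space variant described above), and the function name is our own:

```python
def smallest_subsequences(text):
    """D[i][l] is the lexicographically smallest length-l subsequence
    of text[:i], filled by the recurrence of Lemma 1:
    D[i][l] = min(D[i-1][l], D[i-1][l-1] + text[i-1])."""
    n = len(text)
    D = [[None] * (n + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        D[i][0] = ""                      # only length-0 subsequence
        if i > 0:
            D[i][i] = D[i - 1][i - 1] + text[i - 1]  # only length-i one
    for i in range(1, n + 1):
        for l in range(1, i):
            cand = D[i - 1][l - 1] + text[i - 1]
            D[i][l] = min(D[i - 1][l], cand)
    return D[n][1:]                       # answers for lengths 1..n
```

On T = aba this returns a, aa, aba, matching the observation below that the smallest length-2 subsequence is aa.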
Unfortunately, the lexicographically smallest subsequence of a given length is not a Lyndon word in general, so this dynamic programming approach does not solve our problem of finding the longest Lyndon subsequence. In fact, if T has a longest Lyndon subsequence of length ℓ, then there can be a lexicographically smaller subsequence of the same length. For instance, T = aba has the longest Lyndon subsequence ab, while the lexicographically smallest length-2 subsequence is aa.

Speeding Up String Comparisons
Below, we improve the time bound of Lemma 2 by maintaining the entries of the table D[0..n, 0..n] in a trie [15]. Mathematically, the trie of a string family is defined as a rooted tree whose nodes represent all the prefixes of the strings in the family. (Multiple strings may share the same prefix.) The root represents the empty prefix and, for every non-empty prefix P, the parent of the node representing P is the node representing P[1..|P| − 1], and the edge to the parent is labeled by the character P[|P|]; see Fig. 3 for an example. We develop a custom trie implementation which supports the following methods in constant time:

- insert(v, c): inserts a new leaf attached to a node v using an edge labeled with character c, and returns a handle to the created leaf; the node v cannot already have an outgoing edge labeled with c.
- parent(v): returns the handle to the parent of a node v (or ⊥ if v is the root).
- edge-label(v): returns the label of the incoming edge of a node v (or ⊥ if v is the root).
- precedes(u, v): decides whether the string represented by a node u is lexicographically smaller than the string represented by a node v.

Fig. 3 A trie described in Sect. 3.2 speeding up comparisons. The trie is a snapshot of the example computation shown in Fig. 1, where we are at i = 6 and ℓ = 4. There, we want to decide whether D[5, 4] = bcad or D[5, 3] · T[6] = badb is lexicographically smaller. For that, we take the LCA v of the nodes (highlighted by black circles) representing D[5, 4] and D[5, 3]; v is the child of the root with label b. Next, we compare the labels of v's two children leading to D[5, 4] and D[5, 3], respectively (these two children are marked in yellow). It suffices to compare the labels of these children to determine that D[5, 3] · T[6] is lexicographically smaller than D[5, 4].

Implementation of the trie For each node v, we explicitly store its parent parent(v), label edge-label(v), and depth depth(v). We do not keep pointers from v to its children, and thus each node occupies constant space. Moreover, we maintain the underlying (unlabeled) tree using the dynamic data structures of [2,10], answering lowest common ancestor (LCA) and level ancestor queries (level-anc(u, d) returns the ancestor of a node u at depth d), respectively, in constant time. Both data structures support the insertion of leaves in constant time and, consequently, their space consumption is proportional to the tree size. In order to implement precedes(u, v), we first compute the lowest common ancestor w of u and v. For the special case that u is an ancestor of v, or vice versa, we return false if v = w and true if u = w ≠ v. Otherwise, we use level ancestor queries level-anc(u, depth(w) + 1) and level-anc(v, depth(w) + 1) to select the children u′ and v′ of w on the paths towards u and v, respectively. In that case, we return true if edge-label(u′) ≺ edge-label(v′) and false otherwise; see Fig. 3.
Application of the trie Instead of a flag, each cell of D now stores a handle to its respective trie node.The root node of the trie represents the empty string ε, so we associate D[i, 0] = ε with the root node for all i.
To implement Line 5 of Algorithm 1, we first retrieve the handles to the nodes u and v representing D[i − 1, ℓ − 1] and D[i − 1, ℓ], respectively, and then decide whether D[i − 1, ℓ] or D[i − 1, ℓ − 1] · T[i] is lexicographically smaller, which we determine using precedes(u, v); see Fig. 3 for an example. As for Line 6, we retrieve the handle to the node u representing D[i − 1, i − 1], call insert(u, T[i]), and store the resulting handle at D[i, i]. This insertion is valid because the trie does not yet have any node at depth i.
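A pointer-based sketch of this trie may look as follows. Here, the constant-time LCA and level-ancestor structures of [2,10] are replaced by naive O(depth)-time parent walks, so only the interface matches the description above:

```python
class TrieNode:
    """Node storing only parent, incoming edge label, and depth,
    as in the implementation described above (no child pointers)."""
    __slots__ = ("parent", "label", "depth")

    def __init__(self, parent=None, label=None):
        self.parent = parent
        self.label = label
        self.depth = 0 if parent is None else parent.depth + 1

def insert(v, c):
    """Attach a new leaf below v with edge label c."""
    return TrieNode(v, c)

def level_anc(v, d):
    """Naive level ancestor: walk up until depth d (O(depth) time
    instead of the O(1) of the dynamic structures cited above)."""
    while v.depth > d:
        v = v.parent
    return v

def precedes(u, v):
    """Is the string spelled out on the root-to-u path smaller than
    the one spelled out on the root-to-v path?  Naive LCA walk."""
    d = min(u.depth, v.depth)
    a, b = level_anc(u, d), level_anc(v, d)
    while a is not b:
        a, b = a.parent, b.parent
    w = a                          # lowest common ancestor of u and v
    if w is v:
        return False               # v is an ancestor of u (or u == v)
    if w is u:
        return True                # u is a proper prefix of v
    return level_anc(u, w.depth + 1).label < level_anc(v, w.depth + 1).label
```

For example, in a trie containing ba and bc, the node for ba precedes the node for bc, and the node for b precedes both (a proper prefix is smaller).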

Complexity Analysis
The number of trie operations is O(n^2) (constantly many for each entry D[i, ℓ]), and each of them is implemented in constant time. Hence, the overall time and space complexities become O(n^2).

Most Competitive Subsequence
If we want to find only the lexicographically smallest subsequence of the whole string T for a fixed length ℓ, this problem is also called Find the Most Competitive Subsequence.³ It admits a folklore linear-time solution that scans T from left to right and maintains, in a stack S, a subsequence of T[1..i] of length between ℓ + i − n and ℓ, chosen to minimize S · $max in the lexicographic order, where $max is a sentinel character larger than every character of Σ. Here, the lower bound ℓ + i − n guarantees that, when we are near the end of the text, we have enough characters left to extend S to a length-ℓ subsequence of T. Observe that we can repeatedly use this solution to compute the lexicographically smallest subsequences of T of multiple lengths. The overall running time for all lengths ℓ ∈ [1..n] is O(n^2), and the algorithm uses O(n) working space, but it does not produce intermediate answers for the prefixes of T (as online algorithms do).
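The folklore stack scan can be sketched like this (the function name is ours):

```python
def most_competitive(text, l):
    """Lexicographically smallest length-l subsequence of text,
    computed by the left-to-right stack scan of Sect. 3.3."""
    n = len(text)
    stack = []
    for i, c in enumerate(text):
        # Pop larger characters while enough characters remain to
        # refill the stack up to length l (lower bound l + i - n).
        while stack and stack[-1] > c and len(stack) + (n - i) > l:
            stack.pop()
        if len(stack) < l:
            stack.append(c)
    return "".join(stack)
```

On T = cba with ℓ = 2 this yields ba (c is popped, b must stay), matching the walkthrough below; on the running example T = bccadbaccbcd with ℓ = 6 it yields aacbcd, as in the caption of Fig. 4.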
Given T = cba as an example, for ℓ = 3, we push all three characters of T onto S and output cba. For ℓ = 2, we first push T[1] = c onto S, but then pop it and push b onto S. Finally, although T[3] ≺ T[2], we do not discard T[2] = b stored on S, since we need to produce a subsequence of length ℓ = 2. A more elaborate execution on our running example is given in Fig. 4.

Fig. 4 Computing the most competitive subsequence of length ℓ = 5 of the example string T = bccadbaccbcd. The stack is shown vertically below T for each step of the algorithm of Sect. 3.3. For ℓ = 6, our stack would first differ at text position i = 10, where we would discard only the topmost c (instead of both of them). Then, the stack would store the subsequences aacb for i = 10, aacbc for i = 11, and aacbcd for i = 12.

Lexicographically Smallest Common Subsequence
Another variation is to ask for the lexicographically smallest subsequence of each distinct length that is common to two strings X and Y. Luckily, our ideas of Sects. 3.1 and 3.2 can be straightforwardly translated.
In particular, the base cases D3[0, x, y] = ε give us an induction basis similar to the one used in the proof of Lemma 1, so that we can use its induction step analogously. The table D3 has O(n^3) cells, and filling each cell can be done in constant time by representing each cell as a handle to a node in the trie data structure proposed in Sect. 3.2. For that, we ensure that we never insert a subsequence of D3 into the trie twice. To see that, let L ∈ Σ⁺ be a subsequence computed in D3, and let D3[ℓ, x, y] = L be the entry at which we called insert to create a trie node for L (for the first time). By monotonicity of D3 (that is, due to D3[ℓ, x, y] ⪰ D3[ℓ, x + 1, y] and D3[ℓ, x, y] ⪰ D3[ℓ, x, y + 1]), we must have x = pos_X(L) and y = pos_Y(L), where pos_X(L) denotes the smallest x′ such that L is a subsequence of X[1..x′] (and analogously for pos_Y). Moreover, the monotonicity of D3 further implies that all other entries storing L satisfy x′ ≥ x and y′ ≥ y.

Computing the Longest Lyndon Subsequence
In the following, we want to compute the longest Lyndon subsequence of T. See Fig. 5 for examples of longest Lyndon subsequences. As a starter, let us consider the following special case.

Theorem 5 Given a string of length n in which each character appears only once, we can compute its longest Lyndon subsequence.
For the general case, compared to the dynamic programming approach for the lexicographically smallest subsequences introduced above, we follow the sketched solution for the most competitive subsequence using a stack, which here simulates a traversal of the trie τ storing all pre-Lyndon subsequences of T, where a word is pre-Lyndon if it is a prefix of a Lyndon word. The trie τ is a subgraph of the trie storing all subsequences of T, sharing the same root. This subgraph is connected since, by definition, if S is a pre-Lyndon word, then all prefixes of S are also pre-Lyndon (if S is a prefix of a Lyndon word V, then all prefixes of S are also prefixes of V). We say that the string label of a node v is the string read from the edges on the path from the root to v. For every node v of τ, we store pos_T(V), where V is the string label of v and pos_T(V) denotes the ending position of the leftmost occurrence of V as a subsequence of T. Observe that, unless v is the root, the label of the incoming edge, which is the last character of V, equals T[pos_T(V)].

Basic Trie Traversal
Problems already emerge when considering the construction of τ since there are texts like T = 1 2 ⋯ n for which τ has Ω(2^n) nodes. Instead of building τ, we simulate a preorder traversal on it. By simulation, we mean that we enumerate the pre-Lyndon subsequences of T in lexicographic order. For that, we maintain a stack S storing the text positions (i_1, ..., i_ℓ) associated with the path from the root to the node v we currently visit, i.e., if V is the string label of v, then i_j = pos_T(V[1..j]) and thus i_ℓ = pos_T(V). At each node v, we first check whether V is a Lyndon word (if so, it is considered as an answer). Then, we recursively traverse the subtree of v. For this, we need to iterate, in the lexicographic order, over all characters c such that Vc is a pre-Lyndon word. For each such character, we determine pos_T(Vc), which is the smallest text position i_{ℓ+1} > i_ℓ with T[i_{ℓ+1}] = c. If there is such a position i_{ℓ+1}, we push it onto S, recurse, and then pop i_{ℓ+1}. We apply the following facts to check whether a given subsequence is a Lyndon or a pre-Lyndon word.

Facts about Lyndon Words

Fact 3 Let U be a pre-Lyndon word with shortest period p, and let c be a character. Then Uc is a Lyndon word if and only if c ≻ U[|U| − p + 1]; Uc is pre-Lyndon with shortest period p if and only if c = U[|U| − p + 1]; and Uc is not pre-Lyndon if and only if c ≺ U[|U| − p + 1].

Checking pre-Lyndon Words Now, suppose that our stack S stores the text positions (i_1, ..., i_ℓ). To check whether T[i_1] ⋯ T[i_ℓ] is a pre-Lyndon word or a Lyndon word, we augment each position i_j stored in S with the shortest period of T[i_1] ⋯ T[i_j], so that we can apply Fact 3 to check each extension in constant time. This already gives an algorithm that computes the longest Lyndon subsequence with O(nσ) space and time linear in the number of nodes in τ. However, since the number of nodes can be exponential in the text length, we develop ways to omit nodes that do not lead to the solution. Our aim is to find a rule to prune trie nodes that surely do not contribute to the longest Lyndon subsequence of T. For that, we use the following notion of irrelevance:

Definition 6 Consider a pre-Lyndon subsequence U of T. We say that U is irrelevant if T has a Lyndon subsequence V of length |V| = |U| such that V ≺ U and pos_T(V) ≤ pos_T(U). Otherwise, U is relevant.
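Fact 3 translates into a constant-time classification of each extension. A sketch (with our own function name; U is assumed to be pre-Lyndon with shortest period p):

```python
def extend_pre_lyndon(U, p, c):
    """Classify the extension U+c of a pre-Lyndon word U with shortest
    period p (Fact 3).  Returns ("lyndon", q) or ("pre-lyndon", q)
    with q the shortest period of U+c, or None if U+c is not
    pre-Lyndon."""
    if not U:
        return ("lyndon", 1)          # every single character is Lyndon
    ref = U[len(U) - p]               # U[|U| - p + 1] in 1-based indexing
    if c == ref:
        return ("pre-lyndon", p)      # the period p is preserved
    if c > ref:
        return ("lyndon", len(U) + 1) # U+c is Lyndon, period = its length
    return None                       # c < ref: U+c is not pre-Lyndon
```

For instance, extending the Lyndon word ab (period 2) with a yields the immature word aba (period still 2), while extending it with b yields the Lyndon word abb.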

Lemma 7 If L is the lexicographically smallest length-ℓ Lyndon subsequence of T (for some ℓ ∈ [1..n]), then all prefixes of L are relevant.
Proof For a proof by contradiction, suppose that L = UW for an irrelevant prefix U. Then T has a Lyndon subsequence V of length |V| = |U| with V ≺ U and pos_T(V) ≤ pos_T(U). Consider an integer i such that U is a subsequence of T[1..i] with i ≥ pos_T(V) and W is a subsequence of T[i + 1..n]. Then VW is a length-ℓ subsequence of T with VW ≺ UW = L. By the choice of L, this means that VW is not a Lyndon word, i.e., it contains a proper suffix S ⪯ VW. We consider two cases: -If S is a suffix of W, then S is also a suffix of the Lyndon word UW, and hence S ≻ UW ≻ VW, a contradiction. -Otherwise (|S| > |W|, see Fig. 6 for a visualization), S is of the form V′W for a proper suffix V′ of V. Since V is a Lyndon word, we have V′ ≻ V. Moreover, V′ is not a prefix of V, so this implies S = V′W ≻ VW, a contradiction.
Due to Lemma 7, we do not omit the solution if we skip the subtrees rooted at irrelevant nodes, i.e., nodes whose string labels are irrelevant. Algorithmically, we exploit this observation as follows: We maintain an array L[1..n], where L[ℓ] is the smallest position pos_T(V) among the length-ℓ Lyndon subsequences V explored so far. We initialize all entries of L with ∞. Now, whenever we visit a node u whose string label is a length-ℓ pre-Lyndon subsequence U, then U is irrelevant if and only if L[ℓ] ≤ pos_T(U): indeed, since we traverse the trie in the lexicographic order, the condition L[ℓ] ≤ pos_T(U) is equivalent to the existence of a Lyndon subsequence V ≺ U of length ℓ with pos_T(V) ≤ pos_T(U).
Time Complexity Next, we analyze the complexity of this algorithm. For that, we say that a string is immature if it is pre-Lyndon but not Lyndon. Let us first bound the number of relevant Lyndon nodes visited. Whenever the algorithm processes a relevant Lyndon subsequence U of length ℓ, it decreases L[ℓ] from a value strictly larger than pos_T(U) (if L[ℓ] ≤ pos_T(U), then U would be irrelevant) to pos_T(U). We can decrease an individual entry of L at most n times, so there are at most n^2 relevant Lyndon subsequences in total. While each node can have at most σ children, due to Fact 3, at most one child can be immature. Since the depth of the trie is at most n, we therefore visit at most n^3 immature nodes, and O(n^3) relevant nodes in total. All irrelevant nodes are leaves in the pruned tree, so the overall number of visited nodes is O(n^3 σ). As noted above, our trie navigation infrastructure allows for traversing the pruned trie in constant time per node.

Theorem 8
We can compute the longest Lyndon subsequence of a string of length n in O(n^3 σ) time using O(nσ) space.
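Putting the traversal, Fact 3, and the pruning array L together gives the following self-contained sketch of the algorithm behind Theorem 8 (recursive rather than stack-based, and without the trie navigation machinery, so it is only a functional model of the method, not the paper's implementation):

```python
def longest_lyndon_subsequence(text):
    """Pruned preorder traversal of the trie of pre-Lyndon
    subsequences: children are enumerated in lexicographic order,
    Fact 3 classifies each extension via the stored shortest period,
    and L[l] prunes irrelevant nodes (Definition 6 / Lemma 7)."""
    n = len(text)
    INF = n + 1
    L = [INF] * (n + 1)   # L[l]: smallest ending pos of a length-l Lyndon subsequence seen
    best = ""
    path = []             # characters of the current pre-Lyndon subsequence
    alphabet = sorted(set(text))
    # nxt[c][i]: smallest j >= i with text[j] == c (INF if none).
    nxt = {c: [INF] * (n + 1) for c in alphabet}
    for c in alphabet:
        for i in range(n - 1, -1, -1):
            nxt[c][i] = i if text[i] == c else nxt[c][i + 1]

    def extend(l, period, c):
        """Fact 3: classify path + [c]; returns (is_lyndon, new_period)."""
        if l == 0:
            return (True, 1)
        ref = path[l - period]        # U[|U| - p + 1], 0-based
        if c == ref:
            return (False, period)    # immature, same period
        if c > ref:
            return (True, l + 1)      # Lyndon
        return None                   # not pre-Lyndon

    def dfs(pos, period, is_lyn):
        nonlocal best
        l = len(path)
        if l and L[l] <= pos:         # irrelevant node: prune subtree
            return
        if is_lyn:
            L[l] = pos                # strictly decreases L[l]
            if l > len(best):
                best = "".join(path)
        for c in alphabet:            # lexicographic child order
            ext = extend(l, period, c)
            if ext is None:
                continue
            j = nxt[c][pos + 1]       # pos_T of the extended subsequence
            if j >= INF:
                continue
            path.append(c)
            dfs(j, ext[1], ext[0])
            path.pop()

    dfs(-1, 0, False)                 # start at the root (empty label)
    return best
```

For example, the longest Lyndon subsequence of aba is ab, and the one of abcab is abcb.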

Improving Time Bounds
We further improve the time bounds by avoiding visits to irrelevant nodes. For that, we make use of range maximum queries and range successor queries on T. Each query returns a text position; in case of ties, it returns the leftmost among the candidate positions. Now, suppose we are at a relevant node u with string label U of length ℓ and shortest period p. Then, we want to consider all characters c such that Uc is a relevant pre-Lyndon subsequence of T. By Fact 3, all these characters satisfy c ⪰ U[ℓ − p + 1] (so that Uc is pre-Lyndon) and occur in T[pos_T(U) + 1..L[ℓ + 1] − 1] (so that Uc is relevant). In the context of our preorder traversal, each such child can be found iteratively using range successor queries: starting from b = U[ℓ − p + 1], we want to find the lexicographically smallest character c ⪰ b that occurs in T[pos_T(U) + 1..L[ℓ + 1] − 1] and locate the leftmost such occurrence. This task can be accomplished using the wavelet tree [20] of T, which can be constructed in O(n lg σ) time and answers range successor queries in O(lg σ) time [17, Theorem 7].

In order to bring the time down to O(n^3), we do not want to query the wavelet tree each time, but only whenever we are sure that u has at least one relevant Lyndon child. For that, we build the data structure of [5], which can be constructed in O(n) time and answers range maximum queries (RMQ) on T in O(1) time. When we are at the relevant node u, we issue an RMQ to locate the leftmost occurrence of the largest character c in T[pos_T(U) + 1..L[ℓ + 1] − 1]. Then, we analyze the sequence Uc using Fact 3:
- If Uc is not pre-Lyndon, then u has no relevant children.
- If Uc is immature, then u has no relevant Lyndon children. Moreover, pos_T(Uc) is the position reported by the range maximum query. Hence, we do not need to use the wavelet tree.
- Finally, if Uc is Lyndon, we know that u has at least one relevant Lyndon child: while Uc might still be irrelevant if L[ℓ + 1] is decreased before we visit Uc, the only nodes that may decrease L[ℓ + 1] before we visit Uc are relevant Lyndon children of u.

This observation allows us to find all relevant children of u (including the single immature child, if any) by iteratively conducting O(k) range successor queries, where k is the number of relevant Lyndon children of u. Thus, the total number of wavelet tree queries asked is O(n^2), and the overall runtime is O(n^3).

Theorem 9 We can compute the longest Lyndon subsequence of a string of length n in O(n^3) time using O(n) space.
We remark that, by Lemma 7, our algorithm can be easily modified to compute, for each length ℓ, the lexicographically smallest length-ℓ Lyndon subsequence of T (if one exists). For this, it suffices to output, for each ℓ, the first visited Lyndon subsequence of length ℓ.

Online Computation
If we allow for more space to maintain the trie data structure introduced in Sect. 3.2, we can modify our O(n^3 σ)-time algorithm of Sect. 4.1 to perform the computation online, i.e., with T given as a text stream. To this end, let us recall the trie τ of all pre-Lyndon subsequences introduced at the beginning of Sect. 4. In the online setting, when reading a new character c, for each subsequence S given by a path from τ's root (S may be empty), we add a new node for Sc if Sc is a pre-Lyndon subsequence that is not yet represented by such a path. Again, storing all nodes of τ explicitly would cost us too much, so we prune irrelevant nodes, obtaining a trie of size O(n^3 σ). The problem is that we can no longer perform the traversal in lexicographic order, so we instead keep multiple fingers in the trie constructed so far and use these fingers to advance the trie traversal in text order.
With a different traversal order, we need an updated definition of L[1..n]: now, once the algorithm starts processing T[i], the entry L[ℓ] stores the lexicographically smallest length-ℓ Lyndon subsequence of T[1..i − 1] (represented by a pointer to the corresponding node of τ) or is empty if no such subsequence exists. Further, we maintain σ lists P_c (c ∈ [1..σ]) storing pointers to nodes of τ. Once the algorithm starts processing T[i], the list P_c contains pointers to all relevant nodes with string label U such that Uc is a pre-Lyndon word that is not a subsequence of T[1..i − 1] (i.e., pos_T(Uc) ≥ i). Initially, τ consists only of the root node, and each list P_c stores only the root node. Whenever we read a new character T[i] from the text stream, for each node v with string label V in P_{T[i]}, we insert a leaf with string label S := V · T[i] (as a child of v). The characterization of P_{T[i]} guarantees that pos_T(S) = i, so such a node does not exist yet. In order to keep the table L[1..n] up-to-date, we also check whether S is a Lyndon word and, if so, whether it is lexicographically smaller than the subsequence currently stored in L[|S|].

Computing the Longest Common Lyndon Subsequence

For the longest common Lyndon subsequence of two strings X and Y, the algorithm again uses the array L to check, while processing a pre-Lyndon subsequence U, whether we have already found a Lyndon subsequence V of the same length satisfying V ≺ U, pos_X(V) ≤ pos_X(U), and pos_Y(V) ≤ pos_Y(U). For that, L[ℓ] stores not only one position, but a list of pairs of positions (x, y) such that X[1..x] and Y[1..y] have a common Lyndon subsequence of length ℓ. Although there can be n^2 such pairs of positions, we only store those that are pairwise non-dominated. A pair of positions (x_1, y_1) is called dominated by a pair (x_2, y_2) ≠ (x_1, y_1) if x_2 ≤ x_1 and y_2 ≤ y_1. A set storing pairs in [1..n] × [1..n] can have at most n elements that are pairwise non-dominated, and hence |L[ℓ]| ≤ n.

At the beginning, all lists of L are empty. Suppose that we visit a node v with pair (x′, y′) representing a common Lyndon subsequence of length ℓ. We then query whether L[ℓ] has a pair dominating (x′, y′). In that case, we can skip v and its subtree.
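The non-dominated pairs in L[ℓ] form a Pareto frontier. A sketch of such a list (the class name is ours), keeping the pairs sorted by x so that their y values are strictly decreasing:

```python
import bisect

class ParetoSet:
    """Pairs (x, y) that are pairwise non-dominated, as in the lists
    L[l] of Sect. 5: (x2, y2) dominates (x1, y1) != (x2, y2) if
    x2 <= x1 and y2 <= y1.  Invariant: xs strictly increasing,
    ys strictly decreasing."""

    def __init__(self):
        self.xs, self.ys = [], []

    def dominated(self, x, y):
        """Is (x, y) dominated by (or equal to) some stored pair?"""
        i = bisect.bisect_right(self.xs, x)   # stored pairs with x' <= x
        return i > 0 and self.ys[i - 1] <= y  # smallest such y' among them

    def insert(self, x, y):
        """Insert (x, y) unless dominated; evict pairs it dominates."""
        if self.dominated(x, y):
            return False
        i = bisect.bisect_left(self.xs, x)
        while i < len(self.xs) and self.ys[i] >= y:  # now-dominated pairs
            del self.xs[i]
            del self.ys[i]
        self.xs.insert(i, x)
        self.ys.insert(i, y)
        return True
```

Since the x values are kept distinct and increasing, a list over [1..n] × [1..n] never exceeds n pairs, matching the bound |L[ℓ]| ≤ n above, and each domination query is a single binary search.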

Experiments
We implemented our algorithm computing the longest Lyndon subsequence of Theorem 8 and benchmarked this implementation on various texts. Our implementation,

Conclusion
This article has shed light, for the very first time, on the computation of the longest Lyndon subsequence. We began by studying the lexicographically smallest subsequence and the most competitive subsequence. Both problems are related to Lyndon subsequences in that they are all based on the lexicographic order. In the main part of this article, we focused on the computation of the longest Lyndon subsequence, for which we proposed algorithms for the offline and the online case. Finally, we extended our offline algorithm to compute the longest common Lyndon subsequence of two strings. Different but much easier solutions can be obtained in the special case when all characters are unique. Table 1 summarizes the algorithmic complexities we obtained or observed during the analysis of our algorithms computing the subsequences we studied.

Open Problems
It is known that the longest common subsequence of two strings of length n cannot be computed in O(n^{2−ε}) time for any ε > 0 unless the strong exponential time hypothesis (SETH) is false [1]. This conditional lower bound has been translated to other variations like finding the longest square subsequence [26, Section 4]. Unfortunately, we do not see whether we can find similar (conditional) lower bounds for the problems studied in this article. Lower bounds would either justify our time and space complexities, or give hope in finding better algorithms.
For the online computation studied in Sect. 4.3, the current bottleneck is the trie representation used, which represents O(n^3) nodes explicitly, and therefore needs O(n^3) time and space. We wonder whether we can find an implicit representation for the immature and irrelevant nodes that improves both complexities.
On the practical side, it is possible to enhance our implementation of Sect. 6 to also cover the algorithmic improvements described in Sect. 4.2. To be competitive with the current implementation, efficient implementations of range minimum queries and range successor queries need to be used. However, we are not aware of any optimized implementation of range successor queries.
Finally, we remark that we can extend our techniques to a special case of so-called Galois words [12, Section 6]. Galois words are defined in the setting of the alternating order ≺alt, which is given by ranking odd positions with the classic lexicographic order, but even positions in the opposite order, when comparing two strings character by character. For instance, ab ≺alt aa ≺alt bb ≺alt ba. A Galois word is then a word that is strictly smaller than all its cyclic rotations. A major difference to the lexicographic order is that a prefix of a string S is only smaller than S if its length is even, e.g., ab ≺alt a. Now, if we stipulate that a prefix P of a string S always exhibits P ≺alt S (so we slightly modify the standard definition), then we can directly translate our techniques to compute the longest non-bordered Galois subsequence. This is because a non-bordered string is Galois if all its proper suffixes are ≺alt-larger than itself. However, it is not clear to us how to find the longest bordered one, because our modified definition of ≺alt for the prefixes does not make sense when regarding bordered Galois subsequences. For instance, aba is a bordered Galois word in the standard definition of the ≺alt-order.
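The standard alternating order can be sketched as a comparison function (the helper name is ours; the even-length prefix rule follows the standard definition discussed above):

```python
def alt_less(a, b):
    """Alternating order: at the first mismatch, odd (1-based)
    positions compare normally and even positions in reverse; a
    proper prefix is smaller iff its length is even."""
    for i in range(min(len(a), len(b))):
        if a[i] != b[i]:
            smaller = a[i] < b[i]
            return smaller if i % 2 == 0 else not smaller
    if len(a) == len(b):
        return False
    if len(a) < len(b):              # a is a proper prefix of b
        return len(a) % 2 == 0
    return len(b) % 2 == 1           # b is a proper prefix of a
```

This reproduces the chain ab ≺alt aa ≺alt bb ≺alt ba from above, as well as ab ≺alt a (the odd-length prefix a is not smaller).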

Fig. 1 The lexicographically smallest subsequences of the prefixes of the example string T = bccadbaccbcd. We only show the output for ℓ ∈ [1..5] and denote the undefined values D[i, ℓ] (for i < ℓ) with ⊥.

Fig. 2 Sketch of the proof of Lemma 1. We can easily fill the fields shaded in blue (the 0-th row and the main diagonal). Further, the entries to the left of the diagonal are all undefined (denoted ⊥). A cell to the right of it (red) is based on its left-preceding and diagonal-preceding cell (green) (Color figure online).

If D[i, ℓ] = D[i − 1, ℓ], we store the handle to v at D[i, ℓ]. Otherwise, we call insert(u, T[i]) and store the resulting handle at D[i, ℓ]. This insertion is valid (meaning that u has no outgoing edge with label T[i] yet) because all trie nodes at depth ℓ correspond to D[j, ℓ] for j ∈ [ℓ..i − 1], and all these subsequences are at least as large as D[i − 1, ℓ] in the lexicographic order.
Implementation of the stack algorithm The stack S stores text positions in [1..n]. Let top denote the top element of S. When processing text position i, we recursively pop top as long as (a) S is not empty, (b) T[top] ≻ T[i], and (c) |S| ≥ ℓ + i − n. Finally, we push i on top of S if |S| < ℓ. Since a text position gets inserted into S and removed from S at most once, the algorithm runs in linear time.

For that, our matrix D becomes a cube D3[0..L, 0..|X|, 0..|Y|], where L := LCS[|X|, |Y|] and LCS[x, y] denotes the length of a longest common subsequence of X[1..x] and Y[1..y]. The entries D3[ℓ, x, y] are well-defined for ℓ ≤ LCS[x, y] and computed by taking the lexicographically smallest string among at most three candidates for ℓ, x, y ≥ 1.

Hence, when filling out such an entry D3[ℓ, x′, y′] = L, we copy the handle to the trie node representing L instead of calling insert.

Theorem 4 Given two strings X, Y of length at most n, we can compute the lexicographically smallest common subsequence for each length ℓ ∈ [1..n] in O(n^3) time using O(n^3) space.

Fig. 5 Longest Lyndon subsequences of selected prefixes of a text T. The i-th row of bars below T depicts the selection of characters forming a Lyndon subsequence. In particular, the i-th row corresponds to the longest Lyndon subsequence of T[1..9] for i = 1 (green), of T[1..11] for i = 2 (blue), and of T[1..12] for i = 3 (red). The first row (green) also corresponds to a longest Lyndon subsequence of T[1..10] and T[1..11] (when extended with T[11]). Extending the second Lyndon subsequence (blue) with T[12] also gives a Lyndon subsequence, but one shorter than the third Lyndon subsequence (red). Having only the information of the Lyndon subsequences of T[1..i] at hand seems not to give us a solution for T[1..i + 1] (Color figure online).

Fig. 6 Sketch of the second case in the proof of Lemma 7, where the suffix S is assumed to be longer than W

Range maximum query: Given an interval [i..j] ⊆ [1..n], retrieve the position of the largest character of the substring T[i..j], i.e., return arg max_{k ∈ [i..j]} T[k].
Range successor query: Given an interval [i..j] ⊆ [1..n] and a character c, retrieve the position k ∈ [i..j] of the lexicographically smallest character T[k] in T[i..j] with T[k] ≥ c, i.e., return arg min_{k ∈ [i..j]: T[k] ≥ c} T[k].
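Both queries can be sketched with naive linear scans (a minimal sketch assuming 0-based inclusive intervals and the successor condition T[k] ≥ c; the paper relies on precomputed data structures to answer such queries faster):

```python
def range_maximum(T, i, j):
    """Position k in [i..j] maximizing T[k] (leftmost on ties)."""
    return max(range(i, j + 1), key=lambda k: T[k])

def range_successor(T, i, j, c):
    """Position k in [i..j] of the smallest character T[k] with T[k] >= c,
    or None if every character in T[i..j] is smaller than c."""
    ks = [k for k in range(i, j + 1) if T[k] >= c]
    return min(ks, key=lambda k: T[k]) if ks else None
```

Note that Python's min/max return the first position attaining the optimum, so ties are broken toward the leftmost occurrence.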

Fig. 7 The trie τ traversed by the algorithm of Theorem 8, with each node labeled by the value pos_T(V) computed for its string label V. Irrelevant nodes (whose subtrees are pruned) are drawn in gray and have a dashed incoming edge. For simplification, we sometimes omit irrelevant nodes representing subsequences ending with the last character of the text (every node which does not yet have an outgoing edge with label d should have such an irrelevant child). Each immature node is surrounded by a rectangular box. Remembering that immature nodes do not contribute to our pruning technique, we cannot prune bccdc with its leftmost occurrence ending at text position 8, since all formerly found subsequences with the same length ending at or before 8 are immature. The relevant Lyndon nodes have the property that, when fixing a depth, reading the Lyndon nodes from left to right gives a decreasing sequence of text positions. When pruning the node with string label abab and label 10, we have L = [4, 6, 8, 9, 10, 11, 12, ∞, ...] and can prune this node because L[|abab|] = 9 does not exceed the label 10.

Throughout the execution of the algorithm, W[0, c] stores (a pointer to) the root node for each c ∈ [1..σ]. For each length ℓ ≥ 1 and character c ∈ [1..σ], the entry W[ℓ, c] stores a pointer to a relevant node with string label S of length ℓ such that Sc is immature. If there is no such node, the entry W[ℓ, c] remains empty. If there are multiple candidates, we pick the one with the lexicographically smallest string label S. This choice is dictated by the following corollary:

Corollary 11 (of Lemma 7) Consider two nodes u and v of τ with string labels U and V, respectively, such that |U| = |V| (u and v are on the same depth), V ≺ U, and Uc and Vc are immature. Assume that we construct, later on, a child of u whose string label is Lyndon. Then this child is actually irrelevant.

and y_j = pos_Y(L[1..j]). The depth-first search works like an exhaustive search in that it tries to extend L with subsequent characters c ∈ Σ such that Lc is pre-Lyndon and c occurs in both X[x_ℓ + 1..] and Y[y_ℓ + 1..]. For each such character c, the pair (x_{ℓ+1}, y_{ℓ+1}) consists of the positions of the leftmost occurrences of Lc in X and Y, respectively, which can be precomputed in O(nσ) time and space.
Otherwise, we insert (x_ℓ, y_ℓ) and remove pairs in L[ℓ] that are dominated by (x_ℓ, y_ℓ). Such an insertion can happen at most n^2 times. Since L[1..n] maintains n lists, we can update L at most n^3 times in total. Checking for domination and insertion into L takes O(n) time. The former can be accelerated to constant time by representing L[ℓ] as an array R_ℓ storing in R_ℓ[i] the value y of the tuple (x, y) ∈ L[ℓ] with x ≤ i and the lowest possible y, for each i ∈ [1..n]. Then, a pair (x, y) ∉ L[ℓ] is dominated if and only if R_ℓ[x] ≤ y.

Example 13 For n = 10, let L_ℓ = [(3, 9), (5, 4), (8, 2)]. Then, all elements in L_ℓ are pairwise non-dominated, and R_ℓ = [∞, ∞, 9, 9, 4, 4, 4, 2, 2, 2]. Inserting (3, 2) would remove all elements of L_ℓ and decrease all finite entries of R_ℓ to 2. Alternatively, inserting (7, 3) would only involve updating R_ℓ[7] ← 3; since the subsequent entry R_ℓ[8] = 2 is less than R_ℓ[7], no further entries need to be updated.

An update in L[ℓ] involves changing O(n) entries of R_ℓ, but that cost is dwarfed by the cost for finding the next common Lyndon subsequence that updates L. Such a subsequence can be found while visiting O(nσ) irrelevant nodes during a naive depth-first search (cf. the solution of Sect. 3.1 computing the longest Lyndon subsequence of a single string). Hence, the total time is O(n^4 σ). The space complexity is dominated by the representation of the array L with the arrays R_ℓ. Since each R_ℓ uses O(n) space for ℓ ∈ [1..n], the total space is bounded by O(n^2).

Theorem 14 We can compute the longest common Lyndon subsequence of two strings of length at most n in O(n^4 σ) time using O(n^2) space.
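The array representation R_ℓ and its constant-time domination test, as in Example 13, can be sketched as follows (a minimal sketch assuming 1-based positions in [1..n]; rebuilding R_ℓ from scratch stands in for the incremental updates described above):

```python
INF = float("inf")

def make_R(L, n):
    """R[i] = smallest y among the stored pairs (x, y) with x <= i."""
    R = [INF] * (n + 1)  # index 0 unused (positions are 1-based)
    for x, y in L:
        R[x] = min(R[x], y)
    for i in range(2, n + 1):  # prefix minima over x
        R[i] = min(R[i], R[i - 1])
    return R

def is_dominated(R, x, y):
    """(x, y) is dominated iff some stored (x', y') has x' <= x and y' <= y."""
    return R[x] <= y
```

With L_ℓ = [(3, 9), (5, 4), (8, 2)] and n = 10, the pair (7, 3) is not dominated (R_ℓ[7] = 4 > 3), while (9, 5) is (R_ℓ[9] = 2 ≤ 5).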
Lyndon subsequence in (a) O(n^2) time using O(1) space, or (b) O(n√lg n) time using O(n) space.

Proof For each text position i ∈ [1..n], we consider all characters in T[i..n] that are larger than T[i]. These characters form the longest Lyndon subsequence starting at T[i]. Our answer is the longest among these n candidates. We can compute the length of each candidate in O(n) time, and thus obtain our first solution. For the second solution, we use the offline orthogonal range counting procedure of Chan and Pătraşcu [7, Corollary 2.3]. Specifically, we apply it for points (j, T[j]) for j ∈ [1..n] and rectangles [i + 1..n] × [T[i] + 1..σ] for i ∈ [1..n]. This call takes O(n√lg n) time and outputs the number of input points located in each rectangle, which is |{j ∈ [i + 1..n] : T[j] > T[i]}| for the i-th rectangle.

A Lyndon word cannot have a border, that is, a non-empty proper prefix that is also a suffix of the string [13, Prop. 1.1]. Given a string S of length n, an integer p ∈ [1..n] is a period of S if S[i] = S[i + p] for all i ∈ [1..n − p]. We use the following facts:

(Fact 1) The shortest period of a Lyndon word S is the length |S|.
(Fact 2) The prefix S[1..p] of a pre-Lyndon word S with shortest period p is a Lyndon word. In particular, a pre-Lyndon word S is a Lyndon word if and only if its shortest period is |S|.
(Fact 3) Consider a pre-Lyndon word S with shortest period p and a character c ∈ Σ. Then:
- If c > S[|S| − p + 1], then Sc is a Lyndon word.
- If c = S[|S| − p + 1] and S is not the largest character of Σ, then Sc is a pre-Lyndon word with shortest period p.
- Otherwise, Sc is not a pre-Lyndon word.

Proof (Fact 1) If S has a period smaller than |S|, then S is bordered. (Fact 2) If S[1..p] was not Lyndon, then there would be a suffix X of S with X ≺ S[1..|X|]; hence, X Z ≺ S Z for every Z ∈ Σ*, so S cannot be pre-Lyndon. (Fact 3) Follows from Fact 2 and [13, Lemma 1.6].
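The case analysis of Fact 3 can be sketched as a classifier for one-character extensions (a minimal sketch assuming an integer alphabet {1, ..., σ} and 0-based lists, so the reference character S[|S| − p + 1] becomes S[len(S) − p]):

```python
def extend_pre_lyndon(S, p, c, sigma):
    """Classify Sc, given a pre-Lyndon word S with shortest period p.
    Returns ('lyndon', new_period), ('pre-lyndon', p), or (None, None)."""
    ref = S[len(S) - p]  # the character S[|S| - p + 1] in 1-based notation
    if c > ref:
        # Sc is a Lyndon word; by Fact 1 its shortest period is |Sc|
        return 'lyndon', len(S) + 1
    if c == ref and not (len(S) == 1 and S[0] == sigma):
        # Sc continues the periodic repetition and stays pre-Lyndon
        return 'pre-lyndon', p
    return None, None  # Sc is not a pre-Lyndon word
```

For example, over the alphabet {1, 2} (read 1 as a, 2 as b), extending the pre-Lyndon word ab = [1, 2] (period 2) with a keeps it pre-Lyndon (aba is a prefix of the Lyndon word ababb), while the single largest character b cannot be extended by b at all.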
Online computation on the prefix bccadb of our running example. The trie on the left shows τ, where nodes are labeled by a rank reflecting the order in which the nodes have been created. This rank is used in the lists P and the table W as a pointer to the trie nodes. Like before, nodes with rectangular boxes have immature string labels. On the right, we show the non-empty entries of L and W, where each row corresponds to one length ℓ. On reading the first d from the text, we do not create a node for bd since we already have bc, which also needs a character larger than or equal to c to be extended to a Lyndon subsequence of length three.

whether S is a Lyndon word satisfying S ≺ L[|S|] (which can be tested using the data structure of Sect. 3.2) and, if so, we further set L[|S|] := S. Next, we clear P_{T[i]} and iterate again over the newly created leaves. For each such leaf λ with label S, we check whether λ is relevant by performing a comparison S ≺ L[|S|]. If λ is relevant, we put λ into P_c for each character c ∈ Σ such that Sc is a pre-Lyndon word. By doing so, we effectively create new events that trigger a call-back to the point where we stopped the trie traversal. Overall, we generate exactly the nodes visited by the algorithm of Sect. 4.1 (although in a different order). In particular, there are O(n^3) relevant nodes, and we issue O(σ) events for each such node. The operations of Sect. 3.2 take constant time, so the total time and space complexity of the algorithm is O(n^3 σ).

We can compute the longest Lyndon subsequence online in O(n^3 σ) time using O(n^3 σ) space.

We can improve the space and time bounds by treating immature subsequences and Lyndon subsequences separately. First, we only add a leaf λ with string label S into P_c if Sc is immature (i.e., we no longer store λ in P_c if Sc is Lyndon). Second, we now treat the case that Sc is Lyndon differently, with a table W[0..n, 1..σ] of size O(nσ).

Table 1 Algorithmic complexities for computing subsequences of various kinds studied in this article